Probabilistic Machine Learning (機率機器學習)

Probabilistic Machine Learning (《機率機器學習》) is a book by Zhu Jun (朱軍), published by Tsinghua University Press in 2023.

Basic information

  • Chinese title: 機率機器學習
  • Author: Zhu Jun (朱軍)
  • Publication date: July 1, 2023
  • Publisher: Tsinghua University Press
  • ISBN: 9787302631842
  • List price: 99 yuan

Synopsis

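With advances in deep learning, large-scale pre-trained models, and generative artificial intelligence, machine learning has become a solution to many engineering and scientific problems. Probabilistic Machine Learning gives a systematic introduction to the basic concepts, classic algorithms, and recent advances of machine learning from the perspective of probabilistic modeling and statistical inference. Its main topics include the foundations of probabilistic machine learning, learning theory, probabilistic graphical models, approximate probabilistic inference, Gaussian processes, deep generative models, and reinforcement learning. The book starts from concrete examples, proceeds from the simple to the advanced, balances intuition with rigor, and provides further reading and an extensive list of references.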

Table of contents

Foundations
Chapter 1 Introduction
1.1 Machine learning
1.1.1 What is machine learning
1.1.2 Basic tasks of machine learning
1.1.3 K-nearest neighbors: a “lazy” learning method
1.2 Probabilistic machine learning
1.2.1 Why probabilistic machine learning is needed
1.2.2 What probabilistic machine learning covers
1.3 Further reading
1.4 Exercises
Chapter 2 Foundations of probability and statistics
2.1 Probability
2.1.1 Event spaces and probability
2.1.2 Continuous and discrete random variables
2.1.3 Change of variables
2.1.4 Joint, marginal, and conditional distributions
2.1.5 Independence and conditional independence
2.1.6 Bayes' formula
2.2 Common probability distributions and their numerical characteristics
2.2.1 Common numerical characteristics of random variables
2.2.2 Probability distributions of discrete variables
2.2.3 Probability distributions of continuous variables
2.3 Statistical inference
2.3.1 Maximum likelihood estimation
2.3.2 Error
2.4 Bayesian inference
2.4.1 Basic procedure
2.4.2 Common applications and methods
2.4.3 Online Bayesian inference
2.4.4 Conjugate priors
2.5 Foundations of information theory
2.5.1 Entropy
2.5.2 Mutual information
2.5.3 Relative entropy
2.6 Exercises
Chapter 3 Linear regression models
3.1 Basic model
3.1.1 Basic model of statistical decision making
3.1.2 Linear regression and least squares
3.1.3 Probabilistic model and maximum likelihood estimation
3.1.4 Linear regression with basis functions
3.2 Regularized linear regression
3.2.1 Ridge regression
3.2.2 Lasso
3.2.3 Linear regression with Lp-norm regularization
3.3 Bayesian linear regression
3.3.1 Maximum a posteriori estimation
3.3.2 Bayesian predictive distribution
3.3.3 Bayesian model selection
3.3.4 Empirical Bayes and relevance vector machines
3.4 Model evaluation
3.4.1 Evaluation metrics
3.4.2 Cross-validation
3.5 Further reading
3.6 Exercises
Chapter 4 Naive Bayes classifiers
4.1 Basic classification models
4.1.1 Bayes classifier
4.1.2 Kernel density estimation
4.1.3 Curse of dimensionality
4.2 Naive Bayes model
4.2.1 Generative models
4.2.2 Naive Bayes assumption
4.2.3 Maximum likelihood estimation
4.2.4 Maximum a posteriori estimation
4.3 Extensions of naive Bayes
4.3.1 Multi-valued features
4.3.2 Multi-class classification
4.3.3 Continuous features
4.3.4 Semi-supervised naive Bayes classifiers
4.3.5 Tree-augmented naive Bayes classifiers
4.4 Analysis of naive Bayes
4.4.1 Decision boundary
4.4.2 Predicted probabilities
4.5 Further reading
4.6 Exercises
Chapter 5 Logistic regression and generalized linear models
5.1 Logistic regression
5.1.1 Model definition
5.1.2 Latent variable representation of logistic regression
5.1.3 Maximum conditional likelihood estimation
5.1.4 Regularization methods
5.1.5 Discriminative versus generative models
5.2 Stochastic gradient descent
5.2.1 Basic method
5.2.2 Momentum
5.2.3 AdaGrad
5.2.4 RMSProp
5.2.5 Adam
5.3 Bayesian logistic regression
5.3.1 Laplace approximation
5.3.2 Predictive distribution
5.4 Generalized linear models
5.4.1 Exponential family distributions
5.4.2 Properties of exponential family distributions
5.4.3 Generalized linear models
5.5 Further reading
5.6 Exercises
Chapter 6 Deep neural networks
6.1 Basic principles of neural networks
6.1.1 Basic framework of nonlinear learning
6.1.2 Perceptron
6.1.3 Multilayer perceptron
6.1.4 Backpropagation
6.2 Convolutional neural networks
6.2.1 Basic components
6.2.2 Batch normalization
6.2.3 Residual networks
6.3 Recurrent neural networks
6.3.1 Basic principles
6.3.2 Long short-term memory networks
6.4 Further reading
6.5 Exercises
Chapter 7 Support vector machines and kernel methods
7.1 Hard-margin support vector machines
7.1.1 Decision boundary
7.1.2 Support vector machines for linearly separable data
7.1.3 Dual problem of the hard-margin support vector machine
7.2 Soft-margin support vector machines
7.2.1 Soft constraints and loss functions
7.2.2 Dual problem of the soft-margin SVM
7.2.3 Support vector regression
7.3 Kernel methods
7.3.1 Basic properties of kernel functions
7.3.2 Representer theorem
7.3.3 Common kernel functions
7.3.4 Kernel functions induced by probabilistic generative models
7.3.5 Neural tangent kernel
7.4 Multi-class support vector machines
7.4.1 One-vs-rest
7.4.2 One-vs-one
7.4.3 Joint optimization
7.5 Probabilistic interpretation of support vector machines
7.5.1 Platt calibration
7.5.2 Maximum entropy discrimination learning
7.6 Further reading
7.7 Exercises
Chapter 8 Clustering
8.1 The clustering problem
8.1.1 Task description
8.1.2 Distance metrics
8.2 K-means algorithm
8.2.1 Optimization objective
8.2.2 Introduction to the K-means algorithm
8.2.3 Initial values and stopping criteria for the iteration
8.2.4 Model selection in the K-means algorithm
8.3 Gaussian mixture models
8.3.1 Latent variable models
8.3.2 Mixture distribution models
8.3.3 Mixture distribution models and clustering
8.4 EM algorithm
8.4.1 EM algorithm for Gaussian mixture models
8.4.2 Convergence of the EM algorithm
8.4.3 Relationship between the EM algorithm and K-means
8.5 Evaluation metrics
8.5.1 External evaluation metrics
8.5.2 Internal evaluation metrics
8.6 Further reading
8.7 Exercises
Chapter 9 Dimensionality reduction
9.1 The dimensionality reduction problem
9.2 Principal component analysis
9.2.1 Basic principle
9.2.2 High-dimensional PCA
9.3 Principles behind principal component analysis
9.3.1 Maximizing variance
9.3.2 Minimizing reconstruction error
9.3.3 Probabilistic principal component analysis
9.4 Autoencoders
9.4.1 Basic autoencoder model
9.4.2 Sparse autoencoders
9.4.3 Denoising autoencoders
9.5 Locally linear embedding
9.5.1 Basic procedure of locally linear embedding
9.5.2 Optimal local linear reconstruction
9.5.3 Embeddings that preserve the optimal local reconstruction
9.5.4 Parameter selection
9.6 Word embeddings
9.6.1 Latent semantic analysis
9.6.2 Neural language models
9.7 Further reading
9.8 Exercises
Chapter 10 Ensemble learning
10.1 Decision trees
10.1.1 ID3 algorithm
10.1.2 C4.5 algorithm
10.1.3 CART algorithm
10.2 Bagging
10.2.1 Basic method
10.2.2 Random forests
10.3 Boosting
10.3.1 AdaBoost algorithm
10.3.2 AdaBoost from an optimization perspective
10.3.3 Gradient boosting
10.3.4 Gradient boosted decision trees
10.3.5 XGBoost algorithm
10.4 Probabilistic ensemble learning
10.4.1 Mixtures of linear models
10.4.2 Hierarchical mixture of experts
10.5 Ensembles of deep models
10.5.1 Dropout: a model ensembling strategy
10.5.2 Deep ensembles
10.6 Further reading
10.7 Exercises
Chapter 11 Learning theory
11.1 Basic concepts
11.1.1 Bias-complexity decomposition
11.1.2 Structural risk minimization
11.1.3 PAC theory
11.1.4 Basic inequalities
11.2 Finite hypothesis spaces
11.2.1 Hoeffding's inequality
11.2.2 Union bound
11.3 Infinite hypothesis spaces
11.3.1 VC dimension
11.3.2 Rademacher complexity
11.3.3 Margin theory
11.3.4 PAC-Bayes
11.4 Deep learning theory
11.4.1 Double descent
11.4.2 Benign overfitting
11.4.3 Implicit regularization
11.5 Further reading
11.6 Exercises
Advanced topics
Chapter 12 Probabilistic graphical models
12.1 Overview
12.2 Representation of probabilistic graphical models
12.2.1 Bayesian networks
12.2.2 Markov random fields
12.2.3 Relationship between directed and undirected graphs
12.3 Inference in probabilistic graphical models
12.3.1 Variable elimination
12.3.2 Message passing
12.3.3 Factor graphs
12.3.4 Most probable assignment
12.3.5 Junction trees
12.4 Parameter learning
12.4.1 Parameter learning for Bayesian networks
12.4.2 Parameter learning for Markov random fields
12.4.3 Conditional random fields
12.5 Structure learning
12.5.1 Tree-structured Bayesian networks
12.5.2 Gaussian Markov random fields
12.6 Further reading
12.7 Exercises
Chapter 13 Variational inference
13.1 Basic principles
13.1.1 Basic principles of variational methods
13.1.2 Inference tasks
13.2 Variational inference
13.2.1 Variational lower bound on the log-likelihood
13.2.2 Mean-field methods
13.2.3 Belief propagation
13.3 Variational EM
13.3.1 From EM to variational EM
13.3.2 Variational EM algorithm for exponential families
13.3.3 Probabilistic latent semantic analysis
13.3.4 Stochastic EM algorithm
13.4 Variational Bayes
13.4.1 Variational formulation of Bayes' theorem
13.4.2 Bayesian Gaussian mixture models
13.5 Expectation propagation
13.5.1 Basic EP algorithm
13.5.2 EP algorithm for graphical models
13.6 Further reading
13.7 Exercises
Chapter 14 Monte Carlo methods
14.1 Overview
14.2 Basic sampling algorithms
14.2.1 Sampling based on reparameterization
14.2.2 Rejection sampling
14.2.3 Importance sampling
14.2.4 Importance resampling
14.2.5 Ancestral sampling
14.3 Markov chain Monte Carlo
14.3.1 Markov chains
14.3.2 Metropolis-Hastings sampling
14.3.3 Gibbs sampling
14.3.4 Variants of Gibbs sampling
14.4 Auxiliary variable sampling
14.4.1 Slice sampling
14.4.2 Auxiliary variable sampling
14.5 MCMC sampling based on dynamical systems
14.5.1 Dynamical systems
14.5.2 Discretization of Hamilton's equations
14.5.3 Hamiltonian Monte Carlo
14.5.4 Stochastic gradient MCMC sampling
14.6 Further reading
14.7 Exercises
Chapter 15 Gaussian processes
15.1 Bayesian neural networks
15.1.1 Bayesian linear regression
15.1.2 Bayesian neural networks
15.1.3 Infinitely wide Bayesian neural networks
15.2 Gaussian process regression
15.2.1 Definition
15.2.2 Prediction in the noise-free case
15.2.3 Prediction with noise
15.2.4 Residual modeling
15.2.5 Covariance functions
15.3 Gaussian process classification
15.3.1 Basic model
15.3.2 Approximate inference with the Laplace approximation
15.3.3 Approximate inference with expectation propagation
15.3.4 Relationship to support vector machines
15.4 Sparse Gaussian processes
15.4.1 Sparse approximation based on inducing points
15.4.2 Sparse variational Gaussian processes
15.5 Further reading
15.6 Exercises
Chapter 16 Deep generative models
16.1 Basic framework
16.1.1 Basic concepts of generative models
16.1.2 Modeling based on hierarchical Bayes
16.1.3 Modeling based on deep neural networks
16.2 Flow models
16.2.1 Affine coupling flows
16.2.2 Residual flows
16.2.3 Dequantization
16.3 Autoregressive generative models
16.3.1 Neural autoregressive density estimator
16.3.2 Continuous neural autoregressive density estimator
16.4 Variational autoencoders
16.4.1 Model definition
16.4.2 Parameter estimation based on reparameterization
16.5 Generative adversarial networks
16.5.1 Basic model
16.5.2 Wasserstein generative adversarial networks
16.6 Diffusion probabilistic models
16.6.1 Model definition
16.6.2 Model training
16.6.3 Parameter sharing
16.7 Further reading
16.8 Exercises
Chapter 17 Reinforcement learning
17.1 Decision-making tasks
17.2 Multi-armed bandits
17.2.1 Bernoulli multi-armed bandits
17.2.2 Upper confidence bound algorithm
17.2.3 Thompson sampling algorithm
17.2.4 Contextual multi-armed bandits
17.3 Markov decision processes
17.3.1 Basic definitions
17.3.2 Bellman equation
17.3.3 Optimal value function and optimal policy
17.3.4 Policy evaluation
17.3.5 Policy iteration algorithm
17.3.6 Value iteration algorithm
17.4 Reinforcement learning
17.4.1 Monte Carlo sampling methods
17.4.2 Temporal-difference learning
17.4.3 Sarsa algorithm
17.4.4 Q-learning
17.4.5 Value function approximation
17.4.6 Policy search
17.5 Further reading
17.6 Exercises
References
