Fundamentals of Data Mining (2nd Edition)

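Fundamentals of Data Mining (2nd Edition) (《數據挖掘基礎(第2版)》) is a book published by Tsinghua University Press in 2023. Its authors are Liu Peng (劉鵬) and Tao Jianhui (陶建輝).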

Basic information

  • Chinese title: 數據挖掘基礎(第2版)
  • Authors: Liu Peng (劉鵬), Tao Jianhui (陶建輝)
  • Publication date: June 1, 2023
  • Publisher: Tsinghua University Press
  • ISBN: 9787302634492
  • List price: 49 yuan

Synopsis

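This book introduces the basic concepts of data mining, including its common algorithms, common tools, uses, application scenarios, and current state of adoption. It covers widely used data mining methods such as classification, clustering, and association rules, describing the concepts, underlying ideas, typical algorithms, and application scenarios of each. Taking a practical perspective, the book also explains the principles, tools, application scenarios, and success cases of log-based big data mining.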

Table of contents

Chapter 1 Concepts of Data Mining
1.1 Overview of data mining
1.1.1 What is data mining
1.1.2 Overview of common data mining algorithms
1.1.3 Overview of common data mining tools
1.2 Data exploration
1.2.1 Data overview
1.2.2 Data quality
1.2.3 Data preprocessing
1.3 Applications of data mining
1.3.1 Current state and development trends of data mining
1.3.2 Problems data mining needs to solve
1.3.3 Application scenarios of data mining
1.4 Assignments and exercises
References
Chapter 2 Classification
2.1 Overview of classification
2.1.1 Basic concepts of classification
2.1.2 General approach to solving classification problems
2.1.3 Overfitting in classification models
2.2 Decision trees
2.2.1 How decision trees work and how they are built
2.2.2 Decision tree induction algorithms
2.2.3 Handling overfitting in decision trees
2.3 Bayesian decision and classifiers
2.3.1 Rule-based classifiers
2.3.2 Applying Bayes' theorem to classification
2.3.3 Applying naive Bayes to classification
2.4 Support vector machines
2.4.1 Maximum-margin hyperplane
2.4.2 Linear support vector machines (SVM)
2.4.3 Nonlinear support vector machines (SVM)
2.5 Application cases of classification in real-world scenarios
2.5.1 Application in keyword search
2.5.2 Application in identifying fraudulent behavior
2.5.3 Application in online advertising recommendation
2.5.4 Application in Web robot detection
2.6 Assignments and exercises
References
Chapter 3 Clustering
3.1 Overview of clustering
3.1.1 Basic concepts of clustering
3.1.2 Evaluation criteria for clustering
3.1.3 Choosing a clustering algorithm
3.2 Clustering algorithms
3.2.1 Hierarchical clustering algorithms
3.2.2 Partitioning clustering algorithms
3.2.3 Density-based clustering algorithms
3.2.4 Grid-based clustering algorithms
3.2.5 Model-based clustering algorithms
3.2.6 Training K-means with Spark
3.3 Agglomerative analysis methods
3.3.1 Euclidean distance
3.3.2 The agglomeration process
3.3.3 Cluster trees
3.3.4 Application example of agglomerative analysis
3.4 Application cases of clustering in real-world scenarios
3.4.1 Application in power grids
3.4.2 Application in analyzing electricity consumers' usage behavior
3.4.3 Application in e-commerce
3.4.4 A clustering implementation example
3.5 Assignments and exercises
References
Chapter 4 Association Rules
4.1 Overview of association rules
4.1.1 Introduction through a classic case
4.1.2 Basic concepts and definitions of association rules
4.1.3 Types of association rules
4.2 The association rule mining process
4.2.1 Knowledge review
4.2.2 Frequent itemset generation
4.2.3 Strong association rules
4.2.4 Evaluation criteria for association rules
4.3 The Apriori algorithm for association rules
4.3.1 Knowledge review
4.3.2 Core idea of the Apriori algorithm
4.3.3 Description of the Apriori algorithm
4.3.4 Evaluation of the Apriori algorithm
4.3.5 Improvements to the Apriori algorithm
4.4 The FP-growth algorithm for association rules
4.4.1 Building the FP-tree
4.4.2 Mining frequent itemsets from the FP-tree
4.4.3 Differences between the FP-growth and Apriori algorithms
4.4.4 Training the FP-growth algorithm with Spark
4.5 Hands-on: association rule mining examples
4.5.1 Current state of association rule mining applications in China and abroad
4.5.2 Association rule application examples
4.5.3 Steps for applying association rules in large supermarkets
4.6 Assignments and exercises
References
Chapter 5 Comprehensive Practice: Log Mining and Applications
5.1 The concept of logs
5.1.1 What is a log
5.1.2 What logs can do
5.2 Log processing
5.2.1 Generating logs
5.2.2 Transmitting logs
5.2.3 Storing logs
5.2.4 Analyzing logs
5.2.5 Log specifications and standards
5.3 The R language and log analysis tools
5.3.1 The R language
5.3.2 Log analysis tools
5.3.3 Planning and building a log analysis system
5.4 Log mining applications
5.4.1 Security operations and maintenance
5.4.2 System health analysis
5.4.3 User behavior analysis
5.4.4 Business analysis and design
5.5 Log analysis and mining examples
5.6 Assignments and exercises
References
Chapter 6 Data Mining Application Cases
6.1 Main transformer oil temperature analysis using clustering in the power industry
6.1.1 Background requirements and the big data analysis method used
6.1.2 Implementation process of the big data analysis method
6.1.3 Results of the big data analysis method
6.2 Bank credit evaluation
6.2.1 Introduction
6.2.2 Neural network model
6.2.3 Empirical testing
6.3 Index forecasting
6.3.1 Overview of financial time series
6.3.2 Wavelet denoising
6.3.3 Vector machines
6.3.4 Index forecasting
6.4 Precise intelligent marketing based on customer segmentation
6.4.1 Mining objectives
6.4.2 Analysis methods and process
6.4.3 Modeling and simulation
6.5 House pricing with WEKA
6.6 Assignments and exercises
References
Appendix A
