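《Spark Streaming技術內幕及源碼剖析》 (Spark Streaming Internals and Source Code Analysis) is a book published on May 1, 2017, written by Wang Jialin (王家林) and Xia Yang (夏陽).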
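Book Introduction
This book is based on the stable 1.6.x release of Spark, the big data processing engine, and examines Spark Streaming, the real-time computation framework built on Spark, from several angles: application cases, principles, source code, workflow, and tuning. After outlining the Spark Streaming architecture, it begins with the basic source code and works from simple to complex, guiding readers who already have a technical grounding in Spark and Spark Streaming through advanced study so that they understand its principles and runtime mechanisms. It serves as a technical reference for decisions about, and applications of, stream data processing. To meet the needs of in-depth use, it also analyzes Spark Streaming performance tuning and offers guidance on modifying and extending Spark Streaming's functionality.
The book is suited to CTOs, architects, and senior software engineers in the big data field, especially Spark practitioners who already have basic Spark Streaming knowledge. It can also serve as a reference for graduate students and upper-level undergraduates who need to study Spark and Spark Streaming in depth.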
Basic Information
Authors: Wang Jialin (王家林), Xia Yang (夏陽)
Price: 49 yuan
Impression: 1-1
ISBN: 9787302464914
Publication date: 2017.05.01
Printing date: 2017.03.24
Table of Contents
Chapter 1 Overview of Spark Streaming Applications
1.1 Spark Streaming Application Cases
1.2 Analysis of Spark Streaming Applications
Chapter 2 Basic Principles of Spark Streaming
2.1 Introduction to Spark Core
2.2 Spark Streaming Design Philosophy
2.3 Spark Streaming Overall Architecture
2.4 Programming Interface
Chapter 3 Spark Streaming Execution Flow in Detail
3.1 From StreamingContext Initialization to Startup
3.2 Data Receiving
3.3 Data Processing
3.4 Data Cleanup
3.5 Fault Tolerance Mechanisms
3.5.1 Fault Tolerance Principles
3.5.2 Driver Fault Tolerance
3.5.3 Executor Fault Tolerance
3.6 The No Receiver Approach
3.7 Output Without Duplication
3.8 Dynamic Control of the Consumption Rate
3.9 Stateful Operations
3.10 Window Operations
3.11 Web Page Display
3.12 Stopping a Spark Streaming Application
Chapter 4 Spark Streaming Performance Tuning Mechanisms
4.1 Parallelism Explained
4.1.1 Parallelism of Data Receiving
4.1.2 Parallelism of Data Processing
4.2 Memory
4.3 Serialization
4.4 Batch Interval
4.5 Task
4.6 JVM GC
Chapter 5 Stream Computing in Spark 2.0
5.1 Continuous Applications
5.2 Unbounded Table
5.3 Incremental Output Mode
5.4 API Simplification
5.5 Other Improvements