Python for Data Analysis

Python for Data Analysis is a book published by O'Reilly Media in 2013, written by Wes McKinney.

Basic Information

  • Title: Python for Data Analysis
  • Author: Wes McKinney
  • Publisher: O'Reilly Media
  • Publication date: June 16, 2013
  • Pages: 450
  • Binding: Paperback
  • ISBN: 9781549329784

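Summary

This book mainly uses pandas to bridge SciPy and NumPy. Data processing with pandas was a popular topic at PyCon 2012. Another powerful tool is Sage, which integrates many open-source packages behind a unified Python interface.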
Python for Data Analysis is concerned with the nuts and bolts of manipulating, processing, cleaning, and crunching data in Python. It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you’ll need to effectively solve a broad set of data analysis problems. This book is not an exposition on analytical methods using Python as the implementation language.
Written by Wes McKinney, the main author of the pandas library, this hands-on book is packed with practical case studies. It’s ideal for analysts new to Python and for Python programmers new to scientific computing.
  • Use the IPython interactive shell as your primary development environment
  • Learn basic and advanced NumPy (Numerical Python) features
  • Get started with data analysis tools in the pandas library
  • Use high-performance tools to load, clean, transform, merge, and reshape data
  • Create scatter plots and static or interactive visualizations with matplotlib
  • Apply the pandas groupby facility to slice, dice, and summarize datasets
  • Measure data by points in time, whether it’s specific instances, fixed periods, or intervals
  • Learn how to solve problems in web analytics, social sciences, finance, and economics, through detailed examples
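
As a rough illustration of the merge, groupby, and reshape operations mentioned in the list above, here is a minimal sketch using pandas. The two small tables and their column names are hypothetical and are not taken from the book; they only stand in for the kind of data the book works with.

```python
import pandas as pd

# Hypothetical toy data (an assumption for illustration, not from the book).
ratings = pd.DataFrame({
    "user_id": [1, 1, 2, 3],
    "movie_id": [10, 20, 10, 20],
    "rating": [4, 5, 2, 3],
})
movies = pd.DataFrame({
    "movie_id": [10, 20],
    "title": ["Movie A", "Movie B"],
})

# Merge: join the two tables on their shared key column.
merged = ratings.merge(movies, on="movie_id", how="inner")

# Group and summarize: mean rating and number of ratings per title.
summary = merged.groupby("title")["rating"].agg(["mean", "count"])

# Reshape: one row per user, one column per title.
# Users who did not rate a title get a missing value (NaN).
wide = merged.pivot_table(index="user_id", columns="title", values="rating")

print(summary)
print(wide)
```

The same three calls (`merge`, `groupby`, `pivot_table`) apply unchanged to larger tables; only the column names would differ.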

Table of Contents

Chapter 1 Preliminaries
What Is This Book About?
Why Python for Data Analysis?
Essential Python Libraries
Installation and Setup
Community and Conferences
Navigating This Book
Acknowledgements
Chapter 2 Introductory Examples
1.usa.gov data from bit.ly
MovieLens 1M Data Set
US Baby Names 1880-2010
Conclusions and The Path Ahead
Chapter 3 IPython: An Interactive Computing and Development Environment
IPython Basics
Using the Command History
Interacting with the Operating System
Software Development Tools
IPython HTML Notebook
Tips for Productive Code Development Using IPython
Advanced IPython Features
Credits
Chapter 4 NumPy Basics: Arrays and Vectorized Computation
The NumPy ndarray: A Multidimensional Array Object
Universal Functions: Fast Element-wise Array Functions
Data Processing Using Arrays
File Input and Output with Arrays
Linear Algebra
Random Number Generation
Example: Random Walks
Chapter 5 Getting Started with pandas
Introduction to pandas Data Structures
Essential Functionality
Summarizing and Computing Descriptive Statistics
Handling Missing Data
Hierarchical Indexing
Other pandas Topics
Chapter 6 Data Loading, Storage, and File Formats
Reading and Writing Data in Text Format
Binary Data Formats
Interacting with HTML and Web APIs
Interacting with Databases
Chapter 7 Data Wrangling: Clean, Transform, Merge, Reshape
Combining and Merging Data Sets
Reshaping and Pivoting
Data Transformation
String Manipulation
Example: USDA Food Database
Chapter 8 Plotting and Visualization
A Brief matplotlib API Primer
Plotting Functions in pandas
Plotting Maps: Visualizing Haiti Earthquake Crisis Data
Python Visualization Tool Ecosystem
Chapter 9 Data Aggregation and Group Operations
GroupBy Mechanics
Data Aggregation
Group-wise Operations and Transformations
Pivot Tables and Cross-Tabulation
Example: 2012 Federal Election Commission Database
Chapter 10 Time Series
Date and Time Data Types and Tools
Time Series Basics
Date Ranges, Frequencies, and Shifting
Time Zone Handling
Periods and Period Arithmetic
Resampling and Frequency Conversion
Time Series Plotting
Moving Window Functions
Performance and Memory Usage Notes
Chapter 11 Financial and Economic Data Applications
Data Munging Topics
Group Transforms and Analysis
More Example Applications
Chapter 12 Advanced NumPy
ndarray Object Internals
Advanced Array Manipulation
Broadcasting
Advanced ufunc Usage
Structured and Record Arrays
More About Sorting
NumPy Matrix Class
Advanced Array Input and Output
Performance Tips
Appendix Python Language Essentials
The Python Interpreter
The Basics
Data Structures and Sequences
Functions
Files and the operating system
· · · · · ·

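About the Author

Wes McKinney is a senior data analysis expert who has studied Python libraries such as NumPy, pandas, matplotlib, and IPython in depth and has accumulated extensive practical experience. He has written many well-regarded articles on data analysis with Python that have been widely reposted across technical communities, and he is recognized as an authority in the Python and open-source communities. He developed pandas, the well-known open-source Python library for data analysis, which has been well received by users. Before founding Lambda Foundry (a company focused on enterprise data analysis), he was a quantitative analyst at AQR Capital Management.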
