Basic Introduction
Synopsis
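Synopsis of Python for Data Analysis (Reprint Edition): Are you looking for a complete guide to manipulating, processing, cleaning, and crunching structured data in Python? Python for Data Analysis (Reprint Edition) contains many case studies and uses several Python libraries, including NumPy, pandas, matplotlib, and IPython, to show you how to solve a wide range of data analysis problems efficiently.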
About the Author
Author: Wes McKinney (United States)
Table of Contents
Preface
1.Preliminaries
What Is This Book About?
Why Python for Data Analysis?
Python as Glue
Solving the "Two-Language" Problem
Why Not Python?
Essential Python Libraries
NumPy
pandas
matplotlib
IPython
SciPy
Installation and Setup
Windows
Apple OS X
GNU/Linux
Python 2 and Python 3
Integrated Development Environments (IDEs)
Community and Conferences
Navigating This Book
Code Examples
Data for Examples
Import Conventions
Jargon
Acknowledgements
2.Introductory Examples
1.usa.gov data from bit.ly
Counting Time Zones in Pure Python
Counting Time Zones with pandas
MovieLens 1M Data Set
Measuring rating disagreement
US Baby Names 1880-2010
Analyzing Naming Trends
Conclusions and The Path Ahead
3.IPython: An Interactive Computing and Development Environment
IPython Basics
Tab Completion
Introspection
The %run Command
Executing Code from the Clipboard
Keyboard Shortcuts
Exceptions and Tracebacks
Magic Commands
Qt-based Rich GUI Console
Matplotlib Integration and Pylab Mode
Using the Command History
Searching and Reusing the Command History
Input and Output Variables
Logging the Input and Output
Interacting with the Operating System
Shell Commands and Aliases
Directory Bookmark System
Software Development Tools
Interactive Debugger
Timing Code: %time and %timeit
Basic Profiling: %prun and %run -p
Profiling a Function Line-by-Line
IPython HTML Notebook
Tips for Productive Code Development Using IPython
Reloading Module Dependencies
Code Design Tips
Advanced IPython Features
Making Your Own Classes IPython-friendly
Profiles and Configuration
Credits
4.NumPy Basics: Arrays and Vectorized Computation
The NumPy ndarray: A Multidimensional Array Object
Creating ndarrays
Data Types for ndarrays
Operations between Arrays and Scalars
Basic Indexing and Slicing
Boolean Indexing
Fancy Indexing
Transposing Arrays and Swapping Axes
Universal Functions: Fast Element-wise Array Functions
Data Processing Using Arrays
Expressing Conditional Logic as Array Operations
Mathematical and Statistical Methods
Methods for Boolean Arrays
Sorting
Unique and Other Set Logic
File Input and Output with Arrays
Storing Arrays on Disk in Binary Format
Saving and Loading Text Files
Linear Algebra
Random Number Generation
Example: Random Walks
Simulating Many Random Walks at Once
5.Getting Started with pandas
Introduction to pandas Data Structures
Series
DataFrame
Index Objects
Essential Functionality
Reindexing
Dropping entries from an axis
Indexing, selection, and filtering
Arithmetic and data alignment
Function application and mapping
Sorting and ranking
Axis indexes with duplicate values
Summarizing and Computing Descriptive Statistics
Correlation and Covariance
Unique Values, Value Counts, and Membership
Handling Missing Data
Filtering Out Missing Data
Filling in Missing Data
Hierarchical Indexing
Reordering and Sorting Levels
Summary Statistics by Level
Using a DataFrame's Columns
Other pandas Topics
Integer Indexing
Panel Data
6.Data Loading, Storage, and File Formats
Reading and Writing Data in Text Format
Reading Text Files in Pieces
Writing Data Out to Text Format
Manually Working with Delimited Formats
JSON Data
XML and HTML: Web Scraping
Binary Data Formats
Using HDF5 Format
Reading Microsoft Excel Files
Interacting with HTML and Web APIs
Interacting with Databases
Storing and Loading Data in MongoDB
7.Data Wrangling: Clean, Transform, Merge, Reshape
Combining and Merging Data Sets
Database-style DataFrame Merges
Merging on Index
Concatenating Along an Axis
Combining Data with Overlap
Reshaping and Pivoting
Reshaping with Hierarchical Indexing
Pivoting "long" to "wide" Format
Data Transformation
Removing Duplicates
Transforming Data Using a Function or Mapping
Replacing Values
Renaming Axis Indexes
Discretization and Binning
Detecting and Filtering Outliers
Permutation and Random Sampling
Computing Indicator/Dummy Variables
String Manipulation
String Object Methods
Regular expressions
Vectorized string functions in pandas
Example: USDA Food Database
……
8.Plotting and Visualization
9.Data Aggregation and Group Operations
10.Time Series
11.Financial and Economic Data Applications
12.Advanced NumPy
Appendix: Python Language Essentials
Index
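Endorsements
"The scientific and data analysis communities have been waiting for this book for years: it offers concrete, practical advice along with insight into how all the pieces fit together. It should become a classic reference for scientific computing in Python for years to come."
- Fernando Perez, assistant researcher at UC Berkeley and one of the original creators of IPython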