Description
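Machine learning has become hugely popular for making powerful and fast predictions from big data. The real strength behind that output, however, comes from complex algorithms that involve substantial statistical analysis and use big data as the driver to produce substantive insight. The second edition of Machine Learning Algorithms (2nd edition, reprint, English edition) walks you through the notable development outcomes related to the major algorithms in the machine learning process, and helps you strengthen and master statistical interpretation across areas such as supervised, semi-supervised, and reinforcement learning. Once you have thoroughly grasped the core concepts of an algorithm, you will explore real-world examples built on widely used libraries such as scikit-learn, NLTK, TensorFlow, and Keras. You will also discover new topics such as principal component analysis (PCA), independent component analysis (ICA), Bayesian regression, discriminant analysis, advanced clustering, and Gaussian mixtures.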
Table of Contents
Preface
Chapter 1: A Gentle Introduction to Machine Learning
Introduction - classic and adaptive machines
Descriptive analysis
Predictive analysis
Only learning matters
Supervised learning
Unsupervised learning
Semi-supervised learning
Reinforcement learning
Computational neuroscience
Beyond machine learning - deep learning and bio-inspired adaptive systems
Machine learning and big data
Summary
Chapter 2: Important Elements in Machine Learning
Data formats
Multiclass strategies
One-vs-all
One-vs-one
Learnability
Underfitting and overfitting
Error measures and cost functions
PAC learning
Introduction to statistical learning concepts
MAP learning
Maximum likelihood learning
Class balancing
Resampling with replacement
SMOTE resampling
Elements of information theory
Entropy
Cross-entropy and mutual information
Divergence measures between two probability distributions
Summary
Chapter 3: Feature Selection and Feature Engineering
scikit-learn toy datasets
Creating training and test sets
Managing categorical data
Managing missing features
Data scaling and normalization
Whitening
Feature selection and filtering
Principal Component Analysis
Non-Negative Matrix Factorization
Sparse PCA
Kernel PCA
Independent Component Analysis
Atom extraction and dictionary learning
Visualizing high-dimensional datasets using t-SNE
Summary
Chapter 4: Regression Algorithms
Linear models for regression
A bidimensional example
Linear regression with scikit-learn and higher dimensionality
R² score
Explained variance
Regressor analytic expression
Ridge, Lasso, and ElasticNet
Ridge
Lasso
ElasticNet
Robust regression
RANSAC
Huber regression
Bayesian regression
Polynomial regression
Isotonic regression
Summary
Chapter 5: Linear Classification Algorithms
Linear classification
Logistic regression
Implementation and optimizations
Stochastic gradient descent algorithms
Passive-aggressive algorithms
Passive-aggressive regression
Finding the optimal hyperparameters through a grid search
Classification metrics
Confusion matrix
Precision
Recall
F-Beta
Cohen's Kappa
Global classification report
Learning curve
ROC curve
Summary
Chapter 6: Naive Bayes and Discriminant Analysis
Bayes' theorem
Naive Bayes classifiers
Naive Bayes in scikit-learn
Bernoulli Naive Bayes
Multinomial Naive Bayes
An example of Multinomial Naive Bayes for text classification
Gaussian Naive Bayes
Discriminant analysis
Summary
Chapter 7: Support Vector Machines
Linear SVM
SVMs with scikit-learn
Linear classification
Kernel-based classification
Radial Basis Function
Polynomial kernel
Sigmoid kernel
Custom kernels
Non-linear examples
ν-Support Vector Machines
Support Vector Regression
An example of SVR with the Airfoil Self-Noise dataset
Introducing semi-supervised Support Vector Machines (S3VM)
Summary
Chapter 8: Decision Trees and Ensemble Learning
Binary Decision Trees
Binary decisions
Impurity measures
Gini impurity index
Cross-entropy impurity index
Misclassification impurity index
Feature importance
Decision Tree classification with scikit-learn
Decision Tree regression
Example of Decision Tree regression with the Concrete Compressive Strength dataset
Introduction to Ensemble Learning
Random Forests
Feature importance in Random Forests
AdaBoost
Gradient Tree Boosting
Voting classifier
Summary
Chapter 9: Clustering Fundamentals
Clustering basics
k-NN
Gaussian mixture
Finding the optimal number of components
K-means
Finding the optimal number of clusters
Optimizing the inertia
Silhouette score
Calinski-Harabasz index
Cluster instability
Evaluation methods based on the ground truth
Homogeneity
Completeness
Adjusted Rand Index
Summary
Chapter 10: Advanced Clustering
DBSCAN
Spectral Clustering
Online Clustering
Mini-batch K-means
BIRCH
Biclustering
Summary
Chapter 11: Hierarchical Clustering
Hierarchical strategies
Agglomerative Clustering
Dendrograms
Agglomerative Clustering in scikit-learn
Connectivity constraints
Summary
Chapter 12: Introducing Recommendation Systems
Naive user-based systems
Implementing a user-based system with scikit-learn
Content-based systems
Model-free (or memory-based) collaborative filtering
Model-based collaborative filtering
Singular value decomposition strategy
Alternating least squares strategy
ALS with Apache Spark MLlib
Summary
Chapter 13: Introducing Natural Language Processing
NLTK and built-in corpora
Corpora examples
The Bag-of-Words strategy
Tokenizing
Sentence tokenizing
Word tokenizing
Stopword removal
Language detection
Stemming
Vectorizing
Count vectorizing
N-grams
TF-IDF vectorizing
Part-of-Speech
Named Entity Recognition
A sample text classifier based on the Reuters corpus
Summary
Chapter 14: Topic Modeling and Sentiment Analysis in NLP
Topic modeling
Latent Semantic Analysis
Probabilistic Latent Semantic Analysis
Latent Dirichlet Allocation
Introducing Word2vec with Gensim
Sentiment analysis
VADER sentiment analysis with NLTK
Summary
Chapter 15: Introducing Neural Networks
Deep learning at a glance
Artificial neural networks
MLPs with Keras
Interfacing Keras to scikit-learn
Summary
Chapter 16: Advanced Deep Learning Models
Deep model layers
Fully connected layers
Convolutional layers
Dropout layers
Batch normalization layers
Recurrent Neural Networks
An example of a deep convolutional network with Keras
An example of an LSTM network with Keras
A brief introduction to TensorFlow
Computing gradients
Logistic regression
Classification with a multilayer perceptron
Image convolution
Summary
Chapter 17: Creating a Machine Learning Architecture
Machine learning architectures
Data collection
Normalization and regularization
Dimensionality reduction
Data augmentation
Data conversion
Modeling/grid search/cross-validation
Visualization
GPU support
A brief introduction to distributed architectures
Scikit-learn tools for machine learning architectures
Pipelines
Feature unions
Summary
Other Books You May Enjoy
Index