Machine learning

Contents:


Introduction

1.1 What Is Machine Learning?
1.2 When Do We Need Machine Learning?
1.3 Types of Learning
1.4 Relations to Other Fields 

1.5 How to Read This Book 

      1.5.1 Possible Course Plans Based on This Book 
1.6 Notation 

Part I Foundations 

2 A Gentle Start 
2.1 A Formal Model – The Statistical Learning Framework 
2.2 Empirical Risk Minimization 
       2.2.1 Something May Go Wrong 
2.3 Empirical Risk Minimization with Inductive Bias 
      2.3.1 Finite Hypothesis Classes 

3 A Formal Learning Model 
3.1 PAC Learning 
3.2 A More General Learning Model 
    3.2.1 Releasing the Realizability Assumption 
    3.2.2 The Scope of Learning Problems Modeled 

4 Learning via Uniform Convergence 
4.1 Uniform Convergence Is Sufficient for Learnability 
4.2 Finite Classes Are Agnostic PAC Learnable

5 The Bias-Complexity Tradeoff 
5.1 The No-Free-Lunch Theorem 
5.1.1 No-Free-Lunch and Prior Knowledge 
5.2 Error Decomposition 

6 The VC-Dimension 
6.1 Infinite-Size Classes Can Be Learnable 
6.2 The VC-Dimension 

6.3 Threshold Functions 
       6.3.1 Intervals 
       6.3.2 Axis Aligned Rectangles 
       6.3.3 Finite Classes 
       6.3.4 VC-Dimension and the Number of Parameters 
6.4 The Fundamental Theorem of PAC learning 
6.5 Proof of Theorem 
      6.5.1 Sauer's Lemma and the Growth Function 
      6.5.2 Uniform Convergence for Classes of Small Effective Size 

7 Nonuniform Learnability 
7.1 Nonuniform Learnability 
      7.1.1 Characterizing Nonuniform Learnability 
7.2 Structural Risk Minimization 
7.3 Minimum Description Length and Occam's Razor 
      7.3.1 Occam's Razor 
7.4 Other Notions of Learnability – Consistency 
7.5 Discussing the Different Notions of Learnability 
      7.5.1 The No-Free-Lunch Theorem Revisited

8 The Runtime of Learning
8.1 Formal Definition* 
8.2 Implementing the ERM Rule 
      8.2.1 Finite Classes 
      8.2.2 Axis Aligned Rectangles
      8.2.3 Boolean Conjunctions 
      8.2.4 Learning 3-Term DNF 
8.3 Efficiently Learnable, but Not by a Proper ERM 
8.4 Hardness of Learning*

Part II From Theory to Algorithms 


9 Linear Predictors 
9.1 Halfspaces
       9.1.1 Linear Programming for the Class of Halfspaces
       9.1.2 Perceptron for Halfspaces
       9.1.3 The VC Dimension of Halfspaces
9.2 Linear Regression 
       9.2.1 Least Squares
       9.2.2 Linear Regression for Polynomial Regression Tasks 
9.3 Logistic Regression 

10 Boosting
 10.1 Weak Learnability 
     10.1.1 Efficient Implementation of ERM for Decision Stumps 
10.2 AdaBoost 
10.3 Linear Combinations of Base Hypotheses 
        10.3.1 The VC-Dimension of L(B, T)
10.4 AdaBoost for Face Recognition

11 Model Selection and Validation 
11.1 Model Selection Using SRM 
11.2 Validation
          11.2.1 Hold Out Set
           11.2.2 Validation for Model Selection
           11.2.3 The Model-Selection Curve
           11.2.4 k-Fold Cross Validation
           11.2.5 Train-Validation-Test Split
11.3 What to Do If Learning Fails 

12 Convex Learning Problems 
12.1 Convexity, Lipschitzness, and Smoothness 
        12.1.1 Convexity
        12.1.2 Lipschitzness
        12.1.3 Smoothness
12.2 Convex Learning Problems
        12.2.1 Learnability of Convex Learning Problems 
        12.2.2 Convex-Lipschitz/Smooth-Bounded Learning Problems 
12.3 Surrogate Loss Functions 

13 Regularization and Stability 
13.1 Regularized Loss Minimization
         13.1.1 Ridge Regression
13.2 Stable Rules Do Not Overfit
13.3 Tikhonov Regularization as a Stabilizer
         13.3.1 Lipschitz Loss
         13.3.2 Smooth and Nonnegative Loss
13.4 Controlling the Fitting-Stability Tradeoff

14 Stochastic Gradient Descent
14.1 Gradient Descent 
        14.1.1 Analysis of GD for Convex-Lipschitz Functions 
14.2 Subgradients 
         14.2.1 Calculating Subgradients 
         14.2.2 Subgradients of Lipschitz Functions 
         14.2.3 Subgradient Descent 
14.3 Stochastic Gradient Descent (SGD)
         14.3.1 Analysis of SGD for Convex-Lipschitz-Bounded Functions
14.4 Variants 
          14.4.1 Adding a Projection Step
          14.4.2 Variable Step Size 
          14.4.3 Other Averaging Techniques 
          14.4.4 Strongly Convex Functions*
14.5 Learning with SGD
          14.5.1 SGD for Risk Minimization
          14.5.2 Analyzing SGD for Convex-Smooth Learning Problems
          14.5.3 SGD for Regularized Loss Minimization 

15 Support Vector Machines
15.1 Margin and Hard-SVM 
       15.1.1 The Homogenous Case
       15.1.2 The Sample Complexity of Hard-SVM
15.2 Soft-SVM and Norm Regularization 
       15.2.1 The Sample Complexity of Soft-SVM
       15.2.2 Margin and Norm-Based Bounds versus Dimension 
       15.2.3 The Ramp Loss*
15.3 Optimality Conditions and "Support Vectors"* 
15.4 Duality*
15.5 Implementing Soft-SVM Using SGD 

16 Kernel Methods 
16.1 Embeddings into Feature Spaces 
16.2 The Kernel Trick 
        16.2.1 Kernels as a Way to Express Prior Knowledge 
         16.2.2 Characterizing Kernel Functions* 
16.3 Implementing Soft-SVM with Kernels 

17 Multiclass, Ranking, and Complex Prediction Problems 
17.1 One-versus-All and All-Pairs 
17.2 Linear Multiclass Predictors 
         17.2.1 How to Construct Ψ
         17.2.2 Cost-Sensitive Classification 
         17.2.3 ERM 
         17.2.4 Generalized Hinge Loss 
         17.2.5 Multiclass SVM and SGD 
17.3 Structured Output Prediction 
17.4 Ranking 
          17.4.1 Linear Predictors for Ranking 
17.5 Bipartite Ranking and Multivariate Performance Measures 
          17.5.1 Linear Predictors for Bipartite Ranking 

18 Decision Trees 
18.1 Sample Complexity 
18.2 Decision Tree Algorithms
       18.2.1 Implementations of the Gain Measure
       18.2.2 Pruning 
       18.2.3 Threshold-Based Splitting Rules for Real-Valued Features 
18.3 Random Forests 

19 Nearest Neighbor 
19.1 k Nearest Neighbors 
19.2 Analysis 
         19.2.1 A Generalization Bound for the 1-NN Rule 
         19.2.2 The "Curse of Dimensionality" 
19.3 Efficient Implementation* 

20 Neural Networks 
20.1 Feedforward Neural Networks 
20.2 Learning Neural Networks 
20.3 The Expressive Power of Neural Networks 
         20.3.1 Geometric Intuition 
20.4 The Sample Complexity of Neural Networks 
20.5 The Runtime of Learning Neural Networks 
20.6 SGD and Backpropagation 


Part III Additional Learning Models 

21 Online Learning 
21.1 Online Classification in the Realizable Case 
        21.1.1 Online Learnability 
21.2 Online Classification in the Unrealizable Case 
        21.2.1 Weighted-Majority 
21.3 Online Convex Optimization 
21.4 The Online Perceptron Algorithm 

22 Clustering 
22.1 Linkage-Based Clustering Algorithms 
22.2 k-Means and Other Cost Minimization Clusterings 
          22.2.1 The k-Means Algorithm 
22.3 Spectral Clustering 
          22.3.1 Graph Cut 
          22.3.2 Graph Laplacian and Relaxed Graph Cuts 
          22.3.3 Unnormalized Spectral Clustering 
22.4 Information Bottleneck* 
22.5 A High Level View of Clustering 

23 Dimensionality Reduction 
23.1 Principal Component Analysis (PCA) 
        23.1.1 A More Efficient Solution for the Case d ≫ m 
        23.1.2 Implementation and Demonstration 
23.2 Random Projections 
23.3 Compressed Sensing 
         23.3.1 Proofs* 
23.4 PCA or Compressed Sensing? 

24 Generative Models 

24.1 Maximum Likelihood Estimator 
          24.1.1 Maximum Likelihood Estimation for Continuous Random Variables 
          24.1.2 Maximum Likelihood and Empirical Risk Minimization 
          24.1.3 Generalization Analysis 
24.2 Naive Bayes 
24.3 Linear Discriminant Analysis 
24.4 Latent Variables and the EM Algorithm 
          24.4.1 EM as an Alternate Maximization Algorithm 
          24.4.2 EM for Mixture of Gaussians (Soft k-Means) 
24.5 Bayesian Reasoning 

25 Feature Selection and Generation 
25.1 Feature Selection 
         25.1.1 Filters 
         25.1.2 Greedy Selection Approaches 
         25.1.3 Sparsity-Inducing Norms 
25.2 Feature Manipulation and Normalization 
          25.2.1 Examples of Feature Transformations 
25.3 Feature Learning 
          25.3.1 Dictionary Learning Using Auto-Encoders 

Part IV Advanced Theory 

26 Rademacher Complexities 
26.1 The Rademacher Complexity 
         26.1.1 Rademacher Calculus 
26.2 Rademacher Complexity of Linear Classes 
26.3 Generalization Bounds for SVM 
26.4 Generalization Bounds for Predictors with Low ℓ1 Norm 

27 Covering Numbers 
27.1 Covering 
        27.1.1 Properties 
27.2 From Covering to Rademacher Complexity via Chaining 

28 Proof of the Fundamental Theorem of Learning Theory 
28.1 The Upper Bound for the Agnostic Case 
28.2 The Lower Bound for the Agnostic Case 
          28.2.1 Showing That m(ε, δ) ≥ 0.5 log(1/(4δ))/ε² 
          28.2.2 Showing That m(ε, 1/8) ≥ 8d/ε²
28.3 The Upper Bound for the Realizable Case 
          28.3.1 From ε-Nets to PAC Learnability 

29 Multiclass Learnability
29.1 The Natarajan Dimension 

29.2 The Multiclass Fundamental Theorem 
          29.2.1 On the Proof of Theorem
29.3 Calculating the Natarajan Dimension 
          29.3.1 One-versus-All Based Classes 
          29.3.2 General Multiclass-to-Binary Reductions 
          29.3.3 Linear Multiclass Predictors 
29.4 On Good and Bad ERMs 

30 Compression Bounds 
30.1 Compression Bounds 

30.2 Axis Aligned Rectangles
        30.2.1 Halfspaces 
        30.2.2 Separating Polynomials 
        30.2.3 Separation with Margin 

31 PAC-Bayes 
31.1 PAC-Bayes Bounds
