machine_learning_from_scratch_matlab_python
Vectorized Machine Learning in Python From Scratch
Machine Learning in Python From Scratch
A vectorized Python implementation using only NumPy, SciPy, and Matplotlib. It follows as closely as possible the Octave/MATLAB code, both provided and personally completed, from Stanford University's excellent Machine Learning course on Coursera. The course is taught by Andrew Ng, a genius and an excellent popularizer, which is a rare combination.
This course helped me write a blog post answering the following question: What is Machine Learning?
Supervised Learning
Given a set of labeled observations, find a function f which can be used to assign a class or value to unseen observations. Predictions should be similar to real labels.
Regression
In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function.
1. Linear regression with one variable to predict profits for a food truck
2. Regularized Linear regression with multiple variables to predict the prices of houses
- Demo | Linear Regression with multiple variables Notebook
- ▶️ Demo | Linear Regression with multiple variables Matlab
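As a rough illustration of what "vectorized" means for the regression exercises above, here is a minimal NumPy sketch of regularized linear regression trained by batch gradient descent. It is not taken from the repository's notebooks: the function names, the learning rate `alpha`, and the synthetic data are assumptions made for this example.

```python
import numpy as np

def linear_cost(theta, X, y, lam=0.0):
    """Regularized squared-error cost; the bias term theta[0] is not penalized."""
    m = y.size
    residual = X @ theta - y
    penalty = lam * (theta[1:] @ theta[1:])
    return (residual @ residual + penalty) / (2 * m)

def gradient_descent(X, y, alpha=0.5, lam=0.0, iters=2000):
    """Batch gradient descent with one matrix product per step, no loop over samples."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        grad = X.T @ (X @ theta - y) / m
        grad[1:] += (lam / m) * theta[1:]  # regularize every weight except the bias
        theta -= alpha * grad
    return theta

# Synthetic, noise-free data (an assumption for this sketch): y = 2 + 3x
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=50)
X = np.column_stack([np.ones_like(x), x])  # prepend the intercept column
y = 2.0 + 3.0 * x

theta = gradient_descent(X, y)
print(theta, linear_cost(theta, X, y))
```

With real data such as house prices, the features would normally be mean-normalized and scaled first so that a single learning rate works for every column.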
Classification
In a classification problem, we instead are trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories.
3. Regularized logistic regression to predict whether microchips pass quality assurance (QA)
- Demo | Regularized Logistic Regression Notebook
- ▶️ Demo | Regularized Logistic Regression Matlab
4. Multi-class Logistic regression to recognize handwritten digits
- Demo | Multi-class Logistic regression Notebook
- ▶️ Demo | Multi-class Logistic regression Matlab
5. Neural Networks (MLP) to recognize handwritten digits
- Demo | Neural Networks Notebook Part I, Demo | Neural Networks Notebook Part II
- ▶️ Demo | Neural Networks Matlab Part I, Demo | Neural Networks Matlab Part II
6. Support Vector Machines (SVM), with and without Gaussian kernels
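The regularized logistic regression exercise in this list can be sketched as a single function returning both the cost and its gradient, which is the form most optimizers expect. This is a hedged illustration rather than the repository's code: `logistic_cost_grad`, the small `eps` guard inside the logarithms, and the toy data are assumptions.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def logistic_cost_grad(theta, X, y, lam=0.0):
    """Regularized cross-entropy cost and gradient; theta[0] (bias) is not penalized."""
    m = y.size
    h = sigmoid(X @ theta)
    eps = 1e-12  # assumption: guards against log(0) for saturated predictions
    cost = -(y @ np.log(h + eps) + (1.0 - y) @ np.log(1.0 - h + eps)) / m
    cost += lam / (2 * m) * (theta[1:] @ theta[1:])
    grad = X.T @ (h - y) / m
    grad[1:] += (lam / m) * theta[1:]
    return cost, grad

# Toy data (hypothetical): intercept column plus two features
X = np.array([[1.0, 0.5, -1.2],
              [1.0, -0.3, 0.8],
              [1.0, 1.5, 0.1],
              [1.0, -2.0, -0.4]])
y = np.array([1.0, 0.0, 1.0, 0.0])

cost0, grad0 = logistic_cost_grad(np.zeros(3), X, y, lam=1.0)
print(cost0, grad0)
```

At `theta = 0` every prediction is 0.5, so the cost is ln 2 whatever the data, which makes a convenient sanity check. For the multi-class digit task, one such classifier would be trained per digit (one-vs-all) and the class with the highest probability chosen.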
Metrics to evaluate ML algorithms
Tackling Overfitting and Underfitting problems.
7. High Bias vs High Variance
Unsupervised Learning
Labeling is usually done by humans and can be tedious and slow, and sometimes there are no real labels to compare predictions against. Unsupervised learning allows us to approach problems with little or no idea of what our results should look like. We can derive structure from data where we don't necessarily know the effect of the variables, by clustering the data based on relationships among those variables. With unsupervised learning there is no feedback based on the prediction results.
Clustering
Group objects into clusters so that objects are similar within a cluster and dissimilar between clusters.
8. K-means clustering algorithm for image compression
- Demo | K-means Notebook
- ▶️ Demo | K-means Matlab
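A minimal sketch of how K-means can compress an image: each pixel is treated as a point in RGB space, the K centroids become the palette, and every pixel is replaced by its nearest centroid. The function names, the random initialization, and the random 8x8 "image" are assumptions for illustration, not the notebook's code.

```python
import numpy as np

def assign_clusters(X, centroids):
    """Index of the nearest centroid for every row of X (squared Euclidean distance)."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def update_centroids(X, labels, centroids):
    """Move each centroid to the mean of its members; empty clusters stay where they are."""
    new = centroids.copy()
    for k in range(centroids.shape[0]):
        members = X[labels == k]
        if members.size:
            new[k] = members.mean(axis=0)
    return new

def kmeans(X, K, iters=10, seed=0):
    """Alternate assignment and update steps, starting from K distinct random samples."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(X.shape[0], size=K, replace=False)].astype(float)
    for _ in range(iters):
        labels = assign_clusters(X, centroids)
        centroids = update_centroids(X, labels, centroids)
    return centroids, assign_clusters(X, centroids)

# Stand-in for a real picture (assumption): a random 8x8 RGB image
rng = np.random.default_rng(1)
image = rng.integers(0, 256, size=(8, 8, 3)).astype(float)
pixels = image.reshape(-1, 3)

centroids, labels = kmeans(pixels, K=4)
compressed = centroids[labels].reshape(image.shape)  # at most 4 distinct colors remain
```

The saving comes from storing one small palette index per pixel instead of three color channels, plus the K palette entries themselves.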
Dimensionality reduction
Reduce the number of dimensions of a data set. Used for data compression or big data visualization.
9. Principal Component Analysis (PCA) to perform dimensionality reduction
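One common way to implement PCA with NumPy alone is through the singular value decomposition of the covariance matrix. The sketch below assumes that approach; the helper names and the toy data lying on a straight line are hypothetical.

```python
import numpy as np

def feature_normalize(X):
    """Center each feature and scale it to unit sample standard deviation."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0, ddof=1)
    return (X - mu) / sigma, mu, sigma

def pca(X_norm):
    """Principal directions (columns of U) and variances S of the covariance matrix."""
    m = X_norm.shape[0]
    Sigma = X_norm.T @ X_norm / m
    U, S, _ = np.linalg.svd(Sigma)
    return U, S

def project(X_norm, U, k):
    """Coordinates of the data in the space of the first k principal directions."""
    return X_norm @ U[:, :k]

def recover(Z, U, k):
    """Approximate reconstruction in the original (normalized) feature space."""
    return Z @ U[:, :k].T

# Toy data (assumption): 2-D points lying exactly on a line
t = np.linspace(-1.0, 1.0, 20)
X = np.column_stack([t, 2.0 * t]) + 5.0

X_norm, mu, sigma = feature_normalize(X)
U, S = pca(X_norm)
Z = project(X_norm, U, k=1)
X_rec = recover(Z, U, k=1)
retained = S[:1].sum() / S.sum()  # fraction of variance kept with k = 1
```

Because the toy points are perfectly collinear, one component keeps essentially all of the variance. On real data, `retained` is the usual guide for choosing k.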
Anomaly detection
Identifies rare items (outliers) which raise suspicions by differing significantly from the majority of the data.
10. Anomaly detection algorithm to detect anomalous behavior in server computers of a data center
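A simple density-based detector fits an independent Gaussian to each feature and flags examples whose joint density falls below a threshold. The sketch below assumes that model; the feature meanings, the synthetic data, and the threshold `epsilon` are hypothetical, and in practice the threshold would be tuned on a labeled validation set (for example by maximizing the F1 score).

```python
import numpy as np

def estimate_gaussian(X):
    """Per-feature mean and variance (variance normalized by m)."""
    return X.mean(axis=0), X.var(axis=0)

def gaussian_density(X, mu, sigma2):
    """Product of univariate Gaussian densities, one factor per feature."""
    coef = 1.0 / np.sqrt(2.0 * np.pi * sigma2)
    expo = np.exp(-((X - mu) ** 2) / (2.0 * sigma2))
    return np.prod(coef * expo, axis=1)

# Synthetic "normal" server behavior (assumption): two features per machine
rng = np.random.default_rng(0)
X_train = rng.normal(loc=[14.0, 15.0], scale=[1.0, 1.5], size=(300, 2))
mu, sigma2 = estimate_gaussian(X_train)

# One typical machine and one clearly unusual machine
X_new = np.array([[14.0, 15.0],
                  [25.0, 2.0]])
p = gaussian_density(X_new, mu, sigma2)

epsilon = 1e-4            # hypothetical threshold, not tuned
is_anomaly = p < epsilon  # True marks a suspicious example
print(p, is_anomaly)
```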
Recommender System
Predicts the rating or preference a user would give to an item.
11. Collaborative filtering recommender system applied to a dataset of movie ratings
- Demo | Collaborative filtering recommender system Notebook
- ▶️ Demo | Collaborative filtering recommender system Matlab
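The collaborative filtering objective can be written without any loops over users or movies. The sketch below assumes the usual low-rank formulation: movie features `X`, user parameters `Theta`, a ratings matrix `Y`, and a mask `R` that is 1 where a rating exists. The names, shapes, and random data are assumptions for illustration.

```python
import numpy as np

def cofi_cost_grad(X, Theta, Y, R, lam=0.0):
    """Regularized squared error over observed ratings, plus gradients.

    X: (movies, features), Theta: (users, features), Y and R: (movies, users).
    """
    error = (X @ Theta.T - Y) * R  # unrated entries contribute nothing
    cost = 0.5 * np.sum(error ** 2)
    cost += 0.5 * lam * (np.sum(X ** 2) + np.sum(Theta ** 2))
    X_grad = error @ Theta + lam * X
    Theta_grad = error.T @ X + lam * Theta
    return cost, X_grad, Theta_grad

# Small random problem (assumption): 5 movies, 4 users, 3 latent features
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
Theta = rng.normal(size=(4, 3))
Y = rng.integers(1, 6, size=(5, 4)).astype(float)  # ratings from 1 to 5
R = (rng.random((5, 4)) < 0.7).astype(float)       # roughly 70% of entries rated

cost, X_grad, Theta_grad = cofi_cost_grad(X, Theta, Y, R, lam=1.5)
print(cost)
```

After training, the predicted rating of movie i by user j is the entry `(X @ Theta.T)[i, j]`. Ratings are often mean-normalized per movie beforehand so that users with no ratings receive the movie's average.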