Machine learning combines data, hypotheses, and prediction. While the field evolves rapidly with new methods and algorithms, the foundational concepts remain critical. This section provides an overview of the core ideas in machine learning. Because specific methods and tools can become outdated quickly, I strongly encourage you to invest time in studying the mathematical foundations covered in the other sections. A solid grasp of these concepts equips you to understand, adapt, and even create new approaches as the field progresses.

🧠?

Part 1: Intro to Machine Learning

Machine learning, Artificial intelligence (AI), Deep learning, Supervised learning, Unsupervised learning, Learning process of ML, Categories of machine learning

\(\lambda\|w\|_p\)

Part 2: Regularized Regression

Interactive Demo, Ridge regression, Bias-variance tradeoff, Generalization, Regularization, Cross-validation (CV), K-fold cross-validation, Leave-one-out cross-validation (LOOCV), Lasso regression

\(\mathcal{X} \mapsto \mathcal{Y}\)

Part 3: Intro to Classification

Interactive Demo, Binary logistic regression, Sigmoid (logistic) function, Logit (pre-activation), Decision boundary, Feature mapping, Linearly separable, Kernel trick, Random Fourier features, RBF (Gaussian) kernel, Softmax function, Multinomial logistic regression

\( x \mapsto h_\theta(x) \)

Part 4: Neural Networks Basics

Interactive Demo, Deep neural network (DNN), Multilayer perceptron (MLP), Hidden layer, Activation function, ReLU, Vanishing gradients, Backpropagation, Gradient clipping, Exploding gradients, Graphics processing units (GPUs)

\(\nabla \mathcal{L}\)

Part 5: Automatic Differentiation

Code Included, Automatic differentiation (AD), Computational graph

\(w^\top x + w_0\)

Part 6: Support Vector Machine (SVM)

Interactive Demo, Support vector machine (SVM), Soft margin constraints

\(\mathcal{K}(x_i, x_j)\)

Part 7: Principal Component Analysis (PCA) & Autoencoders

Interactive Demo, Principal Component Analysis (PCA), Dimensionality reduction, Kernel PCA, Double centering trick, Autoencoder, Lipschitz continuity, Data Reconstruction, Denoising autoencoder, Manifolds

\(\boldsymbol{f}^\top \mathbf{L} \boldsymbol{f} \)

Part 8: Clustering

Interactive Demo, K-means clustering, Distortion, One-hot encoding (Dummy encoding), K-means++, Vector quantization, Spectral clustering, Graph Laplacian, Dirichlet energy

🧠!

Part 9: Intro to Deep Neural Networks

Interactive Demo, Feedforward networks, Convolutional Neural Networks (CNNs), Residual connection, Layer normalization, Attention, Self-attention, Multi-Head Attention (MHA), Positional encoding, Transformer

🔁

Part 10: Intro to Reinforcement Learning

Reinforcement Learning (RL), Model-based RL, Model-free RL, Agent, Reward, Policy, Markov Decision Process (MDP), Discount factor, Return, Value function, Q-function, Advantage function, Bellman's equations, Value Iteration, Policy Iteration, Temporal Difference Learning, Q-Learning, SARSA, Exploration vs Exploitation, Policy Gradient, REINFORCE, Actor-Critic
