📋 ML Fundamentals Cheat Sheet

Core machine learning concepts and terminology for the AIF-C01 exam.

Types of Learning

  • Supervised: learn from labeled data (classification, regression).
  • Unsupervised: find patterns in unlabeled data (clustering, dimensionality reduction).
  • Reinforcement: an agent learns through trial and error, guided by rewards and penalties.
  • Self-supervised: derive labels from the unlabeled data itself, e.g. predicting the next token (used in pre-training foundation models, FMs).
  • Semi-supervised: combine small labeled set with large unlabeled set.
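
To make the self-supervised idea concrete, here is a minimal sketch of how training labels can be derived from raw text alone. The function name `next_token_pairs` and the whitespace tokenization are illustrative assumptions; real foundation models use subword tokenizers and far larger contexts.

```python
def next_token_pairs(text, context_size=3):
    """Turn unlabeled text into (context, label) pairs.

    The "label" is simply the next token, so no human annotation is needed.
    Whitespace splitting is a simplification used only for illustration.
    """
    tokens = text.split()
    pairs = []
    for i in range(context_size, len(tokens)):
        context = tokens[i - context_size:i]  # input features
        label = tokens[i]                     # target derived from the data itself
        pairs.append((context, label))
    return pairs


for context, label in next_token_pairs("the cat sat on the mat"):
    print(context, "->", label)
```

Each pair can then be fed to an ordinary supervised training loop, which is why self-supervised learning scales to very large unlabeled corpora.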

Key Concepts

  • Features: input variables used to make predictions.
  • Labels: target output in supervised learning.
  • Training set: data used to train the model.
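  • Validation set: data held out from training, used to tune hyperparameters and compare candidate models.
  • Test set: data held out until the end, used to evaluate final model performance.
  • Overfitting: model learns noise in the training data; performs well on training data but poorly on new data.
  • Underfitting: model is too simple and misses patterns; performs poorly on both training and new data.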
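
The three data sets above can be produced with a simple shuffled split. This is a minimal sketch using only the standard library; the helper name `split_dataset` and the 70/15/15 proportions are assumptions chosen for illustration, not a prescribed ratio.

```python
import random


def split_dataset(rows, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle rows, then split them into train, validation, and test lists."""
    rng = random.Random(seed)       # fixed seed keeps the split reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)

    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)

    test = shuffled[:n_test]                  # touched only for the final evaluation
    val = shuffled[n_test:n_test + n_val]     # used for hyperparameter tuning
    train = shuffled[n_test + n_val:]         # used to fit the model
    return train, val, test


train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

A large gap between training and validation performance suggests overfitting; poor performance on both suggests underfitting.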

Evaluation Metrics

  • Accuracy: percentage of correct predictions (misleading for imbalanced data, where always predicting the majority class scores high).
  • Precision: of predicted positives, how many are actually positive.
  • Recall (Sensitivity): of actual positives, how many were correctly predicted.
  • F1 Score: harmonic mean of precision and recall.
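  • AUC-ROC: area under the ROC curve; measures the model's ability to distinguish between classes across all thresholds (0.5 = random guessing, 1.0 = perfect).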
  • RMSE: root mean squared error for regression tasks.
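
As a worked illustration of the metrics above, the following sketch computes them from scratch for binary labels (1 = positive, 0 = negative). The function names are hypothetical; in practice a library such as scikit-learn would normally be used.

```python
import math


def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (0/1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def rmse(y_true, y_pred):
    """Root mean squared error for regression predictions."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Imbalanced example: 2 positives, 8 negatives, model always predicts negative.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10
print(classification_metrics(y_true, y_pred))
```

In the imbalanced example the accuracy is 0.8 even though recall is 0.0, which is why accuracy alone is a poor guide when one class dominates.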

Common Algorithms

  • Linear/Logistic Regression: simple, interpretable baseline models.
  • Decision Trees / Random Forests: tree-based models that handle non-linear relationships; a random forest is an ensemble of decision trees.
  • K-Means: unsupervised clustering into K groups.
  • Neural Networks: multi-layer models for complex patterns.
  • XGBoost: gradient-boosted decision trees; often a top performer on tabular data.
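
To show what "clustering into K groups" means in practice, here is a minimal sketch of K-Means (Lloyd's algorithm) on 2D points using only the standard library. The function name `kmeans`, the random initialization, and the fixed iteration cap are illustrative assumptions; production implementations use smarter initialization such as k-means++.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Cluster 2D points (x, y) into k groups with Lloyd's algorithm."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points

    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for x, y in points:
            nearest = min(
                range(k),
                key=lambda i: (x - centroids[i][0]) ** 2 + (y - centroids[i][1]) ** 2,
            )
            clusters[nearest].append((x, y))

        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                mean_x = sum(x for x, _ in cluster) / len(cluster)
                mean_y = sum(y for _, y in cluster) / len(cluster)
                new_centroids.append((mean_x, mean_y))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid

        if new_centroids == centroids:  # assignments have stabilized
            break
        centroids = new_centroids

    return centroids, clusters


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
```

No labels are supplied at any point: the groups emerge purely from distances between points, which is what makes this an unsupervised method.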
