This alphabetically sorted collection of AI, ML, and data resources was last updated on 3/26/2021.

ML breakdown: Supervised + Unsupervised + RL
Classifier comparison: scikit-learn.org
t-interval for slope parameter beta_1
A Unified Data Infra
AI and ML Blueprint
  • Expectation-maximization (EM): starts from randomly initialized components and computes, for each point, the probability that it was generated by each component of the model (E-step). It then adjusts the parameters to maximize the likelihood of the data given those assignments (M-step), and repeats both steps until convergence. Example: Gaussian mixture models.
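The two alternating steps can be sketched for a one-dimensional, two-component Gaussian mixture. This is a minimal illustration using only the standard library, not a production implementation: the function names, the crude initialization at the data extremes, the fixed iteration count, and the synthetic data are all assumptions made for the example.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(data, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture (illustrative only)."""
    # Crude initialization (an assumption): means at the data extremes,
    # both variances set to the overall variance, equal mixing weights.
    means = [min(data), max(data)]
    overall_mean = sum(data) / len(data)
    overall_var = sum((x - overall_mean) ** 2 for x in data) / len(data)
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]

    for _ in range(n_iter):
        # E-step: probability that each point was generated by each component.
        resp = []
        for x in data:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])

        # M-step: re-estimate weights, means, and variances from those probabilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # floor keeps a component from collapsing
    return weights, means, variances


# Synthetic data: two clusters centered near 0 and 6.
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(300)] + [random.gauss(6.0, 1.0) for _ in range(300)]
weights, means, variances = fit_gmm_1d(data)
```

On this synthetic data the fitted means should land close to the two cluster centers, with mixing weights near one half each.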
F* statistic equation
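  • Gradient Boosting: builds an additive model in stages, fitting each new weak learner to the negative gradient of the loss, which allows optimization of arbitrary differentiable loss functions. - Risk of overfitting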
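A rough sketch of the idea, assuming one-dimensional inputs and single-split regression stumps as the weak learners. The loss enters only through its gradient, so any differentiable loss could be swapped in; squared loss is used here for simplicity. All names (`fit_stump`, `gradient_boost`, `squared_loss_gradient`) are hypothetical helpers written for this example.

```python
def fit_stump(xs, targets):
    """Best single-split regression stump on 1-D inputs (least squares)."""
    best = None
    # Skip the largest value so the right-hand side is never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_val = sum(left) / len(left)
        right_val = sum(right) / len(right)
        err = sum((t - left_val) ** 2 for t in left) + sum((t - right_val) ** 2 for t in right)
        if best is None or err < best[0]:
            best = (err, threshold, left_val, right_val)
    _, threshold, left_val, right_val = best
    return lambda x: left_val if x <= threshold else right_val


def squared_loss_gradient(y, pred):
    """Derivative of 0.5 * (y - pred)^2 with respect to pred."""
    return pred - y


def gradient_boost(xs, ys, loss_gradient, n_rounds=50, learning_rate=0.1):
    """Stagewise additive model: each stump fits the negative loss gradient."""
    init = sum(ys) / len(ys)  # start from a constant prediction
    stumps = []

    def predict(x):
        return init + learning_rate * sum(stump(x) for stump in stumps)

    for _ in range(n_rounds):
        preds = [predict(x) for x in xs]
        neg_grad = [-loss_gradient(y, p) for y, p in zip(ys, preds)]
        stumps.append(fit_stump(xs, neg_grad))
    return predict


# Toy data: y = x^2 on a grid.
xs = [i / 10 for i in range(40)]
ys = [x ** 2 for x in xs]
model = gradient_boost(xs, ys, squared_loss_gradient)

mean_y = sum(ys) / len(ys)
mse = sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
baseline_mse = sum((mean_y - y) ** 2 for y in ys) / len(ys)
```

Training error keeps falling as `n_rounds` grows, which is where the overfitting risk comes from. A smaller `learning_rate` or fewer rounds are the usual levers for limiting it.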
Lasso equation
Learning Curve example
  • Linear Discriminant Analysis (LDA): a classifier with a linear decision boundary, generated by fitting class-conditional densities to the data and using Bayes’ rule. The model fits a Gaussian density to each class, assuming that all classes share the same covariance matrix.
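To make the shared-covariance assumption concrete, here is a minimal one-dimensional sketch: per-class means and priors, one pooled variance, and a prediction rule that picks the class with the largest discriminant score. The helper names and the toy data are assumptions for illustration.

```python
import math


def fit_lda_1d(xs, labels):
    """Estimate per-class means, priors, and one pooled (shared) variance."""
    classes = sorted(set(labels))
    n = len(xs)
    means, priors = {}, {}
    pooled_ss = 0.0
    for c in classes:
        members = [x for x, y in zip(xs, labels) if y == c]
        means[c] = sum(members) / len(members)
        priors[c] = len(members) / n
        pooled_ss += sum((x - means[c]) ** 2 for x in members)
    shared_var = pooled_ss / (n - len(classes))
    return means, priors, shared_var


def predict_lda_1d(x, means, priors, shared_var):
    """Pick the class with the largest linear discriminant score."""
    def score(c):
        # log(prior * Gaussian density), dropping terms common to all classes.
        # The x^2 term cancels because the variance is shared, so the score is linear in x.
        return x * means[c] / shared_var - means[c] ** 2 / (2 * shared_var) + math.log(priors[c])
    return max(means, key=score)


# Toy data: two well-separated classes.
xs = [1.0, 1.5, 2.0, 2.5, 6.0, 6.5, 7.0, 7.5]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
means, priors, shared_var = fit_lda_1d(xs, labels)
```

With equal priors the decision boundary sits midway between the two class means, so `predict_lda_1d(2.2, means, priors, shared_var)` falls on the "a" side and `predict_lda_1d(5.0, means, priors, shared_var)` on the "b" side.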
Correlation formula
Normal equation
  • Random Forests: each tree is built using a bootstrap sample of rows (drawn with replacement) from the training set. + Less prone to overfitting than a single decision tree
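The row sampling step can be sketched in a few lines. This shows only how each tree's training rows would be drawn, not a full forest; `bootstrap_sample` is a hypothetical helper, and the row count and seed are arbitrary.

```python
import random


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement, one such sample per tree."""
    return [rng.choice(rows) for _ in rows]


rng = random.Random(42)
rows = list(range(1000))  # stand-in for training-set row indices

# One bootstrap sample for each of five hypothetical trees.
samples = [bootstrap_sample(rows, rng) for _ in range(5)]

# Fraction of distinct original rows that each tree actually sees.
unique_fractions = [len(set(sample)) / len(rows) for sample in samples]
```

Because sampling is with replacement, each tree sees roughly 63% of the distinct rows (about 1 - 1/e), so the trees differ from one another and averaging them reduces variance.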
Ridge Regression
  • R² (coefficient of determination): measures the strength of a linear relationship. Can be 0 for nonlinear relationships, even strong ones. Never decreases on the training data when more features are added.
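A small worked example of the nonlinear case, assuming a simple least-squares line fit and the usual definition R² = 1 - SS_res / SS_tot. The helper names and the data are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one feature."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def r_squared(ys, preds):
    """R^2 = 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


xs = [-3, -2, -1, 0, 1, 2, 3]

# Perfectly linear target: the line explains all the variance.
ys_linear = [2 * x + 1 for x in xs]
slope, intercept = fit_line(xs, ys_linear)
r2_linear = r_squared(ys_linear, [slope * x + intercept for x in xs])

# Perfectly quadratic target on symmetric x: the best line is flat.
ys_quadratic = [x ** 2 for x in xs]
slope, intercept = fit_line(xs, ys_quadratic)
r2_quadratic = r_squared(ys_quadratic, [slope * x + intercept for x in xs])
```

Here `r2_linear` is 1 and `r2_quadratic` is 0, even though y is fully determined by x in both cases.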
Stochastic gradient descent cost function
T-test formula
Validation curve example

This article was originally published on my personal website adamnovotny.com
