# Useful Links
- Decision Trees (Part 1) - Tom Mitchell
- Decision Trees (Part 2) - Tom Mitchell
- George's notes
- Bias and Variance Neat Explanation
- Nice SVM Explanation
# Supervised learning
Function approximation: taking a set of training examples and coming up with a function that generalizes to cases beyond the data we've seen.
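A minimal sketch of the idea in plain Python: fit a line to a few training pairs by least squares, then use it on an input that wasn't in the training set (a real project would use a library instead).

```python
# Function approximation sketch: learn y = w*x + b from training pairs,
# then generalize to an unseen input.

def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # least-squares slope and intercept
    w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - w * mean_x
    return w, b

# Training examples drawn from y = 2x + 1
xs, ys = [0, 1, 2, 3], [1, 3, 5, 7]
w, b = fit_line(xs, ys)
print(w * 10 + b)  # generalizes to the unseen x=10 -> 21.0
```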
# Induction & Deduction
- Induction - Going from examples to general rules
- Deduction - Going from general rules to specific examples
# Unsupervised learning
Grouping and summarizing the training examples without any labels.
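As a small illustration, a few steps of 1-D k-means in plain Python: the points carry no labels, yet the algorithm still recovers the two groups hiding in the data. The points and starting centers below are made up.

```python
# Unsupervised grouping sketch: k-means on unlabeled 1-D points.

def kmeans_1d(points, centers, steps=10):
    for _ in range(steps):
        clusters = [[], []]
        for p in points:
            # assign each point to its nearest center
            i = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            clusters[i].append(p)
        # move each center to the mean of its cluster
        centers = [sum(c) / len(c) for c in clusters]
    return centers, clusters

points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
centers, clusters = kmeans_1d(points, [0.0, 10.0])
print(centers)  # roughly [1.0, 9.0]
```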
# Reinforcement Learning
Learning from rewards, which may be delayed. Like playing a game without knowing the rules. It looks like function approximation in supervised learning, but instead of x's and y's we get x's and z's (rewards) and must work out y.
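A tiny sketch of learning from rewards alone: an epsilon-greedy agent on a two-action bandit estimates each action's value purely from the rewards it observes, with no labels telling it the right answer. The reward probabilities are invented for illustration.

```python
# Reward-driven learning sketch: epsilon-greedy action-value estimation.
import random

random.seed(0)
true_reward = {"a": 0.2, "b": 0.8}  # hidden from the agent
counts = {a: 0 for a in true_reward}
values = {a: 0.0 for a in true_reward}

for step in range(2000):
    # explore occasionally, otherwise exploit the best current estimate
    if random.random() < 0.1:
        action = random.choice(list(true_reward))
    else:
        action = max(values, key=values.get)
    reward = 1 if random.random() < true_reward[action] else 0
    counts[action] += 1
    values[action] += (reward - values[action]) / counts[action]  # running mean

print(max(values, key=values.get))  # the agent discovers "b" pays more
```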
# Optimization
- Supervised learning: find a function that labels the data well
- Unsupervised learning: find clusters that score well
- Reinforcement learning: find behavior that scores well
# Scoring
- For classification problems with an imbalanced dataset, the F1 score should be used, since it properly reflects how well the model classifies both positive and negative cases by giving equal weight to precision and recall. Think of problems like spam or fraud detection, where positive cases make up a very small share of the dataset. ROC-AUC does a decent job as well, but it can give a high score to a model that predicts only a few positive cases correctly.
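A quick numeric sketch of why this matters: on an imbalanced problem, accuracy can look great while F1 exposes a model that misses most positives. The confusion-matrix counts below are made up for illustration.

```python
# F1 vs accuracy on an imbalanced problem.

def f1(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# 1000 examples, only 20 positive (e.g. fraud); the model catches just 5.
tp, fp, fn, tn = 5, 5, 15, 975
accuracy = (tp + tn) / (tp + fp + fn + tn)
print(round(accuracy, 3))          # 0.98  -- looks great
print(round(f1(tp, fp, fn), 3))    # 0.333 -- reveals the weak positive class
```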
- Choosing the right metric for classification problems
# VC/LC
If your model has high bias, you should:
- Try adding/creating more features
- Try decreasing the regularisation parameter λ
These two things increase your model's complexity and therefore help solve the underfitting problem.
If your model has high variance, you should:
- Get more data
- Try a smaller set of features
- Try increasing the regularisation parameter λ
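The effect of λ can be seen in a tiny sketch: one-feature ridge regression (no intercept) has the closed form w = Σ x·y / (Σ x² + λ), so raising λ shrinks the weight toward zero (less variance, more bias; too large and the model underfits). The data below are made up.

```python
# How λ trades variance for bias: closed-form one-feature ridge weight.

def ridge_weight(xs, ys, lam):
    # w = sum(x*y) / (sum(x^2) + lambda); larger lambda shrinks w
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

xs, ys = [1, 2, 3], [2, 4, 6]  # true relation y = 2x
for lam in [0.0, 1.0, 10.0]:
    print(lam, round(ridge_weight(xs, ys, lam), 3))  # weight shrinks as lam grows
```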
# Tips and tricks
- Use cross-validation where possible
- Use stratified sampling when classes aren't uniformly distributed
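A plain-Python sketch of stratified sampling: split within each class separately so both splits keep the original class ratio. (In practice indices would be shuffled first; that's omitted here to keep the sketch deterministic.)

```python
# Stratified train/test split: preserve the class ratio in both splits.
from collections import defaultdict

def stratified_split(labels, test_frac=0.25):
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for idxs in by_class.values():
        # take test_frac of each class, so the ratio carries over
        n_test = int(len(idxs) * test_frac)
        test.extend(idxs[:n_test])
        train.extend(idxs[n_test:])
    return train, test

labels = ["pos"] * 4 + ["neg"] * 12  # 25% positive overall
train, test = stratified_split(labels)
print(sum(labels[i] == "pos" for i in test) / len(test))  # 0.25 in the test split too
```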