What Actually IS Machine Learning?

The Input: Breaking the Paradigm

For years, I programmed by writing rules.

If user clicks button -> Then open window.
If temperature > 100 -> Then boil water.

This is Traditional Programming: I give the computer the Data and the Rules, and it gives me the Answer.

Machine Learning flips this upside down. I give the computer the Data and the Answer, and it figures out the Rules.

Traditional: Input + Rules = Output
Machine Learning: Input + Output = Rules

1. The Three Types of Learning

Machine Learning isn't one thing. It generally falls into three buckets.

A. Supervised Learning (The Teacher)

This is 90% of business ML. The data comes with an "Answer Key" (Labels).

Analogy: A teacher shows a child flashcards. "This is a cat." "This is a dog." Eventually, the child learns to identify them on their own.
The Goal: Predict the label for new data.
Examples: Predicting House Prices (Regression), Spam Detection (Classification).

B. Unsupervised Learning (The Explorer)

The data has no labels. The machine is left alone to find patterns.

Analogy: Giving a child a bucket of mixed LEGOs. They naturally sort them by color or size, even if you never told them what "Red" or "Small" means.
The Goal: Discover hidden structures.
Examples: Customer Segmentation (Clustering), Recommendation Systems ("People who bought X also bought Y").

C. Reinforcement Learning (The Gamer)

The model learns by trial and error.

Analogy: Training a dog. Good behavior = Treat (Reward). Bad behavior = No Treat (Penalty).
Examples: Teaching a robot to walk, or an AI to play Chess.

2. The Vocabulary: Speaking the Language

To work in ML, you have to stop calling them "columns" and "rows."

Term	Definition	Simple Example
Features (X)	The input data used to make predictions.	Square Footage, Bedrooms, Zip Code.
Target / Label (y)	The answer we want to predict.	House Price.
Training	The process of the model "learning" the rules.	Finding the math formula that turns X into y.
Model	The mathematical engine (The Artifact).	The formula itself (Price = SqFt x 200).
Inference	Using the trained model on new data.	Estimating the price of a house that just went on the market.

2. The Machine Learning Cycle

You don't just "train" a model once and walk away. It is a lifecycle.

A. The Golden Rule: Splitting Data

If I train my model on all my data, it will memorize the answers (Overfitting). To test if it actually learned, I need to hide some data.

The 3-Way Split:

Training Set (70%): The Study Guide. The model learns from this.
Validation Set (15%): The Practice Exam. We use this to tune settings (Hyperparameters) while training.
Test Set (15%): The Final Exam. We lock this away in a vault and only touch it once at the very end to see how the model performs in the real world.

B. The Robust Way: K-Fold Cross-Validation

Splitting data once is risky. What if my "Test Set" just happened to be really easy?

K-Fold Cross-Validation fixes this.

Split the data into K equal parts (e.g., 5 folds).
Train on 4 parts, Test on 1 part.
Repeat 5 times, rotating the test part.
Average the scores.

Analogy: Instead of taking one final exam, the student takes 5 mini-exams covering different chapters. The average score is a better measure of their knowledge.

3. The Two Main Problems: Regression vs. Classification

In Supervised Learning (where we spend most of our time), there are only two questions we usually ask:

1. "How Much?" (Regression)

The answer is a Quantity (Continuous Number).
Examples: "What will the temperature be tomorrow?", "What is the price of Bitcoin?"

2. "Which One?" (Classification)

The answer is a Category (Label).
Examples: "Is this email Spam?", "Is this tumor Benign or Malignant?", "Is this image a Cat, Dog, or Bird?"

4. The Danger Zone: Overfitting

This is the most important concept for a beginner.

Imagine a student studying for a history test.

Student A memorizes the exact dates and wording of the practice questions. (He gets 100% on the practice test).
Student B studies the concepts and causes of the events. (She gets 90% on the practice test).

On the real exam, the questions are rephrased.

Student A fails. He didn't learn history; he memorized the practice sheet. This is Overfitting.
Student B passes. She learned the general pattern, so she can handle new data.

The Golden Rule: We don't want a model that memorizes the training data. We want a model that generalizes to new data.

5. Evaluation Metrics: Keeping Score

How do we know if the model is "good"? It depends on the problem.

For Regression (Predicting Numbers)

MAE (Mean Absolute Error): "On average, my prediction is off by 500." (Easy to understand).
MSE (Mean Squared Error): Punishes large errors heavily. (Good for math).
RMSE (Root Mean Squared Error): The standard metric. It’s in the same units as your target (e.g., "Dollars").

For Classification (Predicting Categories)

Accuracy: % of correct answers. (Trap: If 99% of patients are healthy, a model that says "Healthy" for everyone is 99% accurate but useless).
Precision: "Of all the people I said had cancer, how many actually did?" (Quality of positive predictions).
Recall: "Of all the people who actually had cancer, how many did I find?" (Quantity of positive predictions).
F1-Score: The balance between Precision and Recall.

What Actually IS Machine Learning?

The Input: Breaking the Paradigm

1. The Three Types of Learning

A. Supervised Learning (The Teacher)

B. Unsupervised Learning (The Explorer)

C. Reinforcement Learning (The Gamer)

2. The Vocabulary: Speaking the Language

2. The Machine Learning Cycle

A. The Golden Rule: Splitting Data

B. The Robust Way: K-Fold Cross-Validation

3. The Two Main Problems: Regression vs. Classification

4. The Danger Zone: Overfitting

5. Evaluation Metrics: Keeping Score

For Regression (Predicting Numbers)

For Classification (Predicting Categories)

Comments

Machine Learning

Scikit-Learn & Linear Regression

More from this blog

Convexity, Logistic Loss, and the Overfitting Battle

Logistic Regression and Decision Surfaces

Optimization Mechanics — Feature Scaling, Convergence Testing, and Polynomial Mapping

Upgrading to Multiple Linear Regression

gradient_descent.py

Command Palette

The Input: Breaking the Paradigm

1. The Three Types of Learning

A. Supervised Learning (The Teacher)

B. Unsupervised Learning (The Explorer)

C. Reinforcement Learning (The Gamer)

2. The Vocabulary: Speaking the Language

2. The Machine Learning Cycle

A. The Golden Rule: Splitting Data

B. The Robust Way: K-Fold Cross-Validation

3. The Two Main Problems: Regression vs. Classification

4. The Danger Zone: Overfitting

5. Evaluation Metrics: Keeping Score

For Regression (Predicting Numbers)

For Classification (Predicting Categories)

Comments

Machine Learning

Scikit-Learn & Linear Regression

More from this blog