Core Concepts & Architecture

To understand how AI models are built in the real world, you must grasp core data science terminology and the architectural hierarchy.

The AI / ML / DL Hierarchy

It is crucial to understand that AI, Machine Learning, and Deep Learning are not synonymous: they nest like Russian dolls. Deep Learning is a subset of Machine Learning, which is itself a subset of AI.

AI: Any technique that enables computers to mimic human behavior.
ML: Statistical techniques that give computers the ability to learn without being explicitly programmed.
DL: A subset of ML built on multi-layered neural networks, which are loosely inspired by the human brain and tend to be computationally heavy.

The Data Science Pipeline

Data, Features, and Labels

Data is the lifeblood of AI. A dataset is a collection of examples, which may be structured (e.g., rows in a table) or unstructured (e.g., free text or images).

Features: The input variables (e.g., patient age, blood pressure, heart rate).
Labels: The target variable you are trying to predict (e.g., whether the patient has diabetes).
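
As a minimal sketch of the distinction, the snippet below separates features from labels in a small table of patient records. The records, the field names, and the helper `split_features_labels` are all hypothetical and made up for illustration.

```python
# Hypothetical patient records; every value here is invented for illustration.
records = [
    {"age": 54, "blood_pressure": 130, "heart_rate": 72, "has_diabetes": True},
    {"age": 37, "blood_pressure": 118, "heart_rate": 65, "has_diabetes": False},
    {"age": 61, "blood_pressure": 142, "heart_rate": 80, "has_diabetes": True},
]

FEATURE_NAMES = ["age", "blood_pressure", "heart_rate"]  # inputs
LABEL_NAME = "has_diabetes"                              # target to predict


def split_features_labels(rows):
    """Return (X, y): one feature vector per row and the matching label."""
    X = [[row[name] for name in FEATURE_NAMES] for row in rows]
    y = [row[LABEL_NAME] for row in rows]
    return X, y


X, y = split_features_labels(records)
print(X[0], y[0])  # [54, 130, 72] True
```

By convention the feature matrix is often called X and the label vector y; each row of X lines up with the label at the same position in y.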

Training vs Testing

You should not evaluate a model on the same data it learned from: it may simply have memorized the answers (a symptom of overfitting), so the score would say little about how it handles new data. The data is therefore split into two parts:

Training Set (typically 80%): Given to the algorithm to learn the patterns.
Testing Set (typically 20%): Held out from training entirely. Used at the very end to estimate the model's accuracy on unseen data.
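
The split described above can be sketched in a few lines of plain Python. The helper `split_train_test` and its parameters are hypothetical, written here as a stand-in for the ready-made utilities that ML libraries provide; the integers simply stand in for dataset rows.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)                   # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)   # fixed seed makes the split repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))  # placeholder for ten dataset rows
train, test = split_train_test(rows)
print(len(train), len(test))  # 8 2
```

Shuffling before splitting matters when the rows arrive in a meaningful order (for example, sorted by date or by label); otherwise the testing set may not resemble the training set.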

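By iterating on different algorithms (such as Random Forests or Support Vector Machines) and comparing their accuracy on held-out data, data scientists select the best-performing model to deploy to production. In practice this comparison is usually run on a separate validation set (or with cross-validation), so that the testing set stays untouched until the final evaluation.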
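
The selection loop can be sketched as follows. To keep the example self-contained, two toy models (a majority-class baseline and a 1-nearest-neighbor classifier) stand in for Random Forests and Support Vector Machines; the data, the function names, and the `candidates` table are all hypothetical.

```python
from collections import Counter


def accuracy(predict, X, y):
    """Fraction of rows where predict(features) equals the true label."""
    correct = sum(1 for features, label in zip(X, y) if predict(features) == label)
    return correct / len(y)


def fit_majority(X_train, y_train):
    """Baseline: always predict the most common training label."""
    most_common = Counter(y_train).most_common(1)[0][0]
    return lambda features: most_common


def fit_nearest_neighbor(X_train, y_train):
    """1-nearest-neighbor: copy the label of the closest training row."""
    def predict(features):
        distances = [
            sum((a - b) ** 2 for a, b in zip(features, row)) for row in X_train
        ]
        return y_train[distances.index(min(distances))]
    return predict


# Made-up data: [age, blood_pressure] -> has_diabetes
X_train = [[30, 110], [35, 115], [40, 120], [55, 140], [60, 145]]
y_train = [False, False, False, True, True]
X_holdout = [[32, 112], [59, 143], [62, 150], [38, 118]]
y_holdout = [False, True, True, False]

candidates = {"majority": fit_majority, "nearest_neighbor": fit_nearest_neighbor}
scores = {
    name: accuracy(fit(X_train, y_train), X_holdout, y_holdout)
    for name, fit in candidates.items()
}
best = max(scores, key=scores.get)
print(scores, best)
```

Every candidate is trained on the same training rows and scored on the same held-out rows, so the accuracies are directly comparable. Including a trivial baseline is a useful sanity check: a model that cannot beat "always guess the most common label" has not learned much.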
