A 5-hour Python primer (Week 0) for learners new to coding — skip it via the diagnostic if you already code — followed by 10 project-led weeks. Each week has lessons, a knowledge-check assessment, and (for some weeks) a richer portfolio piece you build and ship.
Prerequisite · 5h Python primer
Python in 5 hours — your runway into AI Builder
5 lessonsAssessment · ≥70%Skippable via diagnostic
Five short lessons + a real-data mini-project. By the end, you can read a CSV with pandas, write a function, and stop being afraid of `for` loops. If you already know Python, take the diagnostic at the top of this page and skip straight to Week 1.
Every example uses something you already touch — exam marks, mandi prices, GST on a kirana bill, IRCTC fares, and real district-level rainfall data from data.gov.in. No `foo`/`bar` placeholder code anywhere in this week.
Learning objectives
- ›Read and write Python code that uses variables, lists, dictionaries, conditions, loops, and functions
- ›Open a CSV in pandas and answer a question with df.head, column select, and df.mean
- ›Run code in a Colab notebook and share it with a classmate
- ›Decide when a problem is small enough for a function and big enough for pandas
- ›Pass the Week 0 exit assessment so Week 1 unlocks
Lessons
- ·Variables, types, print, input45m
- ·Lists and dicts — many things at once45m
- ·Decisions and loops — if, for, while45m
- ·Functions — package once, use everywhere60m
- ·Real data with pandas — your gateway to AI60m
Get oriented with what AI is, what it is not, and how the AI Project Cycle frames every project we will build over the next 10 weeks. Open Google Colab and run your first AI-flavoured Python program. Assumes you finished Week 0 or passed its diagnostic.
We will study three Indian AI systems live in production — Ola surge pricing, Flipkart product recommendations, and ISRO satellite image analysis — and trace how each one moves through the AI Project Cycle.
Learning objectives
- ›Distinguish AI from regular software with at least three examples
- ›Walk through the AI Project Cycle (Problem Scoping, Data, Modeling, Evaluation)
- ›Apply the 4Ws Problem Canvas to one Indian AI use case
- ›Run Python in Google Colab and use variables, input, print
- ›Have your first conversation with the AI Buddy
Lessons
- ·What is AI, and what is it not30m
- ·The AI Project Cycle and the 4Ws Canvas30m
Turn a vague hunch into a well-scoped AI problem. Practice the AI Project Cycle on a real Indian dataset and write Python that reads, filters, and prints CSV data.
We work with three Indian datasets — district rainfall (IMD), mandi vegetable prices (Agmarknet), and air quality (CPCB) — and pick ONE to scope into a problem your village or city actually has.
Learning objectives
- ›Apply the 4Ws Canvas to a self-chosen Indian problem
- ›Identify whether enough data exists to even attempt the problem
- ›Use Python lists, loops, conditionals, and functions
- ›Read a CSV with the standard library and pandas
- ›Spot the difference between a "data problem" and a "design problem"
Lessons
- ·Scope before you code30m
- ·Where does Indian data come from30m
Week 3
Data collection, cleaning & exploration
2 lessonsAssessment · ≥70%
Real Indian datasets are messy. Learn pandas filtering, grouping, missing-value handling, and the small set of moves that turn 80% of dirty data into something a model can use.
We clean a real-shaped slice of the U-DISE school-enrollment dataset (mixed dtypes, missing values clustered around tribal districts, three different date formats) — the kind of mess that ships in actual government CSVs.
Learning objectives
- ›Use pandas DataFrame, .dropna, .fillna, .astype, .groupby, .merge
- ›Decide whether a missing value should be dropped, imputed, or flagged
- ›Spot encoding/dtype bugs that silently corrupt a model
- ›Document a cleaning notebook so a teammate can reproduce it
Lessons
- ·The 80/20 of data cleaning30m
- ·Group-by is your superpower30m
Use matplotlib to turn a cleaned dataset into a story. Decide what chart fits what question, and present findings persuasively.
We work with the air quality dataset for your city or "which crops are most profitable in Maharashtra" — choose one — and build a dashboard answering a real question a journalist or policymaker would ask.
Learning objectives
- ›Pick the right chart type for a given question
- ›Build a 4-chart dashboard with matplotlib
- ›Write captions that turn charts into a story
- ›Submit the dashboard as your first portfolio piece
Lessons
- ·Charts as questions, not decoration45m
- ·Storytelling — the caption is the chart45m
Train your first machine-learning model. Understand what classification means, why a train/test split matters, and how to read accuracy without lying to yourself.
We predict whether a student passes Class 10 based on study hours + attendance + private tuition (yes/no), using a synthetic but realistically distributed Indian school dataset.
Learning objectives
- ›Distinguish supervised vs unsupervised, classification vs regression
- ›Train a decision tree with scikit-learn
- ›Split data into train/test and explain why
- ›Read accuracy and explain what it does NOT tell you
- ›Save your model + write a 1-paragraph "what I built" description
Lessons
- ·What "learning" actually means30m
- ·Train/test split — the most under-rated idea in ML30m
Three classifiers, three vibes. Learn KNN, Naive Bayes, and Random Forest — and the metrics (precision, recall, F1, confusion matrix) that reveal what accuracy hides.
We predict crop type — rice, wheat, mango, cotton, sugarcane, banana — from soil and climate features (N, P, K, rainfall, temperature, humidity), using a public Kaggle "Crop Recommendation" dataset shape that maps to real ICAR soil testing.
Learning objectives
- ›Train KNN, Naive Bayes, and Random Forest on the same dataset
- ›Read precision, recall, F1, and confusion matrix
- ›Explain when accuracy lies — and what to use instead
- ›Read feature importance from a Random Forest
- ›Pick the right classifier for the job (and defend the choice)
Lessons
- ·Three classifiers, three vibes30m
- ·Accuracy is a liar (sometimes)30m
Images are just numbers. We'll classify handwritten digits using sklearn (in-browser), then learn what convolutional networks add and when transfer learning is the right call.
We use handwritten Devanagari digit classification (the DHCD dataset shape) — recognising ० to ९ — as the practical grounding. The same techniques power UPI receipt OCR, vehicle number-plate readers, and Aadhaar form digitisation pipelines used across Indian e-governance.
Learning objectives
- ›Explain how an image becomes a feature vector (pixels, channels, flattening)
- ›Train a classifier on a real image dataset using sklearn
- ›Read errors from a confusion matrix on image data
- ›Explain in one paragraph what a CNN adds over a flat classifier
- ›Describe transfer learning and when to use it
Lessons
- ·An image is just numbers30m
- ·What CNNs add — and what transfer learning steals30m
Text is just sequences of words, and words can become numbers. We'll build a sentiment classifier on Hindi-English product reviews using bag-of-words + Naive Bayes, then peek at how embeddings and LLMs change the game.
We classify customer reviews from an Indian e-commerce site (Flipkart-style mix of English, Hindi, and Hinglish — "phone bahut accha hai", "delivery slow tha", "value for money"). Code-switching is the norm, not an edge case. The same workflow powers Indian customer-service routing, complaint triage, and market-research firms processing Twitter/X data in vernacular languages.
Learning objectives
- ›Convert raw text into a numeric feature matrix (bag-of-words, TF-IDF)
- ›Train a sentiment classifier and read precision/recall by class
- ›Explain why Naive Bayes is a strong baseline for text
- ›Recognise where bag-of-words fails and embeddings help
- ›Describe how modern LLMs replace this whole pipeline (and when not to use them)
Lessons
- ·From words to vectors30m
- ·Embeddings, transformers, and when to skip the whole pipeline30m
Stop training models. Start calling them. Learn how LLMs work under the hood, how to engineer prompts that don't waste tokens, and how to ground answers in your own data with RAG — using a Class 9 maths tutor as the running example.
We build a maths tutor assistant for Class 9 students that answers in English or Hindi based on the student's preference. Same techniques power Indian government chatbots (MyGov), education platforms (PhysicsWallah, BYJU'S), and customer service for fintech (Cred, Razorpay) — every team that ships an LLM-powered product faces the same prompt + grounding + cost trade-offs.
Learning objectives
- ›Explain in 2 sentences how an LLM generates text (tokens, sampling, temperature)
- ›Write a clear system prompt + user prompt for a defined task
- ›Use few-shot examples to control output format
- ›Describe RAG (retrieval-augmented generation) and when to use it
- ›Recognise hallucinations and design guardrails against them
Lessons
- ·How LLMs actually work — and the four knobs you control30m
- ·Prompt engineering, RAG, and not-getting-fooled30m
Pull everything together. Pick one Indian problem, build a working AI solution end to end, write it up honestly, and ship it. This is the week your portfolio becomes a thing you can put on a CV.
The capstone is your call. It must use real or scraped Indian data, solve a problem someone actually has (not a Kaggle leaderboard), and ship as a runnable artifact + a short write-up. Past student capstones include: a Marathi-English code-switched complaint router for a municipal corporation, a vegetable price predictor for a Pune mandi, a school-dropout risk score for a Bihar district, and a Hindi-language nutrition assistant for ASHA workers.
Learning objectives
- ›Scope a capstone problem using the 4Ws Canvas from Week 2
- ›Choose the right modelling approach (classifier / regressor / LLM / RAG) for the problem
- ›Build, evaluate, and document the system end to end
- ›Write a 250-word project description that a non-technical stakeholder can read
- ›Identify ethical risks specific to your problem and Indian context
Lessons
- ·How to scope a capstone you'll actually finish30m
- ·Ethics, deployment, and what comes after the certificate30m
Portfolio piece — Capstone — your shipped AI project
Pick one Indian problem and ship a working AI solution end to end. Real or scraped Indian data, a defensible metric, a 250-word writeup, an ethics statement. This is the portfolio piece you'll talk about in interviews — make it small enough to finish and honest enough to defend.