Modules in this track
- Talking to LLMs — prompting, structured output, tool use, embeddings
- RAG & Agents — retrieval, ReAct loops, multi-agent, MCP servers
- Serve & Ship — vLLM, on-device inference, cost optimization, observability
- Frontier & Capstone — multimodal, voice, safety, and shipping a production RAG-agent
What you’ll build
By the end of this track, your repo will contain:
- A 3-tool function-calling assistant (Claude API, Colab notebook; see the tool-use sketch after this list)
- A RAG pipeline over 200 arXiv papers (hybrid search + cross-encoder reranker)
- A ReAct agent solving SWE-style tasks (80 lines, no LangChain; loop skeleton sketched after this list)
- A 4-bit quantized LLM running on your phone via llama.cpp
- A real-time voice agent (Whisper → Claude → Kokoro TTS)
- A production-grade RAG-agent deployed to Vercel/Modal with eval suite + Langfuse tracing
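To give a feel for the first deliverable, here is a minimal sketch of one function-calling turn with the Anthropic Python SDK. The tool name, schema, and model string are illustrative placeholders, not the track's exact assistant.

```python
# Minimal sketch of one function-calling turn with the Anthropic Python SDK.
# The get_weather tool, its schema, and the model string are illustrative only.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

tools = [{
    "name": "get_weather",
    "description": "Return the current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # any tool-capable Claude model works
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What's the weather in Lagos?"}],
)

# If the model decided to call a tool, the reply contains a tool_use block.
for block in response.content:
    if block.type == "tool_use":
        print(block.name, block.input)  # e.g. get_weather {'city': 'Lagos'}
```

The full assistant then executes the requested tool, sends the result back to the model as a tool_result block, and loops until the reply is plain text; do that for three tools and you have the first project.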
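And here is roughly what the "no LangChain" ReAct agent reduces to: a loop that lets the model emit a Thought and an Action, executes the action, and feeds the Observation back until it produces a final answer. The two file-system tools, the prompt format, and the model name below are assumptions for illustration, not the lesson's exact scaffold.

```python
# Bare ReAct loop: think -> act -> observe, repeated until a final answer.
# Tools, prompt format, and model name are illustrative placeholders.
import os
import re
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def run_tool(name: str, arg: str) -> str:
    """Hypothetical dispatcher; the real agent uses SWE-style tools."""
    if name == "list_dir":
        return "\n".join(os.listdir(arg or "."))
    if name == "read_file":
        with open(arg) as f:
            return f.read()[:2000]  # truncate long files to keep context small
    return f"unknown tool: {name}"

SYSTEM = (
    "Solve the task step by step. On each turn write:\n"
    "Thought: <your reasoning>\n"
    "Action: <tool>[<argument>]   # tools: list_dir, read_file\n"
    "Stop after the Action. When done, write 'Final Answer: <answer>'."
)

def react(task: str, max_steps: int = 8) -> str:
    transcript = f"Task: {task}\n"
    for _ in range(max_steps):
        reply = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=512,
            system=SYSTEM,
            messages=[{"role": "user", "content": transcript}],
        ).content[0].text
        transcript += reply + "\n"
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        match = re.search(r"Action:\s*(\w+)\[(.*?)\]", reply)
        if match:  # run the tool and append the observation for the next turn
            transcript += f"Observation: {run_tool(*match.groups())}\n"
    return "No final answer within the step budget."

if __name__ == "__main__":
    print(react("Which Python files are in the current directory?"))
```

A common refinement is to pass stop_sequences=["Observation:"] to the API call so the model cannot fabricate its own tool output; the lesson version swaps in SWE-style tools (search, edit, run tests) but keeps the same loop.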
Each lesson links to a runnable Colab so you can hit the API or run the model in your phone's browser. No setup beyond opening a tab.