ai-engineer.sh

Introduction

This course teaches you how large language models work and how to build with them. It covers the full stack — from the math behind transformers to engineering production systems that use LLMs effectively.

Who this is for

This material is designed for software engineers and technical practitioners who want to go beyond API calls and understand what's happening under the hood. You don't need a PhD in machine learning, but you should be comfortable with:

  • Python — the primary language used in examples
  • Basic linear algebra — vectors, matrices, dot products
  • General programming concepts — APIs, data structures, version control
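As a quick self-check on those prerequisites, here is a short (hypothetical, illustrative) snippet. If the Python syntax and the linear algebra both read naturally, you're ready:

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    assert len(u) == len(v)
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    """Multiply a matrix (a list of rows) by a vector."""
    return [dot(row, v) for row in M]

u = [1.0, 2.0, 3.0]
v = [4.0, 5.0, 6.0]
M = [[1.0, 0.0, 0.0],
     [0.0, 2.0, 0.0]]

print(dot(u, v))     # 1*4 + 2*5 + 3*6 = 32.0
print(matvec(M, u))  # [1.0, 4.0]
```

The course's real examples use PyTorch tensors rather than plain lists, but the underlying operations are the same.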

If you've used ChatGPT or Claude and want to understand how they work — or build your own tools on top of them — you're in the right place.

What you'll learn

The course is organized into four sections that build on each other:

Section          Focus
Getting Started  Setup, prerequisites, and orientation
Fundamentals     How LLMs are built — data, tokenization, training
Transformers     The architecture powering modern LLMs — attention, embeddings, inference
LLM Engineering  Prompting, RAG, fine-tuning, agents, and production patterns

Each article covers a single topic, but later articles assume familiarity with concepts introduced in the preceding ones. Start from the beginning if you're new, or jump to a specific topic if you already have a foundation.

How to use this site

  • Read sequentially — the sidebar navigation follows the recommended order
  • Run the code — examples are meant to be executed, not just read
  • Check prerequisites — some articles require specific libraries or hardware

All code examples use Python and PyTorch unless stated otherwise. See the prerequisites page for environment setup.
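Before running any examples, it can help to confirm the required libraries are importable. The helper below is a hypothetical sketch (not part of the course tooling) that reports any missing packages:

```python
import importlib.util

def check_requirements(packages=("torch",)):
    """Return the subset of `packages` that cannot be imported."""
    return [p for p in packages if importlib.util.find_spec(p) is None]

if __name__ == "__main__":
    missing = check_requirements()
    if missing:
        print("Missing packages:", ", ".join(missing), "- see the prerequisites page")
    else:
        print("Environment ready.")
```

Articles with extra dependencies or hardware requirements (e.g. a GPU) note them at the top.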

Why understand the internals

You can use LLMs effectively without understanding how they work. But knowing the internals helps you:

  • Debug unexpected behavior — tokenization artifacts, context window limits, and hallucinations all have technical explanations
  • Choose the right model — parameter count, context length, and training data matter for different use cases
  • Build better systems — prompt engineering, RAG, and fine-tuning are more effective when you understand what the model is actually doing
  • Evaluate new developments — the field moves fast, and a solid foundation lets you separate signal from noise