NOTE:
I'm not a professional computer scientist, so everything here is based on my own limited knowledge.
It may therefore contain inefficient or even incorrect reasoning, and I take no responsibility for any errors.
That's it, thanks for your understanding.
What Is a Compiler?
A compiler is a program that transforms source code written in one language
(e.g., C, Rust, TypeScript) into another form (usually machine code, bytecode, or another high-level language).
Examples of compilers:
GCC / Clang
Rustc
The Big Picture: Stages of a Compiler
Lexical Analysis (tokenization)
Breaks raw text into tokens: keywords, identifiers, literals, symbols.
Example: a = 3 + b → [ID("a"), '=', INT(3), '+', ID("b")]
Syntax Analysis (parsing)
Builds a tree structure called an AST (Abstract Syntax Tree).
Example: 3 + b becomes a binary operation node.
Semantic Analysis
Checks correctness:
types
scope
variable declarations
function signatures
Example: "abc" + 5 is invalid in many languages.
Intermediate Representation (IR)
The AST is transformed into a lower-level language closer to machine operations.
Common IRs:
LLVM IR
bytecode
three-address code (TAC)
Optimization
The compiler improves your code:
dead code elimination
constant folding
loop optimizations
register allocation
Code Generation
Final stage: output machine code, bytecode, JS, etc.
Compiler vs Interpreter
Compilers: translate entire program before running.
Interpreters: execute code step-by-step.
Compiler                    | Interpreter
----------------------------|-------------------------
Produces binary or bytecode | Runs line-by-line
Faster execution            | More flexible debugging
Example: C, Rust            | Example: Python, Ruby
The AST (Abstract Syntax Tree)
The AST is the core internal structure produced by the parser.
Example AST for 1 + 2 * 3:
Add
├── Int(1)
└── Mul
    ├── Int(2)
    └── Int(3)
The tree respects precedence: multiplication evaluated before addition.
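Here's a minimal sketch of how such a tree might be represented and evaluated in Python; the Int/BinOp class names are made up for illustration, not any real compiler's API:

from dataclasses import dataclass

# Illustrative AST node types (not a real compiler's classes).
@dataclass
class Int:
    value: int

@dataclass
class BinOp:
    op: str        # "+" or "*"
    left: object
    right: object

# 1 + 2 * 3: the multiplication sits deeper in the tree than the addition
tree = BinOp("+", Int(1), BinOp("*", Int(2), Int(3)))

def evaluate(node):
    # Children are evaluated before their parent, so precedence falls out of the tree shape.
    if isinstance(node, Int):
        return node.value
    lhs, rhs = evaluate(node.left), evaluate(node.right)
    return lhs + rhs if node.op == "+" else lhs * rhs

print(evaluate(tree))  # 7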
Intermediate Representation (IR)
An IR is a lower-level representation used for:
optimization
analysis
machine-code generation
A simple “three-address code” example:
t1 = 2 * 3
t2 = 1 + t1
LLVM IR is a real-world example:
%t1 = mul i32 2, 3
%t2 = add i32 1, %t1
Optimizations
Compilers apply many optimizations automatically:
constant folding: 2 * 3 → 6
dead code elimination: remove unused variables
inlining: replace function calls with their bodies
loop unrolling
common subexpression elimination
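As a toy illustration of the constant-folding item above, here's a sketch that rewrites tuple-shaped expressions such as ("+", 1, ("*", 2, 3)); the tuple encoding is invented for this example, not any real IR:

# Fold every subexpression whose operands are both known integers.
def fold(expr):
    if not isinstance(expr, tuple):
        return expr                        # literal or variable name: nothing to fold
    op, lhs, rhs = expr
    lhs, rhs = fold(lhs), fold(rhs)
    if isinstance(lhs, int) and isinstance(rhs, int):
        return lhs + rhs if op == "+" else lhs * rhs
    return (op, lhs, rhs)                  # an operand is unknown: keep the operation

print(fold(("+", 1, ("*", 2, 3))))    # 7             (fully folded)
print(fold(("*", ("*", 2, 3), "x"))) # ('*', 6, 'x')  (partially folded)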
Back Ends and Code Generation
The back end turns IR into:
machine code (x86, ARM)
bytecode (Python, Java)
another language (TS → JS)
Example: LLVM back end produces machine instructions:
mov eax, 2
imul eax, 3
add eax, 1
Front-End         | Back-End
------------------|-----------------------
Lexing            | IR to machine code
Parsing           | Register allocation
Semantic analysis | Optimization passes
AST generation    | Instruction scheduling
Different Phases of a Compiler
Prior Knowledge
A compiler transforms source code written in a high-level language (like C, Rust, or Haskell) into machine code or an intermediate form.
This transformation happens in several well-structured phases.
Each phase handles one part of the job:
understanding the program,
checking for correctness,
optimizing,
and generating low-level code.
Although implementations vary between compilers (e.g., GCC, LLVM Clang, Rustc), the overall architecture remains similar.
Lexical Analysis (Scanning)
The first phase of compilation.
Converts a raw source code string into a sequence of tokens.
A token is a meaningful atomic unit:
keywords (if, while)
identifiers (x, count)
literals (10, "hello")
operators (+, *, ==)
punctuation (;, (, ))
A lexer removes:
spaces,
comments,
line breaks,
unnecessary formatting.
Example:
Source: x = 3 + 5;
Tokens: IDENTIFIER("x"), EQUALS, NUMBER(3), PLUS, NUMBER(5), SEMICOLON
Or, in more detail:
Tokens: IDENTIFIER("x"), OPERATOR("="), NUMBER(3), OPERATOR("+"), NUMBER(5), PUNCTUATION(";")
Syntax Analysis (Parsing)
Given the sequence of tokens, the parser checks whether the code follows the grammatical structure of the language.
It produces an AST (Abstract Syntax Tree), a hierarchical representation of the program.
Example: x = 3 + 5 becomes:
      Assignment
      /        \
  Variable     Add
      |       /   \
     "x"     3     5
The parser enforces:
operator precedence,
parentheses grouping,
valid statements and expressions.
Parser types:
Top-down: LL(k), recursive-descent (easy to implement)
Bottom-up: LR(k), LALR, GLR (used by many production compilers)
Semantic Analysis
The semantic analyzer checks for meaningful correctness of the AST.
Examples of semantic checks:
Type checking
Variable declarations and scopes
Function arity (argument count)
Operator/type compatibility
Control-flow correctness
The compiler builds:
a symbol table mapping names to types/locations,
a type environment,
a scope tree.
Example error caught here:
x = "hello" + 5 -- type error: cannot add string and integer
Semantic analysis transforms the AST into a typed AST (or annotates nodes with type info).
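A toy sketch of the type-checking idea, using made-up tuple-shaped expression nodes and only two types; a real semantic analyzer would also consult a symbol table and a full type environment:

# Hypothetical expression nodes: Python ints stand for integer literals,
# Python strs for string literals.
def type_of(expr):
    if isinstance(expr, int):
        return "int"
    if isinstance(expr, str):
        return "string"
    op, lhs, rhs = expr
    lt, rt = type_of(lhs), type_of(rhs)
    if op == "+" and lt == rt == "int":
        return "int"
    raise TypeError(f"cannot apply '{op}' to {lt} and {rt}")

print(type_of(("+", 3, 5)))   # int
type_of(("+", "hello", 5))    # raises TypeError: cannot apply '+' to string and int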
Intermediate Representation (IR) Generation
Once semantics are validated, the compiler transforms the AST into an IR (Intermediate Representation).
IR is a lower-level, more regular representation than the AST.
Most modern compilers use a form of SSA (Static Single Assignment) IR.
Purpose:
enable optimizations,
provide target-independence,
simplify code generation.
Example (LLVM IR):
%1 = add i32 3, 5
store i32 %1, i32* @x
The IR removes high-level syntax and resolves many abstractions.
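Here's a sketch of lowering an expression tree into three-address code; the tuple encoding and the t1, t2, ... temporary names are invented for illustration and are much simpler than real SSA construction:

from itertools import count

def lower(expr):
    code, temps = [], count(1)

    def walk(node):
        if isinstance(node, (int, str)):
            return str(node)                 # literals and variable names are used directly
        op, lhs, rhs = node
        l, r = walk(lhs), walk(rhs)
        tmp = f"t{next(temps)}"              # allocate a fresh temporary
        code.append(f"{tmp} = {l} {op} {r}")
        return tmp

    walk(expr)
    return code

print("\n".join(lower(("+", 1, ("*", 2, 3)))))
# t1 = 2 * 3
# t2 = 1 + t1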
Optimization
This is often the largest and most complex phase.
The compiler performs optimizations on the IR to:
minimize runtime,
minimize memory,
increase CPU instruction-level parallelism,
reduce branches and allocations.
Typical optimizations include:
Constant folding
Dead code elimination
Loop unrolling
Function inlining
Common subexpression elimination
Escape analysis (eliminate heap allocations)
Copy propagation
Optimizations may occur at:
high-level AST,
mid-level IR (LLVM),
low-level machine code.
Code Generation
Transforms the optimized IR into:
target machine code,
assembly, or
bytecode (for JVM, CLR, BEAM, etc.).
This phase handles:
register allocation,
calling conventions,
instruction selection,
stack frame setup,
function prologues/epilogues.
Example (x86-64 assembly):
mov eax, 3
add eax, 5
mov [x], eax
Linking
Final phase: combining all compiled units and libraries into an executable.
The linker resolves:
symbol references,
external functions (e.g., printf),
imported modules,
static libraries,
shared libraries.
For languages like C/C++, linking is separate.
For languages like Go, Rust, and Haskell, the compiler internally orchestrates linking.
The Big Picture Diagram
Source Code
|
v
[Lexical Analysis]
|
v
[Syntax Parsing]
|
v
[Semantic Analysis]
|
v
[Intermediate Representation]
|
v
[Optimization]
|
v
[Code Generation]
|
v
[Linking]
|
v
Executable
Introduction to the Lexical Analyzer
What Is a Lexical Analyzer?
A lexical analyzer, also called a lexer or scanner, is the first phase of a compiler.
Its job is to transform raw source code (a stream of characters) into a stream of tokens.
A token is a structured, meaningful element such as:
Keywords: if, let, while
Identifiers: x, totalAmount
Literals: 42, "hello", true
Operators: +, -, ==, =
Punctuation: ;, (, ), {, }
The lexer removes:
whitespace (spaces, tabs, newlines),
comments,
formatting irrelevant to syntax.
The output of a lexer is a clean token stream ready for parsing.
Why Is Lexical Analysis Needed?
Languages are easier to reason about when reduced to meaningful units.
A parser should process tokens, not raw characters.
Lexers simplify parsing by:
grouping multi-character operators (==, >=)
handling keywords vs identifiers
normalizing whitespace
detecting early lexical errors (invalid characters)
Without a lexer, parsers become extremely complicated.
The lexer is also responsible for catching invalid characters:
abc @ def
    ^
error: unexpected character '@'
If an invalid lexeme is encountered, compilation stops at the lexical phase.
There are several common ways to build a lexer:
Regex-based lexers
DFA-based scanners
Hand-written lexers — common in simpler compilers
Longest Match Rule
Lexers usually apply the maximal match principle:
Input: >=
Lexer picks: ">=" (not just ">")
This ensures ambiguous prefixes are handled properly.
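A minimal regex-based lexer sketch; the token names and patterns are made up for this example. Listing ">=" before ">" inside the operator pattern is what gives the maximal-match behaviour described above:

import re

TOKEN_SPEC = [
    ("NUMBER",     r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR",   r">=|<=|==|[+\-*/=<>]"),  # two-character operators listed first
    ("PUNCT",      r"[();{}]"),
    ("SKIP",       r"\s+"),                  # whitespace is dropped, like comments would be
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(src):
    tokens, pos = [], 0
    while pos < len(src):
        m = MASTER.match(src, pos)
        if not m:
            raise SyntaxError(f"unexpected character {src[pos]!r} at position {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(tokenize("x >= 3 + 5;"))
# [('IDENTIFIER', 'x'), ('OPERATOR', '>='), ('NUMBER', '3'),
#  ('OPERATOR', '+'), ('NUMBER', '5'), ('PUNCT', ';')]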
Introduction to Deterministic Finite Automata (DFA)
What Is a DFA? (In Simple Words; for Strict Definitions, Please Consult a Textbook)
A Deterministic Finite Automaton (DFA) is a simple mathematical machine used to recognize and process patterns in strings.
A DFA reads an input string character by character and uses states to decide what to do next.
If the machine ends in an accept state, then the string matches the pattern.
Formal Definition of a DFA
A DFA consists of 5 components:
DFA = (Q, Σ, δ, q₀, F)
Q = set of states
Σ = input alphabet (characters allowed)
δ = transition function: Q × Σ → Q
q₀ = start state
F = set of accepting states
Example 1: DFA for Recognizing Strings That End With "ab"
Alphabet: {a, b}
Language: All strings that end with "ab"
Accepted examples: "ab", "aab", "bbab", "aaab"
Rejected examples: "a", "ba", "abb", "bbaa"
Example 2: DFA for Recognizing Strings with an Even Number of 0s
This DFA checks whether the number of 0 characters in the input is even.
Alphabet: {0, 1}
Accept: strings with an even number of 0s
State                | On 0 | On 1
---------------------|------|-----
Even (start, accept) | Odd  | Even
Odd                  | Even | Odd
State meaning:
Even: seen even number of 0s (accept state)
Odd: seen odd number of 0s
Example: 10100
Even --1--> Even --0--> Odd --1--> Odd --0--> Even --0--> Odd (reject)
Transition Table Representation
A DFA can also be represented using a table.
For the "ends with ab" automaton:
State | a  | b
------|----|----
q0    | q1 | q0
q1    | q1 | q2
q2    | q1 | q0
The table precisely defines the behaviour of the machine.
DFA Simulation (Pseudocode)
Every DFA can be simulated by repeatedly applying the transition function.
def run_dfa(trans, start, accepts, s):
state = start
for c in s:
state = trans[state][c]
return state in accepts
transition = {
"q0": {"a": "q1", "b": "q0"},
"q1": {"a": "q1", "b": "q2"},
"q2": {"a": "q1", "b": "q0"},
}
print(run_dfa(transition, "q0", {"q2"}, "aab"))  # True
DFA vs NFA (Briefly)
A DFA has:
exactly one transition per input symbol,
exactly one active state at any time.
An NFA (nondeterministic finite automaton) may have:
multiple possible transitions,
ε-transitions (move without input),
multiple active states.
But both recognize exactly the same class of languages (regular languages).
DFA: Strings Ending with "ab"
State       | On a | On b
------------|------|-----
q0 (start)  | q1   | q0
q1          | q1   | q2
q2 (accept) | q1   | q0
NFA: Strings Ending with "ab"
State       | On a     | On b
------------|----------|------
q0 (start)  | {q0, q1} | {q0}
q1          | {q1}     | {q2}
q2 (accept) | {q1}     | {q0}
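To make the "multiple active states" idea concrete, here's a tiny simulation sketch of the NFA table above; it tracks a set of states instead of a single one:

# Transition table copied from the NFA above.
NFA = {
    "q0": {"a": {"q0", "q1"}, "b": {"q0"}},
    "q1": {"a": {"q1"},       "b": {"q2"}},
    "q2": {"a": {"q1"},       "b": {"q0"}},
}

def run_nfa(trans, start, accepts, s):
    states = {start}
    for c in s:
        # Follow every possible transition from every active state.
        states = set().union(*(trans[q].get(c, set()) for q in states))
    return bool(states & accepts)

print(run_nfa(NFA, "q0", {"q2"}, "bbab"))  # True  (ends with "ab")
print(run_nfa(NFA, "q0", {"q2"}, "abb"))   # False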
Left Factoring
What Is Left Factoring?
Left factoring is a grammar transformation technique used in compiler design.
It's essential for building predictive parsers (like LL(1) parsers) that need to decide which production to use by looking at only one token ahead.
When two or more productions for the same non-terminal start with the same symbol(s), the parser cannot decide which rule to apply.
Left factoring extracts the common prefix into a single production, deferring the decision until after the prefix is consumed.
Why Do We Need It?
Consider a predictive parser looking at input and trying to choose a production:
Grammar:
A -> α β₁
A -> α β₂
Problem: when the parser sees the start of α, it cannot decide between the two rules!
The parser would need to "look ahead" beyond α to make a decision — but LL(1) parsers only look at one token.
Left factoring solves this by restructuring the grammar:
After left factoring:
A -> α A'
A' -> β₁
A' -> β₂
Now: Parser consumes α first, THEN decides between β₁ and β₂.
The General Algorithm
For each non-terminal A with productions that share a common prefix α:
Before:
A -> α β₁ | α β₂ | ... | α βₙ | γ
After:
A -> α A' | γ
A' -> β₁ | β₂ | ... | βₙ
Where γ represents any alternatives that don't share the prefix α.
If βᵢ is empty for some i, then A' -> ε (epsilon) is one of the alternatives.
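Here's a rough one-pass sketch of the transformation in Python. It only groups alternatives by their first symbol, so longer common prefixes need repeated passes (as in Example 4 below), and the helper-naming scheme (adding apostrophes) is simplistic:

from collections import defaultdict

# productions: the alternatives of one non-terminal, each a list of symbols.
def left_factor(nonterminal, productions):
    groups = defaultdict(list)
    for prod in productions:
        groups[prod[0] if prod else "ε"].append(prod)

    new_rules = {nonterminal: []}
    helpers = 0
    for first_sym, prods in groups.items():
        if len(prods) == 1:
            new_rules[nonterminal].append(prods[0])        # no shared prefix: keep as-is
            continue
        helpers += 1
        fresh = nonterminal + "'" * helpers                # e.g. expr'
        new_rules[nonterminal].append([first_sym, fresh])
        new_rules[fresh] = [p[1:] if len(p) > 1 else ["ε"] for p in prods]
    return new_rules

# expr -> term + expr | term   becomes   expr -> term expr'  and  expr' -> + expr | ε
print(left_factor("expr", [["term", "+", "expr"], ["term"]]))
# {'expr': [['term', "expr'"]], "expr'": [['+', 'expr'], ['ε']]}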
Example 1: Simple Arithmetic Expression
A classic example from expression parsing:
Before (problematic):
expr -> term + expr
expr -> term
Problem: Both start with "term" — parser can't decide!
After left factoring:
After (left-factored):
expr -> term expr'
expr' -> + expr
expr' -> ε
Example 2: The Dangling Else Problem
This is a famous ambiguity in programming language grammars:
Before (problematic):
stmt -> if ( expr ) stmt else stmt
stmt -> if ( expr ) stmt
stmt -> while ( expr ) stmt
stmt -> id = expr ;
Problem: Both "if" rules share a long common prefix!
After left factoring:
After (left-factored):
stmt -> if ( expr ) stmt stmt'
stmt -> while ( expr ) stmt
stmt -> id = expr ;
stmt' -> else stmt
stmt' -> ε
The common prefix if ( expr ) stmt is factored out.
The decision between "else" and "nothing" is deferred to stmt'.
Input After Parsing if ( expr ) stmt | Action
-------------------------------------|-----------------------
Next token is else                   | Use stmt' -> else stmt
Next token is something else         | Use stmt' -> ε
Example 3: Multiple Groups of Common Prefixes
Sometimes you have multiple different prefixes that each need factoring:
Before (problematic):
decl -> int id ;
decl -> int id [ num ] ;
decl -> float id ;
decl -> float id [ num ] ;
Here we have TWO groups of common prefixes: int id and float id.
First pass — factor out within each group:
After (left-factored):
decl -> int id decl'
decl -> float id decl'
decl' -> ;
decl' -> [ num ] ;
We can go even further by factoring out the type:
Even more factored (optional):
decl -> type id decl'
type -> int
type -> float
decl' -> ;
decl' -> [ num ] ;
Example 4: Three or More Alternatives
When more than two alternatives share a prefix, fully factoring the grammar may require multiple passes. Here's a more complex example:
Before (problematic):
S -> a b c d
S -> a b c e
S -> a b f
S -> a g
S -> h
This requires THREE passes to fully factor. Here's the progression:
Pass | Common Prefix Found | Result
-----|---------------------|--------------------------------
1    | a                   | Extract S' for what follows a
2    | b (in S')           | Extract S'' for what follows b
3    | c (in S'')          | Extract S''' for what follows c
After (pass 1):
S -> a S'
S -> h
S' -> b c d
S' -> b c e
S' -> b f
S' -> g
After (pass 2):
S -> a S'
S -> h
S' -> b S''
S' -> g
S'' -> c d
S'' -> c e
S'' -> f
After (pass 3 — final):
S -> a S'
S -> h
S' -> b S''
S' -> g
S'' -> c S'''
S'' -> f
S''' -> d
S''' -> e
Example 5: A Realistic Function Definition Grammar
Real programming languages have complex grammars. Here's a function definition:
Before (problematic):
func -> id ( ) { stmts }
func -> id ( ) : type { stmts }
func -> id ( params ) { stmts }
func -> id ( params ) : type { stmts }
All four start with id ( — a parser seeing foo( has no idea which rule to use!
After careful factoring:
After (left-factored):
func -> id ( func'
func' -> ) func''
func' -> params ) func''
func'' -> { stmts }
func'' -> : type { stmts }
Decision points:
After Seeing | Look At    | Decision
-------------|------------|------------------------------------------
id (         | Next token | ) → no params; otherwise → has params
id ( ... )   | Next token | { → no return type; : → has return type
When Left Factoring Is Not Enough
Left factoring only helps with common prefixes.
It does NOT solve:
Left recursion: A -> A α | β (needs different transformation)
Ambiguity: when the grammar genuinely allows multiple parse trees
Non-LL(1) grammars: some grammars simply cannot be made LL(1)
Problem              | Solution
---------------------|------------------------------------------
Common prefixes      | Left factoring ✓
Left recursion       | Left recursion elimination
Ambiguity            | Rewrite grammar or use precedence rules
Inherently non-LL(1) | Use more powerful parser (LR, GLR, etc.)
Summary
Left factoring transforms: A -> αβ₁ | αβ₂ into A -> αA' and A' -> β₁ | β₂
It enables predictive (LL) parsing by deferring decisions until after common prefixes.
May require multiple passes for deeply nested common prefixes.
Often used together with left recursion elimination when building LL(1) parsers.
Predictive Parsing
What Is Predictive Parsing?
Predictive parsing is what you get when you've successfully transformed your grammar
(using left factoring, left recursion elimination, etc.) so that one token of lookahead
is always sufficient to make the right choice—no guessing, no backtracking, no regrets.
A predictive parser is a top-down parser that never backtracks.
At each step, it looks at the current non-terminal and the next input token, then deterministically chooses which production to use.
This is only possible when the grammar has been carefully prepared (no left recursion, no common prefixes).
The Promise of Predictive Parsing:
Current state: I'm trying to expand non-terminal A
I see token t in the input
↓
I know EXACTLY which production to use (or that there's an error)
↓
No guessing. No trying one rule and backtracking if it fails.
The Key Insight: FIRST Sets
For predictive parsing to work, we need to know: "What tokens can a production start with?"
This is captured by the FIRST set.
Definition:
FIRST(α) = the set of all terminals (tokens) that can appear
as the first symbol of any string derived from α
If α can derive ε (empty string), then ε ∈ FIRST(α) as well.
Simple examples:
FIRST(if ( expr ) stmt) = { if } — always starts with "if"
FIRST(while ( expr ) stmt) = { while } — always starts with "while"
FIRST(id = expr ;) = { id } — always starts with an identifier
FIRST(+ term) = { + } — always starts with "+"
FIRST(ε) = { ε } — the empty string "starts with" nothing
The Golden Rule for Predictive Parsing
If A -> α and A -> β are two different productions for the same non-terminal:
FIRST(α) ∩ FIRST(β) = ∅ (must be empty!)
If they overlap, seeing a token in the overlap wouldn't tell you which production to use.
When this condition holds for all non-terminals, the grammar is suitable for predictive parsing.
Example: Good Grammar (Predictive Parsing Works)
Grammar:
stmt -> if ( expr ) stmt
stmt -> while ( expr ) stmt
stmt -> id = expr ;
FIRST sets:
FIRST(if ( expr ) stmt) = { if }
FIRST(while ( expr ) stmt) = { while }
FIRST(id = expr ;) = { id }
Checking for overlap:
Pair               | Intersection | Result
-------------------|--------------|-------
{ if } ∩ { while } | ∅            | ✓
{ if } ∩ { id }    | ∅            | ✓
{ while } ∩ { id } | ∅            | ✓
No overlap! The parser can always decide:
See "if" → use stmt -> if ( expr ) stmt
See "while" → use stmt -> while ( expr ) stmt
See "id" → use stmt -> id = expr ;
See anything else → syntax error!
Example: Bad Grammar (Predictive Parsing Fails)
Grammar:
expr -> term + expr
expr -> term
Assume term can start with digits (0-9):
FIRST(term + expr) = { 0, 1, 2, ..., 9 }
FIRST(term) = { 0, 1, 2, ..., 9 }
If the parser sees 5, it has no idea whether to use expr -> term + expr or expr -> term.
This is exactly why we need left factoring!
Computing FIRST Sets: The Algorithm
For a terminal a: FIRST(a) = { a }
For epsilon: FIRST(ε) = { ε }
For a non-terminal A, look at all its productions and take the union.
For a sequence X₁ X₂ ... Xₙ:
FIRST(X₁ X₂ ... Xₙ):
1. Start with FIRST(X₁) (excluding ε)
2. If ε ∈ FIRST(X₁), also add FIRST(X₂) (excluding ε)
3. If ε ∈ FIRST(X₁) and ε ∈ FIRST(X₂), also add FIRST(X₃) ...
4. Continue until you hit an Xᵢ where ε ∉ FIRST(Xᵢ)
5. If ALL of X₁...Xₙ can derive ε, then add ε to the result
Example calculation:
Grammar:
E -> T E'
E' -> + T E' | ε
T -> F T'
T' -> * F T' | ε
F -> ( E ) | id
Computing FIRST sets (iteratively):
FIRST(F) = { (, id }
FIRST(T') = { *, ε }
FIRST(T) = FIRST(F) = { (, id }
FIRST(E') = { +, ε }
FIRST(E) = FIRST(T) = { (, id }
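Here's a small fixed-point sketch that computes these FIRST sets; the grammar encoding (a dict from non-terminals to lists of symbol lists) is just one possible representation:

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], ["ε"]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], ["ε"]],
    "F":  [["(", "E", ")"], ["id"]],
}

def compute_first(grammar):
    first = {nt: set() for nt in grammar}

    def first_of(symbol):
        # Anything that isn't a non-terminal (including ε) is its own FIRST set.
        return first[symbol] if symbol in grammar else {symbol}

    changed = True
    while changed:                      # iterate until nothing new can be added
        changed = False
        for nt, productions in grammar.items():
            for prod in productions:
                acc = set()
                for sym in prod:
                    sym_first = first_of(sym)
                    acc |= sym_first - {"ε"}
                    if "ε" not in sym_first:
                        break
                else:                   # every symbol in the production can derive ε
                    acc.add("ε")
                if not acc <= first[nt]:
                    first[nt] |= acc
                    changed = True
    return first

for nt, f in compute_first(GRAMMAR).items():
    print(nt, f)
# Matches the sets above: E, T, F start with { (, id }; E' is { +, ε }; T' is { *, ε }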
FOLLOW Sets: Handling Epsilon
When a production can derive ε, we also need to know what can follow that non-terminal.
FOLLOW(A) = the set of terminals that can appear immediately after A in some derivation.
Why we need FOLLOW:
E' -> + T E'
E' -> ε
If we see something NOT in FIRST(+ T E'), should we use E' -> ε?
Only if that something is in FOLLOW(E')!
FOLLOW set rules:
1. $ ∈ FOLLOW(S) where S is the start symbol ($ = end of input)
2. If A -> α B β, then:
FIRST(β) - {ε} ⊆ FOLLOW(B)
3. If A -> α B, or A -> α B β where ε ∈ FIRST(β), then:
FOLLOW(A) ⊆ FOLLOW(B)
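And a matching sketch for FOLLOW. The FIRST sets from the previous example are hard-coded here so the snippet stands alone:

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], ["ε"]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], ["ε"]],
    "F":  [["(", "E", ")"], ["id"]],
}
FIRST = {"E": {"(", "id"}, "E'": {"+", "ε"}, "T": {"(", "id"},
         "T'": {"*", "ε"}, "F": {"(", "id"}}

def first_of_seq(symbols):
    # FIRST of a whole sequence of symbols (same sequencing rule as above).
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {"ε"}
        if "ε" not in sym_first:
            return result
    result.add("ε")                     # empty (or all-nullable) sequence derives ε
    return result

def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")                               # rule 1
    changed = True
    while changed:
        changed = False
        for a, productions in grammar.items():
            for prod in productions:
                for i, b in enumerate(prod):
                    if b not in grammar:
                        continue                          # FOLLOW only applies to non-terminals
                    beta_first = first_of_seq(prod[i + 1:])
                    new = beta_first - {"ε"}              # rule 2
                    if "ε" in beta_first:
                        new |= follow[a]                  # rule 3
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow

print(compute_follow(GRAMMAR, "E"))
# FOLLOW(E) = FOLLOW(E') = { ), $ }, FOLLOW(T) = FOLLOW(T') = { +, ), $ },
# FOLLOW(F) = { +, *, ), $ }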
The Parsing Table
Once we have FIRST and FOLLOW sets, we can build a parsing table.
The table tells us: "Given non-terminal A and lookahead token t, which production to use?"
Table construction:
For each production A -> α:
1. For each terminal a ∈ FIRST(α), add A -> α to Table[A, a]
2. If ε ∈ FIRST(α), for each terminal b ∈ FOLLOW(A), add A -> α to Table[A, b]
Example parsing table for arithmetic expressions:
     | id        | +            | *            | (          | )       | $
-----|-----------|--------------|--------------|------------|---------|--------
E    | E -> T E' |              |              | E -> T E'  |         |
E'   |           | E' -> + T E' |              |            | E' -> ε | E' -> ε
T    | T -> F T' |              |              | T -> F T'  |         |
T'   |           | T' -> ε      | T' -> * F T' |            | T' -> ε | T' -> ε
F    | F -> id   |              |              | F -> ( E ) |         |
Empty cells = syntax error. Multiple entries in one cell = grammar is not LL(1).
The Predictive Parsing Algorithm
Uses a stack to track what we're expecting to see.
Matches terminals, expands non-terminals using the table.
def predictive_parse(input_tokens, table, start_symbol):
    # table maps (non-terminal, lookahead token) -> production right-hand side (a list)
    non_terminals = {nt for (nt, _) in table}
    stack = ['$', start_symbol]  # $ marks bottom
    tokens = input_tokens + ['$']
    pos = 0

    while stack:
        top = stack.pop()
        current = tokens[pos]

        if top == '$':
            if current == '$':
                return True   # Success!
            else:
                return False  # Error: extra input

        elif top not in non_terminals:  # top is a terminal
            if top == current:
                pos += 1      # Match! Consume token
            else:
                return False  # Error: mismatch

        else:  # top is a non-terminal
            if (top, current) in table:
                rhs = table[(top, current)]
                # Push RHS in reverse order so its first symbol ends up on top
                for symbol in reversed(rhs):
                    if symbol != 'ε':
                        stack.append(symbol)
            else:
                return False  # Error: no rule for (top, current)

    return False  # Unreachable: '$' at the bottom always triggers a return
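As a quick sanity check, here's a hypothetical driver that encodes the LL(1) table from the previous section as a dict from (non-terminal, lookahead) pairs to right-hand sides (ε-productions become empty lists) and feeds it to predictive_parse:

table = {
    ("E", "id"): ["T", "E'"],   ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"],
    ("E'", ")"): [],            ("E'", "$"): [],             # E' -> ε
    ("T", "id"): ["F", "T'"],   ("T", "("): ["F", "T'"],
    ("T'", "+"): [],            ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [],            ("T'", "$"): [],             # T' -> ε
    ("F", "id"): ["id"],        ("F", "("): ["(", "E", ")"],
}

print(predictive_parse(["id", "+", "id", "*", "id"], table, "E"))  # True
print(predictive_parse(["id", "+", "*"], table, "E"))              # False (syntax error)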
Trace: Parsing id + id * id
Let's trace through parsing the expression id + id * id: