
C Compilation

Understanding how a C program transforms from human-readable code to an executable binary is essential knowledge for any C programmer. This page explains the compilation process step-by-step, helping you troubleshoot errors and optimize your programs.

Overview of the Compilation Process

Converting C source code into an executable program involves four stages, each handled by a different tool: the preprocessor, the compiler, the assembler, and the linker.

Let's explore each step in detail.

Step 1: Preprocessing

Input: .c file | Output: Expanded source code

The preprocessor handles all preprocessor directives, which begin with a # symbol. Its main tasks include:

  • Including header files (#include)
  • Expanding macros (#define)
  • Conditional compilation (#ifdef, #ifndef, #endif, etc.)
  • Removing comments

Example:

Original source code:

c
#include <stdio.h>
#define MAX 100

int main() {
    // Print maximum value
    printf("Max value is: %d\n", MAX);
    return 0;
}

After preprocessing:

c
/* Contents of stdio.h are inserted here */

int main() {
    printf("Max value is: %d\n", 100);
    return 0;
}

tip

You can see the preprocessor output using the -E flag with gcc:

bash
gcc -E myprogram.c -o myprogram.i
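The example above covers #include and a simple #define. As a complementary sketch, the snippet below illustrates conditional compilation and a function-like macro. The names BUFFER_SIZE, SQUARE, DEBUG, build_mode, and squared_buffer are hypothetical and chosen only for illustration.

```c
/* Object-like macro with a fallback value.
   A build can override it on the command line, e.g. -DBUFFER_SIZE=256. */
#ifndef BUFFER_SIZE
#define BUFFER_SIZE 64
#endif

/* Function-like macro. The parentheses keep the expansion correct
   for arguments such as 1 + 2. */
#define SQUARE(x) ((x) * (x))

/* Only one of the two return statements survives preprocessing,
   depending on whether DEBUG is defined. */
const char *build_mode(void) {
#ifdef DEBUG
    return "debug";
#else
    return "release";
#endif
}

/* With the default above, the preprocessor rewrites the return
   expression to ((64) * (64)) before the compiler ever sees it. */
int squared_buffer(void) {
    return SQUARE(BUFFER_SIZE);
}
```

Running gcc -E on a file like this, once with and once without -DDEBUG, should show a different body for build_mode in each output, which is a quick way to confirm which branch a given build actually uses.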

Step 2: Compilation

Input: Preprocessed code | Output: Assembly code

The compiler translates the preprocessed C code into assembly language specific to your target processor architecture. During this phase:

  • The code is checked for syntax errors
  • Warnings about potential issues are generated
  • Optimizations might be applied (depending on compiler flags)

This stage produces assembly code that's still human-readable but much closer to machine language.

tip

You can stop at this stage and view the assembly code with:

bash
gcc -S myprogram.c -o myprogram.s
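To get a feel for this stage, it can help to run the -S command on something very small. The function below is a hypothetical example (sum_to is not part of any library); the assembly generated for it will differ between compiler versions, target architectures, and optimization levels.

```c
/* Sums the integers 1..n with a plain loop.
   Returns 0 when n is less than 1. */
int sum_to(int n) {
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}
```

Comparing the .s file produced without optimization against the one produced with -O2 is a simple way to see the optimizer at work: the loop is usually easy to recognize in the unoptimized output and may look quite different, or disappear, in the optimized one.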

Step 3: Assembly

Input: Assembly code | Output: Object file (.o)

The assembler converts assembly code into machine code (binary). The output is called an object file and contains:

  • Machine code instructions
  • A table of symbols (function names, global variables)
  • Relocation information for linking
  • Debugging information (if requested)

Object files are not yet executable because they may contain references to external functions or variables that need to be resolved.

tip

Generate just the object file with:

bash
gcc -c myprogram.c -o myprogram.o
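The symbol table is easier to picture with a concrete file. The sketch below uses hypothetical names (call_count, clamp_to_limit, record_call) to show the three kinds of symbols an object file commonly contains: defined and visible to other files, defined but private to this file, and referenced but not defined here.

```c
#include <stdio.h>

/* Global variable with external linkage: listed in the symbol table
   so that other object files can refer to it. */
int call_count = 0;

/* static gives the function internal linkage: its name is not
   available to other object files. */
static int clamp_to_limit(int value) {
    return value > 1000 ? 1000 : value;
}

/* External linkage: other object files can call this by name. */
int record_call(void) {
    call_count = clamp_to_limit(call_count + 1);
    /* puts is only declared in <stdio.h>. Its definition lives in the
       C library, so the object file keeps an unresolved reference. */
    puts("call recorded");
    return call_count;
}
```

After compiling with -c, a tool such as nm can list these symbols. On a typical Linux toolchain, record_call is shown as defined (type T) and puts as undefined (type U); the exact listing depends on the platform and on optimization settings, since a small static function may be inlined away.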

Step 4: Linking

Input: Object file(s) | Output: Executable program

The linker performs several important tasks:

  1. Combines multiple object files into a single executable
  2. Resolves references to external functions and variables
  3. Incorporates code from static libraries (.a files)
  4. Adds the startup code that runs before main() and prepares the program's runtime environment

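For example, when your program calls printf(), the linker finds this function in the standard C library. Depending on whether the library is linked statically or dynamically, it either copies the necessary code into the executable or records a reference to the shared library that provides it.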
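The same resolution happens for your own functions when a program is split across files. The sketch below shows both halves in one listing for brevity; the file names main.c and mathutil.c and the functions add and compute_total are hypothetical.

```c
/* --- would live in main.c (hypothetical file name) --- */

/* Declaration only: tells the compiler the signature of add,
   not where its code is. */
int add(int a, int b);

int compute_total(void) {
    /* The compiler emits a call to the symbol "add".
       The linker later connects it to the actual definition. */
    return add(2, 3);
}

/* --- would live in mathutil.c (hypothetical file name) --- */

/* Definition: the code that the call above is resolved to. */
int add(int a, int b) {
    return a + b;
}
```

If the two parts were compiled into separate object files and the one containing the definition were left off the link command, the linker would stop with an undefined reference to add, because the declaration alone gives it nothing to link against.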

Common Errors in Each Stage

Understanding which compilation stage produces an error helps in fixing it faster:

Stage | Error Type | Example
Preprocessing | File not found | fatal error: stdio.h: No such file or directory
Compilation | Syntax errors | error: expected ';' before '}' token
Linking | Undefined references | undefined reference to 'sqrt'

Compilation Flags

These are some common GCC flags you can use to control the compilation process:

  • -o <name>: Specify the output file name
  • -Wall: Enable a broad set of commonly useful warnings (despite the name, not every warning; -Wextra adds more)
  • -g: Include debugging information
  • -O1, -O2, -O3: Different levels of optimization
  • -std=c99: Specify C language standard

Example of a command with multiple flags:

bash
gcc -Wall -g -O2 -std=c99 myprogram.c -o myprogram
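Some flags leave a visible trace inside the program through predefined macros, which gives a way to check what a build was actually compiled with. The sketch below relies on __STDC_VERSION__, which is defined by the C standard, and on __OPTIMIZE__, which is specific to GCC and compatible compilers. The function names c_standard_name and built_with_optimization are hypothetical.

```c
/* Reports which C standard the compiler is following,
   as selected by a flag such as -std=c99. */
const char *c_standard_name(void) {
#if !defined(__STDC_VERSION__)
    return "C89/C90";  /* the macro did not exist yet in C90 */
#elif __STDC_VERSION__ >= 201112L
    return "C11 or later";
#elif __STDC_VERSION__ >= 199901L
    return "C99";
#else
    return "C90 with Amendment 1";
#endif
}

/* Returns 1 if the compiler defined __OPTIMIZE__ (GCC does this
   when optimization is enabled), 0 otherwise. */
int built_with_optimization(void) {
#ifdef __OPTIMIZE__
    return 1;
#else
    return 0;
#endif
}
```

Built with the example command above, c_standard_name would be expected to return "C99" and built_with_optimization to return 1. Other compilers may not define __OPTIMIZE__, so the second function should be treated as a GCC-oriented convenience.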

One-Step vs. Separate Steps

While you can compile in one step:

bash
gcc myprogram.c -o myprogram

Breaking it down can be useful for debugging or understanding where errors occur:

bash
# Preprocessing
gcc -E myprogram.c -o myprogram.i

# Compilation
gcc -S myprogram.i -o myprogram.s

# Assembly
gcc -c myprogram.s -o myprogram.o

# Linking
gcc myprogram.o -o myprogram

Static vs. Dynamic Linking

C programs can link to libraries in two ways:

  • Static linking: Library code is copied into the executable

bash
gcc myprogram.c -static -o myprogram

  • Dynamic linking: The program contains references to shared libraries (.so files on Linux, .dll on Windows) that are loaded at run time; this is the default on most systems

bash
gcc myprogram.c -o myprogram

note

Static linking produces larger executables that carry their library code with them, so they do not depend on shared libraries being installed. Dynamic linking produces smaller executables but requires the linked libraries to be present on the system at run time.

Summary

The C compilation process involves four main stages:

  1. Preprocessing: Expands macros and includes header files
  2. Compilation: Converts C code to assembly language
  3. Assembly: Converts assembly to machine code
  4. Linking: Resolves references and creates the final executable

Understanding this process helps you interpret compiler errors, optimize your programs, and write more effective C code.
