This document provides a high-level introduction to Git's architecture and the major subsystems that comprise the Git version control system. It explains how commands are dispatched, how data is stored and accessed, and how the various layers interact. This overview is intended to provide context for the more detailed documentation that follows.
Git is organized into five major architectural layers that build upon each other. Each layer has distinct responsibilities and interfaces with the layers above and below it.
Architecture Layers
| Layer | Key Files | Responsibilities |
|---|---|---|
| User Interface | git.c:1-900 | Command parsing, dispatch to built-in commands, alias resolution |
| Configuration & Build | Makefile:1-3000, config.c:1-3000 | Platform detection, compilation, runtime configuration management |
| Core Data | read-cache.c:1-3000, packfile.c:1-2000, refs.c:1-2000 | Index/staging area, object database, reference storage |
| Operations | diff.c, revision.c, sequencer.c | High-level Git operations such as diff, log, rebase |
| Network | transport.c, fetch-pack.c, send-pack.c | Remote repository communication |
Sources: Makefile:1-100, git.c:1-900, read-cache.c:1-200, packfile.c:1-100, refs.c:1-100
When a user runs a Git command, it flows through a well-defined dispatch mechanism that handles options, aliases, and routing to the appropriate implementation.
Command Dispatch Table
The command dispatch table in git.c:529-700 maps each command name to its implementation function and a set of dispatch flags.
Key dispatch flags, defined in git.c:21-31:

- `RUN_SETUP` - Requires a Git repository
- `RUN_SETUP_GENTLY` - Tries to find a repository but does not fail if none exists
- `USE_PAGER` - Automatically paginates output
- `NEED_WORK_TREE` - Requires a working tree (not a bare repository)

Sources: git.c:1-100, git.c:157-366, git.c:368-464, git.c:466-527, builtin.h:1-120
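The dispatch idea can be modelled in a few lines of Python. This is a conceptual sketch, not Git's C code. The flag names come from the list above, but the bit values, the `Command` class, the two handlers, and the `dispatch()` signature are all assumptions made for illustration. The sketch also assumes that built-in commands take precedence over aliases.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Flag names follow the list above; the bit values are arbitrary
# choices for this sketch, not the values defined in git.c.
RUN_SETUP = 1 << 0
RUN_SETUP_GENTLY = 1 << 1
USE_PAGER = 1 << 2
NEED_WORK_TREE = 1 << 3


@dataclass
class Command:
    """One row of the (hypothetical) dispatch table."""
    name: str
    fn: Callable[[List[str]], int]
    flags: int = 0


def cmd_status(args: List[str]) -> int:  # hypothetical handler
    return 0


def cmd_version(args: List[str]) -> int:  # hypothetical handler
    return 0


COMMANDS: Dict[str, Command] = {
    c.name: c
    for c in (
        Command("status", cmd_status, RUN_SETUP | NEED_WORK_TREE),
        Command("version", cmd_version),
    )
}


def dispatch(argv: List[str], in_repo: bool, has_work_tree: bool,
             aliases: Optional[Dict[str, str]] = None) -> int:
    """Resolve an alias if needed, check the flags, then call the handler."""
    aliases = aliases or {}
    name, args = argv[0], list(argv[1:])
    # Assumption of this sketch: an alias is consulted only when the
    # name is not a built-in command.
    if name not in COMMANDS and name in aliases:
        expansion = aliases[name].split()
        name, args = expansion[0], expansion[1:] + args
    cmd = COMMANDS.get(name)
    if cmd is None:
        raise KeyError(f"'{name}' is not a known command")
    if cmd.flags & RUN_SETUP and not in_repo:
        raise RuntimeError("not a git repository")
    if cmd.flags & NEED_WORK_TREE and not has_work_tree:
        raise RuntimeError("this operation must be run in a work tree")
    return cmd.fn(args)
```

Keeping the requirements as flags in the table means the repository and work-tree checks live in one place rather than in every command handler.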
Git's data layer consists of three primary storage systems: the index (staging area), the object database (packfiles and loose objects), and the reference system.
Index Structure
The index is managed by read-cache.c and contains:

- `struct index_state` - Main index state at read-cache.c:111-200
- `struct cache_entry` - Individual file entries with stat info

Object Database
Objects are stored in two formats:

- Loose objects: individual files under `.git/objects/??/*`, created by object-file.c
- Packed objects: stored in `.pack` files, accessed via packfile.c

The packfile system uses:

- `struct packed_git` at packfile.h:13-51 - Represents a single packfile
- Pack index (`.idx` files) for fast object lookup via packfile.c:159-187
- Multi-pack index (`.midx` files) to aggregate multiple packs via midx.c:96-175
- Reverse index (`.rev` files) for offset-to-object mapping via pack-revindex.c

Reference System
References (branches, tags, HEAD) are managed through a pluggable backend architecture:

- `struct ref_store` at refs.c:44-50
- Files backend, storing refs under `.git/refs/`

Sources: read-cache.c:1-200, packfile.c:1-300, packfile.h:13-51, midx.c:1-175, refs.c:1-100, refs/files-backend.c:83-130, refs/reftable-backend.c:1-100
Object Lookup Process
An object is located by its ID either as a loose file under `.git/objects/` or through the pack indexes (`.idx`, `.midx`) listed above.
Sources: packfile.c:159-300, packfile.c:2000-2100, midx.c:330-500
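The loose-object half of the lookup is simple enough to sketch in Python. The example below is a conceptual model, not Git's implementation. It assumes the SHA-1 object format, and the function names (`object_id`, `loose_path`, `write_loose`, `read_loose`) are hypothetical. It shows how an object ID is derived from a `<type> <size>\0` header plus the payload, and how the first two hex digits of that ID select the directory under `.git/objects/`.

```python
import hashlib
import os
import tempfile
import zlib


def object_id(obj_type: bytes, payload: bytes) -> str:
    """Hash the "<type> <size>\\0" header followed by the payload (SHA-1 assumed)."""
    header = obj_type + b" " + str(len(payload)).encode() + b"\0"
    return hashlib.sha1(header + payload).hexdigest()


def loose_path(objects_dir: str, oid: str) -> str:
    """First two hex digits name the directory, the rest names the file."""
    return os.path.join(objects_dir, oid[:2], oid[2:])


def write_loose(objects_dir: str, obj_type: bytes, payload: bytes) -> str:
    """Store header + payload, zlib-compressed, at the object's loose path."""
    oid = object_id(obj_type, payload)
    path = loose_path(objects_dir, oid)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    header = obj_type + b" " + str(len(payload)).encode() + b"\0"
    with open(path, "wb") as f:
        f.write(zlib.compress(header + payload))
    return oid


def read_loose(objects_dir: str, oid: str):
    """Decompress a loose object and split it into (type, payload)."""
    with open(loose_path(objects_dir, oid), "rb") as f:
        raw = zlib.decompress(f.read())
    header, _, payload = raw.partition(b"\0")
    obj_type, size = header.split(b" ")
    if int(size) != len(payload):
        raise ValueError("size in header does not match payload length")
    return obj_type, payload
```

Writing to a scratch directory (for example one from `tempfile.mkdtemp()`) and reading the object back by its ID exercises the whole round trip without touching a real repository.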
Reference Storage Formats
The reference system at refs.c:39-66 defines a pluggable backend with two implementations:

Files Backend (refs/files-backend.c:83-130)

- Loose refs stored as individual files under `.git/refs/`
- Packed refs in `.git/packed-refs` via refs/packed-backend.c
- Per-worktree refs (`.git/refs/worktree/`, `.git/refs/bisect/`)

Reftable Backend (refs/reftable-backend.c:38-60)

- Refs stored under `.git/reftable/`

Key Reference Operations

- `refs_resolve_ref_unsafe()` at refs.c:390-399 - Resolve symbolic refs recursively
- `refs_read_ref()` at refs.c:418-421 - Read a ref value
- `ref_transaction_commit()` - Atomic multi-ref updates

Sources: refs.c:39-100, refs.h:1-100, refs/files-backend.c:83-204, refs/reftable-backend.c:38-60, refs/ref-cache.c:1-100
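As a rough illustration of how symbolic refs resolve in a files-style layout, here is a Python sketch. It is not the logic of `refs_resolve_ref_unsafe()`. The function names and the `max_depth` limit are assumptions of the sketch. It follows `ref: <name>` indirections through loose files and falls back to `packed-refs` when no loose file exists.

```python
import os
import tempfile
from typing import Dict, Optional


def read_packed_refs(git_dir: str) -> Dict[str, str]:
    """Parse "<oid> <refname>" lines from packed-refs, if the file exists."""
    refs: Dict[str, str] = {}
    path = os.path.join(git_dir, "packed-refs")
    if not os.path.exists(path):
        return refs
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines, the "#" header, and "^" peeled-tag lines.
            if not line or line[0] in "#^":
                continue
            oid, name = line.split(" ", 1)
            refs[name] = oid
    return refs


def resolve_ref(git_dir: str, name: str, max_depth: int = 5) -> Optional[str]:
    """Follow symbolic refs until an object ID (or nothing) is found.

    max_depth is a hypothetical guard against symbolic-ref loops.
    """
    for _ in range(max_depth):
        path = os.path.join(git_dir, *name.split("/"))
        if os.path.isfile(path):
            with open(path) as f:
                value = f.read().strip()
            if value.startswith("ref: "):
                name = value[len("ref: "):]  # symbolic ref: follow it
                continue
            return value  # loose ref holding an object ID
        # No loose file: fall back to packed-refs.
        return read_packed_refs(git_dir).get(name)
    raise RuntimeError("symbolic ref nesting too deep")
```

Checking the loose file first reflects the usual expectation that a loose ref takes precedence over a packed entry of the same name.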
Build System
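The Makefile (Makefile:1-3000) orchestrates:

- Platform detection via `config.mak.uname`
- Compilation and linking of the `git` binary
- Version string generation from `git describe` or a version file

Configuration Precedence
Configuration is loaded in the following order (later sources override earlier ones) by config.c:2000-3000:

1. System: `/etc/gitconfig`
2. Global: `~/.gitconfig` or `$XDG_CONFIG_HOME/git/config`
3. Local: `.git/config`
4. Worktree: `.git/config.worktree`
5. Environment: `GIT_*` variables
6. Command line: `-c key=value`

Key functions:

- `git_config()` at config.c:2000-2100 - Read and parse config files
- `git_config_get_value()` - Query a config key
- `git_config_set()` - Write config values

Sources: Makefile:1-1000, config.c:1-3000, generate-cmdlist.sh:1-100, git.c:157-280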
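The "later overrides earlier" rule can be illustrated with a small Python model. This is only a sketch of the precedence idea. The `effective_config` name and the `(scope, values)` layer representation are assumptions, and multi-valued keys are deliberately ignored.

```python
from typing import Dict, List, Tuple

Layer = Tuple[str, Dict[str, str]]  # (scope name, key -> value)


def effective_config(layers: List[Layer]) -> Dict[str, Tuple[str, str]]:
    """Merge layers in load order; the last layer defining a key wins.

    Returns key -> (value, scope that supplied it).
    """
    result: Dict[str, Tuple[str, str]] = {}
    for scope, values in layers:
        for key, value in values.items():
            result[key] = (value, scope)
    return result


# Example layers in the load order listed above (values are made up).
layers = [
    ("system", {"core.editor": "vi", "color.ui": "auto"}),
    ("global", {"core.editor": "vim"}),
    ("local", {"user.email": "dev@example.com"}),
    ("command line", {"core.editor": "nano"}),
]
config = effective_config(layers)
```

Here `core.editor` ends up supplied by the command-line layer, while `color.ui` still comes from the system layer because nothing later redefines it.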
Common Operation Pattern
Most Git commands follow this pattern, implemented in git.c:466-527:

1. Locate the repository: `setup_git_directory()` if the `RUN_SETUP` flag is set
2. Load configuration: `git_config()`
3. Read the index: `read_index()` at read-cache.c:2300-2500
4. Resolve references: `refs_resolve_ref_unsafe()` at refs.c:390-399

Sources: git.c:466-527, read-cache.c:2300-2500, refs.c:390-421, packfile.c:2000-2100
Index Entry
Defined in cache.h and managed by read-cache.c:95-227.
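To make the role of an index entry concrete, here is a simplified Python model. It is not the layout of `struct cache_entry`. The field selection, the `Index` class, and the method names are assumptions of the sketch. It keeps entries sorted by path for binary-search lookup and uses cached stat data as a cheap check for whether a file may have changed.

```python
import bisect
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    """Simplified stand-in for an index entry (fields are illustrative)."""
    path: str
    oid: str
    mode: int
    size: int
    mtime: float


class Index:
    def __init__(self) -> None:
        self._paths: List[str] = []      # kept sorted
        self._entries: List[Entry] = []  # parallel to _paths

    def add(self, entry: Entry) -> None:
        """Insert in sorted position, replacing any entry with the same path."""
        pos = bisect.bisect_left(self._paths, entry.path)
        if pos < len(self._paths) and self._paths[pos] == entry.path:
            self._entries[pos] = entry
        else:
            self._paths.insert(pos, entry.path)
            self._entries.insert(pos, entry)

    def get(self, path: str) -> Optional[Entry]:
        """Binary-search the sorted paths."""
        pos = bisect.bisect_left(self._paths, path)
        if pos < len(self._paths) and self._paths[pos] == path:
            return self._entries[pos]
        return None

    def maybe_modified(self, path: str, size: int, mtime: float) -> bool:
        """True if the path is untracked or its stat data differs from the cache."""
        entry = self.get(path)
        return entry is None or entry.size != size or entry.mtime != mtime

    def paths(self) -> List[str]:
        return list(self._paths)
```

Comparing stat data first means file contents only need to be rehashed when the cheap check suggests a change.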
Packed Git
Defined at packfile.h:13-51 and accessed via packfile.c:300-500.
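The reason a pack index gives fast lookups can be sketched with an in-memory model of its fan-out table. The Python below does not parse a real `.idx` file, and the function names are hypothetical. It builds a 256-entry table in which entry `i` counts the object IDs whose first byte is at most `i`, then uses that table to narrow a binary search over the sorted IDs.

```python
import bisect
from typing import List, Optional


def build_fanout(sorted_oids: List[bytes]) -> List[int]:
    """fanout[i] = number of object IDs whose first byte is <= i."""
    counts = [0] * 256
    for oid in sorted_oids:
        counts[oid[0]] += 1
    fanout: List[int] = []
    total = 0
    for count in counts:
        total += count
        fanout.append(total)
    return fanout


def find_object(sorted_oids: List[bytes], fanout: List[int],
                oid: bytes) -> Optional[int]:
    """Return the position of oid in the sorted list, or None if absent."""
    first = oid[0]
    lo = fanout[first - 1] if first > 0 else 0
    hi = fanout[first]
    # Only IDs sharing the first byte lie in [lo, hi); search just that slice.
    pos = bisect.bisect_left(sorted_oids, oid, lo, hi)
    if pos < hi and sorted_oids[pos] == oid:
        return pos
    return None
```

In a real index the returned position would then be used to find the object's offset inside the pack. That step is omitted here.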
Reference Store
Defined in refs/refs-internal.h:200-250, with backends in refs/files-backend.c:83-94 and refs/reftable-backend.c:38-60.
Sources: read-cache.c:95-227, packfile.h:13-51, refs/refs-internal.h:200-250
Git's architecture is organized into five distinct layers, each with clear responsibilities: user interface, configuration and build, core data, operations, and network.
The system uses a consistent pattern: commands flow through the dispatch layer, load configuration and repository state, perform operations using the data layer, and write results. The pluggable backend architecture (especially for references) allows Git to scale to different repository sizes and use cases.
Sources: git.c:1-900, Makefile:1-1000, config.c:1-500, read-cache.c:1-500, packfile.c:1-500, refs.c:1-500