This page describes how the source code under configuration management
is transformed into a working SQLite library.
You do not need to know any of this in order to build SQLite.
The makefiles should do everything for you automatically.
But many people find a knowledge of what is going on behind
the scenes helpful.

{rightimage: make-lib.gif}
SQLite is implemented in ANSI-C.
But many of the C code files that are input to the C compiler
are generated from other scripts and programs rather than being
typed in manually.  The diagram to the right shows the complete
build process.


In the diagram, red ovals are original
source files from the configuration management system.
Green ovals are C code that is automatically generated.
Blue rectangles are build tools and compilers that are
needed on the host platform in order to compile SQLite.
Yellow rectangles are build tools and compilers for which
the source code is part of the SQLite source tree.  The
final output (the SQLite library) is a purple oval near
the bottom of the diagram.

The files contained within the light-blue bubble are the
C code files that become part of the SQLite library.
You will notice that some files from CVS are within the
light-blue bubble and others are not.  Not every code file in
the CVS repository ends up being part of the SQLite library.
On the {link: /download.html download page}, the
downloads with names of the form *sqlite-X.X.X.tar.gz* are
snapshots of the CVS tree.  These are the red ovals.  The
downloads with names of the form *sqlite-source-X_X_X.zip*
contain just the files inside the light-blue bubble.

These are the generated C code files:

*: *keywordhash.h*.
This file contains C code to implement a hash table of all
of the SQL keywords that SQLite understands.  Do not be
misled by the ".h" suffix - this file contains actual
code, not just declarations.  The reason for using ".h"
instead of ".c" is that the file is #include-ed into the
middle of {link: /cvstrac/rlog?f=sqlite/src/tokenize.c tokenize.c}.
The keywordhash.h code file is generated by a custom C program
named {link: /cvstrac/rlog?f=sqlite/tool/mkkeywordhash.c mkkeywordhash.c}.
We could just as easily have hand-coded the keyword hash
table, and in fact that was done in earlier versions of SQLite.
But the hash table that mkkeywordhash.c generates is optimized for both
speed and size.  It saves about 2K of code space.  And when
you are trying to build an SQL database engine that will fit
on embedded devices, every little bit of code space helps.

*: *sqlite3.h*.
This is the header file that defines the programmer API for
SQLite.  This is mostly just a copy of the
{link: /cvstrac/rlog?f=sqlite/src/sqlite.h.in sqlite.h.in}
file from CVS with the current library version number from
the file named
{link: /cvstrac/rlog?f=sqlite/VERSION VERSION} inserted
in strategic places.

*: *parse.c* and *parse.h*.
These files implement the SQL parser for SQLite.  The input
grammar is in a source file named
{link: /cvstrac/rlog?f=sqlite/src/parse.y parse.y}.  This
parse.y file is converted into C code by the
{link: http://www.hwaci.com/sw/lemon/ Lemon} parser generator.
The source code to Lemon is part of the SQLite source tree.
The {link: /cvstrac/rlog?f=sqlite/tool/lemon.c lemon.c}
file is compiled to generate the lemon executable.  Then
the lemon executable is run with parse.y as its input to
generate the output files.  The
{link: /cvstrac/rlog?f=sqlite/tool/lempar.c lempar.c} file
is a template used by Lemon to generate its output files.

*: *opcodes.h*.
This file contains #defines that map opcode names into
opcode numbers for the Virtual Database Engine (VDBE) in
the core of SQLite.  It is generated by an AWK script that
uses both the parse.h file generated by Lemon and the
{link: /cvstrac/rlog?f=sqlite/src/vdbe.c vdbe.c} source
file from CVS as inputs.  The vdbe.c file is the implementation
of the virtual machine.  It is scanned to figure out which
opcodes are needed.  The parse.h file is used because, for
efficiency, we want some of the VDBE opcodes to
have the same numeric value as token codes in the parser.
For example, the token code for the "+" operator is the
same as the addition opcode in the VDBE.  Arranging things
this way makes code generation much easier.

*: *opcodes.c*.
This file maps VDBE opcode numbers back into symbolic names
so that symbolic opcode names (rather than obscure opcode
numbers) can appear in the output of
{link: /lang_explain.html EXPLAIN}.
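
The way opcodes.h and opcodes.c are derived from parse.h and vdbe.c
can be sketched roughly as follows.  This is an illustration in Python,
not the AWK script that the build actually runs.  The sample inputs, the
token code values 78 and 79, the "same as TK_..." comment convention, and
all function names are assumptions made for the example.

```python
import re

# Hypothetical excerpt of parse.h: token names mapped to token codes.
PARSE_H = """\
#define TK_PLUS   78
#define TK_MINUS  79
"""

# Hypothetical excerpt of vdbe.c: one "case OP_xxx:" label per opcode.
VDBE_C = """\
case OP_Goto: {
case OP_Add:       /* same as TK_PLUS */
case OP_Subtract:  /* same as TK_MINUS */
case OP_Halt: {
"""

def token_codes(parse_h):
    """Return {token name: code} from the #define lines of parse.h."""
    return {name: int(num)
            for name, num in re.findall(r"#define\s+(TK_\w+)\s+(\d+)", parse_h)}

def assign_opcodes(vdbe_c, tokens):
    """Give each opcode a number, reusing token codes where requested."""
    fixed, free = {}, []
    for line in vdbe_c.splitlines():
        m = re.match(r"\s*case\s+(OP_\w+)\s*:", line)
        if not m:
            continue
        same = re.search(r"same as (TK_\w+)", line)
        if same:
            fixed[m.group(1)] = tokens[same.group(1)]
        else:
            free.append(m.group(1))
    # Remaining opcodes take the smallest numbers not already in use.
    used = set(fixed.values())
    next_num = 1
    for name in free:
        while next_num in used:
            next_num += 1
        fixed[name] = next_num
        used.add(next_num)
    return fixed

def opcodes_h(opcodes):
    """Render the #define lines that would go into opcodes.h."""
    return "".join(f"#define {name} {num}\n"
                   for name, num in sorted(opcodes.items(), key=lambda kv: kv[1]))

def opcode_names(opcodes):
    """Build the number-to-name table that opcodes.c provides for EXPLAIN."""
    return {num: name[len("OP_"):] for name, num in opcodes.items()}

opcodes = assign_opcodes(VDBE_C, token_codes(PARSE_H))
print(opcodes_h(opcodes), end="")
```

In this sketch OP_Add ends up with the same number as TK_PLUS, so a
code generator could use the parser's token code directly as the opcode.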

Generating the processed C code can be a little bit tricky.
Note the dependency trace from parse.h to opcodes.h to
opcodes.c.  You have to be careful to do things in the right
order.  Fortunately, the makefiles do this for you automatically.
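
To make the ordering constraint concrete, here is a minimal sketch
that computes one valid generation order from the dependencies described
above.  It is written in Python with the standard graphlib module
(Python 3.9 or later) purely for illustration; the real makefiles express
the same relationships as make rules.  The dictionary layout and the
function name are assumptions, and "lemon" and "mkkeywordhash" stand for
the compiled executables.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each generated file mapped to the files it is built from, following
# the descriptions in the text.
DEPENDS_ON = {
    "lemon": {"lemon.c"},
    "parse.c": {"lemon", "lempar.c", "parse.y"},
    "parse.h": {"lemon", "lempar.c", "parse.y"},
    "opcodes.h": {"parse.h", "vdbe.c"},
    "opcodes.c": {"opcodes.h"},
    "mkkeywordhash": {"mkkeywordhash.c"},
    "keywordhash.h": {"mkkeywordhash"},
    "sqlite3.h": {"sqlite.h.in", "VERSION"},
}

def build_order(depends_on):
    """Return the generated files in an order that respects dependencies."""
    order = TopologicalSorter(depends_on).static_order()
    # Original source files have no entry of their own; skip them.
    return [name for name in order if name in depends_on]

order = build_order(DEPENDS_ON)
print(order)
```

Several orders are valid.  The only guarantee is that each file appears
after everything it is built from, so parse.h always precedes opcodes.h,
which always precedes opcodes.c.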

Note: there is a makefile target that will generate just the
processed C code and stop.  If you type

   make target_source

then the makefiles will construct a subdirectory named
"tsrc" and put copies of the processed C code into that
directory.  That is how the sqlite-source-X_X_X.zip downloads
are generated: we just run the target_source make target
and ZIP up the "tsrc" subdirectory.

After all of the processed C code has been prepared as
shown above, the SQLite library is generated simply by
passing the processed C code into an ordinary C compiler.

-----
*Building The Amalgamation*

{rightimage: make-amal.gif}

Beginning with version 3.3.14, SQLite is available in the
form of a single huge
file that contains all of the C code for SQLite.
We call this single source file
"{link: /cvstrac/wiki?p=TheAmalgamation the amalgamation}".
The diagram to the right shows how the amalgamation is built.


Very little has changed from the previous diagram.  The
processed C code in the light-blue bubble is the same and
all the steps needed to generate that code are the same.
The only difference is in what we do with the processed
C code.

To generate the amalgamation, there is a
{link: http://www.tcl.tk/ Tcl} script named
{link: /cvstrac/rlog?f=sqlite/tool/mksqlite3c.tcl mksqlite3c.tcl}
that reads the processed C code and copies it all into the
amalgamation file, "sqlite3.c", in the right order.
The mksqlite3c.tcl script has to take care to replace
#includes of internal header files with the actual content
of those headers, and to make sure that headers are not
included more than once.  And it has to add the sources in
just the right order.  So building the amalgamation is more
than just concatenating the files together.  But it is not
a lot more.
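
The core idea of replacing internal #includes with header content, and
doing so only once per header, can be sketched as follows.  This is an
illustration in Python, not the mksqlite3c.tcl script itself.  The file
contents, the function names, and the rule that a repeated #include is
simply dropped are assumptions made for the example.

```python
import re

# Hypothetical, tiny stand-ins for the processed C code files.
SOURCES = {
    "sqliteInt.h": "typedef struct Vdbe Vdbe;\n",
    "vdbe.h": '#include "sqliteInt.h"\nint vdbeStep(Vdbe*);\n',
    "vdbe.c": '#include "sqliteInt.h"\n#include "vdbe.h"\n#include <stdio.h>\n'
              "int vdbeStep(Vdbe *p){ return 0; }\n",
}

# Matches only quoted includes; system headers in <...> are left alone.
INCLUDE = re.compile(r'\s*#\s*include\s+"([^"]+)"')

def amalgamate(order, sources):
    """Concatenate files in order, inlining each internal header once."""
    seen = set()
    out = []

    def copy_file(name):
        seen.add(name)
        for line in sources[name].splitlines(keepends=True):
            m = INCLUDE.match(line)
            if m and m.group(1) in sources:
                header = m.group(1)
                if header not in seen:
                    copy_file(header)   # first use: paste the header here
                # later uses: drop the #include line entirely
            else:
                out.append(line)        # ordinary code and system #includes

    for name in order:
        if name not in seen:
            copy_file(name)
    return "".join(out)

amalgamation = amalgamate(["vdbe.c"], SOURCES)
print(amalgamation, end="")
```

The order list passed to the function plays the role of the "right order"
mentioned above: a file must come after any declarations it relies on.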

Beginning with version 3.3.15, there is a makefile target that
will automatically build the amalgamation.  Type:

    make sqlite3.c

And the makefile will automatically construct the processed
C code then run mksqlite3c.tcl for you.