This page describes how the source code under configuration management is transformed into a working SQLite library. You do not need to know any of this in order to build SQLite; the makefiles should do everything for you automatically. But many people find it helpful to know what is going on behind the scenes.

{rightimage: make-lib.gif}
SQLite is implemented in ANSI-C. But many of the C code files that are input to the C compiler are generated by scripts and programs rather than being typed in manually. The diagram to the right shows the complete build process.

In the diagram, red ovals are original source files from the configuration management system. Green ovals are C code that is automatically generated. Blue rectangles are build tools and compilers that are needed on the host platform in order to compile SQLite. Yellow rectangles are build tools and compilers for which the source code is part of the SQLite source tree. The final output (the SQLite library) is a purple oval near the bottom of the diagram.

The files contained within the light-blue bubble are the C code files that become part of the SQLite library. You will notice that some files from CVS are within the light-blue bubble and others are not. Not every code file in the CVS repository ends up being part of the SQLite library.

On the {link: /download.html download page}, the downloads with names of the form *sqlite-X.X.X.tar.gz* are snapshots of the CVS tree. These are the red ovals. The downloads with names of the form *sqlite-source-X_X_X.zip* contain just the files inside the light-blue bubble.

These are the generated C code files:

*: *keywordhash.h*. This file contains C code to implement a hash table of all of the SQL keywords that SQLite understands. Do not be misled by the ".h" suffix: this file contains actual code, not just declarations.
The reason for using ".h" instead of ".c" is that the file is #include-ed into the middle of {link: /cvstrac/rlog?f=sqlite/src/tokenize.c tokenize.c}. The keywordhash.h code file is generated by a custom C program named {link: /cvstrac/rlog?f=sqlite/tool/mkkeywordhash.c mkkeywordhash.c}. We could just as easily have hand-coded the keyword hash table, and in fact that was done in earlier versions of SQLite. But the hash table that mkkeywordhash.c generates is optimized for both speed and size. It saves about 2K of code space. And when you are trying to build an SQL database engine that will fit on embedded devices, every little bit of code space helps.

*: *sqlite3.h*. This is the header file that defines the programmer API for SQLite. It is mostly just a copy of the {link: /cvstrac/rlog?f=sqlite/src/sqlite.h.in sqlite.h.in} file from CVS with the current library version number from the file named {link: /cvstrac/rlog?f=sqlite/VERSION VERSION} inserted in strategic places.

*: *parse.c* and *parse.h*. These files implement the SQL parser for SQLite. The input grammar is in a source file named {link: /cvstrac/rlog?f=sqlite/src/parse.y parse.y}. This parse.y file is converted into C code by the {link: http://www.hwaci.com/sw/lemon/ Lemon} parser generator. The source code to Lemon is part of the SQLite source tree. The {link: /cvstrac/rlog?f=sqlite/tool/lemon.c lemon.c} file is compiled to generate the lemon executable. Then the lemon executable is run with parse.y as its input to generate the output files. The {link: /cvstrac/rlog?f=sqlite/tool/lempar.c lempar.c} file is a template used by Lemon to generate its output files.

*: *opcodes.h*. This file contains #defines that map opcode names into opcode numbers for the Virtual Database Engine (VDBE) in the core of SQLite. It is generated by an AWK script that uses both the parse.h file generated by Lemon and the {link: /cvstrac/rlog?f=sqlite/src/vdbe.c vdbe.c} source file from CVS as inputs.
The vdbe.c file is the implementation of the virtual machine. It is scanned to figure out which opcodes are needed. The parse.h file is used because, for efficiency reasons, we want some of the VDBE opcodes to have the same numeric value as token codes in the parser. For example, the token code for the "+" operator is the same as the addition opcode in the VDBE. Arranging things this way makes code generation much easier.

*: *opcodes.c*. This file maps VDBE opcode numbers back into symbolic names so that symbolic opcode names (rather than obscure opcode numbers) can appear in the output of {link: /lang_explain.html EXPLAIN}.

Generating the processed C code can be a little bit tricky. Note the dependency chain from parse.h to opcodes.h to opcodes.c. You have to be careful to do things in the right order. Fortunately, the makefiles do this for you automatically.

Note: there is a makefile target that will generate just the processed C code and stop. If you type

  make target_source

then the makefiles will construct a subdirectory named "tsrc" and put copies of the processed C code into that directory. That is how the sqlite-source-X_X_X.zip downloads are generated: we just run the target_source make target and ZIP up the "tsrc" subdirectory.

After all of the processed C code has been prepared as shown above, the SQLite library is generated simply by passing the processed C code into an ordinary C compiler.

-----

*Building The Amalgamation*

{rightimage: make-amal.gif}
Beginning with version 3.3.14, SQLite is available in the form of a single huge file that contains all of the C code for SQLite. We call this single source file "{link: /cvstrac/wiki?p=TheAmalgamation the amalgamation}".

The diagram to the right shows how the amalgamation is built. Very little has changed from the previous diagram. The processed C code in the light-blue bubble is the same and all the steps needed to generate that code are the same.
The only difference is in what we do with the processed C code.

To generate the amalgamation, a {link: http://www.tcl.tk/ Tcl} script named {link: /cvstrac/rlog?f=sqlite/tool/mksqlite3c.tcl mksqlite3c.tcl} reads the processed C code and copies it all into the amalgamation file, "sqlite3.c". The mksqlite3c.tcl script has to take care to replace #includes of internal header files with the actual content of those headers, and to make sure that headers are not included more than once. And it has to add the sources in just the right order. So building the amalgamation is more than just concatenating the files together. But it is not a lot more.

Beginning with version 3.3.15, there is a makefile target that will automatically build the amalgamation. Type:

  make sqlite3.c

and the makefile will automatically construct the processed C code and then run mksqlite3c.tcl for you.
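
The include handling described above can be illustrated with a small sketch. This is not the mksqlite3c.tcl script and does not reproduce its logic; it is a minimal Python illustration of the same idea, assuming the processed C code is available as an in-memory mapping from file name to text. The function name `amalgamate` and the file names `defs.h`, `a.c`, and `b.c` are hypothetical and exist only for this example.

```python
import re

# Matches lines such as:  #include "defs.h"
# System headers written as <stdio.h> do not match and pass through untouched.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"')


def amalgamate(files, order):
    """Concatenate files (name -> text) in the given order.

    An #include of an internal header (one whose name is a key of `files`)
    is replaced by that header's content the first time it is seen and by
    a short comment on every later occurrence. Hypothetical helper, not
    part of the SQLite source tree.
    """
    seen = set()   # names whose content has already been emitted
    out = []

    def emit(name):
        seen.add(name)  # mark before recursing so include cycles terminate
        out.append(f"/* begin {name} */")
        for line in files[name].splitlines():
            m = INCLUDE_RE.match(line)
            if m and m.group(1) in files:
                header = m.group(1)
                if header in seen:
                    out.append(f"/* {header} already included */")
                else:
                    emit(header)
            else:
                out.append(line)
        out.append(f"/* end {name} */")

    for name in order:
        if name not in seen:
            emit(name)
    return "\n".join(out) + "\n"


# Hypothetical input: two source files sharing one internal header.
files = {
    "defs.h": "#define N 3",
    "a.c": '#include <stdio.h>\n#include "defs.h"\nint a(void){ return N; }',
    "b.c": '#include "defs.h"\nint b(void){ return N + 1; }',
}
text = amalgamate(files, ["a.c", "b.c"])
print(text)
```

In this sketch the content of `defs.h` appears once, at the point where `a.c` first includes it, and the include in `b.c` is reduced to a comment. The `order` argument stands in for the requirement that the sources be added in just the right order.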