
Performance Tuning

Any questions about the meaning of the content of this page can be sent to vbsqliteNOSPAM@NOSPAMag-software.com (just remove the NOSPAM)

This page is an area where you can place code, hints, suggestions or text that relate to increasing the speed of SQLite databases.

Place indexes on join and where columns

Placing indexes on the columns used in joins or WHERE clauses is usually the most effective way to increase query speed.
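
As a rough illustration, the sketch below uses Python's standard sqlite3 module to compare the query plan before and after an index is created. The table orders, its columns and the index name idx_orders_customer are hypothetical names made up for this example, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

def plan_details(conn, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]

conn = sqlite3.connect(":memory:")
# Hypothetical table, for illustration only.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)

query = "SELECT * FROM orders WHERE customer_id = ?"

# Without an index on customer_id the only option is to scan the table.
plan_before = plan_details(conn, query, (42,))

# Index the column used in the WHERE clause, then ask for the plan again.
conn.execute("CREATE INDEX idx_orders_customer ON orders(customer_id)")
plan_after = plan_details(conn, query, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The second plan should mention idx_orders_customer, which is a quick way to check that an index you added is actually being picked up.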

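Use multi-column indexes

A multi-column (composite) index is an index that comprises more than one column. (These are sometimes called clustered indexes, but in most database systems that term describes how the rows themselves are physically ordered, which is a different thing.) Multi-column indexes speed up queries that join one table several times or that select rows based on multiple columns. SQLite supports them, and can use one when the query constrains the leading (leftmost) column or columns of the index.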
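
A minimal sketch of an index over more than one column, again using Python's sqlite3 module. The table people, its columns and the index name idx_people_name are assumptions invented for the example; the plan text depends on the SQLite version in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, for illustration only.
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, last_name TEXT, first_name TEXT, city TEXT)"
)
# One index covering both columns that the query filters on.
conn.execute("CREATE INDEX idx_people_name ON people(last_name, first_name)")
conn.executemany(
    "INSERT INTO people (last_name, first_name, city) VALUES (?, ?, ?)",
    [("Smith", "Ann", "Leeds"), ("Smith", "Bob", "York"), ("Jones", "Ann", "Bath")],
)

query = "SELECT city FROM people WHERE last_name = ? AND first_name = ?"
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query, ("Smith", "Bob"))]
cities = [row[0] for row in conn.execute(query, ("Smith", "Bob"))]

print(plan)
print(cities)
```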

How SQLite works out JOINs

From an email by D. Richard Hipp on 13 January 2004:

SQLite implements JOIN USING by translating the USING clause into some extra WHERE clause terms. It does the same with NATURAL JOIN and JOIN ON. So while those constructs might be helpful to the human reader, they don't really make any difference to SQLite's query optimizer.

SQLite converts joins into WHERE clause terms. The conversion costs some CPU time, however small. If you run many SQL statements that each return only a few rows, rewriting the JOINs by hand as WHERE terms might reduce the overall CPU cost.
The savings are likely to be small, but in tight loops of thousands of statements or in low-memory situations it could be worth the coding. Where a prepare/precompile interface is available (SQLite 3 provides one through sqlite3_prepare_v2), precompiling the SQL statements is the better way to go.
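
One way to get the benefit of precompiled statements from Python is sketched below. It relies on the standard sqlite3 module preparing statements internally and keeping a cache of them keyed by the SQL text, so a single string with a ? placeholder is compiled once and reused. The table readings and the helper lookup_all are hypothetical names for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor INTEGER, value REAL)")  # hypothetical table
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)", [(s, s * 1.5) for s in range(1000)]
)

# One SQL string with a placeholder: the text never changes, so the compiled
# statement can be reused and only the bound value differs per call.
LOOKUP = "SELECT value FROM readings WHERE sensor = ?"

def lookup_all(conn, sensors):
    """Run the same parameterized statement once per sensor id."""
    return [conn.execute(LOOKUP, (s,)).fetchone()[0] for s in sensors]

values = lookup_all(conn, range(5))
print(values)
```

Building a new SQL string for every call (for example by pasting the sensor id into the text) would defeat this, because each distinct string has to be parsed and compiled again.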

Order your tables properly in your FROM clause

List first the tables from which a WHERE clause (preferably on an indexed column) eliminates the most rows, so that fewer rows have to go through the JOIN.

The following is from a message posted by D. Richard Hipp to the SQLite mailing list regarding join translations:

When SQLite sees this:

SELECT * FROM a JOIN b ON a.x=b.y;

It translates it into the following before compiling it:

SELECT * FROM a, b WHERE a.x=b.y;

Neither form is more efficient than the other. Both will generate identical code. (There are subtle differences on a LEFT OUTER JOIN, but those details can be ignored when you are looking at things at a high level, as we are.)
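
(Not part of the original message.) A small sketch for checking this on your own SQLite build, using Python's sqlite3 module. The extra columns label and note and the sample rows are invented for the example; the comparison of the two plans is printed rather than assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Tables a and b as in the message; the second column of each is made up.
conn.execute("CREATE TABLE a (x INTEGER, label TEXT)")
conn.execute("CREATE TABLE b (y INTEGER, note TEXT)")
conn.executemany("INSERT INTO a VALUES (?, ?)", [(1, "a1"), (2, "a2"), (3, "a3")])
conn.executemany("INSERT INTO b VALUES (?, ?)", [(2, "b2"), (3, "b3"), (4, "b4")])

join_form = "SELECT * FROM a JOIN b ON a.x=b.y"
where_form = "SELECT * FROM a, b WHERE a.x=b.y"

def plan(sql):
    """Detail column of EXPLAIN QUERY PLAN for the given statement."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Sort the rows because neither query promises an output order.
rows_join = sorted(conn.execute(join_form).fetchall())
rows_where = sorted(conn.execute(where_form).fetchall())

print(rows_join)
print("same rows:", rows_join == rows_where)
print("same plan:", plan(join_form) == plan(where_form))
```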

SQLite implements joins using nested loops with the outer loop formed by the first table in the join and the inner loop formed by the last table in the join. So for the example above you would have:

For each row in a:
    For each row in b such that b.y=a.x:
        Return the row

If you reverse the order of the tables in the FROM clause like this:

SELECT * FROM b, a WHERE a.x=b.y;

You should get an equivalent result on output, but SQLite will implement the query differently. Specifically it does this:

For each row in b:
    For each row in a such that a.x=b.y:
        Return the row

The trick is that you want to arrange the order of tables so that the "such that" clause on the inner loop is able to use an index to jump right to the appropriate row instead of having to do a full table scan. Suppose, for example, that you have an index on a(x) but not on b(y). Then if you do this:

SELECT * FROM a, b WHERE a.x=b.y;

For each row in a:
    For each row in b such that b.y=a.x:
        Return the row

For each row in a, you have to do a full scan of table b. So the time complexity will be O(N^2). But if you reverse the order of the tables in the FROM clause, like this:

SELECT * FROM b, a WHERE b.y=a.x;

For each row in b:
    For each row in a such that a.x=b.y:
        Return the row

Now the inner loop is able to use an index to jump directly to the rows in a that it needs and does not need to do a full scan of the table. The time complexity drops to O(NlogN).
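
(Not part of the original message.) The two loop shapes can be mimicked in plain Python to see where the difference comes from. This is only a model, not how SQLite is implemented: the index on a(x) is stood in for by a sorted list searched with bisect, and the function names and the sample data are made up for the example.

```python
import bisect

def join_full_scan(outer, inner):
    """Nested loop with no index on the inner table: scan it for every outer row."""
    visited, matches = 0, []
    for o in outer:
        for i in inner:
            visited += 1          # every inner row is looked at
            if i == o:
                matches.append((o, i))
    return matches, visited

def join_with_index(outer, inner_sorted):
    """Nested loop where the inner table has an index (modelled as a sorted list)."""
    visited, matches = 0, []
    for o in outer:
        # Binary search: O(log N) jump to the first candidate row.
        pos = bisect.bisect_left(inner_sorted, o)
        while pos < len(inner_sorted) and inner_sorted[pos] == o:
            visited += 1          # only matching inner rows are looked at
            matches.append((o, inner_sorted[pos]))
            pos += 1
    return matches, visited

n = 500
a_x = list(range(n))            # values of a.x (indexed, so kept sorted)
b_y = list(range(0, 2 * n, 2))  # values of b.y (no index)

# FROM a, b: a is the outer loop, b is scanned in full for each row of a.
slow_matches, slow_visited = join_full_scan(a_x, b_y)
# FROM b, a: b is the outer loop, the index on a(x) finds the matching rows.
fast_matches, fast_visited = join_with_index(b_y, a_x)

print("inner rows visited without index:", slow_visited)
print("inner rows visited with index:   ", fast_visited)
```

Both calls find the same matches; only the number of inner rows touched differs.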

So the rule should be: For every table other than the first, make sure there is a term in the WHERE clause (or the ON or USING clause if that is your preference) that lets the search jump directly to the relevant rows in that table based on the results from tables to the left.

Other database engines with more complex query optimizers will typically attempt to reorder the tables in the FROM clause in order to give you the best result. SQLite is more simple-minded - it codes whatever you tell it to code.
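
(Not part of the original message.) The sketch below tries the rule out from Python's sqlite3 module. Newer SQLite releases may choose the loop order of a plain join themselves, so the example uses CROSS JOIN, which SQLite documents as keeping the tables in the written order, to reproduce the behaviour described here. The extra columns, the index name idx_a_x and the decision to switch off automatic indexes are assumptions made for the example, and the plan text varies by version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stop SQLite from building a temporary index on b(y) behind the scenes,
# so the plan reflects only the indexes created explicitly below.
conn.execute("PRAGMA automatic_index = OFF")
conn.execute("CREATE TABLE a (x INTEGER, label TEXT)")
conn.execute("CREATE TABLE b (y INTEGER, note TEXT)")
conn.execute("CREATE INDEX idx_a_x ON a(x)")  # index on a(x), none on b(y)

def plan(sql):
    """Detail column of EXPLAIN QUERY PLAN for the given statement."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# CROSS JOIN pins the loop order: the left table becomes the outer loop.
a_outer = plan("SELECT * FROM a CROSS JOIN b WHERE a.x=b.y")
b_outer = plan("SELECT * FROM b CROSS JOIN a WHERE a.x=b.y")

print("a outer, b inner:", a_outer)
print("b outer, a inner:", b_outer)
```

With b as the outer loop, the plan should show a search of a through idx_a_x; with a as the outer loop, the index cannot help the inner loop over b.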

Before you ask, I'll point out that it makes no difference whether you say "a.x=b.y" or "b.y=a.x". They are equivalent. All of the following generate the same code:

ON a.x=b.y
ON b.y=a.x
WHERE a.x=b.y
WHERE b.y=a.x

Indexes on INTEGER PRIMARY KEY columns (don't do it)

When you declare a column as INTEGER PRIMARY KEY, SQLite uses that column as the key for (the index to) the table structure itself. This index is hidden: it does not appear in the sqlite_master table. Adding another index on the column is unnecessary, and the extra index will never be used. It will also slow down INSERT, DELETE and UPDATE operations.
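
A short sketch of what this looks like from Python's sqlite3 module. The table items and its rows are hypothetical; the plan text differs slightly between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO items (name) VALUES (?)", [("bolt",), ("nut",), ("washer",)]
)

# The INTEGER PRIMARY KEY column is an alias for the table's built-in rowid.
pairs = conn.execute("SELECT id, rowid FROM items ORDER BY id").fetchall()

# No index for this table is listed in sqlite_master.
index_count = conn.execute(
    "SELECT count(*) FROM sqlite_master WHERE type = 'index' AND tbl_name = 'items'"
).fetchone()[0]

# A lookup by id goes straight through the table's own key.
plan = [
    row[-1]
    for row in conn.execute("EXPLAIN QUERY PLAN SELECT name FROM items WHERE id = ?", (2,))
]

print(pairs)
print(index_count)
print(plan)
```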

Use transactions when updating tables

Whenever you run several updates in a row, make sure you wrap them all in a single transaction, e.g.:

 
BEGIN TRANSACTION; 
UPDATE table1 SET col1='1'; 
UPDATE table1 SET col1='2'; 
... 
INSERT INTO table1(col1) VALUES ('2'); 
... 
COMMIT TRANSACTION;

(The word "TRANSACTION" is optional.)

Using a transaction is the fastest way to update data in SQLite. Basically, this is how it works: outside an explicit transaction, every statement runs as its own transaction, and every transaction carries a fixed overhead. SQLite has to lock the database file, write its journal, flush the changes to disk and release the lock again, all of which takes time. So if you have 100 updates and don't use a transaction, SQLite pays that overhead 100 times instead of once. Using transactions improves speed. Use them.
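
A rough way to measure the difference yourself from Python's sqlite3 module is sketched below. The helper insert_rows, the file names and the row count of 50 are made up for the example; the connection is opened with isolation_level=None so that Python leaves transaction control to the explicit BEGIN and COMMIT statements. Timings depend heavily on the disk and operating system, so none are quoted here.

```python
import os
import sqlite3
import tempfile
import time

def insert_rows(path, count, one_transaction):
    """Insert `count` rows, either in a single transaction or one commit per row."""
    # isolation_level=None: no implicit transactions are opened by the module.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("CREATE TABLE IF NOT EXISTS table1 (col1 TEXT)")
    start = time.perf_counter()
    if one_transaction:
        conn.execute("BEGIN TRANSACTION")
    for i in range(count):
        # Without an enclosing BEGIN, each INSERT commits on its own.
        conn.execute("INSERT INTO table1(col1) VALUES (?)", (str(i),))
    if one_transaction:
        conn.execute("COMMIT TRANSACTION")
    elapsed = time.perf_counter() - start
    total = conn.execute("SELECT count(*) FROM table1").fetchone()[0]
    conn.close()
    return elapsed, total

# Use real files: an in-memory database would hide the cost of syncing to disk.
with tempfile.TemporaryDirectory() as tmp:
    slow_time, slow_total = insert_rows(os.path.join(tmp, "slow.db"), 50, one_transaction=False)
    fast_time, fast_total = insert_rows(os.path.join(tmp, "fast.db"), 50, one_transaction=True)

print(f"one commit per row: {slow_time:.4f}s, single transaction: {fast_time:.4f}s")
```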
