Not logged in
Any questions about the meaning of the content of this page can be sent to
vbsqliteNOSPAM@NOSPAMag-software.com (just remove the NOSPAM)
This page is an area where you can place code, hints, suggestion or text
that relates to increasing the speed of the SQLite databases.
Placing indexes on columns that are used in joins or where statements are
the best way to increase query speed.
Clustered indexes are indexes that comprise more than one column. These
increase the speed of queries where you have multiple joins of one table or you
are selecting rows based on multiple columns. (This is true in other DB
systems, I don't know if it holds true in SQLite.)
From an email by D. Richard Hipp on 13 January 2004:
SQLite implements JOIN USING by translating the USING clausing into some
extra WHERE clause terms. It does the same with NATURAL JOIN and JOIN ON. So
while those constructs might be helpful to the human reader, they don't really
make any difference to SQLite's query optimizer.
SQLite converts joins into where statements. This requires CPU time to
perform the conversion, however small that CPU time might be. If you are
calling a lot of SQL statements which are returning small amount of rows then
it might reduce the overall time (CPU cost) if you manually remove the JOIN and
replace them with where statements.While savings on this might be small in
tight loops of 1000's or in low memory situations it could be worth the coding.
When SQLite supports a prepare/precompile interface then precompiling the SQL
Statements will be the better way to go.
Place the tables where you can eliminate the most rows by using a where
clause (preferably on an indexed column) first, in order to limit the number of
JOIN operations required.
The following is from a message posted by D. Richard Hipp to the SQLite
mailing list regarding join translations:
When SQLite sees this:
SELECT * FROM a JOIN b ON a.x=b.y;
It translate it into the following before compiling it:
SELECT * FROM a, b WHERE a.x=b.y;
Neither form is more efficient that the other. Both will generate
identical code. (There are subtle differences on an LEFT OUTER JOIN,
but those details can be ignored when you are looking at things at a
high level, as we are.)
SQLite implements joins using nested loops with the outer
loop formed by the first table in the join and the inner loop formed by
the last table in the join. So for the example above you would have:
For each row in a:
For each row in b such that b.y=a.x:
Return the row
If you reverse the order of the tables in the FROM clause like
SELECT * FROM b, a WHERE a.x=b.y;
You should get an equivalent result on output, but SQLite will implement
the query differently. Specifically it does this:
For each row in b:
For each row in a such that a.x=b.y:
Return the row
The trick is that you want to arrange the order of tables so that the
"such that" clause on the inner loop is able to use an index to jump
right to the appropriate row instead of having to do a full table scan.
Suppose, for example, that you have an index on a(x) but not on b(y).
Then if you do this:
For each row in a, you have to do a full scan of table b. So the time
complexity will be O(N^2). But if you reverse the order of the tables
in the FROM clause, like this:
SELECT * FROM b, a WHERE b.y=a.x;
For each row in b:
For each row in a such that a.x=b.y
Return the row
Now the inner loop is able to use an index to jump directly to the rows
in a that it needs and does not need to do a full scan of the table.
The time complexity drops to O(NlogN).
So the rule should be: For every table other than the first, make sure
there is a term in the WHERE clause (or the ON or USING clause if that
is your preference) that lets the search jump directly to the relavant
rows in that table based on the results from tables to the left.
Other database engines with more complex query optimizers will typically
attempt to reorder the tables in the FROM clause in order to give you
the best result. SQLite is more simple-minded - it codes whatever you
tell it to code.
Before you ask, I'll point out that it makes no different whether you
say "a.x=b.y" or "b.y=a.x". They are equivalent. All of the following
generate the same code:
When you create a column with INTEGER PRIMARY KEY, SQLite uses this column
as the key for (index to) the table structure. This is a hidden index (as it
isn't displayed in SQLite_Master table) on this column. Adding another index
on the column is not needed and will never be used. In addition it will slow
INSERT, DELETE and UPDATE operations down.
Make sure that you wrap up all multiple updates inside a transaction,
UPDATE table1 SET col1='1';
UPDATE table1 SET col1='2';
INSERT INTO table1(col1) VALUES ('2');
(The word "TRANSACTION" is optional.)
Using a transaction is the fastest way to update data in SQLite. Basically,
this is how it works: After each transaction the SQLite engine closes and opens
the database file. When SQLite opens a database file it populates the SQlite
internal structures, which takes time. So if you have 100 updates and don't
use a transaction then SQlite will open and close the database 100 times.
Using transactions improves speed. Use them.
SQLite can use multiple indexes on a query. You just have to tell it to explicitly by restructuring your SQL.
As an example, consider this query:
SELECT * FROM table1 WHERE a=5 AND b=11;
Suppose there are two indices:
CREATE INDEX index1 ON table1(a);
CREATE INDEX index2 ON table1(b);
As written, SQLite will only use one of these two indices
to perform the query. The choice is arbitrary (unless you
have run ANALYZE and SQLite has some information to help it
pick the "best" index.)
If you want to use both indices, rewrite the query this
SELECT * FROM table1 WHERE rowid IN
(SELECT rowid FROM table1 WHERE a=5
INTERSECT SELECT rowid FROM table1 WHERE b=11);
The optimizer in PostgreSQL will make this change for you
automatically and will use a bitmap to implement the IN
operator and the INTERSECT. With SQLite, though, you have to
type in the expanded version yourself. And because rowids
in SQLite are user visible and changeable and can thus be
diffuse, SQLite is unable to use bitmaps to optimize the
computation. But modulo the bitmap optimization, SQLite gives
you all the capabilities of PostgreSQL, you just have to type
it in yourself rather than letting the optimizer do it for
Checked only with version 2.8.18! When you only want to check if something exists in a table which has an index(e.g. an index on "hour"), you might do this like so:
SELECT hour FROM appointments WHERE hour=5 LIMIT 1;
- or worse -
SELECT * FROM appointments WHERE hour=5 LIMIT 1;
SELECT 1 FROM appointments WHERE hour=5 LIMIT 1;
Entered by Nathan P Sharp, Jan 2008
While trying to improve the performance of bulk imports in our C++ project, we found that creating the prepared statements was a large hit. Creating them once at the construction of our class, though, made the problem worse! It turns out that prepared statements that are generated before the transaction start do not work with the transaction. The fix was simply to create new prepared statements once per transaction.