The Main Problem
The original report of this problem is ticket #2409. The problem appears when an SQLITE_BUSY error occurs in the middle of an explicit transaction. The database file has not been damaged at this point, but it is possible that the in-memory cache of the database could be inconsistent. If the cache is left in an inconsistent state and the transaction is subsequently committed, then the database might well end up being corrupted.
How To Avoid The Problem
- If you start your transaction using "BEGIN EXCLUSIVE" instead of just "BEGIN" then it is impossible to ever get an SQLITE_BUSY error in the middle of the transaction. Hence the problem cannot occur. On the other hand, you lose some concurrency because other processes cannot read the database file while you are running an exclusive transaction.
- You can avoid SQLITE_BUSY errors by using the sqlite3_busy_handler() or sqlite3_busy_timeout() interfaces to intercept failed locking attempts and retry the lock after a short delay.
- If you do encounter an SQLITE_BUSY error on a statement,
other than a COMMIT statement,
within a transaction, the only safe
thing to do is execute a ROLLBACK command to roll back
the entire transaction. If you try to reissue the failed
statement, or issue a different statement, or if you try
to commit the transaction, database corruption might
result.
-
Note: It is safe to retry a COMMIT
command that returns SQLITE_BUSY. You can also retry
any command that returns SQLITE_BUSY that is not run
from within a transaction. But if SQLITE_BUSY is
returned for any command other than COMMIT that is
run from within a transaction, you must rollback
the transaction immediately to avoid corrupting
the database file.
- The only time you will ever get an SQLITE_BUSY error for a non-COMMIT command within a transaction is if your database change has become so big that it will no longer fit entirely in cache. If you increase the cache size from the default 2000 pages, you might reduce the chance of this ever happening.
- If you are certain that no other processes will ever try to read the database file at the same time you have your transaction open, then SQLITE_BUSY can never occur in your application and this problem will never come up.
Related Problems
Similar problems may occur if a statement within a transaction is aborted due to a call to sqlite3_interrupt() or due to a progress handler callback returning non-zero. The return codes for these events are SQLITE_INTERRUPT and SQLITE_ABORT respectively. If you have a query within a transaction that returns SQLITE_INTERRUPT or SQLITE_ABORT, the only safe thing to do is to rollback the transaction.
If a statement fails due to SQLITE_BUSY, SQLITE_INTERRUPT, or SQLITE_ABORT then you should not try to read the database until after the transaction has been rolled back. You cannot corrupt the database file by reading, but you might well get an SQLITE_CORRUPT error during your read, or (worse) you might get incorrect data. If you have two or more database connections that share the same cache (using the shared cache mode) and one connection encounters a BUSY, INTERRUPT, or ABORT error, then the other connections should avoid reading the database until the transaction has rolled back.
How This Problem Will Be Fixed
In SQLite version 3.2.8 and earlier, when this situation would arise, SQLite would return SQLITE_IOERR and automatically rollback the transaction, rather than return SQLITE_BUSY. The fix in version 3.4.0 will be to revert to the 3.2.8 behavior. If the in-memory database cache overflows during a large transaction and SQLite is unable to acquire an exclusive lock on the database file in order to spill the cache into the database, then it will return SQLITE_IOERR (or SQLITE_IOERR_BLOCKED if you have extended result codes enabled) and will automatically rollback the transaction. SQLite might still return SQLITE_BUSY in some cases, but when it does return SQLITE_BUSY you can be assured that it is safe to retry the statement or commit the transaction.
The SQLITE_INTERRUPT and SQLITE_ABORT functionality is rarely used so we will merely change it to automatically rollback any open transactions. Such a change is unlikely to present any compatibility issues.
And, of course, we will be adding many new tests to the automated test suite to help insure that this particular class of errors does not recur in a future release.