SQLite is different from most other SQL database engines in that its primary design goal is to be simple:
- Simple to administer
- Simple to operate
- Simple to use in a program
- Simple to maintain and customize
Many people like SQLite because it is small and fast. But those qualities are just happy accidents. Users also find that SQLite is very reliable. Reliability is a consequence of simplicity. With less complication, there is less to go wrong. So, yes, SQLite is small, fast, and reliable, but first and foremost, SQLite strives to be simple.
Simplicity in a database engine can be either a strength or a weakness, depending on what you are trying to do. In order to achieve simplicity, SQLite has had to sacrifice other characteristics that some people find useful, such as high concurrency, fine-grained access control, a rich set of built-in functions, stored procedures, esoteric SQL language features, XML and/or Java extensions, tera- or peta-byte scalability, and so forth. If you need these kinds of features and don't mind the added complexity that they bring, then SQLite is probably not the database for you. SQLite is not intended to be an enterprise database engine. It's not designed to compete with Oracle or PostgreSQL.
The basic rule of thumb for when it is appropriate to use SQLite is this: Use SQLite in situations where simplicity of administration, implementation, and maintenance are more important than the countless complex features that enterprise database engines provide. As it turns out, situations where simplicity is the better choice are more common than many people realize.
Situations Where SQLite Works Well
- Websites
SQLite will usually work great as the database engine for low- to medium-traffic websites (which is to say, 99.9% of all websites). The amount of web traffic that SQLite can handle depends, of course, on how heavily the website uses its database. Generally speaking, any site that gets fewer than 100,000 hits/day should work fine. The 100,000 hits/day figure is a conservative estimate, not a hard upper bound; SQLite has been demonstrated to work with ten times that amount of traffic.
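For perspective, 100,000 hits/day works out to an average of just over one hit per second, which is why the figure is a conservative one.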
- Embedded devices and applications
Because an SQLite database requires little or no administration, SQLite is a good choice for devices or services that must work unattended and without human support. SQLite is a good fit for use in cellphones, PDAs, set-top boxes, and/or appliances. It also works well as an embedded database in downloadable consumer applications.
- Application File Format
SQLite has been used with great success as the on-disk file format for desktop applications such as financial analysis tools, CAD packages, record-keeping programs, and so forth. The traditional File/Open operation does an sqlite_open() and executes a BEGIN TRANSACTION to get exclusive access to the content. File/Save does a COMMIT followed by another BEGIN TRANSACTION. The use of transactions guarantees that updates to the application file are atomic, durable, isolated, and consistent, as sketched below.
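A minimal sketch of the SQL that this open/save cycle issues (the surrounding sqlite_open() call and menu handlers live in the application's own code):

    BEGIN TRANSACTION;   -- File/Open: take the lock and start editing
    -- ... each user edit becomes an INSERT, UPDATE, or DELETE here ...
    COMMIT;              -- File/Save: all accumulated edits become permanent at once
    BEGIN TRANSACTION;   -- reopen immediately so editing can continue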
Temporary triggers can be added to the database to record all changes into a (temporary) undo/redo log table. These changes can then be played back when the user presses the Undo and Redo buttons. Using this technique, an unlimited-depth undo/redo implementation can be written in surprisingly little code.
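A hedged sketch of that trigger technique, assuming a hypothetical document table named note (the table, trigger, and log names are illustrative, not part of SQLite):

    CREATE TEMP TABLE undolog(
      seq INTEGER PRIMARY KEY,   -- order in which changes occurred
      sql TEXT                   -- statement that reverses the change
    );
    -- Record the inverse of every UPDATE on the hypothetical note table.
    CREATE TEMP TRIGGER undo_note_update AFTER UPDATE ON note
    BEGIN
      INSERT INTO undolog(sql) VALUES(
        'UPDATE note SET body=' || quote(old.body) || ' WHERE id=' || old.id
      );
    END;
    -- Undo: execute the sql text of the newest undolog row, then delete
    -- that row (pushing a corresponding redo entry first, if desired).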
- Replacement for ad hoc disk files
Many programs use fopen(), fread(), and fwrite() to create and manage files of data in home-grown formats. SQLite works well as a replacement for these ad hoc data files.
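For instance, a home-grown key/value settings file managed with fopen() and friends might become a table like this (the table and column names are purely illustrative):

    CREATE TABLE config(
      key   TEXT PRIMARY KEY,
      value TEXT
    );
    INSERT INTO config VALUES('window_width', '800');
    SELECT value FROM config WHERE key = 'window_width';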
- Internal or temporary databases
For programs that have a lot of data that must be sifted and sorted in diverse ways, it is often easier and quicker to load the data into an in-memory SQLite database and use queries with joins and ORDER BY clauses to extract the data in the form and order needed than to code the same operations by hand. Using an SQL database internally in this way also gives the program greater flexibility, since new columns and indices can be added without having to recode every query.
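A sketch of the idea, assuming the database was opened with the special filename ":memory:" so nothing ever touches the disk (the table and columns are invented for illustration):

    -- Load the raw data once...
    CREATE TABLE samples(run INTEGER, label TEXT, score REAL);
    -- ...then extract it in whatever form and order each report needs.
    SELECT label, count(*), avg(score)
      FROM samples
     GROUP BY label
     ORDER BY avg(score) DESC;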
- Command-line dataset analysis tool
Experienced SQL users can employ the command-line sqlite program to analyze miscellaneous datasets. Raw data can be imported using the COPY command, and that data can then be sliced and diced to generate a myriad of summary reports. Possible uses include website log analysis, sports statistics analysis, compilation of programming metrics, and analysis of experimental results.
You can also do the same thing with an enterprise client/server database, of course. The advantages of using SQLite in this situation are that SQLite is much easier to set up and the resulting database is a single file that you can store on a floppy disk or email to a colleague.
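For example, a web log broken into tab-separated fields could be imported and summarized like this (COPY is the import command of the old sqlite shell this page describes; the table layout and file name are illustrative):

    CREATE TABLE hits(page TEXT, ipaddr TEXT, bytes INTEGER);
    COPY hits FROM 'access.log';
    SELECT page, count(*) AS requests, sum(bytes) AS total_bytes
      FROM hits
     GROUP BY page
     ORDER BY requests DESC;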
- Stand-in for an enterprise database during demos or testing
If you are writing a client application for an enterprise database engine, it makes sense to use a generic database backend that allows you to connect to many different kinds of SQL database engines. It makes even better sense to go ahead and include SQLite in the mix of supported databases and to statically link the SQLite engine in with the client. That way the client program can be used standalone with an SQLite data file for testing or for demonstrations.
- Database Pedagogy
Because it is simple to set up and use (installation is trivial: just copy the sqlite or sqlite.exe executable to the target machine and run it), SQLite makes a good database engine for use in teaching SQL. Students can easily create as many databases as they like and can email databases to the instructor for comments or grading. For more advanced students who are interested in studying how an RDBMS is implemented, the modular, well-commented, and well-documented SQLite code can serve as a good basis. This is not to say that SQLite is an accurate model of how other database engines are implemented, but rather that a student who understands how SQLite works can more quickly comprehend the operational principles of other systems.
- Experimental SQL language extensions
The simple, modular design of SQLite makes it a good platform for prototyping new, experimental database language features or ideas.
Situations Where Another RDBMS May Work Better
- Client/Server Applications
If you have many client programs accessing a common database over a network, you should consider using a client/server database engine instead of SQLite. SQLite will work over a network filesystem, but because of the latency associated with most network filesystems, performance will not be great. Also, the file locking logic of many network filesystem implementations contains bugs (on both Unix and Windows). If file locking does not work as it should, two or more client programs might modify the same part of the same database at the same time, resulting in database corruption. Because this problem results from bugs in the underlying filesystem implementation, there is nothing SQLite can do to prevent it.
A good rule of thumb is that you should avoid using SQLite in situations where the same database will be accessed simultaneously from many computers over a network filesystem.
Note:
Client/server usually means that there are multiple client processes and at least one server process; if there is no server process, it is not truly a "client/server" setting. Usually, the server process accesses the database on behalf of all clients. In such circumstances, there is no reason for clients to access SQLite directly over a network filesystem, so you can perfectly well use SQLite for client/server applications.
- High-volume Websites
SQLite will normally work fine as the database backend to a website. But if your website is so busy that you are thinking of splitting the database component off onto a separate machine, then you should definitely consider using an enterprise-class client/server database engine instead of SQLite.
Note:
First, what is "high-volume"?
Second, scaling is always problematic, regardless of what technology you use.
The first scaling problems you hit are usually not related to the backend technology but to the application design itself. For example, sending the web browser a list of all customers may work well for 50 customers. With 10,000 customers, you will need to introduce at least some kind of <page 1 2 ...> and/or <previous/next> logic and limit the query to (e.g.) 50 customers at a time. Sending too much data for a request, and consuming too many server resources in doing so, is a very typical scaling problem, and it is totally unrelated to the backend technology. Conclusion: before you hit the SQLite limitations, you will hit a string of design issues that limit the scalability of your application well below the SQLite limits.
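A sketch of the usual fix, using the customers example from above: let the page number select a window of rows instead of returning everything.

    -- Page 3 of the customer list, 50 rows per page.
    SELECT *
      FROM customers
     ORDER BY name
     LIMIT 50 OFFSET 100;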
- Very large datasets
When you start a transaction in SQLite (which happens automatically before any write operation that is not within an explicit BEGIN...COMMIT), the engine has to allocate a bitmap of dirty pages in the disk file to help it manage its rollback journal. SQLite needs 256 bytes of RAM for every 1MB of database. For smaller databases, the amount of memory required is not a problem, but when databases begin to grow into the multi-gigabyte range, the size of the bitmap can get quite large. If you need to store and modify more than a few dozen GB of data, you should consider using a different database engine.
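To put numbers on that: at 256 bytes per megabyte, a 40 GB database needs about 40 × 1024 × 256 bytes ≈ 10 MB of RAM for the bitmap of every write transaction, and the cost grows linearly with the database size.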
Note:
As if Oracle, SQL Server, or any other RDBMS did not consume huge amounts of memory when managing large databases ... I seriously doubt that they limit themselves to 256 bytes per megabyte of data.
- High Concurrency
SQLite uses reader/writer locks on the entire database file. That means if any process is reading from any part of the database, all other processes are prevented from writing to any part of the database. Similarly, if any one process is writing to any part of the database, all other processes are prevented from reading any part of the database. For many situations, this is not a problem. Each application does its database work quickly and moves on, and no lock lasts for more than a few milliseconds. But there are some situations that require more concurrency, and those applications will need to seek a different solution.
Note:
How do other database servers tackle this problem?
Well, for starters, you will find that in Oracle or SQL Server only one thread or process writes to the filesystem at a time. So it is not the case that other database servers allow multiple concurrent writes to the filesystem.
How do these other database servers manage high concurrency?
They actually apply a relatively simple trick. All changes are made in memory (and appended to the journal) first; the thread that commits changes to the filesystem does so only much later, picking up the changes as it makes its rounds across the dirty memory records and committing them to disk.
In this way, for example, you may get a counter-intuitive "Out of memory" error on SQL Server when you delete all records from a large table. Why? Because there is not enough space available to keep all these deleted records in memory until the writing thread can commit the deletes to disk.
This has several serious implications. First, records may never be served directly from the filesystem itself, but must be served from memory (where the records' latest versions are found). Second, there must be a strategy to evict records from memory that are no longer, or not often enough, being accessed, to prevent this shared memory from growing without bounds. Third, all clients must forward their requests to this shared-memory custodian, which implies a separate process to manage the shared memory.
Changing SQLite to incorporate such custodian logic would defeat the purpose of having an embeddable database that doesn't need its own separate process.
Therefore, a solution would probably be to develop such a shared-memory custodian as a separate add-on, along with a protocol for the request/response conversation between this custodian and the processes that need to access a common database with high concurrency.
By the way, at a lower level you will find that the operating system already intelligently caches disk sectors that are being accessed regularly. Even though the database server itself would probably outperform the underlying OS in knowing how and what exactly to cache, the underlying caching by the OS already contributes substantially to improving SQLite's performance under high concurrency.
What about multi-tier usage? Has anybody had success using SQLite from app servers? (From what I read, I would guess it should work pretty well there.)