Recently watched a great presentation and introduction to SQLite by its creator, Dr. Richard Hipp.
A short summary of the video below.
SQLite is used in many places
Strong contender for most-widely deployed application in the world; certainly within the top five. SQLite is deployed in:
- Every smartphone (Android and iOS)
- Every major browser (Chrome, Firefox, Safari)
- Every installation of Python and PHP
- Many popular desktop applications
Some people have suggested
zlib as a viable competitor, but technicalities abound on whether N applications using a shared copy of
zlib counts N times or 1. SQLite is usually statically linked by each application that uses it.
Replete with features
- Well-defined, backward compatible file format
- Power-safe transactions
- Foreign keys
- Full-text indexing
- R-tree indexes for multidimensional (up to five) lookups
- Recursive CTEs
- Partial indexes
Only one writer
A major contributor to SQLite’s compactness and reliability is its self-imposed limitation on limiting writes to a single process. There is no such limitation on the number of concurrent readers, however.
SQLite can scale larger than you think
- Up to 64-way joins
- Up to 140TB per database/file
- Up to 125 databases per connection
SQLite is used in the flight sofware for the A350 XWB family of aircraft. It has DO-178B (Software Considerations in Airborne Systems and Equipment Certification) certification, which means that it has 100% MC/DC (modified condition/decision coverage) code coverage. Every branch operation in the code must be tested. SQLite has 22,000 branch opcodes, all told, and every one of them must be tested for correctness to ensure that all possible outcomes are expected, and that no redundant branches are present. A significant portion of the test suite is generated from high level specs.
To quote from the presentation:
I just ran a test run the other day. It did 177 million different tests, all of which have to pass. And that’s just one platform. That’s one in three test suites, on a single platform. We have to do that on multiple platforms: various different flavours of Unix, Windows, VxWorks, OpenBSD, Mac. And we do it on different processors: 64-bit SPARC, which is big-endian, 32-bit PowerPC, we do ARM, of course we do Intel.
The longest test package, the one with 177 million tests, takes about 8 hours on a modern workstation. We’ve another that runs for 16 (hours)…
It takes three, four, five days, we took four days because we didn’t run into any serious problems. It takes around four days to run a full release-and-test cycle, three guys working on it.
SQLite is an engine that lives with the application
… as opposed to Postgres and friends, in which the database engine lives together with the data on the other side of the network away from the application. Because of this, it leads to the next point, which is:
SQLite does not compete with traditional client/server databases
- Your data is stored remotely, or
- Your data is too large to store on a single machine, or
- You need support for concurrent writers
A client/server database engine should be used. Instead, SQLite competes with the usage of flat files (e.g. a bunch of JSON files to store local application data).
A nifty use case of SQLite is as a cache-aside cache of a traditional client/server database.
You can store blobs in SQLite
Adobe, when using SQLite for Lightroom thumbnails, did a benchmark comparing read and write speeds for blobs stored direcly on disk as compared to being stored in SQLite. For blobs smaller than 50kB, across various DB page sizes, an up to 2x increase in throughput was observed. The reason for this is that at smaller blob sizes, the overhead of
fopen starts to dominate (everything is in a single file in SQLite). The benchmark is published here.
Useful tools exist
sqldiff for diffing 2 SQLite databases.