Sqlite3 Transaction

Submitted by Eus
on April 25, 2010 - 11:55pm

After some time into web application development during my undergraduate study, I had wondered about the effect of modifying the underlying DB tables when a query result set over the DB tables are being iterated by a PHP script making an HTML table. Well, I believe that the result set is contained in its own memory and is detached from the underlying DB tables so the iteration will not be affected by any change to the underlying DB. How about the use of a cursor or a similar mechanism in iterating the result set? Well, I believe that the cursor operates on the result set memory instead of on the underlying DB. Well, those assumptions can be wrong and the DB can choose to employ locking. If locking is employed, how does it affect the other writers or readers? I think the sure way to answer the posed questions is by conducting some tests to see the behavior of the DB when using its API. Below is the result of conducting some tests on Sqlite3 to answer the above questions when I was working on Service Publishing AP.

The result set of an Sqlite3 DB prepared statement query must be iterated by a mechanism like a DB cursor: sqlite3_step(). The experiments using this source code show the effect of modifying the underlying DB when iterating a result set using sqlite3_step(). Moreover, the use of sqlite3_exec(..., "begin", ...) can set up a non-automatic transaction that will lock the underlying DB.

By utilizing a manual BEGIN (i.e., sqlite3_exec(..., "begin", ...)), all subsequent modifications to the data before COMMIT (i.e., sqlite3_exec(..., "commit", ...)) won't be seen by the reader. But, no other writer can write while the previous transaction has not been committed by the first writer. Therefore, if one writer needs a long time to tinker with various modifications before eventually committing the data, it should copy the main table to a temporary table and work on the temporary one. In this way, a lot of writers can tinker with their own set of data without getting into each other's way. A use-case for this is the operation of service_list.c of service_publishing_ap that is backed by sqlite3. In service_list.c, there is no restriction as to how many process can load the service set (i.e., opening up a DB connection) and perform temporary write operations that will only be finalized if save_service_list is successful. The temporary table creation is not too overkill if the data set is not large, which is the case for service_list.c. Well, my original idea was to employ double linked list for service_list.c's data back-end rather than sqlite3 since there is no need for a search operation but only insert/update/delete operation that a double linked list handles well. But then, I thought I would reinvent the wheel, especially in the area of locking and data reliability. So, I use sqlite3.

I found out that while iterating a table with sqlite3_exec(..., "select * from table", callback_fn, ...), a manipulation on the same table like inserting additional records done in the callback function callback_fn will be seen by the iterator. For example, if before the iteration the number of records that will be iterated is 4, adding some records that have the same select criteria will actually return more than 4 records. While deleting 2 records out of 4 will make the iterator only return 2 records. IOW, sqlite3 does not create a virtual table that contains the original 4 records and then iterating the virtual table without being affected by any change to the original table (i.e., the result set is not detached from the underlying DB).

When a reader is iterating a table using a prepared statement (sqlite3_exec() is just a wrapper), no writer can commit its transaction (committing will return a DB locked error) while the reader has not called sqlite3_reset() or sqlite3_finalize() (ie., after executing sqlite3_step(), the reader holds the read lock denying any writer from writing). Having initializing a transaction by calling "begin", the reader also cannot call "rollback" to relinquish its read lock while sqlite3_step() is still returning SQLITE_ROW. The writer, however, can do read and write as usual as long as the writer does not do commit although any change that the writer makes won't be seen by the reader. Therefore, the behavior described in the previous paragraph is only possible when the callback function callback_fn is executed after the reader has released the lock. This is why the change made to the original table will affect subsequent read: since the reader has to relinquish its read lock, the reader has to reread the original table, and therefore, sees the change made to the original table (i.e., sqlite3_exec() only holds a read lock for the duration of doing sqlite3_step() instead of holding the read lock for the whole iteration session of sqlite3_exec() from the first row till the last).

Another interesting point is that when the database is locked, there is always the possibility of sqlite3_step(), and therefore, sqlite3_exec() returning SQLITE_BUSY. This must be taken into consideration when using them to craft another APIs like what I did in service_publishing_ap's service_list.

Another interesting finding that I discovered is that sqlite3_finalize() has to be invoked when any of the prepared-statement-related operations after the corresponding successful sqlite3_prepare_v2 fails(). It is because sqlite3_close() won't work if there is a prepared statement that hasn't been finalized.

Another interesting finding about the use of a trigger whose body performs the same operation that triggers the trigger at the first place is that it is a recursive call that will never end in case of "update" like:

create trigger if not exists service_list_mod_time_update_trigger before update on service_list for each row begin update service_list set id = new.id, position = new.position, mod_time = strftime ('%s', 'now'), cat_id = new.cat_id, uri = new.uri, desc = new.desc, long_desc = new.long_desc, is_enabled = new.is_enabled where id = old.id; end;

Doing the same with "insert" will stop because the second recursive insert will cause a violation error when hitting a duplicated primary key and by default the action is to abort.