Ticket #126 (closed enhancement: fixed)

Opened 3 years ago

Last modified 3 months ago

Writer contention leads to persistent database lock

Reported by: cboos@… Owned by: gh
Priority: high Milestone:
Component: implementation Version: pysqlite-2.0
Severity: critical Keywords: writer thread lock
Cc:

Description

This ticket describes the problem and a possible solution for the database lock problem PySQLite has when multiple threads attempt to write to the database at the same time.

This writer contention problem has already been described before:

  • by Karl Mierle in Argon's Locking page
  • on the sqlite mailing list
  • see also #124, which links to my original problem in Trac

If one wants to retain some degree of concurrency and not use EXCLUSIVE transactions, the solution would be that all writer threads (but one) perform a "ROLLBACK". By doing so, they would release their SHARED lock and give the remaining thread a chance to acquire the EXCLUSIVE lock.

First, I'll provide a sample test program (attachment:writer_contention.py) which tries to reproduce the problematic scenario:

  1. thread A and B both acquire the SHARED lock,
  2. then they both attempt to get a RESERVED lock
  3. one of them succeeds (A), the other waits (B)
  4. when A tries to promote its RESERVED lock to an EXCLUSIVE lock, it will first have to wait for B which still has its SHARED lock...
  5. both wait for each other: "deadlock" situation

The test program also illustrates the fact that with pysqlite-2.0.4, such a deadlock has the undesirable side-effect of leaving the database in a locked state. This prevents any further operation to succeed, even read attempts from other threads will fail.

Then, I propose a way to implement a workaround (attachment:safe_writer_contention.py). This involves writing an intermediate layer, wrapping the Connection and Cursor objects. In the case of a database lock, the exception is intercepted, and a "ROLLBACK" is performed. This can only work if all the existing cursors for that connection are first closed, hence the intermediate layer which keeps track of the active cursors using weakrefs. This could eventually be integrated in the binding itself, later on.

Though this Python-only solution works by itself, pysqlite can also be improved to better handle the situation:

  • A first patch, attachment:finalized_statements.patch makes the database recover safely after a writer contention problem, by avoiding to have leaked statements
  • A second patch, attachment:reduced_timeout.patch reduces the duration of the timeout used while waiting for a SHARED or RESERVED lock; the thread will only use the full duration of the timeout while waiting for an EXCLUSIVE lock; that way, the write contention can be concluded without having to wait for the full timeout period, only for a fraction of it; this also reduces the risk that the thread attempting to get the EXCLUSIVE lock will hit its timeout even if the other threads are giving up

The results of the test programs, with or without the patches, are summarized in the following table:

patch writer_contention.py safe_writer_contention.py
2.0.4 FAILED (errors=2) in 6.855s OK in 2.318s
2.0.4 + finalized_statements FAILED (failures=1) in 2.320s OK 2.317s
2.0.4 + reduced_timeout FAILED (errors=2) in 3.571s OK in 0.679s
2.0.4 + combined FAILED (failures=1) in 2.320s OK in 0.680s

One can see that:

  • with the first patch, the errors disappear; those errors were database lock exception when the main thread attempted to read or write to the database after the 2 writer threads were gone
  • with the second patch, one can see that the writer contention issue is solved much quicker (0.7 instead of 2.3 seconds, timeout was 2.0 seconds)

The attachment:combined.patch is the merge of the two other patches. All the patches apply on 2.0.4 or on source:trunk@913 after the cleanup in #125 has been applied.

Attachments

writer_contention.py (4.0 kB) - added by cboos 3 years ago.
Test program which deterministically reproduces the problem
safe_writer_contention.py (5.5 kB) - added by cboos 3 years ago.
Test program which works around the writer contention problem by issuing a ROLLBACK upon a database failure
finalized_statements.patch (1.0 kB) - added by cboos 3 years ago.
always call sqlite3_finalize on the statements, even after a timeout condition
reduced_timeout.patch (3.8 kB) - added by cboos 3 years ago.
only wait for the full duration of the timeout when trying to get an EXCLUSIVE lock, and a fifth of that duration in the other cases
combined.patch (4.3 kB) - added by cboos 3 years ago.
the two other patches combined
interrupt_on_writer_contention-r180.patch (7.8 kB) - added by cboos@… 3 years ago.
Patch on 2.1.0 (r180) implementing immediate resolution of the "writer contention" situation

Change History

Changed 3 years ago by cboos

Test program which deterministically reproduces the problem

Changed 3 years ago by cboos

Test program which works around the writer contention problem by issuing a ROLLBACK upon a database failure

Changed 3 years ago by cboos

always call sqlite3_finalize on the statements, even after a timeout condition

Changed 3 years ago by cboos

only wait for the full duration of the timeout when trying to get an EXCLUSIVE lock, and a fifth of that duration in the other cases

Changed 3 years ago by cboos

the two other patches combined

Changed 3 years ago by gh

  • status changed from new to assigned
  • type changed from defect to enhancement

Changed 3 years ago by cboos

I just realized that I conducted all my experiments with sqlite-3.2.7 configured with --enable-threadsafe (on Linux).

If one uses a sqlite binary which has not been compiled with this flag, then my workaround approach (attachment:safe_writer_contention.py) won't work.

Also, I'm not sure if this is an enhancement request, as the bug fixed by the first patch seems to be a quite critical one to me...

Ok, the second patch is an enhancement request :)

Changed 3 years ago by cboos@…

Some results on Windows (with sqlite-3.2.7 and ActivePython?-2.4.1)

patch writer_contention.py safe_writer_contention.py
2.0.4 FAILED (errors=2) in 7.371s OK in 3.115s
2.0.4 + finalized_statements FAILED (failures=1) in 3.054s OK 2.854s
2.0.4 + reduced_timeout FAILED (errors=2) in 4.116s OK in 1.272s
2.0.4 + combined FAILED (failures=1) in 2.894s OK in 1.251s

Pretty consistent with the Linux results.

Changed 3 years ago by cboos@…

Now, I'd like to point out that there's another way to workaround the writer contention problem described above: using the IMMEDIATE isolation level also solves this. (*)

If attachment:writer_contention.py is modified that way:

@@ -9,7 +9,8 @@
     return "test_trans"

 def get_conn():
-    return (sqlite.connect(get_db_path(), timeout=2.0))
+    return (sqlite.connect(get_db_path(), timeout=2.0,
+                           isolation_level="IMMEDIATE"))

then the program succeeds, provided:

  • the finalize_statements.patch has been applied
  • sqlite-3.2.7 is thread-safe (i.e. sqlite3 on Windows or configured with --enable-threadsafe on Linux)
patch writer_contention.py/isolation_level="IMMEDIATE"
2.0.4 FAILED (errors=2) in 6.856s
2.0.4 + finalized_statements OK 2.320s
2.0.4 + reduced_timeout FAILED (errors=2) in 3.579s
2.0.4 + combined OK in 0.682s

(*) it's not completely clear for me why it succeeds, shouldn't the SHARED lock of thread B prevent thread A to get the EXCLUSIVE lock also in this case?

Changed 3 years ago by cboos@…

I tried r163 on Windows and Linux, and the results are consistent with the above ones corresponding to "2.0.4 + finalized_statements".

Changed 3 years ago by cboos@…

Patch on 2.1.0 (r180) implementing immediate resolution of the "writer contention" situation

Changed 3 years ago by cboos@…

I re-evaluated the reduced_timeout approach, and concluded that the idea could be taken further: if we are in a "writer contention" situation, the only wait out will be that one of the thread (preferably not the one waiting for the EXCLUSIVE lock) gets a Database Locked exception. Instead of relying on the time-out for that, we directly interrupt the other thread(s) (the one not waiting for the EXCLUSIVE lock), to save some time.

All unit tests still pass, timing not changed for them.

While testing patch writer_contention.py/isolation_level="IMMEDIATE":

patch timeout=2s timeout=10s
2.1.0 OK 3.28s OK 10.93s
2.1.0 + interrupt_on_writer_contention OK 0.73s OK 0.71s

First row is with stock pysqlite at r180, second with attachment:interrupt_on_writer_contention-r180.patch applied.

Platform is Windows, SQLite 3.2.7.

Changed 3 years ago by gh

First, thanks for the new patch. I'm slowly reviewing and testing your proposed changes, and discovering other problems that need fixing while doing so.

I believe you're fooling yourself with calling sqlite3_interrupt. If I read the docs correctly, it will only help interrupting any statements on the connection it's called on, right? I believe it does not help at all for other connection objects that open the same database file.

Btw. I now have something locally that resets all statements in the connection cache such that the rollback will succeed. There will be no need for your intermediate layer with weakrefs etc. this way.

Eventually and with care, I hope we can have a pysqlite release that behaves quite a bit nicer in a concurrent setup.

Changed 3 years ago by guest

I believe you're fooling yourself with calling sqlite3_interrupt.

No, I don't think so. I use the same code path as the one used when the timeout is reached. So if it works in the timeout case, it should work for the shortcut case too.

If I read the docs correctly, it will only help interrupting any statements on the connection it's called on, right?

Yes.

I believe it does not help at all for other connection objects that open the same database file.

Right again. Maybe I should have documented the code a little bit better: in the following change (function _sqlite_step_with_busyhandler)

  • src/util.c

     
    7179            break; 
    7280        } 
    7381 
    74         if (pysqlite_time() - connection->timeout_started > connection->timeout) { 
     82        /* Interrupt on "writer contention" or "timeout reached" conditions */ 
     83        if (waiting_for_exclusive_lock_flag && !manage_exclusive_flag || 
     84            pysqlite_time() - connection->timeout_started > timeout) { 
     85            sqlite3_interrupt(connection->db); 
    7586            break; 
    7687        } 

the intent is to shortcut the waiting period for a thread that would otherwise wait in vain for a SHARED/RESERVED lock which it will/should never get because another thread has already requested an EXCLUSIVE lock. That other thread is the only one for which manage_exclusive_flag is true.

i.e. if we have the follow concurrent write scenario:

  1. thread A => SELECT (gets the SHARED lock)
  2. thread B => SELECT (gets the SHARED lock)
  3. thread C => INSERT (gets the SHARED, then the RESERVED lock) => (waits for the EXCLUSIVE lock)
  4. thread B => INSERT (waits for the RESERVED lock)

Then, after thread A releases the SHARED lock, the outcome of this situation is currently:

  1. thread B => (go on waiting for RESERVED lock)
  2. thread C => (go on waiting for EXCLUSIVE lock)
  3. ... time passes ...
  4. (either B or C has waited more than the timeout, let's say it's B)
  5. thread B reaches the timeout, executes sqlite3_interrupt to cancel its current query and aborts with a Database is locked OperationalException
  6. thread C can proceed

With my proposed change, this would be instead:

  1. thread B => detects that some other thread is waiting for the exclusive lock (because C set the waiting_for_exclusive_lock_flag), decides to give up immediately instead of waiting for the timeout (and it does the same sqlite3_interrupt and raise a Database is locked OperationalException, as above)
  2. thread C can proceed

The behavior is very similar (one thread succeeds, the other fails) except that we completely bypass the timeout period and that both threads have a very reactive behavior.

Ok, at least that was the intent, and from my tests it seems that it's indeed behaving like that.

Btw. I now have something locally that resets all statements in the connection cache such that the rollback will succeed. There will be no need for your intermediate layer with weakrefs etc. this way.

Great news! I hope I can test that soon.

Changed 3 years ago by gh

I will still have to think this through more thoroghly - so far it's only enough for nitpicking:

the new flag of yours is an interpreter-global variable, and it's not neccessary that if two connections need to wait for a lock to go away, that it's two connections to the same database. It's, however, likely in many use cases of pysqlite.

There's no simple to have your cake and eat it, too, or is there?

Changed 3 years ago by cboos@…

Yes, the assumption was that all connections would be on the same database. If the approach appears to be otherwise correct, I'll refine the patch to get rid of this assumption.

Changed 3 years ago by gh

For a test, I will try to replace pysqlite's busy handling with just configuring SQLite's builtin busy handler using sqlite3_busy_timeout(). It should detect the deadlock situation, according to the docs.

Changed 3 years ago by gh

You might want to test the SVN version. I've replaced my and your busy handling with SQLite's busy timeout, which seems to be just as good as your solution, with the advantage of detecting possible deadlock situations more reliably. This commit also features a few other goodies ;-)

Changed 3 years ago by gh

Christian, did you have any chance to test if the SVN version behaves nicer for you?

FWIW, I did implement all your proposed changes and when I was almost ready to check them in, the idea of this much simpler approach occured to me.

Changed 3 years ago by gh

  • severity changed from Patches to critical

Changed 3 years ago by guest

Oops, missed the previous update, I'll try out the svn version now.

Changed 3 years ago by cboos@…

I did some basic tests on Windows: wow, it seems to work really well!

When testing the program writer_contention.py/isolation_level="IMMEDIATE" I had a total duration of about 0.7s with my solution, as noted above.

That number would certainly have increased a little bit with the more complete and complex solution which would have involved tracking the state of each database.

The above program with your solution (r185) has a duration of only 0.6s, removes the complexity and is certainly much more robust.

... so I think you ate the cake and still have it :)

Changed 3 years ago by gh

  • status changed from assigned to closed
  • resolution set to fixed

Great! So I guess I can close this one now :)

Changed 15 months ago by christian.boos

See also #184.

Note: See TracTickets for help on using tickets.