Description
I brought up some information here, #703
but wanted to make a direct issue on this topic.
real world use of this library will be in a highly asynchronous context.
However, db.serialize is a synchronous operation:
https://github.com/mapbox/node-sqlite3/blob/master/src/database.cc#L294
As soon as the callback returns, it is reset back to previous value
issue example (pretend db.run returns a promise)
db.serialize(() => {
await db.run("..."); // serialize = true, call back returned here setting back to false
await db.run("..."); // serialize = false
});
Ok, so one might think I can work around this by simply not awaiting
await db.run(`CREATE TABLE IF NOT EXISTS test (id int primary key, test text)`);
db.serialize(() => {
for (let i = 0; i < 100000; i++) {
db.run("INSERT INTO test (id, test) VALUES (?, 'x')", [i]);
db.run("UPDATE test SET test = 'y' WHERE id = ?", [i]);
}
const result await db.get("SELECT count(test) FROM test WHERE test = 'y'");
});
Expectation: 100k returned. We set all values to y after insert.
Got: 60-70k or random values
The reason is, each of these queries are submitted to the uv_default_loop():
https://github.com/mapbox/node-sqlite3/blob/master/src/statement.cc#L432
https://github.com/mapbox/node-sqlite3/blob/master/src/statement.cc#L438
https://github.com/mapbox/node-sqlite3/blob/master/src/macros.h#L128
There are multiple threads in the uv work queue. Each of your queries are running over multiple threads out of order!
What's worse, is an update needs to lock the database for the write, so that each of these threads are now also fighting over the mutex, slowing down the uv queues too!
Solution
I'm not sure how to perfectly solve this. Primary idea is to run its own single thread for writes.
maybe each query needs to be done in some kind of Promise supported callback, where it defines which thread pool to use, a single thread write, multiple thread read.
Additionally maybe best to not use the default uv work queue at all, ensuring that file locks don't block the uv queues.
I ultimately went down investigating this when I noticed my http server started timing out during heavy insert periods.
I'm not sure it's fully possible to fix this issue at application level outside of my solution supplied on that linked issue, where every write operation needs to be behind an rwlock.writeLock and every query must be awaited on to ensure only a single query at a time.
This has a huge time cost overhead though.
Thoughts?