
perf: concurrency issues #1299

Open

Description

@aikar

I brought up some information in #703, but wanted to make a direct issue on this topic.

Real-world use of this library will be in a highly asynchronous context.

However, db.serialize is a synchronous operation:
https://github.com/mapbox/node-sqlite3/blob/master/src/database.cc#L294

As soon as the callback returns, the serialize flag is reset back to its previous value.
Issue example (pretend db.run returns a promise):

db.serialize(async () => {
  await db.run("..."); // serialize = true, but the callback returned at this await, resetting it to false
  await db.run("..."); // serialize = false
});
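
For reference, the snippets in this issue pretend db.run / db.get return promises; the stock API is callback-based, so assume a small promisification along these lines (runAsync / getAsync are made-up names, not part of the library):

const sqlite3 = require('sqlite3');
const { promisify } = require('util');

const db = new sqlite3.Database(':memory:');

// Hypothetical promise-returning helpers assumed by the examples here.
// db.run / db.get take (sql, params, callback), so util.promisify applies directly.
db.runAsync = promisify(db.run.bind(db));
db.getAsync = promisify(db.get.bind(db));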

OK, so one might think I can work around this by simply not awaiting:

await db.run(`CREATE TABLE IF NOT EXISTS test (id int primary key, test text)`);
db.serialize(async () => {
    for (let i = 0; i < 100000; i++) {
        db.run("INSERT INTO test (id, test) VALUES (?, 'x')", [i]);
        db.run("UPDATE test SET test = 'y' WHERE id = ?", [i]);
    }
    const result = await db.get("SELECT count(test) FROM test WHERE test = 'y'");
});

Expected: 100k returned, since we set every value to 'y' after its insert.
Got: 60-70k, or some other random value.
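
For comparison, awaiting every single statement (using the hypothetical runAsync / getAsync helpers above) does produce the expected count, because nothing can interleave; this is the fully serialized, application-level version:

// Minimal sketch, assuming the hypothetical runAsync / getAsync helpers above.
// Awaiting every statement forces strictly sequential execution, so the UPDATE
// for a given id can never run before its INSERT.
async function insertThenUpdate(db) {
  for (let i = 0; i < 100000; i++) {
    await db.runAsync("INSERT INTO test (id, test) VALUES (?, 'x')", [i]);
    await db.runAsync("UPDATE test SET test = 'y' WHERE id = ?", [i]);
  }
  const row = await db.getAsync("SELECT count(test) AS n FROM test WHERE test = 'y'");
  return row.n; // expected: 100000, at the cost of zero parallelism
}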

The reason is that each of these queries is submitted to the uv_default_loop():
https://github.com/mapbox/node-sqlite3/blob/master/src/statement.cc#L432
https://github.com/mapbox/node-sqlite3/blob/master/src/statement.cc#L438
https://github.com/mapbox/node-sqlite3/blob/master/src/macros.h#L128

There are multiple threads in the uv work queue, so your queries are running across multiple threads, out of order!

What's worse, an update needs to lock the database for the write, so these threads are now also fighting over the mutex, slowing down the uv queue too!

Solution

I'm not sure how to solve this perfectly. The primary idea is for the library to run its own single thread for writes.
Maybe each query needs to go through some kind of promise-supporting API that defines which thread pool to use: a single-threaded pool for writes and a multi-threaded pool for reads.
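
Roughly the kind of API shape that suggests (purely hypothetical; nothing like this exists in node-sqlite3 today):

// Hypothetical API sketch only; the "queue" option is made up for illustration.
// Writes would go to a dedicated single-threaded queue, reads to a multi-threaded one.
await db.run("UPDATE test SET test = 'y' WHERE id = ?", [i], { queue: 'write' });
const row = await db.get("SELECT * FROM test WHERE id = ?", [i], { queue: 'read' });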

Additionally, it may be best not to use the default uv work queue at all, ensuring that file locks don't block the uv queue.

I ultimately started investigating this when I noticed my HTTP server timing out during heavy insert periods.

I'm not sure this can be fully fixed at the application level, outside of the solution I supplied on the linked issue: every write operation goes behind an rwlock.writeLock, and every query must be awaited so that only a single query runs at a time.
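
As a rough illustration of that application-level approach (not the exact code from the linked issue), a promise-chain write lock looks something like this: every write is funneled through one chain and awaited, while reads can still go straight to the pool.

// Rough sketch of an application-level write lock; names are illustrative.
class WriteQueue {
  constructor() {
    this.tail = Promise.resolve();
  }

  // Enqueue a write task; the returned promise settles when that write finishes.
  enqueue(task) {
    const result = this.tail.then(task);
    // Keep the chain alive even if a task rejects.
    this.tail = result.catch(() => {});
    return result;
  }
}

const writes = new WriteQueue();

// Every write goes through the queue and is awaited, so only one write statement
// is ever in flight and nothing fights over SQLite's write lock.
async function insertRow(db, i) {
  await writes.enqueue(() =>
    db.runAsync("INSERT INTO test (id, test) VALUES (?, 'x')", [i])
  );
}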

This has a huge time overhead, though.

Thoughts?
