Skip to content

Slow reading of events ordered as_of in PostgresΒ #1919

@volodymyr-mykhailyk

Description

@volodymyr-mykhailyk

Setup:

  • Rails Event Store
  • ActiveRecord
  • PostgreSQL 14x

When trying to read events ordered by valid time (as_of) - response is very slow on modest size of events table:

event_store.read.as_of.limit(50) # takes 10s to load
# Under the hood it executes following query
# SELECT event_store_events . * FROM event_store_events ORDER BY COALESCE ( event_store_events . valid_at, event_store_events . created_at ) DESC, event_store_events . id DESC LIMIT ? OFFSET ?

The cause is that postgresql is not using individual column indexes when coalesce function is used. Here is full explain analyze output

db=> EXPLAIN (ANALYSE) SELECT event_store_events.* FROM event_store_events ORDER BY COALESCE(event_store_events.valid_at, event_store_events.created_at) LIMIT 50;
                                                                        QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=125738.62..125744.46 rows=50 width=776) (actual time=6512.793..6516.016 rows=50 loops=1)
   ->  Gather Merge  (cost=125738.62..146787.93 rows=180410 width=776) (actual time=6512.792..6516.001 rows=50 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=124738.60..124964.11 rows=90205 width=776) (actual time=6506.868..6506.876 rows=50 loops=3)
               Sort Key: (COALESCE(valid_at, created_at))
               Sort Method: top-N heapsort  Memory: 125kB
               Worker 0:  Sort Method: top-N heapsort  Memory: 129kB
               Worker 1:  Sort Method: top-N heapsort  Memory: 129kB
               ->  Parallel Seq Scan on event_store_events  (cost=0.00..121742.05 rows=90205 width=776) (actual time=27.019..6479.436 rows=72164 loops=3)
 Planning Time: 0.150 ms
 Execution Time: 6516.055 ms
(12 rows)

However it is possible to mitigate the problem when you add COALESCE index: CREATE INDEX events_time_sorting ON event_store_events(COALESCE(valid_at, created_at));

Here is example on same db with the index

db=> CREATE INDEX events_time_sorting ON event_store_events(COALESCE(valid_at, created_at));
CREATE INDEX
db=>  EXPLAIN (ANALYSE) SELECT event_store_events.* FROM event_store_events ORDER BY COALESCE(event_store_events.valid_at, event_store_events.created_at) LIMIT 50;
                                                                        QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..107.22 rows=50 width=776) (actual time=0.032..0.067 rows=50 loops=1)
   ->  Index Scan using events_time_sorting on event_store_events  (cost=0.42..462415.82 rows=216493 width=776) (actual time=0.031..0.062 rows=50 loops=1)
 Planning Time: 0.190 ms
 Execution Time: 0.087 ms
(4 rows)

Have you considered adding this index for optimal performance? Is there anything I'm missing why such index should not be part of the gem?

Happy to prepare PR if you agree that this is a good change. Thanks

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions