Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cstore_fdw should have better support for joins #183

Open
mtuncer opened this issue Mar 29, 2018 · 1 comment
Open

cstore_fdw should have better support for joins #183

mtuncer opened this issue Mar 29, 2018 · 1 comment

Comments

@mtuncer
Copy link
Member

mtuncer commented Mar 29, 2018

When cstore_fdw table is involved in a join, postgres pulls all data from cstore_fdw table. It makes queries unnecessarily longer.

We are getting more requests to make this work. I do not know the involved work at the moment. We should definitely investigate this for future enhancement.

@tomvantilburg
Copy link

tomvantilburg commented Mar 29, 2018

Here's a quick setup to tell the story:

Create data

CREATE FOREIGN TABLE cstore_tmp (X integer,Y integer) SERVER cstore_server;
INSERT INTO cstore_tmp 
SELECT s1, s2
FROM generate_series(1,3000) s1, generate_series(1,3000) s2;

Query with 1:N join (same as no join)

EXPLAIN SELECT count(b.*)
FROM (
  SELECT 300 minx, 300 miny, 600 maxx, 600 maxy
) a
JOIN cstore_tmp b
ON b.x BETWEEN a.minx AND a.maxx
AND b.y BETWEEN a.miny AND a.maxy;

Result:

Aggregate  (cost=189072.56..189072.57 rows=1 width=8)
  ->  Foreign Scan on cstore_tmp b  (cost=0.00..189072.00 rows=225 width=32)
        Filter: ((x >= 300) AND (x <= 600) AND (y >= 300) AND (y <= 600))
        CStore File: /var/data/postgresql/10/main/cstore_fdw/16384/191674758
        CStore File Size: 74310480

Query with N:N join

SELECT count(b.*)
FROM (
  SELECT 300 minx, 300 miny, 500 maxx, 500 maxy
  UNION ALL
  SELECT 400 minx, 400 miny, 600 maxx, 600 maxy
) a
JOIN cstore_tmp b
ON b.x BETWEEN a.minx AND a.maxx
AND b.y BETWEEN a.miny AND a.maxy;

Result:

Aggregate  (cost=504627.60..504627.61 rows=1 width=8)
  ->  Nested Loop  (cost=0.00..504072.05 rows=222222 width=32)
        Join Filter: ((b.x >= "*SELECT* 1".minx) AND (b.x <= "*SELECT* 1".maxx) AND (b.y >= "*SELECT* 1".miny) AND (b.y <= "*SELECT* 1".maxy))
        ->  Foreign Scan on cstore_tmp b  (cost=0.00..99072.00 rows=9000000 width=40)
              CStore File: /var/data/postgresql/10/main/cstore_fdw/16384/191674758
              CStore File Size: 74310480
        ->  Materialize  (cost=0.00..0.05 rows=2 width=16)
              ->  Append  (cost=0.00..0.04 rows=2 width=16)
                    ->  Subquery Scan on "*SELECT* 1"  (cost=0.00..0.02 rows=1 width=16)
                          ->  Result  (cost=0.00..0.01 rows=1 width=16)
                    ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..0.02 rows=1 width=16)
                          ->  Result  (cost=0.00..0.01 rows=1 width=16)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants