Commit 88699ef

ENG-270, ENG-328 Merge pull request #165 from DiscourseGraphs/feature/supabase
ENG-270, ENG-328 Creation of the supabase embedding schema, integration with turbo and CI/CD
2 parents e40dc6c + c71198f commit 88699ef

33 files changed: 3449 additions, 1 deletion
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+name: Supabase deploy Function
+on:
+  workflow_dispatch:
+  push:
+    branches:
+      - main
+jobs:
+  deploy:
+    runs-on: ubuntu-latest
+    env:
+      SUPABASE_ACCESS_TOKEN: ${{ secrets.SUPABASE_ACCESS_TOKEN }}
+      SUPABASE_PROJECT_ID: ${{ secrets.SUPABASE_PROJECT_ID_PROD }}
+      SUPABASE_DB_PASSWORD: ${{ secrets.SUPABASE_DB_PASSWORD_PROD }}
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-node@v3
+        with:
+          node-version: "20"
+      - run: npm ci
+      - uses: supabase/setup-cli@v1
+        with:
+          version: latest
+      - run: npx turbo deploy -F @repo/database

packages/database/.sqruff

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+[sqruff]
+dialect = postgres
+exclude_rules = CP05,LT05
+
+[sqruff:indentation]
+indent_unit = space
+tab_space_size = 4
+indented_joins = True

packages/database/README.md

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+This contains the database schema for vector embeddings and concepts.
+All CLI commands below should be run in this directory (`packages/database`).
+
+1. Setup
+   1. Install [Docker](https://www.docker.com)
+   2. Install the [supabase CLI](https://supabase.com/docs/guides/local-development). (There is a brew version.)
+   3. `supabase login` with your (account-specific) supabase access token. (TODO: Create a group access token.)
+   4. `supabase link`. It will ask you for a project name; use `discourse-graphs`. (Production for now.) It will also ask you for the database password. (See 1password.)
+   5. Install [sqruff](https://github.com/quarylabs/sqruff)
+2. Usage:
+   1. Use `turbo dev` (alias for `supabase start`) before you use your local database. URLs will be given for your local supabase database, API endpoint, etc.
+   2. You may need to `supabase db pull` if changes are deployed while you work.
+   3. End your work session with `supabase stop` to free Docker resources.
+3. Development: We follow the supabase [Declarative Database Schema](https://supabase.com/docs/guides/local-development/declarative-database-schemas) process.
+   1. We assume you are working on a feature branch.
+   2. Make changes to the schema by editing files in `packages/database/supabase/schemas`
+   3. If you created a new schema file, make sure to add it to `[db.migrations] schema_paths` in `packages/database/supabase/config.toml`. Schema files are applied in that order, so you may need to be strategic in placing your file.
+   4. `turbo build`, which will do the following:
+      1. Check your logic with `sqruff lint supabase/schemas`, and if needed `sqruff fix supabase/schemas`
+      2. Regenerate the types file with `supabase gen types typescript --local > types.gen.ts`
+      3. See if there would be a migration to apply with `supabase db diff`
+   5. If applying the new schema fails, repeat step 4
+   6. If you are satisfied with the migration, create a migration file with `npm run dbdiff:save some_meaningful_migration_name`
+      1. If all goes well, there should be a new file named `supabase/migrations/2..._some_meaningful_migration_name.sql` which you should `git add`.
+   7. You can start using your changes again with `turbo dev`
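Development step 3 in the README notes that schema files are applied in the order given by `schema_paths`, so a file must come after anything it depends on. The sketch below shows one way to check such an ordering before running `turbo build`. The file names, the dependency map, and the function `firstOrderingViolation` are hypothetical and are not taken from this commit.

```typescript
// Hypothetical illustration: the file names and dependencies below are invented,
// not read from packages/database/supabase/config.toml.
type Dependencies = Record<string, string[]>;

/**
 * Returns the first [file, dependency] pair where the dependency is missing
 * from the ordered list or appears after the file that needs it.
 * Returns null when every dependency precedes its dependent.
 */
function firstOrderingViolation(
  schemaPaths: string[],
  dependsOn: Dependencies,
): [string, string] | null {
  for (let index = 0; index < schemaPaths.length; index++) {
    const path = schemaPaths[index];
    for (const dependency of dependsOn[path] ?? []) {
      const dependencyIndex = schemaPaths.indexOf(dependency);
      if (dependencyIndex === -1 || dependencyIndex > index) {
        return [path, dependency];
      }
    }
  }
  return null;
}

const dependsOn: Dependencies = {
  "schemas/content.sql": ["schemas/space.sql"],
  "schemas/space.sql": ["schemas/base.sql"],
};

const goodOrder = ["schemas/base.sql", "schemas/space.sql", "schemas/content.sql"];
const badOrder = ["schemas/base.sql", "schemas/content.sql", "schemas/space.sql"];

console.log(firstOrderingViolation(goodOrder, dependsOn)); // null
console.log(firstOrderingViolation(badOrder, dependsOn)); // [ 'schemas/content.sql', 'schemas/space.sql' ]
```

The real dependency information lives in the SQL itself, so a check like this would need a hand-maintained map or a parser. It is meant only to make the ordering constraint concrete.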

packages/database/example.md

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
+# example...
+
+Content:
+
+* (nt1pgid) discourse-graphs/nodes/Claim
+* (nt2pgid) discourse-graphs/nodes/Hypothesis
+* (dgpgid) roam/js/discourse-graph
+* (et1bkid) Opposes
+* (et1r1bkid) source
+* (et1r2bkid) destination
+* (anyid1) If
+* (et1sr1bkid) Page
+* (et1sr2bkid) Block
+* (et1sr3bkid) ParentPage
+* (et1sr4bkid) PBlock
+* (et1sr5bkid) SPage
+* (et1sr6bkid) SBlock
+* (hyp1pgid) [HYP] Some hypothesis
+* (clm1pgid) [CLM] Some claim
+* (somepgid) Some page
+* (hyp1refbkid) a block referring to [[HYP] Some hypothesis]
+* (opp1bkid) OpposedBy
+* (clm1refbkid) a block referring to [[CLM] Some claim]
+
+Documents:
+
+| id | source_local_id |
+|----|-----------------|
+| 1  | nt1pgid         |
+| 2  | nt2pgid         |
+| 3  | dgpgid          |
+| 22 | hyp1pgid        |
+| 23 | clm1pgid        |
+| 4  | somepgid        |
+
+Content:
+
+| id | source_local_id | page_id | scale    | represents_id | text                                         |
+|----|-----------------|---------|----------|---------------|----------------------------------------------|
+| 5  | nt1pgid         | 1       | document | 16            | discourse-graphs/nodes/Claim                 |
+| 6  | nt2pgid         | 2       | document | 17            | discourse-graphs/nodes/Hypothesis            |
+| 7  | et1bkid         | 3       | document | 18            | discourse-graphs/edges/OpposedBy             |
+| 8  | somepgid        | 4       | document |               | Some page                                    |
+| 24 | hyp1pgid        | 22      | document | 20            | [HYP] Some hypothesis                        |
+| 25 | clm1pgid        | 23      | document | 19            | [CLM] Some claim                             |
+| 9  | hyp1refbkid     | 4       | block    |               | a block referring to [[HYP] Some hypothesis] |
+| 10 | opp1bkid        | 4       | block    | 21            | OpposedBy                                    |
+| 11 | clm1refbkid     | 4       | block    |               | a block referring to [[CLM] Some claim]      |
+| 13 | et1r1bkid       | 3       | block    |               | source                                       |
+| 14 | et1r2bkid       | 3       | block    |               | destination                                  |
+
+Concept:
+
+| id | is_schema | arity | schema | name                  | content |
+|----|-----------|-------|--------|-----------------------|---------|
+| 16 | true      | 0     |        | Claim                 | {} |
+| 17 | true      | 0     |        | Hypothesis            | {} |
+| 18 | true      | 2     |        | Opposed-by            | { "roles": ["source", "destination"], "representation": ["source", "sourceref", "destination", "destinationref", "predicate"] } |
+| 19 | false     | 0     | 16     | [CLM] Some claim      | {} |
+| 20 | false     | 0     | 17     | [HYP] Some hypothesis | {} |
+| 21 | false     | 2     | 18     | OpposedBy             | { "concepts": {"source": 19, "destination": 20}, "occurences": [{"sourceref": 11, "destinationref": 9, "source": 25, "destination": 24, "predicate": 10 }] } |
+
+Note: It is an open question whether the occurrence structure matters, and whether it should be materialized in another table.
+(I would tend to say yes to both.)
+
+ContentLink:
+
+| source | destination |
+|--------|-------------|
+| 9      | 24          |
+| 11     | 25          |
+
+Note: I would probably create a sub-Content for the link text and use this as source,
+or use a char_start, char_end.
+
+Missing: Ontology
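To make the Concept table in `example.md` easier to follow, here is a minimal sketch that resolves the relation instance with id 21 against its schema (id 18) and the concepts bound to each role. The row data is copied from the table, keeping only the `content` keys the sketch needs. The TypeScript shapes and the function names `findConcept` and `describeRelation` are assumptions for illustration, not part of the commit.

```typescript
// Rows copied from the Concept table in example.md. Only the `content` keys
// used here are kept; the TypeScript shapes and function names are assumptions.
interface ConceptRow {
  id: number;
  isSchema: boolean;
  arity: number;
  schema: number | null;
  name: string;
  content: { roles?: string[]; concepts?: Record<string, number> };
}

const concepts: ConceptRow[] = [
  { id: 16, isSchema: true, arity: 0, schema: null, name: "Claim", content: {} },
  { id: 17, isSchema: true, arity: 0, schema: null, name: "Hypothesis", content: {} },
  { id: 18, isSchema: true, arity: 2, schema: null, name: "Opposed-by", content: { roles: ["source", "destination"] } },
  { id: 19, isSchema: false, arity: 0, schema: 16, name: "[CLM] Some claim", content: {} },
  { id: 20, isSchema: false, arity: 0, schema: 17, name: "[HYP] Some hypothesis", content: {} },
  { id: 21, isSchema: false, arity: 2, schema: 18, name: "OpposedBy", content: { concepts: { source: 19, destination: 20 } } },
];

// Looks up a row by id and fails loudly when it is missing.
function findConcept(rows: ConceptRow[], id: number): ConceptRow {
  const row = rows.find((candidate) => candidate.id === id);
  if (row === undefined) throw new Error(`No concept with id ${id}`);
  return row;
}

// Renders a relation instance as "SchemaName(role=ConceptName, ...)".
function describeRelation(rows: ConceptRow[], relationId: number): string {
  const relation = findConcept(rows, relationId);
  if (relation.isSchema || relation.schema === null) {
    throw new Error(`Concept ${relationId} is a schema, not an instance`);
  }
  const schema = findConcept(rows, relation.schema);
  const roles = schema.content.roles ?? [];
  if (roles.length !== schema.arity) {
    throw new Error(`Schema ${schema.id} has arity ${schema.arity} but ${roles.length} roles`);
  }
  const bindings = relation.content.concepts ?? {};
  const parts = roles.map((role) => {
    const boundId = bindings[role];
    if (boundId === undefined) throw new Error(`Role ${role} is not bound`);
    return `${role}=${findConcept(rows, boundId).name}`;
  });
  return `${schema.name}(${parts.join(", ")})`;
}

console.log(describeRelation(concepts, 21));
// Opposed-by(source=[CLM] Some claim, destination=[HYP] Some hypothesis)
```

The occurrence entries (which Content rows express the relation) are left out on purpose, since the note above treats their structure as an open question.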

packages/database/package.json

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+{
+  "name": "@repo/database",
+  "version": "0.0.0",
+  "private": true,
+  "license": "Apache-2.0",
+  "type": "module",
+  "exports": {
+    "./types.gen.ts": "./types.gen.ts"
+  },
+  "scripts": {
+    "init": "supabase login",
+    "dev": "supabase start",
+    "stop": "supabase stop",
+    "build": "npm run lint && npm run gentypes:local && cp ./types.gen.ts ../../apps/website/app/utils/supabase && npm run dbdiff",
+    "lint": "tsx scripts/lint.ts",
+    "lint:fix": "tsx scripts/lint.ts -f",
+    "gentypes:local": "supabase start && supabase gen types typescript --local --schema public > types.gen.ts",
+    "gentypes:production": "supabase start && supabase gen types typescript --project-id \"$SUPABASE_PROJECT_ID\" --schema public > types.gen.ts",
+    "dbdiff": "supabase stop && supabase db diff",
+    "dbdiff:save": "supabase stop && supabase db diff -f",
+    "deploy": "tsx scripts/deploy.ts",
+    "deploy:functions": "tsx scripts/lint.ts -f"
+  },
+  "devDependencies": {
+    "supabase": "^2.22.12",
+    "tsx": "^4.19.2"
+  },
+  "dependencies": {}
+}

packages/database/schema.puml

Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
+@startuml
+skinparam nodesep 10
+hide circle
+hide empty members
+class "SpaceAccess" [[{An access control entry for a space}]] {
+    {field} editor : boolean
+}
+class "Account" [[{A user account on a platform}]] {
+    {field} id : integer
+    {field} write_permission : boolean
+    {field} active : boolean
+}
+class "Space" [[{A space on a platform representing a community engaged in a conversation}]] {
+    {field} id : integer
+    {field} url : string
+    {field} name : string
+}
+"SpaceAccess" --> "1" "Account" : "account"
+"SpaceAccess" --> "0..1" "Space" : "space"
+class "Platform" [[{A data platform where discourse happens}]] {
+    {field} id : integer
+    {field} name : string
+    {field} url : string
+}
+class "Content" [[{A unit of content}]] {
+    {field} id : integer
+    {field} source_local_id : string
+    {field} created : datetime
+    {field} text : string
+    {field} metadata : JSON
+    {field} scale : Scale
+    {field} last_modified : datetime
+}
+class "Document" [[{None}]] {
+    {field} id : integer
+    {field} source_local_id : string
+    {field} url : string
+    {field} created : datetime
+    {field} metadata : JSON
+    {field} last_modified : datetime
+    {field} contents : blob
+}
+class "Concept" [[{An abstract concept, claim or relation}]] {
+    {field} id : integer
+    {field} epistemic_status : EpistemicStatus
+    {field} name : string
+    {field} description : string
+    {field} created : datetime
+    {field} last_modified : datetime
+    {field} arity : integer
+    {field} content : JSON
+    {field} is_schema : boolean
+}
+"Space" --> "1" "Platform" : "platform"
+"Content" --> "0..1" "Space" : "space"
+"Document" --> "0..1" "Space" : "space"
+"Concept" --> "0..1" "Space" : "space"
+"Account" --> "1" "Platform" : "platform"
+abstract "Agent" [[{An agent that acts in the system}]] {
+    {field} id : integer
+    {field} type : EntityType
+}
+"Document" --> "0..*" "Agent" : "contributors"
+"Document" --> "1" "Agent" : "author"
+"Content" --> "1" "Document" : "document"
+class "ContentEmbedding" [[{None}]] {
+    {field} model : EmbeddingName
+    {field} vector : vector
+    {field} obsolete : boolean
+}
+"ContentEmbedding" --> "1" "Content" : "target"
+"Content" --> "0..1" "Content" : "part_of"
+"Content" --> "0..*" "Agent" : "contributors"
+"Content" --> "1" "Agent" : "creator"
+"Content" --> "1" "Agent" : "author"
+"Concept" --> "0..1" "Content" : "represented_by"
+class "ConceptSchema" [[{None}]] {
+    {field} id(i) : integer
+    {field} epistemic_status(i) : EpistemicStatus
+    {field} name(i) : string
+    {field} description(i) : string
+    {field} created(i) : datetime
+    {field} last_modified(i) : datetime
+    {field} arity(i) : integer
+    {field} content(i) : JSON
+    {field} is_schema(i) : boolean
+}
+"Concept" --> "1" "ConceptSchema" : "schema"
+"Concept" --> "0..*" "Agent" : "contributors"
+"Concept" --> "1" "Agent" : "author"
+"Concept" ^-- "ConceptSchema"
+class "Person" [[{A person using the system}]] {
+    {field} name : string
+    {field} orcid : string
+    {field} email : string
+    {field} id(i) : integer
+    {field} type(i) : EntityType
+}
+class "AutomatedAgent" [[{An automated agent}]] {
+    {field} metadata : JSON
+    {field} name : string
+    {field} deterministic : boolean
+    {field} version : string
+    {field} id(i) : integer
+    {field} type(i) : EntityType
+}
+"Account" --> "1" "Agent" : "person"
+"Agent" ^-- "Person"
+"Agent" ^-- "AutomatedAgent"
+@enduml
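The `ContentEmbedding` class in `schema.puml` carries a `model`, a `vector`, and an `obsolete` flag per target `Content`. As a rough sketch of how those three fields could interact in a lookup, the code below skips obsolete embeddings, restricts to one model, and ranks the remaining targets by cosine similarity. The model name, the vectors, the function names, and the choice of cosine similarity are all assumptions. In practice this ranking would more likely run inside the database than in application code.

```typescript
// Field names follow the ContentEmbedding class in schema.puml. The model name,
// the vectors, and the use of cosine similarity are assumptions for illustration.
interface ContentEmbedding {
  target: number; // id of the Content row this embedding describes
  model: string;
  vector: number[];
  obsolete: boolean;
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Returns target Content ids for one model, most similar first,
// ignoring embeddings flagged as obsolete.
function rankByCosine(
  embeddings: ContentEmbedding[],
  model: string,
  query: number[],
): number[] {
  return embeddings
    .filter((embedding) => embedding.model === model && !embedding.obsolete)
    .map((embedding) => ({
      target: embedding.target,
      score: cosineSimilarity(embedding.vector, query),
    }))
    .sort((left, right) => right.score - left.score)
    .map((scored) => scored.target);
}

const embeddings: ContentEmbedding[] = [
  { target: 24, model: "example-model", vector: [1, 0], obsolete: false },
  { target: 25, model: "example-model", vector: [0, 1], obsolete: false },
  { target: 8, model: "example-model", vector: [1, 0], obsolete: true },
];

console.log(rankByCosine(embeddings, "example-model", [0.9, 0.1])); // [ 24, 25 ]
```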

packages/database/schema.svg

Lines changed: 1 addition & 0 deletions