-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathfk.html
152 lines (121 loc) · 6.57 KB
/
fk.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
<!DOCTYPE HTML>
<html lang="en">
<head>
<script async src="https://www.googletagmanager.com/gtag/js?id=G-CD4ENCFV58"></script>
<!-- Google tag (gtag.js) -->
<script async src="https://www.googletagmanager.com/gtag/js?id=G-64DRFX06T1"></script>
<script >
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'G-64DRFX06T1');
</script>
<title>CQL</title>
<link rel="shortcut icon" href="../favicon.ico" >
<link rel="StyleSheet" href="css/nstyle.css" type="text/css" media="all" >
<meta charset="utf-8">
<meta name="keywords" content="CQL,SQL,Data Integration, Data Migration, Category Theory, ETL" >
<meta name="description" content="Conexus CQL" >
<meta name="keywords" content="CQL, functorial, category theory, data integration, data migration, categorical databases, SQL, categorical query language" >
</head>
<body>
<div id="content">
<h1>Categorical Databases<img src="logo.png" height="32" style="float: right;" alt="logo" ></h1>
<a href="https://categoricaldata.net">Home</a> |
<a href="download.html">Download</a> |
<a href="examples.html">Getting Started</a> |
<a href="help/index.html" target="_blank">Manual</a> |
<a href="https://github.com/CategoricalData/CQL/wiki" target="_blank">Wiki</a> |
<a href="papers.html">Papers</a> |
<a href="screens.html">Screen Shots</a> |
<a href="https://github.com/categoricalData" target="_blank">Github</a> |
<a href="https://groups.google.com/forum/#!forum/categoricaldata" target="_blank">Google Group</a> |
<a href="https://conexus.com" target="_blank">Conexus</a> |
<a href="mailto:[email protected]">Contact</a>
<br><br>
<hr>
<h2>Compile-time Foreign Key Checking</h2>
<p>Many data integration tasks require creating multiple tables connected by foreign keys. Typically this process requires disabling foreign key constraints, loading data, and then enabling foreign key constraints, which triggers an expensive runtime check that the foreign key constraints hold and defers detection of foreign key related bugs until runtime. In CQL, queries that target schemas containing foreign keys are guaranteed, at compile time, and without accessing any source data, to always materialize target instances with correctly populated foreign keys. As a result, foreign key related errors are detected much earlier in the ETL process and without costly runtime-checking.
</p>
<p>This example (built in to the IDE with name Foreign Keys) defines a source schema about departments, professors, and students, and a query to find all possible same-department professor-student advising matches. The target schema relates matches, professors, students, and departments through four foreign keys and an equation enforcing that in every match, the department the professor works in is the same as the department the student is majoring in. A query that incorrectly populates the target schema is shown to be rejected at compile time.
</p>
<p>We start by defining a source schema for departments, professors, and students, with foreign keys specifying the department which each professor works in and which each student is majoring in. For brevity we use an empty typeside.
</p>
<pre>typeside Ty = empty
schema University = literal : Ty {
entities
Professor Student Department
foreign_keys
worksIn : Professor -> Department
majoringIn : Student -> Department
}</pre>
<br >
<p>Here is some sample data:
</p>
<pre>instance UniversityOfX = literal : University {
generators
Gauss Church Euler : Professor
Riemann Turing Kleene : Student
math cs : Department
multi_equations
worksIn -> {Gauss math, Church cs, Euler math}
majoringIn -> {Riemann math, Turing cs, Kleene cs}
}</pre>
<img src="images/examples/fk3.png" alt="fk1" width="700" >
<br>
<p>Our goal is to find all professor-student pairs where the professor and student share the same department. The target schema contains an additional entitiy for these matches, along with foreign keys to the professor and student involved, and an equation stating the that professor and student share the same department:
</p>
<pre>
schema AdvisorMatches = literal : Ty {
imports
University
entities
Match
foreign_keys
studentOf : Match -> Student
professorOf : Match -> Professor
path_equations
studentOf.majoringIn = professorOf.worksIn
}</pre>
<br>
<p>To populate the target schema we must give one sub query per target entity. The source tables are copied over directly. To populate the advisor matches, we iterate over all professors and students, dereferencing foreign keys (using the dot operator) instead of performing joins with Department. The where clause selects the professors and students which share the same department. Finally, for each foreign key in the target schema we must specify how it is to be populated. For example, the majoringIn foreign key takes students to departments; therefore, in the foreign keys clause for majoringIn, we must provide a department (dept), namely (stu.majoringIn), for each student (stu).
</p>
<pre>query findMatches = literal : University -> AdvisorMatches {
entities
Department -> {from dept:Department}
Professor -> {from prof:Professor}
Student -> {from stu:Student}
Match -> {from p:Professor s:Student
where p.worksIn = s.majoringIn}
foreign_keys
majoringIn -> {dept -> stu.majoringIn}
worksIn -> {dept -> prof.worksIn}
professorOf -> {prof -> p}
studentOf -> {stu -> s}
}
instance MatchesForUnivX = eval findMatches UniversityOfX</pre>
<br >
<p>The result is displayed in the IDE. Note that the generated IDs for the matches contain the provenance, or lineage, of how the match was formed.
</p>
<img src="images/examples/fk2.png" alt="fk2" width="700" >
<p>If the sub query for Match does not correctly populate the target table, say because of a typo:
</p>
<pre>Match -> {from p:Professor s:Student
where p.worksIn = p.worksIn}
</pre>
<br>
<p>The IDE rejects the query at compile time:
</p>
<pre>Error in query findMatches:
Target equation
v.studentOf.majoringIn = v.professorOf.worksIn
not respected: transforms to
s.majoringIn = p.worksIn
which is not provable in the sub-query for Match.
</pre>
<br>
<p>A screen shot of the entire development is shown below:</p>
<img src="images/examples/fk1.png" alt="fk3" width="700" >
</div><!--close main-->
</body>
</html>