Loading SQL schemas and schema updates from files #13

paurkedal · 2023-11-29T10:07:32Z

Not urgent, and can be simplified by a future addition to Caqti.

It can be useful to store SQL code for schemas in external files, rather than including it in the application code, esp. when it becomes complex and needs updating, as this leaves the database administrator more in control if something goes wrong or special attention to local needs are needed during an upgrade. An application may therefore ship an database initialization and upgrade utility, or even perform the actions automatically when string the application. This will requires loading SQL schema files.

The tutorial should show how to distribute auxiliary files with dune rules to fulfil the most basic need, and then how to load the files from the same location. So, in addition to the OCaml code, it needs some dune and opam knowledge, but by providing the relevant bits people don't need to hunt for it.

I have one public source: In subsocia I currently use the following not very pedagogical code to do this:

let angstrom_file_parser =
  let open Angstrom in
  let is_space = function
   | ' ' | '\t' | '\n' | '\r' -> true
   | _ -> false
  in
  let white =
    many (take_while1 is_space <|> (string "--" *> take_till ((=) '\n')))
      <* commit
  in
  white *> many (Caqti_query.angstrom_parser <* char ';' <* white)

let load_sql (module C : Caqti_lwt.CONNECTION) sql =
  let open Caqti_request.Infix in
  Lwt_io.with_file ~mode:Lwt_io.input sql @@ fun ic ->
  let* unconsumed, result = Angstrom_lwt_unix.parse angstrom_file_parser ic in
  if unconsumed.Angstrom.Buffered.len <> 0 then
    Lwt.return_error (`Msg (sql ^ ": Unconsumed input."))
  else
  let rec submit = function
   | [] -> Lwt.return_ok ()
   | stmt :: stmts ->
      let request = Caqti_type.(unit -->. unit) ~oneshot:true (fun _ -> stmt) in
      let*? () = C.exec request () in
      submit stmts
  in
  (match result with
   | Error msg -> Lwt.return_error (`Msg (sql ^ ": " ^ msg))
   | Ok stmts -> submit stmts)

I consider adding some of this code to Caqti, so that people don't need to learn Angstrom.

BTW, the some code tracks schema upgrades, which is a larger topic, which is probably better delegated to a different issue, as it could use some research into best practices.

benjamin-thomas · 2023-11-29T19:51:19Z

Sounds good.

I like the idea of putting SQL into separate files. Certain IDEs, provide syntax highlighting + auto completion directly in the code fragments. But since we are limited tooling wise in OCaml, it sounds like a useful technique to demonstrate in terms of code organization. Also, that could open up the possibility of using a dedicated SQL editor like this one as an extra benefit.

In terms of database evolutions, I've had a good experiences handling database migrations out of process, with a dedicated tool (manually).

With this in mind, it looks like this OCaml tool could be useful tool to introduce: https://github.com/tmattio/omigrate.

I can imagine use cases where making the migrations fully automatic on app start would be a requirement so it'd be interesting to show both techniques. Maybe omigrate could also help there, it's not clear to me if it can also act as a library, and not only as a CLI tool.

paurkedal · 2023-11-29T20:38:33Z

Yes, I agree about IDE support. My own requirement is limited, but I'm happy to have syntax highlighting vim when dealing with larger chunks of SQL code. Also, it can benefit other developers in multi-language projects to have a shared definition of the data model.

It's good to look at omigrate for inspiration at least, though having a quick look at the code, it does not solve any tricky problems like diffing schemas or schemas vs loaded databases, and has few commits (though not that old). Given that we already have the multi-RDBMS support (+ MariaDB), and with the mentioned additions to Caqti, the technical part up to that level would probably not amount to much code. On the other hard, I think good conventions are gold here, and I'm not sure I've know the best approach, though I have also been dealing with it for some time.

Apropos, the tutorial could have a further reading section at the end, which points to useful related projects, without necessarily actively using or documenting them.

One thing worth mentioning about the load_sql above is that the schema is parsed as a Caqti query, which has advantages and disadvantages. If one the other hand we were to load the whole schema as a single text and pass it as a literal query (using Caqti_query.L), then I think we run into some issues with how databases behave with respect to multi-statement queries. That may be solvable, but I haven't looked into it.

paurkedal added the enhancement New feature or request label Nov 29, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Loading SQL schemas and schema updates from files #13

Loading SQL schemas and schema updates from files #13

paurkedal commented Nov 29, 2023

benjamin-thomas commented Nov 29, 2023

paurkedal commented Nov 29, 2023

Loading SQL schemas and schema updates from files #13

Loading SQL schemas and schema updates from files #13

Comments

paurkedal commented Nov 29, 2023

benjamin-thomas commented Nov 29, 2023

paurkedal commented Nov 29, 2023