Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support OCaml 4.14 #14

Closed
wants to merge 1 commit into from
Closed

Conversation

jonahbeckford
Copy link

Currently the provider library supports OCaml 4, except for a minor use of Int.hash introduced in 5.1. The provider-tests library, since it is tied to Jane Street libraries, depends on OCaml 5.2.

This PR drops the requirement for provider down to OCaml 4.14. It can probably go much lower, but 4.14 is the LTS version. (And I haven't decided whether I'm going to use it yet ... but couldn't test it until it worked on OCaml 4).

This PR provides differing OCaml 4 and OCaml 5 implementations of Uid ... the OCaml 5 version uses Int.hash while OCaml 4 uses the (slightly slower) polymorphic Hashtbl.hash. There should be no difference in functionality+performance for OCaml 5.

@jonahbeckford
Copy link
Author

I haven't tested the GitHub Actions changes ... I expected it to run in this PR but I guess that isn't enabled yet.

@mbarbin
Copy link
Owner

mbarbin commented Oct 25, 2024

Hello @jonahbeckford

It was very nice to see your username appear in my PR notifications. In fact, this is the first PR from an external contributor in this project. I will take a moment to celebrate. Thank you!

Some context out of my head:

  • I was hoping to get away with focusing on OCaml 5 only in my projects (I currently have dependencies to ppx_sexp_conv in many projects), but I understand where you come from. For small standalone repos such as provider here, I am not opposed to supporting more cases (That being said, I would prefer not spending time on anything < 4.14). Thank you for looking into it and your contribution 😃

  • It isn't visible here - the ci.yml file comes from a setup I have internally using some template mecanism across multiple repos. Changes there would require some additional work on my side. This is on me to figure out, just giving you context as to why it may take me some additional time to integrate.

  • Regarding conditional compilation, I would be on board with the technic you used there (I slighly prefer a version leveraging dune like you did over local preprocessing directives).

  • Question: what makes Int.hash faster than Hashtbl.hash? (is there some compiler optimization?). Quoting from OCaml 5's code base, I see:

hashtbl.ml

external seeded_hash_param :
  int -> int -> int -> 'a -> int = "caml_hash" [@@noalloc]

let hash x = seeded_hash_param 10 100 0 x

int.ml

external seeded_hash_param :
  int -> int -> int -> 'a -> int = "caml_hash" [@@noalloc]
let hash x = seeded_hash_param 10 100 0 x

Besides, even at the cost of a small performance hit, personally I'd be perfectly happy with using Hashtbl.hash in provider and avoid the conditional specialization. Uid.t is exposed as private int, so if someone cares they can use the int directly.

Re: enabling workflow from external contributors

I don't have much experience with this yet, and will think about it a bit more. For now I am going to look into enabling the run for this PR. I may relax this in the future (TBD).

@jonahbeckford
Copy link
Author

Thanks!

Question: what makes Int.hash faster than Hashtbl.hash? (is there some compiler optimization?).

I expect a future optimization will be available for Int.hash ... something that avoids the polymorphic parameter 'a and doesn't do the tagging ... something like:

(** file: int.ml *)
external seeded_hash_param_int :
  (int [@untagged]) -> (int [@untagged]) -> (int [@untagged]) -> (int [@untagged]) -> int = "caml_hash_int" [@@noalloc]
let hash (x: int) = seeded_hash_param_int 10 100 0 x

I'm a bit surprised that isn't there already since Int.hash can be part of tight loops, but I guess the team has higher priorities. For now, I agree it is overkill!

@mbarbin
Copy link
Owner

mbarbin commented Oct 25, 2024

This PR drops the requirement for provider down to OCaml 4.14. It can probably go much lower, but 4.14 is the LTS version. (And I haven't decided whether I'm going to use it yet ... but couldn't test it until it worked on OCaml 4).

I wanted to ask you, while you do your testing, would you consider adding a custom repository to a local switch? It's easy enough for me to add preview packages to https://github.com/mbarbin/opam-repository

Alternatively, I think it should be possible to opam pin your fork with 4.14-compatible changes. Please let me know how I can help you do your experimenting.

Meanwhile, if you wanted to shadow Int.hash (and Int.seeded_hash while we are at it) via the import file, and unconditionally bind this to Hashtbl helpers, I would be willing to merge this in.

The testing aspect of GitHub workflow might require a bit more work, so this can be done as a subsequent PR, so you're not blocked on it.

@jonahbeckford
Copy link
Author

No problem. I have my own repository I can add this to. It is fine if it takes a few weeks (or even months).

@mbarbin
Copy link
Owner

mbarbin commented Oct 27, 2024

I want to address #23 as part of the effort to support 4.14, so this is not resolved yet. For now the code builds with it at the tip of the master branch, with a version inspired by our discussion (#22).

If you do some testing with 4.14, I'm interested by your feedback. Also, if you end up trying it in projects, maybe you'll have thoughts on #16 - I'd love your feedback on this as well! Thank you.

@jonahbeckford
Copy link
Author

Context: I'm considering using Provider in DkCoder. The main simplifier is that I have structural information from codept about all the modules, module types and functors in a project. Another simplifier is that I can and do inject modules aliases on the top of each .ml script (like a PPX but with awareness of the whole project).

One of the last features to implement is conditional modules. For example, having MyGame_DirectXBackend.ml compile only on Windows and MyGame_MetalBackend.ml compile on macOS. And when it does compile, I'd want to pass the MyGame_DirectXBackend module a Windows DirectX implementation module and MyGame_MetalBackend.ml an Apple Metal framework module. Basically, do the wiring on behalf of the developer (similar to Spring wiring in Java programs).

I have an initial design based on matching functor parameter types (ex. module Make (Mtl: REQUIRES_APPLE_METAL) = struct let device = Mtl.createSystemDefaultDevice() end) with one or more corresponding module implementations (ex. module Provides_APPLE_METAL () = struct ... end).

When I get to finalizing the design next year, I might switch to use your Provider implementation. But one big difference is I can only take advantage of module/module type information (ex. the Provider and Reader modules were accessed in module abc.ml) rather than value/type information (ex. [> reader ] Provider.package is an argument to a function). I think I'd have to provide a high-level wrapper around Provider.lookup to perform lookups on a global Provider (or a Provider that I inject into each .ml script).

Anyway, that is my thoughts for now. I probably will ask more questions next year. But I'm open to suggestions before that.

nit: I'm not a huge fan of the naming Provider.packed ... which seems like it was named for an internal implementation detail and is long (I expect this to be typed a lot if I adopt it in DkCoder). Provider.p or Provider.typ would be nicer IMHO.

@mbarbin
Copy link
Owner

mbarbin commented Nov 7, 2024

I'll need some time to look into the context. Thank you for letting me know.

nit: I'm not a huge fan of the naming Provider.packed ... which seems like it was named for an internal implementation detail and is long (I expect this to be typed a lot if I adopt it in DkCoder). Provider.p or Provider.typ would be nicer IMHO.

Thanks very much for giving feedback, much appreciated. Here is my reasoning on that one:

I tried to re-use a common pattern for that name - packed is commonly used to name a GADT that is designed to capture existentially the type variables of another parametrized type. Provider.packed is a good candidate for re-using that common name.

That being said, I agree with the verbosity. What I envision, is that a library using Provider under the hood would not expose the packed type as such to users. For example, in vcs I ended up abstracting it:

https://github.com/mbarbin/vcs/blob/cf2d7cc2e774cf32ebcd15ed6ffca24bec7f95a3/lib/vcs/src/vcs0.mli#L24

On added benefit is that you may then rename it according to the actual project (Vcs.t in that instance). Would that work for you?

@jonahbeckford
Copy link
Author

Ah, I see. I thought it would be visible in end-user code but that was incorrect. Perfect! No issue with that. Thanks

@mbarbin
Copy link
Owner

mbarbin commented Nov 21, 2024

@jonahbeckford provider.0.0.11 is now available in the main public opam-repository, with support for OCaml 4.14. Thanks again for your help!!

Re: the other topics discussed in this PR. I would be in favor of saving what needs further toughts/discussion as new discussion tickets in the Discussions space of the project, if you don't mind creating some entry with specifics. Thank you and talk to you soon.

@jonahbeckford
Copy link
Author

Thanks! Opened discussion and am closing this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants