Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add float parser, generalise one_of/.optional() a bit #14

Merged
merged 10 commits into from
Nov 18, 2023
Merged

Conversation

jsdw
Copy link
Owner

@jsdw jsdw commented Nov 15, 2023

  • add chars::float to parse floats
  • add chars::case_insensitive_eq for case insensitive token matching.
  • generalise one_of and .optional(); allow expressions passed to either to return Options or bools, not just Options.

@jsdw
Copy link
Owner Author

jsdw commented Nov 15, 2023

@Easyoakland just FYI :)

@Easyoakland
Copy link
Contributor

Why implement these in an extension trait instead of the same way that a user of the library would implement these (free functions)?

Would it not make more sense for some of these functions to use a stack allocated buffer with a maximum size since there are a maximum number of digits? For example u8 max is 256 so 3 chars is sufficient.

@jsdw
Copy link
Owner Author

jsdw commented Nov 17, 2023

Why implement these in an extension trait instead of the same way that a user of the library would implement these (free functions)?

I started with freestanding functions, but the extension trait felt ergonomically superior in the end. One example is for the functions parse_f32 and parse_f64; as freestanding functions you need to call parse_f32::<_, String>(toks) (ie need 2 generics, one for tokens and one for buffer.. granted, I could elide the tokens using impl syntax but that doesn't feel right in a proper library because one may want to specify it). With an extension crate it's toks.parse_f32::<String> which is just nicer and the one generic is easier to explain (plus nice symmetry with toks.parse::<f32, String>() or whatever)

Subjective I'm sure, but I tried both and preferred this!

Would it not make more sense for some of these functions to use a stack allocated buffer with a maximum size since there are a maximum number of digits? For example u8 max is 256 so 3 chars is sufficient.

I didn't add any functions like that; only ones for f32/f64?

@Easyoakland
Copy link
Contributor

(ie need 2 generics, one for tokens and one for buffer.. granted, I could elide the tokens using impl syntax but that doesn't feel right in a proper library because one may want to specify it)

I'm not sure why that's preferable to impl syntax. If you want to specify a type why not do that earlier?

fn parser<B>(tok: &mut impl Token...){...}
let a: StrTokens<'static> = "1234".into_tokens();
parser::<String>(&mut a)

If the syntax that user's use is different from the library I think that user's defining new combinators (the main thing) becomes a second class citizen. I'm of the opinion that combinators defined in the library should be useful as a big tutorial.

I didn't add any functions like that; only ones for f32/f64?

Oh, apparently floats with excessive digits of precision are just rounded so the input can be arbitrarily long.

@Easyoakland
Copy link
Contributor

I also think the naming parse_f32(), parse_f(), float(), ... is worse than f32(), float(), skip_float().

Another advantage of free functions is then if it is actually advantageous to have both parse and skip versions of things to make two modules named parse and skip.

I'm not sure it is though. Why not only have the skip versions and allow the user to .parse() if they want to?


Another alternative could be making these things Tokens. Then you could do f32(&mut toks).as_iter().count() !=0 (or make consume return a bool) for the bool case and f32(&mut toks).parse() for the parse case. For this to work the macro for defining token newtypes should probably be made first. Or it could be a function returning a particular Tokens that is parameterized by its parse function.

At this point it might make sense to impl<T: Tokens> Tokens for &mut T and take everything by value like in core::iter then you can f32(toks).parse().

@jsdw
Copy link
Owner Author

jsdw commented Nov 17, 2023

If the syntax that user's use is different from the library I think that user's defining new combinators (the main thing) becomes a second class citizen. I'm of the opinion that combinators defined in the library should be useful as a big tutorial.

I don't think it's a huge deal either way, but I do agree with the tutorial perspective and promoting freestanding funcs as a good way to do these things. I might revert to freestanding funcs with impl syntax again for this reason; I can't think of a good reason not to just use impl syntax here, really.

I also think the naming parse_f32(), parse_f(), float(), ... is worse than f32(), float(), skip_float().

This is just opinion but I prefer the current names; parse_f32 is nicely consistent with Tokens::parse so I prefer that. float is consistent with methods like Tokens::token and Tokens::tokens in that it will return a boolean whether it's seen the thing in question. parse_f is not public anyway.

I'm not sure it is though. Why not only have the skip versions and allow the user to .parse() if they want to?

I was going to just have float(), but these are already helper utils, and I feel like most of the time somebody would actually want to get the float value back. I was originally going to not expose float at all and just parse_f32/parse_f64 and may still go that path until there is some use case for exposing float

Another alternative could be making these things Tokens. Then you could do f32(&mut toks).as_iter().count() !=0 (or make consume return a bool) for the bool case and f32(&mut toks).parse() for the parse case.

I guess right now I don't see the point so I'd rather avoid that extra complexity. Would somebody really want to iterate over the tokens that are a part of a float and do something with only a subset of them? I doubt it. (if I kept float() then somebody could do as parse_f and take a slice to the tokens anyway if they really want, but I don't see why really, so it doesn't compel me to keep it really).

@jsdw jsdw merged commit 1221933 into master Nov 18, 2023
4 checks passed
@jsdw jsdw deleted the float-parsing branch November 18, 2023 00:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants