More complicated numbers? #10
I'm going to accept PR #8 tonight; it adds scientific notation (1e8, etc.) to Emily. However, I have some thoughts on where we should go from there.
Something strikes me as odd about #8: the accepted format is 3.4e-8. This makes sense, as it is standard and matches what the output looks like. However, -8 is not currently accepted as a number; it's ~8, to prevent parser confusion between unary and binary -. We don't, however, accept 3.4e~8.
Meanwhile, there are some more things I want eventually:
- I want an octal mode (0o434)
- I want a hex mode (0x4d4)
- I want a binary mode (0b1001)
- Once there are multiple numeric types (i.e. int) I want to specify some standard way of making a constant an int or a float, similar to 5.0f in C
- I want octal, hex and binary mode to work with all features. I should be able to say something like 0x4ac.eb3Ea. Or 53.0b1001. Currently I think C doesn't let you put anything but decimal to the right side of the .
These other requests create additional weirdness: you obviously can't do 0xeE4, because the E would be read as a digit rather than an exponent marker, and "f" can't go after a hexadecimal number, because "f" and "e" are both valid hex digits.
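To make the clash concrete, here is a small Python sketch (Python is used only for illustration; it is not Emily code, and `ambiguous_in_hex` is a hypothetical helper). It checks which candidate marker letters collide with hex digits and shows that a hex reader swallows "e" and "f" as digits.

```python
import string

# Every character a hex literal may contain: 0-9, a-f, A-F.
HEX_DIGITS = set(string.hexdigits)

def ambiguous_in_hex(markers):
    """Return the marker letters that a hex reader would mistake for digits."""
    return [m for m in markers if m in HEX_DIGITS]

# Candidate markers from this thread: e (exponent), f (float), p (power), i (int).
print(ambiguous_in_hex("efpi"))  # ['e', 'f']

# The E in 0xeE4 is just another digit, so no exponent can be recovered.
print(int("eE4", 16))  # 3812

# Likewise a trailing f becomes part of the value instead of a type suffix.
print(int("5f", 16))  # 95
```

Only "p" and "i" survive as unambiguous markers in hex mode.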
Another thing to consider: something which is not currently well documented is the relationship between the "reader" (tokenizer.ml) and the macro parser (macro.ml). My goal is "as much as possible" should be handed off to the macro parser, since this will eventually be user code and I want it to be customizable. However it seems like some things kind of can't be handled by macros easily. The internals of numbers are a prime example: with 4.5e4, you'll probably (?) misrepresent the number in binary if you try to implement e as a macro. Negation however comes with no loss-of-precision risk, so ~ or an eventual unary - could be done with a user macro.
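A rough illustration of the precision worry, in Python rather than Emily (`naive_e` is a hypothetical stand-in for what an `e` macro might do, assuming it receives an already-rounded float mantissa and scales it by a power of ten). With a small positive exponent like 4.5e4 the result happens to be exact, but with a negative exponent the separately rounded scale factor can push the result off by one bit.

```python
def naive_e(mantissa, exponent):
    """Scale an already-parsed float by 10**exponent, as a macro might."""
    scale = 1.0
    for _ in range(abs(exponent)):
        scale = scale * 10.0 if exponent > 0 else scale / 10.0
    return mantissa * scale

# Small positive exponents are exact: 4.5 and 10000.0 are both representable.
print(naive_e(4.5, 4) == float("4.5e4"))   # True

# Here 0.1 is rounded first, then the product is rounded again.
print(naive_e(3.0, -1))                    # 0.30000000000000004
print(naive_e(3.0, -1) == float("3e-1"))   # False
```

Parsing the whole literal in one step, as the reader can, rounds only once.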
With all these things in mind, here's what I think I want:
- IN READER: Numbers begin with a digit. If the number starts with 0x, 0b, or 0o, the reader switches to hex, binary or octal mode. After the numeric part an "e" or a "p" is allowed (p for power), either of which means the part after is an exponent. The "e" is not accessible in hex mode. The exponent may be negative using either a - or a ~.
- IN MACRO PROCESSOR: Either ~ or unary -, depending on which macro you loaded, negates the following numeric constant. Numbers are float by default but putting an "i" after, like 4i, makes it integer-typed to start.
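The reader half of this proposal could be sketched roughly as follows. This is Python for illustration only, not the actual tokenizer.ml; `NumberToken` and `read_number` are hypothetical names. It also makes several assumptions the proposal does not settle: input is treated case-insensitively, exponent digits are always decimal, and the token is left unevaluated (what the exponent scales by is not decided here). Negation and the "i" suffix are left to the macro processor and are not handled.

```python
from dataclasses import dataclass
from typing import Optional

DIGITS = {2: "01", 8: "01234567", 10: "0123456789", 16: "0123456789abcdef"}
PREFIXES = {"0x": 16, "0b": 2, "0o": 8}

@dataclass
class NumberToken:
    base: int
    whole: str               # digits before the point
    fraction: str            # digits after the point, "" if none
    exponent: Optional[int]  # None if no e/p part was given

def read_number(text):
    s = text.lower()  # assumption: case-insensitive literals
    if not s or s[0] not in DIGITS[10]:
        raise ValueError("a number must begin with a digit")
    base = PREFIXES.get(s[:2], 10)
    pos = 2 if base != 10 else 0

    def take(allowed):
        # Consume the longest run of allowed characters starting at pos.
        nonlocal pos
        start = pos
        while pos < len(s) and s[pos] in allowed:
            pos += 1
        return s[start:pos]

    whole = take(DIGITS[base])
    if not whole:
        raise ValueError("missing digits after base prefix")
    fraction = ""
    if pos < len(s) and s[pos] == ".":
        pos += 1
        fraction = take(DIGITS[base])
    exponent = None
    markers = "p" if base == 16 else "ep"  # "e" is a digit in hex mode
    if pos < len(s) and s[pos] in markers:
        pos += 1
        negative = pos < len(s) and s[pos] in "-~"
        if negative:
            pos += 1
        digits = take(DIGITS[10])  # assumption: decimal exponent digits
        if not digits:
            raise ValueError("missing exponent digits")
        exponent = -int(digits) if negative else int(digits)
    if pos != len(s):
        raise ValueError("unexpected character %r" % s[pos])
    return NumberToken(base, whole, fraction, exponent)

print(read_number("3.4e~8"))
print(read_number("0xeE4"))
print(read_number("0b1001p3"))
```

Under these assumptions 3.4e-8 and 3.4e~8 produce the same token, and 0xeE4 comes back as three hex digits with no exponent.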
VERY long term goals: Maybe eventually numeric constants are internally strings until some point in the macro processing loop, and that would allow bignum constants. It would be cool to have a way to specify binary strings like 102.244.33.53.1 or something. Oh, and I've always liked the _ in Perl.
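As a rough picture of why keeping constants as strings helps, here is a Python sketch (`to_int` and `to_float` are hypothetical helpers, and Python's built-in big integers stand in for a bignum type). The literal text, with Perl-style _ separators, is converted only once the target type is known, so the integer path never passes through a float.

```python
def to_int(literal_text):
    """Late conversion to an arbitrary-precision integer."""
    return int(literal_text.replace("_", ""))

def to_float(literal_text):
    """Late conversion to a double, which may round."""
    return float(literal_text.replace("_", ""))

text = "12_345_678_901_234_567_890"

print(to_int(text))                         # 12345678901234567890
# Going through a float first loses the low digits.
print(int(to_float(text)) == to_int(text))  # False
```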
Anyone have any thoughts? This is a little complicated, but IMO the numeric literals in most languages are not nearly as nice as they should be, and this has to be thought out in a way other Emily features don't, because it partially involves the reader (i.e. I can't just get it wrong and then patch it later with a user macro).