Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding Percent Formatter to Experimental #5255

Merged
merged 23 commits into from
Jul 27, 2024

Conversation

blaynem
Copy link
Contributor

@blaynem blaynem commented Jul 17, 2024

What that PR doin?

Creating a Percent Formatter! solving #4483

The formatter includes the option to display as an Approximate, Standard, or Explicit Plus values.

Percent Subpatterns

Pre-requisite reading: Number Pattern Character Definitions

There can be two types of subpatterns from CLDR that are split with the ; character. We always get an explicit positive subpattern, some locales include an explicit negative subpattern as well. When there is no explicit negative subpattern, an implicit negative subpattern is formed from the positive pattern with a prefixed - (ASCII U+002D HYPHEN-MINUS).

To give an example, below is a table where a value of 123 is used for the positive pattern, and -123 is used for the negative pattern. Notice the difference between tr and blo locale where the minus sign is appended to the beginning, even before the % sign.

locale pattern positive pattern negative pattern explicit?
en-US #,##0% 123% -123%
fr #,##0 % 123 % -123 %
tr %#,##0 %123 -%123
blo % #,#0;% -#,#0 % 123 % -123 true

Display Options

There are 3 Display options: Approximate, Standard, and ExplicitPlus. Below are some examples

value pattern Standard Approximate ExplicitPlus
123 #,##0% 123% ~123% +123%
123 % -#,#0 % 123 % ~123 % +123
123 % #,#0 % 123 ~% 123 +% 123
-456 #,##0% -456% ~-456% +-123%
-456 % -#,#0 % -456 % ~-456 % +-456
-456 % #,#0 -% 456 ~-% 456 +-% 456

Approximate

If the - symbol placeholder is present in the pattern, then the approximate sign gets prepended to the sign that is being used.
If the - symbol placeholder is not present, then we follow the implicit
It does not do any sort of rounding to the value itself.

Explicit Plus

If the ExplicitPlus display option is chosen, we are simply ensuring that the plus sign is always present. We follow the typical rules of the negative subpattern while ensuring the localized plus sign is included.

Open Discussion

There are a few things left to determine.

  1. If the Approximate display option is selected, should the value be rounded to the nearest whole number?
    • It could go either way. I'd vote no to keep the formatter a bit more "pure" to just accept the value it's going to format.
  2. If the Explicit Plus display option is selected, should the value be converted to the absolute value?
    • My assumption is that the end user wants to show a number with the + or - sign, not necessarily for it to be explicitly a plus. If that's the case, maybe this display option should be renamed to AlwaysShowSign instead?

@blaynem blaynem requested a review from younies as a code owner July 17, 2024 20:30
W: core::fmt::Write + ?Sized,
{
// Construct the percent sign value.
let percent_sign_value = format!(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is close but it won't build because format! is in the prelude only in std, but this should work with no_std.

Since you did your work last winter, we now have the icu_pattern utility which does this stuff better. It will make the smallest, cleanest, fastest code if you change PercentEssentialsV1 to use icu_pattern.

@blaynem blaynem requested a review from a team as a code owner July 18, 2024 19:16
CONTRIBUTING.md Outdated
@@ -79,6 +79,7 @@ There are various files that auto-generated across the ICU4X repository. Here a
need to run in order to recreate them. These files may be run in more comprehensive tests such as those included in `cargo make ci-job-test` or `cargo make ci-all`.

- `cargo make testdata` - regenerates all test data in the `provider/testdata` directory.
- `cargo make bakeddata experimental` - regenerates baked data in the `provider/data/experimental` directory.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find a note of this anywhere. It's definitely my inexperience with Rust and tracking down where commands come from. But felt it might be helpful to have it in here!

@blaynem blaynem marked this pull request as draft July 18, 2024 21:53
Comment on lines 54 to 60
// TODO: Determine if we throw an error when explicit plus is called and the value is negative.
Display::ExplicitPlus => self
.essential
.negative_pattern
.interpolate([self.value.to_string(), self.essential.plus_sign.to_string()])
.write_to(sink)?,
};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsure what to do for this portion. There's one of 2 assumptions we could make here.

  1. They want an absolute value. Example:
    a. Input: -123 -> output +123%

  2. They just want any visible sign (plus or minus).
    a. Input: -123 -> output -123%
    b. Input: 123 -> output +123%

I think my assumption would be 2.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm certain it is 2, the plus is usually implicit, this makes it explicit.

Comment on lines 37 to 38
/// https://www.unicode.org/reports/tr35/tr35-numbers.html#approximate-number-formatting
/// https://www.unicode.org/reports/tr35/tr35-numbers.html#explicit-plus-signs
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I keep or remove these 2 links? I think they're super beneficial to glance at, but fine removing them

@blaynem blaynem marked this pull request as ready for review July 20, 2024 18:20
CONTRIBUTING.md Outdated Show resolved Hide resolved
Sign::Negative => self.value.clone().with_sign(Sign::None),
_ => self.value.to_owned(),
}
.to_string();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue: you shouldn't be calling to_string in a Writeable implementation. I believe because FixedDecimal: Writable, you can pass it directly to interpolate if you pass in a tuple instead of an array (that allows the arguments to have different types).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't notice I could do a tuple with different values! I was able to convert that like you said.

Thanks for calling out the intermediate allocations. Coming from a JS background, so it's something that I need to be more thoughtful about.

components/experimental/src/dimension/percent/format.rs Outdated Show resolved Hide resolved
components/experimental/src/dimension/percent/format.rs Outdated Show resolved Hide resolved
Comment on lines 54 to 60
// TODO: Determine if we throw an error when explicit plus is called and the value is negative.
Display::ExplicitPlus => self
.essential
.negative_pattern
.interpolate([self.value.to_string(), self.essential.plus_sign.to_string()])
.write_to(sink)?,
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm certain it is 2, the plus is usually implicit, this makes it explicit.

components/experimental/src/dimension/percent/format.rs Outdated Show resolved Hide resolved
components/experimental/src/dimension/percent/format.rs Outdated Show resolved Hide resolved
Comment on lines 98 to 99
let result = format_percent(&locale, Default::default(), positive_value);
assert_eq!(result.as_str(), "12345.67%");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue: this should use assert_writeable_eq!

Suggested change
let result = format_percent(&locale, Default::default(), positive_value);
assert_eq!(result.as_str(), "12345.67%");
let default_fmt = PercentFormatter::try_new(&locale, Default::default()).unwrap();
let formatted_percent = default_fmt.format_percent(&positive_value);
assert_writeable_eq!(formatted_percent, "12345.67%");

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was running into an issue there with the Display writeable. Found the macro to use in another formatter!
writeable::impl_display_with_writeable. So those are fixed.

}

#[test]
pub fn test_fr_fr() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue: these tests are all almost identical, the code can be significantly deduplicated. However, what's the value in having this test essentially 7 times? What code path does test_fr_fr test that test_en_us doesn't? I think one LTR and one RTL language should be sufficient test coverage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely went overboard. I'll cut this down to a few of them!

components/experimental/src/dimension/percent/formatter.rs Outdated Show resolved Hide resolved
Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work overall! One thing you're missing.

if self.value.sign() == Sign::Negative {
self.essential
.negative_pattern
.interpolate((abs_value, &self.essential.minus_sign))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue, here and elsewhere: instead of using impl Writeable for FixedDecimal, you should be using impl Writeable for FormattedFixedDecimal, which you can obtain from a FixedDecimalFormatter, which you should include as a field of the PercentFormatter.

Without the FixedDecimalFormatter, you won't get localized digits, decimal separators, or grouping separators.

You can test this with locales such as bn and ar-EG.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good callout! Added

robertbastian
robertbastian previously approved these changes Jul 25, 2024
Copy link
Member

@robertbastian robertbastian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great work!

I think we can merge it like this, however the next step should be to use FixedDecimalFormatter for correct number formatting.

/// Represents the standard pattern for negative percents.
/// NOTE: place holder 0 is the place of the percent value.
/// place holder 1 is the place of the plus, minus, or approximate signs.
pub negative_pattern: DoublePlaceholderPattern<Cow<'data, str>>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe this should be signed_pattern and the other one unsigned_pattern, because this is also used for positive approximate and explicit plus numbers.

Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!

@sffc sffc merged commit 46aace9 into unicode-org:main Jul 27, 2024
28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants