Weird discrepancy with ICU4X #105

@aumetra

To set the scene: I have a proptest set up that compares two libraries. One uses unicode-normalization under the hood, the other uses icu_normalizer.

I expected both to produce the same output, but at some point my CI failed on the odd string "\u{11366}\u{113ce}".
Running it through each library's NFC normalizer gives two different results:

  1. unicode-normalization: "\u{113ce}\u{11366}"
  2. icu_normalizer: "\u{11366}\u{113ce}"

Just a fun little thing I thought I'd report since it's technically a correctness issue (I'm just not good enough with Unicode to determine whether it's an issue with ICU4X or this crate).
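
For anyone triaging: one plausible explanation, not confirmed in this report, is that the two crates ship different Unicode data versions and therefore disagree on the canonical combining class (ccc) of one of the two characters. The sketch below uses only the standard library and implements just the canonical reordering step of normalization, not full NFC. The function names and both ccc tables are hypothetical, and the ccc values are illustrative assumptions that should be checked against the Unicode Character Database before drawing conclusions.

```rust
/// Canonical reordering sketch: within each run of characters whose
/// combining class is non-zero, stable-sort the run by combining class.
/// `ccc` is a caller-supplied lookup (hypothetical; real libraries read
/// these values from their bundled Unicode data tables).
fn canonical_reorder(input: &str, ccc: impl Fn(char) -> u8) -> String {
    let mut chars: Vec<char> = input.chars().collect();
    let mut start = 0;
    while start < chars.len() {
        // Starters (ccc == 0) never move and act as barriers.
        if ccc(chars[start]) == 0 {
            start += 1;
            continue;
        }
        // Find the end of the current run of non-starters.
        let mut end = start;
        while end < chars.len() && ccc(chars[end]) != 0 {
            end += 1;
        }
        // `sort_by_key` is stable, so marks with equal ccc keep their order.
        chars[start..end].sort_by_key(|&c| ccc(c));
        start = end;
    }
    chars.into_iter().collect()
}

/// ASSUMPTION: a table that knows both characters as combining marks,
/// with U+113CE in a lower class than U+11366 (values are illustrative).
fn ccc_newer(c: char) -> u8 {
    match c {
        '\u{11366}' => 230,
        '\u{113ce}' => 9,
        _ => 0,
    }
}

/// ASSUMPTION: a table that does not know U+113CE and so treats it
/// as a starter (ccc 0), as an implementation does for unassigned code points.
fn ccc_older(c: char) -> u8 {
    match c {
        '\u{11366}' => 230,
        _ => 0,
    }
}
```

Calling `canonical_reorder("\u{11366}\u{113ce}", ccc_newer)` swaps the two characters, which matches the output reported above for unicode-normalization, while the same call with `ccc_older` leaves the string unchanged, which matches the output reported for icu_normalizer. If the data-version assumption holds, comparing the Unicode versions each crate targets would be the first thing to check.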
