But they are not different strings; they are canonically equivalent as
far as Unicode is concerned. They are even supposed to map to the same
glyph: if the font has an "ä", it should display it in both cases, and
if it has only an "a" and a combining diaeresis, it should compose one.
You cannot do a binary comparison of text to see if two strings are
equivalent.
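
To make that concrete, here is a minimal sketch in Python, using the
standard unicodedata module. The helper name canonically_equal is my
own, and comparing in NFC is just one choice (NFD would work as well):

```python
import unicodedata

# Two spellings of the same character.
precomposed = "\u00e4"   # U+00E4 LATIN SMALL LETTER A WITH DIAERESIS
decomposed = "a\u0308"   # U+0061 followed by U+0308 COMBINING DIAERESIS

def canonically_equal(s, t):
    """Compare two strings after bringing both to the same normal form.

    Hypothetical helper for illustration; NFC is an arbitrary choice here.
    """
    return unicodedata.normalize("NFC", s) == unicodedata.normalize("NFC", t)

# A binary (code point by code point) comparison says they differ...
binary_equal = precomposed == decomposed

# ...while a comparison after normalization says they are the same text.
text_equal = canonically_equal(precomposed, decomposed)
```

Here binary_equal comes out False and text_equal comes out True, which
is the whole point: the bytes differ, the text does not.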
You, in turn, are confusing characters and code points.
"ä" (U+00E4) and "a" plus a combining diaeresis (U+0061 U+0308) use
different code points, but they encode the same character, and from the
user's perspective it is the *character* that is interesting (although
he might confuse it with the glyph).
Actually, NTFS is a bit broken. It treats a file name as a string of
16-bit words. It doesn't check that the name is valid UTF-16, or even
valid UCS-2; it allows almost anything.
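
For comparison, this is roughly what a UTF-16 well-formedness check on
a string of 16-bit words would look like. It is only a sketch of the
check that is being skipped, not anything taken from NTFS, and the
function name is_well_formed_utf16 is made up for this example:

```python
def is_well_formed_utf16(units):
    """Return True if the 16-bit code units form well-formed UTF-16.

    Every high surrogate (0xD800-0xDBFF) must be immediately followed by
    a low surrogate (0xDC00-0xDFFF), and a low surrogate may not appear
    on its own.
    """
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:
            # High surrogate: needs a low surrogate right after it.
            if i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2
                continue
            return False
        if 0xDC00 <= u <= 0xDFFF:
            # Low surrogate without a preceding high surrogate.
            return False
        i += 1
    return True

ok_name = [0x0061, 0x00E4]           # "aä"
ok_pair = [0xD83D, 0xDE00]           # a proper surrogate pair
bad_name = [0x0061, 0xD800, 0x0062]  # lone high surrogate in the middle
```

A file system that stores 16-bit words without a check like this will
happily accept bad_name, even though it is not text in any encoding.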
Apple made Mac OS X handle file names properly by treating them as
strings of characters, not code points, so it uses a canonical form
for all characters (personally, I would have preferred the pre-composed
form, though).
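
The idea can be sketched like this in Python. Note the assumptions:
plain NFD from the unicodedata module stands in for the canonical form
(Apple's actual decomposition rules differ in some details), and the
names canonical_name and directory are invented for the example:

```python
import unicodedata

def canonical_name(name):
    """Map a file name to one canonical spelling.

    Assumption: plain NFD is used as a stand-in for the file system's
    own decomposed form.
    """
    return unicodedata.normalize("NFD", name)

# A toy "directory" keyed by canonical name (hypothetical structure).
directory = {}

def create(name):
    directory[canonical_name(name)] = True

def exists(name):
    return canonical_name(name) in directory

# Create the file with the pre-composed spelling...
create("\u00e4.txt")

# ...and look it up with the decomposed spelling.
found = exists("a\u0308.txt")
```

With this scheme found is True and the directory holds a single entry,
because both spellings name the same string of characters.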
--
\\// Peter - http://www.softwolves.pp.se/