Consider renaming ByteString to IsomorphicString #1471

Closed
annevk opened this issue Feb 14, 2025 · 6 comments

Comments

@annevk
Member

annevk commented Feb 14, 2025

Context: w3ctag/design-principles#454 (comment).

@domenic
Member

domenic commented Feb 14, 2025

I'm -0 on this. It's not immediately clear from the name what that means, whereas ByteString seems pretty clear: it's strings where each code point/code unit fits into a byte, to be used in contexts like HTTP where strings are always treated as having byte-sized code points/code units.
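
To make that definition concrete, here is a minimal sketch of the check it implies, assuming the string is a JavaScript string and that "fits into a byte" means every code unit is at most 0xFF. The helper name `isByteString` is hypothetical and not taken from any specification.

```typescript
// Hypothetical helper (not a spec-defined API): returns true when every
// UTF-16 code unit of the string is in the range 0x00 to 0xFF, i.e. when
// each code unit would fit into a single byte.
function isByteString(value: string): boolean {
  for (let i = 0; i < value.length; i++) {
    if (value.charCodeAt(i) > 0xff) {
      return false;
    }
  }
  return true;
}

// U+00E9 fits into a byte; U+20AC does not.
const fits = isByteString("caf\u00e9");
const tooWide = isByteString("\u20ac");
```

Under this reading, a typical HTTP header value such as `text/html` passes, while any string containing a code unit above 0xFF is rejected.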

@annevk
Member Author

annevk commented Apr 10, 2025

Fair. I tried to consider renaming isomorphic string to byte string instead, but I'm not entirely sure about byte string encode and byte string decode.
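
For readers unfamiliar with the operations being named here, the following is a rough sketch of what isomorphic encode and isomorphic decode do: each code point in the range 0x00 to 0xFF maps to the byte with the same value, and back. The function names and the choice to throw a `RangeError` on out-of-range input are assumptions made for illustration, not spec text.

```typescript
// Sketch of "isomorphic encode": each code unit becomes the byte with the
// same numeric value. Throwing on values above 0xFF is an assumption of this
// sketch; the operation is only meant for strings that already fit in bytes.
function isomorphicEncode(input: string): Uint8Array {
  const bytes = new Uint8Array(input.length);
  for (let i = 0; i < input.length; i++) {
    const codeUnit = input.charCodeAt(i);
    if (codeUnit > 0xff) {
      throw new RangeError("code unit at index " + i + " does not fit in a byte");
    }
    bytes[i] = codeUnit;
  }
  return bytes;
}

// Sketch of "isomorphic decode": each byte becomes the code point with the
// same numeric value, so no lookup table is involved.
function isomorphicDecode(bytes: Uint8Array): string {
  let result = "";
  for (let i = 0; i < bytes.length; i++) {
    result += String.fromCharCode(bytes[i]);
  }
  return result;
}

// Round trip: decoding the encoded bytes gives back the original string.
const roundTripped = isomorphicDecode(isomorphicEncode("caf\u00e9"));
```

Because the mapping is value-for-value, the byte 0x80 decodes to U+0080 here, whereas windows-1252 maps the same byte to U+20AC.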

@nektro
Member

nektro commented Apr 10, 2025

Latin1String / UL1String ?

@domenic
Member

domenic commented Apr 10, 2025

Latin1 is a pretty bad choice, because on the web it means windows-1252 and on other platforms it means something else. (And the actual spec that defines it leaves its behavior undefined for most bytes.) See also nodejs/node#56542 (comment).

@nektro
Member

nektro commented Apr 10, 2025

got Latin1/UL1 from "Basic Latin" and "Latin-1 Supplement" being the block names for U+0000-00FF

@annevk
Member Author

annevk commented Apr 10, 2025

I suspect when most people hear "latin1" they think of the encoding, not the Unicode block. And while it's not technically an encoding on the web, it's a label for the windows-1252 encoding as Domenic mentioned, which is very different from the actual encoding.

And the Unicode block only defines the non-ASCII part of what this string type can contain, so it's not entirely fitting either.

I'm going to close this for now, but I'm open to revisiting this if we have a much better idea at some point.

@annevk annevk closed this as not planned Apr 10, 2025
3 participants