i10n #1292
```Shell
rustc --install-lang fr # downloads an official language pack from the server
rustc --install-lang fr=pack.zip # a custom pack can be installed this way
```
I'm not sure I like the idea of adding a subcommand that downloads and extracts files to rustc. I'd rather see a separate utility to install these language packs (could also be part of multirust).
Or a part of `install.sh`/`rustup.sh` (or whatever it's called these days).
I don't see the issue with adding such a thing to Rust. Since localization will be handled by the compiler directly, why not the language packs too?
The issue is that `rustc` will be talking over the network, and we don't want that.
So the compiler will have support for language packs, just not for downloading/installing them.
Why not talk over the network? I don't really see the issue here.
Oh, my bad. Didn't think of that. Thanks for the info!
Rustc also won’t be able to save these packs most of the time anyways since you need administrative privileges for that in current versions of all major systems.
Just like rust_install.sh. I don't think this is a real issue here. They can just launch the command with sudo if needed.
You don’t simply launch compilers as root. If a compiler needs administrative privileges, then something went wrong somewhere.
It's a side thing, so I don't find it abnormal.
Theoretically you could use macros/syntax extension to directly allow
@Kimundi I like the spirit of that idea, but it would make language packs much less portable
@Kimundi: And if you want to change the key-string, you'll have to change it in all other language files. I don't think that it would be very convenient...
I would really like to see a comparison to other i18n libraries, efforts, and such. This is an incredibly complex topic, and this is a very, very brief RFC. i18n is really important, and we should gain support for it. But it's really easy to do poorly.
I feel like it's still too early to implement these "luxury" features. I think translating now would just lead to a worse experience (= many untranslated errors in the output) as diagnostics are improved and new ones get added. I also don't get how we're supposed to change the structure of existing messages (i.e. add a That said, even if I'll never use this (I'm much more used to English jargon than German), this does seem like a good feature to have in general.
We should use named arguments as much as possible.
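As a rough illustration of why named arguments help here, the sketch below formats the same two values in two word orders. The function names and the French wording are hypothetical and are not taken from the RFC or from rustc's actual diagnostics.

```rust
/// English wording: the expected type comes first.
fn mismatched_types_en(expected: &str, found: &str) -> String {
    format!(
        "expected {expected}, found {found}",
        expected = expected,
        found = found
    )
}

/// Hypothetical French wording: same arguments, opposite order.
/// With positional `{}` placeholders the call site would have to swap
/// its arguments; with names only the template changes.
fn mismatched_types_fr(expected: &str, found: &str) -> String {
    format!(
        "{found} trouvé, {expected} attendu",
        expected = expected,
        found = found
    )
}
```

Calling `mismatched_types_fr("u32", "&str")` places the found type first without any change to the argument list, which is the property a translator would rely on.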
It occurs to me that we forgot about pluralization. That's nontrivial to handle. I think Firefox handles it by asking for two versions of the string.
Do not reinvent the wheel. There’s gettext and lots of infrastructure around it. IMHO this proposal is strongly inferior to implementing a gettext library (that works equally well on all supported platforms, as opposed to Python’s gettext implementation) in Rust and pulling that into rustc. If gettext is not satisfactory in some way, then at least port something that is known to already work; the Rust project really doesn’t need to solve the already-mostly-solved l10n problem all over again. P.S. this RFC is proposing infrastructure for l10n (localisation), not i18n (internationalisation). i18n is much more involved and I don’t see how Rust needs it at all.
@Manishearth Pluralization is more complicated than that, as some languages have more intricate rules than just I agree with @steveklabnik that this needs far more investigation. l20n should certainly be brought up here.
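To make the point about intricate plural rules concrete, here is a minimal sketch of per-language plural categories for whole numbers. The type and function names are hypothetical; the Polish rule follows the usual one/few/many split for integers and is written by hand here, not taken from any library.

```rust
/// Plural categories a message catalog might need to distinguish.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum PluralCategory {
    One,
    Few,
    Many,
    Other,
}

/// English integers: singular for exactly 1, plural otherwise.
fn plural_en(n: u64) -> PluralCategory {
    if n == 1 {
        PluralCategory::One
    } else {
        PluralCategory::Other
    }
}

/// Polish integers: 1 is `One`; numbers ending in 2..=4 (but not
/// 12..=14) are `Few`; everything else is `Many`.
fn plural_pl(n: u64) -> PluralCategory {
    let last_digit = n % 10;
    let last_two = n % 100;
    if n == 1 {
        PluralCategory::One
    } else if (2..=4).contains(&last_digit) && !(12..=14).contains(&last_two) {
        PluralCategory::Few
    } else {
        PluralCategory::Many
    }
}

/// Picks the Polish noun form for "file" from the category.
fn files_pl(n: u64) -> String {
    let noun = match plural_pl(n) {
        PluralCategory::One => "plik",
        PluralCategory::Few => "pliki",
        _ => "plików",
    };
    format!("{} {}", n, noun)
}
```

A scheme that only asks translators for two versions of a string cannot express the `Few`/`Many` distinction, which is why the rule has to live with the language rather than with the call site.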
@nagisa: You're absolutely right. I got confused but it's i10n.
Infrastructure for l10n is i18n. i18n is making a piece of software localizable, l10n is creating the translations.
Agreed. I think they have some handling for that, but I haven't looked into it.
I agree with the intent of this RFC, but not on the proposed solution. In my experience, translations based on simple key/value formats are a real pain to work with. Finding consistent or meaningful key names is impossible. Developers now have to follow an indirection to know what the content of the string is. The most important part of translating software lies in the tools that make it easy for the translators to keep the translations up to date, and to do so at their own pace. So I think this RFC should just be "internationalize
I highly recommend looking at l20n. I don't recommend looking too deeply at this implementation, as it was my first Rust code ever, but I feel some absurd urge to include it with my comment about l20n.
I don't know a lot about localization (is it just me or are these acronyms ironic given that this is an accessibility issue?), but wouldn't it make sense for this to be built on top of semantic error values that could also have a machine readable (e.g. JSON) output form, as well as a localized human readable String, similar to this RFC?
I agree with @withoutboats that machine readable debugging is probably much more easily translatable than embedding the i10n/etc inside everywhere.
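One possible shape for such semantic error values is sketched below: a single enum carries the data, and both the JSON form and the localized text are derived from it. All names, the JSON layout, and the French strings are assumptions made for illustration; none of this reflects rustc's real diagnostic types.

```rust
/// A semantic error value: structured data, no prose baked in.
enum Diagnostic {
    MismatchedTypes { expected: String, found: String },
    UnusedVariable { name: String },
}

/// Hypothetical set of supported languages.
#[derive(Clone, Copy)]
enum Lang {
    En,
    Fr,
}

/// Escapes backslashes and double quotes only; a real encoder would
/// also handle control characters.
fn json_escape(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

impl Diagnostic {
    /// Machine-readable form, independent of any language.
    fn to_json(&self) -> String {
        match self {
            Diagnostic::MismatchedTypes { expected, found } => format!(
                r#"{{"kind":"mismatched_types","expected":"{}","found":"{}"}}"#,
                json_escape(expected),
                json_escape(found)
            ),
            Diagnostic::UnusedVariable { name } => format!(
                r#"{{"kind":"unused_variable","name":"{}"}}"#,
                json_escape(name)
            ),
        }
    }

    /// Human-readable form, chosen per language.
    fn render(&self, lang: Lang) -> String {
        match (self, lang) {
            (Diagnostic::MismatchedTypes { expected, found }, Lang::En) => {
                format!("mismatched types: expected {}, found {}", expected, found)
            }
            (Diagnostic::MismatchedTypes { expected, found }, Lang::Fr) => {
                format!("types incompatibles : {} attendu, {} trouvé", expected, found)
            }
            (Diagnostic::UnusedVariable { name }, Lang::En) => {
                format!("unused variable: {}", name)
            }
            (Diagnostic::UnusedVariable { name }, Lang::Fr) => {
                format!("variable inutilisée : {}", name)
            }
        }
    }
}
```

Because the message text only appears in `render`, adding a language touches one function, while tools that consume `to_json` are unaffected.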
Let's not forget that pluralization is not the only thing that varies in translated strings. A huge class of languages does noun declension, where the spelling and pronunciation of a noun changes depending on its usage in a sentence, which can also be mixed with genders. It's not just the messages we have now that would need to change, but the code around how some of those messages are generated too. There are some cases where we programmatically build up parts of the string (I'm not talking about the user's own code, but the messages themselves). This can lead to cases where figuring out how many strings need to be translated will be very difficult to do. I'd also propose that when we do these translation files that the original English translation include an additional column for the context. This is usually a piece of text that describes more information about the text and the words used in order to help a translator understand the context in which the translation is used. Not everyone coming up with translations is going to completely understand the code it's used in. They can also be used to describe the sections of the string that are replaced with user content. For example, whether the value that appears in
@Nashenas88 That's a good point. l20n makes solving these kinds of grammatical issues relatively simple.
@apasel422, I just looked up l20n, and I'm impressed by what it offers. I haven't had a chance to look at the code yet though; I hope it's something we could take advantage of easily.
Maybe I missed something in the RFC, but how would this actually work? If the translations are regular format strings (with
There is machinery to iterate through format strings at runtime, it's easy to use that.
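The question matters because `format!` needs its template at compile time, so a translation loaded from a language pack has to be filled in some other way. The following is a hand-rolled sketch of runtime substitution for `{name}` placeholders; it is not the compiler's own machinery, `render_template` is a hypothetical name, and escapes such as `{{` and format specs are deliberately not handled.

```rust
use std::collections::HashMap;

/// Fills `{name}` placeholders in a template that is only known at
/// runtime (for example, one read from a language pack).
/// Returns an error for an unclosed brace or an unknown argument name.
fn render_template(
    template: &str,
    args: &HashMap<&str, String>,
) -> Result<String, String> {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(open) = rest.find('{') {
        // Copy the literal text before the placeholder.
        out.push_str(&rest[..open]);
        let after = &rest[open + 1..];
        let close = after
            .find('}')
            .ok_or_else(|| "unclosed `{` in template".to_string())?;
        let name = &after[..close];
        let value = args
            .get(name)
            .ok_or_else(|| format!("unknown argument `{}`", name))?;
        out.push_str(value);
        rest = &after[close + 1..];
    }
    // Copy whatever literal text follows the last placeholder.
    out.push_str(rest);
    Ok(out)
}
```

Returning an error for an unknown name means a broken translation can be detected and the English message used as a fallback, instead of printing a half-filled string.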
So there is a change I have been contemplating doing that is related to this RFC. It frequently happens in Rust that we have "multipart" error messages, like an error with several explanatory notes, and maybe a help suggestion as well. Currently, each of these is a distinct message, and each has its own span, and it's kind of a big mess. Furthermore, many of the messages -- such as those produced by the borrow checker -- involve "program flow". We currently display this as a multipart message highlighting each point in the code, but this must be "cross-referenced" against the original source somehow. On a related note, our messages often include a lot of terminology that users may not know. Research shows that even simple terms like "function body" can be confusing to new users and so on, to say nothing of things like "object type" or "lvalue". It'd be great if we could find a way to make these terms clearer to people. I was hoping to address all of these points by allowing us to construct richer errors. The idea would be to have:
Anyway, I bring this up here because while these goals are not directly I10N goals, they are obviously served by some of the measures in this RFC. Furthermore, for I10N purposes, I imagine these "multipart" messages ought to be a unit, since the right breakdown will probably be translated differently. I know that sometimes I have to really torture the phrasing to make it make grammatical sense in English, and I assume it would be near impossible to port that across languages.
I was working on something similar but only for Giving better access to information for newcomers (and even more experienced Rust users!) should be given a little more consideration.
I've updated l20n so that it now compiles on stable Rust. The parsing and resolving works, but locale negotiation is non-existent at the moment. (repo: https://github.com/seanmonstar/l20n.rs) With some more work, it could be possible to use syntax extensions (or codegen like serde) to compile the l20n templates into Rust code at compile time, instead of runtime. (Runtime should stay though, since it's also a possible strategy for an application to download updated language resources and need to compile them at runtime.)
I will play community memory here and point out that the current format string syntax in Rust was chosen explicitly to be compatible with ICU MessageFormat syntax (itself derived from Java's). This is the standard (and has been worked over very thoroughly to accommodate variations in plural, gender and similar dimensions). http://icu-project.org/apiref/icu4j/com/ibm/icu/text/MessageFormat.html
(The most thorough conversation we had about this was in 2013, starting with https://mail.mozilla.org/pipermail/rust-dev/2013-May/003999.html ... There are lots of arguments and informative links in that thread.)
We discussed this RFC in the @rust-lang/compiler meeting yesterday. The rough consensus was that it is too early to "internationalize" the compiler, even though we would like to do so eventually. Even when just considering English, it is difficult to maintain the quality of error messages, and adding other languages into the mix would be a significant burden. It's also hard for us to judge the quality of those error messages. That said, we are doing some work on overhauling the error reporting infrastructure for IDE integration and better usability, and I expect that this should make internationalization easier longer term (though at the moment we have not been focusing on extracting the text of the messages themselves outside of the compiler). Therefore, I'm inclined to close this RFC for the time being (and open a corresponding issue), but I'd like to hear feedback on that first.
What @Manishearth and I had in mind was more to provide the structure that lets users add localization (so the Rust team would not have to internationalize anything itself). However, I approve of this way of doing it; for now, other steps need to be done before going further into this.
@GuillaumeGomez OK. I will close then for now, but thanks for the interesting discussion.
Given that we recently got localization for the website working, I wrote a post starting exploration of doing the same for the compiler. https://internals.rust-lang.org/t/translating-the-compiler/10376 |
rendered
cc @Manishearth