-
-
Notifications
You must be signed in to change notification settings - Fork 194
Standardize configuration inheritance. #172
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Standardize configuration inheritance. #172
Conversation
Yep, looking good so far! |
9f8f665
to
cc20964
Compare
@edsilv, I think this is done. I've adjusted configuration inheritance so you can chain together an infinite number of files (previously it was limited to one level of inheritance). Each file will search its "extends" path relative to its own file location. I've also refactored all of the common strings out of the extensions into the shared base. I suspect there may be additional strings from the Seadragon extension that ought to move up, but for now I limited myself to refactoring this so that the output remains unchanged from before. It's also worth reviewing what's left behind in the media and PDF extensions, since there seem to be inconsistencies between the CY and EN files which may indicate obsolete stuff that can simply be removed entirely. Let me know what you think, and whether I can be of any further assistance. If you're happy with this, feel free to merge to master, and then I'll update #168 to match. |
Nice! One thing that occurs to me is that there is now no single l10n file for each locale to upload to transifex for translation... I'm actually starting to question the value of transifex. It's kinda difficult to work with, and currently only supports the openseadragon extension. Maybe there's a better solution. Perhaps we could even supply the translation interface ourselves in the form of the config editor? |
There's a problem with the config editor. It's only editing the config/content for the current extension. As we've discussed, Maybe each localised config needs to contain everything for all extensions. These configs are split between An example
The same pattern would be employed for the These are then merged by the config editor. We add two checkboxes above the editor, "content" and "config". Checking "content" would allow you to edit the content only. The json can be copied and used to overwrite the The file loaded by
? |
I'm not familiar with Transifex, so I can't comment on its specific limitations. Are you familiar with TranslateWiki (https://translatewiki.net/)? It's been on my radar for some time to investigate using it for VuFind, but I haven't gotten around to it yet. Looks promising, though. Perhaps it would also be suitable for UV's needs. |
I'll take a look. I think what I'm suggesting above could obviate third party translation services? |
We can use the "extends" keyword to inherit things like licenses from the en-GB l10n file. I think this approach could solve a number of issues. It's straight-forward to filter what's included in the config editor using jmespath. |
I think one of the proposed benefits of TranslateWiki is that you may get translations into additional languages courtesy of the active volunteer population, which could be a benefit -- but it only works well if your language files match up with the sites conventions (and being unable to devote much time to cleaning up some messy things in VuFind is the main reason I haven't gone forward on that side). If you're not interested in soliciting a broader base of translation volunteers, though, it certainly could make sense to have your translation staff/volunteers simply use the existing config editor. The main challenge, which relates both to this and to your second comment, is that you have to figure out a way to easily map translations back into the source files. That's hard to do when there are multiple source files... but collapsing everything into a single file adds its own challenges, in terms of loss of modularity and potentially limitless growth. It also makes the possibilities for inheritance less flexible -- I'm gathering that each module's "common" section would get merged with the top-level "common" section... and maybe that's good enough... but it feels more restrictive than having the ability to chain files together more or less arbitrarily. It seems to me that any approach is going to have some trade-offs. To some extent, it's also a question of whether to maintain this as a single file or simply to compile more things at build time -- perhaps it still makes sense to maintain files on an extension-by-extension basis, but to compile everything into a single-file format that's more compatible with the config editor. Of course, that doesn't address the problem of figuring out where to put newly-received translations in the source files. I suppose at this point I don't know enough about workflows to comment particularly intelligently! |
I think if the config editor does a decent enough baseline translation job, it might be simpler to just go that route. My gut feeling is that UV translation will only ever be a slow trickle from implementing institutions, rather than something requiring crowdsourcing. If this turns out to be wrong, it should still be possible to add third party translation support. One thing we'd lose from transifex is the "% translated" indicator, which is quite handy... Here's one idea. We create a new "monolithic" config format that combines all content and options for all extensions, plus a shared section. These monolithic files would be named This format acts as a temporary file to be pasted into data-config can also optionally be a json array of references to locale-specific configs. The locale is included in the monolithic file, so if it's referenced directly from data-config, it knows when to apply it. It's necessary to have the option of locale-specific data-configs, because you may want to enable/disable certain features per-locale. For example, the language toggle button used on the welsh locale. After copying and pasting this monolithic file from the config editor, you run a build. During build, it's merged with the extension-specific l10n and config files (perhaps we should rename these to "content" and "options" to match the monolithic format node names? "config" being a combination of content and options). Everything but the shared "content" and "options" sections is removed. The extension-specific files are still able to inherit from each other, and when inheriting from the monolithic shared config file, only get the corresponding "content" or "options" nodes. When launching the config editor, they are merged back into the monolithic config format again for the given locale. The editor has "content" and "options" checkboxes to enable/disable those sections and keep the file size down for data-config use. It also has checkboxes to enable/disable each extension section and the shared section. The problem now is that this generated monolithic file will grow large as extensions are added. I think we can get away with this for the editing experience - it's acceptable for a tool like that to need to load heavy files. Consider Adobe Photoshop or Premier. |
Is there a way to give the config editor extra smarts so it can detect which strings are missing from one translation but present in another? I imagine that % translated would actually be fairly easy to calculate at build time -- if we want to keep track of that statistic, we could probably do something in the Travis job to post the current value somewhere. Your proposed solution sounds reasonable to me, though it would definitely be good to solicit some additional opinions given that I'm not yet so familiar with the project that I'm likely to think of every edge case! A few questions/comments:
|
I'm thinking that it would be desirable to be able to perform some kind of type checking on the configs based on interface inheritance. Config currently feels like a bit of an unsafe loose end in that regard - pretty much everything else is statically typed. That way we could ensure that each config has no missing fields and can therefore create a % translated metric around which fields are empty or somehow marked as inherited. Runtime TS type checking doesn't appear to be one of their design goals: microsoft/TypeScript#1573 Not sure how to approach this yet. We could potentially maintain json schemas to validate against. Currently these are being generated on build for the config editor to work with. JSON schema as a format doesn't really lend itself to human read/writability though, and would be a maintenance headache. Would be ideal if we could somehow use typescript interfaces. I scribbled a diagram to help explain these ideas to myself :-) https://drive.google.com/file/d/0B64TWB9qdiOMNzVHQkNnRk9tV1E/view |
Okay, I think we're pretty close to being on the same page (though honestly I ought to spend more time absorbing this -- right now still catching up after two days of VuFind Summit). How would you like to proceed? Is it worth merging this pull request as an incremental step (and so that we can also take care of #168) and then do more refactoring subsequently, or do you want to try to get more of this done here in the PR before proceeding? I suppose it depends on whether you envision the type of inheritance implemented here still being part of the process (i.e. assemble each chunk using inheritance, and then assemble into monolith) or whether you have a different mechanism in mind now that you are thinking more holistically about configuration. |
I think this PR can be merged as an incremental step. Extension-specific configs can still inherit from each other as needed. E.g. the welsh config may only change one element of the English. The difference will be that when inheriting from the Just thinking, with the Welsh example, I think the new process would lead to the inherited config being copied in during build. So you'd open the config editor for the cy-GB locale, the monolithic format would be constructed using the inherited shared and en-GB config (everything extends shared config by default - no "extends" keyword necessary). You'd then edit it and copy and paste the changed config into |
How would we control the specialized merging behavior? Could it be as simple as checking to see if the parent object has a shared section, and if so, behave differently? |
http://jsbin.com/zojune/edit?js,console In that jsbin, we'd want the merged config to exclude the "goHome" node. I think there must be a way to do it with _.merge using the customizer callback... Also, as per the community call discussion, we need a way to set a data-config global override for all locales. This is to avoid a situation where we'd need to create duplicate overrides for every locale e.g:
When bootstrapping, the UV checks for a data-config: https://github.com/UniversalViewer/universalviewer/blob/master/src/Bootstrapper.ts#L194 We need to make this locale-aware so that it only applies config for the current locale. We also need to check for the existence of a "shared" entry, which would be merged on top of the locale-specific config. The data-config contents as discussed can either be the monolithic file format with a "locale" node, or a json dictionary with paths to configs. In the case of receiving a monolithic file, shared is not an option. It's only looked for when using the dictionary. We could probably just check for the presence of a "locale" node to determine whether it's a dictionary or not. |
In the jsbin, why do we not want to allow the extended config to define strings that are not found in the base config? I agree that this could probably be prevented with the appropriate callback function, but I'm not sure I understand the reasoning. What is the harm of adding additional local strings? I agree that looking for the "locale" node should be a sufficient means of differentiating a single config from a dictionary. I can't imagine there's ever going to be a locale named "locale," so I don't see a likely chance of collision. The alternative, though, would be to give the dictionary a distinctive top-level node like "config-dictionary" and then look for that. The advantage to this approach would be improved backward compatibility, since you could assume that a config file with no 'locale' node gets the default locale of 'shared', which would be closer to the current behavior. |
I'm thinking that we want to avoid a situation where we define minimal extension configs using inheritance, then through the process of "monolithification" where everything is flattened into a single file then chunked up again, we fill these files up with their inherited elements, rather than keeping them minimal. When "chunking" we only want to overwrite what's already in these files, not bring along everything they've inherited from elsewhere. Yep, agree regarding "config-dictionary". Maybe just "configs"?
|
Okay, the chunking example definitely makes sense. I think this would have to be done recursively. This seems to do the trick (though it doesn't use _.merge at all): function customizer(a, b) { var out = b; if (_.isObject(a)) { out = {}; for (var i in a) { if (typeof b[i] !== 'undefined') { out[i] = customizer(a[i], b[i]); } else { out[i] = a[i]; } } } return out; } var merged = customizer(config, extConfig); I agree that "configs" would be fine in the dictionary. |
Here's the "monolithification" diagram for reference: https://ipfs.pics/QmZWAB89yZutvSj6utMzdQRUfBmmtMwFHXzX2ZSGVQ4RBY |
This is a first stab at #171. I'm out of time for today and would like to test more before this gets merged, but I thought I'd open a PR now so you can take a glance and comment if you think I'm on the wrong (or right) track between now and the next time I have a chance to work on this.