
Translation capability

Hello. Why can the message group Interface messages be translated by any editor, while the two message groups Workflow states and Interface can only be translated by administrators? Please enable translation of all messages for all translators. Thanks --ديفيد عادل وهبة خليل 2 (talk) 10:10, 24 January 2019 (UTC)

These translation groups are directly in the MediaWiki namespace. The MediaWiki namespace stores all kinds of interface messages, some of which can cause big damage if changed maliciously. To prevent this, the namespace can be edited only by administrators and interface administrators. This is a protection that can’t be removed from individual pages, so it’s technically impossible (at least without major software changes) to make them editable for everyone. You can propose changes on the messages’ talk pages using the {{editprotected}} template. —Tacsipacsi (talk) 00:25, 25 January 2019 (UTC)
@Tacsipacsi: Please look at d:Translations:MediaWiki:Search-nonefound/1/fa and compare it with MediaWiki:POTY-Button-text-full/fa. Both translations have been done using translation extension. I am not an admin on Wikidata, but I could translate the MediaWiki message. I think this is what User:ديفيد عادل وهبة خليل 2 wants to say. 4nn1l2 (talk) 06:45, 25 January 2019 (UTC)
@4nn1l2: Please take a look at MediaWiki:POTY-Button-text-full/fa&action=edit without admin rights (e.g. logged out). It correctly describes the situation I wrote about above. That the Wikidata page can be edited seems more like a bug, but looking at the Phabricator board of the extension, I’m not sure it will ever be fixed. —Tacsipacsi (talk) 22:30, 25 January 2019 (UTC)
Thanks. I reported the issue on Phabricator: phab:T214741. 4nn1l2 (talk) 23:17, 25 January 2019 (UTC)

Language guidelines

There seems to be a lack of general guidelines for languages. I see help pages like mw:Help:Extension:Translate, Help:Autotranslate and {{LangSwitch/doc}}, but nothing that describes our overall philosophy. My personal opinion is that we should follow three different approaches:

  • File descriptions should be in the user's language if possible, using Autotranslate templates, {{LangSwitch}} and internationalized templates like {{Germany}}. But we should avoid mixed language results like "Liturgy for албазинцев in the Русской orthodox mission in Пекине by Чмутов, Иван Иванович".
  • General policy, help etc. pages should be maintained in widely used languages like English, Spanish, French, German, Russian, Arabic, but not in languages that are only spoken by a few million people, because these versions will not be maintained and will only serve to mislead.
  • Country-specific category, portal, help etc pages should be maintained in English and the local languages of the applicable country, but not in other languages. Again, the concern is that pages like Commons:Copyright rules by territory/Canada/ar are unlikely to be maintained.

I could start a work-in-progress essay, e.g. Commons:Language guidelines as a place where we could discuss and document agreed principles, but surely this has already been worked out long ago? Aymatth2 (talk) 18:00, 2 February 2019 (UTC)

Hello!
Possibly related: Commons:Language policy and Commons:Localization.
Regarding “General policy/help pages” − if using the translate extension, then changes to the reference page (the English one) automatically get reflected in translations − see eg (at time of writing) Commons:Licensing/fi#GNU_Free_Documentation_License.
Hope that helps, Jean-Fred (talk) 22:52, 2 February 2019 (UTC)
 
[Image: {{Cat see also}} example]
I think we need more. Commons:Language policy is very short and basic, as is reasonable for a policy. Commons:Localization outlines different ways to internationalize content, but does not give much guidance on choosing and combining approaches. Both skim over objectives, and neither touches on maintainability or discusses issues like mixed languages on one page (see example to the right). Aymatth2 (talk) 17:37, 3 February 2019 (UTC)
Prohibiting translation is a very bad idea. For example, my mother tongue (Hungarian) is spoken by some 10-15 million people, so translating policies into it would be forbidden. But many Hungarians don’t speak any other language, so this would effectively prevent them from complying with any of the policies. Is this really what we want to achieve? Maintainability is less of a problem for pages using the Translate extension (it’s clearly visible which parts are outdated, and it’s easy to check the changes to the English version), so we should urge people to migrate to it instead of stopping them from translating. Mixed-language pages can be prevented in two ways: one is not translating them at all, the other is translating them in full. Script directionality issues can be avoided even without translating anything, just by using correct HTML. Check your example again; it’s better now, isn’t it? I haven’t translated a word, just marked the French-language text as French and left-to-right. —Tacsipacsi (talk) 02:53, 4 February 2019 (UTC)
@Tacsipacsi: To some of your points:
  • I came to this subject from work I was doing on a set of pages like COM:CRT/Iran. There are about 200 of these pages, which change from time to time as laws are revised. If we translated them all into 50 languages, we would have 10,000 versions to maintain. That seems unrealistic.
  • "Prohibit" is too strong a word. I would want to prioritize translation of the widely used policy pages into the widely spoken languages, and place less emphasis on translating location-specific pages into languages rarely used in the location. Guidelines could discuss prioritization.
  • I fully agree that we should encourage conversion of general pages like COM:FOP to the translate extension. It would help a lot to be able to track and flag or fix outdated portions of project pages. Guidelines could explain the benefits of converting to the translate extension.
 
[Image: {{Cat see also}} example revised. Still too many languages]
  • Mixed-language seems to happen mostly when a page holds nested templates, the page is not in English, and the templates ignore the page language and render text in a mix of English and the user's preferred language. Guidelines may help reduce the problem.
Aymatth2 (talk) 19:06, 4 February 2019 (UTC)
  • No one will maintain translations in 50 different languages. If all 200 pages get translated in 50 languages, then 50 people have to maintain 200 pages each. I don’t know how often they change, but I think it’s a manageable quantity. And if they aren’t updated, outdated text is indicated on the page and the English original can be checked.
  • “Should not” seems much like a prohibition. If you want to express priority, please use something like “less important” or “lower priority”; that wording is acceptable to me.
  • Commons template internationalization is designed for file description pages, where there’s no logical page language. Now I changed {{cat see also}} experimentally to respect the page language. This does not work on the main pages like COM:CRT/Iran, however. (Commons:Copyright rules by territory/Iran/ar is also bilingual, of course, as this template isn’t translated to Arabic. The category names are also in English instead of Persian/Arabic.) The current implementation is not as efficient as possible, so even some software changes may be needed before making this change in Module:Autotranslate. —Tacsipacsi (talk) 23:24, 6 February 2019 (UTC)
    • The COM:CRT/country pages are new, but my guess is that a typical page will be changed every 2 years, mostly case history and minor wording changes, with a major revision to the copyright law every 15-20 years. Some pages, e.g. COM:CRT/United States, are much more active. The translators would each see perhaps 2-3 changes per week in total. I agree that should be manageable.
    • My wording was poor. I personally doubt whether anyone will read a Volapük translation of འབྲུག་ རྒྱལ་ཁབ་, but as long as outdated content is clearly flagged to contributors, there is no reason to prohibit or even discourage the effort. Commons:Copyright rules by territory/ast is not flagged, and is very outdated. We need to avoid that type of situation.
    • Perhaps the simplest way to handle short templates in COM pages is to not use them at all. Instead of
    {{cat see also|Iranian FOP cases|lang=fa}} or {{United Kingdom|{{PAGELANGUAGE}} }}
    use
    See also category [[:Category:Iranian FOP cases|Iranian FOP cases]] or United Kingdom
    This can easily be rendered into a clean translation, perhaps including a translation of the category name. The recommended technique and rationale can be described on the guidelines page. Aymatth2 (talk) 14:04, 7 February 2019 (UTC)
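The workload estimates quoted in this thread can be checked with a few lines of arithmetic. This is only a sketch using the figures mentioned above (about 200 pages, 50 languages, a typical page changing roughly once every 2 years); the variable names are illustrative and the inputs are the participants' guesses, not measured data.

```python
# Back-of-the-envelope check of the figures quoted in the discussion.
pages = 200                 # approximate number of COM:CRT pages
languages = 50              # hypothetical number of target languages
years_between_changes = 2   # guessed typical interval between edits to one page

# Total number of page versions if every page were translated into every language.
total_versions = pages * languages

# Changes one translator (one language) would see across all pages.
changes_per_year = pages / years_between_changes
changes_per_week = changes_per_year / 52

print(total_versions)               # 10000
print(round(changes_per_week, 1))   # 1.9
```

On these assumptions each translator would see about two changes per week, before allowing for more active pages such as COM:CRT/United States, which is in line with the "2-3 changes per week" estimate.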

┌───────────────────────┘

I don’t think omitting templates makes anyone’s life easier (except perhaps for the technical people making templates translatable). Templates do make translation easier: the text “See also category” needs to be translated only once, whereas without the template the “See also” text has to be translated five times on Commons:Copyright rules by territory/Iran alone, not to mention the presentation (the template has some styling that makes it easily distinguishable). —Tacsipacsi (talk) 00:06, 8 February 2019 (UTC)

We could make translatable templates for all common words or phrases. The author of a guideline would string them together, as:
{{This|{{PAGELANGUAGE}} }} {{guideline|{{PAGELANGUAGE}} }} {{describes|{{PAGELANGUAGE}} }} {{general principles|{{PAGELANGUAGE}} }} ...
This approach would give strange wording in most languages, and would be very poor in languages where some of the templates did not support the page language. Some templates also add non-standard text formatting. That may work on file description pages, but can give very ugly results on help or guidelines pages, where plain language in the standard text format is almost always all that should be used. Aymatth2 (talk) 13:27, 8 February 2019 (UTC)
Taking the specific example of {{See also category}}, the present template supports a very limited list of languages, does not handle cases where the word sequence is different from the English sequence ("See catname category also") and can never handle cases where the number or gender of "category" must agree with that of catname. The template may still be useful on content pages, but not in its present form on project pages. Guidelines are needed to explain why. Aymatth2 (talk) 23:29, 8 February 2019 (UTC)
Your example above has hardly anything in common with the way {{cat see also}} works: the latter translates a sentence, not words, and supports any word sequence. (E.g. in Hungarian it currently says “Lásd még a következő kategóriákat: <list>”, but it’s technically entirely possible to change it to “Lásd még a <list> kategóriákat”. That may sound more natural, but it moves the emphasis, which I don’t like; again, there is no technical problem with it. Just check the translation page; I made it freely translatable when migrating from the old {{LangSwitch}}-based translation system.) I don’t know how gender could be matched to the (English) category name, but I’m sure it’s technically feasible to implement this in the template as well. And if we do, translators have fewer translation units to maintain. If a category name is hard-coded in the translation unit, then every time it’s modified for some reason, all translations have to be updated with exactly the same change; if it’s translated using a template, no translations need to be changed by hand. —Tacsipacsi (talk) 01:17, 9 February 2019 (UTC)
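The difference between translating whole sentences and stringing together translated words can be illustrated outside wikitext. The sketch below is not how {{cat see also}} is implemented; it is a hypothetical Python analogue in which each language supplies one sentence pattern with a placeholder, so the list can sit wherever that language needs it. The Hungarian strings are the two quoted above; the English pattern, the `MESSAGES` table and the `see_also` function are invented for illustration.

```python
# Hypothetical message table: one whole-sentence pattern per language,
# with a named placeholder for the category list.
MESSAGES = {
    "en": "See also categories: {categories}",  # assumed wording, for illustration only
    "hu": "Lásd még a következő kategóriákat: {categories}",
    # Alternative Hungarian word order from the comment above: list mid-sentence.
    "hu-alt": "Lásd még a {categories} kategóriákat",
}


def see_also(lang: str, categories: list[str]) -> str:
    """Render the sentence for `lang`, falling back to English if untranslated."""
    pattern = MESSAGES.get(lang, MESSAGES["en"])
    return pattern.format(categories=", ".join(categories))


print(see_also("hu", ["Iranian FOP cases"]))
print(see_also("hu-alt", ["Iranian FOP cases"]))
```

The fallback to English for an untranslated language is also what produces the "scraps of English" raised as a concern elsewhere in this thread.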
The new version of {{Cat see also}} is certainly a big improvement, a positive outcome from this discussion. It is useful to have templates like this for the millions of content pages. With the much smaller number of project pages, simple natural language is essential. Before adding {{Cat see also|Russian FOP cases}} to a project page we would need assurance that this would always render text in the language of each translation of the page: no annoying scraps of English... But this is drifting far from the original question of whether language guidelines are needed. Aymatth2 (talk) 14:56, 9 February 2019 (UTC)
There is no guarantee that the template has a translation in every language the project page is translated into. But there is no guarantee, either, that the project page itself (i.e. the translation units placed directly there) is 100% translated, and this leads back to the question of whether we want to solve this issue by prohibiting/discouraging certain techniques or by urging people to translate more pages. But everything should be translated only once, or else it will inevitably diverge, and different policy pages will say the same thing in different ways. Such navigational links can’t be more natural, as this type of text is not natural by design. Free text, of course, should be translated as a whole, and as such can be translated only on that very page, but see-alsos, navboxes, infoboxes etc. need uniform wording, which can be achieved by using a template.
Back to the original question: why isn’t Commons:Language policy enough? Even if it isn’t, it should be expanded instead of creating yet another policy; there are already too many. —Tacsipacsi (talk) 16:59, 9 February 2019 (UTC)

┌─────────────────────────────────┘
Policies, guidelines and help pages are different:

  • Policies define standards that everyone must follow. Commons:Language policy is very short, as it should be. It says little about why and nothing about how.
  • Guidelines give advice and recommendations, and are mostly about why and how. They reflect the views of most editors, but are not binding.
  • Help pages give detailed instructions on how to do a certain task or use a particular tool. Commons:Localization is something of an overview of help pages.

We are missing guidelines for internationalization and localization, saying what we are trying to achieve with content and project pages, what problems to avoid, and what techniques and tools are best in different situations, with examples. Aymatth2 (talk) 18:14, 9 February 2019 (UTC)

A small translation help

I am not sure where to ask this question. I would like to know the English translation of the text:

Pokédex de Pokémon Rouge et Bleu, Game Freak, Nintendo, 8 octobre 1999, Game Boy. Adithyak1997 (talk) 12:10, 3 March 2019 (UTC)
@Adithyak1997: it is French for "Pokédex of Pokémon Red and Blue, Game Freak, Nintendo, 8 October 1999, Game Boy." --HyperGaruda (talk) 18:51, 10 March 2019 (UTC)

Conversion job

I am thinking of converting COM:Copyright rules by subject matter, COM:Copyright tags and COM:Derivative works to Extension:Translate. These three pages started life around 2007, quickly became quite large, and were translated into 15 or more languages. The English versions continued to evolve. It is not clear in the translated versions what is missing and what is outdated. With the Translate extension, the missing bits would show through in English and the outdated bits would be highlighted.

The versions are:

Page                                    Size/en        Languages
COM:Copyright rules by subject matter   54,031 bytes   14
COM:Copyright tags                      40,549 bytes   29
COM:Derivative works                    40,080 bytes   14

The crude approach for each page would be:

  • Make copies of the source of each translation
  • Mark the page for translation
  • For each language:
    • Start to translate
    • Using Google translate, check each chunk of text in the old version to see if it seems to match a chunk of text in the version being translated
    • If so, paste it into the translated version
  • After completion, ask for volunteers to check the results in each language
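The matching step in the list above could be partly mechanized. The following is a rough sketch, assuming the old translation has already been split into chunks and back-translated into English by some external tool (that step is not shown). The function name, the 0.6 threshold and the sample strings are illustrative, and `difflib` gives only a crude character-level similarity, so low-confidence pairs would still need a human check.

```python
from difflib import SequenceMatcher


def match_chunks(english_units, back_translated, threshold=0.6):
    """Pair each back-translated chunk with its most similar English unit.

    Returns a dict {chunk_index: unit_index}. Chunks whose best similarity
    score is below `threshold` are left out so a person can examine them.
    """
    matches = {}
    for i, chunk in enumerate(back_translated):
        scores = [
            SequenceMatcher(None, chunk.lower(), unit.lower()).ratio()
            for unit in english_units
        ]
        if not scores:
            continue  # nothing to compare against
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] >= threshold:
            matches[i] = best
    return matches


# Illustrative input only; real chunks would come from the old translation.
units = ["Buildings", "Coins and banknotes"]
chunks = ["coins and banknotes", "zzzz"]
print(match_chunks(units, chunks))
```

Chunks that find no match are simply omitted from the result, which fits the expectation that some old translations have little connection to the current English page.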

Any comments on approach, tools, risks etc.? Thanks, Aymatth2 (talk) 19:48, 10 April 2019 (UTC)

Great idea. Instead of copying the source of translated versions (where?), I think it’s better to rename them to the name used by the extension (e.g. Commons:Copyright tags/de instead of Commons:Lizenzvorlagen; admin help might be needed in some cases to overwrite redirects, but no translation admin right at that point), and just check the revision before FuzzyBot’s edit for the pre-Translate version. Maybe it’s also worth leaving out languages from the Google Translate step for which someone speaking the language promises to do the copying shortly after the switch (I can volunteer with Hungarian and also German if no native German wants to do it). Google Translate is a good tool, but not 100% error free, so not using it when it’s not necessary helps to avoid mistakes in extreme cases. —Tacsipacsi (talk) 16:55, 11 April 2019 (UTC)
@Aymatth2, Tacsipacsi: What about Special:PageMigration? See mw:Help:Extension:Translate/Page translation administration#Migrating to page translation. --jdx Re: 17:45, 11 April 2019 (UTC)
I tried it once a while ago, and I wasn’t satisfied with it. I don’t remember exactly what my issues were, but as far as I remember, it was not fast enough to compensate for its loss of flexibility. By the way, I think this is a minor implementation question that doesn’t have to be decided here; the user implementing the change is free to choose the tools to be used. —Tacsipacsi (talk) 19:39, 11 April 2019 (UTC)
To the above:
  • After FuzzyBot’s edit I can still see the historical version, but the "Edit" tab has been replaced by "Translate", so I cannot see the source version with wiki mark-up. Copying to, e.g., COM:Copyright tags/de/20190411 version, would preserve an accessible copy of the source.
  • Commons:Lizenzvorlagen should become a redirect to Commons:Copyright tags/de, rather than vice-versa. Good point. Easy to fix, I think.
  • I am thinking of Google translate only as a way of finding what a chunk of text seems to be about. So I find "Budynki zbudowane przez architekta, który co najmniej 70 (lepiej 100) lat temu" in the Polish version, and Google tells me that means "Buildings built by an architect who is at least 70 (better 100) years ago". This is obviously not a very good translation, but the Polish text may fit under the "Buildings" section in the new translation into Polish. If a sizable chunk of text seems to match the current English version, it seems worth preserving it.
  • My guess is that in many cases the new "translated" versions will not have much non-English content, either because the former translation had little substance, or because it had little connection to the current English version. For example, Commons:Copyright rules by subject matter/tr is all about US government work, not mentioned in the English version. Still, I would hate to throw away translation work that has been done in the past.
  • I will experiment with Special:PageMigration, but I have a feeling that the structure of the English and translated versions has drifted quite far apart, so it may not be very effective.
  • I will advertise here when I have finished the first stage, asking for volunteers to fix up the translations. @Tacsipacsi: I think you volunteered! Aymatth2 (talk) 20:22, 11 April 2019 (UTC)
  • Yeah, it’s not easy to see the source code of the previous version, but not impossible, so creating extra pages is not needed (and they just pollute the Commons namespace, so please don’t do it). I use Navigation Popups gadget, so I can just hover over the old version’s link (e.g. on the page history page), and click Actions → Edit this version. If you don’t use it (by the way, you should give it a try if you haven’t yet, it really speeds up things!), you can append &action=edit to the URL of the old version. It’s read-only, but that’s OK for us. —Tacsipacsi (talk) 09:20, 12 April 2019 (UTC)
  • That works! Thanks for the tip. Aymatth2 (talk) 13:14, 12 April 2019 (UTC)
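The URL trick described above (adding action=edit to the address of an old revision) can be sketched as a small helper. This assumes the usual `index.php?title=…&oldid=…` form of MediaWiki URLs; the function name and the revision number in the example are made up for illustration.

```python
from urllib.parse import urlencode


def old_revision_source_url(
    title: str,
    oldid: int,
    base: str = "https://commons.wikimedia.org/w/index.php",
) -> str:
    """Build a URL that opens a past revision in the read-only source view."""
    query = urlencode(
        {
            "title": title.replace(" ", "_"),  # page titles use underscores in URLs
            "oldid": oldid,                    # revision to look at
            "action": "edit",                  # show wikitext instead of rendered page
        }
    )
    return f"{base}?{query}"


# 12345 is a placeholder, not a real revision ID.
print(old_revision_source_url("Commons:Copyright tags/de", 12345))
```

Opening such a URL shows the wikitext of the pre-migration revision, which can then be copied into the new translation units.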

┌─────────────────────────────────┘
For COM:Copyright tags the job is complicated by the fact that English descriptions of tags have been moved out to pages like COM:CRT/Germany, and are now transcluded back into the English COM:Copyright tags. I think the correct way to handle this is to:

This is quite a large job, but given the importance of up-to-date information on tags, I think it is worthwhile. Any comments on the approach are welcome. Aymatth2 (talk) 00:58, 16 April 2019 (UTC)

COM:Copyright rules by subject matter and COM:Derivative works have now been converted to the translation extension format and are ready for review by translators. With COM:CSM, it turned out that most of the content in the translations matched content that had been moved to Commons:Copyright rules#Simple checklist, so I moved the translations there.

Question: Should I make null edits to the English versions so all the translated versions are flagged as needing review? Aymatth2 (talk) 13:43, 18 April 2019 (UTC)

@Aymatth2: Thanks for the conversion! I don’t think null edits would work; what does work is prepending !!FUZZY!! to the individual translated messages. It’s slower, but at least it makes fine-grained marking possible (e.g. you can skip French translations if you know they are correct). —Tacsipacsi (talk) 12:11, 19 April 2019 (UTC)
Thanks - that is much more precise. It just took a couple of minutes to go back over COM:Derivative works/es and flag most of the converted text for review. I will use it going forward. Aymatth2 (talk) 12:40, 19 April 2019 (UTC)
@Aymatth2: Just to make it a little bit easier for you, I will do it for Persian pages myself, so you can skip them. Thanks for all your efforts. 4nn1l2 (talk) 12:57, 19 April 2019 (UTC)
Thanks. I have marked up the other language versions, but my computer is no good for marking up Persian script. Aymatth2 (talk) 13:59, 19 April 2019 (UTC)
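The fine-grained marking suggested above, prepending !!FUZZY!! to selected translated messages, amounts to a simple string operation. The sketch below works only on text held in memory; the keys are hypothetical unit names, and saving the results back to the wiki is not shown.

```python
FUZZY = "!!FUZZY!!"


def mark_fuzzy(units, skip=()):
    """Return a copy of `units` with the marker prepended to each text.

    `units` maps a unit name to its translated text. Units named in `skip`
    (e.g. translations known to be correct) and units that already carry
    the marker are returned unchanged.
    """
    return {
        key: text if key in skip or text.startswith(FUZZY) else FUZZY + text
        for key, text in units.items()
    }


# Hypothetical unit names and texts, for illustration only.
units = {"1/es": "Hola", "1/fr": "Bonjour", "2/es": "!!FUZZY!!Adiós"}
print(mark_fuzzy(units, skip={"1/fr"}))
```

Checking for an existing marker keeps the operation safe to repeat, so a second pass does not stack markers on units that were already flagged.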

FuzzyBot & language-specific shortcuts

FuzzyBot edits like this are not helpful, as they copy shortcuts that are specific to the source (English) page. Can they be excluded? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:54, 13 April 2019 (UTC)

They could, using some magic like {{#ifeq:{{PAGELANGUAGE}}|en|{{Shortcut|COM:WIkiVIP|w.wiki/Li}}|}}. But I don't think they should. BTW, w.wiki/Li doesn't seem to be a shortcut. --jdx Re: 02:34, 14 April 2019 (UTC)
I think most shortcuts are for use in discussions. They have to point to one language version so everyone is looking at the same text. I see no reason why a shortcut to an English page should not be advertised on an Arabic page, or for that matter why a shortcut to an Arabic page should not be advertised on an English page. Perhaps the language should be indicated, like COM:WikiVIP (en). Aymatth2 (talk) 02:19, 16 April 2019 (UTC)

Commons:Categories/sv translation overwritten with English

I just noticed that the translation I worked quite hard to create six years ago was just overwritten with an English version by FuzzyBot after Aymatth2 moved the page. Is this intentional? LX (talk, contribs) 08:10, 11 May 2019 (UTC)

  • @LX: It is temporary, a stage in migrating to the current translation extension, and should be fixed in a few hours. The result may need clean-up, since material that has been added to the English version since the translation was done will now "show through" into the Swedish version. Aymatth2 (talk) 12:28, 11 May 2019 (UTC)
  • @LX: Done now. You may want to check the result: in some places the English and Swedish versions may now differ. I should have put a note on the talk page explaining that the migration could take a day or two. My mistake. Aymatth2 (talk) 13:35, 11 May 2019 (UTC)

Traditional Chinese

Language zh-hant – 中文(繁體) – is not supported by the translate extension, despite being standard in Hong Kong and Taiwan. Translations into zh-hant do not appear on the <languages/> bar along with their siblings, and are likely to drift out of sync. Is there a work-around? Aymatth2 (talk) 16:23, 12 May 2019 (UTC)

Chinese should now be translated only once (with the zh language code) and then automatically converted between variants (zh-hant, zh-hans etc.); the variant can be selected in a drop-down menu next to the talk page link (in the Vector skin). This is convenient for Chinese translators and readers, but hard to handle for non-Chinese translation admins. I don’t know of a good workaround. —Tacsipacsi (talk) 22:12, 15 May 2019 (UTC)
@Tacsipacsi: Is this documented somewhere? I have the problem of migrating existing translations to the translate extension when there are zh-hant and zh-hans versions. Usually it will be obvious which is closest to the current English version. Do I just migrate that one, regardless of its character set? Aymatth2 (talk) 22:54, 15 May 2019 (UTC)
Commons:删除政策 and Commons:刪除守則 illustrate the problem. The second is obviously the more current translation. When migrating this page to the translate extension, is there some mark-up to show the language converter it is in, e.g., zh-tw? Aymatth2 (talk) 16:24, 17 May 2019 (UTC)
I don’t know how it works in that depth; maybe it just tries to guess which variant the text is written in? I don’t speak any Chinese, so I have no idea about the differences; perhaps the character sets are disjoint. And yes, it’s documented somewhere (apart from the “Translate in zh please” text on Special:Translate), but I don’t recall where I read it. —Tacsipacsi (talk) 14:52, 18 May 2019 (UTC)
@Tacsipacsi: Thanks. You made me look further. I found that the converter to swap to the user's preferred representation is in place, but did not find if translations need tags. As I understand it, the character sets do overlap. "Chinese" is 中文 in both. Also, people in Hong Kong and Taiwan might use different characters for the same word. The converter may have to be told which variant the text uses. I will just migrate the version that seems most up-to-date, leave a "see also" link to the other version, and hope someone will add mark-up later, if needed. Aymatth2 (talk) 16:01, 18 May 2019 (UTC)
Just possibly the solution is to wrap the text in, e.g., <div lang="zh-tw">...</div>. Aymatth2 (talk) 22:42, 23 May 2019 (UTC)