January 20 CX Update: More fixes for page loading and template editor

Hello, and welcome to another CX update post, in which I am happy to report several significant bug fixes.

  • Pages that had full stops (⟨.⟩) in headings couldn’t be loaded after auto-saving and closing the browser tab. This is now fixed. It is a follow-up to a similar bug whose fix was reported last week. If you still have issues with loading saved pages, please report them. (bug report)
  • Adapted infoboxes would often say “Main Page” at the top, no matter which page was being translated or into which language. This could also happen with other kinds of templates: it affected pages with templates that used the {{PAGENAME}} magic word. This is now fixed, and the auto-adapted template now shows the relevant page name. (bug report)
  • An unnecessary horizontal scrollbar was shown on some pages that had wide tables. It was removed. (code change)

 

January 6 CX Update: Fixes for page loading and template editor

Hello, and welcome to the first CX Update post of 2017!

We just deployed two significant fixes:

  • Many users complained recently that they could not load a translation in progress that was auto-saved. This was happening when translating from languages that aren’t written in the Latin alphabet, such as Russian or Chinese. The data was not lost: it was correctly saved internally, but a software error prevented it from loading properly. This should now be fixed, although some more work may be needed to make it more stable. If you experienced this issue, please try loading your article now. If it still doesn’t work, please report it. We apologize for this inconvenience. (bug report)
  • The template editor remained open when moving to the next section. This was confusing because some people didn’t realize that it has to be closed to actually save the data entered in the fields. It will now close automatically when moving to edit the next section. (bug report)

December 18 CX Update: A new help page and a fix for Bengali

Hello,

Continuing the topic of the new template editor from previous weeks, let me introduce the new detailed and illustrated help page for the template editor. It includes useful information for article translators and for template maintainers. We would really appreciate it if you could bring this to the attention of the template maintainers in your wiki, and also if you could translate it into your language.

Other than that, translating into Bengali was broken this week because of a subtle problem with the handling of a Bengali Unicode character in the title of the special page. This is now fixed. We apologize for the inconvenience. (bug report)

Unless something surprising happens, this is the last CX update post for 2016. Happy new year! Exciting changes are planned for 2017.

December 8 CX Update: Publishing and template editor fixes

Last week we reported here about the deployment of the new Content Translation template editor. Yesterday we also published an expanded post about the new template editor on the official Wikimedia blog.

This week brings a couple of bug fixes in the template editor and other issues:

  • The template editor for inline templates, such as IPA or unit conversion, was expanding to fill the whole screen. It now has a reasonable size that doesn’t cover the whole page. (bug report)
  • Some pages couldn’t be published and showed a “docserver-http” error. This is now fixed. If your article was stuck and you couldn’t publish it, refresh the translation interface and publish it again, and it should work. If you still have issues with publishing, please report them to the CX feedback page. (bug report)

This is an opportunity to remind you that the user interface of Content Translation itself needs to be translated, especially now that the new template editor has several new important messages. Please check the statistics for your language at translatewiki.net, and bring it to 100%. Thanks!

November 30 CX Update: New Template Editor

Content Translation is getting a major new feature: completely rewritten support for templates. It has been in design, testing and development since June 2016. The first version of this feature was released today, Wednesday, November 30, to Wikipedia in Catalan and Hebrew, and it will be released tomorrow, December 1, to Wikipedia in all languages.

The goal of this new feature is to make it easy to translate templates across languages.

We want to give more control to all the people who use the Content Translation feature directly or are affected by it: translators, other editors of articles that were created as translations, and template maintainers.

Templates are used heavily in all Wikimedia projects. When Content Translation’s development started in 2014, the developers gave it very basic template support. Templates that used a whole paragraph, such as infoboxes and long quotations, were usually skipped completely. Shorter templates inside paragraphs, such as references, unit conversions, quotes in other languages, “citation needed”, etc., were adapted to a corresponding template in the target language when possible, or substituted with wiki syntax.

While this was useful for the creation of well over 100,000 new articles in many languages, it was far from perfect. It was confusing that infoboxes and whole paragraphs of quotations were not shown during the translation, and they had to be inserted manually after creating the first version of the translated article. References were frequently adapted incorrectly, which inserted a lot of hard-to-maintain wiki syntax.

We are now starting to address this issue by letting translators choose what to do with each template. No templates are silently ignored now, so infoboxes and all other templates are shown in the source article column during the translation. When clicking on a template, a card on the sidebar will let the translator choose what to do with the template. It’s possible to skip a template entirely (“Skip template”) or to insert the wiki syntax of the template as it appears in the original language (“Keep original template”). If an equivalent template is available in the target language, it will be possible to insert it and edit the parameters one by one (“Use equivalent template”).

The template editor, while translating the article Shalom Meir Tower from English into Catalan. All the parameter names are shown, and can be added one by one. After adding all the needed parameters, close the editor and the template will be shown.

If the equivalent templates have the same parameter names, their values will be copied automatically. If the parameter names are different, but the template in the target language has TemplateData defined with names of parameters and aliases that are the same as the parameter names in the source language, they can also be adapted automatically. You can read more about TemplateData at mediawiki.org.
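
The matching rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual Content Translation code: the function name `adapt_parameters` and the Catalan parameter names in the example are hypothetical, and the only thing assumed about TemplateData is its documented `params` object with an optional `aliases` list per parameter.

```python
from typing import Dict, List, Tuple


def adapt_parameters(
    source_params: Dict[str, str], target_template_data: dict
) -> Tuple[Dict[str, str], List[str]]:
    """Map source parameter values onto target parameter names.

    A source name matches a target parameter if it equals the target
    parameter's own name or one of its TemplateData aliases.
    """
    # Every accepted name (canonical or alias) points to the canonical name.
    lookup: Dict[str, str] = {}
    for name, spec in target_template_data.get("params", {}).items():
        lookup[name] = name
        for alias in spec.get("aliases", []):
            lookup.setdefault(alias, name)

    adapted: Dict[str, str] = {}
    unmatched: List[str] = []
    for source_name, value in source_params.items():
        target_name = lookup.get(source_name)
        if target_name is None:
            # No equivalent found: left for the translator to handle manually.
            unmatched.append(source_name)
        else:
            adapted.setdefault(target_name, value)
    return adapted, unmatched


# Hypothetical TemplateData of a target-language infobox.
template_data = {
    "params": {
        "nom": {"aliases": ["name"]},
        "imatge": {"aliases": ["image"]},
        "peu": {},
    }
}
source = {
    "name": "Sofia Kovalevskaya",
    "image": "Sophie Kowalevski.jpg",
    "birth_place": "Moscow",
}
adapted, unmatched = adapt_parameters(source, template_data)
print(adapted)
print(unmatched)
```

In this sketch, `name` and `image` are carried over because the target template lists them as aliases, while `birth_place` ends up in the unmatched list and would have to be filled in by hand.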

The template, inserted after translation. Notice that the template is rendered during the translation and the differences between the design in the different languages are easy to see.

Wikis have people who develop and maintain the templates in them. This is also an opportunity for all wikis—large, medium, and small—to take a look at their templates and improve them. Here are several things that can be done:

  • Add TemplateData (link: https://www.mediawiki.org/wiki/Help:TemplateData) to templates that don’t have it yet. This will allow Content Translation and Visual Editor to show template insertion and editing forms where all the parameters are displayed conveniently.
  • Consider adding aliases for template parameter names that correspond to parameters in wikis in other languages from which articles are frequently translated into your language. You can see from which languages articles are translated most often into yours by going to the page Special:CXStats in your wiki.
  • Consider making the types of parameters more similar across languages. For example, in some languages images are provided as complete file links (e.g. {{Infobox person|image=[[File:Sophie Kowalevski.jpg|thumb|300px|Sofia Kovalevskaya, 1880]]}}) and others have separate parameters for file name, size and caption (e.g. {{Infobox person|name=Sofia Kovalevskaya|image=Sophie Kowalevski.jpg|image_size=300|caption=Sofia Kovalevskaya, 1880}}). Making the parameter structure similar to the structure in the language from which articles are often translated will make the work considerably more efficient for translators and article maintainers.
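
To show why the difference between the two image styles in the last item matters, here is a rough Python sketch of what converting one into the other involves. It is an illustration only and not part of Content Translation: the helper `split_file_link` is hypothetical, it recognizes only the English `File:` and `Image:` prefixes, and it gives up on captions that contain nested links.

```python
import re
from typing import Dict, Optional

# Matches a simple file link such as [[File:Name.jpg|thumb|300px|Caption]].
FILE_LINK = re.compile(r"^\[\[(?:File|Image):([^|\]]+)((?:\|[^|\]]*)*)\]\]$")

# Common layout keywords that are neither a size nor a caption.
LAYOUT_OPTIONS = {
    "thumb", "thumbnail", "frame", "frameless", "border",
    "left", "right", "center", "none",
}


def split_file_link(value: str) -> Optional[Dict[str, str]]:
    """Split a complete file link into separate template parameters.

    Returns None when the value is not a simple file link.
    """
    match = FILE_LINK.match(value.strip())
    if not match:
        return None
    result = {"image": match.group(1).strip()}
    # group(2) starts with "|", so the first split element is empty.
    for option in match.group(2).split("|")[1:]:
        option = option.strip()
        size = re.fullmatch(r"(\d+)px", option)
        if size:
            result["image_size"] = size.group(1)
        elif option in LAYOUT_OPTIONS or not option:
            continue
        else:
            result["caption"] = option
    return result


link = "[[File:Sophie Kowalevski.jpg|thumb|300px|Sofia Kovalevskaya, 1880]]"
print(split_file_link(link))
```

Even this small case needs guesswork about which option is a size, a layout keyword or a caption, which is why a shared parameter structure is easier on translators than automatic conversion.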

As noted earlier, this is only the first release of this feature. Templates on Wikimedia projects are very diverse, and while the developers tested the new template editor with many templates in many languages, it is impossible for us to test it with all the different templates: there are just too many of them. Because of this, it may be impossible to adapt some templates at first. As always, we’d love to hear from you about templates that can’t be adapted, and about other bugs. We nevertheless believe that this feature is already an improvement over the way that templates were handled until now, and we are continuing the development to make template translation easier and more efficient based on your input.

You can read more about the design and the development of this feature, as well as details of its future improvements, in Phabricator task T139332.

August 29 CX Update: Easier machine translation control, fewer saving errors, and more wiki syntax and templates clean-up

Highlights of recently deployed Content Translation changes:

  • One of the most common complaints about the Content Translation editing interface was that it’s too easy to remove a paragraph and there is no way to undo it. The button that removes the paragraph was in the “Automatic translation” card, which confused many translators. To address this, this card was completely redesigned, to make editing and configuring machine translation easier. (task description)
  • For several days links to foreign languages were inserted instead of internal links. This was fixed. (bug report)
  • ISBN links were frequently added with <nowiki> tags. This is now fixed. (bug report)
  • Some users couldn’t save translations and saw an “Internal database error”. This was fixed. (bug report)
  • Many fixes were made for common citation templates in Spanish, Portuguese, Polish, Welsh and other languages (see T142753 for an example of such a fix). This is a step towards generally more robust support for template adaptation (in progress), which will give translators and wiki editing communities more flexibility, ease and control of the translated content.

June 24 CX Update: Cleaner wiki syntax, better AbuseFilter support, and more improvements

Welcome back to CX updates!

For some time the development team took a break from developing Content Translation frontend features to focus on some background fixes and on other projects that were on the back-burner. Now we are back to making major updates to our article translation platform.

The areas on which we are focusing at the moment and for the next couple of months are making the wiki syntax of the published pages cleaner and easier to maintain after the first version of the translated article is created, and making template and reference adaptation more stable. There is much to do there, but here are some changes that were already deployed:

  • If there was no corresponding template in the target language, but there was a template with the same name, it was used for adapting the template to the translation. This was wrong and sometimes completely unrelated templates were adapted, creating confusing content. This will not happen any longer, and only templates that are directly linked using an interlanguage link in Wikidata will be used for adaptation. (bug report)
  • Some pages were published with HTML tags with ContentTranslation-specific attributes such as “data-cx-draft”, “cx-segment”, “cx-link” and others. They are unnecessary in articles, and had to be removed manually by editors. This was fixed and is not supposed to happen any longer. (bug report)
  • Adapting references of some kinds was generating errors, and it made it impossible to publish a translation. This was fixed. (code change)

Some other things we worked on recently:

  • All messages generated by AbuseFilter were shown while writing a translation. This included some messages that don’t affect translation publishing, and this was very confusing. Now only warnings that affect page publishing are shown. (bug report)
  • Some users were seeing a list of gray interlanguage links that was too long to be useful. Its length is now limited to three items. (bug report)
  • When support for a new machine translation engine is added to a language pair, it will be shown as a tip in the Automatic translation card in the sidebar. (task description)
  • Translation from namespaces other than the article namespace was sometimes failing when the namespace name was translated in the other wiki. In particular, this affected the Medical Translations Projects. This is now fixed. (bug report)
  • A pop-up window that invites users to create an article in Content Translation was shown when creating user pages using the Visual Editor. Content Translation is not intended for user pages, so we no longer show this pop-up on user pages. (bug report)
  • Some language codes, most notably Norwegian, were handled incorrectly because of inconsistencies between the actual language code and the domain code that the Wikipedia uses. We now normalize language codes. (bug report)
  • Using the “Clear paragraph” button could generate errors that prevent publishing. This was fixed. (bug report)
  • Paragraph-level parallel corpora are now fully accessible through an API. We are also preparing to make dumps of parallel corpora available for download. This should be useful to all machine translation developers and researchers.
  • The gray interlanguage links that suggest translation to a different language were not shown in Internet Explorer. This was fixed. (bug report)
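
The language code normalization mentioned in the list above can be pictured with a small Python sketch. It is a simplified illustration under stated assumptions, not the mapping that Content Translation itself uses: the table lists only a few well-known cases in which a Wikipedia domain code differs from the language code (for example, the Norwegian Bokmål Wikipedia lives at the “no” domain while its language code is “nb”), and the function names are hypothetical.

```python
# Illustrative subset: Wikipedia domain code -> language code.
DOMAIN_TO_LANGUAGE = {
    "no": "nb",            # Norwegian Bokmål
    "zh-yue": "yue",       # Cantonese
    "zh-min-nan": "nan",   # Min Nan
    "zh-classical": "lzh", # Classical Chinese
}

# Reverse lookup: language code -> Wikipedia domain code.
LANGUAGE_TO_DOMAIN = {
    language: domain for domain, language in DOMAIN_TO_LANGUAGE.items()
}


def normalize_language_code(code: str) -> str:
    """Return one canonical language code for either kind of input."""
    cleaned = code.strip().lower().replace("_", "-")
    return DOMAIN_TO_LANGUAGE.get(cleaned, cleaned)


def wikipedia_domain(code: str) -> str:
    """Return the Wikipedia host name for a language or domain code."""
    language = normalize_language_code(code)
    subdomain = LANGUAGE_TO_DOMAIN.get(language, language)
    return f"{subdomain}.wikipedia.org"


print(normalize_language_code("no"))  # nb
print(wikipedia_domain("nb"))         # no.wikipedia.org
print(wikipedia_domain("ca"))         # ca.wikipedia.org
```

Normalizing once at the boundary means the rest of the code can compare language codes directly, and the domain code is looked up only when a URL is built.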