Composr Tutorial: Localisation and internationalisation

Written by Chris Graham (ocProducts)
This tutorial is designed as a comprehensive guide to Composr's translation features, written for people wanting to make a complete Composr translation and understand the full technical details. We also have a simpler tutorial.

Composr has great support for internationalisation, including:
  • time-zones
  • translation of text into different languages (.ini files)
  • translation of text into different languages (Comcode pages)
  • translation of text into different languages (text files)
  • translation of images into different languages (e.g. buttons with some text on)
  • different character sets (for example, Cyrillic)
  • different locales, for different numbering systems (for example, European comma and decimal-point difference)
  • translation of content into different languages


Time-zones

In Composr, time-zones can be adjusted in two ways:
  1. adjusting the site time-zone (this is a site configuration option). This sets the default, and will typically be the time-zone where your organisation is mainly based.
  2. adjusting member time-zones in their account settings (Conversr only). This lets individual members set their own time-zones.

You can also have guest time-zones auto-detected (a configuration option). This is a good idea if your visitors are geographically-distributed.
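
The precedence between these settings can be sketched as follows. This is only an illustration and not Composr's own code: the function name, its parameters, and the assumption that a detected guest time-zone is only used when auto-detection is turned on are all hypothetical.

```python
from typing import Optional


def effective_timezone(
    site_tz: str,
    member_tz: Optional[str] = None,
    detected_guest_tz: Optional[str] = None,
    auto_detect_guests: bool = False,
) -> str:
    """Pick the time-zone to display times in (illustrative precedence only)."""
    # A logged-in member's own setting wins over everything else.
    if member_tz:
        return member_tz
    # Guests may get an auto-detected time-zone, if that option is turned on.
    if auto_detect_guests and detected_guest_tz:
        return detected_guest_tz
    # Otherwise fall back to the site-wide default.
    return site_tz


print(effective_timezone("Europe/London", member_tz="Asia/Tokyo"))
```

The time-zone names here are ordinary tz database identifiers chosen as sample values.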

Translations

A Composr website automatically customises the user interface based upon the language it is being viewed in. This involves a number of language elements, but primarily language files (which include language strings). Most text in the user interface is derived from some kind of language string, which is the basic translation building block. Language strings are parameterised so as to be able to insert dynamic data within translatable sentences and paragraphs. More details are provided under "Language file format (technical overview)".

Coordinating your translation

The core development team really wants Composr to be widely used by people in any language, but does not get involved in maintaining or developing individual translations (other than the standard English) due to a lack of language-specific knowledge and resources. Instead we maintain documentation and architecture, to empower contributors and help automate delivery, and we grant necessary access to translators.

If you have any feedback on how translation can be easier then please report it in the Internationalisation forum. An example bit of feedback to report might be if the exact grammatical context of an English language string is not clear (i.e. ambiguous), and you want us to add a description for that string. Another example is if a string is formulated in such a way that it is impossible for you to translate with grammatical correctness.

Internationalisation can be difficult and time-consuming if someone has not already worked on a translation for your language. We therefore recommend that you try to plan ahead and bring together a team from your country to make translations go faster.

On the other hand, if the translation was substantially completed already it might be as simple as installing it from the Composr addon directory.

Tip

Not all language files need to be translated, and language files do not have to be complete. If a string cannot be found in the active language, Composr falls back to looking it up in the English language pack.

This may hurt your sense of 'completeness', but practically speaking most sites don't expose most strings, and some are only ever seen in rare circumstances or by website staff.

Language files

Language file format (technical overview)

This section will describe the format used to store language strings in Composr. In theory, learning this is not necessary, as a module in the Admin Zone is provided that works with this behind-the-scenes; however it is useful to know, especially if you wish to work through the language files in a text editor.

Composr language packs are made up of .ini files (the language files), containing mappings between special codes (based on the English) and the actual string as displayed. For example, a common string in the global language file (the one containing common strings used throughout the portal), is coded as:

Code (INI)

PROCEED=Proceed
 
We use multiple files so that you only need to translate what you actually use and to keep things cleanly separated by addon. For example if you are not using the galleries addon, you shouldn't need to translate the galleries.ini strings.

Composr is developed in (British) English, and this is technically known as the 'fall-back language', because English always has a complete set of language files and strings. If the active language defines all strings then the fall-back language is never even loaded – which is different to the approach of other systems such as po.
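
To make the fall-back idea concrete, here is a minimal sketch. It is not Composr's implementation: the parser only understands simple KEY=value lines (skipping anything that looks like a section header or comment), the function names are made up for illustration, and the sample strings other than PROCEED are hypothetical.

```python
def parse_language_file(text: str) -> dict:
    """Parse KEY=value lines; blank lines, [sections] and ';' comments are skipped."""
    strings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("[", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            strings[key.strip()] = value
    return strings


def lookup(key: str, active: dict, fallback: dict) -> str:
    """Return the active-language string, or the fall-back (English) one if missing."""
    if key in active:
        return active[key]
    return fallback[key]


english = parse_language_file("PROCEED=Proceed\nYES=Yes\n")  # hypothetical contents
french = parse_language_file("PROCEED=Continuer\n")           # incomplete translation

print(lookup("PROCEED", french, english))  # the translated string is used
print(lookup("YES", french, english))      # missing in French, so English is used
```

In this sketch the English mapping is always loaded up front for simplicity, whereas the text above notes that Composr only loads the fall-back language when it is actually needed.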

The .ini files for any translation are stored together in a directory that is named with the standard ISO two-letter code to denote that language; for example, English is 'EN'. A list of these codes is in lang/langs.ini. We use upper-case for the names; often other software uses lower-case, but it varies.

All bundled language packs are located in the lang directory of Composr. There is also a lang_custom directory which contains files from custom language packs, or language files that 'override' those available in the lang directory. Whenever language files are edited in the Admin Zone, the changes are automatically saved to an override file in lang_custom rather than to the original file.
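
The override relationship between the two directories can be sketched like this. The helper below is hypothetical (Composr's real loader is written in PHP and may do more than choose one file over the other); it simply prefers a lang_custom file over the bundled one, and assumes files are named <name>.ini inside a directory named after the language code.

```python
from pathlib import Path
from typing import Optional


def resolve_language_file(base_dir: Path, lang: str, name: str) -> Optional[Path]:
    """Return the path to use for e.g. 'global' in language `lang`, or None."""
    candidates = [
        base_dir / "lang_custom" / lang / f"{name}.ini",  # override, checked first
        base_dir / "lang" / lang / f"{name}.ini",         # bundled original
    ]
    for path in candidates:
        if path.is_file():
            return path
    return None
```

For example, resolve_language_file(Path("/path/to/site"), "EN", "galleries") would return the lang_custom copy of galleries.ini if one exists, and the bundled copy otherwise.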

Special strings

Language string codes that are in lower-case are special strings that should not be translated directly. These strings contain encoded information relating to the language pack. All these strings have documentation labels to explain what they do.

For dates and times we use the PHP strftime function (http://php.net/manual/en/function.strftime.php), with the following changes made by Composr:
  • %e is the day of the month but will not have a leading space (because that's ugly) and will work on Windows servers too
  • %l is the 12-hour clock hour but will not have a leading space (because that's ugly)
  • %o is an English date ordinal (e.g. st or th)
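
As a rough illustration of how these three codes behave, here is a small sketch that expands %e, %l and %o itself and hands every other code to the standard strftime. It is not how Composr implements them (that code is PHP); the function names are made up, and the month and am/pm text comes from whatever locale the process is running in.

```python
import re
from datetime import datetime


def ordinal_suffix(day: int) -> str:
    """English ordinal suffix for a day of the month (1 -> 'st', 11 -> 'th')."""
    if 11 <= day % 100 <= 13:
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")


def format_date(fmt: str, when: datetime) -> str:
    """Expand %e, %l and %o ourselves and leave every other code to strftime."""
    def expand(match):
        code = match.group(1)
        if code == "e":
            return str(when.day)              # day of month, no padding
        if code == "l":
            return str(when.hour % 12 or 12)  # 12-hour clock, no padding
        if code == "o":
            return ordinal_suffix(when.day)   # st / nd / rd / th
        if code == "%":
            return "%"                        # literal percent sign
        return when.strftime("%" + code)      # e.g. %B, %Y, %M, %p

    return re.sub(r"%(.)", expand, fmt)


when = datetime(2017, 2, 14, 20, 14)
print(format_date("%e%o %B %Y", when))  # 14th February 2017 (with English month names)
print(format_date("%l:%M %p", when))
```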

Dates

Commonly American users will want to use American-style short-form and long-form dates. That is, 2-14-2017 instead of 14-2-2017, and February 14th 2017 instead of 14th February 2017.

This is handled automatically via the "American English" configuration option, which remaps the default language string values for you.

However, it is useful to document the strings involved, both for reference and as an illustrative example of date-string editing for non-English users:
| String | Language file | Purpose | Most English-speaking countries | American |
|---|---|---|---|---|
| date_concise_near_date | global | A date in concise long-form (within year) | %e %b (e.g. 14 Feb) | %b %e (e.g. Feb 14) |
| date_no_year | global | A date in verbose long-form (within year) | %e%o %B (e.g. 14th February) | %B %e%o (e.g. February 14th) |
| date_regular_date | global | A date in regular long-form | %e%o %B %Y (e.g. 14th February 2017) | %B %e%o %Y (e.g. February 14th 2017) |
| date_verbose_date | global | A date in verbose long-form | %a %e%o %B %Y (e.g. Tue 14th February 2017) | %a %B %e%o %Y (e.g. Tue February 14th 2017) |
| calendar_date | calendar | A date in short-form (calendar-only) | %d-%m-%Y (e.g. 14-02-2017) | %m-%d-%Y (e.g. 02-14-2017) |
| calendar_date_verbose | calendar | A date in verbose long-form (calendar-only) | %a %e%o %B %Y (e.g. Tue 14th February 2017) | %a %B %e%o %Y (e.g. Tue February 14th 2017) |
| calendar_day_of_month_verbose | calendar | A date within a month (calendar-only) | %a %e%o (e.g. Tue 14th) | %a %e%o (e.g. Tue 14th) |


Non-English speaking users may want to make other changes, e.g. remove %o so that suffixes like th and st aren't used.

We define dates with some different variations. This is because practically it isn't appropriate to try and use only one format across the entire system. The calendar may use date formats from global, but the rest of Composr will not use date formats from the calendar.

Times

Commonly non-English users will want to switch to a 24-hour clock for time. Here's some guidance…
| String | Language file | Purpose | Most English-speaking countries | Most other countries |
|---|---|---|---|---|
| date_regular_time | global | A time in regular form | %l:%M %p (e.g. 8:14 am or 8:14 pm) | %H:%M (e.g. 08:14 or 20:14) |
| date_verbose_time | global | A time in verbose form (calendar-only) | %l:%M %p (e.g. 8:14 am or 8:14 pm) | %H:%M (e.g. 08:14 or 20:14) |
| calendar_minute | calendar | A time (calendar-only) | %l:%M %p (e.g. 8:14 am or 8:14 pm) | %H:%M (e.g. 08:14 or 20:14) |
| calendar_date_range_single | calendar | A time range happening on a particular date, within month (calendar-only) | %l:%M %p (%e%o) (e.g. 8:14 am (14th) or 8:14 pm (14th)) | %l:%M %p (%e%o) (e.g. 8:14 am (14th) or 8:14 pm (14th)) |
| calendar_date_range_single_long | calendar | A time range happening on a particular date (calendar-only) | %l:%M %p (%e%o %b) (e.g. 8:14 am (14th Feb) or 8:14 pm (14th Feb)) | %l:%M %p (%b %e%o) (e.g. 8:14 am (Feb 14th) or 8:14 pm (Feb 14th)) |


We define times with some different variations. This is because practically it isn't appropriate to try and use only one format across the entire system. The calendar may use time formats from global, but the rest of Composr will not use time formats from the calendar.

Locales

Setting the correct locale may be tricky. Different operating systems may use a number of different values, for example in English we list all these locales:
  • en-GB.UTF-8 – British English, for systems that use hyphens and support a flag for utf-8 output
  • en_GB.UTF-8 – British English, for systems that use underscores and support a flag for utf-8 output
  • en-US.UTF-8 – American English (maybe British is not installed), for systems that use hyphens and support a flag for utf-8 output
  • en_US.UTF-8 – American English (maybe British is not installed), for systems that use underscores and support a flag for utf-8 output
  • en.UTF-8 – English (maybe specific British or American is not installed), for systems that support a flag for utf-8 output
  • en-GB – British English, for systems that use hyphens and do not support a flag for utf-8 output
  • en_GB – British English, for systems that use underscores and do not support a flag for utf-8 output
  • en-US – American English (maybe British is not installed), for systems that use hyphens and do not support a flag for utf-8 output
  • en_US – American English (maybe British is not installed), for systems that use underscores and do not support a flag for utf-8 output
  • en – English (maybe specific British or American is not installed), for systems that do not support a flag for utf-8 output

For German, the following should work: de-DE.UTF-8,de_DE.UTF-8,de.UTF-8,de-DE,de_DE,de (a little simpler as we don't normally worry about different variants of German).
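
The point of listing several names is that the first one the operating system recognises can be used. The sketch below illustrates that idea with Python's standard locale module; it is an assumption that Composr tries the names strictly in order, and both function names are made up for this example.

```python
import locale
from typing import Iterable, List, Optional


def locale_candidates(language: str, country: str) -> List[str]:
    """Build names in the same order as the German list above (utf-8 variants first)."""
    stems = [f"{language}-{country}", f"{language}_{country}", language]
    return [stem + ".UTF-8" for stem in stems] + stems


def first_available_locale(candidates: Iterable[str]) -> Optional[str]:
    """Return the first name the operating system accepts, or None."""
    saved = locale.setlocale(locale.LC_TIME)  # query the current setting
    try:
        for name in candidates:
            try:
                locale.setlocale(locale.LC_TIME, name)
                return name
            except locale.Error:
                continue
        return None
    finally:
        locale.setlocale(locale.LC_TIME, saved)  # leave the process unchanged


print(",".join(locale_candidates("de", "DE")))
print(first_available_locale(locale_candidates("de", "DE")))
```

The second line of output depends on which locales are installed on the machine running the sketch, and may be None.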

Microsoft Windows is good at accepting anything that vaguely matches, but it unfortunately cannot produce utf-8. So if you want a non-Latin-based language to work on Windows you'll need to fill in the locale_subst language string with all the long and short month and day names. See what the Tamil language pack does as an example:
locale_subst=Monday=திங்கட்கிழமை,Tuesday=செவ்வாய்க்கிழமை,Wednesday=புதன்கிழமை,Thursday=வியாழக்கிழமை,Friday=வெள்ளிக்கிழமை,Saturday=சனிக்கிழமை,Sunday=ஞாயிற்றுக்கிழமை,Mon=தி,Tue=செ,Wed=பு,Thu=வி,Fri=வெ,Sat=ச,Sun=ஞா,January=ஜனவரி,February=பிப்ரவரி,March=மார்ச்,April=ஏப்ரல்,May=மே,June=ஜூன்,July=ஜுலை,August=ஆகஸ்ட்,September=ஸெப்டம்பர்,October=அக்டோபர்,November=நவம்பர்,December=டிஸம்பர்,Jan=ஜன,Feb=பிப்,Mar=மார்,Apr=ஏப்ரல்,May=மே,Jun=ஜூன்,Jul=ஜூலை,Aug=ஆக,Sep=செப்,Oct=அக்,Nov=நவ,Dec=டிச,th=,st=,nd=
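
To show what a value like this encodes, here is a naive sketch that parses the comma-separated pairs and applies them to a date that has already been formatted in English. It is only an illustration under assumptions: the function names are invented, the real substitution logic in Composr may differ, and short keys such as st would also hit other English words if they were not themselves in the list.

```python
import re
from typing import Dict


def parse_locale_subst(value: str) -> Dict[str, str]:
    """Turn 'Monday=...,Mon=...,th=' into a mapping of English text to replacement."""
    mapping = {}
    for pair in value.split(","):
        english, sep, translated = pair.partition("=")
        if sep and english:
            mapping[english] = translated
    return mapping


def apply_locale_subst(text: str, mapping: Dict[str, str]) -> str:
    """Replace English names in an already-formatted date, longest names first."""
    if not mapping:
        return text
    # Longest first, so that 'Tuesday' is matched before its prefix 'Tue'.
    names = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in names))
    return pattern.sub(lambda m: mapping[m.group(0)], text)


# A shortened version of the Tamil value shown above.
subst = parse_locale_subst("Tuesday=செவ்வாய்க்கிழமை,Tue=செ,February=பிப்ரவரி,Feb=பிப்,th=,st=,nd=")
print(apply_locale_subst("Tue 14th February 2017", subst))
```

Note how the empty th=, st= and nd= entries simply remove the English ordinal suffixes.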

Common points of confusion

The software:
You may see the phrase "The software" come up a lot. We use this phrase because of how we support debranding in Composr. A web design studio may have renamed the product to some other name to increase their own brand-strength. We therefore try and make the language strings generic where possible. The EN language pack automatically replaces this with the configured brand name, i.e. it is usually converted back to "Composr" dynamically.

Child:
We often use a parent/child metaphor to talk about tree structures. This often does not translate well, so you may want to talk of parent and child branches instead.

Nouns vs Verbs (e.g. Download):
Sometimes English will have a word that works both as a noun and a verb, but your language won't. Be careful. We try and clarify within string comments where it is unclear.

Phrase variants

We sometimes have multiple versions of strings, using underscores to differentiate them (e.g. FOO vs _FOO).

There is no consistent meaning for an underscore (there is too much variety in phrasing in general to be very consistent), but when there's ambiguity we try and write descriptions for the language strings involved.

Nevertheless, here are some examples of what an underscore might differentiate between:
  • noun and verb forms
  • singular and plural forms
  • short and long forms
  • standalone words vs words that build in variables to give a combined state label (e.g. "post date" vs "posted: {1}")
  • words to use mid-sentence, versus words to use on their own
  • slightly different contexts (for example, talking about similar but not identical things)

Code within strings

In many cases there will be some coding symbols embedded within a specific language string. These coding symbols will be written in English (like almost all programming languages are), but shouldn't themselves be translated.

These may come up in situations such as:
  1. Hello {1}
  2. There are {2} {2|apple|apples}
  3. A <strong>good</strong> day
  4. [block="something"]menu[/block]
  5. The renderer to use (hook-type: 'blocks/main_custom_gfx')
  6. Christopher&apos;s plan
  7. {1} leads the usergroup &lsquo;{2}&rsquo;
  8. Welcome back to {$SITE_NAME}

Here's how the above sample cases would be translated:
  1. The {1} bit represents a parameter to the string (something inserted dynamically) and should be left alone. Translation in French may look like Bonjour {1}.
  2. This is similar to above, but uses our pluralisation feature ({2} would be a number in this case). Translation in French may look like Il existe {2} {2|pomme|pommes}. You must not add extra spaces around the number when using this syntax.
  3. There is an HTML tag which must be left alone. Translation in French may look like Un <strong>bon</strong> jour.
  4. This is largely Comcode which must be left alone. Translation in French may look like [block="quelque chose"]menu[/block].
  5. This is partly referring to a Composr directory. Translation in French may look like Le moteur de rendu &#xE0; utiliser (du type &#xE0; crochets: 'blocks/main_custom_gfx').
  6. This contains an HTML entity (an apostrophe character). Translation in French may look like Le plan de Christopher. In this case we didn't keep the entity because the translation didn't warrant it. The purpose of this example is to show that "apos" is not some English text to translate directly, it is HTML, which would be the same in any language, but may or may not persist depending on your translation target.
  7. Again this one contains HTML entities. Translation in French may look like {1} dirige le &lsquo;{2}&rsquo; groupe d'utilisateurs.
  8. This contains some Tempcode. You would only translate the words 'Welcome back to' and leave the {$SITE_NAME} intact. Translation in French may look like Bienvenue &#xE0; {$SITE_NAME}.

If you use automatic translation then this code will probably get mangled. You need to carefully check all translated strings.
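
A small script can help with that checking. The sketch below is not part of Composr: it shows one plausible reading of the parameter and pluralisation syntax (the first variant for exactly 1, the second otherwise, which is an assumption), plus a hypothetical helper that reports parameters, Tempcode symbols and HTML tags that were lost in a translation. HTML entities and Comcode are deliberately not checked, since the examples above show they may legitimately change.

```python
import re

PARAM = re.compile(r"\{(\d+)(?:\|([^|{}]*)\|([^|{}]*))?\}")
TOKEN = re.compile(r"\{[^{}]*\}|<[^<>]+>")


def render(template: str, *params) -> str:
    """Fill {1}, {2}, ... and {n|singular|plural}; other braces are left alone."""
    def fill(match):
        value = params[int(match.group(1)) - 1]
        if match.group(2) is None:
            return str(value)
        # Assumed rule: first variant for exactly 1, second variant otherwise.
        return match.group(2) if value == 1 else match.group(3)

    return PARAM.sub(fill, template)


def code_tokens(text: str) -> list:
    """Collect parameters, Tempcode symbols and HTML tags, ignoring plural wording."""
    tokens = []
    for token in TOKEN.findall(text):
        plural = PARAM.fullmatch(token)
        if plural and plural.group(2) is not None:
            token = "{%s|...}" % plural.group(1)  # the variants themselves get translated
        tokens.append(token)
    return sorted(tokens)


def lost_tokens(source: str, translation: str) -> list:
    """Tokens present in the English source but missing from the translation."""
    remaining = code_tokens(translation)
    lost = []
    for token in code_tokens(source):
        if token in remaining:
            remaining.remove(token)
        else:
            lost.append(token)
    return lost


print(render("There are {2} {2|apple|apples}", "unused", 3))
print(lost_tokens("Welcome back to {$SITE_NAME}", "Bienvenue sur {SITE_NAME}"))
```

The second call reports {$SITE_NAME} as lost, which is the kind of damage automatic translation tends to cause.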

Character sets

There are three systems that are in common usage to allow diverse characters to be displayed in a document:
  1. Unicode
  2. Character sets
  3. HTML entities

Composr supports both character sets and Unicode. Generally everyone will want to use Unicode (utf-8) nowadays, and that's our default in Composr (via the charset language string). PHP does not have good Unicode support, but it does have a number of common extensions and techniques for handling it (mainly mbstring and iconv), and we support all of those. If there are no extensions installed we have some basic code to 'get by okay'. If you are using a non-English language, you want to ensure your host has the PHP mbstring extension, and the vast majority of hosts do.

The English pack can be used in both Unicode and Character set mode, so we use HTML entities in many places to make up the difference. Most language strings support HTML, so HTML entities can be used there, but it is not universal.

Advanced explanation of character encodings

To understand character sets, you need to understand how strings (or text files) are composed. Each character (a symbol, represented by a 'glyph' on the screen) is essentially represented by a number, 0-255 (a byte); 0-127 are usually standard, and specified using the '7-bit ASCII code'; the 128-255 range is essentially free, and what the numbers map to depends on the 'character set' used. As different languages use different characters (for example, accented characters, or a whole different alphabet, or even a pictographic language), different languages use different character sets.

A file that uses 'high' characters will look different when viewed in editors set to different character sets. In order to put in text in the appropriate character set, and to view it, your editor must be set to it; this is likely to be the default if you are translating to your native language.

utf-8 works on the principle that characters may use more than one byte. The ASCII characters all fit within one byte, but extended characters use 2 or more bytes. Normal ASCII text is therefore the same in both Unicode and a character set, but extended characters are represented very differently.
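
The difference is easy to see by encoding the same text both ways. The short sketch below uses Latin-1 (ISO-8859-1) as the example character set, which is simply a choice made for this illustration.

```python
import html

text = "\u00e0"  # the accented character 'a with grave'

print(text.encode("latin-1"))  # b'\xe0': a single byte in the 128-255 range
print(text.encode("utf-8"))    # b'\xc3\xa0': two bytes in utf-8

# Plain ASCII text is byte-for-byte identical in both encodings.
print("Proceed".encode("latin-1") == "Proceed".encode("utf-8"))

# Reading utf-8 bytes as if they were Latin-1 produces the wrong characters.
garbled = text.encode("utf-8").decode("latin-1")
print(len(garbled))  # 2: one character has turned into two

# An HTML entity sidesteps the problem by using ASCII characters only.
print(html.unescape("&#xE0;") == text)
```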

The language editor (i.e. how to change strings locally)

Image: Using the language editor to translate language strings

Image: Choosing a language and language file to edit in the language editor

The language editor allows you to translate 'strings' so that your website is displayed in a language other than the original British English. Alternatively, you may just wish to change language strings to change the 'style' of the website.

The language editor is very easy to use. All you need to do is go to the translation module, choose your language, choose the language file to translate, and then you are presented with an interface to translate the strings.
For languages which Google can translate, a small level of integration is provided to give you a guide.

You can reach the language editor from Admin Zone > Style > Translate/rephrase Composr.

We recommend doing translations via Transifex instead. See the "Collaborative translations on Transifex" section.

Contextual translation

Many users like to translate only what appears on the public part of their own website. To speed this up, there is an option in the page footer to change the language strings that you see on the current page. Specifically:
  1. In the footer there is a drop-down menu which is only visible to admins. This is titled "Select page rendering tool" as standard, although it may not be visible on custom themes or could have been translated. If you are not using the default Composr theme then you can enable the default theme temporarily by adding &keep_theme=default to the end of the website address (or ?keep_theme=default if the URL you are at does not already contain a ?).
  2. Choose "Translate/Rephrase Composr" from this menu and the language you want to translate to.

This will take you to the page which displays the language strings which are used on the current page.

Note that some of the strings will be re-used across different pages. Changing them on one page will affect any other page that uses them, so some care is needed. For example the word "Members" may appear on several pages. The up-side is that once it has been translated once it should not need translating on further pages.
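
If you find yourself adding parameters such as keep_theme by hand a lot, the choice between ? and & can be left to a URL library. The helper below is a hypothetical convenience written for this tutorial, not something Composr provides, and the page=start parameter in the example is just a sample.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_param(url: str, name: str, value: str) -> str:
    """Return `url` with name=value set, using '?' or '&' as appropriate."""
    parts = urlsplit(url)
    # Keep the existing parameters, dropping any earlier value for `name`.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != name]
    query.append((name, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_param("http://yourbaseurl/index.php", "keep_theme", "default"))
# http://yourbaseurl/index.php?keep_theme=default
print(with_param("http://yourbaseurl/index.php?page=start", "keep_theme", "default"))
# http://yourbaseurl/index.php?page=start&keep_theme=default
```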

Searching for language strings

If you want to find where a language string is defined then you can put it into the Admin Zone search between quotes. Additionally the contextual translation tool will say which file each string is from.

Collaborative translations on Transifex

You can use Transifex to translate Composr into your language with the help of others.

Transifex is great because:
  • You do not need to feel that you are alone translating everything yourself anymore.
  • It's very easy to work together. People can be translating the same language at the same time.
  • Anyone can download the current translations at any time.
  • Translations will be automatically upgraded to new major releases of Composr – there's no risk of work being lost.
  • Transifex has a high-quality user interface with word-wrapping, filtering options, comments, prioritisation, approval, and more
  • We have separated out admin strings from user strings

The process is as follows:
  1. Go to Transifex.
  2. Register as a Transifex user if you do not already have an account, or log in. If registering you'll need to set your name and role after confirming your e-mail address; choose 'Translator' as the role. Skip the steps '2' and '3' on the form, just click the button to go to your dashboard.
  3. Go to the ocProducts organisation on Transifex.
  4. Choose the version of Composr you have installed (if you're not sure, it will say on the front page of the Admin Zone). Transifex calls this the 'project'.
  5. Choose your language.
  6. Click "Join team"
  7. (Wait until the team join request has been accepted – it is a good idea to introduce yourself on the Internationalisation forum so we know who you are)
  8. Start translating individual resources (language files)

The strings are split across about 250 core resources/files; often it works well to work with other people, each doing different files. The files are actually marked by priority – the high priority or urgent priority ones are core, while files for non-bundled addons, or collections of strings only relevant to administrators, are marked as normal priority. Most users will want to ignore anything not marked as urgent or high priority. Don't feel compelled to do it all for 'completeness'.

We also have certain default Comcode pages and text files available as resources.

Notes about specific language strings are automatically made available within Transifex.

To coordinate with other translators use the Internationalisation forum. We suggest you have one main topic per language, and keep editing the first post of that topic to provide basic details such as policies, translator names, and translation status.
Please try and respect the translators who have worked before you, and negotiate to a common translation standard. For example, you may need to agree on the use of formal/informal grammar within the translation to ensure everything is consistent. If you have suggestions, or are the first major translator, make them known on the forum. We have team approval turned on, i.e. you have to apply to join the translation team. The only reason for this is so that we can protect existing translators from problems caused by newbies diving right in. Generally we expect to approve all requests, unless there's already a strong language maintainer making quick progress on that language and not wanting to share the task.

Transifex is not being used for theme images. However for the great majority of sites, translating the language strings and default Comcode pages is enough. If you want to distribute translations for internationalised theme images (which are mostly just those around the Comcode/Chatcode editing toolbar at this point), you can do so in the Internationalisation forum.

Transifex tips:
  • Firefox has a good spellchecker that is compatible with Transifex (it doesn't seem to work in Chrome). Install the dictionary addon for your language before you start. It'll put the red squiggly lines under misspellings while you are doing the translating.
  • Don't manually change the URL in Transifex to quickly navigate between language files – Transifex won't save stuff correctly if you do.
  • The too_common_words resource is a list of words that should be filtered out of searches; it should not be directly translated, but rewritten for your language as appropriate

Testing and Debugging tips

Here are a few tips you may find useful:
  • To find out which language is in use, view the HTML source. The language code will be given within the <html ...> tag.
  • You can test multiple languages in parallel to compare using the &keep_lang=<code> feature (which will be preserved while you navigate between pages). E.g. to compare the home page in English and French you could have these 2 URLs open in different tabs:
    1. http://yourbaseurl/index.php?keep_lang=EN – English version
    2. http://yourbaseurl/index.php?keep_lang=FR – French version
  • Similar to keep_lang, you can append &special_page_type=lang to the URL to view all the language strings used to build up the page. You can edit them locally using this interface, but even if you are using Transifex the interface is useful to see what files and string names the strings are coming from. Transifex allows you to search by key name (which means Composr string name) if you select the correct resource file first. Just be aware that Composr itself doesn't split files into administrative and non-administrative, we only do that on Transifex – so you may need to search both the whatever and whatever__administrative resource files to locate the string.
  • It is best to re-run the installer (install.php, not the Setup Wizard) after you have your language pack in place, and select your language during the installation process. This way you will have the pre-placed content installed using the translated strings from your language pack (such as root categories, default forums, and default news categories).
  • You will probably want to test using the default test user account rather than an admin account – unless you really are intending to translate all the administrative strings too. By using a regular account you will not be distracted by unexpected untranslated administrative strings integrated into the user interface.
  • Sometimes you may need to find shorter ways of saying things due to lack of space in the design for a language that is using very long words – you may need to use abbreviations, for example.
  • Sometimes the best way to work out the intention of a string is from the language string name, which is shown as the 'Key' on Transifex.
  • You may wish to occasionally search the language .ini files in Composr. If you are on Windows, AstroGrep is a nice tool that supports Unicode.
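
The first tip above (checking which language a page was rendered in) can also be scripted. The sketch below assumes the language code is carried in the lang attribute of the html tag, which the text above does not spell out, and the sample markup is invented for the example.

```python
from html.parser import HTMLParser
from typing import Optional


class HtmlLangFinder(HTMLParser):
    """Remember the lang attribute of the first <html> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.lang: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag == "html" and self.lang is None:
            self.lang = dict(attrs).get("lang")


def page_language(html_source: str) -> Optional[str]:
    """Return the language code declared on the page, or None if there is none."""
    finder = HtmlLangFinder()
    finder.feed(html_source)
    return finder.lang


sample = '<!DOCTYPE html><html lang="FR" dir="ltr"><head></head><body></body></html>'
print(page_language(sample))  # FR
```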

Downloading translations from Transifex

» See the Changing the site language (for end users) tutorial.

Turning on a different language

» See the Changing the site language (for end users) tutorial.

Criticising language packs

Image: Choosing a language to criticise the translation of

A tool to criticise language packs is provided, to identify what has not been translated, among other things. This tool is intended for those who translate language files without using the inbuilt editor, or for those who have upgraded Composr and need to update their language packs.
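
To give a feel for the kind of checks such a tool can make, here is a minimal sketch that compares two sets of strings. It is not the built-in tool and does not claim to match its checks; the function name, the report categories and the sample strings (other than PROCEED) are all hypothetical.

```python
def translation_report(english: dict, translated: dict) -> dict:
    """Compare two {key: string} mappings loaded from language files."""
    return {
        # Keys the translation has not defined at all.
        "missing": sorted(k for k in english if k not in translated),
        # Keys whose text is still identical to the English (possibly untranslated).
        "unchanged": sorted(k for k in english if translated.get(k) == english[k]),
        # Keys that no longer exist in English, e.g. after an upgrade.
        "obsolete": sorted(k for k in translated if k not in english),
    }


english = {"PROCEED": "Proceed", "YES": "Yes", "MENU": "Menu"}
french = {"PROCEED": "Continuer", "MENU": "Menu", "OLD_STRING": "Ancien"}
print(translation_report(english, french))
```

An 'unchanged' entry is only a hint, as some words are legitimately the same in both languages.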

It is generally better to look at outstanding translation tasks via Transifex. We update the source strings on there with each Composr patch release.

Right-to-left languages

Composr has some built-in support for right-to-left languages. You need to change the dir, en_left and en_right language strings to activate it.

Tip

There are web browser extensions that let you flip text direction on your own web browser only, to help you better read English text/code when on a right-to-left site.

However there are two issues where some extra consideration is needed…

Comcode editing

Because Comcode is written in English, and punctuation symbols are considered right-to-left punctuation when "automatic bi-directional detection" is enabled, there is a conflict between the desire to type Comcode in English and the desire to type normal right-to-left script.
The following is in our CSS, but commented out:

Code (CSS)

input[type="text"],textarea { /* So Comcode can be typed */
        unicode-bidi: bidi-override;
        direction: ltr;
}
 
Uncommenting this makes text input areas work in left-to-right. You can choose to enable it, to make Comcode easier to type, but it will make right-to-left languages harder to type and understand.

Theme layout

In the past we have tried to make our default theme support right-to-left nicely, but it's a high maintenance burden for the developers and unfortunately there are many cases where we could not elegantly do it because we set things in terms of pixel offsets on a particular side rather than in a direction-neutral way. For example, you may see list bullets displaying on the wrong side of a list element. It is caused by CSS like:

Code (CSS)

ul.compact_list li {
        margin: 0 0 0 17px;
        padding: 0;
}
 
which would need changing to:

Code (CSS)

ul.compact_list li {
        margin: 0 17px 0 0;
        padding: 0;
}
 
Therefore to make things display neatly you will need to make a modified theme that makes these kinds of changes for margin settings, padding settings, and background settings.
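
For the simple four-value case shown above, the swap is mechanical enough to script as a starting point for a modified theme. The sketch below is a deliberately naive illustration: it only handles four-value margin and padding shorthands with plain values, leaves everything else (including background settings and values such as calc(...)) untouched, and the function name is made up.

```python
import re

VALUE = r"([^\s;{}]+)"
SHORTHAND = re.compile(
    r"\b(margin|padding)\s*:\s*" + VALUE + r"\s+" + VALUE + r"\s+" + VALUE + r"\s+" + VALUE + r"\s*;"
)


def flip_box_shorthand(css: str) -> str:
    """Swap the right and left values of four-value margin/padding declarations."""
    def swap(match):
        prop, top, right, bottom, left = match.groups()
        return f"{prop}: {top} {left} {bottom} {right};"

    return SHORTHAND.sub(swap, css)


css = "ul.compact_list li {\n        margin: 0 0 0 17px;\n        padding: 0;\n}"
print(flip_box_shorthand(css))
```

Any output from a script like this still needs checking by eye in a right-to-left browser session.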

If there are cases where our default theme can be improved without encumbering the left-to-right majority too much, please consider making a merge request on the Composr project on GitLab. Only do this if you have brought things up to a default level and tested it. We would love regular users of right-to-left languages to contribute corrections, as it is too big and complex a task to integrate into the core development process.

Additional things you can translate

As well as the core .ini files, there are other things that may be translated.

Comcode pages

A Comcode page is a page like your front page that consists mostly of static text that doesn't have any particular predefined content structure. This is contrasted to a page that is generated from some other form of content like a news article or news archive. In the simplest case, a Comcode page is pure translatable text. In more complex cases it may have HTML and Comcode tags mixed in with the text.

When you click the "Edit this page" link beneath a Comcode page it will give you the choice of which language to edit. This is how you can edit the Comcode page text, such as the home page or an about-us page. Pages in another language will display the main version's text unless some changes have already been made for that language. Obviously any subsequent copy changes you make to the wording for a particular language, i.e. adding another sentence or paragraph, will need to be repeated for every language that you maintain a translation for (assuming you want to keep your translations consistent).

To translate a Comcode page manually, copy the Comcode page .txt file (assuming you originally created it in English) from the pages/comcode_custom/EN directory to the appropriate pages/comcode_custom/<lang> directory and then customise it.
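
If you have many pages to start translating, the copy step can be scripted. This is a hypothetical helper, not a Composr tool: the function name and the about_us page name in the usage note are made up, and site_root is assumed to be whichever directory contains the pages directory in question.

```python
import shutil
from pathlib import Path


def start_page_translation(site_root: Path, page: str, lang: str, source_lang: str = "EN") -> Path:
    """Copy pages/comcode_custom/<source_lang>/<page>.txt into the <lang> directory."""
    source = site_root / "pages" / "comcode_custom" / source_lang / f"{page}.txt"
    target = site_root / "pages" / "comcode_custom" / lang / f"{page}.txt"
    if target.exists():
        # Never overwrite work that a translator has already done.
        raise FileExistsError(f"{target} already exists; not overwriting a translation")
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    return target
```

For example, start_page_translation(Path("/path/to/site"), "about_us", "FR") would create the French copy ready for you to customise.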

HTML pages

As HTML pages are created outside Composr, you must copy the file manually, in the same way as described for Comcode pages.
Composr does not ship with any default HTML pages, so you only need to worry about HTML page translation if you actually made some pages.

Copy the page .htm file from the pages/html_custom/EN directory (assuming you originally created it in English), to the appropriate pages/html_custom/<lang> directory and then customise it.

Text files

There are some other text files you might want to translate, in a similar way to Comcode pages (see above):
  • text/EN/quotes.txt
  • data/modules/cms_comcode_pages/EN/*.txt
And these files don't need translating but could be replaced with equivalents in your language:
  • text/EN/synonyms.txt (a list of alternative search words, particularly used within the admin system; if translating you may want to extend it with your translations, rather than replacing it – so that the English words remain as synonyms)
  • text/EN/too_common_words.txt (a list of words that should not be considered in search results, for example)
  • text/EN/word_characters.txt (a list of characters that appear in words in your language – most languages have all the English characters, but also accented ones)

None of these files are very important; only translate them if you want to.

Images

If you look under the themes/default/images/ directory you will see there is an EN directory that contains images with English text on. You can copy this to the ISO codename of your language pack (e.g. FR), and then replace individual images with the translated ones.

For example, we have themes/default/images/EN/chatcodeeditor/new_room.png. It is stored under "EN" because it contains some English text within the image. You could also have themes/default/images/FR/chatcodeeditor/new_room.png as the French version of this image.

Make sure you clear your theme image cache (Admin Zone > Tools > Website cleanup tools) after doing the substitution.
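
To keep track of which images still need translating, you could compare the two language directories. The following is a rough sketch under stated assumptions: the function name images_still_in_english is hypothetical, and it treats an image as pending if the translated file is missing or is still byte-for-byte identical to the English original (which is what you get straight after copying the EN directory).

```python
import filecmp
from pathlib import Path


def images_still_in_english(theme_images_dir, lang, source_lang="EN"):
    """List images that have no translated version yet.

    Hypothetical helper: theme_images_dir is assumed to be a directory such
    as themes/default/images. Returned paths are relative to the language
    directory and sorted alphabetically.
    """
    source_dir = Path(theme_images_dir) / source_lang
    target_dir = Path(theme_images_dir) / lang
    pending = []
    for image in source_dir.rglob("*"):
        if not image.is_file():
            continue
        relative = image.relative_to(source_dir)
        translated = target_dir / relative
        # Pending if missing, or if it is an untouched copy of the original
        if not translated.is_file() or filecmp.cmp(image, translated, shallow=False):
            pending.append(relative.as_posix())
    return sorted(pending)
```

Calling images_still_in_english("themes/default/images", "FR") returns the relative paths that still need a French version.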

We have the PSD files (requires Adobe Photoshop or compatible software) for many of the images in our downloads database.

You may want to just skip translating images. The toolbar buttons that use images have translated tooltips anyway.

WYSIWYG editor

Composr uses a third-party WYSIWYG editor – CKEditor.
It has its own translations which should automatically be linked to your own by the standard ISO language name.

If there's no translation for CKEditor then you need to manually add one. The data/ckeditor/lang/ta.js file provides a good example of the basic strings to translate (the ones we translated are obvious). However, you also need to make some technical code edits to data/ckeditor/ckeditor.js to reference that the language exists; for this reason a language pack can't really include a custom CKEditor translation and we would need to take the translation via a merge request on GitLab instead.

Template/CSS editor

Composr uses a third-party code editor – a modified version of EditArea.
You need to make sure you have translated versions of all data/editarea/lang/<lang>.js files. There are quite a few translations already in there.

That said, only the administrators will see the code editor, so it's not a big deal to get this translated.
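
A quick way to see whether either editor lacks a translation file for your language is to check the two directories mentioned above. This is only a sketch: the function name missing_editor_translations is hypothetical, and the lower-casing of the language code is an assumption based on the ta.js example (codes that include a region may be named differently by each editor).

```python
from pathlib import Path

# Directories named in this tutorial, relative to the Composr base directory
EDITOR_LANG_DIRS = {
    "CKEditor": "data/ckeditor/lang",
    "EditArea": "data/editarea/lang",
}


def missing_editor_translations(site_root, lang):
    """Return the names of the editors that have no <lang>.js file."""
    code = lang.lower()  # Assumption: files are named like ta.js
    missing = []
    for editor, lang_dir in EDITOR_LANG_DIRS.items():
        if not (Path(site_root) / lang_dir / (code + ".js")).is_file():
            missing.append(editor)
    return missing
```

An empty list means both editors already have a file for that language code.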

MySQL collations (only applicable to ex-ocPortal users)

If you are upgrading from an ocPortal release it is highly recommended to switch the database and software over to utf-8.

Image: phpMyAdmin exports

When ocPortal 9 or earlier installs, it will use latin1 (equivalent of ISO-8859-1) by default, because this is what the default English language pack uses. MySQL doesn't make it easy to convert the character set of the database, so the best way to do it is to export to an SQL dump, edit the dump in a text editor, and reimport into a new database:
  • Use phpMyAdmin to export all the tables to a .sql file on your computer; if it asks what character set to make the file, choose UTF-8
    • Make sure to not include multiple rows in each INSERT statement, as MySQL may not be able to reimport it on some servers (see the screenshot to the right for how to set this on phpMyAdmin, although it may vary from one version to another)
  • Do whatever you would normally do to backup your database; that will usually mean keeping a copy of the above file (but ensure it is complete before relying on it, sometimes SQL dumps don't download correctly)
  • Use a text editor to replace all instances of latin1 with utf8mb4 in the file (note that it is utf8mb4 not utf-8 – MySQL doesn't want hyphens in its codenames, and we have to use the mb4 variant for full unicode support)
  • Specifically, in the creation SQL (the CREATE TABLE statements) for the following tables (if they exist), replace utf8mb4 with utf8:
    • addons
    • f_emoticons
    • f_saved_warnings
    • group_privileges
    • member_privileges
    • newsletter_subscribe
    • theme_images
    • tickets
    • url_id_monikers
    • url_title_cache
  • Unfortunately MySQL has a key length limit of 1000 bytes on MyISAM tables and utf8mb4 makes us hit it a lot; so now do one of the following:
    • replace MyISAM with InnoDB. This will require MySQL 5.6+. Then later in Commandr run this command… :set_value('innodb','1');
    • change all utf8mb4 back to utf8 (i.e. reverse course from the above steps and suffer emojis not saving into the database)
    • edit each individual index on a VARCHAR(255)/LONGTEXT field to have a length limit of 250, i.e. change example to example(250) in each such index (slow and tedious, requires a lot of understanding)
  • Save the edited file
  • Use phpMyAdmin to drop all tables in your database
  • Use phpMyAdmin to import the edited file
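
The text-editor replacement steps above can be sketched as a small script. This is an illustration under stated assumptions, not a supported tool: the function name convert_dump is hypothetical, the default table prefix ocp_ is an assumption (use your own), and it assumes each statement in the dump ends with a semicolon at the end of a line. Like the manual find-and-replace, it also changes any occurrence of latin1 inside row data. It does not handle the key limit choice, which still has to be made by hand.

```python
import re

# Tables whose creation SQL should use utf8 rather than utf8mb4 (listed above)
UTF8_ONLY_TABLES = {
    "addons", "f_emoticons", "f_saved_warnings", "group_privileges",
    "member_privileges", "newsletter_subscribe", "theme_images", "tickets",
    "url_id_monikers", "url_title_cache",
}

CREATE_RE = re.compile(r"CREATE TABLE (?:IF NOT EXISTS )?`?(\w+)`?", re.IGNORECASE)


def convert_dump(sql, table_prefix="ocp_"):
    """Replace latin1 with utf8mb4, or with utf8 inside selected CREATE TABLEs."""
    converted = []
    in_utf8_only_create = False
    for line in sql.splitlines(keepends=True):
        match = CREATE_RE.match(line.lstrip())
        if match:
            name = match.group(1)
            if name.startswith(table_prefix):
                name = name[len(table_prefix):]
            in_utf8_only_create = name in UTF8_ONLY_TABLES
        target = "utf8" if in_utf8_only_create else "utf8mb4"
        converted.append(line.replace("latin1", target))
        # Assumption: a trailing semicolon marks the end of the statement
        if line.rstrip().endswith(";"):
            in_utf8_only_create = False
    return "".join(converted)
```

To use it, read the exported file as UTF-8, pass its contents through convert_dump, and write the result to a new file, keeping the original export as your backup.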

GD fonts

We bundle some Open Source .ttf fonts (GNU FreeFont fonts) that unfortunately don't have characters for all languages.

This affects:
  1. Vertical text shown on permission editing interfaces. This issue is less relevant than it once was, as modern web browsers can render vertical text directly.
  2. The Logo Wizard.
  3. The main_custom_gfx block.

The solution to '1' is to upload Courier New Bold.ttf from your own computer to data_custom/fonts/Courier New Bold.ttf. We would distribute this file with Composr, except we don't have a licence to; however if you have a copy of Windows or Mac OS you should have your own licensed copy of this file.

The solution to '2' is to upload Verdana.ttf from your computer as data_custom/fonts/Verdana.ttf. Verdana is a default font used during the Setup Wizard; if you are using the Logo Wizard manually then you may just want to upload all your fonts to data_custom/fonts and pick one of those you uploaded.

CSS

The default CSS will have provisions to cope with translations. For example, we only have fixed layout for tables on English as we can't trust name lengths on other languages to not overrun:

Code (CSS)

.wide_table {
        width: 100%;
        /*{+START,IF,{$EQ,{$LANG},EN}}*/
                table-layout: fixed;
        /*{+END}*/
}
 
If you like you can just change it to:

Code (CSS)

.wide_table {
        width: 100%;
        table-layout: fixed;
}
 
and make sure your language strings don't overrun.

Exporting language packs

Image: Exporting an addon

If you wish to directly distribute your translation as an addon, you can export it as such. Most translators will not want to bother to do this, as having users actively using Transifex directly will have the side-effect of encouraging collaboration.

For languages that have very high-quality release-ready translations an addon is a good thing to provide users – but building the addons from Transifex is integrated into the development process anyway.

Comment topics

Composr associates topics with content using the topic description field. This involves a translated language string.
Therefore if you switch your default site language, any existing topics will become disassociated.
You can resolve this with a manual database query (in something like phpMyAdmin):

Code (SQL)

UPDATE cms_f_topics SET t_description=REPLACE(t_description, 'Comment: ', 'Commentaire: ');
 
Replace Commentaire with whatever you have translated the COMMENT language string to.
The example assumes the table prefix is cms_.
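
If you need the same query for a different table prefix or translation, it can be generated rather than edited by hand. This is a minimal sketch: the function name build_comment_topic_update is hypothetical, the table prefix is assumed to be trusted input, and only single quotes are escaped (a label containing backslashes would need extra care on MySQL).

```python
def build_comment_topic_update(translated_label, english_label="Comment",
                               table_prefix="cms_"):
    """Build the UPDATE statement that re-associates comment topics."""

    def quote(value):
        # Double any single quotes, the standard SQL string-literal escape
        return "'" + value.replace("'", "''") + "'"

    return (
        "UPDATE {prefix}f_topics SET t_description="
        "REPLACE(t_description, {old}, {new});"
    ).format(
        prefix=table_prefix,
        old=quote(english_label + ": "),
        new=quote(translated_label + ": "),
    )
```

Calling build_comment_topic_update("Commentaire") reproduces the query shown above; review the generated SQL before running it against your database.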

Special language-specific challenges

Gendered-descriptors

Some languages may require stock phrases to be gendered. One known example is the member usergroup titles in French, which would need to be written as masculine or feminine based on the gender of the member being displayed. For example, the default title of a usergroup (given to members within it) is "Standard member" in English, but needs to be gendered on a French website.
A solution to this can be implemented via a template edit. We can edit the CNS_TOPIC_POST.tpl template to change:

Code

   {+START,IF_NON_EMPTY,{POSTER_TITLE}}<div class="cns_topic_poster_title">{POSTER_TITLE*}</div>{+END}
to:

Code

   {+START,IF_NON_EMPTY,{POSTER_TITLE}}
      <div class="cns_topic_poster_title">
         {+START,CASES,{$CPF_VALUE,Gender}}
         Mâle=Un mâle
         Femelle=Une femelle
         ={POSTER_TITLE*}
         {+END}
      </div>
   {+END}
   {+START,IF_NON_EMPTY,{RANK_IMAGES}}<div class="cns_topic_poster_rank_images">{RANK_IMAGES}</div>{+END}
This assumes we have created a custom profile field titled "Gender" with the options of "Mâle" and "Femelle". If filled in, it will remap their usergroup's title to either "Un mâle" or "Une femelle" accordingly. If not set, it will just use the regular usergroup title.
Of course this system is hard-coding gender-specific labels into a template, so it is not perfect. However, sites could use more elaborate Tempcode if they need to preserve the feature of having different labels for different usergroups. We'll give such an example in the next section.

Alternatively, you could just remove {+START,IF_NON_EMPTY,{POSTER_TITLE}}<div class="cns_topic_poster_title">{POSTER_TITLE*}</div>{+END} entirely if you don't require the feature at all.

Elaborate example

Using additional Tempcode, we could do additional deep remapping.

Code

   {+START,IF_NON_EMPTY,{POSTER_TITLE}}
      <div class="cns_topic_poster_title">
         {+START,IF,{$EQ,{POSTER_TITLE},Acteur}}
            {+START,CASES,{$CPF_VALUE,Gender}}
            Mâle=Acteur
            Femelle=Actrice
            ={POSTER_TITLE*}
            {+END}
         {+END}
         {+START,IF,{$EQ,{POSTER_TITLE},Policier}}
            {+START,CASES,{$CPF_VALUE,Gender}}
            Mâle=Policier
            Femelle=Policière
            ={POSTER_TITLE*}
            {+END}
         {+END}
      </div>
   {+END}
   {+START,IF_NON_EMPTY,{RANK_IMAGES}}<div class="cns_topic_poster_rank_images">{RANK_IMAGES}</div>{+END}

This will remap the title based on what it already is. For each of the possible values ("Acteur" and "Policier" in our example) it will provide the gendered options for it.

Exporting a language pack directly from Composr

If you translate within Composr (not Transifex), then you can export a language pack within Composr.
Go to Admin Zone > Structure > Addons, scroll down to "Create (export) an addon, from selected changed local files" at the bottom. From this screen there's an option to export a language to an addon.

Concepts

Language string
A piece of text, often a phrase, used by Composr; identified by a short code WRITTEN_LIKE_THIS
Character set
A mapping between characters and the byte values used to store them; a one-byte-per-character set can represent at most 256 characters, so different character sets exist to cover different language scripts
Unicode
An encoding scheme for multi-byte characters, supporting multiple languages together without any need for special character sets.
Transifex
The translation platform we use to translate Composr
Collation
A set of rules within the database that determines how text is compared and sorted; each collation is tied to a particular character set
Internationalisation
The process of making software work well in different international settings. It includes multiple aspects, such as time-zones, date formats, and language translations.
