Supported File Formats

Transifex is under constant development, and our aim is to support most file formats that can be localized. The localization formats Transifex fully supports are described under Supported File Types below.

If your project uses one of these internationalization methods, you can start using Transifex to manage your translations. Otherwise, either wait until we add support for your file format or contribute to the project by writing a patch. Here is a tutorial on How to add support for a new file format.

Note

Files uploaded to transifex.com must be encoded in UTF-8. The same encoding is used for the translation files that users download. If a format does not use UTF-8, this is stated explicitly in that format’s documentation on this page.

The Transifex Translation Storage Engine

Every project in Transifex is a grouping of resources and each resource corresponds to a source translation file. If for example you have a project with two files for translation, foo.pot and bar.pot, you’ll need to create two separate resources (e.g. foo and bar) and map each one to a translation file.

Every project on Transifex is associated with a source language. It is assumed that all resources under a project have the project’s source language. Transifex assigns the source strings to a language and using the web editor or by uploading a file, you can translate these strings into even more languages.

When importing a source file into Transifex, this file will be saved and used as a template whenever you want to export a translation file for downloading. Since the template is common between all languages and is derived from the source file, Transifex does not guarantee that the exported translation files will be exactly the same as the imported files. However, this also depends on which internationalization method you are using and what metadata it inserts in the file headers.

For a more in-depth description of how each internationalization method is being handled, check the appropriate section below.

The Engine in Detail

There are three internal structures in Transifex which are of interest: Source Entities, Translations and Template files.

Each resource inherits the source language of the project it belongs to, which can be other than English. In the following examples, however, we will use English as an example to make things easier.

Source Entities

At the core of the Transifex translation storage engine are Source Entities. These are representations of actual translatable objects, together with their metadata, such as a sentence in a particular file and context.

In the world of Gettext PO files, an entity is a whole unit inside the POT file: the msgid together with all the metadata it carries (context, occurrences, comments etc).

Translations

Each entity has Translations in a number of languages, including English. For any given language, there is a one-to-one relationship between an entity and its translation. So, if you upload a fresh POT file with one msgid inside, you’ll end up with one Entity and one English translation.

Translations do not exist by themselves, they are always specific to an entity. This is logical too: each translation is specific to a particular context.

When you request a translation, you basically request the entity translated in a particular language. The entity itself does not contain any translation. Not even the English one.
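
As a rough illustration of the relationship described above, the following Python sketch models entities and translations in memory. The class and field names (SourceEntity, TranslationStore and so on) are hypothetical and do not reflect Transifex's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class SourceEntity:
    """A translatable unit plus its metadata; it holds no translated text."""
    key: str            # e.g. the msgid of a PO entry
    context: str = ""   # disambiguates entities that share the same key


@dataclass
class TranslationStore:
    """Maps (entity, language code) to exactly one translation string."""
    _data: Dict[Tuple[SourceEntity, str], str] = field(default_factory=dict)

    def set(self, entity: SourceEntity, lang: str, text: str) -> None:
        self._data[(entity, lang)] = text

    def get(self, entity: SourceEntity, lang: str) -> Optional[str]:
        # "Requesting a translation" means asking for an entity in one language.
        return self._data.get((entity, lang))


store = TranslationStore()
hello = SourceEntity(key="Hello")
store.set(hello, "en", "Hello")      # even English is stored as a translation
store.set(hello, "fr", "Bonjour")
```

Note that the entity itself carries no text in any language; every string, including the English one, lives in the store.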

Template files

Until now we haven’t mentioned anything about files. Starting with v1.0, Transifex no longer handles ‘files’ as atomic units, but entities, as mentioned above. Files aren’t even required: you can run a whole translation project without uploading any file at all, just by creating entities through the Transifex API directly.

However, we do need files in so many places, so Transifex also supports them. You can create a new resource by uploading a source language file, which you can then download either in the source language itself or other languages.

(The reason it’s called a template file is that Transifex actually removes the English strings from the file and replaces them with a hash id. You never notice this, because hashes will be replaced by the translation or an empty string when you export the file.)

Each resource can have one Template file. If it does, you’ll be able to download the file localized. If not, you’ll only be able to access the entities and translations using the API.

Importing and Exporting files

Given that Transifex does not store the files you send to it natively, but rather converts them into a template, entities and translations, we’ll go through what happens when you import or export files.

Importing a Source Language file

When uploading a source file, Transifex assumes that the language of the file is the source language of the project. Here are the steps taken:

  • Identify the translatable units/entities inside the file.
  • For each one, create a new Source Entity in Transifex.
  • Most source files include the string in the original language, such as English (in the case of Gettext POT, this is the msgid content). Take this string and store it in the database too (as a Translation to English).
  • Replace this string with a hash/id in order to make the file English-less, and hence, a template for all languages to be exported from.
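
The import steps above can be sketched in Python for a toy "key = value" format. This is only an illustration under stated assumptions: the function name import_source, the SHA-256-based placeholder and the "_tr" suffix are all hypothetical, and the real parser depends on the file format.

```python
import hashlib
from typing import Dict, List, Tuple


def import_source(lines: List[str], source_lang: str = "en") -> Tuple[str, Dict[str, Dict[str, str]]]:
    """Turn 'key = value' lines into a template plus stored translations."""
    translations: Dict[str, Dict[str, str]] = {}
    template_lines = []
    for line in lines:
        if "=" not in line:
            template_lines.append(line)          # not translatable; keep as-is
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        # Hypothetical placeholder: a hash of the key marks the spot where
        # a translation will be substituted on export.
        placeholder = hashlib.sha256(key.encode("utf-8")).hexdigest() + "_tr"
        translations[placeholder] = {source_lang: value}   # source text stored as a translation
        template_lines.append(f"{key} = {placeholder}")
    return "\n".join(template_lines), translations


template, stored = import_source(["# greetings", "hello = Hello", "bye = Goodbye"])
```

After the call, template no longer contains any English text, and stored holds the English strings keyed by placeholder.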

Updating a Source Language file

When uploading/pushing a new source language file, Transifex will try to update the existing entities with the ones in the new file. Existing entities should be updated, new entities added, and no longer existing entities deleted.

  • Identify the translatable units/entities inside the file.
  • Use the English string and the metadata to find an existing 100% matching Entity in the database. If found, update it if needed (e.g. new metadata).
  • At the end we’ll have a list of new entities not found in the DB, and a list of existing entities in the DB which don’t seem to exist in the file. Try to find similar matches between them, i.e. whether there are entities which match more than 80%. If yes, consider them the same entity and update them.
  • Create the remaining new entities and delete the ones which were removed from the source file.
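
The update steps above can be sketched as follows. The text does not say how similarity is measured, so this example assumes the ratio from Python's difflib.SequenceMatcher and compares source strings only (no metadata); plan_update is a hypothetical name.

```python
from difflib import SequenceMatcher


def plan_update(existing, incoming, threshold=0.8):
    """Classify strings as kept, renamed (fuzzy-matched), added or deleted."""
    kept = [s for s in incoming if s in existing]          # 100% matches
    new = [s for s in incoming if s not in existing]
    gone = [s for s in existing if s not in incoming]
    renamed = {}
    for old in list(gone):
        # Pick the most similar unmatched new string, if any is close enough.
        best = max(new, key=lambda s: SequenceMatcher(None, old, s).ratio(), default=None)
        if best is not None and SequenceMatcher(None, old, best).ratio() > threshold:
            renamed[old] = best      # treated as the same entity, updated in place
            new.remove(best)
            gone.remove(old)
    return {"kept": kept, "renamed": renamed, "added": new, "deleted": gone}


plan = plan_update(
    existing=["Save file", "Open the document", "Quit"],
    incoming=["Save file", "Open the documents", "Print"],
)
```

Here "Open the document" and "Open the documents" are similar enough to be treated as one entity, while "Quit" is deleted and "Print" is created.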

Downloading a file

Downloading a source or translation file is exactly the same process.

When you request to download the file in French, Transifex will open the template file and substitute all English strings with the French equivalent strings. It won’t do a simple search-and-replace though, but rather use the whole entity metadata for the search. For example, two entities with “Hello” as an English string might be translated differently.

Here’s how Transifex does this:

  • Open the resource’s template file.
  • Update the file headers (depending on file format). For example, replace the Language header, or the Last-Translated-By header.
  • Replace all hashes with the appropriate translation.

Transifex can (and, depending on the format, will) differentiate on whether the user asked to download the file to use it in their project or to translate it offline, and produce a different output for each. This is necessary so that Transifex can ease the work of translators and developers in certain situations. For instance, for some formats that do not fall back automatically to the source strings in case of empty translations, the file the developer gets (by downloading the file in order to use it) might have any empty translations filled in with the source strings. The exact behavior for each format is documented in the corresponding section.
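
A minimal sketch of the export step and of the two download modes follows. The function export_file, the for_use flag and the placeholder strings are assumptions made for illustration; header updates are omitted.

```python
def export_file(template, placeholders, lang, for_use=True, source_lang="en"):
    """Fill a template's placeholders with the strings for one language."""
    output = template
    for placeholder, by_lang in placeholders.items():
        text = by_lang.get(lang, "")
        if not text and for_use:
            # Assumed behavior for formats with no automatic fallback:
            # developers get the source string instead of an empty value.
            text = by_lang[source_lang]
        output = output.replace(placeholder, text)
    return output


template = "greeting = ab12_tr\nfarewell = cd34_tr"
placeholders = {
    "ab12_tr": {"en": "Hello", "fr": "Bonjour"},
    "cd34_tr": {"en": "Goodbye"},            # not yet translated into French
}
```

Because each placeholder belongs to exactly one entity, two entities that share the English string "Hello" can still receive different translations.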

Supported File Types

Android Resources

Google Android’s format is natively supported in Transifex.

Associated file extensions:
.xml
i18n type:
ANDROID
Note:
Supported on Transifex.com.

The format of the android resources is described in the docs (see box at the end of the section).

Android resources is one of the formats that do not fall back to the source strings when translations are missing. For this reason, if the user downloads the Android file for use, all missing translations are filled in with the source strings. Downloading the file for translation, by contrast, leaves them empty, so that translators know which strings are untranslated.

This is an XML-based format. There are three types of entries: string, string-array and plurals.

The first type of entries contains simple strings or sentences. Each entry must have a name attribute which must be unique in a resource. This is represented as a single entry for translation in Transifex as well.

The string-array type also has a name attribute, which must be unique in a resource file as well. Each string-array contains multiple items. These items are represented as a single translation entity in Transifex, so that a particular language can have a different number of items. For now, the whole content of a string-array is shown, and the translator is responsible for creating a correct list of translated items. For example, the entry:

<item>one</item>
<item>other</item>

that lists the available quantities in English should be translated in Greek as:

<item>ένα</item>
<item>πολλά</item>

and in Welsh:

<item>un</item>
<item>dau</item>
<item>eraill</item>

since there are three quantities defined in Welsh grammar.

The last type of entries, plurals, also has an attribute name (which is required to be unique as well) and contains a list of items, each one having an attribute quantity that is used to select the string for a particular quantity.

Any attributes in the string, string-array and plural elements can be used. Transifex will recognize, however, a specific attribute, translate. When its value is false, Transifex will ignore the specific entry and not present it to translators, either in the web editor or in the files downloaded for offline translation.

Keep in mind that there are two ways to use quotes and apostrophes in Android resource files: one can either escape them or enclose the whole string in the other type of quotes than the one used inside it. In either case, Transifex removes the quotes which enclose the string (so that they are not presented to the translator) and escapes the quotes when a resources file is requested.

Another thing to note is that you don’t need to HTML-encode entities in strings in Android resource files. Transifex.com complies with this rule and does not encode HTML entities, unless there are format specifiers in the string, like %1.

Transifex also supports developer comments in Android .xml files. A valid XML comment preceding any of the following elements: ‘string’, ‘plurals’ or ‘string-array’ will be saved as comment for the source string extracted from the element.

The following examples are from the “Hello L10n” tutorial from the Android developer guide.

Here’s a sample res/values/strings.xml file containing our strings:

<?xml version="1.0" encoding="utf-8"?>
<resources>
    <!-- This is a comment -->
    <string name="app_name">Hello, L10N</string>
    <string name="text_a">Shall I compare thee to a summer's day?</string>
    <!--This is another comment-->
    <string name="text_b">Thou art more lovely and more temperate.</string>
    <string name="text_c" translate="false" foo="bar">DO NOT TRANSLATE</string>
</resources>

When translated in Transifex, the exported German file (e.g. res/values-de/strings.xml) will have this content:

<?xml version="1.0" encoding="utf-8"?>
<resources>
    <string name="app_name">Hallo, Lokalisierung</string>
    <string name="text_a">Soll ich dich einem Sommertag vergleichen,</string>
    <string name="text_b">Der du viel lieblicher und sanfter bist?</string>
</resources>

And the Japanese file (e.g. res/values-ja/strings.xml):

<?xml version="1.0" encoding="utf-8"?>
<resources>
    <string name="text_a">あなたをなにかにたとえるとしたら夏の一日でしょうか?</string>
    <string name="text_b">だがあなたはもっと美しく、もっとおだやかです。</string>
</resources>
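
To make the handling of the translate attribute and of the plurals entries more concrete, here is a small Python sketch that lists the entries a translator would see. It is not Transifex's parser; the function name and the plurals entry in the sample are invented for illustration, and string-array entries and comments are left out for brevity.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<?xml version="1.0" encoding="utf-8"?>
<resources>
    <string name="app_name">Hello, L10N</string>
    <string name="text_c" translate="false">DO NOT TRANSLATE</string>
    <plurals name="songs">
        <item quantity="one">%d song</item>
        <item quantity="other">%d songs</item>
    </plurals>
</resources>
"""


def translatable_entries(xml_text):
    """Return {name: text or {quantity: text}} for entries shown to translators."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = {}
    for element in root:
        if element.get("translate") == "false":
            continue                      # hidden from translators
        if element.tag == "string":
            entries[element.get("name")] = element.text or ""
        elif element.tag == "plurals":
            entries[element.get("name")] = {
                item.get("quantity"): item.text or "" for item in element.findall("item")
            }
    return entries


entries = translatable_entries(SAMPLE)
```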

Apple strings files

Apple .strings files are used for the localization of Mac OS X and iPhone applications.

When a user downloads a .strings translation file from Transifex for use, the file will contain all entries. However, in case there are untranslated ones, those will be filled in with their corresponding source strings.

On the other hand, when a user downloads a .strings file for translation, any untranslated entries will be filled in with their corresponding source strings, but they will be commented out, to make it easier for the translators to recognize them, while at the same time have the source strings for them available.

Associated file extensions:
.strings
i18n type:
STRINGS
Encoding:
UTF-16

/* Insert Element menu item */
"Insert Element" = "Insert Element";

/* Error string used for unknown error types. */
"ErrorString_1" = "An unknown error occurred.";

Note

Transifex expects the files to use the UTF-16 encoding, which is the default encoding for standard strings files.
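
The following Python sketch shows one way to read such a file: decode the bytes as UTF-16, then pick out each comment, key and value. The regular expression is a simplification we assume for illustration (it does not handle escaped quotes inside strings), and parse_strings is a hypothetical name.

```python
import re

# Matches an optional /* comment */ followed by "key" = "value";
ENTRY = re.compile(
    r'(?:/\*\s*(?P<comment>.*?)\s*\*/\s*)?"(?P<key>[^"]*)"\s*=\s*"(?P<value>[^"]*)"\s*;',
    re.DOTALL,
)


def parse_strings(raw_bytes):
    """Decode UTF-16 bytes and return a list of (key, value, comment) tuples."""
    text = raw_bytes.decode("utf-16")     # the byte order mark selects the byte order
    return [(m["key"], m["value"], m["comment"]) for m in ENTRY.finditer(text)]


sample = (
    '/* Insert Element menu item */\n'
    '"Insert Element" = "Insert Element";\n'
    '\n'
    '/* Error string used for unknown error types. */\n'
    '"ErrorString_1" = "An unknown error occurred.";\n'
).encode("utf-16")

parsed = parse_strings(sample)
```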

Desktop files

A .desktop file contains “desktop entries”: configuration data describing how a particular program is launched, how it appears in menus, etc. The format is widely used by KDE and GNOME. Keep in mind that the source file will have all translations, while the downloaded files will only have the strings in the respective language.

Associated file extension:
.desktop
i18n type:
DESKTOP

Sample data

[Desktop Entry]
Icon=okular
Name=Okular
Name[ar]=اوكلار
Name[ast]=Okular
Name[bg]=Okular
Name[ca]=Okular
X-KDE-ServiceTypes=KParts/ReadOnlyPart
X-KDE-Library=okularpart
Type=Service
MimeType=application/vnd.kde.okular-archive;
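
As a sketch of the per-language download described above, the following Python function keeps the unlocalized lines plus the entries for a single language. The exact layout of the file Transifex produces is not specified here, so treat this as one plausible reading; split_desktop is a hypothetical name.

```python
import re

# A localized entry looks like Key[lang]=value
LOCALIZED = re.compile(r"^(?P<key>[A-Za-z0-9-]+)\[(?P<lang>[^\]]+)\]=(?P<value>.*)$")


def split_desktop(text, lang):
    """Keep unlocalized lines plus only the entries for the given language."""
    kept = []
    for line in text.splitlines():
        match = LOCALIZED.match(line)
        if match is None or match["lang"] == lang:
            kept.append(line)
    return "\n".join(kept)


source = "[Desktop Entry]\nIcon=okular\nName=Okular\nName[bg]=Okular\nName[ca]=Okular\nType=Service"
catalan = split_desktop(source, "ca")
```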

Gettext-based formats (PO files)

Associated file extensions:
.po, .pot
i18n type:
PO

A PO file is made up of many entries, each describing the relation between an original untranslated string and its corresponding translation. All entries in a given PO file usually pertain to a single project, and all translations are expressed in a single target language.

Here is the typical workflow developers use with gettext, taken from the gettext manual itself:

Original C Sources ───> Preparation ───> Marked C Sources ───╮
                                                             │
              ╭─────────<─── GNU gettext Library             │
╭─── make <───┤                                              │
│             ╰─────────<────────────────────┬───────────────╯
│                                            │
│   ╭─────<─── PACKAGE.pot <─── xgettext <───╯   ╭───<─── PO Compendium
│   │                                            │              ↑
│   │                                            ╰───╮          │
│   ╰───╮                                            ├───> PO editor ───╮
│       ├────> msgmerge ──────> LANG.po ────>────────╯                  │
│   ╭───╯                                                               │
│   │                                                                   │
│   ╰─────────────<───────────────╮                                     │
│                                 ├─── New LANG.po <────────────────────╯
│   ╭─── LANG.gmo <─── msgfmt <───╯
│   │
│   ╰───> install ───> /.../LANG/PACKAGE.mo ───╮
│                                              ├───> "Hello world!"
╰───────> install ───> /.../bin/PROGRAM ───────╯

A typical PO file entry has the following schematic structure:

#  translator-comments
#. extracted-comments
#: reference...
#, flag...
#| msgid previous-untranslated-string
msgid "untranslated-string"
msgstr "translated-string"

A PO file is composed of multiple PO entries, and their layout is something like this:

# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2010-06-08 10:12+0300\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

#: actionlog/templates/object_action_list.html:7 txpermissions/forms.py:18
msgid "User"
msgstr ""

#: actionlog/templates/object_action_list.html:8
msgid "Action"
msgstr ""

#: foo/templates/bar.html:180
msgid "{0} result"
msgid_plural "{0} results"
msgstr[0] ""
msgstr[1] ""

Transifex supports any kind of project and i18n configuration which produces valid PO files. To check the validity of your PO files, use the command msgfmt -c <pofile>, which is part of the standard gettext package.

Also, only .po/.pot files encoded in UTF-8 and without any BOM (Byte Order Mark) are supported. So, if your text editor adds such marks to the beginning of files by default then you will need to reconfigure it.
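
If you want to check the encoding requirement before uploading, a quick test along these lines may help. It only covers the UTF-8 and BOM conditions; syntactic validity is still best checked with msgfmt -c. The function name check_po_bytes is our own.

```python
import codecs


def check_po_bytes(data):
    """Return a list of encoding problems found in the raw bytes of a PO file."""
    problems = []
    if data.startswith(codecs.BOM_UTF8):
        problems.append("file starts with a UTF-8 byte order mark")
        data = data[len(codecs.BOM_UTF8):]
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("file is not valid UTF-8")
    return problems
```

For a real file you would pass in the result of open(path, "rb").read(); an empty list means neither problem was detected.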

Here is a list of packages which provide PO files:

Managing PO files

Importing PO/POT files

If your project has a POT file, you should probably use that as the source file for creating a resource; alternatively, you can use the PO file of the project’s source language. Whenever you import a new source file into a resource, the file is parsed using polib and each PO entry is added as a translation string in the database. The file is also saved in the database and used as a template to export strings from the database and create PO files whenever a user makes a request.

Before importing a file, you should be aware of the following things:

  • When importing POT files, the msgid of each PO entry is treated like a string for translation. On the other hand, if you upload a PO file, the strings are taken from the msgstr of each PO entry.
  • Fuzzy strings are not supported in the classic sense in Transifex. When importing a PO file, fuzzy translations are converted to Translation Suggestions.
  • Broken PO/POT files will not be imported. Before uploading a PO/POT file, make sure you check it for correctness with the command msgfmt -c file and fix any errors that are reported.
  • Comments in the headers of the translation files are not supported. Each exported translation file is created using the source file as a template so only comments in the headers of the source file will be preserved.
  • Some PO header information may be lost during the import. For example, we can only store the Last-Translator information if this user is already a Transifex member.
  • If a plural string doesn’t have all the plural form fields filled in or the number of plural forms is different than the number we have in our database for this language, the translations will not be imported.
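
The last rule in the list above (plural entries must have every plural form filled in, and the right number of forms for the language) can be sketched as a simple check. The function name and the dictionary layout, which maps a plural index to its msgstr text, are assumptions for illustration.

```python
def plural_entry_is_importable(msgstr_plural, expected_forms):
    """True if every expected plural form is present and non-empty."""
    if len(msgstr_plural) != expected_forms:
        return False                      # wrong number of forms for this language
    return all(msgstr_plural.get(i, "").strip() for i in range(expected_forms))
```

For example, an entry with msgstr[0] filled in and msgstr[1] left empty would be rejected for a language with two plural forms.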

Note

If your project has a POT file with logical IDs in the msgid section, then when you upload the POT file to Transifex the logical IDs will populate the source language as well. To fix this, either assign the translation file of the source language as the source file of the project, or upload that translation file after the POT so that it overwrites the translations for the source language.

Exporting PO files

When you want to download a PO file for viewing or offline editing, Transifex will try to compile the file with the translation strings of the requested language and the source translation file as a template. The exported PO file is not guaranteed to be exactly the same as the imported file and that is beyond the scope of this application.

However, here are some design decisions that affect the exported PO file and that you should be aware of:

  • Imported fuzzy strings and Translation Suggestions won’t be in the exported files. Only translated strings are included in the exported file.
  • The PO file headers will be updated using the latest data from Transifex. The headers that are altered by Transifex are:
    • Last-Translator which displays the last user that contributed to this language.
    • Project-Id-Version is the name of the project to which this resource belongs.
    • PO-Revision-Date is the time when the PO was last altered (usually set with datetime.datetime.now()).
    • Language is the language of the translation file.
    • Language-Team contains the team information for this language if a language team exists for this project.

See also

PO file format
An introduction to the PO/POT files.
GNU gettext documentation
Full documentation on GNU gettext including the handling of PO/POT files.

Publican

Publican is a featureful documentation framework, powering the documentation of the Fedora Project. It supports multiple source and translation files, allowing more than one person to translate into the same language by working on different files at the same time.

Transifex supports Publican on top of the PO file format.

See also

Publican Development Homepage
The development page of Publican itself, with lots of documentation.
Fedora’s Documentation
A few guides and other resources written using Publican and translated using Transifex for the Fedora Project.

Intltool

Intltool is a tool used to dynamically generate PO files by accessing the software’s source code. Its advantage over gettext is its support for additional source types such as oaf, glade, bonobo ui, nautilus theme and other XML files. Intltool was written specifically for GNOME.

Maintainers use intltool either with or without a POT source language file.

With a POT file

A maintainer uses intltool to maintain a POT file (e.g. an English one), similarly to gettext. This way, translators treat the project like a normal gettext-based project. To work, they only need access to the POT file and their own PO file.

Transifex supports this kind of intltool projects.

Without a POT file

A maintainer can go one step further and delete the POT file from the repository altogether, so that they won’t have to manage it and keep it updated. This delegates the responsibility of updating the translation files to the translators themselves. Translators and translation tools are required to download the whole source code and produce or update their translation files using intltool.

A major problem with this method (i.e. pushing part of the internationalization process to translators and their tools, including Transifex) arises when the POT file fails to build for any reason. Translators are unable to work at all, and they do not have the credentials to go on and fix the problem. This is a common problem with intltool-based projects without a POT file.

For the above reasons, Transifex only supports intltool projects with a POT file. Starting from v1.0, Transifex requires a source language file (POT). Since it no longer clones the whole repository, but only requires an HTTP link to a source language file, it is impossible for Transifex to generate the POT file itself.

If you are maintaining a project using intltool and would like to manage your translations using Transifex, you should generate your POT file (first method) and make sure it’s kept updated so that translators are able to work with it. Depending on your translation workflow, there are a number of ways to ensure your POT file is always updated. You may add a build rule in your Makefile which updates your POT file (e.g. intltool-update --pot) or simply set up a cron job on your workstation or server to update it once per day.

See also

Intltool Homepage
The development page of Intltool itself.
Localize using gettext and intltool
GNOME’s instructions on using intltool.

Java property files

Associated file extensions:
.properties
i18n type:
PROPERTIES

Java .properties files are one of the formats used in Java applications for internationalization purposes. The format is relatively simple: each line consists (in the most common case) of a key and its associated value.

The translations downloaded from Transifex will use the source strings for any empty translation strings and will have the relevant entries commented-out.

The Java .properties standard dictates that the encoding needs to be ISO-8859-1 and not UTF-8. If you’d like to use Unicode files, choose the “Java PROPERTIES File - Unicode” format.
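
Since ISO-8859-1 cannot represent most scripts directly, characters outside it are conventionally written in .properties files as \uXXXX escapes. The sketch below shows one way to produce such escapes in Python; for simplicity it escapes every non-ASCII character, and the function name is our own.

```python
def escape_for_properties(text):
    """Escape non-ASCII characters as \\uXXXX, using UTF-16 code units."""
    out = []
    for char in text:
        if ord(char) < 128:
            out.append(char)
            continue
        units = char.encode("utf-16-be")          # 2 bytes, or 4 for a surrogate pair
        for i in range(0, len(units), 2):
            out.append("\\u%04X" % int.from_bytes(units[i:i + 2], "big"))
    return "".join(out)
```

The result contains only ASCII characters, so it is also valid ISO-8859-1.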

Joomla INI files

Associated file extensions:
.ini
i18n type:
INI

This is the file format used by the Joomla! CMS.

The translations downloaded from Transifex will use the source strings for any empty translation strings and will have the relevant entries commented-out.

Note

The files uploaded to Transifex should be encoded in UTF-8.

See also

Documentation
Joomla: How to create a language pack

Magento CSV files

Associated file extensions:
.csv
i18n type:
MAGENTO

This is a CSV file format used by the Magento Community. It allows comments (lines starting with a #), which are retained in the translated files when downloaded from Transifex.

Untranslated source strings are assigned an empty translation string: "". Similarly, in a downloaded reviewed translation file, unreviewed translations appear as "".
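
Reading a file in this format, including its comment lines, could be sketched in Python as follows. The function name is hypothetical, and filtering comment lines before CSV parsing is a simplification that assumes no quoted field spans several lines.

```python
import csv
import io


def read_magento_csv(text):
    """Return (source, translation) pairs, skipping comments and blank lines."""
    data_lines = [
        line for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    reader = csv.reader(io.StringIO("\n".join(data_lines)))
    return [(row[0], row[1]) for row in reader if len(row) >= 2]


sample = (
    '"Hello, world!","Hallo, Welt!"\n'
    '# "this is a comment","Dies ist ein Kommentar"\n'
    '\n'
    '"a phrase with ""double quotes""","Ein Satz mit ""doppelte Anführungszeichen"""\n'
)
pairs = read_magento_csv(sample)
```

Doubled quotes inside a field are unescaped by the csv module, so translators see a single pair of quotes.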

A sample Magento CSV file containing the English source strings and their German equivalents is shown below:

"Hello, world!","Hallo, Welt!"
# "this is a comment","Dies ist ein Kommentar"

"a phrase with ""double quotes""","Ein Satz mit ""doppelte Anführungszeichen"""
"a phrase with 'single quotes'","Ein Satz mit 'einfachen Anführungszeichen'"

Maker Interchange Format (MIF) files

Associated file extensions:
.mif
i18n type:
MIF

This is a markup language for Adobe’s FrameMaker product. It is mainly used for technical documentation. When such a file is downloaded for use, empty translations are replaced by the strings in the source language. When downloaded for translation (or review) in a language, only translated (or reviewed) strings in that language are present in the downloaded file.

An example follows.

<MIFFile 8.00>

<Para
    <Paraline
        <String `Hello, world'>
    >
    <String `Øβ¿'>
>

Note

Supported in Transifex.com.

See also

Documentation MIF Reference

Mozilla DTD files

A DTD file contains a list of entities that need to be localized. The entities defined in these files are then used inside the user interface XUL files.

Associated file extensions:
.dtd
i18n type:
DTD
Encoding:
UTF-8

Sample data

<!ENTITY foo.var1 "Hello">
<!-- This is a comment -->
<!ENTITY foo.var2 "How are you?">

See also

Documentation Mozilla localization quick guide

Mozilla Property Files

You can also translate Mozilla property files with Transifex. These files must be escaped Unicode encoded. In these files, a number of JavaScript properties are each assigned a string value.

The translations downloaded from Transifex will use the source strings for any empty translation strings and will have the relevant entries commented-out.

String resources in Mozilla property files appear in the following form:

property_name = This is a value text

Associated file extensions:
.properties
i18n type:
MOZILLAPROPERTIES
Notes:
The files to be uploaded must be escaped Unicode encoded.

See also

Documentation Mozilla localization quick guide

PHP Files

Notes:
This applies to the PHP Array, Define and Alternative Array i18n types. When you download such a file for use in a particular language, all empty translations in that language are replaced by the strings in the source language. When you download the file for translation in a language, it contains only the translations in that language. When you download the file for review in a language, it contains only the reviewed translations in that language.

PHP Arrays

Transifex also supports the translation of PHP arrays used for the internationalization of some PHP projects.

Associated file extensions:
.php
i18n type:
PHP_ARRAY
Notes:
Supported on Transifex.com.

<?php
$LANG = array(
    "january"   => "enero",
    "february"  => "febrero",
    "march"     => "marzo",
    "april"     => "abril",
    "may"       => "mayo",
    "june"      => "junio",
    "july"      => "julio",
    "august"    => "agosto",
    "september" => "septiembre",
    "october"   => "octubre",
    "november"  => "noviembre",
    "december"  => "diciembre"
);?>

PHP Alternative Array

Transifex also supports an alternative version of PHP arrays: array assignment.

Associated file extensions:
.php
i18n type:
PHP_ALT_ARRAY
Notes:
Supported on Transifex.com.

Sample data:

<?php

$LANG['_MONDAY'] = "Monday";
$LANG["_TUESDAY"] = 'Tuesday';

/**This is a multiline
 * comment***/
$LANG["_WEDNESDAY"] = '';

$LANG["_Thursday"] = "Thursday";

?>

PHP DEFINE statements

Transifex also supports the translation of valid PHP DEFINE statements used for the internationalization of some PHP projects.

Associated file extensions:
.php
i18n type:
PHP_DEFINE
Notes:
Supported on Transifex.com.

Sample data

<?php
    DEFINE("january", "enero");
    DEFINE("february", "febrero");
    DEFINE("march", "marzo");
    DEFINE("april", "abril");
    DEFINE("may", "mayo");
    DEFINE("june", "junio");
    DEFINE("july", "julio");
    DEFINE("august", "agosto");
    DEFINE("september", "septiembre");
    DEFINE("october", "octubre");
    DEFINE("november", "noviembre");
    DEFINE("december", "diciembre");
?>

Plain text

Transifex now supports plain text files with the .txt extension. The file is split on every newline, and the resulting strings are imported, ignoring empty lines and lines with no content. Translation strings are mapped to source strings on the basis of their order in the file. For example, a string appearing 3rd in the translation file will be mapped to the source string appearing 3rd in the source file.

When downloaded for use, empty translations in the downloaded file are replaced by their corresponding source strings. When downloaded for review, the file contains only the reviewed translations; when downloaded for translation, it contains all translations.

Note that, at the moment, Transifex accepts only 100% translated files as translation uploads. Transifex will complain if a file with more or fewer strings than the source file is uploaded as a translation.

Associated file extensions:
.txt
i18n type:
TXT
Encoding:
UTF-8
Sample content:
  • Source file

    Hello!
    
    It's been a long time, eh.
    I hope you are doing great.
  • Translation file

    Γεια σας!
    
    Έχουμε καιρό να ειδωθούμε, ε.
    Ελπίζω να περνάτε καλά.
  • Invalid translation file

    Γεια σας!
    
    Ελπίζω να περνάτε καλά.
  • Another invalid translation file

    Γεια σας!
    
    Έχουμε καιρό να ειδωθούμε, ε.
    Ελπίζω να περνάτε καλά.
    
    Foo
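
The order-based mapping and the string-count check illustrated by the samples above can be sketched in Python like this. The function names are our own and the error message is illustrative only.

```python
def non_empty_lines(text):
    """Split on newlines and drop lines with no content."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def map_translation(source_text, translation_text):
    """Pair source and translated strings by position; reject count mismatches."""
    source = non_empty_lines(source_text)
    translated = non_empty_lines(translation_text)
    if len(source) != len(translated):
        raise ValueError(f"expected {len(source)} strings, got {len(translated)}")
    return list(zip(source, translated))


SOURCE = "Hello!\n\nIt's been a long time, eh.\nI hope you are doing great.\n"
VALID = "Γεια σας!\n\nΈχουμε καιρό να ειδωθούμε, ε.\nΕλπίζω να περνάτε καλά.\n"
INVALID = "Γεια σας!\n\nΕλπίζω να περνάτε καλά.\n"
```

Calling map_translation(SOURCE, VALID) pairs the three strings by position, while map_translation(SOURCE, INVALID) raises an error because one string is missing.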

Property List (.plist) files

Transifex now supports Property Lists or p-list files. Property list files are used to store serialized objects. They are mostly used in Mac OS X, iOS, NeXTSTEP, and GNUstep programming frameworks.

Associated file extensions:
.plist
i18n type:
PLIST
Encoding:
UTF-8
Notes:
Supported on Transifex.com.

P-list files are XML-based files. The format defines tags for the related CoreFoundation types and Foundation classes. The following table lists the XML tags, the related Foundation classes and CoreFoundation types, and the data storage formats.

XML Tag                Foundation class   CoreFoundation type   Storage format
<string>               NSString           CFString              UTF-8 encoded string
<real>, <integer>      NSNumber           CFNumber              Decimal string
<true/>, or <false/>   NSNumber           CFBoolean             No data (tag only)
<date>                 NSDate             CFDate                ISO 8601 formatted string
<data>                 NSData             CFData                Base64 encoded data
<array>                NSArray            CFArray               Can contain any number of child elements
<dict>                 NSDictionary       CFDictionary          Alternating <key> tags and plist element tags

Of the XML tags mentioned above, only the <string> tag contains data that should be translated. So, Transifex first validates the content of the .plist file, then searches the entire PLIST tree for <string> tags and imports the strings contained in them for translation. XML tags that do not belong to the p-list format are ignored, while comments and valid non-<string> p-list tags are saved as they are.

Transifex expects a single property-list object as the root at the top of the p-list tree. This root object can be of a primitive type (anything other than <dict> and <array>) or of a container type such as <dict> or <array>. If there isn’t a single root at the top of the hierarchy, Transifex will complain verbosely to let the user know about this.

Transifex also supports comments in p-list files. A sample p-list content is shown below.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<array>
    <dict>
        <!--This comment applies to all the elements under key John-->
        <key>John</key>
        <dict>
            <key>Name</key>
            <string>John</string>
            <!--This comment overrides any comment inherited from parents-->
            <key>Country</key>
            <string>Brazil</string>
        </dict>
    </dict>
    <integer>1</integer>
    <string>This is a text</string>
    <dict>
        <key>Some countries</key>
        <array>
            <string>India</string>
            <string>Greece</string>
            <string>Brazil</string>
        </array>
    </dict>
</array>
</plist>
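
The string-collection step described above can be approximated in a few lines of Python. This is only a minimal sketch, not Transifex's actual importer: the function name extract_plist_strings and the shortened SAMPLE document are made up for illustration, and the validation is reduced to checking for a single root object.

```python
import xml.etree.ElementTree as ET


def extract_plist_strings(plist_xml):
    """Return the text of every <string> element, in document order.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(plist_xml)
    # The <plist> wrapper must hold exactly one root object
    # (a primitive such as <string>, or a container such as <array>).
    if root.tag != "plist" or len(root) != 1:
        raise ValueError("expected <plist> with a single root object")
    # Only <string> values are translatable; <key>, <integer>, etc. are skipped.
    return [element.text or "" for element in root.iter("string")]


# Shortened sample; the default ElementTree parser silently drops comments.
SAMPLE = """<plist version="1.0">
<array>
    <dict>
        <!--A comment for the Name entry-->
        <key>Name</key>
        <string>John</string>
    </dict>
    <integer>1</integer>
    <string>This is a text</string>
</array>
</plist>"""

print(extract_plist_strings(SAMPLE))
```

Because the default parser discards comments, a real handler that stores comments would need a comment-preserving parser; that part is left out of this sketch.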

Transifex imports any valid XML comment preceding an element (<dict>, <array>, <key> inside <dict>, <string>) as the comment for that element if it is a primitive type such as <string>; otherwise it applies the comment to the descendants of the element. If a descendant element has its own comment, that comment overrides any comment inherited from a parent. Note that inside a dictionary (<dict>) element, the comment must precede the <key> element, not the value element.

When a p-list file is downloaded for use from Transifex, any untranslated string in it is replaced by the source string. When downloaded for review or for translation, the file contains only translated strings. In all cases, the elements that Transifex neither parses nor ignores are present in the downloaded files exactly as they were in the source p-list file.

Qt Linguist (TS files)

Associated file extensions:
.ts
i18n type:
QT

The Qt format is too complicated to analyze here in great detail. If you’re interested in how a Qt file is organized, head to Qt’s documentation for more information. For our needs, we’ll consider a TS file as a simple XML-formatted file with the following layout:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE TS>
<TS version="2.0" language="en">
<context>
    <name>Kuvasin</name>
    <message>
        <location filename="kuvasin.cpp" line="115"/>
        <source>PROCESSING START...</source>
        <translation>Starting scan and copy process...</translation>
    </message>
    <message>
        <location filename="kuvasin.cpp" line="136"/>
        <source>USER ABORT.</source>
        <translation>User aborted processing.</translation>
    </message>
    <message numerus="yes">
        <location filename="kuvasin.cpp" line="140"/>
        <source>%n FILES PROCESSED.</source>
        <translation>
            <numerusform>One (%n) file processed.</numerusform>
            <numerusform>%n files processed.</numerusform>
        </translation>
    </message>
    ...
</context>
<context>
    <name>QLabelDropTarget</name>
    <message>
        <location filename="qlabeldroptarget.cpp" line="8"/>
        <source>ITEMS:</source>
        <translation>Items:</translation>
    </message>
    <message>
        <location filename="qlabeldroptarget.cpp" line="46"/>
        <source>SINGLE ITEM ONLY, PLEASE.</source>
        <extracomment>Multiple items are being dragged over this widget, show
a note that it won&apos;t be accepted.</extracomment>
        <translation>This drop target accepts only a single item at a
time.</translation>
    </message>
    <message>
        <location filename="qlabeldroptarget.cpp" line="68"/>
        <source>MISMATCH: FOLDER &apos;%1&apos;.</source>
        <extracomment>Widget accepts only files, and now it is being offered a
directory.</extracomment>
        <translation>This drop target accepts only files. &apos;%1&apos; is a
folder.</translation>
    </message>
    <message>
        <location filename="qlabeldroptarget.cpp" line="78"/>
        <source>MISMATCH: FILE &apos;%1&apos;.</source>
        <extracomment>Widget accepts only directories, and now it is being
offered a file.</extracomment>
        <translation>This drop target accepts only folders. &apos;%1&apos; is
a file.</translation>
    </message>
</context>
</TS>

See also

Internationalization with Qt v4.7
An introduction on how Qt manages the i18n process.
Qt v4.7 Linguist Manual: TS File Format
The full DTD of the TS file format.

Importing TS files

Given the file format, the import process is pretty straightforward. We parse the XML into a DOM and iterate through all the messages, saving every non-empty translation to the appropriate language. For source files, we also save the translation file in the database as a template, which is used to create translation files for download on every user request.

Similar to PO files, in plural forms (where numerus="yes"), if one of the numerus forms is empty or the number of numerus forms differs from the number of plurals for the specific language, the translation will not be imported. Only fully translated strings are added to the database.
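
The import rules above can be sketched roughly as follows. This is an illustration under stated assumptions rather than Transifex's own code: collect_translations, the (context, source) keys and the nplurals parameter (the number of plural forms of the target language) are all hypothetical names.

```python
import xml.etree.ElementTree as ET


def collect_translations(ts_xml, nplurals):
    """Map (context name, source string) to its translation.

    Empty translations and incomplete plural entries are skipped.
    `nplurals` is an assumed input: the plural count of the target language.
    """
    root = ET.fromstring(ts_xml)
    collected = {}
    for context in root.iter("context"):
        context_name = context.findtext("name", default="")
        for message in context.iter("message"):
            source = message.findtext("source", default="")
            translation = message.find("translation")
            if translation is None:
                continue
            if message.get("numerus") == "yes":
                forms = [(form.text or "").strip()
                         for form in translation.findall("numerusform")]
                # Every plural form must be present and non-empty.
                if len(forms) != nplurals or not all(forms):
                    continue
                collected[(context_name, source)] = forms
            else:
                text = (translation.text or "").strip()
                if text:
                    collected[(context_name, source)] = text
    return collected


SAMPLE = """<TS version="2.0" language="en">
<context>
    <name>Kuvasin</name>
    <message>
        <source>USER ABORT.</source>
        <translation>User aborted processing.</translation>
    </message>
    <message>
        <source>PROCESSING START...</source>
        <translation></translation>
    </message>
    <message numerus="yes">
        <source>%n FILES PROCESSED.</source>
        <translation>
            <numerusform>One (%n) file processed.</numerusform>
            <numerusform></numerusform>
        </translation>
    </message>
</context>
</TS>"""

# Only the first message survives: the second is empty and the
# plural entry has an empty numerus form.
print(collect_translations(SAMPLE, nplurals=2))
```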

Exporting TS files

Whenever a user requests a translation file for downloading, Transifex compiles it from scratch on top of a template file that was created based on the source file. The metadata for each context and message is kept intact; the export only modifies the actual translation strings and the file header indicating the language of the translation file.
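
As a rough sketch of this template-based export (again with hypothetical names, and with plural messages deliberately left out to keep it short), the template is parsed, the language attribute in the header is updated, and only the <translation> text is rewritten:

```python
import xml.etree.ElementTree as ET


def export_ts(template_xml, language, translations):
    """Fill a TS template with translations for one language.

    `translations` is an assumed mapping of (context name, source) to text.
    Metadata such as <location> is left untouched.
    """
    root = ET.fromstring(template_xml)
    root.set("language", language)  # header: language of the exported file
    for context in root.iter("context"):
        context_name = context.findtext("name", default="")
        for message in context.iter("message"):
            if message.get("numerus") == "yes":
                continue  # plural messages are not handled in this sketch
            node = message.find("translation")
            if node is not None:
                source = message.findtext("source", default="")
                node.text = translations.get((context_name, source), "")
    return ET.tostring(root, encoding="unicode")


TEMPLATE = """<TS version="2.0" language="en">
<context>
    <name>QLabelDropTarget</name>
    <message>
        <location filename="qlabeldroptarget.cpp" line="8"/>
        <source>ITEMS:</source>
        <translation>Items:</translation>
    </message>
</context>
</TS>"""

exported = export_ts(TEMPLATE, "de", {("QLabelDropTarget", "ITEMS:"): "Elemente:"})
print(exported)
```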

Subtitle formats (3)

Various subtitle formats are natively supported in Transifex.

Associated file extensions:
.srt, .sub, .sbv
i18n type:
SRT, SUB and SBV respectively.
Notes:
Supported on Transifex.com.

More specifically, the supported subtitle file formats are SubRip, SubViewer and YouTube captions (.sbv).

Notes:
When you download a subtitle file for translation (or review) in a language, the downloaded file contains only the translations (or only the reviewed translations, when downloaded for review) in that language. When the file is downloaded for use, all empty translations are replaced by the corresponding source-language strings.

SubRip

SubRip is the native subtitle format of the SubRip program. It is one of the most used formats for subtitles, supported by most software video players, many subtitle creation/editing tools and some hardware home media players. YouTube also supports .srt files. Some subtitles feature certain HTML tags inside the SubRip text.

1
00:00:20,000 --> 00:00:24,400
<i>Altocumulus</i> clouds occur between six thousand

2
00:00:24,600 --> 00:00:27,800
and twenty thousand feet above ground level.
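
To make the SubRip layout concrete, here is a minimal parsing sketch. It is not the parser Transifex uses; parse_srt, text_for_use and the index-keyed translations mapping are hypothetical, and malformed cues are simply skipped.

```python
import re


def parse_srt(text):
    """Split SubRip content into cues with index, timing and text."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # malformed or text-less cue; skipped in this sketch
        start, end = (part.strip() for part in lines[1].split("-->"))
        cues.append({
            "index": int(lines[0]),
            "start": start,
            "end": end,
            "text": "\n".join(lines[2:]),
        })
    return cues


def text_for_use(cue, translations):
    """'Download for use' behaviour: fall back to the source text
    when the translation for a cue is missing or empty."""
    return translations.get(cue["index"]) or cue["text"]


SAMPLE = """1
00:00:20,000 --> 00:00:24,400
<i>Altocumulus</i> clouds occur between six thousand

2
00:00:24,600 --> 00:00:27,800
and twenty thousand feet above ground level.
"""

cues = parse_srt(SAMPLE)
print(cues[0]["start"], cues[0]["end"])
print(text_for_use(cues[1], {2: ""}))
```

Only the text lines of each cue are translatable; the index and timing lines would be carried over unchanged.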

SubViewer

SubViewer files (.sub) are the native subtitle format of the SubViewer utility. They are also widely supported.

00:04:35.03,00:04:38.82
Hello guys... please sit down...

00:05:00.19,00:05:03.47
M. Franklin,[br]are you crazy?

YouTube captions (.sbv)

.sbv is the file format produced by YouTube’s Automatic Timing feature, which creates automatically timed captions based on a transcript.

0:00:00.000,0:00:07.000
>> TIM: So its 1976 I'm coming to the end
of my career at Oxford learning physics -

0:00:08.950,0:00:15.950
I really don't know anybody who's done physics
at a PhD level so I don't have a role model

Wiki markup

Wiki markup is the syntax and keywords used by the MediaWiki software and other Wiki packages to format a page. It is a plain text format.

Associated file extensions:
.wiki
i18n type:
WIKI

See also

Documentation
MediaWiki formatting help

Windows resource files (.resx)

Windows resource files (.resx) are used (among other things) for the internationalization of .NET and Windows Phone applications. Transifex supports .resx files natively. When a file of this type is downloaded for translation (or review) in a language, the downloaded file contains only translations (or reviewed translations in case downloaded for review) in that language. When downloaded for use, it’s the same as when downloaded for translation.

Associated file extensions:
.resx
i18n type:
RESX
Notes:
Supported on Transifex.com.

HTML/XHTML

Transifex has full support for HTML and XHTML content (fragments or whole documents).

Associated file extensions:
.html, .xhtml
i18n type:
HTML, XHTML
Notes:
This feature is in Beta form. Supported on Transifex.com.

Any part of an HTML/XHTML document can be uploaded to Transifex and it will be parsed, split into smaller chunks and made available for translation either through Lotte or offline. In the case of XHTML, an XML parser is used for parsing so the content must be a valid XHTML document.

How it works

When importing, content is split into smaller chunks. Each chunk is a translatable unit. Inline elements are not considered separate translatable units, i.e., the content is not split on them, so they are shown within the surrounding text. It is up to the translator to determine the correct way to handle these inline elements, as well as their order, in the language they are working in.

For example, the following fragment:

<p title="Paragraph title">Some text and a <a href="http://..."
title="link title">link</a>.</p>

will be split into the following chunks:

  • Paragraph title
  • Some text and a <a href="http://..." title="link title">link</a>.
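
The splitting shown above can be imitated with Python's standard html.parser module. This is a simplified sketch and makes several assumptions: the INLINE_TAGS set is a made-up, incomplete list of inline elements, only the title attribute of non-inline elements is extracted, and the names ChunkCollector and split_into_chunks are hypothetical.

```python
from html.parser import HTMLParser

# Assumed (incomplete) set of inline elements that stay inside a chunk.
INLINE_TAGS = {"a", "b", "i", "em", "strong", "span", "code"}


class ChunkCollector(HTMLParser):
    """Collects translatable chunks; inline markup is kept verbatim."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._buffer = []

    def _flush(self):
        text = "".join(self._buffer).strip()
        if text:
            self.chunks.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in INLINE_TAGS:
            # Keep the inline tag exactly as written in the source.
            self._buffer.append(self.get_starttag_text())
            return
        self._flush()  # a block-level tag ends the current chunk
        for name, value in attrs:
            if name == "title" and value:
                self.chunks.append(value)

    def handle_endtag(self, tag):
        if tag in INLINE_TAGS:
            self._buffer.append("</%s>" % tag)
        else:
            self._flush()

    def handle_data(self, data):
        self._buffer.append(data)


def split_into_chunks(html):
    collector = ChunkCollector()
    collector.feed(html)
    collector.close()
    collector._flush()  # emit any text left after the last tag
    return collector.chunks


FRAGMENT = ('<p title="Paragraph title">Some text and a '
            '<a href="http://..." title="link title">link</a>.</p>')
for chunk in split_into_chunks(FRAGMENT):
    print(chunk)
```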

If you are importing both source and translation files for the first time (bootstrapping your project with existing translations), the importer assumes that the translated files are 100% complete (not partially translated). If you attempt to import partially translated HTML files, the importer will import the full file in the target language (both translated and English strings), since there is no way to know for sure whether the content should be considered translated or untranslated.

Additionally, it is also required that the source and translated file are correctly aligned. The elements have to appear in the DOM in the same order between the two files so that a first alignment will take place between the entities. In case the files are not aligned as expected, information should be provided to help you correct it.

After the initial import, you may end up having partially translated HTML files, and Transifex supports a way to work with them offline. When downloading for offline translation (“Download for translation”), Transifex will add special attributes with alignment information to each element of the exported file so that translated units can be matched with their source entities during import. These attributes can be ignored by the translators. “Download for use” will export a “clean” version of the file (without special attributes).

Empty translations are replaced with the corresponding source strings. When downloading for review, empty or unreviewed translations are also replaced with their source strings.

XLIFF

XLIFF (XML Localisation Interchange File Format) is an XML-based format created to standardize localization.

Associated file extensions:
.xlf, .xliff, .xml
i18n type:
XLIFF

An example of an XLIFF file:

<?xml version="1.0"?>
<xliff version="1.1">
 <file original="Graphic Example.psd"
  source-language="EN-US" target-language="JA-JP"
  tool="Rainbow" datatype="photoshop">
  <header>
   <skl>
    <external-file uid="3BB236513BB24732" href="Graphic Example.psd.skl"/>
   </skl>
   <phase-group>
    <phase phase-name="extract" process-name="extraction"
     tool="Rainbow" date="20010926T152258Z"
     company-name="NeverLand Inc." job-id="123"
     contact-name="Peter Pan" contact-email="ppan@xyzcorp.com">
     <note>Make sure to use the glossary I sent you yesterday.
      Thanks.</note>
    </phase>
   </phase-group>
  </header>
  <body>
   <trans-unit id="1" maxbytes="14">
    <source xml:lang="EN-US">Quetzal</source>
    <target xml:lang="JA-JP">Quetzal</target>
   </trans-unit>
   <trans-unit id="3" maxbytes="114">
    <source xml:lang="EN-US">An application to manipulate and
     process XLIFF documents</source>
    <target xml:lang="JA-JP">XLIFF 文書を編集、または処理
     するアプリケーションです。</target>
   </trans-unit>
   <trans-unit id="4" maxbytes="36">
    <source xml:lang="EN-US">XLIFF Data Manager</source>
    <target xml:lang="JA-JP">XLIFF データ・マネージャ</target>
   </trans-unit>
  </body>
 </file>
</xliff>

Which features of XLIFF are currently supported?

  1. Supported elements: <group>, <context-group>, <context>, <trans-unit>, <source>, <target>
  2. Context data is collected and stored
  3. Plural data

What is not supported?

  1. Attributes of the above mentioned elements are not taken into account. One of the reasons is that attributes are not used consistently.
  2. The <alt-trans> element, used for holding TM data in XLIFF files, is not supported. [It can be used to import TM data for XLIFF resources]
  3. Inline elements inside <source> and <target> elements are not processed. They are shown as they are in Lotte.
  4. Binary elements (not translatable): <bin-unit>, <bin-source>, <bin-target> are not supported.
  5. <glossary> elements are not supported.
  6. Other minor remaining elements (the list of elements in XLIFF is quite large) are not supported.

Things that can be improved in the near future:

  1. Decide a set of attributes to support for the above elements
  2. Add support for <alt-trans>, <glossary>
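
As an illustration of the supported subset, the sketch below reads <source>/<target> pairs from <trans-unit> elements and keeps any inline markup as raw text instead of processing it. It is an assumption-laden example, not Transifex's handler: read_trans_units and inner_xml are hypothetical helpers, the sample file is invented, and documents that declare an XML namespace on <xliff> would need namespace-qualified tag names.

```python
import xml.etree.ElementTree as ET


def inner_xml(element):
    """Return an element's content with inline child markup left as-is."""
    if element is None:
        return ""
    return (element.text or "") + "".join(
        ET.tostring(child, encoding="unicode") for child in element
    )


def read_trans_units(xliff_xml):
    """Collect (source, target) pairs; attributes are not interpreted."""
    root = ET.fromstring(xliff_xml)
    return [
        (inner_xml(unit.find("source")), inner_xml(unit.find("target")))
        for unit in root.iter("trans-unit")
    ]


SAMPLE = """<xliff version="1.1">
 <file original="example.txt" source-language="EN-US"
  target-language="JA-JP" datatype="plaintext">
  <body>
   <trans-unit id="1">
    <source xml:lang="EN-US">XLIFF Data Manager</source>
    <target xml:lang="JA-JP">XLIFF データ・マネージャ</target>
   </trans-unit>
   <trans-unit id="2">
    <source xml:lang="EN-US">Press <g id="1">Enter</g> to continue</source>
   </trans-unit>
  </body>
 </file>
</xliff>"""

for source, target in read_trans_units(SAMPLE):
    print(repr(source), repr(target))
```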

YAML

YAML is used as an i18n format by some projects, such as those built on top of Ruby on Rails.

YAML supports various datatypes (from simple to user defined datatypes). Of those, it makes sense to translate only string-like values.

So, Transifex ignores int, float, symbols, variables and other non-string, non-container datatypes (containers being lists and mappings) when importing data from uploaded YAML files.

Regarding container datatypes, Transifex imports lists as a single unit, whereas for mappings it parses them and imports any string-like or list values. Block literals in YAML are also supported.
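
The selection rules above can be sketched on data that has already been loaded from YAML into plain Python objects (the loading step itself is omitted here). The function name collect_entries and the dotted key paths are assumptions made for this example, not the naming Transifex uses internally.

```python
def collect_entries(node, path=()):
    """Walk parsed YAML data and return the translatable entries.

    Mappings are descended into, lists are kept as a single unit,
    strings are kept, and everything else (int, float, ...) is ignored.
    """
    entries = {}
    if isinstance(node, dict):
        for key, value in node.items():
            entries.update(collect_entries(value, path + (str(key),)))
    elif isinstance(node, (str, list)) and path:
        entries[".".join(path)] = node
    return entries


# What a YAML loader might return for a small source file.
PARSED = {
    "en": {
        "title": "Welcome",
        "retries": 3,
        "ratio": 0.5,
        "days": ["Monday", "Tuesday"],
        "date": {"formats": {"short": "%b %d"}},
    }
}

for key, value in collect_entries(PARSED).items():
    print(key, "=>", value)
```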

When a YAML file is downloaded from Transifex for use, it only contains entries that have been translated in the specific language. Additionally, if the requested language is the source language of the resource, the original comments of the source file will also be included in the downloaded file. If the user asked for reviewed translation only, then only entries with reviewed translations will be included instead.

In contrast, files that are downloaded to be translated offline include all entries, regardless of whether they have been translated or not. Comments from the original source file are also included.

Associated file extensions:
.yml, .yaml
i18n type:
YML
Notes:

Supported on Transifex.com. Due to library limitations, only YAML Version 1.1 is supported for now.

It is also recommended to have uniform indentation used throughout the YAML file.

Transifex’s YAML handler usually expects the source YAML file to have a single root element, which is the language code for the YAML file. The language code can have one of the forms xx, xx-XX and xx_XX (for instance, en, en_US, en-US).

en:
  date:
    formats:
      default: "%Y-%m-%d"
      short: "%b %d"
      long: "%B %d, %Y"
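
A check for the three root-key forms mentioned above (xx, xx-XX and xx_XX) could look like the following. The regular expression is an assumption derived only from those three forms; the set of language codes Transifex actually recognizes is not described here and may well be broader. The name root_language is hypothetical, and the input is assumed to be the already-parsed top-level mapping.

```python
import re

# Assumed pattern covering only the forms xx, xx-XX and xx_XX.
LANGUAGE_CODE = re.compile(r"[a-z]{2}(?:[-_][A-Z]{2})?")


def root_language(parsed):
    """Return the root key if it is the single root and looks like a
    language code, otherwise None."""
    if len(parsed) != 1:
        return None
    (root_key,) = parsed
    return root_key if LANGUAGE_CODE.fullmatch(str(root_key)) else None


print(root_language({"en": {"date": {"formats": {"short": "%b %d"}}}}))
print(root_language({"foo": {}, "foo1": {}}))
```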

Although not recommended, the source YAML file may also have multiple root elements or a single root element that is not a language code recognized by Transifex.

foo:
    key1: "This is a text"
    key2: "This is another text"
foo1:
    key: "another text"