Chapter 1. A Study of PO

Many people prefer to use computer programs in their native language. More often than not, the working language of a program's developers differs from the native language of its users, so programs need to be translated. For this to be possible, the program must first be written in such a way that it can fetch and display translations in the language set by the user. Then there must be a method for collecting discrete pieces of text (such as button labels, menu items, and messages in dialogs) from the program. The collected pieces of text are written into one or more files of a certain format, which translators can work on. Finally, the translated files may need to be converted into a form that the program can interpret in order to show the translation to the user. There are many different translation systems, each supporting one or more of these elements of the translation process.

In the realm of free software, one particular translation system has become ubiquitous: GNU Gettext. It covers all the elements of the translation process. It provides a way for programmers to write translatable programs, a way for text to be extracted and collected for translation, a file format on which translators work, and a set of tools for processing translation files. Beyond the technical aspects, Gettext has evolved a set of conventions, workflows and communication patterns: a translation culture of sorts.

The most salient element of Gettext from the translator's perspective is the translation file format it defines, the PO[1] format. Along with other parts of Gettext, the PO format has developed over the years into the most technically capable translation file format available today. Its features enable both high quality and high efficiency of translation, yet can all "fit into one person's head". A chapter of this manual provides a tour of the PO format from the translator's perspective.

Aside from the tools provided by GNU Gettext itself, many other tools for processing PO files have been written. These include translation editors (or "PO editors"), which give translators more power in editing PO files, and batch-processing tools, which serve purposes more specific than those covered by Gettext (e.g. conversion to and from other file formats). Pology is one of these specific batch-processing tools.

Pology consists of a Python library, with much translation-related functionality beyond basic manipulation of PO file objects, and a collection of scripts based on this library. Both the library and the scripts share one basic trait: they tend to go into depth. Pology is designed to apply precision tasks to standalone PO files, to process collections of PO files in sophisticated ways, and, while doing so, to cooperate well with other tools commonly used to handle PO files (such as PO editors and version control systems). On the programming side, Pology strives for simplicity and robustness, so that users who know (some) Python can easily add functionality for their custom needs. To achieve this, Pology relies fully on the conventions of the PO format, without trying to abstract the translation file format.

As one measure of attention to detail, Pology has sections of language-specific and project-specific functionality, and even combinations of the two. Users are encouraged to contribute their custom solutions to the main distribution, if these solutions could serve the needs of others.

In short, Pology is a study of PO.

1.1. Obtaining and Installing

The easiest way to obtain Pology is to install the Pology packages for your operating system distribution, if they exist. Otherwise you must obtain Pology as source code, but you will still be able to prepare it for use quite simply.

You can either download a release tarball from [[insert link here]], or fetch the latest development version from the version control repository. To do the latter, execute[2]:

$ cd PARENTDIR
$ svn checkout svn://anonsvn.kde.org/home/kde/trunk/l10n-support/pology

This will create the pology subdirectory inside PARENTDIR, and download the full Pology source into it. When you want to update to the latest version later on, you do not have to download everything again; instead you can execute svn update in Pology's root directory:

$ cd POLOGYDIR
$ svn update

This will fetch only the modifications since the checkout (or the previous update) and apply them to the existing source tree.
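
The checkout and update steps above can be folded into one helper. The following is only a minimal sketch and not part of Pology: the function name fetch_pology is hypothetical, and the sketch assumes that an existing Subversion working copy can be recognized by the .svn subdirectory in its root.

```shell
# Hypothetical helper: check out Pology into PARENTDIR/pology on first use,
# and update the existing working copy on later calls.
# Assumption: a working copy is recognized by its .svn subdirectory.
fetch_pology() {
    parentdir=$1
    if [ -d "$parentdir/pology/.svn" ]; then
        # Working copy already present: fetch only the new modifications.
        ( cd "$parentdir/pology" && svn update )
    else
        # No working copy yet: download the full source.
        ( cd "$parentdir" &&
          svn checkout svn://anonsvn.kde.org/home/kde/trunk/l10n-support/pology )
    fi
}
```

Calling fetch_pology PARENTDIR then works the same way whether or not the source has been fetched before.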

To prepare Pology for use, you can either properly install it or use it directly from the source directory. To install it, you first run CMake in a separate build directory to configure the build, and then make and make install to build and install:

$ cd POLOGYDIR
$ mkdir build && cd build
$ cmake ..
$ make && make install

CMake will warn you of missing requirements, and give some hints on how to customize the build (e.g. the installation prefix). If cmake is run like this, with no arguments other than the source directory, Pology will be installed into a standard system location, and should be ready to use. If you install it into a custom location (e.g. inside your home directory), then you may need to set some environment variables (see below).
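
For installation into a custom location, the prefix can be passed to CMake on the command line. The sketch below assumes the standard CMake variable CMAKE_INSTALL_PREFIX; the function name install_pology is hypothetical and not something Pology provides.

```shell
# Hypothetical helper: configure, build and install Pology under PREFIX.
# Usage: install_pology SRCDIR PREFIX
# Assumption: the standard CMake variable CMAKE_INSTALL_PREFIX selects
# the installation location.
install_pology() {
    srcdir=$1
    prefix=$2
    # Use a separate build directory, as in the commands above.
    mkdir -p "$srcdir/build" &&
    ( cd "$srcdir/build" &&
      cmake -DCMAKE_INSTALL_PREFIX="$prefix" .. &&
      make && make install )
}
```

For example, install_pology POLOGYDIR "$HOME/opt/pology" would install under a directory inside your home directory; that particular path is only an illustration.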

If you want to run Pology from its source directory, it is sufficient to set two environment variables:

$ export PATH=POLOGYDIR/bin:$PATH
$ export PYTHONPATH=POLOGYDIR:$PYTHONPATH

You can put these commands in the shell startup script (~/.bashrc for the Bash shell), so that the paths are already set whenever you start a shell. Setting PATH makes Pology's scripts ready for execution, and setting PYTHONPATH makes its Python library available to custom Python scripts. You should also build some documentation:

$ POLOGYDIR/user/local.sh build  # user manual
$ POLOGYDIR/api/local.sh build  # API documentation
$ POLOGYDIR/lang/LANG/doc/local.sh build  # language-specific, if any

This will make HTML pages appear in POLOGYDIR/doc-html/. To have Pology scripts output translated messages, if a translation into your language exists, you can execute:

$ POLOGYDIR/po/pology/local.sh build [LANG]

This will put compiled PO files into POLOGYDIR/mo/, from where they will be automatically picked up by scripts running from the source directory.
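
When running from the source directory, it can be useful to verify that PATH and PYTHONPATH really contain the expected entries. The following is a minimal sketch of such a check; the function names path_contains and check_pology_env are hypothetical and not part of Pology.

```shell
# True if directory $1 is one of the colon-separated entries of list $2.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical check: report which of the two variables lacks the entry
# needed for running Pology from the source directory given as $1.
check_pology_env() {
    pologydir=$1
    rc=0
    if ! path_contains "$pologydir/bin" "$PATH"; then
        echo "PATH does not contain $pologydir/bin"
        rc=1
    fi
    if ! path_contains "$pologydir" "${PYTHONPATH:-}"; then
        echo "PYTHONPATH does not contain $pologydir"
        rc=1
    fi
    return $rc
}
```

The check compares entries literally, so POLOGYDIR must be given exactly as it was written when setting the variables (for example, without a trailing slash).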

Pology provides shell completion for some of the included scripts, which you can activate by sourcing the corresponding completion definition file. If you have installed Pology:

$ . INSTALLDIR/share/pology/completion/bash/pology

and if you are running Pology from the source directory:

$ . POLOGYDIR/completion/bash/pology

1.2. Dependencies

The following lists the dependencies of Pology, and notes whether they are required or optional, and what they are used for.

Required external Python packages:

  • None.

Required general software:

  • CMake >= 2.8.3. The build system used for Pology.

  • Gettext >= 0.17. Some Pology scripts use Gettext tools internally, and the library module pology.gtxtools wraps some of the Gettext tools for use inside Python scripts. Also needed to build translations of the Pology user interface and documentation.

  • Python >= 2.5.

Optional external Python packages:

  • python-dbus >= 0.81. Used for communication with various external applications (e.g. with the Lokalize PO editor).

  • python-enchant >= 1.5.2. Frontend to various spell-checkers, used by most of Pology's spell checking functionality.

  • python-pygments >= 1.6. Syntax highlighting for PO and other code snippets in Pology documentation.

Optional general software:

  • Apertium >= 0.2. A free/open-source machine translation platform, used by the pomtrans script.

  • Docbook XSL >= 1.75.2. XSL transformations for converting Docbook into various end-user formats, used for building Pology documentation.

  • Epydoc >= 3.0. Python docstring to HTML doc generator. Needed to build the API documentation of the Pology Python library.

  • LanguageTool >= 1.0. Open source language checker, used by the check-grammar sieve.

  • Libxml2. XML processing library. Some of the command-line tools that come with it (xmllint, xsltproc) are needed to build Pology documentation.

  • Version control systems. Used by various Pology scripts that process PO files on the collection level, when the PO files are under version control. Currently supported: Git >= 1.6, Subversion >= 1.4.
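
As a rough aid for checking the general software listed above, the following sketch reports which command-line tools are present and compares dotted version numbers. It is not part of Pology: the function names are hypothetical, and the command names in the usage comment are assumptions about how the tools are typically installed (msgfmt stands in for Gettext).

```shell
# Print one line per tool saying whether it is found on PATH.
# Returns non-zero if any tool is missing.
report_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found"
        else
            echo "$tool: MISSING"
            missing=1
        fi
    done
    return $missing
}

# True if dotted version $1 is >= dotted version $2.
# Handles purely numeric fields only (no suffixes such as "-rc1").
version_ge() {
    have=$1
    want=$2
    while [ -n "$have" ] || [ -n "$want" ]; do
        # Compare the leading fields; a missing field counts as 0.
        h=${have%%.*}
        w=${want%%.*}
        [ "${h:-0}" -gt "${w:-0}" ] && return 0
        [ "${h:-0}" -lt "${w:-0}" ] && return 1
        # Drop the leading field from each version and continue.
        case $have in *.*) have=${have#*.} ;; *) have= ;; esac
        case $want in *.*) want=${want#*.} ;; *) want= ;; esac
    done
    return 0
}

# Example use (command names are assumptions):
#   report_tools cmake msgfmt python svn git xmllint xsltproc
#   version_ge 2.8.12 2.8.3 && echo "CMake is recent enough"
```

The version string itself has to be read from each tool's own version output, whose format differs from tool to tool, so that step is left out of the sketch.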



[1] "PO" is an acronym for "portable object". This phrase is a rather generic term from the depths of computer science, opaque to practicing translators. Texts on software translation therefore always write simply "PO format", "PO files", etc.

[2] svn is the primary command of the Subversion version control system. Subversion is almost certainly available for installation from your operating system's package repositories.