Chapter 10. Combined Arms Tactics

While each particular PO processing tool from Pology and other packages may be documented in itself, it may not always be obvious how to use these tools together. This chapter presents some scenarios where combined tool usage may increase the quality and efficiency of daily work on translation.

10.1. Creating and Using PO Compendia

A PO compendium is simply a PO file which aggregates messages from many other normal PO files, usually all same-language PO files in a given translation project. It may aggregate only the messages currently present in project PO files, but also messages that were present once and are no longer. As such, the compendium can be regarded as an instance of a translation memory. This section explains how to create, update, and apply such a translation memory.

10.1.1. Why Translation Memory?

Imagine that the translator wants to start translating a PO file that was so far never translated, but which has content similar to some other, translated PO files. Perhaps it was even derived from those other PO files, by merging, splitting, etc. This means that many messages in the present PO file may have been translated already in some other PO file, or at least that very similar translated messages exist in other PO files. Since the translation memory (TM) contains all known translated messages, it can be used to automatically produce translated and fuzzy messages in the present PO file, significantly reducing translation effort. Matching against the TM can be performed either as the translator goes from message to message in the editor (if the editor has a TM feature), or at once for all messages (by a specialized command) before starting to go through messages in the editor.

In most non-PO-based translation workflows, translation memories are crucial for efficiency. This is because most non-PO formats have no concept of merging with templates. Each new revision of the source material results in (an equivalent of) entirely empty translation files, and it is the translator's duty to somehow bring old translations into the new context. A carefully maintained TM, with a corresponding matching tool, is the foremost way to do this.

In a PO-based translation workflow, merging with templates already provides most of what TM is essential for. In effect, the old PO file that is being merged can be considered as a TM for the new PO file that will become based on the new template. Even when PO files are renamed, merged, or split, if that is properly done, no translations will be lost. A TM for PO files is therefore useful mostly to smooth out glitches in translation maintenance procedures (e.g. a PO file improperly split).[45] Nevertheless, having a well maintained TM in the form of PO compendium cannot hurt, while providing for the (hopefully) rare situations where TM matching is actually needed.

Many dedicated PO editors will automatically maintain an internal TM, usually in a database format, into which they will scoop messages from all PO files that were opened in them. However, in a team environment, these internal TMs are inferior to a PO compendium. For one, different translators will have different TMs; a translator may start to work on a file for which there are TM matches in another translator's internal TM. Internal TMs may be volatile, for example corrupted due to an editor bug, or perish during system maintenance. There is no control over which messages are scooped by the editor, and how they are treated (e.g. which message parts are being ignored).

On the other hand, a PO compendium can be maintained in a central place, and, being a PO file in itself, kept in version control just like all other PO files. In this way, all translators have fast access to a unified TM, which is secured from accidental corruption. Tight control over which messages are collected and how they are collected may be asserted in the script which is written to update the compendium. This script can be made to run periodically, and to automatically commit the updated compendium to the version control repository.

10.1.2. Maintaining a Centralized PO Compendium

As a first attempt, the PO compendium can be created simply by concatenating all PO files in the project into one called compendium.po, using msgcat. If PO files are organized by language (all PO files of a given language kept in the directory of that language), then the concatenation command would be:

$ cd $LANGDIR
$ find -iname \*.po | xargs msgcat -o compendium.po

Unfortunately, a compendium created in this way has a number of drawbacks:

  • Aside from translated messages, the compendium will also contain untranslated and fuzzy messages. While untranslated messages are obviously dead weight, a case could be made for taking in fuzzy messages. But in light of the suggested usage of the compendium in the following section, fuzzy messages too should be ignored.

  • Messages in the compendium will contain all parts as normal messages do. Some of these parts (such as source references) are unnecessary, since they will be ignored when applying the compendium later. Other than increasing the size of the compendium, another problem with these parts is that changes in them will cause unnecessary version control differences, so they should be stripped from the compendium.

  • Messages will be ordered as they are seen in concatenated PO files. The ordering of messages in the compendium is also of no importance for application. But, any changes in message ordering between two compendium updates will cause unnecessary version control differences, so it is best to sort messages by their keys (msgid and msgctxt fields).

  • When two or more PO files contain the same message by key (msgid and msgctxt) but with different translations (due to context), such as:

    msgid "Open File"
    msgstr "Otvori datoteku"

    msgid "Open File"
    msgstr "Otvaranje datoteke"
    

    msgcat will aggregate translations (and translator comments if any) in the compendium message, and make it fuzzy:

    #, fuzzy
    msgid "Open File"
    msgstr ""
    "#-#-#-#-#  alpha.po (alpha-1.2.9)  #-#-#-#-#\n"
    "Otvori datoteku\n"
    "#-#-#-#-#  bravo.po (bravo-0.8.12)  #-#-#-#-#\n"
    "Otvaranje datoteke"
    

    Since the context should be double-checked anyway when applying the compendium later (especially for short messages), it is better to instead pick one of the translations and have a normal translated compendium message. If each translation appears only once, then it does not matter which is picked; but if one translation appears 10 times and the other once, clearly the former should be picked. That is, the most frequent translation should be picked.

  • The PO header is treated in the same way as messages by msgcat: since all headers have equal msgid field (empty), their msgstr fields will be aggregated. This too is just dead weight since the header is not used in applications of the compendium. Instead, a brief and informative header should be explicitly set (mentioning that this is a compendium PO file, for which project and language, etc).

  • In some translation projects, PO files frequently contain meta-messages, such as those where translators can add their names and contact addresses. These messages have the same key (msgid) in all PO files, but in general should be translated differently in each, all the more so the larger the translation team. So it may be better to omit such messages from the compendium.

It must be noted that none of these problems are an actual deficiency of msgcat itself. Since its function is general concatenation of PO files, it cannot make any of the assumptions necessary for the present application. Instead, msgcat should be used as a part of a wider script, in which the necessary additional processing happens, tailored to the particular translation project and translation team.
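To judge how much a given project is affected by conflicting translations, the aggregated messages in a msgcat-made compendium can be counted. The following is a minimal sketch, assuming the aggregation marker format shown above (lines containing #-#-#-#-#); the helper name count_aggregated is hypothetical. Note that the PO header is counted too if its msgstr was aggregated.

```shell
# Hypothetical helper: count entries of a msgcat-made compendium whose
# msgstr holds several translations joined by "#-#-#-#-#" marker lines.
count_aggregated() {
    # RS="" puts awk into paragraph mode: each blank-line separated
    # PO entry becomes one record, so an entry is counted once even
    # though it contains several marker lines.
    awk 'BEGIN { RS = "" } /#-#-#-#-#/ { n++ } END { print n + 0 }' "$1"
}
```

It could be called as count_aggregated compendium.po right after the simple concatenation command shown earlier.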

Let us assume the following layout of the top directory for the translation project foo and translation team (language) nn:

foo-nn/
    ui/
        alpha.po
        bravo.po
        ...
    doc/
        alpha.po
        bravo.po
        ...
    update-compendium-foo-nn.sh
    compendium-foo-nn.po

update-compendium-foo-nn.sh will be the script to create or update the compendium, compendium-foo-nn.po the compendium itself. It helps clarity to add the project name and language into names of these two files, because both are tailored to that project and that language. Taking into account the aforementioned drawbacks of a simple compendium made by msgcat and the suggested resolutions, update-compendium-foo-nn.sh could look like this[46]:

#!/bin/sh
#
# Create the PO compendium of Foo in Nevernissian language.
#
# Usage:
#   update-compendium-foo-nn.sh [trim]
#
# The script can be called from anywhere, because PO paths are
# hardcoded within the script relative to its own location.
# If the 'trim' argument is not given (i.e. script is called
# without arguments), messages in the old compendium that are
# no longer found in project PO files are preserved in
# the new compendium; if 'trim' is given, they are removed.

# Directory where this script resides.
cmddir=`dirname $0`
# Paths of directories containing PO files, space-separated.
# (Make sure the compendium itself is not in here!)
podirs="$cmddir/ui $cmddir/doc"
# Path to the compendium.
comppo="$cmddir/compendium-foo-nn.po"

trim=$1

# If there is already a compendium, preserve it for later.
test -f $comppo && mv $comppo $comppo.old

# Collect PO files from given paths into a file.
find $podirs -iname \*.po | sort > polist

# Pre-process PO files in the project, creating temporary
# PO files named *.po.tmpcomp:
# - remove fuzzy and untranslated messages
# - declare obsolete messages non-obsolete
# - remove extracted comments, source references, flags
for pofile in `cat polist`; do
    msgattrib $pofile \
        --translated --no-fuzzy --clear-obsolete --force-po \
    | grep -v '^#[:.,]' > $pofile.tmpcomp
done
# Update file list to contain temporary PO files.
sed -i "s/$/.tmpcomp/" polist

# Reduce headers of temporary PO files to necessary minimum,
# proper header for the compendium will be added later.
posieve -q set-header -f polist \
    -srmallcomm \
    -sremoverx:'^(?!MIME-Version$|Content-Type$|Content-Transfer-Encoding$)'

# Create raw compendium from temporary PO files:
# - aggregate translations for repeated messages
# - sort messages by key
msgcat --sort-output --force-po -f polist -o $comppo

# Clean up temporary PO files and file list.
cat polist | xargs rm
rm polist

# Resolve aggregated messages to most frequent variant.
# It is safe to unfuzzy resolved messages, since at
# this point it is assured that only translated messages
# have been aggregated.
posieve -q resolve-aggregates $comppo -sunfuzzy

# Remove meta-messages which are found in many PO files but
# should in general be differently translated in each.
msggrep -v $comppo -o $comppo \
    -JFe 'NAME OF TRANSLATORS' \
    -JFe 'EMAIL OF TRANSLATORS' \
    -JFe 'ROLES_OF_TRANSLATORS' \
    -JFe 'CREDIT_FOR_TRANSLATORS' \

# Set the compendium header.
# Use current date as revision date.
dtnow=`date '+%Y-%m-%d %H:%M%z'`
posieve -q set-header $comppo -screate \
    -stitle:'Compendium of Foo translation into Nevernissian.' \
    -sfield:'Project-Id-Version:compendium-foo-nn' \
    -sfield:"PO-Revision-Date:$dtnow" \
    -sfield:'Last-Translator:Simulacrum' \
    -sfield:'Language-Team:Nevernissian <l10n-nn@neverwhere.org>' \
    -sfield:'Language:nn' \
    -sfield:'Plural-Forms:nplurals=9; plural=n==1 ? ...' \

# If the old compendium was preserved, add it to the new compendium
# in order to retain messages no longer found in the project
# (unless trimming was requested).
if test -f $comppo.old && test x"$trim" != xtrim; then
    msgcat --use-first --sort-output $comppo $comppo.old -o $comppo
    # ...old compendium must be the second argument, in order
    # not to override possibly updated translations of
    # existing messages in the project.
fi

# Test if new compendium is different from the old, with
# the exception of the revision date. If they are the same,
# discard the new compendium.
if test -f $comppo.old; then
    for cpfile in $comppo $comppo.old; do
        grep -v '^"PO-Revision-Date:.*\\n"$' $cpfile >$cpfile.nrd
    done
    if cmp -s $comppo.nrd $comppo.old.nrd; then
        mv $comppo.old $comppo
    else
        rm $comppo.old
    fi
    rm $comppo.nrd $comppo.old.nrd
fi

# Canonically wrap the compendium.
msgcat $comppo -o $comppo

# All done.

This script should be periodically called to update the compendium, and the updated file committed, such that all translators will automatically get it when they update their local repository copies. If after some (long) time the compendium becomes too big due to accumulation of old messages, running the script once with the trim argument will cause all old messages to be dropped.

10.1.3. Applying the PO Compendium

Translators who use a dedicated PO editor with internal TM should configure the editor to read the compendium into the internal TM. This may be done, for example, by including the compendium PO file (or the directory in which it resides) into editor's translation project paths. If the compendium is kept under version control, the editor should automatically update its internal TM from the compendium whenever the repository is updated and the editor started again. In this way, editor's internal TM becomes transient in nature, there being no problem if it gets corrupted or deleted.

When working on a particular PO file with a properly configured PO editor, as the translator jumps from one incomplete (untranslated or fuzzy) message to another, whenever the message is similar to one or a few messages in the compendium (i.e. in the internal TM) the editor will somehow "offer" those similar messages. Ideally, for each similar message the editor should show not only the possible translation, but also the difference between the two original texts (that of the current message and the TM match). This will allow the translator to quickly see how the offered translation should be adapted to fit the current original.

Dedicated PO editors may also offer batch application of the TM. This means that when the PO file is opened, the translator executes a command which fills in all untranslated messages with matches from the TM, making some translated (on exact matches) and some fuzzy (on partial matches). However, simpleminded batch application of the TM should be considered dangerous. For one, an exact match on the original text does not guarantee that the same translation fits; short messages in particular frequently need different translations depending on context. But the translator will simply jump over each batch-translated message and fail to see this. The other problem comes up if the material in the compendium is not sufficiently reviewed, in which case every match from the TM, even on long messages, should be at least casually reviewed by the translator. Thus, if there is no way to configure batch application to be less indiscriminate, it is best to avoid it altogether, or else the quality of translation may suffer.

Translators who use a general text editor to work on PO files can still make use of the compendium. One option could be merging the PO file with its template in presence of the compendium, just before starting to work on it:

$ msgmerge alpha.po alpha.pot -C compendium.po --update --previous

The -C option to msgmerge specifies the compendium from which to draw exact and partial matches, when there is no match in the PO file itself. This option can be repeated to add several compendia. The --update option is to modify the PO file in place, rather than writing the merged PO file to standard output. The --previous option is to get previous fields (#| ... comments) on fuzzy messages. Unfortunately, this method is a command line version of the batch application of the TM in a dedicated PO editor, and suffers from the same problem of indiscriminate exact matches that the translator will later fail to check. Therefore it should not be used (at least not for general translation).

Fortunately, Pology provides the poselfmerge command, which is a wrapper around msgmerge that merges the PO file with itself (so the PO template is not used at all) and has several options to mitigate the indiscriminateness of batch application of the TM. To avoid silent exact matches on short messages, the -W/--min-words-exact option can be used to set the minimum length of a message in words at which an exact match will be accepted; shorter messages with exact matches are made fuzzy. If every exact match should be checked by the translator, no matter the length of the message, there is the -x/--fuzzy-exact option to make all exact matches fuzzy.[47] These options have counterpart fields in Pology user configuration, so that the translator does not have to remember to use them on every run. See Section 7.2, “Self-Merging PO Files with poselfmerge” for details.
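As an illustration, applying the compendium with a word threshold might be wrapped as in the sketch below. The function name apply_compendium and the threshold of 5 words in the usage comment are hypothetical, and the -C option of poselfmerge is assumed here to name the compendium in the same way as the -C option of msgmerge; check Section 7.2 before relying on it.

```shell
# Hypothetical wrapper: self-merge PO files against a compendium.
#   $1      path to the compendium (assumed to be passed with -C)
#   $2      minimum number of words for an exact match to be accepted
#   rest    PO files to self-merge
apply_compendium() {
    comp=$1
    minwords=$2
    shift 2
    poselfmerge -C "$comp" -W "$minwords" "$@"
}
# e.g.: apply_compendium compendium-foo-nn.po 5 ui/alpha.po
```

Replacing -W with -x in the wrapper would make every exact match fuzzy, for the case where all compendium matches need review.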

10.2. Efficiently Translating with a Text Editor

Dedicated PO editors provide not only direct editing enhancements (no dealing with PO format syntax, jumping through incomplete messages, automatic removal of fuzzy elements, etc), but also translation-oriented features like spell checking, translation memory collection and application, glossary suggestions, and, going beyond standalone PO files, translation project overview and statistics. Why would someone, in spite of this, prefer to work on PO files with a general text editor? There are various reasons. Some people do not like how elements of currently translated PO message are scattered all over the window (as is typical of many PO editors), out of eye focus, and some elements even not shown. Other people like to have modularity in the translation workflow, rather than relying on the PO editor for everything and accepting its limitations. Some people are simply well accustomed to their text editor and do not want a higher level editor "abstracting" the PO format for them.

When translating PO files with a general text editor, you will have to use some command line tools to achieve reasonable efficiency and quality.

10.2.1. Expected Features of the Text Editor

Starting from the text editor itself, it should have several general text-editing features. Capable editors all have these features, but they should nevertheless be mentioned, so that you can look for them.

The most important feature is probably syntax highlighting, where special parts of the text are displayed in different color, weight, or slant. In a PO file, message field keywords (msgid, msgstr) should stand out from the text itself, text in comments should look different from the text in fields, internal text elements (e.g. markup tags) should be highlighted, etc. In this way you can quickly focus on what you should be editing, and on the surrounding context of the text. Syntax highlighting was originally introduced for various programming language source files, but has since spread to other types of structured text files; established editors should have syntax highlighting for PO files as well.

Capable editors usually provide special methods of navigating through the file, beyond simply scrolling up and down line by line or page by page. One particularly useful method is line bookmarking. Suppose that, while in the middle of editing a given line, you have to search through the PO file for something (e.g. how a certain phrase was translated earlier): you can then bookmark the line, search as much as you like, and return to the same line by jumping to the bookmark. Otherwise you would have to remember which line (by number) it was to jump back to it, or search for the text that you remember from that line.[48]

It will usually be possible to start the editor with one or more file paths as command-line arguments, to open those files at once. This is useful when a selection of PO files in need of some editing is determined by an external command, which writes out their paths. These paths can then be fed directly to the editor, rather than having to open them manually one by one (and possibly missing some) through editor's file dialog.

10.2.2. Statistics on PO Files

Having good statistics on a single or a group of PO files is necessary for estimating the translation effort, for example how much time should be allotted for updating the existing translation for impending next release of the source material. Pology's workhorse for computing statistics is the stats sieve of posieve.

Assume the following arrangement of PO files for language nn and their templates:

l10n-nn/
    ui/
        alpha.po
        bravo.po
        ...
    doc/
        alpha.po
        bravo.po
        ...
l10n-templates/
    ui/
        alpha.pot
        bravo.pot
        ...
    doc/
        alpha.pot
        bravo.pot
        ...

If the current working directory is l10n-nn/, to compute statistics on a single PO file, posieve can be executed like this:

$ posieve stats ui/alpha.po

This will display a table with message counts, word counts and character counts, as well as ratios to total, per category of messages (translated, fuzzy, untranslated, obsolete). To have the same output for all PO files in the ui/ directory taken together, or in the whole project, respectively:

$ posieve stats ui/
$ posieve stats

Note that word count is a much better base for estimating the translation effort than message count.

When statistics are computed for several PO files (or a directory, or several directories full of PO files), frequently it is necessary to get statistics per file (or per directory). This is done by adding the byfile or bydir sieve parameter:

$ posieve stats -s byfile ui/

However, this will output one full table for each file, which may be a bit too much data to grasp. Instead, you can request bar display, where each file is represented by a single-line bar. The bar shows either the number of messages or the number of words per category, depending on whether msgbar or wbar was issued. To get word bars per file in the ui/ directory, you can execute:

$ posieve stats -s byfile -s wbar ui/

Fuzzy messages introduce some uncertainty in effort estimation. If the statistics show 50 fuzzy messages with 700 words, you cannot conclude from that whether changes in those messages are small (e.g. cleaned style, punctuation) and translation can be quickly updated, or substantial (entirely new messages with passing similarity to earlier messages) and require heavy editing. For this reason the stats sieve provides the ondiff parameter: for each fuzzy message the difference from the previous message is computed, and based on that a part of the word count is assigned to the translated category and the rest to untranslated (thus leaving nominal zero words in the fuzzy category). The result is that, for example, a PO file with a lot of messages fuzzy due to punctuation changes will show in statistics as almost completely translated by number of words.
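For example, word bars per file with fuzzy word counts distributed by difference could be requested through a small wrapper such as the one below. This is only a sketch: the function name stats_effort is hypothetical, and it simply combines the byfile, wbar and ondiff parameters described in this section.

```shell
# Hypothetical wrapper: per-file word bars, with the words of fuzzy
# messages split between translated and untranslated by difference.
stats_effort() {
    posieve stats -s byfile -s wbar -s ondiff "$@"
}
# e.g.: stats_effort ui/
```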

If the translation project is organized such that new empty PO files are not automatically derived from new PO templates, then when running statistics just over language PO files, templates which do not have a counterpart PO file will not be counted as fully empty PO files. To have such templates counted, the two-argument templates parameter can be issued; the first argument is a path segment of the language directory, and the second argument is what to replace it with to get the corresponding template directory path. In the translation project setup as above, this is how you would compute the statistics on the ui/ directory while taking templates into account:

$ posieve stats -s templates:l10n-nn:l10n-templates ui/

The path replacement is always done on absolute paths, so in this example it is not a problem that the relative paths (ui/alpha.po...) do not contain original and replacement segments.
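The effect of the replacement can be pictured with a plain text substitution on an absolute path. The sketch below is an illustration only, not how the sieve is implemented; the function name template_path and the /home/user prefix are hypothetical.

```shell
# Illustration only: derive the expected template path from the
# absolute path of a PO file, for the layout shown above.
template_path() {
    printf '%s\n' "$1" \
        | sed -e 's|/l10n-nn/|/l10n-templates/|' -e 's|\.po$|.pot|'
}

template_path /home/user/l10n-nn/ui/alpha.po
# prints: /home/user/l10n-templates/ui/alpha.pot
```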

The translation project may not be organized such that each language has its own top directory. Instead, language PO files may be grouped by application and PO domain, and named by language code:

project/
    alpha/
        po/
            aa.po
            bb.po
            ...
    bravo/
        po/
            aa.po
            bb.po
            ...
    ...

In this setup the stats sieve can still be run on directory paths as arguments, in order to get statistics on all PO files of a given language, by using the -I/--include-path option of posieve to single out the desired language. For example, to get statistics on all PO files of the nn language in a single table:

$ posieve stats project/ -I 'nn.po'

or by file in form of message bars:

$ posieve stats -s byfile -s msgbar project/ -I 'nn.po'

The value of the -I option is in fact a regular expression, and the option can be repeated, which allows fine-tuning the file selection when necessary.
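Because the value is a regular expression, a pattern like nn.po can select more than intended. The sketch below uses grep -E as a stand-in for the path matching, under the assumption that the pattern is searched for anywhere in the path; the file name ann.po is hypothetical.

```shell
# Sample paths; ann.po is a hypothetical file of another language.
list_paths() {
    printf '%s\n' \
        project/alpha/po/nn.po \
        project/alpha/po/ann.po \
        project/bravo/po/nn.po
}

# Unanchored pattern: "." matches any character and the match may
# start anywhere, so ann.po is selected too.
list_paths | grep -c -E 'nn.po'       # prints 3

# Anchored pattern with escaped dot: only files named exactly nn.po.
list_paths | grep -c -E '/nn\.po$'    # prints 2
```

Under the same assumption, passing -I '/nn\.po$' to posieve would be the stricter selection.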

As for other statistics tools, Gettext's msgfmt with the --statistics option could be considered as one (though it shows only translated, fuzzy, and untranslated message counts), and especially the pocount command from Translate Toolkit.

10.2.3. Updating PO Files After Merging

When a single PO file is to be translated from scratch, then it is easy to just open it in the text editor and start translating messages one by one. However, usually more frequent than this is translation maintenance, in which you need to go through a bunch of freshly merged PO files and update new untranslated and fuzzy messages. The problem then is twofold: how to efficiently check which files need updating, and how to efficiently go through messages that need to be updated within a file.

To see which PO files need to be updated, you can simply run the stats sieve with byfile and msgbar/wbar parameters (and possibly ondiff), as explained in the previous section. After that you would have to manually observe incomplete files and open them in the editor one by one, which is tedious and prone to oversight. Instead, you can also add the incompfile parameter to stats, which will write paths of all incomplete PO files into a file. If PO files are organized as in the previous example, and you want to update translations in ui/ subdirectory, you would run:

$ posieve stats -s byfile -s wbar -s incompfile:toupdate.out ui/

Now toupdate.out will contain the paths of incomplete files. If the editor can be started from the command line with a number of file path arguments, you can directly feed it toupdate.out, e.g. by adding `cat toupdate.out` to the editor command.
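A small helper can make this a single step. The following is a sketch under stated assumptions: the function name edit_listed is hypothetical, the editor is taken from the EDITOR environment variable (falling back to vi), and the listed paths are assumed to contain no whitespace or shell wildcard characters.

```shell
# Hypothetical helper: open all files listed (one path per line) in
# the given file with the editor named by $EDITOR, or vi if unset.
edit_listed() {
    if [ ! -s "$1" ]; then
        echo "no incomplete files listed in $1" >&2
        return 0
    fi
    # Unquoted on purpose: each listed path becomes one argument.
    ${EDITOR:-vi} $(cat "$1")
}
# e.g.: edit_listed toupdate.out
```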

If the translation project is organized such that each new template results in a new empty PO file, you may wish to update only those PO files which were worked on before, i.e. those not entirely empty. For this you can add the mincomp parameter, which sets the minimal completeness (the ratio of translated to total messages) at which to take a PO file into consideration, with a very small value:

$ posieve stats -s mincomp:1e-6 -s incompfile:toupdate.out ui/

1e-6 is short for 0.000001, which means to take into consideration only those PO files in which more than one in a million messages is translated. Since there is no PO file with a million messages, this effectively means to include every PO file which has at least one translated message in it.

Once the incomplete PO files are open in the editor, to be able to jump through incomplete messages, you need to somehow use the editor's search function. For fuzzy messages it is easy: you can just search for the , fuzzy string. Untranslated messages, on the other hand, are more problematic. You may think of searching for msgstr "", but this would also find long wrapped messages:

msgid ""
"Blah blah blah [...]"
"blah blah."
msgstr ""
"Bla bla bla [...]"
"bla bla."

To make untranslated messages stand out unambiguously, there is the tag-untranslated sieve. It simply adds the untranslated flag to all untranslated messages (but not to fuzzy unless explicitly requested), so that you can search for , untranslated in the editor. It is most convenient to run tag-untranslated on the toupdate.out file produced by stats, using the -f/--from-files option:

$ posieve tag-untranslated -f toupdate.out

Fuzzy messages may be such only due to small changes in the original text, for example a single word changed in a paragraph-length message. This is not so easy to see by manually comparing the original and the translation. However, since fuzzy messages should have the previous original text in comments (if merged with the --previous option of msgmerge), it is possible to automatically embed differences into those comments with the diff-previous sieve; see its documentation for an example. You should run this sieve on toupdate.out as well:

$ posieve diff-previous -f toupdate.out

Your editor may even highlight the difference segments added to the previous original text, making them stand out quite clearly.

Since normally you want both to mark untranslated messages and to add differences to fuzzy messages before going through PO files, you can run the two sieves at once:

$ posieve tag-untranslated,diff-previous -f toupdate.out

As you go through incomplete messages and update the translation, you should remove any fuzzy or untranslated flags, and previous fields in #| ... comments, so that in the end you can commit (upload, send) clean updated PO files. But sometimes it will happen that you realize that you do not have enough time to update everything, and you want to commit what you have completed by that moment. The problem is that there will still be some untranslated flags and embedded differences remaining throughout the files, and leftover embedded differences would e.g. interfere with subsequent merging. To automatically remove these remaining elements, you simply run the two sieves with the strip parameter:

$ posieve tag-untranslated,diff-previous -s strip -f toupdate.out
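If you want a record of which files were left unfinished, a check like the following could be run before stripping. It is a minimal sketch: the function name leftover_untranslated is hypothetical, and it only looks for the untranslated flag in flag comments (lines starting with #,), not for embedded differences.

```shell
# Hypothetical check: print those PO files from a list (one path per
# line) that still contain an "untranslated" flag.
leftover_untranslated() {
    while IFS= read -r pofile; do
        # Skip list entries that do not point to an existing file.
        [ -f "$pofile" ] || continue
        if grep -q '^#,.*untranslated' "$pofile"; then
            printf '%s\n' "$pofile"
        fi
    done < "$1"
}
# e.g.: leftover_untranslated toupdate.out > unfinished.out
```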

When you update a PO file, for the sake of clarity and copyright you should also update its header with your personal data (the author comment, the Last-Translator: field, etc.). You could do this manually, but it is much simpler to set your data once in the Pology user configuration and run the update-header sieve over all updated files[49]:

$ posieve update-header -f toupdate.out

10.3. Summit with Ascription

Summit and ascription workflows, described in Chapter 5, Summitting Translation Branches and Chapter 6, Ascribing Modifications and Reviews, fit excellently together. Ascription enables review-based release control on summit scatter (Section 5.3.7, “Filtering by Ascription on Scatter” shows how to do it), while summit removes the need for different ascription file trees per branch (and the associated effort at branch cycling). All the information that you need to set up a summit with ascription is explained in the chapters mentioned; the only thing left for this section is to show the order of actions and the resulting file structure, as implied by the technical requirements.

The first thing to set up is the summit. From the viewpoint of ascription, it is not important which summit mode is used; indeed, while the direct summit is still not advised, putting ascription on top would alleviate some of its disadvantages. In the following the summit over dynamic templates is assumed, because it is a bit less involved than the summit over static templates, but nevertheless demonstrates all important points.

After configuring and initializing the summit over dynamic templates, let the summit top directory only (that is, omitting branches) look like this:

l10n-nn/
    summit/
        foo-module/
            alpha.po
            bravo.po
            ...
        bar-module/
            kilo.po
            lima.po
            ...
        ...
    summit-config

PO files in the summit are shown split into several submodules for generality. Unlike in the chapter on summit, the summit directory is placed here within a parent language directory, and the summit configuration file summit-config in the parent directory instead of the summit directory. This is in order to have a clearer structure when the ascription is added.

The ascription is set up after the summit, such that it takes only the summit directory into account, having nothing to do with branches. After the ascription is configured and initialized, the summit with ascription tree should look like this:

l10n-nn/
    summit/
        foo-module/
            alpha.po
            bravo.po
            ...
        bar-module/
            kilo.po
            lima.po
            ...
        ...
    summit-ascript/
        foo-module/
            alpha.po
            bravo.po
            ...
        bar-module/
            kilo.po
            lima.po
            ...
        ...
    ascription-config
    summit-config

Here the ascription tree root is set to summit-ascript/ in the ascription configuration file ascription-config. With this, setting up the summit with ascription workflow is completed.
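Since the ascription tree mirrors the summit tree path for path, the correspondence can be checked with standard tools. The following is a minimal sketch, not part of Pology; the function name missing_ascriptions is made up for this illustration. It lists summit PO files that have no file at the same relative path under the ascription root.

```shell
# Print summit PO files (relative paths) with no counterpart in the
# ascription tree. Arguments: summit root, ascription root.
missing_ascriptions() {
    catroot=$1
    ascroot=$2
    # Collect PO paths relative to the summit root, in stable order.
    (cd "$catroot" && find . -type f -name '*.po') | sort |
    while IFS= read -r rel; do
        # The ascription file is expected at the same relative path.
        [ -f "$ascroot/$rel" ] || printf '%s\n' "${rel#./}"
    done
}
```

From within l10n-nn/, it would be called as missing_ascriptions summit summit-ascript. An empty output means that the two trees match file for file; a listed file may simply not have been ascribed yet.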

10.3.1. Several Summits with Unified Ascription

In some circumstances you may want to have several separate summits with unified ascription. This may be the case, for example, when the translation project is such that the user interface and documentation PO files are put into separate file trees in branches, and most paired UI-documentation PO files have the same names.[50]

The parent language directory in this scenario, with summits and ascription set up, could look like this:

l10n-nn/
    summit/
        ui/
            foo-module/
                alpha.po
                bravo.po
                ...
            bar-module/
                kilo.po
                lima.po
                ...
            ...
            summit-config
        doc/
            foo-module/
                alpha.po
                bravo.po
                ...
            bar-module/
                kilo.po
                lima.po
                ...
            ...
            summit-config
    summit-ascript/
        ui/
            foo-module/
                alpha.po
                bravo.po
                ...
            bar-module/
                kilo.po
                lima.po
                ...
            ...
        doc/
            foo-module/
                alpha.po
                bravo.po
                ...
            bar-module/
                kilo.po
                lima.po
                ...
            ...
    ascription-config

Note here the location of the summit-config files: each is within its own summit directory, summit/ui/ or summit/doc/. On the other hand, there is a single ascription-config file, which covers all summits. This means that summit operations (merging, scattering) must be performed from within their respective summit directories (since posummit looks through the parent directories for the first summit-config file), while ascription operations can be performed from anywhere.
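The upward search for a configuration file can be pictured as follows. This is a minimal sketch of the general technique, not posummit's actual code; the function name find_summit_config is made up for this illustration.

```shell
# Print the path of the nearest summit-config file, looking in the given
# directory (default: current directory) and then in each parent in turn.
find_summit_config() {
    dir=$(cd "${1:-.}" && pwd) || return 1
    while :; do
        if [ -f "$dir/summit-config" ]; then
            printf '%s\n' "$dir/summit-config"
            return 0
        fi
        # Stop at the file system root: nothing was found.
        [ "$dir" = "/" ] && return 1
        dir=$(dirname "$dir")
    done
}
```

Called from anywhere under summit/ui/, such a search stops at summit/ui/summit-config and never sees summit/doc/summit-config, which is why each summit has to be operated on from within its own directory.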

Having unified ascription is especially convenient in centralized summit maintenance, since translators and reviewers are concerned only with ascription (running poascribe to commit, select for review, etc.) regardless of how many summits there are.



[45] To be sure, some short messages can be quite similar in many unrelated PO files. But having TM matches only on such messages will result in very small time savings, if measurable at all.

[46] At one point, the script creates a temporary PO file for each original PO file, and then calls msgcat on these temporary files to create the first, raw compendium. These temporary files have fuzzy and untranslated messages removed, and some other adjustments made, before concatenation. One could think that all these adjustments could instead be done on the raw compendium. The problem is that then there would be no unambiguous way to tell which fuzzy messages in the raw compendium were fuzzy to begin with, and which were made fuzzy by msgcat due to aggregation of translations. With fuzzy messages removed prior to concatenation, in the de-aggregation by frequency that follows, it is known that messages with fuzzy flags are those that were aggregated.

[47] The translator can still see when the match was exact, because normal fuzzy messages will have previous fields and fuzzied exact matches will not.

[48] Another trick is to hit undo once, which will normally jump to the line in which the last modification was made, and then hit redo to recover the modification.

[49] If you use ascription, you should instead tell poascribe to update headers for you when committing. This is done by adding update-headers = yes to the [poascribe] section in the user configuration.

[50] On the other hand, you may still have a unified summit, by defining a path transformation in the summit configuration to disambiguate UI and documentation PO files sharing the same domain name.