Chapter 6. Ascribing Modifications and Reviews

It may not be obvious, especially to new translators, to what extent a translation needs to be reviewed. If the translator has exercised due diligence, how "wrong" can the translation be? Even if the translator has a good command of the source language -- typically English in the context of PO files -- the answer is "very wrong", all aspects considered.

Given the comparatively simple grammar of English, the meaning of a short English sentence (as typically encountered in program user interfaces) may vary greatly depending on the surrounding context. This context may not be obvious when the translator is going through isolated messages in the PO file, so the translator may commit the worst of errors from the reader's viewpoint: the senseless translation. An experienced reviewer will have developed a sense for troublesome contexts, and will have at their disposal several means to conclusively determine the context (including, for example, running the development version of the program).

Even if the context is correctly established, the translator may use "wrong" terminology, which is the next worst thing for the reader. A term used in translation does not need to be wrong by itself; in fact, it may be exactly the correct term -- in another translation project. The reviewer will have more experience with the terminology of the present project, and be able to bring the translation in line with it.

Style in the technical sense is a consistent choice between several perfectly valid constructs in the target language when applied to text in the given technical context; for example, how to translate menu titles and items, button labels, or tooltips in the user interface. Choices may include noun or verb forms, particular grammar categories, tone of address, and so on. There may be a style guide for the project which details such choices, and the reviewer will know it well.

Style in the linguistic sense is especially applicable to longer texts, such as tooltips in the user interface or paragraphs in documentation. A typical error of a new translator is to adhere closely to English style and grammar. This may produce a translation which is semantically and grammatically valid in the target language, but very out of style. The reviewer then steps in to naturalize such constructs.

Finally, while the reviewer may be an experienced translator, that does not mean that the reviewer's own translations need no review. Immersion in the source language, distraction, and fatigue will lead the reviewer into any of the above errors in translation, only with lesser frequency. This means that reviewers should also review each other's translations.

This calls for a systematic approach to review in the translation workflow.

6.1. Review Stages vs. Ascription

The classical review workflow, by stages, seems simple enough. A translator translates a new PO file or updates an existing translation, and declares it ready for review. A reviewer reviews it, and declares it ready to commit into the pool from which PO files are periodically released. A committer finally commits the file. The process is iterative: the reviewer may return the file to the translator, and the translator may later declare it ready for review again. There may be several stages of review (such as proof-reading, approving), each of which may return the translation to a previous stage, or forward it to some special stage. The process may also be implemented on the subfile level, where each PO message can go through stages separately.

Regardless of the technical details, review workflows of this kind all have the following in common. Members of the translation team are assigned roles (such as translator, reviewer, committer) by which they step into the workflow. A single person can have several roles. Later review stages must wait for the earlier stages to complete, and the translation cannot be updated again before the current version clears the review pipeline (or the pipeline is aborted). Once the translation is committed, it becomes a part of simply "admitted" translations, with no further qualifiers.

The system of prescribed roles requires that team members assign the roles between themselves, stick to them, and shuffle them along the way. The prescribed review pipeline requires a tool to keep track of the current review stage of each translation. This makes the review workflow rigid, with probable bottlenecks. Distribution of roles may become unbalanced as people join and leave the team, or the tracking tool may be prohibitive in some scenarios (e.g. a single translator making small adjustments in dozens of files across the project, but having to upload each manually through a web interface).

"Rigid" and "inefficient" are comparative qualifications, so what is it that review by stages can be compared to in this way?

Review by ascriptions is even simpler conceptually, and yet less rigid and more efficient than review by stages. It necessarily works on the PO message level, rather than the PO file level. Anyone can simply translate some PO messages and directly commit modified PO files, without any review, but ascribing the modifications to their own name. Anyone can review any PO message at any moment, commit modifications made during the review, and ascribe the review to their own name (and possibly to a certain class -- review of context, of terminology, of style, etc.). When the time comes to release the translation, insufficiently reviewed messages are automatically omitted, by evaluating the ascription history of each message.

Based on the ascription history, the reviewer can select a subset of PO messages, and review only the difference between their historical and current versions. For example, Alice can select to review only messages modified since she or Bob last reviewed them for style. She could then see the difference from that last review to the current version, e.g. if in a whole paragraph only a single word was changed by Charlie when he reviewed the terminology. Ascription history also propagates through merging of PO files with templates, so the reviewer can compare the change in the original to the change in the translation since the last review and judge if one fits the other.

Since everyone just commits, translations can be efficiently kept in a version control repository, with the ascription system added on top. After having done some translating, the translator simply substitutes the commit command of the version control system (VCS) with the "ascribe modifications" command of the ascription system (AS, which calls the underlying VCS internally). After reviewing, the reviewer uses the "ascribe reviews" command of the AS to commit reviews to the ascription history (as well as any modifications made during the review). To select messages for review, the reviewer issues the "diff for review" command of the AS, with suitable parameters to narrow the message set; selected messages are marked in-place in PO files and equipped with embedded differences, and possibly directly opened in a PO editor.

When the translations are to be released, the release person issues the "filter for release" command of the AS, which takes the working PO files and creates final PO files, in which the insufficiently reviewed messages are removed. Here "release time" can be understood figuratively: since filtering for release should be a fully automatic process, it can be performed at any interval of convenience.

What constitutes "sufficient review" can be defined in fine detail. It could be specified that messages modified by Alice need to have only review for terminology, but not necessarily for style; Charlie may belong to the group which needs to be reviewed on style, but not necessarily on context; Bob's reviews for style may be nice to have, but never block the release if missing. These decisions do not preclude released messages from being reviewed later on the missing points, after higher priority reviews have been completed. The definition of sufficiency may be changed at any point, e.g. as team members get more experienced and require less review, without interfering with direct translation and review activities.
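To make the idea of a sufficiency definition more concrete, here is a minimal Python sketch of how a release filter might decide whether a single message has been sufficiently reviewed. It is an illustration only, not how any actual ascription system implements filtering: the Ascription record, the REQUIRED_REVIEWS table, and the review class names context, terminology and style are all hypothetical, and the policy only loosely mirrors the Alice and Charlie example above.

```python
from dataclasses import dataclass

# Hypothetical ascription record: who did what, and for reviews, of which class.
@dataclass(frozen=True)
class Ascription:
    user: str
    kind: str      # "m" for a modification, "r" for a review
    tag: str = ""  # review class such as "style"; empty for modifications

# Hypothetical policy: review classes required after a modification by each user.
REQUIRED_REVIEWS = {
    "alice": {"terminology"},
    "charlie": {"terminology", "style"},
}
# Assumption: users not listed above need every class of review.
DEFAULT_REQUIRED = {"context", "terminology", "style"}

def is_sufficiently_reviewed(history):
    """Decide whether a message may be released.

    history is a list of Ascription records ordered oldest to newest.
    """
    last_mod = None
    for i, asc in enumerate(history):
        if asc.kind == "m":
            last_mod = i
    if last_mod is None:
        return False  # assumption: nothing ascribed yet, so nothing to release
    required = REQUIRED_REVIEWS.get(history[last_mod].user, DEFAULT_REQUIRED)
    # Only reviews recorded after the latest modification count.
    done = {a.tag for a in history[last_mod + 1:] if a.kind == "r"}
    return required <= done

history = [Ascription("alice", "m"), Ascription("bob", "r", "terminology")]
print(is_sufficiently_reviewed(history))
```

Because the policy lives in a plain table that is consulted only at filtering time, it can be changed without touching the recorded history, which is the property the paragraph above relies on.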

In summary, an AS preserves the operational efficiency of a VCS, while at the same time providing great flexibility of review. All team members can be given commit access; no web or email detours are needed. There are no prescribed roles, but a functional equivalent of role assignment happens at the last possible moment (release time), and it can take into account both translators' and reviewers' abilities, as well as changing estimates of those over time. There is no staging between completing and committing the translation, which enables a translator to continue polishing the translation undisturbed until a reviewer comes around. There are no bottlenecks when performing small changes in many files, since a single AS command commits all changes just as a single VCS command would. On commit operations, the AS can also apply various checks (e.g. decline to commit syntactically invalid PO files) and modifications (e.g. update translator's data in the PO header).

6.2. Setting up Ascription with poascribe

Pology provides an ascription system in the form of the poascribe command.

Let the organization of PO files for the language nn in the version control repository be such:

l10n-nn/
    po/
        ui/
            alpha.po
            bravo.po
            ...
        doc/
            alpha.po
            bravo.po
            ...
        ...

Having PO files grouped by language can be taken as a hard prerequisite[24]. Also necessary is a single top subdirectory for the whole PO file tree (here po/), rather than having several PO subdirectories directly in the language directory.

Setting up ascription is now simple. Create the ascription configuration file named exactly ascription-config (poascribe expects this name), on the same level as the top PO directory:

l10n-nn/
    ascription-config
    po/
        ui/
            ...
        doc/
            ...

and set in it a few global configuration fields, and data for each known translator:

# ---------------------------
# Global ascription settings.

[global]

# Roots of the original and ascription trees.
catalog-root = po
ascript-root = po-ascript

# The underlying version control system.
version-control = svn

# Data for updating PO headers on request.
language = nn
language-team = Nevernissian
team-email = l10n-nn@neverwhere.org

# Default commit message.
commit-message = Translation updates.

# -----------------------
# Registered translators.

[user-alice]
name = Alice Akmalryn
original-name = Алиса Акмалрин
email = alice.akmalryn@someplacenice.org

[user-bob]
name = Bob Bromkin
original-name = Бобан Бромкин
email = bob.bromkin@otherplacenice.org

# ...and so on.

The configuration fields used in this example, and other possible configuration fields, are listed and described below.

Global settings in the configuration file:

catalog-root

The path to the top PO subdirectory. This should be a path relative to the location of the configuration file.

ascript-root

Relative path to the top directory of the ascription file tree, which will be created and updated by poascribe.

version-control

The underlying version control system of the repository. The value is a keyword, see Section 9.7.2, “Version Control Systems” for a list of VCS supported by Pology.

language, language-team, team-email, plural-header

These fields provide information about the language and the translation team, which poascribe uses to update header fields in modified PO files. language is the language code, while language-team is usually just the human-readable language name in English. plural-header is the exact contents of the Plural-Forms: PO header field (if it contains a % character, you need to escape it as %%). For any of these fields that is not set, poascribe will remove the corresponding header field when updating the PO header.

title

The first comment line in the PO header, set when poascribe updates the header. It can contain the following placeholders for inserting file-dependent information: %basename is the base PO file name (e.g. alpha.po), %poname the PO domain name (e.g. alpha), %langname the human-readable language name (supplied by the language-team field), and %langcode the language code (supplied by the language field). Note that these placeholders actually must be written as %%name, to escape the special meaning of single % character. If title field is not set, poascribe will leave the title comment as it is in the PO file.

commit-message

The default commit message for the underlying VCS, when poascribe calls upon it to commit modified PO files. If this field is not set, an editor window will pop up to input the commit message, or the -m/--message option can be used to set the message through the command line. If the field is set, -m can still be used to override the default commit message.

review-tags

The set of accepted review tags, given as a whitespace-separated list of tags. If set, poascribe will abort when trying to use an unknown tag; otherwise it will accept any tag.

Each known translator is represented by a [user-name] configuration section. A translator's user name in the ascription system has no direct relation with the underlying VCS account name (if the VCS uses them), but it makes sense for them to be equal. This also means that a translator does not even have to have a VCS account (repository commit access), though this is expected for the sake of efficiency. Translator configuration sections can contain the following fields:

name

Translator's name, in the form supposed to be readable in English. This means that if the name is not originally written in Latin script, some romanized form should be given.

original-name

Translator's name in its original form, if it differs from the romanized form given by the name field.

email

The email address at which the translator may be contacted.

As soon as the ascription-config file is committed to the repository, the ascription system through poascribe is ready for use. The only expected regular modifications to the configuration file are those of adding new translators. On the other hand, translators should never be removed, because even after they go away, their ascription records remain in the system.
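As an illustration of the file format, the following Python sketch reads a configuration resembling the example above with the standard configparser module and collects the registered translators. This is not a description of how poascribe itself parses the file: it assumes that the file is compatible with configparser's default %-interpolation (which is consistent with the %% escaping rule described for the plural-header and title fields), and the helper name load_config as well as the sample title value are hypothetical.

```python
import configparser

# A shortened configuration in the same format; the title value is made up.
SAMPLE = """\
[global]
catalog-root = po
ascript-root = po-ascript
version-control = svn
language = nn
language-team = Nevernissian
title = Translation of %%poname into %%langname

[user-alice]
name = Alice Akmalryn
email = alice.akmalryn@someplacenice.org

[user-uhero]
name = Unknown Hero
"""

def load_config(text):
    """Return ({global field: value}, {user name: {field: value}})."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    global_fields = dict(parser["global"])
    users = {}
    for section in parser.sections():
        # Every [user-NAME] section describes one registered translator.
        if section.startswith("user-"):
            users[section[len("user-"):]] = dict(parser[section])
    return global_fields, users

global_fields, users = load_config(SAMPLE)
print(sorted(users))           # ['alice', 'uhero']
print(global_fields["title"])  # Translation of %poname into %langname
```

Under this assumption, the doubled %% in the file collapses to a single % when the value is read, which is why the placeholders must be written with two percent signs.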

6.2.1. Initial Ascription

The most common situation at the start of an ascription workflow is that there already exists a considerable amount of translations, contributed by many different people over time. These existing translations should be ascribed as initial modifications -- but ascribed to whom? If it is not precisely known who translated what, the solution is to introduce a generic user in the configuration file, appropriately named "Unknown Hero" (or "Lost Translator", you can be inventive):

[user-uhero]
name = Unknown Hero
original-name = Незнани јунак

You should then ascribe all existing translations as modified and reviewed by this dummy translator:

$ cd $LANGDIR
$ poascribe commit -u uhero --all-reviewed -C  po/

The commit argument is the ascription mode, and the -u option provides the user name to which ascriptions are made. This is an important point: ascriptions are made to a user defined in the ascription configuration, and have nothing to do with the VCS itself. It is the --all-reviewed option that declares all messages to be reviewed as well (this option is normally used only this once, and not in day-to-day operation). The -C option prevents automatic VCS adding and committing, which is useful for this initial step.

When this command line is executed, a progress bar will appear and the following output will start to unfold:

doc/alpha.po  (43/43)
doc/bravo.po  (81/81)
...
ui/alpha.po  (582/582)
ui/bravo.po  (931/931)
...
===== Ascription summary:
-           modified  reviewed
translated     11775     11775
fuzzy           2943      2943
obsolete/t       365       365
obsolete/f        26        26

The numbers in parentheses indicate how many messages have been ascribed in the given PO file (modified/reviewed), and the totals are given at the end.

If, on the contrary, it is known who translated and reviewed what up to that point, ascription can be performed piece-wise with user names of real translators:

$ cd $LANGDIR
$ poascribe commit -u alice --all-reviewed -C  po/ui/
$ poascribe commit -u bob --all-reviewed -C  po/doc/
$ ...

After the initial ascription has been made, the ascription file tree will appear next to the original file tree. There will be one ascription PO file for each original PO file, with the same name and relative location within the tree:

l10n-nn/
    po/
        ui/
            alpha.po
            bravo.po
            ...
        ...
    po-ascript/
        ui/
            alpha.po
            bravo.po
            ...
        ...

Ascription PO files are used by poascribe to store the ascription history, rather than e.g. a database of some sort. This has a disadvantage in performance, but an advantage in simplicity and robustness. For example, ascription files will be under version control as well.

poascribe may also modify original PO files during this run, by removing any previous field comments (#| ...) on translated messages. These comments are sometimes erroneously left in when the PO file is translated with an older or less capable PO editor, and leaving them would result in unnecessary additions to ascription PO files.

The newly created ascription tree, any modifications to the original tree, and the ascription configuration file, can now be committed as usual. With Subversion as the VCS:

$ cd $LANGDIR
$ svn add ascription-config po-ascript
$ svn commit ascription-config po po-ascript -m "Initial ascription."

6.3. Daily Use for Translators

Updating the complete language directory is generally a good idea in any case, but with ascription in place it becomes a requirement: translators must always update the complete language directory through the VCS, rather than just one particular PO file or subdirectory, so that the original and the ascription PO file trees are kept in sync.[25]

In order not to have to report their user name to poascribe all the time (by the -u option), translators can set it in Pology user configuration, the [poascribe] section:

[poascribe]
user = alice

With this in place, translators can submit updated PO files simply by substituting the VCS commit command with poascribe commit (or shortened: co or ci). With Subversion, this would look like:

$ cd $LANGDIR
$ poascribe co po/ui/*alpha*.po
po/ui/alpha.po  (44)
po/ui/libalpha.po  (15)
===== Ascription summary:
-           modified
translated        59
>>>>> VCS is committing catalogs:
Sending      po/ui/alpha.po
Sending      po/ui/libalpha.po
Sending      po-ascript/ui/alpha.po
Sending      po-ascript/ui/libalpha.po
Transmitting file data ....
Committed revision 1267069.
$ 

The lines after >>>>> VCS... are produced by the underlying VCS, which is Subversion in this example.[26]

As can be seen from the example output, poascribe will add ascription records into the ascription PO files corresponding to the original PO files, and commit them all. Like a VCS command, poascribe co can take any number of PO file or directory paths. For a directory path, only files with the .po extension in it will be processed, and any others ignored. poascribe can be run from any working directory with appropriate paths as arguments, and it will always find the associated ascription configuration and files. If a default commit message has not been set in the ascription configuration, poascribe will ask for it; or it can be given on the command line through the -m option.

6.3.1. Translators Without Commit Access

With the ascription system in place, every regular translator should have commit access to the repository. However, there may be some period of time before new translators are given commit access, or revision control may be too technical for some, and even those who have access may temporarily be unable to commit for some reason.

These translators may send in their work by email or any other informal channel, to any member of the team who does have commit access. This team member can then commit the received files without any review, as review can be conducted at any later time. If Bob sends some files to Alice, she can commit them immediately by stating Bob's user name:

$ poascribe co -u bob files...

For this to work, the translator who sent in the files has to be defined in the ascription configuration. There are no hidden costs or security issues to this (as opposed to giving VCS commit access), so every new translator should be defined there before any work of that person is committed.

6.4. Daily Use for Reviewers

An ascription system opens up all sorts of possibilities for review patterns. Reviewers should keep in mind that for each message the full modification and review history is available, so that the translation team can think about how to make good use of it. What follows are some examples to illustrate the review functionality provided by poascribe.

6.4.1. Basic Reviewing

At the very basic level (which is the only level in review by stages), messages can be classified as simply unreviewed or reviewed. Alice now wants to review all unreviewed messages in a subset of PO files, say the ui/ subdirectory. She issues the following command (di is short for diff):

$ poascribe di po/ui/
po/ui/alpha.po  (2)
po/ui/foxtrot.po  (7)
po/ui/november.po  (12)
===== Diffed for review: 21
$ 

With this, all unreviewed messages in the listed PO files have been marked and diffed. If these PO files had already been reviewed before, some of the messages modified since then (those now marked for review) may have changed very little: for example, a few changed words in a paragraph-length message, or even just some punctuation. Therefore, for each message marked for review, Alice also wants to see the difference from the last review to the current version. Here are two messages with typical review elements added by poascribe di:

#. ~ascto: charlie:m
#: gui/mainwindow.cc:372
#, ediff
msgid "GAME OVER. {-You won-}{+Tie+}!"
msgstr "KRAJ IGRE. {-Pobeda-}{+Nerešeno+}!"

#. ~ascto: bob:m charlie:m
#: game-state.cpp:117
#, ediff-total
msgid "Click the pause button again to resume the game."
msgstr "Kliknite ponovo na dugme pauze da nastavite igru."

In the first message, the first thing to note is the #. ~ascto: comment. This comment succinctly lists who did what with the message since the last review; here charlie:m means that Charlie is the one who modified it. Then there is the ediff flag, which Alice can search for in the editor to jump through messages marked for review. Finally, the original and translation have been diffed; here they show that, since the last review, the message was fuzzied by changing "You won" to "Tie", and what Charlie did in translation to unfuzzy it. Even on a message as short as this, the difference tells something useful to Alice: the phrase "Game over" likely has a formulaic translation, and the fact that it is not part of the difference means that the earlier reviewer had made sure it is consistent, so Alice does not have to check that.

The #. ~ascto: comment of the second message reveals that both Bob and Charlie have modified it. The ediff-total flag instead of plain ediff means that this message has had no reviews until now, so there are no embedded differences in text fields.
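The embedded difference notation seen in the first message can be unpacked mechanically. The following Python sketch splits such a text into its old and new versions. It is only an illustration under simplifying assumptions: difference segments are taken to be non-nested, literal {- or {+ sequences are assumed not to occur in the message text itself (a real implementation would need some form of escaping), and the function name split_ediff is hypothetical.

```python
import re

# Matches one removed segment {-...-} or one added segment {+...+}.
_SEGMENT = re.compile(r"\{-(.*?)-\}|\{\+(.*?)\+\}", re.DOTALL)

def split_ediff(text):
    """Return (old_text, new_text) reconstructed from an embedded difference."""
    old_parts, new_parts = [], []
    pos = 0
    for match in _SEGMENT.finditer(text):
        # Text between segments is common to both versions.
        common = text[pos:match.start()]
        old_parts.append(common)
        new_parts.append(common)
        removed, added = match.group(1), match.group(2)
        if removed is not None:
            old_parts.append(removed)  # present only in the old version
        else:
            new_parts.append(added)    # present only in the new version
        pos = match.end()
    tail = text[pos:]
    old_parts.append(tail)
    new_parts.append(tail)
    return "".join(old_parts), "".join(new_parts)

old, new = split_ediff("GAME OVER. {-You won-}{+Tie+}!")
print(old)  # GAME OVER. You won!
print(new)  # GAME OVER. Tie!
```

A text without any difference segments, such as the second message above, comes back unchanged as both the old and the new version.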

Alice can now go through the marked messages in the listed PO files, review translations, and possibly make modifications. When making changes in a message with embedded differences, she can freely edit the text outside of difference segments and within {+...+} segments (as these are the ones which belong to the current version of the text). While reviewing, Alice does not remove any of the added message elements (except for an occasional difference segment, if she modifies a translation), as these elements are needed for a subsequent invocation of poascribe. If a message is particularly hard to translate and Alice wants to defer reviewing it until some later time, she can add to it the unreviewed flag (or nrev for short).

Once the review is complete, Alice simply commits the reviewed files:

$ poascribe co po/ui/
po/ui/alpha.po  (0/2)
po/ui/foxtrot.po  (0/7)
po/ui/november.po  (3/12)
===== Ascription summary:
-           modified  reviewed
translated         3        21
>>>>> VCS is committing catalogs:
Sending      po/ui/november.po
Sending      po-ascript/ui/alpha.po
Sending      po-ascript/ui/foxtrot.po
Sending      po-ascript/ui/november.po
Transmitting file data ....
Committed revision 1284220.
$ 

Three things have happened here. First, all review states (flags, embedded differences, etc.) have been removed, restoring the diffed PO files to their original state. Then, any modifications that Alice has made during review are ascribed to her (here 3 out of 21 messages). Finally, all marked messages are ascribed as reviewed by Alice (any with unreviewed or nrev flags would have been omitted here). When committing, the only original PO file that got committed is the one with modifications made during review, and all the ascription PO files were committed because of the reviews recorded in them.

When many PO files with few changes per file should be reviewed, it becomes burdensome to manually open each and every diffed file for review, and then to make sure that all are committed with poascribe co. To make this easier, the -w toreview.out option can be added to the poascribe di command line, which requests that paths of all diffed PO files be written into the toreview.out file. This file can then be used to batch-open diffed PO files in an editor, as well as to commit them later by adding -f toreview.out to poascribe co. There is also the -o option, which tells poascribe to directly open PO files in one of the supported PO editors (see Section 9.7.1, “PO Editors”). Putting it together, to efficiently review a whole bunch of small changes throughout many PO files, with Lokalize as the PO editor, you can execute:

$ poascribe di paths... -w toreview.out -o lokalize
$ # ...only marked messages opened in Lokalize, review them...
$ poascribe co -f toreview.out

If for whatever reason you want to simply remove the review elements from messages without committing the PO files (effectively discarding the review), you can use the purge mode (short pu) of poascribe:

$ poascribe pu paths...

If the -k/--keep-flags option is added to this command line, the flags which mark the messages as reviewed get preserved; more precisely, every ediff* flag is replaced with the reviewed flag, and every unreviewed flag is left in, so that a subsequent invocation of poascribe co can record reviews. You will want this limited purging if you have some automatic validation tools to run before committing, and these tools would be thrown off by review elements (most likely by embedded differences).

6.4.2. Selecting Messages for Review

Invocations of poascribe di without any options, as in the previous section, are equivalent to this:

$ poascribe di -s modar paths...

The -s option serves to issue a message selector. modar is the default selector for the diff operation mode, and stands for "MODified-After-Review": it selects the earliest historical modification of the message after the last (or no) review of that message, if there is any such. By selecting a historical modification of the message, the difference from it to current version can be computed and embedded into the PO file, as seen in earlier examples.

There are various specialized selectors, and they fall into two groups: shallow selectors and history selectors. Shallow selectors look only into the current version of the message, and cannot select historical versions, which means that they cannot provide embedded differences. History selectors (modar is of this type) can select messages from history and provide differences. Several selectors can be issued on the command line, and the message is selected only if all selectors select it (boolean AND-linking). Shallow selectors are thus normally used as pre-filters for history selectors. For example, to select messages modified after the last review, but only those found in the stable branch, the branch and modar selectors are chained like this:

$ poascribe di -s branch:stable -s modar paths...

It is important that the history selector is given last, because the last selector determines which historical message is selected for diffing. If the ordering had been reversed in this example, the same messages would get selected, but they would not have embedded differences, because branch is a shallow selector.
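The chaining rules can be illustrated with a small Python sketch. It models each selector as a function returning a pair (selected, history index), where shallow selectors always return no index; the chain selects a message only if every selector does, and the index that drives diffing is whatever the last selector returned. The data model (a message as a dictionary with a branches set, a history as a list of (user, kind) pairs) and all function names are hypothetical simplifications, not poascribe internals.

```python
def branch_stable(message, history):
    # Shallow selector: looks only at the current message, never at history.
    return ("stable" in message["branches"], None)

def modar(message, history):
    # History selector: earliest modification after the last (or no) review.
    last_review = max(
        (i for i, (_user, kind) in enumerate(history) if kind == "r"),
        default=-1,
    )
    for i in range(last_review + 1, len(history)):
        if history[i][1] == "m":
            return (True, i)
    return (False, None)

def run_chain(selectors, message, history):
    """AND-link the selectors; the last one determines the history index."""
    index = None
    for selector in selectors:
        selected, index = selector(message, history)
        if not selected:
            return (False, None)
    return (True, index)

message = {"branches": {"stable", "trunk"}}
history = [("bob", "m"), ("alice", "r"), ("charlie", "m")]  # oldest first

print(run_chain([branch_stable, modar], message, history))  # (True, 2)
print(run_chain([modar, branch_stable], message, history))  # (True, None)
```

Both orderings select the message, but only the first yields a historical version (index 2, Charlie's modification) from which a difference could be computed.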

Selectors can take parameters themselves, like branch:stable in the previous example. Parameters are separated from the selector name by any non-alphanumeric character; this is a colon by convention, but if a parameter contains a colon, something else (slash, tilde, etc.) can be used. The number of parameters can vary, and modar in particular can take from none to three. If Alice wants to review only those messages modified by Charlie since the last review, she states this with the first parameter to modar:

$ poascribe di -s modar:charlie paths...

If Alice does not give much credit to other reviewers, she can request selection of messages modified after her own last review with the second parameter to modar:

$ poascribe di -s modar::alice paths...

Here the first parameter ("modified by..."), which is not needed, must be explicitly skipped, before proceeding to the second parameter ("reviewed by..."). (The third optional parameter to modar will be demonstrated later on.)

When a selector parameter is a user name, normally it can also be a comma-separated list of user names (modar:bob,charlie) or prefixed with tilde to negate, i.e. to select all users other than those listed (modar:~alice).
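The parameter syntax described above is simple enough to sketch in a few lines of Python. The sketch below is an assumption about how such a specification could be split, not poascribe's actual parser: it takes the first non-alphanumeric character after the selector name as the separator, keeps skipped parameters as empty strings, and interprets a user-name parameter as an optionally negated comma-separated list. The function names parse_selector and parse_users are hypothetical.

```python
import re

def parse_selector(spec):
    """Split a selector specification into (name, [parameters])."""
    match = re.match(r"[A-Za-z0-9]+", spec)
    if match is None:
        raise ValueError(f"selector has no name: {spec!r}")
    name = match.group(0)
    rest = spec[match.end():]
    if not rest:
        return name, []
    # The first character after the name is the parameter separator.
    separator = rest[0]
    return name, rest[1:].split(separator)

def parse_users(param):
    """Interpret a user-name parameter as (negated, [user names])."""
    negated = param.startswith("~")
    names = param[1:] if negated else param
    return negated, [n for n in names.split(",") if n]

print(parse_selector("modar"))          # ('modar', [])
print(parse_selector("modar:charlie"))  # ('modar', ['charlie'])
print(parse_selector("modar::alice"))   # ('modar', ['', 'alice'])
print(parse_selector("branch/stable"))  # ('branch', ['stable'])
print(parse_users("bob,charlie"))       # (False, ['bob', 'charlie'])
print(parse_users("~alice"))            # (True, ['alice'])
```

Keeping the skipped first parameter as an empty string is what lets modar::alice mean "modified by anyone, reviewed by Alice" rather than "modified by Alice".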

Any selector can be negated by prepending n to its name. For example, the history selector modafter:date selects the first modification after the given date; to select messages modified after the last review, but only if modified during June 2010:

$ poascribe di -s modafter:2010-06 -s nmodafter:2010-07 -s modar paths...

Negating a history selector produces a shallow selector: while modafter is a history selector, nmodafter is shallow. But the mutual ordering of the two in this example is not important, since the last selector in the chain is the usual modar.

Selectors can be issued in other modes too. If the PO file is big, and Alice has reviewed messages up to and including the message with entry number 246 when she has to pause until another day, she can commit reviews only up to this entry by issuing the espan selector:

$ poascribe co -s espan::246 paths...

The first parameter to espan, here omitted, would be the entry number of the first message to select, in case messages should not be selected starting from the first in the file. There is also the counterpart lspan selector, which works with referent line numbers (those of msgid keywords) instead of entry numbers.

If you do not want to immediately diff for review, but to see first how many messages would be selected by the selector chain that you assembled, you can use the status operation mode (st for short) instead of diff. It takes selectors in the same way as diff, and shows counts of selected messages by category. You can also add the -b option to have counts reported by PO file (where non-zero).

You may also want to observe the complete recorded ascription history of a message, all its modifications and reviews, with differences between each two modifications. For this you can use the history operation mode (hi for short), typically with one of the lspan or espan selectors to single out a particular message. The history will be written out to the terminal, starting from the newest to the oldest version of the message, with highlighted embedded differences.

6.4.3. Fine-Grained Reviews

In the introduction of this chapter, several distinct things that can go wrong in translation were described. Not all reviewers may be able to check translation against all those problems. Here is a typical scenario of this kind:

Alice is computer-savvy and knows the translation project inside and out, which means that she can review well for context, terminology, and technical style. But, her language style leaves something to be desired, which shows in longer sentences and passages. Dan, on the other hand, is a very literary person, but not that much into the technical aspects. Dan's style reviews would thus be a perfect complement to Alice's general reviews.

poascribe can support this scenario in the following way. A review type tag lstyle for language style is defined in the ascription configuration, using the review-tags field:

[global]
# ...
review-tags = lstyle

With this addition to configuration, Alice can continue to review as she did before, without any changes in her workflow.

Dan selects messages for review similarly to Alice, but additionally giving the lstyle tag as the third parameter of modar, and indicating that reviews should be tagged as lstyle using the -t option:

$ poascribe di -s modar:::lstyle -t lstyle paths...

After finishing the review, Dan commits as usual:

$ poascribe co paths...

If Dan is always going to review the language style, in order not to have to issue the selector and the tag in command line all the time, he can make them default for the diff mode in Pology user configuration:

[poascribe]
user = dan
selectors/diff = modar:::lstyle
tags/diff = lstyle

With this, Dan can use plain poascribe di just like Alice does.

The important point of review tags is that they make reviews by types independent. For example, Dan may come around to review the language style of the given message after several modifications and general reviews have been ascribed to it -- modar:::lstyle will simply ignore all reviews other than lstyle reviews. This is going to be reflected in the ~ascto: comment of diffed messages:

#...
#. ~ascto: charlie:m alice:r bob:m
#...
msgid "..."
msgstr "..."

Here Alice has made one review between Charlie's and Bob's modifications, and that review, being general instead of lstyle, did not cause modar to stop at it. After Dan reviews this message for language style, Alice runs selection for review and gets this:

#...
#. ~ascto: bob:m dan:r(lstyle)
#...
msgid "..."
msgstr "..."

Again, since lstyle reviews do not mix with general reviews[27], Dan's review did not hide Bob's modification that Alice did not check so far.
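The independence of review types can be modelled in a few lines of Python. This is a deliberately simplified sketch, not the actual modar implementation: the history is assumed to be a chronological list of (user, kind, tag) tuples, user filters are left out, and the names pending_since_review and fmt are invented for the example.

```python
def pending_since_review(history, tag=""):
    """Return ascriptions made after the last review of type `tag`,
    or an empty list if no modification followed that review.
    `history` is oldest-first; kind is "m" (modified) or "r" (reviewed)."""
    base = -1  # index of the last review of the requested type
    for i, (user, kind, rtag) in enumerate(history):
        if kind == "r" and rtag == tag:
            base = i
    later = history[base + 1:]
    # Select the message only if there is something new to review.
    if any(kind == "m" for _, kind, _ in later):
        return later
    return []


def fmt(entries):
    """Format entries in the style of the ~ascto: comment."""
    parts = []
    for user, kind, tag in entries:
        suffix = "(%s)" % tag if kind == "r" and tag else ""
        parts.append("%s:%s%s" % (user, kind, suffix))
    return " ".join(parts)


history = [("charlie", "m", ""), ("alice", "r", ""), ("bob", "m", "")]
print(fmt(pending_since_review(history, "lstyle")))  # what Dan is shown

history.append(("dan", "r", "lstyle"))
print(fmt(pending_since_review(history, "")))        # what Alice is shown
```

In this model, a review only moves the base for its own tag, which is why Alice's general review does not shorten Dan's view and Dan's lstyle review does not shorten Alice's.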

6.5. General Maintenance Procedures

After the ascription system is set up, there should be very little to do to maintain it. The details depend on the established translation workflow, and this section describes some of the procedures which may apply.

6.5.1. Ascribing Merges

If PO files are periodically merged with templates in a centralized manner, by one designated person or repository automation, these modifications must also be ascribed. This is done as any other ascription, by substituting the VCS commit command with poascribe co. For example:

$ svn commit $LANGDIR -m "Everything merged."

may be substituted with:

$ poascribe commit $LANGDIR -m "Everything merged."

Since the user is not explicitly given by the -u option, this will ascribe modifications due to merging to the person set in Pology user configuration on the system where the command is executed. This is just fine. It is also possible to define a dummy user to which modifications due to merging are ascribed, though there is no known advantage to that at present.

Note that you can issue the -C option to prevent poascribe from automatically committing merged files, in case there are some automatic post-merge operations that you would like to perform on merged PO files beforehand. Afterward, a standalone VCS commit command can be issued, but do not forget to include the ascription file tree in it as well.

6.5.2. Shuffling Ascription PO Files

Sometimes PO files are "shuffled" in the repository: renamed, moved to another subdirectory, etc. Such shuffling should be exactly mirrored in the ascription tree:

  • If a PO file is moved or renamed, its counterpart ascription PO file should also be moved or renamed in the same way within the ascription tree.

  • If a PO file is split into two, then it depends on how you handle the splitting. A good way would be to copy the old PO file to two new names, and then merge them with new templates. In this way as much of existing translation as possible will be preserved. If this is done, then the ascription PO files should be copied to new names, but then there is nothing to merge them with. This is just right, since message ascription histories generally interleave across the split (but also see Section 6.5.4, “Trimming Ascription History”).

  • If two PO files are merged into one, you should probably handle that by using msgcat to properly concatenate them into the new PO file, and then merge the new PO file with its template. Then, the old ascription PO files should be concatenated with msgcat as well, and nothing more. But, make sure that you issue the --use-first option to msgcat, for both concatenations. This is because when the two concatenated PO files contain messages with the same msgctxt+msgid but different msgstr, msgcat will by default make a free-form composition of msgstr texts, for the translator to manually disentangle later. This would ruin the ascription entry of such a message in the concatenated ascription PO file.

After the shuffling is performed in both file trees, poascribe co is executed to smooth out and commit modifications.

6.5.3. Filtering for Release

At the time of this writing, filtering for release has not yet been implemented in poascribe, but it is planned.

However, if you translate in summit, it is possible to configure the summit to skip insufficiently reviewed messages when scattering to branches. See Section 5.3.7, “Filtering by Ascription on Scatter” for details.

6.5.4. Trimming Ascription History

At the time of this writing, trimming the ascription history has not yet been implemented in poascribe, but it is planned.

6.6. Command Line Options

Overview of operation modes:

commit, co

Commits modifications and reviews to PO files. Default selector: any.

diff, di

Adds embedded differences and other review elements to selected messages in PO files. Default selector: modar.

history, hi

Outputs to terminal the complete history of modifications and reviews for selected messages. Default selector: any.

purge, pu

Removes all review elements from PO files (unless -k/--keep-flags option is added, when only review flags are kept). Default selector: any.

release, re

Not implemented yet.

status, st

Shows ascription counts per message category (total for all selected messages, and also per PO file if -b/--show-by-file option is added). Default selector: any.

trim, tr

Not implemented yet.

Options specific to poascribe:

-a SELECTOR[:ARGS], --select-ascription=SELECTOR[:ARGS]

By default, a historical message is selected for diffing with current message based on the last history selector given by -s/--selector option (if any). Instead, with this option you can explicitly set the selector for historical messages. It will be applied after the message has been selected by the primary selector chain. The option can be repeated, in which case a historical message is selected if all selectors match it.

-A RATIO, --min-adjsim-diff=RATIO

The minimum adjusted similarity between the current and the historical message at which embedded differences will be shown.[28] This is a number in range from 0.0 (always show) to 1.0 (never show). If the difference is not shown due to this limit, the message will get the flag ediff-ignored instead of the usual ediff. A reasonable value may be 0.6 to 0.8.

-b, --show-by-file

Some operation modes show summary at the end of the run, which is based on all processed PO files taken together. With this option you can request some of the summary elements to be shown per processed file.

-C, --no-vcs-commit

Issue this option if you want poascribe not to commit modifications to version control itself. This may be useful if you want to examine raw modifications it made, to perform some checks, etc, and commit manually later. But do not modify any messages in between, as that would defeat the purpose of ascription.

-d LEVEL, --depth=LEVEL

Operation modes normally consider ascription history of a message starting from the newest and going down to the earliest ascription. With this option you can set the depth to which history is examined, where 0 is the newest ascription only, 1 the current and first previous ascription, etc.

-D SPEC, --diff-reduce-history=SPEC

Some special (possibly custom) selectors may need to examine only differences or commonalities between each two adjacent messages. In order not to have to build this functionality into each such selector, you can issue this option to preprocess ascription history such that each historical message is reduced based on the difference with the next earlier message. The message can be reduced to the parts equal, added or removed as compared to the earlier message. This is controlled by the SPEC value, which must start with one of the letters e (equal), a (added), or r (removed). This letter may be followed with an arbitrary sequence of characters, which will be used to separate the remaining parts of the text in the message; if there are no additional characters, space is used as the separator.

-F HOOKSPEC, --filter=HOOKSPEC

Sometimes it may be necessary to apply selectors not to the ascription history as it is, but to a suitably filtered version of the history. This option can be used to set a Pology F1A hook as filter, see Section 9.10, “Processing Hooks” for details. It can be repeated to set several filters.

-G, --show-filtered

When setting a filter on ascription history by the -F/--filter option in the diff mode, it may be good to see also the difference in filtered messages, those on which the selectors were actually applied. By issuing this option, every message field with an embedded difference will get added a visually conspicuous separator, followed by the filtered version of the text with difference as well. When you commit or purge the PO file diffed in this way, the separators and the filtered text are removed together with all other review elements.

-k, --keep-flags

When the diffed PO file is purged of review elements, by default all review elements are removed, so that on subsequent commit only modifications would be ascribed, if there were any. Issuing this option on purge causes all review elements except flags to be removed. More precisely, ediff* flags are replaced with reviewed, and unreviewed flags are simply kept. This makes the subsequent commit also ascribe reviews. You need this if you want to apply some automatic checks to the PO file after the review and before the commit, where more intrusive review elements (like embedded differences) would interfere.

-m TEXT, --message=TEXT

The text of the commit message. If a default commit message is set in the ascription configuration, this text overrides it. If a default commit message is not set and this option is not issued, an editor window is opened to enter the commit message.

-o EDITOR, --open-in-editor=EDITOR

When diffing for review, instead of manually opening diffed PO files and searching for messages by flags, this option can be issued to have poascribe automatically open PO files in a PO editor (and possibly have the editor filter the message list to only selected messages). This works only with PO editors explicitly supported by Pology; the EDITOR value is an editor keyword rather than an arbitrary editor command. See Section 9.7.1, “PO Editors” for the list of supported editors.

-L RATIO, --max-fraction-select=RATIO

In diff mode, this option sets the ratio of selected messages to total messages in a given PO file, above which no message in that file will be selected although the selector chain matched them. The value is the number between 0.0 and 1.0; for example, 0.2 means to accept selection if the number of selected messages is at most 20% of the total number of messages. This can be used to discern between reviewing updated PO files and newly translated PO files, as the latter take much more time to review and hence may be of lesser priority.

-s SELECTOR[:ARGS], --selector=SELECTOR[:ARGS]

The option to set a selector, in various modes. Can be repeated to create selector chains, in which case a message must match all selectors to be selected. In diff mode, if the last selector in the chain is not a history selector, selected messages will have no embedded differences (unless an ascription selector is explicitly given by the -a/--select-ascription option).

-t TAG, --tag=TAG

The review tag, denoting the type of the review. If the review-tags field in ascription configuration is set, this must be one of the tags defined there (general review has an empty string as tag, which is the default). The tag is normally issued in diff mode: it will be appended to review flags on diffed messages (e.g. ediff/tag), so that a review of this type is ascribed on commit. In commit mode, this option has effect only if --all-reviewed is issued as well, in which case this tag will override any from the PO file. Several tags may be given as a comma-separated list.

-u NAME, --user=NAME

The user, one of those defined in ascription configuration, to whom modifications and reviews are ascribed on commit.

-U, --update-headers

If you work on PO files with a general text editor, you can issue this option on commit to have the header data in modified PO files automatically updated. The necessary information is fetched from the ascription configuration.

-v, --verbose

More detailed information on progress of the operation.

-w FILE, --write-modified=FILE

This option specifies the file into which to write the path of every PO file modified during the operation, one per line. This file can later be fed back to poascribe (and other Pology commands) with the -f/--files-from option.

-x FILE, --externals=FILE

If you have written some custom selectors, with this option you specify the path to the file containing them. It can be repeated to load several files with custom selectors.

--all-reviewed

On commit, normally only messages having ediff* or reviewed flags will be ascribed as reviewed. If this option is used, instead all messages will be ascribed as reviewed (except for those having unreviewed flag).
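The limit imposed by the -L/--max-fraction-select option amounts to a simple per-file ratio check, sketched here in Python. The helper accept_selection is hypothetical, and its treatment of an empty PO file is an assumption made for the example rather than documented behavior.

```python
def accept_selection(n_selected, n_total, max_fraction):
    """Return True if the selection in one PO file should be kept.
    (Hypothetical helper mirroring the description of -L.)"""
    if n_total == 0:
        return False  # assumption: nothing to select in an empty file
    return n_selected / n_total <= max_fraction


print(accept_selection(15, 100, 0.2))  # 15% selected: within the limit
print(accept_selection(60, 100, 0.2))  # 60% selected: file is passed over
```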

Options common with other Pology tools:

-f FILE, --files-from=FILE

See Section 9.5, “Reading Paths From a File”.

-e REGEX, --exclude-name=REGEX; -E REGEX, --exclude-path=REGEX; -i REGEX, --include-name=REGEX; -I REGEX, --include-path=REGEX

See Section 9.4, “Path Inclusion and Exclusion”.

6.7. User Configuration

The following configuration fields can be used to modify general behavior of poascribe:

[poascribe]/aselectors

The list of explicit selectors of historical messages, as if they were issued with multiple -a/--select-ascription options. The first character in the value must be non-alphanumeric (e.g. /), and that character is then used to separate selector specifications; the value must also end with this character.

[poascribe]/diff-reduce-history

Counterpart to -D/--diff-reduce-history command line option.

[poascribe]/filters

Comma-separated list of history filters, as if they were issued with multiple -F/--filter options.

[poascribe]/max-fraction-select

Counterpart to -L/--max-fraction-select command line option.

[poascribe]/min-adjsim-diff

Counterpart to -A/--min-adjsim-diff command line option.

[poascribe]/po-editor

Counterpart to -o/--open-in-editor command line option.

[poascribe]/selectors

List of message selectors, as if they were issued with multiple -s/--selector options. The first character in the value must be non-alphanumeric (e.g. /), and that character is then used to separate selector specifications; the value must also end with this character.

[poascribe]/tags

Counterpart to -t/--tag command line option.

[poascribe]/update-headers=[yes|*no]

Setting to yes is counterpart to -U/--update-headers command line option.

[poascribe]/user

Counterpart to -u/--user command line option.

[poascribe]/vcs-commit=[*yes|no]

Setting to no is counterpart to -C/--no-vcs-commit command line option.
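The separator convention used by the aselectors and selectors fields can be illustrated with a small Python sketch. It follows only the two rules stated for these fields (a leading non-alphanumeric separator, repeated at the end of the value); the function split_selector_list is hypothetical and not part of Pology.

```python
def split_selector_list(value):
    """Split a selectors-style field value into selector specifications.
    (Hypothetical helper; the checks follow the rules stated above.)"""
    if not value or value[0].isalnum():
        raise ValueError("value must start with a non-alphanumeric separator")
    sep = value[0]
    if len(value) < 2 or not value.endswith(sep):
        raise ValueError("value must end with the separator %r" % sep)
    return value[1:-1].split(sep)


print(split_selector_list("/modar:::lstyle/espan::246/"))
```

Choosing the separator per value means that selector arguments can freely contain colons or commas, as long as they do not contain the chosen separator itself.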

6.8. Review Selectors

poascribe provides a variety of internal selectors, and new selectors are added as general need for them is observed in practice. Selectors come in two types: history and shallow; the former also select a historical message from which to show the differences to the current message, while the latter do not. Arguments are appended to the selector name, consistently separated by any non-alphanumeric character, customarily colon (:) when possible. If fewer arguments are given than the selector can take, all remaining arguments are set to empty (the selector may or may not accept this).
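How a selector specification breaks down into a name and arguments can be sketched in Python as follows. The sketch assumes that the separator is the first non-alphanumeric character following the selector name; the function parse_selector and its arity parameter are hypothetical and only mirror the description given here.

```python
def parse_selector(spec, arity):
    """Split SELECTOR[:ARGS] into (name, args), padding args to `arity`,
    the number of arguments the selector can take.
    (Hypothetical helper, for illustration only.)"""
    # Assumption: the first non-alphanumeric character is the separator.
    sep_pos = next((i for i, c in enumerate(spec) if not c.isalnum()), None)
    if sep_pos is None:
        name, args = spec, []
    else:
        name = spec[:sep_pos]
        args = spec[sep_pos + 1:].split(spec[sep_pos])
    if len(args) > arity:
        raise ValueError("too many arguments for selector %r" % name)
    # Arguments that were not given are set to empty.
    return name, args + [""] * (arity - len(args))


print(parse_selector("modar:::lstyle", 3))
print(parse_selector("espan::246", 2))
# A separator other than colon keeps the time of day in one argument.
print(parse_selector("modafter/2010-06-01 10:30:00/alice", 2))
```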

Available internal selectors are as follows:

any (shallow)

Selects any message.

active (shallow)

Selects active messages, i.e. those translated and not obsolete.

asc:USER (history)

Selects latest historical message ascribed (modified or reviewed) by the given user, or by any user if the argument is empty. Multiple users can be given as a comma-separated list, and selection inverted by prepending ~.

branch:NAME (shallow)

Selects messages belonging to the given branch (see Chapter 5, Summitting Translation Branches). Several branch names may be given, as comma-separated list.

current (shallow)

Selects current messages, i.e. those not obsolete.

e:ENTRYNUM (shallow)

Selects a message with given entry number in the PO file (first message has entry number 1, second 2, etc).

espan:START:END (shallow)

Selects messages with entry numbers between given start and end, including both. If start is empty, 1 is assumed; if end is empty, number of messages is assumed.

fexpr:EXPRESSION (shallow)

Selects messages matching a boolean search expression on message parts. It has same syntax as the fexpr parameter of the find-messages sieve.

hexpr:EXPRESSION:USER:DIFFSPEC (history)

Like fexpr, but matches through historical messages starting from the latest ascription. If user argument is not empty, matches only messages ascribed to that user. Multiple users can be given as a comma-separated list, and selection inverted by prepending ~. The last argument, if not empty, requests to reduce historical messages by incremental differences before matching them; see the --diff-reduce-history option for the syntax and other details.

l:LINENUM (shallow)

Selects a message with given referent line number in the PO file. This is the line number of msgid message field. ±1 offset is accepted.

lspan:START:END (shallow)

Selects messages with referent line numbers between given start and end, including both. If start is empty, 1 is assumed; if end is empty, total number of lines is assumed.

mod:USER (history)

Selects latest historical message modified by the given user, or by any user if the argument is empty. Multiple users can be given as a comma-separated list, and selection inverted by prepending ~.

modafter:TIMESTAMP:USER (history)

Selects the earliest historical message modified at or after the given date and time. The full timestamp format is YEAR-MONTH-DAY HOUR:MINUTE:SECOND, but trailing elements can be omitted as logical; for example, 2010-10 would be interpreted as 2010-10-01 00:00:00. If the user argument is not empty, only modifications by that user are considered; multiple users can be given as a comma-separated list, and selection inverted by prepending ~.

modam:USER1:USER2 (history)

Selects the earliest historical message which introduced modifications after the last modification, or the very first historical message. This makes sense only if one or both of the user arguments are not empty. If the first user argument is not empty, only modifications by that user are considered for selection. If the second user argument is not empty, only the modifications by that user are considered as base. For both user arguments, multiple users can be given as a comma-separated list, and selection inverted by prepending ~.

modar:MODUSER:REVUSER:TAG (history)

Selects the earliest historical message which introduced modifications after the last review, or the very first historical message if there was no review yet. If the first user argument is not empty, only modifications by that user are considered and reviews by that user are ignored. If the second user argument is not empty, only reviews by that user are considered and modifications by that user are ignored. For both user arguments, multiple users can be given as a comma-separated list, and selection inverted by prepending ~. The last argument determines which review types (by review tag) to consider, where empty value means "general review"; multiple tags can be given as comma-separated list.

modarm:MODUSER:REVUSER:TAG (history)

Like modar, but uses as base for selection the last review or the last modification. This generally makes sense only if some combination of user arguments is given too.

rev:USER (history)

Selects latest historical message reviewed by the given user, or by any user if the argument is empty. Multiple users can be given as a comma-separated list, and selection inverted by prepending ~.

revbm:REVUSER:MODUSER:TAG (history)

Selects the earliest historical message which has been reviewed just before a modification occurred. If the first user argument is not empty, only reviews by that user are considered and modifications by that user are ignored. If the second user argument is not empty, only modifications by that user are considered and reviews by that user are ignored. For both user arguments, multiple users can be given as a comma-separated list, and selection inverted by prepending ~. The last argument determines which review types (by review tag) to consider, where empty value means "general review"; multiple tags can be given as comma-separated list.

tmodar:MODUSER:REVUSER:TAG (history)

Like modar, but considers as modified only those historical messages with modifications in translation (msgstr).

unasc (shallow)

Selects messages that are not yet ascribed, i.e. those which are modified but not yet committed.

Every selector automatically gets a negative counterpart, with the name prefixed by n (for example, nmodafter is the negative of modafter). The negative selector is always shallow, regardless of the type of the original selector.
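The completion of truncated timestamps described for the modafter selector can be sketched in Python with the standard datetime module. Which truncated layouts poascribe actually accepts is an assumption here (the list below simply drops trailing elements one at a time), and parse_partial_timestamp is a hypothetical helper.

```python
from datetime import datetime

# Candidate layouts, from the full timestamp down to the year alone.
# Assumption: every prefix of the full format is acceptable.
_FORMATS = [
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%d %H:%M",
    "%Y-%m-%d %H",
    "%Y-%m-%d",
    "%Y-%m",
    "%Y",
]


def parse_partial_timestamp(text):
    """Parse a possibly truncated timestamp; omitted trailing elements
    default to their earliest value. (Hypothetical helper.)"""
    for fmt in _FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue  # layout did not match, try a shorter one
    raise ValueError("unrecognized timestamp: %r" % text)


print(parse_partial_timestamp("2010-10"))
print(parse_partial_timestamp("2010-06-15 14:30"))
```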

6.8.1. Custom Review Selectors

Custom selectors can be written in a standalone Python source file, which is then fed to poascribe using the -x/--externals option. A file with several custom selectors should have this layout:

def selector_foo (args):
    ...

def selector_bar (args):
    ...

asc_selector_factories = {
    # key: (factory, is_history_selector)
    "foo": (selector_foo, False),
    "bar": (selector_bar, True),
}

selector_foo and selector_bar are factory functions for selectors foo and bar. After loading the file, poascribe will look for the asc_selector_factories dictionary to see which selectors are defined and of what type they are. See Section 11.5, “Writing Ascription Selectors” for the instructions on writing selector factory functions.
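The lookup of the asc_selector_factories dictionary can be pictured with the following Python sketch, which executes a selector file and reads the dictionary from the resulting namespace. It is not poascribe's actual loader: load_selector_factories is a hypothetical function, the file name is arbitrary, and the factories return placeholder callables, since the real selector interface is the subject of Section 11.5.

```python
import os
import runpy
import tempfile

# A stand-in for a custom selector file with the layout shown above.
SOURCE = '''
def selector_foo (args):
    return lambda *a, **kw: None   # placeholder, not a real selector

def selector_bar (args):
    return lambda *a, **kw: None   # placeholder, not a real selector

asc_selector_factories = {
    "foo": (selector_foo, False),
    "bar": (selector_bar, True),
}
'''


def load_selector_factories(path):
    """Execute the file at `path` and return its factory dictionary.
    (Hypothetical loader, for illustration only.)"""
    namespace = runpy.run_path(path)
    return namespace["asc_selector_factories"]


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "myselectors.py")
    with open(path, "w") as f:
        f.write(SOURCE)
    factories = load_selector_factories(path)

# Report each selector and its declared type.
for name, (factory, is_history) in sorted(factories.items()):
    print(name, "history" if is_history else "shallow")
```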



[24] Technically, PO files could also be grouped by PO domain:

po/
    ui/
        alpha/
            ...
            nn.po
            mm.po
            ...

but this would lead to a host of strange sharings of ascription settings and auxiliary file locations between different languages. In general, it is assumed that each translation team manages its own separate ascription.

[25] There should be no technical problem here, since VCS updates are inexpensive in terms of network traffic, but there may be a problem of changing one's habits.

[26] If the underlying VCS is a distributed one, such as Git, and the push to a designated central repository is expected afterward, it must be performed manually.

[27] General review too has a tag assigned, the empty string, in case the reviewer needs to explicitly issue it in some context.

[28] Unlike for example in fuzzy messages, the similarity between the current and the earlier message from the ascription history may be exactly zero, if the PO file has undergone several merges in between. For example, in a two-word message, the first merge could have replaced the first word, and the second merge the second word.