Module trapnakron

Constructors of syntagma derivators for trapnakron.

Trapnakron -- transcriptions and translation of names and acronyms -- is a collection of syntagma derivator definitions residing in pology/lang/sr/trapnakron/. Its purpose is to support translation efforts in the Serbian language, where proper nouns and acronyms are frequently transcribed, and sometimes translated. For translators, it can serve as a manual reference, or even be sourced directly in translated material (see below). For readers, it is a way to obtain the original forms of transcribed and translated phrases.

Trapnakron web pages are built from the trapnakron source in Pology. This makes links between original and localized forms readily available through internet search engines. Adding the keyword trapnakron or трапнакрон to the search phrase causes the relevant trapnakron page to appear within the top few hits, and the desired other form is shown already in the excerpt of the hit, so that it is not even necessary to follow the link. This frees translators from the burden of providing original forms in parentheses at the first mention (or by some similar method), and frees the text of the resulting clutter.

While trapnakron definitions may be manually collected and imported into a basic Synder object, this module provides wrappers which free the user of this manual work, as well as appropriate transformation functions (the *tf parameters to the Synder constructor) to produce various special behaviors on lookups. Trapnakron constructors are defined by type of textual material, e.g. for plain text or Docbook documentation. The documentation of each constructor states which special lookup behaviors are available through the Synder objects it creates.

For a short demonstration, consider this derivation of a person's name:

   钱学森, Qián Xuésēn, Tsien Hsue-shen: Ћен| Сјуесен|

Suppose that a translator wants to source it directly in the text, rather than to manually copy the transcription (e.g. to avoid having to update the text should the transcription be modified in the future). The translator therefore writes, using XML entity syntax:

   ...пројектовању ракета &qianxuesen-g; привукле су идеје...

where -g denotes genitive case. This text can be easily processed into the final form (before going out to readers), using a script based on these few lines:

   >>> from pology.lang.sr.trapnakron import trapnakron_plain
   >>> from pology.resolve import resolve_entities_simple as resents
   >>> tp = trapnakron_plain()
   >>>
   >>> s = u"...пројектовању ракета &qianxuesen-g; привукле су идеје..."
   >>> print resents(s, tp)
   ...пројектовању ракета Ћена Сјуесена привукле су идеје...
   >>>
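
To make the mechanics of this substitution concrete, here is a minimal self-contained sketch that does not use Pology at all. A plain dictionary stands in for the Synder object, and the helper name resolve_entities_sketch and the regular expression are assumptions made for illustration; only the key qianxuesen-g and its resolved value are taken from the example above.

```python
import re

# Stand-in for a trapnakron derivator: compound key -> property value.
# (A real Synder object computes such values from derivation files.)
FAKE_TRAPNAKRON = {
    "qianxuesen-g": "Ћена Сјуесена",
}

# Hypothetical pattern for XML-style entities such as &qianxuesen-g;
_ENTITY_RX = re.compile(r"&([\w-]+);")


def resolve_entities_sketch(text, entities):
    """Replace each known &key; with its value; leave unknown ones as-is."""
    def repl(match):
        return entities.get(match.group(1), match.group(0))
    return _ENTITY_RX.sub(repl, text)


s = "...пројектовању ракета &qianxuesen-g; привукле су идеје..."
resolved = resolve_entities_sketch(s, FAKE_TRAPNAKRON)
print(resolved)
```

Leaving unknown entities untouched is a design choice of this sketch, so that ordinary XML entities in the text would survive the pass; how resolve_entities_simple treats them is not stated here.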

Author: Chusslove Illich (Часлав Илић) <caslav.ilic@gmx.net>

License: GPLv3

Functions

  • trapnakron(envec=u'', envel=u'л', envic=u'иј', envil=u'ијл', markup='plain', tagmap=None, ptsuff=None, ltsuff=None, gnsuff=None, stsuff=None, adsuff=None, nmsuff=None, npkeyto=None, nobrhyp=False, disamb='', runtime=False)
    Main trapnakron constructor, covering all options. Returns: Synder.
  • rootdir()
    Get root directory to trapnakron derivation files. Returns: string.
  • trapnakron_plain(envec=u'', envel=u'л', envic=u'иј', envil=u'ијл')
    Constructs trapnakron suitable for application to plain text.
  • trapnakron_ui(envec=u'', envel=u'л', envic=u'иј', envil=u'ијл')
    Constructs trapnakron suitable for application to UI texts.
  • trapnakron_docbook4(envec=u'', envel=u'л', envic=u'иј', envil=u'ијл', tagmap=None)
    Constructs trapnakron suitable for application to Docbook 4 texts.
  • norm_pkey(pkey)
    Normalize internal property keys in trapnakron. Returns: as input.
  • norm_rtkey(text)
    Normalize text into runtime key for translation scripting. Returns: string.

Variables
  __package__ = 'pology.lang.sr'
Function Details

trapnakron(envec=u'', envel=u'л', envic=u'иј', envil=u'ијл', markup='plain', tagmap=None, ptsuff=None, ltsuff=None, gnsuff=None, stsuff=None, adsuff=None, nmsuff=None, npkeyto=None, nobrhyp=False, disamb='', runtime=False)

 

Main trapnakron constructor, covering all options.

The trapnakron constructor sets, either by default or optionally, various transformations to enhance queries to the resulting derivator.

Default behavior

Property values are returned as alternatives/hybridized compositions of Ekavian Cyrillic, Ekavian Latin, Ijekavian Cyrillic, and Ijekavian Latin forms, as applicable. Any of these forms can be excluded from derivation by setting its env* parameter to None. env* parameters can also be used to change the priority environment from which the particular form is derived.

Derivation and property key separator in compound keys is the ASCII hyphen (-).
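
As an illustration of the compound key form, the following sketch splits a key such as qianxuesen-g into its derivation key and property key. The function name split_compound_key and the choice to split at the last hyphen are assumptions for this example; the actual lookup is performed inside the Synder object.

```python
def split_compound_key(ckey, sep="-"):
    """Split 'dkey-pkey' at the last separator into (dkey, pkey)."""
    dkey, found, pkey = ckey.rpartition(sep)
    if not found:
        # No separator present: the whole string is the derivation key
        # and the property key is empty.
        return ckey, ""
    return dkey, pkey


print(split_compound_key("qianxuesen-g"))
print(split_compound_key("qianxuesen"))
```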

Derivation keys are derived from syntagmas by applying the identify() function. In derivations where this will result in strange keys, additional keys should be defined through hidden syntagmas. Property keys are transliterated into stripped-ASCII.

Conflict resolution for derivation keys is not strict (see derivator constructor).

Optional behavior

Instead of plain text, properties may be reported with some markup. The markup type is given by markup parameter, and can be one of "plain", "xml", "docbook4". The tagmap parameter contains mapping of derivation keys to tags which should wrap properties of these derivations.
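
The role of tagmap can be pictured with a small sketch. It is only a reading of the description above, not the module's implementation: the function name wrap_property, the derivation key exampleapp, and the tag application are all hypothetical.

```python
def wrap_property(dkey, value, markup="plain", tagmap=None):
    """Wrap value in the tag mapped to dkey, unless markup is plain."""
    if markup == "plain" or not tagmap:
        return value
    tag = tagmap.get(dkey)
    if tag is None:
        # No tag registered for this derivation key: report as-is.
        return value
    return "<{0}>{1}</{0}>".format(tag, value)


# Hypothetical mapping of one derivation key to a Docbook tag.
tagmap = {"exampleapp": "application"}
print(wrap_property("exampleapp", "Пример", "docbook4", tagmap))
print(wrap_property("exampleapp", "Пример", "plain", tagmap))
```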

Derivation keys can have several suffixes which affect how the properties are reported:

  • Presence of the suffix given by ptsuff parameter signals that properties should be forced to plain text, if another markup is globally in effect.
  • Parameter ltsuff states the suffix which produces a lighter version of the markup, where applicable (e.g. people names in Docbook).
  • When fetching a property within a sentence (with keys given e.g. as XML entities), sentence construction may require that the resolved value is of certain gender and number; parameter gnsuff can be used to provide a tuple of 4 suffixes for gender in singular and 4 suffixes for gender in plural, such that the property will resolve only if the value of gender and number matches the gender and number suffix.
  • Parameters stsuff and adsuff provide suffixes through which systematic transcription and alternative derivations are requested. They are actually tuples, where the first element is the key suffix, and the second element the suffix to primary environment which produces the systematic/alternative environment. adsuff can also be a tuple of tuples, if several alternative derivations should be reachable.
  • In case the entry is a person's name with tagged first and last name, parameter nmsuff can provide a tuple of 2 suffixes by which only the first or last name are requested, respectively.
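
The gender and number suffixes from the list above can be sketched as a simple filter: a value resolves only when the suffix on the derivation key agrees with the gender and number recorded for that value. The suffixes _rm, _rz and _rs are those documented for trapnakron_plain; the table layout, the function name, and the gender and number tags ("m", "f", "n", "sg") are assumptions for this example.

```python
# Hypothetical table: key suffix -> (gender, number) it requires.
GENDER_SUFFIXES = {
    "_rm": ("m", "sg"),  # masculine singular
    "_rz": ("f", "sg"),  # feminine singular
    "_rs": ("n", "sg"),  # neuter singular
}


def resolve_with_gender(dkey, value, gender, number,
                        suffixes=GENDER_SUFFIXES):
    """Return value if dkey carries no gender suffix or a matching one.

    Returns None (no resolution) when the suffix does not match.
    """
    for suffix, required in suffixes.items():
        if dkey.endswith(suffix):
            return value if required == (gender, number) else None
    return value


print(resolve_with_gender("qianxuesen_rm", "Ћен Сјуесен", "m", "sg"))
print(resolve_with_gender("qianxuesen_rz", "Ћен Сјуесен", "m", "sg"))
```

Returning None for a mismatch mirrors the statement that the property "will resolve only if" gender and number match; how an unresolved key is reported to the caller is left open here.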

Ordinary hyphens may be converted into non-breaking hyphens by setting the nobrhyp parameter to True. Non-breaking hyphens are added heuristically; see the to_nobr_hyphens() hook. This is useful e.g. to avoid wrapping on hyphen-separated case endings.
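
A deliberately simplified stand-in for such a heuristic is shown below. It is not the logic of to_nobr_hyphens(), whose rules are not described here; it merely replaces a hyphen that sits directly between two word characters with U+2011 (non-breaking hyphen). The function name is hypothetical.

```python
import re

NOBR_HYPHEN = "\u2011"  # U+2011 NON-BREAKING HYPHEN


def nobr_hyphens_sketch(text):
    """Replace hyphens that stand directly between two word characters."""
    return re.sub(r"(?<=\w)-(?=\w)", NOBR_HYPHEN, text)


# A hyphen-separated case ending is kept together with its stem,
# while a spaced hyphen used as a dash is left alone.
converted = nobr_hyphens_sketch("KDE-а - окружење")
print(converted == "KDE\u2011а - окружење")
```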

A property key normally cannot be empty, but the npkeyto parameter can be used to automatically substitute another property key when an empty property key is seen in a request for properties. In the simpler version, the value of npkeyto is just a string, the key to substitute for the empty one. In the more complex version, the value is a tuple containing the key to substitute and a list of two or more supplemental property keys: the empty key is replaced only if all supplemental property values exist and are equal (see e.g. trapnakron_plain for usage of this).
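
The two forms of npkeyto can be sketched as follows. This is a reading of the rule just described, not the module's code: the function name, the props dictionary standing in for a derivation's properties, and the placeholder values are assumptions. The keys am and gm are the ones mentioned for trapnakron_plain.

```python
def substitute_empty_pkey(pkey, npkeyto, props):
    """Return the property key to use in place of pkey."""
    if pkey or npkeyto is None:
        return pkey  # non-empty keys are never touched
    if isinstance(npkeyto, str):
        return npkeyto  # simple form: always substitute
    # Complex form: (key to substitute, supplemental keys).
    subst_key, supplemental = npkeyto
    values = [props.get(k) for k in supplemental]
    all_exist = all(v is not None for v in values)
    if all_exist and len(set(values)) == 1:
        return subst_key
    return pkey  # condition not met: the key stays empty


npkeyto = ("am", ["am", "gm"])
invariable = {"am": "form-a", "gm": "form-a"}  # placeholder values
variable = {"am": "form-a", "gm": "form-b"}    # placeholder values
print(substitute_empty_pkey("", npkeyto, invariable))
print(repr(substitute_empty_pkey("", npkeyto, variable)))
```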

Some property values may have been manually decorated with disambiguation markers (¤), to differentiate them from property values of another derivation which would otherwise appear equal under a certain normalization. By default such markers are removed, but instead they can be substituted with a string given by disamb parameter.
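
In code, the treatment of disambiguation markers amounts to a plain string replacement, as in this sketch. The function name is hypothetical, and the zero-width space used in the second call is only an example of an invisible replacement string, not necessarily the character any trapnakron constructor uses.

```python
DISAMB_MARKER = "\u00a4"  # the ¤ character


def handle_disamb(value, disamb=""):
    """Remove disambiguation markers, or replace them with disamb."""
    return value.replace(DISAMB_MARKER, disamb)


decorated = "Abc\u00a4"
print(handle_disamb(decorated))                     # marker removed
print(repr(handle_disamb(decorated, "\u200b")))     # marker substituted
```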

Some derivations are defined only for the purpose of obtaining their properties in scripted translations at runtime. By default they are not included, but they can be included by setting the runtime parameter to True.

Parameters:
  • envec (string or None) - primary environment for Ekavian Cyrillic derivation
  • envel (string or None) - primary environment for Ekavian Latin derivation
  • envic (string or None) - primary environment for Ijekavian Cyrillic derivation
  • envil (string or None) - primary environment for Ijekavian Latin derivation
  • markup (string) - target markup
  • tagmap (dict string -> string) - tags to assign to properties by derivation keys
  • ptsuff (string) - derivation key suffix to report plain text properties
  • ltsuff (string) - derivation key suffix to report properties in lighter markup
  • gnsuff ([(string, string)*]) - suffixes by gender and number, to have no resolution if gender or number do not match
  • stsuff ((string, string)) - derivation key and environment name suffixes to report systematic transcriptions
  • adsuff ((string, string) or ((string, string)*)) - derivation key and environment name suffixes to report alternative derivations
  • nmsuff ((string, string)) - suffixes for fetching only first or last name of a person
  • npkeyto (string or (string, [string*])) - property key to substitute for empty key, when given
  • nobrhyp (bool) - whether to convert some ordinary into non-breaking hyphens
  • disamb (string) - string to replace each disambiguation marker with
  • runtime (bool) - whether to include runtime-only derivations
Returns: Synder
trapnakron derivator

rootdir()

 

Get root directory to trapnakron derivation files.

Returns: string
root directory path

trapnakron_plain(envec=u'', envel=u'л', envic=u'иј', envil=u'ијл')

 

Constructs trapnakron suitable for application to plain text.

Calls trapnakron with the following setup:

  • Markup is plain text (plain).
  • Suffixes: _rm ("rod muski") for resolving the property value only if it is of masculine gender, _rz for feminine, _rs for neuter; _s for systematic transcription, _a, _a2 and _a3 for other alternatives; _im and _pr for a person's first and last name, respectively.
  • Ordinary hyphens are heuristically replaced with non-breaking hyphens.
  • An empty property key is converted into am (accusative masculine descriptive adjective), provided that its value is equal to that of gm (genitive masculine descriptive adjective); i.e. if the descriptive adjective is invariable.

trapnakron_ui(envec=u'', envel=u'л', envic=u'иј', envil=u'ијл')

 

Constructs trapnakron suitable for application to UI texts.

Like trapnakron_plain, except that disambiguation markers are not removed but substituted with an invisible character, and runtime-only derivations are included too.

Retaining disambiguation markers is useful when a normalized form (typically nominative) is used at runtime as the key to fetch other properties of the derivation, and the normalization is such that it would fold two different derivations onto the same key if the originating forms were left undecorated.

trapnakron_docbook4(envec=u'', envel=u'л', envic=u'иј', envil=u'ијл', tagmap=None)

 

Constructs trapnakron suitable for application to Docbook 4 texts.

Calls trapnakron with the following setup:

  • Markup is Docbook 4 (docbook4).
  • Suffixes: _ot ("obican tekst") for plain-text properties, _lv ("laksa varijanta") for a lighter variant of the markup. Lighter markup currently applies to people names (no outer <personname>, e.g. when it should be elided due to particular text segmentation on Docbook->PO extraction). The suffixes of trapnakron_plain are available as well.
  • Non-breaking hyphens and empty property keys are treated like in trapnakron_plain.

norm_pkey(pkey)

 

Normalize internal property keys in trapnakron.

Parameters:
  • pkey (string or (string*) or [string*]) - property key or keys to normalize
Returns: as input
normalized keys
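
The "as input" return type means that a single key comes back as a single key, a tuple as a tuple, and a list as a list. The sketch below shows only this container-preserving shape. The function names are hypothetical, and the per-key normalization (_placeholder_norm) is a placeholder, since the actual normalization rule is not spelled out here.

```python
def _placeholder_norm(key):
    # Placeholder for the real per-key normalization.
    return key.strip().lower()


def norm_keys_like_input(pkey, norm_one=_placeholder_norm):
    """Normalize one key or a sequence of keys, preserving the input type."""
    if isinstance(pkey, tuple):
        return tuple(norm_one(k) for k in pkey)
    if isinstance(pkey, list):
        return [norm_one(k) for k in pkey]
    return norm_one(pkey)


print(norm_keys_like_input(" AM "))
print(norm_keys_like_input((" AM ", "Gm")))
print(norm_keys_like_input([" AM ", "Gm"]))
```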

norm_rtkey(text)

 

Normalize text into runtime key for translation scripting.

Parameters:
  • text (string) - text to normalize into runtime key
Returns: string
runtime key