Module resolve

Replace value-defining segments in text with their values.


Author: Chusslove Illich (Часлав Илић) <caslav.ilic@gmx.net>

License: GPLv3

Functions
(string, [string...], [string...])
resolve_entities(text, entities, ignored=set([]), srcname=None, vfilter=None, undefrepl=None)
Replace XML entities in the text with their values.
string
resolve_entities_simple(text, entities, ignored=set([]), srcname=None, vfilter=None)
As resolve_entities, but returns only the resolved text.
string, int, bool
resolve_alternatives(text, select, total, althead='~@', altfilter=None, outfilter=None, condf=None, srcname=None)
Replace alternatives directives in the text with the selected alternative.
string
resolve_alternatives_simple(text, select, total, althead='~@', altfilter=None, outfilter=None, condf=None, srcname=None)
As resolve_alternatives, but returns only the resolved text.
string
first_to_case(text, upper=True, nalts=0, althead='~@')
Change case of the first letter in the text.
 
first_to_upper(text, nalts=0, althead='~@')
Uppercase the first letter in the text.
 
first_to_lower(text, nalts=0, althead='~@')
Lowercase the first letter in the text.
 
expand_vars(text, varmap, head='%')
Expand variables in the text.
string
remove_accelerator(text, accels=None, greedy=False)
Remove accelerator from the text.
string
remove_fmtdirs(text, format, subs='')
Remove format directives from the text.
string
remove_literals(text, subs='', substrs=[], regexes=[], heuristic=True)
Remove literal substrings from the text.
(cat) -> numerr
convert_plurals(mapping, plhead)
Convert plural forms in the catalog [hook factory].
Variables
  DEFAULT_ALTHEAD = '~@'
  __package__ = 'pology'
Function Details

resolve_entities(text, entities, ignored=set([]), srcname=None, vfilter=None, undefrepl=None)

 

Replace XML entities in the text with their values.

Entity values are defined by the supplied dictionary of name-value pairs. Not all entities need to be replaced; some can be explicitly ignored. If an entity is neither defined nor ignored and srcname is given, a warning is reported to standard output.

By default, an undefined entity is left untouched in the resulting text. Alternatively, the undefrepl parameter can supply a string to substitute for every undefined entity, or a function which takes the undefined entity name and returns the string to substitute.

Parameters:
  • text (string) - the text to transform
  • entities (has .get() with dict.get() semantics) - entity name-value pairs
  • ignored (a sequence or (string)->bool) - entities to ignore; a sequence of entity names, or a function taking the entity name and returning True if the entity is to be ignored
  • srcname (None or string) - if not None, report unknown entities to standard output, with this parameter as source identifier
  • vfilter (string or (string)->string) - format string (with single %s directive) or function to apply to every resolved entity value
  • undefrepl (string or (string)->string) - string or function to use in case of an undefined entity
Returns: (string, [string...], [string...])
the resulting text, resolved entity names, and unknown entity names
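
The behavior described above can be illustrated with a minimal, self-contained sketch. It is not pology's implementation: the name resolve_entities_sketch and the entity regex are assumptions made for illustration, srcname and vfilter are left out, and the code is written for Python 3.

```python
import re

# Hypothetical stand-in for illustration only; not pology's code.
# Assumed entity syntax: "&name;" with letters, digits, '_', '.', ':' or '-'.
_ENTITY_RX = re.compile(r"&([\w.:-]+);")


def resolve_entities_sketch(text, entities, ignored=(), undefrepl=None):
    resolved, unknown = [], []

    def is_ignored(name):
        # 'ignored' is either a predicate or a container of names.
        return ignored(name) if callable(ignored) else name in ignored

    def substitute(match):
        name = match.group(1)
        if is_ignored(name):
            return match.group(0)  # leave ignored entities untouched
        value = entities.get(name)
        if value is not None:
            resolved.append(name)
            return value
        unknown.append(name)
        if undefrepl is None:
            return match.group(0)  # default: keep the undefined entity
        return undefrepl(name) if callable(undefrepl) else undefrepl

    return _ENTITY_RX.sub(substitute, text), resolved, unknown


result, resolved, unknown = resolve_entities_sketch(
    "&app; &amp; &foo;", {"app": "Lokalize"}, ignored=["amp"], undefrepl="?")
```

Here result is "Lokalize &amp; ?", resolved is ["app"], and unknown is ["foo"].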

resolve_entities_simple(text, entities, ignored=set([]), srcname=None, vfilter=None)

 

As resolve_entities, but returns only the resolved text.

Returns: string
the resulting text

See Also: resolve_entities

resolve_alternatives(text, select, total, althead='~@', altfilter=None, outfilter=None, condf=None, srcname=None)

 

Replace alternatives directives in the text with the selected alternative.

Alternatives directives are of the form ~@/.../.../..., for example:

   I see a ~@/pink/white/ elephant.

where ~@ is the directive head, followed by a character that defines the delimiter of alternatives (as in a sed command). The number of alternatives per directive is not defined by the directive itself, but is provided as an external parameter.

An alternatives directive is resolved into one of its alternative substrings, selected by the given one-based index. Before the directive is substituted, the selected alternative can be filtered through the function given by the altfilter parameter. Text outside of directives can be filtered as well, piece by piece, through the function given by the outfilter parameter.

If an alternatives directive is malformed (e.g. too few alternatives), it may be reported to standard output. Unless all encountered directives were well-formed, the original text is returned instead of the partially resolved one.

Parameters:
  • text (string) - the text to transform
  • select (int > 0) - index of the alternative to select (one-based)
  • total (int > 0) - number of alternatives per directive
  • althead (string) - directive head to use instead of the default one
  • altfilter ((string) -> string) - filter to apply to chosen alternatives
  • outfilter ((string) -> string) - filter to apply to text outside of directives
  • condf (None or (x_1, ..., x_n) -> True/False) - resolve the current alternatives directive only when this function returns True when called with the alternatives as arguments
  • srcname (None or string) - if not None, report malformed directives to standard output, with this string as source identifier
Returns: string, int, bool
resulting text, number of resolved alternatives, and an indicator of well-formedness (True if all directives well-formed)
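
The parsing rule described above (head, delimiter character, then exactly total delimiter-terminated alternatives) can be sketched as follows. This is a simplified Python 3 illustration, not pology's code: resolve_alternatives_sketch is a hypothetical name, the filters, condf and srcname are omitted, and returning a count of 0 for malformed input is an assumption.

```python
def resolve_alternatives_sketch(text, select, total, althead="~@"):
    """Illustrative only: resolve '~@/a/b/' style directives."""
    pieces, pos, nresolved = [], 0, 0
    while True:
        start = text.find(althead, pos)
        if start < 0:
            pieces.append(text[pos:])  # trailing text after last directive
            break
        pieces.append(text[pos:start])
        cur = start + len(althead)
        if cur >= len(text):
            return text, 0, False  # head with no delimiter: malformed
        delim = text[cur]
        cur += 1
        alts = []
        for _ in range(total):
            end = text.find(delim, cur)
            if end < 0:
                # Too few alternatives: give back the original text.
                return text, 0, False
            alts.append(text[cur:end])
            cur = end + 1
        pieces.append(alts[select - 1])  # select is one-based
        nresolved += 1
        pos = cur
    return "".join(pieces), nresolved, True


first = resolve_alternatives_sketch("I see a ~@/pink/white/ elephant.", 1, 2)
second = resolve_alternatives_sketch("I see a ~@/pink/white/ elephant.", 2, 2)
broken = resolve_alternatives_sketch("I see a ~@/pink/ elephant.", 1, 2)
```

With these inputs, first is ("I see a pink elephant.", 1, True), second selects "white", and broken returns the original text with a False well-formedness flag.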

resolve_alternatives_simple(text, select, total, althead='~@', altfilter=None, outfilter=None, condf=None, srcname=None)

 

As resolve_alternatives, but returns only the resolved text.

Returns: string
the resulting text

first_to_case(text, upper=True, nalts=0, althead='~@')

 

Change case of the first letter in the text.

Text may also contain alternatives directives (see resolve_alternatives). In that case, if the first letter is found within an alternative, the case of the first letter in the other alternatives of the same directive is changed too.

If lowercasing is requested, it is not done if both the first and the second letter are uppercase (e.g. acronyms, all-caps writing).

Parameters:
  • text (string) - the text to transform
  • upper (bool) - whether to transform to uppercase (lowercase otherwise)
  • nalts (int) - if non-zero, the number of alternatives per directive
  • althead (string) - alternatives directive head instead of the default one
Returns: string
the resulting text
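
The acronym rule for lowercasing can be sketched as below. This is a hedged Python 3 illustration under simplifying assumptions: first_to_case_sketch is a hypothetical name, alternatives directives are not handled, and "second letter" is taken to mean the character immediately after the first letter.

```python
def first_to_case_sketch(text, upper=True):
    """Illustrative only: change case of the first letter, no directives."""
    for i, ch in enumerate(text):
        if ch.isalpha():
            break
    else:
        return text  # no letters in the text at all
    if upper:
        return text[:i] + ch.upper() + text[i + 1:]
    # Do not lowercase when the next character is uppercase as well
    # (acronyms, all-caps writing).
    following = text[i + 1:i + 2]
    if ch.isupper() and following.isupper():
        return text
    return text[:i] + ch.lower() + text[i + 1:]


raised = first_to_case_sketch('"quoted" text')
lowered = first_to_case_sketch("Hello world", upper=False)
kept = first_to_case_sketch("KDE rocks", upper=False)
```

Here raised is '"Quoted" text', lowered is "hello world", and kept stays "KDE rocks".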

first_to_upper(text, nalts=0, althead='~@')

 

Uppercase the first letter in the text.

A shortcut for first_to_case for uppercasing.

See Also: first_to_case

first_to_lower(text, nalts=0, althead='~@')

 

Lowercase the first letter in the text.

A shortcut for first_to_case for lowercasing.

See Also: first_to_case

expand_vars(text, varmap, head='%')

 

Expand variables in the text.

Expansion directives start with a directive head (head parameter), followed by a variable name consisting of alphanumeric characters and underscores; the name ends at the first character of any other kind. The variable name may also be explicitly delimited within braces. Variable values for substitution are looked up by name in the varmap dictionary; if a name is not found, NameError is raised.

Some examples:

   expand_vars("Mary had a little %mammal.", {"mammal":"lamb"})
   expand_vars("Quite a %{critic}esque play.", {"critic":"burl"})
   expand_vars("Lost in single ~A.", {"A":"parenthesis"}, "~")

Dictionary values are filtered as "%s" % value prior to substitution. The directive head may be escaped by doubling it.

Parameters:
  • text (string) - string to expand
  • varmap ((name, value) dictionary) - mapping of variable names to values
  • head (string) - opening sequence for expansion directive
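
The expansion rules above can be approximated with a single regular expression. The following Python 3 sketch is an illustration only, not pology's implementation; expand_vars_sketch is a hypothetical name.

```python
import re


def expand_vars_sketch(text, varmap, head="%"):
    """Illustrative only: expand %name and %{name}, with %% as escape."""
    h = re.escape(head)
    # Three cases after the head: doubled head, {name}, or bare name.
    rx = re.compile(h + r"(?:(" + h + r")|\{(\w+)\}|(\w+))")

    def substitute(match):
        if match.group(1):
            return head  # escaped head collapses to a single head
        name = match.group(2) or match.group(3)
        if name not in varmap:
            raise NameError("unknown variable '%s'" % name)
        return "%s" % varmap[name]

    return rx.sub(substitute, text)


plain = expand_vars_sketch("Mary had a little %mammal.", {"mammal": "lamb"})
braced = expand_vars_sketch("Quite a %{critic}esque play.", {"critic": "burl"})
tilde = expand_vars_sketch("Lost in single ~A.", {"A": "parenthesis"}, "~")
escaped = expand_vars_sketch("100%% %what", {"what": "sure"})
```

These calls give "Mary had a little lamb.", "Quite a burlesque play.", "Lost in single parenthesis.", and "100% sure", respectively.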

remove_accelerator(text, accels=None, greedy=False)

 

Remove accelerator from the text.

Accelerator markers are characters which determine which letter in the text will be used as the keyboard accelerator in the user interface. They are usually a single non-alphanumeric character, inserted before the letter which should be the accelerator, e.g. "Foo &Bar", "Foo _Bar", etc. Sometimes, especially in CJK texts, the accelerator letter is separated out in parentheses, at the start or end of the text, such as "Foo Bar (&B)".

This function will try to remove the accelerator in a smart way. E.g. it will ignore ampersand in "Foo & Bar", and completely remove a CJK-style accelerator.

If accels is None, the behavior depends on the value of greedy. If it is False, the text is returned as is. If it is True, some usual accelerator markers are considered: _, &, ~, and ^.

Parameters:
  • text (string) - text to clear of the accelerator
  • accels (sequence of strings or None) - possible accelerator markers
  • greedy (bool) - whether to try known markers if accels is None
Returns: string
text without the accelerator
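
A rough Python 3 sketch of the cases named above (marker before a letter, lone ampersand, CJK-style parenthesized accelerator). It is an assumption-laden illustration rather than pology's algorithm: remove_accelerator_sketch is a hypothetical name, and the real function may apply further heuristics.

```python
import re


def remove_accelerator_sketch(text, accels=None, greedy=False):
    """Illustrative only: strip one accelerator marker from the text."""
    if accels is None:
        if not greedy:
            return text  # nothing to look for
        accels = ["_", "&", "~", "^"]
    for accel in accels:
        a = re.escape(accel)
        # CJK style: "(&B)" at the very start or very end of the text.
        cjk = r"^\(" + a + r"\w\)\s*|\s*\(" + a + r"\w\)$"
        stripped, count = re.subn(cjk, "", text, count=1)
        if count:
            return stripped
        # Ordinary style: marker directly before an alphanumeric character.
        stripped, count = re.subn(a + r"(?=\w)", "", text, count=1)
        if count:
            return stripped
    return text


inline = remove_accelerator_sketch("Foo &Bar", ["&"])
lone = remove_accelerator_sketch("Foo & Bar", ["&"])
cjk_style = remove_accelerator_sketch("Foo Bar (&B)", ["&"])
guessed = remove_accelerator_sketch("Foo _Bar", greedy=True)
```

Here inline, cjk_style and guessed are all "Foo Bar", while lone keeps its ampersand because it is not followed by a letter.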

remove_fmtdirs(text, format, subs='')

 

Remove format directives from the text.

Format directives are used to substitute values in the text. An example text with directives in several formats:

   "%d men on a %s man's chest."  # C
   "%(num)d men on a %(attrib)s man's chest."  # Python
   "%1 men on a %2 man's chest." # KDE/Qt

Format is specified by a string keyword. The following formats are known at the moment: c, qt, c{kde}, c{python}. Format string may also have -format appended to the keyword, for compatibility with Gettext format flags.

Parameters:
  • text (string) - text from which to remove format directives
  • format (string) - format keyword
  • subs (string) - text to substitute for format directives instead of just removing them
Returns: string
text without format directives
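
As an illustration of the idea, the three example formats shown above can be matched with simplified regular expressions. This Python 3 sketch is not pology's implementation: remove_fmtdirs_sketch, the table keys "c", "python" and "qt", and the patterns themselves are assumptions that cover only common directives, and returning the text unchanged for an unknown keyword is likewise an assumption.

```python
import re

# Hypothetical, deliberately incomplete patterns for illustration only.
_FMTDIR_RX = {
    "c": re.compile(
        r"%(?:\d+\$)?[-+#0]*\d*(?:\.\d+)?(?:hh|h|ll|l|L|z|j|t)?"
        r"[diouxXeEfgGcs]"),
    "python": re.compile(
        r"%(?:\(\w+\))?[-+#0]*\d*(?:\.\d+)?[diouxXeEfgGcrs]"),
    "qt": re.compile(r"%\d{1,2}"),
}


def remove_fmtdirs_sketch(text, format, subs=""):
    """Illustrative only: replace format directives with 'subs'."""
    suffix = "-format"
    key = format[:-len(suffix)] if format.endswith(suffix) else format
    rx = _FMTDIR_RX.get(key)
    if rx is None:
        return text  # unknown keyword: leave text unchanged (assumption)
    # A function replacement keeps backslashes in 'subs' literal.
    return rx.sub(lambda match: subs, text)


c_out = remove_fmtdirs_sketch("%d men on a %s man's chest.", "c-format", "N")
py_out = remove_fmtdirs_sketch(
    "%(num)d men on a %(attrib)s man's chest.", "python", "N")
qt_out = remove_fmtdirs_sketch("%1 men on a %2 man's chest.", "qt", "N")
```

All three calls produce "N men on a N man's chest.".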

remove_literals(text, subs='', substrs=[], regexes=[], heuristic=True)

 

Remove literal substrings from the text.

Literal substrings are URLs, email addresses, web site names, command options, etc. This function will heuristically try to remove such substrings from the text.

Additional literals to remove may be specified as verbatim substrings (substrs parameter) and regular expressions (regexes). These are applied before the internal heuristic matchers. Heuristic removal may be entirely disabled by setting heuristic to False.

Parameters:
  • text (string) - text from which to remove literals
  • subs (string) - text to substitute for literals instead of just removing them
  • substrs (sequence of strings) - additional substrings to remove by direct string match
  • regexes (sequence of compiled regular expressions) - additional substrings to remove by regex match
  • heuristic (bool) - whether to apply heuristic at all
Returns: string
text without literals
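
The order of application described above (explicit substrings and regexes first, heuristics last) can be sketched as follows. This Python 3 code is only an illustration: remove_literals_sketch is a hypothetical name, and the three heuristic patterns are assumptions standing in for whatever matchers pology actually uses.

```python
import re

# Hypothetical heuristic matchers, for illustration only.
_HEURISTIC_RX = [
    re.compile(r"\b[a-z][a-z0-9+.-]*://\S+", re.I),   # URLs
    re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),    # email addresses
    re.compile(r"(?<!\S)--?[a-zA-Z][\w-]*"),          # command options
]


def remove_literals_sketch(text, subs="", substrs=(), regexes=(),
                           heuristic=True):
    """Illustrative only: user-supplied literals first, then heuristics."""
    for substr in substrs:
        text = text.replace(substr, subs)
    for rx in regexes:
        text = rx.sub(lambda match: subs, text)
    if heuristic:
        for rx in _HEURISTIC_RX:
            text = rx.sub(lambda match: subs, text)
    return text


mixed = remove_literals_sketch(
    "Mail bugs@example.org or run foo --help.", subs="X")
url = remove_literals_sketch("See http://example.org/x for details.", subs="X")
manual = remove_literals_sketch(
    "Version v1.2 of --tool", regexes=[re.compile(r"v\d+\.\d+")],
    heuristic=False)
```

Here mixed is "Mail X or run foo X.", url is "See X for details.", and manual is "Version  of --tool" since the heuristics are switched off.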

convert_plurals(mapping, plhead)

 

Convert plural forms in the catalog [hook factory].

Parameters:
  • mapping ([(int, int)*]) - The source to destination mapping of form indices. This is a list of tuples of source (before modification) to destination (after modification) indices. There must be no gaps in the destination indices, i.e. all indices from 0 up to maximum given destination index must exist in the mapping.
  • plhead (string) - The plural header value.
Returns: (cat) -> numerr
type F5A hook
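
The constraint on mapping (no gaps in destination indices) and the factory pattern can be illustrated with a small Python 3 sketch. It does not use pology's catalog classes: make_plural_remapper is a hypothetical name, the returned function works on a plain list of forms rather than on a catalog, and the assumption is that each destination form is copied from its source form.

```python
def make_plural_remapper(mapping):
    """Illustrative factory: validate the mapping, return a remapper."""
    dests = sorted({dst for _, dst in mapping})
    # Destination indices must be exactly 0, 1, ..., max without gaps.
    if dests != list(range(len(dests))):
        raise ValueError("gaps in destination indices: %s" % dests)

    def remap(forms):
        out = [None] * len(dests)
        for src, dst in mapping:
            out[dst] = forms[src]  # copy source form to destination slot
        return out

    return remap


# Two source forms expanded to three destination forms.
remap = make_plural_remapper([(0, 0), (1, 1), (1, 2)])
expanded = remap(["file", "files"])
```

Here expanded is ["file", "files", "files"], while a mapping such as [(0, 0), (1, 2)] is rejected because destination index 1 is missing.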