proper_words(text, markup=False, accels=None, format=None)
Extract proper words from the text.
Proper words are those one would expect to find in a dictionary,
or that at least have that latent quality (jargon, etc.), as opposed to
URLs, email addresses, shell variables, and the like.
The text may contain XML-like markup (<...> tags, entities, ...) or
keyboard accelerator markers. It may also be in a format known to
Gettext (e.g. c-format). If specified, these elements may influence
splitting.
- Parameters:
  - text (string) - the text to split
  - markup (bool) - whether the text contains markup tags
  - accels (sequence) - accelerator characters to ignore
  - format (None or string) - Gettext format flag
- Returns: list of strings - proper words
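
The behavior described above can be approximated with a short, self-contained sketch. This is not the library's implementation: the function name proper_words_sketch and every regular expression in it are assumptions made for illustration, and the real splitting rules are likely more thorough. It only shows how the markup, accels, and format arguments could each influence which tokens survive as words.

```python
import re

# All patterns below are illustrative assumptions, not the library's rules.
_TAG_RX = re.compile(r"<[^>]*>")            # XML-like tags
_ENTITY_RX = re.compile(r"&[\w.:-]+;")      # XML-like entities, e.g. &amp;
_NONWORD_RX = re.compile(
    r"\b\w+://\S+"        # URLs
    r"|\S+@\S+\.\S+"      # email addresses
    r"|\$\{?\w+\}?"       # shell variables
)
# Simplified printf-style directives, used when format == "c-format".
_CFORMAT_RX = re.compile(r"%[-+#0]*\d*(?:\.\d+)?[hlLqjzt]*[diouxXeEfgGcs%]")
# A word: letters, optionally joined by an apostrophe or hyphen.
_WORD_RX = re.compile(r"[^\W\d_]+(?:['-][^\W\d_]+)*")


def proper_words_sketch(text, markup=False, accels=None, format=None):
    """Hypothetical approximation of the documented behavior."""
    if markup:
        # Drop tags and entities before anything else touches "&".
        text = _TAG_RX.sub(" ", text)
        text = _ENTITY_RX.sub(" ", text)
    for accel in accels or ():
        # Remove accelerator markers so "&File" reads as "File".
        text = text.replace(accel, "")
    if format == "c-format":
        # Remove format directives so "%d" does not yield the word "d".
        text = _CFORMAT_RX.sub(" ", text)
    # Blank out tokens that are not dictionary-like words.
    text = _NONWORD_RX.sub(" ", text)
    return _WORD_RX.findall(text)


sample = ("Open the <b>&File</b> menu, see http://example.org "
          "or mail me@example.org about $HOME")
print(proper_words_sketch(sample, markup=True, accels=["&"]))
print(proper_words_sketch("Copied %d files to %s", format="c-format"))
```

In this sketch, entities are removed before accelerator markers so that an entity such as &amp; is not mistaken for an accelerator followed by a word. Without format="c-format", the second call would also return the stray letters of the format directives as words.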