
Module accent_spec

This module provides a way of safely specifying accent positions in running text.

If you have a transcription "I did not eat the orange ball.", you can attach a "+" symbol to the word "eat" like this: "+eat".

You enter an accent specification, which consists of words with a prefix (the prefix is anything that ends in a punctuation mark). The program matches up the words in the accent specification to the words in the transcription. You can have many words in the accent specification, if necessary. You can also disambiguate things by putting context words into the accent spec (without a prefix). All matching is done left-to-right.


Version: $Revision: 1.4 $

Classes
  BadMatchError
Functions
 
prefix(text_array, accent_spec, map_fcn=<function <lambda> at 0x3072aa0>)
This function takes an array of words and an accent spec.
 
suffix(text_array, accent_spec, map_fcn=<function <lambda> at 0x3072c08>)
Like prefix, but the accent markers are suffixes that follow the words rather than prefixes that precede them.
 
preshow(text_array, alignment, map_fcn=<function <lambda> at 0x3072cf8>)
Shows an alignment in a printable form.
 
sufshow(text_array, alignment, map_fcn=<function <lambda> at 0x3072de8>)
Shows an alignment in a printable form.
 
test()
Variables
  __package__ = 'gmisclib'

Imports: re


Function Details

prefix(text_array, accent_spec, map_fcn=<function <lambda> at 0x3072aa0>)

This function takes an array of words and an accent spec. It matches the accent spec to the words and returns an array of tuples that tell you where the accents are and what kind each one is. The optional map_fcn can be used to map other kinds of objects into an array of strings.

More specifically, an accent_spec is a whitespace-separated list of strings. Each string is a word from the text_array, with an optional prefix. The strings are matched in order to the words, and the output array is a list of (index_in_text_array, prefix_text) tuples.

So, if you have text_array = ['my', 'cat', 'is', 'my', 'cat'] and accent_spec="is +my", then prefix() will match "is" to text_array[2], and "+my" to text_array[3], and it will return [ (3, "+") ]. Note that "+is +my" does not imply that 'is' and 'my' are adjacent, and "+is +cat" simply returns [ (2, "+"), (4, "+") ].

If the accent_spec were "+my", then it would match text_array[0], and return [ (0, "+") ].
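
The left-to-right matching described above can be sketched in a few lines. This is a hypothetical re-implementation for illustration only, not the gmisclib source: the name match_prefixes is invented, map_fcn is omitted, spec words are assumed to consist only of letters, digits, and underscores, and a failed match raises ValueError here (how the real module uses BadMatchError is not shown on this page).

```python
import re

# Hypothetical sketch, not gmisclib's code.
# Group 1: optional prefix ending in a non-word character; group 2: the word.
_TOKEN = re.compile(r'(.*\W)?(\w+)$')


def match_prefixes(text_array, accent_spec):
    """Return [(index_in_text_array, prefix_text), ...] for prefixed tokens."""
    out = []
    start = 0  # matching is left to right, so never look before this index
    for token in accent_spec.split():
        m = _TOKEN.match(token)
        if m is None:
            raise ValueError("cannot parse spec token %r" % token)
        pfx, word = m.group(1) or "", m.group(2)
        try:
            i = text_array.index(word, start)
        except ValueError:
            raise ValueError("no match for %r at or after index %d" % (word, start))
        if pfx:  # context words (no prefix) only advance the position
            out.append((i, pfx))
        start = i + 1
    return out


words = ['my', 'cat', 'is', 'my', 'cat']
print(match_prefixes(words, "is +my"))    # [(3, '+')]
print(match_prefixes(words, "+is +cat"))  # [(2, '+'), (4, '+')]
print(match_prefixes(words, "+my"))       # [(0, '+')]
```

The three calls reproduce the examples given in the text: the unprefixed "is" serves only as context that moves the search position past text_array[2].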

Prefixes can be multiple characters, but they cannot end in letters, digits, or underscore.
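
One way to read this rule is as a regular expression that splits each spec token into a prefix and a word. The sketch below is an assumption about how such a split could be done, not the module's actual pattern; split_prefix is a hypothetical helper, and it assumes the word itself contains only letters, digits, and underscores.

```python
import re

# Hypothetical helper, not part of gmisclib.accent_spec.
# The prefix (if any) is everything up to and including the last
# character that is not a letter, digit, or underscore.
_TOKEN = re.compile(r'^(.*\W)?(\w+)$')


def split_prefix(token):
    """Split a spec token into (prefix, word); prefix may be empty."""
    m = _TOKEN.match(token)
    if m is None:
        raise ValueError("cannot split %r" % token)
    return (m.group(1) or "", m.group(2))


print(split_prefix("+eat"))     # ('+', 'eat')
print(split_prefix("eat"))      # ('', 'eat')
print(split_prefix("L+H*cat"))  # ('L+H*', 'cat')
```

Under this reading, a multi-character prefix such as "L+H*" is accepted because it ends in "*", while a token like "eat" has no prefix and acts as a context word.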

preshow(text_array, alignment, map_fcn=<function <lambda> at 0x3072cf8>)


Shows an alignment in a printable form. It puts the prefixes in the appropriate places.
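
As a rough illustration of what "printable form" could mean, the sketch below attaches each prefix to its word and joins the words with spaces. The function name show_prefixed and the space-joined output are assumptions; the real output format of preshow is not documented on this page, and map_fcn is left out.

```python
def show_prefixed(text_array, alignment):
    """Hypothetical sketch: render words with their prefixes attached.

    alignment is assumed to be a list of (index_in_text_array, prefix_text)
    tuples, as returned by prefix().
    """
    prefixes = dict(alignment)
    return " ".join(prefixes.get(i, "") + w for i, w in enumerate(text_array))


words = "I did not eat the orange ball".split()
print(show_prefixed(words, [(3, "+")]))  # I did not +eat the orange ball
```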

sufshow(text_array, alignment, map_fcn=<function <lambda> at 0x3072de8>)


Shows an alignment in a printable form. It puts the suffixes in the appropriate places.