
Module edit_distance

Levenshtein (edit) distance between two strings of symbols.

Classes
  text_cost
This is useful for differencing documents that have been parsed into lists of words.
Functions
 
dist(s, t, csub, cinsert, cdel)
Cost for converting string s into string t.

distf1(s, t, csub)
Cost for converting string s into string t.

distf2(s, t, csub)
Cost for converting string s into string t.

distf(s, t, csub)
Cost for converting string s into string t.

def_cost(a, b) : int
This is the default cost function for converting a into b.

free_sub(a, b) : int
This is a sample cost function for converting a into b.

free_del(a, b) : int
This is a sample cost function for converting a into b.

test()

test2()
Variables
  __package__ = 'gmisclib'

Imports: numpy


Function Details

dist(s, t, csub, cinsert, cdel)

Cost for converting string s into string t. This function takes three dictionaries that give the costs of substitution, insertion, and deletion.

Parameters:
  • s (string or array) - starting string
  • t (string or array) - ending string
  • csub (dict((a,b) : float)) - cost of converting a to b
  • cinsert (dict(a: float)) - cost of insertion
  • cdel (dict(a: float)) - cost of deletion
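
As an illustration of how dictionary-based costs can drive the computation, here is a minimal sketch of a standard dynamic-programming edit distance. It is not the module's source: the name dist_sketch is hypothetical, and the sketch assumes that every needed key is present in the dictionaries and that identical symbols cost nothing.

```python
def dist_sketch(s, t, csub, cinsert, cdel):
    """Edit cost of turning s into t with dictionary-based costs (illustrative only)."""
    n, m = len(s), len(t)
    # d[i][j] = cheapest cost of converting s[:i] into t[:j]
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + cdel[s[i - 1]]      # delete everything in s[:i]
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + cinsert[t[j - 1]]   # insert everything in t[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            a, b = s[i - 1], t[j - 1]
            # Assumption: identical symbols are free and are not looked up in csub.
            sub = 0.0 if a == b else csub[(a, b)]
            d[i][j] = min(d[i - 1][j] + cdel[a],      # delete a
                          d[i][j - 1] + cinsert[b],   # insert b
                          d[i - 1][j - 1] + sub)      # substitute a -> b
    return d[n][m]


# Unit costs over a three-symbol alphabet.
symbols = "abc"
cinsert = {x: 1.0 for x in symbols}
cdel = {x: 1.0 for x in symbols}
csub = {(x, y): 1.0 for x in symbols for y in symbols if x != y}
cost = dist_sketch("abc", "acb", csub, cinsert, cdel)
```

With these unit costs the sketch reduces to the ordinary Levenshtein distance; changing individual dictionary entries makes particular edits cheaper or more expensive.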

distf1(s, t, csub)

Cost for converting string s into string t. This function takes a function that computes the cost of insertions, deletions, or substitutions. (This is an alternative implementation of distf2().)

Parameters:
  • s (string or array) - starting string
  • t (string or array) - ending string
  • csub (function(a,b) : float) - cost of converting a to b

distf2(s, t, csub)

Cost for converting string s into string t. This function takes a function that computes the cost of insertions, deletions, or substitutions.

Parameters:
  • s (string or array) - starting string
  • t (string or array) - ending string
  • csub (function(a,b) : float) - cost of converting a to b

distf(s, t, csub)

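Cost for converting string s into string t. This function takes a single function that computes the cost of insertions, deletions, or substitutions. Judging by the cost functions documented for use with distf (def_cost, free_sub, free_del), the cost function receives None as a to indicate an insertion and None as b to indicate a deletion.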

Parameters:
  • s (string or array) - starting string
  • t (string or array) - ending string
  • csub (function(a,b) : float) - cost of converting a to b
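
To show how a single cost function can cover all three edit types, here is a minimal sketch that keeps only one row of the dynamic-programming table at a time. It is not the module's implementation: distf_sketch and case_cost are hypothetical names, and the sketch assumes the cost function returns 0 for identical symbols and follows the None convention documented for def_cost.

```python
def case_cost(a, b):
    """Hypothetical cost function: case changes are cheap, other edits cost 1."""
    if a == b:
        return 0.0
    if a is not None and b is not None and a.lower() == b.lower():
        return 0.5   # substitution that only changes case
    return 1.0       # insertion (a is None), deletion (b is None), or other substitution


def distf_sketch(s, t, csub):
    """Edit cost of turning s into t, with all costs supplied by csub(a, b)."""
    # row[j] = cheapest cost of converting the prefix of s seen so far into t[:j]
    row = [0]
    for b in t:
        row.append(row[-1] + csub(None, b))       # build t[:j] by insertions only
    for a in s:
        new = [row[0] + csub(a, None)]            # empty target: delete everything
        for j, b in enumerate(t, start=1):
            new.append(min(row[j] + csub(a, None),      # delete a
                           new[j - 1] + csub(None, b),  # insert b
                           row[j - 1] + csub(a, b)))    # substitute a -> b
        row = new
    return row[-1]


char_cost = distf_sketch("Cat", "cat", case_cost)
word_cost = distf_sketch(["the", "Quick", "fox"], ["the", "quick", "fox"], case_cost)
```

Because the sketch only indexes and iterates over s and t, the same code works on strings and on lists of words.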

def_cost(a, b)

This is the default cost function for converting a into b. Insertions, deletions, and substitutions all have a unit cost. Normally, this is passed as an argument to distf.

Parameters:
  • a - a symbol (or None to indicate an insertion).
  • b - a symbol (or None to indicate a deletion).
Returns: int
the cost of changing a into b
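
A unit-cost function of this kind can be sketched in a few lines. The name def_cost_sketch is hypothetical, and the zero cost for identical symbols is an assumption; the description above only states the costs of the three edit types.

```python
def def_cost_sketch(a, b):
    """Unit cost for any insertion, deletion, or substitution (illustrative only)."""
    if a == b:
        return 0   # assumption: leaving a symbol unchanged is free
    return 1       # a is None: insertion; b is None: deletion; otherwise substitution


insertion = def_cost_sketch(None, "x")
deletion = def_cost_sketch("x", None)
substitution = def_cost_sketch("x", "y")
```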

free_sub(a, b)

This is a sample cost function for converting a into b. Insertions and deletions have a unit cost; substitutions are free. Normally, this is passed as an argument to distf.

Parameters:
  • a - a symbol (or None to indicate an insertion).
  • b - a symbol (or None to indicate a deletion).
Returns: int
the cost of changing a into b
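
The cost scheme described for free_sub can be sketched as follows. The name free_sub_sketch is hypothetical and the body is an assumption based only on the description above.

```python
def free_sub_sketch(a, b):
    """Insertions and deletions cost 1; any substitution is free (illustrative only)."""
    if a is None or b is None:
        return 1   # insertion (a is None) or deletion (b is None)
    return 0       # substitution, including a == b


insertion = free_sub_sketch(None, "x")
substitution = free_sub_sketch("x", "y")
```

One consequence of this scheme is that the total conversion cost depends only on the difference in length between s and t, since every aligned pair of symbols can be substituted at no cost.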

free_del(a, b)

This is a sample cost function for converting a into b. Insertions and substitutions have a unit cost; deletions are free. Normally, this is passed as an argument to distf.

Parameters:
  • a - a symbol (or None to indicate an insertion).
  • b - a symbol (or None to indicate a deletion).
Returns: int
the cost of changing a into b
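
The cost scheme described for free_del can be sketched in the same way. The name free_del_sketch is hypothetical, and the zero cost for identical symbols is an assumption not stated in the description above.

```python
def free_del_sketch(a, b):
    """Insertions and substitutions cost 1; deletions are free (illustrative only)."""
    if b is None:
        return 0   # deletion of a
    if a == b:
        return 0   # assumption: leaving a symbol unchanged is free
    return 1       # insertion (a is None) or substitution


deletion = free_del_sketch("x", None)
insertion = free_del_sketch(None, "x")
```

A cost function like this makes the distance asymmetric: shortening a string by dropping symbols is free, while lengthening it is not.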