Package gmisclib :: Module cache :: Class cache_info

Class cache_info

This class manages a disk cache of arbitrary objects. It first constructs a unique name, based on information that you give, then you can dump data to that path, or load data from that path. An attempt to load data will either succeed or raise an exception; an attempt to dump will either succeed or silently fail.

Typical use:

       def cached_f(parameters):
               ci = cache_info(info=tuple(parameters))
               if ci is not None:
                       try:
                               return ci.load()
                       except (BadFileFormat, IOError, OSError):
                               pass
               o = f(parameters)
               if ci is not None:
                       ci.dump(o)
               return o

Note: This class assumes that the results it is caching are generated by a function f(parameters). You need to be careful to give all of the relevant parameters to cache_info, otherwise you can get the wrong results back. For instance, if you have five parameters and you forget to give parameters[2] to cache_info, it will happily store the values obtained with every different value of parameters[2] in the same slot. When you later call load, you'll get whatever happens to be stored in that slot, even if it was computed with a different parameters[2] than the one you wanted.
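The pitfall described in the note can be reproduced with a toy in-memory cache. This is only a sketch: the dictionary _slots, the function f, and the key_indices argument are hypothetical stand-ins, not part of gmisclib, and the dictionary replaces the disk cache so that the example runs on its own.

```python
# Toy in-memory stand-in for the disk cache; none of this is gmisclib code.
_slots = {}

def f(parameters):
    """Hypothetical expensive function whose result depends on every parameter."""
    return sum(parameters)

def cached_f(parameters, key_indices):
    """Cache f(parameters), building the key only from the listed indices."""
    key = tuple(parameters[i] for i in key_indices)
    if key in _slots:
        return _slots[key]       # cache hit: the stored value is trusted blindly
    o = f(parameters)
    _slots[key] = o
    return o

full = [0, 1, 2, 3, 4]
partial = [0, 1, 3, 4]           # forgets parameters[2]

a = cached_f([1, 2, 3, 4, 5], partial)    # computed and stored
b = cached_f([1, 2, 99, 4, 5], partial)   # same key, so the stale value comes back
c = cached_f([1, 2, 99, 4, 5], full)      # complete key, so f is recomputed
```

Here b equals a even though parameters[2] changed, while c reflects the new value because its key includes every parameter.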

Instance Methods
 
__init__(self, root, info=(), fname=None, modname=None, mod=None)
x.__init__(...) initializes x; see help(type(x)) for signature

__repr__(self)
repr(x)

copy(self)

addinfo(self, *s, **kv)

makespace(self, avoid)

dump(self, e)
Cache some data on the disk. Returns: anything picklable.

bg_dump(self, e)

load(self)
Pull in some data from the disk. Returns: anything picklable.

cachepath(self)
Return a pathname suitable for caching some result. Returns: tuple(str, str).

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __str__, __subclasshook__

Class Variables
  Age = 864000
  NumObj = 10000
  Errors = (<type 'exceptions.IOError'>, <type 'exceptions.EOFEr...
Properties

Inherited from object: __class__

Method Details

__init__(self, root, info=(), fname=None, modname=None, mod=None)
(Constructor)

x.__init__(...) initializes x; see help(type(x)) for signature

Parameters:
  • info (tuple(anything)) - This is where you specify the parameters from which the cached value can be computed. It is essentially a look-up key for the value.
  • fname (str) - You can specify that the cached value depends on the contents of a file (in addition to other parameters). See fileinfo for details.
  • modname (str or tuple(str)) - You can specify that the value depends on a module (or on several modules), given by name. The class then tries to detect changes to the specified modules. See namedModInfo for details. Use this argument to protect yourself against changes to the code used to compute the cached value: you don't want to load a value produced by last week's buggy implementation.
  • mod (module or tuple(module)) - You can specify that the value depends on a module (or on several modules), given as the module objects themselves. The class then tries to detect changes to the specified modules. See modinfo for details.
Overrides: object.__init__
Notes:
  • Certain compromises were made in the handling of modname and mod. Even if you use them, you are not 100% guaranteed to be protected from all changes to the code used to compute the cached values. To be entirely safe, you should manually clear the cache whenever you change your code. However, these arguments will probably save your tail if you forget to clear the cache. See modinfo for details.
  • If there is an error when reading files (which can only happen if fname, modname, or mod is specified), then the object will be constructed with info=None. Calling load on such an object raises an OSError, which is the same thing you'd get from a cache miss. Calling dump or bg_dump will silently do nothing.
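One way to picture what "detecting changes to a module" can mean is to fold a fingerprint of the module's file into the look-up key. The sketch below is an assumption for illustration only: module_fingerprint and make_key are hypothetical helpers, and nothing here is taken from modinfo or namedModInfo, whose actual strategy may differ.

```python
import hashlib
import json  # any importable module would do; json is just a convenient example

def module_fingerprint(mod):
    """Hypothetical helper: hash the bytes of the file a module was loaded from."""
    path = getattr(mod, '__file__', None)
    if path is None:
        return None              # e.g. a built-in module with no file
    try:
        with open(path, 'rb') as fh:
            return hashlib.sha1(fh.read()).hexdigest()
    except (IOError, OSError):
        return None              # unreadable file: no fingerprint available

def make_key(info, mod=None):
    """Hypothetical helper: append the module fingerprint to the look-up key."""
    key = tuple(info)
    if mod is not None:
        key = key + (module_fingerprint(mod),)
    return key

key_plain = make_key((1, 2))
key_with_mod = make_key((1, 2), mod=json)
```

A fingerprint like this only covers the one file it reads. Changes in anything that module imports go unnoticed, which is one reason a scheme of this kind cannot give a complete guarantee.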

__repr__(self)
(Representation operator)

repr(x)

Overrides: object.__repr__
(inherited documentation)

addinfo(self, *s, **kv)

Note: This does not modify self! It creates a new object.
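The practical consequence is that the return value must be kept; calling addinfo and discarding the result has no effect. A minimal sketch of the copy-then-extend pattern follows, using a hypothetical KeyInfo class that only tracks the look-up key. It ignores the **kv arguments, whose handling is not described here, and it is not the gmisclib implementation.

```python
import copy

class KeyInfo(object):
    """Hypothetical stand-in for cache_info that only tracks the look-up key."""

    def __init__(self, info=()):
        self.info = tuple(info)

    def copy(self):
        return copy.copy(self)

    def addinfo(self, *s):
        """Return a new object with s appended to the key; self is untouched."""
        tmp = self.copy()
        tmp.info = tmp.info + s
        return tmp

base = KeyInfo(info=('alpha',))
extended = base.addinfo(3, 'beta')   # keep the result: base itself is unchanged
```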

dump(self, e)

Cache some data on the disk.

Parameters:
  • e (anything picklable) - the data to write.
Returns: anything picklable - whatever was passed as e.

Note: This function might quietly fail. Since this is a cache, failure to write is not considered a major problem. In my experience, failure to write is often caused by intermittent network problems, and you don't want it to crash a long-running computation.
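The "write if you can, otherwise carry on" behaviour can be sketched as below. The helper quiet_dump and the file names are hypothetical, and the code uses the standard pickle module directly rather than anything from gmisclib.

```python
import os
import pickle
import tempfile

def quiet_dump(e, path):
    """Hypothetical helper: try to pickle e to path, ignoring write errors."""
    try:
        with open(path, 'wb') as fh:
            pickle.dump(e, fh)
    except (IOError, OSError):
        pass                     # a failed cache write is not fatal
    return e                     # the caller gets the data back either way

tmpdir = tempfile.mkdtemp()
good = os.path.join(tmpdir, 'result.pickle')
bad = os.path.join(tmpdir, 'no_such_dir', 'result.pickle')

r1 = quiet_dump([1, 2, 3], good)   # written to disk
r2 = quiet_dump([1, 2, 3], bad)    # directory is missing: nothing written, no exception
```

Because the data is returned in both cases, a long-running computation can wrap its results in such a call without checking whether the write succeeded.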

load(self)

Pull in some data from the disk.

Returns: anything picklable - whatever was cached on disk.
Raises:
  • BadFileFormat - when the data isn't valid.
  • OSError - on cache miss.
  • IOError - e.g. network problems.
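A caller normally treats all three exceptions as a cache miss and recomputes. The sketch below shows how the cases can arise and be told apart. It is written for current Python, where IOError is an alias of OSError. The BadFileFormat class defined here is a local stand-in for gmisclib.cache.BadFileFormat, and load_or_raise and outcome are hypothetical helpers; the mapping from pickle errors to BadFileFormat is an assumption.

```python
import os
import pickle
import tempfile

class BadFileFormat(Exception):
    """Local stand-in for gmisclib.cache.BadFileFormat."""

def load_or_raise(path):
    """Hypothetical loader: OSError if the file is absent, BadFileFormat if unreadable."""
    with open(path, 'rb') as fh:             # a missing file raises OSError here
        try:
            return pickle.load(fh)
        except (EOFError, pickle.UnpicklingError) as x:
            raise BadFileFormat(str(x))

def outcome(path):
    """Classify a load attempt the way a caller of load() might."""
    try:
        return ('hit', load_or_raise(path))
    except BadFileFormat:
        return ('bad', None)
    except OSError:
        return ('miss', None)

tmpdir = tempfile.mkdtemp()
missing = os.path.join(tmpdir, 'missing.pickle')
empty = os.path.join(tmpdir, 'empty.pickle')
valid = os.path.join(tmpdir, 'valid.pickle')
open(empty, 'wb').close()                    # a zero-length file is not a valid pickle
with open(valid, 'wb') as fh:
    pickle.dump({'a': 1}, fh)

results = [outcome(missing), outcome(empty), outcome(valid)]
```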

cachepath(self)

Return a pathname suitable for caching some result.

Returns: tuple(str, str) - (path_to_root, path_with_tail). path_to_root is (or will be) a directory; path_with_tail is a path to a data file within that directory. Normally, the actual cache file is at the location os.path.join(path_to_root, path_with_tail) on the disk; that is what you would pass as the fname argument to load_cache or dump_cache.
Raises:
  • ValueError - if you haven't specified any info yet.
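To make the shape of the return value concrete, here is a sketch of how a (path_to_root, path_with_tail) pair could be derived from the info tuple. The function sketch_cachepath, the use of an MD5 digest of repr(info), the two-level directory layout, and the '/tmp/cache' root are all assumptions chosen for illustration; the naming scheme actually used by cachepath is not documented here.

```python
import hashlib
import os

def sketch_cachepath(root, info):
    """Hypothetical path builder: returns (path_to_root, path_with_tail)."""
    if not info:
        raise ValueError("no info specified")
    # Derive a reproducible name from the look-up key.
    digest = hashlib.md5(repr(tuple(info)).encode('utf-8')).hexdigest()
    path_to_root = root
    path_with_tail = os.path.join(digest[:2], digest[2:] + '.pickle')
    return (path_to_root, path_with_tail)

p1 = sketch_cachepath('/tmp/cache', ('a', 1))
p2 = sketch_cachepath('/tmp/cache', ('a', 1))   # same info, same path
p3 = sketch_cachepath('/tmp/cache', ('a', 2))   # different info, different path

# The file that would hold the cached value:
full_path = os.path.join(p1[0], p1[1])
```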

Class Variable Details

Errors

Value:
(<type 'exceptions.IOError'>,
 <type 'exceptions.EOFError'>,
 <class 'cPickle.UnpicklingError'>,
 <class 'gmisclib.cache.BadFileFormat'>)