"Roll your own" persistence

This page of contributed documentation describes a few methods for
implementing object persistence in "pure Perl."

In general, these methods require save and restore operations to be
coded explicitly.

The persistence that core Perl includes is provided by the older dbmopen
and dbmclose functions (largely superseded by tie) and, more generally, by
the tie function used together with a database library module. These
facilities allow a string value to be stored under a unique key.
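
As a minimal sketch of the tie approach, the following binds a hash to an
SDBM_File database (one of the DBM modules bundled with Perl), stores a string
under a key, and reads it back through a second tie. The file location and the
key name are made up for illustration; another DBM module with the same tie
interface could be substituted.

```perl
use strict;
use warnings;
use Fcntl;                    # O_RDWR, O_CREAT
use SDBM_File;                # DBM implementation bundled with Perl
use File::Temp qw(tempdir);

# Hypothetical location: a scratch directory, so the sketch cleans up after itself.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/demo";

# First "run": bind a hash to the DBM file and store a string under a key.
my %db;
tie(%db, 'SDBM_File', $file, O_RDWR | O_CREAT, 0660)
    or die "Cannot tie $file: $!";
$db{'event-1'} = 'Team meeting';
untie %db;

# Second "run": a fresh hash tied to the same file sees the stored value.
my %again;
tie(%again, 'SDBM_File', $file, O_RDWR, 0660)
    or die "Cannot re-tie $file: $!";
my $stored = $again{'event-1'};
untie %again;

print "$stored\n";
```

Everything stored this way is a flat string, which is why the rest of this
page is about turning objects into strings and back.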

Given that basic building block, the problem of making data persist can be stepwise
refined into mapping the data into unique keys and storable values.  The complexity
of this process depends on the data in question, and how it is represented within
the program.

For example, consider an event object that might be used in a calendaring system.  It
has a unique key called

 EVENTID

and fields including

 TITLE
 DESCRIPTION
 LINK
 TIME
 DATE
 DAYOFWEEK
 RESPONSIBLEPERSON
 RSVPTO

The internal representation of this object is a blessed array reference. The
array contains nine fields: field 0 is the event ID and fields 1..8 hold the data, in the
order listed above.  The CalendarEvent package includes appropriate accessor
functions for operating on these objects.
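
One way such a package might be laid out is sketched here. The constructor
name (new), the accessor names, and the sample values are assumptions made for
illustration; only the field order comes from the description above.

```perl
use strict;
use warnings;

package CalendarEvent;

# Field positions inside the blessed array, in the order listed above.
use constant {
    EVENTID => 0, TITLE => 1, DESCRIPTION => 2, LINK => 3, TIME => 4,
    DATE => 5, DAYOFWEEK => 6, RESPONSIBLEPERSON => 7, RSVPTO => 8,
};

# Hypothetical constructor: takes the nine values in field order.
sub new {
    my ($class, @values) = @_;
    die "expected 9 fields, got " . scalar(@values) unless @values == 9;
    return bless [@values], $class;
}

# Hypothetical accessors; a real package would provide one per field.
sub EventID { $_[0][EVENTID] }
sub Title {
    my $self = shift;
    $self->[TITLE] = shift if @_;    # act as a setter when given a value
    return $self->[TITLE];
}

package main;

my $event = CalendarEvent->new(
    'ev42', 'Team meeting', 'Weekly sync', 'http://example.com/sync',
    '10:00', '2001-06-04', 'Monday', 'Pat', 'pat@example.com',
);
$event->Title('Team sync');
print $event->EventID, ': ', $event->Title, "\n";
```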

"Save"  and "Restore" routines might look like this:

package CalendarEvent;

use Fcntl qw(:flock);    # LOCK_SH, LOCK_EX, LOCK_UN

sub GetCalendarDBWriteLock {
        # open the lock file and lock it exclusively
        open(LOCK, ">>", ".CalDB_Lockfile") || die "Cannot open lock file: $!";
        flock(LOCK, LOCK_EX) || die "Cannot lock lock file: $!";
}
sub GetCalendarDBReadLock {
        # open the lock file and take a shared lock
        open(LOCK, ">>", ".CalDB_Lockfile") || die "Cannot open lock file: $!";
        flock(LOCK, LOCK_SH) || die "Cannot lock lock file: $!";
}
sub UnLockCalendarDB {
        flock(LOCK, LOCK_UN);
        close(LOCK);
}

sub Save {
        my %Events;
        my $E = shift;
        my $EventID = $$E[0];
        # join fields 1..8; the object itself is left untouched
        my $Erecord = join($;, @$E[1..$#$E]);    # adjust depending on data.
        GetCalendarDBWriteLock();
        dbmopen(%Events, "data/CalendarEvents", 0660) || die "Cannot open event database: $!";
          $Events{$EventID} = $Erecord;
        dbmclose(%Events);
        UnLockCalendarDB();
}

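sub Restore {
        my %Events;
        my $EventID = shift;
        my $Erecord;
        GetCalendarDBReadLock();
        dbmopen(%Events, "data/CalendarEvents", 0660) || die "Cannot open event database: $!";
          $Erecord = $Events{$EventID};
        dbmclose(%Events);
        UnLockCalendarDB();
        return unless defined $Erecord;    # no such event
        # a limit of 8 keeps trailing empty fields
        return bless [$EventID, split(/\Q$;\E/, $Erecord, 8)];    # adjust depending on data.
}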
 
 

For maintainability, have all accesses in and out of the database go through
as few routines as possible. That way, when your project outgrows database
files and really does need to use SQL, there are only two routines to rewrite.

For speed, hold the locks for as little time as possible.  The above code
does "lock, open, act, close, unlock" with nothing but a single hash access
happening within the act phase.
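
The same sequence can be captured in a single helper so that every caller
follows it. This is a sketch only: the helper name with_locked_db, the
callback convention, and the scratch-directory paths are assumptions and are
not part of the code above.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);

# Hypothetical paths; a scratch directory keeps the sketch self-contained.
my $dir      = tempdir(CLEANUP => 1);
my $lockfile = "$dir/.CalDB_Lockfile";
my $dbfile   = "$dir/CalendarEvents";

# Run $action against the DBM hash while holding the lock in $mode
# (LOCK_SH for readers, LOCK_EX for writers), then release everything.
sub with_locked_db {
    my ($mode, $action) = @_;
    open(my $lock, '>>', $lockfile) or die "Cannot open $lockfile: $!";
    flock($lock, $mode)             or die "Cannot lock $lockfile: $!";
    my %Events;
    dbmopen(%Events, $dbfile, 0660) or die "Cannot open $dbfile: $!";
    my $result = $action->(\%Events);    # keep this step as small as possible
    dbmclose(%Events);
    close($lock);                        # closing the handle drops the lock
    return $result;
}

# Build the record before taking the lock; interpret it after releasing it.
my $record = join($;, 'Team meeting', 'Weekly sync');
with_locked_db(LOCK_EX, sub { $_[0]{ev42} = $record; 1 });
my $fetched = with_locked_db(LOCK_SH, sub { $_[0]{ev42} });
my @fields  = split(/\Q$;\E/, $fetched, -1);
print "$fields[0]\n";
```

The callbacks do one hash access each; joining and splitting the record happen
outside the locked region.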

The above code relies on the data not containing the $; character (the
subscript separator, "\034" by default).  This may be an unsafe assumption.
If you have permission to eliminate stray characters from your data, one
possibility is to alter the Save routine to strip the separator character
from the data, like so:

        # s///r returns the changed copy (Perl 5.14+); plain s/// would return a count
        my $Erecord = join($;, map { s/\Q$;\E//gr } @$E[1..$#$E]);    # adjust depending on data.

Another is to encode all data somehow, and then decode it in the Restore
routine:

        # in Save (s///r needs Perl 5.14+):
        my $Erecord = join($;, map { s/([^\w ])/sprintf('%%%02X', ord($1))/ger } @$E[1..$#$E]);
 

        # in Restore:
        bless [$EventID, map { s/%([0-9A-Fa-f]{2})/chr(hex($1))/ger } split(/\Q$;\E/, $Erecord, 8)];
       # Of course, this can be made clearer by rewriting with named intermediate variables.
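
Taking up that suggestion, here is one way the encoding might look with named
helpers and intermediate variables. The helper names are hypothetical, and the
sketch assumes byte-oriented data (character codes below 256), since wider
characters would not fit in two hex digits.

```perl
use strict;
use warnings;

# Percent-encode everything except word characters and spaces, so the
# encoded text can never contain the separator or a stray '%'.
sub encode_field {
    my ($text) = @_;
    (my $encoded = $text) =~ s/([^\w ])/sprintf('%%%02X', ord($1))/ge;
    return $encoded;
}

# Reverse of encode_field: turn each %XX back into the original character.
sub decode_field {
    my ($encoded) = @_;
    (my $text = $encoded) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

sub fields_to_record {
    my @fields = @_;
    return join($;, map { encode_field($_) } @fields);
}

sub record_to_fields {
    my ($record) = @_;
    # A limit of -1 keeps trailing empty fields.
    return map { decode_field($_) } split(/\Q$;\E/, $record, -1);
}

# Sample data that contains the separator, '%', '=', '&' and empty fields.
my @original = ('50% off' . $; . 'today', 'a=b&c', '', '');
my $record   = fields_to_record(@original);
my @restored = record_to_fields($record);

my $same = @restored == @original
        && !grep { $restored[$_] ne $original[$_] } 0 .. $#original;
print $same ? "round trip ok\n" : "round trip FAILED\n";
```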
 
 

If your objects are more complex than a fixed list of fields, these basic
building blocks can be combined in more elaborate ways.  For instance, objects
with named fields whose names are not known in advance can be handled by
composing and decomposing the record string with code that walks the keys of
the hash, such as

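        # in Save ($R is the hash reference being stored; s///r needs Perl 5.14+):
        my $Record = join('&', map { join('=', map { s/([^\w ])/sprintf('%%%02X', ord($1))/ger } ($_, $$R{$_})) } keys %$R);
 

        # in Restore (the limit of 2 keeps empty values):
        bless {IDKEY => $EventID, map { s/%([0-9A-Fa-f]{2})/chr(hex($1))/ger } map { split(/=/, $_, 2) } split(/&/, $Record)};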
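
Spelled out with named helpers, the same key=value&key=value scheme might look
like the sketch below. The function names and the sample hash are assumptions
for illustration; keys are sorted only so that the record string is
deterministic.

```perl
use strict;
use warnings;

sub encode_field {
    my ($text) = @_;
    (my $encoded = $text) =~ s/([^\w ])/sprintf('%%%02X', ord($1))/ge;
    return $encoded;
}

sub decode_field {
    my ($encoded) = @_;
    (my $text = $encoded) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Hash reference -> "key=value&key=value" string.
sub hash_to_record {
    my ($R) = @_;
    return join('&',
        map { encode_field($_) . '=' . encode_field($R->{$_}) } sort keys %$R);
}

# Record string -> hash reference.
sub record_to_hash {
    my ($record) = @_;
    my %fields;
    for my $pair (split(/&/, $record)) {
        # A limit of 2 yields an empty string (not undef) for empty values.
        my ($key, $value) = split(/=/, $pair, 2);
        $fields{ decode_field($key) } = decode_field($value);
    }
    return \%fields;
}

my %event    = (TITLE => 'Q&A session', NOTE => 'bring 2=two pens', ROOM => '');
my $record   = hash_to_record(\%event);
my $restored = record_to_hash($record);
print "$record\n";
```

Because '=' and '&' inside keys or values are percent-encoded, the two
delimiters only ever appear where the record structure puts them.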
 
 

If you know your data might contain references, there are two main possibilities short of adopting
an object persistence framework.  The first is to identify references and dereference them, storing
copies of the data they point to and losing the reference relation.  The second is to devise a notation
for marking part of an object as a reference, keep a cache of the objects that have already been read in,
and support both in your Save and Restore functions.  At this point, however, you are getting very close
to reinventing the wheel, and you might consider going with a known-good framework if you are not
confident in your skills.
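
To make the second possibility concrete, here is a deliberately small sketch.
Everything in it is an assumption made for illustration: objects are hashes
with an ID field, a plain hash stands in for the DBM file, and values are
tagged "S:" for strings and "R:" for references to other stored objects.

```perl
use strict;
use warnings;

my %store;    # stands in for the DBM-tied hash
my %cache;    # objects already restored, keyed by ID

sub enc { (my $s = $_[0]) =~ s/([^\w ])/sprintf('%%%02X', ord($1))/ge; $s }
sub dec { (my $s = $_[0]) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge; $s }

# Save one object. A value that is a reference is written as "R:<id>"
# (and the referenced object is saved too); anything else as "S:<text>".
sub save_object {
    my ($obj, $seen) = @_;
    $seen ||= {};
    return if $seen->{ $obj->{ID} }++;    # guards against cycles
    my @pairs;
    for my $key (sort keys %$obj) {
        my $value = $obj->{$key};
        if (ref $value) {
            save_object($value, $seen);
            push @pairs, enc($key) . '=R:' . enc($value->{ID});
        } else {
            push @pairs, enc($key) . '=S:' . enc($value);
        }
    }
    $store{ $obj->{ID} } = join('&', @pairs);
}

# Restore one object, reusing anything that has already been read in.
sub restore_object {
    my ($id) = @_;
    return $cache{$id} if $cache{$id};
    my $obj = $cache{$id} = {};           # cache first, so cycles terminate
    for my $pair (split /&/, $store{$id}) {
        my ($key, $tagged) = split /=/, $pair, 2;
        my ($tag, $body)   = split /:/, $tagged, 2;
        $obj->{ dec($key) } = $tag eq 'R' ? restore_object(dec($body)) : dec($body);
    }
    return $obj;
}

my $person = { ID => 'p1',   NAME  => 'Pat' };
my $event  = { ID => 'ev42', TITLE => 'Team meeting', OWNER => $person };
my $other  = { ID => 'ev43', TITLE => 'Review',       OWNER => $person };
save_object($_) for $event, $other;

my $e1 = restore_object('ev42');
my $e2 = restore_object('ev43');
print "shared owner preserved\n" if $e1->{OWNER} == $e2->{OWNER};
```

Even this toy version needs a tagging scheme, a cache, and cycle handling,
which illustrates how quickly the effort approaches that of a full framework.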
 

Advice, dissent, corrections?

Documentation and code snippets copyright 2001 David Nicol.  Entire contents of this page released
GPL/Artistic, under the same terms as Perl itself.