DOPY Users Manual

Basic Architecture

DOPY consists of a common library and a set of modules for specific communication protocols (currently tcp and rsh). This allows communication details to be abstracted out with minimal impact on applications.

At the core of the DOPY application interface is the "hub". One instance of the dopy.Hub class exists per application. It is created either explicitly by the dopy.init() function, or implicitly when some other DOPY function that requires it is called.

You may obtain a reference to the hub using the dopy.getHub() function. Use this reference to add server protocols, register objects, and obtain references to remote objects.

In a non-threaded server application, it is also necessary to call the hub's mainloop() method while the application is serving. Use of mainloop() is optional in threaded DOPY applications (but recommended if the application can be called in non-threaded mode through a command-line option).

mainloop() will block the application for as long as the hub's select loop remains active (i.e. as long as there are server protocols and clients). Unfortunately, at this time mainloop() will block forever because there is no way to remove server protocols.
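
Putting these pieces together, a minimal non-threaded server might look something like the following sketch. The Foo class and the object key 'foo' are hypothetical, and the init() call is just one way of selecting the thread mode (see "Initializing Your Application" and "Threading in DOPY" below):

   import dopy
   import dopy.tcp

   # run without threads; mainloop() will do all of the serving
   dopy.init(threadMode = dopy.THRD_NONE)

   # a trivial object implementation to serve
   class Foo:

      def hello(self):
         return 'hello from the server'

   # listen for tcp clients on port 9600 and register the object
   # implementation under the key 'foo'
   dopy.tcp.makeServer(9600)
   dopy.getHub().addObject('foo', Foo())

   # block in the select loop for as long as the hub remains active
   dopy.getHub().mainloop()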

Every DOPY protocol has a protocol client class (derived from dopy.Protocol) and a protocol server class (derived from dopy.ProtocolServer). A DOPY application becomes a server by adding a server protocol instance with the protocol module's makeServer() function:

   
   import dopy.tcp
   
   dopy.tcp.makeServer(9600)
   

Parameters of the makeServer() function vary from protocol to protocol. In the case of the tcp module, it is a port number.

Remote objects can be created for any protocol either explicitly (usually by calling the remote() function in the protocol's module) or dynamically by calling the hub's makeRemoteObject() method. For example, the following two calls are equivalent:

   import dopy.tcp

   obj = dopy.tcp.remote('localhost', 9600, 'foo')
   obj = dopy.getHub().makeRemoteObject('tcp:localhost:9600:foo')

Note that even if you use only the second method, you must still import dopy.tcp. Protocol modules register themselves with the hub when they are imported, so if dopy.tcp is never imported the 'tcp:...' string will not be recognized. This may change in a future version of DOPY.

Initializing Your Application

As stated before, it is possible to explicitly initialize the DOPY hub through the dopy.init() function. This allows you to pass options from the command line and to specify other switches affecting DOPY.

In general, for each command line argument, there is also a parameter in the dopy.init() function. Command line arguments take precedence over parameter values.

To use command line arguments with no other parameters, we would do something like the following:

   import sys
   import dopy
   
   otherArgs = dopy.init(sys.argv)

In the case above, otherArgs will be assigned a list of everything in the command line that was not parsed by the DOPY command line processor (excluding sys.argv[0], the program name).

It is also possible to initialize DOPY without passing any parameters. This should be used in cases where you do not wish to allow the user to pass command line options to DOPY.
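
A sketch of parameterless initialization is as simple as:

   import dopy

   # initialize the hub with the default settings; the command line
   # is not examined at all in this form
   dopy.init()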

And finally, it is also possible to use DOPY without ever calling dopy.init(). If you choose to go this route, the hub will be constructed on demand with the default parameters.

Threading in DOPY

As of version 0.5, DOPY offers multiple threading modes:

non-threaded (THRD_NONE, none)

Threads are not used. This mode should be suitable for systems which (for some reason) still do not support multi-threaded programs.

When running in non-threaded mode, it is important that all use of the DOPY interfaces be limited to a single thread.

threaded select (THRD_SELECT, select)

In this mode a single thread is created for the select() loop. All method invocations and protocol handling occur within the select thread. This is not a very useful mode, but I suppose it could be used by servers that want to strictly limit the amount of system resources used by their clients.

threaded functions (THRD_FUNC, func)

This is the default mode of operation in a multi-threaded environment. It is just like threaded select mode except that when a server receives a request from a client, a new thread is started to service that request. The method invocation occurs in this service thread.

threaded communication (THRD_COM, com)

This is just like threaded functions mode except that in addition to initiating a thread for every request, a new thread is also initiated for every connection. Thus, every client has its own service thread.

This was the original threading mode of DOPY. You might want to try it if you experience problems with threaded functions.

The threading mode can only be selected during initialization. If command line processing is used, the "-dopy-thread" option can be used to select it from the command line. Alternately, it can be selected using an explicit option in the init() function:

   dopy.init(sys.argv, threadMode = dopy.THRD_SELECT)

The above would initialize DOPY in threaded select mode unless an alternate thread mode were specified on the command line:

   $ mydopycmd -dopy-thread func

The thread mode constants and command line thread mode switches are shown above in parentheses next to each thread mode name.

Getting Tracebacks Into the Server Code

When an exception is raised from within a remote method, the DOPY server code catches the exception and passes it back to the client. On the client side, the proxy code re-raises the exception object so that it appears to the client as if the exception were raised seamlessly across the remote method invocation.

The only problem with this approach is that traceback information (which is not associated with the exception object itself) gets lost when the exception is passed back to the client environment.

To remedy this, the DOPY server stores the traceback information in an attribute of the exception called "dopy_traceback". This attribute contains the traceback information in the form returned by traceback.extract_tb().

The information in "dopy_traceback" is automatically appended to the "real" traceback if you use the errorText() function in the dopy.tb module:

   import dopy.tb
   
   try:
   
      # call a method on a remote object
      remoteObject.someMethod()
      
   except Exception, ex:
   
      # print out the _full_ traceback and the exception info.
      print dopy.tb.errorText(ex)

Using DOPY With rsh

It is possible to run dopy over rsh, ssh or any communication program that uses standard input and output. The dopy.rsh module is used for this purpose. On the client side, just create a remote object from the rsh module instead of the tcp module. The first parameter is the command to run, the second is the object key:

   import dopy
   from dopy import rsh
   foo = rsh.remote('rsh my.host.com "dopyserver 2>err.out"', 'foo')

On the server side, we just call the makeServer() function and wait for the end of input:

   import dopy
   from dopy import rsh
   
   rsh.makeServer()
   rsh.wait()

See the rshclient and rshserver programs for an example.

It is also possible to start a dopy rsh server with inetd to serve DOPY tcp clients. Simply add a line similar to this to your /etc/inetd.conf file:

   13477 stream tcp nowait mike /home/mike/w/dopy/myserver myserver

"13477" is the port number to listen for. "mike" is the account that it is run under (I recommend that you use an account other than "root" to minimize the impact of any security holes that might arise). "/home/mike/w/dopy/myserver" is the name of the server program, and "myserver" is the 0th argument.

Do not write to standard output from within a dopy rsh server or any method called by it: standard input and output are the communication channels back to the client. Do not write to standard error either (unless it has been redirected) because this is usually merged with standard output.
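
If the server needs to produce diagnostics, one approach is to send them to a log file instead; the sketch below assumes a hypothetical log file name:

   import sys

   # open a log file and route diagnostics (and anything that would
   # normally go to standard error) to it instead of the rsh channels
   log = open('/tmp/dopyserver.log', 'a')
   sys.stderr = log
   log.write('server starting\n')
   log.flush()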

Keep in mind that the rsh approach creates a new instance of the server program with every client connection. Applications making use of persistence will need to coordinate access to the persistent objects explicitly.

The DOPY Naming Service

The DOPY naming service is intended to provide the same functionality as a CORBA naming service through a Python dictionary interface. The basic classes of this service are housed in the dopy.naming module.

At this time, there is no dedicated "name server": any server can become a name server by registering an instance of StandardNamingContext:


   import dopy
   from dopy.naming import StandardNamingContext

   hub = dopy.getHub()
   nameService = StandardNamingContext()
   hub.addObject('naming', nameService)

We recommend the convention of using the object key "naming" for the name service.

At this time, the naming service is really just a dictionary. In fact, from a purely functional point of view, there is no reason why we could not have replaced the lines in which the StandardNamingContext was created and registered with the following:

   hub.addObject('naming', {})

It is only appropriate to store RemoteObject instances in the name server; however, there is nothing to prevent you from storing any other kind of object in it. A copy of whatever is stored under a particular key will be returned to the client, so a naming service can (to some extent) be used as a storage repository. An attempt to store an object implementation in the naming service will produce unexpected results: retrieval of the object will return a copy of the object, not a proxy.
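
For example, a client might bind and look up remote object references through a naming context like this (the host, port, object keys and someMethod() are hypothetical):

   import dopy.tcp

   # get a proxy for the server's naming context
   naming = dopy.tcp.remote('somehost.bogus.net', 9600, 'naming')

   # bind a reference to another remote object under the name 'foo'
   naming['foo'] = dopy.tcp.remote('somehost.bogus.net', 9600, 'foo')

   # another client can now look the reference up by name and use it
   foo = naming['foo']
   foo.someMethod()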

This brings up the broader issue of what the naming service really is. To some extent, the hub's registry is a naming service: it maps keys to object implementations. Likewise, the persistence service is a naming service that mirrors the directory tree. It seems as though there should be a more generic way of representing the tree of objects served by a particular server. At this point the system is still very much in flux, so there is no reason not to expect such a thing to evolve.

The DOPY Persistence System

The dopy.pos module is similar to the dopy.naming module described above, only instead of a tree of object references, the persistence system represents a directory tree full of pickled Python object instances.

Use of the system is extremely simple: just create an instance of FileSysDirectory pointed at the root directory of the file system that you want to publish:

   import dopy
   from dopy.pos import FileSysDirectory
   
   hub = dopy.getHub()
   pos = FileSysDirectory('/home/mike/etc/pickled', 'pos')
   hub.addObject('pos', pos)

In the example above, /home/mike/etc/pickled is the root directory of the pickled object tree, and 'pos' is the object key of the persistence repository (again, we recommend the key 'pos' just for this purpose). The object key passed into the FileSysDirectory constructor must be the same as the key used in the hub.addObject() call.

Clients can store and retrieve copies of objects using the standard dictionary subscript operators. To manipulate an object remotely, use the getRemote() method:


   # get the persistence repository
   pos = dopy.tcp.remote('somehost.bogus.net', 9600, 'pos')

   # store an object in the repository
   pos['foo'] = Foo()
   
   # get the remote instance and modify it
   remoteFoo = pos.getRemote('foo')
   remoteFoo.setValue(100)
   
   # get a local copy of foo and modify it
   localFoo = pos['foo']
   localFoo.setValue(200)

Note that in the example above, localFoo is not remoteFoo. The remoteFoo object is a proxy object for the foo instance stored on the server. The localFoo object is a copy of the object stored on the server. When we call the setValue() method, we are changing the local value; this does not affect the remote instance.

Normally, DOPY does not impose any constraints upon the objects that it provides access to. However, in the case of persistence, the system must be able to tell the object when it is done calling a method so that the object can save itself. To deal with this, it is recommended that persistent objects derive from dopy.pos.PersistentObject. This class implements three special methods, dopy_beforeMethod(), dopy_afterMethod() and dopy_setPath(), which are used to automatically write the file after any method is called and to manage a single instance in memory.
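
A minimal sketch of a persistent class, assuming the inherited behavior is sufficient (the Foo name, setValue() method and val attribute are just illustrations):

   from dopy.pos import PersistentObject

   class Foo(PersistentObject):

      def setValue(self, val):
         # the inherited dopy_afterMethod() takes care of writing the
         # object back to its file after a remote invocation
         self.val = val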

Users may wish to provide their own versions of these methods to gain greater control over when objects are written: for example, it is not necessary for methods that do not modify an object to cause the object to be written. However, they should still derive from PersistentObject and call the _acquire() and _release() methods, which manage the object instance in a cache for persistent objects.

The persistence system is smart enough to maintain a single instance of an object in memory: if two different clients call a method on the same object, it will not be loaded twice. However, this system is not smart enough to take into account things like overlapping directory trees and symbolic links. For this reason, we recommend avoiding overlapping directory trees and symbolic links that allow the same file to be reached through more than one path: in short, the system expects that a file be known by only one name.

The object locking mechanism is still rather flimsy in other respects: when an object is initially loaded, it is given a "use count" of 0. This use count is incremented prior to method invocation and decremented after method invocation. If the use count is zero after method invocation, the object is deleted and removed from the cache. Obviously, this creates a region of exposure between the point at which the object is loaded and the beginning of the invocation of the method for which it has been loaded.

Special Methods

DOPY objects may have the following special methods:

dopy_beforeMethod(self, request)

Called before a remote method invocation. If an exception occurs, it is returned to the caller.

dopy_afterMethod(self, request, response)

Called after a remote method invocation. If an exception occurs, it is not returned to the caller. It is ignored.

dopy_setPath(self, pathname)

If the object is stored using the persistence service, this method will be called when the object is loaded so that the source file's path name can be given to it.

These methods will only be called if they are present.
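
As an illustration, a sketch of an object defining all three hooks might look like this (the class name is hypothetical, and the exact contents of the request and response arguments are not described here):

   class Audited:

      def dopy_beforeMethod(self, request):
         # called just before the remote method runs
         print 'servicing a request'

      def dopy_afterMethod(self, request, response):
         # called after the remote method runs; exceptions raised
         # here are ignored
         print 'finished servicing a request'

      def dopy_setPath(self, pathname):
         # remember where the persistence service loaded us from
         self.path = pathname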

Reactors

It is often desirable (particularly in single-threaded applications) to plug custom code into an application's select loop. For example, your application might have to read information from special pipes or sockets to communicate with other non-DOPY programs. To facilitate this, DOPY provides the Reactor class. Instances of classes derived from Reactor can be added to the hub using the addReactor() method.

Reactors provide a standard interface for plugging into the DOPY select loop.

Every reactor class must provide a fileno() method, which returns a handle suitable for inclusion in a select() call.

The notifyWhenReadable(), notifyWhenWritable(), and notifyOnError() methods must be overridden to return true or false depending on whether the reactor wants to be notified when its file handle is in that state. The value returned from any of these need not be static: a reactor may change the events that it is notified of at any time. These methods are called every time a state change causes select() to become unblocked.

The handleRead(), handleWrite() and handleError() methods are what will actually be called when the reactor's file handle becomes readable, writable or enters an exception state. These functions will only be called if the reactor has indicated its interest by returning true from one of its "notifyWhen" methods.

When creating reactors, remember that the handleRead(), handleWrite() and handleError() methods are called synchronously from the select loop. They should try to return as quickly as possible. If you have to do an extensive amount of processing in response to one of these events, it is recommended that you spawn it off in another thread, if possible.
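
As a sketch, a reactor that reads from a pipe might look something like the following. This assumes that the Reactor base class is importable from the top-level dopy module and needs no special construction; true and false are expressed as 1 and 0:

   import os
   import dopy

   class PipeReactor(dopy.Reactor):

      def __init__(self, readFd):
         self.readFd = readFd

      def fileno(self):
         # the handle that will be passed to select()
         return self.readFd

      def notifyWhenReadable(self):
         return 1

      def notifyWhenWritable(self):
         return 0

      def notifyOnError(self):
         return 0

      def handleRead(self):
         # called synchronously from the select loop, so keep it short
         data = os.read(self.readFd, 1024)
         print 'got %s' % repr(data)

   readFd, writeFd = os.pipe()
   dopy.getHub().addReactor(PipeReactor(readFd))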

MiniComLine - a simple command line reactor

One Reactor derivative that comes pre-packaged with DOPY is the MiniComLine class (in dopy.minicl). This class is a handy way to add a simple python command line to your server (in fact, this is the command line that you see when you run the dopyserver example program). To use it, simply add lines such as this to your program:

   import dopy.minicl
   dopy.getHub().addReactor(dopy.minicl.MiniComLine())

Module Index

DOPY currently comprises the following modules:

   dopy - the core module (the Hub class, init() and getHub())
   dopy.tcp - the tcp protocol module
   dopy.rsh - the rsh/ssh (standard input and output) protocol module
   dopy.tb - traceback utilities (errorText())
   dopy.naming - the naming service (StandardNamingContext)
   dopy.pos - the persistence system (FileSysDirectory, PersistentObject)
   dopy.minicl - the MiniComLine command line reactor

Copyright Info

Copyright (C) 1999 Michael A. Muller

Permission is granted to use, modify and redistribute this document, providing that the following conditions are met:

This code comes with ABSOLUTELY NO WARRANTY, not even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.