Wednesday, June 4, 2008

Store Yourself in Context

The ViSit Anywhere product was designed and implemented over several years. When we started the project we did the right thing. We locked ourselves in a quiet room for one week to examine what worked and what didn't work in ViSit v3 and ViSit/Web. We noted all the things we always wanted and all the things we could never do. We concentrated on the what and not the how. After a week we had what we considered to be the foundations of the new product. I don't think this initial vision has changed in any significant way during the product development. The first vision of the product was that we would have a rooted hierarchy of objects that would describe the project schema, the applications that could be run and the data that was operated on. Today this vision manifests itself in the ViSit explorer tree, as discussed in the data model.

When we started implementing this vision we very quickly arrived at a tool with a reasonable level of classic ViSit behavior: continuous mapping, well-defined layers and tables associated with graphic elements. We continued in this way until the China release (our third iteration). At this time, we found that the underlying project architecture was becoming more and more brittle. We started having to re-migrate all our projects after every new release. This was taking more and more of our time, until we finally felt that using .NET serialization was just not a good way to store project configuration and data. We decided that if we were going to progress, our objects would have to know how to store themselves.

At the same time, we decided that if we were going to have change management, we would have to implement it ourselves, and one way to implement change management is to save everything before a change and restore it if we have to roll back. So to us, having our change-managed objects store themselves was the cornerstone of building the product. With this new insight into how we would have to save projects and how saving objects is directly related to change management (with a good dose of design philosophy from Holub) we took a long hard look at our product architecture.

What followed in the Denmark release of ViSit Anywhere was what I like to refer to as the C-D extinction (in honor of the K-T extinction that wiped out the dinosaurs). That is, objects that were important in the ViSit v3 API suddenly lost their utility and were abandoned as the ViSit Anywhere API emerged. An example of this is the use of the primary ViSit objects: sites, themes and tables. In the ViSit v3 API there was a controller object which could be queried to access the desired ViSit object, which could then be used to perform an application task. The ViSit Anywhere model uses a cooperating set of objects, with fewer controller objects. For example, in pre-Denmark ViSit Anywhere, the SiteTreeFactory object could be used to create a sub-tree of sites that could then be used by an application. The controller object would be responsible for all site interaction, including keeping track of who was attached and who was not. The post-Denmark API simply has a hierarchy of sites that can be queried by path expression. Much of the functionality that was implemented by the SiteTreeFactory is now distributed directly over the hierarchy of SiteLinkages. For example, a parent site knows that it and some of its children might intersect a geographic region. It can then ask its child sites if they intersect the region.
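To make the idea concrete, here is a minimal sketch in Python. It is not the ViSit Anywhere API (which is .NET); the names Box, Site, find and sites_intersecting are hypothetical, and the sketch assumes that a parent's extent encloses the extents of its children.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle, a hypothetical stand-in for a geographic region."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def intersects(self, other: Box) -> bool:
        return (self.xmin <= other.xmax and other.xmin <= self.xmax
                and self.ymin <= other.ymax and other.ymin <= self.ymax)


@dataclass
class Site:
    name: str
    extent: Box
    children: list[Site] = field(default_factory=list)

    def find(self, path: str) -> Site:
        """Resolve a slash-separated path relative to this site."""
        site = self
        for part in filter(None, path.split("/")):
            site = next(c for c in site.children if c.name == part)
        return site

    def sites_intersecting(self, region: Box) -> list[str]:
        """Answer for this site, then ask each child the same question."""
        if not self.extent.intersects(region):
            return []  # assumption: children lie inside the parent's extent
        hits = [self.name]
        for child in self.children:
            hits.extend(child.sites_intersecting(region))
        return hits


root = Site("world", Box(0, 0, 100, 100), [
    Site("north", Box(0, 50, 100, 100), [Site("harbour", Box(10, 60, 20, 70))]),
    Site("south", Box(0, 0, 100, 40)),
])
hits = root.sites_intersecting(Box(12, 62, 15, 65))
```

No controller object tracks the sites here; each site answers for itself and delegates the rest of the question to its children.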

This type of implementation is an expression of the object-oriented principle that you should not ask an object for information about itself so that you can perform an operation, but rather ask the object to perform the task itself. And the task I want to talk about now is object storage.

This all sounds well and good, but there is one more twist that really allows this implementation to prove its worth. Our objects are maintained in a strict hierarchy. Every object has a place in the hierarchy and there is only one object that is special (the root of the hierarchy). Some objects have children, some don't. When we save an object, it is stored relative to other objects in the hierarchy.

Now, while objects have to know how to store themselves, they need to have some information on how they should be stored. In fact, it is better if an object can describe its storage in a generic format, such as XML, then delegate the actual storage of the information to another object. To accomplish this we introduce a serialization context. The serialization context exposes a container object that the storable object uses to store its content. The serialization context is responsible for writing the container to some persistent store, for example the file system or (as we shall soon see) a relational database.

The serialization context object has been implemented as an interface, allowing us to re-implement the object when we wish to change the persistent store. In addition, only two objects are capable of creating a root serialization context: the root collection (i.e. the root of the object tree), which defines how a local project is stored, and a proxy object that represents the server store. For the iterations from Denmark to Malawi, we used a single serialization context, the XmlSerializationContext. This object managed the storage of object content as XML files, with the hierarchical structure of the object tree represented as a hierarchical set of files.
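As a rough illustration of this arrangement, here is a minimal Python sketch. It is not the actual ViSit Anywhere interface: the member names (container, commit, child), the XmlFileContext class, the Layer object and the content.xml file name are all assumptions made for the example.

```python
import tempfile
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod
from pathlib import Path


class SerializationContext(ABC):
    """Hypothetical interface; the real members are not shown in this post."""

    @abstractmethod
    def container(self) -> ET.Element:
        """Return the generic container the storable object fills in."""

    @abstractmethod
    def commit(self) -> None:
        """Write the container to the persistent store."""

    @abstractmethod
    def child(self, name: str) -> "SerializationContext":
        """Create a context for a child object, stored relative to this one."""


class XmlFileContext(SerializationContext):
    """Stores each object as content.xml inside a directory of its own."""

    def __init__(self, directory: Path):
        self.directory = directory
        self._root = ET.Element("object")

    def container(self) -> ET.Element:
        return self._root

    def commit(self) -> None:
        self.directory.mkdir(parents=True, exist_ok=True)
        ET.ElementTree(self._root).write(str(self.directory / "content.xml"))

    def child(self, name: str) -> "XmlFileContext":
        return XmlFileContext(self.directory / name)


class Layer:
    """Hypothetical storable object: it describes itself, the context persists."""

    def __init__(self, name: str, color: str):
        self.name, self.color = name, color

    def store(self, context: SerializationContext) -> None:
        box = context.container()
        box.set("type", "Layer")
        box.set("name", self.name)
        ET.SubElement(box, "color").text = self.color
        context.commit()


with tempfile.TemporaryDirectory() as tmp:
    Layer("roads", "red").store(XmlFileContext(Path(tmp) / "project"))
    saved = (Path(tmp) / "project" / "content.xml").read_text()
```

The Layer never touches the file system itself. Swapping the persistent store means supplying a different SerializationContext, not changing Layer.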

Note that the idea of using a generic member to allow objects to perform some task, and then providing details on exactly how to represent the results of the operation with a context object, is a recurring architectural pattern in ViSit Anywhere. A large part of the ViSit Anywhere API involves asking objects to describe themselves using some context object. This subject will be dealt with in a future blog, but it is important to understand the use of this concept.

So back to serialization (or storage) of objects. Basically the process goes as follows: the root object creates the root serialization context because it knows the type of serialization that is being used. Root objects are either the local project or the server project. Both these objects have to know how to locate their stores, so this is not a problem. Once this is done, it is possible to create a child serialization context for each child of the root. In this way, child objects are stored (in some way) relative to their parent object, through the use of a child serialization context. Serialization contexts must provide a way to create child contexts, so again this is not very difficult. By recursively applying these rules, all objects can be serialized.
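The recursion described above can be sketched in a few lines of Python. This is an illustration only: MemoryContext, Node and Project are hypothetical names, and a plain dictionary stands in for the persistent store.

```python
class MemoryContext:
    """Hypothetical context that records content under a path in a shared dict."""

    def __init__(self, store: dict, path: str = "/"):
        self.store = store
        self.path = path

    def write(self, content: dict) -> None:
        self.store[self.path] = content

    def child(self, name: str) -> "MemoryContext":
        # A child context stores its object relative to the parent's path.
        return MemoryContext(self.store, self.path.rstrip("/") + "/" + name)


class Node:
    def __init__(self, name: str, children=()):
        self.name = name
        self.children = list(children)

    def store(self, context: MemoryContext) -> None:
        context.write({"name": self.name})
        for child in self.children:
            child.store(context.child(child.name))  # recurse with a child context


class Project(Node):
    """Root object: the only one that decides which kind of context is used."""

    def save(self, store: dict) -> None:
        self.store(MemoryContext(store))


saved = {}
project = Project("demo", [Node("sites", [Node("north")]), Node("themes")])
project.save(saved)
```

After save, the keys of saved mirror the object tree, one path per object, which is the same shape as the hierarchical set of files described earlier.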

As a side note, we can see how the change manager uses serialization to orchestrate change. When an object is added to a change scope (before it has been modified) the change manager can create a child backup serialization context using local revision information. When the object is changed it can then serialize the new state. If we have to return to the original state, we just have to load the backed-up version and store it as the current version. Thus change management of storable objects (typically schema objects) can be accomplished by applying simple rules of serialization.
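A minimal Python sketch of that backup-and-restore cycle follows. The names (Store, Context, Theme, ChangeScope) and the "backup-<revision>" naming are assumptions for illustration, not the ViSit Anywhere change manager.

```python
import copy


class Store:
    """Hypothetical persistent store: maps (area, path) to saved state."""

    def __init__(self):
        self.data = {}


class Context:
    """Hypothetical context bound to one area ("current" or a backup) and path."""

    def __init__(self, store: Store, area: str, path: str):
        self.store, self.area, self.path = store, area, path

    def save(self, state: dict) -> None:
        self.store.data[(self.area, self.path)] = copy.deepcopy(state)

    def load(self) -> dict:
        return copy.deepcopy(self.store.data[(self.area, self.path)])


class Theme:
    def __init__(self, path: str, color: str):
        self.path, self.color = path, color

    def store(self, context: Context) -> None:
        context.save({"color": self.color})

    def restore(self, context: Context) -> None:
        self.color = context.load()["color"]


class ChangeScope:
    def __init__(self, store: Store, revision: int):
        self.store, self.revision, self.objects = store, revision, []

    def _backup(self, obj) -> Context:
        return Context(self.store, f"backup-{self.revision}", obj.path)

    def add(self, obj) -> None:
        obj.store(self._backup(obj))  # snapshot before any modification
        self.objects.append(obj)

    def commit(self) -> None:
        for obj in self.objects:
            obj.store(Context(self.store, "current", obj.path))

    def rollback(self) -> None:
        for obj in self.objects:
            obj.restore(self._backup(obj))  # load the backed-up version
            obj.store(Context(self.store, "current", obj.path))  # make it current


store = Store()
theme = Theme("/themes/roads", "red")
scope = ChangeScope(store, revision=7)
scope.add(theme)
theme.color = "blue"
scope.commit()
scope.rollback()
```

Rollback needs no knowledge of what a Theme contains; it only asks the object to load itself from one context and store itself to another.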

While this may all seem good and change management is an important thing, you might be asking why I want to talk about serialization and object storage now? In fact, the last two iterations, Nepal and now Oman, have shown uses for the serialization context that were not previously planned.

The big feature of the Nepal release was to abandon our simple file system server for one that could be used to store objects over an HTTP connection. Previously, we were using a server project on a shared drive, locking the file hierarchy so that only one person could access it at a time, and using this as a server store. This server store was based on an XmlSerializationContext that pointed to the server store area rather than the local store area.

When we started thinking about how to implement an HTTP server, we came to the conclusion that by implementing a new type of serialization context we could push the store over the wire to store objects on a remote server. To do this, we simply implemented a serialization context that could communicate with a couple of WCF (Windows Communication Foundation) services. This provides a framework where we can have a server hosted on a web server, via a TCP/IP port or simply using an in-process file system connection (or all three flavors), without having to change any program code. That's because WCF abstracts the networking protocols, making them look like the same thing to the application program. WCF is a technology that, for us, arrived just in time. If we had started implementing this before WCF arrived, we probably would have spent a lot of time doing it, and then we would have had to re-implement it to conform to WCF.

But the really interesting thing is that rather than taking hundreds of hours to implement a new server technology, with client-side and server-side components, we implemented one new serialization context (about 400 lines of code) and a simple server-side object that accepted files and copied them to the correct directories (again, about 400 lines of code). So rather than spending weeks implementing our new server store, we spent one week (and most of that was discussing the fine details).
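The shape of that solution can be sketched in Python as follows. This is not WCF and not our code: RemoteContext, FileReceiver and Table are hypothetical, and a plain method call stands in for the network transport.

```python
class FileReceiver:
    """Hypothetical server-side object: accepts files and puts them in place."""

    def __init__(self):
        self.files = {}  # stands in for the server's directory tree

    def upload(self, relative_path: str, payload: bytes) -> None:
        self.files[relative_path] = payload


class RemoteContext:
    """Hypothetical context that pushes content through a transport callable."""

    def __init__(self, send, path: str = ""):
        self.send = send  # in a real system this would cross the wire
        self.path = path

    def write(self, text: str) -> None:
        self.send(self.path + "/content.xml", text.encode("utf-8"))

    def child(self, name: str) -> "RemoteContext":
        return RemoteContext(self.send, self.path + "/" + name)


class Table:
    """Storable object; identical no matter which context it is handed."""

    def __init__(self, name: str, children=()):
        self.name = name
        self.children = list(children)

    def store(self, context) -> None:
        context.write(f'<table name="{self.name}"/>')
        for child in self.children:
            child.store(context.child(child.name))


server = FileReceiver()
Table("parcels", [Table("owners")]).store(RemoteContext(server.upload))
```

Table.store is unchanged from what a local, file-based context would require. All the new behavior lives in the context and in the small receiver on the other side.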

By this time we were starting to appreciate the ideas of strict object encapsulation, but history would repeat itself. In the current Oman iteration, one of the big features is to be a special-purpose client that will automatically synchronize a ViSit Anywhere project to an Oracle database (with the graphic objects represented as Oracle SDO geometry objects). Given the structure and complexity of the ViSit Anywhere synchronization sequence, this task promised to be difficult. But after a weekend of thought and about one day of design discussions, it became clear that we could solve this problem by implementing a new Oracle serialization context. In this case, the Oracle context would know how to connect to the database and push the ViSit Anywhere geometry (managed CAD elements) into Oracle Locator. Again, the heart of the problem was solved in about a week, probably in less than 1000 lines of code.

The Oracle auto-synchronization client also provides a pattern for pushing ViSit Anywhere projects into any relational database, simply by implementing a new serialization context.

These two examples underline the advantages of the extensible ViSit Anywhere API and how object-oriented concepts like type polymorphism and strict encapsulation can lead to stunning productivity improvements. To top it all off, this is exactly what Holub predicted!
