Friday, May 16, 2008

ViSit Anywhere Change Management

A couple of years ago, when Bentley first released Microstation v8, we were considering how to move ViSit forward onto the new platform. At the time it was clear that the new design file format was going to result in a lot of rework in ViSit just to stay where we were with Microstation/J. This was very disconcerting, since we felt that if we were to make this move, we would have to provide something extra to encourage our customers to move to the new platform. We understood that new products, at least at the outset, are generally less stable and have fewer features. As we looked at migrating 10 or 15 years of MDL code from MS/J to MS/v8, we felt that the task would certainly be difficult and time-consuming.

At the time of the MS/v8 release Bentley had promised to fix what we considered to be one of the most difficult GIS problems - concurrent editing using optimistic change management. MS/J had been released with ProjectBank (a change management platform) with much fanfare, but it seems the system never caught on. Even though we did deliver a ProjectBank component for ViSit v3, the system never had the stability or the performance that we were looking for. So with every new MS/v8 release we patiently waited for news of the promised change management system and when it did appear (late in the Microstation XM product stream release), it was not what we were hoping for.

Meanwhile, we decided that if we were to have a change management system we would have to implement it ourselves. Our experience with ProjectBank had given us some insight into what was required, but ProjectBank only managed graphical CAD elements. We really needed something that would manage changes in both the graphics and the attribute data.

Many GIS systems (including Bentley's) were starting to use Oracle Workspaces for change management, but since most of our customers were not using Oracle we did not feel that this could be made to work for us. In addition, the demonstrations we had seen seemed to be overly complex and the Oracle licensing requirements were (for us) ambiguous.

At the same time, being software developers, we were being exposed more and more to the same type of change management in our source code version management tools. For example, we are currently using P4 (from Perforce). When we program, different developers are often working on the same source files; however, P4's automatic change merging and conflict management means that we only have to manually resolve conflicts on very rare occasions. Our experience with GIS systems was similar. Typically we can have a large number of people editing and working with the GIS system at the same time, but since everyone is doing their own work, their changes rarely overlap. We feel that information systems of this type are ideal candidates for change management with optimistic locking.

At the time we were struggling with the issue of change management, I was reading the book Holub on Patterns, which I then passed on to Dominique. One of Holub's pet peeves was application user interfaces that would show a popup dialog asking users if they were sure they wanted to continue, because the changes that they were about to make could not be undone. This notion also struck a chord with us, as we wanted users to be able to edit data in ViSit without fear that they would somehow corrupt the project data. Holub's solution to this problem was to have objects that could save themselves. So as changes were being made, the objects being changed could silently and transparently save their own state and thus allow any change to be undone.

This notion inspired much discussion at Géotech, and the result of these discussions is the ViSit Anywhere system of change management. One of the cornerstones of our implementation is objects that can store themselves. For this, we developed the IStorable interface. That is, any object that implements this interface can be change managed. At the same time, we were discussing where and in what format objects should store themselves. We had already had requests that configuration information be stored in an RDBMS, and while we did not think that this was appropriate at the time, we decided to punt on the matter and develop an ISerializationContext object that would be passed to IStorable objects to provide infrastructure and hints on how we wish objects to store themselves. At a later date, if we need to store the objects in some other format, it could be accomplished using the framework of the ISerializationContext. The idea of a context object that provides hints for object behavior became another cornerstone of the ViSit Anywhere implementation.
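To make the relationship between the two interfaces concrete, here is a minimal sketch in Python (ViSit Anywhere itself is .NET, so this is only an illustration). The class names Storable, SerializationContext, InMemoryContext and LayerConfig, and the save/restore/write/read methods, are assumptions made for the example and are not the actual IStorable or ISerializationContext members.

```python
from abc import ABC, abstractmethod


class SerializationContext(ABC):
    """Stand-in for ISerializationContext: decides where and how state is kept."""

    @abstractmethod
    def write(self, key, state):
        ...

    @abstractmethod
    def read(self, key):
        ...


class InMemoryContext(SerializationContext):
    """Hypothetical context that keeps states in a dict instead of XML files."""

    def __init__(self):
        self._store = {}

    def write(self, key, state):
        self._store[key] = dict(state)  # copy so later edits do not leak in

    def read(self, key):
        return dict(self._store[key])


class Storable(ABC):
    """Stand-in for IStorable: an object that can save and restore itself."""

    @abstractmethod
    def save(self, context):
        ...

    @abstractmethod
    def restore(self, context):
        ...


class LayerConfig(Storable):
    """Hypothetical schema object with one configurable field."""

    def __init__(self, name, color):
        self.name = name
        self.color = color

    def save(self, context):
        context.write(self.name, {"color": self.color})

    def restore(self, context):
        self.color = context.read(self.name)["color"]


context = InMemoryContext()
layer = LayerConfig("roads", "red")
layer.save(context)      # snapshot the state before editing
layer.color = "blue"     # the edit
layer.restore(context)   # undo by reloading the snapshot
```

Because the object only talks to the context, a different storage format would mean writing another context class and leaving LayerConfig untouched.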

Currently objects store their state in XML files and the XmlSerializationContext provides the necessary infrastructure for performing the save. However, we have already exploited the existence of the ISerializationContext by implementing a RemoteXmlSerializationContext to store objects remotely using a WCF proxy object. Thus when implementing the remote server, we simply implemented the local server proxy to provide a remote serialization context. The RemoteXmlSerializationContext provides the infrastructure to store XML documents remotely, so no IStorable code needed to be changed.

The final major participant in the change management system is the ChangeManager. This object is responsible for coordinating all change management tasks. That is, when objects know they are in the process of being changed, the ChangeManager will create the appropriate context objects and request that the IStorable objects save themselves. If there is an error in the process and the change cannot be completed, the ChangeManager must coordinate the rollback.

At the time we were developing the change management system, Microsoft was releasing the .NET framework v2. The release featured an object called a TransactionScope, which was used to delimit the effects of an atomic change. Since at the time we were still using the .NET framework v1.1 (for compatibility with Microstation), we developed a similar framework, where the ChangeManager would deliver and control IChangeScope objects. The basic usage is that a method that wants to perform a managed change requests an IChangeScope from the ChangeManager. The IStorable objects participating in the change then add themselves to the IChangeScope. The change is then performed. Once all the work is complete, a call to the Complete member on the IChangeScope signals that the change should be committed. IChangeScope objects can be nested to allow complex, multi-participant transactions. The ChangeManager provides all the calls to save and roll back the objects. As the IChangeScope is an IDisposable object, transaction commit/rollback is performed when the IChangeScope is disposed. If the IChangeScope Complete method has been called, the transaction is committed; otherwise it is rolled back. The ChangeManager handles all thread synchronization (since only one thread can hold the current change scope at any one time). The result is a system where objects that can store themselves can be easily transaction managed. The ChangeManager must also store the committed IChangeScope in case the user later decides to undo his local changes. The stack of local changes is called the long transaction. The long transaction is what has to be committed to the server once the user decides that the changes are valid.
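The commit-on-Complete, rollback-on-dispose pattern described above can be sketched with a Python context manager standing in for IDisposable. This is an illustration only: the method names begin, add and complete, the in-memory snapshots and the Pole class are assumptions, and thread synchronization is left out.

```python
import copy


class ChangeScope:
    """Stand-in for IChangeScope: commits only if complete() was called."""

    def __init__(self, manager):
        self._manager = manager
        self.snapshots = []        # (object, saved attribute dict) pairs
        self._completed = False

    def add(self, obj):
        # Snapshot the participant before it is modified.
        self.snapshots.append((obj, copy.deepcopy(vars(obj))))

    def complete(self):
        self._completed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Leaving the block plays the role of Dispose().
        self._manager.close(self, self._completed and exc_type is None)
        return False               # never swallow exceptions


class ChangeManager:
    def __init__(self):
        self._open = []              # stack of currently nested scopes
        self.long_transaction = []   # committed top-level scopes, kept for undo

    def begin(self):
        scope = ChangeScope(self)
        self._open.append(scope)
        return scope

    def close(self, scope, commit):
        if not self._open or self._open[-1] is not scope:
            raise RuntimeError("change scopes must be closed innermost first")
        self._open.pop()
        if not commit:
            # Roll back: restore snapshots, newest first.
            for obj, state in reversed(scope.snapshots):
                vars(obj).clear()
                vars(obj).update(state)
        elif self._open:
            # Inner commit: the parent inherits the snapshots so that a
            # later rollback of the parent also undoes the inner change.
            self._open[-1].snapshots.extend(scope.snapshots)
        else:
            self.long_transaction.append(scope)


class Pole:
    """Hypothetical object being edited."""

    def __init__(self, height):
        self.height = height


manager = ChangeManager()
pole = Pole(10)

with manager.begin() as scope:
    scope.add(pole)
    pole.height = 12
    scope.complete()     # committed when the block exits

with manager.begin() as scope:
    scope.add(pole)
    pole.height = 99
    # complete() is never called, so the change is rolled back on exit
```

After both blocks run, the pole keeps the committed height of 12 and the long transaction holds one scope. A completed inner scope is only provisionally committed: it is still undone if its enclosing scope rolls back.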

The above process can be rather complex, but it gets a little bit worse still. IStorable objects are typically schema objects that allow project features to be configured. Most user changes are changes to graphic objects with their associated attribute data, that is, instance data. The two major instance objects in question here are site linkages and table instances. SiteLinkage objects are collections of graphic data, roughly analogous to a Microstation design file. For the purposes of change management, SiteLinkage objects are collections of managed CAD elements (MCEs). An MCE is a primitive graphic element that can be change managed. An MCE can be a point (represented by a symbol), a line or a polygon. Each MCE can be associated with zero or more text labels. All the managed graphics in ViSit Anywhere must be represented by these 3 primitives. Note, however, that a SiteLinkage (which implements IStorable) may contain any number of MCEs. The overhead of creating an IStorable object for each MCE is just too high. For this reason we have implemented the notion of an IChangeController. An IChangeController is simply an IStorable object that helps in change managing a collection of elements. The elements managed by the IChangeController define the granularity of the changes that can be managed by the system.

TableInstance provides the same service for attribute data as SiteLinkage provides for graphic data. The TableInstance change manages attribute data at row granularity. That is, when you change the attributes for a single row, only that row is marked as having been changed. If two users edit data in the same row it is seen as a conflict, even if the two users edit different columns in the row. This is the meaning of the change granularity.
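Row granularity can be illustrated with a small Python sketch. The TableInstance class below is a simplified stand-in, not the ViSit Anywhere class; set_value, changed_rows and conflicting_rows are names invented for the example.

```python
class TableInstance:
    """Hypothetical table that tracks changes per row, not per cell."""

    def __init__(self, rows):
        # Copy each row so that separate users edit independent data.
        self.rows = {row_id: dict(values) for row_id, values in rows.items()}
        self.changed_rows = set()

    def set_value(self, row_id, column, value):
        self.rows[row_id][column] = value
        self.changed_rows.add(row_id)   # the whole row is marked, whatever the column


def conflicting_rows(local, incoming):
    """Rows changed on both sides conflict, even if the columns differ."""
    return local.changed_rows & incoming.changed_rows


base = {
    1: {"owner": "city", "material": "wood"},
    2: {"owner": "city", "material": "steel"},
}
first_user = TableInstance(base)
second_user = TableInstance(base)

first_user.set_value(1, "owner", "utility")
second_user.set_value(1, "material", "concrete")   # same row, different column
second_user.set_value(2, "owner", "county")        # row the first user never touched

conflicts = conflicting_rows(first_user, second_user)
```

Here only row 1 is reported as a conflict, although the two users edited different columns; row 2 merges cleanly because only one user touched it.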

Now, the ChangeManager knows how to save and restore IStorable objects, but rows and graphic elements are not IStorable objects. When these items are added to the IChangeScope, the ChangeManager must resolve the appropriate IChangeController and request that that object perform the necessary change management operation. In this way we have resolved most of the problems with local change management. The ChangeManager coordinates change operations, but in the end only objects that know how to store their own state can be change managed. Other objects can change manage collections of elements, but they are responsible for doing the right thing when the ChangeManager makes a request.

Now, consider that we have a stack of changes on the local client and the user wants to commit this information to the server. The ChangeManager is again responsible for coordinating this task. The first thing that must be done is to create a Revision. A Revision is a collection of changes that can be committed to the server. Unlike in the local long transaction, an IStorable object may appear only once in the Revision. A RevisionEntry for any object gives the new state of the object at the time of the commit. Thus, if I had an object that I added and then modified twice in my long transaction, the RevisionEntry created would be the final configuration with an ADD marker. Thus, there is a fixed logic to merging changes from the long transaction to the Revision. Users synchronizing to the revision will not see the intermediate states of the objects, only the revision state (which is also the state of the object in the committer's local project).
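The merge from long transaction to Revision can be sketched as a fold over the ordered list of local changes. Only the add-then-modify rule comes from the description above; the handling of the other combinations (add then delete, latest action wins) is an assumption made to complete the example, as are the tuple layout and the function name.

```python
ADD, MODIFY, DELETE = "ADD", "MODIFY", "DELETE"


def merge_into_revision(long_transaction):
    """Collapse ordered (object_id, action, state) changes to one entry per object."""
    revision = {}
    for object_id, action, state in long_transaction:
        previous = revision.get(object_id)
        if previous is None:
            revision[object_id] = (action, state)
        elif previous[0] == ADD and action == MODIFY:
            # From the post: an added, then modified, object stays an ADD
            # carrying its final state.
            revision[object_id] = (ADD, state)
        elif previous[0] == ADD and action == DELETE:
            # Assumption: an object added and deleted locally never
            # needs to reach the server.
            del revision[object_id]
        else:
            # Assumption: otherwise the latest action and state win.
            revision[object_id] = (action, state)
    return revision


changes = [
    ("valve-7", ADD, {"diameter": 100}),
    ("valve-7", MODIFY, {"diameter": 150}),
    ("valve-7", MODIFY, {"diameter": 200}),
    ("pipe-3", MODIFY, {"length": 12.5}),
]
revision = merge_into_revision(changes)
```

The three changes to valve-7 collapse into a single ADD entry with the final diameter, so anyone synchronizing to this revision never sees the intermediate states.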

To commit data to the server, we simply merge the long transaction changes into a Revision object and send the new states of the changed objects to the server. The server maintains a stack of revisions that contains entries for each object in the project. Serializing the Revision on the server increments the tip of the revision stack and provides the new state of the changed objects. Objects that are not part of the revision can be accessed by drilling down into the revision stack.
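One way to picture the revision stack and the drill-down lookup is the following Python sketch. It is a guess at the shape of the idea, not the server's actual data structure; RevisionStack, commit, lookup and tip are hypothetical names, and each revision is reduced to a plain dict of object states.

```python
class RevisionStack:
    """Hypothetical server-side stack of revisions."""

    def __init__(self):
        self._revisions = []   # index 0 holds the oldest revision

    @property
    def tip(self):
        """Number of the newest revision (0 when nothing is committed)."""
        return len(self._revisions)

    def commit(self, entries):
        """Push a revision; only the objects that changed are stored in it."""
        self._revisions.append(dict(entries))
        return self.tip

    def lookup(self, object_id):
        """Drill down from the tip until a revision contains the object."""
        for entries in reversed(self._revisions):
            if object_id in entries:
                return entries[object_id]
        raise KeyError(object_id)


stack = RevisionStack()
stack.commit({"pole-1": {"height": 10}, "pole-2": {"height": 9}})
stack.commit({"pole-1": {"height": 12}})   # only pole-1 changed in revision 2
```

Looking up pole-1 finds it in the tip revision, while pole-2 is absent from the tip and is found one level down.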

Before a user can commit his changes, he must be sure that his local project is synchronized to the tip of the revision stack. If not, applying changes can result in a corrupted server. Synchronizing a revision simply means applying the IStorable objects that are entries in the revision in question to the local project. Again, this is performed simply by object serialization, in this case from the server to the client. Note also that this is how conflicts between changes are detected. If the user must recover an IStorable object from the server because it has a new version in the revision, and that user has also changed that IStorable object (i.e. it has an entry in the local long transaction), this is a conflict. The user must resolve the conflict in order to synchronize the incoming revision. This can be done by accepting the server version and rolling back the local change, by ignoring the server change and keeping the local change, or by a combination of the two. The point is that the user must have what he considers to be the correct version on his local machine before he can commit his changes.
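The conflict rule amounts to a set intersection between the objects in the incoming revision and the objects in the local long transaction. The Python sketch below assumes plain dicts for the project, the local changes and the revision, and models only two of the resolutions ("server" and "local"); the function and exception names are invented for the example.

```python
class UnresolvedConflict(Exception):
    """Raised when an incoming object was also changed locally."""


def synchronize(project, local_changes, incoming, resolutions=None):
    """Apply an incoming revision to the local project.

    project:       object_id -> state in the local project
    local_changes: object_id -> state changed in the long transaction
    incoming:      object_id -> state from the server revision
    resolutions:   object_id -> "server" or "local" for each conflict
    """
    resolutions = resolutions or {}
    conflicts = set(incoming) & set(local_changes)
    unresolved = conflicts - set(resolutions)
    if unresolved:
        # Nothing is applied until every conflict has a resolution.
        raise UnresolvedConflict(sorted(unresolved))
    for object_id, state in incoming.items():
        if resolutions.get(object_id) == "local":
            continue                          # keep the local change
        project[object_id] = state            # take the server version
        local_changes.pop(object_id, None)    # roll back a local change, if any


project = {"pole-1": {"height": 12}, "pole-2": {"height": 9}}
local_changes = {"pole-1": {"height": 12}}
incoming = {"pole-1": {"height": 11}, "pole-2": {"height": 10}}

try:
    synchronize(project, local_changes, incoming)
    blocked = False
except UnresolvedConflict:
    blocked = True   # pole-1 was changed on both sides

synchronize(project, local_changes, incoming, {"pole-1": "local"})
```

The first call is refused because pole-1 changed on both sides. The second keeps the local pole-1, takes the server's pole-2, and leaves pole-1 in the long transaction ready to commit.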

So in essence we see that ViSit Anywhere implements a change management system based on objects' ability to save themselves. Simple objects, like graphic elements and attribute rows, may use controller objects to perform change management operations for them. Changes are made in nested change scopes that provide a transaction environment for developers. The ChangeManager object performs most of the heavy lifting when it comes to the details of organizing changes and getting them to the project server, but the objects being changed always have to participate in the process.

If this infrastructure seems complex, that's because it is. Managing nested change scopes during commit and rollback is extremely difficult, even with the ChangeManager. Fortunately, a new method for programming change managed operations has emerged during the implementation of the network editing tools that greatly simplifies change managing instance data. This technique is built on top of the basic change management plumbing, but uses an in-memory ADO.NET dataset to contain all the information that could possibly change during the operation. Once the operation is complete, we only have to change manage the dataset. The DataSetManager takes care of all the details of loading the data and change managing it, including ensuring that attribute data and graphic elements are always changed in the same IChangeScope (since they are managed by separate IChangeController objects). Change management with the DataSetManager will be the subject of a future blog entry.
