Wednesday, February 28, 2007

Software system longevity paradigms

The functionality of software systems often needs to survive events like hardware and software failures, version upgrades, and system reimplementation.

There are two basic paradigms for providing this property: intentional and orthogonal. The differences between them might look insignificant at first sight, but they lead to products with different properties. I have an obvious bias in favor of the intentional paradigm.

Saving State - Orthogonal Paradigm

In this paradigm the application state is saved. The saved artifacts are considered dependent on the application. The saved state might be the entire application state or just a part of it. This functionality is often considered orthogonal to the rest of the application's functionality. A programmer working in this paradigm generally does not want to care whether they are working with persistent objects or not. The features of the persistence layer that leak into the application layer are often regarded as sad facts of life and framework clutter.

The basic principle is that the entire software system survives because each application (or at least an important part of it) survives.

There are a lot of systems that use this paradigm:
  1. MS Office file formats
  2. OS hibernate
  3. Java object serialization
  4. CapROS
  5. JDO 1.0
  6. Most object databases

Some of these technologies can also be used in the context of the intentional paradigm, but they were designed in the context of the orthogonal paradigm.

This paradigm has many features that make it attractive in the eyes of the developer. It is very simple to start using: an existing application object is just marked as persistent, so a developer does not have to learn a new technology.

The fundamental "feature" of this paradigm is the assumption that the application does not change significantly during its life cycle. Problems surface when this assumption is invalidated. When the application evolves, it is hard to migrate data to a new version of the application. It is even harder to use the data from other applications, particularly ones written in another programming language.

The problems of Java object serialization are relatively well documented. For example, the JavaDoc of almost every Swing component states that successful deserialization under a different version of Java is unlikely.
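To make the coupling concrete, here is a minimal sketch of standard Java object serialization. The `Point` class is a hypothetical example; the point is that the saved bytes mirror the exact class layout, so refactoring the class (or moving to a different library version) can make old data unreadable.

```java
import java.io.*;

// Minimal sketch of the orthogonal paradigm: the in-memory object
// graph is written out as-is, tying the saved bytes to this exact
// class shape. Point is a hypothetical class for illustration.
public class SerializationSketch {
    static class Point implements Serializable {
        // Without an explicit serialVersionUID the JVM derives one from
        // the class structure; any structural change then makes old
        // streams fail with InvalidClassException.
        private static final long serialVersionUID = 1L;
        int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    public static void main(String[] args) throws Exception {
        // Serialize a Point into a byte buffer.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(new Point(3, 4));
        }
        // Deserialize it back; this only works while the class
        // definition still matches what was written.
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buf.toByteArray()))) {
            Point p = (Point) in.readObject();
            System.out.println(p.x + "," + p.y); // prints 3,4
        }
    }
}
```

The round trip succeeds only as long as the class on the reading side is compatible with the one that wrote the stream, which is exactly the assumption that breaks when the application evolves.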

JDO 1.0 changes the semantics of Java objects using an enhancer tool, which causes a number of interesting problems. At the source level, objects look the same as they did before they were made persistent. In theory, this could save a programmer from changing code that had previously worked with these objects. However, the behavior of the objects changes in a number of ways. Among other things, the tool replaces field access instructions with method calls, and if fields are public, this requires changes to both the persistent classes and their clients. Places that previously could not throw an exception now can. Reflection stops working because the fields no longer exist in the class.
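The effect of enhancement can be sketched by hand. The following is not real enhancer output; it is an illustrative Java sketch (with invented names like `jdoGetBalance` and `PersistenceError`) of what conceptually happens when a plain field read is replaced by a call into the persistence layer.

```java
// Hand-written sketch (not actual JDO enhancer output) of what
// bytecode enhancement conceptually does to field access.
public class EnhancementSketch {
    // Hypothetical persistence-layer failure.
    static class PersistenceError extends RuntimeException {
        PersistenceError(String m) { super(m); }
    }

    // Before enhancement: clients read account.balance directly,
    // and that read can never throw.
    static class Account {
        public int balance = 100;
    }

    // After enhancement (conceptually): the field is mediated by a
    // method, so a read may trigger loading and may now fail in a
    // place that previously could not throw at all.
    static class EnhancedAccount {
        private int balance = 100;
        private boolean loaded = true; // assume already loaded here

        public int jdoGetBalance() {
            if (!loaded) throw new PersistenceError("object not loaded");
            return balance;
        }
    }

    public static void main(String[] args) {
        Account a = new Account();
        System.out.println(a.balance);         // plain field access
        EnhancedAccount e = new EnhancedAccount();
        System.out.println(e.jdoGetBalance()); // mediated access
    }
}
```

The sketch also shows why reflection on the original field name breaks: after enhancement, the public field is gone and only the mediating method remains.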

The problems of other products created in the orthogonal persistence paradigm are also well documented. The hibernate functionality of operating systems is possibly one of the few places where the paradigm is applicable and quite natural. Even there it sometimes fails, mostly because hardware and drivers are not quite ready for it.

Accessing Data - Intentional Paradigm

In this paradigm the application works with data whose lifetime is longer than the application's. The application is considered to depend on the data, rather than the reverse as in the orthogonal paradigm. This paradigm looks more natural to me, since the thing with the shorter lifetime should depend on the thing with the longer lifetime. Working with persistent data is considered part of the application's functionality, and constructs specific to persistence layers (like transactions) are considered an essential part of the application logic rather than unnecessary clutter.

The basic principle is that we ensure that the data survives; an application is a transient thing anyway. It can die at any time, and upon restart it will be able to work with the data again. Some data could be lost, but this is a known, calculated risk.

There are a lot of systems that use this paradigm:
  1. Relational and SQL databases
  2. File systems
  3. OASIS Open Document Format
  4. Java XML Binding

Products created in the context of the intentional paradigm are usually a bit harder to use because the code that works with persistent state is aware of this fact. There is no attempt to create an illusion that these objects are just local to the application.

However, because of this, there are no problems arising when the illusion breaks. The decisions related to persistence are explicit, and they are part of the application logic.

If data is considered a separate thing from the application from the start, features like upgradeability, support for multiple applications, and support for legacy versions of the application come relatively naturally.
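This can be sketched with nothing more than a versioned key-value file read through `java.util.Properties`. The keys and version numbers below are illustrative assumptions, not any standard format; the point is that the data carries an explicit format version, and a newer application migrates old data deliberately instead of relying on an illusion of in-memory objects.

```java
import java.io.*;
import java.util.Properties;

// Sketch of the intentional paradigm: data lives in an explicit,
// versioned format that the application reads and upgrades on purpose.
// The "format.version" and "user" keys are invented for illustration.
public class VersionedDataSketch {
    public static void main(String[] args) throws Exception {
        // An old application wrote version 1 of the format.
        Properties v1 = new Properties();
        v1.setProperty("format.version", "1");
        v1.setProperty("user", "alice");
        StringWriter saved = new StringWriter();
        v1.store(saved, "demo data");

        // A newer application loads the data and migrates it
        // explicitly: renaming a key and bumping the version.
        Properties data = new Properties();
        data.load(new StringReader(saved.toString()));
        if ("1".equals(data.getProperty("format.version"))) {
            data.setProperty("user.name", data.getProperty("user"));
            data.remove("user");
            data.setProperty("format.version", "2");
        }
        System.out.println(data.getProperty("user.name")); // prints alice
    }
}
```

Because the migration is ordinary application code against a documented format, a second application, or one written in another language, can read the same data by following the same rules.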

I think that the orthogonal paradigm is suitable only for short-lived data written and read by a single version of a single application. There are quite a lot of such situations. But such a paradigm quickly fails in the case of modern EIS, where applications are created, replaced (often by rewriting them in another programming language), changed, and retired. This is possibly one of the most significant reasons why OODBMS (which were mostly developed in the context of the orthogonal paradigm) failed to overtake SQL databases (which were developed in the context of the intentional paradigm) in EIS.
