Wednesday, November 7, 2007

New article: Java EE meets Web 2.0

My article about Java EE concurrency model problems has been published on developerWorks. The article is mostly a rehash of the problems and an overview of available solutions. This article fared much better through the publishing process: some content was cut, but none of it was essential.

Monday, October 8, 2007

What is the value of reverse and round-trip engineering in UML tools?

For some reason, reverse engineering and round-trip engineering are considered important features for UML tools. UML tool developers sometimes postpone other features in order to get them working. I have always wondered why. It looks like I am not the only one.

In theory, the feature can display a messy body of source code as equally messy diagrams.

The problem with reverse engineering is that the diagrams have the same level of abstraction as the original code. Add to the mix the fact that diagrams are more cumbersome than source code, and we get a less efficient representation of what we already have in the code. Why would we need that?

I have seen two roles in which UML might be useful in projects:

  1. UML can be used to document important parts of the application.
  2. UML can be used as a substrate for a domain-specific language.

Reverse engineering offers little for either use.

As documenting usually comes before implementing, there is little to reverse-engineer when we create the docs.

When code is generated from a UML-based DSL, the mapping is usually one-to-many and the mapping functions are not easily reversible. Changing one element in the UML model therefore affects many places in the source code, and even the code structure can change depending on the change (consider changing the multiplicity of an attribute in an OR-mapping, as in the sketch below). This greatly complicates the creation of reverse-engineering tools, so there is usually no reverse engineering for DSLs.
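To make the one-to-many nature of such mappings concrete, here is a hedged sketch using JPA-style annotations as a stand-in for whatever UML-based DSL a tool might use (the Customer and Address entities are made up). Changing a single multiplicity in the model changes the field type, the generated schema, and every client of the accessor, and none of those code-level changes map back to one diagram element.

    import java.util.List;
    import javax.persistence.Entity;
    import javax.persistence.Id;
    import javax.persistence.OneToMany;
    import javax.persistence.OneToOne;

    // Original model: Customer has exactly one Address. The generated schema
    // has a single foreign-key column, and client code reads customer.address.
    @Entity
    class CustomerV1 {
        @Id Long id;
        @OneToOne Address address;
    }

    // After raising the multiplicity to "many": a join table (or inverse
    // foreign key) appears, the field type changes, and every client that
    // read the single address must be rewritten to iterate over a collection.
    @Entity
    class CustomerV2 {
        @Id Long id;
        @OneToMany List<Address> addresses;
    }

    @Entity
    class Address {
        @Id Long id;
        String street;
    }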

One of the few cases where it was at least marginally useful was JBuilder's UML view of the currently selected class. This view was somewhat convenient for getting an idea of what the class uses. However, there was no significant advantage over the outline view and Ctrl-click navigation in Eclipse.

Wednesday, October 3, 2007

Blogspot autoformatting

I have not blogged on Blogspot for some time.

And now there is a nasty surprise. There is no option to disable auto-formatting when editing HTML (at least in the Russian skin for Blogger). Blogspot insists on inserting a pesky "<br>" wherever there is a newline in the HTML source.

I think that this is a bug.

When the user chooses to edit HTML instead of rich text, it is expected that the user knows HTML and wants to create something better than the not-so-nice HTML generated by the "create" option. So there should be an option to disable this questionable "helper".

The generated HTML is so problematic that it displays differently on the comments page. For example, unordered lists are broken.

Static vs. dynamic typing

Several people whom I somewhat respect have blogged about how good dynamic typing is and how restrictive static typing is. I feel uncomfortable about this, as I see how well static typing can facilitate the software development process.

I do not see it as restricting.

I see it as facilitating.

I do not dispute that good programs can be written in dynamically typed languages; plenty have already been written. And there are quite a lot of awful programs written in statically typed languages.

I also do not dispute that dynamically typed programs are usually simpler to type, since there is less to type.

However, programs do not exist just to be executed. There are tools beyond the compiler or interpreter that need to read the program. There are other humans who need to read the program. And the human wearing your skin will be a different human in a year or so, if you are worth anything as a programmer. So when writing programs, you need to address that future you as well.

A type is a specification of what a value is expected to be. The compiler (or interpreter) checks that these specifications are consistent. It can also optimize based on these specifications at its leisure.

However, other tools can use these specifications as well. Taking the Eclipse Java IDE as an example, the following facilities come to mind when I think about static typing:
  • Refactoring
  • Navigation to type definition
  • Incremental compilation and error checking as code is typed
APT-based code generation is another example of how static typing information can be used; a sketch follows below.
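As a rough illustration, here is a minimal annotation-processor sketch using the JSR 269 processing API that succeeded the original apt tool (the @GenerateBuilder annotation is hypothetical). It can only do its job because the declared types of the annotated elements are statically available to it.

    import java.util.Set;
    import javax.annotation.processing.AbstractProcessor;
    import javax.annotation.processing.RoundEnvironment;
    import javax.annotation.processing.SupportedAnnotationTypes;
    import javax.annotation.processing.SupportedSourceVersion;
    import javax.lang.model.SourceVersion;
    import javax.lang.model.element.Element;
    import javax.lang.model.element.TypeElement;
    import javax.tools.Diagnostic;

    // Reports every element carrying the (hypothetical) @GenerateBuilder
    // annotation. A real processor would emit source files here; the point is
    // that it can inspect the declared types of fields and methods because
    // they are statically known.
    @SupportedAnnotationTypes("com.example.GenerateBuilder")
    @SupportedSourceVersion(SourceVersion.RELEASE_6)
    public class BuilderProcessor extends AbstractProcessor {
        @Override
        public boolean process(Set<? extends TypeElement> annotations,
                               RoundEnvironment roundEnv) {
            for (TypeElement annotation : annotations) {
                for (Element element : roundEnv.getElementsAnnotatedWith(annotation)) {
                    processingEnv.getMessager().printMessage(Diagnostic.Kind.NOTE,
                            "would generate a builder for " + element.getSimpleName());
                }
            }
            return true;
        }
    }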

And this is possible even with a type system as broken as Java's.

Humans also benefit from type annotations. Type information communicates (albeit incompletely) the contract of a component. The expected kind of value is specified in a standard place and in a standard way. If type annotations are missing, this information still has to be communicated, but by other means (for example, comments).
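A small illustration (the method and types are made up): the statically typed signature states the contract in a standard place, while a dynamically typed version has to say the same thing in a comment.

    import java.util.Collection;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    class OrderReports {
        // The signature alone tells the reader (and the tools) what goes in
        // and what comes out: a collection of orders, grouped by customer id.
        static Map<String, List<Order>> groupByCustomer(Collection<Order> orders) {
            Map<String, List<Order>> result = new HashMap<String, List<Order>>();
            // grouping logic omitted
            return result;
        }

        // In a dynamically typed language the equivalent information
        // typically lives in a comment:
        //   # orders: a list of order records; returns a dict mapping
        //   # customer id -> list of orders
        //   def group_by_customer(orders): ...
    }

    class Order {
        String customerId;
    }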

Most dynamically typed languages that I know support optional type annotations. Some typing information can be inferred from the source code of dynamically typed programs. However, such inference will be incomplete, and that will limit both tools and the understandability of the programs. For example, refactoring will not work reliably.

A separate issue is that Java requires too many type annotations, even in places where the information could easily be derived (for example, in collection for-loops and for final variables), as illustrated below. Generics in Java are hopelessly broken as well, but I have already ranted about that.
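For example, in the Java of that time (no local type inference), the element types have to be spelled out even where the compiler already knows them:

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class RedundantAnnotations {
        public static void main(String[] args) {
            // The full generic type is written twice, once on each side.
            Map<String, List<Integer>> scores = new HashMap<String, List<Integer>>();
            scores.put("alice", new ArrayList<Integer>());

            // The loop variable type is fully determined by the collection,
            // yet it must be repeated.
            for (Map.Entry<String, List<Integer>> entry : scores.entrySet()) {
                System.out.println(entry.getKey());
            }

            // The type of a final local is obvious from the initializer.
            final String greeting = "Hello";
            System.out.println(greeting);
        }
    }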

Saturday, April 28, 2007

AsyncObjects 0.1.0 finally out

After a long time and some pain, I have released an updated version of the AsyncObjects framework.

The framework has been worked on mostly in the context of the Sebyla project. It is a blatant rip-off of ideas from the E programming language. Initially, it was created just as a prototype of some ideas for the Sebyla project back in 2002, and I have tried new ideas in it from time to time. I even ported the thing to Java 5, and in the process I learned how thin the layer of generics in Java is.

For some weird reason, nothing similar has appeared during the five years of the framework's life.

Then there was a project that might have used the framework. However, it would have used the framework on the Foundation Profile 1.1 runtime. So I partially took the old version, partially removed generics from the new version, and put the result into the separate project that is currently the home of the framework. Alas, that project was canceled just after it began, though I did do some work on the framework in its context. After that I continued the work in my free time. The framework was a low-priority project for me, however, so progress was glacially slow; there is a higher-priority personal project.

Now there is yet another project that might use the framework, and it already has a dependency on Java 5. The framework might experience yet another flip, this time back to Java 5. If this project does not happen, I guess I will need to advertise the framework on some wider forums.

Tuesday, April 24, 2007

Ruby for text processing

In the context of the ETL project, I ported a large Java class from an old API to a new API. I actually did it twice, to try two different implementation options. Since the ported API had a very regular structure, I decided to try Ruby for this task. It worked pretty well. Ruby has a great text-processing library. The gsub!() method, which can invoke a block to perform text substitution when using a regular expression, saved me a lot of effort. It allows quite complex transformations right in the place where they are needed.

I am not sure whether I saved effort or whether I would have done it faster manually; I do not know Ruby very well, after all. But the process was certainly more enjoyable.

Wednesday, February 28, 2007

Software system longevity paradigms

The functionality of software systems often needs to survive events like hardware and software failures, version upgrades, and system reimplementation.

There are two basic paradigms for providing this property: intentional and orthogonal. The differences between them might look insignificant at first sight, but they lead to different properties in the resulting products. I have an obvious bias in favor of the intentional paradigm.

Saving State - Orthogonal Paradigm

In this paradigm, the application state is saved. The saved artifacts are considered to be derived from the application. The saved state might be the entire application state or just a part of it. This functionality is often considered orthogonal to the rest of the application's functionality. A programmer working in this paradigm generally does not want to care whether the code is working with persistent objects or not. Features of the persistence layer that leak into the application layer are often considered sad facts of life and framework clutter.

The basic principle is that the entire software system survives because each application (or at least an important part of it) survives.

There are a lot of systems that use this paradigm.
  1. MS Office file formats
  2. OS hibernate
  3. Java object serialization
  4. CapROS
  5. JDO 1.0
  6. Most object databases
Some of these technologies can also be used in the context of the intentional paradigm, but they were designed in the context of the orthogonal one.

This paradigm has many features that make it attractive in the eyes of a developer. It is very simple to start using: an existing application object is just marked as persistent, so the developer does not have to learn much new technology.
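Java object serialization is the archetypal example: an existing class is simply marked Serializable and the whole object graph is written out (the class and file names here are illustrative).

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;

    // The only change needed to make the object "persistent" is the marker interface.
    class ShoppingCart implements Serializable {
        private static final long serialVersionUID = 1L;
        int itemCount;
    }

    public class SaveState {
        public static void main(String[] args) throws IOException {
            ShoppingCart cart = new ShoppingCart();
            cart.itemCount = 3;
            // The entire reachable object graph is written as-is; the saved
            // bytes are tied to the current shape of the class.
            ObjectOutputStream out =
                    new ObjectOutputStream(new FileOutputStream("cart.ser"));
            try {
                out.writeObject(cart);
            } finally {
                out.close();
            }
        }
    }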

The fundamental "feature" of this paradigm is that it assumes the application does not change significantly during its life cycle. Problems surface when this assumption is invalidated. When the application evolves, it is hard to migrate data to a new version of the application. It is even harder to use the data from other applications, particularly ones written in another programming language.

The problems of Java object serialization are relatively well documented. For example, the JavaDoc of almost every Swing component states that successful deserialization under another version of Java is unlikely.

JDO 1.0 changes the semantics of Java objects using an enhancer tool, and this causes a number of interesting problems. At the source level, the objects look the same as they did before they were made persistent. In theory, this saves the programmer from changing code that previously worked with these objects. However, the behavior of the objects changes in a number of ways. Among other things, the tool replaces field-access instructions with method calls, and if the fields are public, this requires changes to both the persistent classes and their clients. Places that previously could not throw an exception now can. Reflection stops working because such fields are no longer present in the class.
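A heavily simplified sketch of the kind of transformation an enhancer performs; the "enhanced" class and its accessor name are illustrative, not the actual JDO 1.0 contract.

    import java.lang.reflect.Field;

    // A plain class before enhancement: direct field access, cannot throw,
    // and the field is visible to reflection.
    class Order {
        public int total;
    }

    // What the class conceptually looks like after enhancement: the field is
    // mediated by persistence machinery, a read may trigger lazy loading and
    // may fail, and clients compiled against the public field must change.
    class EnhancedOrder {
        private transient Object stateManager;   // hypothetical persistence hook
        private int total;                        // no longer directly accessible

        public int jdoGetTotal() {                // hypothetical mediated accessor
            if (stateManager != null) {
                // a real enhancer would fetch the field from the datastore
                // here, which can fail in ways a plain field read never could
            }
            return total;
        }
    }

    public class EnhancementDemo {
        public static void main(String[] args) throws Exception {
            // Reflection that worked before enhancement...
            Field before = Order.class.getField("total");
            System.out.println("Before enhancement: public field " + before.getName());
            // ...fails afterwards, because the public field is gone.
            try {
                EnhancedOrder.class.getField("total");
            } catch (NoSuchFieldException e) {
                System.out.println("After enhancement: no public field 'total' any more");
            }
        }
    }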

The problems of other products created in the orthogonal persistence paradigm are also well documented. The hibernate functionality of operating systems is possibly one of the few places where the paradigm is applicable and quite natural; when it fails, it is mostly because the hardware and drivers are not quite ready for it.

Accessing Data - Intentional Paradigm

In this paradigm, the application works with data whose lifetime is longer than the application's own. The application is considered to depend on the data, rather than the reverse as in the orthogonal paradigm. This paradigm looks more natural to me, since the thing with the shorter lifetime depends on the thing with the longer lifetime. Working with persistent data is considered part of the application's functionality, and constructs specific to persistence layers (like transactions) are considered an essential part of the application logic rather than unnecessary clutter.

The basic principle is that we ensure that the data survives; the application is a transient thing anyway. It can die at any time, and upon restart it will be able to work with the data again. Some data might be lost, but this is a known, calculated risk.

There are a lot of systems that use this paradigm.
  1. Relational and SQL databases
  2. File systems
  3. OASIS Open Document Format
  4. Java XML Binding
Products created in the context of the intentional paradigm are usually a bit harder to use, because the code that works with persistent state is aware of that fact. There is no attempt to create an illusion that these objects are just local to the application.

However, because of this, there are no problems arising from a broken illusion. Decisions related to persistence are explicit, and they are part of the application logic; the sketch below illustrates the style.
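A sketch of the intentional style using plain JDBC (the connection URL is made up, and an "accounts" table is assumed to exist): transactions and the mapping to the external schema are visible in the application code rather than hidden behind an illusion of ordinary objects.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class TransferFunds {
        public static void main(String[] args) throws SQLException {
            // The data lives in a database that outlives this program;
            // the code deals with that fact explicitly.
            Connection connection =
                    DriverManager.getConnection("jdbc:hsqldb:mem:demo", "sa", "");
            try {
                connection.setAutoCommit(false);
                PreparedStatement update = connection.prepareStatement(
                        "UPDATE accounts SET balance = balance + ? WHERE id = ?");
                update.setLong(1, 100);
                update.setLong(2, 42);
                update.executeUpdate();
                // The transaction boundary is part of the application logic.
                connection.commit();
            } catch (SQLException e) {
                connection.rollback();
                throw e;
            } finally {
                connection.close();
            }
        }
    }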

If the data is considered a separate thing from the application from the start, features like upgradeability, support for multiple applications, and support for legacy versions of the application come relatively naturally.

I think that the orthogonal paradigm is suitable only for short-lived data that is written and read by a single version of a single application. There are quite a lot of such situations. But the paradigm quickly fails in the case of a modern EIS, where applications are created, replaced (often by rewriting them in another programming language), changed, and retired. This is possibly one of the most significant reasons why OODBMSs (which are mostly developed in the context of the orthogonal paradigm) failed to overtake SQL databases (which are developed in the context of the intentional paradigm) in the EIS space.

Monday, February 26, 2007

Java 5 vs. .NET 2.0 generics: past vs. future

There is something that struck me about Java 5 vs. .NET 2.0 generics.

The major design constraint for generics in Java 5 was backward compatibility. Much of the statically available information related to generics is not retained in the bytecode; it is erased during compilation. The generics thereby become sort of consistent with the rest of the language. There is even the ability to retrofit old APIs to the new language features, as can be seen from the collections framework.

As a result, the generics in Java 5 have a kind of "fake" feeling. There are a lot of desirable things that cannot be done using Java 5 generics. For example, an instance of a generic class cannot learn the type arguments with which it was created. And awkward constructs like "((ArrayList<String>)(Object)(new ArrayList<Integer>())).add("Test")" do not cause a runtime exception (although the compiler honestly warns that there might be a problem).
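A runnable sketch of that construct: because the type arguments are erased, the bad add() succeeds, and the failure only shows up later at the use site.

    import java.util.ArrayList;

    public class ErasureDemo {
        @SuppressWarnings("unchecked")
        public static void main(String[] args) {
            ArrayList<Integer> numbers = new ArrayList<Integer>();

            // The double cast defeats the compiler; at runtime both lists are
            // just ArrayList, so no exception is thrown here (only an
            // unchecked warning at compile time, suppressed above).
            ((ArrayList<String>) (Object) numbers).add("Test");

            // The failure appears far from the cause: the cast inserted by
            // the compiler at the read site throws ClassCastException.
            Integer first = numbers.get(0);
            System.out.println(first);
        }
    }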

However, by doing so, Java was positioned as a "legacy" language in which the burden of the past outweighs the needs of future development.

In .NET 2.0, generics were added as a new feature. It looks like backward compatibility was not a major constraint during development; it is addressed somewhat, since the new collections also implement the untyped collection interfaces. However, unlike their Java 5 counterparts, .NET 2.0 generics do not need the backward-compatibility argument invoked to explain weird features. They are much cleaner. So generics in .NET 2.0 look mostly motivated by their future use.

On the other hand, it is a puzzle to me why generics did not make it into .NET 1.0. The lack of generics was a well-documented problem of Java, and the .NET framework IS primarily inspired by Java (it was quite funny to read some MS documents that discuss C# while very carefully avoiding any mention of Java, yet mentioning such remote languages as C). Many fundamental Java language problems were fixed in .NET; this one belongs to the exquisite group of language problems that the .NET designers chose to reproduce. I kind of understand the time-to-market argument, but I do not believe that a few additional months would have made a difference here. Many Java proposals for generics had been on the table for a long time, and it was possible to select a good one from the start. At least it would have saved them from the shame of the CodeDom API.

So it looks like Java positions itself as a legacy language and .NET as an actively developed language runtime. This might be explained by the fact that Java has a much longer past than .NET, but it does not bode well for Java's future. It is a possible reason why Sun started developing the new language called Fortress instead of adding the needed features to Java. Open-sourcing Java might break this self-image problem and lead to more daring features being implemented and tried in the Java language, but I would not bet on it.

Wednesday, January 24, 2007

JDJ article available

My first published article became available on the JDJ web site some time ago. I cannot say that I am completely satisfied with it.
  • It does not completely match the original idea, and some important related things are only briefly mentioned in the final version because of size constraints.
  • The language and style are the result of difficult compromises between me and the other author, so they could have been better and more consistent.
  • Also, almost all references were cut during the internal review cycle "because other articles on the site do not do it". I personally prefer articles that cite their sources.
But it was the first article that I wrote myself, and overall, working on it was a good experience.