Monday, October 8, 2007

What is the value of reverse and round-trip engineering in UML tools?

For some reason, reverse engineering and round-trip engineering are considered important features for UML tools. UML tool developers sometimes postpone other features in order to get them working. I have always wondered why. It looks like I am not the only one.

In theory, the feature could display a messy body of source code as messy diagrams.

The problem with reverse engineering is that the diagrams have the same level of abstraction as the original code. If we add to the mix the fact that diagrams are more cumbersome than source code, we get a less efficient representation of what we already have in the code. Why would we need that?

I have seen two roles in which UML might be useful in projects:

  1. UML can be used to document important parts of the application.
  2. UML could be used as a substrate for a domain-specific language.

Reverse engineering offers little for either of these uses.

As documenting usually comes before implementing, there is little to reverse-engineer when we are creating the docs.

When the code is generated from a UML-based DSL, the mapping is usually one-to-many and the mapping functions are not easily reversible. Changing one element in the UML model therefore affects many places in the source code. Even the code structure can change depending on the change (consider changing the multiplicity of an attribute in an OR-mapping). This greatly complicates the creation of reverse-engineering tools. So there is usually no reverse engineering for DSLs.
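To make the multiplicity example concrete, here is a minimal sketch of what such a generator might emit. Everything in it (`ModelAttribute`, `OrMappingGenerator`, the naming of getters, columns and join tables) is a hypothetical illustration, not the output of any real UML tool.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model element: one attribute of a class in a UML-based DSL.
final class ModelAttribute {
    final String owner;  // owning class, e.g. "Order"
    final String name;   // attribute name, e.g. "item"
    final String type;   // attribute type, e.g. "Item"
    final boolean many;  // false = multiplicity 1, true = multiplicity 0..*

    ModelAttribute(String owner, String name, String type, boolean many) {
        this.owner = owner;
        this.name = name;
        this.type = type;
        this.many = many;
    }
}

// Hypothetical generator: lists the artifacts derived from a single attribute.
final class OrMappingGenerator {
    static List<String> artifactsFor(ModelAttribute a) {
        List<String> out = new ArrayList<String>();
        String cap = Character.toUpperCase(a.name.charAt(0)) + a.name.substring(1);
        if (!a.many) {
            // Multiplicity 1: plain field, getter/setter, foreign key column.
            out.add("field: private " + a.type + " " + a.name + ";");
            out.add("method: " + a.type + " get" + cap + "()");
            out.add("method: void set" + cap + "(" + a.type + " value)");
            out.add("sql: column " + a.owner + "." + a.name + "_id (foreign key)");
        } else {
            // Multiplicity 0..*: collection field, add/remove methods, join table.
            out.add("field: private List<" + a.type + "> " + a.name + "s;");
            out.add("method: List<" + a.type + "> get" + cap + "s()");
            out.add("method: void add" + cap + "(" + a.type + " value)");
            out.add("method: void remove" + cap + "(" + a.type + " value)");
            out.add("sql: table " + a.owner + "_" + a.name + " (join table)");
        }
        return out;
    }
}
```

Flipping the single `many` flag changes every generated artifact, including the database schema. A reverse-engineering tool would have to recognize the whole group of fields, methods and tables as one attribute in order to recover the model.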

One of the few cases where it was at least marginally useful was JBuilder’s UML view of the currently selected class. This view was somewhat convenient for getting an idea of what the class uses. However, it had no significant advantage over the outline view and Ctrl-LEFT_CLICK in Eclipse.

Wednesday, October 3, 2007

Blogspot autoformatting

I have not blogged on blogspot for some time.

And now there is a nasty surprise on it. There is no option to disable auto-formatting when creating HTML (at least in the Russian skin for Blogger). Blogspot insists on inserting the pesky "<br>" wherever there is a newline in the HTML source.

I think that this is a bug.

When the user chooses to edit HTML instead of rich text, it is to be expected that the user knows HTML and wants to create something better than the not-so-nice HTML generated by the "create" option. So there must be an option to disable the questionable "helper".

The generated HTML is so problematic that it displays differently on the comments page. For example, unordered lists are broken.

Static vs. dynamic typing

Several people whom I somewhat respect have blogged about how good dynamic typing is and how restrictive static typing is. I feel uncomfortable about this, as I see how well static typing can facilitate the software development process.

I do not see it as restricting.

I see it as facilitating.

I do not dispute that good programs can be written in dynamically typed languages. Plenty have already been written. And there are quite a lot of awful programs written in statically typed languages.

I also do not dispute that dynamically typed programs are usually simpler to write, since there is less to type.

However, programs do not exist just to be executed. Tools other than the compiler or interpreter need to read the program. Other humans need to read the program too. And the human who wears your skin will be a different human in a year or so, if you are worth anything as a programmer. So when writing programs, you need to address that future you as well.

A type is a specification of the value that is expected. The compiler (or interpreter) checks that these specifications are consistent. It can also optimize based on these specifications at its own leisure.

However, other tools are able to use these specifications as well. If we take the Eclipse Java IDE, the following facilities come to mind when I think about static typing:
  • Refactoring
  • Navigation to type definition
  • Incremental compilation and error checking as code is typed
APT-based code generation is another example of how static typing information could be used.

And this is possible even with a type system as broken as Java's.

Humans also benefit from type annotations. Type information communicates (albeit incompletely) the contract of a component. The expected kind of value is specified in a standard place and in a standard way. If type annotations are missing, this information still has to be communicated, but by other means (for example, comments).
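As a small illustration of the difference, here is a sketch with the same contract stated twice: once in the signature and once only in a comment. The class and method names are made up for this example, and the `Object` parameter merely imitates what an unannotated parameter looks like in a dynamically typed language.

```java
import java.util.List;

final class ContractDemo {
    // The contract is in the signature: callers must pass a list of strings,
    // and the compiler rejects anything else before the program runs.
    static int totalLength(List<String> words) {
        int total = 0;
        for (String word : words) {
            total += word.length();
        }
        return total;
    }

    // The same contract, communicated only by this comment:
    // "words" is expected to be a List of String. Nothing checks it
    // until the code actually runs into a wrong value.
    static int totalLengthUntyped(Object words) {
        int total = 0;
        for (Object word : (List<?>) words) {
            total += ((String) word).length(); // fails here at run time if the comment was ignored
        }
        return total;
    }
}
```

Both versions work for a correct caller. The difference shows up when `totalLengthUntyped` receives, say, a list of integers: the mistake surfaces as a `ClassCastException` inside the method rather than as a compile error at the call site.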

Most dynamically typed languages that I know support optional type annotations. Some typing information can also be inferred from the source code in dynamically typed languages. However, such inference will be incomplete, and that will limit both the tools and the understandability of the programs. For example, refactoring will not work reliably.
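The refactoring problem can be imitated even in Java by looking a method up by name at run time, which is roughly what a dynamically typed call does. This is only a sketch; `Account` and `DynamicCallDemo` are hypothetical names used for illustration.

```java
import java.lang.reflect.Method;

final class Account {
    private int balance;

    public void deposit(int amount) { balance += amount; }

    public int getBalance() { return balance; }
}

final class DynamicCallDemo {
    // Statically typed call: a rename refactoring of deposit() can find
    // and update this call site, because the target type is known.
    static int typedDeposit(Account account, int amount) {
        account.deposit(amount);
        return account.getBalance();
    }

    // Call resolved by name at run time: the tool sees only an Object and
    // a string, so renaming deposit() leaves callers of this method broken.
    static int dynamicDeposit(Object target, String methodName, int amount) {
        try {
            Method method = target.getClass().getMethod(methodName, int.class);
            method.invoke(target, Integer.valueOf(amount));
            Method getter = target.getClass().getMethod("getBalance");
            return ((Integer) getter.invoke(target)).intValue();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot call " + methodName, e);
        }
    }
}
```

If `deposit` were renamed, `typedDeposit` would be updated by the IDE or flagged by the compiler, while a call such as `dynamicDeposit(account, "deposit", 10)` would compile as before and fail only when executed.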

Another issue is that Java requires too many type annotations even in places where such information could be easily derived (for example, for loops over collections and final variables). Generics in Java are hopelessly broken as well, but I have already ranted about that.
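A short sketch of the kind of redundancy I mean. The class and method names are invented for this example; the point is only how often the same type has to be spelled out.

```java
import java.util.List;
import java.util.Map;

final class VerbosityDemo {
    // Sums the lengths of all strings stored in the index.
    static int countChars(Map<String, List<String>> index) {
        int total = 0;
        // The entry type is fully determined by the type of "index",
        // yet it has to be written out again in the loop header.
        for (Map.Entry<String, List<String>> entry : index.entrySet()) {
            // A final local initialized once: its type is exactly the type
            // of the right-hand side, but it must still be declared.
            final List<String> values = entry.getValue();
            for (String value : values) {
                total += value.length();
            }
        }
        return total;
    }
}
```

Every type written inside the method body repeats something the compiler already knows from the parameter declaration.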