Friday, November 19, 2004

Threads are not an architectural element

What do I mean by this, and why am I taking the energy to write about it?

Threads, as I have mentioned in an earlier post, are a means of solving a narrow set of problems. They are a tool, just as a database, XML/XSLT, and HTML are tools. They are not a way of life.

Let me tell you a sad and most unfortunate story...

Once upon a time at a high-tech start-up, a new manager was hired to clean up a mess. This mess was caused by a naive developer who was being very clever.

This clever developer had created an "architecture" with some very clever building blocks. These building blocks were Microsoft COM components, each of which had two interfaces and its own thread of execution. The interfaces were very clever and "elegant". One interface was "DoIt(String)" and the other was "String ReportIt(string)".

The application built on this architecture was a large stack of these building blocks. Each block would serialize a command and send it to the next. Each call was asynchronous because each clever block had its own thread. Function calls would rattle through these blocks, and each block did some "useful" work.
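To make the shape of this design concrete, here is a minimal sketch in Python rather than COM. It is not the original code: the names ThreadedBlock, do_it, and build_stack are hypothetical, and the string-appending "work" is only a stand-in. The point is just to show how every block owns a thread and a queue, so that a single command crosses one thread boundary per block.

```python
import queue
import threading


class ThreadedBlock:
    """One 'building block': a private worker thread fed by a queue (hypothetical)."""

    _STOP = object()  # sentinel telling the worker thread to exit

    def __init__(self, name, next_block=None, sink=None):
        self.name = name
        self.next_block = next_block
        self.sink = sink  # queue that receives results from the last block
        self._inbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def do_it(self, command: str) -> None:
        """Asynchronous call: enqueue the serialized command and return at once."""
        self._inbox.put(command)

    def _run(self):
        while True:
            command = self._inbox.get()
            if command is self._STOP:
                break
            result = f"{command}|{self.name}"  # stand-in for the "useful" work
            if self.next_block is not None:
                self.next_block.do_it(result)  # one more thread hop
            elif self.sink is not None:
                self.sink.put(result)

    def stop(self):
        self._inbox.put(self._STOP)
        self._thread.join()


def build_stack(names):
    """Chain blocks so that the block named names[0] receives commands first."""
    sink = queue.Queue()
    blocks = []
    next_block = None
    for name in reversed(names):
        block = ThreadedBlock(
            name,
            next_block=next_block,
            sink=sink if next_block is None else None,
        )
        blocks.append(block)
        next_block = block
    blocks.reverse()
    return blocks, sink


if __name__ == "__main__":
    blocks, sink = build_stack(["parse", "validate", "store"])
    blocks[0].do_it("cmd")
    print(sink.get(timeout=5))
    for block in blocks:
        block.stop()
```

Notice that nothing in this sketch says what the application is for. The block names are the only place any domain knowledge appears.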

Unfortunately, this design suffered from the following maladies and ultimately had to be euthanized:

  1. Poor, poor, poor, poor performance. Waking a thread costs time. Crossing several thread boundaries to get work done was many times slower than letting a single thread do the work.
  2. Large memory footprint. Each of these components was its own DLL. The COM overhead and the serialization/de-serialization code were often larger than the application code.
  3. Impractical to debug. Even with the use of trace statements it was nearly impossible to debug. We went to heroic lengths to tag trace statements so that we could follow the logical thread of execution, but we never came up with a good scheme.
  4. Non-deterministic behavior. Once you get beyond three or four threads in an application, you lose determinism. The application becomes very sensitive to the operations of other applications on the computer. Your critical threads get swapped out at inopportune times. Timing-dependent defects come and go arbitrarily.
  5. Costly to modify. Since each change alters the dynamics of the system (e.g., timing is changed), entirely new classes of defects emerge with even the most innocent change.

When deployed, this application had 40 threads of execution. We made it work by sheer will, but it nearly killed the product. Over time the application was reworked to have 2 threads. The new version took half the number of lines of code, was faster, and had more features.
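The performance point can be tried out with a rough sketch like the one below. It is an illustration under assumptions, not a measurement of the original system: the helper names (make_stages, run_direct, run_threaded, time_both) are hypothetical, the stages do trivial string work, and Python threads behave differently from COM threads on Windows, so any timings are only indicative. It runs the same stages once as ordinary function calls on one thread and once with one thread per stage connected by queues.

```python
import queue
import threading
import time


def make_stages(n):
    """Build n trivial stages; each appends its index as a stand-in for real work."""
    return [lambda s, i=i: f"{s}|{i}" for i in range(n)]


def run_direct(stages, command):
    """Single thread: each stage is an ordinary function call."""
    for stage in stages:
        command = stage(command)
    return command


def run_threaded(stages, commands):
    """One thread per stage, linked by queues: one thread hop per stage."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    stop = object()  # sentinel passed down the line to shut every worker down

    def worker(stage, inbox, outbox):
        while True:
            item = inbox.get()
            if item is stop:
                outbox.put(stop)
                return
            outbox.put(stage(item))

    threads = [
        threading.Thread(target=worker, args=(stage, queues[i], queues[i + 1]))
        for i, stage in enumerate(stages)
    ]
    for thread in threads:
        thread.start()
    for command in commands:
        queues[0].put(command)
    queues[0].put(stop)

    results = []
    while True:
        item = queues[-1].get()
        if item is stop:
            break
        results.append(item)
    for thread in threads:
        thread.join()
    return results


def time_both(n_stages=40, n_commands=200):
    """Return (direct_seconds, threaded_seconds) for the same workload."""
    stages = make_stages(n_stages)
    commands = [f"cmd{i}" for i in range(n_commands)]
    t0 = time.perf_counter()
    direct = [run_direct(stages, c) for c in commands]
    t1 = time.perf_counter()
    threaded = run_threaded(stages, commands)
    t2 = time.perf_counter()
    assert direct == threaded  # both versions compute exactly the same results
    return t1 - t0, t2 - t1


if __name__ == "__main__":
    direct_s, threaded_s = time_both()
    print(f"direct: {direct_s:.4f}s  threaded: {threaded_s:.4f}s")
```

Both versions produce identical results, so the extra threads buy nothing here except queue traffic and context switches. How large the gap is will depend on the machine and the runtime.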

I have since found other instances where developers have followed similar approaches with the same results. Threads are a tool that you use to implement solutions; they have absolutely nothing to do with software architecture.

So why the fascination with using threads as an architectural element? What is the allure?

Most engineers use a "divide and conquer" approach to solving problems. Big problems are recursively broken into smaller problems, and each sub-problem is then solved. The threaded component design provides a box into which you can place the solution to a sub-problem. This box has its own thread, so all you have to do is present it with an input and it will notify you when it is done processing it. It's a magic box that problems go into and solutions come out of. How nice!

Here we illustrate a big problem being decomposed into smaller sub-problems.

[Figure: Problem Decomposition]

As an engineer I can decompose my problem into a series of independent sub-problems and assign each one to a magic box. I implement the solution to a sub-problem, stick it in the box, and put it on the shelf. I then string together all of these sub-problem solutions to create my solution. Better yet, I can reuse my magic boxes in other solutions. Since each is self-contained with its own thread of execution, I can drop it into a different solution and it will just work.

This diagram shows a problem decomposition directly mapped into a set of threaded components. Unless your problem domain is boxes, such an architecture will get you nowhere.

[Figure: Problem Decomposition]

Yes, we engineers have rich fantasy lives. In almost all cases, this software architecture will simply not work.

In the end, a software architecture must contain objects and components that are part of the problem domain it is trying to address. An architecture built on the thread as an architectural element is devoid of any domain information.
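As a contrast, here is a small sketch of what keeping threads out of the architecture might look like. The domain (temperature readings and an alarm rule) is invented purely for illustration, and Reading, AlarmPolicy, and Monitor are hypothetical names. The components are named after things in the problem domain, they call each other with ordinary synchronous calls, and a single thread shows up only as a private tool for one narrow job: waiting on a blocking sensor read.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """Hypothetical domain object: one temperature sample."""
    sensor: str
    celsius: float


class AlarmPolicy:
    """Hypothetical domain rule: a reading above the limit is an alarm."""

    def __init__(self, limit_celsius: float):
        self.limit_celsius = limit_celsius

    def is_alarm(self, reading: Reading) -> bool:
        return reading.celsius > self.limit_celsius


class Monitor:
    """Domain component; its use of a thread is an implementation detail."""

    def __init__(self, policy: AlarmPolicy):
        self.policy = policy
        self.alarms = []

    def handle(self, reading: Reading) -> None:
        """Plain synchronous call: no queue and no thread hop."""
        if self.policy.is_alarm(reading):
            self.alarms.append(reading)

    def poll_in_background(self, read_sensor, count: int) -> threading.Thread:
        """Use one thread for the narrow job it suits: blocking sensor reads."""
        def loop():
            for _ in range(count):
                self.handle(read_sensor())

        thread = threading.Thread(target=loop)
        thread.start()
        return thread


if __name__ == "__main__":
    samples = iter([Reading("boiler", 20.0), Reading("boiler", 95.5)])
    monitor = Monitor(AlarmPolicy(limit_celsius=80.0))
    monitor.poll_in_background(lambda: next(samples), count=2).join()
    print(monitor.alarms)
```

Reading the class names tells you what the program is about, and the thread could be removed or replaced without changing that structure.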

I hope that I have explained the flaws of this approach. I hope that I have also explained its strange appeal. The final advice is: Don't do it!
