Friday, November 19, 2004

Threads are not an architectural element

What do I mean by this, and why am I taking the energy to write about it?

Threads, as I mentioned in an earlier post, are a means of solving a narrow set of problems. They are a tool, just as a database, XML/XSLT, and HTML are tools. They are not a way of life.

Let me tell you a sad and most unfortunate story...

Once upon a time at a high-tech start-up, a new manager was hired to clean up a mess. This mess was caused by a naive developer who was being very clever.

This clever developer had created an "architecture" with some very clever building blocks. These building blocks were Microsoft COM components, each of which had two interfaces and its own thread of execution. The interfaces were very clever and "elegant". One interface was "DoIt(String)" and the other was "String ReportIt(string)".

The application built on this architecture was a large stack of these building blocks. Each block would serialize a command and send it to the next. Each call was asynchronous because each clever block had its own thread. Function calls would rattle through these blocks and each block did some "useful" work.
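
To make the arrangement concrete, here is a minimal Python sketch of it. The original was built from COM components, so everything here is an assumption made for illustration: the Block class, the do_it name, the queues standing in for serialized commands, and the string-appending "work".

```python
import queue
import threading


class Block:
    """Hypothetical stand-in for one building block: a DoIt-style entry point plus its own thread."""

    def __init__(self, name, downstream=None):
        self.name = name
        self.downstream = downstream        # next block in the stack, or None for the last one
        self.inbox = queue.Queue()          # commands wait here until the block's thread wakes up
        self.results = queue.Queue()        # only the last block puts anything here
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def do_it(self, command):
        # Asynchronous: returns as soon as the command is queued.
        self.inbox.put(command)

    def _run(self):
        while True:
            command = self.inbox.get()
            if command is None:             # shutdown marker, passed down the stack
                if self.downstream is not None:
                    self.downstream.do_it(None)
                return
            processed = command + ">" + self.name   # placeholder for the block's "useful" work
            if self.downstream is not None:
                self.downstream.do_it(processed)    # one more thread boundary to cross
            else:
                self.results.put(processed)


def run_threaded(command, names):
    """Push one command through a stack of threaded blocks and wait for the result."""
    blocks = []
    downstream = None
    for name in reversed(names):            # build from the bottom of the stack up
        downstream = Block(name, downstream)
        blocks.append(downstream)
    bottom, top = blocks[0], blocks[-1]
    top.do_it(command)
    result = bottom.results.get(timeout=5)
    top.do_it(None)                         # shut every block down
    for block in blocks:
        block.thread.join(timeout=5)
    return result


def run_direct(command, names):
    """The same work done by one thread with plain sequential steps."""
    for name in names:
        command = command + ">" + name
    return command
```

Both functions produce the same string for the same inputs. The threaded version needs a queue, a thread, and a shutdown protocol per block to get there, while run_direct is a three-line loop.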

Unfortunately this design suffered from the following maladies and ultimately had to be euthanized:

  1. Poor, poor, poor, poor performance. Waking a thread costs time. Crossing several thread boundaries to get work done was many times slower than letting a single thread do the work.
  2. Large memory footprint. Each of these components was its own DLL. The COM overhead and the serialization/de-serialization code was often larger than the application code.
  3. Impractical to debug. Even with the use of trace statements it was nearly impossible to debug. We went to heroic lengths to tag trace statements so that we could follow the logical thread of execution, but we never came up with a good scheme.
  4. Non-deterministic behavior. Once you get beyond three or four threads in an application you lose determinism. The application becomes very sensitive to the operations of other applications on the computer. Your critical threads get swapped out at inopportune times. Timing-dependent defects come and go arbitrarily.
  5. Costly to modify. Since each change alters the dynamics of the system (e.g. the timing changes), entirely new classes of defects emerge with even the most innocent change.

When deployed, this application had 40 threads of execution. We made it work by sheer will, but it nearly killed the product. Over time the application was reworked to have 2 threads. The new version took half the number of lines of code, was faster, and had more features.

I have since found other instances where developers have followed similar approaches with the same results. Threads are a tool that you use to implement solutions; they have absolutely nothing to do with software architecture.

So why the fascination with using threads as an architectural element? What is the allure?

Most engineers use a "divide and conquer" approach to solving problems. Big problems are recursively broken into smaller problems, and each sub-problem is then solved. The threaded component design provides a box into which you can place the solution to a sub-problem. This box has its own thread, so all you have to do is present it with an input and it will notify you when it is done processing it. It's a magic box that problems go into and solutions come out of. How nice!

Here we illustrate a big problem being decomposed into smaller sub-problems.

Problem Decomposition

As an engineer I can decompose my problem into a series of independent sub-problems and assign each one to a magic box. I implement the solution to a sub-problem, stick it in the box, and put it on the shelf. I then string together all of these sub-problem solutions to create my solution. Better yet, I can reuse my magic boxes in other solutions. Since each is self-contained with its own thread of execution, I can drop it into a different solution and it will just work.

This diagram shows a problem decomposition directly mapped into a set of threaded components. Unless your problem domain is boxes, such an architecture will get you nowhere.

Problem Decomposition

Yes, we engineers have rich fantasy lives. In almost all cases this software architecture will absolutely not work.

In the end, a software architecture must contain objects/components that are part of the problem domain that it is trying to address. The thread-as-an-architectural-element architecture is devoid of any domain information.

I hope that I have explained the flaws of this approach. I hope that I have also explained the strange appeal of this approach. The final advice is: Don't do it!

Monday, November 15, 2004

Threads increase design time

In keeping with my regular "post every couple of months" trend, here is the next installment in my discussion of threads.

In addition to pulling from my personal experience, I've been doing research on threading (which is half of the reason I write this blog). I have been amazed at the chorus of other voices warning of threading problems. There are many others with credentials better than mine who are calling out the warning. So why bother again? Well, the reason is: you can't have too many people warning you not to run with scissors! Perhaps my small contribution will be the last straw that makes someone reconsider their multi-threaded application design. If so, my time has been well spent.

To the topic at hand.

Threads Increase Design Time

This is pretty simple. In the same way that it is infinitely easier to toss a single ball in the air than to juggle three balls, it is easier to write a single-threaded application than a multi-threaded application.

When designing a multi-threaded application you have many additional elements to consider:

  1. Concurrency

    • Each time you access memory you must carefully think through whether two threads could be executing this code segment at the same time.

  2. Performance

    • How do you ensure that your threads are sleeping most of the time?

    • How do you prevent your threads from executing at the same time, which will impair your performance?

    • Does your locking mechanism require a call to the kernel? If so, are you prepared for the performance hit of getting locks?

    • Are you locking too much code for too long? Are you forcing your other threads to wait unnecessarily?


  3. Deadlock
    • If you have more than one lock can your threads deadlock?

    • What is your strategy for resolving/preventing deadlocks?


  4. State
    • Making an asynchronous call breaks the flow of your code. You make the call and expect a response at a later time. When you make the call you must save your state so that when the call eventually returns you can continue from where you left off. This is a major design consideration that is, more often than not, implemented poorly. Organizing your modules to have a single place where asynchronous returns are handled helps.
    • What happens when an asynchronous call doesn't return? Are you tracking the time of your calls? What is your policy?

  5. Resource consumption and memory management
    • If one thread allocates memory, who deallocates it? Managing the lifetime of global resources (memory, file pointers, database connections) becomes especially difficult. Which thread "owns" the resource? How are you sure that all threads are done using the resource?

    • How many threads and locks are you using? These are kernel resources, and they are not free.

  6. Design for testing
    • Current debuggers are (and have been for over a decade) ineffective at tracking problems in multi-threaded applications. How will you debug your code? What tools will you use?

    • You will make use of extensive trace statements in your code (this is the answer to the previous question). What trace library will you use? How will you deploy your trace statements such that they don't significantly change the behavior of your code? What tools will you use to analyze the many megabytes of trace output produced?
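
To give a feel for the bookkeeping the "State" item asks for, here is a minimal Python sketch of a table of outstanding asynchronous calls. The PendingCalls class, the on_reply key, the resume_order handler, and the policy choices (ignore late replies, report overdue calls) are all assumptions made for illustration. The sketch takes the current time as an argument to stay deterministic; real code would more likely read a monotonic clock. The table would also have to be confined to one thread or guarded by a lock.

```python
class PendingCalls:
    """Hypothetical bookkeeping for asynchronous calls that have not returned yet."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self._pending = {}      # call_id -> (saved_state, deadline)
        self._next_id = 0

    def start(self, saved_state, now):
        """Save the state needed to resume later and return an id for the call."""
        call_id = self._next_id
        self._next_id += 1
        self._pending[call_id] = (saved_state, now + self.timeout_seconds)
        return call_id

    def complete(self, call_id, response):
        """The single place where asynchronous returns are handled."""
        entry = self._pending.pop(call_id, None)
        if entry is None:
            return None         # late or unknown reply: the policy here is to ignore it
        saved_state, _deadline = entry
        return saved_state["on_reply"](saved_state, response)

    def expire(self, now):
        """Timeout policy: remove and report every call whose deadline has passed."""
        overdue = [cid for cid, (_state, deadline) in self._pending.items()
                   if deadline <= now]
        return [(cid, self._pending.pop(cid)[0]) for cid in overdue]


def resume_order(state, response):
    """Example continuation: pick up where the caller left off."""
    return f"order {state['order_id']} -> {response}"
```

A caller would invoke start() when it issues the call, route every reply through complete(), and call expire() periodically to decide what to do about calls that never came back.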
It would seem that I am painting a bleak picture of multi-threaded development. Well, I am, and it is well founded. This is one of the primary reasons that J2EE, COM+, and other enterprise frameworks make you write single-threaded code. These are highly multithreaded frameworks, but they make writing applications easier by hiding this fact.
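
As a loose illustration of that division of labor, and not a description of how J2EE or COM+ work internally, here is a minimal Python sketch. The handle_request and serve names and the request shape are hypothetical; the thread pool is the standard library's ThreadPoolExecutor, standing in for the framework.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_request(request):
    """Application code: written as if single-threaded, touching only its own argument."""
    return {"id": request["id"], "total": sum(request["items"])}


def serve(requests, workers=4):
    """Stand-in for the framework: it owns the threads, and the handler never sees them."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(handle_request, requests))
```

The handler contains no locks, no queues, and no thread management. That only stays safe as long as it shares no mutable state with other requests, which is the kind of restriction such frameworks typically impose.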