May 20, 2011

Transactional memory - nothing but trouble

This post is a follow-up to a review I recently wrote of a book called "Transactional memory". Here I focus not on the book and its qualities, but on the subject it discusses - transactional memory. Still, when I say "they" and "their" I refer to the authors and, indirectly, to the researchers they represent.

Why at all ?

To begin with, what are the problems they say transactional memory solves ?
  • The arrival of multi-core chips forces programs to go parallel.
  • Writing efficient parallel programs is hard.

My first point of criticism is right there - there is no evidence that parallel transactions make a good generic programming model at all. Even if everything else were perfect, why would TM programs be efficient ?

But there are transactions in databases !

They quote databases as a proven example of successful transactional systems and kind of go from there. Ok, I buy that, but there are huge differences -
  • Databases operate on very high level objects - tables, records and the like.
  • Databases deal in well-defined objects with rather limited behavior. You can only perform a handful of operations on them; you cannot invent new relational operators.
  • Because the operations are high level, databases can to a degree "understand" what you are doing, and control the semantics.
  • With memory bytes you can do anything. There is no way to tell what you are doing in that transaction. It's very low level.
As for their success - database transactions don't make the task of writing an efficient and/or correct program any simpler. The guarantee of an ACID system is not that it's efficient or easy to use, but that it tolerates various screw-ups and maintains data integrity. You normally say something like "it's in a transaction, so nothing bad can happen to the data". You don't say "it's in a transaction, therefore it runs fast" or "it's in a transaction, therefore it is correct".

Anything good at all ?

The only thing that could possibly be good about transactional memory is that, like any other enforcing formalism, it makes it possible to reason about the correctness and behavior of a program. Proof of correctness, anyone ?

That'd be great, but proof of what ? Again, since TM works at a very low level, analysis of a TM program will only reveal that there is no conflict between accesses to such-and-such memory addresses. Uh-huh, thanks.

Transactional memory is shared

For a technology whose goal is supporting multi-core computers (and we are talking about hundreds of cores, right ?), TM is surprisingly insistent on the shared memory model.

Although in theory there is nothing wrong with 100 logical cores accessing the same memory, we live in the real world, and in the real world hardware rules. And hardware dictates that cores should be as independent as possible. Even in an SMP architecture there are per-CPU caches, and you really should keep data accesses affine to one CPU, otherwise cache coherence traffic will kill your performance. In a NUMA architecture you don't have the privilege of a uniform view of shared memory. And in clusters you don't have shared memory at all.

Now, why stick to an architecture that already scales poorly ?

Transactional memory doesn't play well with anything

There is hardly anything in existing hardware or software that is not alienated by TM. Instead of getting support from your commodity hardware and momentum from existing software development practices, you have to fight them.

Hardware:
  • TM doesn't work well with caches. Shared memory in general doesn't.
  • TM doesn't work well with CPU optimizations such as memory access reordering.
  • TM doesn't work well with non-shared-bus architectures. Goodbye NUMA, clusters, transputers.

Software:
  • TM doesn't work with locks and waits. Most current parallel programs use locks and waits.
  • TM doesn't like a global heap, and especially garbage collection.
  • TM doesn't like exceptions, because it is unclear what to do when an exception escapes a transaction.
  • TM doesn't like I/O, because I/O is irrevocable. Since you cannot un-print something from your output, you have to invent workarounds like "transactions that are guaranteed to commit" (see the sketch after this list).
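
For a taste of what such a workaround looks like in practice, here is a minimal sketch using GHC Haskell's stm library, one of the few STM implementations in production use. It bans I/O inside a transaction at the type level, so the usual dodge is to compute everything transactionally and perform the irrevocable I/O only after the commit. The counter example itself is mine:

    import Control.Concurrent.STM

    -- I/O cannot appear inside STM, so compute the message in the
    -- transaction and print it only after the transaction commits.
    bumpAndReport :: TVar Int -> IO ()
    bumpAndReport counter = do
      msg <- atomically $ do
        n <- readTVar counter
        writeTVar counter (n + 1)
        return ("count is now " ++ show (n + 1))
      putStrLn msg  -- runs exactly once, even if the transaction retried

    main :: IO ()
    main = newTVarIO 0 >>= bumpAndReport
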
Even worse, TM conflicts with itself. Consider a transaction inside a transaction. This can happen easily: you begin a transaction and then call a library which begins a transaction of its own. And it gets ugly fast. What if the inner transaction has committed but the outer one has to roll back ? What if there are multiple inner transactions ?

Is it any surprise that existing databases don't support arbitrarily nested transactions ? Are you saying there should be a single memory transaction per execution context, as in a database ?
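
For what it's worth, that is essentially the answer given by GHC Haskell's stm library, probably the most widely used STM: there is no inner commit at all. Library code exposes composable STM actions, and the single commit happens at the outermost atomically, so "a transaction inside a transaction" cannot even be expressed. A sketch, with a bank-transfer example of my own:

    import Control.Concurrent.STM

    -- Library code returns an STM action, not a committed transaction.
    transfer :: TVar Int -> TVar Int -> Int -> STM ()
    transfer from to n = do
      a <- readTVar from
      writeTVar from (a - n)
      b <- readTVar to
      writeTVar to (b + n)

    -- The caller composes the actions; the one and only commit happens
    -- at this 'atomically'. An inner commit that survives an outer
    -- rollback is not expressible.
    payday :: TVar Int -> TVar Int -> TVar Int -> IO ()
    payday employer alice bob = atomically $ do
      transfer employer alice 100
      transfer employer bob   100

    main :: IO ()
    main = do
      [e, a, b] <- mapM newTVarIO [1000, 0, 0]
      payday e a b
      balances <- mapM (atomically . readTVar) [e, a, b]
      print balances  -- [800,100,100]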

So what can you do in a transaction ?

Because of the hardware constraints mentioned above, there is not much you can do in a memory transaction.

1. The amount of memory that you can access in a single transaction is limited.

For many TM systems it is limited by the hardware cache size, and that is not much at all. Caches contain lines, and by touching a line you effectively poison the whole of it, so if you access 2 bytes in separate lines you use up 2 cache lines. Also, concurrent transactions have to share the cache. What good is a 4M cache if it is only 128K lines and there are 128 parallel transactions ? That leaves 1K lines - some 32K bytes - per transaction.

2. The amount of time that you can spend in a transaction is limited.

For many systems the length of the transaction must be less than the OS scheduling quantum. Put another way, you cannot have a context switch inside a transaction. And how much is that ? 1/1000th of a second ?

There are TM systems that allow dirty transaction data to spill from the cache to main memory, and I presume there are complex ones that allow context switches and arbitrary execution times, but their performance remains to be seen.

So, what can you do in a transaction with such draconian restrictions ?

The answer is simple - memory transactions are only usable in low level libraries implementing higher level objects such as data containers. For example, you can have a multiple-producer multiple-consumer queue, a tree that allows highly parallel access, or some other equally wonderful thing.
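
To see what that looks like, here is a minimal multiple-producer multiple-consumer queue on top of GHC Haskell's stm library; the two-list representation and all the names are mine. Producers call atomically (push q x), consumers call atomically (pop q), and retry makes an empty pop block until some producer commits:

    import Control.Concurrent.STM

    -- An MPMC queue: a front list to pop from and a reversed back
    -- list to push onto.
    data Queue a = Queue (TVar [a]) (TVar [a])

    newQueue :: IO (Queue a)
    newQueue = do
      front <- newTVarIO []
      back  <- newTVarIO []
      return (Queue front back)

    push :: Queue a -> a -> STM ()
    push (Queue _ back) x = do
      xs <- readTVar back
      writeTVar back (x : xs)

    pop :: Queue a -> STM a
    pop (Queue front back) = do
      xs <- readTVar front
      case xs of
        (x : rest) -> do
          writeTVar front rest
          return x
        [] -> do
          ys <- readTVar back
          case reverse ys of
            []         -> retry   -- empty: block until someone pushes
            (x : rest) -> do
              writeTVar back []
              writeTVar front rest
              return x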

But is it worth it ? I agree, it takes an expert to write a tree using locks or some magic non-blocking technique, but that only needs to be done once. Are you saying that TM is good because now anyone can write a queue in an hour, and it will be correct and efficient ?

And you cannot use memory transactions in your applications. It is either impossible or grossly inefficient.

And the answer is ?

I do believe message passing is a better way to go. The technique goes under many names, but the idea is the same - there are separate processes with separate data, and they interact only in a well-defined manner.
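
To make the idea concrete, a minimal sketch, again in Haskell so the comparison with the STM examples above is fair; the worker/requests/replies names are mine. The worker owns its running total outright, and the only way to interact with it is by message:

    import Control.Concurrent (forkIO)
    import Control.Concurrent.Chan

    -- The worker owns its state; nobody else can touch the total.
    worker :: Int -> Chan Int -> Chan Int -> IO ()
    worker total requests replies = do
      n <- readChan requests
      let total' = total + n
      writeChan replies total'
      worker total' requests replies

    main :: IO ()
    main = do
      requests <- newChan
      replies  <- newChan
      _ <- forkIO (worker 0 requests replies)
      writeChan requests 5
      writeChan requests 7
      r1 <- readChan replies
      r2 <- readChan replies
      print (r1, r2)   -- prints (5,12)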

It has no ACID properties, but heck, neither does the real world. Like I always say - there are no transactions in the real world.
