How to Think about Parallel Programming: Not!
Anyone remember the old days, when for good performance you had to worry carefully about which register should hold which variable, and when? Sometimes we still do this to get extremely high performance from critical inner loops, especially when using specialized processing hardware such as GPUs.
On the other hand, we have been able to write ever more complex and ever more capable software systems only by sacrificing such micromanagement and using general-purpose tools and abstractions for coding the bulk of our software. Along the way, we have discovered that code generated by automated tools often does a better job than hand-crafted code.
And we learn to code in such a way that the behavior of our code does not depend critically on the detailed optimization decisions that we have delegated to the tools. If we want to let a compiler's register allocator have the freedom to put variables in registers, we stop writing code that takes the address of a variable, as in the C expression &myvar . If we want to allow an automatic storage allocator to do its job, we must write code that works properly independently of where an object or array happens to have been allocated, and perhaps independently of whether the object or array happens to be automatically relocated in the middle of a computation. Once we do this, we don't have to think about memory placement. Good programming language design can get us from the place where we must remember "don't use this difficult feature" to the place where it's not even on the radar screen because the language provides other, better ways to think and get things done. (Example: Java doesn't even have a way to take the address of a variable.)
Likewise, the best way to write code for multiple processors is not to have to think about multiple processors. We need to get to the point where we worry about the assignment of tasks to processors just about as much as we worry about the assignment of data to memory---which is to say, only for truly critical portions of the code---and for the most part leave such decisions to automated tools.
This will require further adjustments in our programming habits---adjustments that, we argue, in the end will make programs easier to understand and maintain as well as easier to run on parallel processors. The key is not to focus on a particular technology but on useful invariants. Here, as in the past, good programming language design can help to encourage good programming habits.