What Happened to BoostPro.com?

In case you noticed that boostpro.com (and my usual email address) are not responding, well, the domain registrar has the domain pointing at their own DNS servers, their DNS servers have no records for boostpro.com, and they are somehow unable to change either of those facts. I’ll post the full nightmare to Techarcana once we have safely arrived at a working domain registrar, which could be anytime in the next four days according to ICANN rules. In the meantime, you can reach us at boost-consulting.com and our @boost-consulting.com email addresses.

What’s so cool about Boost.MPI?

At some point after a client brings us a project proposal, we usually have a conversation about separating the domain-specific part of the job from the infrastructure bits, so we can release the latter part as open source. While we at BoostPro enjoy getting our work out there as FOSS, it can often be a huge win for the customer, reducing long-term maintenance cost and improving code quality, not to mention being good PR. Well, a few years back I got a call from Daniel Egloff, a statistician doing high-performance financial simulations in a Swiss bank—the results of which were of crucial importance to the bank’s future.

Daniel was something of a renegade. He had to be—the official policy at his bank was that everyone was to use Windows and program in Java, so putting together a Debian cluster and assembling a team that could drive it with C++ required swimming against the corporate current, to say the least. But Daniel was also a visionary. It isn’t every day that a new client is so tuned-in to the value and economy of open source libraries that they not only want to release everything we do that way, but they will pay us to shepherd the library through the Boost review process. As a result of our work with Daniel, BoostPro produced three new Boost libraries: Boost.Accumulators, Boost.Time_series, and Boost.MPI. What a year!

Today I want to write a little about Boost.MPI, because it is some very cool technology and because lots of people don’t seem to understand what’s so cool about it.

What’s MPI?

MPI (which stands for Message Passing Interface) is a C-like library API for synchronization and communication between parallel processes, usually running on separate networked computers, sometimes with heterogeneous architectures. If you need to compute something that requires more juice than you can squeeze out of the most powerful single machine, MPI can be a great technology around which to build your application. The library handles all kinds of basic issues, abstracting away details like

  • the OS
  • the communication substrate (ethernet, infiniband, shared memory…)
  • the existence of multiple network interfaces
  • machine architecture (including endian-ness)

Without having to deal with any of these low-level concerns, the MPI user can write straightforward, portable code that orchestrates large-scale parallel computations. There’s lots more to MPI, but those are the basics.

So What’s Boost.MPI?

Boost.MPI is a full-on C++ library wrapper over the MPI API that—quoting from its web page—“better supports modern C++ development styles… and the use of modern C++ library techniques to maintain maximal efficiency.” Which sounds nice-to-have-but-not-too-exciting at first blush. To really appreciate what’s so cool about it, you have to care about making the most of your cluster hardware, and you’ll need to delve into a few of the details about how that hardware works.

Into the Details

Network cards have a fixed-size buffer; sending anything to another process involves getting it into that buffer.[1] If the buffer fills up, packets are waiting to go out, and you can’t send anything further until they do.

One mission of MPI (the non-boost variety) is to provide a portable high-level API for sending out these messages. Therefore, MPI deals with the low-level stuff and has (and needs) direct access to the network buffers, but code written on top of MPI does not. Actually, some clusters even have multiple network connections, such as connections with tree structure in addition to a channel to six nearest neighbors on a cube, and MPI picks the most appropriate avenue for your communication pattern. So not only do you not need to access the network buffer directly, you actually never want to.

MPI DataTypes and Type Maps

Now let’s assume a heterogeneous cluster, where size, alignment, and endian-ness may not match from machine to machine. In that case, you can’t just “blit the bytes;” somebody needs to figure out how to encode data for transmission and decode it upon receipt so that it has the same meaning on both ends. Let’s further assume a system that transmits in little-endian, so before sending from a big-endian machine one needs to swap bytes (all the other possible schemes have the same consequences—assuming this one just allows us to work with specific examples).

The code doing the encoding and decoding has to know about the data structure, rather than operating on the data as raw bytes only. For example, if the data structure is a sequence of 32-bit integers, you need to reverse each group of 4 consecutive bytes. If it’s a sequence of 16-bit integers, you need to reverse pairs of bytes, and if it’s a sequence of chars, you don’t need to do anything.
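
To make that concrete, here is a minimal sketch of the byte-swapping step, assuming the data has already been laid out as a flat run of equally sized elements. The helper name swap_elements is made up for this illustration; it is not part of MPI or Boost.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: reverse the bytes of each `width`-byte element in place.
// width = 4 for 32-bit integers, 2 for 16-bit integers, 1 for chars (a no-op).
inline void swap_elements(std::vector<std::uint8_t>& bytes, std::size_t width)
{
    // The encoder must know the element width; raw bytes alone don't tell us.
    assert(width > 0 && bytes.size() % width == 0);
    for (std::size_t i = 0; i < bytes.size(); i += width)
        std::reverse(bytes.data() + i, bytes.data() + i + width);
}
```

The same eight bytes come out differently depending on whether you call swap_elements(bytes, 4) or swap_elements(bytes, 2), which is exactly why the encoder has to know the structure of the data.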

Now, suppose MPI only knew about byte sequences. MPI messages would be a lot like files, and it would be our job, as users of MPI, to do the encoding and decoding. How could we approach that? We’d probably use something like Boost.Serialization to marshal our data. We’d serialize our source data into a flat, portable representation that could be passed to MPI, which would then copy the bytes into the network buffer as needed. That’s two copies of every byte: one to serialize, and another into the network buffer.

Fortunately, MPI knows about way more than just bytes. MPI has datatypes, which aren’t actually types at all, but constants that identify C/C++ primitive types such as int and long double to the library. With a base address, a length, and the appropriate datatype, we can tell the library to transmit an array across the network:

// Use of raw, unimproved MPI interface.
err = MPI_Send(&vec[0], vec.size(), MPI_DOUBLE, dest, tag, comm);

Not all data is organized into contiguous arrays of primitives, though; so MPI also provides type maps, which allow us to define the sequence and offsets of fields in a struct:

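#include <mpi.h>

struct particle
{
    char spin;
    char color;
    double position[3];
    double speed[3];
};

particle particles[1000]; // a bunch of particles to send

// Build and commit the MPI datatype for particles. MPI calls can't run
// at namespace scope, so they live in a function; call it after MPI_Init.
MPI_Datatype make_particle_datatype()
{
  // Constituent member types
  MPI_Datatype types[2] = {MPI_CHAR, MPI_DOUBLE};

  // repetition counts: 2 chars, then 6 doubles (position and speed)
  int reps[2] = {2, 6};

  // Prepare offsets, relative to the start of the struct.
  // MPI_Get_address replaces MPI_Address, which was removed in MPI-3.0.
  MPI_Aint offsets[2];
  MPI_Get_address( &particles[0].spin, offsets );
  MPI_Get_address( &particles[0].position, offsets + 1 );
  for (int i = 1; i >= 0; --i) // backwards, so offsets[0] is zeroed last
      offsets[i] -= offsets[0];

  // Finally, create the new datatype
  // (MPI_Type_create_struct replaces the removed MPI_Type_struct)
  MPI_Datatype datatype;
  MPI_Type_create_struct( 2, reps, offsets, types, &datatype );
  MPI_Type_commit( &datatype );
  return datatype;
}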
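
If you want to see what those offsets look like without an MPI installation handy, the following sketch computes the same layout information with plain offsetof. The particle_layout struct and describe_particle function are names invented for this illustration; the exact numbers depend on your compiler and platform, though on a typical 64-bit system the doubles start at offset 8, leaving 6 bytes of padding after the two chars.

```cpp
#include <cassert>
#include <cstddef>

struct particle
{
    char spin;
    char color;
    double position[3];
    double speed[3];
};

// Hypothetical summary of what a type map for particle has to record.
struct particle_layout
{
    std::size_t char_offset;    // where the 2 chars start
    std::size_t double_offset;  // where the 6 doubles start
    std::size_t payload_bytes;  // bytes that actually carry data
    std::size_t padding_bytes;  // bytes inserted by the compiler for alignment
};

inline particle_layout describe_particle()
{
    particle_layout l;
    l.char_offset = offsetof(particle, spin);
    l.double_offset = offsetof(particle, position);
    l.payload_bytes = 2 * sizeof(char) + 6 * sizeof(double);
    assert(sizeof(particle) >= l.payload_bytes);
    l.padding_bytes = sizeof(particle) - l.payload_bytes;
    return l;
}
```

The two offsets and the two repetition counts are all the type map needs; anything counted in padding_bytes is simply never described to MPI.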

Presto: Efficiency!

MPI type maps are great for efficiency in three ways:

  1. First, if there’s padding in your data structure, the type map captures that fact, and padding bytes aren’t sent.

  2. Once we tell MPI about the “shape” of the data structure, it can put a serialized representation directly into the network buffer. That’s just one copy of every byte.

  3. Using type maps saves memory. If you’re in a resource-constrained environment—and cluster nodes often do run very near their memory capacity—you might not have space to spare for an additional serialized representation of the data you’re sending. In fact, that was the case with Daniel’s simulation.

These efficiency gains are multiplied when you have to send the same data structure (with new values) multiple times. In the worst case, MPI type maps are on the order of the same size as the data structure they describe, so creating one might be roughly the same cost as a copy. So using the same “shape” over and over again can be important. Fortunately, that’s a natural pattern for many large-scale parallel computations.

So What’s The Catch?

I don’t know whether you noticed from our example, but an MPI type map is a pain to create! It’s even more painful to maintain as data structures evolve. As a result, in a typical application, type maps get created for a very few structs, and more complex structures are typically sent with a series of separate MPI calls (some of which use type maps) or with the extra serialization step. Fortunately, Michael Gauckler (Daniel’s protégé) had a brilliant idea that we implemented in Boost.MPI.

Skeleton and Content

The genius of Michael’s idea is in three realizations:

  1. You can represent all the values in an arbitrarily complex non-contiguous data structure with a single type map. It’s like a struct that extends from the data’s minimum address to its maximum, probably with lots of padding.

  2. When you serialize a complex data structure with Boost.Serialization, the Archive sees the type and address of every datum.

  3. You can treat addresses as byte offsets (e.g. from address zero), and build a type map that way.

So Boost.MPI has a Boost.Serialization Archive type that creates an MPI type map by treating addresses as offsets and translating fundamental C++ types into MPI datatypes. This step involves no actual data copying; it’s just sending the “bones” of the data structure with no real “meat.” Then it uses the type map and asks the underlying MPI library to send the “giant struct beginning at address zero,” thus avoiding an expensive intermediate serialization phase before MPI actually gets its paws on your data. And as long as you don’t change the layout of your data structure, you can send new “meat” without ever repeating the “bones” step. This approach became known as the “skeleton and content” technique. Michael and Daniel wrote a paper about it, which you can read here.
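
Here is a toy model of the idea, with no MPI and no Boost involved. It is only a sketch under some loud assumptions: the names block, skeleton, make_skeleton, gather_content, and scatter_content are invented for this illustration and are not the Boost.MPI API, a std::vector of bytes stands in for the network buffer, and both ends are assumed to have the same architecture. The real library builds an MPI datatype through a Boost.Serialization archive instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <list>
#include <vector>

// Hypothetical stand-in for one type map entry: an absolute address,
// treated as an offset from address zero, plus a size in bytes.
struct block
{
    std::uintptr_t offset;
    std::size_t bytes;
};

using skeleton = std::vector<block>;

// "Bones": walk the structure once and record where each value lives.
// No element values are copied here.
inline skeleton make_skeleton(std::list<double>& values)
{
    skeleton s;
    for (double& v : values)
        s.push_back({reinterpret_cast<std::uintptr_t>(&v), sizeof v});
    return s;
}

// "Meat", sending side: gather the current values straight from the
// recorded addresses into the outgoing buffer.
inline std::vector<unsigned char> gather_content(const skeleton& s)
{
    std::vector<unsigned char> out;
    for (const block& b : s)
    {
        const unsigned char* p = reinterpret_cast<const unsigned char*>(b.offset);
        out.insert(out.end(), p, p + b.bytes);
    }
    return out;
}

// "Meat", receiving side: scatter incoming bytes into a structure
// that has the same shape as the sender's.
inline void scatter_content(const skeleton& s, const std::vector<unsigned char>& in)
{
    std::size_t pos = 0;
    for (const block& b : s)
    {
        assert(pos + b.bytes <= in.size());
        std::memcpy(reinterpret_cast<void*>(b.offset), in.data() + pos, b.bytes);
        pos += b.bytes;
    }
}
```

A std::list is used on purpose, because its nodes are scattered around the heap. As long as nobody inserts or erases elements, each skeleton stays valid, so you can assign new values to the list and call gather_content and scatter_content again without rebuilding the bones.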

Conclusion

When we were able to make the efficient use of MPI as simple as making the types in question serializable, that was a huge win. There simply wasn’t enough memory in the systems to do the most important communications with an intermediate serialization step, and the programmers didn’t have time to manually maintain MPI type maps, so this library was a crucial part of the project’s technical success.

Acknowledgements

Thanks very much to Matthias Troyer for working on the Boost.MPI project with us and for checking (and correcting) my facts.


  1. There are typically multiple buffering levels, some of which are in main memory (usually one for each target node, so if your communication with node A is blocked you can still send to node B), but the basic facts remain the same: there is a limited amount of buffering available. Actually, the same applies to shared memory, in case that’s how your processes communicate.

Down And Dirty With C++0x

Hans Odenthal, Sioux's “People Manager,” on their massive orange couch

One thing I love about my job is that I get invited to talk in places I haven’t been before. In December, I went to Shanghai to give a keynote at the china-cpp conference. A couple of weeks ago I was in the Netherlands giving a three day hands-on course with C++0x (well, part of C++0x, anyway—we used GCC 4.5) and a big evening session, part of my host’s “Hot or Not” series.

This trip was all about the orange.

BusyBusyBusyBusy…

In case anyone pays attention to what I write here anymore, I thought I should leave a quick update.

  • We did BoostCon 2010, and it was great as usual. I’m hoping to have something to write myself on it soon, but in the meantime check out Dean Michael Berris’ report on C++ Soup.
  • Right now I’m prepping to deliver a C++0x course and evening talk in the Netherlands. C++0x is huge! Even when you’re “on the inside of the development process,” there’s a lot to miss. So preparation is taking all my time, which is why there’s not much activity on the Ryppl front.
  • Yeah, Ryppl. It’s the project I’m working on full-time when I’m not teaching, blogging, running BoostPro, recovering from BoostCon, blah, blah. Yeah, did I mention I’m busy? Anyway, in my opinion the future of Boost depends on having something like Ryppl in place, so I’m really looking forward to getting back to it. I’m especially psyched because next month my colleague and friend Eric Niebler is moving to Cambridge (right around the corner from me) to work on Ryppl. It’s going to be great to collaborate face-to-face with someone on a regular basis again after living the virtual life for so long.

Shanghaied!

About a month ago I was invited to give a keynote at the China C++ Conference, and after some hemming and hawing due to the short notice, I decided to accept.  It’s not every day that someone offers to pay your way to China; I might as well see it, I figured.

The place is happenin’, seriously. Shanghai is a huge, beautiful city. Some people have lots of money, and the city center glitters with light and impressive buildings. If anyone wonders why there’s talk about China being an economic powerhouse, here’s your answer: I drove past a Maserati dealership on the way to the hotel. On the other hand, I saw guys in ragged clothes hauling loads of wood, or bricks, like oxen, and every category of bicycle and scooter with dump-truck-sized payloads. I ran past someone’s broken-down house with their chickens roaming the sidewalk outside.

Where I’ve Been

While BoostCon ’09 was awesome, at this point it’s pretty obvious that I won’t be able to report on each day in the detail I started out with. I will try to wrap it all up in one article, soon.  Until then, if you’re a programmer type, please have a look at C++Next, a new site I have started about advanced C++.

Cheers!

BoostCon 2009 Trip Report 3a

May 4, 2009

KickOff

Monday morning in Aspen

Hey, this is going to be fun! I sure hope people don’t grumble too much about the snow, and that they dressed for the changeable mountain weather.

I hoof it on over to breakfast and get ready for my day. In light of the snow, I take the long way over to the Physics Center, along the road, not through the meadow.

Jeff Garland opens the conference in Flug (“floog”) Forum with his “Library In a Week” project for this year: building a relational database binding library.

Despite appearances, Jeff actually is a corporeal being

Jeff has done a different variation of this series every BoostCon, getting a group together for one hour every morning before the rest of the sessions start, to collaborate on developing a new library. As great as “Library in a Week” has been for everyone over the years, it hits me like a ton of bricks that a working session is no way to start a conference. [Note to self for next year: add a short formal welcome session]

I have some interest in this area, since I recently had to learn more than I ever wanted to about web development, so I had a brush with databases, specifically the Django web framework’s binding library. I volunteered to give an overview the next day of that interface, but sadly became overwhelmed with other conference responsibilities and activities, and was never able to get back to Library in a Week. Sorry, everybody! If someone would like to write a comment about how the project turned out, I’d be happy to approve it here.

The infamous Flug red chair

Next up, Christophe Henry’s talk on the Meta State-Machine Library in Bethe (“beta”) hall. I’ve been looking forward to this one ever since I saw it on the program. I first learned about Christophe’s work when he contacted me and Aleksey Gurtovoy (my co-author on C++ Template Metaprogramming) about the library he had written based on a simple example in our book. I get similar requests now and again to look at people’s code; I usually don’t have the time, and when I do I am usually not impressed, but this time I took a gander, and I was wowed. He had maintained the declarativeness and efficiency of our work and had extended it to cover all the fancy-dancy features that people familiar with the UML state machine specification expect.

Years ago, when the review for the Boost StateChart library was underway, I had pointed to our example as a way of showing that one could write highly-efficient state machines with a declarative syntax, but at the time it was claimed that the advanced features supported by the UML standard (and the proposed library) made such an approach infeasible. I wasn’t happy, but also didn’t have the time or domain expertise to build what I thought would have been an improvement. Since the library would certainly be useful for some portion of the C++ community, it was accepted into Boost; I think I even voted for it.

Christophe explains why conventional design processes are not Scottish

But Christophe had taken his knowledge of UML and template metaprogramming, and combined them to produce something much closer to my ideal. Not only that, but it was well-documented and nicely presented. Back to the present, at BoostCon, I’m looking forward to seeing the details.

Christophe turns out to be an excellent presenter: knowledgeable, entertaining, and understandable without talking down to the audience. He also has an impressive grasp of how crucial abstraction is to the software development process, and a way of explaining it in terms of Model-Driven Engineering from which I have lots to learn.

Michael Wong slings C++0x

Still trying to get pictures of everything at the conference, I duck over to Flug and check out Michael Wong talking about the features of the upcoming standard, C++0x. It looks like Michael is doing a great job keeping people engaged, and I know most of this stuff, it seems, from my work on the C++ committee, so I go back to Bethe just in time for the first coffee break.

Next we have the dueling parallel patterns presentations (DPPP). In Flug, Stephan T. Lavavej is talking about Visual Studio 2010’s parallel patterns library. In Bethe, Joel Falcou on an Embedded Domain Specific Language (EDSL) for parallel programming. How do I choose? I figure that pretty soon information about Visual Studio is going to be ubiquitous, but this is probably my one chance to hear about what Joel is working on. So it’s off to Bethe.

Wow, this is getting long.  More on Monday to come tomorrow…

BoostCon 2009 Trip Report 2

Sunday, May 3: I wake to an overcast sky and look around, half-startled by my own relaxation. With Dave Jenkins handling facilities and Kim Scheibel handling registration, there’s so much less to worry about than in years past! I did have to bring the nametags, so I’ll be taking them over to registration this afternoon. In the meantime, I’m looking forward to a day of hanging out, finishing a writeup for the rvalue references coding session I’ll be running on Monday afternoon. This article has become much longer and more involved than I had ever expected it to.

At breakfast in the morning I meet Edouard Alligand, Christophe Henry, and Christophe’s wife Inna. Edouard came in from Paris, and though Christophe and Inna are French and Russian, respectively, they live in Germany. A very cosmopolitan group.

Naturally, Inna is wondering what she’s going to do with herself in this tiny resort town all week, in the off-season when so many businesses are closed, and I don’t have words to reassure her. I like Aspen anytime, but I spent so many summers here because of my dad’s connection with the Physics Center that it’s like a second home. Trying to help, I offer to give them a little tour of the city. It’s a nice walk into town and if your shoes are reasonably comfortable it’s pretty easy to make the loop on foot. So off we go.

When we get up to Main Street I’m relieved that I can no longer find the little cluster of Victorian houses that I noticed last year advertising cosmetic surgery and dentistry. Since I first came here as a child, Aspen has gone from being an old mining outpost with a ski area to a playground for the rich and famous, with Rodeo Drive boutiques pushing out many of the local businesses downtown. One of my favorite establishments was a little fiberglass A-frame called “Donny’s Dog House,” where you could get the best onion rings and kosher dogs served only on a whole-wheat bun. That place simply could not exist today. I’ve learned to accept much of the recent development, but “cosmetic surgery row” made Aspen a parody of itself.

Downtown, I do my best to point out the decent restaurants, knowing many of them are closed for the off-season, and of those that aren’t, many won’t open ’till dinnertime. At some point the walk becomes a mission to find a restroom, a problem I solve brilliantly by suggesting we could simply ask some open business for permission to use theirs. Apparently that isn’t done in Europe.

Next mission: lunch. But now we’re in the wrong part of town to find anything. Someone is craving steak, so we head past Rubey Park’s empty rugby field to where I remember there being a steakhouse years ago. Gone. Across the street is an “Authentic Western Bistro” (closed), the idea of which makes me a little embarrassed in front of my French companions. In the end, after passing through Carl’s Pharmacy to buy provisions (contact lens solution for me, beer and wine for everyone else; you can get anything there), we end up stopping at Hickory House.

Hickory House is the last possible restaurant before you arrive back at the Meadows. It’s a decent barbecue joint, but… I kid you not… they brag that they import their ribs from Denmark! I have nothing against Denmark, mind you, but ribs? First, importing is just too highbrow for ribs, I’m sorry. Taking trash meat and smoking and/or marinating it until it becomes tender and delicious is a foundation of barbecue. Second, isn’t this a classic American food and aren’t we in livestock country? The carbon footprint implications of shipping ribs from Denmark to the middle of the U.S. boggle the mind. But maybe it’s just the thin air and empty stomach. I can’t remember much about lunch, but I didn’t order the ribs.

Heading back to the hotel afterwards, we decide to pass by the Physics Center and walk through the meadow instead of taking the road we came out on. After snapping a few pictures of magpies, Inna takes a picture of Edouard, Christophe (with bag o’ beers), and me. My look of satisfaction in this picture pretty much sums up my day.

Edouard Alligand, Dave Abrahams, Christophe Henry in Aspen Meadow