Multi-threaded Mallet text processing

We do a lot of text processing on our big office computer, a six-core 3.33GHz Mac Pro. We chose this model because not all the software we run is multiprocessor-aware, and because only certain kinds of tasks are parallelizable anyway. (See this page for a discussion of Mac Pro performance considerations.)

In situations where a program isn’t written to take explicit advantage of multiple cores (beyond whatever Grand Central Dispatch provides at some low level), fewer but faster cores are better. And the ‘Gulftown’ Xeon processors in this model can actually adapt to a big, demanding, non-multithreaded task by idling cores that were going unused anyway and raising the clock speed of the cores that are actually doing the work.

Nevertheless, I was curious what kind of difference in performance we could get by fully utilizing all 12 logical cores in this Mac Pro (six physical cores, each presenting two hardware threads). One of the tools we use all the time is Mallet, a text analysis package from UMass Amherst. Not all parts of this toolset are written with multithreading in mind, but one that is written that way is the train-topics topic modeler.

I ran train-topics on about 15,000 discrete elements of folklore, first with no --num-threads argument and then with --num-threads 12. The results are pretty clear, in terms of CPU utilization:

I’m not quite sure what the default number of threads is, but I’m guessing perhaps 1: a single worker thread that the Java Virtual Machine and Mac OS X then move around among the available cores. That would account for the semi-random CPU use on the left-hand side of the picture above. On the far right, you can see that all 12 logical cores are at least 50% engaged on the task at the same time. I have the numbers written down somewhere, but I think the task completed in about half the time with all cores engaged.
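For anyone who wants to repeat the comparison with actual timings rather than my half-remembered numbers, here is a rough Python sketch. It assumes Mallet is unpacked so that the launcher is at bin/mallet, and that the corpus has already been imported into a file I am calling folklore.mallet; that file name, the launcher path, and the topic count of 20 are placeholders of mine, not the values from my run. The helper function names are made up for this sketch as well.

```python
import subprocess
import time


def build_command(input_path, num_topics, num_threads=None, mallet_bin="bin/mallet"):
    """Build a `mallet train-topics` command line as a list of arguments.

    If num_threads is None, the --num-threads flag is left off entirely,
    so Mallet falls back to its own default.
    """
    cmd = [
        mallet_bin, "train-topics",
        "--input", input_path,
        "--num-topics", str(num_topics),
    ]
    if num_threads is not None:
        cmd += ["--num-threads", str(num_threads)]
    return cmd


def time_command(cmd):
    """Run a command to completion and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def speedup(baseline_seconds, parallel_seconds):
    """How many times faster the parallel run was than the baseline run."""
    return baseline_seconds / parallel_seconds


def compare(input_path, num_topics, num_threads):
    """Time one default run and one multithreaded run; return the speedup."""
    baseline = time_command(build_command(input_path, num_topics))
    parallel = time_command(build_command(input_path, num_topics, num_threads))
    return speedup(baseline, parallel)
```

Calling compare("folklore.mallet", 20, 12) from the Mallet directory would run the modeler twice and return the ratio of the two wall-clock times. A result near 2.0 would match my recollection that the job finished in about half the time.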
