Local spark scheduling is bad
I've been trying to improve the performance of parallel programs because, by default, their performance appears to be terrible.
In particular I've been using a version of the icfp_2000 ray tracer that has been modified to be trivially parallelisable, so it is reasonable to expect decent performance when parallelising it. It now renders a row at a time and, within each row, a pixel at a time. The render_rows predicate contains a conjunction of two calls: render_row, and a recursive call to render_rows. These calls are independent, so their results must be merged once both have completed (we use concatenation of cords), which means that render_rows is not tail-recursive. By making this conjunction a parallel conjunction we can easily parallelise the program. The code can be found at progs/icfp2000_par_pbone within the benchmarks CVS module.
When running this program with MERCURY_OPTIONS="-P4" in a parallel grade it performs only marginally better than a sequential version, and we can show that such a small improvement can come from using a parallel mark phase in the garbage collector (which is enabled in all parallel grades). Performance continues to improve as we increase --max-contexts-per-thread, which allows for more parallelism by scheduling more computations on the global spark queue.

The above graph shows boxplots of the wall time (from 10 samples) of the icfp_2000 ray tracer as we vary the value of --max-contexts-per-thread. The first boxplot shows the execution of the same program compiled for sequential execution. In the remaining boxplots the value of --max-contexts-per-thread doubles each time, starting from the default of two.
| Configuration | Mean wall time | Standard deviation |
|---|---|---|
| main_asmfast-gc | 85.23 | 0.39 |
| main_asmfast-gc-par_p4_c2 | 76.76 | 0.19 |
| main_asmfast-gc-par_p4_c4 | 73.74 | 0.32 |
| main_asmfast-gc-par_p4_c8 | 70.08 | 1.04 |
| main_asmfast-gc-par_p4_c16 | 66.41 | 0.51 |
| main_asmfast-gc-par_p4_c32 | 63.35 | 1.20 |
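
To put those means in perspective, here is a small Python sketch that turns them into speedups over the sequential build. The numbers are copied from the table; the function names are made up for illustration, and the efficiency figure assumes that -P4 corresponds to four worker threads, which is my reading of the option rather than something measured here.

```python
# Mean wall times copied from the table above (same units as reported there).
SEQUENTIAL_MEAN = 85.23  # main_asmfast-gc

# Keys are the --max-contexts-per-thread values encoded in the row names (c2 ... c32).
PARALLEL_MEANS = {
    2: 76.76,
    4: 73.74,
    8: 70.08,
    16: 66.41,
    32: 63.35,
}

WORKERS = 4  # assumption: -P4 means four worker threads


def speedup(sequential, parallel):
    """Return how many times faster the parallel run is than the sequential one."""
    return sequential / parallel


def summarise(sequential=SEQUENTIAL_MEAN, means=PARALLEL_MEANS, workers=WORKERS):
    """Return {contexts: (speedup, efficiency)} for each parallel configuration."""
    return {
        contexts: (speedup(sequential, mean), speedup(sequential, mean) / workers)
        for contexts, mean in means.items()
    }


if __name__ == "__main__":
    for contexts, (s, e) in summarise().items():
        print(f"c{contexts:<3} speedup {s:.2f}x  efficiency {e:.0%}")
```

On these numbers even the best configuration (32 contexts per thread) is only about 1.35 times faster than the sequential build, which is a long way short of what a trivially parallel program should manage on four threads.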
| Attachment | Size |
|---|---|
| icfp2000_max-contexts-per-thread.png | 6.73 KB |