Tuesday, June 21, 2011

[ylvmvovj] GCC PGO results

I ran GCC's profile-guided optimization (PGO, "profile feedback directed optimization") on the program that generates the point-growth tree, which had already seen a few iterations of manual optimization.

Baseline speed (-O3): 20.98 points per second
With -march=native: 21.83 points per second (4.0% improvement over baseline)
With -march=native -fprofile-use : 25.95 points per second (23.7% improvement over baseline). PGO is amazing!

One "gotcha" is that the binary built with -fprofile-generate must exit normally (not via ctrl-C) in order for the profile data to be written out. This required modifying my program, which runs in an infinite loop, so that it periodically checks the filesystem for a killswitch file and exits cleanly when the file appears.
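
The killswitch check can be sketched roughly as follows. This is not the actual code from my program, only a minimal illustration in C: the function names, the file name, and the check interval are all hypothetical, and it assumes a POSIX system where access() is available.

```c
#include <stdio.h>
#include <unistd.h>   /* access(), F_OK (POSIX) */

/* Hypothetical helper: returns nonzero if the killswitch file exists. */
int killswitch_present(const char *path)
{
    return access(path, F_OK) == 0;
}

/* Stand-in for one unit of real work (e.g. computing one point). */
void do_one_iteration(unsigned long i)
{
    (void)i;
}

/*
 * Runs the main loop until the killswitch file appears. The filesystem
 * is checked only every `check_every` iterations to keep overhead low.
 * Returns the number of iterations completed, so the caller can return
 * from main() normally and let the profile data be written at exit.
 */
unsigned long run_until_killswitch(const char *path,
                                   unsigned long check_every)
{
    unsigned long i = 0;
    for (;;) {
        do_one_iteration(i);
        i++;
        if (i % check_every == 0 && killswitch_present(path))
            break;
    }
    return i;
}
```

A main() that calls run_until_killswitch("killswitch", 1000) and then returns 0 exits normally. Creating the file from another terminal (for example with "touch killswitch") ends the run.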

gcc 4.5.3-1 (Debian Wheezy 32-bit)
