This was required because running compare results on the entire benchmark set generates gigabytes of data.