SPEC's CPU2006 benchmark suite was quite an unruly beast to tame. We spent two weeks working on it non-stop, from reading the documentation and understanding the whole SPEC protocol for performing and reporting benchmarks, to finally coming up with what we felt was a correct and acceptable configuration for Techgage CPU reviewing purposes.
Along the way we hit many walls, head first, that would leave us bewildered, confused, and often frustrated, and that would force us to go back and rethink the way we were trying to work with it. This wasn't a case of progressively converging on a good configuration. More than once we ran an entire benchmark for over 10 hours, only to find at the end of it all that we were still not doing it right: not according to SPEC's rules, and not according to our interests.
I can still remember telling Rob this would be easy. Ah, the naivety! CPU2006 is a complex benchmark suite meant to provide anyone who wishes to use it with a standardized process, from compiling the tests to running and reporting them. The complexity is in fact a consequence of this standardization, and it guarantees properly scaled results across different machine setups and compiler optimization flags. It's an ideal benchmark suite (the one ideal benchmark suite!) for CPU performance analysis and comparison. As such, it is extremely well suited for a hardware review website such as Techgage. But the cost is a complex, hard-to-learn suite that also demands knowledge of software development and building, particularly compiler usage.
However, it needs to be said that many of the obstacles we faced were entirely our own fault. There's a proper way to do these things. It involves reading the documentation from start to finish, understanding the key concepts, experimenting, and planning ahead. We did none of those things. Or rather, we did them all lumped together in a two-week race to have the benchmark suite ready for the upcoming Techgage reviews. Time was not on our side, so we had to make do the best we could. That necessarily meant not always reading the documentation properly, missing key information, or making decisions that would soon enough reveal themselves to be obvious mistakes we should have caught.
What's worse, I didn't have a 64-bit machine. While I could compile for 64-bit runs, I couldn't run them. So I was flying a little blind there, after we finally dealt with all the compilation errors and moved on to understanding why some tests wouldn't run or why the whole benchmark was marked as invalid. Rob had to take the brunt of it and, with that, all the frustration of running a 13-hour benchmark only to find at the end that it wasn't valid (several times!). All this while he was trying to build two machines in time for an upcoming review.
At some point it became depressing. So much so, in fact, that when we finally got our first 100% valid, totally foolproof, reportable benchmark, we didn't celebrate. We were so exhausted that success came like a drink of water when you've already passed out from thirst.
Let this serve as a lesson to anyone wishing to use CPU2006. This is a serious, professional, and complex benchmark suite that companies like Intel, AMD, ASUS, Dell, and Cisco use for internal purposes. The membership and associates page of SPEC will give you a good idea of what we are talking about here. It's meant to be properly studied before it is implemented. Don't do what we did.
So, how do we know we have a good CPU2006 benchmark configuration?
Rob insisted from the very beginning that he wanted the benchmarks to be submitted to SPEC. For this to happen, benchmarks have to be flagged as valid by the CPU2006 tools. These tools perform an exhaustive analysis of our benchmark configuration and compiler flags, and only when these respect the CPU2006 specifications will the run be marked as valid. And this is the validation we need. By conforming to SPEC rules and specifications, we know we have a good benchmark configuration that not only allows Rob to submit the results to SPEC, but also gives Techgage readers the confidence that they are looking at credible, true results that conform to the very high standards SPEC imposes for proper CPU benchmarking.
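To give a feel for what "configuration" means here, the sketch below shows the general shape of a CPU2006 config file and a reportable run. This is purely illustrative, not the configuration we actually settled on: the `ext` label, compiler paths, and optimization flags are assumptions for the example, and real reportable submissions require considerably more detail (hardware description fields, documented flags, and so on) as laid out in SPEC's documentation.

```
# example.cfg -- hypothetical minimal CPU2006 config sketch (not our real config)
ignore_errors = no
tune          = base
ext           = gcc42-64bit      # label for the binaries this config builds
output_format = asc,html
reportable    = yes              # enforce SPEC's rules for a reportable run

# Compilers for all benchmarks, all tunings
default=default=default=default:
CC  = gcc
CXX = g++
FC  = gfortran

# Base tuning: SPEC requires one consistent flag set per language in base
default=base=default=default:
COPTIMIZE   = -O2
CXXOPTIMIZE = -O2
FOPTIMIZE   = -O2
```

A run against that config would then look something like `runspec --config=example.cfg --reportable int`, and it is the tools invoked by `runspec` that decide, at the end of those long hours, whether the result is valid.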
In an upcoming post in this thread, I'll be discussing (more briefly, thank goodness!) what exactly CPU2006 is and why you should care.