# Why does running my tests in a different order result in drastically different performances?


## Answers

I doubt the ordering has an effect here. The problem is that in the results where the ordering is changed, the number of times the tests were run is also changed. This is a problem because the more times the code is run, the more the JIT will optimize it.

If you look at just the numbers, there's a pretty strong pattern: the tests that were run more times were faster. This is classic JIT behavior. The optimized compilation gets run more times and weights the average. So it should be clear that the number of times each individual test runs needs to be equal, and the test should account for warm-up.

So adjusting main gets something like this:

```java
for (int i = 0; i < 2; i++) {
    testPerformance(castNumerator, 6);
    testPerformance(multiplyNumerator, 6);
    testPerformance(makeNumerator, 6);
    testPerformance(castDenominator, 6);
    testPerformance(multiplyDenominator, 6);
    testPerformance(makeDenominator, 6);
    testPerformance(diDivider, 6);
    testPerformance(idDivider, 6);
    System.out.println();

    testPerformance(castNumerator2, 6);
    testPerformance(multiplyNumerator2, 6);
    testPerformance(makeNumerator2, 6);
    testPerformance(castDenominator2, 6);
    testPerformance(multiplyDenominator2, 6);
    testPerformance(makeDenominator2, 6);
    testPerformance(diDivider2, 6);
    testPerformance(idDivider2, 6);
    System.out.println();
}
```

For warm-up, this just runs the tests twice, and the first set of results is probably not very meaningful.
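To see why the first pass matters, here's a minimal sketch of the warm-up effect: timing the same workload twice and comparing. The `work` method is my own stand-in for the question's divider tests, not the OP's code:

```java
public class WarmupDemo {
    // Hypothetical workload standing in for the divider tests
    static double work(int n) {
        double sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (double) i / 3.0;
        }
        return sum;
    }

    // Time one run of the workload in nanoseconds
    static long time(int n) {
        long start = System.nanoTime();
        work(n);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long cold = time(1_000_000); // first run: interpreted, or still being compiled
        long warm = time(1_000_000); // subsequent runs: likely JIT-compiled, usually faster
        System.out.println("cold: " + cold + " ns, warm: " + warm + " ns");
    }
}
```

On a typical HotSpot JVM the second measurement is noticeably smaller, which is exactly why the first set of results above should be discarded.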

But after doing that I see something like the following for the second run:

```
Casting numerator: 0.003012 ms
Multiplying numerator: 0.003011 ms
Making double variable out of numerator: 0.003025 ms
Casting denominator: 0.003040 ms
Multiplying denominator: 0.003015 ms
Making double variable out of denominator: 0.003006 ms
Passing double numerator: 0.000000 ms
Passing double denominator: 0.000000 ms
```

Well, that's kind of strange, because it indicates the last two tests took 0 ms to run. And it's not that those divisions really are just very fast, because changing the program to use nanoTime has no effect. (nanoTime is always recommended, though here it probably won't make much of a difference since the times are long enough.)
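Swapping in nanoTime is a one-line change to the timing core. A minimal sketch of what that might look like (this is a hypothetical helper with the same general shape as a `testPerformance`, not the OP's exact code):

```java
public class NanoTimer {
    // Hypothetical timing core: same idea as testPerformance, but using
    // System.nanoTime() instead of System.currentTimeMillis().
    static double averageMillis(Runnable task, int reps) {
        long start = System.nanoTime();
        for (int i = 0; i < reps; i++) {
            task.run();
        }
        long elapsedNanos = System.nanoTime() - start;
        // Average milliseconds per repetition
        return elapsedNanos / (double) reps / 1_000_000.0;
    }

    public static void main(String[] args) {
        double avg = averageMillis(() -> Math.sqrt(42.0), 100_000);
        System.out.println("avg: " + avg + " ms");
    }
}
```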

Actually probably what is happening is the JIT has figured out you're ignoring return values so it's optimized those tests by not doing them somehow. As for why it only does it for certain tests, your guess is probably as good as mine, but it indicates a serious problem. If the JIT is doing this kind of extreme optimization, we have no idea what it's doing elsewhere.

This can be solved by just doing something (basically anything) with the return value:

```java
// somewhere
static long result;

// for each op
result += idDivider.divide(n, d);

// at the end of the test
System.out.println(result);
```

This might slow down the tests a little but it will foil the JIT's cleverness.
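Putting the pieces together, a self-contained sketch of a test with such a sink might look like this. The `Divider` interface and the inputs are my own stand-ins for the question's divider objects, not the original code:

```java
public class SinkDemo {
    // Static sink: accumulating and later printing the results forces the
    // JIT to actually perform the divisions rather than eliminate them.
    static long result;

    // Hypothetical stand-in for the question's divider interface
    interface Divider {
        long divide(long n, long d);
    }

    static void testPerformance(String label, Divider div, int reps) {
        long start = System.nanoTime();
        for (int i = 0; i < reps; i++) {
            result += div.divide(1_000_000L, 3L + (i & 1)); // vary the input slightly
        }
        double ms = (System.nanoTime() - start) / 1_000_000.0;
        System.out.println(label + ": " + ms + " ms");
    }

    public static void main(String[] args) {
        testPerformance("long division", (n, d) -> n / d, 1_000_000);
        // Printing the sink at the end keeps the work observable
        System.out.println(result);
    }
}
```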

After all those changes I get basically this every time:

```
Casting numerator: 0.007088 ms
Multiplying numerator: 0.007135 ms
Making double variable out of numerator: 0.007162 ms
Casting denominator: 0.007180 ms
Multiplying denominator: 0.007206 ms
Making double variable out of denominator: 0.007173 ms
Passing double numerator: 0.003650 ms
Passing double denominator: 0.003663 ms
Casting numerator, dividing twice: 0.007554 ms
Multiplying numerator, dividing twice: 0.007574 ms
Making double variable out of numerator, dividing twice: 0.007538 ms
Casting denominator, dividing twice: 0.007550 ms
Multiplying denominator, dividing twice: 0.007503 ms
Making double variable out of denominator, dividing twice: 0.007577 ms
Passing double numerator, dividing twice: 0.003765 ms
Passing double denominator, dividing twice: 0.003798 ms
```

Now if I had to guess at why this indicates it's faster to cast as a parameter, I would say HotSpot is compiling it so that the loop variable is always a double.

On benchmarking in general I'd recommend the following further reading:

- Robust Java benchmarking, Part 1: Issues
- Java theory and practice: Anatomy of a flawed microbenchmark
- Zed Shaw's crazy and convincing rant on poor statistics (he is crass, so not a read for anyone easily offended)