Matlab relational operator performance in presence of NaN-values
When testing the following code (notice the *NaN in the second fragment)
tic
x = zeros(1,5000000);
for i=1:10
    selector = x > 1;
end
toc

tic
x = zeros(1,5000000)*NaN;
for i=1:10
    selector = x > 1;
end
toc
on Matlab revisions
- R2012a 64-bit
- R2013a 32-bit
I observe the following odd behavior
R2012a 64-bit:
Elapsed time is 0.056266 seconds.
Elapsed time is 0.059677 seconds.
R2013a 32-bit:
Elapsed time is 0.070116 seconds.
Elapsed time is 3.995697 seconds.
So in the case of R2013a 32-bit, the presence of NaN values drastically increases the runtime. Can anyone give me a hint as to where this might be coming from?
Best regards, Thomas
You are using an Intel CPU, and for 32-bit code you are using its x87 FPU. It is awfully slow with NaN, Inf and denormals, and this is an old story. The good news is that the SSE unit is slow only with denormals and handles NaNs at full speed, so if you can convince your compiler to emit SSE code, you should be back up to full speed. This is done automatically for x64, because x64 implies SSE2 and its ABI uses SSE registers. The 32-bit x86 floating-point ABI uses FPU registers, so the FPU is also used for the calculations to avoid moving values around too much.
I did not dig deeper (we use embedded platforms and not all of them have SSE as of now), but I suspect changing the compiler or its flags should help. If it does, it would be worth checking how things are inlined, to see whether you get an SSE-to-FPU-and-back round trip on each function call. If it's a small tight loop somewhere in the code, using SSE intrinsics is a possibility.
Update: Oops, I just noticed this is Matlab. The reasoning still applies, but you'll have to look for the solutions yourself.
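One way to check this explanation from within Matlab is to time the same comparison for each class of value mentioned above. The following is only a sketch under that assumption: the variable names `fills`, `labels` and `elapsed` are made up for illustration, and `realmin/2` is used as an example of a denormal double. If the FPU explanation holds, the NaN, Inf and denormal rows should stand out on the 32-bit build only.

```matlab
% Sketch: time "x > 1" for different fill values (names are illustrative).
n = 5000000;
fills  = [0, NaN, Inf, realmin/2];            % realmin/2 is a denormal double
labels = {'zero', 'NaN', 'Inf', 'denormal'};
elapsed = zeros(1, numel(fills));

for k = 1:numel(fills)
    x = repmat(fills(k), 1, n);               % allocate outside the timed part
    tic
    for i = 1:10
        selector = x > 1;                     % the comparison under test
    end
    elapsed(k) = toc;
    fprintf('%-8s %.6f s\n', labels{k}, elapsed(k));
end
```

Running this on both installations and comparing the four rows should show whether the slowdown is tied to the special values or to something else.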
The problem may be that your 32-bit system takes longer to allocate the ~40 MB of memory in the x = zeros(1,5000000)*NaN; line. Perhaps there is not enough RAM available and it needs to swap memory to disk. To check which part (the allocation or the comparison) is the problem, time the two parts separately with tic and toc.
BTW, there is no need to multiply by NaN - you can simply do x = nan(1,5000000);
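A minimal sketch of timing the two parts separately, using nan() for the allocation as suggested; the variable names `tAlloc` and `tCompare` are only illustrative:

```matlab
% Sketch: time the allocation and the comparison loop separately.
n = 5000000;

tic
x = nan(1, n);                 % allocation only (about 40 MB of doubles)
tAlloc = toc;

tic
for i = 1:10
    selector = x > 1;          % comparison only
end
tCompare = toc;

fprintf('allocation: %.6f s\n', tAlloc);
fprintf('comparison: %.6f s\n', tCompare);
```

If the allocation time is small and the comparison time accounts for most of the roughly 4 seconds, memory is unlikely to be the cause.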