Matlab relational operator performance in presence of NaN-values

When testing the following code (notice the *NaN in the second fragment)

tic
x = zeros(1,5000000);
for i=1:10
        selector = x > 1;
end
toc

tic
x = zeros(1,5000000)*NaN;
for i=1:10
        selector = x > 1;
end
toc

on Matlab revisions

  1. R2012a 64-bit
  2. R2013a 32-bit

I observe the following odd behavior

R2012a 64-bit

Elapsed time is 0.056266 seconds.
Elapsed time is 0.059677 seconds.

R2013a 32-bit

Elapsed time is 0.070116 seconds.
Elapsed time is 3.995697 seconds.

So in the case of R2013a 32-bit, the presence of NaN values drastically increases the runtime. Can anyone give me a hint as to where this might be coming from?

Best regards, Thomas

Answers


You are using an Intel CPU, and 32-bit code uses its x87 FPU. The FPU is awfully slow with NaN, Inf, and denormals; this is an old story. The good news is that the SSE unit is slow only with denormals and handles NaNs at full speed, so if you can convince your compiler to emit SSE code, you should be back to full speed. This happens automatically for x64, because x64 implies SSE2 and its ABI uses SSE registers. The 32-bit x86 floating-point ABI uses FPU registers instead, so the FPU is used for the calculations to avoid moving values around too much.

I did not dig deeper (we use embedded platforms, and not all of them have SSE as of now), but I suspect changing the compiler or some of its flags should help. If it does, check how things are inlined to see whether you incur an SSE-to-FPU-and-back transfer on each function call. If the hot spot is a small, tight loop somewhere in the code, using SSE intrinsics is a possibility.

Update: Oops, I just noticed this is MATLAB. The reasoning still holds, but you will have to look for a solution yourself.
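
To see which kinds of values hit the slow path on a given MATLAB build, one could time the same comparison over different inputs. This is only a rough sketch, not a rigorous benchmark: the variable names and the choice of realmin/2 as a denormal value are my own, and whether Inf or denormals are slow for this particular comparison is an assumption to test, not a known result.

```matlab
% Rough timing sketch: same comparison, different special values.
n = 5000000;
labels = {'zeros', 'NaN', 'Inf', 'denormal'};
values = [0, NaN, Inf, realmin/2];   % realmin/2 is a denormal (subnormal) double

for k = 1:numel(values)
    x = repmat(values(k), 1, n);     % build the input outside the timed region
    tic
    for i = 1:10
        selector = x > 1;            % the operation under test
    end
    t = toc;
    fprintf('%-9s %f seconds\n', labels{k}, t);
end
```

If only some rows are slow on the 32-bit build and none are on the 64-bit build, that would be consistent with the FPU explanation above.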


The problem may be that your 32-bit system takes longer to allocate the ~40 MB of memory in the x = zeros(1,5000000)*NaN; line. Perhaps there is not enough available RAM and it needs to swap memory to disk. To check which part (the allocation or the comparison) is the problem, tic-toc these parts separately.

BTW, there is no need to multiply by NaN; you can simply write x = nan(1,5000000);
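
A minimal sketch of timing the two parts separately, using nan() for the allocation; the variable names tAlloc and tCompare are just illustrative:

```matlab
n = 5000000;

% Time the allocation on its own (about 40 MB of doubles).
tic
x = nan(1, n);
tAlloc = toc;

% Time the comparison loop on its own.
tic
for i = 1:10
    selector = x > 1;
end
tCompare = toc;

fprintf('allocation: %f seconds\n', tAlloc);
fprintf('comparison: %f seconds\n', tCompare);
```

If tCompare accounts for almost all of the elapsed time, the allocation is not the bottleneck.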

