Is A==0 really better than ~A?

Introduction to problem setup

I was doing some benchmarks involving - ~A and A==0for a double array with no NaNs, both of which convert A to a logical array where all zeros are converted to true values and rest are set as false values.

For the benchmarking, I have used three sets of input data –

  • Very small to small sized data - 15:5:100
  • Small to medium sized data - 50:40:1000
  • Medium to large sized data - 200:400:3800

The input is created with A = round(rand(N)*20), where N is the parameter taken from the size array. Thus, N would vary from 15 to 100 with stepsize of 5 for the first set and similarly for the second and third sets. Please note that I am defining datasize as N, thus the number of elements would be datasize^2 or N^2.

Benchmarking Code

N_arr = 15:5:100; %// for very small to small sized input array
N_arr = 50:40:1000; %// for small to medium sized input array
N_arr = 200:400:3800; %// for medium to large sized input array
timeall = zeros(2,numel(N_arr));
for k1 = 1:numel(N_arr)
    A = round(rand(N_arr(k1))*20);

    f = @() ~A;
    timeall(1,k1) = timeit(f);
    clear f

    f = @() A==0;
    timeall(2,k1) = timeit(f);
    clear f
end

Results

Finally the questions

One can see how A==0 performs better than ~A across all datasizes. So here are some observations and related questions alongside them –

  1. A==0 has one relational operator and one operand, whereas ~A has only one relational operator. Both produce logical arrays and both accept double arrays. In fact, A==0 would work with NaNs too, wheras ~A won’t. So, why is still ~A at least not as good as A==0 as it looks like A==0 is doing more work or am I missing something here?

  2. There’s a peculiar drop of elapsed time with A==0 and thus increased performance at N = 320, i.e. at 102400 elements for A. I have observed this across many runs with that size on two different systems that I have access to. So what’s going on there?

Answers


This is not strictly an answer but rather my contribution to the discussion

I used the profiler to investigate a slightly-modified version of your code:

N_arr = 200:400:3800; %// for medium to large sized input array

for k1 = 1:numel(N_arr)

    A = randi(1,N_arr(k1));
    [~]=eq(A,0);
    clear A

    A = randi(1,N_arr(k1));
    [~]=not(A);
    clear A   

end

I used the following profiler flags (as per UndocumentedMatlab's series of posts about Profiler):

profile('-memory','on');
profile('on','-detail','builtin');

And here's an excerpt from the profiler results (link to the larger image):

It seems that the == variant allocates a tiny bit of additional memory that allows it to work its magic....

Regarding your question 2: Before removing the keeping of timeall, I tried plotting the same charts you did in Excel. I didn't observe the behavior you mentioned for N = 320. I suspect this may have something to do with the additional wrappers (i.e. function handles) you're using in your code.


I thought I'd attach the available documentation for the discussed functions for quick reference.

The documentation for ~ (\MATLAB\R20???\toolbox\matlab\ops\not.m):

%~   Logical NOT.
%   ~A performs a logical NOT of input array A, and returns an array
%   containing elements set to either logical 1 (TRUE) or logical 0 (FALSE).
%   An element of the output array is set to 1 if A contains a zero value
%   element at that same array location.  Otherwise, that element is set to
%   0.
%
%   B = NOT(A) is called for the syntax '~A' when A is an object.
%
%   ~ can also be used to ignore input arguments in a function definition,
%   and output arguments in a function call.  See "help punct"

%   Copyright 1984-2005 The MathWorks, Inc.

The documentation for == (\MATLAB\R20???\toolbox\matlab\ops\eq.m):

%==  Equal.
%   A == B does element by element comparisons between A and B
%   and returns a matrix of the same size with elements set to logical 1
%   where the relation is true and elements set to logical 0 where it is
%   not.  A and B must have the same dimensions unless one is a
%   scalar. A scalar can be compared with any size array.
%
%   C = EQ(A,B) is called for the syntax 'A == B' when A or B is an
%   object.

%   Copyright 1984-2005 The MathWorks, Inc.

Also not strictly an answer, but I want to add to the discussion. Maybe it boils down to your function timeit.

I've tried Dev-iL's function. I have profiled and got the same results: EQ seems to be faster than NOT and EQ appears to allocate slightly more memory than NOT. It seemed logical that if the EQ operator is allocating more memory then, as we increased the array size, that memory allocation would also increase. Suspiciously, it does not!

I went on and deleted everything unnecessary and repeated the loop for N=1000 iterations. The profiler seemed to still agree that EQ is faster than NOT. But I was not convinced.

Next I did was to remove the weirdly looking [~] = ~A and [~] = A == 0 for something more humane looking like tmp1 = ~A and tmp2 = A == 0 and voilà! The runtimes are almost equal.

So my guess is that you are doing something similar inside your timeid function. Worth to note is that the assignment [~] slows down both functions, but NOT seems to be more affected than EQ.

Now the big question: why is the operator [~] slowing down the functions? I do not know. Perhaps only Mathworks can answer that. You can open a ticket in the Mathworks webpage.

Partial answer: they have almost the same run-time, even for big arrays (biggest array I tried is 10K).

Unanswered part: why [~] assignment slows down the code. Why NOT is more affected than EQ.

My Code:

clear all
clear classes

array_sizes = [1000:1000:10000];
repetitions = 10000;

for i = 1:length(array_sizes)
    A1 = randi([0, 1], array_sizes(i), 1);
    for j = 1:repetitions
        tmp1 = eq(A1, 0);
    end
end

for i = 1:length(array_sizes)
    A2 = randi([0, 1], array_sizes(i), 1);
    for j = 1:repetitions
        tmp2 = not(A2);
    end
end

Need Your Help

Convert a two byte UInt8 array to a UInt16 in Swift

swift

With Swift I want to convert bytes from a uint8_t array to an integer.

Getting Spark, Python, and MongoDB to work together

python mongodb hadoop apache-spark pymongo

I'm having difficulty getting these components to knit together properly. I have Spark installed and working succesfully, I can run jobs locally, standalone, and also via YARN. I have followed the ...