# How do the bit manipulations in this bit-sorting code work?

Jon Bentley, in Column 1 of his book Programming Pearls, introduces a technique for sorting a sequence of non-zero positive integers using bit vectors.

I have taken the program bitsort.c from here and pasted it below:

    /* Copyright (C) 1999 Lucent Technologies */
    /* From 'Programming Pearls' by Jon Bentley */

    /* bitsort.c -- bitmap sort from Column 1
     *   Sort distinct integers in the range [0..N-1]
     */

    #include <stdio.h>

    #define BITSPERWORD 32
    #define SHIFT 5
    #define MASK 0x1F
    #define N 10000000
    int a[1 + N/BITSPERWORD];

    void set(int i) { int sh = i>>SHIFT; a[i>>SHIFT] |=  (1<<(i & MASK)); }
    void clr(int i) {                    a[i>>SHIFT] &= ~(1<<(i & MASK)); }
    int  test(int i){ return             a[i>>SHIFT] &   (1<<(i & MASK)); }

    int main()
    {   int i;
        for (i = 0; i < N; i++)
            clr(i);
    /*  Replace above 2 lines with below 3 for word-parallel init
        int top = 1 + N/BITSPERWORD;
        for (i = 0; i < top; i++)
            a[i] = 0;
     */
        while (scanf("%d", &i) != EOF)
            set(i);
        for (i = 0; i < N; i++)
            if (test(i))
                printf("%d\n", i);
        return 0;
    }

I understand what the functions clr, set and test are doing and explain them below (please correct me if I am wrong here).

- clr clears the ith bit
- set sets the ith bit
- test returns the value at the ith bit

Now, I don't understand how the functions do what they do. I am unable to figure out all the bit manipulation happening in those three functions.

## Answers

The first 3 constants are inter-related. BITSPERWORD is 32; you'd want to set this based on your compiler and architecture. SHIFT is 5, because 2^5 = 32. Finally, MASK is 0x1F, which is 11111 in binary (i.e. the bottom 5 bits are all set). Equivalently, MASK = BITSPERWORD - 1.

The bitset is conceptually just an array of bits. This implementation actually uses an array of ints, and assumes 32 bits per int. So whenever we want to set, clear or test (read) a bit we need to figure out two things:

- which int (of the array) is it in
- which of that int's bits are we talking about

Because we're assuming 32 bits per int, we can just divide by 32 (and truncate) to get the array index we want. Dividing by 32 (BITSPERWORD) is the same as shifting to the right by 5 (SHIFT). So that's what the a[i>>SHIFT] bit is about. You could also write this as a[i/BITSPERWORD] (and in fact, you'd probably get the same or very similar code assuming your compiler has a reasonable optimizer).

Now that we know which element of a we want, we need to figure out which bit. Really, we want the remainder. We could do this with i%BITSPERWORD, but it turns out that i&MASK is equivalent. This is because BITSPERWORD is a power of 2 (2^5 in this case) and MASK is the bottom 5 bits all set.

Basically, this is an optimized bucket sort:

- reserve a bit array of length n bits.
- clear the bit array (first for in main).
- read the items one by one (they must all be distinct).
- set the i'th bit in the bit array if the read number is i.

- iterate the bit array.
- if the bit is set then print the position.

Or in other words (for values below 10, sorting the 3 numbers 4, 6, 2):

start with an empty 10-bit array (which usually fits in a single integer)

0000000000

read 4 and set the bit in the array..

0000100000

read 6 and set the bit in the array

0000101000

read 2 and set the bit in the array

0010101000

iterate the array and print every position in which the bits are set to one.

2, 4, 6

sorted.

Starting with set(): a right shift by 5 is the same as dividing by 32, which finds the int that holds the bit. MASK is 0x1F, or 31. ANDing the address with MASK gives the bit index within that int; it is the same as the remainder of dividing the address by 32. Shifting 1 left by the bit index ("1<<(i & MASK)") produces an integer with only the bit at that position set. ORing it into the word sets the bit. The line "int sh = i>>SHIFT;" is wasted, because sh is never used afterwards; the code simply repeats "i>>SHIFT".

clr() is basically the same as set, except instead of ORing with 1<<(i & MASK) to set the bit, it ANDs with the inverse to clear the bit. test() ANDs with 1<<(i & MASK) to test the bit.

The bitsort will also remove duplicates from the list, because it records at most one occurrence of each integer. A sort that uses integer counters instead of bits, so that it can count more than one of each value, is called a counting sort.

The bit magic is used as a special addressing scheme that works well with row sizes that are powers of two.

If you try to understand this (note: I'd rather use bits-per-row than bits-per-word, since we're talking about a bit matrix here):

    // supposing a 1-bit int type existed...
    int1 bits[BITSPERROW * N]; // an array of N x BITSPERROW elements

    // set bit at x,y:
    int linear_address = y*BITSPERROW + x;
    bits[linear_address] = 1; // or 0

    //  0 1 2 3 4 5 6 7 8 9 10 11 ... 31
    //  . . . . . . . . . . .  .       .
    //  . . . . X . . . . . .  .       .   -> x = 4, y = 1 => i = (1*32 + 4)

The statement linear_address = y*BITSPERROW + x also means that x = linear_address % BITSPERROW and y = linear_address / BITSPERROW.

When you optimize this in memory by using 1 word of 32 bits per row, you get the fact that a bit at column x can be set using

    int bitrow = 0;
    bitrow |= 1 << (x);

Now when we iterate over the bits, we *have* the linear address, but need to find the corresponding word.

    int column = linear_address % BITSPERROW;
    int bit_mask = 1 << column; // meaning for the xth column,
                                // you take 1 and shift that bit x times
    int row = linear_address / BITSPERROW;

So to set the i'th bit, you can do this:

    bits[ i / BITSPERROW ] |= 1 << ( i % BITSPERROW );

An extra gotcha is that the modulo operator can be replaced by a bitwise AND, and the / operator can be replaced by a shift, if the second operand is a power of two (and the first operand is non-negative).

    a % BITSPERROW == a & ( BITSPERROW - 1 ) == a & MASK
    a / BITSPERROW == a >> ( log2(BITSPERROW) ) == a >> SHIFT

This ultimately boils down to the very dense, yet hard-to-understand-for-the-bitfucker-agnostic notation

    a[ i >> SHIFT ] |= ( 1 << (i&MASK) );

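But I don't see the shift-and-mask form working for, e.g., 40 bits per word: it relies on the row size being a power of two, so for other sizes you would have to fall back to / and %.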

Quoting excerpts from Bentley's original article in DDJ, this is what the code does at a high level:

    /* phase 1: initialize set to empty */
    for (i = 0; i < n; i++)
        bit[i] = 0
    /* phase 2: insert present elements */
    for each i in the input file
        bit[i] = 1
    /* phase 3: write sorted output */
    for (i = 0; i < n; i++)
        if bit[i] == 1
            write i on the output file

A few doubts:

1. Why is a 32-bit word needed?
2. Can we do this in Java by creating a HashMap with keys from 0000000 to 9999999 and values 0 or 1 based on the presence or absence of each number? What are the implications of such a program?