Division method for Hash Values

Suppose that a string of r characters is hashed into m slots by treating it as a radix-128 number and then using the division method. The number m is easily represented as a 32-bit computer word, but the string of r characters, treated as a radix-128 number, takes many words. How can we apply the division method to compute the hash value of the character string without using more than a constant number of words of storage outside the string itself?

Answers


For any n-digit number in radix r (here r is the radix, 128 in the question, and n is the number of characters in the string):

number=a0*r^0+a1*r^1+a2*r^2+...+a(n-1)*r^(n-1)

To compute the value of that number mod m, we do

(a0*r^0+a1*r^1+a2*r^2+...+a(n-1)*r^(n-1))%m

But notice that, because (a + b) % m = ((a % m) + (b % m)) % m,

(a0*r^0+a1*r^1+a2*r^2+...+a(n-1)*r^(n-1))%m
   = ((a0*r^0)%m + (a1*r^1)%m+(a2*r^2)%m+...+(a(n-1)*r^(n-1))%m)%m
   = (sum over 0<=i<n: (ai*r^i)%m)%m

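Thus, you can iterate over the string one character at a time, computing (ai*r^i)%m for each character and accumulating the sum mod m. The power r^i is itself kept reduced mod m, so only a constant number of word-sized values are ever stored.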

Code (in Python):

def hash_code(s, radix, mod):
    pwr = 1  # radix^0 = 1
    answer = 0
    for character in s:
        # add (a_i * radix^i) mod m to the running sum
        answer = (answer + (ord(character) * pwr) % mod) % mod
        pwr = (pwr * radix) % mod  # radix^(i+1) = radix * radix^i
    return answer

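Remember to apply the % operator after every operation so that intermediate values stay small. In a language with fixed-width integers this is what prevents overflow; Python integers have arbitrary precision, so it is not strictly needed there, but without it the intermediate values would grow with the length of the string.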
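As a quick sanity check, the incremental version can be compared against a direct computation that builds the whole radix-128 number first. This is only a sketch: `hash_code_bigint` is a hypothetical helper added for the comparison, the modulus 701 is an arbitrary choice, and the input is assumed to be ASCII so that every character code is below 128.

```python
def hash_code(s, radix, mod):
    """Incremental hash; the first character is the least significant digit."""
    pwr = 1  # radix^0 mod m
    answer = 0
    for character in s:
        answer = (answer + ord(character) * pwr) % mod
        pwr = (pwr * radix) % mod
    return answer


def hash_code_bigint(s, radix, mod):
    """Reference version (hypothetical helper): builds the full number, then reduces."""
    number = sum(ord(c) * radix**i for i, c in enumerate(s))
    return number % mod


text = "division method"
# Both approaches must agree; only the first uses constant extra storage.
assert hash_code(text, 128, 701) == hash_code_bigint(text, 128, 701)
```

The reference version relies on Python's arbitrary-precision integers, which is exactly the unbounded storage the question asks to avoid.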


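You can use Horner's method/rule. Only the running value y is stored, and each intermediate value ai + 128y is below 128m, so it fits in a constant number of words.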

y = 0
for i = (n - 1) downto 0
    y = (ai + 128y) mod m
return y
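The pseudocode above could be sketched in Python roughly as follows. The function name `horner_hash` and the default radix argument are illustrative choices, and the sketch assumes an ASCII string whose first character is the least significant digit a0, so the characters are visited in reverse to match the "downto" loop.

```python
def horner_hash(s, m, radix=128):
    """Horner's rule; assumes s[0] is the least significant digit a_0."""
    y = 0
    for ch in reversed(s):  # i = n-1 downto 0
        # y never exceeds m - 1, so ord(ch) + radix * y stays below radix * m
        y = (ord(ch) + radix * y) % m
    return y


# Cross-check against the direct definition sum(a_i * 128^i) mod m.
sample = "hash"
direct = sum(ord(c) * 128**i for i, c in enumerate(sample)) % 701
assert horner_hash(sample, 701) == direct
```

If the first character is instead meant to be the most significant digit, iterating over `s` in forward order gives the corresponding hash.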
