GetHashCode() for OrdinalIgnoreCase-dependent string classes

public class Address{
    public string ContactName {get; private set;}
    public string Company {get; private set;}
    //...
    public string Zip {get; private set;}
}

I'd like to implement a notion of distinct addresses, so I overrode Equals() to test for case-insensitive equality across all of the fields (since these are US addresses, I used Ordinal rather than InvariantCulture for maximum performance):

public override bool Equals(Object obj){
    if (obj == null || this.GetType() != obj.GetType())
        return false;

    Address o = (Address)obj;

    return
    (string.Compare(this.ContactName, o.ContactName, StringComparison.OrdinalIgnoreCase) == 0) &&
    (string.Compare(this.Company, o.Company, StringComparison.OrdinalIgnoreCase) == 0) &&
    // ...
    (string.Compare(this.Zip, o.Zip, StringComparison.OrdinalIgnoreCase) == 0);
}

I'd like to write a matching GetHashCode() along these lines (ignore the concatenation inefficiency for the moment):

public override int GetHashCode(){
    return (this.ContactName + this.Address1 + this.Zip).ToLowerOrdinal().GetHashCode();
}

but ToLowerOrdinal() doesn't exist. What should I use instead? Or should I just use InvariantCulture in my Equals() method?

(I'm thinking .ToLowerInvariant().GetHashCode(), but I'm not 100% sure that InvariantCulture can't decide that an identical character (such as an accent) has a different meaning in another context.)

Answers


Two unequal objects can have the same hash code, but two equal objects must never have different hash codes. If you use InvariantCulture casing for your hash code, it will still satisfy that contract when Equals() is implemented in terms of OrdinalIgnoreCase.

From the documentation on StringComparer.OrdinalIgnoreCase (emphasis mine):

http://msdn.microsoft.com/en-us/library/system.stringcomparer.ordinalignorecase.aspx

The StringComparer returned by the OrdinalIgnoreCase property treats the characters in the strings to compare as if they were converted to uppercase using the conventions of the invariant culture, and then performs a simple byte comparison that is independent of language. This is most appropriate when comparing strings that are generated programmatically or when comparing case-insensitive resources such as paths and filenames.
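
As a rough illustration of the contract described above, the following sketch checks that two strings that are equal under OrdinalIgnoreCase also produce equal hash codes, both through the comparer itself and through an invariant-culture uppercase copy. The class name HashContractDemo and its helper methods are made up for this example; only StringComparer.OrdinalIgnoreCase and string.ToUpperInvariant() come from the framework.

```csharp
using System;

public static class HashContractDemo
{
    // True when a and b are equal under OrdinalIgnoreCase.
    public static bool AreEqual(string a, string b)
    {
        return StringComparer.OrdinalIgnoreCase.Equals(a, b);
    }

    // Hash code produced by the same comparer that Equals() uses.
    public static int ComparerHash(string s)
    {
        return StringComparer.OrdinalIgnoreCase.GetHashCode(s);
    }

    // Hash code of an invariant-culture uppercase copy (allocates a temporary string).
    public static int UpperInvariantHash(string s)
    {
        return s.ToUpperInvariant().GetHashCode();
    }

    public static void Main()
    {
        string a = "123 Main St";
        string b = "123 MAIN ST";

        Console.WriteLine(AreEqual(a, b));                                   // True
        Console.WriteLine(ComparerHash(a) == ComparerHash(b));               // True
        Console.WriteLine(UpperInvariantHash(a) == UpperInvariantHash(b));   // True
    }
}
```

The hash values themselves may differ from one process to the next, so only the equality of the two hashes within a single run is meaningful here.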


Whatever string comparison method you use in Equals(), it makes sense to use the same in GetHashCode().

There's no need to create temporary strings just to calculate hash codes. For StringComparison.OrdinalIgnoreCase, use StringComparer.OrdinalIgnoreCase.GetHashCode().

Then you need to combine the individual hash codes into one. XOR should be fine here, because it's unlikely that one person's zip code is another's contact name, although purists might disagree.

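public override int GetHashCode()
{
    // StringComparer.GetHashCode(string) throws ArgumentNullException on null,
    // so substitute an empty string for any field that is not set.
    return StringComparer.OrdinalIgnoreCase.GetHashCode(ContactName ?? string.Empty) ^
        StringComparer.OrdinalIgnoreCase.GetHashCode(Company ?? string.Empty) ^
        // ...
        StringComparer.OrdinalIgnoreCase.GetHashCode(Zip ?? string.Empty);
}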
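
For anyone who shares the purists' concern, here is one possible order-sensitive alternative to plain XOR. XOR is symmetric, so swapping the values of two fields yields the same hash, and two identical field values cancel to zero. The sketch below uses the common multiply-and-add pattern instead. HashCombiner and Combine are hypothetical names invented for this example, and the constants 17 and 31 are conventional choices rather than anything required.

```csharp
using System;

public static class HashCombiner
{
    // Combines case-insensitive hash codes of the given fields, in order.
    public static int Combine(params string[] fields)
    {
        unchecked // let the arithmetic wrap on overflow instead of throwing
        {
            int hash = 17;
            foreach (string field in fields)
            {
                // Treat null as empty, because the comparer's GetHashCode throws on null.
                int fieldHash = StringComparer.OrdinalIgnoreCase.GetHashCode(field ?? string.Empty);
                hash = hash * 31 + fieldHash;
            }
            return hash;
        }
    }
}
```

Inside Address.GetHashCode() this could be called as return HashCombiner.Combine(ContactName, Company, Zip); with the remaining fields passed in the same way. It allocates a small array per call, so the chained XOR is still the cheaper option if that matters.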

Having said all that, I'd question whether it's sensible to use a composite structure like Address as the key to a dictionary. But the principle holds for identity-type strings.
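
For the identity-type string case, a comparer can be handed straight to the dictionary, so no wrapper class or custom GetHashCode() is needed. A minimal sketch, assuming the keys are plain strings such as state codes; the class and method names are invented for this example:

```csharp
using System;
using System.Collections.Generic;

public static class CaseInsensitiveKeysDemo
{
    // Counts occurrences of each key, treating keys that differ only by case as the same key.
    public static Dictionary<string, int> CountByKey(IEnumerable<string> keys)
    {
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        foreach (string key in keys)
        {
            int current;
            counts.TryGetValue(key, out current); // current stays 0 when the key is new
            counts[key] = current + 1;
        }
        return counts;
    }

    public static void Main()
    {
        var counts = CountByKey(new[] { "ca", "CA", "Ny" });
        Console.WriteLine(counts.Count);  // 2
        Console.WriteLine(counts["Ca"]);  // 2
        Console.WriteLine(counts["NY"]);  // 1
    }
}
```

The dictionary uses the comparer for both hashing and equality, which keeps the two consistent automatically.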

