Retrieving the "canonical value" from a Set<T> where T has a custom equals()

I have a class Foo which overrides equals() and hashCode() properly.

I would like to also would like to use a HashSet<Foo> to keep track of "canonical values" e.g. I have a class that I would like to write like this, so that if I have two separate objects that are equivalent I can coalesce them into references to the same object:

class Canonicalizer<T>
{
    final private Set<T> values = new HashSet<T>(); 

    public T findCanonicalValue(T value)
    {
        T canonical = this.values.get(value);
        if (canonical == null)
        {
           // not in the set, so put it there for the future
           this.values.add(value);
           return value;
        }
        else
        {
           return canonical;
        }
    }
}

except that Set doesn't have a "get" method that would return the actual value stored in the set, just the "contains" method that returns true or false. (I guess that it assumes that if you have an object that is equal to a separate object in the set, you don't need to retrieve the one in the set)

Is there a convenient way to do this? The only other thing I can think of is to use a map and a list:

class Canonicalizer<T>
{
    // warning: neglects concurrency issues

    final private Map<T, Integer> valueIndex = new HashMap<T, Integer>(); 
    final private List<T> values = new ArrayList<T>();

    public T findCanonicalValue(T value)
    {
        Integer i = this.valueIndex.get(value);
        if (i == null)
        {
           // not in the set, so put it there for the future
           i = this.values.size();
           this.values.add(value);
           this.valueIndex.put(value, i);
           return value;
        }
        else
        {
           // in the set
           return this.values.get(i);
        }
    }
}

Answers


I would use just a HashMap like this:

final private Map<T, T> values = new HashMap<T, T>(); 
public T findCanonicalValue(T value)
{
    T canon = this.values.get(value);
    if (canon == null)
    {
       // not in the set, so put it there for the future
       this.values.put(value, value);
       return value;
    }
    else
    {
       // in the set
       return canon;
    }
}

its still a little awkard but it should be slightly more efficient than the map and the list (one fewer data structure).


Use an Interner from Guava:

http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/collect/Interner.html

http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/collect/Interners.html

This is exactly what it's for.


With ConcurrentMap it is a little simpler. (And thread safe)

private final ConcurrentMap<T, T> values = new ConcurrentHashMap<T, T>();  
public T findCanonicalValue(T value) { 
    values.putIfAbsent(value, value);
    return values.get(value);
}

A map seems the right idea, but why not map from some Object T to the corresponding canonical object?

class Canonicalizer<T>
{
    final private Map<T, T> values = new HashMap<T, T>(); 

    public T findCanonicalValue(T value)
    {
        T canonical = this.values.get(value);
        if (canonical == null)
        {
            // not in the map, so put it there for the future
            this.values.put(value, value);
            return value;
        }
        else
        {
            return canonical;
        }
    }
}

In many cases, you want this sort of container for an object cache that needs to be accessed by different threads; if so, make sure not to neglect concurrency issues, and be aware that the thread-safe map implementations generally don't support null keys. If for some reason you can't use Guava, at least take a look at the source code:

http://www.google.com/codesearch/p?hl=en#UKMs0lhE9bg/trunk/src/com/google/common/collect/Interners.java

public final class Interners {
  private Interners() {}

  /**
   * Returns a new thread-safe interner which retains a strong reference to
   * each instance it has interned, thus preventing these instances from being
   * garbage-collected. If this retention is acceptable, this implementation may
   * perform better than {@link #newWeakInterner}.
   */
  public static <E> Interner<E> newStrongInterner() {
    final ConcurrentMap<E, E> map = new MapMaker().makeMap();
    return new Interner<E>() {
      public E intern(E sample) {
        E canonical = map.putIfAbsent(checkNotNull(sample), sample);
        return (canonical == null) ? sample : canonical;
      }
    };
  }

Need Your Help

PHP SOAP xml response not decode xml tags

php xml soap encoding

SOAP server try to return corrupted xml, and I'm getting error: looks like we got no XML document

MYSQL Group By date + 3 hours

php mysql sql group-by

I have a query that counts the "Xp" difference per day from my database, this all works as it should however it groups from midnight-midnight, what I would like to do is group 3am to 3am.