How to implement a set?
I want to implement a Set in C. Is it OK to use a linked list, when creating the SET, or should I use another approach ?
How do you usually implement your own set (if needed).
NOTE: If I use the Linked List approach, I will probably have the following complexities for Set my operations:
- init : O(1);
- destroy: O(n);
- insert: O(n);
- remove: O(n);
- union: O(n*m);
- intersection: O(n*m);
- difference: O(n*m);
- ismember: O(n);
- issubset: O(n*m);
- setisequal: O(n*m);
O(n*m) seems may be a little to big especially for huge data... Is there a way to implement my Set more efficient ?
I have used Red-Black trees in the past to build sets.
Here are the time complexities from the Wikipedia article.
Space O(n) Search O(log n) Insert O(log n) Delete O(log n)
Sets are typically implemented either as red-black trees (which requires the elements to have a total order), or as an automatically-resizing hashtable (which requires a hash function).
The latter is typically implemented by having the hashtable double in size and reinserting all elements when a certain capacity threshold (75% works well) is exceeded. This means that inidividual insert operations can be O(n), but when amortized over many operations, it's actually O(1).
std::set is often implemented as a red black tree: http://en.wikipedia.org/wiki/Red-black_tree
This approach will give you much better complexity on all the listed operations.
Since you already have a linked list implemented, the easiest is a skip list. If you want to use balanced trees, the easiest in my opinion is a treap. These are randomized data structures, but generally they are just as efficient as their deterministic counterparts, if not more (and a skip list can be made deterministic).