org.abora.white.collection.sets
Class ActualHashSet

java.lang.Object
  |
  +--org.abora.white.xpp.basic.Heaper
        |
        +--org.abora.white.collection.sets.ScruSet
              |
              +--org.abora.white.collection.sets.MuSet
                    |
                    +--org.abora.white.collection.sets.HashSet
                          |
                          +--org.abora.white.collection.sets.ActualHashSet

public class ActualHashSet
extends HashSet

The ActualHashSet class is an implementation of a MuTable set that can contain arbitrary Heapers. It uses the hashForEqual member function to get a hash value for the items contained in the set. The set establishes the equality of objects in the set through the isEqual: function.

The implemention of ActualHashSet is straightforward. There are primitive tables used to store pointers to the stored items (myHashEntries), and their corresponding hash values (myHashValues). The ActualHashSet also maintain a current tally of the number of items in the set

. definition: preferred location - the location calculated from item hashForEqual mod tableSize.

The search routine first calculates the preferred location. This is used as the first location in the myHashEntries table to test for the presence of the item, and the search proceeds forward (in a linear probe) from there. If there is no object at that position, the routine assumes the item is not in the table. If there is an item there, the first test is to check if the item's hash matches the hash myHashValues at that position. If the match is successful, a full isEqual: test is performed. If the isEqual: test succeeds, the item is present in the set. If either test fails, the entry at the desired location is tested to see if it is closer to its own preferred location than the item (a test which necessarily fails the first time). If it is closer, the item is not in the set. (This extra test is what distinguishes an ordered hash set from an ordinary hash set. It often detects the absense of an item well before an empty slot is encountered, and the advantage becomes pronounced as the set fills up. Ordered hash sets with linear probe beat ordinary hash sets with secondary clustering on misses (the big time eater), yet they preserve linear probe's easy deletion.)

On insertion to the set, the hash and probe sequence is essentially the same as the search. The main exception is that on a hash collision, the item bumps down any item that is no farther than its own preferred position.

An example is perhaps in order:

the set contains items a, b, and c in table locations 3, 4, and 5. Assume that a has location 2 as its preferred location, while b and c both have location 4 as their preferred location. Now, if we attempt to add an item d to the table, and item d were to have a preferred location of 3. Since 3 is already occupied by something that is already far from its preferred location, we probe for a another location. At location 4, item d is displaced by one from its preferred location. Since b is in it's preferred location (4) d replaces it, and we move item b down. Item c is in location 5 because it had already been bumped from location 4 when b was inserted previously. B again ties with c, so it pushes it out of location 5, replacing it there. Item c will end up in location 6. This probe function minimizes the individual displacement of hash misses, while keeping the most items in their preferred locations.

Note that, though the choice of which item to bump is obvious when the distances from home are different, when they are equal we could have given preference to either the new or the old item. We chose to put the new item closer to its preferred location, on the assumption that things entered recently are more likely to be looked up than things entered long ago.

This algorithm was derived from a short discussion with Michael McClary (probably completely missing his intended design - all mistakes are mine -- heh). (Unfortunately, I wasn't clear in the discussion. Since hugh was unavailable when I discovered this, I've taken the opportunity to practice with Smalltalk and corrected both the explanation and the code rather than sending him a clarification. -- michael)


Field Summary
protected  SharedPtrArray myHashEntries
           
protected  Int32Array myHashValues
           
protected  int myTally
           
 
Constructor Summary
protected ActualHashSet(int newTally, Int32Array hashValues, SharedPtrArray entries)
           
protected ActualHashSet(int newTally, SharedPtrArray entries)
           
protected ActualHashSet(Rcvr receiver)
           
 
Method Summary
protected  void aboutToWrite()
          If my contents are shared, and I'm about to change them, make a copy of them.
protected  void actualAboutToWrite()
           
protected  void checkSize(int byAmount)
           
 int contentsHash()
          Has the same relationship to contentsEqual that hashForEqual has to isEqual.
 ScruSet copy()
          A new one whose initial state is my current state, but that doesn't track changes.
 IntegerValue count()
          How many elements are currently in the set.
 void destruct()
           
protected  int distanceFromHome(int loc, int home, int modulus)
           
protected  int entryTableSize()
           
 boolean hasMember(Heaper someone)
          Is someone a member of the set now?
 void introduce(Heaper anElement)
          Add anElement to my members, but only if it isn't already a member.
 boolean isEmpty()
          Return true if this set does not currently have any elements.
static MuSet make()
           
static MuSet make(Heaper something)
           
static MuSet make(IntegerValue someSize)
          someSize is a non-semantic hint about how big the set might get.
protected  void printInternals(java.io.PrintWriter oo)
          This method is for regression testing.
 void receiveHashSet(Rcvr rcvr)
          Make myHashEntries large enough that we won't grow.
 void remove(Heaper anElement)
          Remove anElement from my members.
 void sendHashSet(Xmtr xmtr)
          This currently doesn't take advantage of the optimizations in TableEntries.
 void sendSelfTo(Xmtr xmtr)
           
 Stepper stepper()
          Returns a stepper which will enumerate all the elements of the set in some unspecified order
 void store(Heaper anElement)
          maintainance note: storeAll: has a copy of the code starting at self hashFind:...
 void storeAll(ScruSet other)
          union equivalent
 Heaper theOne()
          Iff I contain exactly one member, return it.
 void wipe(Heaper anElement)
          make anElement no longer be one of my members.
 void wipeAll(ScruSet other)
          Sort of minus.
 
Methods inherited from class org.abora.white.collection.sets.HashSet
asImmuSet, asMuSet, immuStepper
 
Methods inherited from class org.abora.white.collection.sets.MuSet
actualHashForEqual, fromStepper, isEqual, restrictTo
 
Methods inherited from class org.abora.white.collection.sets.ScruSet
asArray, contentsEqual, intersects, isSubsetOf, printOn, printOnWithSimpleSyntax, printOnWithSyntax
 
Methods inherited from class org.abora.white.xpp.basic.Heaper
destroy, equals, hashForEqual, printContentsOn, toString
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

myHashValues

protected Int32Array myHashValues

myHashEntries

protected SharedPtrArray myHashEntries

myTally

protected int myTally
Constructor Detail

ActualHashSet

protected ActualHashSet(int newTally,
                        SharedPtrArray entries)

ActualHashSet

protected ActualHashSet(int newTally,
                        Int32Array hashValues,
                        SharedPtrArray entries)

ActualHashSet

protected ActualHashSet(Rcvr receiver)
Method Detail

make

public static MuSet make()

make

public static MuSet make(Heaper something)

make

public static MuSet make(IntegerValue someSize)
Description copied from class: MuSet
someSize is a non-semantic hint about how big the set might get.


contentsHash

public int contentsHash()
Description copied from class: ScruSet
Has the same relationship to contentsEqual that hashForEqual has to isEqual. I.e., if 'a->contentsEqual (b)', then 'a->contentsHash() == b->contentsHash()'. The same complex caveats apply as to the stability and portability of the hash values as apply for hashForEqual.

Overrides:
contentsHash in class ScruSet

count

public IntegerValue count()
Description copied from class: ScruSet
How many elements are currently in the set. Being a set, if the same element is put into the set twice, it is only in the set once. 'Same' above is according to 'isEqual'.

Specified by:
count in class HashSet

hasMember

public boolean hasMember(Heaper someone)
Description copied from class: ScruSet
Is someone a member of the set now?

Specified by:
hasMember in class HashSet

isEmpty

public boolean isEmpty()
Description copied from class: ScruSet
Return true if this set does not currently have any elements.

Specified by:
isEmpty in class HashSet
Returns:
true if this set does not currently have any elements.

copy

public ScruSet copy()
Description copied from class: ScruSet
A new one whose initial state is my current state, but that doesn't track changes. Note that there is no implication that these can be 'destroy'ed separately, because (for example) an ImmuSet just returns itself

Specified by:
copy in class HashSet

destruct

public void destruct()
Overrides:
destruct in class Heaper

stepper

public Stepper stepper()
Description copied from class: ScruSet
Returns a stepper which will enumerate all the elements of the set in some unspecified order

Specified by:
stepper in class HashSet

theOne

public Heaper theOne()
Description copied from class: ScruSet
Iff I contain exactly one member, return it. Otherwise BLAST. The idea for this message is taken from the THE function of ONTIC (reference McAllester)

Specified by:
theOne in class HashSet

storeAll

public void storeAll(ScruSet other)
union equivalent

Overrides:
storeAll in class MuSet

wipeAll

public void wipeAll(ScruSet other)
Sort of minus. Wipe from myself all elements from other. Turn myself into my current self minus other.

Overrides:
wipeAll in class MuSet

introduce

public void introduce(Heaper anElement)
Description copied from class: MuSet
Add anElement to my members, but only if it isn't already a member. If it is already a member, BLAST

Specified by:
introduce in class HashSet

remove

public void remove(Heaper anElement)
Description copied from class: MuSet
Remove anElement from my members. If it isn't currently a member, then BLAST

Specified by:
remove in class HashSet

store

public void store(Heaper anElement)
maintainance note: storeAll: has a copy of the code starting at self hashFind:... for efficiency.

Specified by:
store in class HashSet

wipe

public void wipe(Heaper anElement)
Description copied from class: HashSet
make anElement no longer be one of my members. No semantic effect if it already isn't a member.

Specified by:
wipe in class HashSet

aboutToWrite

protected void aboutToWrite()
If my contents are shared, and I'm about to change them, make a copy of them.


actualAboutToWrite

protected void actualAboutToWrite()

checkSize

protected void checkSize(int byAmount)

distanceFromHome

protected int distanceFromHome(int loc,
                               int home,
                               int modulus)

entryTableSize

protected int entryTableSize()

printInternals

protected void printInternals(java.io.PrintWriter oo)
This method is for regression testing.

Specified by:
printInternals in class HashSet

receiveHashSet

public void receiveHashSet(Rcvr rcvr)
Make myHashEntries large enough that we won't grow.


sendHashSet

public void sendHashSet(Xmtr xmtr)
This currently doesn't take advantage of the optimizations in TableEntries. It should.


sendSelfTo

public void sendSelfTo(Xmtr xmtr)
Overrides:
sendSelfTo in class Heaper


Copyright © 2003 David G Jones. All Rights Reserved.
Original Udanax-Gold - Copyright © 1979-1999 Udanax.com. All rights reserved.