Hashing

  1. What problem that both linear and non-linear hashing suffer from is solved by double hashing?

  2. Given 5 buckets, numbered 0, 1, 2, 3, 4, each with a capacity for two records, show the result of inserting the records with the keys: 16, 7, 15, 34, 24, 25, 75, 14, 39, into the appropriate buckets, using the linear probing hashing function
    	h(K) = [K + (i-1)] mod 5
    
  3. On average, how many buckets had to be read to insert each record?

  4. How would this number change (calculate it) if we used instead the non-linear probing hashing function h(K) = [K + (i-1)²] mod 5 to insert the same records?

  5. Double hashing overcomes secondary clustering by using two different hashing functions as in: h(k) = (h1(k) + (i-1)h2(k)) mod N. Is it possible to overcome secondary clustering with only a single hashing function? If so, describe the method, and if not, explain why.

  6. What is the advantage of using a direct file, in which the location of flight reservations is based on a hash value of the reservation number, for storing records in the term project?

Virtual Hashing

  1. In the virtual hashing method with a total of M initial buckets, and an initial hashing function of h(K) = K mod M which of the following occurs when a bucket B overflows for the second time?

Table-assisted Hashing

  1. A certain file is stored in buckets using the simple table-assisted hashing method described in class. There are four buckets, each of which can hold up to 3 records. The initial contents of this file and of the maxKey table are shown in the figure below. The hash function is h(K,i) = (K + i-1) mod 4. Show the state of the file and of the maxKey table following the insertion of a record with key 28 into the file.
    Bucket maxKey contents
    0 16 4 16 8
    1 43 9 43 25
    2 26 18 26
    3 19 7 3 19

  2. What is the maximum number of buckets that must be accessed during the search for a record?

  3. When a record to be inserted hashes to a full bucket, where is the record placed?

  4. Is it possible for the insertion of a record R to fail even though there is enough room in the file to accommodate a record? Explain.

  5. In the table-assisted hashing method the attempted insertion of a record, R, into a full bucket may require that another record, R' be removed from that bucket and re-inserted elsewhere. Is it possible that this re-insertion of record R' would require record R to be re-inserted, and so on, thus resulting in an infinite loop? Explain.

  6. Write a pseudo-code routine for the deletion of a record from a direct file, implemented using the table-assisted hashing method.

Dynamic Hashing

  1. For the dynamic hashing method, a PseudoRandom function, B(K,i), is required to guide the traveral of a binary tree towards the bucket containing the record with key K. The function outputs either 0 or 1, based on the values of the key, K, and the tree level, i.

  2. What is wrong with using the following function for B(K,i)?
       int B(int K, int i) {
          int pos, bit; 
    
          pos = 1 << i; /* calculates 2^i */
          bit = (K & pos) >> i; /* obtains the ith bit of K */ 
          return (bit); 
       }
    
  3. Write a good function for B(K,i).

  4. A certain file is stored in buckets as determined using the method of dynamic hashing. The hash function is h(K) = K mod 4 and the first few bits of the B string associated with each key in the file are given in the table below. Each bucket can hold up to 2 records. Show the state of the directory and the file after the records with keys 17, 13, 16, 5, and 25 are inserted.
    Key B
    17 01101
    13 00101
    16 10110
    5 11110
    25 01001

  5. What major issue can we ignore when when deleting from a direct file using the dynamic hashing method instead of simple open addressing?

  6. What is the major similarity between table-assisted hashing and dynamic hashing? What advantage is there to using dynamic hashing?