Inverted Files

  1. Assume we have a database of bicycle information, on which we wish to perform queries based on the secondary attributes of manufacturer, colour, and bicycle type.

    Given the following table of possible values of each attribute, how many bit vectors would we require in total to construct the inverted files?

    Attribute Values
    manufacturer Sekine, Raleigh, Miele, Nishiki, Apollo, Norco
    colour white, red, black, blue, green, silver
    type touring, mountain, racing, hybrid

    Instead, we are to implement the inverted file using the graph structure as discussed in class, with the following node structure:

    Pointer Size
    pointer to main file 4 bytes
    pointer to directory for manufacturer attribute 4 bytes
    pointer to next node with same manufacturer 4 bytes
    pointer to directory for colour attribute 4 bytes
    pointer to next node with same colour 4 bytes
    pointer to directory for type attribute 4 bytes
    pointer to next node with same type 4 bytes

    Show the format of the directory for this inverted file, and then write an algorithm in pseudocode which will search for all of the white Miele racing bikes.

    Assuming that each value of every attribute appears reasonably frequently in the main file, describe two changes which could make the node structure more space-efficient.