HASHING

 

 


                                    Static Files                   Expandable Files

                                    Still we can add,           Extendible Hashing

                                    delete records               Linear Hashing

 

Dynamic Hashing

 

 

 

 
                                                                         0

Hash Function

 

 
 


Key             

 

 

                                                                         m-1

 

 

 

HF = mod(key,m)

HF(ki)=HF(kj)             ki != kj             collision

 

We must find a collision resolution method

 

Prime Data    Area

 

Overflow Area

 
 

 

 

 


                                                                                       

A bucket has several blocks. It can store many blocks as long as we have room for a new record collision is no problem.

 

 


                                  

 

 

 

 

 

 

                                

Example:

 

Bkfr=3  (Bucket Factor)

HF=mod(key,5)

Key values : 35, 71, 33, 19, 22, 60, 42, 46, 47, 17

 

 

 

Key                           HF=mod(key,5)      Collision(N/Y)          Prime               Overflow                                                                                                         Data Area        Area

 

35 60

 
35                             mod (35,5)=0                  N               0               

71                             mod (71,5)=1                  N

71 46

 
33                                3                                  N               1

19                                4                                  N        

17

 

22 42  47   47

 
22                                2                                  N               2

60                                0                                  N

33

 
42                                2                                  N               3

46                                1                                  N

19

 
47                                2                                  N               4

17                                2                                  Y

                                

To decrease number of collisions keep prime data area size > the number of records to be inserted to the files.

                                                  N                  number of records in file

                                 LF=                   

                                             M x Bkfr             Bucket Factor 

 

                 Load Factor

 

LF=    10       = 0,66                 %66

          5x3

 

TF =one disk access

     = s + r + dtt

 

                                Time needed to read one bucket

 

X: Average overflow chain length

TF(successful)  = s + r + dtt + (x/2) * (s + r + dtt)

TF(unsuccessful)  = s + r + dtt + x * (s + r + dtt)

 

Lower load factors means less collision and shorter access time.

Large bucket factor also means less overflow.

X=10  

 

TF(successful)  = 1 + 5 = 6

TF(unsuccessful)  = 1 + 10 = 11

 

 

 

 

 

 

 

 

 

File needs reorganization if we have many records in the overflow area

 

 

 

 

 

 

 

 

 


                                                                                                                     

 

TD = Assume that it is deleted from the last bucket

     =   TF(unsuccessful)  + 2r

 

TI = Assume that new records are inserted at the end of the chain

    =   TF(unsuccessful)  + 2r

   

Sequential Access:

 

Reading all records = n x TF

 

TF                    Bkfr

 

 

 


TF                     LF

 

 

Load Factor = 30%

Load Factor = 60%

Bkfr

Unsuccessful

Successful

Unsuccessful

Successful

1

1.0408

1.011500

1.01488

1.3000

2

1.0269

1.0530

1.1638

1.1823

3

1.0162

1.0216

1.1588

1.1259

4

1.0095

1.0097

1.1476

1.0922

5

1.0056

1.0046

1.1346

1.0699

50

1.0000

1.0000

1.0007

1.0000

 

 

 

Example:

 

n (number of records) = 100,000

B (block size) = 40 bytes

Bkfr = 50

Lf = 0.70

 

If we insert some number of records to the file that the specifications are supplied above, the final record size becomes 400,000, so what is the new load factor of this file?

 

Answer:

                                                                                                                           

Since we know that the load factor is proportional to the number of records, namely L =  n/(M x Bkfr), and also we know that the Bkfr and the M will not change, the proportion is:

 

400.000 / 100.000 = 4,

so,

 

LF (new) /  LF (old) = 4, too      

 

LF (new)  = 4 x 0.7

                 = 2.8

 

File Update:

MasterFile

 

Program

 

TransactionFile

 
                                                       I / O         

 


                                                                                                                                   

           

                                                                                                                

 

Report File

 
 

 

 


Properties of transaction file :

 

Report file includes :

 

MasterFile

 
Sequential file update:

 

 

 

 

 

 

 

 

 

 

 

 

 

 


Old master file                                                  master file                                 old master file                                      

Oval: Grand father Oval: father Oval: child
 

 

 

 


Example:

 

Master file  :

Key (Mk)

Date (Md)

E1

8

E5

15

E10

10

E15

5

E20

20

E30

5

 

 

 

 

 

Trans File :

Trans type

Key (Tk)

Data (Td)

ADD

E2

50

MODIFY

E3

70

DELETE

E5

 

MODIFY

E15

30

ADD

E20

80

 

 

New master file :

 

Key

 

Data

 

Master file pointer

 

Trans. File pointer

 

Error

 

 

Actions

E1

8

E1

E2

NO

COPY E1

E5

E2

E5

E2

NO

ADD E2

 

 

E5

E3

YES

 

 

E5

 

 

Variable Names

 

master-file

new-master-file

trans-file

report-file

Mk: master key

Md: master data

Tk: trans key

Td: trans data

Flag

 

 

eof-master-file

 

true when we hit the end of the master file.

 

 

eof-trans-file

 

true when we hit the end of transaction file.

 

 

controller:

 

            open input master-file, trans-file

                     output new-master-file, report-file

            perform read-master-file

            perform read-trans-file

            perform apply-trans-file until eof-master-file = “true”

    or eof-trans-file = “true”

            perform finish up

            close all files

            stop run

 

apply-trans:

 

            case of Mk, Tk

                       

:Mk = Tk:

                        case of Flag

                                   

:A: write report-file ( Error: Tk exists in the master file)

                                          write new-master-file ( Mk, Md )

                                          // Copy from master-file to new-master-file

                                    :D: do nothing

                                    :M: write new-master-file ( Mk, Md+Td )

                       

end of case Flag

                        perform read-master-file

                        perform read-trans-file

                       

 

 

:Mk < Tk:

 

                                    write new-master-file ( Mk, Md )

                                    perform read-master-file

 

                        :Mk > Tk:

                                   

case of Flag

                                               

:A: write new-master-file ( Tk, Td )

                                                :D: Error à Tk does not exist in master file

                                                :M: Error àTk does not exist in master file

                                   

end of case Flag

                                    perform read-trans-file

           

end of case Mk, Tk

 

 

finish up:

 

            case of eof-master-file, eof-trans-file

            :eof-master-file = “true” and eof-trans-file=”true”

                        do nothing

            :eof-trans-file = “true”

            perform copy-master-to-new-master until eof-master-file = “true”

            :eof-master-file = “true”

            perform copy-trans-to-new-master until eof-trans-file = “true”

 

copy-master-to new-master:

 

            write new-master-file ( Mk, Md )

            perform read-master-file

 

copy-trans-to-new-master:

 

            case of Flag

                       

:A: write new-master-file ( Tk, Td )

                        :D: Error à No master record exists for Tk

                        :M: Error à No master record exists for Tk

           

end of case Flag

            peform read-trans-file

 

read-master-file:

 

            read master-file

            at end move “true” to eof-trans-file

 

read-trans-file:

 

            read trans-file

            at end move “true” to eof-trans-file

 

 

Hash file Update:

 

·        Mark all of the records as available

·        Rec_pos points to the record to be read

·        Read master file =  reads the record position indicated by rec_pos

 

File Creation:

prime area size  = 3

Bkfr = 1

 

Key

Mod(key,3)+1

E1

1+1=2

E5

2+1=3

E10

1+1=2

E15

0+1=1

E20

2+1=3

E30

0+1=1

E12

0+1=1

  

 

 

Rec_pos

Key

Link

1

E15

0  ==> 6

 

PRIME

2

E1

0  ==> 4

3

E5

0  ==> 5

4

E10

0  

5

E20

0

6

E30

0  ==> 7

7

E12

0