CS 351 -Data Organization and Management ( Section: 02)

LectureNotes, Week 7

Prepared by: Ilginç Demir /Anil Öztuncer /Ugurcan Karayel

 

LINEAR HASHING

 

Extendible hashing and dynamic hashing require index structure.

However Linear Hashing requires no index structure and we can specify the load factor, as the file size changes we change the hash function.

In linear hashing, the last bits in the hash number are used for placing the records in the file.

 

Example: Begin with 8 buckets and use the last three bits.

 

hex

bin

34

100 010

17

010 001

5

000 101

28

011 100

25

011 001

50

110 010

55

110 111

8

001 000

                                   

000

8

 

 

001

17

25

 

010

34

50

 

011

 

 

 

100

28

 

 

101

5

 

 

110

 

 

 

111

55

 

 

    

Boundary value bv = 0

Bkfr = 3, use last k=3 bits.

2k =  tableSize < 2k+1

 

 

SEARCH ALGORITHM

 

Consider the last k bits of the key value, call it m.

if m < bv then we use last k+1 bits as address

else use m as address.

if necessary, follow overflow chains.

 

 

TABLE EXPANSION

 

            000    bv=0

 

|

,,

|

,,

|

,,

|

,,

|

,,

111

 

 

 

0000

 

           001   bv = 1

,,

|

,,

|

,,

|

,,

|

,,

111

 

1000

 

 

 

0000

 

0001

,,

          | bv = 2

,,

|

,,

|

,,

|

,,

111

 

1000

 

1001

 

 

 

INSERTION

 

1.                  Establish the correct bucket.

(use the search algorithm for that purpose)

2.                  If bucket is not full than enter the record

Else link (another) overflow bucket and enter it there.

           3.          if Lf * Bkfr records have been added since the last expansion than        

ˇ                    Add a new bucket to the primary area

ˇ                    Distribute the records in the boundary bucket between the new and

boundary buckets

Advance the bv value to the next bucket.

 

 

 

Index Sequential Access Method (ISAM)

 

Prime data area

Overflow area      

-         Cylinder overflow area (faster access, since there is no seek time)                                 

-         Independent overflow area (for many cylinders, slower since it involves additional seek time.)

Index area

 

 

 

Provides access to records in ascending order.

 

Cylinder 

 

Track Index

 

 

 

 

Track Index: Cylinder index: Master index (for large files).

 

IBM file structure no longer available, replaced by VSAM (Virtual Storage Access Method [B+ Tree Structure]).

            Example:

           

            Cylinder 1

1

TI

TI

10

15

2

20

25

30

35

3

40

45

50

55

4

 

 

 

 

 

 

 

{Overflow track}

 

 

            Cylinder 2

1

TI

TI

110

115

2

120

125

130

135

3

140

145

150

155

4

 

 

 

 

 

 

 

{Overflow track}

 

 

            Track Index for Cylinder 1

1

15

-

-

2

35

-

-

3

55

-

-

There are the highest key values for prime area and overflow area in the first and the second columns respectively.

The third column is for the address of the overflow record.

 

 


            Track Index for Cylinder 2

1

115

-

-

2

135

-

-

3

155

-

-

 

 

 

 

 

            Cylinder Index for both of the cylinders

1

55

2

155

 

 

 

 

           

             Delete 25, Add 22, Add 42, Add 60, Delete 30, Add 24, Add 47, Add 160.


            Delete 25

            Cylinder 1

1

TI

TI

10

15

2

20

25

30

35

3

40

45

50

55

4

 

 

 

 

25 marked as deleted.

 

 

 

 

            Add 22

 

           Cylinder 1

1

TI

TI

10

15

2

20

22

25

30

3

40

45

50

55

4

35*

 

 

 

*: There is a follower of this overflow bucket.

 

 

 

 

 

            Track Index for Cylinder 1

1

15

-

-

2

30

35

4,1

3

55

-

-

 

 

 

 

 

            Add 42

           Cylinder 1

1

TI

TI

10

15

2

20

22

25

30

3

40

42

45

50

4

35

55

 

 

               Track Index for Cylinder 1

1

15

-

-

2

30

35

4,1

3

50

55

4,2

 

 

 

 

 

            Add 60

            Cylinder 2

1

TI

TI

60

110

2

120

125

130

135

3

140

145

150

155

4

115

 

 

 

Track Index for Cylinder 2

1

110

115

4,1

2

135

-

-

3

155

-

-

 

 

 

 

 

          Delete 30

          Cylinder 1

1

TI

TI

10

15

2

20

22

25

30

3

40

42

45

50

4

35

55

 

 

30 marked as deleted.

No change in the track index.

 

 

 

 

            Add 24

           Cylinder 1

1

TI

TI

10

15

2

20

22

24

25

3

40

42

45

50

4

35

55

 

 

30 don’t goes to overflow area as long as it was marked as deleted.

 

 

 

           

 

             Track Index for Cylinder 1

1

15

-

-

2

25

35

4,1

3

50

55

4,2

 

 

 

 

 

            Add 47

 

           Cylinder 1

1

TI

TI

10

15

2

20

22

24

25

3

40

42

45

47

4

35

55

50

 

 

 

 

 

 

 

            Track Index for Cylinder 1

1

15

-

-

2

25

35

4,1

3

47

55

4,3*

*: The third column points to the record with the smallest key value.

As long as, the smallest value is 50 in the overflow area, it point outs the address of 50 (4,3).

 

 

 

            Add 160

 

            Cylinder 2

1

TI

TI

60

110

2

120

125

130

135

3

140

145

150

155

4

115

160

 

 

160 is bigger than the biggest value of Cylinder2. So it is placed in the overflow area.

 

 

                                                  

 

 

            Track Index for Cylinder 2

1

110

115

4,1

2

135

-

-

3

155

160

4,2

 

 

 

 

 

 

            Consider transaction processing:

 

                        ADDITION

                        DELETION

                        MODIFICATION

 

Indexed File Update:

 

Can: Master File
Can: Transaction File
                       

 

 

Update File

 
           

 

Report

 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               

 

                           File   

 

 

                       Sequential File

 

 

            open    i-o master file                                                    apply_trans:

                        input trans_file

                        output report_file                                                         MK = TK

                                   

            perform                                                                                   perform read_master_file

                        apply_trans

                                                                                                            case of trans_flag

            until    

                        trans_eof = “true”                                                         A: perform addition

 

            close                                                                                        B: perform deletion

                        files

                                                                                                            C: perform modification

            stop

                        modification                                                                  end case

 

                                                                                                            perform read_trans_file

 

            addition:

 

                        if file_status not= 0   //record does not exists in the file

                        then

                                    master_record = trans_record

                                    write master_record

                        else

                                    error