CS 351 -Data Organization and Management ( Section: 02)

LectureNotes, Week 4

Prepared by: Seher BAS /Aysan BAYSAL /Esra DALGIC

 

REPLACEMENT SELECTION SORT

(DUE TO Knuth com of ACM,1965)

 

            Let m be the number of records to be sorted which can be kept in the main memory. Imagine that these memory locations are registers and assume we can mark them as “on” or “off”. Replacement selection can overlap reading, sorting and writing.

 

The Algorithm:

 

Step1: The m registers are filled with records from the input to be sorted.

Step2: All registers are put into the “on” state.

Step3: Select the register which has the smallest of all “on” registers.

Step4: Transfer the contents of the selected register to the output (call it as key Y).

Step5: Replace the contents of the selected register by the next input record.

            If new record key > Y

                        Go to step 3

            If new record key = =Y

                        Go to step 4

            If new record key < Y

                        Go to step 6

 

Step6: Turn the selected key register  “off”.

            If all registers are now  “off”

                        We have completed a sorted substring(run).

                        We start a new substring group and go to step 2.

            Else

                        Go to step 3

 

Example: Suppose number of registers is 3 and input keys are 5, 2, 9, 7, 0, 8, 1, 6, 3, 4 ...

 

            Registers                               Output

 

            5 2 9                                       2

            5 7 9                                       5

*0 7 9                                     7

*0 8 9                                     8

*0*1 9                                    9

*0*1*6

0 1 6                                       0

3 1 6                                       1

3 4 6                                       3

- 4 6                                        4

- - 6                                        6

- - -

 

Using Two Disk Drives For Heap Sort:

 

 

 

Can: Output file Text Box: heap Can: Input file

 

 

                        Sort time = b* ebt (to create sorted segments)

 

 

1.          Create HeapA

2.          Write a block of records of HeapA to disk  read a block of records from input file

3.          Consider the first record of the block read from input file(new key)

If new key < keys written of the records written out

            Insert new key to HeapB

Else

            Insert new key to HeapA

 

 ONE DISK DRIVE LARGE-MEMORY MERGING

 

            2- way merge, 4-way merge, p-way merge

 

Assumptions: 2400 MB file 10 MB memory for heap construction.

 

 

 

Reads 5MB partition of each segment each segment contains seg number of blocks

(4+3)*(r+s)+2*2*seg*ebt

10MB/2400bytes=4167 segments

7*(24,3)+4*4167*0,84=14,2 sec (just to merge two sorted segments four halves of two segments)

Since we have 240 sorted segments

120*14,2=1704 sec = 28,4 min (for the first pass)

 

Passes

1

2

3

4

5

6

7

8

Segment size

10

20

40

80

160

Seven 320 one 160

Three 640 one 480

One 1280 one 1120

Number of segment

240

120

60

50

15

8

4

2

 

 

 

Number of passes =[lg240] (for tw0 way merge)

 

Consider 80 Mb, if we have “nsg ”segment of memory size, we read half of each segment at a time we have 2*nsg pieces.

 

Cost of one pass = 2*2*(nsg)*(r+s)+2*b*ebt

                           = 4*240*(24,3)+2*1000000*0,84

                           =1703 sec=28,4 min/pass

we have 8 passes

Total cost=28(heap sort creates initial run)+8*28,4=28+227=255min=4,25 hours.

 

 

The Four-Way Merge

 

Cost of one pass= 2*4*(nsg)*(r+s)+2*b*ebt

                          =8*240*24,3+2000000*0,84

                          =28,8 min

                       

[log4240]=4

 

Total cost= 4*28,8+28=143 min

 

The P-Way Merge

 

A=Cost of one pass=2*p*(nsg)*(r+s)+2*b*ebt

B=Number of passes=logp (nsg)

Total cost = A*B+ initial sorting time(execution time)

 

Maximum number of initial runs => p=nsg

 

A=2*nsg*nsg*(r+s)+2*b*ebt

   =2*2402 *24,3+1680000

   =74 min.