Separate speaker notes to accompany the slide show on sequential update processing (sequpdateproc):


Slide #1:

This slide show deals primarily with the processing that is required to edit data and update a file sequentially. 

Slide #2:

Maintenance updating is used to add, change and delete records from a file in the traditional sense of updating.

Production updating is when the file changes because of the daily processing that occurs.  Production updates usually have only change transactions.  For example, a production update would deal with receipts into inventory and sales from inventory, whereas a maintenance update would be concerned with adding new items to inventory, changing the price of an item or deleting an item from inventory.

Sequential updating is useful when there is a large percentage of hits between the two files.  If you have additions, changes or deletions for the majority of records on the file, it makes sense to process sequentially.  If you have a small percentage of hits, that is, you will be adding, changing or deleting a small number of records, then a random update is more efficient.

Slide #3:

Note that prior to doing the update the Valid Transactions need to be in the same order as the master file. In sequential processing you are going to be stepping through both files one record at a time and processing them together.  Therefore, they must be in the same order!
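
A quick check that each file is in ascending ID order can catch a missed sort before the update runs.  The following is a minimal sketch in Python, used here only for illustration since the notes show no code of their own; the function name and the sample IDs are assumptions.

```python
def is_ascending(ids):
    """Return True if every ID is >= the one before it.

    Equal neighbours are allowed because a transaction file may carry
    several transactions for the same ID.
    """
    return all(a <= b for a, b in zip(ids, ids[1:]))


master_ids = [121, 123, 222, 444]        # illustrative master IDs
transaction_ids = [121, 222, 222, 350]   # illustrative transaction IDs
unsorted_ids = [222, 121, 350]

print(is_ascending(master_ids))          # True
print(is_ascending(transaction_ids))     # True
print(is_ascending(unsorted_ids))        # False
```

A master file would normally have no repeated IDs at all, so a stricter check (a < b) may suit it better.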

Slide #4:

The three parts of updating a file are editing, sorting and updating.  The update step applies transactions that add records to the file, change information in records on the file or delete records from the file.

Slide #5:

If the errors will be handled in future processing then you can write the transactions with errors to disk, print them or both write to disk and print.  The disk errors can then be handled by another program where a user can correct the data as it displays on a screen or someone can use the printout to enter corrections to the transactions that have problems.

If the errors are being handled interactively then the user will try to make corrections to the error transactions displayed on screen.  Errors that cannot be corrected are printed or written to disk for future editing.
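
The edit step described above can be sketched as a routine that splits the incoming transactions into valid ones and errors.  This is only an illustration in Python: the record layout (a dict with 'tid' and 'code'), the A/C/D codes and the two checks are assumptions, and a real edit would test whatever the application requires.

```python
VALID_CODES = {"A", "C", "D"}  # assumed codes: Add, Change, Delete


def edit_transactions(transactions):
    """Split raw transactions into (valid, errors).

    Errors keep the original record plus a reason, so they can be
    printed, written to disk, or both.
    """
    valid, errors = [], []
    for t in transactions:
        tid = str(t.get("tid", ""))
        if not tid.isdigit():
            errors.append({"record": t, "reason": "ID is not numeric"})
        elif t.get("code") not in VALID_CODES:
            errors.append({"record": t, "reason": "unknown transaction code"})
        else:
            valid.append(t)
    return valid, errors


raw = [
    {"tid": "121", "code": "C"},
    {"tid": "12X", "code": "A"},
    {"tid": "350", "code": "Z"},
]
valid, errors = edit_transactions(raw)
for e in errors:
    print(e["reason"], e["record"])
```

Only the valid list goes on to the sort; the error list is what would be printed, written to disk or shown to a user for correction.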

Slide #6:

When you sort you can use a sort that accompanies the language you are using, a sort provided with the operating system or you can even write your own sort.  Usually you use a sort that has already been written.  Writing a sort is time consuming and there is no real advantage to redoing something that has already been done and is readily available.
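
As an illustration of using a sort that already exists, here is a short Python sketch that sorts valid transactions by ID with the built-in sorted().  The record layout and the 'seq' field (the order the transactions arrived in) are assumptions made for the example.

```python
transactions = [
    {"tid": 222, "code": "C", "seq": 1},
    {"tid": 121, "code": "C", "seq": 2},
    {"tid": 222, "code": "C", "seq": 3},
    {"tid": 200, "code": "A", "seq": 4},
]

# sorted() is stable: records with equal keys keep their original order.
sorted_transactions = sorted(transactions, key=lambda t: t["tid"])

print([(t["tid"], t["seq"]) for t in sorted_transactions])
# [(121, 2), (200, 4), (222, 1), (222, 3)]
```

A stable sort matters when several transactions share an ID, because they then reach the update in the order they were entered.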

Slide #7:

This is the sequential update program.  It was preceded by the edit program and the sort program.  It is possible to include the edit, the sort and the update all in one program; however, they are frequently done as separate processes.

In this case I am calling the file to be updated a master file.  A master file is something like an inventory file, a payroll file or a student file at the college.  It is updated with transactions that have been edited and sorted.  That is why I call the transaction file the sorted valid transaction file.

Slide #8:

Maintenance updating is the process by which files are kept current.  The rest of this slide presentation will deal with the sequential maintenance updating process.

Slide #9:

This is a chart showing the logic of the sequential update.  For additional information see the notes and read about the logic of sequential updates from some other source - book in the library, source on the Web etc.

Slide #10:

Note that frequently I use all 9s in the ID to indicate EOF.  The data can actually contain the 9s in the MID and the TID or the read statement can move the 9s there when EOF is encountered.

Note that with a numeric field all 9s is the highest value that can be in a field.  In a string field, we can set the field to all binary 1s - nothing can be larger than all binary 1s.
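
The two kinds of EOF value can be shown in a few lines.  This is a sketch in Python under the assumption of a 3-position ID field; the sample IDs are made up.

```python
ID_WIDTH = 3  # assumed field width

numeric_eof = int("9" * ID_WIDTH)   # 999: highest value a 3-digit numeric ID can hold
string_eof = b"\xff" * ID_WIDTH     # every bit set: highest value for a 3-byte field

print(numeric_eof)                                             # 999
print(all(n < numeric_eof for n in (121, 456, 998)))           # True
print(all(s < string_eof for s in (b"ABC", b"zzz", b"999")))   # True
```

Either way, no real record can use the EOF value as its ID, otherwise it would be mistaken for the end of the file.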

Slide #11:

The logic of the update loop is shown in the flowchart above.

Note that in the read old master routine I move all 9s to MID if the AT END clause is taken and in the read transaction routine I move all 9s to the TID if the AT END clause is taken.
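
A read routine of this kind might look like the following Python sketch.  The name read_next, the dict record layout and the value 999 are assumptions; the default passed to next() plays the part of the AT END clause.

```python
EOF_ID = 999  # assumed: all 9s in a 3-digit ID field marks end of file


def read_next(records):
    """Return the next record, or a sentinel record at end of file.

    `records` is an iterator. Supplying EOF_ID as the ID mirrors moving
    all 9s to the ID when the AT END clause is taken.
    """
    return next(records, {"id": EOF_ID})


old_master = iter([{"id": 121}, {"id": 123}])

print(read_next(old_master)["id"])   # 121
print(read_next(old_master)["id"])   # 123
print(read_next(old_master)["id"])   # 999
print(read_next(old_master)["id"])   # 999 (stays at EOF)
```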

Note: In the code illustrated here the WRITE routine is not a paragraph to be performed - the actual WRITE code is included in the code.  This also applies to the ADD Routine - the reporting of the ADD is done in the ADD Routine but the actual WRITE is done in the code.

Whatever approach you choose, I feel that logically it is good to control all I/O from this routine so it is easy to trace if problems occur.

You should also note that the delete routine accomplishes its main goal by just not writing anything on the new master (a trail should be produced).  Be sure to read both the transaction and the old master after a delete: the transaction has been dealt with, and you want to move past the master record that is being deleted so it is not put on the new master.
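
Putting the points above together, the whole update loop can be sketched as follows.  This is a Python illustration under stated assumptions, not the code from the slides: masters are dicts with 'id' and 'price', transactions are dicts with 'id', 'code' (A, C or D) and a 'price' for adds and changes, and 999 marks EOF.  As in the notes, the writes and reads are all controlled from the loop itself.

```python
EOF_ID = 999  # assumed sentinel: all 9s in a 3-digit ID


def read_next(records):
    """Utility read: return the next record or a sentinel at end of file."""
    return next(records, {"id": EOF_ID})


def sequential_update(old_master, transactions):
    """Return (new_master, trail) for inputs sorted by ID."""
    masters, trans = iter(old_master), iter(transactions)
    new_master, trail = [], []
    work = dict(read_next(masters))   # master work area (a copy of the record)
    t = read_next(trans)

    while not (work["id"] == EOF_ID and t["id"] == EOF_ID):
        mid, tid = work["id"], t["id"]
        if mid < tid:
            # No (more) transactions for this master: write the work area.
            new_master.append(work)
            work = dict(read_next(masters))
        elif mid > tid:
            # TID is not on the old master: only an add is legal.
            if t["code"] == "A":
                new_master.append({"id": tid, "price": t["price"]})
                trail.append(f"added {tid}")
            else:
                trail.append(f"error: {t['code']} for missing {tid}")
            t = read_next(trans)
        elif t["code"] == "C":
            # Change the work area; do not write yet, more changes may follow.
            work["price"] = t["price"]
            t = read_next(trans)
        elif t["code"] == "D":
            # Delete: write nothing, leave a trail, read both files.
            trail.append(f"deleted {mid}")
            work = dict(read_next(masters))
            t = read_next(trans)
        else:
            trail.append(f"error: add for existing {tid}")
            t = read_next(trans)
    return new_master, trail


old = [{"id": 121, "price": 5}, {"id": 123, "price": 7},
       {"id": 222, "price": 9}, {"id": 444, "price": 3}]
txs = [{"id": 121, "code": "C", "price": 6},
       {"id": 200, "code": "A", "price": 1},
       {"id": 222, "code": "C", "price": 10},
       {"id": 222, "code": "C", "price": 11},
       {"id": 350, "code": "C", "price": 2},
       {"id": 444, "code": "D"}]
new, trail = sequential_update(old, txs)
print([(r["id"], r["price"]) for r in new])
# [(121, 6), (123, 7), (200, 1), (222, 11)]
print(trail)
# ['added 200', 'error: C for missing 350', 'deleted 444']
```

Because the add is written to the new master immediately, a later change for that same new ID in the same run is reported as an error in this sketch.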

Slide #12:

Again please note that in the code in this slide show the Write new master from master work is actual code in the LOOP instead of occurring in a routine and the Write new master from add transaction is also actual code in the LOOP instead of occurring in the Add Routine.

Slide #13:

The reads are put as utility routines so that the code is only generated once.  I/O statements are expensive in the sense of the machine language code that is generated.  By coding them only once, we increase efficiency.

Slide #14:

Note that the master file idnos and the transaction idnos are in the same order.  As stated before, this is a requirement for sequential processing.

Slide #15:

For the purpose of this demonstration, the identification number on the old master file will be MID and the identification number on the transaction file will be TID.

Slide #16:

We are assuming that the C transaction with TID=121 caused some changes to be made in the work area and that the old master 121 that is written to the new master will include these changes.

Slide #17:

In this case we are basically copying 123 to the new master.  There were no transactions for 123 so the unchanged record is written to the new master.

Slide #18:

An ADD can be made only when the TID does not already exist on the old master.  We know that is the case when the MID is greater than the TID.

Slide #19:

If you were sure that there was only one change allowed per id, you could write the record after making the change.  But, since we are allowing multiple changes per id we have to read the next transaction and find out whether it is a match or not.  If it matches the master, another change will be made.  If it does not match the master, the TID will now be greater than the MID so the changes will be written.

Slide #20:

Again we do not write because there could still be another transaction with TID=222.

Slide #21:

Be sure when you write the old master you write from the area where the changes were made.

Slide #22:

Note that this time we are simply copying the old master to the new master - no changes were made since there were no transactions for this record.

When writing the code, you have to be very careful to set up the old master record and/or write the procedure division code so that if changes have been made they will be written but that records without changes will be processed correctly as well.

Slide #23:

Note: You can only delete a record if a matching record exists on the old master.

Deleting is actually doing nothing.  I do not want to write the old master to the new master.  The act of not writing deletes it.

This is a strong argument for having an additional report that is a paper trail - you want a trail of records - especially those that are deleted.  You could choose to make the error report a combination of errors and deletions and write the trail of the deleted records there.

Slide #24:

Essentially in determining what to read, you read from the file where you have dealt with the record.  If you have dealt with the master record, you read a master record.  If you have dealt with the transaction record, you read a transaction record.  If you have dealt with both, you read both.
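
The read rule can be written down as a small decision function.  This is a Python sketch, assuming codes A (add), C (change) and D (delete) and that several changes may arrive for one ID; the function name is hypothetical.

```python
def files_to_read(mid, tid, code):
    """Return the set of files to read next: 'master', 'transaction' or both."""
    if mid < tid:
        return {"master"}                  # master dealt with, transaction still waiting
    if mid == tid and code == "D":
        return {"master", "transaction"}   # a delete deals with both records
    # Adds, changes and error transactions deal only with the transaction.
    return {"transaction"}


print(sorted(files_to_read(123, 222, "C")))   # ['master']
print(sorted(files_to_read(444, 444, "D")))   # ['master', 'transaction']
print(sorted(files_to_read(222, 222, "C")))   # ['transaction']
print(sorted(files_to_read(444, 350, "C")))   # ['transaction']
```

A matching change reads only the transaction file because the master stays in the work area until all of its changes have been applied.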

Slide #25:

When MID > TID (in this case MID=444 and TID=350), that means that we have passed the place on the master where 350 would be if it existed.  Since 350 does not exist on the master, the transaction is invalid.

Slide #26:

Because MID=444 and TID=444 you cannot add the record.  There is already a record with identification number 444 on the master file.

Slide #27:

The pattern should be emerging here that we always deal with the smallest record.  If the master is smallest we deal with that.  If the transaction is smallest we deal with that.  If they are the same, we deal with them together.

Slide #28:

You cannot delete something unless it exists.  The fact that the MID is 456 and the TID is 450 means we have passed the place where 450 would be on the master - therefore we know that 450 did not exist on the master and the delete is an error.

Slide #29:

I am using 999 as an EOF indicator.  I have made the last record on each file have an identification number of 999.  When both files reach 999, I know that processing is complete.

Slide #30:

Note that once the transaction file has reached the 999 the only thing to do is copy the remaining old master records onto the new master.  Note: MID will always be less than TID until the old master also reaches 999.

If the master file had reached the 999 record first, the only valid thing you could do is process the Add records by writing them to the New Master and process Changes and Deletions as errors.  When the old master is at 999 the only possible relationship is MID > TID and the only thing that is legal in that situation is an Add.

Slide #31:

This way of testing makes it easier to write the logic because you are constantly checking the relationship between MID and TID anyway.  If the files do not physically contain 999 as a last record, you can move 999 to the MID or the TID when the file reaches EOF by including the move in the AT END clause.