Many or none on one file and many or none on a second file

Speaker Notes:

Slide 1:

This presentation looks at the logic for processing two files, each of which may have multiple records per id.

Slide 2:

Assume that both files are in order by id.

The goal is to produce a file with one record per id, containing the sum of the information from all of the records on both files with that id.

This logic shows reading a record from each file and then moving the smaller id to the hold area (if the ids are equal, ID1 will be moved). The hold area will be used for comparison to determine when it is time to write. The processing will continue until EOF on both files is reached. At that point the totals for the last id, still in the hold area, will need to be written to the output file.
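The loop described above can be sketched in Python. This is only one possible reading of the flowchart, not the code behind the slides. It assumes each record is an (id, amount) pair, that a single accumulator is kept, and that None stands for EOF. The names summarize and smaller_id and the sample amounts are made up for illustration.

```python
from typing import Iterable, List, Optional, Tuple

Record = Tuple[int, int]  # (id, amount); an assumed record layout


def smaller_id(rec1: Optional[Record], rec2: Optional[Record]) -> int:
    """Pick the id for the hold area; on a tie the id from file 1 is used."""
    if rec2 is None:
        return rec1[0]
    if rec1 is None:
        return rec2[0]
    return rec1[0] if rec1[0] <= rec2[0] else rec2[0]


def summarize(file1: Iterable[Record], file2: Iterable[Record]) -> List[Record]:
    """Return one (id, total) per id from two inputs that are sorted by id."""
    it1, it2 = iter(file1), iter(file2)
    rec1 = next(it1, None)  # initial read from each file; None marks EOF
    rec2 = next(it2, None)
    output: List[Record] = []
    if rec1 is None and rec2 is None:
        return output  # both files empty, nothing to write

    hold_id = smaller_id(rec1, rec2)
    total = 0
    while rec1 is not None or rec2 is not None:
        matched = False
        if rec1 is not None and rec1[0] == hold_id:
            total += rec1[1]
            rec1 = next(it1, None)
            matched = True
        if rec2 is not None and rec2[0] == hold_id:
            total += rec2[1]
            rec2 = next(it2, None)
            matched = True
        if not matched:
            # Neither current record has the held id: write and start a new id.
            output.append((hold_id, total))
            hold_id = smaller_id(rec1, rec2)
            total = 0

    output.append((hold_id, total))  # the last id is written after EOF on both
    return output


# Made-up sample data with ids 111, 222 and 300.
file1 = [(111, 10), (111, 20), (111, 30), (222, 5)]
file2 = [(111, 1), (111, 2), (222, 7), (300, 9)]
print(summarize(file1, file2))  # [(111, 63), (222, 12), (300, 9)]
```

Because the write happens only when neither current record matches the hold area, the sketch never needs to look ahead more than one record on either file.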

Slide 3:

This shows the logic flowchart for two files that can each have many records per id. Note that it is also valid for a file to have no records for a particular id. There is no error processing for ids that appear on only one file, because the files are not required to match.

Slide 4:

In the setup we determine which id to put in holdid. Since the first record on each file has the same id, the id from the file 1 record is moved to holdid.
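As a small illustration of this setup step, the choice of holdid might look like the following in Python. The function name choose_hold_id and the use of None for a file that is already at EOF are assumptions, not part of the slides.

```python
from typing import Optional


def choose_hold_id(id1: Optional[int], id2: Optional[int]) -> Optional[int]:
    """Return the id to place in the hold area; None stands for EOF on a file."""
    if id1 is None:
        return id2  # file 1 is exhausted (id2 may also be None)
    if id2 is None:
        return id1  # file 2 is exhausted
    # On a tie the id from file 1 is used, as in the setup described above.
    return id1 if id1 <= id2 else id2


print(choose_hold_id(111, 111))  # 111
print(choose_hold_id(222, 300))  # 222
```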

This is just one approach. Other programmers would use a very different kind of logic. My main goal is to have you think about the logic and understand the things that must be considered in processing two files with the possibility of multiple records per id.

Slide 5:

Two more records are processed. Each has an id of 111, so the amounts are added to the accumulators. We have not written anything to the output file yet.

Slide 6:

In this situation, only the record in file 1 matches the holdid, so only the amount from that record is added to the accumulator. We now know that there are no more records from file 2 that match the holdid. We will continue processing records from file 1 until a record which does not match the holdid is read. In this example, it is the next record.

Slide 7:

Since the records currently read from both files do not match the id in the hold area, we have moved on to a new id and it is time to write the accumulated total for id 111.
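The test for "time to write" can be stated as a one-line condition. The sketch below is an assumed formulation in Python, with None standing for EOF on a file; the name time_to_write and the sample ids are illustrative only.

```python
def time_to_write(id1, id2, hold_id):
    """True when neither current record (None = EOF) carries the held id."""
    return id1 != hold_id and id2 != hold_id


print(time_to_write(222, 222, 111))  # True: both files have moved past 111
print(time_to_write(111, 222, 111))  # False: file 1 still has a 111 record
```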

Be sure to follow the order of the flowchart to see the processing that was done.

Slide 8:

The records with id of 222 are now being processed. On the next read we will have 300 as the id from file 2. Since we have already passed 222 on file 1, it will be time to write 222.

Slide 9:

In this processing another record gets written to the disk. It contains the total of all of the records with id 222. The accumulator is reset and holdid is reset to 300, the next id to process. Then record 300 from file 2 gets processed and the amount on that record is added to the accumulator.

Slide 10:

The record with id 300 is written to the disk. Note that there was only one record with this id. We have not specified any rules about records having to appear on one file or the other, so the processing here is considered valid.

We have covered most of the situations. Continue the record-by-record processing on your own to make sure you understand.
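One way to check a hand trace is to compute the expected totals by a different method and compare. The Python sketch below is not the two-file logic from the flowchart: it simply groups amounts by id in a dictionary, so it does not depend on the files being in order. The record layout (id, amount), the name totals_by_id, and the sample amounts are assumptions made for illustration.

```python
from collections import defaultdict


def totals_by_id(*files):
    """Sum amounts per id across any number of inputs, then sort by id."""
    totals = defaultdict(int)
    for records in files:
        for rec_id, amount in records:
            totals[rec_id] += amount
    return sorted(totals.items())


# Made-up sample data with ids 111, 222 and 300.
file1 = [(111, 10), (111, 20), (111, 30), (222, 5)]
file2 = [(111, 1), (111, 2), (222, 7), (300, 9)]
print(totals_by_id(file1, file2))  # [(111, 63), (222, 12), (300, 9)]
```

If the totals from your record-by-record trace differ from this list, the step where they first diverge is the place to recheck against the flowchart.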