Data Processing

  • Data Processing

  • Definition of the terms:

    Data, information and data processing

    Data processing cycle

    Data collection

    i)  stages of data collection

    ii) methods of data collection

    Data input



    Description of errors in the data processing

    • Transcription errors
    • Transposition

    Data Integrity

    • Accuracy
    • Timeliness
    • Relevance

    Data processing methods

    • Manual/conventional
    • Mechanical
    • Electronic

    Computer files

    • Elements of computer file
    • Logical and physical files

    Types of computer processing file

    • Master
    • Transaction
    • Report
    • Sort
    • Backup
    • Reference

    File organization methods

    • Sequential
    • Random/direct
    • Serial
    •  Indexed sequential

    Electronic Data processing modes

    • Online
    • Distributed
    • Time-sharing
    • Batch processing
    • Multi-processing
    • Multi-programming/multitasking
    • Interactive processing
    • Real-time

    Data processing deals with how data is organized & processed in the computer.


    Data is a collection of facts & figures, which can be processed to produce information.

    Data are the facts relating to an activity in a given environment. 

    The activity can be Accounting, Inventory control, etc.  The environment can be business, scientific, education, etc.


    • In an educational environment, when students sit for exams, the grades obtained represent the data to be processed by the computer.  In this case, data can be Names of students & Marks obtained.
    • In a business environment, data can be the number of hours worked, names of employees, stock, etc.

    Data can also be described as Raw data if they are not yet processed, i.e. if they do not convey particular meaning to a given activity within any given environment.

    It therefore means that data are unprocessed information consisting of details relating to business transactions.  For example, in a Payroll system, data are employees’ names, basic salaries, department numbers, marital status, etc.


    Data processing is the collection, manipulation & distribution of data (i.e. letters, numbers & graphic symbols) to achieve certain objectives.

    The processing may involve calculations, comparisons, decision-making and/or any other logic to produce the required result.

    It can also be defined as the activity of manipulating raw facts to generate a set of meaningful data (described as Information), which is able to convey some meaning.

    In a business context, data processing covers those activities that are concerned with the systematic recording, arranging, filing, processing, and dissemination of facts relating to the physical events occurring in a business.

    Data processing is a very important activity in any organization of any size or nature because it generates information for decision-making.

    If the data processing uses complicated processing tools or aids, e.g. the computer, it is described as Electronic Data Processing (EDP).


    Information is data that has been summarized and processed in the way you want it, so that it is useful in your work.

    Information is an assembly of meaningful data items.

    The information in a Payroll activity includes Net pay, Total Tax deductions, etc.  In Stock Control, the information generated includes Closing stock, Total cost of the items, Purchases, Sales, etc.

    The information is obtained by applying some processing procedures to the raw data being input.  For example, to get the Net pay in a Payroll activity, the procedure would be:

    Net pay = (Basic salary + Allowances + Overtime, if any) – Taxes.
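
    The Net pay procedure above can be sketched in a few lines of Python.  This is only an illustration: the function name, the field names and the figures are invented for the example and do not come from any real payroll system.

```python
def net_pay(basic_salary, allowances=0.0, overtime=0.0, taxes=0.0):
    """Net pay = (Basic salary + Allowances + Overtime, if any) - Taxes."""
    gross = basic_salary + allowances + overtime
    return gross - taxes

# Hypothetical raw data for one employee (values invented for illustration).
employee = {"name": "A. Employee", "basic_salary": 30000, "allowances": 5000,
            "overtime": 2000, "taxes": 6000}

pay = net_pay(employee["basic_salary"], employee["allowances"],
              employee["overtime"], employee["taxes"])
print(employee["name"], pay)  # 30000 + 5000 + 2000 - 6000 = 31000
```

    Here the dictionary holds the raw data, the function is the processing step, and the printed Net pay is the information.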

    Information is the end product of data processing available at the right place, the right time and in the right form.

    The information generated by the data processing activities is very important in the working strategies of any organization, because it is used by the organization to make decisions. 

    Characteristics/ Features of good Information.

    It should: -

    1. Have and serve a purpose.
    2. Be relevant to its purpose.
    3. Be complete, accurate, and comprehensive.
    4. Have been obtained from a reliable source.
    5. Be communicated to the right person and in the right time (i.e. it should be timely).
    6. Be clear and understandable by the user.
    7. Be trusted by the user (i.e. the user must have confidence in it).

    Relationship between Data, Data Processing, and Information.

    Data are the facts which relate to any particular activity, and do not have any specific meaning. 

    Information is data with a definite meaning.

    Data processing is the process, which transforms data into information.

    In a Manufacturing industry, data may be compared to raw materials and Information to finished products.  Just as raw materials are transformed into finished products, raw data are transformed into information.

    In order to generate information from data items, a set of processing activities has to be performed on the data items in a specific sequence depending on the desired final result.  Performing these processes is known as Data processing.


    Data processing cycle refers to the various stages involved in converting data into information.

    Basic stages in the Data processing cycle.

    There are 5 primary elements/functions of a data processing system: Input, Processing, Storage, Output, and Control.


    Data Collection is the process involved in getting the data from the point of its origin to the computer in a form suitable for processing.

    Note.  Data collection starts at the source of the raw data & ends when valid data is within the computer in a form ready for processing.


    Data Entry:

    Nowadays, most end-users input data to the computer using Keyboards on PCs, Workstations, or Terminals.

    Data can originate in many forms, but the computer can only accept it in a machine-sensible form. 

    Problems of Data Entry.

    The data to be processed by the computer must be presented in a Machine-sensible form (i.e. in the language of a particular input device).

    Note that most of the data originates in a form that is not machine-sensible.  Therefore, the data must undergo the process of Transcription before it is suitable for input to the computer.

    The process of Data collection involves getting the original data to the “processing center”, transcribing it, sometimes converting it from one medium to another, and finally getting it into the computer.  This process involves a great number of people, many machines, and much expense.


    Data Capture:

    Data Capture is the process of obtaining data in a computer-sensible form at the point of origin.

    Obtaining data in a computer-sensible form helps to avoid many of the problems of data entry.

    The captured data may be stored in some intermediate form for later entry into the main computer in the required form.  If data is input directly into the computer at its point of origin, the data entry is said to be On-Line.  In addition, if the method of direct input is a terminal or workstation, the method of input is known as Direct Data Entry (DDE).



    The process of data collection may involve any number of the following stages depending on the methods used.

    1. Data Creation.


    This involves 2 basic alternatives:


    i)  Source documents.


    A source document is the original document used to record data and/or instructions.


    Most data is in the form of manually written or typewritten documents, i.e. the data is on clerically prepared source documents.


    ii) Data capture.  This involves preparing the source document itself in a machine-sensible form so that it may be used as input to the computer without the need for transcription.  The prepared source document is then read directly by a suitable device, e.g. a Bar code reader.


    Data capture eliminates the need for transcription.


    Note.  The method and medium adopted for data creation will depend on factors such as Cost, Type of application, etc.


    Data Transmission.

    This will depend on the method & medium of data collection involved/adopted. 

    If the computer is located at a central point, the documents will be physically “transmitted” to the central point (i.e. sent by the Post office or a Courier).

    The data can also be transmitted by means of Telephone lines to the central computer.  In this case, no source documents would be involved in the transmission process.

    Data Preparation.

    Data Preparation is the term given to the transcription of data from the source document to a machine-sensible medium.

    There are 2 parts involved in the data preparation:

    1. The original transcription itself, and
    2. The Verification process that follows.

    Conversion of data from one medium to another.

    Data is prepared in a particular medium & converted to another medium for faster input into the computer.

    For example, data might be prepared on Diskette, or captured onto Cassette, and then converted to magnetic Tape for input.

    The conversion will be done on a computer that is separate from the one for which the data is intended.


    The data, now in magnetic form, is put into the computer and subjected to validity checks by a computer program before being used for processing.


    The sorting stage is required to re-arrange the data into the sequence required for processing.

    Sorting is necessary for efficient processing of sequentially organized data in many commercial and financial applications.


    In all the stages of data collection, control must be established and applied where necessary.  In other words, Control is usually applied throughout the whole process of data collection.


    The following are alternatives that can be used to collect data:

    1. Use of Data Capture devices such as Scanners, Kimball Tags, Point-of-Sale systems, Bar-code readers & Magnetic strip readers.


    The System designer must guard against the following types of errors:

    1. Transcription (copying) errors.
    2. Missing source documents.
    3. Source documents whose entries are omitted, illegible, or suspicious/doubtful.
    4. Program faults (errors).
    5. Machine hardware faults.

    Note.  Machine hardware faults are less common because modern computers have self-checking facilities & usually signal any internal failure.


    1. Accuracy:
    2. Timeliness:
    3. Relevance:


    The quality of Input data is important to the accuracy of output.  Control must be instituted as early as possible in the system & everything possible must be done to ensure that data is complete and accurate before being input to the computer.

    Objectives of Data Control.

    The objectives of Control are:

    1. To detect, correct and re-process all errors.
    2. To ensure that all data is processed.
    3. To preserve the integrity/reliability of maintained data.
    4. To prevent and detect fraud/deception.

    Note.  Control must be designed into the system & thoroughly tested.  Failure to build in adequate control may cause expensive systems to fail.  In addition, all users must be fully consulted to ensure that adequate controls are implemented.

    Types of Data Controls.

    The following are controls that can be used to ensure data accuracy:


    Verification is the process of checking & ensuring that data has been transcribed/written out correctly.

    In verification, the same data is given to several computer users to enter into the computer, and the results are compared.  Alternatively, a second transcription is compared with the first one.  If the results are different, then there is an inaccuracy in that data.

    This method is commonly used to confirm password changes, where the new password is entered twice.

    Note.  Verification calls for manual intervention, hence errors are possible.  Some copying/transcription mistakes are difficult to detect during verification, e.g. the confusion of l (letter l) and 1 (one).  In this case, l might be input instead of 1 or vice versa, and such mistakes can go undetected.
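
    The comparison of two transcriptions can be sketched as follows.  This is a minimal illustration, assuming each transcription is held as a dictionary of field values; the field names and data are invented for the example.

```python
def verify(first_entry, second_entry):
    """Return the names of the fields whose two keyed-in values disagree."""
    return [field for field in first_entry
            if first_entry[field] != second_entry.get(field)]

# The same source document keyed in twice (hypothetical data).
first = {"admission_no": "1042", "name": "Jane", "marks": "71"}
second = {"admission_no": "1042", "name": "Jane", "marks": "17"}  # digits swapped

mismatches = verify(first, second)
print(mismatches)  # ['marks']
```

    A mismatch only shows that one of the two entries is wrong; the source document must still be consulted to find out which.  If both operators make the same mistake, the comparison reports nothing.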

    The main types of errors, which might occur are: -

    1. Missing data.
    2. Duplication of data.
    3. Use of outdated records.
    4. Incorrect batches of input data.
    5. Incorrect recording at the source.
    6. Incorrect data preparation.

    Manual controls.

    This involves considerable checking of the source documents.

    Such checks may be:

    1. Inspecting the source documents to detect missing entries, illegible entries, illogical or unlikely entries.
    2. Comparing the document against stored data to verify entries.
    3. Re-calculating to check calculations made on the document.


    A Computer cannot notice errors in the data being processed in the way that a Clerk or Machine operator does.

    Data validation is the process of preventing wrong data from being processed.  It involves checking whether the data presented to the computer is valid or applicable.  During input or data preparation, the data must first be checked for transcription errors, through a process known as Verification.

    Once the data is brought into the computer memory from an input device, and immediately before processing, it is subjected to further checks built into the program.  These are described as validation checks, and they test the integrity of the data, i.e. its conformity to the processing requirements.

    Data validation includes testing for the following:

    Test for reasonableness.

    The computer program checks whether the data is reasonable, e.g., the number of people should not be given as a fraction, such as 9½ children.

    Test for numbers.

    E.g., numeric fields should not contain letters.

    Test for alphabets.

    E.g., alphabetic fields should not contain numbers.
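
    The three tests above could be built into a program along the following lines.  This is a sketch only: the record layout (a name and a number of children, both received as text), the function name and the error messages are assumptions made for the example.

```python
def validate(record):
    """Return a list of error messages; an empty list means the record passed."""
    errors = []
    name, children = record["name"], record["children"]

    # Test for alphabets: a name should contain letters (and spaces) only.
    if not name.replace(" ", "").isalpha():
        errors.append("name must be alphabetic")

    # Test for numbers: digits with at most one decimal point.
    if not children.replace(".", "", 1).isdigit():
        errors.append("children must be a non-negative number")
    # Test for reasonableness: a count of people cannot be a fraction.
    elif not children.isdigit():
        errors.append("children must be a whole number")
    return errors

print(validate({"name": "Jane Doe", "children": "3"}))    # []
print(validate({"name": "Jane Doe", "children": "9.5"}))  # ['children must be a whole number']
print(validate({"name": "J4ne", "children": "three"}))
```

    Records that return a non-empty list would be reported and held back rather than passed on for processing.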

    These checks can be made at 2 stages:

    Input stage: When data is first input to the computer, different checks can be applied to prevent errors going forward for processing.  For this reason, the first computer run is often referred to as Validation or Data vet.

    Updating stage: Further checking is possible during data processing (or when the data input are being processed).

    The program checks the consistency of the input data with existing stored data.  This check is possible during the input run if the stored data is on-line at the time.

    Note.  Validation is an online process (i.e. validation checks are built into the computer programs using the input data, so that incorrect data items are detected and reported).  Since the checks are carried out by the computer, they are far less prone to human error.

    Exercise I.

    Distinguish between Data verification and Validation as used in the context of data collection.

    Reasons for changing from Manual to Mechanical and Electronic Systems.

    The following are the factors, which may necessitate the change from Manual to Mechanical or to Electronic data processing method:

    Operation Speed.

    The timing aspect of information availability (i.e. when the information is required) is very important.

    Electronic & Mechanical systems provide automatic processing of the input data.  This quickens the operations on the input data to produce timely information.

    For example, a Clerk assisted by mechanical or electronic devices takes a shorter time to complete the posting of a transaction.

    Accuracy of the information.

    The use of mechanical or electronic data processing tools makes information more accurate & neat, by removing the use of illegible handwritten entries.

    In addition, verification is made easy; hence wrong data are easily prevented from entering the processing stage.

    Volume of data.

    The data processing method selected should be able to cope with the processing tasks, with respect to the data held.  The volume of data (records) held by an organization depends on the size & the nature of the business.

    Small organizations with low volumes of data require few personnel with little or no data processing aids.

    Large or complex business organizations, with high volumes of data, require the use of sophisticated processing tools, if the information is to be produced on time.


    Data processing that requires repeated operations may be boring & tedious when carried out manually.  In such a case, mechanical or computer machines may be employed to assist in the processing depending on the nature of the business.

    Linked Applications.

    In a situation where there is a common data pool that supports several applications and, e.g., the Manual D.P method is used, different operations may be required to produce different items of information.  However, if the Electronic D.P method is used, the information can be easily produced from the same data.  This is because the computer is versatile, and can operate in any desired manner provided the relevant programs are available.

    Better services to customers.

    As Data processing systems produce information, the recipients of such information should receive it promptly to enable them to take the decisions that control their business operations.

    Using sophisticated processing aids, such as the Computer in Electronic D.P systems, improves the quality of the information produced, e.g. statistical summaries are produced in good time, enquiries are answered in good time, and orders are dispatched promptly.

    Factors that determine the Methods of Data Processing.

    The following are the factors that influence the method of data processing selected:

    1. Size and Type of business.
    2. Timing Aspects of the information produced from the system.
    3. Link between Applications.

    Size and Type of Business.

    Simple or small business organizations require relatively fewer personnel and processing methods that are less complicated.

    In a very small company, a single person can be used to produce all the information required, but as the volume of business increases, more people and tools/aids in the form of Calculators and small Computers may be employed.  Large volumes of data and information will require the use of large computers.

    For example;

    In some companies, the Payroll may involve paying a member of staff the same amount each month, while in others a complex payment system may be involved.

    Similarly, producing an Invoice may be a matter of simply copying from the customer’s order, or it may require complex discount calculations. 

    Simple calculations indicate the need for fewer people and tools to produce the information, while complex situations indicate the need for more people and aids.

    Timing Aspects of the information produced.

    Some applications/jobs require a much shorter time between the origination of the transaction and the production of information (e.g. Hotel bookings), while other business applications may require the information to be made available after a relatively longer period, e.g. in Passport applications, where information is required periodically.

    Some information requirements are less urgent than others.  E.g., the Payroll and Statement of Accounts may only be produced once a month, whereas in certain companies, the Invoices may be produced all the time (i.e. as a customer collects the goods).

    Link between Applications.

    In some applications, the same data items may be used in producing more than one item of information; hence, the most suitable data processing system should be used depending on the circumstances surrounding these information requirements.

    E.g. a particular item sold may be needed to produce the Invoice & to amend the recorded Stock position (i.e. to make adjustment of Stock level, and the Bank account or Cash account).

    Exercise I.

    1. Distinguish between Manual, Mechanical and Electronic systems.
    2. Describe the reasons for changing from Manual to Mechanized or Electronic systems.

    Exercise II.

    1. Write short notes on the following:
      1. Manual systems.
      2. Mechanized systems.
      3. Electronic systems.
    2. By use of a clear table and brief explanation, show the differences between manual, mechanized and electronic systems touching on the following functions: input, process, output, storage and control.  (20 marks).


    A File is a collection of related records (i.e. several records put together) that give a complete set of information about a certain item or a particular business entity.

    Files are important in any business because they provide up-to-date information relating to the entity sets of the business, e.g., the suppliers, employees, customers, etc., of the organization.

    Entities are things whose facts need to be recorded.  Each entity has its attributes (i.e., individual properties), e.g., Employee (which is an entity) has attributes such as; Name, salary, address, etc.

    A file can be stored manually in a file cabinet or electronically in a computer’s secondary storage device such as a Floppy disk or hard disk.

    Advantages of computerized filing system over manual filing systems.

    1. Information takes up less space than the manual filing.
    2. It is much easier to update or modify information.
    3. It offers faster access and retrieval of data.
    4. It enhances data integrity.
    5. It reduces duplication of data, or of the stored records.

    Logical and physical files

    Computer files are classified as either Logical or Physical.

    Logical files.

    A Logical file is a type of file viewed by the user in terms of what data items it contains & what processing operations may be performed on the stored data items.

    Physical files.

    A Physical file is viewed in terms of how the data items found in a file are arranged on the surface of the storage media (e.g., disk, tape), and how the stored data items can be processed.


    A Bit is the smallest item that can be stored in a physical file. 

    A bit can be either a ‘0’ or a ‘1’; these are the two states that define the storage cells of a computer memory & a storage medium.

    Bits combine to form a Byte (which is the unit for measuring computer storage).  A Byte is a collection of several bits (usually 8) that represents a Character.


    A computer file is made up of three elements:

    1. Characters.
    2. Fields.
    3. Records.


    A character is the smallest element in a computer file, and can refer to a letter, number, or symbol that can be entered, stored and output by a computer.

    A character is formed by several bits combined together, depending on the character coding system used, e.g., in a 6-bit character coding system, a character is represented by a combination of 6 bits.

    Characters are normally used to represent data items such as Names, Prices, etc.
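
    As a small illustration of how a character is held as a combination of bits, the sketch below prints the bit pattern of each character in a string.  It assumes the standard character codes that Python reports through ord() and an 8-bit width, rather than the 6-bit system mentioned above; the function name is invented for the example.

```python
def to_bits(text, width=8):
    """Show each character's code as a fixed-width string of bits."""
    return [format(ord(ch), f"0{width}b") for ch in text]

print(to_bits("A1"))  # ['01000001', '00110001']
```

    The letter A has code 65 and the digit 1 has code 49, which give the two bit patterns shown.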


    A field is an item of data or information consisting of one or more characters.

    A Field is made up of a combination of characters, and forms the attribute of a given entity, e.g., in a student’s record, the student’s Admission number is a field.

    There are 2 types of fields;

    • Fixed length fields – these are fields that are made up of the same number of characters.
    • Variable length fields - fields within a record that are made up of different numbers of characters (i.e., fields with different spaces allocated for their characters).


    A record is a collection of related fields, which together form or represents a single entity. 

    In any particular file, there is a separate record for each entity, e.g., in a class score sheet, the details of each individual student in a row such as name, admission number, total marks, and position form a record.

    There are 2 types of records:

    • Fixed length records – records in a file that are made up of the same number of fields.
    • Variable length records – records that are made up of different numbers of fields.  Because different amounts of space are reserved for different records, the records in the file will not all have the same size.

    Note.  Variable length records normally utilize the storage efficiently.  However, processing or updating them in a computer is difficult because the programmer is dealing with unknown quantities.

    On the other hand, fixed length records do not utilize the storage efficiently, but they are easy to process because the programmer is dealing with known character quantities.
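
    The trade-off described in the note can be seen in a small sketch.  It assumes a hypothetical student record layout (a 4-byte admission number, a 20-byte name and a 2-byte mark) that is not taken from these notes, and it uses Python's struct module with an in-memory buffer standing in for a file on disk.

```python
import io
import struct

# Hypothetical layout: 4-byte admission number, 20-byte name, 2-byte marks.
RECORD = struct.Struct("<I20sH")

def write_record(f, adm_no, name, marks):
    # struct pads the name with NUL bytes, so short names waste space.
    f.write(RECORD.pack(adm_no, name.encode("ascii"), marks))

def read_record(f, index):
    # Every record has the same size, so its position is simple arithmetic.
    f.seek(index * RECORD.size)
    adm_no, name, marks = RECORD.unpack(f.read(RECORD.size))
    return adm_no, name.rstrip(b"\x00").decode("ascii"), marks

f = io.BytesIO()  # stands in for a file on disk
write_record(f, 1001, "Ann", 71)
write_record(f, 1002, "Bartholomew", 64)
print(RECORD.size)        # 26
print(read_record(f, 1))  # (1002, 'Bartholomew', 64)
```

    The name "Ann" occupies the same 20 bytes as "Bartholomew" (wasted storage), but any record can be located by multiplying its position by the known record size (easy processing).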


    There are various types of files used to store data needed for processing.  Data processing files are classified according to:

    • Their uses within the overall data processing activities.
    • The kind of data/ information they store.

    The main types of data processing files include:

    1. Master files.
    2. Transaction files.
    3. Reference files.
    4. Sort files.
    5. Backup files.
    6. Scratch files.
    7. History files.
    8. Report files.

    Master files.

    A Master file is the main file that contains relatively permanent records about particular items or entries against which transactions are processed. 

    Master files contain records, which have long-term significance, and are very important for the running of the organization.

    Master files normally contain 2 types of data: Static data and Dynamic data.

    Reference (Static) data:

    Static data is relatively permanent, and contains details which do not change, e.g., Name, Sex, Date of birth, Date of hiring, etc.

    Static data is processed by amending (i.e. making occasional changes to) the existing records, e.g., inserting new records, deleting outdated records, etc.

    Dynamic data:

    Dynamic data is temporary and is likely to change frequently, e.g., Salary, Tax rates, hours worked, Rate of pay, etc.

    Dynamic data is processed by updating (i.e. changing the values of the various fields). 

    The accuracy of data within the operational files is achieved by Updating the Master file (i.e., changing the contents in the master files regularly in order to reflect the current state of affairs).  This involves adding, removing or adjusting the data in the Master file.

    Transaction (Movement) files.

    A Transaction file contains individual data about the transactions (activities) that occurred in a business during a particular period of time. 

    The file contains relatively temporary information such as all incoming or outgoing records resulting from a transaction.

    Transaction files are usually created from the source documents, which contain data from the point of their origin.

    The contents in a Transaction file are used to update the dynamic data on Master files.  For example, in a busy supermarket, daily sales are recorded on a transaction file, and later used to update the stock file.  The file is also used by the management to check on the daily or periodic transactions.

    Transaction files have a short life span.  This is because, once the contents of the file have been used to update the master file, its contents are no longer required, and can be replaced by the next business transactions.
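
    The supermarket example can be sketched as below.  This is a simplified illustration in which the stock master file and the day's transaction file are held in memory; the item codes, quantities and the rule for rejecting a transaction are assumptions made for the example.

```python
# Hypothetical stock master file: item code -> quantity in stock.
master = {"A100": 50, "B200": 20, "C300": 75}

# Hypothetical transaction file for one day: (item code, quantity sold).
transactions = [("A100", 5), ("C300", 10), ("A100", 2), ("Z999", 1)]

def update_master(master, transactions):
    """Apply each sale to the master file; return the sales that could not be applied."""
    rejected = []
    for code, qty_sold in transactions:
        if code in master and master[code] >= qty_sold:
            master[code] -= qty_sold  # update the dynamic data
        else:
            rejected.append((code, qty_sold))
    return rejected

rejected = update_master(master, transactions)
print(master)    # {'A100': 43, 'B200': 20, 'C300': 65}
print(rejected)  # [('Z999', 1)]
```

    Once the update has run, the transaction list is no longer needed for this purpose and can be replaced by the next day's sales.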

    Examples of transaction files:

    Files that contain Earnings & deductions of an Employee, or payments received from customers.

    Reference files.

    A Reference file is used for reference or look-up purposes.

    Lookup information is that information which is stored in a separate file, but is required during processing.  E.g., the item code entered either manually or using a bar-code reader in a point-of-sale terminal is used to look-up the item description & price from a reference file stored on a storage device.

    Reference files contain records that are fairly permanent or semi-permanent such as tax deductions, Wage rates, Customer address, etc, and therefore, they need to be revised occasionally.
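
    The point-of-sale look-up described above might be sketched as follows.  The item codes, descriptions and prices are invented, and the reference file is represented by a dictionary rather than a file on a storage device.

```python
# Hypothetical reference file: item code -> (description, unit price).
price_list = {
    "5012345": ("Milk 500ml", 60.0),
    "5098765": ("Bread 400g", 55.0),
}

def price_sale(scanned):
    """Turn scanned (item code, quantity) pairs into priced receipt lines."""
    lines = []
    for code, qty in scanned:
        description, unit_price = price_list[code]  # the look-up
        lines.append((description, qty, qty * unit_price))
    return lines

print(price_sale([("5012345", 2), ("5098765", 1)]))
# [('Milk 500ml', 2, 120.0), ('Bread 400g', 1, 55.0)]
```

    The sale itself only carries the item code and quantity; the description and price are read from the reference file, which is not changed by the processing.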

    Backup files.

    A Backup file is used to hold duplicate copies (backups) of data or information from the computer’s fixed storage (hard disk).  These files are kept for security purposes.

    This is because the operational files held on the hard disk may be corrupted, lost or changed accidentally, leading to loss or damage of existing information.  It is therefore important to keep copies of the recently updated files so that, in case the original file is corrupted or deleted, the backup file can be used in its place or to reconstruct the original file.

    Note.  The backup file & the operational file should be kept at separate places so that in case of loss or damage, both are not affected.

    Sort files.

    Sort files are created from existing files, such as Master or Transaction files, and are used mainly for sorting data (i.e., they are used to alter the sequence of the existing files).

    A sort file is mainly used where data is to be processed sequentially.  In sequential processing, data or records are first sorted and held on a magnetic tape before updating the master file.

    Report files.

    A Report file contains a set of relatively permanent records extracted from the data in a Master file or generated after processing.

    Report files are used to prepare reports, which can be printed at a later date.

    Example of Report files:

    Report on Overtime, report on Taxes, report on students’ class performance in the term, etc.

    Scratch file.

    A Scratch file is a temporary file used to hold data during processing.  It contains temporary data, which can be erased when the task is finished.

    History (Archive) files.

    History files are usually old files retained for historical use or for reference purposes, e.g., a history file can contain Employee details for the last 10 years.

    Key field.

    A Key field is one or more fields in a record that uniquely identifies the record or a group of records.

    E.g., an Employee’s Serial number may be used to identify the employee records in a Payroll file.

    Note.  Any field in the record can be used as the key field, provided that its values identify the records uniquely.

    Review questions

    1. Define a computer file.
    2. State four advantages of storing data in computer files over the manual filing system.
    3. Differentiate between Logical file structure and physical file structure.
    4. With the help of a figure, illustrate the information system Data hierarchy.
    5. Define the following terms:
      1. Character.
      2. Field.
      3. Record.
      4. Key field.
    6. List 5 types of files used in data processing and their purposes.


    File organization refers to the way records are arranged (laid out) within a particular file.

    The term file organization can also refer to the relationship of the Key of a record to the physical location of that record in the computer file.

    File organization is very important because it determines the method of access, efficiency, flexibility, and storage devices to be used.

    Methods of file organization.

    There are 4 methods by which records of a file can be arranged and accessed.  These include:

    1. Random.
    2. Serial.
    3. Sequential.
    4. Indexed sequential.

    Random (Direct) file organization.

    In Random or direct file organization, the records are stored in the file randomly, and in no particular order.  This implies that, there is no relationship between two adjacent records.

    An Algorithm (mathematical procedure) is applied onto the record key to generate the address of the location where the record would be stored.

    Random files are usually accessed directly.  To access the file, the record key is used to determine where a record is stored on the storage media.  Once the record is located, it is then read into the computer memory.

    This method is used on direct-access storage media such as Magnetic disks and Optical disks.
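
    The idea of applying an algorithm to the record key to obtain a storage address can be sketched as follows.  The sketch assumes division-remainder hashing and a simple "try the next slot" rule when two keys produce the same address; both are common choices but are assumptions made here, as these notes do not name a particular algorithm.  A list of slots stands in for the storage medium.

```python
NUM_SLOTS = 7  # hypothetical number of record positions in the file

def address(key):
    """Division-remainder hashing: one common choice of algorithm."""
    return key % NUM_SLOTS

slots = [None] * NUM_SLOTS

def store(key, data):
    pos = address(key)
    # On a collision, try the next slot, wrapping round to the start.
    for _ in range(NUM_SLOTS):
        if slots[pos] is None or slots[pos][0] == key:
            slots[pos] = (key, data)
            return pos
        pos = (pos + 1) % NUM_SLOTS
    raise RuntimeError("file is full")

def fetch(key):
    pos = address(key)
    for _ in range(NUM_SLOTS):
        if slots[pos] is None:
            return None  # an empty slot means the key was never stored
        if slots[pos][0] == key:
            return slots[pos][1]
        pos = (pos + 1) % NUM_SLOTS
    return None

print(store(1001, "Ann"))  # 1001 % 7 = 0
print(store(1010, "Ben"))  # 1010 % 7 = 2
print(store(1008, "Cat"))  # 1008 % 7 = 0 is taken, so slot 1
print(fetch(1008))         # Cat
```

    Records end up in no particular order, and a record is reached from its key alone without reading the records before it.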

    Advantages of Random file organization.

    1. Records are quickly accessed (i.e. there is fast access to records).
    2. Files are easily updated (i.e. adding, deleting, and amending the records is easily achieved).
    3. The method does not require the use of indexes, hence saving space.
    4. Transactions do not need to be sorted before being updated.
    5. New records can be easily inserted into a random file.

    Disadvantages of Random file organization

    1. Data may be accidentally erased or overwritten unless special precautions are taken.
    2. Random files are less efficient in the use of storage space compared to sequentially organized files.
    3. Expensive hardware and software resources are required.
    4. Relatively complex when programming.
    5. System design based on random file organization is complex and costly.

    Serial file organization.

    In Serial file organization, records in a file are stored one after the other in the order they come into the file without any particular sequence.  The records are not sorted in any way on the storage medium, and there is no relationship that exists between adjacent records.

    This type of organization is mostly used on Magnetic tapes.

    Serial files can be accessed serially.  This involves searching through the entire file record by record starting from the ‘head’ of the file towards the ‘tail’ of the file.

    Note.  Serial access is suitable where all the records in the file are to be read.  This is because even the records that are not required must be passed over before the record of interest is located.  E.g., to access the 10th record in the file, the computer reads the first 9 records before reading the 10th record.
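    The cost of serial access can be sketched as follows.  The function and the sample records are hypothetical; the point is only that reaching the 10th record means reading all ten.

```python
def serial_read(records, position):
    """Return the record at a 1-based position and how many records were read."""
    reads = 0
    for record in records:          # start at the head of the file
        reads += 1
        if reads == position:
            return record, reads
    return None, reads              # reached the tail without finding it


# Records in arrival order, with no sorting (hypothetical sample data).
arrivals = [f"record-{n}" for n in (17, 3, 42, 8, 25, 11, 30, 5, 19, 2)]
record, reads = serial_read(arrivals, 10)
print(record, reads)   # record-2 10
```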

    Sequential file organization.

    In Sequential file organization, the records are also stored within the file one after the other.  However, they are kept in a particular order, sorted on a key field; hence, there is a relationship between adjacent records based on their key fields.

    Sequential files are accessed sequentially, i.e. the key field is used to search for the particular record required.  Searching starts at the beginning of the file and proceeds sequentially towards the ‘tail’ of the file, until the required record is located. 
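    A minimal sketch of a sequential search on the key field is shown below.  The sample payroll records are hypothetical.  The early stop when a larger key is met is a consequence of the sorted order rather than something the notes state.

```python
def sequential_search(records, wanted_key):
    """Scan a key-ordered file from the head; return (record, records_read)."""
    reads = 0
    for key, data in records:
        reads += 1
        if key == wanted_key:
            return (key, data), reads
        if key > wanted_key:        # keys are sorted, so the record is absent
            break
    return None, reads


# Hypothetical payroll records sorted on the key field (employee number).
payroll = [(101, "Achieng"), (104, "Barasa"), (109, "Chebet"), (115, "Doto")]
print(sequential_search(payroll, 109))   # ((109, 'Chebet'), 3)
print(sequential_search(payroll, 105))   # (None, 3)
```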

    Advantages of Sequential organization.

    1. The method is simple & easy to understand.
    2. Sequential files are easy to organize and maintain.
    3. Loading or reading a record requires only the Record Key.
    4. It is efficient & economical if the number of file records to be processed is high.
    5. Relatively inexpensive Input/Output media and devices may be used.
    6. Errors in the files remain localized.

    Disadvantages of Sequential organization.

    1. The entire file must be processed even when the number of file records to be processed is low.
    2. Transactions must be sorted in the sequence of the Master file before they can be processed or updated.
    3. Data redundancy is high since the same data may be stored in several files sequenced on different keys.
    4. Random enquiries are almost impossible to handle.

    Indexed Sequential file organization.

    The records are arranged sequentially as in sequential files.  However, indexed sequential files have an Index that enables the computer to locate individual records on the storage media.

    An Index holds the address of a particular cylinder or track.  The index entries point to the portions of the storage medium where groups of records are stored.  This allows a group of records that is not required in a particular processing run to be bypassed.

    To access a record in an indexed sequential file, the Index and the record’s key field are used by the computer to search for the required record before it is read into the computer memory.
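    The lookup can be sketched as below, assuming an index that holds the highest key stored in each group of records.  The layout of the index, the group size, and the sample data are all assumptions made for illustration.

```python
from bisect import bisect_left

# Hypothetical file: records sorted on the key and stored in groups ("tracks").
tracks = [
    [(101, "Achieng"), (104, "Barasa"), (109, "Chebet")],
    [(115, "Doto"), (120, "Etyang"), (127, "Faraji")],
    [(133, "Gathoni"), (140, "Hamisi"), (152, "Imani")],
]
# Assumed index layout: the highest key stored on each track.
index = [track[-1][0] for track in tracks]   # [109, 127, 152]


def indexed_lookup(wanted_key):
    """Use the index to pick a track, then search only that track."""
    track_no = bisect_left(index, wanted_key)
    if track_no == len(tracks):
        return None, 0                       # key is beyond the last track
    reads = 0
    for key, data in tracks[track_no]:
        reads += 1
        if key == wanted_key:
            return data, reads
    return None, reads


print(indexed_lookup(140))   # ('Hamisi', 2)
```

    Only two records are read to reach key 140, because the index allows the first two groups to be bypassed.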

    Methods of accessing Indexed sequential files.

    Indexed sequential files may be accessed using 3 methods:

    1. Sequential access.
    2. Selective sequential access.
    3. Random access.

    Sequential access:

    In sequential access, the computer reads the records in sequential order (i.e., one record after the other) using the index until the record matching the search key is found.  The record is then read into the Main memory.

    Sequential access is suitable for high activity files.

    Selective Sequential access:

    In selective sequential access, the transaction file must first be sorted into the same key sequence as the master file.  The access mechanism then goes forward in an ordered progression (sequence), and only those records needed are read/processed.

    The method is suitable for low activity files.

    Random (direct) access:

    In random access, the records are not processed in key sequence.  The index is used to locate each record directly, so the records can be processed in any order, i.e., by moving the access mechanism forwards and backwards along the file in a non-orderly manner to reach the records required. 

    The method is suitable for low activity files.

    Advantages of Indexed sequential file organization

    1. Records can be accessed sequentially or randomly.
    2. Accessing of records can be fast, if done randomly.
    3. Records are not duplicated.

    Disadvantages of Indexed sequential file organization.

    1. Accessing of records sequentially is time consuming.
    2. Processing of records sequentially may introduce redundancy/idleness.
    3. It requires an expensive storage medium.

    File organization & access on a Magnetic Tape.

    In a Magnetic tape, the file records are placed one after the other onto the tape.

    There are 2 ways in which files are arranged on tapes:


    Serial:

    In serial organization, the records are written onto the tape without having any relationship between the record keys.

    • Serial files on a tape are accessed serially, i.e., each record is read from the tape into main storage one after the other in the order they occur on the tape.


    Sequential:

    In Sequential organization, the records are written onto tape in sequence according to the record keys.  Sequential files are accessed sequentially.


    To process a sequential Master file on a tape, the transaction file must be in the sequence of the Master file.  A transaction record is read first, and the Master file is then read until the matching record is found.  E.g., if the record required is the 20th record of the file, the computer must first read all the 19 preceding records.
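    A minimal sketch of such an update run is given below.  It assumes that both files are sorted on the same key and that each transaction simply adds an amount to a balance; the function name and the stock figures are hypothetical.

```python
def update_master(master, transactions):
    """Apply key-ordered transactions to a key-ordered master file in one pass."""
    updated = []
    pending = iter(transactions)
    txn = next(pending, None)
    for key, balance in master:              # master is read from head to tail
        while txn is not None and txn[0] == key:
            balance += txn[1]                # matching record found: apply it
            txn = next(pending, None)
        updated.append((key, balance))       # written to the new master file
    if txn is not None:
        raise ValueError(f"no master record for transaction key {txn[0]}")
    return updated


# Hypothetical stock file and transactions, both sorted on item number.
master = [(1, 50), (2, 20), (3, 75), (4, 10)]
transactions = [(2, -5), (2, 30), (4, 15)]
print(update_master(master, transactions))
# [(1, 50), (2, 45), (3, 75), (4, 25)]
```

    Every master record is read, including items 1 and 3 that have no transactions, which is why this approach suits runs in which most records are affected.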

    File organization & access on a Magnetic Disk.

    There are 4 basic methods of organizing files on a Magnetic disk:


    Serial:

    The records are placed onto the disk one after the other with no regard for sequence.

    Serial files on a disk are accessed Serially, i.e. each record is read from the disk into main storage one after the other in the order they occur on the disk.


    Sequential:

    In sequential organization, the records are written onto the disk but in a defined sequence according to the record keys.

    The Sequential method of access is used to read a sequential disk file.


    Random:

    In random organization, the records are placed onto the disk “randomly” (i.e., there is no obvious relationship between the records).

    A mathematical formula is applied to the record key to generate the address of the location where the record is placed on the disk.  During processing, the same record key is used to generate the address of the location from which the record is read.

    The method of access to random files is Random (direct) access.

    Indexed Sequential:

    In Indexed Sequential organization, the records are stored in sequence, but an Index (key field/guide) is provided to enable individual records to be located.  In this case, the index will always enable the sequence of the records to be determined.

    Indexed sequential files can be accessed using sequential access, selective sequential access, or random access method.

    Factors to consider when choosing the type of file organization to use.

    Frequency of update.

    The file designer should determine how often the file is going to require updating.

    For periodic updates (e.g., a monthly update), the transactions are used to update the master files in one run.  For non-periodic systems, the transactions may be applied at any time as required.

    The file design selected should therefore be able to meet the update strategies, and at the required time.

    File activity.

    The type of file organization adopted should be based on the expected number of records to be processed/accessed in a particular run.

    Method of file access.

    This refers to the method the computer shall use to transfer the contents of the file from the storage media into the computer.

    Nature of the system.

    Before designing the file(s) to be maintained by a computer system, you have to consider whether the system runs periodically or is an event-driven system.

    In periodically run systems, all transactions relating to particular business are accumulated over a period of time, after which they are applied to the relevant master files in a single run.  Such systems produce periodic reports from the maintained files.

    On the other hand, event-driven systems allow file enquiries and instant updates of the maintained master files as soon as the transactions are available, so that up-to-date information can be produced instantly.

    Medium for storing the Master file.

    Computer files are stored in the storage media.  The type of file organization adopted depends on the medium that will be used to store the computer file.

    E.g., Serial access devices, such as Magnetic Tapes, cannot be used to store Random files or Indexed-sequential files.  This is because searching for the particular record required proceeds serially regardless of the file organization method used.

    Review Questions.

    1. What do you mean by File Organization?
    2. State and explain four types of file organization.
    3. Distinguish between:
      1. Sequential and serial file organization methods.
      2. Random and indexed-sequential file organization methods.
    4. (a). Describe how files are organized and accessed on tape.

    (b). What are the disadvantages of storing files on tape?

    5. Differentiate between Sequential and Indexed Sequential methods of file organization on disk.
    6. (a). What is random file organization?  State its advantages.

    (b). How are Random files accessed on disk?

    7. Identify four file processing methods.
    8. Discuss four considerations for choosing a file organization method.


    Data processing modes describe the ways in which a computer, under the influence of an operating system, is designed to handle data or transactions during processing.

    Types of electronic data processing modes

    • Batch processing (also referred to as Sequential or Offline processing)
    • Online processing
    • Real-time processing
    • Time-sharing
    • Multi-programming (also referred to as Multi-tasking)
    • Multi-processing.
    • Distributed processing
    • Interactive processing

    Review questions

    1. Define the term “Data processing modes”.
    2. Mention five types of electronic data processing modes.

    Batch processing

    In batch processing, data or transactions are collected & accumulated together over a specified period of time, e.g., daily, weekly, or monthly.  The data is then input & processed at once (or as a single unit) to produce a batch of output.

    For example:

    In a payroll processing system, details of employees such as number of hours worked, rate of pay, may be collected for a period of 1 month, after which they are used to process the payment for the duration worked.
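    The payroll example can be sketched as follows.  The employee names, hours, and rates are hypothetical, and pay is assumed to be simply hours multiplied by rate.

```python
# Hypothetical attendance records accumulated over one month:
# (employee name, hours worked, hourly rate).
batch = []


def collect(name, hours, rate):
    """Accumulate a transaction; nothing is processed at this point."""
    batch.append((name, hours, rate))


def run_payroll(transactions):
    """Process the whole accumulated batch in a single run."""
    return {name: hours * rate for name, hours, rate in transactions}


collect("Achieng", 160, 5.0)
collect("Barasa", 152, 6.5)
collect("Chebet", 170, 4.0)

payslips = run_payroll(batch)    # end-of-month run
print(payslips)
# {'Achieng': 800.0, 'Barasa': 988.0, 'Chebet': 680.0}
```

    No pay figure exists until run_payroll() is called, which mirrors the delay between collecting the data and obtaining the results.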

    Data collection is usually done off-line (i.e. away from the CPU) on special machines known as Data entry terminals.  The data is entered & stored on a disk in a batch queue for a while.  It is then input & processed one or more at a time under the control of the Batch operating system, and the result obtained.

    Batches of transactions are scheduled for processing by assigning them priorities.  The priorities are assigned in terms of percentage ratio, e.g. 95%, 60%, etc.  The highest-priority jobs are processed first, while lower-priority jobs are processed once the computer resources (i.e., CPU time, Memory & I/O devices) are released by the higher-priority jobs. 
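    Priority scheduling of this kind can be sketched with a priority queue.  The job names and percentages are hypothetical, and running jobs of equal priority in order of submission is an assumption, not something the notes specify.

```python
import heapq


def schedule(jobs):
    """Return job names in the order they would run, highest priority first."""
    # heapq is a min-heap, so priorities are negated; the submission order
    # breaks ties between jobs of equal priority.
    heap = [(-priority, order, name) for order, (name, priority) in enumerate(jobs)]
    heapq.heapify(heap)
    run_order = []
    while heap:
        _, _, name = heapq.heappop(heap)
        run_order.append(name)
    return run_order


# Hypothetical batch queue: (job name, priority as a percentage).
queue = [("stock report", 60), ("payroll", 95), ("archive", 20), ("invoices", 60)]
print(schedule(queue))
# ['payroll', 'stock report', 'invoices', 'archive']
```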

    Once the processing of a given batch starts, there is no interaction between the operator & the CPU.  Therefore, the user cannot intervene to perform amendments to the program. 

    A job is not processed until it is fully input.  In addition, a program must wait its turn before processing the data.  This means that there will be a delay in obtaining results.  For instance, a job may wait in the batch queue for minutes or hours depending on the workload.  Hence, Batch processing cannot be used when the results are needed immediately.

    Characteristics/ Features of Batch processing system.

    • The input device does not necessarily need to be connected to the computer.
    • If the device used for data entry is not connected to the computer, it is said to be Off-Line (away from the computer).
    • The data is not immediately input into the computer, and it is not even immediately recorded in a machine-readable form.
    • The speed of processing is not important. This implies that, processing of the data is done at whatever time is most convenient.

    Application areas for Batch processing systems.

    Batch processing is commonly applied in:

    Payroll systems. 

    The attendance data of each employee is collected regularly.  It is then input weekly or monthly as per the demands of the system and processed, and the pay figures for each employee are obtained.

    Printing systems (to print documents)

    Advantages of Batch processing.

    1. Batch systems are easy to develop.
    2. Processing of data in batches is efficient & economical.
    3. The cost of processing per unit is low.
    4. Batching provides manageable units for control purposes.
    5. Timing of the information (reports) is not a necessity.

    Disadvantages of Batch processing.

    1. There are delays in obtaining information.
    2. It leads to overloading of the processing facilities.
    3. Late information is not suitable in situations where instant decisions are required.
    4. It is difficult to schedule jobs with the desired priority.

    Review questions

    1. Briefly explain Batch processing.
    2. Describe the application, advantages and disadvantages of batch processing.

    Online processing

    In online processing, data or the input transactions are processed as soon as they are received to produce the information required.

    Online processing occurs when the transactions are processed to update (or make any change in) a computer file immediately after the transactions occur.
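    By contrast with a batch run, an online update can be sketched as below: each transaction changes the file the moment it arrives.  The account numbers and amounts are hypothetical, and a dictionary stands in for the master file.

```python
# Hypothetical master file: account number -> balance.
accounts = {"A-100": 250.0, "A-200": 40.0}


def process_online(account, amount):
    """Apply one transaction to the master file as soon as it arrives."""
    if account not in accounts:
        raise KeyError(f"unknown account {account}")
    accounts[account] += amount
    return accounts[account]          # the result is available at once


print(process_online("A-100", -50.0))   # 200.0
print(process_online("A-200", 10.0))    # 50.0
print(accounts["A-100"])                # 200.0, an enquiry sees the update
```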

    In online processing, all the Input/Output facilities and communication equipment are under the direct control of the central Processor. 

    In online processing, the operator communicates directly with the computer’s operating system using commands, which are then interpreted by the supervisor.  This means that the operator can interact with the system at any point of processing using the Input/Output facilities.

    Note.  In online processing, the data input units (terminals) are connected directly to the central computer using communication links. 

    In such a configuration, the data (input transactions) are communicated from the workstations to the central computer for processing, & the results communicated back to the workstations through the telecommunication links.

    Characteristics of Online processing system.

    • The input device is connected directly to the computer.
    • The input data is processed immediately.  Processing is completed within a short time (usually 1 or 2 minutes), depending on the speed of the system.

    Application areas for online processing systems.


    Banking:

    A bank customer can make an inquiry using an online terminal.  The system would then respond immediately by accessing the relevant file, and inform the customer on the status of his/her account.

    Stock exchanges:

    Terminals located in major stock exchanges throughout the country enable quick processing of share dealings.

    Stock control:

    Terminals located in warehouses enable stock to be re-ordered automatically, reservations to be made, outstanding orders to be followed up, and picking lists to be printed.

    Manufacturing plants: - to control the progress of work.

    Inventory status: - i.e., ordering & reporting of geographically dispersed distributors.

    Advantages of online processing.

    1. Files are held online; therefore the information generated can be used to update the master files directly.
    2. The Information is readily available for immediate decision-making.
    3. File enquiries are possible at any given time through the terminals (workstations).

    Disadvantages of online processing.

    1. Online systems are complex to develop.
    2. They are costly in terms of hardware, software, storage media, operating system, communication facilities, etc.

    Review questions

    (a). Discuss Online processing. 

    (b). Mention and explain the Application, Advantages and disadvantages of Online processing mode.

    Real-time systems.

    A Real-time system is capable of processing data so quickly such that the results (output) produced are able to influence, control, or affect the outcome of the activity or process currently taking