Files

In nearly all computer systems there are objects called files. A file is a collection of information which is treated as a single entity. Files are stored in a form which continues to exist even if the computer is turned off, provided the file has been properly closed before the power is turned off or the media containing it (e.g. disk or tape) is removed. It is therefore important to leave a program in the normal way. If you abort a program abnormally or reset the computer, you may lose the information in any unclosed output files.

Files on magnetic media are subject to degradation: the media can physically wear out with usage, be damaged by accident, be damaged by a failure of the drive itself (a so-called head crash), or simply be degraded by heat. All magnetic material is degraded by sufficient heat. Because of this, it is absolutely essential to make back-ups, that is, duplicates of important and valuable files on another physical medium, periodically, and more frequently when the file contents change often. The importance of making back-ups cannot be stated often enough.

Files can be computer programs, reports, news, data or any other form of information which is meaningful for the computer to use or manipulate. Computer program files contain the set of instructions which tell the computer how to do a particular task. You will use such program files to create, maintain and operate on other files. These other files can be programs, documents or data. They can be relatively unstructured, or they can have an internal structure which is recognized by one or more programs. For example, you may have files which are lists of names and addresses, or an inventory list, or a collection of notes being used in the preparation of a book, or accounting information, or anything else.

Computer files can be either sequential or direct access. Direct access files can be read or written at any point in the file, usually at boundaries of fixed-size units called records. Sequential files are intended to be read or written always starting from the beginning and proceeding to the end. The operating system keeps track of where the next byte of information is to go to or come from when writing or reading. It is beyond the scope of this section to discuss direct access files any further, since the details are quite technical (see libraries).
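
The difference between the two kinds of access can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record length of 16 bytes, the file name names.dat and the function names are all invented for the example, and records longer than 16 characters are not handled.

```python
import os
import tempfile

RECORD_SIZE = 16  # assumed fixed record length in bytes (hypothetical)


def write_records(path, records):
    """Write the records one after another, each padded to RECORD_SIZE bytes."""
    with open(path, "wb") as f:
        for text in records:
            f.write(text.encode("ascii").ljust(RECORD_SIZE, b" "))


def read_record(path, index):
    """Direct access: jump straight to record number `index` and read it."""
    with open(path, "rb") as f:
        f.seek(index * RECORD_SIZE)
        return f.read(RECORD_SIZE).decode("ascii").rstrip()


def read_all(path):
    """Sequential access: read every record from the beginning to the end."""
    result = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(RECORD_SIZE)
            if not chunk:  # an empty read means the end of the file
                break
            result.append(chunk.decode("ascii").rstrip())
    return result


def demo():
    path = os.path.join(tempfile.mkdtemp(), "names.dat")
    write_records(path, ["Alice", "Bob", "Carol"])
    return read_record(path, 2), read_all(path)
```

Here read_record goes directly to the third record without reading the first two, while read_all relies on the operating system keeping track of the current position after each read.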

When writing to a sequential file, the last thing written determines the end, and hence the size, of the file. If you are already at the end of a file when a write is done, then the file is extended by that amount of information. If you are in the middle or at the beginning of a sequential file when you write, then the length of the file will be set to the point after that last write. Thus if you write at the beginning or in the middle of a long sequential file, you will lose all the information after that point. Normally this is not a problem, because it is the intended result: you are replacing the file contents with new data or information. But if for some reason the power goes off before the last write has been completed and the file has been closed (that is, before the operating system has been notified that you are finished), then all of the file may be lost.
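
The effect of a write on the file length can be tried out with a small Python sketch. The file name notes.txt and the function demo are invented for the example. Note one assumption: on many current systems a write in the middle of a file does not by itself shorten the file, so the sketch calls truncate() explicitly to model the behaviour described above.

```python
import os
import tempfile


def demo():
    path = os.path.join(tempfile.mkdtemp(), "notes.txt")  # hypothetical file

    # Create the file with ten bytes.
    with open(path, "wb") as f:
        f.write(b"0123456789")
    sizes = [os.path.getsize(path)]

    # Append mode starts at the end, so the write extends the file.
    with open(path, "ab") as f:
        f.write(b"ABCDE")
    sizes.append(os.path.getsize(path))

    # Write in the middle, then set the length to the point after that write.
    with open(path, "r+b") as f:
        f.seek(4)
        f.write(b"xy")
        f.truncate()  # explicit here; models the sequential-file rule above
    sizes.append(os.path.getsize(path))

    with open(path, "rb") as f:
        return sizes, f.read()
```

The sizes recorded are 10, 15 and 6 bytes, and the final contents are the first four original bytes followed by the two bytes written last. Everything after that point is gone.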

In such a case, when the power goes off while a file is being written, it is likely that some or all of the information will be lost because the file length has not been properly set. But if you or the program have already saved the information, and further changes are then made before the power goes off, you will usually lose only the information added since the last save. It is for this reason that frequent saves are strongly recommended while editing a file. A save usually takes only a few seconds or less, and can be done while pausing to think, or while rearranging or turning pages in your notes.
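
What a save protects can be illustrated with a minimal Python sketch. The function names and the file name draft.txt are invented for the example, and the power failure is only simulated by reading back what is on disk before the second change has been saved.

```python
import os
import tempfile


def save(path, text):
    """Write the whole text and close the file so its length is recorded."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
        f.flush()             # hand Python's buffered data to the operating system
        os.fsync(f.fileno())  # ask the operating system to write it to the device


def demo():
    path = os.path.join(tempfile.mkdtemp(), "draft.txt")  # hypothetical file
    text = "first paragraph\n"
    save(path, text)              # this version is now saved
    text += "second paragraph\n"  # changed in memory only, not yet saved

    # Simulated power loss: only what was saved can be read back.
    with open(path, encoding="utf-8") as f:
        return f.read()
```

Only the first paragraph comes back. The second was never saved, so it is the only information lost.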

Every file has a:

  Name
  Size (usually in bytes)
  Physical location on some physical storage media
  Logical location in some directory

In Windows and DOS, upper and lower case letters in the name of a file are NOT distinguished, unlike in Unix or Linux, where case makes a difference. Additional information is also stored with each file: the date (or dates) associated with the file, plus operating-system-dependent flags such as "read-only", which are used for protection and other purposes.
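
Most of these properties can be inspected from a program. The following Python sketch is an illustration only: the function describe, the dictionary keys and the file name inventory.txt are invented for the example. The physical location on the media is managed by the operating system and is not shown.

```python
import os
import tempfile
from pathlib import Path


def describe(path):
    """Collect the name, size, directory, date and write protection of a file."""
    p = Path(path).resolve()
    info = p.stat()
    return {
        "name": p.name,
        "size_bytes": info.st_size,
        "directory": str(p.parent),              # logical location
        "modified": info.st_mtime,               # seconds since the epoch
        "read_only": not os.access(p, os.W_OK),  # True if we may not write to it
    }


def demo():
    path = Path(tempfile.mkdtemp()) / "inventory.txt"  # hypothetical file
    path.write_bytes(b"12 widgets\n")
    return describe(path)
```

For the file created in demo, the name is inventory.txt and the size is 11 bytes. The other values depend on the system on which the sketch is run.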

All Input/Output devices (other than disks) also look like files. This means that your program can open them and read from or write to them, as appropriate. For example, if a program opens for writing the file whose name is the name of the printer, then any data the program writes to that file will automatically go to the printer queue when the file is closed. If the program opens the console file for reading, then it will read the keystrokes that are typed.
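
The idea can be sketched in Python as follows. Since printer names differ from system to system, the sketch assumes the null device as a stand-in: Python's os.devnull holds its name, and anything written to it is simply discarded. The function write_report is invented for the example.

```python
import os


def write_report(target_name, lines):
    """Open `target_name` like any other file and write the lines to it."""
    with open(target_name, "w") as out:
        for line in lines:
            out.write(line + "\n")
    return len(lines)  # number of lines written


def demo():
    # os.devnull names the null device; it stands in for a printer here.
    return write_report(os.devnull, ["Inventory report", "12 widgets"])
```

The same write_report function works unchanged whether target_name is an ordinary disk file or the name of a device, which is the point of treating devices as files.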


© 1991-2008 Prem Sobel. All Rights Reserved.