What is a 'C' program? A 'C' program is one or more functions plus optional data. In particular, a 'C' program always begins execution with the function called:
main()
no matter where the function 'main()' is in the source file or files. A program ends normally when either a return from the function "main()" is made or a call to the library function "exit()" is made.
A function is indicated lexically by a name followed by a pair of parentheses, which may or may not contain parameters passed to that function. These parameters are sometimes also called arguments. The function has a return data type, and each of its parameters has a data type.
As was mentioned previously, these functions or data may be distributed in different files, usually called MODULES. A separately compilable file is called a Module when it has some cohesive unity. That is, the collection of data objects and functions performs some related task. A function must be entirely contained in one source file. If a function becomes too big (for whatever reason) it can be broken up into two or more smaller functions which can be contained in one or more source files, with a function to call each of the (sub)functions in turn.
An example of a module is one which contains data with the:
name, addresses and phone number
of people or organizations, plus a collection of functions to manipulate these names, addresses and phone numbers.
Such functions could:
add to the list, remove from the list, change a phone number or address, supply the phone number given the name, supply the address given the phone number, sort by name, sort by phone number, etc.
By working in this way with modules, a program is more structured. This is because the rest of the program does not have to know how the information is organized. This allows the organization of the data to change and only that one single module needs to be modified.
Such a module can also be used in another program without any change because it too knows nothing about the rest of the program.
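A minimal sketch of such a module follows, assuming a fixed-size table kept in a single source file. The names phonebook_add, phonebook_lookup, MAX_ENTRIES and MAX_TEXT are hypothetical and chosen only for illustration. The keyword "static" keeps the table invisible to other source files, so only these functions depend on how the data is organized.

```c
#include <string.h>

#define MAX_ENTRIES 100   /* hypothetical capacity of the table */
#define MAX_TEXT    32    /* hypothetical width of each text field */

/* One entry of the table; known only inside this module. */
struct entry {
    char name[MAX_TEXT];
    char phone[MAX_TEXT];
};

/* "static" hides the table and its count from all other source files. */
static struct entry table[MAX_ENTRIES];
static int count = 0;

/* Add to the list. Returns 0 on success, or -1 if the table is full
   or either string does not fit in its field. */
int phonebook_add(const char *name, const char *phone)
{
    if (count >= MAX_ENTRIES)
        return -1;
    if (strlen(name) >= MAX_TEXT || strlen(phone) >= MAX_TEXT)
        return -1;
    strcpy(table[count].name, name);
    strcpy(table[count].phone, phone);
    count++;
    return 0;
}

/* Supply the phone number given the name. Returns NULL if not found. */
const char *phonebook_lookup(const char *name)
{
    int i;

    for (i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0)
            return table[i].phone;
    }
    return NULL;
}
```

A caller only ever writes something like phonebook_add("Ann", "555-0100") followed by phonebook_lookup("Ann"). If the array were later replaced by some other structure, such calls would not need to change.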
Square root is an example of a function most people are familiar with. It receives one parameter and returns the square root of its value. A function may have a side effect, i.e. it may cause some change of state in the program or computer, such as changing the value of one or more global variables, or doing Input or Output.
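The difference between a function with and without a side effect can be sketched as below. The names square, counted_square and call_count are hypothetical, used only for this illustration.

```c
/* Global variable: visible to, and changeable by, every function. */
int call_count = 0;

/* No side effect: the result depends only on the parameter. */
double square(double x)
{
    return x * x;
}

/* Side effect: besides returning a value, this function also
   changes the global variable call_count. */
double counted_square(double x)
{
    call_count++;
    return x * x;
}
```

Both functions return the same value for the same parameter, but every call to counted_square() also leaves the program in a slightly different state.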
There are three ways to refer to a function: a function declaration, a function prototype, and a function body.
A function declaration is where the function is named and its return type is specified. For example:
double sqrt();
A function prototype is the same as a function declaration, except that the types of the parameters passed to it are also declared:
double sqrt(double);
A function declaration or prototype tells the compiler to allow the programmer to call or use the function in a statement in a function, and gives the compiler information so it can verify that data type consistency is maintained.
Once the compiler has seen a function declaration it knows what type of data it will return (if any). If the compiler has seen a function prototype, it also knows the type of the parameters to be passed to it. This allows the compiler to check that the function has been used correctly. An example of the use of the sqrt() function is:
x=7.9; y=sqrt(x);
See the pages about data types for more on this subject. Another example is with the function main() which must exist in every 'C' program:
int main();                      /* function declaration */
int main(int argc,char *argv[]); /* function prototype */
If a function is not used consistently with its declaration or prototype then a compiler error or warning message will result, depending upon the severity of the difference in types. When the compiler gives a warning with a function type error, it means that it has seen the discrepancy in types, notifies you, but will automatically make a type conversion for you. It is strongly recommended that such implicit type conversions be made explicit.
This can be done with what is called a "cast" or by using the appropriate type for variables and expressions. It is possible that a type conversion will give the wrong result because a data item of a large size has been squeezed into a smaller size, with the inherent loss of significant bits.
The function body is where the actual implementation of the function is given in full detail. This includes all of the statements and the local data private to the function.
The body of a function is preceded by a full prototype, but without the trailing semicolon. This is followed by a pair of left and right curly braces which enclose the optional data declarations and then the optional statements. Thus the simplest function does nothing except to return to its caller:
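As a small sketch of making conversions explicit, the two functions below each use a cast where information is deliberately discarded. The names whole_part and low_byte are hypothetical, and the second function assumes the usual 8-bit byte.

```c
/* Prototype: tells the compiler the parameter is a double
   and the result is an int. */
int whole_part(double value);

int whole_part(double value)
{
    /* Explicit cast: the fractional part is deliberately discarded. */
    return (int)value;
}

/* Squeezes a larger unsigned value into a single byte. */
unsigned char low_byte(unsigned int value)
{
    /* Explicit cast: only the low-order bits survive. */
    return (unsigned char)value;
}
```

With the value used earlier, whole_part(7.9) yields 7. On a machine with 8-bit bytes, low_byte(300) yields 44, because the ninth bit has nowhere to go. The casts do not prevent the loss; they show the reader that the conversion is intended.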
void do_nothing(void) { }
Such a null function has several important uses. Sometimes a function must be called under every circumstance, but no action is to take place in certain cases. A null, or empty, do-nothing function then serves as a place holder and fulfills the structure of the program.
Another use for a null function, i.e. one with an empty body, is because you have not as yet had time to write it. Nevertheless, you wish to test the rest of the program, but cannot compile without getting an error because there is a missing function which is called, perhaps in many places.
To solve this problem, you create a place holder function with the proper expected parameters and types, but which as yet does nothing. Later, when the rest of the program has been tested, you can come back and fill in the body. Or perhaps the null body can be initially filled in with a statement which will print a message saying that it safely got to that function.
#include <stdio.h>

void to_be_filled_in(int x)
{
    /* --- this function is not complete --- */
    printf("to_be_filled_in: %d\n", x);
}
Such functions, and the messages they print, then serve as a reminder that the statements for them still need to be written. Such a function is called a "stub".
There are many reasons for the instructions of a program to be organized into a set of functions. The first is that functions correspond to a deep and essential mathematical view of things.
Another important reason is that the process, actions, and effects embodied by the function may be used in more than one place in a particular program. A function allows that code to be written once and used as many times as needed without duplicating the instructions. Just as the square root function, sqrt(), can be used in many places in one program without repeating all of the many instructions needed to calculate the square root.
A third reason is that a function can not only call other functions, but can call itself. A function that calls itself (either directly or indirectly, by calling a function which calls it) is termed recursive, and the technique is called recursion. Recursion is a very elegant way to implement certain algorithms.
An example of recursion is the factorial calculation:
int factorial(int n)
{
    if (n <= 1)
        return 1;   /* base case; also stops the recursion when n is 0 or negative */
    return n * factorial(n - 1);
}
Another reason for using functions is to make it easier to understand, implement and maintain a large program. Even if a function is called only once, having a short way to refer to it (especially with a name evocative of what it does) makes understanding and modifying a program much easier.
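The factorial function calls itself directly. Indirect recursion, where a function calls another function which in turn calls it, can be sketched with the pair below. The names is_even and is_odd are hypothetical, and this is only an illustration of the calling pattern, not an efficient way to test parity.

```c
/* Prototype needed here: is_even() calls is_odd() before the
   compiler has seen the body of is_odd(). */
int is_odd(unsigned int n);

/* Returns 1 if n is even, 0 otherwise. */
int is_even(unsigned int n)
{
    if (n == 0)
        return 1;
    return is_odd(n - 1);
}

/* Returns 1 if n is odd, 0 otherwise. */
int is_odd(unsigned int n)
{
    if (n == 0)
        return 0;
    return is_even(n - 1);
}
```

Each call reduces n by one until it reaches zero, so the chain of calls always ends.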
Having separate functions, which can be defined clearly on paper, allows more than one programmer to work on the same program. Each programmer is assigned different functions that they are responsible for. Each function can be written and tested by itself. Ideally such functions are grouped into modules which perform related actions.
A function can also be used to help 'hide' data structures, so that the rest of the program does not depend on a particular way the data is organized. If that data structure is changed, then the rest of the program does not have to change. Only the functions which are the sole operators on that data structure need change. An example of a set of functions to add, access and delete entries from a table has already been given above. They need be the only ones to know how that table is organized, how big it is, etc.
When the table is short, a simple array may be adequate. But when the table gets large, it may be necessary to use a tree, or hash table, or other structure for speed.
Lastly, a function or collection of functions can be re-used in other programs. This can save considerable time when writing new programs.
How is a function actually used in the computer? First let us restate what a function is. A function is a collection of instructions to perform some action or calculation, and may or may not return a value.
A function has a location and occupies some memory space. From the function prototype we know that every function has a return type, even if it is the type "void" (i.e. no data is returned).
All functions have some number of parameters, possibly zero. Each parameter has a type. Before a function is called, the instances of the parameters (called actual parameters) must be passed to the function. In the traditional scheme described here, this is done on the stack, although many compilers pass some parameters in registers instead. The parameters are always placed on the stack in a fixed order, i.e. left to right or right to left, but which direction is a design decision of the compiler and operating system. We do not need to know, because in 'C' we never directly manipulate the stack or the stack pointer.
Next the return address, i.e. the address of the instruction after the call to the function, is placed on the stack (by the compiler) so that the function will know where to return to when it is done.
Then, when the function is called, it will put any local, or "auto" variables on the top of the stack too. Thus if the function is called recursively, each concurrently active copy will have its own private copy of the (non "static") local variables.
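The difference between "auto" and "static" local variables under recursion can be sketched as follows. The function sum_to and its parameter total_calls are hypothetical, introduced only so that the shared counter can be observed from outside.

```c
/* Returns n + (n-1) + ... + 1. The running number of calls made so
   far in the whole program is stored through total_calls. */
int sum_to(int n, int *total_calls)
{
    static int calls = 0;  /* one copy, shared by every call and kept between calls */
    int mine = n;          /* "auto": each active call has its own private copy */

    calls++;
    *total_calls = calls;
    if (n <= 0)
        return 0;

    /* The deeper calls set their own 'mine'; this one is untouched. */
    return sum_to(n - 1, total_calls) + mine;
}
```

A call such as sum_to(3, &c) makes four nested calls, each with its own copy of 'mine', while all of them increment the same 'calls'.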
Lastly, if the return data type of the function is anything but void, then the returned value too must be placed somewhere. This may be on the stack too, but that is not necessary. It may be in a register, or main memory. But in any case the software convention of where it is placed is known to both the caller and the callee functions and is taken care of automatically by the compiler.
Before the function does return, it must first pop any local variables off the stack. Either the called function or the calling function must clear the parameters from the stack. The compiler takes care of this too.