Modular programming in C
Last updated: 2010-07-21
This paper explains how C programs can be structured by modules.
What modules are
Header file
Header file: Constant declarations
Header file: Type declarations
Header file: Global variables
Header file: Function prototypes
Code file
Main program
Makefile
Modules dependent on other modules
Final suggestions
Modularization is a method for organizing large programs into smaller parts, the modules. Every module has a well-defined interface toward client modules that specifies how the "services" it provides are made available. Moreover, every module has an implementation part that hides the code and any other private detail that client modules should not care about.
Figure: Layout of the source tree. Dotted boxes are files generated by the compiler, while arrows indicate the files involved in their generation.
Modularization has several benefits, especially on large and complex programs:
Programming by modules in the C language means splitting the source code of
every module into a header file module1.h and a corresponding code file
module1.c. The header contains only the declarations of constants,
types, global variables and function prototypes that client programs are allowed
to see and to use. Every other private item internal to the module must stay
inside the code file. We will now describe the header and the code file in
detail.
Every header file should start with a brief description of its purpose,
author, copyright information, version and how to check for further updates.
All this information is given simply as C comments. The actual C declarations
must be enclosed between C preprocessor directives that prevent the same
declarations from being parsed twice in the same compilation run. Here is the
skeleton of our module1.h header file:
/*
module1.h -- Skeleton example of a C module
Copyright 2008 by icosaedro.it di Umberto Salsi
License: as you wish
Author: Umberto Salsi <salsi@icosaedro.it>
Version: 2008-04-23
Updates: www.icosaedro.it/c-modules.html
*/
#ifndef _MODULE1_H_
#define _MODULE1_H_
/* Include headers required by following declarations: */
#include <stdlib.h>
#include <math.h>
#include "some_other.h"
/* Set EXTERN macro: */
#ifdef MODULE1_IMPORT
#define EXTERN
#else
#define EXTERN extern
#endif
/* Constants declarations here. */
/* Types declarations here. */
/* Global variables declarations here. */
/* Function prototypes here. */
#undef MODULE1_IMPORT
#undef EXTERN
#endif
The meaning of the MODULE1_IMPORT and EXTERN macros will be explained below
in the paragraph devoted to global variables. Since client modules do not
define the MODULE1_IMPORT macro, for them the value of EXTERN will be the
keyword extern. On the contrary, the MODULE1_IMPORT macro will be defined by
our code module before it includes its own header file, as we will see.
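The effect of the #ifndef/#define guard can be seen in a single-file sketch where the text of a hypothetical header, shapes.h, is pasted twice, which is effectively what the preprocessor does when two headers both include it. The names SHAPES_H_, shapes_rect and shapes_area are assumptions made only for this illustration; the guard name here has no leading underscore, because identifiers starting with an underscore and an uppercase letter are reserved for the C implementation.

```c
#include <assert.h>

/* First inclusion of the hypothetical "shapes.h": the guard macro is
   not defined yet, so the declarations are parsed. */
#ifndef SHAPES_H_
#define SHAPES_H_
typedef struct { int width, height; } shapes_rect;
#endif

/* Second inclusion: SHAPES_H_ is already defined, so the preprocessor
   skips everything up to #endif. Without the guard, the typedef below
   would be rejected as a conflicting redefinition. */
#ifndef SHAPES_H_
#define SHAPES_H_
typedef struct { int width, height; } shapes_rect;
#endif

/* A function using the type, which was declared exactly once. */
int shapes_area(shapes_rect r)
{
    assert(r.width >= 0 && r.height >= 0);
    return r.width * r.height;
}
```

Removing both guards makes the compiler stop at the second typedef, which is the failure the skeleton header is protecting against.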
As a general rule, to prevent collisions in the global namespace, every public identifier must start with the name of the module, then an underscore, and then the actual name of the item.
Constants can be either simple CPP macros or enumeration values. Enumerations
are better suited when a new type is also wanted, and are discussed below along
with the type declarations. Usually constants are simple int or
double numbers, but float values and string literals
are also allowed.
/* module1.h -- Constants declarations */

#define MODULE1_MAX_BUF_LEN (4*1024)

#define MODULE1_RED_MASK 0xff0000
#define MODULE1_GREEN_MASK 0x00ff00
#define MODULE1_BLUE_MASK 0x0000ff

#define MODULE1_ERROR_FLAG (1<<0)
#define MODULE1_WARNING_FLAG (1<<1)
#define MODULE1_NOTICE_FLAG (1<<2)

#define MODULE1_EARTH_RADIUS 6367.445 /* kilometers */
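As a minimal sketch of how a client might use such constants, the fragment below repeats some of the macros from the listing and applies them in two small helpers. The helper names color_red and flags_has_error are hypothetical, and the 0xRRGGBB layout of the color value is an assumption suggested by the mask values.

```c
#include <assert.h>

/* Constants copied from the module1.h listing. */
#define MODULE1_RED_MASK   0xff0000
#define MODULE1_GREEN_MASK 0x00ff00
#define MODULE1_BLUE_MASK  0x0000ff
#define MODULE1_ERROR_FLAG   (1<<0)
#define MODULE1_WARNING_FLAG (1<<1)

/* Hypothetical helper: extract the red component (0..255) of a
   color value assumed to be laid out as 0xRRGGBB. */
unsigned long color_red(unsigned long rgb)
{
    assert(rgb <= 0xffffffUL);
    return (rgb & MODULE1_RED_MASK) >> 16;
}

/* Hypothetical helper: tell whether a set of flags includes an error. */
int flags_has_error(unsigned flags)
{
    return (flags & MODULE1_ERROR_FLAG) != 0;
}
```

Note the parentheses around expressions such as (1<<0) in the macro definitions: they keep the macro safe when it is expanded inside a larger expression.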
This section of the header file contains enumeration declarations, data structure declarations and opaque data structure declarations. Enumerations are suitable for declaring several related constants. struct declarations are suitable for declaring data structures whose internal details are available for reading and writing by client modules.
Opaque data structures are data structures whose internal details are hidden
from the client modules. Opaque types are fully declared only in the code
module, so that client modules can't access their internal details. Client
modules cannot allocate opaque data structures themselves, nor can they declare
arrays of such types, because their size is known only inside their own code
module. Since client modules can deal only with pointers to opaque types, the
code module must provide every allocation and release routine that may
be required, whose typical names follow the scheme
module_type_alloc() and
module_type_free() respectively.
/* module1.h -- Types declarations */
enum module1_direction {
MODULE1_NORTH,
MODULE1_EAST,
MODULE1_SOUTH,
MODULE1_WEST
};
typedef struct _module1_node
{
struct _module1_node *left, *right;
char * key;
} module1_node;
typedef struct _module1_opaque module1_opaque;
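The opaque type pattern can be sketched in a single file as follows. The module name tally, its fields and all its functions are hypothetical and serve only to illustrate the alloc/free scheme; in a real project the three parts marked in the comments would live in tally.h, tally.c and a client source respectively, so that the client could never see the fields.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- What would live in tally.h (hypothetical module) ---- */
typedef struct tally tally;     /* opaque: size unknown to clients */
tally *tally_alloc(void);
void tally_free(tally *t);
void tally_add(tally *t, int amount);
int tally_total(const tally *t);

/* ---- What would live in tally.c ---- */
struct tally {
    int total;   /* sum of the amounts added so far */
    int count;   /* number of amounts added so far */
};

tally *tally_alloc(void)
{
    return calloc(1, sizeof(tally));   /* fields start at zero */
}

void tally_free(tally *t)
{
    free(t);
}

void tally_add(tally *t, int amount)
{
    assert(t != NULL);
    t->total += amount;
    t->count++;
}

int tally_total(const tally *t)
{
    assert(t != NULL);
    return t->total;
}

/* ---- Client code: it only handles pointers to the opaque type ---- */
int tally_demo(void)
{
    tally *t = tally_alloc();
    if (t == NULL)
        return -1;
    tally_add(t, 3);
    tally_add(t, 4);
    int total = tally_total(t);
    tally_free(t);
    return total;
}
```

Because the client never names a field, the layout of struct tally can change without touching or even re-reading any client source; only re-linking against the new object file is needed.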
It is a good rule to avoid public global variables altogether. But if you really need them, here is the recipe for their declaration and initialization. The MODULE1_IMPORT macro is required in order to allocate the variable in the data section of the code module only. Without this macro every client module would allocate its own copy of the variable, which is not what we expect.
/* module1.h -- Global variables declarations */
EXTERN int module1_counter
#ifdef MODULE1_IMPORT
= -1
#endif
;
EXTERN module1_node *module1_root
#ifdef MODULE1_IMPORT
= NULL
#endif
;
The CPP code hides the initial value from client modules, so that the variables are allocated and initialized in the code module only. Client modules see just an external variable of some type.
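To make the two cases concrete, here is a sketch of what the compiler sees after preprocessing, written out by hand in a single file. The variable counter_value and the function counter_next belong to a hypothetical counter module and are not part of the module1 example.

```c
#include <assert.h>

/* What a client module sees once the header has been expanded
   (COUNTER_IMPORT not defined, EXTERN expands to "extern"):
   a declaration only, no storage is allocated. */
extern int counter_value;

/* What the code module sees (COUNTER_IMPORT defined, EXTERN expands
   to nothing): the one and only definition, with its initial value. */
int counter_value = -1;

/* Hypothetical public function of the module. */
int counter_next(void)
{
    assert(counter_value >= -1);
    counter_value++;
    return counter_value;
}
```

Both lines may legally appear in the same translation unit, which is exactly what happens in the code module: it may see an extern declaration through other headers and then provides the definition itself.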
All the functions that need to be accessible from client modules must be
declared with a prototype. Remember that functions without arguments
must have a dummy void formal argument.
/* module1.h -- Function prototypes */

EXTERN void module1_init(void);
/* Initialize this module. */

EXTERN void module1_free(void);
/* Release internal data structures. */

EXTERN module1_node * module1_add(char * key);
/* Add a node to the root tree. Return allocated node. */

EXTERN module1_opaque * module1_opaque_alloc(void);
EXTERN void module1_opaque_free(module1_opaque * p);
/* Memory handling of the opaque data type. */
The code module module1.c should include the required headers,
then define the MODULE1_IMPORT macro before including its own header
file. By including its own header, the code file gets all the constants, types
and variables it requires. Moreover, by including its own header file with
MODULE1_IMPORT defined, the code file allocates and initializes the global
variables declared in the header. Another useful effect of including the header
is that prototypes are checked against the actual functions, so that if, for
example, you forgot some argument in the prototype, or if you changed the code
and forgot to update the header, the compiler will detect the mismatch and
report a proper error message.
Macros, constants and types declared inside a code file cannot be exported, as they are implicitly always "private".
Global variables for internal use must have the static keyword in order to make them "private".
Remember also to declare as static all the functions that are private to
the code module. The static keyword tells the compiler that these
functions are not available for linking, so they will not be visible to other
modules once the code file has been compiled into its own module1.o
object file.
Since the private items are not exported, there is no need to prepend the
module name module1_ to their names, as they cannot collide with
external items. Private items are still available to the debugger, anyway.
/* module1.c -- See module1.h for copyright and info */
#include <stdlib.h>
#include <string.h>
/* Including my own header: */
#define MODULE1_IMPORT
#include "module1.h"
/* Private macros and constants: */
/* Private types: */
/* Public opaque types: */
typedef struct _module1_opaque
{
...
} _module1_opaque;
/* Private global variables: */
static module1_node * spare_nodes = NULL;
static int allocated_total_size = 0;
/* Private functions: */
static module1_opaque * alloc_opaque(void){ ... }
static void free_opaque(module1_opaque * p){ ... }
/* Public functions: */
void module1_init(void){ ... }
module1_node * module1_add(char * key){ ... }
void module1_free(void){ ... }
Note that public functions are left for last, since they usually need some private function; moreover, since public functions already have their prototypes in the header, they can be called from anywhere in the code above them.
The code file should rarely need to declare function prototypes, the main exception being mutually recursive private functions.
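The layout rules for a code file can be sketched with a tiny hypothetical parity module: private functions are static and come first, a private prototype appears only where two functions call each other, and the public function comes last. All names here are assumptions made for the illustration.

```c
#include <assert.h>

/* Private prototype: needed because is_even() calls is_odd()
   before the compiler has seen its definition. */
static int is_odd(unsigned n);

/* Private functions: "static" keeps them out of the linker's view,
   so they need no module prefix. */
static int is_even(unsigned n)
{
    return n == 0 ? 1 : is_odd(n - 1);
}

static int is_odd(unsigned n)
{
    return n == 0 ? 0 : is_even(n - 1);
}

/* Public function: placed last, so it can call the private ones.
   In a real module its prototype would live in parity.h. */
int parity_is_even(unsigned n)
{
    assert(n <= 10000u);   /* keep the recursion depth bounded */
    return is_even(n);
}
```

A function that only calls itself needs no prototype, since its own name is already in scope inside its body; it is mutual recursion that forces one of the two declarations to come early.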
The name of our project will be program_name and its source file
is program_name.c. This source is the only one that does not
require a header file, and it contains the only public function,
main(), that does not have a prototype. The main source includes
and initializes all the required modules, and finally terminates them once the
program is finished. The general structure of the main program source file is
as follows:
/*
program_name.c -- Our sample program
Copyright 2008 by icosaedro.it di Umberto Salsi
License: as you wish
Author: Umberto Salsi <salsi@icosaedro.it>
Version: 2008-04-23
Updates: www.icosaedro.it/c-modules.html
*/
/* Include standard headers: */
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
/* Include our modules headers: */
#include "module1.h"
#include "module2.h"
#include "module3.h"
int main(int argc, char **argv)
{
/* Initialize modules: */
module1_init();
module2_init();
/* Perform our job. */
/* Properly terminate the modules, if required: */
module2_free();
module1_free();
return 0;
}
Compiling, linking and other common tasks are usually delegated to a
Makefile, the configuration file for the make
command. make already has default rules that tell how to build
object files *.o from their source files *.c, but unfortunately it is not
aware of the modular structure of our source. To deal with our modules we have
to tell make that the *.h header files must also be added to its
dependency rules. This requires an explicit rule, as we can't rely on the
default one. Moreover the main program, the only source that has no
header file, must be compiled with another rule, and it may also need to
specify some external libraries to link with. This is the resulting
Makefile skeleton:
# Makefile
# Compiler flags: all warnings + debugger meta-data
CFLAGS = -Wall -g
# External libraries: only math in this example
LIBS = -lm
# Pre-defined macros for conditional compilation
DEFS = -DDEBUG_FLAG -DEXPERIMENTAL=0
# The final executable program file, i.e. name of our program
BIN = program_name
# Object files on which $(BIN) depends
OBJS = module1.o module2.o module3.o
# This default rule compiles the executable program
# (recipe lines must start with a tab; libraries go after the objects)
$(BIN): $(OBJS) $(BIN).c
	$(CC) $(CFLAGS) $(DEFS) $(OBJS) $(BIN).c $(LIBS) -o $(BIN)
# This rule compiles each module into its object file
%.o: %.c %.h
	$(CC) -c $(CFLAGS) $(DEFS) $< -o $@
.PHONY: clean depend
clean:
	rm -f *~ *.o $(BIN)
depend:
	makedepend -Y -- $(CFLAGS) $(DEFS) -- *.c
With this Makefile, compiling the source becomes as simple as issuing the
make command alone; no arguments are required. Other targets may
also be present, such as make clean, make dist and so
on. The last target, make depend, will be the subject of the next
paragraph.
The source tree we have considered so far is very simple, with a main program
that depends on several independent modules. The %.o rule takes
care of updating every *.o file if the source of its module gets modified,
while the $(BIN) rule re-compiles and re-links the main program if
its source or any of the modules gets modified.
But what if some module depends on some other sub-module, including it either
in its header or in its code file? And what if modules, besides contributing to
the main program, are also mutually dependent? The figure below schematically
illustrates a situation in which module1.h/.c requires a sub-module
module4.h, and module2 requires module3.
Figure: A more complex source layout, where module 1 (either in its .h or .c file) requires module 4, and module 2 (either in its .h or .c file) requires module 3. Unless properly instructed, our Makefile in its basic form fails to detect these dependencies, and sources are not re-compiled as required.
The make command does not parse the contents of the files and it
is not aware of these new dependencies. So if we modify module4 it
will omit to re-compile module1, and if we modify
module3 it will omit to re-compile module2. We can
fix this simply by adding specific rules that handle these dependencies, but we
also have to remember to update these rules after any change in the layout of
our source tree.
The special target make depend can do all that boring work for us,
as it automatically builds all the dependencies between the source files
and appends them to the Makefile itself. After issuing make depend,
in fact, the Makefile gets extended with these new lines:
---- The Makefile as above, but remember to add ----
---- module4.o to the list of the object files. ----

# DO NOT DELETE

module1.o: module4.h
module2.o: module3.h
program_name.o: module1.h module2.h module3.h module4.h
These rules complement the %.o rule we wrote by hand. The last rule refers to
the file program_name.o, which we never generate, so it has no effect in the
context of our Makefile. So, for example, after modifying module4.h
and issuing the make command, the %.o rule causes the
re-compilation of module4.c; the module1.o rule
added by makedepend, combined with the %.o rule,
causes the re-compilation of module1.c; and finally the
$(BIN) rule produces the updated executable program program_name.
Summarizing, after every change to the layout of the source tree it is
advisable to update the Makefile by issuing the command "make depend", and
then we can use the command "make" as usual to generate the
executable program.
The GNU GCC compiler has a -Wall flag that enables a large set of useful warning messages (despite its name, not all of them: -Wextra enables more). I always use this flag because it helps to write clean code, and it saves from many obscure mistakes that would be difficult to detect otherwise.
You may use the nm command to check whether some internal item
(variable or function) escaped from our modularization. This command displays
all the symbols present in the object file, whether available to the linker or
only to the debugger. For every symbol this command also prints a letter that
marks its status and its availability. Public items (i.e. those that the object
file makes available to the client modules) are marked by an uppercase letter
(B, D, T, etc.), while local symbols have lowercase letters (b, d, t, etc.):
$ nm module1.o
00000000 t alloc_opaque
0000000c b allocated_total_size
00000014 T module1_add
00000000 D module1_counter
0000001e T module1_free
0000000f T module1_init
00000004 B module1_root
0000000a t free_opaque
00000008 b spare_nodes
A simple grep makes it easy to spot the variables and functions
actually exported by a module:
$ nm module1.o | grep " [A-Z] "
00000014 T module1_add
00000000 D module1_counter
0000001e T module1_free
0000000f T module1_init
00000004 B module1_root
We can improve this shell command by writing a useful tool that displays all the private identifiers erroneously exported by each code module:
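#!/bin/sh
# Usage: c-detect-private-exported *.o
echo "Detecting private items exported by object files:"
while [ $# -gt 0 ]; do
    base=$(basename "$1" .o)
    # Keep defined global symbols only: address, uppercase letter, name.
    # Undefined references ("U") have no address field and are skipped.
    nm "$1" | awk 'NF == 3 && $2 ~ /^[A-Z]$/ { print $3 }' |
    while read -r id; do
        # Report the symbol if its name does not appear in the header.
        grep -q -w -- "$id" "$base.h" || echo " $id"
    done
    shift
done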
This script accepts a list of .o files and displays all the
identifiers exported that are not declared in the corresponding .h
file: these symbols can then be readily added to their proper include file.