Compiling programs on Unix systems
Compiler names and basic usage
On UNIX systems, the name of a program source file must end in the appropriate suffix for the language used. To compile a program, type the compiler name followed by the source file.
Language                 Source filename suffix   Compiler   Usage
----------------------------------------------------------------------
C                        .c                       cc         % cc program.c
GNU C                    .c                       gcc        % gcc program.c
GNU Pascal               .p                       gpc        % gpc program.p
Fortran90                .f90 or .f               f90        % f90 program.f90
Sun Fortran 77           .f                       f77        % f77 program.f
C++                      .c or .C                 CC         % CC program.c
GNU C++                  .c, .cc or .C            g++        % g++ program.c
Java                     .java                    javac      % javac program.java
Lisp
Allegro Common Lisp
Scheme (uncommon lisp)
GNAT Ada compiler
----------------------------------------------------------------------
If there are no syntax or other compilation errors, an executable file ``a.out'' will be produced. To run the executable output file, type the name ``a.out'' at the prompt sign. For example, to compile a C program ``mytest.c'' and run the executable output file ``a.out'':
% cc mytest.c
% a.out
Specifying a name for the executable file
To produce an executable file with a name different from the default name a.out, use the -o (lowercase letter O) compiler option. For example, to compile a Fortran 77 program ``myprogram.f'', name the executable file ``myprog'', and run ``myprog'':
% f77 -o myprog myprogram.f
% myprog
Compiling modules separately
When a program is large and is composed of many routines, it is a good practice to divide the routines among several source files (modules). The modules may be compiled separately and then linked to form an executable image. In the event that some of the modules must be modified, you can compile them and link them to the others without recompiling all the modules. This saves considerable amounts of time. Separate compilation is also the means by which executables are constructed when the routines are not all written in the same language.
Linking modules to create an executable image is a job performed by the loader ``ld''. Normally you do not invoke the loader directly; instead, the compiler runs it for you. The -c option causes the compiler to compile a module but to stop short of running the loader. A non-executable object file is produced. It is given the same base name as the source file, but the suffix is ``.o'' (for object module). Linking is performed by running the compiler once more with all the .o files specified as input.
In the following example, a program made of two C source files, ``file1.c'' and ``part2.c'', is compiled separately, and the executable file a.out is produced:
% cc -c file1.c
% cc -c part2.c
% cc file1.o part2.o
CAUTION: the order of the items on the link command line can be important. The loader normally loads every .o file that is named explicitly, but it searches a library only at the point where the library appears in the list. Modules which call routines defined in a library must therefore appear in the list before that library. Otherwise the loader will not see the need for the routines until too late, will omit to load them, and will complain that the routines are undefined.
Use a Makefile to automate the compilation of large programs.
Specifying subroutine libraries
The topic of subroutine libraries is discussed fully in another article.
Briefly, here is an example of how a library is specified in a compile command using the -llibname loader option. There is no space between the -l (lower case letter L) and the library name. Put the option to load the library at the end of the compiler command. For example, to compile C program ``homework2.c'' that uses routines from the math library, and to produce executable file ``squareroot'':
% cc -o squareroot homework2.c -lm
C++ (Iostream) tips
Manual pages related to Iostream are available online. However, unless you know how the articles are named, they are not easy to find. Simply entering "man iostream" doesn't retrieve anything. On the other hand, "man -k iostream" gives these leads:
% man -k iostream
ios (3C++) - basic iostreams formatting
ios.intro (3C++) - introduction to iostreams and the man pages
manip (3C++) - iostream manipulators
stream_MT (3C++) - base class to provide dynamic changing of
iostream class objects to and from MT safety.
stream_locker (3C++) - class used for application level locking of
iostream class objects.
So "man ios.intro" is a good place to start.
Fortran tips
Interactive programming
The following is a list of reserved FORTRAN unit numbers and their names:
unit number    name
0              diagnostic (screen)
5              standard input (keyboard)
6              standard output (screen)
When writing an interactive FORTRAN program, all prompts that alert the user to input data should be sent to unit number 0. This unit number corresponds to the diagnostic unit (screen) and is not buffered, whereas unit number 6, which refers to standard output (also the screen), IS buffered. This means that output written to unit 6 will not be displayed to the user until the output buffer is full or until the program ends. For example, one should use:
      WRITE(0,*) "ENTER X"
      READ(5,*)X
instead of
      WRITE(6,*) "ENTER X"
      READ(5,*)X
Legal FORTRAN variable names
A legal FORTRAN symbol (variable, subroutine, function or program name) is defined as a string of letters and digits such that the first character is a letter. The maximum length is six characters.
I/O concepts
F77 I/O CONCEPTS
Sequential vs. direct access
Sequential access is the common mode of I/O where records are read or written, one after the other, in the same order that they appear in the file. Typical programs open the file, read or write the first record, then the second, the third and so on. It is possible however, when reading disk files, to reposition for the next read operation with fseek (see "man 3f fseek"). RECORDS CAN VARY IN LENGTH when sequential access is used. See example 1 below.
Direct access I/O allows you to read or write records in any order. Typically this is used when reading records from a file. For example, you could read record 7, then record 3, then record 15. Key elements of direct access I/O: ALL RECORDS MUST BE THE SAME SIZE; the size is specified in the open statement using the 'recl' option; and the read or write statement has an extra rec= argument which specifies the record number to read or write. Examples 2 and 3 show some sample code using direct access.
Formatted vs. unformatted
Formatted I/O of numeric variables involves CONVERSION between the machine dependent internal "binary" representation of numeric values and their "human readable" (ascii) representation. A file which contains formatted (ascii) numbers can be examined or modified with a text editor. Also, such a file should be fairly portable to other computer systems regardless of their hardware type. See example 2 below.
Unformatted I/O reads or writes the "binary" representation of numbers WITHOUT CONVERSION. This makes unformatted I/O efficient for storing numeric values that will later be read by a program, again using unformatted I/O. A file containing "binary" data cannot be manipulated with a text editor and it cannot be readily used on another machine which has a different hardware architecture and a different scheme for representing numbers internally. Unformatted I/O can be specified in an open statement with form='unformatted'. See example 3 below.
In list-directed read and write statements the format specifier is "*". This is a shortcut for programmers. In a read statement it means the (ascii) data in the file follows certain flexible conventions and conversion of numeric values to "binary" form should be done based on the data types of the variables and the forms of the data. In write statements, suitable formats are chosen for each item based on its data type. Even though you do not supply an explicit format, CONVERSION BETWEEN "BINARY" AND ASCII REPRESENTATION IS DONE, SO LIST-DIRECTED I/O IS NOT UNFORMATTED I/O. See example 4 for sample code using list-directed I/O.
Example 1: Sequential Access (Formatted)
      character*5 c
      open(unit=6, file='seq_acc.dat')
      a=10.0
      b=20.0
      c='hello'
      i=2345
      write(6,100) a, b
      write(6,110) b, c, i
      write(6,120) c
  100 format (2(f8.5, 1x))
  110 format (f8.5, 1x, a5, 1x, i4)
  120 format (a5)
Output File
10.00000 20.00000
20.00000 hello 2345
hello
Notes for Example 1
Records input or output using formatted sequential access can have variable lengths as shown in the output file (above).
Example 2: Direct Access (Formatted)
integer num(3)
open(11, file='dir_in.dat', recl=3, access='direct',
* form='formatted')
open(12, file='dir_out.dat', recl=3, access='direct',
* form='formatted')
read(unit=11,rec=2,fmt=100) num(1)
read(unit=11,rec=5,fmt=100) num(2)
read(unit=11,rec=1,fmt=100) num(3)
write(unit=12,rec=1,fmt=100) num(1)
write(unit=12,rec=3,fmt=100) num(2)
write(unit=12,rec=2,fmt=100) num(3)
100 format (i2, 1x)
Input File
11 12 13 14 15 16 17 18 19 10
Output File
12 11 15
Notes for Example 2
A record length (recl) of 3 is used in the OPEN statement since the data consists of integers followed by spaces (i.e., 2-digit integer + 1 space = 3 characters). Each integer-space set forms one complete record. These records can be read or written in any order.
Example 3: Direct Access (Unformatted)
real a, array(10)
data array/1.,2.,3.,4.,5.,6.,7.,8.,9.,0./
open(12, file='unform.dat', recl=40, access='direct',
* form='unformatted')
open(6, file='unform_read.dat', recl=40, access='direct',
* form='formatted')
a=10.0
write(12,rec=2) a
write(12,rec=1) (array(k), k=1,10)
read(12,rec=2) a
write(6,100,rec=2) a
read(12,rec=1) (array(k), k=1,10)
write(6,110,rec=1) (array(k), k=1,10)
100 format(1x,f8.3)
110 format(1x,10(f3.1,1x))
Output File (Stdout, Unit 6, Readable Data)
1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 0.0 10.000
Notes for Example 3
The output file 'unform.dat' (unit 12) in the above example is a "binary" file and cannot be viewed with ordinary tools such as a text editor.
The record length (recl) was chosen to be forty bytes to accommodate a set of ten real numbers (each occupying 4 bytes on a VAX). In this example, a smaller record length will result in a run-time error. Note that rec 2 in unform.dat occupies 40 bytes even though it contains only 4 bytes of useful information (i.e., only one real value). The amount of space allocated to each record in the open statement does not change upon execution of the write statement. The record will occupy the same amount of space regardless of whether the write statement completely fills the record space or only a portion of it.
Example 4: List Directed I/O
      character*5 c
      real a
      integer n,ni
      open(5, file='list_dir.dat')
      open(6, file='output.dat')
      read(5,*) n,a,c,ni
      write(6,*) n,a,c,ni
Input File
4 6.3 hellothere 15
Output File
4 6.30000hello 15
Notes for Example 4
An asterisk '*' is used in the READ or WRITE statement to indicate list-directed I/O.
List-directed I/O automatically formats data types according to implementation-specific conventions (e.g., the real 6.3 becomes 6.30000). Notice that since 'c' is defined as a character variable of length five, only the word 'hello' gets stored in memory and sent to the output file.
Where to find additional information
Manual pages
For a complete list of options for a particular compiler, see the manual page. For example,
% man cc
% man CC
% man pc
% man f77
% man f90
Note that not all compilers have man pages.
Other man-page leads can be found using the `-k' option. For example, to find pages that deal with programs that compile, issue the command:
% man -k compile
The `k' in `man -k' stands for `keyword'. The command `man -k' can also be used to find information on library routines when the exact name of the man page is difficult to guess. For example:
% man -k iostream
Some of the options that are frequently given in compiler commands, for example the -l option, are in fact loader options. For information about loader options see:
% man ld
ACS Info Articles
Compilation of large programs can be handled and maintained by a utility called 'make'.

