Update of the minibook on building programs: #99

Merged
merged 16 commits into from
Jun 17, 2020
13 changes: 12 additions & 1 deletion _data/learning.yml
@@ -28,7 +28,18 @@ books:
      - link: /learn/quickstart/organising_code
      - link: /learn/quickstart/derived_types


  - title: Building programs
    description: How to use the compiler to build an executable program
    category: Getting started
    link: /learn/building_programs
    pages:
      - link: /learn/building_programs/compiling_source
      - link: /learn/building_programs/linking_pieces
      - link: /learn/building_programs/runtime_libraries
      - link: /learn/building_programs/include_files
      - link: /learn/building_programs/managing_libraries
      - link: /learn/building_programs/build_tools
      - link: /learn/building_programs/distributing

# Web links listed at the bottom of the 'Learn' landing page
#
77 changes: 77 additions & 0 deletions learn/building_programs/build_tools.md
@@ -0,0 +1,77 @@
---
layout: book
title: Build tools
permalink: /learn/building_programs/build_tools
---

If this seems complicated, well, you are right and we are only
scratching the surface here. The complications arise because of
differences between platforms, differences between compilers/linkers and
because of differences in the way programs are set up. Fortunately,
there are many tools to help configure and maintain the build steps.
We will not try to catalogue them, but instead give a very limited
list of tools that you typically encounter:

* The `make` utility is a classical tool that uses instructions about
how the various components of a program depend on each other to
efficiently compile and link the program (or programs). It takes a
so-called `Makefile` that contains the dependencies.

Simply put:

If a program file is older than any of the libraries and object files
it depends on, the make utility knows it has to rebuild it and goes on
to look at the libraries and object files - are any out of date?

If an object file is older than the corresponding source file, the
make utility knows it has to compile the source file.

* Integrated development tools take care of many of the above details. A
popular cross-platform tool is Microsoft's [Visual Studio Code](https://code.visualstudio.com/), but others exist,
such as [Atom](https://atom.io/), [Eclipse Photran](https://www.eclipse.org/photran/), and [Code::Blocks](http://www.codeblocks.org/). They offer a graphical
user interface, but are often tied to a specific compiler and
platform.

* Maintenance tools like autotools and CMake can generate Makefiles or
Visual Studio project files via a high-level description. They abstract
away from the compiler and platform specifics.
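
The out-of-date rule that `make` applies can be sketched with ordinary shell
tests. This only illustrates the timestamp comparison, it is not how `make` is
implemented; the scratch files, the fixed timestamps and the helper
`needs_rebuild` are made up for the example.

```shell
# Illustration only: compare timestamps the way make does.
# The scratch files and the helper name needs_rebuild are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Give the files fixed modification times (format: YYYYMMDDhhmm)
touch -t 202001010000 function.f90   # source file, not edited since ...
touch -t 202001020000 function.o     # ... its object file was built
touch -t 202001020000 tabulate.o     # object file built on 2 January
touch -t 202001030000 tabulate.f90   # source file edited on 3 January

needs_rebuild() {
    # $1 = target, $2 = the file it depends on
    if [ ! -e "$1" ] || [ "$2" -nt "$1" ]; then
        echo "$1 must be rebuilt"
    else
        echo "$1 is up to date"
    fi
}

needs_rebuild function.o function.f90   # prints: function.o is up to date
needs_rebuild tabulate.o tabulate.f90   # prints: tabulate.o must be rebuilt
```

A real `make` run applies this comparison to every target listed in the
`Makefile` and then executes the commands for the targets that are out of date.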

Here is a very simple example of a `Makefile` as used by the `make` utility,
just to give you an impression:

    # Collect the macros at the beginning - easier to customise
    FC = gfortran
    LD = gfortran
    FCOPTS = -c
    LDOPTS = -o

    EXE = .exe
    OBJ = .o

    all: tabulate$(EXE)

    # Link the object files into the program
    tabulate$(EXE) : tabulate$(OBJ) function$(OBJ)
    {tab}$(LD) $(LDOPTS) tabulate$(EXE) tabulate$(OBJ) function$(OBJ)

    # Compiling function.f90 also produces the module file that
    # tabulate.f90 uses, so function$(OBJ) has to be built first
    tabulate$(OBJ) : tabulate.f90 function$(OBJ)
    {tab}$(FC) $(FCOPTS) tabulate.f90

    function$(OBJ) : function.f90
    {tab}$(FC) $(FCOPTS) function.f90

(A peculiarity of `make` is that in the input file, tab characters are used
in several places - here indicated as "{tab}" - as significant whitespace.)

Store this in a file named "Makefile", replace each "{tab}" by a tab character,
and run it with:

```shell
$ make
```

(The name `Makefile` is the default; use the option `-f` to specify
a different file name.) Now change only the file "tabulate.f90" and run `make`
again. You will see that only that file is compiled again before the
program is rebuilt. The file "function.f90" was not changed, so its object
file and module intermediate file are still up to date and there
is no need to recompile it.
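
To watch this behaviour without a compiler at hand, the following sketch uses a
stand-in makefile whose recipes merely print a message and `touch` the target.
The file name `demo.mk` and the `echo`/`touch` recipes are assumptions made for
the demonstration; `printf` writes the tab characters that `make` requires.

```shell
# Stand-in demo: the recipes only echo and touch, so no compiler is needed.
# The file name demo.mk is hypothetical.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch tabulate.f90 function.f90

# Each rule: target, prerequisites, then tab-indented recipe lines
printf 'tabulate.exe: tabulate.o function.o\n\t@echo link tabulate.exe\n\t@touch tabulate.exe\n' > demo.mk
printf 'tabulate.o: tabulate.f90 function.o\n\t@echo compile tabulate.f90\n\t@touch tabulate.o\n' >> demo.mk
printf 'function.o: function.f90\n\t@echo compile function.f90\n\t@touch function.o\n' >> demo.mk

make -f demo.mk      # first run: both "compile" steps and the "link" step

sleep 1              # make sure the next timestamp is really newer
touch tabulate.f90   # pretend the source file was edited

make -f demo.mk      # second run: only "compile tabulate.f90" and "link tabulate.exe"
```

Replacing the `echo` and `touch` lines by the actual compiler commands gives
the behaviour described above.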
108 changes: 108 additions & 0 deletions learn/building_programs/compiling_source.md
@@ -0,0 +1,108 @@
---
layout: book
title: Compiling the source code
permalink: /learn/building_programs/compiling_source
---

The first step in the build process is to compile the source code. The
output from this step is generally known as the object code - a set of
instructions for the computer generated from the human-readable source
code. Different compilers produce different object code from the
same source code, and their naming conventions differ as well.

The consequences:

* If you use a particular compiler for one source file, you need to use
the same compiler (or a compatible one) for all other pieces. After
all, a program may be built from many different source files and the
compiled pieces have to cooperate.
* Each source file will be compiled and the result is stored in a file
with an extension like ".o" or ".obj". It is these object files that are
the input for the next step: the link process.

Compilers are complex pieces of software: they have to understand the
language in much more detail and depth than the average programmer. They
also need to understand the inner working of the computer. And then,
over the years they have been extended with numerous options to
customise the compilation process and the final program that will be
built.

But the basics are simple enough. Take the gfortran compiler, part of
the GNU Compiler Collection. To compile a simple program such as the one
above, which consists of one source file, you run the following command,
assuming the source code is stored in the file "hello.f90":

```shell
$ gfortran -c hello.f90
```

This results in a file "hello.o" (as the gfortran compiler uses ".o" as
the extension for the object files).

The option "-c" means: only compile the source files. If you were to
leave it out, then the default action of the compiler is to compile the
source file and start the linker to build the actual executable program.
The command:

```shell
$ gfortran hello.f90
```

results in an executable file, "a.out" on Linux or "a.exe" on
Windows.
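
The two ways of building can be compared side by side. The sketch below assumes
that `gfortran` is installed and uses a minimal stand-in for "hello.f90"; the
output name `hello` is chosen for the example only.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A stand-in source file; any file with a main program will do
cat > hello.f90 <<'EOF'
program hello
    write(*,*) 'Hello!'
end program hello
EOF

# Step by step: compile only, then let gfortran start the linker
gfortran -c hello.f90        # produces hello.o
gfortran -o hello hello.o    # links hello.o into the executable "hello"
./hello

# In one go: compile and link, with the default name for the executable
gfortran hello.f90           # produces a.out (a.exe on Windows)
```

The separate steps pay off once a program consists of several source files:
only the files that changed need to be compiled again before linking.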

Some remarks:

* The compiler may complain about the contents of the source file, if it
finds something wrong with it - a typo for instance or an unknown
keyword. In that case the compilation process is broken off and you will
not get an object file or an executable program. For instance, if
the word "program" was inadvertently typed as "prgoram":

```shell
$ gfortran hello3.f90
hello3.f90:1:0:

    1 | prgoram hello
      |
Error: Unclassifiable statement at (1)
hello3.f90:3:17:

    3 | end program hello
      |                 1
Error: Syntax error in END PROGRAM statement at (1)
f951: Error: Unexpected end of file in 'hello3.f90'
```

Using this compilation report you can correct the source code and try
again.

* The step without "-c" can only succeed if the source file contains a
main program - characterised by the `program` statement in Fortran.
Otherwise the link step will complain about a missing "symbol", something
along these lines:

```shell
$ gfortran hello2.f90
/usr/lib/../lib64/crt1.o: In function `_start':
(.text+0x20): undefined reference to `main'
collect2: error: ld returned 1 exit status
```

The file "hello2.f90" is almost the same as the file "hello.f90", except
that the keyword `program` has been replaced by the keyword `subroutine`.

The above examples of output from the compiler will differ per compiler
and platform on which it runs. These examples come from the gfortran
compiler running in a Cygwin environment on Windows.

Compilers also differ in the options they support, but in general they offer:

* Options for optimising the code - resulting in faster programs or
smaller memory footprints;
* Options for checking the source code - for instance, checks that a
variable is not used before it has been given a value, or checks whether
some extension to the language is used;
* Options for the location of include or module files, see below;
* Options for debugging.
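
For gfortran, typical representatives of these categories look as follows. This
is a sketch, not a recommendation: the stand-in file "hello.f90" and the
directory names `include` and `modules` are placeholders, and other compilers
spell these options differently.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A stand-in source file to compile
cat > hello.f90 <<'EOF'
program hello
    write(*,*) 'Hello!'
end program hello
EOF

# Optimisation: let the compiler spend more effort on faster code
gfortran -c -O2 hello.f90

# Checking: extra warnings at compile time, run-time checks, and
# a complaint when the code goes beyond the Fortran 2008 standard
gfortran -c -Wall -fcheck=all -std=f2008 hello.f90

# Locations: search "include" for include and module files (-I),
# write newly generated module files to "modules" (-J)
mkdir -p include modules
gfortran -c -Iinclude -Jmodules hello.f90

# Debugging: add symbol information for a debugger
gfortran -c -g hello.f90
```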

135 changes: 135 additions & 0 deletions learn/building_programs/distributing.md
@@ -0,0 +1,135 @@
---
layout: book
title: Distributing your programs
permalink: /learn/building_programs/distributing
---

When you distribute your programs, there are a number of options you can
choose from:

1. Distribute the entire source code
2. Distribute a pre-built executable program
3. Distribute static or dynamic libraries that people can use

__Option 1: Distribute the entire source code__

By far the simplest - for you as a programmer - is this one: you leave it
up to the user to build it on their own machine. Unfortunately, that
means you will have to have a user-friendly build system in place and
the user will have to have access to suitable compilers. For build systems:
see the previous section.

__Option 2: Distribute a pre-built executable program__

A pre-built program that does not need to be customised, other than via its
input, will still need to come with the various run-time libraries and will
be specific to the operating system/environment it was built for.

The set of run-time libraries differs per operating system and compiler version.
For a freely available compiler like gfortran, the easiest thing is to ask the
user to install that compiler on their system. In the case of Windows: the Cygwin
environment may be called for.

Alternatively, you can supply copies of the run-time libraries together with your
program. Put them in the directory where they can be found at run-time.

Note: On Windows, the Intel Fortran compiler comes with a set of _redistributable_ libraries.
These will need to be made available on the user's system.

In general: use a tool like "ldd" or "Dependency Walker" to find out which
libraries are required, and consult the documentation of the compiler.
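
On Linux, a check with `ldd` could look like the sketch below. The program path
is a placeholder: the script inspects whatever is stored in `prog`, here the
system shell so that the example runs wherever `ldd` is available. Substitute
your own executable.

```shell
# Placeholder: replace /bin/sh by the path of your own program
prog=/bin/sh

if command -v ldd >/dev/null 2>&1; then
    # Each line names a shared library the program needs and where it was found
    ldd "$prog"

    # For a program built with gfortran, look for the Fortran run-time library
    ldd "$prog" | grep gfortran || echo "no gfortran run-time library needed by $prog"
else
    echo "ldd is not available on this system"
fi
```

Libraries reported as "not found" are the ones the user still has to install,
or that you have to ship together with the program.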

If your program does allow customisation, consider using dynamic libraries for this.
More is said about this below.

__Option 3: Distribute static or dynamic libraries that people can use__

This option is a combination of the first two options. It does put some burden on
the user, as they must create a main program that calls your routines in the
proper way, but they do not need to know much about the build system you used.
You will have to deal with the run-time libraries, though.

If you choose this option, besides the compiled libraries, you will also need to
supply the module intermediate files. These files are compiler-specific, but so are
the static libraries you build.
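
With gfortran, packaging and using a static library could look like the sketch
below. The library name `libfunctions.a`, the directory `dist` and the two
stand-in source files are assumptions for the illustration; the module name
`user_functions` follows the skeleton module shown in the next section.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for your library source
cat > function.f90 <<'EOF'
module user_functions
    implicit none
contains
    real function f( x )
        real, intent(in) :: x
        f = 2.0 * x
    end function f
end module user_functions
EOF

# Stand-in for the main program the user writes
cat > main.f90 <<'EOF'
program main
    use user_functions
    implicit none
    write(*,*) f( 1.5 )
end program main
EOF

# Your side: compile, archive the object file, collect what you ship
gfortran -c function.f90             # function.o and user_functions.mod
ar rcs libfunctions.a function.o
mkdir -p dist
cp libfunctions.a user_functions.mod dist/

# User's side: -I finds the module file, -L and -l find the library
gfortran -o main main.f90 -Idist -Ldist -lfunctions
./main
```

Because the module file is compiler-specific, the user's `gfortran` has to be
compatible with the one that produced `user_functions.mod`.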

## Distributing the tabulation program
As shown above, the tabulation program can be built with the user-defined function
in a dynamic library. This enables you to:

* Ship the executable (with the appropriate run-time libraries)
* Provide a skeleton version of the module, something like:

```fortran
module user_functions
    implicit none
contains

    real function f( x )
        !DEC$ ATTRIBUTES DLLEXPORT :: f
        real, intent(in) :: x

        ! ... TO BE FILLED IN ...

    end function f
end module user_functions
```

* Provide a basic build script with a command like:

```shell
gfortran -o function.dll function.f90 -shared
```

or:

```shell
ifort -exe:function.dll function.f90 -dll
```

As said, you cannot verify that the user has done the right thing - any
DLL "function.dll" with a function `f` would be accepted, but would not
necessarily lead to a successful run.

An alternative set-up would be to change the main program into a subroutine
and have the function as an argument:

```fortran
module tabulation
    implicit none
contains

    subroutine tabulate( f )
        interface
            real function f( x )
                real, intent(in) :: x
            end function f
        end interface

        ! ... actual implementation

    end subroutine tabulate

end module tabulation
```

Then provide a skeleton main program:

```fortran
program tabulate_f
    use tabulation
    implicit none

    call tabulate( func1 )
contains
    real function func1( x )
        real, intent(in) :: x

        ! ... TO BE FILLED IN ...

    end function func1
end program tabulate_f
```

The advantage is that the compiler can check the interface of the
function that is passed and that the user has more freedom in the use of the
functionality provided by your library.