Lesson 1.3. Compilation

Introduction

Not all programming languages are executed in the same way. We distinguish two major classes: compiled languages and interpreted languages.

Interpreted languages

An interpreted language requires an interpreter to run the source code. The interpreter is an intermediate program that parses and executes the source lines, usually one by one. Popular interpreted languages include MATLAB, PHP, and Python. Java is often listed with them, although its source is first compiled to bytecode that a virtual machine then executes. Each of these languages requires the associated interpreter (or virtual machine) to be installed in order to run a program.

Diagram of the general operation of an interpreter

Compiled languages

Running a compiled program takes two steps:

  1. The source code is first transformed into machine language.
  2. The resulting machine code is then executed directly by the processor.

The first step, the transformation of the source code into machine language, is called compilation. Its output is an executable file, typically an .exe file under Windows. Examples of compiled languages include C, C++, C#, Swift, and Pascal.

Compiled language operating principle

In general, compiled languages are more efficient because their machine code is executed directly by the processor. To give an idea of the difference, a comparison between Python and C++ reports that the same program takes 15 minutes in Python versus 30 seconds in C++. The graph below gives an order of magnitude for the execution-speed ratios of different languages: C is about 10 times faster than PHP 7 and about 40 times faster than Python on the same program.

Ratio of execution speeds of different languages (compiled and interpreted)

The trade-off is compilation time: although it is usually almost instantaneous, it can be much longer on large projects. For example, compiling a Linux kernel can take about ten hours, depending on the machine and the configuration. Of course, if the source code has not been modified, there is no need to recompile before each execution.

Compiler

Before C code can be executed, it must be compiled with a compiler. In the Hello world example seen earlier, if you look carefully at the console output, you will notice that the following command runs before the program itself:

clang-7 -pthread -lm -o main main.c

The compiler creates an executable file named main. This file is run right afterwards with the command ./main. If you type ./main again in the console, the program runs a second time.

Compiling step

During compilation, the source code is analyzed before being converted. To be compilable, the source code must of course follow certain rules. The first rule to know is that each instruction ends with a semicolon. Here are the lines of the main function of the Hello world program:

printf("Hello World\n");
return 0;

The semicolon tells the compiler where one instruction ends and the next one starts, which allows it to parse the code automatically. If you omit a semicolon, the compiler cannot tell the two instructions apart. From its point of view, you have written something like:

printf("Hello World\n")return 0;

printf and return become a single instruction. Some editors, such as Qt Creator, display a small arrow to mark the line with the error. It is not uncommon for the compiler to place the arrow on the second line rather than on the line where the semicolon is missing, because from the compiler's point of view the two lines form a single instruction.

For readability, we write programs with one instruction per line, but the compiler ignores line breaks between instructions. The only separators that matter to it are semicolons. The body of the Hello world program could be written on a single line, as long as the instructions are separated by semicolons:

#include <stdio.h> 
int main(void) { printf("Hello World\n"); return 0; }

This program works, but it is hard to read. This style of writing should be avoided.

Note: the first line does not end with a semicolon because it is not a statement but a preprocessor directive. This line is not meant to be executed; it passes information to the compiler. Here, it tells the compiler to use the stdio.h header, which declares printf. Unlike statements, a directive must sit on its own line, which is why the example above still takes two lines.

Compiling errors

When the code is transformed into machine language, the compiler analyzes it and reports any problems it finds, either as errors, which prevent the compilation, or as warnings, which do not.

When deploying an application, the goal is to have neither errors nor warnings. Fixing errors is mandatory. Warnings are not always fixed, but if you leave one in place, you must understand its cause and do so knowingly.

Quiz

Which statements are true?

Answer: an interpreter parses and runs the code line by line.

About compilers...

Answer: the compiler converts the code into a file that can be run in a second step.

What is the problem with the code below?

int main(void){printf("Strange exercise !\n");return 0;}
Answer: this program is understandable by the machine, but much less so by a human.

What is the problem with the code below?

int main(void) {
  printf("Strange exercise !\n")
  return 0
};
Answer: a semicolon goes at the end of each instruction, but not after the closing brace of a function.

With the following compilation result:

main.c:3:26: error: expected ';' after expression
  printf("Hello\n")
Answer: the missing semicolon generates an error that prevents the compilation.

At the time of compilation a warning appears:

main.c:5:8: warning: using the result of an assignment as a condition without parentheses
      [-Wparentheses]
  if (i=5) printf ("Hello");
Answer: warnings do not prevent the program from running, but they should be considered carefully.


Last update : 11/20/2022