Systems Software: Programming Language Translators
Assembly code
Computers execute machine code.
It is difficult for humans to read, write and debug machine code.
A machine code instruction might look like this: 01000101
Assembly code instructions are equivalent to machine code but easier for humans to work with.
An assembly code instruction might look like this: LDA 5
An assembler is the type of translator used for assembly language (not high level languages).
It converts mnemonic assembly language instructions into machine code.
Assembler
Assembly code is a low level language.
Translating assembly code instructions into machine code is done by an assembler.
Each processor has its own instruction set, so the object code produced will be hardware specific.
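As a rough illustration of what an assembler does, the sketch below translates mnemonics into 8-bit machine code. The instruction set is invented for this example (a 3-bit opcode followed by a 5-bit operand); real processors each define their own encoding.

```python
# Hypothetical instruction set: 3-bit opcode + 5-bit operand (invented for illustration).
OPCODES = {"LDA": 0b010, "ADD": 0b011, "STA": 0b100}

def assemble(line):
    """Translate one assembly instruction such as 'LDA 5' into an 8-bit string."""
    mnemonic, operand = line.split()
    value = int(operand)
    if not 0 <= value < 32:
        raise ValueError("operand must fit in 5 bits")
    return format(OPCODES[mnemonic], "03b") + format(value, "05b")

program = ["LDA 5", "ADD 3", "STA 9"]
machine_code = [assemble(line) for line in program]
print(machine_code)  # ['01000101', '01100011', '10001001']
```

A different processor would need a different OPCODES table, which is why the object code is hardware specific.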
Compiler
A compiler translates a whole program written in a high level language into executable machine code, going through several stages.
Compiled high level languages include Visual Basic and C++.
The resulting machine code is called object code.
The object code produced is also hardware specific.
Advantages
Program can be run many times without the need to recompile.
Faster to execute.
Executable code does not require a translator (compiler or interpreter) to be present in order to run.
Compiled code cannot be easily read and copied by others.
Converts the whole code into one file (often a .exe file).
The file can then be run on any computer with a compatible processor without the translator needing to be present.
Disadvantage: compiling source code can take a long time. Because different CPUs understand different machine code, the translator will often have to convert the instructions into several sets of machine code, one for each type of CPU.
Interpreter
An interpreter also translates code written in a high level language into machine code.
Interpreted high level languages include JavaScript and PHP
However, the interpreter does this line by line rather than translating the whole program before any of it can be executed.
Source code can be run on any machine with the interpreter.
If a small error is found, there is no need to recompile the entire program.
Converts the source code into machine code one line at a time.
The program therefore runs more slowly than compiled code.
The main use of an interpreter is at the testing / development stage.
Programmers can quickly identify errors and fix them.
The translator must be present on the computer for the program to be run.
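The line-by-line behaviour can be sketched with a toy interpreter. It only understands two kinds of line (`name = number` and `print(name)`), which is an assumption made for this example rather than a real language. Lines before the faulty one have already run by the time the error is reported.

```python
def interpret(source):
    """Execute a toy language one line at a time and return the output produced."""
    variables = {}
    output = []
    for number, line in enumerate(source.splitlines(), start=1):
        line = line.strip()
        if line.startswith("print(") and line.endswith(")"):
            name = line[len("print("):-1]
            if name not in variables:
                output.append(f"error on line {number}: {name} is not defined")
                break  # execution stops at the first faulty line
            output.append(str(variables[name]))
        elif "=" in line:
            name, value = line.split("=")
            variables[name.strip()] = int(value)
        else:
            output.append(f"error on line {number}: cannot understand {line!r}")
            break
    return output

result = interpret("age = 17\nprint(age)\nprint(height)")
print(result)  # ['17', 'error on line 3: height is not defined']
```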
Bytecode
Most languages are not solely compiled or interpreted.
-They use a combination of both.
For example, Java is compiled into bytecode which is an intermediate step between source code and machine code.
The bytecode is interpreted by a bytecode interpreter, for example the Java virtual machine.
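To give a feel for the idea, here is a minimal sketch of a bytecode interpreter. The four operations (PUSH, STORE, LOAD, PRINT) are hypothetical and much simpler than the real Java virtual machine instruction set.

```python
# Hypothetical bytecode: a list of (operation, argument) pairs.
bytecode = [
    ("PUSH", 17),      # put the constant 17 on the stack
    ("STORE", "age"),  # pop it into the variable age
    ("LOAD", "age"),   # push the value of age back onto the stack
    ("PRINT", None),   # pop the top of the stack and output it
]

def run(bytecode):
    """A toy 'virtual machine' that interprets the bytecode above."""
    stack, variables, output = [], {}, []
    for operation, argument in bytecode:
        if operation == "PUSH":
            stack.append(argument)
        elif operation == "STORE":
            variables[argument] = stack.pop()
        elif operation == "LOAD":
            stack.append(variables[argument])
        elif operation == "PRINT":
            output.append(stack.pop())
    return output

print(run(bytecode))  # [17]
```

The same bytecode could run on any machine that has a matching interpreter, whatever its processor.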
Stages of compilation
A compiler goes through several stages to convert source code to object code:
Lexical analysis
Symbol table
Syntax analysis
Semantic analysis
Code generation
Lexical analysis
All unnecessary spaces and all comments are removed.
Keywords (e.g. print), constants and identifiers are replaced with tokens representing their function in the program.
For example look at the following code:
age = 17
print(age)
This might produce the following tokens:
<identifier><operator><number>
<keyword><open_bracket><identifier><close_bracket>
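A lexer for this example could be sketched as follows. The token names follow the ones above; the keyword list and the regular expressions are assumptions for a toy language, and any character the patterns do not recognise is silently skipped here rather than reported.

```python
import re

KEYWORDS = {"print"}  # assumed keyword list for this toy language

TOKEN_PATTERNS = [
    ("number", r"\d+"),
    ("name", r"[A-Za-z_]\w*"),
    ("operator", r"[=+\-*/<>]"),
    ("open_bracket", r"\("),
    ("close_bracket", r"\)"),
    ("skip", r"\s+|#.*"),  # spaces and comments are thrown away
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_PATTERNS))

def tokenise(line):
    """Return the list of token types found in one line of source code."""
    tokens = []
    for match in MASTER.finditer(line):
        kind = match.lastgroup
        if kind == "skip":
            continue
        if kind == "name":
            kind = "keyword" if match.group() in KEYWORDS else "identifier"
        tokens.append(kind)
    return tokens

print(tokenise("age = 17  # the user's age"))  # ['identifier', 'operator', 'number']
print(tokenise("print(age)"))  # ['keyword', 'open_bracket', 'identifier', 'close_bracket']
```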
Symbol table
The lexer will build up a symbol table with an entry for every keyword and identifier in the program.
The symbol table helps to keep track of the run-time memory address for each identifier.
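A symbol table can be pictured as a dictionary. In this sketch the layout of each entry, and the assumption that every identifier takes up exactly one memory location, are simplifications for illustration.

```python
def build_symbol_table(tokens):
    """Build a table with one entry per keyword or identifier seen in the tokens."""
    table = {}
    next_address = 0  # assumed: each identifier occupies one memory location
    for kind, text in tokens:
        if kind not in ("keyword", "identifier") or text in table:
            continue
        table[text] = {"kind": kind}
        if kind == "identifier":
            table[text]["address"] = next_address
            next_address += 1
    return table

# Tokens for:  age = 17  /  print(age)
tokens = [
    ("identifier", "age"), ("operator", "="), ("number", "17"),
    ("keyword", "print"), ("open_bracket", "("), ("identifier", "age"), ("close_bracket", ")"),
]
table = build_symbol_table(tokens)
print(table)  # {'age': {'kind': 'identifier', 'address': 0}, 'print': {'kind': 'keyword'}}
```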
Syntax Analysis
The stream of tokens from the lexing stage is split up into phrases.
Each phrase is parsed which means it is checked against the rules of the language.
If the phrase is not valid, an error will be recorded.
For example, this sequence of tokens may not be valid and this would be picked up by syntax analysis.
<number><operator><identifier>
(e.g. the source code might be 5 = a)
The rules of the language (its grammar) need to be defined so that each phrase can be checked against them.
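As a very simplified sketch, the rules can be written as a list of allowed token sequences and each phrase checked against them. The two rules below are assumptions for a toy language; real grammars are usually recursive and far larger.

```python
# Assumed grammar: each rule is one allowed sequence of token types.
RULES = {
    "assignment": ["identifier", "operator", "number"],
    "print statement": ["keyword", "open_bracket", "identifier", "close_bracket"],
}

def parse(phrase):
    """Return the name of the rule the phrase matches, or None if it is invalid."""
    for name, pattern in RULES.items():
        if phrase == pattern:
            return name
    return None

print(parse(["identifier", "operator", "number"]))  # assignment  (e.g. a = 5)
print(parse(["number", "operator", "identifier"]))  # None        (e.g. 5 = a)
```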
Semantic analysis
It is possible to create a sequence of tokens which is valid syntax but is not a valid program.
Semantic analysis checks for this kind of error.
For example this phrase may be valid syntax:
<if> <identifier> <operator> <number>
(e.g. the source code might be: if a > 5 )
However, if the identifier has not previously been declared, then semantically it is not a valid program.
Example of error:
example = "hello"
answer = 2 * example
-The example variable holds a string, but the multiplication on the next line expects it to be a number.
-This type of error will be picked up during semantic analysis.
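Both kinds of check (undeclared identifiers and mismatched types) can be sketched as below. The tuple format for statements is invented for this example, and the sketch assumes a language in which a string cannot be multiplied by a number.

```python
def check_semantics(statements):
    """Return a list of semantic errors found in a list of statements."""
    types = {}   # identifier -> type name, filled in as assignments are seen
    errors = []
    for number, statement in enumerate(statements, start=1):
        if statement[0] == "assign":
            _, name, value = statement
            types[name] = type(value).__name__
        elif statement[0] == "multiply":
            _, target, factor, name = statement
            if name not in types:
                errors.append(f"line {number}: {name} has not been declared")
            elif types[name] not in ("int", "float"):
                errors.append(f"line {number}: {name} is a {types[name]}, not a number")
            else:
                types[target] = types[name]
    return errors

# example = "hello"  /  answer = 2 * example
program = [("assign", "example", "hello"), ("multiply", "answer", 2, "example")]
print(check_semantics(program))  # ['line 2: example is a str, not a number']
```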
Code generation
Once the program has been checked, the compiler generates the machine code.
It may do this in several 'passes' over the code because code optimisation will also take place.
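A minimal sketch of code generation for one simple statement is shown below. For readability it outputs assembly-style mnemonics rather than binary; the LDA / ADD / SUB / STA instructions and the address table are hypothetical.

```python
def generate(target, left, operator, right, addresses):
    """Generate instructions for a statement of the form: target = left operator right."""
    operations = {"+": "ADD", "-": "SUB"}  # assumed instruction names
    return [
        f"LDA {addresses[left]}",                     # load the left operand
        f"{operations[operator]} {addresses[right]}", # combine it with the right operand
        f"STA {addresses[target]}",                   # store the result
    ]

# total = price + tax, with memory addresses taken from a symbol table
addresses = {"price": 0, "tax": 1, "total": 2}
code = generate("total", "price", "+", "tax", addresses)
print(code)  # ['LDA 0', 'ADD 1', 'STA 2']
```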
Code optimisation
Sometimes source code is written inefficiently.
Code optimisation aims to:
Remove redundant instructions.
Replace inefficient code with code that achieves the same result but in a more efficient way.
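One simple way to remove a redundant instruction is sketched below: a load that immediately follows a store to the same address is dropped, because the value is already in the accumulator. The LDA / ADD / STA mnemonics are hypothetical, and the sketch assumes straight-line code with no jumps into the middle of it.

```python
def remove_redundant_loads(instructions):
    """Drop any 'LDA x' that comes directly after 'STA x'."""
    optimised = []
    for instruction in instructions:
        if optimised and instruction.startswith("LDA "):
            if optimised[-1] == "STA " + instruction[4:]:
                continue  # the accumulator already holds this value
        optimised.append(instruction)
    return optimised

before = ["LDA 0", "ADD 1", "STA 2", "LDA 2", "ADD 3", "STA 4"]
after = remove_redundant_loads(before)
print(after)  # ['LDA 0', 'ADD 1', 'STA 2', 'ADD 3', 'STA 4']
```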
Libraries
Most languages have sets of pre-written (and pre-compiled) functions called libraries.
Examples could include functions for generating random numbers or for mathematical operations.
A programmer can also write their own libraries.
Library functions can be called within a program.
Linker
The linker needs to put the appropriate memory addresses in place so that the program can call and return from a library function.
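The job of filling in addresses can be sketched as follows. The CALL / RET / HLT mnemonics, the library contents, and the assumption that each instruction takes one memory address are all made up for illustration.

```python
def link(program, library):
    """Place library functions after the program and replace names in CALLs with addresses."""
    addresses = {}
    library_code = []
    address = len(program)  # the library is placed straight after the program
    for name, body in library.items():
        addresses[name] = address
        library_code.extend(body)
        address += len(body)
    linked = []
    for instruction in program:
        if instruction.startswith("CALL "):
            name = instruction[5:]
            instruction = f"CALL {addresses[name]}"
        linked.append(instruction)
    return linked + library_code

program = ["LDA 0", "CALL random", "STA 1", "HLT"]
library = {"random": ["LDA 7", "RET"], "sqrt": ["LDA 8", "RET"]}
linked = link(program, library)
print(linked)
# ['LDA 0', 'CALL 4', 'STA 1', 'HLT', 'LDA 7', 'RET', 'LDA 8', 'RET']
```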
Loader
The job of the loader is to copy the program and any linked subroutines into main memory to run.
When the executable code is created, it may assume the program will be loaded at memory address 0.
However, memory addresses in the program will need to be relocated by the loader because some memory will already be in use.
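Relocation can be sketched as adding the load address to every address in the program. In this example each instruction carries a flag saying whether its operand is a memory address; that flag, the mnemonics and the dictionary used for memory are assumptions for illustration.

```python
def load(program, memory, load_address):
    """Copy the program into memory at load_address, relocating address operands."""
    for offset, (mnemonic, operand, is_address) in enumerate(program):
        if is_address:
            operand = operand + load_address  # relocate: shift the address
        memory[load_address + offset] = (mnemonic, operand)
    return memory

# Program written as if it starts at address 0; address 3 holds its data.
program = [("LDA", 3, True), ("ADD", 1, False), ("STA", 3, True), ("DATA", 0, False)]
memory = load(program, {}, 100)
print(memory)
# {100: ('LDA', 103), 101: ('ADD', 1), 102: ('STA', 103), 103: ('DATA', 0)}
```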