2018 - An Analysis of x86-64 Inline Assembly in C Programs
Information
Inline assembly code
Assembly instructions embedded in C code in a way that allows direct interaction, i.e. they can directly access C variables
Platform-dependent, uses instructions from the target machine's ISA
Many tools that process C code or associated intermediate languages (such as LLVM IR and CIL) partially or entirely lack support for inline assembly
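To make "directly access C variables" concrete, here is a minimal sketch in GNU C extended asm. The function name add_u64 is hypothetical, and the sketch assumes GCC or Clang targeting x86-64; on other targets it falls back to plain C.

```c
#include <stdint.h>

/* Hypothetical helper: adds two 64-bit integers with the x86-64 `add`
   instruction. Assumes GCC or Clang on x86-64; plain C otherwise. */
static uint64_t add_u64(uint64_t a, uint64_t b)
{
#if defined(__x86_64__) && defined(__GNUC__)
    /* "+r"(a): C variable a is both read and written, held in a register.
       "r"(b):  C variable b is a read-only register input.
       "cc":    `add` modifies the flags register.
       AT&T operand order: `add %1, %0` computes %0 = %0 + %1. */
    __asm__("add %1, %0" : "+r"(a) : "r"(b) : "cc");
    return a;
#else
    return a + b;
#endif
}
```

The operand constraints are what tie the instruction to the C variables: the compiler picks the registers and substitutes them for %0 and %1.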
Methodology
Research Questions
How long is the average inline assembly fragment
Characterizing the length of the average inline assembly fragment gives further implementation guidance
In which domains is inline assembly used
Answering this question helps if a tool targets only specific domains
It seemed likely that the usage of inline assembly differs across domains
Do projects use the same subset of inline assembly
Determines how much inline assembly support needs to be implemented to cope with the majority of C projects
What is inline assembly used for
Knowing the typical use cases of inline assembly helps tool writers to assign meaningful semantics to inline assembly instructions
Helps to determine whether alternative implementations in C could be considered
Assumption
Inline assembly is used—aside from cryptographic use cases—mainly to improve performance and to access functionality that is not exposed by the C language
How common is inline assembly in C programs
Knowing how commonly inline assembly is used indicates to C tool writers whether it needs to be supported
Scope of the Study
Focus was to quantitatively and qualitatively analyze inline assembly code
Obtaining the Projects
Selected C applications from GitHub
Filtering the Projects
Focused on code for x86-64 Linux systems. Therefore, we excluded projects that worked only for other architectures or other operating systems
Analysis focused on C code and excluded C++ code. Projects that mixed C and C++ code were also excluded if their C++ LOC outnumbered their C LOC.
Also excluded were C/C++ header files (ending with .h) that contained C++ code
Inline Assembly Constructs
Since inline assembly is not part of the C language standard, compilers differ in the syntax and features provided
We assume use of the GNU C inline assembly syntax, which is the de-facto standard on Unix platforms, recognizes the asm or __asm__ keywords to specify an inline assembly fragment, and has both “basic” and “extended” flavors
Using basic asm, a programmer can specify only the assembler fragment or directive. Use cases for basic assembly are limited; however, in contrast to extended asm, basic inline assembly can be used outside of functions
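A sketch of basic asm inside and outside a function. The names asm_answer and do_nothing are hypothetical, and the file-scope fragment assumes GCC or Clang on x86-64 Linux (ELF directives); elsewhere a plain C definition is used instead.

```c
/* Declared in C, defined by the file-scope basic asm fragment below. */
int asm_answer(void);

#if defined(__x86_64__) && defined(__linux__) && defined(__GNUC__)
/* Basic asm at file scope: only assembler text and directives, no operands.
   Registers are written with a single % because nothing is substituted. */
__asm__(
    ".pushsection .text\n"
    ".globl asm_answer\n"
    ".type asm_answer, @function\n"
    "asm_answer:\n"
    "    movl $42, %eax\n"   /* return value goes in eax */
    "    ret\n"
    ".size asm_answer, .-asm_answer\n"
    ".popsection\n"
);
#else
/* Fallback for other targets (assumption: same behavior in plain C). */
int asm_answer(void) { return 42; }
#endif

/* Basic asm inside a function: it cannot name C variables. */
static void do_nothing(void)
{
#if defined(__x86_64__) && defined(__GNUC__)
    __asm__("nop");
#endif
}
```

Extended asm could not be written at file scope like this, since its operand lists refer to C expressions that only exist inside a function.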
Analyzing the Instructions
Analysis focused on inline assembly fragments found with grep in the source code
Searched for strings containing “asm”
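The study used grep; the sketch below only mimics that substring search in C to show what it matches. The function name count_asm_lines is hypothetical and not part of the study's tooling.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the lines of `src` that contain the
   substring "asm", roughly what `grep -c asm` reports for a file. */
static size_t count_asm_lines(const char *src)
{
    size_t count = 0;
    const char *line = src;

    while (*line != '\0') {
        /* Find the end of the current line. */
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);

        /* Look for "asm" within this line only. */
        for (size_t i = 0; i + 3 <= len; i++) {
            if (memcmp(line + i, "asm", 3) == 0) {
                count++;
                break;
            }
        }

        line += len;
        if (*line == '\n')
            line++;
    }
    return count;
}
```

A plain substring match finds asm and __asm__ alike, but it also hits unrelated identifiers such as plasma, so the hits still need to be checked.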
Results
Quantitative Results
Projects using inline assembly
Density of inline assembly fragments
Number of fragments per project
Overview of the fragments
Analysis of the fragments
Instructions in a fragment
Duplicate fragments
Project domains
What is?
asm
Contribution
Findings
Most inline assembly fragments consist of a single instruction, and most projects contain only a few inline assembly fragments.
Since many projects use the same subset of inline assembly fragments, tool writers could support as much as 64.5% of these projects by implementing just 5% of x86-64 instructions
Inline assembly is used mostly for specific purposes: to ensure semantics on multiple cores, to optimize performance, to access functionality that is unavailable in C, and to implement arithmetic operations
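Two of these purposes can be sketched as single-instruction fragments. The examples are illustrative and not taken from the studied projects; the names full_fence and mul_u64_wide are hypothetical, and the asm paths assume GCC or Clang on x86-64, with portable C fallbacks elsewhere.

```c
#include <stdint.h>
#if !(defined(__x86_64__) && defined(__GNUC__))
#include <stdatomic.h>
#endif

/* Multi-core semantics: a full memory fence. The "memory" clobber also
   stops the compiler from reordering memory accesses across it. */
static inline void full_fence(void)
{
#if defined(__x86_64__) && defined(__GNUC__)
    __asm__ __volatile__("mfence" ::: "memory");
#else
    atomic_thread_fence(memory_order_seq_cst);
#endif
}

/* Arithmetic: 64x64 -> 128-bit unsigned multiply. Returns the low
   64 bits and stores the high 64 bits in *hi. */
static inline uint64_t mul_u64_wide(uint64_t a, uint64_t b, uint64_t *hi)
{
#if defined(__x86_64__) && defined(__GNUC__)
    uint64_t lo, high;
    /* `mul` multiplies rax by the operand and writes rdx:rax. */
    __asm__("mulq %3"
            : "=a"(lo), "=d"(high)
            : "a"(a), "rm"(b)
            : "cc");
    *hi = high;
    return lo;
#else
    /* Portable fallback: schoolbook multiplication on 32-bit halves. */
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;
    uint64_t p0 = a_lo * b_lo, p1 = a_lo * b_hi;
    uint64_t p2 = a_hi * b_lo, p3 = a_hi * b_hi;
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;
    *hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return (mid << 32) | (uint32_t)p0;
#endif
}
```

Both fragments are one instruction long, which matches the finding that most fragments consist of a single instruction.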
Goals
Investigates the use of x86-64 inline assembly in 1264 C projects from GitHub
Authors
Rigger M
Marr S
Kell S
Leopoldseder D