Mangling
This document gives an overview of mangling in compiler design and describes the mangling implementation used by the Gemstone compiler.
Abstract
According to Wikipedia [1], mangling refers to the following:
In compiler construction, name mangling (also called name decoration) is a technique used to solve various problems caused by the need to resolve unique names for programming entities in many modern programming languages.
It provides means to encode added information in the name of a function, structure, class or another data type, to pass more semantic information from the compiler to the linker.
Mangling changes the names of symbols such as functions or variables so that symbols with the same name but a different implementation or meaning (like variables in different modules) can coexist in the same object file. Without it, the linker would complain about multiple symbols with the same name, because names alone are not enough to uniquely identify certain symbols. Encoding additional information into the symbol's name solves this problem.
A simple example of how basic mangling can be achieved for functions that share a name but are located in different modules:
mod A {
    fn gee() { }
}
mod B {
    fn gee() { }
}
A simple solution for mangling would be to prefix every function's name with the name of its module, separated by an underscore. The first `gee` function would get the name `A_gee`, whereas the second function would become `B_gee`, avoiding a name clash.
Many such schemes exist in modern compilers, such as the mangling defined by the Itanium C++ ABI for C++ [3] and the scheme specified in RFC 2603 for Rust [2].
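The module-prefix idea described above can be sketched in a few lines of Python. The function name `mangle_with_module` is made up for this illustration and is not part of the Gemstone compiler.

```python
def mangle_with_module(module: str, function: str) -> str:
    """Prefix the function name with its module, separated by an underscore."""
    return f"{module}_{function}"


# The two functions named gee from modules A and B now get distinct symbols.
symbols = [mangle_with_module("A", "gee"), mangle_with_module("B", "gee")]
print(symbols)  # ['A_gee', 'B_gee']
```

One limitation of this sketch: a plain underscore separator is ambiguous when names themselves contain underscores. Module A_b with function c and module A with function b_c would both yield A_b_c.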
Available characters for symbol names
Taking into account both the GNU linker `ld` and Microsoft's toolchain, the following character classes can be used in symbol names on at least Windows and GNU/Linux [4, p. 84][5]:
Class | Symbols |
---|---|
letters | abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ |
underscore | _ |
period | . |
hyphen | - |
digits | 0123456789 |
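As a rough illustration, a mangled name could be checked against the character classes in the table above before it is emitted. The helper `is_portable_symbol` is a hypothetical name, and the check covers only the character set; it does not model any further rules a particular linker may impose, such as restrictions on the first character.

```python
import string

# Characters from the table above: letters, digits, underscore, period, hyphen.
ALLOWED = set(string.ascii_letters + string.digits + "_.-")


def is_portable_symbol(name: str) -> bool:
    """Return True if name is non-empty and uses only the listed characters."""
    return bool(name) and all(ch in ALLOWED for ch in name)


print(is_portable_symbol("A_gee"))   # True
print(is_portable_symbol("A::gee"))  # False, ':' is not in the table
```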
Specification
Common Prefix
Every mangled name is prefixed with `gsc` to denote the "Gemstone Compiler name mangling convention".
Functions
Data required for mangling functions:
- Function name
- Parameter names
- Parameter types
- Return type
- Parent modules
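The list above names the inputs but not the encoding, so the following is only a sketch under assumptions. It assumes a period-separated layout of prefix, parent modules, function name, each parameter's name and type, and finally the return type. The layout, the class names `Parameter` and `Function`, the helper `mangle_function`, and the type names `int` and `float` are all hypothetical and not taken from the Gemstone compiler.

```python
from dataclasses import dataclass, field

PREFIX = "gsc"  # common prefix required by the specification


@dataclass
class Parameter:
    name: str
    type_name: str


@dataclass
class Function:
    name: str
    return_type: str
    modules: list[str] = field(default_factory=list)  # outermost module first
    parameters: list[Parameter] = field(default_factory=list)


def mangle_function(fn: Function) -> str:
    """Join all required data with periods (assumed layout, not the real one)."""
    parts = [PREFIX, *fn.modules, fn.name]
    for param in fn.parameters:
        parts += [param.name, param.type_name]
    parts.append(fn.return_type)
    return ".".join(parts)


gee = Function("gee", "int", ["A"], [Parameter("x", "float")])
print(mangle_function(gee))  # gsc.A.gee.x.float.int
```

A real scheme would also need to keep components unambiguous, for example by length-prefixing each part as the Itanium C++ ABI does.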
Global variables
Data required for mangling global variables:
- Name
- Type
- Parent modules
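For global variables, the same kind of sketch applies. The period-separated layout of prefix, parent modules, name and type is an assumption made for illustration, as are the helper name `mangle_global` and the type name `int`.

```python
PREFIX = "gsc"  # common prefix required by the specification


def mangle_global(modules: list[str], name: str, type_name: str) -> str:
    """Join prefix, parent modules, name and type (assumed layout)."""
    return ".".join([PREFIX, *modules, name, type_name])


# Two globals with the same name in different modules get distinct symbols.
print(mangle_global(["A"], "counter", "int"))  # gsc.A.counter.int
print(mangle_global(["B"], "counter", "int"))  # gsc.B.counter.int
```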
References
\[1]: https://en.wikipedia.org/wiki/Name_mangling
\[2]: https://github.com/rust-lang/rfcs/blob/master/text/2603-rust-symbol-name-mangling-v0.md
\[3]: https://refspecs.linuxbase.org/cxxabi-1.86.html#mangling
\[4]: https://sourceware.org/binutils/docs-2.37/ld.pdf
\[5]: https://learn.microsoft.com/en-us/cpp/build/reference/decorated-names?view=msvc-170