Linking is the process of combining multiple object files into a single executable that can be used to start the application. It is the last phase of the build and operates on binary object files rather than on the source code files. Object files contain the machine code generated for the instructions in the source code, not assembly text. The assembler produces these object files (.o files), also known as “relocatable object code”. They contain relocation records that tell the linker which offsets must be patched with the final addresses of external symbols. At a basic level, the linker resolves the external references that the compiler generated on encountering external symbols (e.g. a function call) whose definitions were in different source files, and hence in different object files, at the end of compilation. Apart from linking the object files produced from the developer's own code, the linker also links the external libraries/APIs used in that code.
Linking can happen in two ways:
1. Static Linking – When applications are statically linked, all the external references are resolved at link time itself. A library function call is resolved by binding the call to the function’s definition contained in a different library or object file. Hence the executable produced includes the required code from all the object files (written by the developer or taken from external libraries such as libc, librt, libpthread etc). That code becomes part of the “.text” section of the executable file and ultimately of the “text” segment in the address space once the application is loaded into memory. There are two common ways of achieving static linking.
Feed all the .o files individually to the linker. ld is the linker program that gcc invokes on Linux; the commands below call it through the gcc driver, which also adds the C startup files and the standard library.
Compile: gcc -Wall -c source1.c source2.c source3.c
Link: gcc -o myapp source1.o source2.o source3.o
The above command will produce an executable named “myapp” after linking your source code object files.
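As a rough illustration of what the linker resolves here, the sketch below shows the kind of code that could sit behind those object files. The function names (square, sum_of_squares) and the split between source1.c and source2.c are assumptions made up for this example; both parts are shown in one listing so that it also compiles as a single file, with comments marking where the file boundary would be.

```c
/* ---- would live in source1.c ---- */

/* Declaration only: when source1.c is compiled on its own, the compiler
   emits an undefined reference to "square" plus relocation records for
   the call sites below. */
int square(int x);

int sum_of_squares(int a, int b)
{
    /* The target addresses of these calls are filled in by the linker. */
    return square(a) + square(b);
}

/* ---- would live in source2.c ---- */

/* The definition that the linker binds the calls above to. */
int square(int x)
{
    return x * x;
}
```

If the two parts really are kept in separate files and compiled with gcc -c, running nm on source1.o should show square marked U (undefined), and nm on source2.o should show it marked T (defined in the text section).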
Create an archived container (.a file on Linux) of multiple object files and tell the linker to link your application against the .a file. This procedure is very common in production systems where the number of source code files is huge. We can create and statically link against an archive of object files in the following manner:
Compile: gcc -Wall -c source1.c source2.c source3.c
Create an archive: ar -cvq libmyapp.a source1.o source2.o source3.o
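Link: gcc -static -o myapp -L. -lmyapp -lpthread -lm
The above command will produce an executable named “myapp” after linking the object files in libmyapp.a with the pthread library (libpthread.a) and the math library (libm.a). The -L. option tells the linker to look for libmyapp.a in the current directory, and -static makes it pick the .a archives instead of shared libraries.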
To check the contents of .a file, do: ar -t libmyapp.a. This will list all the object files contained in libmyapp.a
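The -o option specifies the name of the executable file and -l is prefixed to the names of the libraries to link against (-lmyapp looks for libmyapp.a or libmyapp.so). Note that -l on its own does not imply static linking: when both a shared and a static version of a library exist, gcc picks the shared one unless -static is given. The -L option adds a directory to the library search path. You can of course explore the gcc options to find various other useful ones.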
Once the executable has been produced by static linking, it can be deployed and executed on another system or location without requiring any libraries to be present at application startup or at any time during execution. All the dependencies in terms of external symbol resolution are taken care of at link time itself, and hence it is called static linking. Obviously the application can still depend on factors like platform/OS, which the developer or engineer needs to keep in mind. I do not wish to digress from the topic by discussing those issues.
On Windows, “.lib” files are the usual counterparts of the “.a” archive files on Unix/Linux.
2. Dynamic Linking – It is a deferred linking process in which not all the symbol references are resolved by the linker at link time. The executable is therefore only partially bound. As opposed to static linking, dynamic linking happens against shared object files (.so files) on Linux or “.dll” files on Windows, where DLL stands for dynamic-link library. More details on shared object and dll files will be presented in a separate post. Here the focus is on their usage in dynamic linking.
So what does the linker do at link time to defer the binding until run time?
The linker only verifies the API calls made in your code. In other words, it checks that the symbols needed to resolve these references are defined by the shared object modules you link against. In the executable file, the linker records which shared object modules are needed (and, optionally, where to search for them) so that the run-time linker can find them and bind the references in your code to the corresponding code in the shared object modules.
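The result of that recorded information can be observed from inside a running process. The following is a minimal sketch, assuming a Linux system where /proc/self/maps is available; the helper name count_shared_object_mappings is made up for this example. It prints every memory mapping of the current process whose line mentions a “.so” file, which are the shared objects the run-time linker located and mapped.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (Linux-only assumption: /proc/self/maps exists).
   Prints each mapping line that mentions a ".so" file and returns the
   number of such lines, or -1 if the maps file cannot be opened.
   One library usually appears in several mappings (code, data, ...),
   so the result counts mappings, not distinct libraries. */
int count_shared_object_mappings(void)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    char line[1024];
    int count = 0;
    while (fgets(line, sizeof line, maps) != NULL) {
        if (strstr(line, ".so") != NULL) {
            fputs(line, stdout);
            count++;
        }
    }

    fclose(maps);
    return count;
}
```

Called from a dynamically linked program, this should list at least the C library and the run-time linker itself; called from a fully static executable, it would normally print nothing and return 0.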
Dynamic Linking itself can be done in two different ways. Both of them are deferred ways of linking an application.
- Static or Load-time dynamic linking – As the name suggests, references are resolved by the loader when it loads the executable from disk into memory. The loader (on Linux, the run-time linker ld.so) uses the information contained in the executable to locate the shared object or .dll files. Once these libraries are located, they are mapped into the process’s address space and the external symbols in your code are bound to the corresponding code. Function references may be bound lazily, on the first call, rather than all at startup.
- Execution-time dynamic linking – In this case, the executable contains instructions that explicitly call a function (dlopen() on Linux, LoadLibrary() on Windows) to load the library module at run time. These calls return a handle for the library module, but no bindings are resolved yet. The program then asks for each symbol it needs by name (dlsym() on Linux, GetProcAddress() on Windows) and calls the function through the returned pointer, so a reference to a function “foo” is bound only at the point where the program looks it up. On Linux this work is done by the run-time linker (ld.so), which ships with the C library (glibc).
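Execution-time dynamic linking can be sketched in C roughly as follows. This is only an illustration under a few assumptions: a glibc-based Linux system where the math library is installed as “libm.so.6”, and a helper name (call_cos_at_runtime) invented for this example. It loads the library with dlopen(), looks up cos by name with dlsym(), and calls it through the returned pointer.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: loads the math library at run time, looks up
   "cos" by name and calls it. Returns 0 on success, -1 on failure. */
int call_cos_at_runtime(double x, double *result)
{
    /* "libm.so.6" is the usual name on glibc-based Linux (an assumption). */
    void *handle = dlopen("libm.so.6", RTLD_LAZY);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    dlerror(); /* clear any stale error before calling dlsym */

    /* The binding for "cos" is established here, not at link time.
       The cast idiom follows the POSIX recommendation for storing the
       void * returned by dlsym into a function pointer. */
    double (*cos_fn)(double);
    *(void **)(&cos_fn) = dlsym(handle, "cos");

    const char *err = dlerror();
    if (err != NULL) {
        fprintf(stderr, "dlsym failed: %s\n", err);
        dlclose(handle);
        return -1;
    }

    *result = cos_fn(x);
    dlclose(handle);
    return 0;
}
```

On older glibc versions a program using these functions has to be linked with -ldl; since glibc 2.34 they are part of libc itself. On Windows the equivalent calls would be LoadLibrary(), GetProcAddress() and FreeLibrary().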
Here, we had only a brief discussion of the static and dynamic linking processes. In the follow-up post, I will try to compare the two and also present more details on dynamic linking, as it is the more complex of the two.