Recent GNU* C library improvements

16 Feb, 2018

by Victor Rodriguez Bahena

As technology advancements continue, the core technology must be updated with new ideas that break paradigms and enable innovation. Linux* systems are based on two main core technologies: the Linux Kernel project and the GNU C Library (GLIBC) project. The GLIBC project provides the core libraries for the GNU system and GNU/Linux systems, as well as many other systems that use Linux as the kernel. These libraries provide interfaces that allow programs to manipulate and manage files, memory, threads and other operating system objects. The release of GLIBC version 2.27 marks a new step on the Linux technology roadmap, with major new features that will allow Linux developers to create and enhance applications. This blog post describes several key new features and how to use them.

Support for building static PIE executables

Let's start with the basics: an application (let's not consider interpreters or JIT) might do something really simple, such as opening a file on disk or showing something on the screen. For that code to run on the CPU, a programming language and a compiler are necessary. First, the compiler translates the source code into object code. This object code is then linked into a full program, using a linker tool. The result is a binary file, which can then be executed on the CPU.

On Linux, those binaries can use the ELF format (Executable and Linkable Format). An ELF file consists of segments that describe how to load data from disk and prepare it for runtime execution. Setting aside static binaries, when the dynamic loader sees those segments, it maps them into the virtual address space, using the mmap() system call. In other words, it converts predefined instructions into a memory image.[1]

Over the last few years, the Linux community has been working hard to produce more secure code. One way to achieve this is to make more use of Address Space Layout Randomization (ASLR)[2], and in particular of PIE (Position Independent Executable). PIE allows an executable to be loaded at a random address by compiling and linking the user’s code in a position independent manner. This is the usual case for libraries, since they can be loaded by different applications that have different dependencies. PIE applies the same technique to executables.

The GNU C Library can now be compiled with support for building static Position Independent Executables (PIE). These static PIE binaries are static executables, but they can be loaded at any address and provide additional security hardening benefits, at the cost of some memory and performance. When the library is built with the --enable-static-pie option, the resulting libc.a is usable with GCC 8 and above to create static PIE executables using the GCC option -static-pie. This feature is currently supported on i386, x86_64, and x32 with binutils 2.29 or later, and on AArch64 with binutils 2.30 or later.[3]

Transparent use of library packages

Recently, the Linux community has seen the launch of impressive technologies for the data center market; however, many developers do not take advantage of these advances immediately. Closing this gap requires a solution where the compiler can generate optimized binaries for multiple platform targets without annotating the source code. The glibc project introduced this capability back in August 2017 with glibc 2.26.

First, during process initialization, glibc uses hardware detection capabilities to determine the platform (dl_platform) where the operating system is running. It then builds an array of hardware capability names. Next, a shared library search path is created for the proper library selection at runtime. GLIBC creates this path by gathering information such as the CPU family, CPU model, and CPU kind during process startup.

On x86-64 systems, the “platform” can be the generic x86_64 value or, on processors that qualify, the xeon_phi or haswell processor family. This holds for the GLIBC 2.27 release and the current GLIBC 2.26 git branch. With this new linking capability, developers can generate optimized binaries for multiple platforms and save them in the correct search paths.

Per-thread malloc cache

Power consumption has been a problem for microprocessor architectures for a long time. As a solution, the microprocessor industry has shifted its focus from increasing clock frequencies to delivering increasing numbers of processor cores. Parallelism at multiple levels is now driving computing design. For software engineers, this represents a challenge: creating applications that benefit from increases in core counts as new generations of microprocessors emerge. One solution is multithreading. As a result, there is an urgent need for programming models and tools to support development of efficient multithreaded programs.

The GNU C Library project has long provided the foundational facilities needed for parallel thread programming, through mechanisms like pthread_create() in the POSIX thread library. The pthread_create() function starts a new thread in the calling process. Threads require less overhead than "forking" or spawning a new process, because all threads within a process share the same address space. Parallel programming technologies such as Message Passing Interface (MPI) and Parallel Virtual Machine (PVM) are used in a distributed computing environment, while threads are limited to a single computer system.

In the 2.26 release, a per-thread cache has been added to malloc. Access to the cache requires no locks and therefore significantly accelerates the fast path to allocate and free small amounts of memory.[4]

Conclusion

As the open source community continues to redefine the boundaries of what is possible for cloud-based Linux distributions, being on the bleeding edge of technology is mandatory. The Clear Linux* Project moves as fast as possible to provide these new cutting edge technologies for our users. The GLIBC project is a core technology that expands the possibilities for Clear Linux developers, who can design better solutions for their applications using new GLIBC technologies and IA features.

Bibliography:

  1. The 101 of ELF Binaries on Linux: Understanding and Analysis by Michael Boelen, 2015-09-28, https://linux-audit.com/elf-binaries-on-linux-understanding-and-analysis/
  2. https://blog.fpmurphy.com/2008/06/position-independent-executables.html
  3. https://sourceware.org/ml/libc-announce/2018/msg00000.html
  4. https://lists.gnu.org/archive/html/info-gnu/2017-08/msg00000.html