Tuesday, January 19, 2016

Using Boost C++ Library from C

This post is related to another post: Building C++ Application with Boost Library and Autotools in Linux. If you haven't know how to use Boost C++library in an autotools project, please read that post.

The purpose of this post is to explain what you need to do to use use Boost from your C language code. The following are the steps required to use Boost from C:
  1. Decide which part of Boost that you require in your C application.
  2. Wrap that part of Boost as "convenience" C library.
  3. Link your C application code to the "convenience" library. 
The steps above is easier said than done. Don't worry, I have provided a sample project over at github: https://github.com/pinczakko/boost_spsc_queue_c_wrapper. Just download the code and try to make sense of it.
DISCLAIMER
-----------
- The code assumes that the platform in which it runs has a working pthread implementation.
- The code is not production quality code. Use it at your own risk.
The sample code basically wraps Boost SPSC (single producer single consumer) lockfree queue into a convenience C library and the sample application links to the convenience library.

Anyway, the most important part of the code that you need to understand is the part that "returns" a C++ class as an "opaque" C structure. Probably, it's a quite alien concept. But, I assure you that this alien concept is the core of C<-->C++ interoperability. Below is the relevant code snippets:


//---- START spsc_interface.hpp file -----------------
#ifndef  SPSC_WRAPPER_H
#include "spsc_wrapper.h"
#endif //SPSC_WRAPPER_H 

class spsc_interface {
public:
    explicit spsc_interface();
    explicit spsc_interface(const spsc_interface &);
    ~spsc_interface() {};
//...

};
//---- END spsc_interface.hpp file -----------------

//---- START spsc_wrapper.h file -----------------
#ifdef __cplusplus
extern "C" {
#endif
//...
 typedef struct spsc_interface spsc_interface;

 spsc_interface *create_spsc_interface();

//---- END spsc_wrapper.h file -----------------

//---- START spsc_wrapper.cc file -----------------

#include "spsc_interface.hpp"
#include "spsc_wrapper.h"

#ifdef __cplusplus
extern "C" {
#endif

spsc_interface * create_spsc_interface()
{
    return new spsc_interface();
}

...
#ifdef __cplusplus
}
#endif
//---- END spsc_wrapper.cc file -----------------

As you see above, the wrapper function create_spsc_interface() returns an opaque object of type struct spsc_interface. In fact, this opaque object is a C++ class in disguise as you can see in the spsc_wrapper.cc snippet above. If you ask: How's that possible? Well, the answer lies in the linking process. Let's see the relevant snippets from src/Makefile.am:

wrapper_test_SOURCES = wrapper_test.c 
### Below is just a trick to force using C++ linker
nodist_EXTRA_wrapper_test_SOURCES = dummy.cc 

As you can see, we're not using a C linker to link the entire project, but we use a C++ linker (by tricking autotools ;) . A C++ linker "understand" the idiom used in spsc_wrapper.cc above.

Monday, January 18, 2016

Little Known C Language "Feature"s

There are several useful "feature"s in C language which are not widely known. Only those digging deep enough to C libraries or kernel code knows these "feature"s. These are some of them:
  • The C offsetof() macro. In many cases, this macro is utilized to help implement a sort of object oriented relationship (akin to sub-class<-->super-class in C++) between related C structures (or object if you prefer). You can use it to "obtain" pointer to the super-class given pointer to the sub-class and vice versa. Some useful explanation about this feature: Offsetof (Wikipedia), Greg KH explanation on offsetof in Linux Kernel code. Anyway, in some cases you can use compiler extension (built-ins) to achieve the same effect. In GCC there is __builtin_offsetof built-in that you can use. In fact, offsetof() translates to __builtin_offsetof in GCC.
  • const keyword can be used to designated input and output arguments in C/C++ function. You can use this keyword to force the compiler to notify you or outright balked out of compilation when some code inadvertently change the value of a function argument designated as const. Basically, you use it to enforce whether the function can modify the argument or not. An argument which strictly used as input (in a sense, must not be modified) can be designated as such with const. This is an excellent writeup on the matter: The C++ 'const' Declaration: Why & How.
  • Pointer aliasing issue, i.e. a situation where more than one pointer refer to the same object. Understanding this "feature" is important when you want to write high-performance C/C++ code. It's a quite complicated but important matter because it gives you the capability to "tell" the C/C++ compiler about your intention and avoid expensive memory accesses. See:
I think knowing these "feature"s is critical to create high quality and high performance C/C++ code.

Wednesday, January 13, 2016

Sanitizing Your C/C++ Code

GCC already has the capability to help with sanitizing your C/C++ code since version 4.9. This is probably one of GCC capability not known widely. Sanitizing in this context means cleaning up possible error, just in case you're not yet familiar with the terms. See: Defensive_programming for introduction.
This is a good introduction on GCC C/C++ "sanitizer":GCC Undefined Behavior Sanitizer – ubsan and for some details from GCC documentation (not really with exhaustive explanation, but complete) see: GCC Debugging Options -- see the section on -fsanitize=.. option.

Anyway, using the sanitizer option is not quite straightforward because you have to install the corresponding libraries for the sanitizer. These libraries will provide replacement for C library functions related to the sanitizer. For example: it provides malloc() replacement to provide memory leak detection. There are two libraries which you must install aside from GCC version >= 4.9, i.e. libasan (address sanitizer library) and libubsan (undefined behavour sanitizer library). I'll give you an example here in CentOS 7.

CentOS 7 comes with GCC 4.8 by default. Therefore, sanitizer support (especially libubsan) is missing. Therefore, we need to upgrade it. To do so, update the CentOS  repo data with ones for Fedora 23, so that we can upgrade to GCC 5.1.1, like so:
# cat << EOF > /etc/yum.repos.d/Fedora-Core23.repo
[warning:fedora]
name=fedora
mirrorlist=http://mirrors.fedoraproject.org/mirrorlist?repo=fedora-23&arch=$basearch
enabled=1
gpgcheck=1
gpgkey=https://getfedora.org/static/34EC9CBA.txt
EOF
# yum update gcc g++
# yum install libasan
# yum install libubsan
Now you have updated the C and C++ compiler and also installed the required libubsan and libasan. You can proceed to use -fsanitize option in GCC to add sanitizer option in your code.
Credits go to XakRu and sm1Ly for the script/command to add Fedora repo to CentOS (see: How To install gcc 5.2 on centos 7.1? [closed]).

Now let's proceed to learn how to use the flags in our build script. If you are using autotools, you can do these steps:
  1. Add -fsanitize=.. to your CFLAGS and/or CXXFLAGS
  2. Add -fsanitize=.. to your LDFLAGS as needed. Taking into account the GCC Debugging Options explanation.
  3. In some cases, you need to remove memory allocation function checks from configure.ac
This is an example build script (named build_debug.sh) which invokes configure script generated by autotools:
../configure CFLAGS="-DDEBUG -g -O0 -fsanitize=undefined -fsanitize=leak" \
 CXXFLAGS="-DDEBUG -g -O0 -fsanitize=undefined -fsanitize=leak" \
 LDFLAGS="-fsanitize=leak" --enable-debug \
 && make V=1
The build always failed when I enabled memory allocation function checks in configure.ac. Therefore, it needs to be disabled like so:
     ##AC_FUNC_MALLOC
     ##AC_FUNC_REALLOC
This happens because AC_FUNC_MALLOC and AC_FUNC_REALLOC are based on a runtime tests. See:https://github.com/LLNL/ior/issues/4

Hopefully, this is helpful for those developing software in C/C++ with GCC.