Essential Macros for C Programming
I used C extensively from 1989 to 1992. After that it was C++ for a long period. In 1999 I started coding in Java, and later had to use C# for Windows applications.
In mid-2005 I got involved in a project where I had to use C, and I enjoyed the experience thoroughly. I could reuse some of the C macros I had written back in the early 1990s, and I also got the opportunity to implement a few that I had conceived but never written.
IMHO, knowledge of C++ and Java enables one to write better C code and even allows OOP concepts to be used. Expertise in C/C++ lets one enjoy the absence of memory management issues while coding in Java and C#, truly appreciate the convenience of the Garbage Collector, and at the same time keep a watchful eye on application performance when the Garbage Collector thread starts executing. Though I do not miss the pointers of C or the auto pointers of C++, I definitely miss operator overloading in Java. I am glad that operator overloading is supported in C#, and its delegate feature certainly deserves a thumbs up.
Let me start sharing the C macros with you; I hope you find them useful and informative.
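Java and C# arrays expose their length directly (the “length” field in Java, the “Length” property in C#). We can have similar functionality in C, but it works only for fixed-size arrays whose definition is visible to the compiler, not for dynamic arrays created using malloc or calloc, and not for arrays that have been passed to a function as pointers.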
    /// Obtain the number of elements in the given C array
    #define GET_ARRAY_LEN( arrayName ) (sizeof( arrayName ) / sizeof(( arrayName)[ 0 ] ))
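Here is a minimal usage sketch, assuming a small fixed-size table. The names kStateNames, state_count and sum_primes are made up for illustration and do not come from the original project.

```c
#include <assert.h>
#include <stddef.h>

#define GET_ARRAY_LEN( arrayName ) (sizeof( arrayName ) / sizeof(( arrayName)[ 0 ] ))

/* Hypothetical lookup table, used only for illustration. */
static const char *const kStateNames[] = { "START", "TAG", "ATTR", "TEXT", "END" };

/* Returns the number of entries in the fixed-size table. */
size_t state_count(void)
{
    return GET_ARRAY_LEN(kStateNames);
}

/* Sums a local fixed-size array; the loop bound follows the array
   definition, so adding an element needs no other change. */
int sum_primes(void)
{
    int primes[] = { 2, 3, 5, 7, 11 };
    int total = 0;
    size_t i;

    for (i = 0; i < GET_ARRAY_LEN(primes); i++)
        total += primes[i];
    return total;
}
```

Note that if primes were passed to another function, the parameter would be a plain pointer and the macro would silently compute sizeof(int *) / sizeof(int) instead of the element count.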
MIN and MAX are commonly used macros, but in some environments they are not defined. It is handy to supply them when they are missing.
    /// Return min of two numbers. Commonly used but never defined as part of standard headers
    #ifndef MIN
    #define MIN( n1, n2 ) ((n1) > (n2) ? (n2) : (n1))
    #endif

    /// Return max of two numbers. Commonly used but never defined as part of standard headers
    #ifndef MAX
    #define MAX( n1, n2 ) ((n1) > (n2) ? (n1) : (n2))
    #endif
Sometimes when we allocate a memory pool, we might want its size to be rounded up to a multiple of some power of two (8, 16, 4096 and so on), and the following macros are useful in such cases.
    // Aligns the supplied size to the specified PowerOfTwo
    #define ALIGN_SIZE( sizeToAlign, PowerOfTwo ) \
        (((sizeToAlign) + (PowerOfTwo) - 1) & ~((PowerOfTwo) - 1))

    // Checks whether the supplied size is aligned to the specified PowerOfTwo
    #define IS_SIZE_ALIGNED( sizeToTest, PowerOfTwo ) \
        (((sizeToTest) & ((PowerOfTwo) - 1)) == 0)
The second macro is equivalent to ((sizeToTest % PowerOfTwo) == 0). It avoids the modulo operator and accomplishes the same thing with a bitwise operator. The bitwise AND can be used to obtain the remainder only when the divisor is an exact power of two.
The first macro is equivalent to (sizeToAlign + PowerOfTwo - 1) / PowerOfTwo * PowerOfTwo. It avoids both the integer division and the multiplication. Modern optimizing compilers should be able to do the same in most cases, but why take chances when we can do it without much sweat?
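The two equivalences can be checked mechanically. The following is a small sketch, not part of the original code base: align_by_division and macros_match_arithmetic are hypothetical helpers that compare the bitwise macros against the division and modulo forms.

```c
#include <assert.h>
#include <stddef.h>

#define ALIGN_SIZE( sizeToAlign, PowerOfTwo ) \
    (((sizeToAlign) + (PowerOfTwo) - 1) & ~((PowerOfTwo) - 1))
#define IS_SIZE_ALIGNED( sizeToTest, PowerOfTwo ) \
    (((sizeToTest) & ((PowerOfTwo) - 1)) == 0)

/* Reference version using integer division and multiplication. */
static size_t align_by_division(size_t size, size_t pow2)
{
    return (size + pow2 - 1) / pow2 * pow2;
}

/* Returns 1 if both macros agree with the arithmetic versions for every
   size in [0, limit); pow2 must be an exact power of two. */
int macros_match_arithmetic(size_t limit, size_t pow2)
{
    size_t n;

    for (n = 0; n < limit; n++) {
        if (ALIGN_SIZE(n, pow2) != align_by_division(n, pow2))
            return 0;
        if (IS_SIZE_ALIGNED(n, pow2) != ((n % pow2) == 0))
            return 0;
    }
    return 1;
}
```

For example, ALIGN_SIZE(13, 8) gives 16 and ALIGN_SIZE(16, 8) stays at 16. One caution: it is safest to give both arguments the same integer type. If the size is 64 bits wide and the power of two is a 32-bit unsigned value, the inverted mask is zero-extended and clears the upper half of the size.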
Macros are the natural choice when we need the offset of a field that is a member of a structure, or the address of a field given its offset.
Other useful macros related to struct are ALLOC_STRUCT and INIT_STRUCT. I found that they make the code more readable, need fewer keystrokes and reduce the chance of errors.
    // Macros related to "struct"

    /// Obtain the offset of a field in a struct
    #define GET_FIELD_OFFSET( StructName, FieldName ) \
        (( short )( long )(&((StructName *)NULL)->FieldName))

    /// Obtain the struct element at the specified offset given the struct ptr
    /// (pStruct is parenthesized so that expressions can be passed safely)
    #define GET_FIELD_PTR( pStruct, nOffset ) \
        (( void *)((( char *)(pStruct)) + (nOffset)))

    /**
      Allocates a structure given the structure name and returns a pointer to
      that allocated structure.
      The main benefit is there is no need to cast the returned pointer, to the
      structure type.
      @param StructName the name of the structure
      @return pointer to allocated structure if successful, else NULL.
      @see INIT_STRUCT
    */
    #define ALLOC_STRUCT( StructName ) ((StructName *)malloc( sizeof( StructName )))

    /**
      Initializes the given structure to zeroes using memset().
      @param pStruct the pointer to structure that has to be initialized
      @see ALLOC_STRUCT
    */
    #define INIT_STRUCT( pStruct ) (memset( (pStruct), '\0', sizeof( *(pStruct) )))
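To show how the four macros fit together, here is a short sketch. The Token structure and the roundtrip_line_field function are invented for this example; the macros are the ones listed above, and the offset is cross-checked against the standard offsetof() from <stddef.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define GET_FIELD_OFFSET( StructName, FieldName ) \
    (( short )( long )(&((StructName *)NULL)->FieldName))
#define GET_FIELD_PTR( pStruct, nOffset ) \
    (( void *)((( char *)(pStruct)) + (nOffset)))
#define ALLOC_STRUCT( StructName ) ((StructName *)malloc( sizeof( StructName )))
#define INIT_STRUCT( pStruct ) (memset( (pStruct), '\0', sizeof( *(pStruct) )))

/* Hypothetical record type, used only for this example. */
typedef struct Token {
    int  type;
    long line;
    char text[16];
} Token;

/* Allocates a zeroed Token, writes the "line" field through its offset,
   and returns the value read back through normal member access.
   Returns -1 if the allocation fails. */
long roundtrip_line_field(long value)
{
    long result;
    Token *tok = ALLOC_STRUCT(Token);

    if (tok == NULL)
        return -1;
    INIT_STRUCT(tok);
    *(long *)GET_FIELD_PTR(tok, GET_FIELD_OFFSET(Token, line)) = value;
    result = tok->line;
    free(tok);
    return result;
}
```

Because GET_FIELD_OFFSET casts its result to short, it is only suitable for structures whose field offsets fit in a short.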
Next come a few macros that are simple, yet useful. Before that, I would like to mention my favourite line from the most adored and renowned book, “The C Programming Language” by Brian W Kernighan and Dennis M Ritchie. The quote may not be exact, but I shall try to convey what I have understood.
“Simple is always Elegant. Elegant is always Simple. But no simpler.”.
For those who are new to C, “The C Answer Book” is a must-have; it contains solutions to the exercises in the first book. A few more books that I have enjoyed reading are “C Traps and Pitfalls”, “Writing Solid Code”, “Programming Pearls”, “More Programming Pearls”, “Data Structures and C Programmes”, etc.
Here are those simple macros for checking whether a given number is odd or even and whether a number falls between two values (both inclusive).
    /// Determine whether the given signed or unsigned integer is odd.
    #define IS_ODD( num ) ((num) & 1)

    /// Determine whether the given signed or unsigned integer is even.
    #define IS_EVEN( num ) (!IS_ODD( (num) ))

    /**
      Determine whether the given number is between the other two numbers
      (both inclusive).
    */
    #define IS_BETWEEN( numToTest, numLow, numHigh ) \
        ((unsigned char )((numToTest) >= (numLow) && (numToTest) <= (numHigh)))
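A brief usage sketch follows. The helper names is_digit_char and count_odd are hypothetical, and the odd test on negative numbers assumes the usual two's complement representation.

```c
#include <assert.h>

#define IS_ODD( num ) ((num) & 1)
#define IS_EVEN( num ) (!IS_ODD( (num) ))
#define IS_BETWEEN( numToTest, numLow, numHigh ) \
    ((unsigned char )((numToTest) >= (numLow) && (numToTest) <= (numHigh)))

/* Returns 1 if c is an ASCII decimal digit, else 0. */
int is_digit_char(int c)
{
    return IS_BETWEEN(c, '0', '9');
}

/* Counts the odd values in the closed range [low, high]. */
int count_odd(int low, int high)
{
    int n;
    int count = 0;

    for (n = low; n <= high; n++)
        if (IS_ODD(n))
            count++;
    return count;
}
```

As with MIN and MAX, IS_BETWEEN evaluates its first argument twice, so it should be given a plain variable or constant rather than an expression with side effects.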
The following macro is adapted from MFC (Microsoft Foundation Classes) and suppresses compiler warnings about unused parameters in a function body.
    /**
      Use this macro for unused parameters right in the beginning of a function body
      to suppress compiler warnings about unused parameters.
      This is mainly meant for function parameters and not for unused local variables.
    */
    #define UNUSED( ParamName ) \
        (( void )(0 ? ((ParamName) = (ParamName)) : (ParamName)))
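The typical case is a callback whose signature is dictated by someone else. In this sketch the EventHandler type and the functions count_event, dispatch and dispatch_count_event are all invented for illustration; only the UNUSED macro comes from the listing above.

```c
#include <assert.h>
#include <stddef.h>

#define UNUSED( ParamName ) \
    (( void )(0 ? ((ParamName) = (ParamName)) : (ParamName)))

/* Hypothetical callback signature imposed by some library. */
typedef int (*EventHandler)(int eventId, void *userData);

/* This handler needs only eventId, but must still accept userData. */
static int count_event(int eventId, void *userData)
{
    UNUSED(userData);
    return eventId + 1;
}

/* Invokes a handler the way the hypothetical library would. */
int dispatch(EventHandler handler, int eventId)
{
    return handler(eventId, NULL);
}

/* Convenience wrapper that dispatches to count_event. */
int dispatch_count_event(int eventId)
{
    return dispatch(count_event, eventId);
}
```

Since the macro contains an assignment to the parameter, it does not compile for a parameter that is itself declared const.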
Sometimes we need open and close curly braces to form a block without “if”, “for” or “while”. In C, this is useful when we need a local array of some significant size for only a few lines of code. The closing curly brace takes the array out of scope, so its lifetime ends there and the compiler is free to reuse the stack memory it occupied (whether the space is reclaimed immediately depends on the compiler).
In the case of C++, in addition to a local array, we can have instances of several classes as local variables. The closing curly brace takes them out of scope, which calls the destructors of all those objects and thereby releases the resources they hold.
In such cases I found it extremely useful to use the macros BEGIN_BLOCK and END_BLOCK instead of bare curly braces, as they improve readability and clearly broadcast the intention of releasing resources.
    /**
      To open a "C/C++" block without using any construct such as "if", "for",
      "while", etc.
      The main purpose of this macro is to improve readability and to make the
      intentions clear in the code.
      This is useful if some local variables are required only for few lines.
      In such cases putting such local variables in a block causes the local
      variables to go out of scope and hence reclaim their memory once the end
      of block is reached.
    */
    #define BEGIN_BLOCK {

    /**
      Closes a "C/C++" block opened using BEGIN_BLOCK.
    */
    #define END_BLOCK }
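Here is a small sketch of the intended usage in plain C. The function format_pair and its scratch buffer are made up for this example; the point is only that scratch is visible between BEGIN_BLOCK and END_BLOCK and nowhere else.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define BEGIN_BLOCK {
#define END_BLOCK   }

/* Formats "key=value" into out and returns the resulting length.
   outSize is the capacity of out and must be at least 1. */
size_t format_pair(const char *key, int value, char *out, size_t outSize)
{
    size_t length;

    BEGIN_BLOCK
        char scratch[256];   /* needed only for the next few lines */

        snprintf(scratch, sizeof scratch, "%s=%d", key, value);
        strncpy(out, scratch, outSize - 1);
        out[outSize - 1] = '\0';
    END_BLOCK

    /* scratch is out of scope here; referring to it would not compile. */
    length = strlen(out);
    return length;
}
```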
Now let me present the endian-related macros, which I immensely enjoyed implementing. Most CISC machines (x86 being the familiar example) use a Little-Endian memory architecture, while many RISC architectures have traditionally used Big-Endian. The words Little-Endian and Big-Endian are borrowed from Jonathan Swift’s classic, “Gulliver’s Travels”. They are the names of the two warring factions of Lilliputians figuring in that timeless satire.
Endianness is of concern only when we persist data to a file and that file can be used by an application running on an architecture with a different endianness. In other words, if a number variable whose size is 16 bits or larger is written to a file and read back on the same machine, there is no problem. But if the file has to be read on a machine whose endianness is different, rearranging the bytes becomes necessary. This applies to all number types larger than one byte, and it includes both integer and floating point datatypes.
Java by default uses Big-Endian format while writing to or reading from a file. So if the same file is read on different machines only by Java applications, there is no problem. However, if some other language such as C or C++ is used to read a file written by a Java application, or vice versa, then one has to pay attention to the endianness.
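As an aside, a C program can also decode a Big-Endian value one byte at a time, which works the same on any host. This is a different technique from the conversion macros in this article, and the names read_be32 and write_be32 are invented for this sketch. It assumes the file holds 32-bit values with the most significant byte first, which is how Java's DataOutputStream.writeInt() emits them.

```c
#include <assert.h>
#include <stdint.h>

/* Decodes a 32-bit unsigned value stored most significant byte first. */
uint32_t read_be32(const unsigned char *bytes)
{
    return ((uint32_t)bytes[0] << 24) |
           ((uint32_t)bytes[1] << 16) |
           ((uint32_t)bytes[2] << 8)  |
            (uint32_t)bytes[3];
}

/* Encodes a 32-bit unsigned value most significant byte first. */
void write_be32(uint32_t value, unsigned char *bytes)
{
    bytes[0] = (unsigned char)(value >> 24);
    bytes[1] = (unsigned char)(value >> 16);
    bytes[2] = (unsigned char)(value >> 8);
    bytes[3] = (unsigned char)value;
}
```

The shifts operate on values, not on memory layout, so no endianness test is needed with this approach. The price is one function call or a few shifts per number.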
Let us get to the macros. The first two macros, IS_LITTLE_ENDIAN and IS_BIG_ENDIAN, return TRUE or FALSE depending on the current machine’s memory architecture. Then comes the most important macro, IS_DEFAULT_ENDIAN, where we decide which of the two we want to use as the default.
After that come the number conversion macros, which rearrange the bytes of a supplied number only if the current machine’s endianness differs from the default we have chosen. These conversion macros should be used every time before a number is written to a file and every time just after a number has been read from a file.
What these macros accomplish is this: there is no need to supply the machine endianness as a compiler option, the same code can be compiled without change on different machines, there is no need for separate macros for reading and writing numbers, and everything is done by macros without any function call.
Please feel free to use the social buttons displayed below this article. I would appreciate your comments, opinions or suggestions, and in the meantime I shall start preparing another blog post.
    /**
      Determines whether the memory architecture of current processor is LittleEndian.
      Optimizing compiler should be able to reduce this macro to a boolean constant
      TRUE or FALSE.
      @return 1 if LittleEndian, else 0
    */
    #define IS_LITTLE_ENDIAN() (((*(short *)"21") & 0xFF) == '2')

    /**
      Determines whether the memory architecture of current processor is BigEndian.
      Optimizing compiler should be able to reduce this macro to a boolean constant
      TRUE or FALSE.
      @return 1 if BigEndian, else 0
    */
    #define IS_BIG_ENDIAN() (((*(short *)"21") & 0xFF) == '1')

    /**
      Change this macro to change the default endian format. In this example,
      the default endian format is Little Endian.
      Optimizing compiler should be able to reduce this macro to a boolean constant
      TRUE or FALSE.
      @return 1 if the current endian format is the default format, else 0
    */
    #define IS_DEFAULT_ENDIAN() IS_LITTLE_ENDIAN()

    /**
      Reverses the bytes of the supplied byte array if the current machine is
      not default endian.
      Wrapped in do { } while (0) so that the macro behaves as a single
      statement, even when it is followed by an "else".
    */
    #define REVERSE_BYTE_ARRAY( ByteArray, Size ) \
        do { \
            if (!IS_DEFAULT_ENDIAN()) \
            { \
                int _i, _j; \
                char _cTmp; \
                for (_i = 0, _j = (Size) - 1; _i < _j; _i++, _j--) \
                { \
                    _cTmp = (( char *)(ByteArray))[ _i ]; \
                    (( char *)(ByteArray))[ _i ] = (( char *)(ByteArray))[ _j ]; \
                    (( char *)(ByteArray))[ _j ] = _cTmp; \
                } \
            } \
        } while (0)

    /**
      If the current machine is not default endian, re-arranges the bytes of the
      given number. Does nothing if the current machine is default endian.
      Use this for number variable whose size is greater than 32 bits.
      For 16 and 32 bit numbers CONVERT_NUM16() and CONVERT_NUM32() are recommended.
    */
    #define CONVERT_NUM( n ) REVERSE_BYTE_ARRAY( (&(n)), sizeof( n ))

    /**
      If the current machine is not default endian, re-arranges the bytes of the
      given 16-bit number. Does nothing if the current machine is default endian.
    */
    #define CONVERT_NUM16( n ) ((void)(IS_DEFAULT_ENDIAN() ? (n) \
        : ((n) = ((((n) & 0x00FF) << 8) | (((n) & 0xFF00) >> 8)))))

    /**
      If the current machine is not default endian, re-arranges the bytes of the
      given 32-bit number. Does nothing if the current machine is default endian.
    */
    #define CONVERT_NUM32( n ) ((void)(IS_DEFAULT_ENDIAN() ? (n) \
        : ((n) = ((((n) & 0x000000FF) << 24) | (((n) & 0x0000FF00) << 8) \
        | (((n) & 0xFF0000) >> 8) | (((n) & 0xFF000000) >> 24)))))

    /**
      If the current machine is not default endian, re-arranges the bytes of the
      given 32-bit floating point number. Does nothing if the current machine is
      default endian.
    */
    #define CONVERT_FLOAT( f ) CONVERT_NUM( f )

    /**
      If the current machine is not default endian, re-arranges the bytes of the
      given 64-bit floating point number. Does nothing if the current machine is
      default endian.
    */
    #define CONVERT_DOUBLE( d ) CONVERT_NUM( d )

    /**
      If the current machine is not default endian, re-arranges the bytes of the
      given 64-bit number. Does nothing if the current machine is default endian.
    */
    #define CONVERT_NUM64( n ) CONVERT_NUM( n )
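The following sketch shows the write-then-read pattern with CONVERT_NUM32. The functions store_num32 and load_num32 are hypothetical and use a byte buffer in place of a real file. The three macros are copied from the listing above; like the originals, the detection macro relies on reading a string literal through a short pointer, which is an assumption about the compiler and platform rather than something the C standard promises.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IS_LITTLE_ENDIAN()  (((*(short *)"21") & 0xFF) == '2')
#define IS_DEFAULT_ENDIAN() IS_LITTLE_ENDIAN()
#define CONVERT_NUM32( n ) ((void)(IS_DEFAULT_ENDIAN() ? (n) \
    : ((n) = ((((n) & 0x000000FF) << 24) | (((n) & 0x0000FF00) << 8) \
    | (((n) & 0xFF0000) >> 8) | (((n) & 0xFF000000) >> 24)))))

/* Copies value into out[4] in the default (Little-Endian) file order. */
void store_num32(uint32_t value, unsigned char out[4])
{
    CONVERT_NUM32(value);              /* no-op on a Little-Endian host */
    memcpy(out, &value, sizeof value);
}

/* Reads a value written by store_num32() back into host order. */
uint32_t load_num32(const unsigned char in[4])
{
    uint32_t value;

    memcpy(&value, in, sizeof value);
    CONVERT_NUM32(value);              /* the same macro serves reading */
    return value;
}
```

Whatever the host, the buffer ends up holding the least significant byte first, because that is the default chosen in IS_DEFAULT_ENDIAN. An unsigned type is used on purpose: with a signed int, shifting a byte into the sign bit would not be well defined.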
My Reply to some of the points raised in the comments made at reddit, LinkedIn and CodeProject.
Posted on: May 30, 2013
All the macros presented have been tested using Microsoft Visual Studio 2010, which was my development environment, and also with both 32-bit and 64-bit GNU compilers under Linux, which was the production environment.
Here are my replies to the various comments on all the macros presented in this post.
The explanation for GET_ARRAY_LEN clearly mentions that the macro can be used only for fixed-size arrays and not for dynamic memory or pointers. Passing a pointer to the macro is an incorrect usage rather than something wrong with the macro itself. We did have several fixed arrays for storing XML states, tokens, pointers to functions, etc., and the macro was quite helpful in such cases. I am grateful for the comment on the CodeProject site pointing out the _countof() macro, which Microsoft's compiler headers provide for the same purpose.
The MIN / MAX macros have the problem of multiple evaluation if expressions with side effects or function calls are used as arguments. This is a well-known fact among C/C++ programmers. Andrew Koenig’s classic “C Traps & Pitfalls”, which I have mentioned in the post, elucidates several such landmines. Macros in C or C++ carry these potential dangers and we need to use them with utmost care.
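For readers who have not met the multiple-evaluation problem yet, this sketch makes it visible by counting calls. The names g_calls, next_value, max_int and the two calls_made_by_* functions are invented for the demonstration; max_int shows one common way around the problem, an ordinary function.

```c
#include <assert.h>

#ifndef MAX
#define MAX( n1, n2 ) ((n1) > (n2) ? (n1) : (n2))
#endif

static int g_calls = 0;   /* counts how often next_value() runs */

static int next_value(void)
{
    g_calls++;
    return 10;
}

/* A function evaluates each argument exactly once. */
static int max_int(int a, int b)
{
    return a > b ? a : b;
}

/* Uses the macro with a function call as argument and reports how many
   times that function actually ran. */
int calls_made_by_macro(void)
{
    int result;

    g_calls = 0;
    result = MAX(next_value(), 5);   /* next_value() appears twice after expansion */
    (void)result;
    return g_calls;
}

/* Same experiment with the function version. */
int calls_made_by_function(void)
{
    int result;

    g_calls = 0;
    result = max_int(next_value(), 5);
    (void)result;
    return g_calls;
}
```

With the macro, next_value() runs once for the comparison and once more to produce the result, so it is called twice; with the function it is called once.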
Now let me tell you the real story about MIN & MAX. I did not write them at first, as they were available in the header files of Microsoft Visual Studio. I was told that at a particular customer location using Linux and a different compiler version, these macros did not exist and the code failed to compile. That’s when I included them, and they get defined only if not already present. Now coming to the most important fact: the code in the header files installed by Visual Studio is exactly the same. So folks, all the C/C++ programmers around the world who use MIN/MAX under Visual Studio are executing that very code. One just has to be awake and use them carefully.
I am quite thankful for some of the most useful comments related to the MIN/MAX macros made on LinkedIn as well as on reddit. There was a detailed discussion about avoiding multiple evaluations, use of the stringify operator in a CONCATENATE macro, mention of __typeof__ & __COUNTER__ and the definition of a MAKE_UNIQUE macro. The information provided need not be limited to MIN/MAX and could be put to good use in several other instances as well.
ALIGN_SIZE uses simple addition and yes, if a very large value is provided the addition will overflow. Providing a very large value, or a negative value, is erroneous input and an incorrect usage. In such cases the fault lies in the usage and definitely not in the definition of the macro. We had to build several data-structure utilities such as HashMap, LinkList, Vector, Dynamic String, etc., which needed the capability to expand their memory dynamically as and when required. After computing the actual required size, they all rounded it up to a power-of-two boundary, and the macro was used for all such requirements.
I would also like to mention that my articles are aimed at those who might be new to C, perhaps in their early college years. For those budding programmers I am just trying to provide some useful information based on my long career in software engineering. That is the reason I mentioned those books: it took me quite a few years to come across them, and I want to introduce such gems to aspiring enthusiasts in one simple blog post.
The idea was the same in the article “Innocuous looking Evil Devil”, where I gathered several different types of common bugs in one place, bugs that I had become aware of only over a period of time, either by encountering and fixing them or by reading about them.
Even an elementary fact, such as an integer with a non-zero value getting doubled when left shifted by one, has to be taught or read before one becomes aware of it. It cannot simply dawn on the mind without reading about or understanding the concept. This was also one of the reasons for presenting the ALIGN_SIZE macro, which illustrates an interesting usage of bitwise operators.
I was definitely not aware of offsetof(), which has been part of the C standard since C89 (it is declared in <stddef.h>), and hence used GET_FIELD_OFFSET. Visual Studio 2010 implements it in exactly the same way. I think the macro served its purpose, first by drumming in the existence of offsetof() through several comments, and secondly by bringing out the amount of misunderstanding surrounding it. There were a few mentions of the macro trying to dereference a NULL pointer and how it could have “some undefined behavior”. With the compilers I tested, no memory access takes place even though it appears so: the expression only takes the address of a member, and the compiler folds it into a constant at compile time (the preprocessor itself does not evaluate it). The runtime code just contains the offset value as a numeric constant. Where portability matters, offsetof() remains the safer spelling.
There was a suggestion to use a name in all lowercase letters. I definitely recommend all caps for macro names, so that when we use one in the code, everything about macros, with all their dangers and pitfalls, flashes past our mind at that instant and helps us to be cautious.
While allocating memory for a structure, assigning the void* pointer returned by malloc did produce warnings with some compilers. Hence the cast provided by ALLOC_STRUCT does some service by avoiding those warnings. Another important service of the macro: although a void* can be assigned to any pointer datatype, the cast inside the macro means that assigning the result to a pointer of a different structure type no longer passes silently.
ALIGN_SIZE was criticized because it cannot protect against overflow when an erroneous, very large input is provided. If ALLOC_STRUCT does protect against assigning the returned pointer to a different type, why is it being frowned upon?
With ALLOC_STRUCT, one has to type less, and the code becomes more readable. The macro can also be tweaked to update a counter of how many structures the program allocates in total, the number of allocations of each structure, peak usage, etc. This facility was indeed implemented to get memory usage details at runtime.
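One possible way to add such a counter is sketched below. This is not the code from the original project: g_structAllocCount, counted_malloc, the Node type and allocate_and_count are all names assumed for this illustration, and only the total count is tracked.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical global counter of structure allocations. */
static unsigned long g_structAllocCount = 0;

/* Counts the request, then forwards it to malloc(). */
static void *counted_malloc(size_t size)
{
    g_structAllocCount++;
    return malloc(size);
}

/* Same interface as ALLOC_STRUCT in the article, routed through the counter. */
#define ALLOC_STRUCT( StructName ) \
    ((StructName *)counted_malloc( sizeof( StructName )))

/* Hypothetical structure used only for this example. */
typedef struct Node {
    int          value;
    struct Node *next;
} Node;

/* Allocates and frees howMany nodes, returning the number of allocation
   requests that were counted. */
unsigned long allocate_and_count(int howMany)
{
    int i;

    g_structAllocCount = 0;
    for (i = 0; i < howMany; i++) {
        Node *node = ALLOC_STRUCT(Node);
        free(node);   /* free(NULL) is harmless if malloc failed */
    }
    return g_structAllocCount;
}
```

Because every call site already goes through the macro, the callers need no change when the counting is switched on or off.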
If somebody objects to the name INIT_STRUCT, I can’t say much about it, and I have no objection if it is renamed to anything of one’s choice. It saves typing, and more importantly it uses code that compiled cleanly with all the compilers with which it was tested. I have seen three different ways of writing the second argument of memset to zero out memory. One uses NULL, another uses the integer constant zero and the third is the null character used by the macro. Thus the macro provides consistency and lets us easily avoid any compiler warnings caused by a particular usage of memset.
The UNUSED macro provided by MFC is simply “(void)ParamName”, which unfortunately generated compiler warnings under Linux. Hence the attempt to fool the compiler into believing that some assignment is going on. An optimizing compiler should not generate any runtime code for the macro. Even so, the ternary operator ensures that the self-assignment never executes at runtime. Nobody deliberately writes a C function with unused parameters. But when the function signature or prototype is imposed, as is the case with a callback function, this macro is handy for suppressing unwanted compiler warnings.
During the early nineties I was part of a project where some modules always used to generate compiler warnings. I was told that those warnings were “known” and was asked to pay attention only if any new warnings happen. That was the last time and since then it is a big NO NO for any kind of warnings under any environment, be it C, C++, Java or C#. Every build must end with zero errors and zero warnings. This is also one of the main reasons for using the UNUSED macro.
Now coming to the most reviled BEGIN_BLOCK macro: I fully expected vehement, if not violent, opposition. The explanation clearly states that sometimes we need to open a curly brace WITHOUT an if, while, do, switch or any similar construct. I could be mercilessly crucified had I suggested using it to open and close a function body or any other construct that already requires a block. You are writing a few lines of code and, bang, you want to open a curly brace. Just ask yourself when you have such a requirement. Very rare. Or is it?
Let me give one valid and simple usage with the help of C++. In the middle of a function body, we could open a block, declare a CMutex or CCriticalSection object, initialize it, use it to execute a few lines of code in exclusive mode, and then close the block to release the lock or leave the critical section before continuing with the rest of the function.
Using if (1) or if (TRUE) for the sake of having a block is much uglier. If the opening and closing curly braces are used all by themselves, another developer reading the code might wonder whether an “if” or “while” was left out by mistake. So if one chooses to be courteous to fellow developers, one could leave a comment such as “entering critical section” next to the opening brace, and go further and leave another comment at the end of the block as well. With these macros and an understanding of their intended usage, all these dilemmas are avoided and the code becomes more readable. The macros can be used in such situations to leave tell-tale signs of the programmer’s intentions.
Regarding the endian macros, the solution does not lie in wishing away the problem or pretending such a problem does not exist or thinking that we never have to worry about endianness in our code. In the work experience provided below, I have mentioned that I wrote a module in Java to compile the XML schema and the generated meta data was stored in another file. Then the runtime module which was written in C had to read this compiled meta data after opening the file in binary mode. So the endianness had to be taken care of in the Java module while writing and in the C module as well while reading.
The endian macros presented here worked successfully under Visual Studio as well as with the GNU 32-bit and 64-bit compilers under Linux. If on some machine a macro results in a memory access fault or does not work as intended, the solution is to use an alternate implementation for that environment, making use of the pre-defined preprocessor definitions available in that particular compiler. That is the very purpose of the macro facility: the code using the macros remains intact and does not have to change.
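As one example of such an alternate implementation, here is a sketch that detects endianness by copying a byte with memcpy instead of casting a string literal to a short pointer, so it does not depend on alignment. It is a substitute technique, not the code used in the project, and the names host_is_little_endian and swap32 are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a Little-Endian host, 0 otherwise. */
int host_is_little_endian(void)
{
    const uint16_t probe = 0x0102;
    unsigned char first;

    memcpy(&first, &probe, 1);   /* lowest-addressed byte of probe */
    return first == 0x02;
}

/* Reverses the four bytes of a 32-bit value using shifts only. */
uint32_t swap32(uint32_t n)
{
    return ((n & 0x000000FFu) << 24) |
           ((n & 0x0000FF00u) << 8)  |
           ((n & 0x00FF0000u) >> 8)  |
           ((n & 0xFF000000u) >> 24);
}
```

IS_LITTLE_ENDIAN() could then be defined as a call to host_is_little_endian() on the affected platform, leaving every CONVERT_ macro and all the calling code untouched.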
Definitely I have not presented any untested or flawed code. But improvements are always possible and if any of the macros can be implemented in a better way please suggest them for the benefit of everybody.
Hope my reply is convincing. No offense intended to anybody and I appreciate your valuable comments.
My Work Experience
In 1990 I started to do some serious coding in C. In 1991, while working for Coromandel in New York City, I got the chance to implement a B+ Tree from scratch when I was asked to build a library in C to access Dbase-III data, index and memo files. The Dbase-III module was used in the product called “DbControls for Dbase”, which was a plugin or add-on, or more exactly a set of “VBX Controls”, for Microsoft Visual Basic 1.0. The product was promoted by Microsoft along with its Visual Basic, and Coromandel successfully sold several copies of the shrink-wrapped product in USA, UK, Scandinavia, Australia and Japan.
In 1993 I moved completely to C++ and MFC and developed several key components of “Visual Database Builder”, another successful product from Coromandel. One of the main components, the Query By Example (QBE) module, was bought by Borland International and incorporated into their product “Delphi”, which became quite popular and was used worldwide.
While working for Base One International, New York, I developed the “C++ Number Class” library, which has been sold worldwide as a separate product; its customer list includes NASA, US Navy and HP Research Labs. I got my first patent for the work done on Number Class, and the product can be downloaded along with the source code at ContentGalaxy.com.
From 2005-2011 I was working as a software consultant to apigee. The main modules I designed and developed were XML Parser, XPath Compiler & Evaluator and XML Schema Validator. All the modules were written in C. But for XML Schema module, I used Java for parsing and compilation of a .xsl file and generated the compiled metadata which was stored in another file .cxsl (compiled xsl). The schema validation module, basically the runtime or the application of compiled information onto a required XML file was done in C. All these modules are currently in production and are being used by the customers of apigee spread across the world.