C and C++ Coding Style Guide

Basics

Summary

C and C++ are widely used languages for (embedded) software.

Our C and C++ style guide follows the Google C++ Style Guide, with some exceptions and clarifications.

As with all style guides the intention is to:

  • promote consistency across projects
  • promote best practices
  • increase code sharing and re-use

Terminology Conventions

Unless otherwise noted, the following terminology conventions apply to this style guide:

  • The word must indicates a mandatory requirement. Similarly, do not indicates a prohibition. Imperative and declarative statements correspond to must.
  • The word recommended indicates that a certain course of action is preferred or is most suitable. Similarly, not recommended indicates that a course of action is unsuitable, but not prohibited. There may be reasons to use other options, but the implications and reasons for doing so must be fully understood.
  • The word may indicates a course of action is permitted and optional.
  • The word can indicates a course of action is possible given material, physical, or causal constraints.

Shared C and C++ Style Guide

We use the Google C++ Style Guide for both C and C++ code. The following exceptions and additions to this style guide apply to both C and C++ code.

Pointers

When declaring pointer types, the asterisk (*) should be placed next to the variable name, not the type.

Example:

int *ptr;

Formatting of loops and conditionals

Unbraced single-statement bodies are not allowed. All conditionals and loops must use braces.

Example:

if (foo) {
  do_something();
}

Infinite loops

Prefer while (true) {} for infinite loops rather than for (;;).
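As a minimal sketch of the preferred form, the helper below loops with while (true) and leaves through an explicit break. The name count_halvings is hypothetical and chosen only for illustration. The single-statement if body is braced, as required above.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical helper: counts how many times `value` can be halved before it
// reaches zero.
uint32_t count_halvings(uint32_t value) {
  uint32_t steps = 0;
  // Preferred infinite-loop form; the exit condition is an explicit break.
  while (true) {
    if (value == 0) {
      break;
    }
    value /= 2;
    ++steps;
  }
  return steps;
}
```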

Comments

Non-doc comments should use the C99-style // form for consistency with C++.

Variables mentioned in comments should be delimited with backtick (`) characters. Example:

// `ptr` can never be NULL for reasons.

Documentation comments should use Doxygen-style /** ... */ blocks. Examples:

/**
 * This function sorts an int array very quickly.
 */
void sort(int array[], size_t size) {
  // Loop through the array moving higher numbers to the end.
  for (size_t i = 0; i < size; ++i) {
    ...
  }
}

This also applies to struct and enum members:

/**
 * Boot data stored in the flash info partition.
 */
typedef struct boot_data {
  /**
   * SHA-256 digest of boot data.
   *
   * The region covered by this digest starts immediately after this field and
   * ends at the end of the entry.
   */
  hmac_digest_t digest;
  ...
  /**
   * Boot data identifier.
   */
  uint32_t identifier;
  ...
} boot_data_t;

See also Public function (API) documentation below.

TODO Comments

TODO comments should be in the format TODO: message.

TODO comments which require more explanation should reference an issue.

It is recommended to use fully-qualified issue numbers or URLs when referencing issues or pull requests.

TODO comments should not indicate an assignee of the work.

Example:

// TODO: This algorithm should be rewritten to be more efficient.
// (Bug lowrisc/reponame#27)

Included files

#include directives must, with exceptions, be rooted at $REPO_TOP.

Every #include directive must be rooted at the repository base, including files in the same directory. This helps the reader quickly find headers in the repository, without having to worry about local include-search rules.

Example: my/device/library.c would start with a directive like the following:

#include "my/device/library.h"

This rule does not apply to generated headers, since they do not yet have a designated location in the source tree, and are instead included from ad-hoc locations. Until these questions are resolved, these includes must be marked as follows:

#include "my_generated_file.h"  // Generated.

This convention helps readers distinguish which files they should not expect to find in-tree.

The above rules also do not apply to system includes, which should be included by the names dictated by the ISO standard, e.g. #include <stddef.h>.

Linker Script- and Assembly-Provided Symbols

Some C/C++ programs may need to use symbols that are defined by a linker script or in an external assembly file. Referring to linker script- and assembly-provided symbols can be complex and error-prone, as they don’t quite work like C’s global variables. We have chosen the following approach based on the examples in the binutils ld manual.

If you need to refer to the symbol _my_linker_symbol, the correct way to do so is with an incomplete extern char array declaration, as shown below. It is good practice to provide a comment that directs others to where the symbol is actually defined, and whether the symbol should be treated as a memory address or another kind of value.

/**
 * `_my_linker_symbol` is defined in the linker script `sw/device/my_feature.ld`.
 *
 * `_my_linker_symbol` is a memory address in RAM.
 */
extern char _my_linker_symbol[];

A symbol’s value is exposed using its address, and declaring it this way allows you to use the symbol where you need a pointer.

char my_buffer[4];
memcpy(my_buffer, _my_linker_symbol, sizeof(my_buffer));

If the symbol has been defined to a non-address value (usually using ABSOLUTE() in a linker script, or .set in assembly), you must cast the symbol to obtain its value using (intptr_t)_my_linker_symbol. You must not dereference a symbol that has a non-address value.

Public function (API) documentation

It is recommended to document public functions, classes, methods, and data structures in the header file with a Doxygen-style comment.

The first line of the comment is the summary, followed by a new line, and an optional longer description. Input arguments and return arguments should be documented with @param and @return if they are not self-explanatory from the name. Output arguments should be documented with @param[out].

The documentation tool will also render markdown within descriptions, so backticks should be used to get monospaced text. It can also generate references to other named declarations using #other_function (for C-style declarations), or ns::foo (for C++ declarations).

Example:

/**
 * Do something amazing
 *
 * Create a rainbow and place a unicorn at the bottom of it. `pots_of_gold`
 * pots of gold will be positioned on the east end of the rainbow.
 *
 * Can be recycled with #recycle_rainbow.
 *
 * @param pots_of_gold Number of gold pots to place next to the rainbow
 * @param unicorns Number of unicorns to position on the rainbow
 * @param[out] expiration_time Pointer to receive the time the rainbow will last in seconds.
 * @return 0 if the function was successful, -1 otherwise
 */
int create_rainbow(int pots_of_gold, int unicorns, int *expiration_time);

Polyglot headers

Headers intended to be included from both languages must contain extern guards; #includes should not be wrapped in extern "C" {}.

A polyglot header is a header file that can be safely included in either a .c or .cc file. In particular, this means that the file must not depend on any of the places where C and C++ semantics disagree. For example:

  • sizeof(struct {}) and sizeof(true) are different in C and C++.
  • Function-scope static variables generate lock calls in C++.
  • Some libc macros, like static_assert, may not be present in C++.
  • Character literals type as char in C++ but int in C.

Such files must be explicitly marked with extern guards like the following, starting after the file’s #includes.

#ifdef __cplusplus
extern "C" {
#endif

// Declarations...

#ifdef __cplusplus
}  // extern "C"
#endif

Moreover, all non-system #includes in a polyglot header must themselves be polyglot headers. (C system headers may be assumed to be polyglot, even if they lack guards.)

Additionally, it is forbidden to wrap #include statements in extern "C" in a C++ file. While this does correctly set the ABI for a header implemented in C, that header may contain code that subtly depends on the peculiarities of C.

This last rule is waived for third-party headers, which may be polyglot but not declared in our style.

X Macros

In order to avoid repetitive definitions or statements, we allow the use of X macros in our C and C++ code.

Uses of X macros should follow the pattern in this example, which applies it in a switch statement:

#define MANY_FIELDS(X) \
  X(1, 2, 3)           \
  X(4, 5, 6)

int get_field2(int identifier) {
#define ITEM_(id_field, data_field1, data_field2) \
  case id_field:                                  \
    return data_field2;

  switch (identifier) {
    MANY_FIELDS(ITEM_)
    default:
      return 0;
  }
}

This example expands to a case statement for each item, which returns the data_field2 value where the passed in identifier matches id_field.

X macros that are not part of a header’s API should be #undefed once they are no longer needed. Similarly, the arguments to an X macro, if they are defined in a header, should be #undefed too. This is not necessary in a .c or .cc file, where this cannot cause downstream namespace pollution.
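The sketch below shows one way the same list can drive two expansions, with each helper macro #undefed as soon as it is no longer needed. All names (MY_UNIT_ERRORS, my_unit_error_t, my_unit_error_name, and the helper macros) are hypothetical and serve only as an illustration.

```c
// Hypothetical list of (enumerator, description) pairs.
#define MY_UNIT_ERRORS(X)           \
  X(kMyUnitErrorNone, "none")       \
  X(kMyUnitErrorTimeout, "timeout") \
  X(kMyUnitErrorOverflow, "overflow")

// First expansion: one enumerator per list item.
#define ENUM_ITEM_(name, description) name,
typedef enum my_unit_error { MY_UNIT_ERRORS(ENUM_ITEM_) } my_unit_error_t;
#undef ENUM_ITEM_

// Second expansion: one case label per list item.
const char *my_unit_error_name(my_unit_error_t error) {
#define CASE_ITEM_(name, description) \
  case name:                          \
    return description;

  switch (error) {
    MY_UNIT_ERRORS(CASE_ITEM_)
    default:
      return "unknown";
  }
#undef CASE_ITEM_
}
```

Adding a new error then means adding a single line to the list; the enum and the name lookup stay in sync automatically.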

C++ Style Guide

C++ Version

C++ code should target C++14.

Aggregate Initializers

C++20-style designated initializers are permitted in C++ code, when used with POD types.

While we target C++14, both GCC and Clang allow C++20 designated initializers in C++14-mode, as an extension:

struct Foo { int a, b; };

Foo f = { .a = 1, .b = 42, };

This feature is fairly mature in both compilers, though it varies from the C11 variant in two important ways:

  • It can only be used with structs and unions, not arrays.
  • Members must be initialized in declaration order.

As it is especially useful with types declared for C, we allow designated initializers whenever the type is a plain-old-data type, and:

  • All members are public.
  • It has no non-trivial constructors.
  • It has no virtual members.

Furthermore, designated initializers do not play well with type deduction and overload resolution. As such, they are forbidden in the following contexts:

  • Do not call overloaded functions with a designated initializer: overloaded({ .foo = 0 }). Instead, disambiguate with syntax like T var = { .foo = 0 }; overloaded(var);.
  • Do not use designated initializers in any place where they would be used for type deduction. This includes auto, such as auto var = { .foo = 0 };, and a templated argument in a template function.

It is recommended to only use designated initializers with types which use C-style declarations.

Naming

Structs, Classes and Methods: As stated by the Google C++ Style Guide, the names of Structs, Classes and Methods must be in CamelCase format.

Variable and Class Members Naming: As stated by the Google C++ Style Guide, the names of variables (including function parameters) and data members must be in lower_snake_case format. Data members of Classes (but not Structs) additionally have trailing underscores, unless the member is a constant, which follows the constant naming rule instead (for example, kMaxNumOfPaws below).

For example:

// in animal.cpp
class AnimalInfo {
  ...
private:
  uint32_t number_of_paws_;
  std::string name_;
public:
  static constexpr uint32_t kMaxNumOfPaws = 100;

  void SetAnimalName(std::string new_name);
};

Features

Avoid using C-style features in C++ code. Here are some examples:

  • Use C++-style casting (static_cast, reinterpret_cast, etc) rather than C-style casting.
  • Use the constexpr keyword to define constants rather than const.
  • Use the nullptr keyword rather than NULL.
  • Use std::endl rather than '\n'.
  • Use new and delete rather than malloc() and free().
  • Use smart pointers rather than pointers. Refer to Ownership and Smart Pointers for more details.

C Style Guide

The Google C++ Style Guide targets C++, but it can also be used for C code with minor adjustments. Consequently, C++-specific rules don’t apply. In addition to the shared C and C++ style guide rules outlined before, the following C-specific rules apply.

C Version

C code should target C11.

The following nonstandard extensions may be used:

  • Inline assembly
  • Nonstandard attributes
  • Compiler builtins

It is recommended that no other nonstandard extensions are used.

Any nonstandard features that are used must be compatible with both GCC and Clang.

Function, enum, struct and typedef naming

Names of functions, enums, structs, and typedefs must be lower_snake_case.

This rule deviates from the Google C++ style guide to align closer with a typical way of writing C code.

All symbols in a particular header must share the same unique prefix.

“Prefix” in this case refers to the identifying string of words, and not the specific type/struct/enum/constant/macro-based capitalisation. This rule also deviates from the Google C++ style guide, because C does not have namespaces, so we have to use long names to avoid name clashes. Symbols that have specific, global meaning imparted by an external script or specification may break this rule. For example:

// in my_unit.h
extern const int kMyUnitMaskValue;  // Defined (e.g. as 0xFF) in exactly one .c file.

typedef enum { kMyUnitReturnOk } my_unit_return_t;

my_unit_return_t my_unit_init(void);

The names of enumeration constants must be prefixed with the name of their respective enumeration type.

Again, this is because C does not have namespaces. The exact capitalisation does not need to match, as enumeration type names have a different capitalisation rule to enumeration constants. For example:

typedef enum my_wonderful_option {
  kMyWonderfulOptionEnabled,
  kMyWonderfulOptionDisabled,
  kMyWonderfulOptionOnlySometimes
} my_wonderful_option_t;

C-specific Keywords

C11 introduces a number of underscore-prefixed keywords, such as _Static_assert, _Bool, and _Noreturn, which do not have a C++ counterpart. These should be avoided in favor of the macros that wrap them, such as static_assert, bool, and noreturn (provided by <assert.h>, <stdbool.h>, and <stdnoreturn.h> respectively).
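A short sketch of the wrapper macros in use, assuming a C11 toolchain. The my_unit_* names are hypothetical; only the three standard headers and their macros come from the C standard library.

```c
#include <assert.h>       // static_assert wraps _Static_assert
#include <stdbool.h>      // bool wraps _Bool
#include <stdint.h>
#include <stdlib.h>
#include <stdnoreturn.h>  // noreturn wraps _Noreturn

// Compile-time check written with the macro, not the raw keyword.
static_assert(sizeof(uint32_t) == 4, "uint32_t must be four bytes");

// Hypothetical fatal-error handler that never returns to its caller.
noreturn void my_unit_fatal(void) { abort(); }

// Hypothetical predicate returning `bool` rather than `_Bool`.
bool my_unit_is_word_aligned(uint32_t address) {
  return (address % 4) == 0;
}

// Returns `address` unchanged, or aborts if it is not word aligned.
uint32_t my_unit_require_word_aligned(uint32_t address) {
  if (!my_unit_is_word_aligned(address)) {
    my_unit_fatal();
  }
  return address;
}
```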

Constants and Preprocessor Macros

Constants: Prefer using an enum to define named constants rather than preprocessor macros or even const int in C because:

  • Unlike macros, it appears in the symbol table, which improves debugging.
  • Unlike const int, it can be used as a case label in a switch statement.
  • Unlike const int, it can be used as the dimension of a global array.
  • Like macros, it doesn’t use any memory.

Note that if the constant will be used in assembly code, a preprocessor macro is the better choice.
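The properties listed above can be sketched as follows. The constants and functions belong to an imaginary my_unit module and are hypothetical. The same code would not compile in C if the constants were declared as const int, because neither the array dimension nor the case label would be an integer constant expression.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical constants for an imaginary `my_unit` module.
enum {
  kMyUnitSlotCount = 4,
  kMyUnitReservedSlot = 0,
};

// An enum constant can size a file-scope array.
static uint32_t my_unit_slots[kMyUnitSlotCount];

// Stores `value` in `slot`. Returns false for the reserved slot and for
// out-of-range slots.
bool my_unit_store(uint32_t slot, uint32_t value) {
  switch (slot) {
    // An enum constant can be used as a case label.
    case kMyUnitReservedSlot:
      return false;
    default:
      if (slot >= kMyUnitSlotCount) {
        return false;
      }
      my_unit_slots[slot] = value;
      return true;
  }
}

// Returns the value stored in `slot`, or 0 if `slot` is out of range.
uint32_t my_unit_load(uint32_t slot) {
  if (slot >= kMyUnitSlotCount) {
    return 0;
  }
  return my_unit_slots[slot];
}
```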

Macros: Macros are often necessary and a reasonable coding practice in C (as opposed to C++) projects. In contrast to the recommendation in the Google C++ style guide, exporting macros as part of the public API is allowed in C code. A typical use case is a header with register definitions.

Function-like Macros: Function-like macros should be avoided whenever possible since they are error-prone. Where they are necessary, they should be hygienic. Here are some useful tips:

  • Wrap each macro argument in parentheses () where it is expanded, as the caller could pass an expression as an argument.
  • Variables local to the macro must be named with a trailing underscore _.
  • Wrap up multiline macros inside a block do { ... } while (false) to make them expand to a single statement. For example:
    #define CHECK(condition, ...)                      \
    do {                                               \
      if (!(condition)) {                              \
        ...                                            \
      }                                                \
    } while (false)
    
  • Don’t end a macro definition with a semicolon ;, so that the caller is forced to write it.
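The tips above can be combined in one small sketch. MY_UNIT_SWAP_U32 and my_unit_sort_pair are hypothetical names used only for illustration: the arguments are parenthesized, the macro-local variable has a trailing underscore, the body is wrapped in do { ... } while (false), and the definition does not end in a semicolon.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical macro that swaps two `uint32_t` lvalues.
#define MY_UNIT_SWAP_U32(a, b) \
  do {                         \
    uint32_t tmp_ = (a);       \
    (a) = (b);                 \
    (b) = tmp_;                \
  } while (false)

// Orders the pair so that `*low <= *high` on return.
void my_unit_sort_pair(uint32_t *low, uint32_t *high) {
  if (*low > *high) {
    // The caller supplies the semicolon, so this reads like a statement.
    MY_UNIT_SWAP_U32(*low, *high);
  }
}
```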

Aggregate Initialization

C99 introduces designated initializers: when initializing a struct, array, or union, it is possible to designate an initializer as being for a particular field or array index. For example:

my_struct_t s = { .my_field = 42 };
int arr[5] = { [3] = 0xff, [4] = 0x1b };

With judicious use, designated initializers can make code more readable and robust; struct field reordering will not affect downstream users, and weak typing will not lead to surprising union initialization.

When initializing a struct or union, initializers within must be designated; array-style initialization (or mixing designated and non-designated initializers) is forbidden.

Furthermore, the nested forms of designated initialization are forbidden (e.g., .x.y = foo and .x[0] = bar), to discourage initialization of deeply nested structures with flat syntax. This may change if we find cases where this initialization improves readability.

When initializing an array, initializers may be designated when that makes the array more readable (e.g., lookup tables that are mostly zeroed). Mixing designated and non-designated initializers, or using nested initializers, is still forbidden.
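A minimal sketch of the permitted forms, under the reading that nested structures are initialized with nested braces rather than flat .x.y designators. The types and names are hypothetical.

```c
#include <stdint.h>

typedef struct my_unit_point {
  int32_t x;
  int32_t y;
} my_unit_point_t;

typedef struct my_unit_rect {
  my_unit_point_t origin;
  my_unit_point_t size;
} my_unit_rect_t;

// Every struct member is designated. The nested structs use nested braces
// instead of the forbidden flat form `.origin.x = 1`.
my_unit_rect_t my_unit_default_rect(void) {
  my_unit_rect_t rect = {
      .origin = {.x = 1, .y = 2},
      .size = {.x = 10, .y = 20},
  };
  return rect;
}

// A mostly-zero lookup table, where designating indices aids readability.
// Elements without an initializer are zero.
const uint8_t kMyUnitLookup[8] = {[3] = 0xff, [4] = 0x1b};
```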

Function Declarations

All function declarations in C must include a list of the function’s parameters, with their types.

C functions declared as return_t my_function() are called “K&R declarations”, and are type compatible with any list of arguments, with any types. Declarations of this type allow type confusion, especially if the function definition is not available.

The correct way to denote that a function takes no arguments is using the parameter type void. For example return_t my_function(void) is the correct way to declare that my_function takes no arguments.

The parameter names in every declaration should match the parameter names in the function definition.
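As a small illustration (all names hypothetical), the declarations below spell out their parameter lists, use void for the function that takes no arguments, and use the same parameter name in the declaration and the definition. The declarations would normally live in a header; they are shown in one file here to keep the sketch self-contained.

```c
#include <stdint.h>

// Declarations, as they would appear in a header.
uint32_t my_unit_get_total(void);   // Takes no arguments: `(void)`, not `()`.
void my_unit_add(uint32_t amount);  // Parameter name matches the definition.

// Running total, private to this compilation unit.
static uint32_t my_unit_total = 0;

uint32_t my_unit_get_total(void) {
  return my_unit_total;
}

void my_unit_add(uint32_t amount) {
  my_unit_total += amount;
}
```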

Inline Functions

Functions that we strongly wish to be inlined, and which are part of a public interface, should be declared as an inline function. This annotation serves as an indication to the programmer that the function has a low calling overhead, despite being part of a public interface. Presence—or lack—of an inline annotation does not imply a function will—or will not—be inlined by the compiler.

C11 standardised inline functions, learning from the mistakes in C++ and various nonstandard extensions. This means there are many legacy ways to define an inline function in C. We have chosen to follow how C11 designed the inline keyword.

The function definition is written in a header file, with the keyword inline:

// in my_inline.h
inline int my_inline_function(long param1) {
  // implementation
}

There should be exactly one compilation unit with a compatible extern declaration of the same function:

// in my_inline.c
#include "my_inline.h"
extern int my_inline_function(long param1);

Any compilation unit that includes my_inline.h must be linked to the compilation unit with the extern declarations. This ensures that if the compiler chooses not to inline my_inline_function, there is a function definition that can be called. This also ensures that the function can be used via a function pointer.

Static Declarations

Declarations marked static must not appear in header files. Header files are declarations of public interfaces, and static definitions are copied, not shared, between compilation units.

This is especially important in the case of a polyglot header, since function-local static declarations have different, incompatible semantics in C and C++.

Functions marked static must not be marked inline. The compiler is capable of inlining static functions without the inline annotation.

volatile Type Qualifier

Do not use volatile in production, i.e. non-test, silicon creator code unless you are implementing a library explicitly for this purpose like sec_mmio, abs_mmio, or hardened. See guidance for volatile for more details. When in doubt, please do not hesitate to reach out by creating a GitHub issue (preferably with the “Type:Question” label).

Nonstandard Attributes

The following nonstandard attributes may be used:

  • section(<name>) to put a definition into a given object section.
  • weak to make a symbol definition have weak linkage.
  • interrupt to ensure a function has the right prolog and epilog for interrupts (this involves saving and restoring more registers than normal).
  • packed to ensure a struct contains no padding.
  • warn_unused_result, to mark functions that return error values that should be checked.

It is recommended that other nonstandard attributes are not used, especially where C11 provides a standard means to accomplish the same thing.

All nonstandard attributes must be supported by both GCC and Clang.

Nonstandard attributes must be written before the declaration, like the following example, so they work with both declarations and definitions.

__attribute__((section(".crt"))) void _crt(void);
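The same placement works for the other permitted attributes. The sketch below applies warn_unused_result to a hypothetical function, my_unit_parse_digit, writing the attribute before both the declaration and the definition. It assumes GCC or Clang.

```c
#include <stdint.h>

// Declaration, as it would appear in a header: attribute first.
__attribute__((warn_unused_result)) int my_unit_parse_digit(char c,
                                                            uint32_t *digit);

// Definition: the attribute is again written before the declarator.
// Returns 0 and writes `*digit` on success, or -1 if `c` is not a digit.
__attribute__((warn_unused_result)) int my_unit_parse_digit(char c,
                                                            uint32_t *digit) {
  if (c < '0' || c > '9') {
    return -1;
  }
  *digit = (uint32_t)(c - '0');
  return 0;
}
```

A caller that discards the return value gets a compiler warning, which is the point of marking error-returning functions this way.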

Nonstandard Compiler Builtins

In order to avoid a total reliance on a single compiler, any nonstandard compiler builtins (also known as intrinsics) should be used via a single canonical definition. This ensures changes that add compatibility for other compilers are less invasive, as we already have a function that includes a full implementation.

All nonstandard builtins should be supported by both GCC and Clang. Compiler builtin usage is complex, and it is recommended that a compiler engineer reviews any code that adds new builtins.

In the following, __builtin_foo is the builtin name and foo is the corresponding general name.

The use of nonstandard compiler builtins must be hidden using a canonical, compatible definition.

There are two ways of providing this canonical definition, depending on what the builtin does.

For builtins that correspond to a C library function, the general name must be available to the linker, as the compiler may still insert a call to this function. Unfortunately, older versions of GCC do not support the __has_builtin() preprocessor function, so compiler detection of support for these builtins is next to impossible. In this case, a standards-compliant implementation of the general name must be provided, and the compilation unit should be compiled with -fno-builtin.

For builtins that correspond to low-level byte and integer manipulations, an inline function should be provided with a general name, which contains a call to the builtin name itself, or an equivalent implementation. Only the general name may be called by users: for instance, uint32_t __builtin_bswap32(uint32_t) must not be called, instead users should use inline uint32_t bswap32(uint32_t x). Where the general name is already taken by an incompatible host or device library symbol, the general name can be prefixed with the current C namespace prefix, for instance inline uint32_t bitfield_bswap32(uint32_t x) for a function in bitfield.h. Where the general name is a short acronym, the name may be expanded for clarity, for instance __builtin_ffs may have a canonical definition named bitfield_find_first_set. Where there are compatible typedefs that convey additional meaning (e.g. uint32_t vs unsigned int), these may be written instead of the official builtin types.

For builtins that cannot be used via a compatible function definition (e.g. if an argument is a type or identifier), there should be a single canonical preprocessor definition with the general name, which expands to the builtin.
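The last two cases can be sketched as follows, assuming GCC or Clang. The function names follow the examples given in the text above; the macro name types_compatible_p is a hypothetical general name. The inline definitions and their extern declarations are placed in one file only to keep the sketch self-contained; in a real project the definitions would sit in a header and the extern declarations in exactly one .c file, as described under Inline Functions.

```c
#include <stdint.h>

// Canonical wrappers: users call these general names, never the builtins.
inline uint32_t bitfield_bswap32(uint32_t x) {
  return __builtin_bswap32(x);
}

// Returns one plus the index of the least significant set bit of `x`, or 0
// if `x` is zero. Assumes `int` is 32 bits wide, so `int32_t` is compatible
// with the builtin's `int` parameter.
inline int32_t bitfield_find_first_set(int32_t x) {
  return __builtin_ffs(x);
}

// The single set of extern declarations that provides callable definitions.
extern uint32_t bitfield_bswap32(uint32_t x);
extern int32_t bitfield_find_first_set(int32_t x);

// A builtin that takes types as arguments cannot be wrapped in a function,
// so it gets a single canonical preprocessor definition instead.
#define types_compatible_p(type_a, type_b) \
  __builtin_types_compatible_p(type_a, type_b)
```

If another compiler needs to be supported later, only these canonical definitions have to change.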

Code Lint

The clang-format tool can check for adherence to this style guide. The repository contains a .clang-format file which configures clang-format according to the rules outlined in this style guide.

You can run clang-format on your changes by calling git clang-format.

cd $REPO_TOP
# make changes to the code ...
git add your_modified_file.c
# format the staged changes
git clang-format

To reformat the whole tree, use the command ./bazelisk.sh run //quality:clang_format_fix.