gareth_rees wrote,
@ 2007-04-24 21:31:00
Relational macros
Relational macros are a C programming technique for embedding tables of data into programs in a way that is easy to check and safe to update.

First, a motivating example. Imagine we're writing some bitmap-handling code, and we're going to be handling several pixel formats. So let's have an enumeration in a header:
enum {
    PIXEL_FORMAT_8888, /* 8 bits each of R, G, B and A. */
    PIXEL_FORMAT_8880, /* 8 bits each of R, G and B; no A. */
    PIXEL_FORMAT_5551, /* 5 bits each of R, G and B; 1 bit A. */
    PIXEL_FORMAT_I8,   /* 8 bits of intensity (equal RGB). */
    PIXEL_FORMAT_LIMIT
};
(It's nice to have a LIMIT for each enumeration so you can straightforwardly check whether a value is in the enumeration, iterate over values in the enumeration and so on.) Now in some other piece of code we want to know how many bits per pixel there are for each format, so let's have a table of data:
static int pixel_format_bits[] = {
    32,
    24,
    16,
     8,
};
There are two important problems with this. First, it's hard to check that this is correct: we've got to compare text in two files. Second, if we change the order of values in the enumeration, or delete or add values, we break the bits-per-pixel table and there's nothing to tell us that we've gone wrong. So let's be a lot more cautious about checking that we've got it right, and change that table:
#include <assert.h> /* for the consistency check below */

static struct {
    int format;
    int bits;
} pixel_format_bits[] = {
    {PIXEL_FORMAT_8888, 32},
    {PIXEL_FORMAT_8880, 24},
    {PIXEL_FORMAT_5551, 16},
    {PIXEL_FORMAT_I8,    8},
};

static void
pixel_format_check(void) {
#ifndef NDEBUG
    int i;
    assert((sizeof pixel_format_bits) / (sizeof pixel_format_bits[0])
           == PIXEL_FORMAT_LIMIT);
    for (i = 0; i < PIXEL_FORMAT_LIMIT; i++) {
        assert(pixel_format_bits[i].format == i);
    }
#endif
}
This works, but it's already looking a bit verbose for what ought to be a simple data structure. And if somewhere else we need another piece of information about each pixel format (for example, we need its name so that on the command line we can indicate which format of output we want), then we need to do the same thing again:
static struct {
    int format;
    const char *name;
} pixel_format_name[] = {
    {PIXEL_FORMAT_8888, "rgba"     },
    {PIXEL_FORMAT_8880, "rgb"      },
    {PIXEL_FORMAT_5551, "reduced"  },
    {PIXEL_FORMAT_I8,   "greyscale"},
};
And of course we need code to check this table too. You can see that this could get rather tedious.

What we have ended up doing here is taking a single relation (pixel format, bits per pixel, format name) and splitting it up into several smaller relations in different places in the code. Not only is it still hard to check that the relation is correct, it's still hard to modify it. (At least we do get told if we made a mistake, but we don't find out until we've compiled and run the program.)

However, we can get something that's easier to check and maintain if we keep the relation in one place. So let's do that:
#define PIXEL_FORMAT_RELATION(X) \
    X(PIXEL_FORMAT_8888, 32, "rgba"     ) /* 8 bits each of R, G, B and A. */ \
    X(PIXEL_FORMAT_8880, 24, "rgb"      ) /* 8 bits each of R, G and B; no A. */ \
    X(PIXEL_FORMAT_5551, 16, "reduced"  ) /* 5 bits each of R, G and B; 1 bit A. */ \
    X(PIXEL_FORMAT_I8,    8, "greyscale") /* 8 bits of intensity (equal RGB). */
This macro gives us the whole relation, and by passing in different things for the parameter X, we can capture different parts of the relation in different contexts.

To declare the enumeration:
enum {
#define X(ENUM, BITS, NAME) ENUM,
    PIXEL_FORMAT_RELATION(X)
#undef X
    PIXEL_FORMAT_LIMIT
};
To define the table of bits per pixel:
static int pixel_format_bits[] = {
#define X(ENUM, BITS, NAME) BITS,
    PIXEL_FORMAT_RELATION(X)
#undef X
};
And to define the table of names:
static const char *pixel_format_name[] = {
#define X(ENUM, BITS, NAME) NAME,
    PIXEL_FORMAT_RELATION(X)
#undef X
};
With this technique, there's no need for code to check that the tables are consistent with the enumeration, because the enumeration and every table are generated from the same rows in the same order. We can add, delete, and rearrange rows of PIXEL_FORMAT_RELATION without needing to change anything else in the code.
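Here is a minimal sketch of how the pieces might fit together in a single file, assuming the relation and the three projections shown earlier. The two helper functions, pixel_format_from_name and pixel_format_row_bytes, are hypothetical names invented for this illustration, and the const qualifiers on the tables are an addition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The relation from the article: one row per pixel format. */
#define PIXEL_FORMAT_RELATION(X) \
    X(PIXEL_FORMAT_8888, 32, "rgba"     ) \
    X(PIXEL_FORMAT_8880, 24, "rgb"      ) \
    X(PIXEL_FORMAT_5551, 16, "reduced"  ) \
    X(PIXEL_FORMAT_I8,    8, "greyscale")

/* Projection 1: the enumeration, in row order. */
enum {
#define X(ENUM, BITS, NAME) ENUM,
    PIXEL_FORMAT_RELATION(X)
#undef X
    PIXEL_FORMAT_LIMIT
};

/* Projection 2: bits per pixel, indexed by format. */
static const int pixel_format_bits[] = {
#define X(ENUM, BITS, NAME) BITS,
    PIXEL_FORMAT_RELATION(X)
#undef X
};

/* Projection 3: format names, indexed by format. */
static const char *const pixel_format_name[] = {
#define X(ENUM, BITS, NAME) NAME,
    PIXEL_FORMAT_RELATION(X)
#undef X
};

/* Hypothetical helper: map a name (say, from the command line) to a
 * format, or return PIXEL_FORMAT_LIMIT if the name is not recognised. */
static int
pixel_format_from_name(const char *name) {
    int i;
    for (i = 0; i < PIXEL_FORMAT_LIMIT; i++) {
        if (strcmp(name, pixel_format_name[i]) == 0) {
            return i;
        }
    }
    return PIXEL_FORMAT_LIMIT;
}

/* Hypothetical helper: bytes needed to store `width` pixels in the
 * given format, rounded up to a whole byte. */
static size_t
pixel_format_row_bytes(int format, size_t width) {
    assert(0 <= format && format < PIXEL_FORMAT_LIMIT);
    return (width * (size_t)pixel_format_bits[format] + 7) / 8;
}
```

Neither helper mentions any particular format, so adding a row to PIXEL_FORMAT_RELATION extends both of them without further changes.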

(This is based on the relational header approach, in which you put the relation in a separate header file, and #include in each place you want to use it. Paul Hankin pointed out that you could put the relation into a macro and thus avoid a proliferation of small headers. I learned the relational header technique when I was at university, but I can no longer remember quite where or when: Googling for the term finds only a previous mention of mine. So what happened to this technique? Have I got the name wrong? Has it been forgotten? Does everyone use something better these days?)

Update: Wikipedia mentions the relational header technique under the name "X-Macros". (Spotted by Paul Wright, via Gareth McCaughan.)



_lj_sucks_
2007-04-25 02:12 am UTC
Does everyone use something better these days?

Than C? Sadly not.



fanf
2007-04-25 04:02 pm UTC
A good technique. I think it's slightly nicer to use more informative projection macro names, which also removes the need to #undef them:
    enum {
    #define PIXEL_FORMAT_ENUM(ENUM, BITS, NAME) ENUM,
        PIXEL_FORMAT_RELATION(PIXEL_FORMAT_ENUM)
        PIXEL_FORMAT_LIMIT
    };



drj11
2007-04-26 09:21 am UTC
In Gareth's example pixel_format_bits and pixel_format_name might be in the same translation unit, so you had better #undef the (relation row?) macro X. Giving it a relatively pure name, like X, is good because it prevents you from getting hung up on its meaning (which is up to whoever defines X). It's also nice and short which is good because rows can tend to get quite long.



drj11
2007-07-12 03:33 pm UTC
Sorry fanf. My reply is totally misguided. I guess I didn't spot that PIXEL_FORMAT_RELATION took an argument.

I agree with you.



drj11
2007-04-26 10:08 am UTC
I've never come across the technique before. I think it's really useful; a good compromise between pedestrian C and generating C from another language in which all your datastructures are declared. Note how it doesn't complicate the build procedure: you can still go cc *.c. That's a good thing.

It needs a name, so we can spread the word, and relational macros seems like a good one to me.

It works best with a little bit of C99 sprinkled into the grammar (most places where a trailing comma works are C99).



gareth_rees
2007-04-26 03:33 pm UTC
Yes, the trailing comma is the most important feature in C99 for me. (Second best is snprintf.)


