Wednesday, January 4, 2012

Automatic zero-initialization of C++ class instances

C++ has a very nice but not well-known feature that lets you initialize a class-type object with a bunch of zeroes. It's called zero-initialization, and it happens when you request value-initialization (for example, new T() instead of new T) of a class type that has no user-defined constructor.
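
To make the difference concrete, here is a minimal sketch of what a conforming compiler is expected to do. The sample_pod type and the function name are made up for this illustration; the buffer is deliberately filled with garbage first so that the zeroes can only come from value-initialization.

```cpp
#include <cassert>
#include <cstring>
#include <new>

// Hypothetical aggregate used only for this illustration.
struct sample_pod {
  int    count;
  double ratio;
  void  *next;
};

// Fills a buffer with non-zero garbage, then constructs a sample_pod in it
// with value-initialization (note the trailing parentheses). Returns true if
// every member reads back as zero.
bool value_init_zeroes_members() {
  alignas(sample_pod) unsigned char storage[sizeof(sample_pod)];
  std::memset(storage, 0xFF, sizeof storage);

  sample_pod *p = new (storage) sample_pod();  // value-initialization
  assert(static_cast<void *>(p) == static_cast<void *>(storage));

  bool ok = p->count == 0 && p->ratio == 0.0 && p->next == nullptr;
  p->~sample_pod();
  return ok;
}
```

On a compiler that implements value-initialization correctly, value_init_zeroes_members() should return true. Dropping the parentheses (new (storage) sample_pod;) leaves the members indeterminate, which is why that variant is not read back here.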

The reason it is not well-known is simply that it is not fully implemented by some compilers widely used in the industry (namely MSVC). This is a shame, because it's a very easy way to quickly switch from dumb old C calloc/memset'd structures to classes.
void snaflu() {
  // The old C way: calloc hands back zero-filled memory
  my_huge_struct *huge = static_cast<my_huge_struct *>(calloc(1,
      sizeof(my_huge_struct)));
  // ...
}

void snaflu2() {
  my_huge_struct *huge = new my_huge_struct();
  // This should zero every member, but it doesn't work on VS:
  // there it's the same as new my_huge_struct;
}
Doing this was a necessity for me, since we have some structs with several hundred fields, and manually writing an init() method or an initializer list would be, although possible, really suboptimal and would not make it through quality reviews.

That being said, the solutions I have to offer are also suboptimal and acceptable only in specific contexts. The best solution, as always, is to get a working compiler.

A first solution would be to call placement new on zero-initialized memory:
void snaflu() {
  my_huge_struct *huge = new (calloc(1, 
      sizeof(my_huge_struct))) my_huge_struct;
  // use...
  huge->~my_huge_struct();
  free(huge);
}
This has several advantages, the main one being that it is quite simple and very similar to the first technique. The second is that it does not require modifying the instantiated class.

The main drawback is that it's ugly.

The second one is that allocating arrays suddenly becomes super complicated:
void multiSnaflu(int count) {
  my_huge_struct *huges = new (calloc(count,
      sizeof(my_huge_struct))) my_huge_struct[count];
}
Wait, C++ doesn't have a placement-delete (or a way to call the destructor on a whole array). We would have to destroy the elements by hand after use, which implies a lot of boring stuff like remembering how many objects the array contains and iterating over them.
A vector would be a nicer solution, but writing an allocator is not really what one could call a "simple solution".
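
For the record, the bookkeeping described above could look like the sketch below. It is only one way to do it: create_zeroed_array, destroy_zeroed_array and the tracked type are hypothetical names invented for this example, and the elements are constructed one at a time rather than with the array form of placement new.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical element type; the counter only exists so that the example can
// observe how many objects are currently alive.
struct tracked {
  static int alive;
  int value;  // deliberately left untouched by the constructor
  tracked()  { ++alive; }
  ~tracked() { --alive; }
};
int tracked::alive = 0;

// Allocates zero-filled storage for `count` objects and constructs them one
// by one. If a constructor throws, the objects already built are destroyed
// and the storage is released before the exception propagates.
template <class T>
T *create_zeroed_array(std::size_t count) {
  assert(count > 0);
  void *raw = std::calloc(count, sizeof(T));
  if (!raw) throw std::bad_alloc();
  T *first = static_cast<T *>(raw);
  std::size_t built = 0;
  try {
    for (; built < count; ++built) new (first + built) T;
  } catch (...) {
    while (built > 0) first[--built].~T();
    std::free(raw);
    throw;
  }
  return first;
}

// Destroys the elements in reverse order of construction, then frees the
// storage. The caller has to remember `count`: nothing stores it for us.
template <class T>
void destroy_zeroed_array(T *first, std::size_t count) {
  if (!first) return;
  for (std::size_t i = count; i > 0; --i) first[i - 1].~T();
  std::free(first);
}
```

Usage would be a create_zeroed_array<tracked>(n) call paired with destroy_zeroed_array(ptr, n), which shows exactly the annoyance mentioned above: the element count has to travel alongside the pointer.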

Another solution, more intrusive, is to overload operator new for your class.
struct my_huge_struct {
  void *operator new(size_t size) {
    void *mem = ::operator new(size);
    std::memset(mem, 0, size);
    return mem;
  }
};
This will set all scalar members to 0 before the object is constructed (as long as the bit pattern for zero is indeed all zero bits on your platform). Members stored by value inherit this behavior too.
Note that this is absolutely not guaranteed by the standard: it works only as long as the constructor leaves your scalars alone and the compiler keeps the values that were written before construction.
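
The "zero bit pattern" assumption can at least be checked. The following is a small sketch, with helper names made up for this post, that copies all-zero bytes into a few scalar types and compares the result with the corresponding zero value. It assumes nothing about my_huge_struct itself.

```cpp
#include <cassert>
#include <cstring>
#include <limits>

// IEEE 754 (IEC 559) represents +0.0 with all bits clear.
static_assert(std::numeric_limits<double>::is_iec559,
              "double is not IEEE 754; all-zero bytes may not mean 0.0");

// Copies all-zero bytes into a T and returns the resulting value.
template <class T>
T from_zero_bytes() {
  unsigned char zeros[sizeof(T)] = {};  // every byte is 0
  T value;
  std::memcpy(&value, zeros, sizeof(T));
  return value;
}

// True if zero bytes read back as zero for the scalar kinds tested here.
bool zero_bytes_mean_zero() {
  return from_zero_bytes<int>() == 0 &&
         from_zero_bytes<float>() == 0.0f &&
         from_zero_bytes<double>() == 0.0 &&
         from_zero_bytes<void *>() == nullptr;
}

// Meant to be called once at startup in debug builds.
void require_zero_friendly_platform() {
  assert(zero_bytes_mean_zero());
}
```

On mainstream platforms this should hold. The check says nothing about pointers to members, whose null representation is frequently not all zero bits.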

For better readability, a base class could prove useful:
struct ZeroInitialized {
  // Must be static: operator new is implicitly static and has no object
  // to call a member function on.
  static void *alloc_zero(size_t size) {
    void *mem = ::operator new(size);
    std::memset(mem, 0, size);
    return mem;
  }
  void *operator new(size_t size) { return alloc_zero(size); }
  void *operator new[](size_t size) { return alloc_zero(size); }
  // Yep, we also need this one for arrays
};

struct my_huge_struct : public ZeroInitialized {
  // ...
};
Also note that scope-bound objects will not be zero-initialized with this technique, since they never go through operator new. The quick answer is to use auto_ptrs to manage the scope binding. The long answer is longer, and I'm not even sure about it. I may come back to it later.
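
Here is a rough sketch of that quick answer. It uses std::unique_ptr instead of auto_ptr, which is a substitution on my part (auto_ptr was deprecated in C++11 and removed in C++17), and small_struct is a hypothetical stand-in for the real structure. The first function only inspects the raw storage returned by the allocation function, before any object exists in it, so it does not depend on what the constructor does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <new>

struct ZeroInitialized {
  static void *alloc_zero(std::size_t size) {
    void *mem = ::operator new(size);
    std::memset(mem, 0, size);
    return mem;
  }
  void *operator new(std::size_t size) { return alloc_zero(size); }
  void *operator new[](std::size_t size) { return alloc_zero(size); }
};

// Hypothetical stand-in for the real structure.
struct small_struct : public ZeroInitialized {
  int id;
  double weight;
};

// Checks the allocation function alone: the storage it hands back, viewed as
// raw bytes before any object is constructed in it, must be all zeroes.
bool storage_is_zeroed() {
  void *mem = small_struct::operator new(sizeof(small_struct));
  const unsigned char *bytes = static_cast<const unsigned char *>(mem);
  bool ok = true;
  for (std::size_t i = 0; i < sizeof(small_struct); ++i) {
    ok = ok && bytes[i] == 0;
  }
  ::operator delete(mem);
  return ok;
}

// The object lives on the heap, so the class-specific operator new runs, but
// it is released automatically when `owner` goes out of scope.
int scoped_use() {
  std::unique_ptr<small_struct> owner(new small_struct);
  assert(owner.get() != nullptr);
  owner->id = 42;  // written before being read
  return owner->id;
}
```

The cost of this approach is one heap allocation per "local" object, which is probably acceptable for a structure with hundreds of fields but would be wasteful for small ones.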


2 comments:

  1. Oh my - this is bad :) Both of those techniques are kinda dangerous, and will only work on integers, pointers and other "atomic" types.

    But what about floating point types? You won't set its value to 0.0 by zeroing its memory.

    "Also note that scope-bound objects will not be zero-initialized with this technique."

    ...thus, it makes this technique completely useless...

    Solution: Manually initialize all members to zero :)

    Better solution: Create a custom build tool that will scan the source code and generate a method for initializing all class members for you.

    Also, if you're afraid of uninitialized members, you can run some tests in a debug build.

    For example, you can fill the memory with some non-zero value (this value should be determined in the new operator, by scanning the allocated memory for an unused byte value), and at the end of the constructor, you scan the memory for unmodified bytes.

    Cheers !:)

  2. Actually, all-zero bits is float 0.0 on any machine still in use at present, and on any machine likely to be built in the future by non-insane people.

    I, also, wish C++ would just initialize all allocations to 0 so we could all stop jacking around with this. But some people love to masturbate.
