One of the great things about doing a Minecraft-style voxel engine, where the entire world is made of cubes, is collision detection. It's very cheap to detect which voxels one should compare a game entity's collision hull against, and the math is very simple.

Minecraft: The cubic-voxel world game of the century. |

One of the lame things about doing a Minecraft-style voxel engine is that Notch did it already (not to mention Minecraft's inspiration: Infiniminer). I am not making a cubic Voxel world. I'm not really even making a voxel world. I'm using voxels as an intermediate representation in generating my world. I don't even plan on making it very big, or modifiable. I'm not using marching cubes to make smooth terrain, and I'm not using boxy voxels either.

my voxel game thus far |

After all the generation and stuff, the end product is rendered as polygons (triangles). I could handle collision detection between objects and the world in terms of spheres and triangles, or cylinders and triangles, or any mathematically simple shape and triangles. But triangles are yucky, and I don't like dealing with them. I especially do not enjoy the thought of detecting *which* triangles to test intersections on. Back in the days when I was a big fan of BSP trees, this would have been fun, but not anymore. I've moved on to bigger and better things (or, whatever).

What's nice about cubic voxels is that you can pretty much just index an array using your object's position and see if there are any voxels intersecting. In a Wolfenstein 3D style raycaster this is great for collision detection, it's just so simple to do. That's great and all, but I'm not using cubes. I'm using octahedrons that behave like metaballs and 'merge' when neighboring eachother. This complicates things.

Collision primitives from Cryengine. |

My first ideas consisted of first assuming every object to be a sphere, or pill shape, and do a quick (hah) spatial search for the closest voxels. Then detect which faces are facing the entity, which edges and planes it intersects, calculate where any 45 degree planes may be (corners/edges) and handle the collision accordingly by moving the position of the object, and calculating a new velocity based on elasticity and friction values. This method seemed ideal for handling collisions using the sparse volume representation in memory only, and the sparse volume structure doesn't store information about the shape of the voxels, they are all just cubes as far as it is concerned. It was important to me that a slope of voxels could be ran up/slid down/rolled across/etc smoothly, without any stair-stepping. I wanted a 45 degree plane to behave like one.

The most difficult part of this solution was handling large objects with multiple collision points with different groups of voxels around themselves. I quickly realized that a distance field representation of the whole scene would solve all of my problems - efficiently detecting collisions with all object sizes while behaving as though diagonal voxels were one continuous 45 degree surface.

In this video of a 2D prototype I did last week you can see the green 'voxels' and the distance field rendered as a blue/red gradient that is generated using a simple distance transform that propagates distances over the 'volume' in two passes. The circle is just a point in space which is used to index the distance field, which the value returned from is compared against the circle's radius. If the circle's radius is greater than the value at its center in the distance field then it is intersecting. Then it's a matter of checking the surrounding distance field values and calculating the gradient of area of intersection to determine a sufficient approximation of the 'normal' to properly de-intersect the circle and reflect its velocity with some amount of elasticity for a bounce effect.

The neat thing is that the same distance field properly handles collisions with larger objects, effectively confining them to the 'red' area depending on how large they are. There are still some minor issues though with larger objects because of the low resolution of the distance field. Even though it is being interpolated linearly it still suffers in tight corners because the resolution is just not there to indicate the true distance of each point from the nearest voxel. The distance field is the same resolution as the source voxel volume. This could be alleviated with some super sampling, but that's expensive memory-wise.

As a bonus, there are other uses for distance field representations of a 3D scene. Distance representations are very handy for any line/path tracing, so it lends itself well to calculating lighting and shadowing. It is also useful for AI pathfinding and obstacle avoidance. Fluid dynamics can also benefit from a distance field representation for properly drifting particle effects around the scene realistically. I've already uploaded the same distance field to the GPU as a 3D texture to perform some ambient occlusion in the vertex shader, which has increased the visual depth of the game scene. It will also make the job of illuminating game entities much simpler.

The only downside to the distance field representation is memory usage. So far I'm just using a flat array, because I don't really see a very worthwhile means of compressing data as incoherent as a distance field. Conventional sparse structures, like octrees, will not be of much use. What would probably work best is a more continuous approach, like a cosine transform. Maybe dividing it up into 8x8x8 blocks and performing a discrete cosine transform (ala JPEG) would be a decent means of representing the data in memory? Each distance field query would then result in performing a bunch of cosine calls though, unless some quantization could negate most of them. Compression artifacts would yield bumpy surfaces, however.

Links of interest:

Wikipedia: Distance Transform

Gamasutra: Advanced Collision Detection Techniques