Networking Game State and Events

Distributing the game state among multiple machines can be tricky business, depending on the complexity of your game state itself. Most games treat the game state as an array of entities, sometimes indexed using some sort of a scene graph or other hierarchical organization. At the end of the day, though, it's just a list.

To make designing the game more flexible (what with scripting, etc) usually these entities are represented as merely an entity index, and a chunk of data associated with that entity. This data 'blob' is then structured into meaningful values through the scripting system, which divides it up into entity state variables like origin, velocity, model index, etc.

On the networking side of things the engine discerns what state changes have been made to each entity that are relevant to the machine on the other end of the connection. Typically this consists of the server trying to figure out what information is relevant to each client about all the entities in the game, based on what the server knows the client has received so far, and what the client needs to know for its perspective of the game state.

This entails a complex system of backing up copies of the game state to generate a delta-compressed update unique to each client's situation, based on what the client has acknowledged having received as far as the game state is concerned. Fortunately, things are much simpler for the client, only being required to update the server concerning the player's activities in a much simpler scope.

The system I've been developing the past few days simplifies things, for the most part. It could also potentially simplify peer-to-peer type games, as an emergent property of its design. But for now I'm focusing on utilizing a server for simulating most of the game state, while allowing clients to submit their own simulation states. The goal, for simplicity's sake, is to utilize the same system on both ends for dealing with generating and sharing the game state.

This requires a system that works virtually like two servers talking to one-another. The only difference is that the client has a player manipulating the game state that it's responsible for relaying to the server, which in turn relays them to other clients.

One strategy for simplifying the description of entity types, while simultaneously simplifying serialization of entity states for network transmission, is to utilize entity type templates. Instead of having a scripting language that describes (in some sort of VM-executed code) setting the state of an entity, one variable at a time, why not just have static entities that are essentially copied to real entities based on specific triggers or events occuring?

Then, all that's needed to be transferred over the network when an entity's non-continuous state evolves (eg: entity flags, boolean states, model index, physics type, etc) is an ID value for the specific entity template that the entity should become. This eliminates the need to submit any specific entity properties outside of the continuous state (like origin, velocity, etc). The server could send these templates to clients, so that servers can be running customized games.

Utilizing an event-system, for things like player chat messages, player actions (jump, shoot, kill), entities changing their type/template, etc. allows the game state to be divided into discrete 'frames' that are not based on units of time, but units of actual state change. Each machine, upon generating an event, both executes the event and queues it to be distributed to remote systems.

Clients distribute their locally-generated events to the server (through a Quake3 style re-send until acknowledgement received method) which execute the events, and queue them to be further distributed to all clients. One global queue can be used on the server, in a circular buffer, and each client connected will continuously provide the server with acknowledgement of the last received event.

Right now I am implementing the events system which consists of an event execution mechanism (giant switch/case) and data parsing. Data comes in as a simple void* pointer, which is either forwarded from a network-parsed event, or from wrapper functions which allow engine code to invoke specific events with actual variable parameters by lumping them into data identical to what can be found in a network event.

Links of interest:


Multiplayer, Netcode, Etcetera

A big giant portion of my game project is multiplayer. Since the days of Quake3 I have found it hard to imagine working on much else than a game which can be played online among friends, enemies, and total strangers. I've also been an avid fan of Counter-Strike since the beta days, and the ultra competitive game play enthuses me to this very day.

Anybody who played QuakeWorld (TF, 2fort4 anyone?) surely remembers the requisite skill of being able to lead your targets in order to actually hit them. This was acceptable to us back then. It was considered a fact of life, and was simply unavoidable. It was also something that could be improved upon.

During what I like to think of as "The Counter-Strike Days" a programmer at Valve Software by the name of Yahn Bernier developed a new approach to networking multiplayer games in the Half-Life engine. It was a two-pronged strategy that consisted of client-side prediction and server-side lag compensation.

Client-side prediction is actually just a technique for disguising the fact that the server is doing all the authoritative simulation and that there is internet-induced latency in communicating the state of the simulation to the player clients which are interacting with it. It goes a long way toward making the game feel more responsive, without actually making the simulated interaction of game objects more responsive. Only superficial aesthetic aspects of the game state can be predicted, and are not necessarily an accurate depiction of the actual game state.

Server-side lag compensation is the closest thing to an actual solution which minimizes the effect network latency has on game play and game simulation responsiveness. More often than not, it works quite well. If you aim directly at an opponent, and fire, the hit will register almost as accurately (but not as immediately) as if the game were being run on the local machine. The server effectively re-winds the game state before performing physics and intersection tests.

On paper it sounds great. In practice, it isn't perfect at coping with latency jitter - or, latency variance (fluctuating ping) among players, which can throw off the compensation wildly. This is a fact that is not well-known. It is the heart of clips being emptied point-blank to no avail. It is also a good reason to avoid playing over WiFi connections, which are prone to random interference and ping spikes.

One more strategy, to smoothing out the game experience, is interpolation. In most modern games, this involves the client storing up multiple updates from the server, so that it can play them as smoothly and accurately as possible - by knowing the starting point and following point of an objects motion. In Counter-Strike Source, for example, this is fixed at 100 milliseconds. So that no matter what the actual network latency is, it is always compounded with an extra 100 millisecond delay for the sake of smoothing out object motion.

This image of Counter-Strike Source with sv_showhitboxes enabled displays the last known position of an entity as received from the server before the 100 millisecond interpolation delay that is used to smooth movement

At the end of the day, it seems like this is a lot of work just to make the game as responsive as possible without actually reducing or eliminating the actual latency between clients and servers. As long as the server is the only 'true' simulation of the game, these techniques or minimizing the effects of latency on game play are as good as it's going to get without upgrading the actual internet itself.

The reason that many games use the above techniques is because they are the best there is right now, and nobody believes that anything better can be done. Thus, nobody really explores different options. They see server-side simulation authority as imperative, because it's the only way to make the game secure from cheaters hacking and hackers cheating. My strategy is to do everything different than what is considered 'right' by many developers.

Firstly, I believe in letting the player's simulation occur client-side. I believe that game play can only be furthered by removing the lag component almost entirely from the player interacting with the simulation. This would be the equivalent of affording some game authority to the client-side prediction being used already.

Of course, the issue of hacking and abuse is the first thought to cross the minds of virtually anybody who understands the difference between client-side and server-side game logic. This is alleviated using a simple array of sanity checks for various 'vulnerable' circumstances. There is always a requisite state, or group of states, which allows a particular following state.

By closely examining the evolving state of a client's side of the game, on the server, it is easy to determine the likelihood that the game has been altered, hacked, etc. Each client will have a dynamic score indicating the probability of game tampering, and a threshold for this value which will invoke consequences (eg: kickban) once reached.

So, lets say that we are simulating a player's influence on the game locally on the client, and we have our hacking detection on the server, and everything is peachy. There's still the issue of latency when the game state is sent to clients. Clients will be interacting with an older state than other players are seeing. If two players are in a firefight, and one tries to take cover, he could end up dropping dead after reaching a safe spot simply because the other player could still shoot him where he was out in the open. This is not fun!

What if we could just predict where other players could be all the time? Or at least guess, so that we're closer to seeing where they actually are, as opposed to where they were? This is called extrapolation. Most games only rely on extrapolation when there is a lag spike, or dropped update packet, and the client's simulation runs out of updates to interpolate between to keep things moving smoothly.

I propose utilizing extrapolation exclusively to substitute for interpolation. When an update is received it should be used to project where the object currently is (based on latency) and where it will be by the estimated time the next update will be received. In the interim the engine can begin interpolating from wherever the object is at the moment (end of previous update extrapolation) to this predicted position.

This will not be nearly as accurate at showing the actual path as existing interpolation/delay methods, but since the server isn't the boss anymore this doesn't even matter! Infact, it will be extraordinarily inaccurate for higher pings.

At a round-trip time of 50 milliseconds (ping), an object will be 25 milliseconds ahead of the position received. If updates are 20hz then we can add another 50 milliseconds (plus or minus whether the update is early/late). So all we need to do is project the received origin out by 75 milliseconds and start interpolating the position to this new spot. A better approach would be to estimate an interpolation vector, scaled so the position will reach the projected position at 75 milliseconds, but continue moving the object with that same vector if the next update doesn't make it in time. When a new update comes, it will cause a smooth correction, and the accuracy no longer is significant as far as manipulating the state of the object locally (shooting and killing it).

Now objects will be more closely where they really are for clients and the server. At least, they won't be far behind like existing methods force them to be. This allows for firefights and interactions to be far more engaging, because it will drastically reduce the take-cover-and-die phenomenon.

The one side-effect of extrapolation is rubber-banding. Let's say we are viewing an object that is stationary, and our client has 200 ping (100ms latency). The object begins moving, and we receive the update 100ms later that is has moved a certain amount so far, and is moving at a certain velocity. Now, we take our stationary local copy of the object and have to accelerate it to the position we predict it will be by the next update, moving it faster than it actually was. Once we receive the second update about its motion, we should be pretty synced up, so long as it keeps moving in a straight line, but there will be a noticeable drop in its movement speed, back to its real speed, when that occurs.

This image depicts a player path (moving upwards) and what it looks like linearly interpolated from one predicted position to the next. If one were to also extrapolate velocity this could be smoothed further.

Inversely, when the object stops moving, we will still be simulating it moving beyond its stopping point, and our simulation will be forced to bounce it back to its resting position. Objects will effectively be racing around to be where they really are, always drifting around. This can be hidden by not allowing abrupt movement changes using low acceleration and friction values, but is not always fun because it lowers the game pace. Further smoothing of positions and velocities can be applied, but ultimately the smoother the result, the less real-time it will be. You will be trading smoothness for delay.

Links of interest:


Voxel Polygonization and Game Engines

 Hi, and welcome to my blog. I wanted to have a place online for the world to see what I am doing, something to keep me focused. Right now I am currently polishing an algorithm I had conceived a few years ago for generating a triangular mesh from a voxel volume. From the algorithm's inception I've understood what I wanted it to output. It wasn't until recently whilst starting on my (latest) game engine project that I finally had the insight to help me solve writing the algorithm.

 I was trying to nail down what sort of world I wanted my current game project to present to players. In my previous game engine project, Revolude, the terrain was a simple heightmap and a static ROAM type algorithm was used to perform a simple polygon reduction before dumping each terrain chunk to a display list consisting of triangle strips and fans.

Revolude, with some pseudo-trees and buildings, and some missiles hitting the ground.

 I was just about to opt for creating a much more involved version of this same type of terrain when I had a stroke of ingenuity. The algorithm is logically very simple, and produces the equivalent output of what I would like to call the dual of a cubrille mesh. In other words, it takes a Minecraft-esque type mesh and converts every face into a single point, and each point into a triangle. The dual of a cube is an octahedron, and the algorithm I wanted to create essentially produced an octahedral mesh from a voxel volume.

A cube, and its 'dual' - an octahedron. Notice how each of the six cube corners becomes a single triangle, and each cube face is collapsed into its center point.

From an outside standpoint, this really seems very simple to do. One approach could be to first generate a simple boxy cube mesh, and then try to brute force convert it into its dual, an octahedral mesh. This became an appealing option after only a while, but I only approach problems with the intent of devising a novel solution, so that was out of the question.

Ideally, I was hoping to create an algorithm in some way similar to marching cubes - where each cube was inspected along with its neighbors to produce geometry. Something that achieved the desired output through such an immediate method was the goal. This approach, however, became much more involved than I had initially believed it could be.

The end product works in multiple passes, on the volume as a whole. The first pass examines 8 voxels at a time, determining whether vertices should be placed, and where. The second pass connects vertices to form edges. Edges can only exist between vertices where the dot product of their normals is greater than or equal to zero. So, vertices must either be facing in the same direction, or up to 90 degrees apart. The third pass involves detecting triangles formed by edges. This is done by searching for 'complement' edges for each vertex, where two edges connected to the vertex are connected to a third edge.

The last operation involves detecting all the left-over quads formed by edges, by doing a similar search to the triangle detecting pass. Instead of looking at the edges connected to a point, we look at the edges connected to an edge, and see if there is an opposite edge on the ends of its connected edges.

These operations, naively implemented, can be extremely slow. Fortunately, it is not too expensive to store extra connectivity information which is used to quickly search edges, vertices, and triangles for related geometry. The end product can then be reduced to simple vertex information and vertex indices for triangles.

Here is a quick video of the algorithm at work with an evolving volume. Forgive the choppiness of the video, my little netbook isn't very awesome. The slowest part of this demo is generating the volume itself. The next step will be to utilize a compressed data structure as the data source, as opposed to raw volume data.