t-e.dev

Replay Storage & Compression

Part of the SkillSurf: Mobile Project

Introduction

I am currently developing a game called SkillSurf: Mobile which utilises replay files as a way of recording user gameplay. The game is a 3D mobile racing game where players race offline through levels for the fastest completion times. Replay files record their runs from start to finish.


These replay files allow the user to play back the motion of a race through a level with perfect accuracy, and they enable other features such as resuming play from a certain point, racing against a ‘ghost’ that follows the recorded run, or showing markers of that run through the level for the player to follow along with.


To further improve this format I designed a custom algorithm that compresses replays to one third of their original size. A huge reduction!


These methods and principles of storing & compressing game replay data can be applied to many types of racing games and other genres, and they provide huge benefits over storing or sharing video files directly.


Speaking of videos & sharing, an average 2-minute replay file clocks in at only ~1.0 MB, whereas a 2-minute 1080p video can easily reach 100 MB if it is of high quality. That’s 100x the size! In most games, sharing a replay of gameplay with a friend involves setting up screen-capture software and then dealing with these large files and performance losses.


Thanks to this replay format and the compression methods discussed in this article my game has a built-in sharing feature that enables users to share replay files over messaging apps such as WhatsApp, Facebook Messenger, Telegram, email, and more through the default Android & iPhone sharing feature. The files automatically open with the game (if installed). On average most replay files are the same size as a photo, so transfer speeds even on a 2G mobile data connection are practically a non-issue.

Replay Data

Replay data files contain all the information required to play back the player’s movement through an in-game level:


Each frame stores the player position (Vector3), the player velocity (Vector3), the player direction (Vector3), and the length of the frame (float).


Information such as the player’s username (string), the name of the level they played (string), the time they scored, and the number of replay frames is also stored within the file.
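In outline, this data might be sketched as follows in Python (the game itself is written in C#; the class and field names here are my own illustrative choices, not the actual file layout):

```python
from dataclasses import dataclass, field

# Hypothetical names for illustration; the real C# types differ.
@dataclass
class ReplayFrame:
    position: tuple[float, float, float]   # player position (Vector3)
    velocity: tuple[float, float, float]   # player velocity (Vector3)
    direction: tuple[float, float, float]  # player direction (Vector3)
    frame_length: float                    # duration of this frame in seconds

@dataclass
class Replay:
    username: str
    level_name: str
    finish_time: float                     # the time the player scored
    frames: list[ReplayFrame] = field(default_factory=list)

replay = Replay("demo_user", "Level 1", 120.0)
replay.frames.append(ReplayFrame((0.0, 1.0, 0.0), (0.0, 0.0, 5.0),
                                 (0.0, 0.0, 1.0), 1 / 60))
# The frame count stored in the file header is just len(replay.frames).
print(len(replay.frames))
```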

Compression

For any non-teleporting player, their position, direction, and velocity will for the most part change in a gradual, sequential way frame by frame.


As can be seen in the example to the left, the player’s position only changes by a small amount from frame to frame; the same is true for their velocity and direction values.


If the player’s game is running smoothly, then the timespan that each frame occupies should be the same for every frame.


This sequential gameplay enabled me to devise two custom compression methods purpose-built for this type of data that I’ve named Differential Comparison Compression & Equivalence Compression.


Equivalence Compression

This compression method is one of the easiest to follow that I've ever seen.


If the data is the same as it was in the previous frame (e.g. if the frame took the same length of time as the one before it, as shown in the example and as is the norm), then instead of writing out the full value (e.g. 0.0166667), use a simple marker that tells the decompression algorithm to substitute the last-seen real value in the marker’s place.


For my algorithm I decided to use the # character as the marker. In a perfectly smooth demo file the 0.0166667 value appears only once, and the # marker stands in for it in every other frame.
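A minimal Python sketch of the idea (the game’s real implementation is in C#; here each frame’s value is treated as a string token):

```python
MARKER = "#"  # same marker character the article uses

def equivalence_compress(values):
    # Replace any value identical to the previous one with the marker.
    out, prev = [], None
    for v in values:
        out.append(MARKER if v == prev else v)
        prev = v
    return out

def equivalence_decompress(tokens):
    # Substitute the last-seen real value wherever the marker appears.
    out, prev = [], None
    for t in tokens:
        prev = prev if t == MARKER else t
        out.append(prev)
    return out

frame_times = ["0.0166667"] * 5
packed = equivalence_compress(frame_times)
print(packed)  # ['0.0166667', '#', '#', '#', '#']
assert equivalence_decompress(packed) == frame_times
```

For a perfectly smooth run, the full value survives only in the first frame, exactly as described above.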

Differential Compression


As previously noted, the values do not tend to change dramatically from frame to frame.


Take two frames where the player’s x position is 1.55555 and then 1.65555. Instead of storing 1.65555 in the file, we could instead store >0.1, which is several characters shorter. We can do this with > (increase) or < (decrease) wherever the differential value is shorter than the original full-length value.


These > & < (& #) markers tell the decompression algorithm how to parse the data as it reads & loads the file into the game’s memory frame by frame.
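A Python sketch of the round trip, assuming > marks a positive delta and < a negative one (the real format’s exact token rules aren’t shown here, so this is only an illustration):

```python
def diff_compress(values, places=5):
    # Write the first frame in full, then emit a signed delta
    # (">d" for +d, "<d" for -d) whenever the delta prints shorter.
    out, prev = [], None
    for v in values:
        full = f"{v:.{places}f}"
        if prev is None:
            out.append(full)
        else:
            d = round(v - prev, places)
            token = (">" if d >= 0 else "<") + f"{abs(d):g}"
            out.append(token if len(token) < len(full) else full)
        prev = v
    return out

def diff_decompress(tokens, places=5):
    out, prev = [], None
    for t in tokens:
        if t[0] == ">":
            prev = round(prev + float(t[1:]), places)
        elif t[0] == "<":
            prev = round(prev - float(t[1:]), places)
        else:
            prev = float(t)
        out.append(prev)
    return out

xs = [1.55555, 1.65555, 1.65555, 1.55555]
packed = diff_compress(xs)
print(packed)  # ['1.55555', '>0.1', '>0', '<0.1']
assert diff_decompress(packed) == xs
```

Rounding the deltas to the format’s precision keeps the round trip lossless despite floating-point noise.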

The LZ4 Algorithm

The final way I decided to save space was with a pattern-recognising compression technique: the LZ4 algorithm.


Pattern recognition compression works by searching for repeating data patterns and replacing them with markers, storing each marker in a lookup table alongside the chunk of repeated data that it needs to copy back into place when decompressing the file. (LZ4 itself actually encodes matches as back-references to earlier data rather than a separate table, but the lookup-table picture is a good mental model.)


For example, at this stage in the compression process we have already replaced every "0.0166667 " with a "#" marker through equivalence compression. Wherever this is done we are left with "[#]]", which is 4 characters long. LZ4 can scan the file, see that there are 500 instances of this 4-character pattern, and replace all 500 of them with a single marker. It then adds the "[#]]" data to a lookup table alongside that marker.

To decompress, the algorithm opens the lookup table and replaces each marker with its corresponding data wherever it appears in the file.


The pattern recognition that LZ4 performs works especially well at removing the file-size bloat caused by the format’s use of brackets.
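To see the effect on marker-heavy data without pulling in an LZ4 binding, here is a sketch using Python’s built-in zlib, another LZ-family compressor that exploits repeats in the same way (the token and delimiter layout here is illustrative, not the format’s exact byte layout):

```python
import zlib

# Marker-heavy text like the equivalence-compressed frame times:
# one full value followed by hundreds of repeated marker tokens.
data = ("[0.0166667]" + "[#]" * 500).encode()

compressed = zlib.compress(data, 6)
ratio = len(compressed) / len(data)
print(f"{len(data)} bytes -> {len(compressed)} bytes ({ratio:.1%})")

assert zlib.decompress(compressed) == data  # lossless round trip
```

The repeated bracketed markers collapse to almost nothing, which is exactly the bloat-removal effect described above.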

Results

There are two aspects to the performance of the algorithm; processing speed & space-saved.


Regarding performance, I didn’t want noticeable lag spikes as the game saved (compressed) or loaded (decompressed) replay files. My goal was to load a 1-minute replay file in less than the blink of an eye: 3600 frames (60 seconds at 60 fps) in under 100 milliseconds.


My C# implementation for Unity (excluding file read/write time) performed so well that it could compress or decompress an entire hour’s worth of gameplay in under a third of a second, as the benchmark results below show:


Equivalence & Differential Compression: 0.0006ms per frame

LZ4 Compression: 0.0009ms per frame

Total Compression Time per Frame: 0.0015ms


Equivalence & Differential Decompression: 0.0012ms per frame

LZ4 Decompression: 0.0001ms per frame

Total Decompression Time per Frame: 0.0013ms
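Per-frame timings like these can be gathered with a simple harness; a Python sketch of the approach (the actual benchmarks were run in C#/Unity, and the compression pass here is a dummy stand-in):

```python
import time

def ms_per_frame(fn, frames, repeats=5):
    # Run fn over the whole frame list several times and report the
    # best wall-clock time per frame, in milliseconds.
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(frames)
        best = min(best, time.perf_counter() - start)
    return best * 1000 / len(frames)

# Dummy stand-in for a compression pass over 3600 frames (60 s at 60 fps).
frames = [0.0166667] * 3600
cost = ms_per_frame(lambda fs: [f * 2 for f in fs], frames)
print(f"{cost:.6f} ms per frame")
```

Taking the best of several runs filters out scheduler noise, which matters when the quantity being measured is microseconds per frame.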


The data format alone is already effective at reducing file sizes from the megabytes of a traditional screen-capture recording to only a few hundred kilobytes. I kept my focus on speed rather than perfect bit-level compression and set my target at a 50% file-size reduction.


To benchmark, I used several different replay files, as each compresses differently depending on how ‘random’ its data is. I recorded the results as the percentage reduction in file size from the original. All files were between 60 and 70 seconds long.


Equivalence Compression Only: 22% File Size Reduction

Differential Compression Only: 11% File Size Reduction

Equivalence & Differential Compression: 23% File Size Reduction

LZ4 Compression Only: 48% File Size Reduction

LZ4 & Equivalence & Differential Compression: 65% File Size Reduction


This means that on average a 60-second, 3600-frame file would be reduced from 334 KB to 117 KB.

Conclusion

Despite the relatively dull nature of data manipulation, I found the work incredibly satisfying; seeing the file sizes shrink and my algorithms at work was a pleasure, and I look forward to working with similar systems in the future.


I achieved the performance results I desired and earned praise, and an A grade from my tutors, for the final-year assignment that this work contributed towards.

Future Plans

Despite the results being within my specifications, I know that they’re far from perfect. I’ll need to conduct some tests with multiple algorithms before I can know for sure how effective the ones I have chosen are.


LZ4 is lightning fast, and this may help on lower-end mobile phones. However, after restructuring how the saving & loading system works, I’ve moved these processes to a separate thread and, on most devices, an entirely separate CPU core. Nearly all mobile devices use a mixture of “big” CPU cores and “little” CPU cores.


Replays start loading whenever a level loads so that they can be instantly accessed and played; this means the decompression algorithm has entire seconds’ worth of time to load replay files on the otherwise-idle “little” CPU cores while the “big” cores work on loading the level. Remember, a 60-second replay file takes only 4.68ms (0.00468 seconds) to decompress. This means we could decompress hundreds of files into memory before we would run into any issue with a replay file not showing as ready to play immediately upon loading the level.
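This load-in-the-background pattern can be sketched in Python with concurrent.futures (the file name and loader below are hypothetical stand-ins for the game’s C# implementation):

```python
from concurrent.futures import ThreadPoolExecutor

def decompress_replay(path):
    # Stand-in for the real decompression pass; returns a loaded replay.
    return f"replay:{path}"

# Kick off replay decompression on a worker thread while the main
# thread gets on with loading the level.
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(decompress_replay, "ghost_run.rpl")
    # ... main thread loads the level here ...
    ghost = future.result()  # ready well before the level finishes loading

print(ghost)
```

On a big.LITTLE device, the OS scheduler is free to place the worker on an idle “little” core, which is exactly the behaviour described above.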


Replay files are saved when the user completes a level. There is a 3-second countdown between completing a level and restarting for another run. This means we again have an abundance of CPU time to save and compress several hours’ worth of data on our “little” CPU cores.


In both cases, the algorithm is much faster than we need it to be. This sounds good, but in my eyes it is a bit of a negative. I would rather the file size be smaller and the algorithm a little slower, since plenty of CPU time is available. This is why in the future I will be looking into different compression methods and different ways of storing the data. The hope is that I can find a balance between file space and algorithmic speed. Perhaps I’ll even implement multiple methods and switch between them depending on device speed.


One important thing to note is that throughout this I did not benchmark the file read or write speeds of the devices. This is because these values are highly variable and in my eyes cannot be relied upon. I aggressively use the device RAM for handling replay data to prevent bottlenecks in non-volatile storage. At present this isn’t an issue, as the game & replay files consume minimal RAM and I don’t envision anyone amassing the hundreds of replay files needed to cause any issue in this area. If this should change I may have to get clever about my approach. Time will tell.


Watch this space, I’ll link to an article on the next iteration of this system here when it’s finished.