In my previous article, Create Custom Binary File Formats for Your Game's Data, I covered the topic of using custom binary file formats to store game assets and resources. In this short tutorial we will take a quick look at how to actually read and write binary data.
Note: This tutorial uses pseudo-code to demonstrate how to read and write binary data, but the code can easily be translated to any programming language that supports basic file I/O operations.
Bitwise Operators
If this is all unfamiliar territory for you, you will notice a few strange operators being used in the code, specifically the &, |, << and >> operators. These are standard bitwise operators, available in most programming languages, which are used for manipulating binary values.
For more information on bitwise operators, see:
- Understanding Bitwise Operators
- The documentation for your programming language of choice
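As a quick illustration of the operators mentioned above, here is a minimal Python sketch (Python is chosen only for the example; the variable names are made up) that splits a 16-bit value into two bytes and joins them again:

```python
# Split a 16-bit value into its two bytes, then put it back together.
value = 0x1020

# >> shifts the top byte down; & 0xFF masks off everything but the low 8 bits.
high = (value >> 8) & 0xFF
low = value & 0xFF

# << shifts the high byte back into place; | merges the two bytes.
rebuilt = (high << 8) | low

print(hex(high), hex(low), hex(rebuilt))  # 0x10 0x20 0x1020
```

The same shift-and-mask pattern is what the read and write functions later in this tutorial rely on.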
Endianness and Streams
Before we can read and write binary data successfully, there are two important concepts that we need to understand: endianness and streams.
Endianness dictates the byte order of multiple-byte values within a file or within a chunk of memory. For example, if we had a 16-bit value of 0x1020, that value can either be stored as 0x10 followed by 0x20 (big-endian) or 0x20 followed by 0x10 (little-endian).
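To see the two byte orders side by side, the following sketch uses Python's built-in int.to_bytes() and int.from_bytes() (Python is used here only for illustration) on the same 0x1020 value:

```python
value = 0x1020

# Serialise the same 16-bit value in both byte orders.
big = value.to_bytes(2, byteorder="big")
little = value.to_bytes(2, byteorder="little")

print(big.hex())     # 1020
print(little.hex())  # 2010

# Reading the bytes back with the matching order recovers the value...
same = int.from_bytes(big, byteorder="big")

# ...while reading them with the wrong order silently gives a different number.
wrong = int.from_bytes(big, byteorder="little")
print(hex(same), hex(wrong))  # 0x1020 0x2010
```

Nothing in the bytes themselves records which order was used, which is why a file format has to state its endianness.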
Streams are array-like objects that contain a sequence of bytes (or bits in some cases). Binary data is read from and written to these streams. Most programming languages provide an implementation of binary streams in one form or another; some are more convoluted than others, but they all essentially do the same thing.
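As one example of what a stream can look like in practice, Python offers both a plain array-like type (bytearray) and a file-like wrapper (io.BytesIO). This is only a sketch of those two built-ins, not the stream type the pseudo-code below assumes:

```python
import io

# A bytearray is the simplest array-like stream: index it to read a byte.
stream = bytearray([0x10, 0x20, 0x30])
position = 0
first = stream[position]  # 0x10
position += 1             # we have to track the read position ourselves

# io.BytesIO wraps the same bytes in a file-like object that tracks
# the position for us.
file_like = io.BytesIO(bytes(stream))
chunk = file_like.read(2)  # the first two bytes
where = file_like.tell()   # position of the next byte to read
```

The pseudo-code in this tutorial follows the first style: an array-like object plus an explicit position.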
Reading Binary Data
Let's start by defining some properties in our code. Ideally these should all be private properties:
__stream   // The array-like object containing the bytes
__endian   // The endianness of the data within the stream
__length   // The number of bytes in the stream
__position // The position of the next byte to read from the stream
Here is an example of what a basic class constructor might look like:
class DataInput( stream, endian ) {
    __stream = stream
    __endian = endian
    __length = stream.length
    __position = 0
}
The following functions will read unsigned integers from the stream:
// Reads an unsigned 8-bit integer
function readU8() {
    // Throw an exception if there are no more bytes available to read
    if( __position >= __length ) {
        throw new Exception( "..." )
    }
    // Return the byte value and increase the __position property
    return __stream[ __position ++ ]
}

// Reads an unsigned 16-bit integer
function readU16() {
    value = 0
    // Endianness needs to be handled for multiple-byte values
    if( __endian == BIG_ENDIAN ) {
        value |= readU8() << 8
        value |= readU8() << 0
    } else { // LITTLE_ENDIAN
        value |= readU8() << 0
        value |= readU8() << 8
    }
    return value
}

// Reads an unsigned 24-bit integer
function readU24() {
    value = 0
    if( __endian == BIG_ENDIAN ) {
        value |= readU8() << 16
        value |= readU8() << 8
        value |= readU8() << 0
    } else {
        value |= readU8() << 0
        value |= readU8() << 8
        value |= readU8() << 16
    }
    return value
}

// Reads an unsigned 32-bit integer
function readU32() {
    value = 0
    if( __endian == BIG_ENDIAN ) {
        value |= readU8() << 24
        value |= readU8() << 16
        value |= readU8() << 8
        value |= readU8() << 0
    } else {
        value |= readU8() << 0
        value |= readU8() << 8
        value |= readU8() << 16
        value |= readU8() << 24
    }
    return value
}
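The pseudo-code above might translate to Python roughly as follows. This is a sketch under a few assumptions: the snake_case method names, the BIG_ENDIAN and LITTLE_ENDIAN constants, the _read_unsigned helper and the choice of EOFError (with its message) are all mine, and the three multi-byte readers are folded into one loop rather than written out separately.

```python
BIG_ENDIAN = "big"        # hypothetical constants for this sketch
LITTLE_ENDIAN = "little"


class DataInput:
    def __init__(self, stream, endian):
        self._stream = stream        # array-like object containing the bytes
        self._endian = endian        # endianness of the data in the stream
        self._length = len(stream)   # number of bytes in the stream
        self._position = 0           # position of the next byte to read

    def read_u8(self):
        # Raise an error if there are no more bytes available to read.
        if self._position >= self._length:
            raise EOFError("no more bytes available to read")
        value = self._stream[self._position]
        self._position += 1
        return value

    def _read_unsigned(self, byte_count):
        # Little-endian: the first byte read is the least significant
        # (shift 0). Big-endian: the first byte read is the most significant.
        shifts = range(0, 8 * byte_count, 8)
        if self._endian == BIG_ENDIAN:
            shifts = reversed(shifts)
        value = 0
        for shift in shifts:
            value |= self.read_u8() << shift
        return value

    def read_u16(self):
        return self._read_unsigned(2)

    def read_u24(self):
        return self._read_unsigned(3)

    def read_u32(self):
        return self._read_unsigned(4)


data = bytes([0x10, 0x20, 0x30, 0x40])
print(hex(DataInput(data, BIG_ENDIAN).read_u32()))     # 0x10203040
print(hex(DataInput(data, LITTLE_ENDIAN).read_u32()))  # 0x40302010
```

Folding the readers into a loop keeps the sketch short; the unrolled version in the pseudo-code does exactly the same shifts in the same order.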
These functions will read signed integers from the stream:
// Reads a signed 8-bit integer
function readS8() {
    // Read the unsigned value
    value = readU8()
    // Check if the first (most significant) bit indicates a negative value
    if( value >> 7 == 1 ) {
        // Use "Two's complement" to convert the value
        value = ~( value ^ 0xFF )
    }
    return value
}

// Reads a signed 16-bit integer
function readS16() {
    value = readU16()
    if( value >> 15 == 1 ) {
        value = ~( value ^ 0xFFFF )
    }
    return value
}

// Reads a signed 24-bit integer
function readS24() {
    value = readU24()
    if( value >> 23 == 1 ) {
        value = ~( value ^ 0xFFFFFF )
    }
    return value
}

// Reads a signed 32-bit integer
function readS32() {
    value = readU32()
    if( value >> 31 == 1 ) {
        value = ~( value ^ 0xFFFFFFFF )
    }
    return value
}
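To check that the conversion above really produces two's complement values, here is a small Python sketch of the same expression. The to_signed helper and its bits parameter are hypothetical names introduced for this example; Python integers have arbitrary precision, so the expression behaves the same way at every width.

```python
def to_signed(value, bits):
    """Interpret an unsigned integer of the given bit width as signed."""
    mask = (1 << bits) - 1  # 0xFF for 8 bits, 0xFFFF for 16 bits, ...
    # If the most significant bit is set, the value is negative.
    if value >> (bits - 1) == 1:
        # Same expression as the pseudo-code: flip the low bits, then invert.
        # The result equals value - 2**bits.
        value = ~(value ^ mask)
    return value


print(to_signed(0x7F, 8))     # 127
print(to_signed(0x80, 8))     # -128
print(to_signed(0xFF, 8))     # -1
print(to_signed(0xFFFE, 16))  # -2
```

Values with a clear top bit pass through unchanged, so the function is safe to call on every value that is read.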
Writing Binary Data
Let's start by defining some properties in our code. (These are more or less the same as the properties we defined for reading binary data.) Ideally these should all be private properties:
__stream   // The array-like object that will contain the bytes
__endian   // The endianness of the data within the stream
__position // The position of the next byte to write to the stream
Here is an example of what a basic class constructor might look like:
class DataOutput( stream, endian ) {
    __stream = stream
    __endian = endian
    __position = 0
}
The following functions will write unsigned integers to the stream:
// Writes an unsigned 8-bit integer
function writeU8( value ) {
    // Ensures the value is unsigned and within an 8-bit range
    value &= 0xFF
    // Add the value to the stream and increase the __position property
    __stream[ __position ++ ] = value
}

// Writes an unsigned 16-bit integer
function writeU16( value ) {
    value &= 0xFFFF
    // Endianness needs to be handled for multiple-byte values
    if( __endian == BIG_ENDIAN ) {
        writeU8( value >> 8 )
        writeU8( value >> 0 )
    } else { // LITTLE_ENDIAN
        writeU8( value >> 0 )
        writeU8( value >> 8 )
    }
}

// Writes an unsigned 24-bit integer
function writeU24( value ) {
    value &= 0xFFFFFF
    if( __endian == BIG_ENDIAN ) {
        writeU8( value >> 16 )
        writeU8( value >> 8 )
        writeU8( value >> 0 )
    } else {
        writeU8( value >> 0 )
        writeU8( value >> 8 )
        writeU8( value >> 16 )
    }
}

// Writes an unsigned 32-bit integer
function writeU32( value ) {
    value &= 0xFFFFFFFF
    if( __endian == BIG_ENDIAN ) {
        writeU8( value >> 24 )
        writeU8( value >> 16 )
        writeU8( value >> 8 )
        writeU8( value >> 0 )
    } else {
        writeU8( value >> 0 )
        writeU8( value >> 8 )
        writeU8( value >> 16 )
        writeU8( value >> 24 )
    }
}
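A Python version of the writer could look something like the sketch below. As assumptions of this example: the stream is a bytearray, the method names, constants and _write_unsigned helper are my own, and write_u8 appends when the position is at the end of the stream, because a bytearray does not grow on index assignment the way the pseudo-code's array does.

```python
BIG_ENDIAN = "big"        # hypothetical constants for this sketch
LITTLE_ENDIAN = "little"


class DataOutput:
    def __init__(self, stream, endian):
        self._stream = stream   # bytearray that will contain the bytes
        self._endian = endian   # endianness of the data in the stream
        self._position = 0      # position of the next byte to write

    def write_u8(self, value):
        # Ensure the value is unsigned and within an 8-bit range.
        value &= 0xFF
        # Overwrite an existing byte, or grow the stream by one byte.
        if self._position < len(self._stream):
            self._stream[self._position] = value
        else:
            self._stream.append(value)
        self._position += 1

    def _write_unsigned(self, value, byte_count):
        value &= (1 << (8 * byte_count)) - 1
        # Little-endian writes the least significant byte (shift 0) first;
        # big-endian writes the most significant byte first.
        shifts = range(0, 8 * byte_count, 8)
        if self._endian == BIG_ENDIAN:
            shifts = reversed(shifts)
        for shift in shifts:
            self.write_u8(value >> shift)

    def write_u16(self, value):
        self._write_unsigned(value, 2)

    def write_u24(self, value):
        self._write_unsigned(value, 3)

    def write_u32(self, value):
        self._write_unsigned(value, 4)


out = bytearray()
DataOutput(out, LITTLE_ENDIAN).write_u32(0x10203040)
print(out.hex())  # 40302010
```

Note that write_u8 masks its argument, so passing value >> shift without masking first is safe: only the lowest 8 bits of each shifted value are stored.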
And, again, these functions will write signed integers to the stream. (The functions are actually aliases of the writeU*() functions, but they provide API consistency with the readS*() functions.)
// Writes a signed 8-bit value
function writeS8( value ) {
    writeU8( value )
}

// Writes a signed 16-bit value
function writeS16( value ) {
    writeU16( value )
}

// Writes a signed 24-bit value
function writeS24( value ) {
    writeU24( value )
}

// Writes a signed 32-bit value
function writeS32( value ) {
    writeU32( value )
}
Note: These aliases work because a byte has no sign of its own; it is just eight bits, which read as a value in the range 0 to 255. The masking in the writeU*() functions reduces a negative value to its two's complement bit pattern, and the conversion back to a signed value is done when the data is read from a stream.
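The round trip can be traced by hand. The following Python sketch (variable names are illustrative only) writes -2 as a big-endian 16-bit value and reads it back using the same mask, shift and conversion steps as the pseudo-code:

```python
value = -2

# Writing: masking turns the negative number into its 16-bit pattern.
stored = value & 0xFFFF     # 0xFFFE, i.e. 65534
high = (stored >> 8) & 0xFF  # 0xFF
low = stored & 0xFF          # 0xFE

# Reading: rebuild the unsigned value, then convert it if the top bit is set.
unsigned = (high << 8) | low
signed = ~(unsigned ^ 0xFFFF) if unsigned >> 15 == 1 else unsigned

print(hex(stored), signed)  # 0xfffe -2

# Cross-check against Python's built-in signed decoding.
check = int.from_bytes(bytes([high, low]), byteorder="big", signed=True)
```

Both bytes on disk are ordinary values between 0 and 255; only the reader decides that the pair means -2 rather than 65534.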
Conclusion
My goal with this short tutorial was to complement my previous article on creating binary files for your game's data with some examples of how to do the actual reading and writing. I hope it's achieved that; if there's more you'd like to know about the topic, please speak up in the comments!