My latest short project is a programmable texture filtering program – a piece of software that will let me write short scripts to do image processing such as edge detection, blurring, color balancing and other mathematical operations. I started writing it two days ago on the way home from work, and the first stage is complete today.
I decided to write the program, currently called Viridian, because firstly I needed it (for producing normal maps and other textures for Vitae), and secondly because it seemed like a short, fun program to write. The idea is not mine; I was largely inspired by a similar program written by a co-worker at FRD – but the implementation is all mine.
Leaving aside the issues of parsing and grammars that would be necessary for a scriptable version, my first step was to build a program that could read in an image, run a hardcoded C transformation on it, and then write it out in a valid file format. I chose Truevision TARGA (.tga) for my first format, as it is remarkably simple. I began with the 24-bit RGB form, which uses 8 bits for each channel (and has no alpha channel).
In order to load the file and correctly extract the data into the C program, I used the Wikipedia reference page for the TGA spec. Creating a struct with its fields in the right order that I could alias to the header worked well enough, and then it was simply a matter of adjusting for the variable-length fields and grabbing a pointer to the image data itself.
typedef struct tga_header_s {
    u8 id_length;
    u8 color_map;
    u8 image_type;
    u8 color_map_spec[5]; // Kept as raw bytes: a nested struct here picks up alignment padding
    u16 origin_x;
    u16 origin_y;
    u16 width;
    u16 height;
    u8 pixel_depth;
    u8 image_desc;
} tga_header;
My first implementation had a struct within a struct for the colormap specification (TGAs can use an indexed palette), but this broke the code as the compiler added alignment padding to keep the 5-byte struct correctly aligned[1].
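The padding problem can be checked at compile time. The following is a minimal sketch rather than the code from Viridian: the u8/u16 typedefs, the nested struct and every name ending in _nested or _flat are assumptions made for illustration. It compares a nested layout against the flat one and asserts that the flat version matches the 18-byte file header.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>

typedef uint8_t  u8;   /* assumed definitions; the post does not show them */
typedef uint16_t u16;

/* Hypothetical nested layout, in the spirit of the first attempt. */
typedef struct {
    u16 first_entry;
    u16 length;
    u8  entry_size;
} color_map_spec_nested;

typedef struct {
    u8  id_length;
    u8  color_map;
    u8  image_type;
    color_map_spec_nested spec;  /* the compiler may pad before and inside this */
    u16 origin_x;
    u16 origin_y;
    u16 width;
    u16 height;
    u8  pixel_depth;
    u8  image_desc;
} tga_header_nested;

/* Flat layout: the color map spec is kept as five raw bytes. */
typedef struct {
    u8  id_length;
    u8  color_map;
    u8  image_type;
    u8  color_map_spec[5];
    u16 origin_x;
    u16 origin_y;
    u16 width;
    u16 height;
    u8  pixel_depth;
    u8  image_desc;
} tga_header_flat;

/* The on-disk header is 18 bytes, with width stored at byte offset 12. */
static_assert(sizeof(tga_header_flat) == 18, "flat header must match the file layout");
static_assert(offsetof(tga_header_flat, width) == 12, "width must sit at byte 12");

/* Number of extra bytes the nested version picked up on this compiler. */
size_t nested_padding_bytes(void)
{
    return sizeof(tga_header_nested) - sizeof(tga_header_flat);
}
```

On a typical x86-64 compiler I would expect nested_padding_bytes() to return 2, but the exact figure depends on the platform ABI, which is why the static_assert lines are the useful part.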
After that it was easy enough to create the rest of the code to reconstitute a new image header, append the image data, and write it out to a destination file. Once all that framework was in place, writing the code to actually apply a basic image transform was trivial.
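As a rough illustration of those two steps, here is a sketch that finds the start of the pixel data and applies a simple transform. It is not the Viridian code: it reads the header byte by byte instead of aliasing a struct, the function names are hypothetical, and channel inversion is only a stand-in for whatever transform is wanted. It assumes an uncompressed 24-bit image.

```c
#include <stddef.h>
#include <stdint.h>

enum { TGA_HEADER_SIZE = 18 };

/*
 * Returns the byte offset of the first pixel, or 0 if the buffer is too
 * short. The variable-length fields after the header are the image ID
 * (id_length bytes) and, if present, the color map.
 */
size_t tga_pixel_offset(const uint8_t *file, size_t file_size)
{
    if (file_size < TGA_HEADER_SIZE)
        return 0;

    size_t offset = TGA_HEADER_SIZE + file[0];   /* skip the image ID */

    if (file[1] != 0) {                          /* a color map is present */
        /* Bytes 5-6: number of entries (little-endian); byte 7: bits per entry. */
        size_t entries = (size_t)file[5] | ((size_t)file[6] << 8);
        size_t bytes_per_entry = ((size_t)file[7] + 7) / 8;
        offset += entries * bytes_per_entry;
    }

    return offset <= file_size ? offset : 0;
}

/* Example transform: invert every channel of a 24-bit image in place. */
void invert_pixels(uint8_t *pixels, size_t width, size_t height)
{
    size_t count = width * height * 3;           /* 3 bytes per pixel */
    for (size_t i = 0; i < count; i++)
        pixels[i] = (uint8_t)(255 - pixels[i]);
}
```

Inversion treats every channel the same way, so the order in which the file stores its channels does not matter here; a transform such as color balancing would need to take that order into account.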
I did have a couple of hiccups along the way though, and those were largely smoothed over thanks to an old friend – the hex editor. I was coding on the train at the time with no internet connection, and no saved offline version of the TGA reference. In order to make sure I was loading the right fields from the right bytes, I simply opened up one of my source example TGAs, and checked to see if the insides made sense.

With modern techniques and libraries, I think hex editing – or at least reading hex – is becoming somewhat of a lost art, but it’s a very useful ability. Being comfortable with taking a look at exactly what is under the hood to confirm your assumptions, and also being able to test output before writing any code, is often extremely helpful.
I simply used the hex mode in Vim rather than resorting to a specialised editor, which was completely up to the task as I was purely reading.
The top line of the dump simply corresponds to the main parts of the header. The bytes belonging to each field are marked with square brackets.
u8 id_length
[00]00 0200 0000 0000 0000 0000 5505 0004
(length 0, so no ID)
u8 color_map
00[00] 0200 0000 0000 0000 0000 5505 0004
(0 = no color map)
u8 image_type
0000 [02]00 0000 0000 0000 0000 5505 0004
(2 = uncompressed, true-color)
u8 color_map_spec[5]
0000 02[00 0000 0000] 0000 0000 5505 0004
(There is no color map, per byte #2)
u16 origin_x
u16 origin_y
0000 0200 0000 0000 [0000 0000] 5505 0004
(origin starts at 0, 0, the default)
u16 width
u16 height
0000 0200 0000 0000 0000 0000 [5505 0004]
Now here is something slightly odd – these last 4 bytes of the line should be the width and height, but the values here don’t seem to work out. 0x5505 is 0d21765, and I’m fairly certain the image wasn’t 21765 * 4. Opening it up in GIMP confirmed it was 1365 * 1024, a much more reasonable size, but then what was going on here?
Experienced programmers will probably get this one pretty quickly. Luckily, reading the TGA spec earlier had already reminded me, and so it came back to me quickly even though it’s not something I’ve personally had to deal with before – endianness.
The endianness of data refers to the ordering of bytes[2], and whether they should be stored with the biggest first (big-endian), or the littlest (little-endian). In the English-speaking world we tend to think in big-endian, as that is how we describe our numbers[3] (“two hundred and forty-seven”, not “seven, forty and two hundred”). Many computer systems, including the x86 platform that powers billions of PCs worldwide, use little-endian ordering. TGA files store their multi-byte fields the same way, which means that the two bytes 55 05 actually correspond to the hex number 05 55, and 00 04 is 04 00. As expected, 0x0555 = 0d1365, and 0x0400 = 0d1024.
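The same arithmetic can be written as a couple of small helpers. This is a sketch with hypothetical function names, not code from Viridian; assembling the value from individual bytes gives the same answer whatever the byte order of the machine running it.

```c
#include <stdint.h>

/* Little-endian: the first byte in the file is the least significant. */
uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Big-endian, for comparison: the first byte is the most significant. */
uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}
```

Given the bytes 55 05, read_le16 yields 1365 while read_be16 yields the misleading 21765; given 00 04, read_le16 yields 1024.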
With that mystery sorted, it all made sense, and tweaking my code until I got exactly what I expected in the file output took only a matter of minutes. Could I have done it without inspecting the hex? Yes, of course, but it was easier to have that resource with me, and when dealing with less well-documented file formats (or reverse-engineering formats, as I once did with Blizzard’s Warcraft 3 map scenery file in order to build a tileset converter), it is invaluable.
As with my other projects, Viridian is open-source and can be found on GitHub.
---
[1] C compilers will pad structs so that multi-byte fields are correctly aligned for the target platform. This often means they are even-byte aligned, and also often aligned with their size, so a 4-byte field will be at an address divisible by 4. This is due to most hardware having poor or no ability to read incorrectly aligned values.
[2] Actually, endianness can also refer to the ordering of bits in a byte, but pretty much every sane person has now agreed on big-endian for that, i.e. 128 64 32 16 8 4 2 1, and not the reverse.
[3] Although confusingly, we use little-endian format for teens – i.e. fourteen is ‘four-ten’ not ‘ten-four’, yet we say ‘twenty-four’ not ‘four-twenty’. Yet another arbitrary confusion in our language!