SuperSakura

Researching Fairy Dust SGV graphics

[Image: Two guys peering through foliage at something intriguing on Uki Uki Island]

To get a game running on SuperSakura, I need to understand and convert an original game's resources into a modern format. This can range from trivially easy to distressingly difficult, depending on how clever the original programmers were.

If a programmer isn't very skilled, or is working to very tight deadlines, as was common in Japan's intense game industry, they're likely to use only the simplest compression and encryption schemes they can throw together. A few games were even distributed with no image compression at all, just plain raw bitmaps! A modern company, in the CD-ROM era or later, probably wouldn't care; the bigger the game files, the more laborious it is to manufacture pirate copies, after all! And CDs are big enough that you can easily fit in all your graphics even without compression. But prior to that, in the floppy disk era, every additional disk in the game box added significant marginal cost to the physical part of game production. Being able to fit your game on 3 floppies instead of 4 could mean the difference between modest success and going out of business.

Still, deadlines are deadlines, and sometimes good enough is good enough. So most PC98 games tend to stick with simple RLE (run-length encoding) and LZ compression schemes. Let's see what Fairy Dust's programmers came up with!

[Image: Company logo, says Fairy Dust in a stylised script] Fairy Dust logo.

Fairy Dust was an adult anime publisher in the 1980s. In addition to merchandising, they branched into computer games, many based on their existing anime titles. The first was a Cream Lemon episode adaptation in 1986, then a larger variety from 1993 onward. Since Fairy Dust was primarily a publisher, about half of their games were actually developed by different companies. As a result, there's wild variance in the engines and file formats used.

In this case, let's look at a few games they apparently produced in-house. Akai Suishou no Hitomi ("Crimson Crystal Eye") and Uki Uki Island use the same game resource formats: PMD/MMD for music, SCP for the game scripts, and SGV for the graphics. The music files are a widely-used standard format which I've already figured out, but the scripts and graphics are unfamiliar custom formats.

There are two primary ways to figure out an unknown format: look at the input data and the resulting output data, and figure out how one comes from the other; or, examine a working converter in a debugger to directly decipher the conversion logic. Beware: porting an existing converter to a new language directly runs into trouble with copyright, so for avoidance of doubt it's preferable to favor the first approach. But if it becomes necessary to use a debugger, it should only be to write a plain-language description of the algorithm being inspected, followed by a fresh implementation that deliberately avoids following the same approach used by the original converter.

[Image: A hex editor looming over a foreboding library] Those hexadecimals somehow turn into this picture.

In this case, we have the input data – the individual graphic files of each game, easily viewed in a hex editor. The output is the graphics seen when playing the game. Slight complication: which file corresponds to which picture?

This is easiest to figure out if an existing graphics converter already knows the format, so you can just feed it files one at a time and see which image is displayed. The converter programs MLD, Grapholic, and Susie together cover most graphic formats used on the PC98, so they're a good first try. Somewhat surprisingly, none of them is able to read SGV files. Guess I'll be the first one to crack the format, then!

(Wait, if some viewer program can already convert these graphics, why am I reverse-engineering them again? Well, as smart as the creators of these viewers were, they usually just wrote closed-source conversion code and left it at that. There's usually no documentation in plain text or source form to explain how a particular format works, so I can view the pictures, but I don't understand how, and I can't convert the graphics independently! So I have to re-decipher everything and write it down somewhere, so the next hacker wanting to mess with these won't have to do it all over again.)

[Image: File browser with a grid of icons for Uki Uki files] Try to guess where exactly each SGV appears in the game!

The next best thing is to look at the filenames. In this case, I can see Uki Uki has a title1.sgv and title2.sgv, which will surely be the game's title screen. There are opw_##.sgv files, numbered 1 to 5, probably graphics for the opening sequence on game start. Quite often there's a frame or waku file for the game's viewframe, but Uki Uki doesn't have an obvious frame graphic. It does have numbered bg_##.sgv files for backgrounds. There's a chance the background shown during the first scene in the game may be the lowest-numbered one. There's a fairy_2.sgv, which is probably the Fairy Dust logo.

Meanwhile, Hitomi has a title_a.sgv and fairy_2.sgv, but no obvious frame or opening graphics. It has numbered background graphics too.

Let's boot up an emulator and take some screenshots!

[Image: Screenshot of an indoor office scene] The first scene in Akai Suishou no Hitomi.

OK, the first scene after starting a new game in Hitomi is an indoor office, but the file bg_01 comes with bg_01_n, which is probably a nighttime version of the same graphic. This doesn't look like a picture that has an obvious night counterpart, so I suspect this picture is not actually bg_01.

There are a few ways to make sure. The simplest is to look at the game's scripts, if you can figure out how to read them. Happily, Hitomi scripts have a trivial XOR-encryption, and after that are directly human-readable. Tracing through the startup sequence, the first game script turns out to be pos06_op, which draws bg_06. That file doesn't come with a night version, so my intuition was right.
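To illustrate how little effort a trivial XOR scheme takes to break, here's a minimal Python sketch. It assumes the simplest possible case, a single-byte key applied to every byte, which may not be exactly what Hitomi does. The idea is to try all 256 keys and keep the ones whose output contains something you expect to find in a script, such as the "bg_" prefix of a graphic file name. The function names and the sample data are made up for illustration.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


def find_xor_keys(data: bytes, needle: bytes) -> list:
    """Return every single-byte key whose decrypted output contains needle."""
    return [key for key in range(256) if needle in xor_bytes(data, key)]


# Made-up stand-in for an encrypted script: plain text XORed with an
# arbitrary key. A real script file would be read from disk instead.
sample = xor_bytes(b"draw bg_06", 0x5A)

for key in find_xor_keys(sample, b"bg_"):
    print(hex(key), xor_bytes(sample, key))
```

If more than one key survives, a longer or second search string narrows it down quickly.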

If the game scripts were too hard to decipher, there's one more surefire way to test which file is which picture: replace a graphic file with another one in the disk image (after making a backup), or just poke some garbage bytes into the graphic file. Run the game again, and see if the scene has a different background or crashes entirely. If yes, you know the edited file was the one it's trying to display.
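The garbage-poking approach can be sketched in a few lines of Python. This assumes the graphic has already been extracted as a loose file; writing it back into the disk image depends on your tools and isn't shown. The helper name corrupt_copy and the choice of 0xFF filler are my own, purely for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def corrupt_copy(path: Path, offset: int, count: int = 64) -> Path:
    """Back up path next to itself, then overwrite count bytes with 0xFF."""
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)
    data = bytearray(path.read_bytes())
    end = min(offset + count, len(data))  # never grow the file
    data[offset:end] = b"\xff" * (end - offset)
    path.write_bytes(bytes(data))
    return backup


# Demo on a throwaway file standing in for an extracted graphic.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "bg_06.sgv"
    target.write_bytes(bytes(256))
    backup = corrupt_copy(target, offset=100, count=8)
    poked = target.read_bytes()
    original = backup.read_bytes()

print(poked[98:110].hex(" "))
```

Poking bytes well past the start of the file leaves the header alone, so the game is more likely to show a visibly mangled picture than to refuse the file outright.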

[Image: Bird's eye view of a boat plowing through the ocean] The opening scene of Uki Uki Island.

The start of a file generally contains a "header" with useful information. Graphic files normally have the picture pixel size there along with a screen position and the image palette. How about these files' headers? Let's put fairy_2, bg_06, and Uki Uki's first background opw_01 side by side.

53 47 56 46 00 00 00 00 7F 02 8F 01 21 00 00 00 C1 0F 00 00 5B 0C 00 00 1D 1C 00 00 14 13 00 00
53 47 56 57 00 00 00 00 FF 01 2F 01 21 00 00 00 A1 09 00 00 64 0F 00 00 05 19 00 00 E8 18 00 00
53 47 56 57 00 00 00 00 FF 01 2F 01 21 00 00 00 A1 09 00 00 AA 18 00 00 4B 22 00 00 AC 28 00 00

The most obvious commonality is the first four bytes, which in standard ASCII encoding spell out SGVF or SGVW. Glancing at a few other files, it's clear that F means a fullscreen graphic, and W means an in-viewframe or "windowed" graphic. Having this signature is great for file format autodetection, since anything that doesn't start with SGV can be immediately disqualified. There are graphic formats without any recognisable signature; a format autodetection algorithm has to validate the header much more carefully in that case to avoid misidentification.
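Signature-based detection of this kind is about as simple as code gets. Here's a minimal Python sketch; the function name sgv_kind and the "fullscreen"/"windowed" labels are just my shorthand for the F and W variants described above.

```python
from typing import Optional


def sgv_kind(data: bytes) -> Optional[str]:
    """Classify a file by its first four bytes, or return None if not SGV."""
    signature = data[:4]
    if signature == b"SGVF":
        return "fullscreen"
    if signature == b"SGVW":
        return "windowed"
    return None


# The first twelve bytes of two of the headers shown above.
fairy_2 = bytes.fromhex("53 47 56 46 00 00 00 00 7F 02 8F 01")
bg_06 = bytes.fromhex("53 47 56 57 00 00 00 00 FF 01 2F 01")

print(sgv_kind(fairy_2), sgv_kind(bg_06), sgv_kind(b"not an sgv"))
```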

Next, picture sizes. The Fairy Dust logo is a fullscreen graphic, which on the PC98 means a 640x400 image. In hexadecimal, that is 280x190. The fairy_2 header has the little-endian values 027F and 018F (stored as the bytes 7F 02 and 8F 01), clearly indicating the picture size minus 1.

Let's make sure the picture size is always found there. I've measured the in-viewframe graphics in an image editor – they are 512x304. (The graphics in Hitomi have an extra edge around the picture; comparing several screenshots, I've definitely got the size right.) In hexadecimal, that is 200x130. Again, the header has a matching 01FF and 012F.
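Reading those two values takes only a couple of lines. This Python sketch assumes, based purely on the three headers above, that they are unsigned 16-bit little-endian numbers at byte offsets 8 and 10; the name sgv_size is mine.

```python
import struct
from typing import Tuple


def sgv_size(header: bytes) -> Tuple[int, int]:
    """Return (width, height) from the two 16-bit values at offset 8."""
    # "<HH" = two unsigned 16-bit integers, little-endian.
    last_x, last_y = struct.unpack_from("<HH", header, 8)
    # The stored values are one less than the true size.
    return last_x + 1, last_y + 1


fairy_2 = bytes.fromhex("53 47 56 46 00 00 00 00 7F 02 8F 01")
bg_06 = bytes.fromhex("53 47 56 57 00 00 00 00 FF 01 2F 01")

print(sgv_size(fairy_2), sgv_size(bg_06))
```

The four zero bytes just before the size values might be a screen position, given that graphic headers normally carry one, but nothing in these three samples confirms that.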

The remaining values in the header look like 32-bit numbers, all smaller than the total file size. These likely define different sections of image data: starting offsets and byte sizes. In fact, if you add the last two numbers together, you always get the total file size minus 1. Furthermore, if you add the 3rd and 4th numbers from the right, you get the 2nd-last number (exactly so for the two windowed headers above; for fairy_2 the sum comes out 1 short).
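These sums are easy to check mechanically. The Python sketch below pulls the five 32-bit little-endian values out of each of the three headers shown earlier and compares them. I've deliberately left the values unnamed, since which one is an offset and which a size is still a guess at this point; the function names are my own.

```python
import struct

HEADERS = {
    "fairy_2": "53 47 56 46 00 00 00 00 7F 02 8F 01 21 00 00 00 "
               "C1 0F 00 00 5B 0C 00 00 1D 1C 00 00 14 13 00 00",
    "bg_06":   "53 47 56 57 00 00 00 00 FF 01 2F 01 21 00 00 00 "
               "A1 09 00 00 64 0F 00 00 05 19 00 00 E8 18 00 00",
    "opw_01":  "53 47 56 57 00 00 00 00 FF 01 2F 01 21 00 00 00 "
               "A1 09 00 00 AA 18 00 00 4B 22 00 00 AC 28 00 00",
}


def sgv_sections(header: bytes) -> tuple:
    """Return the five 32-bit little-endian values starting at offset 12."""
    return struct.unpack_from("<5I", header, 12)


def section_gap(header: bytes) -> int:
    """2nd-last value minus the sum of the 3rd- and 4th-from-last values."""
    _, second, third, fourth, _ = sgv_sections(header)
    return fourth - (second + third)


def implied_file_size(header: bytes) -> int:
    """File size if 'last two values summed = file size - 1' holds."""
    *_, fourth, fifth = sgv_sections(header)
    return fourth + fifth + 1


for name, text in HEADERS.items():
    header = bytes.fromhex(text)
    print(name, [hex(v) for v in sgv_sections(header)],
          "gap:", section_gap(header),
          "implied size:", implied_file_size(header))
```

On these three samples the gap is 0 for bg_06 and opw_01 and 1 for fairy_2, so a reader built on this should probably trust the stored values rather than recompute them.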

[Image: Uki Uki stylised text over a palm tree logo] Uki Uki title screen.

At this point, I get the feeling I've seen this somewhere before. It's very rare for picture size values to be exactly 1 less than the true size, and section pointers like these are familiar as well. I documented the MAG image format years ago. Is SGV just a slightly modified MAG? The unnecessary stuff at the start of MAG headers isn't present, and there's just a single byte where a MAG palette would be, but otherwise this looks the same.

Both Uki Uki and Hitomi have a pal.dat file, which on inspection turns out to contain a large number of palette combinations. The SGV header's palette byte must select the full palette from this file. A pretty insignificant optimisation for the programmers to throw in, but fair enough. (Uki Uki has 340 SGV files and its palette file is 12,288 bytes, so it gains a total saving of 340 * (48 - 1) - 12,288 = 3,692 bytes, not accounting for a handful of further code bytes needed to implement this indirection. It reduces the total game size by about 0.03%, hardly worth it.)
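One detail supports the indirection theory: 12,288 divides evenly into 256 entries of 48 bytes, exactly as many as a one-byte index can address. Here's a minimal Python sketch of the lookup and of the savings arithmetic. It assumes the palettes are stored back to back as 48-byte entries (16 colours times 3 components, the same size as a MAG palette) and indexed from zero; neither is confirmed, and the function names are my own.

```python
PALETTE_BYTES = 48  # 16 colours x 3 components, the size of a MAG palette


def palette_entry(pal_dat: bytes, index: int) -> bytes:
    """Slice one 48-byte palette out of pal.dat (assumed layout)."""
    start = index * PALETTE_BYTES
    entry = pal_dat[start:start + PALETTE_BYTES]
    if len(entry) != PALETTE_BYTES:
        raise IndexError(f"palette {index} is outside pal.dat")
    return entry


def bytes_saved(file_count: int, pal_dat_size: int) -> int:
    """Net saving from storing a 1-byte index instead of a full palette."""
    return file_count * (PALETTE_BYTES - 1) - pal_dat_size


# Dummy stand-in for pal.dat: entry N is filled with the byte value N.
pal_dat = bytes(i // PALETTE_BYTES for i in range(12288))

print(len(pal_dat) // PALETTE_BYTES)    # number of palette slots
print(palette_entry(pal_dat, 5)[:4])    # first bytes of slot 5
print(bytes_saved(340, len(pal_dat)))   # Uki Uki's net saving in bytes
```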

Let's feed an SGV into my existing MAG converter (with a few header reading tweaks) and see what comes out!

[Image: Well-realized city street, albeit devoid of people] Street scene extracted directly from Hitomi.

Well, looks like it all works fine, just like that. I now have all graphics from both games. Just need to describe these findings in SuperSakura's file format documentation, tweak the post-processing code a bit to ensure transparency is correctly applied when needed, and submit to git!

So what does this say about Fairy Dust's programmers? Usually, setting out to make your own file format takes longer and compresses worse than if you just used the best available free format. MAG was freely available, pretty easy to implement, and gives reasonable compression, so this was a smart choice. They even threw in a tiny little extra optimisation, presumably just for fun. Solid work!