Reversing Robico’s Island of Xaan

Overview

This is a basic description of reverse engineering Robico’s first text adventure: Island of Xaan.

Although it is one of their weakest, it shows a number of trademark features of Robico’s adventures: lots of detailed descriptions, lots of humour and some strange logic. The parser, being just verb and noun, is quite weak compared to Robico’s later adventures.

The adventure was only available on the BBC Micro and Acorn Electron computers. The version I’m reverse engineering is the BBC Micro one, taken from the Stairway to Hell archive and is designed to run in &1900 DFS, using MODE 7.

This write up uses BBC Micro conventions (e.g. using & to signify a hexadecimal number) and assumes some knowledge of the BBC Micro.

All notes and scrap code are available from my GitHub.

The disk

The disk contains six files, three of which are loaders. I’ll get these out of the way first, as they’re the simplest and have nothing really interesting from a reverse engineering perspective:

  • $.!BOOT – is loaded when the disc is run with shift-break and just runs $.LOAD
  • $.LOAD – is the general Stairway to Hell loader and just runs $.LOADER
  • $.LOADER – prints the introduction, then loads and runs $.XAAN1. It also shows the address for Robico software, which is an unimposing semi-detached house in Wales.

The other files are code for the adventure which are loaded in the following order, and locations:

  • $.XAAN1 which is loaded from &880 to &D00 with an exec address of &888
  • $.XAAN which is loaded from &1200 to &7BFC with an exec address of &225D
  • $.XAAN2 which is loaded from &2000 to &2400 with an exec address of &2000

XAAN1

XAAN1 contains the initial loader, a selection of library routines (such as the message printer) and several routines for short commands, probably placed here because memory was short.
Once loaded it performs the basic system calls, before handing over control to the XAAN file:

  1. OSBYTE 225        (disable function keys)
  2. OSBYTE 200 3      (disable escape and clear memory on break)
  3. OSBYTE 139 1      (*OPT 1 equivalent – report nothing during file ops)
  4. OSCLI “R. XAAN”   (Load XAAN and jump to its executable address)

Other obvious functions include (these are the names I’ve given the functions):

  • printmsg – to decompress and print a message. This uses OSWRCH to print to the screen.
  • findmsg – finds the message passed in X and passes it to printmsg
  • getinput – retrieves an input from the command line, using OSWORD 0. This also splits the input into a verb (stored at &41f) and a noun (stored at &43e)
  • searchlist – looks for the passed word in the passed list and returns the index in X. This is used to convert the verb and noun to a number.
  • handlecmd – looks up the entered verb and then looks for the appropriate address from a list of pointers.
  • the routines for the commands WAIT, SIT and QUIT

XAAN2

XAAN2 looks like it was originally the copy protection file – it contains a selection of calls to OSWORD to read sectors directly from the disk. This is unused on the Stairway to Hell disk.

XAAN

XAAN contains nearly everything, the code, the game data and everything else, apart from a few routines in XAAN1.

XAAN has multiple chunks of data with roughly the following uses:

From  To    Addr  Purpose
0000  10be  1200  Code
10bf  30ff  22bf  Empty data
3100  3248  4300  Introduction text
3249  32ff  4449  Unknown
3300  5c56  4500  Messages
5c57  5c5b  6e57  Unknown
5c5c        6e5c  Room exits
62f6        74f6  Object locations and flags
6354        7554  Room messages
6400  6437  7600  Verb action pointer (low)
6438  646f  7638  Verb action pointer (high)
6473  65a8  7673  List of verbs
65a9  66e1  77a9  List of nouns
66e4  69fb  78e4  List of tokens

As far as we’re concerned, the data is all listed towards the end of free memory (&7c00 in Mode 7). The bit that surprises me is the amount of spare memory – bearing in mind that the BBC Micro only had 32KB of memory, of which large swathes are unusable due to disk and screen memory, there’s still around 8 KB of free memory whilst XAAN is running!
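
As a quick check of the figures in the table, file offsets become memory addresses by adding the XAAN load address (&1200), and the "Empty data" row gives the spare memory. A minimal sketch using only values from the table; the function name is mine:

```python
LOAD_ADDR = 0x1200  # XAAN load address, from the file list above

def offset_to_addr(offset):
    """Convert an offset within the XAAN file to a BBC memory address."""
    return LOAD_ADDR + offset

# "Empty data" row of the table: offsets &10bf to &30ff inclusive.
empty_start, empty_end = 0x10BF, 0x30FF
empty_bytes = empty_end - empty_start + 1

print(hex(offset_to_addr(0x3300)))  # start of the message bank in memory
print(empty_bytes, "bytes of empty data")
```

The empty region alone comes to a little over 8 KB, which is where the free-memory figure comes from.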

I’ll go into detail of the different sections later, but for now, the code block above performs all the stuff to run the game and apply the verbs. Unfortunately Robico decided against using a form of bytecode (like other text adventure creators), so all the logic is done in raw 6502.

The machine code does do a lot of JMPing to JMP instructions suggesting to me that it was probably assembled in blocks to reduce memory usage whilst coding it.

Objects are handled in a slightly strange way – every noun is counted as an object, with a location byte.

Anyway, to the data formats…

Strings

All strings are compressed with a routine referred to by the creator as MIDGE. This is relatively easy to decompress and there have been public domain examples found in the past.

This uses a dictionary of common patterns with some handy special characters.

Decompressing it is a matter of reading the next byte and then expanding according to the following rules. The best way to show this is to show my notes for the first location in the game Rick Hanson (which uses MIDGE too):

45 = "you"
FC = "'"
ED = "r" = 0xed - 0xdb (size of database)
59 = "e "
48 = "in "
3a = "the "
00 = "entrance "
70 = "ha"
93 = "ll"
FB = " "
49 = "of "
5D = "a "
EE = "s"
81 = "ma"
93 = "ll"
F9 = ","
DF = "d"
73 = "es"
66 = "er"
6D = "te"
5C = "d "
43 = "rail"
4B = "way"
FB = " "
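
The rules in these notes can be sketched in Python. This is only an illustration: the dictionary holds just the tokens listed above (the real table is much larger), the letter rule is inferred from the single note "0xed - 0xdb", and the function name is mine. The sample bytes are the Rick Hanson ones from the notes, not Xaan data.

```python
# Dictionary entries seen in the notes above; the real table is far larger.
DICTIONARY = {
    0x00: "entrance ", 0x3A: "the ", 0x43: "rail", 0x45: "you",
    0x48: "in ", 0x49: "of ", 0x4B: "way", 0x59: "e ", 0x5C: "d ",
    0x5D: "a ", 0x66: "er", 0x6D: "te", 0x70: "ha", 0x73: "es",
    0x81: "ma", 0x93: "ll",
}
# Special characters seen in the notes.
SPECIALS = {0xF9: ",", 0xFB: " ", 0xFC: "'"}
DICT_SIZE = 0xDB  # "size of database" in the notes

def midge_decode(data):
    """Expand a sequence of MIDGE token bytes into text."""
    out = []
    for b in data:
        if b in SPECIALS:
            out.append(SPECIALS[b])
        elif DICT_SIZE < b <= DICT_SIZE + 26:
            # Assumption: byte - 0xdb is a 1-based letter index (0xed -> "r").
            out.append(chr(ord("a") + b - DICT_SIZE - 1))
        elif b in DICTIONARY:
            out.append(DICTIONARY[b])
        else:
            out.append("<%02x>" % b)  # token not covered by the notes
    return "".join(out)

sample = bytes.fromhex("45fced59483a007093fb495dee8193f9df73666d5c434bfb")
print(midge_decode(sample))
```

Unknown tokens are printed as hex placeholders, which makes it easy to see which dictionary entries still need filling in.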

Internally, this is all handled by printmsg in the XAAN1 file.
To save space, messages in the code are constructed from several different messages, e.g. from the THROW command:

.canthrow   lda invsize
            bne stuffinv      ; if we have stuff in inv 
            ldx #&10          ; message 16 "You are not"
            jsr findmsg
            ldx #&1a          ; message 26 "holding"
            jsr findmsg
            ldx #&18          ; "anything"
            jmp prtmsg

MIDGE is used to compress the intro message (at offset &3100), the bank of messages (at offset &3300) and the room descriptions (more later).

Rooms

Rooms are stored in two chunks:

  • exits (at offset &5c5c)
  • descriptions (at offset &6354)

Room numbers start at 52; this is because they are treated as nouns (see later).
For each room there is a sequence of bytes. The bottom nybble of the first byte signifies the direction (up to 10, including IN and OUT); the top nybble indicates whether the exit is blocked (&40) or whether there are no exits (&80).

If there is an exit, the flag byte is followed by the destination.

Descriptions are stored as a list of tokens, like the messages.
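
The exit flag byte described above can be picked apart like this. It is a sketch of the bit layout only: the function name is mine, and I haven't assumed anything about how direction numbers map to compass points or how a room's list of exits is terminated.

```python
BLOCKED = 0x40   # exit exists but is blocked
NO_EXITS = 0x80  # room has no exits

def decode_exit_flag(flag):
    """Split one exit flag byte into its fields."""
    return {
        "direction": flag & 0x0F,          # bottom nybble
        "blocked": bool(flag & BLOCKED),   # top nybble, &40
        "no_exits": bool(flag & NO_EXITS), # top nybble, &80
    }

print(decode_exit_flag(0x43))
```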

Objects

Objects, nouns and rooms all share a common numbering, which follows the ranges of:

  • 2 – 47 nouns
  • 52 – 223 rooms

For each noun/room there is a location byte (at offset &62f6) which has the location of the object. 0 is the inventory and 1 is being worn. This does mean that potentially items can be carried by other items!

Verbs/Nouns

These are a simple list of strings separated by an “@” character. The offset of a word within the list is used as the internal reference for the verb/noun number.
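
A lookup over an “@”-separated list might look like the sketch below. The word list is made up for illustration (it is not the game's), the function names are mine, and whether the game counts from 0 or from some other base is not something I've assumed here.

```python
def split_words(raw):
    """Split an '@'-separated word list into a Python list."""
    return [w for w in raw.decode("ascii").split("@") if w]

def word_number(words, word):
    """Return the list position of a word, or None if it is unknown."""
    try:
        return words.index(word)
    except ValueError:
        return None

verbs = split_words(b"NORTH@SOUTH@TAKE@DROP@")  # made-up sample list
print(word_number(verbs, "TAKE"))
```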

Running verb specific actions

There is a table of action entry points, at offset &6400. This is divided into two sections – low byte and high byte; this is to make it easy to do a vector jump:

            lda verbptrsl,x      ; lookup table to control each verb
            sta verbv
            lda verbptrsh,x
            sta verbvh
            jsr setnouns
            jmp (verbv)
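
The same split-table lookup can be reproduced offline to list where each verb's handler lives. A sketch, assuming the table offsets given earlier (&6400 for low bytes, &6438 for high bytes); the function name and the test buffer are mine:

```python
VERB_PTRS_LO = 0x6400  # file offset of the low-byte table
VERB_PTRS_HI = 0x6438  # file offset of the high-byte table

def verb_handler(xaan, verb):
    """Return the 16-bit handler address for a verb number."""
    lo = xaan[VERB_PTRS_LO + verb]
    hi = xaan[VERB_PTRS_HI + verb]
    return (hi << 8) | lo

# Fabricated buffer standing in for the XAAN file contents.
fake = bytearray(0x6470)
fake[VERB_PTRS_LO + 2] = 0x5D
fake[VERB_PTRS_HI + 2] = 0x22
print(hex(verb_handler(fake, 2)))
```

Keeping the low and high bytes in separate tables means the 6502 can index both with the same X register, with no need to double the verb number first.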

Out of curiosity

I found one example of self modifying code in the handlecmd function, where it sets the colour of the output text. It does this by modifying this statement in printmsg:

 colourinst  = &0922
; &10921
.printmsg
{          
            lda #&86          ; this instruction can be altered
            jmp oswrch        ; VDU 134 (Colour = cyan)

We see this location modified in both xaan1.asm (in handlecmd) and in xaan.asm, for example:

.l20ca      lda #&86
            sta l0922
            rts

Reversing the ZX Vega: Firmware

Introduction

A while back I got a ZX Vega for the LOLs and trying to remember what the Spectrum was like. Of course, I didn’t need to remember: I have quite a few in watertight boxes in the garage and, of course, I was a BBC Micro fan as a lad.

So, after a quick play, realising that 80s resolution on a 42” TV does not look good and that Spectrum attribute clash has always been horrible (thanks to the way that the Speccy did screen memory), I wondered how this thing worked.

Obviously, it’s running a Spectrum emulator – my initial guess is that the creators of the Vega wouldn’t have written their own – not when there are many open source and easily licensed emulators around.

As a quick finger-in-the-air estimation, I thought it would probably use some simple Linux-like base OS, running a framebuffer menu over the top and probably spawning fuse. After all, that’s how I would do it.

The Firmware

When reverse engineering stuff like this you can attack at multiple levels: looking at the software, looking at the hardware or attempting to reverse the firmware.

I’m lazy, so I started with the firmware (to be honest I looked at this even before I had the hardware). This is easily obtainable from the site highlighted above. This comes in at just shy of 29 MB:

The first step when reversing stuff is to use some binary matching tools. These look for signatures in the data and attempt to work out what is in there. The one that usually gives the best results is binwalk.

This time, though, it wasn’t brilliantly helpful. Below is an edited extract of what it returned:

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
71            0x47            Copyright string: "Copyright (C) 2015 Retro Computers Limited."
579           0x243           Copyright string: "Copyright (C) 2015 Retro Computers Limited."
665530        0xA27BA         StuffIt Deluxe Segment (data): fton Manor Assignment.tap.gz
2689030       0x290806        StuffIt Deluxe Segment (data): fton Manor Assignment
2689060       0x290824        StuffIt Deluxe Segment (data): fton Manor Assignment.tap
2963660       0x2D38CC        Zlib compressed data, best compression
[…]
4015378       0x3D4512        Zlib compressed data, best compression
4024320       0x3D6800        gzip compressed data, has original file name: "1994.tap", from Unix, last modified: 2015-09-07 19:57:07
4036608       0x3D9800        gzip compressed data, has original file name: "1999.tap", from Unix, last modified: 2015-09-07 19:57:07
4057088       0x3DE800        gzip compressed data, has original file name: "20-20 Vision.tap", from Unix, last modified: 2015-09-07 19:57:07

This is sort of helpful; but not really – a copyright string that I could’ve guessed and some references to compressed Spectrum games.

Let’s use a more common tool to search through here: the standard strings utility, which looks for runs of bytes that look like ASCII text. Once again, here’s an edited output:

VEGA@
0.1.64 20150909
The Sinclair ZX Spectrum Vega.
Copyright (C) 2015 Retro Computers Limited.
www.retro-computers.co.uk  www.zxvega.co.uk
Concept, CAD, hardware design, C, ARM and Z80
programming: Chris Smith. www.zxdesign.info
SMEM
The Sinclair ZX Spectrum Vega.
Copyright (C) 2015 Retro Computers Limited.
www.retro-computers.co.uk  www.zxvega.co.uk
Concept, CAD, hardware design, C, ARM and Z80
programming: Chris Smith. www.zxdesign.info
STMP
2m?`
kvjU<v

This is more useful: we have some comments, a version string (0.1.64 20150909) and what look like chunk headers (VEGA, SMEM, STMP). We’ll come back to these later.

A bit further down the file we have this:

[…]
t:iyJ
=0Pb
a=I1;3$f
n+[z
7&@CJ
mkdosfs
NO NAME    FAT16
This is not a bootable disk.  Please insert a bootable floppy and
press any key to try again ...

Now that’s useful – that’s the header for a FAT16 volume made by mkdosfs. As this is a filing system we should be able to rip it out and mount it through loopback.

Looking at it through hexdump we can see the header at offset 0x88800:

This all follows what we’d expect to see with a FAT16 volume. At this point we can safely assume that there’s a FAT16 volume from 0x88800 onwards. Using a simple structure viewer to highlight the header gives us:

From here we can see that the volume consists of 55032 sectors of 512 bytes, which is 28,176,384 bytes, or 27,516 KB. Or, to put it simply, the remainder of the file:
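
The same figures can be read straight from the boot sector. This sketch uses the standard FAT BIOS Parameter Block offsets (bytes per sector at 0x0B, 16-bit sector count at 0x13, 32-bit sector count at 0x20); the function name is mine and the boot sector here is a fabricated one carrying the values quoted above.

```python
import struct

def fat_volume_size(boot_sector):
    """Return (bytes_per_sector, total_sectors, size_in_bytes) from a FAT boot sector."""
    bytes_per_sector, = struct.unpack_from("<H", boot_sector, 0x0B)
    total16, = struct.unpack_from("<H", boot_sector, 0x13)
    total32, = struct.unpack_from("<I", boot_sector, 0x20)
    # The 16-bit field is used when the count fits; otherwise it is 0
    # and the 32-bit field holds the sector count.
    total = total16 if total16 else total32
    return bytes_per_sector, total, bytes_per_sector * total

# Fabricated boot sector with the values quoted in the text.
boot = bytearray(512)
struct.pack_into("<H", boot, 0x0B, 512)
struct.pack_into("<H", boot, 0x13, 55032)
print(fat_volume_size(bytes(boot)))
```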

To finally confirm this, let’s extract the end bit and mount it. First, splitting the file:
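
For reference, the split can also be done in a few lines of Python. This is a sketch rather than the command originally used; the function name and the file names in the usage comment are placeholders.

```python
FAT_OFFSET = 0x88800  # start of the FAT16 volume found above

def split_firmware(src, head_path, fat_path, offset=FAT_OFFSET):
    """Write src[:offset] to head_path and src[offset:] to fat_path."""
    with open(src, "rb") as f:
        data = f.read()
    with open(head_path, "wb") as f:
        f.write(data[:offset])
    with open(fat_path, "wb") as f:
        f.write(data[offset:])
    return len(data[:offset]), len(data[offset:])

# Hypothetical file names; substitute the real firmware image.
# split_firmware("firmware.bin", "firstpart.bin", "games.img")
```

The second output file is the one to loopback mount.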

Now we can mount it:

Inside the Games Archive

Looks like we have a filesystem; there’s a load of text files on there, which seem to have a simple file format, for example looking at adventure.idx:

00000000  5a 5a 5a 5a 01 5a 5a 5a  5a 2e 74 61 70 2e 67 7a  |ZZZZ.ZZZZ.tap.gz|
00000010  0a 5a 65 6e 20 51 75 65  73 74 01 5a 65 6e 20 51  |.Zen Quest.Zen Q|
00000020  75 65 73 74 2e 74 61 70  2e 67 7a 0a 58 74 72 6f  |uest.tap.gz.Xtro|
00000030  74 68 20 2d 20 54 68 65  20 41 64 76 65 6e 74 75  |th - The Adventu|
00000040  72 65 01 58 74 72 6f 74  68 20 54 68 65 20 41 64  |re.Xtroth The Ad|
00000050  76 65 6e 74 75 72 65 20  2d 20 31 32 38 6b 2e 74  |venture - 128k.t|
00000060  61 70 2e 67 7a 0a 57 6f  6c 66 6d 61 6e 01 57 6f  |ap.gz.Wolfman.Wo|
00000070  6c 66 6d 61 6e 2e 74 61  70 2e 67 7a 0a 57 69 7a  |lfman.tap.gz.Wiz|
00000080  61 72 64 20 51 75 65 73  74 20 2d 20 54 68 65 20  |ard Quest - The |
00000090  47 75 69 64 65 01 57 69  7a 61 72 64 20 51 75 65  |Guide.Wizard Que|

The end-of-field markers are the 0x01 bytes (highlighted in red in the original) and the end-of-line markers are the 0x0a bytes (in blue).

The format is basically:

string     gamename
byte       FS (0x01)
string     filename
byte       RS (0x0a)

So a quick bit of Python to convert it to TSV format:

>>> with open('adventure.idx', 'rb') as f:
...     for line in f:
...         # each record is: game name, 0x01, file name, newline
...         sp = line.rstrip().split(b'\x01')
...         print("%s\t%s" % (sp[0].decode('latin-1'), sp[1].decode('latin-1')))
...
ZZZZ   ZZZZ.tap.gz
Zen Quest     Zen Quest.tap.gz
Xtroth - The Adventure Xtroth The Adventure - 128k.tap.gz
Wolfman Wolfman.tap.gz
Wizard Quest - The Guide     Wizard Quest - Guide.tap.gz

Looking into the games directory, we see (no surprise) the games:

[dave@mictlan games]$ ls -l | head
total 26972
-rwxr-xr-x 1 root root  11485 Sep  7  2015 1994.tap.gz
-rwxr-xr-x 1 root root    170 Sep  7  2015 1994.zxk
-rwxr-xr-x 1 root root  19736 Sep  7  2015 1999.tap.gz
-rwxr-xr-x 1 root root    119 Sep  7  2015 1999.zxk
-rwxr-xr-x 1 root root  22115 Sep  7  2015 20-20 Vision.tap.gz
-rwxr-xr-x 1 root root    123 Sep  7  2015 20-20 Vision.zxk
-rwxr-xr-x 1 root root  13631 Sep  7  2015 2088.tap.gz
-rwxr-xr-x 1 root root    115 Sep  7  2015 2088.zxk
-rwxr-xr-x 1 root root   4022 Sep  7  2015 3 Dimensional Noughts & Crosses.z80.gz

Each game has a gzipped version of the file in either z80 or tap file format and an extra file, the zxk file. The zxk file has a simple dictionary format:

[dave@mictlan games]$ cat ZZZZ.zxk
T:ZZZZ
F:ZZZZ.tap
M:48
C:adventure
K:;;;;;;;;;;;;;;;;;;;;;;;
D:;;;;;;;;;;;;;;;;;;;;;;;

This seems to be a simple file format of Attribute:Value pairs; the obvious ones are:

T   Game Title
F   Filename
M   Memory (48 or 128)
C   Classification (only adventure seen)
K   Keyboard definitions
D   Keyboard descriptions
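
Reading one of these files into a dictionary is straightforward. A minimal sketch, with a function name of my own choosing and a sample built from the ZZZZ.zxk listing above (the K and D lines are left out for brevity):

```python
def parse_zxk(text):
    """Parse 'X:value' lines into a dict keyed by the attribute letter."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split(":", 1)
        fields[key] = value
    return fields

sample = "T:ZZZZ\nF:ZZZZ.tap\nM:48\nC:adventure\n"
info = parse_zxk(sample)
print(info["T"], info["M"])
```

Splitting on the first colon only keeps any colons inside the value intact; if an attribute were ever repeated, the last one would win.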

The last thing to note is that this volume is pretty much full to capacity:

So presumably the firmware would expand in size dependent on games.

The rest of it

Right, we’ve decoded the lion’s share of the firmware, what about the rest?

First off I’m going to pull it into a separate file, so I can leave the original in case of error:

The hexdump is not that exciting:

[dave@mictlan zxvega]$ hexdump -C firstpart.bin | head -40
00000000  56 45 47 41 40 01 00 00  09 09 df 07 01 00 00 00  |VEGA@...........|
00000010  00 76 b6 01 00 00 00 00  30 2e 31 2e 36 34 20 32  |.v......0.1.64 2|
00000020  30 31 35 30 39 30 39 0a  54 68 65 20 53 69 6e 63  |0150909.The Sinc|
00000030  6c 61 69 72 20 5a 58 20  53 70 65 63 74 72 75 6d  |lair ZX Spectrum|
00000040  20 56 65 67 61 2e 0a 43  6f 70 79 72 69 67 68 74  | Vega..Copyright|
00000050  20 28 43 29 20 32 30 31  35 20 52 65 74 72 6f 20  | (C) 2015 Retro |
00000060  43 6f 6d 70 75 74 65 72  73 20 4c 69 6d 69 74 65  |Computers Limite|
00000070  64 2e 0a 77 77 77 2e 72  65 74 72 6f 2d 63 6f 6d  |d..www.retro-com|
00000080  70 75 74 65 72 73 2e 63  6f 2e 75 6b 20 20 77 77  |puters.co.uk  ww|
00000090  77 2e 7a 78 76 65 67 61  2e 63 6f 2e 75 6b 0a 43  |w.zxvega.co.uk.C|
000000a0  6f 6e 63 65 70 74 2c 20  43 41 44 2c 20 68 61 72  |oncept, CAD, har|
000000b0  64 77 61 72 65 20 64 65  73 69 67 6e 2c 20 43 2c  |dware design, C,|
000000c0  20 41 52 4d 20 61 6e 64  20 5a 38 30 0a 70 72 6f  | ARM and Z80.pro|
000000d0  67 72 61 6d 6d 69 6e 67  3a 20 43 68 72 69 73 20  |gramming: Chris |
000000e0  53 6d 69 74 68 2e 20 77  77 77 2e 7a 78 64 65 73  |Smith. www.zxdes|
000000f0  69 67 6e 2e 69 6e 66 6f  0a 00 00 00 00 00 00 00  |ign.info........|
00000100  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

There’s the string we saw earlier, with the copyright string, version number and header. We can try to clean this up:

Start  Length  Type   Value                               Use
00     4       u32    “VEGA”                              Magic
04     4       u32    0x00000140                          Version number (1.64)
08     4       u32    0x07df0909                          Date stamp (2015-09-09)
0C     4       u32    0x00000001                          Unknown
10     4       u32    0x01b67600                          Length of file without this header
14     4       u32    0x00000000                          Unknown
18     16      uchar  “0.1.64 20150909\n”                 Version String
28     419     uchar  “The Sinclair ZX Spectrum Vega …”   Copyright String
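
The fixed-size fields can be unpacked with struct to check the interpretation. This is a sketch: the bytes are copied from the hexdump above, the field meanings are the guesses from the table, and the function name is mine. Since both the month and the day are 9, this sample can't tell which of the two low bytes is which.

```python
import struct

# First 0x18 bytes of the firmware, copied from the hexdump above.
HEADER = bytes.fromhex(
    "56454741" "40010000" "0909df07" "01000000" "0076b601" "00000000")

def parse_vega_header(buf):
    """Unpack the little-endian fields at the start of the VEGA header."""
    magic, version, date, unk1, length, unk2 = struct.unpack_from("<4s5I", buf, 0)
    return {
        "magic": magic,
        "version": (version >> 8, version & 0xFF),              # 0x0140 -> (1, 64)
        "date": (date >> 16, (date >> 8) & 0xFF, date & 0xFF),  # year, then two 9s
        "payload_length": length,
    }

hdr = parse_vega_header(HEADER)
print(hdr)
```

Adding the 0x200-byte header to the length field gives 28,735,488 bytes, which is 0x88800 plus the 28,176,384-byte FAT volume, so the field does appear to cover everything after the header.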

Then there is empty space up to the end of the header (0x200), after which we get:

00000200  53 4d 45 4d f6 00 00 00  00 02 00 00 00 00 00 00  |SMEM............|
00000210  04 00 00 00 01 00 02 00  40 01 00 00 09 09 df 07  |........@.......|
00000220  43 04 00 00 54 68 65 20  53 69 6e 63 6c 61 69 72  |C...The Sinclair|
00000230  20 5a 58 20 53 70 65 63  74 72 75 6d 20 56 65 67  | ZX Spectrum Veg|
00000240  61 2e 0a 43 6f 70 79 72  69 67 68 74 20 28 43 29  |a..Copyright (C)|
00000250  20 32 30 31 35 20 52 65  74 72 6f 20 43 6f 6d 70  | 2015 Retro Comp|
00000260  75 74 65 72 73 20 4c 69  6d 69 74 65 64 2e 0a 77  |uters Limited..w|
00000270  77 77 2e 72 65 74 72 6f  2d 63 6f 6d 70 75 74 65  |ww.retro-compute|
00000280  72 73 2e 63 6f 2e 75 6b  20 20 77 77 77 2e 7a 78  |rs.co.uk  www.zx|
00000290  76 65 67 61 2e 63 6f 2e  75 6b 0a 43 6f 6e 63 65  |vega.co.uk.Conce|
000002a0  70 74 2c 20 43 41 44 2c  20 68 61 72 64 77 61 72  |pt, CAD, hardwar|
000002b0  65 20 64 65 73 69 67 6e  2c 20 43 2c 20 41 52 4d  |e design, C, ARM|
000002c0  20 61 6e 64 20 5a 38 30  0a 70 72 6f 67 72 61 6d  | and Z80.program|
000002d0  6d 69 6e 67 3a 20 43 68  72 69 73 20 53 6d 69 74  |ming: Chris Smit|
000002e0  68 2e 20 77 77 77 2e 7a  78 64 65 73 69 67 6e 2e  |h. www.zxdesign.|
000002f0  69 6e 66 6f 0a 00 35 c5  b7 bf 56 4c 0c 6d ec a6  |info..5...VL.m..|
00000300  94 c5 b4 80 71 97 6c 03  b5 86 53 54 4d 50 01 01  |....q.l...STMP..|
00000310  00 00 40 88 00 00 09 00  00 00 00 00 00 00 01 00  |..@.............|
00000320  07 00 06 00 01 00 01 00  36 61 09 0a 80 2d 80 b2  |........6a...-..|
00000330  a2 e2 56 c2 01 00 99 09  00 00 99 09 00 00 99 09  |..V.............|
00000340  00 00 99 09 00 00 99 09  00 00 99 09 00 00 00 00  |................|
00000350  cd 62 12 00 0e c5 00 00  00 00 0a 00 00 00 34 88  |.b............4.|
00000360  00 00 01 00 00 00 32 6d  3f 60 db db 73 48 e2 7e  |......2m?`..sH.~|
00000370  73 e6 61 50 8a 56 79 56  d1 87 c3 22 0d 2c 84 1c  |s.aP.VyV...".,..|
00000380  fa 0c 59 f8 d3 31 e1 6b  76 6a 55 3c 76 97 7e 80  |..Y..1.kvjU<v.~.|
00000390  59 7e db 7d c5 99 1e bc  af 3c 89 3e 62 47 24 fd  |Y~.}.....<.>bG$.|
000003a0  d3 5e 08 35 a9 d8 ce 1b  03 07 ad 20 23 0f 19 b6  |.^.5....... #...|
000003b0  2f f3 11 6d f9 08 6a 56  89 1c 24 17 ea 2f 7f fd  |/..m..jV..$../..|
000003c0  c6 9a 7f e3 a2 a9 9f d8  69 8f b5 83 e5 9c 7f b4  |........i.......|
000003d0  91 3b 87 4a 6d ab 4d cb  c7 9f e5 45 87 44 9b 19  |.;.Jm.M....E.D..|
000003e0  e4 1a 15 64 b8 eb 05 62  7e c3 62 d9 31 54 7a 9d  |...d...b~.b.1Tz.|
000003f0  a4 b4 be 10 d4 60 d3 d7  35 6d fd 42 a8 f6 2b 1f  |.....`..5m.B..+.|
00000400  1f f6 9b 62 96 26 28 29  e6 e1 7e ed ec d5 45 c3  |...b.&()..~...E.|

This looks like a chunked file format; we can see two chunk headers:

  1. SMEM
  2. STMP

The STMP is a big clue to where the file format comes from, and we can find a binary definition at Rockbox of all places! There’s a copy of the tool, elf2sb, that can be found on GitHub. There’s also documentation at nxp.com.

This defines the STMP chunk, but not the SMEM, so we’ll take a guess ourselves:

Start  Length  Type   Value                               Use
0x00   4       u32    “SMEM”                              Magic
0x04   4       u32    0x000000f6                          Length of SMEM chunk
0x08   4       u32    0x00000200                          Unknown
0x0C   4       u32    0x00000000                          Unknown
0x10   4       u32    0x00000004                          Unknown
0x14   4       u32    0x00020001                          Unknown
0x18   4       u32    0x00000140                          Version number (1.64)
0x1C   4       u32    0x07df0909                          Timestamp (2015-09-09)
0x20   4       u32    0x00000443                          Unknown
0x24   210     uchar  “The Sinclair ZX Spectrum Vega …”   Copyright String

Using the Rockbox link we can decode the STMP chunk, starting at 0x2f6. Where the word block is used below, it refers to a 16-byte block.

Start  Length  Type   Value                                     Use
0x00   20      uchar  35c5b7bf564c0c6deca694c5b48071976c03b586  SHA1(STMP chunk); also IV for encryption
0x14   4       u32    “STMP”                                    Magic
0x18   1       u8     0x01                                      Major version of format
0x19   1       u8     0x01                                      Minor version of format
0x1A   2       u16    0x0000                                    Flags
0x1C   4       u32    0x00008840                                Image size in blocks (0x8840 x 16 = 0x88400)
0x20   4       u32    0x00000009                                Offset to first boot tag in blocks (9 x 16 = 0x90)
0x24   4       u32    0x00000000                                First bootable section
0x28   2       u16    0x0001                                    Number of encryption keys
0x2A   2       u16    0x0007                                    Start block for key dictionary (7 x 16 = 0x70)
0x2C   2       u16    0x0006                                    Size of header in blocks (6 x 16 = 0x60)
0x2E   2       u16    0x0001                                    Number of section headers
0x30   2       u16    0x0001                                    Size of chunk headers in blocks
0x32   2       uchar  0x6136                                    Padding
0x34   4       u32    0x2d800a09                                Second signature
0x38   8       u64    0x0001c256e2a2b280                        Creation time in μs since 2000
0x40   4       u32    0x00000999                                Product major version
0x44   4       u32    0x00000999                                Product minor version
0x48   4       u32    0x00000999                                Product sub version
0x4C   4       u32    0x00000999                                Component major version
0x50   4       u32    0x00000999                                Component minor version
0x54   4       u32    0x00000999                                Component sub version
0x58   2       u16    0x0000                                    Drive tag
0x5A   6       uchar  0xcd6212000ec5                            Padding
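
The creation time field is a handy sanity check on the decoding. A small sketch, assuming the epoch is midnight on 1 January 2000 (the table only says "since 2000"); the function name is mine:

```python
from datetime import datetime, timedelta

STMP_EPOCH = datetime(2000, 1, 1)  # assumption: midnight, 1 January 2000

def stmp_timestamp(microseconds):
    """Convert the 64-bit creation time field into a datetime."""
    return STMP_EPOCH + timedelta(microseconds=microseconds)

print(stmp_timestamp(0x0001C256E2A2B280))
```

Under that assumption the value lands on 2015-09-09, the same day as the date stamp in the VEGA and SMEM headers.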

Even though there are some things that look wrong in the header we can check that it is correct by working out the SHA1 value of the header bytes from 0x14 to 0x5F to show that they match the hash in the header:

Which they do!
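
The check itself is only a few lines with hashlib. A sketch, assuming the chunk is passed in starting at its 20-byte digest (file offset 0x2f6 here); the function name is mine:

```python
import hashlib

def stmp_digest_ok(chunk):
    """Compare the stored SHA1 with the hash of header bytes 0x14..0x5f.

    `chunk` must start at the 20-byte digest of the STMP header.
    """
    stored = chunk[0x00:0x14]
    computed = hashlib.sha1(chunk[0x14:0x60]).digest()
    return stored == computed

# Usage against the real file (firstpart.bin is the extracted first part):
# with open("firstpart.bin", "rb") as f:
#     f.seek(0x2F6)
#     print(stmp_digest_ok(f.read(0x60)))
```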

Then we have the chunk header:

00000350  cd 62 12 00 0e c5 00 00  00 00 0a 00 00 00 34 88  |.b............4.|
00000360  00 00 01 00 00 00 32 6d  3f 60 db db 73 48 e2 7e  |......2m?`..sH.~|

Start  Length  Type  Value       Use
0x00   4       u32   0x00000000  Chunk name
0x04   4       u32   0x0000000a  Chunk offset in blocks (0x0a x 16 = 0xa0)
0x08   4       u32   0x00008834  Chunk size in blocks (0x8834 x 16 = 0x88340)
0x0C   4       u32   0x00000001  Flags; bit 0 is bootable

If you’ve kept track of the sizes you can see they’re all roughly correct indicating that we’re interpreting this correctly.
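
To make that concrete, the block counts can be turned into file offsets and compared with what we already know. A sketch using only values from the tables and dumps above (the STMP chunk starts at 0x2f6 and the FAT16 volume at 0x88800); the names are mine:

```python
BLOCK = 16          # the format counts in 16-byte blocks
STMP_START = 0x2F6  # file offset where the STMP chunk starts

def block_to_file_offset(block):
    """File offset of a block number counted from the start of the STMP chunk."""
    return STMP_START + block * BLOCK

key_dict = block_to_file_offset(0x07)         # key dictionary
section_start = block_to_file_offset(0x0A)    # first section
section_end = section_start + 0x8834 * BLOCK  # end of that section
image_end = block_to_file_offset(0x8840)      # end of the whole image

print(hex(key_dict), hex(section_start), hex(section_end), hex(image_end))
```

The key dictionary lands at 0x366, right after the chunk header in the dump, and both the section and the image end a little short of 0x88800 where the FAT16 volume begins.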

After this we have the DEK block – a block of Data Encryption Keys; each of the DEKs is encrypted with the KEK – the Key Encryption Key.

And that’s our problem – we don’t know the KEK. Bugger.

According to the Rockbox documentation it uses AES-128 in CBC mode to encrypt the DEK with a KEK passed on the command line.

If we follow on with the Rockbox documentation, the DEK block should start at block 7 (i.e. 0x70 from the header), which is straight after the chunk header. This consists of two blocks:

00000360  00 00 01 00 00 00 32 6d  3f 60 db db 73 48 e2 7e  |......2m?`..sH.~|
00000370  73 e6 61 50 8a 56 79 56  d1 87 c3 22 0d 2c 84 1c  |s.aP.VyV...".,..|
00000380  fa 0c 59 f8 d3 31 e1 6b  76 6a 55 3c 76 97 7e 80  |..Y..1.kvjU<v.~.|

Which translates into:

Start  Length  Type   Value                             Use
0x00   16      uchar  326d3f60dbdb7348e27e73e661508a56  Encrypted CBC-MAC of the header
0x10   16      uchar  7956d187c3220d2c841cfa0c59f8d331  Encrypted DEK

We can confirm what we know using the sbtool program found on the eewiki github:

We’ve hit a wall with the firmware here, unless we can brute force the key (128 bits – that’s 3.4028236692093846346337460743177e+38 different combinations – that’s not happening any time soon) or we can look into using side channel mechanisms (maybe more on that later).
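
To put that number in perspective, here is the arithmetic as a short sketch. The guessing rate is a made-up figure purely for scale, not a measurement of any real hardware.

```python
KEY_BITS = 128
keyspace = 2 ** KEY_BITS

# Hypothetical attacker speed, purely for scale: 10^12 keys per second.
RATE = 10 ** 12
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

years_to_exhaust = keyspace // (RATE * SECONDS_PER_YEAR)
print("%.4e keys" % keyspace)
print("about %.1e years to try them all" % years_to_exhaust)
```

Even at that rate the search runs to more than 10^19 years.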

It’s time to look at the hardware.