Introduction
A while back I got a ZX Vega for the LOLs and to try to remember what the Spectrum was like. Of course, I didn’t need to remember: I have quite a few in watertight boxes in the garage and, of course, I was a BBC Micro fan as a lad.
So, after a quick play and realising that 80s resolution on a 42” TV does not look good and that Spectrum attribute clash has always been horrible (thanks to the way that the Speccy did screen memory), I wondered how this thing worked.
Obviously, it’s running a Spectrum emulator – my initial guess is that the creators of the Vega wouldn’t have written their own – not when there are many open-source and easily licensed emulators around.
As a quick finger-in-the-air estimation, I thought it would probably use some simple Linux-like base OS, running a framebuffer menu over the top and probably spawning fuse. After all, that’s how I would do it.
The Firmware
When reverse engineering stuff like this you can attack at multiple levels: looking at the software, looking at the hardware or attempting to reverse the firmware.
I’m lazy, so I started with the firmware (to be honest I looked at this even before I had the hardware). This is easily obtainable from the site highlighted above. This comes in at just shy of 29 MB:
The first step when reversing stuff is to use some binary matching tools; these look for signatures in the data and attempt to work out what is in there. The one that usually gives the best results is binwalk.
This time, though, it wasn’t brilliantly helpful. Below is an edited extract of what it returned:
DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
71            0x47            Copyright string: "Copyright (C) 2015 Retro Computers Limited."
579           0x243           Copyright string: "Copyright (C) 2015 Retro Computers Limited."
665530        0xA27BA         StuffIt Deluxe Segment (data): fton Manor Assignment.tap.gz
2689030       0x290806        StuffIt Deluxe Segment (data): fton Manor Assignment
2689060       0x290824        StuffIt Deluxe Segment (data): fton Manor Assignment.tap
2963660       0x2D38CC        Zlib compressed data, best compression
[…]
4015378       0x3D4512        Zlib compressed data, best compression
4024320       0x3D6800        gzip compressed data, has original file name: "1994.tap", from Unix, last modified: 2015-09-07 19:57:07
4036608       0x3D9800        gzip compressed data, has original file name: "1999.tap", from Unix, last modified: 2015-09-07 19:57:07
4057088       0x3DE800        gzip compressed data, has original file name: "20-20 Vision.tap", from Unix, last modified: 2015-09-07 19:57:07
This is sort of helpful; but not really – a copyright string that I could’ve guessed and some references to compressed Spectrum games.
Let’s use a more common tool to search through here: the standard strings utility, which looks for runs of bytes that appear to be ASCII text. Once again, here’s an edited output:
VEGA@
0.1.64 20150909
The Sinclair ZX Spectrum Vega.
Copyright (C) 2015 Retro Computers Limited.
www.retro-computers.co.uk www.zxvega.co.uk
Concept, CAD, hardware design, C, ARM and Z80
programming: Chris Smith. www.zxdesign.info
SMEM
The Sinclair ZX Spectrum Vega.
Copyright (C) 2015 Retro Computers Limited.
www.retro-computers.co.uk www.zxvega.co.uk
Concept, CAD, hardware design, C, ARM and Z80
programming: Chris Smith. www.zxdesign.info
STMP
2m?`
kvjU<v
This is more useful: we have some comments, a version string (0.1.64 20150909) and what look like chunk headers (VEGA, SMEM, STMP). We’ll come back to these later.
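For anyone without the strings utility to hand, the same idea fits in a few lines of Python. This is only a rough sketch of what the tool does (runs of four or more printable ASCII bytes), not a reimplementation of it; the function name `ascii_strings` is my own.

```python
import re


def ascii_strings(data: bytes, min_len: int = 4):
    """Yield (offset, text) for each run of printable ASCII at least min_len long."""
    # 0x20-0x7e is the printable ASCII range; anything else ends the run.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")


# Usage (the firmware filename is a placeholder, not the real one):
#   with open("firmware.bin", "rb") as f:
#       for offset, text in ascii_strings(f.read()):
#           print("0x%06x  %s" % (offset, text))
```

Printing the offsets as well is handy here, because it tells you where in the file each chunk header sits.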
A bit further down the file we have this:
[…]
t:iyJ
=0Pb
a=I1;3$f
n+[z
7&@CJ
mkdosfs
NO NAME FAT16
This is not a bootable disk. Please insert a bootable floppy and
press any key to try again ...
Now that’s useful – that’s the header for a FAT16 volume made by mkdosfs. As this is a filing system we should be able to rip it out and mount it through loopback.
Looking at it through hexdump we can see the header at offset 0x88800:
This all follows what we’d expect to see with a FAT16 volume. At this point we can safely assume that there’s a FAT16 volume from 0x88800 onwards. Using a simple structure viewer to highlight the header gives us:
From here we can see that the volume consists of 55032 sectors of 512 bytes, which is 28,176,384 bytes, or 27,516 KB. Or, to put it simply, the remainder of the file:
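That arithmetic can be checked against the boot sector itself. The sketch below assumes the standard FAT BIOS Parameter Block layout (OEM name at byte 3, bytes per sector at byte 11, 16-bit total sector count at byte 19, 32-bit count at byte 32); the function name is mine.

```python
import struct


def parse_fat_bpb(boot_sector: bytes) -> dict:
    """Decode the few BIOS Parameter Block fields needed to size a FAT volume."""
    (oem, bytes_per_sector, sectors_per_cluster, reserved, fat_count,
     root_entries, total16, media, sectors_per_fat) = struct.unpack_from(
        "<8sHBHBHHBH", boot_sector, 3)
    (total32,) = struct.unpack_from("<I", boot_sector, 32)
    # Small volumes use the 16-bit count; otherwise it is zero and the
    # 32-bit field holds the real value.
    total_sectors = total16 or total32
    return {
        "oem": oem.rstrip(b" \x00").decode("ascii", "replace"),
        "bytes_per_sector": bytes_per_sector,
        "total_sectors": total_sectors,
        "volume_bytes": bytes_per_sector * total_sectors,
    }


# Usage (placeholder filename): read 512 bytes at offset 0x88800 and pass
# them in; 55032 sectors of 512 bytes should give 28176384 volume_bytes.
```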
To finally confirm this, let’s extract the end bit and mount it. First, splitting the file:
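A minimal Python sketch of that split, for anyone who prefers it to dd. This is not the command used here, just an equivalent; the function name and the input and image filenames in the usage comment are placeholders.

```python
import shutil


def split_at(src_path, offset, head_path, tail_path):
    """Write bytes [0, offset) of src_path to head_path and the rest to tail_path."""
    with open(src_path, "rb") as src:
        with open(head_path, "wb") as head:
            head.write(src.read(offset))   # the head is only about 546 KB here
        with open(tail_path, "wb") as tail:
            shutil.copyfileobj(src, tail)  # stream the remainder without loading it all


# Usage (filenames are placeholders):
#   split_at("firmware.bin", 0x88800, "firstpart.bin", "fat16.img")
```

The second output file should then be a plain FAT16 image that loopback can mount.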
Now we can mount it:
Inside the Games Archive
Looks like we have a filesystem; there’s a load of text files on there, which seem to have a simple file format, for example looking at adventure.idx:
00000000  5a 5a 5a 5a 01 5a 5a 5a  5a 2e 74 61 70 2e 67 7a  |ZZZZ.ZZZZ.tap.gz|
00000010  0a 5a 65 6e 20 51 75 65  73 74 01 5a 65 6e 20 51  |.Zen Quest.Zen Q|
00000020  75 65 73 74 2e 74 61 70  2e 67 7a 0a 58 74 72 6f  |uest.tap.gz.Xtro|
00000030  74 68 20 2d 20 54 68 65  20 41 64 76 65 6e 74 75  |th - The Adventu|
00000040  72 65 01 58 74 72 6f 74  68 20 54 68 65 20 41 64  |re.Xtroth The Ad|
00000050  76 65 6e 74 75 72 65 20  2d 20 31 32 38 6b 2e 74  |venture - 128k.t|
00000060  61 70 2e 67 7a 0a 57 6f  6c 66 6d 61 6e 01 57 6f  |ap.gz.Wolfman.Wo|
00000070  6c 66 6d 61 6e 2e 74 61  70 2e 67 7a 0a 57 69 7a  |lfman.tap.gz.Wiz|
00000080  61 72 64 20 51 75 65 73  74 20 2d 20 54 68 65 20  |ard Quest - The |
00000090  47 75 69 64 65 01 57 69  7a 61 72 64 20 51 75 65  |Guide.Wizard Que|
In red I’ve highlighted the end-of-field markers (the 0x01 bytes) and in blue the end-of-line markers (the 0x0a bytes).
The format is basically:
string gamename
byte FS (0x01)
string filename
byte RS (0x0a)
So a quick bit of Python to convert it to TSV format:
>>> with open('adventure.idx', 'rb') as f:
...     for line in f:
...         # each record is: gamename 0x01 filename 0x0a
...         sp = line.rstrip().split(b'\x01')
...         print("%s\t%s" % (sp[0].decode('latin-1'), sp[1].decode('latin-1')))
...
ZZZZ	ZZZZ.tap.gz
Zen Quest	Zen Quest.tap.gz
Xtroth - The Adventure	Xtroth The Adventure - 128k.tap.gz
Wolfman	Wolfman.tap.gz
Wizard Quest - The Guide	Wizard Quest - Guide.tap.gz
Looking into the games directory we see (no surprise) the games:
[dave@mictlan games]$ ls -l | head
total 26972
-rwxr-xr-x 1 root root 11485 Sep  7  2015 1994.tap.gz
-rwxr-xr-x 1 root root   170 Sep  7  2015 1994.zxk
-rwxr-xr-x 1 root root 19736 Sep  7  2015 1999.tap.gz
-rwxr-xr-x 1 root root   119 Sep  7  2015 1999.zxk
-rwxr-xr-x 1 root root 22115 Sep  7  2015 20-20 Vision.tap.gz
-rwxr-xr-x 1 root root   123 Sep  7  2015 20-20 Vision.zxk
-rwxr-xr-x 1 root root 13631 Sep  7  2015 2088.tap.gz
-rwxr-xr-x 1 root root   115 Sep  7  2015 2088.zxk
-rwxr-xr-x 1 root root  4022 Sep  7  2015 3 Dimensional Noughts & Crosses.z80.gz
Each game has a gzipped version of the file in either z80 or tap format, plus an extra .zxk file. The zxk file has a simple dictionary format:
[dave@mictlan games]$ cat ZZZZ.zxk
T:ZZZZ
F:ZZZZ.tap
M:48
C:adventure
K:;;;;;;;;;;;;;;;;;;;;;;;
D:;;;;;;;;;;;;;;;;;;;;;;;
This seems to be a simple file format of Attribute:Value pairs; the obvious ones are:
T Game Title
F Filename
M Memory (48 or 128)
C Classification (only adventure seen)
K Keyboard definitions
D Keyboard descriptions
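A minimal parser for that format might look like the sketch below. It assumes one Attribute:Value pair per line and splits on the first colon only, so titles containing a colon survive; the K and D values are left as raw strings because I haven't worked out their internal layout. The function name is my own.

```python
def parse_zxk(text: str) -> dict:
    """Parse Attribute:Value lines from a .zxk file into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError("line has no ':' separator: %r" % line)
        fields[key.strip()] = value
    return fields


# Usage: parse_zxk(open("ZZZZ.zxk").read())["M"] gives the memory model as a string.
```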
The last thing to note is that this volume is pretty much full to capacity:
So presumably the firmware would grow in size depending on the games included.
The rest of it
Right, we’ve decoded the lion’s share of the firmware; what about the rest?
First off I’m going to pull it into a separate file, so I can leave the original in case of error:
The hexdump is not that exciting:
[dave@mictlan zxvega]$ hexdump -C firstpart.bin | head -40
00000000  56 45 47 41 40 01 00 00  09 09 df 07 01 00 00 00  |VEGA@...........|
00000010  00 76 b6 01 00 00 00 00  30 2e 31 2e 36 34 20 32  |.v......0.1.64 2|
00000020  30 31 35 30 39 30 39 0a  54 68 65 20 53 69 6e 63  |0150909.The Sinc|
00000030  6c 61 69 72 20 5a 58 20  53 70 65 63 74 72 75 6d  |lair ZX Spectrum|
00000040  20 56 65 67 61 2e 0a 43  6f 70 79 72 69 67 68 74  | Vega..Copyright|
00000050  20 28 43 29 20 32 30 31  35 20 52 65 74 72 6f 20  | (C) 2015 Retro |
00000060  43 6f 6d 70 75 74 65 72  73 20 4c 69 6d 69 74 65  |Computers Limite|
00000070  64 2e 0a 77 77 77 2e 72  65 74 72 6f 2d 63 6f 6d  |d..www.retro-com|
00000080  70 75 74 65 72 73 2e 63  6f 2e 75 6b 20 20 77 77  |puters.co.uk  ww|
00000090  77 2e 7a 78 76 65 67 61  2e 63 6f 2e 75 6b 0a 43  |w.zxvega.co.uk.C|
000000a0  6f 6e 63 65 70 74 2c 20  43 41 44 2c 20 68 61 72  |oncept, CAD, har|
000000b0  64 77 61 72 65 20 64 65  73 69 67 6e 2c 20 43 2c  |dware design, C,|
000000c0  20 41 52 4d 20 61 6e 64  20 5a 38 30 0a 70 72 6f  | ARM and Z80.pro|
000000d0  67 72 61 6d 6d 69 6e 67  3a 20 43 68 72 69 73 20  |gramming: Chris |
000000e0  53 6d 69 74 68 2e 20 77  77 77 2e 7a 78 64 65 73  |Smith. www.zxdes|
000000f0  69 67 6e 2e 69 6e 66 6f  0a 00 00 00 00 00 00 00  |ign.info........|
00000100  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
There’s the string we saw earlier, with the copyright string, version number and header. We can try to clean this up:
Start | Length | Type | Value | Use |
00 | 4 | u32 | “VEGA” | Magic |
04 | 4 | u32 | 0x00000140 | Version number (1.64) |
08 | 4 | u32 | 0x07df0909 | Date stamp (2015-09-09) |
0C | 4 | u32 | 0x00000001 | Unknown |
10 | 4 | u32 | 0x01b67600 | Length of file without this header |
14 | 4 | u32 | 0x00000000 | Unknown |
18 | 16 | uchar | “0.1.64 20150909\n” | Version String |
28 | 419 | uchar | “The Sinclair ZX Spectrum Vega …” | Copyright String |
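The fixed part of that table maps neatly onto a struct format string. The sketch below is my reading of the header, not an official definition: the field names are invented, and since the day and month are both 9 in this image I can't tell which byte is which, so that ordering is a guess.

```python
import struct
from collections import namedtuple

VegaHeader = namedtuple(
    "VegaHeader",
    "magic version day month year unknown_0c payload_length unknown_14 version_string",
)


def parse_vega_header(data: bytes) -> VegaHeader:
    """Unpack the first 0x28 bytes of the file (all fields little-endian)."""
    header = VegaHeader(*struct.unpack_from("<4sIBBHIII16s", data, 0))
    if header.magic != b"VEGA":
        raise ValueError("not a VEGA header")
    return header


# Usage with the bytes from the hexdump above:
#   h = parse_vega_header(open("firstpart.bin", "rb").read(0x28))
#   (h.version >> 8, h.version & 0xFF) -> (1, 64); h.year -> 2015
```

The length field is a nice sanity check: 0x01b67600 plus the 0x200-byte header comes to 28,735,488 bytes, which is the same as 0x88800 plus the 28,176,384-byte FAT16 volume.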
Then there is empty space up to the end of the header (0x200), after which we get:
00000200  53 4d 45 4d f6 00 00 00  00 02 00 00 00 00 00 00  |SMEM............|
00000210  04 00 00 00 01 00 02 00  40 01 00 00 09 09 df 07  |........@.......|
00000220  43 04 00 00 54 68 65 20  53 69 6e 63 6c 61 69 72  |C...The Sinclair|
00000230  20 5a 58 20 53 70 65 63  74 72 75 6d 20 56 65 67  | ZX Spectrum Veg|
00000240  61 2e 0a 43 6f 70 79 72  69 67 68 74 20 28 43 29  |a..Copyright (C)|
00000250  20 32 30 31 35 20 52 65  74 72 6f 20 43 6f 6d 70  | 2015 Retro Comp|
00000260  75 74 65 72 73 20 4c 69  6d 69 74 65 64 2e 0a 77  |uters Limited..w|
00000270  77 77 2e 72 65 74 72 6f  2d 63 6f 6d 70 75 74 65  |ww.retro-compute|
00000280  72 73 2e 63 6f 2e 75 6b  20 20 77 77 77 2e 7a 78  |rs.co.uk  www.zx|
00000290  76 65 67 61 2e 63 6f 2e  75 6b 0a 43 6f 6e 63 65  |vega.co.uk.Conce|
000002a0  70 74 2c 20 43 41 44 2c  20 68 61 72 64 77 61 72  |pt, CAD, hardwar|
000002b0  65 20 64 65 73 69 67 6e  2c 20 43 2c 20 41 52 4d  |e design, C, ARM|
000002c0  20 61 6e 64 20 5a 38 30  0a 70 72 6f 67 72 61 6d  | and Z80.program|
000002d0  6d 69 6e 67 3a 20 43 68  72 69 73 20 53 6d 69 74  |ming: Chris Smit|
000002e0  68 2e 20 77 77 77 2e 7a  78 64 65 73 69 67 6e 2e  |h. www.zxdesign.|
000002f0  69 6e 66 6f 0a 00 35 c5  b7 bf 56 4c 0c 6d ec a6  |info..5...VL.m..|
00000300  94 c5 b4 80 71 97 6c 03  b5 86 53 54 4d 50 01 01  |....q.l...STMP..|
00000310  00 00 40 88 00 00 09 00  00 00 00 00 00 00 01 00  |..@.............|
00000320  07 00 06 00 01 00 01 00  36 61 09 0a 80 2d 80 b2  |........6a...-..|
00000330  a2 e2 56 c2 01 00 99 09  00 00 99 09 00 00 99 09  |..V.............|
00000340  00 00 99 09 00 00 99 09  00 00 99 09 00 00 00 00  |................|
00000350  cd 62 12 00 0e c5 00 00  00 00 0a 00 00 00 34 88  |.b............4.|
00000360  00 00 01 00 00 00 32 6d  3f 60 db db 73 48 e2 7e  |......2m?`..sH.~|
00000370  73 e6 61 50 8a 56 79 56  d1 87 c3 22 0d 2c 84 1c  |s.aP.VyV...".,..|
00000380  fa 0c 59 f8 d3 31 e1 6b  76 6a 55 3c 76 97 7e 80  |..Y..1.kvjU<v.~.|
00000390  59 7e db 7d c5 99 1e bc  af 3c 89 3e 62 47 24 fd  |Y~.}.....<.>bG$.|
000003a0  d3 5e 08 35 a9 d8 ce 1b  03 07 ad 20 23 0f 19 b6  |.^.5....... #...|
000003b0  2f f3 11 6d f9 08 6a 56  89 1c 24 17 ea 2f 7f fd  |/..m..jV..$../..|
000003c0  c6 9a 7f e3 a2 a9 9f d8  69 8f b5 83 e5 9c 7f b4  |........i.......|
000003d0  91 3b 87 4a 6d ab 4d cb  c7 9f e5 45 87 44 9b 19  |.;.Jm.M....E.D..|
000003e0  e4 1a 15 64 b8 eb 05 62  7e c3 62 d9 31 54 7a 9d  |...d...b~.b.1Tz.|
000003f0  a4 b4 be 10 d4 60 d3 d7  35 6d fd 42 a8 f6 2b 1f  |.....`..5m.B..+.|
00000400  1f f6 9b 62 96 26 28 29  e6 e1 7e ed ec d5 45 c3  |...b.&()..~...E.|
This looks like a chunked file format; we can see two chunk headers:
- SMEM
- STMP
The STMP is a big clue as to where the file format comes from, and we can find a binary definition at Rockbox, of all places! There’s a copy of the tool, elf2sb, that can be found on GitHub. There’s also documentation at nxp.com.
This defines the STMP chunk, but not the SMEM, so we’ll take a guess ourselves:
Start | Length | Type | Value | Use |
0x00 | 4 | u32 | “SMEM” | Magic |
0x04 | 4 | u32 | 0x000000f6 | Length of SMEM chunk |
0x08 | 4 | u32 | 0x00000200 | Unknown |
0x0C | 4 | u32 | 0x00000000 | Unknown |
0x10 | 4 | u32 | 0x00000004 | Unknown |
0x14 | 4 | u32 | 0x00020001 | Unknown |
0x18 | 4 | u32 | 0x00000140 | Version number (1.64) |
0x1C | 4 | u32 | 0x07df0909 | Timestamp (2015-09-09) |
0x20 | 4 | u32 | 0x00000443 | Unknown |
0x24 | 210 | uchar | “The Sinclair ZX Spectrum Vega …” | Copyright String |
Using the Rockbox link we can decode the STMP chunk, which starts at 0x2f6. Where the word “block” is used below, it refers to a 16-byte block.
Start | Length | Type | Value | Use |
0x00 | 20 | uchar | 35c5b7bf564c0c6deca694c5b48071976c03b586 | SHA1(STMP chunk); also IV for encryption |
0x14 | 4 | u32 | “STMP” | Magic |
0x18 | 1 | u8 | 0x01 | Major version of format |
0x19 | 1 | u8 | 0x01 | Minor version of format |
0x1A | 2 | u16 | 0x0000 | Flags |
0x1C | 4 | u32 | 0x00008840 | Image size in blocks (0x8840 x 16 = 0x88400) |
0x20 | 4 | u32 | 0x00000009 | Offset to first boot tag in blocks (9 x 16 = 0x90) |
0x24 | 4 | u32 | 0x00000000 | First bootable section |
0x28 | 2 | u16 | 0x0001 | Number of encryption keys |
0x2A | 2 | u16 | 0x0007 | Start block for key dictionary (7 x 16 = 0x70) |
0x2C | 2 | u16 | 0x0006 | Size of header in blocks (6 x 16 = 0x60) |
0x2E | 2 | u16 | 0x0001 | Number of section headers |
0x30 | 2 | u16 | 0x0001 | Size of chunk headers in blocks |
0x32 | 2 | uchar | 0x3661 | Padding |
0x34 | 4 | u32 | 0x2d800a09 | Second signature |
0x38 | 8 | u64 | 0x0001c256e2a2b280 | Creation time in μs since 2000 |
0x40 | 4 | u32 | 0x00000999 | Product major version |
0x44 | 4 | u32 | 0x00000999 | Product minor version |
0x48 | 4 | u32 | 0x00000999 | Product sub version |
0x4C | 4 | u32 | 0x00000999 | Component major version |
0x50 | 4 | u32 | 0x00000999 | Component minor version |
0x54 | 4 | u32 | 0x00000999 | Component sub version |
0x58 | 2 | u16 | 0x0000 | Drive tag |
0x5A | 6 | uchar | 0xcd6212000ec5 | Padding |
Even though there are some things that look wrong in the header we can check that it is correct by working out the SHA1 value of the header bytes from 0x14 to 0x5F to show that they match the hash in the header:
Which they do!
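For reference, the check can be sketched in a few lines of Python using only hashlib. The function name and constants are my own; the only things taken from the format are that the first 20 bytes hold the digest and that it covers bytes 0x14 to 0x5F of the header.

```python
import hashlib

SB_HEADER_SIZE = 0x60   # six 16-byte blocks
DIGEST_SIZE = 20        # SHA-1 output length in bytes


def sb_header_digest_ok(header: bytes) -> bool:
    """Return True if the stored SHA-1 matches the rest of the header."""
    if len(header) < SB_HEADER_SIZE:
        raise ValueError("need at least 0x60 bytes of header")
    stored = header[:DIGEST_SIZE]
    computed = hashlib.sha1(header[DIGEST_SIZE:SB_HEADER_SIZE]).digest()
    return stored == computed


# Usage: the STMP header starts at 0x2f6 in firstpart.bin, so
#   data = open("firstpart.bin", "rb").read()
#   sb_header_digest_ok(data[0x2f6:0x2f6 + 0x60])
```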
Then we have the chunk header:
00000350  cd 62 12 00 0e c5 00 00  00 00 0a 00 00 00 34 88  |.b............4.|
00000360  00 00 01 00 00 00 32 6d  3f 60 db db 73 48 e2 7e  |......2m?`..sH.~|
Start | Length | Type | Value | Use |
0x00 | 4 | u32 | 0x00000000 | chunk name |
0x04 | 4 | u32 | 0x0000000a | chunk offset in blocks (0x0a * 16 = 0xa0) |
0x08 | 4 | u32 | 0x00008834 | chunk size in blocks (0x8834 * 16 = 0x88340) |
0x0C | 4 | u32 | 0x00000001 | Flags; bit 0 is bootable. |
If you’ve kept track of the sizes you can see they’re all roughly correct indicating that we’re interpreting this correctly.
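To make the bookkeeping explicit, here is the block arithmetic as a small Python check. All values are read straight from the hexdump (for instance, the first boot tag field is the bytes 09 00 00 00 at 0x316). The one-block boot tag is an assumption on my part, made because it is what lets the offsets line up.

```python
BLOCK_SIZE = 16  # bytes per block in this container

# Values read from the STMP header and the chunk header in the hexdump.
HEADER_BLOCKS = 0x0006
SECTION_HEADER_COUNT = 0x0001
SECTION_HEADER_BLOCKS = 0x0001
KEY_COUNT = 0x0001
KEY_DICT_START = 0x0007
FIRST_BOOT_TAG = 0x00000009
SECTION_OFFSET = 0x0000000A
SECTION_SIZE = 0x00008834
IMAGE_SIZE = 0x00008840


def check_layout() -> int:
    """Check the regions are contiguous; return blocks left after the section."""
    assert HEADER_BLOCKS + SECTION_HEADER_COUNT * SECTION_HEADER_BLOCKS == KEY_DICT_START
    # Each key dictionary entry is two blocks: encrypted CBC-MAC plus encrypted DEK.
    assert KEY_DICT_START + 2 * KEY_COUNT == FIRST_BOOT_TAG
    # Assumption: the boot tag itself occupies one block before the section data.
    assert FIRST_BOOT_TAG + 1 == SECTION_OFFSET
    return IMAGE_SIZE - (SECTION_OFFSET + SECTION_SIZE)


print("trailing blocks:", check_layout())
```

That leaves two blocks (32 bytes) at the end of the image unaccounted for, which would be enough room for a 20-byte SHA-1 plus padding; I believe the format finishes with a digest like that, but treat it as a guess. The image itself ends at 0x2f6 + 0x88400 = 0x886f6, a little short of the FAT16 volume at 0x88800.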
After this we have the DEK block – a block of Data Encryption Keys; each DEK is encrypted with the KEK – the Key Encryption Key.
And that’s our problem – we don’t know the KEK. Bugger.
According to the Rockbox documentation it uses AES-128 in CBC mode to encrypt the DEK with a KEK passed on the command line.
If we follow the Rockbox documentation, the DEK block should start at block 7 (i.e. 0x70 from the start of the header), which is straight after the chunk header. It consists of two blocks:
00000360  00 00 01 00 00 00 32 6d  3f 60 db db 73 48 e2 7e  |......2m?`..sH.~|
00000370  73 e6 61 50 8a 56 79 56  d1 87 c3 22 0d 2c 84 1c  |s.aP.VyV...".,..|
00000380  fa 0c 59 f8 d3 31 e1 6b  76 6a 55 3c 76 97 7e 80  |..Y..1.kvjU<v.~.|
Which translates into:
Start | Length | Type | Value | Use |
0x00 | 16 | uchar | 326d3f60dbdb7348e27e73e661508a56 | Encrypted CBC-MAC of the header |
0x10 | 16 | uchar | 7956d187c3220d2c841cfa0c59f8d331 | Encrypted DEK |
We can confirm what we know using the sbtool program found on the eewiki github:
We’ve hit a wall with the firmware here, unless we can brute-force the key (128 bits – that’s 3.4028236692093846346337460743177e+38 different combinations – that’s not happening any time soon) or we can look into using side-channel mechanisms (maybe more on that later).
It’s time to look at the hardware.