Copy Protection in Jet Set Willy: developing methodology for retrogame archaeology

John Aycock¹ and Andrew Reinhard²

1. Department of Computer Science, University of Calgary, Canada
2. Archaeogaming

Cite this as: Aycock, J. and Reinhard, A. 2017 Copy Protection in Jet Set Willy: developing methodology for retrogame archaeology, Internet Archaeology 45.

1. Introduction

It would be hard to argue that computer games have not had a cultural impact. The computer games industry is massive, with estimated worldwide revenues over 99 billion dollars in 2016 (Newzoo 2016). Countless people play 'casual' games (Juul 2010) and the effects of games can spill out into the physical world, as they have recently with Pokémon GO (Wingfield and Isaac 2016). Certain video games and their characters are widely recognised far outside their original setting, like Pac-Man and Mario. As a result, they can be and have been subject to archaeological study (Reinhard forthcoming).

'Retro' computer games, or retrogames, which we will define here as games created in the early 1990s and earlier, are especially interesting from the point of view of studying how humans (programmers) made use of the extremely limited computing technology of the time. The computers driving retrogames could have very little memory (RAM and ROM), slow CPUs, slow and small secondary storage (e.g., cassette tapes, floppy disks), and other technological constraints (Aycock 2016). This also extended to programming environments and, at times, game programmers would even create their own tools (Aycock 2016). While the quality of retrogames may be derided compared to modern games, we argue that it is impressive that some of these retrogames existed at all, given the technical constraints.

How did humans coerce their tools into creating these games? How did the games themselves not only encourage people to play them, but to respect early forms of copy protection on a new kind of intellectual property? How can these games be studied? In this article, we use the 1984 game Jet Set Willy for the ZX Spectrum as a vehicle to demonstrate research methodology for retrogame archaeology.

Figure 1: ZX Spectrum. Dimensions (mm) are 230x144x30 (WxHxD)

The ZX Spectrum, made in the UK, was introduced in 1982; it is shown in Figure 1. For anyone unfamiliar with the home computers of this era, it is worth pointing out that this is the complete computer, not just the keyboard. Other input and output was supplied by devices that were readily available or already present in a typical home: a television for video output, and a cassette player for loading and saving computer programs. The primary market for the Spectrum was the UK, and by some accounts it was for a time the 'UK's most popular computer' (BBC 2007). The ZX Spectrum sold 50,000 units per month in its heyday (Carroll 2014), and it also had a presence in the Eastern Bloc via unauthorised clones (Stachniak 2015).

Jet Set Willy itself was a sequel to the 1983 game Manic Miner; both were written by Matthew Smith and both were what would now be called 2D platform games. It is hard to find words that adequately describe Manic Miner, as it was 'enlivened by a taste for the surreal and the bizarre that would become common among early British games' (Donovan 2010, 116), and Jet Set Willy was 'an even weirder sequel' (Donovan 2010, 116). However, both hold their own against games that are decades more modern, appearing in a 2014 list of 'The 30 greatest British video games' (Parkin et al. 2014). Game-wise, the player of Jet Set Willy had to move Willy from room to room (i.e., screen to screen) in a mansion and collect objects, purportedly 'glasses and bottles' left over from the previous night's party (Software Projects 1983). We observe that discussion of Jet Set Willy and its influence is mostly useful for situating it in a historical context, and does not necessarily have any connection to its underlying implementation.

Computer programs distributed on cassette tape, like Jet Set Willy, were particularly vulnerable to software piracy. It was simply too easy to make copies of a tape. Here we narrow our scope to the copy protection method Jet Set Willy employed, as it allows us to focus on some techniques that have relevance for archaeologists working on computer software.

The work described in this article is interdisciplinary. While we have written the article using the 'academic we' to avoid awkwardness in the text, the first author (a computer scientist) performed the technical analysis described here, and the second author (an archaeologist) has interpreted it through the lens of archaeology and situated it within the field.

2. Retrogame Archaeology

Archaeogaming can be broadly defined as the archaeology in and of video games. Video games, created by people (or by machines or routines created by people), contain their own player- and developer-cultures (which exist in the real world), and can contain their own manufactured cultures (which exist solely within the game-space). Retrogame archaeology is subsumed into archaeogaming as focusing on the study of older games and of game history, treating the games as media artefacts both created and used by a nascent, emerging digital creative and consumer culture. Millions of people interact with these games both in-game and out, occupying them as sites, and manipulating them as artefacts when they play, study, and live. Because of this creation and occupation in the actual and virtual worlds, games merit archaeological study, and archaeogaming is the literal interpretation of games as sites and artefacts. In some ways this is no different from any place on Earth that has been manipulated, managed, and transformed by people past and present. In other ways, it offers wholly new dimensions, including how archaeologists consider time, space, and human interaction with places that are simultaneously both real and virtual.

Most of archaeology could be described as the history of technology. As Olli Sotamaa put it, 'the known history of games is a history of artifacts' (Sotamaa 2014, 3–4). Claus Pias defines technology as 'a relay between technical artifact, aesthetic standards, cultural practices, and knowledge. Technology does something, not is something' (Pias 2011, 180–1). Technology is an artefact-creation tool, itself a creation of people. As Wolfgang Ernst writes, '[archaeologists] are dealing with the past as delayed presence, preserved in technological memory. We are not communicating with the dead' (Ernst 2011, 250). Video games then, as with other software, are not only artefacts (and sites), but are also sources of preservation. When we play the games, the games are in-the-moment and active, ignorant that any real-world time has passed, performing just as they were programmed to perform. Games – at least in 2017 – remain unaware of themselves, dumb output from smart people, not unlike any other artefact, or as Ian Hodder calls them, 'things'. This is seen throughout Hodder's 2012 book Entangled: An Archaeology of the Relationships between Humans and Things.

Video games are things. They are often created out of a suite of needs that include a desire to be entertained, to be challenged, to make money. Compare this to pots, which are also things but are created out of a different, more utilitarian, suite of human needs. In Goldberg and Larrson's introduction to State of Play, they note that games have traditionally been engaged with and discussed as products of technology rather than products of culture (Goldberg and Larrson 2015, 8). The road to the serious study of video games as well as their scrutiny as forms of entertainment has sometimes come from outside gaming culture, both that of developers and players (Goldberg and Larrson 2015, 12). It can also come from within, as in the case of this article, authored by a computer scientist and an archaeologist, both lifelong players. Goldberg and Larrson (2015, 13) see contemporary games as transcending their perceived definition of artefacts of technology into something more. This assessment supports archaeogaming's premise that games cannot be disentangled from the context and culture that had a hand in creating them, and that games as both sites and artefacts contain far more than whatever manifests on-screen. 'Like films and books, video games are cultural texts. They say something about the society in which they were made' (Knoblauch 2015, 187).

Also like films, ebooks, and other digital media, video games - specifically computer games - are saddled with digital rights management (DRM), aka copy protection. Because most media is monetised, creators and publishers have attempted, and continue to attempt, to ensure that all copies of that media are purchased. Many contemporary games are now available as download-only or stream-to-play, tied to a player's account, which includes payment details. Retrogame developers, without the luxury of always-on Internet access, had to get creative in how they applied DRM which, as will be shown below, could combine both physical and digital elements. The full scope of DRM, even in retrogames, is vast: a thorough examination of copy protection for even a single retrocomputing platform can occupy its own journal-length paper (Ferrie 2016).

Taken archaeologically, copy protection of media is nothing new. One can compare retrogame DRM to Mesopotamian cuneiform-inscribed bullae, clay envelopes containing secure documents inside. One must break the seal or envelope to get to the content within, a destructive process leading to a creative one. The copy protection is part of the media itself, a characteristic of the game-artefact. It gives additional context to the game, and to the wider media culture invested in protecting creative content for economic reasons. Solving a DRM mystery unlocks treasures within the game, following on a common archaeological trope of excavation, of opening a sealed tomb. But it also speaks to something more fundamental in the opening of any container holding something within that is to be engaged: a bottle of wine, a letter in an envelope. For the archaeogamer, the protective mechanism, its purpose and underlying engineering, is as important as that which it is designed to protect. The DRM is the bottle's cork, the envelope's gum.

3. Copy Protection in Jet Set Willy: Acquisition and Identification

The first problem in retrogame archaeology is one of acquisition. In our running example, Jet Set Willy's copy protection comprises two pieces to acquire: a physical artefact and a digital artefact. To explain why both are necessary, an explanation of how Jet Set Willy starts is helpful.

Figure 2: Jet Set Willy copy protection screen. Used under fair dealing and fair use for research and commentary

Upon running Jet Set Willy, even before seeing the game's title screen, the user is presented with a challenge from the program (Figure 2): 'Enter Code at grid location' followed by a letter and number that represent coordinates in a 2D grid. The user must look up those coordinates on a physical artefact (Figure 3) and enter the colours from the named grid square, correctly and in sequence, to proceed and play the game. As Figure 2 shows, the colours are mapped into numbers that can be entered on the keyboard. If incorrect colours are entered twice, the computer is restarted and the game must be reloaded from cassette tape to try again, a process that takes approximately three minutes (ignoring any tape rewinding time).

Figure 3: Jet Set Willy physical artefacts for the ZX Spectrum, including copy protection card (left). The card's dimensions (mm) are 61x98 (WxH), allowing it to easily fit inside a standard cassette tape case (centre)

The principle under which this copy protection operates is known in the area of computer security - specifically, user authentication - as authentication by 'what you have'. The assumption was that the colourful physical artefact would have been too difficult to duplicate using the photocopier technology at the time, and too large to easily write out the codes by hand. While a copy of the tape could be made, it would be rendered useless without a copy of the physical artefact, in theory. Furthermore, it was efficient from the distribution point of view, in that the physical card fitted neatly inside the cassette tape case. This was not the only retrogame whose copy protection relied on physical artefacts and what-you-have authentication (Aycock 2016); more narrowly, other anti-photocopying measures went so far as to use technology that still defies photocopiers today (Figure 4), although not scanners or cameras.

Figure 4: No Copi paper used for copy protection in Jack Nicklaus' Unlimited Golf & Course Design (1990) for DOS. The blank rectangle is a modern photocopy attempt

In this case, we were fortunate in that the physical artefact proved relatively easy to acquire at modest cost from eBay, although this would not be the case for more obscure games. The paper-based copy protection card is still easy to read and work with. Had the copy protection involved electronics or moving parts, using the physical artefact could have been much more challenging. Plastic from the retrogame era is already brittle; the Lenslok copy protection system (Aycock 2016) required use of a plastic lens, for example, and bending one into the required configuration risks irreparable damage now. Electronic components can degrade over time, too, like the failing capacitors well known to retrocomputing enthusiasts.

Part of the physical artefact, the cassette tape, also incorporates the digital artefact. However, the tape may be too brittle to read, we may not have the equipment to do it, or the magnetic media may have degraded to the point where it cannot be read without errors. We have chosen to sidestep these issues and work with a digital image for Jet Set Willy found on the Internet. (By 'image' we mean the binary game software image, as opposed to a picture of the game.)

The immediate question this raises is one of provenance. How can we be sure that a game image found on the Internet, possibly on a dodgy-looking website, is in fact a true unmodified image of the game we wish to study? Here, our work is able to reconstruct the contents of the copy protection card from the game image we used, proving the link between the two, but more general issues exist. First, there may be no one true image. Software versioning practices were not always followed, and duplication could be done in an ad hoc fashion, literally by a person copying cassettes or floppy disks. It is entirely possible for multiple legitimate versions of the same software to exist, even with no visible distinction on the physical label or in the game. Second, there may have been errors when reading from the original media, or even legitimate variations when reading, the latter being used for some forms of copy protection. Third, cracked copies of games, which in some cases may be the only available extant version, may have had changes introduced. An obvious change is the removal of copy protection (making it a difficult technology to study), but cracked versions would also introduce 'crack screens' that gave crackers credit (Apple II Crack Screens n.d.), remove introductory game sequences, and add cheat keys. The changes that might have been made by these third parties are unknown. (There is one exceptional case of note: '4am' is a prolific modern cracker of retro Apple II software. He/she thoroughly documents what was done to crack a particular piece of software and makes those notes available with the cracked software at the Internet Archive (4am 2016). This, along with his/her predilection for 'clean' cracks without unnecessary modifications, makes it possible to work more confidently with those cracked images.)

Not only may there not be one true game image, but there may be no single location for a game image. Game images are replicated in numerous places on the Internet, and the questionable nature of some websites hosting game images means that these sites may become unreachable and move around. It is not generally useful to supply a link to a game image, in other words. Yet to ensure reproducibility of retrogame archaeology, we must ensure that a game image under study is uniquely identified, and that a later researcher can determine with precision if a game image is the same as that used for earlier work.

Altice (2015, 340) suggests using ROM size as identifying bibliographic information, but this is an incredibly weak identifier, because it is possible that two different images for the same game have the same size yet contain substantial differences. Instead we recommend use of a cryptographic hash, also known as a cryptographic checksum. On average, a one-bit change in the game image would change half the bits in the cryptographic checksum's output (Schneier 1996, 30), meaning that even slight changes to the game image would be reflected in the checksum. While no longer useful for cryptographic purposes, the MD5 checksum algorithm (or stronger) would be suitable for this purpose. For example, the MD5 checksum of the Jet Set Willy image we have used is 4e5ed538eb9f56598faff8290644c9d7 and, knowing this, a future researcher happening upon a Jet Set Willy image could compute the MD5 checksum of their image to see if their checksum matched ours. If the checksums match, then they would know with a high degree of certainty that they were looking at the same image as we were.
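As a minimal sketch of the comparison just described, the following Python computes the MD5 checksum of a game image and checks it against a published value. The function names and the file path in the usage note are hypothetical, chosen only for illustration; the only value taken from our work is the checksum quoted above. The last function illustrates, rather than proves, the property that a small change in the input alters a large share of the output bits.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 checksum of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as image:
        # Read in chunks so that large images need not fit in memory at once.
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """True if the file's MD5 checksum equals a published checksum."""
    return md5_of_file(path) == published_hex.strip().lower()


def differing_bits(data_a, data_b):
    """Count the bits that differ between the MD5 checksums of two inputs."""
    a = int.from_bytes(hashlib.md5(data_a).digest(), "big")
    b = int.from_bytes(hashlib.md5(data_b).digest(), "big")
    return bin(a ^ b).count("1")
```

A later researcher might call, for example, matches_published("jsw.tap", "4e5ed538eb9f56598faff8290644c9d7"), where "jsw.tap" is a placeholder for wherever their own copy of the image is stored. Swapping hashlib.md5 for hashlib.sha256 would give a stronger checksum with no other change to the code.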

Related to finding game images is the question of copyright. While popular notions of 'abandonware' exist, in practice there is no basis for this in law. Further complicating the matter is that, for retrogames, a rightsholder may be deceased or simply untraceable if a person; a corporate rightsholder may have long since ceased to exist, or have had its intellectual property acquired through a long chain of acquisitions. There may easily be no one to ask for permission to use a game image.

Fortunately, copyright law does allow some exemptions for research. In the United States, this falls under fair use (United States Code, 'Limitations on exclusive rights: Fair use' 17 USC 107), and in Canada there is a slightly more restrictive notion of fair dealing (Copyright Act 1985, s. 29). Neither would exempt a researcher from the obligation to purchase a commercially available version of a game if one exists, and there are sites that specialise in selling retrogames. However, there are caveats: having a retrogame available does not mean that it is in its original form, nor does it mean that the game is in a form suitable for study. A re-released version may have had modifications, such as Nintendo's 2010 Donkey Kong re-release adding content (Altice 2015, 76-7), or the game images may be effectively inaccessible, as with the re-released Atari and Activision games on the iPad. Legal concerns have ramifications for the feasibility of open data in retrogame archaeology, as a research exemption would be unlikely to apply to making a game image generally available to others, and even linking to a game image already available on the Internet may fall foul of copyright law in the form of 'contributory copyright infringement' (Digital Media Law Project, n.d.; Electronic Frontier Foundation, n.d.). The safest path appears to be use of the cryptographic checksum mentioned above, leaving future researchers to locate their own copies of the image.

An image's cryptographic checksum may be sufficient to identify it, but more bibliographic data is useful. Altice devotes an entire book appendix to the topic of game bibliographies (Altice 2015, appendix A), and this is an indication that there are some nuances. Moreover, academic fields that study games, such as game studies and platform studies, are young enough that no standard has emerged yet. The game title is obviously important, and the format of the game image is helpful to narrow down the range of possible images: for example, we used a Jet Set Willy image in .TAP format.

The computing platform is a very important piece of data, and it is often used to organise game images on websites. The platform can have a dramatic impact, because the processor (hence the game code) may be entirely different, for starters. Even if the processor is the same across platforms, the internal details vary enough between different computers and consoles that the code will again be different. There may be changes in game releases from one platform to another that will manifest themselves in the physical artefacts: different media, and even different copy protection. Figure 5 shows how the copy protection card changed for the MSX platform's version of Jet Set Willy.

Figure 5: Jet Set Willy physical artefacts for the MSX platform. The copy protection card (bottom) folds out to 315x99 (WxH, in mm) and is printed on both sides, but folds up to fit inside a cassette tape case

Other data that would be standard in bibliographies for other media have more limited usefulness for identifying game images. Authorship will not always be known, for example, and dating is problematic. A game may have been released in different places at different times, and those years may differ again from the copyright year shown in the game itself. A game publisher's name can be helpful, however, since different versions of the same game may have been released by different publishers for the same platform, or the same game may have been re-published or re-bundled. Remember that software versioning as software developers would now understand it was not always performed, and there may well be no extant or reliable version or build numbers.
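One way to picture the identifying data discussed in this section is as a small record. The sketch below is only an illustration and not a proposed standard: the class name GameImageRecord and its field names are our own hypothetical choices, and the publisher field is optional and left unset here to reflect that such data may be missing or unreliable. The values shown are those given in the text for the image we used.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GameImageRecord:
    """Hypothetical bibliographic record for a game image under study."""
    title: str
    platform: str
    image_format: str
    md5: str
    publisher: Optional[str] = None  # may be unknown or unreliable

    def same_image(self, other_md5):
        """True if another image's MD5 checksum matches this record."""
        return self.md5.lower() == other_md5.strip().lower()


# The image used in this article, as identified in the text.
record = GameImageRecord(
    title="Jet Set Willy",
    platform="ZX Spectrum",
    image_format=".TAP",
    md5="4e5ed538eb9f56598faff8290644c9d7",
)
```

Only the checksum takes part in the comparison; the remaining fields serve to narrow a search for candidate images rather than to identify one.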

4. Copy Protection in Jet Set Willy: Methodology

From the myriad problems of acquisition and identification, we finally arrive at the point of studying the game. How did programmers implement the copy protection in Jet Set Willy?

4.1 'Traditional' research

A traditional research approach would ordinarily rely heavily on published scholarly work. It is safe to say, however, that Jet Set Willy does not figure prominently in articles in Science and Nature. Alternative sources must be considered.

Retrogame archaeology has the appealing property that many - but not all - retrogame authors are still alive. But experience has shown us that not all may be reachable, for starters. Also, while we know that Matthew Smith wrote Jet Set Willy, we do not know who created the copy protection; it would often be added on after the game was written, by often-anonymous programmers. We further make the counterintuitive assertion that the retrogame author cannot be considered a primary source for the inner workings of their code, not after so much time has passed. While they may supply helpful contextual information and may give useful clues that can be followed up, the game code itself is the primary source: it must be analysed directly by a researcher, and anything not present in it is anecdotal only. Consider this an exploration into computer game epigraphy, the study and interpretation of ancient inscriptions (our inscriptions being code).

We may also ask the Internet. There are many retrogame enthusiasts who have devoted inordinate amounts of time and effort to studying their favourite game, and this is work that can and should be leveraged (with appropriate credit, of course). For some games, the game code has been disassembled (i.e., a human-readable version has been reconstructed from its binary form) and annotated by these enthusiasts. Indeed, we came across one of these for Jet Set Willy (Harston 2004), albeit after we had already done our own analysis. We used this as an independent source to double-check our analysis in this case, but even if we had discovered it earlier, it could at best have acted as a guide. Again, a third-party disassembly of this form is not a primary source; it must be verified in the original game code by the researcher.

Wikipedia's 'Copy Protection' article of 14 August 2016 (always a bastion of veracity) actually mentions Jet Set Willy and its copy protection card, along with a very specific, uncited claim: 'The codes in tables are based on a mathematic formula and can be calculated by using the row, line and page number if the formula is known, since the data would have required too much disk space.' An interesting assertion, and one that we can test through our analysis below.

Traditional research may also draw on what game studies researchers call 'paratext': writing about the game that is not included with the game, such as game reviews and letters to the editor. (The term is drawn from Genette (1997), who would more precisely categorise this material as 'epitext', a finer distinction that is not often made in game studies.) We can find some evidence of this in computer magazines contemporary with Jet Set Willy related to its copy protection: one letter to the editor even published a workaround to the copy protection (Sanderson 1984a, with follow-up in Sanderson 1984b). Sanderson was not alone in figuring this out; one columnist states (Kendle 1984) 'I also get [mailed] a lot of routines that overcome the colour-code loading sequence but with the current sensitivity about piracy I don't intend to encourage people in print to overcome this protection device. For those who have genuinly [sic] lost their colour card or are colour blind first write to Software Projects and if they don't offer any help then perhaps we'll think again'. One reviewer calls the copy protection a 'hare-brained system' (Burton 1984), and another article mocks the copy protection and players alike (Pennell 1984): 'I know of one player who typed the whole chart into his wordprocessor [sic], and another who dutifully duplicated it all with felt tips. This latter soul gave me the biggest laugh - the fact is just a single POKE disables the entire coding mechanism!' Taken together, this suggests that the popular attitude to copy protection at the time, particularly schemes that imposed some form of cost on legitimate users, could be less than favourable.

A final lead from traditional research is a marking on the copy protection card: 'PATENT PENDING'. It was not clear that the patent was ever granted, however, and initial efforts to locate the patent document or its application failed. It was only through a later tip (C. Cannon, pers. email, 26 May 2016) that we were able to find the patent, which was indeed granted. We should stress that the existence of this additional patent documentation is atypical for games, to the extent that we hesitate to mention it here as 'traditional' lest it be misleading. Instead, we return to the patent and what it revealed in the Analysis and Discussion section.

4.2 Code analysis

A game's code may be analysed in two ways, from a high level. There is static analysis, which is looking at the code without running it, and dynamic analysis which, by contrast, examines the code as it runs. Barring extreme circumstances that preclude use of either static or dynamic analysis, our experience is that an analysis will often bounce back and forth between the two techniques. We give an example below for Jet Set Willy.

What is being analysed? Often we are not fortunate enough to have access to a game's original source code, assuming it still exists. Here we rely on tools to convert the binary code that the machine sees into more human-readable forms: a disassembler was already mentioned, and it is the commonest tool, but for some programming languages it is possible to decompile them into a higher level, more readable form. It is important to note that 'human-readable' is a relative measure. Any comments from the original source code, any meaningful variable names, will not be reconstituted, and the researcher is left with a more readable but not immediately enlightening puzzle to decipher. The complexity of analysis will be increased in cases where deliberate code obfuscations have been performed to make copying the game harder, or where the retrogame programmer has performed optimisations to make the code faster or smaller, which can result in the code being less straightforward to understand. We see this in more traditional artefacts such as ancient Roman coins, which contain complex monograms whose design can represent an entire word (or more) with a few clever strokes.

Where is the code analysed? Perhaps surprisingly, counter to the trend of experiencing retrogames on their original platforms, the real retro hardware that the retrogame originally ran on tends to be quite useless for analysis, even ignoring the increasing difficulties getting old hardware and media working. These machines often had limited facilities for performing analysis tasks, and even these would be disabled if possible by retrogames to prevent unauthorised copying. It is far better to analyse the code in one of the many excellent emulators that are now available. Getting emulation of a machine correct is extremely difficult, and it is prudent to rely on existing emulators rather than consider creating one from scratch. We agree with Altice's suggestion (Altice 2015, 341) that it is useful to list the emulators used and their version numbers for reproducibility reasons: we used MESS 0.151 (32-bit) on OS X 10.9.5, and Fuse 1.1.1 on OS X 10.9.5 and Linux Mint 17 for our analysis here.

Emulators give the researcher an essentially omnipotent view of the retrogame, from the outside of the emulated machine, and the machine can be stopped and probed as necessary without disturbing its internal state. It is particularly important that the emulator(s) used have debugging facilities to permit this probing; not all do. In extreme cases we have had to modify an emulator's code to add necessary instrumentation, but this is rare, and was certainly not required for Jet Set Willy. Emulator debuggers vary in their support for the retrogame archaeology task, and while most, if not all, incorporate a built-in disassembler, for example, not all have the ability to record a trace of the running code to assist dynamic analysis. Not all emulators support all game image formats either. This is the reason that we used different emulators for Jet Set Willy analysis, because we needed to switch back and forth to get different debugger and emulator functionality.

Where is the code to analyse? The memory of old computers is comparatively small, but it still can be a gruelling task to grind through it all looking for legitimate code: data can masquerade as legitimate code in a disassembly, and disassemblers can be misled about code and show false disassemblies, a technique that was used on purpose to deter software copying. It is more efficacious if the possible location of code can be narrowed down.

Let us return to Jet Set Willy's copy protection to make this discussion more concrete. One favourable factor is that the copy protection challenge was displayed to the user at the very beginning of the game, so it was not as difficult to find as if it were buried partway through the game's execution. However, the game was not the only code that the Spectrum would be running: there would be code run, at least initially, out of read-only memory (ROM) to load the game from cassette, and this loading may occur in multiple stages. This means that we would need to determine the point at which control was initially transferred to the game. Not an impossible task, but we could do better in this case.

We used static analysis to search the memory for the string 'Enter Code at grid location' that is displayed to the user. Reasoning that the copy protection code, or code called directly from it, would be accessing this memory near the beginning of the process, we used the Fuse debugger to set a 'breakpoint' at that spot, causing any access to it to stop the running game. Rerunning the game (i.e., dynamic analysis), the game reached the breakpoint and we were able to see where in memory the code responsible was located. Once stopped in the debugger, the code disassembly can be seen for static analysis, and we could also run the code from that point slowly, instruction by instruction, for dynamic analysis to garner further information. The emulated computer screen is still visible when the debugger is active, and so the execution of instructions can be correlated with any in-game visual effect.
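The static search step can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a record of the tooling we used: it assumes that a raw dump of the emulated machine's memory is available as a sequence of bytes in which offset and address coincide, and that the message is stored as plain ASCII text. The function name find_all is hypothetical.

```python
def find_all(memory, text, encoding="ascii"):
    """Return every offset in `memory` at which `text` occurs.

    `memory` is assumed to be a raw dump (bytes or bytearray) in which
    the offset of a byte equals its address in the emulated machine.
    """
    needle = text.encode(encoding)
    hits = []
    start = memory.find(needle)
    while start != -1:
        hits.append(start)
        # Resume one byte later so overlapping matches are not missed.
        start = memory.find(needle, start + 1)
    return hits
```

Each address returned by find_all(dump, "Enter Code at grid location") is a candidate location for a breakpoint on memory access in the emulator's debugger. If the search finds nothing, the assumptions should be revisited: a game may store its text in another encoding, or build it at run time.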

Figure 6: Partial instruction execution trace from Jet Set Willy. As a record of the actual instructions executed, some are repeated (ldir at $86f1) and others are skipped (one instruction is omitted after the jr)
86F1: ldir
86F1: ldir

   (loops for 125 instructions)

86F3: ld   a,($5C78)
86F6: add  a,$25
86F8: ld   ($5C78),a
86FB: cp   $B3
86FD: jr   c,$8701
8701: ld   l,a
8702: ld   h,$9E
8704: ld   a,(hl)
8705: add  a,l
8706: ld   ($85E4),a
8709: ld   c,l
870A: ld   e,$2F

Eventually a likely range of start and end memory locations was narrowed down through this process. To capture the running code from start to end for dynamic analysis, a recording of the code execution (a 'trace') is helpful, but was unsupported by Fuse. Code traces can be very lengthy, so limiting them to a constrained area of code execution is beneficial. We shifted temporarily to MESS, whose debugger did support that functionality, to gather this trace. This also required a change in game image format, because MESS had difficulty loading Jet Set Willy from the digital tape image that Fuse could use. Through the recorded trace, we now knew exactly which instructions were running, and in what order. The trace shows memory addresses and disassembled instructions, with possible duplications and omissions as code jumps and loops; an excerpt from that trace is shown in Figure 6. From this point, with technical knowledge of the Spectrum and the assembly language its Z80 processor understood, we were able to locate and understand the copy protection code's algorithm by reading the trace, gradually filling in our observations until the full process had been decoded. Figure 7 shows an excerpt from our notes that illustrates the level of analysis involved.
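A trace in this form lends itself to simple mechanical processing before it is read by hand. The following minimal sketch assumes each trace line looks like the excerpt in Figure 6 (a four-digit hexadecimal address, a colon, then the disassembled instruction); the raw output of a particular debugger may differ, and parse_trace and repeated_addresses are hypothetical helper names. It reports addresses that were executed more than once, which is one quick way to spot loops.

```python
import re
from collections import Counter

# One trace line: four hex digits, a colon, then the instruction text.
TRACE_LINE = re.compile(r"^([0-9A-Fa-f]{4}):\s+(.*\S)\s*$")


def parse_trace(text: str) -> list[tuple[int, str]]:
    """Turn trace text into (address, instruction) pairs, skipping other lines."""
    entries = []
    for line in text.splitlines():
        match = TRACE_LINE.match(line)
        if match:
            entries.append((int(match.group(1), 16), match.group(2)))
    return entries


def repeated_addresses(entries: list[tuple[int, str]]) -> dict[int, int]:
    """Map each address executed more than once to its execution count."""
    counts = Counter(address for address, _ in entries)
    return {address: n for address, n in counts.items() if n > 1}


SAMPLE = """\
86F1: ldir
86F1: ldir
86F3: ld   a,($5C78)
86F6: add  a,$25
86F8: ld   ($5C78),a
86FB: cp   $B3
86FD: jr   c,$8701
8701: ld   l,a
"""

entries = parse_trace(SAMPLE)
for address, count in repeated_addresses(entries).items():
    print(f"${address:04X} executed {count} times")
```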

Figure 7: Partial annotated execution trace from Jet Set Willy, showing analysis
86F3: ld   a,($5C78)                    $5c78 is frame counter, incr every 1/50s
86F6: add  a,$25
86F8: ld   ($5C78),a    XXX does game take over interrupt, thus this adjustment
                                in case a new random value is req'd? setting
                                bkpt at $86f3 and write bkpt on $5c78, that
                                seems to be exactly what happens - no other
                                writes to $5c78, and $86f3 hit again for new
                                code (reboots after 2nd time wrong)
86FB: cp   $B3                          hmm, $12 * 10 = 180, and $b3 is 179
86FD: jr   c,$8701
                        ( skipped instr subtracts $b4 from A )
                        XXX seems to be a bug if A == $b3 above, b/c
                        XXX A has $b4 subtracted from it anyway, yielding
                        XXX $ff, which becomes the LUT addr, and the "number"
                        XXX rendered is ">" - counter value of $8e should
                        XXX cause this
                        according to um0080.pdf Z80 ref, "CP s" computes A-s,
                        and carry set if borrow occurred, which would mean
                        that A's value was less than $b3
                        confirmed in emulator by setting $5c78 to $8e before
                        first read - challenges player for location "D>"
8701: ld   l,a                          random value becomes low addr byte
8702: ld   h,$9E                        hello, lookup table at $9exx...
8704: ld   a,(hl)                       load byte from LUT
8705: add  a,l                          hmm, maybe this is light obfus on
                                                the LUT values
                        XXX note that this would make diff values even if
                                the LUT data were the same, so you could
                                potentially overlap this with, say, level data
8706: ld   ($85E4),a                    stash random + LUT byte in $85e4
8709: ld   c,l                          random value < 180 is 1D coord
870A: ld   e,$2F                        #'0 minus one

4.3 Testing hypotheses

The emulator's debugger is not limited to viewing only. Changes to the emulated memory and machine state can be made in order to test specific hypotheses about the code during its analysis, or to verify understanding about what the code is doing. As an example, our analysis of the execution trace suggested that there might be a bug in the copy protection code, an 'off-by-one' error well known to programmers. We verified this by using the debugger to stop the game at a specific spot in the copy protection algorithm where it chose the grid location to challenge the user with. We changed the grid value that the game actually chose in that case to the one we thought would trigger the bug, and were able to verify that the bug did in fact exist, asking the user for the nonsensical grid location 'D>'. Later, as a prelude to the reconstruction below, we stopped the game numerous times at that same memory location, to verify that we could accurately predict what the displayed grid coordinates and corresponding copy protection card codes would be.
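The off-by-one hypothesis can also be checked on paper with a small model of the traced instructions. The sketch below is our reading of the instructions at $86F3 to $8701 in Figure 7, not the game's code itself: choose_index is a hypothetical name, and the instruction skipped in the trace is modelled as a subtraction of $B4, as recorded in the notes.

```python
def choose_index(counter: int) -> tuple[int, int]:
    """Model the traced code: return (value written back to $5C78, table index)."""
    a = (counter + 0x25) & 0xFF      # add a,$25 with 8-bit wraparound
    new_counter = a                  # ld ($5C78),a
    if not a < 0xB3:                 # cp $B3 / jr c,$8701 jumps only when A < $B3
        a = (a - 0xB4) & 0xFF        # skipped instruction: subtract $B4
    return new_counter, a


# Which starting counter values give an index outside the valid range 0..179?
bad = [c for c in range(256) if choose_index(c)[1] >= 180]
print([hex(c) for c in bad])         # ['0x8e']
print(hex(choose_index(0x8E)[1]))    # 0xff
```

Under this model, exactly one of the 256 possible counter values, $8E, produces an out-of-range index, which agrees with the value used in the emulator test. A comparison against $B4 rather than $B3 would have avoided the problem.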

4.4 Experimental reconstruction

Finally, based on the code analysis, we attempted an experimental reconstruction of the copy protection card. As with any reconstruction, success would indicate that we have a working understanding of the technique that the original programmers used in their code.

Figure 8: Textual reconstruction excerpt of Jet Set Willy copy protection
A 0 = 1 3 3 4
A 1 = 1 3 4 1
A 2 = 4 3 2 1
A 3 = 2 2 3 1
A 4 = 2 1 4 2
A 5 = 3 3 2 1
A 6 = 3 2 2 1
A 7 = 3 1 2 1
A 8 = 4 2 1 1
A 9 = 3 3 1 1
B 0 = 2 1 4 1
B 1 = 4 1 3 3
B 2 = 3 2 4 4
B 3 = 4 2 3 2
B 4 = 2 3 3 1

Our first reconstruction was implemented as a Python script, with our code using 256 bytes of data captured from the game's memory image at memory address $9e00. The partial output of that program is shown in Figure 8. The important thing with the reconstruction is that we can match up our output with the copy protection card's contents, even if it does not look exactly the same. The source code for this has been made available on Github.

While it was not strictly necessary, we also created a modified Python script that rendered the reconstruction graphically, using Python's turtle graphics module (Figure 9). This version of the reconstruction is much easier to compare with the original physical artefact, of course.

Figure 9: Graphical reconstruction of Jet Set Willy copy protection

4.5 Other methods

There are five additional retrogame archaeology research techniques that were not brought to bear on Jet Set Willy.

5. Analysis and Discussion

From a high level point of view, the copy protection algorithm from Jet Set Willy chooses a random number between 0 and 179 (inclusive); each of these 180 values corresponds to a unique row/column location on the copy protection card, which has ten columns (0-9) and 18 rows ('A'-'R'). The copy protection code computes these 2D grid coordinates for displaying to the user. There is a table starting at memory location $9e00 - really, a series of consecutive bytes - and the value of each byte contains the correct answer for that particular set of coordinates. For example, 'C 3' corresponds to index 56 in the table: the '0', '1', and '2' columns are the first 18*3=54 bytes, plus 2 to reach the correct row number ('A'=0, 'B'=1, 'C'=2). In computer science, this is referred to as column-major order (Aho et al. 2007, 382).
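The column-major mapping between grid coordinates and table indices can be written out as a short sketch. The function names are hypothetical, and the way the column character is formed (an offset from the character '0') is our simplified model of the display code rather than a transcription of it.

```python
ROWS = "ABCDEFGHIJKLMNOPQR"          # 18 rows, 'A' to 'R'
COLUMNS = 10                         # columns 0 to 9


def grid_to_index(row: str, column: int) -> int:
    """Column-major order: each column occupies 18 consecutive table entries."""
    return column * len(ROWS) + ROWS.index(row)


def index_to_grid(index: int) -> tuple[str, int]:
    """Inverse mapping from a table index to (row letter, column number)."""
    column, row = divmod(index, len(ROWS))
    return ROWS[row], column


def label(index: int) -> str:
    """Text shown to the player, with the column drawn as an offset from '0'."""
    row, column = index_to_grid(index)
    return row + chr(ord("0") + column)


print(grid_to_index("C", 3))         # 56
print(index_to_grid(179))            # ('R', 9)
print(label(255))                    # D>
```

The last line is a useful cross-check: an out-of-range index of 255 falls in row 'D' of a non-existent column 14, and the character fourteen places after '0' is '>', matching the nonsensical 'D>' challenge described in section 4.3.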

Two additional features are of note. First, 37 is added to the random value after it is read, in case the user enters the wrong code. When that happens, the user is given a different code to enter, and the addition of 37 assures that a new 'random' number is available. Second, the byte value extracted from the in-memory table has the index added to it, modulo 256. Returning to the previous example, the byte value at index 56 is 64, but the correct code is the value 56+64=120. Each of the four pairs of bits in the resulting byte value are interpreted as colours/numbers, as shown in Figure 10.
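The decoding step can be sketched as follows. Two things here are assumptions rather than facts taken from the game: the bit pairs are read most significant pair first, and pair values 0 to 3 are mapped to the numbers 1 to 4. The demonstration table is also hypothetical, holding only the single byte value quoted in the text.

```python
def combined_value(table: bytes, index: int) -> int:
    """Add the index to the table byte, modulo 256."""
    return (table[index] + index) % 256


def code_for(table: bytes, index: int) -> list[int]:
    """Split the combined byte into four 2-bit fields, each mapped to 1..4.

    Assumption: most significant pair first; 0b00 -> 1, ..., 0b11 -> 4.
    """
    value = combined_value(table, index)
    return [((value >> shift) & 0b11) + 1 for shift in (6, 4, 2, 0)]


# Hypothetical table: all zero except the one byte quoted in the text.
demo = bytearray(256)
demo[56] = 64

print(combined_value(bytes(demo), 56))   # 120
print(code_for(bytes(demo), 56))         # [2, 4, 3, 1] under the assumed bit layout
```

If the actual interpretation of the bit pairs differs from the one assumed here, only the list comprehension in code_for needs to change.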

Figure 10: Interpretation of bits in a byte from Jet Set Willy copy protection table

The presence of the table shows that the Wikipedia claim, that the copy protection values are calculated rather than stored because of space constraints, is wrong. Nor were the values calculated before the point at which we stopped the game for our code analysis; we verified that that memory was only loaded with those values once, and that was while the tape was loading.

Could the values have been calculated? Yes. And, in fact, they could have been calculated very well, so that they were all unique values and would appear random. The game Pitfall!, for instance, used a maximal-length linear feedback shift register to produce random numbers with this property (Aycock 2016, 127). However, what is interesting is that the Jet Set Willy values are far from unique: only 83 of 180 values are unique in memory before adding the index, and even adding the index only yields 125/180 unique values. The result is duplicated codes in the copy protection card. 'A 2' and 'D 5' have the same 4-3-2-1 code, for instance.
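To illustrate what a maximal-length linear feedback shift register provides, here is a minimal sketch of a generic 8-bit Galois LFSR. It is not the register used in Pitfall!; the tap mask $B8 (the polynomial x^8 + x^6 + x^5 + x^4 + 1) is simply a commonly tabulated maximal-length choice, and the function names are hypothetical.

```python
TAPS = 0xB8   # x^8 + x^6 + x^5 + x^4 + 1, a standard maximal-length polynomial


def lfsr_step(state: int) -> int:
    """Advance an 8-bit Galois LFSR by one step."""
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= TAPS
    return state


def lfsr_sequence(seed: int = 1) -> list[int]:
    """Collect states until the register returns to its seed."""
    states = [seed]
    state = lfsr_step(seed)
    while state != seed:
        states.append(state)
        state = lfsr_step(state)
    return states


states = lfsr_sequence()
print(len(states), len(set(states)))   # 255 255
```

Every non-zero 8-bit value appears exactly once before the sequence repeats, so 180 consecutive outputs would have supplied 180 distinct codes without storing a table.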

Once the restriction on having unique copy protection codes is removed, it is clear that Jet Set Willy's copy protection scheme could have done away with any space concerns entirely. The table could have been placed anywhere in the computer's memory. Had it been positioned atop the game code, for instance, the table would have had random-ish bytes by virtue of the bytes comprising the game code instructions. Even if the table had been located in a memory region with a long series of the exact same byte values, adding the index would ensure that the codes displayed to the user were different regardless (although a user knowledgeable about counting in binary may have discerned the resulting pattern).
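The effect of adding the index is easy to demonstrate. The sketch below uses hypothetical helper names and a made-up memory region filled with a single byte value; it counts unique values before and after the index is added, the same kind of count reported above for the real table.

```python
def displayed_codes(table: bytes, entries: int = 180) -> list[int]:
    """Codes after the index has been added to each table byte, modulo 256."""
    return [(table[i] + i) % 256 for i in range(entries)]


def unique_counts(table: bytes, entries: int = 180) -> tuple[int, int]:
    """Return (unique raw bytes, unique codes after adding the index)."""
    raw = len(set(table[:entries]))
    adjusted = len(set(displayed_codes(table, entries)))
    return raw, adjusted


# Hypothetical memory region containing the same byte value throughout.
constant_region = bytes([0x00]) * 256

print(unique_counts(constant_region))        # (1, 180)
print(displayed_codes(constant_region)[:4])  # [0, 1, 2, 3]
```

Because 180 is less than 256, adding the index to a constant can never wrap around onto an earlier value, so all 180 codes differ. The second line of output also shows the pattern a knowledgeable user might spot: the codes simply count upwards.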

Normally it would be impossible, looking at these different implementation options, to know what was envisaged by the copy protection designer. In this specific case we are fortunate to have additional information, though. The patent for this technique (Maton 1987) follows the copy protection code as implemented in Jet Set Willy very closely, and makes it clear that a table lookup was always part of the scheme. The overlap of the copy protection table with other code or data is less clear: one passage talks about the game code, the copy protection table, and the copy protection code, saying that 'It will be appreciated that [...] whether one is within or partly within another or others, is a matter of choice and corresponding programming' (Maton 1987, 6). Something proposed for the physical copy protection artefact in the patent that did not manifest itself was that it 'could be placed on a 3-dimensional article so as to increase difficulty of any photographic copying, whether on a cube or sphere, or even on sides of the cassette case' (Maton 1987, 11). The patent states, with no apparent sense of irony, that traditional copy protection methods are 'exceptionally irritating to the bona fide user' (Maton 1987, 2).

We were able to contact Chris Cannon, who worked at Jet Set Willy's publisher Software Projects, for fact checking (C. Cannon 2016, emails 26 May and 31 May). He informed us that the copy protection was designed by Alan Maton (the patent holder) and was implemented by Matthew Smith (the game author). His recollection is that the copy protection code 'was thrown together in about ½ a day, more as an afterthought' and that since Matthew had already completed programming the game, 'He wasn't particularly interested in the idea'. Later versions of the copy protection were 'more robust'.

Not surprisingly, the copy protection caused Software Projects to receive complaints, although this may be interpreted more as a study on copy protection circumvention. Everyone who claimed they were colour blind (a potential problem mentioned in the patent) or whose game cassette had not come with the card was offered a full refund if the game was returned with proof of purchase. As Chris informed us, 'NONE of the complainants EVER returned the game.' On the one hand, this is a humorous anecdote; on the other, it documents attempts to circumvent copy protection by social means. In the computer security community, this is referred to as 'social engineering' (Hadnagy 2011).

Jet Set Willy, despite its age, is an interesting case study in the broader scope of DRM. Its use of a tangible physical artefact that inconveniences legitimate and illegitimate users alike underscores the fact that DRM - both old and new - involves tradeoffs between usability and protection. And, ultimately, DRM is an exercise in impeding copying rather than affording absolute security.

6. Conclusion

One of the issues the young field of archaeogaming has struggled with is getting beyond the purely theoretical phase: engaging in practical analysis of video game mechanics and design, creating a methodology in support of clearly defined research questions, and then following those methods to arrive at concrete conclusions. We hope that this tour of deconstructing Jet Set Willy's DRM has demonstrated that archaeogaming can contribute to the understanding of video games as part of contemporary material culture, going beyond the understanding of games as artefacts to take a piece of a game's design and thoroughly analyse its construction.

These research methods are not specific to this one instance. We have used at least 100 games during the course of our retrogame archaeology work over the last few years, and the same set of methods we have described is generally applicable and, we conjecture, may be used for software well beyond games. Jet Set Willy's copy protection has served as an illustrative vehicle here. The methods are scalable, and they may be adapted for other games from other times as we understand not only the object, but the thinking behind the object's creation, and the maker's desire to protect it.


Aycock's work is supported in part by a grant from the Natural Sciences and Engineering Research Council of Canada. Reinhard received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.


Internet Archaeology is an open access journal. Except where otherwise noted, content from this work may be used under the terms of the Creative Commons Attribution 3.0 (CC BY) Unported licence, which permits unrestricted use, distribution, and reproduction in any medium, provided that attribution to the author(s), the title of the work, the Internet Archaeology journal and the relevant URL/DOI are given.
