This lesson is being piloted (Beta version)

Anatomy of a Computer

Overview

Teaching: 20 min
Exercises: 15 min
Questions
  • What is an emulator?

  • What is a ‘virtual machine’?

  • How can you break down a computer into its component parts?

Objectives
  • Re-consider the computer as a physical object with separate, discrete components

  • Translate each component to its virtual equivalent

  • Define emulation vs. virtualization

What is emulation?

In computing, emulation refers to the representation of a hardware equivalent in software. It allows us to:

To understand how this is accomplished, it’s worth stepping back to consider “hardware”. What is it exactly that we are trying to represent?

Diagram of a physical computing environment

In a computing environment, “hardware” refers to fixed, physical parts - a monitor, a mouse, a keyboard, RAM, the CPU, graphics and audio cards, etc. At a bare minimum, you need a CPU and a method of communicating with it; all other parts are technically optional. But some combination of these other components is obviously very common.

What we tend to think of as “our computer” - the operating system (MacOS, Windows), the file system, the applications we use, our personal data - is usually stored on a particular piece of hardware: a storage device like a hard disk drive or solid state drive (increasingly the latter, especially if you’re on a laptop). We’ll refer to this as a “bootable system” - some combination of data and software (whatever is required to actually make use of your hardware and not just have a pile of plastic and metal), stored on physical media.

And though it’s not strictly necessary, we usually interact with computers by adding additional content or software to the bootable system: photos from your family, games like The Oregon Trail, the installer for Adobe Photoshop, encrypted documents passed along by your whistleblower friend - whatever. Though the internet has taken over as the primary method for exchanging additional content, historically this was frequently done with portable, removable storage devices like floppies, CD-ROMs, external hard drives and USB sticks.

So what does it mean to translate all this into a virtual environment?

Diagram of a virtual computing environment

All of those fixed, physical components will be handled by emulators. Emulators are applications (software) that mimic the behavior of hardware - essentially, you will insert an emulator as an extra layer into your usual computing stack to trick older bootable systems and additional content or software into seeing different/legacy hardware rather than the actual laptop or desktop you are using right now.
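To make "mimicking hardware in software" a little more concrete, here is a minimal sketch of the kind of loop that sits at the heart of an emulator. The three-instruction machine below is entirely hypothetical: the instruction names (LOAD, ADD, HALT) and the function `run` are invented for this illustration and do not correspond to any real processor or emulator. Real emulators are vastly more complex, but the basic idea is similar: software reads each instruction intended for the guest hardware and carries out its effect itself.

```python
# A toy "CPU" emulated in software. The instruction set (LOAD, ADD, HALT)
# is invented for this illustration and does not match any real processor.

def run(program):
    """Fetch, decode and execute instructions until HALT; return the register."""
    register = 0   # stands in for a hardware register
    pc = 0         # program counter: index of the next instruction
    while pc < len(program):
        op, arg = program[pc]   # fetch the next instruction
        pc += 1
        if op == "LOAD":        # decode it and perform its effect
            register = arg
        elif op == "ADD":
            register += arg
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown instruction: {op}")
    return register


# A "guest" program written for the imaginary machine above.
guest_program = [("LOAD", 2), ("ADD", 3), ("HALT", 0)]
result = run(guest_program)
print(result)
```

The guest program never touches your real CPU's registers directly; everything it "sees" is a Python variable standing in for hardware.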

Just as there have been a great variety of hardware brands and systems over the history of computing, there are many emulators that have been created by developers and enthusiasts to recreate that variety. Frequently, different emulators will be better at mimicking different types of hardware - you will want a different application if you are trying to emulate a Mac or a PC, or if you are trying to emulate a BBC Micro or a 2018 Raspberry Pi.

Here are some common examples of popular emulators and the hardware they try to recreate:

A sidenote on “disk imaging”

The storage devices that carry bootable systems and additional content - hard drives, floppies, CD-ROMs, etc - are converted to a virtual environment by a process called disk imaging. A disk image is a file that contains both the contents and the structure of a storage device; in other words, a bit-for-bit copy of that physical device, that now exists virtually. Emulation very frequently relies on disk images (and some have been provided to get through this tutorial) - an emulator without a bootable disk image is as useful as a MacBook without MacOS.
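As a rough illustration of what "bit-for-bit copy" means, the sketch below copies a source byte for byte into an image file and compares checksums afterwards. It is not an archival imaging workflow: the function names, the file names, and the use of a temporary file as a stand-in for a physical floppy are all assumptions made for this example, and dedicated imaging tools deal with concerns (read errors, write blocking, metadata) that are ignored here.

```python
import hashlib
import os
import tempfile

CHUNK = 64 * 1024  # read in fixed-size chunks so large media need not fit in memory


def image_and_hash(source_path, image_path):
    """Copy source to image byte for byte; return the SHA-256 of the bytes read."""
    digest = hashlib.sha256()
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(CHUNK)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Stand-in for a physical device: a temporary file the size of a 1.44 MB floppy.
workdir = tempfile.mkdtemp()
fake_device = os.path.join(workdir, "fake_floppy.bin")
with open(fake_device, "wb") as f:
    f.write(os.urandom(1474560))

image = os.path.join(workdir, "floppy_file.img")
source_hash = image_and_hash(fake_device, image)
images_match = source_hash == sha256_of(image)
print(images_match)
```

Matching checksums are the usual evidence that the image really is an exact copy of what was read from the source.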

Creating disk images from physical media is a process entirely unto itself, which we will not cover in this lesson. We will assume disk imaging has already happened, and proceed from there to demonstrate how emulators use disk images to create a virtual environment. For more information on archival-quality disk imaging, we heartily recommend the resources provided by our friends in the BitCurator Consortium.

https://imgs.xkcd.com/comics/emulation.png

Emulation, Virtualization and Virtual Machines

We have been discussing computing “environments” to broadly encompass everything that makes up a computer: not just hardware, but the software, files and settings that you manipulate and personalize as a user. Emulators and disk images help us move this interaction from a physical to a virtual space.

Instead of a virtual “environment” you could also fairly refer to what we are doing with emulation as the creation of a virtual machine or VM. And indeed, you will see this term, as well as virtualization, come up a lot online, in similar contexts to emulation. The distinction between emulation and virtualization is very technical and largely contextual, so it is worth clarifying before we proceed.

As noted above, emulation is about recreating hardware in software - at a high level, its purpose is to overcome incompatibility and obsolescence. Without an emulator, it is impossible to run an SNES game on a desktop computer, or run an Apple II program in Windows, for instance. The guest system (the Apple II operating system) and the host (let’s say, a modern Dell laptop) are fundamentally incapable of talking to each other otherwise.

Virtualization also refers to a process of creating virtual machines, and setting up a “guest” system on a “host”. But it generally refers to cases where the guest system is totally compatible with the host’s hardware, but the user wants the option of running two or more different operating systems, or wants to isolate some of their applications and files, or otherwise has a reason to split their single computer into multiple computers. We can return to our example of a modern Dell laptop again here - that Dell laptop is perfectly capable of running either Windows 10 or Ubuntu 18.04 for its operating system, just not both at the same time. But you could run Windows 10 as the “host” with an Ubuntu virtual machine as the “guest” via a virtualization program, and take advantage of applications written for both operating systems.

Virtualization is less about overcoming incompatibility and obsolescence and more about managing and splitting hardware resources (the modern internet, for instance, pretty much runs on souped-up server racks that are split into virtual machines to manage and host different web sites - this is more efficient than setting up a separate physical machine for every site on the web). It’s also usually much faster than emulation, because programs run in virtualization are still communicating directly with hardware, rather than working their way through extra layers and layers of software.

However, confusingly, some programs are capable of both emulation and virtualization. VirtualBox, VMware, and QEMU are all popular examples. QEMU is capable of either virtualizing Windows 10 or emulating Mac OS 9.2. The line of “compatibility” can also get muddy and sometimes has nothing to do with our usual markers of age or obsolescence: there is software originally written in the 1980s for MS-DOS that is probably perfectly capable of being run on modern hardware via virtualization.
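One way to picture the distinction is as a question about CPU architecture: does the guest expect the same kind of processor the host actually has? The sketch below encodes that rule of thumb. It is a deliberate simplification, not how any particular program decides: the function names and the alias table are assumptions made for this example, and the rule ignores the muddier cases just mentioned (for instance, older 32-bit x86 software on a 64-bit x86 host).

```python
import platform

# Simplifying assumption for this sketch: a guest can be virtualized only when
# it targets the same CPU architecture as the host; anything else is emulated.
# Different operating systems report the same architecture under different names.
ALIASES = {"amd64": "x86_64", "x64": "x86_64", "arm64": "aarch64"}


def normalize(arch):
    """Map common architecture spellings onto one canonical name."""
    arch = arch.lower()
    return ALIASES.get(arch, arch)


def approach(guest_arch, host_arch=None):
    """Return 'virtualization' or 'emulation' under the rule of thumb above."""
    host = normalize(host_arch or platform.machine())
    return "virtualization" if normalize(guest_arch) == host else "emulation"


# An Ubuntu guest built for x86-64 on a Windows host that reports "AMD64".
print(approach("x86_64", host_arch="AMD64"))
# A Mac OS 9 guest (written for PowerPC processors) on the same host.
print(approach("ppc", host_arch="x86_64"))
```

Calling `approach("x86_64")` without a second argument checks against whatever machine the script is running on.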

All of this is to say that we will refer exclusively to emulation for the rest of this lesson, because contextually we are seeking to “resurrect” recovered data and that purpose is more commonly associated with creating virtual machines via emulation rather than virtualization. But, the further you pursue this work, the more likely these terms are going to come up in the course of troubleshooting and research. And, the more the digital preservation field considers how to proactively safeguard contemporary digital materials and computing environments, virtualization will become more and more relevant as well.

Mapping hardware components to emulator applications

An additional challenge of working with emulators is that there is little in the way of “controlled vocabularies” for computing environments. So different emulators may be referring to more or less the same thing and just use different words in their menus and support pages.

For an example: here is essentially the same MS-DOS virtual machine, displayed first in VirtualBox and then as a QEMU configuration. Both machines feature:

MS-DOS running in VirtualBox

MS-DOS running in QEMU

The selected settings in both applications are essentially choosing the same pieces of hardware to emulate, but communicating it differently (and it’s possible that neither is referring to a piece of hardware the way you would in everyday conversation, when thinking about, using, or buying a physical machine). And these are both applications meant to either emulate or virtualize generic PCs!

Hardware component           | VirtualBox setting   | QEMU setting
-----------------------------|----------------------|--------------------------------------------------------------------------
CPU/processor                | Chipset Architecture | qemu-system-[arch] (generic architecture) and/or -cpu (specific model)
RAM                          | Base Memory          | -m
Monitor and Graphics Card    | Display              | -vga
Sound Card                   | Audio Controller     | -soundhw
Network Interface Controller | Network Adapter      | -nic
Floppy Disk Drive            | Floppy Device        | -fda [floppy_file.img]
CD-ROM Drive                 | Optical Drive        | -cdrom [cd_file.iso]
Hard Disk Drive              | Virtual Hard Drive   | -hda [harddisk_file.img]
Mouse                        | Pointing Device      | -device
USB                          | USB Controller       | -usb

Exercise: Emulator Documentation

Running a virtual machine in QEMU consists of “building” your emulated PC by selecting hardware components to recreate from the program’s list of options.

Open a text editor or word processor (Text Editor, Notepad, Microsoft Word, Google Docs, whatever you’d like, just open something to write things down). Also open a new tab in your browser and navigate to the official documentation for QEMU’s “PC System Emulator” (qemu-system-x86_64, though the commands apply as well to qemu-system-i386).

Below is a list of several specific pieces of hardware emulated by QEMU. Copy the list to your open word document. Using the QEMU manual, can you write down, next to each device in the list, the appropriate setting and option to specify that particular hardware?

Since QEMU is a command line application, settings are selected with this structure (brackets for illustrative purposes only): -setting [option]

  • a Pentium processor:
  • 512 megabytes of RAM:
  • a USB mouse:
  • a Cirrus Logic GD5446 Video card:
  • an ENSONIQ AudioPCI ES1370 sound card:
  • a RealTek RTL8139D network card:

Solution

  • a Pentium processor: -cpu pentium
  • 512 megabytes of RAM: -m 512
  • a USB mouse: -device usb-mouse or -usbdevice mouse (the latter is deprecated syntax and may be removed in a future QEMU version)
  • a Cirrus Logic GD5446 Video card: -vga cirrus
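  • an ENSONIQ AudioPCI ES1370 sound card: -soundhw es1370 (deprecated syntax that has been removed from recent QEMU versions; check the current manual for its replacement)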
  • a RealTek RTL8139D network card: -net nic,model=rtl8139
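To see how these individual settings combine into one emulated PC, here is a small sketch that assembles them into a single QEMU command line. It only builds and prints the command; it does not run QEMU. The function `build_command` and the disk image name `harddisk_file.img` are placeholders chosen for this example. The `-usb` flag is included because a USB mouse needs a USB controller to attach to, and the sound card is left out because `-soundhw` is not available in every QEMU version.

```python
import shlex

# Settings taken from the solution above. A value of None marks a flag that
# takes no option. The sound card is omitted because -soundhw is not
# available in every QEMU version.
hardware = [
    ("-cpu", "pentium"),
    ("-m", "512"),
    ("-usb", None),               # USB controller for the mouse to attach to
    ("-device", "usb-mouse"),
    ("-vga", "cirrus"),
    ("-net", "nic,model=rtl8139"),
]


def build_command(binary, settings, disk_image):
    """Return the argument list for one emulated PC; nothing is executed."""
    argv = [binary]
    for setting, option in settings:
        argv.append(setting)
        if option is not None:
            argv.append(option)
    argv += ["-hda", disk_image]  # placeholder name for a bootable disk image
    return argv


command = build_command("qemu-system-i386", hardware, "harddisk_file.img")
print(shlex.join(command))
```

Keeping the settings in a list like this makes it easy to swap one component (say, a different `-cpu` model) without retyping the rest of the command.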

Key Points

  • Emulators recreate computing hardware in software. The results are called ‘virtual machines’ to distinguish them from physical machines.

  • You will need disk images to load software and files into emulators.

  • Virtualization and emulation are technically very similar but usually employed in slightly different contexts.

  • Different emulators are designed for different brands or models of hardware, and may refer to similar components in different ways. The variety of possible applications and vocabulary is confusing, but surmountable.