Computer Science

Boot your OS from CD/DVD

Posted by mtomassoli on February 27, 2009

 

Warning

This article is written in Bad English (BE), the most widespread language in the Web.

 

Source code

You won’t find any source code here. I don’t believe that providing source code is always a good thing. My reasons are simple:

  • if you really understand something you’ll be able to implement it by yourself in your language of choice;
  • usually, you just end up copying the code because you are too lazy to reread the explanation provided or too lazy to ask the author for more information;
  • I don’t like learning from code because it contains way too many irrelevant details.

That’s right. Sometimes having too many details is worse than missing some of them. The problem is that whoever writes the code needs to know much more than what can be seen from the code itself. On the other hand, the code is the result of many totally arbitrary choices that are usually not explicitly marked as such. Over-commented code would probably do, but I don’t like over-commented code.

Bottom line: no source code here.

 

Booting from CD

I guess you already know how to boot from a hard disk or a floppy disk. Booting from a CD is just a bit more complex, but nothing to worry about.

Sooner or later, you’ll have to read this paper:
specscdrom.pdf.
What is El Torito? You’ll find some information here:
http://en.wikipedia.org/wiki/El_Torito_(CD-ROM_standard).

I am not an expert in these things. I just wanted to make my OS bootable from CD.

 

ISO 9660 & .iso

ISO 9660 is a standard that defines a file system for CD-ROM media. CDs contain less information than hard disks and can’t be modified as easily, so ISO 9660 is far simpler than, say, NTFS.
Rewritable media used with packet writing (CD-RW, for instance) typically use the UDF format, which is more complex than ISO 9660, though.

While ISO 9660 is (the specification of) a file system, an .iso file is just a sector-by-sector copy of a CD or DVD. As a file format, it has NOTHING to do with ISO 9660: it simply stores whatever the sectors contain!

An .iso image contains the so-called cooked 2048-byte sectors of a CD or DVD. They’re cooked and not raw because control data is missing.
Have a look here:
http://en.wikipedia.org/wiki/ISO_9660#CD-ROM_Specifications.
As you can see, in cooked 2048-byte sectors synchronization information and error correction and detection codes are missing. They’re normally automatically created for you, so we don’t need to worry about all that.
We just need to write user-data as a sequence of 2048-byte sectors. If we want to create a 30-sector CD/DVD we’ll need to create an .iso file of 30*2048 bytes. That’s all.
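For those who like to double-check the arithmetic, here is a minimal sketch in C. The names CD_SECTOR_SIZE, iso_size_bytes and iso_sector_offset are made up for this illustration; they don't come from any specification.

```c
#include <stddef.h>

/* Size of one cooked sector, as described above. */
enum { CD_SECTOR_SIZE = 2048 };

/* Total size in bytes of an .iso made of `sectors` cooked sectors. */
size_t iso_size_bytes(size_t sectors)
{
    return sectors * CD_SECTOR_SIZE;
}

/* Byte offset inside the .iso file at which sector `lba` starts. */
size_t iso_sector_offset(size_t lba)
{
    return lba * CD_SECTOR_SIZE;
}
```

For the 30-sector example, iso_size_bytes(30) gives 61440 bytes.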

 

El Torito & ISO 9660

What does it mean that the El Torito “format” is an extension to the ISO 9660 format?
It simply means that it’s compliant with ISO 9660, i.e. a CD may be bootable and still follow the ISO 9660 spec., which is a good thing, of course.

But that also means that we don’t have to follow the ISO 9660 format at all! Once all the structures that the El Torito format requires are present in the .iso, we can add more data by simply writing it in arbitrarily chosen sectors.

For instance, we could write the content of three modules of our OS in the following groups of sectors: 30…39, 40…63 and 64…100. Then our OS could read these three files directly because it would know in advance where they are located.
If this weren’t satisfying, we could even devise a simple file system.

 

Emulation

El Torito spec. talks about floppy disk images, hard disk images and emulation. Why? When you boot from a floppy disk, the BIOS loads your boot code found in sector 0 (zero) at the physical address 7C00h.
Now your code can access the content of the floppy disk by calling the int 13h BIOS services. Booting from a hard disk is analogous.
When the BIOS jumps to your code, DL contains the ID of the boot device. Floppy disk drives start from 0 and hard disks from 80h. CD-ROM drives should start from A0h.

What would happen if an old program were booted from a CD-ROM?
First of all, it could complain about DL being A0h and crash. Secondly, how would it be supposed to read from a CD? The old int 13h functions don’t work.
So BIOSes that implement the El Torito extension must virtualize the access to the CD and pretend it’s a normal floppy disk or hard disk! Such BIOSes shall set DL to 0 or 80h (or similar) and transparently extend the old int 13h interface in such a way that the old program thinks it’s reading from the media it was supposed to be booted from.

The important thing to understand is that the compatibility problem that is being solved is software related, not hardware or firmware related. If your BIOS doesn’t know anything about the El Torito extension then there’s nothing you can do about it.

 

Reinventing the Wheel

Many tutorials suggest that you should create a bootable floppy disk and then use some software to convert that to a bootable .iso.
Well, I think that’s not the right way to proceed. Sometimes, easy means “the only way I know how to make it work”. That doesn’t mean it’s easy. That only means that you didn’t find information about other methods.
Reinventing the wheel is bad, you will often be told. I think that reinventing the wheel is a good thing as long as you can tell good and bad wheels apart.
If you create something horrible you have to be aware of that.

Moreover, you really shouldn’t tell an OS developer that reinventing the wheel is bad. He’s developing an OS!!!

Reinventing the wheel is also extremely instructive: you can’t build by yourself what you don’t understand. Moreover, if you do something by yourself you’re free to customize it and you’ll be more independent, especially when something doesn’t work and you have to understand what’s wrong with it.

 

Native Mode & int 13h extensions

For my project I went straight for booting in Native Mode. When you boot from a CD/DVD in Native Mode you’ll be able to make the BIOS load up to 32 MB of code for you. The best thing to do, however, is to use the BIOS only initially. Afterwards you should write your own code.
AFAIK, int 13h can be called only in real mode, so you’ll have to run a v8086 task from protected mode or set up some kind of virtualization by yourself (I’m assuming your OS will run in protected mode).
Since I’m targeting recent platforms, I’ll assume that every BIOS supports int 13h extensions (ah>40h). See http://www.t10.org/t13/docs2004/d1572r3-EDD3.pdf.

From real mode, you can read 2048-byte sectors through the int 13h services 41h-48h. The good news is that you’ll be using LBA addressing mode, i.e. the sectors will be linearly numbered starting from 0. Keep in mind that we’re dealing with two types of sectors:

  • 512-byte sectors and
  • 2048-byte sectors

When you tell the BIOS how many sectors it should load from the CD, you specify the number of 512-byte sectors. Since this number, as we shall see, is a WORD, you can ask for

  512*65536 bytes = 2^(9+16) bytes = 2^5*2^20 bytes = 32MB

at most (strictly speaking, one 512-byte sector less, since the largest WORD value is 65535). On the other hand, when you use the functions 41h-48h in your code, you’ll be referring to 2048-byte sectors.
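To make the second case a bit more concrete, here is a hedged sketch of the 16-byte packet that, as far as I understand the EDD document linked above, the extended read function (int 13h, ah=42h) expects to find at DS:SI. The function name dap_fill and its parameter names are my own invention, and the code only prepares the bytes: the actual BIOS call still has to be made from real mode.

```c
#include <stdint.h>
#include <string.h>

enum { DAP_SIZE = 16 };   /* size of the basic packet in bytes */

/* Fill the packet byte by byte, so that no struct packing is needed.
 * `blocks` counts sectors of the device (2048 bytes for a CD booted
 * without emulation), `lba` is the first sector to read and
 * buf_segment:buf_offset is the real-mode destination address. */
void dap_fill(uint8_t dap[DAP_SIZE], uint16_t blocks,
              uint16_t buf_segment, uint16_t buf_offset, uint64_t lba)
{
    memset(dap, 0, DAP_SIZE);              /* byte 1 (reserved) stays 0 */
    dap[0] = DAP_SIZE;                     /* packet size */
    dap[2] = (uint8_t)(blocks & 0xFF);     /* number of blocks, low byte */
    dap[3] = (uint8_t)(blocks >> 8);
    dap[4] = (uint8_t)(buf_offset & 0xFF); /* offset is stored first... */
    dap[5] = (uint8_t)(buf_offset >> 8);
    dap[6] = (uint8_t)(buf_segment & 0xFF);/* ...then the segment */
    dap[7] = (uint8_t)(buf_segment >> 8);
    for (int i = 0; i < 8; i++)            /* 64-bit LBA, little-endian */
        dap[8 + i] = (uint8_t)(lba >> (8 * i));
}
```

For example, dap_fill(dap, 4, 0x1000, 0, 19) describes a read of four 2048-byte sectors starting at sector 19 into 1000h:0000h.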

 

Let’s get started

First of all, we don’t need any specific software. Sometimes people use very powerful software to do very simple things and the result isn’t even completely satisfying. Well, here we’ll create the .iso all by ourselves. What we need is just a way to tell the BIOS to

  1. read N 512-byte sectors
  2. starting from the 2048-byte sector S
  3. and copy them to the physical address P in RAM.

But El Torito spec. aims at being ISO 9660 compliant (remember?) so we’ll have to do much more than that. We’ll need to write the following structures:

  1. Boot Record Volume Descriptor (BVD)
  2. Boot Catalog (BC)

where the BC consists of

  1. Validation Entry (VE)
  2. Initial/Default Entry
  3. Section Header
  4. Section Entry

There may be many section headers and many sections. You’ll have to refer to the documentation for the details. I’ll just guide you through the creation of a minimalistic .iso. The idea is this:

  1. we create a BVD at sector 17 (i.e. the 18th sector)
  2. we create a BC at sector 18
  3. we write our code (boot image) to sectors 19, 20, …

The BVD shall point to the BC which shall point to our boot image.

 

BVD

A BVD has the following format:

struct BootVolDesc
{
    BYTE bootRecInd;
    BYTE specId[5];
    BYTE descVer;
    BYTE specStr[32];
    BYTE reserved[32];
    DWORD bootCatSec;       // absolute sector number where
                            // the boot catalog starts
    BYTE reserved2[1973];
};

The documentation is clear:

  1. bootRecInd must be set to 0
  2. specId to "CD001"
  3. descVer to 1
  4. specStr to "EL TORITO SPECIFICATION"
  5. reserved must be filled with 0
  6. reserved2 must be filled with 0

And, finally, we’ll set bootCatSec to 18 because that is where we’ll put our BC. Note that the BVD fills the entire 2048-byte sector 17, so no padding is needed.
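One detail is easy to miss: bootCatSec sits at byte offset 71, which is not a multiple of 4, so a C compiler is free to insert padding into the struct above unless you ask for a packed layout. The sketch below avoids the problem by writing into a plain byte buffer at explicit offsets. The names bvd_fill and put_le32 are hypothetical helpers, and the offsets are simply the running sum of the field sizes listed above.

```c
#include <stdint.h>
#include <string.h>

enum { SECTOR_SIZE = 2048 };

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v);
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Build the Boot Record Volume Descriptor in a 2048-byte buffer. */
void bvd_fill(uint8_t sector[SECTOR_SIZE], uint32_t boot_cat_sector)
{
    static const char spec_str[] = "EL TORITO SPECIFICATION";

    memset(sector, 0, SECTOR_SIZE);          /* reserved, reserved2 = 0 */
    sector[0] = 0;                           /* bootRecInd */
    memcpy(sector + 1, "CD001", 5);          /* specId, no terminator */
    sector[6] = 1;                           /* descVer */
    memcpy(sector + 7, spec_str,
           sizeof spec_str - 1);             /* specStr, zero padded */
    put_le32(sector + 71, boot_cat_sector);  /* bootCatSec */
}
```

With the layout used in this article you would call bvd_fill(buffer, 18) and write the buffer at sector 17.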

 

Validation Entry

The BC starts with the Validation Entry:

struct ValidationEntry
{
    BYTE headerId;
    BYTE platformId;    // 0 = 80x86, 1 = Power PC, 2 = Mac
    WORD reserved;
    BYTE devName[24];   // developer or manufacturer of the ISO
    WORD checkSum;      // such that the sum of all the words
                        // gives 0x0000
    WORD magicWord;
};

We do as the documentation says:

  1. we set headerId to 1
  2. platformId to 0 (it shouldn’t be too hard to figure out why)
  3. reserved to 0
  4. magicWord to 0xAA55

Please note that, in little-endian, bytes are written to memory from the least to the most significant one, therefore the word 0xAA55 is written byte by byte as

  0x55 0xAA.

We don’t care in which order single bits are read/written because this detail is not architectural (and not exposed by the operations).

Finally, we write some 24-byte-long ASCII string. This is my (provisional) string, but don’t you dare use it for yourself! 🙂

  "Virtual Debugger"

Now you have to choose checkSum in such a way that the sum of all the WORDs in this validation entry is zero. You can proceed as follows:

  1. you set checkSum to 0
  2. you read the entire structure as if it were an array of WORDs
  3. you compute the sum SUM of all the WORDs
  4. you set checkSum to -SUM

Remember that, in a two’s complement representation of N bits, -X is nothing more than an N-bit number such that (-X)+X = 2^N.

For instance, with WORDs, we have

  FFFF + 0001 = 10000,

where 10000 (in hexadecimal) is 2^16. Since the carry out of the most significant bit is lost, we are left with 0000.
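The four steps above can be sketched in C as follows. This is only an illustration under my own assumptions: the entry is handled as an array of 32 bytes, the checkSum field is at byte offset 28 (the sum of the sizes of the fields that precede it), and the names ve_word_sum and ve_set_checksum are made up.

```c
#include <stddef.h>
#include <stdint.h>

enum { VE_SIZE = 32, VE_CHECKSUM_OFFSET = 28 };

/* Sum the 16 little-endian WORDs of the entry, modulo 2^16. */
uint16_t ve_word_sum(const uint8_t ve[VE_SIZE])
{
    uint16_t sum = 0;
    for (size_t i = 0; i < VE_SIZE; i += 2)
        sum = (uint16_t)(sum + (ve[i] | (ve[i + 1] << 8)));
    return sum;
}

/* Choose checkSum so that all the WORDs of the entry add up to 0. */
void ve_set_checksum(uint8_t ve[VE_SIZE])
{
    ve[VE_CHECKSUM_OFFSET] = 0;                       /* step 1 */
    ve[VE_CHECKSUM_OFFSET + 1] = 0;
    uint16_t neg = (uint16_t)(0u - ve_word_sum(ve));  /* steps 2-4 */
    ve[VE_CHECKSUM_OFFSET] = (uint8_t)(neg & 0xFF);   /* low byte first */
    ve[VE_CHECKSUM_OFFSET + 1] = (uint8_t)(neg >> 8);
}
```

After calling ve_set_checksum, ve_word_sum on the same entry should return 0, which is a cheap sanity check before writing the entry to the .iso.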

 

Initial/Default Entry

Now we need to write the I/D Entry right after the VE. The format is as follows:

struct SectionEntry
{
    BYTE bootable;          // 0x88 = bootable image present,
                            // 0 = non-bootable image present
    BYTE bootMediaType;     // 0 = no emulation,
                            // 1 = 1.2MB diskette,
                            // 2 = 1.44MB diskette,
                            // 3 = 2.88MB diskette,
                            // 4 = hard drive,
                            // 5-0xff = reserved
    WORD entryCodeSeg;      // usually 0x7c0
    BYTE systemType;
    BYTE reserved;
    WORD numSecToLoad;      // number of 512-byte sectors to load
                            // (usually 1 in emulation mode)
    DWORD startingSec;      // absolute address of the first sector
                            // of the image to load

    // The following data must be 0 if this is the "Initial/Default
    // Entry".
    BYTE selCriteria;       // 0 = no selection criteria,
                            // 1 = language and revision information
                            //     (IBM format),
                            // 2-0xff = reserved
    BYTE selCriteria2[19];  // selection criteria
};

Here we go:

  1. we set bootable to 0x88 (obviously)
  2. *I* set bootMediaType to 0 (no emulation)
  3. *I* set entryCodeSeg to 0x7c0 (It’s an old friend)
  4. we set systemType to 0
  5. reserved to 0
  6. selCriteria to 0
  7. we fill selCriteria2 with 0

In my case startingSec is 19 because I decided to put my code (boot image) in sector 19 (and following sectors…). numSecToLoad must be set to the number of 512-byte sectors you want the BIOS to load for you. You may or may not pad the image with zeros so that its length is a multiple of 512; I don’t think it’s required.

If your boot image is x bytes long, you should set numSecToLoad to

  ceiling(x/512).

You can compute that value as

  (x+511) div 512,

where div is the integer division.

Let k be a non-negative integer and r a positive integer less than 512.

If x = 512*k, then (x+511) div 512 = k.

If x = 512*k + r, then (x+511) div 512 = k+1.

That’s exactly what we wanted.
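Here is a small sketch that puts the formula to work while filling the Initial/Default Entry with the values chosen above. It writes the 32 bytes at explicit offsets instead of relying on the struct layout. The names sectors_512 and initial_entry_fill are hypothetical, and the check against 0xFFFF is my own addition, since numSecToLoad is a WORD.

```c
#include <stdint.h>
#include <string.h>

enum { ENTRY_SIZE = 32 };

/* ceiling(x / 512) using integer division only.
 * Assumes image_bytes is small enough for x + 511 not to overflow. */
uint32_t sectors_512(uint32_t image_bytes)
{
    return (image_bytes + 511u) / 512u;
}

/* Fill the Initial/Default Entry (no emulation, segment 0x7c0).
 * Returns 0 on success, -1 if the image doesn't fit in a WORD count. */
int initial_entry_fill(uint8_t e[ENTRY_SIZE], uint32_t image_bytes,
                       uint32_t start_sector)
{
    uint32_t n = sectors_512(image_bytes);
    if (n > 0xFFFFu)
        return -1;

    memset(e, 0, ENTRY_SIZE);     /* systemType, reserved, selCriteria* */
    e[0] = 0x88;                  /* bootable */
    e[1] = 0;                     /* bootMediaType: no emulation */
    e[2] = 0xC0;                  /* entryCodeSeg = 0x07c0, low byte first */
    e[3] = 0x07;
    e[6] = (uint8_t)(n & 0xFF);   /* numSecToLoad */
    e[7] = (uint8_t)(n >> 8);
    e[8]  = (uint8_t)(start_sector);        /* startingSec */
    e[9]  = (uint8_t)(start_sector >> 8);
    e[10] = (uint8_t)(start_sector >> 16);
    e[11] = (uint8_t)(start_sector >> 24);
    return 0;
}
```

A 1500-byte boot image placed at sector 19, for instance, yields numSecToLoad = 3.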

 

Extra Section

I think there must exist at least one extra Section Entry in the .iso. The documentation is not particularly clear about it.

Here’s the structure for the Section Header:

struct SectionHeader
{
    BYTE id;               // 0x90 = other sections will follow this one,
                           // 0x91 = this is the final section
    BYTE platformId;       // 0 = 80x86, 1 = Power PC, 2 = Mac
    WORD numSecEntries;    // number of section entries
    BYTE sectionName[28];
};

Note that one document reports 0x90 and 0x91 while the other reports 90 and 91. I’m currently using 0x90 and 0x91 and all seems to work fine. We don’t really need another section because our boot image is already pointed to by the I/D Entry (which is itself a section). For this reason, we set:

  1. id to 0x91, to indicate that this is the LAST section.

    This is also what makes me think that we need at least one section. If the only way of saying that there are no other sections (besides the sections introduced by this header, of course) is by setting id to 0x91 in the last header and each header should be followed by at least one section, how can we do it without having at least one section?
  2. platformId to 0
  3. numSecEntries to 1
  4. sectionName to an arbitrary string padded with 0.

Now we have to write the actual section. Its structure is identical to that of the I/D Entry, because, as I already said, that’s itself a section. For more or less subtle differences please refer to the documentation. We should initialize this last section as we did before, but with a couple of exceptions:

  1. we set numSecToLoad to 1
  2. we set bootable to 0

Here I’m being a little defensive, i.e. I prefer to do something potentially superfluous instead of missing something mandatory. I was lucky: my .iso worked on the first try.

For this reason, this section has valid entryCodeSeg and startingSec as well (they are the same we used before). At this point, the current sector, i.e. sector 18, contains:

  1. Validation Entry
  2. Section Entry
  3. Section Header
  4. Section Entry

What you have to do now is pad the current (2048-byte) sector with 0.
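As a rough sketch of this last part of the Boot Catalog, the two functions below build the Section Header and derive the extra Section Entry from an already filled Initial/Default Entry. Everything is written at explicit byte offsets, and the names section_header_fill and extra_section_from_initial are hypothetical. Keep in mind that the extra section reflects the defensive choice described above, not something I can prove to be mandatory.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { ENTRY_SIZE = 32, SECTION_NAME_SIZE = 28 };

/* Section Header marked as the last one, with a single entry. */
void section_header_fill(uint8_t h[ENTRY_SIZE], const char *name)
{
    size_t len = strlen(name);
    if (len > SECTION_NAME_SIZE)
        len = SECTION_NAME_SIZE;     /* truncate over-long names */

    memset(h, 0, ENTRY_SIZE);        /* platformId = 0, name padding = 0 */
    h[0] = 0x91;                     /* id: this is the final section */
    h[2] = 1;                        /* numSecEntries = 1 (low byte) */
    h[3] = 0;
    memcpy(h + 4, name, len);        /* sectionName */
}

/* Extra Section Entry: same as the I/D Entry, except for
 * bootable = 0 and numSecToLoad = 1. */
void extra_section_from_initial(uint8_t dst[ENTRY_SIZE],
                                const uint8_t initial[ENTRY_SIZE])
{
    memcpy(dst, initial, ENTRY_SIZE);
    dst[0] = 0;                      /* bootable */
    dst[6] = 1;                      /* numSecToLoad, low byte */
    dst[7] = 0;
}
```

With 32 bytes per record, the four records end up at byte offsets 0, 32, 64 and 96 of sector 18, and the remaining 1920 bytes are the zero padding.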

 

Boot Image

Finally, we write our boot image and pad the .iso file with 0 so that its size is a multiple of 2048.

Please remember that sector 17 is the 18th sector (they start from 0) so you may want to start by writing 2048*17 bytes to your .iso and then write all the structures we talked about.
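To see how the padding works out for the layout used in this article (BVD at sector 17, BC at sector 18, boot image from sector 19), here is a minimal sketch. The constant and function names are invented for the occasion.

```c
#include <stddef.h>

enum {
    CD_SECTOR_SIZE    = 2048,
    BVD_SECTOR        = 17,
    CATALOG_SECTOR    = 18,
    BOOT_IMAGE_SECTOR = 19
};

/* Round n up to the next multiple of the sector size. */
size_t round_up_to_sector(size_t n)
{
    return (n + CD_SECTOR_SIZE - 1) / CD_SECTOR_SIZE * CD_SECTOR_SIZE;
}

/* Final size of the .iso for a boot image of the given length. */
size_t iso_total_size(size_t boot_image_bytes)
{
    size_t image_start = (size_t)BOOT_IMAGE_SECTOR * CD_SECTOR_SIZE;
    return round_up_to_sector(image_start + boot_image_bytes);
}

/* Number of zero bytes to append after the boot image. */
size_t iso_tail_padding(size_t boot_image_bytes)
{
    size_t image_start = (size_t)BOOT_IMAGE_SECTOR * CD_SECTOR_SIZE;
    return iso_total_size(boot_image_bytes)
         - (image_start + boot_image_bytes);
}
```

A 1-byte boot image, for example, gives a 20-sector file of 40960 bytes, 2047 of which are tail padding.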

And please also remember that you can’t read from the CD with the old int 13h functions (i.e. int 13h, ah=2). Have a look at the documentation.

By the way, you don’t need that

db 510 - ($ - $$) dup (0)
dw 0AA55h

or

times 510 - ($ - $$) db 0
dw 0AA55h

anymore.

That’s all. Happy coding!
