DOS from Scratch: Getting Disk Info

Previous: Debugging Functions

Now that we have some debugging functions in place, we can start work on actually loading data off a disk. Currently our “disk” is only a single block though, so we need to update our build to generate a real disk image. We'll do this using qemu-img to generate a blank disk image, then we'll use dd to copy our bootloader onto the first block of that image.

build/disk.img: build/boot.img
	qemu-img create build/disk.img 1474560B
	dd if=build/boot.img of=build/disk.img bs=512 count=1 conv=notrunc
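
As a quick sanity check on the host, a short Python script can confirm that the image came out the expected size and starts with the bootloader. This is only a sketch and not part of the build above; the function name check_disk_image is made up for illustration, and it assumes the bootloader occupies the first 512-byte block.

```python
import os

FLOPPY_SIZE = 1474560  # total bytes in the floppy image built above
BLOCK_SIZE = 512       # size of the boot block copied by dd


def check_disk_image(disk_path, boot_path):
    """Return True if disk_path is floppy-sized and begins with the boot block.

    Hypothetical helper: compares the first block of the bootloader file
    with the same number of bytes at the start of the disk image.
    """
    if os.path.getsize(disk_path) != FLOPPY_SIZE:
        return False
    with open(boot_path, "rb") as boot, open(disk_path, "rb") as disk:
        boot_block = boot.read(BLOCK_SIZE)
        return len(boot_block) > 0 and disk.read(len(boot_block)) == boot_block
```

Calling check_disk_image("build/disk.img", "build/boot.img") after the build should return True if both steps of the rule ran as intended.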

1,474,560 bytes is the size of a 1.44MB floppy disk. This is a useful format to work with because it's ubiquitous (at least for any machine we're targeting) and it has a disk geometry we can predict when using qemu. Now we just need to update our run command to point at the full disk image and also to tell qemu this is a floppy image and not a hard drive.

.PHONY: run
run: build/disk.img
	qemu-system-x86_64 -drive format=raw,if=floppy,file=build/disk.img

Logical Block Addressing (LBA)

Disks are broken up into groups called blocks, which are traditionally 512 bytes. Modern disks tend to use larger 4096 byte blocks, but DOS had long been retired by the time those existed.

Using logical block addressing, you access disk blocks by index, starting from zero. With a total size of 1474560 bytes and 512 bytes per block, our floppy image is made up of 2880 blocks, with block 0 being the first block (which is the boot sector) and block 2879 being the last.
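
The arithmetic is easy to check on the host. The following Python sketch (the helper names are made up for illustration) computes the block count of the image and the byte offset at which a given block starts:

```python
BLOCK_SIZE = 512        # traditional block size in bytes
IMAGE_SIZE = 1474560    # size of our floppy image in bytes


def block_count(image_size, block_size=BLOCK_SIZE):
    """Number of whole blocks in an image of image_size bytes."""
    return image_size // block_size


def block_offset(lba, block_size=BLOCK_SIZE):
    """Byte offset within the image at which block number lba starts."""
    return lba * block_size


print(block_count(IMAGE_SIZE))  # 2880
print(block_offset(0))          # 0, the boot sector
print(block_offset(2879))       # 1474048, the last block
```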

Unfortunately BIOS support for logical block addressing didn't arrive until the mid 90s, so most machines that would have run DOS as a primary operating system don't support it. Instead, disks were addressed using cylinder-head-sector addressing, a scheme which requires knowing the physical geometry of the disk.

Cylinder-head-sector (CHS) addressing

Hard drives are made up of one or more metal discs called platters read by some number of magnetic heads (usually two per platter, one for each side). The drives are sliced up into concentric rings called cylinders, and pie slices called sectors. The intersection of a cylinder and a head is a track, and the intersection of a track and a sector is a specific block. Using this, you can access a block by selecting a specific cylinder, head, and sector. If that sounds confusing, don't worry; it's not necessary to fully understand.

In order to maintain our sanity we really need to think of the disk in terms of LBA. Luckily it's easy to convert from LBA to CHS (and vice versa), so we can build functions to read and write blocks based on their LBA address.

The formulas for converting an LBA address into a CHS tuple are as follows (all divisions are integer divisions, and % is the remainder):

cylinder = LBA / (heads per cylinder * sectors per track)
head = (LBA / sectors per track) % heads per cylinder
sector = (LBA % sectors per track) + 1

As far as math is concerned this is just basic arithmetic, but there is some information we're missing. We need to know the number of heads per cylinder and the number of sectors per track. If we want to ensure we're not reading past the end of the drive, we also need the number of cylinders (or the total number of blocks, which we would need the number of cylinders to compute).
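
To get a feel for the conversion before writing any assembly, here is a minimal Python sketch of the formulas above, plus the inverse. The function names are made up for illustration, and the example values assume 2 heads and 18 sectors per track, the geometry we expect a 1.44MB floppy to report.

```python
def lba_to_chs(lba, heads_per_cylinder, sectors_per_track):
    """Convert a zero-based LBA into a (cylinder, head, sector) tuple.

    Cylinders and heads count from 0; sectors count from 1.
    """
    cylinder = lba // (heads_per_cylinder * sectors_per_track)
    head = (lba // sectors_per_track) % heads_per_cylinder
    sector = (lba % sectors_per_track) + 1
    return cylinder, head, sector


def chs_to_lba(cylinder, head, sector, heads_per_cylinder, sectors_per_track):
    """Inverse of lba_to_chs for the same geometry."""
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


# Assumed geometry: 2 heads, 18 sectors per track
print(lba_to_chs(0, 2, 18))         # (0, 0, 1), the boot sector
print(lba_to_chs(18, 2, 18))        # (0, 1, 1), first sector on the second head
print(lba_to_chs(2879, 2, 18))      # (79, 1, 18), the last block
print(chs_to_lba(79, 1, 18, 2, 18)) # 2879
```

Note that the sector number is the only component that starts at 1, which is why the formula adds 1 and the inverse subtracts it.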

Determining Disk Geometry

Disk geometry can be determined by calling int 0x13 with function code 0x08 in ah. This function will populate various registers with the disk info of the disk specified by the value in dl. We'll need to do some shuffling to get the data we want isolated, so we'll build a function for that.

; allocate some memory to store disk info
I_CYLINDERS: dw 0
I_HEADS: dw 0
I_SECTORS: dw 0


; Loads disk info at the I_CYLINDERS, I_HEADS, and I_SECTORS
; labels.
; dl: drive index
load_disk_info:
    pusha

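    ; int 0x13, ah = 0x08: get drive parameters for drive dl
    ; note: for floppy drives this call also changes es:di,
    ; and pusha/popa do not save or restore es
    mov ah, 0x08
    int 0x13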

    ; print a message and halt if there was an error
    jc disk_error

    ; isolate bits [5:0] of cx
    ; this is the number of sectors
    mov ax, cx
    and ax, 0x3f
    mov [I_SECTORS], ax

    ; isolate bits [15:8] of dx
    ; this is the number of heads - 1
    mov al, dh
    mov ah, 0
    inc ax
    mov [I_HEADS], ax

    ; combine bits [7:6] of cx (high 2 bits) with
    ; bits [15:8] of cx (low 8 bits)
    ; this is the number of cylinders - 1
    mov al, ch
    mov ah, cl
    shr ah, 6
    inc ax
    mov [I_CYLINDERS], ax

    popa
    ret


disk_error:
    mov bx, DISK_ERROR_MESSAGE
    call print_string
    hlt
    jmp $-1


DISK_ERROR_MESSAGE: db "Error Reading Disk!$"
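
The bit shuffling in load_disk_info can be checked on the host with a small Python sketch that mirrors the same steps. The function name is made up for illustration, and the register values in the example are assumptions: they are what a BIOS would need to return in cx and dx to describe 80 cylinders, 2 heads, and 18 sectors.

```python
def decode_geometry(cx, dx):
    """Decode (cylinders, heads, sectors) from the cx and dx values
    returned by int 0x13, ah = 0x08, following load_disk_info."""
    # bits [5:0] of cx: number of sectors per track
    sectors = cx & 0x3F
    # bits [15:8] of dx (dh): number of heads - 1
    heads = ((dx >> 8) & 0xFF) + 1
    # bits [7:6] of cx are the high 2 bits and bits [15:8] of cx (ch)
    # are the low 8 bits of (number of cylinders - 1)
    cylinders = (((cx & 0xC0) << 2) | ((cx >> 8) & 0xFF)) + 1
    return cylinders, heads, sectors


# Assumed register values for a 1.44MB floppy: ch = 0x4f, cl = 0x12, dh = 0x01
print(decode_geometry(0x4F12, 0x0100))  # (80, 2, 18)
```

The 10-bit cylinder field and 6-bit sector field mean the largest geometry this call can describe is 1024 cylinders and 63 sectors per track.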

To verify that the disk info is being loaded correctly, let's write a function to print that info in a human-readable format.

; Prints disk info
; dl: drive index
print_disk_info:
    pusha

    call load_disk_info

    mov bx, CYLINDERS
    call print
    mov ax, [I_CYLINDERS]
    call print_hex
    mov bx, LINE_BREAK
    call print

    mov bx, HEADS
    call print
    mov ax, [I_HEADS]
    call print_hex
    mov bx, LINE_BREAK
    call print

    mov bx, SECTORS
    call print
    mov ax, [I_SECTORS]
    call print_hex
    mov bx, LINE_BREAK
    call print

    popa
    ret

CYLINDERS: db "Cylinders: $"
HEADS: db "Heads: $"
SECTORS: db "Sectors: $"

With all that in place, we can call print_disk_info with dl set to 0 to print the info for the first floppy disk.

; print disk info for first floppy disk
mov dl, 0x00
call print_disk_info

If everything is working correctly, you should see the following output:

Cylinders: 0050
Heads: 0002
Sectors: 0012

Converting from hex, that's 80 cylinders, 2 heads, and 18 sectors per track, which is the geometry of a 1.44MB floppy disk (80 * 2 * 18 = 2880 blocks). In the next post we'll use this information about the disk geometry to read an arbitrary block from the disk.