DOS from Scratch: Hello World
Now that we have a machine that can boot and execute some code, there is enough of a foundation to do a “Hello World” tutorial. There isn't too much we can do inside the 512 bytes of the boot sector, but it's enough space to play around with some fundamentals before moving on to loading an actual kernel.
So the boot loader from the previous post contained this block of code to print a dollar sign to the screen:
mov al, "$"
mov ah, 0x0e
int 0x10
You can probably infer what most of this does. int 0x10 triggers an interrupt handler that does something with the contents of ax. We printed a dollar sign so al is probably where we put a character we want to print, and the 0x0e in ah must refer to some sort of “print character” function. But how does that actually work? What if we want to print a string and not a single character? What if we want to use colors? Can we change the font? What if we want to deal with graphics instead of text?
INT 10H
INT 10H (or int 0x10 as I prefer writing it in code) is the interrupt vector for the BIOS video services. It can do several useful things, but what we care about right now is controlling text output.
When triggering INT 10H, ah is used as the function code. The function code we used (0x0e) is for teletype output, which behaves more or less like a TTY you're probably used to from modern operating systems. In text mode, it prints a character at the current cursor position then advances the cursor. It also line wraps if necessary, scrolls the screen if necessary, and responds to various control codes.
So that's enough information to output a character, and based on the described behavior you can probably figure out how to print full strings using a simple loop. This information also raises some unanswered questions though. Are there other types of output besides teletype? What modes besides text are there? What other function codes can INT 10H handle?
Video Modes
Let's start from the top. The very first function code INT 10H provides (function code 0) allows you to set the video mode. The value of al determines which of the many available modes the video card will use, but all of these fall into one of two categories: text and graphics.
In text mode, the video card interprets video memory as ASCII characters and renders those to the screen.
In graphics mode, the video card interprets video memory as pixel data. If you want to print text in this mode you'll have to write your own font rendering code, which is far beyond the scope of this post.
I'm going to use text mode 3, which is an 80x25 character text mode with 16 colors. On any modern computer or emulator this is likely the mode the computer booted into, but we'll set it to be safe. As an added bonus, setting the video mode will also clear the screen.
mov ah, 0
mov al, 0x03
int 0x10
NOTE: BIOS also supports multiple “pages” of text that you can toggle between, but for the time being we'll just do everything on page 0.
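If you'd rather check the current mode before switching, function 0x0f reports it. This is a minimal sketch, assuming the commonly documented behavior that function 0x0f returns the mode in al, the number of columns in ah, and the active page in bh. The mode_ok label is a made-up name for this example.

```nasm
; Query the current video state (function 0x0f)
mov ah, 0x0f
int 0x10
; al = current mode, ah = columns per row, bh = active page
cmp al, 0x03
je mode_ok        ; already in 80x25 text mode, leave the screen alone
; otherwise switch to mode 3 (this also clears the screen)
mov ah, 0
mov al, 0x03
int 0x10
mode_ok:
```

Note that skipping the mode set also skips the screen clear, so this only makes sense if you want to keep whatever the BIOS already printed.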
Cursor Position and Shape
Before we begin outputting text, it's important to understand how to manipulate the cursor on screen. Some output functions don't advance the cursor, so manually advancing the cursor is necessary.
Function 0x01
This sets the cursor shape. The cursor is always a full character width, but you can specify the starting and ending scan lines to control the height and vertical position of the box that's drawn. In text mode 3, characters are 16 pixels tall. The default underline cursor starts at line 13 and ends at line 14 (I assume that last pixel is left blank for line spacing). If you wanted the cursor to be a box instead, you could fill from 0 to 14.
; Make cursor a square
mov ah, 0x01
mov ch, 0
mov cl, 0x0e
int 0x10
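For comparison, here is a sketch that puts the underline cursor back using the default start and end lines mentioned above, then hides the cursor entirely. The hide step assumes the common VGA behavior where setting bit 5 of ch switches the cursor off. That part is an assumption worth verifying on your own hardware or emulator.

```nasm
; Restore the default underline cursor (lines 13 to 14)
mov ah, 0x01
mov ch, 13
mov cl, 14
int 0x10

; Hide the cursor (assumption: bit 5 of ch disables it)
mov ah, 0x01
mov ch, 0x20
mov cl, 0
int 0x10
```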
Function 0x02
This sets the cursor position using row and column indexing with 0, 0 being the top left of the screen, and 24, 79 being the bottom right corner.
; Put cursor in bottom right corner
mov ah, 0x02
mov bh, 0
mov dh, 24
mov dl, 79
int 0x10
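Since some output functions leave the cursor where it was, a small helper for stepping it forward can be handy. The sketch below relies on function 0x03, which isn't covered in this post; it is assumed to return the cursor row in dh and column in dl for the page in bh (and it also overwrites cx with the cursor shape, hence the pusha). The advance_cursor name and the hard-coded 80 column width are choices made for this example.

```nasm
; Advance the cursor one cell on page 0, wrapping at column 80.
; Hypothetical helper; it does not scroll if it runs past row 24.
advance_cursor:
pusha
; Read the current position (function 0x03): dh = row, dl = column
mov ah, 0x03
mov bh, 0
int 0x10
; Step one column to the right
inc dl
cmp dl, 80
jb advance_cursor_set
; Past the last column: wrap to the start of the next row
mov dl, 0
inc dh
advance_cursor_set:
; Write the new position back (function 0x02)
mov ah, 0x02
mov bh, 0
int 0x10
popa
ret
```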
Text Output Functions
INT 10H provides four functions for text output.
- Function 0x09 writes a character and attribute at the cursor position
- Function 0x0a writes a character at the cursor position
- Function 0x0e is the already discussed teletype output
- Function 0x13 is string output
Note: The first three should work on any IBM compatible PC, but the string output function won't work on some older machines. I haven’t been able to find a definitive answer as to where the exact cutoff is, but any machine with a VGA card is almost certainly new enough.
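The string output function doesn't get its own section in this post, so here is a hedged sketch of how a call is usually set up. It assumes the commonly documented register layout: es:bp points at the string, cx holds its length, dh and dl give the starting row and column, bh is the page, bl is the attribute, and al selects the write mode (1 meaning "use bl for every character and move the cursor"). The text and text_len names are made up for the example, and it assumes the code is assembled with an origin that makes segment 0 correct, as in a boot sector.

```nasm
; Print a length-counted string with function 0x13
mov ax, 0
mov es, ax          ; es:bp must point at the string (assumes segment 0)
mov bp, text
mov cx, text_len    ; number of characters to print
mov ah, 0x13
mov al, 0x01        ; write mode 1: attribute in bl, cursor is moved
mov bh, 0           ; page 0
mov bl, 0x0a        ; light green
mov dh, 5           ; starting row
mov dl, 10          ; starting column
int 0x10
jmp $               ; stop here so execution doesn't run into the data

text: db "Hello from 0x13"
text_len equ $ - text
```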
Function 0x09
This writes a character with a given attribute at the current cursor position. It does not advance the cursor or respond to control codes. It can, however, repeat a character more than one time.
The attribute refers to the color of the text, which in 16 color mode can be anything between 0x00 and 0x0f. These values occupy the low four bits of the attribute byte; the high four bits control the background.
0x00 – Black
0x01 – Blue
0x02 – Green
0x03 – Cyan
0x04 – Red
0x05 – Magenta
0x06 – Brown
0x07 – White
0x08 – Gray
0x09 – Light Blue
0x0a – Light Green
0x0b – Light Cyan
0x0c – Light Red
0x0d – Light Magenta
0x0e – Yellow
0x0f – Bright White
; Set the page to use (used by function 0x02 and 0x09)
mov bh, 0
; Set the color to red (used by function 0x09)
mov bl, 0x04
; Move cursor to 10, 0
mov ah, 0x02
mov dh, 10
mov dl, 0
int 0x10
; Output "r" at the cursor position
mov ah, 0x09
mov al, 'r'
mov cx, 1
int 0x10
; Move cursor to 10, 1
mov ah, 0x02
mov dh, 10
mov dl, 1
int 0x10
; Output "e" at the cursor position
mov ah, 0x09
mov al, 'e'
mov cx, 1
int 0x10
; Move cursor to 10, 2
mov ah, 0x02
mov dh, 10
mov dl, 2
int 0x10
; Output "d" at the cursor position
mov ah, 0x09
mov al, 'd'
mov cx, 1
int 0x10
; Move cursor to 10, 3
mov ah, 0x02
mov dh, 10
mov dl, 3
int 0x10
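The repeat count in cx is worth a quick illustration, since it can replace a whole loop when you want the same character many times. This sketch draws a yellow horizontal rule across one row; the choice of row 12 and the dash character are arbitrary.

```nasm
; Move the cursor to the start of row 12
mov ah, 0x02
mov bh, 0
mov dh, 12
mov dl, 0
int 0x10
; Write 80 yellow dashes starting at the cursor position
mov ah, 0x09
mov al, '-'
mov bh, 0
mov bl, 0x0e
mov cx, 80
int 0x10
```

The cursor stays at the start of the row afterwards, just as it does when writing a single character.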
Function 0x0a
This is the same as 0x09, but you don't provide an attribute value. The text will be printed using whatever the last attribute value at that position was.
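One way to see the retained attribute in action is to color some cells first and then overwrite one of them. This is a sketch under the behavior described above: function 0x09 paints five cells red using spaces, then function 0x0a writes an "x" into the first cell without specifying a color, so it should pick up the red that is already there.

```nasm
; Paint five cells red using function 0x09 (spaces, so nothing visible yet)
mov ah, 0x09
mov al, ' '
mov bh, 0
mov bl, 0x04
mov cx, 5
int 0x10
; Overwrite the first cell with function 0x0a; no attribute is supplied
mov ah, 0x0a
mov al, 'x'
mov bh, 0
mov cx, 1
int 0x10
```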
Function 0x0e
The behavior of the teletype output function was mostly discussed above, but it should be added that in text mode this function will also retain the previous attribute value, like 0x0a.
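To make the control code handling concrete, here is a short sketch that prints two letters on separate lines using only teletype output. It assumes the usual meanings of 0x0d (carriage return) and 0x0a (line feed), and that the BIOS leaves ah untouched between calls.

```nasm
mov ah, 0x0e
mov bh, 0
mov al, 'A'
int 0x10
mov al, 0x0d    ; carriage return: back to column 0
int 0x10
mov al, 0x0a    ; line feed: down one row
int 0x10
mov al, 'B'
int 0x10
```

Sending only the line feed would move down a row without returning to column 0, which is why the two codes are normally sent as a pair.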
Other Useful Functions
We've discussed all the text output functions BIOS provides, but there are several other useful functions you can use in conjunction with those.
Function 0x05
This function selects which display page to use.
; Set the currently displayed page to page 0
mov ah, 0x05
mov al, 0
int 0x10
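Pages get more interesting when you draw on one that isn't visible and then switch to it. The sketch below assumes that each page keeps its own cursor position and that functions 0x02 and 0x09 act on whichever page is named in bh, even if it isn't the one on screen.

```nasm
; Put page 1's cursor at the top left
mov ah, 0x02
mov bh, 1
mov dh, 0
mov dl, 0
int 0x10
; Write a green "2" on page 1 while page 0 is still displayed
mov ah, 0x09
mov al, '2'
mov bh, 1
mov bl, 0x02
mov cx, 1
int 0x10
; Now display page 1
mov ah, 0x05
mov al, 1
int 0x10
```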
Function 0x06
This function scrolls an area of the active page up by the number of rows given in al. The rows that scroll in are blank and use the attribute given in bh. Text that goes outside that area isn't retained, so scrolling up one row followed by scrolling down one row would result in the top line of text in that area being blank.
; Clear the screen (assuming an 80x25 character video mode)
mov ah, 0x06
; Scroll 25 rows and set the attribute to 0x07 (white)
mov al, 25
mov bh, 0x07
; Top left corner of scroll area
mov ch, 0
mov cl, 0
; Bottom right corner of scroll area
mov dh, 24
mov dl, 79
int 0x10
Function 0x07
This scrolls the active page down. It behaves the same as 0x06 aside from the direction.
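As a small illustration, this sketch scrolls only a band of the screen down by a single row, using the same register layout as the scroll up example. The row numbers are arbitrary.

```nasm
; Scroll rows 5 through 15 down by one row
mov ah, 0x07
mov al, 1       ; number of rows to scroll
mov bh, 0x07    ; attribute for the blank row that appears at the top
mov ch, 5       ; top left corner: row 5, column 0
mov cl, 0
mov dh, 15      ; bottom right corner: row 15, column 79
mov dl, 79
int 0x10
```

Whatever was on row 15 is lost, and row 5 comes out blank.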
Printing Strings
Now that we've gone over all the various text related functions of INT 10H, we can put all that together to actually print strings. What's that you ask? Isn't that just using function 0x13?
It could be depending on what you were building, but this series is specifically about learning how DOS works, and function 0x13 isn't actually all that useful for displaying DOS's string format.
DOS’s print string function operates on dollar terminated strings. Rather than taking the length of the string as a parameter, it just keeps printing characters until it comes across a dollar sign. We could loop over our string to find its length, but if we’re looping anyway it’s easier to just print as we iterate.
As best I can tell from experimentation, the DOS print string function interprets characters the same as the BIOS teletype output, with one exception. DOS does not display the character 0x1b (escape, decimal 27), and also does not display the next character following it. I don't fully understand why, but I'll update this section once I do.
For now, I'm going to implement the print string function by using the BIOS teletype output function and call that close enough.
; Prints a $ terminated string
; bx: address of the string to print
print:
pusha
; SI will point to the current character
mov si, bx
; Set the arguments for BIOS teletype output
mov ah, 0x0e
mov bx, 0
print_loop:
; load the next character from memory
mov al, [si]
; check if we're at the end of the string
cmp al, '$'
je print_exit
; print the character
int 0x10
; move on to the next character
add si, 1
jmp print_loop
print_exit:
popa
ret
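If you prefer the string instructions, the same loop can be written with lodsb, which loads al from ds:si and then steps si for you. This is just an alternative sketch, not a replacement for the version above; the print_lodsb labels are made up, and it assumes ds already points at the segment holding the string.

```nasm
; Prints a $ terminated string (lodsb variant)
; bx: address of the string to print
print_lodsb:
pusha
mov si, bx
; Set the arguments for BIOS teletype output
mov ah, 0x0e
mov bx, 0
; Clear the direction flag so lodsb walks forward through memory
cld
print_lodsb_loop:
; al = byte at ds:si, then si = si + 1
lodsb
cmp al, '$'
je print_lodsb_exit
int 0x10
jmp print_lodsb_loop
print_lodsb_exit:
popa
ret
```

Keep in mind that popa does not restore flags, so the direction flag stays cleared after the call.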
What if I don't want to go through the BIOS?
In some cases, you may find it more effective to directly write text to graphics memory instead of using the various character and string printing functions provided by the BIOS. In text modes, text is stored directly in graphics memory and the graphics adapter handles converting that text data into pixels. All that's necessary to modify this data directly is knowing where it's located in memory and how it's structured, but that will be discussed in a future post about VGA.
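As a small preview only, here is what a direct write might look like. It rests on two assumptions that the later post will need to confirm: that the text buffer for mode 3 starts at segment 0xb800, and that each cell is two bytes, the character followed by its attribute, so a cell's offset is (row * 80 + column) * 2.

```nasm
; Write a bright white "A" to the top left cell of the screen
mov ax, 0xb800
mov es, ax
mov byte [es:0], 'A'    ; character byte of cell (0, 0)
mov byte [es:1], 0x0f   ; attribute byte of cell (0, 0)
```

Writing this way does not move the BIOS cursor.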
Conclusion
Now that we have a print function we can update our bootloader to print an actual “hello world” message.
; boot.asm
[bits 16]
[org 0x7c00]
main:
; disable interrupts
cli
; make sure the CPU is in a sane state
jmp 0x0000:clear_segment_registers
clear_segment_registers:
xor ax, ax
mov ds, ax
mov es, ax
mov ss, ax
mov sp, main
cld
; re-enable interrupts
sti
; print a message
mov bx, msg
call print
; "It's now safe to turn off your computer."
hlt
jmp $-1
; Prints a $ terminated string
; bx: address of the string to print
print:
pusha
; SI will point to the current character
mov si, bx
; Set the arguments for BIOS teletype output
mov ah, 0x0e
mov bx, 0
print_loop:
; load the next character from memory
mov al, [si]
; check if we're at the end of the string
cmp al, '$'
je print_exit
; print the character
int 0x10
; move on to the next character
add si, 1
jmp print_loop
print_exit:
popa
ret
msg:
db "Hello World!"
crlf db 0x0d, 0x0a
endstr db '$'
; padding
times 510-($-$$) db 0
; literally magic
dw 0xaa55