I have done some Commodore 64 programming over the holidays. The C64 is old, but I think there are things that can be learned from dealing with all the hardware limitations (although there may be better ways of getting this experience with more recent platforms...)
C64 Hardware
The 6502 instruction set consists of 56 very simple instructions, and the CPU has been completely reverse engineered, so you can follow how it does its work. Visual 6502 even lets you run a transistor-level simulation of the CPU in the browser! The details of the CPU are described in the talk “Reverse Engineering the MOS 6502 CPU”.
The rest of the C64 hardware is also well described by a talk, “The Ultimate Commodore 64 Talk”.
There is a lot of information about the C64 available on the net, but I have not found any convenient place that collects the relevant information... But everything I needed (including examples/tutorials) was available at C64 Wiki and Codebase 64. The details of the Video Interface Chip are documented in “The MOS 6567/6569 video controller (VIC-II) and its application in the Commodore 64”, and Andre Weissflog’s chip emulators contain very readable code if you want clarification on what the various hardware registers do.
Getting started – raster bars
My first test was to implement two raster bars and some sprites moving around, just to verify that I understood the basics.
Much of C64 programming is about working around hardware limitations by timing the code carefully. For example, the raster bars are drawn in the border of the screen, but the border can only have one color. It is, however, possible to modify the border color while the hardware is drawing the screen (the C64 hardware is designed for old CRT monitors that draw the picture one line at a time using an electron beam), so we can draw the raster bars by changing the border color exactly when a new line starts!
I had thought the raster bars should be trivial (just enable an interrupt on a line, and change the colors) but the C64 is not fast enough for this – it takes 9-16 cycles to enter the IRQ, so we are already a bit into the screen when we can change the color. And there are only 63 CPU cycles for each line, so we don’t have the time to set up a new interrupt for the next line anyway. We, therefore, need to first synchronize our code with the start of a line, and then write the code (using NOPs etc. to pad it out) so that we change the colors exactly every 63 cycles.
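The padding arithmetic can be checked with a few lines of Python. This is only a sketch and not part of the test program: `delay_loop_cycles` and `plan_delay` are hypothetical helper names, the 18 cycles of per-line work is an illustrative figure, and the timings assumed are the standard 6502 ones (`ldx #n` 2 cycles, `dex` 2, `bne` 3 when taken and 2 when it falls through, `nop` 2) with no branch crossing a page boundary.

```python
CYCLES_PER_LINE = 63  # CPU cycles per raster line, as in the text (PAL machines)


def delay_loop_cycles(n):
    """Cycles used by `ldx #n` followed by an n-iteration `dex`/`bne` loop."""
    if not 1 <= n <= 255:
        raise ValueError("n must be in 1..255")
    # ldx #n: 2; each iteration: dex (2) + taken bne (3); the final bne
    # falls through and costs 2 instead of 3.
    return 2 + 5 * n - 1


def plan_delay(cycles):
    """Return (n, nops): loop count and extra NOPs that burn exactly `cycles`.

    Prefers the longest loop, i.e. the fewest NOPs. Raises ValueError if the
    delay cannot be built from one loop plus NOPs.
    """
    for n in range(255, 0, -1):
        rest = cycles - delay_loop_cycles(n)
        if rest >= 0 and rest % 2 == 0:
            return n, rest // 2
    raise ValueError(f"cannot build a delay of {cycles} cycles")


# Hypothetical example: the per-line work (color writes, fetching the next
# color, loop branch) takes 18 cycles, so the padding must burn the rest.
work = 18
n, nops = plan_delay(CYCLES_PER_LINE - work)
print(n, nops)  # 8 2
```

Under these assumptions a delay of 45 cycles is an 8-iteration loop plus two NOPs, while 46 cycles is exactly a 9-iteration loop.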
But there are more complications: some lines have fewer than 63 cycles available to do the work. The reason is that the VIC-II chip steals cycles from the CPU for lines where it must fetch extra data. There are two cases:
- The first raster line of each text line must fetch the characters, which steals one cycle per character. These lines are usually called “bad lines”.
- Each sprite that is enabled on the raster line steals 2 cycles.
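The two rules above can be turned into a rough cycle budget. The sketch below is a simplified model that only encodes what the list says; the function names are hypothetical, and real hardware takes a few extra cycles when the VIC-II grabs the bus, so the results are best read as upper bounds. The bad-line test follows the VIC-II documentation: raster lines $30-$f7 whose low three bits equal YSCROLL, which is 3 with the $d011 value of $1b.

```python
CYCLES_PER_LINE = 63      # CPU cycles per raster line, as in the text
CHARS_PER_ROW = 40        # standard C64 text mode is 40 columns wide
CYCLES_PER_SPRITE = 2     # per enabled sprite on the line, as in the text


def is_bad_line(raster, yscroll=3):
    """True if `raster` is a bad line, assuming the display is enabled."""
    return 0x30 <= raster <= 0xF7 and (raster & 7) == (yscroll & 7)


def cpu_cycles_available(raster, sprites_on_line=0, yscroll=3):
    """Rough CPU cycle budget for one raster line under the two rules above."""
    if not 0 <= sprites_on_line <= 8:
        raise ValueError("the VIC-II has 8 sprites")
    stolen = CYCLES_PER_SPRITE * sprites_on_line
    if is_bad_line(raster, yscroll):
        stolen += CHARS_PER_ROW
    return CYCLES_PER_LINE - stolen


print(cpu_cycles_available(52))                     # 63: ordinary line
print(cpu_cycles_available(51))                     # 23: bad line
print(cpu_cycles_available(51, sprites_on_line=8))  # 7: bad line, all sprites
```

Even in this optimistic model, a bad line with all eight sprites leaves only a handful of cycles, which is why such lines need special treatment.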
I was lazy and ensured that my raster bars were not on a bad line and that there were no sprites on the same raster line, but careful programming can handle this kind of thing too. The talk “Behind the scenes of a C64 demo” mentions some insane techniques, such as treating the cycle counter as an address and jumping to it (this jumps to different addresses depending on how many cycles have executed, and careful layout of the code can make this compensate for differences in execution time).
Source code
The source code for my test program is available below, and you can build it using the ACME Cross-Assembler as
acme -o test.prg test.asm
The resulting program test.prg can be run in the VICE emulator by loading it using the “Smart attach Disk/Tape” option.
;; Zero page variables
sprite_x_idx = $03
sprite_y_idx = $04
sprite_skip_counter = $05
tmp1 = $10
irq_saved_a = $20
irq_saved_x = $21
irq_saved_y = $22
;; Pointer to the current color table: two consecutive bytes that must
;; not overlap the saved registers above.
irq_color_addr_lo = $23
irq_color_addr_hi = $24
        * = $07ff
;; File header and a BASIC program that calls the assembly code.
        !byte $01,$08 ; Load address $0801
        !byte $0c,$08 ; Pointer to the next BASIC line ($080c)
        !byte $0a,$00,$9e,$20,$32,$30,$36,$34,$00 ; 10 SYS 2064
        !byte $00,$00 ; End of program
        * = $0810
start:
        jsr init_irq
        jsr clear_screen
        jsr init_sprites
loop:
        jsr step_time
        ;; Wait until the raster is below the sprites.
loop1:
        ldx $d012
        cpx #250
        bne loop1
        jsr move_sprites
        jmp loop
;;; Update sine-table indices one time step.
step_time:
        inc sprite_x_idx
        lda sprite_x_idx
        and #127
        sta sprite_x_idx
        ;; Skip some updates to y to prevent the position from repeating
        ;; every 64 frames.
        inc sprite_skip_counter
        ldx sprite_skip_counter
        cpx #53
        bne st1
        ldx #0
        stx sprite_skip_counter
        rts
st1:
        inc sprite_y_idx
        lda sprite_y_idx
        and #127
        sta sprite_y_idx
        rts
;;; Update sprites x and y coordinates
move_sprites:
        ldx #0
        ldy sprite_x_idx
ms1:
        tya
        clc
        adc #5
        and #127
        sta tmp1
        and #63
        tay
        lda sine64,y
        clc
        adc #168
        ldy tmp1
        adc sine128,y
        sta $d000,x
        inx
        inx
        cpx #16
        bne ms1
        ldx #0
        ldy sprite_y_idx
ms2:
        tya
        clc
        adc #5
        and #127
        tay
        lda sine128,y
        clc
        adc #154
        sta $d001,x
        inx
        inx
        cpx #16
        bne ms2
        rts
;;; Fill the screen memory with spaces
clear_screen:
        lda #$20
        ldx #0          ; Start at 0 so the loop covers all 256 offsets
cs_loop:
        sta $0400,x
        sta $0500,x
        sta $0600,x
        sta $0700,x
        dex
        bne cs_loop
        rts
;;; Initialize sprites
;;; https://www.c64-wiki.com/wiki/Sprite
init_sprites:
        ldx #0
        ldy #0
is1:
        lda #sprite1/64
        sta $07f8,y     ; Sprite data pointer
        lda #0
        sta $d000,x     ; X coordinate
        sta $d001,x     ; Y coordinate
        lda #1
        sta $d027,y     ; Sprite color (white)
        inx
        inx
        iny
        cpx #16
        bne is1
        lda #0
        sta $d010       ; Bit 8 of the X coordinates
        lda #$ff
        sta $d015       ; Enable all eight sprites
        ;; Initialize variables used for calculating sprite positions.
        lda #0
        sta sprite_x_idx
        sta sprite_skip_counter
        lda #40
        sta sprite_y_idx
        rts
;;; Initialize IRQ
;;; http://codebase64.org/doku.php?id=base:introduction_to_raster_irqs
init_irq:
        sei
        lda #$7f
        sta $dc0d       ; Disable CIA 1 interrupts
        sta $dd0d       ; Disable CIA 2 interrupts
        lda $dc0d       ; Acknowledge pending CIA interrupts
        lda $dd0d
        lda #1
        sta $d01a       ; Enable raster interrupts
        lda #80
        sta $d012       ; Raster line for the first interrupt
        lda #$1b
        sta $d011       ; Default screen mode, raster line bit 8 = 0
        lda #$35
        sta $01         ; Bank out BASIC and KERNAL ROM, keep I/O
        lda #<irq1
        sta $fffe       ; Hardware IRQ vector
        lda #>irq1
        sta $ffff
        cli
        rts
;; Align the IRQ code to ensure the branches do not cross a page
;; boundary (branches take longer when crossing page boundaries,
;; and the code below relies on branches taking 3 cycles).
        !align 255, 0
;;; Draw rasterbar1
irq1:
        ;; Entering the IRQ takes 9-16 cycles (depending on how long it takes
        ;; to execute the current instruction). We do not need to be cycle-exact
        ;; (as the first cycles are outside the screen), so assume 10 cycles.
        ;; But this is sloppy... See the link below for implementing better
        ;; raster routines:
        ;; http://codebase64.org/doku.php?id=base:making_stable_raster_routines
        sta irq_saved_a                 ; 3 cycles
        stx irq_saved_x                 ; 3
        sty irq_saved_y                 ; 3
        lda #<irq2                      ; 2
        sta $fffe                       ; 4
        lda #>irq2                      ; 2
        sta $ffff                       ; 4
        lda #100                        ; 2
        sta $d012                       ; 4
        lda #$ff                        ; 2
        sta $d019                       ; 4
        lda #<rasterbar1                ; 2
        sta irq_color_addr_lo           ; 3
        lda #>rasterbar1                ; 2
        sta irq_color_addr_hi           ; 3
        jmp rasterbar_irq               ; 3
;;; Second half of IRQ. Assumes the first part has taken 56 cycles.
rasterbar_irq:
        ldy #0                          ; 2 cycles
        lda (irq_color_addr_lo),y       ; 5
        ;; ---------------------------- = 63
irq1_loop2:
        sta $d020                       ; 4
        sta $d021                       ; 4
        ldx #8                          ; 2
irq1_loop3:
        dex                             ; 2 * n
        bne irq1_loop3                  ; 3 * n - 1
        nop                             ; 2
        nop                             ; 2
        iny                             ; 2
        lda (irq_color_addr_lo),y       ; 5
        bne irq1_loop2                  ; 3 (2 for last iteration)
        ;; ---------------------------- = 63/62
        sta $d020
        sta $d021
        ldy irq_saved_y
        ldx irq_saved_x
        lda irq_saved_a
        rti
;;; Draw rasterbar2
irq2:
        sta irq_saved_a
        stx irq_saved_x
        sty irq_saved_y
        lda #<irq1
        sta $fffe
        lda #>irq1
        sta $ffff
        lda #85
        sta $d012
        lda #$ff
        sta $d019
        lda #<rasterbar2
        sta irq_color_addr_lo
        lda #>rasterbar2
        sta irq_color_addr_hi
        jmp rasterbar_irq
rasterbar1:
        !byte 6, 14, 14, 6, 0
rasterbar2:
        !byte 8, 7, 7, 8, 0
sine64:
        !byte $00,$03,$06,$09,$0b,$0e,$11,$13,$15,$17,$19,$1a,$1c,$1d,$1d,$1e
        !byte $1e,$1e,$1d,$1d,$1c,$1a,$19,$17,$15,$13,$11,$0e,$0c,$09,$06,$03
        !byte $00,$fd,$fa,$f7,$f5,$f2,$ef,$ed,$eb,$e9,$e7,$e6,$e4,$e3,$e3,$e2
        !byte $e2,$e2,$e3,$e3,$e4,$e6,$e7,$e9,$eb,$ed,$ef,$f2,$f4,$f7,$fa,$fd
sine128:
        !byte $00,$01,$02,$04,$05,$06,$07,$08,$09,$0a,$0b,$0c,$0d,$0e,$0f,$10
        !byte $11,$12,$13,$13,$14,$15,$15,$16,$16,$17,$17,$17,$18,$18,$18,$18
        !byte $18,$18,$18,$18,$18,$17,$17,$17,$16,$16,$15,$15,$14,$13,$13,$12
        !byte $11,$10,$0f,$0e,$0d,$0c,$0b,$0a,$09,$08,$07,$06,$05,$04,$02,$01
        !byte $00,$ff,$fe,$fd,$fb,$fa,$f9,$f8,$f7,$f6,$f5,$f4,$f3,$f2,$f1,$f0
        !byte $ef,$ee,$ed,$ed,$ec,$eb,$eb,$ea,$ea,$e9,$e9,$e9,$e8,$e8,$e8,$e8
        !byte $e8,$e8,$e8,$e8,$e8,$e9,$e9,$e9,$ea,$ea,$eb,$eb,$ec,$ed,$ed,$ee
        !byte $ef,$f0,$f1,$f2,$f3,$f4,$f5,$f6,$f7,$f8,$f9,$fa,$fb,$fc,$fe,$ff
        !align 63, 0 ; Sprites must be 64-byte aligned
sprite1:
        !byte $00,$7e,$00
        !byte $03,$ff,$c0
        !byte $07,$ff,$e0
        !byte $1f,$ff,$f8
        !byte $1f,$ff,$f8
        !byte $3f,$ff,$fc
        !byte $7f,$ff,$fe
        !byte $7f,$ff,$fe
        !byte $ff,$ff,$ff
        !byte $ff,$ff,$ff
        !byte $ff,$ff,$ff
        !byte $ff,$ff,$ff
        !byte $ff,$ff,$ff
        !byte $7f,$ff,$fe
        !byte $7f,$ff,$fe
        !byte $3f,$ff,$fc
        !byte $1f,$ff,$f8
        !byte $1f,$ff,$f8
        !byte $07,$ff,$e0
        !byte $03,$ff,$c0
        !byte $00,$7e,$00
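The file header and BASIC stub at the top of the listing are easy to get wrong by hand, so here is a small Python sketch that rebuilds the same bytes. It is only an illustration and not part of the program: `basic_sys_stub` is a hypothetical helper name, and it assumes the usual C64 layout in which a .prg file starts with a two-byte load address and each BASIC line is stored as a next-line pointer, a line number, the tokenized text, and a terminating zero.

```python
import struct

LOAD_ADDRESS = 0x0801  # start of BASIC program memory on the C64
SYS_TOKEN = 0x9E       # tokenized BASIC keyword SYS


def basic_sys_stub(line_number, target):
    """Build a .prg header plus a one-line BASIC program `<line> SYS <target>`."""
    # Line body: SYS token, a space, the target as decimal digits, and a NUL.
    body = bytes([SYS_TOKEN, 0x20]) + str(target).encode("ascii") + b"\x00"
    # Each BASIC line starts with a pointer to the next line and its number.
    next_line = LOAD_ADDRESS + 4 + len(body)
    line = struct.pack("<HH", next_line, line_number) + body
    # A zero next-line pointer marks the end of the program.
    return struct.pack("<H", LOAD_ADDRESS) + line + b"\x00\x00"


stub = basic_sys_stub(10, 0x0810)
print(" ".join(f"{b:02x}" for b in stub))
# 01 08 0c 08 0a 00 9e 20 32 30 36 34 00 00 00
```

The output matches the !byte lines in the listing: the pointer $080c is the address of the end-of-program marker, and 2064 is $0810 in decimal, which is where the assembly code starts.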