Saturday, December 29, 2018

Commodore 64 programming

I have done some Commodore 64 programming over the holidays. The C64 is old, but I think there are things that can be learned from dealing with all the hardware limitations (although there may be better ways of getting this experience with more recent platforms...)

C64 Hardware

The 6502 instruction set consists of 56 very simple instructions, and the CPU has been completely reverse engineered, so you can follow how it does its work. Visual 6502 even lets you watch a transistor-level simulation of the CPU in the browser! The details of the CPU are described in the talk “Reverse Engineering the MOS 6502 CPU”.


The rest of the C64 hardware is also well described by a talk, “The Ultimate Commodore 64 Talk”.


There is a lot of information about the C64 available on the net, but I have not found any convenient place that collects the relevant parts. Everything I needed (including examples and tutorials) was available at C64 Wiki and Codebase 64. The details of the Video Interface Chip are documented in “The MOS 6567/6569 video controller (VIC-II) and its application in the Commodore 64”, and Andre Weissflog’s chip emulators contain very readable code if you want clarification on what the various hardware registers do.

Getting started – raster bars

My first test was to implement two raster bars and some sprites moving around, just to verify that I understood the basics.


Much of C64 programming centers around getting around hardware limitations by timing the code carefully. For example, the raster bars are drawn in the border of the screen, but the border can only have one color. It is, however, possible to modify the border color while the hardware is drawing the screen (the C64 hardware is designed for old CRT monitors that draw the picture one line at a time using an electron beam), so we can draw the raster bars by changing the border color exactly when a new line starts!
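
As a minimal sketch of the idea (a hypothetical stand-alone program, not part of the test program in this post), the loop below polls the raster register $d012 and switches the border color at two fixed raster lines. The load address $c000 and the lines 100 and 140 are arbitrary choices made for the illustration. Polling like this is not cycle-exact, so the edges of the colored band will jitter by a few pixels.

```asm
;; Hypothetical example: color a band of the border by polling the raster.
* = $c000
        sei             ; Keep the KERNAL interrupt from disturbing the loop
split:
        lda #100
wait_top:
        cmp $d012       ; Wait until the beam reaches raster line 100
        bne wait_top
        lda #2          ; Red
        sta $d020       ; Border color register
        lda #140
wait_bottom:
        cmp $d012       ; Wait until the beam reaches raster line 140
        bne wait_bottom
        lda #14         ; Light blue, the default border color
        sta $d020
        jmp split       ; Repeat for the next frame
```

Assuming the file is assembled with ACME's cbm output format (-f cbm), so that the load address is written to the file, it could be loaded to $c000 and started from BASIC with SYS 49152.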

I had thought the raster bars should be trivial (just enable an interrupt on a line, and change the colors) but the C64 is not fast enough for this – it takes 9-16 cycles to enter the IRQ, so we are already a bit into the screen when we can change the color. And there are only 63 CPU cycles for each line, so we don’t have the time to set up a new interrupt for the next line anyway. We, therefore, need to first synchronize our code with the start of a line, and then write the code (using NOPs etc. to pad it out) so that we change the colors exactly every 63 cycles.

But there are more complications: some lines have fewer than 63 cycles available to do the work. The reason is that the VIC-II chip steals cycles from the CPU for lines where it must fetch extra data. There are two cases:
  • The first raster line of each text line must fetch the characters, which steals one cycle per character. These lines are usually called “bad lines”.
  • Each sprite that is enabled on the raster line steals 2 cycles.
There is, in addition, an overhead to this cycle stealing mechanism that may waste up to 3 cycles per raster line (depending on what the CPU is doing when the cycle stealing starts).
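
To make the arithmetic concrete, the numbers above can be written down as assembler constants. This is only a back-of-the-envelope sketch: the symbol names are made up for the illustration, the figure of 40 characters per text line is the standard C64 screen width, and the overhead is counted once per line as described above.

```asm
;; Hypothetical cycle budget, using the numbers from the text.
cycles_per_line = 63    ; CPU cycles per raster line
badline_stolen  = 40    ; One cycle per character, 40 characters per text line
sprite_stolen   = 2     ; Per sprite enabled on the raster line
steal_overhead  = 3     ; Worst-case overhead of the cycle stealing

;; Worst case on a bad line without sprites: 63 - 40 - 3 = 20 cycles
badline_budget  = cycles_per_line - badline_stolen - steal_overhead

;; Worst case on a normal line with all 8 sprites: 63 - 8 * 2 - 3 = 44 cycles
sprites8_budget = cycles_per_line - 8 * sprite_stolen - steal_overhead
```

With only about 20 cycles left on a bad line, there is barely room for two color writes and a table lookup, which is why the raster bars here stay away from those lines.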

I was lazy and ensured that my raster bars were not on a bad line and that there were no sprites on the same raster lines, but careful programming can handle this kind of thing too. The talk “Behind the scenes of a C64 demo” mentions some insane techniques, such as treating the cycle counter as an address and jumping to it (the jump lands at different addresses depending on how many cycles have executed, and a careful layout of the code makes this compensate for differences in execution time).


Source code

The source code for my test program is available below, and you can build it using the ACME Cross-Assembler as
acme -o test.prg test.asm
The resulting program test.prg can be run in the VICE emulator by loading it using the “Smart attach Disk/Tape” option.

;; Zero page variables
sprite_x_idx = $03
sprite_y_idx = $04
sprite_skip_counter = $05
tmp1 = $10
irq_saved_a = $20
irq_saved_x = $21
irq_saved_y = $22
;; 16-bit pointer to the color table; must not overlap irq_saved_y at $22.
irq_color_addr_lo = $23
irq_color_addr_hi = $24
* = $07ff
;; File header and a BASIC program that calls the assembly code.
!byte $01,$08 ; Load address $0801
!byte $0c,$08 ; Pointer to the next BASIC line ($080c)
!byte $0a,$00,$9e,$20,$32,$30,$36,$34,$00 ; 10 SYS 2064
!byte $00,$00 ; End of program
*= $0810
start:
jsr init_irq
jsr clear_screen
jsr init_sprites
loop:
jsr step_time
;; Wait until the raster is below the sprites.
loop1:
ldx $d012
cpx #250
bne loop1
jsr move_sprites
jmp loop
;;; Update sine-table indices one time step.
step_time:
inc sprite_x_idx
lda sprite_x_idx
and #127
sta sprite_x_idx
;; Skip some updates to y to prevent the position from repeating
;; every 128 frames (both indices wrap at 128).
inc sprite_skip_counter
ldx sprite_skip_counter
cpx #53
bne st1
ldx #0
stx sprite_skip_counter
rts
st1:
inc sprite_y_idx
lda sprite_y_idx
and #127
sta sprite_y_idx
rts
;;; Update sprites x and y coordinates
move_sprites:
ldx #0
ldy sprite_x_idx
ms1:
tya
clc
adc #5
and #127
sta tmp1
and #63
tay
lda sine64,y
clc
adc #168
ldy tmp1
clc ; The previous add may have set the carry
adc sine128,y
sta $d000,x
inx
inx
cpx #16
bne ms1
ldx #0
ldy sprite_y_idx
ms2:
tya
clc
adc #5
and #127
tay
lda sine128,y
clc
adc #154
sta $d001,x
inx
inx
cpx #16
bne ms2
rts
clear_screen:
lda #$20 ; Screen code for space
ldx #0 ; Start at 0 so the loop below covers all 256 offsets
cs_loop:
sta $0400,x
sta $0500,x
sta $0600,x
sta $0700,x
dex
bne cs_loop
rts
;;; Initialize sprites
;;; https://www.c64-wiki.com/wiki/Sprite
init_sprites:
ldx #0
ldy #0
is1:
lda #sprite1/64
sta $07f8,y
lda #0
sta $d000,x
sta $d001,x
lda #1
sta $d027,y
inx
inx
iny
cpx #16
bne is1
lda #0
sta $d010
lda #$ff
sta $d015
;; Initialize variables used for calculating sprite positions.
lda #0
sta sprite_x_idx
sta sprite_skip_counter
lda #40
sta sprite_y_idx
rts
;;; Initialize IRQ
;;; http://codebase64.org/doku.php?id=base:introduction_to_raster_irqs
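init_irq:
sei ; Disable interrupts while setting up
lda #$7f
sta $dc0d ; Disable all CIA 1 interrupts
sta $dd0d ; Disable all CIA 2 interrupts
lda $dc0d ; Acknowledge pending CIA 1 interrupts
lda $dd0d ; Acknowledge pending CIA 2 interrupts
lda #1
sta $d01a ; Enable VIC-II raster interrupts
lda #80
sta $d012 ; Raster line for the first interrupt
lda #$1b
sta $d011 ; Default screen control; bit 7 (raster line bit 8) is clear
lda #$35
sta $01 ; Bank out BASIC and KERNAL ROM, keep I/O visible
lda #<irq1
sta $fffe ; Hardware IRQ vector, low byte
lda #>irq1
sta $ffff ; Hardware IRQ vector, high byte
cli
rts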
;; Align the IRQ code to ensure the branches do not cross a page
;; boundary (branches take longer when crossing page boundaries,
;; and the code below relies on branches taking 3 cycles).
!align 255, 0
;;; Draw rasterbar1
irq1:
;; Entering the IRQ takes 9-16 cycles (depending on how long it takes to
;; execute the current instruction). We do not need to be cycle-exact
;; (as the first cycles are outside the screen), so assume 10 cycles.
;; But this is sloppy... See the link below for implementing better
;; raster routines:
;; http://codebase64.org/doku.php?id=base:making_stable_raster_routines
sta irq_saved_a ; 3 cycles
stx irq_saved_x ; 3
sty irq_saved_y ; 3
lda #<irq2 ; 2
sta $fffe ; 4
lda #>irq2 ; 2
sta $ffff ; 4
lda #100 ; 2
sta $d012 ; 4
lda #$ff ; 2
sta $d019 ; 4
lda #<rasterbar1 ; 2
sta irq_color_addr_lo ; 3
lda #>rasterbar1 ; 2
sta irq_color_addr_hi ; 3
jmp rasterbar_irq ; 3
;;; Second half of IRQ. Assumes the first part has taken 56 cycles.
rasterbar_irq:
ldy #0 ; 2 cycles
lda (irq_color_addr_lo),y ; 5
;; ---------------------------- = 63
irq1_loop2:
sta $d020 ; 4
sta $d021 ; 4
ldx #8 ; 2
irq1_loop3:
dex ; 2 * n
bne irq1_loop3 ; 3 * n - 1
nop ; 2
nop ; 2
iny ; 2
lda (irq_color_addr_lo),y ; 5
bne irq1_loop2 ; 3 (2 for last iteration)
;; ---------------------------- = 63/62
sta $d020
sta $d021
ldy irq_saved_y
ldx irq_saved_x
lda irq_saved_a
rti
;;; Draw rasterbar2
irq2:
sta irq_saved_a
stx irq_saved_x
sty irq_saved_y
lda #<irq1
sta $fffe
lda #>irq1
sta $ffff
lda #85
sta $d012
lda #$ff
sta $d019
lda #<rasterbar2
sta irq_color_addr_lo
lda #>rasterbar2
sta irq_color_addr_hi
jmp rasterbar_irq
rasterbar1:
!byte 6, 14, 14, 6, 0
rasterbar2:
!byte 8, 7, 7, 8, 0
sine64:
!byte $00,$03,$06,$09,$0b,$0e,$11,$13,$15,$17,$19,$1a,$1c,$1d,$1d,$1e
!byte $1e,$1e,$1d,$1d,$1c,$1a,$19,$17,$15,$13,$11,$0e,$0c,$09,$06,$03
!byte $00,$fd,$fa,$f7,$f5,$f2,$ef,$ed,$eb,$e9,$e7,$e6,$e4,$e3,$e3,$e2
!byte $e2,$e2,$e3,$e3,$e4,$e6,$e7,$e9,$eb,$ed,$ef,$f2,$f4,$f7,$fa,$fd
sine128:
!byte $00,$01,$02,$04,$05,$06,$07,$08,$09,$0a,$0b,$0c,$0d,$0e,$0f,$10
!byte $11,$12,$13,$13,$14,$15,$15,$16,$16,$17,$17,$17,$18,$18,$18,$18
!byte $18,$18,$18,$18,$18,$17,$17,$17,$16,$16,$15,$15,$14,$13,$13,$12
!byte $11,$10,$0f,$0e,$0d,$0c,$0b,$0a,$09,$08,$07,$06,$05,$04,$02,$01
!byte $00,$ff,$fe,$fd,$fb,$fa,$f9,$f8,$f7,$f6,$f5,$f4,$f3,$f2,$f1,$f0
!byte $ef,$ee,$ed,$ed,$ec,$eb,$eb,$ea,$ea,$e9,$e9,$e9,$e8,$e8,$e8,$e8
!byte $e8,$e8,$e8,$e8,$e8,$e9,$e9,$e9,$ea,$ea,$eb,$eb,$ec,$ed,$ed,$ee
!byte $ef,$f0,$f1,$f2,$f3,$f4,$f5,$f6,$f7,$f8,$f9,$fa,$fb,$fc,$fe,$ff
!align 63, 0 ; Sprites must be 64-byte aligned
sprite1:
!byte $00,$7e,$00
!byte $03,$ff,$c0
!byte $07,$ff,$e0
!byte $1f,$ff,$f8
!byte $1f,$ff,$f8
!byte $3f,$ff,$fc
!byte $7f,$ff,$fe
!byte $7f,$ff,$fe
!byte $ff,$ff,$ff
!byte $ff,$ff,$ff
!byte $ff,$ff,$ff
!byte $ff,$ff,$ff
!byte $ff,$ff,$ff
!byte $7f,$ff,$fe
!byte $7f,$ff,$fe
!byte $3f,$ff,$fc
!byte $1f,$ff,$f8
!byte $1f,$ff,$f8
!byte $07,$ff,$e0
!byte $03,$ff,$c0
!byte $00,$7e,$00