BASIC programs are stored in memory using a simple structure that we can investigate and manipulate. This article will show how they are stored and contains a BASIC program to go through each line of its own code and display how it is stored in memory. A good understanding of how BASIC is stored can help us to find novel ways to get the most out of BASIC and to leverage the power of the BASIC interpreter. This can be seen in a previous article: Storing Machine Code in REM Statements on the VIC-20.
A BASIC program consists of a series of lines. Each line begins with a two-byte link to the next line in memory, followed by a two-byte line number and then one or more BASIC statements. The end of the line is marked by a 00 byte. The end of the program is indicated by two 00 bytes where the next line link would otherwise be.
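The link chain described above can be followed with a handful of PEEKs. The listing below is a minimal sketch rather than a finished tool: it assumes locations 43 and 44 hold the start-of-BASIC pointer (the same pointer the program at the end of this article reads), and the variable names P and L are chosen here purely for illustration. It prints the address and line number of each line in memory and stops when it finds a link of zero.

```basic
10 P=PEEK(43)+256*PEEK(44):REM START OF BASIC
20 L=PEEK(P)+256*PEEK(P+1):REM LINK TO NEXT LINE
30 IF L=0 THEN PRINT P;"END OF PROGRAM":END
40 PRINT P;"LINE";PEEK(P+2)+256*PEEK(P+3)
50 P=L:GOTO 20
```

Because the sketch is itself a BASIC program, running it lists its own five lines before reporting the end of the program.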
BASIC programs start at different locations depending on how much memory the Vic has, as the following table shows.
Memory | BASIC Storage | First Location of BASIC Program |
---|---|---|
Unexpanded | $1000-$1DFF (4096-7679) | $1001 (4097) |
+3K | $0400-$1DFF (1024-7679) | $0401 (1025) |
+8K | $1200-$3FFF (4608-16383) | $1201 (4609) |
+16K | $1200-$5FFF (4608-24575) | $1201 (4609) |
+24K | $1200-$7FFF (4608-32767) | $1201 (4609) |
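Rather than relying on the table, a program can ask the machine where BASIC starts. The direct-mode line below is a sketch that assumes locations 43 and 44 hold the start-of-BASIC pointer in LSB MSB order, which is how the program at the end of this article uses them.

```basic
PRINT PEEK(43)+256*PEEK(44)
```

On an unexpanded Vic this should print 4097, matching the first row of the table.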
In order to reduce the storage requirements of a BASIC program and to speed up the interpreter, BASIC statements are condensed using tokens. The following table contains the tokens used by BASIC v2 on the Vic. Where tokens aren't used (comments, variable names, numbers, etc.), the text is stored as PETSCII.
Decimal | Hex | BASIC token | Decimal | Hex | BASIC token |
---|---|---|---|---|---|
128 | 80 | END | 166 | A6 | SPC( |
129 | 81 | FOR | 167 | A7 | THEN |
130 | 82 | NEXT | 168 | A8 | NOT |
131 | 83 | DATA | 169 | A9 | STEP |
132 | 84 | INPUT# | 170 | AA | + |
133 | 85 | INPUT | 171 | AB | - |
134 | 86 | DIM | 172 | AC | * |
135 | 87 | READ | 173 | AD | / |
136 | 88 | LET | 174 | AE | ^ |
137 | 89 | GOTO | 175 | AF | AND |
138 | 8A | RUN | 176 | B0 | OR |
139 | 8B | IF | 177 | B1 | > |
140 | 8C | RESTORE | 178 | B2 | = |
141 | 8D | GOSUB | 179 | B3 | < |
142 | 8E | RETURN | 180 | B4 | SGN |
143 | 8F | REM | 181 | B5 | INT |
144 | 90 | STOP | 182 | B6 | ABS |
145 | 91 | ON | 183 | B7 | USR |
146 | 92 | WAIT | 184 | B8 | FRE |
147 | 93 | LOAD | 185 | B9 | POS |
148 | 94 | SAVE | 186 | BA | SQR |
149 | 95 | VERIFY | 187 | BB | RND |
150 | 96 | DEF | 188 | BC | LOG |
151 | 97 | POKE | 189 | BD | EXP |
152 | 98 | PRINT# | 190 | BE | COS |
153 | 99 | PRINT | 191 | BF | SIN |
154 | 9A | CONT | 192 | C0 | TAN |
155 | 9B | LIST | 193 | C1 | ATN |
156 | 9C | CLR | 194 | C2 | PEEK |
157 | 9D | CMD | 195 | C3 | LEN |
158 | 9E | SYS | 196 | C4 | STR$ |
159 | 9F | OPEN | 197 | C5 | VAL |
160 | A0 | CLOSE | 198 | C6 | ASC |
161 | A1 | GET | 199 | C7 | CHR$ |
162 | A2 | NEW | 200 | C8 | LEFT$ |
163 | A3 | TAB( | 201 | C9 | RIGHT$ |
164 | A4 | TO | 202 | CA | MID$ |
165 | A5 | FN | 203 | CB | GO |
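A quick way to see a token in memory is to have a program PEEK its own first statement. The one-line sketch below assumes it is the first line of the program in memory and that locations 43 and 44 point to the start of BASIC. The first statement byte sits four bytes past that address, after the two-byte link and the two-byte line number.

```basic
10 PRINT PEEK(PEEK(43)+256*PEEK(44)+4)
```

Since the line begins with PRINT, running it should display 153 ($99), the PRINT token, rather than the PETSCII code for the letter P.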
An Example BASIC Program Structure
As an example we'll look at how the following short BASIC program is stored in memory.
10 REM A COMMENT
20 N=4*3:PRINT N
On an unexpanded Vic, BASIC starts at 4097 ($1001), and this table shows how the program is stored in memory. Two-byte values are stored in LSB MSB order. The hex column lists them in that order, with the 16-bit value in MSB LSB order in parentheses.
Address | Contents (decimal) | Contents (hex) | Meaning | Type |
---|---|---|---|---|
4097 | 4113 | 1110 ($1011) | Next Line Link | |
4099 | 0010 | 0A00 ($000A) | Line Number | |
4101 | 143 | 8F | REM | Token |
4102 | 32 | 20 | <space> | PETSCII |
4103 | 65 | 41 | A | PETSCII |
4104 | 32 | 20 | <space> | PETSCII |
4105 | 67 | 43 | C | PETSCII |
4106 | 79 | 4F | O | PETSCII |
4107 | 77 | 4D | M | PETSCII |
4108 | 77 | 4D | M | PETSCII |
4109 | 69 | 45 | E | PETSCII |
4110 | 78 | 4E | N | PETSCII |
4111 | 84 | 54 | T | PETSCII |
4112 | 0 | 00 | End of Line | |
4113 | 4127 | 1F10 ($101F) | Next Line Link | |
4115 | 0020 | 1400 ($0014) | Line Number | |
4117 | 78 | 4E | N | PETSCII |
4118 | 178 | B2 | = | Token |
4119 | 52 | 34 | 4 | PETSCII |
4120 | 172 | AC | * | Token |
4121 | 51 | 33 | 3 | PETSCII |
4122 | 58 | 3A | : | PETSCII |
4123 | 153 | 99 | PRINT | Token |
4124 | 32 | 20 | <space> | PETSCII |
4125 | 78 | 4E | N | PETSCII |
4126 | 0 | 00 | End of Line | |
4127 | 0000 | 0000 ($0000) | End of Program | |
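The byte order is easy to check with a little arithmetic. As a sketch, the lines below rebuild the first next-line link from the two bytes shown in the table (17 and 16, that is $11 and $10) and then split the result back into its low and high bytes. The variable name A is only for illustration.

```basic
10 A=17+256*16:REM LSB + 256*MSB
20 PRINT A
30 PRINT A AND 255;INT(A/256)
```

This should print 4113, followed by 17 and 16 on the next line.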
Video
The following video displays a BASIC program's structure and encoding in memory using the program below and VICMON.
Program to Display the Structure of a BASIC Program
The following program will go through each of its BASIC lines and show the structure and encoding.
100 REM PRINT BASIC STRUCTURE
110 P=PEEK(43)+PEEK(44)*256:REM START OF BASIC
120 GOSUB 5000
130 LL=PEEK(P)+256*PEEK(P+1)
140 LN=PEEK(P+2)+256*PEEK(P+3)
150 GOSUB 1000
160 P=LL
170 IF LL <> 0 THEN GOTO 130
180 END
1000 REM PRINT LINE STRUCTURE
1010 IF LL=0 THEN PRINT P;LL;TAB(12);"*EOP":RETURN
1020 PRINT:GOSUB 4000:PRINT
1030 PRINT P;LL;TAB(12);"*LINK"
1040 PRINT P+2;LN;TAB(12);"*LINE NUM"
1050 P=P+4
1060 B=PEEK(P)
1070 PRINT P;B;TAB(12);
1080 GOSUB 3000:PRINT
1090 P=P+1
1100 GOSUB 2000
1110 IF B <> 0 GOTO 1060
1120 PRINT
1130 RETURN
2000 REM DELAY
2010 FOR I=1 TO 150
2020 NEXT I
2030 RETURN
3000 REM PRINT TOKEN OR CHAR
3010 IF PL = 0 AND B=0 THEN PRINT "*EOL";:RETURN
3020 IF PL = 0 AND B=32 THEN PRINT "[SPACE]";:RETURN
3030 IF B >= 128 AND B <= 203 THEN PRINT T$(B-128);:RETURN
3040 PRINT CHR$(B);
3050 RETURN
4000 REM PRINT BASIC LINE
4010 PRINT LN;
4020 BL=P+4
4030 B=PEEK(BL)
4040 IF B = 0 THEN 4080
4050 PL=1:GOSUB 3000:PL=0
4060 BL=BL+1
4070 GOTO 4030
4080 PRINT
4090 RETURN
5000 REM LOAD TOKENS
5010 DIM T$(76)
5020 FOR I=0TO75
5030 READ T$(I)
5040 NEXT I
5050 RETURN
6000 REM TOKENS
6010 DATA "END","FOR","NEXT","DATA","INPUT#","INPUT"
6020 DATA "DIM","READ","LET","GOTO","RUN","IF","RESTORE"
6030 DATA "GOSUB","RETURN","REM","STOP","ON","WAIT","LOAD"
6040 DATA "SAVE","VERIFY","DEF","POKE","PRINT#","PRINT"
6050 DATA "CONT","LIST","CLR","CMD","SYS","OPEN","CLOSE"
6060 DATA "GET","NEW","TAB(","TO","FN","SPC(","THEN","NOT"
6070 DATA "STEP","+","-","*","/","^","AND","OR",">","="
6080 DATA "<","SGN","INT","ABS","USR","FRE","POS","SQR"
6090 DATA "RND","LOG","EXP","COS","SIN","TAN","ATN"
6100 DATA "PEEK","LEN","STR$","VAL","ASC","CHR$","LEFT$"
6110 DATA "RIGHT$","MID$","GO"