Hacking away: Lines, printing, and variables
Learning Objectives
- You know how to create a new Dart project and how to run the project.
- You know how to create tests for a Dart project and know how to run the tests.
- You know how to create a naive BASIC interpreter that can handle print statements and variables.
Let’s start creating a simple BASIC interpreter. The interpreter will be able to handle print statements and variables. For simplicity, we make the following assumptions about the code given as input to the interpreter:
- Each line starts with a number.
- Each line has a statement.
- All lines are written in uppercase.
- Lines are separated by a newline character.
- Spaces are used to separate parts of a line.
To start off, let’s create a new Dart project called basic_interpreter.
In this part, the interpreter is built in incremental steps. Follow the materials in order, building the interpreter locally on your computer as you go. At the end of each chapter, you are expected to submit a zip file containing the progress you have made on the interpreter (and the tests) up to that point.
Creating and testing a project
A new project is created with the dart create command, which takes the name of the project as an argument. Project names are typically written in lowercase, with words separated by underscores.
On the command line, type the following command:
dart create basic_interpreter
The above command creates a folder called basic_interpreter with the following structure:
tree --dirsfirst
.
├── bin
│ └── basic_interpreter.dart
├── lib
│ └── basic_interpreter.dart
├── test
│ └── basic_interpreter_test.dart
├── analysis_options.yaml
├── CHANGELOG.md
├── pubspec.lock
├── pubspec.yaml
└── README.md
The project comes with a bin folder, which contains the main Dart file, a lib folder for the library code, and a test folder for tests. The pubspec.yaml file contains the project configuration, and the README.md file contains the project documentation.
To run the program, type dart run on the command line in the folder where the pubspec.yaml file is located. The output is as follows.
Building package executable...
Built basic_interpreter:basic_interpreter.
Hello world: 42!
Similarly, to run the tests that come with the project, type dart test on the command line. The output is as follows.
Building package executable...
Built test:test.
00:00 +1: All tests passed!
Interface and first test
The interface for the interpreter will be a class called Interpreter that has a function called interpret, which takes a string as input and returns a list of strings. The input is the BASIC code, while the output is the output of the program, one string per printed line.
Modify the lib/basic_interpreter.dart file as follows:
class Interpreter {
  List<String> interpret(String code) {
    return ['Hello BASIC!'];
  }
}
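One thing to watch for: the generated bin/basic_interpreter.dart most likely still refers to the template code that was just removed from lib/basic_interpreter.dart, so dart run may stop working after this change, even though dart test does not need the bin file. The following is a minimal sketch of an updated entry point, assuming it simply feeds a hard-coded program to the interpreter. The class is repeated only to keep the sketch self-contained; in the project it would be imported from the package instead.

```dart
// Sketch of bin/basic_interpreter.dart. In the real project, replace the
// class below with:
// import 'package:basic_interpreter/basic_interpreter.dart';

class Interpreter {
  List<String> interpret(String code) {
    return ['Hello BASIC!'];
  }
}

void main(List<String> arguments) {
  final interpreter = Interpreter();

  // Hypothetical hard-coded program; reading the code from a file or from
  // standard input is left out of this sketch.
  final output = interpreter.interpret('10 PRINT "HELLO, WORLD!"');

  for (final line in output) {
    print(line);
  }
}
```

With the placeholder implementation, this prints Hello BASIC! regardless of the program given.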
The above provides the starting point for the interpreter. As you notice, it does not yet do anything useful. Next, create a test for the interpreter. Modify the test/basic_interpreter_test.dart file as follows:
import 'package:basic_interpreter/basic_interpreter.dart';
import 'package:test/test.dart';

void main() {
  late Interpreter interpreter;

  setUp(() {
    interpreter = Interpreter();
  });

  test('10 PRINT "HELLO, WORLD!"', () {
    expect(interpreter.interpret('10 PRINT "HELLO, WORLD!"'),
        ["HELLO, WORLD!"]);
  });
}
The late keyword in the test file is used to declare a variable that is initialized later. The setUp function is called before each test, so every test gets a fresh interpreter, and the test function is used to define a test case.
Each test case has a description and a function that runs the test. The function contains the code that is being tested and an expectation (here, a call to the expect function) that checks whether the output of the code matches the expected output. The expect function takes the actual output and the expected output as arguments.
If the expectation fails, i.e. the actual output does not match the expected output, the test fails.
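A detail worth knowing here: in Dart, the == operator on two lists checks whether they are the same object, not whether they have the same contents, whereas expect compares lists element by element. The sketch below illustrates the difference using only the core library; sameElements is a hypothetical helper written for this illustration and is not part of the test package.

```dart
// Hypothetical helper that compares two lists element by element, roughly
// what the comparison inside expect amounts to for flat lists of strings.
bool sameElements(List<String> a, List<String> b) {
  if (a.length != b.length) return false;
  for (var i = 0; i < a.length; i++) {
    if (a[i] != b[i]) return false;
  }
  return true;
}

void main() {
  final actual = ['HELLO, WORLD!'];
  final expected = ['HELLO, WORLD!'];

  // Two distinct list objects are not equal with ==.
  print(actual == expected); // false

  // Comparing the contents gives the answer a test is interested in.
  print(sameElements(actual, expected)); // true
}
```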
When we run the test with dart test, we see that the test fails. The test expects that the interpreter will return ["HELLO, WORLD!"], but the interpreter returns ["Hello BASIC!"].
// ...
00:00 +0 -1: test/basic_interpreter_test.dart: 10 PRINT "HELLO, WORLD!" [E]
Expected: ['HELLO, WORLD!']
Actual: ['Hello BASIC!']
// ...
00:00 +0 -1: Some tests failed.
Interpreting a print statement
Print statements in BASIC consist of a line number, the PRINT keyword, and a message to be printed. The message is enclosed in double quotes. For example, the following code prints “HELLO, WORLD!”.
10 PRINT "HELLO, WORLD!"
Interpreting the PRINT statement is the first step in building the BASIC interpreter. The interpret function should parse the input code and return the output of the program. First, let’s add a function to the Interpreter class that parses the lines of the program, storing them in a map where the key is the line number and the value is the statement as a string.
Modify the lib/basic_interpreter.dart file as follows:
class Interpreter {
  // Line number -> statement, e.g. 10 -> 'PRINT "HELLO, WORLD!"'.
  Map<int, String> programLines = {};

  void parseLines(String code) {
    List<String> lines = code.split('\n');
    for (var line in lines) {
      // Everything before the first space is the line number,
      // everything after it is the statement.
      int firstSpace = line.indexOf(' ');
      int lineNumber = int.parse(line.substring(0, firstSpace));
      String statement = line.substring(firstSpace + 1);
      programLines[lineNumber] = statement;
    }
  }

  List<String> interpret(String code) {
    return ['Hello BASIC!'];
  }
}
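To see what the parsing step produces, here is a minimal sketch that uses a standalone variant of parseLines. Returning the map instead of storing it in a field is a choice made for this illustration only. The program is deliberately given with its lines out of order.

```dart
// Standalone variant of parseLines for illustration: same logic as in the
// class, but the map is returned instead of being stored in a field.
Map<int, String> parseLines(String code) {
  final programLines = <int, String>{};
  for (final line in code.split('\n')) {
    final firstSpace = line.indexOf(' ');
    final lineNumber = int.parse(line.substring(0, firstSpace));
    programLines[lineNumber] = line.substring(firstSpace + 1);
  }
  return programLines;
}

void main() {
  final lines = parseLines('20 PRINT "B"\n10 PRINT "A"');

  // The map keeps insertion order, so the keys come out as written.
  print(lines); // {20: PRINT "B", 10: PRINT "A"}

  // Sorting the keys gives the order in which the program should run.
  print(lines.keys.toList()..sort()); // [10, 20]
}
```

Note that this logic relies on the assumptions listed at the start of the chapter. Input that breaks them, such as an empty line or a line without a space, makes substring throw an error.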
Then, to interpret the code, the interpret function first calls the parseLines function. After that, it sorts the line numbers and iterates over them, checking whether each statement is a PRINT statement.
List<String> interpret(String code) {
  parseLines(code);
  List<int> lineNumbers = programLines.keys.toList()..sort();
  List<String> outputLines = [];

  for (var lineNumber in lineNumbers) {
    String statement = programLines[lineNumber]!;
    if (statement.startsWith("PRINT")) {
      // do something here
    }
  }

  return outputLines;
}
If a print statement is found, we extract the message from the statement and add it to the list of output lines. We earlier learned about regular expressions, so let’s use them here — we want to extract the content that follows PRINT and that is between the double quotes. Then, once we have extracted the content, we add it to the output.
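As a small warm-up, the extraction step can be tried in isolation. The sketch below assumes a single quoted message; extractMessage is a hypothetical helper name used only here.

```dart
// Hypothetical helper: returns the text between the double quotes of a
// PRINT statement, or null if no quoted text is found.
String? extractMessage(String statement) {
  // Drop the keyword and the space after it ("PRINT " is six characters).
  final argument = statement.substring(6);
  final match = RegExp(r'"(.*)"').firstMatch(argument);
  // group(0) would be the whole match including the quotes;
  // group(1) is only the part captured by the parentheses.
  return match?.group(1);
}

void main() {
  print(extractMessage('PRINT "HELLO, WORLD!"')); // HELLO, WORLD!
  print(extractMessage('PRINT HELLO')); // null
}
```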
Modify the interpret function as follows:
List<String> interpret(String code) {
  parseLines(code);
  List<int> lineNumbers = programLines.keys.toList()..sort();
  List<String> outputLines = [];

  for (var lineNumber in lineNumbers) {
    String statement = programLines[lineNumber]!;
    if (statement.startsWith("PRINT")) {
      // Drop the "PRINT " prefix (the keyword plus one space).
      statement = statement.substring(6);
      // Capture the content between the double quotes.
      final match = RegExp(r'"(.*)"').firstMatch(statement)!;
      outputLines.add(match.group(1)!);
    }
  }

  return outputLines;
}
Now, when we run the command dart test, we see that the test passes.
Building package executable...
Built test:test.
00:00 +1: All tests passed!
Then, let’s add another test that checks if the interpreter can handle multiple print statements. Add the following test to test/basic_interpreter_test.dart; the test goes after the first test.
test('Two print statements', () {
  expect(
      interpreter
          .interpret('10 PRINT "HELLO, WORLD!"\n20 PRINT "HELLO, BASIC!"'),
      ["HELLO, WORLD!", "HELLO, BASIC!"]);
});
When we run the tests, we see that all tests pass.
...
00:00 +2: All tests passed!
Print statement with multiple arguments
Let’s next add the functionality to handle print statements with multiple arguments. For example, the following code should print two messages on the same line:
10 PRINT "HELLO, WORLD!", "HELLO, BASIC!"
Add the following test to test/basic_interpreter_test.dart. We assume that the arguments are separated by tabs in the output.
test('10 PRINT "HELLO, WORLD!", "HELLO, BASIC!"', () {
  expect(
      interpreter.interpret('10 PRINT "HELLO, WORLD!", "HELLO, BASIC!"'),
      ["HELLO, WORLD!\tHELLO, BASIC!"]);
});
When we run the tests, we see that the test fails.
dart test
...
00:00 +2 -1: Some tests failed.
The key part to fix in the interpreter is how the message is extracted from the statement. The current implementation only extracts one message, but we need to extract multiple messages.
if (statement.startsWith("PRINT")) {
  statement = statement.substring(6);
  final match = RegExp(r'"(.*)"').firstMatch(statement)!;
  outputLines.add(match.group(1)!);
}
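In addition, the regular expression "(.*)" is greedy. The .* matches as many characters as possible, which means that the match extends from the first double quote all the way to the last double quote in the statement, even if there are other double quotes in between. With two arguments, the captured group would therefore contain both messages and the comma between them.
To make the regular expression non-greedy, we add a question mark after the asterisk: "(.*?)". The non-greedy version stops at the first closing double quote it can find.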
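The difference is easiest to see by running both versions on the same input. The following is a minimal sketch; firstQuoted is a hypothetical helper introduced only for this comparison.

```dart
// Hypothetical helper: returns the first captured group of the first match,
// or null if the pattern does not match at all.
String? firstQuoted(RegExp pattern, String input) =>
    pattern.firstMatch(input)?.group(1);

void main() {
  const arguments = '"HELLO, WORLD!", "HELLO, BASIC!"';

  // Greedy: runs from the first double quote to the last one.
  print(firstQuoted(RegExp(r'"(.*)"'), arguments));
  // HELLO, WORLD!", "HELLO, BASIC!

  // Non-greedy: stops at the first closing double quote.
  print(firstQuoted(RegExp(r'"(.*?)"'), arguments));
  // HELLO, WORLD!
}
```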
To handle multiple arguments, we use a non-greedy version of the regular expression as shown above, and use the allMatches method to identify all arguments to the print statement.
if (statement.startsWith("PRINT")) {
  statement = statement.substring(6);
  // allMatches never returns null, so no null assertion is needed here.
  final matches = RegExp(r'"(.*?)"').allMatches(statement);
}
Then, to transform the matches into the output, we can use the map method of the iterable to pick the captured group from each match. For each match, group(0) contains the entire match, including the double quotes, while group(1) contains the first group, that is, the content between the double quotes.
if (statement.startsWith("PRINT")) {
  statement = statement.substring(6);
  final matches = RegExp(r'"(.*?)"').allMatches(statement);
  final outputs = matches.map((match) => match.group(1));
}
And, finally, we can use the join method of the iterable to transform the messages into a single string by joining them with tabs.
if (statement.startsWith("PRINT")) {
  statement = statement.substring(6);
  final matches = RegExp(r'"(.*?)"').allMatches(statement);
  final outputs = matches.map((match) => match.group(1));
  outputLines.add(outputs.join("\t"));
}
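The three steps (allMatches, map, and join) can also be tried outside the interpreter. The sketch below wraps them in a hypothetical formatArguments function that takes the part of the statement after the PRINT keyword.

```dart
// Hypothetical helper: collects every quoted argument and joins the
// contents with tabs.
String formatArguments(String arguments) {
  final matches = RegExp(r'"(.*?)"').allMatches(arguments);
  // group(1) is the content between the quotes of each match.
  return matches.map((match) => match.group(1)).join('\t');
}

void main() {
  final line = formatArguments('"HELLO, WORLD!", "HELLO, BASIC!"');
  print(line == 'HELLO, WORLD!\tHELLO, BASIC!'); // true

  // A single argument yields a single message without any tabs.
  print(formatArguments('"ONLY ONE"')); // ONLY ONE
}
```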
Now, when we run the tests, we see that all tests pass again.
...
00:00 +3: All tests passed!
Grouping tests together
As we are building specific functionality, it is a good idea to group tests that test similar functionality together. The function group that is imported from the test package is used for this. The function takes a description of the group and a function that contains the tests.
The following shows the current contents of the test/basic_interpreter_test.dart file, with the tests grouped together using the group function.
import 'package:basic_interpreter/basic_interpreter.dart';
import 'package:test/test.dart';

void main() {
  late Interpreter interpreter;

  setUp(() {
    interpreter = Interpreter();
  });

  group("Printing", () {
    test('10 PRINT "HELLO, WORLD!"', () {
      expect(
          interpreter.interpret('10 PRINT "HELLO, WORLD!"'), ["HELLO, WORLD!"]);
    });

    test('Two print statements', () {
      expect(
          interpreter
              .interpret('10 PRINT "HELLO, WORLD!"\n20 PRINT "HELLO, BASIC!"'),
          ["HELLO, WORLD!", "HELLO, BASIC!"]);
    });

    test('10 PRINT "HELLO, WORLD!", "HELLO, BASIC!"', () {
      expect(interpreter.interpret('10 PRINT "HELLO, WORLD!", "HELLO, BASIC!"'),
          ["HELLO, WORLD!\tHELLO, BASIC!"]);
    });
  });
}
When we run the tests, we see that all tests still pass. Time to move forward.
Variables
Let’s next add variables to the interpreter, along with the possibility to print their values. Variables in BASIC are introduced using the LET statement. For example, the following code assigns the value 10 to the variable X and then prints the value of X, in this case 10.
10 LET X = 10
20 PRINT X
Let’s again start by writing a test for the interpreter. Create a group called Variables and add the first test to the group.
group("Variables", () {
  test('10 LET A = 5\n20 PRINT A', () {
    expect(interpreter.interpret('10 LET A = 5\n20 PRINT A'), ["5"]);
  });
});
When we run the tests, we see that the test fails.
...
00:00 +3 -1: Some tests failed.
Adding variables to the interpreter
To be able to handle variables, we need to store their values in the interpreter. Add a map called variables to the Interpreter class to store the variables. The map has a string key for the variable name and a num value for the variable value.
The num type is used to represent both integers and floating-point numbers in Dart.
class Interpreter {
  Map<int, String> programLines = {};
  Map<String, num> variables = {};
  // ...
}
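The choice of num matters for the tests: the expected output for LET A = 5 is "5", not "5.0". A minimal sketch of how num.parse behaves for the two kinds of values used in this chapter, assuming the code runs on the Dart VM:

```dart
void main() {
  // num.parse yields an int when the text is a whole number
  // and a double otherwise.
  final whole = num.parse('5');
  final decimal = num.parse('5.2');

  print(whole is int); // true
  print(decimal is double); // true

  // Converting back to text keeps the original form.
  print('$whole $decimal'); // 5 5.2
}
```

Parsing the same text with double.parse would turn 5 into 5.0 on the Dart VM, which would not match the expected output of the test.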
Then, we need to modify the interpret function to handle the LET statement. The LET statement assigns a value to a variable. We need to parse the statement to extract the variable name and the value, and then add the variable to the variables map.
Regular expressions are again useful for the task. We can use a regular expression to match the variable name and the value in the LET statement. The expression used below captures a single-character variable name and a non-negative number with an optional decimal part.
if (statement.startsWith("LET")) {
  statement = statement.substring(4);
  final regex = RegExp(r'(\w) = (\d+(\.\d+)?)');
  final match = regex.firstMatch(statement)!;
}
Then, we can extract the variable name and the value from the match, parse the value into a number, and add the variable to the variables map.
if (statement.startsWith("LET")) {
  statement = statement.substring(4);
  final regex = RegExp(r'(\w) = (\d+(\.\d+)?)');
  final match = regex.firstMatch(statement)!;
  String variable = match.group(1)!;
  num value = num.parse(match.group(2)!);
  variables[variable] = value;
}
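To get a feel for what this regular expression accepts, the following sketch runs it against a few assignments. parseAssignment is a hypothetical helper for this illustration; unlike the code above, it returns null instead of throwing when nothing matches.

```dart
final letPattern = RegExp(r'(\w) = (\d+(\.\d+)?)');

// Hypothetical helper: returns the variable name and value of an
// assignment such as 'X = 10', or null if the pattern does not match.
MapEntry<String, num>? parseAssignment(String assignment) {
  final match = letPattern.firstMatch(assignment);
  if (match == null) return null;
  return MapEntry(match.group(1)!, num.parse(match.group(2)!));
}

void main() {
  final variables = <String, num>{};
  for (final assignment in ['X = 10', 'Y = 5.2', 'AB = 7', 'Z = -3']) {
    final entry = parseAssignment(assignment);
    if (entry != null) variables[entry.key] = entry.value;
  }

  // 'AB = 7' is stored under B because (\w) captures a single character
  // and the pattern is not anchored to the start of the text.
  // 'Z = -3' does not match at all, as the pattern has no minus sign.
  print(variables); // {X: 10, Y: 5.2, B: 7}
}
```

These limitations are acceptable for the naive interpreter, but they show why the version in the chapter, which uses the ! operator on the match, would throw on an assignment such as LET Z = -3.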
Printing the value of a variable
Now that our interpreter can store variable values in the variables map, we can modify the interpret function to handle variables in the print statement. When the interpreter encounters a print statement, it should check whether each argument is a variable (i.e., not in double quotes) or a string literal (i.e., in double quotes).
The current functionality for handling the print statement is as follows.
if (statement.startsWith("PRINT")) {
  statement = statement.substring(6);
  final matches = RegExp(r'"(.*?)"').allMatches(statement);
  final outputs = matches.map((match) => match.group(1));
  outputLines.add(outputs.join("\t"));
}
To be able to handle both strings and variables, we need to modify the regular expression so that it also extracts arguments that are not within double quotes. We can add an alternation to the regular expression: it either matches content in double quotes, or a word, that is, a run of word characters that contains no commas or spaces.
if (statement.startsWith("PRINT")) {
  statement = statement.substring(6);
  final matches = RegExp(r'"(.*?)"|(\w+)').allMatches(statement);
}
Now, for each match, group 0 has the full match, group 1 has the content in double quotes, and group 2 has the content that is not in double quotes. Only one of groups 1 and 2 is set for a given match; the other one is null. We can use the content in double quotes as a string literal and the content that is not in double quotes as a variable name.
When iterating over the matches, we can check whether group 1 is null. If it is not, the argument is a string literal, and we can add it to the output. If it is null, the argument is a variable, and we can look up its value in the variables map.
if (statement.startsWith("PRINT")) {
  statement = statement.substring(6);
  final matches = RegExp(r'"(.*?)"|(\w+)').allMatches(statement);
  // Quoted text is used as is; anything else is looked up as a variable.
  final outputs = matches
      .map((match) => match.group(1) ?? variables[match.group(2)!]);
  outputLines.add(outputs.join("\t"));
}
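The lookup logic can be checked in isolation with a minimal sketch. Here, formatArguments is a hypothetical function that takes the arguments of a print statement and a map of variables; it is not part of the interpreter class.

```dart
// Hypothetical helper: formats the arguments of a PRINT statement, using
// quoted text as is and looking up everything else in the variables map.
String formatArguments(String arguments, Map<String, num> variables) {
  final matches = RegExp(r'"(.*?)"|(\w+)').allMatches(arguments);
  return matches
      .map((match) => match.group(1) ?? variables[match.group(2)!])
      .join('\t');
}

void main() {
  final variables = <String, num>{'A': 42};

  final line = formatArguments('"A IS", A', variables);
  print(line == 'A IS\t42'); // true

  // A variable that has not been assigned is missing from the map, so the
  // lookup yields null, which join turns into the text "null".
  print(formatArguments('B', variables)); // null
}
```

The second call shows a gap in the naive approach: printing an unassigned variable produces the text null rather than an error message.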
Now, when we run the tests, we see that all tests pass.
...
00:00 +4: All tests passed!
Let’s add another test to check that decimal values work as expected.
test('10 LET A = 5.2\n20 PRINT A', () {
expect(interpreter.interpret('10 LET A = 5.2\n20 PRINT A'), ["5.2"]);
});
Running the tests, all tests continue to pass.
...
00:00 +5: All tests passed!
Finally, let’s add a test that checks that the interpreter can print both string literals and variables in the same print statement.
test('10 LET A = 42\n20 PRINT "A IS", A', () {
expect(interpreter.interpret('10 LET A = 42\n20 PRINT "A IS", A'),
["A IS\t42"]);
});
All tests pass again.
...
00:00 +6: All tests passed!
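For reference, the pieces built in this chapter can be assembled roughly as follows. This is a sketch rather than the definitive file: the chapter does not show where the LET branch goes inside the loop, so placing it in an else if after the PRINT branch is an assumption, and the main function is added only to make the sketch runnable on its own.

```dart
class Interpreter {
  Map<int, String> programLines = {};
  Map<String, num> variables = {};

  void parseLines(String code) {
    for (final line in code.split('\n')) {
      final firstSpace = line.indexOf(' ');
      final lineNumber = int.parse(line.substring(0, firstSpace));
      programLines[lineNumber] = line.substring(firstSpace + 1);
    }
  }

  List<String> interpret(String code) {
    parseLines(code);
    final lineNumbers = programLines.keys.toList()..sort();
    final outputLines = <String>[];

    for (final lineNumber in lineNumbers) {
      var statement = programLines[lineNumber]!;

      if (statement.startsWith('PRINT')) {
        // Quoted arguments are printed as is, others are variable lookups.
        statement = statement.substring(6);
        final matches = RegExp(r'"(.*?)"|(\w+)').allMatches(statement);
        final outputs = matches
            .map((match) => match.group(1) ?? variables[match.group(2)!]);
        outputLines.add(outputs.join('\t'));
      } else if (statement.startsWith('LET')) {
        // Assumed placement of the LET branch; see the note above.
        statement = statement.substring(4);
        final match = RegExp(r'(\w) = (\d+(\.\d+)?)').firstMatch(statement)!;
        variables[match.group(1)!] = num.parse(match.group(2)!);
      }
    }

    return outputLines;
  }
}

void main() {
  final output = Interpreter().interpret('10 LET A = 42\n20 PRINT "A IS", A');
  print(output.length); // 1
  print(output.first == 'A IS\t42'); // true
}
```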
Yay! We have a naive BASIC interpreter that can handle print statements and variables. We are missing quite a bit of functionality, and most importantly, we haven’t really used the correct terminology. Let’s take a step back and look into what programming language interpreters typically contain.