Compiler Construction/Case Study 1

Case Study 1 - a Simple Interpreter
The purpose of this is to provide a gentle introduction to some compilation techniques, and to introduce a few more computer science concepts.

Interpreters are used rather than compilers since they don't have such a steep learning curve. Later chapters and case studies will delve more into compilers.

The language being interpreted is a very small subset of Basic, even smaller than Tiny Basic.

The purpose of this case study is to use simple interpreters to provide a gentle introduction to some compilation. There are three chunks of code in this chapter:
 * Simple line-mode IDE, which can be used with either interpreter.
 * Interpreter for Very Very Tiny Basic.
 * Interpreter for Very Tiny Basic.

The VVTB interpreter only allows integer variables, and only recognizes five different statements (just enough to write simple programs with, but not a serious programming tool). The VTB interpreter allows both integer and string variables, and recognizes ten different statements. The two interpreters use different methods for dealing with expressions, to provide you with some variety.

Note that the code presented here is intended to be as clear and simple as possible; high performance is not a target. Those readers unfamiliar with C++ may like to consult Appendix A which provides a brief summary on those C++ constructions which are used in the book.

Very Tiny Basic - Specification
A line-mode IDE is very simple: if the line you type starts with a number then that line is a statement to add to your program source code, otherwise it is a command to the system (e.g. RUN or LIST). The line-mode 'editor' is also very simple:
 * Just typing a line number without any code causes any existing line with that number to be deleted.
 * If the line number you use is the same as that of some previously typed line then the earlier line is discarded.
 * Program lines are sorted by line number order, so you can easily insert missing lines if needed, provided you originally used line numbers with gaps (every 10 was common).

For Lexical Analysis

Digit       ::= '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' ; LineNumber  ::= Digit [ Digit ]* ; WholeNumber ::= Digit [ Digit ]* ; Letter      ::= 'A' | 'B' | 'C' | 'D' | 'E' | 'F' | 'G' | 'H' | 'I' | 'J' | 'K' | 'L' | 'M' | 'N' | 'O' | 'P' | 'Q' | 'R' | 'S' | 'T' | 'U' | 'V' | 'W' | 'X' | 'Y' | 'Z' ; LiteralString::= '"' [any character except double-quote]+ '"' ; Comparison  ::= '=' | '<>' | '<' | '<=' | '>' | '>=' ; Separator   ::= ',' | ';' ; AddSub      ::= '+' | '-' ; MulDiv      ::= '*' | '/' ;

For IDE processing

Line        ::= CommandLine | ProgramLine ; CommandLine ::= 'RUN' | 'LIST' | 'NEW' | 'QUIT' ; ProgramLine ::= LineNumber | Statement ;

For VVTB Syntax Analysis

Statement   ::= 'LET' Variable '=' Expression | 'IF' Test 'THEN' LineNumber | 'PRINT' [ [ PrintItem ] Separator ]* | 'REM' [any character]+ | 'GOTO' LineNumber ; Variable    ::= Letter ; PrintItem   ::= Expression | LiteralString ; Test        ::= Expression Comparison Expression ; Expression  ::= [AddSub] Secondary [ AddSub Secondary ]* ; Secondary   ::= Primary [ MulDiv Primary ]* ; Primary     ::= '(' Expression ')' | Variable | WholeNumber ;

For VTB Syntax Analysis

Statement   ::= 'LET' Variable '=' Expression | 'IF' Test 'THEN' LineNumber | 'PRINT' [ [ PrintItem ] Separator ]* | 'REM' [any character]+ | 'GOTO' LineNumber | 'GOSUB' LineNumber 'RETURN' | 'STOP' | 'FOR' Variable '=' Expression 'TO' Expression ['STEP' Expression] | 'NEXT' Variable ; Variable    ::= Letter | Letter '$' ; PrintItem   ::= Expression ; Test        ::= Expression Comparison Expression ; Expression  ::= [AddSub] Secondary [ AddSub Secondary ]* ; Secondary   ::= Primary [ MulDiv Primary ]* ; Primary     ::= '(' Expression ')' | Variable | WholeNumber | LiteralString ;

IDE commands

Line number
 * Line numbers must be non-zero.

NEW
 * Start a new program, delete any existing source code.

QUIT
 * Terminate the IDE.

LIST
 * List the current source code.

RUN
 * Obey the current source code.

Sample Programs
Programs in Very Very Tiny Basic are very simple. Programs in Very Tiny Basic are still simple, but they can do more complex things. Neither language allows any user input.

Sample VVTB program

REM This is a sample VVTB program. 10 LET a = 10 20 PRINT "The value of a is ", a 30 LET b = 20 40 IF b < 40 THEN 10 Sample VTB program

REM This is a sample VTB program. 10 FOR a$ = 1 TO 50 STEP 5 20 FOR b = 1 TO a$ 30 GOSUB 200 50 NEXT b 60 NEXT a$ 70 FOR b = 1 TO 100 80 GOSUB 200 90 IF b > 25 THEN 500 100 NEXT b REM A sub-routine will return to where it is called from. 200 PRINT "The value of a$ is ", a 210 PRINT "The value of b is ", b 220 RETURN 500 PRINT "Exiting the program." 510 GOTO 1000 1000 STOP

Line Mode IDE - Implementation
The code which follows may seem quite short, but it is indeed a complete line-mode IDE. The reason it is so short is the use of the C++ , which is exactly what is needed for storing source code, sorting lines into line number order, etc.

//-- standard C++ headers -- using namespace std; //-- functions: IDE // Main loop of line-mode IDE. void LineModeIDE(void); // Obey command. void ObeyCommand(const string Command,     // Command to obey.                  map & Source,  // Program source code.                  bool & Continue);          // Set false to terminate. // Obtain next non-blank line. string NextNonBlankLine(void); // Split line into number and text. void SplitLine(const string & Line, // Line to be split.                int & Number,         // 0 or number at start of line.                string & Text);       /* Remainder of line, if any, converted to upper case. */ //-- functions: interpreter // Interpret program. void Interpret(map & Source); // Program source code.
 * 1) include
 * 2) include
 * 3) include
 * 4) include
 * 5) include

//-- main program -- int main(int argc, char* argv[]) {    LineModeIDE; return 0; }

//-- function definitions -- // Main loop of line-mode IDE. void LineModeIDE(void) {    map Source;   // Program source code. int CurNumb;             // 0 or current line number. string CurText;          // Rest of text on line. bool Continuing = true;  // true=keep going. while ( Continuing ) { SplitLine(NextNonBlankLine,CurNumb,CurText); if ( CurNumb == 0 ) { ObeyCommand(CurText,Source,Continuing); } else if ( CurText == "" ) { Source.erase(CurNumb); } else { Source[CurNumb] = CurText; }    } }  // Obey command. void ObeyCommand(const string & Command,   // Command to obey.                  map & Source,  // Program source code.                  bool & Continue)           // Set false to terminate. {     if ( Command == "QUIT" ) { Continue = false; } else if ( Command == "NEW" ) { Source.clear; } else if ( Command == "RUN" ) { Interpret(Source); } else if ( Command == "LIST" ) { map:: iterator Ndx; for ( Ndx = Source.begin;               Ndx != Source.end;                ++Ndx               ) { cout << Ndx->first << ' ' << Ndx->second << endl; }     } else { cout << "*** unrecognised command: " << Command << endl; } }  // Obtain next non-blank line. string NextNonBlankLine(void) {    string Line; int First,Last; do { cout << ':'; getline(cin,Line); First = 0; Last = Line.length - 1; while ( First<=Last && Line[First]==' ' ) { ++First; }        if ( First > Last ) { Line = ""; }    } while ( Line == "" ); return Line; } // Split line into number and text. void SplitLine(const string & Line, // Non-blank line to be split.               int & Number,        // 0 or number at start of line.                string & Text)       /* Remainder of line, if any, converted to upper case. */ {    int Ndx = 0; int Value = 0; int Last = Line.length - 1; string Result; while ( Line[Ndx] == ' ' ) {   // Skip leading spaces. ++Ndx; }    while ( isdigit(Line[Ndx]) ) {  // Extract any number. Value = Value*10 + (int(Line[Ndx]) - int('0')); ++Ndx; }    Number = Value; while ( Line[Ndx] == ' ' ) {   // Skip more spaces. ++Ndx; }    while ( Line[Last] == ' ' ) {   // Ignore trailing spaces. --Last; }    while( Ndx <= Last ) {          // Copy non-blank portion, Result += toupper(Line[Ndx]); // as upper case. ++Ndx; }    Text = Result; } // Interpret program. void Interpret(map & Source) // Program source code. {    cout << "Sorry, interpreter not yet available." << endl; }