= Normative BASIC =

The BASIC programming language has been standardized, first in the United States of America (USA) by the American National Standards Institute (ANSI), and later in Europe by the European Computer Manufacturers Association (ECMA). ANSI published the American National Standards (ANS) X3.60-1978 for Minimal BASIC and X3.113-1987 for Full BASIC; ECMA published Standard ECMA-55 for Minimal BASIC in 1978 and Standard ECMA-116 for Full BASIC in 1986.

The aim of the standards is to promote the interchangeability of BASIC programs among a variety of systems. Through close co-operation between the two organizations, it was possible to maintain full compatibility between the respective ANSI and ECMA standards.

The standards establish, among other things:


 * the syntax of a program written in BASIC, and
 * the semantic rules for interpreting the meaning of a program written in BASIC.

Nowadays, only the ECMA standards are publicly available.

== Character Set ==
The set of allowable characters is given by:


 * the set of capital letters from A to Z,
 * the set of digits from 0 to 9,
 * the set of symbols !, #, $, %, &, (, ), +, -, *, /, ^, ., ,, ;, :, <, =, >, _, ?, ', ", and
 * the space character

== List of Reserved Keywords ==
Reserved keywords in Minimal BASIC are (26 in total):


 * 1) BASE
 * 2) DATA
 * 3) DEF
 * 4) DIM
 * 5) END
 * 6) FOR
 * 7) GO
 * 8) GOSUB
 * 9) GOTO
 * 10) IF
 * 11) INPUT
 * 12) LET
 * 13) NEXT
 * 14) ON
 * 15) OPTION
 * 16) PRINT
 * 17) RANDOMIZE
 * 18) READ
 * 19) REM
 * 20) RESTORE
 * 21) RETURN
 * 22) STEP
 * 23) STOP
 * 24) SUB
 * 25) THEN
 * 26) TO

Their meaning is explained in the following sections.

== Convention for the Name of Variables ==
Variables are used in BASIC to hold either character strings or numeric values, the latter being either scalars or elements of arrays.

The name of a string variable is composed of a single letter from A to Z followed by a dollar sign ($). So, A$, B$, ..., Z$ are all valid names for string variables, while A# or Z% are not.

The name of a numeric scalar variable is composed of a single letter from A to Z followed by an optional digit. So, A, B, C1, D2, etc., are valid names for scalar variables, while A11, B22, etc., are not.

The name of a numeric array element is composed of a single letter from A to Z followed by one subscript, or by two subscripts separated by a comma, enclosed within parentheses, for a one-dimensional or a two-dimensional array respectively. So, A(1), B(2), C(1,1), D(2,2), etc., are valid names for array elements.

This convention makes the explicit declaration of variables unnecessary in BASIC, since the dollar sign distinguishes a character string from a numeric value, and the presence of subscripts distinguishes an array element from a scalar variable.
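The naming rules above can be sketched in a short program. The variable names, values and line numbers are chosen only for illustration, and the let- and print-statements used here are explained in later sections:

```basic
10 REM ONE VARIABLE OF EACH KIND, DISTINGUISHED BY ITS NAME ALONE
20 LET A$ = "TEXT"
30 LET C1 = 1.5
40 LET B(2) = 3
50 LET D(2,2) = 4
60 PRINT A$, C1, B(2), D(2,2)
70 END
```

No declaration precedes any of the four variables; the form of each name is enough to tell a string, a scalar, and an element of a one-dimensional or two-dimensional array apart.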

== Character Strings and Numeric Constants ==
A character string is any combination of characters from the allowable character set written within double quotation marks. The length of a character string is limited to 18 characters, with the exception of character strings in a print-statement or remark-statement (to be seen later), which can be as long as the line number and the line length limit permit. So, "", " ", "1 2 3 4 5 6 7 8 9", "A B C D E F G H I", "! # $ % & ... ' ", etc., are allowable character strings, while "1 2 3 4 5 6 7 8 9 0", "A B C D E F G H I J", "! # $ % & + - * / ^ . , ; : < = > _ ? ' ", etc., are not, since they exceed the 18-character limit.

Numeric constants denote scalar numeric values, written in decimal positional notation. There are four general syntactic forms of optionally signed numeric constants:


 * implicit point representation (sd...d), as in 1, 2, +1, -2, etc.,
 * explicit point unscaled representation (sd...drd...d), as in 1.0, 2.0, +1.0, -2.0, etc.,
 * explicit point scaled representation (sd...drd...dEsd...d), as in 1.0E1, 2.0E-1, +1.0E+1, -2.0E-2, etc.,
 * implicit point scaled representation (sd...dEsd...d), as in 1E1, 2E-1, +1E+1, -2E-2, etc.,

where:


 * s is an optional sign (+ or -),
 * d is a decimal digit (0 - 9),
 * r is a period (.), and
 * E stands for "times 10 to the power of" the integer exponent that follows.

Numeric constants can be written with any number of digits, although internally they are represented with not less than six significant decimal digits and with magnitudes in a range of at least 1E-38 to 1E+38.

Numeric constants whose magnitude is less than machine infinitesimal are replaced by zero, while constants whose magnitude is larger than machine infinity are replaced by machine infinity with the appropriate sign.
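As a small illustration of the four syntactic forms, the following sketch prints one constant of each kind. The particular values and line numbers are arbitrary, and the print-statement is explained later:

```basic
10 REM ONE NUMERIC CONSTANT IN EACH OF THE FOUR FORMS
20 REM IMPLICIT POINT
30 PRINT 12
40 REM EXPLICIT POINT, UNSCALED
50 PRINT -1.5
60 REM EXPLICIT POINT, SCALED
70 PRINT 1.5E2
80 REM IMPLICIT POINT, SCALED
90 PRINT -2E-2
100 END
```

The exponent after E is always an integer, so a constant such as 1.5E2.0 is not valid.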

== General Program Structure ==
BASIC is a line-oriented language: a BASIC program is a sequence of lines, the last of which is an end-line, and each of which contains a keyword. Moreover, each line begins with a unique line number, which serves as a label for the statement contained in that line.

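So, every BASIC program can be described by the following grammar in Backus-Naur form (BNF), where a solidus (/) separates alternatives, a question mark (?) marks an optional element, and an asterisk (*) marks an element that may be repeated:


 * program = block* end-line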
 * block = line / for-block
 * line = line-number statement
 * line-number = digit digit? digit? digit?
 * end-line = line-number end-statement
 * end-statement = END
 * statement = data-statement / def-statement / dimension-statement / gosub-statement / goto-statement / if-then-statement / input-statement / let-statement / on-goto-statement / option-statement / print-statement / randomize-statement / read-statement / remark-statement / restore-statement / return-statement / stop-statement

The following simple examples are therefore valid BASIC programs:


 * a two-line program (just a remark-statement, which serves to document the program and produces no output, and an end-statement, which terminates the program):
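A minimal sketch of such a program, with line numbers and remark text chosen only for illustration:

```basic
10 REM THIS PROGRAM DOES NOTHING
20 END
```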




 * a three-line program (just a remark-statement, a print-statement, which prints a character string, and an end-statement):
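A possible version, again with illustrative line numbers, remark text and character string:

```basic
10 REM THIS PROGRAM PRINTS A GREETING
20 PRINT "HELLO, WORLD!"
30 END
```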



Program lines are executed in sequential order, starting with the first one, until:


 * some other action is dictated by a control statement, or
 * an exception condition occurs, which results in abnormal termination of the program, or
 * a stop-statement or end-statement is executed.

So, in the first example, the first line contains a remark-statement, a non-control statement which produces no output or internal activity. Execution then passes to the second line, which contains a control statement, the end-statement, which ends the program.

In the second example, an additional line between the remark line and the end-line contains a print-statement, also a non-control statement, which prints a character string.

Line numbers are positive integers of up to four digits, and leading zeroes have no effect. So, 1, 01, 10, 010, etc., are all valid line numbers. Normally, line numbers are chosen as multiples of 5 or 10, e.g., 10, 20, 30, 40, etc., which leaves room to insert an additional line between existing lines.

Additionally, lines can be up to 72 characters long. Allowing 4 characters for the line number and one blank space as a separator between the line number and the keyword leaves 67 characters for the statement in a line.

Spaces may occur anywhere in a BASIC program, except in the places listed below, without affecting the execution of that program, and may be used to improve the readability of the program.

All keywords in a program must be preceded by at least one space and, if not at the end of a line, must also be followed by at least one space.

Spaces shall not appear:


 * 1) at the beginning of a line
 * 2) within line numbers
 * 3) within keywords
 * 4) within numeric constants
 * 5) within function or variable names
 * 6) within two-character relation symbols

== Program Variables ==
Variables in BASIC are associated with either numeric or string values. Numeric variables may be either simple variables or references to elements of one- or two-dimensional arrays, which are then called subscripted or compound variables.

As stated before, simple numeric variables are named by a single capital letter followed by an optional single digit, while subscripted variables are named by a single capital letter followed by one or two numbers, separated in this last case by a comma, enclosed within parentheses.

String variables are named by a single capital letter followed by a dollar sign.

At any instant in the execution of a program, a numeric variable is associated with a single numeric value and a string variable is associated with a single string value, the value associated with the variable possibly being changed by program statements in the course of program execution.

The length of the character string associated with a string variable can vary during execution of the program from 0 characters (the empty string) to 18 characters.

Simple numeric variables and string variables are declared implicitly through their appearance in the program, and no type definitions are necessary because of the naming convention. It is nevertheless good programming practice to set them to meaningful values at the beginning of the program, before their use in any statement.

A subscripted variable, on the other hand, refers to the element of a one- or two-dimensional array selected by the value or values of its subscripts, which are integer values.

Unless explicitly declared in a dimension statement (to be seen later), subscripted variables are implicitly declared by their first appearance in a program, in which case the range of each subscript is to be understood from zero to ten, both inclusive, unless the presence of an option-statement indicates that the range is defined from one to ten, both also inclusive.
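The implicit declaration of an array can be sketched as follows, assuming no option-statement is present, so that the subscript runs from 0 to 10. The array name V and the assigned values are arbitrary:

```basic
10 REM V IS NOT DIMENSIONED, SO ITS SUBSCRIPT RUNS FROM 0 TO 10
20 LET V(0) = 1
30 LET V(10) = 2
40 PRINT V(0), V(10)
50 END
```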

Care must be taken that the same letter is not used as the name of both a simple variable and a subscripted variable, nor as the name of both a one-dimensional and a two-dimensional array.

By contrast, this restriction does not apply between a simple numeric variable and a string variable, whose names may agree except for the dollar sign.

The following simple examples are therefore valid BASIC programs:


 * the previous three-line program, with a somewhat different comment line and a new character string, which indicates the value of pi, being printed:
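One way to write it, with the remark text and line numbers chosen for illustration:

```basic
10 REM PRINT THE VALUE OF PI
20 PRINT "PI = 3.14159265"
30 END
```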




 * a modified four-line program which makes use of a let-statement to assign the numeric constant 3.14159265 to the numeric variable P in the second line, and a print-statement with a string constant and a numeric variable as a comma-separated list of arguments in the third line:
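A sketch matching this description, with illustrative line numbers and remark text:

```basic
10 REM PRINT THE VALUE OF PI HELD IN A NUMERIC VARIABLE
20 LET P = 3.14159265
30 PRINT "PI = ", P
40 END
```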




 * a modified five-line program which still uses a let-statement to assign the numeric constant 3.14159265 to the numeric variable P in the second line, and now also uses a let-statement to assign the string constant "PI = " to the string variable P$ in the third line; the print-statement in the fourth line then takes the string variable and the numeric variable as a comma-separated list of arguments:
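A possible listing, with illustrative line numbers and remark text. Note that P and P$ may share the letter P, as explained above:

```basic
10 REM PRINT THE VALUE OF PI USING TWO VARIABLES
20 LET P = 3.14159265
30 LET P$ = "PI = "
40 PRINT P$, P
50 END
```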



== Statements ==
So far we have seen how to declare and initialize simple numeric variables and string variables in the course of a program by means of the let-statement, and how to print them with the help of the print-statement.

It is often desirable not only to print the value of a variable, whether numeric or string, but also to read a value as input to the program, in order to compute a numerical result or to print a message depending on a condition. For those cases, one needs to make use of expressions, mathematical functions, and control statements, as we shall see in this section.

=== Input/Output, Mathematical Operators, Expressions ===
Expressions are normally classified as numeric expressions or string expressions.

Numeric expressions are constructed from variables, constants, mathematical functions, and the mathematical operations of addition, subtraction, multiplication, division, and involution (exponentiation).

The formation and evaluation of numeric expressions follow the normal algebraic rules, and the circumflex accent, the asterisk, the solidus, the plus sign, and the minus sign symbols are used to represent the operations of involution, multiplication, division, addition, and subtraction, respectively.

Unless parentheses dictate otherwise, exponentiation is performed first, then multiplications and divisions, and finally additions and subtractions; operations of the same precedence associate from left to right. So, A - B - C is interpreted as (A - B) - C, and A / B / C as (A / B) / C, since in each case the operators have the same precedence and are therefore evaluated from left to right. A - B / C, on the other hand, is interpreted as A - (B / C), since division has higher precedence than subtraction and is therefore evaluated first.
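These rules can be checked with a small sketch in which the constants 8, 4 and 2 stand in for A, B and C. The values are arbitrary, and the print-statement is the one introduced earlier:

```basic
10 REM PRECEDENCE AND LEFT-TO-RIGHT ASSOCIATION
20 PRINT 8 - 4 - 2
30 PRINT 8 / 4 / 2
40 PRINT 8 - 4 / 2
50 PRINT (8 - 4) / 2
60 END
```

Following the rules above, the four expressions evaluate to 2, 1, 6 and 2 respectively; the last line shows how parentheses override the default precedence.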

The following examples illustrate in a simple way the concepts seen so far:


 * a program that prints the value of the numeric constant 1.4142 raised to the power 2 (that is, calculates the square of 1.4142), together with some text:
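A minimal sketch, with the accompanying text and line numbers chosen for illustration:

```basic
10 REM SQUARE OF 1.4142 BY EXPONENTIATION
20 PRINT "1.4142 ^ 2 = ", 1.4142 ^ 2
30 END
```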




 * a program that defines a numeric variable S with a value of 1.4142, and prints the value of the numeric variable raised to the power 2 (that is, calculates the square of 1.4142), together with some text:
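The same calculation with the value held in a variable might look like this (text and line numbers are illustrative):

```basic
10 REM SQUARE OF THE VARIABLE S BY EXPONENTIATION
20 LET S = 1.4142
30 PRINT "S ^ 2 = ", S ^ 2
40 END
```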




 * a program that defines a numeric variable S with a value of 1.4142, calculates the product of S by S (that is, calculates the square of 1.4142), assigns this value to a numeric variable S2, and prints the value of both S and S2 together with some text as string constants:
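A sketch along these lines, with illustrative text and line numbers:

```basic
10 REM SQUARE OF S AS A PRODUCT, STORED IN S2
20 LET S = 1.4142
30 LET S2 = S * S
40 PRINT "S = ", S, "S2 = ", S2
50 END
```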




 * a program that defines a numeric variable S with a value of 1.4142 and another one S2 with 2.0000, and prints the value of the operation of dividing S2 by S, together with some text as string constants:
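One possible listing (the printed text and the line numbers are assumptions):

```basic
10 REM QUOTIENT OF S2 AND S
20 LET S = 1.4142
30 LET S2 = 2.0000
40 PRINT "S2 / S = ", S2 / S
50 END
```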




 * a program similar to the previous one, in that it defines a numeric variable S with a value of 1.4142 and another one, S2, with 2.0000, but which prints the result of subtracting from S the quotient of S2 and S, together with some text as string constants (this gives a measure of the accuracy of the approximation, and is the core of a numerical method for calculating the square root of a number that we will see later):
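A sketch of this variant, relying on division being evaluated before subtraction. The text and line numbers are illustrative:

```basic
10 REM DIFFERENCE BETWEEN S AND S2 / S
20 LET S = 1.4142
30 LET S2 = 2.0000
40 PRINT "S - S2 / S = ", S - S2 / S
50 END
```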




 * a slightly different program, which asks the user for a number and calculates its square:
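A minimal sketch using the input-statement. The variable name X, the prompt text and the line numbers are chosen for illustration, and the square is computed here as a product:

```basic
10 REM SQUARE OF A NUMBER ENTERED BY THE USER
20 PRINT "ENTER A NUMBER"
30 INPUT X
40 PRINT "ITS SQUARE IS ", X * X
50 END
```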



=== Mathematical Functions ===
So far we have seen how numeric and string variables are defined, the rules for writing lines, basic input and output, and the rules for simple arithmetic.

But what happens if one needs to calculate the square root of a number? For this purpose, a set of basic mathematical functions is provided by default. These are:


 * the absolute value of a number, ABS(X)
 * the arctangent of a number, ATN(X)
 * the cosine of a number expressed in radians, COS(X)
 * the exponential of a number, EXP(X)
 * the integer part of a number (the largest integer not greater than the number), INT(X)
 * the natural logarithm of a number, LOG(X)
 * the sign of a number, SGN(X)
 * the sine of a number expressed in radians, SIN(X)
 * the square root of a non-negative number, SQR(X)
 * a uniformly distributed pseudo-random number in the range 0 <= RND < 1, RND
 * the tangent of a number expressed in radians, TAN(X)

Let us see some examples:


 * a program that calculates the square root of 2 with the help of the SQR mathematical function provided:
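A minimal sketch, with the accompanying text and line numbers chosen for illustration:

```basic
10 REM SQUARE ROOT OF 2
20 PRINT "SQR(2) = ", SQR(2)
30 END
```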




 * a program that asks the user for a number and calculates its cosine:
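A possible listing, in which the variable name X and the prompt text are assumptions. The number entered is taken as an angle in radians:

```basic
10 REM COSINE OF A NUMBER ENTERED BY THE USER
20 PRINT "ENTER AN ANGLE IN RADIANS"
30 INPUT X
40 PRINT "COS(X) = ", COS(X)
50 END
```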




 * a program that asks the user for a number and calculates its sine:
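A sketch of such a program, with an illustrative prompt and variable name:

```basic
10 REM SINE OF A NUMBER ENTERED BY THE USER
20 PRINT "ENTER AN ANGLE IN RADIANS"
30 INPUT X
40 PRINT "SIN(X) = ", SIN(X)
50 END
```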




 * a program that asks the user for a number and calculates its tangent:
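The tangent can be obtained in the same way; the prompt and the variable name are again illustrative:

```basic
10 REM TANGENT OF A NUMBER ENTERED BY THE USER
20 PRINT "ENTER AN ANGLE IN RADIANS"
30 INPUT X
40 PRINT "TAN(X) = ", TAN(X)
50 END
```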




 * a program that asks the user for a number and calculates its exponential:
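One way to write it, with an assumed prompt and variable name:

```basic
10 REM EXPONENTIAL OF A NUMBER ENTERED BY THE USER
20 PRINT "ENTER A NUMBER"
30 INPUT X
40 PRINT "EXP(X) = ", EXP(X)
50 END
```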




 * a program that asks the user for a number and calculates its natural logarithm:
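A sketch under the assumption that the user enters a positive number, since the logarithm is not defined otherwise. The prompt and the variable name are illustrative:

```basic
10 REM NATURAL LOGARITHM OF A NUMBER ENTERED BY THE USER
20 PRINT "ENTER A POSITIVE NUMBER"
30 INPUT X
40 PRINT "LOG(X) = ", LOG(X)
50 END
```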




 * a program to print a pseudo-random number uniformly distributed in the range 0 <= RND < 1:
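A minimal sketch. The randomize-statement in the second line is an addition made here so that different runs may start from different points of the pseudo-random sequence; it can be left out if a repeatable sequence is wanted:

```basic
10 REM PRINT A PSEUDO-RANDOM NUMBER
20 RANDOMIZE
30 PRINT RND
40 END
```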



== Sample Programs ==
Minimal BASIC sample programs can be found on the corresponding page.