
= Working with numbers =

If you work with numbers, you'll be pleased to know that newLISP includes most of the basic functions that you'd expect to find, plus many more. This section is designed to help you make the best use of them and to avoid some of the minor pitfalls that you might encounter. As always, see the official documentation for full details.

== Integers and floating-point numbers ==
newLISP handles two different types of number: the integer and the floating-point number. Integers are exact, whereas floating-point numbers (floats) are approximate. There are advantages and disadvantages to each. If you need integers larger than 9 223 372 036 854 775 807, see the Bigger numbers section, which covers the differences between large (64-bit) integers and big integers (of unlimited size).

The arithmetic operators +, -, *, /, and % always return integer values. A common mistake is to forget this and use / and * without realising that they're carrying out integer arithmetic:
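A minimal illustration of the pitfall, using values chosen here for the example:

```newlisp
(/ 10 3)     ;-> 3   (not 3.333...)
(* 2.5 4)    ;-> 8   (2.5 is truncated to 2 before multiplying)
```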

This might not be what you were expecting!

Floating-point numbers keep only the 15 or 16 most significant digits (i.e. the digits at the left of the number, with the highest place values).

The philosophy of a floating-point number is "that's close enough", rather than "that's the exact value".

Suppose you try to define a symbol PI to store the value of pi to 50 decimal places:
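A sketch of such a definition, typed at the interactive prompt (the symbol name PI is simply the one used in the text):

```newlisp
(set 'PI 3.14159265358979323846264338327950288419716939937510)
;-> 3.141592654
```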

It looks like newLISP has cut about 40 digits off the right hand side! In fact about 15 or 16 digits have been stored, and 35 of the less important digits have been discarded.

How does newLISP store this number? Let's look using the format function:
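One way to do this is to ask format for 50 decimal places. The result shown is what a typical Linux or macOS build prints; other platforms may pad with zeroes sooner:

```newlisp
(set 'PI 3.14159265358979323846264338327950288419716939937510)
(format "%1.50f" PI)
;-> "3.14159265358979311599796346854418516159057617187500"
```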

Now let's make a little script to compare both numbers as strings, so we don't have to spot the differences by eye:
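A minimal sketch of such a script. The symbol names precise, stored, and i are invented for this example; it walks along both strings until the characters stop matching:

```newlisp
(set 'precise "3.14159265358979323846264338327950288419716939937510")
(set 'stored (format "%1.50f" (float precise)))
(set 'i 0)
; step forward while the characters at position i agree
(while (and (< i (length precise))
            (= (nth i precise) (nth i stored)))
  (++ i))
(println "same up to:        " (slice precise 0 i))
(println "precise continues: " (slice precise i))
(println "stored continues:  " (slice stored i))
```

The shared part should end in the digits 9793.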

Notice how the value is accurate up to 9793, but then drifts away from the more precise string you originally supplied. The numbers after 9793 are typical of the way all computers store floating-point values - it isn't newLISP being creative with your data!

The largest float you can use seems to be - on my machine, at least - about 10^308. Only the first 15 or so digits are stored, though, so the rest is mostly zeroes, and you can't really add 1 to it.
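A small sketch of that last point, using 1e308 as a stand-in for "a very large float":

```newlisp
(set 'huge 1e308)
(= huge (add huge 1))   ;-> true: adding 1 makes no difference
```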

Another example of the motto of a floating-point number: that's close enough!

The above comments are true for most computer languages, by the way, not just newLISP. Floating-point numbers are a compromise between convenience, speed, and accuracy.

== Integer and floating-point maths ==
When you're working with floating-point numbers, use the floating-point arithmetic operators add, sub, mul, div, and mod, rather than +, -, *, /, and %, their integer-only equivalents:
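For example, with pi defined here to 14 decimal places purely for illustration:

```newlisp
(set 'pi 3.14159265358979)
(mul 2 pi)       ;-> 6.283185307
(div 10 4)       ;-> 2.5
(add 0.5 0.25)   ;-> 0.75
```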

and, to see the value that newLISP is storing (because the interpreter's default output resolution is 9 or 10 digits):
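A sketch using format, again assuming the illustrative 14-decimal value of pi:

```newlisp
(set 'pi 3.14159265358979)
(format "%1.14f" (mul 2 pi))   ;-> "6.28318530717958"
```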

If you forget to use mul here, and use * instead, the numbers after the decimal point are thrown away:
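With the same illustrative value of pi:

```newlisp
(set 'pi 3.14159265358979)
(* 2 pi)   ;-> 6
```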

Here, pi was converted to 3 and then multiplied by 2.

You can re-define the familiar arithmetic operators so that they default to using floating-point routines rather than integer-only arithmetic:
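One way to do this follows the pattern shown in the newLISP manual, using constant to rebind the protected operator symbols:

```newlisp
(constant '+ add)
(constant '- sub)
(constant '* mul)
(constant '/ div)

(/ 10 4)   ;-> 2.5 once the redefinitions are in effect
```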

You could put these definitions in your init.lsp file to have them available for all newLISP work on your machine. The main problem you'll find is when sharing code with others, or using imported libraries. Their code might produce surprising results, or yours might!

== Conversions: explicit and implicit ==
To convert strings into numbers, or numbers of one type into another, use the int and float functions.

The main use for these is to convert a string into a number - either an integer or a float. For example, you might be using a regular expression to extract a string of digits from a longer string:
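A short sketch. The sample string and the pattern are made up for this example; find with a regex option leaves the matched text in $0:

```newlisp
(int "42")        ;-> 42
(float "3.14")    ;-> 3.14
(int 3.99)        ;-> 3   (a float is truncated, not rounded)

; pull a run of digits out of a longer string, then convert it
(set 'text "Room 101 is on floor 1")
(find "[0-9]+" text 0)   ; regex search; the match is stored in $0
(int $0)                 ;-> 101
```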

A second argument passed to int specifies a default value which should be used if the conversion fails:
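For instance, with an input string chosen here because it cannot be parsed as a number:

```newlisp
(int "a123")     ;-> nil
(int "a123" 0)   ;-> 0
```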

int is a clever function that can also convert strings representing numbers in number bases other than 10 into numbers. For example, to convert a hexadecimal number in string form to a decimal number, make sure it is prefixed with 0x, and don't use letters beyond f:

And you can convert strings containing octal numbers by prefixing them with just a 0:

Binary numbers can be converted by prefixing them with 0b:
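The three prefixes described above can be sketched together (the input values are arbitrary):

```newlisp
(int "0xff")     ;-> 255   hexadecimal
(int "0377")     ;-> 255   octal
(int "0b1010")   ;-> 10    binary
```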

Even if you never use octal or hexadecimal, it's worth knowing about these conversions, because one day you might, either deliberately or accidentally, write this:
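For instance, with a zero-padded string such as a day or month number (an example chosen here):

```newlisp
(int "08")
```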

which evaluates to 0 rather than 8 - a failed octal-decimal conversion rather than the decimal 8 that you might have expected! For this reason, it's always a good idea to specify not only a default value but also a number base whenever you use int on string input:
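A sketch of the safer call, with 0 as the default value and 10 as the number base:

```newlisp
(int "08" 0 10)   ;-> 8
```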

If you're working with big integers (integers larger than 64-bit integers), use bigint rather than int. See Bigger numbers.

== Invisible conversion and rounding ==
Some functions convert floating-point numbers to integers automatically. Since newLISP version 10.2.0 all operators made of letters of the alphabet produce floats and operators written with special characters produce integers.

So using ++ will convert your numbers to integers (truncating any fractional part), and using inc will convert your numbers to floats:
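A small sketch with throwaway symbols x and y:

```newlisp
(set 'x 3.8)
(++ x)          ;-> 4     (3.8 is truncated to 3, then incremented)
(integer? x)    ;-> true

(set 'y 3)
(inc y)         ;-> 4
(float? y)      ;-> true  (inc stores a float)
```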

To make inc and dec work on lists you need to access specific elements or use map to process all:
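Both approaches, sketched with a made-up list:

```newlisp
(set 'numbers '(1 2 3))
(inc (numbers 0))     ;-> 2, and numbers is now (2 2 3)
(map inc '(1 2 3))    ;-> (2 3 4), a new list of floats
```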

Many newLISP functions automatically convert integer arguments into floating-point values. This usually isn't a problem. But it's possible to lose some precision if you pass extremely large integers to functions that convert to floating-point:

Because the add function converted the very large integer to a float, a small amount of precision was lost (amounting to about 52, in this case). Close enough? If not, think carefully about how you store and manipulate numbers.

== Number testing ==
Sometimes you will want to test whether a number is an integer or a float:
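The two predicates for this are integer? and float?:

```newlisp
(integer? 42)     ;-> true
(float? 42)       ;-> nil
(float? 42.0)     ;-> true
(integer? 42.0)   ;-> nil
```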

With integer? and float?, you're testing whether the number is stored as an integer or float, not whether the number is mathematically an integer or a floating-point value. For example, this test returns nil, which might surprise you:
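A test of this kind, with operands chosen so that the division comes out exactly:

```newlisp
(integer? (div 30 3))   ;-> nil
```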

It's not that the answer isn't 10 (it is), but rather that the answer is a floating-point 10, not an integer 10, because the div function always returns a floating-point value.

== Absolute signs, from floor to ceiling ==
It's worth knowing that the floor and ceil functions return floating-point numbers that contain integer values. For example, if you use floor to round pi down to the nearest integer, the result is 3, but it's stored as a float not as an integer:
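A sketch, with pi defined here only for the example:

```newlisp
(set 'pi 3.14159265358979)
(floor pi)              ;-> 3
(ceil pi)               ;-> 4
(integer? (floor pi))   ;-> nil
(int (floor pi))        ;-> 3, now stored as an integer
```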

The abs and sgn functions can also be used when testing and converting numbers. abs returns the absolute (non-negative) value of its argument, and sgn returns 1, 0, or -1, depending on whether the argument is positive, zero, or negative.
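For example:

```newlisp
(abs -3.5)   ;-> 3.5
(sgn -42)    ;-> -1
(sgn 0)      ;-> 0
(sgn 2.5)    ;-> 1
```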

The round function rounds numbers to the nearest whole number, with floats remaining floats. You can also supply an optional second value to round the number to a specific digit position: a negative value rounds to that many places after the decimal point, and a positive value rounds to that many places before it.
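A loop along the following lines prints one rounded value for each digit position from -6 to 6. The test value 1234.6789 and the format string are assumptions chosen to match the table of results:

```newlisp
(set 'n 1234.6789)
(for (i -6 6)
  (println (format "%4d %12.5f" i (round n i))))
```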

  -6   1234.67890
  -5   1234.67890
  -4   1234.67890
  -3   1234.67900
  -2   1234.68000
  -1   1234.70000
   0   1235.00000
   1   1230.00000
   2   1200.00000
   3   1000.00000
   4      0.00000
   5      0.00000
   6      0.00000

sgn has an alternative syntax that lets you evaluate up to three different expressions depending on whether the first argument is negative, zero, or positive.
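A sketch of this form, with the three strings standing in for the three expressions:

```newlisp
(for (i -5 5)
  (println i " is " (sgn i "below 0" "0" "above 0")))
```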

-5 is below 0
-4 is below 0
-3 is below 0
-2 is below 0
-1 is below 0
0 is 0
1 is above 0
2 is above 0
3 is above 0
4 is above 0
5 is above 0

== Number formatting ==
To convert numbers into strings, use the string and format functions:
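A short example using string, with pi defined here for illustration:

```newlisp
(set 'pi 3.141592653589793)
(string pi)            ;-> "3.141592654"
(string 42)            ;-> "42"
(string "pi = " pi)    ;-> "pi = 3.141592654"
```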

Both string and println use only the first 10 or so digits, even though more (up to 15 or 16) are stored internally.

Use format to output numbers with more control:

The format specification string uses the widely-adopted printf-style formatting. Remember too that you can use the results of the format function:

The format function lets you output numbers as hexadecimal strings as well:
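The format calls described above, including hexadecimal output, might look like this (the values are arbitrary):

```newlisp
(set 'pi 3.141592653589793)
(format "%1.15f" pi)       ;-> "3.141592653589793"
(format "%8.3f" pi)        ;-> "   3.142"
(format "%d items" 12)     ;-> "12 items"

; format returns a string, so it can be passed on to other functions
(println "pi is roughly " (format "%1.2f" pi))

(format "%x" 255)          ;-> "ff"
(format "%04X" 255)        ;-> "00FF"
```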

== Creating numbers ==
There are some useful functions that make creating numbers easy.

=== Sequences and series ===
sequence produces a list of numbers in an arithmetical sequence. Supply start and finish numbers (inclusive), and a step value:

If you specify a step value, all the numbers are stored as floats, even if the results are integers, otherwise they're integers:
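A few calls showing both behaviours:

```newlisp
(sequence 1 10)        ;-> (1 2 3 4 5 6 7 8 9 10)
(sequence 1 10 3)      ;-> (1 4 7 10)
(sequence 0 1 0.25)    ;-> (0 0.25 0.5 0.75 1)

(map integer? (sequence 1 4))     ;-> (true true true true)
(map float? (sequence 1 10 3))    ;-> (true true true true)
```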

series multiplies its first argument by its second argument a number of times. The number of repeats is specified by the third argument. This produces geometric sequences:
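For example:

```newlisp
(series 2 2 5)     ;-> (2 4 8 16 32)
(series 1 0.5 4)   ;-> (1 0.5 0.25 0.125)
```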

Every number is stored as a float.

The second argument of series can also be a function. The function is applied to the first number, then to the result, then to that result, and so on.
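A sketch with an anonymous squaring function, chosen here only as an illustration:

```newlisp
; start at 2, then keep squaring the previous result
(series 2 (fn (x) (mul x x)) 4)   ;-> (2 4 16 256)
```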

The normal function returns a list of floating-point numbers with a specified mean and a standard deviation. For example, a list of 6 numbers with a mean of 10 and a standard deviation of 5 can be produced as follows:
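The call takes the mean, the standard deviation, and the number of values wanted. No output is shown because the values are random:

```newlisp
(normal 10 5 6)   ; a list of 6 floats, mean 10, standard deviation 5
(normal 10 5)     ; without a count, a single float
```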

=== Random numbers ===
rand creates a list of randomly chosen integers less than a number you supply:

Obviously (rand 1) generates a list of zeroes and isn't useful. (rand 0) doesn't do anything useful either, but it's been assigned the job of initializing the random number generator.

If you leave out the second number, it just generates a single random number in the range.

random generates a list of floating-point numbers. The first argument is an offset that is added to each number, and the second is a scale factor that each random value is multiplied by:
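The rand and random calls described above, sketched with arbitrary ranges (outputs omitted, since they are random):

```newlisp
(rand 7 20)       ; a list of 20 integers, each from 0 to 6
(rand 7)          ; a single integer from 0 to 6
(random 0 2 5)    ; a list of 5 floats between 0 and 2
(random 10 5)     ; a single float between 10 and 15
```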

=== Randomness ===
Use seed to control the randomness of rand (integers), random (floats), randomize (shuffled lists), and amb (list elements chosen at random).

If you don't use seed, the same set of random numbers appears each time. This provides you with a predictable randomness - useful for debugging. When you want to simulate the randomness of the real world, seed the random number generator with a different value each time you run the script:

Without seed:
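A way to see this for yourself (the range and count are arbitrary):

```newlisp
; run this in a fresh newLISP session, quit, and run it again:
(rand 100 5)   ; the same five numbers appear in both sessions
```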

With seed:
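A sketch that seeds from the clock, followed by a check that a fixed seed reproduces its sequence (42 is an arbitrary seed value):

```newlisp
(seed (time-of-day))   ; a different seed on each run
(rand 100 5)           ; so a different list on each run

; the same seed always gives the same sequence
(seed 42) (set 'a (rand 100 5))
(seed 42) (set 'b (rand 100 5))
(= a b)                ;-> true
```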

== General number tools ==
min and max work as you would expect though they always return floats. Like many of the arithmetic operators, you can supply more than one value:
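For example:

```newlisp
(max 1 2 13.2 4 5)             ;-> 13.2
(min -1 2 17 4 2 1 43 -0.5)    ;-> -1
```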

The comparison functions allow you to supply just a single argument. If you use them with numbers, newLISP helpfully assumes that you're comparing with 0. Remember that you're using prefix notation:
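So (> 5) reads as "is 5 greater than 0?":

```newlisp
(> 5)     ;-> true
(< 5)     ;-> nil
(< -5)    ;-> true
(>= 0)    ;-> true
```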

The factor function finds the factors for an integer and returns them in a list. It's a useful way of testing a number to see if it's prime:
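A minimal sketch: prime? is a helper written for this example, not a built-in. It relies on factor returning a single-element list for a prime:

```newlisp
(factor 360)    ;-> (2 2 2 3 3 5)

(define (prime? n)
  (and (> n 1) (= 1 (length (factor n)))))

(for (i 1 30)
  (if (prime? i) (print i " ")))
```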

2 3 5 7 11 13 17 19 23 29

Or you could use it to test if a number is even:
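One possible version, for integers of 2 or more. The helper name has-factor-2? is invented here, and newLISP also has a built-in even? predicate:

```newlisp
(define (has-factor-2? n)
  (true? (find 2 (factor n))))

(has-factor-2? 12)   ;-> true
(has-factor-2? 15)   ;-> nil
```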

gcd finds the largest integer that exactly divides two or more numbers:
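For example:

```newlisp
(gcd 12 36)      ;-> 12
(gcd 15 36 6)    ;-> 3
(gcd 7 13)       ;-> 1
```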

== Floating-point utilities ==
If omitted, the second argument to the pow function defaults to 2.

You can also use sqrt to find square roots. To find cube and other roots, use pow:
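A few examples, with results as the interpreter displays them:

```newlisp
(pow 5)               ;-> 25    (exponent defaults to 2)
(pow 2 10)            ;-> 1024
(sqrt 16)             ;-> 4
(pow 27 (div 1 3))    ;-> 3     (cube root)
(pow 16 0.25)         ;-> 2     (fourth root)
```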

The exp function calculates e^x, where e is the mathematical constant 2.718281828, and x is the argument:

The log function has two forms. If you omit the base, natural logarithms are used:

Or you can specify another base, such as 2 or 10:
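The exp and log forms described above, with results as displayed by the interpreter:

```newlisp
(exp 1)          ;-> 2.718281828
(log (exp 1))    ;-> 1     natural logarithm
(log 1024 2)     ;-> 10    base 2
(log 1000 10)    ;-> 3     base 10
```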

Other mathematical functions available by default in newLISP are fft (fast Fourier transform), and ifft (inverse fast Fourier transform).

== Trigonometry ==
All newLISP's trigonometry functions, sin, cos, tan, asin, acos, atan, atan2, and the hyperbolic functions sinh, cosh, and tanh, work in radians. If you prefer to work in degrees, you can define alternative versions as functions:
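A sketch of some degree-based wrappers. All the function names here (deg->rad, rad->deg, sind, cosd, asind) are invented for the example:

```newlisp
(set 'pi (mul 4 (atan 1)))

(define (deg->rad d) (mul d (div pi 180)))
(define (rad->deg r) (mul r (div 180 pi)))

(define (sind a)  (sin (deg->rad a)))
(define (cosd a)  (cos (deg->rad a)))
(define (asind x) (rad->deg (asin x)))

(sind 30)     ;-> 0.5
(asind 0.5)   ;-> 30
```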

and so on.

When writing equations, one approach is to build them up from the end first. For example, to convert an equation like this:

$$ \alpha = \arctan \left\{ \frac{\sin \lambda \cos \epsilon - \tan \beta \sin \epsilon }{\cos \lambda} \right\} $$

build it up in stages, like this:
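A sketch of the stages, ending with the complete expression. The symbols lam, eps, and bet stand for lambda, epsilon, and beta (lambda itself is a reserved word in newLISP, and beta is a built-in function), and the angle values are made up:

```newlisp
; stage 1: (atan ...)
; stage 2: (atan (div ... (cos lam)))
; stage 3: fill in the numerator
(set 'lam 0.5 'eps 0.4 'bet 0.1)   ; illustrative angles in radians

(set 'alpha
  (atan (div (sub (mul (sin lam) (cos eps))
                  (mul (tan bet) (sin eps)))
             (cos lam))))
```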

and so on...

It's often useful to line up the various expressions in your text editor:

If you have to convert a lot of mathematical expressions from infix to prefix notation, you might want to investigate the infix.lsp module (available from the newLISP website):

== Arrays ==
newLISP provides multidimensional arrays. Arrays are very similar to lists, and you can use most of the functions that operate on lists on arrays too.

A large array can be faster than a list of similar size. The following code uses the time function to compare how fast arrays and lists work.
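A simplified sketch of such a comparison. The sizes and the repeat count of 100 are arbitrary choices, and the timings will differ from machine to machine:

```newlisp
(dolist (size '(200 400 600 800 1000))
  (set 'lst (sequence 0 (- size 1)))
  (set 'arr (array size lst))
  ; time 100 passes over every index of each structure
  (set 'array-time (time (dotimes (i size) (nth i arr)) 100))
  (set 'list-time  (time (dotimes (i size) (nth i lst)) 100))
  (println "with " size " elements: array access: " array-time
           "; list access: " list-time))
```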

with 200 elements: array access: 1; list access: 1 1
with 201 elements: array access: 1; list access: 1 1
with 202 elements: array access: 1; list access: 1 1
with 203 elements: array access: 1; list access: 1 1
...
with 997 elements: array access: 7; list access: 16 2.285714286
with 998 elements: array access: 7; list access: 17 2.428571429
with 999 elements: array access: 7; list access: 17 2.428571429
with 1000 elements: array access: 7; list access: 17 2.428571429

The exact times will vary from machine to machine, but typically, with 200 elements, arrays and lists are comparable in speed. As the sizes of the list and array increase, the execution time of the nth accessor function increases. By the time the list and array contain 1000 elements each, the array is 2 to 3 times faster to access than the list.

To create an array, use the array function. You can make a new empty array, make a new one and fill it with default values, or make a new array that's an exact copy of an existing list.

To make a new list that's a copy of an existing array, use the array-list function:

To tell the difference between lists and arrays, you can use the list? and array? tests:
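The creation, conversion, and testing steps above can be sketched together (all symbol names are made up for the example):

```newlisp
(array 4)                   ;-> (nil nil nil nil)
(array 4 '(0))              ;-> (0 0 0 0)

(set 'my-list '(10 20 30 40))
(set 'arr (array (length my-list) my-list))   ; an array copy of the list
(set 'lst (array-list arr))                   ; and back to a list

(array? arr)   ;-> true
(list? arr)    ;-> nil
(list? lst)    ;-> true
(array? lst)   ;-> nil
```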

=== Functions available for arrays ===
The following general-purpose functions work equally well on arrays and lists: first, last, rest, mat, nth, setf, sort, append, and slice.

There are also some special functions for arrays and lists that provide matrix operations: invert, det, multiply, transpose. See Matrices.

Arrays can be multi-dimensional. For example, to create a 2 by 2 table, filled with 0s, use this:

The third argument to array supplies some initial values that newLISP will use to fill the array. newLISP uses the value as effectively as it can. So, for example, you can supply a more than sufficient initializing expression:

or just provide a hint or two:

This array initialization facility is cool, so I sometimes use it even when I'm creating lists:
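The initialization behaviour described above, sketched with small arrays:

```newlisp
(array 2 2 '(0))               ;-> ((0 0) (0 0))
(array 2 2 (sequence 0 10))    ;-> ((0 1) (2 3))      surplus values are ignored
(array 2 3 '(1 2))             ;-> ((1 2 1) (2 1 2))  the hint is repeated

; the same trick for building a nested list
(array-list (array 3 3 '(0)))  ;-> ((0 0 0) (0 0 0) (0 0 0))
```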

=== Getting and setting values ===
To get values from an array, use the nth function, which expects a list of indices for the dimensions of the array, followed by the name of the array:
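A sketch that builds a 10 by 10 table holding 0 to 99 and prints it row by row. The names size and table, and the format string, are assumptions:

```newlisp
(set 'size 10)
(set 'table (array size size (sequence 0 99)))

(dotimes (row size)
  (dotimes (column size)
    (print (format "%3d" (nth (list row column) table))))
  (println))
```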

  0  1  2  3  4  5  6  7  8  9
 10 11 12 13 14 15 16 17 18 19
 20 21 22 23 24 25 26 27 28 29
 30 31 32 33 34 35 36 37 38 39
 40 41 42 43 44 45 46 47 48 49
 50 51 52 53 54 55 56 57 58 59
 60 61 62 63 64 65 66 67 68 69
 70 71 72 73 74 75 76 77 78 79
 80 81 82 83 84 85 86 87 88 89
 90 91 92 93 94 95 96 97 98 99

(nth also works with lists and strings.)

As with lists, you can use implicit addressing to get values:
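For example, with a 10 by 10 table of the numbers 0 to 99 (built here so the snippet stands alone):

```newlisp
(set 'table (array 10 10 (sequence 0 99)))
(table 3 4)   ;-> 34
(table 3)     ;-> (30 31 32 33 34 35 36 37 38 39), a whole row
```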

To set values, use setf. The following code replaces every number that isn't prime with 0.
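A sketch, assuming the same 10 by 10 table of 0 to 99. It relies on factor returning a single-element list only for primes:

```newlisp
(set 'size 10)
(set 'table (array size size (sequence 0 99)))

(dotimes (row size)
  (dotimes (column size)
    (if (!= 1 (length (factor (table row column))))
        (setf (table row column) 0))))

(table 1 3)   ;-> 13  (prime, so kept)
(table 1 4)   ;-> 0   (14 was replaced)
```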

Instead of the implicit addressing (table row column), I could have written (setf (nth (list row column) table) 0). Implicit addressing is slightly faster, but using nth can make code easier to read sometimes.

=== Matrices ===
There are functions that treat an array or a list (with the correct structure) as a matrix.

 * invert: returns the inverse of a matrix
 * det: calculates the determinant
 * multiply: multiplies two matrices
 * mat: applies a function to two matrices or to a matrix and a number
 * transpose: returns the transposition of a matrix

transpose is also useful when used on nested lists (see Association lists).
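The matrix functions listed above can be tried on two small nested lists (A and B are example names; results are shown as the interpreter displays them):

```newlisp
(set 'A '((1 2) (3 4)))
(set 'B '((5 6) (7 8)))

(transpose A)     ;-> ((1 3) (2 4))
(multiply A B)    ;-> ((19 22) (43 50))
(det A)           ;-> -2
(invert A)        ; the inverse of A
(mat + A B)       ;-> ((6 8) (10 12))
(mat * A 2)       ;-> ((2 4) (6 8))
```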

== Statistics, financial, and modelling functions ==
newLISP has an extensive set of functions for financial and statistical analysis, and for simulation modelling.

Given a list of numbers, the stats function returns the number of values, the mean, average deviation from mean value, standard deviation (population estimate), variance (population estimate), skew of distribution, and kurtosis of distribution:
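A sketch with a small made-up data set. Only the first two results are annotated, since they are easy to check by hand:

```newlisp
(set 'data '(2 4 4 4 5 5 7 9))
(set 'result (stats data))
(result 0)   ;-> 8   number of values
(result 1)   ;-> 5   mean
```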

Here's a list of other functions built in:

 * beta: calculates the beta function
 * betai: calculates the incomplete beta function
 * binomial: calculates the binomial function
 * corr: calculates the Pearson product-moment correlation coefficient
 * crit-chi2: calculates the Chi square for a given probability
 * crit-f: calculates the critical minimum F for a given confidence probability
 * crit-t: calculates the critical minimum Student's t for a given confidence probability
 * crit-z: calculates the critical normally distributed Z value of a given cumulated probability
 * erf: calculates the error function of a number
 * gammai: calculates the incomplete gamma function
 * gammaln: calculates the log gamma function
 * kmeans-query: calculates the Euclidean distances from the data vector to centroids
 * kmeans-train: performs K-means cluster analysis on matrix data
 * normal: produces a list of normally distributed floating-point numbers
 * prob-chi2: calculates the cumulated probability of a Chi square
 * prob-f: finds the probability of an observed statistic
 * prob-t: finds the probability of a normally distributed value
 * prob-z: calculates the cumulated probability of a Z value
 * stats: finds statistical values of central tendency and distribution moments of values
 * t-test: uses Student's t-test to compare the mean value

=== Bayesian analysis ===
Statistical methods developed initially by Reverend Thomas Bayes in the 18th century have proved versatile and popular enough to enter the programming languages of today. In newLISP, two functions, bayes-train and bayes-query, work together to provide an easy way to calculate Bayesian probabilities for datasets.

Here's how to use the two functions to predict the likelihood that a short piece of text is written by one of two authors.

First, choose texts from the two authors, and generate datasets for each. I've chosen Oscar Wilde and Conan Doyle.

The bayes-train function can now scan these two data sets and store the word frequencies in a new context, which I'm calling Lexicon:

This context now contains a list of words that occur in the lists, and the frequencies of each. For example:

That is, the word "always" appeared 21 times in Conan Doyle's text, and 110 times in Wilde's. Next, the Lexicon context can be saved in a file:

and reloaded whenever necessary with:
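The train, inspect, save, and reload steps can be sketched end to end with toy data. Here two tiny token lists stand in for the two authors' texts, the context is called L rather than Lexicon, and the file name lexicon.lsp is hypothetical:

```newlisp
(bayes-train '(A A B C C) '(A B B C C C) 'L)   ;-> (5 6)

L:A        ;-> (2 1)   A occurs twice in the first list, once in the second
L:total    ;-> (5 6)

(save "lexicon.lsp" 'L)    ; write the context to a file
(load "lexicon.lsp")       ; read it back in a later session
```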

With training completed, you can use the bayes-query function to look up a list of words in a context, and return two numbers, the probabilities of the words belonging to the first or second set of words. Here are three queries. Remember that the first set was Doyle, the second was Wilde:
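With a toy context (trained here on two tiny token lists so the snippet is self-contained), a query looks like this. The probabilities are not shown, because they depend on the training data:

```newlisp
(bayes-train '(A A B C C) '(A B B C C C) 'L)
(bayes-query '(A C) 'L)    ; a list of two probabilities, one per training list
```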

These numbers suggest that quote1 is probably (97% certain) from Conan Doyle, that quote2 is neither Doylean nor Wildean, and that quote3 is likely to be from Oscar Wilde.

Perhaps that was lucky, but it's a good result. The first quote is from Doyle's A Study in Scarlet, and the third is from Wilde's Lord Arthur Savile's Crime, both texts that were not included in the training process but - apparently - typical of the authors' vocabularies. The second quote is from Jane Austen, and the methods developed by the Reverend are unable to assign it to either of the authors.

=== Financial functions ===
newLISP offers the following financial functions:

 * fv: returns the future value of an investment
 * irr: returns the internal rate of return
 * nper: returns the number of periods for an investment
 * npv: returns the net present value of an investment
 * pmt: returns the payment for a loan
 * pv: returns the present value of an investment
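As one illustration, pmt takes the interest rate per period, the number of periods, and the principal. The loan figures below are made up:

```newlisp
; monthly payment on a loan of 100000 at 7% a year over 240 months
(pmt (div 0.07 12) 240 100000)   ;-> about -775.30
```

The result is negative because it represents money paid out.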

== Logic programming ==
The programming language Prolog made popular a type of logic programming called unification. newLISP provides a unify function that can carry out unification, by matching expressions.

When using unify, unbound variables start with an uppercase character to distinguish them from ordinary symbols, which are treated as constants.
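A few small examples. X and Y are variables; apple, pear, f, and g are constants invented for the example:

```newlisp
(unify 'X 'apple)               ;-> ((X apple))
(unify '(X Y) '(apple pear))    ;-> ((X apple) (Y pear))
(unify '(f X) '(g X))           ;-> nil, because f and g do not match
```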

== Bit operators ==
The bit operators treat numbers as if they consist of 1's and 0's. We'll use a utility function that prints out numbers in binary format using the bits function:
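A minimal sketch of such a utility, written here with & and >> so that it also handles negative numbers. The name binary and the 64-digit layout are assumptions; for non-negative numbers the built-in (bits 5) returns "101" directly:

```newlisp
(define (binary n)
  (let ((digits "") (m n))
    (dotimes (i 64)
      ; prepend the lowest bit, then shift right for the next one
      (set 'digits (append (if (= 1 (& m 1)) "1" "0") digits))
      (set 'm (>> m 1)))
    (println (format "%5d %s" n digits))))

(binary 5)
```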

This function prints out both the original number and a binary representation of it:

The shift functions (<< and >>) move the bits to the left or right:
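For example, shown here with plain integer results rather than binary strings:

```newlisp
(<< 5 1)     ;-> 10   binary 101 becomes 1010
(<< 5 2)     ;-> 20
(>> 20 2)    ;-> 5
(>> -8 1)    ;-> -4   the sign is preserved
```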

The following operators compare the bits of two or more numbers. Using 4 and 5 as examples:

(binary (| 4 5)) ; or: 1 if either or both bits are 1
;     5 0000000000000000000000000000000000000000000000000000000000000101
;-> "    5 0000000000000000000000000000000000000000000000000000000000000101"
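The three comparison operators, shown with plain integer results (4 is binary 100, 5 is binary 101):

```newlisp
(& 4 5)   ;-> 4   and: 1 only if both bits are 1
(| 4 5)   ;-> 5   or: 1 if either or both bits are 1
(^ 4 5)   ;-> 1   exclusive or: 1 if exactly one of the bits is 1
```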

The negate or not function (~) reverses all the bits in a number, exchanging 1's and 0's:
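With 64-bit two's complement integers, flipping every bit of n gives -n - 1:

```newlisp
(~ 5)    ;-> -6
(~ 0)    ;-> -1
(~ -1)   ;-> 0
```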

The binary function that prints out these strings uses the & function to test the last bit of the number to see if it's a 1, and the >> function to shift the number 1 bit to the right, ready for the next iteration.

One use for the OR operator (|) is when you want to combine regular expression options with the regex function.

crc32 calculates a 32 bit CRC (Cyclic Redundancy Check) for a string.
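For example, using the lowercase alphabet as the input string:

```newlisp
(crc32 "abcdefghijklmnopqrstuvwxyz")   ;-> 1277644989
```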

== Bigger numbers ==
For most applications, integer calculations in newLISP involve whole numbers up to 9223372036854775807 or down to -9223372036854775808. These are the largest integers you can store using 64 bits. If you add 1 to the largest 64-bit integer, you'll 'roll over' (or wrap round) to the negative end of the range:
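The roll-over looks like this:

```newlisp
(+ 9223372036854775807 1)   ;-> -9223372036854775808
```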

But newLISP can handle much bigger integers than this, the so-called 'bignums' or 'big integers'.
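A sketch of the same addition, with the first operand marked as a big integer:

```newlisp
(+ 9223372036854775807L 1)   ;-> 9223372036854775808L
```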

Notice that newLISP indicates a big integer using a trailing "L". Usually, you can do calculations with big integers without any thought:
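For instance, with two literals that are too large for 64 bits (each is 1 followed by 20 zeroes, chosen for this example):

```newlisp
(+ 100000000000000000000 100000000000000000000)
;-> 200000000000000000000L
```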

Here both operands are big integers, so the answer is automatically big as well.

However, you need to take more care when your calculations combine big integers with other types of number. The rule is that the first argument of a calculation determines whether to use big integers. Compare this loop:
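A loop of the following shape would produce the first set of results, starting from an ordinary 64-bit integer:

```newlisp
(for (i 1 10)
  (println (+ 9223372036854775800 i)))
```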

9223372036854775801
9223372036854775802
9223372036854775803
9223372036854775804
9223372036854775805
9223372036854775806
9223372036854775807
-9223372036854775808
-9223372036854775807
-9223372036854775806
-9223372036854775806

with this:
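The same loop, with an L appended to the first argument of the addition:

```newlisp
(for (i 1 10)
  (println (+ 9223372036854775800L i)))
```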

9223372036854775801L
9223372036854775802L
9223372036854775803L
9223372036854775804L
9223372036854775805L
9223372036854775806L
9223372036854775807L
9223372036854775808L
9223372036854775809L
9223372036854775810L
;-> 9223372036854775810L

In the first example, the first argument of the function was a large (64-bit) integer. So adding 1 to the largest possible 64-bit integer caused a roll-over - the calculation stayed in the large integer realm.

In the second example, the L appended to the first argument of the addition forced newLISP to switch to big integer operations even though both the operands were 64 bit integers. The size of the first argument determines the size of the result.

If you supply a literal big integer, you don't have to append the "L", since it's obvious that the number is a big integer:
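A loop of this shape, starting from a literal far too large for 64 bits, would produce results like those listed next:

```newlisp
(for (i 1 10)
  (println (+ 92233720368547758123421231455634 i)))
```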

92233720368547758123421231455635L
92233720368547758123421231455636L
92233720368547758123421231455637L
92233720368547758123421231455638L
92233720368547758123421231455639L
92233720368547758123421231455640L
92233720368547758123421231455641L
92233720368547758123421231455642L
92233720368547758123421231455643L
92233720368547758123421231455644L
92233720368547758123421231455644L

There are other ways you can control the way newLISP converts between large and big integers. For example, you can convert something to a big integer using the bigint function:
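For example, bigint accepts either a number or a string of digits:

```newlisp
(bigint 12345)                       ;-> 12345L
(bigint "12345678901234567890123")   ;-> 12345678901234567890123L
(+ (bigint 9223372036854775807) 1)   ;-> 9223372036854775808L
```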