F Sharp Programming/Caching

Caching is often useful to re-use data which has already been computed. F# provides a number of built-in techniques to cache data for future use.

Partial Functions
F# automatically caches the value of any function which takes no parameters. When F# comes across a function with no parameters, F# will only evaluate the function once and reuse its value everytime the function is accessed. Compare the following:

Both functions accept and return the same values, but they have very different behavior. Here's a comparison of the output in fsi:

The implementation of  forces F# to re-create the internal set on each call. On the other hand,  is a value initialized to the function , so it creates its internal set once and reuses it for all successive calls.


 * Note: Internally,  is compiled as a static function which constructs a set on every call.   is compiled as a static readonly property, where the value is initialized in a static constructor.

This distinction is often subtle, but it can have a huge impact on an application's performance.

Memoization
"Memoization" is a fancy word meaning that computed values are stored in a lookup table rather than recomputed on every successive call. Long-running pure functions (i.e. functions which have no side-effects) are good candidates for memoization. Consider the recursive definition for computing Fibonacci numbers:

Beyond, the runtime of the function becomes unbearable. Each recursive call to the  function throws away all of its intermediate calculations to   and , giving it a runtime complexity of about O(2^n). What if we kept all of those intermediate calculations around in a lookup table? Here's the memoized version of the  function:

Much better! This version of the  function runs almost instaneously. In fact, since we only calculate the value of any  precisely once, and dictionary lookups are an   operation, this fib function runs in O(n) time.

Notice all of the memoization logic is contained in the  function. We can write a more general function to memoize any function:


 * Note: Its very important to remember that the implementation above is not thread-safe -- the dictionary should be locked before adding/retrieving items if it will be accessed by multiple threads.


 * Additionally, although dictionary lookups occur in constant time, the hash function used by the dictionary can take an arbitrarily long time to execute (this is especially true with strings, where the time it takes to hash a string is proportional to its length). For this reason, it is wholly possible for a memoized function to have less performance than an unmemorized function. Always profile code to determine whether optimization is necessary and whether memoization genuinely improves performance.

Lazy Values
The F#  data type is an interesting primitive which delays evaluation of a value until the value is actually needed. Once computed, lazy values are cached for reuse later:

F# uses some compiler magic to avoid evaluating the expression  on declaration. Lazy values are probably the simplest form of caching, however they can be used to create some interesting and sophisticated data structures. For example, two F# data structures are implemented on top of lazy values, namely the F# Lazy List and  method.

Lazy lists and cached sequences represent arbitrary sequences of potentially infinite numbers of elements. The elements are computed and cached the first time they are accessed, but will not be recomputed when the sequence is enumerated again. Here's a demonstration in fsi: