C++Memo v1.0-RC

(c) 2014 Giacomo Drago < giacomo@giacomodrago.com >

About

^ top

C++Memo is a generic framework for memoization written in the C++ programming language. It aims to be a tool for rapid prototyping of software components implementing dynamic programming algorithms or, anyhow, requiring memoization.
It is backed by Fast Concurrent Memoization Map (fcmm).

Advantages:
  • Allows simple and elegant descriptions of complex problems
  • Not limited to tabular data
  • No recursion, no stack overflows
  • Automatic parallelization (i.e. without prior knowledge of the structure of the problem)
  • Thread-safe
Disadvantages:
  • Suboptimal performance
  • Need to provide two hash functions
  • Imperfect parallelization because no knowledge of the problem structure is exploited

Get the source

^ top

You can browse, clone and fork the source code from here: https://github.com/giacomodrago/cppmemo.
A C++11-compliant compiler is needed to build the sources.

C++Memo is released under the BSD license.

Usage

^ top

In order to use C++Memo, the class CppMemo has to be instantiated. The defaults for the constructor parameters are always fine, but providing estimatedNumEntries can significantly improve the performance. Example:

cppmemo::CppMemo<int, int> cppMemo(4, 100000); // C++Memo will use 4 threads and expect to memoize 100,000 entries

Then, you must define a callable (a function, a functor, or a lambda), with a prototype like the following:

Value compute(const Key& key, CppMemo<...>::PrerequisitesProvider)

The PrerequisitesProvider object provides access to the prerequisites of the given key. For example, to calculate the i-th Fibonacci number:

int fibonacci(int i, CppMemo<int, int>::PrerequisitesProvider prereqs) {
    if (i == 0) return 0;
    if (i <= 2) return 1;
    return prereqs(i-1) + prereqs(i-2);
}

The fibonacci function must never be called directly. It shall be passed to getValue() like in the following example:

cppmemo::CppMemo<int, int> cppMemo;
std::cout << cppMemo.getValue(30, fibonacci) << std::endl; // will print 832040

In the example above, we have used a getValue() overload that omits the DeclarePrerequisites parameter. In order to provide recursion-free execution and automatic parallelism (when applicable), C++Memo ensures that the prerequisites of a given key are already calculated before the Compute function (fibonacci in this example) is invoked for that key. If you don’t provide the DeclarePrerequisites callable, C++Memo will first “dry-run” the Compute function, logging each call to PrerequisitesProvider::operator()(const Key&), which in this case may return a default-constructed Value, and then call the Compute callable normally as soon as all prerequisites are available.

If your Compute callable uses different prerequisites depending on the value of other prerequisites, use the appropriate getValue() overload that accepts a DeclarePrerequisites callable, like in the following example:

void fibonacciPrereqs(int i, CppMemo<int, int>::PrerequisitesGatherer declare) {
    if (i > 2) {
        declare(i-1);
        declare(i-2);
    }
}
std::cout << cppMemo.getValue(30, fibonacci, fibonacciPrereqs) << std::endl; // will print 832040

Capturing prerequisites by dry-running Compute is also incorrect when methods are invoked on the prerequisites, that throw exceptions or cause undefined behaviour if the Value is default-constructed.

The general prototype for the DeclarePrerequisites callable is:

void (const Key& key, CppMemo<...>::PrerequisitesGatherer declare)

Warning: calling a getValue() overload that omits the DeclarePrerequisites parameter is not always correct.
Please refer to this documentation to know when it can be done. If you have any doubts, please pass a DeclarePrerequisites callable to an appropriate overload of getValue(). It is a safe choice and has no drawbacks, except the need to write more lines of code.

You can even omit the Compute parameter of getValue() if you know that the value corresponding to the key has already been memoized by this CppMemo instance. Example:

std::cout << cppMemo.getValue(30, fibonacci) << std::endl; // will print 832040
std::cout << cppMemo.getValue(30) << std::endl; // will print 832040
std::cout << cppMemo(30) << std::endl; // shorthand for getValue(30)

If you omit the Compute parameter and the value for the key is not memoized, an exception is thrown.

What to do if the program “runs forever” or crashes

A cycle in the key dependency graph (i.e. a key requires itself in order to be calculated) will cause the program to “run forever” or crash. To debug this kind of situation, you can enable circular dependency detection. C++Memo will throw a CircularDependencyException when it detects a circular dependency. Circular dependency detection negatively impacts performance: use it only for debugging purposes.

A reason for the program taking a long time to terminate is a poor choice of the two hash functions, which should minimize collisions and be completely independent of one another.

Examples

^ top

The source distribution includes the Fibonacci example described in the previous paragraph, two classic dynamic programming algorithms (knapsack 0/1 and matrix chain multiplication), and a test case for the circular dependency detection. If you have coded something that uses C++Memo and can be proposed as an example, please contact me.

Performance

^ top

I have not performed benchmarks comparing the performance of a C++Memo-based implementation against a tailored implementation, but I expect the results to be like a 10x slowdown with C++Memo, especially when the problem can be solved with a simple 2D matrix.

I benchmarked the knapsack 0/1 and matrix chain multiplication examples with different parameters. Here are the results, showing that automatic parallelization is quite effective, at least in these two cases. If you made other benchmarks, please contact me.

Contact

^ top

For bug reports, comments, and suggestions, feel free to contact me at giacomo@giacomodrago.com