## Improving the efficiency of Python Programs Using Memoization

Some of you may be familiar with the Fibonacci sequence, which is famous in both mathematics and computer programming. The sequence is formed by adding together each of the previous two terms to create the next term in the sequence. It has its origins in a the consideration of breeding rabbits, and turn out to have many truly interesting and surprising properties. this article is not specifically about the Fibonacci sequence, but I will add some links to interesting articles you can check out if you would like to learn more about it.

The Fibonacci Sequence begins as follows:

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, …

The next number is found by adding up the two numbers before it in the sequence.

So, for example

the 2 is calculated by adding the values 1 and 1 proceeding it, the 3 is calculated by adding the values 1 and 3 proceeding it, and the 5 is calculated by adding the values 2 and 3 1 and 3 proceeding it. and so on!

Sometimes the sequence begins with different initial values, including commonly `1, 1`

instead of `0, 1`

, but the pattern of how each term is constructed remains consistent.

## Fibonacci Numbers with Python

The Fibonacci sequence is often used to illustrate the concept of **recursion** in programming, which is a very powerful technique with many applications. The basic idea is that we break a large problem down into smaller problems of the same type and solve those smaller problems as a means to solving the original problem.

In terms of computer code, it usually takes the form of a function that specifies **base case**, contains a call to itself (the function calls itself), and whose return values approach closer to the **base case** with each call.

So for example, the a function to return the `nth`

Fibonacci Number in Python looks like this:

1 2 3 4 5 6 7 8 9 10 |
def fib(n): """ Returns the n-th Fibonacci number. """ if n == 0 or n == 1: return n return fib(n - 1) + fib(n - 2) print(fib(10)) |

## Improving Time Complexity of Python Programs Using Memoization

Now the above code is all well and good, except…. there is a wee problem.

Try this code, which times the execution of `fib(n)`

for `n = 20`

:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
import time def fib(n): """ Returns the n-th Fibonacci number. """ if n == 0 or n == 1: return n return fib(n - 1) + fib(n - 2) n = 20 start = time.perf_counter() fib(n) end = time.perf_counter() print("Seconds taken: {:.7f}".format(end - start)) |

This code times the execution of `fib(n)`

for `n = 20`

. On my machine I got `Seconds taken: 0.0066919`

That seems OK, but what about a higher value for `n`

? Say the `40th`

Fibonacci number?

On my machine I got `Seconds taken: 118.2504081`

Clearly this is a problem, as we are only considering `n = 40`

and already the execution time is impractical.

Can you think why the algorithm as it stands takes so long to execute?

The answer lies in the fact that a lot of values are calculated multiple times. For example, using the the notation `F1`

for the first Fibonacci number, `F2`

for the second, etc, we can see that

in order to calculate `F6`

, we calculate `F4`

and `F5`

and in order to calculate `F5`

, we calculate `F3`

and `F4`

and in order to calculate `F4`

, we calculate `F2`

and `F3`

etc.

With a relatively small value for `n`

these extra calculations don’t matter much, but we don’t have to go far along the sequence to get to the point where the repeat calculations stack up so much that the execution time becomes impractical.

Can you think of an approach to deal with this problem?

## Python Memoization using lru_cache

There is a way to dramatically reduce the execution time of out Fibonacci function but storing previous results. The fancy term for this is **memoization**. What it means in this case is that once, say, `F3`

has been calculated, the result is stored in a cache, so that next time the value of `F3`

is needed, instead of calculating it by adding `F1`

and `F2`

, we the program can access the result differently from the cache. And the same goes for all other previously encountered values of `n`

.

Implementing this idea in **Python** is very easy if we make use of the tools available to us as you can see in the code below. It makes use of `lru_cahce`

from the `functools`

module. `lru`

in this stands for **Least Recently Used** A Least Recently Used (LRU) Cache organizes items in order of use, allowing you to quickly identify which item hasn’t been used for the longest amount of time.

## Python Memoization Code Listing Fibonacci

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
from functools import lru_cache import time @lru_cache def fib(n): """ Returns the n-th Fibonacci number. """ if n == 0 or n == 1: return n return fib(n - 1) + fib(n - 2) n = 40 start = time.perf_counter() fib(n) end = time.perf_counter() print("Seconds taken: {:.7f}".format(end - start)) |

On my machine I got `Seconds taken: 0.0000443`

. That is quite some improvement on `118.2504081`

seconds for the same value of `n`

!

### Manual caching

Although using the “magic” `@lru_cache`

decorator is fine for practical purposes, you may with to gain a deeper understanding of how memoization works by implementing the caching “manually”. For those who wish to do this, I have provided a code listing below showing one way it can be done, by passing a`cache`

dictionary along with each function call.

1 2 3 4 5 6 7 8 9 10 11 12 |
# With manual caching def fib(n, cache={}): if n in cache: return cache[n] if n == 0 or n == 1: return n result = fib(n - 1) + fib(n - 2) cache[n] = result return result print(fib(40)) |

There are a great many application for **memoization** in Python. Once you understand the concept and have some experience of using it, you will have a powerful tool in your programming toolkit to dramatically improve the performance of your code.