
Asynchronous Caching with Python

We’ve already talked a little bit about the new features in Python 3, specifically the type hinting system. Another such feature is native language support for asynchronous computation: the async/await syntax arrived in version 3.5, and the asyncio module left provisional status in 3.6.0. Today we are going to evaluate how we can do caching in this asynchronous environment without introducing any blocking into our code.

Simple asynchronous code

Let’s consider the following implementation of the Fibonacci numbers, wrapped with a tiny execution and measuring script (saved here as fib.py):

import asyncio
import sys
import time

async def fibonacci(n):
    if n == 0 or n == 1:
        return n
    return await fibonacci(n - 1) + await fibonacci(n - 2)

n = int(sys.argv[1])  # read the target number from the command line

loop = asyncio.get_event_loop()

start = time.time()
print(loop.run_until_complete(fibonacci(n)))
print(time.time() - start)

Now, this code is probably not the best example of asynchronous computation; however, it’s a good starting point.

> python3 fib.py 25
> python3 fib.py 35
> python3 fib.py 45
... I couldn't wait

As you can see, the running time grows exponentially as we increase the number to be calculated. This is clearly unacceptable. For this reason, clever engineers have come up with the idea of memoization: essentially trading computation time for fast-access memory storage. Once you go async, you need to go full async, though, which means that we need a memoization library that is itself asynchronous. Luckily, there’s already a very powerful one at our hands, called aiocache.
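Before reaching for a library, it helps to see how little memoization actually requires. Here is a hand-rolled sketch with a plain dict (the memoize decorator is my own illustration, not part of aiocache):

```python
import asyncio
import functools

def memoize(func):
    cache = {}  # maps argument tuples to already-computed results

    @functools.wraps(func)
    async def wrapper(*args):
        if args not in cache:
            cache[args] = await func(*args)  # compute once, remember forever
        return cache[args]

    return wrapper

@memoize
async def fibonacci(n):
    if n == 0 or n == 1:
        return n
    return await fibonacci(n - 1) + await fibonacci(n - 2)

loop = asyncio.new_event_loop()
print(loop.run_until_complete(fibonacci(30)))  # → 832040, instantly
```

Each distinct argument is computed exactly once, which collapses the exponential call tree into a linear one; aiocache gives us the same idea with pluggable backends and expiry on top.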

Here’s how we can use it in a simple way:

from aiocache import cached

@cached()
async def fibonacci(n):
    if n == 0 or n == 1:
        return n
    return await fibonacci(n - 1) + await fibonacci(n - 2)

Note: to use the aiocache library at full speed, it is recommended to also install the msgpack and ujson modules.

> python3 fib.py 35
> python3 fib.py 45
> python3 fib.py 100

As you can see, the speedup is significant; however, memory consumption can skyrocket, depending on what type of objects you are storing.
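To put a number on that memory cost: a memoized fibonacci(n) keeps one entry per distinct argument, and since Python integers are arbitrary-precision, the stored values themselves keep growing. A self-contained sketch that simulates such a cache with a plain dict:

```python
import sys

# Fill a dict the way a memoizing cache would: one entry per argument.
cache = {0: 0, 1: 1}
for i in range(2, 1001):
    cache[i] = cache[i - 1] + cache[i - 2]

print(len(cache))                  # 1001 entries for fibonacci(1000)
print(sys.getsizeof(cache[1000]))  # a single big integer already exceeds 100 bytes
```

With large objects (parsed JSON, images, ORM rows) instead of integers, the same entry count costs far more, which is why expiry and size limits matter in real caches.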

Real-life example

Enough fooling around with Fibonacci! Let’s take a look at a real-world example: say you have a service that uses weather data from the OpenWeather APIs. Since you are a nice person, you don’t want to barrage the APIs every time you run your script or a user accesses your service, so you decide to build a cache around them.

import aiohttp
import asyncio

from aiocache import cached

APP_ID = 'b6907d289e10d714a6e88b30761fae22'

@cached(ttl=3600)
async def get_weather(city):
    async with aiohttp.ClientSession() as session:
        url = f'https://api.openweathermap.org/data/2.5/weather?q={city},de&appid={APP_ID}'
        async with session.get(url) as response:
            data = await response.json()
            return data['weather'][0]['description']

loop = asyncio.get_event_loop()

city = 'Berlin'
for _ in range(10):
    weather = loop.run_until_complete(get_weather(city))
    print(f"The weather in {city} is {weather.lower()}")

You can see that we’ve added a 3600-second (1 hour) cache around the weather lookup using the ttl (time to live) keyword argument. The for loop at the end is there for simulation purposes, making sure that we don’t bombard the OpenWeather APIs.
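If you are curious what ttl does under the hood, the expiry logic can be sketched with nothing but the standard library (the ttl_cached decorator below is my own stand-in, not aiocache’s implementation, and the fake get_weather skips the HTTP call):

```python
import asyncio
import functools
import time

def ttl_cached(ttl):
    # Each cached value is stored alongside the timestamp it expires at.
    def decorator(func):
        cache = {}

        @functools.wraps(func)
        async def wrapper(*args):
            now = time.monotonic()
            if args in cache and cache[args][0] > now:
                return cache[args][1]          # entry still fresh: serve from cache
            value = await func(*args)
            cache[args] = (now + ttl, value)   # store with a new expiry time
            return value

        return wrapper
    return decorator

calls = 0

@ttl_cached(ttl=3600)
async def get_weather(city):
    global calls
    calls += 1  # stand-in for the real HTTP request
    return 'light rain'

loop = asyncio.new_event_loop()
for _ in range(10):
    loop.run_until_complete(get_weather('Berlin'))
print(calls)  # 1 -- only the first call does any work
```

An entry older than ttl seconds is simply recomputed on the next call; aiocache applies the same principle per backend, so expired keys never block the event loop.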

The aiocache module has a lot more to offer in the way of caching. We will take a look at its other features in the future.

That’s all for today. See you next time.