Python Simple Asyncio [Part 1]

Is asyncio simple? Some ideas:

Came across a well written and simple post about Async programming in Python.

A good place to learn about Python is always the documentation, in this case the documentation for asyncio.

There is also this Python and Async simplified post.

Many asyncio examples are a bit complex, but the core ideas can be simplified.

The Fundamentals

Python code can run in one of 2 realms.

  • Synchronous
  • Asynchronous

Switching between these realms is done explicitly – one has to write down a keyword or tell python that some code is going to run asynchronously.

You can’t use a synchronous database library like mysql-python directly from async code; similarly, you can’t use an async Redis library like aioredis directly from sync code.

In the synchronous world, the Python that’s been around for decades, you call functions directly and everything gets processed as it’s written on screen. Your only built-in option for running code in parallel in the same process is threads.

In the asynchronous world, everything runs on a central event loop, which is a bit of core code that lets you run several coroutines at once. Coroutines run synchronously until they hit an await and then they pause, give up control to the event loop, and something else can happen. – aeracode

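Synchronous and asynchronous functions/callables are different types. Awaiting the return value of a plain sync function raises a TypeError (unless that value happens to be awaitable), and calling an async function without await does not run it: the call only returns a coroutine object.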
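A minimal sketch of that type difference, assuming two hypothetical helpers add_sync and add_async that are not part of the original post:

```python
import asyncio
import inspect


def add_sync(a, b):
    # Hypothetical plain function: calling it returns the value directly.
    return a + b


async def add_async(a, b):
    # Hypothetical coroutine function: calling it returns a coroutine object.
    return a + b


async def demo():
    results = {}

    # The call only builds a coroutine object; the body has not run yet.
    coro = add_async(1, 2)
    results["is_coroutine"] = inspect.iscoroutine(coro)

    # Awaiting the coroutine object runs the body and yields its result.
    results["awaited_value"] = await coro

    # add_sync returns an int, and an int cannot be awaited.
    try:
        await add_sync(1, 2)
    except TypeError as exc:
        results["sync_await_error"] = type(exc).__name__

    return results


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The inspect module can also tell the two kinds of callable apart up front: inspect.iscoroutinefunction(add_async) is true, while inspect.iscoroutinefunction(add_sync) is false.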

Async programming is different from threads or greenlets. In async code the context changes only where it is written explicitly (at an await); with threads or greenlets the context can change at any time.

If you block a coroutine synchronously – maybe you use time.sleep(10) rather than await asyncio.sleep(10) – you don’t return control to the event loop, and you’ll hold up the entire process and nothing else can happen. That is blocking.

This is the reason why making external calls on asynchronous web frameworks is favoured – even if the asynchronous calls do not speed up user requests. The web framework can work on something else while waiting for the external call that used to block.
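One way to see the effect is to run a small background "ticker" task and count how often it gets to run while another coroutine pauses. This is only a sketch: ticker, count_ticks_during, blocking_pause and cooperative_pause are hypothetical names, and the short delays are arbitrary.

```python
import asyncio
import time


async def ticker(ticks):
    # Background task: records a timestamp every 10 ms while the loop is free.
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def blocking_pause():
    # Blocks the event loop: no other coroutine can run during this call.
    time.sleep(0.1)


async def cooperative_pause():
    # Suspends only this coroutine and hands control back to the event loop.
    await asyncio.sleep(0.1)


async def count_ticks_during(pause):
    ticks = []
    task = asyncio.create_task(ticker(ticks))
    await asyncio.sleep(0)  # give the ticker one chance to start
    before = len(ticks)
    await pause()
    during = len(ticks) - before
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return during


if __name__ == "__main__":
    print("ticks during time.sleep:", asyncio.run(count_ticks_during(blocking_pause)))
    print("ticks during asyncio.sleep:", asyncio.run(count_ticks_during(cooperative_pause)))
```

During the time.sleep pause the ticker records nothing, because the event loop never gets control. During the asyncio.sleep pause it keeps ticking.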

A confusing thing is that even await asyncio.sleep(10) can look like it blocks:

  • the calling coroutine still waits – if a coroutine is awaited directly instead of being scheduled as a task, it runs to completion before the next line executes.
  • it will seemingly block if there is only a single task – it will seem as though it is being run synchronously, i.e. with no advantage over or significant difference from running synchronously. The advantage only becomes clear when more than one task is run concurrently.

If nothing else is waiting to run or able to run asynchronously, then giving control back to the event loop provides no benefit.

This is a feature. If you use a blocking synchronous call by mistake, nothing will explicitly fail, but things will just run mysteriously slowly.

Writing explicitly asynchronous code is harder than writing synchronous code.

Example of the Confusion and Benefits

In async_vs_sync_simple_example.py:

import asyncio
import time

async def wait():
    # time.sleep blocks the event loop: nothing else can run in the meantime
    print('Start sleeping sync...', time.strftime('%X'))
    time.sleep(5)
    print('Done sleeping sync...', time.strftime('%X'))

async def wait_async():
    # asyncio.sleep suspends only this coroutine and returns control to the event loop
    print('Start sleeping async...', time.strftime('%X'))
    await asyncio.sleep(5)
    print('Done sleeping async...', time.strftime('%X'))

async def main():

    print('Await single synchronous coroutine\n')

    print(time.strftime('%X'))
    await wait()
    print(time.strftime('%X'))

    time.sleep(2)  # plain pause between the demos
    print('\nAwait single asynchronous coroutine\n')

    print(time.strftime('%X'))
    await wait_async()
    print(time.strftime('%X'))

    time.sleep(2)
    print('\nAwait multiple synchronous tasks\n')

    print(time.strftime('%X'))
    await asyncio.gather(
        wait(),  wait(),  wait()
    )
    print(time.strftime('%X'))

    time.sleep(2)
    print('\nAwait multiple asynchronous tasks\n')

    print(time.strftime('%X'))
    await asyncio.gather(
        wait_async(),  wait_async(),  wait_async()
    )
    print(time.strftime('%X'))

if __name__ == "__main__":
    print('Start running asyncio.run')

    asyncio.run(main())

    print('Completed running asyncio.run')

The results are:

$ python3 async_vs_sync_simple_example.py
Start running asyncio.run
Await single synchronous coroutine

14:24:48
Start sleeping sync... 14:24:48
Done sleeping sync... 14:24:53
14:24:53

Await single asynchronous coroutine

14:24:55
Start sleeping async... 14:24:55
Done sleeping async... 14:25:00
14:25:00

Await multiple synchronous tasks

14:25:02
Start sleeping sync... 14:25:02
Done sleeping sync... 14:25:07
Start sleeping sync... 14:25:07
Done sleeping sync... 14:25:12
Start sleeping sync... 14:25:12
Done sleeping sync... 14:25:17
14:25:17

Await multiple asynchronous tasks

14:25:19
Start sleeping async... 14:25:19
Start sleeping async... 14:25:19
Start sleeping async... 14:25:19
Done sleeping async... 14:25:24
Done sleeping async... 14:25:24
Done sleeping async... 14:25:24
14:25:24
Completed running asyncio.run

Commentary

Awaiting a single coroutine (or task) provides no benefit, whether it blocks or not.
The sync sleep takes 5 seconds and the async sleep takes 5 seconds, so each single run takes 5 seconds either way.

Awaiting multiple tasks that use synchronous (blocking) code runs them in sequence with no overlap. This can be seen in the sync sleeps starting and ending one after another and taking 15 seconds in total.

Compare that to awaiting multiple asynchronous tasks: the sleeps start in sequence but overlap once await asyncio.sleep(5) is encountered. That is, control goes back to the event loop and the next task can be started. Since the tasks execute concurrently, the whole batch takes 5 seconds.

In the example above asyncio.run is used. Prior to Python 3.7 one would have to write a lot more code to handle the tasks. Example of the old way vs new asyncio.run way.
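As a rough sketch of that difference (not the code from the linked example), the two styles might look like this. The function names run_old_style and run_new_style are made up for illustration, and older tutorials usually obtained the loop with asyncio.get_event_loop() rather than asyncio.new_event_loop().

```python
import asyncio


async def main():
    await asyncio.sleep(0.01)
    return "done"


def run_old_style():
    # Roughly what was needed before Python 3.7: create, drive and close
    # the event loop by hand.
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(main())
    finally:
        loop.close()


def run_new_style():
    # asyncio.run creates a loop, runs main() to completion and closes the loop.
    return asyncio.run(main())


if __name__ == "__main__":
    print(run_old_style())
    print(run_new_style())
```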

asyncio.run is the entry point: it tells Python that we are leaving the synchronous world and entering the asynchronous one.

What does Await do?

await x means "do not proceed with this coroutine until x is complete." If you place two awaits one after the other, they will naturally execute sequentially.

The Python docs say that await is used to suspend the execution of a coroutine on an awaitable object.

So at the core we are saying: pause this coroutine here and wait for this asynchronous thing to finish.

I.e. Python is not going to do you any favours when it comes to async; one needs to plan it and ensure that a coroutine that needs to run is not stuck awaiting something else.
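The sequential nature of back-to-back awaits can be illustrated by logging the order of events. This is a small sketch with a hypothetical step coroutine and arbitrary short delays; asyncio.gather is used for the concurrent variant.

```python
import asyncio


async def step(name, delay, log):
    log.append(f"{name} start")
    await asyncio.sleep(delay)
    log.append(f"{name} end")


async def sequential():
    log = []
    # The second await is not reached until the first step has finished.
    await step("a", 0.02, log)
    await step("b", 0.01, log)
    return log


async def concurrent():
    log = []
    # gather schedules both steps, so they overlap while sleeping.
    await asyncio.gather(step("a", 0.02, log), step("b", 0.01, log))
    return log


if __name__ == "__main__":
    print(asyncio.run(sequential()))  # ['a start', 'a end', 'b start', 'b end']
    print(asyncio.run(concurrent()))  # ['a start', 'b start', 'b end', 'a end']
```

In the sequential version "b" cannot start before "a" ends. In the gathered version both start immediately and the shorter sleep finishes first.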

It is also important to understand the difference between a coroutine and a task. When a task is created with asyncio.create_task(x) (or implicitly by asyncio.gather()), it is scheduled for execution and starts as soon as the event loop gets control.
Often this means that the task is already running, or even finished, by the time one writes result = await some_task, so the result is already there.

This is not the case when awaiting a coroutine directly with result = await some_coroutine(): the coroutine only starts running at that await.

Tasks are used to schedule coroutines concurrently. – python docs on asyncio tasks
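A small sketch of the coroutine versus task difference, assuming a hypothetical work coroutine and arbitrary delays:

```python
import asyncio


async def work(log):
    log.append("started")
    await asyncio.sleep(0.01)
    return 42


async def with_task():
    log = []
    task = asyncio.create_task(work(log))  # scheduled right away
    await asyncio.sleep(0.05)              # meanwhile the task runs
    done_before_await = task.done()
    result = await task                    # result is already available
    return done_before_await, result


async def with_coroutine():
    log = []
    coro = work(log)                       # only a coroutine object so far
    await asyncio.sleep(0.05)              # nothing runs work() here
    started_before_await = bool(log)
    result = await coro                    # work() starts and finishes here
    return started_before_await, result


if __name__ == "__main__":
    print(asyncio.run(with_task()))       # (True, 42)
    print(asyncio.run(with_coroutine()))  # (False, 42)
```

The task has finished before it is awaited, while the bare coroutine has not even started.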

The main() function pauses and gives back control to the event loop, then that awaited function calls await and gets suspended and control is given back to whatever needs it.

Ideally anything that is blocking calls await and the event loop only runs things that are not paused or currently occupied by some I/O task (like a network or file system request).

What is a Coroutine?

A coroutine object is a promise that the code will run and you’ll get a result back once it is given an event loop. Python uses await to give it the event loop.

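async def greet():
    print('Hello world')

# Calling greet() does not run its body; it only creates a coroutine object.
# 'Hello world' is never printed because the coroutine is never awaited.
coro_greet = greet()

print(coro_greet)        # <coroutine object greet at 0x...>
print(type(coro_greet))  # <class 'coroutine'>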

Moving Between Sync and Async Realm

There are 4 cases for code in this context:

  • Calling sync code from sync code – This is just a normal function call. Nothing risky or special about this.
  • Calling async code from async code – You have to use await here.
  • Calling sync code from async code – You can do this, but it will block the whole process and make things mysteriously slow, and you shouldn’t. Instead, you need to give the sync code its own thread.
  • Calling async code from sync code – Trying to even use await inside a synchronous function is a syntax error in Python, so to do this you need to make an event loop for the code to run inside. This needs an entrypoint, usually asyncio.run().
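The last case can be sketched as follows. The names fetch_answer, sync_entrypoint and nested_attempt are hypothetical. The second function shows that asyncio.run is meant for entering from the sync world only: calling it while an event loop is already running raises a RuntimeError.

```python
import asyncio


async def fetch_answer():
    await asyncio.sleep(0.01)
    return 42


def sync_entrypoint():
    # Sync code cannot use await, so it starts an event loop for the coroutine.
    return asyncio.run(fetch_answer())


async def nested_attempt():
    # Already inside an event loop: asyncio.run refuses to start another one.
    coro = fetch_answer()
    try:
        asyncio.run(coro)
    except RuntimeError:
        coro.close()  # avoid a "never awaited" warning for the unused coroutine
        return True
    return False


if __name__ == "__main__":
    print(sync_entrypoint())
    print(asyncio.run(nested_attempt()))
```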

Running Synchronous code from Async

You don’t give the event loop any chance to run. You haven’t paused the current coroutine and given the event loop control using await.

That means every other coroutine that might want to run doesn’t even get a chance.

Your coroutine just runs and ignores the others.

No one is going to save you – not even Python – you must manually and explicitly give control back. This means you need to know your program inside and out and plan the execution.

The recommendation is to never call anything synchronous from an async function without doing it safely, or without knowing beforehand it’s a non-blocking standard library function, like os.path.join.

But what is safe? In the sync world, threading is our only built-in option for concurrency, so what we can do is spin up a new thread, get the sync function running inside it, and then have our coroutine pause and give back control to the event loop until its thread is finished and there’s a result.

Python has a slightly verbose way of doing this built-in, in the form of executors:

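import asyncio
import concurrent.futures

def get_chat_id(name):
    # Chat is a Django model; this ORM query is synchronous and blocks
    return Chat.objects.get(name=name).id

async def main():
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=3)
    loop = asyncio.get_running_loop()
    # Run the blocking call in a worker thread and suspend until it finishes
    result = await loop.run_in_executor(executor, get_chat_id, "django")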
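On Python 3.9 and later, asyncio.to_thread offers a shorter way to do the same kind of hand-off. The sketch below does not use Django: slow_lookup is a hypothetical stand-in for a blocking call, and the heartbeat task is there only to show that the event loop stays responsive while the worker thread is busy.

```python
import asyncio
import time


def slow_lookup(name):
    # Hypothetical stand-in for a blocking call such as a sync database query.
    time.sleep(0.1)
    return f"id-for-{name}"


async def heartbeat(ticks):
    # Keeps recording timestamps as long as the event loop is free to run it.
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def main():
    ticks = []
    beat = asyncio.create_task(heartbeat(ticks))
    # The blocking function runs in a worker thread; this coroutine is
    # suspended until the thread returns, so the heartbeat keeps running.
    result = await asyncio.to_thread(slow_lookup, "django")
    beat.cancel()
    return result, len(ticks)


if __name__ == "__main__":
    result, tick_count = asyncio.run(main())
    print(result, "heartbeat ticks while waiting:", tick_count)
```

Under the hood this still uses a thread pool, so the same caveats about thread safety apply as with an explicit executor.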

Next steps…

On the path of learning…

Sources