Unfortunately this starts with a definition of concurrency, quoted from the Python wiki [0], which is imprecise: "Concurrency in programming means that multiple computations happen at the same time."
Not necessarily. It means multiple computations could happen at the same time. Wikipedia has a broader definition [1]: "Concurrency refers to the ability of a system to execute multiple tasks through simultaneous execution or time-sharing (context switching), sharing resources and managing interactions."
In other words it could be at the same time or it could be context switching (quickly changing from one to another). Parallel [2] means explicitly at the same time: "Parallel computing is a type of computation in which many calculations or processes are carried out simultaneously."
While this sounds like a nit, by the OP's definition of concurrency, asyncio as implemented in Python (and other languages) is not a form of concurrent programming.

0: https://wiki.python.org/moin/Concurrency
1: https://en.wikipedia.org/wiki/Concurrency_(computer_science)
2: https://en.wikipedia.org/wiki/Parallel_computing
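To make the distinction concrete, here is a minimal sketch: asyncio interleaves two coroutines on a single thread, so both make progress (concurrency via context switching) without ever running simultaneously (no parallelism).

    import asyncio

    async def worker(name):
        for i in range(3):
            print(f"{name}: step {i}")
            await asyncio.sleep(0)  # yield so the other task can run

    async def main():
        # Both tasks interleave on one thread; the output alternates A/B.
        await asyncio.gather(worker("A"), worker("B"))

    asyncio.run(main())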
While these kinds of articles are useful for learners, I wish someone would explain the concurrency models of uWSGI/Gunicorn/uvicorn/gevent. For example: how long do global variables live? How does context switching (like the magic request object in Flask) work? How do you spawn async background tasks? Is it safe to mix task schedulers inside web code? How do you measure when concurrency is saturated, and how do you scale? What data can be shared between executors, and how? How do you detect back-pressure? How do you interrupt a long-running function when the client disconnects (nginx 499)? How do you properly handle Unix signals with threads/multiprocessing/asyncio?
Has anyone had multiprocessing freezing on Windows? It seems you need a __main__ guard for multiprocessing to work on Windows, which I do not have, as I'm using pyproject scripts with click to run. Has anyone faced this issue? Is there a solution?
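For reference, a minimal sketch of the usual fix (the command and function names are illustrative): on Windows, multiprocessing uses the spawn start method, so every worker re-imports your module. Process creation therefore has to happen inside the function your console-script entry point calls, never at import time, and worker functions must be importable at module top level. This works with pyproject scripts because the entry point invokes main() explicitly, while the spawn re-import does not.

    import multiprocessing

    import click

    def work(x):
        # Must live at module top level so spawned children can import it.
        return x * x

    @click.command()
    def main():
        # Runs only when the console script (or __main__) invokes it,
        # never during the re-import that spawn performs in each child.
        with multiprocessing.Pool() as pool:
            click.echo(pool.map(work, range(10)))

    if __name__ == "__main__":
        main()  # also covers running the file directly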
I would have liked to see the performance of the asyncio version; that seems like a surprising omission.
I’m also confused about the perf2 performance. For the threads example it starts around 70_000 reqs/sec, while the processes example runs at 3_500 reqs/sec. That’s a 20 times difference that isn’t mentioned in the text.
Not really anything new in there. I've been dealing with Python concurrency a lot and I don't find it great compared to other languages (e.g. Kotlin).
One thing I am struggling with right now is how to handle a function that is both I/O-intensive and CPU-bound. To give more context, I am processing data that on paper is easy to parallelise: for 1000 lines of data, I have to execute my function f for each line, in any order. However, f uses the CPU a lot but also makes up to 4 network requests.
My current approach is to divide the 1000 lines by n_cores, launch n_cores processes, and on each of them run the function f asynchronously over all inputs assigned to that process, using async to handle the switching on I/O. I wonder if my approach could be improved.
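For what it's worth, a minimal sketch of that layout (all names are illustrative): one process per core, each running its share of the lines through asyncio so the network requests overlap while the CPU-bound work keeps the core busy.

    import asyncio
    import os
    from concurrent.futures import ProcessPoolExecutor

    async def f(line):
        # Stand-ins for the real work: CPU-heavy processing plus
        # up to 4 network requests (simulated with a sleep here).
        await asyncio.sleep(0.01)
        return sum(ord(c) for c in line)

    async def run_chunk_async(chunk):
        return await asyncio.gather(*(f(line) for line in chunk))

    def run_chunk(chunk):
        # Each worker process runs its own event loop over its chunk.
        return asyncio.run(run_chunk_async(chunk))

    def process_all(lines):
        n_cores = os.cpu_count() or 1
        chunks = [lines[i::n_cores] for i in range(n_cores)]
        with ProcessPoolExecutor(max_workers=n_cores) as pool:
            results = list(pool.map(run_chunk, chunks))
        return [item for chunk in results for item in chunk]

    if __name__ == "__main__":
        print(process_all([f"line {i}" for i in range(1000)])[:5])

One possible refinement is to cap in-flight requests per process with an asyncio.Semaphore so slow network calls can't pile up unboundedly.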
Interested in seeing if you have tried 3.13 free-threading. Your use case might be worth a test there if moving from a process to a threading model isn't too much work.
Where does your implementation bottleneck?
Python concurrency does suffer from being relatively new and bolted onto a decades-old language. I'd expect the state of the art in Python to be much cleaner once no-GIL has been hammered on for a few release cycles.
As always, I suggest the Core.py podcast, as it has a bunch of background details [1]. There are no-GIL updates throughout the series.

[1] https://podcasts.apple.com/us/podcast/core-py/id1712665877
In reality, no one writes from scratch with threads, processes, or asyncio unless you are a library author.
1. Always use asyncio.
2. Use threads with asyncio.to_thread[0] to convert blocking calls to asyncio with ease (a minimal sketch follows the links below).
3. Use aiomultiprocess[1] for any CPU-intensive tasks.
[0]: https://docs.python.org/3/library/asyncio-task.html#asyncio....
[1]: https://aiomultiprocess.omnilib.dev/en/stable/
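A minimal sketch of point 2 (the blocking function here is made up): asyncio.to_thread pushes a blocking call onto a worker thread so the event loop stays free to run other tasks.

    import asyncio
    import time

    def blocking_fetch(n):
        # Stand-in for a blocking call, e.g. from a legacy client library.
        time.sleep(1)
        return n * 2

    async def main():
        # Three blocking calls run in worker threads concurrently:
        # roughly 1 second total instead of 3 sequentially.
        results = await asyncio.gather(
            *(asyncio.to_thread(blocking_fetch, n) for n in range(3))
        )
        print(results)

    asyncio.run(main())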
* Task Parallelism and Multi-threading are good for computational tasks spread across the ESP32's dual cores.
* Asynchronous Programming shines in scenarios where I/O operations are predominant (see the sketch after this list).
* Hardware Parallelism via RMT can offload tasks from the CPU, enhancing overall efficiency for specific types of applications.
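As a toy illustration of the asynchronous point, a MicroPython-flavoured sketch (the pin numbers and timings are arbitrary): two coroutines cooperate on a single core, which is enough when the workload is I/O-dominated.

    import asyncio  # "uasyncio" on older MicroPython builds

    from machine import Pin

    async def blink(pin_no, period_ms):
        led = Pin(pin_no, Pin.OUT)
        state = 0
        while True:
            state = 1 - state
            led.value(state)
            await asyncio.sleep_ms(period_ms)

    async def main():
        # Two blinkers interleave cooperatively on one core.
        asyncio.create_task(blink(2, 250))
        asyncio.create_task(blink(4, 400))
        await asyncio.sleep(10)  # let them run for ten seconds

    asyncio.run(main())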