What I like about profiling.tracing is how you can control it from the same code base that you are profiling. That turns tracing into a feature you can ship, not just a tool you can attach. profiling.sampling doesn't seem to provide anything similar; it has to run as a separate process.
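To illustrate the "feature you can ship" point: the long-standing cProfile API already lets you switch tracing on and off from inside the application, and as I understand it profiling.tracing exposes the same machinery under the new name (treat that equivalence, and the exact 3.15 import, as an assumption). A minimal sketch:

```python
import cProfile
import io
import pstats

def hot_path(n):
    """Stand-in for the code path you actually want to measure."""
    return sum(i * i for i in range(n))

# Toggle the tracing profiler around just the code you care about;
# the rest of the application runs with zero profiling overhead.
profiler = cProfile.Profile()
profiler.enable()
result = hot_path(10_000)
profiler.disable()

# Dump the captured stats to a string instead of stdout, so the
# application itself can decide where the report goes.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
print(buf.getvalue().splitlines()[0])
```

Because enable/disable are just method calls, you can gate them behind a config flag or an admin endpoint, which is exactly the kind of shippable control an external sampling process can't give you.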
It’s not a downside, perhaps, but a tradeoff. Implementing an internal sampling profiler is not impossible in general (though it will depend on the exact Python implementation), but it would impose different overhead on the target and might skew results a lot more than an external one.
I'm not trying to criticize, but Python is known to be much slower than e.g. Java or Go. So for performance-critical code, why use Python? I find Python to be very good because it is concise and simple, but I have not used it in production so far.
You use Python when it makes sense for other reasons (library support, coworker familiarity, etc), same as for any other project. Additionally, sometimes performance matters, but perhaps not enough to overcome whatever else is drawing you to Python in the first place.
Right this second I'm writing something in Python with critical performance requirements. It needs to average processing 25k things per second. That won't be particularly hard, but it's close enough to the edge of what the language is capable of that I do need to be at least a tiny bit careful with the implementation. I'm highly unlikely to need a profiler for this project in particular, but earlier in my career I probably would have needed one.
Python is fairly commonly used as a glue engine around faster code too, and it's not always obvious when the wrapper code is inducing nontrivial overhead (hidden copies and that sort of thing). Profilers are great for teasing out those sorts of problems. They shine a spotlight on the section of code which should take 0us and is instead dominating your runtime.
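As a toy illustration of the hidden-copy problem (pure Python rather than a C wrapper, but the same shape): list slicing looks free and secretly copies, which is exactly the kind of innocuous line a profiler ends up pinning the runtime on.

```python
import timeit

def drain_by_slicing(xs):
    """Sums a list, but re-slices it each step."""
    total = 0
    while xs:
        total += xs[0]
        xs = xs[1:]   # hidden copy of all remaining elements -> O(n^2) overall
    return total

def drain_by_iterating(xs):
    """Same result, no copies."""
    return sum(xs)

data = list(range(2000))
slow = timeit.timeit(lambda: drain_by_slicing(data), number=5)
fast = timeit.timeit(lambda: drain_by_iterating(data), number=5)
print(f"slicing: {slow:.4f}s  iterating: {fast:.4f}s")
```

Both functions compute the same sum; a profiler run over the first one would show nearly all the time attributed to the slice, not the arithmetic.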
I think it's the opposite: the fact that it is slow means profiling is more important. A 10x difference between unoptimized 0.1ms and optimized 0.01ms in Go could translate to 10s vs 1s in an equivalent Python script, which is a considerably more noticeable difference.
Even when performance is not critical it’s possible to write (or vibe?) disastrously slow code, so having a profiler handy is always a plus. It might be the deciding factor between “we must rewrite it all in Rust” and “oh, we added exponential complexity by accident”, and save a ton of time.
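The classic accidental-exponential case, sketched: naive recursion re-solves the same subproblems, so a profiler shows millions of calls to one tiny function, and the fix is a one-line cache rather than a rewrite.

```python
from functools import lru_cache

calls = 0

def fib_naive(n):
    """Exponential by accident: ~1.6^n calls."""
    global calls
    calls += 1
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_cached(n):
    """Same recursion, linear number of distinct calls."""
    return n if n < 2 else fib_cached(n - 1) + fib_cached(n - 2)

assert fib_naive(20) == fib_cached(20) == 6765
print(f"naive version made {calls} calls for n=20")
```

In a profiler's output the naive version is unmistakable: one trivial function dominating the call count is the signature of accidental blowup, not of a slow language.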
The simple answer is, I choose to use Python because I am productive with it. I get a lot done compared to the other languages I have tried. Performance is almost never the limiting factor in my work, nor has it been for the vast majority of the work I've ever been witness to. When it is, it comes up in very particular circumstances, and can usually be fixed algorithmically. Indeed, that is the situation where I have used profilers.
The fact that the base language is an order of magnitude (or two!) slower has almost never mattered. If my work gets to the point where it does, and I have an excuse to go mess around with a Rust extension or some cool optimized library, things are going very well.
I've been a professional developer for over 20 years now, and I've read this forum obsessively for much of that time. I've seen people write things like, "Most engineers would kill for a 5% speedup" and I think: on what planet? Most engineers have much larger problems that cannot be so easily quantified. Come to think of it, I suspect there is an allure to performance optimization precisely because it can be so easily quantified.
> I choose to use Python because I am productive with it.
That, I fully understand. I think many developers are productive in one language: the one they know best, which is probably the one they use most. It might happen that this is a "fast" language by accident (like Java or Go), or a language like Python. And then there is never enough reason to switch.
> "Most engineers would kill for a 5% speedup"
I think this is very rare - maybe for a heavily used app at Facebook or Google, where 5% could mean a lot of money. But a factor-of-10 speedup is much more common (and sometimes possible).
> there is an allure to performance optimization due to the fact that it can be so easily quantified.
That's true. I also think simplicity is quantifiable, and so my personal hobby is to write something impressive in a few lines: a chess engine, a QR code reader, an editor, a data compression tool, a compiler, each in 500 lines. But this is mostly for hobbies, I guess. For work, it's mostly about features, and then performance.
Sometimes you get there by accident. You make a thing, it grows or is used in unexpected ways, now suddenly performance matters.
Sometimes Python is just the language used in the domain. Lots of sciences live on Python because it is easy to teach to grad students and the package ecosystem is strong.
Profiling is about acknowledging hot paths. What to do with that info is up to programmers - usually trying to optimize code that takes more execution time than expected. Every language needs good tooling, no matter how fast its runtime.
Generally you choose Python for the conciseness you mentioned, and then move the performance-critical functions into another language like C or (I find to be easiest) Cython. Ideally most of your code stays Python, and you either optimize self-contained pieces, or find library bindings that have done it for you.
A profiler like this can be used to identify which parts to rewrite in a faster language. Sometimes it's easier to write everything in Python first, then measure, than guess at the start which parts need to be fast.
You can also get gains by switching algorithms, both in pure Python and when using a compiled library like `numpy`. And there are also some operations, like string manipulation or the `sqlite3` module, where the Python runtime's implementation has already been optimized in a compiled language.
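A concrete sketch of the algorithm-switch point, no compiled code involved: the same deduplication task goes from quadratic to linear just by changing the lookup structure.

```python
def dedupe_list(items):
    """Order-preserving dedupe with list membership: O(n^2)."""
    seen = []
    out = []
    for x in items:
        if x not in seen:   # linear scan of `seen` on every element
            seen.append(x)
            out.append(x)
    return out

def dedupe_set(items):
    """Same result with set membership: O(n) on average."""
    seen = set()
    out = []
    for x in items:
        if x not in seen:   # constant-time hash lookup
            seen.add(x)
            out.append(x)
    return out

data = [i % 500 for i in range(5000)]
assert dedupe_list(data) == dedupe_set(data) == list(range(500))
```

On large inputs the difference between these two dwarfs any interpreter overhead, which is why "switch the algorithm" so often beats "switch the language".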
In production, what's making your application slow is extremely unlikely to be the Python code. It's going to be I/O, the threading/concurrency architecture, or other mistakes and inefficiencies that can be cleared up without leaving the ecosystem. The question of fast vs. slow languages doesn't make a lot of sense to entertain before you have any context on the specific needs of the application or use case. On its own it's just unsophisticated vanity blog fodder.
With the obvious caveat that low-level game engine, image/video processing, numerical code etc. isn't really viable in Python. But outside of that, it's fast enough for gluing together other code that's doing the heavy lifting.
While there are some implementation differences (py-spy is written in Rust, profiling.sampling is a mix of Python and C, etc.), the end result seems pretty similar to me.
One thing to note is that there are some differences in the blocking behaviour of the target process: py-spy blocks by default and profiling.sampling doesn't. I wrote a bit about why py-spy blocks by default here: https://www.benfrederickson.com/why-python-needs-paused-duri... The first version of py-spy also didn't block, and since we got incredibly misleading results at times, this was one of the first changes I made to py-spy.
Interesting, does wall-time mode work with Asyncio?
Imagine if I have a single request calling asyncio.gather() on 5 different coroutines. Only 1 is on CPU, the other 4 are on IO. Is Tachyon able to sample all 5 coroutine tasks?
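The scenario in the question, as runnable code (whether Tachyon's sampler attributes time to the four suspended tasks is exactly the open question; this just sets up the situation):

```python
import asyncio

def burn_cpu(n):
    """CPU-bound work that holds the event loop while it runs."""
    return sum(i * i for i in range(n))

async def cpu_task():
    return burn_cpu(200_000)

async def io_task(delay):
    await asyncio.sleep(delay)   # suspended, off-CPU, waiting on "I/O"
    return delay

async def handle_request():
    # One request fanning out to 5 coroutines: 1 on CPU, 4 on I/O.
    return await asyncio.gather(
        cpu_task(),
        io_task(0.05), io_task(0.05), io_task(0.05), io_task(0.05),
    )

results = asyncio.run(handle_request())
print(results)
```

A CPU-time profiler would only ever see cpu_task here; a wall-time profiler would ideally charge ~50ms to each io_task as well, but that depends on whether the sampler walks the suspended asyncio task state rather than just the live C stack.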