Ask HN: What'd be possible with 1000x faster CPUs?

54 points | by xept 13 days ago

40 comments

  • yourcousinbilly 13 days ago
    Video engineer here. Many seemingly network-restricted tasks could be unlocked by faster CPUs doing advanced compression and decompression.

    1. Video Calls

    Encoding and decoding are actually a significant cost of video calls, not just networking. Right now the peak is Zoom's 30 video streams onscreen, but with 1000x CPUs you could have hundreds of high-quality streams with advanced face detection and superscaling[1]. Advanced computer vision models could analyze each face to create a face mesh of vectors, then send those vector changes across the wire instead of a video frame. The receiving computers could then reconstruct the face for each frame. This could turn video calling into an entirely CPU-restricted task.
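    A minimal sketch of what that landmark-delta wire format could look like (the mesh size and the fixed-point scale here are made-up illustrative numbers, not any real codec):

```python
import struct

# Hypothetical landmark-delta codec: instead of shipping a video frame,
# ship only how each face-mesh point moved since the previous frame.
NUM_LANDMARKS = 468  # a typical dense face-mesh size (illustrative)

def encode_delta(prev, curr):
    """Pack per-landmark (dx, dy, dz) as 16-bit fixed point (1/1000 units)."""
    payload = bytearray()
    for (px, py, pz), (cx, cy, cz) in zip(prev, curr):
        for d in (cx - px, cy - py, cz - pz):
            payload += struct.pack("<h", int(round(d * 1000)))
    return bytes(payload)

def decode_delta(prev, payload):
    """Reconstruct landmark positions from the previous frame plus deltas."""
    vals = [v[0] / 1000 for v in struct.iter_unpack("<h", payload)]
    return [(px + vals[3 * i], py + vals[3 * i + 1], pz + vals[3 * i + 2])
            for i, (px, py, pz) in enumerate(prev)]
```

    At 468 landmarks that is under 3 KB per frame versus megabytes for a raw frame; the 1000x CPU budget goes into fitting the mesh on the sender and rendering a face from it on the receiver.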

    2. Incredibly Realistic and Vast Virtual Worlds

    Imagine the most advanced movie-realistic CGI being generated for each frame. Something like the new Lion King, or Avatar-like worlds, being created before you through your VR headset. With extremely advanced eye tracking and graphics, VR would hit that next level of realism. AR and VR use cases could explode with incredibly light headsets.

    To be imaginative, you could have everything from huge concerts to regular meetings take place in the real world but be scanned and sent to VR participants in real time. The entire space, including the room and whiteboard or live audience, could be rendered in real time for all VR participants.

    [1] https://developer.nvidia.com/maxine-getting-started

    • bobosha 12 days ago
      > Encoding and decoding are actually a significant cost of video calls, not just networking. Right now the peak is Zoom's 30 video streams onscreen, but with 1000x CPUs you could have hundreds of high-quality streams with advanced face detection and superscaling[1]. Advanced computer vision models could analyze each face to create a face mesh of vectors, then send those vector changes across the wire instead of a video frame. The receiving computers could then reconstruct the face for each frame. This could turn video calling into an entirely CPU-restricted task.

      Interesting. How do you see this as different from the deep-learning-based video coding demonstrated recently? [1]

      [1] https://dl.acm.org/doi/10.1145/3368405

  • throwaway81523 13 days ago
    Realistically, AI network training at the level being done by corporations with big server farms becomes accessible to solo devs and hobbyists (let's count GPUs as general purpose). So if you want your own network for Stable Diffusion or Leela Chess, you can train it on your own PC. I think that is the most interesting obvious consequence.

    Also, large scale data hoarding becomes far more affordable (I assume the petabyte ram modules also mean exabyte disk drives). So you can be your own Internet Archive, which is great. Alternatively, you can be your own NSA or Google/Facebook in terms of tracking everyone, which is less great.

    • 2OEH8eoCRo0 13 days ago
      I think when that hardware is attainable and the tech democratized things are going to get very bizarre very quickly. I'm hitting a wall in my imagination of what a society where this is common even looks like and it scares me.
      • 8jef 13 days ago
        Just as any of your great-grandparents would be absolutely scared of the actual you and the stuff you do on a daily basis, at first.
        • 2OEH8eoCRo0 13 days ago
          I imagine limitless tailor made entertainment and control on a per-user basis.

          "Play me Frank Zappa's new album featuring Kanye West."

    • mid-kid 11 days ago
      > Also, large scale data hoarding becomes far more affordable (I assume the petabyte ram modules also mean exabyte disk drives).

      It will also mean data in general will be bigger and scale accordingly.

    • ReactiveJelly 13 days ago
      Imagine just saving every web page your computer ever browses, forever.
  • rozap 13 days ago
    Atlassian products would be twice as fast.
    • mebble 11 days ago
      40 thousand years of evolution and we've barely even tapped the vastness of Atlassian functionality potential
  • exq 13 days ago
    Instead of electron we'd be bundling an entire OS with our chat apps.
    • johnklos 13 days ago
      That would be nice, because many OSes are much smaller than Electron.
    • ilaksh 13 days ago
      Electron basically IS an entire OS, since Chromium has APIs for doing just about anything: accessing the filesystem, USB devices, and 500 other things.
      • ReactiveJelly 13 days ago
        If _accessing_ the filesystem counts toward being an OS, and not _implementing_ the filesystem, then I guess Qt and the stdlib of every lang is also "kind of an OS"
        • selfhoster11 12 days ago
          That's splitting hairs. Paravirtualised IO on a virtual machine doesn't make the guest OS running inside it any less of an OS just because it has a simpler interface to the outside world than a SATA/SAS/NVMe controller.
    • rowanG077 12 days ago
      Oh we are not far away from that. Most devs consider it completely fine to run a docker instance per project.
    • randomf 13 days ago
      many apps already have "wget docker image" as the first step
  • nirinor 13 days ago
    Some applications depend on approximately solving optimization problems that are hard even at small sizes. The poster child here is combinatorial optimization (more or less equivalently, NP-complete problems); concrete examples are SMT solvers and their applications to software verification [1]. Non-convex problems are sometimes similarly bad.

    Non-smooth and badly conditioned optimization problems scale much better with size, but getting high-precision solutions is hard. These are important for the simulations mentioned elsewhere, and not just for architecture and games but also for automating design, inspections, etc. [2]

    [1] https://ocamlpro.github.io/verification_for_dummies/

    [2] https://www.youtube.com/watch?v=1ALvgx-smFI&t=14s

  • h2odragon 13 days ago
    1 million Science per Minute Factorio bases.
  • sahinyanlik 12 days ago
    Microsoft Teams might work without locking up my PC. Hopefully.
  • ilaksh 13 days ago
    The thing is, computing has been getting steadily faster, just not at quite the pace it was before and in a different way.

    With GPUs we have proven that parallelism can be just as good as, or even better than, raw speed increases in enhancing computation. And there, again, speed increases have been trickling in.

    I don't think it's realistic to say that more speed advances are unlikely. We have already been through many different paradigm shifts in computing, from mechanical to nanoscale. There are new paradigms coming up such as memristors and optical computing.

    It seems like 1000x will make Stable Diffusion-style video generation feasible.

    We will be able to use larger, currently slow AI models in realtime for things like streaming compression or games.

    Real global illumination in graphics could become standard.

    Much more realistic virtual reality. For example, imagine a realistic forest stream that your avatar is wading through, with realtime accurate simulation of the water, and complex models for animal cognition of the birds and squirrels around you.

    I think with this type of speed increase we will see fairly general purpose AI, since it will allow average programmers to easily and inexpensively experiment with combining many, many different AI models together to handle broader sets of tasks and eventually find better paradigms.

    It also could allow for emphasis on iteration in AI, and that could move the focus away from parallel-specific types of computation back to more programmer-friendly imperative styles, for example if combined with many smaller neural networks to enable program synthesis, testing and refinement in real time.

    Here's a weird one: imagine something like emojis in VR, but in 3d, animated, and customized on the fly for the context of what you are discussing, automatically based on an AI you have given permission to.

    Or, hook the AI directly into your neocortex. Hook it into several people's neocortices and then train an animated AI 3d scene generation system to respond to their collective thoughts and visualizations. You could make serialized communication almost obsolete.

    • saltcured 13 days ago
      However, 1000x is really not very much. With a 1000x uplift we could certainly get better weather predictions, but not necessarily a paradigm-altering improvement. In a real sense we already have a 1000x speedup: it's what you get in a contemporary "supercomputer", whatever that is in a given market at a given point in history.

      Let's say we had perfect 1000x improvement in compute, storage, and IO such that everything remains balanced. A fluid-dynamics or atmospheric simulation can only increase resolution by about 10x if a 3D volumetric grid is refined uniformly, or only about 5x if we spread it uniformly over 4D to also improve temporal resolution. Or maybe you decide to increase the 2D geographic reach of a model by 30x and leave the height and temporal resolution alone. These growth factors are not life-changing unless you happen to be close to a non-linear boundary where you cross a threshold from impractical to practical.
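      The dimensional arithmetic checks out; a quick sanity check, assuming work grows with the product of the refinement factors across dimensions:

```python
# Refining a simulation grid by factor k in d dimensions costs roughly
# k**d more work, so a 1000x budget buys k = 1000**(1/d) per dimension.
speedup = 1000

res_3d = speedup ** (1 / 3)   # uniform 3D refinement: ~10x
res_4d = speedup ** (1 / 4)   # 3D plus matching temporal resolution: ~5.6x
res_2d = speedup ** (1 / 2)   # spend it all on 2D geographic reach: ~31.6x

print(f"3D: {res_3d:.1f}x  4D: {res_4d:.1f}x  2D: {res_2d:.1f}x")
```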

      I'm not sure we can say how much a video game would improve. There are so many "dimensions" that are currently limited and it's hard to say where that extra resource budget should go. Maybe you currently can simulate a dozen interesting NPCs and now you could have a crowd of 10,000 of them. But you still couldn't handle a full stadium full of these interesting behaviors without another 10x of resources...

      • ninjanomnom 12 days ago
        I work on an open source multiplayer game that's limited by single-thread CPU speed, so I can give a perspective on what would improve for us at least.

        The fastest thing to change is that we'd increase player limits per server; per-player CPU costs are significant, and we could bring the player limit to maybe 500 before network speeds start being a consideration. Certain AI improvements that are currently not viable, like goal-oriented AI design and pathfinding improvements, could be added that would make new kinds of gameplay possible. Hell, with even just 10x I would be very tempted to try unifying our atmospheric and chemistry simulations so they use the same data structures, allowing chemical reactions between gases that aren't basically masses of nonstandard performance hacks on the back end.

        In short, though, even minor performance improvements would vastly change what we could accomplish. 1000x is extreme; you would see very different games making use of techniques that today are mostly relegated to games built around them as a gimmick, with sacrifices made elsewhere.

    • thfuran 13 days ago
      >With GPUs we have proven that parallelism can be just as good or even better than speed increases in enhancing computation.

      Not really, no. It's just that certain classes of problems can be very readily parallelized and it's relatively easy to figure out how to do something 1000x in parallel compared to figuring out how to achieve a 1000x single thread speedup.

      >Much more realistic virtual reality. For example, imagine a realistic forest stream that your avatar is wading through, with realtime accurate simulation of the water, and complex models for animal cognition of the birds and squirrels around you.

      I'm not sure 1000x would do much more than scratch the surface of that, especially if you're already tying a lot of it up with higher fidelity rendering.

  • ussrlongbow 12 days ago
    I wish CPUs would get 10x slower for a while, to leave some room for software product optimisation.
    • mburee 12 days ago
      Exactly; 1000x faster CPUs would just result in new software eating up the extra speed in no time at all
  • Jaydenaus 13 days ago
    First thing that comes to mind is using your mobile device as your main workstation would become a lot more realistic.
    • 2rsf 12 days ago
      In a lot of respects the limiting factor of using mobiles as workstations is the software and OS: you can add a Bluetooth keyboard and mouse, then cast to a screen, but all you will get is a bigger phone, not a workstation. Mobile CPUs are not that bad nowadays.
    • alrlroipsp 13 days ago
      My main workstation up until 2005 or something probably had less computing power than the smartphone you use today.
      • muzani 12 days ago
        8-core 2.8 GHz, 11 GB RAM, 256 GB storage, liquid cooling, camera zoom at the level of a toy microscope. This is more powerful than some gaming PCs from just a while ago.

        It runs fine, but anything less gets laggy, so I suspect apps like Facebook and TikTok are just going to continue to swallow up any extra power.

  • ttoinou 13 days ago
    Infinite arbitrary precision real time Mandelbrot zoom generation :-)
    • ilaksh 13 days ago
      Can't you already do this with a good shader program? Well, Google search finds one that claims 'almost infinite'.
      • operator-name 13 days ago
        Only if you roll your own arbitrary-precision type on the GPU, which is much harder given the constraints.
      • ttoinou 13 days ago
        The best thing I know of: you could emulate 256 bits with 4x 64-bit floats (doubles) and then use the derivative of the Mandelbrot function to approximate the fractal around interesting points
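        The building block for that trick is error-free float addition (Knuth's two-sum). A sketch of a double-double add, which already gives ~106 significand bits from two 64-bit doubles (chaining four doubles this way gives roughly 212 bits, close to the 256 mentioned):

```python
def two_sum(a, b):
    """Error-free transformation: returns (s, e) with a + b == s + e exactly."""
    s = a + b
    v = s - a
    e = (a - (s - v)) + (b - v)
    return s, e

def dd_add(x_hi, x_lo, y_hi, y_lo):
    """Add two double-double numbers (high word + low correction word)."""
    s, e = two_sum(x_hi, y_hi)
    e += x_lo + y_lo
    return two_sum(s, e)  # renormalise so the low word is a tiny correction
```

        For example, dd_add(1.0, 0.0, 1e-30, 0.0) keeps the 1e-30 in the low word instead of rounding it away as plain double addition would.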
      • stevejobs69 13 days ago
        > 'almost infinite'

        I mean one of the fundamental attributes of infinity is that you can never be 'almost there'.

  • captaincrunch 13 days ago
    Likely we would see 8192-bit keys for SSH
  • yoyopa 13 days ago
    it would be nice for the architecture field. we deal with lots of crappy unoptimized software that's 20-30 years old. so if you like nice buildings and better energy performance (which requires simulations), give us faster cpus.

    imagine you're working on an airport. thousands of sheets, all of them PDF. hundreds or thousands of people flipping PDFs and waiting 2-3+ seconds for the screen to refresh. CPUs baby, we need CPUs.

    • mwint 13 days ago
      Is there any way I can contact you? I have an aspirational semi-related project.
  • mixmastamyk 13 days ago
    Real-time ray tracing was the goal in the old days. Are we there yet at adequate quality?
    • wtallis 13 days ago
      No, we're not there yet. Ray tracing in games is still merely augmenting traditional rasterization, and requires heavy post-processing to denoise because we cannot yet run with enough rays per pixel to get a stable, accurate render.
    • throw149102 13 days ago
      I feel like we are - I can run Minecraft RTX at 4k with acceptable framerate using DLSS 2.0 on a 3090. Minecraft is using pure raytracing (no rasterization). It also isn't using A-SVGF or ReSTIR, so there are 2 pretty big improvements that could be made.

      Minecraft RTX does suffer really badly with ghosting when you destroy a light source, but my intuition says that A-SVGF would fix that entirely.

      That being said, some of the newest techniques, like ReSTIR PT (a generalized form), have only been published for a couple of months, so current games don't have them yet. But in 3-6 months I would start to expect some games to go with a 100% RT approach.

    • orbital-decay 13 days ago
      Still orders of magnitude away from full tracing, only as a part of traditional rendering, with a ton of hacks on top.

      Actually, there has always been a lingering suspicion that brute-force simulation might get sidestepped by some other clever technique long before it's achieved, to get both photorealism and ease of creation. ML style transfer could potentially become such a technique (or not).

    • Iwan-Zotow 12 days ago
      Unlikely to be done on CPU
  • MisterSandman 13 days ago
    Much more complicated redstone CPUs in Minecraft.
  • Tepix 11 days ago
    One thing I'd like to see would be smart traffic lights. For example, as soon as a person finishes crossing the road, when there is no one else the light switches back to green immediately.
    • MH15 11 days ago
      This could totally be done with existing CV tech. Think pedestrian detection in self-driving cars.
  • Tepix 13 days ago
    Assuming that a CPU at today's speeds would require vastly less power, we would have very powerful, very efficient mobile devices such as smartwatches.

    Probably using AI a lot more, on-device for every single camera.

  • alkonaut 12 days ago
    I’d just not discover my accidentally quadratic code and ship it. It would save me a lot of debugging time.
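    For anyone who hasn't been bitten by one: the classic accidentally-quadratic shape is a linear scan hiding inside a loop, e.g.:

```python
def dedupe_quadratic(items):
    """Accidentally O(n^2): every membership test scans the whole list."""
    seen, out = [], []
    for x in items:
        if x not in seen:   # O(n) scan per element
            seen.append(x)
            out.append(x)
    return out

def dedupe_linear(items):
    """Same result in O(n) using a hash set."""
    seen, out = set(), []
    for x in items:
        if x not in seen:   # O(1) average lookup
            seen.add(x)
            out.append(x)
    return out
```

    With a 1000x CPU the first version ships just fine... until the input grows another 100x.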
  • jensenbox 13 days ago
    Your question is missing the factor of power: 1000x at current power usage, or 1000x at 1000x the power?

    Also, 1000x parallelism or 1000x single core?

  • robertlagrant 13 days ago
    Be able to run Emacs as fast as I can run Vim?
    • invisiblerobot 13 days ago
      Consider that you can easily emulate Vim inside Emacs but NOT the inverse, and you'll understand what those extra cycles do.
      • robertlagrant 12 days ago
        For sure Emacs is a great operating system. If only it came with a decent text editor!
        • maremmano 12 days ago
          OMG moment, moment. Let me get my popcorn and I'm with you guys right away.
      • mburee 11 days ago
        I'll switch to emacs the day they implement an Acme or Sam emulator, until then, ed.
  • VoodooJuJu 9 days ago
    Cheaper employees. With faster CPUs, they won't need to understand leetcode-level optimization, i.e. they won't need expensive or sophisticated training. Just find someone with a pulse and stick them in front of the computer. Less-than-ideal big-Os won't be an issue with this kind of speed.
  • domenicrosati 13 days ago
    Simulation? Like fluid dynamics. I heard that was CPU intensive.
  • frontierkodiak 13 days ago
    Incredible biodiversity monitoring— everywhere, all the time
  • nyfresh 13 days ago
    More bloat
  • alexvoda 12 days ago
    I guess it depends on what you mean by faster.

    Higher IPC, higher clock, more cores, more cache, more cache levels, more memory bandwidth, faster memory access, faster decode, etc.

    One idea I imagine would be possible with a 1000x speedup is real-time software-defined radio capture, analysis, and injection.

  • rubicon33 13 days ago
    React Native could now handle 500,000,000 3rd-party jankfest lines rather than just 100,000,000
  • valbaca 13 days ago
    If I dare to be optimistic for once, cure cancer via simulated protein folding.
  • bchelli 12 days ago
    Current encryption standards would become obsolete overnight; internet/network connectivity would become insecure.

    This would lead to complete chaos until we update our security standards.

  • legulere 12 days ago
    Less time spent in software development on optimization. That might sound horrible at first, but it also means that fewer resources need to be spent on programming something
  • bob1029 13 days ago
    Single-shard MMO with no instancing requirements.
  • tarunmuvvala 12 days ago
    As per Ray Kurzweil: https://www.kurzweilai.net/images/chart03.jpg

    With 1000x CPU computing, each computer would have computing power equivalent to the human brain.

    So brain-computer interfaces or Jarvis-like AI may become possible

  • plantain 13 days ago
    Weather forecasts would be as good as they are now, perhaps 1-2 days further ahead.
  • cutler 13 days ago
    A Ruby on Rails renaissance.
  • kramerger 13 days ago
    Windows update in the background would take 3 hours instead of 4.

    The average nodejs manifest file would contain 12,000x more dependencies.

    Also, we would see a ton more AI being done on the local CPU. Anything from genuine OS improvements to super-realistic cat filters on Teams/Zoom.

    And finally, I think people would need to figure out storage and network bottlenecks, because there is only so much you can do with compute before you end up stalling waiting for more data.

    • naikrovek 13 days ago
      we have always been memory-bound, in one way or another, even today.

      the difference in performance between an application using RAM with random access patterns and one using RAM sequentially is far bigger than you expect if you haven't actually measured it: an order of magnitude or more for sequential access over random access. having your data already in the L1 cache before you need it is worth the effort it takes to make that happen.
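      A rough way to see it even from Python (the effect is far starker in C, where interpreter overhead isn't masking the cache misses):

```python
import random
import time

N = 2_000_000
data = list(range(N))
seq_idx = list(range(N))            # visit in order
rand_idx = list(range(N))
random.shuffle(rand_idx)            # visit in a random order

def touch(indices):
    """Sum data[] in the given visit order, returning (seconds, total)."""
    t0 = time.perf_counter()
    total = 0
    for i in indices:
        total += data[i]
    return time.perf_counter() - t0, total

t_seq, total_seq = touch(seq_idx)
t_rand, total_rand = touch(rand_idx)
print(f"sequential {t_seq:.3f}s, random {t_rand:.3f}s, ratio {t_rand / t_seq:.2f}x")
```

      Same work, same result, different visit order; the random pass loses the hardware prefetcher and takes measurably longer, and a pointer-chasing workload in a compiled language widens the gap toward the order of magnitude described above.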

      • kaba0 12 days ago
        Indeed, but in the case of your average application it is not only a lack of will/expertise to optimize; it is also simply that the problem domain has a much more random memory access pattern. Most programs are not operating in a single hot loop on terabytes of data.
    • ellisv 13 days ago
      > The average nodejs manifest file would contain 12,000x more dependencies.

      This is absolutely true

    • kllrnohj 13 days ago
      > Windows update in the background would take 3 hours instead of 4.

      And macOS updates will still find a way to take your machine offline for an hour

  • 8jef 13 days ago
    How many digits of pi could be generated within X time using such a machine?
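    For scale: even a plain fixed-point Machin formula over Python's big integers (a toy next to the Chudnovsky-plus-FFT-multiplication approach used for records) fits in a few lines, and the question becomes how far the machine lets you push n:

```python
def pi_digits(n):
    """First n decimal digits of pi via Machin's formula in big-int fixed point."""
    prec = 10 ** (n + 10)  # 10 guard digits absorb truncation error

    def atan_inv(x):
        # atan(1/x) = 1/x - 1/(3x^3) + 1/(5x^5) - ...
        total = term = prec // x
        x2, k, sign = x * x, 3, -1
        while term:
            term //= x2
            total += sign * (term // k)
            k += 2
            sign = -sign
        return total

    # Machin: pi = 16*atan(1/5) - 4*atan(1/239)
    return str(16 * atan_inv(5) - 4 * atan_inv(239))[:n]
```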
  • Iwan-Zotow 12 days ago
    1000x better porno
  • quadcore 13 days ago
    Good code.
  • anigbrowl 13 days ago
    Whole brain simulation, AGI.
    • lm28469 13 days ago
      A truck with 1000000000hp still won't beat a Ferrari on a race track; nothing guarantees faster hardware would solve any of our AI problems
      • 8jef 13 days ago
        A 1000000000hp-equivalent electric truck generating that much torque would probably lift off and fly to the Moon, or dig itself so deep it would melt in lava. In the meantime, a Cybertruck with 3 motors (or 4) may soon (2023?) challenge Ferrari.
      • Salgat 13 days ago
        Training time is a massive constraint on advancement of the science, so at the very least the field would progress much faster and be much more accessible to researchers.
        • 6nf 13 days ago
          Faster processing alone won't make training 1000x faster; the bottleneck is more on the memory size/bandwidth side
          • anigbrowl 12 days ago
            I feel people are overlooking the OP's mention of parallel improvements in storage and speed of access. While there are physical limits to this, I feel like capabilities will continue to expand not so much in terms of pure speed as in better automation of parallelization and resource allocation.
          • rowanG077 12 days ago
            The thread is about the whole stack 1000x'ing. Not just processing speed.
    • cguess 13 days ago
      Still not even close to a brain though.
      • anigbrowl 12 days ago
        I think AGI requires different topological/conceptual paradigms rather than pure speed/processing capacity. But the latter is necessary to experiment and create recognizable results.

        A lot of the current excitement around AI image construction and SD's availability is the intuitive sense that these tools have succeeded in emulating some key aspects of our visual cortex - given a set of object classifiers they can create imaginary views that are recognizable to us. It's sort of an illusion - Stable Diffusion has no aesthetic or experiential preferences of its own and so its activity is reflexive rather than conscious, and we don't understand if or how consciousness is emergent from complex reflexivity.

        But the key point is that it's doing such a good job at this 'narrow' task of visual synthesis, and other models are doing such a good job at the 'narrow' tasks of textual or audible synthesis, that it's competitive with a human in an idiot-savant kind of way. And we know from our own experience that skill and learning are protean - we may disagree on the value of different types of learning, but don't question the similarity of the underlying mechanism. Thus I might think that becoming an expert on, say, the fictional universe of Star Wars is a waste of time, but the process of knowledge acquisition, recall, and synthesis are not fundamentally different from those used to learn history or engineering ('experimentation' can exist in terms of consensus establishment in a fandom about whether an innovation is canonical or parodic).

        So if we can train models with a billion semantically-tagged media objects and have them generate new media objects that meaningfully reflect the tags we supply, it means we have a decent general environmental-feature detection, recall, and resynthesis tool. Being able to take an existing model and tune it on workstations instead of needing a whole datacenter substantially widens the field of possibilities. So what happens if we connect it to sensors and actuators and train our model to navigate a dynamic landscape, which includes 'internal' signals that can't be directly responded to? Consider a virtual or lab environment which is complex and dynamic, and includes energy units (batteries). Our model has internal batteries and feedback mechanisms, but their state can only be altered through external activity and their signals are heavily weighted. Sensory subsystems attached to the model have some precomputed models of their own.

        My idea is that the brain is a 'system of systems' and that consciousness emerges from the instrumentation of the time cost of model tuning vs the rate of environmental variation.

      • Iwan-Zotow 12 days ago
        depending on the brain
    • klysm 13 days ago
      We can’t do those slowly though
  • tiernano 13 days ago
    Java might run at a decent speed... Might, but probably won't (jk, sorry, I couldn't help myself...) [edit: Grammarly decided to remove some text when fixing spelling...]
    • invisiblerobot 13 days ago
      Java runs at 90% the speed of C for most common benchmarks.

      It uses 50x the RAM to do so. But you're dead wrong to think Java is slow.

      The only reason physics game engines are written in C++ is because physics game engines are written in C++.

      • solomatov 13 days ago
        >Java runs at 90% the speed of C for most common benchmarks.

        It's not 90%; it's more like several times slower: https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

        >The only reason physics game engines are written in C++ is because physics game engines are written in C++.

        They are written in C++ because of latency requirements that are nearly impossible to meet in a GCed language.

        • seer-zig 7 days ago
          From what I know, the major C++ engines (Unity, Unreal) have GC in them. Using GC does not automatically mean that latency is out the window.
      • tiernano 12 days ago
        You missed the jk (joking) part, didn't you? The only Java apps I use are JDownloader and ikvm apps for servers... and well, they are slow...
        • throwaway019254 11 days ago
          Have you ever used any service from AWS? Then you were using Java.