Cracking Random Number Generators Using Machine Learning

(research.nccgroup.com)

169 points | by Hard_Space 926 days ago

14 comments

  • _hl_ 926 days ago
    What they are doing is essentially encoding the binary circuit of the xorshift128 PRNG as a neural network. The fact that you can encode arbitrary binary circuits as neural networks is well-known, so it's not surprising that it is possible to do this.

    The interesting and perhaps surprising result is the demonstration that it is possible to train this network using standard gradient methods, provided you choose the proper loss function for the problem and design the network with the required domain knowledge of the algorithm it is supposed to model. In other words, this can be seen as an example of how well-educated network design and choice of loss function massively impact the performance of your fancy ML model.

    For more "real-world" problems, it is often very difficult to come up with a network design that encodes the implicit constraints of the problem, mostly because these constraints are not even known (that's why we use ML!), but results like the recent AlphaFold show that even for these problems, thoughtful design of the model architecture goes a very long way.

    • SeanLuke 926 days ago
      > What they are doing is essentially encoding the binary circuit of the xorshift128 PRNG as a neural network. The fact that you can encode arbitrary binary circuits as neural networks is well-known, so it's not surprising that it is possible to do this.

      McCulloch-Pitts aside, I think this is unfair. What's interesting to me is that they can predict a recurrent algorithm with internal state using a trivial NON-recurrent feedforward network without any internal state.

      • jameshart 926 days ago
        It looks like this specific algorithm can easily be deterministically reversed - the XORs and bitwise shifts mean that, given four output numbers, you can completely determine what state it was in before the numbers were generated - and, in fact, that there is probably a simple series of bit shifts and XORs you can perform on the last four outputs that produces the next number.

        Bit shifts and XORs are very much the kind of pattern neural networks should excel at learning - they mean that each output bit is a simple linear function of the input bits.

        Actually doing it is still a good demonstration! But the fact that the function in question is used as a PRNG doesn’t imply that other PRNGs would be susceptible to a similar approach.
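
        For reference, a minimal sketch of xorshift128 (assuming the standard Marsaglia shift constants 11, 19, 8, which is what the article appears to target). The whole state is just the last four 32-bit words, and every call returns the new w, so four consecutive outputs literally are the state:

          // state = the last four 32-bit outputs (seed values here are arbitrary placeholders)
          uint x = 1, y = 2, z = 3, w = 4;

          uint Xorshift128()
          {
              uint t = x ^ (x << 11);
              x = y; y = z; z = w;              // slide the window of past outputs
              w = w ^ (w >> 19) ^ t ^ (t >> 8); // the new w is the next output
              return w;
          }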

        • 317070 926 days ago
          XOR is not linear in the inputs. I mention this, because Minsky's paper used that fact to show that perceptrons could never work as AI. This pretty much started the first AI winter in 1973.

          It's an interesting story: https://towardsdatascience.com/history-of-the-first-ai-winte...

          • ajb 926 days ago
            Linearity is actually relative to the algebra you are using. XOR is not linear in real arithmetic but it is in GF(2). This is useful because some of the algorithms work across many algebras, and are useful for different problems. E.g., the tropical algebra.

            But you are quite correct about the AI winter.
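
            As a tiny sanity check of that GF(2) point (an illustrative sketch only; the shift amounts 11 and 8 are arbitrary): an xorshift-style update is built from shifts and XORs, and since shifts distribute over XOR, the update is linear over GF(2), i.e. T(a xor b) = T(a) xor T(b):

              using System;

              // A shift-and-XOR update, as used in xorshift-style generators.
              uint T(uint v) => v ^ (v << 11) ^ (v >> 8);

              uint a = 0xDEADBEEF, b = 0x12345678;
              Console.WriteLine(T(a ^ b) == (T(a) ^ T(b))); // True - and it holds for every a, b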

      • gus_massa 926 days ago
        They can see the last 4 results, not only the last result. That's somewhat equivalent to partially seeing the secret internal state.

        As a bad metaphor: Let's suppose that you enter a lift and an NN without internal state must predict which button you will press, but now consider the case where the NN can see a video of everything you have done during the last year. Do you think it's possible?

        • _hl_ 926 days ago
          > That's somewhat equivalent of partially seeing the secret internal state

          Not just partially, xorshift128 is designed such that the output depends only on the last 4 results. So this is "just" learning the output function of xorshift as a neural network.

          • SeanLuke 926 days ago
            Perhaps you might consider how big that output function ideally is, then compare to the capacity of the neural network they used and the number of samples provided.
            • _hl_ 926 days ago
              My reading of it is that it's essentially one-to-one (up to minor constant factors): they took the binary circuit for xorshift128, replaced the XOR gates with functionally equivalent NN blocks, and then trained the weights.
        • SeanLuke 926 days ago
          That's only ~365 values. XORshift32 has a period of 2^32-1, a good portion of that (I'd imagine) unique. If XORshift had really good randomness properties, at the extreme we'd expect that the NN would have to special case all of them. What this implies is that XORshift has easily cracked, nonrandom, repeatable patterns (which we suspect of course since it fails certain tests) and that the NN has identified them.
          • rurban 926 days ago
            We knew that before the two ML attempts. You just need to look at the source. There are several other trivial PRNGs with the exact same properties. Never use a trivial PRNG for security or seeding.
    • downandout 926 days ago
      So would this model be able to predict any imperfect PRNG with some degree of accuracy, or just xorshift128? For the purposes I'm thinking, even 1% accuracy above purely random would suffice.
      • IAmGraydon 926 days ago
        Financial market prediction is where my mind went too.
        • danbmil99 925 days ago
          Good luck with that. Financial markets are not PRNGs -- they are more like an adversary you are playing poker with. Think game theory, rather than pattern matching.
          • downandout 925 days ago
            I was actually considering something known to be a relatively weak PRNG... card shuffling in a casino setting.
            • tatersolid 925 days ago
              Manual or modern automated shuffles in a casino are NOT weak. Casino owners and employees aren’t dumb. They actually understand the math involved quite well.
              • downandout 919 days ago
                > Casino owners and employees aren't dumb

                Having met a few casino owners, and many more game supervisors actually charged with understanding the math of the games to avoid exploitation, I can tell you this is patently false.

    • More-nitors 926 days ago
      hm can this be applied to AES?
      • bawolff 926 days ago
        I'm pretty sure it can't
    • cookiengineer 926 days ago
      This whole project reads as a beginner's guide to ML and what it can do.

      I mean, it's nice and all but at some point I would've expected at least a mention that this all was solvable with a quantized neural network with low bit precision as well.

      Most of the article was about LSTMs and the recurrent design of those is just a very inefficient way to solve the problem at hand.

      I would've expected an LSTM attempt at something like a seed-based generator such as MT19937, where the unrolled layers are trained on the seed itself, with the idea that they learn how to predict the seed's state for the next iteration.

      Something like this would be really important research, especially now that time-based one-time passwords are used everywhere and the cryptographic security of how their seeds are generated is important.

      • sillysaurusx 926 days ago
        > I mean, it's nice and all but at some point I would've expected at least a mention that this all was solvable with a quantized neural network with low bit precision as well.

        This doesn't matter, so I'm not sure why you were expecting it.

        I'm struggling to be diplomatic, so I'll leave it at that.

        • nsonha 926 days ago
          the rest of their comment sounds insightful
  • sillysaurusx 926 days ago
    This is pretty hilarious. I just noticed that this is published by NCC Group. (Former NCC Group pentester here, and now I happen to be an ML engineer.)

    I don't think HN's negativity is warranted. But there's also some confusion, which I'd like to help clear up.

    People here seem confused about "Why? Why do this?"

    The answer is simply that it's an interesting problem. That's it.

    It doesn't need to be important, or a promising research direction, or mention low bit precision, or quantized networks, or be performant, or tackle a complex PRNG.

    It just needs to be an interesting problem which you can solve. Love of ML is no more complicated than that.

    Knowing the fellas at NCC, I'm quite certain that they did this mostly out of intellectual curiosity rather than some grand plan to advance the state of the art of ML. And I think that's wonderful. More people should get into ML with exactly that mindset.

  • Hendrikto 926 days ago
    The underlying problem seems to be that the PRNG they are emulating is giving away its internal state. I wouldn't expect this to work for PCG, for example.
    • ChrisLomont 926 days ago
      It would also work for PCG. You're simply modeling the PRNG and learning the internal state from outputs.

      It's even easier using Z3 or your own SAT solver. I've broken pretty much all non-crypto PRNGs using Z3, and it's trivial to do so. You simply model the PRNG in Z3, with unknown constants, plug in a few values from the output (or even parts of the output, or even non-consecutive ones, anything you derive from them, etc. - just model how the values were formed), and ask Z3 to solve for the internal state. It works every time.

      • shaicoleman 926 days ago
        I don't think this will work on PCG.

        "pcg32, which has state-space size of 2^127 (2^64 period × 2^63 streams) and produces 32-bit outputs."

        https://www.pcg-random.org/predictability.html

        That said, the PCG author does not recommend using it for cryptography.

        "Although it is less trivial to predict than many mainstream generators, that does not mean it should be considered crypographically secure. Exactly how hard it is to break different members of the PCG family is unknown at this time."

        https://www.pcg-random.org/other-rngs.html

        • ChrisLomont 925 days ago
          It trivially works on PCG. The problem is a boolean satisfiability problem, and these are solvable on far harder instances.
      • tubby12345 926 days ago
        How do you pick how many constants and their relationships?
        • Someone 926 days ago
          I think that’s the “You simply model the PRNG in Z3” part.

          Whether that's simple is debatable, but it would be if, for example, somebody open-sourced the server side of their poker-dealing site.

          • ChrisLomont 926 days ago
            Also, to break an unknown/unseen PRNG, model a few popular classes of PRNG and a few common reductions (so for cards, the naive val%52 and the better rejection sampling; see the sketch below), then solve across the states until you get one. Then watch a few more outputs, check that you guessed the source and reduction right, and now you've cracked that too.

            I doubt any poker site uses anything other than a crypto rand for deals because the technique I just explained would eat them alive. They can use a fast, insecure PRNG for doing Monte Carlo hand evaluation, though, since that doesn't give you any advantage.
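
            To make the "reductions" part concrete, here is a rough sketch of the two card-draw reductions mentioned above (naive modulo vs rejection sampling), assuming the generator hands you raw 32-bit values; the helper names are made up for illustration. When attacking an unseen implementation you'd model both and keep whichever is consistent with the observed cards:

              using System;

              // Naive reduction: slightly biased, because 2^32 is not a multiple of 52.
              int NaiveCard(Func<uint> rng) => (int)(rng() % 52);

              // Rejection sampling: throw away the top "partial" range so every card is equally likely.
              int RejectionCard(Func<uint> rng)
              {
                  uint limit = uint.MaxValue - (uint.MaxValue % 52); // largest multiple of 52 that fits
                  uint v;
                  do { v = rng(); } while (v >= limit);
                  return (int)(v % 52);
              }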

            • downandout 926 days ago
              So building on this...many physical casinos use a manual shuffle, especially at higher limits, as bigger players tend to distrust shuffle machines. Even some online casinos with live dealers will shuffle the previous shoe on camera, manually.

              Since getting a truly random shuffle would take too long, this is a poor method of getting random results. Would it be possible to, at least to some degree, predict the order of the cards using one of these models, given the order of the cards prior to shuffling? I'm thinking something along the lines of being able to say that the odds of a given card being the next one are higher than they should be, not outright predicting the full order of the cards.

              • ChrisLomont 926 days ago
                If I recall, something like 7 hand shuffles becomes indistinguishable from random. I'd guess a pro could get this level of mixing in a few shuffles.
                • downandout 926 days ago
                  Yeah, but that's 7 shuffles of a single deck. Most casinos use 6 or 8 decks. They are certainly not shuffling 42+ times.
                  • ChrisLomont 925 days ago
                    Shuffles don't multiply like that. Mixing is an exponential process. It's why mixing a little paint or a lot of paint isn't multiplicatively more work.
          • ChrisLomont 926 days ago
            It's quite simple. Z3 supports every basic integer operation out of the box. You literally write the same code as the original PRNG. Instead of a uint32, or uint64, or float32, etc., you use the Z3 equivalent type. That's it. You define the state as unknown, but the same Z3 type as in C or JavaScript. Then you run the function once. Set it equal to the output you see, repeat for a few observed outputs, then tell Z3 to solve for the original unknown state.

            It's absolutely trivial, taking a few lines of code.

            I've broken all sorts of code using similar tools.

            • tmoertel 926 days ago
              Do you have a worked example of finding a PRNG's internal state that you could share? (Most people are unfamiliar with Z3 and other SMT solvers. Seeing a relevant worked example would be very illustrative.)
              • ChrisLomont 926 days ago
                Yeah, I was considering posting one. I've lately switched to Z3 under C# since it's so easy to do things... If I get around to it tonight I'll make a post.
                • ChrisLomont 925 days ago
                  Here's a simple example finding a hidden seed. It takes around 1, 2, or 3 values in a row and finds the seed. You can adapt this to all sorts of breaking problems, but for some you'll have to learn more about SATs and be careful how you code things. Note the simple PRNG I chose doesn't reveal all the state, yet over time you can deduce all of it.

                    // Code to demonstrate finding the hidden seed of a PRNG with Z3
                    // Chris Lomont Oct 2021
                    // Answering this comment on Hacker News https://news.ycombinator.com/item?id=28886698
                  
                    // Start by getting Z3, a theorem prover from Microsoft, via nuget. See https://github.com/Z3Prover/z3
                    using Microsoft.Z3;
                    using System;
                  
                    // get a solution
                    ulong Solution(Context ctx, BoolExpr f, BitVecExpr seed)
                    {
                      Solver s = ctx.MkSolver();
                      s.Assert(f); // assert constraint into solver
                      if (s.Check() == Status.SATISFIABLE)
                      {
                          var m = s.Model;
                          Expr e = m.Evaluate(seed);
                          return ulong.Parse(e.ToString());
                      }
                      else
                          throw new Exception("Fatal error");
                  
                    }
                  
                    // simple PRNG, used in many old C libs
                    UInt32 next = 0x1234;
                    UInt32 rand()
                    {
                      next = next * 214013 + 2531011;
                      return (next / 65536) % 32768;
                    }
                  
                    void Test()
                    {
                      var R = new Random();
                      next = (uint)R.Next();
                      var start = next;
                      Console.WriteLine($"Starting seed {next:X8}");
                  
                      using (var ctx = new Context())
                      {
                          // 32 bit vector type
                          Sort bv_type = ctx.MkBitVecSort(32);
                  
                          // unknown seed - we will solve for this
                          BitVecExpr seed = ctx.MkBVConst("seed", 32);
                  
                          // constants used in the PRNG
                          BitVecExpr c214 = (BitVecNum)ctx.MkNumeral("214013", bv_type);
                          BitVecExpr c253 = (BitVecNum)ctx.MkNumeral("2531011", bv_type);
                          BitVecExpr c65536 = (BitVecNum)ctx.MkNumeral("65536", bv_type);
                          BitVecExpr c32768 = (BitVecNum)ctx.MkNumeral("32768", bv_type);
                  
                          // track the boolean to satisfy
                          BoolExpr be = ctx.MkTrue();
                  
                          var next = seed; // start here
                          int pass = 0;
                          while(true)
                          {
                              ++pass;
                              if (pass > 10) break;
                              // the sampled random
                              var r = rand();
                              BitVecExpr rndBV = (BitVecNum)ctx.MkNumeral(r, bv_type);
                  
                              // make this code using Z3 variables
                              //next = next * 214013 + 2531011;
                              //return (next / 65536) % 32768;
                              next = ctx.MkBVAdd(ctx.MkBVMul(next, c214), c253);
                              var ret = ctx.MkBVURem(ctx.MkBVUDiv(next, c65536), c32768);
                  
                              // add a constraint boolean for this "Z3 ret = returned value"
                              be = ctx.MkAnd(be, ctx.MkEq(ret, rndBV));
                  
                              // get solution
                              var val = Solution(ctx, be, seed);
                              // todo - can check unique by adding "seed != soln" to the boolean and checking
                              Console.WriteLine($"{pass} : 0x{val:X8}");
                              if (val == start)
                                  break; // done - can also break when shown unique
                          }
                      }
                    }
                    Test();
                  • tmoertel 924 days ago
                    Thanks! That example really helps to crystallize the technique :-)
            • tubby12345 926 days ago
              Oh gotcha. I thought you meant that you were somehow modeling the PRNG using some reduced set of constraints.
  • arendtio 926 days ago
    I think this is a very good idea: both topics (random number generators and AI) are hard to fully understand, and pitting them against each other could be used to teach both with increasing difficulty.
  • ChuckMcM 926 days ago
    Can't wait for the first time a couple of thousand people all pick the winning lottery numbers :-)

    This looks like excellent work.

  • swayvil 926 days ago
    They need to try this out on one of those natural RNGs. Electron noise or a lava lamp.

    Maybe they'll discover something.

    • maweki 926 days ago
      The only thing you could discover is that it's not in fact a natural RNG, but a PRNG.
      • cblconfederate 926 days ago
        If a hidden variables model is found, it would be the discovery of the century
      • tlholaday 926 days ago
        Strong evidence, perhaps dispositive evidence, for the Simulation Hypothesis.
        • JackFr 926 days ago
          No it isn’t.
          • swayvil 926 days ago
            A fundamentally entropic phenomenon is discovered to be patterned. It would be like looking through a microscope and seeing voxels.

            How is this not strong evidence for simulation?

            • rsfern 926 days ago
              Isn’t the quantum nature of reality already like looking through a microscope and finding voxels?

              I don’t see why finding underlying structure in what we expect to be pure noise has to mean our reality is a simulation. I guess it could be consistent with that, but couldn’t it also be consistent with some previously undiscovered physics, much like the development of quantum mechanics?

            • JackFr 926 days ago
              Intellectually the simulation hypothesis is creationism’s hipster cousin.

              It’s not a falsifiable hypothesis. It makes so many implicit assumptions it’s scarcely worth talking about.

              It’s fake smart and not interesting.

              • swayvil 926 days ago
                Have you heard the one about the rat in the glove? Looks like a hand. Wiggles like a hand. All rat.
            • chaboud 926 days ago
              It could just as easily be evidence of a previously non-understood physical phenomenon or mechanism being recognized by new methods.

              Of course, that’s exactly the kind of thing a simulation would type!

  • deegles 926 days ago
    Since SHA-1 is known to be broken, could this be adapted to create collisions on demand? How much training would be required to break SHA256?
    • tptacek 926 days ago
      No, you can't use ML to break SHA256.
  • phenkdo 926 days ago
    How is ML being used to generate more randomness in the PRNGs? Wouldn't the opposite be a more interesting research idea?
  • joe__f 926 days ago
    What are the potential implications of this?
    • IncRnd 926 days ago
      None. This has been trivial to break for a long time. It is not considered a cryptographically secure random number generator.
      • joe__f 926 days ago
        Is it potentially useful for something outside of cryptography? I was watching a video on speedrunning where they were gaming the RNG somehow to improve their times
        • upofadown 926 days ago
          Predictable bit streams that look random are useful for communications. Satellite navigation systems like GPS, for instance, use such streams where the whole idea is for a receiver to sync up on the bitstream.

          An example where no one even cares exactly what the bitstream is can be found on motherboards, where a bitstream is used to jitter the system clock to spread out any radio interference caused by the motherboard.

          • joe__f 923 days ago
            These are some interesting uses of pseudorandom bit streams, but I didn't understand why being able to reverse-engineer the streams using ML in these instances would be useful.
          • _hl_ 926 days ago
            > to jitter the system clock to spread out any radio interference caused by the motherboard

            Fascinating! That seems very counterintuitive, where can I learn more about this?

  • bionhoward 926 days ago
    Could this make a more performant PRNG for generative models, random simulations, etc?
  • SavantIdiot 926 days ago
    I'd like to see a neural net have a go at a DRBG, such as one using an HMAC/SHA256 and PBKDF2, a cryptographically strong pseudo-random bit generator. Is it possible there's something the analytic methods behind the RFC didn't catch that a neural net could?
    • tptacek 926 days ago
      There's been over a decade of research into machine learning cryptanalysis. It's not a new idea.
      • SavantIdiot 925 days ago
        If we knew everything we wouldn't be reading this website. Or do you already know everything and just like being dismissive?
        • tptacek 925 days ago
          Is there more information you’re looking for? Beyond what I said, I could have added “it has not been broken.”
  • smitty1e 926 days ago
    If machine learning is ultimately pattern recognition, then, is random number generation the antithesis of machine learning?
    • willis936 926 days ago
      Perhaps the machine is hallucinating.

      https://en.m.wikipedia.org/wiki/Pareidolia

    • SilasX 926 days ago
      I would say “dual” rather than antithesis, in that [machine] learning involves finding regularities, while RNGs (like encryption) seek to obscure them, but yes.
  • cblconfederate 926 days ago
    ANNs are function approximators so it's trivial for them to learn a mathematical function like a PRNG. The seed is irrelevant to the ANN; it just learns the function's sequence, regardless of where it started. Which makes the interesting question: how many people are already cracking PRNGs with ANNs?
    • seanhunter 926 days ago
      I could see how they could learn to approximate the function if they could observe the internal state, but without being able to see the state, if it's a strong PRNG how would they be able to do it?

      Edit: Reading the paper, they're approximating xorshift, which as I understand it is fast and efficient but has very little internal state and is not a strong PRNG.

      • cblconfederate 926 days ago
        Would it be called pseudorandom if it had an undecipherable internal state such as temperature or something? Aren't all PRNGs deterministic functions by definition?
        • adrian_b 926 days ago
          For some PRNGs it is intended that the output values must be unpredictable, e.g. for using in cryptographic applications.

          For such PRNGs, you can either view them as having a known state and an unknown output function or you can view them as having a known output function and a state composed of a known part and of an unknown, secret, part.

          The second point of view is more general, as the first case can always be reduced to the second, because the unknown output function must be derived using known functions from a secret key, so you can consider the secret key as the part of the state that is unknown.

          (There is a third possible structure for a PRNG, with known state, known output function and unknown transition function, but this is obsolete for making unpredictable PRNGs, as it has inferior properties).

          So yes, a pseudo RNG that has a part of its state that remains unknown may produce an output sequence from which it is computationally infeasible to determine its complete state.

          Nonetheless, an unpredictable PRNG remains a PRNG, not a true RNG. You can reproduce its output sequence at any time if you know the secret part of the state, e.g. when you are the recipient of an encrypted message.

          If a ML method would be able to recover the secret part of the state when given the output sequence, obviously that PRNG would be considered broken.

          Many PRNGs are designed for high-speed in non-cryptographic applications, e.g. simulations, and for those PRNGs it is usually very easy to determine the internal state from the output sequence, without the need of any ML.

          • joshuaissac 926 days ago
            There are CSPRNGs like Fortuna that contain an entropy accumulator, so even if the secret part of the state is compromised, it can recover.
            • adrian_b 926 days ago
              True, but they belong to a different class, which are used for applications like key generation, where there is no need to ever reproduce a certain output sequence.

              Such RNGs are used instead of true RNGs, which might not be able to provide the required random numbers fast enough. They cannot be used instead of normal PRNGs in most of their applications.

              Unpredictable cryptographic PRNGs can be used instead of any other PRNG for simulations, games, Monte Carlo integration, etc., and they are actually better than simpler predictable PRNGs, except that they might be too slow.

              On modern CPUs with hardware AES instructions, any traditional PRNG that is not faster than computing AES has become obsolete, as using AES with a counter provides a random sequence of higher quality.

              • rurban 926 days ago
                Wrong, as we can see with x64's new RDRAND, which relies on fast AES-NI but is terribly broken.
        • saithound 926 days ago
          Generally, PRNGs have an internal state, a transition function that determines a new internal state based on the current internal state, and an output function that determines the output value based on the current internal state.

          In most cases, the output is determined by the output function based on a small part of the state (e.g. you may have 128 bits of state, feed the first 64 bits of it to an output function, and derive a 32-bit output as a result). This means that you don't learn much about the internal state of the generator merely by looking at an output value: you need to observe a lot of consecutive outputs to recover the internal state of the generator, and without knowing the internal state it's difficult (i.e. very hard to impossible) to use a relatively small number of known outputs to predict the next output.

          The xorshift128 generator tested in the article has a nigh-trivial output function (because it's designed to be fast, not hard-to-predict) and a very simple transition function, which is why 4 outputs can be used to recover the internal state and to train a NN to predict the next output.

          • cblconfederate 926 days ago
            But ANNs are universal approximators; they don't care if a function is recursive or anything, as long as it's deterministic.
            • saithound 926 days ago
              The point is precisely that predicting the next output based on a number of known outputs is not deterministic. Predicting the next _internal state_ from the current _internal state_ is deterministic, but that doesn't help, since you don't have access to the internal state.

              Here's a simple random number generator. You start with a 16-bit number x (your internal state). At each step you update your internal state to 75x + 74 (mod 65537) and output the first decimal digit of your internal state (your random number).

              Here's a sequence of outputs: 6,5,3,3,3,5. Does this information allow you to predict the next output? No.

              For example, if the generator's starting state is x=60160, then the generator's internal state evolves as follows: [60160,55558,38093,38958,38296,54183,505]. So the seventh output will be 5. However, if your generator's starting state was instead x=6, then the generator's state evolves as follows: [6,524,39374,3959,34851,57956,21332]. So the seventh output will be 2. The relationship between the first six outputs and the seventh output is not deterministic.
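
              A quick sketch you can run to check those two trajectories (same toy generator as above; the Outputs helper is just an illustrative name, and the "output" is the leading decimal digit of the state):

                using System;

                // state update: x -> 75*x + 74 (mod 65537); output = first decimal digit of the state
                int[] Outputs(long seed, int count)
                {
                    long x = seed;
                    var outs = new int[count];
                    for (int i = 0; i < count; i++)
                    {
                        outs[i] = x.ToString()[0] - '0'; // leading digit of the current state
                        x = (75 * x + 74) % 65537;
                    }
                    return outs;
                }

                Console.WriteLine(string.Join(",", Outputs(60160, 7))); // 6,5,3,3,3,5,5
                Console.WriteLine(string.Join(",", Outputs(6, 7)));     // 6,5,3,3,3,5,2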

              • maple3142 925 days ago
                > Here's a sequence of outputs: 6,5,3,3,3,5. Does this information allow you to predict the next output? No.

                Your PRNG looks like a truncated LCG; it could be broken if enough outputs are known. For example, 16 bits could be brute-forced in this case. If the internal state has more bits, then depending on how many bits the outputs provide, you could probably break it with lattice methods (https://crypto.stackexchange.com/questions/37836/problem-wit...).

                • saithound 925 days ago
                  This PRNG is a truncated LCG. It can be "broken" if enough outputs are known. But that's not news: _any_ PRNG can be predicted if enough outputs are known.

                  For this specific generator, knowing 6 outputs is not enough to pin down the seventh output. Not by using a NN, not by brute-forcing the possible starting seeds, not by lattice methods, not by solving a linear system over a finite field. Unless you know the internal state of the generator, you won't be able to predict the seventh output. And the first 6 outputs simply don't reveal enough information to recover the internal state.

              • cblconfederate 926 days ago
                The function is still deterministic. ANNs can learn the source code of the function; they're not looking for an analytical solution to the forward or inverse problem.
                • roywiggins 926 days ago
                  The point is, you can be handed 1) the entire source code of the PRNG, and 2) six outputs from it, and it's impossible to know what the starting state of the PRNG was because six outputs could have come from many different starting states. You could brute force every single starting state and find every one that would produce those 6 outputs but it still doesn't let you know the next value.
    • CodesInChaos 926 days ago
      1. For many functions ANNs will never converge to an approximation of that function given only (input, output) tuples, even if the function is simple enough that the ANN could represent it. Chaotic functions rarely have a useful gradient the NN can follow.

      2. To break a PRF it'd need to learn its inverse. For many of them it's unlikely that a small circuit computing the inverse even exists, so a non-iterated NN couldn't even represent the desired function, let alone converge to it.

      • cblconfederate 926 days ago
        I don't know much about cryptography, but why would it need to learn its inverse? I mean, a theoretically absurdly large/wide network can approximate almost any function. Are PRFs resistant to that?
        • _hl_ 926 days ago
          The underlying assumption of all of modern cryptography is that one-way functions exist (implying P != NP), and further that the algorithms we use in practice are actually instances of these theoretically hard problems. So there provably always exists an algorithm that breaks any given PRNG, it will just always take you very long to compute and/or a lot of queries to the PRNG. ANNs are nothing special in that regard, they are just algorithms and have the same theoretical lower bound on their runtime.

          So yes, an absurdly large ANN will break any PRNG, but so will other absurdly long-running algorithms, some of which are quite trivial: Just try every possible seed until you get the same sequence as what you observe, repeat until only one candidate seed is left.

          EDIT: To add to this, you seem to be referring to the universal approximation theorems for ANNs. These theorems state that for any function (subject to some conditions not relevant here) and any arbitrary approximation ratio, there exists a finite ANN that approximates the function to within the desired approximation ratio. They say nothing about whether it's possible to train such an ANN, merely that it exists. Which in this case is a fairly trivial result: you could feasibly create a look-up table of PRNG sequences to seeds and encode that as an enormous ANN. But finding/training this ANN is prohibitively expensive.

          • cblconfederate 926 days ago
            I'm not sure if there is any guarantee on the size of the network required. Problems like vision or language were also very complex and nonlinear, but ANNs were able to handle them. It is analytically difficult to find an inverse function, but the ANNs are not trying to do that, merely approximating the source code of the PRNG as piecewise-nonlinear functions. So I wouldn't think the former problem (inverse) is informative about the difficulty of the ANN training. At least I don't know if there's theoretical work on designing ANN-hard cryptographic functions.
            • _hl_ 926 days ago
              The argument is fairly straightforward: Let n be the number of seed bits of the CSPRNG. Assume there exists an ANN that approximates the CSPRNG so that the number of possibly wrong output bits is at most polylogarithmic in n (otherwise you haven't really gained much insight from the ANN). Assume further that the ANN is sufficiently small so that a forward pass through the ANN takes at most polynomial time in n. Then you can create a polynomial-time algorithm that (i) runs the ANN on the observed CSPRNG samples/by querying the oracle, and then (ii) brute-forces the possible error bits in polynomial time, thereby recovering the full seed. But this is a contradiction with the assumption that the CSPRNG is secure, i.e. that it admits no polynomial-time adversary, q.e.d.
              • cblconfederate 926 days ago
                The difference is that ANNs make approximations. It could be manageably small and learn an approximation of the PRNG code that succeeds for a large number of outputs, even if running a fully accurate network is impossible. I think the complexity of the PRNG recursive algorithm, when unrolled through time for a large number of steps, is the relevant complexity here. The ANN is not necessarily trying to devise a new function that recovers the seed.
                • _hl_ 926 days ago
                  The proof accounts for that. You can even allow for the ANN being totally wrong on "almost all" seeds except for a non-negligible fraction; the definition of a CSPRNG allows for that. If it is wrong for all but a negligible fraction of seeds, then you haven't gained much over random guessing. So the only way that an ANN could help is if the ANN approximation is sufficiently bad that it doesn't give you any significant speedup in cracking the CSPRNG.

                  Of course, this is assuming that practical CSPRNGs conform to the theory, but we're just debating theory here.

        • MauranKilom 926 days ago
          > I mean a theoretically absurdly large/wide network can approximate almost any function

          Yes. But chances are that the network size scales exponentially with the number of e.g. input bits. Good luck building/training/storing a model with 2^128 neurons in the hidden layer.

    • ChrisLomont 926 days ago
      They're already trivial to crack 100% with Z3. Using something fuzzy like a neural net is by far not an optimal way to break it.
    • visskiss 926 days ago
      Uhh yeah right.
      • saithound 926 days ago
        cblconfederate happens to be right in this specific case (although wrong about using neural nets for predicting PRNGs in general): the author uses four consecutive non-truncated outputs of the xorshift128 generator as the network input, and learns to predict the next output. Since four consecutive outputs expose the entire internal state of the xorshift128 generator, this amounts to little more than approximating a (very simple) four-variable function.