Some Epstein file redactions are being undone with hacks

(theguardian.com)

182 points | by vinni2 7 hours ago

22 comments

  • cmarschner 9 hours ago
    Befuddling that this happened again. It’s not the first time:

    - Paul Manafort court filing (U.S., 2019) Manafort’s lawyers filed a PDF where the “redacted” parts were basically black highlighting/boxes over live text. Reporters could recover the hidden text (e.g., via copy/paste).

    - TSA “Standard Operating Procedures” manual (U.S., 2009) A publicly posted TSA screening document used black rectangles that did not remove the underlying text; the concealed content could be extracted. This led to extensive discussion and an Inspector General review.

    - UK Ministry of Defence submarine security document (UK, 2011) A MoD report had “redacted” sections that could be revealed by copying/pasting the “blacked out” text—because the text was still present, just visually obscured.

    - Apple v. Samsung ruling (U.S., 2011) A federal judge’s opinion attempted to redact passages, but the content was still recoverable due to the way the PDF was formatted; copying text out revealed the “redacted” parts.

    - Associated Press + Facebook valuation estimate in court transcript (U.S., 2009) The AP reported it could read “redacted” portions of a court transcript by cut-and-paste (classic overlay-style failure). Secondary coverage notes the mechanism explicitly.

    A broader “history of failures” compilation (multiple orgs / years) The PDF Association collected multiple incidents (including several above) and describes the common failure mode: black shapes drawn over text without deleting/sanitizing the underlying content. https://pdfa.org/wp-content/uploads/2020/06/High-Security-PD...

    • throwup238 2 hours ago
      > - Associated Press + Facebook valuation estimate in court transcript (U.S., 2009) The AP reported it could read “redacted” portions of a court transcript by cut-and-paste (classic overlay-style failure). Secondary coverage notes the mechanism explicitly.

      What happens in a court case when this occurs? Does the receiving party get to review and use the redacted information (assuming it’s not gagged by other means) or do they have to immediately report the error and clean room it?

      Edit: after reading up on this it looks like attorneys have strict ethical standards to not use the information (for what little that may be worth), but the Associated Press was a third party who unredacted public court documents in a separate Facebook case.

      • irishcoffee 1 hour ago
        My guess would be that if the benefitting legal party didn't need to declare they also benefitted from this (because they legally can't be caught, etc.) they wouldn't.

        I know and am friends with a lot of lawyers. They're pretty ruthless when it comes to this kind of thing.

        Legally, I would think both parties get copies of everything. I don't know if that was the case here.

    • ricksunny 25 minutes ago
      The covid origins Slack messages discovery material (Anderson & Holmes) were famously poorly redacted pdfs, allowing their unredacting by Gilles Demaneuf, benefiting all of us.
    • JumpCrisscross 31 minutes ago
      "There are major differences between the Trump 1.0 and 2.0 administrations. In the Trump 1.0 administration, many of the most important officials were very competent men. One example would be then-Attorney General William Barr. Barr is contemptible, yes, but smart AF. When Barr’s DOJ released a redacted version of the Mueller Report, they printed the whole thing, made their redactions with actual ink, and then re-scanned every page to generate a new PDF with absolutely no digital trace of the original PDF file. There are ways to properly redact a PDF digitally, but going analog is foolproof."

      https://daringfireball.net/linked/2025/12/23/trump-doj-pdf-r...

      • netsharc 26 minutes ago
        It's like Russian spies being caught in the Netherlands with taxi receipts showing they took a taxi from their Moscow HQ to the airport: corrupt organizations attract/can only hire incompetent people...

        https://www.vice.com/en/article/russian-spies-chemical-weapo...

        Anyone remember how the Trump I regime had staff who couldn't figure out the lighting in the White House, or mistitled Australia's Prime Minister as President?

        • JumpCrisscross 8 minutes ago
          > with taxi receipts

          Please tell me they were saving them for expensing.

      • tdeck 6 minutes ago
        The bigger difference from my perspective is that they have competent people doing the strategy this time. The last Trump administration failed to use the obvious levers available to accomplish fascism, while this one has been wildly successful on that end. In a few years they will have realigned the whole power dynamic in the country, and unfortunately more and more competent people will choose to work for them in order to receive the benefits of doing so.
    • ajross 1 hour ago
      Given the context and the baldly political direction behind the redactions, it's not at all unlikely that this is the result of deliberate sabotage or malicious compliance. Bondi isn't blacking these things out herself, she's ordering people to do it who aren't true believers. Purges take time (and often blood). She's stuck with the staff trained under previous administrations.
      • lamontcg 49 minutes ago
        Or it is just the result of firing people who were competent and giving insufficient training to people who had never done this before.
    • beaned 1 hour ago
      [flagged]
      • exasperaited 1 hour ago
        You mean the layers that were, in fact, just side effects of scanning the (non-authoritative) short form certificate?
  • digitaltrees 3 minutes ago
    It's not a hack to copy and paste text that is part of the document data. The incompetence of the people responsible for complying with the law doesn't mean it's reasonable to label something a hack.

    Please change the title.

  • nickpinkston 2 hours ago
    I wonder if any of this is a conscious act of resistance vs. just incompetence.

    And yes, I've heard of Hanlon's Razor haha

    https://en.wikipedia.org/wiki/Hanlon%27s_razor

    • wolpoli 2 hours ago
      The difference between drawing a black square and using a redaction tool is well known to anyone whose job involves redacting PDFs or just working with PDFs. Most likely, additional staff were pulled in and weren't given enough training.
      • Dusseldorf 2 hours ago
        Colleagues whose full time job is doing this sort of thing for various bits of the government have told me this is exactly the case here. People from all over the government have been deputized to redact these documents with little or no prior training.
        • dboreham 1 hour ago
          CUaaS. Cover Up as a Service.
          • femto 54 minutes ago
            With a sister website BAEaas (Backup and Extort as a service).
        • mindslight 2 hours ago
          I wonder if this activity is being used as a kind of loyalty test. Keep track of who is assigned to redact what, and then if certain files leak or are insufficiently redacted, they indicate who isn't all in on Dear Leader.

          It's not like a few more stories of Trump raping $whomever are going to move the needle at all, especially with how the media is on board with burying negative coverage of the regime.

          Also if you're wondering how this activity isn't some kind of abuse of government resources, keep in mind that thanks to the Supreme Council's embrace of the Unitary Executive Theory (ie Sparkling Autocracy), covering up evidence about Donald Trump raping under-aged sex trafficking victims is now an official priority of the United States Government.

          • andrewflnr 1 hour ago
            I guess they might try, but given all the other nonsense I certainly don't think the admin is organized enough to execute that plan.
      • exasperaited 1 hour ago
        Yeah — don't attribute to resistance what can adequately be explained by idiocy.
      • cynicalsecurity 2 hours ago
        Let people believe it's deliberate sabotage. Unfortunately, in real life, minions of a dictator serve the dictator; they don't risk their lives or safety for a noble cause. Any screw-ups are a result of gross incompetence that is typical for every dictatorship.
        • andsoitis 1 hour ago
          Do you truly believe the US is currently a dictatorship?
          • vunderba 1 hour ago
            I wouldn’t go so far as to call it a dictatorship, but it’s definitely trending toward authoritarianism.

            Wasn't too hard to put together a quick graph of the past decade for the U.S. using the World Press Freedom Index (relative ranking and score) - an annual ranking of 180 countries published by Reporters Without Borders that measures the level of press freedom.

            https://imgur.com/a/4liEqqi

          • bdangubic 1 hour ago
            what is the US exactly currently if not dictatorship? is there a single thing “President” cannot do right now and if so who would be stopping him? so perhaps on paper US is not dictatorship much like Russia and China are not… We spent decades trying to fight these regimes and lost so much that now we are worse than them :)
            • chocoboaus3 1 hour ago
              The supreme court did just stop him for the moment putting the national guard into chicago
              • bdangubic 19 minutes ago
                bookmark this for a few days and then come back to it… the story is “… for now” :-)
          • ourmandave 34 minutes ago
            The pendulum swings. It always does. And all the powers SCOTUS gave the executive branch will eventually be in the hands of the Loyal Opposition.

            If it swings as far back you might even see universal health care, sane gun laws, fair wages, campaign finance reform, reproductive freedom, science based policy making, reining in billionaires, etc.

          • Loughla 1 hour ago
            I truly believe we're headed that direction. I've lived long enough to have seen a wide variety of presidents, both good and bad. This one is easily the worst one, in terms of bare naked power grabs.

            I believe Trump will manufacture a crisis before he's out of office in a bid to maintain control. I believe he will have learned from Bush Jr. that a simple war isn't good enough, and it needs to be a genuine emergency.

            I believe he'll do whatever he can to make that happen. Native born terrorist, or war with a close country, or absolutely over the top financial crash. Something awful that lets him invoke some obscure rule that lets him stay in power with congressional approval - he'll just skip the congressional approval part like he already does.

            • irishcoffee 1 hour ago
              This is one of those instances where I wish HN had some kind of remindMe feature.
          • vkou 1 hour ago
            How would the roadmap for turning a democracy into a one party dictatorship differ from the trajectory we are on?
          • sneak 1 hour ago
            I’m still always surprised that there are adults who think it is not.

            The CIA, for example, is entirely above the law.

            • neutronicus 1 hour ago
              That's different from a dictatorship, though, especially if the CIA is not answerable to a supposed dictator.
              • dragonwriter 1 hour ago
                > That's different from a dictatorship,

                It's exactly equivalent to a dictatorship by the head of the CIA, unless the CIA is effectively answerable to some other authority despite not being answerable to the law, and then it is equivalent to a dictatorship by that higher authority.

          • idle_zealot 1 hour ago
            It's not so simple a binary. We're definitely much less democratic than a year ago, and the bar was low then.
        • brunoqc 1 hour ago
          Maybe because fascism favors loyalty over competence.
    • neilv 2 hours ago
      A third possibility is diversion, while the most damaging evidence would be suppressed a different way.
    • apical_dendrite 2 hours ago
      Reporting is that they had a basically impossible deadline and they took lawyers off of counterintelligence work to do this. So a conscious act of resistance is possible, but it's a situation where mistakes are likely - people working very quickly trying to meet a deadline and doing work they aren't that familiar with and don't really want to be doing.
    • jmyeet 58 minutes ago
      It's a good question.

      For context, lawyers deal with this all the time. In discovery, there is an extensive document ("doc") review process to determine if documents are responsive or non-responsive. For example, let's say I subpoenaed all communication between Bob and Alice between 1 Jan 2019 and 1 Jan 2020 in relation to the purchase of ABC Inc as part of litigation. Every email would be reviewed and if it's relevant to the subpoena, it's marked as responsive, given an identifier and handed over to the other side. Non-responsive communication might not be eg attorney-client communications.

      It can go further and parts of documents can be viewed as non-responsive and otherwise be blacked out eg the minutes of a meeting that discussed 4 topics and only 1 of them was about the company purchase. That may be commercially sensitive and beyond the scope of the subpoena.

      Every such redaction and exclusion has to be logged and a reason given for it being non-responsive where a judge can review that and decide if the reason is good or not, should it ever be an issue. Can lawyers find something damaging and not want to hand it over and just mark it non-responsive? Technically, yes. Kind of. It's a good way to get disbarred or even jailed.

      My point with this is that lawyers, which the Department of Justice is full of, are no strangers to this process so should be able to do it adequately. If they reveal something damaging to their client this way, they themselves can get sued for whatever the damages are. So it's something they're careful about, for good reason.

      So in my opinion, it's unlikely that this is an act of resistance. Lawyers won't generally commit overt illegal acts, particularly when the only incentive is keeping their job and the downside is losing their career. It could happen.

      What I suspect is happening is all the good lawyers simply aren't engaging in this redaction process because they know better so the DoJ had to wheel out some bad and/or unethical ones who would.

      What they're doing is in blatant violation of the law passed last month and good lawyers know it.

      There's a lot of this going on at the DoJ currently. Take the recent political prosecutions of James Comey, Letitia James, etc. No good prosecutor is putting their name to those indictments so the administration was forced to bring in incompetent stooges who would. This included former Trump personal attorneys who got improperly appointed as US Attorneys. This got the Comey indictment thrown out.

      The law that Ro Khanna and Thomas Massie co-sponsored was sweeping and clear about what needs to be released. The DoJ is trying to protect both members of the administration and powerful people, some of whom are likely big donors and/or foreign government officials or even heads of state.

      That's also why this process is so slow I imagine. There are only so many ethically compromised lackeys they can find.

  • tim333 4 hours ago
    It's quite funny really. Apparently you just cut and paste the text into Word. They just had the pdf put black rectangles on top.
    • pilaf 1 hour ago
      Why into Word specifically?
      • iAMkenough 1 hour ago
        The average office worker has it on their computer, illustrating how commonplace unredacting could be. Any text tool will work, even some designed to detect bad redactions in PDFs via drag and drop (now specifically trained on these known bad redactions). https://github.com/freelawproject/x-ray
    • echelon 59 minutes ago
      Why reveal the trick before all the papers have been released?
      • pohl 34 minutes ago
        IKR?!
  • juujian 1 hour ago
    Apart from the technological and procedural question, I would love to learn why the DOJ found it important to protect Indyke. He was Epstein's lawyer, and now we learn that he was personally involved. He is not a Washington person. We expected there to be politically motivated protection of certain people, but is the DOJ just going to blanket protect anybody in the docs?
    • avidiax 56 minutes ago
      Indyke works for other powerful people, runs in MAGA circles.

      Two things come to mind:

      * Some things Indyke did fall outside the scope of lawyer-client privilege. It would be bad for certain people to get him on a stand and force him to spill the beans. He was never interviewed re: Epstein [1]

      * He's a very talented lawyer, insofar as a competent lawyer with, at least, extreme discretion, is talented.

      [1] https://www.finance.senate.gov/imo/media/doc/letter_to_doj-f...

  • montroser 17 hours ago
    Let's nobody make any fuss about this yet, lest they wise up before releasing the rest of the docs this way too!
  • sublinear 34 minutes ago
    If you think mere human incompetence with documents is bad, imagine all the vibe coded apps.
  • c420 1 hour ago
    “Like you guys have had this stuff for a year. Doesn’t it seem like you could just throw all that into AI at this stage of the game? And just redact the names of the victims, and let’s go.” Joe Rogan
  • nlitsme 3 hours ago
    Can you post the document numbers, I can't find where these texts are in the original pdfs.
  • tomekf 13 hours ago
    How is it done, from a technical standpoint?
    • mmh0000 11 hours ago
      Layers.

      PDF is an absurdly complex file format. It's part of the reason there is no single "good" PDF reader, just a lot of mediocre PDF readers that are all terrible in their own way. Which is a topic for another day.

      There are several ways to remove data in a PDF:

      - Remove the data. This is much harder than it sounds. Many PDF tools won't let you change the content of a PDF, not because it isn't possible, but because you'll likely massively screw up the formatting, and the tools don't want to deal with that.

      - Replace the data. This is what all the "blackout" tools do: find "A" and replace it with "🮋". This is effective and doesn't break formatting since it's a 1-to-1 replacement. The problem with "replacing" is that not every PDF tool works the same way, and some, instead, just change the foreground and background color to black; it looks nearly the same, but the power of copy-and-paste still functions.

      - Then you have the computer illiterate, who think changing the foreground and background color to black is good enough anyway.

      • zauguin 1 hour ago
        This seems highly misleading.

        > - Remove the data. This is much harder than it sounds. Many PDF tools won't let you change the content of a PDF, not because it isn't possible, but because you'll likely massively screw up the formatting, and the tools don't want to deal with that.

        Compared to other formats this is actually relatively easy in a PDF since the way the text drawing operators work they don't influence the state for arbitrary other content. A lot of positioning in a PDF is absolute (or relative to an explicitly defined matrix which has hardcoded values). Usually this makes editing a PDF harder (since when changing text the related text does not adapt automatically), but when removing data it makes it much easier since you can mostly just delete it without affecting anything else. (There are exceptions for text immediately after the removed data, but that's limited and relatively easy to control.)

        > - Replace the data. This what what all the "blackout" tools do, find "A" and replace with "🮋". This is effective and doesn't break formatting since it's a 1-to-1 replacement.

        That's actually rather tricky in PDFs since they usually contain embedded subset fonts and these usually do not have "🮋" as part of the subset. Also doing this would break the layout since "🮋" has a different width than most letters in a typical font, so it would not lead to less formatting issues than the previous option. Unless the "🮋" is stretched for each letter to have the same dimensions, but then the stretched characters allow to recover the text.

        > The problem with "replacing" is that not every PDF tool works the same way, and some, instead, just change the foreground and background color to black; it looks nearly the same, but the power of copy-and-paste still functions.

        PDF does not have a concept of a background color. If it looks like a background color in PDF, you have a rectangle drawn in one color and something in the foreground color in front of it. What you usually see in badly redacted PDF files is exactly this, but in opposite color: Someone just draws a black box on top of the characters. You could argue that this is smarter since it would still work even if someone changed the colors, but of course, PDF is a vector format. If you just add a rectangle, someone else can remove it again. (And also copy & paste doesn't care about your rectangle.)
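
        A minimal sketch of the point about rectangles, assuming a toy, uncompressed content stream (the stream, the name in it, and the helper names `extract_text` and `redact` are all made up for illustration; real PDFs usually compress their streams and use font-specific encodings). It shows that text extraction only looks at the show-text operator `Tj`, so a filled rectangle (`re` followed by `f`) drawn afterwards changes nothing, while deleting the operator itself does:

```python
import re

# A toy, uncompressed page content stream (hypothetical, for illustration only).
CONTENT = b"""BT
/F1 12 Tf
72 700 Td
(Payment approved by ) Tj
(John Doe) Tj
ET
0 0 0 rg
190 696 60 14 re
f
"""

# Matches a literal string followed by the Tj (show text) operator.
SHOW_TEXT = re.compile(rb"\(((?:[^()\\]|\\.)*)\)\s*Tj")


def extract_text(stream: bytes) -> str:
    """Collect every string shown with Tj; drawing operators are never consulted."""
    return "".join(m.group(1).decode("latin-1") for m in SHOW_TEXT.finditer(stream))


def redact(stream: bytes, secret: bytes) -> bytes:
    """Delete the show-text operation that carries `secret`, keeping the black box."""
    def drop(match):
        return b"" if match.group(1) == secret else match.group(0)
    return SHOW_TEXT.sub(drop, stream)


# The black rectangle is in the stream, yet the name is still extracted.
overlay_only = extract_text(CONTENT)

# After removing the operator, the name is gone from the data itself.
properly_redacted = extract_text(redact(CONTENT, b"John Doe"))

print(repr(overlay_only))
print(repr(properly_redacted))
```

        In this toy case the removed string is the last one on the line, so nothing after it needs repositioning; as noted above, text immediately following removed data is the part that needs care.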

      • hallole 5 hours ago
        Thanks for this. Really quells the urge I get every so often to just code my own PDF editor, because they all suck and certainly it couldn't be THAT hard. Such hubris!
        • brailsafe 2 hours ago
          Heh, have at it, here's the full spec: https://developer.adobe.com/document-services/docs/assets/5b...

          Should take... a weekend tops? ;) PDF is crazy and scary

          • marcosdumay 30 minutes ago
            > PDF includes eight basic types of objects: Boolean values, Integer and Real numbers, Strings, Names, Arrays, Dictionaries, Streams, and the null object

            Wait, this is more complete than SOAP. It may be a good idea to redo the IPC protocol with a different serialization format!

          • embedding-shape 1 hour ago
            7.5.6 "Incremental updates" from the specification is an interesting section too, speaking about accessing data people didn't think to remove from PDF files properly.
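
            A rough sketch of why incremental updates matter here, assuming only that each saved revision of a file ends with its own `%%EOF` marker. The sample bytes below are a stand-in and not a valid PDF, and the helper names are hypothetical. Counting markers is a heuristic: the marker can also occur for other reasons (for example in linearized files), so more than one is a hint to look closer, not proof.

```python
def revision_ends(pdf_bytes: bytes) -> list[int]:
    """Return the offset just past each %%EOF marker found in the file."""
    marker = b"%%EOF"
    ends = []
    pos = 0
    while True:
        idx = pdf_bytes.find(marker, pos)
        if idx == -1:
            break
        pos = idx + len(marker)
        ends.append(pos)
    return ends


def earlier_revisions(pdf_bytes: bytes) -> list[bytes]:
    """Slice the file at every %%EOF except the last one.

    Each slice is a candidate for what the file looked like before
    a later update was appended to it.
    """
    return [pdf_bytes[:end] for end in revision_ends(pdf_bytes)[:-1]]


# Stand-in data: an "original" save followed by an appended update.
original = b"%PDF-1.7\n(original objects, including the sensitive text)\n%%EOF\n"
update = b"(appended objects that replace the page content)\n%%EOF\n"
sample = original + update

markers = revision_ends(sample)
candidates = earlier_revisions(sample)
print(len(markers), "markers,", len(candidates), "earlier revision candidate(s)")
```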
          • CamperBob2 1 hour ago
            We will be able to say that AGI has arrived when we can hand that spec off to a model and tell it to build an Acrobat clone.
        • kayodelycaon 1 hour ago
          I did a bunch of work creating pdfs using a low-level API, object goes here stuff.

          As far as I understand it, at its core, pdf is just a stream of instructions that is continually modifying the document. You can insert a thousand objects before you start the next word in a paragraph. And this is just the most basic stuff. Anything on a page can be anywhere in the stream. I don't know if you can go back and edit previous pages, you might have a shot at least trying to understand one page at a time.

          Did you know you can have embedded XML in PDFs? You can have a paper form with all the data filled in and include an XML version of that for any computer systems that would like an easier way to read it.

        • gregsadetsky 4 hours ago
          Don't stop yourself before getting started. I believe in you - maybe you could write the one editor that would actually work!

          Not kidding - it's a ~~~billion dollar market haha

          Make an MVP/Show HN :-)

        • TRiG_Ireland 1 hour ago
          The blog post about adding colour gradients to Typst dives into some of the weirdness of the format. https://typst.app/blog/2023/color-gradients
        • NamTaf 2 hours ago
          Bravo to you for recognising the load-bearing 'just' before you threw it around :)
    • 3eb7988a1663 5 hours ago
      I remember reading the recommendation for journalists to redact documents is to black them out in the digital version, print it out, and re-scan it. Anything else has too many potential ways by which it might be possible to smuggle data.
      • dmurray 2 hours ago
        Even that might be vulnerable to length attacks: one plausible plaintext would produce a black bar 1135 px wide, another one 1138 px wide, and with enough redactions you can converge on what the plaintext might be.

        The only safe way for journalists is to paraphrase what the document said and to say "an unnamed source claims that ..." and to guarantee with your reputation, and the reputation of your publisher, that you are being faithful to what the original source said. For even better results, combine multiple sources.

        Unfortunately paraphrasing things and taking editorial responsibility have both been deprecated in favour of rereleasing press releases in the house style, so it's difficult to get the actual journalism these days.
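
        The length attack mentioned at the start of this comment can be sketched as follows. This is an illustration under loud assumptions: the per-character widths are invented rather than taken from any real font, the candidate names are made up, and kerning, scaling, and scan distortion are ignored.

```python
# Hypothetical advance widths in pixels for a proportional font.
CHAR_WIDTH_PX = {"i": 4, "l": 4, "t": 5, " ": 5, "m": 14, "w": 13}
DEFAULT_WIDTH_PX = 9  # assumed width for every character not listed above


def rendered_width(text: str) -> int:
    """Sum the assumed per-character widths of `text` (case-insensitive)."""
    return sum(CHAR_WIDTH_PX.get(ch, DEFAULT_WIDTH_PX) for ch in text.lower())


def candidates_matching(bar_width_px: int, candidates: list[str], tolerance_px: int = 1) -> list[str]:
    """Keep the candidates whose rendered width fits the measured black bar."""
    return [
        name for name in candidates
        if abs(rendered_width(name) - bar_width_px) <= tolerance_px
    ]


# Made-up candidates and a made-up measurement of one black bar.
names = ["Alice Smith", "Bob Jones", "William Tell"]
survivors = candidates_matching(81, names)
print(survivors)
```

        Each additional bar that hides the same string narrows the list further, which is why many redactions of one name are riskier than a single one.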

    • general1465 13 hours ago
      Mistaking the black highlighter (adds a black square as another layer) for the redaction tool (replaces data with a black square). If people doing redactions are computer-illiterate, they won't see the difference.
    • oliwarner 7 hours ago
      They drew black boxes over the text. The text is still underneath. On OCR'd scanned documents, the text you'd copy is actually stored in metadata and just linked by position to the image.

      Anyway, if you click on a "redaction", you're clicking on the box and can't select the text underneath, but if you just highlight the text around it, you can copy all the original text.

      It's a bizarre oversight.
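
      Since the hidden text layer is tied to the page by position, a simple geometric check can flag this kind of failure. The sketch below assumes the word boxes and black rectangles have already been pulled out of a page by some PDF library; the `Box` type, the coordinates, and the helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page coordinates."""
    x0: float
    y0: float
    x1: float
    y1: float


def overlaps(a: Box, b: Box) -> bool:
    """True when the two rectangles share any interior area."""
    return a.x0 < b.x1 and b.x0 < a.x1 and a.y0 < b.y1 and b.y0 < a.y1


def text_under_boxes(words: list[tuple[str, Box]], black_boxes: list[Box]) -> list[str]:
    """Return the words whose position falls under any black rectangle."""
    return [
        text for text, box in words
        if any(overlaps(box, rect) for rect in black_boxes)
    ]


# Made-up text layer and one made-up black rectangle drawn over two words.
words = [
    ("Approved", Box(72, 700, 120, 712)),
    ("by", Box(124, 700, 136, 712)),
    ("John", Box(140, 700, 165, 712)),
    ("Doe", Box(169, 700, 190, 712)),
]
black_boxes = [Box(138, 698, 192, 714)]

leaked = text_under_boxes(words, black_boxes)
print(leaked)
```

      Any word this returns is still present in the file, so a non-empty result means the "redaction" is only visual.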

  • buhfur 11 hours ago
    Doesn't work on any PDFs of scanned documents, for example the contacts list.
    • jdiff 1 hour ago
      Copying and pasting doesn't work. Unless your PDF viewer does OCR. And if the redaction is just a black rectangle overlaid on top, that can still be removed.
  • tpoacher 7 hours ago
    reminds me of that leaky redaction program that won the obfuscated c contest some years back
  • Alifatisk 14 hours ago
    Alright, now that everyone knows this, I hope people have backed up all the files to unredact everything before DOJ retracts the sensitive documents.
  • lawn 17 hours ago
    Lots of these redactions don't make sense unless they're made to protect the rich and powerful. Not surprising of course.
  • The-Old-Hacker 13 hours ago
  • lisbbb 2 hours ago
    Did we learn anything useful or is it exactly as I said in the other thread, which got downvoted to hell, that all the really juicy blackmail material is with the CIA and will never see the light of day?
    • gosub100 38 minutes ago
      Won't know until all the documents are released. The blackmail is undeniable. But what's more interesting is who else was involved. Who purchased his services? That's what they are trying to hide.
    • apical_dendrite 2 hours ago
      Do you have any evidence of that?
      • XorNot 2 hours ago
        Of course they don't but it sounds truthy so give it a few rounds of the Internet whisper machine and it can become accepted fact everybody "knows".
  • xhkkffbf 11 hours ago
    So is the data extracted the names of the victims that were supposed to be hidden to protect them? Or is there something else that might be worthy of exposing?
    • deepsquirrelnet 2 hours ago
      It seems the redactions are to protect the perpetrators.
    • kjkjadksj 10 hours ago
      There are pages that are nothing but redacted text. It isn’t going to be a victims name copy pasted 80 times in a row…
      • wafflemaker 9 hours ago
        >It isn’t going to be a victims name copy pasted 80 times in a row…

        You can't possibly know that!

        (Sorry, watching Grinch, Jim Carrey spoke through me).

    • kgwxd 7 hours ago
      i assume the downvoters don't see the importance of the question.
      • watwut 7 hours ago
        The downvoters assume that it is a bad faith question. The downvoters are 99% right with that. If the 1% hit then OP is just exceedingly naive and has not followed the scandal, in which case they should maybe first do some reading.

        The names of involved powerful people were NOT supposed to be censored. All those names except Bill Clinton name were redacted. To protect Trump and everybody else involved in the scandal except said Bill Clinton. But especially to protect Trump.

        • mapontosevenths 3 hours ago
          They also obscured the male perpetrators' faces and bodies in many images, illegally.
          • mindslight 1 hour ago
            I assume that de facto federal "law" now makes it illegal to be raped, and those men are the victims. That would be a logical conclusion of edgelord vice signalling, right?
            • mapontosevenths 1 hour ago
              I know what all of these words mean, but not when they're in this order.
    • tempaccountabcd 9 hours ago
      [dead]
  • Kaibeezy 14 hours ago
    See also:

    We Just Unredacted the Epstein Files

    https://news.ycombinator.com/item?id=46364121

    I tried to ascertain, but am not certain, this is the original blog source. Maybe they made some prior X posts.

  • vdupras 6 hours ago
    Trump's razor: Why attribute something to incompetence when you can attribute it to patriotic sabotage?
    • andrewflnr 1 hour ago
      There's no patriotism here. That's just part of the cover for seeking power.
    • jimt1234 1 hour ago
      There's no patriotism in protecting chomos.
    • TRiG_Ireland 1 hour ago
      It's certainly possible that some of the underlings are deliberately sabotaging orders from above. It's also possible that they're incompetent, as so many of the Trump team are. How would we know which it is?
  • ChrisArchitect 7 hours ago
    • dang 2 hours ago
      We'll merge those comments hither.
  • Sparkyte 1 hour ago
    I think this is a good thing. I think the people talking dictator this and that do not understand we have the ability to critique the administration. What we lack is control of the underhanded lobbyism. It is a warped democracy but still a democracy.