[-] jandrew13@lemmy.world 3 points 2 days ago

For anyone looking into doing some OSINT work, this is an epic file EFTA00809187

It contains lists of ALL known JE emails, usernames, websites, social media accounts, etc. from that time

[-] jandrew13@lemmy.world 2 points 2 days ago

Nice work man! I also discovered something yesterday that I think is worth pointing out.

DUPLICATE FILES: Within the datasets, there are often duplicate entries - emails, doc scans, etc. (I'm not talking about multi-torrent stitching, but actual duplicate documents within the raw dataset.) **These duplicates must be preserved.** When comparing two copies of the same duplicate file, I found that sometimes the redactions are in different places! This can be used to extract more info later down the road.
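If anyone wants to find the duplicate groups programmatically, here's a rough stdlib-only sketch. Caveat: exact hashing only catches byte-identical copies - the interesting pairs (same doc, different redactions) won't hash the same, so you'd follow this up with fuzzy matching on extracted text or page counts.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def find_duplicates(root):
    """Group files under `root` by SHA-256 of their raw bytes.
    Returns only groups with 2+ members (exact duplicates)."""
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    return {h: ps for h, ps in groups.items() if len(ps) > 1}
```

Hashing also doubles as the integrity check for the clean archive later.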

[-] jandrew13@lemmy.world 2 points 2 days ago

Nothing, but even the archived pages aren't 100%, because some of the files were "faked" in the paginated file lists on the DOJ site. It works well enough, though. I did this to recover all the court records and FOIA files.

[-] jandrew13@lemmy.world 1 points 3 days ago

What does this contain? anything new?

[-] jandrew13@lemmy.world 2 points 3 days ago

I've been thinking a lot about this whole thing. I don't want to be worried or fearful here - we have done nothing wrong! Anything we have archived was provided to us directly by them in the first place. There are whispers all over the internet, random torrents being passed around, conspiracies, etc., but what are we actually doing other than freaking ourselves out (myself at least) and going viral with an endless stream of "OMG LOOK AT THIS FILE" videos/posts.

I vote to remove any of the 'concerning' files and backfill with blank placeholder PDFs with justification, then collect everything we have so far, create file hashes, and put out a clean + stable archive: a safe, indexed archive of everything we have to date. We wipe away any concerns and can proceed methodically through the trail of documents, resulting in an obvious and accessible collection of evidence. From there we can actually start organizing to create a tool that can be used to crowdsource tagging, timestamping, and parsing the data. I'm a developer and am happy to offer my skillset.

Taking a step back - it's fun to do the "digital sleuth" thing for a while, but then what? We have the files... (mostly)... Great. We all have our own lives, jobs, and families, and taking actual time to dig into this and produce a real solution that can actually make a difference is a pretty big ask. That said, this feels like a moment where we finally can make an actual difference, and I think it's worth committing to. If any of you are interested in helping beyond archival, please lmk.

I just downloaded Matrix, but I'm new to this, so I'm not sure how it all works. Happy to link up via Discord, Matrix, email, or whatever.

[-] jandrew13@lemmy.world 1 points 4 days ago* (last edited 4 days ago)

This dude on pastebin posted the file tree of his Epstein Ubuntu env - I have high confidence in whatever lives in his DataSet9Complete.zip file haha

[-] jandrew13@lemmy.world 2 points 4 days ago

I have a scraper running against web.archive.org, pulling all previously posted Court Records and FOIA material (docs, audio, etc.) from Jan 30th onward.
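For anyone who wants to replicate the scrape: the Wayback Machine exposes a CDX query API that lists every capture of a URL pattern. A minimal sketch of building the query (the target path is a placeholder - point it at whatever DOJ page you're recovering):

```python
from urllib.parse import urlencode

CDX = "https://web.archive.org/cdx/search/cdx"

def cdx_query(url_pattern, from_date):
    """Build a CDX API query listing all captures under `url_pattern`
    since `from_date` (YYYYMMDD)."""
    params = {
        "url": url_pattern,
        "matchType": "prefix",        # everything under this path
        "from": from_date,
        "output": "json",
        "fl": "timestamp,original,statuscode",
        "filter": "statuscode:200",   # skip 404/redirect captures
        "collapse": "digest",         # skip captures whose content didn't change
    }
    return CDX + "?" + urlencode(params)

# Each returned capture is then fetchable at:
#   https://web.archive.org/web/<timestamp>/<original>
# e.g. rows = json.load(urlopen(cdx_query("example.org/records/", "20250130")))
#      (first row of the JSON response is the column header)
```

Be gentle with request rates - archive.org throttles aggressive scrapers.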

[-] jandrew13@lemmy.world 4 points 4 days ago

Holy shit

The entire Court Records and FOIA page is completely gone too! Fuckers!

[-] jandrew13@lemmy.world 4 points 4 days ago

While I feel hopeful that we will be able to reconstruct the archive and create some sort of baseline that can be put back out there, I also can't stop thinking about the "and then what" aspect here. We've seen our elected officials do nothing with this info over and over again, and I'm worried this is going to repeat itself.

I'm fully open to input on this, but I think having a group path forward is useful here. These are the things I believe we can do to move the needle.

Right Now:

  1. Create a clean Data Archive for each of the known datasets (01-12). Something that is actually organized and accessible.
  2. Create a working Archive Directory: an "itemized" reference list (SQL DB?) of the full Data Archive, with each document listed as a row with metadata. Imagining a GitHub repo that we can all contribute to as we work. Columns: file number, dir. location, file type (image, legal record, flight log, email, video, etc.), and file status (redacted bool, missing bool, flagged bool).
  3. Infill any MISSING records where possible.
  4. Extract the images out of the PDFs, break out the "Multi-File" PDFs, and rename images/docs by file number. (I made a quick script that does this reliably well.)
  5. Determine which files were left in as CSAM and "redact" them ourselves, removing any liability on our part.
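To make step 2 concrete, here's roughly the table I'd start with - SQLite, so the DB file itself can live in the repo. Column names are just my first guess, not a settled schema:

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS documents (
    file_number   TEXT PRIMARY KEY,   -- e.g. EFTA00809187
    dataset       TEXT NOT NULL,      -- which of datasets 01-12 it came from
    dir_location  TEXT NOT NULL,      -- path within the Data Archive
    file_type     TEXT,               -- image / legal record / flight log / email / video / ...
    sha256        TEXT,               -- content hash for dedup + integrity checks
    redacted      INTEGER DEFAULT 0,  -- bool flags stored as 0/1
    missing       INTEGER DEFAULT 0,
    flagged       INTEGER DEFAULT 0
);
"""

def open_directory(db_path):
    """Open (or create) the Archive Directory database."""
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    return conn
```

Keeping it as one flat table means anyone can query it with stock sqlite3, and the repo diff for each contribution stays readable.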

What's Next: Once we have the Archive and Archive Directory, we can begin safely and confidently walking through the Directory as a group effort and fill in as many files/blanks as possible.

  1. Identify and de-redact all documents with garbage redactions (remember the copy/paste DOJ blunders from December), and identify poorly positioned redaction bars to uncover obfuscated names.
  2. LABELING! If we could start adding labels to each document in the form of tags containing individuals, emails, locations, and businesses, this would make it MUCH easier for people to "connect the dots".
  3. Event timeline... This will be hard, but if we can apply a timeline ID to each document, we can put the archive in order of events.
  4. Create some method for visualizing the timeline, searching, and making connections with labels.
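For the labeling step, a first pass can be automated before any humans touch it: pull the obvious entities (emails, dates) out of each document's extracted text with regexes, then have reviewers confirm and add the people/places/businesses. A rough sketch - these patterns are deliberately naive and would need tightening:

```python
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
DATE_RE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")  # naive US-style dates only

def auto_tags(text):
    """First-pass tag suggestions for one document's extracted text.
    Human reviewers confirm these and add individuals/locations/businesses."""
    return {
        "emails": sorted(set(EMAIL_RE.findall(text))),
        "dates": sorted(set(DATE_RE.findall(text))),
    }
```

The output slots straight into the Archive Directory as a tags table or JSON sidecar per file number, whichever we settle on.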

We may not be detectives, legislators, or lawmen, but we are sleuth nerds, and the best thing we can do is get this data into a place that allows others to push for justice and put an end to this crap once and for all. It's lofty, I know, but enough is enough. ...Thoughts?

[-] jandrew13@lemmy.world 3 points 5 days ago

This seems like a valid plan - although I'm not that confident in the 'purge'. It might be better to redact those images ourselves so that nobody is pressed to store them. Better to have a confidently safe dataset that can be passed around freely.

Also, it looks like they went back and repaired the shoddy text redactions on docs that were released late 2025, from what I can tell. I ran a script that auto-detects and removes "fake" redactions, and it's not getting any hits anymore, even on files that it flagged in the past. They are definitely trying to cover their tracks, by the day.

[-] jandrew13@lemmy.world 4 points 5 days ago

Wondering the same thing myself. Not sure about the latest DS9 dump, but I've definitely seen some of the other leaks that included some CSAM. Crazy that the DOJ let that out the door. :/
