This week is going to be a super short one. I was on call at work, so a lot of my time went into random work tasks, and when I was done with work I didn’t really want to do anything else. I mostly played video games, watched Cobra Kai, and wrote my BSides talk.
An annoying problem came up at work this week: a team that only has Chromebooks needed to be able to redact PDFs before sending them to people. The PDFs might contain sensitive PII that we didn’t want to relay to another party.
The Adobe Acrobat extension didn’t really work well on Chromebooks. Many other PDF editing extensions had no way to redact content on the page. The ones that did would silently upload your PDF to their cloud in the background, which was a non-starter for documents that may contain name, DOB, and SSN combinations.
With the help of ChatGPT, I was able to combine pdf-lib, pdf.js, and fabric.js into a single-page local PDF redaction tool. It’s dirt simple and roughly goes like this:
The one caveat with this approach is that you lose all interactivity with the resulting PDF (e.g. you can’t search for text or select text anymore). Everything gets flattened down into an image. But this is perfect for our use case, as the PDFs tend to be fairly short documents where the security of the original data is paramount. By choosing to flatten the PDF down into a series of PNGs, I don’t have to worry about the redaction boxes messing up the formatting of the page, and I don’t have to worry about poor redaction options that would leave the text retrievable. We also get these PDFs from all kinds of sources, so they don’t always have selectable text anyway. Sometimes the PDFs we get are already just PNG pages, so this option is the most flexible.
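To make the "flatten it and the text is gone" idea concrete, here is a minimal sketch of what burning a redaction box into a rasterized page amounts to. This is not the code from the tool: `burnRedactions` and the `{x, y, width, height}` rectangle shape are hypothetical names for illustration, and the pixel buffer is assumed to use the same RGBA layout as a canvas `ImageData.data` array.

```javascript
// Hypothetical helper (not from the actual tool): overwrite every pixel
// inside each redaction rectangle with opaque black.
// `pixels` is assumed to be RGBA, 4 bytes per pixel, row-major,
// like the `data` array of a canvas ImageData object.
function burnRedactions(pixels, width, height, rects) {
  for (const rect of rects) {
    // Clamp the rectangle to the page and snap it outward to whole pixels,
    // so a box that hangs off the edge cannot write out of bounds.
    const x0 = Math.max(0, Math.floor(rect.x));
    const y0 = Math.max(0, Math.floor(rect.y));
    const x1 = Math.min(width, Math.ceil(rect.x + rect.width));
    const y1 = Math.min(height, Math.ceil(rect.y + rect.height));

    for (let y = y0; y < y1; y++) {
      for (let x = x0; x < x1; x++) {
        const i = (y * width + x) * 4;
        pixels[i] = 0;       // red
        pixels[i + 1] = 0;   // green
        pixels[i + 2] = 0;   // blue
        pixels[i + 3] = 255; // alpha: fully opaque
      }
    }
  }
  return pixels;
}
```

Once the original pixel values are overwritten, there is nothing underneath the box to recover, which is the property that drawing a rectangle on top of live PDF text does not give you. In a browser, the same effect comes from drawing filled rectangles onto the page canvas before exporting it as a PNG.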
Overall, ChatGPT and I finished this in a little under a day, adding keyboard shortcuts for all actions, adding drag-and-drop, and packaging it up into a Chrome extension that our Chromebook users can run locally on their machines. But ultimately the whole thing can be served out of a single HTML file, which we could host on our CDN or as part of our administrative applications. All of the processing is done locally in the browser, so it works perfectly for our Chromebook users. One area for future improvement would be to automatically load the PDFs from our object storage, so the user never needs a copy of the unredacted PDF on their laptop.
I’m a little surprised and mostly disappointed that the Chrome Web Store was full of sketchy PDF tools, either straight up harvesting your data or offering to do the job for free, only to charge you once you try to save your files. Slimy.
Tor: From the Dark Web to the Future of Privacy
By Ben Collier
ISBN: 9780262548182
No real progress reading this week.