> Years ago it would've required a supercomputer and a PhD to do this stuff
This isn't actually true. You could do this 20 years ago on a consumer laptop, and you don't need the information you get for free from text moving under a filter either.
What you need is the ability to reproduce the conditions the image was generated and pixelated/blurred under. If the pixel radius only encompasses, say, 4 characters, then you only need to search for those 4 characters first. And then you can proceed to the next few characters represented under the next pixelated block.
You can think of pixelation as a bad hash which is very easy to find a preimage for.
No motion necessary. No AI necessary. No machine learning necessary.
The hard part is recreating the environment though, and AI just means you can skip having that effort and know-how.
In fact, there was a famous de-censoring that succeeded because the censoring used a simple "whirlpool" algorithm that was very easy to unwind.
If media companies want to actually censor something, nothing does better than a simple black box.
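To make the "bad hash" framing concrete, a minimal sketch (Python; `render_candidate` is a hypothetical stand-in for whatever reproduces the original rendering pipeline):

    def pixel_hash(block):
        # The "hash" a mosaic filter computes: the average of a block of (r, g, b) pixels.
        n = len(block)
        return tuple(sum(px[i] for px in block) // n for i in range(3))

    def crack_block(target_avg, candidates, render_candidate):
        # Find preimages: candidate strings whose rendered block "hashes" to the target.
        return [s for s in candidates
                if pixel_hash(render_candidate(s)) == target_avg]

Run it on the first block, then again on the next block with the survivors as prefixes, exactly as described above.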
> nothing does better than a simple black box
You still need to be aware of the context that you're censoring in. Just adding black boxes over text in a PDF will hide the text on the screen, but might still allow the text to be extracted from the file.
Indeed. And famously, using black boxes as a background on individual words in a non-monospaced font is also susceptible to a dictionary attack on an image of the widths of the black boxes.
And even taking a Sharpie and drawing a black box doesn't mean the words can't be seen at a certain angle, or by removing the Sharpie ink but not the printed ink.
Really, if you need to censor something, create a duplicate without the original content. Preferably with no trace of the originals at all, since even the size of the black box is an information leak.
No need for the monospaced requirement - it would reduce the search space, but it's solvable even before this reduction.
The additional leakage provided by non-monospace is rather large. With monospace all you know is the character count.
Ah yes, Mr. Swirl Face.
This was pretty different though. The decensoring algorithm I'm describing is just a linear search. But pixelation is not an invertible transformation.
Mr. Swirl Face just applied a swirl to his face, which is invertible (-ish, with some data lost), and could naively be reversed. (I am pretty sure someone on 4chan did it before the authorities did, but this might just be an Internet Legend).
A long while ago, taking an image (typically porn), scrambling a portion of it, and having others try to figure out how to undo the scrambling was a game played on various chans.
Christopher Paul Neil is a real person who went to jail.
Shhhhh!!!!!
Yeah, but if you read about him, it serves as a rallying cry for right-wing types, since he's an example of the Canadian legal system's extreme leniency. This guy should be in prison forever, and he's been free since 2017. Look at his record of sentencing. I love being a bleeding-heart liberal/progressive and all, but this is too far.
Furthermore, don't look too hard at Israel and its policy of being very, very open to pedophiles and similar types.
A completely opaque shape or emoji does it. A simple black box overlay is not recommended unless that's the look you're going for. Also, very slightly transparent overlays come in all shapes and colors and are hard to distinguish from fully opaque ones, whether it's a black box or another shape, so either way you need to be careful it's 100% opaque.
Noob here, can you elaborate on this? If you take, for example, a square of 25px and change the value of each individual pixel to the average color of the group, most of the data is lost, no? If the group of pixels is big enough, can you still undo it?
Yeah, most of the information is lost, but if you know the font used for the text (as is the case with a screencast of a macOS window), then you can try every possible combination of characters, render it, apply the averaging, and see which ones produce the same averaged color that you see. In addition, in practice not every set of characters is equally likely: it's much more likely for a folder to be called "documents" than "MljDQRBO4Gg". So that further narrows down the amount of trying you have to do. You are right, of course, that the bigger the group of pixels, the harder it gets: exponentially so.
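A minimal sketch of that loop (Python + Pillow). Everything here, the font file, size, colors, offsets, and target color, is a made-up assumption that you'd have to match to the real screencast:

    from itertools import product
    from string import ascii_lowercase
    from PIL import Image, ImageDraw, ImageFont

    FONT = ImageFont.truetype("SFNSMono.ttf", 13)  # stand-in for the real UI font
    BG, FG = (236, 236, 236), (40, 40, 40)         # assumed background/text colors

    def mosaic_color(text, block=(0, 0, 25, 25)):
        # Render a candidate string and average one 25x25 mosaic block of it.
        img = Image.new("RGB", (120, 25), BG)
        ImageDraw.Draw(img).text((2, 4), text, font=FONT, fill=FG)
        px = list(img.crop(block).getdata())
        return tuple(sum(p[i] for p in px) // len(px) for i in range(3))

    def close(a, b, tol=3):
        # Allow a small tolerance for rounding/compression in the real pipeline.
        return all(abs(x - y) <= tol for x, y in zip(a, b))

    target = (201, 203, 205)  # the averaged color observed in the pixelated block
    hits = [''.join(c) for c in product(ascii_lowercase, repeat=4)
            if close(mosaic_color(''.join(c)), target)]

That's only 26^4 = 456,976 renders for a 4-character block: minutes, not supercomputers.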
Yeah, a screencast of a MacOS window is probably one of the best case scenarios for this.
How many pixels are in `I` vs. how many are in `W`? Different averages as a result. With subpixel rendering, kerning, etc., there are minute differences between the averages of `IW` and `WI` as well, so order can be observed. It is almost completely based on having so much extra knowledge: the background of the text, the color of the text, the text renderer, the font, etc. There is a massive number of unknowns if it is a random picture, but with all this extra knowledge, it massively cuts down on the number of things we need to try: if it's 4 characters, we can make a list of all possibilities, apply the same mosaic, and find the closest match in nearly no time.
Depends on how much data you have. If a 25x25 square is blurred to a single pixel, that will take more discrete information to de-blur than if it were a 2x2 square. So you need a longer video with more going on, but you can still get there.
It's not that you're utterly wrong; some transformations are irreversible, or close to. Multiplying each pixel's value by 0, assuming the result is exactly 0, is a particularly clear example.
But others are reversible because the information is not lost.
The details vary per transformation, and sometimes it depends on the transformation having been an imperfectly implemented one. Other times it's just that data is moved around and reduced by some reversible multiplicative factor. And so on.
TLDR: Most of the data is indeed "lost". If the group of pixels is big enough, this method alone becomes infeasible.
More details:
The larger the group of pixels, the more characters you'd have to guess, and so the longer this would take. Each character makes it combinatorially more difficult.
To make matters worse, by the pigeonhole principle, you are guaranteed to have collisions (i.e. two different sets of characters which pixelate to the same value). E.g. for a space of just 6 characters, even if limited to a-zA-Z0-9, that's 62^6 = 56,800,235,584 combinations, while you can expect at most 2048 color values for it to map to.
(Side note: That's 2048 colors, not 256, between #000000 and #FFFFFF. This is because your pixelation / mosaic algorithm can have eight steps inclusive between, say, #000000 and #010101. That's #000000, #000001, #000100, #010000, #010001, #010100, #000101, and #010101.
Realistically, in scenarios where you wouldn't have pixel-perfect reproduction, you'd need to generate all the combos and sort by closest to the target color, possibly also weighted by a prior on the content of the text. This is even worse, since you might have too many combinations to store.)
So, at 25 pixel blocks, encompassing many characters, you're going to have to get more creative with this. (Remember, just 6 alphanumeric characters = 56 billion combinations.)
Thinking about this as "finding the preimage of a hash", you might take a page from the password cracking toolset and assume priors on the data. (I.e. Start with blocks of text that are more likely, rather than random strings or starting from 'aaaaaa' and counting up.)
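A sketch of that ordering (Python; the wordlist is illustrative):

    from itertools import product
    from string import ascii_lowercase, digits

    def candidates():
        # Likelier guesses first, raw enumeration last -- password-cracker style.
        yield from ("documents", "downloads", "desktop", "projects")  # wordlist priors
        charset = ascii_lowercase + digits
        yield from (''.join(c) for c in product(charset, repeat=6))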
this gets exponentially harder with a bigger blur radius, though.
Something like https://github.com/Jaded-Encoding-Thaumaturgy/vapoursynth-de..., you mean?
Yeah, that is pretty wild.
I recall a co-worker doing something related(?) for a kind of fun tech demo some ten years or so ago. If I recall it was shooting video while passing a slightly ajar office door. His code reconstructed the full image of the office from the "traveling slit".
I think about that all the time when I find myself in a public bathroom stall.... :-/
> I think about that all the time when I find myself in a public bathroom stall.... :-/
Walk past a closed bathroom stall fast enough and you can essentially do that with your own eyes. Or stand there and quickly shift your head side to side. Just don't do it on one that's occupied, that's not cool.
> Walk past a closed bathroom stall fast enough and you can essentially do that with your own eyes
Only in the US. The rest of the world has doors without a gap at the sides.
Well duh it won’t work if there’s no gap, no matter what country you’re in. It doesn’t get around the laws of physics.
Not every bathroom stall in the US has gaps either.
I'm sorry, but what point are you trying to make? Aside from: "Yeah, duh, you can't see through a fucking wall".
Mine was that, typically, people from outside the US only ever experience toilet stalls with gaps when they visit the US.
Not every stall has gaps there, but I don't recall ever encountering it here in the EU.
Doors in Vietnam are exactly the same as in the US.
You clearly haven't traveled much so you should refrain from sweeping generalisations.
I don't quite know how to respond to that.
The dither effect. Same as seeing through splayed fingers on a frantically oscillating hand.
This is the nerdiest way I've ever seen someone talk about John Cena
What's funny is that the described action didn't click until your comment.
"Sir, why do you keep running back and forth in the bathroom?"
Line scan cameras operate on this principle, and are still used in various ways to this day. I'm especially partial to the surreal photos generated by them at the end of cycling races:
https://finishlynx.com/photo-finish-trentin-sagan-tour-de-fr...
I love line scan cameras. Wikipedia has an awesome photo of a tram taken with a line scan camera on the relevant wiki page[1].
I've just moved to a house with a train line out front. I want to see if I can use a normal camera to emulate a line scan camera (rough sketch after the links below). I have tried with a few random YouTube videos I found [2].
I think the biggest issue I face is that there simply isn't the frame rate in most cameras to get a nicely detailed line scan effect.
---
[1]: https://en.wikipedia.org/wiki/Line-scan_camera
[2]: https://writing.leafs.quest/programming-fun/line-scanner
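For anyone who wants to try it, a rough sketch of the idea (Python + OpenCV; the file name and column index are placeholders):

    import cv2
    import numpy as np

    # Emulate a line scan camera: take one fixed pixel column ("slit") per frame
    # and stack the columns side by side. Horizontal resolution = frame count,
    # which is why frame rate is the limiting factor.
    cap = cv2.VideoCapture("train.mp4")
    columns = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        columns.append(frame[:, frame.shape[1] // 2])
    cap.release()
    cv2.imwrite("linescan.png", np.stack(columns, axis=1))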
Similar to
https://hackaday.com/2024/08/17/olympic-sprint-decided-by-40...
I was not aware of those.
Reminds me of slit-scan as well. And of course rolling shutters.
This reminds me of https://github.com/jo-m/trainbot, a neat example of stitching together frames of passing trains to form a panorama.
This frontend presents them nicely: https://trains.jo-m.ch
Surprised I've never come across this idea before
For a long time I've wanted to install one of these to pull graffiti off of trains. As art goes, graffiti is one of the last 'free' expressions where it is essentially truly anonymous and definitely not for money. Being free of those constraints, I think, frees the mind to truly create.
And I'd love an archive somewhere of some of the truly awesome train art I've seen.
> His code reconstructed the full image of the office from the "traveling slit".
This method is commonly used in vision systems employing line scan cameras. They are useful in situations where the objects are moving, e.g. along conveyors.
Even today most cameras have some amount of rolling shutter—the readout on a high-megapixel sensor is too slow/can't hold the entire sensor in memory instantaneously, so you get a vertical shift to the lines as they're read from top to bottom.
Global shutter sensors of similar resolution are usually a bit more expensive.
With my old film cameras, at higher shutter speeds, instead of exposing the entire frame at once, the camera would pass a slit formed by the front and rear shutter curtains across the film, exposing each strip for a thousandth of a second or less.
Sorry if you're already aware, but in case not: the weird huge gap around the edge of cubicle doors in public toilets is specific to the US. (For those that don't know, it's literally 1 or 2 cm.) In Europe you just get a toilet door that shuts properly, and there's no slit to reconstruct.
I remember my first visit to a toilet in the plush US office of a finance company and thinking WTF are they doing with their toilet cubicle? I only found out later that it's common there.
I am aware, and of course this works for any gap you might be walking past. Plenty of newer bathroom stalls here do not have a gap.
It can be much more than 2cm. The US really hates people for some reason.
My Windows-98 approved method for redacting a screenshot:
1) Open screenshot in MS-Paint (can you even install MS-Paint anymore? Or is it Paint3D now?)
2) Select Color 1: Black
3) Select Color 2: Black
4) Use rectangular selection tool to select piece of text I want to censor.
5) Press the DEL key. The rectangle should now be solid black.
6) Save the screenshot.
As far as I know, AI hasn't figured out a way to de-censor solid black yet.
There was a programming competition (can't remember which; similar to the IOCCC, but more about problematic software) where the redaction was reversible despite being pure black, because the chosen format allowed leftover information in the image (vastly reduced quality, but it was enough to allow the text to be recovered!) [edit: see replies!]
There was also the Android (and iOS?) truncation issue where parts of the original image were preserved if the edited image took up less space. [edit: also see replies!]
Knowing some formats have such flaws (and I'm too lazy to learn which), I think the best option is to replace step 6 with "screenshot the redacted image", so in effect it's a completely new image based on what the redacted image looks like, not on any potential intricacies of the format.
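In the same spirit, a sketch of doing the "re-screenshot" in code (Python + Pillow; file names are placeholders):

    from PIL import Image

    # Copy only the visible pixels into a brand-new image, leaving behind
    # metadata, embedded thumbnails, alpha channels, and any stale bytes.
    src = Image.open("redacted.png").convert("RGB")
    flat = Image.new("RGB", src.size)
    flat.putdata(list(src.getdata()))
    flat.save("redacted-flat.png")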
Maybe you're referring to "aCropalypse". Also there was an issue once where sections with overpainted solid black color still retained the information in the alpha channel.
https://www.wired.com/story/acropalyse-google-markup-windows...
https://www.lifewire.com/acropalypse-vulnerability-shows-why...
I also recall at one point some image file format that ended up leaking sensitive info, because it had an embedded preview or compressed image, and the editing program failed to regenerate the preview after a censor attempt.
Was a loooong time ago, so I don’t remember the details.
AT&T leaked information, as did the US Attorney's Office, when they released PDFs with redacted information. To redact, they changed the background of the text to match the color of the text. You could still copy and paste the text block to reveal the original contents.
https://www.cnet.com/tech/tech-industry/at-38t-leaks-sensiti...
You are thinking of John Meacham’s winning entry in the 2008 underhanded C contest https://www.underhanded-c.org/_page_id_17.html
Wow, it took me a minute to figure out how his entry works. You really could read that code and assume it was correct. The resulting image is perfectly redacted visually, and the missing data is not appended or hidden elsewhere in the file. You would only discover it by inspecting the PPM image in a text editor. Very sneaky!
https://www.underhanded-c.org/_page_id_17.html
There's tricks like this with embedded thumbnails.
I would guess that would be due to compression algorithms.
What about intentionally adding data into the image?
Screenshot: I'm not convinced Apple doesn't use an invisible watermark to add info into image data. But as a matter of fact, every photo you take with an iPhone contains an invisible watermark with your phone's serial number. To remove such watermarks, Facebook has been converting every picture you post for the last 10 years... just a weird extra con of using modern technology.
Try to copy a banknote on your printer: it will not print anything, it just shows an error. Plus, every page of printed text contains barely visible yellow marks encoding, again, the serial number of the printer.
....
Step 5.5) Take a new screenshot of the image.
Step 5.5.5) Tell ChatGPT what is in the image so it can regenerate it for you XD
The underhanded C contest: https://www.underhanded-c.org/
Too bad that they only show the winners up to 2015. All the later ones are on github.com, but are harder to find.
> can you even install MS-Paint anymore? Or is it Paint3D now?
Paint3D, the successor to MSPaint, is now discontinued in favor of MSPaint, which doesn't support 3D but now has Microsoft account sign-in and AI image generation that runs locally on your Snapdragon laptop's NPU but still requires you to be signed in and connected to the internet to generate images. Hope that clears things up.
Maxis was simply ahead of its time.
> AI hasn't figured out a way to de-censor solid black yet.
I did, though, under certain circumstances. Microsoft's Snipping Tool was vulnerable to the "acropalypse" vulnerability, which mostly affected the cropping functionality, but could plausibly affect images with blacked-out regions too, if the redacted region was a large enough fraction of the overall image.
The issue was that if your edited image had a smaller file size than the original, only the first portion of the file was overwritten, leaving "stale" data in the remainder, which could be used to reconstruct a portion of the unedited image.
To mitigate this in a more paranoid way (aside from just using software that isn't broken) you could re-screenshot your edited version.
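The failure mode in miniature (a sketch with a stand-in file, not the actual PNG format):

    # Opening a file in "r+b" and writing fewer bytes than it held
    # leaves the old tail intact -- the essence of acropalypse.
    with open("demo.bin", "wb") as f:
        f.write(b"ORIGINAL-SECRET-DATA")

    with open("demo.bin", "r+b") as f:  # overwrite without truncating
        f.write(b"EDITED")

    print(open("demo.bin", "rb").read())  # b'EDITEDAL-SECRET-DATA'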
Luckily the current Microsoft screen snip utility is so buggy I often have to screenshot my edited screenshots anyway to get them to my clipboard.
It’s possible, depending upon the circumstances. If you are censoring a particular extract of text and it uses a proportional font, then only certain combinations of characters will fit in a given space. Most of those combinations will be gibberish, leaving few combinations – perhaps only one – that has both matching metrics and meaning.
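A sketch of that metrics attack (Python + Pillow; the font file, size, measured width, and wordlist path are all assumptions):

    from PIL import ImageFont

    font = ImageFont.truetype("DejaVuSans.ttf", 12)  # assumed rendering parameters
    box_width = 57.0  # measured width of the redaction box, in pixels

    words = open("/usr/share/dict/words").read().split()
    fits = [w for w in words if abs(font.getlength(w) - box_width) < 0.5]

Filter `fits` for words that make sense in context, and you may be down to a single candidate.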
Not forgetting subpixel rendering.
What I love about this method is that it so closely matches what actual US govt censors do with documents pending release: take a copy, black it out with solid black ink, then _take a photocopy of that_ and use the photocopy for distribution.
News publications are also encouraged to do the same or even re-type the raw document. There was a story about how they shared raw scans of the leaked documents such that the yellow printer id dots were visible. That might have been for C. Manning?
This is similar to how I censor images on a cellphone. I use an editor to cover what I want to censor with a black spot, then take a screenshot of that edited image and delete the original.
Make sure your editor uses real, pure black to cover the region. Chances are, if you use a general image editing app and deal with concepts like "brushes", you are not using pure black; it's most likely black with a varying alpha channel.
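If in doubt, a quick sanity check (Python + Pillow; the file name and coordinates are assumptions):

    from PIL import Image

    img = Image.open("censored.png").convert("RGBA")
    region = img.crop((40, 40, 200, 60))  # the redacted area
    assert all(p == (0, 0, 0, 255) for p in region.getdata()), "not fully opaque black!"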
Solid color would convey far less information, but it would still convey a minimum length of the secret text. If you can assume the font rendering parameters, this helps a ton.
As a simple scenario with monospace font rendering, say you know someone is censoring a Windows password that is (at most) 16 characters long. This significantly narrows the search space!
That sort of makes me wonder if the best form of censoring would be a solid black shape, THEN passing it through some diffusion image-generation step to infill the black square. It will be obvious that it's fake, but it'll make determining the "edges" of the censored area a lot harder. (Might also be a bit less distracting than a big black shape for your actual non-adversarial viewers!)
Maybe silly, but I'd always take a screenshot of the final thing and then paste that to a new file... just to be sure.
Back in the TechTV days one of the hosts used Photoshop to crop a photo of herself before posting it online. One would think a crop, completely removing the part of the image would be even better than solid black. However, with the way Photoshop worked in 2003, it didn't crop the embedded Exif thumbnail, which people were able to use to get the uncropped image.
If you want the blurred/pixelated look, blur/pixelate something else (like a lorem ipsum) and copy it over to the actual screenshot.
>2) Select Color 1: Black
You don't need this step. It already defaults to black, and besides when you do "delete" it doesn't use color 1 at all, only color 2.
https://jspaint.app/ (https://github.com/1j01/jspaint)
Wow glad to see there were other fans of MSPaint, can't believe I built my open source version with wxWidgets 16 years ago https://github.com/murdockq/OpenPaint
This is odd because when I follow your steps up to Step 5, the rectangle that gets cut out from the screenshot is white. I did remember to follow steps 2 and 3.
Might've changed in recent versions of Paint if you're on Win 11. It definitely used to take whatever you had as Color 2 as your background.
Still does.
I think it depends on the new layers feature that’s on my version of Paint. If I make the base layer be transparent, then the cutout is transparent.
This method looks worse than the pixelation/blur style; those "just" need to be updated to destroy the info first, instead of faithfully using the original text.
If you REALLY care then replace the real information with fake information and pixelate that.
But most people don’t care enough.
Or I guess you could make a little video of pixelation that you just paste on top so it looks like you pixelated the thing but in reality there’s no correspondence between the original image and what’s on screen.
That's going to be a lot of work for a YouTube video though
Don’t do this on a PDF document though. ;)
Should be ok if you rasterize the PDF. Run something like pdftotext after to be sure it doesn't have any text.
Or to be safe, print it and scan it, or just take a screenshot.
Testing that it doesn’t have text doesn’t help if the text was a bitmap in the first place.
Normally the use case is that you still want to distribute it as a PDF, usually consisting of many pages, and without loss of quality, so the printing/scanning/screenshotting option may not be very practical.
No, the real solution is to use an editor that allows you to remove text (and/or cut out bitmaps), before you add black rectangles for clarity.
7) Print the screenshot
8) Scan the printed screenshot
Forgot the wooden table step...
Or take a blurry misaligned photo of the screen.
This. Never give the original file, always take a screenshot of it. If it’s text being blacked out, it can be guessed from the length of words.
It would seem techniques like this have been used in domains like astronomy for a while.
> The reconstruction of objects from blurry images has a wide range of applications, for instance in astronomy and biomedical imaging. Assuming that the blur is spatially invariant, image blur can be defined as a two-dimensional convolution between true image and a point spread function. Hence, the corresponding deblurring operation is formulated as an inverse problem called deconvolution. Often, not only the true image is unknown, but also the available information about the point spread function is insufficient resulting in an extremely underdetermined blind deconvolution problem. Considering multiple blurred images of the object to be reconstructed, leading to a multiframe blind deconvolution problem, reduces underdeterminedness. To further decrease the number of unknowns, we transfer the multiframe blind deconvolution problem to a compact version based upon [18] where only one point spread function has to be identified.
https://www.mic.uni-luebeck.de/fileadmin/mic/publications/St...
https://en.wikipedia.org/wiki/Blind_deconvolution
This makes sense for blurring, but not for pixelation/mosaicking.
For pixelation you can use another technique invented for astronomy: drizzling [1].
[1] https://en.wikipedia.org/wiki/Drizzle_(image_processing)
> If I hadn't moved around my Finder window in the video, I don't think it would've worked. You might get a couple letters right, but it would be very low confidence.
> Moving forward, if I do have sensitive data to hide, I'll place a pure-color mask over the area, instead of a blur or pixelation effect.
Alternately - don't pixelate on a stationary grid when the window moves.
If you want it to look nicer than a color box but without giving away all the extra info when data moves between pixels, pixelate it once and overlay with a static screenshot of that.
For bonus points, you could automate scrambling the pixelation with fake-but-real-looking pixelation. Would be nice if video editing tools had that built in for censoring, knowing that pixelation doesn't work but people will keep thinking it does.
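A sketch of the static-overlay trick (Python + Pillow; `frames` is assumed to be a list of PIL images from your decoder or editor):

    from PIL import Image

    def static_pixelate(frames, box, block=16):
        # Pixelate the region in the FIRST frame only, then paste that same
        # patch into every frame, so nothing real ever moves under the filter.
        x0, y0, x1, y1 = box
        patch = frames[0].crop(box)
        small = patch.resize((max(1, (x1 - x0) // block), max(1, (y1 - y0) // block)))
        patch = small.resize(patch.size, Image.NEAREST)
        for frame in frames:
            frame.paste(patch, (x0, y0))
        return frames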
That's another good way to do it.
I wonder if it might be good for the blur/censor tools (like on YouTube's editor even) to do an average color match and then add in some random noise to the area that's selected...
Would definitely save people from some hassle.
The part that might take some work is matching the motion correctly; with a pixelated area or blacked-out rectangle, it doesn't matter if it's exactly sized or moving pixel-perfectly with the window. I haven't done any video editing in 20 years, so maybe that's not very difficult today?
That moving pixelation look is definitely cooler though. If you wanted to keep it without leaking data you could do the motion tracked screenshot step first (not pixelated, but text all replaced by lorem ipsum or similar) and then run the pixelation over top of that.
If any of you nerds reading this are into video editing, please steal this idea and automate it.
Yeah this scenario is purposefully chosen specifically to make this attack possible. It's basically irrelevant in the real world.
Someone's already emailed me the depixelated version of the paper I'm holding in the video attached to this blog post.
The Bell Labs A-3 scrambler used real time band inversion and transposition and was 'snake oiled' into the commercial market, but under the pressure of WWII it fell quickly. It was bad enough it was self-clocked and a German engineer had helped design it. But even more embarrassing was, without having to reverse engineer the circuit, humans could train their ears to recognize individual speakers and even words.
Today we take for granted the ability to conjure a complicated pseudorandom digital stream for keying, but in those days it was just "no can do".
In WWII... SIGSALY was the first system secure by modern standards. Pairs of synchronized one-time phonographic records containing a sequence of tones seeded from a noise source.
Speaking of, the Lockpicking Lawyer's "Thank you" video https://www.youtube.com/watch?v=CwuEPREECXI always irked me a bit. Yeah, it's blurred, but recovering poor data from windowed input has been a thing for 50+ years (e.g. radio signals, scanning tools, etc.), and was possible back then and way before then too. If you think about it, it's a cheap way to shift costs from physical improvement to computational improvement (just have a shutter). And yet he didn't block the information out, only blurred it.
That's a totally different scenario. You can't unblur that video.
Why not? Would you be willing to stake hypothetical customer data on your assumptions?
Bad blackout jobs have been in the news since the '50s, and every time an expert gives the same solution: if you want to censor something, remove the information.
Easier said than done if you're using a proportional font though
A black box is a black box, or some other more pleasing color.
You should try "AV-8500 Special" from 90s' Japan.
[0] https://groups.google.com/g/alt.video.laserdisc/c/Ws6h9uiumF...
[1] https://x.com/nabetora164/status/1660981662006775809
Super-resolution has been here for 20+ years; even GPUs use it now... Geerling is clickbait.
Maybe this YouTube channel will be interesting: https://www.youtube.com/@mikirubinstein
So if we could do that 12 years ago, deblurring/depixelating video should be easy.
I gave the final image at 13 seconds to ChatGPT, and I wonder if this is pretty close... https://x.com/taf2/status/1912260125278032228
It's clearly not. In the original screenshot there are 6 files with the prefix "I.2J", but in the GPT version, there are only four.
it hallucinated one more folder
Reminds me of the "swirl face".
[1] https://matzjb.se/2015/07/26/deconstructing-swirl-face/
Japanese porn is being "decensored" with AI as we speak, in fact. It looks a tad uncanny, still, but finding a "decensored" clip in the wild was quite the thing for me a couple of weeks ago.
This is a completely different process — the AI is inferencing what goes there, it isn't actually using any information from the pixels so it wouldn't work in this case.
Not to mention deeply and disturbingly unethical
It would use information from the pixels around it though.
> Not to mention deeply and disturbingly unethical
Is it really deeply disturbingly unethical? Just FYI, it isn't their identities that are censored, but their genitals are pixelated due to Japanese laws.
>It would use information from the pixels around it though.
I'd bet it could use information from the pixels around it and the blurred out ones as well. It's not hard to imagine such an approach.
So let me get this straight: Porn can be ethical - selling your nude features online can be ethical - doing the activities in porn consensually can be ethical - pleasuring yourself on other people doing so can be ethical - but using AI to infer nude features is "disturbingly unethical"?
>but using AI to infer nude features is "disturbingly unethical"?
If it is against the wishes of the people in the video, yes, yes it is.
E: Never thought I'd see the day I'm downvoted for saying un-blurring porn of people who made porn under the assumption that it would be blurred (and may not have made that same decision without that context) is unethical, on HN of all places, but times are strange I guess.
In this case it's a legal requirement imposed by the government.
Yes, but their decisions to be porn stars were made within the context of that law. Maybe they wouldn't care about the uncensored version of their video getting out. Maybe they would?
Piracy is the opposite of unethical, because information wants to be free. IP rights holders had their NAP violated - and society is better off for it.
The reason it's "against the wishes" of folks in a JAV video is because of the legal risk it opens them up to from the Japanese government - not because the actors/actresses "don't consent to viewers seeing their uncensored body".
Note that I am NOT talking about distribution of non consensual deepfakes. Obviously that's abhorrent.
only one of these things has an intrinsic environmental tax exponentially higher than the rest.
You think using AI on the video has a higher environmental impact than actually filming the video in the first place?
I noticed the link in Jeff's post to RX 10 Elements Noise Reduction. The audio in their YouTube presentation was not horrible at all, though. Has anybody tried it with some really horrible recording, like those from a Blink Mini camera in a room without furniture?
I have, I was going to go for a more extreme example but couldn't find one quickly on their channel.
It's not perfect, by any means, but you can get intelligible speech from a pretty terrible recording at least. Adobe has their AI assist tool too, it works pretty well though I've found it can't isolate a speaker when there are a lot of other people talking nearby.
I wonder how much random noise (or other randomness) would have to be added to the pixelated version to make this method unusable.
If you really want that blur effect so badly, you can just replace your content with something innocuous, and then blur that innocuous content.
This is what you actually have to do with websites, e.g. when you want some content blurred when it's behind a paywall. If you leave the original text intact, people can just remove the CSS blur in dev tools.
Some implementations get this slightly wrong, and leave the placeholder content visible to accessibility tools, which sometimes produces hilarious and confusing results if you rely on those.
> For the second attempt, GIMP was used to get a better window selection algorithm with ffmpeg, and with a slight bit more data (more frames extracted), a perfectly legible result
Take that, Adobe.
I also have a network share named “mercury” connected to my Mac, and that last example nearly made me shit myself.
Ha! I name most of my shares after celestial bodies... Jupiter is the big 100 TB volume for all my archives. Mercury is an all-NVMe volume for speed, for my video editing mostly.
That was a cool vid.
I recall Interpol doing something similar a couple of years back to bust a dodgy child-abuse image ring. That was a swirl though, which I guess is mathematically easier.
> Intuitively, blur might do better than pixelation... but that might just be my own monkey brain talking. I'd love to hear more in the comments if you've dealt with that kind of image processing in the past.
A pixelization filter at least actively removes information from an image; Gaussian or box blurs are straight-up invertible by deconvolution, and the only reason that doesn't work out of the box is that the blurring is done with low precision (e.g. directly on 8-bit sRGB) or quantized to a low-precision format afterwards.
Exactly. Do not use blur to hide information. Blurring simply "spreads out" the data, rather than removing it. Just search (you know, on Google, without an LLM) for "image unblur".
Even if the precision is low, the deconvolution process you described is still good enough to reconstruct the original text in the majority of cases.
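For the curious, a minimal known-kernel Wiener deconvolution sketch (Python + NumPy, grayscale; real pipelines have to estimate the kernel and fight the quantization noise mentioned above):

    import numpy as np

    def wiener_deblur(blurred, kernel, k=1e-3):
        # Invert a known blur kernel in the frequency domain; k regularizes
        # against the noise/quantization that makes exact inversion impossible.
        H = np.fft.fft2(kernel, s=blurred.shape)
        G = np.fft.fft2(blurred)
        F = np.conj(H) * G / (np.abs(H) ** 2 + k)
        # Result is circularly shifted by the kernel's center; np.roll undoes it.
        return np.real(np.fft.ifft2(F))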
I like this Jeff Geerling guy.
He's, like, THE (or was THE) Raspberry Pi guy.
So it's now possible to redub phrases such as melon farmers with the real swears?
Does this guy look like Eminem or what?