Poisoning the well: why image generators keep eating copyrighted data

Three years after building an image-copyright framework as a high-schooler, a look at what held up, what didn't, and where the current wave of adversarial-perturbation research is actually pointed.

In 2022, I spent a year of evenings trying to answer a very specific question: could you embed a signal in an image that was invisible to humans but catastrophic to a diffusion model's training loop? The short answer turned out to be yes. The longer answer is the reason I'm still writing about it in 2026.

The project — we called it Securing Image Copyrights — went further than I expected. It won a Gold at the Korea Science & Engineering Fair, picked up the Yale S&EA award, and sent me to EUCYS as part of the Korean delegation. But the technical core of it was narrower than the headlines suggested, and I want to be honest about what has aged well and what hasn't.

What we actually built

The framework added structured high-frequency noise to published images, shaped so that the gradient an image-to-image model would see during training was pushed toward a specific, useless target. The noise was tuned to sit below the perceptual threshold on typical display hardware — you could stare at it and not see a thing — but to dominate the loss signal during backprop on the kind of latent-diffusion models that were just starting to eat the internet.
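To make the mechanics concrete, here is a toy sketch of the general idea. It is not the project's code. It assumes a tiny linear map W standing in for a real encoder, a flattened "image" of 16 values in [0, 1], and an arbitrary decoy feature vector. The names features, target_loss, perturb, W and decoy are all made up for illustration. The technique shown is projected signed-gradient descent: nudge the pixels so the surrogate's features move toward the decoy, while keeping every pixel change within a small budget eps.

```python
import random

def features(W, x):
    """Linear stand-in for an encoder: returns W @ x as a list."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def target_loss(W, x, target):
    """Squared distance between the surrogate features of x and the decoy."""
    return sum((f - t) ** 2 for f, t in zip(features(W, x), target))

def perturb(W, x, target, eps=0.03, step=0.005, iters=50):
    """Signed-gradient descent on delta, projected so |delta_i| <= eps."""
    delta = [0.0] * len(x)
    best, best_loss = list(delta), target_loss(W, x, target)
    for _ in range(iters):
        x_adv = [v + d for v, d in zip(x, delta)]
        resid = [f - t for f, t in zip(features(W, x_adv), target)]
        for i in range(len(x)):
            # d(loss)/d(delta_i) = 2 * sum_k W[k][i] * resid[k]
            g = 2 * sum(W[k][i] * resid[k] for k in range(len(W)))
            s = (g > 0) - (g < 0)
            d = max(-eps, min(eps, delta[i] - step * s))   # stay inside the budget
            delta[i] = max(-x[i], min(1.0 - x[i], d))      # keep pixels in [0, 1]
        loss = target_loss(W, [v + d for v, d in zip(x, delta)], target)
        if loss < best_loss:                               # keep the best iterate seen
            best, best_loss = list(delta), loss
    return best

rng = random.Random(0)
W = [[rng.uniform(-1, 1) for _ in range(16)] for _ in range(4)]  # hypothetical surrogate
image = [rng.uniform(0.2, 0.8) for _ in range(16)]               # toy flattened image
decoy = [f + 5.0 for f in features(W, image)]                    # arbitrary far-away target
delta = perturb(W, image, decoy)
poisoned = [v + d for v, d in zip(image, delta)]
print(target_loss(W, image, decoy), target_loss(W, poisoned, decoy))
```

A real version would swap the linear map for a pretrained encoder and get the gradient from an autodiff library, and the budget would be set by a perceptual metric rather than a flat eps. The shape of the loop is the part that carries over.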

The goal was never to hide the image. It was to make the image expensive to learn from.

That distinction matters. A watermark says "this belongs to someone." A perturbation says "this is not worth your gradient step." The second is the only one that scales, because you don't have to trust the person doing the scraping.

What held up

The core insight — that adversarial examples can be tuned against a training procedure rather than a single model instance — has held up surprisingly well. Modern work on Glaze, Nightshade, and the whole second generation of poisoning tools is downstream of the same idea. The shape of the attack is very different now, but the premise is identical: treat the dataset as the threat surface, not the model.

What didn't

Two things I got wrong.

I assumed the scrapers' pipelines would stay fixed. They don't. The moment a defense gets popular, the scrapers adapt with resizing, re-encoding, or a mild blur, and most of the perturbation budget evaporates.
I underestimated how much the arms race would favor the model side. Training compute has grown faster than publishing compute, and any per-image defense has to survive a deduplication pipeline that only gets smarter.
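The blur problem is easy to see in one dimension. The sketch below is a toy under stated assumptions: a 64-sample ramp stands in for an image row, the perturbation is either the highest-frequency pattern possible (alternating sign) or a flat offset, and the scraper's preprocessing is a 3-tap box blur. The helper names box_blur and surviving_fraction are invented for this example.

```python
def box_blur(signal, radius=1):
    """Moving average over 2 * radius + 1 samples, clamping at the edges."""
    n = len(signal)
    out = []
    for i in range(n):
        window = [signal[min(n - 1, max(0, j))]
                  for j in range(i - radius, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out

def surviving_fraction(clean, poisoned, transform):
    """L2 norm of the perturbation after the transform, divided by before."""
    before = sum((p - c) ** 2 for p, c in zip(poisoned, clean)) ** 0.5
    after = sum((tp - tc) ** 2
                for tp, tc in zip(transform(poisoned), transform(clean))) ** 0.5
    return after / before

eps = 0.03
clean = [i / 63 for i in range(64)]                      # smooth ramp as a toy image row
high_freq = [c + (eps if i % 2 == 0 else -eps)           # alternating-sign perturbation
             for i, c in enumerate(clean)]
low_freq = [c + eps for c in clean]                      # flat offset, same per-pixel size

frac_high = surviving_fraction(clean, high_freq, box_blur)
frac_low = surviving_fraction(clean, low_freq, box_blur)
print(frac_high, frac_low)
```

In this toy, the blur leaves about a third of the alternating pattern and essentially all of the flat offset. That is the bind: the frequencies that are easiest to hide from a human eye are the ones a cheap preprocessing step removes first. Real pipelines use 2D resampling and lossy re-encoding rather than a box filter, so the exact numbers will differ.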

Where this goes next

I think the interesting question is no longer "can we poison the well?" but "who owns the well at all?" The copyright conversation has moved from the image layer to the policy layer faster than the technical work can keep up. Which is, weirdly, a good thing: it means the framework I built as a 16-year-old has done the one thing a student project can actually do, which is help make a case.

The code is on GitHub, mostly for archaeology at this point. If you're working in this space, I'd love to hear from you.