Why I Don't Like Filters

Apr 25, 2019
Filters are supposed to be a magic bullet for every problem, but are they really?

The simplest thing to run a filter on would be an image. Calculate the SHA-256 hash and away you go. Even then, someone could make the image bluer, convert it to greyscale, flip random bits, rotate it in three dimensions, etc. It is possible to take that into account, but it takes significantly more computational power.
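To make that concrete, here is a minimal sketch, not how any real filter works. It compares an exact SHA-256 match against a toy "average hash" on a made-up 4x4 greyscale image. The pixel values and the helper names (`average_hash`, `hamming`) are assumptions for illustration only.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Exact fingerprint: any changed byte gives a different digest."""
    return hashlib.sha256(data).hexdigest()


def average_hash(pixels) -> int:
    """Toy perceptual hash: one bit per pixel, set when the pixel is
    brighter than the image's mean brightness."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions where the two hashes differ."""
    return bin(a ^ b).count("1")


# Hypothetical 4x4 greyscale image (values 0-255), invented for this example.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [18, 28, 208, 218],
]
# The same image with every pixel made slightly brighter.
brighter = [[min(255, p + 5) for p in row] for row in original]

raw_original = bytes(p for row in original for p in row)
raw_brighter = bytes(p for row in brighter for p in row)

# The exact hash no longer matches after the brightness tweak...
exact_match = sha256_hex(raw_original) == sha256_hex(raw_brighter)
# ...while the toy perceptual hash has not moved at all.
perceptual_distance = hamming(average_hash(original), average_hash(brighter))

print(exact_match, perceptual_distance)
```

The toy hash survives a uniform brightness change, but it does more work per image than a plain hash, and a rotation or crop would still throw it off.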

Most high-value copyrighted content is not going to be images. Most content caught by the above method will likely be memes: identical images spread everywhere. Video is higher-value content, but you would have to run the above method on each and every frame. Videos are also likely to be inside some sort of archive, like a ZIP file.
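As a rough back-of-the-envelope sketch of what "each and every frame" means, here is the arithmetic. The durations and frame rates are assumed numbers, not measurements from any real service.

```python
def frames_to_hash(duration_seconds: float, frames_per_second: float) -> int:
    """How many frames a per-frame filter would have to look at."""
    return round(duration_seconds * frames_per_second)


# Assumed examples: a 10-minute clip at 30 fps and a 2-hour film at 24 fps.
clip_frames = frames_to_hash(10 * 60, 30)
film_frames = frames_to_hash(2 * 60 * 60, 24)

print(clip_frames, film_frames)
```

Under those assumptions, a single upload turns one hash into tens or hundreds of thousands of them, before any of the tampering described above is taken into account.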

It should be mentioned that image manipulation is not without potential security vulnerabilities.
There was also a recent security incident affecting the ZIP format and no fewer than 14 programming languages. I avoided that, as I already had a feeling that decompressing random files from random untrusted users would be a generally bad idea.
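The incident isn't named here, so the following is only a sketch of one well-known class of archive bug: entry names that climb out of the extraction directory. Assuming that is the kind of problem meant, a cautious uploader check might look something like this. `unsafe_entries` and `build_demo_archive` are hypothetical helpers, and the check does not cover everything (Windows drive letters, symlinks, decompression bombs).

```python
import io
import posixpath
import zipfile


def unsafe_entries(archive_bytes: bytes) -> list:
    """List entry names that are absolute or escape the extraction root.
    Nothing is extracted; only the names in the archive are inspected."""
    bad = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        for name in zf.namelist():
            normalized = posixpath.normpath(name.replace("\\", "/"))
            if (
                normalized.startswith("/")
                or normalized == ".."
                or normalized.startswith("../")
            ):
                bad.append(name)
    return bad


def build_demo_archive() -> bytes:
    """Build a small in-memory archive with one harmless entry and one
    entry whose name tries to climb out of the target directory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("cat.png", b"not really a png")
        zf.writestr("../../etc/cron.d/evil", b"payload")
    return buf.getvalue()


flagged = unsafe_entries(build_demo_archive())
print(flagged)
```

Even this small check is extra code that has to be written, kept correct and run on every upload, which is rather the point.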

AI systems are also vulnerable to attacks like adversarial examples, and to ones which shunt values over so that things which shouldn't be allowed suddenly are and things which should be allowed suddenly aren't. 4chan, iirc, also managed to turn Microsoft's chatbot, Tay, into a Nazi in approximately 24 hours, simply by talking to it a lot while expressing such views.

The more complex the system is, the higher the chances of a security vulnerability, which is why the most secure systems are often the simplest.

All of this adds dramatically higher resource usage to simple file uploads. It's easily bypassed and only serves to tick "we're doing something" checkboxes. AI is mentioned as a magic solution, but AI tends to do all of the above, with the addition of breaking each frame down into small sectors which it matches and runs similarity analysis on. It seems to fail badly at that on YouTube, and I can't see this being less intensive.

There is also another thing to analyse in a video: music.

Filters are also used fraudulently:

Media companies claiming NASA infringed on their copyright by uploading footage of Mars:

Bird calls mistaken for music by an algorithm: