Watermarking and DRM – One Replacing the Other?
I sat staring at my screen today with a squinty look in my eyes and a soured puss as my wife asked me why I looked so funny. "Meh!" I replied tersely.
The real answer was that I was pondering a question asked by the title of a topical piece penned by CNET’s Matt Rosoff which begged: "Watermarking to Replace DRM?"
I think the reason I looked so perturbed is that it was an overtly stupid innocent question given that it’s pretty obvious that watermarking won’t "replace" DRM, it is merely another accepted application of it.
It doesn’t take much to remember that the ‘M’ in ‘DRM’ stands for management. Tracking how files move around is part of the M. Why is this any different? The point of monitoring anything is either to: (a) gather intelligence which can be used to (b) implement a control or effect a disposition based upon said intelligence.
It’s interesting that in many cases we risk giving up our ‘R’ but that’s a topic for a different post.
So here’s the premise of watermarking — something I think most of us understand:
So what’s watermarking? It’s the insertion of extra data into an audio
stream that can help identify where that audio came from. It’s not
enough to attach data to a digital audio file–users can just burn that
file to a CD and then re-rip it, changing the file format and stripping
off all the data associated with the original file. (This is also the
classic way users get around DRM.) Instead, the data is inserted into
the audio track itself. It’s inaudible to human ears, but detectible by
various other tools.
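To make the difference between attached data and data "inserted into the audio track itself" concrete, here is a minimal Python sketch. It assumes a spread-spectrum style mark, which is one common family of techniques and not necessarily what Universal uses. The function names, the key, the tag name and the strength value are all made up for illustration, and the "burn and re-rip" step is a toy that simply drops the tags and requantizes to 16 bits.

```python
import math
import random

def keyed_sequence(key, n):
    """Pseudo-random +1/-1 sequence derived from a secret key (assumed scheme)."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

def embed(samples, key, strength=0.02):
    """Add a faint keyed pattern to the audio samples themselves."""
    pattern = keyed_sequence(key, len(samples))
    return [s + strength * p for s, p in zip(samples, pattern)]

def score(samples, key):
    """Correlate the audio with the keyed pattern; roughly `strength` if marked."""
    pattern = keyed_sequence(key, len(samples))
    return sum(s * p for s, p in zip(samples, pattern)) / len(samples)

def burn_and_rerip(track):
    """Toy re-rip: attached tags are dropped, samples are requantized to 16 bits."""
    pcm = [round(max(-1.0, min(1.0, s)) * 32767) / 32767 for s in track["samples"]]
    return {"tags": {}, "samples": pcm}

# One second of a 440 Hz tone stands in for the song.
host = [0.5 * math.sin(2 * math.pi * 440 * i / 44100) for i in range(44100)]
original = {"tags": {"source": "hypothetical-id"}, "samples": embed(host, "secret")}
ripped = burn_and_rerip(original)

print(ripped["tags"])                      # the attached data did not survive
print(score(ripped["samples"], "secret"))  # the in-signal mark still scores near 0.02
print(score(host, "secret"))               # unmarked audio typically scores far lower
```

The point of the sketch is only that the mark travels with the samples, so a format change that strips tags does not strip it. Real schemes shape the mark so it stays inaudible and survives lossy compression, which this toy does not attempt.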
What I found interesting from a security and technology perspective was the following:
In the case of Universal, the watermarking data won’t identify each individual file–a
method that would allow the company to trace pirated files back to
their first purchaser. Instead, it will only identify the particular
song. Eventually, Universal will look at popular file-trading networks,
and see which of the DRM-free songs released through its experimental
program ended up on these networks.
Firstly, I don’t believe the first sentence. Sorry, I’m a skeptic. Secondly, this technology and its application aren’t new at all. I have it on very, very good authority that existing technology has been used in this exact manner for the last several years by the RIAA in order to track and monitor P2P file swapping which includes audio. It’s used by government and military operators, also.
How do you think those subpoenas get issued specifically against those 12-year-old girls swapping Shakira MP3s? They can definitively link a specifically watermarked MP3 with the IP address of the downloader after it’s injected into the network and consumed…by using watermarking.
(Ed: Comments below by Jordan suggest that this practice is not used heavily. I cannot dispute this assertion, but I maintain that the technology has been used in this manner. See the comments for an interesting perspective.)
It’s the same technology used by DLP and DRM solutions in the enterprise today. So, watermarking is just another means to the end. Period.
This is the funny part of the story:
Universal can then use this data to
help decide whether the risk of piracy outweighs the increased sales
from DRM-free MP3 files, segmenting this decision by particular
markets. For example, it might find that new Top 40 singles are more
likely to find their way onto file-trading networks than classic rock
from the 1970s.
Sure it will… 😉 I feel all warm and fuzzy now.
/Hoff
* Picture Credit: CNET
I'm in agreement as to the basic nature of watermarks. Heck, they share another characteristic with other more obvious forms of DRM — they're guaranteed to be broken. Watermarks only work when people aren't looking for them or don't know they're there.
I doubt the RIAA's been doing this for a while, if at all. For that to have happened, they'd have to actually be seeding the watermarked versions themselves somewhere unless you think that individual CDs are watermarked?
As someone who's processed… let's just say a "decent" volume of DMCA complaints over the years as a security engineer at a very large public university, I can tell you that if fingerprinting is going on, the RIAA's not making effective use of it. They can't even handle the basics of IPs and timestamps, let alone watermark detection.
1) They've sent notices for files that were innocuous but had key words that /looked/ vaguely similar to a copyrighted work they control. This type of keyword searching produced some funny and obviously wrong results. They clearly had some pretty poorly written spiders or really stupid people driving the process. Fortunately it's been two or three years since they were /this/ off base.
2) They've sent notices for files that were being advertised on P2P networks but, for one reason or another, weren't actually available had they tried to download them (kinda hard to detect a watermark without actually getting the file). One counterargument to this is that most P2P networks report a hash of the file in question, so it's possible they had gathered a copy from elsewhere. Still, as far as I know, they don't actually verify that you're serving copyrighted material. You could stick up a honeypot that claimed to be serving up whatever song you wanted with appropriate hashes and never actually send any real data and still get sent a notification.
3) They've sent notices for machines that flow data conclusively showed weren't doing any P2P at the time of their complaint, or sometimes, for IPs that weren't even routed at the time of the notification. Frankly I've got no idea why they're sometimes this busted.
So if they have been doing this for years, they're doing a pretty poor job of making use of the info in their takedown notification process.
Jordan:
Great comments, but I have no reason to doubt my source.
The intel I received relates to specific files that, as you suggested, they have seeded themselves. I cannot confirm how widely used this is, but the information came from the company that provided the technology to do it.
Also, to be clear, I'm not talking about "text-based" watermarking. I'm talking about the insertion of audio artifacts that even given resampling, etc. are easily detectable.
The files in question that were tracked were downloaded and then directly compared (based upon the title of the song, of course) in order to confirm the file.
The "poor job" of using this technology *could* be attributed to the lack of good automation for sampling originals or to a broken takedown system.
I am indirectly involved in a company that is/was producing technology for the RIAA as well as ASCAP to replace the squads of drones who listen to the radio to record song plays for royalties. You can sample about 3 seconds of a song and get an extremely high confidence return for a match.
This same technology is also being used to detect songs being traded — and this doesn't require a watermark, it just uses a reference sample.
/Hoff
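The reference-sample matching described above can be sketched in a few lines of Python. This is a toy under loud assumptions, not the technology mentioned in the comment: the "songs" are synthetic tone sequences, the fingerprint is simply the strongest of 16 made-up frequency bands per frame, and the excerpt is assumed to start on a frame boundary. All names and constants are hypothetical.

```python
import math
import random

RATE = 8000                                  # samples per second (assumed)
FRAME = 256                                  # samples per analysis frame (assumed)
BANDS = [300 + 100 * k for k in range(16)]   # candidate frequencies in Hz (assumed)

def band_power(frame, freq):
    """Power of a single frequency in one frame (one-bin DFT)."""
    re = sum(s * math.cos(2 * math.pi * freq * i / RATE) for i, s in enumerate(frame))
    im = sum(s * math.sin(2 * math.pi * freq * i / RATE) for i, s in enumerate(frame))
    return re * re + im * im

def fingerprint(samples):
    """One symbol per frame: the index of the strongest band."""
    symbols = []
    for start in range(0, len(samples) - FRAME + 1, FRAME):
        frame = samples[start:start + FRAME]
        powers = [band_power(frame, f) for f in BANDS]
        symbols.append(powers.index(max(powers)))
    return symbols

def best_match(clip_fp, ref_fp):
    """Best fraction of agreeing frames over every alignment of clip within ref."""
    best = 0.0
    for off in range(len(ref_fp) - len(clip_fp) + 1):
        hits = sum(1 for a, b in zip(clip_fp, ref_fp[off:]) if a == b)
        best = max(best, hits / len(clip_fp))
    return best

def synth(notes):
    """Hypothetical 'song': each note is one frame of a pure tone."""
    out = []
    for n in notes:
        out.extend(math.sin(2 * math.pi * BANDS[n] * i / RATE) for i in range(FRAME))
    return out

song_a = synth([(3 * i) % 16 for i in range(40)])
song_b = synth([(5 * i + 7) % 16 for i in range(40)])

# A short, noisy excerpt of song A plays the role of the traded file.
rng = random.Random(0)
clip = [s + rng.gauss(0, 0.3) for s in song_a[5 * FRAME:15 * FRAME]]

clip_fp = fingerprint(clip)
match_a = best_match(clip_fp, fingerprint(song_a))
match_b = best_match(clip_fp, fingerprint(song_b))
print(match_a, match_b)   # the excerpt lines up with song A far better than with song B
```

Nothing is embedded in the audio here. The match comes entirely from comparing the excerpt against stored references, which is why this approach needs no watermark at all.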
Hi Chris, thanks for bringing this up, it's an extremely interesting area. I'm afraid Jordan's right: like any other area of DRM, watermarking is pretty easily broken. Images and audio can be filtered to strip out watermarks in pretty much every case. This is maybe why Universal are taking such a soft approach.
There is actually reasonably strong economic proof that free dissemination of music actually helps the artist make money, but I'm not so sure that would work in the case of video.
DRM, as you well know, is one tough mother to crack. Whatever happened to MPEG-21 and RDF?
OK, hang on a second. Just so we're clear, you're suggesting that the audio equivalent of steganography is easy to detect?
If I use an encoding algorithm for distributing and then sampling artifacts across an audio sample, you're saying that it's easy to detect?
Pray tell.
(and just to be clear, I don't sponsor or support DRM from an economic perspective, but I'm really keen to understand how you guys can generalize that in a 3 minute song, assuming you don't know the encoding mechanism, that it's easy to reverse engineer the decode when not in possession of the decoding mechanism. That's a LOT of sampling opportunity…it could be a reassembly of 10 single 22 kHz sample points across an entire song representing the "watermark.")
I think you're both approaching this from the perspective of encoding a recording with something that literally speaks (outside of human hearing range) "Stolen from Universal.." I'm not.
/Hoff
Hmm, perhaps easy was the wrong word, breakable would perhaps be better. This implies that someone would have to be looking for it in the first place of course. Even with spread spectrum encoding there are bit-errors due to the interference of the original signal. There is always a trade off between the robustness of a watermark and its detectability.
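The bit-error point can be illustrated with a small Python sketch. It assumes a simple direct-sequence scheme where each payload bit is spread over a block of samples using a keyed +1/-1 pattern. The block length, key, payload and strengths are invented for the example, and a plain tone stands in for the music.

```python
import math
import random

CHIP = 2000  # samples used to carry each payload bit (assumed)

def chips(key, n):
    """Keyed pseudo-random +1/-1 pattern shared by embedder and detector."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

def embed_bits(host, bits, key, strength):
    """Spread each bit over CHIP samples as +/- strength times the pattern."""
    pattern = chips(key, len(bits) * CHIP)
    out = list(host)
    for b, bit in enumerate(bits):
        sign = 1.0 if bit else -1.0
        for i in range(b * CHIP, (b + 1) * CHIP):
            out[i] += sign * strength * pattern[i]
    return out

def extract_bits(samples, n_bits, key):
    """Recover each bit from the sign of the correlation over its block."""
    pattern = chips(key, n_bits * CHIP)
    bits = []
    for b in range(n_bits):
        corr = sum(samples[i] * pattern[i] for i in range(b * CHIP, (b + 1) * CHIP))
        bits.append(1 if corr > 0 else 0)
    return bits

def bit_errors(host, bits, key, strength):
    """Embed, extract, and count how many payload bits came back wrong."""
    marked = embed_bits(host, bits, key, strength)
    recovered = extract_bits(marked, len(bits), key)
    return sum(a != b for a, b in zip(bits, recovered))

rng = random.Random(1)
payload = [rng.getrandbits(1) for _ in range(32)]
host = [0.5 * math.sin(2 * math.pi * 440 * i / 44100) for i in range(32 * CHIP)]

errors_faint = bit_errors(host, payload, "k", strength=0.001)
errors_strong = bit_errors(host, payload, "k", strength=0.08)
print(errors_faint, errors_strong)
```

At the faint setting the host audio swamps the mark and bits flip. At the strong setting the payload comes back clean, but the added signal is 80 times larger, which makes it that much easier to hear or to hunt for. That is the trade-off in miniature.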
The point with watermarks is that they're there to keep the good guys good rather than catch the bad guys, and as you say, economically speaking, they are hard to justify by any party. Another great security idea that just isn't going to get the proper attention…