What if Watson analyzed 800 million pages of Dylan critiques and analysis, instead of 800 million pages of lyrics? I bet you could get to the anti-establishment theme. Maybe Watson was just given the wrong set of input data (garbage in, garbage out).
The themes it produced were not inaccurate when it comes to Dylan. The vast majority of his songs (including his most beloved works) are not protest songs. Pretty much everything he did after Bringing It All Back Home is not a protest song. "Like A Rolling Stone" is definitely not a protest song. In fact, most of his work IS about relationships in some form.
I would disregard what the author has to say about Dylan, even though it seems to be the author's primary example. Dylan wrote a ton of songs and took on a couple of different personas over his career. He's not one thing.
Would Watson discover this, though? Even if you marked all the lyrics with a year, would it be able to make that sort of inference? I don't think so. I doubt it is able to form anything like a concept of time, or of a person, or of a person changing over time, especially not from an input of song lyrics.
I don't think that's going very far. Humans who figure this out also have access to context. You wouldn't know a song is protesting against a war unless you know about the war it's protesting against. 800 million pages might seem like overkill, but it pales in comparison to the amount of information humans (sub)consciously use to reach these conclusions. Think about the amount of information required to adequately describe the concept of a protest song to a machine.