lederhosen

From:

Yeah, errors often aren't independent. Also, because the sources are so heterogeneous, we can't ever really assume that we've avoided sampling bias (or that samples are identically distributed), which is why results are so genre-specific and why changing the test corpus can completely alter the outcomes.

The problem in a vast number of fields is that people assume they can just use a standard prepackaged statistical test without having to think about what assumptions it might be making, and that just isn't so.

Definitely. That's certainly a major problem in psychology from what I've seen. A lot of NLP folks are really good with their stats, so they know better, but the problem is that assumptions like IID are so pervasive in statistical theory that sometimes practitioners make assumptions because that's where the math is. Looking for your keys where the light is good instead of where you lost them, and all that.
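
For what it's worth, here's a rough sketch of how much the independence assumption can matter. It's a toy simulation, not anyone's real setup: the function names (`simulate_errors`, `naive_se`, `cluster_se`), the Beta(2, 8) spread of per-document error rates, and the corpus sizes are all made up for illustration. Each document gets its own error rate, so token-level errors within a document are correlated. The standard error is then computed two ways: pretending every token is IID, and treating the document as the sampling unit.

```python
import random
import statistics


def simulate_errors(n_docs=200, tokens_per_doc=50, seed=0):
    """Return per-token 0/1 error indicators, grouped by document.

    Assumption for illustration: each document draws its own error rate
    from Beta(2, 8), so errors are correlated within a document.
    """
    rng = random.Random(seed)
    docs = []
    for _ in range(n_docs):
        p = rng.betavariate(2, 8)  # document-specific error rate
        docs.append([1 if rng.random() < p else 0 for _ in range(tokens_per_doc)])
    return docs


def naive_se(docs):
    """Standard error of the overall error rate, treating every token as IID."""
    tokens = [t for doc in docs for t in doc]
    p = sum(tokens) / len(tokens)
    return (p * (1 - p) / len(tokens)) ** 0.5


def cluster_se(docs):
    """Standard error using per-document error rates as the observations."""
    rates = [sum(doc) / len(doc) for doc in docs]
    return statistics.stdev(rates) / len(rates) ** 0.5


docs = simulate_errors()
print("naive (IID) SE:   ", naive_se(docs))
print("per-document SE:  ", cluster_se(docs))
```

Under these made-up settings the per-document standard error comes out noticeably larger than the naive one, which means a prepackaged test that assumes IID tokens would be overconfident about any difference it reports. How big the gap is in practice depends entirely on how strongly errors cluster in the actual corpus.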