Date: 2010-11-14 08:50 am (UTC)
From: [personal profile] lederhosen
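I don't know anything about NLP, but the testing you're talking about sounds like something where the 'independence of errors' assumption mightn't hold up, which would definitely be a problem for large sample sizes: if the errors are positively correlated, the data carry less information than the raw count suggests, and the test ends up overstating how significant the result is.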
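
To put a rough number on that, here is a minimal sketch using the standard design-effect formula for equal-sized clusters, 1 + (m - 1) * rho. The sample size, cluster size, and correlation below are made-up values for illustration only, and the function name is hypothetical, not from any library.

```python
import math


def effective_sample_size(n, cluster_size, icc):
    """Effective sample size under the equal-cluster design effect.

    n            -- nominal number of observations
    cluster_size -- observations per cluster (m), assumed equal for all clusters
    icc          -- within-cluster correlation of the errors (rho)
    """
    design_effect = 1 + (cluster_size - 1) * icc
    return n / design_effect


# Hypothetical numbers: 10,000 observations in clusters of 100,
# with a within-cluster error correlation of 0.1.
n = 10_000
n_eff = effective_sample_size(n, cluster_size=100, icc=0.1)

# A test that assumes independence uses a standard error proportional to
# 1/sqrt(n); the better-justified one is proportional to 1/sqrt(n_eff).
se_inflation = math.sqrt(n / n_eff)

print(f"effective sample size: {n_eff:.0f}")
print(f"standard error understated by a factor of {se_inflation:.2f}")
```

Under those assumed values the nominal sample is worth far fewer independent observations, so a test that treats all of them as independent will report much smaller p-values than it should.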

The problem in a vast number of fields is that people assume they can just use a standard prepackaged statistical test without having to think about what assumptions it might be making, and that just isn't so.

I am not a great fan of significance testing; while it certainly has its points, I think a lot of its appeal is that it can be used without having to think about certain issues that one probably should think about, e.g. "what are your priors?" The word 'significance' itself is also problematic, because its meaning is so hugely different from the everyday meaning.

Somewhat different topic, but Anscombe's quartet neatly illustrates the pitfalls of relying on a couple of simple numeric measures without actually looking at the data.
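
For anyone who hasn't seen it, a short sketch of the quartet using only the Python standard library. The data are Anscombe's published values; the `pearson_r` helper is my own, written out by hand so nothing beyond `statistics.mean` is needed.

```python
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed directly from its definition."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ssx = sum((x - mx) ** 2 for x in xs)
    ssy = sum((y - my) ** 2 for y in ys)
    return cov / (ssx * ssy) ** 0.5


# Anscombe's quartet: sets I-III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]
y3 = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]
y4 = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]

quartet = {"I": (x123, y1), "II": (x123, y2), "III": (x123, y3), "IV": (x4, y4)}

# Summarise each set by mean of x, mean of y, and correlation (2 decimals).
summary = {
    name: (mean(xs), round(mean(ys), 2), round(pearson_r(xs, ys), 2))
    for name, (xs, ys) in quartet.items()
}

for name, (mx, my, r) in summary.items():
    print(f"set {name}: mean x = {mx}, mean y = {my}, r = {r}")
```

All four sets come out with the same summary line, yet plotted they look nothing alike: one is a noisy line, one is a smooth curve, one is a perfect line with a single outlier, and one is a vertical stack of points plus one far-off value.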