Tuesday, May 28, 2013


Today I spent about 3 hours taking several standardized tests.  Getting to the stage where I would even be taking these tests was a lot more challenging than the tests themselves.  This morning I was having a conversation with someone about another set of tests, for interoperability, and the many and various ways those tests could be (and are) gamed.  Later in the day I was reading about how someone else was accused of gaming meaningful use criteria.  And at some other point, I heard that the HIT Policy Committee was considering using some reported measure results (a measure is a test by any definition of the term) as a proxy for some other test.  I expect to see a lot of discussion about how these measurements are used, gamed, and abused.

In all cases where we focus on testing, observers pay a lot of attention to the edge cases: places where the test can be gamed, or where people do things or hold interpretations that are unintended or that stretch the test's purpose.  And in the overall scheme of things, folks are right to be concerned.  There are, of course, opportunities for abuse.  No, of course you cannot directly compare scores, because they don't account for all ranges of variation.  And why yes, people do get right answers by guessing, or even by making stuff up.  And yes, we can measure this stuff better.

But in general, for most of the people, or organizations, or whatever else you are measuring, when you use the material as intended, the test works, and it does provide some value.  Unless you are in that very small, select group 3 sigma from the mean, or concerned about someone claiming to be there who shouldn't be, I wouldn't worry about it.  It's simply not worth the energy.  Remember MIPS? Millions of Instructions Per Second, also known as Meaningless Indicator of Processor Speed.  At the order of magnitude level, MIPS worked.  It was when you were fussing about the difference between 100 and 110 that MIPS was simply not worth it.

Even I am only a second order deviant, according to some tests.


