My daughter spent the last week on a math project for her Algebra 2 class. The project is a poster looking at data about an Olympic sport. The intent was to compare improvements in the speed of women and men over at least a forty-year period in a single event. She chose Giant Slalom, a downhill ski race in which the participants weave in and out of flags (gates).
She did all of the work: collected the data table by hand, hand-drew the scatter plot, computed estimated best-fit lines (using her programmable calculator), added them to the chart, neatly drew out all of the data, used Cramer's rule and inverse matrices, et cetera. This was several hours of work over the last week, including at least 3 hours last night. Then she decorated her poster and began writing up her conclusions (the last step of the project). Everything fell apart. She struggled with the conclusion because her best-fit lines showed the men getting slightly better, but the women getting much worse. It didn't make any sense.
We went back over her data and discovered significant differences in the numbers. Then we saw that there were also changes in how the data was reported from year to year. I asked her to go back over the data again and put it into a spreadsheet, checking for those changes in how it was reported, while I headed off to get more posterboard (at 9:30pm).
While I was gone, she discovered other problems with the reported data, and couldn't get the forty-year period for both genders that she needed. When I returned, I looked at her problem again. We put all the numbers (which she had again carefully tabulated by hand) into a spreadsheet.
I showed her how to turn that into a scatter plot. Then I showed her how to plot a regression (best-fit) line through each data set and get the equation. "What's R²?" she asked me when I checked the box to display that value too. We looked up the definition of the correlation coefficient on the web. I then explained to her that R² is the square of that coefficient, and that it measures how much of the variation in the data the fitted line accounts for: near 1, the line explains almost everything; near 0, it explains almost nothing. She looked at my plot and noted that the men's results had an R² of 0.000. Yup, I responded. And we went digging further.
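For anyone who wants to see what the spreadsheet was doing behind that checkbox, here is a rough sketch in Python of a least-squares line and its R². The function names are my own, and the times below are invented purely for illustration; they are not her Olympic data. One made-up series drops steadily, the other just bounces around.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Fraction of the variation in ys that the fitted line accounts for."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Invented numbers for illustration only (NOT real race results).
years = [1960, 1970, 1980, 1990, 2000]
steady_times = [110, 108, 106, 104, 102]   # seconds, falling every decade
bouncy_times = [105, 110, 100, 110, 105]   # seconds, no trend at all

for label, times in [("steady", steady_times), ("bouncy", bouncy_times)]:
    slope, intercept = fit_line(years, times)
    r2 = r_squared(years, times, slope, intercept)
    print(f"{label}: slope={slope:.3f} s/year, R^2={r2:.3f}")
```

The steady series gives an R² of essentially 1, and the bouncy one gives 0: the best-fit line through it is flat, so the year tells you nothing about the time. That second case is what the men's plot looked like.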
We discovered after a bit of reading that Giant Slalom is really different races at each event with the same name (her words and emphasis). The course varies each time. The number and position of the gates, their width, and the distances between them can all change from one race to the next.
Her whole approach to the conclusion changed. She reported that she had insufficient data to draw a conclusion and explained why. She wasn't happy. She wanted to get a result showing that women's times were improving faster than men's, or at least that women's and men's were both improving. But she didn't. At least she reported her results accurately. I hope her teacher takes into account what she learned with this project, and grades it appropriately.
Given our society's focus on evidence-based medicine, and my own understanding of how the practice of medicine varies across different providers, I often wonder if they too are running the same races. Earlier this week, e-patients.net posted a blog entry describing the Reproducibility Initiative. That initiative runs what they think is "the same race", to see if they get similar results. Given our recent experience, that sounds like a really good idea.