Monday, March 4, 2013

What's Ailing Introductory Statistics?



Introductory courses are the most important in any academic department. They are often a student's first and only exposure to a discipline, and reach the widest audience of any course in the department. A bad first impression can not only cheat people out of a potential career option, but also leave them with a lasting aversion to an entire field. Introductory statistics has the added importance of being the basis of every scientific field – no pressure there.

So why are there so many introductory statistics horror stories? Introductory statistics has high conceptual overhead but very few computational demands – you can get very far with basic arithmetic, and even further with a little calculus. Without a firm grasp of the conceptual basis, introductory statistics can easily become a vacant exercise in arithmetical procedures.
How do we keep introductory statistics from becoming a rote arithmetic exercise, devoid of all its utility? In my experience, there are a few concepts that people don’t seem to be grasping and retaining:
  • What is randomness? Where does it come from? How does it fit into research?
  • What’s the difference between probability and statistics?
  • What does a null hypothesis mean? Why can we only collect evidence against it?
  • What are Type I Error and Power? What’s the intuition behind test statistics and null distributions?
There are a lot of challenges in constructing a good introductory statistics curriculum. Lots of introductory statistics books are garbage, and developing good lecture material takes a lot of time, which is often in very short supply. We have a wide audience to reach, from the next generation of statisticians to those who just care about degree requirements and letter grades – our job is to convert as many of the latter as possible into the former. This is a formidable task, but not an impossible one.

Fortunately, the tools available to teachers are evolving. I think R has the potential to make a huge impact in statistics education at all levels, for a number of reasons:
  • Free – students can use it outside of computer labs and after they graduate.
  • Open source – makes tinkering easier, and that’s the best way to learn a great deal in a limited time.
  • Community – user groups, bloggers, free open courses, and more, all lending support.
  • R Markdown – weave R into blogs, labs, presentations, and such. When people are copying from a blackboard or slides, they’re not spending as much effort listening, so handing them the finished document helps.
  • Shiny – this seems like an incredibly powerful way to create and share interactive teaching tools, which previously could only be built with JavaScript.
R isn’t the only game in town, but I do think it’s the best way forward for students.
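
As a small illustration of the kind of tinkering I have in mind, here is a minimal sketch that ties Type I error and power to something students can run and change. It simulates many two-group experiments and counts how often a t-test rejects the null hypothesis. The function name reject_rate, the group size of 30, the effect of 0.5 standard deviations, and the 5,000 repetitions are all assumptions chosen for this example, not a recommended design.

```r
# Simulate many two-sample experiments and record how often a t-test rejects.
# reject_rate and its arguments are hypothetical names made up for this sketch.
set.seed(42)

reject_rate <- function(effect, n = 30, n_sims = 5000, alpha = 0.05) {
  p_values <- replicate(n_sims, {
    control   <- rnorm(n, mean = 0, sd = 1)       # group with no treatment
    treatment <- rnorm(n, mean = effect, sd = 1)  # group shifted by `effect`
    t.test(treatment, control)$p.value            # Welch two-sample t-test
  })
  mean(p_values < alpha)  # proportion of experiments that reject the null
}

# Null hypothesis true (no effect): rejections here are all Type I errors.
type1 <- reject_rate(effect = 0)

# Null hypothesis false (true shift of 0.5 SD): the rejection rate estimates power.
power <- reject_rate(effect = 0.5)

cat("Estimated Type I error:", type1, "\n")
cat("Estimated power:", power, "\n")
```

With the null hypothesis true, the rejection rate should land close to alpha. Rerunning with a different n or effect shows power moving while the Type I error rate stays put, which is the intuition the arithmetic alone tends to hide.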
