Summer Camp for Social Scientists! Software Edition

The subject of statistical software fluency is an oft-discussed topic round the brown bag table. The dominance of SPSS in social work has been discussed many times, sometimes controversially. Here at ICPSR I’ve heard a few things about software, my favourite being that number eight of the fabled ten commandments of the summer program was something to the effect of ‘don’t be a software snob.’ Also amusing is that this commandment has been wildly violated by many here this summer.

It’s true that all stats programs have their strengths and weaknesses. However, it must be said that some programs have shorter ranges and shallower depths than others. For example, SPSS and Stata often fall short in terms of graphical analysis and presentation. Still, each program has its strengths. In my scaling class this week, it was mentioned that the only reliable routine for unidimensional unfolding analysis is in SPSS. This was the first time SPSS was cited as having capabilities superior to those of other programs. From what I understand, only one of the courses here uses SAS, and another uses the HLM program, while the rest rely primarily on Stata and R.

In my work here, I have mainly been focusing on becoming fluent in R, although I have learned a bit of JMP and expanded my Stata skills. The R statistical computing environment is pretty incredible. Its breadth and depth are currently unparalleled. Many statistical techniques are simply not available in any other program. Even though it has a reputation for being difficult to use, I find the logic of the R language to be extremely straightforward. It does take the non-programmer longer to learn than other stats syntax. But the way R forces you to work with the moving parts of model estimation encourages a greater understanding of the math and logic underlying the statistics, a benefit that far outweighs any labour costs. R has its more direct downsides of course, like any program. Tonight I ran into a parametric distribution that R did not support and I had to switch back to Stata. Also, I imagine that coding one’s work only in R poses challenges in terms of collaboration, particularly in fields that are dominated by other statistical programs.

This afternoon, I caught one of my professors on the way out of class, and he stressed that one should ‘become bilingual.’ Indeed, learning programming languages is just like learning regular languages: more can very rarely be a bad thing.

Stata, R on the rise while SPSS and SAS decline

Bob Muenchen provides a nice post comparing the current use of major statistical software packages. I like that he (a) acknowledges the limits of R and the GUI, (b) critiques the bizarre number of SAS GUIs, and (c) notes the respectability of SAS Enterprise Guide.
I think some of the benefits of Stata for applied social sciences are underappreciated, e.g., my current favs, esttab and mkspline. I’ve recently tried to do similar procedures in R and found them overly complex. These and numerous other procedures are prepackaged and ready to go for linear models in Stata.
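
For anyone who hasn’t used mkspline, the idea behind its default linear-spline form can be sketched in a few lines. The Python below is only an illustration of the general technique, not Stata’s implementation; the function name `linear_spline_basis` and the example knots are made up for this sketch.

```python
def linear_spline_basis(values, knots):
    """Split each value into piecewise-linear segments at the given knots.

    Hypothetical helper for illustration: each row holds one segment
    variable per interval, and the segments in a row sum back to x.
    """
    knots = sorted(knots)
    rows = []
    for x in values:
        # First segment: the part of x up to the first knot.
        row = [min(x, knots[0])]
        # Middle segments: the part of x lying between consecutive knots.
        for lower, upper in zip(knots, knots[1:]):
            row.append(max(min(x, upper), lower) - lower)
        # Last segment: the part of x beyond the final knot.
        row.append(max(x, knots[-1]) - knots[-1])
        rows.append(row)
    return rows


# Example with assumed knots at 2 and 4.
basis = linear_spline_basis([1, 3, 5], knots=[2, 4])
for x, row in zip([1, 3, 5], basis):
    print(x, row)
```

Entering the resulting columns as predictors in a linear model gives a separate slope for each interval between knots, which is the convenience the Stata command packages up.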

Maybe I am too optimistic, but I’d like to see all these platforms develop support for literate programming and reproducible research.
And it’s nice that he invites replication and offers his data.
