### Stats Software for Intro Class (of non-statisticians)

I am slated to teach an intro stats class to psych majors this upcoming semester, and I would like some software to use consistently throughout the course. I refuse to use commercial software; having myself been trained extensively in SAS only to find it unavailable at my current job, I don't want to create that situation for my students. In short, I'd like to give them something they can keep using as long as they want to use it.

I'd use R if they were science students (more comfortable with computing, etc.) or stats majors, but these students are psychology majors. So I need something easy to install, and which can be made easy (or at least easier) to use. Don't get me wrong--I have used R in intro work before, but it was not smooth.

I would consider R with a front end. Is R Commander better than it was a couple of years ago?

I have already rejected PSPP, as it has no graphics (right?), and Statistical Lab (Statistiklabor) because I am having trouble getting it working well. It also lacks the detailed English-language support that my students might require.

Any other ideas? Any users of OpenEpi or Gretl? Would these work for general stats?

Ideally I'd like the following: some tools for simple non-parametrics, resampling (wishful thinking?), the usual suspects in normal-theory statistics including one- and two-way ANOVA, Fisher's exact test (wishful thinking again?), lots of good graphics, relatively easy data transformations, and the ability to do simple simulations (this may have to be done outside the main package). Obviously I'd like it to be free (beer) and/or free (libre) and/or open source. But I'll settle for anything the students can get for 0.00 USD legally and give to their friends.

I realize that what I want is simplified-R. Any ideas? Thanks in advance for any leads you can give me.
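For what it's worth, much of that wish list is already a one-liner in base R. A sketch, with numbers invented purely for illustration:

```r
# Invented illustration data
scores <- c(4, 6, 5, 8, 9, 7)

# Fisher's exact test on a small 2x2 table
ft <- fisher.test(matrix(c(8, 2, 1, 9), nrow = 2))
ft

# A quick bootstrap resample of the mean (simple simulation/resampling)
boot_means <- replicate(1000, mean(sample(scores, replace = TRUE)))
hist(boot_means)

# A simple data transformation
log_scores <- log(scores)
```

Whether that counts as "simplified" enough for psych majors is, of course, exactly the question.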

X-posted to **stat_geeks**.

**allogenes**: In addition, SPSS--now re-branded as PASW (what does that stand for, anyway?)--is too limiting. It leads psych students down the garden path of doing the wrong statistic for the wrong problem. At some point we have to stop damaging the next generation just because SPSS is what we have always had.

Yeah this is overly dramatic, I know, but as a professional statistician who has to teach real statistics, there comes a point where I have to put my foot down and teach the right material, not the expected material. SPSS fights that at every turn.

But seriously, I do understand your position and appreciate your input.

**bigmel42** (software): I hate SPSS, and not because it's closed-source. However, it *is* the standard software that psychologists use *and* are expected to know. Additionally, unless you are teaching at a severely strapped school, there will likely be a psych computer lab with SPSS available on its machines (verify that).

Or, if you want to make sure they *really* get it, do what I do and avoid software platforms entirely. Here's why: SPSS, JMP, Minitab, etc. are all menu-driven and incredibly easy to figure out. What you need to focus on in the stats class isn't the point-and-click, but the actual work and reasoning behind the methods. You're going to be the only person (or maybe one of two) who will impress upon these students the importance of statistics in psychological (and, indeed, any scientific) research, and you're not doing them any favors if you just say "well, ANOVA is when you have groups, and then you get this table like so... and then this big F here means you reject."

Teach them *why* things are supposed to be done, in addition to the (actual) how, and then leave it to them to figure out how to do the analysis in a software program later. It does no one any good to have to learn software when they're supposed to be learning the techniques -- give these kids a leg up and focus on the theory/skills and make them good practitioners.

If, however, you absolutely can't get by without giving them shortcuts, then use Excel. I hate it, too, but its Analysis ToolPak add-in covers the statistical methods (ANOVA, histograms, regression, etc.) that you will probably need in a stats-for-psychology class.

Good luck.

**allogenes** (Re: software): The course I plan on teaching is the basis of a new textbook I have been writing. This is one of the reasons that I need better software. I need to get students past the "which test do I use?" model and into the "what do I really want to know about the data (or experiment, or ...)?" model. I teach (and have taught with some success) a way of thinking and understanding, not a list of recipes. If I do my job right, knowing any specific package won't matter.

I agree that there is too much emphasis on software, but at the same time, not using any software in a class is not good--you need it to analyze real-sized problems. A class free of real problems does no one any favors. Excel encourages long columns of numbers, and that, IMO, helps no one. (I have tried long columns, and they do not work for me. I have also taught with Excel several times and found it problematic for a variety of reasons.) It is very old-school to look at software as a shortcut. It is no longer so. Software, for better or worse, is an integral part of the research enterprise.

To the other point, there is a persistent illusion that to do standard analyses one has to "program" R. While raw R does not give you pull-down menus, it is hardly programming to do a t-test:
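For example, with invented numbers (real data would come from a file, e.g. via `read.csv`):

```r
# Two invented groups of scores
control <- c(12, 15, 14, 10, 13, 16)
treated <- c(18, 20, 17, 15, 19, 21)

# One function call does the whole two-sample t-test:
# statistic, df, p-value, and a confidence interval
t.test(treated, control)
```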

Or an ANOVA:
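With invented data, a one-way ANOVA is a single call as well:

```r
# Invented scores under three conditions
scores <- c(4, 6, 5, 8, 9, 7, 12, 11, 13)
cond   <- factor(rep(c("A", "B", "C"), each = 3))

# Fit the one-way ANOVA and print the usual F table
fit <- aov(scores ~ cond)
summary(fit)
```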

I grant you that there is a little more going on there, but it is *not* programming. Knowing how to store data correctly is also required by SPSS. There are some GUIs for R that are improving--I just don't know that I trust them yet. :-) But that is why I was asking around about other software options.

By the way--you didn't say why you hate SPSS. What is the problem with it?

Thanks for the input!

**bigmel42** (Re: software): True, there's not a lot of programming to do the basics, but it is still more than a lot of students are probably used to doing (unless they are at a school like mine, where everyone is required to have at least one semester of computer science).

I don't disagree that software is important to research, and that no one is out there manipulating massive matrices by hand anymore. At the risk of coming off as a total Luddite: teaching someone how to do something by hand (for example, the within sum of squares for ANOVA) not only illustrates the math at hand, but reinforces what, exactly, the within sum of squares *is* (oh! so it's the squared differences between the points and their group's mean... that's the only "within" there is). I'm not suggesting that you have them partition the total variance or derive expected sums of squares, but on small problems it is useful to have the work done by hand (or to at least provide the parts and have the students put them together in the right way). By going strictly software- or calculator-based, and teaching to the technology, you risk (as I said above) stripping away the "why" behind using a technique... you're giving them a hammer without telling them what a nail looks like so they can use it appropriately.
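A minimal sketch of that by-hand check in R (numbers invented), compared against the residual sum of squares that `aov` reports:

```r
a <- c(4, 6, 5)   # group A scores (invented)
b <- c(8, 9, 7)   # group B scores (invented)

# Within SS: each point's squared deviation from its OWN group mean
ss_within <- sum((a - mean(a))^2) + sum((b - mean(b))^2)
ss_within   # 2 + 2 = 4

# The same number appears as the residual sum of squares in the ANOVA table
g <- factor(rep(c("A", "B"), each = 3))
anova(aov(c(a, b) ~ g))
```

Doing it both ways once or twice is what cements the connection between the formula and the output.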

I don't think a bit of software tutorial in the course means you would completely ignore the statistical theory, but by becoming too reliant on software too quickly, it's easy to introduce a lot of error into the results. Take variances again: it's one thing to tell students that a variance can never be negative and have that as a mental check. It's quite another to show them *why* it can't be negative. Not that a package would spit out a negative variance (I hope), but there are other sanity checks one does to see if results make sense, which can be lost by simply relying on software.

It'd be like (forgive the hyperbole) giving elementary school students a calculator and telling them that they don't need to understand how 1+1=2, just that, when you type that into a calculator, you get '2' back so that's the answer.

Finally, to answer your question, I hate SPSS for a variety of reasons:

1. It assumes the tail of your tests based on the sign of the test statistic. If, heaven forbid, you have a right-tailed alternative but end up with a negative test statistic (so you looked in the wrong direction--it happens), it will report the left-tailed p-value. (I hate JMP for a similar reason, but at least there it reports all possible p-values.) Right now this is my biggest pet peeve: in an intro class you may be emphasizing how important it is to really understand H0 and HA, and how they relate to p-values and test statistics, but it is a rare student who will really get that when faced with an easy (wrong) output. Even after I pointed out this flaw repeatedly for several weeks, nearly my entire class reported the wrong results from their tests (with real data and hypotheses).

2. It labels the p-value as "significance", which is incredibly misleading.

3. It calculates repeated-measures statistics incorrectly (though we are unsure exactly *how* it is incorrect, because of...

4. Its documentation does not give the formulas used for the different calculations... at least not that I've found.
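To make point 1 concrete: the one-sided p-value has to match the *stated* alternative, not the sign of the statistic. A sketch in R, with an invented statistic and df:

```r
t_stat <- -1.8   # invented: negative observed statistic
df     <- 20     # invented degrees of freedom

# Right-tailed alternative (HA: mu > mu0): area to the RIGHT of t_stat
p_right <- pt(t_stat, df, lower.tail = FALSE)

# The left-tailed value a package might silently report instead
p_left <- pt(t_stat, df)

c(p_right = p_right, p_left = p_left)  # p_right is large, p_left is small
```

With a negative statistic, the honest right-tailed p-value is near 1; reporting the left-tailed value instead flips the conclusion entirely.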

Sorry for the lengthy reply. My soapbox and I are going away now.

**allogenes** (Re: software): Maybe I misstated my position a bit. I am not against *all* "by hand" work. I used to teach linear algebra, and I did make my math students invert or row-reduce a couple of matrices by hand, for practice. I am just against an *all*-by-hand course, which I may have incorrectly assumed you were promoting. I have actually seen that done in several psych departments, and I don't believe that making the students do dozens and dozens of those calculations helps the understanding.

I would never "just" use the computer; I always intended some hand work. But somewhere along the way, the class has to switch to a higher-level process--and for me, giving summary statistics (rather than raw data from which the students derive the summaries) is deceptive. (I have data on this, actually.) So after some demonstrations and exercises for understanding, I need some way to turn bulk data into summaries and tests and intervals and figures... and that is where the software comes in. But I would consider it crazy to just push buttons, so I think we agree there. I certainly have always done the specific things you mention.

As an example of my point--I always teach the SS formulas using the so-called defining formulas, but (in psych) usually do not teach the so-called short-cut formulas; in such a class they are not well used. I have seen students in psych stats classes who were taught *only* the short-cut formulas, as the teacher had them doing *every* problem (with data) by hand. They also spent too much of their time in that course using pre-computed summary stats. Without the data at hand, how do they check the assumptions? Take it on faith in the problem statement? I suppose I am against faith-based statistics. :-) I want them using pictures of the data for every problem, and that needs data and a fast way to get good pictures. Checking assumptions is a habit, and it cannot be established by waving our hands as teachers and saying "assume all is well." That is about half of my justification for computers.

Thanks also for your comments on SPSS. Getting on my own soapbox: that is why I constantly press for open-source software in the sciences. It is exactly problems like your points 2-4 that make the argument a lot of us are making these days: scientific results must be open to inspection, and therefore we need to be able (at least in principle) to know exactly how they were derived. But unfortunately there is a cost to that: being forced to use software that may be built to a different way of doing things (commands rather than menus) than one might prefer. Unfortunate, but life is always a balancing of costs...
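The "fast way to get good pictures" really is fast in R--a sketch with invented data standing in for a class dataset:

```r
set.seed(1)                          # invented data for illustration
x <- rnorm(50, mean = 100, sd = 15)  # e.g. 50 test scores

par(mfrow = c(1, 2))                 # two panels side by side
hist(x, main = "Histogram")          # shape, gaps, outliers
qqnorm(x); qqline(x)                 # quick normality check
```

Two lines of plotting per problem is a low enough cost that assumption-checking can actually become the habit.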

Thanks again!

(Deleted comment)

**allogenes**: Thanks!

**(Anonymous)**: Minitab