II.3.01: Conduct Statistical Tests

Descriptive statistics, as explained in 3.04, are important ways of summarizing what’s going on in your data. Your evaluation question may simply call for a description of the data (for example, showing the severity of a community need or levels of achievement among former program participants). However, if your evaluation purpose goes beyond that (for example, showing that program participation has an influence or effect on post-program behavior), and if your data allow, the next step is to conduct statistical tests. This involves calculating specific “inferential statistics” in order to make inferences from the data. These inferences concern relationships that are believed to hold (e.g., that participation in the program leads to certain outcomes) or patterns that may be meaningful (e.g., that a subset of participants experiences different outcomes from other participants). Both are exercises in making meaning out of the data.

Your data must meet certain key assumptions before you can legitimately use specific inferential techniques such as t-tests or analysis of variance (ANOVA). A simple explanation of statistical tests is that they are designed to assess how likely it is that a pattern detectable in the data could have arisen purely by chance. For example, it may look as though two variables of interest tend to move in the same direction. Suppose one is the score on a knowledge test, and the other is the participant’s class attendance rate. If higher attendance rates tend to go along with higher test scores in the data, it might be tempting to conclude that class attendance caused the increases in tested knowledge.

However, it is also possible that the class lessons had no effect on participants and the pattern simply arose by chance in this sample. The larger the sample, the less likely it is that pure chance alone produced the pattern, but it remains a possibility. (This is one of the reasons getting a larger sample is so important.)
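One way to make the “could this have arisen by chance?” question concrete is a permutation test, sketched below in Python. This is only an illustration, not a method prescribed by this Guide: the attendance and score values are made up, and the function names (`pearson_r`, `permutation_p_value`) are hypothetical helpers written for this example. The idea is to shuffle the test scores many times, which breaks any real link with attendance, and count how often a shuffled dataset shows a correlation at least as strong as the one actually observed.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(xs, ys, n_shuffles=5000, seed=0):
    """Estimate how often chance alone gives a correlation this strong."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real attendance/score link
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add 1 to numerator and denominator so the estimate is never exactly 0.
    return (hits + 1) / (n_shuffles + 1)


# Made-up data for eight participants (assumption, for illustration only).
attendance = [0.55, 0.60, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95]
test_score = [62, 58, 70, 66, 75, 71, 83, 88]

r = pearson_r(attendance, test_score)
p = permutation_p_value(attendance, test_score)
print(f"observed correlation: {r:.2f}, estimated chance probability: {p:.4f}")
```

A small estimated probability suggests the pattern would be unusual if attendance and scores were unrelated. It does not by itself show that attendance caused the higher scores.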

Essentially, there is some underlying “truth” driving the patterns in a dataset, and yet we can only observe the way things ended up turning out. The observed data are therefore necessarily a combination of the underlying true process plus error. In the pattern above, for example, some kids in an educational program might have missed a lot of classes but learned a lot when they were there (i.e., the “truth” is that the quantity of class attendance wasn’t so critical), yet scored poorly because they had a bad day on test day, hadn’t slept well the night before, had a fight on the way to class, and so on. Others might have attended a lot of classes and learned only a little (again, class attendance wasn’t having much effect), but they happened to know the topics that were on the test, or were having a “good memory day,” or were lucky guessers that day. So the data would “look as though” high attendance yielded more knowledge, even though the underlying truth was that the class content really wasn’t the driving factor.

These “errors” due to imperfect or incomplete observation (since we don’t have information on participants’ sleep patterns, moods, distractions, or guessing ability) need to be ordinary, or “normally distributed,” in order for certain statistical tests to work. If the variability of these error factors is large (for example, if scores tend to change widely even for the same student taking the same test at several different times), then it is statistically more difficult to distinguish true patterns from chance patterns, simply because the chance possibilities are so extensive.
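The effect of error variability can be illustrated with a small simulation, sketched below in Python. Everything in it is an assumption chosen for illustration: the made-up “true” process (a baseline score of 60 plus 20 points per unit of attendance rate), the normally distributed error, the sample size of 30, and the hypothetical helper names. For each error level, the sketch first simulates datasets with no true effect to find the correlation strength that chance alone exceeds only 5% of the time, then counts how often datasets with a real effect exceed that cutoff.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_r(n, slope, noise_sd, rng):
    """One simulated study: score = truth (60 + slope * attendance) + error."""
    attendance = [rng.uniform(0.5, 1.0) for _ in range(n)]
    scores = [60 + slope * a + rng.gauss(0, noise_sd) for a in attendance]
    return pearson_r(attendance, scores)


def detection_rate(n, slope, noise_sd, trials=2000, seed=1):
    """Share of simulated studies whose correlation beats the chance cutoff."""
    rng = random.Random(seed)
    # Chance-only studies (slope 0) show what error alone can produce.
    chance_rs = sorted(abs(simulate_r(n, 0.0, noise_sd, rng))
                       for _ in range(trials))
    cutoff = chance_rs[int(0.95 * trials)]  # exceeded by chance about 5% of the time
    detected = sum(abs(simulate_r(n, slope, noise_sd, rng)) > cutoff
                   for _ in range(trials))
    return detected / trials


rates = {}
for noise_sd in (2.0, 10.0):  # small vs. large error variability (assumed values)
    rates[noise_sd] = detection_rate(n=30, slope=20.0, noise_sd=noise_sd)
    print(f"error SD {noise_sd}: true pattern detected in {rates[noise_sd]:.0%} of studies")
```

Under these assumptions, the same true effect should be detected far less often when the error variability is large. Raising `n` in the call to `detection_rate` would be expected to raise the detection rate, which is the sample-size point made earlier.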

Explanations of how to conduct individual statistical tests are beyond the scope of this Guide. The Appendix provides a concise table of options and choices of tests depending on the nature of your data and the evaluation question. The references at the end of that appendix offer specific resources for how to proceed further. An excellent reference, which also contains suggestions for where to find more detailed information, is the Research Methods Knowledge Base. In particular, for clear and accessible explanations of the roles and types of inferential statistics, see Inferential Statistics. You might also want to go through the guided questions on selecting an appropriate statistical analysis using the Selecting Statistics web application.

Q&A

Q: Where can I learn more about how to do analysis?

Quantitative Analysis:

One good, succinct source on analyzing quantitative data is: Taylor-Powell, E. (1996). Analyzing Quantitative Data. Retrieved May 5, 2015, from University of Wisconsin-Extension Cooperative Extension, Program Development and Evaluation Unit Web site: http://learningstore.uwex.edu/Assets/pdfs/G3658-06.pdf

Another useful source is: Trochim, William M. The Research Methods Knowledge Base, 2nd Edition. Internet WWW page, at URL: http://www.socialresearchmethods.net/kb/analysis.php

Qualitative Analysis:

Although it is rather lengthy and in-depth, a very good, highly readable source on qualitative data analysis (and qualitative evaluation and research in general) is: Patton, M. Q. (2015). Qualitative Research & Evaluation Methods. Thousand Oaks, CA: Sage.

For a more succinct source (12 pages) specifically about qualitative analysis, see: Taylor-Powell, E., & Renner, M. (2003). Analyzing Qualitative Data. Retrieved May 5, 2015, from University of Wisconsin-Extension Cooperative Extension, at http://learningstore.uwex.edu/Assets/pdfs/G3658-12.pdf
