Evaluation Planning – 3.06 Evaluation Design
An evaluation design shows how the evaluation is structured with respect to measurement, administration of the program, sampling, and any comparison groups that are included. It provides an important schematic that can be used to guide the choice of data analysis. Simplified general research designs are described below; the final choice of design will also depend on the preferences of the Evaluation Champion and the working group. Once again, we refer you to the literature for more in-depth information on design, including http://www.socialresearchmethods.net/kb/design.php.
Relationship between Designs and Claims
The kinds of claims you can make based on the results of the evaluation depend on the design you choose. For example, if you want to state that participation in the program is related to a change in some outcome, you need a design that assesses change. Not all designs are created equal: some are better than others at supporting a given kind of claim. When considering designs, identify the claim you want to make and select a design that can provide evidence for it. It is also important to consider the feasibility of the design and whether it is appropriate given the lifecycle phase of the program. After reviewing the design options, the working group may decide to revise the evaluation questions.
In addition to considering the kinds of claims you want to make, take note of the language used in the evaluation question itself. For example, if the evaluation question asks whether participation in the program causes outcome X, it implies a design that can assess causality. The strongest design for assessing a cause-and-effect relationship is a Randomized Controlled Trial (RCT; a pre- and post-test design with random assignment to groups). This type of design is considered a Phase 3 (Comparison and Control) Evaluation Lifecycle design and is most appropriate for a Phase 3 (Stability) Program Lifecycle program. By contrast, for a first-time implementation of a new program an RCT would not be appropriate, and something like a post-only case study design would be advisable. The evaluation questions may need to be revised to correspond with the program’s lifecycle phase.
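To make the idea of random assignment concrete, here is a minimal, hypothetical Python sketch. The participant IDs and the function name are illustrative assumptions, not part of any prescribed procedure.

```python
import random

def randomly_assign(participants, seed=None):
    """Randomly split participants into treatment and control groups,
    the defining feature of an RCT design."""
    rng = random.Random(seed)
    shuffled = list(participants)   # copy so the original order is untouched
    rng.shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return {"treatment": shuffled[:midpoint], "control": shuffled[midpoint:]}

# Hypothetical participant IDs, for illustration only.
groups = randomly_assign(["p01", "p02", "p03", "p04", "p05", "p06"], seed=42)
print(groups["treatment"], groups["control"])
```

Fixing the seed makes the assignment reproducible, which is useful when documenting how groups were formed.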
Criteria to Consider when Selecting a Design: There are four criteria to consider when selecting a design: (1) time order, (2) covariation, (3) ruling out other possible causes, and (4) showing change.
- Time order. The design must clearly demonstrate that the “cause” (the program) happened before the “effect” (the outcome we are interested in assessing).
- Covariation. Changes in the “cause” (the program) are related to changes in the “effect” (the outcome of interest). Demonstrating covariation requires a design showing that when the program occurs the outcome of interest occurs, and that when the program does not occur the outcome does not. Typically this is done with at least two groups: one that receives the program (and, ideally, exhibits the outcome of interest) and one that does not receive the program (and, ideally, does not exhibit the outcome).
- Ruling out other possible causes. The design must demonstrate that the program (the presumed “cause”) is the only reasonable explanation for the outcome of interest. This is typically an extremely difficult criterion to meet, since any number of factors other than the program could “cause” the outcome.
- Showing change. Demonstrating that change occurred requires a design that includes a “before and after,” or pre- and post-test.
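One way to keep these criteria in view is to record, for each candidate design, which criteria it can plausibly address. The Python sketch below is an illustrative planning checklist; the design names and criterion assignments are assumptions that follow the general pattern described above, not an authoritative methodological ruling.

```python
# Illustrative mapping of common designs to the selection criteria each
# can address; a planning aid, not a definitive ruling.
CRITERIA = ("time order", "covariation", "rules out other causes", "shows change")

DESIGN_CRITERIA = {
    "post-only":                {"time order"},
    "pre-post":                 {"time order", "shows change"},
    "pre-post with comparison": {"time order", "covariation", "shows change"},
    "RCT (pre-post, random)":   {"time order", "covariation",
                                 "rules out other causes", "shows change"},
}

def unmet_criteria(design):
    """List the criteria a given design does not address."""
    return [c for c in CRITERIA if c not in DESIGN_CRITERIA[design]]

print(unmet_criteria("pre-post"))  # ['covariation', 'rules out other causes']
```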
The strength of the claims we can make depends on how well the design addresses these criteria. The most important thing to consider is alignment. In other words, does the design we select allow us to make the desired claims? The chart below provides examples of some of the more commonly used designs and the associated claims that can typically be made.
[Table: Aligning Claims with Designs — commonly used designs and the claims each typically supports]
For more information on the criteria described above and on designs, see:
http://www.socialresearchmethods.net/kb/desdes.php
Design Notation
We often describe a design using a concise notation that summarizes a complex design structure efficiently. If two or more elements of the same kind function the same way in a design (e.g., all measures are given to all participants at the same time), a single symbol can represent the entire set; if they function differently (e.g., some measures are pre-post and some are post-only), subscripts distinguish them.
- Observations or Measures are symbolized by an ‘O’. Distinguish among specific measures with subscripts, as in O1, O2, and so on.
- The Activity or Program is symbolized with an ‘X’. As with observations, use subscripts to distinguish different activities or program variations.
- Groups are given their own line in the design structure. Samples are divided into groups that do or do not participate in the activity. If the design notation has three lines, there are three functionally distinct groups in the design. Group type – such as “random” (R), or “non-equivalent” (N) – is designated by a letter at the beginning of each line (i.e., group).
- Time moves from left to right.
For example:
O X O Represents a pre-test before and a post-test after the activity
and
N O X O Represents a pre-post group with a non-equivalent comparison
N O O group that didn’t participate in the activity
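If it is helpful to document the design alongside analysis code, the notation can be encoded as a small data structure. Below is a minimal Python sketch; the class and field names are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class GroupLine:
    group_type: str      # 'R' (random), 'N' (non-equivalent), '' if unlabeled
    sequence: List[str]  # e.g., ['O', 'X', 'O']; time runs left to right

def render(design):
    """Print one line per group; blank slots keep later symbols aligned in time."""
    for line in design:
        symbols = [s if s else " " for s in line.sequence]
        print(line.group_type.ljust(2) + " ".join(symbols))

# The second example above: pre-post with a non-equivalent comparison group.
design = [
    GroupLine("N", ["O", "X", "O"]),
    GroupLine("N", ["O", "",  "O"]),   # no 'X': this group gets no activity
]
render(design)
# N O X O
# N O   O
```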
Notice that the design notation tells you something about how the participants are organized or grouped in an evaluation (this relates to sampling) and how the measures are sequenced (this relates to measurement). The structure of a design will also usually circumscribe how the collected data can be analyzed, which makes design a fairly central topic in evaluation planning.
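For example, the two-group pre-post design above points directly to comparing pre-to-post gains between the program and comparison groups. The following Python sketch uses made-up scores purely to show how the design’s structure shapes the analysis; a real analysis would also apply an appropriate statistical test.

```python
# Hypothetical pre/post scores for the N O X O / N O O design above.
program    = {"pre": [10, 12, 11, 13], "post": [15, 17, 14, 18]}
comparison = {"pre": [11, 10, 12, 12], "post": [12, 11, 12, 13]}

def mean_gain(group):
    """Average post-minus-pre change for one group."""
    gains = [post - pre for pre, post in zip(group["pre"], group["post"])]
    return sum(gains) / len(gains)

# The design's structure dictates the comparison: change in the program
# group relative to change in the comparison group.
print("program gain:   ", mean_gain(program))      # 4.5
print("comparison gain:", mean_gain(comparison))   # 0.75
```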
As always, it is important to keep the evaluation questions in mind when thinking through the various aspects of evaluation planning. If this is not done, there is the danger of developing a nice evaluation design that doesn’t actually help to answer the focal questions.
As with the measures section, there are a few key questions to consider once the design has been outlined:
- Is there a clear connection between the evaluation questions, chosen measures and the resulting design?
- Is the design appropriate given the claims that you would like to be able to make?
- Is this design appropriate for this program’s lifecycle?
- Is this design feasible given the program resources and organizational capacity?
- Is this design feasible given the duration and setting of the program? For example, a short 30-minute activity does not lend itself to an elaborate pre-post measure.
It’s important to link design issues to the lifecycle of the program. As you learned in the Lifecycle Analysis step, we believe the ultimate goal is for the evaluation lifecycle to be aligned with the program lifecycle. Different evaluation designs are more or less appropriate depending on the program lifecycle phase.
When writing the design plan for the evaluation, consider each evaluation question and describe in detail the design type(s) (e.g., post-only, pre-post, pre-post with comparison group, etc.). Make sure the design(s) address each of the evaluation questions, are appropriate given the lifecycle stage of the program, and are appropriate for generating evidence for the desired claims.