evaluation design will still yield useful information. Yet a strong logic model within a rigorous evaluation design will enable much stronger conclusions regarding program effectiveness and impact. As you have likely surmised, a weak logic model within a strong evaluation design provides little useful information, just as an unreadable treasure map within a sturdy home brings you no closer to the treasure. That said, in this section you will add strength and depth to your logic model by continuing to build upon the evaluation matrix you began in Chapter 7. Methods and tools will be identified or developed for each indicator on your logic model, addressing the question, How will you collect your data?
Although there are many evaluation methods, most are classified as qualitative, quantitative, or both. Qualitative methods rely primarily on noncategorical, free response, observational, or narrative descriptions of a program, collected through methods such as open-ended survey items, interviews, or observations. Quantitative methods, on the other hand, rely primarily on discrete categories, such as counts, numbers, and multiple-choice responses. Qualitative and quantitative methods reinforce each other in an evaluation, as qualitative data can help to describe, illuminate, and provide a depth of understanding to quantitative findings. For this reason, you may want to choose an evaluation design that includes a combination of qualitative and quantitative methods, commonly referred to as mixed methods. Some common evaluation methods are discussed below and include assessments and tests, surveys and questionnaires, interviews and focus groups, observations, existing data, portfolios, and case studies. Rubrics are also included as an evaluation tool that is often used to score, categorize, or code interviews, observations, portfolios, qualitative assessments, and case studies.
Qualitative methods: evaluation methods that rely on noncategorical data and free response, observational, or narrative descriptions.
Quantitative methods: evaluation methods that rely on categorical or numerical data.
Mixed methods: evaluation methods that rely on both quantitative and qualitative data.
Before delving into different methods, it is worth mentioning the ways in which the terms assessment and survey are sometimes used and misused. First, while the term “survey” is sometimes used synonymously with “evaluation,” evaluation does not mean survey. A survey is a tool that can be used in an evaluation, and it is perhaps one of the most common tools used in evaluation, but it is just one tool nonetheless.
Another terminology confusion is between “assessment” and “evaluation.” These too are often used interchangeably. However, many in the field of evaluation would argue that assessment has a quantitative connotation, while evaluation can draw on mixed methods.
Similarly, the term “measurement” is often used synonymously with “assessment,” and measurement too has a quantitative connotation. I believe the confusion lies in the terms “assess,” “evaluate,” and “measure”; they are synonyms. So, it only makes sense that assessment and evaluation, and sometimes measurement, are used synonymously. And while there is nothing inherently wrong with using these terms interchangeably, it is a good idea to ask for clarification when the terms assessment and measurement are used. Some major funders use the term “assessment plan” to mean “evaluation plan,” but others may use the term assessment as an indication that they would like quantitative measurement. The takeaway from this is to communicate with stakeholders such that the evaluation (or assessment) you design meets their information needs and expectations.
8.2.1 Qualitative Methods
Qualitative methods focus on noncategorical, observational, or narrative data. Evaluation using qualitative methods is primarily inductive, in that data are collected and examined for patterns. These patterns are then used to make generalizations and formulate hypotheses based on these generalizations. Qualitative methods include interviews and focus groups, observations, some types of existing data, portfolios, and case studies. Each method is described in the following paragraphs.
Interviews and focus groups (qualitative) are typically conducted face-to-face or over the phone. We also conduct individual interviews using video conferencing software. Focus groups are group interviews and can also be conducted using video conferencing software, but I have found it is difficult to maintain the richness of discussion found in face-to-face focus groups when conducted using video. However, I have no doubt that as we become more skilled at facilitating group discussions where individuals are in varied locations, video focus groups will become an important and invaluable mode of research. The list of interview and focus group questions is referred to as a protocol; an interview protocol can be created with questions to address your specific information needs. The interviewer can use follow-up questions and probes as necessary to clarify responses. However, interviews and focus groups take time to conduct and analyze. Due to the time-consuming nature of interviews and focus groups, sample sizes are typically small, and research costs can be high. See Interviews in Qualitative Research (King, Horrocks, & Brooks, 2018) and Focus Groups (Krueger & Casey, 2014) for more information on designing and conducting interviews and focus groups.
Observations (usually qualitative but can be quantitative) can be used to collect information about people’s behavior, such as teachers’ classroom instruction or students’ active engagement. Observations can be scored using a rubric or through theme-based analyses, and multiple observations are necessary to ensure that findings are grounded. Because of this, observational techniques tend to be time-consuming and expensive, but they can provide an extremely rich description of program implementation. See the observation section of the Robert Wood Johnson Foundation’s Qualitative Research Guidelines Project (Cohen & Crabtree, 2006) for more information and a list of resources on using observation in research.
Existing data (usually quantitative but can be qualitative) are often overlooked but can be an excellent and readily available source of evaluation information. Using existing data such as school records (e.g., student grades, test scores, graduation rate, truancy data, and behavioral infractions), work samples, and lesson plans, as well as documentation regarding school or district policy and procedures, minimizes the data collection burden. However, despite the availability and convenience, you should critically examine the quality of existing data and whether they meet your evaluation needs.
Portfolios (typically qualitative) are collections of work samples and can be used to examine the progress of the program’s participants throughout the program’s operation. Work samples from before (pre) and after (post) program implementation can be compared and scored using rubrics to measure growth. Portfolios can show tangible and powerful evidence of growth and can be used as concrete examples when reporting program results. However, scoring can be subjective and is highly dependent upon the strength of the rubric and the training of the portfolio scorers; in addition, the use of rubrics in research can be very resource intensive (Herman & Winters, 1994).
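As a minimal sketch of the pre/post comparison described above, the short Python snippet below computes each participant’s growth from rubric scores assigned before and after a program. The participant labels and scores are hypothetical, invented only for illustration; a real evaluation would rely on trained scorers and a tested rubric.

```python
# Hypothetical sketch: comparing pre- and post-program portfolio rubric scores.
# Participant labels and scores are invented for illustration only.

pre_scores = {"participant_01": 2, "participant_02": 1, "participant_03": 3}
post_scores = {"participant_01": 4, "participant_02": 3, "participant_03": 3}

# Growth is simply the change in rubric score for each participant.
growth = {p: post_scores[p] - pre_scores[p] for p in pre_scores}
average_growth = sum(growth.values()) / len(growth)

print(growth)           # per-participant change in rubric score
print(average_growth)   # mean growth across the portfolio sample
```

Even a simple summary like this can make portfolio evidence easier to report, though the numbers are only as trustworthy as the rubric and the scoring process behind them.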
Case studies (mostly qualitative but can include quantitative data) are in-depth examinations of a person, group of people, or context. Case studies can include a combination of any of the methods reviewed above. Case studies look at the big picture and investigate the interrelationships among data. For instance, a case study of a school might include interviews with teachers and parents, observations in the classroom, student surveys, student work, and test scores. Combining many methods into a case study can provide a rich picture of how a program is used, where a program might be improved, and any variation in findings from using different methods. Using multiple, mixed methods in an evaluation allows for a deeper understanding of a program, as well as a more accurate picture of how a program operates and its successes. See Yin (2017) for more information on case study research.
8.2.2 Quantitative Methods
Quantitative methods focus on categorical or numerical data. Evaluation based on quantitative data is primarily deductive, in that it begins with a hypothesis and uses the data to draw specific conclusions. Quantitative methods include assessments and tests, as well as surveys and questionnaires, and some types of existing data. Each method is described in the following paragraphs.
Assessments and tests (typically quantitative but can include qualitative items) are often used prior to program implementation (pre) and again at program completion (post), or at various times during program implementation, to assess program progress and results. Assessments are also referred to as tests or instruments. Results of assessments are usually objective, and multiple items can be used in combination to create a subscale, often providing a more reliable estimate than any single item (see Wright, 2007). If a program is intended to decrease depression or improve self-confidence, you will likely want to use an existing assessment that measures depression or self-confidence. If you want to measure knowledge of organizational policies, you may decide to create a test based on the policies specific to the organization.
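To make the idea of a subscale concrete, here is a minimal Python sketch showing how several related items might be combined into a single subscale score by averaging. The item names and responses are hypothetical and invented for illustration; they do not come from any particular instrument.

```python
# Hypothetical sketch: combining several related assessment items into a
# subscale score. Item names and responses are invented for illustration.

responses = {
    "item_1": 4,  # e.g., a 1-5 agreement rating on a self-confidence statement
    "item_2": 3,
    "item_3": 5,
    "item_4": 4,
}

# A simple subscale score is the mean of its items; averaging several items
# usually gives a more stable estimate than relying on any single item.
subscale_score = sum(responses.values()) / len(responses)
print(subscale_score)  # 4.0
```

Published instruments typically specify exactly which items form each subscale and how they are scored, so an existing scoring guide should be followed whenever one is available.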
However, before using assessment or test data, you should be sure that the assessment adequately addresses what the program intends to achieve. You would not want the success or failure of the program to be determined by an assessment that does not accurately measure the program’s outcomes.
The reliability and validity of an instrument are important considerations when selecting and using instruments such as assessments and tests (as well as surveys and questionnaires). Reliability is the consistency with which an instrument measures whatever it intends to measure. There are three common types of reliability: internal consistency reliability, test–retest reliability, and inter-rater reliability. See Figure 8.2 for a description of each type of reliability.
Reliability: the consistency with which an instrument measures something.
Validity is the accuracy with which an instrument measures a construct. The construct might be anxiety, aptitude, achievement, alcoholism, or self-confidence. There are four types of validity: content validity, construct validity, criterion-related validity, and consequential validity. See Figure 8.2 for more information on each type of validity.
Validity: the accuracy with which an instrument measures a construct.
Figure 8.2 Reliability and Validity
When choosing an assessment or creating your own instrument, you should investigate the technical qualities of reliability and validity to be sure the test is consistent in its measurement and to verify that it does indeed measure what you need to measure. Further, taking a subset of items from a validated instrument to create a new instrument does in fact create a new instrument, with untested reliability and validity. Results from an instrument that is not valid are, in turn, not valid. That is, using an instrument that has not been validated through the examination of reliability and validity can result in erroneous and costly decisions being made based upon those data.
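As a rough illustration of two of the reliability checks named above, the Python sketch below computes internal consistency (Cronbach’s alpha) and a test–retest correlation from pilot data. All of the item responses and the second-administration totals are hypothetical, invented only to show the arithmetic; a full psychometric review would go well beyond this.

```python
# Hypothetical sketch: two quick technical checks on pilot data before adopting
# an instrument. The responses below are invented for illustration only.
from statistics import variance, correlation  # correlation requires Python 3.10+

# Each inner list holds one respondent's answers to the instrument's four items.
items = [
    [4, 3, 4, 5],
    [2, 2, 3, 2],
    [5, 4, 4, 5],
    [3, 3, 2, 3],
    [4, 4, 5, 4],
]

# Internal consistency (Cronbach's alpha): do the items hang together?
k = len(items[0])
item_vars = [variance([r[i] for r in items]) for i in range(k)]
total_var = variance([sum(r) for r in items])
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Test-retest reliability: correlate total scores from two administrations.
time1 = [sum(r) for r in items]
time2 = [15, 10, 17, 11, 16]  # invented totals from a second administration
retest_r = correlation(time1, time2)

print(round(alpha, 2), round(retest_r, 2))
```

Checks like these describe only reliability; validity still has to be examined separately, for example by comparing the instrument against established measures of the same construct.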
Surveys and questionnaires (typically quantitative but can include qualitative items) are often used to collect information from large numbers of respondents. They can be administered online, on paper, in person, or over the phone. In order for surveys to provide useful information, the questions must be worded clearly and succinctly. Survey items can be open-ended or closed-ended.
Open-ended survey items allow respondents to provide free-form responses to questions and are typically scored using a rubric. A rubric is a scoring guide used to categorize text-based or observational information based upon set criteria or elements of performance. See Figure 8.3 for more information on rubrics. Closed-ended items give the respondent a choice of responses, often on a scale from 1 to 4 or 1 to 5. Surveys can be quickly administered, are usually easy to analyze, and can be adapted to fit specific situations.
Rubric: a guideline that can be used objectively to examine subjective data.
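The small Python sketch below shows one way a simple analytic rubric might be represented and applied to an open-ended response: each criterion has described levels, and a scorer’s ratings are totaled. The criteria, level descriptions, and ratings are hypothetical, invented purely for illustration.

```python
# Hypothetical sketch: a simple analytic rubric represented as criteria with
# described scoring levels, plus a helper that totals one scorer's ratings.

rubric = {
    "relevance": {1: "off topic", 2: "partially addresses the question",
                  3: "fully addresses the question"},
    "evidence":  {1: "no examples given", 2: "one vague example",
                  3: "specific, concrete examples"},
}

def total_score(scores: dict[str, int]) -> int:
    """Sum the level assigned to each rubric criterion for one response."""
    return sum(scores[criterion] for criterion in rubric)

# One scorer's ratings of a single open-ended survey response.
print(total_score({"relevance": 3, "evidence": 2}))  # 5
```

In practice, the value of a rubric comes less from the arithmetic than from clear level descriptions and scorer training, which together keep the scoring of subjective material consistent.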
Building a survey in conjunction with other methods and tools can help you to understand your findings better. For instance, designing a survey to explore findings from observations or document reviews can enable you to compare findings among multiple sources. Validating your findings using multiple methods gives the evaluator more confidence regarding evaluation findings.
Figure 8.3 Scoring Rubrics
Using a previously administered survey can save you time, may give you something to compare your results to (if previous results are available), and may give you confidence that some of the potential problems have already been addressed. Two notes of caution, however, in using surveys that others have developed: (1) Be sure the instrument has been tested and demonstrated to be reliable and valid for the intended population, and (2) be sure the survey addresses your evaluation needs. It is tempting to use an already developed survey without thinking critically about whether it will truly answer your evaluation questions. Existing surveys may need to be adapted to fit your specific needs.
See Survey Research Methods (Fowler, 2013) for more information on designing, administering, and analyzing surveys.
8.2.3 Mixed Methods
Mixed-method studies combine both qualitative and quantitative methods. For example, an evaluation of a program intended to increase the retention of faculty from underrepresented groups in the STEM fields (science, technology, engineering, and math) might