1. Operationalization
- operationalized construct
- still somewhat abstract
Specific, concrete method to measure/manipulate a construct.
Operationalization means selection or creation of a specific procedure to measure or manipulate the construct of interest.
An operationalization makes it possible to assign people an actual score on the variable of interest.
An operationalization doesn’t necessarily capture or represent the construct in its entirety.
Keep in mind what aspect of the construct the operationalization actually measures or manipulates.
2. Measurement
Measurement Structure
- What information is (not) captured with numbers?
- measurement is the representation of relations between objects, people or groups on a property by relations between numbers.
- the relation:
- differentiate
- order
- compare differences
- compare ratios
Measurement levels
Nominal (subjective classification)
- categorize the values
- distinguish between values(inequality)
- only differentiate values
- eg.: nationality, sex, pet preference
Ordinal (subjective ordering)
- ordering of values
- **order differences do not determine quantitative differences.**
- eg.: math ability
- differences or ratios of scores don’t reflect differences or ratios of math ability
Interval variable (subjective quantification)
- not only distinguish and order values, but also interpret differences between values.
- eg.: temperature
- cannot say water at 80F is twice as hot as water at 40F because the zero point of temperature is arbitrarily defined.
- Celsius 0 = fresh water freezes vs. Fahrenheit 0 = brine freezes
- The value zero doesn’t correspond to the absence of temperature.
Ratio variable (objective quantification)
- The zero length is the same whether you measure in inches or in centimeters.
- rare in social science
- the structure of a property doesn’t have to be fully captured by a measurement instrument.
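The difference between the interval and ratio levels can be checked with a small calculation. The sketch below is only an illustration and is not part of the original notes; the helper names are made up. It shows that the ratio of two temperatures changes with the unit, while differences stay comparable, and that the ratio of two lengths is the same in inches and centimeters.

```python
def fahrenheit_to_celsius(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9


def inches_to_cm(inches):
    """Convert inches to centimeters (1 inch = 2.54 cm)."""
    return inches * 2.54


# Interval level: the ratio of two temperatures depends on the unit,
# because each scale puts its zero point somewhere different.
ratio_fahrenheit = 80 / 40
ratio_celsius = fahrenheit_to_celsius(80) / fahrenheit_to_celsius(40)
print(ratio_fahrenheit, round(ratio_celsius, 2))  # 2.0 6.0

# Differences are still comparable: 80F-60F and 60F-40F are equal gaps
# in both units.
gap_high = fahrenheit_to_celsius(80) - fahrenheit_to_celsius(60)
gap_low = fahrenheit_to_celsius(60) - fahrenheit_to_celsius(40)
print(round(gap_high / gap_low, 2))  # 1.0

# Ratio level: zero length is the same in every unit, so ratios survive.
print(round(inches_to_cm(80) / inches_to_cm(40), 2))  # 2.0
```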
3. Variable Types
number of categories
dichotomous/binary: 2
- male/female
- under 20 / 20 and above
polytomous: more than 2
categorical variables
- distance between values uninterpretable
- interpretation: frequency, mode, median (ordinal only)
quantitative variables
- reflect inequality, order and extent of differences
- interval/ratio
Continuous/discrete variables
- continuous: a value can always be found between any two other values
- discrete: limited set of values
- nominal/ordinal variables are always discrete by nature
- quantitative variables can also be discrete
4. Measurement Validity
Face validity
- expert assessment
- experts can be wrong
Predictive/criterion validity
- Instrument predicts relevant property
- Something is measured consistently
- not necessarily intended construct
- the ability to predict something doesn’t mean the scores used for prediction accurately reflect the intended construct.
Gold standard
- already had a valid instrument for the property of interest.
- administer both instruments and see whether the scores on the new scale agreed with the already validated scale.
- there aren’t many gold standard instruments for social and psychological constructs.
Direct empirical verification
- for social and psychological constructs, there is no undisputed, direct way to determine whether one person is more intelligent than another.
Convergent/discriminant validity
- see whether the scores relate to similar and different variables in the way we expect.
Multi-trait multi-method matrix approach (MTMM)
- different instruments to measure different traits
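A rough way to picture the convergent/discriminant check is to correlate scores on a new scale with a related measure and with an unrelated one. The sketch below uses made-up scores and hypothetical variable names (a new cat fondness scale, an existing related measure, and shoe size as an unrelated variable); it is an illustration under those assumptions, not a full MTMM analysis.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Made-up scores for 8 respondents (illustration only).
new_scale = [12, 18, 9, 22, 15, 20, 7, 17]       # new cat fondness scale
related_measure = [14, 19, 10, 24, 13, 21, 9, 16]  # existing, similar construct
shoe_size = [39, 41, 38, 40, 40, 38, 41, 39]     # should be unrelated

# Convergent evidence: a high correlation with the related measure.
print(round(pearson(new_scale, related_measure), 2))
# Discriminant evidence: a correlation near zero with the unrelated variable.
print(round(pearson(new_scale, shoe_size), 2))
```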
5. Measurement Reliability
the instrument’s consistency, stability or precision
- not applicable to the memory test
internal consistency
- look at the consistency between different parts of the instrument at one time
split-halves reliability
- randomly split the test in half and assess the association between the scores on the two halves.
- there are also statistics that are equivalent to the average of all possible ways to split the test.
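The two ideas above can be sketched in a few lines. The notes do not name the statistic that averages over all splits; Cronbach's alpha is a commonly used one, so treating it as the intended statistic is an assumption. The data and function names below are made up for illustration.

```python
import random
from math import sqrt
from statistics import variance


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def split_half(scores, seed=0):
    """Correlate sum scores on two randomly chosen halves of the items."""
    items = list(range(len(scores[0])))
    random.Random(seed).shuffle(items)
    half = len(items) // 2
    first = [sum(row[i] for i in items[:half]) for row in scores]
    second = [sum(row[i] for i in items[half:]) for row in scores]
    return pearson(first, second)


def cronbach_alpha(scores):
    """Cronbach's alpha from item variances and total-score variance."""
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Made-up data: 6 respondents (rows) answering 4 items (columns), scored 1-5.
scores = [
    [4, 5, 4, 5],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [3, 4, 3, 3],
]
print(round(split_half(scores), 2))
print(round(cronbach_alpha(scores), 2))
```

A single split depends on which items happen to land in each half, which is why a statistic that does not depend on one particular split is often preferred.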
- if measurement consists of observation instead of self-report:
- intra-observer reliability: the same observer rates the behavior twice, and the association between the two assessments is assessed.
- the memory of the observer can inflate the association
- inter-rater reliability: two different people observe and rate the behavior, and the association between the two raters' scores is assessed.
Systematic error
- systematically measure an additional construct
- cat fondness scale + general positive attitude?
- less valid but not less reliable
Random error
- error that’s entirely due to chance: random fluctuations / noise
- reliability required for validity
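- validity not required for reliability
- a measurement method must have some reliability before it can be valid; the reverse does not hold.
- a highly reliable measurement method can still be completely invalid: this happens when what it measures perfectly is not the construct it is supposed to measure.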
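The contrast between systematic and random error can be made concrete with a small simulation. Everything in the sketch below is an assumption made for illustration: normally distributed true scores, a second construct (general positive attitude) that leaks into the scale, the chosen noise levels, and reliability estimated as the correlation between two administrations.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def simulate(n=500, seed=1):
    """Compare reliability and validity under different kinds of error."""
    rng = random.Random(seed)
    cat_fondness = [rng.gauss(0, 1) for _ in range(n)]  # intended construct
    attitude = [rng.gauss(0, 1) for _ in range(n)]      # unintended construct

    def administer(systematic, noise_sd):
        # One administration: true score, optional systematic error,
        # plus fresh random noise for every respondent.
        return [t + (a if systematic else 0) + rng.gauss(0, noise_sd)
                for t, a in zip(cat_fondness, attitude)]

    results = {}
    for name, systematic, noise_sd in [
        ("clean", False, 0.1),
        ("systematic error", True, 0.1),
        ("random error", False, 1.5),
    ]:
        first = administer(systematic, noise_sd)
        second = administer(systematic, noise_sd)
        results[name] = {
            "reliability": pearson(first, second),      # consistency
            "validity": pearson(first, cat_fondness),   # match with construct
        }
    return results


for name, stats in simulate().items():
    print(name, {k: round(v, 2) for k, v in stats.items()})
```

Under these assumptions the scale with systematic error stays about as reliable as the clean one but tracks the intended construct less well, while the scale with large random error loses reliability.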
6. Survey Questionnaire Test
- survey: asks for different types of information and covers different topics
- questionnaire: measures one construct or related constructs, such as psychological traits, emotional states or attitudes
- test: measures ability
a clear instruction
- cover story
Interviewers/ on-line application
items
- a series of questions
stem
- questions, statements or words that a participant has to respond to
- response options: discrete options or a continuous range
scale
- items measure the same construct or the same aspect of a construct
- subscale
sum score
- indicates a person’s value on the property
7. Scales and Response Options
Likert scale
summative scale
items measure the same property
monotone: higher score = higher value on property
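A summative scale boils down to adding item scores. The sketch below is a minimal illustration, assuming a 1 to 5 response range and assuming that negatively phrased items are reverse-coded before summing so that a higher score always means a higher value on the property; the function and parameter names are hypothetical.

```python
def sum_score(responses, reverse_items=(), low=1, high=5):
    """Add up item responses, reverse-coding negatively phrased items.

    responses: one score per item, each between low and high.
    reverse_items: indexes of items phrased in the opposite direction.
    """
    total = 0
    for index, response in enumerate(responses):
        if not low <= response <= high:
            raise ValueError(f"response {response} outside {low}-{high}")
        if index in reverse_items:
            # Mirror the score: on a 1-5 scale, 1 becomes 5 and 2 becomes 4.
            response = low + high - response
        total += response
    return total


# Item 2 is negatively phrased, so its score of 2 counts as 4.
print(sum_score([5, 4, 2, 5], reverse_items={2}))  # 18
```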
scale construction
items should be
- well formulated = short and simple
- unambiguous: avoid double-barrelled questions
not suggestive (neutral)
- eg.: don’t you think…?
a filter question in advance
avoid extreme wording
- eg.: words like never or always
response options should be
- unambiguous
- consistent
- all respondents should be able to reflect their position on the property
- mutually exclusive
other types of rare scales
- differential scale
- allow for non-monotone items
- cumulative scale
- items themselves show consistent ordering
- each item expressing the property more strongly than the previous one
8. Response and Rater Bias
There is always some degree of systematic error or bias.
Response sets / response styles
Self-report bias
acquiescence
- the tendency to agree with all statements regardless of their content
- solution: include some negatively phrased items
social desirability
- the responses of people who tend to present themselves more favorably or in more socially acceptable ways.
- occur if a scale measures a property that’s considered socially sensitive or relevant to someone’s self image.
- **solution: add social desirability items such as "I’ve never stolen anything in my life" or "I’ve never lied to anyone".**
- If people strongly agree with these items, there’s a fair chance that their responses to other questions are biased towards responses that are more socially acceptable.
extreme response style
- respondents don’t want to think about exactly how strongly they agree or disagree with an item, so they choose the most extreme options.
- unlike acquiescence bias, participants’ responses are consistent, but just more extreme than their true value.
bias towards the middle
- respondents tend to choose a less extreme response option.
- solution: include some extremely strong items such as "cats are purely evil creatures".
- if they respond with a middle category to all items, including these extremely worded items, their response pattern is inconsistent.
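The checks described in this section can be written as simple rules over one respondent's answers. The sketch below is a rough heuristic, not an established procedure: the function name, the 1 to 5 response range and the exact rules are assumptions made for illustration.

```python
def flag_response_patterns(responses, desirability_items=(), negative_items=(),
                           low=1, high=5):
    """Return heuristic warning flags for one respondent's item scores.

    responses: raw (not reverse-coded) scores, one per item.
    desirability_items: indexes of social desirability check items.
    negative_items: indexes of negatively phrased items.
    """
    middle = (low + high) / 2
    flags = []
    # Agreeing with positively AND negatively phrased items is inconsistent.
    if negative_items and all(r > middle for r in responses):
        flags.append("acquiescence")
    # Strongest agreement with "I've never lied to anyone"-type items.
    if desirability_items and all(
        responses[i] == high for i in desirability_items
    ):
        flags.append("social desirability")
    # Middle category everywhere, even on extremely worded items.
    if all(r == middle for r in responses):
        flags.append("bias towards the middle")
    return flags


# Item 1 is negatively phrased; item 4 is a social desirability item.
print(flag_response_patterns([5, 4, 5, 4, 5],
                             desirability_items=[4], negative_items=[1]))
```

A flag only marks a response pattern for closer inspection; it does not prove that a given respondent is biased.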
Observer Rating
halo effect
- positivity/negativity on one dimension spills over to other dimensions
- eg. more attractive people are rated more intelligent or better at their job.
generosity errors
severity errors
9. Other Measurement Types
Physical measurements
- medicine/biology/psychology
- skin conductance –> arousal
- eye tracking –> focus of attention
- EEG/fMRI –> brain/cognitive activity
Observation
- sociology/psychology/educational sciences
- careful registration of specific behavior
- employ coding schemes that specify categories of behavior and their criteria
- what the behavior in each category looks like
- how long it should be displayed
- under what circumstances it should occur
- time frame for coding
- training of observers
- until they show enough agreement when coding the same material
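Agreement between two observers who coded the same material can be quantified. The notes do not say which statistic to use; Cohen's kappa, which corrects raw agreement for agreement expected by chance, is one common choice and is used here as an assumption. The behavior categories and codes are made up.

```python
from collections import Counter


def cohen_kappa(codes_a, codes_b):
    """Chance-corrected agreement between two observers' category codes."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("need two non-empty code lists of equal length")
    n = len(codes_a)
    # Proportion of segments on which both observers chose the same code.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected if both coded at random with their own base rates.
    count_a, count_b = Counter(codes_a), Counter(codes_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    if expected == 1:
        return 1.0  # both observers used one and the same category throughout
    return (observed - expected) / (1 - expected)


# Made-up codes for 10 video segments.
observer_1 = ["play", "play", "fight", "idle", "play",
              "fight", "idle", "play", "idle", "play"]
observer_2 = ["play", "play", "fight", "idle", "fight",
              "fight", "idle", "play", "play", "play"]
print(round(cohen_kappa(observer_1, observer_2), 2))  # 0.68
```

What counts as enough agreement is a choice the researcher has to make and justify; the notes give no threshold.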
Trace measurement
- assess behavior indirectly through physical trace evidence
- eg.: counting the number of used tissues after a therapy session to represent how depressed a client is.
Archival data
- a property can be represented with measurements that were already collected by others
- eg.: census data
Content analysis
- structured coding of elements in a text
- Computer software can code very complicated schemes.
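Structured coding of a text can be as simple as counting words that belong to predefined categories. The sketch below is a toy illustration: the coding scheme, its categories and its keywords are hypothetical, and real content analysis software handles far more complicated schemes.

```python
import re

# Hypothetical coding scheme: category name -> keywords that count for it.
CODING_SCHEME = {
    "positive": {"love", "happy", "great"},
    "negative": {"hate", "sad", "terrible"},
}


def code_text(text, scheme=CODING_SCHEME):
    """Count how many words in the text fall into each coding category."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        category: sum(1 for word in words if word in keywords)
        for category, keywords in scheme.items()
    }


print(code_text("I love my cat. I hate mornings, but cats make me happy."))
# {'positive': 2, 'negative': 1}
```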
Interviewing
- Structured
- questions/ question order/ response options are predetermined
- similar to survey
- hard to get unbiased answers to sensitive questions
- Unstructured/open
- a qualitative method
- procedure:
- start off with a general topic
- a set of points to be addressed
- but the interview is not limited to these points
- questions are open ended
- disadvantages:
- the conversation can lead anywhere
- differ per respondent –> aggregation more difficult
- other qualitative methods
- case study
- focus groups
- oral histories
- participatory observation
- …
10. Interview
- 3 most important things:
- more theory about the constructs.
- more up-to-date norm studies, so that test scores can be interpreted meaningfully.
- scores are changing.
- people are complicated.