Week 6.2 - Data Analysis
Levels of Measurement: (want to understand what kind of data you have and have an understanding of what you are collecting)
- Nominal-scale (lowest form of measurement): data organized into categories of a defined property that cannot be ordered; categories must be mutually exclusive and exhaustive
- Ordinal-scale: categories that can be ranked, but the intervals between ranks are not necessarily equal
- Interval scale: follow rules of mutually exclusive categories, exhaustive categories, and rank ordering and are assumed to represent a continuum (equal numerical distances between intervals). No absolute zero
- Ratio-scale (highest form of measurement): have all of the above, plus absolute zero (weight, length and volume)
Managing the Data:
- what are the steps in quantitative data management? ideas?
- very precise
- Have we lost any of the data? How do you handle it when the person does not answer a question?
What are the Descriptive Statistics?
- description and/or summarization of sample data
- allow researchers to arrange data visually to display meaning and to help in understanding the sample characteristics and variables under study
- in some studies, descriptive statistics may be the only results sought from statistical analysis
- looking at means, summary of the variables
- when would the median be the preferred descriptive statistic for central tendency?
Purpose of Descriptive Statistics:
- reduce data to manageable proportions by summarizing them:
- measure of central tendency: summarizes the middle of the group; each measure has specific uses and is most appropriate to select types of distribution and measurement; an "average" (mode = most frequent score // median = middle score // mean = average score)
- scatter plot
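The three measures of central tendency above can be sketched with Python's standard `statistics` module. The sample values and variable names are hypothetical, chosen only to show how one extreme score moves the mean while leaving the median and mode alone. This is one common answer to when the median is preferred: when the data are skewed or contain outliers.

```python
import statistics

# Hypothetical sample: length of hospital stay in days, with one extreme value (18).
stays = [2, 3, 3, 4, 5, 7, 18]

mean_stay = statistics.mean(stays)      # average score; pulled upward by the 18
median_stay = statistics.median(stays)  # middle score once the values are sorted
mode_stay = statistics.mode(stays)      # most frequent score

print("mean:", mean_stay, "median:", median_stay, "mode:", mode_stay)
```

Here the mean sits above five of the seven scores, so the median gives a more typical picture of the group.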
Frequency Distribution:
- common basic way to organize data
- summarizes the occurrences of events under study; tallies the frequency of events
- cohort groups are sometimes created to investigate the frequencies of certain data
- normal distribution: a theoretical concept observing that interval or ratio data group themselves about a midpoint in a distribution closely approximating the normal curve
- skewness: not all data follow a normal curve
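- Positive skew: most scores cluster at the low end with a long tail of high values, so the mean is pulled above the median (ex. world income)
- Negative skew: most scores cluster at the high end with a long tail of low values, so the mean is pulled below the median (ex. age at death)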
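A minimal sketch of how skew shows up in numbers, assuming the moment-based skewness coefficient (third central moment divided by the cube of the population SD). The notes do not name a formula, so this choice is an assumption. Both data sets are made up for illustration.

```python
import statistics

def skewness(values):
    """Moment-based skewness: third central moment / (population SD cubed)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment (variance)
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical incomes (thousands): most are low, one is very high -> positive skew.
incomes = [20, 22, 25, 27, 30, 35, 200]
# Hypothetical ages at death: most are high, one is very low -> negative skew.
ages_at_death = [2, 60, 70, 75, 78, 80, 85]

for label, data in (("incomes", incomes), ("ages at death", ages_at_death)):
    print(label,
          "skew:", round(skewness(data), 2),
          "mean:", round(statistics.fmean(data), 1),
          "median:", statistics.median(data))
```

A positive coefficient goes with a mean above the median, and a negative coefficient with a mean below it.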
Decision Path: Descriptive Statistics
- inferential statistics: used to determine the probability (or likelihood) that a conclusion based on the analysis of data from a sample is true
- use the calculated sample statistic (the mean or SD calculated from the sample) to estimate the population parameter
- combines mathematical processes and logic; allows researchers to test hypotheses about a population using data obtained from probability and nonprobability samples
- make inferences about the data that we have; looking at the probability and basing your conclusion on the data you have
- probability deals with the relative likelihood that a certain event will or will not occur, relative to some other events
- to describe the data that you have
Parametric VS. Nonparametric:
- Parametric - more powerful and more flexible than nonparametric; used with interval and ratio
- attributes: estimation of a population parameter; interval/ratio measurement scale; normal distribution (if it does not follow a normal distribution then do nonparametric)
- tests for independent group: t-test; ANOVA (more than 2 groups); ANCOVA; MANOVA
- tests for non-independence: paired t-test (groups are somehow related); repeated measures ANOVA (within-subjects ANOVA)
- testing relationships: Pearson's r (two variables measured on at least interval scale)
- Nonparametric - not based on the estimation of population parameters; used with nominal or ordinal variables
- tests for independent group: Mann-Whitney U; Median test (chi-square analysis); Kruskal-Wallis (more than 2 groups)
- differences in proportions: chi-squared test (categorical); Fisher's exact test (used when the sample is small and certain cells may have fewer than 5 or 6 data points)
- testing relationships: Spearman's rho (rs), Kendall's Tau; Phi Coefficient; Cramer's V (contingency tables bigger than 2x2)
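To make the parametric vs. nonparametric contrast concrete for relationship testing, here is a rough sketch that computes Pearson's r on the raw scores and Spearman's rho as Pearson's r on the ranks. The function names and the dose/response numbers are hypothetical. The example only computes the coefficients and does not run a significance test.

```python
import math

def pearson_r(x, y):
    """Pearson's r: covariation of x and y scaled by their spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the ranks instead of raw scores."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: response always rises with dose, but not in a straight line.
dose = [1, 2, 3, 4, 5]
response = [1, 4, 9, 16, 100]

print("Pearson r:", round(pearson_r(dose, response), 3))
print("Spearman rho:", round(spearman_rho(dose, response), 3))
```

Because Spearman's rho uses only rank order, it treats this perfectly monotonic relationship as perfect, while Pearson's r is lowered by the departure from a straight line.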
- Regression Analysis - looking at how the data regress to a line; how well predictor variables predict the outcome variable; Y = a + bX (R vs. R²); control of extraneous variables
- Studying complex relationships among more than two variables
- Multiple Regression: one dependent variable and multiple independent variables
- used to determine what variables contribute to change in the dependent variable and to what degree
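The line Y = a + bX can be illustrated with a single predictor. The sketch below assumes ordinary least squares as the fitting method (the notes do not name one) and uses invented data and variable names. It also shows the R vs. R² distinction: R is the correlation, and R² is the share of variation in Y accounted for by X.

```python
import math

# Hypothetical data: X = hours of patient teaching, Y = knowledge score.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)

b = sxy / sxx             # slope: change in Y for a one-unit change in X
a = mean_y - b * mean_x   # intercept: predicted Y when X is 0
r = sxy / math.sqrt(sxx * syy)  # correlation between X and Y
r_squared = r ** 2              # proportion of variance in Y explained by X

print(f"Y = {a:.2f} + {b:.2f}X, R = {r:.3f}, R^2 = {r_squared:.2f}")
```

Multiple regression extends the same idea to several X variables at once, which is what allows extraneous variables to be controlled statistically.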
- key to inferential statistics
- answers questions such as: How much of this effect is a result of chance? How strongly are these two variables associated with each other?
- null hypothesis and alternative hypothesis
- want the alternative hypothesis to shine through
- Scientific or alternative hypothesis (H1): is what the researcher believes the outcome will be, that the variables will interact in some way
- Null hypothesis (H0): is the hypothesis that can actually be tested by statistical methods; states that no difference exists between the groups under study
- Standard error of the Mean (SE m - little m)
- SD (standard deviation) reflects how close individual scores cluster around their mean, whereas the SE (standard error) shows how close mean scores from repeated samples will be to the true population mean
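The SD vs. SE distinction comes down to one division: the standard error of the mean is the sample SD divided by the square root of the sample size. A small sketch with made-up scores:

```python
import math
import statistics

# Hypothetical sample of five test scores.
scores = [72, 75, 78, 80, 85]

sd = statistics.stdev(scores)      # sample SD: spread of individual scores around the mean
se = sd / math.sqrt(len(scores))   # SE of the mean: expected spread of sample means

print(f"mean = {statistics.fmean(scores):.1f}, SD = {sd:.2f}, SE = {se:.2f}")
```

The SE is always smaller than the SD for samples of more than one score, and it shrinks as the sample gets larger.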
Level of Significance:
- probability of making type 1 error = 0.05
- researcher is willing to accept the fact that if the study was done 100 times, the decision to reject the null hypothesis would be wrong 5 times out of those 100 trials
- Type 1 Error: Rejection of the null hypothesis when it is actually true
- Type 2 Error: accepting the null hypothesis when it is false
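The "wrong 5 times out of 100" idea can be checked with a small simulation. This is only a sketch under stated assumptions: the null hypothesis is made true on purpose (every sample comes from a population with mean 0 and SD 1), a two-tailed z-test with known SD is used, and the sample size, number of trials, and seed are arbitrary choices.

```python
import math
import random

random.seed(0)       # fixed seed so the run is repeatable
CRITICAL_Z = 1.96    # two-tailed cutoff for a 0.05 level of significance
N = 30               # scores per simulated study
TRIALS = 2000        # number of simulated studies

false_rejections = 0
for _ in range(TRIALS):
    # H0 is true here: the population mean really is 0 (SD 1).
    sample = [random.gauss(0, 1) for _ in range(N)]
    sample_mean = sum(sample) / N
    z = sample_mean / (1 / math.sqrt(N))  # (mean - 0) / standard error
    if abs(z) > CRITICAL_Z:
        false_rejections += 1             # rejected a true null: type 1 error

type1_rate = false_rejections / TRIALS
print("proportion of type 1 errors:", type1_rate)
```

The proportion should land near 0.05, which is what choosing that level of significance means in practice.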
Clinical (or practical) VS. Statistical Significance:
- a statistically significant hypothesis = finding unlikely to have occurred by chance
- magnitude of significance is important to the outcome of data analysis
- Odds Ratio:
- used in harm studies to estimate if a subject has been harmed by being exposed to a particular event
- calculated by dividing the odds in the treatment or exposed group by the odds of the control group
- OR = odds that case was exposed/odds that a control was exposed
- Confidence Interval (CI): an estimated range of values that provides a measure of certainty about the sample findings
- the level most commonly reported in research is a 95% degree of certainty, meaning 95% of the time, the findings will fall within the range of values given as the CI
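A worked sketch of the odds ratio and a 95% CI around it. The 2x2 counts are invented, and the interval uses the common log-based approximation, SE(ln OR) = sqrt(1/a + 1/b + 1/c + 1/d), which is an assumption here since the notes do not say how the CI is calculated.

```python
import math

# Hypothetical 2x2 table (counts of people).
exposed_cases, unexposed_cases = 30, 70        # cases: had the outcome
exposed_controls, unexposed_controls = 15, 85  # controls: did not have the outcome

odds_case_exposed = exposed_cases / unexposed_cases           # odds that a case was exposed
odds_control_exposed = exposed_controls / unexposed_controls  # odds that a control was exposed
odds_ratio = odds_case_exposed / odds_control_exposed

# Approximate 95% CI, built on the log scale and converted back.
se_log_or = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                      + 1 / exposed_controls + 1 / unexposed_controls)
ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

print(f"OR = {odds_ratio:.2f}, 95% CI = {ci_low:.2f} to {ci_high:.2f}")
```

An OR above 1 suggests the exposure is associated with higher odds of the outcome. If the CI excludes 1, the association is unlikely to be due to chance alone, though whether its size matters clinically is a separate judgment.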
Overall Process of Qualitative Analysis:
- category scheme
- purist researchers say you will miss some things when you don't code the data yourself (ex. have a computer do it); by coding your own data, you are closer to it and the analysis is richer
- manual methods = cut/paste
- computer programs = ATLAS.ti, Ethnograph, NVivo
- ongoing process as data is collected (try to understand the data as you collect it)
- process of selecting, focusing, simplifying, abstracting, & transforming the data
- organized into meaningful clusters (themes or structured meaning units)
- thematic analysis: process of recognizing & recovering the emergent themes (be sure you are capturing what the person says and check with them; also acknowledge any interruptions)
- memos are kept to help organize data, write personal notes to self --> keeping track of everything
- data is coded—given a tag or label according to theme/category (ex. topic coding)
- codebook used to organize code into lists (hugely important to organize data) (for every code you create a definition for how you understand it)
- researcher immerses self in the data during this stage, often for weeks or months!
- put back into a meaningful whole
- if not based on a particular qualitative tradition, content analysis is typically performed: identify prominent themes and relationships between themes
- finding “forest vs. the trees”
- an organized, compressed assembly of information that permits conclusion drawing and action
- graphs, flow charts, matrixes, model
- a visual presentation of what you have found
Conclusion Drawing and Verification:
- challenge for the researcher is to stay open to new ideas, themes, and concepts as they appear
- conclusion drawing is the description of the relationship between the themes
- verification occurs as the data is collected
- leave biases outside; be open to whatever the participants say; let conclusion fall from the data and do not try to impose other frameworks or other data onto it
Generating Meaning: (what is the meaning of their words)
- note patterns, themes
- see plausibility
- make metaphors
- partition variables
- subsume particulars
- note relationships
- intervening variables
- chain of evidence
- conceptual coherence
Procedures for Various Traditions: (LOOK AT CHART ON SLIDE 17)
- numerous methods
- immersion in the data—read & reread
- extract significant statements
- determine relationship among themes
- describe phenomena & themes
- synthesize themes into a consistent description of phenomenon
- domain (cultural units); taxonomic; componential; and theme
- immerse in the data
- identify patterns & themes
- take cultural inventory
- interpret findings
- compare findings to the literature
- Grounded Theory
- core variable
- examine each line of data line by line
- divide data into discrete parts
- compare data for similarities/differences
- compare data with other data collected, continuously—constant comparative method
- cluster into categories
- develop categories
- determine relationships among categories
- Is the method of analysis clear?
- Is it appropriate for the study?
- Can you follow the analysis step by step?
- Is there evidence that the interpretation accurately reflected what was said?
- Are credibility, auditability, fittingness, and trustworthiness accounted for?
- make sure that you understand the method of analysis
- What descriptive statistics are reported?
- Are these appropriate to the level of measurement used?
- Are appropriate summary statistics provided for each major variable?
- Do the statistics used match the problem, hypothesis, method, sample, and level of measurement?
- Does the hypothesis reflect if differences or relationships are being tested?
- Is the level of significance indicated?
- Does the measurement level permit parametric testing?
- Is the sample size large enough for parametric testing?
- Is there enough information given to assess appropriateness of parametric use?
- Do tables and graphs enhance text?
- Are the results understandable?
Take Home Message:
- Science and research prove nothing in isolation—research evidence only provides support for a theory
- One study’s findings are rarely sufficient to support a major practice change