Chapter 20: Data Preparation and Analysis of Data
Purpose:
To obtain meaning from the collected data
A practice in which raw data is ordered and organized so that useful information can be extracted from it
Data Preparation
Checking or logging the data in; checking the data for accuracy; entering the data into the computer; transforming the data; and developing and documenting a database structure that integrates the various measures
Logging the data
Data comes from different sources at different times, such as:
a) mail survey returns
b) coded interview data
c) pre-test or post-test data
d) observational data
a. You need to set up a procedure for logging the information and keeping track of it until you are ready to do a comprehensive data analysis.
b. You will want to set up a database that enables you to assess at any time what data is already in and what is still outstanding.
c. You can do this with any standard computerised database program (e.g., Microsoft Access) or standard statistical program (e.g., Statistical Package for the Social Sciences (SPSS), Minitab).
d. A database for logging incoming data is a critical component of good research record-keeping.
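A logging database of this kind can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table name, column names, respondent IDs, and dates are all made up, and a real project might just as well use Access or SPSS as noted above.

```python
import sqlite3

# Hypothetical schema: one row per expected data item.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data_log ("
    " respondent_id TEXT,"
    " source TEXT,"
    " received_on TEXT,"  # stays NULL while the item is still outstanding
    " PRIMARY KEY (respondent_id, source))"
)

# Made-up example rows: what is expected and what has arrived so far.
expected = [
    ("R001", "mail survey", "2024-03-01"),
    ("R001", "post-test", None),
    ("R002", "mail survey", None),
    ("R002", "interview", "2024-03-04"),
]
conn.executemany("INSERT INTO data_log VALUES (?, ?, ?)", expected)


def outstanding(connection):
    """Return (respondent_id, source) pairs that have not been logged in yet."""
    rows = connection.execute(
        "SELECT respondent_id, source FROM data_log "
        "WHERE received_on IS NULL ORDER BY respondent_id, source"
    )
    return rows.fetchall()


def log_in(connection, respondent_id, source, received_on):
    """Mark one expected item as received."""
    connection.execute(
        "UPDATE data_log SET received_on = ? "
        "WHERE respondent_id = ? AND source = ?",
        (received_on, respondent_id, source),
    )


print(outstanding(conn))   # two items outstanding
log_in(conn, "R002", "mail survey", "2024-03-09")
print(outstanding(conn))   # one item outstanding
```

Calling outstanding() at any time answers the question in point b: what is already in and what is still missing.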
Checking Data for Accuracy
Questions you should ask as part of the initial data screening:
a) Are the responses legible/readable?
b) Are all important questions answered?
c) Are the responses complete?
d) Is all relevant contextual information included (e.g., date, time)?
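Questions b) to d) can be partly automated once responses are in a structured form; legibility still needs a human reader. The sketch below assumes each response is a Python dict, and the field names (q1, q2, q3, date, time) are hypothetical.

```python
# Hypothetical field names; adapt them to the actual instrument.
REQUIRED_ANSWERS = ("q1", "q2", "q3")
REQUIRED_CONTEXT = ("date", "time")


def screen(record):
    """Return a list of problems found in one logged record (a dict)."""
    problems = []
    for field in REQUIRED_ANSWERS:
        if record.get(field) in (None, ""):
            problems.append(f"unanswered: {field}")
    for field in REQUIRED_CONTEXT:
        if record.get(field) in (None, ""):
            problems.append(f"missing context: {field}")
    return problems


# Made-up record: q2 is blank and the time was not recorded.
record = {"q1": 4, "q2": "", "q3": 2, "date": "2024-03-01"}
print(screen(record))  # ['unanswered: q2', 'missing context: time']
```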
Developing a Database Structure: the manner in which you intend to store the data for the study so that it can be accessed in subsequent data analyses
Two options for storing data on a computer:
Database programs: more complex to learn, but more flexible for manipulating data
Statistical programs
In every research project, you should generate a printed codebook that includes the following items for each variable:
a. variable name
b. variable description
c. variable format (number, date, text)
d. instrument/method of collection
e. date collected
f. respondent or group
g. variable location (in database)
h. notes
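One way to keep a codebook consistent is to store each entry as a small record with exactly the eight items listed above. The class and field names below are illustrative, and the example values are invented.

```python
from dataclasses import dataclass, asdict


@dataclass
class CodebookEntry:
    """One codebook row; the fields mirror items a. to h. above."""
    name: str
    description: str
    fmt: str                    # number, date, or text
    instrument: str             # instrument/method of collection
    date_collected: str
    respondent_or_group: str
    location: str               # where the variable lives in the database
    notes: str = ""


# Invented example entry.
entry = CodebookEntry(
    name="se_03",
    description="Self-esteem scale, item 3 (reverse-worded)",
    fmt="number",
    instrument="mail survey",
    date_collected="2024-03-01",
    respondent_or_group="treatment group",
    location="survey table, column 7",
    notes="1-5 rating; reverse before totalling",
)

# Print the entry in a layout suitable for a printed codebook.
for field, value in asdict(entry).items():
    print(f"{field:22}{value}")
```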
Entering data into the computer
a. type data directly
b. use a procedure called double entry: enter the data once, then enter it a second time using a special program that checks the second entry against the first. This reduces entry errors, but such programs are not widely available and require training
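The core of a double-entry check is a field-by-field comparison of the two passes. The following is a minimal sketch, not a description of any particular double-entry program; it assumes both passes are stored as dicts keyed by a record ID, and the data is made up.

```python
def compare_entries(first, second):
    """Return (record_id, field, first_value, second_value) for every mismatch."""
    mismatches = []
    for record_id in sorted(first.keys() | second.keys()):
        a = first.get(record_id, {})
        b = second.get(record_id, {})
        for field in sorted(a.keys() | b.keys()):
            if a.get(field) != b.get(field):
                mismatches.append((record_id, field, a.get(field), b.get(field)))
    return mismatches


# Made-up data: the second pass transposed the digits of one age.
first_pass = {"R001": {"age": 34, "q1": 4}, "R002": {"age": 51, "q1": 2}}
second_pass = {"R001": {"age": 34, "q1": 4}, "R002": {"age": 15, "q1": 2}}

print(compare_entries(first_pass, second_pass))  # [('R002', 'age', 51, 15)]
```

Each reported mismatch is then resolved by going back to the original paper record.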
Data transformation
a. It is almost always necessary to transform the raw data into variables that are usable in the analyses
b. Common transformations that you might perform:
i) missing values
-many analysis programs automatically treat blank values as missing
-in others, you need to designate specific values to represent missing values
ii) item reversals
-some scale items are worded in the reverse direction to help reduce the possibility of a response set
-when you analyse data, you want all scores for scale items to be in the same direction where high scores mean the same thing and low scores mean the same thing
-you have to reverse the ratings for some of the scale items with the formula: New Value = (High Value + 1) - Original Value
iii) scale totals
-once you have transformed any individual scale items you will often want to add or average across individual items to get a total score for the scale
iv) categories
-for many variables you will want to collapse them into categories
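The four transformations above can be sketched together. The missing-value code (-9), the 1-5 rating scale, the set of reversed items, and the category cut-offs are all assumptions made for the example.

```python
from statistics import mean

MISSING_CODE = -9          # assumed code designating "no answer"
SCALE_HIGH = 5             # assumed 1-5 rating scale
REVERSED_ITEMS = {"q2"}    # assumed reverse-worded items


def clean(value):
    """i) Missing values: map the designated code to None."""
    return None if value == MISSING_CODE else value


def reverse(value):
    """ii) Item reversal: New Value = (High Value + 1) - Original Value."""
    return (SCALE_HIGH + 1) - value


def scale_score(responses):
    """iii) Scale total: average the recoded items, skipping missing ones."""
    scored = []
    for item, raw in responses.items():
        value = clean(raw)
        if value is None:
            continue
        scored.append(reverse(value) if item in REVERSED_ITEMS else value)
    return mean(scored) if scored else None


def categorise(score):
    """iv) Categories: collapse the score into three illustrative bands."""
    if score is None:
        return "missing"
    if score < 2.5:
        return "low"
    if score < 3.5:
        return "medium"
    return "high"


# Made-up respondent: q2 is reverse-worded, q3 was left blank.
responses = {"q1": 4, "q2": 2, "q3": -9, "q4": 4}
score = scale_score(responses)
print(score, categorise(score))
```

On a 1-5 scale the reversal maps 1 to 5, 2 to 4, and so on, so that a high score means the same thing for every item before the items are averaged.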
The Process of Data Analysis
Data cleaning
a. Data are inspected, and erroneous data are corrected if necessary, preferable, and possible.
b. Done during the stage of data entry.
c. The guiding principle is that information should always be cumulatively retrievable: it should always be possible to undo any alteration to the data set.
d. All information should be saved when altering variables, and the alterations documented, for instance in a syntax file or a log.
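Principles c and d can be illustrated with a correction routine that never overwrites the original value and records every change. This is a sketch under assumed conventions: the "_orig" suffix and the log layout are choices made for the example, not an established standard.

```python
def correct_value(record, field, new_value, reason, log):
    """Correct one field, keeping the original value and logging the change."""
    old_value = record[field]
    # Keep only the first original, so repeated corrections stay undoable.
    record.setdefault(f"{field}_orig", old_value)
    record[field] = new_value
    log.append({"id": record["id"], "field": field,
                "old": old_value, "new": new_value, "reason": reason})


def undo_all(record):
    """Restore every field that has a saved original value."""
    for key in [k for k in record if k.endswith("_orig")]:
        record[key[: -len("_orig")]] = record.pop(key)


# Made-up record with an impossible age caused by a typing slip.
record = {"id": "R002", "age": 510}
log = []
correct_value(record, "age", 51, "extra digit typed at entry", log)
print(record)   # corrected value plus the preserved original
undo_all(record)
print(record)   # back to the data as originally entered
```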
Initial data analysis
a. The most important distinction between the initial data analysis phase and the main analysis phase is that during initial data analysis one refrains from any analyses that are aimed at answering the original research question.
b. The initial data analysis phase is guided by the following activities:
Quality of data
i. Should be checked as early as possible.
ii. It can be assessed using different types of analyses: frequency counts, descriptive statistics, normality, associations.
iii. Other initial data quality checks: data cleaning, analysis of missing observations, analysis of extreme observations, and comparison and correction of differences in coding schemes.
Quality of measurements
i. Should only be checked during the initial data analysis phase when it is not the focus or research question of the study.
ii. Two ways to assess measurement quality:
1. Analysis of homogeneity (internal consistency)
2. Confirmatory factor analysis
iii. During the analysis of homogeneity, one inspects the variances of the items and the scales, the Cronbach's alpha of the scales, and the change in Cronbach's alpha when an item is deleted from a scale.
iv. After assessing the quality of the data and of the measurements, one might decide to impute missing data or to perform initial transformations. Possible transformations of variables are:
a. Square root transformation (if distribution differs moderately from normal).
b. Log-transformation (if the distribution differs substantially from normal).
c. Inverse transformation (if the distribution differs severely from normal).
d. Make categorical (ordinal/dichotomous) (if the distribution differs severely from normal, and no transformations help).
v. One should check the success of the randomisation procedure, for instance by checking whether background and substantive variables are equally distributed within and across groups. If the study did not need and/or use a randomisation procedure, one should check the success of the non-random sampling, for instance by checking whether all subgroups of the population of interest are represented in the sample. Other possible data distortions that should be checked are:
a. Dropout (this should be identified during the initial data analysis phase)
b. Item nonresponse (whether this is random or not should be assessed during initial data analysis phase)
c. Treatment quality (using manipulation checks)
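The Cronbach's alpha check mentioned under quality of measurements can be sketched with the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The item scores below are made up, and the function names are illustrative; statistical packages report the same quantities.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: list of per-item score lists, one score per respondent."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # scale total per respondent
    item_var_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))


def alpha_if_deleted(items):
    """Alpha of the scale recomputed with each item left out in turn."""
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]


# Made-up scores for three items answered by five respondents.
items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [2, 4, 1, 5, 3],   # runs against the other two items
]

print(round(cronbach_alpha(items), 3))
print([round(a, 3) for a in alpha_if_deleted(items)])
```

An item whose removal raises alpha noticeably is a candidate for reversal or removal from the scale.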
Characteristics of data sample
Can be assessed by looking at :
a) Basic characteristics of important variables
b) Scatter plots
c) Correlations
d) Cross-tabulations
Analyses that can be used during the initial data analysis phase:
a. Univariate statistics
b. Bivariate associations (correlations)
c. Graphical techniques (scatter plots)
It is important to take the measurement levels of the variables into account for the analyses, as special techniques are available for each level:
Nominal and ordinal variables: frequency counts (numbers and percentages), associations (cross-tabulations), hierarchical log-linear analysis, log-linear analysis, exact tests or bootstrapping, computation of new variables.
Continuous variables: distribution statistics (M, SD, variance, skewness, kurtosis), stem-and-leaf displays, box plots.
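The simplest of these techniques can be sketched with the standard library: frequency counts for a nominal or ordinal variable, and distribution statistics for a continuous one. The sketch uses population moments for skewness and kurtosis, which is an assumption; statistical packages often apply sample corrections, so their figures may differ slightly. The data is made up.

```python
from collections import Counter
from statistics import mean, pstdev, pvariance


def frequencies(values):
    """Counts and percentages for a nominal or ordinal variable."""
    counts = Counter(values)
    n = len(values)
    return {category: (count, 100 * count / n)
            for category, count in counts.items()}


def describe(values):
    """M, SD, variance, skewness and excess kurtosis (population moments)."""
    m = mean(values)
    sd = pstdev(values)
    n = len(values)
    skew = sum((x - m) ** 3 for x in values) / (n * sd ** 3)
    kurt = sum((x - m) ** 4 for x in values) / (n * sd ** 4) - 3
    return {"mean": m, "sd": sd, "variance": pvariance(values),
            "skewness": skew, "kurtosis": kurt}


# Made-up data.
print(frequencies(["agree", "neutral", "agree", "agree"]))
print(describe([1, 2, 3, 4, 5]))   # symmetric data, so skewness is 0
```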
Main data analysis
Analyses aimed at answering the research question are performed, as well as any other relevant analyses needed to write the first draft of the research report
Two approaches can be adopted:
Exploratory
No clear hypothesis is stated before analysing the data, and the data is searched for models that describe it well
Confirmatory
Clear hypotheses about the data are tested