Unit 5: Organization of Data
5.1 DATA CONCEPTS AND GRAPHICAL SUMMARIES
Numerical Data (quantitative data)
Data in the form of any number
Numerical data is either continuous or discrete
Continuous Data: can take any value within a range; when graphed (as a histogram), the bars touch
Discrete Data: can take only specific values; in a bar graph, the bars do not touch
Categorical Data (qualitative data)
Data that can be sorted into distinct groups or categories
There are two main types of Categorical Data (ordinal and nominal)
Ordinal Data: qualitative data that can be ranked
Ex. poor, fair, good, very good
Nominal Data: qualitative data that cannot be ranked
Ex. blue eyes, green eyes, brown eyes
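The difference between ordinal and nominal data can be sketched in a few lines of Python. This is only an illustration: the names RATING_ORDER, ratings, and eye_colours are made up for the example, and the values come from the examples above. Ordinal values can be sorted by rank, while nominal values can only be counted per category.

```python
from collections import Counter

# Ordinal data: the categories have a meaningful rank order.
# RATING_ORDER is a hypothetical name used only for this sketch.
RATING_ORDER = ["poor", "fair", "good", "very good"]
ratings = ["good", "poor", "very good", "fair", "good"]

# Sort each rating by its position in the ranking.
ranked = sorted(ratings, key=RATING_ORDER.index)

# Nominal data: the categories have no rank, so we can only count them.
eye_colours = ["blue", "brown", "green", "brown", "blue", "brown"]
colour_counts = Counter(eye_colours)

print(ranked)
print(colour_counts)
```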
5.2 PRINCIPLES OF DATA COLLECTION
Population: all the individuals in a group that is being studied
Sample: a group of items or people selected from the population
Sample Variability
shows how samples differ from each other
the more similar the samples are to each other, the lower the variability and the more accurately the samples represent the population
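One possible way to see variability between samples is to draw many samples from the same population and compare their means. The sketch below assumes a made-up population of the numbers 1 to 1000 and uses the standard deviation of the sample means as the measure of variability. The function name sample_means, the sample sizes, and the comparison of small versus large samples are assumptions added for illustration and are not part of the notes.

```python
import random
import statistics

def sample_means(population, sample_size, num_samples, seed=0):
    """Draw num_samples simple random samples and return the mean of each."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]

# Hypothetical population: the whole numbers from 1 to 1000.
population = list(range(1, 1001))

small_samples = sample_means(population, sample_size=5, num_samples=200)
large_samples = sample_means(population, sample_size=100, num_samples=200)

# The spread of the sample means is one measure of variability between samples.
spread_small = statistics.stdev(small_samples)
spread_large = statistics.stdev(large_samples)

print(f"spread of means, samples of 5:   {spread_small:.1f}")
print(f"spread of means, samples of 100: {spread_large:.1f}")
```

In this sketch the larger samples resemble each other more closely, so their means vary less and sit closer to the population mean.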
Types of Sample
Simple Random: Randomly choose a specific number of people
Ex. Put the names of everyone in the population into a hat and draw one or several names
Systematic: Put the population in an ordered list and choose people at regular intervals
Ex. Order all patients of a doctor in some way and choose one randomly. Select the rest of the sample at regular intervals from that starting point.
Stratified: Divide the population into groups, then sample from each group in the same proportion as that group has in the population
Ex. Survey factory employees about new safety initiatives. There are 1000 employees in the factory, 633 are women and 367 are men. Randomly select 63 women and 37 men to take the survey.
Cluster: Divide the population into groups, randomly choose a number of the groups, and sample every member of the chosen groups.
Ex. Survey Little League Canada baseball players. Randomly select five districts in each province and give the survey to each player in those districts.
Multistage: Divide the population into a hierarchy and choose a random sample at each level
Ex. Conduct an employee wellness survey by randomly selecting 10 stores. Randomly select three departments in each store, and randomly select 10 employees in each of those departments.
Convenience: Choose individuals from the population who are easy to access. It is often very inexpensive to conduct but can yield unreliable results, since it inadvertently omits large portions of the population.
Ex. To get the public's input on a new pet bylaw, a local politician goes to a local park and asks people their opinion.
Voluntary: Allow participants to choose whether or not to participate. Often the only people who respond are either heavily in favour of or heavily against what the survey is about.
Ex. Conduct an online poll asking people whether banning junk food in schools will fight obesity.
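The first four sampling types above can be sketched in Python as follows. This is a minimal illustration, not a standard implementation: the function names, the employee ID labels, the district names, and the fixed seed are all assumptions. The stratified example reuses the factory numbers above (633 women, 367 men, sample of 100).

```python
import random

def simple_random_sample(population, k, rng):
    """Simple random: every member has the same chance of being chosen."""
    return rng.sample(population, k)

def systematic_sample(population, k, rng):
    """Systematic: random starting point, then every step-th member."""
    step = len(population) // k
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(k)]

def stratified_sample(strata, k, rng):
    """Stratified: sample each group in proportion to its share of the population.
    Note: rounding the quotas may not always add up to exactly k."""
    total = sum(len(members) for members in strata.values())
    return {
        name: rng.sample(members, round(k * len(members) / total))
        for name, members in strata.items()
    }

def cluster_sample(clusters, num_clusters, rng):
    """Cluster: randomly choose whole groups and take every member of them."""
    chosen = rng.sample(sorted(clusters), num_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical employee IDs matching the factory example.
employees = {
    "women": [f"W{i}" for i in range(633)],
    "men": [f"M{i}" for i in range(367)],
}
everyone = employees["women"] + employees["men"]

# Hypothetical districts with 12 players each.
districts = {f"district{d}": [f"d{d}-player{p}" for p in range(12)] for d in range(8)}

srs = simple_random_sample(everyone, 100, rng)
systematic = systematic_sample(everyone, 100, rng)
stratified = stratified_sample(employees, 100, rng)
cluster = cluster_sample(districts, 2, rng)

print(len(stratified["women"]), len(stratified["men"]))
```

Multistage sampling could be built by applying these steps in sequence, for example choosing clusters first and then drawing a simple random sample inside each chosen cluster.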
5.3 COLLECTING DATA
Treatment Group: the participants in an experiment who receive the specific treatment being measured.
Control Group: the participants in an experiment who do not receive the specific treatment being measured
Observational Study: the researcher records behaviours and tries to draw conclusions based on the observations
5.4 INTERPRETING AND ANALYZING DATA
Primary Source Data: data that has been collected directly by the researcher and has not been manipulated or summarized
Secondary Source Data: data used by someone other than those who actually collected it
Micro-data: an individual set of data about a single respondent.
Aggregate Data: data that is combined or summarized in such a way that the individual micro-data can no longer be determined
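The relationship between micro-data and aggregate data can be illustrated with a short Python sketch. The respondent records and field names below are invented for the example. Each dictionary is one respondent's micro-data, and the summary at the end is aggregate data, from which the individual records can no longer be recovered.

```python
from statistics import mean

# Micro-data: one record per respondent (all values are made up).
microdata = [
    {"respondent": 1, "age": 34, "hours_online": 2.5},
    {"respondent": 2, "age": 41, "hours_online": 1.0},
    {"respondent": 3, "age": 29, "hours_online": 4.0},
    {"respondent": 4, "age": 52, "hours_online": 0.5},
]

# Aggregate data: only summaries remain, not the individual answers.
aggregate = {
    "respondents": len(microdata),
    "mean_age": mean(record["age"] for record in microdata),
    "mean_hours_online": mean(record["hours_online"] for record in microdata),
}

print(aggregate)
```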
5.5 BIAS
Types of Bias
Response Bias: when respondents change their answers to influence the result, to avoid embarrassment, or to give the answer they think the questioner wants
Sampling Bias: when the sample does not closely represent the population
Measurement Bias: when the collection method is such that certain characteristics are consistently over- or under-represented
Non-Response Bias: when the opinions of respondents differ in meaningful ways from those of non-respondents
Bias
occurs when there is a prejudice for or against an idea or response
biased samples can result from problems with either the sampling technique or the data collection method