[1] Introduction to Survey Sampling Frame
and Survey Error
(1) How to collect data?
Survey
Questionnaire
Interview
Social listening
(2) Census VS Sample Survey
1. Census: Obtain data from the entire population
2. Survey: Gather information from a sample
(3) Parameter VS Statistic
Numerical measure
1. Parameter: Measure calculated for a population
2. Statistic: Measure calculated for a sample
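As a minimal sketch of the parameter/statistic distinction, assume a small made-up population of household sizes (the values, the seed, and the variable names are illustrative and not from the course material). The parameter uses all N values; the statistic uses only a random sample of n of them.

```python
import random
from statistics import mean

# Hypothetical population: household sizes for N = 10 households.
population = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6]

# Parameter: a measure calculated for the whole population.
population_mean = mean(population)

# Statistic: the same measure calculated for a sample of n = 4.
rng = random.Random(0)              # fixed seed so the draw is repeatable
sample = rng.sample(population, 4)  # draw without replacement
sample_mean = mean(sample)

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```

The parameter is a fixed number, while the statistic changes from one sample to the next.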
(4) What is a survey?
- Data gathered mainly by asking people questions
- Typically collected from a sample
- Collected by
1. Interviewer-administered method
2. Self-administered method
(5) What do you need to know to design a survey?
- The target population
- The sampling frame
- The sample design
- The mode of data collection
- Ongoing survey or one-time survey
Terminology
1. Target population: Set of things of interest, defined in space and time
2. Sampling frame: List and procedure used to identify an element in the target population
3. Sampling unit or element: Fundamental unit in the population; the unit can differ, e.g. a person or a household
(6) Two approaches to survey sampling
1. Probability samples
- A known probability of selection
- Selection by randomness or chance
Example
- Simple random sampling
- Cluster sampling
- Stratified random sampling
- Systematic sampling
2. Non-probability samples
- Unknown probability of selection
- Selection by judgment
Example
1. Snowball samples: Start with one person to find other people [respondent-driven sampling], e.g. homeless or HIV+ populations
Problems
- No probability mechanism for initial selection
- No guarantee that other characteristics will accurately reflect the population
2. Quota samples: The number of people with certain characteristics meets predetermined quotas, e.g. education or race
Problems
- No probability mechanism for initial selection
- No guarantee that other characteristics will accurately reflect the population
3. Convenience samples: People who are the easiest to get to participate, e.g. volunteers or a research opportunity
Problems
- No probability mechanism for initial selection
- Misses people with certain characteristics
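The four probability designs named in this section can be sketched as follows. This is only an illustration under stated assumptions: the frame of 20 numbered units, the two strata, the clusters of 4, and the seed are all hypothetical choices, not part of the notes.

```python
import random

rng = random.Random(42)             # fixed seed so the draws are repeatable
frame = list(range(1, 21))          # hypothetical frame: units numbered 1..20
n = 5

# Simple random sampling: every subset of size n is equally likely.
srs = rng.sample(frame, n)

# Systematic sampling: random start, then every k-th unit on the list.
k = len(frame) // n                 # sampling interval
start = rng.randrange(k)            # random start position in 0..k-1
systematic = frame[start::k]

# Stratified random sampling: an independent SRS within each stratum.
strata = {"A": frame[:10], "B": frame[10:]}
stratified = {name: rng.sample(units, 2) for name, units in strata.items()}

# Cluster sampling: select whole clusters at random, keep all their units.
clusters = [frame[i:i + 4] for i in range(0, len(frame), 4)]
chosen = rng.sample(clusters, 2)
cluster_sample = [unit for cluster in chosen for unit in cluster]
```

In every design the selection is made by a chance mechanism, so each unit's probability of selection is known.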
Notation
N: Number of elements in the population
n: Number of elements in the sample
Y: Population value for the survey variable of interest
y: Sample value for the survey variable of interest
i: Counter for units in the sample or population
Sampling Fraction
f = n/N
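A short worked example of the sampling fraction, using made-up values for N and n:

```python
N = 2000    # hypothetical number of elements in the population
n = 100     # hypothetical number of elements in the sample

f = n / N                   # sampling fraction
elements_per_draw = N / n   # population elements represented by each sampled element

print(f)                    # 0.05
```

With these assumed values, 5% of the population is sampled, and each sampled element stands for 20 population elements.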
(7) Target VS Frame population
1. Frame: Set of materials used to designate a sample of units
2. Target population: Set of households, people or businesses of interest for inference
(8) Mapping
Ideal
[0.] Perfect mapping: one-to-one mapping
Reality
[i.] One-to-zero mapping: an element in the target population is not on the sampling frame
[ii.] Zero-to-one mapping: the sampling frame contains an element that does not exist in the target population
[iii.] One-to-many mapping: a target population element appears more than once on the sampling frame
[iv.] Many-to-one mapping: more than one target population element is represented by one sampling frame element
[v.] Many-to-many mapping: many-to-many matching
[i.] Problems
- Zero chance of selection
Solution
- Supplemental frames
[ii.] Problems
- Ineligible units, foreign units, or blanks
- Screening is needed to determine this
Solution
- Reject blanks (this gives a smaller sample size)
- Adjust the sampling rate to compensate: m = n / (1 - pb)
pb: estimated proportion of blank frame elements
n: desired final sample size of population elements
m: the number of frame elements to select
[iii.] Problems
- Duplicates: Non-EPSEM
Solution
- Remove from list before selection
- Unequal probability of selection weights
EPSEM
- Equal Probability Selection Method
- f = n/N
Non-EPSEM
- d(i): Number of duplicates for a given selected element
- 1/d(i): Weight applied to the observation for that selection, because an element listed d(i) times has d(i) times the chance of selection
Weighting illustration
1. Unweighted mean
2. Weighted mean
[iv.] Problems
- Clustering
Solution
- Take all elements within selected clusters
[v.] Problems
- More than one population element can be selected by more than one frame element
Solution
- Weighting + subsampling
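Two of the adjustments above lend themselves to a small sketch: inflating the draw to allow for blanks, and weighting by 1/d(i) for duplicates. The function names, the rounding up with `math.ceil`, and all numeric values are assumptions made for illustration only.

```python
import math

def frame_elements_to_select(n, p_blank):
    """Number of frame elements m so that about n remain after blanks are rejected."""
    if not 0 <= p_blank < 1:
        raise ValueError("p_blank must be in [0, 1)")
    return math.ceil(n / (1 - p_blank))

def weighted_mean(values, duplicates):
    """Mean with weight 1/d(i), where d(i) is how often element i is listed."""
    weights = [1 / d for d in duplicates]
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)

# Blanks: want n = 300 eligible elements, expect 25% of the frame to be blank.
m = frame_elements_to_select(n=300, p_blank=0.25)

# Duplicates: the second sampled element appears twice on the frame.
y = [10, 40, 30]                    # hypothetical sample values
d = [1, 2, 1]                       # number of frame listings per element
unweighted = sum(y) / len(y)        # treats the sample as EPSEM
weighted = weighted_mean(y, d)      # down-weights the duplicated element

print(m, unweighted, weighted)
```

With these assumed inputs, the duplicated element pulls the unweighted mean upward, and the 1/d(i) weights correct for its higher chance of selection.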
(9) Survey Error
Error
1. Bias: Systematic error; errors that tend to agree
2. Variance: Variable error; errors that tend to disagree
MSE = (bias)^2 + variance
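The decomposition MSE = (bias)^2 + variance can be checked numerically. The sketch below assumes a known true value and a handful of made-up estimates from repeated surveys; both are hypothetical and chosen so the arithmetic is easy to follow.

```python
from statistics import mean

true_value = 50.0
# Hypothetical estimates from repeated surveys of the same population.
estimates = [52.0, 54.0, 53.0, 55.0, 51.0]

expected = mean(estimates)                               # average estimate
bias = expected - true_value                             # systematic error
variance = mean((e - expected) ** 2 for e in estimates)  # variable error
mse = mean((e - true_value) ** 2 for e in estimates)     # total error

print(bias, variance, mse)
```

Here every estimate lies above the true value (the errors agree, giving bias), while the estimates also scatter around their own average (the errors disagree, giving variance).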
(10) Types of Error in Survey
1. Coverage error
Happens when the sampling frame does not match the target population
- Ineligible unit: the sampling frame contains individuals who do not belong to the target population [overcoverage]
- Noncoverage unit: the sampling frame does not contain all of the members of the target population
- {Coverage bias}
2. Sampling error
Occurs because a sample is selected instead of the entire population
- Sampling bias
- Sampling variance
3. Nonresponse error
Occurs when an estimate calculated using only respondents differs from that calculated using the entire sample data
- {Nonresponse bias}
- Unit nonresponse: occurs when the sample member does not provide any information
1. Noncontact
2. Refusal
3. Inability to participate
- Item nonresponse: occurs when a respondent fails to answer a survey question
Example
- Sensitive question
- Inadequate answer
4. Measurement error
Occurs when the answer from the survey respondent does not match the true answer to the survey question
- Respondent: either deliberately or unintentionally provides incorrect information
- Interviewer: records responses incorrectly or unintentionally affects answers
- Questionnaire: ambiguous or confusing questions
- Mode of data collection: self-administered modes yield higher reported levels of sensitive behaviors than interviewer-administered modes
Note
1. Socially desirable question: Overreport (+)
2. Socially undesirable question: Underreport (-)
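The definition of nonresponse error given in this section (respondent-only estimate versus full-sample estimate) can be illustrated with a toy sample. In a real survey the nonrespondents' values are unknown; here they are made up so the difference can be computed directly.

```python
# Hypothetical sample of 6 members: (value of y, responded?).
sample = [(10, True), (12, True), (11, True),
          (20, False), (22, False), (15, True)]

# Estimate using the entire sample (only possible in this toy setting).
full_mean = sum(y for y, _ in sample) / len(sample)

# Estimate using only the respondents.
respondents = [y for y, responded in sample if responded]
respondent_mean = sum(respondents) / len(respondents)

nonresponse_error = respondent_mean - full_mean
response_rate = len(respondents) / len(sample)

print(full_mean, respondent_mean, nonresponse_error)
```

Because the assumed nonrespondents have larger values than the respondents, the respondent-only mean understates the full-sample mean.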
[2] Simple Random Sampling
and Systematic Sampling