Random Sampling and Sampling Errors
Convenience Sample
Example- If I were to sample kids in my stats class for an upcoming project of mine, they would be in the same room as me, which is about as convenient as it gets. However, kids in other stats classes might have other opinions on the matter, which means the sample isn't perfect, but it is convenient.
Definition- A convenience sample is a sample/survey of people who are close by and easy to find, hence the "convenient" part.
Voluntary Response Sampling
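Definition- A voluntary response sample is a sample of people who choose on their own to take the sample/survey. These samples are almost always biased, because people with strong opinions are the most likely to respond.
Example- If I put out an open survey before the presidential election and let anyone in New England who wanted to respond say whether they favored Hillary or Trump, the people with the strongest opinions would be the ones most likely to answer. The results would probably lean toward Hillary, since if you look at the vote results by state, every state in New England favored Hillary.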
Undercoverage
Definition- It occurs when some members of the population are inadequately represented in the sample.
Example- A good example of this would be if I took a survey of my stats class and my stats class only. This would mean that the opinions of people in other stats classes would not be included in the sample/survey.
Response Bias
Definition- When someone lies while taking a survey or answers in a way that they normally wouldn't.
Example- If someone thought a truthful answer would be socially unacceptable, they might change their answer even though that's not what they really believe.
Non-response
Definition- Non-response is when people who were selected for a survey do not respond to it. When people don't respond, the outcome of the survey/sample can be different from what it would have been if everyone had answered.
Example- The survey emails that administration sends out to students in iLab are not answered by a lot of people, which in turn results in incomplete data.
Bias Definition
Definition- Usually considered to be an unfair prejudice in favor of or against a person, group, or thing. In a survey, it means the results lean systematically in one direction, away from the truth.
Example- If you were to take a survey at Duxbury High School asking who students think will win the World Series this year, the percentage of people who pick the Red Sox would be substantially higher than in the general population, because our school is about 30 minutes away from Boston, making most people in our school Red Sox fans.
Bias Sources
Definition- Biases or inaccuracies that can occur when combining or comparing research studies.
Example- If two surveys from different high schools were compared, each asking whether or not the students enjoyed the lunch at their school, you would have to compare the actual quality and kind of food each school is serving, and also which students from each school were asked.
Bias Direction
Definition- Underestimation or overestimation of the true intervention effect.
Example- It can go either way: because of bias and other emotional factors, someone can either underestimate or overestimate something.
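The direction of bias can be checked with simple arithmetic: average the estimates and subtract the true value. The sketch below is only an illustration. The function name `bias_direction` and all of the numbers are made up for the example, and it assumes the true value is known, which in a real survey it usually is not.

```python
from statistics import mean

def bias_direction(estimates, true_value):
    """Return (average error, direction) for a list of survey estimates."""
    diff = mean(estimates) - true_value
    if diff > 0:
        return diff, "overestimate"
    if diff < 0:
        return diff, "underestimate"
    return diff, "no bias"

# Hypothetical numbers: three school surveys each find about 80% picking
# one team, while the (assumed) true share in the wider population is 30%.
diff, direction = bias_direction([0.78, 0.82, 0.80], true_value=0.30)
print(round(diff, 2), direction)
```

A positive difference means the surveys tend to overestimate, and a negative difference means they tend to underestimate.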
SRS
Advantages- The aim of the simple random sample is to reduce the potential for human bias in the selection of cases to be included in the sample. As a result, the simple random sample provides us with a sample that is highly representative of the population being studied, assuming that there is limited missing data.
Disadvantages- A simple random sample can only be carried out if the list of the population is available and complete.
Stratified SRS
Advantages- It ensures each subgroup within the population receives proper representation within the sample.
Disadvantages- Researchers must identify every member of a population being studied and classify each of them into one, and only one, subpopulation. Finding an exhaustive and definitive list of an entire population is the first challenge. In some cases, it is downright impossible.
Cluster Sampling
Advantages- This sampling technique is cheap, quick, and easy. Instead of sampling an entire country, as with simple random sampling, the researcher can allocate limited resources to the few randomly selected clusters or areas.
Disadvantages- Of all the different types of probability sampling, this technique tends to be the least representative of the population.
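Cluster sampling can be sketched in a few lines of Python. This is only an illustration under made-up assumptions: the population of 8 homerooms with 20 students each is hypothetical, and `cluster_sample` is a name invented for this example.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters at random and survey everyone inside them."""
    rng = random.Random(seed)
    # Simple random sample of cluster names, not of individual people
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [person for name in chosen for person in clusters[name]]
    return chosen, members

# Hypothetical population: 8 homerooms of 20 students each
homerooms = {
    f"Room {r}": [f"Room {r} student {s}" for s in range(1, 21)]
    for r in range(1, 9)
}

chosen_rooms, surveyed = cluster_sample(homerooms, n_clusters=2, seed=1)
print(chosen_rooms)
print(len(surveyed))  # 2 rooms x 20 students = 40
```

Only two rooms have to be visited, which is why the method is cheap, but if those two rooms happen to be unusual, the whole sample is unusual too.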
Holistic Sampling
Definition- Relating to the idea that things should be studied as a whole and not just as the sum of their parts.
Example- Suppose I ran a survey in the school about the lunch food, asking whether it was good, bad, or somewhere in the middle. I wouldn't do multiple separate surveys that leave out certain groups in the high school or middle school, for example boys or girls. I would take one collective sample of the entire high school and middle school. If that wasn't possible, I would get 25 middle school boys, 25 middle school girls, 25 high school boys, and 25 high school girls in order to get an even, fair sample. To make the sample as random as possible, I would use a random number generator, either on my calculator or on a website, with three-digit numbers running from 001 up to however many boys or girls there are in the middle school or high school. Then I would get 25 different three-digit numbers for each of the 4 categories I made: middle school boys, middle school girls, high school boys, and high school girls. Then I would get a list of students for each category and match the numbers to the names on the list. I would make sure to generate the random numbers before looking at the lists, so that the selection really was random.
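The random-number procedure described above can be sketched in Python. This is a rough illustration, not the only way to do it: the group sizes and the `stratified_sample` function are made up, and real rosters would replace the generated names.

```python
import random

def stratified_sample(strata, per_group, seed=None):
    """Draw the same number of people at random from each group."""
    rng = random.Random(seed)
    picks = {}
    for group, names in strata.items():
        # Number the list 1..N, draw distinct random numbers,
        # then match each number to the name in that position.
        numbers = rng.sample(range(1, len(names) + 1), per_group)
        picks[group] = [names[i - 1] for i in sorted(numbers)]
    return picks

# Hypothetical rosters; the group sizes are invented for the example
sizes = {"MS boys": 240, "MS girls": 250, "HS boys": 310, "HS girls": 300}
rosters = {
    group: [f"{group} #{i:03d}" for i in range(1, n + 1)]
    for group, n in sizes.items()
}

sample = stratified_sample(rosters, per_group=25, seed=2024)
for group, names in sample.items():
    print(group, len(names), names[:3])
```

Because `rng.sample` never repeats a number, each group ends up with 25 different students, matching the "25 different numbers" step.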
Distinguish between an observational study and an experiment
Observational Study- A study that attempts to understand cause-and-effect relationships. However, unlike in an experiment, the researcher is not able to control how subjects are assigned to groups or which treatments each group receives.
Experiment- A controlled study in which the researcher attempts to understand cause-and-effect relationships. The study is controlled in the sense that the researcher controls how subjects are assigned to groups and which treatments each group receives.
Explain how a lurking variable in an observational study can lead to confounding.
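A lurking variable in an observational study can lead to confounding because the researcher only observes and does not control how subjects end up in each group. A variable that nobody measured can affect the response at the same time as the explanatory variable, so their effects get mixed together and can't be separated. For example, in what Mrs. Coleman said earlier, some people think kids in Colorado are having more learning disabilities because of THC use by pregnant women. In an observational study, the mothers who used THC might also differ from the other mothers in ways that weren't recorded, so we couldn't tell whether THC or one of those lurking variables was behind the difference. On top of that, an observational study that relies on surveys depends on people being honest, and a parent may not tell the truth because that may not be something they want other people to know, which would make the data even more misleading.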
Definitions
Subject- A person who is monitored in some way to provide data for a survey/sample.
Experimental Unit- The person or thing that a treatment is assigned to and that is then monitored, like a subject; for example, someone given something to eat or watch while being monitored.
Explanatory Variable- A variable that may explain or cause changes in the response variable. The term is used instead of "independent variable" when we can't be certain the variable is truly independent.
Treatments- Combinations of factor levels that are applied to the experimental units.
Response Variable- A variable whose value depends on that of another.
Three Principles of Good Experiment Design
Replication
Definition- Repetition of an experimental condition so that the variability associated with the phenomenon can be estimated.
Example- If I were to take a survey in the school on whether people like iLab or not, instead of sampling only 50 students at random, I could increase that number to 200 students at random and get a much more accurate picture of how kids feel about iLab.
Randomization
Definition- The practice of using chance methods, such as random number tables or flipping a coin, to assign subjects to treatments.
Example- If I were to do a survey of my grade in the school, the best way to randomize would be to decide how many people I'm going to survey, let's say 25, get that many random numbers from either my calculator or a random number generator online, and then match those numbers up with people on a list of students in the grade.
Control
Definition- A baseline group that receives no treatment or a neutral treatment, used for comparison.
Example- If we gave some plants a new type of fertilizer and kept using our usual one on the other plants, the plants on the usual fertilizer would be the control group, and we could compare what happens.
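Randomization and control can be combined in one step: shuffle the experimental units by chance, then split them into a treatment group and a control group. The sketch below assumes 20 plants and an even split. Both numbers, and the name `assign_groups`, are made up for illustration.

```python
import random

def assign_groups(subjects, seed=None):
    """Randomly split subjects into a treatment half and a control half."""
    rng = random.Random(seed)
    shuffled = list(subjects)   # copy so the original list is untouched
    rng.shuffle(shuffled)       # chance decides the order
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Hypothetical units: 20 plants; treatment = new fertilizer,
# control = the fertilizer we usually use
plants = [f"plant {i}" for i in range(1, 21)]
groups = assign_groups(plants, seed=7)
print(len(groups["treatment"]), len(groups["control"]))  # 10 10
```

Having 10 plants in each group, rather than 1, is also a small example of replication.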
Blinding and the Placebo Effect
Blinding
Definition- The practice of keeping patients in the dark as to whether they are receiving a placebo or not.
Example- Giving one group of subjects a certain treatment without telling them exactly what it is, then giving another group of people the same thing while telling them exactly what it is and what it does, and seeing the difference in how people react to it.
Placebo Effect
Definition- A beneficial effect, produced by a placebo drug or treatment, that cannot be attributed to the properties of the placebo itself and must therefore be due to the patient's belief in that treatment.
Example- If I were to give a group of people a drug and say each pill costs 1 dollar, and give another group of people the same medication but say that it costs 50 dollars for one pill, the people who were told the pill was more expensive would tend to feel better, because they thought they were taking some expensive medication.
A Well-Designed Experiment
The experiment I will be designing today is a plant-based experiment, where we will see the effects on a plant in different scenarios. In one scenario the plant is in sunlight 100% of the time. In another it is in sunlight 75% of the time, in another 50%, and in the last one 0%. Because the sun isn’t out 24/7, we will use artificial sunlight lamps. We will use multiple plants for replication, to make sure the study comes out accurate. We will use parlor palm plants, snake plants, peace lily plants, anthurium plants, and aloe plants.
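One way the plants could be assigned to the light scenarios is sketched below in Python. It is only a sketch under assumptions the design above does not state: it assumes 2 plants per light level for each type of plant (8 plants of each type), and it randomizes separately within each type so that every type appears equally often under every light level. The function `assign_light` is a made-up name.

```python
import random

LIGHT_LEVELS = [100, 75, 50, 0]  # percent of time under light
SPECIES = ["parlor palm", "snake plant", "peace lily", "anthurium", "aloe"]

def assign_light(species, levels, plants_per_level, seed=None):
    """Randomly assign each plant a light level, balanced within species."""
    rng = random.Random(seed)
    plan = {}
    for sp in species:
        # One label per plant: each level repeated plants_per_level times
        labels = [lvl for lvl in levels for _ in range(plants_per_level)]
        rng.shuffle(labels)  # chance decides which plant gets which level
        for i, lvl in enumerate(labels, start=1):
            plan[f"{sp} {i}"] = lvl
    return plan

# plants_per_level=2 is an assumption, not part of the original design
plan = assign_light(SPECIES, LIGHT_LEVELS, plants_per_level=2, seed=3)
for plant, level in list(plan.items())[:4]:
    print(plant, "->", f"{level}% light")
```

Randomizing inside each type keeps a difference between, say, aloe and peace lilies from being mistaken for an effect of light.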
SRS Definition- A subset of a statistical population chosen so that each member of the population has an equal probability of being chosen.
Stratified SRS Definition- A method of sampling that involves the division of a population into smaller groups known as strata. In stratified random sampling, the strata are formed based on members' shared attributes or characteristics, and a random sample is then taken from each stratum.
Cluster Sampling Definition- The researcher divides the population into separate groups, called clusters. Then, a simple random sample of clusters is selected from the population.