Research Methods & Statistics - Coggle Diagram

- - - - Non-parametric -> "NO" is used to analyze Nominal or Ordinal data
        
        Chi-squared = Mr. Independent = 1 variable only (e.g. persons' gender)
        
        single sample chi-square = goodness of fit
        
        count all variables to determine the # of variables for a chi-square test
        
        multiple sample chi-square = used for contingency tables
        
        use when there are 2 variables (e.g. gender (male/female) AND voter preference (democrat/republican))
        
        Chi-Square is a test of difference -> it requires the following core conditions
        
        Kai Rasporivich is very independent from other families b/c they are weirdos
        
        i). randomly selected from the population
        
        ii). must have independent observations -> so we can't measure it more than once
        
        iii). it uses nominal (categorical) data
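The two chi-square cases above (single sample vs. multiple sample) can be sketched in a few lines. This is a minimal illustration, not a full test: the function names and the counts are made up for this example, and it only computes the chi-square statistic (no p-value).

```python
def chi_square_goodness_of_fit(observed, expected):
    """Single-sample chi-square: sum of (O - E)^2 / E over the categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_contingency(table):
    """Multiple-sample chi-square for a rows x columns table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Hypothetical counts, invented for illustration only.
# 1 variable (gender): 60 men and 40 women vs. an expected 50/50 split.
gof = chi_square_goodness_of_fit([60, 40], [50, 50])

# 2 variables: gender (rows) by voter preference (columns).
table = [[30, 20],
         [20, 30]]
contingency = chi_square_contingency(table)
print(gof, contingency)  # prints: 4.0 4.0
```

Each person is counted in exactly one cell, which is the "independent observations" condition in code form.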
        
        Advantages
        
        not susceptible to outliers
        
        easier to calculate
        
        Disadvantages
        
        not as powerful as parametric tests
        
        greater chance of type II errors
        
        use if the data meets one of the following criteria
        
        ordinal data -> has order (e.g. Likert scale or rank)
        
        nominal data gender (male/female), marital status (single, married), etc.
        
        Non-parametric means -> that the numbers CAN'T be used in calculations
        
        non-parametric data is a label
        
        Substitute "variable" with "sample"
      - Parametric (Para is powerful)
        
        Types of parametric -> means that they all have "parameters" or requirements
        
        T-Test = "t" for 2 groups -> is used to compare 2 means "averages"
        
        i). t-test for single sample -> comparing an obtained sample mean to a population mean
        
        (e.g. mean mock EPPP scores vs. actual EPPP scores)
        
        (e.g. comparing the achievement scores of sixth-grade students in one school district in California to the test scores of all sixth-grade students in California)
        
        iii).related samples t-test -> comparing 2 groups when there is relationship btwn them (e.g. twins)
        
        ii). unrelated t-test -> difference btwn groups that are unrelated
        
        2 scoops = 2 groups
        
        Use when you have 1 IV and 2 groups; with 3+ groups you are not able to use t-tests, it must be an ANOVA
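The three t-tests listed above can be sketched as follows. This is a study sketch under assumptions: the function names are invented here, the unrelated version uses a pooled variance (one common textbook form), and only the t value is computed, not its significance.

```python
import math
import statistics


def one_sample_t(sample, population_mean):
    """t = (sample mean - population mean) / (sample SD / sqrt(n))."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - population_mean) / standard_error


def unrelated_t(group_a, group_b):
    """Unrelated (independent) samples t using a pooled variance estimate."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error


def related_t(first, second):
    """Related samples t: a single-sample t on the pairwise differences."""
    differences = [x - y for x, y in zip(first, second)]
    return one_sample_t(differences, 0)


# Hypothetical scores, invented for illustration only.
print(one_sample_t([2, 4, 4, 4, 5, 5, 7, 9], 5))
print(unrelated_t([1, 2, 3], [4, 5, 6]))
print(related_t([5, 7, 9], [4, 5, 6]))
```

Every function compares exactly 2 means, which matches "2 scoops = 2 groups".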
        
        (3) core requirements of parametric
        
        i). interval/ratio data, "ir-data" which has a score value (e.g. income, IQ)
        
        iii). homoscedasticity = equal variability between the groups, which is measured by SD
        
        ii). normally distributed data (e.g. bell shape)
        
        If these criteria are not met, you must do a non-parametric test
        
        "homo" sapien
        
        "ir"ritating sound = interval/ratio
        
        Parametric means that -> the numbers can be used in calculations
        
        Substitute "variable" with "sample"
      - Once the data is collected we need to determine which tests to run
        
        Tests of Difference which includes ANOVAS, t-tests & Chi-Square
        
        Factorial ANOVA-> "factorial" is a generic term that means more than 1 I.V.
        
        "way" actually means IV's -> 2 IV's = 2-way and 3 IV's = 3-way
        
        more than 1 IV -> (2 IV's = 2-way ANOVA; 3 IV's = 3-way ANOVA, etc.)
        
        hallmark is all IV's are correlated
        
        factorial assumes there is more than 1 IV
        
        1-way ANOVA -> "a" is singular
        
        1-way ANOVA produces an "f-ratio"
        
        F-ratio
        
        **memorize this -> the larger the f-ratio, the more we can be sure that the changes are due to types of tx -> F = MSB / MSW, so an f-ratio near 1.0 suggests no tx effect, and there is no fixed upper limit
        
        mean square within (MSW) -> measure of variability due to error only
        
        mean square between (MSB) -> measure of variability in D.V. that's due to tx effect plus error
        
        1-way ANOVA = 1 IV and 1 DV
        
        conduct a post-hoc (after the fact) only if statistically significant
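To see where the f-ratio comes from, here is a minimal sketch of a 1-way ANOVA computed by hand. The function name and the scores are invented for this example; it returns MSB, MSW and F only and does not test significance.

```python
import statistics


def f_ratio(groups):
    """Return (MSB, MSW, F) for a 1-way ANOVA on a list of groups."""
    all_scores = [score for group in groups for score in group]
    grand_mean = statistics.mean(all_scores)
    k = len(groups)            # number of groups (levels of the 1 IV)
    n_total = len(all_scores)  # total number of scores

    # Between groups: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                     for g in groups)
    # Within groups: how far each score sits from its own group mean.
    ss_within = sum((score - statistics.mean(g)) ** 2
                    for g in groups for score in g)

    msb = ss_between / (k - 1)
    msw = ss_within / (n_total - k)
    return msb, msw, msb / msw


# Hypothetical scores for 3 treatment groups, invented for illustration.
msb, msw, f = f_ratio([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(msb, msw, f)
```

With these made-up scores the group means differ a lot relative to the spread inside each group, so F comes out well above 1.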
        
        MANOVA -> (Multi-variate analysis of variance) -> "m" multiple D.V.'s
        
        1+ IV
        
        If the question has more than 1 DV = choose MANOVA & move on to the next question
        
        1+ group
        
        For example -> effects of exercise on weight and stress level
        
        ANCOVA
        
        used to -> statistically remove the effects of an extraneous variable from scores on the D.V.
        
        When using the ANCOVA, the extraneous or moderator variable is the covariate
        
        alternative verbiage that might be used on the exam for ANCOVA is hold constant OR partial out the effects of
        
        example -> study the effects of smoking on diet and accounting for anxiety (anxiety is extraneous variable )
        
        ancova erases
        
        Trend Analysis -> use when there is 1 or more IV's
        
        used when the researcher wants to determine if there is a significant linear or non-linear relationship
        
        A trend analysis shows the ups and downs in a set of data.
        
        For example
        
        nonlinear trend -> a lower dose has a minimal or even negative effect
        
        a linear trend might be -> the higher the dose, the more effective the medication
        
        tires only run in a linear fashion
        
        Split-plot ANOVA
        
        2 IV's -> Independent on 1 variable AND correlated on another variable
        
        Step #1 -> in what form is the data? you do this by looking at the DV (dv=outcome variable)
        
        Step #4 -> is the data (groups) correlated or independent -> 3 ways to do this
        
        b). if people in groups are related to each other (e.g. twin studies)
        
        c). match people in pairs before you put them into groups
        
        a). measure people over time (e.g. beg/middle and end of tx) -> this is the most common -> time is always correlated data
        
        Step #3 -> how many IV's? (the way the groups are being compared) and what are the levels
        
        (e.g. study comparing medications in treating depression; SSRI, SNRI, MAOI, TCA => 1 variable, with 4 levels)
        
        independent data means that you can't be in 2 groups
        
        Step #2 -> Is the data Interval or ratio?
        
        NO
        
        by default, the data must be either Nominal or Ordinal
        
        conduct a Non-parametric test (e.g. Chi-Square)
        
        YES
        
        data must be parametric, and you would do the next steps
        
        In order to run a parametric test you must meet 2 additional criteria
        
        i). homoscedasticity
        
        "homo" = same
        
        scedasticity = variability, or spread (same standard deviation)
        
        ii). normal distribution of the data
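The steps above can be sketched as a small decision helper. This is only a rough study aid under assumptions: the function name, its parameters and the returned labels are invented here, and it covers just the tests named in these notes (it leaves out split-plot ANOVA, ANCOVA and trend analysis).

```python
def choose_test(interval_or_ratio, meets_assumptions=True, n_dvs=1,
                n_ivs=1, n_groups=2, correlated=False):
    """Walk through the steps in the notes and name a candidate test."""
    # Steps 1-2: nominal/ordinal data, or failed assumptions
    # (homoscedasticity, normal distribution) -> non-parametric.
    if not interval_or_ratio or not meets_assumptions:
        return "non-parametric test (e.g. chi-square)"
    # More than 1 DV -> MANOVA.
    if n_dvs > 1:
        return "MANOVA"
    # Step 3: how many IVs, and how many levels (groups)?
    if n_ivs > 1:
        return "factorial ANOVA"
    if n_groups > 2:
        return "1-way ANOVA"
    # Step 4: are the 2 groups correlated or independent?
    return "related samples t-test" if correlated else "unrelated t-test"


# Example from the notes: 1 IV (medication) with 4 levels, 1 DV.
print(choose_test(True, n_ivs=1, n_groups=4))  # prints: 1-way ANOVA
```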
- - - - fitness level as a mediator to improved cardiovascular health which helps to explain the relationship between exercise and cardiovascular health.
      - A study investigating the relationship between exercise (independent variable) and improved cardiovascular health (dependent variable)
      - Self-esteem as mediator to academic performance & well-being, as high academic achievement may lead to increased self-esteem
      - communication as a mediator of relationship satisfaction and longevity -> good communication fosters deeper connection btwn partners
      - m"e"diator -> how the 2 variables are r"e"lated -> in-between a divorce
      - what is in-between that might explain -> triangulation is a term that you might use to explain this -> it is when you get from here to here through something else
      - CJ -> mediator is like a light switch, either it is "on" or "off" and explains "why"
        there is a relationship
    - - I manipulate the independent variable -> (e.g. male vs female, educated vs. non-educated, etc.)
      - "time" is an IV
    - - Social supports as a moderator btwn stress and job performance
      - education level as a moderator btwn income earned and job satisfaction
      - Age as a moderator of the relationship btwn technology usage and anxiety
      - Moderator -> I notice the “o” which reminds me m"o"derators tell me how strong a relationship is and what direction the relationship may go*
      - Connor McGregor is the "moderator" in MMA
      - pointing = direction
      - CJ-> moderator is like a
        "dimmer switch" -> it turns the strength of the relationship up or down
      - it is a 3rd variable that affects the nature of the relationship
    - - I want to know if (blank), depends on (blank)
      - first blank = DV
      - second blank = IV
    - - These are "EXTRA things" outside of the IV
      - Anchor -> Extra, extra read all about it -> " confounding is disturbed about the variables"
      - A confounding variable is a variable which is not of interest to the researcher that exerts a systematic effect on the DV.
    - - I measure the dependent variable (a.k.a. outcome variable)
      - DV's don't have levels
      - survey says = final decision
  - - - first ask yourself -> how are we measuring our people?
        
        are they getting a score or numerical value => if YES => that is interval or ratio data (e.g. income, IQ)
        
        i).Nominal -> (Nom = name) Categorical data that is not numerical
        
        (e.g. gender, race, political view)
        
        ii).Ordinal -> (ord = order) -> data that must be in some kind of order & distance between items is not equal
        
        (e.g. race results 1st, 2nd, 3rd; grades A,B,C,F; education level; customer satisfaction level using a Likert scale)
        
        are they just being counted in one category or another => If YES => Data must be nominal or ordinal (e.g. are you voting republican or democrat)
        
        iv). Ratio -> think “Ration” is the amount of something once you run out you have nothing left
        
        you can't have -3 dogs
        
        a zero in a Ratio scale does mean an absence of something (e.g. income, height, weight, unemployment rate)
        
        iii). In"t"erval -> (distance between 2 things) -> numbers only, and is the distance between 2 numbers which is equal, numbers can be negative
        
        (e.g. Temperature & Time -> the distance from 1 pm to 2 pm is the same as from 2 pm to 3 pm -> 3 pm is not more time than 2 pm)
  - - - Malcolm in the median -> this is always the middle value.
      - the median is a better measure when there are extreme scores, or a substantial percentage of maximum scores
    - - apple pie & ice-cream is most popular choice
    - - Mean "Avril" Lavigne -> where everything she does, averages out in the end.
      - the mean is the best measure of central tendency when there are no extreme scores
    - - Standard Deviation -> the average amount we expect a point to differ or deviate from the mean
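A quick sketch of the measures above, using Python's built-in statistics module. The scores are invented for illustration, with one extreme value added on purpose to show why the median holds up better than the mean.

```python
import statistics

# Hypothetical scores with one extreme value (40), invented for illustration.
scores = [2, 3, 3, 4, 5, 6, 40]

mean = statistics.mean(scores)      # the average: pulled toward the extreme score
median = statistics.median(scores)  # the middle value: barely affected
mode = statistics.mode(scores)      # the most frequent ("most popular") value
sd = statistics.pstdev(scores)      # SD: typical distance of scores from the mean

print(mean, median, mode, round(sd, 2))
```

Here the single extreme score drags the mean above the median, which is the pattern to expect when the tail goes to the right.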
  - - - 68% of scores fall between +1 and -1 SD
      - mean, mode and median are equal
      - 95% of scores fall between +2 and -2 SD
      - about 99% (99.7%) of scores fall between +3 and -3 SD of the bell curve
      - TQ -> the shape of a Z-score distribution is identical to (or follows) the shape of the raw score distribution.
    - - hint: you are right mean!! So exit stage right, tail goes to the right!!
      - (low mode = positive)
      - they always go in alphabetical order => mean, median, mode
      - Whale's tail => the tail tells the tale
    - - hint: Low mein = eating noodles with left hand, so noodles (tail) goes left
  - - - Histogram -> used when there is a large distribution of scores
        
        the bars are touching each other b/c there is a range
      - Line graphs-> interval & ratio data
    - - bars above each category w space between
      - hint: No bar = nominal & ordinal
  - - - before square rooting, convert the number into 100ths (e.g. 0.5 becomes .50) => square root of .50 is about .7 b/c 7x7 = 49
      - your answer is always expressed in 10ths (e.g.
      - Square root of .1 becomes .10 => 3 squared = 9, which is very close to 10. the answer is .3
    - - answers are always expressed in 100ths (e.g. .6 squared becomes .36)
      - (e.g. .1 squared = .01)
    - - when A = B/C => A & B will always have a direct relationship => (e.g. when A increases, so does B, and vice versa) => A and C will have an inverse (indirect) relationship. As A goes up, C goes down
        
        for the EPPP, this concept is used when calculating the standard error of the mean formula
        
        if SD of population increases => SEM increases
        
        if sample size increases => SEM decreases
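The A = B/C idea above, applied to the standard error of the mean (SEM = SD divided by the square root of the sample size). A minimal sketch; the function name and the numbers are made up for this example.

```python
import math


def standard_error_of_mean(sd, n):
    """SEM = SD / sqrt(N): direct with SD, inverse with sample size."""
    return sd / math.sqrt(n)


print(standard_error_of_mean(10, 25))   # baseline -> 2.0
print(standard_error_of_mean(20, 25))   # SD goes up -> SEM goes up (4.0)
print(standard_error_of_mean(10, 100))  # sample size goes up -> SEM goes down (1.0)
```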
- - - - T-scores -> are used for smaller sample sizes or when the population standard deviation is unknown
        
        T-scores have a mean of 50 & SD of 10
        
        "T" for tens
        
        these are also percentile scores
      - Z-scores -> are a count of the SD -> applicable for large sample sizes with known population standard deviation
        
        mean of zero & SD of 1
        
        (a.k.a. standard scores)
      - Stanines ("standard nine") -> "s" staying in the middle -> mid-score is "5"
        
        hint: Stanines are in the middle at 5, everything starts in the middle of the curve
        
        (not the same as stens, which use a 1-10 scale)
        
        have a mean of 5 and SD of 2, with a range of 1-9.
      - IQ scores
        
        IQ scores have a mean of 100 & SD of 15
        
        IQ scores are double b/c they are so smart
        
        measured by t-scores
      - Percentile Ranks -> 2, 16, 50, 84, 98
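The standard scores above are all the same information on different scales. A minimal sketch, assuming a normal distribution; the function name is invented here and `NormalDist` comes from Python's standard library (3.8+).

```python
from statistics import NormalDist


def convert_z(z):
    """Express a z-score on the other standard-score scales in these notes."""
    return {
        "T": 50 + 10 * z,      # T-scores: mean 50, SD 10
        "IQ": 100 + 15 * z,    # IQ scores: mean 100, SD 15
        # Percentile rank: percent of a normal curve falling below z.
        "percentile": round(NormalDist().cdf(z) * 100),
    }


for z in (-2, -1, 0, 1, 2):
    print(z, convert_z(z))
```

Running z = -2, -1, 0, +1, +2 through this gives the percentile ranks 2, 16, 50, 84, 98 from the list above.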
  - - - measurement error -> is due to random factors that affect the test performance of examinees in unpredictable ways
        
        examinee fatigue
        
        ambiguously worded test items
        
        distractions during testing
      - true score variability -> is the result of actual differences among examinees with regard to whatever the test is measuring.
      - a.k.a. "True Test score theory"
    - - Internal Consistency Reliability -> reliability of scores over different test items -> best for tests that measure a single content domain OR aspect of a bx
        
        b). Kuder Richardson (KR-20)-> it quantifies the extent to which items in a scale/ test are correlated with each other OR the degree to which the items measure the same underlying construct
        
        The coefficient alpha ranges from 0 to 1
        
        A value closer to 0 indicates low internal consistency -> where items do not consistently measure the same underlying construct.
        
        A value close to 1 indicates high internal consistency, -> where the items in the scale are highly correlated with each other.
        
        (KR-20) is used for dichotomous items (e.g. yes or no, correct or incorrect)
        
        Kuder/k.d. lang -> "d" for dichotomous
        
        a). Coefficient alpha (a.k.a. Cronbach's alpha) -> it involves administering the test to a sample of examinees
        
        calculates the average inter-item consistency
        
        d). Spearman-Brown prophecy formula -> used to estimate the effect of lengthening or shortening a test on its reliability coefficient.
        
        should I keep my name, or should I hyphenate it? -> lengthen or shorten
        
        c). Split-half reliability -> is taking the test and divide it into 2 halves -> you should get the same score on both halves of the test
        
        splitting the test in half (e.g. even- and odd-numbered items) and then correlating the 2 sets of scores
        
        A problem with split-half reliability is that the shorter tests tend to be less reliable than longer tests -> therefore a split-half reliability coefficient underestimates a test's reliability
        
        need a mnemonic to remember the differences with these different types and how to tell them apart ...talk to Roza
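One way to tell these apart is to see what each one computes. A minimal sketch, not a validated implementation: the function names and the item data are invented for this example, and the alpha function uses the standard coefficient alpha formula (on 0/1 items it should give the same value as KR-20).

```python
import statistics


def cronbach_alpha(item_scores):
    """Coefficient alpha. item_scores: one list per item, each holding
    every examinee's score on that item."""
    k = len(item_scores)
    item_variances = sum(statistics.variance(item) for item in item_scores)
    totals = [sum(scores) for scores in zip(*item_scores)]
    return (k / (k - 1)) * (1 - item_variances / statistics.variance(totals))


def spearman_brown(reliability, length_factor):
    """Predicted reliability when test length is multiplied by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical dichotomous data: 3 items (rows) x 5 examinees (columns).
items = [[1, 1, 0, 1, 0],
         [1, 1, 0, 0, 0],
         [1, 0, 0, 1, 0]]
print(cronbach_alpha(items))

# Split-half gives the reliability of a half-length test, so it is
# stepped up to full length (length_factor = 2).
print(spearman_brown(0.60, 2))    # corrected split-half, about .75
print(spearman_brown(0.75, 0.5))  # cutting the test in half, about .60
```

The last two lines show why an uncorrected split-half coefficient underestimates reliability: the same items score lower simply because the test is half as long.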
      - 2. Alternate Forms Reliability -> two different tests are used to test the individual for the same thing & there is a passage of time between tests
      - Test-Retest Reliability -> Consistency of scores over time -> you take a test, and then you re-take it, you should get similar scores if test is reliable
        
        (Equivalent form test -> EPPP/SAT exam b/c you can take different tests but get similar results)
        
        best for speed tests
      - 4. Inter-Rater Reliability -> is used to determine consistency of scores when there are multiple raters
        
        Cohen's kappa (kappa = couple, 2 or more) -> is used to correct for chance agreement by different raters
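"Correcting for chance agreement" can be sketched for the 2-rater case like this. The function name and the ratings are invented for this example; kappa = (observed agreement - chance agreement) / (1 - chance agreement).

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each rate the same cases."""
    n = len(rater_a)
    # Proportion of cases where the two raters gave the same rating.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from how often each rater used each category.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical yes/no (1/0) ratings of 10 cases by 2 raters.
rater_a = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
rater_b = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
print(cohens_kappa(rater_a, rater_b))
```

The raters agree on 80% of cases, but 50% agreement would be expected by chance alone, so kappa comes out lower than the raw agreement (about .60).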
      - Anchor Story Mr. Reliable Construction business -> his motto is almost always reliable except for 1 "measuring incident"
        
        ii).Afro Jack (Alternate forms)
        
        2 different test to see which one works better -> test 1, test 2??
        
        iii). ICR "Iceman" (Internal-consistency-rater) -> I just want to play beach volleyball with my "teammates"
        
        d).Spearman-Brown -> compensate for long tests OR small test
        
        a).Cronbach (coefficient alpha) -> this dude is consistent!!
        
        b).Kuder-Richardson (KR-20) "Kuder" -> is always "k"omparing
        
        c).Split-half -> always undecided
        
        i). Testy Mctesterson (test-retest)
        
        I love to take tests!!
        
        kappa chance on me
        
        iv). Inter-rater (Smooth operator) -> smooth interrater....
        
        when there are multiple raters
      - these are all different types of reliability coefficients or measures used
      - instead of using "reliability coefficients" use the term RELIABILITY MEASURES
    - - Guessing
      - Range of Scores
      - Content Homogeneity:
    - - Item Difficulty
      - Item Discrimination
  - - - (a) Monotrait-Monomethod Coefficient: The monotrait-monomethod (same trait, same method) coefficient is a reliability coefficient (e.g., coefficient alpha) for the self-report sociability test.
      - (b) Monotrait-Heteromethod Coefficient: The monotrait-heteromethod (same trait, different method) coefficient is the correlation coefficient for the self-report sociability test and the teacher report sociability test. When this coefficient is large, it provides evidence of the self-report sociability test's convergent validity.
      - (c) Heterotrait-Monomethod Coefficient: The heterotrait-monomethod (different trait, same method) coefficient is the correlation coefficient for the self-report sociability test and the self-report impulsivity test. When this coefficient is small, it provides evidence of the self-report sociability test's divergent validity.
      - (d) Heterotrait-Heteromethod Coefficient: The heterotrait-heteromethod (different trait, different method) coefficient is the correlation coefficient for the self-report sociability test and the teacher report impulsivity test. When this coefficient is small, it provides evidence of the self-report sociability test's divergent validity.
    - - Convergent validity exists, when we have high correlations of monotrait (MT) heteromethod (HM)
      - Divergent validity (aka discriminant validity) -> exists when we have low correlations of heterotrait (HT) monomethod (MM)
  - - - my common sense confirms that I have it!!
- - - - Negative Correlation
      - Zero Correlation
      - Positive Correlations
  - - - (e.g. divorced, deployed, new job, etc.)
      - from the Greek word phainomenon appearance -> appears to be, "to show"
    - - (e.g. anthropologist going to a village and watching)
    - - (e.g. grounded while making observations)
      - collecting of data based on your hypothesis
    - - (e.g. picture books have depth)
      - in-depth interview & focus groups
    - - data triangulation -> same method at different times, different setting, different people
      - 3. investigator triangulation -> 2 or more investigators collect and analyze data
      - methodological triangulation -> uses multiple methods (e.g. interviews, focus groups, etc.)
      - theory triangulation -> interpreting data using multiple theories, hypotheses or perspectives
  - - - Reversal Design ABAB or ABA Design -> where the second "A" is return to baseline and the second "B" is return to treatment
        
        A common threat is that the measures may fail to return to baseline b/c they have already been exposed to tx.
        
        (e.g. hyperactive child is impulsive) -> A phase is how many times he jumps out of his seat -> B is intervention
        
        ABAB removes history as a threat
        
        withdrawing tx when it is working is not ideal
        
        recall strategy needed here
      - Multiple Baseline Design -> when you are doing the same thing 3 times (e.g. across behaviors, settings or subjects)
        
        Tx is applied consecutively, sequentially, or successively
        
        mult-i-ple = 3X
        
        An advantage of multiple-baseline design over ABAB is that you don't have to withdraw tx which could be detrimental, especially if it is working
        
        recall strategy needed here
      - AB Design -> one subject, where "A" is baseline and "B" is treatment
        
        simplest design -> Little Abner all by himself
        
        same person is measured many times
        
        autocorrelation is often associated with this type of research
        
        biggest threat is history -> something happens at the same time that threatens our research results
      - Simultaneous treatment -> looking at (2) different treatments b/c you want to know which one is better
        
        (e.g. two types of tx at different times of the day)
        
        probably won't see this on the exam
      - Idiographic -> research that focuses on understanding the unique experiences of individuals or specific cases.
      - In experimental research there is a "causal" relationship btwn IV & DV
      - Iggy Pop is a "single idiot"
      - In this type of research you study 1 or few subjects in a very intense manner
      - single subject = is an experiment
    - - c. Mixed Design -> is a mixture of between groups and within groups
        
        (e.g. one component is repeated and another component is independent)
        
        1 grouping must be independent and 1 thing is repeated measure
      - a. Between Groups Design -> comparing subjects to different groups where data and groups are independent
        
        (hint: you MUST choose between this group or that group)
        
        data must be independent (can't be in more than 1 group at the same time)
      - Factorial Design -> occurs whenever there are 2 or more independent variables
        
        it allows researchers to obtain information on the main effects of each IV.
        
        main effect -> is the effect of 1 IV on the DV
        
        interaction effect -> the combined effect of 2 or more IV's on the DV
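Main effects and an interaction can be read straight off the cell means of a 2 x 2 factorial design. A minimal sketch: the IVs, the cell means and the helper name are all invented for illustration, and it only describes the pattern (no significance test).

```python
# Hypothetical cell means for a 2 x 2 design (numbers invented):
# IV 1 = therapy (no/yes), IV 2 = medication (no/yes), DV = improvement.
cell_means = {
    ("no therapy", "no medication"): 10,
    ("no therapy", "medication"): 20,
    ("therapy", "no medication"): 20,
    ("therapy", "medication"): 50,
}


def marginal_mean(level, position):
    """Average the cells at one level of an IV, ignoring the other IV."""
    values = [m for key, m in cell_means.items() if key[position] == level]
    return sum(values) / len(values)


# Main effect: the effect of 1 IV on the DV, averaged over the other IV.
main_effect_therapy = marginal_mean("therapy", 0) - marginal_mean("no therapy", 0)
main_effect_medication = (marginal_mean("medication", 1)
                          - marginal_mean("no medication", 1))

# Interaction: does the effect of medication depend on therapy?
medication_effect_with_therapy = (cell_means[("therapy", "medication")]
                                  - cell_means[("therapy", "no medication")])
medication_effect_without_therapy = (cell_means[("no therapy", "medication")]
                                     - cell_means[("no therapy", "no medication")])
interaction = medication_effect_with_therapy - medication_effect_without_therapy

print(main_effect_therapy, main_effect_medication, interaction)
```

A non-zero difference of differences means the 2 IVs combine to produce more (or less) than their separate effects would suggest.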
      - this is from Psychprep audios
      - b. Within Subject Design -> comparing groups when the data is repeated. Everyone has data in each one of the groups
        
        (e.g. before, after or during treatment)
        
        data is correlated
        
        measured repeatedly over time
      - Butterfly on Coral Reef => Correlation
  - - - b).Voluntary Sampling (aka voluntary response sampling) -> the sample consists of individuals who volunteered to participate in the study
      - c). Purposive Sampling (aka judgmental sampling) -> when researchers use their judgment to select individuals who are appropriate for the purposes of their study
        
        (e.g researcher wants to study beggars, and he visits 3 areas in the city where beggars live -> he then selects and interviews beggars)
        
        researcher attempts to identify "target population" with a specific objective
      - a).Convenience sampling -> involves individuals who are easily accessible to the researcher
        
        (e.g., university students are selected in various spots within the campus)
        
        (e.g. during election, journalist on the street asks random people who they are voting for)
      - d). Snowball Sample -> used when direct access to members of the target population is difficult.
        
        (e.g. asking individuals who participate in the study to recommend others who might qualify)
        
        psst!! pass it on
      - NON-RANDOM
      - Risks of non-random research
        
        vulnerable to sampling error (a.k.a. selection bias)
        
        vulnerable to sampling bias (a.k.a. systematic error)
      - Anchor Story Norando's Mac's Store Research Project
        
        picking only people who come into the store -> (convenience)
        
        sign outside the store saying, "Ask to be a volunteer" -> (voluntary sampling)
        
        gets mad, takes the mission statement, leaves the store and starts making decisions on who he thinks would be a good subject -> (purposive)
        
        starts to snow, so he makes snowballs, everyone he hits comes to talk, so he tries to sell them on doing a survey (snowball)
    - - CBPR starts with a research topic of importance with the purpose of combining knowledge with action to achieve social change & improve health outcomes
      - 9 core processes include;
        
        (a) Recognize the community as a unit of identity
        
        (b) Build on the community's strengths and resources
        
        (c) Facilitate an equitable, collaborative, and power-sharing partnership during all phases of the research
        
        (e) Integrate and achieve a balance between knowledge generation and intervention for the benefit of all partners.
        
        (d) Foster co-learning and capacity building among all partners.
        
        (f) Focus on public health problems of relevance to the community and emphasize an ecological approach that recognizes the multiple determinants of health.
        
        (g) View system development as a cyclical and iterative process.
        
        (h) Disseminate research results to all partners and involve them in the dissemination process.
        
        (i) Understand that CBPR is a long-term process that requires a commitment to sustainability.
    - - b).Systematic random sampling -> used when a random list of all individuals in the population is available
        
        It involves selecting every nth (e.g., 10th or 25th) individual from the list until the desired number of individuals has been selected
      - c).Stratified random sampling -> used when the population is heterogeneous with regard to one or more characteristics that are relevant to the study (e.g., gender, age range, DSM diagnosis)
        
        key requirement -> it must ensure that each characteristic is adequately represented in the sample.
        
        dividing the population into subgroups (strata), based on the relevant characteristics and selecting a random sample from each subgroup
      - a).Simple random sampling -> All members of the population have an equal chance of being selected
        
        (e.g. using a computer-generated sample of individuals that was randomly chosen from a list of all individuals in the population)
        
        it simply looks like everyone will get an equal chance to be selected
      - d).Cluster random sampling-> when it is impossible to randomly select individuals from a population b/c the population is large & b/c there are natural clusters within the population
        
        It involves randomly selecting a sample of clusters and then either including in the study all individuals in each selected cluster or a random sample of individuals in each selected cluster.
        
        (e.g., mid-sized cities, school districts, mental health clinics).
        
        this one is a bit confusing ....NEED TO FIND AN EXAMPLE FOR THIS ONE
      - RANDOM
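The four random sampling methods above, side by side, including an example for cluster sampling. This is a rough sketch under assumptions: the function names, the population of ID numbers 1-100 and the school districts are all invented for illustration.

```python
import random


def simple_random_sample(population, n, rng):
    """Every member has an equal chance of being selected."""
    return rng.sample(population, n)


def systematic_sample(population, step, rng):
    """Pick a random starting point, then every nth individual on the list."""
    start = rng.randrange(step)
    return population[start::step]


def stratified_sample(strata, n_per_stratum, rng):
    """Draw a random sample from each subgroup (stratum)."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly select whole clusters, then include everyone in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]


rng = random.Random(0)            # seeded so the sketch is repeatable
population = list(range(1, 101))  # hypothetical ID numbers 1-100

simple = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)  # every 10th individual
stratified = stratified_sample(
    {"group 1": population[:50], "group 2": population[50:]}, 5, rng)

# Cluster example: school districts are the natural clusters.
districts = {
    "District A": [1, 2, 3],
    "District B": [4, 5, 6],
    "District C": [7, 8, 9],
    "District D": [10, 11, 12],
}
clustered = cluster_sample(districts, 2, rng)
print(simple, systematic, stratified, clustered, sep="\n")
```

The difference to notice: stratified sampling takes some people from every subgroup, while cluster sampling takes everyone from a few randomly chosen groups.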
- - - - this stuff is in assessment, do not duplicate
- - - - a). relationship btwn variables is linear
      - b). there is an unrestricted range of scores for all variables
      - c). there is homoscedasticity -> which means that the variability of criterion scores is similar for all predictor scores.
    - - scatterplot -> the closer the dots are to a straight line, the stronger the relationship
      - negative correlation -> is an inverse/indirect relationship, as one goes up the other goes down
      - positive correlation -> is a direct relationship btwn variables that goes in the same direction
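Direct vs. inverse relationships show up as the sign of the correlation coefficient. A minimal sketch of the Pearson r computed by hand; the function name and the data are invented for this example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariation of x and y divided by their spread."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    co_deviation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return co_deviation / math.sqrt(ss_x * ss_y)


# Hypothetical scores, invented for illustration.
hours = [1, 2, 3, 4]
print(pearson_r(hours, [2, 4, 6, 8]))  # both go up together -> positive
print(pearson_r(hours, [8, 6, 4, 2]))  # one up, other down -> negative
```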
- - - - percentage correct
      - pass/fail
      - raw scores
    - - (e.g. percentile ranks, t-scores, z-scores)
  - - - the key thing in MRE, is the predictors are compensatory (e.g. one variable can compensate for another)
      - non-compensatory is when one variable should not be used to compensate b/c they are too different from each other