BIVARIATE DATA

multivariate data

Multivariate data analysis is a type of statistical analysis that involves more than two variables. In an experiment, these variables are the factors you compare against the control, the component that is held unchanged. Multivariate analysis aims to identify patterns among multiple variables.


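For example, if you want to measure how the amount of time spent on social media relates to an employee's productivity while also accounting for a third factor, such as hours worked, you could use multivariate analysis. Each employee's productivity, social media time, and hours worked are variables in the analysis. With only the first two variables, the data would be bivariate.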

bivariate data


Bivariate data is data that has two variables recorded for each observation, usually one independent variable and one dependent variable.


For example, ice cream sales versus the temperature on that day. The two variables are ice cream sales (dependent) and temperature (independent).

categorical variables

Categorical variables take values that are names or labels rather than measurements, so the data can be divided into groups.


Examples of categorical variables are race, sex, age group, and educational level.

discrete variables

A discrete variable is a variable that takes on distinct, countable values. In theory, you should always be able to count the values of a discrete variable.


For example, number of red M&M’s in a candy jar, or number of votes for a particular politician.

continuous variables

Continuous variables can take any value between the lowest and highest points of measurement, so the number of possible values is unlimited.


For instance, consider the height of a student. The height is bounded: it can't be negative and it can't be higher than three metres. Within those limits, however, it can take any value, such as 1.62 m or 1.625 m.

response variable

The response variable is the dependent variable. It is the factor whose variation is explained by the other factors (the independent variables).

correlation

Correlation is a statistical measure that indicates the extent to which two or more variables fluctuate in relation to each other.


For example, there is a correlation (a connection or relationship) between smoking and cancer.
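One common way to put a number on correlation is the Pearson correlation coefficient, which ranges from -1 (perfect negative linear relationship) to +1 (perfect positive linear relationship). The sketch below is only an illustration: the function name pearson_r is hypothetical, and the temperature and sales figures are made up for the ice cream example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and the two sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


# Made-up illustrative data: temperature (degrees C) and ice cream sales (units)
temperature = [14, 16, 20, 23, 25, 30]
sales = [215, 325, 410, 480, 520, 610]

r = pearson_r(temperature, sales)
print(f"r = {r:.3f}")  # a value close to +1 indicates a strong positive correlation
```

A value of r near 0 would suggest little or no linear relationship between the two variables.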

causation

Causation indicates that one event is the result of the occurrence of the other event. In other words, there is a causal relationship between the two events.


An example of causation is that working more hours at a job that pays hourly will cause a person to have a larger paycheck. There is a direct and identifiable causal relationship between the number of hours worked at an hourly-paid job and the size of the paycheck.

regression and regression line

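Regression is a statistical method for modelling how one variable depends on another. A regression line is the straight line that best fits the data points, so it depicts the relationship between two variables. Specifically, it is used when variation in one variable (the dependent variable) depends on the change in the value of the other (the independent variable).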
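A regression line is usually found by the method of least squares, which picks the slope and intercept that minimise the squared vertical distances from the points to the line. The following is a minimal sketch under that assumption; the function name fit_line and the data are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # slope = sum of products of deviations / sum of squared x deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when every x value is the same")
    slope = sxy / sxx
    # The least-squares line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data that lies exactly on the line y = 1 + 2x
x_values = [1, 2, 3, 4]
y_values = [3, 5, 7, 9]

slope, intercept = fit_line(x_values, y_values)
print(f"y = {intercept:.2f} + {slope:.2f}x")
```

With real data the points will not lie exactly on the line, and the fitted line summarises the overall trend.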

extrapolation

Extrapolation is a statistical technique aimed at inferring the unknown from the known. It estimates values that lie outside the range of the observed data, most often by predicting future data from historical data.


For example, estimating the size of a population a few years in the future on the basis of the current population size and its rate of growth.
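The population example can be sketched in code. This illustration assumes the growth rate stays constant and compounds once a year, which is a simplification; the function name project_population and the figures used are hypothetical.

```python
def project_population(current, annual_growth_rate, years):
    """Extrapolate a population forward, assuming constant compound annual growth."""
    if years < 0:
        raise ValueError("years must be non-negative")
    # Each year the population is multiplied by (1 + growth rate)
    return current * (1 + annual_growth_rate) ** years


# Hypothetical figures: 1,000,000 people growing at 2% per year
projected = project_population(1_000_000, 0.02, 5)
print(f"Projected population in 5 years: {projected:,.0f}")
```

The further the prediction moves beyond the known data, the less reliable it becomes, because the assumed growth rate may not hold.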

interpolation

Interpolation is the process of estimating unknown values that fall between known values.


For example, suppose a straight line passes through two points of known value. If a point of unknown value lies midway between them on the line, you can estimate its value as halfway between the two known values.
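The straight-line case is called linear interpolation. A minimal sketch follows; the function name interpolate and the sample points are hypothetical, and the range check is a design choice that keeps the function from silently extrapolating.

```python
def interpolate(x0, y0, x1, y1, x):
    """Estimate y at x on the straight line through (x0, y0) and (x1, y1)."""
    if x0 == x1:
        raise ValueError("the two known points must have different x values")
    if not min(x0, x1) <= x <= max(x0, x1):
        raise ValueError("x is outside the known range; that would be extrapolation")
    # Fraction of the way from x0 to x1, between 0 and 1
    t = (x - x0) / (x1 - x0)
    return y0 + t * (y1 - y0)


# Known points (2, 10) and (4, 20); estimate the value midway, at x = 3
estimate = interpolate(2, 10, 4, 20, 3)
print(estimate)
```

Because x = 3 is midway between the known x values, the estimate is midway between the known y values.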