Relations among variables

 

 

A common goal of statistical analysis is to identify relations among variables. What happens to one variable as another variable changes? Does a change in one variable cause a change in another variable? These questions can lead to powerful methods of predicting future values through linear regression.

It is important to note the true meaning and scope of correlation, which describes the strength and direction of the relation between two variables. Correlation alone does not allow us to say that there is a causal link between the two variables. In other words, we cannot say that one variable causes the other; however, it is not uncommon to see correlation used this way in the news media. An example is shown below.

 

Here we see that, at least visually, there appears to be a relation between the divorce rate in Maine and the per capita consumption of margarine. Does this data imply that all married couples in Maine should immediately stop using margarine to stave off divorce? Common sense tells us that is probably not true.

This is an example of a spurious correlation in which there appears to be a relation between the divorce rate and margarine consumption, but it is not a causal link.

The appearance of such a relation could merely be due to coincidence or perhaps another unseen factor.
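
To see how an unseen factor can produce a strong correlation, here is a minimal sketch in Python. The numbers are made up for illustration (they are not the Maine divorce or margarine figures), and the names `series_a`, `series_b`, and `pearson_r` are hypothetical. Both series are built only from the passage of time plus a small wiggle, so neither one influences the other.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)

# Hypothetical data: ten yearly observations of two unrelated quantities.
years = list(range(10))
wiggle_a = [0.02, -0.03, 0.01, 0.00, -0.02, 0.03, -0.01, 0.02, 0.00, -0.02]
wiggle_b = [-0.10, 0.10, 0.05, -0.05, 0.10, -0.10, 0.00, 0.05, -0.05, 0.10]

# Each series depends only on the year (the "unseen factor"), never on the other.
series_a = [5.0 - 0.1 * t + w for t, w in zip(years, wiggle_a)]
series_b = [8.0 - 0.4 * t + w for t, w in zip(years, wiggle_b)]

r = pearson_r(series_a, series_b)
print(f"r = {r:.2f}")
```

The correlation comes out close to 1 even though the two series were generated independently. The shared downward trend over time is enough to make them look related.
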
What is one instance where you have seen correlation misinterpreted as causation? Please describe. This serves as your initial post to the discussion (if you choose Topic 1) and is due by 11:59 p.m. EST on Saturday.

-OR-
Topic 2:

Linear regression is used to predict the value of one variable from another variable. Because it is based on correlation, it cannot establish causation. In addition, the strength of the relationship between the two variables affects how well one can be predicted from the other: the stronger the relationship, the more reliable the prediction.
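
The link between strength of relationship and quality of prediction can be sketched in Python. This is only an illustration under stated assumptions: the data are made up, and the helper names `fit_line` and `rmse` are hypothetical. Two data sets follow the same underlying trend, one with little scatter and one with a lot, and a least-squares line is fitted to each.

```python
from math import sqrt

def fit_line(xs, ys):
    """Return (slope, intercept, r) for the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r

def rmse(xs, ys, slope, intercept):
    """Root-mean-square error of the line's predictions on the given points."""
    squared = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
    return sqrt(sum(squared) / len(xs))

# Made-up data: the same trend (y = 2x) with small versus large scatter.
xs = list(range(1, 11))
scatter = [1, -1, 1, -1, 1, -1, 1, -1, 1, -1]
ys_strong = [2 * x + 0.5 * s for x, s in zip(xs, scatter)]
ys_weak = [2 * x + 8.0 * s for x, s in zip(xs, scatter)]

slope_s, intercept_s, r_strong = fit_line(xs, ys_strong)
slope_w, intercept_w, r_weak = fit_line(xs, ys_weak)
rmse_strong = rmse(xs, ys_strong, slope_s, intercept_s)
rmse_weak = rmse(xs, ys_weak, slope_w, intercept_w)

print(f"strong: r = {r_strong:.2f}, typical error = {rmse_strong:.2f}")
print(f"weak:   r = {r_weak:.2f}, typical error = {rmse_weak:.2f}")
```

The data set with the correlation closer to 1 has the smaller typical prediction error, which is the point made above.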

For example, given this data on literacy and undernourishment, we can create a scatter plot which shows that there seems to be a relationship between the variables.

 

The graph implies that as literacy (x) increases, the percentage of people who are undernourished (y) decreases.

We can calculate the equation of a best-fit line and use it to predict the undernourishment rate we would expect in a country with a literacy rate of 87%: y = (-0.5539)(87) + 55.621, or about 7.43 percent.
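
The same calculation can be written as a short Python sketch. The slope and intercept are the values quoted above; the function name `predict_undernourishment` is a hypothetical choice for illustration.

```python
SLOPE = -0.5539     # slope of the best-fit line quoted in the text
INTERCEPT = 55.621  # intercept of the best-fit line quoted in the text

def predict_undernourishment(literacy_pct):
    """Predicted undernourishment rate (percent) for a given literacy rate (percent)."""
    return SLOPE * literacy_pct + INTERCEPT

predicted = predict_undernourishment(87)
print(f"{predicted:.2f}")  # 7.43
```

Keep in mind that a prediction like this is most trustworthy for literacy rates within the range covered by the original data.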

What is one instance where you think linear regression would be useful to you in your workplace or chosen major?
