Two stats midterm questions on least squares regression lines, scatterplots, and graphs
Question Description
- [10 points] Hospital administrators were interested in determining how hospitalization cost (y = cost) is related to length of stay (x = los) in the hospital. The data set “Hospital.sav” (or “hospital.xlsx”) contains the reimbursed hospital costs and associated lengths of stay (the number of days) for a sample of 33 elderly people.
- Obtain the least squares regression line for the hospitalization cost as a function of the length of stay, and graph the line on the scatterplot.
- Does a linear regression line appear to fit the data well? Explain.
- What is the predicted cost for a 7-day stay?
- Create the residual plot. It should have the unstandardized predicted values on the x-axis and the unstandardized residuals on the y-axis. What does this plot tell you about the residuals?
- Create the normal plot of the residuals. What does this plot tell you about the residuals?
- In your opinion, is the obtained linear regression model a good model? Explain in detail.
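As a rough illustration of the computations this problem asks for, the following Python sketch fits a least squares line, predicts the cost for a 7-day stay, and computes the unstandardized predicted values and residuals. The assignment itself points to SPSS or Excel data files, so Python and NumPy are assumptions here, as are the helper name `fit_line` and the data values, which are made-up placeholders and not the contents of “Hospital.sav”.

```python
import numpy as np

# Hypothetical placeholder data (NOT the values from Hospital.sav).
los = np.array([2.0, 3.0, 5.0, 7.0, 9.0, 12.0])                     # length of stay, days
cost = np.array([1200.0, 1500.0, 2300.0, 3100.0, 3800.0, 5200.0])   # reimbursed cost


def fit_line(x, y):
    """Return (intercept, slope) of the least squares line y = b0 + b1 * x."""
    x_mean, y_mean = x.mean(), y.mean()
    # Slope = sum of cross-deviations divided by sum of squared x-deviations.
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # The least squares line always passes through (x_mean, y_mean).
    intercept = y_mean - slope * x_mean
    return intercept, slope


b0, b1 = fit_line(los, cost)

predicted = b0 + b1 * los      # unstandardized predicted values (x-axis of the residual plot)
residuals = cost - predicted   # unstandardized residuals (y-axis of the residual plot)
cost_7_days = b0 + b1 * 7.0    # predicted cost for a 7-day stay

print(f"cost = {b0:.2f} + {b1:.2f} * los")
print(f"predicted cost for 7 days: {cost_7_days:.2f}")
```

Plotting `predicted` against `residuals` gives the residual plot described above. A roughly patternless band around zero is consistent with a linear fit, whereas a curve or a fan shape suggests nonlinearity or non-constant variance.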
- [10 points] Not all bivariate relationships are linear. When you plot a scatterplot, sometimes you observe a curved relationship. In those cases, we can apply a transformation to the data that will make the relationship approximately linear. [We generally prefer simpler models. And linear regression models are simpler than curved regression models. Also, transforming data is common in statistical practice.]
One of the most common transformation methods is the log transformation. In this problem, you will apply the log transformation to both variables from the previous problem to see whether the model fit improves.
- Create another variable ‘lnCost’ (the first character is a lowercase letter L) by taking the natural log (ln) of cost. Do the same for the length of stay (los) to create another variable ‘lnLos’. Then obtain the least squares regression line for ‘lnCost’ as a function of ‘lnLos’, and graph the line on the scatterplot.
- Does a linear regression line seem to fit these transformed variables? Does it fit better than the line for the untransformed variables? Explain.
- Create the residual plot for this model in the same way as in the previous problem. How do the residuals differ from those in the previous problem?
- Create the normal plot of the residuals. How is this plot different from the one from the previous problem?
- In your opinion, is the linear regression model made from the transformed variables a better model? Explain in detail.
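The log-log fit in this problem could be sketched in Python as follows. This is a minimal illustration under assumptions: NumPy stands in for the SPSS or Excel workflow, `r_squared` is a hypothetical helper, and the data are made-up placeholders and not the values in “Hospital.sav”.

```python
import numpy as np

# Hypothetical placeholder data (NOT the values from Hospital.sav).
los = np.array([2.0, 3.0, 5.0, 7.0, 9.0, 12.0])                     # length of stay, days
cost = np.array([1200.0, 1500.0, 2300.0, 3100.0, 3800.0, 5200.0])   # reimbursed cost

# Natural-log transform of both variables (both must be strictly positive).
ln_los = np.log(los)
ln_cost = np.log(cost)

# Degree-1 polyfit returns coefficients highest power first: [slope, intercept].
slope, intercept = np.polyfit(ln_los, ln_cost, 1)

ln_predicted = intercept + slope * ln_los   # predicted lnCost
ln_residuals = ln_cost - ln_predicted       # residuals on the log scale


def r_squared(y, fitted):
    """Proportion of the variance in y explained by the fitted values."""
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


r2_log = r_squared(ln_cost, ln_predicted)
print(f"lnCost = {intercept:.3f} + {slope:.3f} * lnLos, R^2 = {r2_log:.3f}")
```

A straight line on the log-log scale corresponds to a power relationship on the original scale, cost ≈ exp(intercept) × los^slope. Because the response variable differs between the two models, R² on the log scale is not directly comparable to R² from the untransformed fit, so the residual and normal plots remain the main basis for comparing them.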