Exam 3

NAME:

Fall 2000

You may use additional sheets of paper to solve the following questions, but please report your results and conclusions in the space provided. Whenever possible, show your work for potential partial credit. NOTE: When performing numerical calculations, keep at least 4 digits after the decimal point (i.e., do NOT round .2265 to .23 or .227). BUDGET YOUR TIME WISELY!

1. Women who are union members earn $2.50 per hour more than women who are not union members (Wall St. Journal, July 26, 1994). Suppose independent random samples of 15 unionized women and 20 nonunionized women in manufacturing have been selected and the following summary statistics are found.

Union Workers	Nonunion Workers
Sample mean hourly wage = $17.54	Sample mean hourly wage = $15.36
Sample variance = $5.02	Sample variance = $3.95
Sample size = 15	Sample size = 20

a. Suppose we want to develop an interval estimate of the difference in the mean wage rates between unionized and nonunionized women working in manufacturing. What assumptions must be made about the two populations? (4 points)

b. Construct a 95% confidence interval estimate of the difference between the two population means. Interpret this confidence interval. (8 points)

c. Does there appear to be any difference in the mean wage rate between these two groups? Explain. (6 points)
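
A minimal sketch of the part b calculation, assuming both populations are approximately normal with equal variances, so that the two sample variances can be pooled. The critical value 2.0345 is read from a t table for 33 degrees of freedom; it is an assumption of this sketch, not a value given in the exam.

```python
import math

# Summary statistics from the table in Question 1.
n_union, mean_union, var_union = 15, 17.54, 5.02
n_non, mean_non, var_non = 20, 15.36, 3.95

# Assumption: equal population variances, so the sample variances are pooled.
df = n_union + n_non - 2
pooled_var = ((n_union - 1) * var_union + (n_non - 1) * var_non) / df

# Standard error of the difference between the two sample means.
std_err = math.sqrt(pooled_var * (1 / n_union + 1 / n_non))

# Assumption: two-sided 95% t critical value for 33 df, taken from a t table.
T_CRIT = 2.0345

diff = mean_union - mean_non
margin = T_CRIT * std_err
ci_low, ci_high = diff - margin, diff + margin

print(f"pooled variance = {pooled_var:.4f}")
print(f"standard error  = {std_err:.4f}")
print(f"95% CI for mean difference: ({ci_low:.4f}, {ci_high:.4f})")
```

If the resulting interval excludes zero, that is evidence of a difference in mean wage rates at the 5% level, which is the link between parts b and c.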

2. A large automobile insurance company selected samples of single and married male policyholders and recorded the number who had made an insurance claim over the preceding three-year period. Using α = .05, test to determine whether the claim rates differ between single and married male policyholders. Be sure to clearly state your null and alternative hypotheses, rejection region(s), test statistic and conclusion. Explain the conclusion of your test to someone who is unfamiliar with the problem. (16 points)

Single Policyholders	Married Policyholders
Sample size = 400	Sample size = 900
# making claims = 76	# making claims = 90
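
A minimal sketch of one way to carry out this test: a two-sided z test for the difference between two proportions, using the pooled proportion under the null hypothesis of equal claim rates. The large-sample normal approximation is an assumption of the sketch; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Sample data from the table in Question 2.
n_single, claims_single = 400, 76
n_married, claims_married = 900, 90
ALPHA = 0.05

p_single = claims_single / n_single      # sample claim rate, single
p_married = claims_married / n_married   # sample claim rate, married

# Under H0: p_single = p_married, estimate the common rate by pooling.
p_pooled = (claims_single + claims_married) / (n_single + n_married)
std_err = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_single + 1 / n_married))

# Test statistic and two-sided rejection region |z| > z_crit.
z = (p_single - p_married) / std_err
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject = abs(z) > z_crit

print(f"z = {z:.4f}, critical value = {z_crit:.4f}, p-value = {p_value:.6f}")
print("Reject H0" if reject else "Do not reject H0")
```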

3. As a sales manager who supervises a team of salespeople, you’re interested in estimating the relationship between years of sales experience and annual sales (in $1000’s) for each employee.

a. Which of these two variables would you use as your independent and dependent variables in a simple linear regression equation? Why? (2 points)

b. What sign would you expect on the estimated coefficient for your chosen independent variable? Why? (3 points)

c. Suppose your statistical program produces the following estimated regression equation. Interpret both of the estimated coefficients. (6 points)

d. What other pieces of information would you like to have for this regression output? Why are they important? (9 points)

e. What other independent variables would be useful in developing a more effective model? Why? (4 points)

EMPIRICAL CASE: Two experts have provided you with subjective lists of school districts that they think are among the best in the country. For each school district, the following data were obtained:

Average Class Size
Instructional Spending per Student
Average Teacher Salary
Combined SAT score
% of Students Taking the SAT
% of Graduates Attending a 4-Year College (Dependent Variable)

Using Excel, you get the following regression output. Use this output to answer the questions below.

SUMMARY OUTPUT
Regression Statistics
Multiple R	0.772320924
R Square	0.596479609
Adjusted R Square	0.428346113
Standard Error	11.21911189
Observations	18
ANOVA
	df	SS	MS	F	Significance F
Regression	5	2232.689452	446.5378903	3.547654822	0.033561655
Residual	12	1510.421659	125.8684716
Total	17	3743.111111
	Coefficients	Standard Error	t Stat	P-value
Intercept	33.70855183	63.27729519	0.532711642	0.603958149
Average Class Size	-1.557651146	1.090141646		0.17855701
Spending Per Student ($)	-0.002427352	0.001655643		0.168327702
Average Teacher Salary ($)	-0.000258576	0.000639237		0.692956957
Combined SAT Score	0.076885289	0.03737104		0.06206088
% Taking SAT	0.285200804	0.115445124		0.029469471

a. Now that you have some Excel output, interpret each estimated coefficient in terms of both size and sign. Does the sign of each make economic sense? (10 points)

b. Use the information above to calculate the t-statistics for the missing coefficients. The intercept is calculated for you. Fill these into your table of results. (5 points)

c. What do your t-statistics tell you with regard to the significance of each coefficient? Be thorough and refer to hypothesis tests. (10 points)

d. Use the appropriate measure to comment upon the ability of the estimated model to fit the data. (4 points)

e. What can you say about overall significance for this model? What does this mean? Again, be thorough and refer to hypothesis tests. (4 points)

f. Would you suspect multicollinearity in this model? Where is it most likely to exist? What kinds of statistical information would you need to determine whether or not multicollinearity is present? (9 points)
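
A minimal sketch of the arithmetic behind parts b through e, using only numbers from the Excel output above. Each t-statistic is the coefficient divided by its standard error; R Square, Adjusted R Square, and F are recomputed from the ANOVA table. The critical value 2.1788 (two-sided, 5% level, 12 residual degrees of freedom) is read from a t table and is an assumption of this sketch.

```python
# (coefficient, standard error) pairs copied from the regression output.
coefficients = {
    "Average Class Size": (-1.557651146, 1.090141646),
    "Spending Per Student ($)": (-0.002427352, 0.001655643),
    "Average Teacher Salary ($)": (-0.000258576, 0.000639237),
    "Combined SAT Score": (0.076885289, 0.03737104),
    "% Taking SAT": (0.285200804, 0.115445124),
}

# t Stat = coefficient / standard error.
t_stats = {name: coef / se for name, (coef, se) in coefficients.items()}

# Assumption: two-sided 5% critical value for 12 df, taken from a t table.
T_CRIT = 2.1788
significant = [name for name, t in t_stats.items() if abs(t) > T_CRIT]

# Fit and overall significance, recomputed from the ANOVA table.
ss_reg, ss_res, ss_total = 2232.689452, 1510.421659, 3743.111111
df_reg, df_res = 5, 12
r_squared = ss_reg / ss_total
adj_r_squared = 1 - (1 - r_squared) * (df_reg + df_res) / df_res
f_stat = (ss_reg / df_reg) / (ss_res / df_res)

for name, t in t_stats.items():
    print(f"{name:28s} t = {t:8.4f}")
print("Significant at 5%:", significant)
print(f"R^2 = {r_squared:.4f}, adjusted R^2 = {adj_r_squared:.4f}, F = {f_stat:.4f}")
```

The recomputed R Square, Adjusted R Square, and F values should agree with the SUMMARY OUTPUT, which is a useful check before filling in the missing t Stat column.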
