MSCI212: STATISTICAL METHODS FOR BUSINESS
Week 10 Workshop
The three problems on this sheet should help you better understand regression for the successful completion of your take-home coursework assignment and your final summer exam. You should work through all three problems and compare your answers to the suggested solutions to complete your learning of this material.
A household removal firm wishes to determine whether it could arrive at a satisfactory estimate of the loading/unloading work involved in a removal given only information on the number of rooms in a client’s house and/or the number of people living in the household. If so, the firm plans to discontinue its present costly practice of visiting the homes of all potential clients before giving them quotations.
To explore this possibility, information has been collected on 30 removals recently undertaken by the firm and stored in ‘Removals.sav’. For each removal, data is available on:
– actual number of man-hours needed for loading and unloading (man-hrs);
– number of rooms in the client’s house (rooms);
– number of people living in the client’s home (people).
(a) Plot graphs that you think shed some light on the firm’s idea and comment on your results.
(b) Calculate Pearson correlation coefficients (<Correlate><Bivariate> etc) between the three variables and comment on your results.
(c) Use STEPWISE regression (starting with no variables and then starting with all variables) to determine the best model for predicting the work involved in loading/unloading. (See Week 10 slides 11-14 for SPSS commands). Explain your results.
(d) Carry out a common-sense based examination of the residuals for your chosen model and comment on whether or not the regression assumptions seem to hold. (See Week 10 slides 15-25 for an example.)
(e) Use your preferred model from part (c) to estimate the work involved in loading/unloading for:
(i) an average house with 8 rooms and 4 people;
(ii) an individual house with 8 rooms and 4 people;
(iii) an average house with 14 rooms and 11 people;
(iv) an individual house with 14 rooms and 11 people.
Explain your beliefs about the accuracy of your estimates.
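For reference when thinking about part (e): under the usual regression assumptions, the standard textbook intervals for a mean response and for a single new observation differ only by an extra 1 under the square root. The notation below is an assumption introduced here, not taken from the slides: X is the design matrix (a column of ones plus the chosen explanatory variables), x_0 is the corresponding vector for the house in question, s is the residual standard error, n = 30 is the number of removals and k is the number of explanatory variables in the chosen model.

```latex
% Interval for the mean work over all houses with characteristics x_0
\hat{y}_0 \;\pm\; t_{n-k-1,\,\alpha/2}\; s \sqrt{x_0^{\top} (X^{\top} X)^{-1} x_0}

% Interval for the work at one individual house with characteristics x_0
\hat{y}_0 \;\pm\; t_{n-k-1,\,\alpha/2}\; s \sqrt{1 + x_0^{\top} (X^{\top} X)^{-1} x_0}
```

The quadratic form x_0^T (X^T X)^{-1} x_0 grows as x_0 moves away from the centre of the observed data, so both intervals widen for houses unlike those in the sample.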
The regional manager of a clothes retailing company has noted that the gross profit margins that individual shops are able to achieve depend on the proportions of turnover that come from cut-price offers. The manager believes that this may well be affected by the local advertising used, whether on the radio or in newspapers, and the level of competition in the town.
He has therefore assembled data on gross profit margin and the above factors for each of the 100 shops in his region. The data is stored in file ClothesShops.sav as:
- Profit: gross profit as % of sales revenue,
- Radioads: radio advertising spending (£),
- Newsads: newspaper advertising spending (£),
- Complevel: level of competition, coded 0 if the number of clothes retailers per 100,000 population is <= 10.5 and 1 if it is > 10.5.
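The coding of Complevel can be written as a one-line rule. The sketch below is only an illustration of that rule: the function name `complevel` and the constant `THRESHOLD` are hypothetical, and the file itself already contains the coded 0/1 values.

```python
# Threshold in clothes retailers per 100,000 population, as given in the
# variable definition above. The names here are hypothetical.
THRESHOLD = 10.5


def complevel(retailers_per_100k):
    """Return 0 for low competition (<= 10.5) and 1 for high competition (> 10.5)."""
    return 1 if retailers_per_100k > THRESHOLD else 0


# Hypothetical towns, NOT values taken from ClothesShops.sav.
for density in (8.2, 10.5, 12.9):
    print(density, complevel(density))
```

Because the variable takes only the values 0 and 1, its regression coefficient can be read as a shift in the predicted Profit between low-competition and high-competition towns, holding the advertising variables fixed.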
(a) Carry out a preliminary analysis of the data, reporting your preliminary findings.
(b) Use STEPWISE regression to find the ‘best’ relationship between Profit and Radioads and Newsads, i.e. do not include Competition level.
(c) Rerun your STEPWISE analyses including Competition level as a third explanatory variable. Report your (apparent) findings carefully.
(d) Carry out a thorough residuals analysis for your chosen model (i.e. best model so far) and comment on your findings. In particular comment on any evidence you find of non-linear relationships.
(e) The company is planning to open three new shops shortly. They intend to spend £3200, £3600 and £2800 on local radio advertising respectively, £1800, £1700 and £2000 on local newspaper advertising respectively, and the shops will be in towns where the levels of competition will be low, high and low respectively. Advise the company on the gross profit margin likely to be achieved by these shops. Explain carefully the sources of inaccuracy that you believe to be present in your predictions.
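To make the idea behind the STEPWISE runs in parts (b) and (c) more concrete, here is a minimal pure-Python sketch of greedy forward selection. It is a simplification and not SPSS's procedure: SPSS enters and removes variables using F-test probabilities, whereas this sketch only adds variables and uses adjusted R² as the criterion. All function names and all data values are hypothetical, and the code assumes the candidate variables are not perfectly collinear.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def adjusted_r2(y, columns):
    """Fit y on a constant plus the given columns by least squares; return adjusted R^2."""
    n, k = len(y), len(columns)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = k + 1
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)] for a in range(p)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - (ss_res / (n - k - 1)) / (ss_tot / (n - 1))


def forward_select(y, candidates):
    """Repeatedly add the candidate that most improves adjusted R^2; stop when none does."""
    chosen, best = [], 0.0  # the constant-only model has adjusted R^2 of 0
    remaining = list(candidates)
    while remaining:
        scores = {
            name: adjusted_r2(y, [candidates[c] for c in chosen + [name]])
            for name in remaining
        }
        name = max(scores, key=scores.get)
        if scores[name] <= best:
            break
        chosen.append(name)
        remaining.remove(name)
        best = scores[name]
    return chosen, best


# Hypothetical values for illustration only, NOT the data in ClothesShops.sav.
profit = [41.0, 43.5, 44.0, 47.5, 48.0, 51.5]
candidates = {
    "radioads_k": [2.8, 3.0, 3.2, 3.4, 3.6, 3.8],  # spending in thousands of pounds
    "newsads_k": [2.0, 1.7, 1.9, 1.6, 1.8, 1.5],   # spending in thousands of pounds
}
chosen, score = forward_select(profit, candidates)
print(chosen, round(score, 3))
```

Starting from all variables and removing them one at a time follows the same pattern in reverse. The order in which variables enter, and whether a later variable adds anything once an earlier one is in the model, is what the question asks you to report on.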
Go back to the logging industry problem from your Week 8 workshop:
In the logging industry the value of a tree depends on the volume of wood in the tree trunk, as well as on the quality of the wood. The quality of wood is assessed by taking samples from a sample of trees to estimate the mean quality of each batch of logs. However, the volume of wood in a tree trunk is difficult to measure, so that in practice a logging company needs to agree with its buyers a way of estimating the volume of wood from easy-to-obtain measurements.
The volumes of 31 trees (in cubic metres) have been carefully measured (using tanks of water) as well as their heights and diameters midway along the trunk (in feet) which are easy to measure. The data is stored in the SPSS file ‘MSCI212Trees.sav’.
(a) Plot scatterplots and calculate Pearson correlation coefficients of the three variables to remind yourself of the apparent relationships between the variables.
(b) Because of the potential non-linear relationship to height, use <Transform><Compute Variable> to create two extra explanatory variables: height² and diameter². Then use STEPWISE (both ways) to look for ‘good’ models. What is the ‘best’ model you find?
(c) In a moment of inspiration use <Transform><Compute Variable> to create another possible explanatory variable: (Diameter² × Height). Then use STEPWISE (both ways) to look for ‘good’ models. What is the ‘best’ model you find? What do you think inspired the suggested variable (Diameter² × Height), and what might it encourage you to do in the future?
[Before using this model you should again check the assumptions in the usual way. But the point of this question is to encourage you to use common sense alongside data in proposing possible models.]
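The Compute Variable steps in parts (b) and (c) amount to simple row-wise arithmetic. The following sketch shows the same calculation in Python. The function name, the dictionary keys and the measurements are all hypothetical and are not taken from ‘MSCI212Trees.sav’.

```python
def add_derived_variables(trees):
    """Return new records with height2, diameter2 and diameter2_x_height added."""
    derived = []
    for tree in trees:
        h, d = tree["height"], tree["diameter"]
        derived.append({
            **tree,
            "height2": h ** 2,                # height squared
            "diameter2": d ** 2,              # diameter squared
            "diameter2_x_height": d ** 2 * h, # diameter squared times height
        })
    return derived


# Hypothetical measurements, NOT rows from MSCI212Trees.sav.
trees = [
    {"height": 70.0, "diameter": 8.5},
    {"height": 80.0, "diameter": 12.0},
]
for row in add_derived_variables(trees):
    print(row)
```

Each derived column can then be offered to STEPWISE alongside the original height and diameter variables.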