Q1) Graphically analyze the data and comment on how the age of the clock and the number of bidders affect the auctioned selling price.
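library(scatterplot3d)
# read the clock auction data; the first row holds the column names
d = read.table("clock_prices.data",header = TRUE)
summary(d)
head(d)
# shorthand vectors used in the plots and models below
age = d$Age
nbids = d$Bidders
price = d$Price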
par(mfrow=c(2,2))
scatterplot3d(age,nbids,price)
plot(age,nbids,main = 'Age and bidders',ylab='nbids')
plot(age,price,main = 'Age and prices',ylab='price')
plot(nbids,price,main = 'Bidders and Prices', ylab='price')
Ans 1) The plots show that (i) the price increases with the age of the clock (plot 3, age against price, shows a positive linear association), and (ii) the price increases with the number of bidders.
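The visual impressions can be backed by pairwise correlations. The following is only a minimal sketch: it uses simulated data in place of clock_prices.data, and the data frame `sim` and its generating coefficients are invented for illustration.

```r
# Simulated stand-in for the clock data (assumption: NOT the real clock_prices.data)
set.seed(1)
n <- 32
age <- round(runif(n, 100, 200))
nbids <- sample(5:15, n, replace = TRUE)
price <- -1300 + 12.7 * age + 86 * nbids + rnorm(n, sd = 130)
sim <- data.frame(age, nbids, price)

# Pairwise Pearson correlations: the numeric counterpart of the three scatterplots
cors <- cor(sim)
print(round(cors, 2))
```

With the real data, `cor(d)` would play the same role.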
Q2. Fit a first order multiple regression model to the data and answer the following based on this model:
Q2 a) Is the model useful?
model = lm(price~age+nbids)
summary(model)
vcov(model)
Ans 2a) Yes. The model is useful: the p-value of the overall F-test is very low, so the proportion of variability in price explained by the model is statistically significant.
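The overall F-test behind this answer can also be pulled out of the fitted model programmatically. This is a sketch on simulated data (an assumption; the names `sim` and `fit` are illustrative and not part of the original analysis).

```r
# Simulated stand-in for the clock data (assumption: NOT the real clock_prices.data)
set.seed(1)
n <- 32
age <- round(runif(n, 100, 200))
nbids <- sample(5:15, n, replace = TRUE)
price <- -1300 + 12.7 * age + 86 * nbids + rnorm(n, sd = 130)
sim <- data.frame(age, nbids, price)

fit <- lm(price ~ age + nbids, data = sim)
fs <- summary(fit)$fstatistic          # named vector: value, numdf, dendf
# p-value of H0: both slopes are zero
p_overall <- unname(pf(fs["value"], fs["numdf"], fs["dendf"], lower.tail = FALSE))
r2 <- summary(fit)$r.squared           # proportion of variability explained
print(c(F = unname(fs["value"]), p = p_overall, R2 = r2))
```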
Q2 b) Given the age of a clock, by what amount can one expect the selling price to go up for one more person participating in the auction?
Ans 2b) For a clock of a given age, one more person participating in the auction is expected to raise the selling price by £85.8151 (the estimated coefficient of nbids).
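A point estimate of the bidder effect is more informative with an interval around it. The sketch below, on simulated data (an assumption, as are the names `sim` and `fit`), uses `confint()` for that purpose.

```r
# Simulated stand-in for the clock data (assumption: NOT the real clock_prices.data)
set.seed(1)
n <- 32
age <- round(runif(n, 100, 200))
nbids <- sample(5:15, n, replace = TRUE)
price <- -1300 + 12.7 * age + 86 * nbids + rnorm(n, sd = 130)
sim <- data.frame(age, nbids, price)

fit <- lm(price ~ age + nbids, data = sim)
# 95% confidence interval for the expected price change per extra bidder, age held fixed
ci_nbids <- confint(fit, "nbids", level = 0.95)
print(ci_nbids)
```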
Q2 c) An auction house has acquired several grandfather clocks, each 100 years old, paying an average price of £500 per clock. From past experience it has found that such auctions (for antique grandfather clocks) typically attract about 10-12 bidders. What can be said about its expected profit per clock with 95% confidence?
df = data.frame(age = c(100,100,100),nbids = c(10,11,12))
x = predict.lm(model, newdata=df,se.fit=TRUE,level = 0.95,interval = "confidence")
x$fit - 500 # Profits (subtracting the cost of the clocks)
Ans 2c) The expected profits per clock for 100-year-old clocks with 10, 11 or 12 bidders are given above, together with their 95% confidence limits (columns fit, lwr and upr, after subtracting the £500 cost).
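For reference, here is a sketch of how the profit table can be read. It runs on simulated data (an assumption; `sim`, `fit` and `new_clocks` are illustrative names). Without `se.fit`, `predict()` returns the fit/lwr/upr matrix directly.

```r
# Simulated stand-in for the clock data (assumption: NOT the real clock_prices.data)
set.seed(1)
n <- 32
age <- round(runif(n, 100, 200))
nbids <- sample(5:15, n, replace = TRUE)
price <- -1300 + 12.7 * age + 86 * nbids + rnorm(n, sd = 130)
sim <- data.frame(age, nbids, price)

fit <- lm(price ~ age + nbids, data = sim)
new_clocks <- data.frame(age = 100, nbids = 10:12)
# 95% confidence interval for the MEAN selling price at each bidder count
ci <- predict(fit, newdata = new_clocks, interval = "confidence", level = 0.95)
profit <- ci - 500      # subtracting the purchase cost shifts fit, lwr and upr alike
print(round(profit, 1))
```

If the whole interval for a row lies above zero, the expected profit for that bidder count is positive at the 95% level.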
Q2 d) You walk into an auction selling an antique 150-year-old grandfather clock and find that there are 15 bidders (including yourself) participating in the auction. You are extremely keen on acquiring the clock. At least what amount should you bid for the clock, so that you are 99% certain that nobody else can out-bid you?
df = data.frame(age = c(150),nbids = c(15))
predict.lm(model, newdata=df,se.fit=TRUE,level = 0.98,interval = "confidence") # two-sided 98% interval, i.e. 1% in each tail
Ans 2d) I will have to bid at least £1721.171 (lower bound = fitted price - t0.99 * SE(fitted price)) so that, in expectation, I am 99% certain that nobody else can out-bid me.
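The question can also be read as asking for a bound on a single future selling price rather than on the mean price. Under that reading, the relevant quantity is the upper limit of a prediction interval. The sketch below runs on simulated data (an assumption; `sim`, `fit` and `new_clock` are illustrative names) and is not the calculation used in the answer above.

```r
# Simulated stand-in for the clock data (assumption: NOT the real clock_prices.data)
set.seed(1)
n <- 32
age <- round(runif(n, 100, 200))
nbids <- sample(5:15, n, replace = TRUE)
price <- -1300 + 12.7 * age + 86 * nbids + rnorm(n, sd = 130)
sim <- data.frame(age, nbids, price)

fit <- lm(price ~ age + nbids, data = sim)
new_clock <- data.frame(age = 150, nbids = 15)

# Upper limit of a two-sided 98% prediction interval = one-sided 99% upper bound
pi98 <- predict(fit, newdata = new_clock, interval = "prediction", level = 0.98)
upper99 <- unname(pi98[, "upr"])

# Same bound by hand: fit + t_{0.99, df} * sqrt(SE(fit)^2 + s^2)
p <- predict(fit, newdata = new_clock, se.fit = TRUE)
manual <- unname(p$fit + qt(0.99, p$df) * sqrt(p$se.fit^2 + p$residual.scale^2))
print(c(upper99 = upper99, manual = manual))
```

The prediction bound is wider than the confidence bound because it adds the residual variance of an individual sale to the uncertainty of the fitted mean.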
Q2e) Find the partial correlation coefficients, compare them with the corresponding marginal correlation coefficients, and comment on the nature of the relationships between the independent variables and the dependent variable.
31 *var(price) # SStotal = (n-1)*Var(price), with n = 32 clocks
133.1^2*29 # SSE = (residual standard error)^2 * residual df
4791194.21875 - 513752.69 # SSR = SStotal - SSE
R2 = 4277441.52875 / 4791194.21875 # R^2 of the model, i.e. the proportion of variability explained by the joint linear effect of age and nbids
R2
anova(lm(price~age))
marr2a = 2554859/(2236335+2554859) # Squared marginal correlation of price with age (SSR/SStotal of price~age)
marr2a
1 - pf(marr2a/((1-marr2a)/30),1,30) # P value; the one-predictor model has n-2 = 30 residual df
anova(model)
anova(lm(price~nbids))
marr2b = 746185.4/(4045008.8+746185.4) # Squared marginal correlation of price with nbids (SSR/SStotal of price~nbids)
marr2b
1 - pf(marr2b/((1-marr2b)/30),1,30) # P value; the one-predictor model has n-2 = 30 residual df
anova(lm(price~nbids + age))
parr2ba = 1722300.7/2236335 # price,nbids|age: squared partial correlation of nbids in the presence of age
parr2ba
sqrt(parr2ba/((1-parr2ba)/29)) # T value as seen in model
1 - pf(parr2ba/((1-parr2ba)/29),1,29) # P value as seen in model
parr2ab = 3530974.3/4045008.8 # price,age|nbids: squared partial correlation of age in the presence of nbids
parr2ab
sqrt(parr2ab/((1-parr2ab)/29)) # T value as seen in the model
1 - pf(parr2ab/((1-parr2ab)/29),1,29) # P value as seen in model
Ans Q2e)
Squared marginal correlation price,age = 0.5332
Squared marginal correlation price,nbids = 0.1557
Squared partial correlation price,nbids|age = 0.7701
Squared partial correlation price,age|nbids = 0.8729
The proportion of variation in price explained by the linear effect of nbids alone is only 15.6%, with p-value ≈ 0.03
The proportion of variation in price explained by the linear effect of age alone is 53.3%, with p-value < 1e-05
The proportion of the variation in price left unexplained by age that is explained by nbids is 77.0%, with p-value ≈ 9.1e-11
The proportion of the variation in price left unexplained by nbids that is explained by age is 87.3%, with p-value ≈ 1.6e-14
The proportion of variability in price explained by the joint linear effect of age and nbids is 89.3%, with p-value = 8.769e-15
Both partial values are much larger than the corresponding marginal values, so each variable explains price far better once the other has been accounted for.
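The sums-of-squares ratios used above can be cross-checked against the residual definition of partial correlation: correlate what is left of price and of nbids after each has been regressed on age. The sketch below runs on simulated data (an assumption; `sim` and the other names are illustrative).

```r
# Simulated stand-in for the clock data (assumption: NOT the real clock_prices.data)
set.seed(1)
n <- 32
age <- round(runif(n, 100, 200))
nbids <- sample(5:15, n, replace = TRUE)
price <- -1300 + 12.7 * age + 86 * nbids + rnorm(n, sd = 130)
sim <- data.frame(age, nbids, price)

# Partial correlation of price and nbids given age, via residuals
r_price <- resid(lm(price ~ age, data = sim))
r_nbids <- resid(lm(nbids ~ age, data = sim))
partial_r <- cor(r_price, r_nbids)

# Same quantity squared, from the sequential ANOVA table: SS(nbids|age) / (SS(nbids|age) + SSE)
ss <- anova(lm(price ~ age + nbids, data = sim))[["Sum Sq"]]   # age, nbids|age, residual
partial_r2_ss <- ss[2] / (ss[2] + ss[3])

# Squared marginal correlation for comparison
marginal_r2 <- cor(sim$price, sim$nbids)^2
print(c(partial = partial_r^2, from_ss = partial_r2_ss, marginal = marginal_r2))
```

The residual approach also gives the sign of the partial correlation, which the sums-of-squares ratio does not.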
Q2 f) In the presence of the other, which of the two factors, age of the clock or the number of bidders, is more important in determining the selling price of a clock?
Ans Q2f)
The partial effects of both explanatory variables are statistically significant, as is their joint effect. However, age is a more important determinant of price than nbids. First, marginally, age explains 53.3% of the variability in price compared with 15.6% for nbids.
Second, after accounting for the linear effect of age on price, the linear effect of nbids accounts for 77% of the remaining variability in price, whereas after accounting for the linear effect of nbids on price, the linear effect of age accounts for 87.3% of the remaining variability in price.
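The link between these partial proportions and the t values in the model summary is the identity partial R^2 = t^2 / (t^2 + residual df). The sketch below checks it on simulated data (an assumption; `sim` and `fit` are illustrative names).

```r
# Simulated stand-in for the clock data (assumption: NOT the real clock_prices.data)
set.seed(1)
n <- 32
age <- round(runif(n, 100, 200))
nbids <- sample(5:15, n, replace = TRUE)
price <- -1300 + 12.7 * age + 86 * nbids + rnorm(n, sd = 130)
sim <- data.frame(age, nbids, price)

fit <- lm(price ~ age + nbids, data = sim)
tvals <- coef(summary(fit))[c("age", "nbids"), "t value"]
# Squared partial correlation of each predictor given the other
partial_r2 <- tvals^2 / (tvals^2 + df.residual(fit))
print(round(partial_r2, 4))
```

Because the residual degrees of freedom are shared, the predictor with the larger absolute t value always has the larger squared partial correlation.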
Q3) Is the first order model acceptable? Fit as appropriate a model as possible for the auctioned selling price of grandfather clocks, based on the information on the age of the clock and the number of bidders, and then based on this model answer the same questions as in 2 b, c, and d above.
res = rstandard(model) # internally studentized residuals: e_i/(s*sqrt(1-h_ii)), with s = 133.1
par(mfrow=c(1,2))
hist(res)
boxplot(res)
par(mfrow = c(2,2))
plot(model)
source('normality.r')
normtest(res)
library(lmtest)
bptest(model)
Ans 3) The residuals are normal and homoskedastic, but the residual plot is not completely white noise, which leaves room for a better model.
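The residual checks can be reproduced with base R alone. The sketch below runs on simulated data (an assumption) and uses `shapiro.test()` as a stand-in for the `normtest()` helper from normality.r, whose contents are not shown here.

```r
# Simulated stand-in for the clock data (assumption: NOT the real clock_prices.data)
set.seed(1)
n <- 32
age <- round(runif(n, 100, 200))
nbids <- sample(5:15, n, replace = TRUE)
price <- -1300 + 12.7 * age + 86 * nbids + rnorm(n, sd = 130)
sim <- data.frame(age, nbids, price)

fit <- lm(price ~ age + nbids, data = sim)
# Internally studentized residuals: by hand and with the built-in helper
res_manual <- resid(fit) / (sigma(fit) * sqrt(1 - hatvalues(fit)))
res_builtin <- rstandard(fit)

# Shapiro-Wilk normality test (stand-in for normtest(); an assumption)
sw <- shapiro.test(res_builtin)
print(sw$p.value)
```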
# standardize all three variables (mean 0, sd 1) before adding the interaction term
price1 = (price - mean(price))/sd(price)
age1 = (age - mean(age))/sd(age)
nbids1 = (nbids - mean(nbids))/sd(nbids)
#model1 = lm(price~+nbids+I(nbids*age))
model1 = lm(price1~age1 + nbids1 + I(nbids1*age1))
summary(model1)
anova(model1)
model2 = lm(price1~age1+nbids1)
summary(model2)
anova(model2)
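r2 = 1.911222/3.325908 # SS(interaction | age1, nbids1) / SSE of model2: squared partial correlation of the interaction
r2
1- pf(r2/((1-r2)/28),1,28) # P value; model1 has 32-4 = 28 residual df
The interaction between age and nbids explains about 57.5% of the variability left unexplained by the first order model, and this proportion is statistically significant.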
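The same test of the interaction term can be obtained by comparing the nested models directly with `anova()`. The sketch below runs on simulated data with a built-in interaction (an assumption; `sim`, `fit_add` and `fit_int` are illustrative names).

```r
# Simulated data WITH an age-by-bidders interaction (assumption: NOT the real clock data)
set.seed(2)
n <- 32
age <- round(runif(n, 100, 200))
nbids <- sample(5:15, n, replace = TRUE)
price <- 320 + 0.9 * age - 93 * nbids + 1.3 * age * nbids + rnorm(n, sd = 90)
sim <- data.frame(age, nbids, price)

fit_add <- lm(price ~ age + nbids, data = sim)               # first order model
fit_int <- lm(price ~ age + nbids + age:nbids, data = sim)   # adds the interaction

# Partial F-test: does the interaction reduce the residual sum of squares significantly?
cmp <- anova(fit_add, fit_int)
p_int <- cmp[["Pr(>F)"]][2]
print(cmp)
```

For a single added term, this F-test is equivalent to the t-test of the interaction coefficient in `summary(fit_int)`.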
par(mfrow=c(2,2))
plot(model1)
normtest(residuals(model1))
bptest(model1)
Q3) 2b) Given the age of a clock, by what amount can one expect the selling price to go up for one more person participating in the auction?
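b = coef(model1) # standardized coefficients; 0.6198 is the nbids1 coefficient
# effect of one more bidder in pounds, for a clock of average age (age1 = 0)
y = b["nbids1"]*sd(price)/sd(nbids)
y
Ans) Because model1 contains an age-by-bidders interaction, the effect of one more bidder depends on the age of the clock. For a clock of average age, one more bidder raises the expected price by y pounds (0.6198 standard deviations of price per standard deviation of bidders, converted to pounds per bidder). For a clock with standardized age age1, the effect is (b["nbids1"] + b["I(nbids1 * age1)"]*age1)*sd(price)/sd(nbids).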
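With an interaction in the model, the price change per extra bidder is a function of age. The sketch below illustrates this on the raw scale with simulated data (an assumption; `sim`, `fit_int` and `bidder_effect` are illustrative names, not part of the original analysis).

```r
# Simulated data WITH an age-by-bidders interaction (assumption: NOT the real clock data)
set.seed(2)
n <- 32
age <- round(runif(n, 100, 200))
nbids <- sample(5:15, n, replace = TRUE)
price <- 320 + 0.9 * age - 93 * nbids + 1.3 * age * nbids + rnorm(n, sd = 90)
sim <- data.frame(age, nbids, price)

fit_int <- lm(price ~ age * nbids, data = sim)
b <- coef(fit_int)

# Expected price change for one more bidder, for a clock of age a
bidder_effect <- function(a) unname(b["nbids"] + b["age:nbids"] * a)

# Cross-check: difference between predictions at 11 and 10 bidders, age 150
d <- unname(predict(fit_int, data.frame(age = 150, nbids = 11)) -
            predict(fit_int, data.frame(age = 150, nbids = 10)))
print(c(age_100 = bidder_effect(100), age_150 = bidder_effect(150), check_150 = d))
```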
Q3 2c) An auction house has acquired several grandfather clocks, each 100 years old, paying an average price of £500 per clock. From past experience it has found that such auctions (for antique grandfather clocks) typically attract about 10-12 bidders. What can be said about its expected profit per clock with 95% confidence?
df = data.frame(age = c(100,100,100),nbids = c(10,11,12))
df$age1 =(df$age - mean(age))/sd(age)
df$nbids1 =(df$nbids - mean(nbids))/sd(nbids)
x = predict.lm(model1, newdata=df,se.fit=TRUE,level = 0.95,interval = "confidence")
((x$fit *sd(price) )+mean(price) )- 500 # back-transform to pounds, then subtract the cost of the clocks
Ans) The expected profits per clock for 100-year-old clocks with 10, 11 or 12 bidders are given above, together with their 95% confidence limits.
Q3 2d) You walk into an auction selling an antique 150-year-old grandfather clock and find that there are 15 bidders (including yourself) participating in the auction. You are extremely keen on acquiring the clock. At least what amount should you bid for the clock, so that you are 99% certain that nobody else can out-bid you?
df = data.frame(age = c(150),nbids = c(15))
df$age1 =(df$age - mean(age))/sd(age)
df$nbids1 =(df$nbids - mean(nbids))/sd(nbids)
x = predict.lm(model1, newdata=df,se.fit=TRUE,level = 0.98,interval = "confidence") # two-sided 98% interval, i.e. 1% in each tail
(x$fit * sd(price) ) + mean(price) # back-transform to pounds
Ans 2d) I will have to bid at least £1873.213 (lower bound = fitted price - t0.99 * SE(fitted price)) so that, in expectation, I am 99% certain that nobody else can out-bid me.
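Standardizing the variables changes the coefficients but not the fitted prices. The sketch below checks this on simulated data (an assumption; `sim`, `z`, `fit_std` and `fit_raw` are illustrative names): it compares the back-transformed prediction with the prediction from the same interaction model on the raw scale.

```r
# Simulated data WITH an age-by-bidders interaction (assumption: NOT the real clock data)
set.seed(2)
n <- 32
age <- round(runif(n, 100, 200))
nbids <- sample(5:15, n, replace = TRUE)
price <- 320 + 0.9 * age - 93 * nbids + 1.3 * age * nbids + rnorm(n, sd = 90)
sim <- data.frame(age, nbids, price)

# Standardize x using the mean and sd of the reference (training) values
z <- function(x, ref) (x - mean(ref)) / sd(ref)
sim$age1 <- z(sim$age, sim$age)
sim$nbids1 <- z(sim$nbids, sim$nbids)
sim$price1 <- z(sim$price, sim$price)

fit_std <- lm(price1 ~ age1 + nbids1 + I(nbids1 * age1), data = sim)
fit_raw <- lm(price ~ age * nbids, data = sim)

new_clock <- data.frame(age = 150, nbids = 15)
new_clock$age1 <- z(new_clock$age, sim$age)
new_clock$nbids1 <- z(new_clock$nbids, sim$nbids)

# Back-transform the standardized prediction to pounds and compare
pred_std <- unname(predict(fit_std, newdata = new_clock)) * sd(sim$price) + mean(sim$price)
pred_raw <- unname(predict(fit_raw, newdata = new_clock))
print(c(standardized = pred_std, raw = pred_raw))
```

Fitting on the raw scale avoids the manual back-transformation step, at the cost of coefficients that are harder to compare across variables.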