G12SMM : Statistical Models And Methods - Linear Models - Assessment Answer

November 08, 2018
Author : Ashley Simons

Solution Code: 1DCD

Question: Statistical Models And Methods


Statistical Models And Methods Assignment

Assignment Task

The Data

Data are available on the recommended prices of used cars in the United States. All cars are the same age, but have done different mileages and have different specifications. You have recently been employed by a used car dealership to build models to describe the dependence of recommended prices on potential explanatory variables, in order to use these models to price your own used cars. The data, which come in two parts, are available on Moodle. They are

TrainData.txt   Training data, which will be used to build models.
TestData.txt    Test data, which will be used to assess predictions from the models built.

They can be read into R (after saving the file in your working directory) using

Train = read.table("TrainData.txt",header = T)
Test = read.table("TestData.txt",header = T)

A description of the variables can be found in the file description.txt.

After reading in the data, you can look at the structure of the data (number of observations, variable types, etc.) using the str() command, e.g. str(Train). For both data sets, you should treat the covariates Cylinder, Doors, Cruise, Sound and Leather as factors (they are treated as integers by default). This can be done using, for example,

Train$Cylinder = factor(Train$Cylinder)
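The same conversion has to be repeated for all five covariates in both data sets. A minimal sketch of doing it in one step is given below; the small data frame is a made-up stand-in for Train (the real data come from TrainData.txt), and the name factor_cols is introduced here only for illustration.

```r
# Hypothetical stand-in for Train; the values are invented for illustration.
Train <- data.frame(
  Price    = c(17000, 29000, 12500, 35000),
  Mileage  = c(8200, 19500, 30100, 12000),
  Cylinder = c(4, 6, 4, 8),
  Doors    = c(4, 4, 2, 4),
  Cruise   = c(1, 1, 0, 1),
  Sound    = c(1, 0, 1, 1),
  Leather  = c(0, 1, 1, 1)
)

# Columns that should be treated as factors rather than integers
factor_cols <- c("Cylinder", "Doors", "Cruise", "Sound", "Leather")

# Convert all of them at once; the same line can be applied to Test
Train[factor_cols] <- lapply(Train[factor_cols], factor)

str(Train)
```

Applying the identical line to Test keeps the two data sets consistent, which matters later when predictions are made from a model fitted on Train.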

The Task

(a) Using the TRAINING data, investigate models to explain the relationship between Price and the other variables. That is, Price (or transformations of it) is to be the response variable, and all other variables are potential explanatory variables. [35]

(b) Use your fitted model(s) from (a) to predict the responses for the observations in the TEST data set. That is, for each of the observations in the Test data, use the values of the explanatory variables as input to your model(s) from (a) to obtain fitted/predicted responses for these observations. Compare your predicted responses with the known observed responses from the observations in the Test data, using suitable plots/numerical summaries. [15]
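Part (b) can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated by a hypothetical helper make_cars(), the model uses just two of the covariates, and RMSE, MAE and the correlation between observed and predicted prices are one possible choice of numerical summaries, not the ones required by the assignment.

```r
set.seed(1)

# Hypothetical data generator standing in for TrainData.txt / TestData.txt
make_cars <- function(n) {
  Mileage  <- round(runif(n, 500, 40000))
  Cylinder <- factor(sample(c(4, 6, 8), n, replace = TRUE), levels = c(4, 6, 8))
  Price    <- 15000 - 0.15 * Mileage +
              4000 * (as.integer(Cylinder) - 1) + rnorm(n, sd = 2000)
  data.frame(Price, Mileage, Cylinder)
}
Train <- make_cars(200)
Test  <- make_cars(50)

# Fit on the training data only
fit <- lm(Price ~ Mileage + Cylinder, data = Train)

# Predict the response for the test observations
pred <- predict(fit, newdata = Test)

# Numerical summaries of predictive accuracy
rmse <- sqrt(mean((Test$Price - pred)^2))
mae  <- mean(abs(Test$Price - pred))
print(c(RMSE = rmse, MAE = mae, correlation = cor(Test$Price, pred)))

# Observed against predicted; points near the dashed line are well predicted
plot(pred, Test$Price, xlab = "Predicted price", ylab = "Observed price")
abline(0, 1, lty = 2)
```

If the response is transformed (for example log(Price)), the predictions would need to be transformed back before comparing them with the observed prices.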





The main objective of this study is to determine an appropriate model for predicting the price of cars using Mileage, Cylinder, Doors, Cruise, Sound and Leather as independent variables. Price and Mileage are continuous variables measured on a ratio scale. The other variables, Cylinder, Doors, Cruise, Sound and Leather, are treated as categorical (nominal) variables. The exploratory data analysis for these variables is given below.

Train = read.table("D:/Downloads/TrainData.txt",header = T)

Test = read.table("D:/Downloads/TestData.txt",header = T)






The mean retail price of the cars is $21650.42, with a standard deviation of $9973.78. The median price is $18214.45. The minimum and maximum recorded prices are $9041.91 and $70755.47 respectively.

The mean mileage of the cars is 19808.3, with a standard deviation of 8005.7. The median mileage is 20954. The minimum and maximum recorded mileages are 266 and 41829 respectively.
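The code that produced these summary statistics is not shown. A minimal sketch of how such figures could be computed in base R is given below; the helper name describe and the example vector are hypothetical, and in practice the function would be applied to Train$Price and Train$Mileage.

```r
# Hypothetical helper returning the statistics quoted in the text
describe <- function(x) {
  c(mean = mean(x), sd = sd(x), median = median(x),
    min = min(x), max = max(x))
}

# Made-up prices for illustration only
price <- c(9000, 12500, 15800, 18200, 21400, 27900, 70000)

print(round(describe(price), 2))
```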

From the above histogram, we see that the distribution of price has a longer tail on the right than a normal curve, indicating that price is right-skewed (positively skewed). This agrees with the mean price exceeding the median.

From the above histogram, we see that the distribution of mileage has a longer tail on the left than a normal curve, indicating that mileage is left-skewed (negatively skewed). This agrees with the mean mileage falling below the median.
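The direction of skew can also be checked numerically. The reported statistics already point the same way: the mean price (21650.42) is above the median (18214.45), while the mean mileage (19808.3) is below the median (20954). The sketch below computes a simple moment-based skewness coefficient; the function is written out by hand rather than taken from a package, and the two vectors are invented examples, not the car data.

```r
# Moment-based sample skewness: positive for a long right tail,
# negative for a long left tail
skewness <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

right_tailed <- c(1, 2, 2, 3, 3, 3, 4, 10, 25)  # made-up, long right tail
left_tailed  <- 50 - right_tailed               # mirror image, long left tail

print(c(right = skewness(right_tailed), left = skewness(left_tailed)))

# Histogram of the right-tailed example
hist(right_tailed, main = "Right-skewed example", xlab = "Value")
```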

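Since the sample size is large, the sampling distributions of the mean price and the mean mileage are approximately normal (by the central limit theorem), even though the distributions of the individual prices and mileages are skewed.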
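A small simulation can illustrate what the central limit theorem applies to: the distribution of a sample mean, not the distribution of the raw observations. The sketch below is an assumption-laden illustration: the population is a hypothetical right-skewed (lognormal) set of values, and the sample size of 200 is chosen arbitrarily.

```r
set.seed(42)

# Hypothetical right-skewed population (not the car data)
population <- rlnorm(100000, meanlog = 9.8, sdlog = 0.4)

# Means of 2000 repeated samples of size 200
sample_means <- replicate(2000, mean(sample(population, 200)))

# Hand-written moment-based skewness
skewness <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# The raw values stay skewed; the sample means are much closer to symmetric
print(c(raw = skewness(population), means = skewness(sample_means)))
```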

Regression Model

A multiple regression analysis is performed to predict Price using Mileage, Cylinder, Doors, Cruise, Sound and Leather as independent variables. The R code is given below.


summary(regression.out)
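The call that creates regression.out does not appear in the text. A plausible reconstruction, assuming a main-effects linear model with all six covariates and an untransformed response, is sketched below. The data are simulated stand-ins so that the example runs on its own; only the variable names are taken from the assignment.

```r
set.seed(123)
n <- 120

# Simulated stand-in for the training data (values are invented)
Train <- data.frame(
  Mileage  = round(runif(n, 300, 42000)),
  Cylinder = factor(sample(c(4, 6, 8), n, replace = TRUE)),
  Doors    = factor(sample(c(2, 4), n, replace = TRUE)),
  Cruise   = factor(sample(0:1, n, replace = TRUE)),
  Sound    = factor(sample(0:1, n, replace = TRUE)),
  Leather  = factor(sample(0:1, n, replace = TRUE))
)
Train$Price <- 14000 - 0.17 * Train$Mileage +
  5000 * (as.integer(Train$Cylinder) - 1) +
  3000 * (Train$Cruise == "1") + rnorm(n, sd = 3000)

# Assumed form of the fitted model: main effects only, no transformation
regression.out <- lm(Price ~ Mileage + Cylinder + Doors + Cruise +
                       Sound + Leather, data = Train)

print(summary(regression.out))
```

Because Price is right-skewed, a model for log(Price) would be a natural alternative to compare against this one.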

The value of the F test statistic is 67 and its p-value is less than 0.05, indicating that the regression model as a whole is statistically significant: it predicts price significantly better than a model with no explanatory variables.

The coefficient of determination (R-squared) is 0.57. This indicates that 57% of the variation in price is explained by the regression model, while the remaining 43% is left unexplained.
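The F statistic, its p-value and R-squared can all be read from the summary object. The sketch below shows one way to extract them; it uses R's built-in mtcars data and a stand-in model purely so that it runs without the assignment files, and the object names fit, s, f and p_value are illustrative.

```r
# Stand-in model on a built-in data set (not the car price data)
fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)
s   <- summary(fit)

# fstatistic is a named vector: value, numdf, dendf
f <- s$fstatistic

# Upper-tail probability of the F distribution gives the overall p-value
p_value <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)

print(c(F = f[["value"]], p = p_value,
        R2 = s$r.squared, adjR2 = s$adj.r.squared))
```

Adjusted R-squared is worth reporting alongside R-squared when models with different numbers of explanatory variables are compared, since R-squared never decreases when a variable is added.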


The R code used to extract the regression results is given below.


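# Estimated coefficients and residuals
coef <- coefficients(regression.out)
resid <- residuals(regression.out)

# R-squared and residual standard error
rsq <- summary(regression.out)$r.squared
se <- summary(regression.out)$sigma

# Coefficient table; columns: estimate, standard error, t value, p-value
stat.coef <- summary(regression.out)$coefficients
coef <- stat.coef[,1]     # estimates (overwrites coef from above)
se.coef <- stat.coef[,2]  # standard errors
t.coef <- stat.coef[,3]   # t statistics
p.coef <- stat.coef[,4]   # p-values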







library(visreg)  # visreg() comes from the visreg package
out = visreg(regression.out, band=FALSE)

Regression Plots

The plot and the deviances suggest that the linear model is probably sufficient. This is a subjective judgment based on eyeballing the data and fits - perhaps the most important aspect of model fitting.
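The visual judgment described above could be supplemented with standard residual diagnostics in base R. The sketch below is illustrative only: it again uses the built-in mtcars data and a stand-in model, and shows a residuals-versus-fitted plot and a normal Q-Q plot.

```r
# Stand-in model on a built-in data set (not the car price data)
fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)

res         <- residuals(fit)
fitted_vals <- fitted(fit)

# Residuals against fitted values: curvature or a funnel shape would
# suggest a missing transformation or non-constant variance
plot(fitted_vals, res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Normal Q-Q plot of the residuals
qqnorm(res)
qqline(res)
```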
