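The following is the original draft of a linear regression assignment prepared for a client by a statistics PhD from the UK. Linear regression is one of the most widely used methods in data analysis and appears throughout data mining and machine learning. Below we look at how to use linear regression to predict customer spending. The specific requirements are as follows: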
Coursework 2: Predictive linear regression method
- Using the ‘TaykoData Set’ for analysis you will be performing a predictive linear regression.
- The data is provided in Excel format on the Blackboard. ‘Tayko Software catalog’ is a firm that sells games and educational software.
- In this project, we centre on applying a linear multiple regression method to predict ‘Spending’ of customers, based on a number of explanatory variables, listed below.
- There are 23 variables. For the source variable, there are 15 different catalogues that the games and educational software can be ordered from.
- The class/dependent variable is SPENDING.
The project task:
Develop a multiple regression model for predicting spending among the purchasers.
- 1. Partition this data set into training and validation partitions on the basis of the partition variable.
- 2. Develop a best model for predicting spending using multiple linear regression on the training data.
  - a. For this, consider the independent variables that will produce the best model.
  - b. You will do this based on statistical estimations such as
    - i. Multicollinearity assumptions checking
    - ii. Analysis of statistical indication of the best model
  - c. Three regressions need to be completed
    - i. Forward
    - ii. Backward
    - iii. Stepwise
  - d. For each regression remember to select the statistics that will help you complete part b.
  - e. Describe the subset selection method, and explain and justify your choice.
- 3. Discuss a variety of statistical measures (i.e. goodness of fit measures in SPSS) that would allow you to validate the performance of the model.
  - a. For higher marks relate these back to the model you have been developing in part 2. This is independent learning and will not be covered in class.
- 4. Perform a final regression on the validation data.
  - a. This should be the best model as determined from using the test data.
  - b. Report on the model accuracy as well as overall performance.
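The brief asks for the analysis in SPSS, but the logic behind parts 2b and 2c can be sketched in a few lines of Python. The block below is a minimal illustration, not the procedure SPSS runs: it computes variance inflation factors as a multicollinearity check and performs a greedy forward selection. Using AIC as the entry criterion is an assumption made here for brevity (SPSS uses F-to-enter probabilities by default), and the data are synthetic, not the Tayko data.

```python
import numpy as np

def ols_rss(X, y):
    """Residual sum of squares of an OLS fit of y on X (intercept added)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def aic(X, y):
    """Gaussian AIC up to an additive constant: n*log(RSS/n) + 2k."""
    n = len(y)
    k = X.shape[1] + 1  # slopes plus intercept
    return n * np.log(ols_rss(X, y) / n) + 2 * k

def vif(X):
    """Variance inflation factor of each column: 1 / (1 - R_j^2)."""
    out = []
    for j in range(X.shape[1]):
        xj = X[:, j]
        others = np.delete(X, j, axis=1)
        tss = float(((xj - xj.mean()) ** 2).sum())
        r2 = 1.0 - ols_rss(others, xj) / tss
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

def forward_select(X, y):
    """Greedily add the column that lowers AIC most; stop when none does."""
    selected, remaining = [], list(range(X.shape[1]))
    best = aic(X[:, []], y)  # start from the intercept-only model
    while remaining:
        score, j = min((aic(X[:, selected + [j]], y), j) for j in remaining)
        if score >= best:
            break
        best = score
        selected.append(j)
        remaining.remove(j)
    return selected

# Synthetic stand-in data (assumption): only columns 0 and 2 drive y.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = 3 * X[:, 0] - 2 * X[:, 2] + rng.normal(size=500)

selected = forward_select(X, y)
print("selected columns:", selected)
print("VIFs:", np.round(vif(X), 2))
```

Backward elimination follows the same pattern in reverse (start from all columns and drop the one whose removal lowers the criterion most), and stepwise selection alternates the two moves.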
1. Data Description
The original data set contains 23 variables and 1000 observations. We divided it into a training set and a validation set of 500 observations each. The variables are as follows:
US: equals 1 if it is a US address, otherwise 0
Source: Source catalogue for the record, coded as 15 indicator variables (source_a to source_w), each equal to 1 if the record came from that source, otherwise 0
Freq: Number of transactions in last year at source catalogue
last_update_days_ago: Number of days since the last update to the customer record
first_update_days_ago: Number of days since the first update to the customer record
Web_order: Customer placed at least 1 order via web
Gender: equals 1 when customer is male, otherwise 0
Address_is_res: Address is a residence
Purchase: Person made purchase in test mailing
Spending: Amount spent by customer in test mailing ($)
Partition: Variable indicating which partition the record will be assigned to
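As a rough illustration of how the Partition variable drives the split, the sketch below separates a handful of records into training and validation sets with boolean masks. The records are made up for the example, and the partition codes "t" and "v" are an assumption; the codes in the actual Excel file should be checked before reuse.

```python
import numpy as np

# Made-up records (assumption): the real values come from the Excel sheet.
# The codes "t" (training) and "v" (validation) are also assumed.
partition = np.array(["t", "v", "t", "t", "v", "v"])
us = np.array([1, 1, 0, 1, 1, 0])
freq = np.array([2, 0, 1, 3, 1, 0])
spending = np.array([128.0, 0.0, 127.0, 184.0, 89.0, 0.0])

# Predictor matrix: one row per customer, one column per variable.
X = np.column_stack([us, freq])

# Boolean masks select the rows that belong to each partition.
train = partition == "t"
valid = partition == "v"

X_train, y_train = X[train], spending[train]
X_valid, y_valid = X[valid], spending[valid]

print(X_train.shape, X_valid.shape)
```

Note that Partition itself is only a bookkeeping variable and should not be used as a predictor.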
We will use Spending as our dependent variable and build a linear model on the other variables to find out which of them have a linear relationship with spending. We will then use the model for prediction.
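The fit-then-predict workflow described above can be sketched as follows. This is an illustrative NumPy version under stated assumptions, not the SPSS output used in the report: the data are synthetic, the 500/500 split simply mirrors the partition sizes, and the helper names (`fit_ols`, `predict`, `rmse`, `r_squared`) are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients; element 0 is the intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, X):
    """Predicted response for each row of X."""
    return beta[0] + X @ beta[1:]

def rmse(y, y_hat):
    """Root mean squared prediction error."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def r_squared(y, y_hat):
    """Share of the variance in y explained by the predictions."""
    ss_res = float(((y - y_hat) ** 2).sum())
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (assumption): 1000 rows, 3 predictors.
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 3))
y = 10 + 5 * X[:, 0] - 3 * X[:, 1] + rng.normal(size=1000)

# Fit on the first 500 rows, evaluate on the remaining 500.
X_train, y_train = X[:500], y[:500]
X_valid, y_valid = X[500:], y[500:]

beta = fit_ols(X_train, y_train)
valid_rmse = rmse(y_valid, predict(beta, X_valid))
valid_r2 = r_squared(y_valid, predict(beta, X_valid))
print("coefficients:", np.round(beta, 2))
print("validation RMSE:", round(valid_rmse, 3), "R^2:", round(valid_r2, 3))
```

Comparing the validation RMSE and R² with their training-set counterparts gives a quick indication of overfitting: a large gap suggests the model has picked up noise from the training partition.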
2. Exploratory Data Analysis
We begin by examining the partial plots of the dependent variable (Spending) against each independent variable.
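For readers who want to see what goes into such a plot, here is a minimal sketch of the data behind a partial regression plot, assuming the usual construction: both the response and the predictor of interest are regressed on the remaining predictors, and the two sets of residuals are plotted against each other. The data are synthetic and the function names are hypothetical.

```python
import numpy as np

def residuals(X, y):
    """Residuals of an OLS fit of y on X (intercept added)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return y - A @ beta

def partial_plot_data(X, y, j):
    """Return (x_res, y_res) for predictor j, adjusting for the others."""
    others = np.delete(X, j, axis=1)
    x_res = residuals(others, X[:, j])  # part of x_j not explained by others
    y_res = residuals(others, y)        # part of y not explained by others
    return x_res, y_res

# Synthetic stand-in data (assumption) with correlated predictors 0 and 1.
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 3))
X[:, 1] += 0.5 * X[:, 0]
y = 4 + 2 * X[:, 0] - X[:, 1] + rng.normal(size=500)

x_res, y_res = partial_plot_data(X, y, 0)

# The slope through the residual cloud equals the coefficient of
# predictor 0 in the full multiple regression.
slope = float(x_res @ y_res / (x_res @ x_res))
print("partial slope for predictor 0:", round(slope, 3))
```

A scatter of `x_res` against `y_res` (for example with `matplotlib.pyplot.scatter`) gives the plot itself. A roughly linear cloud supports including the predictor linearly, while curvature or isolated points hint at a transformation or at influential observations.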