SAS Regression Analysis

SAS can build regression models that analyze how a response variable depends on a set of predictor variables, and can then use the fitted model to predict out-of-sample values. The process is as follows:

Open SAS Studio, go to Tasks, and select Linear Regression. The following window opens:

Input Data settings:

image17

Model settings: One can add simple variables to the set of independent variables. If one suspects that a combination of variables influences the dependent variable, one can add that combination as a “Cross” effect, as shown in the snapshot below.
The regression model type is also selected here. The options are Full Factorial, N-Factorial, and Polynomial Order:
image20
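
To make the model options above more concrete, here is a minimal Python sketch (not SAS code) of the effect lists such options typically produce. It assumes that Full Factorial means all main effects plus all interactions, that N-Factorial limits interactions to order N, and that Polynomial Order N adds powers of one variable up to N. The helper names factorial_effects and polynomial_effects are hypothetical and used only for illustration.

```python
from itertools import combinations


def factorial_effects(variables, max_order=None):
    """Return main effects and interactions up to max_order.

    With max_order=None every order is included (full factorial).
    """
    if max_order is None:
        max_order = len(variables)
    effects = []
    for order in range(1, max_order + 1):
        for combo in combinations(variables, order):
            # "A*B" denotes the crossed (interaction) effect of A and B.
            effects.append("*".join(combo))
    return effects


def polynomial_effects(variable, order):
    """Return the powers of one variable up to the given order."""
    return ["*".join([variable] * k) for k in range(1, order + 1)]


predictors = ["GDP", "VIX", "JOBLESS"]
print(factorial_effects(predictors))               # full factorial
print(factorial_effects(predictors, max_order=2))  # 2-way factorial
print(polynomial_effects("GDP", 3))                # polynomial of order 3
```

A single “Cross” of two variables corresponds to one entry such as GDP*VIX in these lists.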

Options:

image23

Selection Method:
image24

Output settings:

image18

The following is an example of the results:

Root MSE 84.04145 R-Square 0.3995
Dependent Mean 472.22486 Adj R-Sq 0.3194
Coeff Var 17.79691
Number of Observations Read 150
Number of Observations Used 35
Number of Observations with Missing Values 115

The ANOVA table tests whether the predictors’ coefficients are collectively different from 0. In other words, it tests whether the response variable has a linear relationship with the predictors, and thus whether the model explains more than an intercept-only model would:

Analysis of Variance
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model             4    140951           35238         4.99      0.0033
Error             30   211889           7062.96453
Corrected Total   34   352840

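The p-value of 0.0033 indicates that if all the predictor coefficients were truly 0, there would be only a small chance (about 0.3%) of obtaining an F value this large. This is evidence of a linear relationship between the response and at least one of the predictors. It does not by itself show that the model fits well, which is what the statistics below address.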
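
As a quick check of how the ANOVA columns relate to each other, the following Python sketch recomputes the mean squares and the F value from the sums of squares and degrees of freedom reported above. The numbers are copied from the table. Small differences in the last digits are expected because the table shows rounded sums of squares.

```python
# Sums of squares and degrees of freedom copied from the ANOVA table.
ss_model, df_model = 140951, 4
ss_error, df_error = 211889, 30

# Each mean square is a sum of squares divided by its degrees of freedom.
ms_model = ss_model / df_model
ms_error = ss_error / df_error

# The F value is the ratio of the model mean square to the error mean square.
f_value = ms_model / ms_error

print(f"Mean Square (Model): {ms_model:.0f}")
print(f"Mean Square (Error): {ms_error:.2f}")
print(f"F Value: {f_value:.2f}")
```

The Pr > F column is the upper-tail probability of this F value under an F distribution with 4 and 30 degrees of freedom. It is not recomputed here because that needs a statistics library.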

Further analysis on the goodness of the model:

R-Square and Adjusted R-Square summarize in a single number how much of the variance in the response is explained by all the predictors together. A value near 1 indicates a good fit, and a value near 0 a poor one. Adjusted R-Square additionally penalizes the number of predictors in the model. Here R-Square is 0.3995, so the model explains about 40% of the variance.
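
Both statistics can be recovered from the ANOVA table. The Python sketch below uses the standard definitions, with the sums of squares, the number of observations used, and the model degrees of freedom taken from the output above.

```python
# Values copied from the regression output.
ss_model = 140951   # Model sum of squares
ss_total = 352840   # Corrected Total sum of squares
n_obs = 35          # Number of Observations Used
n_predictors = 4    # Model DF (predictors, excluding the intercept)

# R-Square: share of the total variation explained by the model.
r_square = ss_model / ss_total

# Adjusted R-Square: penalizes R-Square for the number of predictors.
adj_r_square = 1 - (1 - r_square) * (n_obs - 1) / (n_obs - n_predictors - 1)

print(f"R-Square: {r_square:.4f}")
print(f"Adj R-Sq: {adj_r_square:.4f}")
```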

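Root MSE measures the goodness of fit through the residuals: it is the square root of the error mean square and estimates the standard deviation of the residuals, in the units of the response variable.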
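
The Python sketch below shows how Root MSE and Coeff Var follow from the other reported values, assuming the usual definitions: Root MSE is the square root of the error mean square, and Coeff Var is Root MSE expressed as a percentage of the dependent mean.

```python
import math

# Values copied from the regression output.
ms_error = 7062.96453        # Error mean square from the ANOVA table
dependent_mean = 472.22486   # Dependent Mean

# Root MSE: square root of the error mean square.
root_mse = math.sqrt(ms_error)

# Coefficient of variation: Root MSE as a percentage of the dependent mean.
coeff_var = 100 * root_mse / dependent_mean

print(f"Root MSE:  {root_mse:.5f}")
print(f"Coeff Var: {coeff_var:.5f}")
```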

The t tests in the Parameter Estimates table check whether each predictor’s coefficient, taken separately, differs significantly from 0. VIX, JOBLESS and P_E do not appear to be useful predictors: their p-values are well above 0.05, so a true coefficient of 0 cannot be ruled out:

Parameter Estimates
Variable    Label       DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept   Intercept   1    739.59605            201.20407        3.68      0.0009
P_E         P/E         1    -1.36988             2.19982          -0.62     0.5382
GDP         GDP         1    17.20573             7.73193          2.23      0.0337
VIX         VIX         1    0.67569              2.44551          0.28      0.7842
JOBLESS     JOBLESS     1    -0.67914             0.42317          -1.60     0.1190
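
Each t value in the table is the parameter estimate divided by its standard error. The Python sketch below recomputes the t values and flags the coefficients whose reported p-value falls below a 0.05 significance level. The 0.05 threshold is an assumption, a common convention rather than something set in the SAS output.

```python
# (estimate, standard error, reported p-value) copied from Parameter Estimates.
rows = {
    "Intercept": (739.59605, 201.20407, 0.0009),
    "P_E": (-1.36988, 2.19982, 0.5382),
    "GDP": (17.20573, 7.73193, 0.0337),
    "VIX": (0.67569, 2.44551, 0.7842),
    "JOBLESS": (-0.67914, 0.42317, 0.1190),
}

alpha = 0.05  # assumed significance level

t_values = {}
significant = []
for name, (estimate, std_error, p_value) in rows.items():
    # The t value is the estimate measured in units of its standard error.
    t_values[name] = estimate / std_error
    if p_value < alpha:
        significant.append(name)
    print(f"{name:<10} t = {t_values[name]:6.2f}  p = {p_value:.4f}")

print("Significant at the 5% level:", significant)
```

Under this threshold only the intercept and GDP are flagged, which matches the reading of the table above.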
image21

Observed vs Predicted Y
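
The predicted values in such a plot come from plugging predictor values into the fitted equation. The Python sketch below does this with the coefficients from the Parameter Estimates table. The predict function and the input values in the example call are hypothetical, chosen only to show the arithmetic, and do not come from the data set.

```python
# Fitted coefficients copied from the Parameter Estimates table.
coefficients = {
    "Intercept": 739.59605,
    "P_E": -1.36988,
    "GDP": 17.20573,
    "VIX": 0.67569,
    "JOBLESS": -0.67914,
}


def predict(p_e, gdp, vix, jobless):
    """Return the predicted response for one set of predictor values."""
    return (
        coefficients["Intercept"]
        + coefficients["P_E"] * p_e
        + coefficients["GDP"] * gdp
        + coefficients["VIX"] * vix
        + coefficients["JOBLESS"] * jobless
    )


# Hypothetical out-of-sample inputs (illustrative values only).
y_hat = predict(p_e=15.0, gdp=2.0, vix=20.0, jobless=350.0)
print(f"Predicted Y: {y_hat:.2f}")
```

Since R-Square is only about 0.40 and three of the four coefficients are not significant, such predictions should be treated with caution.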

image19

R-Student By Predicted Y

image16

Residual by Regression

 
