# Basic Statistics

Basic statistics to look at when exploring a new data set: N, Mean, SE Mean, StDev, Variance, CoefVar, Minimum, Q1, Median, Q3, Maximum, Range, Skewness. 95% range limits: Mean ± 1.96*StDev. Confidence interval: the range in which we are x% sure that the population mean (also the regression line) will fall. Prediction interval:[…]
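A minimal sketch of how the statistics listed above could be computed with Python's standard library. The function name `describe` and the sample values are made up for illustration. Quartiles use the default "exclusive" method of `statistics.quantiles`, and skewness is the simple moment-based version; other tools may report a bias-adjusted variant, so small differences are expected.

```python
import math
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    stdev = statistics.stdev(values)  # sample standard deviation (n - 1)
    q1, median, q3 = statistics.quantiles(values, n=4)

    # Moment-based skewness: third central moment over second moment ** 1.5.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n

    return {
        "N": n,
        "Mean": mean,
        "SE Mean": stdev / math.sqrt(n),
        "StDev": stdev,
        "Variance": statistics.variance(values),
        "CoefVar": 100 * stdev / mean,  # in percent
        "Minimum": min(values),
        "Q1": q1,
        "Median": median,
        "Q3": q3,
        "Maximum": max(values),
        "Range": max(values) - min(values),
        "Skewness": m3 / m2 ** 1.5,
        # Roughly 95% of values fall in this range if the data are normal.
        "95% range": (mean - 1.96 * stdev, mean + 1.96 * stdev),
    }


sample = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data only
summary = describe(sample)
for name, value in summary.items():
    print(name, value)
```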

# Qualitative Analysis and Forecasting

Qualitative Analysis and Forecasting: performed using human opinions rather than event-based data. A.k.a. Judgmental Forecasting, as opposed to statistical forecasting. Needs for Qualitative Analysis and Forecasting: – No data is available. Difficult data collection. Statistical knowledge required. – Long forecasting horizons (forecasting for the next 2-3 years while only 1 year of data is available).[…]

# Regression and Outliers

Outliers: Abnormal data points. Two types of outliers from a regression viewpoint: a point could be far from the other points, or far from the model, or both. Only if both should you remove it. How to check: a. Compare the following before and after removing the outlier: leverage, Cook's distance, DFFITS, coefficients, R², or even MSE (rule of thumb: 20% difference): if different[…]
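The before/after comparison can be sketched for a simple one-variable regression. This is an illustration under assumptions, not the full procedure: it only compares slope, intercept, R² and MSE (not leverage, Cook's distance or DFFITS), the helper names `fit_line`, `relative_change` and `is_influential` are hypothetical, and the data are invented. The 20% threshold is the rule of thumb mentioned above.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - my) ** 2 for y in ys)
    return {
        "slope": slope,
        "intercept": intercept,
        "r2": 1 - sse / sst,
        "mse": sse / (n - 2),  # two estimated parameters
    }


def relative_change(before, after):
    """Absolute change relative to the 'before' value."""
    if before == 0:
        return 0.0 if after == 0 else float("inf")
    return abs(after - before) / abs(before)


def is_influential(xs, ys, index, threshold=0.20):
    """Refit without the point at `index` and compare the two fits."""
    full = fit_line(xs, ys)
    reduced = fit_line(xs[:index] + xs[index + 1:], ys[:index] + ys[index + 1:])
    changes = {key: relative_change(full[key], reduced[key]) for key in full}
    return any(change > threshold for change in changes.values()), changes


# Illustrative data: the last point is far from the others and from the trend.
xs = [1, 2, 3, 4, 5, 10]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 5.0]
flagged, changes = is_influential(xs, ys, index=5)
print(flagged, changes)
```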

# SPSS Correlations

From the menu >> Choose Correlate: Then choose your variables: You will find the output in the background: NxN matrix of correlations.

# SPSS Neural Networks Tutorial

NN for time series: values of each lag Yt-1, Yt-2… become a new variable in the NN. Factors: nominal / categorical vars. Covariates: numeric vars. SPSS: Enter > Open Data File > choose your file. Choose from Analyze tab > Neural Networks > Multilayer perceptron. In the output tab: Classification Sample Observed Predicted No Yes Percent[…]
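The lag idea can be sketched outside SPSS: each row pairs the previous values Yt-1, Yt-2, … (the inputs) with the current value Yt (the target). The function name `make_lagged_rows` and the series are hypothetical, and this only prepares the data; it does not train a network.

```python
def make_lagged_rows(series, n_lags):
    """Turn a time series into (lag values, target) rows.

    Each row holds [Y(t-1), Y(t-2), ..., Y(t-n_lags)] and the target Y(t).
    The first n_lags observations have no complete history and are skipped.
    """
    rows = []
    for t in range(n_lags, len(series)):
        lags = [series[t - k] for k in range(1, n_lags + 1)]
        rows.append((lags, series[t]))
    return rows


series = [10, 20, 30, 40, 50]  # illustrative values only
for lags, target in make_lagged_rows(series, n_lags=2):
    print(lags, "->", target)
```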

# SAS Regression Analysis

SAS provides the capability to build regression models to analyze ordinal data correlations and use this model to predict out-of-sample values. Here is the process: Open SAS Studio > Go to Tasks > Linear Regression > the following window will open: Input Data settings: Model settings: One can add simple variables to the independent variables set.[…]

# Correlation Analysis and Dimensionality Reduction

Correlation Analysis is meant for analyzing how two ordinal variables co-vary (increase and decrease together). It helps in at least two fields: 1. Regression analysis: By choosing which variables should be included. 2. Dimensionality reduction: If 2+ independent variables are correlated (this is called: multi-collinearity), we can keep one of them for the regression or[…]
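A small sketch of the screening step: compute the Pearson correlation for every pair of independent variables and list the pairs that look collinear. The function names, the column names and the 0.8 cutoff are assumptions for illustration; the right threshold depends on the data and the model.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlated_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for pairs with |r| >= threshold."""
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


# Illustrative independent variables: x2 is almost a multiple of x1.
columns = {
    "x1": [1, 2, 3, 4, 5],
    "x2": [2, 4, 6, 8, 10.5],
    "x3": [5, 1, 4, 2, 3],
}
pairs = correlated_pairs(columns)
for a, b, r in pairs:
    print(a, b, round(r, 3))
```

Here only one of `x1` and `x2` would need to stay in the regression.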

# Qualitative variables to Dummy

Dummy Variables: Why: to include independent qualitative variables in forecasting. Vs. Logistic regression: the dependent (response) Y variable is qualitative. How: for each value of the qualitative variable X create 1 dummy variable, and assign it 1 when this value appears in a row, otherwise 0. Create total of N-1 dummy variables + constant or N variables without[…]
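A minimal sketch of the encoding, assuming the N-1 scheme with a constant in the model. The function name `to_dummies`, the `is_` prefix and the colour values are hypothetical. The dropped level (the first one alphabetically here) becomes the baseline, represented by all zeros.

```python
def to_dummies(values, drop_first=True):
    """Encode a qualitative variable as 0/1 dummy variables.

    With drop_first=True, N levels give N - 1 dummies (use with a constant).
    With drop_first=False, all N dummies are kept (use without a constant).
    """
    levels = sorted(set(values))
    kept = levels[1:] if drop_first else levels
    return [
        {f"is_{level}": int(value == level) for level in kept}
        for value in values
    ]


colours = ["red", "green", "blue", "green"]  # illustrative data only
for colour, row in zip(colours, to_dummies(colours)):
    print(colour, row)
```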

# Rules on Forecasting

Q, DW and Dickey-Fuller Unit Root tests: all test the ACF of the residuals (not of the fitted model) to decide whether the model passes (is acceptable) as a forecaster or not. Dickey-Fuller: the smaller the values, the better the model, if t Error measures (RMSE, SSE, RSS, MAE): after checking that all models are[…]
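For reference, the error measures and the Durbin-Watson statistic can be computed from the residuals as sketched below. The function names and the numbers are illustrative assumptions. SSE and RSS are the same quantity here (the sum of squared residuals). A DW value near 2 suggests no first-order autocorrelation in the residuals; values toward 0 or 4 suggest positive or negative autocorrelation.

```python
import math


def error_measures(actual, predicted):
    """Return SSE (= RSS), RMSE and MAE for a set of forecasts."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    n = len(residuals)
    sse = sum(e * e for e in residuals)
    return {
        "SSE": sse,
        "RMSE": math.sqrt(sse / n),
        "MAE": sum(abs(e) for e in residuals) / n,
    }


def durbin_watson(residuals):
    """Durbin-Watson statistic: squared successive differences over SSE."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


actual = [3, 5, 7, 9]      # illustrative observations
predicted = [2, 6, 6, 10]  # illustrative forecasts
measures = error_measures(actual, predicted)
dw = durbin_watson([a - p for a, p in zip(actual, predicted)])
print(measures, dw)
```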