TIME SERIES BASICS
Difference between regression and time series: time series observations are not necessarily independent and not necessarily identically distributed. A time series is a list of observations where the ordering matters; ordering is important because there is dependency, and changing the order could change the meaning of the data.
Characteristics:
- Is there a trend, meaning that, on average, the measurements tend to increase (or decrease) over time?
- Is there seasonality, meaning that there is a regularly repeating pattern of highs and lows related to calendar time such as seasons, quarters, months, days of the week, and so on?
- Are there outliers? In regression, outliers are far away from your line. With time series data, your outliers are far away from your other data.
- Is there a long-run cycle or period unrelated to seasonality factors?
- Is there constant variance over time, or is the variance non-constant?
- Are there any abrupt changes to either the level of the series or the variance?
Residual analysis: the correlation should be 0 between residuals separated by any given time span; in other words, residuals should be unrelated to each other. Residuals are theoretically assumed to have an ACF with correlation = 0 at all lags.
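A common way to check this in practice is the Ljung-Box test, which evaluates whether the residual autocorrelations up to some lag are jointly zero. A minimal sketch, assuming Python with statsmodels and numpy (the simulated residuals are illustrative stand-ins):

```python
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

# Illustrative stand-in for residuals from a fitted model; pure white
# noise here, so the test should typically fail to reject H0.
rng = np.random.default_rng(0)
resid = rng.normal(size=200)

# Ljung-Box test: H0 = autocorrelations up to the given lag are all 0.
# A large p-value is consistent with uncorrelated residuals.
print(acorr_ljungbox(resid, lags=[10]))
```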
What is (weakly) stationary: the autocorrelation for any particular lag is the same regardless of where we are in time
- The mean $latex E(x_t)$ is the same for all t.
- The variance of $latex x_t$ is the same for all t.
- The covariance (and also correlation) between $latex x_t$ and $latex x_{t-h}$ is the same for all t.
The distribution of the observations in a stationary time series does not depend on time.
Time series are stationary if they do not have trend or seasonal effects. Summary statistics calculated on the time series are consistent over time, like the mean or the variance of the observations.
When a time series is stationary, it can be easier to model. Many statistical modeling methods assume or require the time series to be stationary to be effective.
- Look at Plots: You can review a time series plot of your data and visually check if there are any obvious trends or seasonality.
- Summary Statistics: You can review the summary statistics for your data for seasons or random partitions and check for obvious or significant differences.
- Statistical Tests: You can use statistical tests to check whether the expectations of stationarity are met or have been violated, e.g., a unit root test such as the Augmented Dickey-Fuller (ADF) test (a minimal sketch follows this list).
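A minimal sketch of the ADF test, assuming Python with statsmodels (the two simulated series are illustrative):

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(1)
walk = np.cumsum(rng.normal(size=500))   # random walk: has a unit root
noise = rng.normal(size=500)             # white noise: stationary

for name, series in [("random walk", walk), ("white noise", noise)]:
    stat, pvalue, *_ = adfuller(series)
    # H0: the series has a unit root (is non-stationary).
    # A small p-value is evidence against the unit root.
    print(f"{name}: ADF statistic = {stat:.2f}, p-value = {pvalue:.3f}")
```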
Auto-correlation Function (ACF):
For an ACF to make sense, the series must be weakly stationary. The ACF can be used to identify the possible structure of time series data. The ideal for a sample ACF of residuals is that there aren't any significant correlations at any lag.
Partial Auto-correlation Function (PACF):
For a time series, the partial auto-correlation between $latex x_t$ and $latex x_{t-h}$ is defined as the conditional correlation between $latex x_t$ and $latex x_{t-h}$, conditional on $latex x_{t-h+1}, \ldots, x_{t-1}$, the set of observations that come between the time points t and t−h. (The two variances in the denominator will equal each other in a stationary series.)
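Sample ACF and PACF plots are the usual way to inspect these quantities. A minimal sketch, assuming Python with statsmodels and matplotlib (the AR(1) series with $latex \phi_1 = 0.7$ is simulated for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Simulate a stationary AR(1) with phi_1 = 0.7 for illustration.
rng = np.random.default_rng(2)
x = np.zeros(500)
for t in range(1, 500):
    x[t] = 0.7 * x[t - 1] + rng.normal()

fig, axes = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(x, lags=24, ax=axes[0])   # sample autocorrelations with ~95% bands
plot_pacf(x, lags=24, ax=axes[1])  # sample partial autocorrelations
plt.tight_layout()
plt.show()
```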
Cross-correlation: the correlation between one series and the lags of another series, used to identify lead/lag relationships between two time series.
De-trending: de-trend each series using a linear regression with t, the index of time, as the predictor variable. The de-trended values for each series are the residuals from this linear regression on t. De-trending is useful conceptually because it takes away the common steering force that time may have on each series and helps create stationarity.
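A minimal de-trending sketch, assuming Python with numpy (the trending series is made up for illustration):

```python
import numpy as np

def detrend(series):
    """Regress the series on t (the index of time) and return the residuals."""
    t = np.arange(len(series))
    slope, intercept = np.polyfit(t, series, 1)  # least-squares line in t
    return series - (intercept + slope * t)

# Toy series: linear trend plus noise; the residuals carry no trend.
rng = np.random.default_rng(3)
y = 0.5 * np.arange(200) + rng.normal(scale=5, size=200)
y_detrended = detrend(y)
print(round(y_detrended.mean(), 6))  # ~0 by construction of OLS residuals
```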
UNIVARIATE TIME SERIES ANALYSIS
The First-order Autoregression Model (AR(1)) is
$latex x_t = \delta + \phi_1 x_{t-1} + \omega_t$
Assumptions:
- $latex \omega_t \overset{iid}{\sim} N(0, \sigma_\omega^2)$, meaning that the errors are independently distributed with a normal distribution that has mean 0 and constant variance.
- The errors $latex \omega_t$ are independent of the past values of the series $latex x_{t-1}, x_{t-2}, \ldots$.
- The series $latex x_1, x_2, \ldots$ is (weakly) stationary. A requirement for a stationary AR(1) is that $latex |\phi_1| < 1$.
Statistics:
- $latex Var(x_t) = \frac{\sigma_\omega^2}{1-\phi_1^2}$
- The correlation between observations h time periods apart is $latex \rho_h = \phi_1^h$.
Pattern of ACF: tails off gradually toward 0 (for an AR(1), exponential decay) rather than cutting off at a fixed lag.
Pattern of PACF: the theoretical PACF “shuts off” past the order of the model. Use PACF to choose an AR model.
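To make this concrete, here is a minimal sketch of simulating an AR(1) and recovering $latex \phi_1$, assuming Python with statsmodels (parameter values are illustrative):

```python
import numpy as np
from statsmodels.tsa.arima_process import arma_generate_sample
from statsmodels.tsa.arima.model import ARIMA

# Simulate an AR(1) with phi_1 = 0.6; statsmodels writes the AR lag
# polynomial as ar = [1, -phi_1].
rng = np.random.default_rng(4)
x = arma_generate_sample(ar=[1, -0.6], ma=[1], nsample=500,
                         distrvs=rng.standard_normal)

# An AR(1) is ARIMA(1, 0, 0); the ar.L1 estimate should be near 0.6.
res = ARIMA(x, order=(1, 0, 0)).fit()
print(res.params)  # const, ar.L1, sigma2
```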
Moving Average Models (MA)
The moving-average model specifies that the output variable depends linearly on the current and various past values of a stochastic (imperfectly predictable) term (white noise).
The first order moving average model MA(1) is
$latex x_t = \mu + \omega_t + \theta_1 \omega_{t-1}$, where $latex \omega_t \overset{iid}{\sim} N(0, \sigma_\omega^2)$.
Statistics:
- $latex Var(x_t) = \sigma_\omega^2(1 + \theta_1^2)$
- The autocorrelation function (ACF) is $latex \rho_1 = \frac{\theta_1}{1+\theta_1^2}$ and $latex \rho_h = 0$ for $latex h \ge 2$.
A property of MA(q) models in general is that the autocorrelations are nonzero for the first q lags and 0 for all lags > q. Use this property to choose q. Unlike AR models, MA models are always stationary.
Pattern of ACF: nonzero through lag q, then cuts off to 0. Use ACF to choose an MA model.
Pattern of PACF: tails off gradually toward 0 rather than cutting off.
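A minimal sketch of the lag-q cutoff, assuming Python with statsmodels (the MA(1) with $latex \theta_1 = 0.8$ is simulated for illustration):

```python
import numpy as np
from statsmodels.tsa.arima_process import arma_generate_sample
from statsmodels.tsa.stattools import acf

# Simulate an MA(1) with theta_1 = 0.8; statsmodels writes the MA lag
# polynomial as ma = [1, theta_1].
rng = np.random.default_rng(5)
x = arma_generate_sample(ar=[1], ma=[1, 0.8], nsample=2000,
                         distrvs=rng.standard_normal)

r = acf(x, nlags=5)
print(np.round(r, 3))
# Theory: rho_1 = 0.8 / (1 + 0.8**2) ≈ 0.488 and rho_h ≈ 0 for h >= 2,
# the "cuts off after lag q" pattern used to choose q.
```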
Auto-regressive Integrated Moving Average (ARIMA):
Elements of the model: AR order (p), degree of differencing (d), MA order (q); together written ARIMA(p, d, q).
ROUTINE:
- Plot the series and check for trend, seasonality, outliers, and non-constant variance.
- Difference (and/or transform) the series until it looks stationary; this sets d.
- Examine the ACF and PACF of the stationary series to choose candidate p and q.
- Fit the candidate ARIMA(p, d, q) models and compare them (e.g., by information criteria).
- Check the residuals: the sample ACF of the residuals should show no significant correlations.
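A minimal sketch of this routine, assuming Python with statsmodels (the random-walk series and the ARIMA(0, 1, 0) choice are illustrative):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

# Illustrative series: a random walk with drift, so one difference
# (d = 1) makes it stationary.
rng = np.random.default_rng(6)
y = np.cumsum(0.1 + rng.normal(size=400))

# In practice, inspect the ACF/PACF of np.diff(y) to pick p and q;
# here the data-generating process implies ARIMA(0, 1, 0).
res = ARIMA(y, order=(0, 1, 0)).fit()

# Residual check: the residual ACF should show no significant correlations.
print(acorr_ljungbox(res.resid, lags=[10]))
print(res.aic)  # compare AIC across candidate (p, d, q) choices
```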
MULTIVARIATE TIME SERIES
Vector Autoregressive models (VAR):
- The structure is that each variable is a linear function of past lags of itself and past lags of the other variables. u_t includes terms to simultaneously fit the constant and trend.
- Use information criterion statistics (e.g., AIC, BIC/SC, HQ, FPE) to compare VAR models of different orders.
- Use ACF plot of residuals
- Examine cross-correlation of residuals
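A minimal VAR sketch covering order selection, fitting, and residual access, assuming Python with statsmodels and pandas (the two-variable system is simulated for illustration):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

# Illustrative two-variable system: y2 depends on its own lag and on
# the lag of y1.
rng = np.random.default_rng(7)
n = 300
y1, y2 = np.zeros(n), np.zeros(n)
for t in range(1, n):
    y1[t] = 0.5 * y1[t - 1] + rng.normal()
    y2[t] = 0.3 * y2[t - 1] + 0.4 * y1[t - 1] + rng.normal()
data = pd.DataFrame({"y1": y1, "y2": y2})

model = VAR(data)
print(model.select_order(maxlags=8).summary())  # AIC/BIC/FPE/HQIC by order
res = model.fit(1)                              # fit a VAR(1)
resid = res.resid  # residuals, for ACF and cross-correlation checks
print(res.summary())
```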