时间序列分析理论及R代码实现

第一章: 绪论
第二章: 基本概念
第三章：趋势
第四章: 平稳时间序列模型
第五章: 非平稳时间序列模型
第六章: 模型识别
第七章: 参数估计
第八章：模型诊断
第九章: 预测

第一章: 绪论

什么是时间序列？
和统计里的样本有什么区别？
研究目的？研究方法？
统计指标？分析软件？
时间序列分析步骤？
常用的ARIMA模型？

例1. 加利福尼亚州洛杉矶地区100多年来的年降水量时间序列图。

#collapse=TRUE, results='hold', fig.height = 8,fig.width = 6, fig.show=‘hold’}
#options(width = 50)
data(larain)
summary(larain)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   4.080   9.675  14.140  14.888  18.400  40.290

plot(larain, ylab='Inches', xlab='Year', type='o', main='图1-1 洛杉矶年降水量时序图')

plot(y=larain,x=zlag(larain),ylab='Inches',xlab='Previous Year Inches', main='图1-2 洛杉矶当年降水量与去年降水量散点图')

例2. 某化工过程中颜色属性的时间序列图。

data(color)
summary(color)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   63.00   70.00   76.00   74.89   79.00   87.00

plot(color, ylab='Color Property', xlab='Batch', type='l', main='图1-3. 某化工过程中颜色属性的时间序列图')
points(color, col='blue')

plot(y=color, x=zlag(color), ylab='Color Property', xlab='Prevous Batch Color Property', main='图1-4.当前颜色值与前期颜色值的散点图')

例3. 加拿大野兔丰度的时间序列图。

data(hare)
summary(hare)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.00   20.00   27.00   39.55   65.00   95.00

plot(hare, ylab='Abundance', xlab='Year', type='l', main='图1-5. 加拿大野兔丰度的时间序列图')
points(hare, col='blue')

plot(y=hare, x=zlag(hare), ylab='Abundance', xlab='Previous Year Abundance', main='图1-6.当年与上一年野兔丰度的散点图')

例4. 艾奥瓦州迪比克市月平均气温的时间序列图。

data(tempdub)
summary(tempdub)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    8.40   27.32   48.60   46.27   63.12   74.00

plot(tempdub, ylab='Temperature', type='l', main='图1-7. 艾奥瓦州迪比克市月平均气温的时间序列图')
points(tempdub, col='blue')

plot(y=tempdub, x=zlag(tempdub,d=12), ylab='Temperature', xlab='Previous Year Temperature', main='图1-7’.当年与上一年月平均气温的散点图')

例5. 滤油器月销售量的时间序列图。

data(oilfilters)
summary(oilfilters)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1572    2314    3262    3388    4488    5862

plot(oilfilters, type='l', ylab='Sales', main='图1-8. 滤油器月销售量的时间序列图')
points(oilfilters, col='blue')

plot(y=oilfilters, x=zlag(oilfilters,d=12), ylab='Sales', xlab='Previous Year Sales', main='图1-8’.当年与上一年月销售量的散点图')

plot(oilfilters,type='l',ylab='Sales', main='图1-9. 以特殊符号绘制的滤油器月销售量')
points(y=oilfilters,x=time(oilfilters), pch=as.vector(season(oilfilters)))

习题1.（on P.7）

Question 3~5.

###  rnorm(n), rchisq(n,df=2), rt(n,df=5)
n=48;  
set.seed(12345); tsnorm1 = ts(rnorm(n),freq=1,start=1)
set.seed(1234); tsnorm2 = ts(rnorm(n),freq=1,start=1)
set.seed(123); tsnorm3 = ts(rnorm(n),freq=1,start=1)
set.seed(12); tsnorm4 = ts(rnorm(n),freq=1,start=1)
opar=par(mfrow=c(2,2)) 
plot(tsnorm1,type='o'); abline(h=0,col='red')
plot(tsnorm2,type='o'); abline(h=0,col='red')
plot(tsnorm3,type='o'); abline(h=0,col='red')
plot(tsnorm4,type='o'); abline(h=0,col='red')

par(opar)

Question 6.

data(tempdub)
#op <- par(bg = "light blue")
plot(tempdub,ylab='Temperature',type='l',col = "blue")
Month=c("J","A","S","O","N","D","J","F","M","A","M","J")
points(tempdub,pch=Month,col='red')

#points(tempdub,pch=as.vector(season(tempdub)),col='red')
#par(op)

第二章: 基本概念

均值函数，方差函数 \[ \mu_t=E(X_t), \sigma_t^2=Var(X_t). \]
自协方差函数

\[ \gamma_{t,s} = Cov(X_t,X_s) = E(X_t,X_s)-\mu_t\mu_s. \]

自相关函数

\[ \rho_{t,s} = Corr(X_t,X_s) =\frac {\gamma_{t,s}} {\sqrt{\gamma_{t,t} \gamma_{s,s}}}. \]

严(强)平稳性

任意有限维分布都与初始时间点无关，只与时间间隔相关。如果期望和方差都存在的话，那么我们就有 \[ X_t \overset{d} {=} X_s. \qquad \mu_t=\mu_s\equiv\mu, \qquad \sigma_t^2=\sigma_s^2\equiv\sigma^2\] \[ \left( X_{t_1},X_{t_2}\right) \overset{d} {=} \left(X_{t_1+k},X_{t_2+k}\right). \qquad \gamma_{t,s}=\gamma_{0,\left|t-s\right|}\] \[ \left(X_{t_1},X_{t_2},\cdots,X_{t_n}\right) \overset{d} {=} \left(X_{t_1+k},X_{t_2+k},\cdots,X_{t_n+k}\right).\]

Let \(\quad \gamma_k=Cov(X_t,X_{t+k}), \quad \text{then} \quad \rho_k=Corr(X_t,X_{t+k})=\frac {\gamma_{t,t+k}}{\sqrt{\gamma_{t,t}\gamma_{t+k,t+k}}}=\frac{\gamma_k}{\gamma_0}.\)

宽(弱)平稳性

均值函数和方差函数都是常数：\(\qquad \mu_t\equiv\mu, \qquad \sigma_t^2\equiv\sigma^2\)

协方差仅依赖于时间的滞后：\(\qquad \gamma_k=\gamma_{t,t+k}=\gamma_{0,k}.\)

Let \(\quad \gamma_k=Cov(X_t,X_{t+k}), \quad \text{then} \quad \rho_k=Corr(X_t,X_{t+k})=\frac {\gamma_{t,t+k}}{\sqrt{\gamma_{t,t}\gamma_{t+k,t+k}}}=\frac{\gamma_k}{\gamma_0}.\)

例1. 白噪声序列 {\(e_t, i.i.d.\)}, 平稳的 \[\begin{align} \gamma_k=\begin{cases} \sigma_e^2, &\text{if } k=0\\ 0, &\text{if } k>0 \end{cases} \qquad\qquad\qquad &\rho_k=\begin{cases} 1, &\text{if } k=0\\ 0, &\text{if } k>0\end{cases} \end{align}\]

n=100;
opar=par(mfrow=c(2,2)) 
plot(rnorm(n),xlab='time',type='o');abline(h=0,col='red')
plot(rchisq(n,2),xlab='time',type='o');abline(h=2,col='red')
plot(rt(n,5),xlab='time',type='o');abline(h=0,col='red')
plot(rexp(n),xlab='time',type='o');abline(h=1,col='red')

par(opar)

例2. 滑动平均序列 {\(X_t=\frac{e_t+e_{t-1}}{2}\)}, 平稳的 \[\begin{align} \gamma_k=\begin{cases} 0.5\cdot\sigma_e^2, &\text{if } k=0\\ 0.25\cdot\sigma_e^2, &\text{if } k=1\\ 0, &\text{if } k>1 \end{cases} \qquad\qquad\qquad &\rho_k=\begin{cases} 1, &\text{if } k=0\\ 0.5, &\text{if } k=1\\ 0, &\text{if } k>1\end{cases} \end{align}\]

n=100;
opar=par(mfrow=c(2,2)) 
tsnorm=rnorm(n)
tsMA1=ts((tsnorm+zlag(tsnorm))/2.0,freq=1,start=1)
plot(tsMA1,type='o',ylab='MA of N(0,1)'); 
abline(h=0,col='red')

tschisq=rchisq(n,2)
tsMA2=ts((tschisq+zlag(tschisq))/2.0,freq=1,start=1)
plot(tsMA2,type='o',ylab='MA of chisq(2)'); 
abline(h=2,col='red')

tst=rt(n,5)
tsMA3=ts((tst+zlag(tst))/2.0,freq=1,start=1)
plot(tsMA3,type='o',ylab='MA of t(5)'); 
abline(h=0,col='red')

tsexp=rexp(n)
tsMA4=ts((tsexp+zlag(tsexp))/2.0,freq=1,start=1)
plot(tsMA4,type='o',ylab='MA of exp(1)'); 
abline(h=1,col='red')

par(opar)

例3. 随机游动序列 {\(Y_t=e_1+e_2+\cdots+e_t\)}, 非平稳的 \[\mu_t=t\cdot\mu, \qquad\qquad \sigma_t^2=t\cdot\sigma_e^2.\]

\[\begin{align} \gamma_{t,s}&=Cov(e_1+e_2+\cdots+e_t,e_1+e_2+\cdots+e_s)\\ &=Var(e_1+e_2+\cdots+e_{min(t,s)})\\ &=min(t,s)\cdot \sigma_e^2 \end{align}\]

\[\rho_{t,s}=\frac{\gamma_{t,s}}{\sqrt{\gamma_{t,t}\gamma_{s,s}}}=\frac{min(t,s)\cdot \sigma_e^2}{\sqrt{t \sigma_e^2 \cdot s\sigma_e^2}}=\frac{\sqrt{min(t,s)}}{\sqrt{max(t,s)}}\]

data(rwalk)
n<-length(rwalk)
plot(rwalk,type='o',ylab='Random Walk', main = '随机游动的时间序列图1')

###生成随机游动过程方法1
Rwalk <- vector()
Rwalk[1] <- rnorm(1)
for (i in 2:n){
  set.seed(123+i*456)
  Rwalk[i] <- Rwalk[i-1] + rnorm(1)
}
ts_Rwalk <- ts(Rwalk,freq=1,start=1)

plot(ts_Rwalk,type='l',ylab='Random Walk',xlab='Time',main='随机游动的时间序列图2')
points(ts_Rwalk,col='blue')

###生成随机游动过程方法2：cumsum(rnorm(n))
opar=par(mfrow=c(2,2))  
plot(cumsum(rnorm(n)),type='o',xlab='Time');
plot(cumsum(rchisq(n,2)),type='o',xlab='Time');
plot(cumsum(rt(n,5)),type='o',xlab='Time');
plot(cumsum(rexp(n,1)),type='o',xlab='Time');

par(opar)

第三章：趋势

3.1 随机趋势

受随机扰动的影响，不随时间衰减；
在不同的模拟中，可能展现完全不同的趋势。

n=60;
opar=par(mfrow=c(2,2))  
plot(cumsum(rnorm(n)),type='o',xlab='Time');
plot(cumsum(rnorm(n)),type='o',xlab='Time');
plot(cumsum(rnorm(n)),type='o',xlab='Time');
plot(cumsum(rnorm(n)),type='o',xlab='Time');

par(opar)

3.2 确定趋势

例4. 艾奥瓦州迪比克市月平均气温的时间序列图。

data(tempdub)
plot(tempdub, ylab='Temperature', type='l', main='图1-7. 艾奥瓦州迪比克市月平均气温的时间序列图')
points(tempdub, col='blue')

均值模型： \(Y_t=\mu_t+X_t\)，其中\(E(X_t)=0.\)

随机扰动\(X_t\) 是零均值的平稳序列；
均值过程\(\mu_t\)随着时间\(t\)的变化，表现出特定的变化模式：

\[ \mu_t=\begin{cases} \mu_0, & \text{常函数} \\ \beta_0+\beta_1 t, & \text{线性函数} \\ \beta_0+\beta_1 t+\beta_2 t^2, & \text{二次函数} \\ \cdots， & \text{多项式函数} \\ \mu_{t-T}, & \text{周期函数} \\ \end{cases}\]

3.3 均值函数为常数

\[\quad\hat{\mu}=\bar{Y}, \qquad E(\hat{\mu})=\mu_0, \qquad \text{无偏估计量}\]

\({X_t}\)是平稳的时间序列，自协方差函数为\(\gamma_k\), 自相关函数为\(\rho_k\).

\[Var(\hat{\mu})=\frac{\gamma_0}{n} [1+2\sum_{k=1}^{n-1}(1-\frac{k}{n})\rho_k].\]

例1’. 白噪声序列 {\(X_t=e_t, i.i.d.\)}, 平稳的, \(Var(\hat{\mu})=\frac{\gamma_0}{n}.\)

\[\begin{gathered} \gamma_k=\begin{cases} \sigma_e^2, &\text{if } k=0\\ 0, &\text{if } k>0 \end{cases} \qquad\qquad\qquad \rho_k=\begin{cases} 1, &\text{if } k=0\\ 0, &\text{if } k>0\end{cases} \end{gathered}\]

例2’. 滑动平均序列{\(X_t=\frac{1}{2}e_t+\frac{1}{2}e_{t-1}\)}, 平稳的, \(Var(\hat{\mu})=\frac{\gamma_0}{n}[1+\frac{n-1}{n}]=\frac{\gamma_0}{n}(\frac{2n-1}{n})\simeq 2\frac{\gamma_0}{n}.\)

\[\begin{align} \gamma_k=\begin{cases} 0.5\cdot\sigma_e^2, &\text{if } k=0\\ 0.25\cdot\sigma_e^2, &\text{if } k=1\\ 0, &\text{if } k>1 \end{cases} \qquad\qquad\qquad &\rho_k=\begin{cases} 1, &\text{if } k=0\\ 0.5, &\text{if } k=1\\ 0, &\text{if } k>1\end{cases} \end{align}\]

\(\qquad\) 滑动平均序列 {\(X_t=e_t-\frac{1}{2}e_{t-1}\)}, 平稳的, \(Var(\hat{\mu})=\frac{\gamma_0}{n}[1-0.8(\frac{n-1}{n})]=\frac{\gamma_0}{n}(\frac{0.2n+0.8}{n})\simeq 0.2\frac{\gamma_0}{n}.\)

\[\begin{align} \gamma_k=\begin{cases} 1.25\cdot\sigma_e^2, &\text{if } k=0\\ -0.5\cdot\sigma_e^2, &\text{if } k=1\\ 0, &\text{if } k>1 \end{cases} \qquad\qquad\qquad &\rho_k=\begin{cases} 1, &\text{if } k=0\\ -0.4, &\text{if } k=1\\ 0, &\text{if } k>1\end{cases} \end{align}\]

例5. 一般的平稳序列，随着滞后的增加，自相关函数迅速衰减，\(\sum_{k=0}^{+\infty}|\rho_k| < \infty.\) 当\(n\)充分大的时候， \(Var(\hat{\mu}) \simeq \frac{\gamma_0}{n}[\sum_{k=0}^{+\infty}\rho_k].\)

\(\qquad\) 例如，对于所有的整数\(k\), \(\rho_k=\phi^{|k|}\), \(\phi\in(-1, +1).\) 那么当\(n\)充分大的时候，\(Var(\hat{\mu}) \simeq \frac{\gamma_0}{n}[1+\frac{2\phi}{1-\phi}]=\frac{(1+\phi)\gamma_0}{(1-\phi)n}.\)

\[ \qquad\]

\({X_t}\)是非平稳的时间序列，自协方差函数为\(\gamma_{t,s}\), 自相关函数为\(\rho_t,s\).

\[Var(\hat\mu)=\frac{1}{n^2}\sum_{i=1}^n\sum_{j=1}^n Cov(X_i,X_j)=\frac{1}{n^2}[\sum_{i=1}^n Var(X_i)+2\sum_{i=2}^n\sum_{j=1}^{i-1} \gamma_{i,j}]\]

例3’. 随机游动序列 {\(X_t=e_1+e_2+\cdots+e_t\)}, 非平稳的, \(\mu_t=t\cdot\mu, \quad \sigma_t^2=t\cdot\sigma_e^2.\)

\[\begin{align} \gamma_{t,s}&=Cov(e_1+e_2+\cdots+e_t,e_1+e_2+\cdots+e_s)\\ &=Var(e_1+e_2+\cdots+e_{min(t,s)})\\ &=min(t,s)\cdot \sigma_e^2 \end{align}\]

\[\rho_{t,s}=\frac{\gamma_{t,s}}{\sqrt{\gamma_{t,t}\gamma_{s,s}}}=\frac{min(t,s)\cdot \sigma_e^2}{\sqrt{t \sigma_e^2 \cdot s\sigma_e^2}}=\frac{\sqrt{min(t,s)}}{\sqrt{max(t,s)}}\]

\[\begin{align} Var(\hat{\mu})&=\frac{1}{n^2}\left [\sum_{i=1}^n i\sigma_e^2+2\sum_{i=2}^n\sum_{j=1}^{i-1} j\sigma_e^2\right ]\\ &=\frac{\sigma_e^2}{n^2}\left [\sum_{i=1}^n i+ 2\sum_{j=1}^{n-1}\sum_{i=j+1}^n j\right ]\\ &=\frac{\sigma_e^2}{n^2}\left [\sum_{i=1}^n i+ 2\sum_{j=1}^{n-1} (n-j) j\right ] \\ &=\frac{\sigma_e^2}{n^2}\left [(1+2n)\sum_{i=1}^n i- 2\sum_{j=1}^{n} j^2\right ] \\ &=\frac{\sigma_e^2}{n^2}\left [(1+2n)\frac{n(n+1)}{2}- 2 \frac{n(n+1)(2n+1)}{6}\right ] \\ &=\sigma_e^2 (2n+1) \frac{(n+1)}{6n} \end{align}\]

3.4 回归方法

\(\qquad\) {\(X_t\)}满足Gauss-Markov假设(零均值，等方差，不相关).

时间的线性趋势 \(\mu_t=\beta_0+\beta_1 t\)，

\(\qquad\) 最小二乘法估计：最小化 \(Q(\beta_0,\beta_1)=\sum_{t=1}^n [Y_t-(\beta_0+\beta_1 t)]^2\), 令偏导数为0，求解。

\[\begin{align} \hat{\beta_1} &= \frac{\sum_{t=1}^n t(Y_t-\bar{Y})}{\sum_{t=1}^n (t-\bar{t})^2}=\frac{\sum_{t=1}^n (t-\bar{t})(Y_t-\bar{Y})}{\sum_{t=1}^n (t-\bar{t})^2}\\ \hat{\beta_0} &= \bar{Y}-\hat{\beta_1}\bar{t} \end{align}\]

时间的二次函数 \(\mu_t=\beta_0+\beta_1 t +\beta_2 t^2\)

data(wages)
plot(wages,type='l',ylab='Wages',xlab='Time',main='Monthly wages')
points(wages,col='blue')

\(\qquad\) 时间的一次函数拟合

lm_wages=lm(wages~time(wages))
summary(lm_wages)

## 
## Call:
## lm(formula = wages ~ time(wages))
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.23828 -0.04981  0.01942  0.05845  0.13136 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -5.490e+02  1.115e+01  -49.24   <2e-16 ***
## time(wages)  2.811e-01  5.618e-03   50.03   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.08257 on 70 degrees of freedom
## Multiple R-squared:  0.9728, Adjusted R-squared:  0.9724 
## F-statistic:  2503 on 1 and 70 DF,  p-value: < 2.2e-16

plot(wages,type='l',ylab='Wages',xlab='Time',main='一次时间函数趋势')
points(wages,col='blue')
abline(lm_wages,col='red')

最小二乘估计的斜率\(\hat{\beta}_1=0.1341\) 和截距 \(\hat{\beta}_0=-1.008\). 图表3-2展示了随机游动并叠加了最小二乘回归趋势线。

\(\qquad\) 二次时间函数拟合

lm2_wages=lm(wages~time(wages)+I(time(wages)^2))
summary(lm2_wages)

## 
## Call:
## lm(formula = wages ~ time(wages) + I(time(wages)^2))
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.148318 -0.041440  0.001563  0.050089  0.139839 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      -8.495e+04  1.019e+04  -8.336 4.87e-12 ***
## time(wages)       8.534e+01  1.027e+01   8.309 5.44e-12 ***
## I(time(wages)^2) -2.143e-02  2.588e-03  -8.282 6.10e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.05889 on 69 degrees of freedom
## Multiple R-squared:  0.9864, Adjusted R-squared:  0.986 
## F-statistic:  2494 on 2 and 69 DF,  p-value: < 2.2e-16

xfit <- time(wages)
yfit <- fitted(lm2_wages)
plot(wages,type='l',ylab='Wages',xlab='Time',main='二次时间函数趋势')
points(wages,col='blue')
lines(as.vector(xfit),as.vector(yfit),col='red')

周期性或季节性趋势

data(tempdub)
month.=season(tempdub) 
model2=lm(tempdub~month.-1) # -1 removes the intercept term 
summary(model2)

## 
## Call:
## lm(formula = tempdub ~ month. - 1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.2750 -2.2479  0.1125  1.8896  9.8250 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## month.January     16.608      0.987   16.83   <2e-16 ***
## month.February    20.650      0.987   20.92   <2e-16 ***
## month.March       32.475      0.987   32.90   <2e-16 ***
## month.April       46.525      0.987   47.14   <2e-16 ***
## month.May         58.092      0.987   58.86   <2e-16 ***
## month.June        67.500      0.987   68.39   <2e-16 ***
## month.July        71.717      0.987   72.66   <2e-16 ***
## month.August      69.333      0.987   70.25   <2e-16 ***
## month.September   61.025      0.987   61.83   <2e-16 ***
## month.October     50.975      0.987   51.65   <2e-16 ***
## month.November    36.650      0.987   37.13   <2e-16 ***
## month.December    23.642      0.987   23.95   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.419 on 132 degrees of freedom
## Multiple R-squared:  0.9957, Adjusted R-squared:  0.9953 
## F-statistic:  2569 on 12 and 132 DF,  p-value: < 2.2e-16

model3=lm(tempdub~month.) # intercept is automatically included so one month (Jan) is dropped
summary(model3)

## 
## Call:
## lm(formula = tempdub ~ month.)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.2750 -2.2479  0.1125  1.8896  9.8250 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       16.608      0.987  16.828  < 2e-16 ***
## month.February     4.042      1.396   2.896  0.00443 ** 
## month.March       15.867      1.396  11.368  < 2e-16 ***
## month.April       29.917      1.396  21.434  < 2e-16 ***
## month.May         41.483      1.396  29.721  < 2e-16 ***
## month.June        50.892      1.396  36.461  < 2e-16 ***
## month.July        55.108      1.396  39.482  < 2e-16 ***
## month.August      52.725      1.396  37.775  < 2e-16 ***
## month.September   44.417      1.396  31.822  < 2e-16 ***
## month.October     34.367      1.396  24.622  < 2e-16 ***
## month.November    20.042      1.396  14.359  < 2e-16 ***
## month.December     7.033      1.396   5.039 1.51e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.419 on 132 degrees of freedom
## Multiple R-squared:  0.9712, Adjusted R-squared:  0.9688 
## F-statistic: 405.1 on 11 and 132 DF,  p-value: < 2.2e-16

# first creates the first pair of harmonic functions and then fit the model
har.=harmonic(tempdub,1)##a matrix consisting of \cos(2k ?? t), \sin(2k ?? t), k=1,2,...,m, excluding any zero functions.
model4=lm(tempdub~har.)
summary(model4)

## 
## Call:
## lm(formula = tempdub ~ har.)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.1580  -2.2756  -0.1457   2.3754  11.2671 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      46.2660     0.3088 149.816  < 2e-16 ***
## har.cos(2*pi*t) -26.7079     0.4367 -61.154  < 2e-16 ***
## har.sin(2*pi*t)  -2.1697     0.4367  -4.968 1.93e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.706 on 141 degrees of freedom
## Multiple R-squared:  0.9639, Adjusted R-squared:  0.9634 
## F-statistic:  1882 on 2 and 141 DF,  p-value: < 2.2e-16

tempFit4 <- ts(fitted(model4),freq=12,start=c(1964,1))
plot(tempFit4,ylab='Temperature',type='l',
ylim=range(c(fitted(model4),tempdub))) # the ylim option ensures that the 
# y axis has a range that fits the raw data and the fitted values
points(tempdub, col='blue')
points(tempFit4,col='red')

3.5 回归估计的性质及残差分析

参考“多元线性回归分析”http://www.homepage.zjut.edu.cn/yjq/

随机项的估计和预测（残差） \[\hat X_t=Y_t-\hat Y_t\]

检验残差\(\hat X_t\)是否为白噪声序列：零均值，等方差，不相关（正态独立）。

# Exhibit 3.8
plot(y=rstudent(model3), x=as.vector(time(tempdub)), xlab='Time', ylab='Standardized Residuals', type='o')
abline(h=0, col='red')

# Exhibit 3.9
plot(y=rstudent(model3), x=as.vector(time(tempdub)), xlab='Time', ylab='Standardized Residuals', type='l')
points(y=rstudent(model3), x=as.vector(time(tempdub)), pch=as.vector(season(tempdub)),  col='blue')
abline(h=0, col='red')

# Exhibit 3.10
plot(y=rstudent(model3), x=as.vector(fitted(model3)), xlab='Fitted Trend Values',
 ylab='Standardized Residuals',type="n")
points(y=rstudent(model3), x=as.vector(fitted(model3)), pch=as.vector(season(tempdub)))

# Exhibit 3.11
hist(rstudent(model3), xlab='Standardized Residuals', main='')

# Exhibit 3.12
qqnorm(rstudent(model3), main='QQ plot')
qqline(rstudent(model3), col='red')

#Shapiro-Wilk检验: H_0 正态
shapiro.test(rstudent(model3))

## 
##  Shapiro-Wilk normality test
## 
## data:  rstudent(model3)
## W = 0.9929, p-value = 0.6954

#t检验: H_0 均值为0
t.test(residuals(model3))

## 
##  One Sample t-test
## 
## data:  residuals(model3)
## t = -1.1355e-15, df = 143, p-value = 1
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -0.5410872  0.5410872
## sample estimates:
##     mean of x 
## -3.108378e-16

假定样本平稳，序列具有不变的均值和方差，定义滞后\(k\)的样本自相关函数为 \[ r_k=\frac{\sum_{t=k+1}^n (Y_t-\overline Y)(Y_{t-k}-\overline Y)}{\sum_{t=1}^n (Y_t-\overline Y)^2}, \qquad k=1,2,\cdots\] \(r_k\)相对滞后\(k\)的图形被称作自相关函数图。

# Exhibit 3.13
acf(rstudent(model3), main='ACF')

# 游程检验：H_0 独立
runs(rstudent(model3))

## $pvalue
## [1] 0.216
## 
## $observed.runs
## [1] 65
## 
## $expected.runs
## [1] 72.875
## 
## $n1
## [1] 69
## 
## $n2
## [1] 75
## 
## $k
## [1] 0

例3”. 随机游动过程，错误地当成线性时间趋势处理：

data(rwalk)
plot(rwalk,type='o',ylab='rWalk',main='图表3-2* 随机游动')

model1=lm(rwalk~time(rwalk))
summary(model1)

## 
## Call:
## lm(formula = rwalk ~ time(rwalk))
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.70045 -0.79782  0.06391  0.63064  2.22128 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -1.007888   0.297245  -3.391  0.00126 ** 
## time(rwalk)  0.134087   0.008475  15.822  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.137 on 58 degrees of freedom
## Multiple R-squared:  0.8119, Adjusted R-squared:  0.8086 
## F-statistic: 250.3 on 1 and 58 DF,  p-value: < 2.2e-16

最小二乘估计的斜率\(\hat{\beta}_1=0.1341\) 和截距 \(\hat{\beta}_0=-1.008\). 图表3-2展示了随机游动并叠加了最小二乘回归趋势线。

plot(rwalk,type='o',ylab='rWalk',main='图表3-2 随机游动与线性时间趋势')
abline(model1, col='red') # add the fitted least squares line

# Exhibit 3.14
plot(y=rstudent(model1), x=as.vector(time(rwalk)), ylab='Standardized Residuals',   xlab='Time', type='o')
abline(h=0, col='red')

# Exhibit 3.15
plot(y=rstudent(model1),x=fitted(model1),ylab='Standardized Residuals',
xlab='Fitted Trend Values',type='p')

# Exhibit 3.16
acf(rstudent(model1),main='ACF')

# 游程检验：H_0 独立
runs(rstudent(model1))

## $pvalue
## [1] 0.0266
## 
## $observed.runs
## [1] 22
## 
## $expected.runs
## [1] 30.96667
## 
## $n1
## [1] 29
## 
## $n2
## [1] 31
## 
## $k
## [1] 0

习题3.7 （on P.38)

data(winnebago); winnebago

##       Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
## 1966                                                     61   48
## 1967   53   78   75   58  146  193  124  120  134   99  130  166
## 1968  168  267  314  432  355  384  232  235  293  242  248  236
## 1969  209  358  352  406  562  389  416  493  409  328  222  195
## 1970  156  439  671  874  558  621  628  652  495  344  405  586
## 1971  403  700  837 1224 1117 1214  762  846 1228  937 1396 1174
## 1972  628 1753

n <- length(winnebago);  n

## [1] 64

Time_winn <- time(winnebago); Time_winn

##           Jan      Feb      Mar      Apr      May      Jun      Jul      Aug
## 1966                                                                        
## 1967 1967.000 1967.083 1967.167 1967.250 1967.333 1967.417 1967.500 1967.583
## 1968 1968.000 1968.083 1968.167 1968.250 1968.333 1968.417 1968.500 1968.583
## 1969 1969.000 1969.083 1969.167 1969.250 1969.333 1969.417 1969.500 1969.583
## 1970 1970.000 1970.083 1970.167 1970.250 1970.333 1970.417 1970.500 1970.583
## 1971 1971.000 1971.083 1971.167 1971.250 1971.333 1971.417 1971.500 1971.583
## 1972 1972.000 1972.083                                                      
##           Sep      Oct      Nov      Dec
## 1966                   1966.833 1966.917
## 1967 1967.667 1967.750 1967.833 1967.917
## 1968 1968.667 1968.750 1968.833 1968.917
## 1969 1969.667 1969.750 1969.833 1969.917
## 1970 1970.667 1970.750 1970.833 1970.917
## 1971 1971.667 1971.750 1971.833 1971.917
## 1972

#(a).时序图
plot(winnebago, type='l', ylab='Monthly Sale', xlab='Time', main='图1. 时序图(a)')
points(winnebago, col='blue', pch=2); abline(h=0, col='red')

#(b).时间的一次函数拟合
model_1=lm(winnebago~Time_winn)
summary(model_1)

## 
## Call:
## lm(formula = winnebago ~ Time_winn)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -419.58  -93.13  -12.78   94.96  759.21 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -394885.68   33539.77  -11.77   <2e-16 ***
## Time_winn       200.74      17.03   11.79   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 209.7 on 62 degrees of freedom
## Multiple R-squared:  0.6915, Adjusted R-squared:  0.6865 
## F-statistic: 138.9 on 1 and 62 DF,  p-value: < 2.2e-16

abline(model_1, col='red')

### y_hat=-394885.68+200.74 * Time

Res_winn <- rstudent(model_1) ##标准化残差
plot(Res_winn, type='l', ylab='Standarized Residual in (b)', xlab='Time', main='图2. 残差图(b))')
points(Res_winn, col='blue', pch=3); abline(h=0, col='red')

#(c).对数变换之后，时间的一次函数拟合
winn_log <- log(winnebago)  #数据对数化
plot(winn_log, type='l', ylab='Log(Monthly Sale)', xlab='Time', main='图3. 对数化之后时序图(c)')
points(winn_log, col='blue', pch=6); abline(h=0, col='red')

#(d).对数变换+一次函数拟合
model_2=lm(winn_log~Time_winn)
summary(model_2)

## 
## Call:
## lm(formula = winn_log ~ Time_winn)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.03669 -0.20823  0.04995  0.25662  0.86223 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -984.93878   62.99472  -15.63   <2e-16 ***
## Time_winn      0.50306    0.03199   15.73   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3939 on 62 degrees of freedom
## Multiple R-squared:  0.7996, Adjusted R-squared:  0.7964 
## F-statistic: 247.4 on 1 and 62 DF,  p-value: < 2.2e-16

abline(model_2, col='red')

### Log(y_hat)=-984.93878+ 0.50306 * Time

Res_winn_log <- rstudent(model_2) ##标准化残差
plot(Res_winn_log, type='l', ylab='Residual in log (d)', xlab='Time', main='图4. 对数变换之后的残差图(d))')
points(Res_winn_log, col='blue', pch=7); abline(h=0, col='red')

#(e)季节性+对数变换+一次函数拟合
month_winn <- season(winnebago); month_winn

##  [1] November  December  January   February  March     April     May      
##  [8] June      July      August    September October   November  December 
## [15] January   February  March     April     May       June      July     
## [22] August    September October   November  December  January   February 
## [29] March     April     May       June      July      August    September
## [36] October   November  December  January   February  March     April    
## [43] May       June      July      August    September October   November 
## [50] December  January   February  March     April     May       June     
## [57] July      August    September October   November  December  January  
## [64] February 
## 12 Levels: January February March April May June July August ... December

plot(winn_log, type='l', ylab='Log(Monthly Sale)', xlab='Time', main='图5. 季节性+对数化之后时序图(e)')
points(winn_log, col='blue', pch=as.vector(month_winn )); abline(h=0, col='red')

model_3=lm(winn_log~month_winn+Time_winn)
summary(model_3)

## 
## Call:
## lm(formula = winn_log ~ month_winn + Time_winn)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.92501 -0.16328  0.03344  0.20757  0.57388 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         -997.33061   50.63995 -19.695  < 2e-16 ***
## month_winnFebruary     0.62445    0.18182   3.434 0.001188 ** 
## month_winnMarch        0.68220    0.19088   3.574 0.000779 ***
## month_winnApril        0.80959    0.19079   4.243 9.30e-05 ***
## month_winnMay          0.86953    0.19073   4.559 3.25e-05 ***
## month_winnJune         0.86309    0.19070   4.526 3.63e-05 ***
## month_winnJuly         0.55392    0.19069   2.905 0.005420 ** 
## month_winnAugust       0.56989    0.19070   2.988 0.004305 ** 
## month_winnSeptember    0.57572    0.19073   3.018 0.003960 ** 
## month_winnOctober      0.26349    0.19079   1.381 0.173300    
## month_winnNovember     0.28682    0.18186   1.577 0.120946    
## month_winnDecember     0.24802    0.18182   1.364 0.178532    
## Time_winn              0.50909    0.02571  19.800  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3149 on 51 degrees of freedom
## Multiple R-squared:  0.8946, Adjusted R-squared:  0.8699 
## F-statistic: 36.09 on 12 and 51 DF,  p-value: < 2.2e-16

lines(y=as.vector(fitted(model_3)), x=as.vector(Time_winn), col='red')
points(y=as.vector(fitted(model_3)), x=as.vector(Time_winn), col='Purple', pch=as.vector(month_winn ))

Res1_winn_log <- rstudent(model_3) ##标准化残差
plot(Res1_winn_log, type='l', ylab='Residual in log (f)', xlab='Time', main='图6. 季节性对数变换之后的残差图(f))')
points(Res1_winn_log, col='blue', pch=as.vector(month_winn ))
abline(h=0, col='red')

#(b).游程检验：独立性
runs(Res1_winn_log)

## $pvalue
## [1] 0.000243
## 
## $observed.runs
## [1] 18
## 
## $expected.runs
## [1] 32.71875
## 
## $n1
## [1] 29
## 
## $n2
## [1] 35
## 
## $k
## [1] 0

##pvalue  0.000243  拒绝“残差是独立的”

#(c).自相关性
acf(Res1_winn_log,main='图7.季节性+对数变换+一次函数拟合之后的残差自相关图' )  #自相关图

#(d).正态性
#QQ图
qqnorm(Res1_winn_log, main='图8.季节性+对数变换+一次函数拟合之后的残差QQ图' )  #QQ图
qqline(Res1_winn_log, col='blue'); abline(h=0, col='red')

#直方图
hist(Res1_winn_log , xlab='Standardized Residuals_log_2', main='图9. 季节性+对数变换+一次函数拟合之后的残差直方图')

#Shapiro-Wilk检验: 正态性
shapiro.test(Res1_winn_log )

## 
##  Shapiro-Wilk normality test
## 
## data:  Res1_winn_log
## W = 0.97035, p-value = 0.1262

##p-value = 0.1262 不能拒绝来自正态总体的假设

第四章: 平稳时间序列模型

4.1 滑动平均过程

\[ Y_t=e_t-\theta_1 e_{t-1}-\theta_2 e_{t-2}-\cdots-\theta_q e_{t-q}, \qquad\text{ 其中 } e_t\sim WN(0,\sigma_e^2). \]

当\(q=1\)时，MA(1)过程： \(Y_t=e_t-\theta e_{t-1}.\)

\(\qquad\)均值 \(E(Y_t)=0\), 方差 \(Var(Y_t)=(1+\theta^2)\sigma_e^2\), 协方差及自相关函数 \[\begin{gathered} \begin{align} Cov(Y_t,Y_{t-1})&=Cov(e_t-\theta e_{t-1},e_{t-1}-\theta e_{t-2})=-\theta \sigma_e^2. \\ Cov(Y_t,Y_{t-2})&=Cov(e_t-\theta e_{t-1},e_{t-2}-\theta e_{t-3})=0.\\ Cov(Y_t,Y_{t-k})&=Cov(e_t-\theta e_{t-1},e_{t-k}-\theta e_{t-k-1})=0. \text{ if } k>2.\\ \end{align} \qquad\qquad \rho_k= \begin{cases} 1, &\text{ if }k=0\\ -\frac{\theta}{1+\theta^2}, &\text{ if }k=1\\ 0, &\text{ if }k\geq 2\\ \end{cases} \end{gathered}\]

\(\qquad\) 结论：MA(1)是平稳的，自相关函数在滞后 \(1\) 阶之后“截尾”。

theta <- seq(-1,1,by=0.1)
rho_1 <- (-theta)/(1+theta^2)
data.frame(theta=theta, rho_1=rho_1)

##    theta      rho_1
## 1   -1.0  0.5000000
## 2   -0.9  0.4972376
## 3   -0.8  0.4878049
## 4   -0.7  0.4697987
## 5   -0.6  0.4411765
## 6   -0.5  0.4000000
## 7   -0.4  0.3448276
## 8   -0.3  0.2752294
## 9   -0.2  0.1923077
## 10  -0.1  0.0990099
## 11   0.0  0.0000000
## 12   0.1 -0.0990099
## 13   0.2 -0.1923077
## 14   0.3 -0.2752294
## 15   0.4 -0.3448276
## 16   0.5 -0.4000000
## 17   0.6 -0.4411765
## 18   0.7 -0.4697987
## 19   0.8 -0.4878049
## 20   0.9 -0.4972376
## 21   1.0 -0.5000000

plot(theta, rho_1, col='blue', type='l', ylab=expression(rho[1]), xlab=expression(theta))
abline(h=0.5, lty=4); abline(h=-0.5, lty=4)

例如： MA(1)序列\(\theta=-0.9, \rho_1=0.4972\), 和\(\theta=0.9, \rho_1=-0.4972\)。

# Exhibit 4.2
data(ma1.2.s)  #theta=-0.9
plot(ma1.2.s, ylab=expression(Y[t]), type='l', main=expression(theta==-0.9))
points(ma1.2.s, col='blue', cex=0.5); abline(h=0, col='red')

opar=par(mfrow=c(1,2))
# Exhibit 4.3
plot(y=ma1.2.s, x=zlag(ma1.2.s), col='blue', ylab=expression(Y[t]), xlab=expression(Y[t-1]), type='p', main=expression(theta==-0.9), cex=0.5)
# Exhibit 4.4
plot(y=ma1.2.s, x=zlag(ma1.2.s,2), col='blue', ylab=expression(Y[t]), xlab=expression(Y[t-2]), type='p', main=expression(theta==-0.9), cex=0.5)

par(opar)

# Exhibit 4.5
data(ma1.1.s)
plot(ma1.1.s,ylab=expression(Y[t]),type='o', cex=0.5); abline(h=0, col='red')

opar=par(mfrow=c(1,2))
# Exhibit 4.6
plot(y=ma1.1.s, x=zlag(ma1.1.s), ylab=expression(Y[t]), xlab=expression(Y[t-1]), type='p', cex=0.5)
# Exhibit 4.7
plot(y=ma1.1.s,x=zlag(ma1.1.s,2), ylab=expression(Y[t]), xlab=expression(Y[t-2]), type='p', cex=0.5)

# An MA(1) series of length n=100 with MA coefficient equal to -0.9 and 0.9 respectively.
set.seed(12345); y1=arima.sim(model=list(ma=-c(-0.9)),n=100)
set.seed(54321); y2=arima.sim(model=list(MA=-c(0.9)), n=100)
# Note that R uses the plus convention in the MA model formula so the additional minus sign.  
plot(y1, ylab=expression(Y[t]), xlab=expression(Y[t-1]), type='l', main=expression(theta==-0.9))
points(y1, col='blue', cex=0.5); abline(h=0, col='red')
plot(y2, ylab=expression(Y[t]), xlab=expression(Y[t-1]), type='l', main=expression(theta==0.9))
points(y2, col='blue', cex=0.5); abline(h=0, col='red')

par(opar)

当\(q=2\)时，MA(2)过程： \(Y_t=e_t-\theta_1 e_{t-1}-\theta_2 e_{t-2}.\)

\(\qquad\)均值 \(E(Y_t)=0\), 方差 \(Var(Y_t)=(1+\theta_1^2+\theta_2^2)\sigma_e^2\), 协方差及自相关函数 \[\begin{gathered} \begin{align} Cov(Y_t,Y_{t-1})&=Cov(e_t-\theta_1 e_{t-1}-\theta_2 e_{t-2},e_{t-1}-\theta_1 e_{t-2}-\theta_2 e_{t-3})=(-\theta_1+\theta_1\theta_2) \sigma_e^2. \\ Cov(Y_t,Y_{t-2})&=Cov(e_t-\theta_1 e_{t-1}-\theta_2 e_{t-2},e_{t-2}-\theta_1 e_{t-3}-\theta_2 e_{t-4})=-\theta_2 \sigma_e^2.\\ Cov(Y_t,Y_{t-k})&=Cov(e_t-\theta_1 e_{t-1}-\theta_2 e_{t-2},e_{t-k}-\theta_1 e_{t-k-1}-\theta_2 e_{t-k-2})=0. \text{ if } k\geq 3.\\ \end{align} \qquad \rho_k= \begin{cases} 1, &\text{ if }k=0\\ \frac{-\theta_1+\theta_1\theta_2}{1+\theta_1^2+\theta_2^2}, &\text{ if }k=1\\ \frac{-\theta_2}{1+\theta_1^2+\theta_2^2}, &\text{ if }k=2\\ 0, &\text{ if }k\geq 3\\ \end{cases} \end{gathered}\]

\(\qquad\) 结论：MA(2)是平稳的，自相关函数在滞后 \(2\) 阶之后“截尾”。

例如： MA(2)序列\(\theta_1=1, \theta_2=-0.6\)。\(\rho_1=-0.678,\rho_2=0.254.\)

data(ma2.s) ##theta_1=1, theta_2=-0.6

opar=par(mfrow=c(2,2))
# Exhibit 4.8
plot(ma2.s, ylab=expression(Y[t]), type='o', cex=0.5); abline(h=0, col='red')
# Exhibit 4.9
plot(y=ma2.s, x=zlag(ma2.s), ylab=expression(Y[t]), xlab=expression(Y[t-1]), type='p', cex=0.5)
# Exhibit 4.10
plot(y=ma2.s, x=zlag(ma2.s,2), ylab=expression(Y[t]), xlab=expression(Y[t-2]), type='p', cex=0.5)
# Exhibit 4.11
plot(y=ma2.s, x=zlag(ma2.s,3), ylab=expression(Y[t]), xlab=expression(Y[t-3]), type='p', cex=0.5)

par(opar)

# An MA(2) series with MA coefficients equal to 1 and -0.6 and of length n=100.
y=arima.sim(model=list(ma=-c(1, -0.6)), n=100)
# Note that R uses the plus convention in the MA model formula so the additional minus sign. 
plot(y, ylab=expression(Y[t]), type='o', cex=0.5); abline(h=0, col='red')

MA(q)过程： \(Y_t=e_t-\theta_1 e_{t-1}-\theta_2 e_{t-2}-\cdots-\theta_q e_{t-q}\)

\(\qquad\)均值 \(E(Y_t)=0\), 方差 \(Var(Y_t)=(1+\theta_1^2+\theta_2^2+\cdots+\theta_q^2)\sigma_e^2\), 协方差及自相关函数 \[\begin{gathered} Cov(Y_t,Y_{t-k})=\begin{cases} \sigma_e^2 (-\theta_k+\sum_{i=1}^{q-k}\theta_i\theta_{i+k}), &\text{ if } 1\leq k\leq q\\ 0. \text{ if } k> q.\\ \end{cases} \qquad\qquad \rho_k= \begin{cases} 1, &\text{ if }k=0\\ \frac{-\theta_k+\sum_{i=1}^{q-k}\theta_i\theta_{i+k}}{1+\sum_{i=1}^q\theta_i^2}, &\text{ if }1\leq k\leq q\\ 0, &\text{ if }k > q\\ \end{cases} \end{gathered}\]

\(\qquad\) 结论：MA(q)是平稳的，自相关函数在滞后 \(q\) 阶之后“截尾”。

4.2 一般线性过程

\(\qquad \{Y_t\}\) 表示观测到的时间序列，\(\{e_t\}\) 表示未观测到的白噪声。 \[Y_t=\psi_0e_t+\psi_1 e_{t-1}+\psi_2 e_{t-2}+\cdots=\sum_{i=0}^{\infty}\psi_i e_{t-i}.\]

期望为零 \(E(Y_t)=0.\) 一般性假设\(\psi_0=1.\)
方差有限 \(Var(Y_t)=\sigma_e^2(\sum_{i=0}^{\infty}\psi_i^2)<\infty.\)
一阶协方差

\[\begin{align} Cov(Y_t,Y_{t-1}) &= Cov(\psi_0e_t+\psi_1 e_{t-1}+\psi_2 e_{t-2}+\cdots, \psi_0e_{t-1}+\psi_1 e_{t-2}+\psi_2 e_{t-3}+\cdots ) \\ &= (\psi_0\psi_1+\psi_1\psi_2+\psi_2\psi_3+\cdots)\sigma_e^2=\sigma_e^2(\sum_{i=0}^{\infty}\psi_i\psi_{i+1}) \end{align}\]

二阶协方差

\[\begin{align} Cov(Y_t,Y_{t-2}) &= Cov(\psi_0e_t+\psi_1 e_{t-1}+\psi_2 e_{t-2}+\cdots, \psi_0e_{t-2}+\psi_1 e_{t-3}+\psi_2 e_{t-4}+\cdots ) \\ &= (\psi_0\psi_2+\psi_1\psi_3+\psi_2\psi_4+\cdots)\sigma_e^2=\sigma_e^2(\sum_{i=0}^{\infty}\psi_i\psi_{i+2}) \end{align}\]

\(k\)阶协方差

\[Cov(Y_t,Y_{t-k}) =\sigma_e^2(\sum_{i=0}^{\infty}\psi_i\psi_{i+k}),\qquad k\geq0.\]

\(k\)阶自相关函数

\[Corr(Y_t,Y_{t-k})=\frac{\sum_{i=0}^{\infty}\psi_i\psi_{i+k}}{\sum_{i=0}^{\infty}\psi_i^2}, \quad k\geq 0.\]

例如，一般线性过程\(\psi_j=\phi^j, |\phi|<1.\)

\[Y_t=e_t+\phi e_{t-1}+\phi^2 e_{t-2}+\cdots, \qquad |\phi|<1.\]

\(\qquad\)- 期望为零 \(E(Y_t)=0\), 方差有限 \(Var(Y_t)=\sigma_e^2(\sum_{i=0}^{\infty}\phi^{2i})=\frac{1}{1-\phi^2}\sigma_e^2.\)

\(\qquad\)- \(k\)阶协方差 \(Cov(Y_t,Y_{t-k})=\sigma_e^2(\sum_{i=0}^{\infty}\phi^{i}\phi^{i+k})=\frac{\phi^k}{1-\phi^2}\sigma_e^2.\)

\(\qquad\)- \(k\)阶自相关函数 \(Corr(Y_t,Y_{t-k})=\phi^k, \quad k\geq 0.\)

4.3 自回归过程

\[ Y_t=\phi_1 Y_{t-1}+\phi_2 Y_{t-2}+\cdots+\phi_p Y_{t-p}+e_t, \qquad\text{ 其中 } e_t\sim WN(0,\sigma_e^2). \]

假设\(e_t\) 与 \(\{Y_{t-1},Y_{t-2},\cdots,Y_1\}\) 是不相关的。

\[ \]

当\(p=1\)时，AR(1)过程： \(Y_t=\phi Y_{t-1}+e_t.\)

\(\qquad\) 如果\({Y_t}\)序列平稳，那么

\(\qquad\)- 均值 \(\mu=\frac{0}{1-\phi}=0\), 因为\(E(Y_t)=\phi E(Y_{t-1})+E(e_t)\);

\(\qquad\)- 方差 \(\gamma_0=\frac{\sigma_e^2}{1-\phi^2}\geq 0\), 因为\(Var(Y_t)=\phi^2 Var(Y_{t-1})+Var(e_t)\);

\(\qquad\quad\) ①. 平稳性的必要条件是 \(|\phi|<1.\)

\(\qquad\)- \(k\)阶协方差 \[\begin{align} Cov(Y_t,Y_{t-k})&=Cov(\phi Y_{t-1}+e_t, Y_{t-k}) \\ \gamma_k &= \phi \gamma_{k-1} \\ &= \phi\cdot\phi \gamma_{k-2}=\cdots\\ &=\phi^k \gamma_0 \\ \end{align} \]

\(\qquad\)- 自相关函数 \(\qquad \rho_k=\phi^k, k=0,1,2,\cdots\)

\(\qquad\)- AR(1)的一般线性表示

\[\begin{align} &&Y_t&=\phi Y_{t-1}+e_t\\ \phi \cdot &&Y_{t-1}&=\phi Y_{t-2}+e_{t-1} &&\cdot\phi\\ \phi^2 \cdot &&Y_{t-2}&=\phi Y_{t-3}+e_{t-2} &&\cdot\phi^2\\ &&&\cdots\\ \phi^{k-1} \cdot &&Y_{t-(k-1)}&=\phi Y_{t-k}+e_{t-(k-1)} &&\cdot\phi^{k-1}\\ \end{align}\]

\(\qquad\qquad\)求和可得 \(Y_t=e_t+\phi e_{t-1}+\phi^2 e_{t-2}+\cdots+\phi^{k-1} e_{t-(k-1)}+\phi^k Y_{t-k}.\)

\(\qquad\qquad\)如果\(|\phi|<1\), 并且\(k\rightarrow\infty\), 那么 \[Y_t=e_t+\phi e_{t-1}+\phi^2 e_{t-2}+\phi^3 e_{t-3}+\cdots\]

\(\qquad\quad\) ②. 平稳性的充分条件是 \(|\phi|<1.\)

\(\qquad\) 结论：AR(1)是平稳的充分必要条件是\(|\phi|<1\)，自相关函数\(\rho_k=\phi^k\)随着时滞\(k\)的增加而指数衰减。

x<-seq(1:12);
y1<-0.9^x; y2<-0.4^x; y3<-(-0.8)^x; y4<-(-0.5)^x; opar=par(mfrow=c(2,2))
plot(x,y1,type='h', ylim=c(-0.1,0.9), xlab='Lag', ylab=expression(rho[k]), main=expression(phi==0.9));
points(x,y1,col='blue',cex=0.5); abline(h=0,col='red');
plot(x,y2,type='h', ylim=c(-0.1,0.9),  xlab='Lag', ylab=expression(rho[k]), main=expression(phi==0.4));
points(x,y2,col='blue',cex=0.5); abline(h=0,col='red');
plot(x,y3,type='h', ylim=c(-0.9,0.9),  xlab='Lag', ylab=expression(rho[k]), main=expression(phi==-0.8));
points(x,y3,col='blue',cex=0.5); abline(h=0,col='red');
plot(x,y4,type='h', ylim=c(-0.9,0.9),  xlab='Lag', ylab=expression(rho[k]), main=expression(phi==-0.5));
points(x,y4,col='blue',cex=0.5); abline(h=0,col='red');

par(opar)

例如：AR(1)序列 \(\phi=0.9, \rho_1=0.9, \rho_2=0.81, \rho_3=0.729\cdots\)

data(ar1.s)

opar=par(mfrow=c(2,2))
# Exhibit 4.13
plot(ar1.s, ylab=expression(Y[t]), type='o', cex=0.5)
# Exhibit 4.14
plot(y=ar1.s, x=zlag(ar1.s), ylab=expression(Y[t]), xlab=expression(Y[t-1]), type='p', cex=0.5)
# Exhibit 4.15
plot(y=ar1.s, x=zlag(ar1.s,2), ylab=expression(Y[t]), xlab=expression(Y[t-2]), type='p', cex=0.5)
# Exhibit 4.16
plot(y=ar1.s, x=zlag(ar1.s,3), ylab=expression(Y[t]), xlab=expression(Y[t-3]), type='p', cex=0.5)

# An AR(1) series with AR coefficient equal to 0.98 and of length n=100.
set.seed(12345); y1=arima.sim(model=list(ar=c(0.98)),n=500)
set.seed(1234); y2=arima.sim(model=list(ar=c(-0.6)),n=500)
set.seed(54321); y3=arima.sim(model=list(ar=c(0.3)),n=500)
set.seed(4321); y4=arima.sim(model=list(ar=c(-0.1)),n=500)
# Note that the R convention for the AR model formula is same as the book.

plot(y1, type='o', xlab='Time', ylab='AR(1)', main=expression(phi==0.98), cex=0.5);
plot(y2, type='o', xlab='Time', ylab='AR(1)', main=expression(phi==-0.6), cex=0.5);
plot(y3, type='o', xlab='Time', ylab='AR(1)', main=expression(phi==0.3), cex=0.5);
plot(y4, type='o', xlab='Time', ylab='AR(1)', main=expression(phi==-0.1), cex=0.5);

acf(y1, xlab='Lag', ylab='ACF', main=expression(phi==0.98));
acf(y2, xlab='Lag', ylab='ACF', main=expression(phi==-0.6));
acf(y3, xlab='Lag', ylab='ACF', main=expression(phi==0.3));
acf(y4, xlab='Lag', ylab='ACF', main=expression(phi==-0.1));

par(opar)

\[ \]

当\(p=2\)时，AR(2)过程： \(Y_t=\phi_1 Y_{t-1}+\phi_2 Y_{t-2}+e_t.\)

\(\qquad\) 如果\({Y_t}\)序列平稳，那么

\(\qquad\)- 均值 \(\mu=\frac{0}{1-\phi_1-\phi_2}=0\), 因为\(E(Y_t)=\phi_1 E(Y_{t-1})+\phi_2 E(Y_{t-2})+E(e_t)\);

\(\qquad\)- 方差 \(Var(Y_t)=\phi_1^2 Var(Y_{t-1})+\phi_2^2 Var(Y_{t-2})+Var(e_t)+2\phi_1\phi_2 Cov(Y_{t-1},Y_{t-2})\), 则

\[(1-\phi_1^2-\phi_2^2)\gamma_0=2\phi_1\phi_2\gamma_1+\sigma_e^2.\]

\(\qquad\)- 一阶协方差 \((1-\phi_2)\gamma_1 = \phi_1 \gamma_0\), 因为 \[\begin{align} Cov(Y_t,Y_{t-1})&=Cov(\phi_1 Y_{t-1}+\phi_2 Y_{t-2}+e_t, Y_{t-1}) \\ &=\phi_1 Var(Y_{t-1})+\phi_2 Cov(Y_{t-1},Y_{t-2}) \\ \end{align} \]

\(\qquad\qquad\)解方程组可得 \[\begin{cases} \gamma_0&=&\sigma_e^2\frac{1-\phi_2}{(1+\phi_2)[(1-\phi_2)^2-\phi_1^2]}\\ \gamma_1&=&\sigma_e^2\frac{\phi_1}{(1+\phi_2)[(1-\phi_2)^2-\phi_1^2]} \end{cases}\]

\(\qquad\)- 一阶自相关函数 \(\rho_1=\frac{\phi_1}{1-\phi_2}\)

\(\qquad\)- \(k\)阶协方差/自相关函数 \[\begin{align} &&Cov(Y_t,Y_{t-k})&=Cov(\phi_1 Y_{t-1}+\phi_2 Y_{t-2}+e_t, Y_{t-k}) \\ \bf{Yule-Walker方程}&&\gamma_k&=\phi_1 \gamma_{k-1}+\phi_2 \gamma_{k-2} \\ &&\rho_k&=\phi_1 \rho_{k-1}+\phi_2 \rho_{k-2} \\ \end{align} \]

\(\qquad\) 求解二阶齐次线性差分方程\(\rho_{k+2}-\phi_1 \rho_{k+1}-\phi_2 \rho_k=0\)，初值条件为\(\rho_0=1， \rho_{-k}=\rho_k.\)

\(\qquad\)-①. 参考“差分方程讲解”http://www.homepage.zjut.edu.cn/yjq/第19页，

\(\qquad\) 当\(\lambda\)是常数时，\(\rho_k=\lambda^k\)和它的各阶差商有倍数关系，因此不妨设解为\(\rho_k=\lambda^k\)，那么 \[ \lambda^{k+2}-\phi_1 \lambda^{k+1}-\phi_2\lambda^{k}=0\]

\(\qquad\) 该齐次差分方程的特征方程为

\[\lambda^2-\phi_1 \lambda-\phi_2=0\]

\(\qquad\) 特征方程的根(特征根)为

\[\lambda_1=\frac{\phi_1- \sqrt{\phi_1^2+4\phi_2}}{2}, \qquad \lambda_2=\frac{\phi_1+ \sqrt{\phi_1^2+4\phi_2}}{2}\]

\[\begin{align} \underline{\text{特征根的情况}} & \qquad\qquad \underline{\text{差分方程的通解}}\\ \text{当 } \phi_1^2+4\phi_2>0 \text{ 时， } \lambda_1 \neq \lambda_2, \qquad & \qquad\qquad \rho_k=C_1 \lambda_1^k+C_2 \lambda_2^k \\ \text{当 } \phi_1^2+4\phi_2=0 \text{ 时， } \lambda_1 = \lambda_2, \qquad & \qquad\qquad \rho_k=(C_1 +C_2 k)\lambda_1^k \\ \text{当 } \phi_1^2+4\phi_2<0 \text{ 时， } \lambda_{1,2}= \alpha+\beta i, & \qquad\qquad \rho_k=(C_1 cos(k\theta)+C_2 sin(k\theta) )R^k \\ & \qquad\qquad R=\sqrt{\alpha^2+\beta^2}=\sqrt{-\phi_2}, tan \theta=\frac{\beta}{\alpha}\\ \end{align}\]

\(\qquad\) 将条件\(\rho_0=1, \rho_1=\rho_{-1}\)代入通解，可以确定参数\(C_1, C_2\), 写出\(k\)阶自相关函数的表达式。

\[\rho_k=\begin{cases} \frac{(1-\lambda_2^2)\lambda_1^{k+1}-(1-\lambda_1^2)\lambda_2^{k+1}}{(\lambda_1-\lambda_2)(1+\lambda_1\lambda_2)}, &\qquad 当\phi_1^2+4\phi_2>0 时，& 过阻尼状态\\ (1+\frac{1+\phi_2}{1-\phi_2}k)(\frac{\phi_1}{2})^k, &\qquad 当\phi_1^2+4\phi_2=0 时，&临界阻尼状态\\ \frac{\sin(\Theta k+\Phi)}{\sin(\Phi)}R^k, &\qquad 当\phi_1^2+4\phi_2<0 时.& 欠阻尼状态\\ &\cos(\Theta)=\frac{\phi_1}{2\sqrt{-\phi_2}}, \quad \tan(\Phi)=\frac{1-\phi_2}{1+\phi_2} \end{cases}\]

\(\qquad\) 结论：AR(2)的自相关函数 \(\rho_k\) 随着滞后阶数 \(k\) 的增加而指数衰减。 在复数特征根的情况下， \(\rho_k\) 显示为阻尼正弦波动曲线，具有阻尼因子 \(R(0\leq R<1)\)、频率 \(\Theta\)、相位 \(\Phi\)。下图给出了自相关函数可能的几种图形，左上是两个不相等的实数特征根的情形，右上是两个相等的实数特征根情形，第二排的图都是复数特征根的情况。

###AR(2)的自相关函数的理论值 #Exhibit 4.18 
Rho1 <- ARMAacf(ar=c(0.5,0.25), lag.max = 12, pacf=FALSE)
Rho2 <- ARMAacf(ar=c(1.0,-0.25), lag.max = 12, pacf=FALSE)
Rho3 <- ARMAacf(ar=c(1.5,-0.75), lag.max = 12, pacf=FALSE)
Rho4 <- ARMAacf(ar=c(1.0,-0.6), lag.max = 12, pacf=FALSE)

opar=par(mfrow=c(2,2))
plot(Rho1, type='h',  xlab='Lag', ylab=expression(rho[k]), main=expression(paste(phi[1]==0.5, ",  ", phi[2]==0.25))); 
points(Rho1, pch=8, col='blue', cex=0.8); abline(h=0);
plot(Rho2, type='h',  xlab='Lag', ylab=expression(rho[k]), main=expression(paste(phi[1]==1.0, ",  ", phi[2]==-0.25))); 
points(Rho2, pch=8, col='blue', cex=0.8); abline(h=0);
plot(Rho3, type='h',  xlab='Lag', ylab=expression(rho[k]), main=expression(paste(phi[1]==1.5, ",  ", phi[2]==-0.75))); 
points(Rho3, pch=8, col='blue', cex=0.8); abline(h=0);
plot(Rho4, type='h',  xlab='Lag', ylab=expression(rho[k]), main=expression(paste(phi[1]==1.0, ",  ", phi[2]==-0.6))); 
points(Rho4, pch=8, col='blue', cex=0.8); abline(h=0);

par(opar)

\(\qquad\)-②. 差分方程稳定性的条件是：\(\underline{\bf 特征根在单位圆内，\quad |\lambda_{1,2}|<1.}\)

\(\qquad\) 利用特征方程根与系数的关系\(\begin{cases} \lambda_1 +\lambda_2&=&\phi_1,\\ \lambda_1 \cdot\lambda_2&=&-\phi_2\end{cases}\), 可得

\[\begin{align} \phi_2+\phi_1= &-\lambda_1\lambda_2+\lambda_1 +\lambda_2=1-(1-\lambda_1)(1-\lambda_2)\\ \phi_2-\phi_1=& -\lambda_1\lambda_2-\lambda_1 -\lambda_2=1-(1+\lambda_1)(1+\lambda_2) \end{align}\]

\(\qquad\) 差分方程稳定性的等价条件是：\(\underline{\bf \phi_2+\phi_1<1, \quad \phi_2-\phi_1<1, \quad |\phi_2|<1.}\)

###AR(2)的平稳参数区域 #Exhibit 4.17 
phi1 <- seq(from = -2.5, to = 2.5, length = 51) 
plot(phi1,1+phi1,lty="dashed",type="l",xlab="",ylab="",cex.axis=0.8,ylim=c(-1.5,1.5))
abline(a = -1, b = 0, lty="dashed")
abline(a = 1, b = -1, lty="dashed")
title(ylab=expression(phi[2]),xlab=expression(phi[1]),cex.lab=1.5)
polygon(x = phi1[6:46], y = 1-abs(phi1[6:46]), col="gray")
polygon(x = phi1[6:46], y = -phi1[6:46]^2/4, col="green")
lines(phi1, -phi1^2/4, col="blue")
text(0,-.6,expression(phi[2]< (-phi[1]^2/4)),cex=1.5)
text(0,0.3,"实数根", col='red', cex=1.5)
text(0,-0.3,"复数根", col='blue', cex=1.5)
text(1.75,.5,expression(phi[2]>1-phi[1]),cex=1.5)
text(-1.75,.5,expression(phi[2]>1+phi[1]),cex=1.5)

\(\quad\) 例如， AR(2)模型\(\phi_1=1.5, \phi_2=-0.75\), \(\phi_1^2+4\phi_2=-\frac{3}{4}<0\) 特征方程有两个不相等的复数根。\(k\)阶自相关函数为\(R^k\frac{\sin(\Theta k+\Phi)}{\sin(\Phi)}\)，其中，阻尼因子\(R=\sqrt{-\phi_2}=\frac{\sqrt 3}{2}\simeq0.866\), \(\cos(\Theta)=\frac{\phi_1}{2\sqrt{-\phi_2}}=\frac{3/2}{2\cdot\sqrt{3}/2}=\frac{\sqrt 3}{2}\), 频率 \(\Theta=\pi/6\), 周期 \(f=\frac{2\pi}{\Theta}=\frac{2\pi}{\pi/6}=12\)， \(\tan(\Phi)=\frac{1-\phi_2}{1+\phi_2}=\frac{1+3/4}{1-3/4}=7\), 相位 \(\Phi=\arctan (7)\simeq81.87^o\). （习题4.9 on P.59）

##polyroot(c(1,2,3,4))  多项式方程1+2x+3x^2+4x^3=0求根
polyroot(c(0.75,-1.5,1)) #特征方程求根

## [1] 0.75+0.4330127i 0.75-0.4330127i

atan(7)*360/(2*pi)

## [1] 81.8699

#Exhibit 4.19
data(ar2.s) ##phi_1=1.5, phi_2=-0.75
Rho <- ARMAacf(ar=c(1.5,-0.75), lag.max = 20, pacf=FALSE); Rho

##           0           1           2           3           4           5 
##  1.00000000  0.85714286  0.53571429  0.16071429 -0.16071429 -0.36160714 
##           6           7           8           9          10          11 
## -0.42187500 -0.36160714 -0.22600446 -0.06780134  0.06780134  0.15255301 
##          12          13          14          15          16          17 
##  0.17797852  0.15255301  0.09534563  0.02860369 -0.02860369 -0.06435830 
##          18          19          20 
## -0.07508469 -0.06435830 -0.04022394

opar=par(mfrow=c(2,2))
plot(ar2.s, ylab=expression(Y[t]), type='o', cex=0.5)
plot(y=ar2.s, x=zlag(ar2.s), ylab=expression(Y[t]), xlab=expression(Y[t-1]), type='p', cex=0.5)
plot(y=ar2.s, x=zlag(ar2.s,2), ylab=expression(Y[t]), xlab=expression(Y[t-2]), type='p', cex=0.5)
plot(y=ar2.s, x=zlag(ar2.s,3), ylab=expression(Y[t]), xlab=expression(Y[t-3]), type='p', cex=0.5)

par(opar)

opar=par(mfrow=c(1,2))
plot(Rho, type='h',  xlab='Lag', ylab=expression(rho[k]), main=expression(paste(phi[1]==1.5, ",  ", phi[2]==-0.75))); 
points(Rho, pch=8, col='blue', cex=0.8); abline(h=0);
acf(ar2.s, xlab='Lag', ylab='ACF', main=expression(paste(phi[1]==1.5, ",  ", phi[2]==-0.75)))

par(opar)

set.seed(12345); y1=arima.sim(model=list(ar=c(0.5, 0.25)),n=500)
set.seed(1234); y2=arima.sim(model=list(ar=c(1.0,-0.25)),n=500)
set.seed(54321); y3=arima.sim(model=list(ar=c(1.5,-0.75)),n=500)
set.seed(4321); y4=arima.sim(model=list(ar=c(1.0,-0.6)),n=500)

opar=par(mfrow=c(2,2))
plot(y1, type='o', xlab='Time', ylab='AR(2)', main=expression(paste(phi[1]==0.5, ",  ", phi[2]==0.25)), cex=0.5);
plot(y2, type='o', xlab='Time', ylab='AR(2)', main=expression(paste(phi[1]==1.0, ",  ", phi[2]==-0.25)), cex=0.5);
plot(y3, type='o', xlab='Time', ylab='AR(2)', main=expression(paste(phi[1]==1.5, ",  ", phi[2]==-0.75)), cex=0.5);
plot(y4, type='o', xlab='Time', ylab='AR(2)', main=expression(paste(phi[1]==1.0, ",  ", phi[2]==-0.6)), cex=0.5);

acf(y1, xlab='Lag', ylab='ACF', main=expression(paste(phi[1]==0.5, ",  ", phi[2]==0.25)));
acf(y2, xlab='Lag', ylab='ACF', main=expression(paste(phi[1]==1.0, ",  ", phi[2]==-0.25)));
acf(y3, xlab='Lag', ylab='ACF', main=expression(paste(phi[1]==1.5, ",  ", phi[2]==-0.75)));
acf(y4, xlab='Lag', ylab='ACF', main=expression(paste(phi[1]==1.0, ",  ", phi[2]==-0.6)));

par(opar)

\[ \]

引入滞后算子(延迟算子) \(B\)，即 \(Y_{t-k}=B^k Y_t\), 则一阶向后差分 \(\triangledown Y_t=Y_t-Y_{t-1}=(1-B)Y_t\)，而“差分方程讲解”中使用的是一阶向前差分 \(\Delta Y_t=Y_{t+1}-Y_t\)。那么 \(k\) 阶向后差分可以表示为 \[\triangledown ^k Y_t=\triangledown ^{k-1}(Y_t-Y_{t-1})=\cdots=(1-B)^k Y_t=\sum_{j=0}^k (-1)^j {k \choose j} B^j Y_t=\sum_{j=0}^k (-1)^j {k \choose j} Y_{t-j}. \]

\(\qquad\) AR(2)过程 \(Y_t=\phi_1 Y_{t-1}+\phi_2 Y_{t-2} +e_t\) 可以写成 \[(1-\phi_1 B -\phi_2 B^2)Y_t=\Phi(B)Y_t=e_t\] \(\qquad \Phi(z)=1-\phi_1 z-\phi_2 z^2\) 称为自回归系数多项式， \(\Phi(z)=0\) 称为AR特征方程（自回归系数方程, 差分方程的逆反特征方程），

\(\qquad \underline{\bf 它的根和差分方程的特征根互为倒数}。\) （习题4.22 on P.60）

\[\begin{gathered} \begin{matrix}1-\phi_1 z-\phi_2 z^2&=&0 \\ \lambda^2-\phi_1 \lambda-\phi_2 &=&0 \end{matrix}\qquad \qquad z_{1,2}=\frac {1}{\lambda_{1,2}} \end{gathered}\]

\(\qquad\) AR(2)过程平稳性的条件是：\(\underline{\bf AR特征方程的根在单位圆外, \quad |z_{1,2}| > 1. }\)

\[ \]

AR(2)的一般线性表示

\(\qquad\) 如果存在两个不相等的实数特征根\(\lambda_1\neq \lambda_2\)的话，那么

\[\begin{gathered} \begin{align} \Phi(z)&=-\phi_2(z-z_1)(z-z_2)\\ &=-\phi_2(z-\frac{1}{\lambda_1})(z-\frac{1}{\lambda_2}) \\ &=\frac{-\phi_2}{\lambda_1\lambda_2}(\lambda_1 z-1)(\lambda_2 z-1)\\ &=(1-\lambda_1 z)(1-\lambda_2 z). \end{align} \qquad \qquad \begin{matrix} z_{1,2}=\frac {1}{\lambda_{1,2}} \\ \\ \\ \lambda_1\lambda_2=-\phi_2 \end{matrix} \end{gathered}\]

\(\qquad\) 记 \(\Phi^{-1}(z)=\frac{1}{\Phi(z)}\), 则 \(Y_t=\Phi^{-1}(B) e_t\). \[\Phi^{-1}(B)=\frac{1}{(1-\lambda_1 B)(1-\lambda_2 B)}=\frac{a_1}{1-\lambda_1 B}+\frac{a_2}{1-\lambda_2 B}\] \(\qquad\) 其中，\(a_1 (1-\lambda_2 B)+a_2 (1-\lambda_1 B)=1\), 即\(a_1=\frac{\lambda_1}{\lambda_1-\lambda_2},\quad a_2=\frac{-\lambda_2}{\lambda_1-\lambda_2}.\)

\(\qquad\) 利用泰勒展开\(\frac{1}{1-x}=1+x+x^2+\cdots\), 我们有

\[\begin{align} \Phi^{-1}(B)&=a_1(1+\lambda_1 B+\lambda_1^2 B^2+\cdots)+a_2(1+\lambda_2 B+\lambda_2^2 B^2+\cdots)\\ &=\sum_{j=0}^\infty(a_1\lambda_1^j+a_2\lambda_2^j) B^j\\ \Psi(B)&\triangleq \sum_{j=0}^\infty \psi_j B^j\end{align}\]

\(\qquad\) AR(2)的逆转形式为\(Y_t=\Phi^{-1}(B) e_t=\Psi(B) e_t=\sum_{j=0}^\infty \psi_j e_{t-j}.\) 类似的推导可以证明(证明略)

\[\psi_j=\begin{cases} \frac{\lambda_1^{j+1}-\lambda_2^{j+1}}{\lambda_1-\lambda_2}, &\qquad当\phi_1^2+4\phi_2>0时，\\ (1+j)\phi_1^j, &\qquad当\phi_1^2+4\phi_2=0时，\\ R^{j}\frac{\sin[(j+1)\Theta]}{\sin\Theta},&\qquad当\phi_1^2+4\phi_2<0时，\\ \end{cases}\]

Green函数递推公式

\(\qquad\) 将AR(2)的逆转形式 \(Y_t=\Psi(B)e_t\) 代入传递形式 \(\Phi(B)Y_t=e_t\),

\[\begin{align} \Phi(B)\Psi(B)e_t&=e_t\\ (1-\phi_1B-\phi_2 B^2)(\psi_0+\psi_1B+\psi_2B^2+\cdots)e_t&=e_t\\ \\ \end{align}\]

\[\begin{align} \psi_0-\phi_1\psi_0 &B -\phi_2\psi_0B^2\\ +\psi_1&B-\phi_1\psi_1B^2-\phi_2\psi_1B^3\\ &\quad\qquad\psi_2B^2-\phi_1\psi_2B^3-\phi_2\psi_2B^4\\ &\quad\qquad\quad\qquad\cdots \end{align}\]

\(\qquad\) 解方程组可得

\[\begin{cases} \psi_0&=&1, \\ \psi_1&=&\phi_1,\\ \psi_2&=&\phi_1^2+\phi_2,\\ &\vdots\\ \psi_k&=&\phi_1\psi_{k-1}+\phi_2\psi_{k-2}, \end{cases}\]

\[ \]

一般自回归过程AR(p)

\[ Y_t=\phi_1 Y_{t-1}+\phi_2 Y_{t-2}+\cdots+\phi_p Y_{t-p}+e_t, \qquad\text{ 其中 } e_t\sim WN(0,\sigma_e^2). \] \(\qquad\) 类似AR(2)的分析过程，AR(p)自回归系数多项式

\[\Phi(z)=1-\phi_1 z-\phi_2 z^2-\cdots-\phi_pz^p \]

\(\qquad \underline{ AR(p){\bf特征方程}1-\phi_1 z-\phi_2 z^2-\cdots-\phi_pz^p = 0 {\bf 的根落在单位圆之外时，序列是平稳的。}}\)

\(\qquad\) 利用多项式根与系数的关系，可以得到满足平稳性的必要(非充分)条件：

\[\left . \begin{align} \phi_1+\phi_2+\cdots+\phi_p&<1\\ |\phi_p|&<1 \end{align} \right \}\]

\(\qquad\) 如果\({Y_t}\)序列平稳，那么

\(\qquad\)- 均值 \(\mu=\frac{0}{1-\phi_1-\phi_2-\cdots-\phi_p}=0\), 因为\(E(Y_t)=\phi_1 E(Y_{t-1})+\phi_2 E(Y_{t-2})+\cdots+\phi_p E(Y_{t-p})+E(e_t)\);

\(\qquad\)- 方差 \(\gamma_0=\frac{\sigma_e^2}{1-\phi_1 \rho_1-\phi_2 \rho_2-\cdots-\phi_p \rho_p}.\) 因为

\[\begin{align} Var(Y_t)&=Cov(\phi_1 Y_{t-1}+\phi_2 Y_{t-2}+\cdots+\phi_p Y_{t-p}+e_t，Y_t)\\ \gamma_0&=\phi_1 \gamma_1+\phi_2 \gamma_2+\cdots+\phi_p \gamma_p+\sigma_e^2\\ \gamma_0&=\phi_1 \rho_1\gamma_0+\phi_2 \rho_2\gamma_0+\cdots+\phi_p \rho_p\gamma_0+\sigma_e^2\\ \end{align}\]

\(\qquad\)- \(k\)阶协方差/自相关函数 (\(k=1,2,\cdots\)) \[\begin{align} &&Cov(Y_t,Y_{t-k})&=Cov(\phi_1 Y_{t-1}+\phi_2 Y_{t-2}+\cdots+\phi_p Y_{t-p}+e_t, Y_{t-k}) \\ \bf{Yule-Walker方程}&&\gamma_k&=\phi_1 \gamma_{k-1}+\phi_2 \gamma_{k-2}+\cdots+\phi_p \gamma_{k-p} \\ &&\rho_k&=\phi_1 \rho_{k-1}+\phi_2 \rho_{k-2}+\cdots+\phi_p \rho_{k-p} \\ \end{align} \]

\(\qquad \underline{{\bf差分方程的特征方程} \lambda^p-\phi_1 \lambda^{p-1}-\phi_2 \lambda^{p-2}-\cdots-\phi_p = 0 {\bf 的根落在单位圆之内}|\lambda_{1,\cdots,p}|<1{\bf时，序列是平稳的。}}\)

\[\begin{align} \underline{\text{特征根的情况}} \qquad\qquad& \qquad\qquad \underline{\text{差分方程的通解}}\\ 当存在p个不相等的实数根\lambda_1\neq\cdots\neq\lambda_p时， & \qquad \rho_k=C_1 \lambda_1^k+C_2 \lambda_2^k+\cdots+C_p \lambda_p^k\\ 当存在相等的实根\lambda_{1,\cdots,d}和不等的实根\lambda_{3,\cdots,p}时， & \qquad \rho_k=(C_1 +C_2 k+\cdots+C_d k^{d-1})\lambda_1^k+C_{d+1} \lambda_{d+1}^k+\cdots+C_p \lambda_p^k \\ 当存在复数根\lambda_{1,2}和不相等的实数根\lambda_{3,\cdots,p}时， & \qquad\rho_k=R^k(C_1 cos(k\theta)+C_2 sin(k\theta) ) +C_{3} \lambda_{3}^k+\cdots+C_p \lambda_p^k\\ \end{align}\]

\(\qquad\) 将条件\(\rho_0=1, \rho_k=\rho_{-k}\)代入通解，可以确定参数\(C_1, C_2,\cdots,C_p\), 写出\(k\)阶自相关函数的表达式。

\(\qquad\qquad\) 结论：当\(|\lambda_{1,\cdots,p}|<1\)时，平稳AR(p)的ACF图像呈现减幅的正弦、余弦和指数衰减的混合形式，具体形式取决于特征根的性质。 \[ \] \(\qquad\) 另一方面，根据 \(\rho_0=1, \quad\rho_{-k}=\rho_{k}\), 得到一般的Yule-Walker方程组

\[\begin{cases} \rho_1&=&\phi_1 &+\phi_2 \rho_{1}&+\phi_3 \rho_{2}&+\cdots&+\phi_p \rho_{p-1}\\ \rho_2&=&\phi_1 \rho_{1}&+\phi_2 &+\phi_3 \rho_{1}&+\cdots&+\phi_p \rho_{p-2}\\ &\vdots\\ \rho_p&=&\phi_1 \rho_{p-1}&+\phi_2 \rho_{p-2}&+\phi_3 \rho_{p-3}&+\cdots&+\phi_p\\ \end{cases}\]

\(\qquad\) 给定 \(\phi_1,\phi_2,\cdots,\phi_p\)的值，令\(\phi_{p+1}=\phi_{p+2}=\cdots=\phi_{2p}=0\)，

\[\begin{cases} (\phi_2-1)\rho_1 &+\phi_3 \rho_{2}&+\phi_4 \rho_{3}&+\cdots&+\phi_{p-1} \rho_{p-2}&+\phi_p \rho_{p-1}&+\underline{\phi_{p+1} }\rho_{p}&=&-\phi_1\\ (\phi_1+\phi_3) \rho_{1}&+(\phi_4-1)\rho_2 &+\phi_5 \rho_{3}&+\cdots&+\phi_p \rho_{p-2}&+\underline{\phi_{p+1}}\rho_{p-1} &+\underline{\phi_{p+2}}\rho_{p} &=&-\phi_2\\ &\vdots\\ (\phi_{k-1}+\phi_{k+1}) \rho_{1}&+(\phi_{k-2}+\phi_{k+2})\rho_2 &+\cdots+(\phi_1+\phi_{2k-1})\rho_{k-1}&+(\phi_{2k}-1)\rho_{k}&+\phi_{2k+1}\rho_{k+1}+\cdots&+\underline{\phi_{p+k-1}}\rho_{p-1} &+\underline{\phi_{p+k}}\rho_{p} &=&-\phi_k\\ &\vdots\\ (\phi_{p-1}+\underline{\phi_{p+1}}) \rho_{1}&+(\phi_{p-2}+\underline{\phi_{p+2}})\rho_2&+(\phi_{p-3}+\underline{\phi_{p+3}}) \rho_{3}&+\cdots&+(\phi_2+\underline{\phi_{2p-2}})\rho_{p-2}&+(\phi_1+\underline{\phi_{2p-1}})\rho_{p-1}&+(\underline{\phi_{2p}}-1)\rho_{p}&=&-\phi_p\\ \end{cases}\]

\(\qquad\) 可以利用Cramer法则，\(\rho_i=\frac{|D_i|}{|D|}\), 其中\(D\)是系数矩阵，求解该线性方程组得到 \(\rho_1,\rho_2,\cdots,\rho_p\) 的值,

\(\qquad\) 然后利用递归关系\(\rho_k=\phi_1 \rho_{k-1}+\phi_2 \rho_{k-2}+\cdots+\phi_p \rho_{k-p}\)求得任意高阶\((k>p)\)时的\(\rho_k.\)

\[ \]

Green函数递推公式

\(\qquad\) 将AR(p)的逆转形式 \(Y_t=\Psi(B)e_t\) 代入传递形式 \(\Phi(B)Y_t=e_t\),

\[\begin{align} \Phi(B)\Psi(B)e_t&=e_t\\ (1-\phi_1B-\phi_2 B^2-\cdots-\phi_p B^p)(\psi_0+\psi_1B+\psi_2B^2+\cdots)e_t&=e_t\\ \\ \end{align}\]

\[\begin{align} \psi_0-\phi_1\psi_0 &B -\phi_2\psi_0B^2-\phi_3\psi_0B^3-\cdots-\phi_p\psi_0B^p\\ +\psi_1&B-\phi_1\psi_1B^2-\phi_2\psi_1B^3-\cdots-\phi_{p-1}\psi_1B^{p}-\phi_p\psi_1B^{p+1}\\ &\quad\qquad\psi_2B^2-\phi_1\psi_2B^3-\cdots-\phi_{p-2}\psi_2B^p-\phi_{p-1}\psi_2B^{p+1}-\phi_{p}\psi_2B^{p+2}\\ &\quad\qquad\quad\qquad\cdots\cdots \end{align}\]

\(\qquad\) 左右比较\(e_j\)的系数，可以得到 \[\begin{cases} \psi_0&=1, \\ \psi_1-\phi_1\psi_0&=0,\\ \psi_2-\phi_1\psi_1-\phi_2\psi_0&=0,\\ \psi_3-\phi_1\psi_2-\phi_2\psi_1-\phi_3\psi_0&=0,\\ &\vdots\\ \psi_k-\phi_1\psi_{k-1}-\phi_2\psi_{k-2}-\cdots-\phi_k\psi_0&=0,\quad(k<p)\\ &\vdots\\ \psi_p-\phi_1\psi_{p-1}-\phi_2\psi_{p-2}-\cdots-\phi_p\psi_0&=0,\\ \psi_{p+1}-\phi_1\psi_{p}-\phi_2\psi_{p-1}-\cdots-\phi_{p}\psi_1&=0,\\ &\vdots\\ \psi_{m}-\phi_1\psi_{m-1}-\phi_2\psi_{m-2}-\cdots-\phi_{p}\psi_{m-p}&=0,\quad(m>p)\\ &\vdots\\ \end{cases}\]

\(\qquad\) 解方程组可得

\[\begin{cases} \psi_0&=&1, \\ \psi_1&=&\phi_1\psi_0=\phi_1,\\ \psi_2&=&\phi_1\psi_1+\phi_2\psi_0=\phi_1^2+\phi_2,\\ \psi_3&=&\phi_1\psi_2+\phi_2\psi_1+\phi_3\psi_0=\phi_1^3+2\phi_1\phi_2+\phi_3,\\ &\vdots\\ \psi_k&=&\phi_1\psi_{k-1}+\phi_2\psi_{k-2}+\cdots+\phi_k\psi_{0}, \quad \text{ if }k\leq p \\ &\vdots\\ \psi_m&=&\phi_1\psi_{m-1}+\phi_2\psi_{m-2}+\cdots+\phi_p\psi_{m-p}, \quad \text{ if }m > p \end{cases}\]

4.4 ARMA(p,q)模型

\[ Y_t=\phi_1 Y_{t-1}+\phi_2 Y_{t-2}+\cdots+\phi_p Y_{t-p}+e_t-\theta_1 e_{t-1}-\theta_2 e_{t-2}-\cdots-\theta_q e_{t-q}, \qquad\text{ 其中 } e_t\sim WN(0,\sigma_e^2). \]

用滞后算子B表示

\[\begin{align} \Phi(B)Y_t&=\Theta(B)e_t\\ 其中，\Phi(B)&=1-\phi_1 B-\phi_2 B^2+\cdots-\phi_p B^p, \\ \Theta(B)&=1-\theta_1 B-\theta_2 B^2+\cdots-\theta_q B^q, \end{align}\]

当自回归系数多项式的零点（\(\Phi(z)=0\)的根）位于单位圆外时，ARMA(p,q)序列是平稳的。
当\(p=q=1\)时，ARMA(1,1)过程

\[Y_t=\phi Y_{t-1}+e_t-\theta e_{t-1}, \qquad |\phi|<1.\]

\(\quad\) -均值 \(\mu=\frac{0}{1-\phi}=0\)，因为\(E(Y_t)=\phi E(Y_{t-1})+E(e_t)-\theta E(e_{t-1}).\)

\(\quad\) -方差 \[\begin{align} Cov(e_t,Y_t)&=Cov(e_t, \phi Y_{t-1}+e_t-\theta e_{t-1})=\sigma_e^2\\ Cov(e_{t-1},Y_t)&=Cov(e_{t-1},\phi Y_{t-1}+e_t-\theta e_{t-1})=(\phi-\theta)\sigma_e^2 \\ \gamma_0=Cov(Y_t,Y_t)&=Cov(\phi Y_{t-1}+e_t-\theta e_{t-1},Y_t)\\ &= \phi Cov(Y_{t-1},Y_t)+Cov(e_t,Y_t)-\theta Cov(e_{t-1},Y_t) \\ &= \phi \gamma_1+\sigma_e^2-\theta(\phi-\theta)\sigma_e^2 \end{align}\]

\(\quad\) -自协方差/自相关函数(\(k=1\)时)

\[\begin{align} \gamma_1=Cov(Y_t,Y_{t-1})&=Cov(\phi Y_{t-1}+e_t-\theta e_{t-1},Y_{t-1})\\ &=\phi \gamma_0-\theta \sigma_e^2 \end{align}\]

\(\quad\) 解二元线性方程组，

\[\begin{cases} \gamma_0 &=& \phi \gamma_1+\sigma_e^2-\theta(\phi-\theta)\sigma_e^2\\ \gamma_1 &=&\phi \gamma_0-\theta \sigma_e^2 \end{cases}\]

\(\quad\) 可得 \(\gamma_0=\frac{1-2\phi\theta+\theta^2}{1-\phi^2}\sigma_e^2, \quad \gamma_1=\frac{(\phi-\theta)(1-\phi\theta)}{1-\phi^2}\sigma_e^2\)，从而可得一阶自相关函数为 \(\rho_1=\frac{\gamma_1}{\gamma_0}=\frac{(\phi-\theta)(1-\phi\theta)}{1-2\phi\theta+\theta^2}\)

\(\quad\) -自协方差/自相关函数, \(\rho_k=\phi^{k-1} \rho_1=\frac{(\phi-\theta)(1-\phi\theta)}{1-2\phi\theta+\theta^2}\phi^{k-1},\quad k\geq 1.\)

\[\begin{align} \gamma_2=Cov(Y_t,Y_{t-2})&=Cov(\phi Y_{t-1}+e_t-\theta e_{t-1},Y_{t-2})=\phi \gamma_1 \\ \gamma_3=Cov(Y_t,Y_{t-3})&=Cov(\phi Y_{t-1}+e_t-\theta e_{t-1},Y_{t-3})=\phi \gamma_2 =\phi^2 \gamma_1\\ &\vdots\\ \gamma_k=Cov(Y_t,Y_{t-k})&=Cov(\phi Y_{t-1}+e_t-\theta e_{t-1},Y_{t-k})=\phi \gamma_{k-1}=\phi^{k-1} \gamma_1 \\ \end{align}\]

\(\qquad\) 结论：ARMA(1,1)是平稳的充分必要条件也是\(|\phi|<1\)，自相关函数\(\rho_k=\phi^{k-1}\rho_1\)随着时滞\(k\)的增加而指数衰减。阻尼因子是 \(\phi\)，但递减开始于 \(\rho_1\) （依赖于\(\theta\)）; 这与AR(1)的自相关函数\(\rho_k\substack{{\tiny AR(1)}\\=}\phi^k\rho_0\)不同.

\(\qquad\) 例如，ARMA(1,1)序列\(\phi=0.8,\theta=0.4\), 则\(\rho_1=0.523, \rho_2=0.418, \rho_3=0.335.\)

###ARMA(2)的自相关函数的理论值 #Exhibit 4.18 
Rho1 <- ARMAacf(ar=c(0.8),ma=-c(0.4), lag.max = 12, pacf=FALSE);Rho1

##          0          1          2          3          4          5          6 
## 1.00000000 0.52307692 0.41846154 0.33476923 0.26781538 0.21425231 0.17140185 
##          7          8          9         10         11         12 
## 0.13712148 0.10969718 0.08775775 0.07020620 0.05616496 0.04493197

Rho2 <- ARMAacf(ar=c(-0.8),ma=-c(0.4), lag.max = 12, pacf=FALSE);Rho2

##           0           1           2           3           4           5 
##  1.00000000 -0.88000000  0.70400000 -0.56320000  0.45056000 -0.36044800 
##           6           7           8           9          10          11 
##  0.28835840 -0.23068672  0.18454938 -0.14763950  0.11811160 -0.09448928 
##          12 
##  0.07559142

opar=par(mfrow=c(1,2))
plot(Rho1, type='h',  xlab='Lag', ylab=expression(rho[k]), main=expression(paste(phi==0.8, ",  ", theta==0.4)), ylim=c(-1.0,1.0)); 
points(Rho1, pch=8, col='blue', cex=0.8); abline(h=0);
plot(Rho2, type='h',  xlab='Lag', ylab=expression(rho[k]), main=expression(paste(phi==-0.8, ",  ", theta==0.4)), ylim=c(-1.0,1.0)); 
points(Rho2, pch=8, col='blue', cex=0.8); abline(h=0);

par(opar)

\(\qquad\)- ARMA(1,1)的一般线性表示

\[\begin{align} &&Y_t&=\phi Y_{t-1}+e_t-\theta e_{t-1}\\ \phi \cdot &&Y_{t-1}&=\phi Y_{t-2}+e_{t-1}-\theta e_{t-2} &&\cdot\phi\\ \phi^2 \cdot &&Y_{t-2}&=\phi Y_{t-3}+e_{t-2}-\theta e_{t-3} &&\cdot\phi^2\\ &&&\cdots\\ \phi^{k-1} \cdot &&Y_{t-(k-1)}&=\phi Y_{t-k}+e_{t-(k-1)}-\theta e_{t-k} &&\cdot\phi^{k-1}\\ \end{align}\]

\(\qquad\)求和可得 \(Y_t=e_t+(\phi-\theta) e_{t-1}+(\phi-\theta)\phi e_{t-2}+\cdots+(\phi-\theta)\phi^{k-2} e_{t-(k-1)}-\theta\phi^{k-1} e_{t-k}+\phi^k Y_{t-k}.\)

\(\qquad\)如果\(|\phi|<1\), 并且\(k\to \infty\), 那么 \[Y_t=e_t+(\phi-\theta)(e_{t-1}+\phi e_{t-2}+\phi^2 e_{t-3}+\cdots)=e_t+(\phi-\theta)\sum_{i=1}^\infty\phi^{i-1} e_{t-i}. \]

\[ \]

ARMA(p,q)过程的一般线性表示

\(\qquad\) 若\(\Phi(z)\)的零点在单位圆之外，\(Y_t\) 为平稳可逆的过程。设ARMA(p,q)的线性表示为\(Y_t=\Psi(B)e_t=1+\psi_1B+\psi_2B^2+\cdots\)，

\[\begin{align} \Phi(B)Y_t&=\Theta(B)e_t \\ \Phi(B)\Psi(B)e_t&=\Theta(B)e_t\\ (1-\phi_1 B-\phi_2 B^2-\cdots)(1+\psi_1 B+\psi_2 B^2+\cdots)e_t&=(1-\theta_1 B-\theta_2 B^2-\cdots)e_t \end{align}\]

\(\qquad\) 等式左边展开得，

\(\qquad\) 左右比较\(e_j\)的系数，可以得到 \[\begin{cases} \psi_0&=1, \\ \psi_1-\phi_1\psi_0&=-\theta_1,\\ \psi_2-\phi_1\psi_1-\phi_2\psi_0&=-\theta_2,\\ \psi_3-\phi_1\psi_2-\phi_2\psi_1-\phi_3\psi_0&=-\theta_3,\\ &\vdots\\ \psi_k-\phi_1\psi_{k-1}-\phi_2\psi_{k-2}-\cdots-\phi_k\psi_0&=-\theta_k,\quad(k<p)\\ &\vdots\\ \psi_p-\phi_1\psi_{p-1}-\phi_2\psi_{p-2}-\cdots-\phi_p\psi_0&=-\theta_p,\\ \psi_{p+1}-\phi_1\psi_{p}-\phi_2\psi_{p-1}-\cdots-\phi_{p}\psi_1&=-\theta_{p+1},\\ &\vdots\\ \psi_{m}-\phi_1\psi_{m-1}-\phi_2\psi_{m-2}-\cdots-\phi_{p}\psi_{m-p}&=-\theta_{m},\quad(m>p)\\ &\vdots\\ \end{cases}\]

\(\qquad\) 令\(\theta_j=0 (当j>q时)\)，解方程组可得

\[\begin{cases} \psi_0&=&1, \\ \psi_1&=&-\theta_{1}+\phi_1\psi_0,\\ \psi_2&=&-\theta_{2}+\phi_1\psi_1+\phi_2\psi_0,\\ \psi_3&=&-\theta_{3}+\phi_1\psi_2+\phi_2\psi_1+\phi_3\psi_0,\\ &\vdots\\ \psi_k&=&-\theta_{k}+\phi_1\psi_{k-1}+\phi_2\psi_{k-2}+\cdots+\phi_k\psi_{0}, \quad \text{ if }k\leq p \\ &\vdots\\ \psi_m&=&-\theta_{m}+\phi_1\psi_{m-1}+\phi_2\psi_{m-2}+\cdots+\phi_p\psi_{m-p}, \quad \text{ if }m > p \end{cases}\]

\[ \]

ARMA(p,q)过程的自相关函数

\[Y_t=\sum_{j=0}^\infty \psi_j e_{t-j}=\sum_{i=1}^p \phi_i Y_{t-i}-\sum_{j=0}^q \theta_j e_{t-j}, \qquad \theta_0=-1.\] \(\qquad\) -均值 \(\mu=0\)

\(\qquad\) -自协方差/自相关函数

\[Cov(e_{t-j},Y_{t-k})=Cov(e_{t-j},\sum_{m=0}^\infty \psi_m e_{t-k-m})=\begin{cases}\psi_{j-k}\sigma_e^2,&\quad \text{ if } j\geq k,\\ 0, &\quad \text{ if }j< k.\end{cases}\]

\[\begin{align} Cov(Y_t,Y_{t-k})&=Cov(\sum_{i=1}^p \phi_i Y_{t-i}-\sum_{j=0}^q \theta_j e_{t-j},Y_{t-k})\\ &=\sum_{i=1}^p \phi_i Cov(Y_{t-i},Y_{t-k})-\sum_{j=0}^q \theta_j Cov(e_{t-j},Y_{t-k})\\ \gamma_k&=\begin{cases} \sum_{i=1}^p \phi_i\gamma_{i-k}-\sigma_e^2\sum_{j=k}^q \theta_j\psi_{j-k},&\quad \text{ if }0\leq k\leq q,\\ \sum_{i=1}^p \phi_i\gamma_{i-k},&\quad \text{ if } k > q. \end{cases}\end{align}\]

\(\qquad\) 根据上式，\(\gamma_{-k}=\gamma_k\), 求解\(q+1\)元线性方程组得\(\gamma_0,\gamma_1,\cdots,\gamma_q\)，然后利用 \(\rho_i=\frac{\gamma_k}{\gamma_0}\) 计算 \(\rho_1,\cdots,\rho_q\),

\[\rho_k=\phi_1 \rho_{k-1}+\phi_2 \rho_{k-2}+\cdots+\phi_p \rho_{k-p}, \quad \text{ if } k>q.\]

4.5 MA(q)的可逆性

比较两个MA(1)过程： \(Y_t=e_t-\theta e_{t-1}\) 和 \(X_t=e_t-\frac{1}{\theta} e_{t-1}\)

\(\qquad\) 自相关函数分别为 \[\rho_k^Y= \begin{cases} 1, &\text{ if }k=0\\ -\frac{\theta}{1+\theta^2}, &\text{ if }k=1\\ 0, &\text{ if }k\geq 2\\ \end{cases} \qquad\qquad \rho_k^X= \begin{cases} 1, &\text{ if }k=0\\ -\frac{1/\theta}{1+(1/\theta)^2}=-\frac{\theta}{1+\theta^2}, &\text{ if }k=1\\ 0, &\text{ if }k\geq 2\\ \end{cases}\] \(\qquad\) 参数互为倒数的两个MA(1)过程有相同的自相关函数。（习题4.4 on P.58）

比较两个MA(2)过程：\(Y_t=e_t-\frac{1}{6} e_{t-1}-\frac{1}{6}\theta_2 e_{t-2}\) 和 \(X_t=e_t+ e_{t-1}-6 e_{t-2}.\)

\(\qquad\)自相关函数分别为 \[\rho_k^Y= \begin{cases} 1, &\text{ if }k=0\\ \frac{-\theta_1+\theta_1\theta_2}{1+\theta_1^2+\theta_2^2} =\frac{-\frac{1}{6}+\frac{1}{6}\frac{1}{6}}{1+\frac{1}{36}+\frac{1}{36}} =\frac{-5}{38}, &\text{ if }k=1\\ \frac{-\theta_2}{1+\theta_1^2+\theta_2^2}=\frac{-\frac{1}{6}}{1+\frac{1}{36}+\frac{1}{36}}=\frac{-6}{38}, &\text{ if }k=2\\ 0, &\text{ if }k\geq 3\\ \end{cases} \qquad\qquad \rho_k^X= \begin{cases} 1, &\text{ if }k=0\\ \frac{-(-1)+(-1)\cdot 6}{1+(-1)^2+6^2} =\frac{-5}{38}, &\text{ if }k=1\\ \frac{-6}{1+(-1)^2+6^2}=\frac{-6}{38}, &\text{ if }k=2\\ 0, &\text{ if }k\geq 3\\ \end{cases} \]

\(\qquad\) 不同参数的这两个MA(2)过程却有相同的自相关函数。 （习题4.12 on P.59）

\(\qquad\) 相应的特征多项式分别为

\[\begin{align}\Theta^Y(z)&=1-\frac{1}{6}z-\frac{1}{6}z^2=-\frac{1}{6}(z+3)(z-2)=(1+\frac{z}{3})(1-\frac{z}{2})\\ \Theta^X(w)&=1+w-6w^2=(1+3w)(1-2w)\end{align}\]

\(\qquad\) 这两个MA(2)的特征多项式的根互为倒数， \(z_{1,2}=\frac{1}{w_{1,2}}.\)

ARMAacf(ma=-c(1/6,1/6),lag.max=6)

##          0          1          2          3          4          5          6 
##  1.0000000 -0.1315789 -0.1578947  0.0000000  0.0000000  0.0000000  0.0000000

ARMAacf(ma=-c(-1,6),lag.max=6)

##          0          1          2          3          4          5          6 
##  1.0000000 -0.1315789 -0.1578947  0.0000000  0.0000000  0.0000000  0.0000000

将MA(1)过程改写为\(AR(\infty)\)过程

\[\begin{align} &&Y_t&=e_t-\theta e_{t-1}\\ \theta \cdot &&Y_{t-1}&=e_{t-1}-\theta e_{t-2} &&\cdot\theta\\ \theta^2 \cdot &&Y_{t-2}&=e_{t-2}-\theta e_{t-3} &&\cdot\theta^2\\ &&&\cdots\\ \theta^{k-1} \cdot &&Y_{t-(k-1)}&=e_{t-(k-1)}-\theta e_{t-k} &&\cdot\theta^{k-1}\\ \end{align}\]

\(\qquad\)求和可得 \(Y_t+\theta Y_{t-1}+\theta^2 Y_{t-2}+\cdots+\theta^{k-1} Y_{t-(k-1)}=e_t-\theta^k e_{t-k}.\)

\(\qquad\)如果\(|\theta|<1\), 并且\(k\to\infty\), 那么 \[Y_t=e_t-\theta Y_{t-1}-\theta^2 Y_{t-2}-\cdots-\theta^k Y_{t-k}-\cdots \]

\(\qquad\qquad\) 结论：\(\underline{ MA(1){\bf 可逆 }\iff|\theta|<1\iff MA{\bf 系数多项式 }\Theta(z){\bf 的零点在单位圆之外}。}\)

MA(q)或者ARMA(q)模型的可逆性

\(\qquad\qquad\) MA系数多项式 \(\qquad\qquad\Theta(z)=1-\theta_1z--\theta_2 z^2-\cdots-\theta_q z^q\)

\(\qquad\qquad\) MA特征方程为 \(\qquad\qquad1-\theta_1z--\theta_2 z^2-\cdots-\theta_q z^q=0\)

\(\qquad\) 类似于AR模型的平稳性条件，\(\underline{MA(q){\bf模型可逆}\iff MA{\bf系数多项式的零点(}MA{\bf特征方程的根)在单位圆之外。}}\)

\[Y_t=\pi_1 Y_{t-1}+\pi_2 Y_{t-2}+\pi_3 Y_{t-3}+\cdots+e_t\]

\(\qquad\) \(\boxed{对于一般的ARMA(p,q)模型，要求同时满足{\bf平稳性和可逆性}条件。}\)

第五章: 非平稳时间序列模型

5.1 \(Y_t\sim ARIMA(p,d,q)\) 模型

\[ W_t=\phi_1 W_{t-1}+\phi_2 W_{t-2}+\cdots+\phi_p W_{t-p}+e_t-\theta_1 e_{t-1}-\theta_2 e_{t-2}-\cdots-\theta_q e_{t-q}. \]

\(\qquad 其中 \quad W_t=\nabla^d Y_t=(1-B)^d Y_t=\sum_{j=0}^d(-1)^j{d \choose j}B^jY_t=\sum_{j=0}^d(-1)^j{d \choose j}Y_{t-j}, \quad e_t\sim WN(0,\sigma_e^2).\)

时间序列分析理论及R代码实现

原俊青，浙江工业大学

http://homepage.zjut.edu.cn/yjq/

第一章: 绪论

第二章: 基本概念

第三章：趋势

3.1 随机趋势

3.2 确定趋势

3.3 均值函数为常数

3.4 回归方法

3.5 回归估计的性质及残差分析

第四章: 平稳时间序列模型

4.1 滑动平均过程

4.2 一般线性过程

4.3 自回归过程

4.4 ARMA(p,q)模型

4.5 MA(q)的可逆性

第五章: 非平稳时间序列模型

5.1 \(Y_t\sim ARIMA(p,d,q)\) 模型

第六章: 模型识别

第七章: 参数估计

第八章：模型诊断

第九章: 预测