什么是时间序列?
和统计里的样本有什么区别?
研究目的?研究方法?
统计指标?分析软件?
时间序列分析步骤?
常用的ARIMA模型?
例1. 加利福尼亚州洛杉矶地区100多年来的年降水量时间序列图。
#collapse=TRUE, results='hold', fig.height = 8,fig.width = 6, fig.show=‘hold’}
#options(width = 50)
data(larain)
summary(larain)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 4.080 9.675 14.140 14.888 18.400 40.290
plot(larain, ylab='Inches', xlab='Year', type='o', main='图1-1 洛杉矶年降水量时序图')
plot(y=larain,x=zlag(larain),ylab='Inches',xlab='Previous Year Inches', main='图1-2 洛杉矶当年降水量与去年降水量散点图')
例2. 某化工过程中颜色属性的时间序列图。
data(color)
summary(color)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 63.00 70.00 76.00 74.89 79.00 87.00
plot(color, ylab='Color Property', xlab='Batch', type='l', main='图1-3. 某化工过程中颜色属性的时间序列图')
points(color, col='blue')
plot(y=color, x=zlag(color), ylab='Color Property', xlab='Prevous Batch Color Property', main='图1-4.当前颜色值与前期颜色值的散点图')
例3. 加拿大野兔丰度的时间序列图。
data(hare)
summary(hare)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.00 20.00 27.00 39.55 65.00 95.00
plot(hare, ylab='Abundance', xlab='Year', type='l', main='图1-5. 加拿大野兔丰度的时间序列图')
points(hare, col='blue')
plot(y=hare, x=zlag(hare), ylab='Abundance', xlab='Previous Year Abundance', main='图1-6.当年与上一年野兔丰度的散点图')
例4. 艾奥瓦州迪比克市月平均气温的时间序列图。
data(tempdub)
summary(tempdub)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 8.40 27.32 48.60 46.27 63.12 74.00
plot(tempdub, ylab='Temperature', type='l', main='图1-7. 艾奥瓦州迪比克市月平均气温的时间序列图')
points(tempdub, col='blue')
plot(y=tempdub, x=zlag(tempdub,d=12), ylab='Temperature', xlab='Previous Year Temperature', main='图1-7’.当年与上一年月平均气温的散点图')
例5. 滤油器月销售量的时间序列图。
data(oilfilters)
summary(oilfilters)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1572 2314 3262 3388 4488 5862
plot(oilfilters, type='l', ylab='Sales', main='图1-8. 滤油器月销售量的时间序列图')
points(oilfilters, col='blue')
plot(y=oilfilters, x=zlag(oilfilters,d=12), ylab='Sales', xlab='Previous Year Sales', main='图1-8’.当年与上一年月销售量的散点图')
plot(oilfilters,type='l',ylab='Sales', main='图1-9. 以特殊符号绘制的滤油器月销售量')
points(y=oilfilters,x=time(oilfilters), pch=as.vector(season(oilfilters)))
习题1.(on P.7)
Question 3~5.
### rnorm(n), rchisq(n,df=2), rt(n,df=5)
n=48;
set.seed(12345); tsnorm1 = ts(rnorm(n),freq=1,start=1)
set.seed(1234); tsnorm2 = ts(rnorm(n),freq=1,start=1)
set.seed(123); tsnorm3 = ts(rnorm(n),freq=1,start=1)
set.seed(12); tsnorm4 = ts(rnorm(n),freq=1,start=1)
opar=par(mfrow=c(2,2))
plot(tsnorm1,type='o'); abline(h=0,col='red')
plot(tsnorm2,type='o'); abline(h=0,col='red')
plot(tsnorm3,type='o'); abline(h=0,col='red')
plot(tsnorm4,type='o'); abline(h=0,col='red')
par(opar)
Question 6.
data(tempdub)
#op <- par(bg = "light blue")
plot(tempdub,ylab='Temperature',type='l',col = "blue")
Month=c("J","A","S","O","N","D","J","F","M","A","M","J")
points(tempdub,pch=Month,col='red')
#points(tempdub,pch=as.vector(season(tempdub)),col='red')
#par(op)
\[ \gamma_{t,s} = Cov(X_t,X_s) = E(X_t,X_s)-\mu_t\mu_s. \]
\[ \rho_{t,s} = Corr(X_t,X_s) =\frac {\gamma_{t,s}} {\sqrt{\gamma_{t,t} \gamma_{s,s}}}. \]
任意有限维分布都与初始时间点无关,只与时间间隔相关。如果期望和方差都存在的话,那么我们就有 \[ X_t \overset{d} {=} X_s. \qquad \mu_t=\mu_s\equiv\mu, \qquad \sigma_t^2=\sigma_s^2\equiv\sigma^2\] \[ \left( X_{t_1},X_{t_2}\right) \overset{d} {=} \left(X_{t_1+k},X_{t_2+k}\right). \qquad \gamma_{t,s}=\gamma_{0,\left|t-s\right|}\] \[ \left(X_{t_1},X_{t_2},\cdots,X_{t_n}\right) \overset{d} {=} \left(X_{t_1+k},X_{t_2+k},\cdots,X_{t_n+k}\right).\]
Let \(\quad \gamma_k=Cov(X_t,X_{t+k}), \quad \text{then} \quad \rho_k=Corr(X_t,X_{t+k})=\frac {\gamma_{t,t+k}}{\sqrt{\gamma_{t,t}\gamma_{t+k,t+k}}}=\frac{\gamma_k}{\gamma_0}.\)
宽(弱)平稳性
均值函数和方差函数都是常数:\(\qquad \mu_t\equiv\mu, \qquad \sigma_t^2\equiv\sigma^2\)
协方差仅依赖于时间的滞后:\(\qquad \gamma_k=\gamma_{t,t+k}=\gamma_{0,k}.\)
Let \(\quad \gamma_k=Cov(X_t,X_{t+k}), \quad \text{then} \quad \rho_k=Corr(X_t,X_{t+k})=\frac {\gamma_{t,t+k}}{\sqrt{\gamma_{t,t}\gamma_{t+k,t+k}}}=\frac{\gamma_k}{\gamma_0}.\)
例1. 白噪声序列 {\(e_t, i.i.d.\)}, 平稳的 \[\begin{align} \gamma_k=\begin{cases} \sigma_e^2, &\text{if } k=0\\ 0, &\text{if } k>0 \end{cases} \qquad\qquad\qquad &\rho_k=\begin{cases} 1, &\text{if } k=0\\ 0, &\text{if } k>0\end{cases} \end{align}\]
n=100;
opar=par(mfrow=c(2,2))
plot(rnorm(n),xlab='time',type='o');abline(h=0,col='red')
plot(rchisq(n,2),xlab='time',type='o');abline(h=2,col='red')
plot(rt(n,5),xlab='time',type='o');abline(h=0,col='red')
plot(rexp(n),xlab='time',type='o');abline(h=1,col='red')
par(opar)
例2. 滑动平均序列 {\(X_t=\frac{e_t+e_{t-1}}{2}\)}, 平稳的 \[\begin{align} \gamma_k=\begin{cases} 0.5\cdot\sigma_e^2, &\text{if } k=0\\ 0.25\cdot\sigma_e^2, &\text{if } k=1\\ 0, &\text{if } k>1 \end{cases} \qquad\qquad\qquad &\rho_k=\begin{cases} 1, &\text{if } k=0\\ 0.5, &\text{if } k=1\\ 0, &\text{if } k>1\end{cases} \end{align}\]
n=100;
opar=par(mfrow=c(2,2))
tsnorm=rnorm(n)
tsMA1=ts((tsnorm+zlag(tsnorm))/2.0,freq=1,start=1)
plot(tsMA1,type='o',ylab='MA of N(0,1)');
abline(h=0,col='red')
tschisq=rchisq(n,2)
tsMA2=ts((tschisq+zlag(tschisq))/2.0,freq=1,start=1)
plot(tsMA2,type='o',ylab='MA of chisq(2)');
abline(h=2,col='red')
tst=rt(n,5)
tsMA3=ts((tst+zlag(tst))/2.0,freq=1,start=1)
plot(tsMA3,type='o',ylab='MA of t(5)');
abline(h=0,col='red')
tsexp=rexp(n)
tsMA4=ts((tsexp+zlag(tsexp))/2.0,freq=1,start=1)
plot(tsMA4,type='o',ylab='MA of exp(1)');
abline(h=1,col='red')
par(opar)
例3. 随机游动序列 {\(Y_t=e_1+e_2+\cdots+e_t\)}, 非平稳的 \[\mu_t=t\cdot\mu, \qquad\qquad \sigma_t^2=t\cdot\sigma_e^2.\]
\[\begin{align} \gamma_{t,s}&=Cov(e_1+e_2+\cdots+e_t,e_1+e_2+\cdots+e_s)\\ &=Var(e_1+e_2+\cdots+e_{min(t,s)})\\ &=min(t,s)\cdot \sigma_e^2 \end{align}\]
\[\rho_{t,s}=\frac{\gamma_{t,s}}{\sqrt{\gamma_{t,t}\gamma_{s,s}}}=\frac{min(t,s)\cdot \sigma_e^2}{\sqrt{t \sigma_e^2 \cdot s\sigma_e^2}}=\frac{\sqrt{min(t,s)}}{\sqrt{max(t,s)}}\]
data(rwalk)
n<-length(rwalk)
plot(rwalk,type='o',ylab='Random Walk', main = '随机游动的时间序列图1')
###生成随机游动过程方法1
Rwalk <- vector()
Rwalk[1] <- rnorm(1)
for (i in 2:n){
set.seed(123+i*456)
Rwalk[i] <- Rwalk[i-1] + rnorm(1)
}
ts_Rwalk <- ts(Rwalk,freq=1,start=1)
plot(ts_Rwalk,type='l',ylab='Random Walk',xlab='Time',main='随机游动的时间序列图2')
points(ts_Rwalk,col='blue')
###生成随机游动过程方法2:cumsum(rnorm(n))
opar=par(mfrow=c(2,2))
plot(cumsum(rnorm(n)),type='o',xlab='Time');
plot(cumsum(rchisq(n,2)),type='o',xlab='Time');
plot(cumsum(rt(n,5)),type='o',xlab='Time');
plot(cumsum(rexp(n,1)),type='o',xlab='Time');
par(opar)
受随机扰动的影响,不随时间衰减;
在不同的模拟中,可能展现完全不同的趋势。
n=60;
opar=par(mfrow=c(2,2))
plot(cumsum(rnorm(n)),type='o',xlab='Time');
plot(cumsum(rnorm(n)),type='o',xlab='Time');
plot(cumsum(rnorm(n)),type='o',xlab='Time');
plot(cumsum(rnorm(n)),type='o',xlab='Time');
par(opar)
例4. 艾奥瓦州迪比克市月平均气温的时间序列图。
data(tempdub)
plot(tempdub, ylab='Temperature', type='l', main='图1-7. 艾奥瓦州迪比克市月平均气温的时间序列图')
points(tempdub, col='blue')
均值模型: \(Y_t=\mu_t+X_t\), 其中\(E(X_t)=0.\)
随机扰动\(X_t\) 是零均值的平稳序列;
均值过程\(\mu_t\)随着时间\(t\)的变化,表现出特定的变化模式:
\[ \mu_t=\begin{cases} \mu_0, & \text{常函数} \\ \beta_0+\beta_1 t, & \text{线性函数} \\ \beta_0+\beta_1 t+\beta_2 t^2, & \text{二次函数} \\ \cdots, & \text{多项式函数} \\ \mu_{t-T}, & \text{周期函数} \\ \end{cases}\]
\[\quad\hat{\mu}=\bar{Y}, \qquad E(\hat{\mu})=\mu_0, \qquad \text{无偏估计量}\]
\[Var(\hat{\mu})=\frac{\gamma_0}{n} [1+2\sum_{k=1}^{n-1}(1-\frac{k}{n})\rho_k].\]
例1’. 白噪声序列 {\(X_t=e_t, i.i.d.\)}, 平稳的, \(Var(\hat{\mu})=\frac{\gamma_0}{n}.\)
\[\begin{gathered} \gamma_k=\begin{cases} \sigma_e^2, &\text{if } k=0\\ 0, &\text{if } k>0 \end{cases} \qquad\qquad\qquad \rho_k=\begin{cases} 1, &\text{if } k=0\\ 0, &\text{if } k>0\end{cases} \end{gathered}\]
例2’. 滑动平均序列{\(X_t=\frac{1}{2}e_t+\frac{1}{2}e_{t-1}\)}, 平稳的, \(Var(\hat{\mu})=\frac{\gamma_0}{n}[1+\frac{n-1}{n}]=\frac{\gamma_0}{n}(\frac{2n-1}{n})\simeq 2\frac{\gamma_0}{n}.\)
\[\begin{align} \gamma_k=\begin{cases} 0.5\cdot\sigma_e^2, &\text{if } k=0\\ 0.25\cdot\sigma_e^2, &\text{if } k=1\\ 0, &\text{if } k>1 \end{cases} \qquad\qquad\qquad &\rho_k=\begin{cases} 1, &\text{if } k=0\\ 0.5, &\text{if } k=1\\ 0, &\text{if } k>1\end{cases} \end{align}\]
\(\qquad\) 滑动平均序列 {\(X_t=e_t-\frac{1}{2}e_{t-1}\)}, 平稳的, \(Var(\hat{\mu})=\frac{\gamma_0}{n}[1-0.8(\frac{n-1}{n})]=\frac{\gamma_0}{n}(\frac{0.2n+0.8}{n})\simeq 0.2\frac{\gamma_0}{n}.\)
\[\begin{align} \gamma_k=\begin{cases} 1.25\cdot\sigma_e^2, &\text{if } k=0\\ -0.5\cdot\sigma_e^2, &\text{if } k=1\\ 0, &\text{if } k>1 \end{cases} \qquad\qquad\qquad &\rho_k=\begin{cases} 1, &\text{if } k=0\\ -0.4, &\text{if } k=1\\ 0, &\text{if } k>1\end{cases} \end{align}\]
例5. 一般的平稳序列,随着滞后的增加,自相关函数迅速衰减,\(\sum_{k=0}^{+\infty}|\rho_k| < \infty.\) 当\(n\)充分大的时候, \(Var(\hat{\mu}) \simeq \frac{\gamma_0}{n}[\sum_{k=0}^{+\infty}\rho_k].\)
\(\qquad\) 例如,对于所有的整数\(k\), \(\rho_k=\phi^{|k|}\), \(\phi\in(-1, +1).\) 那么当\(n\)充分大的时候,\(Var(\hat{\mu}) \simeq \frac{\gamma_0}{n}[1+\frac{2\phi}{1-\phi}]=\frac{(1+\phi)\gamma_0}{(1-\phi)n}.\)
\[ \qquad\]
\[Var(\hat\mu)=\frac{1}{n^2}\sum_{i=1}^n\sum_{j=1}^n Cov(X_i,X_j)=\frac{1}{n^2}[\sum_{i=1}^n Var(X_i)+2\sum_{i=2}^n\sum_{j=1}^{i-1} \gamma_{i,j}]\]
例3’. 随机游动序列 {\(X_t=e_1+e_2+\cdots+e_t\)}, 非平稳的, \(\mu_t=t\cdot\mu, \quad \sigma_t^2=t\cdot\sigma_e^2.\)
\[\begin{align} \gamma_{t,s}&=Cov(e_1+e_2+\cdots+e_t,e_1+e_2+\cdots+e_s)\\ &=Var(e_1+e_2+\cdots+e_{min(t,s)})\\ &=min(t,s)\cdot \sigma_e^2 \end{align}\]
\[\rho_{t,s}=\frac{\gamma_{t,s}}{\sqrt{\gamma_{t,t}\gamma_{s,s}}}=\frac{min(t,s)\cdot \sigma_e^2}{\sqrt{t \sigma_e^2 \cdot s\sigma_e^2}}=\frac{\sqrt{min(t,s)}}{\sqrt{max(t,s)}}\]
\[\begin{align} Var(\hat{\mu})&=\frac{1}{n^2}\left [\sum_{i=1}^n i\sigma_e^2+2\sum_{i=2}^n\sum_{j=1}^{i-1} j\sigma_e^2\right ]\\ &=\frac{\sigma_e^2}{n^2}\left [\sum_{i=1}^n i+ 2\sum_{j=1}^{n-1}\sum_{i=j+1}^n j\right ]\\ &=\frac{\sigma_e^2}{n^2}\left [\sum_{i=1}^n i+ 2\sum_{j=1}^{n-1} (n-j) j\right ] \\ &=\frac{\sigma_e^2}{n^2}\left [(1+2n)\sum_{i=1}^n i- 2\sum_{j=1}^{n} j^2\right ] \\ &=\frac{\sigma_e^2}{n^2}\left [(1+2n)\frac{n(n+1)}{2}- 2 \frac{n(n+1)(2n+1)}{6}\right ] \\ &=\sigma_e^2 (2n+1) \frac{(n+1)}{6n} \end{align}\]
\(\qquad\) {\(X_t\)}满足Gauss-Markov假设(零均值,等方差,不相关).
\(\qquad\) 最小二乘法估计:最小化 \(Q(\beta_0,\beta_1)=\sum_{t=1}^n [Y_t-(\beta_0+\beta_1 t)]^2\), 令偏导数为0,求解。
\[\begin{align} \hat{\beta_1} &= \frac{\sum_{t=1}^n t(Y_t-\bar{Y})}{\sum_{t=1}^n (t-\bar{t})^2}=\frac{\sum_{t=1}^n (t-\bar{t})(Y_t-\bar{Y})}{\sum_{t=1}^n (t-\bar{t})^2}\\ \hat{\beta_0} &= \bar{Y}-\hat{\beta_1}\bar{t} \end{align}\]
data(wages)
plot(wages,type='l',ylab='Wages',xlab='Time',main='Monthly wages')
points(wages,col='blue')
\(\qquad\) 时间的一次函数拟合
lm_wages=lm(wages~time(wages))
summary(lm_wages)
##
## Call:
## lm(formula = wages ~ time(wages))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.23828 -0.04981 0.01942 0.05845 0.13136
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.490e+02 1.115e+01 -49.24 <2e-16 ***
## time(wages) 2.811e-01 5.618e-03 50.03 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.08257 on 70 degrees of freedom
## Multiple R-squared: 0.9728, Adjusted R-squared: 0.9724
## F-statistic: 2503 on 1 and 70 DF, p-value: < 2.2e-16
plot(wages,type='l',ylab='Wages',xlab='Time',main='一次时间函数趋势')
points(wages,col='blue')
abline(lm_wages,col='red')
最小二乘估计的斜率\(\hat{\beta}_1=0.1341\) 和截距 \(\hat{\beta}_0=-1.008\). 图表3-2展示了随机游动并叠加了最小二乘回归趋势线。
\(\qquad\) 二次时间函数拟合
lm2_wages=lm(wages~time(wages)+I(time(wages)^2))
summary(lm2_wages)
##
## Call:
## lm(formula = wages ~ time(wages) + I(time(wages)^2))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.148318 -0.041440 0.001563 0.050089 0.139839
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -8.495e+04 1.019e+04 -8.336 4.87e-12 ***
## time(wages) 8.534e+01 1.027e+01 8.309 5.44e-12 ***
## I(time(wages)^2) -2.143e-02 2.588e-03 -8.282 6.10e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05889 on 69 degrees of freedom
## Multiple R-squared: 0.9864, Adjusted R-squared: 0.986
## F-statistic: 2494 on 2 and 69 DF, p-value: < 2.2e-16
xfit <- time(wages)
yfit <- fitted(lm2_wages)
plot(wages,type='l',ylab='Wages',xlab='Time',main='二次时间函数趋势')
points(wages,col='blue')
lines(as.vector(xfit),as.vector(yfit),col='red')
data(tempdub)
month.=season(tempdub)
model2=lm(tempdub~month.-1) # -1 removes the intercept term
summary(model2)
##
## Call:
## lm(formula = tempdub ~ month. - 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.2750 -2.2479 0.1125 1.8896 9.8250
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## month.January 16.608 0.987 16.83 <2e-16 ***
## month.February 20.650 0.987 20.92 <2e-16 ***
## month.March 32.475 0.987 32.90 <2e-16 ***
## month.April 46.525 0.987 47.14 <2e-16 ***
## month.May 58.092 0.987 58.86 <2e-16 ***
## month.June 67.500 0.987 68.39 <2e-16 ***
## month.July 71.717 0.987 72.66 <2e-16 ***
## month.August 69.333 0.987 70.25 <2e-16 ***
## month.September 61.025 0.987 61.83 <2e-16 ***
## month.October 50.975 0.987 51.65 <2e-16 ***
## month.November 36.650 0.987 37.13 <2e-16 ***
## month.December 23.642 0.987 23.95 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.419 on 132 degrees of freedom
## Multiple R-squared: 0.9957, Adjusted R-squared: 0.9953
## F-statistic: 2569 on 12 and 132 DF, p-value: < 2.2e-16
model3=lm(tempdub~month.) # intercept is automatically included so one month (Jan) is dropped
summary(model3)
##
## Call:
## lm(formula = tempdub ~ month.)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.2750 -2.2479 0.1125 1.8896 9.8250
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 16.608 0.987 16.828 < 2e-16 ***
## month.February 4.042 1.396 2.896 0.00443 **
## month.March 15.867 1.396 11.368 < 2e-16 ***
## month.April 29.917 1.396 21.434 < 2e-16 ***
## month.May 41.483 1.396 29.721 < 2e-16 ***
## month.June 50.892 1.396 36.461 < 2e-16 ***
## month.July 55.108 1.396 39.482 < 2e-16 ***
## month.August 52.725 1.396 37.775 < 2e-16 ***
## month.September 44.417 1.396 31.822 < 2e-16 ***
## month.October 34.367 1.396 24.622 < 2e-16 ***
## month.November 20.042 1.396 14.359 < 2e-16 ***
## month.December 7.033 1.396 5.039 1.51e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.419 on 132 degrees of freedom
## Multiple R-squared: 0.9712, Adjusted R-squared: 0.9688
## F-statistic: 405.1 on 11 and 132 DF, p-value: < 2.2e-16
# first creates the first pair of harmonic functions and then fit the model
har.=harmonic(tempdub,1)##a matrix consisting of \cos(2k ?? t), \sin(2k ?? t), k=1,2,...,m, excluding any zero functions.
model4=lm(tempdub~har.)
summary(model4)
##
## Call:
## lm(formula = tempdub ~ har.)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11.1580 -2.2756 -0.1457 2.3754 11.2671
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 46.2660 0.3088 149.816 < 2e-16 ***
## har.cos(2*pi*t) -26.7079 0.4367 -61.154 < 2e-16 ***
## har.sin(2*pi*t) -2.1697 0.4367 -4.968 1.93e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.706 on 141 degrees of freedom
## Multiple R-squared: 0.9639, Adjusted R-squared: 0.9634
## F-statistic: 1882 on 2 and 141 DF, p-value: < 2.2e-16
tempFit4 <- ts(fitted(model4),freq=12,start=c(1964,1))
plot(tempFit4,ylab='Temperature',type='l',
ylim=range(c(fitted(model4),tempdub))) # the ylim option ensures that the
# y axis has a range that fits the raw data and the fitted values
points(tempdub, col='blue')
points(tempFit4,col='red')
参考“多元线性回归分析”http://www.homepage.zjut.edu.cn/yjq/
随机项的估计和预测(残差) \[\hat X_t=Y_t-\hat Y_t\]
检验残差\(\hat X_t\)是否为白噪声序列: 零均值,等方差,不相关(正态独立)。
# Exhibit 3.8
plot(y=rstudent(model3), x=as.vector(time(tempdub)), xlab='Time', ylab='Standardized Residuals', type='o')
abline(h=0, col='red')
# Exhibit 3.9
plot(y=rstudent(model3), x=as.vector(time(tempdub)), xlab='Time', ylab='Standardized Residuals', type='l')
points(y=rstudent(model3), x=as.vector(time(tempdub)), pch=as.vector(season(tempdub)), col='blue')
abline(h=0, col='red')
# Exhibit 3.10
plot(y=rstudent(model3), x=as.vector(fitted(model3)), xlab='Fitted Trend Values',
ylab='Standardized Residuals',type="n")
points(y=rstudent(model3), x=as.vector(fitted(model3)), pch=as.vector(season(tempdub)))
# Exhibit 3.11
hist(rstudent(model3), xlab='Standardized Residuals', main='')
# Exhibit 3.12
qqnorm(rstudent(model3), main='QQ plot')
qqline(rstudent(model3), col='red')
#Shapiro-Wilk检验: H_0 正态
shapiro.test(rstudent(model3))
##
## Shapiro-Wilk normality test
##
## data: rstudent(model3)
## W = 0.9929, p-value = 0.6954
#t检验: H_0 均值为0
t.test(residuals(model3))
##
## One Sample t-test
##
## data: residuals(model3)
## t = -1.1355e-15, df = 143, p-value = 1
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## -0.5410872 0.5410872
## sample estimates:
## mean of x
## -3.108378e-16
假定样本平稳,序列具有不变的均值和方差,定义滞后\(k\)的样本自相关函数为 \[ r_k=\frac{\sum_{t=k+1}^n (Y_t-\overline Y)(Y_{t-k}-\overline Y)}{\sum_{t=1}^n (Y_t-\overline Y)^2}, \qquad k=1,2,\cdots\] \(r_k\)相对滞后\(k\)的图形被称作自相关函数图。
# Exhibit 3.13
acf(rstudent(model3), main='ACF')
# 游程检验:H_0 独立
runs(rstudent(model3))
## $pvalue
## [1] 0.216
##
## $observed.runs
## [1] 65
##
## $expected.runs
## [1] 72.875
##
## $n1
## [1] 69
##
## $n2
## [1] 75
##
## $k
## [1] 0
例3”. 随机游动过程,错误地当成线性时间趋势处理:
data(rwalk)
plot(rwalk,type='o',ylab='rWalk',main='图表3-2* 随机游动')
model1=lm(rwalk~time(rwalk))
summary(model1)
##
## Call:
## lm(formula = rwalk ~ time(rwalk))
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.70045 -0.79782 0.06391 0.63064 2.22128
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.007888 0.297245 -3.391 0.00126 **
## time(rwalk) 0.134087 0.008475 15.822 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.137 on 58 degrees of freedom
## Multiple R-squared: 0.8119, Adjusted R-squared: 0.8086
## F-statistic: 250.3 on 1 and 58 DF, p-value: < 2.2e-16
最小二乘估计的斜率\(\hat{\beta}_1=0.1341\) 和截距 \(\hat{\beta}_0=-1.008\). 图表3-2展示了随机游动并叠加了最小二乘回归趋势线。
plot(rwalk,type='o',ylab='rWalk',main='图表3-2 随机游动与线性时间趋势')
abline(model1, col='red') # add the fitted least squares line
# Exhibit 3.14
plot(y=rstudent(model1), x=as.vector(time(rwalk)), ylab='Standardized Residuals', xlab='Time', type='o')
abline(h=0, col='red')
# Exhibit 3.15
plot(y=rstudent(model1),x=fitted(model1),ylab='Standardized Residuals',
xlab='Fitted Trend Values',type='p')
# Exhibit 3.16
acf(rstudent(model1),main='ACF')
# 游程检验:H_0 独立
runs(rstudent(model1))
## $pvalue
## [1] 0.0266
##
## $observed.runs
## [1] 22
##
## $expected.runs
## [1] 30.96667
##
## $n1
## [1] 29
##
## $n2
## [1] 31
##
## $k
## [1] 0
习题3.7 (on P.38)
data(winnebago); winnebago
## Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 1966 61 48
## 1967 53 78 75 58 146 193 124 120 134 99 130 166
## 1968 168 267 314 432 355 384 232 235 293 242 248 236
## 1969 209 358 352 406 562 389 416 493 409 328 222 195
## 1970 156 439 671 874 558 621 628 652 495 344 405 586
## 1971 403 700 837 1224 1117 1214 762 846 1228 937 1396 1174
## 1972 628 1753
n <- length(winnebago); n
## [1] 64
Time_winn <- time(winnebago); Time_winn
## Jan Feb Mar Apr May Jun Jul Aug
## 1966
## 1967 1967.000 1967.083 1967.167 1967.250 1967.333 1967.417 1967.500 1967.583
## 1968 1968.000 1968.083 1968.167 1968.250 1968.333 1968.417 1968.500 1968.583
## 1969 1969.000 1969.083 1969.167 1969.250 1969.333 1969.417 1969.500 1969.583
## 1970 1970.000 1970.083 1970.167 1970.250 1970.333 1970.417 1970.500 1970.583
## 1971 1971.000 1971.083 1971.167 1971.250 1971.333 1971.417 1971.500 1971.583
## 1972 1972.000 1972.083
## Sep Oct Nov Dec
## 1966 1966.833 1966.917
## 1967 1967.667 1967.750 1967.833 1967.917
## 1968 1968.667 1968.750 1968.833 1968.917
## 1969 1969.667 1969.750 1969.833 1969.917
## 1970 1970.667 1970.750 1970.833 1970.917
## 1971 1971.667 1971.750 1971.833 1971.917
## 1972
#(a).时序图
plot(winnebago, type='l', ylab='Monthly Sale', xlab='Time', main='图1. 时序图(a)')
points(winnebago, col='blue', pch=2); abline(h=0, col='red')
#(b).时间的一次函数拟合
model_1=lm(winnebago~Time_winn)
summary(model_1)
##
## Call:
## lm(formula = winnebago ~ Time_winn)
##
## Residuals:
## Min 1Q Median 3Q Max
## -419.58 -93.13 -12.78 94.96 759.21
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -394885.68 33539.77 -11.77 <2e-16 ***
## Time_winn 200.74 17.03 11.79 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 209.7 on 62 degrees of freedom
## Multiple R-squared: 0.6915, Adjusted R-squared: 0.6865
## F-statistic: 138.9 on 1 and 62 DF, p-value: < 2.2e-16
abline(model_1, col='red')
### y_hat=-394885.68+200.74 * Time
Res_winn <- rstudent(model_1) ##标准化残差
plot(Res_winn, type='l', ylab='Standarized Residual in (b)', xlab='Time', main='图2. 残差图(b))')
points(Res_winn, col='blue', pch=3); abline(h=0, col='red')
#(c).对数变换之后,时间的一次函数拟合
winn_log <- log(winnebago) #数据对数化
plot(winn_log, type='l', ylab='Log(Monthly Sale)', xlab='Time', main='图3. 对数化之后时序图(c)')
points(winn_log, col='blue', pch=6); abline(h=0, col='red')
#(d).对数变换+一次函数拟合
model_2=lm(winn_log~Time_winn)
summary(model_2)
##
## Call:
## lm(formula = winn_log ~ Time_winn)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.03669 -0.20823 0.04995 0.25662 0.86223
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -984.93878 62.99472 -15.63 <2e-16 ***
## Time_winn 0.50306 0.03199 15.73 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3939 on 62 degrees of freedom
## Multiple R-squared: 0.7996, Adjusted R-squared: 0.7964
## F-statistic: 247.4 on 1 and 62 DF, p-value: < 2.2e-16
abline(model_2, col='red')
### Log(y_hat)=-984.93878+ 0.50306 * Time
Res_winn_log <- rstudent(model_2) ##标准化残差
plot(Res_winn_log, type='l', ylab='Residual in log (d)', xlab='Time', main='图4. 对数变换之后的残差图(d))')
points(Res_winn_log, col='blue', pch=7); abline(h=0, col='red')
#(e)季节性+对数变换+一次函数拟合
month_winn <- season(winnebago); month_winn
## [1] November December January February March April May
## [8] June July August September October November December
## [15] January February March April May June July
## [22] August September October November December January February
## [29] March April May June July August September
## [36] October November December January February March April
## [43] May June July August September October November
## [50] December January February March April May June
## [57] July August September October November December January
## [64] February
## 12 Levels: January February March April May June July August ... December
plot(winn_log, type='l', ylab='Log(Monthly Sale)', xlab='Time', main='图5. 季节性+对数化之后时序图(e)')
points(winn_log, col='blue', pch=as.vector(month_winn )); abline(h=0, col='red')
model_3=lm(winn_log~month_winn+Time_winn)
summary(model_3)
##
## Call:
## lm(formula = winn_log ~ month_winn + Time_winn)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.92501 -0.16328 0.03344 0.20757 0.57388
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -997.33061 50.63995 -19.695 < 2e-16 ***
## month_winnFebruary 0.62445 0.18182 3.434 0.001188 **
## month_winnMarch 0.68220 0.19088 3.574 0.000779 ***
## month_winnApril 0.80959 0.19079 4.243 9.30e-05 ***
## month_winnMay 0.86953 0.19073 4.559 3.25e-05 ***
## month_winnJune 0.86309 0.19070 4.526 3.63e-05 ***
## month_winnJuly 0.55392 0.19069 2.905 0.005420 **
## month_winnAugust 0.56989 0.19070 2.988 0.004305 **
## month_winnSeptember 0.57572 0.19073 3.018 0.003960 **
## month_winnOctober 0.26349 0.19079 1.381 0.173300
## month_winnNovember 0.28682 0.18186 1.577 0.120946
## month_winnDecember 0.24802 0.18182 1.364 0.178532
## Time_winn 0.50909 0.02571 19.800 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3149 on 51 degrees of freedom
## Multiple R-squared: 0.8946, Adjusted R-squared: 0.8699
## F-statistic: 36.09 on 12 and 51 DF, p-value: < 2.2e-16
lines(y=as.vector(fitted(model_3)), x=as.vector(Time_winn), col='red')
points(y=as.vector(fitted(model_3)), x=as.vector(Time_winn), col='Purple', pch=as.vector(month_winn ))
Res1_winn_log <- rstudent(model_3) ##标准化残差
plot(Res1_winn_log, type='l', ylab='Residual in log (f)', xlab='Time', main='图6. 季节性对数变换之后的残差图(f))')
points(Res1_winn_log, col='blue', pch=as.vector(month_winn ))
abline(h=0, col='red')
#(b).游程检验:独立性
runs(Res1_winn_log)
## $pvalue
## [1] 0.000243
##
## $observed.runs
## [1] 18
##
## $expected.runs
## [1] 32.71875
##
## $n1
## [1] 29
##
## $n2
## [1] 35
##
## $k
## [1] 0
##pvalue 0.000243 拒绝“残差是独立的”
#(c).自相关性
acf(Res1_winn_log,main='图7.季节性+对数变换+一次函数拟合之后的残差自相关图' ) #自相关图
#(d).正态性
#QQ图
qqnorm(Res1_winn_log, main='图8.季节性+对数变换+一次函数拟合之后的残差QQ图' ) #QQ图
qqline(Res1_winn_log, col='blue'); abline(h=0, col='red')
#直方图
hist(Res1_winn_log , xlab='Standardized Residuals_log_2', main='图9. 季节性+对数变换+一次函数拟合之后的残差直方图')
#Shapiro-Wilk检验: 正态性
shapiro.test(Res1_winn_log )
##
## Shapiro-Wilk normality test
##
## data: Res1_winn_log
## W = 0.97035, p-value = 0.1262
##p-value = 0.1262 不能拒绝来自正态总体的假设
\[ Y_t=e_t-\theta_1 e_{t-1}-\theta_2 e_{t-2}-\cdots-\theta_q e_{t-q}, \qquad\text{ 其中 } e_t\sim WN(0,\sigma_e^2). \]
\(\qquad\)均值 \(E(Y_t)=0\), 方差 \(Var(Y_t)=(1+\theta^2)\sigma_e^2\), 协方差及自相关函数 \[\begin{gathered} \begin{align} Cov(Y_t,Y_{t-1})&=Cov(e_t-\theta e_{t-1},e_{t-1}-\theta e_{t-2})=-\theta \sigma_e^2. \\ Cov(Y_t,Y_{t-2})&=Cov(e_t-\theta e_{t-1},e_{t-2}-\theta e_{t-3})=0.\\ Cov(Y_t,Y_{t-k})&=Cov(e_t-\theta e_{t-1},e_{t-k}-\theta e_{t-k-1})=0. \text{ if } k>2.\\ \end{align} \qquad\qquad \rho_k= \begin{cases} 1, &\text{ if }k=0\\ -\frac{\theta}{1+\theta^2}, &\text{ if }k=1\\ 0, &\text{ if }k\geq 2\\ \end{cases} \end{gathered}\]
\(\qquad\) 结论:MA(1)是平稳的,自相关函数在滞后 \(1\) 阶之后“截尾”。
theta <- seq(-1,1,by=0.1)
rho_1 <- (-theta)/(1+theta^2)
data.frame(theta=theta, rho_1=rho_1)
## theta rho_1
## 1 -1.0 0.5000000
## 2 -0.9 0.4972376
## 3 -0.8 0.4878049
## 4 -0.7 0.4697987
## 5 -0.6 0.4411765
## 6 -0.5 0.4000000
## 7 -0.4 0.3448276
## 8 -0.3 0.2752294
## 9 -0.2 0.1923077
## 10 -0.1 0.0990099
## 11 0.0 0.0000000
## 12 0.1 -0.0990099
## 13 0.2 -0.1923077
## 14 0.3 -0.2752294
## 15 0.4 -0.3448276
## 16 0.5 -0.4000000
## 17 0.6 -0.4411765
## 18 0.7 -0.4697987
## 19 0.8 -0.4878049
## 20 0.9 -0.4972376
## 21 1.0 -0.5000000
plot(theta, rho_1, col='blue', type='l', ylab=expression(rho[1]), xlab=expression(theta))
abline(h=0.5, lty=4); abline(h=-0.5, lty=4)
例如: MA(1)序列\(\theta=-0.9, \rho_1=0.4972\), 和\(\theta=0.9, \rho_1=-0.4972\)。
# Exhibit 4.2
data(ma1.2.s) #theta=-0.9
plot(ma1.2.s, ylab=expression(Y[t]), type='l', main=expression(theta==-0.9))
points(ma1.2.s, col='blue', cex=0.5); abline(h=0, col='red')
opar=par(mfrow=c(1,2))
# Exhibit 4.3
plot(y=ma1.2.s, x=zlag(ma1.2.s), col='blue', ylab=expression(Y[t]), xlab=expression(Y[t-1]), type='p', main=expression(theta==-0.9), cex=0.5)
# Exhibit 4.4
plot(y=ma1.2.s, x=zlag(ma1.2.s,2), col='blue', ylab=expression(Y[t]), xlab=expression(Y[t-2]), type='p', main=expression(theta==-0.9), cex=0.5)
par(opar)
# Exhibit 4.5
data(ma1.1.s)
plot(ma1.1.s,ylab=expression(Y[t]),type='o', cex=0.5); abline(h=0, col='red')
opar=par(mfrow=c(1,2))
# Exhibit 4.6
plot(y=ma1.1.s, x=zlag(ma1.1.s), ylab=expression(Y[t]), xlab=expression(Y[t-1]), type='p', cex=0.5)
# Exhibit 4.7
plot(y=ma1.1.s,x=zlag(ma1.1.s,2), ylab=expression(Y[t]), xlab=expression(Y[t-2]), type='p', cex=0.5)
# An MA(1) series of length n=100 with MA coefficient equal to -0.9 and 0.9 respectively.
set.seed(12345); y1=arima.sim(model=list(ma=-c(-0.9)),n=100)
set.seed(54321); y2=arima.sim(model=list(MA=-c(0.9)), n=100)
# Note that R uses the plus convention in the MA model formula so the additional minus sign.
plot(y1, ylab=expression(Y[t]), xlab=expression(Y[t-1]), type='l', main=expression(theta==-0.9))
points(y1, col='blue', cex=0.5); abline(h=0, col='red')
plot(y2, ylab=expression(Y[t]), xlab=expression(Y[t-1]), type='l', main=expression(theta==0.9))
points(y2, col='blue', cex=0.5); abline(h=0, col='red')
par(opar)
\(\qquad\)均值 \(E(Y_t)=0\), 方差 \(Var(Y_t)=(1+\theta_1^2+\theta_2^2)\sigma_e^2\), 协方差及自相关函数 \[\begin{gathered} \begin{align} Cov(Y_t,Y_{t-1})&=Cov(e_t-\theta_1 e_{t-1}-\theta_2 e_{t-2},e_{t-1}-\theta_1 e_{t-2}-\theta_2 e_{t-3})=(-\theta_1+\theta_1\theta_2) \sigma_e^2. \\ Cov(Y_t,Y_{t-2})&=Cov(e_t-\theta_1 e_{t-1}-\theta_2 e_{t-2},e_{t-2}-\theta_1 e_{t-3}-\theta_2 e_{t-4})=-\theta_2 \sigma_e^2.\\ Cov(Y_t,Y_{t-k})&=Cov(e_t-\theta_1 e_{t-1}-\theta_2 e_{t-2},e_{t-k}-\theta_1 e_{t-k-1}-\theta_2 e_{t-k-2})=0. \text{ if } k\geq 3.\\ \end{align} \qquad \rho_k= \begin{cases} 1, &\text{ if }k=0\\ \frac{-\theta_1+\theta_1\theta_2}{1+\theta_1^2+\theta_2^2}, &\text{ if }k=1\\ \frac{-\theta_2}{1+\theta_1^2+\theta_2^2}, &\text{ if }k=2\\ 0, &\text{ if }k\geq 3\\ \end{cases} \end{gathered}\]
\(\qquad\) 结论:MA(2)是平稳的,自相关函数在滞后 \(2\) 阶之后“截尾”。
例如: MA(2)序列\(\theta_1=1, \theta_2=-0.6\)。\(\rho_1=-0.678,\rho_2=0.254.\)
data(ma2.s) ##theta_1=1, theta_2=-0.6
opar=par(mfrow=c(2,2))
# Exhibit 4.8
plot(ma2.s, ylab=expression(Y[t]), type='o', cex=0.5); abline(h=0, col='red')
# Exhibit 4.9
plot(y=ma2.s, x=zlag(ma2.s), ylab=expression(Y[t]), xlab=expression(Y[t-1]), type='p', cex=0.5)
# Exhibit 4.10
plot(y=ma2.s, x=zlag(ma2.s,2), ylab=expression(Y[t]), xlab=expression(Y[t-2]), type='p', cex=0.5)
# Exhibit 4.11
plot(y=ma2.s, x=zlag(ma2.s,3), ylab=expression(Y[t]), xlab=expression(Y[t-3]), type='p', cex=0.5)
par(opar)
# An MA(2) series with MA coefficients equal to 1 and -0.6 and of length n=100.
y=arima.sim(model=list(ma=-c(1, -0.6)), n=100)
# Note that R uses the plus convention in the MA model formula so the additional minus sign.
plot(y, ylab=expression(Y[t]), type='o', cex=0.5); abline(h=0, col='red')
\(\qquad\)均值 \(E(Y_t)=0\), 方差 \(Var(Y_t)=(1+\theta_1^2+\theta_2^2+\cdots+\theta_q^2)\sigma_e^2\), 协方差及自相关函数 \[\begin{gathered} Cov(Y_t,Y_{t-k})=\begin{cases} \sigma_e^2 (-\theta_k+\sum_{i=1}^{q-k}\theta_i\theta_{i+k}), &\text{ if } 1\leq k\leq q\\ 0. \text{ if } k> q.\\ \end{cases} \qquad\qquad \rho_k= \begin{cases} 1, &\text{ if }k=0\\ \frac{-\theta_k+\sum_{i=1}^{q-k}\theta_i\theta_{i+k}}{1+\sum_{i=1}^q\theta_i^2}, &\text{ if }1\leq k\leq q\\ 0, &\text{ if }k > q\\ \end{cases} \end{gathered}\]
\(\qquad\) 结论:MA(q)是平稳的,自相关函数在滞后 \(q\) 阶之后“截尾”。
\(\qquad \{Y_t\}\) 表示观测到的时间序列,\(\{e_t\}\) 表示未观测到的白噪声。 \[Y_t=\psi_0e_t+\psi_1 e_{t-1}+\psi_2 e_{t-2}+\cdots=\sum_{i=0}^{\infty}\psi_i e_{t-i}.\]
期望为零 \(E(Y_t)=0.\) 一般性假设\(\psi_0=1.\)
方差有限 \(Var(Y_t)=\sigma_e^2(\sum_{i=0}^{\infty}\psi_i^2)<\infty.\)
一阶协方差
\[\begin{align} Cov(Y_t,Y_{t-1}) &= Cov(\psi_0e_t+\psi_1 e_{t-1}+\psi_2 e_{t-2}+\cdots, \psi_0e_{t-1}+\psi_1 e_{t-2}+\psi_2 e_{t-3}+\cdots ) \\ &= (\psi_0\psi_1+\psi_1\psi_2+\psi_2\psi_3+\cdots)\sigma_e^2=\sigma_e^2(\sum_{i=0}^{\infty}\psi_i\psi_{i+1}) \end{align}\]
\[\begin{align} Cov(Y_t,Y_{t-2}) &= Cov(\psi_0e_t+\psi_1 e_{t-1}+\psi_2 e_{t-2}+\cdots, \psi_0e_{t-2}+\psi_1 e_{t-3}+\psi_2 e_{t-4}+\cdots ) \\ &= (\psi_0\psi_2+\psi_1\psi_3+\psi_2\psi_4+\cdots)\sigma_e^2=\sigma_e^2(\sum_{i=0}^{\infty}\psi_i\psi_{i+2}) \end{align}\]
\[Cov(Y_t,Y_{t-k}) =\sigma_e^2(\sum_{i=0}^{\infty}\psi_i\psi_{i+k}),\qquad k\geq0.\]
\[Corr(Y_t,Y_{t-k})=\frac{\sum_{i=0}^{\infty}\psi_i\psi_{i+k}}{\sum_{i=0}^{\infty}\psi_i^2}, \quad k\geq 0.\]
例如,一般线性过程\(\psi_j=\phi^j, |\phi|<1.\)
\[Y_t=e_t+\phi e_{t-1}+\phi^2 e_{t-2}+\cdots, \qquad |\phi|<1.\]
\(\qquad\)- 期望为零 \(E(Y_t)=0\), 方差有限 \(Var(Y_t)=\sigma_e^2(\sum_{i=0}^{\infty}\phi^{2i})=\frac{1}{1-\phi^2}\sigma_e^2.\)
\(\qquad\)- \(k\)阶协方差 \(Cov(Y_t,Y_{t-k})=\sigma_e^2(\sum_{i=0}^{\infty}\phi^{i}\phi^{i+k})=\frac{\phi^k}{1-\phi^2}\sigma_e^2.\)
\(\qquad\)- \(k\)阶自相关函数 \(Corr(Y_t,Y_{t-k})=\phi^k, \quad k\geq 0.\)
\[ Y_t=\phi_1 Y_{t-1}+\phi_2 Y_{t-2}+\cdots+\phi_p Y_{t-p}+e_t, \qquad\text{ 其中 } e_t\sim WN(0,\sigma_e^2). \]
\[ \]
\(\qquad\) 如果\({Y_t}\)序列平稳,那么
\(\qquad\)- 均值 \(\mu=\frac{0}{1-\phi}=0\), 因为\(E(Y_t)=\phi E(Y_{t-1})+E(e_t)\);
\(\qquad\)- 方差 \(\gamma_0=\frac{\sigma_e^2}{1-\phi^2}\geq 0\), 因为\(Var(Y_t)=\phi^2 Var(Y_{t-1})+Var(e_t)\);
\(\qquad\quad\) ①. 平稳性的必要条件是 \(|\phi|<1.\)
\(\qquad\)- \(k\)阶协方差 \[\begin{align} Cov(Y_t,Y_{t-k})&=Cov(\phi Y_{t-1}+e_t, Y_{t-k}) \\ \gamma_k &= \phi \gamma_{k-1} \\ &= \phi\cdot\phi \gamma_{k-2}=\cdots\\ &=\phi^k \gamma_0 \\ \end{align} \]
\(\qquad\)- 自相关函数 \(\qquad \rho_k=\phi^k, k=0,1,2,\cdots\)
\(\qquad\)- AR(1)的一般线性表示
\[\begin{align} &&Y_t&=\phi Y_{t-1}+e_t\\ \phi \cdot &&Y_{t-1}&=\phi Y_{t-2}+e_{t-1} &&\cdot\phi\\ \phi^2 \cdot &&Y_{t-2}&=\phi Y_{t-3}+e_{t-2} &&\cdot\phi^2\\ &&&\cdots\\ \phi^{k-1} \cdot &&Y_{t-(k-1)}&=\phi Y_{t-k}+e_{t-(k-1)} &&\cdot\phi^{k-1}\\ \end{align}\]
\(\qquad\qquad\)求和可得 \(Y_t=e_t+\phi e_{t-1}+\phi^2 e_{t-2}+\cdots+\phi^{k-1} e_{t-(k-1)}+\phi^k Y_{t-k}.\)
\(\qquad\qquad\)如果\(|\phi|<1\), 并且\(k\rightarrow\infty\), 那么 \[Y_t=e_t+\phi e_{t-1}+\phi^2 e_{t-2}+\phi^3 e_{t-3}+\cdots\]
\(\qquad\quad\) ②. 平稳性的充分条件是 \(|\phi|<1.\)
\(\qquad\) 结论:AR(1)是平稳的充分必要条件是\(|\phi|<1\),自相关函数\(\rho_k=\phi^k\)随着时滞\(k\)的增加而指数衰减。
x<-seq(1:12);
y1<-0.9^x; y2<-0.4^x; y3<-(-0.8)^x; y4<-(-0.5)^x; opar=par(mfrow=c(2,2))
plot(x,y1,type='h', ylim=c(-0.1,0.9), xlab='Lag', ylab=expression(rho[k]), main=expression(phi==0.9));
points(x,y1,col='blue',cex=0.5); abline(h=0,col='red');
plot(x,y2,type='h', ylim=c(-0.1,0.9), xlab='Lag', ylab=expression(rho[k]), main=expression(phi==0.4));
points(x,y2,col='blue',cex=0.5); abline(h=0,col='red');
plot(x,y3,type='h', ylim=c(-0.9,0.9), xlab='Lag', ylab=expression(rho[k]), main=expression(phi==-0.8));
points(x,y3,col='blue',cex=0.5); abline(h=0,col='red');
plot(x,y4,type='h', ylim=c(-0.9,0.9), xlab='Lag', ylab=expression(rho[k]), main=expression(phi==-0.5));
points(x,y4,col='blue',cex=0.5); abline(h=0,col='red');
par(opar)
例如:AR(1)序列 \(\phi=0.9, \rho_1=0.9, \rho_2=0.81, \rho_3=0.729\cdots\)
data(ar1.s)
opar=par(mfrow=c(2,2))
# Exhibit 4.13
plot(ar1.s, ylab=expression(Y[t]), type='o', cex=0.5)
# Exhibit 4.14
plot(y=ar1.s, x=zlag(ar1.s), ylab=expression(Y[t]), xlab=expression(Y[t-1]), type='p', cex=0.5)
# Exhibit 4.15
plot(y=ar1.s, x=zlag(ar1.s,2), ylab=expression(Y[t]), xlab=expression(Y[t-2]), type='p', cex=0.5)
# Exhibit 4.16
plot(y=ar1.s, x=zlag(ar1.s,3), ylab=expression(Y[t]), xlab=expression(Y[t-3]), type='p', cex=0.5)
# An AR(1) series with AR coefficient equal to 0.98 and of length n=100.
set.seed(12345); y1=arima.sim(model=list(ar=c(0.98)),n=500)
set.seed(1234); y2=arima.sim(model=list(ar=c(-0.6)),n=500)
set.seed(54321); y3=arima.sim(model=list(ar=c(0.3)),n=500)
set.seed(4321); y4=arima.sim(model=list(ar=c(-0.1)),n=500)
# Note that the R convention for the AR model formula is same as the book.
plot(y1, type='o', xlab='Time', ylab='AR(1)', main=expression(phi==0.98), cex=0.5);
plot(y2, type='o', xlab='Time', ylab='AR(1)', main=expression(phi==-0.6), cex=0.5);
plot(y3, type='o', xlab='Time', ylab='AR(1)', main=expression(phi==0.3), cex=0.5);
plot(y4, type='o', xlab='Time', ylab='AR(1)', main=expression(phi==-0.1), cex=0.5);
acf(y1, xlab='Lag', ylab='ACF', main=expression(phi==0.98));
acf(y2, xlab='Lag', ylab='ACF', main=expression(phi==-0.6));
acf(y3, xlab='Lag', ylab='ACF', main=expression(phi==0.3));
acf(y4, xlab='Lag', ylab='ACF', main=expression(phi==-0.1));
par(opar)
\[ \]
\(\qquad\) 如果\({Y_t}\)序列平稳,那么
\(\qquad\)- 均值 \(\mu=\frac{0}{1-\phi_1-\phi_2}=0\), 因为\(E(Y_t)=\phi_1 E(Y_{t-1})+\phi_2 E(Y_{t-2})+E(e_t)\);
\(\qquad\)- 方差 \(Var(Y_t)=\phi_1^2 Var(Y_{t-1})+\phi_2^2 Var(Y_{t-2})+Var(e_t)+2\phi_1\phi_2 Cov(Y_{t-1},Y_{t-2})\), 则
\[(1-\phi_1^2-\phi_2^2)\gamma_0=2\phi_1\phi_2\gamma_1+\sigma_e^2.\]
\(\qquad\)- 一阶协方差 \((1-\phi_2)\gamma_1 = \phi_1 \gamma_0\), 因为 \[\begin{align} Cov(Y_t,Y_{t-1})&=Cov(\phi_1 Y_{t-1}+\phi_2 Y_{t-2}+e_t, Y_{t-1}) \\ &=\phi_1 Var(Y_{t-1})+\phi_2 Cov(Y_{t-1},Y_{t-2}) \\ \end{align} \]
\(\qquad\qquad\)解方程组可得 \[\begin{cases} \gamma_0&=&\sigma_e^2\frac{1-\phi_2}{(1+\phi_2)[(1-\phi_2)^2-\phi_1^2]}\\ \gamma_1&=&\sigma_e^2\frac{\phi_1}{(1+\phi_2)[(1-\phi_2)^2-\phi_1^2]} \end{cases}\]
\(\qquad\)- 一阶自相关函数 \(\rho_1=\frac{\phi_1}{1-\phi_2}\)
\(\qquad\)- \(k\)阶协方差/自相关函数 \[\begin{align} &&Cov(Y_t,Y_{t-k})&=Cov(\phi_1 Y_{t-1}+\phi_2 Y_{t-2}+e_t, Y_{t-k}) \\ \bf{Yule-Walker方程}&&\gamma_k&=\phi_1 \gamma_{k-1}+\phi_2 \gamma_{k-2} \\ &&\rho_k&=\phi_1 \rho_{k-1}+\phi_2 \rho_{k-2} \\ \end{align} \]
\(\qquad\) 求解二阶齐次线性差分方程\(\rho_{k+2}-\phi_1 \rho_{k+1}-\phi_2 \rho_k=0\),初值条件为\(\rho_0=1, \rho_{-k}=\rho_k.\)
\(\qquad\)-①. 参考“差分方程讲解”http://www.homepage.zjut.edu.cn/yjq/第19页,
\(\qquad\) 当\(\lambda\)是常数时,\(\rho_k=\lambda^k\)和它的各阶差商有倍数关系,因此不妨设解为\(\rho_k=\lambda^k\),那么 \[ \lambda^{k+2}-\phi_1 \lambda^{k+1}-\phi_2\lambda^{k}=0\]
\(\qquad\) 该齐次差分方程的特征方程为
\[\lambda^2-\phi_1 \lambda-\phi_2=0\]
\(\qquad\) 特征方程的根(特征根)为
\[\lambda_1=\frac{\phi_1- \sqrt{\phi_1^2+4\phi_2}}{2}, \qquad \lambda_2=\frac{\phi_1+ \sqrt{\phi_1^2+4\phi_2}}{2}\]
\[\begin{align} \underline{\text{特征根的情况}} & \qquad\qquad \underline{\text{差分方程的通解}}\\ \text{当 } \phi_1^2+4\phi_2>0 \text{ 时, } \lambda_1 \neq \lambda_2, \qquad & \qquad\qquad \rho_k=C_1 \lambda_1^k+C_2 \lambda_2^k \\ \text{当 } \phi_1^2+4\phi_2=0 \text{ 时, } \lambda_1 = \lambda_2, \qquad & \qquad\qquad \rho_k=(C_1 +C_2 k)\lambda_1^k \\ \text{当 } \phi_1^2+4\phi_2<0 \text{ 时, } \lambda_{1,2}= \alpha+\beta i, & \qquad\qquad \rho_k=(C_1 cos(k\theta)+C_2 sin(k\theta) )R^k \\ & \qquad\qquad R=\sqrt{\alpha^2+\beta^2}=\sqrt{-\phi_2}, tan \theta=\frac{\beta}{\alpha}\\ \end{align}\]
\(\qquad\) 将条件\(\rho_0=1, \rho_1=\rho_{-1}\)代入通解,可以确定参数\(C_1, C_2\), 写出\(k\)阶自相关函数的表达式。
\[\rho_k=\begin{cases} \frac{(1-\lambda_2^2)\lambda_1^{k+1}-(1-\lambda_1^2)\lambda_2^{k+1}}{(\lambda_1-\lambda_2)(1+\lambda_1\lambda_2)}, &\qquad 当\phi_1^2+4\phi_2>0 时,& 过阻尼状态\\ (1+\frac{1+\phi_2}{1-\phi_2}k)(\frac{\phi_1}{2})^k, &\qquad 当\phi_1^2+4\phi_2=0 时,&临界阻尼状态\\ \frac{\sin(\Theta k+\Phi)}{\sin(\Phi)}R^k, &\qquad 当\phi_1^2+4\phi_2<0 时.& 欠阻尼状态\\ &\cos(\Theta)=\frac{\phi_1}{2\sqrt{-\phi_2}}, \quad \tan(\Phi)=\frac{1-\phi_2}{1+\phi_2} \end{cases}\]
\(\qquad\) 结论:AR(2)的自相关函数 \(\rho_k\) 随着滞后阶数 \(k\) 的增加而指数衰减。 在复数特征根的情况下, \(\rho_k\) 显示为阻尼正弦波动曲线,具有阻尼因子 \(R(0\leq R<1)\)、频率 \(\Theta\)、相位 \(\Phi\)。下图给出了自相关函数可能的几种图形,左上是两个不相等的实数特征根的情形,右上是两个相等的实数特征根情形,第二排的图都是复数特征根的情况。
###AR(2)的自相关函数的理论值 #Exhibit 4.18
Rho1 <- ARMAacf(ar=c(0.5,0.25), lag.max = 12, pacf=FALSE)
Rho2 <- ARMAacf(ar=c(1.0,-0.25), lag.max = 12, pacf=FALSE)
Rho3 <- ARMAacf(ar=c(1.5,-0.75), lag.max = 12, pacf=FALSE)
Rho4 <- ARMAacf(ar=c(1.0,-0.6), lag.max = 12, pacf=FALSE)
opar=par(mfrow=c(2,2))
plot(Rho1, type='h', xlab='Lag', ylab=expression(rho[k]), main=expression(paste(phi[1]==0.5, ", ", phi[2]==0.25)));
points(Rho1, pch=8, col='blue', cex=0.8); abline(h=0);
plot(Rho2, type='h', xlab='Lag', ylab=expression(rho[k]), main=expression(paste(phi[1]==1.0, ", ", phi[2]==-0.25)));
points(Rho2, pch=8, col='blue', cex=0.8); abline(h=0);
plot(Rho3, type='h', xlab='Lag', ylab=expression(rho[k]), main=expression(paste(phi[1]==1.5, ", ", phi[2]==-0.75)));
points(Rho3, pch=8, col='blue', cex=0.8); abline(h=0);
plot(Rho4, type='h', xlab='Lag', ylab=expression(rho[k]), main=expression(paste(phi[1]==1.0, ", ", phi[2]==-0.6)));
points(Rho4, pch=8, col='blue', cex=0.8); abline(h=0);
par(opar)
\(\qquad\)-②. 差分方程稳定性的条件是:\(\underline{\bf 特征根在单位圆内,\quad |\lambda_{1,2}|<1.}\)
\(\qquad\) 利用特征方程根与系数的关系\(\begin{cases} \lambda_1 +\lambda_2&=&\phi_1,\\ \lambda_1 \cdot\lambda_2&=&-\phi_2\end{cases}\), 可得
\[\begin{align} \phi_2+\phi_1= &-\lambda_1\lambda_2+\lambda_1 +\lambda_2=1-(1-\lambda_1)(1-\lambda_2)\\ \phi_2-\phi_1=& -\lambda_1\lambda_2-\lambda_1 -\lambda_2=1-(1+\lambda_1)(1+\lambda_2) \end{align}\]
\(\qquad\) 差分方程稳定性的等价条件是:\(\underline{\bf \phi_2+\phi_1<1, \quad \phi_2-\phi_1<1, \quad |\phi_2|<1.}\)
###AR(2)的平稳参数区域 #Exhibit 4.17
phi1 <- seq(from = -2.5, to = 2.5, length = 51)
plot(phi1,1+phi1,lty="dashed",type="l",xlab="",ylab="",cex.axis=0.8,ylim=c(-1.5,1.5))
abline(a = -1, b = 0, lty="dashed")
abline(a = 1, b = -1, lty="dashed")
title(ylab=expression(phi[2]),xlab=expression(phi[1]),cex.lab=1.5)
polygon(x = phi1[6:46], y = 1-abs(phi1[6:46]), col="gray")
polygon(x = phi1[6:46], y = -phi1[6:46]^2/4, col="green")
lines(phi1, -phi1^2/4, col="blue")
text(0,-.6,expression(phi[2]< (-phi[1]^2/4)),cex=1.5)
text(0,0.3,"实数根", col='red', cex=1.5)
text(0,-0.3,"复数根", col='blue', cex=1.5)
text(1.75,.5,expression(phi[2]>1-phi[1]),cex=1.5)
text(-1.75,.5,expression(phi[2]>1+phi[1]),cex=1.5)
\(\quad\) 例如, AR(2)模型\(\phi_1=1.5, \phi_2=-0.75\), \(\phi_1^2+4\phi_2=-\frac{3}{4}<0\) 特征方程有两个不相等的复数根。\(k\)阶自相关函数为\(R^k\frac{\sin(\Theta k+\Phi)}{\sin(\Phi)}\), 其中, 阻尼因子\(R=\sqrt{-\phi_2}=\frac{\sqrt 3}{2}\simeq0.866\), \(\cos(\Theta)=\frac{\phi_1}{2\sqrt{-\phi_2}}=\frac{3/2}{2\cdot\sqrt{3}/2}=\frac{\sqrt 3}{2}\), 频率 \(\Theta=\pi/6\), 周期 \(f=\frac{2\pi}{\Theta}=\frac{2\pi}{\pi/6}=12\), \(\tan(\Phi)=\frac{1-\phi_2}{1+\phi_2}=\frac{1+3/4}{1-3/4}=7\), 相位 \(\Phi=\arctan (7)\simeq81.87^o\). (习题4.9 on P.59)
##polyroot(c(1,2,3,4)) 多项式方程1+2x+3x^2+4x^3=0求根
polyroot(c(0.75,-1.5,1)) #特征方程求根
## [1] 0.75+0.4330127i 0.75-0.4330127i
atan(7)*360/(2*pi)
## [1] 81.8699
#Exhibit 4.19
data(ar2.s) ##phi_1=1.5, phi_2=-0.75
Rho <- ARMAacf(ar=c(1.5,-0.75), lag.max = 20, pacf=FALSE); Rho
## 0 1 2 3 4 5
## 1.00000000 0.85714286 0.53571429 0.16071429 -0.16071429 -0.36160714
## 6 7 8 9 10 11
## -0.42187500 -0.36160714 -0.22600446 -0.06780134 0.06780134 0.15255301
## 12 13 14 15 16 17
## 0.17797852 0.15255301 0.09534563 0.02860369 -0.02860369 -0.06435830
## 18 19 20
## -0.07508469 -0.06435830 -0.04022394
opar=par(mfrow=c(2,2))
plot(ar2.s, ylab=expression(Y[t]), type='o', cex=0.5)
plot(y=ar2.s, x=zlag(ar2.s), ylab=expression(Y[t]), xlab=expression(Y[t-1]), type='p', cex=0.5)
plot(y=ar2.s, x=zlag(ar2.s,2), ylab=expression(Y[t]), xlab=expression(Y[t-2]), type='p', cex=0.5)
plot(y=ar2.s, x=zlag(ar2.s,3), ylab=expression(Y[t]), xlab=expression(Y[t-3]), type='p', cex=0.5)
par(opar)
opar=par(mfrow=c(1,2))
plot(Rho, type='h', xlab='Lag', ylab=expression(rho[k]), main=expression(paste(phi[1]==1.5, ", ", phi[2]==-0.75)));
points(Rho, pch=8, col='blue', cex=0.8); abline(h=0);
acf(ar2.s, xlab='Lag', ylab='ACF', main=expression(paste(phi[1]==1.5, ", ", phi[2]==-0.75)))
par(opar)
set.seed(12345); y1=arima.sim(model=list(ar=c(0.5, 0.25)),n=500)
set.seed(1234); y2=arima.sim(model=list(ar=c(1.0,-0.25)),n=500)
set.seed(54321); y3=arima.sim(model=list(ar=c(1.5,-0.75)),n=500)
set.seed(4321); y4=arima.sim(model=list(ar=c(1.0,-0.6)),n=500)
opar=par(mfrow=c(2,2))
plot(y1, type='o', xlab='Time', ylab='AR(2)', main=expression(paste(phi[1]==0.5, ", ", phi[2]==0.25)), cex=0.5);
plot(y2, type='o', xlab='Time', ylab='AR(2)', main=expression(paste(phi[1]==1.0, ", ", phi[2]==-0.25)), cex=0.5);
plot(y3, type='o', xlab='Time', ylab='AR(2)', main=expression(paste(phi[1]==1.5, ", ", phi[2]==-0.75)), cex=0.5);
plot(y4, type='o', xlab='Time', ylab='AR(2)', main=expression(paste(phi[1]==1.0, ", ", phi[2]==-0.6)), cex=0.5);
acf(y1, xlab='Lag', ylab='ACF', main=expression(paste(phi[1]==0.5, ", ", phi[2]==0.25)));
acf(y2, xlab='Lag', ylab='ACF', main=expression(paste(phi[1]==1.0, ", ", phi[2]==-0.25)));
acf(y3, xlab='Lag', ylab='ACF', main=expression(paste(phi[1]==1.5, ", ", phi[2]==-0.75)));
acf(y4, xlab='Lag', ylab='ACF', main=expression(paste(phi[1]==1.0, ", ", phi[2]==-0.6)));
par(opar)
\[ \]
\(\qquad\) AR(2)过程 \(Y_t=\phi_1 Y_{t-1}+\phi_2 Y_{t-2} +e_t\) 可以写成 \[(1-\phi_1 B -\phi_2 B^2)Y_t=\Phi(B)Y_t=e_t\] \(\qquad \Phi(z)=1-\phi_1 z-\phi_2 z^2\) 称为自回归系数多项式, \(\Phi(z)=0\) 称为AR特征方程(自回归系数方程, 差分方程的逆反特征方程),
\(\qquad \underline{\bf 它的根和差分方程的特征根互为倒数}。\) (习题4.22 on P.60)
\[\begin{gathered} \begin{matrix}1-\phi_1 z-\phi_2 z^2&=&0 \\ \lambda^2-\phi_1 \lambda-\phi_2 &=&0 \end{matrix}\qquad \qquad z_{1,2}=\frac {1}{\lambda_{1,2}} \end{gathered}\]
\(\qquad\) AR(2)过程平稳性的条件是:\(\underline{\bf AR特征方程的根在单位圆外, \quad |z_{1,2}| > 1. }\)
\[ \]
\(\qquad\) 如果存在两个不相等的实数特征根\(\lambda_1\neq \lambda_2\)的话,那么
\[\begin{gathered} \begin{align} \Phi(z)&=-\phi_2(z-z_1)(z-z_2)\\ &=-\phi_2(z-\frac{1}{\lambda_1})(z-\frac{1}{\lambda_2}) \\ &=\frac{-\phi_2}{\lambda_1\lambda_2}(\lambda_1 z-1)(\lambda_2 z-1)\\ &=(1-\lambda_1 z)(1-\lambda_2 z). \end{align} \qquad \qquad \begin{matrix} z_{1,2}=\frac {1}{\lambda_{1,2}} \\ \\ \\ \lambda_1\lambda_2=-\phi_2 \end{matrix} \end{gathered}\]
\(\qquad\) 记 \(\Phi^{-1}(z)=\frac{1}{\Phi(z)}\), 则 \(Y_t=\Phi^{-1}(B) e_t\). \[\Phi^{-1}(B)=\frac{1}{(1-\lambda_1 B)(1-\lambda_2 B)}=\frac{a_1}{1-\lambda_1 B}+\frac{a_2}{1-\lambda_2 B}\] \(\qquad\) 其中,\(a_1 (1-\lambda_2 B)+a_2 (1-\lambda_1 B)=1\), 即\(a_1=\frac{\lambda_1}{\lambda_1-\lambda_2},\quad a_2=\frac{-\lambda_2}{\lambda_1-\lambda_2}.\)
\(\qquad\) 利用泰勒展开\(\frac{1}{1-x}=1+x+x^2+\cdots\), 我们有
\[\begin{align} \Phi^{-1}(B)&=a_1(1+\lambda_1 B+\lambda_1^2 B^2+\cdots)+a_2(1+\lambda_2 B+\lambda_2^2 B^2+\cdots)\\ &=\sum_{j=0}^\infty(a_1\lambda_1^j+a_2\lambda_2^j) B^j\\ \Psi(B)&\triangleq \sum_{j=0}^\infty \psi_j B^j\end{align}\]
\(\qquad\) AR(2)的逆转形式为\(Y_t=\Phi^{-1}(B) e_t=\Psi(B) e_t=\sum_{j=0}^\infty \psi_j e_{t-j}.\) 类似的推导可以证明(证明略)
\[\psi_j=\begin{cases} \frac{\lambda_1^{j+1}-\lambda_2^{j+1}}{\lambda_1-\lambda_2}, &\qquad当\phi_1^2+4\phi_2>0时,\\ (1+j)\phi_1^j, &\qquad当\phi_1^2+4\phi_2=0时,\\ R^{j}\frac{\sin[(j+1)\Theta]}{\sin\Theta},&\qquad当\phi_1^2+4\phi_2<0时,\\ \end{cases}\]
\(\qquad\) 将AR(2)的逆转形式 \(Y_t=\Psi(B)e_t\) 代入 传递形式 \(\Phi(B)Y_t=e_t\),
\[\begin{align} \Phi(B)\Psi(B)e_t&=e_t\\ (1-\phi_1B-\phi_2 B^2)(\psi_0+\psi_1B+\psi_2B^2+\cdots)e_t&=e_t\\ \\ \end{align}\]
\[\begin{align} \psi_0-\phi_1\psi_0 &B -\phi_2\psi_0B^2\\ +\psi_1&B-\phi_1\psi_1B^2-\phi_2\psi_1B^3\\ &\quad\qquad\psi_2B^2-\phi_1\psi_2B^3-\phi_2\psi_2B^4\\ &\quad\qquad\quad\qquad\cdots \end{align}\]
\(\qquad\) 左右比较\(e_j\)的系数,可以得到 \[\begin{cases} \psi_0&=1, \\ \psi_1-\phi_1\psi_0&=0,\\ \psi_2-\phi_1\psi_1-\phi_2\psi_0&=0,\\ \psi_3-\phi_1\psi_2-\phi_2\psi_1&=0,\\ &\vdots\\ \psi_k-\phi_1\psi_{k-1}-\phi_2\psi_{k-2}&=0, \end{cases}\]
\(\qquad\) 解方程组可得
\[\begin{cases} \psi_0&=&1, \\ \psi_1&=&\phi_1,\\ \psi_2&=&\phi_1^2+\phi_2,\\ &\vdots\\ \psi_k&=&\phi_1\psi_{k-1}+\phi_2\psi_{k-2}, \end{cases}\]
\[ \]
\[ Y_t=\phi_1 Y_{t-1}+\phi_2 Y_{t-2}+\cdots+\phi_p Y_{t-p}+e_t, \qquad\text{ 其中 } e_t\sim WN(0,\sigma_e^2). \] \(\qquad\) 类似AR(2)的分析过程,AR(p)自回归系数多项式
\[\Phi(z)=1-\phi_1 z-\phi_2 z^2-\cdots-\phi_pz^p \]
\(\qquad \underline{ AR(p){\bf特征方程}1-\phi_1 z-\phi_2 z^2-\cdots-\phi_pz^p = 0 {\bf 的根落在单位圆之外时,序列是平稳的。}}\)
\(\qquad\) 利用多项式根与系数的关系, 可以得到满足平稳性的必要(非充分)条件:
\[\left . \begin{align} \phi_1+\phi_2+\cdots+\phi_p&<1\\ |\phi_p|&<1 \end{align} \right \}\]
\(\qquad\) 如果\({Y_t}\)序列平稳,那么
\(\qquad\)- 均值 \(\mu=\frac{0}{1-\phi_1-\phi_2-\cdots-\phi_p}=0\), 因为\(E(Y_t)=\phi_1 E(Y_{t-1})+\phi_2 E(Y_{t-2})+\cdots+\phi_p E(Y_{t-p})+E(e_t)\);
\(\qquad\)- 方差 \(\gamma_0=\frac{\sigma_e^2}{1-\phi_1 \rho_1-\phi_2 \rho_2-\cdots-\phi_p \rho_p}.\) 因为
\[\begin{align} Var(Y_t)&=Cov(\phi_1 Y_{t-1}+\phi_2 Y_{t-2}+\cdots+\phi_p Y_{t-p}+e_t,Y_t)\\ \gamma_0&=\phi_1 \gamma_1+\phi_2 \gamma_2+\cdots+\phi_p \gamma_p+\sigma_e^2\\ \gamma_0&=\phi_1 \rho_1\gamma_0+\phi_2 \rho_2\gamma_0+\cdots+\phi_p \rho_p\gamma_0+\sigma_e^2\\ \end{align}\]
\(\qquad\)- \(k\)阶协方差/自相关函数 (\(k=1,2,\cdots\)) \[\begin{align} &&Cov(Y_t,Y_{t-k})&=Cov(\phi_1 Y_{t-1}+\phi_2 Y_{t-2}+\cdots+\phi_p Y_{t-p}+e_t, Y_{t-k}) \\ \bf{Yule-Walker方程}&&\gamma_k&=\phi_1 \gamma_{k-1}+\phi_2 \gamma_{k-2}+\cdots+\phi_p \gamma_{k-p} \\ &&\rho_k&=\phi_1 \rho_{k-1}+\phi_2 \rho_{k-2}+\cdots+\phi_p \rho_{k-p} \\ \end{align} \]
\(\qquad \underline{{\bf差分方程的特征方程} \lambda^p-\phi_1 \lambda^{p-1}-\phi_2 \lambda^{p-2}-\cdots-\phi_p = 0 {\bf 的根落在单位圆之内}|\lambda_{1,\cdots,p}|<1{\bf时,序列是平稳的。}}\)
\[\begin{align} \underline{\text{特征根的情况}} \qquad\qquad& \qquad\qquad \underline{\text{差分方程的通解}}\\ 当存在p个不相等的实数根\lambda_1\neq\cdots\neq\lambda_p时, & \qquad \rho_k=C_1 \lambda_1^k+C_2 \lambda_2^k+\cdots+C_p \lambda_p^k\\ 当存在相等的实根\lambda_{1,\cdots,d}和不等的实根\lambda_{3,\cdots,p}时, & \qquad \rho_k=(C_1 +C_2 k+\cdots+C_d k^{d-1})\lambda_1^k+C_{d+1} \lambda_{d+1}^k+\cdots+C_p \lambda_p^k \\ 当存在复数根\lambda_{1,2}和不相等的实数根\lambda_{3,\cdots,p}时, & \qquad\rho_k=R^k(C_1 cos(k\theta)+C_2 sin(k\theta) ) +C_{3} \lambda_{3}^k+\cdots+C_p \lambda_p^k\\ \end{align}\]
\(\qquad\) 将条件\(\rho_0=1, \rho_k=\rho_{-k}\)代入通解,可以确定参数\(C_1, C_2,\cdots,C_p\), 写出\(k\)阶自相关函数的表达式。
\(\qquad\qquad\) 结论:当\(|\lambda_{1,\cdots,p}|<1\)时,平稳AR(p)的ACF图像呈现减幅的正弦、余弦和指数衰减的混合形式,具体形式取决于特征根的性质。 \[ \] \(\qquad\) 另一方面,根据 \(\rho_0=1, \quad\rho_{-k}=\rho_{k}\), 得到一般的Yule-Walker方程组
\[\begin{cases} \rho_1&=&\phi_1 &+\phi_2 \rho_{1}&+\phi_3 \rho_{2}&+\cdots&+\phi_p \rho_{p-1}\\ \rho_2&=&\phi_1 \rho_{1}&+\phi_2 &+\phi_3 \rho_{1}&+\cdots&+\phi_p \rho_{p-2}\\ &\vdots\\ \rho_p&=&\phi_1 \rho_{p-1}&+\phi_2 \rho_{p-2}&+\phi_3 \rho_{p-3}&+\cdots&+\phi_p\\ \end{cases}\]
\(\qquad\) 给定 \(\phi_1,\phi_2,\cdots,\phi_p\)的值,令\(\phi_{p+1}=\phi_{p+2}=\cdots=\phi_{2p}=0\),
\[\begin{cases} (\phi_2-1)\rho_1 &+\phi_3 \rho_{2}&+\phi_4 \rho_{3}&+\cdots&+\phi_{p-1} \rho_{p-2}&+\phi_p \rho_{p-1}&+\underline{\phi_{p+1} }\rho_{p}&=&-\phi_1\\ (\phi_1+\phi_3) \rho_{1}&+(\phi_4-1)\rho_2 &+\phi_5 \rho_{3}&+\cdots&+\phi_p \rho_{p-2}&+\underline{\phi_{p+1}}\rho_{p-1} &+\underline{\phi_{p+2}}\rho_{p} &=&-\phi_2\\ &\vdots\\ (\phi_{k-1}+\phi_{k+1}) \rho_{1}&+(\phi_{k-2}+\phi_{k+2})\rho_2 &+\cdots+(\phi_1+\phi_{2k-1})\rho_{k-1}&+(\phi_{2k}-1)\rho_{k}&+\phi_{2k+1}\rho_{k+1}+\cdots&+\underline{\phi_{p+k-1}}\rho_{p-1} &+\underline{\phi_{p+k}}\rho_{p} &=&-\phi_k\\ &\vdots\\ (\phi_{p-1}+\underline{\phi_{p+1}}) \rho_{1}&+(\phi_{p-2}+\underline{\phi_{p+2}})\rho_2&+(\phi_{p-3}+\underline{\phi_{p+3}}) \rho_{3}&+\cdots&+(\phi_2+\underline{\phi_{2p-2}})\rho_{p-2}&+(\phi_1+\underline{\phi_{2p-1}})\rho_{p-1}&+(\underline{\phi_{2p}}-1)\rho_{p}&=&-\phi_p\\ \end{cases}\]
\(\qquad\) 可以利用Cramer法则,\(\rho_i=\frac{|D_i|}{|D|}\), 其中\(D\)是系数矩阵,求解该线性方程组得到 \(\rho_1,\rho_2,\cdots,\rho_p\) 的值,
\(\qquad\) 然后利用递归关系\(\rho_k=\phi_1 \rho_{k-1}+\phi_2 \rho_{k-2}+\cdots+\phi_p \rho_{k-p}\)求得任意高阶\((k>p)\)时的\(\rho_k.\)
\[ \]
\(\qquad\) 将AR(p)的逆转形式 \(Y_t=\Psi(B)e_t\) 代入 传递形式 \(\Phi(B)Y_t=e_t\),
\[\begin{align} \Phi(B)\Psi(B)e_t&=e_t\\ (1-\phi_1B-\phi_2 B^2-\cdots-\phi_p B^p)(\psi_0+\psi_1B+\psi_2B^2+\cdots)e_t&=e_t\\ \\ \end{align}\]
\[\begin{align} \psi_0-\phi_1\psi_0 &B -\phi_2\psi_0B^2-\phi_3\psi_0B^3-\cdots-\phi_p\psi_0B^p\\ +\psi_1&B-\phi_1\psi_1B^2-\phi_2\psi_1B^3-\cdots-\phi_{p-1}\psi_1B^{p}-\phi_p\psi_1B^{p+1}\\ &\quad\qquad\psi_2B^2-\phi_1\psi_2B^3-\cdots-\phi_{p-2}\psi_2B^p-\phi_{p-1}\psi_2B^{p+1}-\phi_{p}\psi_2B^{p+2}\\ &\quad\qquad\quad\qquad\cdots\cdots \end{align}\]
\(\qquad\) 左右比较\(e_j\)的系数,可以得到 \[\begin{cases} \psi_0&=1, \\ \psi_1-\phi_1\psi_0&=0,\\ \psi_2-\phi_1\psi_1-\phi_2\psi_0&=0,\\ \psi_3-\phi_1\psi_2-\phi_2\psi_1-\phi_3\psi_0&=0,\\ &\vdots\\ \psi_k-\phi_1\psi_{k-1}-\phi_2\psi_{k-2}-\cdots-\phi_k\psi_0&=0,\quad(k<p)\\ &\vdots\\ \psi_p-\phi_1\psi_{p-1}-\phi_2\psi_{p-2}-\cdots-\phi_p\psi_0&=0,\\ \psi_{p+1}-\phi_1\psi_{p}-\phi_2\psi_{p-1}-\cdots-\phi_{p}\psi_1&=0,\\ &\vdots\\ \psi_{m}-\phi_1\psi_{m-1}-\phi_2\psi_{m-2}-\cdots-\phi_{p}\psi_{m-p}&=0,\quad(m>p)\\ &\vdots\\ \end{cases}\]
\(\qquad\) 解方程组可得
\[\begin{cases} \psi_0&=&1, \\ \psi_1&=&\phi_1\psi_0=\phi_1,\\ \psi_2&=&\phi_1\psi_1+\phi_2\psi_0=\phi_1^2+\phi_2,\\ \psi_3&=&\phi_1\psi_2+\phi_2\psi_1+\phi_3\psi_0=\phi_1^3+2\phi_1\phi_2+\phi_3,\\ &\vdots\\ \psi_k&=&\phi_1\psi_{k-1}+\phi_2\psi_{k-2}+\cdots+\phi_k\psi_{0}, \quad \text{ if }k\leq p \\ &\vdots\\ \psi_m&=&\phi_1\psi_{m-1}+\phi_2\psi_{m-2}+\cdots+\phi_p\psi_{m-p}, \quad \text{ if }m > p \end{cases}\]
\[ Y_t=\phi_1 Y_{t-1}+\phi_2 Y_{t-2}+\cdots+\phi_p Y_{t-p}+e_t-\theta_1 e_{t-1}-\theta_2 e_{t-2}-\cdots-\theta_q e_{t-q}, \qquad\text{ 其中 } e_t\sim WN(0,\sigma_e^2). \]
\[\begin{align} \Phi(B)Y_t&=\Theta(B)e_t\\ 其中,\Phi(B)&=1-\phi_1 B-\phi_2 B^2+\cdots-\phi_p B^p, \\ \Theta(B)&=1-\theta_1 B-\theta_2 B^2+\cdots-\theta_q B^q, \end{align}\]
当自回归系数多项式的零点(\(\Phi(z)=0\)的根)位于单位圆外时,ARMA(p,q)序列是平稳的。
当\(p=q=1\)时,ARMA(1,1)过程
\[Y_t=\phi Y_{t-1}+e_t-\theta e_{t-1}, \qquad |\phi|<1.\]
\(\quad\) -均值 \(\mu=\frac{0}{1-\phi}=0\), 因为\(E(Y_t)=\phi E(Y_{t-1})+E(e_t)-\theta E(e_{t-1}).\)
\(\quad\) -方差 \[\begin{align} Cov(e_t,Y_t)&=Cov(e_t, \phi Y_{t-1}+e_t-\theta e_{t-1})=\sigma_e^2\\ Cov(e_{t-1},Y_t)&=Cov(e_{t-1},\phi Y_{t-1}+e_t-\theta e_{t-1})=(\phi-\theta)\sigma_e^2 \\ \gamma_0=Cov(Y_t,Y_t)&=Cov(\phi Y_{t-1}+e_t-\theta e_{t-1},Y_t)\\ &= \phi Cov(Y_{t-1},Y_t)+Cov(e_t,Y_t)-\theta Cov(e_{t-1},Y_t) \\ &= \phi \gamma_1+\sigma_e^2-\theta(\phi-\theta)\sigma_e^2 \end{align}\]
\(\quad\) -自协方差/自相关函数(\(k=1\)时)
\[\begin{align} \gamma_1=Cov(Y_t,Y_{t-1})&=Cov(\phi Y_{t-1}+e_t-\theta e_{t-1},Y_{t-1})\\ &=\phi \gamma_0-\theta \sigma_e^2 \end{align}\]
\(\quad\) 解二元线性方程组,
\[\begin{cases} \gamma_0 &=& \phi \gamma_1+\sigma_e^2-\theta(\phi-\theta)\sigma_e^2\\ \gamma_1 &=&\phi \gamma_0-\theta \sigma_e^2 \end{cases}\]
\(\quad\) 可得 \(\gamma_0=\frac{1-2\phi\theta+\theta^2}{1-\phi^2}\sigma_e^2, \quad \gamma_1=\frac{(\phi-\theta)(1-\phi\theta)}{1-\phi^2}\sigma_e^2\), 从而可得一阶自相关函数为 \(\rho_1=\frac{\gamma_1}{\gamma_0}=\frac{(\phi-\theta)(1-\phi\theta)}{1-2\phi\theta+\theta^2}\)
\(\quad\) -自协方差/自相关函数, \(\rho_k=\phi^{k-1} \rho_1=\frac{(\phi-\theta)(1-\phi\theta)}{1-2\phi\theta+\theta^2}\phi^{k-1},\quad k\geq 1.\)
\[\begin{align} \gamma_2=Cov(Y_t,Y_{t-2})&=Cov(\phi Y_{t-1}+e_t-\theta e_{t-1},Y_{t-2})=\phi \gamma_1 \\ \gamma_3=Cov(Y_t,Y_{t-3})&=Cov(\phi Y_{t-1}+e_t-\theta e_{t-1},Y_{t-3})=\phi \gamma_2 =\phi^2 \gamma_1\\ &\vdots\\ \gamma_k=Cov(Y_t,Y_{t-k})&=Cov(\phi Y_{t-1}+e_t-\theta e_{t-1},Y_{t-k})=\phi \gamma_{k-1}=\phi^{k-1} \gamma_1 \\ \end{align}\]
\(\qquad\) 结论:ARMA(1,1)是平稳的充分必要条件也是\(|\phi|<1\),自相关函数\(\rho_k=\phi^{k-1}\rho_1\)随着时滞\(k\)的增加而指数衰减。阻尼因子是 \(\phi\),但递减开始于 \(\rho_1\) (依赖于\(\theta\)); 这与AR(1)的自相关函数\(\rho_k\substack{{\tiny AR(1)}\\=}\phi^k\rho_0\)不同.
\(\qquad\) 例如,ARMA(1,1)序列\(\phi=0.8,\theta=0.4\), 则\(\rho_1=0.523, \rho_2=0.418, \rho_3=0.335.\)
###ARMA(2)的自相关函数的理论值 #Exhibit 4.18
Rho1 <- ARMAacf(ar=c(0.8),ma=-c(0.4), lag.max = 12, pacf=FALSE);Rho1
## 0 1 2 3 4 5 6
## 1.00000000 0.52307692 0.41846154 0.33476923 0.26781538 0.21425231 0.17140185
## 7 8 9 10 11 12
## 0.13712148 0.10969718 0.08775775 0.07020620 0.05616496 0.04493197
Rho2 <- ARMAacf(ar=c(-0.8),ma=-c(0.4), lag.max = 12, pacf=FALSE);Rho2
## 0 1 2 3 4 5
## 1.00000000 -0.88000000 0.70400000 -0.56320000 0.45056000 -0.36044800
## 6 7 8 9 10 11
## 0.28835840 -0.23068672 0.18454938 -0.14763950 0.11811160 -0.09448928
## 12
## 0.07559142
opar=par(mfrow=c(1,2))
plot(Rho1, type='h', xlab='Lag', ylab=expression(rho[k]), main=expression(paste(phi==0.8, ", ", theta==0.4)), ylim=c(-1.0,1.0));
points(Rho1, pch=8, col='blue', cex=0.8); abline(h=0);
plot(Rho2, type='h', xlab='Lag', ylab=expression(rho[k]), main=expression(paste(phi==-0.8, ", ", theta==0.4)), ylim=c(-1.0,1.0));
points(Rho2, pch=8, col='blue', cex=0.8); abline(h=0);
par(opar)
\(\qquad\)- ARMA(1,1)的一般线性表示
\[\begin{align} &&Y_t&=\phi Y_{t-1}+e_t-\theta e_{t-1}\\ \phi \cdot &&Y_{t-1}&=\phi Y_{t-2}+e_{t-1}-\theta e_{t-2} &&\cdot\phi\\ \phi^2 \cdot &&Y_{t-2}&=\phi Y_{t-3}+e_{t-2}-\theta e_{t-3} &&\cdot\phi^2\\ &&&\cdots\\ \phi^{k-1} \cdot &&Y_{t-(k-1)}&=\phi Y_{t-k}+e_{t-(k-1)}-\theta e_{t-k} &&\cdot\phi^{k-1}\\ \end{align}\]
\(\qquad\)求和可得 \(Y_t=e_t+(\phi-\theta) e_{t-1}+(\phi-\theta)\phi e_{t-2}+\cdots+(\phi-\theta)\phi^{k-2} e_{t-(k-1)}-\theta\phi^{k-1} e_{t-k}+\phi^k Y_{t-k}.\)
\(\qquad\)如果\(|\phi|<1\), 并且\(k\to \infty\), 那么 \[Y_t=e_t+(\phi-\theta)(e_{t-1}+\phi e_{t-2}+\phi^2 e_{t-3}+\cdots)=e_t+(\phi-\theta)\sum_{i=1}^\infty\phi^{i-1} e_{t-i}. \]
\[ \]
\(\qquad\) 若\(\Phi(z)\)的零点在单位圆之外,\(Y_t\) 为平稳可逆的过程。设ARMA(p,q)的线性表示为\(Y_t=\Psi(B)e_t=1+\psi_1B+\psi_2B^2+\cdots\),
\[\begin{align} \Phi(B)Y_t&=\Theta(B)e_t \\ \Phi(B)\Psi(B)e_t&=\Theta(B)e_t\\ (1-\phi_1 B-\phi_2 B^2-\cdots)(1+\psi_1 B+\psi_2 B^2+\cdots)e_t&=(1-\theta_1 B-\theta_2 B^2-\cdots)e_t \end{align}\]
\(\qquad\) 等式左边展开得,
\[\begin{align} \psi_0-\phi_1\psi_0 &B -\phi_2\psi_0B^2-\phi_3\psi_0B^3-\cdots-\phi_p\psi_0B^p\\ +\psi_1&B-\phi_1\psi_1B^2-\phi_2\psi_1B^3-\cdots-\phi_{p-1}\psi_1B^{p}-\phi_p\psi_1B^{p+1}\\ &\quad\qquad\psi_2B^2-\phi_1\psi_2B^3-\cdots-\phi_{p-2}\psi_2B^p-\phi_{p-1}\psi_2B^{p+1}-\phi_{p}\psi_2B^{p+2}\\ &\quad\qquad\quad\qquad\cdots\cdots \end{align}\]
\(\qquad\) 左右比较\(e_j\)的系数,可以得到 \[\begin{cases} \psi_0&=1, \\ \psi_1-\phi_1\psi_0&=-\theta_1,\\ \psi_2-\phi_1\psi_1-\phi_2\psi_0&=-\theta_2,\\ \psi_3-\phi_1\psi_2-\phi_2\psi_1-\phi_3\psi_0&=-\theta_3,\\ &\vdots\\ \psi_k-\phi_1\psi_{k-1}-\phi_2\psi_{k-2}-\cdots-\phi_k\psi_0&=-\theta_k,\quad(k<p)\\ &\vdots\\ \psi_p-\phi_1\psi_{p-1}-\phi_2\psi_{p-2}-\cdots-\phi_p\psi_0&=-\theta_p,\\ \psi_{p+1}-\phi_1\psi_{p}-\phi_2\psi_{p-1}-\cdots-\phi_{p}\psi_1&=-\theta_{p+1},\\ &\vdots\\ \psi_{m}-\phi_1\psi_{m-1}-\phi_2\psi_{m-2}-\cdots-\phi_{p}\psi_{m-p}&=-\theta_{m},\quad(m>p)\\ &\vdots\\ \end{cases}\]
\(\qquad\) 令\(\theta_j=0 (当j>q时)\), 解方程组可得
\[\begin{cases} \psi_0&=&1, \\ \psi_1&=&-\theta_{1}+\phi_1\psi_0,\\ \psi_2&=&-\theta_{2}+\phi_1\psi_1+\phi_2\psi_0,\\ \psi_3&=&-\theta_{3}+\phi_1\psi_2+\phi_2\psi_1+\phi_3\psi_0,\\ &\vdots\\ \psi_k&=&-\theta_{k}+\phi_1\psi_{k-1}+\phi_2\psi_{k-2}+\cdots+\phi_k\psi_{0}, \quad \text{ if }k\leq p \\ &\vdots\\ \psi_m&=&-\theta_{m}+\phi_1\psi_{m-1}+\phi_2\psi_{m-2}+\cdots+\phi_p\psi_{m-p}, \quad \text{ if }m > p \end{cases}\]
\[ \]
\[Y_t=\sum_{j=0}^\infty \psi_j e_{t-j}=\sum_{i=1}^p \phi_i Y_{t-i}-\sum_{j=0}^q \theta_j e_{t-j}, \qquad \theta_0=-1.\] \(\qquad\) -均值 \(\mu=0\)
\(\qquad\) -自协方差/自相关函数
\[Cov(e_{t-j},Y_{t-k})=Cov(e_{t-j},\sum_{m=0}^\infty \psi_m e_{t-k-m})=\begin{cases}\psi_{j-k}\sigma_e^2,&\quad \text{ if } j\geq k,\\ 0, &\quad \text{ if }j< k.\end{cases}\]
\[\begin{align} Cov(Y_t,Y_{t-k})&=Cov(\sum_{i=1}^p \phi_i Y_{t-i}-\sum_{j=0}^q \theta_j e_{t-j},Y_{t-k})\\ &=\sum_{i=1}^p \phi_i Cov(Y_{t-i},Y_{t-k})-\sum_{j=0}^q \theta_j Cov(e_{t-j},Y_{t-k})\\ \gamma_k&=\begin{cases} \sum_{i=1}^p \phi_i\gamma_{i-k}-\sigma_e^2\sum_{j=k}^q \theta_j\psi_{j-k},&\quad \text{ if }0\leq k\leq q,\\ \sum_{i=1}^p \phi_i\gamma_{i-k},&\quad \text{ if } k > q. \end{cases}\end{align}\]
\(\qquad\) 根据上式,\(\gamma_{-k}=\gamma_k\), 求解\(q+1\)元线性方程组得\(\gamma_0,\gamma_1,\cdots,\gamma_q\),然后利用 \(\rho_i=\frac{\gamma_k}{\gamma_0}\) 计算 \(\rho_1,\cdots,\rho_q\),
\[\rho_k=\phi_1 \rho_{k-1}+\phi_2 \rho_{k-2}+\cdots+\phi_p \rho_{k-p}, \quad \text{ if } k>q.\]
\(\qquad\) 自相关函数分别为 \[\rho_k^Y= \begin{cases} 1, &\text{ if }k=0\\ -\frac{\theta}{1+\theta^2}, &\text{ if }k=1\\ 0, &\text{ if }k\geq 2\\ \end{cases} \qquad\qquad \rho_k^X= \begin{cases} 1, &\text{ if }k=0\\ -\frac{1/\theta}{1+(1/\theta)^2}=-\frac{\theta}{1+\theta^2}, &\text{ if }k=1\\ 0, &\text{ if }k\geq 2\\ \end{cases}\] \(\qquad\) 参数互为倒数的两个MA(1)过程有相同的自相关函数。(习题4.4 on P.58)
\(\qquad\)自相关函数分别为 \[\rho_k^Y= \begin{cases} 1, &\text{ if }k=0\\ \frac{-\theta_1+\theta_1\theta_2}{1+\theta_1^2+\theta_2^2} =\frac{-\frac{1}{6}+\frac{1}{6}\frac{1}{6}}{1+\frac{1}{36}+\frac{1}{36}} =\frac{-5}{38}, &\text{ if }k=1\\ \frac{-\theta_2}{1+\theta_1^2+\theta_2^2}=\frac{-\frac{1}{6}}{1+\frac{1}{36}+\frac{1}{36}}=\frac{-6}{38}, &\text{ if }k=2\\ 0, &\text{ if }k\geq 3\\ \end{cases} \qquad\qquad \rho_k^X= \begin{cases} 1, &\text{ if }k=0\\ \frac{-(-1)+(-1)\cdot 6}{1+(-1)^2+6^2} =\frac{-5}{38}, &\text{ if }k=1\\ \frac{-6}{1+(-1)^2+6^2}=\frac{-6}{38}, &\text{ if }k=2\\ 0, &\text{ if }k\geq 3\\ \end{cases} \]
\(\qquad\) 不同参数的这两个MA(2)过程却有相同的自相关函数。 (习题4.12 on P.59)
\(\qquad\) 相应的特征多项式分别为
\[\begin{align}\Theta^Y(z)&=1-\frac{1}{6}z-\frac{1}{6}z^2=-\frac{1}{6}(z+3)(z-2)=(1+\frac{z}{3})(1-\frac{z}{2})\\ \Theta^X(w)&=1+w-6w^2=(1+3w)(1-2w)\end{align}\]
\(\qquad\) 这两个MA(2)的特征多项式的根互为倒数, \(z_{1,2}=\frac{1}{w_{1,2}}.\)
ARMAacf(ma=-c(1/6,1/6),lag.max=6)
## 0 1 2 3 4 5 6
## 1.0000000 -0.1315789 -0.1578947 0.0000000 0.0000000 0.0000000 0.0000000
ARMAacf(ma=-c(-1,6),lag.max=6)
## 0 1 2 3 4 5 6
## 1.0000000 -0.1315789 -0.1578947 0.0000000 0.0000000 0.0000000 0.0000000
\[\begin{align} &&Y_t&=e_t-\theta e_{t-1}\\ \theta \cdot &&Y_{t-1}&=e_{t-1}-\theta e_{t-2} &&\cdot\theta\\ \theta^2 \cdot &&Y_{t-2}&=e_{t-2}-\theta e_{t-3} &&\cdot\theta^2\\ &&&\cdots\\ \theta^{k-1} \cdot &&Y_{t-(k-1)}&=e_{t-(k-1)}-\theta e_{t-k} &&\cdot\theta^{k-1}\\ \end{align}\]
\(\qquad\)求和可得 \(Y_t+\theta Y_{t-1}+\theta^2 Y_{t-2}+\cdots+\theta^{k-1} Y_{t-(k-1)}=e_t-\theta^k e_{t-k}.\)
\(\qquad\)如果\(|\theta|<1\), 并且\(k\to\infty\), 那么 \[Y_t=e_t-\theta Y_{t-1}-\theta^2 Y_{t-2}-\cdots-\theta^k Y_{t-k}-\cdots \]
\(\qquad\qquad\) 结论:\(\underline{ MA(1){\bf 可逆 }\iff|\theta|<1\iff MA{\bf 系数多项式 }\Theta(z){\bf 的零点在单位圆之外}。}\)
\(\qquad\qquad\) MA系数多项式 \(\qquad\qquad\Theta(z)=1-\theta_1z--\theta_2 z^2-\cdots-\theta_q z^q\)
\(\qquad\qquad\) MA特征方程为 \(\qquad\qquad1-\theta_1z--\theta_2 z^2-\cdots-\theta_q z^q=0\)
\(\qquad\) 类似于AR模型的平稳性条件,\(\underline{MA(q){\bf模型可逆}\iff MA{\bf系数多项式的零点(}MA{\bf特征方程的根)在单位圆之外。}}\)
\[Y_t=\pi_1 Y_{t-1}+\pi_2 Y_{t-2}+\pi_3 Y_{t-3}+\cdots+e_t\]
\(\qquad\) \(\boxed{对于一般的ARMA(p,q)模型,要求同时满足{\bf平稳性和可逆性}条件。}\)
\[ W_t=\phi_1 W_{t-1}+\phi_2 W_{t-2}+\cdots+\phi_p W_{t-p}+e_t-\theta_1 e_{t-1}-\theta_2 e_{t-2}-\cdots-\theta_q e_{t-q}. \]
\(\qquad 其中 \quad W_t=\nabla^d Y_t=(1-B)^d Y_t=\sum_{j=0}^d(-1)^j{d \choose j}B^jY_t=\sum_{j=0}^d(-1)^j{d \choose j}Y_{t-j}, \quad e_t\sim WN(0,\sigma_e^2).\)