Beyond the Markov property (MP), a further property to check is time
homogeneity (TH): $X_t$ can be Markov while its transition matrix
$P(t)$ depends on the time $t$. For example, with daily observations the
transition matrix may depend on the weekday at $t$; if TH is then wrongly
assumed, a dependence of $X_t$ on $X_{t-7}$ conditional on $X_{t-1}$ may
show up in the diagnostics.
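To make the weekday effect concrete, here is a minimal sketch in base R. The two transition matrices `Pweek` and `Pwend`, the two-state chain and the sample size are all made up for illustration. The chain is Markov given the weekday, but its transition matrix switches at the weekend, and the table of $X_{t-7}$ against $X_t$ given $X_{t-1}$ is then tested as if TH held.

```r
set.seed(1)
## hypothetical transition matrices: one for days 1-5, one for days 6-7
Pweek <- matrix(c(0.9, 0.1,
                  0.7, 0.3), nrow = 2, byrow = TRUE)
Pwend <- matrix(c(0.3, 0.7,
                  0.1, 0.9), nrow = 2, byrow = TRUE)
N <- 3000
day <- ((seq_len(N) - 1) %% 7) + 1   # weekday index 1..7 of each time t
x <- integer(N)
x[1] <- 1L
for (t in 2:N) {
    ## the matrix used for the transition into time t depends on the weekday
    Pt <- if (day[t] >= 6) Pwend else Pweek
    x[t] <- sample.int(2, 1, prob = Pt[x[t - 1], ])
}
X <- factor(x, levels = 1:2, labels = c("A", "B"))
## aligned triples (X(t-7), X(t-1), X(t)) for t = 8, ..., N
tab7 <- table(lag7 = X[1:(N - 7)], lag1 = X[7:(N - 1)], now = X[8:N])
## independence of X(t-7) and X(t) given X(t-1) = "A", ignoring the weekday
test7 <- chisq.test(tab7[ , "A", ])
test7$p.value
```

With matrices this different, the test should reject clearly even though the chain is Markov once the weekday is known. How visible the effect is in real data depends on how much the transition matrices actually differ between days.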
Assuming TH holds, one possible check for MP is to test that $X_t$ is
independent of $X_{t-2}$ conditional on $X_{t-1}$, as Michael Chernick and
StasK suggested. This can be done with a test of independence in a
contingency table: build the $n$ contingency tables of $X_t$ and $X_{t-2}$
conditional on $\{X_{t-1} = x_j\}$, one for each of the $n$ possible values
$x_j$, and test each of them for independence. The same can be done using
$X_{t-\ell}$ with $\ell > 1$ in place of $X_{t-2}$.
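The procedure just described can be sketched as a small helper that runs one test per conditioning state. The function name `condIndepTests`, its arguments and the Holm adjustment for the $n$ simultaneous tests are my own choices for illustration, not part of any package.

```r
## Sketch: for each state s, test that X(t) is independent of X(t-lag)
## given X(t-1) = s. 'X' must be a factor; 'B' is the number of
## Monte Carlo replicates used by fisher.test().
condIndepTests <- function(X, lag = 2, B = 2000) {
    stopifnot(is.factor(X), lag >= 2, length(X) > lag)
    N <- length(X)
    ## aligned triples (X(t-lag), X(t-1), X(t)) for t = lag + 1, ..., N
    tab3 <- table(past = X[1:(N - lag)],
                  prev = X[lag:(N - 1)],
                  curr = X[(lag + 1):N])
    pvals <- sapply(levels(X), function(s) {
        tab <- tab3[ , s, ]
        ## drop empty rows and columns before testing
        tab <- tab[rowSums(tab) > 0, colSums(tab) > 0, drop = FALSE]
        if (min(dim(tab)) < 2) return(NA_real_)
        fisher.test(tab, simulate.p.value = TRUE, B = B)$p.value
    })
    data.frame(state = levels(X), p.value = pvals,
               p.adjusted = p.adjust(pvals, method = "holm"))
}

## usage on a simulated homogeneous chain (made-up transition matrix)
set.seed(42)
P <- matrix(c(0.1, 0.1, 0.1, 0.7,
              0.1, 0.1, 0.6, 0.2,
              0.1, 0.3, 0.2, 0.4,
              0.2, 0.2, 0.3, 0.3), nrow = 4, byrow = TRUE)
N <- 2000
x <- integer(N)
x[1] <- 1L
for (t in 2:N) x[t] <- sample.int(4, 1, prob = P[x[t - 1], ])
X <- factor(x, levels = 1:4, labels = LETTERS[1:4])
res <- condIndepTests(X, lag = 2)
res
```

Since $n$ tests are run at once, some correction for multiple testing seems sensible. Holm's method is used here only as one reasonable default.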
In R, contingency tables or arrays are easily produced thanks to the
factor facility and the functions `apply` and `sweep`. The idea above can
also be exploited graphically. The packages ggplot2 and lattice readily
provide conditional plots to compare the conditional distributions
$p(X_t \mid X_{t-1} = x_j, X_{t-2} = x_i)$. For instance, setting $i$ as the
row index and $j$ as the column index in a trellis display should, under
MP, lead to similar distributions within a column.
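The same comparison can be made numerically before plotting. The following sketch, using base R only and a made-up three-state transition matrix, computes the estimated conditional distributions from the three-way table with `prop.table`.

```r
set.seed(7)
## hypothetical transition matrix, for illustration only
P <- matrix(c(0.6, 0.3, 0.1,
              0.2, 0.5, 0.3,
              0.3, 0.3, 0.4), nrow = 3, byrow = TRUE)
N <- 5000
x <- integer(N)
x[1] <- 1L
for (t in 2:N) x[t] <- sample.int(3, 1, prob = P[x[t - 1], ])
X <- factor(x, levels = 1:3, labels = c("A", "B", "C"))
## counts of the triples (X(t-2), X(t-1), X(t))
tab3 <- table(lag2 = X[1:(N - 2)], lag1 = X[2:(N - 1)], now = X[3:N])
## estimated p(X(t) | X(t-1) = x_j, X(t-2) = x_i): normalize over 'now'
cond <- prop.table(tab3, margin = c(1, 2))
## fix X(t-1) = "B": one row per value of X(t-2), one column per value of X(t)
round(cond[ , "B", ], 2)
```

Under MP the rows of such a slice estimate the same distribution, so they should differ only by sampling noise. This is the numerical counterpart of comparing the panels within one column of the trellis display.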
Chapter 5 of the book The Statistical Analysis of Stochastic Processes in Time by J. K. Lindsey contains other ideas for checking assumptions.
[
## simulates a Markov chain (MC) of length 'N' with transition matrix
## 'trans', starting from state 'ini'
simMC <- function(trans, ini = 1, N) {
    X <- rep(NA, N)
    ## row-wise cumulative probabilities, used to invert the uniform draw
    Pcum <- t(apply(trans, 1, cumsum))
    X[1] <- ini
    for (t in 2:N) {
        U <- runif(1)
        X[t] <- findInterval(U, Pcum[X[t-1], ]) + 1
    }
    X
}
set.seed(1234)
## transition matrix
P <- matrix(c(0.1, 0.1, 0.1, 0.7,
              0.1, 0.1, 0.6, 0.2,
              0.1, 0.3, 0.2, 0.4,
              0.2, 0.2, 0.3, 0.3),
            nrow = 4, ncol = 4, byrow = TRUE)
N <- 2000
X <- simMC(trans = P, ini = 1, N = N)
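## it is better to work with factors; fixing the levels keeps all four
## states even if one of them never occurs in the simulated path
X <- factor(X, levels = 1:4, labels = LETTERS[1:4])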
## tabulate the transitions and normalize each row
Phat <- table(X[1:(N-1)], X[2:N])
Phat <- sweep(x = Phat, MARGIN = 1, STATS = rowSums(Phat), FUN = "/")
## explicit dimnames
dimnames(Phat) <- lapply(list("X(t-1)=", "X(t)="),
                         paste, sep = "", levels(X))
## 3-way contingency array of the triples (X(t-2), X(t-1), X(t))
P3 <- table(X[1:(N-2)], X[2:(N-1)], X[3:N])
dimnames(P3) <- lapply(list("X(t-2)=", "X(t-1)=", "X(t)="),
                       paste, sep = "", levels(X))
## apply ONE independence test: X(t-2) versus X(t), given X(t-1) = "A"
fisher.test(P3[ , 1, ], simulate.p.value = TRUE)
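## plot the conditional distributions of X(t): the first conditioning
## variable (lag1X) varies along the columns, the second (lag2X) along the rows
library(lattice)
X3 <- data.frame(X = X[3:N], lag1X = X[2:(N-1)], lag2X = X[1:(N-2)])
histogram(~ X | lag1X + lag2X, data = X3, col = "SteelBlue3")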
]