Bootstrap filter / particle filter algorithm (understanding)



I really don't understand how the bootstrap filter works. I know the concepts roughly, but I'm missing some of the details. This question is for me to clear up the mess. Here I will use this popular filtering algorithm from a reference by Doucet (so far I find it the easiest reference). Let me say first that my problem is understanding which distributions are known and which are unknown.

[figure: the bootstrap filter algorithm from the reference]

Oto moje pytania:

  1. In 2), what is the distribution p(xt|xt−1(i))? Is this distribution known? Do we know this distribution for all t? If so, what if we cannot sample from it? It's funny that they call it the importance sampling step, yet I don't see a proposal distribution.
  2. Also in 2), is p(yt|x~t(i)) a known distribution? Does "normalize the importance weights" mean wt(i) = w~t(i) / ∑i=1N w~t(i)? And what do the tildes on x and w mean? Do they mean something like un-resampled and un-normalized, respectively?
  3. I would appreciate it if anyone could give a simple toy example using well-known distributions to illustrate this bootstrap filter. The end goal of the bootstrap filter is not clear to me.

Answers:


  1. That is the transition density of the state (xt), which is part of your model and therefore known. You do need to sample from it in the basic algorithm, but approximations are possible. p(xt|xt1) is the proposal distribution in this case. It is used because the distribution p(xt|x0:t1,y1:t) is generally not tractable.

  2. Yes, that's the observation density, which is also part of the model, and therefore known. Yes, that's what normalization means. The tilde is used to signify something like "preliminary": x~ is x before resampling, and w~ is w before renormalization. I would guess that it is done this way so that the notation matches up between variants of the algorithm that don't have a resampling step (i.e. x is always the final estimate).

  3. The end goal of the bootstrap filter is to estimate the sequence of conditional distributions p(xt|y1:t) (the unobservable state at t, given all observations until t).

Consider the simple model:

Xt = Xt−1 + ηt,  ηt ∼ N(0, 1)
X0 ∼ N(0, 1)
Yt = Xt + εt,  εt ∼ N(0, 1)

This is a random walk observed with noise (you only observe Y, not X). You can compute p(Xt|Y1,...,Yt) exactly with the Kalman filter, but we'll use the bootstrap filter at your request. We can restate the model in terms of the state transition distribution, the initial state distribution, and the observation distribution (in that order), which is more useful for the particle filter:

Xt | Xt−1 ∼ N(Xt−1, 1)
X0 ∼ N(0, 1)
Yt | Xt ∼ N(Xt, 1)
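For concreteness, this generative model can be simulated directly; here is a Python/NumPy sketch (variable names are mine; the R listing below does the same thing):

```python
import numpy as np

rng = np.random.default_rng(0)
tau = 100

# X_t = X_{t-1} + eta_t: a Gaussian random walk; the first increment
# also serves as the N(0, 1) initial draw, as in the R code below
x = np.cumsum(rng.normal(size=tau))

# Y_t = X_t + eps_t: the noisy observations, which are all the filter sees
y = x + rng.normal(size=tau)
```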

Applying the algorithm:

  1. Initialization. We generate N particles (independently) according to X0(i) ∼ N(0, 1).

  2. We simulate each particle forward independently by generating X1(i) | X0(i) ∼ N(X0(i), 1), for each i = 1, …, N.

    We then compute the likelihood w~t(i)=ϕ(yt;xt(i),1), where ϕ(x;μ,σ2) is the normal density with mean μ and variance σ2 (our observation density). We want to give more weight to particles which are more likely to produce the observation yt that we recorded. We normalize these weights so they sum to 1.

  3. We resample the particles according to these weights wt. Note that a particle is a full path of x (i.e. don't just resample the last point, it's the whole thing, which they denote as x0:t(i)).

Go back to step 2, moving forward with the resampled version of the particles, until we've processed the whole series.
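The weighting in step 2 and the resampling in step 3 can be sketched on their own in Python/NumPy (the particle values and observation here are made-up numbers, purely for illustration):

```python
import numpy as np

x_t = np.array([0.5, -1.0, 2.0])  # hypothetical propagated particles x_t^(i)
y_t = 0.0                         # hypothetical observation at time t

# Unnormalized weights: w~_t^(i) = phi(y_t; x_t^(i), 1),
# the N(x, 1) observation density evaluated at y_t
w_tilde = np.exp(-0.5 * (y_t - x_t) ** 2) / np.sqrt(2 * np.pi)

# Normalize so the weights sum to 1
w = w_tilde / w_tilde.sum()

# Step 3: draw particle indices in proportion to the weights
rng = np.random.default_rng(0)
idx = rng.choice(len(x_t), size=len(x_t), p=w)
```

The particle closest to y_t (here 0.5) gets the largest weight, so it is the most likely to be duplicated in the resample.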

An implementation in R follows:

# Simulate some fake data
set.seed(123)

tau <- 100
x <- cumsum(rnorm(tau))
y <- x + rnorm(tau)

# Begin particle filter
N <- 1000
x.pf <- matrix(rep(NA,(tau+1)*N),nrow=tau+1)

# 1. Initialize
x.pf[1, ] <- rnorm(N)
m <- rep(NA,tau)
for (t in 2:(tau+1)) {
  # 2. Importance sampling step
  x.pf[t, ] <- x.pf[t-1,] + rnorm(N)

  #Likelihood
  w.tilde <- dnorm(y[t-1], mean=x.pf[t, ])

  #Normalize
  w <- w.tilde/sum(w.tilde)

  # NOTE: This step isn't part of your description of the algorithm, but I'm going to compute the mean
  # of the particle distribution here to compare with the Kalman filter later. Note that this is done BEFORE resampling
  m[t-1] <- sum(w*x.pf[t,])

  # 3. Resampling step
  s <- sample(1:N, size=N, replace=TRUE, prob=w)

  # Note: resample WHOLE path, not just x.pf[t, ]
  x.pf <- x.pf[, s]
}

plot(x)
lines(m,col="red")

# Let's do the Kalman filter to compare
library(dlm)
lines(dropFirst(dlmFilter(y, dlmModPoly(order=1))$m), col="blue")

legend("topleft", legend = c("Actual x", "Particle filter (mean)", "Kalman filter"), col=c("black","red","blue"), lwd=1)

The resulting graph: [plot of the actual x (black) with the particle-filter mean (red) and the Kalman-filter mean (blue) overlaid]

A useful tutorial is the one by Doucet and Johansen, see here.


For your bullet point 2) in applying the algorithm: should X1(i)|X0(i) ∼ N(0,1) be X1(i)|X0(i) ∼ N(X0(i),1)? Thank you so much. I have a working bootstrap filter under this model. Thanks for the emphasis on resampling the paths and not just the t-th particles.
tintinthong

That's correct, I fixed the typo.
Chris Haug

The paths do not have to be resampled, do they? From other literature, there seems to be no need to resample the paths; I just need to resample the particles at each time step. I was wondering whether there is a reason for resampling the paths.
tintinthong
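On that last comment: for the filtering distribution p(xt|y1:t) alone, resampling only the current particles is indeed enough, because the model is Markov and the resampling indices are the same either way. Keeping whole paths matters only if you also want (increasingly degenerate) samples of x0:t. A small Python sketch of the bookkeeping (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 5
paths = rng.normal(size=(3, N))   # rows = times 0..2, columns = particles
w = rng.random(N)
w /= w.sum()                      # normalized importance weights

idx = rng.choice(N, size=N, p=w)  # one common draw of resampling indices

paths_resampled = paths[:, idx]   # resample whole paths (as in the answer)
last_resampled = paths[-1, idx]   # resample only the time-t particles

# The time-t particles are identical either way; only the stored history differs
print(np.array_equal(paths_resampled[-1], last_resampled))  # True
```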
Licensed under cc by-sa 3.0 with attribution required.