The Kullback-Leibler divergence $\kappa(P\mid Q)$ of $P$ with respect to $Q$ is infinite when $P$ is not absolutely continuous with respect to $Q$, that is, when there exists a measurable set $A$ such that $Q(A)=0$ and $P(A)\ne 0$. Furthermore, the KL divergence is not symmetric: in general $\kappa(P\mid Q)\ne\kappa(Q\mid P)$. Recall that
$$\kappa(P\mid Q)=\int P\log\left(\frac{P}{Q}\right).$$
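To see both drawbacks concretely, here is a minimal numerical sketch on discrete distributions (my own illustration, not part of the original argument; the helper name `kl` is mine):

```python
import numpy as np

def kl(p, q):
    """kappa(P|Q) for discrete distributions; +inf when P is not
    absolutely continuous with respect to Q (some q[i] = 0 < p[i])."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    if np.any((q == 0) & (p > 0)):
        return np.inf
    mask = p > 0                        # convention: 0 * log(0/q) = 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

P, Q = [0.5, 0.5, 0.0], [0.9, 0.0, 0.1]   # Q({1}) = 0 while P({1}) = 0.5
print(kl(P, Q))                           # inf: P is not << Q
print(kl([0.5, 0.5], [0.9, 0.1]))         # 0.5108...
print(kl([0.9, 0.1], [0.5, 0.5]))         # 0.3681...: asymmetry
```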
A way out of both these drawbacks, still based on KL divergence, is to introduce the midpoint
$$R=\tfrac12(P+Q).$$
Thus $R$ is a probability measure, and $P$ and $Q$ are always absolutely continuous with respect to $R$ (indeed $P\le 2R$ and $Q\le 2R$, hence every $R$-null set is also $P$-null and $Q$-null). Hence one can consider a "distance" between $P$ and $Q$, still based on KL divergence but using $R$, defined as
$$\eta(P,Q)=\kappa(P\mid R)+\kappa(Q\mid R).$$
Then $\eta(P,Q)$ is nonnegative and finite for every $P$ and $Q$, $\eta$ is symmetric in the sense that $\eta(P,Q)=\eta(Q,P)$ for every $P$ and $Q$, and $\eta(P,Q)=0$ if and only if $P=Q$.
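These three properties are easy to check numerically. A hedged sketch, reusing `kl` from the snippet above (`eta_div` is my name for $\eta$):

```python
def eta_div(p, q):
    """eta(P,Q) = kappa(P|R) + kappa(Q|R) with R = (P+Q)/2."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    r = 0.5 * (p + q)
    return kl(p, r) + kl(q, r)

P, Q = [0.5, 0.5, 0.0], [0.0, 0.5, 0.5]   # supports overlap at one point only
print(eta_div(P, Q))                      # log(2): finite even though kl(P, Q) = inf
print(np.isclose(eta_div(P, Q), eta_div(Q, P)))  # True: symmetry
print(eta_div(P, P))                      # 0.0: eta(P, P) = 0
```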
An equivalent formulation is
$$\eta(P,Q)=2\log(2)+\int\bigl(P\log(P)+Q\log(Q)-(P+Q)\log(P+Q)\bigr).$$
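A quick numerical check of this identity, continuing the sketch above (the seed and support size are arbitrary choices of mine):

```python
rng = np.random.default_rng(0)
p = rng.random(5); p /= p.sum()           # a random strictly positive pmf
q = rng.random(5); q /= q.sum()

direct = eta_div(p, q)
alt = 2 * np.log(2) + np.sum(p * np.log(p) + q * np.log(q)
                             - (p + q) * np.log(p + q))
print(np.isclose(direct, alt))            # True
```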
Addendum 1 The introduction of the midpoint of $P$ and $Q$ is not arbitrary, in the sense that
$$\eta(P,Q)=\min\,\bigl[\kappa(P\mid\cdot)+\kappa(Q\mid\cdot)\bigr],$$
where the minimum is over the set of probability measures. Indeed, for every probability measure $S$ one has $\kappa(P\mid S)+\kappa(Q\mid S)=\eta(P,Q)+2\,\kappa(R\mid S)$, and the last term is nonnegative and vanishes exactly when $S=R$.
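A sampling check of this variational characterization on a fixed finite support, continuing the sketch above (the tolerance guards against round-off):

```python
p = np.array([0.2, 0.5, 0.3])
q = np.array([0.6, 0.1, 0.3])
target = eta_div(p, q)

# kappa(P|S) + kappa(Q|S) never falls below eta(P,Q) over random pmfs S...
for _ in range(10_000):
    s = rng.random(3); s /= s.sum()
    assert kl(p, s) + kl(q, s) >= target - 1e-12

# ...and the midpoint R = (P+Q)/2 attains the minimum.
r = 0.5 * (p + q)
print(np.isclose(kl(p, r) + kl(q, r), target))   # True
```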
Addendum 2 @cardinal remarks that $\eta$ is also an $f$-divergence, for the convex function
$$f(x)=x\log(x)-(1+x)\log(1+x)+(1+x)\log(2).$$
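This form can be checked numerically too, via $D_f(P\mid Q)=\int Q\,f(P/Q)$ (the name `f_js` is mine; `p`, `q`, and `eta_div` are reused from the sketches above, on a strictly positive example so that $P/Q$ is well defined):

```python
def f_js(x):
    return x * np.log(x) - (1 + x) * np.log(1 + x) + (1 + x) * np.log(2)

print(np.isclose(np.sum(q * f_js(p / q)), eta_div(p, q)))   # True
```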