Score matching loss in diffusion models.

In a previous post, I introduced the reverse stochastic differential equation (SDE), which is integrated to generate data samples.

Reverse SDE in diffusion models

The reverse SDE is given by

dx_t = \left[f(x_t, t)-g(t)^2 \nabla_{x_t} \log p_t (x_t) \right] dt + g(t) d\bar{w}_t.  \tag{reverse SDE}
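To make the integration concrete, here is a minimal Euler-Maruyama sketch of (reverse SDE) in Python. Everything specific in it is an assumption chosen for illustration: a one-dimensional Gaussian data distribution N(M0, S0^2), a forward drift f(x, t) = -0.5 * BETA * x with constant g(t) = sqrt(BETA), and the names BETA, M0, S0 and reverse_sde_sample. With these choices p_t is Gaussian, so the score is available in closed form, which is not the case for real data.

```python
import numpy as np

BETA = 1.0          # constant noise rate (assumed for this toy example)
M0, S0 = 2.0, 0.5   # toy data distribution N(M0, S0^2) (assumed)

def f(x, t):
    # Forward drift f(x_t, t) of the assumed forward SDE.
    return -0.5 * BETA * x

def g(t):
    # Diffusion coefficient g(t), constant in this toy example.
    return np.sqrt(BETA)

def score(x, t):
    # Closed-form score of p_t, which is Gaussian for Gaussian data.
    mean_t = M0 * np.exp(-0.5 * BETA * t)
    var_t = S0**2 * np.exp(-BETA * t) + 1.0 - np.exp(-BETA * t)
    return -(x - mean_t) / var_t

def reverse_sde_sample(n_samples, T=10.0, n_steps=1000, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # Start from N(0, 1), which is close to p_T when T is large.
    x = rng.standard_normal(n_samples)
    for i in range(n_steps):
        t = T - i * dt
        drift = f(x, t) - g(t) ** 2 * score(x, t)
        noise = rng.standard_normal(n_samples)
        # Step backward in time: x_{t - dt} = x_t - drift * dt + g * sqrt(dt) * z.
        x = x - drift * dt + g(t) * np.sqrt(dt) * noise
    return x

samples = reverse_sde_sample(10000)
print(samples.mean(), samples.std())
```

In this toy setting the sample mean and standard deviation should land close to M0 and S0, up to discretization and Monte Carlo error.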

The score function \nabla_{x_t} \log p_t(x_t) is required to integrate (reverse SDE). One way to approximate it is with a neural network s_\theta(x_t, t), called a score model, parametrized by \theta. The score matching (SM) loss \mathcal{L}_{SM}(\theta) can be used to train the score model to approximate the score function:

\mathcal{L}_{SM}(\theta) = \mathbf{E}_{t, x_t} \left\lVert s_\theta(x_t,t) - \nabla_{x_t} \log p_t (x_t) \right\rVert_2^2.  \tag{SM loss}
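As an illustration of what (SM loss) measures, the following sketch estimates it by Monte Carlo in a toy case where the true score is known. The setup is assumed, not taken from this post: Gaussian data N(M0, S0^2), a forward process with constant rate BETA, t drawn uniformly from [0, T], and a hypothetical one-parameter score model whose only unknown is the data mean theta. The loss is written here as a squared norm.

```python
import numpy as np

BETA = 1.0          # constant noise rate (assumed for this toy example)
M0, S0 = 2.0, 0.5   # toy data distribution N(M0, S0^2) (assumed)

def marginal_mean_var(t, m0=M0):
    # Mean and variance of the Gaussian p_t for data mean m0.
    decay = np.exp(-BETA * t)
    return m0 * np.sqrt(decay), S0**2 * decay + 1.0 - decay

def true_score(x, t):
    mean_t, var_t = marginal_mean_var(t)
    return -(x - mean_t) / var_t

def score_model(x, t, theta):
    # Hypothetical score model: the true score with the data mean replaced by theta.
    mean_t, var_t = marginal_mean_var(t, m0=theta)
    return -(x - mean_t) / var_t

def sm_loss(theta, n_samples=10000, T=1.0, seed=0):
    rng = np.random.default_rng(seed)
    t = rng.uniform(0.0, T, size=n_samples)   # sample t
    mean_t, var_t = marginal_mean_var(t)
    x_t = mean_t + np.sqrt(var_t) * rng.standard_normal(n_samples)  # sample x_t ~ p_t
    diff = score_model(x_t, t, theta) - true_score(x_t, t)
    return np.mean(diff**2)                   # Monte Carlo estimate of the expectation

losses = {theta: sm_loss(theta) for theta in (0.0, 1.0, 2.0, 3.0)}
print(losses)
```

The estimate vanishes at theta = M0, where the model equals the true score, and grows as theta moves away from it. Note that evaluating the loss required calling true_score, which is only possible because the toy density is known.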

However, the score function is often intractable because the density p_t(x_t) of the noised data is unknown, so (SM loss) cannot be used in practice.

In my next post, I will introduce the denoising score matching (DSM) loss, an alternative to the SM loss that still trains the score model to approximate the score function.
