In an earlier post, I introduced the reverse stochastic differential equation (SDE), which is integrated to generate data samples.
Reverse SDE in diffusion models
The reverse SDE is given by
dx_t = \left[f(x_t, t) - g(t)^2 \nabla_{x_t} \log p_t(x_t)\right] dt + g(t)\, d\bar{w}_t. \tag{reverse SDE}
The score function \nabla_{x_t} \log p_t(x_t) is required to integrate (reverse SDE). One way to approximate it is with a neural network s_\theta(x_t, t), called a score model, parametrized by \theta. The score matching (SM) loss, \mathcal{L}_{SM}(\theta), can be used to train the score model to approximate the score function
\mathcal{L}_{SM}(\theta) = \mathbf{E}_{t, x_t} \left\lVert s_\theta(x_t, t) - \nabla_{x_t} \log p_t(x_t) \right\rVert^2. \tag{SM loss}
However, the score function is usually intractable because p_t is not known in closed form, so (SM loss) cannot be used directly in practice.
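To make the two equations above concrete, here is a minimal sketch in a toy setting where the score happens to be tractable. Everything specific in it is my own assumption for illustration and does not come from the post: a one-dimensional forward SDE with f(x_t, t) = -\beta x_t / 2 and g(t) = \sqrt{\beta} for a constant \beta, and Gaussian data x_0 ~ N(M0, S0^2). Under these assumptions p_t is Gaussian, so \nabla_{x_t} \log p_t(x_t) has a closed form. The names `marginal`, `true_score`, `score_model`, `sample_reverse_sde`, and `sm_loss` are hypothetical, and the one-parameter `score_model` only stands in for a neural network. The reverse SDE is integrated with a simple Euler-Maruyama scheme, which is one possible choice of integrator.

```python
import math
import random
import statistics

BETA, T = 1.0, 5.0   # assumed constant noise rate and time horizon
M0, S0 = 2.0, 0.5    # assumed toy data distribution: x_0 ~ N(M0, S0^2)

def f(x, t):
    """Drift of the assumed forward SDE dx = -0.5*BETA*x dt + sqrt(BETA) dw."""
    return -0.5 * BETA * x

def g(t):
    """Diffusion coefficient of the assumed forward SDE."""
    return math.sqrt(BETA)

def marginal(t, m0=M0):
    """Mean and variance of p_t, which is Gaussian in this toy setup."""
    decay = math.exp(-BETA * t)
    return m0 * math.sqrt(decay), S0**2 * decay + 1.0 - decay

def true_score(x, t):
    """Exact score of p_t: the gradient of the Gaussian log-density."""
    mean, var = marginal(t)
    return -(x - mean) / var

def score_model(x, t, theta):
    """Hypothetical one-parameter score model; theta is its guess of M0."""
    mean, var = marginal(t, m0=theta)
    return -(x - mean) / var

def sample_reverse_sde(score, n_samples=2000, n_steps=200, seed=0):
    """Euler-Maruyama integration of the reverse SDE from t=T down to t=0."""
    rng = random.Random(seed)
    dt = T / n_steps
    mean_T, var_T = marginal(T)
    samples = []
    for _ in range(n_samples):
        x = rng.gauss(mean_T, math.sqrt(var_T))  # start from x_T ~ p_T
        for k in range(n_steps):
            t = T - k * dt
            drift = f(x, t) - g(t) ** 2 * score(x, t)
            # Stepping backward in time flips the sign of the drift increment.
            x = x - drift * dt + g(t) * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples

def sm_loss(theta, n_samples=2000, seed=0):
    """Monte Carlo estimate of the SM loss for the toy score model."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        t = rng.uniform(0.0, T)
        mean, var = marginal(t)
        x_t = rng.gauss(mean, math.sqrt(var))  # x_t ~ p_t
        total += (score_model(x_t, t, theta) - true_score(x_t, t)) ** 2
    return total / n_samples

samples = sample_reverse_sde(true_score)
print("sample mean/std:", statistics.mean(samples), statistics.stdev(samples))
print("SM loss at theta=M0:", sm_loss(M0), "at theta=0:", sm_loss(0.0))
```

With the exact score plugged in, the generated samples should have a mean and standard deviation close to M0 and S0, up to discretization and sampling error. The SM loss vanishes when the toy model matches the true score and is positive otherwise. Note that `sm_loss` calls `true_score`, which is exactly the quantity that is unavailable for real data.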
In my next post, I will introduce the denoising score matching (DSM) loss, an alternative to the SM loss that still trains the score model to approximate the score function.