- What is Variational Score Distillation (VSD)?
It is a generalized version of Score Distillation Sampling (SDS) that learns a distribution over 3D parameters (e.g., a NeRF) using a pre-trained 2D text-to-image diffusion model. It is based on particle-based variational inference, which approximates the text-conditioned distribution with a finite set of samples (particles) and updates them to minimize the KL divergence between the true distribution and the distribution the current particles represent.
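Roughly, and paraphrasing the ProlificDreamer objective (the timestep weighting and exact conditioning are simplified here), VSD solves

$$\min_{\mu}\ \mathbb{E}_{t}\Big[\,\omega(t)\, D_{\mathrm{KL}}\big(q_t^{\mu}(\mathbf{x}_t \mid y)\ \big\|\ p_t(\mathbf{x}_t \mid y)\big)\Big],$$

where $q_t^{\mu}$ is the distribution of noisy rendered images induced by the particle distribution $\mu$ over 3D parameters, $p_t$ is the noisy text-conditioned image distribution defined by the pre-trained diffusion model, and $\omega(t)$ is a weighting over timesteps.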
- What is particle-based variational inference?
Particle-based variational inference uses a finite set of samples (particles) to approximate a distribution and iteratively minimizes the difference between the true distribution and the approximation. One such update rule is SVGD (Stein Variational Gradient Descent).
Recently, particle-based variational inference methods (ParVIs) have been proposed. They use a set of samples, or particles, to represent the approximating distribution (like MCMC) and deterministically update particles by minimizing the KL-divergence to p (like VIs).
Understanding and Accelerating Particle-Based Variational Inference: https://arxiv.org/pdf/1807.01750.pdf
- What is Stein Variational Gradient Descent (SVGD)? What is the Wasserstein gradient flow?
Stein Variational Gradient Descent estimates a vector field that minimizes the KL divergence between the true distribution and the current approximation; the field can be computed analytically with a kernel or predicted by a neural network. SVGD is known to simulate the steepest descent curve (the Wasserstein gradient flow) of the KL divergence in 2-Wasserstein space, the space of probability distributions equipped with the 2-Wasserstein distance.
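As an illustration, below is a minimal NumPy sketch of the analytical SVGD update with an RBF kernel; the toy target, bandwidth heuristic, and step size are my own choices for the example, not from the papers above.

```python
import numpy as np

def svgd_step(X, score, lr=0.3):
    """One SVGD update with an RBF kernel and the median bandwidth heuristic.
    phi(x_i) = (1/n) * sum_j [ k(x_j, x_i) * score(x_j) + grad_{x_j} k(x_j, x_i) ]
    is the estimated steepest-descent direction of KL(q || p)."""
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]        # diff[j, i] = x_j - x_i
    sq = (diff ** 2).sum(-1)                    # pairwise squared distances
    h = np.median(sq) / np.log(n + 1)           # median heuristic bandwidth
    K = np.exp(-sq / h)                         # K[j, i] = k(x_j, x_i)
    grad_K = (-2.0 / h) * diff * K[..., None]   # grad wrt x_j of k(x_j, x_i)
    # Attractive (kernel-weighted score) + repulsive (kernel gradient) terms
    phi = (K.T @ score(X) + grad_K.sum(axis=0)) / n
    return X + lr * phi

# Toy target: standard 2D Gaussian, whose score is simply -x.
rng = np.random.default_rng(0)
X = rng.normal(loc=4.0, scale=2.0, size=(100, 2))  # particles start off-target
score = lambda X: -X
for _ in range(1000):
    X = svgd_step(X, score)
print(X.mean(axis=0), X.std(axis=0))  # should move near (0, 0) and (1, 1)
```

The repulsive kernel-gradient term is what keeps the particles spread out instead of all collapsing onto the mode.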
- What are Markov Chain Monte Carlo (MCMC) techniques? What are a Markov chain and a Monte Carlo simulation?
A Monte Carlo simulation (or method) approximates a distribution, or expectations under it, via repeated random sampling. A Markov chain is a process describing a sequence of possible events (states) with transition probabilities that depend only on the current state. MCMC generates samples from a Markov chain whose stationary distribution is the target. Theoretically, MCMC matches the target distribution with an infinite number of samples, but it often converges slowly due to undesirable autocorrelation between consecutive samples.
A Unified Particle-Optimization Framework for Scalable Bayesian Sampling: https://arxiv.org/pdf/1805.11659.pdf
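A minimal random-walk Metropolis-Hastings sketch (toy target and step size of my choosing) illustrates both points: each proposal depends only on the current state (the Markov property), and consecutive samples are highly correlated.

```python
import numpy as np

def metropolis_hastings(log_p, x0, n_samples, step=0.5):
    """Random-walk Metropolis-Hastings: proposals come from a Gaussian
    centered at the current state, so the next sample depends only on
    the current one (Markov property)."""
    x, samples = x0, []
    for _ in range(n_samples):
        x_new = x + step * np.random.randn()
        # Accept with probability min(1, p(x_new) / p(x))
        if np.log(np.random.rand()) < log_p(x_new) - log_p(x):
            x = x_new
        samples.append(x)  # rejected proposals repeat the current state
    return np.array(samples)

# Toy target: standard Gaussian (an unnormalized log-density suffices)
log_p = lambda x: -0.5 * x ** 2
samples = metropolis_hastings(log_p, x0=10.0, n_samples=20000)
print(samples[2000:].mean(), samples[2000:].std())  # ~0 and ~1 after burn-in
```

Because the chain moves in small correlated steps from x0 = 10, the early samples must be discarded as burn-in, which is the slow-convergence issue mentioned above.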
- How exactly does VSD work (e.g., pipeline, models)? What are the losses of Variational Score Distillation?
It first initializes n particles (3D parameter sets, i.e., NeRFs) that together approximate the text-conditioned 3D parameter distribution. At each step it randomly samples one particle, renders an image from a random camera pose as in DreamFusion, adds noise, and feeds the noisy image to two models: a frozen pre-trained Stable Diffusion and a trainable LoRA initialized from the same Stable Diffusion. There are two losses. The first is the MSE loss between the ground-truth added noise and the noise predicted by the LoRA; its backpropagation updates only the LoRA weights, so the LoRA learns the score of the current particle distribution. The second is the MSE loss between the noise predicted by the frozen Stable Diffusion and by the LoRA; since the frozen model's prediction approximates the score of the real image distribution, this loss aligns the particle distribution with the real distribution, and its gradient updates the particles (see the sketch below).
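The sketch below mirrors this two-loss structure in PyTorch. Everything here is a toy stand-in: `render`, `frozen_score`, and `lora_score` are placeholders for the NeRF renderer, the frozen Stable Diffusion U-Net, and its LoRA-augmented copy (which is additionally conditioned on camera pose and text in the actual method), and the noise schedule is simplified.

```python
import random
import torch
import torch.nn as nn

# Toy stand-ins (NOT the real pipeline); see the caveats above.
render = lambda theta, cam: theta                       # identity "renderer"
frozen_score = nn.Linear(64, 64).requires_grad_(False)  # frozen pre-trained net
lora_score = nn.Linear(64, 64)                          # trainable LoRA branch

particles = [nn.Parameter(torch.randn(64)) for _ in range(4)]  # n particles
opt_theta = torch.optim.Adam(particles, lr=1e-2)
opt_lora = torch.optim.Adam(lora_score.parameters(), lr=1e-3)

for step in range(100):
    theta = random.choice(particles)   # sample a particle (a "3D" parameter set)
    x0 = render(theta, cam=None)       # render from a random camera pose
    t = torch.rand(())                 # random timestep, simplified schedule
    eps = torch.randn_like(x0)
    xt = (1 - t).sqrt() * x0 + t.sqrt() * eps  # add noise to the rendering

    # Loss 1: standard diffusion loss on the *detached* image; updates only
    # the LoRA, so it tracks the score of the current particle distribution.
    lora_loss = (lora_score(xt.detach()) - eps).pow(2).mean()
    opt_lora.zero_grad(); lora_loss.backward(); opt_lora.step()

    # Loss 2: VSD particle update. The difference (frozen - LoRA) estimates
    # the direction pulling rendered images toward the real distribution.
    with torch.no_grad():
        grad = frozen_score(xt) - lora_score(xt)
    vsd_loss = (grad * xt).sum()  # surrogate: d(loss)/d(theta) = grad * d(xt)/d(theta)
    opt_theta.zero_grad(); vsd_loss.backward(); opt_theta.step()
```

Note that if the LoRA prediction is replaced by the ground-truth noise, the particle update reduces to the SDS gradient, which connects to the next question.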
- Why is VSD better than SDS?
SDS is a special case of VSD in which the distribution of 3D parameters is a Dirac delta function, which is related to the low diversity of SDS results. SDS also requires a very high classifier-free guidance (CFG) scale to produce high-fidelity images, and such high CFG scales cause over-saturation and over-smoothing; VSD works well at ordinary CFG scales.
- How does VSD with n=1 particle differ from SDS?
I think it is the same in terms of the 3D parameter distribution (i.e., a Dirac delta function), but VSD still learns a LoRA for score prediction, which is empirically shown to be beneficial.