홍석쓰 블로그

Background: To understand the equivariance of spherical harmonics under rotation. A spherical harmonic is a function defined on the surface of a sphere that assigns a value to each point on the sphere. Mathematically, if we denote the sphere by S^2, then a function f defined on the sphere is written as f: S^2 → \mathbb{C} (or \mathbb{R} if the function is real-valued). This means that for every point on the sp..

Research (연구 관련) 2024. 7. 1. 15:49

What is Equivariance in Computer Vision?

What is SO(3)? SO(3) is the special orthogonal group: the set of 3x3 matrices that transform a 3D point without changing the distance between any two points (isometry), are invertible, and have determinant +1. Each such matrix is a proper rotation matrix. An isometry with determinant -1 is an improper rotation matrix, i.e. a reflection combined with a rotation. A proper rotation matrix with determinant 1, denoted by R(nˆ, θ), rep..
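To make the proper/improper distinction concrete, here is a minimal sketch in plain Python. It checks the two conditions from the definition above: R^T R = I (distances are preserved) and the sign of the determinant. The helper names (`classify_isometry`, `rotation_z`) and the tolerance are illustrative assumptions, not something from the original post.

```python
import math

def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose3(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def det3(m):
    """Determinant by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def classify_isometry(m, tol=1e-9):
    """Hypothetical helper: label a 3x3 matrix using the definition above."""
    rtr = matmul3(transpose3(m), m)
    orthogonal = all(abs(rtr[i][j] - (1.0 if i == j else 0.0)) <= tol
                     for i in range(3) for j in range(3))
    if not orthogonal:
        return "not orthogonal"
    # An orthogonal matrix has determinant +1 or -1 (up to rounding).
    return "proper rotation" if det3(m) > 0 else "improper rotation"

def rotation_z(theta):
    """Rotation by theta radians about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

rot = rotation_z(math.pi / 3)                       # element of SO(3)
mirror = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]  # reflection in the xy plane
scaled = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # changes distances

print(classify_isometry(rot), "|", classify_isometry(mirror), "|", classify_isometry(scaled))
```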

Research (연구 관련) 2024. 6. 28. 12:44

What is JAX linen (nn) Module?

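06/20/2024 Like PyTorch's nn.Module, FLAX, JAX's neural network library, also has an nn.Module. Notes on what I learned while trying to map it one-to-one onto PyTorch's nn.Module: __init__ -> setup, forward -> __call__ (this one is up to you, but it is what you do if you want to call the instance as the forward pass, PyTorch style). I was confused that setup is a method that takes no other arguments, but that is because nn.Module assumes Python 3.7 dataclasses and is used through subclassing. In other words, even what is declared as a class variable becomes an instance variable, so without writing an __init__ function..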
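The dataclass behaviour described above can be illustrated with the standard library alone. This is a minimal sketch, not Flax itself: `TinyDense` is a hypothetical class, and calling `setup` from `__post_init__` is a simplification (Flax decides on its own when to run `setup`). The point is only that annotated class-level fields become constructor arguments and instance attributes, so `setup` needs no parameters.

```python
from dataclasses import dataclass

@dataclass
class TinyDense:
    # Annotated at class level; the dataclass machinery generates __init__
    # and stores these as instance attributes.
    features: int
    use_bias: bool = True

    def __post_init__(self):
        # Simplification for this sketch: run setup right after construction.
        self.setup()

    def setup(self):
        # No arguments needed: hyperparameters are already on self.
        self.weights = [1.0] * self.features
        self.bias = 0.5 if self.use_bias else 0.0

    def __call__(self, x):
        # Plays the role of PyTorch's forward.
        return sum(w * v for w, v in zip(self.weights, x)) + self.bias

layer = TinyDense(features=3)
output = layer([1.0, 2.0, 3.0])
print(layer.features, layer.use_bias, output)
```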

Research (연구 관련) 2024. 6. 20. 19:19

What is VLM?

What is VLM (Vision Language Model)? A VLM is a model with a multi-modal architecture that learns to associate information from the image and text modalities. The focus of multi-modal learning is to pre-train a model on vision and language tasks and improve performance on downstream tasks such as VQA (Visual Question Answering). Why VLM? What are the Use Cases? 1. Image Search and Retrieval / 2. Rob..

Research (연구 관련) 2024. 5. 9. 07:00

What is Variational Score Distillation?

- What is Variational Score Distillation (VSD)? It is a generalized version of Score Distillation Sampling (SDS) that learns the distribution of 3D parameters (like NeRF) using pre-trained 2D text-to-image diffusion models. It learns the distribution via particle-based variational inference, which tries to learn the text-conditioned distribution by updating a finite set of samples (..

Research (연구 관련) 2024. 4. 1. 04:24

What is index building?

What is similarity search? Similarity search is used in data retrieval. Given a query (e.g. text or an image), find the K most similar data points in the database. How can a vector representation be used? A vector representation (e.g. CNN features, CLIP latent embeddings) can be used in similarity search and classification. The modern AI-based high-dimensional vectors are known to be powerful and fl..
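As a reference point, the top-K query described above can be sketched as an exhaustive scan in plain Python. The cosine metric, the toy 2D vectors, and the function names are assumptions for illustration; an index structure exists precisely to avoid scanning every vector like this.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, database, k):
    """Return indices of the k database vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), idx)
              for idx, vec in enumerate(database)]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # most similar first
    return [idx for _, idx in scored[:k]]

# Toy "embeddings"; real ones (e.g. CLIP) have hundreds of dimensions.
database = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
nearest = top_k([1.0, 0.05], database, k=2)
print(nearest)
```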

Research (연구 관련) 2024. 3. 30. 05:57

What is offset noise?

https://www.youtube.com/watch?v=cVxQmbf3q7Q How does adding a random mean relate to changing the low-frequency content of noised images? https://isamu-website.medium.com/understanding-common-diffusion-noise-schedules-and-sample-steps-are-flawed-and-offset-noise-52a73ab4fded Understanding “Common Diffusion Noise Schedules and Sample Steps are Flawed” and Offset Noise This blog post is inspired by the GitHub us..

Research (연구 관련) 2024. 3. 29. 01:40

What is rigid align?

How do you obtain the rotation matrix? What is reflection of a rotation matrix? Is a reflection matrix a rotation matrix? https://medium.com/machine-learning-world/linear-algebra-points-matching-with-svd-in-3d-space-2553173e8fed Linear Algebra. Points matching with SVD in 3D space Problem medium.com https://igl.ethz.ch/projects/ARAP/svd_rot.pdf https://www.quora.com/What-is-the-relationship-bet..

Research (연구 관련) 2024. 3. 29. 00:20

What is DDPM and DDIM?

- What are forward diffusion and reverse diffusion? In variational diffusion models, forward diffusion is an encoding process that gradually corrupts an image into pure noise (a standard Gaussian), mapping image space to latent space. Forward diffusion involves no learnable parameters; it is a fixed Markov chain defined as a linear Gaussian model at each time..
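The fixed, parameter-free nature of the forward process can be seen in a small sketch. It uses the usual DDPM step x_t = sqrt(1 - β_t) x_{t-1} + sqrt(β_t) ε with ε drawn from a standard Gaussian. The linear β schedule (1e-4 to 0.02 over 1000 steps), the three-element "image", and the function names are assumptions chosen for illustration.

```python
import math
import random

def forward_step(x_prev, beta, noise):
    """One linear Gaussian step: shrink the signal, add scaled noise."""
    keep = math.sqrt(1.0 - beta)
    spread = math.sqrt(beta)
    return [keep * x + spread * e for x, e in zip(x_prev, noise)]

def forward_chain(x0, betas, rng):
    """Run the whole Markov chain; there is nothing to learn here."""
    x = list(x0)
    for beta in betas:
        noise = [rng.gauss(0.0, 1.0) for _ in x]
        x = forward_step(x, beta, noise)
    return x

# Assumed linear schedule with 1000 steps.
betas = [1e-4 + (0.02 - 1e-4) * t / 999 for t in range(1000)]

# Fraction of the original signal that survives to the last step.
signal_kept = math.prod(math.sqrt(1.0 - b) for b in betas)

x_T = forward_chain([1.0, -1.0, 0.5], betas, random.Random(0))
print(signal_kept, x_T)
```

With this schedule almost none of the original signal survives, which is why the end of the chain can be treated as a standard Gaussian sample.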

Research (연구 관련) 2024. 3. 27. 06:12

Laplacian Smoothing / GraphCNN

While reviewing Laplacian Smoothing, I noticed that the Laplacian matrix defined in graph theory (math) seems to differ from the Laplacian operation defined in computer vision and computer graphics. In graph theory it is Degree matrix - Adjacency matrix, i.e. a difference operation that acts as a high-pass filter, whereas in computer vision/graphics, where the technique and term Laplacian Smoothing are used, it is Adjacency matrix - Degree matrix. Since it is the second derivative of the input matrix, one can obtain fo..
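The sign difference can be checked on a small example. This is a minimal sketch in plain Python on a 5-node path graph with a single spike as the signal; the graph, the step size 0.25, and the function names are assumptions for illustration.

```python
def laplacians(adjacency):
    """Return (D - A, A - D) for an adjacency matrix given as nested lists."""
    n = len(adjacency)
    degree = [sum(row) for row in adjacency]
    graph_l = [[(degree[i] if i == j else 0) - adjacency[i][j]
                for j in range(n)] for i in range(n)]
    smooth_l = [[-v for v in row] for row in graph_l]
    return graph_l, smooth_l

def matvec(matrix, x):
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]

# Path graph 0 - 1 - 2 - 3 - 4
path = [[0, 1, 0, 0, 0],
        [1, 0, 1, 0, 0],
        [0, 1, 0, 1, 0],
        [0, 0, 1, 0, 1],
        [0, 0, 0, 1, 0]]
signal = [0.0, 0.0, 1.0, 0.0, 0.0]  # a single spike at node 2

graph_l, smooth_l = laplacians(path)

# D - A emphasizes how much each node differs from its neighbors.
high_pass = matvec(graph_l, signal)

# One smoothing step with A - D: move each value toward its neighbors.
step = 0.25
smoothed = [x + step * d for x, d in zip(signal, matvec(smooth_l, signal))]

print(high_pass)
print(smoothed)
```

The two operators differ only in sign, so adding a multiple of (A - D)x is the same as subtracting the same multiple of (D - A)x.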

Research (연구 관련) 2024. 3. 4. 13:29
