📚 Study 114

[Deep Learning and Design] Unsupervised Learning Basics

This post summarizes the following lecture: https://www.youtube.com/watch?v=V9HcvXliJmw&list=PLQASD18hjBgyLqK3PgXZSp5FHmME7elWS&index=6
# Basic Probability
Unlike supervised learning, unsupervised learning relies heavily on concepts from probability, so it is worth reviewing them before moving on.
- (Conditional probability, e.g.) $ p(x_1,x_2,x_3) = p(x_1|x_2,x_3) \, p(x_2|x_3) \, p(x_3) $
- (Law of total probability) $ p(y) = \sum_{x} p(x,y) = \sum_{x} p(y|x)p(x) $
Applying this to continuous data gives (Marginal..
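The law of total probability above can be checked numerically on a small discrete example. The sketch below is only an illustration: the joint table `joint` and the helper names are made up for this example and do not come from the lecture.

```python
from fractions import Fraction

# Hypothetical joint distribution p(x, y) over x in {0, 1} and y in {"a", "b"}.
joint = {
    (0, "a"): Fraction(1, 8),
    (0, "b"): Fraction(3, 8),
    (1, "a"): Fraction(1, 4),
    (1, "b"): Fraction(1, 4),
}


def marginal_x(joint, x):
    """p(x) = sum_y p(x, y)."""
    return sum(p for (xx, _), p in joint.items() if xx == x)


def marginal_y(joint, y):
    """p(y) = sum_x p(x, y)."""
    return sum(p for (_, yy), p in joint.items() if yy == y)


def conditional_y_given_x(joint, y, x):
    """p(y | x) = p(x, y) / p(x)."""
    return joint[(x, y)] / marginal_x(joint, x)


def total_probability_y(joint, y):
    """p(y) = sum_x p(y | x) p(x), the second form of the law."""
    xs = {x for (x, _) in joint}
    return sum(conditional_y_given_x(joint, y, x) * marginal_x(joint, x) for x in xs)


if __name__ == "__main__":
    for y in ("a", "b"):
        print(y, marginal_y(joint, y), total_probability_y(joint, y))
```

Both routes give the same value for each `y`, which is exactly what the two sums in the formula state. Exact fractions are used so the comparison does not depend on floating-point rounding.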

📚 Study/AI 2024.07.10

[cs231n] Variational Autoencoders (VAE)

In earlier models such as PixelCNN, the probability model was a tractable function, whereas in a VAE (Variational Autoencoder) the probability model is defined by an intractable function. The goal is therefore to derive a lower bound and bring it into a tractable form. Before diving into VAEs, let's look at the encoder and decoder that make up an autoencoder. An autoencoder is a method for learning a lower-dimensional feature $z$ from input data $x$. $z$ has a lower dimension than $x$ because it keeps only the 'core information' of the original input. In other words, we want the encoder to remove the parts of the input data that are considered noise. To summarize..

📚 Study/AI 2024.07.09

[Algorithm] Greedy Algorithms

# 1. Greedy: choosing only what looks best right now
Greedy algorithm: a greedy algorithm is 'a method that picks only what looks best in the current situation.' At every step it chooses the option that looks best, without considering how the current choice will affect the future. This is why 'justification analysis' matters for greedy algorithms: you have to check whether simply picking the current best actually yields an optimal solution! In coding tests, greedy problems are 'a problem type you have a good chance of solving even without having memorized anything in advance.' By contrast, the algorithm types studied later, such as sorting and shortest paths, often require knowing exactly how to use the algorithm beforehand. Also, because greedy is an algorithm that selects the best option according to some criterion, ..
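The point about justification analysis can be made concrete with coin change, a common greedy exercise. This is a minimal sketch: the coin sets, the amounts, and the function names `greedy_change` and `optimal_change` are assumptions chosen for illustration, not taken from the post. The greedy rule is compared against a small dynamic-programming reference to show when picking the largest coin first is optimal and when it is not.

```python
def greedy_change(amount, coins):
    """Count coins by always taking as many of the largest coin as possible."""
    count = 0
    for coin in sorted(coins, reverse=True):
        used, amount = divmod(amount, coin)
        count += used
    if amount != 0:
        raise ValueError("amount cannot be formed with these coins")
    return count


def optimal_change(amount, coins):
    """Reference answer via dynamic programming (minimum number of coins)."""
    inf = float("inf")
    best = [0] + [inf] * amount
    for value in range(1, amount + 1):
        for coin in coins:
            if coin <= value and best[value - coin] + 1 < best[value]:
                best[value] = best[value - coin] + 1
    return best[amount]


if __name__ == "__main__":
    # Each coin is a multiple of the next smaller one: greedy is optimal here.
    print(greedy_change(1260, [500, 100, 50, 10]), optimal_change(1260, [500, 100, 50, 10]))
    # No such structure: greedy picks 4 + 1 + 1, but 3 + 3 is better.
    print(greedy_change(6, [4, 3, 1]), optimal_change(6, [4, 3, 1]))
```

In the first coin set every larger coin is a multiple of the smaller ones, which is the kind of property a justification argument would rely on. The second set lacks it, so the locally best choice is no longer globally best.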

[Paper Review] Mini-Splatting: Representing Scenes with a Constrained Number of Gaussians

This paper was posted to arXiv in March 2024. I read it because it points out the limitations of earlier attempts to reduce the number or dimensionality of Gaussians in 3DGS (LightGaussian, Compact3DGS, etc.) and proposes a new idea, which I found interesting.
Mini-Splatting: Representing Scenes with a Constrained Number of Gaussians
In this study, we explore the challenge of efficiently representing scenes with a constrained number of Gaussians. Our analysis shifts from traditional graphics and 2D computer vision to t..

Why multiply by the transpose when computing the covariance matrix in 3DGS?

This is a question that came up while I was reading the equations in the 3DGS paper. First, the covariance matrix in world coordinates is written in terms of (1) a scaling matrix $S$ and (2) a rotation matrix $R$ as $$\Sigma = RSS^{T}R^{T}$$ In addition, the covariance matrix in image coordinates is expressed using (1) the viewing transform $W$ from world coordinates to camera coordinates and (2) the Jacobian $J$ of the affine approximation of the projective transformation from camera coordinates to image coordinates, as $$\Sigma^{'} = JW \Sigma W^{T}J^{T}$$ Looking at these two equations, why is the transpose..
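One standard way to read the transposes in both formulas is as the rule for how a covariance changes under a linear map. This is a sketch of that textbook argument, not necessarily the one the full post goes on to make. Write the covariance as $\Sigma$, let $x$ be a random vector with mean $\mu$, and let $A$ be any matrix ($A$ is a placeholder symbol introduced here):

$$
\begin{aligned}
\mathrm{Cov}(Ax) &= \mathbb{E}\left[(Ax - A\mu)(Ax - A\mu)^{T}\right] \\
&= A\,\mathbb{E}\left[(x-\mu)(x-\mu)^{T}\right]A^{T} \\
&= A \Sigma A^{T}
\end{aligned}
$$

Taking $A = RS$ and an identity starting covariance gives $RSS^{T}R^{T}$, and taking $A = JW$ gives $JW \Sigma W^{T}J^{T}$. The sandwich form also keeps the result symmetric and positive semi-definite, since $v^{T}A \Sigma A^{T}v = (A^{T}v)^{T}\Sigma(A^{T}v) \ge 0$ for every vector $v$.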

What does "heuristic" mean in 3DGS?

While reading the RadSplat paper, I came across the claim that one limitation of 3DGS is that its heuristics make it hard to optimize. 3DGS, however, suffers from a challenging optimization landscape and an unbounded model size. The number of Gaussian primitives is not known a priori, and carefully-tuned merging, splitting, and pruning heuristics are required to achieve satisfactory results. The brittleness of these heuristics becomes particularly evident in..

A brief explanation of NeRF, with a bit of code

[Paper Review] NeRF : Representing Scenes as Neural Radiance Fields for View Synthesis (ECCV2020)
Until now I had only understood the NeRF model at the level of reading many blogs and YouTube materials, but after reading the paper closely my understanding feels much deeper. Let's make it completely my own by writing it up myself! The following
dusruddl2.tistory.com
↑ I had already written the paper review above, but I started to wonder whether I really understand the NeRF model properly, which is why I wrote this post. I plan to give a brief review focused on the parts that confused me.
nerf/tiny_nerf.ipynb at master · bmild/nerf
Code release for NeRF (Neural Radiance Fields..

[Paper Review] 3D Gaussian Splatting for Real-Time Radiance Field Rendering (SIGGRAPH 2023)

If you are studying 3DGS for the first time, I recommend starting with xoft's blog and YouTube lecture. They cover the whole algorithm in an accessible way, so it is easy to follow :) I expect this post to be helpful for readers who want to go through the paper in order. (I referred heavily to xoft's posts.) This post was written with limited knowledge, so it may contain mistakes. Please feel free to point them out!
1. Introduction
NeRF-based models built on MLPs render too slowly to be practical in real applications. With the 3DGS proposed in this paper, it became possible to (1) keep training time as fast as the previous methods, (2) maintain quality, and (3) greatly improve rendering speed. (real-time, high-qual..

Why does the 3DGS tile rasterizer instantiate Gaussians according to the number of tiles they overlap?

While reading the tile rasterizer part of 3DGS, one passage made me wonder why it works that way. We then instantiate each Gaussian according to the number of tiles they overlap and assign each instance a key that combines view space depth and tile ID. Why are Gaussians instantiated according to the number of tiles they overlap? I think this is done to reflect how much each Gaussian influences the image. For example, suppose an image is divided into 16x16 tiles as in the paper. Considering the 2D Gaussians projected onto the image, when a Gaussian overlaps many tiles -..
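The quoted sentence can be sketched in a few lines to see what instantiating per tile produces in practice. Everything below is an assumption for illustration: tiles are taken to be 16x16 pixels, each projected Gaussian is reduced to a center, a pixel radius, and a depth, and the key is a plain `(tile_id, depth)` tuple rather than whatever packed key the actual implementation uses.

```python
TILE = 16  # assumed tile size in pixels


def overlapped_tiles(cx, cy, radius, width, height, tile=TILE):
    """Return IDs of tiles touched by the bounding square of a projected Gaussian."""
    tiles_x = (width + tile - 1) // tile
    tiles_y = (height + tile - 1) // tile
    x0 = max(0, int((cx - radius) // tile))
    x1 = min(tiles_x - 1, int((cx + radius) // tile))
    y0 = max(0, int((cy - radius) // tile))
    y1 = min(tiles_y - 1, int((cy + radius) // tile))
    return [ty * tiles_x + tx for ty in range(y0, y1 + 1) for tx in range(x0, x1 + 1)]


def build_instances(gaussians, width, height):
    """Duplicate each Gaussian once per overlapped tile and sort by (tile_id, depth)."""
    instances = []
    for gid, (cx, cy, radius, depth) in enumerate(gaussians):
        for tile_id in overlapped_tiles(cx, cy, radius, width, height):
            instances.append(((tile_id, depth), gid))
    instances.sort()  # one global sort groups by tile, then orders front to back
    return instances


def per_tile_lists(instances):
    """Split the sorted instance list into one depth-ordered list per tile."""
    tiles = {}
    for (tile_id, _depth), gid in instances:
        tiles.setdefault(tile_id, []).append(gid)
    return tiles


if __name__ == "__main__":
    # Hypothetical projected Gaussians: (center_x, center_y, radius, depth).
    gaussians = [(8, 8, 4, 5.0), (16, 8, 6, 2.0)]
    print(per_tile_lists(build_instances(gaussians, width=64, height=32)))
```

Under these assumptions, a Gaussian that overlaps several tiles appears once in each tile's list, and a single sort over the combined keys leaves every tile with its own depth-ordered list. This is one possible reading of why the key combines tile ID and depth, offered alongside the interpretation given in the post.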