[CV Notes] Lecture 17 - 3D Vision

Posted by Euisuk's Dev Log on August 14, 2024

Original post: https://velog.io/@euisuk-chung/Notes-Computer-Vision-Lecture-17

The following is a summary of, and my notes on, the lecture below. If you spot any mistakes, please leave a comment 🙌

1. 3D Vision Overview

  • Extending from 2D to 3D:
    • Earlier lectures covered tasks that recognize and localize objects in images (e.g., object detection, semantic segmentation, instance segmentation).

  • The real world is 3D, not 2D, so it matters that neural network models incorporate 3D spatial information and understand 3D structure.
  • This lecture explores how to bring 3D into models that so far processed 2D data, so that they can reason about 3D structure.

2. Two Main Tasks in 3D Problems

  • Before the detailed discussion, here is a brief look at the two main tasks in 3D problems.

  • Predicting 3D shape from a single image (left):

    • Takes a single RGB image as input and predicts the 3D shape of the object in that image.
    • The input is still a 2D image, but the output is a 3D representation of the object.
  • Tasks that take 3D data as input (right):

    • Takes 3D data as input and performs classification or segmentation.
    • Here the model works on 3D data directly, so the question is how to feed that data to a neural network effectively.

3. Five 3D Shape Representations

  • "Modeling" here means designing and constructing a way to represent 3D shape and 3D information.

    • More concretely, in computer graphics and computer vision it refers to describing an object's 3D structure in a mathematical or data-driven form.
    • Many kinds of representations are used for this; the lecture introduces the five that people use most often in practice.
  • Depth Map:

    • Concept: A depth map is a simple 3D representation that assigns to each pixel the distance between the camera and the object seen at that pixel.
    • Representation: Where a conventional RGB image is a 2D grid of color values, a depth map is a 2D grid of distance values. Combining the two gives an RGB image with added depth information (RGB-D).

    • Advantage: Depth maps are a type of 3D data that can be captured directly by various 3D sensors (e.g., Microsoft Kinect, the iPhone's Face ID).
    • Drawback: A depth map cannot capture the structure of occluded objects, so it does not fully represent the scene. This is why it is called "2.5D" rather than full 3D.
    • Depth map prediction: A neural network can predict a depth map from an RGB image, for example with a Fully Convolutional Network (FCN) architecture.

💡 Scale-depth ambiguity

  • Scale-depth ambiguity is a problem in 3D vision: from a single 2D image, an object's true size and its distance cannot be told apart.
  • In other words, a single image alone cannot distinguish a large object far away from a small object close by.

  • For example, imagine a cat that is twice as large and twice as far away (as in the slide's illustration). The two cats differ in size and distance, yet they appear the same size in the 2D image, so they are hard to tell apart. Because of this ambiguity, it is hard for a neural network to predict the absolute size and distance of objects accurately.
  • To address this, models are trained with a special loss called a scale-invariant loss function, which focuses on relative size and depth relationships rather than absolute scale. This reduces errors caused purely by scale differences and helps the model produce more consistent 3D predictions.
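The notes do not say which scale-invariant formula is meant, so the sketch below assumes one well-known form: the log-depth error of Eigen et al. (2014), with d_i = log(pred_i) - log(target_i) and loss = mean(d^2) - lam * mean(d)^2. The function name, the `lam` parameter, and the toy depth values are illustrative choices, not taken from the lecture.

```python
import math

def scale_invariant_loss(pred, target, lam=1.0):
    """Scale-invariant log-depth error over flat lists of positive depths.

    Assumed form: mean(d^2) - lam * mean(d)^2 with d = log(pred) - log(target).
    With lam = 1.0, multiplying every prediction by a constant leaves the loss unchanged.
    """
    d = [math.log(p) - math.log(t) for p, t in zip(pred, target)]
    n = len(d)
    return sum(x * x for x in d) / n - lam * (sum(d) ** 2) / (n * n)

target = [1.0, 2.0, 4.0]                 # toy ground-truth depths
pred_scaled = [2.0 * t for t in target]  # right structure, everything twice as far
pred_wrong = [1.0, 4.0, 2.0]             # wrong relative depths

loss_scaled = scale_invariant_loss(pred_scaled, target)
loss_wrong = scale_invariant_loss(pred_wrong, target)
print(loss_scaled, loss_wrong)
```

The globally rescaled prediction is not penalized, while the prediction with the wrong relative ordering is. That matches the "twice as large, twice as far" cat example.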

  • Surface Normal Map:

    • Concept: Assigns to each pixel a vector giving the orientation of the object surface at that pixel.
    • Representation: The direction of the normal vector is encoded as an RGB color, so the color at each pixel shows which way the surface faces.
    • Use: Surface normal maps help capture fine surface structure, and combining a surface normal map with a depth map allows a more accurate estimate of an object's 3D structure.
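One way to picture the color encoding of normals is sketched below. It assumes the common graphics convention of mapping each unit-vector component from [-1, 1] to [0, 255]. The notes do not state which mapping the lecture's figures use, and `normal_to_rgb` is a hypothetical helper.

```python
import math

def normal_to_rgb(normal):
    """Map a surface normal (nx, ny, nz) to an 8-bit RGB triple.

    Assumed convention: normalize the vector, then send each component
    c in [-1, 1] to round((c * 0.5 + 0.5) * 255).
    """
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    unit = (nx / length, ny / length, nz / length)
    return tuple(round((c * 0.5 + 0.5) * 255) for c in unit)

facing_camera = normal_to_rgb((0.0, 0.0, 1.0))  # surface facing along +z
facing_right = normal_to_rgb((1.0, 0.0, 0.0))   # surface facing along +x
print(facing_camera, facing_right)
```

Under this convention, surfaces with the same orientation get the same color regardless of how far away they are.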


  • Voxel Grid:

    • Concept: Divides 3D space into a regular grid and stores, for each cell, a binary value indicating whether the cell is occupied.

    • Characteristics: Easy to picture as a block-based world like Minecraft; it is the 3D extension of a 2D pixel grid.

    • Voxel prediction: 3D convolutional neural networks (3D CNNs) can be used to classify or recognize objects given as voxel grids, and also to predict a voxel grid from an image as follows.

      1. 2D input image: The input is a 3-channel RGB image (size: 3xHxW).
      2. 2D CNN: A 2D convolutional neural network (CNN) processes the image and produces a 2D feature map of size CxHxW.
      3. 3D CNN: Starting from the 2D CNN's feature map, a 3D CNN produces a 3D feature map (C'xDxHxW).
      4. Voxel grid output: Finally, the 3D CNN outputs a 4D tensor (size: 1xVxVxV) holding the predicted occupancy probability of each voxel.
      5. Training: The network is trained with a per-voxel cross-entropy loss that compares each voxel's predicted occupancy probability with the ground truth.
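The training step can be sketched as a per-voxel binary cross-entropy in plain Python, averaged over the grid. The nested-list layout, the `eps` clamp, and the toy 2x2x2 grids are assumptions for illustration. A real pipeline would typically use a deep learning framework's built-in loss on tensors.

```python
import math

def per_voxel_bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy over a V x V x V grid stored as nested lists.

    pred holds occupancy probabilities in [0, 1]; target holds 0/1 labels.
    """
    total, count = 0.0, 0
    for pred_plane, target_plane in zip(pred, target):
        for pred_row, target_row in zip(pred_plane, target_plane):
            for p, t in zip(pred_row, target_row):
                p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
                total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
                count += 1
    return total / count

# Toy 2x2x2 ground truth: 1 = occupied, 0 = empty.
target = [[[1, 0], [0, 0]], [[1, 1], [0, 0]]]
# A prediction that is unsure everywhere, and one that is mostly right.
pred_uniform = [[[0.5 for _ in row] for row in plane] for plane in target]
pred_good = [[[0.9 if t else 0.1 for t in row] for row in plane] for plane in target]

loss_uniform = per_voxel_bce(pred_uniform, target)
loss_good = per_voxel_bce(pred_good, target)
print(loss_uniform, loss_good)
```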

        👉 The approach above uses a 3D CNN to extract features over the whole 3D volume and compute voxel occupancy probabilities. However, 3D CNNs are computationally expensive compared with 2D CNNs, for the following reasons:

        1. Cubic growth: A 3D convolution filter slides over a 3D volume, so the computational cost grows cubically with resolution and is much higher than for a 2D CNN.
        2. Memory usage: A 3D CNN works on 3D feature maps, so memory usage is also very high. Producing a high-resolution voxel grid in particular needs a lot of memory.

        👉 The second approach, Voxel Tubes, was designed to reduce this cost:

        1. Efficiency: The voxel grid is predicted with a 2D CNN only, without any 3D CNN. A 2D CNN needs less computation and memory, so this is much more efficient.
        2. Simple interpretation: The voxel grid is predicted directly at the last layer, and the output at each pixel can be read as a "tube" of occupancy probabilities along the Z axis.

      1. 2D CNN: As before, a 2D CNN processes the image and produces a feature map.
      2. 3D feature extraction: Instead of a 3D CNN, the last convolutional layer of the 2D CNN predicts voxel occupancy probabilities in the form of "tubes". The channel dimension plays the role of one axis of the voxel grid (the Z axis).
      3. Interpreting the tubes: The output of the last layer is interpreted as a VxVxV voxel grid, where the V channel values at one pixel are the occupancy probabilities of the voxels along Z at that location.
      4. Training: As before, the network is trained with a per-voxel cross-entropy loss.
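The "tube" reading of the last layer comes down to an indexing convention, sketched below. The nested list stands in for a V-channel output of spatial size VxV. The encoded values and the helper name `tube_at` are hypothetical, chosen only so each entry reveals its (z, y, x) position.

```python
def tube_at(output, y, x):
    """Return the V values along Z at pixel (y, x).

    output is a nested list shaped [V channels][V rows][V cols], standing in
    for the last 2D-conv layer's output; channel index = Z index.
    """
    return [channel[y][x] for channel in output]

V = 3
# Hypothetical scores that encode their own position as z*100 + y*10 + x.
output = [[[z * 100 + y * 10 + x for x in range(V)]
           for y in range(V)]
          for z in range(V)]

tube = tube_at(output, y=1, x=2)
print(tube)  # one value per Z slice at that pixel
```

No 3D convolution is involved: the network stays 2D, and only the interpretation of the channel axis changes.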
    • Drawback: Representing a voxel grid at high resolution requires a lot of memory. For example, just storing a 1024×1024×1024 voxel grid takes about 4GB of memory (at 4 bytes per voxel).

    • Addressing the efficiency problem: Techniques such as multi-resolution voxel grids and octrees can reduce the memory used by voxels.
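The 4GB figure can be checked with a few lines of arithmetic. The sketch assumes a dense grid with 4 bytes per voxel (for example one 32-bit float), and the helper name is illustrative.

```python
def dense_voxel_bytes(resolution, bytes_per_voxel=4):
    """Bytes needed to store a dense resolution^3 grid (assumes 4 bytes per voxel)."""
    return resolution ** 3 * bytes_per_voxel

for v in (32, 128, 1024):
    gib = dense_voxel_bytes(v) / 2 ** 30
    print(f"{v}^3 grid: {gib:.6f} GiB")
```

Doubling the resolution multiplies the storage by eight, which is why sparse or hierarchical structures such as octrees become attractive at high resolution.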

  • Implicit Surface:

    • Concept: Represents the 3D shape as a function that gives, for any 3D coordinate, the probability that the point is inside the object.

    • Representation: A function that takes a coordinate in 3D space as input and outputs whether that point is inside or outside the object; the surface is then the boundary between the two. This function can be learned by a neural network.

      An implicit function decides whether an arbitrary point in 3D space lies inside or outside a given object.

      ✔️ The function takes a coordinate (x, y, z) in 3D space and returns 1 if the point is inside the object and 0 if it is outside.

      ✔️ As a result, the surface of the 3D object is defined as the set of points where o(x) = 1/2.

    • Use: Unlike voxels, which describe the shape only at fixed grid locations, an implicit surface can be evaluated at any point in continuous 3D space.
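To make the o(x) = 1/2 level set concrete, here is a hand-written occupancy function for a sphere. In the setting described above the function would be a trained neural network. The sigmoid form, the `sharpness` parameter, and the sphere itself are stand-ins chosen for illustration.

```python
import math

def sphere_occupancy(point, radius=1.0, sharpness=10.0):
    """Soft occupancy o(x) in (0, 1) for a sphere centered at the origin.

    Close to 1 deep inside, close to 0 far outside, and exactly 0.5 where
    the distance from the origin equals the radius (the surface).
    """
    x, y, z = point
    dist = math.sqrt(x * x + y * y + z * z)
    return 1.0 / (1.0 + math.exp(-sharpness * (radius - dist)))

inside = sphere_occupancy((0.0, 0.0, 0.0))
on_surface = sphere_occupancy((1.0, 0.0, 0.0))
outside = sphere_occupancy((2.0, 0.0, 0.0))
print(inside, on_surface, outside)
```

Because the function accepts any coordinate, the shape can be sampled at whatever resolution is needed, without storing a grid.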


  • Point Cloud:

    • Concept: One of the standard ways to represent 3D shape: the object's surface is represented by many points in 3D space. Each point has an (x, y, z) position, and the set of points together forms the overall shape of the object.

    • Characteristics: A point cloud is more adaptive than a voxel grid, because the density of points can be varied to capture detail.
    • Advantages:

      • Fine structure: With enough points, fine parts of an object (e.g., the tip of an airplane wing) can be represented accurately.
      • Adaptivity: The point density can be adjusted to the situation. For example, more points can be placed where detail is needed and fewer where detail matters less.
    • Drawbacks:

      • No explicit surface: A point cloud is only a set of points, so it does not directly represent the object's surface. Using it for rendering or other applications requires post-processing, for example extracting a triangle mesh from the points.
      • New architectures and loss functions needed: Processing point clouds effectively requires new neural network architectures and loss functions.

        💡 PointNet

        PointNet is a neural network architecture designed to process point cloud data effectively, and it is an important tool for analyzing and using point clouds across many applications.

      💡 Loss Function

      => Chamfer Distance is a loss function used to compare how similar two point clouds are.

      • A point cloud is a set of points representing an object's surface in 3D space, so comparing two point clouds requires a way to measure how close the points of the two sets are to each other.
      • Chamfer Distance is the standard way to make this comparison. The formula shown on the slide can be read as follows:

        ▶️ (First term) For each point x in set S1, compute the L2 (Euclidean) distance to its nearest point y in S2. Sum these minimum distances over all points of S1.

        ▶️ (Second term) For each point y in set S2, compute the L2 distance to its nearest point x in S1, and sum these as well.
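The two terms described above can be sketched in a few lines of plain Python. This brute-force version uses the plain Euclidean distance, following the wording of the notes. Many formulations square the distances instead, and practical implementations use batched tensor operations or nearest-neighbor structures. The point sets are toy values.

```python
import math

def chamfer_distance(s1, s2):
    """Sum of nearest-neighbor distances from s1 to s2 plus from s2 to s1."""
    def one_way(src, dst):
        # For each point in src, distance to its closest point in dst.
        return sum(min(math.dist(p, q) for q in dst) for p in src)
    return one_way(s1, s2) + one_way(s2, s1)

s1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
s2 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]

cd_same = chamfer_distance(s1, s1)
cd_diff = chamfer_distance(s1, s2)
print(cd_same, cd_diff)
```

The sets may have different sizes and no fixed ordering, which is what makes this kind of loss suitable for point clouds.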

    • Example use: The LiDAR sensor on a self-driving car produces point cloud data, which is then used to perceive the driving environment.


  • Mesh Representation

    • Concept: A mesh represents an object in 3D space as a set of vertices, edges, and faces.

      • Vertices: Points in space that define the corners of the polygons.
      • Edges: Line segments that connect vertices.
      • Faces: Flat surfaces bounded by edges. In most models, faces are triangles or quadrilaterals.
      • Triangle meshes are the most commonly used kind; they define the object's surface explicitly and are easy to visualize.
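A triangle mesh is often stored as a list of vertex positions plus a list of faces that index into it, with edges derived from the faces. The sketch below uses a tetrahedron as a toy example. This layout is a common convention, and `edges_from_faces` is an illustrative helper.

```python
# Four vertices of a tetrahedron, each an (x, y, z) position.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
# Each face is a triple of indices into the vertex list.
faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]

def edges_from_faces(faces):
    """Collect the unique undirected edges implied by triangular faces."""
    edges = set()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            edges.add((min(u, v), max(u, v)))  # store each edge once
    return sorted(edges)

edges = edges_from_faces(faces)
print(len(vertices), len(edges), len(faces))
```

The edge list is exactly the graph structure over vertices, which is why mesh-processing networks are usually built from graph operations and not from regular-grid convolutions.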

    • Characteristics: Widely used in graphics; because the surface is represented explicitly, information such as texture or color can be attached easily.
    • Advantages:
      • Accurate surface representation: A mesh represents the object's surface explicitly, which makes it very useful for graphics and rendering.
      • Texture mapping: Texture, color, or other data can easily be attached to the surface through the mesh structure and then visualized.
    • Drawbacks:
      • Complexity: Processing meshes with neural networks is complicated. In particular, a mesh is a graph structure, so it is hard to process with a conventional convolutional neural network (CNN).
    • TASK: Predicting Meshes (Pixel2Mesh)

      • The Pixel2Mesh architecture consists of two main parts, the Mesh Deformation Block and the Graph Unpooling Layer:
        1. The Mesh Deformation Block learns features that deform the mesh toward the desired 3D shape.
        2. The Graph Unpooling Layer refines the mesh so that a higher-resolution 3D result can be learned.
      • Below is what I found in the paper and summarized on this topic.


4. Evaluating 3D Shapes

  1. Intersection over Union (IoU):
    • Description: IoU measures overlap: the volume where two 3D shapes intersect, divided by the volume of their union.
    • Pros and cons: It does not capture fine structures well, and low values are not very meaningful.

  2. Chamfer Distance (CD):

    • Description: Chamfer Distance is measured as the average minimum distance between two 3D point clouds. For each point of one shape, the distance to the nearest point of the other shape is computed, and these distances are averaged.
    • Pros and cons: It can be optimized directly, but it is very sensitive to outliers.

  3. F1 Score:

    • Description: The F1 Score combines precision and recall, and by adjusting its distance threshold it can capture detail at different scales.
    • Pros and cons: It is robust to outliers, but capturing detail at different scales requires evaluating at different thresholds.
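Two of these metrics are sketched below: IoU on voxel shapes stored as sets of occupied cell indices, and F1 on point sets at a distance threshold. The F1 part assumes the usual definition: precision is the fraction of predicted points within the threshold of some ground-truth point, and recall is the reverse. Function names and toy data are illustrative.

```python
import math

def voxel_iou(pred, target):
    """IoU of two shapes given as sets of occupied (x, y, z) voxel indices."""
    union = pred | target
    if not union:
        return 1.0  # two empty shapes are treated as identical
    return len(pred & target) / len(union)

def f1_score(pred_points, gt_points, threshold):
    """F1 at a distance threshold between two non-empty point sets."""
    def fraction_within(src, dst):
        hits = sum(1 for p in src
                   if min(math.dist(p, q) for q in dst) <= threshold)
        return hits / len(src)
    precision = fraction_within(pred_points, gt_points)
    recall = fraction_within(gt_points, pred_points)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

iou = voxel_iou({(0, 0, 0), (1, 0, 0), (2, 0, 0)},
                {(1, 0, 0), (2, 0, 0), (3, 0, 0)})

gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0)]  # one far outlier
f1_tight = f1_score(pred, gt, threshold=0.1)
f1_loose = f1_score(pred, gt, threshold=1.0)
print(iou, f1_tight, f1_loose)
```

The outlier at x = 10 costs the same however far away it is, which illustrates the robustness noted above, and the score changes with the threshold, which is why results are usually reported at several thresholds.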

Understanding the concepts that come later requires a grounding in 3D vision, so I took this "3D Vision" lecture and wrote up these notes.

  • The lecture explores various ways to understand and represent 3D structure from 2D images, their limitations, and several techniques for overcoming them.
  • Each representation has strengths and weaknesses depending on the task, and each offers a different approach to processing and predicting complex 3D data.

Thanks for reading 🤗


