[Machine Learning] Logistic Regression Model

Posted by Euisuk's Dev Log on March 7, 2025

Original post: https://velog.io/@euisuk-chung/머신러닝-로지스틱-회귀-모델-Logistic-Regression

  1. Overview

Hello! In this post I have put together an easy-to-follow summary of two key machine learning concepts: the logistic regression model and the odds ratio. I wanted to explain them properly to a friend, and realized I needed to organize them once more for myself first, haha ^^7

  • Logistic regression is a statistical method for predicting categorical outcomes, and it is used especially often for binary classification problems.
  • For example, it can be used to analyze how a particular eating habit affects health.

(Note) The images in this blog post are based on lecture materials by Professor Seoung Bum Kim (김성범) of the DMQA lab at Korea University.

💡 What is a binary classification task?

  • Binary classification is the task of assigning each element of a set to one of two groups (each called a class). Typical binary classification problems include:
    • EX-1. A medical test that determines whether or not a patient has a particular disease.
    • EX-2. In information retrieval, deciding whether or not a page belongs in the set of search results.

This post first explains the concept and interpretation of the odds ratio, and then explains how the logistic regression model makes use of it.

  • I hope it helps you understand how to apply odds ratios when analyzing experimental data. 😜
  2. The Odds Ratio and How to Interpret It

The odds ratio (OR) is a measure that compares how likely a particular event is to occur in two groups.

  • Let's walk through how odds ratios are commonly interpreted in research, using an example.

Example: analyzing the relationship between food consumption and obesity

Suppose a study analyzes the relationship between meat consumption and obesity.

  • The researchers split 200 participants (N=200) into two groups according to how much meat they eat.

    Group                 | Obese (A) | Not obese (B)
    ----------------------|-----------|--------------
    Low meat consumption  | 30        | 70
    High meat consumption | 50        | 50
  • From this data we can compute the odds ratio.

💡 Wait, what are odds?

  • The odds of an event are the probability that it occurs divided by the probability that it does not occur. (We look at this in more detail later.)

    $$\text{Odds} = \frac{\text{probability the event occurs}}{\text{probability the event does not occur}}$$

  1. Odds of obesity in the low-consumption group:
    • Treat obesity (A) as the event and take the ratio of the probability that obesity occurs to the probability that it does not.

$$\text{Odds}_{\text{Low Consume}} = \frac{30}{70} \approx 0.43$$

  2. Odds of obesity in the high-consumption group:
    • Treat obesity (A) as the event and take the ratio of the probability that obesity occurs to the probability that it does not.

$$\text{Odds}_{\text{High Consume}} = \frac{50}{50} = 1$$

  3. Odds ratio of the two groups:

$$OR = \frac{\text{Odds}_{\text{High Consume}}}{\text{Odds}_{\text{Low Consume}}} = \frac{1}{0.43} \approx 2.33$$

In other words, the odds of obesity are about 2.33 times higher for people who eat a lot of meat than for people who eat little. Note that this is a ratio of odds, not of probabilities: the probabilities of obesity in the two groups are 0.5 and 0.3.
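The arithmetic above can be reproduced in a few lines. This is a minimal sketch that uses only the counts from the table; the variable and function names are illustrative, and the risk ratio is added only to contrast it with the odds ratio.

```python
# Counts from the meat-consumption example (rows: group, columns: obese / not obese).
table = {
    "low": {"obese": 30, "not_obese": 70},
    "high": {"obese": 50, "not_obese": 50},
}


def odds(counts):
    """Odds of the event = (number with the event) / (number without the event)."""
    return counts["obese"] / counts["not_obese"]


odds_low = odds(table["low"])      # 30 / 70
odds_high = odds(table["high"])    # 50 / 50
odds_ratio = odds_high / odds_low  # comparison group over reference group

# Probability (risk) of obesity in each group, to contrast with the odds.
risk_low = 30 / (30 + 70)
risk_high = 50 / (50 + 50)
risk_ratio = risk_high / risk_low

print(f"odds_low={odds_low:.2f}, odds_high={odds_high:.2f}")
print(f"odds ratio={odds_ratio:.2f}, risk ratio={risk_ratio:.2f}")
```

The odds ratio (about 2.33) and the ratio of probabilities (about 1.67) differ, which is why the result should be read as a statement about odds.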

The odds ratio (OR)

The odds ratio (OR) is the ratio of two odds. Its interpretation depends on which odds go in the numerator and which go in the denominator.

📌 What the order of numerator and denominator means

  • The odds ratio is usually computed with the odds of the group being compared in the numerator and the odds of the reference group in the denominator.

$$OR = \frac{\text{Odds}_{\text{comparison group}}}{\text{Odds}_{\text{reference group}}}$$

  • OR > 1 : the odds of the event are higher in the comparison group than in the reference group
  • OR < 1 : the odds of the event are lower in the comparison group than in the reference group
  • OR = 1 : the odds of the event do not differ between the two groups

📌 Applying this to the current example

  • In the current example, the high-consumption group is the comparison group and the low-consumption group is the reference group.

$$OR = \frac{\text{Odds}_{\text{high meat consumption}}}{\text{Odds}_{\text{low meat consumption}}} = \frac{1.0}{0.43} \approx 2.33$$

In other words, the odds of obesity in the high-consumption group are about 2.33 times those in the low-consumption group.

🤔 What if we swap the numerator and the denominator?

  • If we instead take the low-consumption group as the comparison group and the high-consumption group as the reference group, the calculation becomes:

$$OR = \frac{\text{Odds}_{\text{low meat consumption}}}{\text{Odds}_{\text{high meat consumption}}} = \frac{0.43}{1.0} \approx 0.43$$

In this case, the odds of obesity in the low-consumption group are 0.43 times those in the high-consumption group (that is, lower).

  • The two odds ratios are reciprocals of each other and describe the same association, but the interpretation changes with the order of numerator and denominator, so it is important to choose the order that suits the research question.
    • It is usually most intuitive to put the group with the factor of interest (for example, the effect of a particular behavior) in the numerator.
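A quick check of the point about ordering: swapping the comparison and reference groups gives the reciprocal odds ratio, so the two log odds ratios differ only in sign. This is a small sketch using the example's counts; the variable names are illustrative.

```python
import math

odds_high = 50 / 50   # high-consumption group
odds_low = 30 / 70    # low-consumption group

or_high_vs_low = odds_high / odds_low   # high group as the comparison group
or_low_vs_high = odds_low / odds_high   # low group as the comparison group

# The two ratios are reciprocals, so their logarithms differ only in sign.
print(round(or_high_vs_low, 2), round(or_low_vs_high, 2))
print(math.isclose(or_high_vs_low * or_low_vs_high, 1.0))
print(math.isclose(math.log(or_high_vs_low), -math.log(or_low_vs_high)))
```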
  3. Why We Need the Logistic Regression Model

An ordinary linear regression model is useful when the dependent variable (Y) is continuous.

  • In practice, however, we often encounter binary variables that take only the values 0 and 1.

For example:

  • Whether person A develops a disease (and with what probability)
  • Whether person B buys a product (and with what probability)
  • Whether person C passes an exam (and with what probability)

When the outcome falls into two categories (0 or 1) like this, applying a linear regression model causes a problem: the predicted values can fall below 0 or rise above 1, so they cannot be read as probabilities.

  • To solve this, we use the logistic regression model, which uses the sigmoid function to map predictions into the range between 0 and 1.
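The out-of-range problem is easy to see on a toy dataset. The sketch below fits a straight line by ordinary least squares to made-up binary labels (the data are hypothetical and chosen only for illustration) and prints the smallest and largest fitted values.

```python
# Toy binary data (hypothetical): y flips from 0 to 1 halfway through x.
xs = list(range(10))
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# Ordinary least squares for a single feature: y ≈ intercept + slope * x.
n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n
slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
    (x - x_mean) ** 2 for x in xs
)
intercept = y_mean - slope * x_mean

predictions = [intercept + slope * x for x in xs]
print(f"min prediction = {min(predictions):.2f}")
print(f"max prediction = {max(predictions):.2f}")
```

On this data the fitted line dips below 0 at the smallest x and exceeds 1 at the largest x, so its outputs cannot be interpreted as probabilities.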

💡 What is the sigmoid function?

  • The sigmoid function is a nonlinear function with an S-shaped curve that maps any real number to a value between 0 and 1.
  • For a given input x it is defined as $\text{sigmoid}(x) = \frac{1}{1 + e^{-x}}$.
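As a small illustration, the definition translates directly into code. The two-branch form below is one common way to avoid floating-point overflow for very negative inputs; that implementation detail is an addition here, not something the original post discusses.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid 1 / (1 + e^{-x}), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, use the equivalent form e^x / (1 + e^x) so exp() never overflows.
    z = math.exp(x)
    return z / (1.0 + z)


for x in (-5.0, 0.0, 5.0):
    print(x, round(sigmoid(x), 4))
```

The outputs stay strictly inside the 0 to 1 range for moderate inputs and satisfy sigmoid(x) + sigmoid(-x) = 1.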
  4. Definition of the Logistic Regression Model

Logistic regression is a model that uses the sigmoid function to compute a probability for a given input ($X$).

  • It considers not just a single value ($x$) but several independent variables (features).

In general, the logistic regression model is defined by applying the sigmoid function to a linear regression expression to obtain a probability.

$$\pi(X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n)}}$$

Here:

  • $\beta_0$ (intercept, bias term)
  • $\beta_1, \beta_2, \dots, \beta_n$ (regression coefficients for the variables $X_1, X_2, \dots, X_n$)
  • $X_1, X_2, \dots, X_n$ (input variables)

In other words, logistic regression simply plugs a linear combination of the independent variables into the sigmoid function.

💡 For ease of explanation, the rest of this section uses a logistic regression model with a single input variable X.

It is expressed by the following formula.

$$\pi(X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X)}}$$

  • Here $\pi(X)$ is the probability that the outcome is 1 given the value of the variable.

📌 Odds:

  • The ratio of the probability that an event occurs to the probability that it does not occur

    $$\text{Odds} = \frac{\pi(X)}{1 - \pi(X)}$$

    • Substituting $\pi(X)$ and simplifying gives $\text{Odds} = e^{\beta_0 + \beta_1 X}$.
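The step from the definition of the odds to the exponential form can be written out explicitly. The following short derivation uses only the single-variable model above; the shorthand $z$ is introduced here for brevity and is not part of the original notation.

```latex
\begin{align*}
z &= \beta_0 + \beta_1 X, \qquad \pi(X) = \frac{1}{1 + e^{-z}} \\
1 - \pi(X) &= 1 - \frac{1}{1 + e^{-z}} = \frac{e^{-z}}{1 + e^{-z}} \\
\text{Odds} &= \frac{\pi(X)}{1 - \pi(X)} = \frac{1}{e^{-z}} = e^{z} = e^{\beta_0 + \beta_1 X}
\end{align*}
```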

📌 Logit transformation:

  • Taking the logarithm of the odds turns the relationship into a linear one

    $$\log(\text{Odds}) = \log\left(\frac{\pi(X)}{1 - \pi(X)}\right) = \log\left(e^{\beta_0 + \beta_1 X}\right)$$

    • By the properties of the logarithm, $\log(\text{Odds}) = \beta_0 + \beta_1 X$.

As a result, the logistic regression model has a structure similar to ordinary linear regression, while its output can still be interpreted as a probability.

  • In other words, the logistic regression model is ultimately a model that expresses log(Odds) as a linear function.
    • The regression coefficients we estimate ($\beta_0, \beta_1$) therefore describe the relationship with log(Odds), which lets us analyze how a change in a variable affects the odds. ( $\log(\text{Odds}) = \beta_0 + \beta_1 X$ )

❓ Hm? Then the coefficient of X, $\beta_1$, must have some hidden meaning, right?

  • Meaning of $\beta_1$: the change in log(odds) when x increases by one unit

  • Converting this to exponential form tells us by what factor the odds change when X increases by one unit.

    $$e^{\beta_1} = \frac{\text{odds when } X+1}{\text{odds when } X}$$

  • That is, if $\beta_1 = 0.5$, the odds are multiplied by $e^{0.5} \approx 1.65$ when X increases by $1$.
  • When there are several regression coefficients, each coefficient ($\beta_1, \beta_2, \dots, \beta_n$) represents the individual effect of its independent variable on the dependent variable: $\log(\text{Odds}) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n$
  • Here $\beta_i$ is the amount by which log(odds) changes when the variable $X_i$ increases by one unit, with the other variables held fixed.
    • If $\beta_2 = 0.7$, the odds are multiplied by $e^{0.7} \approx 2.01$ when $X_2$ increases by 1.
    • If instead $\beta_3 = -0.5$, the odds are multiplied by $e^{-0.5} \approx 0.61$ when $X_3$ increases by 1, a decrease of about 39%.
    • This lets us interpret the effect of each independent variable on the outcome individually.
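The coefficient-to-odds conversions quoted above can be checked directly. This is a small sketch using the example values from the text; the dictionary keys are just labels.

```python
import math

# Example coefficients taken from the text above.
coefficients = {"beta_1": 0.5, "beta_2": 0.7, "beta_3": -0.5}

for name, beta in coefficients.items():
    multiplier = math.exp(beta)              # odds are multiplied by e^beta per unit increase
    percent_change = (multiplier - 1) * 100  # the same effect as a percentage change in odds
    print(f"{name}={beta:+.1f}: odds x {multiplier:.2f} ({percent_change:+.0f}%)")
```

A multiplier above 1 raises the odds and a multiplier below 1 lowers them, which matches the sign of the coefficient.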

Equation. Logistic function, odds, and logit transformation (logistic regression model)

Item         | Sigmoid function                            | Logistic function                          | Logistic regression
-------------|---------------------------------------------|--------------------------------------------|-----------------------------------------------
Definition   | A function used as an activation function   | An S-shaped mathematical function          | A statistical model for binary classification
Formula      | $\frac{1}{1 + e^{-x}}$                      | $\frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}$ | $\log(\text{Odds}) = \beta_0 + \beta_1 X$
Main use     | Nonlinear transformation in neural networks | Probability modeling                       | Classification problems (e.g. spam / not spam)
Context      | Deep learning and activation functions      | Mathematical concept                       | Statistical / machine learning model
Relationship | Same formula as the logistic function       | Forms the basis of logistic regression     | Uses the sigmoid function

Table. Sigmoid function / logistic function / logistic regression

  5. How the Parameters of the Logistic Regression Model Are Estimated

5.1. The goal of logistic regression

  • Logistic regression is a linear model for solving binary classification problems.

    • The output $y$ is 0 or 1, and the conditional probability given the input data $x$ is expressed with the sigmoid function.

$$\pi(x_i) = P(y_i = 1 \mid x_i) = \frac{e^{\beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}}}{1 + e^{\beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}}}$$

That is, the model predicts the probability that $y_i = 1$ given the input $x_i$.
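To make the formula concrete, here is a minimal sketch that evaluates $\pi(x_i)$ for one observation with several features. The function name `predict_proba`, the coefficient values, and the input are all hypothetical and chosen only for illustration.

```python
import math


def predict_proba(x, beta0, betas):
    """P(y = 1 | x) for one observation under a logistic regression model."""
    z = beta0 + sum(b * xj for b, xj in zip(betas, x))  # linear predictor
    return 1.0 / (1.0 + math.exp(-z))                   # equals e^z / (1 + e^z)


# Hypothetical coefficients and input, chosen only for illustration.
beta0, betas = -1.0, [0.5, 0.7]
x_i = [2.0, 0.0]
p = predict_proba(x_i, beta0, betas)
print(round(p, 3))
```

With these numbers the linear predictor is exactly 0, so the predicted probability is 0.5, the midpoint of the sigmoid.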


5.2. Maximum likelihood estimation (MLE)

  • The goal of MLE is to find the model parameters $\beta$ under which the observed data are most likely.

For each data point $(x_i, y_i)$, the probabilities are defined as follows.

  • Probability that $y_i = 1$: $P(y_i = 1) = \pi(x_i)$
  • Probability that $y_i = 0$: $P(y_i = 0) = 1 - \pi(x_i)$

The likelihood function $L(\beta)$ for all $n$ data samples is the product of the individual probabilities (assuming the samples are independent).

$$L(\beta) = \prod_{i=1}^{n} \pi(x_i)^{y_i} (1 - \pi(x_i))^{1 - y_i}$$

The goal of MLE is to find the $\beta$ that maximizes this likelihood function $L(\beta)$.


5.3. The log-likelihood function

  • Because the likelihood function is a product, we take its logarithm to make the optimization easier.

$$\ln L(\beta) = \sum_{i=1}^{n} \left( y_i \ln \pi(x_i) + (1 - y_i) \ln(1 - \pi(x_i)) \right)$$

  • Maximizing this log-likelihood function ($\ln L$) yields the optimal $\beta$.
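The relationship between the likelihood and the log-likelihood can be seen on a handful of numbers. In this sketch the predicted probabilities and labels are made up for illustration; it computes the product form and the sum-of-logs form and checks that they agree.

```python
import math

# Hypothetical predicted probabilities pi(x_i) and observed labels y_i.
probs = [0.9, 0.2, 0.7, 0.4]
labels = [1, 0, 1, 0]

# Likelihood: product of pi^y * (1 - pi)^(1 - y) over all samples.
likelihood = math.prod(p**y * (1 - p) ** (1 - y) for p, y in zip(probs, labels))

# Log-likelihood: the same quantity written as a sum of logs.
log_likelihood = sum(
    y * math.log(p) + (1 - y) * math.log(1 - p) for p, y in zip(probs, labels)
)

print(round(likelihood, 4), round(log_likelihood, 4))
print(math.isclose(math.log(likelihood), log_likelihood))
```

Working with the sum is also numerically safer: with many samples the raw product of probabilities quickly underflows toward zero.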

❓ (Note) A closer look at the log-likelihood function

  • The log-likelihood function is the logarithm of the likelihood function.

$$\ln L = \sum_i \left( y_i \ln \pi(x_i) + (1 - y_i) \ln(1 - \pi(x_i)) \right)$$

  • Now substitute the sigmoid function $\pi(x_i)$.

$$\pi(x_i) = \frac{e^{\beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}}}{1 + e^{\beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}}}$$

  • Applying this to $\ln \pi(x_i)$ and $\ln(1 - \pi(x_i))$ gives:

$$\ln \pi(x_i) = \ln \left( \frac{e^{\beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}}}{1 + e^{\beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}}} \right) = (\beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}) - \ln(1 + e^{\beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}})$$

$$\ln(1 - \pi(x_i)) = \ln \left( \frac{1}{1 + e^{\beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}}} \right) = -\ln(1 + e^{\beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}})$$

  • Substituting these into the log-likelihood function:

$$\ln L = \sum_i y_i \left( (\beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}) - \ln(1 + e^{\beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}}) \right) + \sum_i (1 - y_i) \left( -\ln(1 + e^{\beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}}) \right)$$

  • Expanding:

$$\ln L = \sum_i y_i (\beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}) - \sum_i y_i \ln(1 + e^{\beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}}) - \sum_i (1 - y_i) \ln(1 + e^{\beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}})$$

  • Combining the second and third terms:

$$\ln L = \sum_i y_i (\beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}) - \sum_i \ln(1 + e^{\beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}})$$

  • The goal is to find the parameters $\beta$ that maximize this log-likelihood function.
  • The log-likelihood function is nonlinear in the parameters $\beta$, so unlike the linear regression model it has no explicit solution (this is expressed as "no closed-form solution exists").

We therefore obtain $\beta$ with an optimization approach, using the loss function introduced in section 5.4 below.
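The simplified form of the log-likelihood derived above can be checked numerically against the original form. This sketch uses a single feature and made-up data and parameter values, all of which are assumptions for illustration.

```python
import math

# Hypothetical single-feature data and parameter values.
xs = [0.5, 1.5, 2.0, 3.0]
ys = [0, 0, 1, 1]
beta0, beta1 = -2.0, 1.0


def pi(x):
    """Sigmoid probability e^z / (1 + e^z) with z = beta0 + beta1 * x."""
    z = beta0 + beta1 * x
    return math.exp(z) / (1.0 + math.exp(z))


# Original form: sum of y * ln(pi) + (1 - y) * ln(1 - pi).
original = sum(
    y * math.log(pi(x)) + (1 - y) * math.log(1 - pi(x)) for x, y in zip(xs, ys)
)

# Simplified form: sum of y * z - ln(1 + e^z).
simplified = sum(
    y * (beta0 + beta1 * x) - math.log(1.0 + math.exp(beta0 + beta1 * x))
    for x, y in zip(xs, ys)
)

print(math.isclose(original, simplified))
```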


5.4. The loss function (cost function) of logistic regression

  • In machine learning it is common to recast optimization problems as minimization problems.
    • To do so, we flip the sign of the log-likelihood function and define the negative log-likelihood (NLL).
    • In the optimization, instead of maximizing the likelihood we solve the equivalent problem of minimizing a loss.

$$J(\beta) = -\ln L(\beta)$$

📖 (Summary) In maximum likelihood estimation (MLE) the goal is to maximize $\ln(L)$, but in machine learning we usually optimize by minimizing a loss function.

  • For this we use the negative log-likelihood (NLL).

    $$\max_{\beta} \ln L(\beta) \quad \Rightarrow \quad \min_{\beta} -\ln L(\beta)$$

$$J(\beta) = - \sum_{i=1}^{n} \left( y_i \ln \pi(x_i) + (1 - y_i) \ln(1 - \pi(x_i)) \right)$$

Up to the constant factor $1/n$ (averaging over the samples), this expression is the binary cross-entropy (BCE) loss function, so the two have the same minimizer.

$$J_{\text{BCE}}(\beta) = \frac{1}{n} J(\beta) = -\frac{1}{n} \sum_{i=1}^{n} \left( y_i \ln \pi(x_i) + (1 - y_i) \ln(1 - \pi(x_i)) \right)$$

See also: Binary Cross-Entropy (BCE) loss function

In other words, the MLE problem for logistic regression ends up being the same as the problem of minimizing the cross-entropy loss.
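Since there is no closed-form solution, the loss has to be minimized iteratively. The post does not name a specific optimizer, so the sketch below assumes plain batch gradient descent on the mean BCE loss for a single feature; the toy data, learning rate, and iteration count are all hypothetical choices.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce(beta0, beta1, xs, ys):
    """Mean binary cross-entropy for a single-feature logistic model."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(beta0 + beta1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(xs)


# Hypothetical toy data; the classes overlap so the estimates stay finite.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 1, 0, 1, 1]

beta0, beta1 = 0.0, 0.0
learning_rate = 0.1  # assumed step size
initial_loss = bce(beta0, beta1, xs, ys)

for _ in range(5000):
    # Gradient of the mean BCE: average of (pi_i - y_i) times each input (1 for the intercept).
    errors = [sigmoid(beta0 + beta1 * x) - y for x, y in zip(xs, ys)]
    grad0 = sum(errors) / len(xs)
    grad1 = sum(e * x for e, x in zip(errors, xs)) / len(xs)
    beta0 -= learning_rate * grad0
    beta1 -= learning_rate * grad1

final_loss = bce(beta0, beta1, xs, ys)
print(f"loss: {initial_loss:.4f} -> {final_loss:.4f}")
print(f"beta0={beta0:.3f}, beta1={beta1:.3f}")
```

Statistical packages typically use faster second-order methods for this step, but the objective being minimized is the same negative log-likelihood.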


5.5. Interpretation from the argmax perspective

The goal of MLE is to find the $\beta$ that maximizes the log-likelihood $\ln L(\beta)$.

$$\hat{\beta} = \arg\max_{\beta} \ln L(\beta)$$

In machine learning, however, this is recast as the problem of minimizing a loss (cost) function.

$$\hat{\beta} = \arg\min_{\beta} J(\beta) = \arg\min_{\beta} -\ln L(\beta)$$

That is, maximizing the log-likelihood and minimizing the negative log-likelihood are equivalent problems.


5.6. Final summary

Concept                             | Goal                                                        | Expression
------------------------------------|-------------------------------------------------------------|-----------
Maximum likelihood estimation (MLE) | Find the $\beta$ that maximizes the likelihood              | $L(\beta) = \prod_{i=1}^{n} \pi(x_i)^{y_i} (1 - \pi(x_i))^{1 - y_i}$
Log-likelihood                      | Take the log of the likelihood and maximize it              | $\ln L = \sum_{i=1}^{n} \left( y_i \ln \pi(x_i) + (1 - y_i) \ln(1 - \pi(x_i)) \right)$
Negative log-likelihood (NLL)       | Flip the sign of the log-likelihood and minimize it         | $J(\beta) = -\ln L(\beta) = -\sum_{i=1}^{n} \left( y_i \ln \pi(x_i) + (1 - y_i) \ln(1 - \pi(x_i)) \right)$
Binary cross-entropy (BCE) loss     | The standard loss function optimized in logistic regression | $J_{\text{BCE}}(\beta) = -\frac{1}{n} \sum_{i=1}^{n} \left( y_i \ln \pi(x_i) + (1 - y_i) \ln(1 - \pi(x_i)) \right)$

In other words, maximizing the log-likelihood in MLE ends up being the same problem as minimizing the cross-entropy loss.

This is why binary cross-entropy (BCE) is commonly used as the loss function for logistic regression.

  6. Interpreting the Results of a Logistic Regression Model

This section explains how to read the results table produced after fitting a logistic regression model.

6.1 Estimated parameters (Coefficient)

In the results table, each parameter (coefficient, $\beta$) of the logistic regression model represents a change in the log-odds.

$$\log \left( \frac{\pi(x)}{1 - \pi(x)} \right) = \beta_0 + \beta_1 X_1 + \dots + \beta_p X_p$$

  • $\beta > 0$ : the probability of success increases as the variable increases
  • $\beta < 0$ : the probability of success decreases as the variable increases

So a positive regression coefficient means the probability of success rises as the variable increases, and a negative coefficient means it falls.

6.2 Standard error of the parameters (Standard Error)

The standard error (SE) of an estimated parameter indicates how precisely that parameter has been estimated.

  • Smaller Std. Error : the parameter estimate is more reliable
  • Larger Std. Error : the parameter estimate is less reliable

This value is used to compute the confidence interval (CI).

$$95\% \text{ CI} = \beta \pm 1.96 \times \text{Std. Error}$$
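As a worked example of the interval formula, the sketch below uses a made-up coefficient and standard error (both hypothetical). It also exponentiates the endpoints, which gives the corresponding interval for the odds ratio $e^{\beta}$.

```python
import math

# Hypothetical estimate and standard error, as they might appear in a results table.
beta_hat = 0.5
std_error = 0.2
z_crit = 1.96  # critical value for a 95% interval, as in the formula above

ci_low = beta_hat - z_crit * std_error
ci_high = beta_hat + z_crit * std_error

# Exponentiating the endpoints gives an interval for the odds ratio e^beta.
or_low, or_high = math.exp(ci_low), math.exp(ci_high)

print(f"95% CI for beta: ({ci_low:.3f}, {ci_high:.3f})")
print(f"95% CI for odds ratio: ({or_low:.2f}, {or_high:.2f})")
```

If the interval for $\beta$ excludes 0, the interval for the odds ratio excludes 1.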

6.3 p-value (statistical significance)

The p-value is used to judge whether a parameter has a statistically significant effect on the dependent variable.

  • p-value < 0.05 : the variable has a statistically significant effect on the dependent variable (at the 5% significance level).
  • p-value ≥ 0.05 : no statistically significant effect of the variable on the dependent variable is detected.

If the p-value is below 0.05, we conclude that the parameter has a statistically significant effect on the dependent variable.
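The post does not say how the p-value in the results table is computed. A common choice in statistical software is a Wald test, which compares $\beta / \text{SE}$ to a standard normal distribution. The sketch below assumes that approach, and the function name `wald_p_value` and the input values are hypothetical.

```python
import math


def wald_p_value(beta_hat: float, std_error: float) -> float:
    """Two-sided p-value for H0: beta = 0, using a normal approximation (assumed Wald test)."""
    z = beta_hat / std_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


p = wald_p_value(0.5, 0.2)  # hypothetical estimate and standard error
print(f"z = {0.5 / 0.2:.2f}, p = {p:.4f}")
print("significant at 0.05" if p < 0.05 else "not significant at 0.05")
```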

6.4 Odds Ratio

In a logistic regression model, the odds ratio indicates by what factor the odds of success (dependent variable $Y=1$) change when a variable increases by 1.

  • The regression coefficient $\beta$ we obtain is the change in the log-odds, and exponentiating it as $e^{\beta}$ gives the odds ratio.

$$\text{Odds Ratio} = e^{\beta}$$

📖 Interpreting the odds ratio

  • Odds Ratio > 1 : the odds of success (and therefore the probability) increase as the variable increases.
    • Example: if $\text{Odds Ratio} = 1.5$, the odds of success are multiplied by 1.5 when the variable increases by 1.
  • Odds Ratio = 1 : the variable has no effect on the probability of success.
  • Odds Ratio < 1 : the odds of success (and therefore the probability) decrease as the variable increases.
    • Example: if $\text{Odds Ratio} = 0.5$, the odds of success are halved (reduced by 50%) when the variable increases by 1.
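An odds ratio multiplies the odds, not the probability, and the resulting change in probability depends on the starting point. This is a short sketch of that conversion; the helper name `apply_odds_ratio` and the baseline probabilities are illustrative assumptions.

```python
def apply_odds_ratio(p: float, odds_ratio: float) -> float:
    """Probability after multiplying the odds p / (1 - p) by `odds_ratio`."""
    new_odds = odds_ratio * p / (1.0 - p)
    return new_odds / (1.0 + new_odds)


for baseline in (0.1, 0.5, 0.9):
    updated = apply_odds_ratio(baseline, 1.5)
    print(f"p={baseline:.1f} -> {updated:.3f} (probability ratio {updated / baseline:.2f})")
```

For example, with an odds ratio of 1.5 a baseline probability of 0.5 becomes 0.6 rather than 0.75.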
  7. Conclusion

The logistic regression model is a useful tool for predicting categorical data, and the odds ratio gives a clear way to analyze the relationships between variables.

  • Researchers and experimenters can use it to interpret experimental data more intuitively and to draw meaningful conclusions.

Next time you analyze experimental data, I recommend giving the odds ratio a try!

Good luck 💌


