Time Series Forecasting with Prophet 🔭

Posted by Euisuk's Dev Log on July 12, 2024

์›๋ณธ ๊ฒŒ์‹œ๊ธ€: https://velog.io/@euisuk-chung/Prophet์„-ํ™œ์šฉํ•œ-์‹œ๊ณ„์—ด-์˜ˆ์ธก

Introduction to Prophet

Facebook Prophet is an open-source library developed at Facebook for forecasting time series data. Prophet models seasonality automatically and is robust to outliers, which makes it especially useful for business forecasting (e.g., sales, user counts, web traffic).

Advantages of Prophet

Meta (Facebook) describes Prophet's advantages as follows:

1. Accurate and Fast

  • Accuracy: Prophet is used in many applications across Facebook to produce reliable forecasts, which helps with planning and goal setting.
  • Speed: In most cases it performs better than other approaches. Models are fit in Stan, so a forecast is available in just a few seconds.

2. Fully Automatic

  • Automatic forecasting: Even on messy data, you get a reasonable forecast with no manual effort.
  • Robustness: It is robust to outliers, missing data, and dramatic changes in the time series.

3. Tunable Forecasts

  • User adjustability: The procedure offers many ways for users to tweak and adjust forecasts.
  • Interpretable parameters: Human-interpretable parameters let you improve a forecast by adding domain knowledge.

4. Available in R or Python

  • Language support: Prophet is available in both R and Python, and both share the same underlying Stan code.
  • Flexibility: Use whichever language you are comfortable with to produce forecasts.

Key Features

  1. Trend: supports linear and nonlinear trend modeling.

    Prophet can model the long-term trend in the data. It uses a piecewise linear trend by default, with changepoints detected automatically, and the trend model can also be set manually. Prophet supports two main trend models:

    • Linear trend: used when the data grows or declines at a roughly constant rate.
    • Nonlinear trend: used when the rate of change varies over time, for example logistic growth that saturates at a capacity.

      from prophet import Prophet

      # The default settings use a linear trend model
      model = Prophet()  # equivalent to Prophet(growth='linear')

      # The trend model can also be chosen explicitly
      model = Prophet(growth='linear')  # linear trend model
      # Nonlinear (logistic) trend; the data passed to fit() must then include a 'cap' column
      model = Prophet(growth='logistic')
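
To get a feel for what a saturating (logistic) trend looks like, the sketch below evaluates the basic logistic curve g(t) = C / (1 + exp(-k(t - m))). It is a simplified form without changepoints, not Prophet's internal implementation. The function name `logistic_trend` and the parameter values are assumptions made for this example.

```python
import math

def logistic_trend(t, cap, k, m):
    """Saturating growth curve: approaches `cap` as t grows."""
    return cap / (1.0 + math.exp(-k * (t - m)))

# Hypothetical parameters chosen only for illustration
cap, k, m = 100.0, 0.5, 10.0
values = [logistic_trend(t, cap, k, m) for t in range(0, 31, 5)]
for t, v in zip(range(0, 31, 5), values):
    print(f"t={t:2d}  trend={v:6.2f}")
```

The curve passes through half of the capacity at t = m and flattens out below `cap`, which is why Prophet asks for a capacity column when growth='logistic' is used.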
      
  2. Seasonality: models periodic variation (daily, weekly, yearly).

    Prophet can model periodic variation in the data, that is, patterns that repeat on a weekly, monthly, or yearly cycle. Yearly seasonality (yearly_seasonality) and weekly seasonality (weekly_seasonality) are built in. Both default to 'auto', so Prophet enables them when the data covers enough history, and they can be switched on or off as needed. You can also add custom seasonalities based on domain knowledge.

    Built-in seasonalities

    • Yearly seasonality (yearly_seasonality): captures the yearly cycle in the data.
    • Weekly seasonality (weekly_seasonality): captures the weekly cycle in the data.

    from prophet import Prophet

    # Enable the built-in yearly seasonality
    model = Prophet(yearly_seasonality=True)

    # Enable the built-in weekly seasonality
    model = Prophet(weekly_seasonality=True)

    # Disable yearly seasonality
    model = Prophet(yearly_seasonality=False)

    # Disable weekly seasonality
    model = Prophet(weekly_seasonality=False)

    # Add a custom seasonality (call this before fit)
    model.add_seasonality(name='monthly', period=30.5, fourier_order=5)
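
Prophet represents each seasonality as a Fourier series, so fourier_order controls how many sine/cosine pairs are fitted. The sketch below computes those terms for a single time point using the same period and order as the 'monthly' seasonality above. The helper `fourier_features` is a hypothetical name for this example, not part of Prophet's API.

```python
import math

def fourier_features(t_days, period, order):
    """Return [sin, cos, sin, cos, ...] terms for one time point (in days)."""
    features = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t_days / period
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features

# Same settings as the 'monthly' seasonality above
row = fourier_features(t_days=10.0, period=30.5, order=5)
print(len(row))  # 2 * order columns per time point
```

A higher order lets the seasonal curve change more quickly within one period, at the cost of a greater risk of overfitting.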
    
  3. Holiday effects: accounts for the effect of public holidays and special events.

    Prophet can account for the impact that public holidays and special events have on a time series. To do so, you supply a list of holidays and add it to the model.

    from prophet import Prophet
    from prophet.make_holidays import make_holidays_df

    # Build the list of public holidays
    holidays = make_holidays_df(year_list=[2015, 2016, 2017, 2018, 2019], country='US')

    # Pass the holidays when initializing the model
    model = Prophet(holidays=holidays)
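
Special events that are not public holidays can be described in the same way, as a DataFrame with `holiday` and `ds` columns and optional `lower_window` / `upper_window` columns. The sketch below builds such a frame; the event name and dates are made up for illustration.

```python
import pandas as pd

# Hypothetical event dates, used only for illustration
promo_days = pd.DataFrame({
    'holiday': 'promo',
    'ds': pd.to_datetime(['2019-03-15', '2019-09-20']),
    'lower_window': 0,   # days before the date that are also affected
    'upper_window': 1,   # days after the date that are also affected
})
print(promo_days)
```

A frame shaped like this can then be passed as Prophet(holidays=promo_days).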
    
  4. Missing values: handles missing values automatically and copes with irregularly spaced time series.

    Prophet handles missing values in y automatically and also works well on irregularly spaced time series. There is no need to impute them beforehand; you can feed the data to the Prophet model as it is.

  5. Outliers: modeling that is robust to outliers.

    Prophet is robust to outliers, so the model still behaves sensibly on data that contains them. If an outlier does need to be removed, you can do so manually by setting it to a missing value.

    # Data containing an outlier:
    # set the outlier on a specific date to a missing value so it is ignored during fitting
    train_data.loc[train_data['ds'] == '2017-07-01', 'y'] = None
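
The same idea extends to a whole stretch of bad readings. The sketch below masks a date window on a small synthetic series (the values and dates are invented for this example) and keeps the rows, so the timestamps stay in the frame while y becomes missing.

```python
import pandas as pd

# Small synthetic series with an obviously bad stretch (illustrative values)
df = pd.DataFrame({
    'ds': pd.date_range('2017-06-28', periods=6, freq='D'),
    'y': [100.0, 102.0, 101.0, 950.0, 980.0, 103.0],
})

# Mask every row in the affected window instead of dropping it
bad = (df['ds'] >= '2017-07-01') & (df['ds'] <= '2017-07-02')
df.loc[bad, 'y'] = None

print(df)
print(df['y'].isna().sum())
```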
    

Hands-on Code

Practice Data

Kaggle: Panama Electricity Load Forecasting

This dataset is a time series for forecasting Panama's electricity load (MW). The goal is to predict the load using weather variables and special days (public holidays, school holidays, and so on). It is available on Kaggle, and in this post we use it to run a time series forecast.

Dataset Overview

  • Goal: forecast Panama's electricity load (MW)
  • Reference variables: weather variables and special days
  • Features: 15 features in total

Dataset Composition

  1. Features: 15 in total
    • 12 continuous numeric variables: weather variables
    • 3 variables related to special days: holiday, holiday ID, school holiday
  2. No missing values: the dataset contains no missing values
  3. Target: Panama's electricity load, in the 'nat_demand' column

Column Descriptions

  • ds: date and time (datetime); the column is named datetime in the raw files
  • T2M_toc, QV2M_toc, TQL_toc, W2M_toc: weather variables for one region (toc): temperature, humidity, precipitation, wind speed
  • T2M_san, QV2M_san, TQL_san, W2M_san: weather variables for a second region (san)
  • T2M_dav, QV2M_dav, TQL_dav, W2M_dav: weather variables for a third region (dav)
  • Holiday_ID: holiday ID
  • holiday: whether the day is a public holiday (True/False)
  • school: whether the day is a school holiday (True/False)
  • nat_demand: Panama's electricity load (MW)

This dataset can be used to predict Panama's electricity load (nat_demand) from the remaining variables. By forecasting load with a range of weather variables and special days taken into account, a utility can manage and optimize its power supply efficiently. Let's look at the y series in the train and test sets.

[Figure: Train y]

[Figure: Test y]

Data Preparation

Prophet requires the input data to contain a ds (datetime) column and a y (value) column.

import pandas as pd

# Load the data
train_data = pd.read_csv('./data/panama/train.csv')
test_data = pd.read_csv('./data/panama/test.csv')

# Rename the columns to the names Prophet expects
train_data.rename(columns={'datetime': 'ds', 'nat_demand': 'y'}, inplace=True)
test_data.rename(columns={'datetime': 'ds', 'nat_demand': 'y'}, inplace=True)

# Convert the date strings to datetime
train_data['ds'] = pd.to_datetime(train_data['ds'], format='%d-%m-%Y %H:%M')
test_data['ds'] = pd.to_datetime(test_data['ds'], format='%d-%m-%Y %H:%M')

Explanation:

  • pd.read_csv: loads a CSV file into a DataFrame.
  • rename(columns=...): renames the columns to match the format Prophet requires.
  • pd.to_datetime: converts strings to datetime objects.
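
The explicit format string matters because the dates are written day first. pandas format strings use the same directives as Python's strptime, so the effect can be sketched with the standard library alone. The timestamp below is an illustrative string, not a row taken from the dataset.

```python
from datetime import datetime

FMT = '%d-%m-%Y %H:%M'  # day-month-year, as used in the code above

# Illustrative timestamp string, not taken from the dataset
parsed = datetime.strptime('03-01-2020 05:00', FMT)
print(parsed.isoformat())

# Reading the same string month-first would silently give a different date
swapped = datetime.strptime('03-01-2020 05:00', '%m-%d-%Y %H:%M')
print(swapped.isoformat())
```

Both calls succeed, which is exactly why an ambiguous date such as 03-01 is worth pinning down with an explicit format.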

Model Training and Forecasting

Model Initialization and Training

from prophet import Prophet

# Initialize the model
model = Prophet()

# Register the additional (exogenous) variables
regressors = ['T2M_toc', 'QV2M_toc', 'TQL_toc', 'W2M_toc', 
              'T2M_san', 'QV2M_san', 'TQL_san', 'W2M_san', 
              'T2M_dav', 'QV2M_dav', 'TQL_dav', 'W2M_dav', 
              'Holiday_ID', 'holiday', 'school']

for regressor in regressors:
    model.add_regressor(regressor)

# Fit the model
model.fit(train_data)

Explanation:

  • Prophet(): initializes a Prophet model.
  • add_regressor: adds an external variable to the model.
  • fit: trains the model on the given data.

Notes:

  • When you add external variables, the same variables must be present in both the training data and the prediction data.
  • External variables must be numeric. A categorical variable has to be converted to numeric form first, for example with one-hot encoding.
  • Check the time format and interval of the data and keep them consistent.
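
As one way to apply the one-hot encoding note, the sketch below expands an ID-like column into 0/1 indicator columns with pandas. The tiny frame and the `hid` prefix are assumptions made for this example; the post itself passes Holiday_ID to the model as a single numeric column.

```python
import pandas as pd

# Tiny stand-in for the real data; the values are made up for illustration
df = pd.DataFrame({
    'ds': pd.date_range('2020-01-01', periods=4, freq='D'),
    'Holiday_ID': [0, 1, 0, 2],
})

# One indicator column per category, stored as 0/1 integers
encoded = pd.get_dummies(df, columns=['Holiday_ID'], prefix='hid', dtype=int)
hid_columns = [c for c in encoded.columns if c.startswith('hid_')]

print(hid_columns)
print(encoded)
```

Each name in `hid_columns` could then be registered with model.add_regressor. The training and prediction frames need the same set of indicator columns, so it is safest to encode them together or to reindex one against the other.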

Running the Forecast

# Build the future dataframe for prediction (same period as the test data)
future = test_data[['ds'] + regressors]

# Run the forecast
forecast = model.predict(future)

# Check the forecast: one row per test timestamp
print(len(forecast), len(test_data)) # 744 744

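Explanation:

  • test_data[['ds'] + regressors]: builds the dataframe to predict on from the test timestamps and the regressor columns. make_future_dataframe is not used here, because the regressor values for the forecast period have to be supplied as well.
  • predict: runs the forecast.
  • print: prints the number of forecast rows and test rows to confirm that they match.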
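
Because every registered regressor has to be present in the frame passed to predict, a quick check before predicting can save a confusing error later. The helper below is a hypothetical convenience function, not part of Prophet, and the sample frame is invented to trigger both kinds of problem.

```python
import pandas as pd

def check_future_frame(future, regressors):
    """Return a list of problems found in a frame meant for prediction."""
    problems = []
    missing = [c for c in ['ds'] + list(regressors) if c not in future.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
    present = [c for c in regressors if c in future.columns]
    with_nan = [c for c in present if future[c].isna().any()]
    if with_nan:
        problems.append(f"NaN values in: {with_nan}")
    return problems

# Made-up frame: 'school' is absent and 'holiday' has a gap
sample = pd.DataFrame({
    'ds': pd.date_range('2020-01-01', periods=3, freq='D'),
    'T2M_toc': [25.1, 24.8, 24.5],
    'holiday': [0, None, 0],
})
problems = check_future_frame(sample, ['T2M_toc', 'holiday', 'school'])
print(problems)
```

An empty list means the frame has a ds column plus a complete, gap-free column for each regressor.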

Visualization

Prophet provides both Plotly-based and Matplotlib-based plotting functions.

1. Visualizing the forecast (with Plotly)

from prophet.plot import plot_plotly, plot_components_plotly

# Plot the forecast (Plotly)
fig_forecast = plot_plotly(model, forecast)
fig_forecast.show()

# Plot the components (Plotly)
fig_components = plot_components_plotly(model, forecast)
fig_components.show()

Explanation:

  • plot_plotly: plots the forecast with Plotly. The result is interactive, so you can zoom in with the mouse or inspect a specific interval in detail.
  • plot_components_plotly: plots the forecast components (trend, seasonality, holiday effects, and so on) with Plotly, which makes it easy to see how each component affects the time series.

2. Visualizing the forecast (with Matplotlib)

import matplotlib.pyplot as plt  # needed for plt.show() below

# Plot the forecast (Matplotlib)
fig1 = model.plot(forecast)
plt.show()

# Plot the components (Matplotlib)
fig2 = model.plot_components(forecast)
plt.show()

Explanation:

  • plot: plots the forecast with Matplotlib. By default it shows the observed data, the predicted values, and the uncertainty interval.
  • plot_components: plots the forecast components with Matplotlib. Showing the trend, seasonality, holiday effects, and other components separately lets you analyze visually what drives the variation in the data.

Possible Issues

⚠️ However, you may run into two problems here!

(1) First, if you use JupyterLab and run Plotly there, the plotting code may execute without displaying any output.

Here are a few ways to resolve this:

  1. Install the JupyterLab extension:

    Using Plotly in JupyterLab may require an additional extension. Try installing it with the following command:

    jupyter labextension install jupyterlab-plotly

  2. Set the renderer:

    Add the following code at the top of the notebook to set the renderer explicitly:

    import plotly.io as pio
    pio.renderers.default = "jupyterlab"

  3. Use inline mode:

    Enable inline mode with plotly.offline.init_notebook_mode():

    import plotly.offline as pyo
    pyo.init_notebook_mode(connected=True)

  4. Display the figure explicitly:

    Use display(fig) instead of fig.show():

    from IPython.display import display
    display(fig)

  5. Restart JupyterLab:

    If the changes do not take effect, shut JupyterLab down completely and start it again.

(2) Second, Prophet's plot function visualizes the entire forecast period by default, so the plot covers both the training data and the forecast period. As a result, the graph is very likely to be hard to read.

If you only want to visualize the test period, you can restrict the plot to that period as shown below.

  • Adjusting the visible range with xlim lets you view only the test results.
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

# Keep only the test period
forecast_test_period = forecast[forecast['ds'].isin(test_data['ds'])]

# Plot the full forecast
fig = model.plot(forecast)
# Limit the x-axis to the test period
plt.xlim([test_data['ds'].min(), test_data['ds'].max()])
# Add the actual values of the test data
plt.plot(test_data['ds'], test_data['y'], 'r.', label='Actual')
# Show the plot
plt.show()

# Plot the components (plot_components draws with Matplotlib, not Plotly)
fig_components = model.plot_components(forecast_test_period)
plt.show()

Explanation:

  • warnings.filterwarnings('ignore'): suppresses warning messages.
  • forecast[forecast['ds'].isin(test_data['ds'])]: filters the forecast down to the rows in the test period.
  • model.plot(forecast): plots the Prophet forecast. The full forecast is shown.
  • plt.xlim([test_data['ds'].min(), test_data['ds'].max()]): limits the x-axis to the test period.
  • plt.plot(test_data['ds'], test_data['y'], 'r.', label='Actual'): adds the actual test values to the plot.
  • plt.show(): displays the plot.
  • model.plot_components(forecast_test_period): plots the model components (trend, seasonality, holiday effects, and so on) with Matplotlib.

  • Alternatively, you can simply build the plot yourself.
import matplotlib.pyplot as plt

# Merge the actual and predicted values
result = test_data.copy()
result['yhat'] = forecast.set_index('ds').loc[test_data['ds']]['yhat'].values

# Plot the forecast
plt.figure(figsize=(20, 4))
plt.plot(result['ds'], result['y'], label='Actual')
plt.plot(result['ds'], result['yhat'], label='Predicted', linestyle='dashed')
plt.fill_between(forecast['ds'], forecast['yhat_lower'], forecast['yhat_upper'], color='gray', alpha=0.2)
plt.legend()
plt.show()

Explanation:

  • test_data.copy(): copies the test data to create the result dataframe.
  • forecast.set_index('ds').loc[test_data['ds']]['yhat'].values: extracts the predicted values for the test period from the forecast.
  • result['yhat'] = ...: adds the extracted predictions to the result dataframe.
  • plt.figure(figsize=(20, 4)): sets the figure size.
  • plt.plot(result['ds'], result['y'], label='Actual'): adds the actual test values to the plot.
  • plt.plot(result['ds'], result['yhat'], label='Predicted', linestyle='dashed'): adds the predicted values to the plot.
  • plt.fill_between(forecast['ds'], forecast['yhat_lower'], forecast['yhat_upper'], color='gray', alpha=0.2): shades the uncertainty interval of the forecast in gray.
  • plt.legend(): adds a legend.
  • plt.show(): displays the plot.
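
The shaded band can also be checked numerically by counting how often the actual value falls inside it. The sketch below does this on made-up numbers that stand in for result['y'], forecast['yhat_lower'], and forecast['yhat_upper']; the helper name `interval_coverage` is an assumption for this example.

```python
def interval_coverage(actual, lower, upper):
    """Fraction of actual values that fall inside [lower, upper]."""
    inside = sum(1 for y, lo, hi in zip(actual, lower, upper) if lo <= y <= hi)
    return inside / len(actual)

# Made-up numbers standing in for the actual values and the interval bounds
actual = [1010.0, 1150.0, 990.0, 1205.0]
lower = [950.0, 1000.0, 1000.0, 1100.0]
upper = [1100.0, 1120.0, 1150.0, 1250.0]

coverage = interval_coverage(actual, lower, upper)
print(coverage)
```

Prophet's default interval_width is 0.8, so a coverage far below 80% on the test period suggests that the uncertainty interval is too narrow for this data.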

Performance Evaluation

We validate performance by comparing the forecast with the actual test data.

from sklearn.metrics import mean_squared_error

# Merge the actual and predicted values
result = test_data.copy()
result['yhat'] = forecast.set_index('ds').loc[test_data['ds']]['yhat'].values

# Evaluate performance (MSE)
mse = mean_squared_error(result['y'], result['yhat'])
print(f'Mean Squared Error: {mse}')

Explanation:

  • mean_squared_error: computes the mean squared error between the actual and predicted values.
  • copy: copies the dataframe.
  • set_index: sets the dataframe's index.
  • loc: selects specific rows.
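
MSE is expressed in squared units (MW² here), so it is often easier to read alongside RMSE, MAE, and MAPE. The sketch below computes all four with the standard library on made-up values that stand in for result['y'] and result['yhat']; `regression_metrics` is a hypothetical helper, not a scikit-learn function.

```python
import math

def regression_metrics(actual, predicted):
    """Return MSE, RMSE, MAE, and MAPE (in percent) for two equal-length sequences."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {'mse': mse, 'rmse': math.sqrt(mse), 'mae': mae, 'mape': mape}

# Made-up values standing in for the actual and predicted load
actual = [1000.0, 1100.0, 1200.0, 1250.0]
predicted = [1010.0, 1080.0, 1230.0, 1250.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

RMSE and MAE are in the same unit as the target (MW), while MAPE is a percentage and is only meaningful when the actual values are never zero.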

Summary

In this post, we introduced Prophet and used it to run a time series forecast.

  1. Data preparation: prepare the data in the format Prophet requires.
  2. Model initialization and training: initialize a Prophet model and train it on the training data.
  3. Forecasting: build the future dataframe and run the forecast.
  4. Visualization: plot the forecast and its components with Prophet's built-in functions.
  5. Performance evaluation: compare the forecast with the actual data to evaluate performance.

Thank you for reading this long post ⭐


