OpenAI Introduces GPT‑4o Image Generation – A New Era of AI Image Generation

Posted by Euisuk's Dev Log on March 26, 2025

Original post: https://velog.io/@euisuk-chung/GPT4o-이미지-생성-기능-소개-AI-이미지-생성의-새로운-시대

Hello! The feature released today is one I have been really, really waiting for!! 🙌

Reference links

On March 25, 2025, OpenAI officially launched native image generation (Native Image Generation) in the GPT‑4o model. More than a simple feature improvement, this announcement is notable because AI image generation has evolved into an omnimodal (Omnimodal) experience in which text and images are fully integrated.

Just like today's thumbnail, it finally looks possible to insert an image of my own, change the style of a photo, or place text accurately in a picture!!

😄 Honestly, I always wanted to do this, but there were always limits.

  • In particular, lowercase and uppercase letters were not distinguished correctly, and text generated inside images came out broken, which was disappointing…

Now, as the news articles below show, the editing works so well that it is raising copyright concerns.

So of course I tried generating images right away, haha 🖼️

Example 1. Editing a famous meme - SH** UP AND TAKE MY MONEY!!

Example 2. Editing a Trump image - see the Chosun Ilbo article above

In this GPT-4o image generation update, all of those aspects seem to have improved well beyond my expectations, which is really exciting! 😍 (Although it has become somewhat slower.)


📌 So, let's take a proper look at what has improved.

OpenAI explained that, by integrating its most advanced image generator into GPT‑4o, it has shifted the direction of image generation from ‘pretty pictures’ to ‘useful tools’.

“We’ve built our most advanced image generator yet into GPT‑4o. The result—image generation that is not only beautiful, but useful.” - OpenAI, March 25, 2025

Where the goal used to be simply generating visually impressive images, the focus has now moved to use cases that can be applied to real work.

ℹ️ Summary of new features

  • 🎨 Multi-turn image generation (Multi-turn Generation)

    Images can be revised and refined as the chat continues. Useful for character design and scene composition.

  • 📝 Precise instruction following

    Faithfully follows complex text instructions, such as placing 10 to 20 objects.

  • 🧠 In-context learning (In-context Learning)

    Learns style and composition from uploaded images, then generates new images.

  • 🌍 World knowledge (World Knowledge)

    Uses the text model's knowledge to generate images more intelligently.

  • ✍️ Accurate text rendering (Text Rendering)

    Generates images containing text, such as signs, menus, and infographics, accurately and cleanly.

  • 🎭 Diverse styles & photorealism

    Supports a wide range of styles, including cartoon, watercolor, digital painting, and photorealistic.

This post is divided into the OpenAI YouTube demo (Part 1) and the OpenAI blog content (Part 2).


PART 1. Introducing the OpenAI YouTube Demo

The presentation centered on live demos, with each session showing, step by step, refined text rendering, precise attribute binding, flexible meme generation, and the use of multimodal input.

🎬 The evolution of image generation quality

Early in the presentation, the presenters emphasized that GPT‑4o's image generation quality is on a different level from that of past models.

They backed this up with the image generation demos below.

📷 DEMO 1. Requesting a POV (first-person view) image:

A scene from the user's point of view, with speaker notes on a sheet of paper and a film crew in the background. In this image, GPT‑4o accurately reflected the following details:

  • The background is blurred (depth-of-field effect)
  • The text on the paper is rendered accurately
  • Phrases such as “SPEAKER NOTES”, “PART 1”, and “PART 2” appear without typos
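
If you want to check the "no typos" point a bit more systematically than by eye, a minimal sketch like the one below could help. It assumes you already have the text recognized from the generated image as a plain string (the OCR step itself is out of scope here), and both `missing_phrases` and the sample `ocr_output` are hypothetical names and values made up for illustration.

```python
def missing_phrases(expected: list[str], recognized_text: str) -> list[str]:
    """Return the expected phrases that do not appear in the recognized text.

    Whitespace (including line breaks) is collapsed before comparing, so a
    phrase split across lines in the OCR output still counts as present.
    """
    normalized = " ".join(recognized_text.split())
    return [p for p in expected if " ".join(p.split()) not in normalized]


EXPECTED = ["SPEAKER NOTES", "PART 1", "PART 2"]

# Hypothetical OCR result for the generated image (not real data).
ocr_output = "SPEAKER NOTES:\nPART 1\n- make it accessible\nPART 2\n- memes?"

print(missing_phrases(EXPECTED, ocr_output))
```

An empty list means every expected phrase was found; anything returned is a phrase worth inspecting in the image.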

When I generated it myself, it also came out correctly.

📝 Prompt used (ENG):

A tall, POV image of me in an old loft. Film crew present, facing me.

There is a sheet of paper on the table. There is text written on the paper (focus on paper, background out of focus. Paper occupies most of the page)

The text reads:

**SPEAKER NOTES:**

**PART 1**  
- native support for image generation in a model as powerful as GPT-4  
- render full paragraphs of text and combine images  
- "rough around the edges"  
- make it accessible  

**PART 2**  
- memes?  
- "we're surrounded by images"  
- images that "persuade, inform and educate".  
- "workhorse images"  
- gives the power of useful image generation to the world

👓 Prompt characteristics

The image generation prompt follows a three-part structure:

  • Scene Setting

    • Clearly sets the space (an old loft) and the viewpoint (POV, my line of sight)
    • People in the scene (a film crew, facing me)
  • Focal Object

    • Center of the image: the paper on the table
    • “focus on paper” and “background out of focus” specify the depth of field
  • Embedded Text Instruction

    • States the content written on the paper verbatim (the lines after “The text reads:”)

If you translate the prompt above literally and make the request in Korean, you can see that Korean text rendering is not yet perfect(?).

Below is the Korean prompt, a literal translation of the English prompt used to create the image above.

📝 Prompt used (KOR):

낡은 로프트 안. 카메라 시점은 인물의 시점에서 바라보는 듯한 Tall, POV 이미지.  
내 앞에 영화 촬영 팀이 있으며, 나를 바라보고 있다.

책상 위에 종이 한 장이 놓여 있고, 그 종이에 글이 적혀 있다.  
**이미지의 초점은 종이에 맞춰져 있으며**, 배경은 흐릿하게 처리되어 있다.  
종이가 이미지의 대부분을 차지한다.

종이에는 다음과 같은 내용이 적혀 있다:

**발표자 노트:**

**1부**  
- GPT-4처럼 강력한 모델이 이미지 생성을 기본으로 지원  
- 긴 문단의 텍스트를 렌더링하고 이미지를 결합  
- "조금은 조악하다"  
- 누구나 쉽게 접근 가능하게 만들 것  

**2부**  
- 밈(meme)은 어떨까?  
- "우리는 이미지로 둘러싸여 있다"  
- 사람들을 "설득하고, 정보 전달하고, 교육하는" 이미지들  
- "일꾼 같은 이미지들"  
- 유용한 이미지 생성을 누구나 할 수 있도록 하는 기술

This demo was a representative case of naturally satisfying a combination of conditions: accurate text within the image, viewpoint, and background blur.


🎨 DEMO 2.1. Image interaction - interacting with a photo you took yourself

  • Take a selfie and upload it
  • Ask it to “turn this into an anime frame”
  • GPT‑4o generates an anime-style image while preserving the faces, hand poses, expressions, and background

🖼️ DEMO 2.2. Meme generation – taking it a step further

The demo then continues in the flow selfie → anime conversion → meme generation.

💬 Example prompt:

“Make this into a meme. The caption is ‘Feel the AGI’.”

GPT‑4o kept the context of the previous prompt and image, and generated a meme with fitting humor and composition.


🃏 DEMO 3. Making your own trading card & expanding into a personal creative tool

In this demo, a developer creates a trading card that looks like it was made by a professional designer. The inputs were:

  • A photo of a real card
  • A photo of the developer's own dog, Sanji
  • Text information for the card (name, stats, year, background setting, etc.)

📝 Prompt used (ENG):

I want to make a trade card for our launch.  
I've uploaded an example from the Sora launch, please design a trade card in the same style. The picture on the trade card should be a Shiba Inu snowboarding – please use my dog Sanji from the photo.  

A couple details to note:  
- Mention "GPT-4o Image" and "2025" in the headline as that's what we are launching today  
- Use "Generative AI image model" as its ability. Be sure to mention "GPT-4o" and "native multimodal" in the description  
- Sanji weighs "30 lbs", and is "14 inches" tall  
- Include these stats on the card, and make them render in a 2x2 grid view  
  ‒ "speed": "1 min"  
  ‒ "genre": "any"  
- Add a fun punch line in the end!

📝 Prompt used (KOR):

우리 런칭을 위한 **트레이딩 카드(trade card)**를 만들고 싶어.  
Sora 런칭 때 사용된 예시 이미지를 업로드했으니, 동일한 스타일로 트레이딩 카드를 디자인해줘.  
카드에 들어갈 이미지는 **스노보드를 타는 시바견(Shiba Inu)**이면 좋겠고,  
내 강아지 **Sanji**의 사진을 활용해줘.

다음 세부 사항들을 꼭 반영해줘:

- 제목에는 **"GPT-4o Image"**와 **"2025"**라는 문구가 꼭 들어가야 해. 오늘 런칭되는 내용을 반영한 거니까.  
- 능력(ability)으로는 **"Generative AI image model"**을 사용해줘.  
  설명에는 **"GPT-4o"**와 **"native multimodal"**이라는 표현을 꼭 넣어줘.  
- Sanji는 **몸무게 30파운드(30 lbs)**, **키는 14인치(14 inches)**야.  
- 다음 스탯(stat)들은 카드에 포함하고, **2x2 그리드 형태**로 보여줘:  
  ‒ 속도(speed): "1분"  
  ‒ 장르(genre): "any"  
- 마지막엔 **재미있는 한 줄 문장(punch line)**도 추가해줘!

A pretty Mint card I made myself 💌 (reproducing the photo exactly still feels a bit lacking)

GPT‑4o generated a trading card that mimicked the style of the photo while accurately reflecting the specified text and numeric data.

Even so, what is striking is that the card's layout, alignment, font style, and even the small print were rendered very precisely.


🪙 DEMO 4. Image editing, compositing, and making a commemorative coin

The final session demonstrated the process of making a commemorative coin (Memorial Coin) with GPT‑4o.

  • Use the four previously generated images as background elements
  • Apply a “spring color palette” using hex color codes
  • Insert a commemorative message and date
  • Then request “make the background transparent” → generated as a PNG
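
For readers less familiar with hex color codes and transparent backgrounds, here is a minimal sketch of what those two steps mean in terms of pixel values. The palette values are hypothetical (the post does not list the hex codes used in the demo), and `hex_to_rgba` is a helper written for this example, not part of any image API.

```python
def hex_to_rgba(code: str, alpha: int = 255) -> tuple[int, int, int, int]:
    """Convert '#RRGGBB' (or 'RRGGBB') into an (R, G, B, A) tuple of 0-255 values."""
    digits = code.lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected 6 hex digits, got {code!r}")
    r, g, b = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return (r, g, b, alpha)


# Hypothetical "spring" palette; the demo's actual hex codes are not given in this post.
SPRING_PALETTE = ["#FFB7C5", "#A8E6CF", "#FFF5BA"]

palette_rgba = [hex_to_rgba(code) for code in SPRING_PALETTE]

# A transparent background is simply a pixel whose alpha channel is 0,
# which is why the result is saved as PNG (a format with an alpha channel).
transparent_background = hex_to_rgba("#FFFFFF", alpha=0)

print(palette_rgba)
print(transparent_background)
```

Giving exact hex codes in the prompt removes the ambiguity of color names like "pastel pink", which is presumably why the demo specified the palette that way.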

This demo showed that GPT‑4o is not limited to generating a single image: through multi-turn (Multi-turn) interaction it can also edit images and customize them iteratively.

You could also follow up with requests like the ones below to steer the image in the direction you want!!

  • “Make a design for the back side too”
  • “Apply a different color to each name”
  • “Edit only this part”

PART 2. Introducing the OpenAI Blog Content

As mentioned at the start of this post, the GPT-4o image generation update consists of the following.

ℹ️ Summary of new features

  • 🧭 Multi-turn image generation (Multi-turn Generation)
  • 📝 Precise instruction following
  • 🧠 In-context learning (In-context Learning)
  • 🌍 World knowledge (World Knowledge)
  • ✍️ Accurate text rendering (Text Rendering)
  • 🎭 Diverse styles & photorealism

🔍 This chapter looks at each feature in more detail.

Below, GPT-4o's image generation is organized into six core capabilities, each with detailed example prompts. It is structured to be useful for writing prompts in practical work.

The prompts and images in the examples below are official examples from the OpenAI blog.

(Some content has been slightly added or changed.)


🧭 1. Multi-turn Generation (multi-turn image generation)

Because it is chat-based, GPT-4o can carry image generation forward step by step and refine the result incrementally.

🔍 Key features

  • Images can be edited conversationally
  • Consistency is maintained (character style, background elements, UI, etc.)
  • Complex scene compositions can be developed step by step

🎨 Example prompt

CAT (Image)

→ Give this cat a detective hat and a monocle

→ Adds a detective hat and a monocle to the base image

→ turn this into a triple A video games made with a 4k game engine and add some User interface as overlay from a mystery RPG where we can see a health bar and a minimap at the top as well as spells at the bottom with consistent and iconography

→ Converts to a AAA game style and adds a UI overlay

→ Game HUD, consistent icons, upgraded resolution and style

→ update to a landscape image 16:9 ratio, add more spells in the UI, and unzoom the visual so that we see the cat in a third person view walking through a steampunk manhattan creating beautiful contrast and lighting like in the best triple A game, with cool-toned colors

→ Changes the aspect ratio (16:9), switches the viewpoint (third person), and adds a background setting

→ Advanced adjustments such as a steampunk Manhattan background, color tone, and lighting effects

🧩 Additional example

I additionally applied the prompt below to put a Batman mask on the cat.

→ turn this cat's detective hat and a monocle to batman mask.

→ Creates a new character style by changing the outfit
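
One way to picture the multi-turn flow in this section is as an ordered list of edit requests attached to one base image, where each new turn builds on everything before it. The sketch below only models that bookkeeping; `ImageSession` is a hypothetical class made up for illustration and does not call any real image generation API. The second and third instructions are abbreviated versions of the prompts above.

```python
from dataclasses import dataclass, field


@dataclass
class ImageSession:
    """Hypothetical record of the edit requests made against one base image."""

    base_image: str
    turns: list[str] = field(default_factory=list)

    def request(self, instruction: str) -> int:
        """Store one edit instruction and return its 1-based turn number."""
        self.turns.append(instruction)
        return len(self.turns)

    def context(self) -> str:
        """Summarize everything a later turn implicitly builds on."""
        lines = [f"Base image: {self.base_image}"]
        lines += [f"Turn {i}: {text}" for i, text in enumerate(self.turns, start=1)]
        return "\n".join(lines)


session = ImageSession(base_image="CAT (Image)")
session.request("Give this cat a detective hat and a monocle")
session.request("turn this into a triple A video game with a UI overlay")  # abbreviated
session.request("update to a landscape image 16:9 ratio")  # abbreviated

print(session.context())
```

Looking at the turns this way makes it clear why consistency matters: turn 3 only says what changes, and relies on the model to keep the hat, monocle, and game UI from the earlier turns.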


📋 2. Instruction Following (instruction-based generation)

GPT-4o accurately follows long prompts and complex instructions about objects.

(It can handle as many as 10 to 20 objects.)

📝 Example prompt

A square image containing a 4 row by 4 column grid containing 16 objects on a white background. 

Go from left to right, top to bottom. Here's the list:
1. a blue star
2. red triangle
3. green square
4. pink circle
5. orange hourglass
6. purple infinity sign
7. black and white polka dot bowtie
8. tiedye "42"
9. an orange cat wearing a black baseball cap
10. a map with a treasure chest
11. a pair of googly eyes
12. a thumbs up emoji
13. a pair of scissors
14. a blue and white giraffe
15. the word "OpenAI" written in cursive
16. a rainbow-colored lightning bolt

→ A 4x4 square grid, ordered left to right and top to bottom, with each object's attributes accurately reflected
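
To check a result like this, it helps to know which cell each numbered item should land in. The sketch below maps a 1-based list index to its row and column under the "left to right, top to bottom" rule, and rebuilds the prompt from a plain list. The function names `grid_position` and `build_grid_prompt` are hypothetical helpers written for this example.

```python
def grid_position(index: int, columns: int = 4) -> tuple[int, int]:
    """Map a 1-based item index to a 1-based (row, column), filling rows first."""
    if index < 1:
        raise ValueError("index is 1-based")
    row, col = divmod(index - 1, columns)
    return (row + 1, col + 1)


def build_grid_prompt(objects: list[str], rows: int = 4, columns: int = 4) -> str:
    """Assemble the grid prompt from a list of object descriptions."""
    if len(objects) != rows * columns:
        raise ValueError(f"need exactly {rows * columns} objects, got {len(objects)}")
    header = (
        f"A square image containing a {rows} row by {columns} column grid "
        f"containing {len(objects)} objects on a white background."
    )
    listing = "\n".join(f"{i}. {name}" for i, name in enumerate(objects, start=1))
    return f"{header}\n\nGo from left to right, top to bottom. Here's the list:\n{listing}"


OBJECTS = [
    "a blue star", "red triangle", "green square", "pink circle",
    "orange hourglass", "purple infinity sign",
    "black and white polka dot bowtie", 'tiedye "42"',
    "an orange cat wearing a black baseball cap", "a map with a treasure chest",
    "a pair of googly eyes", "a thumbs up emoji",
    "a pair of scissors", "a blue and white giraffe",
    'the word "OpenAI" written in cursive', "a rainbow-colored lightning bolt",
]

print(build_grid_prompt(OBJECTS))
print(grid_position(9))  # the orange cat should start the third row
```

With this mapping, item 9 (the orange cat) belongs in row 3, column 1, and item 16 (the lightning bolt) in the bottom-right cell, which is easy to verify against the generated image.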

🧩 Additional example

GPT‑4o's image generation carries out even delicate requests.

show me a wine glass with only the tiniest drop of red wine in it.

→ Very fine details can be realized.

→ Even a quantitative expression like “tiniest drop” is visualized.


🧠 3. In-context Learning (learning from context)

It learns style, composition, and ideas from reference images and applies them to new images.

🖼 Example prompt

Draw a design for a vehicle with triangular wheels using these images as references.  
Label the front and back wheels, and at the bottom write:  
“TRIANGLE WHEELED VEHICLE. English Patent. 2025. OPENAI.” (in small caps)

→ Composes the vehicle's form based on the supplied reference images

→ Includes the text “TRIANGLE WHEELED VEHICLE. English Patent. 2025. OPENAI.”

🧩 Additional example

an photorealistic image of a blue chainsaw

→ Even a simple request yields highly detailed, realistic rendering grounded in photorealism


🌐 4. World Knowledge Integration (linking world knowledge)

GPT-4o draws on the text model's knowledge to generate images that are logically coherent.

🍹 Example prompt

Make me a professionally shot photorealistic diagram of the top selling cocktails in my bar with recipes labeled on each drink.

put the recipes on handwritten cards in front of each drink.

the cards are brown, and the text is black.

background is white

Title is "4 most popular cocktails"

→ Generates an image that integrates the cocktail selection, recipes, and photographic style

→ Satisfies combined conditions such as handwritten cards, a drink-centered arrangement, and the background

🧩 Additional example

make a visual infographic describing why SF is so foggy

→ Applies region-specific climate knowledge and turns it into a visual (fog = marine climate, effects of terrain, and so on)


🖼 5. Photorealism & Style Transfer

Having learned a wide variety of styles, it can generate both realistic and artistic images.

👨‍🏫 Example prompt

A wide image taken with a phone of a glass whiteboard, in a room overlooking the Bay Bridge. The field of view shows a woman writing, sporting a tshirt wiith a large OpenAI logo. The handwriting looks natural and a bit messy, and we see the photographer's reflection.

The text reads:

(left)
"Transfer between Modalities:

Suppose we directly model
p(text, pixels, sound) [equation]
with one big autoregressive transformer.

Pros:
* image generation augmented with vast world knowledge
* next-level text rendering
* native in-context learning
* unified post-training stack

Cons:
* varying bit-rate across modalities
* compute not adaptive"

(Right)
"Fixes:
* model compressed representations
* compose autoregressive prior with a powerful decoder"

On the bottom right of the board, she draws a diagram:
"tokens -> [transformer] -> [diffusion] -> pixels"

→ Precisely reproduces the specified camera angle, the natural look of the handwriting, and even the background window and reflection effects
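
The whiteboard text in this prompt leaves the formula as “[equation]”. Purely as an assumption on my part (it is not spelled out in the prompt), the textbook way to write "one big autoregressive transformer" over a joint sequence is the chain-rule factorization below, where x_1, ..., x_T stands for the text, pixel, and sound tokens laid out as a single sequence:

```latex
p(x_1, x_2, \dots, x_T) \;=\; \prod_{t=1}^{T} p\left(x_t \mid x_1, \dots, x_{t-1}\right)
```

Each factor is what the model predicts at one step: the next token given everything generated so far, regardless of which modality that token belongs to.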

🧩 Additional example

A cat looking into a puddle of water on a street.  
The reflection is that of a tiger, realistically distorted by ripples in the water.

→ Renders even metaphorical expression with finesse

→ Depicts even the rippled reflection photorealistically


📝 6. Text Rendering (text rendering accuracy)

Text rendering within images is excellent and usable in practice for signs, menus, invitations, and more.

🧾 Example prompt

Create a photorealistic image of two witches in their 20s (one ash balayage, one with long wavy auburn hair) reading a street sign.

Context:
a city street in a random street in Williamsburg, NY with a pole covered entirely by numerous detailed street signs (e.g., street sweeping hours, parking permits required, vehicle classifications, towing rules), including few ridiculous signs at the middle: (paraphrase it to make these legitimate street signs)"Broom Parking for Witches Not Permitted in Zone C" and "Magic Carpet Loading and Unloading Only (15-Minute Limit)" and "Reindeer Parking by Permit Only (Dec 24–25)\n Violators will be placed on Naughty List." The signpost is on the right of a street. Do not repeat signs. Signs must be realistic.

Characters:
one witch is holding a broom and the other has a rolled-up magic carpet. They are in the foreground, back slightly turned towards the camera and head slightly tilted as they scrutinize the signs.

Composition from background to foreground:
streets + parked cars + buildings -> street sign -> witches. Characters must be closest to the camera taking the shot

→ The complex arrangement of signs and the actual text are rendered accurately

→ Even humorous elements such as “Broom Parking for Witches…” are expressed in a realistic street-sign format

🧩 Additional example

I'm opening a traditional concept restaurant in Marin called Haein. It focuses on Korean food cooked with organic, farm-fresh ingredients, with a rotating menu based on what's seasonal. I want you to design an image - a menu incorporating the following menu items - lean into the traditional/rustic style while keeping it feeling upscale and sleek. Please also include illustrations of each dish in an elegant, peter rabbit style. Make sure all the text is rendered correctly, with a white background.

(Top)

Doenjang Jjigae (Fermented Soybean Stew) – $18 House-made doenjang with local mushrooms, tofu, and seasonal vegetables served with rice.

Galbi Jjim (Braised Short Ribs) – $34 Slow-braised local grass-fed beef ribs with pear and black garlic glaze, seasonal root vegetables, and jujube.

Grilled Seasonal Fish – Market Price ($22-$30) Whole or fillet of local, sustainable fish grilled over charcoal, served with perilla leaf ssam and house-made sauces.

Bibimbap – $19 Heirloom rice with a rotating selection of farm-fresh vegetables, house-fermented gochujang, and pasture-raised egg.

Bossam (Heritage Pork Wraps) – $28 Slow-cooked pork belly with napa cabbage wraps, oyster kimchi, perilla, and seasonal condiments.

(Bottom) Dessert & Drinks Seasonal Makgeolli (Rice Wine) – $12/glass

Rotating flavors based on seasonal fruits and flowers (persimmon, citrus, elderflower, etc.).

Hoddeok (Korean Sweet Pancake) – $9 Pan-fried cinnamon-stuffed pancake with black sesame ice cream.

→ Visualizes a Korean restaurant menu in an upscale traditional style

→ Includes Peter Rabbit style illustrations, with no typos in the text


Wrapping up

With this GPT-4o update, we saw that image generation now goes beyond simple generation and opens up a broad range of possibilities: fine-grained control through prompts, multi-step refinement, and knowledge-based generation.

These were impressive examples showing that carefully putting your intent into even a single prompt can produce far more polished and creative images.

OpenAI CEO Sam Altman emphasized that although this update is somewhat slower, it brings major advances in precision and expressiveness. Performance optimization is expected to improve over time, so future updates are worth looking forward to.

Meanwhile, issues around copyright and the use of image styles, such as the Ghibli style, are increasingly coming to the fore. How OpenAI and creators work out the ethical and legal questions will be an important point to watch.

💬 What kind of image would you like to create?

🎨 What have you imagined into your prompts?

Experimenting yourself and making your own creations is a lot of fun.

If you have a fun experiment or result, please share it in a comment or a link! 🎨

Thank you for reading today! 😊



-->