[๊ฐ•์˜๋…ธํŠธ] RAG From Scratch : Query Retrieval ๊ธฐ๋ฒ•

Posted by Euisuk's Dev Log on September 14, 2024

[๊ฐ•์˜๋…ธํŠธ] RAG From Scratch : Query Retrieval ๊ธฐ๋ฒ•

์›๋ณธ ๊ฒŒ์‹œ๊ธ€: https://velog.io/@euisuk-chung/RAG-From-Scratch-15-18

  • ํ•ด๋‹น ๋ธ”๋กœ๊ทธ ํฌ์ŠคํŠธ๋Š” RAG From Scratch : Coursework ๊ฐ•์˜ ํŒŒํŠธ 15 - 18 ๋‚ด์šฉ์„ ๋‹ค๋ฃจ๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.
  • ๋ณ„๋„์˜ ๊ฐ•์ขŒ ๋‚ด์šฉ์ด ๋ณด์ด์ง€ ์•Š์•„์„œ ์‹ค์Šต ์ฝ”๋“œ๋ฅผ ๋ฐ”ํƒ•์œผ๋กœ Reverse Enginneeringํ•ด์„œ ์ž๋ฃŒ๋ฅผ ์ •๋ฆฌํ•œ ๋‚ด์šฉ์ž…๋‹ˆ๋‹ค. ์ฐธ๊ณ  ๋ถ€ํƒ๋“œ๋ฆฝ๋‹ˆ๋‹ค!

1. Re-ranking (์žฌ์ •๋ ฌ)

  • Re-ranking์€ ๊ฒ€์ƒ‰ ์‹œ์Šคํ…œ์—์„œ ์ดˆ๊ธฐ ๊ฒ€์ƒ‰ ๊ฒฐ๊ณผ์˜ ์ˆœ์œ„๋ฅผ ๋‹ค์‹œ ๋งค๊ธฐ๋Š” ๊ณผ์ •์ž…๋‹ˆ๋‹ค. ๊ฒ€์ƒ‰๋œ ๊ฒฐ๊ณผ๊ฐ€ ์‚ฌ์šฉ์ž์˜ ๊ธฐ๋Œ€์— ๋ถ€ํ•ฉํ•˜์ง€ ์•Š๊ฑฐ๋‚˜, ๊ด€๋ จ์„ฑ์ด ์ถฉ๋ถ„ํ•˜์ง€ ์•Š์„ ๋•Œ ์ฃผ๋กœ ์‚ฌ์šฉ๋ฉ๋‹ˆ๋‹ค. ๊ธฐ๋ณธ์ ์œผ๋กœ ๊ฒ€์ƒ‰ ์•Œ๊ณ ๋ฆฌ์ฆ˜์ด ๋จผ์ € ์ˆ˜ํ–‰๋˜๊ณ , ๊ทธ ๊ฒฐ๊ณผ๋ฅผ ์žฌ์ •๋ ฌํ•˜์—ฌ ์‚ฌ์šฉ์ž์—๊ฒŒ ๋” ์ ํ•ฉํ•œ ํ•ญ๋ชฉ์„ ์ƒ์œ„์— ๋…ธ์ถœํ•ฉ๋‹ˆ๋‹ค. Re-ranking ๊ณผ์ •์—์„œ๋Š” ์‚ฌ์šฉ์ž์˜ ๊ณผ๊ฑฐ ํ–‰๋™ ๋ฐ์ดํ„ฐ๋‚˜ ๋ฌธ๋งฅ์  ์š”์†Œ, ์ตœ์‹ ์„ฑ ๋“ฑ ์—ฌ๋Ÿฌ ๊ฐ€์ง€ ์š”์ธ์„ ๋ฐ˜์˜ํ•˜์—ฌ ์ˆœ์œ„๋ฅผ ์กฐ์ •ํ•˜๊ฒŒ ๋ฉ๋‹ˆ๋‹ค.
  • Reciprocal Rank Fusion (RRF) ์•Œ๊ณ ๋ฆฌ์ฆ˜์€ ์—ฌ๋Ÿฌ ๊ฒ€์ƒ‰ ๊ฒฐ๊ณผ๋ฅผ ์ข…ํ•ฉํ•˜์—ฌ ์žฌ์ •๋ ฌํ•˜๋Š” ๋ฐฉ๋ฒ• ์ค‘ ํ•˜๋‚˜๋กœ, ์—ฌ๋Ÿฌ ๊ฒ€์ƒ‰ ์‹œ์Šคํ…œ์—์„œ ๋‚˜์˜จ ๊ฒฐ๊ณผ์˜ ์ˆœ์œ„๋ฅผ ๊ณ ๋ คํ•˜์—ฌ ๊ฒ€์ƒ‰ ๊ฒฐ๊ณผ๋ฅผ ๊ฒฐํ•ฉํ•ฉ๋‹ˆ๋‹ค. RRF๋Š” ๊ฒ€์ƒ‰ ๊ฒฐ๊ณผ์˜ ์ƒํ˜ธ ์ˆœ์œ„์— ๊ธฐ๋ฐ˜ํ•˜์—ฌ ๊ฐ€์ค‘์น˜๋ฅผ ๋ถ€์—ฌํ•˜๋Š” ๋ฐฉ์‹์ž…๋‹ˆ๋‹ค. ๊ฐ ๊ฒฐ๊ณผ์— ๋Œ€ํ•ด ์ˆœ์œ„ ์—ญ์ˆ˜๋ฅผ ๊ฐ€์‚ฐํ•˜๋ฉฐ, ์ด๋ฅผ ํ†ตํ•ด ์ƒ์œ„์— ๋…ธ์ถœ๋œ ๊ฒฐ๊ณผ๊ฐ€ ์šฐ์„ ์‹œ๋ฉ๋‹ˆ๋‹ค.

์ฃผ์š” ํŠน์ง•:

  1. ์ดˆ๊ธฐ ๊ฒ€์ƒ‰ ๊ฒฐ๊ณผ ๊ฐœ์„ : ์ฒซ ๋ฒˆ์งธ ๊ฒ€์ƒ‰์—์„œ ๋†“์นœ ๊ด€๋ จ ํ•ญ๋ชฉ๋“ค์„ ์ƒ์œ„๋กœ ๋Œ์–ด์˜ฌ๋ฆฝ๋‹ˆ๋‹ค.
  2. ๋‹ค์–‘ํ•œ ์š”์†Œ ๊ณ ๋ ค: ์‚ฌ์šฉ์ž ์„ ํ˜ธ๋„, ๋ฌธ๋งฅ, ์ตœ์‹ ์„ฑ ๋“ฑ ๋‹ค์–‘ํ•œ ์š”์†Œ๋ฅผ ๊ณ ๋ คํ•˜์—ฌ ์ˆœ์œ„๋ฅผ ์กฐ์ •ํ•ฉ๋‹ˆ๋‹ค.
  3. ๋จธ์‹ ๋Ÿฌ๋‹ ํ™œ์šฉ: ๋งŽ์€ re-ranking ์‹œ์Šคํ…œ์€ ๋จธ์‹ ๋Ÿฌ๋‹ ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์‚ฌ์šฉํ•˜์—ฌ ๋” ์ •ํ™•ํ•œ ์ˆœ์œ„๋ฅผ ๋งค๊น๋‹ˆ๋‹ค.

Re-ranking ์ฝ”๋“œ ์˜ˆ์‹œ (RRF)

from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

# Create the vector store and load documents
# (from_texts accepts plain strings; from_documents expects Document objects)
documents = ["This is document 1 about LLM.", "This is document 2 about AI agents."]

vectorstore = Chroma.from_texts(
    texts=documents,
    embedding=OpenAIEmbeddings(),
)

retriever = vectorstore.as_retriever()

# Build the query and retrieve
query = "What is LLM?"
results = retriever.invoke(query)

# RRF re-ranking function
def reciprocal_rank_fusion(results, k=60):
    fused_scores = {}
    for rank, doc in enumerate(results):
        # Key on page_content, since Document objects are not hashable
        key = doc.page_content
        fused_scores[key] = fused_scores.get(key, 0) + 1 / (rank + k)

    # Sort by fused score, highest first
    reranked_results = sorted(fused_scores.items(), key=lambda x: x[1], reverse=True)
    return reranked_results

# Apply RRF re-ranking
reranked_results = reciprocal_rank_fusion(results)
print("Reranked results:", reranked_results)

์ฝ”๋“œ ๋ถ€์—ฐ ์„ค๋ช…:

  • reciprocal_rank_fusion ํ•จ์ˆ˜๋Š” ๊ฐ ๊ฒ€์ƒ‰ ๊ฒฐ๊ณผ์˜ ์ˆœ์œ„๋ฅผ ๋ฐ›์•„ ํ•ด๋‹น ์ˆœ์œ„์— ๋”ฐ๋ผ ์ ์ˆ˜๋ฅผ ๊ณ„์‚ฐํ•ฉ๋‹ˆ๋‹ค. ์ˆœ์œ„๊ฐ€ ๋†’์€ ๊ฒฐ๊ณผ๋Š” ์—ญ์ˆ˜ ๋ฐฉ์‹์œผ๋กœ ๋†’์€ ์ ์ˆ˜๋ฅผ ๋ฐ›์Šต๋‹ˆ๋‹ค.
  • k=60์€ ์—ญ์ˆ˜ ๊ณ„์‚ฐ์—์„œ ์ถ”๊ฐ€์ ์ธ ๊ฐ€์ค‘์น˜๋ฅผ ๋ถ€์—ฌํ•˜๋Š” ์—ญํ• ์„ ํ•˜๋ฉฐ, ์ด๋ฅผ ํ†ตํ•ด ์ˆœ์œ„๊ฐ€ ๋„ˆ๋ฌด ๋‚ฎ์€ ๊ฒฐ๊ณผ์— ๋Œ€ํ•œ ๊ฐ€์ค‘์น˜๊ฐ€ ์ง€๋‚˜์น˜๊ฒŒ ์ž‘์•„์ง€์ง€ ์•Š๋„๋ก ์กฐ์ •ํ•ฉ๋‹ˆ๋‹ค.

2. Retrieval (CRAG)

  • CRAG๋Š” Retrieval-Augmented Generation(RAG) ๊ธฐ๋ฐ˜ ์‹œ์Šคํ…œ์˜ ๊ฐœ์„ ๋œ ํ˜•ํƒœ๋กœ, ๊ฒ€์ƒ‰๋œ ๋ฌธ์„œ์˜ ๊ด€๋ จ์„ฑ์„ ํ‰๊ฐ€ํ•˜๊ณ  ํ•„์š”์‹œ ์งˆ๋ฌธ์„ ๋‹ค์‹œ ์ž‘์„ฑํ•˜์—ฌ ๊ฒ€์ƒ‰ ๊ณผ์ •์„ ๋ฐ˜๋ณตํ•˜๋Š” ๋ฐฉ์‹์ž…๋‹ˆ๋‹ค. ์ด ์‹œ์Šคํ…œ์€ ์ดˆ๊ธฐ ๊ฒ€์ƒ‰์ด ์ถฉ๋ถ„ํ•˜์ง€ ์•Š๊ฑฐ๋‚˜ ๋ถ€์ •ํ™•ํ•  ๋•Œ ์ž์ฒด์ ์œผ๋กœ ์งˆ๋ฌธ์„ ์ˆ˜์ •ํ•˜๊ณ  ์ถ”๊ฐ€ ๊ฒ€์ƒ‰์„ ์ˆ˜ํ–‰ํ•ด ์ตœ์ข…์ ์œผ๋กœ ๋” ์ •ํ™•ํ•œ ๋‹ต๋ณ€์„ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค.
  • CRAG์˜ ํ•ต์‹ฌ์€ ์ž๊ธฐ ์ˆ˜์ • ๋ฉ”์ปค๋‹ˆ์ฆ˜์œผ๋กœ, ๊ฒ€์ƒ‰๋œ ๋ฌธ์„œ๊ฐ€ ๋ถ€์ ์ ˆํ•˜๋‹ค๊ณ  ํŒ๋‹จ๋˜๋ฉด ์‹œ์Šคํ…œ์ด ์ด๋ฅผ ํ‰๊ฐ€ํ•˜๊ณ  ๊ฒ€์ƒ‰ ์ „๋žต์„ ์กฐ์ •ํ•ฉ๋‹ˆ๋‹ค. ์ด๋ฅผ ํ†ตํ•ด, ์‚ฌ์šฉ์ž๋Š” ๋” ์‹ ๋ขฐํ•  ์ˆ˜ ์žˆ๋Š” ๊ฒฐ๊ณผ๋ฅผ ์–ป์„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

์ฃผ์š”ํŠน์ง•

  1. ๊ฒ€์ƒ‰ ํ‰๊ฐ€๊ธฐ(Retrieval Evaluator) ๋„์ž…:
    • ๊ฒฝ๋Ÿ‰ T5-large ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•˜์—ฌ ๊ฒ€์ƒ‰๋œ ๋ฌธ์„œ์™€ ์ฟผ๋ฆฌ ๊ฐ„์˜ ๊ด€๋ จ์„ฑ์„ ํ‰๊ฐ€ํ•ฉ๋‹ˆ๋‹ค.
    • ๊ฒ€์ƒ‰ ๊ฒฐ๊ณผ๋ฅผ โ€œ์ •ํ™•โ€, โ€œ๋ถ€์ •ํ™•โ€, โ€œ๋ชจํ˜ธโ€๋กœ ๋ถ„๋ฅ˜ํ•ฉ๋‹ˆ๋‹ค.
  2. ์ž๊ธฐ ์ˆ˜์ • ๋ฉ”์ปค๋‹ˆ์ฆ˜:
    • ๊ฒ€์ƒ‰๋œ ๋ฌธ์„œ๊ฐ€ ๋ถ€์ ์ ˆํ•˜๋‹ค๊ณ  ํŒ๋‹จ๋˜๋ฉด, ์‹œ์Šคํ…œ์ด ์ž๋™์œผ๋กœ ์ˆ˜์ • ์ž‘์—…์„ ์ˆ˜ํ–‰ํ•ฉ๋‹ˆ๋‹ค.
    • ์›น ๊ฒ€์ƒ‰์„ ํ†ตํ•ด ์ถ”๊ฐ€ ์ •๋ณด๋ฅผ ํš๋“ํ•˜์—ฌ ๊ฒ€์ƒ‰ ๊ฒฐ๊ณผ๋ฅผ ๊ฐœ์„ ํ•ฉ๋‹ˆ๋‹ค.
  3. ๋ถ„ํ•ด-์žฌ๊ตฌ์„ฑ(Decompose-then-Recompose) ์•Œ๊ณ ๋ฆฌ์ฆ˜:
    • ๊ฒ€์ƒ‰๋œ ๋ฌธ์„œ์—์„œ ๊ฐ€์žฅ ์ค‘์š”ํ•œ ์ง€์‹์„ ์ถ”์ถœํ•˜๊ณ  ์žฌ๊ตฌ์„ฑํ•ฉ๋‹ˆ๋‹ค.
    • ๋ถˆํ•„์š”ํ•˜๊ฑฐ๋‚˜ ๊ด€๋ จ ์—†๋Š” ์ •๋ณด๋ฅผ ํ•„ํ„ฐ๋งํ•˜์—ฌ ํ•ต์‹ฌ ์ •๋ณด๋งŒ ํ™œ์šฉํ•ฉ๋‹ˆ๋‹ค.
  4. ์œ ์—ฐํ•œ ํ†ตํ•ฉ:
    • ๊ธฐ์กด RAG ๊ธฐ๋ฐ˜ ์‹œ์Šคํ…œ์— ํ”Œ๋Ÿฌ๊ทธ ์•ค ํ”Œ๋ ˆ์ด ๋ฐฉ์‹์œผ๋กœ ์‰ฝ๊ฒŒ ํ†ตํ•ฉ๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  5. ์„ฑ๋Šฅ ํ–ฅ์ƒ:
    • ๊ฒ€์ƒ‰ ๊ฒฐ๊ณผ์˜ ํ’ˆ์งˆ์„ ๊ฐœ์„ ํ•จ์œผ๋กœ์จ ์ตœ์ข… ์ƒ์„ฑ ๊ฒฐ๊ณผ์˜ ์ •ํ™•์„ฑ๊ณผ ๊ด€๋ จ์„ฑ์„ ๋†’์ž…๋‹ˆ๋‹ค.

CRAG ์ฝ”๋“œ ์˜ˆ์‹œ

from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Retriever and web search tool
# (reuses the vectorstore built in the previous example)
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
web_search_tool = TavilySearchResults(k=3)

# Question-rewriting chain: prompt -> LLM -> plain string
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
rewrite_prompt = ChatPromptTemplate.from_template(
    "Here is the initial question: {question}. Rephrase this question for better search."
)
rewrite_chain = rewrite_prompt | llm | StrOutputParser()

def corrective_rag(question):
    # Step 1: vector store retrieval
    results = retriever.invoke(question)

    # Step 2: grade the documents (here, simply checking for an empty result)
    if not results:
        # Step 3: rewrite the question and fall back to web search
        rewritten_question = rewrite_chain.invoke({"question": question})
        web_results = web_search_tool.invoke({"query": rewritten_question})
        results.extend(web_results)

    return results

question = "What are AI agents?"
corrected_docs = corrective_rag(question)
print(corrected_docs)

์ฝ”๋“œ ๋ถ€์—ฐ ์„ค๋ช…:

  • corrective_rag ํ•จ์ˆ˜๋Š” ๊ฒ€์ƒ‰๋œ ๊ฒฐ๊ณผ๋ฅผ ํ‰๊ฐ€ํ•˜๊ณ , ๊ทธ ๊ฒฐ๊ณผ๊ฐ€ ์ถฉ๋ถ„ํ•˜์ง€ ์•Š์„ ๊ฒฝ์šฐ ์งˆ๋ฌธ์„ ์žฌ์ž‘์„ฑํ•˜์—ฌ ์›น ๊ฒ€์ƒ‰์„ ์ˆ˜ํ–‰ํ•˜๋Š” ๊ตฌ์กฐ์ž…๋‹ˆ๋‹ค.
  • ์—ฌ๊ธฐ์„œ rewrite_prompt๋Š” ์ดˆ๊ธฐ ์งˆ๋ฌธ์„ ๋‹ค์‹œ ์ž‘์„ฑํ•˜๋Š” LLM ํ˜ธ์ถœ์„ ํ†ตํ•ด ์ด๋ฃจ์–ด์ง€๋ฉฐ, ์ด๋Š” ์งˆ๋ฌธ์„ ๋ช…ํ™•ํ•˜๊ฒŒ ํ•˜๊ฑฐ๋‚˜ ์„ธ๋ถ„ํ™”ํ•˜์—ฌ ๋” ๋‚˜์€ ๊ฒ€์ƒ‰ ๊ฒฐ๊ณผ๋ฅผ ์œ ๋„ํ•ฉ๋‹ˆ๋‹ค.

3. Retrieval (Self-RAG)

  • Self-RAG๋Š” ์–ธ์–ด ๋ชจ๋ธ์ด ์ž์ฒด์ ์œผ๋กœ ์ •๋ณด๋ฅผ ๊ฒ€์ƒ‰ํ•˜๊ณ , ์ƒ์„ฑํ•œ ๊ฒฐ๊ณผ๋ฅผ ๋น„ํ‰ํ•˜๋ฉฐ ํ•„์š”์‹œ ์ถ”๊ฐ€ ๊ฒ€์ƒ‰์„ ์ˆ˜ํ–‰ํ•˜๋Š” ๋ฐฉ์‹์ž…๋‹ˆ๋‹ค. ์ด๋Š” ๊ธฐ์กด RAG ์‹œ์Šคํ…œ์—์„œ ๋ฐœ์ƒํ•  ์ˆ˜ ์žˆ๋Š” ์ œํ•œ ์‚ฌํ•ญ์„ ๋ณด์™„ํ•˜๊ณ , ๋ชจ๋ธ์ด ์Šค์Šค๋กœ ์ƒ์„ฑ ๊ฒฐ๊ณผ๋ฅผ ํ‰๊ฐ€ํ•˜์—ฌ ๋” ๋‚˜์€ ๋‹ต๋ณ€์„ ๋งŒ๋“ค ์ˆ˜ ์žˆ๋Š” ์œ ์—ฐ์„ฑ์„ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค. ์ฆ‰, ๋ชจ๋ธ์€ ์ƒ์„ฑ๋œ ๋‹ต๋ณ€์ด ์ถฉ๋ถ„ํžˆ ๊ด€๋ จ์„ฑ์ด ์—†๋‹ค๊ณ  ํŒ๋‹จ๋˜๋ฉด, ์ถ”๊ฐ€ ์ •๋ณด๋ฅผ ๊ฒ€์ƒ‰ํ•ด ์ด๋ฅผ ๋ณด์™„ํ•ฉ๋‹ˆ๋‹ค.
  • ์ด๋Ÿฌํ•œ ์ž๊ฐ€ ๋ฐ˜์˜ ๋ฉ”์ปค๋‹ˆ์ฆ˜์€ ํŠนํžˆ ์ •ํ™•์„ฑ์ด ์ค‘์š”ํ•œ ๋ถ„์•ผ์—์„œ ์œ ์šฉํ•˜๊ฒŒ ์‚ฌ์šฉ๋  ์ˆ˜ ์žˆ์œผ๋ฉฐ, ๋ชจ๋ธ์ด ํ•œ ๋ฒˆ์˜ ๊ฒ€์ƒ‰์œผ๋กœ ์ถฉ๋ถ„ํ•œ ์ •๋ณด๋ฅผ ์–ป์ง€ ๋ชปํ•  ๋•Œ ์ด๋ฅผ ์ธ์ง€ํ•˜๊ณ  ์ถ”๊ฐ€์ ์œผ๋กœ ๊ฒ€์ƒ‰ํ•  ์ˆ˜ ์žˆ๋„๋ก ํ•ด์ค๋‹ˆ๋‹ค.

์ฃผ์š” ํŠน์ง•:

  1. ์ž๊ฐ€ ๋ฐ˜์˜: ๋ชจ๋ธ์ด ์ž์‹ ์˜ ์ถœ๋ ฅ์„ ํ‰๊ฐ€ํ•˜๊ณ  ๊ฐœ์„ ํ•ฉ๋‹ˆ๋‹ค.
  2. ๋™์  ๊ฒ€์ƒ‰: ํ•„์š”์— ๋”ฐ๋ผ ์ถ”๊ฐ€ ์ •๋ณด๋ฅผ ๊ฒ€์ƒ‰ํ•ฉ๋‹ˆ๋‹ค.
  3. ๋น„ํ‰ ๋Šฅ๋ ฅ: ์ƒ์„ฑ๋œ ๋‚ด์šฉ์˜ ์ •ํ™•์„ฑ๊ณผ ๊ด€๋ จ์„ฑ์„ ํ‰๊ฐ€ํ•ฉ๋‹ˆ๋‹ค.

Self-RAG ์ฝ”๋“œ ์˜ˆ์‹œ

from langchain_community.vectorstores import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Documents and vector store setup
documents = ["Document 1 about LLM", "Document 2 about AI agents"]
vectorstore = Chroma.from_texts(
    texts=documents,
    embedding=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()

# LLM and answer-generation chain; the prompt asks the model to flag
# an inadequate context with the word "insufficient"
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
answer_prompt = ChatPromptTemplate.from_template(
    "Answer the question using only the context below. "
    "If the context is not enough, reply with the single word 'insufficient'.\n\n"
    "Context: {context}\nQuestion: {question}"
)
answer_chain = answer_prompt | llm | StrOutputParser()

def self_rag(question):
    # Step 1: retrieve
    results = retriever.invoke(question)

    # Step 2: generate an answer
    generated_answer = answer_chain.invoke({
        "context": results,
        "question": question
    })

    # Step 3: evaluate the generated answer
    if "insufficient" in generated_answer.lower():
        # If the documents were not enough, retrieve more and regenerate
        additional_results = retriever.invoke(f"{question} more details")
        results.extend(additional_results)
        generated_answer = answer_chain.invoke({
            "context": results,
            "question": question
        })

    return generated_answer

question = "What are the types of AI agents?"
answer = self_rag(question)
print(answer)

์ฝ”๋“œ ๋ถ€์—ฐ ์„ค๋ช…:

  • self_rag ํ•จ์ˆ˜์—์„œ๋Š” ๋จผ์ € ๊ฒ€์ƒ‰์„ ํ†ตํ•ด ๋ฌธ์„œ๋ฅผ ๊ฐ€์ ธ์˜จ ํ›„, ๊ทธ ๋ฌธ์„œ๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ LLM์ด ๋‹ต๋ณ€์„ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค.
  • ๋งŒ์•ฝ ๋ชจ๋ธ์ด ๋‹ต๋ณ€์„ ์ƒ์„ฑํ•œ ํ›„์— ๋‹ต๋ณ€์ด ์ถฉ๋ถ„ํ•˜์ง€ ์•Š๋‹ค๊ณ  ํŒ๋‹จํ•˜๋ฉด, ์ถ”๊ฐ€ ๊ฒ€์ƒ‰์„ ์ˆ˜ํ–‰ํ•˜์—ฌ ๋‹ต๋ณ€์„ ๊ฐœ์„ ํ•ฉ๋‹ˆ๋‹ค.

4. Active RAG

  • Active RAG๋Š” LLM์ด ์–ธ์ œ, ์–ด๋””์„œ ์ •๋ณด๋ฅผ ์ถ”๊ฐ€๋กœ ๊ฒ€์ƒ‰ํ• ์ง€ ์Šค์Šค๋กœ ํŒ๋‹จํ•˜๋Š” ๋ฐฉ์‹์ž…๋‹ˆ๋‹ค. ์ด๋Š” ์ดˆ๊ธฐ ์งˆ๋ฌธ์— ๋Œ€ํ•ด ๊ธฐ๋ณธ ์‘๋‹ต์„ ์ƒ์„ฑํ•œ ํ›„, ์‹œ์Šคํ…œ์ด ์ž์œจ์ ์œผ๋กœ ํ›„์† ์งˆ๋ฌธ์„ ์ƒ์„ฑํ•˜๊ฑฐ๋‚˜ ์ถ”๊ฐ€ ๊ฒ€์ƒ‰์„ ํ†ตํ•ด ๋” ๋‚˜์€ ๋‹ต๋ณ€์„ ์ƒ์„ฑํ•˜๋Š” ๋Šฅ๋ ฅ์„ ๊ฐ–์ถ”๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค. Active RAG์˜ ์ฃผ๋œ ๋ชฉํ‘œ๋Š” ์ฒซ ๋ฒˆ์งธ ์ƒ์„ฑ๋œ ๋‹ต๋ณ€์— ๋งŒ์กฑํ•˜์ง€ ์•Š๊ณ , ๋” ๋‚˜์€ ๋‹ต๋ณ€์„ ์œ„ํ•ด ์ ๊ทน์ ์œผ๋กœ ์ถ”๊ฐ€ ์ž‘์—…์„ ์ˆ˜ํ–‰ํ•˜๋Š” ๋ฐ ์žˆ์Šต๋‹ˆ๋‹ค.

์ฃผ์š” ํŠน์ง•:

  1. ๋ฐ˜๋ณต์  ๊ฒ€์ƒ‰: ์ดˆ๊ธฐ ์‘๋‹ต ํ›„ ์ถ”๊ฐ€ ์ •๋ณด๋ฅผ ๊ฒ€์ƒ‰ํ•ฉ๋‹ˆ๋‹ค.
  2. ์งˆ๋ฌธ ์ƒ์„ฑ: ๋ชจ๋ธ์ด ์Šค์Šค๋กœ ํ›„์† ์งˆ๋ฌธ์„ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค.
  3. ์‘๋‹ต ๊ฐœ์„ : ์ƒˆ๋กœ์šด ์ •๋ณด๋ฅผ ๋ฐ”ํƒ•์œผ๋กœ ์ดˆ๊ธฐ ์‘๋‹ต์„ ๊ฐœ์„ ํ•ฉ๋‹ˆ๋‹ค.

Active RAG ์ฝ”๋“œ ์˜ˆ์‹œ

from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# LLM and web search tool
# (reuses the retriever built in the previous example)
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
web_search_tool = TavilySearchResults(k=3)

answer_prompt = ChatPromptTemplate.from_template(
    "Context: {context}\nQuestion: {question}"
)
answer_chain = answer_prompt | llm | StrOutputParser()

# Active RAG routine
def active_rag(question):
    # Initial retrieval
    initial_results = retriever.invoke(question)

    if len(initial_results) < 2:  # if documents are lacking, run a web search
        additional_results = web_search_tool.invoke({"query": question})
        initial_results.extend(additional_results)

    # Generate the final answer
    answer = answer_chain.invoke({
        "context": initial_results,
        "question": question
    })

    return answer

question = "What is LLM task decomposition?"
final_answer = active_rag(question)
print(final_answer)

์ฝ”๋“œ ๋ถ€์—ฐ ์„ค๋ช…:

  • active_rag ํ•จ์ˆ˜๋Š” ์ดˆ๊ธฐ ๊ฒ€์ƒ‰์„ ์ˆ˜ํ–‰ํ•œ ํ›„, ๊ฒ€์ƒ‰๋œ ๋ฌธ์„œ๊ฐ€ ๋ถ€์กฑํ•˜๊ฑฐ๋‚˜ ๊ด€๋ จ์„ฑ์ด ๋‚ฎ์œผ๋ฉด ์›น ๊ฒ€์ƒ‰ ๋„๊ตฌ๋ฅผ ์ด์šฉํ•ด ์ถ”๊ฐ€ ๊ฒ€์ƒ‰์„ ์‹คํ–‰ํ•˜๋Š” ๊ตฌ์กฐ์ž…๋‹ˆ๋‹ค.
  • ์ด๋ฅผ ํ†ตํ•ด ๋” ํ’๋ถ€ํ•œ ๊ฒ€์ƒ‰ ๊ฒฐ๊ณผ๋ฅผ ์–ป์–ด ์ตœ์ข…์ ์œผ๋กœ ๋” ๋‚˜์€ ๋‹ต๋ณ€์„ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค.

5. Adaptive RAG

  • Adaptive RAG๋Š” ์งˆ๋ฌธ์˜ ์„ฑ๊ฒฉ์— ๋”ฐ๋ผ ์ ์ ˆํ•œ ๊ฒ€์ƒ‰ ๋ฐฉ๋ฒ•์„ ๋™์ ์œผ๋กœ ์„ ํƒํ•˜๋Š” ์‹œ์Šคํ…œ์ž…๋‹ˆ๋‹ค. ์ฆ‰, ์งˆ๋ฌธ์ด ๋‹จ์ˆœํ•˜๋ฉด ๊ฐ„๋‹จํ•œ ๋ฒกํ„ฐ ์Šคํ† ์–ด ๊ฒ€์ƒ‰์„ ์‚ฌ์šฉํ•˜๊ณ , ์งˆ๋ฌธ์ด ๋ณต์žกํ•˜๋ฉด ์›น ๊ฒ€์ƒ‰์ด๋‚˜ ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค ์กฐํšŒ๋ฅผ ์„ ํƒํ•˜๋Š” ๋ฐฉ์‹์œผ๋กœ ์งˆ๋ฌธ์˜ ๋ณต์žก์„ฑ๊ณผ ํ•„์š”์— ๋”ฐ๋ผ ์ตœ์ ์˜ ๊ฒ€์ƒ‰ ์ „๋žต์„ ์ ์šฉํ•ฉ๋‹ˆ๋‹ค. ์ด๋ฅผ ํ†ตํ•ด ๋ฆฌ์†Œ์Šค๋ฅผ ํšจ์œจ์ ์œผ๋กœ ์‚ฌ์šฉํ•˜๊ณ , ๋ณต์žกํ•œ ์งˆ๋ฌธ์— ๋Œ€ํ•ด ๋” ์ •ํ™•ํ•˜๊ณ  ์‹ ์†ํ•œ ๋‹ต๋ณ€์„ ์ƒ์„ฑํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  • Adaptive RAG๋Š” ๋‹ค์–‘ํ•œ ๊ฒ€์ƒ‰ ๋ฐฉ๋ฒ•์„ ์ ์ ˆํžˆ ์กฐํ•ฉํ•˜์—ฌ ๋‹ค์–‘ํ•œ ์งˆ๋ฌธ ์œ ํ˜•์— ๋Œ€ํ•ด ๋†’์€ ์„ฑ๋Šฅ์„ ๋ฐœํœ˜ํ•  ์ˆ˜ ์žˆ๋„๋ก ์„ค๊ณ„๋˜์—ˆ์Šต๋‹ˆ๋‹ค.

์ฃผ์š” ํŠน์ง•:

  1. ๋™์  ์ „๋žต ์„ ํƒ: ์งˆ๋ฌธ์— ๋”ฐ๋ผ ๋‹ค๋ฅธ ๊ฒ€์ƒ‰ ๋ฐฉ๋ฒ•์„ ์„ ํƒํ•ฉ๋‹ˆ๋‹ค.
  2. ๋ณต์žก์„ฑ ํ‰๊ฐ€: ์งˆ๋ฌธ์˜ ๋ณต์žก์„ฑ์„ ๋ถ„์„ํ•˜์—ฌ ์ ์ ˆํ•œ ์ ‘๊ทผ ๋ฐฉ์‹์„ ๊ฒฐ์ •ํ•ฉ๋‹ˆ๋‹ค.
  3. ๋ฆฌ์†Œ์Šค ์ตœ์ ํ™”: ๊ฐ„๋‹จํ•œ ์งˆ๋ฌธ์—๋Š” ๊ฐ„๋‹จํ•œ ๊ฒ€์ƒ‰, ๋ณต์žกํ•œ ์งˆ๋ฌธ์—๋Š” ๋” ์‹ฌ์ธต์ ์ธ ๊ฒ€์ƒ‰์„ ์ˆ˜ํ–‰ํ•ฉ๋‹ˆ๋‹ค.

Adaptive RAG ์ฝ”๋“œ ์˜ˆ์‹œ

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Vector store retriever
# (reuses the vectorstore and web_search_tool from the previous examples)
vector_retriever = vectorstore.as_retriever()

# LLM setup
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
answer_prompt = ChatPromptTemplate.from_template(
    "Context: {context}\nQuestion: {question}"
)
answer_chain = answer_prompt | llm | StrOutputParser()

# Question-routing function
def adaptive_rag(question):
    # Analyze and route the question (a simple keyword heuristic)
    if "LLM" in question:
        results = vector_retriever.invoke(question)
    else:
        results = web_search_tool.invoke({"query": question})

    # Generate the answer
    answer = answer_chain.invoke({
        "context": results,
        "question": question
    })
    return answer

question = "What are the types of LLM agents?"
adaptive_answer = adaptive_rag(question)
print(adaptive_answer)

์ฝ”๋“œ ๋ถ€์—ฐ ์„ค๋ช…:

  • adaptive_rag ํ•จ์ˆ˜๋Š” ์งˆ๋ฌธ์— ๋”ฐ๋ผ ๋ฒกํ„ฐ ์Šคํ† ์–ด ๊ฒ€์ƒ‰ ๋˜๋Š” ์›น ๊ฒ€์ƒ‰์„ ์„ ํƒํ•˜๋ฉฐ, ๊ทธ ๊ฒฐ๊ณผ๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ LLM์ด ์ตœ์ข… ๋‹ต๋ณ€์„ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค.
  • ์ด๋ฅผ ํ†ตํ•ด ์งˆ๋ฌธ์˜ ๋ณต์žก๋„๋‚˜ ์„ฑ๊ฒฉ์— ๋”ฐ๋ผ ๊ฒ€์ƒ‰ ์ „๋žต์„ ๋™์ ์œผ๋กœ ์กฐ์ •ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

