
[LLM] Base Model, Instruct Model, and the Chat Template

์œฐ๊ฐฑ 2025. 3. 27. 14:35

# Base Model, Instruct Model

 

Base Model: a model that has only been pre-trained, with the sole objective of predicting the next token

Instruct Model: a model that has additionally been fine-tuned to perform tasks for a specific purpose

 

As in the image below,

if nothing extra is attached to the model name, it is a base model, while instruct models have something appended at the end, such as Instruct, it, or chat.

 

Simply put, it is like the difference between GPT and ChatGPT.

ChatGPT itself was originally based on a model called InstructGPT,
and that is exactly this kind of model: one separately trained to generate appropriate responses to the user's input.
Of course, at its core it still just predicts the next token,
but a Base Model that has only been pre-trained literally does not care whether the input is a question or not; it simply continues the text with something plausible. An Instruct Model, by contrast, has been explicitly trained further to generate responses to questions.
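
To make the difference concrete, here is a minimal sketch that feeds the same question to both checkpoints, assuming the meta-llama/Llama-3.2-1B and meta-llama/Llama-3.2-1B-Instruct models that are also used later in this post. The base checkpoint tends to simply keep writing, while the Instruct checkpoint is more likely to respond as an assistant (though, as we will see below, it still needs a proper chat template). Actual continuations will vary from run to run.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

device = "cuda" if torch.cuda.is_available() else "cpu"
prompt = "What is the capital of France?"

# Compare the raw continuations of the base and the Instruct checkpoints
for name in ["meta-llama/Llama-3.2-1B", "meta-llama/Llama-3.2-1B-Instruct"]:
    tok = AutoTokenizer.from_pretrained(name)
    lm = AutoModelForCausalLM.from_pretrained(name).to(device)
    encoded = tok(prompt, return_tensors="pt").to(device)
    out = lm.generate(**encoded, max_new_tokens=20)
    print(name, "->", tok.decode(out[0], skip_special_tokens=True))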

 


# Chat Template

Most of the time, the reason we use an LLM is to get an answer to a question or to have it carry out some task.

So does simply using an Instruct Model solve everything?

Mostly yes, but even when using this model you still need to hand it a properly formatted prompt.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

device = "cuda" if torch.cuda.is_available() else "cpu"
# base_model = "meta-llama/Llama-3.2-1B"
base_model = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model).to(device)

raw_input = "What is Large Language Model?"
encoded_input = tokenizer(raw_input, return_tensors="pt").to(device)

outputs = model.generate(**encoded_input, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))

# <|begin_of_text|>What is Large Language Model? (with examples and applications) 
# A large language model is a type of artificial intelligence (AI) model

 

Looking at the output, at first glance it seems to answer the question, but something is off.

It arbitrarily appends the phrase (with examples and applications) to the question, and then generates an answer to that.

 

๊ทธ ์ด์œ ๋Š”, LLM์˜ ๋™์ž‘ ์›๋ฆฌ๋ฅผ ์‚ดํŽด๋ณด๋ฉด ์ดํ•ดํ•  ์ˆ˜ ์žˆ๋‹ค.

์‚ฌ์šฉ์ž๊ฐ€ LLM์„ ๋„˜๊ฒจ์ฃผ๋Š” ์ž…๋ ฅ์€ ์‚ฌ์‹ค ์žˆ๋Š” ๊ทธ๋Œ€๋กœ ๋“ค์–ด๊ฐ€์ง€ ์•Š๋Š”๋‹ค.

์‚ฌ์šฉ์ž๊ฐ€ ์ž…๋ ฅํ•ด์ค€ ์›๋ณธ ๋ฌธ์žฅ์„ Query๋ผ๊ณ  ํ•˜๋ฉด, ์ด ์ฟผ๋ฆฌ ์•ž๋’ค์— ๋‹ค์–‘ํ•œ ์š”์†Œ๊ฐ€ ๋ถ™์–ด์„œ LLM์œผ๋กœ ์ž…๋ ฅ๋œ๋‹ค.

 

System prompt
The prompt that goes into an LLM is made up of several elements:
- System prompt
- Instructions
- The user's input
{system} You are a helpful AI assistant. {/system}
{instruction} Always answer questions in Korean. {/instruction}
{user} What is a Large Language Model? {/user}
{assistant}


Because the system prompt spells out the guidelines the LLM should follow when generating a response, the model often manages to understand even a sloppily worded request and still respond exactly as intended.


What we type into the LLM is the content that goes between {user} and {/user},
but the full prompt that a chat-trained LLM actually receives during training looks like the above.
Therefore, at inference time we also have to format the input the same way before passing it in.

 

This is what is called a chat template.

Meta provides detailed guides on the prompt template for each version of its Llama models.

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023
Today Date: 23 July 2024

You are a helpful assistant<|eot_id|><|start_header_id|>user<|end_header_id|>

What is the capital of France?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

<|begin_of_text|>: literally marks the beginning of the prompt

<|start_header_id|><|end_header_id|>: mark whose turn it is, i.e. the role (system, user, assistant)

<|eot_id|>: short for end of turn; marks that a given role's turn has ended

 

In the template above, the system prompt gives the LLM a brief instruction and then signals that the system turn is over; next, the user asks what the capital of France is and marks that the question turn has ended.

The LLM that receives this prompt will then answer what the capital of France is and close the assistant turn.
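
To make the format concrete, here is a minimal sketch that builds this exact prompt by hand as a plain Python string and feeds it to the model. The string is copied from Meta's template above; the only assumption is that the Llama-3.2-1B-Instruct tokenizer and model loaded earlier in this post are still in scope.

manual_prompt = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    "You are a helpful assistant<|eot_id|>"
    "<|start_header_id|>user<|end_header_id|>\n\n"
    "What is the capital of France?<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)

# The special tokens are already written into the string,
# so tell the tokenizer not to prepend another <|begin_of_text|>
encoded_input = tokenizer(manual_prompt, add_special_tokens=False, return_tensors="pt").to(device)
outputs = model.generate(**encoded_input, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))  # the model should now answer the question directly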

 

Formatting the prompt to match this template by hand every time would be quite tedious.

The transformers library has a feature that automatically builds the chat template for each model.

messages = [{"role": "user", "content": "What is Large Language Model?"}]

encoded_prompt = tokenizer.apply_chat_template(messages, 
                                               add_generation_prompt=True,
                                               return_tensors="pt").to(device)
                                               
outputs = model.generate(encoded_prompt, max_new_tokens=50)
print(tokenizer.decode(outputs[0]))

"""
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023
Today Date: 01 Jan 2025

<|eot_id|><|start_header_id|>user<|end_header_id|>

What is Large Language Model?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

A Large Language Model (LLM) is a type of artificial intelligence (AI) model that is designed to process and understand human language. It is a type of neural network that uses deep learning techniques to analyze and generate text.

A Large Language Model
"""

 

The response got cut off because max_new_tokens is not very large,

but in any case, you can now see that the model generates only an answer, without tacking anything extra onto the question.

For reference, once the model finishes its response, an <|eot_id|> is generated at the end, as shown below.

Even if max_length is not fully used, once the model judges that it has sufficiently answered the question, it emits the special token that means its turn is over and ends generation there.
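
A quick way to see this is to give the model plenty of room and decode the result both with and without special tokens. The sketch below reuses the encoded_prompt from the code above; whether generation actually stops at <|eot_id|> depends on the checkpoint's generation config listing it as an end-of-sequence token, which the Llama 3.2 Instruct checkpoints typically do.

outputs = model.generate(encoded_prompt, max_new_tokens=256)

print(tokenizer.decode(outputs[0]))                            # ends with <|eot_id|> when the model closes its turn
print(tokenizer.decode(outputs[0], skip_special_tokens=True))  # the same text with the special tokens stripped
print(model.generation_config.eos_token_id)                    # the token id(s) that stop generation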

 

When using an Instruct model, you must always format the input according to its chat template.

The LLM does not receive only the query the user wrote directly; other elements (such as the system prompt) are included as well.

ex.
*It is easy to imagine that only the user's query is passed in:*
User: Which country is Paris the capital of?

*But the actual LLM input is richer, something like this:*
<|system|> You are a kind and logical AI. <|end|>
<|user|> Which country is Paris the capital of? <|end|>
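
To do the equivalent with the model used throughout this post, just add a system-role message to the list passed to apply_chat_template. A minimal sketch (note that the <|system|>/<|end|> tags above are purely illustrative; Llama 3 renders its own <|start_header_id|>-style tokens):

messages = [
    {"role": "system", "content": "You are a kind and logical AI."},
    {"role": "user", "content": "Which country is Paris the capital of?"},
]

encoded_prompt = tokenizer.apply_chat_template(messages,
                                               add_generation_prompt=True,
                                               return_tensors="pt").to(device)
outputs = model.generate(encoded_prompt, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))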