RepLLaMA - Fine-Tuning LLaMA for Multi-Stage Text Retrieval


2026-02-22 paper / ai-ml / information-retrieval / retriever

Cover illustration for "RepLLaMA - Fine-Tuning LLaMA for Multi-Stage Text Retrieval", from Jeffrey Kim's SecondBrain build log

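Goal of the paper

Can an embedding model or a reranker be trained on top of an LLM?
LLMs are highly capable, so the aim is to use one as a reranker or as an embedding model.

How did they do it?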

Retriever

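Previously, embeddings were mostly produced by bi-encoders built on bidirectional models such as [[BERT]], typically by taking the representation of the [CLS] token as the embedding vector.
An LLM, however, is unidirectional: it is an autoregressive model (ARM), so each token attends only to the tokens before it. How, then, should the embedding be produced?
The answer is to always append an end-of-sequence token, </s>, as the last token of the input.
The final-layer representation of this EOS token is used directly as the embedding vector.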
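
The last-token pooling described above can be sketched in plain Python. This is a minimal illustration, not the paper's implementation: the nested-list layout, the function name `last_token_pool`, the right-padding convention, and the L2 normalization step are all assumptions made for the example. Because attention is causal, the last real position is the only one that has seen the whole input, which is why its vector is the one that gets pooled.

```python
import math

def last_token_pool(hidden_states, attention_mask):
    """Return the L2-normalized final-layer vector of each sequence's last real token.

    hidden_states: [batch][seq_len][dim] final-layer outputs (nested lists).
    attention_mask: [batch][seq_len], 1 for real tokens and 0 for padding.
    Assumes right padding, so the last real token is the appended </s>.
    """
    embeddings = []
    for states, mask in zip(hidden_states, attention_mask):
        last = sum(mask) - 1  # index of the last non-padding token
        vec = states[last]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0  # avoid dividing by zero
        embeddings.append([x / norm for x in vec])
    return embeddings

# Toy batch: two sequences, hidden size 2, the second one padded by one position.
hidden = [
    [[1.0, 0.0], [0.0, 2.0], [3.0, 4.0]],
    [[1.0, 1.0], [0.0, 5.0], [9.0, 9.0]],
]
mask = [[1, 1, 1], [1, 1, 0]]
embeddings = last_token_pool(hidden, mask)
print(embeddings)
```

With left padding the last real token would instead sit at the final index of every row, so the index computation depends on the tokenizer's padding side.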

The training loss is InfoNCE, one of the contrastive loss functions.

$$\mathcal{L}(Q, D^+, D_N) = -\log \frac{e^{\mathrm{Sim}(Q, D^+)}}{e^{\mathrm{Sim}(Q, D^+)} + \sum_{D_i^- \in D_N} e^{\mathrm{Sim}(Q, D_i^-)}}$$

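Here, similarity scores are computed between the query $Q$ and its positive passage $D^+$, and between $Q$ and each passage in the set of negative passages $D_N$. Training pushes the similarity with the positive passage up and the similarity with the negative passages down. The negatives include both hard negatives, passages that an existing retriever ranks highly for the query but that are not labeled relevant, and in-batch negatives, passages that simply sit in the same batch without being the positive for this query.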
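
As a worked instance of the loss, the sketch below evaluates InfoNCE for one query with one hard negative and one in-batch negative. It assumes $\mathrm{Sim}$ is a plain dot product and leaves out any temperature scaling; the helper names `dot` and `info_nce` and the toy vectors are made up for illustration.

```python
import math

def dot(u, v):
    """Dot-product similarity between two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def info_nce(sim_pos, sim_negs):
    """-log( e^{s+} / (e^{s+} + sum_i e^{s_i^-}) ), computed via log-sum-exp."""
    scores = [sim_pos] + list(sim_negs)
    m = max(scores)  # subtract the max so exp() cannot overflow
    log_denominator = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_denominator - sim_pos

query = [1.0, 0.0]
positive = [1.0, 0.0]
hard_negative = [0.8, 0.6]      # close to the query, but not relevant
in_batch_negative = [0.0, 1.0]  # a passage belonging to another query in the batch

sim_pos = dot(query, positive)
sim_negs = [dot(query, hard_negative), dot(query, in_batch_negative)]
loss = info_nce(sim_pos, sim_negs)
print(loss)
```

The loss is equivalent to softmax cross-entropy with the positive passage as the target class, so raising the positive score or lowering any negative score reduces it.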

Reranker

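Like the retriever, the reranker uses the representation of the final EOS token. The query and the document are fed in together, and an additional linear layer projects that representation to a single scalar score expressing how relevant the document is to the query.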
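
The scoring head can be sketched as follows. The input template in `build_rerank_input`, the function names, and the toy weights are assumptions made for illustration; in a real model the last-token state would come from the decoder and the weights would be learned. The head here returns a raw scalar, and any squashing into a fixed range is left out of the sketch.

```python
def build_rerank_input(query, document):
    # Hypothetical template: the query and document are concatenated into a single
    # sequence that ends with the end-of-sequence token.
    return f"query: {query} document: {document}</s>"

def relevance_score(last_token_state, weight, bias=0.0):
    """Linear head: project the final-layer </s> representation to one scalar."""
    return sum(w * h for w, h in zip(weight, last_token_state)) + bias

# Toy values standing in for the decoder's output and the learned head.
last_token_state = [0.5, -1.0, 2.0]
weight = [1.0, 0.5, 0.25]
score = relevance_score(last_token_state, weight, bias=0.1)
text = build_rerank_input("what is llama", "LLaMA is a family of language models.")
print(text)
print(score)
```

Unlike the retriever, which encodes queries and passages separately, this setup needs one forward pass per query-document pair, which is why it is applied only to a short list of first-stage candidates.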

The difference lies in the negatives. In the paper, the reranker is also optimized with a contrastive loss, but without in-batch negatives: it is trained only on positive passages and hard-negative passages.
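
One way to assemble such a training group is sketched below. It assumes the hard negatives are taken from a first-stage retriever's ranked list; the function name `build_training_group`, the passage IDs, and the "take the top non-relevant results" rule are hypothetical simplifications, not the paper's exact sampling procedure.

```python
def build_training_group(positive_ids, retrieved_ids, num_negatives):
    """Pair one positive with hard negatives taken from a retriever's ranked list.

    positive_ids: passages labeled relevant to the query.
    retrieved_ids: first-stage results, best first (hypothetical input).
    No in-batch negatives are added: only the positive and hard negatives are used.
    """
    positives = set(positive_ids)
    # Highly ranked passages that are not labeled relevant act as hard negatives.
    hard_negatives = [pid for pid in retrieved_ids if pid not in positives]
    return {
        "positive": positive_ids[0],
        "hard_negatives": hard_negatives[:num_negatives],
    }

group = build_training_group(
    positive_ids=["p7"],
    retrieved_ids=["p3", "p7", "p1", "p9"],
    num_negatives=2,
)
print(group)
```

Drawing the negatives from the same retriever that feeds the reranker at inference time keeps the training distribution close to the candidates it will actually have to reorder.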