The Ultimate Gal Explainer! ✨ Weak → Strong! LLM Prompt Mega-Mod Magic, For Real!?

Published: 2025/8/22 18:33:06

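1. Super Summary

A teeny model (the teacher) levels up the prompts (instructions) given to a huge model (the student), like magic 🧙‍♀️! Power up your LLM super fast and on the cheap!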

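2. Gal-Style Sparkle Points ✨

● Steer a big model with a small one! Unbeatable bang for the buck 🔥
● No prompt-guru skills needed! A simple design anyone can use 💖
● Works with closed-source models too! Super versatile 😎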

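3. Detailed Breakdown

Background

LLMs (large language models) are super amazing, but boosting their performance costs money and effort 🥺. Fine-tuning is a hassle, and with closed models (ones that aren't publicly released) you can't even do it in the first place, right? 😩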

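Method

Enter WST (Weak-to-Strong Transfer) ✨ It uses a small model (the teacher) to automatically write prompts for a big model (the student)! With reinforcement learning (a way of getting better and better from feedback), the teacher apparently ends up producing seriously good prompts! 😍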
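To make the teacher → student → reward loop a bit more concrete, here is a tiny toy sketch. It is not the paper's implementation. Everything in it is an assumption for illustration: the "Teacher" is just a softmax policy over three made-up candidate instructions, the "Student" is a stub with invented success rates, and the update rule is plain REINFORCE with a running baseline, swapped in as a stand-in for whatever RL method WST actually uses.

```python
import math
import random

# Hypothetical candidate instructions (made up for this sketch).
CANDIDATES = [
    "Answer immediately.",
    "Think step by step, then give the final answer.",
    "Reply with one word.",
]

# Stand-in for the big Student model: chance that it answers correctly
# under each instruction. These numbers are invented, not measured.
STUDENT_SUCCESS_RATE = [0.3, 0.8, 0.1]


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def student_reward(instruction_idx, rng):
    """Return 1.0 if the stub Student 'solves' the task, else 0.0."""
    return 1.0 if rng.random() < STUDENT_SUCCESS_RATE[instruction_idx] else 0.0


def train_teacher(steps=2000, lr=0.1, seed=0):
    """Toy Teacher: learns which instruction earns the Student more reward."""
    rng = random.Random(seed)
    logits = [0.0] * len(CANDIDATES)
    baseline = 0.0  # running average reward, used to reduce variance
    for _ in range(steps):
        probs = softmax(logits)
        # Teacher samples an instruction from its current policy.
        idx = rng.choices(range(len(CANDIDATES)), weights=probs)[0]
        # Student tries the task with that instruction; outcome is the reward.
        reward = student_reward(idx, rng)
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # REINFORCE: d log pi(idx) / d logit_k = 1[k == idx] - probs[k]
        for k in range(len(logits)):
            grad = (1.0 if k == idx else 0.0) - probs[k]
            logits[k] += lr * advantage * grad
    return softmax(logits)


if __name__ == "__main__":
    for text, p in zip(CANDIDATES, train_teacher()):
        print(f"{p:.2f}  {text}")
```

The point of the toy: the Teacher never touches the Student's weights. It only sees whether the Student did well, which is why this kind of setup can still work when the big model is closed-source.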

WST: Weak-to-Strong Knowledge Transfer via Reinforcement Learning

Haosen Ge / Shuo Li / Lianghuan Huang

Effective prompt engineering remains a challenging task for many applications. We introduce Weak-to-Strong Transfer (WST), an automatic prompt engineering framework where a small "Teacher" model generates instructions that enhance the performance of a much larger "Student" model. Unlike prior work, WST requires only a weak teacher, making it efficient and broadly applicable in settings where large models are closed-source or difficult to fine-tune. Using reinforcement learning, the Teacher Model's instructions are iteratively improved based on the Student Model's outcomes, yielding substantial gains across reasoning (MATH-500, GSM8K) and alignment (HH-RLHF) benchmarks - 98% on MATH-500 and 134% on HH-RLHF - and surpassing baselines such as GPT-4o-mini and Llama-70B. These results demonstrate that small models can reliably scaffold larger ones, unlocking latent capabilities while avoiding misleading prompts that stronger teachers may introduce, establishing WST as a scalable solution for efficient and safe LLM prompt refinement.

cs / cs.LG / cs.AI