- "Probing to Refine: Reinforcement Distillation of LLM Reasoners via..." OpenReview. https://openreview.net/forum?id=rkIw2GqYEt
- Akash Gupta, Ivaxi Sheth, Vyas Raina, Mark Gales, Mario Fritz. "LLM Task Interference: An Initial Study on the Impact of Task-Switch in Conversational History." Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. https://aclanthology.org/2024.emnlp-main.811/