https://arxiv.org/abs/2511.06411

SofT-GRPO: Surpassing Discrete-Token LLM Reinforcement Learning via Gumbel-Reparameterized Soft-Thinking Policy Optimization

The soft-thinking paradigm for Large Language Model (LLM) reasoning can outperform the conventional discrete-token Chain-of-Thought (CoT) reasoning in some scenarios, underscoring its research and application value. However, while the discre..
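For context on the two ideas the title combines: "soft thinking" feeds a probability-weighted mixture of token embeddings back into the model instead of a single sampled token, and the Gumbel-softmax trick makes that mixture a reparameterized (hence differentiable) sample, which is what allows policy-gradient training like GRPO over soft tokens. Below is a minimal illustrative sketch of that mechanism, assuming a PyTorch-style `logits` vector and `embedding` matrix; the function name, shapes, and temperature handling are my assumptions for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def gumbel_soft_token(logits: torch.Tensor,
                      embedding: torch.Tensor,
                      tau: float = 1.0) -> torch.Tensor:
    """Sample a differentiable 'soft token' embedding (illustrative sketch).

    logits:    (vocab_size,) next-token logits from the LM head
    embedding: (vocab_size, hidden_dim) token embedding matrix
    tau:       softmax temperature; lower values approach a one-hot sample
    """
    # Gumbel(0, 1) noise via the inverse-CDF trick: g = -log(-log(u)).
    u = torch.rand_like(logits)
    g = -torch.log(-torch.log(u.clamp_min(1e-20)).clamp_min(1e-20))

    # Reparameterized sample: stochastic, yet differentiable w.r.t. logits,
    # so gradients from an RL objective can flow through the sampling step.
    probs = F.softmax((logits + g) / tau, dim=-1)

    # Soft token = probability-weighted mixture of token embeddings,
    # fed back as the next input instead of one discrete token id.
    return probs @ embedding

# Usage sketch with toy shapes (vocab_size=8, hidden_dim=4):
logits = torch.randn(8, requires_grad=True)
embedding = torch.randn(8, 4)
soft_tok = gumbel_soft_token(logits, embedding, tau=0.5)
soft_tok.sum().backward()  # gradients reach `logits` through the sample
```

The design point is that an ordinary discrete sample (`torch.multinomial`) blocks gradients, forcing high-variance score-function estimators; the Gumbel reparameterization moves the randomness into the noise term so the sampled soft token stays on a differentiable path from the policy's logits.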