https://arxiv.org/abs/2305.02301
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes

Deploying large language models (LLMs) is challenging because they are memory-inefficient and compute-intensive for practical applications. In response, researchers train smaller task-specific models by either finetuning them on human labels or distilling them with LLM-generated labels.
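The paper's core mechanism is to extract chain-of-thought rationales from the LLM and use them as extra supervision, training the small model to both predict the label and reproduce the rationale. Below is a minimal sketch of that multi-task objective, assuming a T5 student (as in the paper); the `[label]`/`[rationale]` task prefixes, the `rationale_weight` value, and the helper names are illustrative assumptions, not the authors' exact setup.

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

def distill_step_by_step_loss(question, label, rationale, rationale_weight=1.0):
    """Multi-task loss: label prediction + rationale generation.
    `rationale_weight` (the paper's lambda) balances the two terms;
    its value here is an illustrative assumption."""
    def seq2seq_loss(prefix, target):
        # Task prefix tells the student which output to produce.
        inputs = tokenizer(prefix + question, return_tensors="pt")
        targets = tokenizer(target, return_tensors="pt").input_ids
        return model(**inputs, labels=targets).loss

    label_loss = seq2seq_loss("[label] ", label)              # predict the answer
    rationale_loss = seq2seq_loss("[rationale] ", rationale)  # mimic the LLM's rationale
    return label_loss + rationale_weight * rationale_loss

# One training step on a single triple; in the paper, the rationale
# comes from prompting a much larger LLM with chain-of-thought.
loss = distill_step_by_step_loss(
    "If there are 3 cars and each car has 4 wheels, how many wheels are there?",
    "12",
    "Each of the 3 cars has 4 wheels, so 3 * 4 = 12 wheels.",
)
loss.backward()

At inference time only the `[label]` prefix is used, so the rationale branch adds no serving cost; it acts purely as a richer training signal, which is how the small student can match or beat the LLM with less data.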