LLM Block Diffusion Paper Review = Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models


이게될까 2025. 3. 14. 14:50


Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Diffusion language models offer unique benefits over autoregressive models due to their potential for parallelized generation and controllability, yet they lag in likelihood modeling and are limited to fixed-length generation. In this work, we introduce a

arxiv.org

I will fill in this review gradually as time permits.

https://m-arriola.com/bd3lms/


Existing diffusion language models produce the whole output at once, so it seemed inevitable that they would struggle to capture dependencies between tokens.
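To make the dependency concern concrete, here is a minimal sketch in Python. It assumes a made-up joint distribution over two-token phrases and compares it with what you get when every position is sampled independently from its own marginal, which is roughly what a single fully parallel prediction step does. The phrases, probabilities, and function names are illustrative assumptions, not taken from the paper.

```python
from itertools import product

# Toy joint distribution over two-token phrases (made-up numbers for illustration).
joint = {("New", "York"): 0.5, ("Los", "Angeles"): 0.5}

def marginals(joint):
    """Per-position token marginals of a joint distribution over sequences."""
    length = len(next(iter(joint)))
    out = [{} for _ in range(length)]
    for seq, p in joint.items():
        for i, tok in enumerate(seq):
            out[i][tok] = out[i].get(tok, 0.0) + p
    return out

def independent_product(margs):
    """Distribution obtained when every position is sampled independently."""
    dist = {}
    for combo in product(*(m.items() for m in margs)):
        seq = tuple(tok for tok, _ in combo)
        p = 1.0
        for _, q in combo:
            p *= q
        dist[seq] = p
    return dist

parallel = independent_product(marginals(joint))
# Probability mass placed on phrases that never occur under the joint.
invalid_mass = sum(p for seq, p in parallel.items() if seq not in joint)
print(parallel)
print("probability of an incoherent phrase:", invalid_mass)
```

In this toy case half of the probability mass lands on phrases such as "New Angeles" that the joint distribution never produces. Real diffusion models reduce this effect by denoising over many steps, so the sketch only illustrates the source of the problem.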

2025.02.19 - [AI / Paper Reviews or In Progress] - LLM Diffusion Paper Review - Large Language Diffusion Models


https://arxiv.org/abs/2502.09992 Large Language Diffusion Models: Autoregressive models (ARMs) are widely regarded as the cornerstone of large language models (LLMs). We challenge this notion by introducing LLaDA, a diffusion model trained from scratch unde

yoonschallenge.tistory.com

So when I saw this paper, I could not pass it by without writing it up.

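Existing diffusion models were limited to fixed-length generation, could not use KV caching, and did not produce particularly good output, so a more clearly defined diffusion model was needed.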

Block Diffusion was proposed to address these issues, and it shows a clear performance difference compared with earlier diffusion models.

Autoregression: ✅ High quality ✅ Arbitrary-length ✅ KV caching ❌ Not parallelizable

Diffusion: ❌ Lower quality ❌ Fixed-length ❌ No KV caching ✅ Parallelizable

Block Diffusion: ✅ High quality ✅ Arbitrary-length ✅ KV caching ✅ Parallelizable
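The comparison above can be illustrated with a toy sampler, given here as a rough sketch rather than the paper's implementation. It assumes blocks are generated one after another (autoregressive across blocks) while the tokens inside a block are unmasked over a few parallel denoising steps. `toy_denoiser`, `VOCAB`, `MASK`, and all parameters are hypothetical, and the denoiser is a random stand-in for a trained network.

```python
import random

MASK = "<mask>"
VOCAB = ["a", "b", "c", "d"]  # hypothetical toy vocabulary

def toy_denoiser(context, block, rng):
    """Stand-in for a trained network: proposes a token for every masked slot.

    A real model would condition on `context` (served from a KV cache) and on
    the partially denoised `block`; here the proposals are simply random.
    """
    return [rng.choice(VOCAB) if tok == MASK else tok for tok in block]

def sample_block(context, block_size, num_steps, rng):
    """Denoise one block: start fully masked, reveal some positions per step."""
    block = [MASK] * block_size
    for step in range(num_steps):
        proposal = toy_denoiser(context, block, rng)  # all positions at once
        masked = [i for i, tok in enumerate(block) if tok == MASK]
        steps_left = num_steps - step
        # Reveal an even share of the remaining masked positions (ceiling division).
        k = -(-len(masked) // steps_left)
        for i in rng.sample(masked, k):
            block[i] = proposal[i]
    return block

def generate(num_blocks, block_size=4, num_steps=2, seed=0):
    """Generate `num_blocks` blocks left to right, reusing finished blocks as context."""
    rng = random.Random(seed)
    context = []  # finished blocks; a real model would cache their keys/values
    for _ in range(num_blocks):
        context.extend(sample_block(context, block_size, num_steps, rng))
    return context

tokens = generate(num_blocks=3)
print(tokens)
```

In this sketch the output length is set by `num_blocks` at sampling time rather than fixed in advance, finished blocks are never revisited (which is what would make KV caching possible), and each denoising step proposes tokens for every masked position of the current block in parallel.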

License: Attribution, NonCommercial
