
Artificial Intelligence (767)

Improving Factuality and Reasoning in Language Models through Multiagent Debate - Paper Review

https://arxiv.org/abs/2305.14325
Improving Factuality and Reasoning in Language Models through Multiagent Debate
Large language models (LLMs) have demonstrated remarkable capabilities in language generation, understanding, and few-shot learning in recent years. An extensive body of work has explored how their performance may be further improved through the tools of p... (arxiv.org)
This is an Agent paper! Among them, in particular, ..
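The card only hints at the mechanism, so here is a minimal Python sketch of the debate loop as the abstract describes it: several agents answer independently, then each revises its answer after reading the others'. The `ask_llm` helper, the agent/round counts, and the prompt wording are my own placeholder assumptions, not the paper's exact protocol.

```python
# Minimal multiagent-debate sketch, assuming a hypothetical ask_llm(prompt) helper
# that wraps whatever chat-completion API you use. Not the paper's exact protocol.

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your own LLM API call here")

def multiagent_debate(question: str, n_agents: int = 3, n_rounds: int = 2) -> list[str]:
    # Round 0: each agent answers independently.
    answers = [ask_llm(f"Question: {question}\nGive your answer and reasoning.")
               for _ in range(n_agents)]
    # Later rounds: each agent sees the others' answers and may revise its own.
    for _ in range(n_rounds):
        new_answers = []
        for i in range(n_agents):
            others = "\n\n".join(a for j, a in enumerate(answers) if j != i)
            prompt = (f"Question: {question}\n"
                      f"Other agents answered:\n{others}\n"
                      f"Considering their reasoning, give your updated answer.")
            new_answers.append(ask_llm(prompt))
        answers = new_answers
    return answers  # aggregate outside, e.g. by majority vote
```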

Breaking Down Few-Shot, CoT (Chain-of-Thought), and ReAct One by One

https://arxiv.org/abs/2005.14165
Language Models are Few-Shot Learners
Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fi... (arxiv.org)
https://arxiv.org/abs/2201.11903
Chain-of-Thought Prompting Eli..
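The post title mentions ReAct, but the preview only shows the few-shot abstract, so here is a rough sketch of a ReAct-style Thought/Action/Observation loop. The `ask_llm` helper, the regex-based action parsing, and the toy calculator tool are all my own assumptions, not the actual prompt format used in the ReAct paper.

```python
# Rough ReAct-style loop (Thought -> Action -> Observation), assuming a hypothetical
# ask_llm(prompt) helper and a toy tool registry.
import re

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your own LLM API call here")

TOOLS = {"calculator": lambda expr: str(eval(expr))}  # toy example tool

def react(question: str, max_steps: int = 5) -> str:
    trace = f"Question: {question}\n"
    for _ in range(max_steps):
        step = ask_llm(trace + "Thought:")           # model writes a Thought and maybe an Action
        trace += f"Thought:{step}\n"
        match = re.search(r"Action:\s*(\w+)\[(.*)\]", step)
        if not match:                                # no tool call: treat as the final answer
            return step
        tool, arg = match.groups()
        observation = TOOLS.get(tool, lambda _: "unknown tool")(arg)
        trace += f"Observation: {observation}\n"     # feed the tool result back to the model
    return trace
```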

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models - Paper Review

https://arxiv.org/abs/2201.11903
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
We explore how generating a chain of thought -- a series of intermediate reasoning steps -- significantly improves the ability of large language models to perform complex reasoning. In particular, we show how such reasoning abilities emerge naturally in su... (arxiv.org)
This feels like an extension of the paper I looked at just before. Few-Sh..
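As a small illustration of the idea in the abstract (intermediate reasoning steps written out in the prompt), here is a minimal chain-of-thought prompt sketch. The worked tennis-ball exemplar comes from the CoT paper; the `ask_llm` wrapper is a hypothetical stand-in for whatever API you use.

```python
# Minimal chain-of-thought prompting sketch: one exemplar with its reasoning steps
# written out, followed by the new question. ask_llm is a hypothetical API wrapper.

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your own LLM API call here")

COT_PROMPT = """\
Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11. The answer is 11.

Q: {question}
A:"""

def cot_answer(question: str) -> str:
    # The model is nudged to produce intermediate steps before the final answer.
    return ask_llm(COT_PROMPT.format(question=question))
```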

Language Models are Few-Shot Learners - Paper Review

https://arxiv.org/abs/2005.14165
Language Models are Few-Shot Learners
Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fi... (arxiv.org)
Few-Shot can be explained clearly with this figure: without changing any parameters, with just a few examples in the prompt..
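To make the "a few examples in the prompt, no parameter updates" point concrete, here is a tiny few-shot prompting sketch built around the English-to-French example shown in the GPT-3 paper's figure; `ask_llm` is again a hypothetical stand-in for your API call.

```python
# Few-shot prompting in the GPT-3 sense: task demonstrations go into the prompt,
# and the model's parameters are never updated. ask_llm is a hypothetical wrapper.

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your own LLM API call here")

EXAMPLES = [("sea otter", "loutre de mer"), ("cheese", "fromage")]  # toy English->French pairs

def few_shot_translate(word: str) -> str:
    shots = "\n".join(f"English: {en}\nFrench: {fr}" for en, fr in EXAMPLES)
    return ask_llm(f"Translate English to French.\n{shots}\nEnglish: {word}\nFrench:")
```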

Reflexion: Language Agents with Verbal Reinforcement Learning - Paper Review

https://arxiv.org/abs/2303.11366
Reflexion: Language Agents with Verbal Reinforcement Learning
Large language models (LLMs) have been increasingly used to interact with external environments (e.g., games, compilers, APIs) as goal-driven agents. However, it remains challenging for these language agents to quickly and efficiently learn from trial-and-... (arxiv.org)
The LLM keeps learning through continued self-reflection, so that, as in reinforcement learning, ..
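A rough sketch of the Reflexion-style loop the abstract describes: act, get task feedback, write a verbal self-reflection, and retry with that reflection kept in memory instead of doing a gradient update. `ask_llm` and `evaluate` are hypothetical stand-ins, not the paper's actual components.

```python
# Reflexion-style trial loop sketch: verbal reflections accumulate in memory
# across retries, standing in for the parameter updates of ordinary RL.

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your own LLM API call here")

def evaluate(answer: str) -> tuple[bool, str]:
    raise NotImplementedError("task-specific check, e.g. unit tests or a heuristic")

def reflexion(task: str, max_trials: int = 3) -> str:
    memory: list[str] = []  # verbal reflections accumulated across trials
    answer = ""
    for _ in range(max_trials):
        context = "\n".join(memory)
        answer = ask_llm(f"{task}\nPast reflections:\n{context}\nAnswer:")
        ok, feedback = evaluate(answer)
        if ok:
            return answer
        # Instead of a gradient update, store a natural-language lesson.
        memory.append(ask_llm(
            f"Your answer failed: {feedback}\n"
            f"Write a short reflection on what to do differently next time."))
    return answer
```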

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning - Paper Review

https://arxiv.org/abs/2501.12948
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates remarkable reasonin... (arxiv.org)
2025.02.02 - [인공지..

DeepSeek-V3 Technical Report - Paper Review

https://arxiv.org/abs/2412.19437
DeepSeek-V3 Technical Report
We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and Deep... (arxiv.org)
This is the model everyone is talking about... I thought it should have drawn attention back when this paper came out, but it only became a hot topic much later, once the R1 model was released..
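To illustrate what "671B total parameters with 37B activated for each token" means mechanically, here is a toy top-k mixture-of-experts routing layer in PyTorch. It only shows the routing idea; DeepSeek's actual MoE (and MLA) design is far more involved, and none of the names or sizes below come from the paper.

```python
# Toy top-k mixture-of-experts routing: each token is sent to only top_k of the
# n_experts sub-networks, so only a fraction of the parameters runs per token.
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, d_model: int = 64, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(n_experts))
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (tokens, d_model)
        scores = self.router(x)                            # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)     # pick top_k experts per token
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):                        # only the selected experts run
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * expert(x[mask])
        return out

# e.g. y = TinyMoE()(torch.randn(10, 64))  # 10 tokens, 2 of 8 experts used per token
```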

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model - Paper Review

https://arxiv.org/abs/2405.04434
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K toke... (arxiv.org)
The biggest topic of conversation these days, Deep..

MindAgent: Emergent Gaming Interaction - Paper Review

https://arxiv.org/abs/2309.09971
MindAgent: Emergent Gaming Interaction
Large Language Models (LLMs) have the capacity of performing complex scheduling in a multi-agent system and can coordinate these agents into completing sophisticated tasks that require extensive collaboration. However, despite the introduction of numerous... (arxiv.org)
The MINDAGENT paper systematically evaluates the multi-agent collaboration and planning capabilities of large language models (LLMs)..

The Ability of Large Language Models to Evaluate Constraint-satisfaction in Agent Responses to Open-ended Requests - Paper Review

https://arxiv.org/abs/2409.14371
The Ability of Large Language Models to Evaluate Constraint-satisfaction in Agent Responses to Open-ended Requests
Generative AI agents are often expected to respond to complex user requests that have No One Right Answer (NORA), e.g., "design a vegetarian meal plan below 1800 calories". Such requests may entail a set of constraints that the agent should adhere to... (arxiv.org)
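Here is a hedged sketch of one way such constraint checking could look: decompose the NORA request into explicit constraints and ask a judge model about each one. This is my own illustration, not the paper's evaluation setup; `ask_llm` is a hypothetical API wrapper.

```python
# Sketch of LLM-as-judge constraint checking for a NORA request: each constraint
# is verified separately against the agent's response. Hypothetical, not the paper's setup.

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your own LLM API call here")

def check_constraints(request: str, response: str, constraints: list[str]) -> dict[str, bool]:
    results = {}
    for c in constraints:
        verdict = ask_llm(
            f"Request: {request}\nResponse: {response}\n"
            f"Does the response satisfy this constraint: '{c}'? Answer YES or NO.")
        results[c] = verdict.strip().upper().startswith("YES")
    return results

# e.g. check_constraints("design a vegetarian meal plan below 1800 calories", plan_text,
#                        ["all meals are vegetarian", "total calories are below 1800"])
```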
