
Artificial Intelligence / Paper Reviews or In Progress (200)

Language Models are Few-Shot Learners - Paper Review

https://arxiv.org/abs/2005.14165
Few-Shot can be explained clearly with this one figure: with no parameter changes at all, just a few examples in the prompt..
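To make the idea concrete, here is a minimal sketch of few-shot prompting in the style of the paper's English-to-French demonstration figure. `call_model` is a hypothetical stand-in for any text-completion API, and the demonstrations are illustrative; the point is that no gradient update happens anywhere.

```python
# Few-shot "learning" lives entirely in the prompt; model weights never change.

EXAMPLES = [  # k in-context demonstrations (here k = 3)
    ("sea otter", "loutre de mer"),
    ("peppermint", "menthe poivrée"),
    ("plush giraffe", "girafe en peluche"),
]

def build_few_shot_prompt(query: str) -> str:
    header = "Translate English to French:\n"
    demos = "".join(f"{en} => {fr}\n" for en, fr in EXAMPLES)
    return header + demos + f"{query} =>"

prompt = build_few_shot_prompt("cheese")
print(prompt)
# call_model(prompt)  # hypothetical API call; no fine-tuning, no parameter updates
```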

Reflexion: Language Agents with Verbal Reinforcement Learning - Paper Review

https://arxiv.org/abs/2303.11366
The LLM keeps learning through sustained self-reflection and, as in reinforcement learning,..
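A minimal sketch of that loop, assuming a hypothetical `llm` callable and an `evaluate` function that returns a pass/fail signal plus textual feedback (e.g. unit tests or an environment reward). The prompts are illustrative, not the paper's exact ones; the key point is that the "policy update" is a verbal reflection appended to memory, not a gradient step.

```python
# Reflexion-style loop: act, evaluate, verbally reflect, retry with the
# reflections in context.

def reflexion(task: str, llm, evaluate, max_trials: int = 4):
    reflections: list[str] = []  # episodic memory of verbal feedback
    for trial in range(max_trials):
        context = "\n".join(reflections)
        attempt = llm(f"{context}\nTask: {task}\nAnswer:")
        ok, feedback = evaluate(attempt)  # external signal, e.g. unit tests
        if ok:
            return attempt
        # Self-reflect: turn the failure signal into a natural-language lesson.
        lesson = llm(
            f"Task: {task}\nAttempt: {attempt}\nFeedback: {feedback}\n"
            "In one sentence, what should be done differently next time?"
        )
        reflections.append(f"Reflection {trial}: {lesson}")
    return attempt  # best effort after exhausting trials
```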

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning - Paper Review

https://arxiv.org/abs/2501.12948
2025.02.02 - [인공지..
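The abstract's key claim is that DeepSeek-R1-Zero reaches strong reasoning through large-scale RL without a preliminary SFT step. The algorithm used is GRPO, whose central trick is a critic-free, group-relative advantage: sample a group of answers per prompt, score them with a rule-based reward, and normalize within the group. A minimal sketch of that advantage computation, with toy rewards:

```python
# Group-relative advantage as in GRPO: no learned value/critic model needed.
import statistics

def group_relative_advantages(rewards: list[float]) -> list[float]:
    mean = statistics.mean(rewards)
    std = statistics.stdev(rewards) or 1.0  # guard against an all-equal group
    return [(r - mean) / std for r in rewards]

# e.g. 4 sampled answers to one math prompt, rewarded 1.0 when the final
# answer is correct (the paper also adds a format reward):
print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```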

DeepSeek-V3 Technical Report - Paper Review

https://arxiv.org/abs/2412.19437
This is the model of the moment... I thought it should have drawn attention back when this paper first came out, but only much later, after the R1 model appeared, did it become a hot..
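As a rough illustration of why only 37B of the 671B total parameters are activated per token, here is a simplified top-k MoE routing sketch in NumPy. This uses plain softmax top-k gating with toy dimensions; DeepSeek-V3's actual router differs in detail (sigmoid affinity scores, shared experts, auxiliary-loss-free load balancing).

```python
# Simplified top-k Mixture-of-Experts layer: only k of n experts run per token.
import numpy as np

def moe_layer(x, gate_w, experts, k=2):
    """x: (d,) token state; gate_w: (d, n_experts); experts: list of (W1, W2)."""
    scores = x @ gate_w                # router logits, one per expert
    top = np.argsort(scores)[-k:]      # indices of the k selected experts
    weights = np.exp(scores[top])
    weights /= weights.sum()           # softmax over the selected experts only
    out = np.zeros_like(x)
    for w, i in zip(weights, top):
        W1, W2 = experts[i]            # the unselected experts never execute
        out += w * (np.maximum(x @ W1, 0) @ W2)
    return out

rng = np.random.default_rng(0)
d, n = 16, 8
x = rng.normal(size=d)
gate = rng.normal(size=(d, n))
experts = [(rng.normal(size=(d, 4 * d)), rng.normal(size=(4 * d, d))) for _ in range(n)]
print(moe_layer(x, gate, experts).shape)  # (16,), computed with 2 of 8 experts
```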

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model - Paper Review

https://arxiv.org/abs/2405.04434
These days the biggest topic of conversation is Deep..
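DeepSeek-V2's headline inference trick is Multi-head Latent Attention (MLA): rather than caching full per-head keys and values, it caches one small latent vector per token and up-projects it at attention time. A toy sketch of the cache saving, with made-up dimensions and ignoring the paper's decoupled RoPE branch:

```python
# MLA in miniature: cache a low-rank latent per token instead of full K/V.
import numpy as np

d_model, n_heads, d_head, d_latent = 1024, 16, 64, 128
rng = np.random.default_rng(0)
W_dkv = rng.normal(size=(d_model, d_latent))           # down-projection (cached side)
W_uk = rng.normal(size=(d_latent, n_heads * d_head))   # up-projection to keys
W_uv = rng.normal(size=(d_latent, n_heads * d_head))   # up-projection to values

h = rng.normal(size=d_model)   # hidden state of one new token
c = h @ W_dkv                  # <-- only this latent (128 floats) goes in the cache
k = (c @ W_uk).reshape(n_heads, d_head)   # per-head keys, rebuilt on the fly
v = (c @ W_uv).reshape(n_heads, d_head)   # per-head values, rebuilt on the fly

full_cache = n_heads * d_head * 2  # floats per token cached by standard MHA (K and V)
print(f"cache per token: MHA {full_cache} vs MLA {d_latent} floats")
```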

MindAgent: Emergent Gaming Interaction - Paper Review

https://arxiv.org/abs/2309.09971
The MINDAGENT paper systematically evaluates the multi-agent collaboration and planning abilities of large language models (LLMs)..

The Ability of Large Language Models to Evaluate Constraint-satisfaction in Agent Responses to Open-ended Requests - Paper Review

https://arxiv.org/abs/2409.14371
Generative AI agents are often expected to respond to complex user requests that have No One Right Answer (NORA), e.g., "design a vegetarian meal plan below 1800 calories". Such requests may entail a set of constraints that the agent should adhere to..
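A minimal sketch of the kind of evaluation the paper studies: decompose a NORA request into explicit constraints, then ask an LLM judge about each one. Here `llm` is a hypothetical text-completion callable and the constraint list is hand-written for illustration; the request itself comes from the abstract.

```python
# Per-constraint LLM-as-judge check for an open-ended (NORA) request.

REQUEST = "Design a vegetarian meal plan below 1800 calories."
CONSTRAINTS = [
    "Every dish in the plan is vegetarian.",
    "The total calories of the plan are below 1800.",
]

def constraint_satisfaction(response: str, llm) -> dict[str, bool]:
    results = {}
    for c in CONSTRAINTS:
        verdict = llm(
            f"Request: {REQUEST}\nResponse: {response}\n"
            f"Constraint: {c}\nIs the constraint satisfied? Answer yes or no."
        )
        results[c] = verdict.strip().lower().startswith("yes")
    return results
```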

Generative Agents: Interactive Simulacra of Human Behavior - Paper Review

https://arxiv.org/abs/2304.03442
Hmm, I am not sure whether the heart of this paper is that the agents define themselves, or whether it is with the other agents that..
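One concrete mechanism from this paper is its memory retrieval: each memory in the agent's stream is ranked by a weighted sum of recency, importance, and relevance. A minimal sketch, using the paper's stated defaults (exponential decay factor 0.995, importance scored 1 to 10 by the LLM, equal weights) but skipping the paper's min-max normalization; the embedding source is left as an assumption.

```python
# Generative-agents-style memory retrieval score:
# score = w_r * recency + w_i * importance + w_v * relevance
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieval_score(memory, query_embedding, now,
                    w_recency=1.0, w_importance=1.0, w_relevance=1.0,
                    decay=0.995):
    # memory: {"last_access": unix_ts, "importance": 1..10, "embedding": [...]}
    hours_since_access = (now - memory["last_access"]) / 3600
    recency = decay ** hours_since_access      # decays per hour since last retrieval
    importance = memory["importance"] / 10     # crude [0, 1] rescaling of the 1-10 score
    relevance = cosine(memory["embedding"], query_embedding)
    return (w_recency * recency + w_importance * importance
            + w_relevance * relevance)
```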

AutoAgents: A Framework for Automatic Agent Generation - Paper Review

https://arxiv.org/abs/2309.17288
This paper points out the limitation that existing agents operate within fixed systems, and that limitation..
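A minimal sketch of what "automatic agent generation" means in practice: a planner LLM drafts a task-specific team first, and each drafted role then becomes an agent. `llm` is a hypothetical callable, and the JSON schema and sequential hand-off are illustrative simplifications, not the paper's exact protocol.

```python
# Dynamic team generation: the cast of agents is produced per task, not predefined.
import json

def generate_agents(task: str, llm) -> list[dict]:
    plan = llm(
        f"Task: {task}\n"
        "Propose the expert roles needed to solve this task as a JSON list of "
        'objects with "name", "goal", and "tools" fields.'
    )
    return json.loads(plan)  # e.g. [{"name": "Researcher", ...}, ...]

def run(task: str, llm):
    agents = generate_agents(task, llm)   # the team itself is LLM-generated
    transcript = []
    for agent in agents:                  # simplified sequential collaboration
        msg = llm(f"You are {agent['name']} with goal: {agent['goal']}.\n"
                  f"Task: {task}\nContext: {transcript}\nYour contribution:")
        transcript.append({"agent": agent["name"], "output": msg})
    return transcript
```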

ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs - Paper Review

https://arxiv.org/abs/2307.16789
This paper organizes APIs and uses GPT to..
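A minimal sketch of a retrieve-then-call tool-use loop in the spirit of ToolLLM: first retrieve candidate APIs for the instruction, then let the model decide which to call and with what arguments. `llm`, `retrieve_apis`, and `call_api` are hypothetical stand-ins; the real ToolBench pipeline additionally searches over call sequences with a depth-first decision tree (DFSDT) rather than this single greedy loop.

```python
# Retrieve candidate APIs, then iterate: model picks an API call or finishes.
import json

def solve(instruction: str, llm, retrieve_apis, call_api, max_steps=5):
    apis = retrieve_apis(instruction, top_k=5)   # dense-retrieved API docs
    history = []
    for _ in range(max_steps):
        decision = json.loads(llm(
            f"Instruction: {instruction}\nAPIs: {apis}\nHistory: {history}\n"
            'Reply as JSON: {"action": <api name or "finish">, "args": {...}}'
        ))
        if decision["action"] == "finish":
            return decision["args"].get("answer")
        result = call_api(decision["action"], **decision["args"])
        history.append({"call": decision, "result": result})
    return None  # gave up within the step budget
```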
