
AI / Paper Reviews or In Progress (199 posts)

The Rise and Potential of Large Language Model Based Agents: A Survey - Paper Review

https://arxiv.org/abs/2309.07864
From the abstract: "For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are artificial entities that sense their environment, make decisions, and take actions."
For starters, this is an 80-page paper... I tried to read it through, but ..
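The abstract's definition of an agent (an entity that senses its environment, makes decisions, and acts) maps onto the classic perceive-decide-act loop. Here is a minimal sketch of that loop — my own illustration, not code from the survey; `query_llm` and the `environment` interface are hypothetical placeholders:

```python
# Minimal perceive-decide-act loop for an LLM-based agent.
# `query_llm` is a hypothetical stand-in for any chat-completion API,
# and `environment` is assumed to expose observe() and step().

def query_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def agent_loop(environment, goal: str, max_steps: int = 10):
    history = []
    for _ in range(max_steps):
        observation = environment.observe()      # perceive
        prompt = (
            f"Goal: {goal}\n"
            f"History: {history}\n"
            f"Observation: {observation}\n"
            "Decide the next action (or reply 'DONE'):"
        )
        action = query_llm(prompt)               # decide (the LLM acts as the 'brain')
        if action.strip() == "DONE":
            break
        result = environment.step(action)        # act on the environment
        history.append((observation, action, result))
    return history
```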

Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders - Paper Review

https://arxiv.org/abs/2407.14435
From the abstract: "Sparse autoencoders (SAEs) are a promising unsupervised approach for identifying causally relevant and interpretable linear features in a language model's (LM) activations."
JumpReLU uses a specific threshold ..
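A rough sketch of that idea (my own illustration, not the paper's implementation): JumpReLU passes a pre-activation through unchanged when it exceeds a learned threshold θ and outputs zero otherwise, whereas ReLU's threshold is fixed at 0. Training θ through the discontinuity requires straight-through estimators, which this sketch omits:

```python
import torch

def jump_relu(z: torch.Tensor, theta: torch.Tensor) -> torch.Tensor:
    """JumpReLU: keep z where z > theta, output 0 elsewhere.

    Unlike ReLU (fixed threshold of 0), theta is a learnable per-feature
    threshold, so small noisy activations below theta are suppressed.
    """
    return z * (z > theta).to(z.dtype)

# Toy feature pre-activations with an assumed threshold of 0.1 per feature.
z = torch.tensor([0.05, 0.40, -0.20, 1.30])
theta = torch.full_like(z, 0.1)
print(jump_relu(z, theta))  # values at or below 0.1 are zeroed out
```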

An X-Ray Is Worth 15 Features: Sparse Autoencoders for Interpretable Radiology Report Generation

https://arxiv.org/abs/2410.03334
From the abstract: "Radiological services are experiencing unprecedented demand, leading to increased interest in automating radiology report generation. Existing Vision-Language Models (VLMs) suffer from hallucinations, lack interpretability, and require expensive fine-tuning."
The images ..
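Going by the title, an X-ray gets distilled into a handful of interpretable SAE features before the report is written. A speculative sketch of that step — every name and shape here is my own assumption, not the paper's code:

```python
import numpy as np

# Speculative sketch: reduce an image embedding to a few sparse, named features.
# W_enc and b_enc are assumed weights of an already-trained sparse autoencoder;
# feature_names is a hypothetical lookup from feature index to a human label.

def active_features(image_embedding, W_enc, b_enc, feature_names, top_k=15):
    acts = np.maximum(image_embedding @ W_enc + b_enc, 0.0)  # ReLU SAE encoder
    top = np.argsort(acts)[::-1][:top_k]                     # strongest features first
    return [(feature_names[i], float(acts[i])) for i in top if acts[i] > 0]
```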

The Geometry of Concepts: Sparse Autoencoder Feature Structure - Paper Review

https://arxiv.org/abs/2410.19750
From the abstract: "Sparse autoencoders have recently produced dictionaries of high-dimensional vectors corresponding to the universe of concepts represented by large language models. We find that this concept universe has interesting structure at three levels: 1) the 'atomic' .."
Atomic level: word relations (e.g., "Austria:Vienna::Switzerland:Bern") ..
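The "atomic" structure refers to analogy parallelograms in feature space: the difference vector Vienna − Austria should roughly equal Bern − Switzerland. A minimal check of that property, using illustrative toy vectors (real feature vectors would come from a trained SAE dictionary):

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical 4-d stand-ins for SAE feature vectors; real ones are high-dimensional.
austria     = np.array([1.0, 0.2, 0.0, 0.1])
vienna      = np.array([1.0, 0.2, 0.9, 0.1])
switzerland = np.array([0.1, 1.0, 0.0, 0.3])
bern        = np.array([0.1, 1.0, 0.9, 0.3])

# In a clean parallelogram the two "capital-of" difference vectors align.
print(cosine(vienna - austria, bern - switzerland))  # ~1.0 here
```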

Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models

https://arxiv.org/abs/2403.19647
From the abstract: "We introduce methods for discovering and applying sparse feature circuits. These are causally implicated subnetworks of human-interpretable features for explaining language model behaviors. Circuits identified in prior work consist of polysemantic and difficult-to-interpret units .."
This paper ..
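"Causally implicated" is typically established by intervening on a feature and measuring the effect on the model's output. A bare-bones version of that test — a sketch under my own assumptions (`model.target_layer`, `sae`, and `target_fn` are hypothetical interfaces), not the paper's attribution method, which uses faster gradient-based approximations:

```python
import torch

def ablation_effect(model, sae, inputs, feature_idx, target_fn):
    """Zero one SAE feature and measure how much the model's output moves.

    target_fn maps model output to a scalar of interest,
    e.g. the log-probability of the answer token.
    """
    base = target_fn(model(inputs))

    def hook(module, inp, out):
        feats = sae.encode(out)
        feats[..., feature_idx] = 0.0      # intervene: ablate the feature
        return sae.decode(feats)           # patch the reconstruction back in

    handle = model.target_layer.register_forward_hook(hook)
    try:
        ablated = target_fn(model(inputs))
    finally:
        handle.remove()
    return base - ablated                  # large effect => causally implicated
```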

Whispers that Shake Foundations: Analyzing and Mitigating False Premise Hallucinations in Large Language Models - Paper Review

https://arxiv.org/abs/2402.19103
From the abstract: "Large Language Models (LLMs) have shown impressive capabilities but still suffer from the issue of hallucinations. A significant type of this issue is the false premise hallucination, which we define as the phenomenon when LLMs generate hallucinated text .."
In other words, the model plays along with a question whose premise is itself wrong (e.g., "Why does the Sun orbit the Earth?").

One Agent To Rule Them All: Towards Multi-agent Conversational AI - Paper Review

https://arxiv.org/abs/2203.07665
From the abstract: "The increasing volume of commercially available conversational agents (CAs) on the market has resulted in users being burdened with learning and adopting multiple agents to accomplish their tasks."
This is an old paper, so models like GPT-3/4 that do everything ..
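The premise is one meta-agent that fields every user request and hands it to whichever specialized agent can answer best. A toy dispatcher along those lines — my own illustration of the idea, with hypothetical agents and a made-up confidence interface; the paper's actual system is far more involved:

```python
# Toy "one agent to rule them all": broadcast a query to several
# specialized agents and return the most confident answer.
from typing import Callable

Agent = Callable[[str], tuple[str, float]]  # returns (answer, confidence)

def weather_agent(q: str):  return ("It's sunny.", 0.9 if "weather" in q else 0.1)
def music_agent(q: str):    return ("Playing jazz.", 0.9 if "play" in q else 0.1)

AGENTS: dict[str, Agent] = {"weather": weather_agent, "music": music_agent}

def meta_agent(query: str) -> str:
    answers = {name: agent(query) for name, agent in AGENTS.items()}
    best = max(answers, key=lambda name: answers[name][1])  # highest confidence wins
    return f"[{best}] {answers[best][0]}"

print(meta_agent("what's the weather today?"))  # [weather] It's sunny.
```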

Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings - Paper Review

https://arxiv.org/abs/1607.06520
From the abstract: "The blind application of machine learning runs the risk of amplifying biases present in data. Such a danger is facing us with word embedding, a popular framework to represent text data as vectors which has been used in many machine learning and natural language processing tasks."
Word embeddings pick up the gender stereotypes present in the data ..
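The paper's hard debiasing hinges on a simple operation: find a gender direction g in embedding space, then remove each gender-neutral word's component along g. A minimal sketch of that projection step, with illustrative toy vectors (the full method also identifies g via PCA over gendered pairs and "equalizes" pairs like he/she):

```python
import numpy as np

def neutralize(word_vec, gender_dir):
    """Remove the component of word_vec along the (unit) gender direction."""
    g = gender_dir / np.linalg.norm(gender_dir)
    return word_vec - (word_vec @ g) * g

# Illustrative toy vectors (real ones would be, e.g., 300-d word2vec embeddings).
g = np.array([1.0, 0.0, 0.0])           # gender direction, e.g. she - he
programmer = np.array([0.4, 0.7, 0.2])  # biased: nonzero component along g

debiased = neutralize(programmer, g)
print(debiased)      # [0.  0.7 0.2] -> no gender component left
print(debiased @ g)  # 0.0
```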

Gender Bias in Neural Natural Language Processing - Paper Review

https://arxiv.org/abs/1807.11714
From the abstract: "We examine whether neural natural language processing (NLP) systems reflect historical biases in training data. We define a general benchmark to quantify gender bias in a variety of neural NLP tasks."
Here, bias was checked by swapping words and inspecting the embedding space and attention scores.
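The word-swapping test is easy to picture: build a counterfactual copy of each sentence with gendered terms exchanged, then compare the model's behavior (embeddings, attention, or output scores) between the two versions. A minimal version of the swap — my own sketch, with the scoring model left abstract:

```python
# Counterfactual gender swap: a biased model treats the two variants differently.
SWAPS = {"he": "she", "she": "he", "his": "her", "her": "his",
         "man": "woman", "woman": "man"}

def gender_swap(sentence: str) -> str:
    return " ".join(SWAPS.get(tok, tok) for tok in sentence.lower().split())

original = "he is a doctor and she is a nurse"
print(gender_swap(original))  # "she is a doctor and he is a nurse"

# Bias probe (model_score is an assumed interface, e.g. sentence log-likelihood):
# bias = model_score(original) - model_score(gender_swap(original))
```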

Could an artificial-intelligence agent pass an introductory physics course? - Paper Review

https://journals.aps.org/prper/abstract/10.1103/PhysRevPhysEducRes.19.010132
I wanted to look at multi-agent systems, an agent that oversees everything on a computer, but here "agent" just means ChatGPT... On top of that, the model tested is old, so I expect a large gap compared with current models. It seems fine to skim this one just to see what the weaknesses of earlier language models were. It solves easy coding problems well, but it cannot even handle an introductory physics course. Its weaknesses are mathematical calculation errors, logical errors, and a lack of conceptual understanding, and it has no learning ability (knowledge updates) or metacognition (self-checking). Its training data is fixed (to 2021), and the output changes on every input, making it unstable ..
