2025/03 Post List

VidComposition: Can MLLMs Analyze Compositions in Compiled Videos? - Paper Review

https://arxiv.org/abs/2411.10979 VidComposition: Can MLLMs Analyze Compositions in Compiled Videos? The advancement of Multimodal Large Language Models (MLLMs) has enabled significant progress in multimodal understanding, expanding their capacity to analyze video content. However, existing evaluation benchmarks for MLLMs primarily focus on abstract video arxiv.org Abstract: Multimodal LLMs (MLLMs) have considerably..

Artificial Intelligence/Paper Review or In Progress 2025.03.30

An Agent That Uses Uncertainty - Towards Uncertainty-Aware Language Agent

https://arxiv.org/abs/2401.14016 Towards Uncertainty-Aware Language Agent While Language Agents have achieved promising success by placing Large Language Models at the core of a more versatile design that dynamically interacts with the external world, the existing approaches neglect the notion of uncertainty during these interac arxiv.org This is a topic I had been preparing recently, but it turns out prior work already existed...? It never showed up when I was searching for it, and now of all times.. ..

Artificial Intelligence/Paper Review or In Progress 2025.03.27

Creating Reasoning Path Data with vllm

I am currently working on expanding my data, so I want to generate the process by which the answer is inferred, given a document, a question, and the answer.
import jsonlines
import json
import time
from typing import Any, Optional
import torch
from transformers import AutoTokenizer  # AutoModelForCausalLM is not used
# vllm import (re-enabled; this part was originally commented out)
from vllm import LLM, SamplingParams
# from huggingface_hub import login
# login("If the model requires access, get a token and enter it here!")
import base64
import time
from ty..

Artificial Intelligence/Natural Language Processing 2025.03.24

Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction - Paper Review

https://arxiv.org/abs/2501.03218 Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction Active Real-time interaction with video LLMs introduces a new paradigm for human-computer interaction, where the model not only understands user intent but also responds while continuously processing streaming video on the fly. Unlike offline video L..

Artificial Intelligence/Paper Review or In Progress 2025.03.23

How Should We Measure Uncertainty? - Estimating LLM Uncertainty with Logits - Paper Review

https://arxiv.org/abs/2502.00290 Estimating LLM Uncertainty with Logits In recent years, Large Language Models (LLMs) have seen remarkable advancements and have been extensively integrated across various fields. Despite their progress, LLMs are prone to hallucinations, producing responses that may not be dependable if the mode arxiv.org This paper is related to the research I am currently working on. "This is how we measured uncertainty!" is ..

Artificial Intelligence/Paper Review or In Progress 2025.03.22

Running Fast Benchmark Tests with vLLM

https://huggingface.co/datasets/TIGER-Lab/MMLU-Pro TIGER-Lab/MMLU-Pro · Datasets at Hugging Face [ "Boycotts, Buyalls, Blockchain technology, Increased Sales", "Buycotts, Boycotts, Digital technology, Decreased Sales", "Boycotts, Buycotts, Digital technology, Decreased Sales", "Buycotts, Boycotts, Blockchain technology, Charitable donations", "Boycott huggingface.co To evaluate the model I am building this time, one of the benchmarks, ..

Artificial Intelligence/Natural Language Processing 2025.03.21

Using Gemma-3 (Feat. Errors)

https://huggingface.co/google/gemma-3-12b-it google/gemma-3-12b-it · Hugging Face This repository is publicly accessible, but you have to accept the conditions to access its files and content. To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in huggingface.co For now, I used the 12b model. from transformers import pipeline..

Artificial Intelligence/Natural Language Processing 2025.03.20

Enhancing Lexicon-Based Text Embeddings with Large Language Models - Paper Review

https://arxiv.org/abs/2501.09749 Enhancing Lexicon-Based Text Embeddings with Large Language Models Recent large language models (LLMs) have demonstrated exceptional performance on general-purpose text embedding tasks. While dense embeddings have dominated related research, we introduce the first Lexicon-based EmbeddiNgS (LENS) leveraging LLMs that achie arxiv.org The paper describes the problems with existing dense embeddings. And ..

Artificial Intelligence/Paper Review or In Progress 2025.03.19

Embedding + Generation Model Preliminary Paper Survey 6 - Summary of Datasets and Evaluation Data

2025.03.17 - [Artificial Intelligence/Paper Review or In Progress] - Embedding + Generation Model Preliminary Paper Survey 5 - Summary of Datasets and Evaluation Data 2024.12.23 - [Artificial Intelligence/Paper Review or In Progress] - ChatQA: Surpassing GPT-4 on Conversational QA and RAG - Paper Review https://arxiv.org/abs/2401.10225 ChatQA: Surpassing GPT-4 on yoonschallenge.tistory.com Here ..

Artificial Intelligence/Paper Review or In Progress 2025.03.18

Embedding + Generation Model Preliminary Paper Survey 5 - Summary of Datasets and Evaluation Data

2024.12.23 - [Artificial Intelligence/Paper Review or In Progress] - ChatQA: Surpassing GPT-4 on Conversational QA and RAG - Paper Review https://arxiv.org/abs/2401.10225 ChatQA: Surpassing GPT-4 on Conversational QA and RAG In this work, we introduce ChatQA, a suite of models that outperform GPT-4 on retrieval-augmented generation (RAG) and conversational question answering (Q..

Artificial Intelligence/Paper Review or In Progress 2025.03.17

공대생 도전 일지 (An Engineering Student's Challenge Log)
