https://arxiv.org/abs/2412.19437

DeepSeek-V3 Technical Report
"We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures..."

This is the model everyone is talking about... I felt this paper should have drawn attention back when it was first released, but it only became a hot topic much later, after the R1 model came out...
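To make the "37B activated out of 671B total" point in the abstract concrete, here is a minimal top-k MoE routing sketch in NumPy. The toy sizes, the `router` matrix, and the softmax-over-selected-experts gating are illustrative assumptions on my part, not DeepSeek-V3's actual implementation (the paper uses the DeepSeekMoE design); the point is only that each token runs through a small subset of experts, so the activated parameter count is far below the total.

```python
# A minimal sketch (not DeepSeek-V3's actual code) of top-k MoE routing:
# the token is scored against every expert, but only the top-k experts
# actually run, so activated parameters << total parameters.
import numpy as np

rng = np.random.default_rng(0)

d_model, n_experts, top_k = 16, 8, 2          # toy sizes, not the paper's
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts))

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route one token through its top-k experts and mix their outputs."""
    scores = x @ router                        # affinity of the token to each expert
    top = np.argsort(scores)[-top_k:]          # indices of the k highest-scoring experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                   # normalize gate weights over the chosen experts
    # Only the selected experts' weight matrices are used here; the other
    # experts' parameters stay untouched for this token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d_model)
print(moe_forward(token).shape)                # (16,)
```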