Posts for 2025/05/18

2025/05/18 (2 posts)

Paper Notes on Adversarial Attacks in NLP - 5

https://aclanthology.org/2025.findings-naacl.123/ Attention Tracker: Detecting Prompt Injection Attacks in LLMs. Kuo-Han Hung, Ching-Yun Ko, Ambrish Rawat, I-Hsin Chung, Winston H. Hsu, Pin-Yu Chen. Findings of the Association for Computational Linguistics: NAACL 2025. This paper analyzes the mechanism of prompt injection attacks from the perspective of attention patterns. That makes it a condition that cannot be met with black-box models... Originally, the instruction gets a high ..

AI/Paper Review or In Progress 2025.05.18

Paper Notes on Adversarial Attacks in NLP - 4

https://arxiv.org/abs/2401.15897 Red-Teaming for Generative AI: Silver Bullet or Security Theater? In response to rising concerns surrounding the safety, security, and trustworthiness of Generative AI (GenAI) models, practitioners and regulators alike have pointed to AI red-teaming as a key component of their strategies for identifying and mitigating th... This is a survey paper. The AI .. that the US White House announced through an executive order ..

AI/Paper Review or In Progress 2025.05.18

The diary of an engineering student interested in AI and autonomous driving...?


공대생 도전 일지 (An Engineering Student's Challenge Log)
