
2024/10/17 3

From Understanding to Utilization: A Survey on Explainability for Large Language Models - Review

https://arxiv.org/abs/2401.12874

1. Problem definition..

Rethinking Interpretability in the Era of Large Language Models - Paper Review

https://arxiv.org/abs/2402.01761

Rather than explaining the LLM at the parameter level, the reason behind the LLM's output..

Scaling and evaluating sparse autoencoders - Paper Review

https://arxiv.org/abs/2406.04093

I keep working with Sparse Autoencoders, but I don't feel like I truly understand what I'm doing, so I read ..
