Training Sparse Autoencoders across latent space size, l1 coefficient, and context length


이게될까 2024. 10. 29. 00:16


Previous post (2024.10.15, yoonschallenge.tistory.com): Sparse Autoencoder training and output inspection by l1 coefficient

The previous post covers results for somewhat smaller latent spaces.

l1 coefficient	0.01	0.05	0.1	0.01	0.05	0.1
latent_space	8192	8192	8192	16384	16384	16384
context_length	256	256	256	512	512	512
l1 loss	1199.57	672.90	602.08	1063.27	736.28	649.73
mse loss	2.23	1.77	6.71	1.50	2.08	7.94
overall loss	14.22	35.41	66.92	12.14	38.90	72.91
below_1e-5	49	8	0	4894	4446	3535
below_1e-6	49	8	0	4892	4445	3535
dead_features	49	7	0	4862	4431	3525
ce_loss_score	-	-	-	-	-	-
l0	6873.58	3144.56	2780.78	8257.80	4230.01	3897.64
ce_loss_with_sae	-	-	-	-	-	-
ce_loss_without_sae	-	-	-	-	-	-

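Hmm, I am not sure why sparsity seems to decrease as the l1 coefficient grows...? l0 does fall as expected, but the below_1e-5, below_1e-6, and dead_features counts shrink too.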
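The two readings of "sparsity" in the table can be put side by side. The sketch below is only an illustration: it assumes l0 is the average number of active latents per token, and the helper name `sparsity_summary` is hypothetical. It divides both l0 and dead_features by the latent width so the runs are comparable.

```python
# Values copied from the table above.
# Each entry: (latent_space, l1_coefficient, l0, dead_features)
runs = [
    (8192, 0.01, 6873.58, 49),
    (8192, 0.05, 3144.56, 7),
    (8192, 0.1, 2780.78, 0),
    (16384, 0.01, 8257.80, 4862),
    (16384, 0.05, 4230.01, 4431),
    (16384, 0.1, 3897.64, 3525),
]


def sparsity_summary(latent_space, l0, dead_features):
    """Return (active fraction per token, dead fraction) for one run.

    Assumes l0 is the mean number of non-zero latents per token.
    """
    return l0 / latent_space, dead_features / latent_space


for latent_space, coef, l0, dead in runs:
    active, dead_frac = sparsity_summary(latent_space, l0, dead)
    print(
        f"latent={latent_space:>5} l1={coef:<4} "
        f"active/token={active:.3f} dead={dead_frac:.3f}"
    )
```

Within each latent size, the active fraction per token goes down as the l1 coefficient goes up, so the activations do get sparser. The dead-feature fraction goes down as well, which is a separate measurement and may be what looks counterintuitive.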

l1 coefficient	0.1	1	10	100
latent_space	32768	32768	32768	32768
context_length	512	512	512	512
l1 loss	760.57	171.24	67.82	4.76
mse loss	29.70	196.22	627.27	1523.65
overall loss	105.76	367.46	1005.44	1999.75
below_1e-5	21512	18605	27182	32175
below_1e-6	21490	18554	27176	32175
dead_features	20390	17164	26069	31085
ce_loss_score	0.99	0.987	0.85	0.18
l0	5359.07	624.92	30.34	1.16
ce_loss_with_sae	2.94	3.05	4.29	10.13
ce_loss_without_sae	2.94	2.94	2.94	2.94
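As a sanity check on the loss rows, the numbers look consistent with overall loss = mse loss + l1 coefficient × l1 loss. That relation is an assumption inferred from the reported values, not something stated in the post. The sketch below recomputes it for every column of both tables and lists the columns that fall outside rounding tolerance. The function names are hypothetical.

```python
# (latent_space, l1_coefficient, l1_loss, mse_loss, reported_overall_loss)
# copied from the two tables in this post.
rows = [
    (8192, 0.01, 1199.57, 2.23, 14.22),
    (8192, 0.05, 672.90, 1.77, 35.41),
    (8192, 0.1, 602.08, 6.71, 66.92),
    (16384, 0.01, 1063.27, 1.50, 12.14),
    (16384, 0.05, 736.28, 2.08, 38.90),
    (16384, 0.1, 649.73, 7.94, 72.91),
    (32768, 0.1, 760.57, 29.70, 105.76),
    (32768, 1, 171.24, 196.22, 367.46),
    (32768, 10, 67.82, 627.27, 1005.44),
    (32768, 100, 4.76, 1523.65, 1999.75),
]


def reconstruct_overall(l1_coefficient, l1_loss, mse_loss):
    """Assumed relation: overall = mse + l1_coefficient * l1_loss."""
    return mse_loss + l1_coefficient * l1_loss


def mismatches(rows):
    """Return (latent_space, l1_coefficient) for columns that do not fit."""
    bad = []
    for latent, coef, l1, mse, reported in rows:
        # Table values are rounded to 2 decimals; the l1 rounding error
        # is scaled by the coefficient, so the tolerance grows with it.
        tol = 0.02 + 0.005 * coef
        if abs(reconstruct_overall(coef, l1, mse) - reported) > tol:
            bad.append((latent, coef))
    return bad


print(mismatches(rows))
```

Nine of the ten columns fit the relation to within rounding. The latent_space 32768, l1 coefficient 10 column does not: the relation gives about 1305.47 against the reported 1005.44. That could be a transcription slip or values logged at different steps, so that column is best read with some caution.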

From context length 1024 onward I need multiple GPUs, and an error there kept me from running those experiments, but I think it is resolved now.

https://github.com/jbloomAus/SAELens/issues/337


I reported the error in this issue.

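Judging from these runs, the latent size 32768 models with l1 coefficient 0.1 or 1 look usable: ce_loss_score stays at 0.99 and 0.987, while at 10 and 100 it drops to 0.85 and 0.18.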
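The same reading of the latent size 32768 table can be written as a small filter. This is a rough sketch: the 0.95 cutoff on ce_loss_score is a hypothetical threshold chosen for illustration, not a value from the experiments, and the helper name `ce_increase` is made up here.

```python
# Values copied from the latent_space 32768 table, keyed by l1 coefficient.
runs = {
    0.1: {"ce_loss_score": 0.99, "l0": 5359.07, "ce_with": 2.94, "ce_without": 2.94},
    1: {"ce_loss_score": 0.987, "l0": 624.92, "ce_with": 3.05, "ce_without": 2.94},
    10: {"ce_loss_score": 0.85, "l0": 30.34, "ce_with": 4.29, "ce_without": 2.94},
    100: {"ce_loss_score": 0.18, "l0": 1.16, "ce_with": 10.13, "ce_without": 2.94},
}

MIN_SCORE = 0.95  # hypothetical cutoff, for illustration only


def ce_increase(run):
    """Extra cross-entropy loss caused by splicing the SAE into the model."""
    return run["ce_with"] - run["ce_without"]


usable = [coef for coef, run in runs.items() if run["ce_loss_score"] >= MIN_SCORE]

for coef, run in runs.items():
    print(f"l1={coef:<4} ce_increase={ce_increase(run):.2f} l0={run['l0']}")
print("usable:", usable)
```

With this cutoff only 0.1 and 1 remain. Of those two, l1 coefficient 1 has a much lower l0 (624.92 against 5359.07) at the cost of a CE loss increase of about 0.11.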

https://huggingface.co/yoonLM

The trained SAEs are available there, for example yoonLM/sae_llama3.2org_1B_256_4_l1_0.01.
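One way to fetch a model locally is `snapshot_download` from the `huggingface_hub` package. The sketch below only downloads the repository files. The file layout inside the repository and the SAELens loading step are not described in this post, so they are left out. The wrapper name `download_sae` is hypothetical.

```python
# Repository name as listed on the yoonLM profile page.
REPO_ID = "yoonLM/sae_llama3.2org_1B_256_4_l1_0.01"


def download_sae(repo_id: str = REPO_ID) -> str:
    """Download every file in the repository and return the local directory."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)
```

Calling `download_sae()` needs network access and returns the path of the cached snapshot directory, where you can look at the files before deciding how to load them.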
