https://arxiv.org/abs/2112.06905
GLaM: Efficient Scaling of Language Models with Mixture-of-Experts

MoE scales the parameter count up while reducing inference latency and power usage: each token activates only a small subset of the experts, so per-token compute stays nearly flat even as total parameters grow.
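As a rough illustration of that trade-off, here is a minimal top-2 gated MoE layer in PyTorch. This is a sketch of the general technique, not GLaM's actual implementation (GLaM uses top-2 routing over 64 experts with expert parallelism and load-balancing losses); the class name `TopKMoE`, the dimensions, and the per-expert loop are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TopKMoE(nn.Module):
    """Toy top-k gated mixture-of-experts layer (illustrative sketch).

    Total parameters grow linearly with num_experts, but each token
    only runs through k experts, so per-token FLOPs stay close to
    those of k dense feed-forward blocks.
    """

    def __init__(self, d_model, d_hidden, num_experts, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_hidden),
                nn.GELU(),
                nn.Linear(d_hidden, d_model),
            )
            for _ in range(num_experts)
        )

    def forward(self, x):
        # x: (num_tokens, d_model)
        scores = self.gate(x)                                # (tokens, experts)
        topk_scores, topk_idx = scores.topk(self.k, dim=-1)  # route each token to its k best experts
        weights = F.softmax(topk_scores, dim=-1)             # renormalize over the chosen k
        out = torch.zeros_like(x)
        for slot in range(self.k):
            idx = topk_idx[:, slot]
            for e in idx.unique():                           # batch all tokens sharing an expert
                mask = idx == e
                out[mask] += weights[mask, slot, None] * self.experts[int(e)](x[mask])
        return out


# 8x the FFN parameters of a dense layer, but only ~2x the per-token FLOPs
moe = TopKMoE(d_model=64, d_hidden=256, num_experts=8, k=2)
print(moe(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```

The key design point is in the `forward` loop: the gate picks k experts per token, so adding more experts increases capacity (parameters) without increasing the amount of work done per token.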