728x90
728x90
소규모 모델을 학습해서 더 큰 모델을 학습한다고 하니...
시간나면 한번 하겠씁니다.
https://arxiv.org/abs/2406.17711
Data curation via joint example selection further accelerates multimodal learning
Data curation is an essential component of large-scale pretraining. In this work, we demonstrate that jointly selecting batches of data is more effective for learning than selecting examples independently. Multimodal contrastive objectives expose the depen
arxiv.org
728x90