
Continual Learning of Language Models

DOI: 10.48550/arXiv.2302.03241

7 Citations • 2023

Zixuan Ke, Yijia Shao, Haowei Lin

ArXiv

Different domains give similar importance values, which indirectly shows that the proxy can approximately identify the common general knowledge.

Abstract

For each domain i, we compare its importance vector with the importance vector of every other domain and average the resulting cosine similarities to get the value for domain i. We get 0.92 for Restaurant, 0.91 for each of ACL, AI, and Phone, 0.89 for PubMed, and 0.92 for Camera. Different domains thus give similar importance values, which indirectly shows that our proxy can approximately identify the common general knowledge.
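The averaging step described above can be sketched as follows. This is a minimal illustration, assuming each domain's importance vector is a flat list of floats. The function names and the toy vectors are hypothetical and are not taken from the paper; the toy values are chosen only to make the arithmetic easy to check and do not reproduce the reported numbers.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_similarity_to_others(importance):
    """For each domain, average its cosine similarity to every other domain.

    `importance` maps a domain name to its importance vector (hypothetical layout).
    """
    result = {}
    for name, vec in importance.items():
        sims = [
            cosine(vec, other)
            for other_name, other in importance.items()
            if other_name != name
        ]
        result[name] = sum(sims) / len(sims)
    return result


# Toy importance vectors for three made-up domains (not the paper's data).
toy = {
    "A": [1.0, 0.0],
    "B": [0.0, 1.0],
    "C": [1.0, 1.0],
}
scores = mean_similarity_to_others(toy)
```

A value close to 1 for every domain, as in the figures reported above, would indicate that the importance vectors point in nearly the same direction regardless of which domain produced them.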