Duration: 9:23

Generalization through Memorization: Nearest Neighbor Language Models (Research Paper Walkthrough)

921 views
Published on 2021/07/20

#languagemodels #knn #nlp

Bigger Models, Better Results? This research extends a pre-trained neural language model by linearly interpolating it with a k-nearest neighbors model, achieving new state-of-the-art results on Wikitext-103 with no additional training.

⏩ Abstract: We introduce kNN-LMs, which extend a pre-trained neural language model (LM) by linearly interpolating it with a k-nearest neighbors (kNN) model. The nearest neighbors are computed according to distance in the pre-trained LM embedding space, and can be drawn from any text collection, including the original LM training data. Applying this transformation to a strong Wikitext-103 LM, with neighbors drawn from the original training set, our kNN-LM achieves a new state-of-the-art perplexity of 15.79, a 2.9 point improvement with no additional training. We also show that this approach has implications for efficiently scaling up to larger training sets and allows for effective domain adaptation, by simply varying the nearest neighbor datastore, again without further training. Qualitatively, the model is particularly helpful in predicting rare patterns, such as factual knowledge. Together, these results strongly suggest that learning similarity between sequences of text is easier than predicting the next word, and that nearest neighbor search is an effective approach for language modeling in the long tail.
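To make the interpolation idea in the abstract concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation: the function name `knn_lm_probs`, the tiny hand-made datastore, the squared L2 distance, the softmax over negative distances, and the values of `k` and `lam` are all illustrative assumptions. A real system would take the context vectors from the pre-trained LM and use an approximate nearest neighbor index.

```python
import math
from collections import defaultdict


def knn_lm_probs(query, datastore, lm_probs, k=2, lam=0.25):
    """Blend LM next-token probabilities with a kNN distribution (toy sketch).

    query     -- context vector for the current prefix (assumed to come from the LM)
    datastore -- list of (key_vector, next_token) pairs
    lm_probs  -- dict mapping token -> probability under the base LM
    lam       -- interpolation weight given to the kNN distribution
    """
    # Squared L2 distance from the query to every stored key, keep the k closest.
    scored = sorted(
        (sum((q - x) ** 2 for q, x in zip(query, key)), token)
        for key, token in datastore
    )[:k]
    if not scored:  # empty datastore: fall back to the base LM
        return dict(lm_probs)

    # Softmax over negative distances (shifted by the minimum for stability),
    # summing the weight of neighbors that share the same next token.
    d_min = scored[0][0]
    weights = [math.exp(-(d - d_min)) for d, _ in scored]
    total = sum(weights)
    knn_probs = defaultdict(float)
    for (_, token), w in zip(scored, weights):
        knn_probs[token] += w / total

    # Linear interpolation: lam * p_kNN + (1 - lam) * p_LM.
    vocab = set(lm_probs) | set(knn_probs)
    return {
        t: lam * knn_probs.get(t, 0.0) + (1 - lam) * lm_probs.get(t, 0.0)
        for t in vocab
    }


# Hypothetical toy data: 2-d "context vectors" paired with the token that followed.
toy_datastore = [([0.0, 0.0], "a"), ([1.0, 0.0], "b"), ([5.0, 5.0], "c")]
toy_lm_probs = {"a": 0.5, "b": 0.3, "c": 0.2}
mixed = knn_lm_probs([0.0, 0.0], toy_datastore, toy_lm_probs, k=2, lam=0.5)
```

In this toy setup the query sits exactly on the stored key for "a", so the blended distribution shifts probability toward "a" without retraining anything. Swapping `toy_datastore` for a different text collection changes the predictions the same way, which is the domain adaptation point the abstract makes.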
Please feel free to share the content and subscribe to my channel :)
⏩ Subscribe - /channel/UCoz8NrwgL7U9535VNc0mRPA

⏩ OUTLINE:
0:00 - Background and Abstract
04:10 - Illustration of kNN-LM - Algorithm
07:21 - Experiment - Results

⏩ Paper Title: Generalization through Memorization: Nearest Neighbor Language Models
⏩ Paper: https://openreview.net/attachment?id=HklBjCEKvH&name=original_pdf
⏩ Author: Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis
⏩ Organisation: Stanford University, Facebook AI Research

⏩ IMPORTANT LINKS
Full Playlist on BERT usecases in NLP: /watch/f0qWcxQ5NUQPqklJ3qYj8VtOFZl9qqAsLP=tsil&czAPd1Pk5CkkW
Full Playlist on Text Data Augmentation Techniques: /watch/ZIiU1VhG2R-VuO59_g36gUtOFZl9qqAsLP=tsil&oNs4bQcs9O9sU
Full Playlist on Text Summarization: /watch/f0qWcxQ5NUQPqklJ3qYj8VtOFZl9qqAsLP=tsil&czAPd1Pk5CkkW
Full Playlist on Machine Learning with Graphs: /watch/fYiGj-1R1vhf_XXDm6Tt7UtOFZl9qqAsLP=tsil&cj1yNA_LJu-LG
Full Playlist on Evaluating NLG Systems: /watch/ubCnwP98Eeu00VyNR5gzlXtOFZl9qqAsLP=tsil&U7mu5-zlIC-ln

**********************************************
If you want to support me financially, which is totally optional and voluntary ❤️, you can consider buying me chai (because I don't drink coffee :) ) at https://www.buymeacoffee.com/TechvizCoffee
**********************************************
⏩ Youtube - /c/TechVizTheDataScienceGuy
⏩ LinkedIn - https://linkedin.com/in/prakhar21
⏩ Medium - https://medium.com/@ prakhar.mishra
⏩ GitHub - https://github.com/prakhar21
⏩ Twitter - https://twitter.com/rattller
*********************************************
Tools I use for making videos :)
⏩ iPad - https://tinyurl.com/y39p6pwc
⏩ Apple Pencil - https://tinyurl.com/y5rk8txn
⏩ GoodNotes - https://tinyurl.com/y627cfsa

#techviz #datascienceguy #machinelearning #ai

About Me: I am Prakhar Mishra and this channel is my passion project. I am currently pursuing my MS (by research) in Data Science. I have three years of industry experience in Data Science and Machine Learning, with a particular focus on Natural Language Processing (NLP).


Comments - 11