Saugata Chatterjee Portfolio
Resources
Lectures and tutorials on statistics, machine learning, physics, and computational physics
Tutorials
Statistics Tutorials
R Tutorials
SageMath Tutorials
ML Channels
Machine Learning Made Simple Podcast
Machine Learning Made Simple Video Channel
DeepLearning.AI
ML Papers
Machine Learning Methods
Attention Is All You Need, Ashish Vaswani et al., arXiv:1706.03762
An Introduction to Variational Autoencoders, Diederik P. Kingma et al., arXiv:1906.02691
Mixtral of Experts, Albert Q. Jiang et al., arXiv:2401.04088
Fine-Tuning and Adaptation
LoRA: Low-Rank Adaptation of Large Language Models, Edward Hu et al., arXiv:2106.09685
QLoRA: Efficient Finetuning of Quantized LLMs, Tim Dettmers et al., arXiv:2305.14314
Evaluations
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding, Alex Wang et al., arXiv:1804.07461
HellaSwag: Can a Machine Really Finish Your Sentence?, Rowan Zellers et al., arXiv:1905.07830
Surveys
A Survey of Large Language Models, Wayne Xin Zhao et al., arXiv:2303.18223
Retrieval-Augmented Generation for Large Language Models: A Survey, Yunfan Gao et al., arXiv:2312.10997
Language Models
GPT-3: Language Models are Few-Shot Learners, Tom B. Brown et al., arXiv:2005.14165
T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, Colin Raffel et al., arXiv:1910.10683
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Jacob Devlin et al., arXiv:1810.04805
RoBERTa: A Robustly Optimized BERT Pretraining Approach, Yinhan Liu et al., arXiv:1907.11692
XLNet: Generalized Autoregressive Pretraining for Language Understanding, Zhilin Yang et al., arXiv:1906.08237
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations, Zhenzhong Lan et al., arXiv:1909.11942
DistilBERT, a Distilled Version of BERT: Smaller, Faster, Cheaper and Lighter, Victor Sanh et al., arXiv:1910.01108
ERNIE: Enhanced Representation through kNowledge Integration, Yu Sun et al., arXiv:1904.09223
XLM-R: Unsupervised Cross-lingual Representation Learning at Scale, Alexis Conneau et al., arXiv:1911.02116
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators, Kevin Clark et al., arXiv:2003.10555
Reformer: The Efficient Transformer, Nikita Kitaev et al., arXiv:2001.04451
Longformer: The Long-Document Transformer, Iz Beltagy et al., arXiv:2004.05150
Important Websites in the ML World
For state-of-the-art (SOTA) models, see
Hugging Face (a minimal model-loading sketch follows this list)
For connecting papers with their associated code, see
paperswithcode
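As a quick illustration of the Hugging Face workflow mentioned above, the short Python sketch below pulls a pretrained model from the Hub and runs it on one sentence. It assumes the transformers package is installed; the model id used here (a DistilBERT sentiment classifier) is only an example, not something singled out in the list above.

# Minimal sketch, assuming `pip install transformers` and internet access to the Hub.
from transformers import pipeline

# Download an example pretrained sentiment model from the Hugging Face Hub
# (hypothetical choice for illustration) and run it on a sample sentence.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("Attention is all you need."))

The same pipeline call works for other tasks (e.g. "text-generation" or "fill-mask") by swapping the task name and model id.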