Publicly available datasets make it much easier to apply transfer learning across domains and tasks.

For natural language processing, GLUE, SQuAD, and CoNLL are popular choices. Built from existing corpora such as Wikipedia, news articles, and web pages, they support tasks such as text classification, question answering, and named entity recognition, and they pair naturally with pre-trained models such as BERT, GPT-3, and RoBERTa, which are trained on large-scale language data.

For computer vision, ImageNet, CIFAR-10, and COCO are commonly used for tasks such as image classification, object detection, and semantic segmentation. They consist of high-quality images spanning diverse categories and annotations, and they can be used with pre-trained models such as ResNet, VGG, and YOLO, which are trained on large-scale image data.

For speech recognition, LibriSpeech, TIMIT, and VoxForge are popular choices. These datasets consist of audio recordings with transcriptions and metadata, and they can be used with pre-trained models such as wav2vec, DeepSpeech, and Jasper, which are trained on large-scale speech data.
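The pattern shared by all three domains is the same: reuse a large pre-trained model as a frozen feature extractor and train only a small task-specific head on the new dataset. A minimal sketch of that workflow, in plain Python with a toy stand-in for the backbone (a real setup would load BERT, ResNet, or wav2vec weights instead; all names and numbers here are illustrative):

```python
from math import exp

# Toy stand-in for a pre-trained backbone: a fixed (frozen) linear map
# from raw inputs to feature vectors. In practice this would be BERT,
# ResNet, wav2vec, etc., with weights loaded from a public checkpoint.
FROZEN_W = [[0.5, -0.3], [0.2, 0.8]]

def backbone(x):
    """Frozen feature extractor: its weights are never updated."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in FROZEN_W]

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

# Task-specific head: a single logistic unit, trained from scratch.
head_w = [0.0, 0.0]
head_b = 0.0

def predict(x):
    f = backbone(x)
    return sigmoid(sum(w * fi for w, fi in zip(head_w, f)) + head_b)

# Tiny labeled dataset for the downstream task (inputs, binary labels).
data = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 1.0], 1), ([-1.0, 0.0], 0)]

# Fine-tuning step: train only the head; the backbone stays frozen,
# which is what makes this transfer learning rather than training
# the whole model from scratch.
lr = 1.0
for _ in range(200):
    for x, y in data:
        f = backbone(x)
        g = predict(x) - y  # gradient of log loss w.r.t. the logit
        head_w = [w - lr * g * fi for w, fi in zip(head_w, f)]
        head_b -= lr * g

accuracy = sum((predict(x) > 0.5) == bool(y) for x, y in data) / len(data)
```

With a real library the structure is identical, only the backbone changes: freeze the pre-trained layers, attach a new output layer sized for the downstream task, and optimize just that layer (optionally unfreezing the backbone later for full fine-tuning).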