Check out RefSeq release 228, now available online and from the FTP site. You can access RefSeq data through NCBI Datasets. The release is provided in several directories, both as a complete dataset and divided into logical groupings.
What’s included in this release?
As of January 3, 2025, this full release incorporates genomic, transcript, and protein data containing:
- 513,096,240 records, including
  - 391,903,900 proteins
  - 67,997,702 RNAs
- Sequences from 162,138 organisms
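To put these counts in proportion, here is a minimal sketch of the arithmetic in Python. It assumes the protein and RNA counts are disjoint subsets of the total record count; the post does not say what the remaining records consist of, so the "other" label is only a placeholder.

```python
# Counts quoted above for RefSeq release 228.
total_records = 513_096_240
proteins = 391_903_900
rnas = 67_997_702

# Records that are neither proteins nor RNAs. The post does not break this
# group down further, so "other" is a placeholder label (assumption).
remainder = total_records - proteins - rnas


def share(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100 * count / total


for label, count in [("proteins", proteins), ("RNAs", rnas), ("other", remainder)]:
    print(f"{label:<10} {count:>13,} {share(count, total_records):5.1f}%")
```

Proteins make up roughly three quarters of the records in this release.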
New horse and donkey assemblies and annotations
Horse Annotation Release GCF_041296265.1-RS_2024_12 contains the annotated genes, transcripts, and proteins in a new phased haploid assembly of a Thoroughbred-derived haplotype. Learn more in the annotation report. The annotation products are available in the sequence databases and on the FTP site.
Donkey Annotation Release GCF_041296235.1-RS_2024_12 contains the annotated genes, transcripts, and proteins in a new phased haploid assembly of the Thoroughbred horse dam and the donkey sire haplotype from a mule. Learn more in the annotation report. The annotation products are available in the sequence databases and on the FTP site.
New eukaryotic genome annotations
This release contains new or updated annotations generated by NCBI’s eukaryotic genome annotation pipeline for 38 species, including:
- Rabbit, based on new assembly mOryCun1.1 (GCF_964237555.1-RS_2024_11)
- White-tailed deer, based on updated assembly Ovbor_1.2 (GCF_023699985.2-RS_2024_12)
- Merriam’s kangaroo rat, based on new assembly mDipMer1.0.p (GCF_024711535.1-RS_2024_11)
- Axolotl, based on new assembly UKY_AmexF1_1 (GCF_040938575.1-RS_2024_10)
- American cockroach, based on new assembly P.americana_PAMFEO1_priV1 (GCF_040183065.1-RS_2024_10)
- Wood tobacco, based on updated assembly ASM39365v2 (GCF_000393655.2-RS_2024_11)
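The annotation release names quoted in this post share a visible pattern: a RefSeq assembly accession with its version, followed by an `-RS_` suffix. As a rough sketch, a name can be split into its parts with a regular expression. The pattern below is inferred only from the examples in this post, and reading the two trailing numbers as year and month is an assumption, not a documented specification. The function and class names are hypothetical.

```python
import re
from typing import NamedTuple


class AnnotationRelease(NamedTuple):
    accession: str  # assembly accession without version, e.g. "GCF_041296265"
    version: int    # assembly version, e.g. 1
    year: int       # assumed meaning of the first number after "-RS_"
    month: int      # assumed meaning of the second number after "-RS_"


# Pattern inferred from the names in this post (assumption, not a spec).
_PATTERN = re.compile(r"^(GCF_\d{9})\.(\d+)-RS_(\d{4})_(\d{2})$")


def parse_annotation_release(name: str) -> AnnotationRelease:
    """Split an annotation release name such as 'GCF_041296265.1-RS_2024_12'."""
    match = _PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized annotation release name: {name!r}")
    accession, version, year, month = match.groups()
    return AnnotationRelease(accession, int(version), int(year), int(month))


if __name__ == "__main__":
    for name in ("GCF_041296265.1-RS_2024_12", "GCF_023699985.2-RS_2024_12"):
        print(parse_annotation_release(name))
```

Names that do not match the inferred pattern raise a `ValueError` rather than being guessed at, which keeps the sketch from silently misreading a format it was not written for.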
Upcoming changes to NCBI Taxonomy for viruses (Spring 2025)
As previously mentioned, NCBI will add binomial species names to about 3,000 viruses to reflect changes to the International Code of Virus Classification and Nomenclature (ICVCN).
Stay up to date
RefSeq is part of the NIH Comparative Genomics Resource (CGR). CGR facilitates reliable comparative genomics analyses for all eukaryotic organisms through an NCBI Toolkit and community collaboration. Follow us on social media @NCBI and join our mailing list to keep up to date with RefSeq and other CGR news.
Questions?
If you have questions or would like to provide feedback, please reach out to us!