New datasets available on the Registry of Open Data from the National Library of Medicine, Digital Earth Africa, Amazon, and others
Posted on: Jul 15, 2021
 
Forty-four new or updated datasets from the National Library of Medicine, Digital Earth Africa, Amazon, and others are available on the Registry of Open Data in the following categories.
 
COVID-19:
- InRad COVID-19 X-Ray and CT Scans from Faculdade de Medicina da Universidade de São Paulo Institute of Radiology (InRad)
- REDASA COVID-19 Dataset from Imperial College London
Agriculture:
- iSDAsoil from Innovative Solutions for Decision Agriculture (iSDA)
Climate and Weather:
- Updated: Global Surface Summary of Day from National Oceanic and Atmospheric Administration
- Updated: Geostationary Operational Environmental Satellites (GOES) 16 & 17 from National Oceanic and Atmospheric Administration
- National Bathymetric Source Data from National Oceanic and Atmospheric Administration
- Rapid Refresh Forecast System (RRFS) Ensemble [Prototype] from National Oceanic and Atmospheric Administration
- Climate retrospective Analysis and Forecast Ensemble system: version 1 from the Commonwealth Scientific and Industrial Research Organisation
Energy:
- Commercial Building Sector Stock model (ComStock) from National Renewable Energy Laboratory
Geospatial:
- Landsat, Sentinel-2, and Sentinel-1 data over Africa managed by Digital Earth Africa
- Normalized Difference Urban Index (NDUI) from Remote Sensing Big Data Intelligent Application Laboratory, Sun Yat-sen University
Life Sciences:
- CIViC (Clinical Interpretation of Variants in Cancer) from the Washington University School of Medicine
- GBIF Species Occurrences from Global Biodiversity Information Facility
- BossDB Open Neuroimagery Datasets from Johns Hopkins University Applied Physics Laboratory
- Conformational Space of Short Peptides from Toyoko and Universidad Nacional de Quilmes
- Ivy Glioblastoma Atlas from the Allen Institute for Brain Science
Machine Learning:
- 12 benchmark datasets from the Allen Institute for Artificial Intelligence (AI2)
- Amazon Berkeley Objects Dataset from Amazon
- Airborne Object Tracking Dataset from Amazon
- Helpful Sentences from Reviews from Amazon
- Low Context Name Entity Recognition (NER) Datasets with Gazetteer from Amazon
- Multilingual Name Entity Recognition (NER) Datasets with Gazetteer from Amazon
- Amazon-PQA from Amazon
- WikiSum: Coherent Summarization Dataset for Efficient Human-Evaluation from Amazon
- Pre- and post-purchase product questions from Amazon
- MWIS VR Instances from Amazon
- FashionLocalTriplets from Amazon
- PASS: Perturb-and-Select Summarizer for Product Reviews from Amazon
- Corn Kernel Counting Dataset from Intelinair, Inc.
- High-Order Accurate Direct Numerical Simulation of Flow over a MTU-T161 Low Pressure Turbine Blade from PyFR
- PubMed Central Open Access Text Mining Datasets from the National Library of Medicine
Looking to make your data available? The AWS Open Data Sponsorship Program covers the cost of storage for publicly available, high-value, cloud-optimized datasets. We work with data providers who seek to:
- Democratize access to data by making it available for analysis on AWS
- Develop new cloud-native techniques, formats, and tools that lower the cost of working with data
- Encourage the development of communities that benefit from access to shared datasets
Learn how to propose your dataset to the AWS Open Data Sponsorship Program.