Gen3 is...

how data commons are made. Data commons co-locate data, storage and computing infrastructure with commonly used tools for analyzing and sharing data to create an open interoperable resource for the research community.

View on GitHub

Explore A Data Commons


Data commons speed up and democratize scientific discovery, enabling more research through practical access to vast data and computing resources.

Gen3 Services

Gen3 supports data submission including clinical attributes, phenotypic information, and data files. The submissions are validated against the data dictionary to ensure all required fields are present and have appropriate data values.
Data Submission
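As a rough illustration of the validation step described above, the sketch below checks a submitted record against a dictionary entry for required fields, types, and allowed values. The `CASE_SCHEMA` structure, its field names, and `validate_record` are hypothetical simplifications made for this example, not the actual Gen3 dictionary format or submission API.

```python
# Hypothetical, simplified dictionary entry for a "case" node.
# Real Gen3 dictionaries are richer; this shape is assumed for illustration.
CASE_SCHEMA = {
    "required": ["submitter_id", "gender"],
    "properties": {
        "submitter_id": {"type": str},
        "gender": {"type": str, "enum": ["female", "male", "unknown"]},
        "age_at_enrollment": {"type": int},
    },
}


def validate_record(record, schema):
    """Return a list of human-readable validation errors (empty if valid)."""
    errors = []
    # Every required field must be present.
    for field in schema["required"]:
        if field not in record:
            errors.append(f"missing required field: {field}")
    # Every supplied field must be known, correctly typed, and in range.
    for field, value in record.items():
        spec = schema["properties"].get(field)
        if spec is None:
            errors.append(f"unknown field: {field}")
            continue
        if not isinstance(value, spec["type"]):
            errors.append(
                f"wrong type for {field}: expected {spec['type'].__name__}"
            )
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"invalid value for {field}: {value!r}")
    return errors
```

Returning a list of errors rather than raising on the first one lets a submitter fix every problem in a record in a single pass.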
Gen3 provides permanent data GUIDs (globally unique IDs) for data objects. The service tracks the physical locations and hash of every asset (file) in the data commons object store. The Gen3 platform includes landing pages which support FAIR descriptions of the data objects.
Object Index
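To make the idea of an object index concrete, here is a minimal sketch of a record that ties a permanent identifier to a file's hash, size, and storage locations. The record layout, the choice of SHA-256, and the function names are assumptions made for illustration; they are not taken from the Gen3 index service.

```python
import hashlib
import uuid


def make_index_record(content, locations):
    """Build an index record for one data object.

    `content` is the object's bytes; `locations` is a list of storage URLs.
    The field names here are illustrative, not a documented Gen3 schema.
    """
    return {
        "guid": str(uuid.uuid4()),  # assigned once when the object is indexed
        "size": len(content),
        "hashes": {"sha256": hashlib.sha256(content).hexdigest()},
        "urls": list(locations),
    }


def verify_download(record, content):
    """Check downloaded bytes against the size and hash stored in the record."""
    return (
        len(content) == record["size"]
        and hashlib.sha256(content).hexdigest() == record["hashes"]["sha256"]
    )
```

Because the identifier is independent of the storage URL, a file can be copied or moved between buckets by updating `urls` while the GUID that users cite stays the same.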
Gen3 features a friendly GraphQL API for searching and discovering data. The GraphQL API enables faceted and precise searching through the flexible data model. Search capabilities enable quick and easy creation of virtual cohorts that can be exported to a manifest for data download.
Data Search
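The sketch below shows one way a client might assemble a filtered GraphQL query as a JSON request body. The node name `case`, the fields, and the `first` argument in the usage line are hypothetical; the names a real commons accepts depend on its data dictionary and API, so treat this as an assumption-laden example rather than the Gen3 query syntax.

```python
import json


def build_query(node, fields, filters=None, first=10):
    """Return a JSON request body holding a GraphQL query string.

    `node`, `fields`, and `filters` are caller-supplied; the names used in
    the usage example are hypothetical and depend on the commons' dictionary.
    """
    args = [f"first: {int(first)}"]
    # Sort filters so the generated query text is deterministic.
    for key, value in sorted((filters or {}).items()):
        args.append(f"{key}: {json.dumps(value)}")  # json.dumps quotes strings
    query = "{ %s(%s) { %s } }" % (node, ", ".join(args), " ".join(fields))
    return json.dumps({"query": query})


body = build_query("case", ["submitter_id", "gender"], {"gender": "female"}, first=5)
```

The resulting `body` would be sent as the payload of an HTTP POST to the commons' GraphQL endpoint, whose URL is specific to each deployment.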
Gen3 uses OpenID Connect to provide authentication (AuthN) services, with authorization (AuthZ) specified on a per-commons basis. Currently supported identity providers include Google and Shibboleth, the latter supporting providers such as NIH iTrust, InCommon Federation, and eduGAIN.
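For readers unfamiliar with OpenID Connect, the following sketch builds the authorization request that begins a standard authorization code flow and then checks the `state` value on the callback. The endpoint URL, client ID, and redirect URI in the usage lines are placeholders, and the helper names are hypothetical; the query parameters themselves are the standard OpenID Connect ones.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_url(authorize_endpoint, client_id, redirect_uri,
                            scopes=("openid",)):
    """Return (url, state) for the OIDC authorization code flow."""
    state = secrets.token_urlsafe(16)  # random value tying the callback to this request
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),
        "state": state,
    }
    return f"{authorize_endpoint}?{urlencode(params)}", state


def state_matches(callback_url, expected_state):
    """Check that the state returned on the callback equals the one we sent."""
    returned = parse_qs(urlparse(callback_url).query).get("state", [""])[0]
    return secrets.compare_digest(returned.encode(), expected_state.encode())


# Placeholder values; a real deployment supplies its own endpoint and client.
url, state = build_authorization_url(
    "https://commons.example.org/oauth2/authorize",
    "my-client-id",
    "https://app.example.org/callback",
)
```

The user's browser is redirected to `url`; after login, the identity provider redirects back with a `code` and the same `state`, which the application verifies before exchanging the code for tokens.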
Gen3 includes a data portal as the default application over a commons. The portal is an interactive website that allows users to explore, submit, and download data. It uses the public APIs offered by the data commons, demonstrating the power of Gen3.
Data Portal
All of the Gen3 services support powerful APIs which allow them to interact with each other and external users. These APIs enable extensible application development for future services and users.

Why Gen3?

Flexible data model

Gen3 supports a graph data model backed by a data dictionary. The data dictionary and data model are encoded in easy-to-edit YAML files.
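A graph data model of this kind can be pictured as node types that each link to a parent type. The sketch below uses a plain Python dict standing in for what parsed YAML node definitions might look like; the node names and the `links` key are illustrative assumptions, not the actual Gen3 dictionary schema.

```python
# Hypothetical node definitions, written as the Python equivalent of
# what a YAML dictionary file might contain after parsing.
DICTIONARY = {
    "program": {"links": []},
    "project": {"links": ["program"]},
    "case": {"links": ["project"]},
    "sample": {"links": ["case"]},
}


def path_to_root(node, dictionary):
    """Follow each node's first parent link until a node with no links is reached."""
    path = [node]
    while dictionary[path[-1]]["links"]:
        parent = dictionary[path[-1]]["links"][0]
        if parent in path:
            raise ValueError(f"cycle detected at {parent}")
        path.append(parent)
    return path
```

In this toy model, adding a new node type is a matter of adding one entry with its parent link, which mirrors the appeal of keeping the model in editable files.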

Open source software

Gen3 is built on a foundation of open source software including Python, Flask, Nginx, Apache, Kubernetes, Terraform, Packer, React, Jenkins, Go, Jupyter, PostgreSQL and Docker. Gen3 is itself open source under the Apache version 2.0 license.

Cloud Agnostic

Gen3 currently supports Amazon Web Services, Google Cloud Platform, and OpenStack environments, with support for Microsoft Azure coming soon. Gen3 can run in these environments while securely handling controlled-access data. Gen3 commons have been run in environments that comply with FISMA and FedRAMP Moderate requirements.


Try out a Gen3 Data Commons sandbox with open access data so you can explore the search, download, and API capabilities.

Explore A Data Commons


Get Started

I want to run a Gen3 commons

Learn how to run a Gen3 data commons using either Docker-Compose or our Cloud Automation scripts. Docker-Compose lets you quickly start your new data commons in a matter of minutes!

Learn more about running Gen3

I want to submit data to a Gen3 commons

Please read our generic user documentation on how to submit data to a Gen3 commons. To submit data to a specific commons, please see its documentation and contact its support email. For your own data commons, feel free to modify our generic documentation.

Learn more about submitting data

I want to access data in a Gen3 commons

Read the Swagger docs for the Gen3 APIs and see the sample queries in our generic Gen3 user documentation. To obtain authorized access to protected data, please contact the support email for that data commons.

Learn more about accessing data

I want to build an app for a Gen3 commons

Build applications for Gen3 data commons using our REST & GraphQL APIs. Build the data exploration portal of your dreams, or build a custom way to submit specific data nodes. The power is yours!

Learn more about building apps


Gen3 Logo Usage Guidelines

Gen3 has a distinctive logo that should be used as a visual identifier for all data commons powered by Gen3 open-source software. The Gen3 logo is a restricted-use logo for those who build their solutions or data commons with Gen3 technology or create applications compatible with it. The guidelines for the logo's appearance and usage are outlined here. The restricted-use logos may only be used if you meet the requirements specified below. Please consider whether you will be able to comply with these requirements before applying the logo.

Standard Color Logo

The normal full color reproduction is suitable for use on light backgrounds without modification.
Download Standard Full Color Logo

Inverse Color Logo

When the logo is used on a background that is darker than 50% grey, you should reverse the type to white for legibility.
Download Inverse Color Logo

Clear Space

There should be enough clear space around the logo.

About Gen3

Gen3 is open source software, released under the Apache 2.0 license, for co-locating compute and storage in a data commons. Gen3 is agnostic to data type and storage location. At a minimum, it needs a data model, data, a secure landing page for the portal, and a research goal in mind.