The US National Institutes of Health launched the All of Us project in 2015 with the ambitious goal of building a database of fully sequenced genomes from at least one million Americans. The database would also include anonymized electronic health records and responses to health surveys from participants, as well as blood, saliva or urine samples, to be maintained at a biobank established at the Mayo Clinic in Rochester, Minnesota. The ultimate aim of the project is to improve diagnostics and drug development, aid clinical trial recruitment, and advance science’s overall understanding of disease. To accomplish this mission, All of Us will provide a diverse database and a cloud-based workbench for analyzing it, which could facilitate collaborative research on a variety of health conditions.
To date, about 660,000 individuals have registered to participate, and around 300,000 have completed the first steps of granting access to their health records, donating at least one biological sample, and completing health surveys relating to social and environmental factors that can impact health. In March of this year, the All of Us project released the first 100,000 whole genome sequences to researchers, who can access the data via a cloud-based research workbench after registering and taking mandatory training on the ethical use of de-identified data. Use of the research workbench and database is open to academic and research institutions and citizen scientists alike and is intended to facilitate collaboration between research groups. For example, in Arizona approximately 40 researchers from around the state have already teamed up to examine specific subjects such as the impacts of Valley Fever, a serious regional health issue, and predictors of endometriosis.
The All of Us project has made diversity and inclusivity a key feature. Previously, more than 90% of participants in large genomic studies have been of European descent. In contrast, approximately half of the data being gathered by the All of Us project is from participants belonging to racial and ethnic groups that have been historically under-represented in medical research. Those participating in the project also include an unprecedented number of people with lower income and educational levels, as well as individuals from rural communities. To recruit such a diverse population, the program has sent mobile units to communities across the United States to educate and register participants, visiting more than 100 cities in over 40 states to date. In return for sharing their genomic and health data, participants receive a free report on their ancestry and genomic traits, and by the end of 2022 will also have access to information about their hereditary disease risk and medication-gene interactions.
The All of Us research workbench also includes links to the U.S. Census Bureau’s American Community Survey, which provides details about the areas where participants live. This combination of genomic, social and environmental data is designed to help researchers better understand how genes can cause or influence disease in the context of other factors that may affect a person’s health.
While other large-scale genomic studies have been launched in recent years, the All of Us project has the greatest breadth of participation, widest focus, and largest variety of included data to date. The size and depth of the program are thus expected to allow researchers to spot patterns relating to disease that would not be visible at a smaller scale, to open research to a wide range of diseases, and to take into consideration health influences beyond genomics. In addition, extra samples from each participant will be biobanked and stored for later use, should new markers become available.
In comparison, the 100,000 Genomes Project of Genomics England is focused primarily on developing a better understanding of cancer and rare diseases. Recruitment in the project was completed in December 2018, and to date, whole genomes have been sequenced from about 85,000 UK National Health Service patients. The effort has also included the study of sequences from about 12,000 different tumors, allowing researchers to map out genetic changes that lead normal cells to develop into malignant ones. This program, although well underway, is not expected to achieve the same depth and breadth of analysis as the All of Us project.
Moreover, a major international research initiative called the Human Pangenome Reference Consortium was launched in April 2022 with funding from the National Human Genome Research Institute. However, the aim of this effort is the more limited creation of a reference genome that captures the genetic diversity of all world populations.