The complete genetic sequences, medical records, and extensive health data of more than 1 million people will become available to researchers this year. Major progress has recently been made in understanding the regulatory sequences in the human genome that act as switches, turning genes on and off in cells. There are only a few known examples of variants in these DNA switches causing disease, and we have identified some that cause very rare disease. We have identified variants of a short sequence that result in children being born without a pancreas, and we showed that this short sequence is a master switch that turns on the key gene leading to pancreas development. We have also identified very rare variants in another switch that lead to children producing too much insulin and having dangerously low glucose levels. In this case the switch is inappropriately turned on, so a protein is produced in the pancreas that should not be.
In this project we will use whole genome sequencing data from more than 1 million individuals to identify the switches that are important for common type 2 diabetes. As preliminary data and proof of principle, we have already analysed height in 150,000 UK Biobank participants and identified 31 previously unknown associations. One example is variants of a switch that turns on a gene called HMGA1. People with these switch variants are, on average, 5 cm taller. This is particularly interesting because changing the protein sequence of HMGA1 does not affect height. We have confirmed these associations in 200,000 people from the All of Us and TOPMed cohorts. We have also performed preliminary analyses for diabetes, identifying an association with a rare variant near HNF1A that occurs in a long non-coding RNA, a specific type of switch. We have recently demonstrated that this long non-coding RNA is important for turning on HNF1A. Analysing data on 1,000,000 complete whole genomes is extremely challenging computationally.
Interpretation is also a substantial challenge. This project will build on our initial work by refining our WGS analysis pipeline to make it efficient, cost-effective and publicly available. The project is timely because UK Biobank will release whole genome sequence data on 500,000 people by the end of this year. We will use these data to perform single-variant and group testing of regulatory switches. The analyses will be performed within different ancestry groups as well as in a combined analysis. We will confirm our findings using the US cohorts All of Us and TOPMed, which will have >500,000 individuals of diverse ancestries available for analysis. We will test the identified regions in our rare familial diabetes cohort and in the 100,000 Genomes Project. These are collections of people whose diabetes is expected to have a single genetic cause. This is important because we have an excellent track record of translating genetic diagnosis into treatment change. We will also perform functional follow-up of a subset of switches to provide new insights into pancreas development and function.
This project will provide a substantial advance in our understanding of the role of non-coding variants in human disease. It will allow us to develop efficient and cost-effective approaches for analysing whole genome sequence data. We will provide new insights into the regulation of pancreas development and function. It may also dramatically improve the quality of life for some patients with rare forms of diabetes. Our project is important if we are to make major advances in understanding disease mechanisms using whole genome sequencing.