Four Million Gene Switches Open the Way to a Major Medical Breakthrough
400万基因开关开启医学大突破
Among the many mysteries of human biology is why complex diseases like diabetes, high blood pressure and psychiatric disorders are so difficult to predict and, often, to treat. An equally perplexing puzzle is why one individual gets a disease like cancer or depression, while an identical twin remains perfectly healthy.
有关人类生物学还有许多未解之谜,其中之一是,为什么糖尿病、高血压和精神疾病等复杂疾病如此难以预测,而且往往难以治疗。同样令人困惑的是,为什么同卵双胞胎中的一个患上了癌症或抑郁症这样的疾病,而另一个却安然无恙。
Now scientists have discovered a vital clue to unraveling these riddles. The human genome is packed with at least four million gene switches that reside in bits of DNA that once were dismissed as “junk” but that turn out to play critical roles in controlling how cells, organs and other tissues behave. The discovery, considered a major medical and scientific breakthrough, has enormous implications for human health because many complex diseases appear to be caused by tiny changes in hundreds of gene switches.
现在,科学家们发现了一个可以解开谜团的重要线索。人类基因组携带着至少400万个基因开关,这些基因开关位于曾被认为是垃圾的DNA片段上,但后来发现,它们在控制细胞、器官和其他组织的行为上起着关键作用。这一发现被认为是医学界和科学界的重大突破,对人类健康具有重大意义,因为很多复杂的疾病似乎就是由数百个基因开关的微小变化引起的。
The findings, which are the fruit of an immense federal project involving 440 scientists from 32 laboratories around the world, will have immediate applications for understanding how alterations in the non-gene parts of DNA contribute to human diseases, which may in turn lead to new drugs. They can also help explain how the environment can affect disease risk. In the case of identical twins, small changes in environmental exposure can slightly alter gene switches, with the result that one twin gets a disease and the other does not.
这些发现是一个大规模美国国家项目的成果。这一项目包括了来自世界上32个实验室的440名科学家。这些发现可以直接用于研究DNA非基因部分的改变如何引起人类疾病,从而促进新药的开发;也可以帮助解释环境如何影响患病风险。在同卵双胞胎一例中,所处环境的微小变化也会轻微地改变基因开关,从而导致双胞胎之一患上某种疾病,而另外一个则不会患病。
As scientists delved into the “junk” — parts of the DNA that are not actual genes containing instructions for proteins — they discovered it is not junk at all. At least 80 percent of it is active and needed. The result is an annotated road map of much of this DNA, noting what it is doing and how. It includes the system of switches that, acting like dimmer switches for lights, control which genes are used in a cell and when they are used, and determine, for instance, whether a cell becomes a liver cell or a neuron.
这些基因“垃圾”并不是真正的基因,不包含蛋白质指令。但是,当科学家们深入研究这些“垃圾”的时候,他们发现这部分根本不是“垃圾”,其中至少有80%是有活性和有用的。研究成果是一幅针对大部分此类DNA的带注解路线图,标明了它们的功能和运作方式。其中包括一个开关系统,这些开关如同灯的调光开关,控制着细胞使用哪些基因以及何时使用,还能决定一个细胞是成为肝细胞还是神经元。
“It’s Google Maps,” said Eric Lander, president of the Broad Institute, a joint research endeavor of Harvard and the Massachusetts Institute of Technology. In contrast, the project’s predecessor, the Human Genome Project, which determined the entire sequence of human DNA, “was like getting a picture of Earth from space,” he said. “It doesn’t tell you where the roads are, it doesn’t tell you what traffic is like at what time of the day, it doesn’t tell you where the good restaurants are, or the hospitals or the cities or the rivers.”
“这就像谷歌地图,”博德研究所(Broad Institute)的所长埃里克·兰德(Eric Lander)说道。该研究所由哈佛大学(Harvard)和麻省理工学院(Massachusetts Institute of Technology)共同成立。相比之下,该项目的前身、测定了人类DNA全部序列的人类基因组计划(Human Genome Project)则“更像是从太空中拍摄了一张地球的照片。它没有告诉你路在哪儿,没有告诉你一天中某个时候的交通如何,没有告诉你好的餐馆在哪儿,也没有告诉你医院、城市或河流在哪儿,”兰德说。
The new result “is a stunning resource,” said Dr. Lander, who was not involved in the research that produced it but was a leader in the Human Genome Project. “My head explodes at the amount of data.”
新成果是“惊人的资源”,兰德博士说。他没有参与这项研究,但曾是人类基因组计划的领导者之一。“看到如此大的数据量,我的头都快爆炸了。”
The discoveries were published on Wednesday in six papers in the journal Nature and in 24 papers in Genome Research and Genome Biology. In addition, The Journal of Biological Chemistry is publishing six review articles, and Science is publishing yet another article.
新发现以六篇论文的形式于周三发表在《自然》杂志(Nature)上,并以24篇论文发表在《基因组研究》(Genome Research)和《基因组生物学》(Genome Biology)上。另外,《生物化学杂志》(The Journal of Biological Chemistry)将发表六篇综述文章,《科学》杂志(Science)也将另外发表一篇文章。
Human DNA is “a lot more active than we expected, and there are a lot more things happening than we expected,” said Ewan Birney of the European Molecular Biology Laboratory-European Bioinformatics Institute, a lead researcher on the project.
人类DNA“比我们预期的要活跃得多,其中发生的事情也比我们预期的多得多,”来自欧洲分子生物学实验室-欧洲生物信息学研究所(European Molecular Biology Laboratory-European Bioinformatics Institute)的尤安·伯尼(Ewan Birney)说道,他是该项目的主要研究人员之一。
In one of the Nature papers, researchers link the gene switches to a range of human diseases — multiple sclerosis, lupus, rheumatoid arthritis, Crohn’s disease, celiac disease — and even to traits like height. In large studies over the past decade, scientists found that minor changes in human DNA sequences increase the risk that a person will get those diseases. But those changes were in the junk, now often referred to as the dark matter — they were not changes in genes — and their significance was not clear. The new analysis reveals that a great many of those changes alter gene switches and are highly significant.
在《自然》杂志的一篇论文中,研究人员指出,基因开关与多发性硬化症、狼疮、类风湿性关节炎、克罗恩病、乳糜泻等一系列人类疾病有关,甚至还与身高这样的体貌特征有关。在过去10年的大型研究中,科学家们发现人类DNA序列的微小变化会增加人们患上此类疾病的风险。但这些变化是发生在基因“垃圾”(如今常被称为“暗物质”)上,而不是发生在基因上,并且,这些变化究竟有多大意义,也不甚清楚。新的分析表明,这类变化中有很多会改变基因开关,并且意义重大。
“Most of the changes that affect disease don’t lie in the genes themselves; they lie in the switches,” said Michael Snyder, a Stanford University researcher for the project, called Encode, for Encyclopedia of DNA Elements.
“影响疾病的大多数变异不在基因本身,而在基因开关上,”项目的研究员之一,斯坦福大学(Stanford University)的迈克尔·斯奈德(Michael Snyder)说道。该项目称为“DNA元件百科全书计划”(Encyclopedia of DNA Elements),简称Encode。
And that, said Dr. Bradley Bernstein, an Encode researcher at Massachusetts General Hospital, “is a really big deal.” He added, “I don’t think anyone predicted that would be the case.”
“这是很重要的发现,”Encode研究员、马萨诸塞州综合医院(Massachusetts General Hospital)的布拉德利·伯恩斯坦博士(Bradley Bernstein)说道。他还补充,“我认为没有人预见到会是这样。”
The discoveries also can reveal which genetic changes are important in cancer, and why. As they began determining the DNA sequences of cancer cells, researchers realized that most of the thousands of DNA changes in cancer cells were not in genes; they were in the dark matter. The challenge is to figure out which of those changes are driving the cancer’s growth.
这些发现还能揭示,哪些基因变异对癌症比较重要,以及为什么重要。当研究人员开始测定癌细胞的DNA序列时,他们意识到,癌细胞中数以千计的DNA变异中的大多数并不在基因中,而是在暗物质中。困难在于找出这些变异中究竟哪些在驱动癌症的生长。
In prostate cancer, for example, mutations have been found in important genes that are not readily attacked by drugs. But Encode, by showing which regions of the dark matter control those genes, gives another way to attack them: target those controlling switches.
比如,在前列腺癌中已经发现了重要基因上的一些突变,但药物不易对它们起作用。但是,Encode能展示暗物质的哪些具体区域控制着这些基因,从而给出了解决它们的新途径:针对那些控制开关实施治疗。
Dr. Bernstein said, “This is a resource, like the human genome, that will drive science forward.”
伯恩斯坦博士称,“就像人类基因组一样,这是一种会推动科学进步的资源。”
The system, though, is stunningly complex, with many redundancies. Just the idea of so many switches was almost incomprehensible, Dr. Bernstein said.
不过,这个系统的复杂性令人咋舌,带有许多冗余。伯恩斯坦博士称,开关数量之多,到了难以想象的程度。
There also is a sort of DNA wiring system that is almost inconceivably intricate.
此外,还有一种DNA布线系统,其错综复杂的程度几乎令人难以想象。
“It is like opening a wiring closet and seeing a hairball of wires,” said Mark Gerstein, an Encode researcher from Yale. “We tried to unravel this hairball and make it interpretable.”
“这就像是打开一个配线室,结果看到一团乱麻,”Encode研究员、耶鲁大学的马克·格斯坦(Mark Gerstein)说道。“我们试图解开这团麻,使之能够被理解。”
The project began in 2003, as researchers began to appreciate how little they knew about human DNA. In recent years, some began to find switches in the 99 percent of human DNA that is not genes, but they could not fully characterize or explain what a vast majority of it was doing.
Encode计划始于2003年,当时研究人员开始意识到,他们对人类DNA知之甚少。前几年,一些人开始发现人类DNA中99%的非基因部分存在基因开关,但他们未能充分描述或解释其中大部分的功能。
The thought before the start of the project, said Thomas Gingeras, an Encode researcher from Cold Spring Harbor Laboratory, was that only 5 to 10 percent of the DNA in a human being was actually being used.
Encode研究员、冷泉港实验室(Cold Spring Harbor Laboratory)的托马斯·金格拉斯(Thomas Gingeras)称,计划开始之前,大家认为,仅有5%到10%的人类DNA真正被用到。
The big surprise was not only that almost all of the DNA is used but also that a large proportion of it is gene switches.
结果却大大出乎预料。不仅几乎全部的DNA都有用,而且其中很大一部分是基因开关。
By the time the National Human Genome Research Institute, part of the National Institutes of Health, embarked on Encode, major advances in DNA sequencing and computational biology had made it conceivable to try to understand the dark matter of human DNA. Even so, the data analysis was daunting — the researchers generated 15 trillion bytes of raw data. Analyzing the data required the equivalent of more than 300 years of computer time.
当美国国家卫生研究院(National Institutes of Health)下属的国家人类基因组研究所(National Human Genome Research Institute)启动Encode计划的时候,DNA测序和计算生物学的重大进展已经使尝试理解人类DNA暗物质成为可能。尽管如此,数据分析仍令人望而却步。研究人员得到的原始数据有15万亿字节之巨。分析这些数据需要相当于一台计算机运算300多年的时间。
Just organizing the researchers and coordinating the work was an enormous undertaking. Dr. Gerstein, who was one of the project’s leaders, has produced a diagram of the authors with their connections to one another. It looks nearly as complicated as the wiring diagram for the human DNA switches.
仅仅组织这些研究者并协调彼此的工作就是一项巨大的工程。格斯坦博士是项目的负责人之一,他画出了一张论文作者图,标出了他们之间彼此的联系。图的复杂程度,看起来与人类DNA开关网络图差不了多少。