Harvard Medical School researchers have mapped the interaction partners for proteins encoded by more than 5,800 genes, representing over a quarter of the human genome, according to a new study published online in Nature on May 17.
The network, dubbed BioPlex 2.0, identifies more than 56,000 unique protein-to-protein interactions—87 percent of them previously unknown—the largest such network to date.
BioPlex reveals protein communities associated with fundamental cellular processes and with diseases such as hypertension and cancer, and highlights new opportunities for efforts to understand human biology and disease.
The work was done in collaboration with Biogen, which also provided partial funding for the study.
“A gene isn’t just a sequence of a piece of DNA. A gene is also the protein that it encodes, and we will never understand the genome until we understand the proteome,” said co-senior author Wade Harper, the Bert and Natalie Vallee Professor of Molecular Pathology and chair of the Department of Cell Biology at Harvard Medical School. “BioPlex provides a framework with the depth and breadth of data needed to address this challenge.”
“This project is an atlas of human protein interactions, spanning almost every aspect of biology,” said co-senior author Steven Gygi, professor of cell biology and director of the Thermo Fisher Center for Multiplexed Proteomics at Harvard Medical School. “This creates a social network for each protein and allows us to see not only how proteins interact, but also possible functional roles for previously unknown proteins.”
Bait and prey
Of the roughly 20,000 protein-coding genes in the human genome, scientists have studied only a fraction in detail. To work toward a description of the entire cast of proteins in a cell and the interactions between them (known as the proteome and interactome, respectively), a team led by Harper and Gygi developed BioPlex, a high-throughput approach for identifying protein interactions.
BioPlex uses so-called affinity purification, in which a single tagged “bait” protein is expressed in human cells in the laboratory. The bait protein binds with its interaction partners, or “prey” proteins, which are then fished out of the cell and analyzed using mass spectrometry, a technique that identifies and quantifies proteins based on their unique molecular signatures. In 2015, an initial effort (BioPlex 1.0) used approximately 2,0 different bait proteins, drawn from the Human ORFeome database, to identify nearly 24,000 protein interactions.
In the current study, the team expanded the network to include a total of 5,891 bait proteins, which revealed 56,553 interactions involving 10,961 different proteins. An estimated 87 percent of these interactions have not been previously reported.
Guilt by association
By mapping these interactions, BioPlex 2.0 identifies groups of functionally related proteins, which tend to cluster into tightly interconnected communities. Such “guilt-by-association” analyses suggested possible roles for previously unknown proteins, as these communities often commingle proteins with both known and unknown functions.
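The guilt-by-association idea can be illustrated with a minimal sketch. This is not the BioPlex team's actual analysis: the protein names, the interactions and the simple majority vote among annotated neighbors are all hypothetical stand-ins, chosen only to show how a protein of unknown function might inherit a tentative role from the partners it interacts with.

```python
from collections import Counter, defaultdict

# Hypothetical toy interactions (not BioPlex data); each pair is one undirected edge.
interactions = [
    ("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "U1"), ("P2", "U1"),
    ("Q1", "Q2"), ("Q2", "U2"), ("Q1", "U2"),
]

# Hypothetical annotations; U1 and U2 are deliberately left uncharacterized.
known_function = {
    "P1": "transcription", "P2": "transcription", "P3": "transcription",
    "Q1": "energy production", "Q2": "energy production",
}


def build_neighbors(edges):
    """Map each protein to the set of proteins it interacts with."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


def predict_functions(edges, annotations):
    """Give each unannotated protein the most common function among its annotated neighbors."""
    neighbors = build_neighbors(edges)
    predictions = {}
    for protein in sorted(neighbors):
        if protein in annotations:
            continue
        votes = Counter(
            annotations[n] for n in neighbors[protein] if n in annotations
        )
        if votes:  # skip proteins with no annotated neighbors
            predictions[protein] = votes.most_common(1)[0][0]
    return predictions


predictions = predict_functions(interactions, known_function)
for protein, function in predictions.items():
    print(f"{protein}: possibly involved in {function}")
```

In this toy network each unknown protein sits inside one tightly connected group, so the vote is unambiguous. A real analysis would need a principled way to detect communities and to weigh conflicting or sparse annotations.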
The team mapped numerous protein clusters associated with basic cellular processes, such as DNA transcription and energy production, and with a variety of human diseases. Colorectal cancer, for example, appears to be linked to protein networks that play a role in abnormal cell growth, while hypertension is linked to protein networks for ion channels, transcription factors and metabolic enzymes.
“With the upgraded network, we can make stronger predictions because we have a more complete picture of the interactions within a cell,” said first author Edward Huttlin, instructor of cell biology at Harvard Medical School. “We can pick out statistical patterns in the data that might suggest disease susceptibility for certain proteins, or others that might suggest function or localization properties. This makes a significant portion of the human proteome accessible for study.”
The entire BioPlex network and accompanying data are publicly available, supporting both large-scale studies of protein interaction and targeted studies of the function of specific proteins.
Although the network is the largest collection of such data gathered to date, the authors caution that it remains an incomplete product. The current pipeline expresses bait proteins in only one cell type (human embryonic kidney cells) grown under one set of conditions, for example, and distinct interactions may occur in different cell types or microenvironments.
As the network increases in size and more human proteins are used as baits, scientists can better judge the accuracy of each individual protein interaction by considering its context in the larger network. Isolating the same protein complex repeatedly, each time using a different member as a bait, can provide multiple independent experimental observations to confirm each protein’s membership. Moreover, by using prey proteins as bait, many protein interactions can be observed in the opposite direction as well. Both of these scenarios greatly reduce the likelihood that particular interactions were identified due to chance. The team continues to add to BioPlex, with a target of around 10,000 bait proteins, which could cover half of the human genome and further increase the predictive power of the network.
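The reciprocal check described above, in which a prey protein is later used as bait and recovers its original bait, can be sketched in a few lines. The observations below are hypothetical and the function name is made up for illustration. It shows only the bookkeeping, not the statistical scoring a real pipeline would apply.

```python
# Hypothetical (bait, prey) observations from separate pull-down experiments.
observations = [
    ("A", "B"), ("B", "A"),  # A pulled down B, and B pulled down A
    ("A", "C"),              # seen in one direction only
    ("C", "D"), ("D", "C"),
    ("B", "E"),
]


def reciprocal_pairs(obs):
    """Return the interactions that were observed in both bait-to-prey directions."""
    seen = set(obs)
    confirmed = {
        tuple(sorted((bait, prey)))
        for bait, prey in seen
        if bait != prey and (prey, bait) in seen
    }
    return sorted(confirmed)


confirmed = reciprocal_pairs(observations)
print(confirmed)
```

Pairs seen in only one direction, such as A with C, are not necessarily wrong. They simply lack this second, independent line of evidence.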
“We certainly aren’t seeing all the interactions, but it’s a launching point. We think it’s important to continue to build this map, to see how much of it is reproduced in other cell types under different conditions, to see whether the interactions are similar or dynamic,” Gygi said. “Because whether you’re interested in cancer or neurodegenerative disease, basic development or evolutionary fitness, you can make new hypotheses and learn something from this network.”
Facebook for the proteome
Architecture of the human interactome defines protein communities and disease networks, Nature (2017). nature.com/articles/doi:10.1038/nature22366