In separate papers published this week, two independent teams have
drafted the first maps of the human proteome -- charts of all of the
proteins that make up a person. And both teams discovered that some
proteins do come from “noncoding” DNA sequences.
The proteome is an important complement to the genome and transcriptome,
and together they create a more complete resource for researching
health and diseases. While genes determine many of our characteristics,
they’re able to do that by providing instructions for making proteins.
So these draft maps -- which you can think of as the Human Genome
Project for proteins -- consist of profiles of proteins expressed in all
sorts of different human cell types. Both drafts were generated using mass spectrometry.
One of the teams, led by Akhilesh Pandey from Johns Hopkins University,
identified and annotated proteins encoded by 17,294 genes – that
accounts for around 84 percent of all the genes in the human genome that
are predicted to encode proteins (that number is estimated at 19,629,
if you’re curious). The team extracted proteins from samples of 30
different tissues, then used enzymes to cut them into small pieces
called peptides. They ran the peptides through a series of instruments
to identify them and measure their relative abundance.
They also discovered 193 novel proteins that come from regions of the
genome that haven't been predicted to code for proteins. Within the
genome, there are stretches of DNA whose sequences don’t follow a
conventional protein-coding gene pattern – these have been labeled
as noncoding. “The fact that 193 of the proteins came from DNA sequences
predicted to be noncoding means that we don’t fully understand how
cells read DNA, because clearly those sequences do code for proteins,”
Pandey explains in a news release.
The other team, led by Bernhard Kuster of Technische Universität München
(TUM) in Germany, assembled protein evidence for over 18,000 genes
(or 92 percent of the entire proteome) by compiling raw mass spec data
from databases and other analyses that were already available. These
include a core of 10,000–12,000 proteins expressed in several different
tissues, and to fill in the gaps, they generated their own mass spec
data by analyzing 60 human tissues, 13 body fluids, and 147 cancer cell
lines.
Like the Hopkins team, they found evidence of translation from
DNA regions that were not thought to be translated. This includes more
than 400 translated long intergenic non-coding RNAs (lincRNAs). "While
we have a good idea of what the genome looks like, we didn't know how
many of those potentially 20,000 protein-coding genes would actually
make protein," Kuster tells BBC.
The team also identified protein markers that may predict an
individual’s resistance or sensitivity to drugs for diseases like
cancer.
“You can think of the human body as a huge library where each protein is a book,” Pandey says.
“The difficulty is that we don’t have a comprehensive catalog that
gives us the titles of the available books and where to find them.” Now
it looks like we’ve got two first drafts of that comprehensive catalog.
Each group has built a publicly accessible, interactive database of
its data: Human Proteome Map and ProteomicsDB.
Although they had seen each other's work at conferences, both Pandey and Kuster tell BBC they had "no idea" they were headed towards publishing simultaneously. And now they share a Nature cover. "We never saw this as a race to be first," Kuster says.
"My interpretation is that when the time is right, somebody's going to
just do it. And perhaps two people are going to do it!"
Both sets of findings were published in Nature this week.