The Human Protein Reference Database (HPRD) is a protein database accessible through the Internet.[1] It is closely associated with the Institute of Bioinformatics (IOB), a non-profit research organisation in Bangalore, India, and is a collaborative output of IOB and the Pandey Lab of Johns Hopkins University.
Overview
The HPRD is the result of an international collaborative effort between the Institute of Bioinformatics in Bangalore, India and the Pandey lab at Johns Hopkins University in Baltimore, USA. HPRD contains manually curated scientific information pertaining to the biology of most human proteins. Information regarding proteins involved in human diseases is annotated and linked to the Online Mendelian Inheritance in Man (OMIM) database. The National Center for Biotechnology Information provides links to HPRD through its human protein databases (e.g. Entrez Gene, RefSeq protein) pertaining to genes and proteins.
This resource presents information on human protein functions, including protein–protein interactions, post-translational modifications, enzyme-substrate relationships and disease associations. The catalogued protein annotations were derived through manual curation of the published literature by expert biologists and through bioinformatics analyses of the protein sequences. The protein–protein interaction and subcellular localization data from HPRD have been used to develop a human protein interaction network.[2]
Highlights of HPRD are as follows:
- From 10,000 protein–protein interactions (PPIs) annotated for 3,000 proteins in 2003, HPRD has grown to over 36,500 unique PPIs annotated for 25,000 proteins including 6,360 isoforms by the end of 2007.[3]
- More than 50% of molecules annotated in HPRD have at least one PPI and 10% have more than 10 PPIs.
- Experiments for PPIs are broadly grouped into three categories: in vitro, in vivo and yeast two-hybrid (Y2H). Sixty percent of PPIs annotated in HPRD are supported by a single experiment, whereas 26% are annotated with two of the three experimental methods.
- HPRD contains 18,000 manually curated post-translational modification (PTM) entries belonging to 26 different types. Phosphorylation is the leading type of protein modification, contributing 63% of the PTM data annotated in HPRD. Glycosylation, proteolytic cleavage and disulfide bridge events are the next leading contributors of PTM data.
- HPRD data is available for download in tab-delimited and XML file formats.[4]
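Summary figures such as the share of proteins with at least one PPI, or the share of PPIs annotated with one versus two experimental methods, can be recomputed from a tab-delimited interaction list. The following is a minimal sketch only: the three-column layout (interactor A, interactor B, semicolon-separated experiment categories), the function names and the sample rows are assumptions made for illustration and do not reproduce the actual HPRD download format or any real HPRD records.

```python
import csv
import io
from collections import defaultdict

# Hypothetical layout: interactor A, interactor B, experiment categories.
# The rows are made-up placeholders, not real HPRD records.
SAMPLE = (
    "P1\tP2\tin vitro;in vivo\n"
    "P1\tP3\tyeast two-hybrid\n"
    "P2\tP3\tin vivo\n"
    "P4\tP4\tin vitro\n"
)


def load_interactions(handle):
    """Read (protein_a, protein_b, experiment_categories) rows from a TSV handle."""
    interactions = []
    for a, b, experiments in csv.reader(handle, delimiter="\t"):
        categories = frozenset(
            e.strip() for e in experiments.split(";") if e.strip()
        )
        interactions.append((a, b, categories))
    return interactions


def partner_counts(interactions):
    """Map each protein to its number of distinct interaction partners."""
    partners = defaultdict(set)
    for a, b, _ in interactions:
        partners[a].add(b)
        partners[b].add(a)
    return {protein: len(found) for protein, found in partners.items()}


def share_by_category_count(interactions):
    """Fraction of interactions annotated with 1, 2 or 3 experiment categories."""
    counts = defaultdict(int)
    for _, _, categories in interactions:
        counts[len(categories)] += 1
    total = len(interactions)
    return {n: c / total for n, c in counts.items()}


interactions = load_interactions(io.StringIO(SAMPLE))
degrees = partner_counts(interactions)
shares = share_by_category_count(interactions)
```

With a real file, `io.StringIO(SAMPLE)` would be replaced by an opened file handle, and the column handling adjusted to whatever layout the downloaded file actually uses.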
HPRD also integrates data from Human Proteinpedia, a community portal for integrating human protein data. The data from HPRD can be freely accessed and used by academic users while commercial entities are required to obtain a license for use. Human Proteinpedia[5] content is freely available for anyone to download and use.
PhosphoMotif Finder
PhosphoMotif Finder[6] contains known kinase/phosphatase substrate motifs as well as binding motifs that are curated from the published literature. It reports the presence of any literature-derived motif in the query sequence. PhosphoMotif Finder does not predict motifs in the query protein sequence using any algorithm or other computational strategy.
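The difference between reporting the presence of a motif and predicting one can be illustrated with a plain pattern lookup: every entry in a fixed list is matched against the query sequence, and each occurrence is reported without any scoring or ranking. The sketch below is not PhosphoMotif Finder's implementation; the two patterns, their labels and the function name are illustrative assumptions and are not taken from its curated motif set.

```python
import re

# Illustrative patterns only; not taken from PhosphoMotif Finder's curated set.
# "." matches any residue, "[ST]" matches serine or threonine.
EXAMPLE_MOTIFS = {
    "example basophilic motif": "RR.[ST]",
    "example proline-directed motif": "[ST]P",
}


def find_motifs(sequence, motifs=EXAMPLE_MOTIFS):
    """Return (motif name, 1-based start position, matched residues) tuples.

    This is a lookup, not a prediction: a hit means only that the
    pattern occurs in the sequence.
    """
    hits = []
    for name, pattern in motifs.items():
        # A lookahead lets overlapping occurrences all be reported.
        for match in re.finditer(f"(?=({pattern}))", sequence):
            hits.append((name, match.start() + 1, match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


hits = find_motifs("MRRASPTP")
```

Because the result is a list of occurrences, a sequence that matches none of the listed patterns simply yields an empty list.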
Comparison of protein data
Several databases deal with the human proteome (e.g. BioGRID, BIND, DIP, HPRD, IntAct, MINT, MIPS, PDZBase and Reactome). Each database has its own style of presenting the data. It is difficult for most investigators to compare the voluminous data from these databases in order to assess the strengths and weaknesses of each one. Mathivanan and colleagues[7] sought to address this issue by analyzing the protein data in these databases with respect to various questions. This analysis is intended to help biologists choose among these databases based on their needs.
References
- Peri S, et al. (2003). "Development of human protein reference database as an initial platform for approaching systems biology in humans". Genome Research. 13 (10): 2363–71. doi:10.1101/gr.1680803. PMC 403728. PMID 14525934.
- Gandhi TKB, et al. (2006). "Analysis of the human protein interactome and comparison with yeast, worm and fly interaction datasets". Nature Genetics. 38 (3): 285–293. doi:10.1038/ng1747. PMID 16501559. S2CID 1446423.
- Mathivanan S, et al. (2006). "An evaluation of human protein–protein interaction data in the public domain". BMC Bioinformatics. 7 (Suppl 5): S19. doi:10.1186/1471-2105-7-S5-S19. PMC 1764475. PMID 17254303.
- Mishra G, et al. (2006). "Human protein reference database—2006 update". Nucleic Acids Research. 34 (Database issue): D411–D414. doi:10.1093/nar/gkj141. PMC 1347503. PMID 16381900.
- Mathivanan S, et al. (2008). "Human Proteinpedia enables sharing of human protein data". Nature Biotechnology. 26 (2): 164–167. doi:10.1038/nbt0208-164. hdl:10261/60528. PMID 18259167. S2CID 205265347.
- Amanchy R, et al. (2007). "A compendium of curated phosphorylation-based substrate and binding motifs". Nature Biotechnology. 25 (3): 285–286. doi:10.1038/nbt0307-285. PMID 17344875. S2CID 38824337.
- Mathivanan S, Periaswamy B, Gandhi TK, et al. (2006). "An evaluation of human protein–protein interaction data in the public domain". BMC Bioinformatics. 7 (Suppl 5): S19. doi:10.1186/1471-2105-7-S5-S19. PMC 1764475. PMID 17254303.