HPASubC: A suite of tools for user subclassification of human protein atlas tissue images


BACKGROUND: The Human Protein Atlas (HPA) is a powerful proteomic tool for visualizing the distribution of protein expression across most human tissues and many common malignancies. The HPA includes immunohistochemically stained images from tissue microarrays (TMAs) that cover 48 tissue types and 20 common malignancies. The TMA data provide expression information at the tissue, cellular, and occasionally subcellular level. The HPA also provides subcellular data from confocal immunofluorescence imaging of three cell lines. Despite the availability of these localization data, many unique patterns of cellular and subcellular expression are not documented.

MATERIALS AND METHODS: To capture these more granular data, we have developed a suite of Python scripts, HPASubC, to aid in subcellular and cell-type-specific classification of HPA images. This method allows the user to download and optimize specific HPA TMA images for review. Then, using a PlayStation-style video game controller, a trained observer can rapidly step through tens of thousands of images to identify patterns of interest.

RESULTS: We have successfully used this method to identify 703 endothelial cell (EC)- and/or smooth muscle cell (SMC)-specific proteins within 49,200 heart TMA images. This list will assist us in subdividing cardiac gene or protein array data into expression by one of the predominant cell types of the myocardium: myocytes, SMCs, or ECs.

CONCLUSIONS: The opportunity to further characterize unique staining patterns across a range of human tissues and malignancies will accelerate our understanding of disease processes and point to novel markers for tissue evaluation in surgical pathology.
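The review step described in the methods, in which an observer steps through a set of images and marks patterns of interest, can be illustrated with a minimal Python sketch. This is not the HPASubC code: the function names (`review_images`, `to_csv`), the label set, and the single-key input scheme are all assumptions made for illustration, and the `get_key` callback merely stands in for whatever input device (such as a game controller) supplies the observer's decision.

```python
import csv
import io
from typing import Callable, Iterable, List, Tuple

# Hypothetical labels: the abstract does not describe the actual scoring scheme.
LABELS = {"y": "pattern_of_interest", "n": "no_pattern", "u": "uncertain"}


def review_images(
    image_ids: Iterable[str],
    get_key: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Step through images in order, recording one label per image.

    get_key is a stand-in for the input device: it receives an image ID
    and returns a single key. "q" stops the session early; any key not
    in LABELS is recorded as "uncertain".
    """
    results = []
    for image_id in image_ids:
        key = get_key(image_id)
        if key == "q":
            break  # keep everything scored so far
        results.append((image_id, LABELS.get(key, "uncertain")))
    return results


def to_csv(results: List[Tuple[str, str]]) -> str:
    """Serialize the scored images as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["image_id", "label"])
    writer.writerows(results)
    return buf.getvalue()
```

Passing the decision source in as a callback keeps the loop independent of the input hardware, so the same logic could be driven by a keyboard, a controller, or a scripted list of decisions.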

J Pathol Inform
Toby C. Cornish
Professor of Pathology and Biomedical Informatics

Clinical informaticist, gastrointestinal pathologist, and researcher.