Investigating bias in semantic similarity measures for analysis of protein interactions

Mina, M. and Guzzi, P. H. (2012) Investigating bias in semantic similarity measures for analysis of protein interactions. Il nuovo cimento C, 35 (Sup. 1). pp. 71-80. ISSN 1826-9885


Protein interactions are fundamental building blocks of almost all cellular processes, so the study of the set of protein interactions in a single organism (also referred to as a Protein Interaction Network, PIN) is an important step towards understanding mechanisms at the molecular level. Recently, the possibility of annotating such data with Gene Ontology (GO) terms, and the consequent use of ontology-based analysis such as semantic similarity (SS) measures, has been exploited. However, SS measures present many challenges and issues that have to be addressed. In particular, SS measures are affected by three main biases: i) annotation length, ii) evidence codes, and iii) shallow annotation. The common causes of these biases are the structure of GO and the corpora of annotations (GOA). Consequently, the impact of this variability has to be considered when developing novel algorithms for the analysis of protein interactions. Despite the importance of these aspects, a systematic analysis of the bias is lacking: few works have dealt with the three sources of bias that most affect SS measures. This paper demonstrates the existence of the biases that affect the main SS measures on a set of well-known yeast complexes. It also provides some evidence about the variability of the bias effects over the proteome.

Item Type: Article
Uncontrolled Keywords: Computational techniques; simulations
Subjects: 500 Scienze naturali e Matematica > 530 Fisica
Depositing User: Marina Spanti
Date Deposited: 27 Apr 2020 14:54
Last Modified: 27 Apr 2020 14:54
