A Hybrid Method for Detecting Protein Complexes in Weighted Protein-Protein Interaction Networks

Jerome Cary Beltran, Catalina Montes

Adviser: Adrian Roy Valdez, PhD

Co-Adviser: Jaymar Soriano, MSc

Co-Adviser: John Justine Villar, MSc

Abstract

The detection of protein complexes within protein-protein interaction networks (PPINs) using computational methods has become a field of interest, as many studies have shown that protein-protein interactions (PPIs) regulate many significant cellular functions. While a number of the computational methods developed for protein complex detection are based on graph clustering, experimental studies have revealed that some relevant biological insights about protein complexes are not reflected in these methods.

This study proposes a hybrid algorithm that combines a multi-level, regularized variant of the Markov clustering algorithm (MCL) with a variant of MCL that uses a core-attachment scheme to weight PPIs and perform graph clustering. The proposed algorithm was tested on the BioGRID and DIP S. cerevisiae PPIN datasets, and the output clusters were compared against the CYC2008 protein complex data. An F-score was computed for each output cluster, along with the average F-score over all output clusters. To further analyze and improve the performance of the hybrid algorithm, a search was also conducted for values of the method’s user-defined parameters that give better clustering results than the default or baseline values.

Clustering with the proposed algorithm improved average F-scores by 35% to 150% compared with the three other clustering algorithms considered in this work. This improvement reflects the effect of incorporating biological information into a purely graph-theoretic clustering algorithm.
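
The F-score evaluation mentioned above can be illustrated with a minimal sketch. It assumes a definition that is common in the protein complex detection literature but is not spelled out in this abstract: precision and recall are computed from the overlap between an output cluster and a reference complex, each cluster is scored against its best-matching reference complex, and the scores are averaged over all clusters. The function names and the protein labels (P1, P2, ...) are hypothetical and are not taken from the study.

```python
from typing import List, Set


def f_score(cluster: Set[str], reference_complex: Set[str]) -> float:
    """Harmonic mean of precision and recall for one cluster/complex pair."""
    overlap = len(cluster & reference_complex)
    if overlap == 0:
        return 0.0
    precision = overlap / len(cluster)          # fraction of the cluster that is in the complex
    recall = overlap / len(reference_complex)   # fraction of the complex that was recovered
    return 2 * precision * recall / (precision + recall)


def best_f_score(cluster: Set[str], reference: List[Set[str]]) -> float:
    """Score a cluster against its best-matching reference complex (assumed matching rule)."""
    return max((f_score(cluster, c) for c in reference), default=0.0)


def average_f_score(clusters: List[Set[str]], reference: List[Set[str]]) -> float:
    """Mean of the best-match F-scores over all output clusters."""
    scores = [best_f_score(c, reference) for c in clusters]
    return sum(scores) / len(scores) if scores else 0.0


# Toy data with placeholder protein labels; not real BioGRID, DIP, or CYC2008 entries.
reference_complexes = [{"P1", "P2", "P4"}, {"P5", "P6"}]
output_clusters = [{"P1", "P2", "P3"}, {"P7", "P8"}]

avg = average_f_score(output_clusters, reference_complexes)
```

In this toy case the first cluster shares two of its three proteins with a three-protein complex, while the second cluster overlaps with no reference complex and contributes a score of zero to the average. Other evaluation schemes, such as thresholding on an overlap score before counting a match, would give different numbers.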