Publications


  1. Projections for fast protein structure retrieval. Sourangshu Bhattacharya, Chiranjib Bhattacharyya and Nagasuma R. Chandra. BMC Bioinformatics. 2006 Dec 18;7 Suppl 5:S5.


Abstract:

Background:
In recent times, there has been an exponential rise in the number of protein structures in databases such as the PDB. The design of fast algorithms capable of querying such databases is therefore becoming an increasingly important research issue. This paper reports an algorithm, motivated by spectral graph matching techniques, for retrieving protein structures similar to a query structure from a large protein structure database. Each protein structure is specified by the 3D coordinates of the residues of the protein. The algorithm is based on a novel characterization of the residues, called projections, leading to a similarity measure between the residues of the two proteins. This measure is exploited to efficiently compute the optimal equivalences.
Results:
Experimental results show that the current algorithm outperforms the state of the art on benchmark datasets in terms of speed without losing accuracy. Search results on the SCOP 95% nonredundant database, for fold similarity with 5 proteins from different SCOP classes, show that the current method performs competitively with the standard algorithm CE. The algorithm is also capable of detecting non-topological similarities between two proteins, which is not possible with most state-of-the-art tools such as Dali.
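
As a rough illustration of the spectral idea mentioned in the abstract above, the sketch below assigns one scalar to each residue from the leading eigenvector of a Gaussian-weighted adjacency matrix built on residue coordinates. This is a generic spectral construction chosen for illustration, not the paper's definition of projections; the function names, the Gaussian weighting, and the `sigma` value are all assumptions.

```python
import math

def distance_matrix(coords):
    """Pairwise Euclidean distances between 3D points."""
    return [[math.dist(p, q) for q in coords] for p in coords]

def leading_eigenvector(matrix, iterations=200):
    """Power iteration; converges for a symmetric matrix with strictly positive entries."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(n)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def spectral_signature(coords, sigma=5.0):
    """One scalar per residue: its component in the leading eigenvector
    of a Gaussian-weighted adjacency matrix (an illustrative choice)."""
    dist = distance_matrix(coords)
    adjacency = [[math.exp(-(d * d) / (2 * sigma * sigma)) for d in row]
                 for row in dist]
    return leading_eigenvector(adjacency)

def rotate_z_and_shift(coords, angle, shift):
    """Rigid motion: rotate about the z axis, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in coords]

# Hypothetical residue coordinates (in angstroms), made up for this example.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.5, 3.2, 0.0),
          (8.9, 4.1, 1.5), (10.2, 7.3, 2.0)]
signature = spectral_signature(coords)
moved_signature = spectral_signature(rotate_z_and_shift(coords, 0.7, (10.0, -4.0, 2.5)))
```

Because the signature depends only on inter-residue distances, it is unchanged by rotating or translating the structure, which is the property that makes such per-residue quantities comparable across two proteins without first superimposing them.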


  2. Comparison of protein structures by growing neighborhood alignments. Sourangshu Bhattacharya, Chiranjib Bhattacharyya and Nagasuma R Chandra. BMC Bioinformatics. 2007 Mar 6;8:77.


Abstract:

Background:
The design of protein structure comparison algorithms is an important research issue with far-reaching implications. In this article, we describe a protein structure comparison scheme which is capable of detecting correct alignments even in difficult cases, e.g. non-topological similarities. The proposed method computes protein structure alignments by comparing small substructures, called neighborhoods. Two different types of neighborhoods, sequence and structure, are defined, and two algorithms arising out of the scheme are detailed. A new method for computing equivalences having non-topological similarities from pairwise similarity scores is described. A novel and fast technique for comparing sequence neighborhoods is also developed.
Results:
The experimental results show that the current programs perform better on Fischer's and Novotny's benchmark datasets than state-of-the-art programs, e.g. DALI, CE and SSM. Our programs were also found to calculate correct alignments for proteins with a large number of indels and internal repeats. Finally, the sequence neighborhood based program was used in extensive fold and non-topological similarity detection experiments. The accuracy of the fold detection experiments with the new measure of similarity was found to be similar to or better than that of the standard algorithm CE.
Conclusion:
A new scheme, resulting in two algorithms, has been developed, implemented and tested. The programs developed are accessible at http://mllab.csa.iisc.ernet.in/mp2/runprog.html.
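
The abstract above names two kinds of neighborhoods without defining them. One plausible reading, sketched below purely as an assumption, is that a sequence neighborhood is a window of consecutive residues along the chain, while a structure neighborhood is the set of spatially nearest residues. The function names and the `half_width` and `k` parameters are hypothetical and are not taken from the paper.

```python
import math

def sequence_neighborhood(coords, i, half_width=2):
    """Indices of residues within half_width chain positions of residue i."""
    lo = max(0, i - half_width)
    hi = min(len(coords), i + half_width + 1)
    return list(range(lo, hi))

def structure_neighborhood(coords, i, k=4):
    """Indices of residue i and its k nearest residues in 3D space."""
    others = sorted((j for j in range(len(coords)) if j != i),
                    key=lambda j: math.dist(coords[i], coords[j]))
    return sorted([i] + others[:k])

# A made-up hairpin: the chain runs out along x and folds back on itself.
hairpin = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0),
           (8.0, 4.0, 0.0), (4.0, 4.0, 0.0), (0.0, 4.0, 0.0)]

seq_nb = sequence_neighborhood(hairpin, 0, half_width=1)  # residues 0 and 1
str_nb = structure_neighborhood(hairpin, 0, k=2)          # residues 0, 1 and 5
```

In this toy hairpin, the structure neighborhood of the first residue includes the last residue, which is far away along the chain but close in space. Under the assumption above, this is the kind of contact that a purely sequential window cannot see.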


  3. Structural Alignment based Kernels for Protein Structure Classification. Sourangshu Bhattacharya, Chiranjib Bhattacharyya and Nagasuma R Chandra. In Proceedings of 24th International Conference on Machine Learning (ICML), 2007.


Abstract

Structural alignments are the most widely used tools for comparing proteins with low sequence similarity. The main contribution of this paper is to derive various kernels on proteins from structural alignments, which do not use sequence information. Central to the kernels is a novel alignment algorithm which matches substructures of fixed size using spectral graph matching techniques. We derive positive semi-definite kernels which capture the notion of similarity between substructures. Using these as a base, more sophisticated kernels on protein structures are proposed. To empirically evaluate the kernels, we used a set of 40% sequence non-redundant structures from 15 different SCOP superfamilies. The kernels, when used with SVMs, show competitive performance with CE, a state-of-the-art structure comparison program.


  4. Kernels on Attributed Pointsets with Applications. Mehul Parsana, Sourangshu Bhattacharya, Chiranjib Bhattacharyya and K. R. Ramakrishnan. In Proceedings of 21st Annual Conference on Neural Information Processing Systems (NIPS), 2007.

Abstract

This paper introduces kernels on attributed pointsets, which are sets of vectors embedded in a Euclidean space. The embedding gives the notion of neighborhood, which is used to define positive semidefinite kernels on pointsets. Two novel kernels on neighborhoods are proposed, one evaluating the attribute similarity and the other evaluating shape similarity. The shape similarity function is motivated by spectral graph matching techniques. The kernels are tested on three real-life applications: face recognition, photo album tagging, and shot annotation in video sequences, with encouraging results.
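
To make the notion of a positive semidefinite kernel on pointsets concrete, here is a minimal sketch of a standard baseline: summing a Gaussian base kernel over all cross pairs of points. This is a generic set kernel, not the neighborhood-based kernels proposed in the paper; the function names and the `gamma` value are assumptions for illustration.

```python
import math

def gaussian(p, q, gamma=0.5):
    """Gaussian (RBF) base kernel between two points of equal dimension."""
    return math.exp(-gamma * math.dist(p, q) ** 2)

def pointset_kernel(set_a, set_b, gamma=0.5):
    """Sum of base-kernel values over all cross pairs of points.

    This is positive semidefinite because it equals an inner product
    between the summed feature maps of the two sets.
    """
    return sum(gaussian(p, q, gamma) for p in set_a for q in set_b)

# Three made-up 2D pointsets of different sizes.
a = [(0.0, 0.0), (1.0, 0.0)]
b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
c = [(5.0, 5.0)]
pointsets = [a, b, c]

# Gram matrix over the three pointsets.
gram = [[pointset_kernel(s, t) for t in pointsets] for s in pointsets]
```

A Gram matrix built this way can be passed to any kernel method that accepts precomputed kernels. The baseline ignores which points are neighbors of which, and that neighborhood structure is what the kernels described in the abstract are designed to exploit.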




Copyleft 2007

Sourangshu Bhattacharya
Feel free to copy! Why duplicate effort?
Only give me some credit.
See GNU GPL.