Profile

The major focus of our research has been the use of data-driven in silico approaches, including machine learning, to decipher the evolutionarily conserved features that govern biomolecular structure and interaction networks. These conserved principles are used to develop computational approaches for identifying novel biosynthetic pathways, protein-protein interaction networks, disease-associated mutations and new drug targets by genome mining. These computational methods are being used to analyse and predict metabolic, signalling and regulatory networks associated with pathogenic organisms such as Mycobacterium tuberculosis and Plasmodium falciparum, to decipher the molecular basis of the disease association of the human microbiome, and to elucidate the role of disease-associated SNPs in the human genome. The innovative computational methods developed by our group have successfully addressed several biologically relevant questions in the broader areas of host-pathogen interactions and disease biology, with specific emphasis on drug target identification and drug discovery for tuberculosis, malaria and SARS-CoV-2.

Current Focus Areas

  • Machine learning-based methods are being developed for in silico identification and functional annotation of prokaryotic small ORFs (smORFs) and for prediction of the functional consequences of disease-associated mutations in smORFs found in the human genome.

  • Microsecond-scale MD simulations are being used to understand how binding of allosteric inhibitors to mutant kinases can shift the population of conformers from the active to the inactive state. A machine learning-based scoring function is being developed for prediction of allosteric inhibitors of EGFR.

  • An AI/ML-based method has been developed for prediction of genotypic drug resistance (DR) in M. tuberculosis from whole genome sequences and for identification of novel DR-associated mutations. Structural modelling of DR-associated mutant proteins is being carried out to decipher the structural basis of drug resistance.

  • To evaluate the accuracy of AI/ML-based methods for prediction of oligomeric protein complexes and host-virus PPIs, systematic benchmarking of AlphaFold2 and ESMFold has been carried out on newly released protein structures that lack high homology with known structures.

  • Peptide-specific (an individual model for each epitope) and pan-specific (a single model for all epitopes) machine learning models have been developed to predict the recognition specificities of TCR-pMHCs using only epitope and CDR3β sequences obtained from public-domain TCRSeq data.

Selected Publications

  • Khanduja A, Kumar M and Mohanty D (2023) ProsmORF-pred: a machine learning-based method for the identification of small ORFs in prokaryotic genomes. Briefings in Bioinformatics bbad101.

  • Gupta P and Mohanty D (2021) SMMPPI: A machine learning based approach for prediction of modulators of protein-protein interactions & its application for identification of novel inhibitors for RBD:hACE2 interactions in SARS-CoV-2. Briefings in Bioinformatics 22(5):bbab111.

  • Agrawal P, Amir S, Deepak, Barua D and Mohanty D (2021) RiPPMiner-Genome: A Web Resource for Automated Prediction of Crosslinked Chemical Structures of RiPPs by Genome Mining. J Mol Biol 433(11):166887.

  • Agrawal P and Mohanty D (2020) A machine learning-based method for prediction of macrocyclization patterns of polyketides and nonribosomal peptides. Bioinformatics 37:603-611.

  • Agrawal P, Khater S, Gupta M, Sain N and Mohanty D (2017) RiPPMiner: A bioinformatics resource for deciphering chemical structures of RiPPs based on prediction of cleavage and cross-links. Nucleic Acids Res 45(W1):W80-W88.

Skills & Proficiency

Structural Bioinformatics, Machine Learning, Genome Mining, Molecular Dynamics, Protein-Protein Interactions, Kinase, Mycobacterium, Drug Resistance, Secondary Metabolites, Microbiome, smORF, PDZ domain, Protein Structure Prediction