Mrinal Kanti Das

PhD student, Machine Learning Lab
Department of Computer Science and Automation
Indian Institute of Science
Bangalore, India, PIN - 560 012

Email: mrinal at csa dot iisc dot ernet dot in
Personal email: nmrinl at gmail dot com

Advisor: Prof. Chiranjib Bhattacharyya

I am fascinated by developing simple and novel mathematical models that address interesting and challenging practical problems. During my PhD I have focused on text datasets, which are ubiquitous and pose interesting challenges that require novel techniques. Broadly, I am interested in:

  • Designing hierarchical Bayesian models for text datasets (mono-lingual as well as multi-lingual), information retrieval and extraction.

  • Exploring nonparametric Bayesian models for practical problems.

I am proficient and experienced in both variational approximation and MCMC-based inference for topic models. I like to explore different application areas, and in the recent past I have worked with various types of text datasets, such as software projects, speech transcripts, multi-lingual corpora, news/blogs and comments. I have observed some challenging problems associated with these applications and developed novel mathematical models to solve them.

Overview of my recent work

Topic models are popular mathematical tools for analysing text datasets, where a corpus is a collection of documents. The state-of-the-art notion in topic models was to use a single topic vector per document.

I came up with the novel yet simple idea of using multiple topic vectors (MTV). We have observed that MTV is remarkably effective at (i) discovering subtle topics and (ii) modeling specific correspondence. These two abilities led to two novel models: (i) subtle topic models (STM, ICML 2013) and (ii) specific correspondence topic models (SCTM, WSDM 2014).

Currently I am working on nonparametric Bayesian models for learning from very large-scale datasets (more than 700 million tokens). No MCMC-based method is known to handle this scale without expensive parallel hardware. I have invented a novel Bayesian nonparametric prior and used the concept of MTV across documents to apply MCMC. We have observed significant results.

Personal Information

  • Born on May 10, 1982.

  • Citizen of India.