Rupesh Kumar Srivastava
I am an Artificial Intelligence researcher, interested in learning algorithms (algorithms for learning from data) & learning algorithms (the idea that instead of designing algorithms, we should be learning them). I believe that learning is fundamentally about compression, and that thinking about compression is likely the most promising way of making progress towards better and more general learning algorithms. I’m always happy to talk about these ideas. Email me!
Some highlights of my past research are:
- Bayesian Flow Networks: a new generative modeling framework that combines Bayesian inference with Deep Learning, and gracefully extends the ideas behind diffusion models to discrete data.
- Upside-Down Reinforcement Learning: a new way of learning to act from reinforcement that avoids reward prediction in favor of relying purely on supervised learning, taking the ideas behind learning in hindsight to a logical conclusion.
- Highway Networks / Recurrent Highway Networks: the first neural network architecture that enabled training networks with very large depths of tens to hundreds of layers. It was a predecessor and more general version of the now-common Residual Networks.
A list of my publications can be found on Google Scholar.
Currently I am a Senior Research Scientist at NNAISENSE. I was one of the first employees at the company and also play various other roles, including managing the software infrastructure team.
I completed my PhD in 2018 at the Swiss AI lab IDSIA / USI in Lugano, Switzerland, supervised by Jürgen Schmidhuber. My dissertation, New Architectures for Very Deep Learning, focused on the training of very deep networks.
During an internship at Microsoft Research in 2015, I was part of the team that developed and published one of the first neural-network-based image captioning systems. Our team tied with Google for first place at the COCO Image Captioning Challenge in 2015.
I obtained Bachelor's and Master's degrees in Mechanical Engineering from IIT Kanpur. My Master's thesis, supervised by Kalyanmoy Deb, was on using evolutionary algorithms for reliable design optimization. During my Bachelor's I played an early role in project planning and design on Jugnu, India’s first nanosatellite. Jugnu was launched on 12 October 2011 into low Earth orbit by ISRO using a PSLV-CA C18.