Siddhartha Thota

Hello! I’m Sid, a graduate student at the University of Toronto. I expect to graduate from the MScAC program in December 2018. I am interested in applications of machine learning and in building large-scale ML applications.

I am in the market for an 8-month applied research internship starting in May 2018.

Email  /  Resume  /  Google Scholar  /  LinkedIn

Sidewalk terrain estimation using mobile sensors for estimating wheelchair accessibility
pdf / code archive

We build a machine learning model to classify mobile phone accelerometer data into terrain information. GPS information is used to correlate multiple accelerometer readings, enabling crowdsourcing. Road segments are eventually assigned a binary label, with a confidence that increases with the amount of data at hand. The system is verified with walks around Toronto.
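The aggregation step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `aggregate_segments`, the input format, and the choice of a Laplace-smoothed vote fraction as the confidence score are all assumptions made for the example.

```python
from collections import defaultdict


def aggregate_segments(predictions):
    """Combine per-reading classifier outputs into one label per road segment.

    `predictions` is an iterable of (segment_id, is_accessible) pairs, one per
    classified accelerometer window that GPS placed on that segment.
    Returns {segment_id: (label, confidence)}.
    """
    # segment_id -> [accessible votes, total votes]
    counts = defaultdict(lambda: [0, 0])
    for segment_id, is_accessible in predictions:
        counts[segment_id][0] += int(is_accessible)
        counts[segment_id][1] += 1

    result = {}
    for segment_id, (positive, total) in counts.items():
        # Assumed scoring rule: Laplace-smoothed vote fraction. With few
        # readings it stays near 0.5; it moves toward 0 or 1 as
        # consistent readings accumulate.
        p = (positive + 1) / (total + 2)
        label = p >= 0.5
        confidence = max(p, 1 - p)
        result[segment_id] = (label, confidence)
    return result


if __name__ == "__main__":
    readings = [("king-st-01", True)] * 10 + [("queen-st-07", False)]
    for segment, (label, confidence) in aggregate_segments(readings).items():
        print(segment, label, round(confidence, 3))
```

Under this scoring rule, a segment with a single reading gets a confidence of about 0.67, while ten agreeing readings raise it to about 0.92.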

Scheduling algorithms for applications on heterogeneous multi-tier clouds
pdf / code archive

This project aims to characterize applications that run on a heterogeneous, multi-tier cloud. We define a metric based on an application's performance relative to the resources assigned to it, which can be used to make future decisions about the application. The metric also informs decisions about reprovisioning applications with the least impact on performance guarantees. The approach is based on experiments run on Docker hosts running on local machines.

Scale independent raga identification using chromagram patterns and swara based features
pdf / ieee article

In Indian classical music, a raga describes the constituent structure of notes in a musical piece. In this work, we investigate the problem of scale-independent automatic raga identification and achieve state-of-the-art results using GMM-based Hidden Markov Models over a collection of features consisting of chromagram patterns, mel-cepstrum coefficients and timbre features. On a dataset of 4 ragas (darbari, khamaj, malhar and sohini), we achieved an average accuracy of 97%.
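One common way to make chroma features independent of the performer's scale is to rotate each 12-bin chroma vector so that the tonic lands in bin 0. The sketch below illustrates that idea only; it is an assumption that the paper works this way, and the function name `normalize_chroma` and the crude "strongest bin is the tonic" fallback are hypothetical choices for the example.

```python
def normalize_chroma(chroma, tonic_bin=None):
    """Rotate a 12-bin chroma vector so the tonic sits at index 0.

    `chroma` is a sequence of 12 pitch-class energies. If `tonic_bin` is not
    given, the strongest pitch class is taken as the tonic (a crude,
    assumed heuristic).
    """
    chroma = list(chroma)
    if len(chroma) != 12:
        raise ValueError("expected 12 pitch-class bins")
    if tonic_bin is None:
        tonic_bin = max(range(12), key=lambda i: chroma[i])
    # Circular shift: bins from the tonic onward first, then the wrapped part.
    return chroma[tonic_bin:] + chroma[:tonic_bin]


if __name__ == "__main__":
    pattern = [5, 0, 1, 0, 2, 0, 0, 3, 0, 1, 0, 0]
    # The same pattern sung three semitones higher.
    transposed = pattern[-3:] + pattern[:-3]
    print(normalize_chroma(pattern) == normalize_chroma(transposed))
```

After rotation, the same melodic pattern performed at two different base pitches yields the same feature vector, which is what lets a downstream model ignore the absolute scale.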

Call center call analysis for emotion-based cues

As my undergraduate project, I worked on call center call recordings. Working on the hypothesis that a call center complaint call usually goes from a “conflict” to a “resolution” (or not), I manually labeled several hundred hours of call center calls with binary tags and tried to train an HMM. The takeaways from this project were insight into the variability of real-world data, how to extract features (MFCC, FFT and other signal features) and how to work with noisy data. I also learnt a whole lot about how to take an unlabeled dataset and turn it into something that resembles an annotated dataset. Details of this project remain fairly confidential because of the nature of the dataset we worked on.

Work Experience

After graduating from NITK, I worked at Adobe for three years. I started in a team that worked on a research-y, customer-facing application that involved a lot of prototyping and back-and-forth with the customers. After that project was handed over to a passive maintenance team, I built prototypes for components of Adobe’s Sensei. I left Adobe to join the MScAC program, which I am currently pursuing at the University of Toronto.

Publishing machine learning algorithms at scale, as services

This project aimed to take machine learning algorithms developed in expert silos inside Adobe and make them usable by other teams that needed the intelligence. Back in 2015, when every major organization was working on setting up a machine learning pipeline, I worked on a prototype that was platform-agnostic, easily extensible and user-friendly for machine learning consumers.

  • Worked on deploying machine learning algorithms as services for teams inside Adobe.
  • Worked with data scientists to understand how the algorithms worked and which feature transformations they applied to the data.
  • Adapted learning and testing algorithms to generalize settings and parameters, to make them usable for other teams.
  • Created representations of feature transforms and built an ETL framework to consume data from users.
  • Designed and built a multi-cloud (AWS, Azure, in-house) service management framework on Azure and AWS.
  • Built data pipelines, configuration services and scheduling interfaces for setting up model training.

Marketing Mix Model solver for digital marketing spend optimization

As a fresh grad at Adobe, I helped take an analytical marketing mix model solver to production. I took the prototype, rewrote it using Pandas (yes, it previously used dicts and lists) and then improved and tested the SQL queries. Eventually, the product was taken to production. I then handed the project over to long-term maintenance teams and moved on to shipping ML as a service.

  • Optimized customer spend using a customized analytical marketing mix model solver.
  • Built large ETL pipelines and provided seasonality and multi-campaign support to customers.
  • Gained experience working on a highly agile, self-motivated research project.

Website design is stolen from Jon Barron's elegant website. The source code for this website is here.