stbv's blog

I am a Software Engineer at the non-profit Avanti Fellows, where I develop open source technology for public school systems alongside Pritam Sukumar. I also mentor contributors in the Code for GovTech program, supporting the creation of Digital Public Goods. I deeply aspire to understand how machine learning can help us address critical needs within public systems and to build responsible technology. Previously, I worked as a Research Associate at Adobe Research under the guidance of Balaji Vasan Srinivasan. Before Adobe, I studied Electrical Engineering and Machine Learning at IIT Kanpur, where my thesis advisor was Prof. Piyush Rai. Feel free to reach out to me at: teja.surya59@gmail.com

I recently shared detailed notes on my research taste. Do check this post and let me know if any of the ideas resonate with you.

To emphasize, I have a strong interest in designing models with the following characteristics:

1) formulated and deployed with care for local communities,
2) built with domain-dependent inductive biases,
3) small in scale, both in parameter count and training data size,
4) adequately specified while graciously leaving room for critique,
5) controllable and interpretable; likely through discrete latent variables,
6) inferred via (semi)differentiable approximate Bayesian procedures.

That's a lot! Briefly, I'd like to go against the tide of current ML trends and build adequately sized models with an unwavering commitment to responsible practices.

Outside of work, I enjoy movies and books, not only for the art but also for dissecting the craft. You can check out my Goodreads reviews here. I plan to write more on other books soon. As for movies, I keep my hot takes offline but am happy to share them if you're interested.

I am also keen on exploring cities and their hidden spaces. Based in Mumbai, I'm endlessly intrigued by its urban histories and politics. If I'm not at home, you might find me loitering in the gullies around Sion or Byculla.

Jul. 18, 2025

Review of The Deccan Sun | The News Minute

Zeenath Sajida, a forgotten Deccan icon, revisited through careful translation.

Literature Sociology

Jul. 5, 2025

Review of The Hyderabadis | Scroll

Displacement, broken geographies, and evolving identities in the city's history.

Cities Politics

Feb. 27, 2025

Mangli Kanduri?

A first attempt at tracing the origins of a Masjid's name.

Cities Politics

Nov. 15, 2024

Production World: Part 2

Touring The Trade-Offs and Joys of Avanti's Tech Deployments

Deployment NGO

Nov. 4, 2024

Notes on Research Taste

Rough notes on research that excites me. Written to help me discover my taste in research.

Notes ML Sociology

Oct. 17, 2024

Production World: Part 1

Journey of Avanti's Quiz Engine From Local Dev To Production.

Dec. 22, 2023

Start Python with Colab | Video

I made a quick intro on using Colab for basic data analysis. Beginners in Python may find this useful. Slides

Video Python

Dec. 9, 2021

Tutorial on Human Evaluation

Tutorial on a workflow involving AMT and GDrive that caters to the collection of human responses for 1000s of images/documents/videos or other media. Code

Tutorial Evaluation

Dec. 5, 2019

Ethics in Technology

Exploring skepticism around ethics in technology through the lens of Metcalf et al.'s Corporate Logics (2019).

Review Tech Ethics

Nov. 5, 2019

Constructing Prestige and Elaborating the Professional

Paper Review of Llerena Guiu Searle's ethnographic study in the National Capital Region of India (2007).

Review Sociology Urban

Apr. 15, 2019

Hierarchical Dirichlet Process

An attempt to intuitively explain HDP, and why it makes sense to use HDPs for joint mixture modeling of multiple related datasets.

Probability Modelling

Apr. 10, 2019

Black-Box Variational Inference

An intuitive explanation of Ranganath et al.'s Black-Box Variational Inference approach, and how it helps us get over tedious inference calculations.

Probability Modelling

Nov. 1, 2018

Are Mobile Phones changing Social Networks?

Paper Review of a longitudinal study of core networks in Kerala by Palackal et al. (2011)

Review Sociology Networks

Oct. 20, 2018

Social Media in South India

Book Review of Shriram Venkatraman's Social Media in South India.

Review Book Sociology New Media

Oct. 8, 2018

Influencers on Social Media

Why do they fit so well into the Post-Fordist narrative?

New Media Sociology

Aug. 27, 2018

Self-Surveillance and the Wayback Machine

Can you really implement self-surveillance on social media?

Sociology New Media

Aug. 20, 2018

Theorizing Affordances: From Request to Refuse

Applying Davis & Chouinard's Affordance Theory to Twitter's Thread Feature

Sociology New Media

Nov. 28, 2024

RAIL Fellowship Symposium 2024

A tech worker's perspective on Responsible AI practices. Slides

Slides Development NGO

Jul. 7, 2022

Intro to Tech at Avanti Fellows

A presentation on Avanti's tech at the Tech4Dev Sprint organized by Donald Lobo and Glific.

Slides Development NGO

Jul. 7, 2022

The Evolution of AI: From Rules to Prompts

Discussed the modelling gains we made over the past two decades. Some examples of prompting relevant to a broader NGO community.

Slides ML NGO

Dec. 16, 2021

Reflections on Foundation Models

Trends, properties, criticisms, and immediate uses of large-scale models.

Slides Modelling

Aug. 27, 2021

Post-OCR Error Detection and Correction | Video

Through the sociopolitical lens of Sivaji: The Boss (2007). Slides

Video Slides Altppt

Jul. 16, 2021

Intern General Quiz 2021

Quiz hosted by Vinay Aggarwal and me during Adobe Internship 2021.

Quiz Slides

Feb. 10, 2021

Reflections on OpenAI's CLIP Model

Comparing CLIP's training and properties with the previous class of supervised models.

Slides Modelling

Sep. 14, 2020

Intro to Collaborative Content Creation

Does collaboration aid in creativity? An initial view of the academic landscape.

Slides Creativity

Over the years, I've been part of diverse projects and I'm very grateful to all collaborators. I've divided them into a few themes for easy access. Although I'm not fully invested in all the themes right now, I'm excited to hear and discuss ideas. For a full list, check my Google Scholar page.

Theme 1: Probabilistic Modelling and Inference

PMI forms the foundation of my understanding in ML, anchoring my intuitions and expanding my approach to complex problems. My fascination with PMI began with the CS698X course at IIT Kanpur, taught by Prof. Piyush Rai, which was a pivotal experience. I've exhaustively documented my thoughts on PMI here.

(a) Undergraduate Thesis: My thesis, in collaboration with Shashank Gupta, explored bandits for sequential recommendations. We focused on two key ideas: the power of bandits in sequential interaction, which facilitates online learning, and the efficiency gained by clustering arms to streamline computations. Using a DP-means approach with stick-breaking priors, we bypassed the need for a fixed number of clusters, letting clusters emerge naturally. Report | Code
(b) Mixture of ARMA: This related project asks whether time series can be modeled as a mixture of ARMA processes, clustering them using the EM algorithm. Report | Code
(c) sslvae: Recently, I implemented SSLVAE, leveraging VAEs for label prediction in images with discrete latent variables modeled by a Concrete distribution. This approach draws on the Gumbel-Softmax trick (Jang, Gu, and Poole) and the Concrete distribution (Maddison, Mnih, and Teh). Differentiating through discrete latent variables is a pain, yet I'm thoroughly fascinated by VAEs and the interpretability offered via discrete LVs. Code
(d) Misc: Exploratory modelling in JAX and PyTorch: Link 1 | Link 2
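
For readers unfamiliar with the relaxation mentioned in (c), here is a minimal sketch of drawing one Gumbel-Softmax (Concrete) sample. It is not the code from the sslvae repository: the function names, logits, and temperatures are hypothetical, and it uses only the Python standard library, so it illustrates sampling only, not the gradient computation.

```python
import math
import random


def sample_gumbel(rng):
    """Draw Gumbel(0, 1) noise via inverse transform: -log(-log(U))."""
    u = rng.random()
    # Clamp away from 0 and 1 so both logarithms stay finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def gumbel_softmax_sample(logits, temperature, rng):
    """Return a relaxed one-hot vector for a categorical variable.

    Each logit is perturbed with Gumbel noise, divided by the temperature,
    and passed through a softmax. Lower temperatures give samples closer
    to one-hot; higher temperatures give smoother vectors.
    """
    perturbed = [(l + sample_gumbel(rng)) / temperature for l in logits]
    m = max(perturbed)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in perturbed]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical three-class example with class probabilities 0.7, 0.2, 0.1.
rng = random.Random(0)
logits = [math.log(0.7), math.log(0.2), math.log(0.1)]
soft = gumbel_softmax_sample(logits, temperature=1.0, rng=rng)
sharp = gumbel_softmax_sample(logits, temperature=0.05, rng=rng)
```

The argmax of the perturbed logits follows the original categorical distribution (the Gumbel-max trick), so the temperature only controls how close each relaxed sample is to a one-hot vector.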

Theme 2: Multimodal Combinations

Image and text generation tasks are highly pedagogical. They have been a great entry point for me into Deep Learning. However, I have mixed thoughts about the subfield in its current shape.

(a) SongTrain: Though not directly tied to image/text, SongTrain marked my first significant project working with data. We analyzed audio frequencies to track users' singing accuracy compared to an original track, providing real-time feedback and scoring. This project introduced me to the possibilities of multimodal data. Our work earned first place in Microsoft's code.fun.do! Video | The Hindu
(b) Image-Text Fragments: We developed a tool to generate custom image-text combinations from articles, adapting content and style to meet specific user needs. Work published at IUI 2020. Paper
(c) Temporal Fragments: Expanding on image-text combinations, what if we incorporate a temporal dimension? This allows us to add audio and video elements to the mix. Of course, it doesn't work for all documents! We focused on procedural texts. Work published in WACV 2023. Paper | Video

Theme 3: Systems for Creative Collaboration

Generating images and videos certainly has its functional applications, but I feel it doesn't yet fully capture the needs of a creative's toolbox. What truly excites me is the prospect of co-creation. Just as a chat interface provides an interactive flow in textual modality, I envision a similarly iterative, back-and-forth process in image and video modalities. This interaction doesn't have to be limited to a human-agent dynamic; it could very well involve human-human interactions, with an agent in the background, subtly aligning differences. You can read more about my perspectives on RL here.

(a) Codifying Conflicts in Co-Coloring: We looked at the process of creatives collaborating on a simple line-art coloring task, focusing on the conflicts and pain points that arise. Through user studies, we explored whether participants could effectively collaborate and impart their unique preferences in the final artwork. This project allowed me to work with SVGs, a flexible and often overlooked modality. Work published in CHI LBW 2022. Paper | Poster | Video
(b) CollabColor: Extending the previous task, we developed CollabColor, an RL-based support system that assists users in finding a shared creative vision during collaboration. Framed as a multi-agent MDP, this project encompassed user studies, behaviour cloning, supervised RL with transformers, extensive evaluations, and more. We found that CollabColor's interventions improved coherence in final colorizations. While the immediate impact is on artistic collaboration, can RL interventions broadly help reconcile local differences toward societal harmony? Perhaps a utopian overreach, but one worth pursuing. Paper Draft

Theme 4: Compressing Data and Models

My experience working on creative support systems highlighted a recurring issue: users struggle with large, slow models that aren't well-suited for production or mobile use. I've always felt that today's models are bloated, impacting UX unnecessarily. This led me to an interest in compression and efficiency. You can read more here and here.

(a) Post-OCR Error Detection: OCR outputs suffer from common document issues like poor angles, folds, and creases. For Adobe Scan, I developed a lightweight, mobile-friendly model that flags documents likely to yield OCR errors, enabling targeted post-processing. Code
(b) User-guided Variable-rate Learned Image Compression: Neural nets can compress images, but the process is lossy. I explored an approach where we ask users to mark their critical regions and compress those regions less aggressively. This enables differential bit allocation, allowing a single model to operate at multiple bit rates. Work published at a CVPR 2022 workshop. Paper

Theme 5: Tech for Public Service Systems

Alongside ML research, I've been deeply interested in the sociological, historical, and political landscapes of the communities I belong to. My previous research often catered to functional productivity and efficiency, and I wanted to explore public interest technology that primarily serves human-centered values. Low-resource contexts require participatory action and field research with care and dignity as a necessary ethic. I feel these constraints, whether geographical or economic, can foster uniquely creative solutions. You can read more here and here.

(a) Avanti's Open Source Tools: I played a major role in developing tools for India's public school systems, reaching over 250,000 students across 5,000 schools. These tools enable assessments, attendance tracking, resource access, reporting, and data analysis for thousands of students and teachers daily. Code | Video
(b) I lead a project applying LLM-based interventions to enhance student memory retention, emphasizing responsible, ethically-aligned AI practices in education. More on this soon!

I maintain multiple public repositories at Avanti Fellows, focusing on projects such as the Quiz Engine, Portal, Reporting Engine, Session Manager, and College Predictor. These repos have a variety of open issues, many of which are beginner-friendly. Please consider contributing! You can reach out to me on Avanti's Discord. My Discord ID is suryabulusu.
I've listed some personal projects outside of Avanti here, primarily related to programming and ML:

Nov. 5, 2024

Reading

(a) A History of the Bible, by John Barton. Two chapters in, it's already a masterpiece in elucidation. Barton is extremely compassionate.
(b) Recoding America, by Jennifer Pahlka.

Writing - Lots to write!

(a) An essay on Hyderabad. Expanding on my review of Prof. Afsar Mohammad's book.
(b) An essay on my interactions with Anki over the past five years. Did Anki improve my memory retention? Did Anki make me more creative?
(c) An essay on my experiences in the non-profit sector.

Work and Research

(a) I've finally updated my website once again after 5 years!
(b) Building an LLM-based intervention for memory retention in Avanti's Quiz Engine.