Md Sifat Hossain

CS researcher & software engineer working on large language models, agentic AI systems, and autonomous software engineering.

Open to PhD positions for Fall 2027.

About

I am a Software Engineer at Therap BD Ltd and a recent Computer Science graduate from the University of Dhaka. My research lies at the intersection of large language models, autonomous software engineering, and agentic AI: building systems that can reason about, write, and refine code through structured multi-model feedback.

During my undergraduate years, I worked at the Data Mining Research Lab (DU) with Md. Fahim Arefin and Prof. Tarannum Shaila Zaman (UMBC), where I co-authored two papers: one studying how state-of-the-art LLMs solve ICPC-level competitive programming problems, and one showing how persistent multi-model feedback loops can make autonomous code generation substantially more reliable.

I am actively applying for PhD programs starting Fall 2027, where I hope to continue working on reliable, reasoning-capable AI systems for software engineering.

Research interests
  • Large Language Models for Code Generation
  • Agentic AI Systems & Multi-Model Feedback
  • Autonomous Software Engineering
  • Competitive Programming as an LLM Benchmark
  • AI Alignment & RLHF

Publications

* denotes equal contribution. See the publications page for full details and BibTeX.

  1. [1]
    Figure: A-ProS framework diagram. Problem acquisition and preprocessing feed a GPT-based solution generator, whose outputs are submitted to Codeforces; verdicts drive a multi-model debugging feedback loop (Codestral, Llama-3.3, DeepSeek-R1) that iteratively refines the solution.

    A-ProS: Towards Reliable Autonomous Programming Through Multi-Model Feedback

    Anika Tabassum*, Md Sifat Hossain*, Md. Fahim Arefin, Tariqul Islam, Tarannum Shaila Zaman

    ACM Transactions on Software Engineering and Methodology (TOSEM), 2026 · * Equal contribution

    TOSEM 2026 · arXiv · PDF · Project Page · Code
  2. [2]
    Figure: LLM-ProS pipeline. ICPC problem data collection, data preprocessing, model testing across GPT-4o / Mistral Large / Llama / OpenAI o1 models, and automated submission with verdict classification (Accepted, Compilation error, Time limit exceeded, Wrong answer).

    LLM-ProS: Analyzing Large Language Models’ Performance in Competitive Problem Solving

    Md Sifat Hossain, Anika Tabassum, Md. Fahim Arefin, Tarannum Shaila Zaman

    LLM4Code 2025 Workshop, ICSE 2025 - 47th International Conference on Software Engineering, Ottawa, Canada

    ICSE 2025 (LLM4Code) · arXiv · PDF · Code

Research Experience

Research Assistant · Data Mining Research Lab, University of Dhaka

Aug 2023 - Present

Dhaka, Bangladesh

Supervisor: Md. Fahim Arefin (DU CSE), in collaboration with Prof. Tarannum Shaila Zaman (UMBC, Information Systems).

  • Designed and implemented LLM-ProS, a novel evaluation framework for benchmarking LLM performance on ICPC World Finals problems. Curated a 166-problem dataset (2011-2024), built automated submission pipelines via Codeforces Gym, and analyzed five state-of-the-art models across correctness, resource utilization, and chain-of-thought reasoning. Published at ICSE 2025 (LLM4Code).
  • Extended this work into A-ProS, an autonomous multi-model agentic framework that separates solution generation from specialized debugging feedback under a 2×3 factorial design. Developed the full orchestration pipeline, Codeforces browser automation (Selenium + Playwright), verdict capture, and SQLite logging. Ran ablations on persistent vs. stateless context and trust calibration (ECE) across critic models. Accepted at ACM TOSEM 2026.
  • Completed undergraduate thesis “A Hybrid LLM Feedback Framework for Automated Competitive Programming Workflows,” proposing an iterative test-driven benchmarking pipeline integrating OpenAI o3-mini with specialist LLMs (DeepSeek, Qwen) for error diagnosis and code refinement via Codeforces-based validation.

RLHF Data Researcher & Pod Lead · Turing Enterprises Inc.

Jul 2025 - Dec 2025

Remote

  • Contributed to Reinforcement Learning from Human Feedback (RLHF) data creation supporting large-scale AI model alignment research, focusing on dataset quality, consistency, and annotation methodology for code and reasoning tasks.
  • Led a team of 10, establishing quality control protocols that ensured annotation reliability across diverse task domains.
  • Coordinated cross-functional team activities and maintained consistency standards critical to downstream model training, directly supporting AI alignment objectives.

Machine Learning Research Intern · Brainwave Matrix Solutions

Aug 2024 - Sep 2024

Remote, India

  • Developed a fraud detection model applying anomaly detection and supervised learning on imbalanced datasets, achieving 85% precision. Investigated model behavior under class imbalance and precision-recall trade-offs.
  • Automated model training and deployment pipelines using Docker and Jenkins, enabling reproducible ML experimentation and continuous integration of model updates.

Education

University of Dhaka

Jan 2020 - Feb 2025

Bachelor of Science in Computer Science and Engineering · CGPA 3.13 / 4.00

Undergraduate Thesis

A Hybrid LLM Feedback Framework for Automated Competitive Programming Workflows

Proposed a novel test-driven iterative benchmarking framework integrating multiple LLMs (OpenAI o3-mini, DeepSeek, Qwen) with Codeforces-based validation to evaluate and improve automated code generation and error correction in competitive programming contexts.

Relevant coursework
  • Data Structures & Algorithms
  • Object-Oriented Programming
  • Software Design Patterns
  • Artificial Intelligence
  • Machine Learning
  • Natural Language Processing
  • Theory of Computation
  • Compiler Design
  • Operating Systems
  • Database Management Systems
  • Software Engineering
  • Probability & Statistics

Industry Experience

Software Engineer · Therap BD Ltd

Apr 2025 - Present

Dhaka, Bangladesh

  • Develop and maintain scalable features for Therap’s EHR SaaS platform (used across all 50 US states) using Java, Spring, Hibernate, JSP, and Oracle DB, with a focus on correctness and reliability under HIPAA constraints.
  • Build and containerize full-stack modules using React.js and Docker, deployed on WebLogic Server, contributing to platform stability and consistent delivery across environments.

Software Engineer (Part-time) · Zeroxa DT

Mar 2023 - Jun 2024

Remote, London, UK

  • Built and deployed scalable web applications for 5+ clients using React.js and FastAPI, significantly reducing average page load times through targeted performance optimization.
  • Architected CI/CD pipelines with automated testing and deployment workflows on AWS (EC2, S3, RDS), accelerating release cycles while maintaining production code quality.

Leadership & Service

Vice President, ICT and Graphics · Notre Dame English Club, Notre Dame College, Dhaka

2018 - 2019

Dhaka, Bangladesh

  • Organized the 6th National English Carnival, a national-level academic competition with 10,000+ participants across 30+ events; previously coordinated the 5th edition (5,000+ participants).
  • Owned judge coordination and external communications, managing scheduling and briefing for judges drawn from industry, academia, and senior education across parallel event tracks.
  • Led sponsorship outreach, securing corporate partnerships that funded event operations and prize pools.

Projects

Research & data

A-ProS

Python · OpenAI · DeepSeek · Selenium · Playwright · SQLite

Reference implementation of A-ProS (TOSEM 2026) - an autonomous agentic framework separating solution generation (GPT-4 / GPT-5) from specialized debugging feedback (DeepSeek-R1, Llama-3.3, Codestral) under a 2×3 factorial design. Persistent multi-model feedback loops achieve 2.2-2.3× greater gains than stateless baselines on 367 ICPC and Codeforces problems.

Code on GitHub →

Hybrid Feedback Loop - LLM Benchmark Pipeline

Python · Selenium · BeautifulSoup · SQLite3

Data and orchestration pipeline underlying LLM-ProS (ICSE 2025) and the broader A-ProS benchmark - scraping 166 ICPC World Finals problems, normalizing LaTeX/HTML, structuring statements / I/O specs / constraints / samples, automating Codeforces Gym submissions, and capturing per-attempt verdict + runtime + memory in SQLite. Extended to Codeforces, forming the 367-problem benchmark used in A-ProS.

Code on GitHub →

TikTok Scraper

Python · Selenium · Requests · BeautifulSoup · SQLite3

Scraper extracting video descriptions and author metadata for specified keywords and tags, enabling structured analysis across 5,000+ videos. Selenium for dynamic rendering, BeautifulSoup for parsing, SQLite3 for storage - the same scraping architecture later applied in the LLM benchmark pipeline.

Code on GitHub →

Software

Smart Event Ticketing System

Java · Spring MVC · Hibernate/JPA · PostgreSQL · JSP

Multi-role event management platform that uses pessimistic locking to keep bookings transactionally consistent and prevent conflicts between concurrent booking requests. Real-time event filtering with asynchronous data retrieval for responsive search.

JobGenie

React.js · FastAPI · MongoDB

Job search platform with automated CV generation and personalized job matching, integrating live job scraping to fetch and rank relevant listings.

Code on GitHub →

OyeAmigo

Kotlin · Android SDK

Personality-based social networking Android app with null-safe Kotlin architecture, reducing crash rates and improving runtime stability.

Code on GitHub →

Achievements

Honors
  • Zelf Hackathon 2.0 - Honorable Mention - Scraping Engineer track.
Competitive programming
  • Codeforces - Max rating 1603 (Expert) · 1000+ problems solved.
  • CodeChef - 4★ · max rating 1921.
  • AtCoder - Handle sifat_sif · regular contest participant.
  • LeetCode - Handle sifat_sif.
  • ICPC Dhaka Regional Onsite - Top-35 of 220+ teams (2023) · Top-49 of 309 teams (2024).
  • BUET Inter-University Programming Contest 2023 - 5th of 102 teams.
  • Samsung R&D BD Coding Contest 2024 - Final Round qualifier · 55th of 908 in Round 1.
  • Meta Hacker Cup 2024 - Round 2 qualifier · global rank 2,166.
  • NCPC 2023 - 44th of 198 teams.

Technical Skills

Languages
  • Python
  • Java
  • C++
  • JavaScript
  • TypeScript
  • Kotlin
Frameworks & Libraries
  • Spring Boot
  • Hibernate
  • FastAPI
  • React.js
  • Next.js
  • JSP
  • Selenium
  • Playwright
Databases
  • PostgreSQL
  • Oracle DB
  • MongoDB
  • SQLite
Research & ML
  • PyTorch
  • HuggingFace Transformers
  • OpenAI API
  • Pandas
  • NumPy
  • Jupyter
  • LaTeX
Tools
  • Git
  • Linux
  • Bash
  • Docker
  • Jenkins
  • Nginx
  • JUnit5

Contact

The fastest way to reach me is by email at sifatb910@gmail.com. I am currently applying for PhD positions for Fall 2027; I would be happy to discuss research fit or share my CV, thesis, and code on request.