Teddy Liu
Computer Science
UC Davis

Computer Science student with research experience in LLM evaluation, data analysis, and ML-based systems. NeurIPS 2025 co-author. Authorized to work in the US on OPT/CPT.

Davis, CA · Available 2026 · ML · NLP · CV

About

I work on the layer where research meets product, designing benchmarks, evaluation pipelines, and ML-powered tools that turn raw model capability into something useful and measurable.

Recent work spans a multi-LLM benchmarking platform at UC Davis, a NeurIPS 2025 benchmark on adversarial bias detection, and production AI dashboards built on the Claude, Gmail, and Drive APIs. I move between Python, TypeScript, and Swift, and care equally about model behavior and the seams between systems.

Background

Originally from Madagascar, I studied International Business in England, where I lived among diverse cultures and gained a deep appreciation for global markets.

I am now studying Computer Science in the United States, combining my passion for technology with the strategic perspective I gained from my business background.

Skills

Languages

Python · TypeScript · JavaScript · C++ · Java · SQL · HTML/CSS

Frameworks

React · Node.js · Next.js · LangChain · Flask · FastAPI

ML & Data

scikit-learn · NumPy · Pandas · OpenCV · YOLOv8

Infra

Git · Docker · Google Cloud · PostgreSQL · MongoDB · Prisma

Focus

LLM Evaluation · NLP · Computer Vision · Algorithm Design

APIs

Claude · Gmail · Google Drive · OAuth 2.0

Experience

Jun 2025 — Sep 2025

LLM AutoEval Benchmark — SWE Intern

Dept. of Computer Science, UC Davis

Built a multi-LLM benchmarking platform supporting 7 NLP metrics and custom datasets, enabling automated evaluation across multiple models.
Architected the automated evaluation pipeline that compares LLM outputs against ground-truth data to generate accuracy rankings and performance insights.
Implemented BLEU, ROUGE, METEOR, perplexity, semantic similarity, CIDEr, and SWD for comprehensive LLM assessment, significantly reducing the need for manual evaluation.

Apr 2025 — Aug 2025

E-search — Research Assistant (Computer Vision)

Dept. of Chemical Engineering, UC Davis · Python

Optimized vibration parameters by analyzing high-speed camera footage with Python computer vision to maximize object separation efficiency.
Developed a computer vision system processing high-speed camera footage to extract and analyze real-time waveform data from mechanical vibrations.

Apr 2025

RobustBiasBench — NeurIPS 2025 Co-author

Dept. of Computer Science, UC Davis

Co-authored a NeurIPS 2025 benchmark submission introducing RobustBiasBench, which evaluates the robustness of bias detection in LLMs under adversarial textual perturbations.
Curated and preprocessed an 18k-sample dataset through systematic collection, cleaning, filtering, and manual labeling.
Trained and tested a Support Vector Machine on the final dataset using TF-IDF vectorization.

Selected Projects

Semicolon

May 2026

Swift · TypeScript · Next.js · Python · MongoDB

iOS app turning the iPhone into a smart dashcam. Rolling buffer saves the last 60 seconds of dual-camera footage to MongoDB. Real-time perception pipeline streams frames to a FastAPI sidecar running YOLOv8, fused with ARKit LiDAR scene depth for sub-100 ms hazard scoring on close passes, doorings, and blocked bike lanes.

Company Pulse

Mar 2026

TypeScript · React · Node.js · PostgreSQL · Claude API · Gmail API

Deployed AI automation dashboard for a client company. Auto-generates daily business briefings from Google Drive and schedules AI-drafted client emails via Gmail. Human-in-the-loop outbox with one-click delivery. OAuth tokens secured with AES-256-GCM on a Prisma/Postgres backend.

FieldScout Copilot

Feb 2026

Python · Swift · On-device LLM

Offline iOS app converting field worker voice notes into structured agronomic observations via on-device LLM inference in under 90 seconds. Local rules engine generates time-bounded treatment recommendations from live weather features, with playbook patching and version-tracked audit trails.

PokeMe

Jan 2026

Python · Swift · Google App Engine

Sports app letting students find pickup buddies and meetups nearby. Full-stack on Google App Engine with a profile-based AI recommendation engine. 20+ active users.

The Moderator

Jun 2025

JavaScript · Python · LangChain

MultiAgent Diplomacy: an AI strategy game where LLM agents autonomously negotiate and compete via LangChain. Integrated Google Cloud TTS to synthesize natural-sounding speech for the agents' dialogue.

Pyrosphere

Apr 2025

TypeScript · Python · SQL

Real-time wildfire detection alerting California residents to active fire threats via live camera monitoring. ML model trained on 21,000+ images detects smoke and fire across 1,150 traffic cameras. Predictive risk model trained on 14,000 historical incidents.

Education

Sep 2024 — Jun 2026

University of California, Davis

B.S. Computer Science · Davis, CA

Coursework: Software Engineering, Artificial Intelligence, Computer Architecture, Algorithm Design & Analysis, Programming Languages, Theory of Computation, Operating Systems. Active in the Google Developer Student Club, ML lab research, and hackathons.

Jan 2022 — Jun 2024

De Anza College

A.S. Computer Science · Cupertino, CA

Coursework: Data Structures & Algorithms, Object-Oriented Analysis & Design, Linear Algebra, Discrete Mathematics. Director's Choice Award at De Anza Hacks 2.5 for an audio-reactive LED & haptic visualization device.

Let's work together

Open to internships and full-time jobs.

yqtliu@ucdavis.edu · 917-861-1487 · LinkedIn · GitHub
