Ciro Zhang
Hi! I’m Ciro Zhang, a recent UCSD grad (B.S. Data Science & Computer Engineering) heading to Harvard for my S.M. in Data Science this fall. I build machine learning systems that tackle real-world problems — from generative AI for gene function prediction to bioluminescence forecasting on the California coast. I also have a published paper in computational pathology and have TAed 600+ students across ML and data science courses.
Education 0x2000


Experiences 0x3000
GenAI Researcher
Electrical & Computer Engineering · UC San Diego (Advisor: Dr. Pengtao Xie)
Developing GeneChat, a multimodal LLM that generates interpretable gene explanations from DNA sequences. The model uses a hybrid DNABERT + Vicuna-13B architecture trained on 100k+ gene sequences on A100 GPUs, and is benchmarked against GPT-4o, LLaMA 3, and Gemini on gene function prediction tasks.
ML Researcher
Scripps Institution of Oceanography · UC San Diego (Advisor: Dr. George Sugihara)
Developed EDM-LSTM, a hybrid model combining empirical dynamic modeling with LSTM for ocean bioluminescence forecasting. Trained on 1,000+ weeks of San Diego ocean data and achieved a 23% AUC improvement over standalone EDM and LSTM baselines.
Data Engineering Intern
BC Cancer Agency
Worked on bioinformatics data pipelines, writing Groovy automation scripts inside Nextflow to preprocess cellular and gene expression datasets. Built Python tooling to detect and fix formatting issues across large experimental files, and used QuPath pipelines to extract quantitative features from cell imaging data.
CV Researcher
Tea Labs · University of British Columbia (Advisor: Dr. Li Xiaoxiao)
Built CV pipelines for whole-slide imaging (WSI) pathology analysis using YOLO. Designed a multi-stage slide processing pipeline covering blur/edge filtering, patch extraction, and parallel GPU inference. Also developed semi-supervised dataset pipelines to scale model training despite limited expert annotations.
Research Intern
BiMBA · Peking University (Advisor: Dr. Ma Jingjing)
Built an automated pipeline to scrape and analyze trending keywords from Tencent platforms, studying factors that drive online charitable donations. Integrated LLM APIs to generate structured trend reports and automatically publish findings to Feishu Sheets.
Teaching 0x4000
Publications 0x4500
Histological subtype is associated with PD-L1 expression and CD8+ T-cell infiltrates in triple-negative breast carcinoma
Annals of Diagnostic Pathology · Salisbury T, Abozina A, Zhang C, Mao E, Banyi N, Leo J, Ionescu D, Zhou C, Wang G
Investigated the relationship between tumor histological subtypes and immune markers (PD-L1 expression, CD8+ T-cell infiltration) across 72 triple-negative breast carcinoma cases, identifying subtype-specific patterns with implications for immunotherapy response prediction.
Bigger Group Projects 0x5000

DSC10 Practice Platform
Python • Pandoc • BeautifulSoup • Educational Tool
Practice problem platform for UCSD's DSC 10 course, hosting past exams and discussion materials. Built LaTeX-to-Markdown conversion tools and a standardized problem-formatting system.

HAB Forecasting: Harmful Algal Bloom Prediction (WIP)
Python • EDM–LSTM • Time-Series Forecasting • Oceanographic Data
Developed and deployed a web platform with the HKN project team for forecasting bioluminescent harmful algal blooms (HABs) at UCSD Scripps, integrating our EDM–LSTM model with real-time coastal monitoring data.
Personal Projects 0x6000
Paper Reader: Document AI Pipeline
Python • YOLO • PyMuPDF • BERT
Built a pipeline to convert research PDFs into structured Markdown/JSON datasets. Applied DocLayNet (YOLO) for layout detection and PyMuPDF for text/figure extraction. Reduced noise using regex + BERT-based filtering.
Gmail Replier: LLM Auto-Reply Bot
Python • OpenAI API • Gmail API • OAuth
Python service that scans Gmail and drafts replies with OpenAI or Ollama, supporting dry-run and safe-prompt modes. Integrated the Gmail API with OAuth, custom labels, and robust error handling with exponential backoff.
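For readers unfamiliar with the retry strategy named above, here is a minimal sketch of exponential backoff with jitter. It is not the project's actual code: `with_backoff`, its parameters, and the `TransientError` class are hypothetical names introduced only for illustration.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable API failure (e.g. rate limiting)."""


def with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=32.0, sleep=time.sleep):
    """Run call(); on TransientError, retry with exponentially growing, jittered delays."""
    for attempt in range(max_attempts):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the error to the caller
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` argument is injectable so the retry schedule can be checked without actually waiting.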
Fish Game: HTML5 Jumping Game
JavaScript • HTML5 Canvas
Browser-based jumping game built with vanilla JavaScript and HTML5 Canvas. Features a custom physics engine, a camera system, and procedurally generated environments. Deployed on GitHub Pages for instant play.
Wiki-Graph-Explorer: Wikipedia BFS Game
Svelte • D3.js • Graph Algorithms
Interactive web app that visualizes paths between Wikipedia articles using BFS graph traversal. Built with Svelte and D3 for dynamic graph visualization. Includes a data-processing pipeline in Jupyter notebooks for parsing Wikipedia articles.
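The BFS idea behind the path search can be sketched in a few lines. This is an illustrative Python version under stated assumptions, not the app's Svelte code: `bfs_path` is a hypothetical name, and the link graph is assumed to be a plain dictionary mapping each article title to the titles it links to.

```python
from collections import deque


def bfs_path(links, start, goal):
    """Return a shortest list of article titles from start to goal, or None if unreachable."""
    if start == goal:
        return [start]
    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for nxt in links.get(page, ()):
            if nxt in parent:
                continue  # already reached by a path that is at least as short
            parent[nxt] = page
            if nxt == goal:
                # Walk parent pointers back to the start, then reverse.
                path = [nxt]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(nxt)
    return None


# Toy graph with made-up article names.
toy_links = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
print(bfs_path(toy_links, "A", "E"))  # ['A', 'B', 'D', 'E']
```

Because BFS explores articles in order of link distance, the first path that reaches the goal uses the fewest clicks.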
Slice: Android Bill Splitting App
Java • Android • Gradle
Android app for splitting bills among groups. Built in Java with Android Studio over 120+ commits. Features a user-friendly interface for tracking expenses and calculating individual shares.
OneTouch: Android Shortcut Widget
Java • Android • Widgets
Android widget app, written in Java, that provides quick shortcuts and one-tap actions. Customizable widgets give fast access to frequently used functions.
Custom 9-bit ISA CPU (NEW)
SystemVerilog • Python • Assembly
Built a fully custom CPU from scratch in SystemVerilog: 9-bit ISA, 8-bit data path, 8 registers, 14 opcodes. Designed the entire architecture under tight constraints, then wrote assembly programs that pushed its limits: a Hamming distance analyzer, an arithmetic range finder, and a 32-bit signed multiplier packed into dual 8-bit registers.
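As a reference for what a Hamming distance computation does, here is a small Python sketch for 8-bit values. It is not the assembly program written for this CPU; `hamming_distance` is a hypothetical helper, and masking to 8 bits is an assumption chosen to match the 8-bit data path.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two 8-bit values differ."""
    diff = (a ^ b) & 0xFF  # XOR leaves a 1 wherever the inputs disagree
    count = 0
    while diff:
        diff &= diff - 1  # clear the lowest set bit
        count += 1
    return count
```

The loop runs once per differing bit, which suits a small instruction set with no dedicated population-count instruction.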
YouTube Audio Downloader (NEW)
Python • Flask • PyInstaller • HTML
Desktop app for extracting and downloading audio from YouTube videos. Web-based interface where users paste a URL and get the audio file back. Packaged as a standalone executable with PyInstaller, so no Python install is required to run it.