Thom Man Hei Matthew

Data Science @ CityU HK

Aspiring Data Scientist / AI Engineer & AI/ML @ AS Watson Group

About

I'm a Data Science student at City University of Hong Kong with a strong focus on AI, machine learning, and building systems that scale. My work centers on LLMs and AI agents—from prototyping retrieval-augmented pipelines to deploying practical tools for data analysis and automation.

Growing up in Hong Kong has given me a blend of technical rigour and an international outlook. I'm comfortable working across Python, data engineering, and modern ML tooling, and I enjoy turning messy real-world data into clear insights and actionable solutions.

When I'm not coding or studying, I keep up with the latest in AI research and open-source projects. I'm always open to collaboration on data science or AI-related projects—feel free to reach out.

Professional

Education

City University of Hong Kong
Bachelor of Science in Data Science
Aug. 2024 – June 2028
Hong Kong

Tech Stack

Technologies and tools I work with to build innovative solutions.

Experiences

Key Projects

Autonomous Research Agent (DeepSeek-powered)

Python

LangGraph

DeepSeek LLM

Tavily

Streamlit

Built an autonomous research agent with LangGraph and the DeepSeek LLM. It decomposes complex queries into sub-queries, runs parallel Tavily searches, summarizes the findings, self-reflects with a 0–10 confidence score, and generates fully cited Markdown reports. It supports configurable search depth and iterations, an interactive Streamlit UI, CLI testing, and a programmatic Python API for multi-hop research automation such as academic surveys.

Code

Legal Reasoning LLM (Llama-3 Fine-tune)

Python

PyTorch

LoRA

Hugging Face

Developed "Headnote LLM", a domain-specific 8B-parameter model, by fine-tuning Meta Llama-3.1 on 10k+ legal judgments. Achieved state-of-the-art performance on legal reasoning tasks while adding only ∼168 MB of trainable LoRA adapter weights.
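
As a rough sanity check on the adapter size, here is a back-of-the-envelope sketch. The rank (16), the set of adapted projections (all attention and MLP linear layers), and float32 storage are assumptions for illustration and are not stated above; the layer shapes come from the public Llama-3.1-8B configuration. Under those assumptions the arithmetic lands near the quoted figure, though other configurations could produce a similar size.

```python
# Hypothetical LoRA setup: rank 16 on every attention and MLP projection,
# adapters stored in float32. These choices are assumptions, not confirmed.
RANK = 16
BYTES_PER_PARAM = 4   # float32
NUM_LAYERS = 32       # decoder layers in Llama-3.1-8B

# (in_features, out_features) of each linear projection in one decoder layer.
PROJECTIONS = {
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
    "gate_proj": (4096, 14336),
    "up_proj": (4096, 14336),
    "down_proj": (14336, 4096),
}


def lora_params(in_features, out_features, rank):
    # LoRA adds two small matrices, A (rank x in) and B (out x rank),
    # alongside each frozen weight matrix.
    return rank * (in_features + out_features)


def adapter_size(rank=RANK):
    per_layer = sum(lora_params(i, o, rank) for i, o in PROJECTIONS.values())
    total_params = per_layer * NUM_LAYERS
    megabytes = total_params * BYTES_PER_PARAM / 1e6
    return total_params, megabytes


params, megabytes = adapter_size()
print(f"{params:,} trainable parameters, about {megabytes:.0f} MB")
# prints "41,943,040 trainable parameters, about 168 MB"
```

That is roughly 0.5% of the 8B base parameters, which is why the adapter stays small compared with the full model.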

Colab

Statistical Arbitrage Trading Platform

Python

Alpaca API

XGBoost

Kalman Filter

Engineered a modular algorithmic trading platform built around cointegration analysis (Johansen test) and Z-score signal detection. Implemented dynamic hedging with Kalman filters, used XGBoost for signal generation, and integrated risk management (VaR) and real-time visualization.
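
The Z-score signal step can be illustrated with a minimal sketch. This is not the platform's code: the function names, the rolling window, and the entry/exit thresholds (2.0 and 0.5) are hypothetical, and the spread is taken as a given series rather than computed with a Kalman-filtered hedge ratio.

```python
from statistics import mean, stdev


def rolling_zscore(spread, window):
    """Z-score of each point against the previous `window` values.

    Returns None for the warm-up period where no full window exists.
    """
    scores = []
    for t in range(len(spread)):
        if t < window:
            scores.append(None)
            continue
        history = spread[t - window:t]
        sigma = stdev(history)
        scores.append((spread[t] - mean(history)) / sigma if sigma > 0 else 0.0)
    return scores


def signals(zscores, entry=2.0, exit_=0.5):
    """Map z-scores to a spread position: -1 short, +1 long, 0 flat."""
    position, out = 0, []
    for z in zscores:
        if z is None:          # still warming up: stay flat
            out.append(0)
            continue
        if position == 0:
            if z > entry:      # spread unusually wide: short it
                position = -1
            elif z < -entry:   # spread unusually narrow: go long
                position = 1
        elif abs(z) < exit_:   # spread has reverted: close the position
            position = 0
        out.append(position)
    return out


positions = signals([None, 0.0, 2.5, 1.0, 0.3, -2.5])
print(positions)  # prints [0, 0, -1, -1, 0, 1]
```

Separating entry and exit thresholds keeps the position open while the spread drifts back toward its mean, instead of flipping on every small move around a single cutoff.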

Code

Certifications

Credentials and courses I've completed.

Machine Learning Specialization

DeepLearning.AI

View credential

Deep Learning Specialization

DeepLearning.AI

View credential

Machine Learning in Production

DeepLearning.AI

View credential

Generative AI with Large Language Models

DeepLearning.AI

View credential

Agentic AI with LangChain & LangGraph

IBM

View credential

Certification of Service (Leadership)

City University of Hong Kong

View credential

Student Chapter of Department of Data Science (2024–2025)

City University of Hong Kong

View credential

Get in Touch

Have a question or want to collaborate? I'd love to hear from you.

Phone Email LinkedIn GitHub Hugging Face