
Thom Man Hei Matthew

Data Science @ CityU HK

Aspiring Data Scientist / AI Engineer & AI/ML @ AS Watson Group


About

I'm a Data Science student at City University of Hong Kong with a strong focus on AI, machine learning, and building systems that scale. My work centers on LLMs and AI agents—from prototyping retrieval-augmented pipelines to deploying practical tools for data analysis and automation.

Growing up in Hong Kong has given me a blend of technical rigour and an international outlook. I'm comfortable working across Python, data engineering, and modern ML tooling, and I enjoy turning messy real-world data into clear insights and actionable solutions.

When I'm not coding or studying, I keep up with the latest in AI research and open-source projects. I'm always open to collaboration on data science or AI-related projects—feel free to reach out.

Professional

Education

  • City University of Hong Kong

    Bachelor of Science in Data Science

    Aug. 2024 – June 2028

    Hong Kong

Tech Stack

Technologies and tools I work with to build innovative solutions.

Experiences


Key Projects


Autonomous Research Agent (DeepSeek-powered)

Python, LangGraph, DeepSeek LLM, Tavily, Streamlit


Built an autonomous research agent with LangGraph and a DeepSeek LLM. It decomposes complex queries into sub-queries, runs parallel Tavily searches, summarizes the findings, self-reflects with a 0–10 confidence score, and generates fully cited markdown reports. It supports configurable search depth and iteration count, an interactive Streamlit UI, CLI testing, and a programmatic Python API for multi-hop research automation (e.g. academic surveys).
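The decompose, search, reflect loop described above can be sketched in plain Python. This is only an illustration of the control flow, not the project's code: `decompose`, `search`, and `reflect` are hypothetical stubs standing in for the LLM and Tavily calls, and the `min_confidence` and `max_iterations` parameters are assumed names for the configurable stopping rules.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class ResearchState:
    query: str
    findings: list = field(default_factory=list)
    confidence: int = 0   # self-assessed score on a 0-10 scale
    iterations: int = 0


def decompose(query):
    # Hypothetical stub: a real agent would ask the LLM for sub-queries.
    return [f"{query}: background", f"{query}: recent work"]


def search(sub_query):
    # Hypothetical stub: a real agent would call a web search API here.
    return f"summary of results for '{sub_query}'"


def reflect(findings):
    # Hypothetical stub: a real agent would ask the LLM to grade its own
    # findings. Here the score simply grows with the amount of evidence.
    return min(10, 3 * len(findings))


def run_agent(query, min_confidence=7, max_iterations=3):
    state = ResearchState(query=query)
    # Keep researching until the agent is confident enough or out of budget.
    while state.iterations < max_iterations and state.confidence < min_confidence:
        sub_queries = decompose(state.query)
        # Run the sub-query searches in parallel threads.
        with ThreadPoolExecutor() as pool:
            state.findings.extend(pool.map(search, sub_queries))
        state.confidence = reflect(state.findings)
        state.iterations += 1
    return state


if __name__ == "__main__":
    final = run_agent("LoRA fine-tuning for legal text")
    print(final.iterations, final.confidence, len(final.findings))
```

In a graph framework each stub would become a node, with the confidence check acting as the conditional edge that either loops back to search or moves on to report generation.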

Code

Legal Reasoning LLM (Llama-3 Fine-tune)

Python, PyTorch, LoRA, Hugging Face


Developed "Headnote LLM", a domain-specific 8B-parameter model, by fine-tuning Meta Llama-3.1 on 10k+ legal judgments. Achieved state-of-the-art performance on legal reasoning tasks while adding only ~168 MB of trainable LoRA adapter weights.
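For a sense of where an adapter size of that order can come from, here is a back-of-the-envelope calculation. The layer dimensions are those of the published Llama-3.1 8B architecture. The LoRA rank (16), the choice to adapt all seven linear projections, and 32-bit storage are assumptions made for illustration; the actual training configuration is not stated here.

```python
# Llama-3.1 8B architecture dimensions.
HIDDEN = 4096          # model width
INTERMEDIATE = 14336   # MLP width
KV_DIM = 1024          # 8 key/value heads x 128 dims (grouped-query attention)
NUM_LAYERS = 32

# (input_dim, output_dim) of each linear projection in one decoder layer.
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_trainable_params(rank):
    # LoRA adds two low-rank matrices per adapted projection:
    # A (rank x input_dim) and B (output_dim x rank).
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in PROJECTIONS.values())
    return per_layer * NUM_LAYERS


if __name__ == "__main__":
    rank = 16            # assumed, not taken from the project
    bytes_per_param = 4  # assumed fp32 storage
    params = lora_trainable_params(rank)
    print(f"{params:,} trainable parameters")
    print(f"{params * bytes_per_param / 1e6:.1f} MB")
```

Under these assumptions the adapters hold about 42 million parameters, roughly 0.5% of the 8B base model, which is why the trained artifact stays small enough to share separately from the base weights.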

Colab

Statistical Arbitrage Trading Platform

Python, Alpaca API, XGBoost, Kalman Filter


Engineered a modular algorithmic trading platform built around cointegration analysis (Johansen test) and Z-score signal detection. Implemented dynamic hedging with Kalman filters and signal generation with XGBoost, and integrated risk management (VaR) and real-time visualization.
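Z-score signal detection on a pair spread can be illustrated with a minimal sketch. It assumes the spread has already been computed (for example as one price minus a hedge ratio times the other) and uses a simple rolling window; the function name, window length, and entry/exit thresholds are illustrative choices, not the platform's actual settings.

```python
from statistics import mean, stdev


def zscore_signals(spread, window=5, entry=2.0, exit_=0.5):
    """Return one position per observation: +1 long the spread,
    -1 short the spread, 0 flat."""
    signals = []
    position = 0
    for t, value in enumerate(spread):
        if t < window:
            signals.append(0)  # not enough history yet
            continue
        history = spread[t - window:t]  # trailing window, excludes today
        sigma = stdev(history)
        if sigma == 0:
            signals.append(position)  # z-score undefined, hold position
            continue
        z = (value - mean(history)) / sigma
        if position == 0:
            if z > entry:
                position = -1  # spread unusually wide: short it
            elif z < -entry:
                position = 1   # spread unusually narrow: long it
        elif abs(z) < exit_:
            position = 0       # spread has reverted: close out
        signals.append(position)
    return signals


if __name__ == "__main__":
    spread = [0.0, 0.1, -0.1, 0.05, -0.05, 1.0, 0.9, 0.0, 0.0]
    print(zscore_signals(spread))
```

The mean-reversion bet only makes sense if the pair is cointegrated, which is the role of the Johansen test in the description above; a Kalman filter would replace the fixed hedge ratio with one that is updated over time.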

Code

Get in Touch

Have a question or want to collaborate? I'd love to hear from you.