OPEN TO WORK / STATISTICS / DATA SCIENCE / MACHINE LEARNING

Hello, I'm

Dong Bokun.

A statistics student building careful, research-driven data work.

I am currently pursuing an M.S. in Statistics at The Chinese University of Hong Kong, Shenzhen. My work sits between statistical reasoning, practical analysis, and machine learning workflows that aim to be rigorous, explainable, and useful.

Current Focus

Statistics, modeling, and machine learning

Graduate Study

M.S. in Statistics, CUHK-Shenzhen

Working Style

Calm, structured, and research-oriented

Research-minded, detail-sensitive, and still early in a long path.

My academic background started in mathematics and is now moving deeper into statistics and data science. I care about how quantitative work is structured, how evidence is communicated, and how models behave outside ideal settings.

I am especially motivated by projects that require both technical reasoning and calm judgment: cleaning imperfect data, selecting an appropriate method, and explaining the result with honesty rather than overclaiming.

Python (PyTorch / Hugging Face / NumPy / Pandas)
SQL data querying and cleaning
MATLAB modeling and simulation
SPSS statistical analysis
Research reading and academic writing
IELTS 6.5

Areas I want to keep studying, building, and explaining better.

Statistical Learning

I am interested in how statistical structure, regularization, and uncertainty estimation improve real-world decision systems.

Machine Learning Systems

I enjoy turning models into reliable workflows, from data cleaning and training loops to evaluation pipelines that are easy to reproduce.

Causal & Applied Inference

My academic attention leans toward interpretable methods that explain what changed, why it changed, and how confident we should be.

Selected work across statistics, machine learning, and quantitative modeling.

The filter is meant to show range, not decoration. It helps recruiters and faculty read the work through the lens they care about.

Machine Learning / Research Project

LLM Pretraining and Mathematical Fine-Tuning

Built a NanoGPT training workflow, processed a 100M-token corpus, and adapted a GPT-2 style model for mathematical problem solving.

Focused on training setup, data preparation, and task-specific adaptation.

PyTorch / Hugging Face / Data Pipeline

Data Analysis / Consulting Project

Conference Market Expansion Analytics

Designed a survey and analyzed 500+ valid responses, then translated segmentation and regression outputs into an actionable market strategy.

Produced a decision-oriented report combining descriptive, predictive, and business-facing analysis.

SPSS / Regression / Survey Design

Numerical Methods / Academic Project

RIS-Based Communication and Power Transfer Modeling

Developed beamforming logic and simulation workflows in MATLAB for long-distance communication and energy transfer scenarios.

Connected algorithm design with simulation validation and paper-ready documentation.

MATLAB / Optimization / Simulation

Statistics / Coursework + Practice

Regression and Visualization Practice Archive

Collected small applied studies around regression, statistical testing, and visual explanation for coursework and independent practice.

Strengthened my ability to explain statistical reasoning clearly instead of only presenting final numbers.

Inference / Visualization / Reporting

A compact view of education, research, competitions, and project growth.

2021

Education

B.S. in Mathematics and Applied Mathematics

Started undergraduate training with a focus on mathematical foundations and structured quantitative thinking.

2024

Research

First-Author Academic Papers

Contributed first-author papers to ICFTBA 2024 and CONF-MLA 2024.

2024

Competition

National Competition Achievement

Received first prize in a national communication and information processing competition.

2025

Projects

Applied Analytics and Simulation Projects

Expanded project work across market analytics, numerical modeling, and machine learning practice.

2025-2027

Education

M.S. in Statistics at CUHK-Shenzhen

Current graduate study centered on statistical learning, data visualization, and regression analysis.

A small visualization corner for how I think: probability, trend, and structure.

Normal distribution as a visual language

A simple curve, but a useful reminder: variance, density, and uncertainty are not abstract concepts when interpreting real data.
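
To make that reminder concrete, here is a minimal sketch of the normal density and the probability mass inside an interval. It assumes only Python's standard library. The function names `normal_pdf` and `normal_interval_mass` are illustrative choices for this example and are not taken from the visualization on this page.

```python
import math


def normal_pdf(x, mean=0.0, std=1.0):
    """Density of a normal distribution at x (illustrative helper)."""
    if std <= 0:
        raise ValueError("std must be positive")
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def normal_interval_mass(lower, upper, mean=0.0, std=1.0):
    """Probability that a normal variable lies in [lower, upper]."""
    if std <= 0:
        raise ValueError("std must be positive")

    def cdf(x):
        # Normal CDF written in terms of the error function.
        return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

    return cdf(upper) - cdf(lower)


if __name__ == "__main__":
    # A larger standard deviation flattens the peak of the curve.
    for std in (0.5, 1.0, 2.0):
        print(f"std={std}: peak density = {normal_pdf(0.0, std=std):.4f}")
    # Mass within one standard deviation of the mean.
    print(f"P(-1 <= X <= 1) = {normal_interval_mass(-1.0, 1.0):.4f}")
```

Doubling the standard deviation halves the peak height, which is one way to see variance as a statement about how spread out, and therefore how uncertain, the data are.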

Small records that keep the site personal, not just professional.

I keep a little space for reading, observation, and slow reflection.

Photography helps me record texture, calmness, and the scenes I tend to notice.

I prefer work that is complete first, and polished second.

If the work feels relevant, I am happy to continue the conversation.

I am especially open to research internships, analytics roles, and early-career opportunities that value careful quantitative thinking.