Experience & Domains

Alignerr: Frontier LLM Capability Boundary Probing via Graduate-Level Physics & Risk-Engineering Benchmarks

Designed graduate-level, computation-heavy scientific reasoning benchmarks to probe the capability boundaries of a frontier large language model, prioritizing substance over trick wording
Built physics problems centered on long-form derivations, invariant-based reasoning, and tight constraint management to prevent shortcut solutions
Recombined rigorous physics foundations into mechanics-heavy prompts that demand both physical intuition and careful mathematical derivations
Converted risk-engineering research themes into quantitative prompts that require explicit assumptions, nontrivial modeling choices, and end-to-end reasoning
Raised difficulty by layering advanced techniques and edge conditions, pushing problems beyond standard textbook patterns
Representative example: designed a right-triangle kinematics puzzle with path-invariant vertical time, reducing optimization to horizontal scheduling; proved the fastest path, showed the slowest path has no maximizer but a supremum, and derived harmonic-number asymptotics for a staircase construction approaching the bound
Reused the same layered-difficulty design playbook across multiple topics and scenarios, balancing high-compute reasoning with creativity

Alignerr: End-to-End RLHF Preference Data Engineering with Scenario Design, Rubrics & A/B Ranking Signals

Owned end-to-end human preference data engineering for RLHF alignment, designing scenarios, datasets, and rubrics to rank multiple model variants via A/B preference tests
Crafted evaluation scenarios that probe where strong reasoning diverges from human engineering criteria, turning alignment goals into testable, domain-grounded tasks
Built realistic input datasets with natural messiness and confounders, avoiding contrived control-variable setups so model behavior reflects real decision contexts
Defined quantifiable rubrics and structured response requirements to convert subjective judgments into consistent, rankable preference signals
Authored reusable task templates plus data-generation rules so each scenario can be re-instantiated and rerun at high volume without synthetic-looking artifacts
Representative example: designed an ESG rating evaluation that surfaces old-company advantage bias by pairing comparable new vs. established firms and rewarding substantive performance over disclosure tenure; applied the same bias-probing approach across multiple domains and scenarios
Produced pairwise rankings with graded preference strength and clear comparative rationales that pinpointed failure modes and decision-critical trade-offs
Supported evaluator consistency through lightweight calibration on edge cases and evidence re-check steps when inputs or sources were inconsistent

Micro1: LLM Contextual Advertising Domain Leadership & A/B-Validated Decision Playbooks

Served as the domain expert for contextual advertising in LLM chat experiences, translating ambiguous ad-strategy tradeoffs into clear decision criteria
Used a consistent workflow of marketing research, structured debate, interviews with senior domain experts, and A/B validation to raise decision confidence and reduce low-value experiments
Applied the same decision framework across consumer categories including travel, food delivery, FMCG, and e-commerce, ensuring consistency rather than one-off judgments
Representative example in travel-intent ads: compared a shortlist display of about 20 hotels with a full-inventory display of more than 100 hotels for a booking brand; recommended full inventory to capture clicks and reduce competitive leakage; validated via A/B testing and documented the guideline for future use
Codified when scarcity marketing is likely to work by defining criteria such as decision-journey length, market-leader dynamics, and luxury or design-led categories, avoiding misapplication in long-journey decisions like travel planning
Narrowed experiment scope by replacing step-by-step settings with 2 representative endpoints, default versus scarcity, and validating the best default through A/B testing
Maintained evaluation rigor at scale through rubric-based scoring and written rationales, calibration and adjudication on borderline relevance and usefulness, and validity checks under landing-page variants, geo or session differences, redirects, and page changes; flagged rubric and workflow gaps as tasks evolved
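
Illustrative sketch of the A/B validation step, for readers who want to see the arithmetic: the source does not state which statistical test was used, so the two-proportion z-test below is an assumption, and the function name and the click and view counts are hypothetical.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return (z, two-sided p-value) for the click-through-rate difference B - A.

    Assumption: a pooled two-proportion z-test; the original work may have
    used a different test or an experimentation platform's built-in analysis.
    """
    p_a = clicks_a / views_a
    p_b = clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    variance = pooled * (1 - pooled) * (1 / views_a + 1 / views_b)
    if variance == 0:
        raise ValueError("pooled rate is 0 or 1; the test is undefined")
    z = (p_b - p_a) / math.sqrt(variance)
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: variant A = shortlist display, variant B = full inventory.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A positive z with a small p-value would favor variant B under these assumptions; the real decision would also weigh sample size planning and downstream booking metrics.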

Duke University: Sustainability Event Intelligence Labeling for Equity Research Backtests & AI Workflows

Designed a labeled dataset linking firm-level sustainability events to subsequent stock returns, preparing clean inputs for equity research backtests, systematic screening rules and future ML or AI workflows
Reviewed 10-K filings, sustainability reports, proxy statements and news flow, coding each event’s dimension, direction and estimated materiality, then merging labels with return and fundamentals data in Python

SAIF Partners (SoftBank Asia Infrastructure Funds): Thematic Investment Analytics in Dental with Policy Intelligence & Governance/Compliance Risk Gates

Consolidated 2000 funding cases and 200 regulatory items into a standardized dataset of more than 600 Greater China dental deals and 120 to 150 actionable regulatory events
The dataset became the backbone for systematic screening, thematic analysis, and follow-on financial modeling across the broader Greater China dental theme
Used the database to test whether orthodontics was large enough, structurally attractive, and manageable from a policy risk perspective, supporting a dedicated subsector focus
Owned 4 public data workstreams: funding flows, policy and regulation, industry structure, and competitive landscape across aligner brands, private chains, and dental digital and SaaS providers
Partnered with the VP and team to translate findings into a deduplicated longlist of more than 100 orthodontics-related targets across brands, services, equipment, and software
Ran a data pass mapping each target’s subsector tag, funding history, valuation range, and investor base, then assigned pass, watch, or exclude labels with concise risk notes
Reduced the universe to more than 10 candidates by screening against SAIF exclusion lists and conduct codes for regulatory, reputational, shareholder, and single-supplier red flags
Applied a governance-focused ESG gate anchored to MSCI ESG Key Issues, translating Corporate Governance and Corporate Behavior topics into 10 subcategories and 33 actionable checks
Drafted pass, conditional, and deny memos that guided diligence depth and valuation discussions, highlighting ownership and control complexity, related-party exposure, disclosure gaps, audit quality concerns, and business ethics risks
Presented governance and risk insights to partners via scenario-style memos, prompting 2 management meetings and additional auditor tenure checks before advancing finalists to on-site due diligence
Representative example: flagged accounting and related-party issues in 2 finalists; recommendations became management Q&A and auditor tenure checks, reshaping site-visit diligence; representative of similar work across categories and contexts
Unified inputs from about 20 sources including CVSource, industry news, and corporate registries; deduplicated rounds, standardized entities and subsector tags, and corrected errors in dates, rounds, and currencies

Jones Lang LaSalle (JLL): Greater China Capital Markets Underwriting with Structured Screening, Valuation & Diligence Readiness

Representative example: supported a cross-border China office mandate for a Southeast Asian conglomerate, with ESG embedded into underwriting
Screened 22 cities with JLL’s China office market framework and a PESTLE lens, narrowing to 5 core markets and a 20-asset universe
Compared cities on liquidity and institutional-grade stock across 4 investable regions identified by JLL research, then focused on Beijing, Shanghai, Guangzhou, Shenzhen, and Chengdu
Underwrote shortlisted assets using 3 valuation approaches: discounted cash flow, replacement cost, and comparable transactions
Built an ESG pre-screening gate inspired by the GRESB Real Estate Assessment, consolidating around 40 ESG and operating indicators into 1 template
Ran the desktop gate to remove about 40 percent of candidate assets before full valuation and due diligence, mainly due to ESG red flags or outsized future capital expenditure (CAPEX)
Structured screening into Management, Performance, and Development to link governance quality, environmental performance, and forward upgrade risk to underwriting
Management checks covered ESG policy coverage, ownership and SPV transparency, unresolved compliance or safety issues, and availability of 12 months of continuous records
Performance benchmarking covered energy and carbon intensity, water use, waste, green building certifications, and evidence of efficiency upgrades
Development screening assessed upgrade and CAPEX risk to meet tighter energy codes and carbon requirements, plus refurbishment or extension risks impacting cash flow and exit assumptions
Used peer benchmarking within the same city, submarket, vintage, and system type, then translated results into red, amber, green signals for go/no-go decisions
Where key utilities data were missing, used proxy indicators such as equipment age, chiller efficiency, building management systems, sub-metering, indoor air monitoring, and maintenance records
Built 10 Excel, Python, and Tableau dashboards combining valuation outputs, market fundamentals, and ESG signals at city, asset, and pipeline levels, including scenario-based risk exposure views
Led bilingual briefings for capital markets leadership and client managers; outputs directly set site-visit sequencing, targeted ESG questions, and governance and due diligence checklists
Produced standardized asset books and data packs per property, covering submarket positioning, tenant concentration, WALE, vacancy, rents, operating costs, models, and diligence notes in a consistent structure

CITIC Securities: Equity Research Analytics Enablement through Sector Databases, Valuation Stress-Testing & Trading-Color Synthesis

Supported equity and industry research at CITIC Securities by compiling sector datasets, stress-testing valuation assumptions and summarizing trading color for internal investment strategy discussions

Jiritsu Network: Automated Token-Issuer Screening Infrastructure with Scalable ETL, Ranking Models & Counterparty Shortlisting

Representative example: re-implemented GSIA negative screening and best-in-class selection as a "not for bad" then "go for good" workflow, moving from 18000 CoinGecko tokens to an 800-issuer universe and a 30-issuer shortlist for commercial, risk, and governance review
Engineered a Python ETL with Prefect, DuckDB, and Parquet to ingest, standardize, and refresh large-scale token data for issuer analysis
Built negative screens covering liquidity and pricing validity, data completeness, and basic governance and disclosure red flags to remove unworkable names early
Added business relevance filters to focus on RWA, stablecoin, serious DeFi, and infrastructure issuers, excluding meme, gambling, and NSFW categories
De-duplicated multi-chain and wrapped assets and mapped tokens to issuer-level entities to produce a partnership-ready counterparty list
Trained a LightGBM pairwise ranker for "go for good", using uncertainty sampling to concentrate labeling on the hardest boundary decisions
Used Leiden clustering to surface look-alike issuers and improve coverage and diversity beyond the top-ranked candidates
Led governance and disclosure engagements with shortlisted issuers, aligning on reporting cadence, reserve transparency, and inputs needed for on-chain verification
Structured 6 tokenization pilots with lower-risk counterparties, tying selection to clear disclosure commitments and verification readiness
Made the pipeline repeatable and monitorable with scheduled runs, retries on failure, and Parquet-based outputs for fast reruns and backfills
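
Illustrative sketch of the uncertainty-sampling idea behind the "go for good" ranker: given model scores per issuer, the pairs worth labeling first are those whose predicted preference is closest to a coin flip. This is a standard-library approximation, not the LightGBM pipeline itself; the logistic link on score differences, the function names, and the issuer scores are all assumptions made for illustration.

```python
import math
from itertools import combinations


def preference_probability(score_a, score_b):
    """P(a preferred over b), assuming a logistic link on the score difference.

    Uses the identity sigmoid(d) = 0.5 * (1 + tanh(d / 2)), which avoids
    overflow for large score gaps.
    """
    return 0.5 * (1.0 + math.tanh((score_a - score_b) / 2.0))


def most_uncertain_pairs(scores, k):
    """Return the k issuer pairs whose predicted preference is nearest 0.5."""
    ranked = []
    for a, b in combinations(sorted(scores), 2):
        p = preference_probability(scores[a], scores[b])
        ranked.append((abs(p - 0.5), a, b))
    ranked.sort()  # smallest distance from 0.5 first
    return [(a, b) for _, a, b in ranked[:k]]


# Hypothetical ranker scores for three issuers.
scores = {"issuer_a": 2.0, "issuer_b": 1.9, "issuer_c": -1.0}
print(most_uncertain_pairs(scores, 1))
```

Here the pair with nearly equal scores is surfaced first, which is the boundary case where a human label is expected to add the most information.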

Euromonitor International: Large-Scale E-Commerce Data Standardization via Automated SKU ETL & Market-Entry Scenario Modeling

Owned the China workstream for a global green home appliance market entry engagement for a European brand, translating energy-efficiency standards and subsidies into launch timing, pricing, and positioning scenarios for cross-regional strategy discussions
Built category- and city-tier scenarios for timing and price positioning, accounting for subsidy volatility, tightening efficiency standards, and consumer trade-up in mid to high price bands
Represented China in global stakeholder calls by aligning definitions of green products, calibrating expectations on policy rollout pace, and flagging approval, certification, and channel execution risks
Standardized 50000 e-commerce SKUs across 5 platforms and 7 major appliance categories through an automated Python ETL pipeline, enabling consistent comparison of efficiency labels, price points, promotions, feature sets, and basic review signals
Packaged the ETL logic into a reusable template for other APAC markets, doubling processing throughput and reducing duplicated manual data cleaning
Visualized competition and shelf density in Tableau to surface under-served combinations of high efficiency and mid to high price positioning, especially in tier 1 and tier 2 city demand
Representative example: separated national programs from local pilot subsidies and mapped transitions from old to new efficiency standards to identify categories with stable requirements versus those likely to tighten again; applied the same approach across multiple categories and market scenarios

China Construction Bank Asia: Quantitative Issuer Scoring System using PCA-Based Universe Construction & Methodology Governance

Applied principal component analysis to 10 sustainability indicators across 2000 issuers, constructing a ranked investment universe that highlighted the top 28 percent for follow-on credit and equity analysis
Specified scoring logic, data sources and edge cases in a concise methodology note, clarifying model behavior for product managers and risk officers evaluating the scoring platform’s adoption
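
Illustrative sketch of PCA-based universe construction: standardize the indicators, take the first principal component as a composite score, and keep the top share of issuers. The source does not describe the actual implementation, so everything below is an assumption: the power-iteration solver, the sign convention (loadings oriented to sum positive, treating higher indicator values as better), the rounding rule, and the toy data.

```python
import math


def standardize(rows):
    """Z-score each indicator column (assumes no column is constant)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [
        math.sqrt(sum((x - mu) ** 2 for x in c) / (n - 1))
        for c, mu in zip(cols, means)
    ]
    return [[(x - mu) / sd for x, mu, sd in zip(row, means, stds)] for row in rows]


def first_component(z, iterations=200):
    """Approximate the leading eigenvector of the covariance matrix.

    Power iteration from a uniform start; this assumes the start vector is
    not orthogonal to the leading component.
    """
    n, m = len(z), len(z[0])
    cov = [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(m)] for i in range(m)]
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    if sum(v) < 0:  # orientation is arbitrary in PCA; fix it by convention
        v = [-x for x in v]
    return v


def top_share(names, rows, share):
    """Rank issuers by first-component score and keep the top `share`."""
    z = standardize(rows)
    v = first_component(z)
    scores = {name: sum(a * b for a, b in zip(row, v)) for name, row in zip(names, z)}
    ranked = sorted(names, key=lambda name: scores[name], reverse=True)
    keep = max(1, math.ceil(share * len(ranked)))  # rounding up is an assumption
    return ranked[:keep]


# Toy data: 4 hypothetical issuers, 2 indicators each.
names = ["A", "B", "C", "D"]
rows = [[1, 1], [2, 3], [3, 2], [4, 5]]
print(top_share(names, rows, 0.28))
```

In practice a library routine such as an SVD-based PCA would replace the hand-rolled solver; the sketch only shows how a single component can turn several indicators into one rankable score.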

Financial Modeling and Investment

DCF, trading and transaction comparables, replacement-cost triangulation, cash-flow and scenario analysis, screening frameworks, investment-universe construction, Excel model auditing, committee-ready shortlists and memos, institutional-grade model review, assumption validation, and investment decision support

Mathematics

probability theory, mathematical statistics, stochastic processes, stochastic calculus, financial mathematics, linear algebra, real analysis, complex variables, ordinary and partial differential equations, dynamical systems, optimization

Data and AI

Python (pandas), SQL, R (basic); machine learning; ETL pipelines; web scraping; dataset design and labeling for quantitative and AI model evaluation, gold-standard reference construction, quality adjudication, and evaluation rubric design

Analytics and Tools

Excel (multi-sheet models, auditing, sensitivity analysis), Power BI, Tableau, Alteryx, data validation workflows, variance analysis, dashboard design, documentation of assumptions, checks and edge-case treatments

Communication and Teaching

bilingual presentations, methodology briefings, written step-by-step explanations of model logic, training materials for screening rules, dashboards and quantitative workflows

Languages

English (C1, fluent); Mandarin (native); Cantonese (basic spoken)

Experience & Domains

Artificial Intelligence

Finance

Data Science

Environmental, Social, and Governance

Core Capabilities