CV

Director of Engineering at Qualcomm. Researcher working on AI systems, inference optimization, and hardware-software co-design.

Contact Information

Name: Alex (Wei) Chen
Professional Title: Director of Engineering
Email: alexchen4ai@gmail.com

Professional Summary

Director of Engineering at Qualcomm (via acquisition of Nexa AI). Researcher and engineer focused on AI systems, LLM/VLM inference optimization, hardware-software co-design, and robotics. PhD from Stanford University (2024).

Experience

  • 2026 - present

    Santa Clara, CA

    Director of Engineering (Principal Engineer/Manager)
    Qualcomm
    • Joined Qualcomm following the acquisition of Nexa AI. Lead on-device AI optimization research and engineering focused on accelerating generative AI inference on Qualcomm Snapdragon NPU (Hexagon HTP) and edge SoCs.
    • Drive end-to-end optimization of LLMs and VLMs for Qualcomm Hexagon HTP NPU, encompassing hardware-aware quantization (INT4/INT8/FP16 mixed-precision), operator fusion, memory bandwidth scheduling, and kernel-level tuning.
    • Collaborate with silicon, compiler, and runtime teams to co-design neural network architectures and inference pipelines that fully exploit Hexagon NPU vector and tensor acceleration.
    • Manage a cross-functional team spanning model research, runtime optimization, and SDK development.
  • 2024 - 2026

    Palo Alto, CA

    Founder, CEO and Chief Scientist
    Nexa AI
    • Founded Nexa AI, specializing in efficient AI research and deployment. Built the NexaML engine and NexaSDK (7.6K GitHub stars) for generative AI model deployment on NPU, GPU, and CPU. Acquired by Qualcomm in 2026.
    • Principal architect and first author of AutoNeural, Octopus V1–V4, OmniVLM, OmniAudio, and NexaQuant. Octopus V2 accounts for ~2% of total Hugging Face downloads since 2022.
    • Solutions deliver ~55% faster throughput and ~28% better output quality vs. existing AI inference stacks. Trusted by Geely, HP, Lenovo, and İşbank.
    • Official partner of Qualcomm, AMD, NVIDIA, IBM, Google, Microsoft, Intel, Docker, HP, Lenovo, and Dell.
  • 2021 - 2023

    Palo Alto, CA

    Investment Scout
    Sequoia Capital
    • Sourced and evaluated early-stage startups in the Bay Area, with a focus on the Stanford ecosystem. Conducted market research and due diligence, and facilitated founder-partner connections.

Education

  • 2019 - 2024

    California, US

    PhD
    Stanford University
    Mechanics and Computation
    • Research: Large-scale numerical simulation frameworks combining HPC with ML-based surrogate models to accelerate PDE solvers.
  • 2015 - 2019

    Shanghai, China

    Bachelor
    Tongji University
    Engineering (minor in Mathematics)

Awards

  • 2025
    Best Paper Runner-Up Award
    IEEE ICDM

    For the paper "DP-FedLoRA: Privacy-Enhanced Federated Fine-Tuning for On-Device Large Language Models."

Skills

Programming Languages (Expert): Python, C/C++, Java, JavaScript
Machine Learning (Expert): LLM/VLM Pretraining & Post-training, Quantization, Model Optimization, On-device Inference, PyTorch, ONNX
Hardware Acceleration (Expert): Qualcomm Hexagon HTP, CUDA, Vulkan, OpenCL, Kernel Optimization for NPU/GPU/CPU
Systems (Proficient): Linux, Docker

Languages

English: Fluent
Chinese (Mandarin): Native