I’m Jiajun (Eddy) Huang — an engineer pursuing an M.S. in Artificial Intelligence at Northeastern Silicon Valley, after a B.S. in Computer Science from UC Davis. I build applied AI systems end-to-end, from data plumbing to model serving, and I believe good systems quietly shape how people experience the world.

Jiajun · 黄家骏
Eddy.
Lat 37.39° N
Bay Area, CA
Now pursuing an M.S. in AI at Northeastern Silicon Valley. Based in San Jose, in the Bay Area. Looking for a Fall 2026 SWE, ML, or AI internship. Focusing on agentic AI and retrieval-augmented generation systems. Currently exploring tool-use patterns inside long-running agent loops. Reach me at hi at jiajunh dot me. Stay hungry. Stay foolish.
A working vocabulary
Selected work
Built end-to-end. Shipped to real domains. Each is documented as a case study with the trade-offs I made and the things I’d do differently next time.
A walk through one project
Hybrid retrieval over enterprise docs — FAISS dense vectors fused with BM25, then re-ranked.
Semantic-aware splitter respects token budget while keeping paragraph boundaries intact. Coherent arguments stay together.
FAISS over MPNet embeddings. Captures paraphrased and conceptual intent; weak on literal-term lookups.
rank-bm25 over the same chunk set so the two retrievers stay aligned. Captures literal names, acronyms, IDs.
Reciprocal rank fusion at query time. It uses only each chunk's rank in the two result lists, not the raw scores, so no normalization is needed across two very different score distributions.
Cross-encoder over the top-k fused candidates. It scores each query-chunk pair in its own forward pass, so a small k keeps the cost bounded; this is where the final ordering is earned.
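As a rough illustration of the chunking and fusion steps above, here is a minimal Python sketch. It is not the project's actual code. The function names `split_paragraphs` and `reciprocal_rank_fusion` are hypothetical, tokens are approximated by whitespace-separated words, and `k=60` is the constant commonly used for reciprocal rank fusion rather than a value stated for this project. The dense (FAISS) and sparse (rank-bm25) retrievers are assumed to have already produced ranked lists of chunk IDs.

```python
from collections import defaultdict


def split_paragraphs(text: str, max_tokens: int) -> list[str]:
    """Greedily pack whole paragraphs into chunks of at most max_tokens.

    Assumption: tokens are approximated by whitespace-separated words; a
    real splitter would count with the embedding model's tokenizer.
    A single paragraph longer than the budget is kept whole, not cut.
    """
    chunks, current, used = [], [], 0
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        n = len(para.split())
        # Close the current chunk if this paragraph would exceed the budget.
        if current and used + n > max_tokens:
            chunks.append("\n\n".join(current))
            current, used = [], 0
        current.append(para)
        used += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists of chunk IDs by summing 1 / (k + rank), rank from 1."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


if __name__ == "__main__":
    print(split_paragraphs("a b c\n\nd e\n\nf g h i", max_tokens=5))
    dense = ["c3", "c1", "c2"]   # stand-in for a FAISS result list
    sparse = ["c1", "c4", "c3"]  # stand-in for a BM25 result list
    print(reciprocal_rank_fusion([dense, sparse]))
```

Only rank positions enter the fused score, so the two retrievers can report scores on entirely different scales. The fused list would then be truncated to the top k and handed to the re-ranker.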
How I think about the work
If you understand a system well enough, you can teach it to handle itself.
I grew up noticing how much of human work is patterns repeated by hand. That observation pointed somewhere specific — automation isn’t the goal, understanding is. The systems that survive aren’t the clever ones; they’re the ones whose authors knew exactly what they were doing, and why.
Where I’ve studied
2025 — 2027 · In progress
M.S. Artificial Intelligence
Northeastern University — Silicon Valley
2020 — 2024
B.S. Computer Science
University of California, Davis
If you’re hiring
I do my best work with short feedback loops and hard problems, on teams that treat latency budgets and eval rigor as first-class deliverables. I’d love to talk if that sounds like the team you’re building.