Knowtex · Healthcare IT

Applied ML Engineer

Python · PyTorch · TensorFlow · LLM · SF · Mid · Seed

About Knowtex

Knowtex is building a voice AI operating system for clinicians, transforming how healthcare documentation happens at the point of care. Founded by Stanford AI scientists with deep clinical experience, we are growing rapidly across both commercial health systems and federal healthcare, and our ambient documentation platform is scaling to thousands of clinicians across hundreds of specialties. We're at an inflection point where cutting-edge AI meets real clinical impact, giving clinicians hours back each day to focus on what matters most: their patients.

Position Overview

We are seeking an Applied ML Engineer to productionize and scale the machine learning systems powering our voice AI platform. This role bridges research and engineering, turning models into reliable, low-latency, production-grade systems deployed across enterprise healthcare environments.

You will work closely with ML Scientists, Backend Engineers, and Platform teams to optimize inference performance, build evaluation pipelines, and ensure robust model deployment in regulated environments.

Key Responsibilities

Productionize ML models for real-time clinical applications
Optimize inference pipelines for low latency and high throughput
Deploy and scale models using AWS-based infrastructure
Build automated evaluation and regression testing frameworks for LLM outputs
Implement monitoring systems for model performance and drift detection
Collaborate with Backend teams to integrate ML services into APIs and workflows
Improve model efficiency through quantization, batching, caching, and optimization techniques
Support specialty-level model evaluation and performance analysis
Contribute to CI/CD workflows for ML deployment

Required Qualifications

3–7+ years of experience in machine learning engineering or applied ML roles
Strong proficiency in Python and PyTorch (or TensorFlow)
Experience deploying ML models in production environments
Familiarity with transformer architectures and large language models
Experience with model optimization techniques (quantization, distillation, pruning)
Experience working with cloud infrastructure (AWS preferred)
Strong software engineering fundamentals and debugging skills

Preferred Qualifications

Experience with speech recognition systems or NLP pipelines
Experience with Triton Inference Server or similar deployment frameworks
Familiarity with healthcare data or clinical documentation workflows
Experience working in regulated environments (HIPAA, GovCloud, etc.)
Knowledge of medical coding systems (ICD-10, CPT)

Technical Environment

Python, PyTorch / TensorFlow
Transformer-based LLM architectures
AWS (SageMaker, ECS, Lambda, S3)
Triton Inference Server
CI/CD pipelines for ML deployment
Observability tools for performance and drift monitoring

Compensation & Benefits

Meaningful equity compensation
Unlimited PTO
Premium health, dental, and vision coverage
401(k) plan
