We are looking for

Senior/Lead Python Engineer (ML Infrastructure), Remote, Full-time

Our client:

Our client is a technology-focused company building high-performance, real-time ML inference systems. The team develops ultra-low-latency engines that process billions of requests per day, integrating ML models with business-critical decision-making pipelines. They are looking for an experienced backend engineer to own and scale production-grade ML services with a strong focus on latency, reliability, and observability.

Your responsibilities:

  • Lead the design and development of low-latency ML inference services handling massive request volumes.
  • Build and scale real-time decision-making engines, integrating ML models with business logic under strict SLAs.
  • Collaborate closely with data scientists to deploy ML models seamlessly and reliably in production.
  • Design systems for model versioning, shadowing, and A/B testing at runtime.
  • Ensure high availability, scalability, and observability of production systems.
  • Continuously optimize latency, throughput, and cost-efficiency using modern tools and techniques.
  • Work independently while collaborating with cross-functional teams including Algo, Infrastructure, Product, Engineering, and Business stakeholders.

Required experience and qualifications:

  • B.Sc. or M.Sc. in Computer Science, Software Engineering, or related technical field.
  • 5+ years of experience building high-performance backend or ML inference systems.
  • Expert-level Python skills and experience with low-latency APIs and real-time serving frameworks (e.g., FastAPI, Triton Inference Server, TorchServe, BentoML).
  • Experience with scalable service architectures, message queues (Kafka, Pub/Sub), and asynchronous processing.
  • Strong understanding of model deployment, online/offline feature parity, and real-time monitoring.
  • Experience with cloud environments (AWS, GCP, OCI) and container orchestration (Kubernetes).
  • Familiarity with in-memory and NoSQL databases (Aerospike, Redis, Bigtable) for ultra-fast data access.
  • Experience with observability stacks (Prometheus, Grafana, OpenTelemetry) and alerting/diagnostics best practices.
  • Strong ownership mindset and ability to deliver solutions end-to-end.
  • Passion for performance, clean architecture, and impactful systems.

Would be a plus:

  • Prior experience leading high-throughput, low-latency ML systems in production.
  • Knowledge of real-time feature pipelines and streaming data platforms.
  • Familiarity with advanced monitoring and profiling techniques for ML services.

Working conditions

5-day week, 8-hour day, flexible working hours

Work from home, coffee included

Fully remote.
