We are looking for

Senior/Lead Python Engineer (ML Infrastructure), Remote, Full-time

Our client:

Our client is a technology-focused company building high-performance, real-time ML inference systems. The team develops ultra-low-latency engines that process billions of requests per day, integrating ML models with business-critical decision-making pipelines. They are looking for an experienced backend engineer to own and scale production-grade ML services with a strong focus on latency, reliability, and observability.

Your responsibilities:

  • Lead the design and development of low-latency ML inference services handling massive request volumes.
  • Build and scale real-time decision-making engines, integrating ML models with business logic under strict SLAs.
  • Collaborate closely with data scientists to deploy ML models seamlessly and reliably in production.
  • Design systems for model versioning, shadowing, and A/B testing at runtime.
  • Ensure high availability, scalability, and observability of production systems.
  • Continuously optimize latency, throughput, and cost-efficiency using modern tools and techniques.
  • Work independently while collaborating with cross-functional teams including Algo, Infrastructure, Product, Engineering, and Business stakeholders.

Required experience and qualifications:

  • B.Sc. or M.Sc. in Computer Science, Software Engineering, or related technical field.
  • 5+ years of experience building high-performance backend or ML inference systems.
  • Expert-level Python skills and experience with low-latency APIs and real-time serving frameworks (e.g., FastAPI, Triton Inference Server, TorchServe, BentoML).
  • Experience with scalable service architectures, message queues (Kafka, Pub/Sub), and asynchronous processing.
  • Strong understanding of model deployment, online/offline feature parity, and real-time monitoring.
  • Experience with cloud environments (AWS, GCP, OCI) and container orchestration (Kubernetes).
  • Familiarity with in-memory and NoSQL databases (Aerospike, Redis, Bigtable) for ultra-fast data access.
  • Experience with observability stacks (Prometheus, Grafana, OpenTelemetry) and alerting/diagnostics best practices.
  • Strong ownership mindset and ability to deliver solutions end-to-end.
  • Passion for performance, clean architecture, and impactful systems.

Would be a plus:

  • Prior experience leading high-throughput, low-latency ML systems in production.
  • Knowledge of real-time feature pipelines and streaming data platforms.
  • Familiarity with advanced monitoring and profiling techniques for ML services.

Working conditions

5-day week, 8-hour day, flexible working hours

Work from home, coffee included

Fully remote.
