We are looking for

Senior/Lead Python Engineer (ML Infrastructure), Remote, Full-time

Our client:

Our client is a technology-focused company building high-performance, real-time ML inference systems. The team develops ultra-low-latency engines that process billions of requests per day, integrating ML models with business-critical decision-making pipelines. They are looking for an experienced backend engineer to own and scale production-grade ML services with a strong focus on latency, reliability, and observability.

Your responsibilities:

  • Lead the design and development of low-latency ML inference services handling massive request volumes.
  • Build and scale real-time decision-making engines, integrating ML models with business logic under strict SLAs.
  • Collaborate closely with data scientists to deploy ML models seamlessly and reliably in production.
  • Design systems for model versioning, shadowing, and A/B testing at runtime.
  • Ensure high availability, scalability, and observability of production systems.
  • Continuously optimize latency, throughput, and cost-efficiency using modern tools and techniques.
  • Work independently while collaborating with cross-functional teams including Algo, Infrastructure, Product, Engineering, and Business stakeholders.

Required experience and skills:

  • B.Sc. or M.Sc. in Computer Science, Software Engineering, or a related technical field.
  • 5+ years of experience building high-performance backend or ML inference systems.
  • Expert-level Python skills and experience with low-latency APIs and real-time serving frameworks (e.g., FastAPI, Triton Inference Server, TorchServe, BentoML).
  • Experience with scalable service architectures, message queues (Kafka, Pub/Sub), and asynchronous processing.
  • Strong understanding of model deployment, online/offline feature parity, and real-time monitoring.
  • Experience with cloud environments (AWS, GCP, OCI) and container orchestration (Kubernetes).
  • Familiarity with in-memory and NoSQL databases (Aerospike, Redis, Bigtable) for ultra-fast data access.
  • Experience with observability stacks (Prometheus, Grafana, OpenTelemetry) and alerting/diagnostics best practices.
  • Strong ownership mindset and ability to deliver solutions end-to-end.
  • Passion for performance, clean architecture, and impactful systems.

Nice to have:

  • Prior experience leading high-throughput, low-latency ML systems in production.
  • Knowledge of real-time feature pipelines and streaming data platforms.
  • Familiarity with advanced monitoring and profiling techniques for ML services.

Working conditions

5-day work week, 8-hour work day, flexible schedule

Remote work.
