As the ML Ops Engineer, you'll be responsible for deploying, managing, and optimizing machine learning models in preproduction and production environments, on both classical VM-based workloads and containerized environments. The role calls for extensive experience with Dataiku and MLflow.

Key Responsibilities:

  • Develop and maintain ML pipelines using Dataiku and MLflow.
  • Apply DevOps experience to CI/CD deployment pipelines and processes, primarily on Azure DevOps; Jenkins experience is nice to have.
  • Operationalize compute workloads on classical virtual machines (specifically Red Hat Linux or Oracle Linux), standalone Docker, and Kubernetes environments (OpenShift experience preferred), and deploy and operate Windows-based workloads (IIS, Windows services…).
  • Monitor model performance metrics and implement strategies for continuous improvement and for avoiding model drift.
  • Collaborate with data scientists and engineers to ensure model scalability and reliability.
  • Work with observability platforms such as Prometheus or Grafana.
  • Implement best practices for version control, continuous integration, and continuous deployment (CI/CD) for ML models.
  • Optimize, deploy, and run local small LLMs on CPU-based environments, ensuring efficient inference and resource utilization.

Qualifications:

  • Bachelor’s degree in Computer Science, Engineering, or a related field.
  • 3+ years of experience in machine learning operations or a related role.
  • Experience with on-premises compute landscapes, especially VMware-based compute environments, and with local Saudi cloud platforms (e.g., Nournet, STC).
  • Dataiku certification is preferred.
  • Additional certifications in the administration of compute workloads, such as CKA, are a plus.

Skills:

  • Extensive experience with the Dataiku platform, especially MLOps Automation and API nodes, and experience with MLflow.
  • Knowledge of small LLMs and their operationalization and tuning.
  • Proficiency in Python, Docker, Kubernetes, and MLOps tools (e.g., MLflow).
  • Knowledge of ML frameworks (e.g. TensorFlow, PyTorch).
  • Strong problem-solving and troubleshooting skills.
  • Experience with .NET and C# is a plus.
  • Experience troubleshooting Cloudera environments and Spark workloads, as well as Hadoop, Hive, or Impala, is a plus.
  • OpenShift experience is a plus.

Salary

0 - 0 AED

Monthly based

Location

Alexandria, Alexandria, Egypt

Job Benefits

Remote work

Job Overview

Job Posted: 2 weeks ago
Job Expire: 2 weeks from now
Job Type: Full-time
Job Role: Individual Contributor
Education: Bachelor's Degree
Experience: 3-5 Years
Total Vacancies: 1
