Automation Engineer
Job Description:
We are seeking a hands-on AI/ML Infrastructure Engineer to design, automate, and maintain robust build systems and deployment pipelines for AI/ML components. This role involves direct software development in C++ and Python while supporting infrastructure for model training and high-performance inference systems. The ideal candidate is a strong systems engineer with deep cross-platform experience, especially in Linux and IBM z/OS environments.
Key Responsibilities:
- Design and implement build automation systems for large, distributed AI/C++/Python codebases.
- Develop tools/scripts to support rapid development, testing, and deployment across diverse environments.
- Integrate C++ components with Python-based AI workflows using tools like pybind11 or Cython.
- Lead the creation of portable, reproducible dev environments, ensuring parity between dev and prod.
- Maintain and extend CI/CD pipelines (Jenkins, GitLab CI) for Linux and z/OS.
- Automate testing, artifact management, and release validation processes.
- Monitor and improve build performance and system reliability.
- Collaborate with AI researchers, architects, and mainframe engineers to align infrastructure with goals.
- Contribute to internal documentation, knowledge sharing, and process optimization.
Required Education:
- Bachelor's degree in Computer Science, Engineering, or a related field.
Required Technical and Professional Expertise:
- Strong programming in C++ and Python.
- Deep understanding of compiled vs. interpreted languages.
- Experience with CI/CD tools like Jenkins or GitLab CI.
- Proficient in build tools: CMake, Make, Meson, Ninja.
- Experience with Linux and IBM z/OS development.
- Integration of C++ with Python via pybind11, Cython, etc.
- Strong troubleshooting skills for build-time and runtime issues.
- Shell scripting (Bash, Zsh) and system-level operations.
- Familiarity with Docker or other containerization tools.
Preferred Technical and Professional Experience:
- Experience with AI/ML frameworks: PyTorch, TensorFlow, ONNX.
- Background in IBM z/OS mainframe software development.
- Understanding of z/OS packaging workflows.
- Knowledge of system performance tuning in high-throughput compute environments.
- Familiarity with GPU computing and low-level profiling/debugging tools.
- Experience with distributed systems, microservices, and REST APIs.
- Contributions to open-source projects (infra, DevOps, AI tooling).
- MLOps experience: integrating AI/ML models into production workflows through CI/CD pipelines.
- Understanding of security, compliance, and coding standards in AI engineering.
- Proven track record of high-quality code delivery, effective cross-team communication, and stakeholder collaboration.