We are seeking a skilled Machine Learning Engineer to design, develop, and deploy ML models for agentic AI use cases on AWS AI/ML services. The role involves building end-to-end ML pipelines, integrating LLMs and RAG systems, collaborating with data engineering and MLOps teams, and optimizing models for performance, scalability, and cost-efficiency.
Key Responsibilities
Design, develop, and deploy ML models for agentic AI use cases.
Work with the AWS AI/ML ecosystem (SageMaker, Bedrock, Lambda, Step Functions, S3, DynamoDB, Kinesis).
Preprocess and engineer features from structured, unstructured, and streaming data.
Collaborate with data engineers to ensure high-quality, well-curated training datasets.
Implement LLM fine-tuning, embeddings, and retrieval-augmented generation (RAG) pipelines.
Evaluate and optimize models for accuracy, performance, scalability, and cost-efficiency.
Integrate models into production applications and APIs.
Work with MLOps teams to automate training, testing, deployment, and monitoring workflows.
Perform experimentation, A/B testing, and model validation to ensure reliability.
Document experiments, pipelines, and best practices for reproducibility.
Required Skills & Qualifications
3–6 years of experience in ML engineering.
Strong programming skills in Python and its ML libraries (NumPy, pandas, scikit-learn, PyTorch, TensorFlow).
Solid understanding of the ML lifecycle (data preprocessing, training, evaluation, deployment).
Experience with AWS services for ML (SageMaker, Lambda, ECS/EKS, Step Functions, Bedrock).
Familiarity with large language models (LLMs), NLP, and embeddings.
Strong knowledge of APIs and microservice deployment.
Experience with ML pipeline orchestration and lifecycle tooling (Airflow, Kubeflow, MLflow, or similar).
Understanding of data versioning, experiment tracking, and model registries.
Proficiency in SQL/NoSQL databases and in vector databases or similarity-search libraries (Weaviate, Pinecone, FAISS).
Skills: Python, NumPy, Pandas, Scikit-learn, PyTorch, TensorFlow, AWS SageMaker, AWS Lambda, AWS Bedrock, AWS Step Functions, ECS, EKS, LLMs, NLP, Embeddings, RAG, APIs, Microservices, Airflow, Kubeflow, MLflow, Data Versioning, Experiment Tracking, Model Registry, SQL, NoSQL, Weaviate, Pinecone, FAISS