Principal Engineer - AI/ML
HPE
Job Description
What you will do:
- Experiment with, design, develop, and maintain generative AI models (e.g., LLMs, multimodal models), agentic AI architectures, and pipelines with high potential for value and scale.
- Collaborate with other GenAI engineers, data scientists, product managers, and other engineers to ensure successful implementation of generative AI solutions.
- Perform research and testing to develop or customize GenAI algorithms, prompt engineering strategies, and agentic workflows; conduct model training, fine-tuning, and evaluation as needed; integrate, test, tune and monitor the solutions developed.
- Research and evaluate new technologies and tools for generative AI, including Retrieval-Augmented Generation (RAG), prompt orchestration, agentic AI frameworks, model evaluation and safety tools.
- Maintain and update existing generative AI systems, including prompt libraries, agent workflows, and deployed models.
- Troubleshoot and debug GenAI systems, including issues related to model outputs, prompt reliability, hallucinations, and agentic behaviors.
- Work with cross-functional partners and stakeholders to identify opportunities for business impact; understand, refine, and prioritize requirements for GenAI models and solutions; drive engineering decisions; and quantify impact.
- Develop, productionize, and operate GenAI models, agents, and pipelines at scale, for both batch and real-time use cases.
- Work with large-scale structured and unstructured data to build and continuously improve cutting-edge GenAI models, including LLMs, vision-language models, and agentic systems.
- Provide technical guidance and mentorship to other team members and interns in GenAI best practices, prompting strategies, and agentic design.
- Identify areas of improvement in existing GenAI systems, including prompt engineering, agent reliability, and model deployment.
What you will need:
- B.E/B.Tech/M.Tech/M.E degree in Computer Science or equivalent
- 10-12 years of industry experience in applied AI/GenAI, designing and developing scalable enterprise-level solutions
- Experience in architecting and building large, highly scalable systems and software applications (e.g., well-designed APIs, high-volume data pipelines, efficient algorithms, agentic workflows, prompt orchestration)
- Strong programming (Python / C++ or equivalent) and data engineering skills
- Deep understanding of Generative AI best practices (e.g., prompt engineering, RAG, agentic design, model fine-tuning, optimization), LLMs, multimodal models, and deep learning basics.
- Experience with these technologies: OpenAI GPT, Llama, Hugging Face Transformers, LangChain, DeepSpeed, Ray, Kubernetes, Spark, Kafka (or equivalent).
- Industry experience building end-to-end GenAI infrastructure and/or building and productionizing Generative AI models, agents, and workflows
- Experience with MLOps/LLMOps practices and tools (e.g., MLflow, Weights & Biases, DVC, SageMaker, Vertex AI)
- Design, implement, and integrate the next generation of Generative AI infrastructure to empower other data scientists and AI engineers to build GenAI models and agents that make real-time decisions.
- Collaborate with other engineers and data scientists to create optimal experiences on the Core GenAI platform, including but not limited to prompt libraries, agentic orchestration, the real-time serving layer, and the offline training system.
- Strong collaboration and communication skills, both verbal and written
- Deep empathy for customer needs and insights, and an intuitive grasp of the business problems we’re trying to solve
Good to have:
- Experience with traditional machine learning and deep learning algorithms and architectures (e.g., RNNs, CNNs, Transformers, GANs)
- Knowledge of reinforcement learning, transfer learning, and meta-learning concepts
- Hands-on experience with TensorFlow, PyTorch, JAX, Keras
- Familiarity with data labeling platforms, ML model monitoring and evaluation tools
- Exposure to model safety, bias detection, explainability, and responsible AI practices
- Experience with cloud platforms (AWS, Azure, GCP) for scalable AI deployments
- Contributions to open source GenAI/ML projects or research publications