Job Summary
NetApp is seeking a capable Staff Data & Applied Scientist to join the Data Services organization. The overarching vision of this organization is to empower organizations to effectively govern their data estate and build cyber-resiliency while accelerating their digital transformation journey. To realize this vision, we are taking an AI-first approach to building and delivering world-class data services. As a key technical leader in this initiative, the Staff Data & Applied Scientist will be responsible for architecting machine learning systems and building data pipelines and ML models across the data governance and compliance domains. While overseeing the technical work of 1-3 data and machine learning engineers, the ideal candidate will also possess deep subject matter expertise in modern AI/ML systems and a demonstrated history of shipping impactful products into production. This is a challenging and fun role in one of the most exciting areas of the industry today.
Job Requirements
Lead the design and implementation of ML systems for the data governance area, using techniques such as classical machine learning, generative AI models and AI agents.
Ensure scalability, reliability, and performance of AI models in production environments.
Oversee ML design reviews and create best practices and playbooks for end-to-end ML systems in production.
Collaborate with data engineers to develop scalable data pipelines for various AI/ML-driven solutions, including curated data pipelines, ML feature pipelines and deployment services.
Work with a great deal of autonomy and be the technical thought leader in the data governance product area, creating a forward-looking vision with clear direction.
Effectively communicate complex technical artifacts to both technical (engineers & scientists) and non-technical audiences.
Work closely with cross-functional teams, including business stakeholders, to innovate and unlock new use cases for our customers that are driven by data intelligence.
Participate in cross-functional meetings, workshops, and planning sessions to ensure data engineering activities support the overall objectives across data services and platform initiatives.
Coach and lead data scientists and the broader cross-functional team, helping to develop their skills and capabilities by fostering a culture of innovation and continuous learning.
Have a strong customer focus and build AI/ML products that delight our customers.
Represent NetApp as a leader and ambassador in the machine learning community, building relationships with external partners and promoting the company's product capabilities at industry and academic conferences.
Education and Qualifications
Master's or Bachelor's degree in Computer Science, Engineering, Applied Mathematics/Statistics/Data Science, or equivalent skills.
10+ years of experience as a data and machine learning engineer, with a track record of building data/feature pipelines and shipping successful products with AI/ML & NLP capabilities at scale. A recent focus on experimenting with LLMs and deploying them to production is a bonus.
Solid understanding of supervised and unsupervised machine learning algorithms and 5+ years of experience shipping them in production.
Strong proficiency in Python, modern ML frameworks (PyTorch, transformers) and cloud platforms.
Applied knowledge of MLOps practices, CI/CD pipelines and ML model lifecycle management.
Excellent communication and collaboration skills, with demonstrated ability to work effectively with cross-functional teams and stakeholders at all levels of the organization.
(Preferred) Good understanding of data governance, security policies and compliance frameworks.
(Preferred) Demonstrated curiosity about, and hands-on tinkering with, AI agents or multi-agent systems.
(Preferred) Publications or contributions to the AI/ML community related to NLP or data governance.
(Preferred) 2+ years of experience as the technical lead of a team of data and machine learning engineers.
(Preferred) Solid understanding of deep learning approaches in Natural Language Processing and Computer Vision domains.
(Preferred) An active GitHub profile showcasing relevant ML projects, or Kaggle achievements.