Software Development Manager - ML Compiler, AWS Neuron, Annapurna Labs

Amazon Web Services (AWS)


Date: 12 hours ago
City: Toronto, ON
Contract type: Full time
DESCRIPTION

The Product: AWS Machine Learning accelerators are at the forefront of AWS innovation. The Inferentia chip delivers best-in-class ML inference performance at the lowest cost in the cloud. Trainium will deliver best-in-class ML training performance with the most teraflops (TFLOPS) of compute power for ML in the cloud. This is all enabled by a cutting-edge software stack, the AWS Neuron Software Development Kit (SDK), which includes an ML compiler and runtime and integrates natively with popular ML frameworks such as PyTorch, TensorFlow, and MXNet. The Neuron SDK optimizes the performance of complex neural network models executed on AWS Inferentia and Trainium. AWS Neuron is used at scale by customers and partners such as PyTorch, Epic Games, Snap, Airbnb, Autodesk, Amazon Alexa, and Amazon Rekognition, as well as customers in various other segments.

The Team: The Amazon Annapurna Labs team is responsible for building innovation in silicon and software for AWS customers. We are at the forefront of innovation by combining cloud scale with the world’s most talented engineers. Our team covers multiple disciplines, including silicon engineering, hardware design and verification, software, and operations. With such breadth of talent, there's always an opportunity to learn. We operate in spaces that are very large, yet our teams remain small and agile. There is no blueprint. We're inventing. We're experimenting. When you couple that with the ability to work on so many different products and services, it's a unique learning culture.

Learn More About Our History:

https://www.amazon.science/how-silicon-innovation-became-the-secret-sauce-behind-awss-success

You: As a Manager III on the AWS Neuron team, you'll lead a team of compiler engineers in developing, deploying, and scaling a compiler targeting AWS Inferentia and Trainium. You'll need to be technically capable, credible, and curious in your own right as a trusted AWS Neuron Manager, innovating on behalf of our customers. You’ll leverage your vision and technical communication skills as a hands-on partner to AWS ML services teams: you'll be involved in pre-silicon design, bring new products, optimizations, and features to market, and take on many other exciting projects to ensure the Neuron SDK exceeds our customers' needs for high performance, low cost, and ease of use.

You will have deep knowledge of resource management, scheduling, code generation, optimization, and new instruction architectures including CPU, NPU, GPU and novel forms of compute.

AWS Utility Computing (UC) provides product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services. Additionally, this role may involve exposure to and experience with Amazon's growing suite of generative AI services and other cutting-edge cloud computing offerings across the AWS portfolio.

Annapurna Labs (our organization within AWS UC) designs silicon and software that accelerates innovation. Customers choose us to create cloud solutions that solve challenges that were unimaginable a short time ago—even yesterday. Our custom chips, accelerators, and software stacks enable us to take on technical challenges that have never been seen before, and deliver results that help our customers change the world.

Explore The Product:

https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/neuron-cc/index.html

https://github.com/aws/aws-neuron-sdk

https://aws.amazon.com/machine-learning/neuron/

In order to be considered for this role, candidates must be currently located or willing to relocate to Toronto.

About The Team

Our team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects that help you develop your engineering expertise so you feel empowered to take on more complex tasks in the future.

Diverse Experiences

AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage you to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.

About AWS

Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.

Inclusive Team Culture

Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.

Work/Life Balance

We value work-life harmony. Achieving success at work should never require sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.

Mentorship & Career Growth

We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.

BASIC QUALIFICATIONS

  • 3+ years of engineering team management experience
  • 6+ years of experience working directly within engineering teams
  • 4+ years of experience designing or architecting (design patterns, reliability, and scaling) new and existing systems
  • Experience partnering with product or program management teams
  • Excellent software design fundamentals, knowledge of software engineering principles, and a deep understanding of compilers (resource management, instruction scheduling, code generation, and compute graph optimization)

PREFERRED QUALIFICATIONS

  • M.S. or Ph.D. in Computer Science or related technical field
  • Experience with toolchains (LLVM, GCC) and code generation techniques for new hardware
  • Knowledge of compiler internals from front end to run-time environment with emphasis on AI acceleration

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, disability, age, or other legally protected status. If you would like to request an accommodation, please notify your Recruiter.


Company - Amazon Development Centre Canada ULC

Job ID: A2685808
