This repository contains samples for AWS Neuron, the software development kit (SDK) that enables machine learning (ML) inference and training workloads on the AWS ML accelerator chips Inferentia and Trainium.
The samples in this repository illustrate the types of deep learning models that can be used with Trainium and Inferentia; they are not an exhaustive list of supported models. If you would like to contribute additional model samples, please submit a pull request following the repository's contribution guidelines.
Samples are organized by use case (training, inference) and deep learning framework (PyTorch, TensorFlow) below:
Training samples:

Framework | Description | Instance Type |
---|---|---|
PyTorch NeuronX (torch-neuronx) | Sample scripts for training various PyTorch models on AWS Trainium (a minimal training-loop sketch follows this table) | Trn1, Trn1n & Inf2 |
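The torch-neuronx training samples follow the standard PyTorch/XLA workflow, in which a NeuronCore is exposed as an XLA device. The following is a minimal sketch of that workflow rather than one of the samples; it assumes torch, torch-neuronx, and torch-xla are installed on a Trn1 or Inf2 instance, and the small model and random batches are placeholders.

```python
# Minimal torch-neuronx training-loop sketch (illustrative only, not one of the samples).
# Assumes torch, torch-neuronx, and torch-xla are installed on a Trn1/Inf2 instance.
import torch
import torch.nn as nn
import torch_xla.core.xla_model as xm

device = xm.xla_device()  # a NeuronCore exposed as an XLA device

# Placeholder model and optimizer; the repository's samples train full-size models.
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10)).to(device)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

for step in range(10):
    x = torch.randn(32, 784, device=device)          # random stand-in batch
    y = torch.randint(0, 10, (32,), device=device)

    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    xm.mark_step()  # triggers compilation/execution of the accumulated XLA graph on Neuron
```

The actual samples layer data loading, checkpointing, and multi-core launch on top of this core loop.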
Usage | Description | Instance Type |
---|---|---|
Nemo Megatron for Neuron | A library, adapted from Nemo Megatron, that enables large-scale distributed training of language models such as Llama. | Trn1, Trn1n |
AWS Neuron samples for ParallelCluster | How to use AWS ParallelCluster to build an HPC compute cluster with Trn1 compute nodes that runs your distributed ML training jobs. | Trn1, Trn1n |
AWS Neuron samples for EKS | Patterns for delivering inference and distributed training on Amazon EKS using Inferentia and Trainium. | Trn1, Trn1n |
AWS Neuron samples for SageMaker | SageMaker samples that use ml.trn1 instances for ML training workloads on the AWS Trainium accelerator (a launch sketch follows this table). | Trn1, Trn1n |
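As a rough illustration of the SageMaker training path listed above, the sketch below launches a job on an ml.trn1 instance with the SageMaker Python SDK. The entry-point script name and container image URI are placeholders, not values taken from the samples.

```python
# Hedged sketch: launching a Trainium training job with the SageMaker Python SDK.
# The entry-point script and image URI are placeholders, not values from the samples.
import sagemaker
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train_torch_neuronx.py",                 # hypothetical training script
    role=sagemaker.get_execution_role(),                   # requires a SageMaker execution role
    instance_count=1,
    instance_type="ml.trn1.32xlarge",                      # Trainium instance
    image_uri="<neuron-pytorch-training-container-uri>",   # placeholder for a Neuron deep learning container
)

estimator.fit()  # uploads the script and starts the training job
```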
Inference samples:

Framework | Description | Instance Type |
---|---|---|
PyTorch NeuronX (torch-neuronx) | Sample Jupyter notebooks demonstrating model compilation and inference for various PyTorch models on AWS Inferentia2 and Trainium (a minimal compile-and-run sketch follows this table) | Inf2 & Trn1 |
PyTorch NeuronX (transformers-neuronx) | Sample Jupyter notebooks demonstrating tensor-parallel inference for various PyTorch large language models (LLMs) on AWS Inferentia2 and Trainium | Inf2 & Trn1 |
PyTorch Neuron (torch-neuron) | Sample Jupyter notebooks demonstrating model compilation and inference for various PyTorch models on AWS Inferentia | Inf1 |
TensorFlow Neuron (tensorflow-neuron) | Sample Jupyter notebooks demonstrating model compilation and inference for various TensorFlow models on AWS Inferentia | Inf1 |
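The compile-and-run flow in the torch-neuronx notebooks above comes down to tracing a model with torch_neuronx.trace and calling the resulting TorchScript module. The sketch below is a minimal illustration under the assumption that torch, torch-neuronx, and torchvision are installed on an Inf2 or Trn1 instance; ResNet-50 is used here only as an example model. The torch-neuron samples for Inf1 follow the same pattern with the torch.neuron.trace API.

```python
# Minimal compile-and-run sketch for torch-neuronx inference (Inf2/Trn1), illustrative only.
# Assumes torch, torch-neuronx, and torchvision are installed; ResNet-50 is just an example.
import torch
import torch_neuronx
from torchvision import models

model = models.resnet50().eval()        # placeholder model, no pretrained weights needed here
example = torch.rand(1, 3, 224, 224)    # example input used for tracing

# Compile the model for NeuronCores; the result is a TorchScript module.
neuron_model = torch_neuronx.trace(model, example)
torch.jit.save(neuron_model, "resnet50_neuron.pt")

# Reload the compiled artifact and run inference on the Neuron device.
neuron_model = torch.jit.load("resnet50_neuron.pt")
output = neuron_model(example)
print(output.shape)
```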
Usage | Description | Instance Type |
---|---|---|
AWS Neuron samples for SageMaker | SageMaker samples that use ml.inf2 and ml.trn1 instances for ML inference workloads on the AWS Inferentia2 and Trainium accelerators (a deployment sketch follows this table). | Inf2 & Trn1 |
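For the SageMaker inference samples, deployment generally means packaging a compiled model and deploying it to an endpoint backed by an Inferentia2 or Trainium instance. The sketch below is a hedged outline; the model artifact path, entry-point handler, and image URI are placeholders, not values taken from the samples.

```python
# Hedged sketch: deploying a compiled model to an ml.inf2 endpoint with the SageMaker SDK.
# model_data, entry_point, and image_uri are placeholders, not values from the samples.
import sagemaker
from sagemaker.pytorch import PyTorchModel

model = PyTorchModel(
    model_data="s3://<bucket>/<path>/model.tar.gz",         # placeholder artifact containing the traced model
    role=sagemaker.get_execution_role(),
    entry_point="inference.py",                             # hypothetical inference handler
    image_uri="<neuron-pytorch-inference-container-uri>",   # placeholder Neuron deep learning container
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.inf2.xlarge",                         # Inferentia2 instance
)
```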
If you encounter issues with any of the samples in this repository, please open an issue via the GitHub Issues feature.
Please refer to the CONTRIBUTING document for details on contributing additional samples to this repository.
Please refer to the Change Log for release notes.
The following models currently fail to compile; the error reported is shown in the Status column:

Model | Framework | Training/Inference | Instance Type | Status |
---|---|---|---|---|
Fairseq | PyTorch | Inference | Inf1 | `RuntimeError: No operations were successfully partitioned and compiled to neuron for this model - aborting trace!` |
Yolof | PyTorch | Inference | Inf1 | `RuntimeError: No operations were successfully partitioned and compiled to neuron for this model - aborting trace!` |