Releases: intel/neural-compressor

Intel® Neural Compressor v1.13.1 Release

13 Aug 11:52

Features

  • Support experimental auto-coding quantization for PyTorch

    • Post-training static and dynamic quantization for PyTorch
    • Post-training static quantization for IPEX
    • Mixed-precision (BF16, INT8, and FP32) for PyTorch
  • Refactor quantization utilities for ONNX Runtime
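As a rough illustration of how post-training static and dynamic quantization differ, the sketch below uses plain Python rather than the Intel® Neural Compressor API. The helper names (`affine_params`, `quantize`, `quantize_dynamic`) and the calibration data are hypothetical. Static quantization fixes the value range ahead of time from calibration data, while dynamic quantization derives it from each input at inference time.

```python
from typing import List, Tuple


def affine_params(lo: float, hi: float) -> Tuple[float, int]:
    """Return (scale, zero_point) mapping [lo, hi] onto unsigned 8-bit [0, 255]."""
    lo, hi = min(lo, 0.0), max(hi, 0.0)   # the range must contain zero
    scale = (hi - lo) / 255.0 or 1.0      # avoid a zero scale for constant tensors
    zero_point = round(-lo / scale)
    return scale, zero_point


def quantize(xs: List[float], scale: float, zero_point: int) -> List[int]:
    """Round to the nearest 8-bit code and saturate at the range limits."""
    return [max(0, min(255, round(x / scale) + zero_point)) for x in xs]


def dequantize(qs: List[int], scale: float, zero_point: int) -> List[float]:
    return [(q - zero_point) * scale for q in qs]


# Static: the range is fixed once, from (hypothetical) calibration batches.
calibration = [[-1.0, 0.5, 2.0], [-0.5, 1.5, 3.0]]
static_scale, static_zp = affine_params(
    min(min(batch) for batch in calibration),
    max(max(batch) for batch in calibration),
)


# Dynamic: the range is recomputed from every input at inference time.
def quantize_dynamic(xs: List[float]) -> Tuple[List[int], float, int]:
    scale, zero_point = affine_params(min(xs), max(xs))
    return quantize(xs, scale, zero_point), scale, zero_point
```

The static path pays the calibration cost once and has no range computation at run time. The dynamic path adapts to each input at the price of that extra computation.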

Bug Fixes

  • Fixed model compression orchestration issue caused by PyTorch v1.11
  • Fixed GUI issues

Validated Configurations

  • Python 3.8
  • CentOS 8.4
  • TensorFlow 2.9
  • Intel TensorFlow 2.9
  • PyTorch 1.12.0+cpu
  • IPEX 1.12.0
  • MXNet 1.7.0
  • ONNX Runtime 1.11.0

Intel® Neural Compressor v1.13 Release

27 Jul 09:04
98d829a

Features

  • Quantization

    • Support new quantization APIs for Intel TensorFlow
    • Support FakeQuant (QDQ) quantization format for ITEX
    • Improve INT8 quantization recipes for ONNX Runtime
  • Mixed Precision

    • Enhance mixed precision interface to support BF16 (FP16) mixed with FP32
  • Neural Architecture Search

    • Support SuperNet-based neural architecture search (DyNAS)
  • Sparsity

    • Support training for block-wise structured sparsity
  • Strategy

    • Support operator-type based tuning strategy
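To make "block-wise structured sparsity" concrete, here is a minimal sketch of the masking step only, in plain Python. It is not the library's training recipe. The function name `prune_blocks`, the L1-norm scoring, and the square block shape are assumptions chosen for illustration.

```python
from typing import List

Matrix = List[List[float]]


def prune_blocks(weights: Matrix, block: int, sparsity: float) -> Matrix:
    """Zero the lowest-L1-norm (block x block) tiles until `sparsity` of tiles are zero."""
    rows, cols = len(weights), len(weights[0])
    if rows % block or cols % block:
        raise ValueError("matrix dimensions must be multiples of the block size")

    # Score every tile by the sum of absolute values it contains.
    scores = []
    for br in range(0, rows, block):
        for bc in range(0, cols, block):
            norm = sum(
                abs(weights[r][c])
                for r in range(br, br + block)
                for c in range(bc, bc + block)
            )
            scores.append((norm, br, bc))

    scores.sort()
    n_prune = int(len(scores) * sparsity)

    pruned = [row[:] for row in weights]  # leave the input untouched
    for _, br, bc in scores[:n_prune]:
        for r in range(br, br + block):
            for c in range(bc, bc + block):
                pruned[r][c] = 0.0
    return pruned
```

Because whole tiles are removed together, the zeros form a regular pattern that a sparse kernel can skip, which is the usual motivation for structured over unstructured sparsity.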

Productivity

  • Support light (default) and full binary packages (default package size 0.5MB, full package size 2MB)
  • Add experimental accuracy diagnostic feature for INT8 quantization including tensor statistics visualization and fine-grained precision setting
  • Add experimental one-click BF16/INT8 low precision enabling & inference optimization, first-ever code-free solution in industry

Ecosystem

  • Upstream 4 more quantized models (emotion_ferplus, ultraface, arcface, bidaf) to ONNX Model Zoo
  • Upstream 10 quantized Transformers-based models to HuggingFace Model Hub

Examples

  • Add notebooks for Quantization on Intel DevCloud, Distillation/Sparsity/Quantization for BERT-Mini SST-2, and Neural Architecture Search (DyNAS)
  • Add more quantization examples from TensorFlow Model Zoo

Validated Configurations

  • Python 3.8, 3.9, 3.10
  • CentOS 8.3 & Ubuntu 18.04 & Win10
  • TensorFlow 2.7, 2.8, 2.9
  • Intel TensorFlow 2.7, 2.8, 2.9
  • PyTorch 1.10.0+cpu, 1.11.0+cpu, 1.12.0+cpu
  • IPEX 1.10.0, 1.11.0, 1.12.0
  • MXNet 1.6.0, 1.7.0, 1.8.0
  • ONNX Runtime 1.9.0, 1.10.0, 1.11.0

Intel® Neural Compressor v1.12 Release

27 May 14:40

Features

  • Quantization

    • Support accuracy-aware AMP (INT8/BF16/FP32) on PyTorch
    • Improve post-training quantization (static & dynamic) on PyTorch
    • Improve post-training quantization on TensorFlow
    • Improve QLinear and QDQ quantization modes on ONNX Runtime
    • Improve accuracy-aware AMP (INT8/FP32) on ONNX Runtime
  • Pruning

    • Improve pruning-once-for-all for NLP models
  • Sparsity

    • Support experimental sparse kernel for reference examples
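The idea behind accuracy-aware mixed precision can be sketched as a greedy fallback loop: start with every operator in low precision, then move operators back to FP32 until the accuracy drop is within tolerance. This is a generic illustration in plain Python, not the tuning strategy the library actually implements. The operator names, the penalty table, and `toy_evaluate` are made up for the example, and only two precisions are modeled.

```python
from typing import Callable, Dict, List

Config = Dict[str, str]  # operator name -> "int8" or "fp32"


def tune_precision(
    ops: List[str],
    evaluate: Callable[[Config], float],
    baseline: float,
    tolerance: float,
) -> Config:
    """Greedily fall back operators to FP32 until accuracy is within tolerance."""
    config = {op: "int8" for op in ops}
    while baseline - evaluate(config) > tolerance:
        candidates = [op for op in ops if config[op] == "int8"]
        if not candidates:
            break
        # Pick the operator whose fallback recovers the most accuracy.
        best = max(candidates, key=lambda op: evaluate({**config, op: "fp32"}))
        config[best] = "fp32"
    return config


# Hypothetical per-operator accuracy cost of running in INT8.
PENALTY = {"conv1": 0.001, "softmax": 0.030, "matmul": 0.002}


def toy_evaluate(config: Config) -> float:
    """Stand-in for a real evaluation run on a validation set."""
    return 0.90 - sum(PENALTY[op] for op, dtype in config.items() if dtype == "int8")


result = tune_precision(list(PENALTY), toy_evaluate, baseline=0.90, tolerance=0.01)
```

In this toy setup only the most sensitive operator ends up in FP32, and the rest stay in INT8.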

Productivity

  • Support model deployment by loading INT8 models directly from HuggingFace model hub
  • Improve GUI with optimized model downloading, performance profiling, etc.

Ecosystem

  • Highlight simple quantization usage with a few clicks on ONNX Model Zoo
  • Upstream INC quantized models (ResNet101, Tiny YoloV3) to ONNX Model Zoo

Examples

  • Add BERT-Mini distillation + quantization notebook example
  • Add DLRM & SSD-ResNet34 quantization examples on IPEX
  • Improve BERT structured sparsity training example

Validated Configurations

  • Python 3.8, 3.9, 3.10
  • CentOS 8.3 & Ubuntu 18.04 & Win10
  • TensorFlow 2.6.2, 2.7, 2.8
  • Intel TensorFlow 1.15.0 UP3, 2.7, 2.8
  • PyTorch 1.8.0+cpu, 1.9.0+cpu, 1.10.0+cpu
  • IPEX 1.8.0, 1.9.0, 1.10.0
  • MXNet 1.6.0, 1.7.0, 1.8.0
  • ONNX Runtime 1.8.0, 1.9.0, 1.10.0

Intel® Neural Compressor v1.11 Release

15 Apr 14:04

Features

  • Quantization
    • Supported QDQ as experimental quantization format for ONNX Runtime
    • Improved FX symbolic tracing for PyTorch
    • Supported multi-metrics for quantization tuning
  • Knowledge distillation
    • Improved distillation algorithm for intermediate layer knowledge transfer
  • Productivity
    • Improved quantization productivity for ONNX Runtime through GUI
    • Improved PyTorch INT8 model save/load methods
  • Ecosystem
    • Upstreamed INC quantized YOLOv3, DenseNet, Mask R-CNN, and YOLOv4 models to ONNX Model Zoo
    • Became a PyTorch ecosystem tool shortly after the PyTorch INC tutorial was published
  • Examples
    • Added INC quantized ResNet50 v1.5 and BERT-Large model for IPEX
    • Supported dynamic quantization & weight sharing on bare metal reference engine
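For readers unfamiliar with the QDQ format mentioned above: a quantize step followed immediately by a dequantize step keeps tensors in floating point while snapping their values to the INT8 grid. The sketch below shows that arithmetic in plain Python for signed 8-bit values. The function names are illustrative and are not ONNX Runtime or Intel® Neural Compressor calls.

```python
def quantize_linear(x: float, scale: float, zero_point: int) -> int:
    """Round to the nearest integer code and saturate to the signed 8-bit range."""
    return max(-128, min(127, round(x / scale) + zero_point))


def dequantize_linear(q: int, scale: float, zero_point: int) -> float:
    return (q - zero_point) * scale


def fake_quantize(x: float, scale: float, zero_point: int) -> float:
    """A Q -> DQ pair: the output is a float that lies on the INT8 grid."""
    return dequantize_linear(quantize_linear(x, scale, zero_point), scale, zero_point)
```

With `scale=0.1` and `zero_point=0`, an input of `0.26` comes back as roughly `0.3`, and anything above `12.7` saturates at roughly `12.7`.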

Intel® Neural Compressor v1.10 Release

28 Feb 05:27
1eb6529

Features

  • Quantization
    • Supported quantization on the latest deep learning frameworks
    • Supported quantization for a new model domain (Audio)
    • Supported compatible quantization recipes across framework upgrades
  • Pruning & Knowledge distillation
    • Supported fine-tuning and quantization using INC & Optimum for “Prune Once for All: Sparse Pre-Trained Language Models” published at ENLSP NeurIPS Workshop 2021
  • Structured sparsity
    • Validated the sparsity training recipes across multiple model domains (CV, NLP, and Recommendation System)

Productivity

  • Improved INC GUI for easy quantization
  • Supported Windows OS conda installation

Ecosystem

  • Upgraded INC v1.9 into HuggingFace Optimum
  • Upstreamed INC quantized MobileNet and Faster R-CNN models to ONNX Model Zoo

Examples

  • Supported quantization on 300 random models
  • Added bare-metal examples for Bert-mini and DLRM

Validated Configurations

  • Python 3.7, 3.8, 3.9
  • CentOS 8.3 & Ubuntu 18.04 & Win10
  • TensorFlow 2.6.2, 2.7, 2.8
  • Intel TensorFlow 1.15.0 UP3, 2.7, 2.8
  • PyTorch 1.8.0+cpu, 1.9.0+cpu, 1.10.0+cpu
  • IPEX 1.8.0, 1.9.0, 1.10.0
  • MXNet 1.6.0, 1.7.0, 1.8.0
  • ONNX Runtime 1.8.0, 1.9.0, 1.10.0

Distribution:

Type    Channel  Link                                            Install Command
Source  GitHub   https://github.com/intel/neural-compressor.git  $ git clone https://github.com/intel/neural-compressor.git
Binary  Pip      https://pypi.org/project/neural-compressor      $ pip install neural-compressor
Binary  Conda    https://anaconda.org/intel/neural-compressor    $ conda install neural-compressor -c conda-forge -c intel
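After installing through one of the channels above, a quick way to confirm which version ended up in the environment is to query the package metadata. The sketch below uses only the Python standard library (`importlib.metadata`, available from Python 3.8). The helper name `installed_version` is ours, and the distribution name is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "neural-compressor") -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found if found else "neural-compressor is not installed")
```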

Contact:

If you have any questions, please feel free to contact [email protected].

Intel® Neural Compressor v1.9 Release

04 Jan 01:56
768c49e

Features

  • Knowledge distillation

    • Supported one-shot compression pipelines (knowledge distillation during quantization-aware training) on PyTorch
    • Added more distillation examples on TensorFlow and PyTorch
  • Quantization

    • Supported multi-objective tuning for quantization
    • Supported Intel Extension for PyTorch v1.10
    • Improved quantization-aware training support on PyTorch v1.10
  • Pruning

    • Added more magnitude pruning examples on TensorFlow
  • Reference bare-metal examples

    • Supported BF16 optimizations on NLP models
    • Added sparse DLRM model (experimental)
  • Productivity

    • Added a Python-friendly API (alternative to the YAML configuration file)
    • Made user-facing APIs more Pythonic
  • Ecosystem

    • Integrated pruning API into HuggingFace Optimum
    • Added ssd-mobilenetv1, efficientnet, ssd, fcn_rn50, inception_v1 quantized models to ONNX Model Zoo
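For context on the knowledge distillation items above, the sketch below shows a commonly used distillation loss: a weighted sum of the hard-label cross-entropy and a temperature-softened KL term between teacher and student. It is plain Python written for illustration. The release notes do not say which formulation the library uses, so treat the loss form, the function names, and the default `temperature` and `alpha` values as assumptions.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    student_logits: List[float],
    teacher_logits: List[float],
    label: int,
    temperature: float = 2.0,
    alpha: float = 0.5,
) -> float:
    """alpha * CE(student, label) + (1 - alpha) * T^2 * KL(teacher || student)."""
    hard = -math.log(softmax(student_logits)[label])
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft
```

When the student already matches the teacher, the KL term vanishes and only the hard-label term remains.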

Validated Configurations

  • Python 3.7 & 3.8 & 3.9
  • CentOS 8.3 & Ubuntu 18.04
  • TensorFlow 2.6.2 & 2.7
  • Intel TensorFlow 2.4.0, 2.5.0 and 1.15.0 UP3
  • PyTorch 1.8.0+cpu, 1.9.0+cpu, IPEX 1.8.0
  • MXNet 1.6.0, 1.7.0, 1.8.0
  • ONNX Runtime 1.6.0, 1.7.0, 1.8.0

Distribution:

Type    Channel  Link                                            Install Command
Source  GitHub   https://github.com/intel/neural-compressor.git  $ git clone https://github.com/intel/neural-compressor.git
Binary  Pip      https://pypi.org/project/neural-compressor      $ pip install neural-compressor
Binary  Conda    https://anaconda.org/intel/neural-compressor    $ conda install neural-compressor -c conda-forge -c intel

Contact:

If you have any questions, please feel free to contact [email protected].

Intel® Neural Compressor v1.8.1 Release

10 Dec 07:24

Features

Validated Configurations

  • Python 3.6 & 3.7 & 3.8 & 3.9
  • CentOS 8.3 & Ubuntu 18.04
  • TensorFlow 2.6.2 & 2.7
  • Intel TensorFlow 2.4.0, 2.5.0 and 1.15.0 UP3
  • PyTorch 1.8.0+cpu, 1.9.0+cpu, IPEX 1.8.0
  • MXNet 1.6.0, 1.7.0, 1.8.0
  • ONNX Runtime 1.6.0, 1.7.0, 1.8.0

Distribution:

Type    Channel  Link                                            Install Command
Source  GitHub   https://github.com/intel/neural-compressor.git  $ git clone https://github.com/intel/neural-compressor.git
Binary  Pip      https://pypi.org/project/neural-compressor      $ pip install neural-compressor
Binary  Conda    https://anaconda.org/intel/neural-compressor    $ conda install neural-compressor -c conda-forge -c intel

Contact:

If you have any questions, please feel free to contact [email protected].

Intel® Neural Compressor v1.8 Release

22 Nov 05:22

Features

  • Knowledge distillation
    • Implemented the algorithms from the paper “Prune Once for All” accepted at the NeurIPS 2021 ENLSP workshop
    • Supported optimization pipelines (knowledge distillation & quantization-aware training) on PyTorch
  • Quantization
    • Added support for ONNX Runtime 1.7
    • Added support for TensorFlow 2.6.2 and 2.7
    • Added support for PyTorch 1.10
  • Pruning
    • Supported magnitude pruning on TensorFlow
  • Acceleration library
    • Supported Hugging Face top 10 downloaded NLP models

Productivity

  • Added performance profiling feature to INC UI service
  • Improved ease-of-use user interface for quantization with a few clicks

Ecosystem

  • Added a notebook on using the HuggingFace optimization library (Optimum) with Transformers
  • Enabled top 20 downloaded Hugging Face NLP models with Optimum
  • Upstreamed more INC quantized models to ONNX Model Zoo

Validated Configurations

  • Python 3.6 & 3.7 & 3.8 & 3.9
  • CentOS 8.3 & Ubuntu 18.04
  • TensorFlow 2.6.2 & 2.7
  • Intel TensorFlow 2.4.0, 2.5.0 and 1.15.0 UP3
  • PyTorch 1.8.0+cpu, 1.9.0+cpu, IPEX 1.8.0
  • MXNet 1.6.0, 1.7.0, 1.8.0
  • ONNX Runtime 1.6.0, 1.7.0, 1.8.0

Distribution:

Type    Channel  Link                                            Install Command
Source  GitHub   https://github.com/intel/neural-compressor.git  $ git clone https://github.com/intel/neural-compressor.git
Binary  Pip      https://pypi.org/project/neural-compressor      $ pip install neural-compressor
Binary  Conda    https://anaconda.org/intel/neural-compressor    $ conda install neural-compressor -c conda-forge -c intel

Contact:

If you have any questions, please feel free to contact [email protected].

Intel® Neural Compressor v1.7.1 Release

24 Oct 23:35

Intel® Neural Compressor (formerly known as Intel® Low Precision Optimization Tool) v1.7.1 release highlights:

Features

  • Acceleration library
    • Support unified buffer memory allocation policy

Ecosystem

  • Upstreamed INC quantized models (alexnet/caffenet/googlenet/squeezenet) to ONNX Model Zoo

Documentation

  • Performance and accuracy data update

Validated Configurations

  • Python 3.6 & 3.7 & 3.8 & 3.9
  • CentOS 8.3 & Ubuntu 18.04
  • TensorFlow 2.6.0
  • Intel TensorFlow 2.4.0, 2.5.0 and 1.15.0 UP3
  • PyTorch 1.8.0+cpu, 1.9.0+cpu, IPEX 1.8.0
  • MXNet 1.6.0, 1.7.0, 1.8.0
  • ONNX Runtime 1.6.0, 1.7.0, 1.8.0

Distribution:

Type    Channel  Link                                            Install Command
Source  GitHub   https://github.com/intel/neural-compressor.git  $ git clone https://github.com/intel/neural-compressor.git
Binary  Pip      https://pypi.org/project/neural-compressor      $ pip install neural-compressor
Binary  Conda    https://anaconda.org/intel/neural-compressor    $ conda install neural-compressor -c conda-forge -c intel

Contact:

If you have any questions, please feel free to contact the INC maintainers.

Intel® Neural Compressor v1.7 Release

01 Oct 06:05

Intel® Neural Compressor (formerly known as Intel® Low Precision Optimization Tool) v1.7 release highlights:

Features

  • Quantization
    • Improved quantization accuracy in SSD-ResNet34 and MobileNet v3 on TensorFlow
  • Pruning
    • Supported magnitude pruning on TensorFlow
  • Knowledge distillation
    • Supported knowledge distillation on PyTorch
  • Multi-node support
    • Supported multi-node pruning with distributed dataloader on PyTorch
    • Supported multi-node inference for benchmark on PyTorch
  • Acceleration library
    • Added a domain-specific acceleration library for NLP models
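As a reminder of what magnitude pruning does, the sketch below zeroes the smallest-magnitude weights of a flat weight list until a target sparsity is reached. It is a plain-Python illustration of the general technique, not the TensorFlow integration shipped in this release, and the function name `magnitude_prune` is hypothetical.

```python
from typing import List


def magnitude_prune(weights: List[float], sparsity: float) -> List[float]:
    """Zero the `sparsity` fraction of weights with the smallest absolute value."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]
```

For example, pruning `[0.5, -0.1, 0.05, -2.0]` at 50% sparsity removes the two smallest-magnitude entries and keeps `0.5` and `-2.0`.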

Productivity

  • Supported configuration-free (pure Python) quantization
  • Improved ease-of-use user interface for quantization with a few clicks

Ecosystem

  • Integrated into HuggingFace optimization library (Optimum)
  • Upstreamed INC quantized models (RN50, VGG16) to ONNX Model Zoo

Documentation

  • Add tutorial and examples for knowledge distillation
  • Add tutorial and examples for multi-node training
  • Add tutorial and examples for acceleration library

Validated Configurations

  • Python 3.6 & 3.7 & 3.8 & 3.9
  • CentOS 8.3 & Ubuntu 18.04
  • TensorFlow 2.6.0
  • Intel TensorFlow 2.4.0, 2.5.0 and 1.15.0 UP3
  • PyTorch 1.8.0+cpu, 1.9.0+cpu, IPEX 1.8.0
  • MXNet 1.6.0, 1.7.0, 1.8.0
  • ONNX Runtime 1.6.0, 1.7.0, 1.8.0

Distribution:

Type    Channel  Link                                            Install Command
Source  GitHub   https://github.com/intel/neural-compressor.git  $ git clone https://github.com/intel/neural-compressor.git
Binary  Pip      https://pypi.org/project/neural-compressor      $ pip install neural-compressor
Binary  Conda    https://anaconda.org/intel/neural-compressor    $ conda install neural-compressor -c conda-forge -c intel

Contact:

If you have any questions, please feel free to contact [email protected].