Name
.gitignore
README.md
_copy.json.config
deepseek_ov_config.json
deepseek_ov_config.json.config
deepseek_ov_npu_config.json
deepseek_ov_npu_config.json.config
deepseek_trtrtx.json
deepseek_trtrtx.json.config
deepseek_vitis_ai_config.json
deepseek_vitis_ai_config.json.config
inference_model.json
inference_sample.ipynb
info.yml
model_project.config
requirements.txt
winml.py

DeepSeek-R1-Distill-Qwen-7B Model Optimization

This repository demonstrates the optimization of the DeepSeek-R1-Distill-Qwen-7B model using post-training quantization (PTQ) techniques. The optimization process is divided into these workflows:

- OpenVINO for Intel® GPU/NPU
  - This process uses OpenVINO-specific passes such as OpenVINOOptimumConversion, OpenVINOIoUpdate, and OpenVINOEncapsulation.
- NVModelOptQuantization for NVIDIA TRT for RTX GPU

Intel® Workflows

These workflows perform quantization with Optimum Intel®. They run the following optimization pipeline:

HuggingFace Model -> Quantized OpenVINO model -> Quantized encapsulated ONNX OpenVINO IR model
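
The pipeline above can be sketched as an ordered set of passes in a workflow configuration. This is a minimal illustration only: the three pass type names come from this README, while the pass keys (`optimum_convert`, `io_update`, `encapsulation`), the `input_model` and `output_dir` fields, the model identifier, and the output path are assumptions made for the example. It does not reproduce the contents of `deepseek_ov_config.json`.

```python
import json

# Pass type names are taken from the README. Every other key and value
# below is an illustrative assumption, not the repository's actual config.
PIPELINE_PASSES = [
    ("optimum_convert", "OpenVINOOptimumConversion"),  # HuggingFace model -> quantized OpenVINO model
    ("io_update", "OpenVINOIoUpdate"),                 # adjust the OpenVINO model's inputs/outputs
    ("encapsulation", "OpenVINOEncapsulation"),        # wrap the OpenVINO IR in an ONNX model
]


def build_workflow_config(model_id, output_dir):
    """Build a config dict whose passes appear in pipeline order."""
    # Python dicts preserve insertion order, so the pass order is kept.
    passes = {name: {"type": pass_type} for name, pass_type in PIPELINE_PASSES}
    return {
        "input_model": {"type": "HfModel", "model_path": model_id},  # assumed field names
        "passes": passes,
        "output_dir": output_dir,  # hypothetical output location
    }


config = build_workflow_config(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",  # assumed model identifier
    "model/deepseek_ov",                        # hypothetical path
)
print(json.dumps(config, indent=2))
```

Keeping the passes in a single ordered list makes the conversion, I/O update, and encapsulation sequence explicit, which mirrors the arrow notation used above.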

NVModelOptQuantization for NVIDIA TRT for RTX GPU

To run this workflow, first install CUDA as required in the linked Doc.
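
Before running the workflow, it can help to confirm that the CUDA command-line tools are visible on the system. The sketch below is a rough pre-flight check, assuming that `nvcc` and `nvidia-smi` are on `PATH` after a typical CUDA and driver installation. The helper name `find_cuda_tools` is hypothetical, and finding the tools does not confirm that the installed version matches what the workflow requires.

```python
import shutil


def find_cuda_tools(tools=("nvcc", "nvidia-smi")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    # Report which CUDA-related tools were found; this does not check versions.
    for tool, path in find_cuda_tools().items():
        print(f"{tool}: {path or 'not found on PATH'}")
```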
