NVIDIA / Model-Optimizer Public

Fork 426
Star 2.9k

Pull requests: NVIDIA/Model-Optimizer

203 Open 1,067 Closed

Pull requests list

Fix non-deterministic T5 calibration NaN on multi-GPU

#1636 opened Jun 5, 2026 by kevalmorabia97 Collaborator

Add NVFP4 fakequant for attention BMM operands (prefill + decode)

#1635 opened Jun 5, 2026 by kaix-nv Contributor • Draft

Route decode-only skip-softmax through the decode kernel in vLLM

#1634 opened Jun 5, 2026 by kaix-nv Contributor • Draft

docs(skills): fix VLM PTQ ViT-quantization + AA eval judge/tool-call gaps

#1632 opened Jun 5, 2026 by Edwardf0t1 Contributor

fix(export): correct unified_export_megatron at EP > 1 and DP > 1

#1631 opened Jun 4, 2026 by yueshen2016 Contributor

3 of 4 tasks

[minor] fix for GLM4.7 mtp module in PTQ

#1630 opened Jun 4, 2026 by Fridah-nv Contributor

remove deprecated get_default_load_sharded_strategy

#1629 opened Jun 4, 2026 by dimapihtar

[6058841] Consistent types on If/Loop/Scan subgraphs during FP16/BF16 conversion

#1628 opened Jun 4, 2026 by ajrasane Contributor • Draft

[6058907] Fix ShapeInferenceError in ONNX int8+fp16 quantization of weakly-typed models

#1627 opened Jun 4, 2026 by ajrasane Contributor

docs(eval skill): vLLM backend env vars + SLURM HF-cache/cpu_partition guidance

#1625 opened Jun 4, 2026 by cjluo-nv Collaborator

Add efficient Triton decode attention kernel with fused skip-softmax …

#1624 opened Jun 3, 2026 by kaix-nv Contributor • Draft

Skip-Softmax calibration in vLLM

#1622 opened Jun 3, 2026 by kaix-nv Contributor • Draft

DFlash speculative decoding for MiniMax-M2.7 (FSDP2): auto mask-token, FSDP2 resume fixes, per-checkpoint draft export

#1621 opened Jun 3, 2026 by yeyu-nvidia Contributor

Add W4A16 NVFP4-MSE Qwen3.5 dense/MoE PTQ recipes

#1620 opened Jun 3, 2026 by cjluo-nv Collaborator

[OMNIML-4930] specdec_bench cell t1_d7 — Qwen/Qwen3.5-4B / MTP / vllm

#1619 opened Jun 3, 2026 by ChenhanYu Collaborator • Draft

[OMNIML-4929] specdec_bench cell t1_d3 — Qwen/Qwen3.5-4B / MTP / vllm

#1618 opened Jun 3, 2026 by ChenhanYu Collaborator • Draft

[OMNIML-4928] specdec_bench cell t0_d7 — Qwen/Qwen3.5-4B / MTP / vllm

#1617 opened Jun 3, 2026 by ChenhanYu Collaborator • Draft

[OMNIML-4927] specdec_bench cell t0_d3 — Qwen/Qwen3.5-4B / MTP / vllm

#1614 opened Jun 3, 2026 by ChenhanYu Collaborator • Draft

[OMNIML-4928] cell_t0_d7

#1613 opened Jun 2, 2026 by ChenhanYu Collaborator • Draft

[Feat]: Specdec Multinode Streaming

#1611 opened Jun 2, 2026 by h-guo18 Contributor

[OMNIML-4886] specdec_bench cell t0_d3 — Qwen/Qwen3.5-4B / MTP / vllm

#1608 opened Jun 2, 2026 by ChenhanYu Collaborator • Draft

Fix torch import error to remove circular dependency & move Nemotron configs

#1606 opened Jun 2, 2026 by jenchen13 Contributor

[OMNIML-3994] Add SharedQuantState

#1605 opened Jun 2, 2026 by sychen52 Contributor

[OMNIML-4886] cell_t0_d3

#1603 opened Jun 2, 2026 by ChenhanYu Collaborator • Draft

Add NVFP4 + QAD to the Nemotron-3-Nano-30B-A3B tutorial

#1601 opened Jun 2, 2026 by kevalmorabia97 Collaborator • Draft
