Workshop v0.2.0
Improvements
- Full-precision and quantized GGUF conversion pipeline
- Configurable calibration datasets for quantization-aware conversion
- .nvx packaging: bundles model, tokenizer, and inference config into a single deployable artifact
- Delta-based model versioning with diff and merge tooling
- Batch processing support for large model collections
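As a rough illustration of the idea behind delta-based versioning, the sketch below stores only the tensors that differ between two model versions and rebuilds the new version from the base plus that delta. It is not Workshop's implementation or API: the `Tensors` type, `diff_models`, and `apply_delta` are hypothetical names, and small integer lists stand in for real (quantized) weights.

```python
from typing import Dict, List

# Hypothetical stand-in for model weights: tensor name -> flat list of
# integer (e.g. quantized) values. Real tooling would work on tensor files.
Tensors = Dict[str, List[int]]


def diff_models(base: Tensors, new: Tensors) -> dict:
    """Record only what changed between two versions (illustrative only)."""
    delta = {"changed": {}, "added": {}, "removed": []}
    for name, values in new.items():
        if name not in base or len(base[name]) != len(values):
            # New tensor, or its size changed: store it in full.
            delta["added"][name] = list(values)
        elif base[name] != values:
            # Same size, different values: store element-wise differences.
            delta["changed"][name] = [n - b for n, b in zip(values, base[name])]
    delta["removed"] = [name for name in base if name not in new]
    return delta


def apply_delta(base: Tensors, delta: dict) -> Tensors:
    """Rebuild the newer version from the base version plus a delta."""
    out = {n: list(v) for n, v in base.items() if n not in delta["removed"]}
    for name, d in delta["changed"].items():
        out[name] = [b + x for b, x in zip(base[name], d)]
    for name, values in delta["added"].items():
        out[name] = list(values)
    return out


base = {"attn.q": [1, 2, 3], "attn.k": [4, 5], "old.bias": [7]}
new = {"attn.q": [1, 2, 3], "attn.k": [4, 9], "new.bias": [0]}
delta = diff_models(base, new)
restored = apply_delta(base, delta)
```

Unchanged tensors such as `attn.q` never appear in the delta, which is where the storage saving would come from. Integer values are used so the round trip is exact; floating-point weights would need a byte-level or otherwise lossless difference.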
Performance Improvements
- Quantization pipeline throughput increased by 28% via parallelized tensor operations
- Delta diff computation optimized: 4x faster for models over 10B parameters
Bug Fixes
- Fixed incorrect attention head mapping during GGUF conversion of MHA architectures
- Fixed .nvx package corruption when tokenizer config exceeds 64KB