Workshop v0.2.0

Improvements

  • Full-precision and quantized GGUF conversion pipeline
  • Configurable calibration datasets for quantization-aware conversion
  • .nvx packaging: model + tokenizer + inference config in a single deployable artifact
  • Delta-based model versioning with diff and merge tooling
  • Batch processing support for large model collections
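To illustrate the idea behind delta-based versioning, here is a minimal sketch in plain Python. It is not Workshop's actual API or on-disk format: the function names `diff` and `merge`, the delta layout, and the use of flat lists of floats as stand-ins for tensors are all assumptions made for illustration. The point is only that a delta stores what changed between two checkpoints, and that applying it to the base reproduces the new version.

```python
from typing import Dict, List

# Hypothetical stand-in for a checkpoint: tensor name -> flat list of values.
Tensors = Dict[str, List[float]]


def diff(base: Tensors, new: Tensors) -> dict:
    """Record only what differs between two checkpoints (illustrative layout)."""
    delta = {"changed": {}, "added": {}, "removed": []}
    for name, values in new.items():
        if name not in base or len(base[name]) != len(values):
            # New tensor, or its size changed: store it whole.
            delta["added"][name] = list(values)
        elif base[name] != values:
            # Same size: store only the positions whose value changed.
            delta["changed"][name] = {
                i: v for i, (b, v) in enumerate(zip(base[name], values)) if b != v
            }
    delta["removed"] = [name for name in base if name not in new]
    return delta


def merge(base: Tensors, delta: dict) -> Tensors:
    """Apply a delta produced by diff() to a base checkpoint."""
    result = {
        name: list(values)
        for name, values in base.items()
        if name not in delta["removed"]
    }
    for name, updates in delta["changed"].items():
        for index, value in updates.items():
            result[name][index] = value
    for name, values in delta["added"].items():
        result[name] = list(values)
    return result
```

Storing replaced values rather than arithmetic differences keeps the round trip exact for floating-point data, at the cost of a less compressible delta. Which trade-off Workshop makes is not stated in these notes.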

Performance Improvements

  • Quantization pipeline throughput increased by 28% via parallelized tensor operations
  • Delta diff computation optimized: 4x faster for models over 10B parameters
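As a rough picture of what parallelizing per-tensor work can look like, the sketch below quantizes each tensor independently and fans the work out over a thread pool. Everything here is an assumption for illustration: the names `quantize_int8` and `quantize_all`, the symmetric int8 scheme, and the flat-list tensor stand-ins are not taken from Workshop's pipeline.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


def quantize_int8(values: List[float]) -> Tuple[float, List[int]]:
    """Symmetric per-tensor int8 quantization: returns (scale, quantized values)."""
    peak = max((abs(v) for v in values), default=0.0)
    # An all-zero (or empty) tensor gets a neutral scale to avoid dividing by zero.
    scale = peak / 127 if peak else 1.0
    return scale, [round(v / scale) for v in values]


def quantize_all(
    tensors: Dict[str, List[float]], workers: int = 4
) -> Dict[str, Tuple[float, List[int]]]:
    """Quantize every tensor independently, distributing tensors across workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with the keys.
        results = list(pool.map(quantize_int8, tensors.values()))
    return dict(zip(tensors.keys(), results))
```

In pure Python, threads do not speed up CPU-bound loops because of the global interpreter lock. A real pipeline would get its gains from native tensor kernels that release the lock, or from separate processes. The sketch only shows the structure: tensors have no dependencies on one another, so they can be processed concurrently.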

Bug Fixes

  • Fixed incorrect attention head mapping during GGUF conversion of MHA architectures
  • Fixed .nvx package corruption when tokenizer config exceeds 64KB