I’d like to request the adoption of the Multi-Token Prediction (MTP) drafter for Gemma-4-31b, which can boost its throughput by up to 3x without any loss in performance. It would be a game changer for a variety of use cases.
New Submission
Feature Requests
New Model
11 days ago

Dongwon