I’m using gemma-4-uncensored via the API to control NPC behavior in a Skyrim mod called SkyrimNet. Its response quality, speed, and cost are perfect for this use case, but I frequently run into 429 “model overloaded” errors. Unlike coding or character text generation on Venice’s website, this is a use case where I can’t simply regenerate a response or retry the model several times. I’ve only had consistently good results, without overloaded errors, during extreme off-peak hours (early mornings, Eastern Time). Would it be possible to direct more resources to gemma-4-uncensored so that it isn’t overloaded so often?
New Submission
Bugs
10 days ago

David Anderson