Introducing more enterprise-grade features for API customers


To help organizations scale their AI usage without over-extending their budgets, we’ve added two new ways to reduce costs on consistent and asynchronous workloads:

  • Discounted usage on committed throughput: Customers with a sustained level of tokens per minute (TPM) usage on GPT-4 or GPT-4 Turbo can request access to provisioned throughput to get discounts ranging from 10–50% based on the size of the commitment.
  • Reduced costs on asynchronous workloads: Customers can use our new Batch API to run non-urgent workloads asynchronously. Batch API requests are priced at 50% off shared prices, offer much higher rate limits, and return results within 24 hours. This is ideal for use cases like model evaluation, offline classification, summarization, and synthetic data generation.
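To make the two discounts concrete, here is a rough sketch of the arithmetic. The per-million-token prices and token counts are hypothetical placeholders, not published rates, and the function names are invented for illustration; only the 50% Batch API discount and the 10–50% commitment range come from the announcement above.

```python
# Rough cost comparison under the discounts described above.
# All prices and token volumes below are HYPOTHETICAL placeholders.

def shared_price_cost(input_tokens: int, output_tokens: int,
                      input_price_per_1m: float,
                      output_price_per_1m: float) -> float:
    """Cost of a workload at standard (shared) prices, in dollars."""
    return (input_tokens / 1_000_000 * input_price_per_1m
            + output_tokens / 1_000_000 * output_price_per_1m)


def apply_discount(cost: float, discount: float) -> float:
    """Apply a fractional discount (0.50 means 50% off)."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return cost * (1.0 - discount)


BATCH_DISCOUNT = 0.50                  # Batch API: 50% off shared prices
COMMITMENT_DISCOUNT_RANGE = (0.10, 0.50)  # depends on commitment size

if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens, 500K output tokens,
    # at assumed prices of $10 and $30 per 1M tokens.
    base = shared_price_cost(2_000_000, 500_000, 10.0, 30.0)
    batch = apply_discount(base, BATCH_DISCOUNT)
    low, high = COMMITMENT_DISCOUNT_RANGE
    print(f"Shared prices:        ${base:.2f}")
    print(f"Batch API:            ${batch:.2f}")
    print(f"Committed throughput: ${apply_discount(base, high):.2f}"
          f" to ${apply_discount(base, low):.2f}")
```

The Batch API figure only applies to work that can tolerate results arriving within 24 hours; the committed-throughput range applies to sustained usage and depends on the size of the commitment.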


We plan to keep adding new features focused on enterprise-grade security, administrative controls, and cost management. For more information on these launches, visit our API documentation or get in touch with our team to discuss custom solutions for your enterprise.
