Lamini On-Demand
Lamini On-Demand provides a self-service, pay-as-you-go platform for running LLM tuning and inference jobs on a high-performance GPU cluster. New and existing users receive $300 in free credit to get started. Inference is billed at $0.50 per million tokens, covering input, output, and JSON structured responses. Tuning costs $1 per step on one GPU, scaling linearly with additional GPUs for faster training. Users also get advanced features such as memory tuning for mixture-of-experts models, guaranteed JSON output, and flexible bursting across GPUs. Credits can be purchased in $100 increments via the account dashboard. For enterprise-scale needs, Lamini also offers Reserved GPU clusters and Self-Managed licenses for on-premise or air-gapped deployments.
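The pricing above is simple enough to estimate in a few lines. The sketch below is illustrative only, based on the published rates ($0.50 per million inference tokens, $1 per tuning step per GPU); the function names are hypothetical and not part of any Lamini SDK.

```python
# Illustrative cost estimator for Lamini On-Demand pricing.
# Rates taken from the published pricing; helper names are hypothetical.

INFERENCE_RATE_PER_MILLION_TOKENS = 0.50
TUNING_RATE_PER_STEP_PER_GPU = 1.00
FREE_CREDIT = 300.00

def inference_cost(total_tokens: int) -> float:
    """Dollar cost for a token count (input + output combined)."""
    return total_tokens / 1_000_000 * INFERENCE_RATE_PER_MILLION_TOKENS

def tuning_cost(steps: int, gpus: int = 1) -> float:
    """Dollar cost for a tuning job; scales linearly with GPU count."""
    return steps * gpus * TUNING_RATE_PER_STEP_PER_GPU

# Example: 10M inference tokens plus a 200-step tuning run on one GPU
total = inference_cost(10_000_000) + tuning_cost(200)
print(f"${total:.2f} of ${FREE_CREDIT:.2f} free credit")  # $205.00 of $300.00
```

Under these rates, the $300 free credit covers, for example, 600M inference tokens, or 300 tuning steps on a single GPU, or any mix in between.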
Program Tiers
On-Demand (Pay-as-you-go)
Access GPU tuning and inference with $300 free credit and simple usage-based pricing.
Credit Value
$300
Duration
Until credits are exhausted
Benefits
- $300 in free credit on signup
- Inference at $0.50 per million tokens (input, output, and JSON structured responses)
- Tuning at $1 per step per GPU, scaling linearly across GPUs
- Memory tuning, guaranteed JSON output, and flexible GPU bursting
- Additional credits purchasable in $100 increments from the account dashboard
Eligibility
- Open to both new and existing Lamini users