Active

Lamini On-Demand

ai gpu llm cloud api

Lamini On-Demand is a self-service, pay-as-you-go platform for running LLM tuning and inference jobs on a high-performance GPU cluster. New and existing users receive $300 in free credit to get started. Inference is billed at $0.50 per million tokens, which covers input, output, and JSON structured responses. Tuning costs $1 per step on one GPU, and the cost scales linearly as GPUs are added to speed up a job. Features include memory tuning for mixture-of-experts models, guaranteed JSON output, and the ability to burst across multiple GPUs. Additional credits can be purchased in $100 increments from the account dashboard. For enterprise-scale needs, Lamini also offers Reserved GPU clusters and Self-Managed licenses for on-premises or air-gapped deployments.

Starting from

$300

Up to

$300

Program Tiers

On-Demand Pay-as-you-go

Access GPU tuning and inference with $300 free credit and simple usage-based pricing.

Credit Value

$300

Duration

Until credits are exhausted

Benefits

Launchpad
Basic benefits to help you get started
Level 1
$300 free credit upon signup
$0.50 per million inference tokens
$1 per tuning step per GPU (linear scaling)
Memory Tuning for mixture-of-experts models
Burst tuning across multiple GPUs
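
As a rough illustration of the usage-based pricing listed above, the following sketch estimates what a workload would cost and how much extra credit would be needed. It is not part of Lamini's SDK: the function names are hypothetical, and it assumes that tuning is billed as steps times GPUs at $1 each, that the $300 free credit is applied before any purchased credit, and that purchases are rounded up to the next $100 increment.

```python
import math

# Rates taken from the listing; everything else here is an illustrative assumption.
INFERENCE_USD_PER_MILLION_TOKENS = 0.50
TUNING_USD_PER_STEP_PER_GPU = 1.00
FREE_CREDIT_USD = 300.00
CREDIT_INCREMENT_USD = 100


def inference_cost(tokens: int) -> float:
    """Cost of inference for a total token count (input plus output)."""
    return tokens / 1_000_000 * INFERENCE_USD_PER_MILLION_TOKENS


def tuning_cost(steps: int, gpus: int = 1) -> float:
    """Cost of a tuning job, assuming linear scaling means steps * gpus * rate."""
    return steps * gpus * TUNING_USD_PER_STEP_PER_GPU


def credit_to_purchase(total_cost: float) -> int:
    """Extra credit to buy, in $100 increments, once the free credit is used up."""
    shortfall = max(0.0, total_cost - FREE_CREDIT_USD)
    return math.ceil(shortfall / CREDIT_INCREMENT_USD) * CREDIT_INCREMENT_USD


if __name__ == "__main__":
    # Hypothetical workload: 10M inference tokens plus a 100-step tuning job on 4 GPUs.
    total = inference_cost(10_000_000) + tuning_cost(steps=100, gpus=4)
    print(f"Estimated cost: ${total:.2f}")
    print(f"Credit to purchase beyond the free $300: ${credit_to_purchase(total)}")
```

Under these assumptions, the free credit alone would cover 600 million inference tokens or 300 single-GPU tuning steps.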

Eligibility

Lunar Gravity
Simple application process, usually just a quick form
Level 1
New and existing Lamini users
No long-term commitments required

Apply Now

1. Create a Lamini account

Sign up on the Lamini platform to automatically receive $300 in free credit.

2. Run your first job

Start a tuning or inference job in the On-Demand dashboard to use your free credit.

Frequently Asked Questions