Plans

Explaining how our different subscription tiers work.

Featherless provides serverless access to models as well as agent runtimes, so you can power AI applications without managing infrastructure.

Our plans are subscription- and concurrency-based: each plan allows unlimited monthly requests with a fixed number of concurrent requests. Subscription tiers differ by the model sizes and context lengths offered.
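Because plans cap concurrent requests rather than total requests, a client usually wants to keep its in-flight calls at or below the plan's limit. The following is a minimal sketch of that idea using a semaphore. The names `MAX_CONCURRENT`, `fake_request`, and `run_all` are hypothetical, and `fake_request` is a stand-in for a real inference call, not part of any Featherless API.

```python
import asyncio

# Assumption: the concurrency limit of your plan (e.g. 2 on Featherless Basic).
MAX_CONCURRENT = 2


async def fake_request(prompt: str) -> str:
    """Hypothetical stand-in for a real inference call."""
    await asyncio.sleep(0.01)  # simulate network and inference latency
    return f"response to {prompt!r}"


async def run_all(prompts, limit=MAX_CONCURRENT, request=fake_request):
    """Send every prompt while keeping at most `limit` requests in flight.

    Returns the responses (in prompt order) and the peak concurrency observed.
    """
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def worker(prompt):
        nonlocal in_flight, peak
        async with semaphore:  # waits here once `limit` requests are running
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await request(prompt)
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(worker(p) for p in prompts))
    return results, peak


if __name__ == "__main__":
    prompts = [f"prompt {i}" for i in range(10)]
    results, peak = asyncio.run(run_all(prompts))
    print(len(results), "responses, peak concurrency", peak)
```

The total number of prompts is unbounded here, which mirrors the unlimited monthly requests; only the number running at the same moment is throttled. Swapping `fake_request` for a real client call leaves the throttling logic unchanged.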

The consumer plans include two smaller plans for interactive chat (assistants and role-playing) and two larger plans for agentic inference and coding.

Consumer Plans

| Plan | Tier | Price (/month) | Features |
|---|---|---|---|
| Featherless Basic | Chat | $10 | • Private, secure, and anonymous usage (no logs) • Use any model up to 15B parameters • 2 concurrent connections* |
| Featherless Premium | Chat | $25 | • Private, secure, and anonymous usage (no logs) • Access any model in the catalogue (including K2.5 & GLM 5.1) • Up to 4 concurrent connections* |
| Featherless Claw Standard | Agentic | $100 | • Private, secure, and anonymous usage (no logs) • Access any model up to 229B (e.g. StepFun, MiniMax) • Up to 8 concurrent connections* • Up to 256K context • One agent sandbox |
| Featherless Claw Standard | Agentic | $200 | • Private, secure, and anonymous usage (no logs) • Access any model in the catalogue (including DeepSeek, K2.5 & GLM 5.1) • Up to 8 concurrent connections* • Up to 256K context • One agent sandbox |

*Smaller models allow for higher concurrency than larger models. See more below.

Business Plans

Business plans are scalable: you can purchase larger amounts of inference to power production applications, whether agent fleets or other AI applications.

| Plan | Price (/unit/month) | Features |
|---|---|---|
| Featherless 229B Full Context | $100 | • Private, secure, and anonymous usage (no logs) • Access any model up to 229B (e.g. StepFun, MiniMax) • Up to 8 concurrent connections* • Up to 256K context • One sandbox per unit |
| Featherless Max | $200 | • Private, secure, and anonymous usage (no logs) • Access any model in the catalogue (including DeepSeek, K2.5 & GLM 5.1) • Up to 8 concurrent connections* • Up to 256K context • One sandbox per unit |

*For more info on how the concurrency limits work visit:

Last edited: Apr 16, 2026