Limits

Limits are applied to all accounts in order to prevent any impact on service caused by deliberate or accidental abuse. Limits vary depending on the package associated with your account and upgrading to a package with additional message, connection or channel allowance increases any related limits. Enterprise package limits are customizable.

You can view the limits for your package in the account dashboard or use the calculator to see how limits change based on connection, channel and message quotas.

Overview

Limits are either time-based, quantity-based or instantaneous rates. The effect of hitting or exceeding a limit varies depending on its type.

Limits are also categorized as local or global depending on whether they relate to a single resource, or the aggregate of all resources of a given type:

Local limits
relate to a single resource, such as an individual channel or connection. Examples of local limits include the maximum number of channels per connection and the maximum publish rate on a channel. Email notifications are not sent for local limits.
Global limits
relate to all resources associated with the account, such as peak connections. Global limits are an aggregate of usage across all applications in your account. Email notification warnings will be sent when a global limit is close to being reached, or has been reached.

Certain limits have both a soft and a hard limit:

Soft limits
are the lower threshold of a limit. For quota limits, the soft limit is equivalent to the actual quota. No restrictions are enforced when a soft limit is reached, however for quota limits, any usage above the soft limit is charged as overages.
Hard limits
are the point at which restrictions are enforced. They are always higher than the equivalent soft limit.

All notifications regarding limit warnings, rate limiting and exceeded limits are logged. Current and historic notifications can be viewed in your account. This includes details such as how much a limit was exceeded by and the expiry date of any blocks.

Instantaneous rate limits

Instantaneous rate limits relate to the frequency of a given operation at a moment in time and are expressed as a number of operations per second. How local instantaneous rate limits and global instantaneous rate limits are applied differs.

Local instantaneous rate limits

Local instantaneous rate limits reject operations in excess of their hard limit and return an error code. For example, the limit on a self-service package for the message publish rate on an individual channel is 50 messages per second. Any message publish attempts in excess of the 50 messages per second will be rejected and an error code will be returned to the publisher.

Global instantaneous rate limits

Global instantaneous rate limits apply rate suppression to operations that exceed their hard limits. Rate suppression is calculated on a rolling probabilistic basis. For example, the limit on a self-service package for publishing messages into a queue is 200 messages per second. If a queue rule is attempting to publish 400 messages per second into the queue, each message will have a 50% chance of being rejected. The suppression probability is continuously updated based on the publishing rate.

Global instantaneous rate limits only have a hard limit. Once the hard limit rate has been exceeded then message suppression will occur. As soon as the rate drops below the hard limit threshold, the suppression probability will decrease to zero.

Time-based limits

Time-based limits relate to the number of operations that can be made within an hour, or within a month. If the hard limit for a time-based limit is hit, then the resource is blocked for the remainder of that period.

For example, if the hourly limit for the number of API requests is hit, no more API requests can be made for the remainder of that hour. Hourly limits are based on a clock hour, so regardless of whether the limit is hit at 11:01 or 11:59, it will reset at 12:00.

Monthly limits, such as the total count of messages, reset the next month.

Quantity-based limits

Quantity-based limits specify a set amount of resources or operations that can be in use simultaneously. Usage above the hard limit is restricted. The restrictions are only removed when a resource or operation is reduced below the hard limit, or the quota for the limit is increased.

An example of a quantity-based limit is the one for peak connections. When the hard limit is reached then no more connections can be made to Ably until some of the existing ones disconnect and the total is back below the hard limit.

Quota limits

Quotas are based on the account package that you are on. Quotas specify the number of peak channels, peak connections, messages and bandwidth that you expect to consume each month. Monthly quotas can be configured for self-service accounts. Quotas for free accounts are fixed. Enterprise account quotas are set in a way that ensures that limits are not reached during normal operations for an account.

The quotas for an account can be considered a soft limit, with no restrictions coming into effect until the hard limit is reached. Free accounts have a small buffer between the soft and hard limits. Self-service accounts have a considerably larger buffer between soft and hard limits, however any usage over the soft limit and up to the hard limit is charged as overages.

Limit Free Self-service Business (legacy) Enterprise
Soft Hard Soft Hard Soft Hard Soft Hard
Peak connections 200 240 Quota 2.5x quota Quota 5x quota Custom
Peak channels 200 240 Quota 2.5x quota Quota 5x quota Custom
Total messages 6,000,000 7,200,000 Quota 2.5x quota Quota 5x quota Custom
Messages (per hour) 84,000 100,000 Quota / 72 2.5x quota / 72 Quota / 72 5x quota / 72 Custom
Message rate (per second) 70 Hourly * 2.5 Hourly * 2.5 Custom
Total bandwidth (GiB) 11.5 13.8 Quota 2.5x quota Quota 5x quota Custom
Bandwidth (per hour in MiB) 160 190 Quota / 72 2.5x quota / 72 Quota / 72 5x quota / 72 Custom
Bandwidth rate (per second in KiB ) 132 Hourly * 2.5 Hourly * 2.5 Custom

Peak connections

Peak connections are the maximum number of realtime clients connected to Ably simultaneously at any point within a month.

For example, if you have 10,000 customers and at the busiest time of the month 500 connect to Ably at the same time, then your peak connections figure is 500.

The peak connections limit is a quantity-based limit. If the hard limit is reached, no more connections can be made to Ably until some of the existing ones disconnect.

Peak connection notifications

There are three types of peak connection limit notifications you may receive:

  • connections.warning – when you reach 80% of your pre-paid quota.
  • connections.soft – when your pre-paid quota has been exceeded.
  • connections.hard – when the hard limit on the account has been exceeded.

Peak channels

Peak channels are the maximum number of channels that are active simultaneously at any point within a month.

For example, if you have 10,000 customers and at the busiest time of the month 500 users each attach to their own individual channel as well as a single channel they are all members of, then your peak channel figure is 501. Alternatively, if at the busiest time of the month 2,000 users are evenly split between 50 channels, with 40 users in each, then your peak channel figure is 50.

A channel is considered active when a message is published on the channel using the REST API, or a realtime client attaches to it. The channel will remain active for as long as a client is attached to it, unless all clients explicitly detach from the channel or close their connections.

A channel will automatically close when there are no more realtime clients attached to it and approximately one minute has passed since the last client detached and since the last message was published to the channel.

The peak channels limit is a quantity-based limit. If the hard limit is reached, additional channels cannot be created until some of the existing active ones are detached.

Peak channel notifications

There are three types of peak channel limit notifications you may receive:

  • channels.warning – when you reach 80% of your pre-paid quota.
  • channels.soft – when your pre-paid quota has been exceeded.
  • channels.hard – when the hard limit on the account has been exceeded

Messages

Message counts and rates are the number of messages published and received in your account.

  • The total messages for a month are set based on your quota.
  • The hourly message limit is calculated using the total message limit divided by 72.
  • The message rate is calculated using the hourly limit multiplied by 2.5, divided into seconds. For example, if the hourly limit is 208,000 then the message rate will be (208,000 x 2.5) / 60 / 60 = 145.

If the message rate is exceeded, a global instantaneous rate limit will be applied to message delivery and publishing. The hourly and total message limits are time-based. If the hard limit is hit for those, message delivery and publishing will be blocked until the following hour or month.

Message notifications

Message limit notifications are split into hourly, monthly and message rate notifications.

Hourly notifications

There are three types of hourly message limit notifications you may receive:

  • messages.hourly.warning.count – when you reach 80% of your pre-paid quota.
  • messages.hourly.soft.count – when the soft limit on the account, derived from the pre-paid quota, has been exceeded.
  • messages.hourly.hard.count – when the hard limit on the account has been exceeded.
Monthly notifications

There are three types of monthly message limit notifications you may receive:

  • messages.monthly.warning.count – when you reach 80% of your pre-paid quota.
  • messages.monthly.soft.count – when your pre-paid quota has been exceeded.
  • messages.monthly.hard.count – when the hard limit on the account has been exceeded.
Message rate notifications

There are two types of message rate limit notifications you may receive:

  • messages.maxRate.warning – when you reach 50% of the hard limit on the account.
  • messages.maxRate.hard – when the hard limit on the account has been exceeded.

Bandwidth

Bandwidth is the amount of data transferred through messages.

  • The total bandwidth is calculated using the average message size of 2KiB multiplied by the total message count.
  • The hourly bandwidth limit is calculated using the total bandwidth limit divided by 72.
  • The bandwidth rate is calculated using the hourly bandwidth limit multiplied by 2.5, divided into seconds.

If the bandwidth rate is exceeded, a global instantaneous rate limit will be applied to message delivery and publishing. The hourly and total bandwidth limits are time-based. If the hard limit is hit for those, message delivery and publishing will be blocked until the following hour or month.

Bandwidth notifications

Bandwidth limit notifications are split into hourly and monthly notifications.

Hourly notifications

There are three types of hourly bandwidth limit notifications you may receive:

  • messages.hourly.warning.data – when you reach 80% of your pre-paid quota.
  • messages.hourly.soft.data – when the soft limit on the account, derived from the pre-paid quota, has been exceeded.
  • messages.hourly.hard.data – when the hard limit on the account has been exceeded.
Monthly notifications

There are three types of monthly bandwidth limit notifications you may receive:

  • messages.monthly.warning.data – when you reach 80% of your pre-paid quota.
  • messages.monthly.soft.data – when your pre-paid quota has been exceeded.
  • messages.monthly.hard.data – when the hard limit on the account has been exceeded.

App limits

App limits relate to the quantity of resources that can be created per account.

Limit Free Self-service Business (legacy) Enterprise
Number of apps (per account) 100
Number of API keys (per account) 100
Number of rules (per account) 100
Number of namespaces (per account) 100

Number of apps

The number of apps is the maximum number of applications that can be created per account.

The number of apps per account is a quantity-based limit. If the limit is reached then no additional apps can be created until some of the existing ones have been deleted. Note that deleting an app will permanently delete its message history and statistics, as well as revoke access to Ably for any API keys associated with it.

Number of API keys

The number of API keys is the maximum number of API keys that can be created per account.

The number of API keys per account is a quantity-based limit. If the limit is reached then no additional API keys can be created until some of the existing ones have been revoked.

Number of rules

The number of rules is the maximum number of integration rules that can be created per account.

The number of rules per account is a quantity-based limit. If the limit is reached then no additional integration rules can be created until some of the existing ones have been deleted.

Number of namespaces

The number of namespaces is the maximum number of namespaces that can be created per account.

The number of namespaces per account is a quantity-based limit. If the limit is reached then no additional namespaces can be defined until some of the existing ones have been removed.

Token request limits

Token request limits relate to the rate and size of token requests made to Ably.

Limit Free Self-service Business (legacy) Enterprise
Soft Hard Soft Hard Soft Hard Soft Hard
Token requests (per hour) 72,000 86,000 0.4 * hard limit Connection rate per hour / 1000 0.2 * hard limit Connection rate per hour / 1000 Custom
Token request rate (per second) 50 0.4 * hourly limit 0.2 * hourly limit Custom
Token request size 128KiB

Token request rate

The token request rate limit is the maximum rate at which token requests can be made. The hourly token request limit is the number of token requests that can be made per hour to Ably.

If the token request rate is exceeded, a global instantaneous rate limit will be applied to token requests. The hourly token request limit is a time-based limit, meaning no additional token requests will be accepted until the following hour if the hard limit is hit.

Token request size

The token request size is the maximum size of a signed token request that will be accepted by the Ably platform.

Token request notifications

Token request limit notifications are split into hourly and request rate notifications.

Hourly notifications

There are three types of hourly token request limit notifications you may receive:

  • tokenRequests.hourly.warning – when you reach 80% of your pre-paid quota.
  • tokenRequests.hourly.soft – when your pre-paid quota has been exceeded.
  • tokenRequests.hourly.hard – when the hard limit on the account has been exceeded.

Token request rate limit notifications

There are two types of token request rate limit notifications you may receive:

  • tokenRequests.maxRate.warning – when you reach 50% of the hard limit on the account.
  • tokenRequests.maxRate.hard – when the hard limit on the account has been exceeded.

Connection limits

Connection limits relate to the realtime connections to Ably from your account.

Limit Free Self-service Business (legacy) Enterprise
Connection rate (per second) 20 50 minimum 50 minimum Custom
Number of channels (per connection) 50 200 200 Custom
Outbound message rate (per second) 15 50 50 Custom
Inbound message rate (per second) 15 50 50 Custom
Connection state TTL 2 minutes

Connection rate

The connection rate limit is the maximum rate at which new realtime connections can be made to Ably.

The limit is calculated based on the hard limit for peak connections, with a minimum value of 50 per second for self-service accounts.

If the connection rate is exceeded, a global instantaneous rate limit will be applied to new connection attempts.

Number of channels

The number of channels per connection are the number of channels each client can be attached to on a realtime connection.

The number of channels per connection is a quantity-based limit. If a connection attempts to exceed the limit on the number of channels it is attached to, the attachment will fail and the error code 90010 will be returned.

Outbound message rate

The outbound message rate limit is the maximum rate at which messages can be published on a realtime connection.

If the outbound message rate is exceeded for a connection, a local instantaneous rate limit will be applied to that connection.

Inbound message rate

The inbound message rate limit is the maximum rate at which messages can be received on a realtime connection.

If the inbound message rate is exceeded for a connection, a local rate limit will be applied to that connection.

Connection state TTL

Connection state time to live (TTL) is the duration that Ably will preserve the state of a dropped connection for. Realtime connections support connection state recovery which allows for dropped connections to be resumed if the connection is reestablished within the period of the connection state TTL.

If a connection is reestablished within the period of the connection state TTL, channel attachments are preserved and any missed messages are replayed to the client. If the reconnection is unsuccessful, or outside the period of the connection state TTL, the connection will move to the suspended state.

Channel limits

Channel limits relate to the number, rate and membership of channels on your account.

Limit Free Self-service Business (legacy) Enterprise
Number of subscribers (per channel) Unlimited
Presence members (per channel) 50 200 200 Custom
Channel creation rate (per second) 20 50 minimum 50 minimum Custom
Message publish rate (per second) 15 50 50 Custom

Number of subscribers

There is no limit on the number of clients that can be subscribed to a channel.

Presence members

The number of clients that can be simultaneously present on a channel is limited. This also ensures the rate of presence messages remains supportable, as it is common for all members on a channel to change state at a similar time.

As an example, consider 200 clients subscribed to presence events on a channel and all of them join and leave the presence set within a few minutes. This would result in the following messages:

  • 200 presence messages published for the enter event.
  • 200 × 200 (40,000) messages subscribed to for the enter events.
  • 200 presence messages published for the leave event.
  • 200 × 200 (40,000) presence messages subscribed to for the leave event.

This highlights the potential for 80,400 messages to be sent in a very short space of time on a single channel.

The number of presence members is a quantity-based limit. Any clients that attempt to join the presence set over the presence member limit will be rejected and the error code 91003 will be returned.

Channel creation rate

The channel creation rate limit is the maximum rate at which channels can be created across your Ably account.

The limit is calculated based on the hard limit for peak channels, with a minimum value of 50 per second for self-service accounts.

If the channel creation rate is exceeded, a global instantaneous rate limit will be applied to channel creation.

Message publish rate

The message rate limit is the maximum rate at which messages can be published for each channel.

If the message publish rate is exceeded for a channel, a local instantaneous rate limit on publishing will be applied to that channel.

Message limits

Message limits relate to the number, rate and bandwidth of messages on your account.

Limit Free Self-service Business (legacy) Enterprise
Message size (KiB) 16 64 64 256
History TTL (hours) 24 72 72 Custom

Message size

The message size is the maximum size of a single published message.

Attempting to publish a message larger than the message size limit will fail and the error code 40009 will be returned.

History TTL

History time to live (TTL) is the maximum time that a message or presence event can be retrieved from history.

Queue limits

Queue limits relate to the number, length and rates of queues.

Limit Free Self-service Business (legacy) Enterprise
Number of queues (per account) 5 50 50 Custom
Queue length (per account) 10,000 50,000 50,000 Custom
Queue publish rate (per second) 100 200 200 Custom
Queue TTL (hours) 1 24 24 Custom

Number of queues

The number of queues is the maximum number of queues that can be created. This limit includes all applications in your account.

Queue length

Queue length is the maximum number of messages that can be stored in queues whilst waiting to be consumed. This value is shared between all queues on the account.

For example, with a free account you could have one queue with a length of 10,000 messages, or two queues each with a length of 5,000 messages.

Queue publish rate

The queue publish rate limit is the maximum rate at which messages can be published to a queue.

If the queue publish rate is exceeded, a global instantaneous rate limit on publishing to queues is applied.

Queue TTL

Queue time to live (TTL) is the time that a message is stored in a queue for. If the message is not consumed before this time then it is transferred to the dead letter queue.

Integration limits

Integration limits relate to the rate of webhooks and the rate of messages streamed through Firehose.

Limit Free Self-service Business (legacy) Enterprise
Webhook batch size 50 100 100 Custom
Webhook batch concurrency 1 1 1 Custom
Firehose external queues message rate (per second) 50 Custom
Function invocation rate (per second) 15 30 30 Custom
Function concurrency 30 60 60 Custom

Webhook batch size

The limit on webhook batch size is the maximum number of webhook events that can be sent per batch.

Webhook batch concurrency

Webhook batch concurrency is the number of webhook batches that can be processed simultaneously.

Firehose external queues message rate

The Firehose external queues message rate limit is the maximum rate at which Firehose messages can be streamed.

If the Firehose message rate is exceeded, a global instantaneous rate limit on publishing to Firehose is applied.

Function invocation rate

The function invocation rate limit is the maximum rate at which functions can be invoked. This includes AWS Lambda, Google Cloud and Azure functions that can be triggered by events.

If the function invocation rate is exceeded, a global instantaneous rate limit on function invocations is applied.

Function concurrency

Function concurrency is the maximum number of functions that can run at the same time. This rate only applies to Google Cloud functions and Azure functions. AWS Lambda functions do not count towards this rate as they use the asynchronous event invocation type.

API request limits

API request limits are the maximum number of REST API requests that can be made to Ably. This excludes token requests.

Limit Free Self-service Business (legacy) Enterprise
Soft Hard Soft Hard Soft Hard Soft Hard
API requests (per hour) 8,000 10,000 0.1 * hourly message soft limit 0.1 * hourly message hard limit 0.1 * hourly message soft limit 0.1 * hourly message hard limit Custom
API request rate (per second) 20 50 minimum 50 minimum Custom

If the API request rate is exceeded, a global instantaneous rate limit will be applied to API requests. The hourly API request limit is a time-based limit, meaning no additional API requests will be accepted until the following hour if the hard limit is hit.

API limit notifications

API requests limit notifications are split into hourly and request rate notifications.

Hourly notifications

There are three types of hourly API requests limit notifications you may receive:

  • apiRequests.hourly.warning – when you reach 80% of your pre-paid quota.
  • apiRequests.hourly.soft – when your pre-paid quota has been exceeded.
  • apiRequests.hourly.hard – when the hard limit on the account has been exceeded.

API requests rate notifications

There are two types of API requests rate limit notifications you may receive:

  • apiRequests.maxRate.warning – when you reach 50% of the hard limit on the account.
  • apiRequests.maxRate.hard – when the hard limit on the account has been exceeded.

Control API limits

Control API limits relate to the number of requests that can be made using the Control API per hour.

Limit Free Self-service Business (legacy) Enterprise
Authenticated account requests (per hour) 4000 4000 4000 4000
Authenticated access token requests (per hour) 2000 2000 2000 2000
Unauthenticated requests (per hour) 60 60 60 60

Authenticated account requests

The authenticated account request limit is the maximum number of requests that can be made using the Control API per hour from authenticated users.

The authenticated account request limit is a time-based limit. If the limit is hit, no more requests can be made until the following hour. Any request attempts that exceed the limit will be rejected with the HTTP status code 429.

Authenticated access token requests

The authenticated access token request limit is the maximum number of requests that can be made for each access token using the Control API per hour.

The authenticated access token request limit is a time-based limit. If the limit is hit, no more requests can be made using that access token until the following hour. Any request attempts that exceed the limit will be rejected with the HTTP status code 429.

Unauthenticated requests

The unauthenticated request limit is the maximum number of requests that can be made using the Control API from an unauthenticated user per hour. This limit is applied per IP address.

The unauthenticated request limit is a time-based limit. If the limit is hit, no more requests can be made until the following hour. Any request attempts that exceed the limit will be rejected with the HTTP status code 401.

Next steps

For further information on limits see:

  • The account dashboard for details of the limits associated with your package.
  • The calculator to see how limits change based on connection, channel and message quotas.

Need help?

If you need any help with your implementation or if you have encountered any problems, do get in touch. You can also quickly find answers from our knowledge base, and blog.