Limits

Certain limits are applied in order to prevent any impact on service caused by accidental or deliberate abuse. Limits vary depending on the package associated with your account. You can view the limits for your package in the account dashboard.

PAYG accounts have high default limits, but far higher limits can be set if needed. All Committed Use package limits are customizable.

Contact us if your application requires limits beyond what’s currently set for your package.

Limits are time-based, quantity-based, or instantaneous rate limits. The effect of hitting or exceeding a limit varies depending on its type.

Limits are also categorized as local or global depending on whether they relate to a single resource, or the aggregate of all resources of a given type:

  • Local limits relate to a single resource, such as an individual channel or connection. Examples of local limits include the maximum number of channels per connection and the maximum publish rate on a channel. Email notifications are not sent for local limits.
  • Global limits relate to all resources associated with the account, such as peak connections. Global limits are an aggregate of usage across all applications in your account. Email notification warnings will be sent when a global limit is close to being reached, or has been reached.

Certain limits have both a soft and a hard limit:

  • Soft limits are the lower threshold of a limit. For quota limits, the soft limit is equivalent to the actual quota. No restrictions are enforced when a soft limit is reached. For quota limits, however, any usage above the soft limit is charged as overages.
  • Hard limits are the point at which restrictions are enforced. They are always higher than the equivalent soft limit.

All notifications regarding limit warnings, rate limiting and exceeded limits are logged. Current and historic notifications can be viewed in your account. This includes details such as how much a limit was exceeded by and the expiry date of any blocks.

Instantaneous rate limits relate to the frequency of a given operation at a moment in time and are expressed as a number of operations per second. How local instantaneous rate limits and global instantaneous rate limits are applied differs.

Local instantaneous rate limits reject operations in excess of their hard limit and return an error code. For example, the default limit for the message publish rate on an individual channel is 50 messages per second (higher limits are available). Any message publish attempts in excess of the 50 messages per second will be rejected and an error code will be returned to the publisher.

Global instantaneous rate limits apply rate suppression to operations that exceed their hard limits. Rate suppression is calculated on a rolling probabilistic basis. For example, the default limit on a PAYG package for publishing messages into a queue is 200 messages per second. If a queue rule is attempting to publish 400 messages per second into the queue, each message will have a 50% chance of being rejected. The suppression probability is continuously updated based on the publishing rate.

Global instantaneous rate limits only have a hard limit. Suppression begins once the rate exceeds the hard limit. As soon as the rate drops back below the hard limit, the suppression probability decreases to zero.
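The relationship between rate and suppression probability in the queue example above can be sketched as follows. This is a minimal illustration that assumes the probability is simply the share of traffic above the hard limit, which reproduces the 400-versus-200 example. The function name is hypothetical, and the rolling calculation that is actually applied is not described here.

```python
def suppression_probability(current_rate: float, hard_limit: float) -> float:
    """Estimate the chance that a single operation is rejected.

    Assumption: the probability equals the fraction of traffic that is
    above the hard limit (for example 400/s against a limit of 200/s
    gives 0.5).
    """
    if hard_limit <= 0:
        raise ValueError("hard_limit must be positive")
    if current_rate <= hard_limit:
        return 0.0  # at or below the hard limit nothing is suppressed
    return 1.0 - hard_limit / current_rate


print(suppression_probability(400, 200))  # 0.5
print(suppression_probability(150, 200))  # 0.0
```

Under this assumption, a publisher that halves its rate from 400 to 200 messages per second would see the estimated rejection chance fall from 50% to zero.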

Time-based limits relate to the number of operations that can be made within an hour, or within a month. If the hard limit for a time-based limit is hit, then the resource is blocked for the remainder of that period.

For example, if the hourly limit for the number of API requests is hit, no more API requests can be made for the remainder of that hour. Hourly limits are based on a clock hour, so regardless of whether the limit is hit at 11:01 or 11:59, it will reset at 12:00.
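The clock-hour reset can be sketched as a small helper. This is only an illustration: the function name is hypothetical, and it assumes the timestamp passed in is expressed in the same time zone that the limit uses, which is not stated here.

```python
from datetime import datetime, timedelta


def next_hourly_reset(now: datetime) -> datetime:
    """Return the start of the next clock hour, when an hourly limit resets."""
    start_of_hour = now.replace(minute=0, second=0, microsecond=0)
    return start_of_hour + timedelta(hours=1)


# Both times fall within the 11:00 hour, so both reset at 12:00.
print(next_hourly_reset(datetime(2024, 1, 15, 11, 1)))   # 2024-01-15 12:00:00
print(next_hourly_reset(datetime(2024, 1, 15, 11, 59)))  # 2024-01-15 12:00:00
```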

Monthly limits, such as the total count of messages, reset the next month.

Quantity-based limits specify a set amount of resources or operations that can be in use simultaneously. Usage above the hard limit is restricted. The restrictions are only removed when a resource or operation is reduced below the hard limit, or the quota for the limit is increased.

An example of a quantity-based limit is the one for peak connections. When the hard limit is reached then no more connections can be made to Ably until some of the existing ones disconnect and the total is back below the hard limit.

Quotas are based on the account package that you are on. Quotas specify the number of peak channels, peak connections, messages and bandwidth that you expect to consume each month. Quotas for free accounts are fixed. Committed Use account quotas are set in a way that ensures that limits are not reached during normal operations for an account.

The quotas for an account can be considered a soft limit, with no restrictions coming into effect until the hard limit is reached. Free accounts have a small buffer between the soft and hard limits.

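| Limit | Free (soft) | Free (hard) | PAYG (soft) | PAYG (hard) | Committed Use |
| --- | --- | --- | --- | --- | --- |
| Peak connections | 200 | 240 | 100,000 | 250,000 | Custom >= PAYG |
| Peak channels | 200 | 240 | 100,000 | 250,000 | Custom >= PAYG |
| Total monthly messages | 6,000,000 | 7,200,000 | No limit | No limit | No limit |
| Messages (per hour) | 84,000 | 100,000 | 1,400,000 | 3,500,000 | Custom >= PAYG |
| Message rate (per second) | - | 70 | - | 2,400 | Custom >= PAYG |
| Total bandwidth (GiB) | 11.5 | 13.8 | 190 | 475 | Custom >= PAYG |
| Bandwidth (per hour) | 160 MiB | 190 MiB | 2.6 GiB | 6.6 GiB | Custom >= PAYG |
| Bandwidth rate (per second in KiB) | - | 132 | - | 4,600 | Custom >= PAYG |

Rate limits only have a hard limit, so their soft columns are marked with a dash.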

Limits on any paid account can be increased – contact us to request a higher limit.

Peak connections are the maximum number of realtime clients connected to Ably simultaneously at any point within a month.

For example, if you have 10,000 customers and at the busiest time of the month 500 connect to Ably at the same time, then your peak connections figure is 500.
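To make the definition concrete, the sketch below derives a peak figure from a list of connect and disconnect events. The event format and function name are hypothetical and chosen only for illustration; this is not how the figure is reported in your account.

```python
def peak_concurrent(events):
    """Return the highest number of simultaneous connections.

    `events` is an iterable of (timestamp, delta) pairs, where delta is
    +1 for a connect and -1 for a disconnect (a hypothetical log format).
    """
    current = 0
    peak = 0
    # Sorting tuples orders by timestamp first; at equal timestamps a
    # disconnect (-1) is processed before a connect (+1).
    for _, delta in sorted(events):
        current += delta
        peak = max(peak, current)
    return peak


events = [(0, +1), (1, +1), (2, -1), (3, +1), (4, +1)]
print(peak_concurrent(events))  # 3
```

Only the highest simultaneous count matters: the total number of clients that connected over the month does not affect the result.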

The peak connections limit is a quantity-based limit. If the hard limit is reached, no more connections can be made to Ably until some of the existing ones disconnect.

There are three types of peak connection limit notifications you may receive:

  • connections.warning – when you reach 80% of your pre-paid quota.
  • connections.soft – when your pre-paid quota has been exceeded.
  • connections.hard – when the hard limit on the account has been exceeded.
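The three thresholds above can be sketched as a single check. The function and parameter names are hypothetical; the sketch assumes that "reach" means greater than or equal to, that "exceeded" means strictly greater than, and that only the most severe applicable notification is of interest.

```python
from typing import Optional


def quota_notification(prefix: str, usage: int, quota: int, hard_limit: int) -> Optional[str]:
    """Return the most severe notification name that applies, or None."""
    if usage > hard_limit:
        return f"{prefix}.hard"     # hard limit exceeded
    if usage > quota:
        return f"{prefix}.soft"     # pre-paid quota exceeded
    if usage * 10 >= quota * 8:     # 80% of the quota, using integer arithmetic
        return f"{prefix}.warning"
    return None


# Free package figures for peak connections: quota 200, hard limit 240.
print(quota_notification("connections", 150, 200, 240))  # None
print(quota_notification("connections", 170, 200, 240))  # connections.warning
print(quota_notification("connections", 210, 200, 240))  # connections.soft
print(quota_notification("connections", 241, 200, 240))  # connections.hard
```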

Peak channels are the maximum number of channels that are active simultaneously at any point within a month.

For example, if you have 10,000 customers and at the busiest time of the month 500 users each attach to their own individual channel as well as a single channel they are all members of, then your peak channel figure is 501. Alternatively, if at the busiest time of the month 2,000 users are evenly split between 50 channels, with 40 users in each, then your peak channel figure is 50.

A channel is considered active when a message is published on the channel using the REST API, or a realtime client attaches to it. The channel will remain active for as long as a client is attached to it, unless all clients explicitly detach from the channel or close their connections.

A channel will automatically close when there are no more realtime clients attached to it and approximately one minute has passed since the last client detached and since the last message was published to the channel.

The peak channels limit is a quantity-based limit. If the hard limit is reached, additional channels cannot be created until some of the existing active ones are detached.

There are three types of peak channel limit notifications you may receive:

  • channels.warning – when you reach 80% of your pre-paid quota.
  • channels.soft – when your pre-paid quota has been exceeded.
  • channels.hard – when the hard limit on the account has been exceeded.

Message counts and rates are the number of messages published and received in your account. In paid accounts, the only message limits are on the number of messages that can be sent per hour and per second.

If the message rate is exceeded, a global instantaneous rate limit will be applied to message delivery and publishing. The hourly message limits are time-based. If the hard limit is hit for those, message delivery and publishing will be blocked until the following hour.

Message limit notifications are split into hourly, monthly and message rate notifications.

There are three types of hourly message limit notifications you may receive:

  • messages.hourly.warning.count – when you reach 80% of your pre-paid quota.
  • messages.hourly.soft.count – when the soft limit on the account, derived from the pre-paid quota, has been exceeded.
  • messages.hourly.hard.count – when the hard limit on the account has been exceeded.

There are three types of monthly message limit notifications you may receive:

  • messages.monthly.warning.count – when you reach 80% of your pre-paid quota.
  • messages.monthly.soft.count – when your pre-paid quota has been exceeded.
  • messages.monthly.hard.count – when the hard limit on the account has been exceeded.

There are two types of message rate limit notifications you may receive:

  • messages.maxRate.warning – when you reach 50% of the hard limit on the account.
  • messages.maxRate.hard – when the hard limit on the account has been exceeded.

Bandwidth is the amount of data transferred through messages.

  • The total bandwidth is calculated as the average message size of 2 KiB multiplied by the total message count.
  • The hourly bandwidth limit is calculated as the total bandwidth limit divided by 72.
  • The bandwidth rate is calculated as the hourly bandwidth limit multiplied by 2.5, then converted to a per-second rate.
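The three calculations above can be worked through in a short sketch. The function name is hypothetical, and the per-second step assumes that the conversion means dividing the hourly figure by 3,600 seconds. The results land close to, but not exactly on, the published figures, which appear to be rounded.

```python
AVERAGE_MESSAGE_KIB = 2   # average message size given in the list above
KIB_PER_MIB = 1024
KIB_PER_GIB = 1024 ** 2
SECONDS_PER_HOUR = 3600   # assumption: per-second conversion divides by this


def derive_bandwidth_limits(total_messages: int) -> dict:
    """Derive total, hourly and per-second bandwidth from a message count."""
    total_kib = total_messages * AVERAGE_MESSAGE_KIB
    hourly_kib = total_kib / 72
    rate_kib = hourly_kib * 2.5 / SECONDS_PER_HOUR
    return {
        "total_gib": total_kib / KIB_PER_GIB,
        "hourly_mib": hourly_kib / KIB_PER_MIB,
        "rate_kib_per_second": rate_kib,
    }


# Free package soft limit of 6,000,000 monthly messages.
for name, value in derive_bandwidth_limits(6_000_000).items():
    print(f"{name}: {value:.2f}")
```

For 6,000,000 messages this gives roughly 11.4 GiB in total and about 163 MiB per hour, in line with the 11.5 GiB and 160 MiB soft limits listed for the free package.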

If the bandwidth rate is exceeded, a global instantaneous rate limit will be applied to message delivery and publishing. The hourly and total bandwidth limits are time-based. If the hard limit is hit for those, message delivery and publishing will be blocked until the following hour or month.

Bandwidth limit notifications are split into hourly and monthly notifications.

There are three types of hourly bandwidth limit notifications you may receive:

  • messages.hourly.warning.data – when you reach 80% of your pre-paid quota.
  • messages.hourly.soft.data – when the soft limit on the account, derived from the pre-paid quota, has been exceeded.
  • messages.hourly.hard.data – when the hard limit on the account has been exceeded.

There are three types of monthly bandwidth limit notifications you may receive:

  • messages.monthly.warning.data – when you reach 80% of your pre-paid quota.
  • messages.monthly.soft.data – when your pre-paid quota has been exceeded.
  • messages.monthly.hard.data – when the hard limit on the account has been exceeded.

Application limits relate to the quantity of resources that can be created per account.

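| Limit | Free | PAYG | Committed Use |
| --- | --- | --- | --- |
| Number of apps (per account) | 100 | 100 | 100 |
| Number of API keys (per account) | 100 | 100 | 100 |
| Number of rules (per account) | 100 | 100 | 100 |
| Number of namespaces (per account) | 100 | 100 | 100 |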

The number of apps is the maximum number of applications that can be created per account.

The number of apps per account is a quantity-based limit. If the limit is reached then no additional apps can be created until some of the existing ones have been deleted. Note that deleting an app will permanently delete its message history and statistics, as well as revoke access to Ably for any API keys associated with it.

The number of API keys is the maximum number of API keys that can be created per account.

The number of API keys per account is a quantity-based limit. If the limit is reached then no additional API keys can be created until some of the existing ones have been revoked.

The number of rules is the maximum number of integration rules that can be created per account.

The number of rules per account is a quantity-based limit. If the limit is reached then no additional integration rules can be created until some of the existing ones have been deleted.

The number of namespaces is the maximum number of namespaces that can be created per account.

The number of namespaces per account is a quantity-based limit. If the limit is reached then no additional namespaces can be defined until some of the existing ones have been removed.

Token request limits relate to the rate and size of token requests made to Ably.

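| Limit | Free (soft) | Free (hard) | PAYG (soft) | PAYG (hard) | Committed Use |
| --- | --- | --- | --- | --- | --- |
| Token requests (per hour) | 72,000 | 86,000 | 120,000 | 300,000 | Custom >= PAYG |
| Token request rate (per second) | - | 50 | - | 208 | Custom >= PAYG |

The token request size limit is 128 KiB on all packages.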

The token request rate limit is the maximum rate at which token requests can be made. The hourly token request limit is the number of token requests that can be made per hour to Ably.

If the token request rate is exceeded, a global instantaneous rate limit will be applied to token requests. The hourly token request limit is a time-based limit, meaning no additional token requests will be accepted until the following hour if the hard limit is hit.

The token request size is the maximum size of a signed token request that will be accepted by the Ably platform.

Token request limit notifications are split into hourly and request rate notifications.

There are three types of hourly token request limit notifications you may receive:

  • tokenRequests.hourly.warning – when you reach 80% of your pre-paid quota.
  • tokenRequests.hourly.soft – when your pre-paid quota has been exceeded.
  • tokenRequests.hourly.hard – when the hard limit on the account has been exceeded.

There are two types of token request rate limit notifications you may receive:

  • tokenRequests.maxRate.warning – when you reach 50% of the hard limit on the account.
  • tokenRequests.maxRate.hard – when the hard limit on the account has been exceeded.

Connection limits relate to the realtime connections to Ably from your account.

| Limit | Free | PAYG | Business (legacy) | Committed Use |
| --- | --- | --- | --- | --- |
| Connection rate (per second) | 20 | 2,000 | 50 minimum | Custom |
| Number of channels (per connection) | 50 | 200 | 200 | Custom |
| Outbound message rate (per second) | 15 | 50 | 50 | Custom |
| Inbound message rate (per second) | 15 | 50 | 50 | Custom |
| Connection state TTL | 2 minutes | 2 minutes | 2 minutes | 2 minutes |

The connection rate limit is the maximum rate at which new realtime connections can be made to Ably.

The limit is calculated based on the hard limit for peak connections.

If the connection rate is exceeded, a global instantaneous rate limit will be applied to new connection attempts.

The number of channels per connection is the number of channels each client can be attached to on a realtime connection.

The number of channels per connection is a quantity-based limit. If a connection attempts to exceed the limit on the number of channels it is attached to, the attachment will fail and the error code 90010 will be returned.

The outbound message rate limit is the maximum rate at which messages can be received on a realtime connection.

If the outbound message rate is exceeded for a connection, a local instantaneous rate limit will be applied to that connection.

The inbound message rate limit is the maximum rate at which messages can be published on a realtime connection.

If the inbound message rate is exceeded for a connection, a local instantaneous rate limit will be applied to that connection.

Connection state time to live (TTL) is the duration that Ably will preserve the state of a dropped connection for. Realtime connections support connection state recovery which allows for dropped connections to be resumed if the connection is reestablished within the period of the connection state TTL.

If a connection is reestablished within the period of the connection state TTL, channel attachments are preserved and any missed messages are replayed to the client. If the reconnection is unsuccessful, or outside the period of the connection state TTL, the connection will move to the suspended state.
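The resume decision described above can be sketched as a simple time comparison. The names are hypothetical, the 2 minute value is the connection state TTL listed for these limits, and the sketch assumes that reconnecting exactly at the TTL boundary still counts as within the period.

```python
from datetime import datetime, timedelta

# Connection state TTL as listed in the connection limits.
CONNECTION_STATE_TTL = timedelta(minutes=2)


def can_resume(dropped_at: datetime, reconnected_at: datetime) -> bool:
    """True if the reconnection falls within the connection state TTL."""
    return reconnected_at - dropped_at <= CONNECTION_STATE_TTL


dropped = datetime(2024, 1, 15, 9, 0, 0)
print(can_resume(dropped, dropped + timedelta(seconds=90)))  # True
print(can_resume(dropped, dropped + timedelta(minutes=5)))   # False
```

A result of False corresponds to the case where the connection moves to the suspended state instead of being resumed.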

Channel limits relate to the number, rate and membership of channels on your account.

| Limit | Free | PAYG | Business (legacy) | Committed Use |
| --- | --- | --- | --- | --- |
| Number of subscribers (per channel) | Unlimited | Unlimited | Unlimited | Unlimited |
| Presence members (per channel) | 50 | 200 | 200 | Custom |
| Channel creation rate (per second) | 20 | 2,000 | 50 minimum | Custom |
| Message publish rate (per second) | 15 | 50 | 50 | Custom |

There is no limit on the number of clients that can be subscribed to a channel.

The number of clients that can be simultaneously present on a channel is limited. This also ensures the rate of presence messages remains supportable, as it is common for all members on a channel to change state at a similar time.

As an example, consider 200 clients that are all subscribed to presence events on a channel, and that all join and leave the presence set within a few minutes. This would result in the following messages:

  • 200 presence messages published for the enter events.
  • 200 × 200 (40,000) presence messages delivered to subscribers for the enter events.
  • 200 presence messages published for the leave events.
  • 200 × 200 (40,000) presence messages delivered to subscribers for the leave events.

This highlights the potential for 80,400 messages to be sent in a very short space of time on a single channel.
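The arithmetic behind this figure can be reproduced with a short sketch. The function name is hypothetical, and it assumes, as in the example, that every member is also subscribed to presence events and that each member enters and leaves exactly once.

```python
def presence_message_count(members: int) -> int:
    """Total presence messages when every member enters and then leaves."""
    published_per_event = members            # one message published per member
    delivered_per_event = members * members  # each message reaches every member
    event_types = 2                          # enter and leave
    return event_types * (published_per_event + delivered_per_event)


print(presence_message_count(200))  # 80400
print(presence_message_count(50))   # 5100
```

Because the delivered count grows with the square of the membership, reducing the presence set from 200 to 50 members cuts the total from 80,400 to 5,100 messages.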

The number of presence members is a quantity-based limit. Any clients that attempt to join the presence set over the presence member limit will be rejected and the error code 91003 will be returned.

The channel creation rate limit is the maximum rate at which channels can be created across your Ably account.

The limit is calculated based on the hard limit for peak channels.

If the channel creation rate is exceeded, a global instantaneous rate limit will be applied to channel creation.

The message publish rate limit is the maximum rate at which messages can be published on each channel.

If the message publish rate is exceeded for a channel, a local instantaneous rate limit on publishing will be applied to that channel.

Message limits relate to the number, rate and bandwidth of messages on your account.

| Limit | Free | PAYG | Business (legacy) | Committed Use |
| --- | --- | --- | --- | --- |
| Message size (KiB) | 16 | 64 | 64 | 256 |
| History TTL (hours) | 24 | 72 | 72 | Custom |

The message size is the maximum size of a single published message.

Attempting to publish a message larger than the message size limit will fail and the error code 40009 will be returned.
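Several limits in this document are reported through specific error codes. The lookup below gathers them in one place as a convenience. The descriptions are paraphrased from this document rather than taken from an official error catalogue, and the function name is hypothetical.

```python
# Error codes named in this document, with paraphrased descriptions.
LIMIT_ERROR_CODES = {
    90010: "channels per connection limit exceeded",
    91003: "presence member limit exceeded",
    40009: "message size limit exceeded",
}


def describe_limit_error(code: int) -> str:
    """Return a short description for a limit-related error code."""
    return LIMIT_ERROR_CODES.get(code, "not a limit error listed here")


print(describe_limit_error(40009))  # message size limit exceeded
print(describe_limit_error(12345))  # not a limit error listed here
```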

History time to live (TTL) is the maximum time that a message or presence event can be retrieved from history.

Queue limits relate to the number, length and rates of queues.

| Limit | Free | PAYG | Business (legacy) | Committed Use |
| --- | --- | --- | --- | --- |
| Number of queues (per account) | 5 | 50 | 50 | Custom |
| Queue length (per account) | 10,000 | 50,000 | 50,000 | Custom |
| Queue publish rate (per second) | 100 | 200 | 200 | Custom |
| Queue TTL (hours) | 1 | 24 | 24 | Custom |

The number of queues is the maximum number of queues that can be created. This limit includes all applications in your account.

Queue length is the maximum number of messages that can be stored in queues whilst waiting to be consumed. This value is shared between all queues on the account.

For example, with a free account you could have one queue with a length of 10,000 messages, or two queues each with a length of 5,000 messages.

The queue publish rate limit is the maximum rate at which messages can be published to a queue.

If the queue publish rate is exceeded, a global instantaneous rate limit on publishing to queues is applied.

Queue time to live (TTL) is the time that a message is stored in a queue for. If the message is not consumed before this time then it is transferred to the dead letter queue.

Integration limits relate to the rate of webhooks and the rate of messages streamed through Firehose.

| Limit | Free | PAYG | Business (legacy) | Committed Use |
| --- | --- | --- | --- | --- |
| Webhook batch size | 50 | 100 | 100 | Custom |
| Webhook batch concurrency | 1 | 1 | 1 | Custom |
| Firehose external queues message rate (per second) | | 50 | 50 | Custom |
| Function invocation rate (per second) | 15 | 30 | 30 | Custom |
| Function concurrency | 30 | 60 | 60 | Custom |

The limit on webhook batch size is the maximum number of webhook events that can be sent per batch.

Webhook batch concurrency is the number of webhook batches that can be processed simultaneously.

The Firehose external queues message rate limit is the maximum rate at which Firehose messages can be streamed.

If the Firehose message rate is exceeded, a global instantaneous rate limit on publishing to Firehose is applied.

The function invocation rate limit is the maximum rate at which functions can be invoked. This includes AWS Lambda, Google Cloud and Azure functions that can be triggered by events.

If the function invocation rate is exceeded, a global instantaneous rate limit on function invocations is applied.

Function concurrency is the maximum number of functions that can run at the same time. This limit only applies to Google Cloud functions and Azure functions. AWS Lambda functions do not count towards it as they use the asynchronous event invocation type.

API request limits are the maximum number of REST API requests that can be made to Ably. This excludes token requests.

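| Limit | Free (soft) | Free (hard) | PAYG (soft) | PAYG (hard) | Business (legacy) (soft) | Business (legacy) (hard) | Committed Use |
| --- | --- | --- | --- | --- | --- | --- | --- |
| API requests (per hour) | 8,000 | 10,000 | 139,000 | 350,000 | 0.1 * hourly message soft limit | 0.1 * hourly message hard limit | Custom |
| API request rate (per second) | - | 20 | - | 240 | - | 50 minimum | Custom |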

If the API request rate is exceeded, a global instantaneous rate limit will be applied to API requests. The hourly API request limit is a time-based limit, meaning no additional API requests will be accepted until the following hour if the hard limit is hit.

API request limit notifications are split into hourly and request rate notifications.

There are three types of hourly API request limit notifications you may receive:

  • apiRequests.hourly.warning – when you reach 80% of your pre-paid quota.
  • apiRequests.hourly.soft – when your pre-paid quota has been exceeded.
  • apiRequests.hourly.hard – when the hard limit on the account has been exceeded.

There are two types of API request rate limit notifications you may receive:

  • apiRequests.maxRate.warning – when you reach 50% of the hard limit on the account.
  • apiRequests.maxRate.hard – when the hard limit on the account has been exceeded.

Control API limits relate to the number of requests that can be made using the Control API per hour.

| Limit | Free | PAYG | Business (legacy) | Committed Use |
| --- | --- | --- | --- | --- |
| Authenticated account requests (per hour) | 4,000 | 4,000 | 4,000 | 4,000 |
| Authenticated access token requests (per hour) | 2,000 | 2,000 | 2,000 | 2,000 |
| Unauthenticated requests (per hour) | 60 | 60 | 60 | 60 |

The authenticated account request limit is the maximum number of requests that can be made using the Control API per hour from authenticated users.

The authenticated account request limit is a time-based limit. If the limit is hit, no more requests can be made until the following hour. Any request attempts that exceed the limit will be rejected with the HTTP status code 429.

The authenticated access token request limit is the maximum number of requests that can be made for each access token using the Control API per hour.

The authenticated access token request limit is a time-based limit. If the limit is hit, no more requests can be made using that access token until the following hour. Any request attempts that exceed the limit will be rejected with the HTTP status code 429.

The unauthenticated request limit is the maximum number of requests that can be made using the Control API from an unauthenticated user per hour. This limit is applied per IP address.

The unauthenticated request limit is a time-based limit. If the limit is hit, no more requests can be made until the following hour. Any request attempts that exceed the limit will be rejected with the HTTP status code 401.
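A client can avoid running into these rejections by tracking its own usage. The sketch below is a hypothetical client-side counter, not part of any Ably SDK. It assumes the Control API limits reset on the clock hour, in the same way as the other hourly limits described earlier, and that the caller supplies timestamps in a consistent time zone.

```python
from datetime import datetime


class HourlyBudget:
    """Client-side counter for a limit that resets every clock hour."""

    def __init__(self, limit: int):
        self.limit = limit
        self._hour = None  # start of the clock hour currently being counted
        self._used = 0

    def try_acquire(self, now: datetime) -> bool:
        """Record one request if the budget allows it; return whether it may be sent."""
        hour = now.replace(minute=0, second=0, microsecond=0)
        if hour != self._hour:
            # A new clock hour has started, so the counter resets.
            self._hour = hour
            self._used = 0
        if self._used >= self.limit:
            return False
        self._used += 1
        return True


budget = HourlyBudget(limit=60)  # unauthenticated requests per hour
print(budget.try_acquire(datetime(2024, 1, 15, 11, 30)))  # True
```

A separate instance could be kept per access token with a limit of 2,000, alongside one for the account with a limit of 4,000, mirroring the figures listed for the Control API.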

For further information on limits see:

  • The account dashboard for details of the limits associated with your package.
  • The calculator to see how limits change based on connection, channel and message quotas.