    How to monitor the health of Ably-dependent code

    By: Michał Niegrzybowski

    Introduction

    Hi! I’m Michał Niegrzybowski, and I’m a .NET consultant. You can see my previous article on this blog about developing a realtime full-stack app with .NET, Angular, and MongoDB.


    How often do you face a service that doesn’t work, and you don’t know why? How often is the cause an external dependency, such as a database or a queue mechanism you are using? Of course, you could check each service from time to time, or run some smoke tests against it, but that will not tell you that “the service is not working because the connection to MSSQL failed.”

    In .NET applications we can define health checks, which, as the name says, check the health of our service. There are plenty of out-of-the-box checks, for example for SQL databases, MongoDB, Elasticsearch, and so on. As a contribution to the ably-labs repository, we released the open source Ably.Healthcheck library.

    In this article, we illustrate each of the available health checks and hope to get the answer to the question, “Is Ably alive?”

    What to do when things go wrong

    Let’s imagine you are browsing the internet and suddenly every page fails to load. What do you do? Perhaps you try pressing F5, but nothing happens. So you open the console and run ping www.google.com.

    OK. The replies come back, so it seems that the network is fine and the problem lies in your browser configuration.

    If code that uses Ably doesn’t work, you can use a ping in the same way to check whether the problem is on your side or Ably’s. You can regularly call ping to send a heartbeat to the Ably server; the call returns the elapsed time in seconds once the server echoes the heartbeat request back.

    Send a regular heartbeat ping to the Ably server

    Here’s a code snippet which is the only thing you need to add to your health check configuration located in the ConfigureServices method in the startup file.

    services.AddHealthChecks()
        .AddCheck(
            "AblyPing",
            AblyPingHealthCheck(
                new AblyRealtime("apiKey"),
                TimeSpan.FromSeconds 1. // acceptable delay for a ping
            )
        )

    Note the last argument, which sets the acceptable delay for a ping (here, 1 second).

    It’s important to mention that calling ping() for a health check is free of charge. Ably doesn’t bill you for ping messages, unlike the messages sent by the other two health checks described later in this post.
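
    To make the role of the acceptable delay concrete, here is a minimal sketch of the decision a ping check has to make. It is not the library’s implementation: the PingOutcome type and the classifyPing function are hypothetical names introduced for illustration, and the measured round trip is passed in as a plain value instead of coming from a real Ably connection.

```fsharp
open System

// Hypothetical outcome type, for illustration only.
type PingOutcome =
    | Healthy of TimeSpan   // the echo came back within the acceptable delay
    | TooSlow of TimeSpan   // the echo came back, but later than allowed
    | NoReply               // no echo came back at all

// Classify a measured round trip against the acceptable delay.
// `measured` is None when the heartbeat was never echoed.
let classifyPing (acceptableDelay: TimeSpan) (measured: TimeSpan option) =
    match measured with
    | None -> NoReply
    | Some elapsed when elapsed <= acceptableDelay -> Healthy elapsed
    | Some elapsed -> TooSlow elapsed

// Usage: a 120 ms round trip checked against a 1 second limit.
let acceptable = TimeSpan.FromSeconds(1.)
printfn "%A" (classifyPing acceptable (Some (TimeSpan.FromMilliseconds(120.))))
```

    Keeping the classification separate from the network call makes the threshold logic easy to test without a live connection.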

    The ping works but my messages are not coming through the channel

    Sending a ping is a basic check to confirm that everything is okay with your Ably dependency. If you want to be more certain that the Ably service is healthy, another option is ChannelHealthCheck.

    What does it give you in comparison to PingHealthCheck? It uses a real channel, sends a message, and waits for the message delivery status. So you verify a specific channel and that it is correctly configured.

    This diagram shows how the ChannelHealthCheck works:

    The sequence of ChannelHealthCheck

    This is what you need to add to your ConfigureServices method:

    services.AddHealthChecks()
        .AddCheck(
            "AblyChannel",
            AblyChannelHealthCheck(
                new AblyRealtime("apiKey"),
                "ServiceName",
                "ChannelName"
            )
        )

    In comparison to PingHealthCheck, we need to pass two additional arguments:

    • ServiceName is a postfix to our messages’ topic so we can easily distinguish them from “normal” messages.
    • ChannelName is the name of the channel we want to test. By default we are using the “Healthcheck” channel.
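
    The flow described above (publish a message on a real channel, then inspect its delivery status) can be sketched as follows. This is an illustration under stated assumptions, not the library’s code: the Publish function type stands in for the Ably client, and the "healthcheck-" message name prefix is a made-up naming scheme showing how a service name suffix could be applied.

```fsharp
// Stand-in for the Ably publish call: channel name -> message name -> delivery status.
// A real check would wrap the Ably client here; this signature is an assumption.
type Publish = string -> string -> Async<Result<unit, string>>

// Hypothetical naming scheme: the service name is appended as a suffix,
// so health check messages are easy to tell apart from "normal" ones.
let messageName (serviceName: string) = sprintf "healthcheck-%s" serviceName

// Publish one message and turn its delivery status into a verdict.
let checkChannel (publish: Publish) (serviceName: string) (channelName: string) =
    async {
        try
            let! status = publish channelName (messageName serviceName)
            return
                match status with
                | Ok () -> Ok (sprintf "delivered on %s" channelName)
                | Error reason -> Error (sprintf "delivery failed: %s" reason)
        with ex ->
            // A publish that throws is reported as a failed check, not a crash.
            return Error (sprintf "publish threw: %s" ex.Message)
    }

// Usage with a fake publisher that always acknowledges the message.
let fakePublish : Publish = fun _ _ -> async { return Ok () }

checkChannel fakePublish "ServiceName" "ChannelName"
|> Async.RunSynchronously
|> printfn "%A"
```

    Passing the publisher in as a function keeps the sketch runnable without an API key; in a real check the same shape would be backed by the Ably channel.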

    Ping works, channel works, but sometimes my messages arrive slowly

    The final and most thorough health check is named Timer. It not only checks that the message arrives on a real channel, but also compares the timestamp inside the message with the time when the message was sent. You can set the acceptable interval yourself when configuring this health check, so you will know up front if there is a “lag” in message exchange.

    This diagram shows how TimerHealthCheck works:

    The sequence of TimerHealthCheck

    This is what you need to add to the ConfigureServices method:

    services.AddHealthChecks()
        .AddCheck(
            "AblyTimer",
            AblyTimerHealthCheck(
                new AblyRealtime("apiKey"),
                "ServiceName",
                "ChannelName",
                TimeSpan.FromSeconds 1., // acceptable time difference
                TimeSpan.FromSeconds 1.  // how long to wait for the message
            )
        )

    There are two additional arguments compared to the previous check. The first TimeSpan is the acceptable difference between the time when you send the message and the timestamp carried inside the message. The second describes how long to wait to receive the message via the Ably channel. Remember that the first value should always be less than the second.
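
    The relationship between the two time spans can be sketched as below. This is only an illustration of the rule described above, not the library’s code: TimerVerdict, validate, and evaluate are hypothetical names, and the timestamps are passed in directly instead of being read from a received Ably message.

```fsharp
open System

// Hypothetical verdict type, for illustration only.
type TimerVerdict =
    | Healthy
    | Lagging of TimeSpan   // the message arrived, but its timestamp drifted too far
    | NotReceived           // nothing arrived within the wait window

// Reject configurations where the acceptable difference is not below the wait window.
let validate (acceptableDifference: TimeSpan) (waitFor: TimeSpan) =
    if acceptableDifference < waitFor then Ok (acceptableDifference, waitFor)
    else Error "the acceptable difference should be less than the wait window"

// `sentAt` is when the check published the message; `stampedAt` is the timestamp
// carried inside the received message, or None if nothing arrived in time.
let evaluate (acceptableDifference: TimeSpan) (sentAt: DateTimeOffset) (stampedAt: DateTimeOffset option) =
    match stampedAt with
    | None -> NotReceived
    | Some stamp ->
        let drift = (stamp - sentAt).Duration() // absolute difference
        if drift <= acceptableDifference then Healthy else Lagging drift

// Usage: allow 1 second of difference and wait up to 5 seconds for the message.
let sentAt = DateTimeOffset.UtcNow
match validate (TimeSpan.FromSeconds(1.)) (TimeSpan.FromSeconds(5.)) with
| Ok (difference, _) -> printfn "%A" (evaluate difference sentAt (Some (sentAt.AddMilliseconds(200.))))
| Error reason -> printfn "invalid configuration: %s" reason
```

    If the acceptable difference were as long as the wait window, a late message would be reported as not received before it could be reported as lagging, which is one way to read the advice to keep the first value smaller.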

    Configuration of Ably Healthchecks in .NET WebApp

    As mentioned in the introduction, all code samples shown earlier make use of the new Ably.Healthcheck library. This library helps you to start using Ably health checks with just a few lines of code. Let’s look at a more complete snippet from our sample WebApp written in .NET. We need to place AddHealthChecks in the Startup file.

    ...
    member this.ConfigureServices(services: IServiceCollection) =
        ...
        let ably = new AblyRealtime ("apiKey")
        ...
        services.AddHealthChecks()
            .AddCheck(
                "AblyPing",
                AblyPingHealthCheck(
                    ably,
                    TimeSpan.FromSeconds 1.
                )
            )
            .AddCheck(
                "AblyChannel",
                AblyChannelHealthCheck(
                    ably,
                    "ServiceName",
                    "ChannelName"
                )
            )
            .AddCheck(
                "AblyTimer",
                AblyTimerHealthCheck(
                    ably,
                    "ServiceName",
                    "ChannelName",
                    TimeSpan.FromSeconds 1.,
                    TimeSpan.FromSeconds 1.
                )
            )
        |> ignore
        ...
    ...
    

    Improving the UI with AddHealthChecksUI

    We also want a nice health check view, so we included AddHealthChecksUI in ConfigureServices in our Startup file.

    member this.ConfigureServices(services: IServiceCollection) =
        ...
        services
            .AddHealthChecksUI(fun s ->
                s
                    .SetEvaluationTimeInSeconds(60)
                    .AddHealthCheckEndpoint("Self", $"http://{Dns.GetHostName()}/health")
                |> ignore)
            .AddInMemoryStorage() |> ignore
        ...
    
    member this.Configure(app: IApplicationBuilder, env: IWebHostEnvironment) =
        ...
        ...
        app.UseEndpoints(fun endpoints ->
                endpoints.MapControllers() |> ignore
                endpoints.MapHealthChecksUI(fun setup ->
                    setup.UIPath <- "/ui-health"
                    setup.ApiPath <- "/api-ui-health"
                ) |> ignore
                endpoints.MapHealthChecks(
                    "/health",
                    HealthCheckOptions(
                        Predicate = (fun _ -> true),
                        ResponseWriter = Func<HttpContext, HealthReport, Task>(fun (context) (c: HealthReport) -> UIResponseWriter.WriteHealthCheckUIResponse(context, c))
                    )
                ) |> ignore
            ) |> ignore
        ...
    

    The results

    Here is an example failure result (the Ably Channel check passes, the Timer check fails):

    The Channel test passes but the Timer fails

    And here is an example success result (both the Ably Channel and Timer checks pass):

    Both Channel and Timer checks pass

    I decided to run those tests every 60 seconds instead of the default 10, so as not to produce too much unnecessary traffic in Ably dashboards.


    I hope you enjoyed this article. Thanks for reading!

    Useful resources and further .NET reading