Writing GitHub bots in .NET

Mar 4, 2022 · 5 minute read

For a while now the Octokit libraries for .NET have lagged behind the JavaScript libraries, especially when it comes to webhooks. Unfortunately, I needed a GitHub webhook client for an internal project, so I had to write my own. It wasn’t too much extra effort to open source it, and thus Octokit.Webhooks was born!

I wanted to give a quick example of how to get up and running with Octokit.Webhooks, and what better way than to write a small GitHub bot?

Setup

For this project, I’m going to be using .NET 6.0 and ASP.NET’s new minimal APIs to simplify setup. From a terminal I’m going to create a new web API project:

1dotnet new webapi --output octokit-webhooks-sample

The default template is set up to be a weather forecast API, but I can simplify it a bit more for this sample. My Program.cs looks like this:

1var builder = WebApplication.CreateBuilder(args);
2
3var app = builder.Build();
4
5app.Run();

Next up I’m going to install the Octokit.Webhooks.AspNetCore package:

1dotnet add package Octokit.Webhooks.AspNetCore

This package consumes the Octokit.Webhooks package, which contains core functionality like deserializers and processors, and adds ASP.NET Core specific code, like automatic API endpoint mapping and shared secret verification.

Handling webhooks

Now that I’ve got my project set up, I need to create my own processor to handle incoming webhooks. Octokit.Webhooks ships with an abstract class called WebhookEventProcessor that does all the heavy lifting of deserializing incoming webhooks. All I need to do is to create my own class that inherits from it, and write some logic to act on the webhook events.

 1using Octokit.Webhooks;
 2using Octokit.Webhooks.Events;
 3using Octokit.Webhooks.Events.IssueComment;
 4
 5public sealed class MyWebhookEventProcessor : WebhookEventProcessor
 6{
 7
 8  private readonly ILogger<MyWebhookEventProcessor> logger;
 9
10  public MyWebhookEventProcessor(ILogger<MyWebhookEventProcessor> logger)
11  {
12    this.logger = logger;
13  }
14
15  protected override Task ProcessIssueCommentWebhookAsync(WebhookHeaders headers, IssueCommentEvent issueCommentEvent, IssueCommentAction action)
16  {
17    this.logger.LogInformation(issueCommentEvent.Comment.Body);
18    return Task.CompletedTask;
19  }
20}

I created a small class MyWebhookEventProcessor that inherits from WebhookEventProcessor, and has an override for ProcessIssueCommentWebhookAsync that logs out the comment body. I also get the headers and the action passed to this method, so I could write a switch case and have different handling for created, edited, and deleted actions, but this is enough for now.

I also need to hook up MyWebhookEventProcessor in my startup class.

 1using Octokit.Webhooks;
 2using Octokit.Webhooks.AspNetCore;
 3
 4var builder = WebApplication.CreateBuilder(args);
 5
 6builder.Services.AddSingleton<WebhookEventProcessor, MyWebhookEventProcessor>();
 7
 8var app = builder.Build();
 9
10app.UseRouting();
11
12app.UseEndpoints(endpoints =>
13{
14  endpoints.MapGitHubWebhooks();
15});
16
17app.Run();

This is enough to tell ASP.NET to hook up dependency injection for MyWebhookEventProcessor, enable routing. It will also automatically add a route to handle incoming GitHub webhooks. By default it’s exposed at /api/github/webhooks, but you can use any route you’d like. MapGitHubWebhooks also accepts a shared secret which allows you to verify the content signature of GitHub webhooks.

That’s all the code required on my side. Now I just need to expose my service to the internet, and configure GitHub to start sending me webhooks.

GitHub webhook configuration

For GitHub to be able to send me webhooks, my service needs to be publicly accessible to the internet. I recently discovered a neat little service to do this with nothing more than ssh: localhost.run.

If I run my app with dotnet run then I can find the port that it’s running on:

1info: Microsoft.Hosting.Lifetime[14]
2      Now listening on: http://localhost:5002

And using localhost.run I can create a tunnel for that port:

1$ ssh -R 80:localhost:5002 nokey@localhost.run
2
3...
4
5b49b69845954b1.lhrtunnel.link tunneled with tls termination, https://b49b69845954b1.lhrtunnel.link

Now on GitHub if I visit the settings for a repository, and go to webhooks, I can create a new webhook configuration using that domain name.

Creating a new GitHub Webhook

Now all I need to do is create a comment on an issue in the same repository….

A test issue comment

And it’ll get logged in my terminal!

1info: MyWebhookEventProcessor[0]
2      Test comment

Making it interactive

Logging things to the terminal is great and all, but to make it a bot it should really do something. For that I’ll need the Octokit package:

1dotnet add package Octokit

And I’ll use it in MyWebhookEventProcessor :

 1private readonly GitHubClient client;
 2
 3public MyWebhookEventProcessor(ILogger<MyWebhookEventProcessor> logger)
 4{
 5  this.logger = logger;
 6  this.client = new GitHubClient(new ProductHeaderValue("octokit-webhooks-sample"))
 7  {
 8    Credentials = new Credentials("...")
 9  };
10}

For this example I’m using a personal access token. You can create your own here. If I were deploying this as a production service, I would probably use something a bit more robust, like a GitHub App.

1protected override async Task ProcessIssueCommentWebhookAsync(WebhookHeaders headers, IssueCommentEvent issueCommentEvent, IssueCommentAction action)
2{
3  this.logger.LogInformation(issueCommentEvent.Comment.Body);
4  await this.client.Issue.Comment.Create(
5    repositoryId: issueCommentEvent.Repository.Id,
6    number: (int)issueCommentEvent.Issue.Number,
7    newComment: "Hello, world!"
8  );
9}

I need to do that cast from long to int because Octokit still has an open PR to convert all its IDs to long. I’ve also got to add the async modifier to my method, so I can await the issue comment creation method in Octokit.

Once I’ve done all that, I can create an issue comment on my repository on GitHub and my “bot” will reply!

A reply from the bot

What next?

If you find the library useful or interesting, give it a star on GitHub. And create an issue or pull request if you find find a bug or have a feature request.

If you want to create a fully-fledged bot, check out the GitHub documentation on creating a GitHub app. It’s the next progression from PATs, and allows you to more easily share your bot.

Netlify billing Denial-of-Service

Apr 11, 2021 · 4 minute read

At its heart, Netlify is a platform for hosting static websites. I initially started using it as an alternative to GitHub Pages, and features like Netlify CMS and preview deploys really won me over. However, I recently got bit by preview deploys and the build minutes pricing.

I had a few GitHub repos set up to automatically deploy to Netlify previews from pull requests. I had also configured Renovate to automatically open pull requests for my dependencies. Javascript projects have a lot of dependencies, so there were more than a few pull requests!

Originally Netlify didn’t charge for build minutes, but in October 2019 they introduced a cap of 300 minutes per month on the start plan, with the option to buy packs of 500 build minutes for $7. I’m not criticising them for this change. I understand that they need to make money, and I think $7 for 500 minutes is a reasonable price. However, I don’t think Netlify has put the necessary protections in place for their customers.

SaaS cost attack

A quick Google search for “huge bill” for AWS, Azure, or GCP returns thousands of results. Stories of people who, at best, misconfigured their infrastructure or, at worst, were the target of a DDoS attack. Whatever the cause, the outcome is the same: a massive bill for thousands of dollars.

The big three cloud providers, and most other SaaS companies, offer some sort of protections or hard limits to prevent these sorts of surprises: Azure has Azure Cost Management + Billing, AWS has AWS Budget, and GCP has Cloud Billing budgets. Netlify has no such protections.

To me, this seems like a massive oversight and is a core feature that most users, and all business, would expect to have. Obviously the intention here is for Netlify to be able to charge customers when they exceed their build minutes. This also leaves open a massive opportunity for malicious actors to incur huge costs for Netlify customers with very little effort.

Proof-of-concept

Netlify allows you to configure your site’s build using a netlify.toml stored in the root of your repository. Here’s an excerpt taken from their documentation:

 1# Settings in the [build] context are global and are applied to all contexts
 2# unless otherwise overridden by more specific contexts.
 3[build]
 4  # Directory to change to before starting a build.
 5  # This is where we will look for package.json/.nvmrc/etc.
 6  base = "project/"
 7
 8  # Directory that contains the deploy-ready HTML files and assets generated by
 9  # the build. This is relative to the base directory if one has been set, or the
10  # root directory if a base has not been set. This sample publishes the
11  # directory located at the absolute path "root/project/build-output"
12  publish = "build-output/"
13
14  # Default build command.
15  command = "echo 'default context'"

If I were to replace command with something like sleep 120 and open a pull request that uses Netlify and has not explicitly disabled deploy previews, the Netlify build will happily build my pull request and sleep for two minutes.

A quick search on GitHub for filename:netlify.toml shows there are approximately forty thousand repositories that are potentially vulnerable to this style of attack.

One small mercy is the fact that Netlify limits builds to 15 minutes by default. So, to cost anyone any money, someone would have to open 20 pull requests with command = "sleep 900". At that point, I hope either the repository owner would notice, GitHub would rate-limit you, or it would catch the attention of Netlify.

What next?

I was recently bitten by unexpected charges for build minutes. After contacting Netlify they, very kindly, forgave the charges. However, when I asked about billing protection for customers they replied:

Our policy is to keep your website and builds running as you expect, rather than leaving things in an inconsistent state. I understand it is not what you prefer, but it is what our business prioritizes - expected behavior and continuous uptime for your website and build processes. I have however recorded your feedback for our billing team.

Personally, I have disabled preview deploys on all my sites on Netlify and moved the builds to GitHub Actions. I still use Netlify for NetlifyCMS, but I might look into using NetlifyCMS with a custom OAuth provider in the future.

Tech stack #10YearChallenge

Dec 21, 2020 · 5 minute read

#10YearChallenge has been trending for a while, so I thought it would be fun to do a 10 year challenge for programming and take a look at the technology I used back in 2010.

2010

10 years ago covers my final year in high school, and my first year in university. Both used completely different programming languages and tech stacks, so it’s an interesting place to look back at.

I was running Windows on my personal machine, but the computers in the engineering department at my university were running Linux (SUSE if I recall correctly). It wasn’t my first exposure to Linux, but I was still more comfortable using Windows at this point.

VB.NET

I started learning how to program in my final few years of high school. My computer science teacher started us off with Visual Basic .NET. We were actually the first year group to use this stack. Previously my school used Delpi and Pascal, so it was new to everyone.

For my final year project, I built a system for a hairdresser complete with appointment scheduling, customer database, and inventory management!

A screenshot of my high school project

MATLAB

The first week of university we got thrown into the deep end with a week-long Lego Mindstorms coursework project. There were no real limitations except for your imagination… and your MATLAB skills. In the end, our team built a robot with an automatic gearbox.

A Lego Mindstorms car

Despite MATLAB’s reputation for not being a ‘real’ programming language, I used it a lot throughout all four years at university, including for my Master’s thesis! I’d really recommend MATLAB Cody if you’re looking to improve your MATLAB skills.

C++

Still one of the favourite languages for teaching undergraduates. C++ was used extensively, but one of my proudest pieces of work in C++ is still the logic simulator I wrote for a coursework project.

A screenshot of a logic simulator

I really got to cut my teeth on C++ during my first ever internship, working on an H.265/HEVC video encoder at Cisco. To this day, it was some of the most challenging (in a good way) work I’ve done. Or to use someone else’s words “H.264 is Magic”.

2020

Flash forward to 2020 and I’ve been programming professionally for almost 6 years now. In that time, I’ve used a lot of different languages including Java, Python, and even a year working in X++ (despite my attempts to forget it!).

Even though I work at Microsoft, I’ve been running Arch Linux as my daily driver for over 3 years. Yes, I still need to use Windows in a VM from time to time, but the fact that I can achieve my developer workflow almost entirely from Linux just goes to show that Microsoft ♥ Linux isn’t just an empty platitude.

C#

It’s only in the last year or so that I’ve come back to working on a .NET stack, but already I’ve deployed applications on Azure Functions, ASP.NET Core running in Kubernetes, and most recently Service Fabric. C# is a real breath of fresh air coming from 4 years of Java, and I am really excited to see where the language goes after C# 8 and .NET 5.

TypeScript

If you’re doing front-end work nowadays, I think TypeScript is the best way to do it. It papers over the cracks of JavaScipt, and gives you much more confidence, especially when working in a large codebase. The most common stack I work in now is React + TypeScript, and it is a million times better than the jQuery days.

I’ve also used TypeScript for some back-end work too – most notably for Renovate. The type system really lends itself well to these sorts of back-end tasks, and I wouldn’t discount it over some of the more conventional stacks.

DevOps

Okay, so this one isn’t a programming language, but it’s definitely something that has changed the way I work. In this context, DevOps means a couple of things to me: testing, continuous integration/continuous delivery (CI/CD) and monitoring.

In 2010, testing meant manual testing. I remember for my hairdresser management system I had to document my manual test plan. It was a requirement of the marking scheme. Nowadays, it’s easier to think of testing as a pyramid with unit tests at the base, integration tests and E2E tests in the middle, and a small number of manual tests at the top. Ham Vocke’s The Practical Test Pyramid is the definitive guide for testing in 2020.

CI/CD has been one of my favourite topics lately. Even though the agile manifesto talked about it almost 20 years ago, only recently has the barrier to entry gotten so low. Between Github Actions, Gitlab CI, Travis CI and all the rest it’s a no-brainer. I use GitHub Actions in almost every side project I build.

Monitoring is such an important tool for running a successful service. You can use it to pre-emptively fix problems before they become problems or choose what areas to work on based on customer usage. Like CI/CD it’s become so easy now. For most platforms all you need to do is include an SDK!

2030?

Who knows what 2030 will bring? Maybe Rust will replace C++ everywhere? Maybe AI will have replaced programmers? Maybe Go will finally get generics?

Common async pitfalls—part two

Nov 28, 2020 · 7 minute read

Following on from part one, here’s some more of the most common pitfalls I’ve come across—either myself, colleagues and friends, or examples in documentation—and how to avoid them.

‘Fake’-sync is not async

If the method you are calling is synchronous, even in an async method, then call it like any other synchronous method. If you want to yield the thread, then you should use Task.Yield in most cases. For UI programming, see this note about Task.Yield from the .NET API documentation.

 1public async Task DoStuffAsync(CancellationToken cancellationToken)
 2{
 3    await SomeAsyncMethod(cancellationToken);
 4
 5    // BAD: This provides no benefit and incurs the overhead of a thread hop
 6    await Task.Run(() =>
 7    {
 8        SomeSyncMethod();
 9        return Task.CompletedTask
10    });
11
12    // GOOD: Just keep the synchronous call on the current thread
13    SomeSyncMethod();
14
15    // GOOD: If you have a long-running loop, etc, then periodically yield for maximum thread sharing
16    for (int i = 0; i < 1000; i++)
17    {
18        if (i % 100 == 0)
19        {
20            await Task.Yield();
21        }
22
23        // In some cases you will need to do CPU intensive work. This is ok but
24        // care should be taken to yield at regular intervals
25        AnExpensiveMethod();
26    }
27}

Delegates

Here’s a common pitfall when passing actions as method parameters:

 1// A standard method which accepts a callback. The expectation is that
 2// all methods execute sequentially
 3public void DoSomething(Action callback)
 4{
 5    First();
 6
 7    try
 8    {
 9        callback();
10    }
11    catch (Exception ex)
12    {
13        Trace.WriteException(ex);
14    }
15    Second();
16}
17
18// GOOD: Invoked with an explicit action
19DoSomething(() => Console.WriteLine("Hello"));
20
21// BAD: This delegate is convertible to an async void method, which is allowed to be passed
22// to a method which accepts an Action.
23DoSomething(async () =>
24{
25    await Task.Delay(5000);
26    // UH-OH! This exception will not be observed by the catch handler in DoSomething above
27    // since the caller is not aware that this method is asynchronous
28    throw new Exception("This will not be observed");
29});

The implicit type conversion from the async function to Action is, surprisingly, not a compiler error! This happens because the function doesn’t have a return value, so it’s converted to a method with an async void signature. In this example the side effects aren’t bad, but in a real application this could be terrible as it violates the expected execution contract.

Synchronization

Synchronizing asynchronous code is slightly more complicated than synchronizing synchronous code. Mostly, this is because awaiting a task will result in switching to a different thread. This means that the standard synchronization primitives, which require the same thread to acquire and release a lock, won’t work when used in an async state machine.

Therefore, you must take care to use thread safe synchronization primitives in async methods. For example, using lock, will block the current thread while your code waits to gain exclusive access. In asynchronous code, threads should only block for a short amount of time.

In general, it’s not a good idea to perform any I/O under a lock. There’s usually a much better way to synchronize access in asynchronous programming.

 1private object _lock = new object();
 2
 3// GOOD: Don't do anything expensive inside a lock
 4lock (_lock)
 5{
 6    _cache = new Cache();
 7}
 8
 9// BAD: This is performing unnecessary I/O under a lock
10lock (_lock)
11{
12    _cache = Cache.LoadFromDatabase();
13}

Lazy Initialization

Imagine you need to lazy initialize some object under a lock.

 1private object _lock = new object();
 2private bool _initialized = false;
 3private int _value;
 4
 5// BAD: This method performs blocking I/O under a lock
 6public void Initialize()
 7{
 8    if (_initialized)
 9    {
10        return _value;
11    }
12
13    lock (_lock)
14    {
15        if (!_initialized)
16        {
17            _value = RetrieveData();
18            _initialized = true;
19        }
20    }
21
22    return _value;
23}

When converting RetrieveData to run asynchronously, you might try to rewrite Initialize a few different ways:

 1// BAD: Performs I/O and blocks under a lock
 2lock (_lock)
 3{
 4    if (!_initialized)
 5    {
 6        _value = RetrieveDataAsync().SyncResult();
 7        _initialized = true;
 8    }
 9}
10
11// BAD: Fails at runtime since you cannot change threads under a lock
12lock (_lock)
13{
14    if (!_initialized)
15    {
16        _value = await RetrieveDataAsync();
17        _initialized = true;
18    }
19}

But there are a few issues:

You shouldn’t call external code under a lock. The caller has no idea what work the external code will do, or what assumptions it has made.
You shouldn’t perform I/O under a lock. Code sections under a lock should execute as quickly as possible, to reduce contention with other threads. As soon as you perform I/O under a lock, avoiding contention isn’t possible.

SemaphoreSlim

If you absolutely must perform asynchronous work which limits the number of callers, .NET provides SemaphoreSlim which support asynchronous, non-blocking, waiting.

You still need to take care when converting from a synchronous locking construct. Semaphores, unlike monitor locks, aren’t re-entrant.

 1private readonly SemaphoreSlim m_gate = new SemaphoreSlim(1, 1);
 2
 3public async Task DoThingAsync(CancellationToken cancellationToken)
 4{
 5    // Be careful, semaphores aren't re-entrant!
 6    bool acquired = false;
 7    try
 8    {
 9        acquired = m_gate.WaitAsync(cancellationToken);
10        if (acquired)
11        {
12            // Now that we have entered our mutex it is safe to work on shared data structures
13            await DoMyCriticalWorkAsync(cancellationToken);
14        }
15    }
16    finally
17    {
18        if (acquired)
19        {
20            m_gate.Release();
21        }
22    }
23}

IDisposable

IDisposible is used to finalize acquired resources. In some cases, you need to dispose of these resources asynchronously, to avoid blocking. Unfortunately, you can’t do this inside Dispose().

Thankfully, .NET Core 3.0 provides the new IAsyncDisposible interface, which allows you to handle asynchronous finalization like so:

 1class Foo : IAsyncDisposable
 2{
 3    public async Task DisposeAsync()
 4    {
 5        await SomethingAsync();
 6    }
 7}
 8
 9await using (var foo = new Foo())
10{
11    return await foo.DoSomethingAsync();
12}

IEnumerable and IEnumerator

Usually you would implement IEnumerable or IEnumerator so you can use syntactic sugar, like foreach and LINQ-to-Objects. Unfortunately, these are synchronous interfaces that can only be used on synchronous data sources. If your underlying data source is actually asynchronous, you shouldn’t expose it using these interfaces, as it will lead to blocking.

With the release of .NET Core 3.0 we got the IAsyncEnumerable and IAsyncEnumerator interfaces, which allow you to enumerate asynchronous data sources:

1await foreach (var foo in fooCollection)
2{
3    await foo.DoSomethingAsync();
4}

Prefer the compiler-generated state machine

There are some valid cases for using Task.ContinueWith, but it can introduce some subtle bugs if not used carefully. It’s much easier to avoid it, and just use async and await instead.

 1// BAD: This is using promise-based constructs
 2public Task DoThingAsync()
 3{
 4    return DoMoreThingsAsync().ContinueWith((t) => DoSomethingElse());
 5}
 6
 7// BAD: A much worse version of the above
 8public Task<int> DoThingAsync()
 9{
10    var tcs = new TaskCompletionSource<int>();
11    DoMoreThingsAsync().ContinueWith(t =>
12    {
13        var result = t.Result;
14
15        // UH-OH! The call to .Result is valid, however, it doesn't properly handle exceptions
16        // which can lead to an async call chain which never completes. This is an very difficult
17        // issue to debug as all evidence is garbage collected.
18        tcs.SetResult(result);
19    }
20
21    return tcs.Task;
22}
23
24// GOOD: This is using await
25public async Task DoThingAsync()
26{
27     await DoMoreThingsAsync();
28     DoAnotherThing();
29}

TaskCompletionSource

TaskCompletionSourc<T> allows you to support manual completion in asynchronous code. In general, this class should not be used… but when you have to use it you should be aware of the following behaviour:

 1var waiters = new List<TaskCompletionSource<int>>();
 2
 3// BAD: This uses default continuation options which result in synchronous callbacks
 4Task SomeMethod(CancellationToken cancellationToken)
 5{
 6    var tcs1 = new TaskCompletionSource<int>();
 7    waiters.Add(tcs1);
 8    return tcs1.Task;
 9}
10
11// GOOD: This uses explicit asynchronous continuations
12Task SomeMethod(CancellationToken cancellationToken)
13{
14    var tcs2 = new TaskCompletionSource<int>(TaskCreationOptions.RunContinuationsAsynchronously); 
15    waiters.Add(tcs2);
16    return tcs2.Task;
17}
18
19// The problem comes in when a caller awaits your task. The callee may have an expectation that the callbacks execute
20// quickly, but in the example below this is only true for the implementation which specifies asynchronous continuation.
21// The reason is the caller of TaskCompletionSource<T>.SetResult will be blocked for the entire duration of the loop 
22// below when using the first implementation, while in the second implementation the loop will run on a different thread
23// and the caller may continue execution quickly as expected.
24async Task SomeCaller(CancellationToken cancellationToken)
25{
26    var foo = await SomeMethod();
27    for (int i = 0; i < 10000; i++)
28    {
29        // Do something synchronous and expensive
30    }
31}

Common async pitfalls—part one

Nov 17, 2020 · 8 minute read

The .NET Framework provides a great programming model that enables high performance code using an easy to understand syntax. However, this can often give developers a false sense of security, and the language and runtime aren’t without pitfalls. Ideally static analysers, like the Microsoft.VisualStudio.Threading.Analyzers Roslyn analysers, would catch all these issues at build time. While they do help catch a lot of mistakes, they can’t catch everything, so it’s important to understand the problems and how to avoid them.

Here’s a collection of some of the most common pitfalls I’ve come across—either myself, colleagues and friends, or examples in documentation—and how to avoid them.

Blocking calls

The main benefit of asynchronous programming is that the thread pool can be smaller than a synchronous application while performing the same amount of work. However, once a piece of code begins to block threads, the resulting thread pool starvation can be ugly.

 1public async Task DoStuffAsync(CancellationToken cancellationToken)
 2{
 3    Task<bool> resultTask = StuffAsync(cancellationToken);
 4
 5    // BAD: These may deadlock depending on thread pool capacity or the presence of a sync context
 6    resultTask.Wait();
 7    resultTask.Result;
 8    resultTask.GetAwaiter().GetResult();
 9
10    // BAD: This blocks the thread
11    Thread.Sleep(5000);
12
13    // GOOD: Always await
14    await resultTask;
15
16    // GOOD: This does not block and allows for maximum throughput
17    await Task.Delay(5000, cancellationToken);
18}

If I run a small test, which makes 5000 concurrent HTTP requests to a local server, there are dramatically different results depending on how many blocking calls are used.

% blocking shows the number of calls that use Task.Result, which blocks the thread. All other requests use await.

% Blocking	Threads	Total Duration	Avg. Duration
0	24	00:00:11.961	0.0023923
5	268	00:02:16.574	0.0273148

The increased total duration when using blocking calls is due to the thread pool growth, which happens slowly. You can always tune the thread pool settings to achieve better performance, but it will never match the performance you can achieve with non-blocking calls.

Streams

Like all other blocking calls, any methods from System.IO.Stream should use their async equivalents: Read to ReadAsync, Write to WriteAsync, Flush to FlushAsync, etc. Also, after writing to a stream, you should call the FlushAsync method before disposing the stream. If not, the Dispose method may perform some blocking calls.

CancellationToken

You should always propagate cancellation tokens to the next caller in the chain. This is called a cooperative cancellation model. If not, you can end up with methods that run longer than expected, or even worse, never complete.

To indicate to the caller that cancellation is supported, the final parameter in the method signature should be a CancellationToken object.

 1public async Task DoStuffAsync()
 2{
 3    // BAD: This method does not accept a cancellation token. Cooperative cancellation is extremely important.
 4    await StuffAsync();
 5}
 6
 7public async Task DoStuffAsync(CancellationToken cancellationToken)
 8{
 9    var httpClient = new HttpClient();
10
11    // BAD: The caller has provided a cancellation token for cooperative cancellation but it was not provided
12    // downstream. Regardless of the token being signaled the http client will not return until the
13    // operation completes.
14    await httpClient.GetAsync("http://example.com");
15
16    // GOOD: Always propagate the token to the next caller in the chain
17    await httpClient.GetAsync("http://example.com", cancellationToken);
18}

Linked tokens

If you need to put a timeout on an inner method call, you can link one cancellation token to another. For example, you want to make a service-to-service call, and you want to enforce a timeout, while still respecting the external cancellation.

 1public async Task<int> DoThingAsync(CancellationToken cancellationToken)
 2{
 3    using (var cancelTokenSource = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken))
 4    {
 5        // The cancellation token source will now manage a timer which automatically notifies the parent
 6        // token.
 7        cancelTokenSource.CancelAfter(TimeSpan.FromSeconds(10));
 8
 9        // By linking the token source we will now cooperatively cancel. We have also
10		// provided a time based auto-cancellation signal which only applies to the 
11		// the single method. This is how nested cancellation scopes may be used in circuit breakers.
12        await GetAsync(cancelTokenSource.Token);
13    }
14
15    // Other calls within the method may not need this restricted timeout so the original
16    // cancellation token is used instead.
17    await DoOtherThingAsync(cancellationToken);
18}

Cancelling uncancellable operations

Sometimes you may find the need to call an API which does not accept a cancellation token, but your API receives a token and is expected to respect cancellation. In this case the typical pattern involves managing two tasks and effectively abandoning the un-cancellable operation after the token signals.

 1public async Task<int> DoThingAsync(CancellationToken cancellationToken)
 2{
 3    var tcs = new TaskCompetionSource<int>();
 4    using (var registration = cancellationToken.Register(() => tcs.SetResult(0)))
 5    {
 6         var completedTask = await Task.WhenAny(tcs.Task, DoNonCancellableCallAsync());
 7         if (completedTask == tcs.Task)
 8         {
 9             // We have been cancelled, so take appropriate action here. At this point in time the non-cancellable call is
10             // still running. Once it completes everything will be properly garbage collected
11             throw new OperationCanceledException(cancellationToken);
12         }
13
14        return await completedTask;
15    }
16}

Constructors

Occasionally, you may find yourself wanting to perform asynchronous work during initialization of a class instance. Unfortunately, there is no way to make constructors async.

 1public class Foo
 2{
 3    public Foo(int a, int c)
 4    {
 5        _a = a;
 6        // DON'T DO THIS!
 7        _b = CallSomethingAsync().SyncResult();
 8        _c = c;
 9    }
10
11    private readonly int _a;
12    private readonly string _b;
13    private readonly int _c;
14}

There are a couple of different ways to solve this. Here’s a pattern I like:

A public static creator method, which publicly replaces the constructor
A private async member method, which does the work the constructor used to do
A private constructor, so callers can’t directly instantiate the class by mistake

So, if I apply the same pattern to the class above the class becomes:

 1public class Foo
 2{
 3    public static async Task<Foo> CreateAsync(int a, int c)
 4    {
 5        Foo instance = new Foo(a, c);
 6        await instance.InitializeAsync();
 7        return instance;
 8    }
 9
10    private Foo(int a, int c)
11    {
12        // GOOD: readonly keyword is retained
13        _a = a;
14        _c = c;
15    }
16
17    private async Task InitializeAsync()
18    {
19        // GOOD: all async work is performed here
20        // NOTE: make sure initialization order is correct when porting existing code
21        _b = await CallSomethingAsync();
22    }
23
24    private readonly int _a;
25    private readonly int _c;
26
27    string _b;
28}

And we can instantiate the class by calling var foo = await Foo.CreateAsync(1, 2);.

In cases where the class is part of an inheritance hierarchy, the constructor can be made protected and InitializeAsync can be made protected virtual, so it can be overridden and called from derived classes. Each derived class will need to have its own CreateAsync method.

Parallelism

Avoid premature optimization

It might be very tempting to try to perform parallel work by not immediately awaiting tasks. In some cases, you can make significant performance improvements. However, if not used with care you can end up in debugging hell involving socket or port exhaustion, or database connection pool saturation.

 1public async Task DoTwoThingsAsync(CancellationToken cancellationToken)
 2{
 3    // Depending on what you are doing under the covers, this may cause unintended
 4    // consequences such as exhaustion of outgoing sockets or database connections.
 5    var thing1Task = FirstAsync(cancellationToken);
 6    var thing2Task = SecondAsync(cancellationToken);
 7    await Task.WhenAll(thing1Task, thing2Task);
 8
 9    // Prefer sequential execution of asynchronous calls
10    await FirstAsync(cancellationToken);
11    await SecondAsync(cancellationToken);
12}

Using async everywhere generally pays off without having to make any individual piece of code faster via parallelization. When threads aren’t blocking you can achieve higher performance with the same amount of CPU.

Avoid Task.Factory.StartNew, and use Task.Run only when needed

Even in the cases where not immediately awaiting is safe, you should avoid Task.Factory.StartNew, and only use Task.Run when you need to run some CPU-bound code asynchronously.

The main way Task.Factory.StartNew is dangerous is that it can look like tasks are awaited when they aren’t. For example, if you async-ify the following code:

1Task.WhenAll(Task.Factory.StartNew(Foo), Task.Factory.StartNew(Foo2)).Wait();

be careful because changing the delegate to one that returns Task, Task.Factory.StartNew will now return Task<Task>. Awaiting only the outer task will only wait until the actual task starts, not finishes.

1// BAD: Only waits until all tasks have been scheduled, not until the actual work has completed
2await Task.WhenAll(Task.Factory.StartNew(FooAsync), Task.Factory.StartNew(Foo2Async));

Normally what you want to do, when you know delegates are not CPU-bound, is to just use the delegates themselves. This is almost always the right thing to do.

1// GOOD: Will wait until the tasks have all completed
2await Task.WhenAll(FooAsync(), Foo2Async());

However, if you are certain the delegates are CPU-bound, and you want to offload this to the thread pool, you can use Task.Run. It’s designed to support async delegates. I’d still recommend reading Task.Run Etiquette and Proper Usage for a more thorough explanation.

1
2// Ok if FooAsync is known to be primarily CPU-bound (especially before going async).
3// Will schedule the CPU-bound work like Task.Factory.StartNew does,
4// and automatically convert Task<Task> into a 'proxy' Task that represents the actual work
5await Task.WhenAll(Task.Run(FooAsync), Task.Run(Foo2Async))

If, for some extremely unlikely reason, you really do need to use Task.Factory.StartNew you can use Unwrap() or await await to convert a Task<Task> into a Task that represents the actual work. I’d recommend reading Task.Run vs Task.Factory.StartNew for a deeper dive into the topic.

Null conditionals

Using the null conditional operator with awaitables can be dangerous. Awaiting null throws a NullReferenceException.

1// BAD: Will throw NullReferenceException if foo is null
2await foo?.DoStuffAsync()

Instead, you must do a manual check first.

1// GOOD
2if (foo != null)
3{
4    await foo.DoStuffAsync();
5}

A Null-conditional await is currently under consideration for future versions of C#, but until then you’re stuck with manually checking.

Newer Older

Jamie Magee

Writing GitHub bots in .NET

Setup

Handling webhooks

GitHub webhook configuration

Making it interactive

What next?

Netlify billing Denial-of-Service

SaaS cost attack

Proof-of-concept

What next?

Tech stack #10YearChallenge

2010

VB.NET

MATLAB

C++

2020

C#

TypeScript

DevOps

2030?

Common async pitfalls—part two

‘Fake’-sync is not async

Delegates

Synchronization

Lazy Initialization

SemaphoreSlim

IDisposable

IEnumerable and IEnumerator

Prefer the compiler-generated state machine

TaskCompletionSource

Common async pitfalls—part one

Blocking calls

Streams

CancellationToken

Linked tokens

Cancelling uncancellable operations

Constructors

Parallelism

Avoid premature optimization

Avoid Task.Factory.StartNew, and use Task.Run only when needed

Null conditionals