Netlify billing Denial-of-Service

At its heart, Netlify is a platform for hosting static websites. I initially started using it as an alternative to GitHub Pages, and features like Netlify CMS and preview deploys really won me over. However, I recently got bit by preview deploys and the build minutes pricing.

I had a few GitHub repos set up to automatically deploy to Netlify previews from pull requests. I had also configured Renovate to automatically open pull requests for my dependencies. JavaScript projects have a lot of dependencies, so there were more than a few pull requests!

Originally Netlify didn’t charge for build minutes, but in October 2019 they introduced a cap of 300 minutes per month on the Starter plan, with the option to buy packs of 500 build minutes for $7. I’m not criticising them for this change. I understand that they need to make money, and I think $7 for 500 minutes is a reasonable price. However, I don’t think Netlify has put the necessary protections in place for their customers.

SaaS cost attack

A quick Google search for “huge bill” for AWS, Azure, or GCP returns thousands of results. Stories of people who, at best, misconfigured their infrastructure or, at worst, were the target of a DDoS attack. Whatever the cause, the outcome is the same: a massive bill for thousands of dollars.

The big three cloud providers, and most other SaaS companies, offer some sort of protection or hard limit to prevent these sorts of surprises: Azure has Azure Cost Management + Billing, AWS has AWS Budgets, and GCP has Cloud Billing budgets. Netlify has no such protections.

To me, this seems like a massive oversight, because it is a core feature that most users, and all businesses, would expect to have. Obviously the intention here is for Netlify to be able to charge customers when they exceed their build minutes. However, it also leaves open a massive opportunity for malicious actors to incur huge costs for Netlify customers with very little effort.

Proof-of-concept

Netlify allows you to configure your site’s build using a netlify.toml stored in the root of your repository. Here’s an excerpt taken from their documentation:

# Settings in the [build] context are global and are applied to all contexts
# unless otherwise overridden by more specific contexts.
[build]
  # Directory to change to before starting a build.
  # This is where we will look for package.json/.nvmrc/etc.
  base = "project/"

  # Directory that contains the deploy-ready HTML files and assets generated by
  # the build. This is relative to the base directory if one has been set, or the
  # root directory if a base has not been set. This sample publishes the
  # directory located at the absolute path "root/project/build-output"
  publish = "build-output/"

  # Default build command.
  command = "echo 'default context'"

If I were to replace command with something like sleep 120 and open a pull request against a repository that uses Netlify and has not explicitly disabled deploy previews, Netlify will happily build my pull request and sleep for two minutes.

A quick search on GitHub for filename:netlify.toml shows there are approximately forty thousand repositories that are potentially vulnerable to this style of attack.

One small mercy is the fact that Netlify limits builds to 15 minutes by default. So, to start costing anyone money, someone would have to open at least 20 pull requests with command = "sleep 900", since 20 builds of 15 minutes each are needed to burn through the 300 free minutes. At that point, I hope either the repository owner would notice, GitHub would rate-limit you, or it would catch the attention of Netlify.

What next?

I was recently bitten by unexpected charges for build minutes. After contacting Netlify they, very kindly, forgave the charges. However, when I asked about billing protection for customers they replied:

Our policy is to keep your website and builds running as you expect, rather than leaving things in an inconsistent state. I understand it is not what you prefer, but it is what our business prioritizes - expected behavior and continuous uptime for your website and build processes. I have however recorded your feedback for our billing team.

Personally, I have disabled preview deploys on all my sites on Netlify and moved the builds to GitHub Actions. I still use Netlify for NetlifyCMS, but I might look into using NetlifyCMS with a custom OAuth provider in the future.

Tech stack #10YearChallenge

#10YearChallenge has been trending for a while, so I thought it would be fun to do a 10 year challenge for programming and take a look at the technology I used back in 2010.

2010

10 years ago covers my final year in high school, and my first year in university. Both used completely different programming languages and tech stacks, so it’s an interesting place to look back at.

I was running Windows on my personal machine, but the computers in the engineering department at my university were running Linux (SUSE if I recall correctly). It wasn’t my first exposure to Linux, but I was still more comfortable using Windows at this point.

VB.NET

I started learning how to program in my final few years of high school. My computer science teacher started us off with Visual Basic .NET. We were actually the first year group to use this stack. Previously my school used Delphi and Pascal, so it was new to everyone.

For my final year project, I built a system for a hairdresser complete with appointment scheduling, customer database, and inventory management!

A screenshot of my high school project

MATLAB

The first week of university we got thrown into the deep end with a week-long Lego Mindstorms coursework project. There were no real limitations except for your imagination… and your MATLAB skills. In the end, our team built a robot with an automatic gearbox.

A Lego Mindstorms car

Despite MATLAB’s reputation for not being a ‘real’ programming language, I used it a lot throughout all four years at university, including for my Master’s thesis! I’d really recommend MATLAB Cody if you’re looking to improve your MATLAB skills.

C++

Still one of the favourite languages for teaching undergraduates. C++ was used extensively, but one of my proudest pieces of work in C++ is still the logic simulator I wrote for a coursework project.

A screenshot of a logic simulator

I really got to cut my teeth on C++ during my first ever internship, working on an H.265/HEVC video encoder at Cisco. To this day, it was some of the most challenging (in a good way) work I’ve done. Or to use someone else’s words “H.264 is Magic”.

2020

Flash forward to 2020 and I’ve been programming professionally for almost 6 years now. In that time, I’ve used a lot of different languages including Java, Python, and even a year working in X++ (despite my attempts to forget it!).

Even though I work at Microsoft, I’ve been running Arch Linux as my daily driver for over 3 years. Yes, I still need to use Windows in a VM from time to time, but the fact that I can achieve my developer workflow almost entirely from Linux just goes to show that Microsoft ♥ Linux isn’t just an empty platitude.

C#

It’s only in the last year or so that I’ve come back to working on a .NET stack, but already I’ve deployed applications on Azure Functions, ASP.NET Core running in Kubernetes, and most recently Service Fabric. C# is a real breath of fresh air coming from 4 years of Java, and I am really excited to see where the language goes after C# 8 and .NET 5.

TypeScript

If you’re doing front-end work nowadays, I think TypeScript is the best way to do it. It papers over the cracks of JavaScript, and gives you much more confidence, especially when working in a large codebase. The most common stack I work in now is React + TypeScript, and it is a million times better than the jQuery days.

I’ve also used TypeScript for some back-end work too – most notably for Renovate. The type system really lends itself well to these sorts of back-end tasks, and I wouldn’t discount it over some of the more conventional stacks.

DevOps

Okay, so this one isn’t a programming language, but it’s definitely something that has changed the way I work. In this context, DevOps means a couple of things to me: testing, continuous integration/continuous delivery (CI/CD) and monitoring.

In 2010, testing meant manual testing. I remember for my hairdresser management system I had to document my manual test plan. It was a requirement of the marking scheme. Nowadays, it’s easier to think of testing as a pyramid with unit tests at the base, integration tests and E2E tests in the middle, and a small number of manual tests at the top. Ham Vocke’s The Practical Test Pyramid is the definitive guide for testing in 2020.

CI/CD has been one of my favourite topics lately. Even though the Agile Manifesto talked about it almost 20 years ago, only recently has the barrier to entry gotten so low. Between GitHub Actions, GitLab CI, Travis CI and all the rest, it’s a no-brainer. I use GitHub Actions in almost every side project I build.

Monitoring is such an important tool for running a successful service. You can use it to pre-emptively fix problems before they become problems or choose what areas to work on based on customer usage. Like CI/CD it’s become so easy now. For most platforms all you need to do is include an SDK!

2030?

Who knows what 2030 will bring? Maybe Rust will replace C++ everywhere? Maybe AI will have replaced programmers? Maybe Go will finally get generics?

Common async pitfalls—part two

Following on from part one, here’s some more of the most common pitfalls I’ve come across—either myself, colleagues and friends, or examples in documentation—and how to avoid them.

‘Fake’-sync is not async

If the method you are calling is synchronous, even in an async method, then call it like any other synchronous method. If you want to yield the thread, then you should use Task.Yield in most cases. For UI programming, see this note about Task.Yield from the .NET API documentation.

Delegates

Here’s a common pitfall when passing actions as method parameters:

The implicit type conversion from the async function to Action is, surprisingly, not a compiler error! This happens because the function doesn’t have a return value, so it’s converted to a method with an async void signature. In this example the side effects aren’t bad, but in a real application this could be terrible as it violates the expected execution contract.
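As a minimal sketch of this pitfall (DelegateDemo, Run and RunAsync are illustrative names, not taken from any real API), the first call below compiles without complaint but returns before the lambda has finished, while the Func<Task> version lets the caller await the work:

```csharp
using System;
using System.Threading.Tasks;

public static class DelegateDemo
{
    // Hypothetical API that accepts a synchronous callback.
    public static void Run(Action action) => action();

    // Alternative that accepts an asynchronous callback and hands back its task.
    public static Task RunAsync(Func<Task> action) => action();

    public static async Task Main()
    {
        var fireAndForgetDone = false;
        var awaitedDone = false;

        // Compiles, but the lambda becomes async void: Run returns at the
        // lambda's first incomplete await, before the flag is set.
        Run(async () =>
        {
            await Task.Delay(100);
            fireAndForgetDone = true;
        });
        Console.WriteLine($"After Run: {fireAndForgetDone}");

        // With Func<Task> the caller gets a task back and can wait for it.
        await RunAsync(async () =>
        {
            await Task.Delay(100);
            awaitedDone = true;
        });
        Console.WriteLine($"After RunAsync: {awaitedDone}");
    }
}
```

If an API is ever going to receive asynchronous callbacks, offering a Func<Task> parameter is usually the safer design, because exceptions and completion both flow back to the caller.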

Synchronization

Synchronizing asynchronous code is slightly more complicated than synchronizing synchronous code. Mostly, this is because awaiting a task can result in the rest of the method resuming on a different thread. This means that the standard synchronization primitives, which require the same thread to acquire and release a lock, won’t work when used in an async state machine.

Therefore, you must take care to use synchronization primitives that are safe in async methods. For example, using lock will block the current thread while your code waits to gain exclusive access. In asynchronous code, threads should only block for a short amount of time.

In general, it’s not a good idea to perform any I/O under a lock. There’s usually a much better way to synchronize access in asynchronous programming.

Lazy Initialization

Imagine you need to lazy initialize some object under a lock.

When converting RetrieveData to run asynchronously, you might try to rewrite Initialize a few different ways:

But there are a few issues:

  1. You shouldn’t call external code under a lock. The caller has no idea what work the external code will do, or what assumptions it has made.
  2. You shouldn’t perform I/O under a lock. Code sections under a lock should execute as quickly as possible, to reduce contention with other threads. As soon as you perform I/O under a lock, avoiding contention isn’t possible.

SemaphoreSlim

If you absolutely must perform asynchronous work which limits the number of callers, .NET provides SemaphoreSlim, which supports asynchronous, non-blocking waiting.

You still need to take care when converting from a synchronous locking construct. Semaphores, unlike monitor locks, aren’t re-entrant.
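Here is a rough sketch of how that can look, assuming a hypothetical DataCache class whose RetrieveDataAsync method stands in for the real asynchronous work. The semaphore is created with a single permit so it behaves like an async-friendly mutex:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class DataCache
{
    // One permit, so only one caller at a time gets past WaitAsync.
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private string _data;

    public int LoadCount { get; private set; }

    public async Task<string> GetDataAsync(CancellationToken cancellationToken = default)
    {
        // WaitAsync does not block the thread while waiting for the permit.
        await _gate.WaitAsync(cancellationToken);
        try
        {
            if (_data == null)
            {
                _data = await RetrieveDataAsync(cancellationToken);
            }
            return _data;
        }
        finally
        {
            // Always release. Calling GetDataAsync again from inside the try
            // block would deadlock, because the semaphore is not re-entrant.
            _gate.Release();
        }
    }

    // Stand-in for the real asynchronous I/O.
    private async Task<string> RetrieveDataAsync(CancellationToken cancellationToken)
    {
        LoadCount++;
        await Task.Delay(50, cancellationToken);
        return "data";
    }
}

public static class SemaphoreDemo
{
    public static async Task Main()
    {
        var cache = new DataCache();
        var calls = new Task<string>[10];
        for (var i = 0; i < calls.Length; i++)
        {
            calls[i] = cache.GetDataAsync();
        }
        await Task.WhenAll(calls);
        Console.WriteLine($"Loads performed: {cache.LoadCount}");
    }
}
```

The try/finally matters here: unlike the lock statement, nothing releases the semaphore for you if the protected code throws.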

IDisposable

IDisposable is used to release acquired resources. In some cases, you need to dispose of these resources asynchronously, to avoid blocking. Unfortunately, you can’t do this inside Dispose().

Thankfully, .NET Core 3.0 provides the new IAsyncDisposable interface, which allows you to handle asynchronous finalization like so:
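A minimal sketch, assuming a hypothetical LogWriter class that wraps a stream (the class and its members are illustrative). It needs C# 8 and .NET Core 3.0 or later for await using and Stream.DisposeAsync:

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public sealed class LogWriter : IAsyncDisposable
{
    private readonly Stream _stream;

    public LogWriter(Stream stream) => _stream = stream;

    public Task WriteLineAsync(string message)
    {
        var bytes = Encoding.UTF8.GetBytes(message + "\n");
        return _stream.WriteAsync(bytes, 0, bytes.Length);
    }

    // Asynchronous counterpart to Dispose: flush and clean up without blocking.
    public async ValueTask DisposeAsync()
    {
        await _stream.FlushAsync();
        await _stream.DisposeAsync();
    }
}

public static class DisposeDemo
{
    public static async Task Main()
    {
        var buffer = new MemoryStream();

        // await using calls DisposeAsync when the block exits.
        await using (var writer = new LogWriter(buffer))
        {
            await writer.WriteLineAsync("hello");
        }

        Console.WriteLine($"Bytes written: {buffer.ToArray().Length}");
    }
}
```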

IEnumerable and IEnumerator

Usually you would implement IEnumerable or IEnumerator so you can use syntactic sugar, like foreach and LINQ-to-Objects. Unfortunately, these are synchronous interfaces that can only be used on synchronous data sources. If your underlying data source is actually asynchronous, you shouldn’t expose it using these interfaces, as it will lead to blocking.

With the release of .NET Core 3.0 we got the IAsyncEnumerable and IAsyncEnumerator interfaces, which allow you to enumerate asynchronous data sources:
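As a small sketch, assuming a hypothetical paged data source (PageReader and FetchPageAsync are made-up names), an async iterator can produce the sequence and await foreach can consume it:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class PageReader
{
    // Async iterator: each page is fetched lazily, as the consumer asks for more.
    public static async IAsyncEnumerable<int> ReadItemsAsync(int pageCount)
    {
        for (var page = 0; page < pageCount; page++)
        {
            var items = await FetchPageAsync(page);
            foreach (var item in items)
            {
                yield return item;
            }
        }
    }

    // Stand-in for an asynchronous data source such as a database or HTTP API.
    private static async Task<int[]> FetchPageAsync(int page)
    {
        await Task.Delay(10);
        return new[] { page * 2, page * 2 + 1 };
    }

    public static async Task Main()
    {
        // No thread is blocked while the next page is being fetched.
        await foreach (var item in ReadItemsAsync(2))
        {
            Console.WriteLine(item);
        }
    }
}
```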

Prefer the compiler-generated state machine

There are some valid cases for using Task.ContinueWith, but it can introduce some subtle bugs if not used carefully. It’s much easier to avoid it, and just use async and await instead.
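One subtle difference is how failures surface. In this sketch (ContinuationDemo and its methods are illustrative names), both methods compute the length of a string result, but when the source task fails the ContinueWith version reports an AggregateException, because reading t.Result wraps the original exception, while the await version rethrows the original exception:

```csharp
using System;
using System.Threading.Tasks;

public static class ContinuationDemo
{
    // Reading t.Result inside the continuation wraps failures in AggregateException.
    public static Task<int> LengthWithContinueWith(Task<string> source) =>
        source.ContinueWith(t => t.Result.Length);

    // The compiler-generated state machine rethrows the original exception.
    public static async Task<int> LengthWithAwait(Task<string> source) =>
        (await source).Length;

    // Runs the given method against a failed task and reports what the caller sees.
    public static async Task<string> DescribeFailureAsync(Func<Task<string>, Task<int>> method)
    {
        var failing = Task.FromException<string>(new InvalidOperationException("boom"));
        try
        {
            await method(failing);
            return "no exception";
        }
        catch (Exception ex)
        {
            return ex.GetType().Name;
        }
    }

    public static async Task Main()
    {
        Console.WriteLine(await DescribeFailureAsync(LengthWithContinueWith));
        Console.WriteLine(await DescribeFailureAsync(LengthWithAwait));
    }
}
```

ContinueWith also schedules onto TaskScheduler.Current unless told otherwise, which is another thing that async and await take care of for you.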

TaskCompletionSource

TaskCompletionSource<T> allows you to support manual completion in asynchronous code. In general, this class should not be used… but when you have to use it you should be aware of the following behaviour:
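A well-known example is that, by default, code awaiting the task can resume synchronously on the thread that calls SetResult, so the completing thread ends up running someone else’s continuation. Passing TaskCreationOptions.RunContinuationsAsynchronously avoids this. The sketch below assumes a plain console app with no SynchronizationContext, and CompletionDemo and its members are illustrative names:

```csharp
using System;
using System.Threading.Tasks;

public static class CompletionDemo
{
    // Per-thread marker that is only true while SetResult is on the stack.
    [ThreadStatic] private static bool _insideSetResult;

    public static async Task<bool> RanInlineAsync(TaskCreationOptions options)
    {
        var tcs = new TaskCompletionSource<int>(options);
        var observer = ObserveAsync(tcs.Task);

        _insideSetResult = true;
        tcs.SetResult(42);
        _insideSetResult = false;

        return await observer;
    }

    private static async Task<bool> ObserveAsync(Task<int> task)
    {
        await task;
        // True only if this continuation is running on the completing
        // thread while it is still inside SetResult.
        return _insideSetResult;
    }

    public static async Task Main()
    {
        Console.WriteLine(await RanInlineAsync(TaskCreationOptions.None));
        Console.WriteLine(await RanInlineAsync(TaskCreationOptions.RunContinuationsAsynchronously));
    }
}
```

Inline continuations are a problem when SetResult is called while holding a lock or from a thread that must not be held up, because arbitrary consumer code then runs in that context.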

Common async pitfalls—part one

The .NET Framework provides a great programming model that enables high performance code using an easy to understand syntax. However, this can often give developers a false sense of security, and the language and runtime aren’t without pitfalls. Ideally static analysers, like the Microsoft.VisualStudio.Threading.Analyzers Roslyn analysers, would catch all these issues at build time. While they do help catch a lot of mistakes, they can’t catch everything, so it’s important to understand the problems and how to avoid them.

Here’s a collection of some of the most common pitfalls I’ve come across—either myself, colleagues and friends, or examples in documentation—and how to avoid them.

Blocking calls

The main benefit of asynchronous programming is that the thread pool can be smaller than that of a synchronous application while performing the same amount of work. However, once a piece of code begins to block threads, the resulting thread pool starvation can be ugly.

If I run a small test, which makes 5000 concurrent HTTP requests to a local server, there are dramatically different results depending on how many blocking calls are used.

% Blocking shows the percentage of calls that use Task.Result, which blocks the thread. All other requests use await.

% Blocking   Threads   Total Duration   Avg. Duration (s)
0            24        00:00:11.961     0.0023923
5            268       00:02:16.574     0.0273148

The increased total duration when using blocking calls is due to the thread pool growth, which happens slowly. You can always tune the thread pool settings to achieve better performance, but it will never match the performance you can achieve with non-blocking calls.

Streams

Like all other blocking calls, any methods from System.IO.Stream should use their async equivalents: Read to ReadAsync, Write to WriteAsync, Flush to FlushAsync, etc. Also, after writing to a stream, you should call the FlushAsync method before disposing the stream. If not, the Dispose method may perform some blocking calls.
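As a short sketch (StreamDemo and SaveAsync are illustrative names), writing a file with the async methods and flushing before disposal could look like this:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class StreamDemo
{
    public static async Task SaveAsync(string path, byte[] payload)
    {
        // useAsync: true asks for asynchronous file I/O where the OS supports it.
        using (var stream = new FileStream(
            path, FileMode.Create, FileAccess.Write, FileShare.None,
            bufferSize: 4096, useAsync: true))
        {
            await stream.WriteAsync(payload, 0, payload.Length);

            // Flush asynchronously so Dispose has nothing left to write.
            await stream.FlushAsync();
        }
    }

    public static async Task Main()
    {
        var path = Path.GetTempFileName();
        await SaveAsync(path, new byte[] { 1, 2, 3 });
        Console.WriteLine($"File length: {new FileInfo(path).Length}");
        File.Delete(path);
    }
}
```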

CancellationToken

You should always propagate cancellation tokens to the next method in the call chain. This is called a cooperative cancellation model. If you don’t, you can end up with methods that run longer than expected, or even worse, never complete.

To indicate to the caller that cancellation is supported, the final parameter in the method signature should be a CancellationToken object.
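A minimal sketch of that convention, with ProcessAsync as a made-up method whose Task.Delay call stands in for real asynchronous work:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationDemo
{
    // The token is the final parameter and is passed to every call that accepts one.
    public static async Task<int> ProcessAsync(int items, CancellationToken cancellationToken = default)
    {
        var processed = 0;
        for (var i = 0; i < items; i++)
        {
            cancellationToken.ThrowIfCancellationRequested();
            await Task.Delay(20, cancellationToken);
            processed++;
        }
        return processed;
    }

    public static async Task Main()
    {
        using (var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(50)))
        {
            try
            {
                await ProcessAsync(1000, cts.Token);
            }
            catch (OperationCanceledException)
            {
                Console.WriteLine("Cancelled before all items were processed");
            }
        }
    }
}
```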

Linked tokens

If you need to put a timeout on an inner method call, you can link one cancellation token to another. For example, you might want to make a service-to-service call and enforce a timeout, while still respecting the external cancellation.
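Sketched out, with CallWithTimeoutAsync and SlowServiceAsync as hypothetical names, the linked source is cancelled either when the caller’s token fires or when the timeout elapses, whichever comes first:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LinkedTokenDemo
{
    public static async Task<string> CallWithTimeoutAsync(
        Func<CancellationToken, Task<string>> call,
        TimeSpan timeout,
        CancellationToken cancellationToken = default)
    {
        // The linked source observes the external token and adds its own timeout.
        using (var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken))
        {
            linkedCts.CancelAfter(timeout);
            return await call(linkedCts.Token);
        }
    }

    // Stand-in for a slow service-to-service call.
    private static async Task<string> SlowServiceAsync(CancellationToken cancellationToken)
    {
        await Task.Delay(5000, cancellationToken);
        return "response";
    }

    public static async Task Main()
    {
        try
        {
            await CallWithTimeoutAsync(SlowServiceAsync, TimeSpan.FromMilliseconds(100));
        }
        catch (OperationCanceledException)
        {
            Console.WriteLine("Timed out or cancelled");
        }
    }
}
```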

Cancelling uncancellable operations

Sometimes you may find the need to call an API which does not accept a cancellation token, but your API receives a token and is expected to respect cancellation. In this case the typical pattern involves managing two tasks and effectively abandoning the un-cancellable operation after the token signals.
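One common way to write this is an extension method that races the operation against a task completed by the token. This is a sketch rather than a definitive implementation, and WithCancellation and LegacyCallAsync are illustrative names:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskExtensions
{
    public static async Task<T> WithCancellation<T>(this Task<T> task, CancellationToken cancellationToken)
    {
        var cancelled = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);

        // Complete the second task when the token signals.
        using (cancellationToken.Register(() => cancelled.TrySetResult(true)))
        {
            if (task != await Task.WhenAny(task, cancelled.Task))
            {
                // The original operation keeps running; we only stop waiting for it.
                throw new OperationCanceledException(cancellationToken);
            }
        }

        // Propagate the result, or the exception, of the completed operation.
        return await task;
    }
}

public static class AbandonDemo
{
    // Stand-in for an API that does not accept a cancellation token.
    private static async Task<string> LegacyCallAsync()
    {
        await Task.Delay(5000);
        return "done";
    }

    public static async Task Main()
    {
        using (var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(100)))
        {
            try
            {
                await LegacyCallAsync().WithCancellation(cts.Token);
            }
            catch (OperationCanceledException)
            {
                Console.WriteLine("Stopped waiting for the operation");
            }
        }
    }
}
```

Because the abandoned operation is still running, any resources it holds stay in use until it finishes on its own, and any exception it throws later goes unobserved.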

Constructors

Occasionally, you may find yourself wanting to perform asynchronous work during initialization of a class instance. Unfortunately, there is no way to make constructors async.

There are a couple of different ways to solve this. Here’s a pattern I like:

  1. A public static creator method, which publicly replaces the constructor
  2. A private async member method, which does the work the constructor used to do
  3. A private constructor, so callers can’t directly instantiate the class by mistake

So, if I apply the same pattern to the class above the class becomes:

And we can instantiate the class by calling var foo = await Foo.CreateAsync(1, 2);.

In cases where the class is part of an inheritance hierarchy, the constructor can be made protected and InitializeAsync can be made protected virtual, so it can be overridden and called from derived classes. Each derived class will need to have its own CreateAsync method.
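The three parts of the pattern can be sketched as follows. The fields and the Sum property are invented for illustration, and Task.Delay stands in for whatever asynchronous setup the class really needs:

```csharp
using System;
using System.Threading.Tasks;

public sealed class Foo
{
    private readonly int _a;
    private readonly int _b;

    public int Sum { get; private set; }

    // 3. Private constructor, so callers can't skip the asynchronous setup.
    private Foo(int a, int b)
    {
        _a = a;
        _b = b;
    }

    // 1. Public static creator method that replaces the constructor.
    public static async Task<Foo> CreateAsync(int a, int b)
    {
        var foo = new Foo(a, b);
        await foo.InitializeAsync();
        return foo;
    }

    // 2. Private async method that does the work the constructor used to do.
    private async Task InitializeAsync()
    {
        await Task.Delay(10);
        Sum = _a + _b;
    }
}

public static class FactoryDemo
{
    public static async Task Main()
    {
        var foo = await Foo.CreateAsync(1, 2);
        Console.WriteLine(foo.Sum);
    }
}
```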

Parallelism

Avoid premature optimization

It might be very tempting to try to perform parallel work by not immediately awaiting tasks. In some cases, you can make significant performance improvements. However, if not used with care you can end up in debugging hell involving socket or port exhaustion, or database connection pool saturation.

Using async everywhere generally pays off without having to make any individual piece of code faster via parallelization. When threads aren’t blocking you can achieve higher performance with the same amount of CPU.

Avoid Task.Factory.StartNew, and use Task.Run only when needed

Even in the cases where not immediately awaiting is safe, you should avoid Task.Factory.StartNew, and only use Task.Run when you need to run some CPU-bound code asynchronously.

The main way Task.Factory.StartNew is dangerous is that it can look like tasks are awaited when they aren’t. For example, if you async-ify the following code:

be careful, because once the delegate returns Task, Task.Factory.StartNew will return Task<Task>. Awaiting only the outer task waits until the actual task starts, not until it finishes.

Normally what you want to do, when you know delegates are not CPU-bound, is to just use the delegates themselves. This is almost always the right thing to do.

However, if you are certain the delegates are CPU-bound, and you want to offload this to the thread pool, you can use Task.Run. It’s designed to support async delegates. I’d still recommend reading Task.Run Etiquette and Proper Usage for a more thorough explanation.

If, for some extremely unlikely reason, you really do need to use Task.Factory.StartNew you can use Unwrap() or await await to convert a Task<Task> into a Task that represents the actual work. I’d recommend reading Task.Run vs Task.Factory.StartNew for a deeper dive into the topic.
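To make the difference concrete, here is a small sketch (StartNewDemo and its methods are illustrative names). Awaiting the result of Task.Factory.StartNew with an async delegate only yields the inner task, whereas Task.Run and Unwrap() both wait for the actual work:

```csharp
using System;
using System.Threading.Tasks;

public static class StartNewDemo
{
    // Reports whether the inner work had finished when the outer await returned.
    public static async Task<bool> InnerFinishedAfterStartNewAsync()
    {
        Task<Task> outer = Task.Factory.StartNew(async () => await Task.Delay(200));
        Task inner = await outer;       // completes once the delegate hits its first await
        var finished = inner.IsCompleted;
        await inner;                    // now actually wait for the work
        return finished;
    }

    // Task.Run understands async delegates and waits for the inner work.
    public static async Task<bool> FinishedAfterTaskRunAsync()
    {
        var completed = false;
        await Task.Run(async () =>
        {
            await Task.Delay(200);
            completed = true;
        });
        return completed;
    }

    // Unwrap() turns Task<Task> into a Task that represents the inner work.
    public static async Task<bool> FinishedAfterUnwrapAsync()
    {
        var completed = false;
        await Task.Factory.StartNew(async () =>
        {
            await Task.Delay(200);
            completed = true;
        }).Unwrap();
        return completed;
    }

    public static async Task Main()
    {
        Console.WriteLine(await InnerFinishedAfterStartNewAsync());
        Console.WriteLine(await FinishedAfterTaskRunAsync());
        Console.WriteLine(await FinishedAfterUnwrapAsync());
    }
}
```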

Null conditionals

Using the null conditional operator with awaitables can be dangerous. Awaiting null throws a NullReferenceException.

Instead, you must do a manual check first.
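A quick sketch of both versions, using a made-up Notifier class:

```csharp
using System;
using System.Threading.Tasks;

public sealed class Notifier
{
    public Task NotifyAsync() => Task.CompletedTask;
}

public static class NullConditionalDemo
{
    // When notifier is null the expression evaluates to a null Task,
    // and awaiting that null throws NullReferenceException.
    public static async Task NotifyUnsafeAsync(Notifier notifier)
    {
        await notifier?.NotifyAsync();
    }

    // Manual check: only await when there is something to await.
    public static async Task NotifySafeAsync(Notifier notifier)
    {
        if (notifier != null)
        {
            await notifier.NotifyAsync();
        }
    }

    public static async Task Main()
    {
        await NotifySafeAsync(null);

        try
        {
            await NotifyUnsafeAsync(null);
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("Awaiting null threw NullReferenceException");
        }
    }
}
```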

A null-conditional await is currently under consideration for future versions of C#, but until then you’re stuck with manually checking.

Zwift on Linux

Getting Zwift to run on Linux was a journey I started just over a year ago. I didn’t get very far with my effort, but since then a lot of progress has been made by the Wine developers and others in the community, and Zwift is now (mostly) playable on Linux. I’ll admit there are some workarounds required, like having to use the Zwift companion app to connect sensors. But on the whole, it works well. So I wanted to summarise the process for anyone who wants to try it for themselves.

I’m using Lutris, a gaming client for Linux, to script out all the steps needed to make games playable on Linux. If you’ve never used it before, I’d really recommend it for gaming on Linux in general. First things first, you’re going to have to download and install Lutris for your Linux distribution. Thankfully Lutris has a great help page explaining how to do this for most distributions.

Installation

Once you’ve got Lutris installed, installing Zwift is pretty easy. In Lutris search for Zwift, select the only result, and click the “Install” button to start the installation process. You can also start the installer from the command line by running lutris install/zwift-windows.

Lutris Installer

This might take a while, and depending on your internet speed could be anywhere from 10 minutes to around an hour.

Once the Zwift launcher has finished downloading and updating, we’ve hit the first hurdle that can’t be scripted with Lutris.

The launcher will appear as a blank white window. Actually, the launcher is displaying a web page, but Wine can’t render it properly. Thankfully all the files are already downloaded, so all you need to do is quit the launcher window, and exit Zwift from the Wine system menu. After that, the Lutris installer should complete.

Running Zwift

Zwift requires the launcher to be running all the time while in-game. However, Lutris only allows one application to launch from the “Play” button. So before you hit the play button, first you need to click “Run EXE inside wine prefix” and browse to drive_c\Program Files (x86)\Zwift\ZwiftLauncher. You should see that familiar blank white screen.

Finally, you can hit the “Play” button and Ride On 👍