Serverless computing in Azure: How to build a petition site

This example shows you how to use Azure services to build scalable, serverless web applications

The other day, I found myself watching a ticker on a petition site slowly roll over at a million signatures. It was clear that the site was struggling, with authentication emails taking anything up to 24 hours to send, resulting in queues full of unapproved signatures waiting to be written to a database. Watching that ticker count up, I started to wonder how a service like this could be built using some of the more modern cloud-first design options available with Azure, putting together many of the tools and services I’ve written about in this column.

The architecture for a service like this is relatively simple. You need a web front end to collect signatures, a messaging framework to deliver them to a scalable back end, using microservices to write them into a database and to verify sender identities. You can then use other messaging tools to keep track of signatures, and you can use analytic tools and machine learning to identify invalid entries.

Designing a web tier

One key element of a service like this is the web content. It’s important to choose the right web development framework. If you’re going to use Azure microservices to handle page-generated events, a single-page web application (SPA) is a useful framework to build on. Azure’s tools for handling scalable web content provide a platform for delivering content, using its content delivery network to scale out static content and page templates, with Azure Front Door’s built-in application gateway to handle load balancing across web application servers and to provide a web application firewall.

I would use React as a web framework because it works well in SPAs, and its responsive nature makes it a good fit for an application that needs to work across multiple devices. It makes it easy to connect form elements to JavaScript, letting you construct a CloudEvents payload and send it to your microservices back end.

Building a service like this around a messaging-driven event architecture makes a lot of sense. You need to be able to scale, and you need to be able to work across an environment that’s rapidly changing. Messaging, especially when tied into a message broker, can handle dynamically changing infrastructures, and standards-based message formats such as the one at the heart of CloudEvents give you a framework for constructing message headers and payloads.
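As a rough illustration of what the front end might send, here is a minimal sketch of a signing form wrapped in a CloudEvents 1.0 JSON envelope. The form field names, the `source` path, and the `type` string are assumptions made for this example, not part of any real service; only the envelope attribute names come from the CloudEvents specification.

```typescript
// Fields collected by the signing form (hypothetical names).
interface SignatureForm {
  name: string;
  email: string;
  postalCode: string;
  petitionId: string;
}

// A subset of the CloudEvents 1.0 JSON envelope: the four required context
// attributes (specversion, id, source, type) plus optional time,
// datacontenttype, and data.
interface CloudEvent<T> {
  specversion: "1.0";
  id: string;
  source: string;
  type: string;
  time: string;
  datacontenttype: "application/json";
  data: T;
}

// Wrap the form in an envelope. The id and clock are passed in so the
// function stays pure and easy to test.
function buildSignatureEvent(
  form: SignatureForm,
  id: string,
  now: Date,
): CloudEvent<SignatureForm> {
  return {
    specversion: "1.0",
    id,
    source: `/petitions/${form.petitionId}`, // assumed source scheme
    type: "com.example.petition.signature.submitted", // assumed event type
    time: now.toISOString(),
    datacontenttype: "application/json",
    data: { ...form, email: form.email.trim() },
  };
}

const signatureEvent = buildSignatureEvent(
  {
    name: "Ada Lovelace",
    email: " ada@example.com ",
    postalCode: "SW1A 1AA",
    petitionId: "42",
  },
  "evt-0001",
  new Date("2019-06-01T12:00:00Z"),
);

// The serialized envelope is what the page would POST to the back end.
const requestBody = JSON.stringify(signatureEvent);
```

In a React form handler, the serialized body would typically be sent with `fetch`, using the `application/cloudevents+json` content type that the CloudEvents HTTP binding defines for structured-mode messages.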

Processing data with serverless microservices

There’s an interesting question facing anyone building a site like this: How do you get it to scale? The obvious answer is to take advantage of serverless computing technologies and deliver a minimal yet highly scalable service. Instead of launching new virtual machines as load increases and having to dynamically modify load balancing rules, you can spin up Azure Functions instances as they are needed, using tools like Event Grid to direct messages from your web app front end to message-processing microservices.

Designing the microservices that build an app like this is relatively easy. You’re writing data into a queue and holding it until you’ve verified the user via an email verification loop. That process can be delivered using two microservices. The first microservice takes a message from the web app and writes its contents into a database with a flag that indicates a provisional signing. The second microservice then displays a message that verification is required, delivering an email address and a signing token to an email service, which inserts them in a template before delivering it to the signer.
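The two-service flow described above can be sketched with in-memory stand-ins. Everything here is hypothetical: the `db` map stands in for the database, the `outbox` array for the email service, and the function names are invented for the example. In a real deployment each function would be a separately triggered serverless function.

```typescript
type SigningStatus = "provisional" | "verified";

interface SignatureRecord {
  petitionId: string;
  signer: string;
  email: string;
  status: SigningStatus;
  token: string;
}

interface VerificationEmail {
  to: string;
  token: string;
}

// In-memory stand-ins for the database and the outbound email service.
const db = new Map<string, SignatureRecord>();
const outbox: VerificationEmail[] = [];

// Microservice 1: write the signature with a provisional flag.
function recordProvisional(
  petitionId: string,
  signer: string,
  email: string,
  token: string,
): SignatureRecord {
  const record: SignatureRecord = {
    petitionId,
    signer,
    email,
    status: "provisional",
    token,
  };
  db.set(token, record);
  return record;
}

// Microservice 2: pass the address and token to the email service and
// return the message shown to the signer.
function requestVerification(record: SignatureRecord): string {
  outbox.push({ to: record.email, token: record.token });
  return "Check your inbox to confirm your signature.";
}

// Verification step: flip the flag when a known, unused token comes back.
function confirmSignature(token: string): boolean {
  const record = db.get(token);
  if (!record || record.status !== "provisional") return false;
  record.status = "verified";
  return true;
}

// Count only verified signatures for a petition.
function countVerified(petitionId: string): number {
  let count = 0;
  db.forEach((record) => {
    if (record.petitionId === petitionId && record.status === "verified") {
      count++;
    }
  });
  return count;
}
```

Keeping the provisional flag on the record means unverified signatures never contribute to the public count, and a token that has already been used is rejected on a second attempt.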

Using intelligent data services

Perhaps the most difficult decision for anyone building a modern cloud application is choosing a data back end. It’s not that there aren’t any services that would work; it’s that there are too many! Should you store data in traditional relational tables, store it in a NoSQL service, or take advantage of the capabilities of graph databases?

Certainly for a petition site, where you want to run a range of queries on the results to determine fraudulent signatures, you’ve got a specific set of requirements that you can use to determine problematic entries. You’ll store a name, an address token such as a ZIP or postal code, a timestamp, and an originating IP address, along with a petition identifier. This should let you first check for duplicate email addresses as part of the signing process, as well as check for obvious variants of an address (for example, looking for the “.” separator in a Gmail address, which isn’t parsed by Gmail but is by most other mail-transfer engines).
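The variant check could look something like the following sketch, which reduces each address to a canonical form before comparing. The function names are invented for the example, and the list of dot-insensitive domains is an assumption that you would tune for your own signers. Only the dot handling mentioned above is covered.

```typescript
// Domains whose mailboxes ignore dots in the local part (assumed list).
const DOT_INSENSITIVE_DOMAINS = new Set<string>(["gmail.com"]);

// Reduce an address to a canonical form for duplicate detection only.
// The original address is still the one used for delivery.
function canonicalEmail(address: string): string {
  const trimmed = address.trim().toLowerCase();
  const at = trimmed.lastIndexOf("@");
  if (at < 1) return trimmed; // not a usable address; validate elsewhere
  const local = trimmed.slice(0, at);
  const domain = trimmed.slice(at + 1);
  const canonicalLocal = DOT_INSENSITIVE_DOMAINS.has(domain)
    ? local.replace(/\./g, "")
    : local;
  return `${canonicalLocal}@${domain}`;
}

// Group submitted addresses by canonical form and return every group
// that contains more than one entry.
function findDuplicates(addresses: string[]): string[][] {
  const groups = new Map<string, string[]>();
  for (const address of addresses) {
    const key = canonicalEmail(address);
    const group = groups.get(key) ?? [];
    group.push(address);
    groups.set(key, group);
  }
  return Array.from(groups.values()).filter((group) => group.length > 1);
}
```

Storing the canonical form alongside the original address would let the signing process reject a duplicate with a single lookup instead of a scan.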

Timestamps and IP addresses can help track fraud as part of a batch cleanup process that runs outside the petition application. Obvious vote stuffing from a single IP address can be tracked, along with signatures that come from known VPN endpoints.
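One simple form of that batch check is a sliding-window count per IP address. This is a sketch under assumed thresholds; the record shape, the function name, and the limit and window values are all illustrative, and a real cleanup job would read from the database rather than an array.

```typescript
interface SignatureLog {
  ip: string;
  signedAt: number; // milliseconds since the Unix epoch
}

// Return every IP that submitted more than `limit` signatures inside
// any window of `windowMs` milliseconds.
function flagBurstyIps(
  logs: SignatureLog[],
  limit: number,
  windowMs: number,
): string[] {
  // Group timestamps by originating IP.
  const byIp = new Map<string, number[]>();
  for (const log of logs) {
    const times = byIp.get(log.ip) ?? [];
    times.push(log.signedAt);
    byIp.set(log.ip, times);
  }

  const flagged: string[] = [];
  byIp.forEach((times, ip) => {
    times.sort((a, b) => a - b);
    let start = 0;
    for (let end = 0; end < times.length; end++) {
      // Shrink the window until it spans at most windowMs.
      while (times[end] - times[start] > windowMs) start++;
      if (end - start + 1 > limit) {
        flagged.push(ip);
        return; // this IP is already flagged; move on to the next one
      }
    }
  });
  return flagged;
}
```

Flagged addresses are candidates for review rather than automatic removal, since shared networks such as offices and campuses can legitimately produce bursts of signatures.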

With queries like this, it could be possible to use something like Azure’s Cosmos DB as a data platform, using it to store data as graph nodes and using graph queries to extract data. By using a distributed storage model, you can build a resilient back end, using a relatively loose consistency model to handle replication between instances. By using, say, bounded staleness or session-based consistency models, you can let your signing web app drop data into Cosmos DB and then put up a page asking a user to wait for a confirmation email.

Confirmations and machine learning for fraud detection

That lets you wait for the various instances to become consistent before you make basic fraud checks and then trigger an appropriate email. Using a serverless app triggered by JavaScript functions running inside Cosmos DB, you could send fail, retry, or authenticate submission emails. Although Azure has the tools to send SMTP messages, it’s better to use a third-party service like SendGrid, which has the scale to send many millions of messages in a short time (and which has the benefit of being whitelisted by most antispam services).

The confirmation message sent to the user contains a cryptographically signed token. Once the token has been returned to a verification microservice and checked, you can toggle the signed status of the database entry. Another internal JavaScript service counts valid entries, and you can use SignalR to deliver a near-real-time count of signatories to an API that can be displayed by a web page or used by third parties who want to track the petition.
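The token scheme isn’t specified here, so the following is only one possible approach: an HMAC-SHA256 over a signature ID and an expiry time, using Node.js’s built-in `crypto` module. The token layout, the function names, and the assumption that the signature ID contains no “.” characters are all choices made for this sketch.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute a hex-encoded HMAC-SHA256 over the payload.
function sign(payload: string, secret: string): string {
  return createHmac("sha256", secret).update(payload).digest("hex");
}

// Token layout (assumed): "<signatureId>.<expiresAtMs>.<hexMac>".
// The signature ID must not contain "." for this layout to parse.
function issueToken(
  signatureId: string,
  expiresAt: number,
  secret: string,
): string {
  const payload = `${signatureId}.${expiresAt}`;
  return `${payload}.${sign(payload, secret)}`;
}

// Return the signature ID if the token is authentic and unexpired,
// otherwise null.
function verifyToken(
  token: string,
  secret: string,
  now: number,
): string | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [signatureId, expires, mac] = parts;

  const expected = Buffer.from(sign(`${signatureId}.${expires}`, secret));
  const given = Buffer.from(mac);
  // timingSafeEqual throws on unequal lengths, so check length first.
  if (given.length !== expected.length) return null;
  if (!timingSafeEqual(given, expected)) return null;

  if (Number(expires) < now) return null;
  return signatureId;
}
```

The verification microservice would call `verifyToken` with the shared secret and the current time, then toggle the signed status for the returned ID. Comparing the MAC before reading the expiry means a tampered expiry value is rejected before it is ever interpreted.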

Putting all these elements together delivers a petition service that scales well and that can handle basic antifraud techniques. You can quickly add more complex antifraud tools around the process, using Azure machine learning tools to look for unusual patterns of signing. These can run outside the signing process, so you use them only when you have enough signatures to make it economical to use large amounts of compute.

Thus far, I’ve used modern cloud development principles to lay out the core architecture for a scalable, relatively low-cost service. You can do more detailed work to flesh things out, building quick prototypes to test, before starting with a minimum viable product and adding features with each subsequent release. By mixing services with serverless compute, you can scale quickly both up and down as user demand fluctuates, without having to commit to longer-term virtual infrastructure costs.

It’s worth running through this type of design exercise with your architecture team regularly, because it gives you the opportunity to consider services that you might not be using in your applications. It’s a quick and easy way of learning and sharing ideas, with an added bonus: Once you’ve thought about using them in a greenfield application, you have the option of considering them in future application developments and updates, giving you a way to reduce the risk of technical debt.

Copyright © 2019 IDG Communications, Inc.