Cloud Architecture Patterns

Bill Wilder





Scalability

is

a measure of the number of users an application can effectively support at the same time.

Vertically Scaling Up

is to

increase overall application capacity by increasing the resources within existing nodes.

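# raise the disk limit of each instance to 512 MB
$ cf scale myApp -k 512M
# raise the memory limit of each instance to 1 GB
$ cf scale myApp -m 1G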

Horizontally Scaling Out

is to

increase overall application capacity by adding entire nodes.

# run five instances of the application
$ cf scale myApp -i 5

Horizontal Scaling

is more efficient

with homogeneous nodes.
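
When nodes are homogeneous, sizing a horizontal scale-out reduces to a division, because every node is assumed to carry the same share of the load. The sketch below illustrates that arithmetic. The function name, the per-instance capacity of 500 users, and the single spare node are hypothetical values chosen for illustration, not figures from the deck.

```python
import math


def instances_needed(concurrent_users, users_per_instance, spare=1):
    """Return how many identical instances cover the load, plus spare nodes.

    Assumes homogeneous nodes: each instance supports the same number of
    concurrent users (a hypothetical, measured-in-advance figure).
    """
    if users_per_instance <= 0:
        raise ValueError("users_per_instance must be positive")
    if concurrent_users < 0 or spare < 0:
        raise ValueError("concurrent_users and spare must not be negative")
    # Round up: a partially loaded node is still a whole node.
    return math.ceil(concurrent_users / users_per_instance) + spare


# 2000 users at 500 users per instance: 4 nodes for load + 1 spare.
print(instances_needed(2000, 500))
```

The result could then be passed as the instance count when scaling out. With heterogeneous nodes the same calculation would need a capacity figure per node type, which is one reason homogeneous nodes are simpler to manage.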

Scale Unit


is

the combination of resources that must be scaled together.

The following characteristics of a cloud platform make cloud-native applications possible:


  • Enabled by (the illusion of) infinite resources and limited by the maximum capacity of individual virtual machines, cloud scaling is horizontal.
  • Enabled by a short-term resource rental model, cloud scaling releases resources as easily as they are added.
  • Enabled by a metered pay-for-use model, cloud applications only pay for currently allocated resources and all usage costs are transparent.
  • Enabled by self-service, on-demand, programmatic provisioning and releasing of resources, cloud scaling is automatable.
  • Both enabled and constrained by multitenant services running on commodity hardware, cloud applications are optimized for cost rather than reliability; failure is routine, but downtime is rare.
  • Enabled by a rich ecosystem of managed platform services such as for virtual machines, data storage, messaging, and networking, cloud application development is simplified.
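
The rental and pay-for-use points above can be made concrete with a small cost comparison. This is a minimal sketch under stated assumptions: the hourly demand profile and the rate of 10 cents per instance-hour are invented for illustration, and real platforms may bill at a different granularity.

```python
def rental_cost_cents(instances_per_hour, rate_cents_per_instance_hour):
    """Total cost when billed only for instances allocated in each hour."""
    return sum(instances_per_hour) * rate_cents_per_instance_hour


# Hypothetical demand: instances needed in each of eight consecutive hours.
demand = [2, 2, 3, 5, 5, 3, 2, 2]
rate_cents = 10  # assumed price per instance-hour, in cents

# Elastic: allocate what each hour needs, release the rest.
elastic = rental_cost_cents(demand, rate_cents)

# Fixed: hold peak capacity for the whole period.
fixed = rental_cost_cents([max(demand)] * len(demand), rate_cents)

print(elastic, fixed)
```

Under these assumed numbers the elastic allocation pays for 24 instance-hours while the fixed allocation pays for 40, which is the saving that releasing resources as easily as they are added makes available.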

A cloud-native application is architected to take full advantage of cloud platforms. A cloud-native application is assumed to have the following properties, as applicable:


  • Leverages cloud-platform services for reliable, scalable infrastructure. (“Let the platform do the hard stuff.”)
  • Uses non-blocking asynchronous communication in a loosely coupled architecture.
  • Scales horizontally, adding resources as demand increases and releasing resources as demand decreases.
  • Cost-optimizes to run efficiently, not wasting resources.
  • Handles scaling events without downtime or user experience degradation.
  • Handles transient failures without user experience degradation.
  • Handles node failures without downtime.
  • Uses geographical distribution to minimize network latency.
  • Upgrades without downtime.
  • Scales automatically using proactive and reactive actions.
  • Monitors and manages application logs even as nodes come and go.
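
The "scales automatically using reactive actions" property can be sketched as a simple threshold rule. This is an illustrative assumption rather than any platform's actual auto-scaling API: the function name, the CPU thresholds, and the instance limits are all hypothetical.

```python
def desired_instances(current, avg_cpu, min_n=2, max_n=10, high=0.75, low=0.25):
    """Reactive rule: add a node when busy, release one when idle.

    avg_cpu is the average CPU utilization across nodes, from 0.0 to 1.0.
    The thresholds and limits are hypothetical defaults.
    """
    if avg_cpu > high:
        target = current + 1  # scale out under load
    elif avg_cpu < low:
        target = current - 1  # release a node when demand drops
    else:
        target = current      # within the comfortable band, do nothing
    # Never go below the minimum kept for availability or above the cost cap.
    return max(min_n, min(max_n, target))


print(desired_instances(3, 0.90))  # busy: one more node
print(desired_instances(3, 0.10))  # idle: one fewer node
```

A proactive action would differ only in its trigger: instead of reacting to a measured signal, it would set the target ahead of a known event such as a scheduled daily peak.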


Horizontally Scaling Compute Pattern

Cloud Scaling



Managing Session State


sticky session or session affinity
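
Session affinity means every request in a session is routed to the same node. A minimal sketch of one way a load balancer could do this is shown below, assuming routing by a stable hash of the session identifier. The node names and the function name are hypothetical, and real load balancers often use cookies instead.

```python
import hashlib
from typing import Sequence


def sticky_node(session_id: str, nodes: Sequence[str]) -> str:
    """Pick the node for a session using a stable hash of its identifier."""
    if not nodes:
        raise ValueError("at least one node is required")
    # hashlib gives the same digest in every process, unlike built-in hash().
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


nodes = ["web-0", "web-1", "web-2"]  # hypothetical node names
first = sticky_node("session-abc", nodes)
second = sticky_node("session-abc", nodes)
print(first == second)  # the same session always lands on the same node
```

The weakness of this approach shows up when nodes come and go: changing the node list can remap a session to a node that does not hold its state. Keeping session state outside the web nodes avoids that dependency, at the cost of an extra lookup per request.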

Managing Many Nodes


  • capacity planning for large scale
  • sizing virtual machines
  • failure is partial
  • operational data collection


Queue-Centric Workflow Pattern
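
The slide gives only the pattern's name, so the following is a sketch of the general idea under stated assumptions: the web tier places a command message on a queue and returns immediately, and a separate worker tier processes messages at its own pace. Python's in-process `queue.Queue` stands in for a cloud queue service here, and the function names and message fields are hypothetical.

```python
import queue


def submit_order(work_queue, order_id):
    """Web tier: enqueue a command and return without waiting for the work."""
    work_queue.put({"command": "process_order", "order_id": order_id})


def drain(work_queue, processed):
    """Worker tier: take messages until the queue is empty and handle each."""
    while True:
        try:
            message = work_queue.get_nowait()
        except queue.Empty:
            break
        # Stand-in for real work, such as charging a card or sending email.
        processed.append(message["order_id"])
        work_queue.task_done()


orders = queue.Queue()
for order_id in (1, 2, 3):
    submit_order(orders, order_id)

handled = []
drain(orders, handled)
print(handled)
```

Because the two tiers communicate only through the queue, each can be scaled horizontally on its own. A real cloud queue typically delivers a message at least once, so a production worker would also need to tolerate duplicates, which this in-process stand-in does not model.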

Thank you