2024-09-05
Switch Between Serverless and Dedicated VMs on GCP
One of our goals with LaunchFlow is to make the interaction between your application and the infrastructure that runs it as simple as possible. LaunchFlow now supports deploying services to VMs on Google Compute Engine (GCE)! This means you can now seamlessly switch between compute platforms on GCP with one line of code:
Cloud Run:
import launchflow as lf
serverless = lf.gcp.CloudRunService("serverless")
Compute Engine:
import launchflow as lf
dedicated = lf.gcp.ComputeEngineService("dedicated")
To Serverless or not to Serverless?
With LaunchFlow’s support of GCE, you can now choose between serving your application on a serverless Cloud Run runtime or on dedicated Compute Engine VMs. There are pros and cons to both approaches, and what you choose largely depends on your application. Luckily, with LaunchFlow you're not locked into your first choice: you can switch between the different architectures with a single line of code, and you can even use a different architecture in each environment. That means you can massively reduce your development environment bill while keeping your production service as performant as possible.
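As a minimal sketch of per-environment switching (using the `lf.environment` check that appears later in this post, and the two service classes shown above):

```python
import launchflow as lf

# Sketch: serverless in dev to keep the bill low, dedicated VMs in
# prod to avoid cold starts. The service name here is illustrative.
if lf.environment == "prod":
    service = lf.gcp.ComputeEngineService("my-service")
else:
    service = lf.gcp.CloudRunService("my-service")
```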
Why Serverless?
Serverless is great if you're building a web server that expects little traffic and can be offline most of the time. However, there are some complications to be aware of, especially when building Python applications. First, if your service requires any state, you need to use an external service such as a remote database (Postgres) or a remote bucket (GCS or S3), because anything you write to disk will be wiped when the serverless runtime scales down. Second, you always need to keep an eye on cold start times. With Python, simply importing a library can increase your cold start times, meaning users wait too long for a response. If either of these is an issue, it might be time to look at deploying your service to a dedicated VM.
Pros:
- Cheaper for low traffic services
- Generally easier to maintain / setup
Cons:
- Cold start times can be problematic especially for Python applications
- Completely stateless
- Often have timeout limits for requests
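One common mitigation for Python cold starts (a general Python technique, not LaunchFlow-specific) is to defer heavy imports until first use, so startup stays fast and the import cost is paid on the first request instead. A minimal sketch, using `decimal` as a stand-in for a heavy dependency like pandas:

```python
import importlib

_heavy = None  # cached module, loaded lazily


def get_heavy():
    """Import the heavy dependency on first use instead of at startup."""
    global _heavy
    if _heavy is None:
        _heavy = importlib.import_module("decimal")  # stand-in for a slow import
    return _heavy
```

The module-level import cost disappears from the cold start path; the first request that calls `get_heavy()` pays it once, and every later call hits the cache.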
Why Dedicated VMs?
Dedicated VMs are great when you have a steady stream of traffic or your application requires some state. If you have a steady stream of traffic, dedicated VMs will actually end up being significantly cheaper than a serverless architecture. The most obvious issue with serverless is the cold start time. A common pattern on Cloud Run is to keep one minimum instance running to reduce cold start latency, but if you do this you can expect to spend about 3x the cost of dedicated VMs.
Pros:
- Easier to build stateful applications
- Minimal cold start times
Cons:
- May be more expensive depending on workload
- Requires more setup and regular maintenance
How does it work?
So you’ve decided you want to use LaunchFlow and GCE to serve your Python application, but how does it work? You simply define your service in a Python file, run lf deploy, and LaunchFlow takes care of the rest. As with all LaunchFlow Services and Resources, the GCE service is pre-configured with reasonable defaults but is fully customizable to whatever your needs may be.
import launchflow as lf

# Define a GCE service with one line of code
my_service = lf.gcp.ComputeEngineService("dedicated-vm")

if lf.environment == "prod":
    # Customize across environments
    my_service = lf.gcp.ComputeEngineService(
        "dedicated-vm",
        machine_type="n1-standard-2",
        region="us-central1",
    )
$ lf deploy dev
When you use LaunchFlow’s GCEService, the CLI will spin up the following GCP resources under the hood:
Managed Instance Group
The managed instance group is a container for the VMs that serve your Python application. It is responsible for ensuring that the VMs stay healthy and that there are enough of them to serve the incoming requests.
Health Check
The health check continuously pings your server to ensure it is still running. If a VM ever reports as unhealthy, traffic is immediately redirected to a different VM and the defective VM is replaced.
Autoscaler
The autoscaler defines the shape of your service: the minimum number of VMs, the maximum number of VMs, and when VMs should be added or removed. It works in tandem with the Managed Instance Group to ensure that your service can always meet the demands of incoming traffic.
Custom Domain Load Balancer
Optionally you can have LaunchFlow map a custom domain to your GCE service. This will spin up a load balancer that redirects traffic to your virtual machines whenever users visit your domain.
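As a sketch of what enabling this might look like (the `domain` parameter name here is an assumption for illustration; check the GCE service documentation for the exact field):

```python
import launchflow as lf

# Hypothetical: attach a custom domain so LaunchFlow provisions the
# load balancer and routes traffic from the domain to your VMs.
my_service = lf.gcp.ComputeEngineService(
    "dedicated-vm",
    domain="app.example.com",  # assumed parameter name
)
```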
Full Control Over Your Resources
You always have full access to these underlying resources if you need to make any changes. For example, here's how you can customize the autoscaler to scale based on CPU utilization:
import launchflow as lf
instance_group = lf.gcp.RegionalManagedInstanceGroup("instance-group")
autoscaler = lf.gcp.RegionalAutoscaler(
    "autoscaler",
    group_mananger=instance_group,
    autoscaling_policies=[
        lf.gcp.regional_autoscaler.AutoscalingPolicy(
            min_replicas=1,
            max_replicas=10,
            cpu_utilization=lf.gcp.regional_autoscaler.CPUUtilization(target=0.8),
        )
    ],
)

my_service = lf.gcp.ComputeEngineService(
    "dedicated-vm",
    autoscaler=autoscaler,
)
How do I get started?
If you’re new to LaunchFlow we recommend you start with our Get Started guide. This will help you understand how LaunchFlow works. Once you’ve completed that you can reference the GCE service documentation to add a GCE service to your application.
As always, reach out to us on Slack if you have any questions!