Google Cloud Platform cron jobs sourced from Kubernetes

--

It’s not a new question. In fact, it has been asked, and answered, several times. As you build services on top of the Google Cloud Platform, you will inevitably need some sort of “cron” equivalent, whether for database maintenance (like a garbage collector) or for servicing an asynchronous job queue. As features get added and complexity grows, cloud services will need tasks that are serviced at regular intervals.

Recommended Architecture

Google seems unanimous about the recommended architecture. The same architecture is documented in the Google Cloud Platform documentation, on the Firebase blog, and mentioned in the Google Cloud Platform Podcast: deploy a small web server to Google App Engine that listens for requests triggered by the GAE cron service and, in turn, publishes messages to PubSub topics. They even provide a nice sample application written in Python that is ready to be deployed (and another for Firebase).
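For illustration, the GAE half of that recommendation is driven by a cron.yaml that tells the GAE cron service which URL on the small web server to hit, and when; the handler behind that URL then publishes to PubSub. A minimal sketch might look like this (the URL and schedule are illustrative, not taken from Google’s sample):

```yaml
# cron.yaml — points the GAE cron service at a handler on the web server.
# The handler behind /publish/maintenance-tick then publishes to a PubSub topic.
cron:
- description: "publish a maintenance tick to PubSub"
  url: /publish/maintenance-tick
  schedule: every 1 hours
```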

And why shouldn’t they be unanimous? The proposed architecture has a lot of benefits when running in the cloud. The benefits stem from the decoupling of events (scheduled moments in time) and actions (stuff that gets done) with PubSub. By using PubSub to decouple events from actions, we create scalable cloud cron jobs: multiple actions can be triggered by a single event and, conversely, a single action can be triggered by multiple events. In addition, changes to scheduling have no impact on actions, and changes to actions have no impact on scheduling.


So what’s wrong with that?

At Hubware, our customer-facing APIs are all written in node.js and scaled horizontally by our Kubernetes cluster hosted on Google Kubernetes Engine. However, we have recently been adding new functionality using serverless Cloud Functions. We then combine Cloud Functions with PubSub to distribute messages between our various services. Working with Cloud Functions allows us to scale (or not) certain functionality independently of our APIs deployed in the Kubernetes cluster.

The common thread linking all of this together is node.js. Working in node has allowed us to come to market extremely fast and makes it easy to keep adding the new functionality that our customers want, whether deployed in Kubernetes or Cloud Functions. This has been a magic recipe: happy customers, happy sales team, happy product team, happy development team… ahhhhh.


Given the above, when we needed to add cloud cron jobs, deploying a Python application to Google App Engine made the entire team cringe. We did not want any new maintenance overhead (applications running on GAE), nor did we want to maintain a new Python application.

A better way…

Fortunately, we found a better way. At least, better for us! The secret sauce was to replace the GAE deployment and cron scheduling with Kubernetes CronJobs (available since version 1.8, on GCP since October 2017). This gave us all the benefits of using PubSub to decouple events from actions, but didn’t add any new infrastructure to maintain in our production environment.


The design is further simplified because a Kubernetes cron job that publishes to a PubSub topic can be defined in a single config file! No Python web server is required to listen for events and then publish them to PubSub. Instead, by taking advantage of Google’s gcloud-sdk Docker image, we can publish directly to the PubSub topic without any intermediary.

The file below illustrates how to publish to a PubSub topic directly in a Kubernetes cron job config file. Easy, cheesy!
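A minimal sketch of such a config might look like the following. The topic name, schedule, and resource names are illustrative, the apiVersion depends on your cluster version, and the cluster’s service account needs permission to publish to the topic:

```yaml
# A Kubernetes CronJob that publishes directly to a PubSub topic using
# Google's gcloud-sdk image — no intermediary web server involved.
apiVersion: batch/v1beta1        # batch/v1beta1 in Kubernetes 1.8+; batch/v1 from 1.21
kind: CronJob
metadata:
  name: maintenance-tick
spec:
  schedule: "0 * * * *"          # standard cron syntax: every hour, on the hour
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: publish-tick
              image: google/cloud-sdk:alpine
              command:
                - gcloud
                - pubsub
                - topics
                - publish
                - maintenance-tick      # the PubSub topic (illustrative)
                - --message=tick        # payload delivered to subscribers
```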

Leveraging node.js for cron job actions

Bringing node back into the story: Google Cloud Functions can be triggered by messages published to PubSub topics. The great thing about Cloud Functions is that they run on node! This allows us to leverage the power of node, its ecosystem, and our existing expertise to quickly create cron job actions for almost anything. Need to do some DB maintenance, send a Slack notification, or just order a pizza… it’s easy with node!
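As a sketch of what such an action can look like (using the Node 8+ background-function signature; the function name, topic, and “maintenance” logic are illustrative):

```javascript
// index.js — a background Cloud Function triggered by a PubSub topic.
exports.onMaintenanceTick = async (message) => {
  // PubSub message data arrives base64-encoded.
  const payload = message.data
    ? Buffer.from(message.data, 'base64').toString()
    : '';

  console.log(`cron tick received: ${payload}`);

  // Do the real work here: clean up the database, post to Slack,
  // order that pizza… whatever the scheduled action needs to be.
};
```

Deploying it with something like gcloud functions deploy onMaintenanceTick --trigger-topic maintenance-tick --runtime nodejs10 wires it to the topic, so the function runs every time the CronJob above publishes a message.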


Living the good life on the Google Cloud Platform

Combining Kubernetes cron jobs with PubSub allowed us to create scalable, distributed, cloud-worthy cron jobs without adding any significant maintenance burden. Making a good thing better, we were able to leverage our existing node expertise!

I hope this magic recipe can help you out when you decide you need cron jobs on the Google Cloud Platform.

Leave your questions and comments below.
