
At the end of December 2020, AWS released container support for Lambda, which brings new capabilities to Lambda. Before, we had to zip our code, upload it to a bucket, and create a Lambda function that took the code and ran it in the Lambda environment. So, if your code required interaction with an external program like a shell, git, pg_dump, etc., it was not possible.

With Lambda container support, you build a Docker image with all the dependencies your application needs, upload it to the AWS container registry, create a Lambda function, and run it. The image size can be up to 10 GB.

What we want to achieve

We have a web application connected to a database running in production. The application inserts and retrieves data requested by the users. It is paramount to back up the database frequently, both to avoid data loss and to be able to restore in case you could not prevent it. Here is a bash script to back up a PostgreSQL database.

#!/bin/bash

# day_month_year_hour_minute
TODAY=`date +"%d_%m_%Y_%H_%M"`

FILENAME="db_name-${TODAY}.tar"

echo "${FILENAME}.gz"

# Dump the data and the schema (data definitions).
pg_dump --dbname=postgresql://db_user:db_password@db_host:db_port/db_name -F t > "/tmp/${FILENAME}"

# Exit with an error if the dump failed; otherwise continue to compression.
if [ $? -ne 0 ]; then
  exit 1
fi

# compress the SQL dump file
gzip "/tmp/${FILENAME}"

Replace db_user, db_password, db_host, db_port, and db_name with valid database credentials.
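Hardcoding credentials in the script is fine for a demo, but a safer variant reads them from environment variables, which you can later set in the Lambda configuration. Here is a minimal sketch, assuming variables named DB_USER, DB_PASSWORD, DB_HOST, DB_PORT, and DB_NAME (these names are an assumption, pick your own):

```shell
#!/bin/bash
# Sketch: build the pg_dump connection string from environment variables
# instead of hardcoding credentials. DB_USER, DB_PASSWORD, DB_HOST,
# DB_PORT and DB_NAME are assumed to be set (e.g. in the Lambda config).
CONNECTION_STRING="postgresql://${DB_USER}:${DB_PASSWORD}@${DB_HOST}:${DB_PORT}/${DB_NAME}"

TODAY=$(date +"%d_%m_%Y_%H_%M")
FILENAME="${DB_NAME}-${TODAY}.tar"

echo "Would dump ${CONNECTION_STRING} to /tmp/${FILENAME}"
# pg_dump --dbname="${CONNECTION_STRING}" -F t > "/tmp/${FILENAME}"
```

With this approach, rotating the password only requires updating the Lambda environment variables, not rebuilding the image.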

We want to trigger a Lambda function every Saturday at 02:30 AM to run this bash script (backup.sh) and upload the backup somewhere. Note that we will only focus on the database backup. Check out this tutorial to see how to upload a file to Google Drive.

Implementation steps

Starting from this, we will create a Docker image with pg_dump installed, embed the backup script file inside, and the source code written in Node.js. Once the image is created, we will push it to AWS Elastic Container Registry (ECR) and finally, create a Lambda function that uses this Docker image using the Serverless Framework.

Prerequisites

Before getting our hands dirty writing code, make sure your development environment has the following tools installed and configured:

  • An AWS account (a free tier is enough)
  • AWS CLI installed with credentials set
  • Docker installed
  • Node.js 12+

Create Serverless project

Serverless is a tool that makes it easier to create, test, and deploy a project that uses a serverless architecture. The first step is to install it on our computer.

npm install -g serverless


serverless -v


Framework Core: 2.55.0
Plugin: 5.4.3
SDK: 4.3.0
Components: 3.15.1

Now create a Node.js project with the command below:

serverless create --template aws-nodejs-docker --path backup-db

cd backup-db

The project generated has the following structure:

├── Dockerfile
├── README.md
├── app.js
└── serverless.yml

Create a file backup.sh and add the code for database backup. Save and exit. Now, the project folder is like this:

├── Dockerfile
├── README.md
├── app.js
├── backup.sh
└── serverless.yml

Create the Docker image

The container image can be created from one of the base images AWS provides that already implement the different runtimes. You can also create your own custom image, but it must implement the Lambda Runtime API for this to work. We will choose the second option to show how to implement the Lambda Runtime API on a custom Docker image.

Open the Dockerfile and replace the content with the following:

ARG FUNCTION_DIR="/function"

FROM node:14-buster

RUN apt-get update && \
    apt-get install -y \
    g++ \
    make \
    cmake \
    autoconf \
    libtool \
    wget \
    openssh-client \
    gnupg2

RUN wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add - && \
    echo "deb http://apt.postgresql.org/pub/repos/apt/ buster-pgdg main" | tee /etc/apt/sources.list.d/pgdg.list && \
    apt-get update && apt-get -y install postgresql-client-12


ARG FUNCTION_DIR

RUN mkdir -p ${FUNCTION_DIR} && chmod -R 755 ${FUNCTION_DIR}

WORKDIR ${FUNCTION_DIR}

COPY package.json .
RUN npm install

COPY backup.sh .
RUN chmod +x backup.sh
COPY app.js .

ENTRYPOINT ["/usr/local/bin/npx", "aws-lambda-ric"]
CMD ["app.handler"]

Let’s explain what happens above:

  • We store the project's path inside the container in a variable since it is used in several places.
  • We use the Docker node:14-buster image as the base image.
  • We install the dependencies required to install postgresql-client.
  • We install postgresql-client, which contains pg_dump, the tool required to perform the backup.
  • We create the project directory, add package.json, and install the dependencies.
  • We copy the backup.sh file, make it executable, and copy the app.js file.
  • Finally, we define the command to execute when the container is launched.

Implement the Lambda Runtime API

In the ENTRYPOINT instruction of the Dockerfile, the first argument calls an executable, aws-lambda-ric, from the node_modules. It is the Lambda Runtime Interface Client, a lightweight interface that allows your runtime to receive requests from and send requests to the Lambda service. Since we built our Docker image without an AWS Lambda base image, we must add this package through npm to make our image Lambda-compatible.

npm init -y

npm install aws-lambda-ric
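After these two commands, the generated package.json should list aws-lambda-ric as a dependency, roughly like this (the version number is illustrative, yours may differ):

```json
{
  "name": "backup-db",
  "version": "1.0.0",
  "main": "app.js",
  "dependencies": {
    "aws-lambda-ric": "^2.0.0"
  }
}
```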

Build the Docker image. Note that you must be in the project root directory:

docker build -t backup-db:v1 .

Test the Lambda locally

Now the Docker image is ready, but we want to test it locally before deploying the Lambda in production to ensure it works as expected. With Serverless, we usually use sls invoke local -f <function_name>, but that will not work in this case because we don't have the Lambda execution context. Fortunately, AWS provides a Runtime Interface Emulator (RIE) which, as its name indicates, emulates the Lambda execution context. Let's install it in the home directory:

mkdir -p ~/aws

# Download the RIE
curl -Lo ~/aws/aws-lambda-rie https://github.com/aws/aws-lambda-runtime-interface-emulator/releases/latest/download/aws-lambda-rie

# Make it executable
chmod +x ~/aws/aws-lambda-rie

  • Open the first terminal and run the command below to launch a container with RIE.
docker run -v ~/aws:/aws -it --rm -p 9000:8080 --entrypoint /aws/aws-lambda-rie backup-db:v1 /usr/local/bin/npx aws-lambda-ric app.handler

  • Open a second terminal and run the code below to trigger the lambda execution:
curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{}'

Here is the result we get:

Trigger the Lambda function in local

In the second terminal, we get the response returned by the Lambda function. It comes from the code inside the file app.js.

Update the handler to back up the database

We have a bash script we want to run from Node.js code. We will use a built-in Node.js module called child_process, which gives the ability to create a sub-process of the current process to run a command, retrieve the result, and send it back to the parent process.

Replace the content of app.js with the code below:

'use strict';

const childProcess = require("child_process");
const path = require("path");

const backupDatabase = () => {
  const scriptFilePath = path.resolve(__dirname, "./backup.sh");

  return new Promise((resolve) => {
    childProcess.execFile(scriptFilePath, (error) => {
      if (error) {
        console.error(error);
        resolve(false);
        return;
      }

      resolve(true);
    });
  });
};

module.exports.handler = async (event) => {
  const isBackupSuccessful = await backupDatabase();

  if (isBackupSuccessful) {
    return {
      status: "success",
      message: "Database backup completed successfully!"
    };
  }

  return {
    status: "failed",
    message: "Failed to backup the database! Check out the logs for more details"
  };
};

Now test by following the process described previously. We get the output below:

Test the database backup from the local environment

One thing we can do to make sure the backup succeeded is to open a shell inside the container and check the content of the /tmp folder:

docker ps #To view the container's name

docker exec -it <container_name> /bin/bash

cd /tmp
ls -lh

We got the following output:

View the database backup inside the Docker container

As we can see, the compressed backup file has a size of 5 kilobytes.

Note: The database host must be remote; if you have a PostgreSQL database installed on your computer and use localhost or 127.0.0.1 as the host, the connection will fail because the code inside the Docker container doesn't know about our physical host. (On Docker Desktop, a container can reach the host machine through the special hostname host.docker.internal.)

Deploy and test the Lambda

Now that our code works as expected in the local environment, we will deploy it to production in three steps.

Step 1: Push the Docker image into a container registry. We will use AWS Elastic Container Registry (ECR).

Log into your AWS account, create an ECR repository backup-db , and then execute the commands below to push the image.

docker build -t backup-db:v1 .

aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <ecr_repository_url>

docker tag backup-db:v1 <ecr_repository_url>/backup-db:latest

docker push <ecr_repository_url>/backup-db:latest

Step 2: Update serverless.yml with the URL of the Docker image containing the source code. We will also add a schedule that triggers this Lambda function every Saturday at 2:30 AM. AWS schedule expressions use a six-field cron syntax (minutes, hours, day-of-month, month, day-of-week, year), so cron(30 2 ? * SAT *) fires at 02:30 AM every Saturday.

service: backup-db

frameworkVersion: '2'

provider:
  name: aws
  region: eu-west-3
  lambdaHashingVersion: 20201221
  iam:
    role:
      statements:
        - Effect: "Allow"
          Action:
            - "ecr:InitiateLayerUpload"
            - "ecr:SetRepositoryPolicy"
            - "ecr:GetRepositoryPolicy"
          Resource: [ "arn:aws:ecr:eu-west-3:<account_id>:repository/*" ]

functions:
  backup-db:
    image: <account_id>.dkr.ecr.eu-west-3.amazonaws.com/backup-db@sha256:8e6baf5255e3ab7c5b5cb1de943c2fb67cf4262afde48f7249abcebd8d6a7a01
    timeout: 900
    events:
      - schedule: cron(30 2 ? * SAT *)

Step 3: Deploy our lambda function using the Serverless Framework.

sls deploy

Wait for Serverless to complete the deployment, then go to the Lambda page in the AWS console. If everything is done correctly, you will have an output similar to the one below:

Trigger the Lambda function from the AWS console

Conclusion

Throughout this tutorial, we have seen how to take advantage of AWS Lambda's Docker image support to back up a database. To go further, you can upload the backup to cloud storage. I showed how to upload a file to Google Drive here.

Find the source code of this project in the GitHub repository.

Follow me on Twitter or Subscribe to my newsletter to keep updated with new content.