In this part of the series, I’ll automate the site deploy process as described by Mike Tabor.

In his tutorial, Mike explains how to deploy a static site to an S3 bucket on AWS and then serve it with Cloudflare.

Cloudflare is a versatile service with many perks included in its free plan, such as acting as a reverse proxy and providing DNS management for your domain. Moreover, we don’t have to generate an SSL certificate ourselves, since Cloudflare can provide one.

To recap, we automated the site build process in the previous post and now we have static content generated by Jekyll. In this post, we’ll upload the content to AWS automatically with a Docker container.


Disclaimer:

I was not affiliated with either AWS or Cloudflare while writing this post. Use the services at your discretion, as they may incur costs. Familiarize yourself with the plans & offers for all the services listed above.


Prerequisites

  • You own a domain and can manage its DNS entries
  • You have an AWS account set up
  • You installed the AWS CLI tool on your computer
  • You have a Cloudflare account set up
  • Some knowledge of Python (we’ll use the boto3 package to deploy the site)

Set Up

Follow the steps provided by Mike in his post. A small remark: you can skip configuring the bucket policy, as your bucket will be publicly accessible anyway.

Before using the AWS CLI, AWS recommends configuring a special IAM account dedicated to programmatic access. In our case, this account will be dedicated to the deployment automation. Give it as few permissions as possible (for example, only allowing list/read/update/write/delete operations at the S3 bucket level, as in the policy sketch below).
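For illustration, a least-privilege policy for such an account could look roughly like this. It’s only a sketch (www.yourdomain.com is a placeholder for your actual bucket), and you may need to adjust the actions to your setup:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": "arn:aws:s3:::www.yourdomain.com"
        },
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
            "Resource": "arn:aws:s3:::www.yourdomain.com/*"
        }
    ]
}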

After creating the account and assigning access to it, generate the access keys and add them to the CLI on your computer by issuing the aws configure command. When you run the command for the first time, it will prompt you for the credentials you just generated.
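The interaction looks roughly like this (the values shown are placeholders for your own keys and preferred region):

$ aws configure
AWS Access Key ID [None]: <your access key id>
AWS Secret Access Key [None]: <your secret access key>
Default region name [None]: us-east-1
Default output format [None]: json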

The credentials you entered on your machine will be later passed to the deployment container.
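Under the hood, aws configure stores the keys in ~/.aws/credentials in an INI-like format, roughly:

[default]
aws_access_key_id = <your access key id>
aws_secret_access_key = <your secret access key>

This is the file we’ll mount into the deployment container later on.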

The CI Steps

We’ll automate some of the steps of the CI/CD (continuous integration/continuous deployment) process:

  1. Building the site
  2. Automatically uploading the generated content to the S3 bucket

The CI/CD process includes some additional steps that we’ll review further in the series (stay tuned).

Building the Site’s Contents

Let’s recall the previous post, where we ran Jekyll commands inside a container:

docker run --name siteBuilder --rm -v $PWD:/site sitebuilder jekyll build

Running this command takes the Markdown files and other resources, and generates a site which is ready for deployment under the _site subdirectory.
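Depending on your theme and plugins, _site will contain something along these lines (an illustrative sketch, not your exact output):

_site/
├── index.html
├── about/
│   └── index.html
├── assets/
│   └── main.css
└── feed.xml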

The Deployment Process

I’d like to run the deployment process inside a Docker container. Hence, we need to prepare a custom image designated for deployment.

We’ll create a special directory called cicd under the site’s directory: mkdir cicd

While performing all the steps below, we’ll remain in the cicd directory.

The Dockerfile

Let’s begin with the Dockerfile itself, so we have a clearer picture of what our deployment script needs to do.

For now, let’s create an empty file named deployment.py. We’ll get back to it in the next step.

Let’s also create the requirements.txt file with the following contents:

boto3==1.20.15

Finally, let’s add the Dockerfile.deploy Dockerfile and populate its contents as specified below:

FROM python:3.9.9-alpine3.14
ENV S3_BUCKET_NAME=www.yourdomain.com
ENV SITE_CONTENTS_PATH=/site
ENV AWS_SHARED_CREDENTIALS_FILE=/deploy/credentials
RUN apk update && apk upgrade
WORKDIR /deploy
COPY requirements.txt ./
RUN pip install -r requirements.txt
COPY deployment.py ./
RUN chmod 755 deployment.py
ENTRYPOINT ["./deployment.py"]

Let’s dissect the Dockerfile now.

The Environment Variables

It’s good practice to avoid hard-coding values. Hence, I defined three environment variables from which the relevant data will be read.

Note that boto3 searches for the AWS credentials on the host OS or in the container it runs in. One of the lookup methods is the AWS_SHARED_CREDENTIALS_FILE environment variable, which points to the credentials file.

The other two ENV statements specify the S3 bucket to which the files will be uploaded and the path from which these files should be taken, respectively.

Replace the bucket name (S3_BUCKET_NAME) with the actual bucket name.

Another advantage of using environment variables is that their values can be overridden at container runtime, as shown in the example below.
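For instance, if you’d rather not bake the bucket name into the image, you could override it when starting the container. A sketch with docker run (sitedeployment is the image we build further below, and www.anotherdomain.com is a placeholder):

docker run --rm \
 -e S3_BUCKET_NAME=www.anotherdomain.com \
 -v $PWD/_site:/site \
 -v ~/.aws/credentials:/deploy/credentials:ro \
 sitedeployment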

As for the rest of the Dockerfile, we employ the same methodology of containerizing an app as in the first post of the series.

Namely, we chose the most lightweight base image and installed the needed dependencies on top of it. Finally, we copied the code into the image and set the main program to run at the container’s start by stipulating the container’s ENTRYPOINT.

The Deployment Script

The Dependencies

If we revisit the requirements.txt file, we see that currently there’s only one dependency and it’s the boto3 package.

Show Me the Code

#!/usr/local/bin/python
from pathlib import Path
import os
import boto3
import sys
from botocore.exceptions import ClientError
import mimetypes

ENV_S3_BUCKET_NAME = 'S3_BUCKET_NAME'
ENV_SITE_CONTENTS_PATH = 'SITE_CONTENTS_PATH'
site_contents_file_path = Path(os.environ.get(ENV_SITE_CONTENTS_PATH))


def get_mimetype(object_path):
    content_type, encoding = mimetypes.guess_type(object_path)
    if content_type is None:
       return 'binary/octet-stream'
    return content_type
    

class BucketManager:

    def __init__(self, bucket_name):
        self._s3_client = boto3.client('s3')
        self._bucket_name = bucket_name

    def upload_files(self, object_path):
        # If the object is taken from the parent directory, then the key is the file name
        bucket_key = object_path.name
        # If the object originates from one of the subdirectories, its key would be the relative path
        # AWS S3 uses / to derive paths and create folders in the bucket itself
        if len(object_path.parts) > 3:
            bucket_key = "/".join(object_path.parts[2:])
        try:
            if object_path.is_file():
                sys.stdout.write(f'Uploading {bucket_key}\n')
                content_type = get_mimetype(object_path)
                self._s3_client.upload_file(str(object_path), self._bucket_name, bucket_key,
                    ExtraArgs={'ContentType': content_type})
            if object_path.is_dir():
                # Recurse into subdirectories and upload their contents as well
                for os_obj in object_path.iterdir():
                    self.upload_files(os_obj)
        except ClientError as e:
            sys.stderr.write(str(e))
            sys.stderr.flush()
            sys.exit(1)  # return exit code 1 to the container
        return True

def upload_site_contents():
    s3_bucket_name = os.environ.get(ENV_S3_BUCKET_NAME)
    s3_manager = BucketManager(s3_bucket_name)
    s3_manager.upload_files(site_contents_file_path)
    sys.stdout.flush()
    

upload_site_contents()

The above script retrieves the environment variables from the container’s OS and uploads the files to the S3 bucket accordingly.

I used the BucketManager class in order to reuse an existing session, so we make fewer connections to AWS in the process.

The script is written based on the examples provided by AWS.
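To make the bucket key derivation a bit more concrete, here is roughly what the path logic does for a nested file (a standalone illustration, not part of the deployment script; the file path is hypothetical):

from pathlib import Path

object_path = Path('/site/css/style.css')  # a hypothetical file inside the mounted volume
print(object_path.parts)                   # ('/', 'site', 'css', 'style.css') -> more than 3 parts
print("/".join(object_path.parts[2:]))     # css/style.css -> the key used inside the bucket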

A Note on Uploading the Files Programmatically

Based on the Content-Type header in the response, our browsers know how to deal with different types of data (for example, the text/html value tells the browser to render an HTML page for us, whereas text/css prompts it to style the page). Failing to specify the proper Content-Type value will result in unexpected browser behavior while rendering your site (e.g. not styling the pages at all).

In most cases, web servers can automatically detect the MIME type of the file being served to the user.

Unfortunately, as per this Stack Overflow response, the S3 service doesn’t add this metadata automatically if the files are uploaded by the boto3 client. Hence, you have to stipulate this piece of data in your script.

As a result, I also implemented the get_mimetype function to get the file’s MIME type from the mimetypes built-in module in Python. Afterwards, I pass this function’s output to the S3 client by adding the ExtraArgs argument to the upload_file function.
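As a quick illustration of what the mimetypes module returns (and why the binary/octet-stream fallback exists):

import mimetypes

print(mimetypes.guess_type('index.html'))  # ('text/html', None)
print(mimetypes.guess_type('style.css'))   # ('text/css', None)
print(mimetypes.guess_type('archive'))     # (None, None) -> no extension, so we fall back to binary/octet-stream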

To finish this step, let’s build the image by running the following command:

docker build -t sitedeployment -f Dockerfile.deploy .

Run the Deployment Image

Let’s move back up to the site’s directory now and run the deployment container:

docker run \
-it --name deploy -v $PWD/_site:/site \
 -v ~/.aws/credentials:/deploy/credentials:ro --rm \
 sitedeployment

Note that we’re mounting the AWS credentials file stored locally on our computer into the container. In the Dockerfile, we stipulated a custom location for the credentials, so boto3 knows where to find them.
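Once the container finishes, you can sanity-check the result with the AWS CLI (the bucket name is a placeholder for yours):

aws s3 ls s3://www.yourdomain.com/ --recursive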

Bringing Everything Together

As we discussed previously, running the container from the command line can be cumbersome.

Let’s relay all the settings to the docker-compose file we created in the previous part:

version: "3"

services:
    site:
        image: nginx:1.21.4-alpine
        volumes:
          - $PWD/_site:/usr/share/nginx/html
          - ./nginx.conf:/etc/nginx/nginx.conf:ro
        ports:
          - 80:80
        container_name: site
    
    sitebuilder:
        image: sitebuilder
        volumes:
            - $PWD:/site
        container_name: sitebuilder
        entrypoint: ["jekyll", "build"]

    deployment:
        image: sitedeployment
        volumes:
            - $PWD/_site:/site
            - ~/.aws/credentials:/deploy/credentials:ro

As for the sitebuilder service, note that an entrypoint statement was added. This way, Jekyll becomes the container’s main process, so when it finishes, its exit code (status) becomes the container’s exit code.

The exit code will be used in further posts of the series to automatically determine whether the build was successful.

To build the site, you can run the following command:

docker-compose up sitebuilder

This will run the container with all the mounted volumes and build the site.
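If you’d like to inspect that exit code yourself, one way is to query the stopped container (a sketch relying on the container_name from the compose file, before you remove the containers):

docker inspect -f '{{.State.ExitCode}}' sitebuilder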

To test the site, run the following command:

docker-compose up site

And finally, to deploy the site, issue this command:

docker-compose up deployment

Don’t forget to remove the containers with the docker-compose down command.

The source code is available here

Stay tuned for more posts…