Using Backblaze as an AWS S3 API compatible alternative

Words by Gonçalo Dias
Published on 29.07.2020

Storage is a huge part of MD3's infrastructure, and backups are a major concern for us, which is why we have been using AWS S3 for a long time.

Mediatree's core business is media broadcast monitoring, which involves recording and processing huge amounts of media files. At the moment, the infrastructure has around 500 international TV and radio channels in production. We record each channel in its native quality and process the recordings to extract metadata (EPG information, thumbnails, closed captions and speech to text), which we make available to our customers.

MD3 is responsible for the administration of the IT infrastructure of our parent company, Mediatree.

Mediatree stores up to one year of this content, which makes storage a huge part of our infrastructure. At the moment, we have around 500TB of data in our storage, split between large files (native .ts files in raw quality) and a huge number of small files (around 300 million, mainly .jpeg, .srt and .xml).

Storage solutions are typically designed to handle one of these cases well (large files or small files), which puts us in an extreme position: we need to handle both types of data with similar performance.

Because of these storage requirements, backups are an even greater concern for us. Our platform runs 24/7 with very high SLAs, and even a downtime of a few minutes has an impact on production, because our customers expect us to always deliver the full content without any cut in the video.

Our approach to backups has been to follow the 3-2-1 Rule, where we try to always have 3 copies of the data, in 2 different formats and 1 offsite.


To address the offsite part, we have been using AWS S3 for a long time. We run daily backups from our main datacenter into a few AWS S3 buckets. The backups are mainly the media files and exported disks of the VMs that we run in our datacenter.

We are constantly searching for newer solutions to improve our services and, if possible, reduce our costs. In 2018, our team went to the NAB Show in Las Vegas to meet a few customers and to look for new solutions and ways to improve our services.

One of the stands there belonged to Backblaze. We had an interesting discussion with a person from Backblaze, but at the time they were only based in the USA and had no datacenters in Europe, so it made no sense for us to use their services.

We liked the company's approach to the storage problem they were trying to solve (a cheaper yet reliable alternative to other cloud storage services like AWS S3), and we kept following their progress.

For us, the first major step forward came in August 2019, when Backblaze opened its first European datacentre. In late 2019, we started testing the Backblaze B2 service. However, our tests revealed a major inconvenience: it was not S3 API compatible, which prevented us from using it with some of the backup software that we had in production.

The second major step for us came in May 2020, when they announced that Backblaze B2 was now compatible with the S3 API. This allowed us to migrate our backups fully from AWS S3 to the Backblaze B2 service, saving us up to 80% on our invoice.

The biggest advantage of Backblaze B2 is its pricing model. With Backblaze, our storage costs are roughly 5x lower than with AWS S3, and data egress is also cheaper.

Cloud Storage Pricing Comparison
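
The two figures quoted above agree with each other: a price that is 5x lower corresponds to a saving of 80%. A minimal sketch of that arithmetic follows. The function name saving_percent is only illustrative, and no actual unit prices are assumed.

```shell
# Percentage saved when the new price is `factor` times lower than the old one.
# saving = 100 - 100 / factor (integer arithmetic, so use factors that divide 100)
saving_percent() {
  factor=$1
  echo $(( 100 - 100 / factor ))
}

saving_percent 5   # the 5x figure quoted for Backblaze B2 versus AWS S3
```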

How to Set Up the AWS CLI to Access Backblaze B2

Requirements:

  • Access to the Backblaze console.
  • AWS CLI installed.
    • The AWS CLI is available in the official package repository of Ubuntu 18.04 LTS. If it is not installed, you can install it with “sudo apt update && sudo apt install awscli”.

Important note: The location of the account is defined at registration. If you want your account to be located in Europe rather than the US, you need to specify it during the registration phase.

Set the account location while registering.

Once inside the Backblaze console, we will go ahead and create our first bucket.
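
Before switching to the console, the AWS CLI requirement listed above can be confirmed from a terminal. This is a small sketch. The helper name have_cmd is made up for illustration, and the install hint simply repeats the Ubuntu command from the requirements.

```shell
# Return success if the given command is available on the PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd aws; then
  # Print the installed AWS CLI version.
  aws --version
else
  echo "AWS CLI not found; install it with: sudo apt update && sudo apt install awscli"
fi
```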

Select and Create Buckets

Inside the Backblaze console, navigate to “B2 Cloud Storage” and select “Buckets”:

After selecting “Buckets”, click on “Create a Bucket”.

Create a Bucket.

Enter the name of the bucket and choose whether its files should be public or private.

Name the bucket.

At this stage, we should have our first bucket created. The important part for the rest of the tutorial is the “Endpoint”. This is the endpoint that we will have to set later when using the AWS S3 API in the CLI.

This marks the endpoint for the created bucket.
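
In the example used in this tutorial, the endpoint is s3.eu-central-003.backblazeb2.com and the region set later in the AWS CLI profile is eu-central-003, which is the second label of the endpoint. The sketch below extracts that label. It assumes the endpoint always has the form s3.<region>.backblazeb2.com, and the function name region_from_endpoint is only illustrative.

```shell
# Extract the region label from a Backblaze B2 S3 endpoint.
# Assumes the form s3.<region>.backblazeb2.com, with or without https://.
region_from_endpoint() {
  host=${1#https://}      # drop the scheme if present
  host=${host%%/*}        # drop any trailing path
  rest=${host#s3.}        # drop the leading "s3."
  echo "${rest%%.*}"      # keep everything up to the next dot
}

region_from_endpoint "s3.eu-central-003.backblazeb2.com"
```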

In the left menu, we need to navigate to “App Keys” to create our application keys, which in Backblaze B2 are the equivalent of AWS IAM access keys.

Application Keys

By default, there is a Master Application Key. We will create a specific key that will be used only to access our new bucket.

Click on “Add a New Application Key”.

Add a new application key

A few parameters that can be set

While adding a new key, there are a few parameters that we can set:

  • Name of Key - the name of the key
  • Allow Access to Bucket - restrict the key to a specific bucket
  • Type of Access - grant read and/or write permission on the bucket
  • Allow List All Bucket Names - allow this key to list all buckets in the account
  • Filename prefix - allow access only to files that have a specific prefix
  • Duration - limit how long the key remains valid

In our case, we will restrict the access to the bucket we created previously and give it Read/Write permissions.

After pressing “Create New Key”, we should see a success message that shows our newly created key. From it, we need to save the keyID and the applicationKey, since the applicationKey is only displayed once.

  • keyID = AWS Access Key ID
  • applicationKey = AWS Secret Access Key


Newly created key

Our configuration work in the Backblaze console is complete. We will keep it open to check the result of our first file upload later.

Now on to the AWS CLI configuration. We will add a new profile that will be used to access the Backblaze B2 bucket.

To do this, we need to add a profile to our AWS CLI configuration, which is usually located in the ~/.aws folder of our home directory.

Open the file ~/.aws/config and add the following at the bottom of the file:
[profile b2]
region = eu-central-003
output = json

Now, in our credentials file at ~/.aws/credentials, we will add:
[b2]
aws_access_key_id = 003bc0ef44688310000000005
aws_secret_access_key = [applicationKey]

The aws_access_key_id is the keyID and the aws_secret_access_key is the applicationKey, both of which we saved previously when we created the key in the Backblaze console.
The configuration steps are complete, and we should now be able to use the AWS CLI to access our Backblaze B2 storage.
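
As an alternative to editing the two files by hand, the same profile can be written with the AWS CLI's own “aws configure set” command. The sketch below wraps it in a helper. The function name configure_b2_profile is made up for illustration, and the values in the usage comment are placeholders to be replaced with the ones from your own Backblaze console.

```shell
# Create or update the "b2" profile using `aws configure set`.
# Arguments: keyID, applicationKey, region (all taken from the Backblaze console).
configure_b2_profile() {
  key_id=$1
  application_key=$2
  region=$3
  # Region and output format are stored in ~/.aws/config.
  aws configure set region "$region" --profile b2
  aws configure set output json --profile b2
  # The key pair is stored in ~/.aws/credentials.
  aws configure set aws_access_key_id "$key_id" --profile b2
  aws configure set aws_secret_access_key "$application_key" --profile b2
}

# Usage (placeholders, not real credentials):
# configure_b2_profile "<keyID>" "<applicationKey>" "eu-central-003"
```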

To use it, we need to specify the endpoint URL with --endpoint-url and the profile with --profile.

Example commands:

  • List all buckets: aws s3 ls --endpoint-url https://s3.eu-central-003.backblazeb2.com --profile b2
  • Copy a file: aws s3 cp file1 s3://backblaze-s3-api --endpoint-url https://s3.eu-central-003.backblazeb2.com --profile b2
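
Since every command needs the same endpoint and profile, it can be convenient to wrap them once. The following is a minimal sketch. The function name b2s3 and the two variable names are made up for illustration, and the endpoint is the one from this tutorial, so it should be replaced with the endpoint shown for your own bucket.

```shell
# Endpoint and profile from this tutorial; adjust them to your own account.
B2_ENDPOINT="https://s3.eu-central-003.backblazeb2.com"
B2_PROFILE="b2"

# Run any `aws s3` subcommand against Backblaze B2.
b2s3() {
  aws s3 "$@" --endpoint-url "$B2_ENDPOINT" --profile "$B2_PROFILE"
}

# Usage:
# b2s3 ls
# b2s3 cp file1 s3://backblaze-s3-api
# b2s3 sync /path/to/backups s3://backblaze-s3-api/backups   # hypothetical paths
```

The sync form is the one that fits a daily backup job best, since it only uploads files that are new or have changed.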

In the Backblaze console, we can navigate to “Browse Files” and select our bucket “backblaze-s3-api”, where we should see our newly uploaded file.
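
The same check can also be done from the terminal, which is handy in scripts. The sketch below uses “aws s3api head-object”, which exits with a non-zero status when the object does not exist. It assumes that the HeadObject call is available on the Backblaze endpoint. The function name b2_object_exists is only illustrative.

```shell
# Succeed if the given key exists in the given bucket on Backblaze B2.
# Arguments: bucket name, object key.
b2_object_exists() {
  bucket=$1
  key=$2
  aws s3api head-object --bucket "$bucket" --key "$key" \
    --endpoint-url "https://s3.eu-central-003.backblazeb2.com" \
    --profile b2 >/dev/null
}

# Usage:
# b2_object_exists backblaze-s3-api file1 && echo "file1 was uploaded"
```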

The Backblaze B2 service can be a good alternative for a cheaper cloud backup solution, and because most backup software already supports AWS S3 storage, it should be possible to migrate from AWS S3 to Backblaze B2 easily.
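
For data that already lives in AWS S3, one possible approach with the AWS CLI alone is to copy it through a local staging directory, because the source and destination use different endpoints and credentials. This is only a sketch and not necessarily how our own migration was done. The function name migrate_bucket, the bucket names and the staging path are hypothetical, and it assumes that the AWS credentials are stored in the default profile.

```shell
# Copy a bucket from AWS S3 to Backblaze B2 through a local staging directory.
# Arguments: source AWS bucket, destination B2 bucket, staging directory.
migrate_bucket() {
  src_bucket=$1
  dst_bucket=$2
  staging_dir=$3
  # Step 1: download from AWS S3 using the default profile and endpoint.
  aws s3 sync "s3://$src_bucket" "$staging_dir" --profile default || return 1
  # Step 2: upload to Backblaze B2 using the b2 profile and its endpoint.
  aws s3 sync "$staging_dir" "s3://$dst_bucket" \
    --endpoint-url "https://s3.eu-central-003.backblazeb2.com" --profile b2
}

# Usage (hypothetical names):
# migrate_bucket my-aws-backups backblaze-s3-api /mnt/staging
```

The staging directory needs enough free space for the data being moved, so large buckets are better migrated prefix by prefix.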

© Images from https://www.backblaze.com/

This is an article written by Gonçalo Dias, Software Engineer @md3.

Gonçalo has 7 years of experience in systems administration, with a special taste for Linux systems and high availability. He is MD3's Head of IT and is responsible for managing the entire infrastructure of the Mediatree group and for the continuous development of all the solutions and technologies it uses.