A Simple Personal Data Backup Setup

Until recently, my only computer was a 2013 MacBook Air. I backed it up using the included Time Machine system and an external spinning-disk hard drive. Because it was a laptop, I didn’t have the external drive plugged in all the time, so I set a calendar reminder to do a weekly backup because I knew I wouldn’t do it daily. It’s embarrassing to admit that even as a ‘tech professional’, there were plenty of weeks when I skipped backing up my content. I was used to the computer working all the time, I was busy, and I always had some excuse to ignore the reminders. macOS notifications would beg me to do something:

It has been 3 weeks since you backed up your computer! You're asking for data loss chump!

It’s pretty obvious that for a backup system to function properly, it needs to be automatic. Otherwise, human complacency and laziness take over. Your data needs to be backed up without any input from you. Having recently set out to upgrade my personal technology setup, I knew I could do better.

My Overall Requirements

  1. Multiple copies of my data to survive hardware/software failure.
  2. Live data synchronization between my MacBook and new Linux Workstation.
  3. Snapshots available for restore.
  4. Remote copies of my data.

The Guiding Principle - 3,2,1

This may seem obvious, but it really is the bare minimum, even for your personal data. The 3,2,1 rule states that you need:

  • 3 backups: 3 total copies of your data.
  • 2 local: 2 locally available copies, ideally on different devices or media.
  • 1 remote: 1 off-site copy.

If you satisfy these requirements, your chance of data loss is incredibly slim.

Local First

There are many fancy approaches to creating local, automatic backup systems. Fundamentally, they lean on having a dedicated device, on-prem, holding your data. These options include:

  • Purpose-built Network Attached Storage with multiple storage bays
  • Raspberry Pi with added hardware
  • Full Data Server

For my needs, these were a little overkill. I wasn’t planning on using my backup as a streaming source or personal media server. I just needed my data to be synchronized between my MacBook and my Linux Workstation. Surely there must be a simple solution for this? Luckily for me, I found Syncthing. I am very satisfied with the operational simplicity of this tool. I can just point the app to my data folder on each device, and it does the rest. The included ignore pattern functionality is also very useful for reducing the amount of data that gets synchronized. The nice effect of live synchronization is that changes made on one machine show up on the other almost immediately.
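
As a rough illustration of the ignore patterns mentioned above, here is a minimal sketch that writes a `.stignore` file into a synced folder. The `DATA_FOLDER` path and the specific patterns are assumptions chosen for the example, not taken from the setup described in this post.

```shell
#!/bin/sh
# Hypothetical folder for this demo; point DATA_FOLDER at your real synced folder.
DATA_FOLDER="${DATA_FOLDER:-./data-demo}"
mkdir -p "$DATA_FOLDER"

# Syncthing reads ignore patterns from a file named .stignore in the folder root.
cat > "$DATA_FOLDER/.stignore" <<'EOF'
// Lines starting with // are comments.
// The (?d) prefix lets Syncthing delete these files if they block removing a directory.
(?d).DS_Store
(?d)Thumbs.db
// Patterns without a leading slash match at any depth in the tree.
node_modules
*.tmp
EOF

echo "wrote $DATA_FOLDER/.stignore"
```

Note that `.stignore` itself is not synchronized, so each device needs its own copy.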

However, even though live synchronization maintains two copies of my data, it is not a replacement for true backups! What if the data became corrupted on one device? The daemon would sync this corruption to the other device and I would lose content! Versioned backups prevent this by creating snapshots of your data at certain points in time. Because my use case is simple, I am delegating versioned backups to the remote backup element.

Remote Backups

With a live synchronization system working locally, I set out to add a remote backup portion to the setup. My personal stuff is not mission-critical. Maybe yours is, but mine isn’t, so a remote backup frequency of once per day was enough. Anyways, I settled on the following requirements:

  1. Inexpensive: I don’t want to break the bank.
  2. Encrypted: I’m not a privacy nut, but something still feels wrong about giving someone else carte blanche access to my data.
  3. Scheduled: I would like this backup to run once a day.
  4. Open-Source: This is important for two reasons: I don’t want to be locked in, and I want to be sure the app does what it says it does.

To make a long story short, I ended up using anacron and duplicity with Backblaze B2 on my Linux Workstation.

Anacron allows task scheduling for devices that are not always on. So if you schedule something to run once a day, it will run as long as your computer is on for a portion of that day.
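
To make the once-a-day scheduling concrete, here is a small sketch that generates an anacrontab entry. The directory layout, the job name `personal-backup`, and the script path are all hypothetical; a system-wide setup would normally edit `/etc/anacrontab` instead.

```shell
#!/bin/sh
# Hypothetical paths for this demo; adjust to your own layout.
ANACRON_DIR="${ANACRON_DIR:-./anacron-demo}"
BACKUP_SCRIPT="${BACKUP_SCRIPT:-$HOME/bin/personal-backup.sh}"
mkdir -p "$ANACRON_DIR/etc" "$ANACRON_DIR/spool"

# anacrontab fields: period (days), delay (minutes), job identifier, command.
# "1 10" means: run once a day, 10 minutes after anacron starts.
cat > "$ANACRON_DIR/etc/anacrontab" <<EOF
SHELL=/bin/sh
1 10 personal-backup $BACKUP_SCRIPT
EOF

# A per-user anacron run (started from cron or at login) would then look like:
#   anacron -t "$ANACRON_DIR/etc/anacrontab" -S "$ANACRON_DIR/spool"
cat "$ANACRON_DIR/etc/anacrontab"
```

Anacron records the last run date for each job identifier in the spool directory, which is how it knows whether a job is still due after the machine has been off.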

Duplicity is an amazing all-in-one command-line backup tool that handles both encryption and upload. It supports full and incremental backups and works with most cloud storage providers. I set it up to start a fresh full backup once the last one is more than a month old and to delete the older incremental backups, which saves space in the long run.
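
The cleanup of old incremental backups could be sketched roughly as follows. This is an assumption about how such pruning might be done with duplicity's `remove-all-inc-of-but-n-full` and `remove-older-than` commands, not the exact commands from this setup. The target URL and the six-month window are placeholders, and the `run` helper is a hypothetical dry-run wrapper that only prints commands unless `RUN=1` is set.

```shell
#!/bin/sh
# Placeholder target; a B2 URL has the form b2://keyID:applicationKey@bucket/path
TARGET_URL="${TARGET_URL:-b2://KEY_ID:APP_KEY@my-bucket/backups}"

# Hypothetical helper: print commands by default; set RUN=1 to execute them.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Drop incremental sets from every chain except the newest one,
# while keeping the older full backups themselves.
run duplicity remove-all-inc-of-but-n-full 1 --force "$TARGET_URL"

# Optionally drop whole backup sets older than six months (an assumed retention window).
run duplicity remove-older-than 6M --force "$TARGET_URL"
```

Without `--force`, duplicity only lists what it would delete, which is a sensible first step before pruning for real.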

Backblaze B2 is a solid, inexpensive online storage provider. It provides all the features I need and is compatible with duplicity out of the box.

The following is a templated copy of my backup script. Please note the Firefox profile backup and the use of systemd-inhibit to prevent the computer from shutting down while the backup is in progress:

# Sync Firefox profile into data folder
rsync -a {{FIREFOX_FOLDER}} {{DATA_FOLDER}}

# Push latest changes to cloud; log both stdout and stderr
systemd-inhibit --why="Daily Backup In-Progress" duplicity \
    --verbosity 8 \
    {{DATA_FOLDER}} \
    {{CLOUD_ENDPOINT_AND_API_KEY}} \
    --asynchronous-upload \
    --full-if-older-than 1M \
    --allow-source-mismatch \
    >> {{LOG_FILE}} 2>&1

echo "personal backup complete ($(date))" >> {{LOG_FILE}}
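
A backup is only useful if it can be restored, so it is worth checking the snapshots occasionally. The sketch below shows one way this might look with duplicity's `collection-status` and `restore` commands. The target URL, the restore directory, and the three-day offset are placeholders, and `run` is a hypothetical dry-run wrapper that only prints commands unless `RUN=1` is set.

```shell
#!/bin/sh
# Placeholder target and restore location for illustration only.
TARGET_URL="${TARGET_URL:-b2://KEY_ID:APP_KEY@my-bucket/backups}"
RESTORE_DIR="${RESTORE_DIR:-./restore-test}"

# Hypothetical helper: print commands by default; set RUN=1 to execute them.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# List the full and incremental backup chains stored at the target.
run duplicity collection-status "$TARGET_URL"

# Restore the data as it was three days ago into a separate directory.
# duplicity reads the encryption passphrase from the PASSPHRASE
# environment variable or prompts for it.
run duplicity restore --time 3D "$TARGET_URL" "$RESTORE_DIR"
```

Restoring into a scratch directory rather than over the live data folder keeps the test from interfering with Syncthing.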

Epilogue

Since setting up this system, not a single part of it has failed. And maybe it never will. But I am no longer embarrassed by my backup system and I sleep a little better at night. Please take this as a reminder to back your important stuff up. Hopefully you’ll never have to thank me!
