Backing up Der Flounder Revisited Once Again
March 3, 2023

Eleven years ago, I wrote a post on how I back up this blog. Overall, the reasons I’m backing up haven’t changed:

  • I like this blog and don’t want to see it or its data disappear because of data loss.
  • WordPress.com’s free hosting doesn’t provide me with an automated backup method.

Two years ago, I wrote another post about switching the backup host from a Mac to a Raspberry Pi. The overall methodology hadn’t changed: I was still creating a nightly mirror using HTTrack. This worked fine until the latest move to a new host in February 2023, where HTTrack started failing for me because the Raspberry Pi was running headless without a connected display and HTTrack was having problems trying to launch a headless browser. After an hour of futzing with it, I moved to using wget. The wget tool has a number of handy options for mirroring websites, including the following:

  • --mirror: Makes the download recursive, with recursive browsing and infinite recursion depth.
  • --convert-links: Converts all the links to relative, so the mirror is suitable for offline viewing.
  • --adjust-extension: Adds suitable extensions to filenames (html, css, etc.) depending on their content type.

Based on my research, wget looked like a decent replacement for what I had been doing with HTTrack, without the headless browser problems I was running into. For those wanting to know more, please see below the jump.

The current backup host is a Raspberry Pi 4 running Raspberry Pi OS Bullseye. To set up an automated backup using wget, I used the following procedure:

1. Install wget for Debian Bullseye by running the commands below with root privileges:
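
A minimal sketch of this step, assuming wget is installed from the standard Debian Bullseye repositories:

# Refresh the package lists, then install wget with root privileges
sudo apt-get update
sudo apt-get install -y wget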


2. Create a backup directory in the pi user’s home directory by running the following command:
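
A minimal sketch, assuming the same /home/pi/derflounder_backup path that the backup script below uses:

# Create the backup directory (no error if it already exists)
mkdir -p /home/pi/derflounder_backup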


3. Set up the following script as /usr/local/bin/der_flounder_backup.sh:


#!/bin/bash
backupDirectoryPath="/home/pi/derflounder_backup"
website="https://derflounder.wordpress.com"
/usr/bin/wget --show-progress --mirror --convert-links --adjust-extension --page-requisites --no-parent -P "$backupDirectoryPath" "$website"

For the script itself, here’s what the various options are doing:

  • --show-progress: When running the script manually, show what’s currently being downloaded.
  • --mirror: Makes the download recursive, with recursive browsing and infinite recursion depth.
  • --convert-links: Converts all the links to relative, so the mirror is suitable for offline viewing.
  • --adjust-extension: Adds suitable extensions to filenames (html, css, etc.) depending on their content type.
  • --page-requisites: Downloads support files like CSS stylesheets and images required to properly display the page offline.
  • --no-parent: When recursing, does not go up to the parent directory.
  • -P: Sets the directory where all files should be downloaded.
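
Before cron can run the script directly, it needs to be marked executable; assuming the path used in step 3, that would look something like this:

# Mark the backup script as executable
sudo chmod +x /usr/local/bin/der_flounder_backup.sh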

4. Set up a cron job like the one shown below to run the backup script, with any messages from running the cron job sent to /dev/null. In my case, I set it up in the pi user’s crontab to run nightly at 2:00 AM:
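
A crontab entry matching that description (run nightly at 2:00 AM, with any output sent to /dev/null), added via crontab -e as the pi user, would look something like this:

# m h dom mon dow  command
0 2 * * * /usr/local/bin/der_flounder_backup.sh > /dev/null 2>&1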


Meanwhile, like the hosts which went before it, I’m also backing up the Raspberry Pi that the backup is stored on, so that I have two copies of the backed-up data available.

