Uncategorized

Automated Backup and Recovery in PostgreSQL

Backup planning is something that most of the organizations do to save the collection of databases from any loss. Many things are considered in backup planning. Like-

  • Backup interval (How often can I take backups)
  • Backup retention (How long can I keep the backups around as forever backup is expensive)
  • Performance impacts

Restore planning

  • Time spent on taking reinforcements is significant
  • Time it takes to restore is basic
  • Consider multi-organize arrangements

In this blog, we are going to discuss about several backup and recovery methods and tools used in PostgreSQL.

A PostgreSQL administrator deals with a bundle of periodic tasks related to database in his daily life. The tasks are based upon daily, weekly, or the monthly targets. Periodic backups are beneficial for a healthy database. The tasks that are easily done by bash scripts are listed below:-

  • Full backup

These are done when both creation and removal are required

  • WAL backup

These are done when both creation and removal are required

  • Vacuum
  • pgBadger log analysis

These are done when both creation and removal are required

  • log maintenance

These are done if you do not want to use log rotate

PostgreSQL backup options:-

  • Logical backups (or SQL dumps)
    • pg_dump
  • Physical backups (or filesystem)
    • Filesystem snapshots
    • pg_basbackup
    • “Manual” base backups
  • Continuous archiving (or PITR)

The point in time recovery (PITR) is worthy for big-sized databases where logical or physical backups take really long time. In this process, few of the directories can be eliminated from the internal directory which helps in speeding up of the process.

  • Snapshots

Operating system support is required for the snapshots backup. These backups are ideally fitted for the applications at the places both the data directory and the database are required to be in a synchronized manner. For example LAMP applications. The snapshots backups are not at all recommended when the files of databases are stored across multiple file systems because it must snapshot all file systems simultaneously.

  • Cloud

PostgreSQL is used by all the cloud providers to implement their backup process. Logical backups can be done in the cloud (as usual). On the other side, physical backups or point in time recovery (PITR) are accessible through the subscriptions of cloud services in single access to the data.

  • Agent base

It needs an agent to be located on the targets. It can perform a block-level backup, for example, COMMVAULT that can be installed in the Windows operating system only.

Features of PostgreSQL backups

PostgreSQL provides many helpful methods to perform logical, physical, and PITR backups. Following are the features:-

  • Robotization
  • Frequency
  • Retention period
  • Integrity
  • Easy to use
  • Incremental backups to save the extra space
  • Backup inventories
  • Capability to store backups on the cloud or the premises
  • Alters and warnings
  • Comprehensive reporting
  • Access control
  • Encryption
  • Graphical user interface and dashboards
  • Remote hosts backup
  • Adaptive throughput to limit the heap on the objectives
  • Handling of different has in parallel
  • Backup arrangement (example- jobs changing)
  • Rest API’s

PostgreSQL provides the logical backups by employing the command “pg_dump”. This option allows creating quondam SQL dump. Also, the “pg_dump” option allows simplifying the procedure backup of a single database. You need to run this order being a client alongside the authorizations allowed to the database that you mean to backup. If you want to use this command, then you need to play out the following steps:-

  1. The first step is to login as the postgres user:

su – postgres

  1. Afterwards, you can dump the indexes of a database to a file by running the following command. [To run the order on the brief, replace dbname with the name of the database which you need to bto be backed up.]

pg_dump dbname > dbname.bak

The generated backup file, dbname.bak, can be transferred to another host with scp or can be stored locally for later use.

  1. If you want to signify restoring the lost data then, delete your example database by replacing the new empty database in its place. Follow the command:

dropdb dbname

createdb dbname

  1. You can restore the database using psql:

psql test < dbname.bak

There are few more options for the backup format:

  1. *.bak: compressed binary format
  2. *.sql: plaintext dump
  3. *.tar: tarball

How to take backup of all databases

The “pg_dump” permits streamlining the strategy reinforcement of a solitary database. The disadvantage of this command is that it doesn’t store or save information of the database roles or cluster-wise configuration. If you want to store that information and want to take backup of all your databases, then you can use “pg_dumpall” command. This command will take the back of all your databases simultaneously along with the information of the roles of database. To do this:

  1. Create a backup file by using the following command:
pg_dumpall > pg_backup.bak
  • Restore all databases from the backup:
psql -f pg_backup.bak postgres

How to take Automate backups

“Cron task” is a process which automates the backup process at regular intervals. The steps which will be explained below will set up a “cron task” that will run “pg_dump” commence every week at least once.

  1. Ensure that you are logged in as the postgres user:

su – postgres

  1. C Make an index where you need to store  the automatic backups:

mkdir -p ~/postgres/backups

  1. Now you need to edit the crontab to create the new cron task:

crontab -e

  1. Add the following code snippet to the end of the crontab:

crontab

10 0 * * 0 pg_dump -U postgres dbname > ~/postgres/backups/dbname.bak
  1. Save and exit from the editor. Your database is now ready for the backup at midnight every Sunday.

Code for Full Backup

A full backup can be taken as per the following example-

#!/bin/bash

#

# Creates base backup.

CUR_DIR=$(dirname “$0”)

if [[ ! -f ${CUR_DIR}/pgsql-common.sh ]]

then

    echo “pgsql-common.sh not found!”

    exit 1

fi

source “${CUR_DIR}/pgsql-common.sh”

source “$CONFIG”

if [[ -d ${CR_BASE_BACKUP_DIR}/${CR_LABEL} ]]

then

    echo “${CR_BASE_BACKUP_DIR}/${CR_LABEL} already exists and is not empty!”

    exit 2

fi

pg_basebackup \

    –pgdata=${CR_BASE_BACKUP_DIR}/${CR_LABEL} \

    –format=plain \

    –write-recovery-conf \

    –wal-method=stream \

    –label=${CR_LABEL} \

    –checkpoint=fast \

    –progress \

    –verbose

if [[ $? -gt 0 ]]

then

    rm -rf ${CR_BASE_BACKUP_DIR}/${CR_LABEL}

    echo “pg_basebackup on ${CR_LABEL} failed!”

    exit 3

fi

tar -czf ${CR_BASE_BACKUP_DIR}/${CR_LABEL}.tar.gz ${CR_BASE_BACKUP_DIR}/${CR_LABEL} &amp;&amp; rm -rf ${CR_BASE_BACKUP_DIR}/${CR_LABEL}

Code for vacuumdb task

#!/bin/bash
#
# Vacuums the whole database cluster running on a given port.
 
while [[ $# &gt; 0 ]]
do
    key="$1"
 
    case $key in
        -p|--port)
            PORT="$2"
            shift
            ;;
        *)
            echo "Usage: `basename $0` --port|-p [port_number]"
            exit 1
            ;;
    esac
    shift
done
 
if [[ -z "$PORT" ]]
then
    echo "Port not provided!"
    $0 *
    exit 2
fi
 
/usr/bin/vacuumdb -U postgres -p $PORT --all --full --analyze

Tools for the PostgreSQL Backup

  • Amanda

It is an agent based and open box solution for the backups of Postgre SQL.

  • Zmanda

It provides the enterprise version of Amanda.

  • Barman

It is an accident recovery solution which is useful for PostgreSQL which is kept up by second Quadrant. It helps in the administration of the reinforcements for different databases and it has the capacity to reestablish to the predecessor point in time by utilizing PITR highlight of the PostgreSQL.

  • EDB BART (Backup and Recovery tool)

It consolidates the PostgreSQL local record framework level backup and point in time recovery (PITR) into simple open instrument.

  • pgBackRest

It performs the full backup that does not need to depend on the common tools

Summary

The management of PostgreSQL database is no longer a challenge. There is an availability of various tools using which you can automate the backup process or you can also use manual code snippets to automate the backup process using bash script as mentioned in the write-up. Make sure-

  • To test your bash scripts before the deployment of the produced database
  • Even one line of code can result in the unexpected outcomes, so ensure you test all the bash scripts before taking up the backups
  • The best practice is to remain cautious as much as you can. Continuously run the tried code and secure your database before any tests. You would prefer not to blow your database, correct?

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *