Backup and restore in PostgreSQL

image

Suppose we have a postgresql in streaming replication mode. master server and hot-standby ready to replace their fallen comrades. When bad scenario, we just have to create a trigger file and switch our app to work with the new master. However, there may be times when legitimate changes were made crooked written a migration and ran to the master and slave server. For example, have been removed/changed data in terms of tables or the tables were not removed. From the point of view of the database everything is fine, but from a business point of view is a disaster. In this case, the proclamation of the hot hot-standby master, the procedure is clearly useless...
For warning such a situation there are at least two options...

the
    the
  • to use a periodic backup with pg_dump funds;
  • the
  • to use a backup base backups and WAL archives.

The first method is simple to implement and requires minimum effort for installation and maintenance. Put "pg_dump | lbzip2" in cron, and forget. However, this option does not propose to restore the directory database to the point of failure, Alesi at the time of backup. The second option is a bit more complicated and costly in terms of storage, but this option is a more flexible solution in case of recovery. About it will be discussed.
From the pros:
the
    the
  • option to restore the cluster database at any point in time relative to the time of the creation of the baseline backup and the time of the failure;
  • the
  • as the condition for restoration can serve as a time stamp and a specific transaction.

Cons:
the
    the
  • base copy is the approximate size of the cluster database.
  • the
  • need to store the WAL files during the storage period from the baseline backup.

As mentioned above, this backup method offers flexibility to restore (you can restore the state of the database in a clearly specified time or the time before or after the execution of a particular transaction), but at the same time adds significant requirements to the backup storage. The implementation looks like this:
the
    the
  • setting up archiving WAL logs;
  • the
  • configure backup;
  • the
  • storing one or more backups;
  • the
  • remove the oldest backup file in case of successful fulfillment of item 1;
  • the
  • to delete the appropriate WAL files from the backup copy of p. 3;
  • the
  • optionally, you can perform a procedure of checking backups for their "professionalism".


Mode archiving WAL logs is configured through the enabling parameter archive_mode and archive_command in postgresql.conf and create the directory where to store the archives. To start is to turn on the backup mode and to estimate the volume of archives to be created for one day of work the database. This will allow you to assess the required storage space for archives and basic copies. For archiving answer options:
archive_mode = on
archive_command = 'cp %p /opt/pgsql/pgbackup/archive/%f'

Direct backup custom tools pg_basebackup. This program is from the kit utilities included with PostgreSQL, which can be used for setting up streaming replication, and to relieve the backups. The principle of operation allows to take backup without stopping the database cluster. Based on the task, we just need to run pg_basebackup on the schedule in cron. Given the demands on the place, you need to take care of sufficient disk space to avoid overflow.

The backup task is optional, as it is sufficient to have at least one backup copy. It is understood that at the time of starting the backup, we agree with the statement that the base is in the "correct" state (we're not going to copy the broken base).
After creating a backup copy, the old copy can be deleted along with the archives. Deletion of archives is performed using the pg_archivecleanup utility. Utility allows you to remove unneeded backups based on specific file-marks that are created during backup.

It is also important to have a procedure verify the backup after it is created. The algorithm is rather simple: you need to copy the base copy in a directory sandbox (gently place!), to create the minimum necessary configuration files needed to run in recovery mode and run postgres relative to this directory sandbox, after starting it is necessary to analyze the log and make a conclusion whether the backup is suitable for recovery.
Thus the process fits into three steps: creating the base copy, verify and delete old, the previous baseline backup.

Now suppose that the worst happened and you want to restore. You need to stop the main cluster of postgres and rename the database directory to a random name. Directory of the backup you want to rename in the directory of the database cluster. If you want to copy the configuration files. After you identify the configuration files that run postgres on our directory. At startup, Postgres will find the recovery.conf and start in recovery mode. It remains to wait until postgres will restore its state using the archives, then you can connect to the database and continue working. That's all, the recovery procedure is completed.

Here and so. Keep your data safe! Scripts for backup and validation copies here.
Article based on information from habrahabr.ru

Комментарии

Популярные сообщения из этого блога

Briefly on how to make your Qt geoservice plugin

Database replication PostgreSQL-based SymmetricDS

Yandex.Widget + adjustIFrameHeight + MooTools