The Postgres schema comparison and related problem with views
Compare schemas of two or more different Postgres databases is a common task, but it can become more tricky, if these databases run on different versions of Postgres'. Quick and canonical way to compare schemes is the use of the same program pg_dump to communicate with each base with --schema-only parameter. This method works great, but there are some pitfalls, especially when copying submissions.

(Photos made Philippe Vieux-Jeanton)
the
Let's start with some assumptions, as was discovered by this problem. We have an instance that is in the process of upgrading versions of Postgres 9.2 to 9.6 (the latest version at the time of writing). Use pg_upgrade was impossible as it was planned not only enable data checksums, but changing the encoding to UTF-8. A number of factors, especially the change of the encoding meant that the typical update process old_database pg_dump | psql new_database impossible. Thus, we have a very specific program that accurately migrates portions of the data, producing the action along the way.
the
As a final assessment of sanity, we wanted to ensure that the final scheme has been updated to version 9.6 of the database as far as possible identical to the current scheme grocery database version 9.2. When comparing the output of pg_dump, we quickly found a problem with the way of displaying views. Version 9.2 uses very lean, single-line output, while version 9.6 uses multiline "beautifully derived" variation. Needless to say, this meant that none of the representations do not coincide when comparing the output of pg_dump.
The problem lies in the system function pg_get_viewdef(), which is used pg_dump'om to return human-readable and Postgres-recognizable version of the view. To demonstrate the problems and solutions that will create a simple view on both databases, then compare them via pg_dump:
the
The only difference besides the version of the server is a representation that does not match at all, and is concerned about the diff. (For the purposes of this article, from the output, to remove all secondary lines).
As mentioned earlier, the culprit is the pg_get_viewdef(). His job is to represent the filling of a presentation in a relevant, readable way. There are two main changes which it makes with this conclusion: adding the parentheses and adding padding with spaces. In recent versions, despite the fact that the documents hint, indentation (nice conclusion) can't be disabled, so there's no easy way to get the server version 9.6 to give the difference in views on one line, how does the server version 9.2 by default. Moreover, there are five versions of the function pg_get_viewdef, each of which accepts different arguments:
Article based on information from habrahabr.ru

(Photos made Philippe Vieux-Jeanton)
the
the Premise
Let's start with some assumptions, as was discovered by this problem. We have an instance that is in the process of upgrading versions of Postgres 9.2 to 9.6 (the latest version at the time of writing). Use pg_upgrade was impossible as it was planned not only enable data checksums, but changing the encoding to UTF-8. A number of factors, especially the change of the encoding meant that the typical update process old_database pg_dump | psql new_database impossible. Thus, we have a very specific program that accurately migrates portions of the data, producing the action along the way.
the
the Problem
As a final assessment of sanity, we wanted to ensure that the final scheme has been updated to version 9.6 of the database as far as possible identical to the current scheme grocery database version 9.2. When comparing the output of pg_dump, we quickly found a problem with the way of displaying views. Version 9.2 uses very lean, single-line output, while version 9.6 uses multiline "beautifully derived" variation. Needless to say, this meant that none of the representations do not coincide when comparing the output of pg_dump.
The problem lies in the system function pg_get_viewdef(), which is used pg_dump'om to return human-readable and Postgres-recognizable version of the view. To demonstrate the problems and solutions that will create a simple view on both databases, then compare them via pg_dump:
the
$ psql -p vtest 5920-c \
'gregtest create view as select count(*) from pg_class where reltuples = 0'
CREATE VIEW
$ psql -p vtest 5960-c \
'gregtest create view as select count(*) from pg_class where reltuples = 0'
CREATE VIEW
$ diff-u <(vtest pg_dump-x-p 5920 --schema-only) <(vtest pg_dump-x-p 5960 --schema-only)
--- /dev/fd/70 2016-09-29 12:34:56.019700912 -0400
+++ /dev/fd/72 2016-09-29 12:34:56.019720902 -0400
@@ -2,7 +2,7 @@
-- PostgreSQL database dump
--
--- Dumped from database version 9.2.18
+-- Dumped from database version 9.6.0
-- Dumped by pg_dump version 9.6.0
SET statement_timeout = 0;
@@ -35,22 +35,14 @@
--
CREATE VIEW gregtest AS
-SELECT count(*) AS count FROM pg_class WHERE (pg_class.reltuples = (0)::double precision);
+ SELECT count(*) AS count
+ FROM pg_class
+ WHERE (pg_class.reltuples = (0)::double precision);
The only difference besides the version of the server is a representation that does not match at all, and is concerned about the diff. (For the purposes of this article, from the output, to remove all secondary lines).
As mentioned earlier, the culprit is the pg_get_viewdef(). His job is to represent the filling of a presentation in a relevant, readable way. There are two main changes which it makes with this conclusion: adding the parentheses and adding padding with spaces. In recent versions, despite the fact that the documents hint, indentation (nice conclusion) can't be disabled, so there's no easy way to get the server version 9.6 to give the difference in views on one line, how does the server version 9.2 by default. Moreover, there are five versions of the function pg_get_viewdef, each of which accepts different arguments:
-
the
- view name the
- the view name, and a logical argument the
- OID the
- OIDs and logical argument the
- an OID and an integer argument In Postgres 9.2, version pg_get_viewdef(text,boolean) will enable or disable the indentation, moreover, you can see that the default margins are not added:
- to Write a script that will transform and normalize the output circuit the
- to Modify the source code Postgres'and behavior changes pg_get_viewdef the
- to call pg_dump'om functions pg_get_viewdef thus, to get identical output
the
$ psql vtest -p 5920 -Atc "select pg_get_viewdef('gregtest')"
SELECT count(*) AS count FROM pg_class WHERE (pg_class.reltuples = (0)::double precision);
$ psql vtest -p 5920 -Atc "select pg_get_viewdef('gregtest',false)"
SELECT count(*) AS count FROM pg_class WHERE (pg_class.reltuples = (0)::double precision);
$ psql vtest -p 5920 -Atc "select pg_get_viewdef('gregtest',true)"
SELECT count(*) AS count +
FROM pg_class +
WHERE pg_class.reltuples = 0::double precision;
In Postgres version 9.6, however, you are always faced with a "beautiful" display, regardless of which of the five versions of the functions you choose and what arguments you give them! Here is the call the same functions as in the above example for 9.6:
the
$ psql vtest -p 5960 -Atc "select pg_get_viewdef('gregtest')"
SELECT count(*) AS count
FROM pg_class
WHERE (pg_class.reltuples = (0)::double precision);
$ psql vtest -p 5960 -Atc "select pg_get_viewdef('gregtest',false)"
SELECT count(*) AS count
FROM pg_class
WHERE (pg_class.reltuples = (0)::double precision);
$ psql vtest -p 5960 -Atc "select pg_get_viewdef('gregtest',true)"
SELECT count(*) AS count
FROM pg_class
WHERE pg_class.reltuples = 0::double precision;
the
Solutions
When I first encountered this problem, three solutions popped into my mind:
-
the
Initially, I thought that a quick Perl script would be the easiest way. And by the time I got one working version of the script brought a lot of pain to turn the output from "beautiful" to "ugly", especially the spaces and brackets. The approach of brute force, by simply removing all parentheses and brackets are extra spaces from the rules and view definitions almost worked, but the output was quite
Approach number two, modifying the source Postgres'and actually quite simple. At some point the source code has been modified so that the insert passes had to be shown in the "on" state. Change a single character in the file src/backend/utils/adt/ruleutils.c decided
the
- #define PRETTYFLAG_INDENT 2
+ #define PRETTYFLAG_INDENT 0
Although this solution eliminated the problem with indentations and spaces, parentheses are all the same yet different and it is not easy to solve. Overall, not the best solution.
The third approach was to change the source code pg_dump'. In particular, it uses pg_get_viewdef(oid) format feature. By changing this format to pg_get_viewdef(oid,integer) the format function and the input argument is 0, and version 9.2, and version 9.5 derive the same:
the
$ psql vtest -p 5920 -tc "select pg_get_viewdef('gregtest'::regclass, 0)"
SELECT count(*) AS count +
FROM pg_class +
WHERE pg_class.reltuples > 0::double precision;
$ psql vtest -p 5960 -tc "select pg_get_viewdef('gregtest'::regclass, 0)"
SELECT count(*) AS count +
FROM pg_class +
WHERE pg_class.reltuples > 0::double precision;
This modified version will reproduce the same schema in our test database:
the
$ diff-u <(vtest pg_dump-x-p 5920 --schema-only) <(vtest pg_dump-x-p 5960 --schema-only)
--- /dev/fd/80 2016-09-29 12:34:56.019801980 -0400
+++ /dev/fd/88 2016-09-29 12:34:56.019881988 -0400
@@ -2,7 +2,7 @@
-- PostgreSQL database dump
--
--- Dumped from database version 9.2.18
+-- Dumped from database version 9.6.0
-- Dumped by pg_dump version 9.6.0
SET statement_timeout = 0;
The best solution, according to my colleague David Christensen, just to make sure that Postgres was doing all the heavy operations with the help of the magic of import/export. By the end of the day, the conclusion pg_dump'and not only chelovecescie, but designed so that it can recognize Postgres. Thus, we can feed the old scheme 9.2 time-based version 9.6, then turn around and copy it. As a result, we have identical challenges pg_get_viewdef() for both schemes. Here it is on our test databases:
the
$ createdb -p 5960 vtest92
$ pg_dump vtest -p 5920 | psql -q-p 5960 vtest92
$ diff-s-u <(vtest92 pg_dump-x-p 5960 --schema-only) <(vtest pg_dump-x-p 5960 --schema-only)
the
Conclusion
Attempts to compare the schemas of different versions can be quite difficult, so it is better not to even try. Backup and recovery schemes is a cheap operation, so just copy both the schemas on a single server, then make the comparison.
Комментарии
Отправить комментарий