The Postgres schema comparison and related problem with views

Compare schemas of two or more different Postgres databases is a common task, but it can become more tricky, if these databases run on different versions of Postgres'. Quick and canonical way to compare schemes is the use of the same program pg_dump to communicate with each base with --schema-only parameter. This method works great, but there are some pitfalls, especially when copying submissions.

image
(Photos made Philippe Vieux-Jeanton)

the

the Premise


Let's start with some assumptions, as was discovered by this problem. We have an instance that is in the process of upgrading versions of Postgres 9.2 to 9.6 (the latest version at the time of writing). Use pg_upgrade was impossible as it was planned not only enable data checksums, but changing the encoding to UTF-8. A number of factors, especially the change of the encoding meant that the typical update process old_database pg_dump | psql new_database impossible. Thus, we have a very specific program that accurately migrates portions of the data, producing the action along the way.

the

the Problem


As a final assessment of sanity, we wanted to ensure that the final scheme has been updated to version 9.6 of the database as far as possible identical to the current scheme grocery database version 9.2. When comparing the output of pg_dump, we quickly found a problem with the way of displaying views. Version 9.2 uses very lean, single-line output, while version 9.6 uses multiline "beautifully derived" variation. Needless to say, this meant that none of the representations do not coincide when comparing the output of pg_dump.

The problem lies in the system function pg_get_viewdef(), which is used pg_dump'om to return human-readable and Postgres-recognizable version of the view. To demonstrate the problems and solutions that will create a simple view on both databases, then compare them via pg_dump:

the
$ psql -p vtest 5920-c \
'gregtest create view as select count(*) from pg_class where reltuples = 0'
CREATE VIEW
$ psql -p vtest 5960-c \
'gregtest create view as select count(*) from pg_class where reltuples = 0'
CREATE VIEW
$ diff-u <(vtest pg_dump-x-p 5920 --schema-only) <(vtest pg_dump-x-p 5960 --schema-only)

--- /dev/fd/70 2016-09-29 12:34:56.019700912 -0400
+++ /dev/fd/72 2016-09-29 12:34:56.019720902 -0400
@@ -2,7 +2,7 @@
-- PostgreSQL database dump
--

--- Dumped from database version 9.2.18
+-- Dumped from database version 9.6.0
-- Dumped by pg_dump version 9.6.0

SET statement_timeout = 0;
@@ -35,22 +35,14 @@
--

CREATE VIEW gregtest AS
-SELECT count(*) AS count FROM pg_class WHERE (pg_class.reltuples = (0)::double precision);
+ SELECT count(*) AS count
+ FROM pg_class
+ WHERE (pg_class.reltuples = (0)::double precision);

The only difference besides the version of the server is a representation that does not match at all, and is concerned about the diff. (For the purposes of this article, from the output, to remove all secondary lines).

As mentioned earlier, the culprit is the pg_get_viewdef(). His job is to represent the filling of a presentation in a relevant, readable way. There are two main changes which it makes with this conclusion: adding the parentheses and adding padding with spaces. In recent versions, despite the fact that the documents hint, indentation (nice conclusion) can't be disabled, so there's no easy way to get the server version 9.6 to give the difference in views on one line, how does the server version 9.2 by default. Moreover, there are five versions of the function pg_get_viewdef, each of which accepts different arguments:

    the
  1. view name
  2. the
  3. the view name, and a logical argument
  4. the
  5. OID
  6. the
  7. OIDs and logical argument
  8. the
  9. an OID and an integer argument
  10. In Postgres 9.2, version pg_get_viewdef(text,boolean) will enable or disable the indentation, moreover, you can see that the default margins are not added:

    the
    $ psql vtest -p 5920 -Atc "select pg_get_viewdef('gregtest')"
    SELECT count(*) AS count FROM pg_class WHERE (pg_class.reltuples = (0)::double precision);
    
    $ psql vtest -p 5920 -Atc "select pg_get_viewdef('gregtest',false)"
    SELECT count(*) AS count FROM pg_class WHERE (pg_class.reltuples = (0)::double precision);
    
    $ psql vtest -p 5920 -Atc "select pg_get_viewdef('gregtest',true)"
    SELECT count(*) AS count +
    FROM pg_class +
    WHERE pg_class.reltuples = 0::double precision;
    

    In Postgres version 9.6, however, you are always faced with a "beautiful" display, regardless of which of the five versions of the functions you choose and what arguments you give them! Here is the call the same functions as in the above example for 9.6:

    the
    $ psql vtest -p 5960 -Atc "select pg_get_viewdef('gregtest')"
    SELECT count(*) AS count
    FROM pg_class
    WHERE (pg_class.reltuples = (0)::double precision);
    
    $ psql vtest -p 5960 -Atc "select pg_get_viewdef('gregtest',false)"
    SELECT count(*) AS count
    FROM pg_class
    WHERE (pg_class.reltuples = (0)::double precision);
    
    $ psql vtest -p 5960 -Atc "select pg_get_viewdef('gregtest',true)"
    SELECT count(*) AS count
    FROM pg_class
    WHERE pg_class.reltuples = 0::double precision;
    

    the

    Solutions


    When I first encountered this problem, three solutions popped into my mind:

      the
    1. to Write a script that will transform and normalize the output circuit
    2. the
    3. to Modify the source code Postgres'and behavior changes pg_get_viewdef
    4. the
    5. to call pg_dump'om functions pg_get_viewdef thus, to get identical output

    Initially, I thought that a quick Perl script would be the easiest way. And by the time I got one working version of the script brought a lot of pain to turn the output from "beautiful" to "ugly", especially the spaces and brackets. The approach of brute force, by simply removing all parentheses and brackets are extra spaces from the rules and view definitions almost worked, but the output was quite uglyhard to read, also, had problems with extra spaces.

    Approach number two, modifying the source Postgres'and actually quite simple. At some point the source code has been modified so that the insert passes had to be shown in the "on" state. Change a single character in the file src/backend/utils/adt/ruleutils.c decided

    the
    - #define PRETTYFLAG_INDENT 2
    + #define PRETTYFLAG_INDENT 0
    

    Although this solution eliminated the problem with indentations and spaces, parentheses are all the same yet different and it is not easy to solve. Overall, not the best solution.

    The third approach was to change the source code pg_dump'. In particular, it uses pg_get_viewdef(oid) format feature. By changing this format to pg_get_viewdef(oid,integer) the format function and the input argument is 0, and version 9.2, and version 9.5 derive the same:

    the
    $ psql vtest -p 5920 -tc "select pg_get_viewdef('gregtest'::regclass, 0)"
    SELECT count(*) AS count +
    FROM pg_class +
    WHERE pg_class.reltuples > 0::double precision;
    
    $ psql vtest -p 5960 -tc "select pg_get_viewdef('gregtest'::regclass, 0)"
    SELECT count(*) AS count +
    FROM pg_class +
    WHERE pg_class.reltuples > 0::double precision;
    

    This modified version will reproduce the same schema in our test database:

    the
    $ diff-u <(vtest pg_dump-x-p 5920 --schema-only) <(vtest pg_dump-x-p 5960 --schema-only)
    
    --- /dev/fd/80 2016-09-29 12:34:56.019801980 -0400
    +++ /dev/fd/88 2016-09-29 12:34:56.019881988 -0400
    @@ -2,7 +2,7 @@
    -- PostgreSQL database dump
    --
    
    --- Dumped from database version 9.2.18
    +-- Dumped from database version 9.6.0
    -- Dumped by pg_dump version 9.6.0
    
    SET statement_timeout = 0;
    

    The best solution, according to my colleague David Christensen, just to make sure that Postgres was doing all the heavy operations with the help of the magic of import/export. By the end of the day, the conclusion pg_dump'and not only chelovecescie, but designed so that it can recognize Postgres. Thus, we can feed the old scheme 9.2 time-based version 9.6, then turn around and copy it. As a result, we have identical challenges pg_get_viewdef() for both schemes. Here it is on our test databases:

    the
    $ createdb -p 5960 vtest92
    
    $ pg_dump vtest -p 5920 | psql -q-p 5960 vtest92
    
    $ diff-s-u <(vtest92 pg_dump-x-p 5960 --schema-only) <(vtest pg_dump-x-p 5960 --schema-only)
    
    

    the

    Conclusion


    Attempts to compare the schemas of different versions can be quite difficult, so it is better not to even try. Backup and recovery schemes is a cheap operation, so just copy both the schemas on a single server, then make the comparison.
Article based on information from habrahabr.ru

Комментарии

Популярные сообщения из этого блога

Briefly on how to make your Qt geoservice plugin

Database replication PostgreSQL-based SymmetricDS

Developing for Sailfish OS: notifications for example apps for taking notes