Database replication PostgreSQL-based SymmetricDS

In this article I will tell how to configure database replication for PostgreSQL. For the experiments we use a distribution of Linux CentOS 5.3, although this is not essential. we will use a version of PostgreSQL 8.4.7 and SymmetricDS-2.2.2.

What is replication?

In fact, it is a mechanism to automatically synchronize the contents of databases running on different servers. As a result of replication of these databases contain absolutely identical data. This is necessary for example in order to provide fault tolerance system (in the case of the fall of the first database server, the work enters the second), or to implement load balancing, different customers can serve different server.

For replication you need at least two database servers, so prepare two identical servers with a PostgreSQL database on each. The first is the IP address 10.0.2.20, the second — 10.0.2.21, both gateway 10.0.2.2.
You can do a virtual machine like VirtualBox to create two virtual servers and run them on your own computer.

In the above commands, the first character will be a # sign or $, these signs indicate that the command is run as root or as a normal user, respectively.
So, what actions should be taken:

Start setting

Include the postgresql service if it is not included:
# chkconfig --level 3 postgresql on
# service postgresql start


Now we need to create a database. To create her "head" will not work:
# createdb mytest
psql: FATAL: user "root" does not exist


then you need to create a role in Postgres for linuxoveho user. Create of your administrator password in Postgres with the username "sa" and linuxoveho user with the same name, the password is same "sa".
# adduser -m sa

# su - postgres
$ createuser -d-s -P sa

now you can create a table "test" by issuing the command from the user "sa":
# su - sa
$ createdb mytest

you can now use the psql utility:
psql mytest

The output from this utility is a set of two characters \q

Installing SymmetricDS

Well, now it's time to begin the fun part of the job. Before you continue, make sure that we simultaneously operate two servers running postgreSQL database server on each.
The server must ping each other, because SymmetricDS uses the HTTP Protocol for synchronization. Ports 8080 and 9090 on the server should not be blocked by the firewall.

OK. To work SymmetricDS need the java interpreter itself, and the JRE 1.6, so install them on your server if you have not done so:
downloadable from the site java.com installer jre-6u24-linux, install it (just run this file)
custom a symbolic link (if it is for some reason not created):
# ln-s /usr/java/latest/bin/java /usr/bin/java

and write in the command line classpath — without them, SymmetricDS will fall with an error that class not found.
$ CLASSPATH=/usr/java/latest/lib; export CLASSPATH

To sym (main executable file of Symmetric DS) worked, you must also specify the database server so that it listens on the external network interface (not just localhost) and that it allowed UNIX users to itself. This is done so at each of our two servers:
1) uncomment this line in /var/lib/pgsql/data/postgresql.conf:
listen_addresses = '*'

2) specify the trust method for all connections in /var/lib/pgsql/data/pg_hba.conf:
the local all all trust
host all all 127.0.0.1/32 trust
host all all ::1/128 trust


3) To support the session variables in the file /var/lib/pgsql/data/postgresql.conf add the line:
custom_variable_classes = 'symmetric'

4) restart the database server:
# service postgresql restart

To verify that it works, so:
$ psql mytest -U sa -h localhost

This error message should not be: psql: FATAL: Ident authentication failed for user "sa".
Instead, you should open the psql as usual.

Now download the actual SymmetricDS-2.2.2 and unpack on both servers:
$ unzip symmetric-ds-2.2.2-server.zip


configuring SymmetricDS

Editable root file.properties and client.properties that are in the samples directory. They need to uncomment the lines for postgresql by commenting out or deleting the other options and specify a login and password to the database (which is sa).
root.properties are only needed on the first node, client.properties — just for a second.
Let's start in order.

setting up SymmetricDS on the first server

Formerum a root file.properties:

root.properties
# The class name for the JDBC Driver
db.driver=org.postgresql.Driver

# The JDBC URL used to connect to the database
db.url=jdbc:postgresql://localhost/mytest

# The user to login as who can create and update tables
db.user=sa
# The password for the user to login as
db.password=sa
registration.url=http://10.0.2.20:8080/sync
sync.url=http://10.0.2.20:8080/sync
# Do not change these for running the demo
group.id=corp
external.id=00000
# Don't muddy the waters with purge logging
job.purge.period.time.ms=7200000


Create the initial database on the root node:
$ psql postgres -c "CREATE database mytest;"

For the purposes of this simple example we have in the database will have one table t2, which is synchronized (replicated) in both directions. Create this table.
$ ../bin/sym-p root.properties --run-ddl create_sample.xml

it uses the file create_sample.xml
<?xml version="1.0"?>
<!DOCTYPE database SYSTEM "http://db.apache.org/torque/dtd/database.dtd">

<database name="mytest">
<table name="t2">
<column name="id" type="INTEGER" required="true" primaryKey="true" autoIncrement="true"/>
<column name="mydata" type="VARCHAR" size="64" required="false" />
<column name="intval" type="DECIMAL" size="10,2" required="false" />
</table>
</database>


It is also necessary in the database to create handlers plpgsql — run this script:
$ ./create_func.sh

create_func.sh
#!/bin/sh
psql mytest -U sa -h localhost -c "CREATE FUNCTION plpgsql_call_handler() RETURNS language_handler AS '\$libdir/plpgsql' LANGUAGE C;"
psql mytest -U sa -h localhost -c "CREATE FUNCTION plpgsql_validator(oid) RETURNS void AS '\$libdir/plpgsql' LANGUAGE C;"
psql mytest -U sa -h localhost -c "CREATE TRUSTED PROCEDURAL LANGUAGE plpgsql HANDLER plpgsql_call_handler VALIDATOR plpgsql_validator;"


you can now initialize the system SymmetricDS on root node:

$ ../bin/sym-p root.properties --auto-create

The output will be something like this:
[sa@CENTOS1 samples]$ ../bin/sym-p root.properties --auto-create
Log output will be written to ../logs/symmetric.log
SymmetricLauncher - Option: name=properties, value={root.properties}
SymmetricLauncher - Option: name=auto-create, value={}
PlatformFactory - The name/version pair is returned for the database PostgreSQL8
was not mapped to a known database platform. Defaulting to using just the database type of PostgreSql
PostgreSqlDbDialect - The DbDialect being used is org.jumpmind.symmetric.db.postgresql.PostgreSqlDbDialect
ConfigurationService - Initializing SymmetricDS database.
PostgreSqlDbDialect - There are SymmetricDS tables missing. They will be auto created.
PostgreSqlDbDialect - Starting auto update of SymmetricDS tables.
PostgreSqlDbDialect - Just installed sym_triggers_disabled
PostgreSqlDbDialect - Just installed sym_node_disabled
PostgreSqlDbDialect - Just installed sym_fn_sym_largeobject
ConfigurationService - Auto-configuring config channel.
ConfigurationService - Auto-configuring reload channel.
ConfigurationService - Done initializing SymmetricDS database.


It worked. This script has been created as many as 26 tables for their work. These tables begin sym_ and see a list of them by entering command
$ psql mytest -c "select tablename from pg_tables where tablename like 'sym%';"

Next. Konfigurim symmetricDS by filling the newly created tables
$ ../bin/sym-p root.properties --run-sql insert_sample.sql

The data for filling in are:

insert_sample.sql
insert into t2(mydata, intval) values('sdsdsdsds', 102);

--
-- Nodes
--
insert into sym_node_group (node_group_id, description)
values ('corp', 'Central Office');
insert into sym_node_group (node_group_id, description)
values ('store', 'Store');

insert into sym_node_group_link (source_node_group_id, target_node_group_id, data_event_action)
values ('store', 'corp', 'P');
insert into sym_node_group_link (source_node_group_id, target_node_group_id, data_event_action)
values ('corp', 'store', 'W');

insert into sym_node (node_id, node_group_id, external_id, sync_enabled)
values ('00000', 'corp', '00000', 1);
insert into sym_node_identity values ('00000');

--
-- Channels
--
insert into sym_channel
(channel_id, processing_order, max_batch_size, enabled, description)
values('channel_t2', 1, 100000, 1, 't2 data from register and back office');

--
-- Triggers
--
insert into sym_trigger
(trigger_id,source_table_name,channel_id,last_update_time,create_time)
values('trigger_t2','t2','channel_t2',current_timestamp,current_timestamp);

-- Example of a "dead" trigger, which is used to only sync the table during initial load
insert into sym_trigger
(trigger_id,source_table_name,channel_id, sync_on_insert, sync_on_update, sync_on_delete, last_update_time,create_time)
values('t2_dead','t2','channel_t2',0,0,0,current_timestamp,current_timestamp);

--
-- Routers
--

-- In this example, both routers pass everything all the time.
insert into sym_router
(router_id,source_node_group_id,target_node_group_id,create_time,last_update_time)
values('corp_store_identity', 'corp', 'store', current_timestamp, current_timestamp);

insert into sym_router
(router_id,source_node_group_id,target_node_group_id,create_time,last_update_time)
values('store_corp_identity', 'store', 'corp', current_timestamp, current_timestamp);

--
-- Trigger Router Links
--

insert into sym_trigger_router
(trigger_id,router_id,initial_load_order,last_update_time,create_time)
values('trigger_t2','corp_store_identity',100,current_timestamp,current_timestamp);

insert into sym_trigger_router
(trigger_id,router_id,initial_load_order,last_update_time,create_time)
values('trigger_t2','store_corp_identity', 200, current_timestamp, current_timestamp);

-- Example of a "dead" trigger, which is used to only sync the table during initial load
insert into sym_trigger_router
(trigger_id,router_id,initial_load_order,last_update_time,create_time)
values('trouter_dead','corp_store_identity', 300, current_timestamp, current_timestamp);


Which means the code above?

In fact, we create a configuration for SymmetricDS triggers to change values in fields of tables. Each alter table source_table_name (in this case, table t2), listed in the configuration of SymmetricDS in table sym_trigger, works code SymmetricDS. It determines which routes are associated with concretum trigger (see sym_trigger_router). If you specify only one path, for example, from the first server to the second, then the changed data on the second server will not be delivered on the first. In our case there are two channels in both directions: from a first server to the second and from the second to the first (see sym_router).

Running on the first server, the real server replication. This team will not give up control, will work to bring down the logs on the screen.
$ ../bin/sym-p root.properties --port 8080 --server

Resolved on the first server registration (run from another terminal window)
$ ../bin/sym-p root.properties --open-registration "store,1"

All. Configuring replication server for the route node is complete.

configuring SymmetricDS on the second server

Go to the second server and make settings there. Create a file in the directory symmetric-ds-2.2.2/samples:

client.properties
# The class name for the JDBC Driver
db.driver=org.postgresql.Driver

# The JDBC URL used to connect to the database
db.url=jdbc:postgresql://localhost/mytest

# The user to login as who can create and update tables
db.user=sa
# The password for the user to login as
db.password=sa
# The HTTP URL of the root node to contact for registration
registration.url=http://10.0.2.20:8080/sync
# Do not change these for running the demo
group.id=store
external.id=1
job.routing.period.time.ms=2000
# This is how often the push job will be run.
job.push.period.time.ms=5000
# This is how often the pull job will be run.
job.pull.period.time.ms=5000


Create on the second server, the same database:
$ psql postgres -c "CREATE database mytest;"

and create the same triggers, and functions:
$ ../bin/sym-p client.properties --run-ddl create_sample.xml
$ ./create_func.sh


Run on the second server, the replication process.
$ ../bin/sym-p client.properties --port 9090 --server

At this stage we have a database has not replicated, replication is not started. Rate this. go to the first server and type:
$ psql mytest -c "select * from t2;"

This command will print out the contents of the table on the root node, there will be one entry generated by the script insert_sample.sql, which we launched only at the root node.
Move on to the second server, type the same command there. The table will be empty.

Now the most basic: throw to the client the initial data set, by entering this command from the first server, and thus start the replication!
$ ../bin/sym-p root.properties --reload-node 1

After a while on the second server, the contents of the table t2 will be the same! Cheers!

If you have something went wrong, for example, you have specified a wrong IP address of the server to register, the easiest way to kill our database (along with all 26 tables of SymmetricDS) and start all over again. For this we need to invoke the command on both servers:
$ psql postgres -c "drop databаse mytest;"

and begin anew, with this line (look for it above:)
$ psql postgres -c "CREATE database mytest;"

real-world load

Write a perl script to download the information in the table on the root server.
To simulate real-world load it will produce one hundred UPDATE per second, in a loop.

test.pl
#!/usr/bin/perl

use DBI;
my $dbh = DBI- > connect("DBI:Pg:dbname=mytest", "sa", "sa");

for(my $idval = 5000; $idval < 9000; $idval++)
{
$dbh- > do("insert into t2 (id,mydata, intval) values ($idval,'some data', $idval)");
}

my $newval = 3434;
my $interval = 0;
for(my $j = 0; $j < 100000; $j++)
{
for(my $idval = 5000; $idval < 9000; $idval++)
{
$newval++;
$interval++;
$dbh- > do("UPDATE t2 set intval=$newval where id=$idval");
printf ("UPD[%04d] id=%d val=%d\r", $j, $idval, $newval);
if($interval == 100)
{
$interval = 0;
`ping localhost -w 1 > /dev/null 2>&1`;
}
}
}
$dbh->disconnect;

print "\nOK\n";


Running this script on the root node, the table will constantly change. On the second server, we will be able to see how changes are sent there, with some delay, like so:
$ psql mytest -c "select * from t2 where id=5000;"

The value of intval for this table row will lag behind what is printed by the script running on the first server.

Problems under heavy load


In this example, we, in fact, did not take into account the problem of high load on the database server. When implementing replication, SymmetricDS keeps track of all data sent to the remote database, in the table of sym_data. This table contains an entry for every insert/update, which is run on the root database. Thus, at high traffic (say, hundreds of inserts per second) the size of the table sym_data begins to grow. The growth of the table leads to an increase in the overhead of replication.

To solve this problem, you need to configure Purge Service Symmetric DS. The key parameters of this service, which must be set in root file.properties such:

start.purge.job
job.purge.period.time.ms (how often to run the Purge Job)
purge.retention.minutes (how long to store the history already sent to the second base change)

The last two parameters directly affect the maximum number of entries in the table of sym_data: it will still purge.retention.minutes * number of insert/update in a minute.

Service PurgeService should be automatically launched along with other services, such as Push and Poll (which provide synchronization of data in databases). But, if for some reason Purge Service is not started, this leads to excessive growth of the official tables of SymmetricDS, so you want to start the Purge Service manually, preferably during periods of low load on the database server:

../bin/sym –p root.properties –X

Success.
Article based on information from habrahabr.ru

Комментарии

Популярные сообщения из этого блога

Briefly on how to make your Qt geoservice plugin

Yandex.Widget + adjustIFrameHeight + MooTools