Monitoring dynamic XML documents


While designing a new system that integrates devices for monitoring video/audio streams, we needed to accumulate changes in the devices' state and analyze them later. The devices expose their state through a zoo of dynamic XML documents, which exist mainly to populate a legacy web UI.

To simplify the integration, I proposed a generalized library that stores structured diffs of (nearly) arbitrary XML. Because the diffs follow the structure of the document, device state changes can be accumulated very economically and later used to generate reports with analysis, charts, and so on. After a week of binge programming I had a working proof of concept, which I want to share in this article.

Creating a document schema


The library uses an XSD as its source of information about the document structure. Obtaining one is simple: many online services will take an XML document and generate an XSD that validates it. In most cases this is sufficient.

Next, the generated XSD needs a small modification. For every element that can occur multiple times in the source XML document, add a `monId` attribute to the corresponding XSD `element`. Its value is the name of the attribute that uniquely identifies each occurrence of the repeating element. For example, suppose we want to monitor documents of the following form:

<element1>
    <element2 attr1="value1">
        <element3>
            <element4 attr2="value2">value3</element4>
            <element4 attr2="value4">value5</element4>
            <element4 attr2="value6">value7</element4>
        </element3>
    </element2>
    <element2 attr1="value8">
        <element3>
            <element4 attr2="value9">value10</element4>
            <element4 attr2="value11">value12</element4>
        </element3>
    </element2>
</element1>

From the structure of the document it is clear that at least the following elements occur multiple times:

  • /element1/element2
  • /element1/element2/element3/element4
Therefore, `monId` attributes naming the identifying attributes should be added to the corresponding XSD `element` declarations:
...
<xs:element name="element2" maxOccurs="unbounded" minOccurs="0" monId="attr1">
...
<xs:element name="element4" maxOccurs="unbounded" minOccurs="0" monId="attr2">
...

How it works


The library parses the XSD (for now only a limited subset is supported, enough to digest most automatically generated schemas) and, based on it, creates database tables corresponding to the elements of the source document.



Once the internal document map has been built, each element corresponds to a table in the database. Any change to an element adds a new record to its table; that is, each record represents an event (add, edit, delete, snapshot). To retrieve the version of a document for a given timestamp, the library scans the events recorded for each element and reconstructs its state.

Because the number of events can grow large, reconstruction takes more and more time. That is why each document must periodically save a snapshot of its current state. Reconstruction then starts not from the beginning of the document's history but from the nearest snapshot preceding the requested timestamp.

Usage


The library is written in Go and stores documents in PostgreSQL, using libpq as the database driver. In its current state it can only save XML documents and reconstruct them for an arbitrary timestamp.

Example usage
package main

import (
	"btc/data"
	"btc/mon"
	"btc/xmls"
	"database/sql"
	"log"
	"os"
	"time"
)

// install prepares the database, registers the XSD schema and adds a
// document to be monitored.
func install(db *sql.DB) {
	var err error
	if err = mon.Install(db); err != nil {
		log.Fatalf("failed to install data monitor: %s", err)
	}

	// Load the XSD (with monId attributes added) from disk.
	var root *xmls.Element
	root, err = xmls.FromFile("tmp/etr.xsd")
	if err != nil {
		log.Fatalf("failed to create the xml schema: %s", err)
	}

	schema := mon.NewSchema("etr", "probe ETR-290 checks")
	if err = mon.AddSchema(db, schema, root); err != nil {
		log.Fatalf("failed to install schema: %s", err)
	}

	// Register the document "hw4_172_etr" that uses the "etr" schema.
	doc := mon.Request("hw4_172_etr", "etr",
		"http://10.0.30.172/probe/etrdata?inputId=0&tuningSetupId=1",
		60, 86400)
	if err = mon.AddDoc(db, doc); err != nil {
		log.Fatalf("failed to add document: %s", err)
	}
}

// commit stores the current version of the document read from a local file.
func commit(db *sql.DB) {
	file, err := os.Open("tmp/etr.xml")
	if err != nil {
		log.Fatalf("failed to open xml doc: %s", err)
	}
	defer file.Close()

	if err = mon.CommitDoc(db, "hw4_172_etr", file, false); err != nil {
		log.Fatalf("failed to commit doc: %s", err)
	}
}

// checkout reconstructs the document as of the given timestamp and writes
// it to standard output.
func checkout(db *sql.DB) {
	timestamp, err := time.Parse(
		time.RFC3339, "2015-12-25T18:26:58+01:00")
	if err != nil {
		log.Fatalf("failed to parse timestamp: %s", err)
	}

	if err = mon.CheckoutDoc(
		db, "hw4_172_etr", timestamp,
		os.Stdout, " ", " "); err != nil {
		log.Fatalf("failed to checkout doc: %s", err)
	}
}

func main() {
	// NewConfig is defined elsewhere in this package (not shown here).
	config, err := NewConfig("config.json")
	if err != nil {
		log.Fatalf("failed to load config: %s", err)
	}

	var db *sql.DB
	db, err = data.Open(config.DbConnStr)
	if err != nil {
		log.Fatalf("failed to establish db connection: %s", err)
	}
	defer db.Close()

	// Uncomment the steps to run; install and commit come before checkout.
	//install(db)
	//commit(db)
	checkout(db)
}

Article based on information from habrahabr.ru
