Stored function With PostgreSQL


Hello, habrachelovek! Many of You faced with making the business logic in the DBMS as stored functions/procedures, facilitating client. This is as advantages and disadvantages. Today I'd like to tell You how to create stored functions in the PostgreSQL, written in C. this article the basics that you need to know to start working with them.

custom functions


At the moment, PostgreSQL allows you to define custom functions the following types:
the

    SQL-functions the

  • feature C
  • the
  • functions for the procedural languages (plpgsql, pltcl, plperl, etc)

SQL-functions are functions, in the body which have one or more SQLrequests, while vozneseny the result will be the result of the last query. If vozraschenie the result is not void allowed to use INSERT, UPDATE, or DELETE RETURNING.

Functions in C are statically and dynamically loadable. Statically loaded (also called internal functions) are created on the server when initializing the database cluster, dynamically-loaded — loaded by the server on demand.

Functions on procedural languages require the creation of appropriate extensions, and some of the languages can be of two types: trusted and untrusted (for the latter there is no possibility to restrict user actions). In the basic supply of PostgreSQL are plpgsql, pltcl, plperl and plpython, a list of other languages see here. Extensions for procedural languages are created using SQL:
the
CREATE EXTENSION pltcl; -- create a trusted extension pltcl
CREATE EXTENSION pltclu; -- create not trusted extensions pltcl

Or through the console (only available in plpython not trusted):
the
createlang plpythonu

Dynamically-loaded functions in C


Dynamically loaded C functions contained in a dynamically loaded (or shared) libraries that are loaded when first call to the function. Example of creating such a function:
the
CREATE OR REPLACE FUNCTION grayscale ( r double precision, g double precision, b double precision )
RETURNS double precision AS
'utils', 'grayscale'
LANGUAGE C STRICT;

This example creates a function of grayscale (having three float parameter and masraweya float), which is located in the dynamic library utils. Key word STRICT is used for function return NULL if at least one of the arguments is NULL If you do not specify an absolute path refers to the directory specified in the variable dynamic_library_path, to view the value of which can be here
the
SELECT current_setting ( 'dynamic_library_path' );

In this case, if the value of the variable or the path to the dynamic library starts with $libdir, $libdir is replaced by the path to the directory containing the library PostgreSQL, to know which is possible using the console command:
the
pg_config --by the pkglibdir

Since library loading is done with the rights of the user under which the daemon is running PostgreSQL (usually postgres), that user must have rights to access the library.

For functions there are two types of agreements: version 0 (deprecated) and version 1. The function version 0 is not portable and are limited in functionality, therefore further refers to the function version 1. To indicate that we use the version 1, a function before its definition, it is necessary to mark a special macro:
the
PG_FUNCTION_INFO_V1(grayscale);


Structure of dynamic libraries


Each library must have a magic block (single, regardless of the number of files), to be able to detect inconsistencies, for example the older version of the server PostgreSQL and versions of PostgreSQL, which includes the library. This block is declared as follows:
#ifdef PG_MODULE_MAGIC PG_MODULE_MAGIC; #endif
If necessary, you can define an initialization function _PG_init (has no parameters and vozraduysya void) that is invoked once the library is loaded, and the function is complete, the _PG_fini (which has the same signature as _PG_init), which is called before unloading the library. Please note, the documentation States that at this moment libraries are not unloaded, so the _PG_fini will never be called. Example functions:
the
void _PG_init()
{
createLog();
}

void _PG_fini()
{
destroyLog();
}

Functions in the library have a certain form, for parameters that receive arguments, return the result and some other operations there are special macros (described in detail below):
the
Datum grayscale(PG_FUNCTION_ARGS) 
{
float8 r = PG_GETARG_FLOAT8(0);
float8 g = PG_GETARG_FLOAT8(1);
float8 b = PG_GETARG_FLOAT8(2);

PG_RETURN_FLOAT8(0.299 * r + 0.587 * g + 0.114 * b);
}


data Types


The basic data types used in functions are divided into three types:
the
    the
  • with fiksirovannoi length passed by value
  • the
  • with fiksirovannoi length passed by pointer
  • the
  • with variable length transmitted via pointer

Types first type can be 1, 2 or 4 bytes (or 8, if sizeof(Datum) on your platform is 8). When defining your own types (for example, through a typedef) you have to be sure that their size is the same on all architectures.

Types with a fixed length, passed by pointer, represent structures. To allocate memory for them (and the types of the third type) must be using palloc, for example:
the
typedef struct
{
float r, g, b, a;
} Color;

Color *color = (Color*)palloc(sizeof(Color));

For types third type you need to define a field (4 bytes) for storing the size of the entire type (the data size + the size of the field) and, in fact, the data itself, is continuously located in the field. This can be done using the structure:
the
typedef struct
{
int32 length;
char data[1];
} text;

The value of a field with the type size is set implicitly by using the macro SET_VARSIZE. Other macros for working with variable-length types passed by pointer:
the
char data[10];
...
text *string = (text*)palloc(VARHDRSZ + 20); // VARHDRSZ - size field with the length of the data
SET_VARSIZE(string, VARHDRSZ + 20); // SET_VARSIZE - setting the size of type
memcpy(VARDATA(string), data, 10); // VARDATA is a pointer to data of type

The correspondence between the types in the functions C and SQL listed in this table.

The types that are passed by pointer, it should be noted that you cannot modify the data pointed to by this pointer, as it may be data directly on the disk, which can lead to their damage. The consequences are not quite happy.

Structure function


The signature of the function should look like this:
the
Datum grayscale(PG_FUNCTION_ARGS);

Datum is a special type for vozraschenie values of the function, which is essentially typedef pointer. The macro PG_FUNCTION_ARGS takes place in a pointer to a structure containing meta-information about the function parameters: the call context if the value is NULL, and so on. Access function arguments is done using the macros PG_GETARG_*:
the
float8 r = PG_GETARG_FLOAT8(0); // first argument of type float8
int32 x = PG_GETARG_INT32(1); // second argument of type int32
text *msg = PG_GETARG_TEXT_P(2); // the third argument with the text*

If SQL-the function is declared with no STRICT, with PG_ARGISNULL you can check whether the argument value is NULL. To return the function result as a NULL through PG_RETURN_NULL. As an example, let's see how is the implementation of functions without a STRICT:
the
Datum grayscale(PG_FUNCTION_ARGS)
{
if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
{
PG_RETURN_NULL();
}

float8 r = PG_GETARG_FLOAT8(0);
float8 g = PG_GETARG_FLOAT8(1);
float8 b = PG_GETARG_FLOAT8(2);

PG_RETURN_FLOAT8(0.299 * r + 0.587 * g + 0.114 * b);
}

Example



Now that you know how to write stored function in C to summarize and look at the example. The environment we have is this:
the
  • operating system: Ubuntu 12.10
  • the
  • PostgreSQL version: 9.3
  • the
  • compiler: gcc 4.7.2

  • Create a file utils.c with this content:
    the
    #include <postgres.h>
    #include <fmgr.h>
    
    #ifdef PG_MODULE_MAGIC
    PG_MODULE_MAGIC;
    #endif
    
    PG_FUNCTION_INFO_V1(grayscale);
    
    Datum grayscale(PG_FUNCTION_ARGS)
    {
    float8 r = PG_GETARG_FLOAT8(0);
    float8 g = PG_GETARG_FLOAT8(1);
    float8 b = PG_GETARG_FLOAT8(2);
    
    PG_RETURN_FLOAT8(0.299 * r + 0.587 * g + 0.114 * b);
    }
    

    Next, compile utils.c into an object file, it needs to be posizione-nezavisimym code (gcc this option is fpic):
    the
    cc-I/usr/local/pgsql/include/server -fpic -c utils.c
    

    The command pg_config --includedir-server tells the location of directory with header files. Relink the object file as a dynamic library (if it's all right, we will have a dynamic library utils.so, copy it in /usr/local/pgsql/lib):
    the
    cc -shared-L/usr/local/pgsql/lib -lpq -o utils.so utils.o
    

    Now, connect to our database and will create in it the function grayscale_с, putting some options:
    the
    CREATE OR REPLACE FUNCTION grayscale_c ( r double precision, g double precision, b double precision )
    RETURNS double precision AS
    'utils', 'grayscale'
    LANGUAGE C STRICT VOLATILE COST 100;;
    

    Check her performance:
    the
    SELECT grayscale_c ( 0.6, 0.5, 0.5 ); -- result: 0.5299
    

    But that's not all. Compare this function with the same, but plpgsql. Let's call it grayscale_plpgsql:
    the
    CREATE OR REPLACE FUNCTION grayscale_plpgsql ( r double precision, g double precision, b double precision )
    RETURNS double precision AS
    $BODY$
    BEGIN
    RETURN 0.299 * r + 0.587 * g + 0.114 * b;
    END
    $BODY$
    LANGUAGE plpgsql STRICT VOLATILE COST 100;
    

    And have some test
    the
    CREATE TABLE color AS
    SELECT random () AS r, random () AS g, random () AS b
    FROM generate_series ( 1, 1000000 );
    
    SELECT grayscale_c ( r, g, b ) FROM color; -- query time: 926 msec
    SELECT grayscale_plpgsql ( r, g, b ) FROM color; -- query execution time: 3679 MS
    

    Little test:
    the
    SELECT * FROM color WHERE grayscale_c ( r, g, b ) != grayscale_plpgsql ( r, g, b ); -- 0 records
    

    A very good result.

    As we have seen, the creation of a dynamically-loaded functions on C is not that complicated. Complexity, as a rule, lie in their implementation.

    P. S. Thank you for your attention.

    Links:
    the
    Article based on information from habrahabr.ru

    Комментарии

    Популярные сообщения из этого блога

    Briefly on how to make your Qt geoservice plugin

    Database replication PostgreSQL-based SymmetricDS

    Yandex.Widget + adjustIFrameHeight + MooTools