Stored function With PostgreSQL

Hello, habrachelovek! Many of You faced with making the business logic in the DBMS as stored functions/procedures, facilitating client. This is as advantages and disadvantages. Today I'd like to tell You how to create stored functions in the PostgreSQL, written in C. this article the basics that you need to know to start working with them.

custom functions

At the moment, PostgreSQL allows you to define custom functions the following types:
the

SQL-functions the

feature C
functions for the procedural languages (plpgsql, pltcl, plperl, etc)

SQL-functions are functions, in the body which have one or more SQLrequests, while vozneseny the result will be the result of the last query. If vozraschenie the result is not void allowed to use INSERT, UPDATE, or DELETE RETURNING.

Functions in C are statically and dynamically loadable. Statically loaded (also called internal functions) are created on the server when initializing the database cluster, dynamically-loaded — loaded by the server on demand.

Functions on procedural languages require the creation of appropriate extensions, and some of the languages can be of two types: trusted and untrusted (for the latter there is no possibility to restrict user actions). In the basic supply of PostgreSQL are plpgsql, pltcl, plperl and plpython, a list of other languages see here. Extensions for procedural languages are created using SQL:
the

CREATE EXTENSION pltcl; -- create a trusted extension pltcl
CREATE EXTENSION pltclu; -- create not trusted extensions pltcl

Or through the console (only available in plpython not trusted):
the

createlang plpythonu

Dynamically-loaded functions in C

Dynamically loaded C functions contained in a dynamically loaded (or shared) libraries that are loaded when first call to the function. Example of creating such a function:
the

CREATE OR REPLACE FUNCTION grayscale ( r double precision, g double precision, b double precision )
RETURNS double precision AS
'utils', 'grayscale'
LANGUAGE C STRICT;

This example creates a function of grayscale (having three float parameter and masraweya float), which is located in the dynamic library utils. Key word STRICT is used for function return NULL if at least one of the arguments is NULL If you do not specify an absolute path refers to the directory specified in the variable dynamic_library_path, to view the value of which can be here
the

SELECT current_setting ( 'dynamic_library_path' );

In this case, if the value of the variable or the path to the dynamic library starts with $libdir, $libdir is replaced by the path to the directory containing the library PostgreSQL, to know which is possible using the console command:
the

pg_config --by the pkglibdir

Since library loading is done with the rights of the user under which the daemon is running PostgreSQL (usually postgres), that user must have rights to access the library.

For functions there are two types of agreements: version 0 (deprecated) and version 1. The function version 0 is not portable and are limited in functionality, therefore further refers to the function version 1. To indicate that we use the version 1, a function before its definition, it is necessary to mark a special macro:
the

PG_FUNCTION_INFO_V1(grayscale);

Structure of dynamic libraries

Each library must have a magic block (single, regardless of the number of files), to be able to detect inconsistencies, for example the older version of the server PostgreSQL and versions of PostgreSQL, which includes the library. This block is declared as follows:
#ifdef PG_MODULE_MAGIC PG_MODULE_MAGIC; #endif
If necessary, you can define an initialization function _PG_init (has no parameters and vozraduysya void) that is invoked once the library is loaded, and the function is complete, the _PG_fini (which has the same signature as _PG_init), which is called before unloading the library. Please note, the documentation States that at this moment libraries are not unloaded, so the _PG_fini will never be called. Example functions:
the

void _PG_init()
{
createLog();
}

void _PG_fini()
{
destroyLog();
}

Functions in the library have a certain form, for parameters that receive arguments, return the result and some other operations there are special macros (described in detail below):
the

Datum grayscale(PG_FUNCTION_ARGS) 
{
float8 r = PG_GETARG_FLOAT8(0);
float8 g = PG_GETARG_FLOAT8(1);
float8 b = PG_GETARG_FLOAT8(2);

PG_RETURN_FLOAT8(0.299 * r + 0.587 * g + 0.114 * b);
}

data Types

The basic data types used in functions are divided into three types:
the

with fiksirovannoi length passed by value
with fiksirovannoi length passed by pointer
with variable length transmitted via pointer

Types first type can be 1, 2 or 4 bytes (or 8, if sizeof(Datum) on your platform is 8). When defining your own types (for example, through a typedef) you have to be sure that their size is the same on all architectures.

Types with a fixed length, passed by pointer, represent structures. To allocate memory for them (and the types of the third type) must be using palloc, for example:
the

typedef struct
{
float r, g, b, a;
} Color;

Color *color = (Color*)palloc(sizeof(Color));

For types third type you need to define a field (4 bytes) for storing the size of the entire type (the data size + the size of the field) and, in fact, the data itself, is continuously located in the field. This can be done using the structure:
the

typedef struct
{
int32 length;
char data[1];
} text;

The value of a field with the type size is set implicitly by using the macro SET_VARSIZE. Other macros for working with variable-length types passed by pointer:
the

char data[10];
...
text *string = (text*)palloc(VARHDRSZ + 20); // VARHDRSZ - size field with the length of the data
SET_VARSIZE(string, VARHDRSZ + 20); // SET_VARSIZE - setting the size of type
memcpy(VARDATA(string), data, 10); // VARDATA is a pointer to data of type

The correspondence between the types in the functions C and SQL listed in this table.

The types that are passed by pointer, it should be noted that you cannot modify the data pointed to by this pointer, as it may be data directly on the disk, which can lead to their damage. The consequences are not quite happy.

Structure function

The signature of the function should look like this:
the

Datum grayscale(PG_FUNCTION_ARGS);

Datum is a special type for vozraschenie values of the function, which is essentially typedef pointer. The macro PG_FUNCTION_ARGS takes place in a pointer to a structure containing meta-information about the function parameters: the call context if the value is NULL, and so on. Access function arguments is done using the macros PG_GETARG_*:
the

float8 r = PG_GETARG_FLOAT8(0); // first argument of type float8
int32 x = PG_GETARG_INT32(1); // second argument of type int32
text *msg = PG_GETARG_TEXT_P(2); // the third argument with the text*

If SQL-the function is declared with no STRICT, with PG_ARGISNULL you can check whether the argument value is NULL. To return the function result as a NULL through PG_RETURN_NULL. As an example, let's see how is the implementation of functions without a STRICT:
the

Datum grayscale(PG_FUNCTION_ARGS)
{
if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
{
PG_RETURN_NULL();
}

float8 r = PG_GETARG_FLOAT8(0);
float8 g = PG_GETARG_FLOAT8(1);
float8 b = PG_GETARG_FLOAT8(2);

PG_RETURN_FLOAT8(0.299 * r + 0.587 * g + 0.114 * b);
}

Example

Now that you know how to write stored function in C to summarize and look at the example. The environment we have is this:
the

operating system: Ubuntu 12.10

the

PostgreSQL version: 9.3

the

compiler: gcc 4.7.2

Create a file utils.c with this content:
the

#include <postgres.h>
#include <fmgr.h>

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

PG_FUNCTION_INFO_V1(grayscale);

Datum grayscale(PG_FUNCTION_ARGS)
{
float8 r = PG_GETARG_FLOAT8(0);
float8 g = PG_GETARG_FLOAT8(1);
float8 b = PG_GETARG_FLOAT8(2);

PG_RETURN_FLOAT8(0.299 * r + 0.587 * g + 0.114 * b);
}

Next, compile utils.c into an object file, it needs to be posizione-nezavisimym code (gcc this option is fpic):
the

cc-I/usr/local/pgsql/include/server -fpic -c utils.c

The command pg_config --includedir-server tells the location of directory with header files. Relink the object file as a dynamic library (if it's all right, we will have a dynamic library utils.so, copy it in /usr/local/pgsql/lib):
the

cc -shared-L/usr/local/pgsql/lib -lpq -o utils.so utils.o

Now, connect to our database and will create in it the function grayscale_с, putting some options:
the

CREATE OR REPLACE FUNCTION grayscale_c ( r double precision, g double precision, b double precision )
RETURNS double precision AS
'utils', 'grayscale'
LANGUAGE C STRICT VOLATILE COST 100;;

Check her performance:
the

SELECT grayscale_c ( 0.6, 0.5, 0.5 ); -- result: 0.5299

But that's not all. Compare this function with the same, but plpgsql. Let's call it grayscale_plpgsql:
the

CREATE OR REPLACE FUNCTION grayscale_plpgsql ( r double precision, g double precision, b double precision )
RETURNS double precision AS
$BODY$
BEGIN
RETURN 0.299 * r + 0.587 * g + 0.114 * b;
END
$BODY$
LANGUAGE plpgsql STRICT VOLATILE COST 100;

And have some test
the

CREATE TABLE color AS
SELECT random () AS r, random () AS g, random () AS b
FROM generate_series ( 1, 1000000 );

SELECT grayscale_c ( r, g, b ) FROM color; -- query time: 926 msec
SELECT grayscale_plpgsql ( r, g, b ) FROM color; -- query execution time: 3679 MS

Little test:
the

SELECT * FROM color WHERE grayscale_c ( r, g, b ) != grayscale_plpgsql ( r, g, b ); -- 0 records

A very good result.

As we have seen, the creation of a dynamically-loaded functions on C is not that complicated. Complexity, as a rule, lie in their implementation.

P. S. Thank you for your attention.

Links:
the

the Official documentation for PostgreSQL

Article based on information from habrahabr.ru

Поиск по этому блогу

computer express