Chapter 14. Extending SQL: Types

As previously mentioned, there are two kinds of types in Postgres: base types (defined in a programming language) and composite types. Examples in this section up to interfacing indices can be found in complex.sql and complex.c. Composite examples are in funcs.sql.

14.1. User-Defined Types

14.1.1. Functions Needed for a User-Defined Type

A user-defined type must always have input and output functions. These functions determine how the type appears in strings (for input by the user and output to the user) and how the type is organized in memory. The input function takes a null-delimited character string as its input and returns the internal (in memory) representation of the type. The output function takes the internal representation of the type and returns a null delimited character string. Suppose we want to define a complex type which represents complex numbers. Naturally, we choose to represent a complex in memory as the following C structure:

typedef struct Complex {
    double      x;
    double      y;
} Complex;
     
and a string of the form (x,y) as the external string representation. These functions are usually not hard to write, especially the output function. However, there are a number of points to remember:

  • When defining your external (string) representation, remember that you must eventually write a complete and robust parser for that representation as your input function!

    Complex *
    complex_in(char *str)
    {
        double x, y;
        Complex *result;
        if (sscanf(str, " ( %lf , %lf )", &x, &y) != 2) {
            elog(ERROR, "complex_in: error in parsing %s", str);
            return NULL;
        }
        result = (Complex *)palloc(sizeof(Complex));
        result->x = x;
        result->y = y;
        return (result);
    }
    	
    The output function can simply be:
    char *
    complex_out(Complex *complex)
    {
        char *result;
        if (complex == NULL)
            return(NULL);
        result = (char *) palloc(60);
        sprintf(result, "(%g,%g)", complex->x, complex->y);
        return(result);
    }
    	

  • You should try to make the input and output functions inverses of each other. If you do not, you will have severe problems when you need to dump your data into a file and then read it back in (say, into someone else's database on another computer). This is a particularly common problem when floating-point numbers are involved.

To define the complex type, we need to create the two user-defined functions complex_in and complex_out before creating the type:

CREATE FUNCTION complex_in(opaque)
    RETURNS complex
    AS 'PGROOT/tutorial/obj/complex.so'
    LANGUAGE 'c';

CREATE FUNCTION complex_out(opaque)
    RETURNS opaque
    AS 'PGROOT/tutorial/obj/complex.so'
    LANGUAGE 'c';

CREATE TYPE complex (
    internallength = 16,
    input = complex_in,
    output = complex_out
);
     

As discussed earlier, Postgres fully supports arrays of base types. Additionally, Postgres supports arrays of user-defined types as well. When you define a type, Postgres automatically provides support for arrays of that type. For historical reasons, the array type has the same name as the user-defined type with the underscore character _ prepended. Composite types do not need any function defined on them, since the system already understands what they look like inside.

14.1.2. Large Objects

If the values of your datatype might exceed a few hundred bytes in size (in internal form), you should be careful to mark them TOASTable. To do this, the internal representation must follow the standard layout for variable-length data: the first four bytes must be an int32 containing the total length in bytes of the datum (including itself). Then, all your functions that accept values of the type must be careful to call pg_detoast_datum() on the supplied values --- after checking that the value is not NULL, if your function is not strict. Finally, select the appropriate storage option when giving the CREATE TYPE command.