Datatypes

Datatypes -- An overview of datatype handling

Introduction

All DBMS provide multiple choice of data types for the information that can be stored in their database table fields. However, the set of data types made available varies from DBMS to DBMS.

To simplify the interface with the DBMS supported by MDB2, it was defined a base set of data types that applications may access independently of the underlying DBMS.

The MDB2 applications programming interface takes care of mapping data types when managing database options. It is also able to convert that is sent to and received from the underlying DBMS using the respective driver.

The following data type examples should be used with MDB2's createTable() method. The example array at the end of the data types section may be used with createTable() to create a portable table on the DBMS of choice (please refer to the main MDB2 documentation to find out what DBMS back ends are properly supported). It should also be noted that the following examples do not cover the creation and maintenance of indices, this chapter is only concerned with data types and the proper usage thereof.

"Global" table type modifiers

Within the MDB2 API there are a few modifiers that have been designed to aid in optimal table design. These are:

Building upon the above, we can say that the modifiers alter the field definition to create more specific field types for specific usage scenarios. The notnull modifier will be used in the following way to set the default DBMS NOT NULL Flag on the field to true or false, depending on the DBMS's definition of the field value: In PostgreSQL the "NOT NULL" definition will be set to "NOT NULL", whilst in MySQL (for example) the "NULL" option will be set to "NO". In order to define a "NOT NULL" field type, we simply add an extra parameter to our definition array (See the examples in the following section)
'sometime' = array(
    'type'    = 'time',
    'default' = '12:34:05',
    'notnull' = true,
),
Using the above example, we can also explore the default field operator. Default is set in the same way as the notnull operator to set a default value for the field. This value may be set in any character set that the DBMS supports for text fields, and any other valid data for the field's data type. In the above example, we have specified a valid time for the "Time" data type, '12:34:05'. Remember that when setting default dates and times, as well as datetimes, you should research and stay within the epoch of your chosen DBMS, otherwise you will encounter difficult to diagnose errors!

The above example will create a character varying field of length 12 characters in the database table. If the length definition is left out, MDB2 will create a length of the maximum allowable length for the data type specified, which may create a problem with some field types and indexing. Best practice is to define lengths for all or most of your fields.

Text data type

The text data type is available with two options for the length: one that is explicitly length limited and another of undefined length that should be as large as the database allows.

The length limited option is the most recommended for efficiency reasons. The undefined length option allows very large fields but may prevent the use of indexes, nullability and may not allow sorting on fields of its type.

The fields of this type should be able to handle 8 bit characters. Drivers take care of DBMS specific escaping of characters of special meaning with the values of the strings to be converted to this type.

By default MDB2 will use variable length character types. If fixed length types should be used can be controlled via the fixed modifier.

Boolean data type

The boolean data type represents only two values that can be either 1 or 0. Do not assume that these data types are stored as integers because some DBMS drivers may implement this type with single character text fields for a matter of efficiency. Ternary logic is possible by using null as the third possible value that may be assigned to fields of this type.

Integer data type

The integer data type may store integer values as large as each DBMS may handle. Fields of this type may be created optionally as unsigned integers but not all DBMS support it. Therefore, such option may be ignored. Truly portable applications should not rely on the availability of this option.

Decimal data type

The decimal data type may store decimal numbers accurately with a fixed number of decimal places. This data type is suitable for representing accurate values like currency amounts.

Some DBMS drivers may emulate the decimal data type using integers. Such drivers need to know in advance how many decimal places that should be used to perform eventual scale conversion when storing and retrieving values from a database. Despite this, applications may use arithmetic expressions and functions with the values stored on decimal type fields as long as any constant values that are used in the expressions are also converted with the respective MDB2 conversion functions.

The number of places that are used to the left and the right of the decimal point is pre-determined and fixed for all decimal values stored in the same database. By default, MDB2 uses 2 places to the right of the decimal point, but this may be changed when setting the database connection. The number of places available to the right of the decimal point depend on the DBMS.

It is not recommended to change the number places used to represent decimal values in database after it is installed. MDB2 does not keep track of changes in the number of decimal places. The number of decimal places can be set using the setOption() method.

Float data type

The float data type may store floating point decimal numbers. This data type is suitable for representing numbers within a large scale range that do not require high accuracy. The scale and the precision limits of the values that may be stored in a database depends on the DBMS that it is used.

Date data type

The date data type may represent dates with year, month and day. DBMS independent representation of dates is accomplished by using text strings formatted according to the IS0-8601 standard.

The format defined by the ISO-8601 standard for dates is YYYY-MM-DD where YYYY is the number of the year (Gregorian calendar), MM is the number of the month from 01 to 12 and DD is the number of the day from 01 to 31. Months or days numbered below 10 should be padded on the left with 0.

Some DBMS have native support for date formats, but for others the DBMS driver may have to represent them as integers or text values. In any case, it is always possible to make comparisons between date values as well sort query results by fields of this type.

Time data type

The time data type may represent the time of a given moment of the day. DBMS independent representation of the time of the day is also accomplished by using text strings formatted according to the ISO-8601 standard.

The format defined by the ISO-8601 standard for the time of the day is HH:MI:SS where HH is the number of hour the day from 00 to 23 and MI and SS are respectively the number of the minute and of the second from 00 to 59. Hours, minutes and seconds numbered below 10 should be padded on the left with 0.

Some DBMS have native support for time of the day formats, but for others the DBMS driver may have to represent them as integers or text values. In any case, it is always possible to make comparisons between time values as well sort query results by fields of this type.

Timestamp data type

The timestamp data type is a mere combination of the date and the time of the day data types. The representation of values of the time stamp type is accomplished by joining the date and time string values in a single string joined by a space. Therefore, the format template is YYYY-MM-DD HH:MI:SS. The represented values obey the same rules and ranges described for the date and time data types

Large object (file) data types

The large object data types are meant to store data of undefined length that may be too large to store in text fields, like data that is usually stored in files.

MDB2 supports two types of large object fields: Character Large OBjects (CLOBs) and Binary Large OBjects (BLOBs). CLOB fields are meant to store only data made of printable ASCII characters. BLOB fields are meant to store all types of data.

Large object fields are usually not meant to be used as parameters of query search clause (WHERE) unless the underlying DBMS supports a feature usually known as "full text search"

Putting it all together

The above example will create a database table as such:

Таблица 34-1. PostgreSQL

ColumnTypeNot NullDefaultcomment
idcharacter(32)   
somenamecharacter varying(12)   
somedatedate   
sometimestamptimestamp without time zone   
somebooleanboolean   
somedecimalnumeric(18,2)   
somefloatdouble precision   
sometimetime without time zoneNOT NULL'12:34:05'::time without time zone 
someclobtext   
someblobbytea   

Таблица 34-2. MySQL

FieldTypeCollationAttributesNullDefaultcomment
idchar(32)  YES  
somenamevarchar(12)latin1_swedish_ci YES  
somedatedate  YES  
sometimestamptimestamp without time zone  YES  
somebooleantinyint(1)  YES  
somedecimaldecimal(18,2)  YES  
somefloatdouble  YES  
sometimetime  NO12:34:05 
somecloblongtextlatin1_swedish_ci YES  
somebloblongblob binaryYES  

Custom Data Types and Further reading

Custom data types can be defined. This will be explored soon. For now you can refer to this blog post.

Further reading should be done at the following URL's: getTypeDeclaration, getDeclaration.


HIVE: All information for read only. Please respect copyright!
Hosted by hive ÊÃÁ: Êèåâñêàÿ ãîðîäñêàÿ áèáëèîòåêà