KDb/Drivers/SQLite: Difference between revisions

From KDE Community Wiki
< KDb‎ | Drivers
No edit summary
Line 146: Line 146:


===Backward compatibility in Kexi===
===Backward compatibility in Kexi===
See [https://bugs.kde.org/show_bug.cgi?id=278411 Bug 278411 - Add collation sequence for unicode characters to the SQLite driver and make it the default]
See [[#Compatibility_with_3rd-party_software|Compatibility with 3rd-party software]] section for two compatibility issues.
See [[#Compatibility_with_3rd-party_software|Compatibility with 3rd-party software]] section for two compatibility issues.


Line 152: Line 154:


'''Solution for ISSUE 2''' is not needed because Kexi does not storequery definitions in SQLite's SQL format but in KEXISQL format. <tt>COLLATE <nowiki>''</nowiki></tt> specifier is added at runtime in Kexi 2.4+ only. In Kexi < 2.4 the specifier is not added and the database defaults to BINARY collation, so there is not change in behaviour in Kexi < 2.4, what is expected.
'''Solution for ISSUE 2''' is not needed because Kexi does not storequery definitions in SQLite's SQL format but in KEXISQL format. <tt>COLLATE <nowiki>''</nowiki></tt> specifier is added at runtime in Kexi 2.4+ only. In Kexi < 2.4 the specifier is not added and the database defaults to BINARY collation, so there is not change in behaviour in Kexi < 2.4, what is expected.
===Enabling collation selection in Kexi===
See [https://bugs.kde.org/show_bug.cgi?id=289293 Wish #289293 - Allow Assigning/Creating Collating Sequences for Columns in SQLite].
This wish can be implemented in Kexi >= 2.5.
PROBLEM 1: '''collations support is backend-dependent'''. E.g. SQLite and MySQL backends have different support for collations.
PROBLEM 2: '''collations support is installation/database/connection-dependent'''. E.g.
*Two MySQL servers can be installed with different settings for default collations, so newly created database can behave differently. Solution to this is to always specify collations explicitly.
*Two existing MySQL databases can have different default collations.
*Two connections to the same MySQL server can have different default collations.
TODO: more problems?
Identified major types of collations:
*BINARY - binary comparison, default in raw SQLite, default in Kexi < 2.4
*UNICODE, i.e. ROOT, i.e. (<nowiki>''</nowiki>) in SQLite - provided by the ICU extension, default in Kexi 2.4+
*
TODO: improve this list (other backends...)


===Links===
===Links===
*[https://bugs.kde.org/show_bug.cgi?id=278411 Bug 278411 - Add collation sequence for unicode characters to the SQLite driver and make it the default]
*[https://bugs.kde.org/show_bug.cgi?id=289293 Wish #289293 - Allow Assigning/Creating Collating Sequences for Columns in SQLite]
*http://stackoverflow.com/questions/611459/how-to-sort-text-in-sqlite3-with-specified-locale
*http://stackoverflow.com/questions/611459/how-to-sort-text-in-sqlite3-with-specified-locale
*http://www.sqlite.org/c3ref/load_extension.html
*http://www.sqlite.org/c3ref/load_extension.html
*http://www.sqlite.org/lang_corefunc.html#load_extension
*http://www.sqlite.org/lang_corefunc.html#load_extension
*http://wiki.apache.org/solr/UnicodeCollation#Sorting_text_for_multiple_languages
*http://wiki.apache.org/solr/UnicodeCollation#Sorting_text_for_multiple_languages

Revision as of 22:01, 18 December 2011

Note

This page is maintained by Jstaniek


Requirements for libsqlite3 builds

Configure libsqlite3 build as follows:

mkdir builddir
cd builddir
REQUIRED_CPPFLAGS="-DSQLITE_ENABLE_COLUMN_METADATA=1"
REQUIRED_OPTIONS="--enable-load-extension"
RECOMMENDED_CPPFLAGS="-DSQLITE_ENABLE_FTS3=1 -DSQLITE_ENABLE_FTS3_PARENTHESIS=1 \
-DSQLITE_SECURE_DELETE=1 -DSQLITE_ENABLE_MEMORY_MANAGEMENT=1 \
-DSQLITE_ENABLE_RTREE=1 -DSQLITE_ENABLE_STAT2=1 \
-DSQLITE_ENABLE_UPDATE_DELETE_LIMIT=1 -DSQLITE_ENABLE_UNLOCK_NOTIFY=1 \
-DSQLITE_SOUNDEX=1"
RECOMMENDED_OPTIONS="--enable-threadsafe"
CPPFLAGS="$REQUIRED_CPPFLAGS $RECOMMENDED_CPPFLAGS" \
../sqlite-src-???/configure --prefix=.... $REQUIRED_OPTIONS $RECOMMENDED_OPTIONS
  • --enable-load-extension is a must to have support for setting up unicode collations.
  • Optional: append --enable-debug for debug version.
  • During the build, sqlite3 shell will be also build, the build options also affect the shell, e.g. .load command is available only if --enable-load-extension is present.
  • See also explanation in Deploying SQLite blog entry.

libsqlite3 status in distros

  • Debian: OK
  • Windows: OK
    • via KDE on Windows deployment system (handled by KDE packagers)
  • Mac OS X: ??

TODO... please update

Collations

By default Kexi forces ROOT collation, i.e. combined collations for all locales. Example test session in sqlite3 shell:

'''.load path/to/libkexidb_sqlite3_icu.so'''
-- original order: 3 locales mixed (polish, german, greek)
1|aaaa
2|ąąąą
3|β
4|żżżż
5|bbbb
6|ü
7|źźźź
8|ξ
9|α
10|u
-- binary collation, sqlite default
sqlite> select * from collation order by name; 
1|aaaa
5|bbbb
10|u
6|ü
2|ąąąą
7|źźźź
4|żżżż
9|α
3|β
8|ξ
-- ROOT collation
sqlite> SELECT icu_load_collation('', '');
sqlite> select * from test order by name COLLATE '';
1|aaaa
2|ąąąą
5|bbbb
10|u
6|ü
7|źźźź
4|żżżż
9|α
3|β
8|ξ
-- ROOT collation declared for table column as default collation
sqlite> create table test2 ( name text COLLATE '' );
sqlite> insert into test2 values ( "ć" );
sqlite> insert into test2 values ( "ą" );
sqlite> insert into test2 values ( "Ć" );
sqlite> insert into test2 values ( "a" );
sqlite> insert into test2 values ( "A" );
sqlite> insert into test2 values ( "C" );
-- using the default collation
sqlite> select * from test2 order by name;
a
A
ą
C
ć
Ć
-- forcing binary collation
sqlite> select * from test2 order by name COLLATE BINARY;
A
a
C
ą
Ć
ć
-- "A" > "a" for ROOT collation 
select * from test2 where name > "A"
A
C
ą
Ć
ć

Compatibility with 3rd-party software

SQLite has three simple collation sequence types built-in: NOCASE, RTRIM. Other collations have to be loaded.

The ICU-based collation sequences is provided via the SQLite extension API, so it works both in Kexi SQLite3 driver (which loads the needed collation automatically) and the sqlite3 shell (where .load command can be used). However any 3rd-party software that intends to use tables or queries with these collations have to load the plugin. This can be done either using load_extension() or SQL command or using [http://www.sqlite.org/c3ref/load_extension.html sqlite3_load_extension() C API.

sqlite3_enable_load_extension(1) have to be called before to enable extension loading. Certain tools have no extensions enabled in their usage of SQLite3 and altering of their code may be needed.

When user attempts to use tables or queries with custom collations the following error message appears (note that the message can be hidden by the 3rd-party program):

-- ISSUE 1: executing query on table with column that use custom collation, from the previous example: create table test2 ( name text COLLATE '' )
select * from test2;
-- result:
> Error: no such collation sequence: 
-- workaround is to force BINARY collation:
select * from test2 order by name COLLATE BINARY;
> OK

-- ISSUE 2: executing query that uses custom collation on table with standard collation: select * from test order by name COLLATE ''; > Error: no such collation sequence: -- workaround is to remove COLLATE part so the collation defaults to BINARY: select * from test order by name; > OK

Ensuring Kexi is built with capable version of libsqlite3

For improved modularity and to enable proper use of sqlite3 shell, custom collations are used in Kexi via extensions API of libsqlite3. The extensions API are not enabled by default for libsqlite3 builds because --enable-load-extension option has to be explicitly added to the ./configure command. Read libsqlite3 status in distros section to check status for particular OS distribution.

In order to ensure minimal, extension support is tested at configuration stage of Kexi build process.

How to deal with incapable libsqlite3 versions?

  1. Recommended solution: add needed build option(s) for the official distribution. Make Kexi >= 2.4.0 dependent on this version of renewed libsqlite3 as minimal version.
  2. Alternative solution: create custom package of libsqlite3 (and possibly libsqlite3-dev and sqlite3 shell). Use different names for the packages to enable side-by-side installation, e.g. libsqlite3ext, sqlite3ext, libsqlite3ext-dev, respectively. Make Kexi >= 2.4.0 package dependent on the custom package instead of canonical libsqlite3.

Please consider the alternative solution only as temporary. Switch to the recommended solution as soon as possible.

In any case consider adding also recommended build options in one go (the list is published in section Requirements for libsqlite3 builds. This is beneficial for all of us.

Backward compatibility in Kexi

See Bug 278411 - Add collation sequence for unicode characters to the SQLite driver and make it the default

See Compatibility with 3rd-party software section for two compatibility issues.

Original solution for ISSUE 1 (force BINARY collation) would have to be added to Kexi < 2.4 in order to enable Kexi < 2.4 to open databases created with Kexi 2.4+. This is becasue (unlike qieries) table definitions are stored by SQLite itself so COLLATE is a constraint. So better approach is to skip adding COLLATE '' for column definitions in CREATE TABLE statements and instead to add them at runtime in SQL generated for queries. There is no clear overhead in this solution. behaviour is exactly the same as in the original solution.

Solution for ISSUE 2 is not needed because Kexi does not storequery definitions in SQLite's SQL format but in KEXISQL format. COLLATE '' specifier is added at runtime in Kexi 2.4+ only. In Kexi < 2.4 the specifier is not added and the database defaults to BINARY collation, so there is not change in behaviour in Kexi < 2.4, what is expected.

Enabling collation selection in Kexi

See Wish #289293 - Allow Assigning/Creating Collating Sequences for Columns in SQLite.

This wish can be implemented in Kexi >= 2.5.

PROBLEM 1: collations support is backend-dependent. E.g. SQLite and MySQL backends have different support for collations.

PROBLEM 2: collations support is installation/database/connection-dependent. E.g.

  • Two MySQL servers can be installed with different settings for default collations, so newly created database can behave differently. Solution to this is to always specify collations explicitly.
  • Two existing MySQL databases can have different default collations.
  • Two connections to the same MySQL server can have different default collations.

TODO: more problems?

Identified major types of collations:

  • BINARY - binary comparison, default in raw SQLite, default in Kexi < 2.4
  • UNICODE, i.e. ROOT, i.e. ('') in SQLite - provided by the ICU extension, default in Kexi 2.4+

TODO: improve this list (other backends...)

Links