My sqlite3 database contains text with special characters (e.g. letters with macrons such as Ā, Ō). Since sqlite3 does not ship with locale-aware Unicode collation out of the box, I would like to use Qt's QString::localeAwareCompare
to provide it. Results of queries using ORDER BY text
should be sorted as A < Ā < B instead of A < B < Ā.
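To make the intended order concrete, here is a minimal, Qt-free sketch of a two-level comparison: base letters are compared first, and the macron only breaks ties. The names stripMacron, primaryKey, twoLevelCompare and sortedTwoLevel are made up for this illustration, the table covers only the ten Latin macron vowels, and this is not how Qt or sqlite3 implement collation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: map a macron vowel to its base letter.
// Every other code point is returned unchanged.
char32_t stripMacron(char32_t c) {
    switch (c) {
        case U'\u0100': return U'A';  // Ā
        case U'\u0101': return U'a';  // ā
        case U'\u0112': return U'E';  // Ē
        case U'\u0113': return U'e';  // ē
        case U'\u012A': return U'I';  // Ī
        case U'\u012B': return U'i';  // ī
        case U'\u014C': return U'O';  // Ō
        case U'\u014D': return U'o';  // ō
        case U'\u016A': return U'U';  // Ū
        case U'\u016B': return U'u';  // ū
        default: return c;
    }
}

// Primary key of a code point: macron removed, ASCII letters lower-cased.
char32_t primaryKey(char32_t c) {
    c = stripMacron(c);
    if (c >= U'A' && c <= U'Z') {
        c = static_cast<char32_t>(c - U'A' + U'a');
    }
    return c;
}

// Returns a negative, zero, or positive value, like a collation callback.
int twoLevelCompare(const std::u32string &a, const std::u32string &b) {
    // Level 1: compare base letters only.
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        const char32_t ka = primaryKey(a[i]);
        const char32_t kb = primaryKey(b[i]);
        if (ka != kb) return ka < kb ? -1 : 1;
    }
    if (a.size() != b.size()) return a.size() < b.size() ? -1 : 1;
    // Level 2: same base letters, so fall back to raw code points (A < Ā).
    if (a == b) return 0;
    return a < b ? -1 : 1;
}

// Sort a copy of the input with the two-level comparison.
std::vector<std::u32string> sortedTwoLevel(std::vector<std::u32string> v) {
    std::sort(v.begin(), v.end(),
              [](const std::u32string &x, const std::u32string &y) {
                  return twoLevelCompare(x, y) < 0;
              });
    return v;
}

// The order I am after: A < Ā < B.
void checkDesiredOrder() {
    assert(twoLevelCompare(U"A", U"\u0100") < 0);
    assert(twoLevelCompare(U"\u0100", U"B") < 0);
}
```

With this toy comparison, Ābc sorts before Alpha because "b" precedes "l" once the macron is ignored at the first level.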
I was able to implement a working prototype using the following comparison method:
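// Collation callback for sqlite3_create_collation(). With SQLITE_UTF16, sqlite3
// passes both strings as UTF-16 in native byte order; ll and rl are lengths in bytes.
int sqliteLocaleAwareCompare(void *, int ll, const void *l, int rl, const void *r) {
    // Two bytes per UTF-16 code unit, hence the division by 2.
    QString left = QString::fromUtf16(static_cast<const ushort *>(l), ll / 2);
    QString right = QString::fromUtf16(static_cast<const ushort *>(r), rl / 2);
    // sqlite3 only looks at the sign of the result (negative, zero, positive).
    return QString::localeAwareCompare(left, right);
}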
The collation is then added to sqlite3 (which is compiled along with my code using the official amalgamation):
// Open the database
QSqlDatabase db = QSqlDatabase::addDatabase("QSQLITE");
db.setDatabaseName(database_path);
if (!db.open()) {
    qDebug() << "ERROR: Database could not be opened.";
}
// Retrieve the database handle
QVariant handle = db.driver()->handle();
sqlite3* sqlite_db = *static_cast<sqlite3**>(handle.data());
// Initialize sqlite3's global variables
sqlite3_initialize();
// Add the collation
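int rc = sqlite3_create_collation(sqlite_db, "LOCALE", SQLITE_UTF16, nullptr, sqliteLocaleAwareCompare);
if (rc != SQLITE_OK) {
    qDebug() << "ERROR: Collation could not be registered:" << sqlite3_errmsg(sqlite_db);
}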
My question is twofold:
Thank you in advance for your help!
Behavior on Mac OS X
Results:
Ābc
Alpha
Beta
Gamma
Ōdachi
localeAwareCompare output:
comparing: "Ōdachi" with "Ābc" result 1
comparing: "Beta" with "Gamma" result -1
comparing: "Gamma" with "Ōdachi" result -1
comparing: "Gamma" with "Ābc" result 1
comparing: "Beta" with "Ābc" result 1
comparing: "Alpha" with "Ōdachi" result -1
comparing: "Alpha" with "Gamma" result -1
comparing: "Alpha" with "Beta" result -1
comparing: "Alpha" with "Ābc" result 1
Inserting role name: "tt"
Locale:
QLocale(English, Latin, UnitedStates)
"UTF-8"
Behavior on Android
Results:
Alpha
Beta
Gamma
Ābc
Ōdachi
localeAwareCompare output:
comparing: "Ōdachi" with "Ābc" result 1
comparing: "Beta" with "Gamma" result -5
comparing: "Gamma" with "Ōdachi" result -126
comparing: "Gamma" with "Ābc" result -125
comparing: "Alpha" with "Ōdachi" result -132
comparing: "Alpha" with "Ābc" result -131
comparing: "Alpha" with "Gamma" result -6
comparing: "Alpha" with "Beta" result -1
Inserting role name: "tt"
Locale:
QLocale(English, Latin, UnitedStates)
"UTF-8"
The last two lines of each output (under "Locale:") were produced by the following code, in case it helps:
QLocale locale;
qDebug() << locale;
QTextCodec * codec = QTextCodec::codecForLocale();
qDebug() << codec->name();
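One observation about the Android numbers, in case it helps narrow things down: every result logged there equals the difference between the first differing bytes of the two strings' UTF-8 encodings (for example 'G' is 0x47 and Ō starts with 0xC5, and 0x47 - 0xC5 = -126). The sketch below reproduces those values in plain C++ without Qt. byteWiseCompare is a hypothetical helper written for this illustration, and I am only assuming that something equivalent happens on Android; I have not confirmed it in Qt's sources.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: compare two UTF-8 strings byte by byte and return the
// difference of the first differing bytes (treated as unsigned values).
int byteWiseCompare(const std::string &a, const std::string &b) {
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        const int ca = static_cast<unsigned char>(a[i]);
        const int cb = static_cast<unsigned char>(b[i]);
        if (ca != cb) return ca - cb;
    }
    // One string is a prefix of the other (or they are equal).
    if (a.size() == b.size()) return 0;
    return a.size() < b.size() ? -1 : 1;
}

// UTF-8 bytes are spelled out so the result does not depend on the
// source file encoding: Ā is C4 80, Ō is C5 8C.
const std::string kAbc = std::string("\xC4\x80") + "bc";        // "Ābc"
const std::string kOdachi = std::string("\xC5\x8C") + "dachi";  // "Ōdachi"

// Compare against three of the values from the Android log.
void checkAgainstAndroidLog() {
    assert(byteWiseCompare("Gamma", kOdachi) == -126);
    assert(byteWiseCompare("Alpha", kAbc) == -131);
    assert(byteWiseCompare("Beta", "Gamma") == -5);
}
```

Under this comparison every ASCII letter sorts before Ā and Ō, which matches the Android result order Alpha, Beta, Gamma, Ābc, Ōdachi.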