1NAME 2 DBD::PgLite - PostgreSQL emulation mode for SQLite 3 4SUMMARY 5 use DBI; 6 my $dbh = DBI->connect('dbi:PgLite:dbname=file'); 7 # The following PostgreSQL-flavoured SQL is invalid 8 # in SQLite directly, but works using PgLite 9 my $sql = q[ 10 SELECT 11 news_id, title, cat_id, cat_name, sc_id sc_name, 12 to_char(news_created,'FMDD.FMMM.YYYY') AS ndate 13 FROM 14 news 15 NATURAL JOIN x_news_cat 16 NATURAL JOIN cat 17 NATURAL JOIN subcat 18 WHERE 19 news_active = TRUE 20 AND news_created > NOW() - INTERVAL '7 days' 21 ]; 22 my $res = $dbh->selectall_arrayref($sql,{Columns=>{}}); 23 # From v. 0.05 with full sequence function support 24 my $get_nid = "SELECT NEXTVAL('news_news_id_seq')"; 25 my $news_id = $dbh->selectrow_array($get_nid); 26 27DESCRIPTION 28 The module automatically and transparently transforms a broad range of 29 SQL statements typical of PostgreSQL into a form suitable for use in 30 SQLite. This involves both (a) parsing and filtering of the SQL; and (b) 31 the addition of several PostgreSQL-compatible functions to SQLite. 32 33 Mainly because of datatype issues, support for many PostgreSQL features 34 simply cannot be provided without elaborate planning and detailed 35 metadata. Since this module is intended to be usable with any SQLite3 36 database, it follows that the emulation is limited in several respects. 37 An overview of what works and what doesn't is given in the following 38 section on PostgreSQL Compatibility. 39 40 DBD::PgLite has support of a sort for stored procedures. This is 41 described in the Extras section below. So are the few database functions 42 defined by this module which are not in PostgreSQL. Finally, the Extras 43 section contains a brief mention of the DBD::PgLite::MirrorPgToSQLite 44 companion module. 45 46 If you do not want SQL filtering to be turned on by default for the 47 entire session, you can connect setting the connection attribute 48 *FilterSQL* to a false value: 49 50 my $dbh = DBI->connect("dbi:PgLite:dbname=$fn", 51 undef, undef, {FilterSQL=>0}); 52 53 To turn filtering off (or on) for a single statement, you can specify 54 *FilterSQL* option as a statement attribute, e.g.: 55 56 $dbh->do($sql, {FilterSQL=>0}, @bind); 57 my $sth = $dbh->prepare($sql, {FilterSQL=>0}); 58 $res = $dbh->selectall_arrayref($sql, {FilterSQL=>0}, @bind); 59 60 It is possible to specify user-defined pre- and postfiltering routines, 61 both globally (by specifying them as attributes of the database handle) 62 and locally (by specifying them as statement attributes): 63 64 $dbh = DBI->connect("dbi:PgLite:$file",undef,undef, 65 {prefilter=>\&prefilter}); 66 $res = $dbh->selectall_arrayref($sql, 67 {postfilter=>\&postfilter}, 68 @bind_values); 69 70 The pre-/postfiltering subroutine receives the SQL as parameter and is 71 expected to return the changed SQL. 72 73STATUS OF THE MODULE 74 This module was initially developed using SQLite 3.0 and PostgreSQL 7.3, 75 but it should be fully compatible with newer versions of both SQLite 76 (3.1 and 3.2 have been tested) and PostgreSQL (8.1 has been tested). 77 78 Support for SELECT statements and the WHERE-conditions of DELETE and 79 UPDATE statements is rather good, though still incomplete. The module 80 especially focuses on NATURAL JOIN differences and commonly used, 81 built-in PostgreSQL functions. 82 83 Support for inserted/updated values in INSERT and UPDATE statements 84 could use some improvement but is useable for simple things. 85 86 There is no support for differences in DDL. 87 88 The SQL transformations used are not based on a formal grammar but on 89 applying simple regular expressions. An obvious consequence of this is 90 that they may depend excessively on the author's SQL style. YMMV. (I 91 would however like you to contact me if you come across some SQL 92 statements which you feel should work but that don't). 93 94 The development of this module has been driven by personal needs, and so 95 is likely to be even more one-sided than the above description suggests. 96 97POSTGRESQL COMPATIBILITY 98 In this section, the PostgreSQL functions and operators supported by the 99 module are enumerated. 100 101 Regex operators 102 * The regex operators "~", "~*", "!~" and "!~*" are transformed into 103 calls to the user-defined function matches(). The regex flavour 104 supported is Perl, not plain vanilla POSIX, so some 105 incompatibilities may arise. 106 107 * Note that for ease of parsing, whitespace before and after the 108 operator is required for the filtering to succeed. So "col ~ 'pat'" 109 works, but "col~'pat'" doesn't. 110 111 * "SIMILAR TO" is not supported. 112 113 * ILIKE is quietly changed to LIKE. LIKE in SQLite is case-insensitive 114 for 7-bit characters. In future, ILIKE will probably be handled more 115 elegantly, and LIKE will be redefined so as to be more like 116 PostgreSQL. 117 118 Math Functions 119 * Added: abs, cbrt, ceil, degrees, exp, floor, ln, log (1- and 120 2-argument forms), mod, pi, pow, radians, sign, sqrt, trunc (1- and 121 2-argument forms), acos, asin, atan, atan2, cos, cot, sin, tan. 122 123 * random() exists in SQLite but was redefined to conform better with 124 PostgreSQL in terms of value range. setseed() was also added, but is 125 not entirely compatible with PostgreSQL in the sense that setting 126 the random seed does not engender the same sequence of pseudo-random 127 numbers as it would in PostgreSQL. 128 129 * SQLite already has a handful of mathematical functions which have 130 been left alone, notably round() (1- and 2-argument forms). 131 132 String Functions 133 The only string functions which are present natively in SQLite are 134 substr(), lower() and upper(). These have been left alone. Added 135 functions are the following: 136 137 * ascii, bit_length, btrim, char_length, character_length, chr, 138 convert (1- and 2-arg), decode, encode, initcap, length, lpad, 139 ltrim, md5, octet_length, position, pg_client_encoding (always 140 'SQL_ASCII'), quote_ident, quote_literal, repeat, replace, rpad, 141 rtrim, split_part, strpos, substring(string,offset,length), 142 substring(string from pattern), to_ascii (assumes latin-1 input), 143 to_hex, translate, trim. 144 145 Except for convert(), where another input encoding can be specified 146 explicitly, these functions all assume that the strings are in an 8-bit 147 character set, preferably iso-8859-1. 148 149 The little-used idiom "substring(string from pattern for escape)" (where 150 'pattern' is not a POSIX regular expression but a SQL pattern) is not 151 supported. Otherwise support for string functions is pretty complete. 152 153 Data Type Formatting Functions 154 The implementation of these functions is impeded by the sparse type 155 system employed by SQLite. Workarounds are possible, however, so this 156 area will probably be better covered in future. 157 158 * to_char(timestamp, format) is mostly supported. There is support for 159 most formatting strings (all except 'Y,YYY', 'IYYY', 'IYY', 'IY', 160 'I', 'J', 'TZ', and 'tz'). The FM prefix is supported for 'MM', 'DD' 161 and 'HH*', but not otherwise. Other prefixes are not supported. 162 163 * to_char(interval, format) and to_char(number, format) are not 164 currently supported. Nor are to_date(), to_timestamp() and 165 to_number() (yet). 166 167 Date/Time Functions 168 Again, SQLite's intrinsically bad support for dates and intervals makes 169 this area somewhat hard to cover properly. Function support is as 170 follows; also note the caveats below: 171 172 * Supported: now, current_date, current_time, current_datetime, 173 date_part (with timestamps, not intervals), date_trunc, extract 174 (with timestamps, not intervals), localtime, localtimestamp, 175 timeofday. 176 177 * Not supported: age, isfinite, overlaps. 178 179 Versions of SQLite 3.1 and later support some of these functions, e.g. 180 current_date. In these versions the built-in will be overridden. 181 182 The module makes no distinction between time/timestamp with and without 183 time zone. It is assumed that times and timestamps are either all GMT or 184 all localtime; time zone information is silently discarded. This may 185 change later. 186 187 Support for calculations with dates and intervals is still very limited. 188 Basically, what is supported are expressions of the form "expr +/- 189 interval 'descr'" where expr reduces to a timestamp or date value. 190 191 If a transaction is started with begin_work(), the time as represented 192 by now() and friends is "frozen" in the same way as in PostgreSQL until 193 commit() or rollback() are called. A transaction started by simply 194 running the SQL statement "BEGIN" does not, however, trigger this 195 behaviour. Nor is the time automatically "unfrozen" when an error occurs 196 during a transaction; you need to catch exceptions and call rollback() 197 manually. 198 199 Sequence Manipulation Functions 200 * There is now full support for all explicit invocations of the 201 sequence functions nextval(), setval(), currval() and lastval(). 202 Sequences are emulated using the table pglite_seq. (This works even 203 with multiple connections to the same database file, some of which 204 are using transactions, since SQLite transactions lock the whole 205 database file, luckily eliminating any risk of two connections 206 getting the same value from a nextval() call). 207 208 Please be aware that sequences are autogenerated if they do not 209 exist. Be careful to specify the appropriate sequence names or you 210 will get unexpected results. 211 212 If a sequence being autogenerated ends with '_seq' and has a name 213 which seems to match an existing table + an integer column from that 214 table (tablename_colname_seq), it is given an initial value based on 215 the maximum value in the column in question. 216 217 There is as yet no support for CREATE SEQUENCE statements. Use the 218 autogeneration feature to create sequences. 219 220 Implicit calls to NEXTVAL() by omitting the serial column from the 221 column list in an INSERT are caught in most cases. The main 222 conditions that must be fulfilled for this to work are: (1) that the 223 column in question is an integer column which is the sole primary 224 key on the table; and (2) that the statement is a normal INSERT with 225 a column list and a VALUES clause (and not, e.g., a statement of the 226 form INSERT INTO x SELECT * FROM y). 227 228 There is as yet no interaction with the SQLite builtin 229 autoincrement/last_insert_rowid() functionality in connection with 230 the sequence function support. 231 232 Aggregate Functions 233 * max(), min(), count() and sum() are already supported by SQLite and 234 have been left alone. Note that the construct "count(distinct 235 colname)" is not supported unless the SQLite version being used 236 supports it (3.2.6 and later). 237 238 * avg() has been added. 239 240 * stddev() and variance() are not supported. 241 242 A Note on Casting 243 Casting using the construct "::datatype" is not supported in general. 244 However, "::int", "::date" and "::bool" should work as expected. All 245 other casts are silently discarded. 246 247 A Note on Booleans 248 This module assumes that booleans will be stored as numeric values in 249 the SQLite database. SQLite interprets 0 as false and any non-zero 250 numeric value as true. Accordingly, expressions such as "= TRUE" and "= 251 't'" are simply removed in SELECT and DELETE statements. Likewise, "expr 252 = FALSE" is turned into "NOT expr" before being passed on to SQLite. 253 254 In INSERT and DELETE statements, TRUE and FALSE (as well as 't'::bool 255 and 'f'::bool - but not 't' and 'f' by themselves) are turned into 1 and 256 0. 257 258 Current_user etc. 259 The functions current_user(), session_user() and user() - with or 260 without parentheses - all mean the same thing. They return the username 261 of the effective uid. 262 263 Other Functions 264 The main groups of other functions (not supported by this module at all) 265 are: 266 267 * Database/user information functions: Aside from 268 current_user/session_user/user, which were mentioned above, no 269 functions in this group are supported. This includes 270 current_database(), current_schema(), all functions with names 271 starting with 'pg_' and 'has_', obj_description and col_description. 272 See 273 http://www.postgresql.org/docs/current/static/functions-misc.html 274 275 * Array functions are not implemented - see 276 http://www.postgresql.org/docs/current/static/functions-array.html 277 278 * Binary string (BYTEA) functions are not implemented - see 279 http://www.postgresql.org/docs/current/static/functions-binarystring 280 .html 281 282 * Geometric functions are not implemented - see 283 http://www.postgresql.org/docs/current/static/functions-geometry.htm 284 l 285 286 * Network Address Functions are not implemented - see 287 http://www.postgresql.org/docs/current/static/functions-net.html 288 289EXTRAS 290 Stored Procedures 291 If the active database file contains a table called pglite_functions, 292 the module assumes that it will have the following structure: 293 294 CREATE TABLE pglite_functions ( 295 name TEXT, -- name of the function 296 argnum INT, -- number of arguments (-1 means any number) 297 type TEXT, -- can be 'sql' or 'perl' 298 sql TEXT, -- the body of the function 299 PRIMARY KEY (name, argnum) 300 ); 301 302 In the case of a SQL-type function, it can contain syntax supported 303 through the module (and not directly by SQLite). The numeric arguments 304 ($1-$9) customary in PostgreSQL are supported, so that in many cases 305 simple functions will be directly transferrable from pg_proc in a 306 PostgreSQL database. 307 308 An instance of a SQL snippet which would work as a function body both in 309 PostgreSQL and PgLite (e.g. with the function name 'full_price_descr'): 310 311 SELECT TRIM(group_name||': '||price_description) 312 FROM price_group NATURAL JOIN price 313 WHERE price_id = $1 314 315 As for perl-type functions, the function body is simply the text of a 316 subroutine. Here is a simple example of a function body for the function 317 'commify', which takes two arguments: the number to be formatted and the 318 desired number of decimal places: 319 320 sub { 321 my ($num,$dp) = @_; 322 my $format = "%.${dp}f"; 323 $num = scalar reverse(sprintf $format, $num); 324 my $rest = $1 if $num =~ s/^(\d+)\.//; 325 $num =~ s/(...)/$1,/g; 326 $num = "$rest.$num" if $rest; 327 return scalar reverse($num); 328 } 329 330 Non-Pg Functions 331 matches(), imatches(): 332 These functions are used behind the scenes to implement support for 333 the '~' regex-matching operator and its variants. They take two 334 arguments, a string and a regular expression. matches() is case 335 sensitive, imatches() isn't. 336 337 matches_safe(), imatches_safe(): 338 These work in the same way as matches() and imatches() except that 339 metacharacters are escaped in the regex argument. They are therefore 340 in many cases more suitable for user input and other untrusted 341 sources. 342 343 lower_latin1(): 344 Depending on platform, lower() and upper() may not transform the 345 case of non-ascii characters despite a proper locale being defined 346 in the environment. This functions assumes that a Latin-1 locale is 347 active and returns a lower-case version of the input given this 348 assumption. 349 350 localeorder(): 351 DBD::SQLite does not provide access to defining SQLite collation 352 functions. This is a workaround for a specific case where this 353 limitation can be an issue. Given a Latin-1 encoded string, it 354 returns a string of hex digits which can be ascii-sorted in the 355 ordinary way. The resulting row order will be in accordance with the 356 currently active locele - but only if the locale is Latin-1 based. 357 The sort is case-insensitive. 358 359 locale(): 360 An information function simply returning the name of the current 361 locale. The module sets the locale based on the environment 362 variables $ENV{LC_COLLATE}, $ENV{LC_ALL}, $ENV{LANG}, and 363 $ENV{LC_CTYPE}, in that order. Currently it is not possible to use 364 different locales for character type and collation, as far as the 365 module is concerned. 366 367 DBD::PgLite::MirrorPgToSQLite 368 The companion module, DBD::PgLite::MirrorPgToSQLite, may be of use in 369 conjunction with this module. It can be used for easily mirroring 370 specific tables from a PostgreSQL database, moving views and (some) 371 functions as well if desired. 372 373CAVEATS 374 Some functions defined by the module are not suitable for use with UTF-8 375 data and/or in an UTF-8 locale. (This, however, would be rather easy to 376 change if you're willing to sacrifice proper support for 8-bit locales 377 such as iso-8859-1). 378 379 Please do not make the mistake of using this module for an important 380 production system - too much can go wrong. But as a development tool it 381 can be useful, and as a toy it can be fun... 382 383TODO 384 There is a lot left undone. The next step is probably to handle 385 non-SELECT statements better. 386 387SEE ALSO 388 DBI, DBD::SQLite, DBD::Pg, DBD::PgLite::MirrorPgToSQLite; 389 390THANKS TO 391 Johan Vromans, for encouraging me to improve the sequence support. 392 393AUTHOR 394 Baldur Kristinsson (bk@mbl.is), 2006. 395 396 Copyright (c) 2006 Baldur Kristinsson. All rights reserved. 397 This program is free software; you can redistribute it and/or 398 modify it under the same terms as Perl itself. 399 400=================================== 401NAME 402 DBD::PgLite::MirrorPgToSQLite - Mirror tables from PostgreSQL to SQLite 403 404SUMMARY 405 use DBD::PgLite::MirrorPgToSQLite qw(pg_to_sqlite); 406 pg_to_sqlite( 407 sqlite_file => '/var/pg_mirror/news.sqlite', 408 pg_dbh => $dbh, 409 schema => 'news', 410 tables => [ qw(news cat img /^x_news/)], 411 views => [ 'v_newslist' ], 412 indexes => 1, 413 verbose => 1, 414 snapshot => 1, 415 ); 416 417USAGE 418 The purpose of this module is to facilitate mirroring of tables from a 419 PostgreSQL dataabse to a SQLite file. The module has only be tested with 420 PostgreSQL 7.3 and SQLite 3.0-3.2. SQLite 2.x will probably not work; as 421 for PostgreSQL, any version after 7.2 is supposed to work. If it 422 doesn't, please let me know. 423 424 As seen above, options to the pg_to_sqlite() function (which is exported 425 on request) are passed in as a hash. These options are described below. 426 The default values can be changed by overriding the 427 DBD::PgLite::MirrorPgToSQLite::defaults() subroutine. 428 429 Required options 430 Obviously, the mirroring function needs either a PosgtgreSQL database 431 connection or enough information to be able to connect to the database 432 by itself. It also needs the name of a target SQLite file, and a list of 433 tables to copy between the two databases. 434 435 pg_dbh, pg_user, pg_pass, pg_dsn 436 If a database handle is specified in *pg_dbh*, it takes precedence. 437 Otherwise we try to connect using *pg_dsn*, *pg_user*, and *pg_pass* 438 (which are assigned defaults based on the environment variables 439 PGDATABASE, PGUSER and PGPASSWORD, if any of these is present). 440 441 tables 442 The value of the required *tables* option should be an arrayref of 443 strings or a string containing a comma-separated list of tablenames 444 and tablename patterns. A tablename pattern is a string or distinct 445 string portion delimited by forward slashes. To clarify: Suppose 446 that a database contains the tables news, img, img_group, cat, 447 users, comments, news_read_log, x_news_cat, x_news_img, and 448 x_img_group; and that we want to mirror news, img, cat, x_news_img 449 and x_news_cat, leaving the other tables alone. To achieve this, you 450 would set the *tables* option to any of the following (there are of 451 course also other possibilities): 452 453 (1) [qw(news img cat x_news_img x_news_cat)] 454 (2) 'news, img, cat, x_news_img, x_news_cat' 455 (3) [qw(news /img$/ /cat$/)] 456 (4) 'news,/img$/,/cat/' 457 458 The purpose of this seemingly unneccesary flexibility in how the 459 table list is specified is to make the functionality of the module 460 more easily accessible from the command line. 461 462 Please note that the patterns between the slash delimiters are not 463 Perl regular expressions but rather POSIX regular expressions, used 464 to query the PostgreSQL system tables directly. 465 466 sqlite_file 467 This should specify the full path to a SQLite file. While the 468 mirroring takes place, the incoming data is not written directly to 469 this file, but to a file with the same name except for a '.tmp' 470 extension. When the operation has finished, the previous file with 471 the name specified (if any) is renamed with a '.bak' extension, and 472 the .tmp file is renamed to the requested filename. Unless you use 473 the *append* option, the information previously in the file will be 474 totally replaced. 475 476 Other options 477 schema 478 This signifies the schema from which the tables on the PostgreSQL 479 side are to be fetched. Default: 'public'. Only one schema can be 480 specified at a time. 481 482 where 483 A WHERE-condition appended to the SELECT-statement used to get data 484 from the PostgreSQL tables. 485 486 views 487 A list of views, specified in the same manner as the list of tables 488 for the *tables* option. An attempt is made to define corresponding 489 views on the SQLite side (though this functionality is far from 490 reliable). 491 492 indexes 493 A boolean option indicating whether to create indexes for the same 494 columns in SQLite as in PostgreSQL. Default: false. (Normally only 495 the primary key is created). 496 497 functions 498 A boolean indicating whether to attempt to create functions on the 499 SQLite side corresponding to any SQL language (NOT PL/pgSQL or other 500 procedural language) functions in the PostgreSQL database. This is 501 for use with DBD::PgLite only, since these functions are put into 502 the pglite_functions table. Default: false. 503 504 page_limit 505 Normally the information from the PostgreSQL tables is read into 506 memory in one go and transferred directly to the SQLite file. This 507 is, however, obviously not desireable for very large tables. If the 508 PostgreSQL system tables report that the page count for the table is 509 above the limit specified by *page_limit*, the table is instead 510 transferred row-by-row. Default value: 5000; since each page 511 normally is 8K, this represents about 40 MB on disk and perhaps 512 70-100 MB of memory usage by the Perl process. For page_limit to 513 work, the table must have a primary key. 514 515 NB! Do not set this limit lower than necessary: it is orders of 516 magnitude slower than the default "slurp into memory" mode. 517 518 append 519 If this boolean option is true, then instead of creating a new 520 SQLite file, the current contents of the *sqlite_file* are added to. 521 If a table which is being mirrored existed previously in the file, 522 it is dropped and recreated, but any tables not being copied from 523 PostgreSQL in the current run are left alone. (This is primarily 524 useful for mirroring some tables in toto, and others only in part, 525 into the same file). Default: false. Incompatible with the 526 *snapshot* option. 527 528 snapshot 529 If this is true, then the copying from PostgreSQL takes place in 530 serialized mode (transaction isolation level serializable), which 531 should ensure consistency of relations between tables linked by 532 foreign key constraints. Currently, foreign keys are not created on 533 the SQLite side, however. Default: false. Incompatible with the 534 *append* option. 535 536 cachedir 537 The current method for getting information about table structure in 538 PostgreSQL is somewhat slow, especially for databases with very many 539 tables. To offset this, table definitions are cached in a temporary 540 directory so that subsequent mirrorings of the same table will go 541 faster. The downside is, of course, that if the table structure 542 changes, the cache needs to be cleared manually. The cache directory 543 can be specified using this option; the default is 544 /tmp/sqlite_mirror_cache (with separate subdirectories for each 545 user). 546 547 verbose 548 If this is true, a few messages will be output to stderr during the 549 mirroring process. 550 551TODO 552 * Support for foreign keys is missing. 553 554 * The method used to read tables bigger than *page_limit* needs to be 555 improved. 556 557 * It would be nice to have a quick way of telling whether the cached 558 table definition of a specific table is still valid. 559 560 * Tests. 561 562AUTHOR 563 Baldur Kristinsson (bk@mbl.is), 2004-2006. 564 565 Copyright (c) 2006 Baldur Kristinsson. All rights reserved. 566 This program is free software; you can redistribute it and/or 567 modify it under the same terms as Perl itself. 568 569