1.. _rfc-28: 2 3================================================================================ 4RFC 28: OGR SQL Generalized Expressions 5================================================================================ 6 7Author: Frank Warmerdam 8 9Contact: warmerdam@pobox.com 10 11Status: Adopted, Implemented 12 13Summary 14------- 15 16The OGR SQL evaluation engine currently does not allow general purpose 17functions to be applied to columns in SELECT statements. Some special 18purpose functions are supported (ie. CAST, COUNT, AVG, MAX, MIN, and 19SUM), but not as part of more general expressions and generally in very 20constrained arrangements. It is the intent of this work item to extend 21the OGR SQL engine to support fairly general purpose expression 22evaluation in the output field list of OGR SQL SELECT statements and to 23implement a few preliminary processing functions in a fashion compatible 24with standard SQL. As well, expressions used in WHERE clauses will be 25generalized to support evaluation of non-logical operations, such as 26math and functions. For example, after implementation it is intended the 27following could be evaluated. 28 29:: 30 31 SELECT CONCAT(first_name, ' ', last_name) AS full_name FROM customers 32 SELECT id, "Regional Road" AS roadtypename FROM roads where roadtype=3 33 SELECT (subtotal+salestax) as totalcost from invoice_info where 100 <= (subtotal+salestax) 34 35A prototype implementation is now available for review in 36`http://svn.osgeo.org/gdal/sandbox/warmerdam/gdal-rfc28 <http://svn.osgeo.org/gdal/sandbox/warmerdam/gdal-rfc28>`__ 37 38Technical Approach 39------------------ 40 41Currently logical expressions take a very constrained format with the 42base elements having to be of the form * <constant_value>*. As part of 43the generalization non-logical expressions will be supported and the 44left and right side of operators will be equally treated. The current 45OGR SQL parser is ad hoc and cannot be practically extended to this 46generalized form of expression. So at this point we will move to a 47yacc/bison based parser grammar for expressions. 48 49Since it is not really practical to continue to use the existing ad hoc 50SELECT parsing when parts of the SELECT statement are expressions, the 51yacc/bison based parser will also be used to parse the whole SELECT 52statement. 53 54The current expression node will be generalized to have 0-n children 55(for arguments to functions), and to treat field references and constant 56values as distinct leaf nodes rather than embedding this information in 57a node defining an operation. 58 59It should be noted that as a side effect WHERE clauses will also support 60more general expressions - not just logical comparisons. For instance: 61 62SELECT \* WHERE (subtotal+salestax) > 100.0 63 64New Functions 65------------- 66 67- Math: +, -, \*, /, \*\* 68- String: CONCAT, SUBSTR 69 70SELECT Rules 71------------ 72 73:: 74 75 SELECT <field-list> FROM <table_def> 76 [LEFT JOIN <table_def> 77 ON [<table_ref>.]<key_field> = [<table_ref>.].<key_field>]* 78 [WHERE <where-expr>] 79 [ORDER BY <sort specification list>] 80 81 <field-list> ::= <column-spec> [ { , <column-spec> }... ] 82 83 <column-spec> ::= <field-spec> [ <as clause> ] 84 | CAST ( <field-spec> AS <data type> ) [ <as clause> ] 85 86 <field-spec> ::= [DISTINCT] <field_ref> 87 | <cumm-field-func> ( [DISTINCT] <field-ref> ) 88 | <field-expr> 89 | Count(*) 90 91 <field-expr> ::= <field_ref> 92 | <constant-value> 93 | <field-expr> <field-operator> <field-expr> 94 | <field-func> ( <field-expr-list> ) 95 | ( <field-expr> ) 96 97 <field-expr-list> ::= field-expr 98 | field-expr , field-expr-list 99 | <empty> 100 101 <as clause> ::= [ AS ] <column_name> 102 103 <data type> ::= character [ ( field_length ) ] 104 | float [ ( field_length ) ] 105 | numeric [ ( field_length [, field_precision ] ) ] 106 | integer [ ( field_length ) ] 107 | date [ ( field_length ) ] 108 | time [ ( field_length ) ] 109 | timestamp [ ( field_length ) ] 110 111 <cumm-field-func> ::= AVG | MAX | MIN | SUM | COUNT 112 113 <field-operator> ::= '+' | '-' | '/' | '*' | '||' 114 115 <field-func> ::= CONCAT | SUBSTR 116 117 <field_ref> ::= [<table_ref>.]field_name 118 119Special Notes 120------------- 121 122The existing CAST, and column summary functions COUNT, AVG, MIN, MAX and 123SUM will be treated more-or-less as functions but constrained to be root 124operations on column definitions and treated as a special case (still). 125 126Compatibility Implications 127-------------------------- 128 129Some identifiers that were previously allowed as unquoted field names 130will likely now have to be quoted as they will be keywords in the 131grammar. The keyword set is: 132 133- IN 134- LIKE 135- NULL 136- IS 137- SELECT 138- LEFT 139- JOIN 140- WHERE 141- ON 142- ORDER 143- BY 144- FROM 145- AS 146- ASC 147- DESC 148- DISTINCT 149- CAST 150 151The previous implementation was written in C and avoided all use of 152GDAL/OGR services so that it could be easily used in other contexts, 153including as the where clause evaluator of the OGDI library. After this 154update the code is C++, and direct use of CPL error and other services 155has been directly incorporated. This means the implementation used by 156GDAL and OGDI will diverge. 157 158For the most part the change results in some OGR SQL statements to work 159that would previously have generated an error. 160 161Performance Implications 162------------------------ 163 164I am hopeful that the evaluation speed will not be significantly 165different for simple selections, but each output field will need to be 166evaluated as an expression (with presumably one value-from-field node). 167 168Implementation Plan 169------------------- 170 171Frank Warmerdam will implement, test and document for the GDAL/OGR 1.8 172release. 173 174Testing 175------- 176 177All existing OGR SQL test suite tests should pass. A new 178autotest/ogr/ogr_sql_rfc28.py script will be introduced to test new 179functionality. 180 181Documentation 182------------- 183 184The :ref:`OGR SQL <ogr_sql_dialect>` document will be 185extended to describe the new capabilities. 186