1Rego v2 - Proposal 2 3Authors: Tristan Swadell, Tim Hinrichs, Torin Sandall 4Last-Modified: 2018-02-22 5 6# Goals 7 8This document serves to facilitate collaborative development of the design of 9a general-purpose policy language. _General-purpose_ means that the language 10should be applicable to any domain, layer of the stack, or enforcement point. 11Different implementations of the language runtime may be better suited to 12different applications. 13 14# Concepts 15 16The user-experience for policy enforcement depends heavily on the policy 17language and what concepts the user must understand to use that language. The 18proposed concepts thus far are: 19 20* **Rule** - An identifier with optional parameters that conditionally produces 21 a decision. 22 * May refer to other rules, constants, and functions. 23 * Declared within modules. 24 * May be overloaded. 25 26* **Context** - Data provided at evaluation time or through calls to external 27 data-sources. 28 * Rules may declare a signature with the expected context. 29 * External datasources may be a file, database, or API. 30 31The proposed language has the following properties: 32 33* **Side-effect free** and non-Turing complete. 34* Rules, functions, and constants are declared within modules. 35* Rules are evaluated by name or by module. 36* Decisions are qualified by the module name where they are declared. 37* Conflicts must be decided by a decision resolver or by the actor calling the 38 engine. 39 40# Rules 41 42The rule declaration provides a named entry point for evaluation and 43composition. The signature of the rule declares the context to be provided 44upon evaluation, and the body is a collection of conditions and decisions. 45 46Please note that the definition of `expr` and `literal` are taken from 47the [Common Expression Language](http://github.com/google/cel-spec) (CEL): 48 49``` 50rule_decl 51 := 'rule' id assign_expr (if_expr else_expr*)? 52 | 'rule' function_signature '{' statement+ '}' 53 ; 54const_decl 55 := decorator? id assign_expr 56 ; 57assign_expr 58 := '=' (expr | comprehension_expr) 59 ; 60function_decl 61 := decorator? 'function' function_signature ('{' statement+ '}')? 62 ; 63function_signature 64 : id '(' arg_list? ')' 65 ; 66arg_list 67 := id (',' id)* 68 ; 69decorator: 70 : '@' id 71 ; 72statement 73 := condition_expr 74 | const_decl 75 | return_expr 76 | comprehension_expr 77 ; 78condition_expr 79 := if_expr '{' statement+ '}' 80 ; 81if_expr 82 := 'if' expr 83 ; 84else_expr 85 := 'else' (expr | if_expr) 86 ; 87return_expr 88 : 'return' (expr | comprehension_expr) 89 ; 90comprehension_expr 91 := iter_expr 92 | '{' (expr '|')? iter_expr '}' 93 | '[' (expr '|')? iter_expr ']' 94 ; 95iter_expr 96 := 'for' id (',' id)? 'in' expr if_expr? statement? 97 | 'for' id (',' id)? 'in' expr if_expr? ('{' statement+ '}')? 98 ; 99``` 100 101Rules represents a decision and are declared distinctly from constants and 102functions. Rules may be constant-like or function-like. Constant-like rules 103yield decisions derived from module or system provided context. Function-like 104rules declare a function signature that indicates the context required from the 105caller. The body of the rule may contain any number of conditions, decisions, 106and local declarations. 107 108``` 109rule userSalary(resource, user) { 110 match_result = resource.match('users/{target_user}/salary'); 111 return user == match_result.group.target_user; 112} 113``` 114 115Rules may be imported and leveraged within rule decisions. Note the inclusion 116of the rule within a package and how this affects function identifier 117resolution. 118 119``` 120package acme; 121import acme.hr; 122 123rule readUserSalary(request, resource, user) { 124 match_result = resource.match('users/{target_user}/salary'); 125 if (match_result.matches() && request.method == 'get') { 126 target_user = match_result.groups.target_user 127 return (hr.isManager(user, target_user) 128 || hr.isAdmin(user) 129 || user == target_user) 130 } 131 return false; 132} 133 134// outputs -> {'rule': 'acme.readUserSalary', 'result': bool} 135``` 136 137Rules may have multiple return statements in order to enforce allow / deny 138semantics (in the case of binary rules), or simply different variations on 139an affirmative rule decision, such as whether to return a list of honey-pot 140servers versus a valid list of servers. 141 142``` 143rule readUserSalary(request, resource, user) { 144 match_result = resource.match('users/{target_user}/salary') 145 if (match_result.matches() && request.method == 'get') { 146 target_user = match_result.groups.target_user 147 // Deny requests outside of business hours. 148 if (request.time.hour() < 9 || request.time.hour() > 17) { 149 return false 150 } 151 return (hr.isManager(user, target_user) 152 || hr.isAdmin(user) 153 || user == target_user) 154 } 155 return false 156} 157``` 158 159Rules may be composed from other rules and functions. The example below 160groups reading and listing salaries into a single decision. 161 162``` 163// Rules may be composed. 164rule viewSalary(request, resource, user) { 165 return queryDepartmentSalaries(request, resource, user) 166 || readUserSalary(request, resource, user); 167} 168 169rule queryDepartmentSalaries(request, resource, user) { 170 match_result = resource.match('departments/{department}/salaries') 171 if match_result.matches() && request.method == 'list' { 172 department = match_result.groups.department; 173 return !('user' in request.params.fields) 174 && hr.department(department).manager == user; 175 } 176 return false; 177} 178 179// Evaluating of the 'acme' module produces: 180// [{'rule': 'acme.viewSalary', 'result': bool}, 181// {'rule': 'acme.viewUserSalary', 'result': bool}, 182// {'rule': 'acme.queryDepartmentSalaries', 'result': bool}] 183// 184// Evaluation of just acme.viewSalary would yield only the first decision. 185``` 186 187Rules may also be written in a constant style where overloading occurs on 188the name. Developers may choose to write logically ORed statements together 189or provide overloads as a means of augmenting the decision set that can be 190attached to a rule. 191 192``` 193// Constant-like rule which supports named evaluation. 194// 195// The authorized value is optionally assigned to the identifier 'allow' 196// depending on the request.auth condition. 197// 198// Note, if the condition evaluates false, authenticated is not assigned and 199// not included in the decision set. 200rule authorized = allow if request.auth != null 201 202// Overloads the authorized grant to also permit access to public resources. 203// 204// When the request is both authenticated and against a public resource, the 205// decision set will include two authorized results. 206// 207// A conflict resolution strategy of 'anyAllow' would ensure a single result 208// for the authorized decision. The default conflict resolution behavior could 209// be to aggregate overloaded decisions if the type supports aggregation: 210// e.g. sets, lists, boolean 211// Conflict resolution requires further discussion. 212rule authorized = allow if resource.name.contains('/public/') 213 214// Rules may also be as a conditional assignment with if-else conditions rather 215// than as overloads if the developer would like to control the overload 216// behavior. 217rule rateLimit = 100 if 'admin' in request.auth.claims 218 else 50 if 'owner' in request.auth.claims 219 else 10 220``` 221 222## Conditions 223 224Conditions are [Common Expression Language](https://github.com/google/cel-spec) 225(CEL) expressions and evaluate to a boolean outcome. A condition may be used 226to select applicable rules or to make an effect or trigger conditional. 227Conditions may also be used to filter list and map entries within a `for-in` 228expression. 229 230Declarations within a condition are within its block scope and may be shadowed 231by declarations in nested conditions. Once a declaration has been assigned, it 232cannot be reassigned. 233 234## Decisions 235 236A rule represents a conditional decision which may depend on other rule 237decisions. The types of overloaded rule declarations *should* agree. In the 238case of multiple `rule`s being evaluated where the output types do not agree, 239the decision is dynamically typed. Rule decisions are _maybe_ values, in the 240sense that the result may be a valued type or undefined. This is a crucial 241feature in support of rule overloading and evaluation with partial state. 242 243The decision semantics when multiple `rule`s are evaluated depends on the 244conflict resolution algorithm declared within the policy or on the caller. 245 246Note: conflict resolution across `rule`s are as yet undefined, but will be 247addressed in a future update to the proposal. 248 249### Triggers 250 251Triggers are not a separate concept, but rather a core consideration of how 252the `rule` declarations are designed. Since each rule emits at most one 253decision, this makes it feasible to write rules which go beyond request and 254config validation, and apply to the conditional execution of additional 255compute coordinated by the policy engine. 256 257Conceptually, triggers align with the concepts of obligations and advice 258introduced within [XACML](xacml.org). The difference between whether something 259is an obligation or advice typically boils down to whether the action to 260perform is synchronous or asynchronous. Synchronous obligations may include 261preconditions, whereas async obligations would be considered promises. Advice, 262on the other hand, should always be considered asynchronous and best-effort. 263 264The following is an example of a rule that acts as a trigger to log request 265behavior and check quota. 266 267``` 268// This rule conditionally produces metadata indicating that a quota 269// check should be performed. The evaluation engine will support the 270// registration of decision handlers whose behavior will affect the 271// overall request handling. 272rule hasQuota(request) { 273 if authenticated == allow { 274 // Should return a payload flagged as an obligation, that can be 275 // processed by a decision handler registered with the evaluator 276 // for the <package-name>.hasQuota decision. 277 return quota.check('perMethodPerUser', 278 [{name: 'method', value: request.method}, 279 {name: 'user', value: request.auth.principal}]) 280 } 281} 282 283// The payload for a log statement could be as simple as a string. 284rule log = logger.log(request.auth.principal 285 + " denied request on " 286 + resource.name) 287 if authenticated != allow 288 289// Module-based evaluation of these rules would output the following 290// decisions: 291// [{'rule': 'hasQuota', 292// 'result': { 293// 'type': 'obligation', 294// 'handler': 'quota.check', 295// 'metadata' : { 'metric': 'perMethodPerUser', args:[ ... ]}}}, 296// {'rule': 'log', 297// 'result': { 298// 'type': 'promise', 299// 'handler': 'logger.log', 300// 'metadata': {'message': ...}}}] 301``` 302 303# Tests 304 305Being able to verify the correctness policy-related logic is of paramount 306importance. As is the ability to pose ad hoc queries with partial state. To this 307end we include `@test` as supported decorator and introduce the `with-as` clause 308to assist with partial state bindings required for both adhoc queries and for 309function mocking. 310 311Note: the following is under review and not yet reflected in the grammar. 312 313``` 314rule user_owned_action(auth, resource) { 315 result = resource.matches('documents/{owner}/**') 316 return (result.owner == auth.principal 317 || resource.owner in user_groups(auth.principal)); 318} 319 320@extern function user_groups(user); 321 322function mock_user_groups(user) { … } 323 324@test function group_check() { 325 with user_groups as mock_get_group_users { 326 assertTrue(user_owned_action({principal: 'me'}, 'documents/my-group')) 327 assertFalse(user_owned_action({principal: 'me'}, 'documents/their-group')) 328 } 329} 330``` 331 332# Context 333 334Context is either supplied through arguments provided to the function, through 335constant declarations within the module, or through external functions invoked 336at evaluation time. 337 338The following are all examples of how context may be provided: 339 340``` 341package acme; 342syntax = 'rego.v2'; 343 344// Functions bound at evaluation time. 345@extern function resource(); 346@extern function query(document_name); 347 348// Constant declaration based on external function calls. 349ctx = {resource: resource().name, 350 resource_owner: resource().owner}; 351 352// Allow user-owned reads, with user and request provided as a rule argument. 353rule allow_user_reads(user, request) { 354 if (request.method in ['get', 'list']) { 355 return (request.auth.claims.email == user 356 || ctx.resource_owner == user 357 || query("/documents/" + ctx.resource).created_by == user); 358 } 359 return false; 360} 361``` 362 363It will likely be common practice to provide @extern functions within their own 364module like so: 365 366``` 367package acme.db; 368syntax = 'rego.v2'; 369@extern function resource(); 370@extern function query(document_name); 371``` 372 373``` 374package acme; 375import acme.db; 376syntax = 'rego.v2'; 377 378// Constant declaration based on external function calls. 379ctx = {'resource': db.resource().name, 380 'resource_owner': db.resource().owner}; 381 382// Allow user-owned reads, with user and request provided as a rule argument. 383rule allow_user_reads(user, request) { 384 if (request.method in ['get', 'list']) { 385 return (request.auth.claims.email == user 386 || ctx.resource_owner == user 387 || db.query("/documents/" + ctx.resource).created_by == user); 388 } 389 return false; 390} 391``` 392 393The example below shows how `acme.db` provides a library of context and 394functions for use with rules. The `@extern` decorator is equivalent to a 395forward declaration, both to serve as documentation for what exists, but 396also to be consumed during type-checking to ensure the system context and 397function hooks are being used correctly within rules. 398 399``` 400package acme.db; 401syntax = 'rego.v2'; 402@extern request; 403@extern function resource(); 404@extern function query(document_name); 405 406input = { 407 'method': request.method, 408 'user': request.auth.claims.email, 409 'resource': resource() 410} 411 412permission = { 413 'read': input.method in ['get', 'list']; 414 'write': input.method in ['create', 'update', 'delete']; 415} 416 417function resourceMatch(pattern) { 418 return resource().name.match(pattern).groups; 419} 420``` 421 422``` 423package acme; 424import acme.db; 425syntax = 'rego.v2'; 426 427// Allow user-owned reads, with context information provided by functions and 428// constants provided by acme.db in the form of module or extern declarations. 429rule authorized() { 430 target_user = db.resourceMatch('users/{target_user}/salary').target_user; 431 return db.permission.read 432 && (db.input.user == target_user 433 || db.input.resource.owner == target_user 434 || db.query("/documents/" + input.resource.name).created_by == target_user); 435} 436``` 437 438# Language 439 440Rego v2 is a series of extensions to the Common Expression Language (CEL) with 441the aim of clarifying the flow of execution from Rego v1 while preserving its 442features and incorporating the non-Turing complete, partial evaluation 443semantics of CEL. 444 445## Grammar 446 447``` 448module 449 := package? import* decls 450 ; 451package 452 := 'package' qualified_id ';' 453 ; 454import 455 := 'import' qualified_id ('as' id)? ';' 456 ; 457decls 458 := (rule_decl | function_decl | const_decl)* 459 ; 460rule_decl 461 := 'rule' id assign_expr (if_expr else_expr*)? 462 | 'rule' function_signature '{' statement+ '}' 463 ; 464const_decl 465 := id assign_expr 466 | decorator id 467 ; 468assign_expr 469 := '=' (expr | comprehension_expr) 470 ; 471function_decl 472 := decorator? 'function' function_signature ('{' statement+ '}')? 473 ; 474function_signature 475 : id '(' arg_list? ')' 476 ; 477arg_list 478 := id (',' id)* 479 ; 480decorator: 481 : '@' id 482 ; 483statement 484 := condition_expr 485 | const_decl 486 | return_expr 487 | comprehension_expr 488 | with_block 489 ; 490condition_expr 491 := if_expr '{' statement+ '}' 492 ; 493if_expr 494 := 'if' expr 495 ; 496else_expr 497 := 'else' (expr | if_expr) 498 ; 499return_expr 500 : 'return' (expr | comprehension_expr) 501 ; 502comprehension_expr 503 := iter_expr 504 | '{' (expr '|')? iter_expr '}' 505 | '[' (expr '|')? iter_expr ']' 506 ; 507iter_expr 508 := 'for' id (',' id)? 'in' expr if_expr? statement? 509 | 'for' id (',' id)? 'in' expr if_expr? ('{' statement+ '}')? 510 ; 511with_block 512 : 'with' qualified_id 'as' id '{' statement+ '}' 513 ; 514expr 515 := or_expr ('?' or_expr ':' expr)? 516 ; 517or_expr 518 := and_expr ('||' and_expr)* 519 ; 520and_expr 521 := relation_expr ('&&' relation_expr)* 522 ; 523relation_expr 524 := calc_expr 525 | relation_expr ('<'|'<='|'>='|'>'|'=='|'!='|'in') relation_expr 526 ; 527calc_expr 528 := unary_expr 529 | calc_expr ('*'|'/'|'%') calc_expr 530 | calc_expr ('+'|'-') calc_expr 531 ; 532unary_expr 533 := member_expr 534 | '!'+ member_expr 535 | '-'+ member_expr 536 ; 537member_expr 538 := primary_expr 539 | member_expr '.' id 540 | member_expr '.' id '(' expr_list? ')' 541 | member_expr '[' expr ']' 542 | qualified_id '{' field_inits? '}' 543 ; 544primary_expr 545 := '.'? id 546 | '.'? id '(' expr_list? ')' 547 | '(' expr ')' 548 | '[' expr_list? ']' 549 | '{' map_inits? '}' 550 | literal 551 ; 552expr_list 553 := expr (',' expr)* 554 ; 555field_inits 556 := id ':' expr (',' id ':' expr)* 557 ; 558map_inits 559 := expr ':' expr (',' expr : 'expr')* 560 ; 561qualified_id 562 := '.'? id ('.' id)* 563 ; 564``` 565 566Note: CEL and Rego are gradually typed languages. Although the proposal does 567not describe developer-defined type designations on constants and functions, 568this may be introduced in the future. 569 570## Constants 571 572Constants are identifiers associated with an expression. They are useful for 573clarifying the logical relationship between components of a rule decision. 574Constants referenced within expression will evaluate in the same manner as 575though they were written inline into the statement. 576 577## Functions 578 579Functions are simply collections of logical statements. Functions decorated 580with `@extern` must only declare a signature as the implementation is provided 581by the calling environment. Functions decorated with `@test` may incorporate 582`with` blocks and cannot be directly or indirectly referenced by `rule` 583declarations. 584 585Functions are idempotent. Given the same input, the functions must return 586the same output. Idempotency must hold for both local and extern functions. 587At present functions do not support overloads, though this may change in 588future iterations of this proposal. 589 590The general form of a function is as follows: 591 592``` 593function_decl 594 := decorator? 'function' id '(' arg_list? ')' ('{' statement+ '}')? 595 ; 596decorator 597 := '@' id 598 ; 599``` 600 601The set of supported decorators is limited to `@extern` and `@test`. When a 602function is annotated with `@extern`, it must be supplied at runtime as part of 603the rule evaluation context. For example: 604 605``` 606// Environment must bind a handler for 'db.get' function calls from the policy. 607package db; 608@extern function get(document_name); 609``` 610 611## Comprehensions 612 613The one key difference between a Rego v2 expression and a CEL expression is 614that Rego v2 has an explicit syntax for list comprehensions rather than use the 615optional CEL macros. This syntax simplifies quantifiers as well as mapping and 616filtering while permitting the introduction of specialized reducing functions. 617It is recommended, but not required that comprehensions be performed within 618local functions rather inline within a rule, but this is not a requirement. 619 620The general form of a comprehension is as follows: 621 622``` 623comprehension_expr 624 := iter_expr 625 | '{' (expr '|')? iter_expr '}' // set, map comprehension 626 | '[' (expr '|')? iter_expr ']' // list comprehension 627 ; 628iter_expr 629 := 'for' id (',' id)? 'in' expr iter_op_expr? statement? 630 | 'for' id (',' id)? 'in' expr iter_op_expr? ('{' statement+ '}')? 631 ; 632iter_op_expr 633 := if_expr 634 | 'all' expr 635 | 'any' expr 636 ; 637filter_expr 638 := 'if' expr 639 ; 640``` 641 642A set comprehension differs from a list comprehension only in the sense that 643the entries within the set are unique. Set comprehensions are also used for 644constructing maps and complex objects where the uniqueness constraint generally 645holds true. Within maps, colliding keys are overwritten. Within a 646comprehension, colliding keys values are merged if the value is declared as a 647list. 648 649### Filter 650 651The `for-in` expression may be filtered using a condition. The result of the 652filtering, absent of other syntax, will be the same as the range type of the 653`for-in` expression. If the for-in range is a list, the result type will be a 654list. The same is true for maps, structs, and messages. 655 656``` 657for u in users if u.clearance in ['secret', 'top secret'] // list of users 658``` 659 660Filters may be chained as well. 661 662``` 663{ user: [doc] | 664// Iteration variables may be referenced within the transform statement 665for user in users if user.clearance in ['secret', 'top_secret'] 666// The statement after a for-in is implicitly within its block scope. 667// Curly braces may be used to explicitly denote the block. 668for doc in documents if doc.clearance <= user.clearance } 669``` 670 671### Quantify 672 673The quantifiers rely on filtering and simply test the `size()` of the filter 674result. The result type of the for-in is determined by the surrounding braces. 675 676``` 677// exists one 678[ for e in list if e < 10 ].size() == 1 679 680// exists at least two unique values, note the `{}` surrounding the `for-in` 681// indicates this is the construction of a set from a list. 682{ for e in list if e.startsWith('t') }.size() > 2 683``` 684 685There are also two special operators which can be used to make qualitative 686inferences about the elements within an aggregate type, `all` and `any`. The 687primary difference between these operators and the filter expressions (or 688even standard comprehensions) is that they are accumulator functions with 689the same semantics as the logical `&&` and `||`, meaning these operators 690can absorb errors. 691 692``` 693// Any element exists in this list 694for e in list any e != 'bad_candidate' 695 696// All elements in this list must be good candidates 697for e in list all e == 'good_candidate' 698``` 699 700The `any` and `all` yield boolean outcomes and are thus specialized reducing 701functions capable of absorbing errors in the same manner as hand-rolled code. 702 703### Map 704 705It is very common for developers to massage data from a variety of sources in 706order to compose the desired format best suited for the task at hand. 707 708``` 709// Set of unique image names 710{ image.name | for release_name, image in release_images } 711 712// Map of image name to release names 713// Collisions on the image.name will overwrite the existing value. 714// Note: Consider adding syntactic sugar to deal with object construction 715// during list comprehensions. 716{ image.name : release_name | for release_name, image in release_images } 717 718// List (of pairs) comprehension. 719// There is an implicit single-statement block after a for-in if no 720// curly braces are present. 721[ [user, file] | 722 for file in files 723 for user in file.viewers + [file.owner] 724 if user.matches("@{domain}.{tld}$").groups.domain in ["styra", "google"] 725] 726 727// Equivalent to the above, but with a set of unique user, file tuples and 728// explicit block syntax. 729{ [user, file] | 730 for file in files { 731 for user in file.viewers + [file.owner] 732 if user.matches("@{domain}.{tld}$").groups.domain in ["styra", "google"] ] 733 } 734} 735``` 736 737### Reduce 738 739Rather than introduce a special syntax for reduction, Rego v2 provides the 740helper functions: `sum()`, `flatten()`, and `join()`. Additional functions 741will be added over time, but these helpers represent the most common use 742cases of reduction. 743 744## Modules 745 746A module represents a collection of functions and constants. Modules are in 747the root package unless otherwise indicated by a package declaration. Multiple 748modules may share the same package. Functions in modules within the same 749package are accessible within other modules without specifying the qualified 750function name, e.g. `module.func()`. 751 752Importing a module makes all symbols within that module accessible by either 753the fully qualified module name, or the simple module name (the last fragment 754of a qualified package identifier). 755 756 757