.. _querying:

Querying
========

This section will cover the basic CRUD operations commonly performed on a
relational database:

* :py:meth:`Model.create`, for executing *INSERT* queries.
* :py:meth:`Model.save` and :py:meth:`Model.update`, for executing *UPDATE*
  queries.
* :py:meth:`Model.delete_instance` and :py:meth:`Model.delete`, for executing
  *DELETE* queries.
* :py:meth:`Model.select`, for executing *SELECT* queries.

.. note::
    There is also a large collection of example queries taken from the
    `Postgresql Exercises <https://pgexercises.com/>`_ website. Examples are
    listed on the :ref:`query examples <query_examples>` document.

Creating a new record
---------------------

You can use :py:meth:`Model.create` to create a new model instance. This
method accepts keyword arguments, where the keys correspond to the names of
the model's fields. A new instance is returned and a row is added to the
table.

.. code-block:: pycon

    >>> User.create(username='Charlie')
    <__main__.User object at 0x2529350>

This will *INSERT* a new row into the database. The primary key will
automatically be retrieved and stored on the model instance.

Alternatively, you can build up a model instance programmatically and then
call :py:meth:`~Model.save`:

.. code-block:: pycon

    >>> user = User(username='Charlie')
    >>> user.save()  # save() returns the number of rows modified.
    1
    >>> user.id
    1
    >>> huey = User()
    >>> huey.username = 'Huey'
    >>> huey.save()
    1
    >>> huey.id
    2

When a model has a foreign key, you can directly assign a model instance to
the foreign key field when creating a new record.

.. code-block:: pycon

    >>> tweet = Tweet.create(user=huey, message='Hello!')

You can also use the value of the related object's primary key:

.. code-block:: pycon

    >>> tweet = Tweet.create(user=2, message='Hello again!')

If you simply wish to insert data and do not need to create a model instance,
you can use :py:meth:`Model.insert`:

.. code-block:: pycon

    >>> User.insert(username='Mickey').execute()
    3

After executing the insert query, the primary key of the new row is returned.

.. note::
    There are several ways you can speed up bulk insert operations. Check out
    the :ref:`bulk_inserts` recipe section for more information.

.. _bulk_inserts:

Bulk inserts
------------

There are several ways you can load lots of data quickly. The naive approach
is to simply call :py:meth:`Model.create` in a loop:

.. code-block:: python

    data_source = [
        {'field1': 'val1-1', 'field2': 'val1-2'},
        {'field1': 'val2-1', 'field2': 'val2-2'},
        # ...
    ]

    for data_dict in data_source:
        MyModel.create(**data_dict)

The above approach is slow for several reasons:

1. If you are not wrapping the loop in a transaction, then each call to
   :py:meth:`~Model.create` happens in its own transaction. That is going to
   be really slow!
2. There is a decent amount of Python logic getting in your way, and each
   :py:class:`InsertQuery` must be generated and parsed into SQL.
3. That's a lot of data (in terms of raw bytes of SQL) you are sending to
   your database to parse.
4. We are retrieving the *last insert id*, which causes an additional query
   to be executed in some cases.

You can get a significant speedup by simply wrapping this in a transaction
with :py:meth:`~Database.atomic`.

.. code-block:: python

    # This is much faster.
    with db.atomic():
        for data_dict in data_source:
            MyModel.create(**data_dict)

The above code still suffers from points 2, 3 and 4. We can get another big
boost by using :py:meth:`~Model.insert_many`.
This method accepts a list of
tuples or dictionaries, and inserts multiple rows in a single query:

.. code-block:: python

    data_source = [
        {'field1': 'val1-1', 'field2': 'val1-2'},
        {'field1': 'val2-1', 'field2': 'val2-2'},
        # ...
    ]

    # Fastest way to INSERT multiple rows.
    MyModel.insert_many(data_source).execute()

The :py:meth:`~Model.insert_many` method also accepts a list of row-tuples,
provided you also specify the corresponding fields:

.. code-block:: python

    # We can INSERT tuples as well...
    data = [('val1-1', 'val1-2'),
            ('val2-1', 'val2-2'),
            ('val3-1', 'val3-2')]

    # But we need to indicate which fields the values correspond to.
    MyModel.insert_many(data, fields=[MyModel.field1, MyModel.field2]).execute()

It is also good practice to wrap the bulk insert in a transaction:

.. code-block:: python

    # You can, of course, wrap this in a transaction as well:
    with db.atomic():
        MyModel.insert_many(data, fields=fields).execute()

.. note::
    SQLite users should be aware of some caveats when using bulk inserts.
    Specifically, your SQLite3 version must be 3.7.11.0 or newer to take
    advantage of the bulk insert API. Additionally, by default SQLite limits
    the number of bound variables in a SQL query to ``999`` for SQLite
    versions prior to 3.32.0 (2020-05-22), and ``32766`` for SQLite versions
    after 3.32.0.

Inserting rows in batches
^^^^^^^^^^^^^^^^^^^^^^^^^

Depending on the number of rows in your data source, you may need to break it
up into chunks. SQLite in particular typically has a `limit of 999 or 32766
<https://www.sqlite.org/limits.html#max_variable_number>`_ variables per
query, so the batch size would be ``999 // row_length`` or
``32766 // row_length``, respectively.

You can write a loop to batch your data into chunks (in which case it is
**strongly recommended** you use a transaction):

.. code-block:: python

    # Insert rows 100 at a time.
    with db.atomic():
        for idx in range(0, len(data_source), 100):
            MyModel.insert_many(data_source[idx:idx+100]).execute()

Peewee comes with a :py:func:`chunked` helper function which you can use for
*efficiently* chunking a generic iterable into a series of *batch*-sized
iterables:

.. code-block:: python

    from peewee import chunked

    # Insert rows 100 at a time.
    with db.atomic():
        for batch in chunked(data_source, 100):
            MyModel.insert_many(batch).execute()

Alternatives
^^^^^^^^^^^^

The :py:meth:`Model.bulk_create` method behaves much like
:py:meth:`Model.insert_many`, but instead accepts a list of unsaved model
instances to insert, and it optionally accepts a batch-size parameter. To use
the :py:meth:`~Model.bulk_create` API:

.. code-block:: python

    # Read list of usernames from a file, for example.
    with open('user_list.txt') as fh:
        # Create a list of unsaved User instances.
        users = [User(username=line.strip()) for line in fh.readlines()]

    # Wrap the operation in a transaction and batch INSERT the users
    # 100 at a time.
    with db.atomic():
        User.bulk_create(users, batch_size=100)

.. note::
    If you are using Postgresql (which supports the ``RETURNING`` clause),
    then the previously-unsaved model instances will have their new primary
    key values automatically populated.

In addition, Peewee also offers :py:meth:`Model.bulk_update`, which can
efficiently update one or more columns on a list of models. For example:

.. code-block:: python

    # First, create 3 users with usernames u1, u2, u3.
    u1, u2, u3 = [User.create(username='u%s' % i) for i in (1, 2, 3)]

    # Now we'll modify the user instances.
    u1.username = 'u1-x'
    u2.username = 'u2-y'
    u3.username = 'u3-z'

    # Update all three users with a single UPDATE query.
    User.bulk_update([u1, u2, u3], fields=[User.username])

.. note::
    For large lists of objects, you should specify a reasonable
    ``batch_size`` and wrap the call to :py:meth:`~Model.bulk_update` with
    :py:meth:`Database.atomic`:

    .. code-block:: python

        with database.atomic():
            User.bulk_update(list_of_users, fields=['username'], batch_size=50)

Alternatively, you can use the :py:meth:`Database.batch_commit` helper to
process chunks of rows inside *batch*-sized transactions. This method also
provides a workaround for databases other than Postgresql, when the primary
key of the newly-created rows must be obtained.

.. code-block:: python

    # List of row data to insert.
    row_data = [{'username': 'u1'}, {'username': 'u2'}, ...]

    # Assume there are 789 items in row_data. The following code will result
    # in 8 total transactions (7x100 rows + 1x89 rows).
    for row in db.batch_commit(row_data, 100):
        User.create(**row)

Bulk-loading from another table
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If the data you would like to bulk load is stored in another table, you can
also create *INSERT* queries whose source is a *SELECT* query. Use the
:py:meth:`Model.insert_from` method:

.. code-block:: python

    res = (TweetArchive
           .insert_from(
               Tweet.select(Tweet.user, Tweet.message),
               fields=[TweetArchive.user, TweetArchive.message])
           .execute())

The above query is equivalent to the following SQL:

.. code-block:: sql

    INSERT INTO "tweet_archive" ("user_id", "message")
    SELECT "user_id", "message" FROM "tweet";


Updating existing records
-------------------------

Once a model instance has a primary key, any subsequent call to
:py:meth:`~Model.save` will result in an *UPDATE* rather than another
*INSERT*. The model's primary key will not change:

.. code-block:: pycon

    >>> user.save()  # save() returns the number of rows modified.
    1
    >>> user.id
    1
    >>> user.save()
    1
    >>> user.id
    1
    >>> huey.save()
    1
    >>> huey.id
    2

If you want to update multiple records, issue an *UPDATE* query. The
following example will update all ``Tweet`` objects, marking them as
*published*, if they were created before today. :py:meth:`Model.update`
accepts keyword arguments where the keys correspond to the model's field
names:

.. code-block:: pycon

    >>> today = datetime.today()
    >>> query = Tweet.update(is_published=True).where(Tweet.creation_date < today)
    >>> query.execute()  # Returns the number of rows that were updated.
    4

For more information, see the documentation on :py:meth:`Model.update`,
:py:class:`Update` and :py:meth:`Model.bulk_update`.

.. note::
    If you would like more information on performing atomic updates (such as
    incrementing the value of a column), check out the
    :ref:`atomic update <atomic_updates>` recipes.

.. _atomic_updates:

Atomic updates
--------------

Peewee allows you to perform atomic updates. Let's suppose we need to update
some counters. The naive approach would be to write something like this:

.. code-block:: pycon

    >>> for stat in Stat.select().where(Stat.url == request.url):
    ...     stat.counter += 1
    ...     stat.save()

**Do not do this!** Not only is this slow, but it is also vulnerable to race
conditions if multiple processes are updating the counter at the same time.

Instead, you can update the counters atomically using
:py:meth:`~Model.update`:

.. code-block:: pycon

    >>> query = Stat.update(counter=Stat.counter + 1).where(Stat.url == request.url)
    >>> query.execute()

You can make these update statements as complex as you like.
Let's give all our
employees a bonus equal to their previous bonus plus 10% of their salary:

.. code-block:: pycon

    >>> query = Employee.update(bonus=(Employee.bonus + (Employee.salary * .1)))
    >>> query.execute()  # Give everyone a bonus!

We can even use a subquery to update the value of a column. Suppose we had a
denormalized column on the ``User`` model that stored the number of tweets a
user had made, and we updated this value periodically. Here is how you might
write such a query:

.. code-block:: pycon

    >>> subquery = Tweet.select(fn.COUNT(Tweet.id)).where(Tweet.user == User.id)
    >>> update = User.update(num_tweets=subquery)
    >>> update.execute()

Upsert
^^^^^^

Peewee provides support for varying types of upsert functionality. With
SQLite prior to 3.24.0 and with MySQL, Peewee offers the
:py:meth:`~Model.replace` method, which allows you to insert a record or, in
the event of a constraint violation, replace the existing record.

Example of using :py:meth:`~Model.replace` and
:py:meth:`~Insert.on_conflict_replace`:

.. code-block:: python

    class User(Model):
        username = TextField(unique=True)
        last_login = DateTimeField(null=True)

    # Insert or update the user. The "last_login" value will be updated
    # regardless of whether the user existed previously.
    user_id = (User
               .replace(username='the-user', last_login=datetime.now())
               .execute())

    # This query is equivalent:
    user_id = (User
               .insert(username='the-user', last_login=datetime.now())
               .on_conflict_replace()
               .execute())

.. note::
    In addition to *replace*, SQLite, MySQL and Postgresql provide an
    *ignore* action (see: :py:meth:`~Insert.on_conflict_ignore`) if you
    simply wish to insert and ignore any potential constraint violation.

**MySQL** supports upsert via the *ON DUPLICATE KEY UPDATE* clause. For
example:

.. code-block:: python

    class User(Model):
        username = TextField(unique=True)
        last_login = DateTimeField(null=True)
        login_count = IntegerField()

    # Insert a new user.
    User.create(username='huey', login_count=0)

    # Simulate the user logging in. The login count and timestamp will be
    # either created or updated correctly.
    now = datetime.now()
    rowid = (User
             .insert(username='huey', last_login=now, login_count=1)
             .on_conflict(
                 preserve=[User.last_login],  # Use the value we would have inserted.
                 update={User.login_count: User.login_count + 1})
             .execute())

In the above example, we could safely invoke the upsert query as many times
as we wanted. The login count will be incremented atomically, the last login
column will be updated, and no duplicate rows will be created.

**Postgresql and SQLite** (3.24.0 and newer) provide a different syntax that
allows for more granular control over which constraint violation should
trigger the conflict resolution, and which values should be updated or
preserved.

Example of using :py:meth:`~Insert.on_conflict` to perform a Postgresql-style
upsert (or SQLite 3.24+):

.. code-block:: python

    class User(Model):
        username = TextField(unique=True)
        last_login = DateTimeField(null=True)
        login_count = IntegerField()

    # Insert a new user.
    User.create(username='huey', login_count=0)

    # Simulate the user logging in. The login count and timestamp will be
    # either created or updated correctly.
    now = datetime.now()
    rowid = (User
             .insert(username='huey', last_login=now, login_count=1)
             .on_conflict(
                 conflict_target=[User.username],  # Which constraint?
                 preserve=[User.last_login],  # Use the value we would have inserted.
                 update={User.login_count: User.login_count + 1})
             .execute())

In the above example, we could safely invoke the upsert query as many times
as we wanted.
The login count will be incremented atomically, the last login
column will be updated, and no duplicate rows will be created.

.. note::
    The main difference between MySQL and Postgresql/SQLite is that
    Postgresql and SQLite require that you specify a ``conflict_target``.

Here is a more advanced (if contrived) example using the :py:class:`EXCLUDED`
namespace. The :py:class:`EXCLUDED` helper allows us to reference values in
the conflicting data. For our example, we'll assume a simple table mapping a
unique key (string) to a value (integer):

.. code-block:: python

    class KV(Model):
        key = CharField(unique=True)
        value = IntegerField()

    # Create one row.
    KV.create(key='k1', value=1)

    # Demonstrate usage of EXCLUDED.
    # Here we will attempt to insert a new value for a given key. If that
    # key already exists, then we will update its value with the *sum* of its
    # original value and the value we attempted to insert -- provided that
    # the new value is larger than the original value.
    query = (KV.insert(key='k1', value=10)
             .on_conflict(conflict_target=[KV.key],
                          update={KV.value: KV.value + EXCLUDED.value},
                          where=(EXCLUDED.value > KV.value)))

    # Executing the above query will result in the following data being
    # present in the "kv" table:
    # (key='k1', value=11)
    query.execute()

    # If we attempted to execute the query *again*, then nothing would be
    # updated, as the new value (10) is now less than the value in the
    # original row (11).

For more information, see :py:meth:`Insert.on_conflict` and
:py:class:`OnConflict`.

Deleting records
----------------

To delete a single model instance, you can use the
:py:meth:`Model.delete_instance` shortcut. :py:meth:`~Model.delete_instance`
will delete the given model instance and can optionally delete any dependent
objects recursively (by specifying ``recursive=True``).


.. code-block:: pycon

    >>> user = User.get(User.id == 1)
    >>> user.delete_instance()  # Returns the number of rows deleted.
    1

    >>> User.get(User.id == 1)
    UserDoesNotExist: instance matching query does not exist:
    SQL: SELECT t1."id", t1."username" FROM "user" AS t1 WHERE t1."id" = ?
    PARAMS: [1]

To delete an arbitrary set of rows, you can issue a *DELETE* query. The
following will delete all ``Tweet`` objects that are over one year old:

.. code-block:: pycon

    >>> query = Tweet.delete().where(Tweet.creation_date < one_year_ago)
    >>> query.execute()  # Returns the number of rows deleted.
    7

For more information, see the documentation on:

* :py:meth:`Model.delete_instance`
* :py:meth:`Model.delete`
* :py:class:`DeleteQuery`

Selecting a single record
-------------------------

You can use the :py:meth:`Model.get` method to retrieve a single instance
matching the given query. For primary-key lookups, you can also use the
shortcut method :py:meth:`Model.get_by_id`.

:py:meth:`Model.get` is a shortcut that calls :py:meth:`Model.select` with
the given query, but limits the result set to a single row. Additionally, if
no model matches the given query, a ``DoesNotExist`` exception will be
raised.

.. code-block:: pycon

    >>> User.get(User.id == 1)
    <__main__.User object at 0x25294d0>

    >>> User.get_by_id(1)  # Same as above.
    <__main__.User object at 0x252df10>

    >>> User[1]  # Also same as above.
    <__main__.User object at 0x252dd10>

    >>> User.get(User.id == 1).username
    'Charlie'

    >>> User.get(User.username == 'Charlie')
    <__main__.User object at 0x2529410>

    >>> User.get(User.username == 'nobody')
    UserDoesNotExist: instance matching query does not exist:
    SQL: SELECT t1."id", t1."username" FROM "user" AS t1 WHERE t1."username" = ?
    PARAMS: ['nobody']

For more advanced operations, you can use :py:meth:`SelectBase.get`. The
following query retrieves the latest tweet from the user named *charlie*:

.. code-block:: pycon

    >>> (Tweet
    ...  .select()
    ...  .join(User)
    ...  .where(User.username == 'charlie')
    ...  .order_by(Tweet.created_date.desc())
    ...  .get())
    <__main__.Tweet object at 0x2623410>

For more information, see the documentation on:

* :py:meth:`Model.get`
* :py:meth:`Model.get_by_id`
* :py:meth:`Model.get_or_none` - if no matching row is found, return ``None``.
* :py:meth:`Model.first`
* :py:meth:`Model.select`
* :py:meth:`SelectBase.get`

Create or get
-------------

Peewee has one helper method for performing "get/create" type operations:
:py:meth:`Model.get_or_create`, which first attempts to retrieve the matching
row. Failing that, a new row will be created.

For "create or get" type logic, typically one would rely on a *unique*
constraint or primary key to prevent the creation of duplicate objects. As an
example, let's say we wish to implement registering a new user account using
the :ref:`example User model <blog-models>`. The *User* model has a *unique*
constraint on the username field, so we will rely on the database's integrity
guarantees to ensure we don't end up with duplicate usernames:

.. code-block:: python

    try:
        with db.atomic():
            return User.create(username=username)
    except peewee.IntegrityError:
        # `username` is a unique column, so this username already exists,
        # making it safe to call .get().
        return User.get(User.username == username)

You can easily encapsulate this type of logic as a ``classmethod`` on your
own ``Model`` classes.

The above example first attempts creation, then falls back to retrieval,
relying on the database to enforce a unique constraint.
If you prefer to
attempt to retrieve the record first, you can use
:py:meth:`~Model.get_or_create`. This method is implemented along the same
lines as the Django function of the same name. You can use the Django-style
keyword argument filters to specify your ``WHERE`` conditions. The function
returns a 2-tuple containing the instance and a boolean value indicating if
the object was created.

Here is how you might implement user account creation using
:py:meth:`~Model.get_or_create`:

.. code-block:: python

    user, created = User.get_or_create(username=username)

Suppose we have a different model ``Person`` and would like to get or create
a person object. The only conditions we care about when retrieving the
``Person`` are their first and last names, **but** if we end up needing to
create a new record, we will also specify their date-of-birth and favorite
color:

.. code-block:: python

    person, created = Person.get_or_create(
        first_name=first_name,
        last_name=last_name,
        defaults={'dob': dob, 'favorite_color': 'green'})

Any keyword argument passed to :py:meth:`~Model.get_or_create` will be used
in the ``get()`` portion of the logic, except for the ``defaults``
dictionary, which will be used to populate values on newly-created instances.

For more details, read the documentation for :py:meth:`Model.get_or_create`.

Selecting multiple records
--------------------------

We can use :py:meth:`Model.select` to retrieve rows from the table. When you
construct a *SELECT* query, the database will return any rows that correspond
to your query. Peewee allows you to iterate over these rows, as well as use
indexing and slicing operations:

.. code-block:: pycon

    >>> query = User.select()
    >>> [user.username for user in query]
    ['Charlie', 'Huey', 'Peewee']

    >>> query[1]
    <__main__.User at 0x7f83e80f5550>

    >>> query[1].username
    'Huey'

    >>> query[:2]
    [<__main__.User at 0x7f83e80f53a8>, <__main__.User at 0x7f83e80f5550>]

:py:class:`Select` queries are smart, in that you can iterate, index and
slice the query multiple times, but the query is only executed once.

In the following example, we will simply call :py:meth:`~Model.select` and
iterate over the return value, which is an instance of :py:class:`Select`.
This will return all the rows in the *User* table:

.. code-block:: pycon

    >>> for user in User.select():
    ...     print(user.username)
    ...
    Charlie
    Huey
    Peewee

.. note::
    Subsequent iterations of the same query will not hit the database, as the
    results are cached. To disable this behavior (to reduce memory usage),
    call :py:meth:`Select.iterator` when iterating.

When iterating over a model that contains a foreign key, be careful with the
way you access values on related models. Accidentally resolving a foreign key
or iterating over a back-reference can cause
:ref:`N+1 query behavior <nplusone>`.

When you create a foreign key, such as ``Tweet.user``, you can use the
*backref* to create a back-reference (``User.tweets``). Back-references are
exposed as :py:class:`Select` instances:

.. code-block:: pycon

    >>> tweet = Tweet.get()
    >>> tweet.user  # Accessing a foreign key returns the related model.
    <tw.User at 0x7f3ceb017f50>

    >>> user = User.get()
    >>> user.tweets  # Accessing a back-reference returns a query.
    <peewee.ModelSelect at 0x7f73db3bafd0>

You can iterate over the ``user.tweets`` back-reference just like any other
:py:class:`Select`:

.. code-block:: pycon

    >>> for tweet in user.tweets:
    ...     print(tweet.message)
    ...
    hello world
    this is fun
    look at this picture of my food

In addition to returning model instances, :py:class:`Select` queries can
return dictionaries, tuples and namedtuples. Depending on your use-case, you
may find it easier to work with rows as dictionaries, for example:

.. code-block:: pycon

    >>> query = User.select().dicts()
    >>> for row in query:
    ...     print(row)

    {'id': 1, 'username': 'Charlie'}
    {'id': 2, 'username': 'Huey'}
    {'id': 3, 'username': 'Peewee'}

See :py:meth:`~BaseQuery.namedtuples`, :py:meth:`~BaseQuery.tuples`, and
:py:meth:`~BaseQuery.dicts` for more information.

Iterating over large result-sets
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

By default peewee will cache the rows returned when iterating over a
:py:class:`Select` query. This is an optimization to allow multiple
iterations as well as indexing and slicing without causing additional
queries. This caching can be problematic, however, when you plan to iterate
over a large number of rows.

To reduce the amount of memory used by peewee when iterating over a query,
use the :py:meth:`~BaseQuery.iterator` method. This method allows you to
iterate without caching each model returned, using much less memory when
iterating over large result sets.

.. code-block:: python

    # Let's assume we've got 10 million stat objects to dump to a csv file.
    stats = Stat.select()

    # Our imaginary serializer class
    serializer = CSVSerializer()

    # Loop over all the stats and serialize.
    for stat in stats.iterator():
        serializer.serialize_object(stat)

For simple queries you can see further speed improvements by returning rows
as dictionaries, namedtuples or tuples.
The following methods can be used on any 768:py:class:`Select` query to change the result row type: 769 770* :py:meth:`~BaseQuery.dicts` 771* :py:meth:`~BaseQuery.namedtuples` 772* :py:meth:`~BaseQuery.tuples` 773 774Don't forget to append the :py:meth:`~BaseQuery.iterator` method call to also 775reduce memory consumption. For example, the above code might look like: 776 777.. code-block:: python 778 779 # Let's assume we've got 10 million stat objects to dump to a csv file. 780 stats = Stat.select() 781 782 # Our imaginary serializer class 783 serializer = CSVSerializer() 784 785 # Loop over all the stats (rendered as tuples, without caching) and serialize. 786 for stat_tuple in stats.tuples().iterator(): 787 serializer.serialize_tuple(stat_tuple) 788 789When iterating over a large number of rows that contain columns from multiple 790tables, peewee will reconstruct the model graph for each row returned. This 791operation can be slow for complex graphs. For example, if we were selecting a 792list of tweets along with the username and avatar of the tweet's author, Peewee 793would have to create two objects for each row (a tweet and a user). In addition 794to the above row-types, there is a fourth method :py:meth:`~BaseQuery.objects` 795which will return the rows as model instances, but will not attempt to resolve 796the model graph. 797 798For example: 799 800.. code-block:: python 801 802 query = (Tweet 803 .select(Tweet, User) # Select tweet and user data. 
804 .join(User)) 805 806 # Note that the user columns are stored in a separate User instance 807 # accessible at tweet.user: 808 for tweet in query: 809 print(tweet.user.username, tweet.content) 810 811 # Using ".objects()" will not create the tweet.user object and assigns all 812 # user attributes to the tweet instance: 813 for tweet in query.objects(): 814 print(tweet.username, tweet.content) 815 816For maximum performance, you can execute queries and then iterate over the 817results using the underlying database cursor. :py:meth:`Database.execute` 818accepts a query object, executes the query, and returns a DB-API 2.0 ``Cursor`` 819object. The cursor will return the raw row-tuples: 820 821.. code-block:: python 822 823 query = Tweet.select(Tweet.content, User.username).join(User) 824 cursor = database.execute(query) 825 for (content, username) in cursor: 826 print(username, '->', content) 827 828Filtering records 829----------------- 830 831You can filter for particular records using normal python operators. Peewee 832supports a wide variety of :ref:`query operators <query-operators>`. 833 834.. code-block:: pycon 835 836 >>> user = User.get(User.username == 'Charlie') 837 >>> for tweet in Tweet.select().where(Tweet.user == user, Tweet.is_published == True): 838 ... print(tweet.user.username, '->', tweet.message) 839 ... 840 Charlie -> hello world 841 Charlie -> this is fun 842 843 >>> for tweet in Tweet.select().where(Tweet.created_date < datetime.datetime(2011, 1, 1)): 844 ... print(tweet.message, tweet.created_date) 845 ... 846 Really old tweet 2010-01-01 00:00:00 847 848You can also filter across joins: 849 850.. code-block:: pycon 851 852 >>> for tweet in Tweet.select().join(User).where(User.username == 'Charlie'): 853 ... print(tweet.message) 854 hello world 855 this is fun 856 look at this picture of my food 857 858If you want to express a complex query, use parentheses and python's bitwise 859*or* and *and* operators: 860 861.. 
code-block:: pycon 862 863 >>> Tweet.select().join(User).where( 864 ... (User.username == 'Charlie') | 865 ... (User.username == 'Peewee Herman')) 866 867.. note:: 868 Note that Peewee uses **bitwise** operators (``&`` and ``|``) rather than 869 logical operators (``and`` and ``or``). The reason for this is that Python 870 coerces the return value of logical operations to a boolean value. This is 871 also the reason why "IN" queries must be expressed using ``.in_()`` rather 872 than the ``in`` operator. 873 874Check out :ref:`the table of query operations <query-operators>` to see what 875types of queries are possible. 876 877.. note:: 878 879 A lot of fun things can go in the where clause of a query, such as: 880 881 * A field expression, e.g. ``User.username == 'Charlie'`` 882 * A function expression, e.g. ``fn.Lower(fn.Substr(User.username, 1, 1)) == 'a'`` 883 * A comparison of one column to another, e.g. ``Employee.salary < (Employee.tenure * 1000) + 40000`` 884 885 You can also nest queries, for example tweets by users whose username 886 starts with "a": 887 888 .. code-block:: python 889 890 # get users whose username starts with "a" 891 a_users = User.select().where(fn.Lower(fn.Substr(User.username, 1, 1)) == 'a') 892 893 # the ".in_()" method signifies an "IN" query 894 a_user_tweets = Tweet.select().where(Tweet.user.in_(a_users)) 895 896More query examples 897^^^^^^^^^^^^^^^^^^^ 898 899.. note:: 900 For a wide range of example queries, see the :ref:`Query Examples <query_examples>` 901 document, which shows how to implements queries from the `PostgreSQL Exercises <https://pgexercises.com/>`_ 902 website. 903 904Get active users: 905 906.. code-block:: python 907 908 User.select().where(User.active == True) 909 910Get users who are either staff or superusers: 911 912.. code-block:: python 913 914 User.select().where( 915 (User.is_staff == True) | (User.is_superuser == True)) 916 917Get tweets by user named "charlie": 918 919.. 
.. code-block:: python

    Tweet.select().join(User).where(User.username == 'charlie')

Get tweets by staff or superusers (assumes FK relationship):

.. code-block:: python

    Tweet.select().join(User).where(
        (User.is_staff == True) | (User.is_superuser == True))

Get tweets by staff or superusers using a subquery:

.. code-block:: python

    staff_super = User.select(User.id).where(
        (User.is_staff == True) | (User.is_superuser == True))
    Tweet.select().where(Tweet.user.in_(staff_super))

Sorting records
---------------

To return rows in order, use the :py:meth:`~Query.order_by` method:

.. code-block:: pycon

    >>> for t in Tweet.select().order_by(Tweet.created_date):
    ...     print(t.created_date)
    ...
    2010-01-01 00:00:00
    2011-06-07 14:08:48
    2011-06-07 14:12:57

    >>> for t in Tweet.select().order_by(Tweet.created_date.desc()):
    ...     print(t.created_date)
    ...
    2011-06-07 14:12:57
    2011-06-07 14:08:48
    2010-01-01 00:00:00

You can also use ``+`` and ``-`` prefix operators to indicate ordering:

.. code-block:: python

    # The following queries are equivalent:
    Tweet.select().order_by(Tweet.created_date.desc())

    Tweet.select().order_by(-Tweet.created_date)  # Note the "-" prefix.

    # Similarly you can use "+" to indicate ascending order, though ascending
    # is the default when no ordering is otherwise specified.
    User.select().order_by(+User.username)

You can also order across joins. Assuming you want to order tweets by the
username of the author, then by created_date:

.. code-block:: python

    query = (Tweet
             .select()
             .join(User)
             .order_by(User.username, Tweet.created_date.desc()))

.. code-block:: sql

    SELECT t1."id", t1."user_id", t1."message", t1."is_published", t1."created_date"
    FROM "tweet" AS t1
    INNER JOIN "user" AS t2
        ON t1."user_id" = t2."id"
    ORDER BY t2."username", t1."created_date" DESC

When sorting on a calculated value, you can either include the necessary SQL
expressions, or reference the alias assigned to the value. Here are two
examples illustrating these methods:

.. code-block:: python

    # Let's start with our base query. We want to get all usernames and the number of
    # tweets they've made. We wish to sort this list from users with most tweets to
    # users with fewest tweets.
    query = (User
             .select(User.username, fn.COUNT(Tweet.id).alias('num_tweets'))
             .join(Tweet, JOIN.LEFT_OUTER)
             .group_by(User.username))

You can order using the same COUNT expression used in the ``select`` clause. In
the example below we are ordering by the ``COUNT()`` of tweet ids descending:

.. code-block:: python

    query = (User
             .select(User.username, fn.COUNT(Tweet.id).alias('num_tweets'))
             .join(Tweet, JOIN.LEFT_OUTER)
             .group_by(User.username)
             .order_by(fn.COUNT(Tweet.id).desc()))

Alternatively, you can reference the alias assigned to the calculated value in
the ``select`` clause. This method has the benefit of being a bit easier to
read. Note that we are not referring to the named alias directly, but are
wrapping it using the :py:class:`SQL` helper:

.. code-block:: python

    query = (User
             .select(User.username, fn.COUNT(Tweet.id).alias('num_tweets'))
             .join(Tweet, JOIN.LEFT_OUTER)
             .group_by(User.username)
             .order_by(SQL('num_tweets').desc()))

Or, to do things the "peewee" way:

.. code-block:: python

    ntweets = fn.COUNT(Tweet.id)
    query = (User
             .select(User.username, ntweets.alias('num_tweets'))
             .join(Tweet, JOIN.LEFT_OUTER)
             .group_by(User.username)
             .order_by(ntweets.desc()))

Getting random records
----------------------

Occasionally you may want to pull a random record from the database. You can
accomplish this by ordering by the *random* or *rand* function (depending on
your database):

Postgresql and Sqlite use the *Random* function:

.. code-block:: python

    # Pick 5 lucky winners:
    LotteryNumber.select().order_by(fn.Random()).limit(5)

MySQL uses *Rand*:

.. code-block:: python

    # Pick 5 lucky winners:
    LotteryNumber.select().order_by(fn.Rand()).limit(5)

Paginating records
------------------

The :py:meth:`~Query.paginate` method makes it easy to grab a *page* of
records. :py:meth:`~Query.paginate` takes two parameters: ``page_number`` and
``items_per_page``.

.. attention::
    Page numbers are 1-based, so the first page of results will be page 1.

.. code-block:: pycon

    >>> for tweet in Tweet.select().order_by(Tweet.id).paginate(2, 10):
    ...     print(tweet.message)
    ...
    tweet 10
    tweet 11
    tweet 12
    tweet 13
    tweet 14
    tweet 15
    tweet 16
    tweet 17
    tweet 18
    tweet 19

If you would like more granular control, you can always use
:py:meth:`~Query.limit` and :py:meth:`~Query.offset`.

Counting records
----------------

You can count the number of rows in any select query:

.. code-block:: pycon

    >>> Tweet.select().count()
    100
    >>> Tweet.select().where(Tweet.id > 50).count()
    50

Peewee will wrap your query in an outer query that performs a count, which
results in SQL like:

.. code-block:: sql

    SELECT COUNT(1) FROM ( ... your query ... );

Aggregating records
-------------------

Suppose you have some users and want to get a list of them along with the count
of tweets in each.

.. code-block:: python

    query = (User
             .select(User, fn.Count(Tweet.id).alias('count'))
             .join(Tweet, JOIN.LEFT_OUTER)
             .group_by(User))

The resulting query will return *User* objects with all their normal attributes
plus an additional attribute *count* which will contain the count of tweets for
each user. We use a left outer join to include users who have no tweets.

Let's assume you have a tagging application and want to find tags that have a
certain number of related objects. For this example we'll use some different
models in a :ref:`many-to-many <manytomany>` configuration:

.. code-block:: python

    class Photo(Model):
        image = CharField()

    class Tag(Model):
        name = CharField()

    class PhotoTag(Model):
        photo = ForeignKeyField(Photo)
        tag = ForeignKeyField(Tag)

Now say we want to find tags that have more than 5 photos associated with them:

.. code-block:: python

    query = (Tag
             .select()
             .join(PhotoTag)
             .join(Photo)
             .group_by(Tag)
             .having(fn.Count(Photo.id) > 5))

This query is equivalent to the following SQL:

.. code-block:: sql

    SELECT t1."id", t1."name"
    FROM "tag" AS t1
    INNER JOIN "phototag" AS t2 ON t1."id" = t2."tag_id"
    INNER JOIN "photo" AS t3 ON t2."photo_id" = t3."id"
    GROUP BY t1."id", t1."name"
    HAVING Count(t3."id") > 5

Suppose we want to grab the associated count and store it on the tag:

.. code-block:: python

    query = (Tag
             .select(Tag, fn.Count(Photo.id).alias('count'))
             .join(PhotoTag)
             .join(Photo)
             .group_by(Tag)
             .having(fn.Count(Photo.id) > 5))

Retrieving Scalar Values
------------------------

You can retrieve scalar values by calling :py:meth:`Query.scalar`. For
instance:

.. code-block:: pycon

    >>> PageView.select(fn.Count(fn.Distinct(PageView.url))).scalar()
    100

You can retrieve multiple scalar values by passing ``as_tuple=True``:

.. code-block:: pycon

    >>> Employee.select(
    ...     fn.Min(Employee.salary), fn.Max(Employee.salary)
    ... ).scalar(as_tuple=True)
    (30000, 50000)

.. _window-functions:

Window functions
----------------

A :py:class:`Window` function refers to an aggregate function that operates on
a sliding window of data that is being processed as part of a ``SELECT`` query.
Window functions make it possible to do things like:

1. Perform aggregations against subsets of a result-set.
2. Calculate a running total.
3. Rank results.
4. Compare a row value to a value in the preceding (or succeeding!) row(s).

peewee comes with support for SQL window functions, which can be created by
calling :py:meth:`Function.over` and passing in your partitioning or ordering
parameters.

For the following examples, we'll use the following model and sample data:

.. code-block:: python

    class Sample(Model):
        counter = IntegerField()
        value = FloatField()

    data = [(1, 10),
            (1, 20),
            (2, 1),
            (2, 3),
            (3, 100)]
    Sample.insert_many(data, fields=[Sample.counter, Sample.value]).execute()

Our sample table now contains:

=== ======== ======
id  counter  value
=== ======== ======
1   1        10.0
2   1        20.0
3   2        1.0
4   2        3.0
5   3        100.0
=== ======== ======

Ordered Windows
^^^^^^^^^^^^^^^

Let's calculate a running sum of the ``value`` field. In order for it to be a
"running" sum, we need it to be ordered, so we'll order with respect to the
Sample's ``id`` field:

.. code-block:: python

    query = Sample.select(
        Sample.counter,
        Sample.value,
        fn.SUM(Sample.value).over(order_by=[Sample.id]).alias('total'))

    for sample in query:
        print(sample.counter, sample.value, sample.total)

    # 1    10.    10.
    # 1    20.    30.
    # 2     1.    31.
    # 2     3.    34.
    # 3   100    134.

For another example, we'll calculate the difference between the current value
and the previous value, when ordered by the ``id``:

.. code-block:: python

    difference = Sample.value - fn.LAG(Sample.value, 1).over(order_by=[Sample.id])
    query = Sample.select(
        Sample.counter,
        Sample.value,
        difference.alias('diff'))

    for sample in query:
        print(sample.counter, sample.value, sample.diff)

    # 1    10.   NULL
    # 1    20.    10.   -- (20 - 10)
    # 2     1.   -19.   -- (1 - 20)
    # 2     3.     2.   -- (3 - 1)
    # 3   100     97.   -- (100 - 3)

Partitioned Windows
^^^^^^^^^^^^^^^^^^^

Let's calculate the average ``value`` for each distinct "counter" value. Notice
that there are three possible values for the ``counter`` field (1, 2, and 3).
We can do this by calculating the ``AVG()`` of the ``value`` column over a
window that is partitioned depending on the ``counter`` field:

.. code-block:: python

    query = Sample.select(
        Sample.counter,
        Sample.value,
        fn.AVG(Sample.value).over(partition_by=[Sample.counter]).alias('cavg'))

    for sample in query:
        print(sample.counter, sample.value, sample.cavg)

    # 1    10.    15.
    # 1    20.    15.
    # 2     1.     2.
    # 2     3.     2.
    # 3   100    100.

We can use ordering within partitions by specifying both the ``order_by`` and
``partition_by`` parameters. For an example, let's rank the samples by value
within each distinct ``counter`` group.

.. code-block:: python

    query = Sample.select(
        Sample.counter,
        Sample.value,
        fn.RANK().over(
            order_by=[Sample.value],
            partition_by=[Sample.counter]).alias('rank'))

    for sample in query:
        print(sample.counter, sample.value, sample.rank)

    # 1    10.   1
    # 1    20.   2
    # 2     1.   1
    # 2     3.   2
    # 3   100    1

Bounded windows
^^^^^^^^^^^^^^^

By default, window functions are evaluated using an *unbounded preceding* start
for the window, and the *current row* as the end. We can change the bounds of
the window our aggregate functions operate on by specifying a ``start`` and/or
``end`` in the call to :py:meth:`Function.over`. Additionally, Peewee comes
with helper-methods on the :py:class:`Window` object for generating the
appropriate boundary references:

* :py:attr:`Window.CURRENT_ROW` - attribute that references the current row.
* :py:meth:`Window.preceding` - specify number of row(s) preceding, or omit
  number to indicate **all** preceding rows.
* :py:meth:`Window.following` - specify number of row(s) following, or omit
  number to indicate **all** following rows.

To examine how boundaries work, we'll calculate a running total of the
``value`` column, ordered with respect to ``id``, **but** we'll only look at
the running total of the current row and its two preceding rows:

.. code-block:: python

    query = Sample.select(
        Sample.counter,
        Sample.value,
        fn.SUM(Sample.value).over(
            order_by=[Sample.id],
            start=Window.preceding(2),
            end=Window.CURRENT_ROW).alias('rsum'))

    for sample in query:
        print(sample.counter, sample.value, sample.rsum)

    # 1    10.    10.
    # 1    20.    30.   -- (20 + 10)
    # 2     1.    31.   -- (1 + 20 + 10)
    # 2     3.    24.   -- (3 + 1 + 20)
    # 3   100    104.   -- (100 + 3 + 1)

.. note::
    Technically we did not need to specify the ``end=Window.CURRENT_ROW``
    because that is the default. It was shown in the example for
    demonstration.

Let's look at another example. In this example we will calculate the "opposite"
of a running total, in which the total sum of all values is decreased by the
value of the samples, ordered by ``id``. To accomplish this, we'll calculate
the sum from the current row to the last row.

.. code-block:: python

    query = Sample.select(
        Sample.counter,
        Sample.value,
        fn.SUM(Sample.value).over(
            order_by=[Sample.id],
            start=Window.CURRENT_ROW,
            end=Window.following()).alias('rsum'))

    # 1    10.   134.   -- (10 + 20 + 1 + 3 + 100)
    # 1    20.   124.   -- (20 + 1 + 3 + 100)
    # 2     1.   104.   -- (1 + 3 + 100)
    # 2     3.   103.   -- (3 + 100)
    # 3   100    100.   -- (100)

Filtered Aggregates
^^^^^^^^^^^^^^^^^^^

Aggregate functions may also support filter functions (Postgres and Sqlite
3.25+), which get translated into a ``FILTER (WHERE...)`` clause. Filter
expressions are added to an aggregate function with the
:py:meth:`Function.filter` method.

For an example, we will calculate the running sum of the ``value`` field with
respect to the ``id``, but we will filter out any samples whose ``counter=2``.

.. code-block:: python

    query = Sample.select(
        Sample.counter,
        Sample.value,
        fn.SUM(Sample.value).filter(Sample.counter != 2).over(
            order_by=[Sample.id]).alias('csum'))

    for sample in query:
        print(sample.counter, sample.value, sample.csum)

    # 1    10.    10.
    # 1    20.    30.
    # 2     1.    30.
    # 2     3.    30.
    # 3   100    130.

.. note::
    The call to :py:meth:`~Function.filter` must precede the call to
    :py:meth:`~Function.over`.

Reusing Window Definitions
^^^^^^^^^^^^^^^^^^^^^^^^^^

If you intend to use the same window definition for multiple aggregates, you
can create a :py:class:`Window` object. The :py:class:`Window` object takes the
same parameters as :py:meth:`Function.over`, and can be passed to the
``over()`` method in-place of the individual parameters.

Here we'll declare a single window, ordered with respect to the sample ``id``,
and call several window functions using that window definition:

.. code-block:: python

    win = Window(order_by=[Sample.id])
    query = Sample.select(
        Sample.counter,
        Sample.value,
        fn.LEAD(Sample.value).over(win),
        fn.LAG(Sample.value).over(win),
        fn.SUM(Sample.value).over(win)
    ).window(win)  # Include our window definition in query.

    for row in query.tuples():
        print(row)

    # counter  value  lead()  lag()  sum()
    # 1        10.     20.    NULL    10.
    # 1        20.      1.     10.    30.
    # 2         1.      3.     20.    31.
    # 2         3.    100.      1.    34.
    # 3       100.    NULL      3.   134.

Multiple window definitions
^^^^^^^^^^^^^^^^^^^^^^^^^^^

In the previous example, we saw how to declare a :py:class:`Window` definition
and re-use it for multiple different aggregations.
You can include as many window definitions as you need in your queries, but it
is necessary to ensure each window has a unique alias:

.. code-block:: python

    w1 = Window(order_by=[Sample.id]).alias('w1')
    w2 = Window(partition_by=[Sample.counter]).alias('w2')
    query = Sample.select(
        Sample.counter,
        Sample.value,
        fn.SUM(Sample.value).over(w1).alias('rsum'),  # Running total.
        fn.AVG(Sample.value).over(w2).alias('cavg')   # Avg per category.
    ).window(w1, w2)  # Include our window definitions.

    for sample in query:
        print(sample.counter, sample.value, sample.rsum, sample.cavg)

    # counter  value  rsum   cavg
    # 1        10.     10.    15.
    # 1        20.     30.    15.
    # 2         1.     31.     2.
    # 2         3.     34.     2.
    # 3       100     134.   100.

Similarly, if you have multiple window definitions that share similar
definitions, it is possible to extend a previously-defined window definition.
For example, here we will be partitioning the data-set by the counter value, so
we'll be doing our aggregations with respect to the counter. Then we'll define
a second window that extends this partitioning, and adds an ordering clause:

.. code-block:: python

    w1 = Window(partition_by=[Sample.counter]).alias('w1')

    # By extending w1, this window definition will also be partitioned
    # by "counter".
    w2 = Window(extends=w1, order_by=[Sample.value.desc()]).alias('w2')

    query = (Sample
             .select(Sample.counter, Sample.value,
                     fn.SUM(Sample.value).over(w1).alias('group_sum'),
                     fn.RANK().over(w2).alias('revrank'))
             .window(w1, w2)
             .order_by(Sample.id))

    for sample in query:
        print(sample.counter, sample.value, sample.group_sum, sample.revrank)

    # counter  value  group_sum  revrank
    # 1        10.     30.       2
    # 1        20.     30.       1
    # 2         1.      4.       2
    # 2         3.      4.       1
    # 3       100.    100.       1

.. _window-frame-types:

Frame types: RANGE vs ROWS vs GROUPS
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Depending on the frame type, the database will process ordered groups
differently. Let's create two additional ``Sample`` rows to visualize the
difference:

.. code-block:: pycon

    >>> Sample.create(counter=1, value=20.)
    <Sample 6>
    >>> Sample.create(counter=2, value=1.)
    <Sample 7>

Our table now contains:

=== ======== ======
id  counter  value
=== ======== ======
1   1        10.0
2   1        20.0
3   2        1.0
4   2        3.0
5   3        100.0
6   1        20.0
7   2        1.0
=== ======== ======

Let's examine the difference by calculating a "running sum" of the samples,
ordered with respect to the ``counter`` and ``value`` fields. To specify the
frame type, we can use either:

* :py:attr:`Window.RANGE`
* :py:attr:`Window.ROWS`
* :py:attr:`Window.GROUPS`

The behavior of :py:attr:`~Window.RANGE`, when there are logical duplicates,
may lead to unexpected results:

.. code-block:: python

    query = Sample.select(
        Sample.counter,
        Sample.value,
        fn.SUM(Sample.value).over(
            order_by=[Sample.counter, Sample.value],
            frame_type=Window.RANGE).alias('rsum'))

    for sample in query.order_by(Sample.counter, Sample.value):
        print(sample.counter, sample.value, sample.rsum)

    # counter  value  rsum
    # 1        10.     10.
    # 1        20.     50.
    # 1        20.     50.
    # 2         1.     52.
    # 2         1.     52.
    # 2         3.     55.
    # 3       100     155.

With the inclusion of the new rows we now have some rows that have duplicate
``counter`` and ``value`` values. The :py:attr:`~Window.RANGE` frame type
causes these duplicates to be evaluated together rather than separately.

The more expected result can be achieved by using :py:attr:`~Window.ROWS` as
the frame-type:

.. code-block:: python

    query = Sample.select(
        Sample.counter,
        Sample.value,
        fn.SUM(Sample.value).over(
            order_by=[Sample.counter, Sample.value],
            frame_type=Window.ROWS).alias('rsum'))

    for sample in query.order_by(Sample.counter, Sample.value):
        print(sample.counter, sample.value, sample.rsum)

    # counter  value  rsum
    # 1        10.     10.
    # 1        20.     30.
    # 1        20.     50.
    # 2         1.     51.
    # 2         1.     52.
    # 2         3.     55.
    # 3       100     155.

Peewee uses these rules for determining what frame-type to use:

* If the user specifies a ``frame_type``, that frame type will be used.
* If ``start`` and/or ``end`` boundaries are specified, Peewee will default to
  using ``ROWS``.
* If the user did not specify frame type or start/end boundaries, Peewee will
  use the database default, which is ``RANGE``.

The :py:attr:`Window.GROUPS` frame type looks at the window range specification
in terms of groups of rows, based on the ordering term(s). Using ``GROUPS``, we
can define the frame so it covers distinct groupings of rows. Let's look at an
example:

.. code-block:: python

    query = (Sample
             .select(Sample.counter, Sample.value,
                     fn.SUM(Sample.value).over(
                        order_by=[Sample.counter, Sample.value],
                        frame_type=Window.GROUPS,
                        start=Window.preceding(1)).alias('gsum'))
             .order_by(Sample.counter, Sample.value))

    for sample in query:
        print(sample.counter, sample.value, sample.gsum)

    # counter  value  gsum
    # 1        10      10
    # 1        20      50
    # 1        20      50    (10) + (20+20)
    # 2         1      42
    # 2         1      42    (20+20) + (1+1)
    # 2         3       5    (1+1) + 3
    # 3       100     103    (3) + 100

As you can hopefully infer, the window is grouped by its ordering term, which
is ``(counter, value)``. We are looking at a window that extends between one
previous group and the current group.

.. note::
    For information about the window function APIs, see:

    * :py:meth:`Function.over`
    * :py:meth:`Function.filter`
    * :py:class:`Window`

    For general information on window functions, read the postgres `window functions tutorial <https://www.postgresql.org/docs/current/tutorial-window.html>`_

    Additionally, the `postgres docs <https://www.postgresql.org/docs/current/sql-select.html#SQL-WINDOW>`_
    and the `sqlite docs <https://www.sqlite.org/windowfunctions.html>`_
    contain a lot of good information.

.. _rowtypes:

Retrieving row tuples / dictionaries / namedtuples
--------------------------------------------------

Sometimes you do not need the overhead of creating model instances and simply
want to iterate over the row data without needing all the APIs provided by
:py:class:`Model`. To do this, use:

* :py:meth:`~BaseQuery.dicts`
* :py:meth:`~BaseQuery.namedtuples`
* :py:meth:`~BaseQuery.tuples`
* :py:meth:`~BaseQuery.objects` -- accepts an arbitrary constructor function
  which is called with the row tuple.

.. code-block:: python

    stats = (Stat
             .select(Stat.url, fn.Count(Stat.url))
             .group_by(Stat.url)
             .tuples())

    # iterate over a list of 2-tuples containing the url and count
    for stat_url, stat_count in stats:
        print(stat_url, stat_count)

Similarly, you can return the rows from the cursor as dictionaries using
:py:meth:`~BaseQuery.dicts`:

.. code-block:: python

    stats = (Stat
             .select(Stat.url, fn.Count(Stat.url).alias('ct'))
             .group_by(Stat.url)
             .dicts())

    # iterate over a list of dictionaries containing the url and count
    for stat in stats:
        print(stat['url'], stat['ct'])

.. _returning-clause:

Returning Clause
----------------

:py:class:`PostgresqlDatabase` supports a ``RETURNING`` clause on ``UPDATE``,
``INSERT`` and ``DELETE`` queries. Specifying a ``RETURNING`` clause allows you
to iterate over the rows accessed by the query.

By default, the return values upon execution of the different queries are:

* ``INSERT`` - auto-incrementing primary key value of the newly-inserted row.
  When not using an auto-incrementing primary key, Postgres will return the new
  row's primary key, but SQLite and MySQL will not.
* ``UPDATE`` - number of rows modified
* ``DELETE`` - number of rows deleted

When a returning clause is used the return value upon executing a query will be
an iterable cursor object.

Postgresql allows you, via the ``RETURNING`` clause, to return data from the
rows inserted or modified by a query.

For example, let's say you have an :py:class:`Update` that deactivates all
user accounts whose registration has expired. After deactivating them, you want
to send each user an email letting them know their account was deactivated.
Rather than writing two queries, a ``SELECT`` and an ``UPDATE``, you can do
this in a single ``UPDATE`` query with a ``RETURNING`` clause:

.. code-block:: python

    query = (User
             .update(is_active=False)
             .where(User.registration_expired == True)
             .returning(User))

    # Send an email to every user that was deactivated.
    for deactivated_user in query.execute():
        send_deactivation_email(deactivated_user.email)

The ``RETURNING`` clause is also available on :py:class:`Insert` and
:py:class:`Delete`. When used with ``INSERT``, the newly-created rows will be
returned. When used with ``DELETE``, the deleted rows will be returned.

The only limitation of the ``RETURNING`` clause is that it can only consist of
columns from tables listed in the query's ``FROM`` clause. To select all
columns from a particular table, you can simply pass in the :py:class:`Model`
class.

As another example, let's add a user and set their creation-date to the
server-generated current timestamp. We'll create and retrieve the new user's
ID, email and creation timestamp in a single query:

.. code-block:: python

    query = (User
             .insert(email='foo@bar.com', created=fn.now())
             .returning(User))  # Shorthand for all columns on User.

    # When using RETURNING, execute() returns a cursor.
    cursor = query.execute()

    # Get the user object we just inserted and log the data:
    user = cursor[0]
    logger.info('Created user %s (id=%s) at %s', user.email, user.id, user.created)

By default the cursor will return :py:class:`Model` instances, but you can
specify a different row type:

.. code-block:: python

    data = [{'username': 'charlie'}, {'username': 'huey'}, {'username': 'mickey'}]
    query = (User
             .insert_many(data)
             .returning(User.id, User.username)
             .dicts())

    for new_user in query.execute():
        print('Added user "%s", id=%s' % (new_user['username'], new_user['id']))

Just as with :py:class:`Select` queries, you can specify various :ref:`result row types <rowtypes>`.

.. _cte:

Common Table Expressions
------------------------

Peewee supports the inclusion of common table expressions (CTEs) in all types
of queries. CTEs may be useful for:

* Factoring out a common subquery.
* Grouping or filtering by a column derived in the CTE's result set.
* Writing recursive queries.

To declare a :py:class:`Select` query for use as a CTE, use the
:py:meth:`~SelectQuery.cte` method, which wraps the query in a :py:class:`CTE`
object.
To indicate that a :py:class:`CTE` should be included as part of a query, use
the :py:meth:`Query.with_cte` method, passing a list of CTE objects.

Simple Example
^^^^^^^^^^^^^^

For an example, let's say we have some data points that consist of a key and a
floating-point value. Let's define our model and populate some test data:

.. code-block:: python

    class Sample(Model):
        key = TextField()
        value = FloatField()

    data = (
        ('a', (1.25, 1.5, 1.75)),
        ('b', (2.1, 2.3, 2.5, 2.7, 2.9)),
        ('c', (3.5, 3.5)))

    # Populate data.
    for key, values in data:
        Sample.insert_many([(key, value) for value in values],
                           fields=[Sample.key, Sample.value]).execute()

Let's use a CTE to calculate, for each distinct key, which values were
above-average for that key.

.. code-block:: python

    # First we'll declare the query that will be used as a CTE. This query
    # simply determines the average value for each key.
    cte = (Sample
           .select(Sample.key, fn.AVG(Sample.value).alias('avg_value'))
           .group_by(Sample.key)
           .cte('key_avgs', columns=('key', 'avg_value')))

    # Now we'll query the sample table, using our CTE to find rows whose value
    # exceeds the average for the given key. We'll calculate how far above the
    # average the given sample's value is, as well.
    query = (Sample
             .select(Sample.key, Sample.value)
             .join(cte, on=(Sample.key == cte.c.key))
             .where(Sample.value > cte.c.avg_value)
             .order_by(Sample.value)
             .with_cte(cte))

We can iterate over the samples returned by the query to see which samples had
above-average values for their given group:

.. code-block:: pycon

    >>> for sample in query:
    ...     print(sample.key, sample.value)

    # 'a', 1.75
    # 'b', 2.7
    # 'b', 2.9

Complex Example
^^^^^^^^^^^^^^^

For a more complete example, let's consider the following query which uses
multiple CTEs to find per-product sales totals in only the top sales regions.
Our model looks like this:

.. code-block:: python

    class Order(Model):
        region = TextField()
        amount = FloatField()
        product = TextField()
        quantity = IntegerField()

Here is how the query might be written in SQL. This example can be found in
the `postgresql documentation <https://www.postgresql.org/docs/current/static/queries-with.html>`_.

.. code-block:: sql

    WITH regional_sales AS (
        SELECT region, SUM(amount) AS total_sales
        FROM orders
        GROUP BY region
    ), top_regions AS (
        SELECT region
        FROM regional_sales
        WHERE total_sales > (SELECT SUM(total_sales) / 10 FROM regional_sales)
    )
    SELECT region,
           product,
           SUM(quantity) AS product_units,
           SUM(amount) AS product_sales
    FROM orders
    WHERE region IN (SELECT region FROM top_regions)
    GROUP BY region, product;

With Peewee, we would write:

.. code-block:: python

    reg_sales = (Order
                 .select(Order.region,
                         fn.SUM(Order.amount).alias('total_sales'))
                 .group_by(Order.region)
                 .cte('regional_sales'))

    top_regions = (reg_sales
                   .select(reg_sales.c.region)
                   .where(reg_sales.c.total_sales > (
                       reg_sales.select(fn.SUM(reg_sales.c.total_sales) / 10)))
                   .cte('top_regions'))

    query = (Order
             .select(Order.region,
                     Order.product,
                     fn.SUM(Order.quantity).alias('product_units'),
                     fn.SUM(Order.amount).alias('product_sales'))
             .where(Order.region.in_(top_regions.select(top_regions.c.region)))
             .group_by(Order.region, Order.product)
             .with_cte(reg_sales, top_regions))

Recursive CTEs
^^^^^^^^^^^^^^

Peewee supports recursive CTEs. Recursive CTEs can be useful when, for example,
you have a tree data-structure represented by a parent-link foreign key.
Suppose, for example, that we have a hierarchy of categories for an online
bookstore. We wish to generate a table showing all categories and their
absolute depths, along with the path from the root to the category.

We'll assume the following model definition, in which each category has a
foreign-key to its immediate parent category:

.. code-block:: python

    class Category(Model):
        name = TextField()
        parent = ForeignKeyField('self', backref='children', null=True)

To list all categories along with their depth and parents, we can use a
recursive CTE:

.. code-block:: python

    # Define the base case of our recursive CTE. This will be categories that
    # have a null parent foreign-key.
    Base = Category.alias()
    level = Value(1).alias('level')
    path = Base.name.alias('path')
    base_case = (Base
                 .select(Base.id, Base.name, Base.parent, level, path)
                 .where(Base.parent.is_null())
                 .cte('base', recursive=True))

    # Define the recursive terms.
    RTerm = Category.alias()
    rlevel = (base_case.c.level + 1).alias('level')
    rpath = base_case.c.path.concat('->').concat(RTerm.name).alias('path')
    recursive = (RTerm
                 .select(RTerm.id, RTerm.name, RTerm.parent, rlevel, rpath)
                 .join(base_case, on=(RTerm.parent == base_case.c.id)))

    # The recursive CTE is created by taking the base case and UNION ALL with
    # the recursive term.
    cte = base_case.union_all(recursive)

    # We will now query from the CTE to get the categories, their levels, and
    # their paths.
    query = (cte
             .select_from(cte.c.name, cte.c.level, cte.c.path)
             .order_by(cte.c.path))

    # We can now iterate over a list of all categories and print their names,
    # absolute levels, and path from root -> category.
    for category in query:
        print(category.name, category.level, category.path)

    # Example output:
    # root, 1, root
    # p1, 2, root->p1
    # c1-1, 3, root->p1->c1-1
    # c1-2, 3, root->p1->c1-2
    # p2, 2, root->p2
    # c2-1, 3, root->p2->c2-1

Foreign Keys and Joins
----------------------

This section has been moved into its own document: :ref:`relationships`.