This post will show some more advanced parts of the Django ORM and the queries you can do with it. There are multiple ways to achieve the same query result in Django, but this post will focus on the ways done through the ORM and the python functions.
Raw SQL
For completeness, I quickly mention the way to execute raw SQL statements. In my opinion, this way should be used as a last resort, if no other way works because you loose a lot of benefits like type conversion and bound checks if you completly skip the ORM and access the database directly. If you still want to do it, you just replace the filter
- or all
-method with the raw
-method on the objects
-member and pass your query as a parameter:
items = Item.objects.raw("SELECT id, name FROM app_items")
for item in items:
print(item) # Gives you "1 First Task"
Getting started
All the following examples work with a table called todo
which looks like this:
id | title | done | created_at |
---|---|---|---|
1 | Do dishes | false | 2023-02-01 |
2 | Clean house | true | 2023-04-05 |
GROUP BY
To achieve a GROUP BY
through the ORM, you can use the values
-method on the objects
-attribute. If I want to group by done
I would write:
Todo.objects.all().values('done')
This alone doesn’t make much sense because we’re not using the GROUP BY, but we could extend this query to show us how many items are done and how many are still open:
Todo.objects.all().values('done').annotate(cnt=Count('done'))
Aggregations
If you want to calculate a value over multiple (or all) rows, for example the minimum or maximum value, you can use an aggregate. If we want to find the item that was created first and the one that was created last, we can use an aggregation and find the minimum and maximum value of the created_at
column:
from django.db.models import Min, Max
Todo.objects.all().aggregate(Min('created_at'), Max('created_at'))
# This will give us { 'created_at__min': '2023-02-01', 'created_at__max': '2023-04-05' }
Annotations
An annotation allows you to add new fields on the fly to each row that you retrieve. An example would be if we want a column showing how old a Todo-Item is. We can calculate that based on the created_at
-column, which would look like this:
from django.db.models.functions import ExtractDay
from django.db.models import F
from django.utils import timezone
Todo.objects.all().annotate(age_in_days=ExtractDay(timezone.now().date() - F("created_at")))
# This would give us a new field `age_in_days` showing how many days
# are between today and the creation date. We need to use ExtractDay
# to get only the amount of days between the 2 dates supplied.
Subselect
Subselects are relatively easy to implement in Django’s ORM. 2 types are required for it: OuterRef
and Subquery
. OuterRef
allows you
to reference a field in the parent query from the child-query. Subquery
puts your child-query into the parent-query. Let’s see an example.
First we define a Subselect that needs a field called todays_date
of the parent query as reference:
sub = Todo.objects.filter(created_at=OuterRef('todays_date'))
Now we define a parent query that uses values of the child query:
Todo.objects.annotate(todays_date=timezone.now().date()).annotate(title=Subquery(sub.values('title')))
The parent query has a field called todays_date
and annotates it with the field title
from the subselect. The result will be
a list of items containing todays date as well as the title of a todo which has it’s creation date today.