2.5. Querying the database
Document classes have an objects attribute, which is used to
access the documents in the database collection associated with that class. The
objects attribute is actually a
QuerySet object. This lazily queries the database only when you need the
data. It may also be filtered to narrow down your query.
2.5.1. Filtering queries
The query may be filtered by calling the QuerySet object with field lookup
keyword arguments. The keys in the keyword arguments correspond to fields on the
Document you are querying:
# This will return a QuerySet that will only iterate over users whose
# 'country' field is set to 'uk'
uk_users = User.objects(country='uk')
Fields on embedded documents may also be referred to using field lookup syntax by using a double-underscore in place of the dot in object attribute access syntax:
# This will return a QuerySet that will only iterate over pages that have
# been written by a user whose 'country' field is set to 'uk'
uk_pages = Page.objects(author__country='uk')
2.5.2. Query operators
Operators other than equality may also be used in queries — just attach the operator name to a key in a keyword argument with a double-underscore:
# Get all pages that have had more than 5 comments
Page.objects(num_comments__gt=5)
# Get all pages that have been written since 1 January 2010
Page.objects(date__gte=datetime(2010, 1, 1))
# Get all pages where the title is not 'Example Page'
Page.objects(title__ne='Example Page')
# Get all pages whose tags contain 'coding'
Page.objects(tags__in=['coding', 'python'])
# Get all pages that have been tagged
Page.objects(tags__exists=True)
The following operators are available:
ne— not equal tolt— less thanlte— less than or equal togt— greater thangte— greater than or equal tonot— negate a standard check, may be used before other operatorsin— value is in listnin— value is not in listmod— value % x == y, where x and y are two provided valuesall— every item in list of values provided is in arraysize— the size of the array isexists— value for field existselemMatch— provides the ability to match multiple fields in an embedded document within an arraytype— the type of the field is the specified typeregex— value matches the provided regular expressioniexact— a case insensitive version ofexactcontains— field contains the valueicontains— case insensitive containsstartswith— field starts with the valueistartswith— case insensitive startswithendswith— field ends with the valueiendswith— case insensitive endswithwholeword— field contains whole wordiwholeword— case insensitive whole wordmatch— performs an$elemMatchso you can match an entire document within an array
Querying with Q objects
Sometimes calling a QuerySet object with keyword arguments can't fully
express the query you want — for example if you need to combine constraints using
and and or. This is made possible in MongoEngine through the
Q class. A Q object represents part of a query, and can be
initialised using the same keyword-argument syntax. To build a complex query, combine
Q objects using the & (and) and | (or)
operators:
# Get published posts
Post.objects(Q(published=True) | Q(publish_date__lte=datetime.now()))
# Get top posts
Post.objects((Q(featured=True) & Q(hits__gte=1000)) | Q(hits__gte=5000))
2.5.3. Sorting/Ordering results
It is possible to order the results by using
order_by(). The order may be specified by prepending each of the field names
by + or - to indicate ascending or descending order
respectively:
# Order by ascending date
blogs = BlogPost.objects().order_by('date')
# Order by ascending date and descending title
blogs = BlogPost.objects().order_by('+date', '-title')
2.5.4. Limiting and skipping results
Just as with traditional ORMs, you may limit the number of results returned or skip a
number of results in you query.
limit() and skip() methods are available on
QuerySet objects, but the slice operator is preferred:
# Only get 5 posts
blog_posts = BlogPost.objects[:5]
# Get all posts except for the first 5
blog_posts = BlogPost.objects[5:]
# 5 posts after the 10th post
blog_posts = BlogPost.objects[10:15]
You may also retrieve a single result using index or get:
# Retrieve the 5th post
blog_posts = BlogPost.objects[5]
# Retrieve the latest post
latest_post = BlogPost.objects[-1]
Retrieving unique results
To retrieve a result that should be unique in the collection, use
get(). This will raise
DoesNotExist if no document matches the query, and
MultipleObjectsReturned if more than one document matched the query:
try:
page = Page.objects.get(title='Test Page')
except Page.DoesNotExist:
page = Page(title='Test Page')
except Page.MultipleObjectsReturned:
# do something here
pass
2.5.5. Default Document queries
By default, the objects attribute on a document returns a
QuerySet that doesn't filter the collection — it returns all objects. This
may be changed by defining a method on a document that modifies a queryset. The method
should accept two arguments — doc_cls and queryset. The method
should then return a modified queryset. The method should be decorated with
queryset_manager in order for MongoEngine to recognise it:
from mongoengine.queryset import queryset_manager
class BlogPost(Document):
title = StringField()
date = DateTimeField()
@queryset_manager
def objects(doc_cls, queryset):
# This may actually also be done by defining a default ordering for
# the document, but this illustrates the use of manager methods
return queryset.order_by('-date')
2.5.6. Custom QuerySets
Should you want to add custom methods for interacting with or filtering documents, define a custom
QuerySet subclass and indicate it should be used on the document using the
queryset_class meta attribute:
class AwesomerQuerySet(QuerySet):
def get_awesome(self):
return self.filter(awesome=True)
class Page(Document):
meta = {'queryset_class': AwesomerQuerySet}
# To call:
Page.objects.get_awesome()
2.5.7. Aggregation
MongoDB provides some aggregation methods out of the box. MongoEngine provides wrappers around the built-in methods.
Counting results
num_users = User.objects.count()
count() is preferred over
len(User.objects). count() executes a server-side count query,
while len() retrieves all the results and counts them in memory.
Further aggregation
# Sum over a field
yearly_expense = Employee.objects.sum('salary')
# Get the average
mean_age = User.objects.average('age')
# Get a dictionary of item frequencies
tag_frequencies = BlogPost.objects.item_frequencies('tags')
2.5.8. Query efficiency and performance
There are a couple of methods available to improve query efficiency. Using
only() to return only a subset of fields:
# Only load the title
pages = Page.objects.only('title')
# Load all fields EXCEPT the 'tags' field
pages = Page.objects.exclude('tags')
If you are dealing with a large QuerySet, you can use the
no_dereference() method to turn off automatic dereferencing:
post = Post.objects.no_dereference().first()
# author is a DBRef, not a User object
assert isinstance(post.author, DBRef)
2.5.9. Advanced queries
Sometimes calling a QuerySet object with keyword arguments can't fully
express the query you want to use. It is possible to use a raw PyMongo query as a query
parameter by using the __raw__ keyword argument:
Page.objects(__raw__={'tags': 'coding'})
2.5.10. Atomic updates
Documents may be updated atomically by using the
update_one(), update() and
modify() methods on a QuerySet object. There are several update
modifiers that you may use with these methods:
set— set a particular valueset_on_insert— set only if document is being insertedunset— delete a particular value (since MongoDB v1.3+)max— update only if value is biggermin— update only if value is smallerinc— increment a value by a given amountdec— decrement a value by a given amountpush— append a value to a listpush_all— append several values to a listpop— remove the first or last element of a listpull— remove a value from a listpull_all— remove several values from a listadd_to_set— add value to a list only if its not in the list alreadyrename— rename the key name
The syntax for atomic updates is similar to the querying syntax, so to increment the number of comments on a page, you would use:
# Update all pages with num_comments less than 5
Post.objects(num_comments__lt=5).update(inc__num_comments=1)
# Only update one
Post.objects(num_comments__lt=5).update_one(inc__num_comments=1)
2.5.11. Server-side javascript execution
Javascript functions may be written and sent to the server for execution. The result of
this is the return value of the Javascript function. This functionality is accessed through
the exec_js() method on QuerySet objects:
def sum_field(document, field_name, include_negatives=True):
code = """
function(sumField) {
var total = 0.0;
db[collection].find(query).forEach(function(doc) {
var val = doc[sumField];
if (val >= 0.0 || options.includeNegatives) {
total += val;
}
});
return total;
}
"""
options = {'includeNegatives': include_negatives}
return document.objects.exec_js(code, field_name, **options)