2.5. Querying the database

Document classes have an objects attribute, which is used to access the documents in the database collection associated with that class. The objects attribute is actually a QuerySet object. This lazily queries the database only when you need the data. It may also be filtered to narrow down your query.

2.5.1. Filtering queries

The query may be filtered by calling the QuerySet object with field lookup keyword arguments. The keys in the keyword arguments correspond to fields on the Document you are querying:

# This will return a QuerySet that will only iterate over users whose
# 'country' field is set to 'uk'
uk_users = User.objects(country='uk')

Fields on embedded documents may also be referred to using field lookup syntax by using a double-underscore in place of the dot in object attribute access syntax:

# This will return a QuerySet that will only iterate over pages that have
# been written by a user whose 'country' field is set to 'uk'
uk_pages = Page.objects(author__country='uk')

2.5.2. Query operators

Operators other than equality may also be used in queries — just attach the operator name to a key in a keyword argument with a double-underscore:

# Get all pages that have had more than 5 comments
Page.objects(num_comments__gt=5)

# Get all pages that have been written since 1 January 2010
Page.objects(date__gte=datetime(2010, 1, 1))

# Get all pages where the title is not 'Example Page'
Page.objects(title__ne='Example Page')

# Get all pages whose tags contain 'coding'
Page.objects(tags__in=['coding', 'python'])

# Get all pages that have been tagged
Page.objects(tags__exists=True)

The following operators are available:

ne — not equal to
lt — less than
lte — less than or equal to
gt — greater than
gte — greater than or equal to
not — negate a standard check, may be used before other operators
in — value is in list
nin — value is not in list
mod — value % x == y, where x and y are two provided values
all — every item in list of values provided is in array
size — the size of the array is
exists — value for field exists
elemMatch — provides the ability to match multiple fields in an embedded document within an array
type — the type of the field is the specified type
regex — value matches the provided regular expression
iexact — a case insensitive version of exact
contains — field contains the value
icontains — case insensitive contains
startswith — field starts with the value
istartswith — case insensitive startswith
endswith — field ends with the value
iendswith — case insensitive endswith
wholeword — field contains whole word
iwholeword — case insensitive whole word
match — performs an $elemMatch so you can match an entire document within an array

Querying with Q objects

Sometimes calling a QuerySet object with keyword arguments can't fully express the query you want — for example if you need to combine constraints using and and or. This is made possible in MongoEngine through the Q class. A Q object represents part of a query, and can be initialised using the same keyword-argument syntax. To build a complex query, combine Q objects using the & (and) and | (or) operators:

# Get published posts
Post.objects(Q(published=True) | Q(publish_date__lte=datetime.now()))

# Get top posts
Post.objects((Q(featured=True) & Q(hits__gte=1000)) | Q(hits__gte=5000))

2.5.3. Sorting/Ordering results

It is possible to order the results by using order_by(). The order may be specified by prepending each of the field names by + or - to indicate ascending or descending order respectively:

# Order by ascending date
blogs = BlogPost.objects().order_by('date')

# Order by ascending date and descending title
blogs = BlogPost.objects().order_by('+date', '-title')

2.5.4. Limiting and skipping results

Just as with traditional ORMs, you may limit the number of results returned or skip a number of results in you query. limit() and skip() methods are available on QuerySet objects, but the slice operator is preferred:

# Only get 5 posts
blog_posts = BlogPost.objects[:5]

# Get all posts except for the first 5
blog_posts = BlogPost.objects[5:]

# 5 posts after the 10th post
blog_posts = BlogPost.objects[10:15]

You may also retrieve a single result using index or get:

# Retrieve the 5th post
blog_posts = BlogPost.objects[5]

# Retrieve the latest post
latest_post = BlogPost.objects[-1]

Retrieving unique results

To retrieve a result that should be unique in the collection, use get(). This will raise DoesNotExist if no document matches the query, and MultipleObjectsReturned if more than one document matched the query:

try:
    page = Page.objects.get(title='Test Page')
except Page.DoesNotExist:
    page = Page(title='Test Page')
except Page.MultipleObjectsReturned:
    # do something here
    pass

2.5.5. Default Document queries

By default, the objects attribute on a document returns a QuerySet that doesn't filter the collection — it returns all objects. This may be changed by defining a method on a document that modifies a queryset. The method should accept two arguments — doc_cls and queryset. The method should then return a modified queryset. The method should be decorated with queryset_manager in order for MongoEngine to recognise it:

from mongoengine.queryset import queryset_manager

class BlogPost(Document):
    title = StringField()
    date = DateTimeField()

    @queryset_manager
    def objects(doc_cls, queryset):
        # This may actually also be done by defining a default ordering for
        # the document, but this illustrates the use of manager methods
        return queryset.order_by('-date')

2.5.6. Custom QuerySets

Should you want to add custom methods for interacting with or filtering documents, define a custom QuerySet subclass and indicate it should be used on the document using the queryset_class meta attribute:

class AwesomerQuerySet(QuerySet):
    def get_awesome(self):
        return self.filter(awesome=True)

class Page(Document):
    meta = {'queryset_class': AwesomerQuerySet}

# To call:
Page.objects.get_awesome()

2.5.7. Aggregation

MongoDB provides some aggregation methods out of the box. MongoEngine provides wrappers around the built-in methods.

Counting results

num_users = User.objects.count()

Note If you are concerned about performance, count() is preferred over len(User.objects). count() executes a server-side count query, while len() retrieves all the results and counts them in memory.

Further aggregation

# Sum over a field
yearly_expense = Employee.objects.sum('salary')

# Get the average
mean_age = User.objects.average('age')

# Get a dictionary of item frequencies
tag_frequencies = BlogPost.objects.item_frequencies('tags')

2.5.8. Query efficiency and performance

There are a couple of methods available to improve query efficiency. Using only() to return only a subset of fields:

# Only load the title
pages = Page.objects.only('title')

# Load all fields EXCEPT the 'tags' field
pages = Page.objects.exclude('tags')

If you are dealing with a large QuerySet, you can use the no_dereference() method to turn off automatic dereferencing:

post = Post.objects.no_dereference().first()
# author is a DBRef, not a User object
assert isinstance(post.author, DBRef)

2.5.9. Advanced queries

Sometimes calling a QuerySet object with keyword arguments can't fully express the query you want to use. It is possible to use a raw PyMongo query as a query parameter by using the __raw__ keyword argument:

Page.objects(__raw__={'tags': 'coding'})

2.5.10. Atomic updates

Documents may be updated atomically by using the update_one(), update() and modify() methods on a QuerySet object. There are several update modifiers that you may use with these methods:

set — set a particular value
set_on_insert — set only if document is being inserted
unset — delete a particular value (since MongoDB v1.3+)
max — update only if value is bigger
min — update only if value is smaller
inc — increment a value by a given amount
dec — decrement a value by a given amount
push — append a value to a list
push_all — append several values to a list
pop — remove the first or last element of a list
pull — remove a value from a list
pull_all — remove several values from a list
add_to_set — add value to a list only if its not in the list already
rename — rename the key name

The syntax for atomic updates is similar to the querying syntax, so to increment the number of comments on a page, you would use:

# Update all pages with num_comments less than 5
Post.objects(num_comments__lt=5).update(inc__num_comments=1)

# Only update one
Post.objects(num_comments__lt=5).update_one(inc__num_comments=1)

2.5.11. Server-side javascript execution

Javascript functions may be written and sent to the server for execution. The result of this is the return value of the Javascript function. This functionality is accessed through the exec_js() method on QuerySet objects:

def sum_field(document, field_name, include_negatives=True):
    code = """
    function(sumField) {
        var total = 0.0;
        db[collection].find(query).forEach(function(doc) {
            var val = doc[sumField];
            if (val >= 0.0 || options.includeNegatives) {
                total += val;
            }
        });
        return total;
    }
    """
    options = {'includeNegatives': include_negatives}
    return document.objects.exec_js(code, field_name, **options)

← Previous Next →