Effective Django ORM¶§
Writing Models¶§
from django.db import models


class Address(models.Model):
    address = models.CharField(max_length=255, blank=True)
    city = models.CharField(max_length=150, blank=True)
    state = models.CharField(max_length=2, blank=True)
    zip = models.CharField(max_length=15, blank=True)


class Contact(models.Model):
    first_name = models.CharField(max_length=255, blank=True)
    last_name = models.CharField(max_length=255, blank=True)
    # A birthdate is user-supplied; auto_now_add would stamp the creation date.
    birthdate = models.DateField(null=True, blank=True)
    phone = models.CharField(max_length=25, blank=True)
    email = models.EmailField(blank=True)
    # on_delete is required in Django 2.0+; SET_NULL pairs with null=True.
    address = models.ForeignKey(Address, null=True, on_delete=models.SET_NULL)
Working with Models¶§
nathan = Contact()
nathan.first_name = 'Nathan'
nathan.last_name = 'Yergler'
nathan.save()
What Goes in Models¶§
- Models should encapsulate business logic
- Encourages testable, composable code
- If logic operates on a “set” of Models, put it in the Manager
Saving Data¶§
- By default, .save() writes every field back to the database, so a stale instance can silently overwrite changes made elsewhere
- Starting in Django 1.5, .save(update_fields=[...]) restricts the UPDATE to the fields you name
- django-dirtyfields lets you track which fields have changed, which is useful with update_fields or if you’re stuck on an older version of Django (it does not change .save() behavior on its own)
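The risk of writing every column back can be sketched outside Django. The example below is a minimal illustration in plain sqlite3, not Django's implementation; the table layout and the helper names (`load`, `save_all`, `save_fields`) are assumptions made up for this sketch. Two copies of the same row are loaded, each changes a different column, and the two save strategies are compared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contact (id INTEGER PRIMARY KEY, first_name TEXT, phone TEXT)"
)
conn.execute("INSERT INTO contact VALUES (1, 'Nathan', '555-0100')")


def load(conn, pk):
    """Read one row into a plain dict, loosely like loading a model instance."""
    row = conn.execute(
        "SELECT id, first_name, phone FROM contact WHERE id = ?", (pk,)
    ).fetchone()
    return {"id": row[0], "first_name": row[1], "phone": row[2]}


def save_all(conn, obj):
    """Write every column back, whether or not it changed."""
    conn.execute(
        "UPDATE contact SET first_name = ?, phone = ? WHERE id = ?",
        (obj["first_name"], obj["phone"], obj["id"]),
    )


def save_fields(conn, obj, fields):
    """Write only the named columns. Column names cannot be bound as
    parameters, so they must come from trusted code, never from user input."""
    assignments = ", ".join("{} = ?".format(name) for name in fields)
    values = [obj[name] for name in fields] + [obj["id"]]
    conn.execute("UPDATE contact SET {} WHERE id = ?".format(assignments), values)


# Two copies of the same row are loaded before either one is saved.
a, b = load(conn, 1), load(conn, 1)
a["phone"] = "555-0199"
b["first_name"] = "Nate"

save_all(conn, a)
save_all(conn, b)               # b still carries the old phone number
after_full = load(conn, 1)      # the phone change made through `a` is lost

save_fields(conn, a, ["phone"])
save_fields(conn, b, ["first_name"])
after_partial = load(conn, 1)   # both changes survive
```

Limiting the UPDATE to the changed columns does not remove every race, but it narrows what a stale copy can clobber.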
Managers¶§
- Models get a manager injected as .objects
- Managers allow you to operate over collections of your model
- The default manager emulates part of the QuerySet API for convenience
Contact.objects.filter(last_name__iexact='yergler')
Contact.objects.filter(address__state='OH')
Custom Managers¶§
- You can override the default Manager, or add additional ones
- Operations on sets of Model instances belong here
- Subclass from models.Manager to get queryset emulation
class ContactManager(models.Manager):
    def with_email(self):
        # Django has no "__ne" lookup; exclude() expresses "not equal".
        return self.exclude(email='')


class Contact(models.Model):
    ...
    objects = ContactManager()


Contact.objects.with_email().filter(email__endswith='osu.edu')
Testing¶§
Writing a Test¶§
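from django.test import TestCase


class ContactManagerTests(TestCase):
    def test_with_email(self):
        # make a couple of Contacts; only the first has an email address
        Contact.objects.create(first_name='Nathan', email='nathan@example.com')
        Contact.objects.create(email='')
        self.assertEqual(
            len(Contact.objects.with_email()), 1
        )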
Test Objects¶§
- Creating objects for tests is time-consuming
- It often involves the database unnecessarily
- factory_boy provides an easy way to make model factories
FactoryBoy Example¶§
import factory
from models import Contact


class ContactFactory(factory.django.DjangoModelFactory):
    class Meta:
        # factory_boy 2.4+ replaced the FACTORY_FOR attribute with Meta.model
        model = Contact

    first_name = 'John'
    last_name = 'Doe'


# Returns a Contact instance that's not saved
contact = ContactFactory.build()
contact = ContactFactory.build(last_name='Yergler')

# Returns a saved Contact instance
contact = ContactFactory.create()
Querying Your Data¶§
- QuerySets are chainable
Contact.objects.filter(address__state='OH').exclude(email='')
- Multiple filters are collapsed into SQL “AND” conditions
ORM Performance¶§
Instantiation is Expensive¶§
for user in Users.objects.filter(is_active=True):
    send_email(user.email)
- QuerySets are lazy, but have non-trivial overhead when evaluated
- If a query returns 1000s of rows, users will notice this
- .values() and .values_list() avoid model instantiation
Avoiding Instantiation¶§
user_emails = (
    Users.objects
    .filter(is_active=True)
    .values_list('email', flat=True)
)
for email in user_emails:
    send_email(email)
Traversing Relationships¶§
- Traversing foreign keys can incur additional queries
- select_related follows foreign keys in the initial query (via a SQL JOIN), so traversing them later needs no extra queries
contacts = (
    Contact.objects
    .select_related('address')
    .filter(last_name='Yergler')
)
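The cost of traversing a relationship one row at a time can be sketched with plain sqlite3. This is an illustration of the query pattern only, not of Django's internals; the schema and the function names `states_lazy` and `states_joined` are assumptions for this sketch. A trace callback records every statement so the two approaches can be compared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE address (id INTEGER PRIMARY KEY, state TEXT);
    CREATE TABLE contact (
        id INTEGER PRIMARY KEY,
        last_name TEXT,
        address_id INTEGER REFERENCES address(id)
    );
    INSERT INTO address VALUES (1, 'OH'), (2, 'CA'), (3, 'NY');
    INSERT INTO contact VALUES
        (1, 'Yergler', 1), (2, 'Yergler', 2), (3, 'Yergler', 3);
""")

statements = []
conn.set_trace_callback(statements.append)  # record each SQL statement run


def states_lazy(conn):
    """One query for the contacts, then one more per contact for its address."""
    states = []
    contacts = conn.execute(
        "SELECT id, address_id FROM contact WHERE last_name = ? ORDER BY id",
        ("Yergler",),
    ).fetchall()
    for _contact_id, address_id in contacts:
        row = conn.execute(
            "SELECT state FROM address WHERE id = ?", (address_id,)
        ).fetchone()
        states.append(row[0])
    return states


def states_joined(conn):
    """A single JOIN fetches the contacts and their addresses together."""
    rows = conn.execute(
        "SELECT a.state FROM contact c "
        "JOIN address a ON a.id = c.address_id "
        "WHERE c.last_name = ? ORDER BY c.id",
        ("Yergler",),
    ).fetchall()
    return [state for (state,) in rows]


lazy_states = states_lazy(conn)
lazy_queries = len(statements)

statements.clear()
joined_states = states_joined(conn)
joined_queries = len(statements)
```

With three matching contacts the row-at-a-time version issues 1 + 3 statements, while the joined version issues one; the gap grows linearly with the number of rows.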
Query Performance¶§
- QuerySets maintain state in memory
- Chaining triggers cloning, duplicating that state
- Unfortunately, QuerySets maintain a lot of state
- If possible, don’t chain more than one filter
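The cloning cost described above can be pictured with a toy class. `ToyQuerySet` is a hypothetical stand-in written for this sketch; it is not Django's QuerySet and carries far less state, but it follows the same shape: every `filter()` copies the object, nothing is built until the SQL is requested, and multiple conditions are joined with AND.

```python
import copy


class ToyQuerySet:
    """Toy stand-in for a lazy, chainable query object (not Django's class)."""

    clones_made = 0  # class-wide counter, for illustration only

    def __init__(self, table):
        self.table = table
        self.conditions = []  # state that is copied on every chain step

    def _clone(self):
        ToyQuerySet.clones_made += 1
        return copy.deepcopy(self)

    def filter(self, **conditions):
        clone = self._clone()  # the original object is never mutated
        clone.conditions.extend(sorted(conditions.items()))
        return clone

    def sql(self):
        """Build the SQL text; nothing is built until this is called."""
        if not self.conditions:
            return "SELECT * FROM {}".format(self.table)
        where = " AND ".join("{} = ?".format(col) for col, _ in self.conditions)
        return "SELECT * FROM {} WHERE {}".format(self.table, where)


base = ToyQuerySet("contact")

chained = base.filter(last_name="Yergler").filter(first_name="Nathan")
clones_for_chained = ToyQuerySet.clones_made    # one clone per filter() call

ToyQuerySet.clones_made = 0
combined = base.filter(last_name="Yergler", first_name="Nathan")
clones_for_combined = ToyQuerySet.clones_made   # a single clone
```

Passing both conditions to one `filter()` call produces an equivalent WHERE clause with half the copying, which is the motivation for the advice not to chain more than necessary.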
Falling Back to Raw SQL¶§
- Django has to be database-agnostic; you don’t
- Sometimes the clearest thing to do is write a SQL statement
- The .raw() method lets you do this
Contact.objects.raw('SELECT * FROM contacts WHERE last_name = %s', [lname])
- The query must retrieve the primary key
- Omitted fields will be “deferred” and loaded on access
- DO NOT use string formatting in raw() calls; pass values through the params argument instead
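Why string formatting is dangerous can be shown with plain sqlite3. This is a sketch of the general problem rather than of raw() itself; the table contents and the hostile input are made up for illustration. Note that sqlite3 uses `?` placeholders, while Django's raw() takes `%s` placeholders with a separate params list, as in the call above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (id INTEGER PRIMARY KEY, last_name TEXT);
    INSERT INTO contacts (last_name) VALUES ('Yergler'), ('Doe');
""")

lname = "nobody' OR '1'='1"  # hostile input

# Unsafe: the input is pasted into the SQL text and changes its meaning.
unsafe_sql = "SELECT id, last_name FROM contacts WHERE last_name = '%s'" % lname
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the driver binds the value, so it is only ever compared as a string.
safe_rows = conn.execute(
    "SELECT id, last_name FROM contacts WHERE last_name = ?", (lname,)
).fetchall()
```

The formatted query matches every row in the table, while the parameterized query matches none, because no contact has that literal last name.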
Other Manager Operations¶§
Managers have some additional helpers for operating on the table or collection:
- get_or_create
- update
- delete
- bulk_create
Read Repeatable¶§
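MySQL’s default transaction isolation level for InnoDB, REPEATABLE READ, breaks Django’s get_or_create
when running at scale. The final get() in the code below reads from the transaction’s original snapshot, so it cannot see a row that a concurrent transaction has just inserted.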
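from django.db import IntegrityError


# Simplified sketch of Django's QuerySet.get_or_create; savepoint handling
# and field-name normalization are omitted.
def get_or_create(self, **kwargs):
    defaults = kwargs.pop('defaults', {})
    lookup = kwargs.copy()
    try:
        return self.get(**lookup), False
    except self.model.DoesNotExist:
        try:
            params = dict((k, v) for k, v in kwargs.items() if '__' not in k)
            params.update(defaults)
            obj = self.model(**params)
            obj.save(force_insert=True, using=self.db)
            return obj, True
        except IntegrityError as e:
            # Another transaction inserted the row first. Under
            # REPEATABLE READ this second get() can still miss it.
            try:
                return self.get(**lookup), False
            except self.model.DoesNotExist:
                raise e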