For a good while this morning, I was banging my head over some fairly simple code. The culprit, in the end, was Haystack's Simple Engine. Let me explain.
For those who don't know, Haystack is an application written in Python that can be added to any Django app to create search indexes in a clean way that requires minimal code on your part and zero refactoring. It's really as simple as implementing a single class.
Like Django's ORM, it can use multiple backends such as Whoosh and Solr, as well as the Simple Engine I mentioned before. Furthermore, all the complex query goodies you're used to with Django's ORM, including a version of Q objects, are available for queries ranging from simple to advanced (with a couple of caveats, one of which I will mention later on).
Simple Engine... Not so simple after all...
On to this Simple Engine, which is the point of this post. I had a simple index set up as per the dev docs from haystacksearch.org. (You'll find a wonderful Read the Docs-style documentation site there.)
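    from haystack import indexes

    from myapp.models import MyModel  # adjust to wherever MyModel lives


    class MyModelIndex(indexes.SearchIndex, indexes.Indexable):
        text = indexes.CharField(document=True, use_template=True, model_attr="field_1")
        title = indexes.CharField(model_attr='title_field')
        multi = indexes.MultiValueField()

        def get_model(self):
            return MyModel

        def prepare_multi(self, obj):
            # Follow the foreign key, then collect the related M2M primary keys.
            return [p.pk for p in obj.multi.related_m2m.all()]

        def index_queryset(self, using=None):
            # Haystack 2.x passes the connection alias as "using".
            return self.get_model().objects.all()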
To complete this picture, let's assume my models.py file contains the following.
    from django.db import models

    MAX_LENGTH = 255


    class RelatedM2M(models.Model):
        name = models.CharField(max_length=MAX_LENGTH, primary_key=True)


    class RelatedModel(models.Model):
        name = models.CharField(max_length=MAX_LENGTH, primary_key=True)
        related_m2m = models.ManyToManyField(RelatedM2M, blank=True, null=True)


    class MyModel(models.Model):
        field_1 = models.TextField()
        title_field = models.CharField(max_length=MAX_LENGTH)
        # on_delete is required on Django 2.0+; CASCADE was the old implicit default.
        multi = models.ForeignKey(RelatedModel, on_delete=models.CASCADE)
Now that we have the groundwork laid out, we will concentrate on what Simple Engine doesn't like about this. This engine doesn't like anything that isn't directly in the model it's concerned with. Thus, it makes sense that it doesn't like us accessing data through the foreign key in "MyModel".
So what was the fix? For me, it was switching to Solr. Since we were planning on using Solr in the production environment anyway, this move made a lot of sense. Running the same software in development not only lets me test the rest of my code against Solr, it also spares me any ugly hacks I might otherwise have needed to get related field information loading into my indexes.
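As a rough sketch of what that switch looks like in settings.py: the engine paths below are the ones Haystack 2.x ships with, while the Solr URL is only an assumption (a local Solr on its usual port) and will differ per setup.

```python
# settings.py (sketch): point Haystack at Solr instead of the Simple Engine.
# The Simple Engine would be "haystack.backends.simple_backend.SimpleEngine".
HAYSTACK_CONNECTIONS = {
    "default": {
        "ENGINE": "haystack.backends.solr_backend.SolrEngine",
        # Assumed URL of a local Solr instance; replace with your own.
        "URL": "http://127.0.0.1:8983/solr",
    },
}
```

Keeping development and production on the same engine means only the URL should need to change between the two.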
Now, for a caveat
Remember me saying there were caveats on filtering? Let's look at that prepare function from the first class again.
    def prepare_multi(self, obj):
        return [p.pk for p in obj.multi.related_m2m.all()]
This is going to cause issues. At least it did for me. From what I've been able to test, Solr does not like empty values, and this function returns an empty list whenever a RelatedModel has no related objects. There are two solutions for this. The first is to change the model to enforce null=False and blank=False, and then introduce a fixture or a data migration that adds a "Default" RelatedM2M object to the database. Not the prettiest solution, but it is by far the most robust and straightforward for whoever is using your system.
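A minimal sketch of the data-migration half of that first option, assuming Django's built-in migrations framework. The app label "myapp" and the function name are hypothetical; only the "Default" name comes from the description above.

```python
DEFAULT_NAME = "Default"


def add_default_related_m2m(apps, schema_editor):
    """Create the placeholder RelatedM2M row if it does not exist yet."""
    # "myapp" is a hypothetical app label; use the app that defines RelatedM2M.
    RelatedM2M = apps.get_model("myapp", "RelatedM2M")
    # get_or_create keeps the migration safe to run more than once.
    RelatedM2M.objects.get_or_create(name=DEFAULT_NAME)


# Inside the migration file this would be wired up roughly as:
#     operations = [
#         migrations.RunPython(add_default_related_m2m, migrations.RunPython.noop),
#     ]
```

This only creates the placeholder object. Any existing RelatedModel rows with nothing in related_m2m would still need to be pointed at it before the field can be treated as required.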
You can also add an implicit default to the index's prepare_multi function itself. This would look something like this.
    def prepare_multi(self, obj):
        if obj.multi.related_m2m.count():
            return [p.pk for p in obj.multi.related_m2m.all()]
        return [u"Default"]
Just make sure to document this very clearly in help strings. Again, the way I mentioned first, creating an explicit object, is better.
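If you do go with the implicit default, one possible variation is to pull the fallback into a small helper so the related objects are only fetched once and the rule can be tested without a search backend. The helper name and the DEFAULT_LABEL constant are my own invention for illustration, not part of Haystack.

```python
DEFAULT_LABEL = "Default"  # hypothetical placeholder, same value as in the post


def pks_or_default(pks, default=DEFAULT_LABEL):
    """Return the primary keys as a list, or [default] when there are none."""
    values = list(pks)  # consume the iterable exactly once
    return values if values else [default]


# Inside the index class, prepare_multi could then be written as:
#     def prepare_multi(self, obj):
#         return pks_or_default(p.pk for p in obj.multi.related_m2m.all())
```

With this in place, `pks_or_default([])` gives `["Default"]`, while `pks_or_default(["a", "b"])` gives back `["a", "b"]` unchanged.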
Hopefully this saves some people headaches. If you do plan to use Solr, don't forget to run the build_solr_schema management command (python manage.py build_solr_schema) to generate your schema.xml!