Sunday, September 9, 2012

Free Full-Text Search with Heroku, Part 2

   Last time, we looked at putting a Ruby on Rails app on Heroku, using the pg_search gem to access PostgreSQL's built-in full-text search by making pg_search_scopes, and how to work around the bogon that you must always feed them something to search on.

   The second bogon, subject of their very first issue filed on Github, is that you can't chain two or more of them in the same statement.  For instance, if my Job model has:

  pg_search_scope :has_description, against: :description
  pg_search_scope :has_title, against: :title

then I can't ask for:

   Job.has_description("Ruby").has_title("developer")

   What happens if I do?  It barfs and gives a cryptic error message about pg_search_rank being ambiguous.  Long story short, that means it's calculating pg_search_rank in two different contexts (one for description, and one for title), and then trying to refer to pg_search_rank, without saying which context to get it from.

   Looking at the generated SQL, we see that it's trying to use it in an ORDER BY clause, specifically:

    ORDER BY pg_search_rank DESC, "jobs"."id" ASC

Luckily, that might not be the order we want.  Sure, sometimes you do want it that way... but in my particular application (a job board), there are other perfectly relevant orders to default to, like posting date, or distance from the center of a search (if the user did a distance-limited search), or various other columns I haven't mentioned before.  So let's try to replace it, and do:

    Job.has_description("Ruby").has_title("developer").
        order(:posted_at)

   Close but no cigar.  Now we have:

    ORDER BY pg_search_rank DESC, "jobs"."id" ASC,
    posted_at DESC

   It seems that order only adds to the ordering, so it can't help us get rid of referring to pg_search_rank.

   BUT... ActiveRelation also provides the lesser-known reorder!  So if we do:

    Job.has_description("Ruby").has_title("developer").
        reorder(:posted_at)

our ORDER BY clause is now simply:

    ORDER BY posted_at DESC

   Long story short, this works.  Later, maybe I'll look into how to actually use the pg_search_rank....