PostgreSQL is an awesome choice for a SQL database. Out of the box it has so many awesome features; two notable features are its integrated full-text search and its NoSQL, key/value, hstore system. On their own, the two features are powerful, but what if the two could work in concert? What if you could have a hstore and perform full-text search on it? Here we will be looking at using PostgreSQL's full-text search and hstore together.

So let's say we are making an app that manages a listing of products. Our products are books, games, movies, and music. Now there will be some columns shared among each item, such as title and price, but for the most part the number of columns not shared among the different type of items is far greater than the number of shared columns. One solution is to have separate tables for each type of product, or just add every possible column to a single product table. Because PostgreSQL has its hstore system, we can take a third approach - store only columns that are common to all products in traditional database columns and store data specific to a particular type of product in a hstore. So things like name, price, and quantity, can go into database columns, but things like author, director, runtime, and pages can all go into a hstore.

Bellow is a sample of our database table.

SELECT * FROM products;
 id |              title              | price |        properties               
----+---------------------------------+-------+------------------------------
  1 | Ulysses                         | 14.99 | "isbn"=>"1613821174",
    |                                 |       | "pages"=>"612",
    |                                 |       | "author"=>"James Joyce"

  2 | The Importance of Being Earnest |   7.5 | "isbn"=>"1613822189",
    |                                 |       | "pages"=>"156",
    |                                 |       | "author"=>"Oscar Wilde" 

So if we wanted to do full-text search on the products table checking only the title, we could do the following.

SELECT * FROM "products" WHERE (title @@ 'being earnest');
 id |              title              | price |        properties               
----+---------------------------------+-------+------------------------------
  2 | The Importance of Being Earnest |   7.5 | "isbn"=>"1613822189",
    |                                 |       | "pages"=>"156",
    |                                 |       | "author"=>"Oscar Wilde" 

But what if we also checked the properties column.

SELECT "products".* FROM "products" WHERE (title @@ 'importance' or properties @@ 'importance');
ERROR:  operator does not exist: hstore @@ unknown
LINE 1: ...ducts" WHERE (title @@ 'importance' or properties @@ 'import...
                                                             ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.

The properties column is a hstore and stores keys and values as text, but if we try to do full text search on a hstore we get an error similar to the one above. One quick fix is casting the properties column to text, but this also includes the key, not just the values, in the cast. If you want to exclude keys from the cast simply extract the values first, and then cast the values to text.

SELECT "products".* FROM "products" WHERE (title @@ 'oscar' or CAST(avals(properties) AS text) @@ 'oscar');
 id |              title              | price |        properties               
----+---------------------------------+-------+------------------------------
  2 | The Importance of Being Earnest |   7.5 | "isbn"=>"1613822189",
    |                                 |       | "pages"=>"156",
    |                                 |       | "author"=>"Oscar Wilde" 

Comments


TheKidCoder |
Awesome snippet! Just what I needed!

alagram |
You got me out of jail with this one. Thanks!

gvre |
You can also do a full text search based on a specific property's value, for example the name of the author. SELECT * FROM products WHERE to_tsvector('english', properties->'author') @@ to_tsquery('english', 'james');

danwu |
I've found this method to only do a table scan and not hit indexes when it runs, even though there exists a gin index on the hstore field. Have you met this problem? Thanks for the post.