June 27, 2013

Adding Customized Fields to the Apache Solr index in Drupal

Written by Mike Bopp @mwbopp

There are great resources to get you started using Solr with Drupal, and in most cases the default Solr configuration that the Drupal module provides is more than enough for a full-featured, robust search. Sometimes, though, extra information needs to be added to the Solr index. One reason is searchable data that is not directly part of a defined Drupal field. Another, which I have personally encountered, is wanting to display custom information in the search results. In my case I wanted to display a thumbnail associated with a node.

Solr is somewhat isolated from the rest of the Drupal content infrastructure. You can sometimes get around this by loading the node for each item returned and pulling the needed information from it at that point. The big problem is overhead: it requires querying the database for every result. Avoid this wherever you can, or you'll lose the performance benefits of the Solr search solution. Instead, it's best to display only information that is stored in the Solr index itself. That means we need to get our thumbnail's URI into the index.

This takes two hook implementations in a custom Drupal module.

The first constructs the data to store in Solr: hook_apachesolr_index_document_build_ENTITY_TYPE(), where ENTITY_TYPE is the entity type being indexed (node in this example).

/**
 * Implements hook_apachesolr_index_document_build_ENTITY_TYPE() for nodes.
 *
 * Builds the document before it is sent to Solr. This is the follow-up for
 * apachesolr_update_index, but for a specific entity type.
 *
 * @param ApacheSolrDocument $document
 *   The Solr document being built for this entity.
 * @param object $entity
 *   The entity being indexed; here, the node.
 * @param string $env_id
 *   The ID of the Solr environment being indexed into.
 */
function custom_apachesolr_index_document_build_node(ApacheSolrDocument $document, $entity, $env_id) {
  // Only act when the node has at least one thumbnail image.
  if (isset($entity->field_article_thumb[LANGUAGE_NONE][0]['uri'])) {
    foreach ($entity->field_article_thumb[LANGUAGE_NONE] as $image) {
      // Add each image URI as one value of the multi-valued field.
      $document->setMultiValue('sm_field_thumb', $image['uri']);
    }
  }
}

Here you need to follow the conventions already in place for Drupal's interaction with Solr. In the example above we hook into node indexing, so the entity passed into the function is the node object. The goal of the function is to alter the $document, adding the data you want to store in the index. I named my field sm_field_thumb to follow the naming conventions already used in the Solr schema: the sm_ prefix marks a multi-valued string field. (You can see what is stored in the index by visiting the Solr instance itself and browsing the schema.)
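To make the naming convention concrete, here is a minimal sketch of how such a field name is put together. The helper custom_solr_field_name() is hypothetical (it is not part of the apachesolr module), and the prefix table is an assumption based on the dynamic fields in the schema.xml that ships with the module; only a few common types are listed.

```php
<?php
/**
 * Hypothetical helper: builds a dynamic field name from a value type,
 * a cardinality flag and a base name.
 *
 * Assumed convention: first letter = type, second letter = s (single)
 * or m (multiple), followed by an underscore and the base name.
 */
function custom_solr_field_name($type, $multiple, $name) {
  // Partial list for illustration; the real schema defines more types.
  $type_prefixes = array(
    'string' => 's',
    'integer' => 'i',
    'boolean' => 'b',
    'text' => 't',
  );
  if (!isset($type_prefixes[$type])) {
    throw new InvalidArgumentException("Unsupported type: $type");
  }
  return $type_prefixes[$type] . ($multiple ? 'm' : 's') . '_' . $name;
}

// A multi-valued string field, as used for the thumbnail URIs.
echo custom_solr_field_name('string', TRUE, 'field_thumb'), "\n";
```

Under these assumptions the call prints sm_field_thumb. A single-valued string would get the ss_ prefix instead, in which case the indexing hook should set one value rather than adding several.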

Next, we need to implement hook_apachesolr_query_prepare().

/**
 * Prepare the query by adding parameters, sorts, etc.
 *
 * This hook is invoked before the query is cached. The cached query is used
 * after the search such as for building facet and sort blocks, so parameters
 * added during this hook may be visible to end users.
 *
 * This is otherwise the same as HOOK_apachesolr_query_alter(), but runs before
 * it.
 *
 * @param object $query
 *  An object implementing DrupalSolrQueryInterface. No need for &.
 */
function custom_apachesolr_query_prepare(DrupalSolrQueryInterface $query) {
  $query->addParam('fl', 'sm_field_thumb');
}

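Adding the field name to Solr's fl (field list) parameter tells Solr to return sm_field_thumb with each result, so Drupal has the value available when it renders the results. Everything else about the search can still be configured as usual through the Drupal apachesolr settings. You can also add sorts and other parameters in this hook.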

And that's it. Clear and rebuild your index from the Drupal Solr admin pages, and your new information should be part of the index. You can then use the provided theme functionality for search results to add your newly stored info to the results page itself.
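As a rough illustration of that last step, the sketch below exposes the stored URIs to the search result template through a preprocess function. It assumes the returned Solr fields are available under $variables['result']['fields'], which is worth verifying against your version of the module, and the thumb_urls variable name is made up for this example.

```php
<?php
/**
 * Sketch of hook_preprocess_search_result() for the "custom" module.
 *
 * Assumption: each result carries the returned Solr fields under
 * $variables['result']['fields'].
 */
function custom_preprocess_search_result(&$variables) {
  // Hypothetical template variable holding the thumbnails for this result.
  $variables['thumb_urls'] = array();
  if (empty($variables['result']['fields']['sm_field_thumb'])) {
    return;
  }
  // Multi-valued fields normally arrive as arrays; the cast covers a
  // lone string value as well.
  foreach ((array) $variables['result']['fields']['sm_field_thumb'] as $uri) {
    // Inside Drupal, turn the stream URI (e.g. public://...) into a URL
    // for the stock "thumbnail" image style; otherwise keep the raw URI.
    $variables['thumb_urls'][] = function_exists('image_style_url')
      ? image_style_url('thumbnail', $uri)
      : $uri;
  }
}
```

A copy of search-result.tpl.php in your theme could then loop over $thumb_urls and print an img tag for each entry, without loading the node.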
