
You're reading from  Apache Solr PHP Integration

Product type: Book
Published in: Nov 2013
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781782164920
Edition: 1st Edition
Author: Jayant Kumar

Jayant Kumar is an experienced software professional with a bachelor of engineering degree in computer science and more than 14 years of experience in architecting and developing large-scale web applications. Jayant is an expert on search technologies and PHP and has been working with Lucene and Solr for more than 11 years now. He is the key person responsible for introducing Lucene as a search engine on www.naukri.com, the most successful job portal in India. Jayant is also the author of the book Apache Solr PHP Integration, Packt Publishing, which has been very successful. Jayant has played many different important roles throughout his career, including software developer, team leader, project manager, and architect, but his primary focus has been on building scalable solutions on the Web. Currently, he is associated with the digital division of HT Media as the chief architect responsible for the job site www.shine.com. Jayant is an avid blogger and his blog can be visited at http://jayant7k.blogspot.in. His LinkedIn profile is available at http://www.linkedin.com/in/jayantkumar.

Chapter 4. Advanced Queries – Filter Queries and Faceting

This chapter starts by defining filter queries and their benefits compared to the normal search queries that we have used earlier. We will see how we can use filter queries in Solr with PHP and the Solarium library. We will then explore faceting in Solr. We will also see how PHP can be used to facet in Solr. We will explore faceting by field, faceting by query, and faceting by range. We will also look at faceting by using pivots. The topics that will be covered are as follows:

  • Filter queries and their benefits

  • Executing filter queries using PHP and Solarium

  • Creating a filter query configuration

  • Faceting

  • Faceting by field, query, and range

  • Faceting pivots

Filter queries and their benefits


Filter queries are used to put a filter on the results from a Solr query without affecting the score. Suppose we are looking for all books that are in stock. The related query will be q=cat:book AND inStock:true.

http://localhost:8080/solr/collection1/select/?q=cat:book%20AND%20inStock:true&fl=id,name,price,author,score,inStock&rows=50&defType=edismax

Another way to handle the same query is by using filter queries. The query will change to q=cat:book&fq=inStock:true.

http://localhost:8080/solr/collection1/select/?q=cat:book&fl=id,name,price,author,score,inStock&rows=50&fq=inStock:true&defType=edismax

Though the results are the same, there are certain benefits of using filter queries. A filter query is not scored, and Solr caches the set of matching document IDs for each filter, so the filter can be applied very quickly to include or exclude documents in later queries. A normal query, on the other hand, runs a more complex scoring function, which reduces performance. Scoring or relevance calculation...
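The two request styles can be compared side by side with a small sketch in plain PHP. It only builds the request URLs and does not contact Solr. The endpoint and parameters are copied from the example URLs above, and the variable names are illustrative. Note that http_build_query() percent-encodes the colons and commas, which Solr decodes as usual.

```php
<?php
// Base endpoint copied from the example URLs above; point it at your own Solr core.
$base = 'http://localhost:8080/solr/collection1/select/';

// Parameters shared by both variants.
$common = array(
    'fl'      => 'id,name,price,author,score,inStock',
    'rows'    => 50,
    'defType' => 'edismax',
);

// Variant 1: the stock restriction is part of the scored main query.
$scored = array('q' => 'cat:book AND inStock:true') + $common;

// Variant 2: the restriction moves into fq, so it filters without scoring.
$filtered = array('q' => 'cat:book', 'fq' => 'inStock:true') + $common;

// PHP_QUERY_RFC3986 encodes spaces as %20 instead of +.
$scoredUrl   = $base . '?' . http_build_query($scored, '', '&', PHP_QUERY_RFC3986);
$filteredUrl = $base . '?' . http_build_query($filtered, '', '&', PHP_QUERY_RFC3986);

echo $scoredUrl . PHP_EOL;
echo $filteredUrl . PHP_EOL;
```

Both URLs should return the same documents. Only the second lets Solr reuse the cached inStock:true filter for other main queries.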

Executing filter queries


To add a filter query to our existing query, first we need to create a filter query from our Solr query module.

$query = $client->createSelect();
$query->setQuery('cat:book');
$fquery = $query->createFilterQuery('Availability');

The string provided as a parameter to the createFilterQuery() function is used as the key for the filter query. This key can be used to retrieve the filter query associated with this query. Once the filter query object is available, we can use its setQuery() function to set the filter query string for this Solarium query.

In the preceding piece of code, we have created a filter query by the name of Availability. We will set the filter query for the key Availability to inStock:true and then execute the complete query as follows:

$fquery->setQuery('inStock:true');
$resultSet = $client->select($query);

Once the resultset is available, it can be iterated over to get and process the results.

Let us check Solr logs and see the query that was sent to...

Creating filter query configuration


We can also pass a filter query as a configuration parameter to the Solarium query using the addFilterQuery() function. For this, we first need to define the filter query as a configuration array and then add it to the Solarium query.

$fqconfig = array(
    "query" => "inStock:true",
    "key"   => "Availability",
);
$query = $client->createSelect();
$query->addFilterQuery($fqconfig);

The Solr query created by the preceding configuration is similar to the one created earlier. The benefit of using filter query configuration is that we can define multiple standard filter queries as configurations and add them in our Solr query as required. The addTag(String $tag) and addTags(array $tags) functions are used to define tags in the filter queries. We can use these tags to exclude certain filter queries in facets. We will go through an example later.
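The idea of keeping several standard filter queries as configurations and picking the ones needed can be sketched in plain PHP, without Solarium. The registry, the buildFqParams() helper, the "tag" entry, and the PriceUnderTen filter are all hypothetical names used for illustration. The sketch only shows the raw fq values that Solr would receive, where a tag is expressed as a {!tag=...} local parameter.

```php
<?php
// Hypothetical registry of reusable filter query configurations.
// "query" and "key" follow the configuration array above; the "tag" entry
// and the PriceUnderTen filter are assumptions made for this sketch.
$standardFilters = array(
    'Availability'  => array('key' => 'Availability', 'query' => 'inStock:true', 'tag' => 'inStockTag'),
    'PriceUnderTen' => array('key' => 'PriceUnderTen', 'query' => 'price:[0 TO 10]'),
);

/**
 * Turn the selected configurations into raw fq parameter values.
 * A tag becomes Solr's {!tag=...} local parameter prefix.
 */
function buildFqParams(array $filters, array $wantedKeys)
{
    $params = array();
    foreach ($wantedKeys as $key) {
        if (!isset($filters[$key])) {
            continue; // unknown keys are skipped silently in this sketch
        }
        $config = $filters[$key];
        $prefix = isset($config['tag']) ? '{!tag=' . $config['tag'] . '}' : '';
        $params[] = $prefix . $config['query'];
    }
    return $params;
}

$fq = buildFqParams($standardFilters, array('Availability', 'PriceUnderTen'));

// Solr expects one fq parameter per filter, so the name is repeated per value.
$pairs = array();
foreach ($fq as $value) {
    $pairs[] = 'fq=' . rawurlencode($value);
}
echo implode('&', $pairs) . PHP_EOL;
```

Each filter is sent as its own fq parameter, which allows Solr to cache and reuse the filters independently of one another.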

Faceting


Faceted searches break up the search results into multiple categories, showing counts for each category. Faceting is used in searches to drill down into a subset of results from a query. To get an idea of how facets are helpful, let us go to www.amazon.com and search for mobile phones. We will see facets on the left-hand side such as brand, display size, and carrier. Once we select a facet to drill down, we will see more facets that will help us narrow down the phone we would like to purchase.

Faceting is generally done on predefined, human-readable values such as location, price, and author name. It would not make sense to tokenize these fields, so facet fields are kept separate from search and sorting fields in the Solr schema. They are also not converted to lowercase but are kept as they are. Faceting is done on indexed fields in Solr, so there is no need to store faceted fields.

Solarium introduces the concept of facetset, which is one central component and can be used to...

Facet by field


Faceting by field counts the number of occurrences of a term in a specific field. Let us create facets on author and genre. There are separate string fields in our Solr index for indexing facet-related strings without any tokenization. In this case, the fields are author_s and genre_s.

Note

Fields ending with _s are dynamic fields defined in our Solr schema.xml. Dynamic fields defined as *_s match any field that ends in _s and all attributes in the field definition are applied on this field.

To create a facet on our author_s field, we need to get the facetset component from the Solarium query, create a facet field with a key, and set the index field on which the facet will be computed.

$query->setQuery('cat:book');
$facetset = $query->getFacetSet();
$facetset->createFacetField('author')->setField('author_s');

Set the number of facets to get using the following code:

$facetset->setLimit(5);

Return only the facet values that occur in at least one document.

$facetset->setMinCount...
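To make the effect of the limit and the minimum count concrete, here is a minimal sketch in plain PHP that mimics what a field facet computes. It does not use Solr or Solarium. The author values and the facetByField() helper are hypothetical, and ordering by count is assumed because that is how Solr orders facet values when a limit is set.

```php
<?php
// Hypothetical author_s values of the documents matched by cat:book.
$authors = array(
    'George R. R. Martin', 'Isaac Asimov', 'George R. R. Martin',
    'Orson Scott Card', 'Isaac Asimov', 'George R. R. Martin',
);

/**
 * Count occurrences of each value, drop values below $minCount,
 * order by count (highest first) and keep at most $limit entries.
 */
function facetByField(array $values, $limit, $minCount)
{
    $counts = array_count_values($values);
    $counts = array_filter($counts, function ($count) use ($minCount) {
        return $count >= $minCount;
    });
    arsort($counts);
    return array_slice($counts, 0, $limit, true);
}

foreach (facetByField($authors, 5, 1) as $author => $count) {
    echo $author . ': [' . $count . ']' . PHP_EOL;
}
```

Raising the minimum count to 2 would drop the author with a single book, and lowering the limit would cut the list from the bottom.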

Facet by query


We can use a facet query in addition to the normal query to get a count of the documents that match the facet query. The facet query does not change the main result set. Its count is computed over the documents matched by the main query and its filter queries, and tagged filter queries can be excluded from the count. Let's see the code to count books whose genre is fantasy, along with an example of excluding a filter query.

Let us first create a query to select all books in our index.

$query->setQuery('cat:book');

Create a filter query for books that are in stock and tag it.

$fquery = $query->createFilterQuery('inStock');
$fquery->setQuery('inStock:true');
$fquery->addTag('inStockTag');

Get the facetset component from our query using the following code:

$facetset = $query->getFacetSet();

Create a facet by query to count the number of books of a particular genre. Also, exclude the filter query we added earlier.

$facetqry = $facetset->createFacetQuery('genreFantasy');
$facetqry->setQuery('genre_s: fantasy');
$facetqry->addExclude('inStockTag');

Let us add another facet query...
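The effect of excluding a tagged filter query from a facet count can be illustrated with a small simulation in plain PHP. It does not talk to Solr. The sample documents, the closures standing in for the filter and facet queries, and the facetQueryCount() helper are all hypothetical, and the documents are assumed to be the ones already matched by cat:book.

```php
<?php
// Hypothetical documents already matched by the main query cat:book.
$books = array(
    array('id' => 'b1', 'genre_s' => 'fantasy', 'inStock' => true),
    array('id' => 'b2', 'genre_s' => 'fantasy', 'inStock' => false),
    array('id' => 'b3', 'genre_s' => 'scifi',   'inStock' => true),
    array('id' => 'b4', 'genre_s' => 'fantasy', 'inStock' => true),
);

// Stand-ins for the filter query inStock:true and the facet query on genre_s.
$inStock   = function (array $doc) { return $doc['inStock'] === true; };
$isFantasy = function (array $doc) { return $doc['genre_s'] === 'fantasy'; };

/**
 * Count documents matching the facet query. When $excludeFilter is true,
 * the filter is ignored for the count, as with a tagged and excluded fq.
 */
function facetQueryCount(array $docs, callable $filter, callable $facet, $excludeFilter)
{
    $base = $excludeFilter ? $docs : array_filter($docs, $filter);
    return count(array_filter($base, $facet));
}

echo 'With filter applied: ' . facetQueryCount($books, $inStock, $isFantasy, false) . PHP_EOL;
echo 'With filter excluded: ' . facetQueryCount($books, $inStock, $isFantasy, true) . PHP_EOL;
```

Excluding the filter is useful when the interface should show how many fantasy books exist in total, even while the result list itself shows only books that are in stock.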

Facet by range


Faceting can also be done on a range basis. For example, we can create facet counts of books for every two dollars. Using range faceting, we can give counts of books with prices from 0 to 2 dollars, from 2 to 4 dollars, and so on.

$facetqry = $facetset->createFacetRange('pricerange');
$facetqry->setField('price');
$facetqry->setStart(0);
$facetqry->setGap(2);
$facetqry->setEnd(16);

In the preceding code, we facet on prices starting at 0 dollars and ending at 16 dollars, in steps of 2 dollars. The following code will be used to display the range facets along with their counts after executing the query:

$facetCnts = $resultSet->getFacetSet()->getFacet('pricerange');
foreach($facetCnts as $range => $cnt){
  echo $range.' to '.($range+2).': ['.$cnt.']'."<br/>".PHP_EOL;
}
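The bucketing behind a range facet can be sketched in plain PHP as follows. This is a simulation under stated assumptions and not Solr's implementation. The prices and the facetByRange() helper are hypothetical, and each bucket is assumed to include its lower bound and exclude its upper bound, which is Solr's default for range facets.

```php
<?php
// Hypothetical book prices in dollars.
$prices = array(0.99, 1.5, 2.0, 3.75, 5.99, 7.5, 7.99, 12.0, 15.99, 18.0);

/**
 * Count values per bucket of width $gap between $start and $end.
 * Each bucket includes its lower bound and excludes its upper bound.
 */
function facetByRange(array $values, $start, $gap, $end)
{
    $buckets = array();
    for ($lower = $start; $lower < $end; $lower += $gap) {
        $buckets[$lower] = 0;
    }
    foreach ($values as $value) {
        if ($value < $start || $value >= $end) {
            continue; // outside the faceted range
        }
        $lower = $start + $gap * (int) floor(($value - $start) / $gap);
        $buckets[$lower]++;
    }
    return $buckets;
}

$gap = 2;
foreach (facetByRange($prices, 0, $gap, 16) as $range => $cnt) {
    echo $range . ' to ' . ($range + $gap) . ': [' . $cnt . ']' . PHP_EOL;
}
```

A price of exactly 2.0 falls into the 2 to 4 bucket, and the price of 18.0 is not counted because it lies beyond the end of the range.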

Facet by range output

5481523 [http-bio-8080-exec-4] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/select params={facet=true&f.price.facet.range.gap=2&facet.range={!key%3Dpricerange...

Facet by pivot


In addition to the different ways of creating facets, there is a concept of facet by pivots that is provided by Solr and is exposed via Solarium. Pivot faceting allows us to create facets within the results of the parent facet. The input to pivot faceting is a set of fields to pivot on. Multiple fields create multiple sections in the response.

Here is the code to create a facet pivot on genre and availability (in stock):

$facetqry = $facetset->createFacetPivot('genre-instock');
$facetqry->addFields('genre_s,inStock');

To display the pivots, we have to get all facets from the resultset.

$facetResult = $resultSet->getFacetSet()->getFacet('genre-instock');

And for each facet, get the field, value, and count for the facet and more facet pivots within the facet.

foreach ($facetResult as $pivot) {
  echo 'Field: '.$pivot->getField().PHP_EOL;
  echo 'Value: '.$pivot->getValue().PHP_EOL;
  echo 'Count: '.$pivot->getCount().PHP_EOL;
}

Also get all pivots inside this facet and process them in the same fashion...
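Processing nested pivots in the same fashion is naturally recursive. The following sketch shows the idea in plain PHP on an array shaped like the pivot section of a Solr response, where every entry carries a field, a value, a count, and optional nested pivots. The sample counts and the renderPivots() helper are hypothetical, and the sketch works on arrays, not on Solarium result objects.

```php
<?php
// Hypothetical pivot data shaped like Solr's pivot response:
// every entry has a field, a value, a count and optional nested pivots.
$pivots = array(
    array('field' => 'genre_s', 'value' => 'fantasy', 'count' => 3, 'pivot' => array(
        array('field' => 'inStock', 'value' => 'true',  'count' => 2),
        array('field' => 'inStock', 'value' => 'false', 'count' => 1),
    )),
    array('field' => 'genre_s', 'value' => 'scifi', 'count' => 1, 'pivot' => array(
        array('field' => 'inStock', 'value' => 'true', 'count' => 1),
    )),
);

/**
 * Render a pivot list as indented lines, recursing into nested pivots.
 */
function renderPivots(array $pivots, $depth = 0)
{
    $lines = array();
    foreach ($pivots as $pivot) {
        $lines[] = str_repeat('  ', $depth)
            . $pivot['field'] . '=' . $pivot['value'] . ' [' . $pivot['count'] . ']';
        if (!empty($pivot['pivot'])) {
            $lines = array_merge($lines, renderPivots($pivot['pivot'], $depth + 1));
        }
    }
    return $lines;
}

echo implode(PHP_EOL, renderPivots($pivots)) . PHP_EOL;
```

The indentation level reflects the pivot depth, so adding a third pivot field would only add another level of nesting without changing the function.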

Summary


In this chapter, we saw the advanced query functionality of Solr. We defined filter queries and saw the benefits of using them instead of normal queries. We saw how to do faceting in Solr using PHP and Solarium, and the different ways to facet results: by field, by query, by range, and by pivot. We also saw the actual queries being executed on Solr and, in some cases, executed the query on Solr directly and saw the results.

In the next chapter, we will explore highlighting of search results using PHP and Solr.
