Apache Solr PHP Integration
Build a fully-featured and scalable search application using PHP to unlock the search functions provided by Solr with this book and ebook.
Search is an integral part of any web application built today. Whether it is a content site, a job site, an e-commerce site, or any other website, search plays a very important role in helping users locate the information they are looking for. As a developer, it is imperative to give the users of a website all possible tools for searching and narrowing down to the required information. Apache Solr is a full-text search engine that provides a large list of search features. PHP is a widely used programming language for building websites.
This article, written by Jayant Kumar, author of the book Apache Solr PHP Integration, aims to make the integration between Apache Solr and PHP easy for you. We will start with Solr installation, look at how Solr can be integrated with PHP, and then explore the features provided by Solr through PHP code. After going through the book, you should be able to integrate almost all features provided by Solr into your PHP applications.
We will be looking at installation on both Windows and Linux environments. We will be using the Solarium library for communication between Solr and PHP.
This article gives a brief overview of the Solarium library and showcases some of the concepts and configuration options on the Solr end for implementing certain features.
Calling Solr using PHP code
A ping query is used in Solr to check the status of the Solr server. The Solr URL for executing the ping query is http://localhost:8080/solr/collection1/admin/ping/?wt=json.
Response of Solr ping query in browser
We can use Curl to get the ping response from Solr via PHP code. Sample code for executing the previous ping query is shown below.
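// Initialise a Curl handle for the Solr ping URL.
$curl = curl_init("http://localhost:8080/solr/collection1/admin/ping/?wt=json");
// Return the response as a string instead of printing it directly.
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($curl);
// Decode the JSON response into an associative array.
$data = json_decode($output, true);
echo "Ping Status : ".$data["status"].PHP_EOL;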
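The Curl example assumes that the request succeeds and that the body is valid JSON. A minimal sketch of a more defensive reading of the response follows. The helper name solrPingStatus is hypothetical (it is not part of Solr or Solarium), and the sample body is only assumed to match the shape of a ping response.

```php
<?php
// Hypothetical helper (not part of Solr or Solarium): extracts the ping
// status from a raw JSON response body. Returns null when the body is
// missing, is not valid JSON, or has no "status" key.
function solrPingStatus($body)
{
    if (!is_string($body) || $body === '') {
        return null; // curl_exec() returns false when the request fails
    }
    $data = json_decode($body, true);
    if (!is_array($data) || !isset($data['status'])) {
        return null;
    }
    return (string) $data['status'];
}

// Illustrative sample body, assumed to resemble a Solr ping response.
$sample = '{"responseHeader":{"status":0,"QTime":1},"status":"OK"}';
$status = solrPingStatus($sample);
echo "Ping Status : " . ($status === null ? 'unavailable' : $status) . PHP_EOL;
```

In the Curl example, the value returned by curl_exec() could be passed to this helper in place of $sample.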
Although Curl can be used to execute almost any query on Solr, it is preferable to use a library that does the work for us. In our case, we will be using Solarium. To execute the same query on Solr using the Solarium library, the code is as follows.
include_once("vendor/autoload.php");
$config = array(
    "endpoint" => array(
        "localhost" => array(
            "host" => "127.0.0.1",
            "port" => "8080",
            "path" => "/solr",
            "core" => "collection1",
        ),
    ),
);
We have included the Solarium library in our code and defined the connection parameters for our Solr server.
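To see how these connection parameters relate to the ping URL used earlier, the sketch below assembles a URL from an endpoint array of the same shape. The helper solrEndpointUrl is hypothetical and the http scheme is an assumption. It only illustrates the mapping and does not claim to reproduce how Solarium builds its requests internally.

```php
<?php
// Hypothetical helper: assembles a Solr URL from an endpoint array shaped
// like the one in the Solarium configuration. The "http" scheme is assumed.
function solrEndpointUrl(array $endpoint, $handler, array $params = array())
{
    $url = 'http://' . $endpoint['host'] . ':' . $endpoint['port']
         . rtrim($endpoint['path'], '/') . '/' . $endpoint['core']
         . '/' . ltrim($handler, '/');
    if ($params) {
        // Append the query string, for example wt=json.
        $url .= '?' . http_build_query($params);
    }
    return $url;
}

$endpoint = array(
    'host' => '127.0.0.1',
    'port' => '8080',
    'path' => '/solr',
    'core' => 'collection1',
);
$pingUrl = solrEndpointUrl($endpoint, 'admin/ping', array('wt' => 'json'));
echo $pingUrl . PHP_EOL;
```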
Next, we need to create a Solarium client with the previous Solr configuration and call the createPing() function to create the ping query.
$client = new Solarium\Client($config);
$ping = $client->createPing();
Finally, execute the ping query and get the result.
$result = $client->ping($ping);
$result->getStatus();
The output should be similar to the one shown below.
Output of ping query using PHP
Adding documents to Solr index
To create a Solr index, we need to add documents to it using the command line, the Solr web interface, or our PHP program. But before we create a Solr index, we need to define its structure, or schema. The schema consists of fields and field types. It defines how each field will be treated and handled during indexing and during search. Let us see a small piece of code for adding documents to the Solr index using PHP and the Solarium library.
Create a Solarium client. Create an instance of the update query. Create the document in PHP and finally add fields to the document.
$client = new Solarium\Client($config);
$updateQuery = $client->createUpdate();
$doc1 = $updateQuery->createDocument();
$doc1->id = 112233445;
$doc1->cat = 'book';
$doc1->name = 'A Feast For Crows';
$doc1->price = 8.99;
$doc1->inStock = 'true';
$doc1->author = 'George R.R. Martin';
$doc1->series_t = '"A Song of Ice and Fire"';
The id field has been marked as unique in our schema, so every document that we add to Solr must have a different value in its id field.
Add the documents to the update query, followed by a commit command. Finally, execute the query.
$updateQuery->addDocuments(array($doc1));
$updateQuery->addCommit();
$result = $client->update($updateQuery);
Let us execute the code.
After executing the code, a search for martin will return this document in the result.
Document added to Solr index
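Because the id field is the unique key, Solr replaces an existing document when a new one arrives with the same id, so a repeated id within a batch silently overwrites an earlier document. A minimal sketch of a pre-flight check is shown below. The helper duplicateDocumentIds is hypothetical, documents are represented as plain associative arrays rather than Solarium document objects, and the second and third entries in the batch are made-up illustrations.

```php
<?php
// Hypothetical helper: returns the id values that occur more than once in a
// batch of documents, each represented as an associative array with an "id".
function duplicateDocumentIds(array $docs)
{
    $ids = array();
    foreach ($docs as $doc) {
        $ids[] = (string) $doc['id'];
    }
    // Count occurrences per id and keep only those seen more than once.
    $counts = array_count_values($ids);
    $repeated = array_keys(array_filter($counts, function ($n) {
        return $n > 1;
    }));
    // Numeric array keys come back as integers, so cast them to strings.
    return array_map('strval', $repeated);
}

// Illustrative batch: the second entry reuses the id of the first.
$batch = array(
    array('id' => 112233445, 'name' => 'A Feast For Crows'),
    array('id' => 112233445, 'name' => 'Illustrative duplicate entry'),
    array('id' => 112233446, 'name' => 'Illustrative second book'),
);
$duplicates = duplicateDocumentIds($batch);
echo "Duplicate ids: " . implode(', ', $duplicates) . PHP_EOL;
```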
Executing search on Solr Index
Documents added to the Solr index can be searched using the following piece of PHP code.
$selectConfig = array(
    'query' => 'cat:book AND author:Martin',
    'start' => 3,
    'rows' => 3,
    'fields' => array('id','name','price','author'),
    'sort' => array('price' => 'asc')
);
$query = $client->createSelect($selectConfig);
$resultSet = $client->select($query);
The above code creates a simple Solr query that searches for book in the cat field and Martin in the author field. The results are sorted in ascending order of price, and the fields returned are the id, name, price, and author of the book. Pagination is implemented as 3 results per page. Because start is a zero-based offset, a value of 3 skips the first three results, so this query returns the second page, beginning with the 4th result.
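The relation between a page number and the start parameter can be sketched as follows. The helper solrStartOffset is hypothetical and assumes that pages are numbered from 1.

```php
<?php
// Hypothetical helper: converts a 1-based page number into the zero-based
// "start" offset expected by Solr. Page numbers below 1 are treated as 1.
function solrStartOffset($page, $rows)
{
    return (max(1, (int) $page) - 1) * (int) $rows;
}

$rows = 3;
$selectConfig = array(
    'query' => 'cat:book AND author:Martin',
    'start' => solrStartOffset(2, $rows), // second page of 3 results
    'rows'  => $rows,
);
echo "start = " . $selectConfig['start'] . PHP_EOL;
```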
In addition to this simple select query, Solr also supports some advanced query modes known as dismax and edismax. With the help of these query modes, we can boost certain fields to give them more importance in our query. We can also use function queries to apply dynamic boosting based on the values in fields.
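As a rough illustration of field boosting, the sketch below builds the raw request parameters for an edismax query with PHP's http_build_query. The parameter names defType and qf are Solr's own, while the field list and the boost value of 2 are assumptions chosen for illustration. Solarium offers its own interface for these query modes, which is not shown here.

```php
<?php
// Raw Solr parameters for an edismax query. The "^2" suffix boosts matches
// in the name field relative to the other listed fields (illustrative value).
$edismaxParams = array(
    'q'       => 'martin',
    'defType' => 'edismax',
    'qf'      => 'name^2 author series_t',
);

// Build the URL-encoded query string that would follow ".../select?".
$edismaxQueryString = http_build_query($edismaxParams);
echo $edismaxQueryString . PHP_EOL;
```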
If no sort order is provided, Solr sorts results by document score, which is calculated from the terms in the query and the matching terms in the indexed documents. Two main factors drive the score of each document in the result set: term frequency (tf), which measures how often a term occurs in the document, and inverse document frequency (idf), which gives more weight to terms that occur in fewer documents.
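To make the two factors concrete, here is a simplified sketch based on the classic Lucene-style formulas, where tf is the square root of the term count and idf is 1 + ln(numDocs / (docFreq + 1)). This is an assumption about the configured similarity. The score Solr actually reports also depends on field norms, boosts, and other factors, so the numbers below will not match real scores. The function names and the sample counts are hypothetical.

```php
<?php
// Simplified sketch of classic Lucene-style tf and idf factors.
// Real Solr scoring applies additional factors (norms, boosts, and so on).

// Term frequency factor: grows with the number of occurrences in a document.
function termFrequencyFactor($occurrences)
{
    return sqrt($occurrences);
}

// Inverse document frequency: larger for terms found in fewer documents.
function inverseDocumentFrequency($docFreq, $numDocs)
{
    return 1 + log($numDocs / ($docFreq + 1));
}

// Hypothetical counts: a term occurs 4 times in a document, and 9 of the
// 1000 documents in the index contain it.
$tf  = termFrequencyFactor(4);
$idf = inverseDocumentFrequency(9, 1000);
printf("tf = %.3f, idf = %.3f\n", $tf, $idf);
```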
Solr also provides a way of narrowing down the results using filter queries. In addition, facets can be created based on fields in the index, and end users can use them to narrow down the results.
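At the level of raw request parameters, a filter query and a field facet could look like the sketch below. The parameter names fq, facet, and facet.field are Solr's own, while the particular filter (inStock:true) and facet field (cat) are assumptions based on the fields of the sample document. Solarium wraps these parameters in its own interface, which is not shown here.

```php
<?php
// Raw Solr parameters combining a main query, a filter query, and a facet.
$facetParams = array(
    'q'           => 'martin',
    'fq'          => 'inStock:true', // restrict results without affecting scores
    'facet'       => 'true',         // enable faceting
    'facet.field' => 'cat',          // count matching documents per category
    'wt'          => 'json',
);

// Build the URL-encoded query string that would follow ".../select?".
$facetQueryString = http_build_query($facetParams);
echo $facetQueryString . PHP_EOL;
```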
Highlighting search results using PHP and Solr
Solr can highlight the fields returned in a search result based on the query. Here is sample code for highlighting the results for the search keyword harry.
Get the highlighting component from the query, set the fields to be highlighted, and set the HTML tags to be used for highlighting.
$hl = $query->getHighlighting();
$hl->setFields('name,series_t');
$hl->setSimplePrefix('<strong>')->setSimplePostfix('</strong>');
Once the query has run and the result set is received, we need to retrieve the highlighted results from the result set. Here is the output of the highlighting code.
Highlighted search results
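In a raw JSON response, Solr returns highlighted snippets in a separate highlighting section keyed by document id, next to the list of matching documents. The sketch below reads such a section with plain PHP. The response body, the document id, and the book title are made up for illustration, and the helper highlightedSnippets is hypothetical. With Solarium, the same information is retrieved from the result set instead.

```php
<?php
// Illustrative response body (made-up document) shaped like a Solr JSON
// response that includes a "highlighting" section keyed by document id.
$body = '{"response":{"numFound":1,"start":0,"docs":['
      . '{"id":"book-1001","name":"Harry Potter and the Chamber of Secrets"}]},'
      . '"highlighting":{"book-1001":{"name":'
      . '["<strong>Harry</strong> Potter and the Chamber of Secrets"]}}}';

// Hypothetical helper: returns the highlighted snippets for one field of one
// document, or an empty array when nothing was highlighted.
function highlightedSnippets(array $data, $docId, $field)
{
    if (isset($data['highlighting'][$docId][$field])) {
        return $data['highlighting'][$docId][$field];
    }
    return array();
}

$data = json_decode($body, true);
foreach ($data['response']['docs'] as $doc) {
    $snippets = highlightedSnippets($data, $doc['id'], 'name');
    // Fall back to the plain field value when there is no highlight.
    echo ($snippets ? implode(' ... ', $snippets) : $doc['name']) . PHP_EOL;
}
```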
In addition to highlighting, Solr can be used to create a spelling suggester and a spell checker. A spelling suggester can propose queries to the end user as the user types. A spell checker can offer spelling corrections to the user, similar to "did you mean". Solr can also find documents that are similar to a given document based on the words in certain fields. This functionality is known as "more like this" and is exposed via Solarium by the MoreLikeThis component. Solr also provides grouping of results based on a particular query or a certain field.
Solr can be scaled to handle a large number of search requests by using a master-slave architecture. If the index is huge, it can be sharded across multiple Solr instances, and we can run a distributed search to get results for our query from all the sharded instances. Solarium provides a load-balancing plug-in that can be used to balance queries across a master-slave architecture.
Solr provides an extensive list of features for implementing search. These features can be easily accessed in PHP using the Solarium library to build a full-featured search application that can power search on any website.
About the Author
Jayant Kumar is an experienced software professional with a Bachelor of Engineering in Computer Science and more than 12 years of experience in architecting and developing large-scale web applications.
Jayant is an expert on search technologies and PHP and has been working with Lucene and Solr for more than 10 years now. He has been the key person responsible for introducing Lucene as a search engine in www.naukri.com, the most successful job portal in India.
Jayant has played many different important roles throughout his career, including software developer, team leader, project manager, and architect, but his primary focus has been on building scalable solutions on the web. Currently, he is associated with the digital division of HT Media as the Chief Architect responsible for the job site www.shine.com.