Using Additional Solr Functionalities

by Rafał Kuć | July 2011 | Open Source

There are many features of Solr that we don't use every day. Highlighting, sorting results, or ignoring words may not be in everyday use, but they can come in handy in many situations. In this article, Rafał Kuć, author of Apache Solr 3.1 Cookbook, shows how to overcome some typical problems by using these Solr functionalities.

Specifically, we will cover:

  • Getting more documents similar to those returned in the results list
  • Presenting search results in a fast and easy way
  • Highlighting matched words
  • How to highlight long text fields and get good performance
  • Sorting results by a function value
  • Searching words by how they sound
  • Ignoring defined words

 


Getting more documents similar to those returned in the results list

Let's imagine a situation where you have an e-commerce library shop and you want to show users the books similar to the ones they found while using your application. This recipe will show you how to do that.

How to do it...

Let's assume that we have the following index structure (just add this to your schema.xml file's fields section):

<field name="id" type="string" indexed="true" stored="true"
required="true" />
<field name="name" type="text" indexed="true" stored="true"
termVectors="true" />

The test data looks like this:

<add>
<doc>
<field name="id">1</field>
<field name="name">Solr Cookbook first edition</field>
</doc>
<doc>
<field name="id">2</field>
<field name="name">Solr Cookbook second edition</field>
</doc>
<doc>
<field name="id">3</field>
<field name="name">Solr by example first edition</field>
</doc>
<doc>
<field name="id">4</field>
<field name="name">My book second edition</field>
</doc>
</add>

Let's assume that our hypothetical user wants to find books that have edition in their names. However, we also want to show similar books. To do that, we send the following query:

http://localhost:8983/solr/select?q=name:edition&mlt=true&mlt.fl=name&mlt.mintf=1&mlt.mindf=1

The results returned by Solr are as follows:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1</int>
<lst name="params">
<str name="mlt.mindf">1</str>
<str name="mlt.fl">name</str>
<str name="q">name:edition</str>
<str name="mlt.mintf">1</str>
<str name="mlt">true</str>
</lst>
</lst>
<result name="response" numFound="1" start="0">
<doc>
<str name="id">3</str>
<str name="name">Solr by example first edition</str>
</doc>
</result>
<lst name="moreLikeThis">
<result name="3" numFound="3" start="0">
<doc>
<str name="id">1</str>
<str name="name">Solr Cookbook first edition</str>
</doc>
<doc>
<str name="id">2</str>
<str name="name">Solr Cookbook second edition</str>
</doc>
<doc>
<str name="id">4</str>
<str name="name">My book second edition</str>
</doc>
</result>
</lst>
</response>

Now let's see how it works.

How it works...

As you can see, the index structure and the data are really simple. One thing to notice is that the termVectors attribute is set to true in the name field definition. It is a nice thing to have when using the more like this component, and it should be enabled whenever possible on the fields on which we plan to use the component.

Now let's take a look at the query. As you can see, we added some additional parameters besides the standard q one. The mlt=true parameter says that we want to add the more like this component to the result processing. Next, the mlt.fl parameter specifies which fields we want to use with the more like this component; in our case, the name field. The mlt.mintf parameter tells Solr to ignore terms from the source document (the ones from the original results list) with a term frequency below the given value; in our case, we don't want to include terms with a frequency lower than 1. The last parameter, mlt.mindf, tells Solr that words appearing in fewer documents than the given value should be ignored; in our case, we want to consider words that appear in at least one document.

Finally, let's take a look at the search results. As you can see, there is an additional section (<lst name="moreLikeThis">) that contains the more like this component results. For each document in the results list, a separate result section is added to the response. In our case, Solr added a section for the document with the unique identifier 3 (<result name="3" numFound="3" start="0">), and three similar documents were found. The name attribute of each such result holds the unique identifier of the document for which the similar documents were calculated.
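There is one more parameter worth knowing about here. If you only want a few similar documents for each result, you can additionally pass the mlt.count parameter (a standard more like this component parameter, not used in the recipe above), which limits the number of similar documents returned for each document in the results list. For example, to get at most two similar books per result, the query could look like this:

http://localhost:8983/solr/select?q=name:edition&mlt=true&mlt.fl=name&mlt.mintf=1&mlt.mindf=1&mlt.count=2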

Presenting search results in a fast and easy way

Imagine a situation where you have to show a client a prototype of your brilliant search algorithm made with Solr. But the client doesn't want to wait another four weeks to see the potential of the algorithm; they want to see it very soon. On the other hand, you don't want to show a raw XML results page. What to do then? This recipe will show you how you can use the Velocity response writer (a.k.a. Solritas) to present a prototype fast.

How to do it...

Let's assume that we have the following index structure (just add this to the fields section of your schema.xml file):

<field name="id" type="string" indexed="true" stored="true"
required="true" />
<field name="name" type="text" indexed="true" stored="true" />

The test data looks like this:

<add>
<doc>
<field name="id">1</field>
<field name="name">Solr Cookbook first edition</field>
</doc>
<doc>
<field name="id">2</field>
<field name="name">Solr Cookbook second edition</field>
</doc>
<doc>
<field name="id">3</field>
<field name="name">Solr by example first edition</field>
</doc>
<doc>
<field name="id">4</field>
<field name="name">My book second edition</field>
</doc>
</add>

We need to add the response writer definition. To do this, add the following to your solrconfig.xml file (actually, this should already be in the configuration file):

<queryResponseWriter name="velocity" class="org.apache.solr.
request.VelocityResponseWriter"/>

Now let's set up the Velocity response writer. To do that we add the following section to the solrconfig.xml file (actually this should already be in the configuration file):

<requestHandler name="/browse" class="solr.SearchHandler">
<lst name="defaults">
<str name="wt">velocity</str>
<str name="v.template">browse</str>
<str name="v.layout">layout</str>
<str name="title">Solr cookbook example</str>
<str name="defType">dismax</str>
<str name="q.alt">*:*</str>
<str name="rows">10</str>
<str name="fl">*,score</str>
<str name="qf">name</str>
</lst>
</requestHandler>

Now you can run Solr and type the following URL address:

http://localhost:8983/solr/browse

You should see the Solritas browse page with a simple search interface.

How it works...

As you can see, the index structure and the data are really simple, so I'll skip discussing this part of the recipe.

The first thing in configuring the solrconfig.xml file is adding the Velocity Response Writer definition. By adding it, we tell Solr that we will be using Velocity templates to render the view.

Now we add the search handler to use the Velocity Response Writer. Of course, we could pass the parameters with every query, but we don't want to do that, we want them to be added by Solr automatically. Let's go through the parameters:

  • wt: The response writer type; in our case, we will use the Velocity Response Writer.
  • v.template: The template that will be used for rendering the view; in our case, the template that Velocity will use is in the browse.vm file (the vm postfix is added by Velocity automatically). This parameter tells Velocity which file is responsible for rendering the actual page contents.
  • v.layout: The layout that will be used for rendering the view; in our case, the template that Velocity will use is in the layout.vm file (the vm postfix is added by Velocity automatically). This parameter specifies how all the web pages rendered by Solritas will look.
  • title: The title of the page.
  • defType: The parser that we want to use.
  • q.alt: The alternate query for the dismax parser, used in case the q parameter is not defined.
  • rows: The maximum number of documents that should be returned.
  • fl: The fields that should be listed in the results.
  • qf: The fields that should be searched.

Of course, the page generated by the Velocity Response Writer is just an example. To modify the page, you should modify the Velocity files, but this is beyond the scope of this article.
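One thing worth remembering is that the parameters we put in the defaults section are just that: defaults. You can still override any of them at query time. For example, to search for the word cookbook and show at most five results on the generated page, you could type:

http://localhost:8983/solr/browse?q=cookbook&rows=5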

There's more...

If you are still using Solr 1.4.1 or 1.4, there is one more thing that can be useful.

Running Solritas on Solr 1.4.1 or 1.4

Because the Velocity Response Writer is a contrib module in Solr 1.4.1, we need to do the following operations to use it. Copy the following libraries from the /contrib/velocity/src/main/solr/lib directory to the /lib directory of your Solr instance:

  • apache-solr-velocity-1.4.dev.jar
  • commons-beanutils-1.7.0.jar
  • commons-collections-3.2.1.jar
  • velocity-1.6.1.jar
  • velocity-tools-2.0-beta3.jar

Then copy the /velocity directory (along with its contents) from the code examples to your Solr configuration directory.


Highlighting matched words

Imagine a situation where you want to show your users which words were matched in the documents shown in the results list. For example, you may want to show which words in the book name were matched. Do you have to store the documents and do the matching on the application side? The answer is no: we can force Solr to do that for us, and this recipe will show you how.

How to do it...

Let's assume that we have the following index structure (just add this to the fields section of your schema.xml file):

<field name="id" type="string" indexed="true" stored="true"
required="true" />
<field name="name" type="text" indexed="true" stored="true" />

The test data looks like this:

<add>
<doc>
<field name="id">1</field>
<field name="name">Solr Cookbook first edition</field>
</doc>
<doc>
<field name="id">2</field>
<field name="name">Solr Cookbook second edition</field>
</doc>
<doc>
<field name="id">3</field>
<field name="name">Solr by example first edition</field>
</doc>
<doc>
<field name="id">4</field>
<field name="name">My book second edition</field>
</doc>
</add>

Let's assume that our user is searching for the word 'book'. To tell Solr that we want to highlight the matches, we send the following query:

http://localhost:8983/solr/select?q=name:book&hl=true

The response from Solr should be like this:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">15</int>
<lst name="params">
<str name="hl">true</str>
<str name="q">name:book</str>
</lst>
</lst>
<result name="response" numFound="1" start="0">
<doc>
<str name="id">4</str>
<str name="name">My book second edition</str>
</doc>
</result>
<lst name="highlighting">
<lst name="4">
<arr name="name">
<str>My &lt;em&gt;book&lt;/em&gt; second edition</str>
</arr>
</lst>
</lst>
</response>

As you can see, besides the normal results list, we got the highlighting results (the highlighting results are grouped by the <lst name="highlighting"> XML tag). The word book is surrounded with the <em> and </em> HTML tags. So everything is working as intended. Now let's see how it works.

How it works...

As you can see, the index structure and the data are really simple, so I'll skip discussing this part of the recipe. Please note that in order to use the highlighting mechanism, your fields should be stored and not analyzed by aggressive filters (like stemming). Otherwise, the highlighting results can be misleading to the users. A simple example of such behavior: the user types the word bought in the search box, but Solr highlights the word buy because of the stemming algorithm.

The query is also not complicated. We can see the standard q parameter that passes the query to Solr. However, there is also one additional parameter, hl, set to true. This parameter tells Solr to include the highlighting component results in the response. As you can see in the results list, in addition to the standard results there is a new section, <lst name="highlighting">, which contains the highlighting results. For every document found, in our case the only one (<lst name="4"> means that the highlighting result is presented for the document with the unique identifier value of 4), there is a list of fields that contain the sample data with the matched word (or words) highlighted. By highlighted, I mean surrounded with an HTML tag, in this case, the <em> tag.

You should also remember one thing—if you are using the standard LuceneQParser then the default field used for highlighting will be the one set in the schema.xml file. If you are using the DismaxQParser, then the default fields used for highlighting are the ones specified by the qf parameter.
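For example, with the DismaxQParser, a query like the following should highlight the matches in the name field without specifying the hl.fl parameter, because name is listed in qf (a sketch based on our example data):

http://localhost:8983/solr/select?defType=dismax&qf=name&q=book&hl=true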

There's more...

There are a few things that can be useful when using the highlighting mechanism.

Specifying the fields for highlighting

In many real-life situations, we want to decide which fields to show highlighting for. To do that, you must add an additional parameter, hl.fl, with a comma-separated list of fields. For example, if we would like to show highlighting for the name and description fields, our query should look like this:

http://localhost:8983/solr/select?q=name:book&hl=true&hl.fl=name,description

Changing the default HTML tags that surround the matched word

There are situations where you would like to change the default <em> and </em> HTML tags to ones of your choice. To do that, you should add the hl.simple.pre and hl.simple.post parameters. The first one specifies the prefix that will be added in front of the matched word, and the second one specifies the postfix that will be added after it. For example, if you would like to surround the matched word with the <b> and </b> HTML tags, the query would look like this:

http://localhost:8983/solr/select?q=name:book&hl=true&hl.simple.pre=<b>&hl.simple.post=</b>
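Note that when you send such a query from a browser or an HTTP library, the angle brackets may need to be URL-encoded. The same query with the parameter values encoded would look like this:

http://localhost:8983/solr/select?q=name:book&hl=true&hl.simple.pre=%3Cb%3E&hl.simple.post=%3C%2Fb%3E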

How to highlight long text fields and get good performance

In certain situations, the standard highlighting mechanism may not perform as well as you would like it to. For example, you may have long text fields and want the highlighting mechanism to work with them. This recipe will show you how to do that.

How to do it...

Let's assume that we have the following index structure (just add this to the fields section of your schema.xml file):

<field name="id" type="string" indexed="true" stored="true"
required="true" />
<field name="name" type="text" indexed="true" stored="true"
termVectors="true" termPositions="true" termOffsets="true" />

The test data looks like this:

<add>
<doc>
<field name="id">1</field>
<field name="name">Solr Cookbook first edition</field>
</doc>
<doc>
<field name="id">2</field>
<field name="name">Solr Cookbook second edition</field>
</doc>
<doc>
<field name="id">3</field>
<field name="name">Solr by example first edition</field>
</doc>
<doc>
<field name="id">4</field>
<field name="name">My book second edition</field>
</doc>
</add>

Let's assume that our user is searching for the word book. To tell Solr that we want to highlight the matches we send the following query:

http://localhost:8983/solr/select?q=name:book&hl=true&hl.useFastVectorHighlighter=true

The response from Solr should be like this:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">15</int>
<lst name="params">
<str name="hl">true</str>
<str name="q">name:book</str>
</lst>
</lst>
<result name="response" numFound="1" start="0">
<doc>
<str name="id">4</str>
<str name="name">My book second edition</str>
</doc>
</result>
<lst name="highlighting">
<lst name="4">
<arr name="name">
<str>My &lt;em&gt;book&lt;/em&gt; second edition</str>
</arr>
</lst>
</lst>
</response>

As you can see, everything is working as intended. Now let's see how it works.

How it works...

As you can see, the index structure and the data are really simple, but there is a difference between using the standard highlighter and the new FastVectorHighlighter. To be able to use the new highlighting mechanism, you need to store the information about term vectors, positions, and offsets. This is done by adding the following attributes to the field definition or to the type definition: termVectors="true", termPositions="true", termOffsets="true".

Please note that in order to use the highlighting mechanism, your fields should be stored and not analyzed by aggressive filters (like stemming). Otherwise, the highlighting results can be misleading to the users. A simple example of such behavior: the user types the word bought in the search box, but Solr highlights the word buy because of the stemming algorithm.

The query is also not complicated. We can see the standard q parameter that passes the query to Solr. However, there is also one additional parameter, hl, set to true. This parameter tells Solr to include the highlighting component results in the response. In addition, we add the parameter that tells Solr to use the FastVectorHighlighter: hl.useFastVectorHighlighter=true.

As you can see in the results list, in addition to the standard results there is a new section, <lst name="highlighting">, which contains the highlighting results. For every document found, in our case the only one (<lst name="4"> means that the highlighting result is presented for the document with the unique identifier value of 4), there is a list of fields that contain the sample data with the matched word (or words) highlighted. By highlighted, I mean surrounded with an HTML tag, in this case, the <em> tag.
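If you don't want to pass the highlighting parameters with every query, you can, for example, add them to the defaults section of your search handler in the solrconfig.xml file. This is only a minimal sketch, assuming a standard solr.SearchHandler named /select; adjust it to the handler you actually use:

<requestHandler name="/select" class="solr.SearchHandler">
<lst name="defaults">
<!-- turn highlighting on for every request handled here -->
<str name="hl">true</str>
<!-- highlight matches in the name field -->
<str name="hl.fl">name</str>
<!-- use the FastVectorHighlighter instead of the standard one -->
<str name="hl.useFastVectorHighlighter">true</str>
</lst>
</requestHandler>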

Sorting results by a function value

Let's imagine that you have an application that allows users to search through the companies stored in the index. You would like to add an additional feature to your application: sorting the results by the distance from a given geographical point. Is this possible with Solr? Yes, and this recipe will show you how to do that.

How to do it...

Let's assume that we have the following index structure (just add this to the fields section of your schema.xml file):

<field name="id" type="string" indexed="true" stored="true"
required="true" />
<field name="name" type="text" indexed="true" stored="true" />
<field name="geoX" type="float" indexed="true" stored="true" />
<field name="geoY" type="float" indexed="true" stored="true" />

And the data looks like this:

<add>
<doc>
<field name="id">1</field>
<field name="name">Company one</field>
<field name="geoX">10.1</field>
<field name="geoY">10.1</field>
</doc>
<doc>
<field name="id">2</field>
<field name="name">Company two</field>
<field name="geoX">11.1</field>
<field name="geoY">11.1</field>
</doc>
<doc>
<field name="id">3</field>
<field name="name">Company three</field>
<field name="geoX">12.2</field>
<field name="geoY">12.2</field>
</doc>
</add>

Let's assume that our hypothetical user searches for the word company and is located at the geographical point (13, 13). So, in order to show the results of the query sorted by the distance from that point, we send the following query to Solr:

http://localhost:8983/solr/select?q=name:company&sort=dist(2,geoX,geoY,13,13)+asc

The results list returned by the query looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">10</int>
<lst name="params">
<str name="sort">dist(2,geoX,geoY,13,13) asc</str>
<str name="q">name:company</str>
<str name="defType">dismax</str>
</lst>
</lst>
<result name="response" numFound="3" start="0">
<doc>
<float name="geoX">12.2</float>
<float name="geoY">12.2</float>
<str name="id">3</str>
<str name="name">Company three</str>
</doc>
<doc>
<float name="geoX">11.1</float>
<float name="geoY">11.1</float>
<str name="id">2</str>
<str name="name">Company two</str>
</doc>
<doc>
<float name="geoX">10.1</float>
<float name="geoY">10.1</float>
<str name="id">1</str>
<str name="name">Company one</str>
</doc>
</result>
</response>

As you can see, everything is working as it should be. So now let's see how it works.

How it works...

Let's start from the index structure. We have four fields—one for holding the unique identifier (the id field), one for holding the name of the company (the name field), and two fields responsible for the geographical location of the company (the geoX and geoY fields). The data is pretty simple so let's just skip discussing that.

Besides the standard q parameter responsible for the user query, you can see the sort parameter. However, this sort parameter is a bit different from the ones you are probably used to. It uses the dist function to calculate the distance from the given point, and the value returned by the function is then used to sort the documents in the results list. The first argument of the dist function (the value 2) tells Solr to use the Euclidean distance to calculate the distance. The next two arguments tell Solr which fields in the index hold the geographical position of the company. The last two arguments specify the point from which the distance should be calculated. Of course, as with every sort, we specify the order; in our case, we want to sort from the nearest to the farthest company (the asc value).
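To see why the documents were ordered this way, we can calculate the value of the function for our example data ourselves. The Euclidean distance from the point (13, 13) for each company is as follows:

Company three: sqrt((12.2 - 13)^2 + (12.2 - 13)^2) = sqrt(1.28) ≈ 1.13
Company two: sqrt((11.1 - 13)^2 + (11.1 - 13)^2) = sqrt(7.22) ≈ 2.69
Company one: sqrt((10.1 - 13)^2 + (10.1 - 13)^2) = sqrt(16.82) ≈ 4.10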

As you can see in the results, the documents were sorted as they should be.

Searching words by how they sound

One day your boss comes to your office and says "Hey, I want our search engine to be able to find the same documents when I enter phone or fone into the search box." You try to reply, but your boss is already on the other side of the door. So, you wonder if this kind of functionality is available in Solr. I think you already know the answer—yes, it is, and this recipe will show you how to configure it and use with Solr.

How to do it...

Let's assume that we have the following index structure (just add this to the fields section of your schema.xml file):

<field name="id" type="string" indexed="true" stored="true"
required="true" />
<field name="name" type="phonetic" indexed="true" stored="true" />

The phonetic type definition looks like this (this should already be included in the schema.xml file of the example Solr instance):

<fieldtype name="phonetic" stored="false" indexed="true" class=
"solr.TextField" >
<analyzer>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.DoubleMetaphoneFilterFactory" inject="false"/>
</analyzer>
</fieldtype>

And the data looks like this:

<add>
<doc>
<field name="id">1</field>
<field name="name">Phone</field>
</doc>
<doc>
<field name="id">2</field>
<field name="name">Fone</field>
</doc>
</add>

Now let's assume that our user wants to find documents containing a word that sounds like fon. So, we send the following query to Solr:

http://localhost:8983/solr/select?q=name:fon

The result list returned by the query is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1</int>
<lst name="params">
<str name="q">name:fon</str>
</lst>
</lst>
<result name="response" numFound="2" start="0">
<doc>
<str name="id">1</str>
<str name="name">Phone</str>
</doc>
<doc>
<str name="id">2</str>
<str name="name">Fone</str>
</doc>
</result>
</response>

So, the filter worked—we got two documents in the results list. Now let's see how it works.

How it works...

Let's start with the index structure. As you can see, we have two fields—one responsible for holding the unique identifier (the id field) of the product and the other responsible for holding the name of the product (the name field).

The name field is the one that will be used for the phonetic search. For that, we defined a new field type named phonetic. Besides the standard parts (such as the class attribute), we defined a new filter, DoubleMetaphoneFilterFactory. It is responsible for analyzing how the words sound. This filter uses an algorithm named Double Metaphone to analyze the phonetics of the words. The additional attribute inject="false" tells Solr to replace the existing tokens instead of inserting additional ones, which means that the original tokens will be replaced by the ones the filter produces.

As you can see from the query and the data, the word fon was matched to the word phone and also to the word fone, which means that the algorithm (and thus the filter) is working quite well. However, take into consideration that this is only an algorithm, so some words that you think should match will not.
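As an illustration, you can use the analysis page of the administration interface to see what the phonetic field type produces for these words. Assuming the Double Metaphone algorithm behaves as described, all three words should be reduced to the same phonetic code:

phone -> FN
fone -> FN
fon -> FN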

Ignoring defined words

Imagine a situation where you would like to filter out words that are considered vulgar from the data you are indexing. Of course, by accident, such words can be found in your data, and you don't want them to be searchable, so you want to ignore them. Can we do that with Solr? Of course we can, and this recipe will show you how.

How to do it...

Let's start with the index structure (just add this to the fields section of your schema.xml file):

<field name="id" type="string" indexed="true" stored="true"
required="true" />
<field name="name" type="text_ignored" indexed="true" stored=
"true" />

The text_ignored type definition looks like this:

<fieldType name="text_ignored" class="solr.TextField"
positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true"
words="ignored.txt" enablePositionIncrements="true" />
</analyzer>
</fieldType>

The ignored.txt file looks like this:

vulgar
vulgar2
vulgar3

And the data looks like this:

<add>
<doc>
<field name="id">1</field>
<field name="name">Company name</field>
</doc>
</add>

Now let's assume that our user wants to find documents that have the words Company and vulgar. So, we send the following query to Solr:

http://localhost:8983/solr/select?q=name:(Company+AND+vulgar)

In the standard situation, there shouldn't be any results because we don't have a document that matches the two given words. However, let's take a look at what Solr returned to us as the preceding query result:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1</int>
<lst name="params">
<str name="q">name:(Company AND vulgar)</str>
</lst>
</lst>
<result name="response" numFound="1" start="0">
<doc>
<str name="id">1</str>
<str name="name">Company name</str>
</doc>
</result>
</response>

Hmm… it works. To be perfectly sure, let's take a look at the analysis page of the administration interface. There you can see that the word vulgar was cut and thus ignored.

How it works...

Let's start with the index structure. As you can see, we have two fields. One responsible for holding the unique identifier (the id field) of the product and the other responsible for holding the name of the product (the name field).

The name field is the one on which we will use the ignore functionality of Solr, StopFilterFactory. As you can see, the text_ignored type definition is analyzed the same way at both query and index time. The crucial thing is the filter, StopFilterFactory. The words attribute of the filter definition specifies the name of a UTF-8 encoded file that contains the words to be ignored (one word per line). The file should be placed in the same directory as the schema.xml file. The ignoreCase attribute set to true tells the filter to ignore the case of the tokens and of the words defined in the file. The last attribute, enablePositionIncrements="true", tells Solr to preserve position increments in the token stream, so the token following a discarded word has its position incremented as if the removed word were still there.

As you can see in the query, our hypothetical user queried Solr for two words with the logical operator AND, which means that both words must be present in the document. However, the filter we added cut the word vulgar, and thus the document matched even though it contains only one of the words, as the sketch below illustrates. The same thing happens when you are indexing your data: the words defined in the ignored.txt file will not be indexed.
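To illustrate this, after StopFilterFactory processes our query, it is effectively reduced to the following (a sketch of the analyzed form, not the exact internal representation):

name:Company

And such a query, of course, matches our example document.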

If you look at the analysis page of the Solr administration interface, you can see that the word vulgar was cut during the processing of the token stream by StopFilterFactory.

Summary

In this article we took a look at some additional Solr functionalities such as highlighting, sorting results, ignoring words, and so on.

In the next article we will take a look at some more Solr functionalities.



About the Author


Rafał Kuć

Rafał Kuć is a born team leader and a Software Developer. Working as a Consultant and a Software Engineer at Sematext Group, Inc., he concentrates on open source technologies such as Apache Lucene, Solr, ElasticSearch, and Hadoop stack. He has more than 11 years of experience in various software branches—from banking software to e-commerce products. He is mainly focused on Java, but open to every tool and programming language that will make the achievement of his goal easier and faster. He is also one of the founders of the solr.pl site, where he tries to share his knowledge and help people to resolve their problems with Solr and Lucene. He is also a speaker for various conferences around the world such as Lucene Eurocon, Berlin Buzzwords, ApacheCon, and Lucene Revolution.

Rafał began his journey with Lucene in 2002 and it wasn't love at first sight. When he came back to Lucene in late 2003, he revised his thoughts about the framework and saw the potential in search technologies. Then Solr came and this was it. He started working with ElasticSearch in the middle of 2010. Currently, Lucene, Solr, ElasticSearch, and information retrieval are his main points of interest.

Rafał is also the author of the Solr 3.1 Cookbook and its update, the Solr 4.0 Cookbook, and a co-author of ElasticSearch Server, all published by Packt Publishing.

