Solr 1.4 Enterprise Search Server

More Information
Learn
  • Blend structured data with real search features
  • Import data from CSV, XML, common document formats, and databases
  • Deploy Solr and learn its query syntax, from the basics to range queries
  • Enhance search results with spell-checking, query auto-completion, result highlighting, and more
  • Secure Solr
  • Integrate a host of technologies with Solr, from the server side to client-side JavaScript to frameworks like Drupal
  • Scale Solr using replication, distributed searches, and tuning
About

If you are a developer building a high-traffic web site, you need a terrific search engine. Sites like Netflix.com and Zappos.com employ Solr, an open source enterprise search server that uses and extends the Lucene search library. This is the first book on the market about Solr, and it will show you how to optimize your web site for high-volume traffic with full-text search capabilities and loads of customization options, so that your users get a terrific search experience.

This book is a comprehensive reference guide for every feature Solr has to offer. It serves the reader from initiation through development to deployment. It also comes with complete running examples that demonstrate Solr's use and show how to integrate it with other languages and frameworks.

This book first gives you a quick overview of Solr, and then gradually takes you from basic to advanced features that enhance your search. It starts off by discussing Solr and helping you understand how it fits into your architecture, where databases and document/web crawlers fall short and Solr shines. The main part of the book is a thorough exploration of nearly every feature that Solr offers. To keep this interesting and realistic, we use a large open source set of metadata about artists, releases, and tracks, courtesy of the MusicBrainz.org project. Using this data as a testing ground for Solr, you will learn how to import it in various ways, from CSV to XML to database access. You will then learn how to search this data in a myriad of ways: using Solr's rich query syntax, "boosting" match scores based on record data and other means, searching across multiple fields with different boosts, getting facets on the results, auto-completing user queries, spell-correcting searches, highlighting queried text in search results, and so on.

After this thorough tour, we'll demonstrate working examples of integrating a variety of technologies with Solr such as Java, JavaScript, Drupal, Ruby, XSLT, PHP, and Python.

Finally, we'll cover various deployment considerations, including indexing strategies and performance-oriented configuration, that will enable you to scale Solr to meet the needs of a high-volume site.

Features
  • Deploy, embed, and integrate Solr with a host of programming languages
  • Implement faceting in e-commerce and other sites to summarize and navigate the results of a text search
  • Enhance your search by highlighting search results, offering spell-corrections and auto-suggest, finding “similar” records, boosting records and fields for scoring, and using phonetic matching
  • Informative and practical approach to development with fully working examples of integrating a variety of technologies
  • Written and tested for Solr 1.4 pre-release 2009.08
Page Count 336
Course Length 10 hours 4 minutes
ISBN 9781847195883
Date Of Publication 18 Aug 2009

Authors

David Smiley

Born to code, David Smiley is a software engineer who's passionate about search, Lucene, spatial, and open source. He has a great deal of expertise with Lucene and Solr, which started in 2008 at MITRE. In 2009, as the lead author, along with the coauthor Eric Pugh, he wrote Solr 1.4 Enterprise Search Server, the first book on Solr, published by Packt Publishing. It was updated in 2011 as Apache Solr 3 Enterprise Search Server, Packt Publishing, and again for this third edition.

After the first book, he developed 1- and 2-day Solr training courses, which he delivered half a dozen times within MITRE, and he has also delivered training on LucidWorks once. Most of his excitement and energy relating to Lucene is centered on Lucene's spatial module, including Spatial4j, which he is largely responsible for. He has presented his progress on this at Lucene Revolution and other conferences several times. He currently holds the status of committer and Project Management Committee (PMC) member with the Lucene/Solr open source project. Over the years, David has staked his career on search, working exclusively on such projects, formerly for MITRE and now as an independent consultant for various clients. You can reach him at dsmiley@apache.org and view his LinkedIn profile at http://www.linkedin.com/in/davidwsmiley.

Eric Pugh

Fascinated by the "craft" of software development, Eric Pugh has been involved in the open source world as a developer, committer, and user for the past decade. He is an emeritus member of the Apache Software Foundation.

In biotech, financial services, and defense IT, he has helped European and American companies develop coherent strategies to embrace open source software. As a speaker, he has advocated the advantages of Agile practices in search, discovery, and analytics projects.

Eric became involved in Solr when he submitted the patch SOLR-284 to parse rich document types, such as PDF and MS Office formats, which became the single most popular patch, as measured by votes. The patch was subsequently cleaned up and enhanced by three other individuals, demonstrating the power of the free/open source model to build great code collaboratively. SOLR-284 was eventually refactored into Solr Cell.

He blogs at http://www.opensourceconnections.com/blog/.