Apache Solr Enterprise Search Server - Third Edition

Learn
  • Design a schema to include text indexing details such as tokenization, stemming, and synonyms
  • Import data from databases using various formats, including CSV and XML, and extract text from different document formats
  • Search using Solr's rich query syntax, perform geospatial searches, "join" relationally, and influence relevancy order
  • Build a query auto-complete/suggester capability with knowledge of the fundamental types of suggestion and ways to implement them
  • Enhance standard searches with faceting for navigation or analytics
  • Deploy Solr to production taking into account logging, security, and monitoring
  • Integrate a host of technologies with Solr including web crawlers, Hadoop, Java, JavaScript, Ruby, PHP, Drupal, and others
  • Tune Solr and use SolrCloud for horizontal scalability
About

Apache Solr is a widely popular open source enterprise search server that delivers powerful search and faceted navigation, features that are elusive with databases. Solr supports complex search criteria, faceting, result highlighting, query completion, query spell-checking, relevancy tuning, geospatial searches, and much more.

This book is a comprehensive resource for just about everything Solr has to offer, taking you from first exposure through development and deployment. Even if you wish to use Solr 5, you should find the information just as applicable, thanks to Solr's high regard for backward compatibility. The book also includes some useful information specific to Solr 5.

Features
  • An update to the popular second edition of Apache Solr 3 Enterprise Search Server, covering Solr 4’s most important new features such as SolrCloud for scaling and real-time search
  • Contains integration examples with databases, web crawlers, Hadoop, XSLT, Java and embedded Solr, PHP and Drupal, JavaScript, and Ruby frameworks
  • Learn about deployment considerations including security, logging, monitoring, running ZooKeeper, and measuring performance
Page Count 432
Course Length 12 hours 57 minutes
ISBN 9781782161363
Date Of Publication 25 May 2015

Authors

David Smiley

Born to code, David Smiley is a software engineer who's passionate about search, Lucene, spatial, and open source. He has a great deal of expertise with Lucene and Solr, which started in 2008 at MITRE. In 2009, as the lead author, along with coauthor Eric Pugh, he wrote Solr 1.4 Enterprise Search Server, the first book on Solr, published by Packt Publishing. It was updated in 2011 as Apache Solr 3 Enterprise Search Server (Packt Publishing), and again for this third edition.

After the first book, he developed 1- and 2-day Solr training courses, which he delivered half a dozen times within MITRE, and he has also delivered training on LucidWorks once. Most of his excitement and energy relating to Lucene is centered on Lucene's spatial module, including Spatial4j, for which he is largely responsible. He has presented his progress on this at Lucene Revolution and other conferences several times. He currently holds the status of committer and Project Management Committee (PMC) member with the Lucene/Solr open source project. Over the years, David has staked his career on search, working exclusively on such projects, formerly for MITRE and now as an independent consultant for various clients. You can reach him at dsmiley@apache.org and view his LinkedIn profile here: http://www.linkedin.com/in/davidwsmiley.

Eric Pugh

Fascinated by the "craft" of software development, Eric Pugh has been involved in the open source world as a developer, committer, and user for the past decade. He is an emeritus member of the Apache Software Foundation.

In biotech, financial services, and defense IT, he has helped European and American companies develop coherent strategies to embrace open source software. As a speaker, he has advocated the advantages of Agile practices in search, discovery, and analytics projects.

Eric became involved in Solr when he submitted the patch SOLR-284 to parse rich document types, such as PDF and MS Office formats, which became the single most popular patch, as measured by votes. The patch was subsequently cleaned up and enhanced by three other individuals, demonstrating the power of the free/open source model to build great code collaboratively. SOLR-284 was eventually refactored into Solr Cell.

He blogs at http://www.opensourceconnections.com/blog/.

Matt Mitchell

Matt Mitchell studied music synthesis and performance at Boston's Berklee College of Music, but his experiences with computers and programming in his younger years inspired him to pursue a career in software engineering. A passionate technologist, he has worked in many areas of software development, is active in several open source communities, and now has over 15 years of professional experience. He had his first experiences with Lucene and Solr in 2008 at the University of Virginia Library, where he became a core contributor to an open source search platform called Blacklight. Matt is the author of many open source projects, including a Solr client library called RSolr, which has had over 1 million downloads from rubygems.org. He has been responsible for the design and implementation of search systems at several tech companies, and he is currently a senior member of the engineering team at LucidWorks, where he's working on a next-generation search, discovery, and analytics platform.

You can contact Matt on LinkedIn at https://www.linkedin.com/in/mattmitchell4.

Kranti Parisa

Kranti Parisa has more than a decade of software development expertise and a deep understanding of open source, enterprise software, and the execution required to build successful products.

He fell in love with enterprise search technologies, especially Lucene and Solr, after his initial implementations and customizations in early 2008, carried out to build a legal search engine for bankruptcy court documents, docket entries, and cases. He is an active contributor to the Apache Solr community. One of his recent contributions, SOLR-4787, made together with Joel Bernstein, includes scalable and nested join implementations.

Kranti is currently working at Apple. Prior to that, he worked as a lead engineer and search architect at Comcast Labs, building and supporting a highly scalable search and discovery engine for the X1/X2 platform—the world's first entertainment operating system.

An entrepreneur by DNA, he is the cofounder and technical advisor of multiple start-ups focusing on products and services based on cloud computing, SaaS, big data, and enterprise search. He holds a master's degree in computer integrated manufacturing from the National Institute of Technology, Warangal, India.

You can reach him on LinkedIn: http://www.linkedin.com/in/krantiparisa.