Packt+ | Advance your knowledge in tech

You're reading from Solr Cookbook - Third Edition

Product typeBook

Published inJan 2015

Reading LevelIntermediate

Publisher

ISBN-139781783553150

Edition1st Edition

Languages

Java

Tools

Solr

Concepts

Enterprise Search

Author (1)

Rafal Kuc

Using core discovery

Until Solr 4.4, solr.xml needed to include mandatory information, such as the cores definition. This was needed because Solr used this information to get and load the defined cores and their properties, basically information that was required for Solr to operate properly. Starting from Solr 4.4, a new structure of the solr.xml file was introduced, and in addition to this, a process called core discovery was implemented. Due to these changes, we are not forced to describe the core in the solr.xml file, but instead, we can use simple text files, and Solr will automatically load the appropriate cores. This recipe will show you how to use the core discovery process.

How to do it...

Using the new core discovery process is very simple.

We start with creating the solr.xml file, which should be put in the home directory of Solr. The contents of the file should look like the following:

<?xml version="1.0" encoding="UTF-8" ?>
<solr>
 <solrcloud>
  <str name="host">${host:}</str>
  <int name="hostPort">${jetty.port:8983}</int>
  <str name="hostContext">${hostContext:solr}</str>
  <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
  <bool name="genericCoreNodeNames">
             ${genericCoreNodeNames:true}</bool>
 </solrcloud>
 <shardHandlerFactory name="shardHandlerFactory"
             class="HttpShardHandlerFactory">
  <int name="socketTimeout">${socketTimeout:0}</int>
  <int name="connTimeout">${connTimeout:0}</int>
 </shardHandlerFactory>
</solr>

After this, we are ready to use the core discovery. For each core, apart from the standard configuration stored in the conf directory, we need to create the core.properties file, which should be placed in the same directory as the conf directory. For example, if we have a core named sample_core, our very simple core.properties file will look like this:
```
name=sample_core
```

That's all; during startup, Solr will load our core.

How it works...

The solr.xml file is the same one that is provided with the Solr example deployment, and it contains the default values related to Solr configuration. The host property specifies the hostname, and the hostPort property specifies the port on which Solr will run (it will be taken from the jetty.port property, and is by default 8983). The hostContext property specifies the web application context under which Solr will be available (by default, it is solr). In addition to this, we can specify the ZooKeeper client session timeout by using the zkClientTimeout property (used only in the SolrCloud mode, defaulting to 30,000 milliseconds). By default, we also say that we want Solr to use generic core names for SolrCloud, and we can change this by specifying false in the genericCoreNodeNames property.

There are two additional properties that relate to shard handling. The socketTimeout property specifies the timeout of socket connection, and the connTimeout property specifies the timeout of connection. Both the properties are used to create clients used by Solr to communicate between shards. The connection timeout specifies the timeout when Solr connects to another shard, and it takes a long time; the socket timeout is about the time to wait for the response to be back.

The simplest core.properties file is an empty file, in which case, Solr will try to choose the core name for us. However, in our case, we wanted to give the core a name we've chosen, and because of this, we included a single name entry that defines the name Solr will assign to the core. You should remember that Solr will try to load all the cores that have the core.properties file present, and the core name doesn't have to live in the directory of the same name.

Of course, the name property is not the only property available for usage. There are other properties, but in most cases, you'll use the name property only:

name: This is the name of the core.
config: This is the configuration filename, which defaults to solrconfig.xml.
dataDir: This is the directory where data is stored. By default, Solr will use a directory called data that is created on the same level as the conf directory.
ulogDir: This is the directory where the transaction log entries are stored. For performance reasons, it might be good to store transaction logfiles on a disks other than the index files.
schema: This is the name of the file describing the index structure, which defaults to schema.xml.
shard: This is the identifier of the shard.
collection: This is the name of the collection the core belongs to.
roles: This is the core role definition.
loadOnStartup: This can take a value of true or false. It defaults to true, which means Solr will load the core during startup.
transient: This can take a value of true or false. It defaults to false, which means that the core can't be automatically unloaded by Solr.
coreNodeName: This is the name of the core used by SolrCloud.

Finally, it is worth saying that the old solr.xml format will not be supported in Solr 5.0, so it is good to get familiar with the new format now.

There's more...

If you want to see all the properties and sections exposed by the new solr.xml format, refer to the official Apache Solr documentation located at https://cwiki.apache.org/confluence/display/solr/Format+of+solr.xml.

You have been reading a chapter from

Solr Cookbook - Third Edition

Published in: Jan 2015Publisher: ISBN-13: 9781783553150

A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.

undefined

Unlock this book and the full library FREE for 7 days

Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of

Start free trial

Renews at $15.99/month. Cancel anytime

Author (1)

Rafal Kuc

Rafał Kuć is a software engineer, trainer, speaker and consultant. He is working as a consultant and software engineer at Sematext Group Inc. where he concentrates on open source technologies such as Apache Lucene, Solr, and Elasticsearch. He has more than 14 years of experience in various software domains—from banking software to e–commerce products. He is mainly focused on Java; however, he is open to every tool and programming language that might help him to achieve his goals easily and quickly. Rafał is also one of the founders of the solr.pl site, where he tries to share his knowledge and help people solve their Solr and Lucene problems. He is also a speaker at various conferences around the world such as Lucene Eurocon, Berlin Buzzwords, ApacheCon, Lucene/Solr Revolution, Velocity, and DevOps Days. Rafał began his journey with Lucene in 2002; however, it wasn't love at first sight. When he came back to Lucene in late 2003, he revised his thoughts about the framework and saw the potential in search technologies. Then Solr came and that was it. He started working with Elasticsearch in the middle of 2010. At present, Lucene, Solr, Elasticsearch, and information retrieval are his main areas of interest. Rafał is also the author of the Solr Cookbook series, ElasticSearch Server and its second edition, and the first and second editions of Mastering ElasticSearch, all published by Packt Publishing.
Read more about Rafal Kuc

Personalised recommendations for you

Based on your interests and search pattern

Et al.

Ever wonder why speech recognition systems don't understand the Scottish accent, or what would happen if an astronaut only ate mac 'n' cheese, or other spurious reflections you'd have at a bar? We did, then collated those deliberations into absurd research articles with fake figures and methodologies inspired by even more fictionally absurd studies.

BookAug 2023230 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages4

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages1

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Mastering Tableau 2023

This book is a comprehensive resource to mastering your Tableau skills and becoming a BI expert. As you progress, you will learn how to build advanced dashboards and improve your storytelling to derive key business insight, as well as make you well-versed with advanced functionalities of Tableau in the business intelligence domain.

BookAug 2023684 pages

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages5

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages2

Data Engineering with AWS

Embark on a journey to master data engineering pipelines on AWS! Our book offers a hands-on experience of AWS services for ingesting, transforming, and consuming data. Whether you're an absolute beginner or someone with basic data engineering experience, this guide is an indispensable resource.

BookOct 2023636 pages5

Modern Data Architecture on AWS

Every organization wants an agile, performant, and cost-effective data platform that meets all their current and future business needs. Purpose-built AWS analytics services and their features play a big part in building such a modern data platform. This book brings to you all the design and architectural patterns that’ll help you achieve this goal.

BookAug 2023420 pages5

Practical Guide to Applied Conformal Prediction in Python

Discover the power of Conformal Prediction with the "Practical Guide to Applied Conformal Prediction in Python." Master the latest techniques to quantify uncertainty in machine learning and computer vision models, and seamlessly apply them to your industry applications.

BookDec 2023240 pages

TinyML Cookbook

With over 70 project-based recipes, the TinyML Cookbook is a practical guide that will help you to get the most out of your microcontrollers. It provides a comprehensive understanding of the theoretical foundations while giving you hands-on experience training ML models for deployment on Arduino Nano 33 BLE Sense, Raspberry Pi Pico, and SparkFun RedBoard Artemis Nano microcontrollers.

BookNov 2023664 pages