We start with our index having two fields, id and _version_. The id field is used as the unique identifier; we informed Solr about this by adding the unqiueKey section in schema.xml. We will need it for functionalities such as document updates, deletes by identifiers, and so forth. The _version_ field is used by Solr internally, and is required by some Solr functionalities (such as optimistic locking); this is why we include it. The rest of the fields will be added automatically.
We also need to define the field types that we will use. Apart from the string type used by the id field, and the long type used by the _version_ field, it contains types our documents will use. We will also define these types in our custom processor chain in the solrconfig.xml file.
The next thing is very important; the managed schema factory that we defined in solrconfig.xml, which is a ManagedIndexSchemaFactory type (the class property set to this value). By adding this section, we say that we want Solr to manage our schema.xml file. This means that Solr will load the schema.xml file during startup, change its name to schema.xml.bak, and will then create a file called managed-schema (the value of the managedSchemaResourceName property). From this point, we shouldn't modify our index structure manually—we should either let Solr do it during indexation or add and alter fields using the schema API (we will talk about this in the Altering the index structure on a live collection recipe in Chapter 8, Using Additional Functionalities). Since I assume that we will use the schema API, I've set the mutable property to true. If we want to disallow using the schema API, we should set the mutable property to false.
Note
Note that you need to have a single schemaFactory defined, and it needs to be set to the ManagedIndexSchemaFactory type. If it is not set to this type, field discovery will not work and the indexation will result in an error.
We also need to include an update request processor chain. Since we want all index requests to use our custom request chain, we add the update.chain property and set it to add-unknown-fields in the defaults section of our update request handler configuration.
Finally, the second most important thing in this recipe is our update request processor chain called add-unknown-fields (the same as we used in the update processor configuration). It defines several update processors that allow us to get the functionality of fields and their types' discoveries. The solr.RemoveBlankFieldUpdateProcessorFactory processor factory removes empty fields from the documents we send to indexation. The solr.ParseBooleanFieldUpdateProcessorFactory processor factory is responsible for parsing Boolean fields; solr.ParseLongFieldUpdateProcessorFactory parses fields that have data that uses the long type; solr.ParseDoubleFieldUpdateProcessorFactory parses fields with data of double type; and solr.ParseDateFieldUpdateProcessorFactory parses the date-based fields. We specify the format we want Solr to recognize (we will discuss this in more detail in the Using parsing update processors to parse data recipe in Chapter 2, Indexing Your Data).
Finally, we include the solr.AddSchemaFieldsUpdateProcessorFactory processor factory that adds the actual fields to our managed schema. We specify the default field type to text by adding the defaultFieldType property. This type will be used when no other type will match the field. After the default field type definition, we see four lists called typeMapping. These sections define the field type mappings Solr will use. Each list contains at least one valueClass property and one fieldType property. The valueClass property defines the type of data Solr will assign to the field type defined by the fieldType property.
In our case, if Solr finds a date (<str name="valueClass">java.util.Date</str>) value in a field, it will create a new field using the tdates field type (<str name="fieldType">tdates</str>). If Solr finds a long or an integer value, it creates a new field using the tlongs field type. Of course, a field won't be created if it already exists in our managed schema. The name of the field created in our managed schema will be the same as the name of the field in the indexed document.
Finally, the solr.LogUpdateProcessorFactory processor factory tells Solr to write information about the update to log, and the solr.RunUpdateProcessorFactory processor factory tells Solr to run the update itself.
As we can see, our data includes fields that we didn't specify in the schema.xml file, and the document was indexed properly, which allows us to assume that the functionality works. If you want to check how our index structure looks like after indexation, use the schema API; you can do it yourself after reading the Retrieving information about the index structure recipe in Chapter 8, Using Additional Functionalities.
One thing to remember is that by default, Solr is able to automatically detect field types such as Boolean, integer, float, long, double, and date.