Learning Search-driven Application Development with SharePoint 2013

The search engine in SharePoint 2013 is a refreshed version and this book will show you how to make the most of it with a range of methodologies for developing search-driven applications. JavaScript experience required.

Product type Paperback
Published in Jul 2013
Publisher Packt
ISBN-13 9781782171003
Length 106 pages
Edition 1st Edition
Author Johnny Tordgeman

The search architecture


SharePoint 2013 Search introduces a new search architecture that includes significant changes and new additions compared to previous versions. Since Microsoft consolidated FAST and SharePoint Search, the new search architecture has inherited components from both products while maintaining high scalability and performance.

Let's have a look at the new search architecture and discuss its components; refer to the following diagram:

As we can see from the diagram, the search architecture can be divided into four component groups, as follows:

  • Content components

  • Query components

  • The index component

  • The analytics-processing component

Content components

The content components are in charge of getting content ready for indexing. Each component has a well-defined role, which we will discuss next.

Crawl component

The crawl component is responsible for crawling content sources. It is the first stop for data that is about to be indexed by the search engine. The crawl component invokes connectors (both out-of-the-box and custom ones) that interact with the content source in order to crawl it.

While crawling, the crawl component uses one or more crawl databases to temporarily store detailed tracking and historical information about each crawled item, such as the last time the item was crawled and the type of update during the last crawl.

Once an item is crawled, meaning both its data and its associated metadata are crawled, the crawl component delivers it to the content-processing component.
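To make the tracking information concrete, here is a minimal sketch of what a crawl-database record could look like. The names `CrawlRecord`, `UpdateType`, and `record_crawl`, as well as the idea of comparing content hashes to decide the update type, are assumptions made for illustration; they do not describe SharePoint's actual crawl database schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class UpdateType(Enum):
    """Hypothetical update types recorded for the last crawl of an item."""
    ADDED = "added"
    MODIFIED = "modified"
    UNCHANGED = "unchanged"


@dataclass
class CrawlRecord:
    """One tracking row per crawled item (illustrative, not the real schema)."""
    url: str
    last_crawled: datetime
    last_update: UpdateType
    content_hash: str  # assumption: change detection by comparing hashes


def record_crawl(store: dict, url: str, content_hash: str, now: datetime) -> CrawlRecord:
    """Update the tracking store after an item has been crawled."""
    previous = store.get(url)
    if previous is None:
        update = UpdateType.ADDED
    elif previous.content_hash != content_hash:
        update = UpdateType.MODIFIED
    else:
        update = UpdateType.UNCHANGED
    store[url] = CrawlRecord(url, now, update, content_hash)
    return store[url]


if __name__ == "__main__":
    store = {}
    now = datetime.now(timezone.utc)
    print(record_crawl(store, "http://intranet/docs/plan.docx", "abc123", now))
```

A record like this is what would let an incremental crawl skip items whose content has not changed since the last pass.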

Content-processing component

The content-processing component's job is to analyze content it receives from the crawl component and feed it to the index component for indexing.

Content analysis is done by following a flow known as the Content Processing Flow, which is depicted in the following diagram:

The rectangular blocks in the diagram represent stages that we cannot interact with. We won't be discussing them as they are quite self-explanatory. The rounded rectangular blocks, however, represent stages that we can interact with during the processing flow.

The Web service callout stage is similar to the pipeline extensibility stage of FAST for SharePoint 2010, and allows you to add a callout from the content-processing component to a web service of your own so you can manipulate the crawled content before it gets indexed by the index component.

Unlike FAST's pipeline-extensibility stage, where code had to be executed in a sandbox, the web service callout simply accepts a web service endpoint. This is much easier to work with and removes the overhead of writing a console application to accompany the content-flow process.

Calling a web service during the processing stage can be useful in two scenarios:

  • Creating new refiners by extracting data from unstructured text using our own logic

  • Calculating new refiners based on the data of managed properties

You can find a great example of using the web service callout in Kathrine Hammervold's post, Customize the SharePoint 2013 search experience with a Content Enrichment web service, located at http://blogs.msdn.com/b/sharepointdev/archive/2012/11/13/customize-the-sharepoint-2013-search-experience-with-a-content-enrichment-web-service.aspx.
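The two scenarios can be illustrated with a small sketch of the enrichment logic alone, leaving out the web service plumbing. Everything here is hypothetical: the `enrich` function, the property names `Body`, `Size`, `ProjectCode`, and `SizeBucket`, and the `PRJ-1234` code format are invented for the example and are not part of any SharePoint contract.

```python
import re

# Hypothetical pattern for a project code buried in free text, e.g. "PRJ-0042".
PROJECT_CODE = re.compile(r"\bPRJ-\d{4}\b")


def enrich(item: dict) -> dict:
    """Return a copy of the crawled item's properties with two new refiner values.

    Scenario 1: extract a value from unstructured text (ProjectCode).
    Scenario 2: calculate a value from an existing property (SizeBucket).
    """
    enriched = dict(item)

    match = PROJECT_CODE.search(item.get("Body", ""))
    if match:
        enriched["ProjectCode"] = match.group(0)

    size = item.get("Size")
    if size is not None:
        # Assumed threshold of 1,000,000 bytes, chosen only for illustration.
        enriched["SizeBucket"] = "small" if size < 1_000_000 else "large"

    return enriched


if __name__ == "__main__":
    print(enrich({"Body": "Status report for PRJ-0042.", "Size": 2048}))
```

In a real callout, logic of this kind would sit behind the web service endpoint, and the new values would be mapped to managed properties so they can be used as refiners.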

The next point of interaction is the word-breaking stage, which allows you to write your own custom word-breaking logic for the content processor. Please refer to the MSDN documentation on custom word breakers, located at http://msdn.microsoft.com/en-us/library/jj163981.aspx.

Query components

The query components are in charge of analyzing the search query and processing the results.

Web frontend

The web frontend is where the search process actually begins. A user can interact with the search service by either writing a search query in the search center (or a search box) or developing against the new public APIs: REST/OData services and the CSOM. Both the search center and public APIs are hosted on the frontend.

Once the user creates a query, the query is sent to the query-processing component for analysis. The query-processing component analyzes the query and forwards it to the index component. The index component returns the matching results to the query-processing component for further analysis, and from there the results are forwarded to the web frontend to be displayed.
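As a rough illustration of the REST entry point mentioned above, the following sketch builds a query URL against the `_api/search/query` endpoint with its `querytext` and `rowlimit` parameters. The helper name `build_search_url` and the site URL are hypothetical, and doubling single quotes inside the query text follows the general OData string convention, which is assumed here rather than taken from the book.

```python
from urllib.parse import quote


def build_search_url(site_url: str, query_text: str, row_limit: int = 10) -> str:
    """Build a SharePoint 2013 Search REST URL for a keyword query."""
    # Assumption: a literal single quote is escaped by doubling it (OData style).
    escaped = query_text.replace("'", "''")
    encoded = quote(escaped, safe="")
    base = site_url.rstrip("/")
    return f"{base}/_api/search/query?querytext='{encoded}'&rowlimit={row_limit}"


if __name__ == "__main__":
    # Hypothetical site URL used only for demonstration.
    print(build_search_url("https://contoso.example/sites/search", "sharepoint 2013", 5))
```

Issuing a GET request to the resulting URL with valid credentials hands the query to the frontend, which then forwards it along the flow described above.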

Query-processing component

As mentioned previously, the query-processing component's job is to analyze and process both search queries and results.

When the query-processing component receives a search query from the frontend, it analyzes it in an attempt to optimize its precision and relevance. A site administrator can interact with a query using different techniques such as query rules or result sources. We will discuss these techniques in detail in the next chapter, but for now it is important to understand that these manipulations are handled within the query-processing component. As part of its query handling, the query-processing component performs linguistic processing on the query, such as word-breaking and stemming.

Once the query is optimized, it is sent to the index component, which processes the optimized query and returns a result set to the query-processing component, and from there to the web frontend.
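To give a feel for what word-breaking and stemming do to a query, here is a deliberately naive sketch. The functions `break_words`, `stem`, and `normalize_query` and the tiny suffix list are assumptions made for illustration; SharePoint's linguistic processing is language-aware and far more sophisticated than this.

```python
import re

# Toy suffix list, checked in order; real stemmers use language-specific rules.
SUFFIXES = ("ing", "es", "s")


def break_words(query: str) -> list[str]:
    """Split a query into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", query.lower())


def stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize_query(query: str) -> list[str]:
    """Word-break the query, then stem each token."""
    return [stem(word) for word in break_words(query)]


if __name__ == "__main__":
    print(normalize_query("Indexing SharePoint documents"))
```

Normalizing terms this way is what allows a query for "documents" to match content that only contains "document".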

The index component

The index component is the heart of the search service, and without proper planning it can easily become the bottleneck of the service as well.

The index component has the following two roles:

  • Input: The index component is in charge of writing the optimized content it gets from the content-processing component to the index file

  • Output: The index component is in charge of returning results from the index file to the query-processing component, by request
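The two roles can be pictured with a toy inverted index that has one method for each direction. The class `TinyIndex` and its in-memory dictionary are assumptions for illustration only; they say nothing about how SharePoint actually stores or partitions its index.

```python
from collections import defaultdict


class TinyIndex:
    """A toy in-memory inverted index: term -> set of document IDs."""

    def __init__(self) -> None:
        self._postings = defaultdict(set)

    def write(self, doc_id: str, terms: list) -> None:
        """Input role: add processed content to the index."""
        for term in terms:
            self._postings[term].add(doc_id)

    def query(self, terms: list) -> set:
        """Output role: return documents that contain every query term."""
        sets = [self._postings.get(term, set()) for term in terms]
        return set.intersection(*sets) if sets else set()


if __name__ == "__main__":
    index = TinyIndex()
    index.write("doc1", ["search", "architecture"])
    index.write("doc2", ["search", "crawl"])
    print(index.query(["search", "crawl"]))
```

Because every write and every query passes through the same structure, it is easy to see why this component can become a bottleneck under load.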

How the index component saves and manages this index file is out of the scope of this book, but you can read more about this in the TechNet article Manage the index component in SharePoint Server 2013, located at http://technet.microsoft.com/en-us/library/jj862355.aspx.

Analytics-processing component

The analytics-processing component is a new addition to SharePoint Search. Its role is to analyze both content and user interactions with the content in order to improve the search relevance for the user.

The analytics architecture consists of three main parts, as follows:

  • The analytics-processing component, which runs the analytics jobs.

  • The analytics-reporting database, which stores statistical information such as usage data.

  • The link database, which stores information about searches and crawled documents. In addition, the link database is shared with the content-processing component, which stores links and anchors in it. The information that the content-processing component stores is later used by the analytics-processing component.

The analytics-processing component runs two types of analytics: search analytics and usage analytics. Search analytics analyzes content from the content-processing component for information such as links, information related to people, and recommendations. Usage analytics analyzes user actions on an item, such as the number of views it has had or how many users clicked on it.

An important output of usage analytics is the set of recommendations. The recommendations analysis creates recommendations for an item based on how users have interacted with that specific item in the past. The analysis calculates an item-to-item relationship graph and updates it continuously based on search usage.
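One simple way to picture an item-to-item relationship graph is to count how often two items are used together. The sketch below does this with co-occurrence counts per user session; the functions `build_item_graph` and `recommend`, and the notion of a "session", are assumptions for illustration and not the algorithm SharePoint uses.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_item_graph(sessions: list) -> dict:
    """Count, for every pair of items, how many sessions touched both."""
    graph = defaultdict(Counter)
    for items in sessions:
        for a, b in combinations(sorted(set(items)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def recommend(graph: dict, item: str, top_n: int = 3) -> list:
    """Return the items most often used together with the given item."""
    related = graph.get(item, Counter())
    return [other for other, _ in related.most_common(top_n)]


if __name__ == "__main__":
    sessions = [["a", "b", "c"], ["a", "b"], ["b", "c"]]
    graph = build_item_graph(sessions)
    print(recommend(graph, "a"))
```

Feeding new sessions into the same counts is a simple stand-in for the continuous updating described above.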

Keep in mind that the analytics-processing component is a "learning" component, which means it learns from usage. The more the search system is used, the better its analytics become.
