Indexing - How it works

1 Prerequisite
2 Overall Approach
3 Full Indexing Job
4 Incremental Indexing Job

Prerequisite

Fields must be defined in Hawksearch before the data sent from the Optimizely website can be saved in the Hawksearch indexes. For more details on field setup, refer to https://luminoslabs.atlassian.net/wiki/spaces/HC/pages/3718774833

Overall Approach

Passing data from an Optimizely solution (one or more websites) to one or more Hawksearch engines is designed as a two-part process. Each part runs as an Optimizely Scheduled Job, a feature provided out of the box by the Optimizely framework:

Full Indexing Job - a process which extracts all CMS (pages) and commerce (products/variants/bundles/packages) content from your Optimizely solution, transforms each content item into a document object, and loads the documents into the Hawksearch engine(s).
Incremental Indexing Job - a process which uses the Optimizely events raised when content changes (published/deleted) to update the Hawksearch engine(s) with the latest version of the data.

At the core of both indexing jobs sits a design based on a series of Handlers chained one after the other, each of them solving one particular indexing use case. This represents the first extensibility point of the indexing processes inside the connector. Additional handlers can be developed and chained as needed. For more details on how to do this, check How to Extend Handlers.
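The chaining idea can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the connector's code; IndexingContext, run_chain and the handler functions are hypothetical names, and the real handler contract is described in How to Extend Handlers.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class IndexingContext:
    """Hypothetical state passed along the chain during one job run."""
    engine: str
    completed_steps: List[str] = field(default_factory=list)


# In this sketch a handler is any callable that receives the shared context.
Handler = Callable[[IndexingContext], None]


def create_index(context: IndexingContext) -> None:
    context.completed_steps.append("Create Index")


def index_products(context: IndexingContext) -> None:
    context.completed_steps.append("Product Indexing")


def index_store_locations(context: IndexingContext) -> None:
    # An additional, project-specific handler (invented for illustration).
    context.completed_steps.append("Store Locations Indexing")


def run_chain(handlers: List[Handler], context: IndexingContext) -> IndexingContext:
    # Handlers run strictly in order, one after the other.
    for handler in handlers:
        handler(context)
    return context


default_chain: List[Handler] = [create_index, index_products]
# Extending the job amounts to chaining one more handler.
extended_chain = default_chain + [index_store_locations]
result = run_chain(extended_chain, IndexingContext(engine="en"))
```

Because each handler only depends on the shared context, a new use case can be added without touching the existing handlers.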

The more complex handlers (e.g. Product Indexing) are built out of a series of Pipes, which usually extract content from the Optimizely databases, transform it into Hawksearch document objects, and load those objects into the Hawksearch engine(s) using the https://bridgeline.atlassian.net/l/cp/sacB0SGs APIs. Additional pipes can be developed as needed. For more details on how to do this, check How to Extend Pipes.
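The extract / transform / load flow inside a handler can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions: the pipe functions, the sample product fields and the InMemoryEngine class are all hypothetical stand-ins, and no real Optimizely or Hawksearch API is called.

```python
from typing import Any, Callable, Dict, Iterable, List

# In this sketch a pipe takes a stream of items and returns a new stream.
Pipe = Callable[[Iterable[Any]], Iterable[Any]]


def extract_products(_: Iterable[Any]) -> Iterable[Dict[str, Any]]:
    # Stand-in for reading products from the Optimizely databases.
    yield {"code": "SKU-1", "display_name": "Blue Shirt"}
    yield {"code": "SKU-2", "display_name": "Red Shirt"}


def transform_to_documents(items: Iterable[Dict[str, Any]]) -> Iterable[Dict[str, Any]]:
    # Map each content item to a (hypothetical) document shape.
    for item in items:
        yield {"id": item["code"], "title": item["display_name"]}


class InMemoryEngine:
    """Stand-in for a search engine; it only collects documents."""

    def __init__(self) -> None:
        self.documents: List[Dict[str, Any]] = []


def make_load_pipe(engine: InMemoryEngine) -> Pipe:
    def load(documents: Iterable[Dict[str, Any]]) -> Iterable[Dict[str, Any]]:
        for document in documents:
            engine.documents.append(document)
            yield document
    return load


def run_pipes(pipes: List[Pipe]) -> List[Any]:
    stream: Iterable[Any] = ()
    for pipe in pipes:
        stream = pipe(stream)
    # Consuming the final stream drives every pipe in order.
    return list(stream)


engine = InMemoryEngine()
loaded = run_pipes([extract_products, transform_to_documents, make_load_pipe(engine)])
```

An extra pipe (for example, one that enriches documents with additional fields) would simply be inserted between the transform and load pipes in the list.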

Full Indexing Job

Steps (aka Handlers)

Delete Previous (not set as current) Index
- read more about this step here
Create Index
Categories Indexing
Rebuild Hierarchy
- as per Hawksearch documentation
Product Indexing
Variants Indexing
Bundles Indexing
Packages Indexing
CMS Content Indexing
- multiple pipelines can be added, depending on which Optimizely page types need to be indexed
Rebuild All
- as per Hawksearch documentation, rebuilds autocomplete, percolator, learning search, and related searches
Set new index as Current Index

Notes

the steps above are repeated for each engine (language) defined in Configuring the Connector
it is recommended to run this job on a schedule, no more than once per day
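The overall shape of the full job (clean up, build a new index, then switch it to current, once per engine) can be sketched as below. This is a simplified Python illustration, not the connector's implementation: FakeEngine, the index naming scheme and full_index are assumptions, and the category, hierarchy and rebuild steps are collapsed into a single "load documents" step.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FakeEngine:
    """Hypothetical stand-in for one engine (language)."""
    name: str
    indexes: Dict[str, List[dict]] = field(default_factory=dict)
    current: Optional[str] = None
    counter: int = 0


def full_index(engine: FakeEngine, documents: List[dict]) -> str:
    # Delete Previous (not set as current) Index
    for index_name in [n for n in engine.indexes if n != engine.current]:
        del engine.indexes[index_name]

    # Create Index (the naming scheme is invented for this sketch)
    engine.counter += 1
    new_index = f"{engine.name}-{engine.counter}"
    engine.indexes[new_index] = []

    # Categories / Products / Variants / ... / CMS Content Indexing,
    # collapsed here into one load step.
    engine.indexes[new_index].extend(documents)

    # Set new index as Current Index: searches switch only after loading.
    engine.current = new_index
    return new_index


def run_full_indexing(engines: List[FakeEngine],
                      documents_by_engine: Dict[str, List[dict]]) -> Dict[str, str]:
    # The same steps are repeated for each engine (language).
    return {e.name: full_index(e, documents_by_engine.get(e.name, []))
            for e in engines}


engines = [FakeEngine("en"), FakeEngine("fr")]
docs = {"en": [{"id": "1"}], "fr": [{"id": "1"}, {"id": "2"}]}
first_run = run_full_indexing(engines, docs)
second_run = run_full_indexing(engines, docs)
third_run = run_full_indexing(engines, docs)
```

In this sketch the previously current index survives one more run before it is deleted, so the index being served is never modified while a new one is being built.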

Incremental Indexing Job

Steps (aka Handlers)

Get Current Index
Categories Incremental Indexing
Categories Deletion
Rebuild Hierarchy
Cascading Events Handling
- read more about this step here
Products Incremental Indexing
Variants Incremental Indexing
Bundles Incremental Indexing
Packages Incremental Indexing
Products / Variants / Bundles / Packages Deletion
CMS Content Incremental Indexing
CMS Content Deletion
Rebuild All

Notes

the steps above are repeated for each engine (language) defined in Configuring the Connector
this job will not run if the Full Indexing Job is already running
it is recommended to run this job on a schedule, every 1 to 10 minutes
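The rule that the incremental job does not run while the Full Indexing Job is running can be pictured with a small guard. The Python sketch below is only one possible way to express it; JobCoordinator and its methods are hypothetical, and how the connector actually detects a running full job is not described here.

```python
import threading
from typing import Callable


class JobCoordinator:
    """Hypothetical guard shared by both scheduled jobs."""

    def __init__(self) -> None:
        self._full_job_running = threading.Event()

    def run_full(self, work: Callable[[], None]) -> None:
        self._full_job_running.set()
        try:
            work()
        finally:
            # Always clear the flag, even if the full job fails.
            self._full_job_running.clear()

    def run_incremental(self, work: Callable[[], None]) -> bool:
        # Skip this run entirely if a full job is in progress; the
        # scheduler will try again on the next tick.
        if self._full_job_running.is_set():
            return False
        work()
        return True


coordinator = JobCoordinator()
attempts = []

# Simulate the scheduler firing the incremental job during a full job.
coordinator.run_full(
    lambda: attempts.append(coordinator.run_incremental(lambda: None))
)
# Once the full job has finished, the incremental job runs normally.
attempts.append(coordinator.run_incremental(lambda: None))
```

Skipping rather than waiting keeps short, frequent runs from piling up behind a long full indexing run.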

Events

when content is published, expired, or deleted, the system temporarily stores information about that content in a custom table
a full list of events and how they are treated can be found here
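The "store now, process later" behaviour can be sketched with an in-memory SQLite table standing in for the custom table. This is an illustrative Python sketch: the content_events table, its columns and the function names are invented, and removing rows once they have been read is an assumption based on the word "temporarily".

```python
import sqlite3
from typing import List, Tuple


def create_event_table(conn: sqlite3.Connection) -> None:
    # Hypothetical schema; the connector's real table is not shown here.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS content_events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "content_id TEXT NOT NULL, "
        "event_type TEXT NOT NULL)"
    )


def record_event(conn: sqlite3.Connection, content_id: str, event_type: str) -> None:
    # Called from the publish / expire / delete event callbacks.
    conn.execute(
        "INSERT INTO content_events (content_id, event_type) VALUES (?, ?)",
        (content_id, event_type),
    )


def drain_events(conn: sqlite3.Connection) -> List[Tuple[str, str]]:
    # Called by the incremental job: read pending events in order,
    # then remove exactly the rows that were read.
    rows = conn.execute(
        "SELECT id, content_id, event_type FROM content_events ORDER BY id"
    ).fetchall()
    if rows:
        conn.execute("DELETE FROM content_events WHERE id <= ?", (rows[-1][0],))
    return [(content_id, event_type) for _, content_id, event_type in rows]


conn = sqlite3.connect(":memory:")
create_event_table(conn)
record_event(conn, "page-42", "published")
record_event(conn, "product-7", "deleted")
pending = drain_events(conn)
pending_after = drain_events(conn)
```

Decoupling the event callbacks from the indexing work keeps publishing fast in the editor and lets the incremental job batch changes on its own schedule.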