Indexing - How it works
Prerequisite
Fields must be defined in Hawksearch before data sent from the Optimizely website can be saved in the Hawksearch indexes. For more details on field setup, please refer to https://luminoslabs.atlassian.net/wiki/spaces/HC/pages/3718774833
Overall Approach
Passing data from an Optimizely solution (one or multiple websites) to one or more Hawksearch engines is designed as a two-part process. Each part uses the Scheduled Jobs support that the Optimizely framework provides out of the box:
Full Indexing Job - a process which extracts all CMS (pages) and commerce (products/variants/bundles/packages) content from your Optimizely solution, transforms each content item into a document object, and loads the documents into the Hawksearch engine(s).
Incremental Indexing Job - a process which uses the Optimizely events raised when content is changed (published/deleted) to update the Hawksearch engine(s) with the latest version of the data.
At the core of both indexing jobs sits a design based on a series of Handlers chained one after the other, each of them solving one particular indexing use case. This represents the first extensibility point of the indexing processes inside the connector. Additional handlers can be developed and chained as needed. For more details on how to do this, check How to Extend Handlers.
The more complex handlers (e.g. Product Indexing) are in turn built out of a series of Pipes, which usually extract the content from the Optimizely databases, transform it into Hawksearch document objects, and then load those objects into the Hawksearch engine(s) using the https://bridgeline.atlassian.net/l/cp/sacB0SGs APIs. Additional pipes can be developed as needed. For more details on how to do this, check How to Extend Pipes.
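The handler and pipe design described above can be pictured with a small, language-neutral sketch in Python. It is only an illustration of the pattern, not the connector's actual classes: `IndexingContext`, `run_handlers`, `pipeline_handler`, and the sample handlers are all hypothetical names, and the "load" step appends to a local list instead of calling any Hawksearch API.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List

# All names in this sketch are hypothetical; they only illustrate the pattern.


@dataclass
class IndexingContext:
    """State shared by every handler in one job run."""
    engine: str
    index_name: str = ""
    log: List[str] = field(default_factory=list)


Handler = Callable[[IndexingContext], None]
Pipe = Callable[[Iterable[dict]], Iterable[dict]]


def run_handlers(handlers: List[Handler], context: IndexingContext) -> IndexingContext:
    """Run the handlers one after the other, in the order they were chained."""
    for handler in handlers:
        handler(context)
    return context


def pipeline_handler(name: str, source, pipes: List[Pipe], load) -> Handler:
    """Build a handler out of pipes: extract, transform, then load."""
    def handler(context: IndexingContext) -> None:
        items = source()                      # extract
        for pipe in pipes:                    # transform, pipe by pipe
            items = pipe(items)
        count = load(context, list(items))    # load
        context.log.append(f"{name}: {count} documents")
    return handler


# A simple handler that solves exactly one use case.
def create_index(context: IndexingContext) -> None:
    context.index_name = f"{context.engine}-index"
    context.log.append("create index")


def extract_products():
    return [{"code": "P1", "name": "Chair"}, {"code": "P2", "name": "Table"}]


def to_documents(items):
    for item in items:
        yield {"id": item["code"], "title": item["name"]}


loaded = []


def load_documents(context, documents):
    loaded.extend(documents)  # stand-in for a call to the search engine
    return len(documents)


context = run_handlers(
    [
        create_index,
        pipeline_handler("products", extract_products, [to_documents], load_documents),
    ],
    IndexingContext(engine="en"),
)
```

In this sketch, adding a new indexing use case means appending another handler to the list, and changing how one content type is processed means adding or replacing a pipe, which mirrors the two extensibility points named above.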
Full Indexing Job
Steps (aka Handlers)
Delete Previous (not set as current) Index
read more about this step here
Create Index
Categories Indexing
Rebuild Hierarchy
as per Hawksearch documentation
Product Indexing
Variants Indexing
Bundles Indexing
Packages Indexing
CMS Content Indexing
multiple pipelines can be added depending on the Optimizely page type required for indexing
Rebuild All
as per Hawksearch documentation, this rebuilds autocomplete, percolator, learning search and related searches
Set new index as Current Index
Notes
the steps above are repeated for each engine (language) defined in Configuring the Connector
it is recommended to run this job on a schedule, no more frequently than once per day
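The ordering of the full indexing steps can be sketched as follows. This is a minimal illustration under stated assumptions: `FakeEngine` is a hypothetical in-memory stand-in, not the Hawksearch API, and the content and rebuild handlers are reduced to a single load call and a comment. The point it shows is that the index marked as current is left untouched while the new index is being filled, and is only replaced in the last step.

```python
class FakeEngine:
    """Hypothetical in-memory stand-in for one search engine (one language)."""

    def __init__(self, name):
        self.name = name
        self.indexes = {}    # index name -> list of documents
        self.current = None  # name of the index currently marked as current
        self._counter = 0

    def create_index(self):
        self._counter += 1
        index_name = f"{self.name}-{self._counter}"
        self.indexes[index_name] = []
        return index_name

    def delete_non_current_indexes(self):
        for index_name in list(self.indexes):
            if index_name != self.current:
                del self.indexes[index_name]


def full_indexing(engines, documents_for):
    """Repeat the same sequence of steps for every configured engine."""
    for engine in engines:
        engine.delete_non_current_indexes()      # Delete Previous Index
        new_index = engine.create_index()        # Create Index
        # Categories, products, variants, bundles, packages and CMS content
        # would each be indexed by their own handler; one load stands in here.
        engine.indexes[new_index].extend(documents_for(engine.name))
        # Rebuild Hierarchy / Rebuild All would run at this point.
        engine.current = new_index               # Set new index as Current Index


english = FakeEngine("en")
sample_documents = lambda engine_name: [{"id": "P1", "language": engine_name}]

full_indexing([english], sample_documents)
full_indexing([english], sample_documents)
after_second_run = sorted(english.indexes)

full_indexing([english], sample_documents)
after_third_run = sorted(english.indexes)
```

In this sketch the index from the previous run survives until the next run starts, so at most two indexes exist per engine at any time.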
Incremental Indexing Job
Steps (aka Handlers)
Get Current Index
Categories Incremental Indexing
Categories Deletion
Rebuild Hierarchy
Cascading Events Handling
read more about this step here
Products Incremental Indexing
Variants Incremental Indexing
Bundles Incremental Indexing
Packages Incremental Indexing
Products / Variants / Bundles / Packages Deletion
CMS Content Incremental Indexing
CMS Content Deletion
Rebuild All
Notes
the steps above are repeated for each engine (language) defined in Configuring the Connector
this job will not run if the Full Indexing Job is already running
it is recommended to run this job on a schedule, every 1 to 10 minutes
Events
on publishing/expiring/deleting content, the system temporarily stores information about the content in a custom table
a full list of events and how they are treated can be found here
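The interplay between the stored events and the Incremental Indexing Job can be sketched as below. Everything here is an assumption made for illustration: the table name `indexing_events`, its columns, the event type strings, and the choice to treat expired content the same way as deleted content are hypothetical, and an in-memory SQLite database and a plain dictionary stand in for the custom table and the current index.

```python
import sqlite3


def create_queue(connection):
    # Hypothetical schema for the temporary event table.
    connection.execute(
        "CREATE TABLE IF NOT EXISTS indexing_events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "content_id TEXT NOT NULL, "
        "event_type TEXT NOT NULL)"
    )


def record_event(connection, content_id, event_type):
    """Called when content is published, expired or deleted."""
    connection.execute(
        "INSERT INTO indexing_events (content_id, event_type) VALUES (?, ?)",
        (content_id, event_type),
    )


def run_incremental(connection, index, full_job_running, load_document):
    """Apply the stored events to the current index, oldest first."""
    if full_job_running:
        return 0  # events stay in the table until a later run
    rows = connection.execute(
        "SELECT id, content_id, event_type FROM indexing_events ORDER BY id"
    ).fetchall()
    for row_id, content_id, event_type in rows:
        if event_type == "published":
            index[content_id] = load_document(content_id)
        else:
            # Assumption: "deleted" and "expired" both remove the document.
            index.pop(content_id, None)
        connection.execute("DELETE FROM indexing_events WHERE id = ?", (row_id,))
    return len(rows)


connection = sqlite3.connect(":memory:")
create_queue(connection)
index = {"page-2": {"title": "Old page"}}

record_event(connection, "page-1", "published")
record_event(connection, "page-2", "deleted")

load = lambda content_id: {"title": content_id}
skipped = run_incremental(connection, index, True, load)
processed = run_incremental(connection, index, False, load)
remaining = connection.execute("SELECT COUNT(*) FROM indexing_events").fetchone()[0]
```

Because the events are only removed once they have been applied, a run that is skipped while the Full Indexing Job is active loses nothing in this sketch; the next run picks up the same rows.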