Indexing - How it works

Prerequisite

Fields must be defined in Hawksearch before the data sent from the Optimizely website can be saved in the Hawksearch indexes. For more details on field setup, refer to https://luminoslabs.atlassian.net/wiki/spaces/HC/pages/3718774833

Overall Approach

Passing data from an Optimizely solution (one or more websites) to one or more Hawksearch engines is designed as a two-part process. Each part uses the Scheduled Jobs support that the Optimizely framework provides out of the box:

  1. Full Indexing Job - a process that extracts all CMS (pages) and commerce (products/variants/bundles/packages) content from your Optimizely solution, transforms each content item into a document object, and loads the documents into the Hawksearch engine(s).

  2. Incremental Indexing Job - a process that uses the Optimizely events raised when content changes (published/deleted) to update the Hawksearch engine(s) with the latest version of the data.

At the core of both indexing jobs sits a design based on a series of Handlers chained one after the other, each solving one particular indexing use case. This is the first extensibility point of the indexing processes inside the connector: additional handlers can be developed and chained as needed. For more details on how to do this, check How to Extend Handlers.
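
The chained-handler design can be pictured with a minimal Python sketch. This is an illustration only, not the connector's code: the names IndexingContext, Handler, CreateIndexHandler, ProductIndexingHandler and run_chain are hypothetical, and the real handlers talk to Optimizely and Hawksearch rather than appending to a log.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class IndexingContext:
    """Hypothetical state shared by all handlers during one job run."""
    engine: str
    log: list[str] = field(default_factory=list)


class Handler(Protocol):
    """Hypothetical handler contract: one method, one indexing use case."""
    def handle(self, context: IndexingContext) -> None: ...


class CreateIndexHandler:
    def handle(self, context: IndexingContext) -> None:
        # Stand-in for creating a new index in the search engine.
        context.log.append(f"create index for {context.engine}")


class ProductIndexingHandler:
    def handle(self, context: IndexingContext) -> None:
        # Stand-in for indexing all products.
        context.log.append(f"index products for {context.engine}")


def run_chain(handlers: list[Handler], context: IndexingContext) -> IndexingContext:
    # Handlers run strictly in the order in which they were chained.
    for handler in handlers:
        handler.handle(context)
    return context
```

Under these assumptions, extending the job amounts to writing another class with a handle method and adding it to the list at the desired position.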

The more complex handlers (e.g. Product Indexing) are in turn built out of a series of Pipes, which usually extract content from the Optimizely databases, transform it into Hawksearch document objects, and load those objects into the Hawksearch engine(s) using the Hawksearch V4.0 APIs (https://hawksearch.atlassian.net/wiki/spaces/HSKB/pages/1150943236/Hawksearch+V4.0). Additional pipes can be developed as needed. For more details on how to do this, check How to Extend Pipes.
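
The extract / transform / load flow of the pipes can be sketched as a chain of Python generators. Everything here is an assumption made for illustration: the function names, the sample product fields (code, name) and the document fields (id, title) are hypothetical, and the loader appends to a list instead of calling the Hawksearch APIs.

```python
from __future__ import annotations

from typing import Callable, Iterable, Iterator

# A pipe consumes a stream of items and produces a new stream.
Pipe = Callable[[Iterable[dict]], Iterator[dict]]


def extract_products(_: Iterable[dict]) -> Iterator[dict]:
    # Stand-in for reading products from the Optimizely databases.
    yield {"code": "SKU-1", "name": "Shirt"}
    yield {"code": "SKU-2", "name": "Hat"}


def to_documents(items: Iterable[dict]) -> Iterator[dict]:
    # Transform each content item into a (hypothetical) document shape.
    for item in items:
        yield {"id": item["code"], "title": item["name"]}


def make_loader(sink: list[dict]) -> Pipe:
    def load(documents: Iterable[dict]) -> Iterator[dict]:
        for document in documents:
            sink.append(document)  # stand-in for a call to the indexing API
            yield document
    return load


def run_pipeline(pipes: list[Pipe]) -> list[dict]:
    stream: Iterable[dict] = []
    for pipe in pipes:
        stream = pipe(stream)  # each pipe wraps the output of the previous one
    return list(stream)  # consuming the final stream drives the whole chain
```

In this sketch, adding a pipe means inserting one more function into the list passed to run_pipeline, for example an enrichment step between to_documents and the loader.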

 

Full Indexing Job

Steps (aka Handlers)

  1. Delete Previous (not set as current) Index

    1. read more about this step here

  2. Create Index

  3. Categories Indexing

  4. Rebuild Hierarchy

  5. Product Indexing

  6. Variants Indexing

  7. Bundles Indexing

  8. Packages Indexing

  9. CMS Content Indexing

    • multiple pipelines can be added, depending on which Optimizely page types need to be indexed

  10. Rebuild All

  11. Set new index as Current Index

 

Notes

  • steps above are repeated for each engine (language) defined in Configuring the Connector

  • it is recommended to run this job on a scheduled basis, no more frequently than once per day
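
The index lifecycle implied by the steps above (delete the previous non-current index, create a new index, load content, set the new index as current) can be sketched as follows. FakeEngine is an in-memory stand-in, not the Hawksearch API, and all names are hypothetical. The sketch assumes that the index marked as current stays untouched until the new one is fully loaded.

```python
from __future__ import annotations


class FakeEngine:
    """In-memory stand-in for one Hawksearch engine (not the real API)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.indexes: dict[str, list[dict]] = {}  # index name -> documents
        self.current: str | None = None


def full_index(engine: FakeEngine, documents: list[dict], new_index: str) -> None:
    # Step 1: delete previous indexes that are not set as current.
    for name in list(engine.indexes):
        if name != engine.current:
            del engine.indexes[name]
    # Step 2: create the new index.
    engine.indexes[new_index] = []
    # Steps 3-10: load all content (collapsed into a single call here).
    engine.indexes[new_index].extend(documents)
    # Step 11: only now switch the current index to the new one.
    engine.current = new_index


def full_index_all(engines: list[FakeEngine], documents: list[dict], new_index: str) -> None:
    # The same steps are repeated for each configured engine (language).
    for engine in engines:
        full_index(engine, documents, new_index)
```

In this sketch the previously current index survives one extra run and is cleaned up at the start of the next one, which is one plausible reading of the "Delete Previous (not set as current) Index" step.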

 

Incremental Indexing Job

Steps (aka Handlers)

  1. Get Current Index

  2. Categories Incremental Indexing

  3. Categories Deletion

  4. Rebuild Hierarchy

  5. Cascading Events Handling

    1. read more about this step here

  6. Products Incremental Indexing

  7. Variants Incremental Indexing

  8. Bundles Incremental Indexing

  9. Packages Incremental Indexing

  10. Products / Variants / Bundles / Packages Deletion

  11. CMS Content Incremental Indexing

  12. CMS Content Deletion

  13. Rebuild All

 

Notes

  • steps above are repeated for each engine (language) defined in Configuring the Connector

  • this job will not run while the Full Indexing Job is running

  • it is recommended to run this job on a scheduled basis, every 1 to 10 minutes
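
The rule that the incremental job is skipped while a full indexing run is in progress can be sketched with a shared flag. This is a single-process illustration with hypothetical names; how the connector actually detects a running full job is not described here, and a multi-server deployment would need a flag that all servers can see.

```python
import threading
from typing import Callable

# Hypothetical flag that is set for the duration of a full indexing run.
full_indexing_running = threading.Event()


def run_full(job: Callable[[], None]) -> None:
    full_indexing_running.set()
    try:
        job()
    finally:
        # Always clear the flag, even if the job fails.
        full_indexing_running.clear()


def run_incremental(job: Callable[[], None]) -> bool:
    # Skip this run entirely if a full indexing job is in progress.
    if full_indexing_running.is_set():
        return False
    job()
    return True
```

A skipped run loses nothing as long as the pending changes remain stored until the next incremental run picks them up.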

 

Events

  • when content is published, expired or deleted, the system temporarily stores information about that content in a custom table

  • a full list of events and how they are treated can be found here
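
The idea of temporarily storing change information in a table until the next incremental run can be sketched with an in-memory SQLite database. The table name pending_changes, its columns and the function names are assumptions made for this example; the connector's actual custom table is not described here.

```python
from __future__ import annotations

import sqlite3


def open_queue() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pending_changes ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "content_id TEXT NOT NULL, "
        "event TEXT NOT NULL)"
    )
    return conn


def record_event(conn: sqlite3.Connection, content_id: str, event: str) -> None:
    # Called from a content event handler (publish, expire, delete).
    conn.execute(
        "INSERT INTO pending_changes (content_id, event) VALUES (?, ?)",
        (content_id, event),
    )
    conn.commit()


def drain_events(conn: sqlite3.Connection) -> list[tuple[str, str]]:
    # Called by the incremental job: read pending rows, then remove them.
    rows = conn.execute(
        "SELECT id, content_id, event FROM pending_changes ORDER BY id"
    ).fetchall()
    if rows:
        # Delete only the rows that were read, so later events are kept.
        conn.execute("DELETE FROM pending_changes WHERE id <= ?", (rows[-1][0],))
        conn.commit()
    return [(content_id, event) for _, content_id, event in rows]
```

Decoupling the event from the indexing call in this way keeps publishing fast for editors, since the work of updating the search engine is deferred to the scheduled job.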
