Field Configuration: Best Practices - Field Types

Overview

Field and facet configuration affects index size and payload size, which in turn affect indexing time and engine performance. The following guidelines are standard best practices; your configuration may vary depending on your data, business requirements, and use cases.

Data Configuration: Fields

The following screenshot depicts the configuration settings within the field section that this document will cover. Below are some of the key field settings that impact index and payload size.


Field Type

There are four different configuration options for field type. The right option depends on the type of data the field contains and on how the field will be used.


Field Values are NOT stemmed

This field configuration should be used in the following instances:

  • Fields that are used as Facets

  • Fields that are used for Sorting

  • Fields that need to be searched as is (i.e. on the full field)

    • Brand (e.g. “Under Armour” should match only on the full term; searching “under” should not return a match on “Under Armour”)

    • Size fields

    • Numeric Values
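
The "as is" behavior described above can be illustrated with a minimal sketch. It assumes the whole field value is compared as a single unit and that matching is case-insensitive; both are assumptions made for illustration, and the function names are hypothetical rather than part of the product.

```python
def normalize(value: str) -> str:
    """Trim and lowercase the value, but keep it as ONE unit (no word splitting, no stemming)."""
    return value.strip().lower()


def exact_match(query: str, field_value: str) -> bool:
    """Return True only when the query equals the full field value."""
    return normalize(query) == normalize(field_value)


# The full brand name matches; a single word from it does not.
print(exact_match("Under Armour", "Under Armour"))
print(exact_match("under", "Under Armour"))
```

Because the value is never split into words, the same stored value can also serve directly as a facet bucket or sort key.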


Field values are ONLY stemmed (search only)

This field configuration should be used in the following instances:

  • Fields where individual words need to be searchable (e.g. searching “Under Armour” would find a match for “Under Armour”, and searching “under” would also find a match on “Under Armour”)

  • Fields that are NOT used for Sorting or Facets

  • Fields this setting is typically used on include:

    • Short Description

    • Content
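
As a rough sketch of the stemmed behavior described above, the example below splits a value into words and reduces each word with a deliberately naive suffix stripper. The stemmer here is a stand-in written for illustration only; the engine's actual stemming rules are not described in this document and will differ.

```python
import re


def naive_stem(token: str) -> str:
    """Toy stemmer (illustrative only): strip a trailing 'ing' or 's' from longer words."""
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text: str) -> set[str]:
    """Lowercase the text, split it into words, and stem each word."""
    return {naive_stem(tok) for tok in re.findall(r"[a-z0-9]+", text.lower())}


def stemmed_match(query: str, field_value: str) -> bool:
    """Return True when every stemmed query word appears in the stemmed field value."""
    query_terms = analyze(query)
    return bool(query_terms) and query_terms <= analyze(field_value)


# A single word now matches inside the phrase, and so does a word variation.
print(stemmed_match("under", "Under Armour"))
print(stemmed_match("shoes", "Running Shoe"))
```

Since the original value is broken into stemmed words, it can no longer be used reliably as a facet or sort value, which is why this option is search only.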


Field Values indexed “as is” AND are stemmed

This setting combines the two settings above. This field configuration should be used in the following instances:

  • If the field will be used for Sorting or Facets AND for searching

  • This setting stores two versions of the field: the ‘as is’ value and the stemmed value.

  • NOTE: while this option covers all scenarios, please use this setting judiciously, as it increases the index size to hold both the ‘as is’ and stemmed values.


Stored only, not used for search or facets

This field configuration should be used in the following instances:

  • Fields that are not searchable and not used as facets, but that must be included in the response to render the layout

  • This setting is typically for fields such as:

    • Image Fields

    • URL Fields
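
The four field type options above can be summarized as a simple decision rule. The helper below is a hypothetical sketch: the function name, its parameters, and the returned labels are illustrative only and do not correspond to an actual API or to exact setting names in the product.

```python
def choose_field_type(searched_by_word: bool, exact_facet_or_sort: bool) -> str:
    """Suggest a field type from how the field is used.

    searched_by_word:    users search for individual words inside the field.
    exact_facet_or_sort: the field is a facet, a sort key, or must match as a whole.
    """
    if searched_by_word and exact_facet_or_sort:
        # Covers every scenario, but stores both versions and grows the index.
        return "as is AND stemmed"
    if exact_facet_or_sort:
        return "NOT stemmed"
    if searched_by_word:
        return "ONLY stemmed"
    # Neither searched nor faceted: only returned to render the layout.
    return "stored only"


print(choose_field_type(searched_by_word=False, exact_facet_or_sort=True))   # e.g. Brand, Size
print(choose_field_type(searched_by_word=True, exact_facet_or_sort=False))   # e.g. Short Description
print(choose_field_type(searched_by_word=False, exact_facet_or_sort=False))  # e.g. image or URL fields
```

The rule deliberately falls through to the combined option only when both uses apply, which mirrors the guidance to use that setting judiciously.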


Prefixed/Wildcard Fields

This field configuration should be used in the following instances:

  • This setting should only be enabled if prefix or wildcard querying is needed on the front end or back end.

  • Please use this setting judiciously as it will increase the index size to account for all of the combinations being indexed.

  • Should not be used on long fields

  • This field setting will bloat the index if used incorrectly and negatively impact relevancy.

  • Typically this setting is enabled on very few fields, if any.

  • Remember: Stemming handles searching words within a phrase, as well as variations of the word. 

  • Wildcard will handle partial searches within a single value. The most common use for this setting is with fields such as:

    • SKU

    • UPC
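
To see why this setting increases index size, consider a minimal sketch that assumes prefix support is implemented by indexing every leading substring of the value. This is an assumption made for illustration; the engine's actual wildcard implementation is not described in this document. The SKU value and function names are hypothetical.

```python
def prefixes(value: str, min_len: int = 1) -> list[str]:
    """Return every leading substring of the value, from min_len characters up to the full value."""
    lowered = value.lower()
    return [lowered[:i] for i in range(min_len, len(lowered) + 1)]


def prefix_match(query: str, field_value: str) -> bool:
    """Return True when the query is one of the indexed prefixes of the field value."""
    return query.lower() in prefixes(field_value)


sku = "AB-123"  # hypothetical SKU
print(prefixes(sku))
print(prefix_match("ab-1", sku))

# The number of indexed entries grows with the length of the value,
# which is why long fields are a poor fit for this setting.
print(len(prefixes(sku)), len(prefixes("x" * 200)))
```

Under this assumption a short identifier such as a SKU or UPC adds only a handful of entries per value, while a long text field adds one entry for every character.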


Include in Results

  • Lucene stores both tokenized and pure text versions of the field. Turning this setting off stops storing the text version of the field, which in turn reduces the index size.

  • This setting should only be enabled on fields whose values are needed in the response to render the item.