Customizing Cortex Search Scoring | Snowflake Documentation

By default, queries to Cortex Search Services leverage vector similarity, text matching, and reranking to determine the relevance of each result. You can customize the scoring of search results in several ways:

Numeric boosts and time decays

You can boost search results, or apply decays to them, based on numeric or timestamp metadata. This feature is useful when you have structured metadata, such as popularity or recency signals, for each result that can help determine the relevance of documents at query time. You can specify two categories of ranking signals when making a query:

Type | Description | Applicable column types | Example metadata fields (illustrative)
Numeric boost | Numeric metadata that boosts results having more attention or activity. | Numeric data type | clicks, likes, comments
Time decay | Date or time metadata that boosts more recent results. The influence of recency signals decays over time. | Date and time data type | created_timestamp, last_opened_timestamp, action_date

Boost and decay metadata come from columns in the source table from which a Cortex Search Service is created. You specify the metadata columns to use for boosting or decaying when you make the query, but those columns must be included when creating the Cortex Search service.

When querying a Cortex Search Service, specify the columns to use for boosting or decaying in the optional numeric_boosts and time_decays fields in the scoring_config.functions field. You can also specify the weight for each boost or decay.

{ "scoring_config": { "functions": { "numeric_boosts": [ { "column": "column_name", "weight": 1 }, /* ... */ ], "time_decays": [ { "column": "column_name", "weight": 1, "limit_hours": 120 }, /* ... */ ] } } }
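As a minimal sketch, the same structure can be assembled in Python before it is sent with a query. The helper name build_scoring_config and the column names are hypothetical, chosen only for illustration; the field names follow the JSON format above.

```python
import json


def build_scoring_config(numeric_boosts=None, time_decays=None):
    """Assemble a scoring_config dict with optional boost and decay lists.

    numeric_boosts: iterable of (column, weight) pairs.
    time_decays: iterable of (column, weight, limit_hours) triples.
    """
    functions = {}
    if numeric_boosts:
        functions["numeric_boosts"] = [
            {"column": column, "weight": weight}
            for column, weight in numeric_boosts
        ]
    if time_decays:
        functions["time_decays"] = [
            {"column": column, "weight": weight, "limit_hours": limit_hours}
            for column, weight, limit_hours in time_decays
        ]
    return {"scoring_config": {"functions": functions}}


# Hypothetical columns; they must exist in the Cortex Search Service.
payload = build_scoring_config(
    numeric_boosts=[("likes", 1), ("comments", 2)],
    time_decays=[("created_timestamp", 1, 120)],
)
print(json.dumps(payload, indent=2))
```

Both lists are optional, so the helper omits a key entirely when no columns are supplied for it.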

Properties

Note

Numeric boosts are applied as weighted averages to the returned fields, while decays leverage a log-smoothed function to demote less recent values.

Weights are relative across the specified boost or decay fields. If only a single field is provided within a boosts or decays array, the value of its weight is irrelevant.

If more than one field is provided, the weights are applied relative to each other. A field with a weight of 10, for example, affects the record’s ranking twice as much as a field with a weight of 5.
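The following sketch illustrates what "relative" means here by converting weights into shares of the total. It is only an illustration of the ratio described above, not the service's internal formula, and the function name relative_influence is hypothetical.

```python
def relative_influence(weights):
    """Convert absolute weights into each field's share of the total weight."""
    total = sum(weights.values())
    return {column: weight / total for column, weight in weights.items()}


# Two fields: a weight of 10 carries twice the share of a weight of 5.
shares = relative_influence({"comments": 10, "likes": 5})
ratio = shares["comments"] / shares["likes"]

# One field: its share is the whole total, whatever the weight value is.
single = relative_influence({"clicks": 7})
```

With a single field the share is always the full total, which is why the weight value has no effect in that case.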

Reranking

By default, queries to Cortex Search Services use semantic reranking to improve search result relevance. While reranking can measurably increase result relevance, it can also noticeably increase query latency. You can disable reranking in any Cortex Search query if, for your use case, faster queries matter more than the quality benefit that reranking provides.

Note

Disabling reranking reduces query latency by 100-300ms on average, but the exact reduction in latency, as well as the magnitude of the quality degradation, varies across workloads. Evaluate results side-by-side, with and without reranking, before you decide to disable it in queries.

You can disable the reranker for an individual query at query time in the scoring_config.reranker field in the following format:

{ "scoring_config": { "reranker": "none" } }
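One simple way to compare results with and without reranking is to measure how much the two top-k lists overlap. The sketch below assumes you have already run the same query twice and collected an identifier for each result; the document IDs and the function name overlap_at_k are hypothetical.

```python
def overlap_at_k(with_rerank, without_rerank, k=10):
    """Fraction of the top-k result IDs shared by both rankings (order ignored)."""
    top_a = set(with_rerank[:k])
    top_b = set(without_rerank[:k])
    if not top_a and not top_b:
        return 1.0  # two empty result lists agree trivially
    return len(top_a & top_b) / max(len(top_a), len(top_b))


# Hypothetical result IDs from the same query, with and without the reranker.
reranked = ["d1", "d2", "d3", "d4"]
not_reranked = ["d2", "d1", "d5", "d3"]
overlap = overlap_at_k(reranked, not_reranked, k=4)
```

A high overlap across a representative set of queries suggests the reranker changes little for that workload; a low overlap is a reason to review the differing results manually before disabling it.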

Properties

Component weights

The weights field in the scoring_config object lets you specify the weights of the individual scoring components (vectors, texts, reranker) in the overall score for each result. By default, each component has a weight of 1.0, so all components contribute equally to the overall score.

You can specify weights in the following format:

{ "scoring_config": { "functions": { "weights": { "texts": 3, "vectors": 2, "reranker": 1 } } } }

Properties

For example, the following specifies that text scores should be weighted 3 times more than vector scores, and reranker scores should be weighted 2 times more than vector scores:

{ "scoring_config": { "functions": { "weights": { "texts": 3, "vectors": 1, "reranker": 2 } } } }
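To make the effect of these weights concrete, the sketch below combines three component scores as a weighted average. This is an assumption made purely for illustration: the exact way Cortex Search combines component scores is not described here, and the score values are invented.

```python
def weighted_score(component_scores, weights):
    """Weighted average of component scores (illustrative assumption only)."""
    total_weight = sum(weights[name] for name in component_scores)
    weighted_sum = sum(
        component_scores[name] * weights[name] for name in component_scores
    )
    return weighted_sum / total_weight


weights = {"texts": 3, "vectors": 1, "reranker": 2}

# Invented component scores for a single result.
scores = {"texts": 0.5, "vectors": 0.8, "reranker": 0.2}
combined = weighted_score(scores, weights)

# With equal weights, the same result is scored by a plain average instead.
equal = weighted_score(scores, {"texts": 1, "vectors": 1, "reranker": 1})
```

Under this assumption the text score dominates: the combined value sits closer to 0.5 than the plain average does, because texts carries half of the total weight.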

Named scoring profiles

Boosts/decays and reranker settings together form a scoring configuration, which can be specified in the scoring_config parameter when making a query. Scoring configurations can also be given a name and attached to the Cortex Search service.

Using a named scoring profile lets you easily use a scoring configuration across applications and queries without having to specify the full scoring configuration each time. If you change the scoring configuration, you only need to update it in one place, not in every query.

To add a scoring profile to your Cortex Search Service, use the ALTER CORTEX SEARCH SERVICE … ADD SCORING PROFILE command, as shown in the following example:

ALTER CORTEX SEARCH SERVICE my_search_service ADD SCORING PROFILE IF NOT EXISTS heavy_comments_with_likes '{ "functions": { "numeric_boosts": [ { "column": "comments", "weight": 6 }, { "column": "likes", "weight": 1 } ] } }'
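If the scoring configuration already lives in application code, the statement can be generated from a Python dict rather than written by hand. This is a sketch only: the helper name add_scoring_profile_sql is hypothetical, and it does not validate the service or profile identifiers, so pass only trusted names.

```python
import json


def add_scoring_profile_sql(service, profile, config):
    """Return an ALTER ... ADD SCORING PROFILE statement for the given config."""
    # Escape backslashes and single quotes for a single-quoted SQL string literal.
    body = json.dumps(config).replace("\\", "\\\\").replace("'", "''")
    return (
        f"ALTER CORTEX SEARCH SERVICE {service} "
        f"ADD SCORING PROFILE IF NOT EXISTS {profile} '{body}'"
    )


config = {
    "functions": {
        "numeric_boosts": [
            {"column": "comments", "weight": 6},
            {"column": "likes", "weight": 1},
        ]
    }
}
statement = add_scoring_profile_sql(
    "my_search_service", "heavy_comments_with_likes", config
)
print(statement)
```

The resulting string can then be executed through whichever Snowflake client your application already uses.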

The syntax of the scoring profile definition is the same schema used in the scoring_config parameter when making a query.

Scoring profiles can’t be modified after being created; to change a profile, drop it and recreate it with the new scoring configuration. To delete a named scoring profile, use ALTER CORTEX SEARCH SERVICE … DROP SCORING PROFILE.

To query a Cortex Search Service using a named scoring profile, specify the profile name in the scoring_profile parameter when making a query, as shown in the following example:

results = svc.search( query="technology", columns=["comments", "likes"], scoring_profile="heavy_comments_with_likes", limit=10 )

To see a service’s stored scoring profiles, query the CORTEX_SEARCH_SERVICE_SCORING_PROFILES view in the INFORMATION_SCHEMA schema, as shown in the following example:

SELECT * FROM my_db.INFORMATION_SCHEMA.CORTEX_SEARCH_SERVICE_SCORING_PROFILES WHERE service_name = 'my_search_service';

Note

The DESCRIBE CORTEX SEARCH SERVICE and SHOW CORTEX SEARCH SERVICES results contain a column named scoring_profile_count that indicates the number of scoring profiles for each service.

Component Scores

Component Scores provide detailed scoring information for search results. They allow developers to understand how search rankings are determined and debug search performance. Scores for each result are returned in the @scores field for each retrieval “component” (text, vector). Component scores are useful in scenarios where there is a need to:

Understanding Component Scores

Component scores provide detailed breakdowns of how Cortex Search calculates the final relevance score for each search result. The scoring system consists of multiple components:

Cosine Similarity

Scores based on semantic similarity between the query and vector indexes. Higher scores indicate stronger conceptual or meaning-based matches using vector embeddings.

Text Match

Scores based on keyword/lexical similarity between the query and text indexes. Higher scores indicate stronger exact or fuzzy keyword matches.

Response Format

With component scores enabled, the following scoring information is returned for all your Cortex Search queries. For more information on Cortex Search Query syntax, see Query a Cortex Search Service.

{ "results": [ { "@scores": { "cosine_similarity": <cosine_similarity_score>, "text_match": <text_match_score> } } ] }
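A response in this shape can be inspected with a few lines of Python to see which component contributed the higher score for each result. The sample response below is invented for illustration: the title column and all score values are hypothetical, and comparing the two raw values assumes they are on comparable scales, which is not stated here.

```python
# Invented sample in the response shape shown above.
response = {
    "results": [
        {"title": "doc A", "@scores": {"cosine_similarity": 0.62, "text_match": 0.10}},
        {"title": "doc B", "@scores": {"cosine_similarity": 0.41, "text_match": 0.85}},
    ]
}


def dominant_component(result):
    """Name of the component with the highest score, or None if scores are absent."""
    scores = result.get("@scores", {})
    return max(scores, key=scores.get) if scores else None


summary = [(r["title"], dominant_component(r)) for r in response["results"]]
for title, component in summary:
    print(f"{title}: strongest component is {component}")
```

A result whose ranking is driven mostly by text_match points to a keyword hit, while one driven by cosine_similarity points to a semantic match, which can help when debugging unexpected rankings.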

Score Fields

Usage Notes