Customizing Cortex Search Scoring | Snowflake Documentation (original) (raw)

By default, queries to Cortex Search Services leverage vector similarity, text matching, and reranking to determine the relevance of each result. You can customize the scoring of search results in several ways:

Apply numeric boosts based on numeric metadata columns.
Apply time decays based on timestamp metadata columns.
Disable reranking to reduce query latency.
Modify component weights to adjust the weight of individual scoring components (vector, text, reranking) in the overall search ranking.

Numeric boosts and time decays¶

You can boost or apply decays search results based on numeric or timestamp metadata. This feature is useful when you have structured metadata, such as popularity or recency signals, for each result that can help determine the relevance of documents at query time. You can specify two categories of ranking signals when making a query:

Type	Description	Applicable column types	Example metadata fields (illustrative)
Numeric boost	Numeric metadata that boosts results having more attention or activity.	Numeric data type	clicks, likes, comments
Time decay	Date or time metadata that boosts more recent results. The influence of recency signals decays over time.	Date and time data type	created_timestamp, last_opened_timestamp, action_date

Boost and decay metadata come from columns in the source table from which a Cortex Search Service is created. You specify the metadata columns to use for boosting or decaying when you make the query, but those columns must be included when creating the Cortex Search service.

When querying a Cortex Search Service, specify the columns to use for boosting or decaying in the optionalnumeric_boosts and time_decays fields in the scoring_config.functions field. You can also specify the weight for each boost or decay.

{ "scoring_config": { "functions": { "numeric_boosts": [ { "column": "column_name", "weight": 1 }, /* ... / ], "time_decays": [ { "column": "column_name", "weight": 1, "limit_hours": 120 }, / ... */ ] } } }

Properties¶

numeric_boosts (array, optional):
- <numeric_boost_object> (object, optional):
  * column_name (string): Specifies the numeric column to which the boost should be applied.
  * weight (float): Specifies the weight or importance assigned to the boosted column in the ranking process. When multiple columns are specified, a higher weight increases the influence of the field.
time_decays (array, optional):
- <time_decay_object> (object, optional):
  * column_name (string): Specifies the time or date column to which the decay should be applied.
  * weight (float): Specifies the weight or importance assigned to the decayed column in the ranking process. When multiple columns are specified, a higher weight increases the influence of the field.
  * limit_hours (float): Sets the boundary after which time starts to have less effect on the relevance or importance of the document. For example, a limit_hours value of 240 indicates that documents with timestamps greater than 240 hours (10 days) in the past from the now timestamp do not receive significant boosting, while documents with a timestamp within the last 240 hours should receive a more significant boost.
  * now (string, optional): Optional reference timestamp from which decays are calculated in ISO-8601 format yyyy-MM-dd'T'HH:mm:ss.SSSXXX. For example, "2025-02-19T14:30:45.123-08:00". Defaults to the current timestamp if not specified.

Note

Numeric boosts are applied as weighted averages to the returned fields, while decays leverage a log-smoothed function to demote less recent values.

Weights are relative across the specified boost or decay fields. If only a single field is provided within a boosts ordecays array, the value of its weight is irrelevant.

If more than one field is provided, the weights are applied relative to each other. A field with a weight of 10, for example, affects the record’s ranking twice as much as a field with a weight of 5.

Reranking¶

By default, queries to Cortex Search Services leverage semantic reranking to improve search result relevance. While reranking can measurably increase result relevance, it can also noticeably increase query latency. You can disable reranking in any Cortex Search query if you’ve found that the quality benefit that reranking provides can be sacrificed for faster query speeds in your business use case.

Note

Disabling reranking reduces query latency by 100-300ms on average, but the exact reduction in latency, as well as the magnitude of the quality degradation, varies across workloads. Evaluate results side-by-side, with and without reranking, before you decide to disable it in queries.

You can disable the reranker for an individual query at query time in the scoring_config.reranker field in the following format:

{ "scoring_config": { "reranker": "none" } }

Properties¶

reranker (string, optional): Parameter that can be set to “none” if the reranker should be turned off. If excluded or null, the default reranker is used.

Component weights¶

The weights field in the scoring_config object allows you to specify the weights of individual scoring components (vectors, texts, reranker) in the overall score for each result. By default, the weights are set to 1.0 for each component, with an equal contribution to the overall scoring.

You can specify weights in the following format:

{ "scoring_config": { "functions": { "weights": { "texts": 3, "vectors": 2, "reranker": 1 } } } }

Properties¶

weights (object, optional): Specifies weights for combining text, vector, and reranker scores for each document. Weights are applied relative to one another within this field.

For example, the following specifies that text scores should be weighted 3 times more than vector scores, and reranker scores should be weighted 2 times more than text scores:

{ "scoring_config": { "functions": { "weights": { "texts": 3, "vectors": 1, "reranker": 2 } } } }

Named scoring profiles¶

Boosts/decays and reranker settings together form a scoring configuration, which can be specified in the scoring_config parameter when making a query. Scoring configurations can also be given a name and attached to the Cortex Search service.

Using a named scoring profile lets you easily use a scoring configuration across applications and queries without having to specify the full scoring configuration each time. If you change the scoring configuration, you only need to update it in one place, not in every query.

To add a scoring profile to your Cortex Search Service, use the ALTER CORTEX SEARCH SERVICE … ADD SCORING PROFILE command, as shown in the following example:

ALTER CORTEX SEARCH SERVICE my_search_service ADD SCORING PROFILE IF NOT EXISTS heavy_comments_with_likes '{ "functions": { "numeric_boosts": [ { "column": "comments", "weight": 6 }, { "column": "likes", "weight": 1 } ] } }'

The syntax of the scoring profile definition is the same schema used in the scoring_config parameter when making a query.

Scoring profiles can’t be modified after being created; to change a profile, drop it and recreate it with the new scoring configuration. To delete a named scoring profile, use ALTER CORTEX SEARCH SERVICE … DROP SCORING PROFILE.

To query a Cortex Search Service using a named scoring profile, specify the profile name in the scoring_profile parameter when making a query, as shown in the following examples:

results = svc.search( query="technology", columns=["comments", "likes"], scoring_profile="heavy_comments_with_likes", limit=10 )

To see a service’s stored scoring profiles, query the CORTEX_SEARCH_SERVICE_SCORING_PROFILES view in theINFORMATION_SCHEMA schema, as shown in the following example:

SELECT * FROM my_db.INFORMATION_SCHEMA.CORTEX_SEARCH_SERVICE_SCORING_PROFILES WHERE service_name = 'my_search_service';

Note

The DESCRIBE CORTEX SEARCH SERVICE and SHOW CORTEX SEARCH SERVICE results contain a column named scoring_profile_count that indicates the number of scoring profiles for each service.

Component Scores¶

Component Scores provide detailed scoring information for search results. They allow developers to understand how search rankings are determined and debug search performance. Scores for each result are returned in the @scores field for each retrieval “component” (text, vector). Component scores are useful in scenarios where there is a need to:

Establish thresholds: Use component scores to determine when to pass results on to a downstream process, like an agent.
Debug search rankings: Understand why certain documents rank higher than others in search results.

Understanding Component Scores¶

Component scores provide detailed breakdowns of how Cortex Search calculates the final relevance score for each search result. The scoring system consists of multiple components:

Cosine Similarity

Scores based on semantic similarity between the query and vector indexes. Higher scores indicate stronger conceptual or meaning-based matches using vector embeddings.

Text Match

Scores based on keyword/lexical similarity between the query and text indexes. Higher scores indicate stronger exact or fuzzy keyword matches.

Response Format¶

With component scores enabled, the following scoring information is returned for all your Cortex Search queries. For more information on Cortex Search Query syntax, see Query a Cortex Search Service.

{ "results": [ { "@scores": { "cosine_similarity": , "text_match": } } ] }

Score Fields¶

@scores.cosine_similarity: Cosine similarity score between the query and the value in the vector index, in the range [-1, 1].
@scores.text_match: Text match score between the query and the value in the text index. This score is unbounded and its range depends on the query.

Usage Notes¶

cosine_similarity scores are:
- Returned for any query that includes a VECTOR INDEX.
- Bounded in the range [-1, 1] and comparable across different queries.
- Computed after prepending each query with a prefix that increases search quality in many cases. This prefix varies per model, but generally looks like: Represent this sentence for searching relevant passages: {query}. The returned cosine similarity score is the cosine similarity between the query with the prefix and the value in the vector index.
text_match scores are:
- Returned for any query that includes a TEXT INDEX. text_match scores are unbounded.
- Not comparable across different queries. For example, a text match score of 0.95 on a result for a given query is not comparable to a text match score of 0.95 on a result for a different query to the same service.
@scores values are not affected by the weights parameter. The weights only affect the final ordering of the results.