Full Text Searching

Full text indexes in Marten are built as GIN or GiST indexes on top of Postgres's built-in text search functions. This allows more sophisticated searching through text fields.

WARNING

To use this feature, you need PostgreSQL version 10.0 or above, because that is the first version that supports text search functions on jsonb columns, which is the data type Marten uses to store its data.

Defining Full Text Index through Store options

Full Text Indexes can be created using the fluent interface of StoreOptions like this:

cs

var store = DocumentStore.For(_ =>
{
    _.Connection(ConnectionSource.ConnectionString);

    // This creates a full text index over the whole document, using the default 'english' configuration
    _.Schema.For<User>().FullTextIndex();
});

INFO

If you don't specify a language (regConfig), the index is created with the 'english' configuration by default.

cs

var store = DocumentStore.For(_ =>
{
    _.Connection(ConnectionSource.ConnectionString);

    // This creates a full text index on a single property (FirstName) with default settings
    _.Schema.For<User>().FullTextIndex(d => d.FirstName);
});

cs

var store = DocumentStore.For(_ =>
{
    _.Connection(ConnectionSource.ConnectionString);

    // This creates a full text index on FirstName with a custom name and the 'italian' configuration
    _.Schema.For<User>().FullTextIndex(
        index =>
        {
            index.Name = "mt_custom_italian_user_fts_idx";
            index.RegConfig = "italian";
        },
        d => d.FirstName);
});

cs

var store = DocumentStore.For(_ =>
{
    _.Connection(ConnectionSource.ConnectionString);

    // This creates a single full text index covering both FirstName and LastName
    _.Schema.For<User>().FullTextIndex(d => d.FirstName, d => d.LastName);
});

cs

var store = DocumentStore.For(_ =>
{
    _.Connection(ConnectionSource.ConnectionString);

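    // This creates a single full text index covering FirstName and LastName,
    // with a custom name and the 'italian' configuration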
    _.Schema.For<User>().FullTextIndex(
        index =>
        {
            index.Name = "mt_custom_italian_user_fts_idx";
            index.RegConfig = "italian";
        },
        d => d.FirstName, d => d.LastName);
});

cs

var store = DocumentStore.For(_ =>
{
    _.Connection(ConnectionSource.ConnectionString);

    // This creates two full text indexes on the same document, one per language (regConfig)
    _.Schema.For<User>()
        .FullTextIndex(d => d.FirstName) // uses "english" by default
        .FullTextIndex("italian", d => d.LastName);
});

Defining Full Text Index through Attribute

Full Text Indexes can be created using the [FullTextIndex] attribute like this:

cs

[FullTextIndex]
public class Book
{
    public Guid Id { get; set; }

    public string Title { get; set; }

    public string Author { get; set; }

    public string Information { get; set; }
}

cs

public class UserProfile
{
    public Guid Id { get; set; }

    [FullTextIndex] public string Information { get; set; }
}

INFO

If you don't specify regConfig, the index is created with the 'english' configuration by default.

cs

public class UserDetails
{
    private const string FullTextIndexName = "mt_custom_user_details_fts_idx";

    public Guid Id { get; set; }

    [FullTextIndex(IndexName = FullTextIndexName, RegConfig = "italian")]
    public string Details { get; set; }
}

cs

public class Article
{
    public Guid Id { get; set; }

    [FullTextIndex] public string Heading { get; set; }

    [FullTextIndex] public string Text { get; set; }
}

INFO

To group multiple properties into a single index, specify the same IndexName value on each property's attribute.
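
As a minimal sketch of that grouping, the class below puts two properties into one index by giving both attributes the same IndexName. The Review type, its property names, and the index name are hypothetical and only follow the naming pattern used in the examples above; the namespace import assumes the attribute lives in Marten.Schema.

```csharp
using System;
using Marten.Schema;

// Hypothetical document type; class, property, and index names are illustrative.
public class Review
{
    // Sharing one constant keeps the two attribute values identical.
    private const string SharedIndexName = "mt_custom_review_fts_idx";

    public Guid Id { get; set; }

    [FullTextIndex(IndexName = SharedIndexName)]
    public string Summary { get; set; }

    [FullTextIndex(IndexName = SharedIndexName)]
    public string Body { get; set; }
}
```

Keeping the name in a constant avoids the two attributes drifting apart through a typo, which would otherwise silently produce two separate indexes.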

cs

public class BlogPost
{
    public Guid Id { get; set; }

    public string Category { get; set; }

    [FullTextIndex] public string EnglishText { get; set; }

    [FullTextIndex(RegConfig = "italian")] public string ItalianText { get; set; }

    [FullTextIndex(RegConfig = "french")] public string FrenchText { get; set; }
}

Text Search

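Postgres contains built-in text search functions that allow more sophisticated searching through text fields. Marten lets you define [full text indexes](/documents/configuration/full_text) and query against them. Four kinds of full text search are currently supported, shown below in this order: regular search (Search, which uses to_tsquery), plain text search (PlainTextSearch, which uses plainto_tsquery), phrase search (PhraseSearch, which uses phraseto_tsquery), and web-style search (WebStyleSearch, which uses websearch_to_tsquery and requires Postgres 11 or later):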

cs

var posts = session.Query<BlogPost>()
    .Where(x => x.Search("somefilter"))
    .ToList();

cs

var posts = session.Query<BlogPost>()
    .Where(x => x.PlainTextSearch("somefilter"))
    .ToList();

cs

var posts = session.Query<BlogPost>()
    .Where(x => x.PhraseSearch("somefilter"))
    .ToList();

cs

var posts = session.Query<BlogPost>()
    .Where(x => x.WebStyleSearch("somefilter"))
    .ToList();
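
The four methods differ mainly in the query syntax they accept. The sketch below assumes that Search, PlainTextSearch, PhraseSearch, and WebStyleSearch hand their argument to the Postgres functions to_tsquery, plainto_tsquery, phraseto_tsquery, and websearch_to_tsquery respectively. Note is a hypothetical document type and the search strings are made-up examples, so treat this as an illustration of the syntax differences rather than a definitive reference. Running it requires a configured Marten session.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Marten;

// Hypothetical document type used only for this sketch.
public class Note
{
    public Guid Id { get; set; }
    public string Body { get; set; }
}

public static class SearchSyntaxSketch
{
    public static void Run(IQuerySession session)
    {
        // to_tsquery syntax: terms must be joined with operators (&, |, !, <->).
        List<Note> regular = session.Query<Note>()
            .Where(x => x.Search("quick & fox"))
            .ToList();

        // plainto_tsquery syntax: free text, the words are ANDed together.
        List<Note> plain = session.Query<Note>()
            .Where(x => x.PlainTextSearch("quick fox"))
            .ToList();

        // phraseto_tsquery syntax: the words must follow one another in order.
        List<Note> phrase = session.Query<Note>()
            .Where(x => x.PhraseSearch("quick brown fox"))
            .ToList();

        // websearch_to_tsquery syntax: quoted phrases, "or", and -exclusion.
        List<Note> web = session.Query<Note>()
            .Where(x => x.WebStyleSearch("\"quick fox\" -lazy"))
            .ToList();
    }
}
```

Because to_tsquery expects operators between terms, passing unprocessed user input to Search can raise a syntax error in Postgres; the plain text and web-style variants are more forgiving for that case.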

All types of text search can be combined with other LINQ operators:

cs

var posts = session.Query<BlogPost>()
    .Where(x => x.Category == "LifeStyle")
    .Where(x => x.PhraseSearch("somefilter"))
    .ToList();

Each search method also accepts a language (regConfig) for the text search query; 'english' is used by default:

cs

var posts = session.Query<BlogPost>()
    .Where(x => x.PhraseSearch("somefilter", "italian"))
    .ToList();

Partial text search in a multi-word text (NGram search)

Marten can search for partial text or words within a string containing multiple words by using NGram search. This is similar in functionality to NGrams in Elasticsearch. For example, the term "rich com" accurately matches the text "Communicating Across Contexts (Enriched)". NGram search uses English by default and handles unigrams, bigrams, and trigrams. This functionality was added in v5.
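
To make the terms unigram, bigram, and trigram concrete, here is a small conceptual sketch in plain C#. It is not Marten's implementation: the CharacterNgrams and NgramMatch helpers are hypothetical, and the matching rule (every n-gram of the term must occur in the text) is an assumption chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Conceptual sketch only; not Marten's implementation.
// Returns the distinct character n-grams of sizes 1 to maxSize for each word.
static SortedSet<string> CharacterNgrams(string text, int maxSize = 3)
{
    var grams = new SortedSet<string>(StringComparer.Ordinal);
    var words = text.ToLowerInvariant()
        .Split(' ', StringSplitOptions.RemoveEmptyEntries);

    foreach (var word in words)
    {
        for (var start = 0; start < word.Length; start++)
        {
            for (var size = 1; size <= maxSize && start + size <= word.Length; size++)
            {
                grams.Add(word.Substring(start, size));
            }
        }
    }

    return grams;
}

// Assumed matching rule: every n-gram of the term must also occur in the text.
static bool NgramMatch(string text, string term) =>
    CharacterNgrams(term).IsSubsetOf(CharacterNgrams(text));

// prints "c, co, com, m, o, om"
Console.WriteLine(string.Join(", ", CharacterNgrams("com")));

// prints "True"
Console.WriteLine(NgramMatch("Communicating Across Contexts (Enriched)", "rich com"));
```

Under this rule, "rich com" matches because the grams of "rich" occur inside "(Enriched)" and the grams of "com" occur inside "Communicating", even though neither is a whole word in the text.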

cs

var result = await session
    .Query<User>()
    .Where(x => x.UserName.NgramSearch(term))
    .ToListAsync();

cs

var store = DocumentStore.For(_ =>
{
    _.Connection(Marten.Testing.Harness.ConnectionSource.ConnectionString);
    _.DatabaseSchemaName = "ngram_test";

    // This creates an ngram index for efficient substring-based matching
    _.Schema.For<User>().NgramIndex(x => x.UserName);
});

await store.Storage.ApplyAllConfiguredChangesToDatabaseAsync();

await using var session = store.LightweightSession();

// The search term is taken from the tail of the first generated GUID
string term = null;
for (var i = 1; i < 4; i++)
{
    var guid = $"{Guid.NewGuid():N}";
    term ??= guid.Substring(5);

    var newUser = new User(i, $"Test user {guid}");

    session.Store(newUser);
}

await session.SaveChangesAsync();

var result = await session
    .Query<User>()
    .Where(x => x.UserName.NgramSearch(term))
    .ToListAsync();

cs

var result = await session
    .Query<User>()
    .Where(x => x.Address.Line1.NgramSearch(term))
    .ToListAsync();

NGram search on non-English text (7.39.5)

To use NGram search on non-English text, opt in with storeOptions.Advanced.UseNGramSearchWithUnaccent = true. Marten then applies the Postgres unaccent extension both before creating ngrams and to the search input, which gives a better multilingual experience. See the sample code below:

cs

var store = DocumentStore.For(_ =>
{
   _.Connection(Marten.Testing.Harness.ConnectionSource.ConnectionString);
   _.DatabaseSchemaName = "ngram_test";
   _.Schema.For<User>().NgramIndex(x => x.UserName);
   _.Advanced.UseNGramSearchWithUnaccent = true;
});

await store.Storage.ApplyAllConfiguredChangesToDatabaseAsync();

await using var session = store.LightweightSession();
// The ngram "uðmu" should only exist in bjork; if special characters were ignored, the search would return Umut
var umut = new User(1, "Umut Aral");
var bjork = new User(2, "Björk Guðmundsdóttir");

// The ngram "øre" should only exist in kierkegaard; if special characters were ignored, the search would return Chris Rea
var kierkegaard = new User(3, "Søren Kierkegaard");
var rea = new User(4, "Chris Rea");

session.Store(umut);
session.Store(bjork);
session.Store(kierkegaard);
session.Store(rea);

await session.SaveChangesAsync();

var result = await session
   .Query<User>()
   .Where(x => x.UserName.NgramSearch("uðmu") || x.UserName.NgramSearch("øre"))
   .ToListAsync();

NGram Search Across Multiple Properties (7.39.5)

In many cases, you may want to perform partial text search across multiple fields like UserName, FirstName, and LastName.

A naive approach might be to apply individual Ngram indexes and search each field separately:

Don't do this

This results in multiple indexes per document and requires complex LINQ queries to combine the results, which is inefficient and hard to maintain.

csharp

// Inefficient and verbose
var store = DocumentStore.For(_ =>
{
    _.Connection(ConnectionSource.ConnectionString);

    // Too many indexes
    _.Schema.For<User>().NgramIndex(d => d.UserName);
    _.Schema.For<User>().NgramIndex(d => d.FirstName);
    _.Schema.For<User>().NgramIndex(d => d.LastName);
});

var result = await session
    .Query<User>()
    .Where(x => x.UserName.NgramSearch(term) 
             || x.FirstName.NgramSearch(term) 
             || x.LastName.NgramSearch(term))
    .ToListAsync();

Instead, define a computed property that concatenates the values into a single field, and index that:

csharp

public class User
{
    public Guid Id { get; set; }
    public string UserName { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }

    // Combine searchable fields
    public string SearchString => $"{UserName} {FirstName} {LastName}";
}

Then configure the Ngram index on that property:

csharp

    _.Schema.For<User>().NgramIndex(x => x.SearchString);

This simplifies querying:

csharp

var result = await session
    .Query<User>()
    .Where(x => x.SearchString.NgramSearch(term))
    .ToListAsync();