Better conversion from collection expression - C# feature specifications

Note

This article is a feature specification. The specification serves as the design document for the feature. It includes proposed specification changes, along with information needed during the design and development of the feature. These articles are published until the proposed spec changes are finalized and incorporated in the current ECMA specification.

There may be some discrepancies between the feature specification and the completed implementation. Those differences are captured in the pertinent language design meeting (LDM) notes.

You can learn more about the process for adopting feature speclets into the C# language standard in the article on the specifications.

Champion issue: https://github.com/dotnet/csharplang/issues/8374

Summary

This proposal updates the better conversion rules to be more consistent with params and to better handle current ambiguity scenarios. For example, overloads that take ReadOnlySpan<string> and ReadOnlySpan<object> can currently be ambiguous during overload resolution for the argument [""].
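A minimal sketch of the scenario described above. The class name Overloads and the printed labels are hypothetical, chosen only for illustration. The sketch assumes a compiler that implements the rules in this proposal; under the earlier rules the call is expected to be rejected as ambiguous instead of compiling.

```csharp
using System;

// Both overloads are applicable to a collection expression of string literals.
// Every element converts to string by identity and to object by implicit
// reference conversion, so under the proposed rules the ReadOnlySpan<string>
// overload should be the better candidate.
Overloads.M(["", ""]);

static class Overloads // hypothetical type, not part of the proposal
{
    public static void M(ReadOnlySpan<string> values) => Console.WriteLine("ReadOnlySpan<string>");
    public static void M(ReadOnlySpan<object> values) => Console.WriteLine("ReadOnlySpan<object>");
}
```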

Detailed Design

The following are the better conversion from expression rules. These replace the rules in https://github.com/dotnet/csharplang/blob/main/proposals/csharp-12.0/collection-expressions.md#overload-resolution.

These rules are:

Given an implicit conversion C₁ that converts from an expression E to a type T₁, and an implicit conversion C₂ that converts from an expression E to a type T₂, C₁ is a better conversion than C₂ if one of the following holds:

We add a new definition for better collection conversion from expression, as follows:

Given:

If there is an identity conversion from E₁ to E₂, then the element conversions are as good as each other. Otherwise, the element conversions to E₁ are better than the element conversions to E₂ if:

C₁ is a better collection conversion from expression than C₂ if:

Otherwise, neither collection type is better, and the result is ambiguous.

Note

These rules mean that overloads whose collection parameters have different element types, with no conversion between the collection types, are ambiguous for empty collection expressions. As an example:

public void M(ReadOnlySpan<int> ros) { ... }
public void M(Span<int?> span) { ... }

M([]); // Ambiguous

Scenarios:

In plain English, the collection types themselves must be either the same or unambiguously better (i.e., List<T> and List<T> are the same, List<T> is unambiguously better than IEnumerable<T>, and List<T> and HashSet<T> cannot be compared). The element conversions for the better collection type must also be the same or better (i.e., we can't decide between ReadOnlySpan<object> and Span<string> for [""]; the user needs to make that decision). More examples of this are:

| T₁ | T₂ | E | C₁ Conversions | C₂ Conversions | CE₁ᵢ vs CE₂ᵢ | Outcome |
|---|---|---|---|---|---|---|
| List | List | [1, 2, 3] | [Identity, Identity, Identity] | [Implicit Constant, Implicit Constant, Implicit Constant] | CE₁ᵢ is better | List is picked |
| List | List | [(int)1, (byte)2] | [Identity, Implicit Numeric] | Not applicable | T₂ is not applicable | List is picked |
| List | List | [1, (byte)2] | [Identity, Implicit Numeric] | [Implicit Constant, Identity] | Neither is better | Ambiguous |
| List | List | [(byte)1, (byte)2] | [Implicit Numeric, Implicit Numeric] | [Identity, Identity] | CE₂ᵢ is better | List is picked |
| List<int?> | List | [1, 2, 3] | [Implicit Nullable, Implicit Nullable, Implicit Nullable] | [Implicit Numeric, Implicit Numeric, Implicit Numeric] | Neither is better | Ambiguous |
| List<int?> | List | [1, 2, 3] | [Implicit Nullable, Implicit Nullable, Implicit Nullable] | [Implicit Numeric, Implicit Numeric, Implicit Numeric] | CE₁ᵢ is better | List<int?> is picked |
| List | List | [1, 2, 3] | [Implicit Numeric, Implicit Numeric, Implicit Numeric] | [Implicit Numeric, Implicit Numeric, Implicit Numeric] | CE₁ᵢ is better | List is picked |
| IEnumerable | List | [1, 2, 3] | [Identity, Identity, Identity] | [Implicit Constant, Implicit Constant, Implicit Constant] | CE₁ᵢ is better | IEnumerable is picked |
| IEnumerable | List | [(byte)1, (byte)2] | [Implicit Numeric, Implicit Numeric] | [Identity, Identity] | CE₂ᵢ is better | List is picked |
| int[] | List | [1, 2, 3] | [Identity, Identity, Identity] | [Implicit Constant, Implicit Constant, Implicit Constant] | CE₁ᵢ is better | int[] is picked |
| ReadOnlySpan | ReadOnlySpan | ["", "", ""] | [Identity, Identity, Identity] | [Implicit Reference, Implicit Reference, Implicit Reference] | CE₁ᵢ is better | ReadOnlySpan is picked |
| ReadOnlySpan | ReadOnlySpan | ["", new object()] | Not applicable | [Implicit Reference, Identity] | T₁ is not applicable | ReadOnlySpan is picked |
| ReadOnlySpan | Span | ["", ""] | [Implicit Reference] | [Identity] | CE₂ᵢ is better | Span is picked |
| ReadOnlySpan | Span | [new object()] | [Identity] | Not applicable | T₂ is not applicable | ReadOnlySpan is picked |
| ReadOnlySpan | ReadOnlySpan | [$"{1}"] | [Interpolated String Handler] | [Identity] | CE₁ᵢ is better | ReadOnlySpan is picked |
| ReadOnlySpan | ReadOnlySpan | [$"{"blah"}"] | [Interpolated String Handler] | [Identity] - But constant | CE₂ᵢ is better | ReadOnlySpan is picked |
| ReadOnlySpan | ReadOnlySpan | [$"{1}"] | [Identity] | [Interpolated String] | CE₂ᵢ is better | ReadOnlySpan is picked |
| ReadOnlySpan | ReadOnlySpan | [$"{1}", (FormattableString)null] | Not applicable | [Interpolated String, Identity] | T₁ is not applicable | ReadOnlySpan is picked |
| HashSet | Span | [1, 2] | [Implicit Constant, Implicit Constant] | [Implicit Numeric, Implicit Numeric] | CE₁ᵢ is better | HashSet is picked |
| HashSet | Span | [1, 2] | [Implicit Numeric, Implicit Numeric] | [Implicit Constant, Implicit Constant] | CE₂ᵢ is better | Span is picked |
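The collection-type half of the plain-English rule can be tried directly. The following is a small sketch, not taken from the proposal: the type Demo and the method names Print and Store are hypothetical. It uses the same element type in every overload, so only the comparison between collection types is in play.

```csharp
using System;
using System.Collections.Generic;

// List<int> converts implicitly to IEnumerable<int>, but not the reverse,
// and both element types are int, so the List<int> overload is better.
Demo.Print([1, 2, 3]); // prints "List<int>"

// List<int> and HashSet<int> have no conversion in either direction, so this
// call is expected to be reported as ambiguous. It is left commented out so
// that the sketch compiles.
// Demo.Store([1, 2, 3]);

static class Demo // hypothetical type, for illustration only
{
    public static void Print(List<int> list) => Console.WriteLine("List<int>");
    public static void Print(IEnumerable<int> sequence) => Console.WriteLine("IEnumerable<int>");

    public static void Store(List<int> list) => Console.WriteLine("List<int>");
    public static void Store(HashSet<int> set) => Console.WriteLine("HashSet<int>");
}
```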

Open questions

How far should we prioritize ReadOnlySpan/Span over other types?

As specified today, the following overloads would be ambiguous:

C.M1(["Hello world"]); // Ambiguous, no tiebreak between ROS and List
C.M2(["Hello world"]); // Ambiguous, no tiebreak between Span and List

C.M3(["Hello world"]); // Ambiguous, no tiebreak between ROS and MyList.

C.M4(["Hello", "Hello"]); // Ambiguous, no tiebreak between ROS and HashSet. Created collections have different contents

class C
{
    public static void M1(ReadOnlySpan<string> ros) {}
    public static void M1(List<string> list) {}

    public static void M2(Span<string> ros) {}
    public static void M2(List<string> list) {}

    public static void M3(ReadOnlySpan<string> ros) {}
    public static void M3(MyList<string> list) {}

    public static void M4(ReadOnlySpan<string> ros) {}
    public static void M4(HashSet<string> hashset) {}
}

class MyList<T> : List<T> {}

How far do we want to go here? The List<T> variant seems reasonable, and subtypes of List<T> exist aplenty. But the HashSet version has very different semantics. How sure are we that it's actually "worse" than ReadOnlySpan in this API?
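While no tiebreak exists, a caller can make the choice explicit by casting the collection expression to the intended parameter type. The sketch below is an illustration rather than part of the proposal: it reuses the M4 overload pair in a reduced class C and, as an assumption made only for demonstration, has each overload print the size of the collection it receives. It shows why the HashSet case is semantically different: the same elements produce collections with different contents.

```csharp
using System;
using System.Collections.Generic;

// The cast fixes the target type of the collection expression, so only one
// overload is applicable to each call.
C.M4((ReadOnlySpan<string>)["Hello", "Hello"]); // prints "ReadOnlySpan: 2"
C.M4((HashSet<string>)["Hello", "Hello"]);      // prints "HashSet: 1"

static class C // reduced copy of the class above; the method bodies are illustrative
{
    public static void M4(ReadOnlySpan<string> ros) => Console.WriteLine($"ReadOnlySpan: {ros.Length}");
    public static void M4(HashSet<string> hashset) => Console.WriteLine($"HashSet: {hashset.Count}");
}
```

The span keeps both elements, while the set drops the duplicate, so choosing one overload over the other changes what the callee observes.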