Breaking change: UTF-7 code paths are obsolete

The UTF-7 encoding is no longer in wide use among applications, and many specs now forbid its use in interchange. It's also occasionally used as an attack vector in applications that don't anticipate encountering UTF-7-encoded data. Microsoft warns against use of System.Text.UTF7Encoding because it doesn't provide error detection.

Consequently, the Encoding.UTF7 property and UTF7Encoding constructors are now obsolete. Additionally, Encoding.GetEncoding and Encoding.GetEncodings no longer allow you to specify UTF-7.

Change description

Previously, you could create an instance of the UTF-7 encoding by using the Encoding.GetEncoding APIs. For example:

Encoding enc1 = Encoding.GetEncoding("utf-7"); // By name.
Encoding enc2 = Encoding.GetEncoding(65000); // By code page.

Additionally, an instance that represents the UTF-7 encoding was enumerated by the Encoding.GetEncodings() method, which enumerates all the Encoding instances registered on the system.

Starting in .NET 5, the Encoding.UTF7 property and UTF7Encoding constructors are obsolete and produce warning SYSLIB0001. However, to reduce the number of warnings that callers receive when using the UTF7Encoding class, the UTF7Encoding type itself is not marked obsolete.

// The next line generates warning SYSLIB0001.
UTF7Encoding enc = new UTF7Encoding();
// The next line does not generate a warning.
byte[] bytes = enc.GetBytes("Hello world!");

Additionally, the Encoding.GetEncoding methods treat the encoding name utf-7 and the code page 65000 as unknown. Treating the encoding as unknown causes the method to throw an ArgumentException.

// Throws ArgumentException, same as calling Encoding.GetEncoding("unknown").
Encoding enc = Encoding.GetEncoding("utf-7");
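
Because the lookup now throws for utf-7, code that accepts arbitrary encoding names may want to handle the failure explicitly. The following is a minimal sketch, not part of the .NET API: the class name EncodingLookup and the method GetEncodingOrFallback are hypothetical, and the sketch assumes .NET 5 or later with the default configuration.

```csharp
using System;
using System.Text;

public static class EncodingLookup
{
    // Hypothetical helper: returns the caller-supplied fallback instead of
    // throwing when the name is null or not a known encoding.
    public static Encoding GetEncodingOrFallback(string name, Encoding fallback)
    {
        try
        {
            return Encoding.GetEncoding(name);
        }
        catch (ArgumentException)
        {
            // Covers unknown names, which on .NET 5+ include "utf-7" by default,
            // and a null name (ArgumentNullException derives from ArgumentException).
            return fallback;
        }
    }
}
```

A caller could write EncodingLookup.GetEncodingOrFallback(name, Encoding.UTF8) to treat any unrecognized name, including utf-7, as UTF-8. Whether a silent fallback or a hard failure is more appropriate depends on the application.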

Finally, the Encoding.GetEncodings() method doesn't include the UTF-7 encoding in the EncodingInfo array that it returns. The encoding is excluded because it cannot be instantiated.

foreach (EncodingInfo encInfo in Encoding.GetEncodings())
{
    // The next line would throw if GetEncodings included UTF-7.
    Encoding enc = Encoding.GetEncoding(encInfo.Name);
}
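
To confirm this behavior in a given environment, one option is to look for code page 65000 in the enumerated results. This is an illustrative sketch; the class name Utf7Probe and the method IsUtf7Enumerated are hypothetical names introduced here.

```csharp
using System;
using System.Linq;
using System.Text;

public static class Utf7Probe
{
    // Hypothetical helper: reports whether Encoding.GetEncodings() lists
    // an entry with code page 65000 (UTF-7).
    public static bool IsUtf7Enumerated()
    {
        return Encoding.GetEncodings().Any(info => info.CodePage == 65000);
    }
}
```

On .NET 5 and later with the default configuration, this check is expected to return false.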

Reason for change

Many applications call Encoding.GetEncoding("encoding-name") with an encoding name value that's provided by an untrusted source. For example, a web client or server might take the charset portion of the Content-Type header and pass the value directly to Encoding.GetEncoding without validating it. This could allow a malicious endpoint to specify Content-Type: ...; charset=utf-7, which could cause the receiving application to misbehave.
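
One way to reduce this exposure, regardless of runtime version, is to compare the requested name against an allow list before calling Encoding.GetEncoding. The sketch below is illustrative only: the class name CharsetPolicy, the method Resolve, the list entries, and the choice to fall back to UTF-8 are all assumptions made for the example, not guidance from the original text.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CharsetPolicy
{
    // Hypothetical allow list; a real application would choose its own entries,
    // ideally from configuration.
    private static readonly HashSet<string> AllowedCharsets =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "utf-8",
            "utf-16",
            "us-ascii",
            "iso-8859-1",
        };

    // Resolves a charset value taken from an untrusted source.
    // Names outside the allow list are never passed to Encoding.GetEncoding.
    public static Encoding Resolve(string charset)
    {
        if (charset != null && AllowedCharsets.Contains(charset))
        {
            return Encoding.GetEncoding(charset);
        }

        // Assumption for this sketch: fall back to UTF-8.
        // Rejecting the request outright is an equally valid policy.
        return Encoding.UTF8;
    }
}
```

With this approach, a header such as Content-Type: ...; charset=utf-7 never reaches the encoding lookup, because utf-7 is not on the list.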

Additionally, disabling UTF-7 code paths allows optimizing compilers, such as those used by Blazor, to remove these code paths entirely from the resulting application. As a result, the compiled applications run more efficiently and take less disk space.

Version introduced

5.0

Recommended action

In most cases, you don't need to take any action. However, for apps that have previously activated UTF-7-related code paths, consider the guidance that follows.

To check whether an Encoding instance is UTF-7, compare its code page instead of comparing the instance against Encoding.UTF7:

void DoSomething(Encoding enc)
{
    // Don't perform the check this way.
    // It produces a warning and misses some edge cases.
    if (enc == Encoding.UTF7)
    {
        // Encoding is UTF-7.
    }

    // Instead, perform the check this way.
    if (enc != null && enc.CodePage == 65000)
    {
        // Encoding is UTF-7.
    }
}

To suppress the SYSLIB0001 warning for a single use in code:

#pragma warning disable SYSLIB0001 // Disable the warning.
Encoding enc = Encoding.UTF7;
#pragma warning restore SYSLIB0001 // Re-enable the warning.

To suppress the warning for the whole project, in the project file:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>net5.0</TargetFramework>
    <!-- NoWarn below suppresses SYSLIB0001 project-wide -->
    <NoWarn>$(NoWarn);SYSLIB0001</NoWarn>
  </PropertyGroup>
</Project>

To re-enable support for UTF-7, set the following property in the project file:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>net5.0</TargetFramework>
    <!-- Re-enable support for UTF-7 -->
    <EnableUnsafeUTF7Encoding>true</EnableUnsafeUTF7Encoding>
  </PropertyGroup>
</Project>

In the application's runtimeconfig.template.json file:

{  
  "configProperties": {  
    "System.Text.Encoding.EnableUnsafeUTF7Encoding": true  
  }  
}  
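
To check at run time whether the setting took effect, one option is to read it back through AppContext. This sketch assumes that the runtime exposes configProperties values through AppContext, which is how runtime configuration knobs are generally surfaced; the class name Utf7Switch and the method IsEnabled are hypothetical.

```csharp
using System;

public static class Utf7Switch
{
    // Name of the configuration property shown in the JSON above.
    private const string SwitchName = "System.Text.Encoding.EnableUnsafeUTF7Encoding";

    // Hypothetical helper: returns true only if the switch is present and set to true.
    public static bool IsEnabled()
    {
        return AppContext.TryGetSwitch(SwitchName, out bool enabled) && enabled;
    }
}
```

An app could log the result of Utf7Switch.IsEnabled() at startup so that an unintended re-enablement of UTF-7 is easy to spot.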

Tip
If you re-enable support for UTF-7, you should perform a security review of code that calls Encoding.GetEncoding.

Affected APIs