Introduction
Learn how to set a timeout for regular expressions across an entire application. The reason: a malicious user can enter input, for example in an email address field, that forces a regular expression into excessive backtracking. In a web application this amounts to a denial of service that can bring the application down; in a desktop application it can leave the application unresponsive.
Whether or not a regular expression accepts untrusted data, a developer can set a timeout for all regular expressions in an application. The samples provided show how to set a global timeout for an application, reading the timeout from appsettings.json.
Microsoft docs: Define a time-out value
The source code contains more code than is shown below.
Setting up
Read settings
Add the following to appsettings.json, which sets the default timeout for all regular expressions to one second. Feel free to change the value to milliseconds if one second is not acceptable.
{
"RegularExpressions": {
"Timeout": "00:00:01.000"
}
}
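To get a feel for the values the Timeout entry accepts, here is a small sketch that parses two strings of the same shape with TimeSpan.Parse. The half-second string is a made-up example and not part of the sample project, and it is an assumption that the configuration binder interprets the string the same way TimeSpan.Parse does.

```csharp
using System;
using System.Globalization;

// The value used in appsettings.json above: one second.
TimeSpan oneSecond = TimeSpan.Parse("00:00:01.000", CultureInfo.InvariantCulture);

// Hypothetical alternative: half a second written in the same format.
TimeSpan halfSecond = TimeSpan.Parse("00:00:00.500", CultureInfo.InvariantCulture);

Console.WriteLine(oneSecond.TotalMilliseconds);  // 1000
Console.WriteLine(halfSecond.TotalMilliseconds); // 500
```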
A class/model for reading the Timeout.
/// <summary>
/// Represents a model for handling regular expressions with a configurable timeout.
/// In this case <see cref="Timeout"/> is read from appsettings.json
/// </summary>
public class RegularExpressions
{
/// <summary>
/// Gets or sets the timeout value for regular expressions.
/// </summary>
/// <value>
/// A <see cref="TimeSpan"/> representing the maximum time allowed for a regular expression match to execute.
/// </value>
/// <remarks>
/// This property is decorated with a <see cref="JsonConverterAttribute"/> that specifies the use of <see cref="TimeSpanConverter"/>
/// for JSON serialization and deserialization.
/// </remarks>
[JsonConverter(typeof(TimeSpanConverter))]
public TimeSpan Timeout { get; set; }
}
The timeout is read using the following class.
public static class Configuration
{
/// <summary>
/// Reads a configuration section and converts it to the specified type.
/// </summary>
/// <typeparam name="T">The type to which the configuration section will be converted.</typeparam>
/// <param name="sectionName">The name of the configuration section to read.</param>
/// <returns>An instance of <typeparamref name="T"/> representing the configuration section.</returns>
public static T ReadSection<T>(string sectionName)
=> JsonRoot().GetSection(sectionName).Get<T>();
}
The following method retrieves the timeout.
public static TimeSpan RegexTimeOut()
{
var timeOut = Configuration.ReadSection<RegularExpressions>("RegularExpressions");
return timeOut.Timeout;
}
Set global timeout for an application
The following sets the global timeout.
public static string _timeout => "REGEX_DEFAULT_MATCH_TIMEOUT";
/// <summary>
/// Sets the regular expression timeout value in the application domain data.
/// </summary>
/// <remarks>
/// This method retrieves the timeout value from the configuration and sets it in the application domain data
/// using a predefined key. The timeout value is used to limit the execution time of regular expressions.
/// </remarks>
public static void SetTimeout()
{
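// Store the configured TimeSpan as-is. TimeSpan.Seconds is only the seconds component,
// so rebuilding the value from it would turn a sub-second timeout into zero.
AppDomain.CurrentDomain.SetData(_timeout, RegexTimeOut());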
}
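A detail worth keeping in mind when handing a TimeSpan to SetData: the Seconds property is only the seconds component of the interval, while the Total... properties express the whole interval. The sketch below uses a made-up half-second value to show what happens when a timeout is rebuilt from Seconds instead of being passed along unchanged.

```csharp
using System;

// Hypothetical sub-second timeout, as it might come from configuration.
TimeSpan configured = TimeSpan.FromMilliseconds(500);

// Seconds is the component (0 here), so the rebuilt value collapses to zero.
TimeSpan rebuilt = TimeSpan.FromSeconds(configured.Seconds);

Console.WriteLine(configured.Seconds);           // 0
Console.WriteLine(configured.TotalMilliseconds); // 500
Console.WriteLine(rebuilt == TimeSpan.Zero);     // True
```

Passing the TimeSpan itself avoids the problem for any configured value.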
Get global timeout for an application
/// <summary>
/// Retrieves the regular expression timeout value from the application domain data.
/// </summary>
/// <returns>
/// A <see cref="TimeSpan"/> representing the timeout value if it is set; otherwise, <c>null</c>.
/// </returns>
public static TimeSpan? GetTimeout()
=> (TimeSpan?)AppDomain.CurrentDomain.GetData(_timeout);
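One practical detail, given here as a reading of current .NET behavior rather than something taken from the sample project: the Regex class picks up this value once, the first time the class is used, so the data should be set at startup before any regular expression is created. A minimal sketch with a made-up 500 ms timeout:

```csharp
using System;
using System.Text.RegularExpressions;

// Must run before the first use of Regex in the process.
AppDomain.CurrentDomain.SetData("REGEX_DEFAULT_MATCH_TIMEOUT", TimeSpan.FromMilliseconds(500));

// A Regex created without an explicit timeout reports the global default.
var digits = new Regex(@"\d+");
Console.WriteLine(digits.MatchTimeout); // 00:00:00.5000000

// An explicit per-instance timeout takes precedence over the default.
var explicitTimeout = new Regex(@"\d+", RegexOptions.None, TimeSpan.FromSeconds(2));
Console.WriteLine(explicitTimeout.MatchTimeout); // 00:00:02
```

Checking MatchTimeout on a freshly created Regex is a quick way to confirm that the global value was actually applied.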
Determine if there is a timeout set
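Regex.InfiniteMatchTimeout is a fixed value of -1 millisecond that means no timeout, so comparing it to -1 is always true. To find out whether a global timeout is in effect, check whether a value has been stored under the REGEX_DEFAULT_MATCH_TIMEOUT key; when none has been stored, regular expressions run without a timeout.
/// <summary>
/// Determines whether regular expression operations use the default timeout, which is infinite (no timeout).
/// </summary>
/// <returns>
/// <c>true</c> if no global timeout has been set; otherwise, <c>false</c>.
/// </returns>
public static bool IsDefaultTimeout()
{
return AppDomain.CurrentDomain.GetData(_timeout) is null;
}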
Samples
Here are several samples included in the source code.
Crash-and-burn sample
To keep everything clear, the timeout is set directly in code rather than read from appsettings.json.
In this case the timeout is one second, while the malicious input takes more than 30 seconds to process, which in a web application amounts to a denial of service.
public static void BadSample()
{
AppDomain.CurrentDomain.SetData("REGEX_DEFAULT_MATCH_TIMEOUT", TimeSpan.FromSeconds(1));
try
{
// Takes more than 30s
var isMatch = EmailRegex().IsMatch("[email protected]%20");
}
catch (RegexMatchTimeoutException ex)
{
AnsiConsole.MarkupLine($"[red]Regex Timeout for[/] {ex.Message} after [cyan]{ex.MatchTimeout}[/] elapsed.");
AnsiConsole.MarkupLine("[red]Pattern[/]");
Console.WriteLine(ex.Pattern);
Log.Error(ex,nameof(BadSample));
}
catch (ArgumentOutOfRangeException ex)
{
AnsiConsole.MarkupLine($"[red]{ex.Message}[/]");
Log.Error(ex, nameof(BadSample));
}
}
Serilog dump
[2024-10-27 09:31:52.426 [Error] BadSample
System.Text.RegularExpressions.RegexMatchTimeoutException: The Regex engine has timed out while trying to match a pattern to an input string. This can occur for many reasons, including very large inputs or excessive backtracking caused by nested quantifiers, back-references and other factors.
at System.Text.RegularExpressions.RegexRunner.<CheckTimeout>g__ThrowRegexTimeout|25_0()
at System.Text.RegularExpressions.Generated.<RegexGenerator_g>F4CCF545FEA8210BA650F96F69065B2DEAC44F2CBB643E9156FE00073346BE310__EmailRegex_1.RunnerFactory.Runner.TryMatchAtCurrentPosition(ReadOnlySpan`1 inputSpan) in C:\OED\DotnetLand\VS2022\LanguageFeatures\RegularExpressionsTimeOutApp\obj\Debug\net8.0\System.Text.RegularExpressions.Generator\System.Text.RegularExpressions.Generator.RegexGenerator\RegexGenerator.g.cs:line 1107
at System.Text.RegularExpressions.Generated.<RegexGenerator_g>F4CCF545FEA8210BA650F96F69065B2DEAC44F2CBB643E9156FE00073346BE310__EmailRegex_1.RunnerFactory.Runner.Scan(ReadOnlySpan`1 inputSpan) in C:\OED\DotnetLand\VS2022\LanguageFeatures\RegularExpressionsTimeOutApp\obj\Debug\net8.0\System.Text.RegularExpressions.Generator\System.Text.RegularExpressions.Generator.RegexGenerator\RegexGenerator.g.cs:line 264
at System.Text.RegularExpressions.Regex.ScanInternal(RegexRunnerMode mode, Boolean reuseMatchObject, String input, Int32 beginning, RegexRunner runner, ReadOnlySpan`1 span, Boolean returnNullIfReuseMatchObject)
at System.Text.RegularExpressions.Regex.RunSingleMatch(RegexRunnerMode mode, Int32 prevlen, String input, Int32 beginning, Int32 length, Int32 startat)
at System.Text.RegularExpressions.Regex.IsMatch(String input)
at RegularExpressionsTimeOutApp.Classes.Samples.BadSample() in C:\OED\DotnetLand\VS2022\LanguageFeatures\RegularExpressionsTimeOutApp\Classes\Samples.cs:line 114
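Independent of the global setting, a timeout can also be passed to an individual Regex through its constructor. The sketch below wraps IsMatch in a hypothetical helper, TryIsMatch, that reports a timeout instead of throwing. The nested-quantifier pattern and its input are made up for illustration and are not the EmailRegex from the sample project.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the match result, or null when the match timed out.
static bool? TryIsMatch(Regex regex, string input)
{
    try
    {
        return regex.IsMatch(input);
    }
    catch (RegexMatchTimeoutException)
    {
        return null;
    }
}

var timeout = TimeSpan.FromMilliseconds(250);

// A benign pattern finishes well within the limit.
var digits = new Regex(@"\d+", RegexOptions.None, timeout);
Console.WriteLine(TryIsMatch(digits, "abc123")); // True

// Nested quantifiers like these are prone to excessive backtracking on
// inputs that almost match, so this call is expected to hit the timeout.
var nested = new Regex(@"^(a+)+$", RegexOptions.None, timeout);
bool? result = TryIsMatch(nested, new string('a', 40) + "!");
Console.WriteLine(result is null ? "timed out" : result.ToString());
```

Returning null on a timeout lets the caller treat the input as rejected without letting the exception escape; logging the exception, as BadSample does, remains a reasonable alternative.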
Example without a timeout, timing IsMatch on the author's machine.
public static void BadSampleRaw()
{
var timer = new Stopwatch();
timer.Start();
var isMatch = EmailRegex().IsMatch("[email protected]%20");
timer.Stop();
TimeSpan timeTaken = timer.Elapsed;
Console.WriteLine(isMatch.ToYesNo());
Console.WriteLine($"Time taken: {timeTaken:m\\:ss\\.fff}");
}
Using timeout from appsettings.json
In this case the timeout is one second. It could be expressed in milliseconds, but seconds is used for good measure.
public static void NormalUse()
{
string input = @"\\SomeServer\HTTP\demo1\index.html 4 KB HTML File 2/19/2019 3:48:21 PM 2/19/2019 1:05:53 PM 2/19/2019 1:05:53 PM 5";
const string format = "M/d/yyyy h:mm:ss tt";
MatchCollection matches = DatesRegex().Matches(input);
foreach (Match match in matches)
{
var dateTime = DateTime.ParseExact(match.Value, format, CultureInfo.InvariantCulture);
Console.WriteLine(dateTime);
}
}
Main class
Although in the provided source code the following class is in a console project, to make the code usable in other projects simply create a class library project that can be referenced from web or desktop projects.
public class RegexOperations
{
/// <summary>
/// Retrieves the regular expression timeout value from the configuration.
/// </summary>
/// <returns>
/// A <see cref="TimeSpan"/> representing the timeout value for regular expressions.
/// </returns>
/// <remarks>
/// This method reads the "RegularExpressions" section from the configuration and returns the timeout value specified.
/// </remarks>
public static TimeSpan RegexTimeOut()
{
var timeOut = Configuration.ReadSection<RegularExpressions>("RegularExpressions");
return timeOut.Timeout;
}
public static string _timeout => "REGEX_DEFAULT_MATCH_TIMEOUT";
/// <summary>
/// Sets the regular expression timeout value in the application domain data.
/// </summary>
/// <remarks>
/// This method retrieves the timeout value from the configuration and sets it in the application domain data
/// using a predefined key. The timeout value is used to limit the execution time of regular expressions.
/// </remarks>
public static void SetTimeout()
{
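// Store the configured TimeSpan as-is. TimeSpan.Seconds is only the seconds component,
// so rebuilding the value from it would turn a sub-second timeout into zero.
AppDomain.CurrentDomain.SetData(_timeout, RegexTimeOut());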
}
/// <summary>
/// Retrieves the regular expression timeout value from the application domain data.
/// </summary>
/// <returns>
/// A <see cref="TimeSpan"/> representing the timeout value if it is set; otherwise, <c>null</c>.
/// </returns>
public static TimeSpan? GetTimeout()
=> (TimeSpan?)AppDomain.CurrentDomain.GetData(_timeout);
}
Summary
All regular expressions should be set up with a user-defined timeout rather than relying on the default, which is no timeout, in all project types. Using the provided code helps protect against malicious user input and badly written regular expressions.
Please take time to study the source code to get a good understanding of it.
NuGet packages used.
Top-level Package | Version |
---|---|
ConfigurationLibrary | 1.0.6 |
ConsoleConfigurationLibrary | 1.0.0.4 |
ConsoleHelperLibrary | 1.0.2 |
Microsoft.Extensions.Configuration.Json | 8.0.1 |
Microsoft.Extensions.Options.ConfigurationExtensions | 8.0.0 |
Serilog | 3.1.1 |
Serilog.Extensions.Logging.File | 3.0.0 |
Serilog.Sinks.Console | 5.0.1 |
Serilog.Sinks.File | 5.0.0 |
Spectre.Console | 0.46.0 |