Evaluating String Pattern Matching in .NET: Performance, Readability, and Nuances
Tips on string pattern matching found online usually focus on three basic approaches: the StartsWith and EndsWith methods, the indexer combined with the ^ (index-from-end) operator, and list patterns.
In this article, we’ll take a slightly deeper look and answer a few practical questions: when to use each of these patterns, how they differ in terms of performance and usage scenarios, and whether there are other string-matching patterns worth considering.
We will use BenchmarkDotNet on .NET 10 for the benchmark tests. All benchmarks are run on a Windows 11 Pro machine with a 12th Gen Intel Core i7-1255U processor, 40 GB RAM, using an x64 environment.
Below are three common string pattern matching techniques. One might argue that StartsWith is the most readable simply because it's the most established and familiar. However, the new list pattern is arguably superior in terms of readability, as it appears shorter, cleaner, and more expressive.
Surprisingly, StartsWith is often evaluated based on its string overload called without a StringComparison argument. This is misleading: that overload performs a culture-sensitive comparison, which carries a massive performance penalty. In the benchmarks below it is roughly 1,000 to 3,000 times slower than the ordinal baseline.
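To make the distinction concrete, here is a minimal sketch of the three overload families involved. The helper names (MatchesCultureSensitive, MatchesOrdinal, MatchesChar) are hypothetical and exist only for this illustration. The string overload without a StringComparison argument compares using the current culture, while the other two compare ordinally.

```csharp
using System;

// Hypothetical helper: string overloads without StringComparison.
// These go through the current culture's comparison rules.
static bool MatchesCultureSensitive(string email) =>
    email.StartsWith("a") && email.EndsWith("m");

// Hypothetical helper: same string overloads, but with an explicit ordinal comparison.
static bool MatchesOrdinal(string email) =>
    email.StartsWith("a", StringComparison.Ordinal) &&
    email.EndsWith("m", StringComparison.Ordinal);

// Hypothetical helper: the char overloads, which are always ordinal.
static bool MatchesChar(string email) =>
    email.StartsWith('a') && email.EndsWith('m');

string[] samples = { "", "a@m", "admin@company.com", "wrong_start_and_end" };
foreach (string sample in samples)
{
    bool culture = MatchesCultureSensitive(sample);
    bool ordinal = MatchesOrdinal(sample);
    bool viaChar = MatchesChar(sample);
    Console.WriteLine($"'{sample}': culture={culture}, ordinal={ordinal}, char={viaChar}");
}
```

For plain ASCII inputs like these, all three variants should agree on the result. The difference lies in how much work is done to reach it, and in how inputs with culture-specific characters are treated.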
Now, let us move to the benchmark tests for the following three patterns. We will also add a few more for comparison purposes.
if (email.StartsWith("a") && email.EndsWith("m"))
{
    Console.WriteLine("Starts with a, ends with m");
}

Using StartsWith and EndsWith methods
if (email[0] == 'a' && email[^1] == 'm')
{
    Console.WriteLine("Starts with a, ends with m");
}

Using Indexer Property and ^ Operator
if (email is ['a', .., 'm'])
{
    Console.WriteLine("Starts with a, ends with m");
}

Using List Pattern (C# 11)
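The three snippets above are not strictly interchangeable on edge cases, which matters before comparing their speed. The sketch below runs each one against null, empty, and very short inputs. All helper names (ViaMethods, ViaIndexer, ViaListPattern, Describe) are hypothetical, and ViaMethods uses the char overloads rather than the string overloads shown above.

```csharp
#nullable enable
using System;

// Hypothetical helpers modeled on the three snippets above, with no extra guards.
static bool ViaMethods(string email) => email.StartsWith('a') && email.EndsWith('m');
static bool ViaIndexer(string email) => email[0] == 'a' && email[^1] == 'm';
static bool ViaListPattern(string? email) => email is ['a', .., 'm'];

// Runs a check and reports either its result or the type of exception it threw.
static string Describe(Func<bool> check)
{
    try { return check().ToString(); }
    catch (Exception ex) { return ex.GetType().Name; }
}

string?[] inputs = { "admin@company.com", "am", "a", "", null };
foreach (string? input in inputs)
{
    string label = input is null ? "null" : "'" + input + "'";
    string methods = Describe(() => ViaMethods(input!));
    string indexer = Describe(() => ViaIndexer(input!));
    string list = Describe(() => ViaListPattern(input));
    Console.WriteLine($"{label}: methods={methods}, indexer={indexer}, list={list}");
}

// When the first and last character are the same, a one-character string
// satisfies StartsWith/EndsWith, but the list pattern needs at least two elements.
string single = "a";
Console.WriteLine(single.StartsWith('a') && single.EndsWith('a')); // True
Console.WriteLine(single is ['a', .., 'a']);                       // False
```

The unguarded indexer throws on an empty string, and both the indexer and the method calls throw on null, whereas the list pattern simply evaluates to false. This is why the benchmark version of the indexer check adds a Length > 0 guard.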
First of all, we included the StartsWith overload with a char parameter to determine if it offers a performance advantage. Next, we introduced the method using StringComparison.Ordinal to serve as a baseline. Two Regex-based patterns were incorporated to demonstrate the significant overhead of the regex engine compared to specialized string methods. Finally, we implemented two Span<char> versions to evaluate whether direct memory access provides a substantial performance boost.
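One caveat about the two Regex variants: the pattern ^a.*m$ is not an exact semantic equivalent of the character checks. By default, . does not match a newline, and $ also matches just before a trailing newline. The sketch below illustrates this with the static Regex.IsMatch method for brevity. The stricter pattern (\z plus RegexOptions.Singleline) is an assumption about how one could align the two behaviors and is not part of the benchmark; all helper names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helpers for comparing semantics, not performance.
static bool ViaListPattern(string s) => s is ['a', .., 'm'];

// Same pattern as in the benchmark: '.' stops at '\n', '$' tolerates a trailing '\n'.
static bool ViaBenchmarkPattern(string s) => Regex.IsMatch(s, @"^a.*m$");

// Assumed stricter variant: '.' matches '\n' too, and '\z' only matches at the very end.
static bool ViaStrictPattern(string s) =>
    Regex.IsMatch(s, @"^a.*m\z", RegexOptions.Singleline);

string[] inputs = { "a@m", "a\nm", "am\n" };
foreach (string input in inputs)
{
    string shown = input.Replace("\n", "\\n");
    bool list = ViaListPattern(input);
    bool benchmark = ViaBenchmarkPattern(input);
    bool strict = ViaStrictPattern(input);
    Console.WriteLine($"{shown}: list={list}, benchmarkRegex={benchmark}, strictRegex={strict}");
}
```

For inputs without newlines, such as the benchmark parameters, the variants agree, so the timing comparison remains fair. The difference only matters if the check is applied to multi-line text.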
Beyond standard test values, we included several edge-case scenarios to provide a more comprehensive performance overview.
Now, let us examine the results obtained after running the tests. We shall observe how the different implementations compare in terms of execution time and memory allocation.
using System;
using System.Text.RegularExpressions;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Order;
using BenchmarkDotNet.Running;

BenchmarkRunner.Run<StringPerformanceBenchmark>();

[ShortRunJob]
[MemoryDiagnoser]
[RankColumn]
[Orderer(SummaryOrderPolicy.FastestToSlowest)]
[DisassemblyDiagnoser(printSource: true, maxDepth: 3)]
public partial class StringPerformanceBenchmark
{
    // "LONG_EMAIL" is a placeholder that Setup() expands into a 2002-character string.
    [Params("", "a@m", "admin@company.com", "LONG_EMAIL", "wrong_start_and_end")]
    public string EmailInput { get; set; } = string.Empty;

    private string _email = string.Empty;

    private static readonly Regex EmailRegexGenerated = MyRegex();
    private static readonly Regex EmailRegexCompiled = new(@"^a.*m$", RegexOptions.Compiled);

    [GlobalSetup]
    public void Setup()
    {
        _email = EmailInput == "LONG_EMAIL"
            ? "a" + new string('n', 2000) + "m"
            : EmailInput;
    }

    // String overloads without StringComparison: culture-sensitive.
    [Benchmark]
    public bool UseMethods() =>
        _email.StartsWith("a") && _email.EndsWith("m");

    // String overloads with an explicit ordinal comparison (baseline).
    [Benchmark(Baseline = true)]
    public bool UseMethodsOrdinal() =>
        _email.StartsWith("a", StringComparison.Ordinal) &&
        _email.EndsWith("m", StringComparison.Ordinal);

    // Char overloads.
    [Benchmark]
    public bool ViaCharMethods() =>
        _email.StartsWith('a') && _email.EndsWith('m');

    // Indexer with a length guard so the empty-string case does not throw.
    [Benchmark]
    public bool UseIndexer() =>
        _email.Length > 0 && _email[0] == 'a' && _email[^1] == 'm';

    [Benchmark]
    public bool UseListPattern() => _email is ['a', .., 'm'];

    // List pattern applied to a span over the string.
    [Benchmark]
    public bool ViaSpan()
    {
        ReadOnlySpan<char> span = _email.AsSpan();
        return span is ['a', .., 'm'];
    }

    // Guarded indexer applied to a span over the string.
    [Benchmark]
    public bool ViaSpanDirect()
    {
        var span = _email.AsSpan();
        return span.Length > 0 && span[0] == 'a' && span[^1] == 'm';
    }

    [Benchmark]
    public bool ViaCompiledRegex() =>
        EmailRegexCompiled.IsMatch(_email);

    [Benchmark]
    public bool ViaGeneratedRegex() =>
        EmailRegexGenerated.IsMatch(_email);

    [GeneratedRegex(@"^a.*m$")]
    private static partial Regex MyRegex();
}

As expected, both Regex methods ranked near the bottom in performance. Only the original culture-sensitive StartsWith method was slower. We shall examine the reasons for such poor performance later in this article. What was not anticipated, however, is that the Regex methods do not allocate additional memory.
| Method | Mean (ns) | StdDev | Ratio | Rank | Alloc |
|---|---|---|---|---|---|
| Input: "" (empty string) | |||||
| ViaCharMethods | 0.024 | 0.030 | ×0.11 | 1 | 0 B |
| UseIndexer | 0.028 | 0.024 | ×0.13 | 1 | 0 B |
| UseMethodsOrdinal ★ | 0.212 | 0.030 | ×1.00 | 2 | 0 B |
| UseListPattern | 0.269 | 0.034 | ×1.28 | 3 | 0 B |
| ViaSpan | 0.467 | 0.007 | ×2.23 | 4 | 0 B |
| ViaSpanDirect | 0.526 | 0.029 | ×2.51 | 4 | 0 B |
| ViaGeneratedRegex | 13.155 | 0.218 | ×62.7 | 5 | 0 B |
| ViaCompiledRegex | 18.099 | 0.667 | ×86.3 | 6 | 0 B |
| UseMethods ⚠ | 210.056 | 0.583 | ×1001 | 7 | 0 B |
| Input: "a@m" | |||||
| ViaCharMethods | 0.018 | 0.020 | ×0.07 | 1 | 0 B |
| UseIndexer | 0.253 | 0.071 | ×0.98 | 2 | 0 B |
| UseMethodsOrdinal ★ | 0.259 | 0.021 | ×1.00 | 2 | 0 B |
| UseListPattern | 0.429 | 0.154 | ×1.66 | 3 | 0 B |
| ViaSpanDirect | 1.019 | 0.061 | ×3.95 | 4 | 0 B |
| ViaSpan | 1.319 | 0.072 | ×5.11 | 5 | 0 B |
| ViaGeneratedRegex | 31.297 | 0.070 | ×121 | 6 | 0 B |
| ViaCompiledRegex | 42.401 | 0.044 | ×164 | 7 | 0 B |
| UseMethods ⚠ | 635.781 | 9.901 | ×2464 | 8 | 0 B |
| Input: "admin@company.com" | |||||
| UseMethodsOrdinal ★ | 0.399 | 0.063 | ×1.02 | 1 | 0 B |
| UseListPattern | 0.408 | 0.035 | ×1.04 | 1 | 0 B |
| ViaCharMethods | 0.444 | 0.064 | ×1.13 | 1 | 0 B |
| UseIndexer | 0.695 | 0.044 | ×1.77 | 2 | 0 B |
| ViaSpanDirect | 0.912 | 0.048 | ×2.32 | 3 | 0 B |
| ViaSpan | 1.259 | 0.033 | ×3.20 | 4 | 0 B |
| ViaGeneratedRegex | 32.556 | 0.038 | ×82.9 | 5 | 0 B |
| ViaCompiledRegex | 45.159 | 0.180 | ×115 | 6 | 0 B |
| UseMethods ⚠ | 1 100.989 | 3.132 | ×2803 | 7 | 0 B |
| Input: LONG_EMAIL (2002 chars) | |||||
| ViaCharMethods | 0.009 | 0.016 | ×0.02 | 1 | 0 B |
| UseIndexer | 0.276 | 0.043 | ×0.62 | 2 | 0 B |
| UseListPattern | 0.280 | 0.029 | ×0.63 | 2 | 0 B |
| UseMethodsOrdinal ★ | 0.458 | 0.089 | ×1.03 | 3 | 0 B |
| ViaSpanDirect | 0.472 | 0.030 | ×1.06 | 3 | 0 B |
| ViaSpan | 0.756 | 0.026 | ×1.69 | 4 | 0 B |
| ViaGeneratedRegex | 58.235 | 0.770 | ×130 | 5 | 0 B |
| ViaCompiledRegex | 73.774 | 0.627 | ×165 | 6 | 0 B |
| UseMethods ⚠ | 1 379.540 | 1.682 | ×3088 | 7 | 0 B |
| Input: "wrong_start_and_end" (early exit — Ratio n/a) | |||||
| UseMethodsOrdinal ★ | 0.012 | 0.020 | ? | 1 | 0 B |
| ViaCharMethods | 0.017 | 0.007 | ? | 2 | 0 B |
| UseListPattern | 0.462 | 0.178 | ? | 3 | 0 B |
| UseIndexer | 0.469 | 0.047 | ? | 3 | 0 B |
| ViaSpanDirect | 1.026 | 0.154 | ? | 4 | 0 B |
| ViaSpan | 1.063 | 0.100 | ? | 4 | 0 B |
| ViaGeneratedRegex | 21.663 | 0.016 | ? | 5 | 0 B |
| ViaCompiledRegex | 29.554 | 0.277 | ? | 6 | 0 B |
| UseMethods ⚠ | 213.107 | 0.854 | ? | 7 | 0 B |
★ Baseline · ⚠ culture-aware (missing StringComparison) · Ratio ? = baseline too close to noise floor · Allocated = 0 B for all methods
The top performers are ViaCharMethods, UseIndexer, UseMethodsOrdinal, and UseListPattern. All four run in under a nanosecond, and in several rows the reported StdDev is as large as the mean itself, so their relative order is best read as a tie rather than a strict ranking. The Span<T> implementations did not demonstrate any advantage in these scenarios. Given their lower-level nature and the associated maintenance overhead, that makes them a poor fit here.
At the beginning of our testing, we intentionally included the DisassemblyDiagnoser attribute to analyze the generated assembly code. The output reveals that the fastest methods also generate the most concise instruction sets, ranging from 20 to 100 bytes per method. In contrast, the slower methods produce significantly larger instruction sets, often exceeding 1,000 bytes.
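We can see that with realistic data (non-empty and of standard length), the performance gap between the top performers all but disappears. For admin@company.com, the list pattern (0.408 ns) even edges out the indexer implementation (0.695 ns), although a difference of this size is close to measurement noise. The disassembly below shows why the list pattern is so cheap: it compiles down to a null check, a single length check, and two character comparisons. At the same time, it is significantly more readable than the version using StringComparison.Ordinal.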
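To relate the C# source to the JIT output, the list pattern can be read as roughly the following hand-written check. This is an illustrative equivalent under the assumption that the steps in the disassembly listing map one-to-one onto it, not the compiler's literal lowering. MatchesLowered and MatchesListPattern are hypothetical names.

```csharp
#nullable enable
using System;

// Hypothetical hand-written equivalent of: s is ['a', .., 'm']
static bool MatchesLowered(string? s) =>
    s is not null                  // step 1: null check
    && s.Length >= 2               // step 2: one length check covers both accesses
    && s[0] == 'a'                 // step 3: first character
    && s[s.Length - 1] == 'm';     // step 4: last character

static bool MatchesListPattern(string? s) => s is ['a', .., 'm'];

string?[] inputs = { null, "", "a", "am", "admin@company.com", "wrong_start_and_end" };
foreach (string? input in inputs)
{
    string label = input ?? "null";
    bool lowered = MatchesLowered(input);
    bool pattern = MatchesListPattern(input);
    Console.WriteLine($"'{label}': lowered={lowered}, pattern={pattern}");
}
```

Both functions should return the same result for every input. The list pattern gets the null and length guards for free, while the hand-written version has to spell them out.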
; StringPerformanceBenchmark.UseListPattern()
; Method signature: public bool UseListPattern() => _email is ['a', .., 'm'];
; 1. Load string reference and perform Null Check
mov rax,[rcx+10] ; Load _email object reference into rax
test rax,rax ; Check if the reference is null
je short M00_L00 ; If null, jump to return false
; 2. Length Validation (Safety Check)
mov ecx,[rax+8] ; Load string length (stored at offset +8) into ecx
cmp ecx,2 ; Check if length is at least 2
jl short M00_L00 ; If length < 2, the pattern cannot match; return false
; 3. Check First Character ('a' = 0x61)
cmp word ptr [rax+0C],61 ; Compare character at offset +0C (first char) with 'a'
jne short M00_L00 ; If not equal, jump to return false
; 4. Check Last Character ('m' = 0x6D)
dec ecx ; Decrement length to get the index of the last character
cmp word ptr [rax+rcx*2+0C],6D ; Calculate address and compare last char with 'm'
; 5. Finalize Result
sete al ; Set al to 1 if equal, 0 otherwise
movzx eax,al ; Zero-extend result to eax
ret ; Return result
M00_L00:
xor eax,eax ; Standard "Return False" block
ret ; Return 0

To be continued...