My team observed a performance regression in Directory.EnumerateFiles in .NET 8. We store a large number of database redo logs in a single folder; each file is named with a prefix followed by a log generation number (e.g., E10ABCD1234.log). Our goal is to determine the maximum log generation number efficiently.
To achieve this, we developed a fast search algorithm that looks for the highest-generation log file hierarchically: it probes from E10F*******.log down to E100*******.log to fix the first digit, then repeats the same probing for each following digit until the last one. We use Directory.EnumerateFiles(directory, filter) to detect matching files.
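The digit-by-digit probing described above can be sketched roughly as follows. This is a minimal illustration, not the code from the report: the class and method names, the `E10` prefix default, the assumption of exactly 8 uppercase hex digits, and the use of a trailing `*` wildcard in the filter are all assumptions made for the example.

```csharp
using System;
using System.IO;
using System.Linq;

static class LogGenerationSearch
{
    // Hex digits ordered from highest to lowest, so the first hit is the maximum.
    const string HexDigits = "FEDCBA9876543210";

    // Hypothetical helper: returns the highest generation as a hex string,
    // or null when no matching log file exists.
    // Assumes every log is named <prefix><digitCount uppercase hex digits>.log.
    public static string FindHighestGeneration(
        string directory, string prefix = "E10", int digitCount = 8)
    {
        string known = prefix;
        for (int position = 0; position < digitCount; position++)
        {
            string next = null;
            foreach (char digit in HexDigits)
            {
                string candidate = known + digit;
                // One EnumerateFiles call per probe; Any() stops at the first match.
                if (Directory.EnumerateFiles(directory, candidate + "*.log").Any())
                {
                    next = candidate;
                    break;
                }
            }
            if (next == null)
            {
                return null; // nothing matches the digits fixed so far
            }
            known = next;
        }
        return known.Substring(prefix.Length);
    }
}
```

With this shape, the search issues at most 16 probes per digit, so the number of `EnumerateFiles` calls stays small regardless of how many files the folder holds. Whether each probe is cheap depends entirely on how the runtime applies the filter.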
We compared the performance of our search algorithm against a direct enumeration of all files to get the maximum generation number.
The direct enumeration takes approximately 0.5 seconds on both .NET Framework and .NET 8.
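For comparison, a direct enumeration baseline might look like the sketch below. It is an assumption about what "direct enumeration" means here (one pass over every matching file, parsing the hex generation from each name); the class name, method name, and `E10` prefix default are illustrative only.

```csharp
using System;
using System.Globalization;
using System.IO;

static class LogGenerationScan
{
    // Hypothetical baseline: enumerate every log once and keep the largest generation.
    // Returns null when no file with a parsable hex generation is found.
    public static long? FindHighestGeneration(string directory, string prefix = "E10")
    {
        long? highest = null;
        foreach (string path in Directory.EnumerateFiles(directory, prefix + "*.log"))
        {
            string name = Path.GetFileNameWithoutExtension(path);
            if (name.Length <= prefix.Length)
            {
                continue; // no generation digits after the prefix
            }

            string digits = name.Substring(prefix.Length);
            // Parse the digits as hexadecimal; names that do not parse are skipped.
            if (long.TryParse(digits, NumberStyles.AllowHexSpecifier,
                              CultureInfo.InvariantCulture, out long generation)
                && (highest == null || generation > highest.Value))
            {
                highest = generation;
            }
        }
        return highest;
    }
}
```

This version touches every directory entry exactly once, which is why its cost grows with the number of files but does not depend on how the runtime evaluates the filter.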
Our search algorithm is significantly faster on .NET Framework, with a baseline of about 4 milliseconds for 400,000 log files, but it degrades badly on .NET 8, where the same search takes about 11.4 seconds. Below are the detailed performance metrics.
| Runtime | # of Logs | Elapsed Time | # of Enumerations |
| --- | --- | --- | --- |
| .NET Framework 4.8.9282.0 | 1000 | 00:00:00.0040757 | 99 |
| .NET Framework 4.8.9282.0 | 10000 | 00:00:00.0051796 | 115 |
| .NET Framework 4.8.9282.0 | 100000 | 00:00:00.0039643 | 101 |
| .NET Framework 4.8.9282.0 | 400000 | 00:00:00.0042406 | 101 |
| .NET 8.0.11 | 1000 | 00:00:00.0341829 | 99 |
| .NET 8.0.11 | 10000 | 00:00:00.3642297 | 115 |
| .NET 8.0.11 | 100000 | 00:00:02.7944484 | 101 |
| .NET 8.0.11 | 400000 | 00:00:11.4074592 | 101 |
Previously, on .NET Framework, Directory.EnumerateFiles used FindFirstFile, which takes the search filter and returns only the first/next matching file.
This implementation changed in .NET Core/6/8:
Directory.EnumerateFiles now enumerates all files in the directory and checks whether each one matches the filter. This means FindHighestGenerationLogFileFastV2 scans the whole directory once per probe, i.e., many times per search.
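As a rough consistency check on this explanation: if every probe scans the whole directory, elapsed time should scale with (number of enumerations) × (number of logs). Dividing the .NET 8 timings reported above by that product gives a per-entry cost that stays in a narrow band, which is what a full scan per probe would predict. The symbols below are introduced only for this illustration.

```latex
\begin{aligned}
t_{\text{entry}} &\approx \frac{T_{\text{elapsed}}}{N_{\text{enum}} \times N_{\text{logs}}} \\
1{,}000 \text{ logs:} &\quad \frac{0.0341829\ \text{s}}{99 \times 1{,}000} \approx 345\ \text{ns} \\
10{,}000 \text{ logs:} &\quad \frac{0.3642297\ \text{s}}{115 \times 10{,}000} \approx 317\ \text{ns} \\
100{,}000 \text{ logs:} &\quad \frac{2.7944484\ \text{s}}{101 \times 100{,}000} \approx 277\ \text{ns} \\
400{,}000 \text{ logs:} &\quad \frac{11.4074592\ \text{s}}{101 \times 400{,}000} \approx 282\ \text{ns}
\end{aligned}
```

On .NET Framework the same ratio is not constant: elapsed time stays near 4 ms from 1,000 to 400,000 logs, which fits the filter being applied before entries are returned to the caller.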
Relevant source: runtime/src/libraries/System.Private.CoreLib/src/System/IO/Enumeration/FileSystemEnumerator.Windows.cs at 6362f242fc6e3065948b3bf922406509cb721a73 (dotnet/runtime)