-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change Brotli Optimal CompressionLevel value #57440
Conversation
Tagging subscribers to this area: @dotnet/area-system-io-compression Issue DetailsFix #46595
|
@stephentoub since this is a breaking change I am moving it to 7.0.0 |
There are two changes here:
We can delete (2) from this PR as it doesn't do anything. (1) can be considered for 6. |
I removed the validation and kept the I opened another PR to validate the CompresstionLevel argument to be a defined value: #57561 |
/azp run |
You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change looks solid, just want to make sure the CI passes, or if there are unit tests we need to adjust.
@carlossanlop CI is green. Apparently no tests rely on the specific value. Is there anything else you wanted to confirm before we approve and merge this? |
Could we re-validate this has the expected impact in terms of a good default balance between size and speed on various inputs? |
Sure! 6 is a much better value than 11, but it's not the optimal value. I tested all the Google test files against values from 4 to 8, and concluded that a CompressionLevel of 5 is the value that offers the best balance between compression and speed. 10x10yWinners: 5, 6
64xWinners: 4, 5, 6
backward65536Winners: 4, 5, 6
compressed_fileWinners: 4, 5
compressed_repeatedWinners: 4, 5, 6
emptyWinners: 4, 5, 6
mapsdatazrhWinners: 4
monkeyWinners: 5, 6
quickfoxWinners: 5, 6
quickfox_repeatedWinners: 5
random_org_10k.binWinners: 4, 5, 6
xWinners: current, 4, 5, 6
ukkonooaWinners: 4
xyzzyWinners: 4, 5, 6
zerosWinners: 5, 6
Final results
Here's the code I used to generate the above tables. I ran it as a unit test inside BrotliGoogleTestData.cs. private static IEnumerable<string> GetGoogleTestPaths()
{
yield return Path.Combine("UncompressedTestFiles", "GoogleTestData", "10x10y");
yield return Path.Combine("UncompressedTestFiles", "GoogleTestData", "64x");
yield return Path.Combine("UncompressedTestFiles", "GoogleTestData", "backward65536");
yield return Path.Combine("UncompressedTestFiles", "GoogleTestData", "compressed_file");
yield return Path.Combine("UncompressedTestFiles", "GoogleTestData", "compressed_repeated");
yield return Path.Combine("UncompressedTestFiles", "GoogleTestData", "empty");
yield return Path.Combine("UncompressedTestFiles", "GoogleTestData", "mapsdatazrh");
yield return Path.Combine("UncompressedTestFiles", "GoogleTestData", "monkey");
yield return Path.Combine("UncompressedTestFiles", "GoogleTestData", "quickfox");
yield return Path.Combine("UncompressedTestFiles", "GoogleTestData", "quickfox_repeated");
yield return Path.Combine("UncompressedTestFiles", "GoogleTestData", "random_org_10k.bin");
yield return Path.Combine("UncompressedTestFiles", "GoogleTestData", "x");
yield return Path.Combine("UncompressedTestFiles", "GoogleTestData", "ukkonooa");
yield return Path.Combine("UncompressedTestFiles", "GoogleTestData", "xyzzy");
yield return Path.Combine("UncompressedTestFiles", "GoogleTestData", "zeros");
}
private static string GetLevel(int level)
{
return level switch
{
0 => "CL.Optimal (11) => 0",
1 => "CL.Fastest (1) => 1",
2 => "CL.NoCompression (0) => 2",
3 => "CL.SmallestSize (11) => 3",
4 => "Two below suggested => 4",
5 => "One below suggested => 5",
6 => "Suggested optimal => 6",
7 => "One above suggested => 7",
8 => "Two above suggested => 8",
11 => "Current optimal => 11",
_ => throw new Exception()
};
}
private int[] Levels = new[] {
0, // CompressionLevel.Optimal (11)
1, // CompressionLevel.Fastest (1)
2, // CompressionLevel.NoCompression (0)
3, // CompressionLevel.SmallestSize (11)
4, // Two below suggested optimal
5, // One below suggested optimal
6, // Suggested optimal
7, // One above suggested optimal
8, // Two above suggested optimal
11 // Current optimal (should have same result as 0)
};
private const int Pad = 25;
private string[] Headers = new[] {
"File".PadLeft(Pad),
"CompressionLevel".PadLeft(Pad),
"Uncomp. Length (B)".PadLeft(Pad),
"Comp. Length (B)".PadLeft(Pad),
"Diff. Size (%)".PadLeft(Pad),
"Time (ms)".PadLeft(Pad)
};
[Fact]
public void Test()
{
Stopwatch sw = new Stopwatch();
Console.WriteLine("|{0}|", string.Join('|',Headers));
Console.WriteLine("|{0}|", string.Join('|', Enumerable.Repeat<string>(new string('-', Pad), Headers.Length)));
long compressedLength;
foreach (string filePath in GetGoogleTestPaths())
{
byte[] bytes = File.ReadAllBytes(filePath);
foreach (int level in Levels)
{
using (var memoryStream = new MemoryStream())
{
sw.Restart();
using (var brotliStream = new BrotliStream(memoryStream, (CompressionLevel)level, leaveOpen: true))
{
brotliStream.Write(bytes, 0, bytes.Length);
}
sw.Stop();
memoryStream.Position = 0;
compressedLength = memoryStream.Length;
memoryStream.Dispose();
}
double percent = Math.Round(compressedLength * 100.0 / bytes.Length, 4);
var cols = new string[] {
$"{Path.GetFileName(filePath), Pad}",
$"{GetLevel(level), Pad}",
$"{bytes.Length, Pad}",
$"{compressedLength, Pad}",
$"{percent, Pad}",
$"{sw.ElapsedMilliseconds, Pad}"
};
Console.WriteLine("|{0}|", string.Join('|', cols));
}
}
} |
@i3arnon @jozkee @adamsitnik @stephentoub are the above results good enough to decide using 5 as Optimal? |
I have two concerns:
|
Nice, before I looked at this comment, I made an BrotliOptions type that then I harcoded to level 5 being the optimal value 👌. As that change will be a breaking change, I feel that this one should be merged first. Note: my changes deletes that function as the range checking was moved to the BrotliOptions type I made. |
@@ -9,7 +9,7 @@ internal static partial class BrotliUtils | |||
public const int WindowBits_Default = 22; | |||
public const int WindowBits_Max = 24; | |||
public const int Quality_Min = 0; | |||
public const int Quality_Default = 11; | |||
public const int Quality_Default = 6; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public const int Quality_Default = 6; | |
public const int Quality_Default = 5; |
I've created a new issue and described everything that needs to be done to change the default value: #64185 (comment) Since this PR is not complete (lack of good data that explains the decision), I am going to close it. Thanks |
Fix #46595