Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

List to Array Failing in Benchmark Tests #9

Open
Kaotic3 opened this issue Aug 28, 2022 · 6 comments
Open

List to Array Failing in Benchmark Tests #9

Kaotic3 opened this issue Aug 28, 2022 · 6 comments

Comments

@Kaotic3
Copy link

Kaotic3 commented Aug 28, 2022

Using the following code:

List<string> uniqueWords = new();
static string? parsedTermsComplete;

public void AttemptOne()
{
    var wordArray = uniqueWords.ToArray();
    var keyWords = new AhoCorasickTree(wordArray);
    var keywordsPositions = keyWords.Search(parsedTermsComplete).ToList();
    // var result = keyWords.Contains(parsedTermsComplete!); - alternative still fails.
}

I get the following error:

System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation.
---> System.ArgumentException: should contain keywords
at AhoCorasick.Net.AhoCorasickTree..ctor(String[] keywords)
at RegexTesting.Program.Attempts.AttemptOne() in C:\Demo\RegexTesting\RegexTesting\Program.cs:line 1058
at BenchmarkDotNet.Autogenerated.Runnable_0.WorkloadActionNoUnroll(Int64 invokeCount) in C:\Demo\RegexTesting\RegexTesting\bin\Release\net6.0\b224fdac-806c-4a65-a8ce-8efc0ea02b10\b224fdac-806c-4a65-a8ce-8efc0ea02b10.notcs:line 318
at BenchmarkDotNet.Engines.Engine.RunIteration(IterationData data)
at BenchmarkDotNet.Engines.EngineFactory.Jit(Engine engine, Int32 jitIndex, Int32 invokeCount, Int32 unrollFactor)
at BenchmarkDotNet.Engines.EngineFactory.CreateReadyToRun(EngineParameters engineParameters)
at BenchmarkDotNet.Autogenerated.Runnable_0.Run(IHost host, String benchmarkName) in C:\Demo\RegexTesting\RegexTesting\bin\Release\net6.0\b224fdac-806c-4a65-a8ce-8efc0ea02b10\b224fdac-806c-4a65-a8ce-8efc0ea02b10.notcs:line 175
--- End of inner exception stack trace ---
at System.RuntimeMethodHandle.InvokeMethod(Object target, Span`1& arguments, Signature sig, Boolean constructor, Boolean wrapExceptions)
at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
at System.Reflection.MethodBase.Invoke(Object obj, Object[] parameters)
at BenchmarkDotNet.Autogenerated.UniqueProgramName.AfterAssemblyLoadingAttached(String[] args) in C:\Demo\RegexTesting\RegexTesting\bin\Release\net6.0\b224fdac-806c-4a65-a8ce-8efc0ea02b10\b224fdac-806c-4a65-a8ce-8efc0ea02b10.notcs:line 58

The following code works in the same benchmark suite - so it isn't that the list is empty....

public void AttemptTwo()
{
    var wordArray = uniqueWords.ToArray();
    int i = uniqueWords.Count - 1;
    foreach (var item in wordArray)
    {
        var keyWords = new AhoCorasickTree(new[] { item });
        if (keyWords.Contains(parsedTermsComplete))
        {
            uniqueWords.RemoveAt(i);
        }
        i--;
    }
}

Not sure if I have misunderstood the implementation, but wanted to raise it as I thought that this would work.

For implementation, parsedTermsComplete should contain several copies of every single word in the uniqueWords List - as that is how the parsed terms were created - I was looking to find a way to QC that every single word in uniqueWords did in fact exist in the parsedTermsComplete.

Don't get me wrong, the implementation that works is super fast, 3 times faster than any other implementation of the test - just wondered why I can't use a List to Array.

@Kaotic3
Copy link
Author

Kaotic3 commented Aug 29, 2022

Just as an addition - I took this out of the benchmark and put it into the actual project.

Worked just fine. Just fails in Benchmarks.

@alexandrnikitin
Copy link
Owner

Hi @Kaotic3,

Thank you for the issue. It throws the exception because the constructor gets an empty array of keywords. See here

if (keywords.Length == 0) throw new ArgumentException("should contain keywords");

I assume uniqueWords hence wordArray is empty. How do you add items to the list?

@Kaotic3
Copy link
Author

Kaotic3 commented Aug 29, 2022

Hi @alexandrnikitin thanks for the response.

These two methods appear exactly like this in the benchmark test:

[MemoryDiagnoser]
public class Attempts
{
    [Benchmark]
    public void AttemptOne()
    {
        var wordArray = uniqueWords.ToArray();
        var keyWords = new AhoCorasickTree(wordArray);
        var resultTest = keyWords.Contains(parsedTermsComplete!);
    }
    [Benchmark]
    public void AttemptTwo()
    {
        var wordArray = uniqueWords.ToArray();
        int i = uniqueWords.Count - 1;
        foreach (var item in wordArray)
        {
            var keyWords = new AhoCorasickTree(new[] { item });
            if (keyWords.Contains(parsedTermsComplete))
            {
                uniqueWords.RemoveAt(i);
            }
            i--;
        }
    }
}

I am not sure why it would work in AttemptTwo but not in AttemptOne if there was a problem with the list?

Also to add, as noted above - in the Program.cs main method, I placed AttemptOne in a class and then sent the results there instead of running the Benchmark -

var result = TermsParserChecks.FullContent(parsedTermsComplete, uniqueWords);
//var result = BenchmarkRunner.Run<Attempts>();

That runs fine and provides the expected result. I don't think the list is empty, but it does appear that it is being seen as empty when being run in a Benchmark?

@alexandrnikitin
Copy link
Owner

Yes, it looks like the uniqueWords list is empty and not populated. Please populate it in the benchmark constructor or Setup method. See Benchmark.NET docs.
It doesn't throw in the AttemptTwo method because it doesn't execute logic inside foreach block (because the list is empty).

@Kaotic3
Copy link
Author

Kaotic3 commented Aug 29, 2022

Ok it seems that the second one will run while empty...and the benchmark is not taking the list with any items in it at all. The list definitely has items but doesn't have any in the benchmark - I will need to investigate that one.

Sorry @alexandrnikitin - didn't mean to waste your time!

@alexandrnikitin
Copy link
Owner

Yes, it looks like the list is populated outside of the benchmark.
No problem. Feel free to ask any question.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants