selenium [dotnet] Added function for sending multiple DevTools command

Description

This PR introduces a SendCommands function to DevTools session, which is able to sent multiple Devtools commands and wait till all the commands received an answer. The singular SendCommand function will get blocked because the commands running against that target will be blocked until runtime.runIfWaitingForDebugger is invoked. By bundling the commands we can get around that issue.

This PR is a continuation of https://github.com/SeleniumHQ/selenium/pull/13330

For demonstration purposes I've included a sample script.

    private static async Task Main(string[] args)
    {
        var driverService = ChromeDriverService.CreateDefaultService(FindChromeDriverPath());
        var driver = new ChromeDriver(driverService);

        // create a new DevTools session with WaitForDebugger on start enabled
        var devToolsSession = driver.GetDevToolsSession(new DevToolsOptions
        {
            WaitForDebuggerOnStart = true,
        });

        // create an event handler for the TargetAttached event
        devToolsSession.Domains.Target.TargetAttached += (object? sender, TargetAttachedEventArgs e) =>  
        {
            var commands = new List<DevToolsCommandSettings>();

            //enable domains for page targets to enable generating CDP events.
            if (string.Equals("page", e.TargetInfo.Type, StringComparison.InvariantCultureIgnoreCase))
            {
                var enablePage = new OpenQA.Selenium.DevTools.V120.Page.EnableCommandSettings();
                var enableNetwork = new OpenQA.Selenium.DevTools.V120.Network.EnableCommandSettings();
                var enableRuntime = new OpenQA.Selenium.DevTools.V120.Runtime.EnableCommandSettings();

                commands.Add(new DevToolsCommandSettings(enablePage.CommandName) { SessionId = e.SessionId, CommandParameters = JToken.FromObject(enablePage) });
                commands.Add(new DevToolsCommandSettings(enableNetwork.CommandName) { SessionId = e.SessionId, CommandParameters = JToken.FromObject(enableNetwork) });
                commands.Add(new DevToolsCommandSettings(enableRuntime.CommandName) { SessionId = e.SessionId, CommandParameters = JToken.FromObject(enableRuntime) });
            }

            // in case the target is halted, we want to instruct DevTools to continue running the target
            if (e.WaitingForDebugger)
            {
                var runIfWaitingForDebugger = new OpenQA.Selenium.DevTools.V120.Runtime.RunIfWaitingForDebuggerCommandSettings();

                commands.Add(new DevToolsCommandSettings(runIfWaitingForDebugger.CommandName) { SessionId = e.SessionId, CommandParameters = JToken.FromObject(runIfWaitingForDebugger) });
            }

            // sent all commands in one go using the newly introduced SendCommands function
            if (commands.Any())
            {
                // We use Task.Run to prevent blocking other events from Devtools.
                Task.Run(() => devToolsSession.SendCommands(commands, millisecondsTimeout: (int)TimeSpan.FromSeconds(5).TotalMilliseconds, throwExceptionIfResponseNotReceived: false));
            }
        };      

        // perform a navigate
        driver.Navigate().GoToUrl("https://testtarget.uptrends.net/Tab");

        // the page contains a button to open a new tab
        var openTabButton = driver.FindElement(By.XPath("/html/body/div/div[2]/div/a[1]")); 

        // upon opening the new tab, a target attached event is raised. When placing a breakpoint in the event handler you notice the new tab not doing anything until issues the runIfWaitingForDebugger command.
        openTabButton.Click();

        Thread.Sleep(10000);

        driver.CloseDevToolsSession();
        driver.Dispose();
    }

Note that when enabling the WaitForDebuggerOnStart option, all targets will be halted, this includes iframes and service workers for instance. Therefore you should always issue the runIfWaitingForDebugger command when this feature is enabled.

Motivation and Context

When a new target (new tab for instance) is created DevTools continues execution based on the WaitForDebugger property set in the SetAutoAttach feature. When the WaitForDebugger property is set to false DevTools continues without waiting the consumer to properly enable all domains on the target. For instance when being interested in network events from the network domain we need to invoke Network.enable for that specific target. The current behavior causes some of the network events to be missing due to Selenium enabling the domains after the target already performed network requests.

By allowing consumers to enable the WaitForDebugger property we can enable all domains and tell DevTools whenever we are ready by using the Run.runIfWaitingForDebugger command.

Types of changes

[ ] Bug fix (non-breaking change which fixes an issue)
[x] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to change)

Checklist

[x] I have read the contributing document.
[ ] My change requires a change to the documentation.
[ ] I have updated the documentation accordingly.
[ ] I have added tests to cover my changes. => I would like to, but I have not yet succeeded in getting the test suite to run.
[ ] All new and existing tests passed.

Feb 02 '24 15:02 EdwinVanVliet

This is something strange, and based what I see in changing API, I am strongly opiniated this should not be a part of selenium code base. Selenium provides a method to send single cdp command, I don't see problems how user might send multiple commands, even in parallel, up to user. But selenium should be aware of thread-safety.

To be on the same page let's start from a simple problem: reproducible real scenario where selenium behaves incorrectly. And then we will see where the issue is and how we can resolve it.

Feb 05 '24 17:02 nvborisenko

Can we fill out an issue to go with this PR with all the details? Thanks!

Feb 05 '24 18:02 titusfortner

@nvborisenko

I've rewritten the example a bit to demonstrate the issue I'm facing.

Added the async modifier to the event handler and now use the singular SendCommand functions. Note that we use the SessionId argument from the event itself, therefore can't use the domains.

Running this code will result in the first devToolsSession.SendCommand to hang until the timeout (30 secs) expired. Might be the thread is locked because it's not an async event handler to begin with?

        // create an event handler for the TargetAttached event
        devToolsSession.Domains.Target.TargetAttached += async (object? sender, TargetAttachedEventArgs e) =>
        {
            //enable domains for page targets to enable generating CDP events.
            if (string.Equals("page", e.TargetInfo.Type, StringComparison.InvariantCultureIgnoreCase))
            {
                var enablePage = new OpenQA.Selenium.DevTools.V120.Page.EnableCommandSettings();
                var enableNetwork = new OpenQA.Selenium.DevTools.V120.Network.EnableCommandSettings();
                var enableRuntime = new OpenQA.Selenium.DevTools.V120.Runtime.EnableCommandSettings();

                await devToolsSession.SendCommand(enablePage.CommandName, e.SessionId, JToken.FromObject(enablePage));
                await devToolsSession.SendCommand(enablePage.CommandName, e.SessionId, JToken.FromObject(enableNetwork));
                await devToolsSession.SendCommand(enablePage.CommandName, e.SessionId, JToken.FromObject(enableRuntime));
            }

            // in case the target is halted, we want to instruct DevTools to continue running the target
            if (e.WaitingForDebugger)
            {
                var runIfWaitingForDebugger = new OpenQA.Selenium.DevTools.V120.Runtime.RunIfWaitingForDebuggerCommandSettings();

                await devToolsSession.SendCommand(runIfWaitingForDebugger.CommandName, e.SessionId, JToken.FromObject(runIfWaitingForDebugger));
            }
        };

Fine by me if we make a separate issue from that, and close this PR.

Feb 06 '24 08:02 EdwinVanVliet

@EdwinVanVliet can it be even more simplified?

It would be really helpful to see the entire code, just to copy/paste and see a problem.

PS:

await devToolsSession.SendCommand(enablePage.CommandName, e.SessionId, JToken.FromObject(enablePage));
await devToolsSession.SendCommand(enablePage.CommandName, e.SessionId, JToken.FromObject(enableNetwork));

Seems enablePage.CommandName is used for both commands...

Feb 06 '24 19:02 nvborisenko

@nvborisenko

I think the problem is related to invoking a command while handling the event. I also tried converting the TargetAttached event to async, but without succes. Keeps hanging on sending the first command to DevTools.

using Newtonsoft.Json.Linq;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.DevTools;

internal class Program
{
    private static async Task Main(string[] args)
    {
        var driverService = ChromeDriverService.CreateDefaultService("{PATH HERE}\\chromedriver.exe");
        var driver = new ChromeDriver(driverService);

        // create a new DevTools session with WaitForDebugger on start enabled
        var devToolsSession = driver.GetDevToolsSession(new DevToolsOptions
        {
            WaitForDebuggerOnStart = true,
        });

        // create an event handler for the TargetAttached event
        devToolsSession.Domains.Target.TargetAttached += async (object? sender, TargetAttachedEventArgs e) =>
        {
            //enable domains for page targets to enable generating CDP events.
            if (string.Equals("page", e.TargetInfo.Type, StringComparison.InvariantCultureIgnoreCase))
            {
                var enablePage = new OpenQA.Selenium.DevTools.V120.Page.EnableCommandSettings();
                var enableNetwork = new OpenQA.Selenium.DevTools.V120.Network.EnableCommandSettings();
                var enableRuntime = new OpenQA.Selenium.DevTools.V120.Runtime.EnableCommandSettings();

                await devToolsSession.SendCommand(enablePage.CommandName, e.SessionId, JToken.FromObject(enablePage));
                await devToolsSession.SendCommand(enableNetwork.CommandName, e.SessionId, JToken.FromObject(enableNetwork));
                await devToolsSession.SendCommand(enableRuntime.CommandName, e.SessionId, JToken.FromObject(enableRuntime));
            }

            // in case the target is halted, we want to instruct DevTools to continue running the target
            if (e.WaitingForDebugger)
            {
                var runIfWaitingForDebugger = new OpenQA.Selenium.DevTools.V120.Runtime.RunIfWaitingForDebuggerCommandSettings();

                await devToolsSession.SendCommand(runIfWaitingForDebugger.CommandName, e.SessionId, JToken.FromObject(runIfWaitingForDebugger));
            }
        };

        // perform a navigate
        driver.Navigate().GoToUrl("https://testtarget.uptrends.net/Tab");

        // the page contains a button to open a new tab
        var openTabButton = driver.FindElement(By.XPath("/html/body/div/div[2]/div/a[1]"));

        // upon opening the new tab, a target attached event is raised. When placing a breakpoint in the event handler you notice the new tab not doing anything until issues the runIfWaitingForDebugger command.
        openTabButton.Click();

        Thread.Sleep(10000);

        driver.CloseDevToolsSession();
        driver.Dispose();
    }
}

Feb 07 '24 10:02 EdwinVanVliet

I ran your example locally - exited with success:

Starting ChromeDriver 121.0.6167.85 (3f98d690ad7e59242ef110144c757b2ac4eef1a2-refs/branch-heads/6167@{#1539}) on port 9114
Only local connections are allowed.
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
ChromeDriver was started successfully.

DevTools listening on ws://127.0.0.1:9118/devtools/browser/9dfe131c-e2c0-474b-9270-fcceb2cefc98

C:\Projects\selenium-simple\selenium-simple-console\bin\Release\net6.0\selenium-simple-console.exe (process 30388) exited with code 0.
Press any key to close this window . . .

Really, what if we start the discussion from the issue you face? I appreciate your contribution, it is making this project great.

Feb 10 '24 22:02 nvborisenko

@nvborisenko

The issue we are facing is that sending the commands from within the event handler seems to lock. I've adjusted the code example a bit that it won't continue running if this occurs. The SendCommand function will now throw an exception in case it doesn't receive a response.

The browser opened the new tab, but nothing is showing in screen. This indicates that the target is halted, like we instructed with the WaitForDebuggerOnStart option. Halting the target caused the TargetAttached event to fire. In that event handler we try to enable some of the domains we want to have enabled and finally try to invoke the runIfWaitingForDebugger command.

The issue here is that the Page.enable (or whatever command we send first) is locked until we send the runtime.runIfWaitingForDebugger command. However, if we send the runIfWaitingForDebugger command first than it won't lock. My thinking here is that Chrome is halting all commands until runIfWaitingForDebugger is invoked and that on the Selenium side we only process each command after each other. This causes a scenario that all command prior to runtime.runIfWaitingForDebugger will timeout.

using Newtonsoft.Json.Linq;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.DevTools;

internal class Program
{
    private static async Task Main(string[] args)
    {
        var driverService = ChromeDriverService.CreateDefaultService("{{PATH HERE}}\\chromedriver.exe");
        var driver = new ChromeDriver(driverService);

        // create a new DevTools session with WaitForDebugger on start enabled
        var devToolsSession = driver.GetDevToolsSession(new DevToolsOptions
        {
            WaitForDebuggerOnStart = true,
        });

        bool invokedRunIfWaitingForDebugger = false;

        // create an event handler for the TargetAttached event
        devToolsSession.Domains.Target.TargetAttached += async (object? sender, TargetAttachedEventArgs e) =>
        {
            //enable domains for page targets to enable generating CDP events.
            if (string.Equals("page", e.TargetInfo.Type, StringComparison.InvariantCultureIgnoreCase))
            {
                var enablePage = new OpenQA.Selenium.DevTools.V120.Page.EnableCommandSettings();
                var enableNetwork = new OpenQA.Selenium.DevTools.V120.Network.EnableCommandSettings();
                var enableRuntime = new OpenQA.Selenium.DevTools.V120.Runtime.EnableCommandSettings();

                // here we send a command to devtools which will timeout. It seems like the thread is being locked.
                await devToolsSession.SendCommand(enablePage.CommandName, e.SessionId, JToken.FromObject(enablePage), millisecondsTimeout: 5000, throwExceptionIfResponseNotReceived: true);
                await devToolsSession.SendCommand(enableNetwork.CommandName, e.SessionId, JToken.FromObject(enableNetwork), millisecondsTimeout: 5000, throwExceptionIfResponseNotReceived: true);
                await devToolsSession.SendCommand(enableRuntime.CommandName, e.SessionId, JToken.FromObject(enableRuntime), millisecondsTimeout: 5000, throwExceptionIfResponseNotReceived: true);
            }

            // in case the target is halted, we want to instruct DevTools to continue running the target
            if (e.WaitingForDebugger)
            {
                var runIfWaitingForDebugger = new OpenQA.Selenium.DevTools.V120.Runtime.RunIfWaitingForDebuggerCommandSettings();

                await devToolsSession.SendCommand(runIfWaitingForDebugger.CommandName, e.SessionId, JToken.FromObject(runIfWaitingForDebugger));

                invokedRunIfWaitingForDebugger = true;
            }
        };

        // perform a navigate
        driver.Navigate().GoToUrl("https://testtarget.uptrends.net/Tab");

        // the page contains a button to open a new tab
        var openTabButton = driver.FindElement(By.XPath("/html/body/div/div[2]/div/a[1]"));

        // upon opening the new tab, a target attached event is raised. When placing a breakpoint in the event handler you notice the new tab not doing anything until issues the runIfWaitingForDebugger command.
        openTabButton.Click();

        while (!invokedRunIfWaitingForDebugger)
        {
            Console.Out.WriteLine("Waiting for RunIfWaitingForDebugger command");
            Thread.Sleep(1000);
        }

        driver.CloseDevToolsSession();
        driver.Dispose();
    }
}

Feb 12 '24 09:02 EdwinVanVliet

@EdwinVanVliet I spent couple of hours and now I see the issue is not related to that cdp command processor cannot handle sequential commands. I will dive deeper (don't promise).

My first thoughts are we should clearly understand what commands we want to send! I mean what are correct commands we should send.. My next steps are to debug cdp conversation from browser perspective.

Feb 14 '24 21:02 nvborisenko

@nvborisenko

My first thoughts are we should clearly understand what commands we want to send! I mean what are correct commands we should send.. My next steps are to debug cdp conversation from browser perspective.

I think it's also a question of what the end user wants to enable. For my own use case we enable Page, Runtime and Network. However someone else might have different needs.

Feb 15 '24 13:02 EdwinVanVliet

I didn't go deeper, what I see the client sends a request, but the server is silent.

Mar 05 '24 20:03 nvborisenko

@EdwinVanVliet @nvborisenko is there something left to do in this PR?

Mar 26 '24 09:03 diemol

Converted this PR to draft, the sttaus of this PR is undetermined yet.

Mar 26 '24 09:03 nvborisenko

@EdwinVanVliet please pay attention on https://github.com/SeleniumHQ/selenium/pull/13768, probably it impacts something.

Apr 05 '24 20:04 nvborisenko