Michael Hogue

34 comments by Michael Hogue

@admineral Yeah sorry, not sure why I closed this. I must've been tired. This is a good idea! @KBB99 Looking at this now and it looks really interesting. I hadn't...

What operating system are you on? Also, do you have multiple monitors?

There are some known issues with multi-monitor setups. #57 proposes a change that uses only the active monitor on Linux.

@joshbickett @Andy1996247 #57 didn't fix this issue. The active monitor needs to be selected in the screenshot when there are multiple monitors connected. Will look into this tonight.
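For illustration only, here is a rough sketch of one way the active monitor could be picked. It assumes "active" means the monitor that currently contains the mouse cursor, and that each monitor is described by a dict with `left`, `top`, `width` and `height` keys (the shape MSS uses for entries in its `monitors` list). The helper name `monitor_containing` and the sample geometry are hypothetical and not taken from the repository or from #57.

```python
from typing import Dict, List, Optional

Monitor = Dict[str, int]


def monitor_containing(x: int, y: int, monitors: List[Monitor]) -> Optional[Monitor]:
    """Return the first monitor whose bounds contain the point (x, y), else None.

    Hypothetical helper: assumes each monitor dict has left/top/width/height keys.
    """
    for mon in monitors:
        inside_x = mon["left"] <= x < mon["left"] + mon["width"]
        inside_y = mon["top"] <= y < mon["top"] + mon["height"]
        if inside_x and inside_y:
            return mon
    return None


# Made-up layout: a 1920x1080 primary display with a 2560x1440 display to its right.
monitors = [
    {"left": 0, "top": 0, "width": 1920, "height": 1080},
    {"left": 1920, "top": 0, "width": 2560, "height": 1440},
]

# A cursor at (2000, 500) falls on the second display.
active = monitor_containing(2000, 500, monitors)
```

The returned dict could then be handed to whatever screenshot call is in use, so only that display is captured rather than the combined virtual screen.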

@joshbickett It probably would be best to add this to the README or in a separate file that a link could point to. Even MSS, which I've looked into as...

@ahsin-s From [README.md](https://github.com/OthersideAI/self-operating-computer/tree/main/README.md): > **Note:** GPT-4V's error rate in estimating XY mouse click locations is currently quite high. This framework aims to track the progress of multimodal models over time,...

Okay I think that's a wrap. The model can now hover the cursor at a chosen location and scroll up or down. I also added a bit to the vision...

@joshbickett Glad you like the concept! I totally understand if you can't merge this due to the architectural changes. I'm going to put this back to draft for now until...

@joshbickett Great to hear! I've got this on hold at the moment. Currently I'm working on a PR that replaces platform-specific screenshot methods with one platform-agnostic screenshot function using [MSS](https://python-mss.readthedocs.io/examples.html)....

@joshbickett I've tested the common case of "go to youtube and play holiday music" multiple times and I haven't had any issues with a double click. I actually notice reduced...