
Crawler never finishes

Open 2peter3 opened this issue 1 year ago • 0 comments

Hey, with the latest version of the crawler, it looks like the crawl never finishes.

To reproduce:


package main

import (
	"fmt"
	"log"

	"github.com/rs/zerolog"

	"github.com/twiny/wbot"
	"github.com/twiny/wbot/pkg/api"
)

func main() {
	bot := wbot.New(
		wbot.WithParallel(50),
		wbot.WithMaxDepth(5),
		wbot.WithRateLimit(&api.RateLimit{
			Hostname: "*",
			Rate:     "10/1s",
		}),
		wbot.WithLogLevel(zerolog.DebugLevel),
	)
	defer bot.Shutdown()

	// read responses
	bot.OnReponse(func(resp *api.Response) {
		fmt.Printf("crawled: %s\n", resp.URL.String())
	})

	if err := bot.Run(
		"https://example.com",
	); err != nil {
		log.Fatal(err)
	}

	log.Printf("finished crawling\n")
}

Output:

2024-03-24T11:05:28-04:00 DBG fetched target=https://example.com
crawled: https://example.com

Then it gets stuck forever. The same happens when requesting a nonexistent page such as "https://examdfasdfasdfple.com": it never finishes :/

2024-03-24T11:07:00-04:00 INF Starting crawler with 1 links
2024-03-24T11:07:10-04:00 ERR fetch error="context deadline exceeded" target=https://examdfasdfasdfple.com


2peter3, Mar 24 '24 15:03