DotnetSpider icon indicating copy to clipboard operation
DotnetSpider copied to clipboard

DotnetSpider, a .NET standard web crawling library. It is lightweight, efficient and fast high-level web crawling & scraping framework

Results 29 DotnetSpider issues
Sort by recently updated
recently updated
newest added

[![Mend Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com) Welcome to [Renovate](https://togithub.com/renovatebot/renovate)! This is an onboarding PR to help you understand and configure settings before regular Pull Requests begin. 🚦 To activate Renovate, merge this Pull Request....

请问能用一个Spider,然后通过数据库进行配置不同抓取规则,进行多个网站抓取吗?

如果切换sql server数据库,或者自己定义储存方式,这边只要能拿到数据就好

**我碰到了一个request再重试第2次或第三次的时候,它的Properties和Headers都为空,被清理掉了。是不是下面两段代码的执行顺序变了影响的?清理context的代码先执行所以导致Properties和Headers被清空了** ![image](https://user-images.githubusercontent.com/64203190/174426968-fdb33bf5-e9cb-40f5-8a1f-d92e2b152032.png) ![image](https://user-images.githubusercontent.com/64203190/174427185-84703017-3975-48c2-8d87-8b60a236838d.png)

![image](https://user-images.githubusercontent.com/50691694/168585859-d512580d-0ef8-4f08-b9b7-242e416cc383.png)

` public static class HttpResponseMessageExtensions { public static async Task ToResponseAsync(this HttpResponseMessage httpResponseMessage) { var response = new Response {StatusCode = httpResponseMessage.StatusCode}; foreach (var header in httpResponseMessage.Headers) { response.Headers.Add(header.Key, header.Value?.ToString());...

比如 ` ("//ul/li")` 这个库我该如何遍历它

.net 下得爬虫框架太少了,支持!

就像Scrapy和Splash一样,能不能提供一个在DotnetSpider框架中结合使用PuppeteerSharp来填写表格、自动化browser、访问Javascript网站的example?