ciroBorrelli
[hcp_connector_stdout.log.txt](https://github.com/Norconex/collector-http/files/9377490/hcp_connector_stdout.log.txt)
I tried with a public document URL: https://raw.githubusercontent.com/papers-we-love/papers-we-love/master/artificial_intelligence/3-bayesian-network-inference-algorithm.pdf and the problem (CertificateException) does not happen. It only appears with a private HCP Web-HTTPS FileSystem, whose domain name is 'hcp.vm733.sicotetb..it', and...
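If the private HCP endpoint presents a self-signed or internally signed certificate, one way to get past the CertificateException in the 2.x crawler is to relax certificate validation in the HTTP client factory. A minimal sketch, assuming the 2.x `GenericHttpClientFactory` (class and option names should be verified against your crawler version); note that trusting all certificates weakens security, so importing the HCP certificate into the JVM truststore is the safer long-term fix:

```xml
<!-- Inside the <crawler> configuration: relax SSL validation.
     Assumes Norconex HTTP Collector 2.x and its GenericHttpClientFactory. -->
<httpClientFactory class="com.norconex.collector.http.client.impl.GenericHttpClientFactory">
  <!-- Accept the private HCP certificate instead of failing with
       CertificateException. Prefer a proper truststore entry in production. -->
  <trustAllSSLCertificates>true</trustAllSSLCertificates>
</httpClientFactory>
```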
Thank you Pascal. "_That way you won't get any deletions as it will not have a crawl history to compare to between crawler runs_" I don't want already imported documents...
My strong requirements are:
- to not get any deletions when URLs become temporarily unavailable (how do I configure the repo clean in 2.9.0, please?)
- to not have any RE-IMPORT...
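For the first requirement, a hedged sketch of what might help: the collector-core "spoiled reference" strategies (the same GRACE_ONCE/IGNORE options mentioned below) decide what happens to documents that come back in a bad state. This assumes the 2.x `GenericSpoiledReferenceStrategizer` and its `mapping` element; the exact attribute names should be checked against your version's documentation:

```xml
<!-- Inside <crawler>: keep temporarily unreachable documents out of the
     deletion path. Assumes Norconex Collector Core 1.x/2.x naming. -->
<spoiledReferenceStrategizer
    class="com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer">
  <!-- GRACE_ONCE: tolerate one bad crawl before deleting;
       IGNORE: never trigger a deletion for this state. -->
  <mapping state="NOT_FOUND"  strategy="GRACE_ONCE"/>
  <mapping state="BAD_STATUS" strategy="GRACE_ONCE"/>
  <mapping state="ERROR"      strategy="IGNORE"/>
</spoiledReferenceStrategizer>
```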
"_What is the issue in your case of having a few valid documents be resent to IDOL when they are back online?_" Dear Pascal, our issue with "having a few...
Is there any advantage in switching to version 3.0.x of your HTTP crawler? We not only need to IGNORE or GRACE_ONCE temporarily offline URLs, we also need to avoid re-importing thousands of documents...
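On the re-import side, a sketch of the usual mechanism: unchanged documents are skipped when their checksum matches the one recorded in the crawl history, so keeping (rather than wiping) the crawl store plus an explicit document checksummer should avoid resending unchanged content. This assumes the collector-core `MD5DocumentChecksummer` available to the 2.x crawler:

```xml
<!-- Inside <crawler>: skip re-sending documents whose content has not
     changed since the last run. Assumes collector-core 2.x class names. -->
<documentChecksummer
    class="com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer"/>
```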
Yesterday I tried to use the orphan strategy `IGNORE`, hoping it would solve my problem; indeed, "_Orphans are valid documents, which on subsequent crawls can no longer be reached...
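For reference, this is roughly where that setting goes, assuming the 2.x crawler XML layout:

```xml
<!-- Inside <crawler>: do not delete (or reprocess) documents that
     simply were not reachable during this run. -->
<orphansStrategy>IGNORE</orphansStrategy>
```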
IMPORTANT UPDATE: it looks like the problem does NOT depend on the 'hdnBtn' that calls the EL/backing-bean method '#{lavoro.removeTag()}'. It depends instead on the actionListener="#{lavoro.selectThis(rep)}" of the 'minus' b:commandButton; the code...
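To make this easier to reproduce, here is a hypothetical reconstruction of the two buttons described above; everything besides 'hdnBtn', '#{lavoro.removeTag()}', and '#{lavoro.selectThis(rep)}' (the value, update target, and styling) is my assumption, not the actual page markup:

```xml
<!-- Hypothetical sketch (BootsFaces + plain JSF). The 'minus' button's
     actionListener is what reportedly triggers the problem. -->
<b:commandButton value="-"
                 actionListener="#{lavoro.selectThis(rep)}"
                 update="@form"/>  <!-- 'update' target is assumed -->

<!-- Hidden button calling the backing-bean method; reportedly NOT the cause. -->
<h:commandButton id="hdnBtn"
                 action="#{lavoro.removeTag()}"
                 style="display:none"/>
```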
Dear StephanRauh, I think I found a workaround for this issue; I really don't know if it's a workaround or **the solution**; let me know your considerations. I modified...