Hi All,
I did some experiments with the scraping proxy feature, which should in theory trigger a crawl based on the visited pages. However, as far as I can see, only a minor part of the visited pages actually gets crawled in the end.
There is a warning on the proxy settings page stating that no pages will be crawled that store any private information, e.g. where a POST/GET action is performed or where cookies are used.
The POST/GET part is more or less fine, but is the cookie restriction really necessary? Most pages use cookies nowadays (e.g. for controlling advertisements), and this restriction keeps the percentage of pages crawled (triggered by the proxy) very low.
Is there any workaround for this?
Thanks,