I run a small webring (geekring.net) and I get rather a lot of traffic from YaCy. While I appreciate the indexing of the webring-related pages, everything under /sites/ is disallowed because it is simply a set of redirection rules: accessing something under /sites/ doesn't provide content, it just redirects to a ring-member site.
I have the following robots.txt file:
User-agent: *
Crawl-delay: 5
Disallow: /site/
Sitemap: https://geekring.net/sitemap.xml
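
For anyone wanting to check how a compliant crawler would read these rules, here is a minimal sketch using Python's standard `urllib.robotparser`. The user-agent string `yacybot` and the example paths are assumptions made for illustration, and this shows how Python's parser interprets the file, not necessarily how YaCy does.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content exactly as posted above.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Disallow: /site/
Sitemap: https://geekring.net/sitemap.xml
"""

# Assumed user-agent token; the crawler's real token may differ.
AGENT = "yacybot"


def build_parser(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Hypothetical paths: the site root, one under /site/, one under /sites/.
for path in ("/", "/site/example", "/sites/example"):
    allowed = parser.can_fetch(AGENT, "https://geekring.net" + path)
    print(path, "allowed" if allowed else "blocked")

print("crawl delay:", parser.crawl_delay(AGENT))
```

Disallow rules are matched as path prefixes, so `Disallow: /site/` covers `/site/example` but not `/sites/example`. The path in the rule has to match the path the redirects actually live under.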