This is nonsense! I don’t want to index everything that the website’s CMS engine generates! robots.txt has rules for exactly that, so that crawlers index real pages and not garbage, and YaCy completely ignores them!
That’s not all!
If I use
User-agent: *
Allow: /$
Allow: /showthread.php?t=*$
Disallow: /
YaCy sees only the Disallow: / and says, “Oh, this website doesn’t allow indexing, blah blah blah.” WTF??!!
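
For comparison, here is a rough Python sketch of how I would expect those rules to be evaluated under the longest-match rule described in RFC 9309 (the most specific matching rule wins, and Allow wins a tie). This is not YaCy's code, and the names `pattern_to_regex` and `is_allowed` are made up for illustration; it also ignores details such as percent-encoding and multiple User-agent groups.

```python
import re

def pattern_to_regex(pattern):
    # '*' matches any run of characters; a trailing '$' anchors the end of the path.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))

def is_allowed(rules, path):
    # rules: list of (directive, pattern) pairs from one User-agent group.
    # Hypothetical helper: longest matching pattern wins, Allow wins ties,
    # and a path that matches no rule is allowed.
    best_len, allowed = -1, True
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed

rules = [
    ("Allow", "/$"),
    ("Allow", "/showthread.php?t=*$"),
    ("Disallow", "/"),
]

for path in ["/", "/showthread.php?t=123", "/member.php?u=5"]:
    print(path, is_allowed(rules, path))
```

With that logic the front page and the thread pages stay crawlable and everything else is blocked, which is what the rules are meant to say.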
And that’s not all!!
YaCy caches robots.txt, and even if the file changes on the server, it doesn’t care!
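
What I would expect instead is a cache that expires. RFC 9309 says a crawler should not keep using a cached robots.txt for more than 24 hours (unless the file is unreachable). Below is a minimal sketch of that idea, assuming a hypothetical `RobotsCache` class and a caller-supplied `fetch` function; it says nothing about how YaCy stores robots.txt internally.

```python
import time

# Assumption: the 24-hour ceiling suggested by RFC 9309.
MAX_AGE_SECONDS = 24 * 60 * 60

class RobotsCache:
    """Hypothetical per-host robots.txt cache that re-fetches stale entries."""

    def __init__(self, fetch, clock=time.time, max_age=MAX_AGE_SECONDS):
        self._fetch = fetch        # callable: host -> robots.txt text
        self._clock = clock        # injectable clock, handy for testing
        self._max_age = max_age
        self._entries = {}         # host -> (fetched_at, text)

    def get(self, host):
        now = self._clock()
        entry = self._entries.get(host)
        # Fetch again if the host was never seen or the cached copy is too old.
        if entry is None or now - entry[0] >= self._max_age:
            entry = (now, self._fetch(host))
            self._entries[host] = entry
        return entry[1]
```

With something like this, a change on the server is picked up after one day at the latest.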