1st question:
How can I tell YaCy to crawl only these pages?
http://example.com/*/abc/*.html
2nd question:
Can YaCy run a crawl automatically every day?
3rd question:
How can I exclude <div id="someDiv">
in some.html from the search results?
The crawl start provides fields for regular expressions to include or exclude URLs.
There is also the crawl scheduler, where you can reschedule a crawl on a daily basis.
The pattern http://example.com/*/abc/*.html is not a regex.
Maybe you should try http:\/\/example\.com\/.+\/abc.+\.html
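A quick way to check a pattern like this before pasting it into the crawl start is to try it against a few sample URLs. The sketch below uses Python's re module rather than YaCy itself, and assumes the filter has to match the whole URL; the sample URLs and the STRICT variant are made up for illustration. Note that the suggested pattern has no "/" after "abc", so it would also accept something like abcdef.html.

```python
import re

# Pattern suggested in the thread (the escaped slashes are harmless but optional).
SUGGESTED = r"http:\/\/example\.com\/.+\/abc.+\.html"
# Hypothetical stricter variant that keeps the "/" after "abc".
STRICT = r"http://example\.com/.+/abc/.+\.html"

# Made-up sample URLs, for illustration only.
urls = [
    "http://example.com/foo/abc/page.html",
    "http://example.com/foo/abcdef.html",
    "http://example.com/foo/xyz/page.html",
]


def matching(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [u for u in candidates if compiled.fullmatch(u)]


suggested_hits = matching(SUGGESTED, urls)
strict_hits = matching(STRICT, urls)
print(suggested_hits)
print(strict_hits)
```

If only URLs with an /abc/ directory should be crawled, the stricter variant is probably closer to the original intent.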
I am never sure myself which syntax to follow, as regexes are sometimes mixed (e.g. in the blacklist) with file globs (patterns with *).
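To illustrate why the two syntaxes are easy to confuse, here is a small sketch that reads the same string once as a file glob and once as a regex. It uses Python's fnmatch and re modules as stand-ins, so it only approximates how YaCy (which, as far as I know, uses Java regular expressions) would treat the pattern; the test URLs are made up.

```python
import fnmatch
import re

PATTERN = "http://example.com/*/abc/*.html"
wanted = "http://example.com/foo/abc/page.html"
odd = "http://example.com/abc/xhtml"

# Read as a glob: "*" stands for any run of characters.
as_glob = fnmatch.fnmatchcase(wanted, PATTERN)

# Read as a regex: "/*" means "zero or more slashes"
# and "." means "any single character".
as_regex = re.fullmatch(PATTERN, wanted) is not None

# The regex reading even accepts a URL nobody intended.
surprising = re.fullmatch(PATTERN, odd) is not None

print(as_glob, as_regex, surprising)
```

So a glob pasted into a regex field does not raise an error in this case; it just quietly matches the wrong set of URLs.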