Hi guys, I launched a new node and am testing it in the YaCy search network. The node's specification is not bad: i3 Intel NUC, 8 GB, Ubuntu with 0.5 TB of SSD space, all on a 350 Mbit/s business-grade link. The firewall is already open on port 8090 and everything works well. I was looking for documentation, but it is a bit weak, so can somebody answer the following for me:
- How does the YaCy server handle dead links and remove them from the index?
- What is the recommended way to remove duplicates?
- Is there any documentation about the heuristics?
- How is the autocrawl configured, and how does it work? Does it use a dictionary for URL names?