Hello,
I’m looking into trying out Yacy, but first have a question about how Yacy indexes images and media.
Does Yacy download images and media locally in order to index them?
If not, great.
If yes, then the concern becomes inadvertently downloading illegal content (of any kind).
Any clarification on this would be greatly appreciated.
Thank you.
Hi, images and media files are not downloaded. Instead, their alt text is used to write an index entry so that the media can be found in a search. That does not work great, but loading those media types would not have brought any advantage (until now, with AI, but we have not implemented vision yet). And of course your concerns are valid, but we don’t load those things, and in the future we will not do so without asking.
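To illustrate the idea of indexing media by alt text only, here is a minimal Python sketch. It is not YaCy's actual code (YaCy is written in Java); the names `ImageAltCollector` and `index_image_alts` are made up for this example. It parses the HTML of an already-loaded page and records the image URL together with its alt text, without ever requesting the image itself.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageAltCollector(HTMLParser):
    """Collects (image URL, alt text) pairs from HTML without fetching any image."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.entries = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        src = attr_map.get("src")
        alt = (attr_map.get("alt") or "").strip()
        # Only images that have both a source and a non-empty alt text
        # produce an entry; the image bytes are never requested.
        if src and alt:
            self.entries.append((urljoin(self.base_url, src), alt))


def index_image_alts(html, base_url):
    """Hypothetical helper: return the index entries for one HTML page."""
    collector = ImageAltCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.entries


if __name__ == "__main__":
    page = '<p><img src="/pics/cat.jpg" alt="A grey cat"><img src="logo.png"></p>'
    for url, alt in index_image_alts(page, "https://example.org/page/"):
        print(url, "->", alt)
```

In this sketch an image without alt text simply gets no entry, which matches the remark above that the approach does not work great.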
That’s a good clarification, and something I had also wondered about, because I usually select the option to cache web pages and thought the data was saved on my hard drive in:
yacy_search_server/DATA/HTCACHE
Ironically, I wanted to store content for offline use, archived in case the content later vanishes from the internet, but I did not suspect I might be doing something illegal.
The cached data is not stored in its original form, I guess, but as blobs?