Hi, images and media files are not downloaded; however, their alt text is used to write an index entry so that the media can be found by search. That does not work very well, but loading those media types would not have brought any advantage (until now, with AI, but we have not implemented vision yet). Your concerns are of course valid, but we don’t load those things, and in the future we will not do so without asking.
That’s a good clarification, and something I had also wondered about, because I usually select the option to cache web pages and thought the data was saved on my hard drive in:
yacy_search_server/DATA/HTCACHE
Ironically, I wanted to store content for offline use, archived in case it later vanishes from the internet, and I did not suspect I might be doing something illegal.
The cached data is not stored in its original form, I guess, but as blobs?
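One way to check what is actually sitting in that folder is to count the files by extension and add up their sizes. Below is a minimal sketch in Python; it makes no assumption about YaCy's internal cache format and only walks the directory. The function name `summarize_dir` is made up for this example, and the path is the one quoted above, so adjust it to wherever your install lives.

```python
from collections import Counter
from pathlib import Path


def summarize_dir(root):
    """Return {suffix: (file_count, total_bytes)} for all files under root."""
    counts = Counter()
    sizes = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "(none)".
            suffix = path.suffix.lower() or "(none)"
            counts[suffix] += 1
            sizes[suffix] += path.stat().st_size
    return {s: (counts[s], sizes[s]) for s in counts}


if __name__ == "__main__":
    # Path taken from the post above; an assumption about your install location.
    cache = Path("yacy_search_server/DATA/HTCACHE")
    if cache.is_dir():
        for suffix, (n, total) in sorted(summarize_dir(cache).items()):
            print(f"{suffix:10} {n:6d} files {total:12d} bytes")
    else:
        print(f"{cache} not found")
```

If the listing shows a few large container files rather than many individual HTML files, that would suggest the pages are not kept in their original form, but the script itself cannot tell you how the contents are encoded.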