I’m successively migrating all of my YaCy peers to the new release with Solr 8.8.1. I just did a JSON flat dump before migrating to 1.925/10086. After the version upgrade I put the JSON flat dump into /DATA/SURROGATES/in, but the import doesn’t work. After a few seconds the log shows the following:
E 2021/03/20 14:21:13 org.apache.solr.common.SolrException: ERROR: [doc=-ql5IgPCpqc4] Error adding field 'last_modified'='Sat Dec 02 17:46:10 GMT 2017' msg=Invalid Date String:'Sat Dec 02
The full stacktrace is located at E 2021/03/20 14:21:13 org.apache.solr.handler.RequestHandlerBase org.apache.solr - Pastebin.com
A fix would be great
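For anyone hitting the same error: Solr date fields expect ISO 8601 UTC timestamps such as `2017-12-02T17:46:10Z`, while the value in the log above looks like the output of Java's `Date.toString()`. Below is a minimal sketch of how such a value could be normalized before import. The function names and the assumption that only `last_modified` is affected are mine and are not taken from the YaCy code; it also only handles GMT timestamps like the one in the log.

```python
from datetime import datetime, timezone

# Layout of the failing value, e.g. 'Sat Dec 02 17:46:10 GMT 2017'.
# "GMT" is matched literally, so other time zones are rejected on purpose.
JAVA_DATE_FORMAT = "%a %b %d %H:%M:%S GMT %Y"


def to_solr_date(java_date: str) -> str:
    """Convert a Date.toString()-style GMT string to Solr's ISO 8601 form."""
    parsed = datetime.strptime(java_date, JAVA_DATE_FORMAT)
    parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.strftime("%Y-%m-%dT%H:%M:%SZ")


def fix_document(doc: dict, date_fields=("last_modified",)) -> dict:
    """Return a copy of doc with the listed date fields normalized.

    date_fields is a hypothetical parameter; a real dump may contain
    further date fields that need the same treatment.
    """
    fixed = dict(doc)
    for field in date_fields:
        value = fixed.get(field)
        # Values already ending in 'Z' are assumed to be in Solr format.
        if isinstance(value, str) and not value.endswith("Z"):
            fixed[field] = to_solr_date(value)
    return fixed
```

Each line of the flat dump could be parsed with `json.loads`, passed through `fix_document`, and written back out before the file is handed to the importer.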
In order to have something imported via SURROGATES/in you need the full-blown XML export.
(The JSON export is meant for imports into Elasticsearch.)
I’ve done some tuning on the solr-8.8.1 topic - check the latest version.
thx for the info. Ok, I’ll fetch the latest code at the repo.
Thank you very much.
I’m sad that the data can’t be imported into a 1.925 YaCy. Is YaCy Grid ready to dock to freeworld right now? Or is it possible to do a “backport” export as an XML dump for the “old” YaCy? I can provide the JSON flat file as soon as the upload is finished.
I will make it possible to do the JSON import to get compatibility with YaCy Grid.
Thank you very much. I’m very glad that the data in the dump I created will soon enrich our freeworld network again. There is some special metadata in the dump that imho is valuable for the community.
I just fixed the import.
However, it is a bit slow because of an enrichment process that can re-annotate synonyms and facets if such things are defined in the importing peer. It is possible to speed up that process, but it needs extra care.
Do not start huge imports right now, I will work on the performance!
Ok thx. This is what is theoretically possible:
Now I have added concurrency and removed superfluous tokenization in case no synonyms or semantic tags are defined.
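To illustrate the idea behind that change, here is a rough sketch, not the actual YaCy implementation (which is written in Java). It assumes a hypothetical `enrich` step that tokenizes a document's text to look up synonyms, skips that work entirely when no synonyms are defined, and spreads the documents over a thread pool. The field names `text` and `synonyms` and the `workers` parameter are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def enrich(doc: dict, synonyms: dict) -> dict:
    """Annotate one document with the synonyms found in its text."""
    if not synonyms:
        # Nothing is defined on this peer: skip tokenization entirely.
        return doc
    tokens = doc.get("text", "").lower().split()
    found = sorted({synonyms[t] for t in tokens if t in synonyms})
    return {**doc, "synonyms": found}


def enrich_all(docs: list, synonyms: dict, workers: int = 4) -> list:
    """Enrich documents concurrently; map() keeps the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda d: enrich(d, synonyms), docs))
```

With an empty synonym table the documents pass through unchanged, which is where the speed-up for plain imports would come from in a design like this.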
Just cloning our repo now. The benchmark results shown above were made on my PCIe NVMe SSD, which I bought only for YaCy. But some operating systems’ NVMe drivers aren’t very mature yet. I had many kernel panics with it on macOS versions older than Catalina. Linux works fine, but I’m not sure which filesystem is the fastest. I’m currently using XFS.