How does YaCy order results?

I can’t find this info anywhere. It is a VERY important aspect for me.
It would be nice to be able to choose how they are ordered.

In current YaCy versions the results are effectively ordered at RANDOM.

You can change how local results from Solr are ranked with RankingSolr_p.html

Results from other peers, if you set YaCy to use only Solr data (not YaCy’s RWI-based ranking), are ranked according to how those peers have configured their Solr ranking (most likely the defaults).

One major problem with the actual results you get from YaCy is that the rank values DO NOT MATTER, NOT EVEN A LITTLE. If you look at source/net/yacy/search/query/SearchEvent.java you’ll find that results are actually ranked first received, first ranked! So if the first peer to respond sends 3 results, those are 1, 2, 3, and if the second peer to respond sends 3 results, those are 4, 5, 6, even if their attached ranking scores are higher.
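
To make the difference concrete, here is a minimal Python sketch. It is not YaCy’s code; the peer batches, URLs and scores are made up for illustration. It compares the first-received ordering described in this post with ordering by the attached score:

```python
# Hypothetical batches, listed in the order the peers answered.
# Each entry is (url, ranking score); all values are invented.
batches = [
    [("a.example/1", 0.40), ("a.example/2", 0.35), ("a.example/3", 0.10)],  # first peer
    [("b.example/1", 0.90), ("b.example/2", 0.80), ("b.example/3", 0.70)],  # second peer
]

# First received, first ranked: concatenate batches in arrival order.
arrival_order = [url for batch in batches for url, _score in batch]

# Score-based: merge all batches, then sort by score, highest first.
merged = [item for batch in batches for item in batch]
score_order = [url for url, _score in sorted(merged, key=lambda r: r[1], reverse=True)]

print(arrival_order)
print(score_order)
```

With arrival ordering the second peer’s results can never rise above position 4, no matter how high their scores are.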

You’ll have to change source/net/yacy/search/query/SearchEvent.java if you want yacy to wait a given amount of time for results and then sort the results. That’s what I do with the yacy fork I run on https://yacy.everdot.org/ - it waits 6 seconds for results and then it actually sorts the results and presents them.
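
The wait-then-sort idea can be sketched like this in Python. This is only an illustration of the approach, not the change made in the fork: `collect_then_sort`, `fake_peer`, the peers and their scores are all hypothetical, and the window is 1 second here instead of 6 so the example runs quickly.

```python
import queue
import threading
import time

def collect_then_sort(incoming, wait_seconds):
    """Collect (url, score) items until the window closes, then sort by score."""
    deadline = time.monotonic() + wait_seconds
    results = []
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            results.append(incoming.get(timeout=remaining))
        except queue.Empty:
            break  # window closed with nothing more to read
    return sorted(results, key=lambda r: r[1], reverse=True)

def fake_peer(out, delay, items):
    """Stand-in for a remote peer that answers after `delay` seconds."""
    time.sleep(delay)
    for item in items:
        out.put(item)

incoming = queue.Queue()
peers = [
    (0.05, [("fast.example/1", 0.3)]),      # answers first, low score
    (0.20, [("slow.example/1", 0.9)]),      # answers later, high score
    (3.00, [("too-late.example/1", 1.0)]),  # misses the window entirely
]
for delay, items in peers:
    threading.Thread(target=fake_peer, args=(incoming, delay, items), daemon=True).start()

ranked = collect_then_sort(incoming, wait_seconds=1.0)
print(ranked)
```

The trade-off is visible in the third peer: anything that arrives after the window is dropped, and every search takes at least the full window.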

Nice work! https://yacy.everdot.org/

Ranking search results is something I believe should, and could be, user controlled, to whatever extent possible.

To do that, there has to be some agreed upon criteria.

For example, I frequent various, very extensive not for profit organization websites which regularly post “action alerts” and the like, where some urgent or immediate response is required.

I don’t particularly want to subscribe to all the thousands upon thousands of groups and organizations out there that put out such material. I would like to be able to rank search results by “urgency”. Or at least, what the publisher considers urgent, out of what might be thousands of pages on their website.

That kind of thing is what I consider “semantic” search, in that there is some actual meaning or significance attached to the results, beyond words or language.

Your solution of having a few second delay to provide time for sorting search results is a small price to pay for what might otherwise be many wasted hours browsing through unsorted search results.

Hi, can you tell me how you got your fork running on standard port 80?
I have tried setting the port in the main config, but I get this error in the log:

I 2020/11/13 06:25:28 ConcurrentLog shutdown of ConcurrentLog.Worker: injection of poison message
I 2020/11/13 06:25:28 ConcurrentLog terminating ConcurrentLog.Worker with 0 cached loglines.
I 2020/11/13 06:25:28 ConcurrentLog shutdown of ConcurrentLog.Worker: terminated

You are using searx as frontend. Did you have to change some code in searx?

Nice work with https://yacy.everdot.org/

Well, your interpretation of the code is not correct. If you load the yacysearch.html result page, each of the 10 result lines is loaded using a server-side include (look out for

<!--#include virtual="yacysearchitem.html?item=#[item]#&eventID=#[eventID]#" -->

) and that is passed from the YaCy Server using HTTP Multipart. That means, every single search result is one multipart element, and the result in that multipart is the best ranked entry at that time.
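
A small Python sketch of that behaviour, as I read the description above. It is not the SearchEvent.java logic itself; the `ResultPool` class, the URLs and the scores are hypothetical. Each item request takes the best-ranked entry available at that moment, and items already delivered are not revised:

```python
import heapq

class ResultPool:
    """Holds results received so far; hands out the best-scored one on request."""

    def __init__(self):
        self._heap = []  # heapq is a min-heap, so scores are stored negated

    def add(self, url, score):
        heapq.heappush(self._heap, (-score, url))

    def take_best(self):
        if not self._heap:
            return None
        _neg_score, url = heapq.heappop(self._heap)
        return url

pool = ResultPool()
page = []

# The first peer has answered before item 0 is requested.
pool.add("a.example/1", 0.40)
pool.add("a.example/2", 0.35)
page.append(pool.take_best())  # item 0: best of what is known so far

# A better result arrives after item 0 was already sent to the browser.
pool.add("b.example/1", 0.90)
page.append(pool.take_best())  # item 1: the late result is now the best
page.append(pool.take_best())  # item 2

print(page)
```

So the score does decide which entry fills each slot, but a high-scoring result that arrives late can only take the next free slot.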

In the past we delayed each of the ten results by some time (like 100 milliseconds) to give remote peers the chance to deliver more links, so that ‘late’ peers could also contribute to the result page. But loading was very slow and people complained more about search time than about result ordering, so it’s now without extra delays.

Is there a chance to customize YaCy to sort by date by default, without having to add the keyword /date?

I just, while browsing the forum, did some searches on everdot: https://yacy.everdot.org/

When switching to search for video this error message appears:

Engines cannot retrieve results:

youtube (unexpected crash Extra data: line 1 column 179806 (char 179805)), invidious (unexpected crash Expecting value: line 1 column 1 (char 0))

The search results, however, below that seem just fine. I have no idea what the error message means or what could be causing it but it seems consistent (the same, or similar) regardless of the search terms used.

The frontend of this engine is searx (searx.me), which is a metasearch engine. It’s a wrapper that searches many other engines in parallel and combines the results. The interfaces of the engines it uses change from time to time, which causes unexpected behavior. These kinds of errors are displayed by searx.
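
As a rough illustration of why results still show up below the error box, here is a minimal Python sketch of a metasearch wrapper. It is not searx’s code; the engine functions and their names are made up. The two messages quoted earlier have the wording of Python’s JSON decoder errors, which suggests the engines returned something that was not the JSON the wrapper expected, so the failing engines here simulate that:

```python
import json
from concurrent.futures import ThreadPoolExecutor

def engine_ok(query):
    # Hypothetical engine whose interface still works.
    return [{"title": f"result for {query}"}]

def engine_trailing_garbage(query):
    # Simulates a response with extra content after the JSON document.
    return json.loads('{"items": []} <html>')

def engine_empty(query):
    # Simulates an empty or non-JSON response.
    return json.loads("")

def metasearch(query, engines):
    """Query all engines in parallel; keep results and collect per-engine errors."""
    results, errors = [], {}
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, query) for name, fn in engines.items()}
        for name, future in futures.items():
            try:
                results.extend(future.result())
            except ValueError as exc:  # json.JSONDecodeError is a ValueError
                errors[name] = f"unexpected crash {exc}"
    return results, errors

engines = {
    "working": engine_ok,
    "changed_format": engine_trailing_garbage,
    "empty_reply": engine_empty,
}
results, errors = metasearch("yacy", engines)
print(results)
print(errors)
```

One engine failing does not stop the others, so the page shows both the error list and whatever the remaining engines returned.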