How to improve the order of the results?

Hi there, I have been running YaCy for some time, mainly to build my index and contribute.
Usually I add web-dev-oriented pages.

But I rarely use YaCy as a search engine because, from my point of view, the results are ordered quite randomly. :face_with_peeking_eye:

Here is a simple example: the search ‘python analyse file’ with this URL: http://localhost:8090/yacysearch.html?query=python+analyse+file&Enter=&auth=&contentdom=all&strictContentDom=false&former=python+analyse++file&maximumRecords=10&startRecord=0&verify=iffresh&resource=global&nav=all&prefermaskfilter=&depth=0&constraint=&meanCount=0&timezoneOffset=-60 The first results I got are below.

The JSON of the first 10 results:
"items": [
    {
      "title": "Chemists bitten by Python scripts: How different OSes produced different results during test number-crunching • The Register Forums",
      "link": "https://forums.theregister.com/forum/all/2019/10/15/bug_python_scripts/",
      "description": "",
      "pubDate": "Thu, 26 Feb 2026 00:00:00 +0000"
    },
    {
      "title": "KI im Controlling | Controlling | Haufe",
      "link": "https://www.haufe.de/controlling/controllerpraxis/ki-im-controlling_112_662516.html",
      "description": "",
      "pubDate": "Tue, 24 Feb 2026 06:40:25 +0000"
    },
    {
      "title": "Datenanalyse mit Excel - Tools, Funktionen, Nutzen | Controlling | Haufe",
      "link": "https://www.haufe.de/controlling/controllerpraxis/datenanalyse-mit-excel-tools-funktionen-nutzen_112_629066.html",
      "description": "",
      "pubDate": "Mon, 23 Feb 2026 15:12:27 +0000"
    },
    {
      "title": "",
      "link": "https://www.oecd.org/officialdocuments/publicdisplaydocumentpdf/?cote=DAF/COMP/LACF(2020)5&docLanguage=En",
      "description": "The CNMC has collaborated with the Portuguese competition authority on a case involving several … ",
      "pubDate": "Mon, 14 Sep 2020 16:17:47 +0000"
    },
    {
      "title": "MySQL Procedure Analyse Denial Of Service ≈ Packet Storm",
      "link": "https://packetstormsecurity.com/files/137232/MySQL-Procedure-Analyse-Denial-Of-Service.html",
      "description": "# This exploit is compatible with both <b>Python<\/b> 3.x and 2.x. \/ Follow us on Twitter Follow us on Facebook Subscribe to an RSS Feed <b>File<\/b> Archive:",
      "pubDate": "Sun, 19 Mar 2023 19:34:36 +0000"
    },
    {
      "title": "GitHub - attify\/firmware-analysis-toolkit: Toolkit to emulate firmware and analyse it for security vulnerabilities",
      "link": "https://github.com/attify/firmware-analysis-toolkit",
      "description": "However you need to have both <b>Python<\/b> 3 and <b>Python<\/b> 2 installed since parts of Firmadyne and its dependencies use <b>Python<\/b> 2. \/ After installation is completed, edit the <b>file<\/b>",
      "pubDate": "Sun, 01 Jun 2025 12:18:10 +0000"
    },
    {
      "title": "Umstieg auf Python 3",
      "link": "https://www.linux-magazin.de/ausgaben/2009/09/im-zeichen-der-drei/",
      "description": "Netzwerk-Standard und <b>Python<\/b>-Einf&uuml;hrung",
      "pubDate": "Mon, 24 Apr 2023 07:29:00 +0000"
    },
    {
      "title": "Analyse Logs with Elasticsearch",
      "link": "https://www.opensourceforu.com/2022/07/analyse-logs-with-elasticsearch/",
      "description": "We often need to search and <b>analyse<\/b> the logs of our Windows based computer or server to get an insight into the system or application.",
      "pubDate": "Tue, 20 Jun 2023 12:54:49 +0000"
    },
    {
      "title": "dis — Disassembler for Python bytecode — Python 3.11.2 documentation",
      "link": "https://docs.python.org/3/library/dis.html",
      "description": "Source code: Lib\/dis.py The dis module supports the analysis of CPython bytecode by disassembling it. The CPython bytecode which this module takes as an input is defined in the <b>file<\/b> Include\/opcode....",
      "pubDate": "Sat, 01 Apr 2023 09:58:25 +0000"
    },
    {
      "title": "Erweiterte Analyse - Azure Architecture Center | Microsoft Learn",
      "link": "https://learn.microsoft.com/de-de/azure/architecture/solution-ideas/articles/advanced-analytics-on-big-data",
      "description": "Datenbanken oder Daten-Warehouse zu kombinieren. Verwendung von skalierbaren Machine Learning-\/Deep-Learning-Techniken, um tiefere Erkenntnisse aus diesen Daten abzuleiten, unter Verwendung von <b>Python<\/b>, Scala oder .NET mit Notebook",
      "pubDate": "Wed, 31 May 2023 23:00:06 +0000"
    }
],

I wonder how I can improve that.
Any thoughts?
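For anyone who wants to reproduce the query from a script, here is a minimal sketch that rebuilds the search URL. The parameter names (query, maximumRecords, startRecord, resource, verify) are copied from the URL above; the function name and defaults are only illustrative, and the idea that swapping the page for a JSON variant such as yacysearch.json returns the items shown above is an assumption to check on your own peer.

```python
from urllib.parse import urlencode


def build_search_url(query, base="http://localhost:8090", page="yacysearch.html",
                     maximum_records=10, start_record=0, resource="global"):
    """Build a search URL with parameter names taken from the URL in the post.

    `page` defaults to the HTML page from the post; a JSON variant of the
    page is an assumption, not something confirmed in this thread.
    """
    params = {
        "query": query,
        "maximumRecords": maximum_records,
        "startRecord": start_record,
        "resource": resource,
        "verify": "iffresh",
    }
    # urlencode() escapes spaces as '+', matching the URL in the post.
    return f"{base}/{page}?{urlencode(params)}"


url = build_search_url("python analyse file")
print(url)
```

Changing maximum_records or start_record gives the next pages of results without editing the URL by hand.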

Hi there, I can explain a bit more what I’m trying to do.

I would like to get more results that have the keywords in the URL or in the title. To do that, I changed the settings on the page http://localhost:8090/RankingSolr_p.html, but I cannot see any difference.

Another question: how can I set preferred domains per keyword? In this example I would like the results from docs.python.org to come first in the list, followed by the other results.
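To make the goal concrete, here is a rough sketch of the ordering described above, done client-side on the JSON items after the search returns. This is not how YaCy ranks and it does not touch the RankingSolr_p.html settings; the function name rerank and the scoring rule (preferred domain first, then number of query terms found in the title or URL) are purely illustrative assumptions.

```python
from urllib.parse import urlparse


def rerank(items, query, preferred_domains=()):
    """Re-order result items: preferred domains first, then by how many
    query terms appear (as substrings) in the title or the URL."""
    terms = [t.lower() for t in query.split()]

    def score(item):
        host = urlparse(item["link"]).hostname or ""
        preferred = any(host == d or host.endswith("." + d)
                        for d in preferred_domains)
        haystack = (item["title"] + " " + item["link"]).lower()
        hits = sum(1 for t in terms if t in haystack)
        return (preferred, hits)

    # sorted() is stable, so ties keep the order the search returned.
    return sorted(items, key=score, reverse=True)


# Three items taken from the result list in the first post.
items = [
    {"title": "Chemists bitten by Python scripts: How different OSes produced different results during test number-crunching • The Register Forums",
     "link": "https://forums.theregister.com/forum/all/2019/10/15/bug_python_scripts/"},
    {"title": "MySQL Procedure Analyse Denial Of Service ≈ Packet Storm",
     "link": "https://packetstormsecurity.com/files/137232/MySQL-Procedure-Analyse-Denial-Of-Service.html"},
    {"title": "dis — Disassembler for Python bytecode — Python 3.11.2 documentation",
     "link": "https://docs.python.org/3/library/dis.html"},
]

for item in rerank(items, "python analyse file", ["docs.python.org"]):
    print(item["link"])
```

Substring matching is deliberately crude here (for example "file" also matches "files"); it only shows the kind of ordering being asked for.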

hi, @Lascapi!

  1. you can play with the ranking rules, changing the priority of the various elements. that helped me a lot.

  2. there is something like Google’s PageRank, called CitationRank, which boosts a page more the more other pages link to it. see the faq. i am not sure how effective it is; there have been several threads about it in the forum:
    Documentation unclear about webgraph vs Reverse Link Index
    How to activate and rank by CR - citation rank

  3. i look forward to seeing whether vector search, already present in solr, will ever be implemented in yacy as well. that could help a lot with relevancy. do you have any plans for that, @orbiter?
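to illustrate the idea behind point 2 (more inbound links, bigger boost), here is a toy sketch. it is not YaCy's actual CitationRank computation; the function names, the weight value and the page names are hypothetical and only show the general principle of counting inbound links and using the count as a multiplier.

```python
from collections import Counter


def inbound_link_counts(links):
    """Count, for each page, how many distinct other pages link to it.

    `links` is an iterable of (source, target) pairs; duplicate pairs and
    self-links are ignored.
    """
    unique_edges = {(src, dst) for src, dst in links if src != dst}
    return Counter(dst for _, dst in unique_edges)


def boosted_score(base_score, inbound, weight=0.1):
    """Scale a relevance score up linearly with the inbound link count.

    The linear form and the default weight are illustrative assumptions.
    """
    return base_score * (1 + weight * inbound)


# Hypothetical link graph: pages "a" and "b" both link to "c".
links = [("a", "c"), ("b", "c"), ("a", "c"), ("c", "c"), ("a", "b")]
counts = inbound_link_counts(links)
print(counts["c"], counts["b"], counts["a"])
print(boosted_score(1.0, counts["c"]))
```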

and yes, relevancy is one of the biggest pains for me as well. it resembles altavista in the old days, before google came along.

yacy is useful for me in niche searches, where other search engines have not indexed whole sites, and it is good for having full control over index and search. but the results are still mixed.

@orbiter is diving into all these thrilling new AI functions, whereas the “core” functions like the crawler, indexer, search, RWI etc. contain a lot of inefficiencies and bugs and would need some developers, or rather a team, to improve them.


adding inurl:docs.python.org to a search query can help. it’s not exactly what you mean, but it limits the results to a domain.
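as a small sketch of that tip, this appends the inurl: modifier to the query from the first post and URL-encodes it for the query= parameter. the helper name with_inurl is made up for illustration; only the inurl: modifier itself comes from this thread.

```python
from urllib.parse import quote_plus


def with_inurl(query, url_part):
    """Append an inurl: modifier to a search query string."""
    return f"{query} inurl:{url_part}"


q = with_inurl("python analyse file", "docs.python.org")
print(q)
# URL-encoded form, ready to be used as the value of query=
print(quote_plus(q))
```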
