How to boost results by domain extension?

I have spent a lot of time trying to Boost results for .au domains.

The ChatGPT bot seems to be full of old information, as nothing it told me to do worked. The settings to be changed and the files to be altered did not exist.
Creating the files and making the changes had no effect.

Does anyone know a way to boost results from a certain domain extension (or extensions)?

Thank you.

I give up.
There is no way to get help with YACY.
Developers just assume you know what everything is. It’s like they created it and know what it does and that’s good enough.

The AI, while good for basic stuff, is not good for anything else, as it seems to be stuck about a decade ago, wanting you to edit stuff that no longer exists.

Currently stuck with curl returning 401 all the time, no matter what I try, so I can’t advance on the original issue.
I searched for an answer here on the forum, and there is an unanswered question from 2019 (most here seem to be self-replied) about curl returning 401, still waiting for help and advice.

It’s a shame that the docs and help are so lacking. It could/should be a thriving community with a great piece of software.

EDIT: See “API endpoints always respond with 401 Unauthorized” (Issue #354, yacy/yacy_search_server on GitHub). Looks like it is a feature that it doesn’t work as stated.
If you don’t know how to do it, there is no real explanation of how, just of how normal usage doesn’t work…

Hi,
I understand your frustration.

It seems to me that the original developer, @Orbiter, is more into other projects (after these 18 years or so), so the project really lacks both maintenance and support. From time to time there is a commit or even a release of a new version, although not systematically.

The chatbot is surely fed with the old documentation, so we can’t expect miracles.

I try to write the documentation, but I’m a mere user, so a lot of things are a black box for me. I don’t do Java, so code contribution is mission impossible for me.

What we lack is:

  1. Java developers,
  2. a community helping others in the forum (I wonder why so many questions in the forum go unanswered, or why people come here only once).

Still, YaCy is the only working p2p search engine. The idea is great; the implementation is, IMHO, still immature. For me, it’s still on the edge: is it worth taking the time and effort to contribute, or is it going to be abandoned?

There are at least three things that everyone can do:

  1. answer questions from newbies or other users in the forum and/or GitHub issues. Lack of feedback is probably the most frustrating thing.
  2. extract information already published and contribute it to the documentation or to the documentation fork (which I personally run at: [YaCy Docs]),
  3. if you understand Java, try to fix some issues or contribute new code

I have experimented with a YaCy node for some years now. It is somehow working, but still not ripe for production use (mainly speed, memory and relevance issues for me).

Seems that we’ll only have as good a YaCy as we manage to make ourselves, not relying on the original authors.

I personally don’t have a lot of experience with boosting queries, but from what I understand, it’s a way to use Solr options directly in queries (see Docs: Definition of Ranking Rules). If I had to contend with that, I’d try playing with some Solr queries directly, using the /solr/select?core=collection1&q=... interface. But frankly, I don’t have any more specific advice for you.
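For anyone who wants to try that, here is a minimal Python sketch that only builds such a URL. The base address `http://localhost:8090`, the `rows` and `wt=json` parameters, and the helper name `build_solr_select_url` are my assumptions for illustration; only the `/solr/select?core=collection1&q=...` shape comes from the post above, and the `host_s` field from later in this thread.

```python
from urllib.parse import urlencode


def build_solr_select_url(base_url, query, rows=10):
    """Build a URL for the /solr/select interface (hypothetical helper)."""
    params = {
        "core": "collection1",  # core name as given in the post above
        "q": query,             # raw Solr query string
        "rows": rows,           # assumption: standard Solr paging parameter
        "wt": "json",           # assumption: ask Solr for a JSON response
    }
    return f"{base_url.rstrip('/')}/solr/select?{urlencode(params)}"


# Assumption: a local YaCy node listening on port 8090.
url = build_solr_select_url("http://localhost:8090", "host_s:*.au")
print(url)
```

Opening the printed URL in a browser (or fetching it with any HTTP client) should show whether the query matches the hosts you expect, provided the node allows access to that interface.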

Thanks for your reply.

I know a bit of Java. (started programming when I was 16, 1987)
I was going to look into development and contributing to the docs, but considering I can’t even gather enough information to boost a domain, I’m having second thoughts.

I have seen people wait a year for their changes to be submitted. It is a shame that it has been left to die a slow death.

If I can work out how to boost .au domains (I’m part way there), I’ll hang around. Otherwise, with the randomly bad search results and the utter lack of any real help/docs (not you), I’ll give up and find something else to do with the server.

You appear to be the main/only active helper here.

I would like to hang around and contribute, guess I’ll see how I go in the next couple of hours and decide.

very true. that led me to make my own fork of both yacy and the docs. i try to get the changes committed into mainstream, but my latest doc PR has been hanging for months again. better not to rely on central authorities.

i wish i wouldn’t be the only one.
@joestr made some contributions in the previous year, @inonkps was interested in development, @sixcooler runs the 5 biggest yacy nodes and sometimes contributes code, @orbiter, the main author, most probably works in batches, from time to time, now probably working on solr 9 and some AI & vector search stuff. @roamn (.au based?) runs his wild but interesting performance experiments and stress tests. @akdk7 recently revived & crafted his yacy-stats.de site.
so it’s probably not dead, only… sleeping? fragmented? definitely not very actively maintained. but we could have forks.


While there is probably a better field to use, I made do with what was already enabled.
<str name="host_s">myfoodbook.com.au</str>

I’ve done some brief testing and this seems to work for boosting specific domain extensions.
I placed the below in Boost Query:

host_s = query => *.au^50
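Read in Solr’s own `field:value^boost` notation, the setting above would presumably be written as `host_s:*.au^50`. That reading, and the idea that the Boost Query box is passed on as Solr’s `bq` parameter, are assumptions on my part and are not confirmed in this thread. A small Python sketch (the helper `build_boost_query` is hypothetical) that assembles such clauses:

```python
def build_boost_query(field, boosts):
    """Join pattern/weight pairs into Solr-style boost clauses.

    `boosts` maps a wildcard pattern to its boost weight.
    """
    clauses = [f"{field}:{pattern}^{weight}" for pattern, weight in boosts.items()]
    return " ".join(clauses)


bq = build_boost_query("host_s", {"*.au": 50})
print(bq)  # host_s:*.au^50
```

Passing more than one pattern, for example `{"*.au": 50, "*.nz": 10}`, yields space-separated clauses in the same order.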


Glad you managed!


Thank you :)

I’ve added that into the docs (PR). Hope that’s correct. If not, feel free to edit or extend.


That’s perfectly fine.
I do intend to get my notes (imagine scribbles, currently) together to pass on, but 5-6 weeks in I have zero indexes, lol, for various (learning) reasons…

But I do have Python code (thanks, free ChatGPT) that takes YaCy’s top x results and then re-sorts them with spaCy.
spaCy reads the cached documents and the pages are re-ranked.


nice! would you share the code?

After Hugging Face and spaCy, I have been messing about with straight Python.
Hugging Face and spaCy went pretty well, but they are too complicated IMO due to dependencies. So I went with straight Python, as it is easier for anyone to set up.

The code is slow and still being worked on, but it works.

http://bezazz.com/Archive.zip

EDIT: You would need to pip install:
aiohttp
flask-caching
requests
beautifulsoup4
markupsafe
I recommend a venv (see the Python docs on creating virtual environments).

uvicorn chatGPT:app --host 0.0.0.0 --port 5000
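For readers who just want the general idea of re-ranking the top results in straight Python, here is a minimal sketch. It is not the code from the archive above. It assumes you already have (url, text) pairs for the top results, and it swaps in a very simple term-frequency score in place of whatever the real code does. All names (`tokenize`, `score`, `rerank`) and the sample data are made up for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, document_text):
    """Share of document tokens that are query terms (0.0 for empty text)."""
    tokens = tokenize(document_text)
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    hits = sum(counts[term] for term in set(tokenize(query)))
    return hits / len(tokens)


def rerank(query, results):
    """Sort (url, text) pairs by descending score; ties keep their order."""
    return sorted(results, key=lambda r: score(query, r[1]), reverse=True)


# Made-up sample data standing in for the search engine's top results.
results = [
    ("http://example.com/a", "Recipes for pasta and pizza"),
    ("http://example.com/b", "Lamb roast recipe: roast lamb with rosemary"),
    ("http://example.com/c", ""),
]
ranked = rerank("roast lamb", results)
print([url for url, _ in ranked])
```

A scorer like this is crude (no stemming, no weighting of rare terms), but it has no dependencies, which matches the reason given above for moving to straight Python.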