How to boost YaCy

To all YaCy users!

Many of us have been running YaCy for a long time and have put serious effort into it. Use of the YaCy network is clearly growing, as is the need for alternative search engines.

Yet for some reason the number of participants has plateaued.

Who would like to join in order to boost the market presence of YaCy?

Cheers
M

Hello

What do you have in mind, and how do you imagine doing it?

Regards
Patrick

I’m totally new, just installed it yesterday. For a newbie, the admin interface, with all the settings you need to tweak and adjust, can be a bit overwhelming.

I assume YaCy would benefit from many more nodes, each doing a little bit of work. To achieve that, running a node should be much easier: something like just starting a Docker container and forgetting about it, with minimal configuration. I’m struggling to get YaCy working reliably on my Raspberry Pi, to the extent that the Pi freezes up repeatedly. Since I love self-hosting and try to de-Google myself as much as possible, I’ll keep trying to get it to work. But others might just give up and chuck it when possibly only a little tweaking would solve the issues.

So my suggestion would be to make it very, very simple and easy to set up and start. Apple style: my 2-year-old can already unlock the iPad and play his videos. Having plenty of settings and adjustments available is fine, but the basics should be dead easy to get (and keep) going.

Hello

I cannot say that I am brand new to the YaCy world. Nevertheless, as a YaCy user, I still count myself among the beginners, and in my opinion that will remain so until I get the server load distribution under control.

I agree with you that YaCy has a fairly high barrier to entry, especially for newbies, but it is manageable. It is a bit easier for people who have had experience with open source projects in the past. Fortunately, I had some experience with the OpenSim open source project, which in my opinion prepared me somewhat for YaCy.

A general division of YaCy into a professional area and a YaCy for beginners would, in my opinion, be the way forward. The way YaCy is right now, it’s great for professionals. In beginner mode, the CPU and RAM load should be easier to regulate, e.g. by simply setting percentages: 40% of CPU and RAM for new crawls, 40% of CPU and RAM for recurring crawls, and 20% for the server itself, including search queries from users. I would have wished for something like that at the beginning, because given my level of knowledge, the water I had to jump into was pretty cold.

At the moment my biggest problem is learning how the load distribution works and then configuring YaCy so that the 40% / 40% / 20% split actually holds on my YaCy server. More and more often I wait far too long for pages to load in the admin area, because once again far more crawl jobs are waiting in the queue than expected, since I misjudged the overall size of some websites. Something like that quickly takes the fun out of the search engine.

Greetings Patrick
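
To make the 40% / 40% / 20% idea above concrete, here is a minimal arithmetic sketch of what such a split would mean on a given machine. It is not a YaCy feature or setting: the function `split_budget`, the `Budget` type and the share names are all hypothetical, and the 4 cores / 4096 MB figures in the usage example are assumed values for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    """Resources notionally reserved for one kind of work (hypothetical type)."""
    cpu_cores: float
    ram_mb: int


# Hypothetical share names; the percentages are the ones proposed in the post.
SHARES = {"new_crawls": 40, "recurring_crawls": 40, "server_and_search": 20}


def split_budget(total_cores: float, total_ram_mb: int, shares: dict) -> dict:
    """Divide the machine's CPU cores and RAM according to percentage shares."""
    if sum(shares.values()) != 100:
        raise ValueError("shares must add up to 100 percent")
    return {
        name: Budget(
            cpu_cores=total_cores * pct / 100,
            ram_mb=total_ram_mb * pct // 100,  # rounded down to whole megabytes
        )
        for name, pct in shares.items()
    }


if __name__ == "__main__":
    # Assumed example machine: 4 cores, 4096 MB of RAM.
    for name, budget in split_budget(4, 4096, SHARES).items():
        print(f"{name}: {budget.cpu_cores} cores, {budget.ram_mb} MB")
```

The numbers only describe the target state; how to map them onto YaCy's actual crawler and memory settings is exactly the open question raised in the post.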

Getting it into more distribution package managers would be a step forward. I can’t believe this decade+ old project isn’t in my repo. That would at least lower the barrier to entry by a little bit for anyone looking to get into web crawling.

For me, the biggest problem in using YaCy is that there’s no official instance running. I wouldn’t mind trying to use it as my default search engine for a while and if things look good, then run my own instance to contribute to the network.

@Orbiter is there a reason there’s no official instance?

There are two valid answers here:

  • you did not find it; it’s here: https://yacy.searchlab.eu/
  • I am not promoting it, for example on a front page; it is just a demo peer that you can reach via the following click path: yacy.net → Demo → (at the bottom of the left menu) YaCy Demo Peer

I have been explaining this for 18 years: the YaCy project is not about building a search portal; it is about developing software that users should run themselves. A prominent portal would undermine that goal.

However, since this is requested constantly: a portal will be provided by a follow-up project at https://searchlab.eu

A Flatpak package would be great indeed.
It would work on any Linux machine.

I have not found a better place to post the following idea about boosting YaCy.

As a newbie, I spent a year periodically checking:
https://download.yacy.net/
to see if there was a new version.

Today I found by chance:
https://release.yacy.net/

I suggest the former be removed or turned into a link to the latter.
Having both is confusing.

Thanks.

Hi Salvador,
download.yacy.net hosts the ‘releases’, which have not been made for quite a long time. They should be, but I’m not sure what exactly the mechanism is or how new versions could be released. @orbiter administers that.
release.yacy.net hosts unsigned Unix packages built automatically from GitHub. I didn’t know the URL before, and I have now added the link to some of the docs.
Which page of the documentation is confusing, in your opinion?

Additionally, IMHO, what YaCy is really missing in order to “boost” it is Java developers.
Development and bug fixing have been stalled for quite a long time, @orbiter is busy with his AI work, and there are plenty of unsolved issues on GitHub, with some of the bugs still in core functions. I cannot write Java myself and was only able to fix the most obvious things, so I am at least trying to improve the documentation. Given this lack of manpower, even tiny improvements to either code or docs would help.

Hi Okybaca,
I said it is confusing because I saw https://download.yacy.net/ years ago, and therefore it did not occur to me to look for another place to download the software.
Only a few days ago I found https://release.yacy.net/ .
I think it would be best to just hide the “download” URL for now, given that it is not being updated.
The content is the same after all, except for the version.
I ran YaCy two years ago and it was very usable.
I don’t know about the bugs it contains, but I don’t think anybody expects perfection from a search engine.
Thanks for your contribution, by the way.
I will only be able to run it publicly again a year from now.
If a Linux administrator can be of help, please let me know and I will try to :)

Sure!
The distribution formats are in a bit of a mess. There are various packages for various Unixes, Docker images and so on, and it would be nice to tidy them up a bit, or at least to have an overview of which ones exist, where they are, what is needed to update them, and what a regular update workflow would look like.

I also converted some install guides from the old wiki for various systems, and I’m not able to judge whether they’re outdated or what needs to be done to update them.

That would be really helpful!

And definitely, it’s a great help to respond to other users in the forum and help them solve their problems.
It has been rather one-sided here in recent months: someone asks a question, nobody responds, and a new, potentially active member of the community loses interest.

OK.
As I said, I am currently unable to run YaCy, because I don’t control the router.
But I’ll see what I can do :)

If you have an account on any Unix machine with a public address, you could use an SSH tunnel as described in the FAQ.
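
The FAQ's exact instructions are not quoted in this thread, so the following is only a generic sketch of OpenSSH remote port forwarding (`ssh -R`), not necessarily the procedure the FAQ describes. It assumes YaCy listens locally on its default port 8090; the user name `alice`, the host `example.org` and the helper name `reverse_tunnel_command` are placeholders invented for illustration.

```python
import shlex


def reverse_tunnel_command(user, host, remote_port=8090, local_port=8090):
    """Build an ssh argument list that forwards remote_port on the public
    machine to local_port on this machine (hypothetical helper)."""
    return [
        "ssh",
        "-N",  # do not run a remote command, only forward ports
        "-R", f"{remote_port}:localhost:{local_port}",  # remote port forwarding
        f"{user}@{host}",
    ]


if __name__ == "__main__":
    # Placeholder account on a machine with a public address.
    print(shlex.join(reverse_tunnel_command("alice", "example.org")))
```

The printed command can be run in a terminal, or the list passed to `subprocess.run`. Depending on the server's sshd configuration (the `GatewayPorts` option), the forwarded port may only be bound to the remote machine's loopback interface, so it is worth checking that setting if the port is not reachable from outside.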

Unfortunately, I don’t have any machine with a public IP address at home.
To have that, I would still need control of the apartment’s router.
Or I would have to pay for a hosting service, which I prefer not to do at the moment.

You don’t need a public address to run YaCy; you only need one to be a ‘senior’ peer (one that other peers can search through).
Without a public address you can still crawl and search the P2P network as a ‘junior’ peer and take part in RWI distribution.

Ah, OK.
Thanks for the info :)