Documentation improvement

For some time now I’ve been trying to improve the documentation, at least a little.

The docs are available as a GitHub repository and anyone is welcome to improve them.

I created a list of all the docs available at the time, and @orbiter kindly added it to the homepage menu.

It links to the FAQ, which is the main manual now, to three “Operational” how-tos, and to fragments of the API description.

I also reordered the FAQ in a more logical way and added a few questions and answers.

Much of the information is still only in the legacy wiki, and anyone who could help with transferring the text from the wiki to GitHub is warmly appreciated. In particular, the API description is just one page now, in contrast to the many pages in the old wiki. A dump of the wiki articles would help with converting them to Markdown.

I’m not sure what is missing in the FAQ, especially for newbies, and what could help improve the first experience with YaCy.

If you feel you can add your experience, knowledge, or tips and tricks, or you can help improve the docs even with a few lines, please do so!


And if you want to start improving the documentation but don’t know where to start, you can pick one of the Issues labeled “Documentation” on GitHub. Or pick a wiki entry and transfer it to GitHub.

Ranking rules have been imported from the old wiki into the documentation.
The Crawler API description has been organised and formatted.
A link to the automatically built package has been added to the Download section, since there hasn’t been an official release for quite a long time and the automatically built package is the newest one we have.

@orbiter, does the wiki allow you to export individual articles as raw wiki markup?
It would be easier to convert the articles to .md from raw wiki markup than to do the formatting manually…
Could you make a dump or something, please?

Yes, good idea, I will try an export.

I made an export and had a look at the dump, which was crazy big: 1.5GB.
It is full of strange things, obviously scripted log-in attempts and content that the wiki filtered out by itself using some kind of spam filter; I did not know that MediaWiki can do that. I cannot provide the dump in that form, with the crap inside and without filtering out personal information. Please be patient while I work on cleaning this thing up.

Well, since the Config Settings page of the documentation hadn’t been merged for many months, I forked the standalone documentation, and the new YaCy config settings manual is hosted elsewhere.

Changes are appreciated here, and I will try to commit them to the main repository.

A new manual page about RWI distribution is being written here: RFC: docs: Index distribution in YaCy.
All input, comments and enhancements are warmly welcome.

As the old wiki is obsolete, I started batch-converting the old articles to Markdown, so that they can be managed and edited on GitHub.

The first batch of pages is committed.

Since PRs have not been merged since last November, there is an updated standalone version of the documentation as well.

The pages are probably heavily outdated, so anyone is warmly welcome to check the newly converted ones and correct them.

I converted only the English pages; those interested in other languages can follow my path:

First, I downloaded the pages using wget:

wget -r -l0 -np -E --restrict-file-names=unix,ascii,lowercase --convert-links;
wget -r -l0 -np -E --restrict-file-names=unix,ascii,lowercase --convert-links

Then I converted the pages with pandoc and cleaned them up a bit with sed, using this script:


#!/bin/bash
# convert html to md and place it into the md dir

# create the output dir
mkdir -p md

# convert all html files to md
for f in *.html
do
    echo -n "converting file $f "

    # strip out header and footer: keep everything from the first <h1
    # to the wiki footer line ("Abgerufen von" = "Retrieved from")
    sed -n '/<h1/,/Abgerufen/ p' "$f" > "$f.tmp1"

    # convert from html to GitHub-flavoured md
    pandoc --from html --to gfm "$f.tmp1" -o "$f.tmp2"

    # clean spans and divs
    sed -e 's/<span[^>]*>//g' -e 's/<\/span>//g' -e 's/<div[^>]*>//g' -e 's/<\/div>//g' -e 's/Abgerufen von/Converted from/' "$f.tmp2" > "$f.tmp3"

    # output name: drop the .html suffix, replace the first colon with an underscore
    out="md/$(echo "$f" | sed -e 's/\.html$//' -e 's/:/_/').md"
    echo " to $out"
    mv "$f.tmp3" "$out"

    rm "$f.tmp1" "$f.tmp2"
done
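
For anyone adapting the script, here is a minimal sketch of what the cleanup and renaming steps do to a single page. The helper names strip_tags and md_name and the sample file name en:start.html are made up for illustration; they are not part of the original script.

```shell
#!/bin/bash
# Hypothetical helpers mirroring two steps of the conversion script.

# Remove opening and closing span/div tags from stdin, keeping their content.
strip_tags() {
    sed -e 's/<span[^>]*>//g' -e 's/<\/span>//g' \
        -e 's/<div[^>]*>//g' -e 's/<\/div>//g'
}

# Derive the output path: drop the .html suffix and
# replace the first colon with an underscore.
md_name() {
    printf 'md/%s.md\n' "$(printf '%s' "$1" | sed -e 's/\.html$//' -e 's/:/_/')"
}

# Sample input (made up): a heading wrapped in wiki-generated div/span tags.
printf '%s\n' '<div class="mw"><span id="x">Port forwarding</span></div>' | strip_tags
# prints: Port forwarding

md_name 'en:start.html'
# prints: md/en_start.md
```

Trying the two steps on one line of HTML like this is a cheap way to check a changed sed expression before running it over the whole download.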


Then I carefully edited the results by hand.
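
Hand-editing goes faster when you know which files need the most work. Here is a rough sketch, assuming the converted files sit in md/ as in the script above: it counts the lines that still contain something that looks like an HTML tag and lists the worst files first. The function name leftover_report is hypothetical, and the pattern is only a heuristic (it will also flag Markdown autolinks written in angle brackets).

```shell
#!/bin/bash
# Hypothetical helper: report converted files that still contain raw HTML.

leftover_report() {
    local dir="${1:-md}"
    local f n
    for f in "$dir"/*.md; do
        # skip the literal pattern when no .md files exist
        [ -e "$f" ] || continue
        # count lines containing something shaped like <tag ...> or </tag>
        n=$(grep -c -E '</?[a-zA-Z][^>]*>' "$f")
        if [ "$n" -gt 0 ]; then
            printf '%s\t%s\n' "$n" "$f"
        fi
    done | sort -rn   # highest count first
}

# Print "count<TAB>file" for every file in md/ that needs attention.
leftover_report md
```

Files that do not show up in the report still deserve a read-through, since outdated content is not something a pattern can detect.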

This is the first batch, mostly ‘Installation’ and all the linked ‘Operation’ section articles.

More files are converted, but I still have to edit them by hand and commit them as time constraints allow.

I tried to machine-translate the “Network Definition” page from the original wiki, which hadn’t been translated at all. It’s mainly about how to set up network definition files and how to start your own network, other than the default ‘freeworld’: Network Definition.
Although I tried to clean up the text a lot to make it proper English, I’m not fluent in German, so I’m not sure I didn’t make some mistakes.
If anyone here who is fluent in both German and English could check the translation and/or correct it on GitHub, that would be more than appreciated!

Also, a new batch of old-wiki articles has been transferred. They may be outdated, so if you spot an error, you can correct that as well. Here’s how.