EDIT: @isle, see this: Clarification on crawling levels?
@transysthor
While I can't explain crawl depth levels accurately in a meaningful way (for all I know, YaCy applies the set depth level every time it branches outside the current domain into a new one), I think I can shed some light on Rows to fetch at once etc.
If you go to the Crawler Monitor, you'll see the Solr search API mentioned, linking to something like this:
https://peach.stembod.online:8443/solr/select?core=collection1&q=*:*&start=0&rows=3
(replace host and port as needed)
(This is also useful for testing any desired Auto Crawler Solr query string instead of the default *:*. See the links below for more.)
The number of rows, at least in most databases, refers to the number of database entries to fetch (think of it like a spreadsheet/table, which has rows and columns).
And as you can see, that request asks for 3 results (rows=3). In combination with start=0, I'd guess that means something like "fetch me rows 0 through 2", i.e. the first 3 entries.
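To make the start/rows pairing concrete, here's a small sketch that just builds such a select URL. The base host/port is a placeholder, and the parameter names are exactly the ones from the URL above:

```python
from urllib.parse import urlencode

def select_url(base, start, rows, query="*:*", core="collection1"):
    # Build a Solr select URL like the one the Crawler Monitor links to.
    # `base` (host and port) is a placeholder; replace with your own peer.
    params = {"core": core, "q": query, "start": start, "rows": rows}
    return base + "/solr/select?" + urlencode(params)

url = select_url("https://localhost:8443", start=0, rows=3)
# start=0&rows=3 → offsets 0, 1 and 2: the first three index entries
print(url)
```

Note that urlencode percent-encodes the query, so *:* comes out as %2A%3A%2A; Solr accepts both forms.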
So Rows to fetch at once (with the default setting of 100, and the default *:* query) would mean
-
select?core=collection1&q=*:*&start=0&rows=100
(rows 0 to 99)
And then, when the Auto Crawler is done with that set, it probably does
-
select?core=collection1&q=*:*&start=100&rows=100
(rows 100 to 199)
, then
-
select?core=collection1&q=*:*&start=200&rows=100
(rows 200 to 299)
, and so on…
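That paging pattern, assuming my guess above is right, can be sketched in a few lines. The pagination itself is just offset arithmetic:

```python
def pages(total_results, rows=100):
    # Yield (start, rows) offsets the way the Auto Crawler appears to
    # page through the index: start=0, 100, 200, ... (a guess based on
    # the select URLs above, not confirmed against YaCy's source).
    start = 0
    while start < total_results:
        yield start, rows
        start += rows

print(list(pages(300)))  # [(0, 100), (100, 100), (200, 100)]
```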
And with Deepcrawl every set to the default 50, I'd guess it means that results 50, 150, 250 etc. get queued for a deep crawl (default depth 3), while the others get done as a shallow-depth crawl (default 2(?)).
I'm guessing… and I'm not sure how it deals with the various custom collections, e.g. user
…
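If that guess is right, the selection could look something like this hypothetical sketch. The function name and the exact counting offset (does the first deep crawl happen at result 0 or at result 50?) are my assumptions, not YaCy's actual code:

```python
def crawl_depth(result_index, deep_every=50, deep_depth=3, shallow_depth=2):
    # Hypothetical reading of "Deepcrawl every = 50": every 50th result
    # is queued at the deep depth, the rest at the shallow depth.
    # (Whether counting starts at 0 or 50 is part of the guesswork.)
    return deep_depth if result_index % deep_every == 0 else shallow_depth

deep = [i for i in range(1, 301) if crawl_depth(i) == 3]
print(deep)  # [50, 100, 150, 200, 250, 300]
```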
Regarding the Query setting, I've found these useful:
- https://lucene.apache.org/core/3_5_0/queryparsersyntax.html
- http://www.solrtutorial.com/solr-query-syntax.html
in combination with looking at the fields present in YaCy's IndexSchema_p.html page.
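For example, a couple of Lucene-syntax queries one might try instead of *:*. The field names here (host_s, title, text_t) are ones I believe the IndexSchema_p.html page lists, but check your own schema before relying on them; the snippet just shows how such a query string would be percent-encoded for the select URL:

```python
from urllib.parse import quote

# Field names are assumptions based on YaCy's schema page; verify them
# in IndexSchema_p.html on your own peer first.
queries = [
    "host_s:example.org",          # only documents from one host
    "title:news AND text_t:yacy",  # boolean combination of two fields
]
for q in queries:
    print("q=" + quote(q))
```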