add more information on what our crawler is used for
parent 540a91659f
commit fcec37a9f2

1 changed file with 11 additions and 0 deletions

crawler.html  +11 −0
@@ -24,6 +24,11 @@
                 that software such as search engines can use to help find specific websites.
                 <br/>
                 <br/>
+                our web crawler is specifically used to index pages for the work-in-progress search engine <a href="https://asklyphe.com">askLyphe</a>,
+                which aims not to rely on the results of other search engines and as such needs its own web crawler to function.
+                we do not use our indexes to train neural networks, and we currently do not store pages in their entirety.
+                <br/>
+                <br/>
                 our web crawler attempts to respect standard <a href="https://en.wikipedia.org/wiki/Robots.txt">robots.txt files</a>,
                 and should also respect robots.txt blocks for googlebot (unless you specifically allow vorebot);
                 however, no one is a perfect programmer and we may have made a mistake.
@@ -51,6 +56,12 @@
                 so on.
                 <br/>
                 <br/>
+                Our web crawler is specifically used to index pages for the search engine <a href="https://asklyphe.com">askLyphe</a>,
+                which is currently in development and not available to the public. Our design goal is not to rely on other search engines for our results,
+                so we must run our own web crawler.
+                We do not use our indexes to train neural networks, and we currently do not store pages in their entirety.
+                <br/>
+                <br/>
                 Our web crawler attempts to respect "robots.txt" files (<a href="https://en.wikipedia.org/wiki/Robots.txt">https://en.wikipedia.org/wiki/Robots.txt</a>)
                 and will also respect blocks on "googlebot" (unless you specifically allow "vorebot"). However, our
                 program may make errors. If our program has made an error, please email us at <a href="mailto:devnull@voremicrocomputers.com">devnull@voremicrocomputers.com</a>
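As a rough illustration of the googlebot fallback described in the added text, the sketch below shows one way a crawler could decide which robots.txt group applies: an explicit "vorebot" group wins, otherwise googlebot's rules are honoured (so a block on googlebot also blocks vorebot), otherwise the "*" wildcard group. This is a hedged example only, not vorebot's actual implementation; the RobotsGroup struct, group_for_vorebot, longest_match, and the longest-prefix rule resolution are all assumptions made for the illustration.

use std::collections::HashMap;

/// Hypothetical representation of one parsed robots.txt group
/// (the Allow/Disallow path prefixes declared for a single user-agent).
struct RobotsGroup {
    allow: Vec<String>,
    disallow: Vec<String>,
}

/// Length of the longest rule prefix that matches `path`, or 0 if none match.
fn longest_match(rules: &[String], path: &str) -> usize {
    rules
        .iter()
        .filter(|prefix| path.starts_with(prefix.as_str()))
        .map(|prefix| prefix.len())
        .max()
        .unwrap_or(0)
}

impl RobotsGroup {
    /// The most specific matching rule wins; Allow wins ties,
    /// and a group with no matching rules allows the path.
    fn is_path_allowed(&self, path: &str) -> bool {
        longest_match(&self.allow, path) >= longest_match(&self.disallow, path)
    }
}

/// Pick the group that applies to vorebot: an explicit "vorebot" group wins,
/// otherwise googlebot's rules are honoured, otherwise the "*" wildcard group.
fn group_for_vorebot(groups: &HashMap<String, RobotsGroup>) -> Option<&RobotsGroup> {
    ["vorebot", "googlebot", "*"]
        .iter()
        .find_map(|ua| groups.get(*ua))
}

fn main() {
    // Hypothetical parsed robots.txt that blocks googlebot from /private/
    // and never mentions vorebot at all.
    let mut groups = HashMap::new();
    groups.insert(
        "googlebot".to_string(),
        RobotsGroup {
            allow: vec![],
            disallow: vec!["/private/".to_string()],
        },
    );

    let group = group_for_vorebot(&groups).expect("robots.txt had at least one group");
    // With no explicit vorebot group, the googlebot block applies to vorebot too.
    assert!(!group.is_path_allowed("/private/page.html"));
    assert!(group.is_path_allowed("/public/page.html"));
    println!("googlebot fallback behaves as described on the page");
}

Under this fallback order, a site that only blocks googlebot is treated as blocking vorebot as well, which matches the promise made on the page, while a site that explicitly allows vorebot overrides the googlebot rules.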