What is Yacy the search engine

W

YaCy is a distributed search engine, free of charge, built on the principles of peer-to-peer networks. The source code is written in Java, distributed on several hundred computers, the so-called YaCy peers. Each YaCy-peer crosses the internet, analyzes and indexes the results, and saves indexed results into a common database (the so-called “index”), which is shared with other YaCy-peers using P2P network principles.

Peer-to-peer, or P2P in its abbreviated form, refers to computer networks that use a distributed architecture. This means that all computers or devices that are part of such a system share the tasks between them. Devices that are part of a peer-to-peer network are called peers. There are no “equal” privileges and there is no administrative device at the center of the network.

P2P networks have some features that make them useful:

1. They are difficult to disable. Even if you stop a peer (peer), the others continue to work and communicate. You must stop all participants to disable the network.

2. Peer-to-peer networks are highly scalable. Adding new peers is very easy because there is no need for any configuration on any central server.

3. From a file sharing point of view, the higher a peer-to-peer network, the faster it is. Having the same file stored on multiple participants on a P2P network means that when someone needs to download it, the file is downloaded from multiple locations at once.

Compared with semi-distributed search engines, the YaCy network has a decentralized architecture. Each YaCy-peer is equal and there is no central server. The search engine can be run in both the “search robot” mode and a local proxy server, indexing the pages visited on the user’s computer with several mechanisms in place to protect the identity of users.

The program itself is made available under the GPL License and anyone is welcomed to contribute. Yacy is available for both Linux, but also Windows or Mac users, and can be downloaded from the main menu of the Download Menu at the top right.

Access to search features is done by a local server running a search bar that returns results just like any other popular search engine.

The YaCy search engine is based on 4 major components:

1. Robot
A search engine that goes from page to page, analyzing the content of each page.

2. Index
It creates an inverted word index (RWI), so each RWI word has its own list of relevant addresses and well-classified information. Words are saved as word hash.

3. Search and management interface
Made as a web interface provided by a local HTTP servlet with a servlet engine.

4. Data deposits
Used to store inverted word index in a distributed hash table.

Benefits:
1. Because there is no central server, the results can’t be easily censored, and the reliability is high because there is no single failure point and the search index is redundantly saved.
2. Because a company does not own the engine, there is no centralized advertising system.

Disadvantages:
1. Because there is no central server and the YaCy network is open to everyone, malicious persons can insert commercial or contaminated results.
2. Verification of results is done on the client’s device, which makes the computer running YaCy slow and makes the search much slower than a search engine of the competing company, like Google.

As a conclusion to this article I need to add that although many users see P2P technology as a tool for piracy, others use it to do things that are inherently good, offering a chance of involvement to those who want to belong to it.

Recent Posts

Archives

Categories