The Internet is characterized by an enormous variety of information. This information is spread across many different places and, because of its sheer volume, has to be organized in a meaningful way. The problem has existed since the early days of the Internet, and it has grown steadily with the move to Web 2.0 and the rise of user-generated content. Various approaches have been developed to address it; they are discussed below.
Directories
Web directories are manually maintained lists that are divided into categories. These categories may in turn have subcategories, which results in a hierarchical structure. Each category stores the URLs of suitable resources together with a short description. The best-known representatives in this area are the Open Directory Project and the Yahoo! Directory. However, these directories have significant disadvantages: they must be maintained and extended by hand, and this work cannot keep pace with the growth of the Internet. Searching is also cumbersome for users, who have to navigate through several hierarchy levels before reaching a result.
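The hierarchical structure described above can be illustrated with a small sketch. It is only a model under assumptions: the class names Entry and Category, the category names, and the sample URL are hypothetical and do not reflect how any real directory stored its data.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    """A listed resource: its URL plus a short description."""
    url: str
    description: str


@dataclass
class Category:
    """A directory category that may contain entries and subcategories."""
    name: str
    entries: List[Entry] = field(default_factory=list)
    subcategories: Dict[str, "Category"] = field(default_factory=dict)

    def add_subcategory(self, name: str) -> "Category":
        # Create the subcategory on first use, otherwise reuse the existing one.
        if name not in self.subcategories:
            self.subcategories[name] = Category(name)
        return self.subcategories[name]

    def lookup(self, path: List[str]) -> List[Entry]:
        # Walk down one hierarchy level per path component.
        node = self
        for name in path:
            node = node.subcategories[name]
        return node.entries


# Hypothetical sample data, maintained "by hand" as in a real directory.
root = Category("Top")
python_category = (
    root.add_subcategory("Computers")
    .add_subcategory("Programming")
    .add_subcategory("Python")
)
python_category.entries.append(
    Entry("https://www.python.org/", "Official Python website")
)

found = root.lookup(["Computers", "Programming", "Python"])
```

The lookup makes the usability problem visible: the user has to know, or guess, the full path of categories before a single entry is returned.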
Full text search
The first full-text search engines provided an automated approach to organizing the Internet. They were based on automatically reading Web documents and storing them in an index. Users could now issue generic queries that were matched against the index, and the result contained every document that matched the query. This concept removes the need for manual maintenance, but in most cases the number of matching documents is simply too large to be of real use to the person searching. The problem is partly due to the search engines' lack of understanding of what a web document is actually about.
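A common way to realize such an index is an inverted index, which maps each word to the documents that contain it. The following is a minimal sketch under assumptions: the three sample documents, the function names, and the choice of AND semantics for multi-word queries are illustrative and not taken from any particular search engine.

```python
import re
from collections import defaultdict

# Hypothetical mini-corpus: document id -> text.
documents = {
    "doc1": "Web directories are maintained manually",
    "doc2": "Search engines build an index of web documents",
    "doc3": "An index maps words to documents",
}


def tokenize(text):
    # Lowercase the text and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    # Inverted index: word -> set of ids of the documents containing it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    # Return the documents that contain every query word (AND semantics).
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


index = build_index(documents)
hits = search(index, "index documents")
```

Note that the result is an unordered set: every matching document is returned with equal weight, which is exactly why large result sets are hard to use.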
Inclusion of meta information
Meta information is generally understood to be data that describes a page. With the help of this data, the content can be described more precisely and turned into machine-processable information. Search engines can then better assess the content of a website and narrow the result set of a query. In HTML documents, typical meta information is placed within the <head> element and is known as meta tags.
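As an illustration, meta tags can be read with the HTMLParser class from Python's standard library. This is a sketch under assumptions: the sample page and the class name MetaTagParser are made up, and only tags with both a name and a content attribute are collected.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects <meta name="..." content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase and attrs as (name, value) pairs.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


# Hypothetical sample page.
page = """<html><head>
<title>Example</title>
<meta name="description" content="An overview of search approaches">
<meta name="keywords" content="search, index, ranking">
</head><body>...</body></html>"""

parser = MetaTagParser()
parser.feed(page)

# Split the comma-separated keyword list into individual terms.
keywords = [k.strip() for k in parser.meta["keywords"].split(",")]
```

A search engine could index the extracted description and keywords as separate fields instead of treating them as ordinary body text.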
Introduction of a ranking
Even search engines that process metadata can only deliver a more accurate result list. As already mentioned, this list is usually still too large for a user to extract by hand the information they are actually looking for. For this reason, ranking-based search engines were developed. They additionally assess each individual result of a query and thereby create an ordering that makes the result list practical to work with. The largest representatives of this kind in the Western world are Google, Yahoo and Bing.
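The idea of ranking can be sketched with a deliberately simple scoring rule. This is an assumption made for illustration only: the score used here is the share of a document's words that are query words, which is not how Google, Yahoo or Bing rank results. The sample documents and function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical mini-corpus: document id -> text.
documents = {
    "doc1": "ranking ranking search",
    "doc2": "search engines and ranking of search results",
    "doc3": "web directories",
}


def tokenize(text):
    # Lowercase the text and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


def score(text, query_words):
    # Illustrative score: fraction of the document's words that are query words.
    counts = Counter(tokenize(text))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(counts[word] for word in query_words) / total


def ranked_search(docs, query):
    query_words = set(tokenize(query))
    scored = [(score(text, query_words), doc_id) for doc_id, text in docs.items()]
    # Keep only matching documents; best score first, ties broken by document id.
    matches = [(s, d) for s, d in scored if s > 0]
    matches.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in matches]


results = ranked_search(documents, "search ranking")
```

Unlike a plain full-text match, the result is an ordered list, so the user can start with the documents judged most relevant and stop early.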