Search engines
Search engines are specialized websites designed to help people find the information they need on the internet quickly and easily. While each search engine (e.g. Google, Yahoo! Search, etc.) takes a different approach to gathering data, they all perform three core functions:
Searching the internet based on important words;
Keeping an index of what they find and where they found it; and
Allowing users to look for words or combinations of words in that index.
Search Engine Spiders
The key to a search engine's success lies in special programs called "spiders" - automated software robots - that build lists of the words found on a web page or website. They are usually sent out from a central computer and start with popular websites and servers that experience heavy use. A spider crawls over a web page, catalogs its words and links, follows those links to other pages and websites, and builds an index or listing of "key search words" that online users can use to find the pages they are looking for.
Over the years, search engines have upgraded and improved their systems to give users more convenience and faster response times. One such improvement is the use of "meta-tags" - key words placed in a web page by its author that make the page easier for a spider to pinpoint and index.
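As a rough illustration of the cataloging step described above, here is a minimal sketch in Python that pulls the words and links out of a single page. It is not how any particular search engine works: the names PageScanner and scan_page, the sample page, and the example.com address are all made up for this example, and the page is read from a string rather than fetched over the network.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import re


class PageScanner(HTMLParser):
    """Collects visible words and outgoing links from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.words = []
        self.links = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own address.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Ignore script and style contents; keep lowercase words from the rest.
        if self._skip_depth == 0:
            self.words.extend(re.findall(r"[a-z0-9]+", data.lower()))


def scan_page(base_url, html):
    """Return (words, links) found in the given HTML text."""
    scanner = PageScanner(base_url)
    scanner.feed(html)
    scanner.close()
    return scanner.words, scanner.links


# Hypothetical sample page, used only for illustration.
SAMPLE_HTML = """
<html><head><title>Apple Pie</title></head>
<body><h1>Apple pie recipes</h1>
<p>Fresh apples make the best pie.</p>
<a href="/pictures">Pictures of apples</a>
</body></html>
"""

words, links = scan_page("http://example.com/recipes", SAMPLE_HTML)
print(words)
print(links)
```

A real spider would then add each address in links to a queue of pages still to visit, which is how it moves from one website to the next.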
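To make the idea of meta-tags concrete, the sketch below reads the key words from a page's "keywords" meta-tag. It assumes the common form of that tag, a comma-separated list in the content attribute; the names MetaKeywordReader and read_meta_keywords and the sample page are invented for this example.

```python
from html.parser import HTMLParser


class MetaKeywordReader(HTMLParser):
    """Collects the entries of any <meta name="keywords"> tag on a page."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "keywords":
            content = attributes.get("content") or ""
            # Split the comma-separated list and tidy each entry.
            self.keywords.extend(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def read_meta_keywords(html):
    """Return the list of key words declared in the page's meta-tags."""
    reader = MetaKeywordReader()
    reader.feed(html)
    reader.close()
    return reader.keywords


# Hypothetical sample page, used only for illustration.
SAMPLE_PAGE = """
<html><head>
<title>Apple Pie</title>
<meta name="keywords" content="apples, apple pie, baking">
</head><body>Recipes for apple pie.</body></html>
"""

print(read_meta_keywords(SAMPLE_PAGE))
```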
Making Sense of the Data - Indexing the Words
Once the spiders have collected the data, the search engine processes and stores it in a way that makes it easy for people to access.
It would be easy to store each word and its URL address in a single, searchable database, but a user looking at that bare-bones information would have no way of knowing which page is more important or which one fits the search he has in mind.
For example, someone searching for "apples" may want apples for making apple pie, or may want a picture of apples. If the database simply listed "apples" with a URL, the user would have to do his own tedious searching to find the specific context he is looking for.
For this reason, most search engines index and collect more than mere words and URL addresses: they may index words that are in specific locations on the page (title and sub-title areas, for example), meta-tags, frequency of words on the page, etc.
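The sketch below shows one simple way such an index could be built, with a word counting for more when it appears in the title or meta-tags than in the body of the page. The weights, the page data, and the names build_index and search are assumptions made for this example; real search engines use their own, far more elaborate, ranking rules.

```python
import re
from collections import defaultdict

# Hypothetical weights: a word in the title counts more than one in the body.
WEIGHTS = {"title": 5, "keywords": 3, "body": 1}


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to {url: score}, adding the field's weight per occurrence."""
    index = defaultdict(lambda: defaultdict(int))
    for url, fields in pages.items():
        for field, text in fields.items():
            for word in tokenize(text):
                index[word][url] += WEIGHTS.get(field, 1)
    return index


def search(index, word):
    """Return the URLs containing the word, highest score first."""
    scores = index.get(word.lower(), {})
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical pages, as a spider might have recorded them.
PAGES = {
    "http://example.com/pie": {
        "title": "Apple pie recipes",
        "keywords": "apples, pie, baking",
        "body": "Peel the apples, then bake the pie.",
    },
    "http://example.com/photos": {
        "title": "Fruit photos",
        "keywords": "pictures, fruit",
        "body": "Pictures of apples and pears.",
    },
}

index = build_index(PAGES)
print(search(index, "apples"))
```

Because the recipe page mentions "apples" in both its meta-tags and its body, it scores higher for that word than the photo page, which only mentions it in the body, so the recipe page is listed first.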
Google and the other search engines are in a constant race to improve their search capabilities. One approach is concept-based searching, which uses statistical analysis of the pages containing the words one is searching for in order to identify other pages of interest. Another approach is natural language queries, where one types in a question the same way he would ask another person - "What is a search engine?" for example.
The most popular search engine using natural language queries is AskJeeves.com, which "zooms in" on the key words of the question (e.g. What, search engine) and uses them as the starting point for checking through its index of words.
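As a loose illustration of "zooming in" on key words, the sketch below drops common filler words from a question and keeps the rest. This is only a guess at the general idea, not a description of how AskJeeves.com actually works; the filler-word list and the function name key_words are assumptions made for this example.

```python
import re

# Hypothetical list of filler words to ignore; a real engine's list will differ.
FILLER_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "do", "does"}


def key_words(question):
    """Return the words of a question that are not in the filler list."""
    words = re.findall(r"[a-z0-9]+", question.lower())
    return [word for word in words if word not in FILLER_WORDS]


print(key_words("What is a search engine?"))
```

For the question above, this keeps "what", "search" and "engine", which could then be looked up in the index of words.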