Sunday, September 12, 2010

What The Search Engines Say - Teoma

To truly understand what sets Teoma apart from the competition, you first need to know the three primary techniques used by search engines today:

Text Analysis - determines a site’s relevance by the text on the page. This technique was fine when the Web was small and spammers couldn’t artificially increase their rankings.
Popularity - ranks a site by the number of links pointing to it, on the premise that more links mean a more popular site. However, popularity is not necessarily the best judge of relevance.
Status - goes beyond popularity by analyzing the importance or "status" of the sites providing the incoming links. But this lacks context because it doesn’t calculate whether the incoming links are related to the search subject.
Subject-Specific Popularity - This is the Teoma difference. Teoma uses elements of the above three techniques, but it does more. Rather than rely on the recommendations of the entire Web audience to determine relevance, Teoma technology uses Subject-Specific Popularity to connect with the authorities - the experts within specific interest groups that guide it to the best resources for a subject.
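
To make the contrast between plain popularity and subject-specific popularity concrete, here is a toy sketch in Python. It is not Teoma's algorithm, which is proprietary and not described here. The link graph, the subject labels, and both function names are assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical toy link graph: (linking page, linked site).
LINKS = [
    ("cooking-blog", "recipes.example"),
    ("chef-forum", "recipes.example"),
    ("car-news", "recipes.example"),
    ("car-news", "autos.example"),
    ("car-club", "autos.example"),
]

# Hypothetical subject label for each linking page.
SUBJECT = {
    "cooking-blog": "cooking",
    "chef-forum": "cooking",
    "car-news": "cars",
    "car-club": "cars",
}

def popularity(links):
    """Count every inbound link, regardless of who provides it."""
    counts = defaultdict(int)
    for _source, target in links:
        counts[target] += 1
    return dict(counts)

def subject_specific_popularity(links, subject_of, subject):
    """Count only inbound links from pages within the given subject."""
    counts = defaultdict(int)
    for source, target in links:
        if subject_of.get(source) == subject:
            counts[target] += 1
    return dict(counts)
```

In this toy data, recipes.example has the most links overall, but for the subject "cars" it is autos.example that collects the most links from pages within that community.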

The Teoma Web Crawler FAQ

The Teoma Crawler is Ask Jeeves' Web-indexing robot (also called a crawler or spider in the search world). The crawler collects documents from the Web to build the ever-expanding index for our advanced search functionality at Ask Jeeves at Ask.com, Ask.co.uk and Teoma.com (among other Web sites that license the proprietary Teoma search technology).

Teoma differs from other search technologies because it analyzes the Web as it actually exists - in subject-specific communities. This process begins by creating a comprehensive and high-quality index. Web crawling is an essential tool for this approach, and it ensures that we have the most up-to-date search results.

Q: What is a Web crawler/Web spider?
A: A Web crawler (also called a spider or robot) is a software program designed to follow hyperlinks throughout a Web site, retrieving and indexing pages to document the site for searching purposes. The crawlers are innocuous and cause no harm to an owner's site or servers.

Q: Why does Teoma use Web crawlers?
A: Teoma utilizes Web crawlers to collect raw data and gather information that is used in building our ever-expanding search index. Crawling ensures that the information in our results is as up-to-date and relevant as it can possibly be. Our crawlers are well designed and professionally operated, providing an invaluable service that is in accordance with search industry standards.

Q: How does the crawler work?
A: The crawler goes to a Web address (URL) and downloads the HTML page. It then follows the hyperlinks on that page, which are URLs on the same site or on different sites, and adds any new URLs to its list of URLs to be crawled. It repeats this process continually: discovering new URLs, following links, and downloading pages.

The crawler excludes some URLs if it has downloaded a sufficient number from the Web site or if it appears that the URL might be a duplicate of another URL already downloaded. The files of crawled URLs are then built into a search catalog. These URLs are displayed as part of search results on sites powered by Teoma's technology when a relevant match is made.
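
The loop described in this answer can be sketched in a few lines of Python. This is a minimal illustration, not the Teoma crawler's code: the `crawl` function, the `fetch_links` callback, the `max_per_site` limit, and the in-memory `FAKE_WEB` are all assumptions, and the duplicate check here is a simple exact-match test.

```python
from collections import deque
from urllib.parse import urlsplit

def crawl(seed, fetch_links, max_per_site=2):
    """Breadth-first crawl starting at seed.

    fetch_links is a caller-supplied (hypothetical) function that
    downloads a URL and returns the URLs it links to.
    """
    frontier = deque([seed])   # URLs waiting to be crawled
    seen = {seed}              # URLs already queued, to skip duplicates
    per_site = {}              # pages downloaded so far, per host
    downloaded = []

    while frontier:
        url = frontier.popleft()
        host = urlsplit(url).netloc
        # Exclude the URL if enough pages came from this site already.
        if per_site.get(host, 0) >= max_per_site:
            continue
        per_site[host] = per_site.get(host, 0) + 1
        downloaded.append(url)
        # Add newly discovered URLs to the list of URLs to be crawled.
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return downloaded

# A tiny in-memory stand-in for the Web, so the sketch runs offline.
FAKE_WEB = {
    "http://a.example/": ["http://a.example/1", "http://b.example/"],
    "http://a.example/1": ["http://a.example/2", "http://a.example/"],
    "http://a.example/2": [],
    "http://b.example/": ["http://a.example/"],
}

pages = crawl("http://a.example/", lambda url: FAKE_WEB.get(url, []))
```

With the assumed limit of two pages per site, the third page on a.example is discovered but not downloaded.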

Q: How frequently will the Teoma Crawler download pages from my site?
A: The crawler will download only one page at a time from your site (specifically, from your IP address). After it receives a page, it will pause a certain amount of time before downloading the next page. This delay time may range from 0.1 second to hours. The quicker your site responds to the crawler when it asks for pages, the shorter the delay.
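
One simple way to picture a delay that shrinks as the site responds faster is to scale it with the last response time. The FAQ does not give the actual formula, so the sketch below is only a guess at the general shape: the 0.1 second floor comes from the answer above, while `next_delay`, `factor`, and `max_delay` are assumed names and values.

```python
def next_delay(response_seconds, factor=10.0,
               min_delay=0.1, max_delay=3600.0):
    """Return the seconds to wait before the next request to the same IP.

    The 0.1 second floor is taken from the FAQ; factor and max_delay
    are assumed values chosen only for illustration.
    """
    # Scale the wait with the response time, then clamp it to the range.
    return max(min_delay, min(max_delay, factor * response_seconds))
```

Under these assumptions, a site that answers in 2 seconds would be revisited after 20 seconds, and a very fast site would still get at least a 0.1 second pause between pages.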

Q: How can I tell if the Teoma crawler has visited my site/URL?
A: To determine whether the Teoma crawler has visited your site, check your server logs. Specifically, you should be looking for the following user-agent string:

User-Agent: Mozilla/2.0 (compatible; Ask Jeeves/Teoma)
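
As a rough sketch of the log check, the Python below filters log lines for the crawler's name. The sample lines are made up, the combined log format is an assumption about your server, and `teoma_hits` is a hypothetical helper name.

```python
# Substring taken from the user-agent string quoted above.
TEOMA_MARKER = "Ask Jeeves/Teoma"

def teoma_hits(log_lines):
    """Return the log lines that mention the Teoma user-agent."""
    return [line for line in log_lines if TEOMA_MARKER in line]

# Made-up sample lines, assuming the common "combined" access log format.
SAMPLE_LOG = [
    '192.0.2.1 - - [12/Sep/2010:10:00:00 +0000] "GET / HTTP/1.0" 200 512 '
    '"-" "Mozilla/2.0 (compatible; Ask Jeeves/Teoma)"',
    '192.0.2.2 - - [12/Sep/2010:10:00:05 +0000] "GET / HTTP/1.1" 200 512 '
    '"-" "Mozilla/5.0 (Windows NT 6.1)"',
]

hits = teoma_hits(SAMPLE_LOG)
```

To run it against a real log, pass the open file object instead of `SAMPLE_LOG`, since a file iterates line by line.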

Q: How did the Teoma Web crawler find my URL?
A: The Teoma crawler finds pages by following links (HREF tags in HTML) from other pages. When the crawler finds a page that contains frames (i.e., it is a frameset), the crawler downloads the component frames and includes their content as part of the original page. The Teoma crawler will not index the component frames as URLs themselves unless they are linked via HREF from other pages.

Q: What types of links does the Teoma crawler follow?
A: The Teoma crawler will follow HREF links, SRC links and re-directs.
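
For illustration, here is how HREF and SRC links could be collected from a page with Python's standard `html.parser` module. This is a sketch, not Teoma's extraction logic: the `LinkExtractor` class and `extract_links` function are hypothetical, the sketch accepts `href` and `src` on any tag (the FAQ does not say which tags count), and following redirects is left out because that happens at the HTTP level.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collect the targets of href and src attributes as absolute URLs."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Assumption: any tag carrying href or src counts as a link.
            if name in ("href", "src") and value:
                self.links.append(urljoin(self.base_url, value))

def extract_links(html, base_url):
    """Return every href/src target in html, resolved against base_url."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

Relative links are resolved against the page's own URL, so a frame declared as `<frame src="menu.html">` yields a full address that the crawler can download.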

Q: Does the Teoma crawler include dynamic URLs?
A: We include a select number of dynamic URLs in our index. However, they are screened to detect likely duplicates before downloading.
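
The FAQ does not say how dynamic URLs are screened. One common approach, shown here purely as an assumed example, is to reduce each URL to a canonical key by sorting its query parameters and dropping ones that look like session identifiers. The parameter names in `IGNORED_PARAMS` and both function names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed names of parameters that do not change the page content.
IGNORED_PARAMS = {"sessionid", "sid"}

def canonical_key(url):
    """Reduce a dynamic URL to a key that ignores parameter order."""
    parts = urlsplit(url)
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    params.sort()
    # Rebuild the URL with a lowercase host and without the fragment.
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path,
                       urlencode(params), ""))

def is_likely_duplicate(url, seen_keys):
    """Return True if an equivalent URL was screened before."""
    key = canonical_key(url)
    if key in seen_keys:
        return True
    seen_keys.add(key)
    return False
```

With this key, two URLs that differ only in parameter order or session ID are treated as the same page, so only the first would be downloaded.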

Q: Why has the Teoma crawler not visited my URL?
A: If the Teoma crawler has not visited your URL, it is because we did not discover any link to that URL from other pages (URLs) we visited.

Q: How do I register my site/URL with Teoma so that it will be indexed?
A: We appreciate your interest in having your site listed on Ask Jeeves and the Teoma search engine. It is important to note that we no longer offer a paid Site Submission program. As a result of some recent enhancements to Teoma, we're confident that we're indexing even more Web pages than ever, and that your site should appear in our Search index as a result of our ongoing "crawling" of the Web for new and updated sites and content.

If you are the owner or webmaster of the site in question, you may also want to consult online resources that offer tips on how to build your Web site and set up your Web server so that search engines can read your content well, and on how search engines index pages and match them to different types of search keywords.

Q: Why aren't the pages the Teoma crawler indexed showing up in the search results at Teoma.com?
A: If you don't see your pages indexed in our search results, don't be alarmed. Because we are so thorough about the quality of our index, it takes some time for us to analyze the results of a crawl and then process the results for inclusion into the database. Teoma does not necessarily include every site it has crawled in its index.
