Web Crawler at a Glance
Size: less than 10 million URLs (approx.)
Spider Class: Shallow
Meta Tag support: Yes
Frame support: No
Image Map support: No
Alt Text support: No
HTML Comments: No
URL Searching: Yes
Embedded Directory: Yes
Submission URL: Submit
SUBMISSION POLICY
Web Crawler is unique among the search engines discussed in this document. For one thing, Web Crawler is one of the oldest search engines and one of the smallest. Although Web Crawler and Excite merged a while ago, Web Crawler has managed to maintain its own unique system.
Web Crawler has this to say about its submission policy (see ref.):
It seems that more and more website owners and designers have been "spamming" -- including unsolicited, extra or irrelevant information on their pages, usually in the form of word lists -- in order to make search engines display them at the top of their listings. This practice is something that we strongly discourage. Searching the Internet is our business, and spamming actively interferes with providing everyone who uses the World Wide Web with the best search engine we possibly can.
In order to make our index cleaner and more navigable, and to foster a more level playing field for everyone, we've started removing these pages from our index and screening new submissions. If you load pages with long, repetitive word lists or titles, this will cause WebCrawler either to ignore the repetition or, in some cases, to ignore such documents entirely.
Web Crawler is the first search engine here to openly admit that multiple titles are considered spamming.
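To get a rough feel for what a "long, repetitive word list" looks like to a program, the short Python sketch below measures how much of a page's text is taken up by its single most frequent word. This is only an illustration: the function name `repetition_ratio` is hypothetical, and Web Crawler has not published how it actually detects repetition.

```python
import re
from collections import Counter


def repetition_ratio(text):
    """Return the share of all words taken up by the most frequent word.

    Hypothetical heuristic for illustration only; it is not Web Crawler's
    actual spam filter, which is not documented in the source.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    most_common_count = Counter(words).most_common(1)[0][1]
    return most_common_count / len(words)


if __name__ == "__main__":
    stuffed = "idaho idaho idaho idaho web"
    normal = "Northern Webs is a web design studio in North Idaho"
    print(repetition_ratio(stuffed))
    print(repetition_ratio(normal))
```

A page whose ratio is far higher than that of ordinary prose is the kind of page the policy above warns about. Any cutoff value would be a guess, so none is suggested here.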
HTML FACTS
Web Crawler indexes all text on a page (up to 1 megabyte). Web Crawler does not provide support for frames or imagemaps. Additionally, Web Crawler ignores comments and alt text.
Web Crawler was the first system to implement an artificial intelligence routine to generate a summary for an entry. However, they quickly saw the problems in using this method and decided to offer support for the meta description tag. Should your page omit the meta description tag, Web Crawler will invoke its AI routine to determine a summary for your page.
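The choice between the designer's description and a generated summary can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the names `SummaryParser` and `choose_summary` are hypothetical, and the fallback (the first 150 characters of visible text) is only a stand-in for Web Crawler's AI routine, whose workings are not described here. The sketch skips comments and alt text, in line with the facts above.

```python
from html.parser import HTMLParser


class SummaryParser(HTMLParser):
    """Collects the meta description and the visible text of a page."""

    def __init__(self):
        super().__init__()
        self.description = None  # content of <meta name="description">
        self.text_parts = []     # visible text, used only as a fallback
        self._skip_depth = 0     # > 0 while inside <script>, <style> or <title>

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attr_map = dict(attrs)
            if (attr_map.get("name") or "").lower() == "description":
                self.description = attr_map.get("content")
        elif tag in ("script", "style", "title"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style", "title") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # HTML comments never reach this method, and alt text lives in an
        # attribute, so both are left out of the collected text.
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def choose_summary(html, max_len=150):
    """Prefer the meta description; otherwise fall back to page text."""
    parser = SummaryParser()
    parser.feed(html)
    parser.close()
    if parser.description:
        return parser.description.strip()
    # Assumed fallback for illustration: the start of the visible text.
    return " ".join(parser.text_parts)[:max_len]
```

With a meta description present, the designer's wording is returned unchanged. Without one, the summary is whatever text the fallback happens to pick up, which is why supplying the tag is the safer course.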
RANKING METHODS
Web Crawler says this about its ranking method (see ref.):
1. Use a title uniquely descriptive of your page or site. Since WebCrawler's indexing/relevance algorithm gives slightly more weight to titles than to body text, pages with titles containing dead-weight words like "Homepage" or "Home Page on the WWW" don't often get easily found.
2. Make sure that the main page of the site describes to the fullest extent possible what the site's about. It doesn't have to be over-long and exhaustive, but as much text with the important words in it as you can possibly have without sacrificing the design/layout of the site will help on the indexing front.
The first item is fairly standard: Web Crawler likes to see unique site titles. The second item is more interesting. For one thing, it is the opposite of the approach to ranking used by Excite. Web Crawler wants more descriptive text, not less.
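The effect of giving titles "slightly more weight" than body text can be illustrated with a toy scoring function in Python. Everything here is an assumption made for illustration: the weight values, the function names, and the simple word-matching scheme are hypothetical, and Web Crawler's real relevance algorithm is not published in the source.

```python
import re

# Assumed weights: titles count somewhat more than body text.
TITLE_WEIGHT = 1.5
BODY_WEIGHT = 1.0


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, title, body):
    """Count query-word matches, weighting title matches more heavily."""
    terms = set(tokenize(query))
    title_hits = sum(1 for word in tokenize(title) if word in terms)
    body_hits = sum(1 for word in tokenize(body) if word in terms)
    return TITLE_WEIGHT * title_hits + BODY_WEIGHT * body_hits


if __name__ == "__main__":
    body = "We do web design in Idaho"
    print(score("idaho web design", "Home Page", body))
    print(score("idaho web design", "North Idaho Web Design Studio", body))
```

Under these assumptions, a dead-weight title such as "Home Page" contributes nothing to the score, while a descriptive title adds to it, and more body text containing the important words raises the score further.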
Shown below are two entries taken from Web Crawler. In the first entry, the page being summarized contained a meta description tag, so control of the summary remained in the hands of the designer. The second entry lacked any meta tags whatsoever; the text pulled from the page to generate the summary came from the bottom of that page and does not adequately reflect the content of the page.
Northern Webs - North Idaho's Premier Web Design Studio
Northern Webs, North Idaho's most experienced Web Design studio. Similar Pages
http://www.northernwebs.com/
71%
Idaho Department of Law Enforcement Home Page
Police Departments Sheriff Departments Other Departments Attention Patch Collectors For an Idaho State Police Patch, please send a self-addressed, stamped envelope (SASE) and $5.00 (U.S. currency) to: Idaho State Police Association attn: Tom Wilson 3056 Elder St Boise, ID 83705 USA Please allow 4-6 weeks for delivery Similar Pages
http://www.state.id.us/dle/dle.htm
44%
This clearly illustrates our third law of search engines:
Third Law of Search Engines: If you don't follow the rules they lay down, the search engines will do something unexpected with your page.
SUMMARY
The Web Crawler spider is a shallow spider, so be prepared to submit your primary pages to them.
Although small by comparison, Web Crawler is backed by the folks at Excite, which translates into all that AOL exposure. You can't go wrong by submitting your site to Web Crawler.