Channel: Feedblog » crawler
Browsing all 10 articles


Spinn3r, Facebook, Google, People Search, and Crawling Profiles.

The news out today is that Facebook will open up their profiles for use with Google and other search engines: One of the great features of Facebook was privacy. You could be assured that what was in...




Announcing New Versions of Spinn3r and Tailrank

This is a big day for us. We’re announcing new versions of both Tailrank and Spinn3r. The first big announcement is Spinn3r 2.0: After nearly a year in development, I’m pleased to announce the release...


Thoughts on Efficient Crawling through URL Ordering

I’m re-reading “Efficient Crawling through URL Ordering” and a few other papers I read a few years ago. Now that I have Skim I can take notes in the PDF directly, which is turning out to be amazingly...


Storing the Full Internet

The other day I blogged about Blekko and what it would take in terms of hardware to index the full Internet. High Scalability responded with some interesting thoughts. Kevin Burton calculates that...


Yahoo Extends Semantic Web Support

Looks like Yahoo is releasing more details about web standards, RDF, and microformat support in their search platform: While there has been remarkable progress made toward understanding the semantics...



Is WordPress Insecure by Design?

Ouch. So much for upgrading to WordPress 2.5 for a secure version of WordPress. While the shift is going in the right direction it might not fully fix the problem now that this exploit is known....


Robot Yield

This morning I was thinking about robot blocks regarding Rich’s post about Cuill being blocked on 10k hosts. So let’s say you write a web scale crawler and you accidentally pushed a bug. It was a huge...


Cuil Hitting Too Hard?

Looks like Cuil might be hitting websites too hard with their crawler: “I don’t know what spawned it, but when Cuil attempts to index a site, it does so by completely hammering it with traffic,” the...


Spinn3r Sponsors 2009 International Conference for Weblogs and Social Data...

Spinn3r is sponsoring the International Conference for Weblogs and Social Media this year with a snapshot of our index. The data set was designed for use by researchers to build cool and interesting...


Spinn3r 3.0: New Features, New Architecture, New APIs – More Goodness

I’m proud to announce that we have just released Spinn3r 3.0 after more than a year of development. This has been quite a lot of work based on feedback from our customer base and ships with some really...




