Marchex has announced that today it will be launching over 100,000 new websites, totaling over 1 billion pages of content. These websites target both local searches (e.g. denverautorepair.com, 90210.com) and specific verticals (locksmiths.com). Rather than aiming for a central portal on one domain, similar to an IYP site such as SuperPages, Marchex is hoping that these thousands of sites will each be relevant enough to pull searchers in as they’re discovered through the SERPs. However, one thing that you will notice if you go to these individual sites is that they have a very similar look and feel (which you would expect with 100,000 sites being launched at once), and that they all link, in some way, to myzip.com, which can show the exact same information as the individual sites.
Looking at the robots.txt for myzip.com, you can see that they are basically shutting the crawlers out of everything on the site:
User-agent: *
Disallow: /-/home/
Disallow: /-/results/
Disallow: /-/detail/
Disallow: /-/about/
Disallow: /-/terms/
Disallow: /-/privacy/
Disallow: /-/guidelines/
In fact, the only page that I found on myzip.com that could be crawled was the portal page (http://www.myzip.com/-/portal/?p=Portal&). So maybe this is because they’re only having those 100,000 individual sites crawled? After all, they’d get some benefit from the URLs, right? Well, here’s the robots.txt for denverautorepair.com:
User-agent: *
Disallow: /-/results/
Disallow: /-/detail/
Disallow: /-/about/
Disallow: /-/terms/
Disallow: /-/privacy/
Disallow: /-/guidelines/
Aha! A difference! They’re not blocking the /-/home/ directory on this site. So what is there? All of the unique content? Well… not quite. It’s the same content on every page in that directory; only the sponsored listings are different…
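For anyone who wants to repeat the comparison, here is a minimal sketch using Python’s standard `urllib.robotparser`. The two rule sets are pasted in from the listings above rather than fetched live, and the `crawlable` helper, the `ExampleBot` user agent and the list of test paths are my own illustrative choices, not anything Marchex publishes.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the two robots.txt listings quoted in this post.
MYZIP_RULES = """\
User-agent: *
Disallow: /-/home/
Disallow: /-/results/
Disallow: /-/detail/
Disallow: /-/about/
Disallow: /-/terms/
Disallow: /-/privacy/
Disallow: /-/guidelines/
"""

DENVER_RULES = """\
User-agent: *
Disallow: /-/results/
Disallow: /-/detail/
Disallow: /-/about/
Disallow: /-/terms/
Disallow: /-/privacy/
Disallow: /-/guidelines/
"""

# Hypothetical sample of paths to test; /-/portal/ is the page found crawlable on myzip.com.
PATHS = ["/-/home/", "/-/results/", "/-/detail/", "/-/portal/"]


def crawlable(rules_text, paths, agent="ExampleBot"):
    """Return the paths that the given robots.txt text lets `agent` fetch."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return [path for path in paths if parser.can_fetch(agent, path)]


if __name__ == "__main__":
    print("myzip.com:", crawlable(MYZIP_RULES, PATHS))
    print("denverautorepair.com:", crawlable(DENVER_RULES, PATHS))
```

Run against these two rule sets, the only path that differs is /-/home/, which is blocked on myzip.com and open on denverautorepair.com.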
Marchex is looking to distinguish itself in scale and quality from the so-called “domain parking” industry that often prey on accidental visitors to their sites by serving up low-quality advertising links on random pages
From the Reuters article
…such as serving up ads for buying homes in Florida on a page about Denver car repair???

Now, while it’s true that this is a relevance quirk on Yahoo’s side (I reached this page by clicking on the “See sponsored links for: Florida” crawlable link on the site, which is populated by Yahoo), the fact remains that Marchex is allowing these pages to be indexed. Of course, houses in Florida on a Denver page isn’t the most fun example, so how about this crawlable page on locksmiths.com stuffed full of Carmen Electra ads? Because when you need a locksmith, you obviously need something to take your mind off being locked out of your house or car…

Admittedly, this is a small sample that I’ve looked at, but it does look a little strange if they’re trying to distance themselves from the domain parking sites, yet the only pages they’re having crawled are those with different ad sets, especially when those ad sets may not be related to the content of the page. It could be that they’ve not yet ‘launched’ these sites fully, and the robots.txt files may be changing, so I’ll check back tomorrow and see if they’re still the same or not, but still…
**update - 2 days later** - Looking at the robots.txt files for denverautorepair.com and locksmiths.com, I don’t see any change, so it looks like this is how they intend it to be.