Use the blog, Luke

Nearly eight years after Justin Hall uploaded his first hypertext diary entry, weblogging has finally hit the mainstream. Everyone seems to have a published opinion on this not-so-new new thing, and if the attention seems a little belated, it's not undeserved.

After all, a number of significant developments separate us from pioneering sites like Links From the Underground or Robot Wisdom: The blogging population itself has grown dramatically, and has begun organizing itself into a genuine community rather than a series of isolated sites; software tools have been built specifically to let noncoders create and maintain blogs; and the universe of potential pages to link to has expanded by several orders of magnitude since Hall launched his site. There's simply more Web to log, and consequently more need for experienced guides.

Then there are the high-profile migrations: print journalists like Mickey Kaus, Virginia Postrel and Andrew Sullivan, who have managed to enhance the mainstream credibility of the blog genre, while simultaneously exploring new business models. (With some genuine success -- Sullivan says he is now breaking even, and his new book-club feature has made him an Oprah-style kingmaker on Amazon.com.) Just as it did five years ago with the Web zine world, the appearance of old-journalism celebs has triggered a wave of articles and Op-Eds, debating the merits of this new form. Thus far the debate has centered on whether blogs constitute a new model of journalism or simply a minor variation on an existing theme: an Op-Ed page with more links and fewer fact checkers.

But the debate is a false one. What makes blogs interesting is precisely the way in which they're not journalism. Sure, if more writers can follow in Sullivan's wake and turn their blogs into revenue-generating enterprises, blogs will certainly mark a qualitative change as far as the underlying economics go. (Effectively it will mean that bloggers have a new, usually modest revenue stream to supplement what they take home from their day jobs.) But the journalistic form itself won't be all that earth-shattering, certainly no more revolutionary than the first-generation Web zines, which were often staffed like old-style print magazines, but sported hypertext, multimedia and genuine community interaction alongside those traditional mastheads.

The true revolution promised by the rise of bloggerdom is not about journalism. It's about information management. The bloggers have the potential to do something far more original than offer up packaged opinions on the news of the day; they can actually help organize the Web in ways tailored to your minute-by-minute needs. Often dismissed as self-obsessed "vanity sites," the bloggers actually have an important collective role to play on the Web. But they're not challengers to the throne of the New York Times and the Wall Street Journal. They're challengers to the throne of Google.

As it happens, the bloggers already function as a kind of kitchen cabinet for Google's relevancy ranking algorithm. Google measures relevancy by determining how many other pages link to a given page -- the more people point to your "Remington Steele" tribute site, the more likely it is that Google will recommend it to someone searching for info on '80s detective shows or Pierce Brosnan or Henry Mancini theme songs. Those pointers are themselves ranked by Google: If a lot of highly linked-to pages link to your page, you'll rise even higher in the rankings.
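The ranking idea described above can be sketched in a few lines. This is a toy illustration of link-based ranking, not Google's actual algorithm: the function name `rank_pages`, the `damping` value, the iteration count and the sample page names are all assumptions made for the example.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Toy link-based ranking (hypothetical helper, not Google's code).

    links maps each page to the list of pages it links to. A page's score
    grows with the number of pages pointing at it, and pointers from
    highly ranked pages count for more.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Two blogs that link to each other both point at one tribute site;
# a rival tribute site gets a single link from an unlinked-to fan page.
sample_links = {
    "blog_a": ["steele_tribute", "blog_b"],
    "blog_b": ["steele_tribute", "blog_a"],
    "lonely_fan_page": ["other_tribute"],
}
sample_ranks = rank_pages(sample_links)
```

In this sample the page the two blogs agree on ends up ranked above its rival, which is the Blogger Effect in miniature.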

You'd be hard-pressed to design a system that gave the blogging community a greater impact on Google's results. Because bloggers by definition link far more than your average Web page, and because they also tend to link to each other's sites (most blogs feature a now standard list of comrades in their margins), a page that attracts the attention of a few bloggers will quickly shoot up the Google rankings. Do a search on Larry Lessig's book "The Future of Ideas" -- a hit with the blogging community -- and a review from a blog called Sopsy Digest shows up 15 notches higher than an article from Business Week. (Or at least it did the last time I checked; Google rankings are hardly set in stone.)

This is the Blogger Effect. It's what happens when the arbiters of relevance in the "attention economy" shift toward a bottom-up structure. Google thinks pages are relevant now not just because they've received the imprimatur of Condé Nast or the New York Times, but because they caught the interest of Sopsy and friends.

Now, that's good news if you like Sopsy more than you like, say, Howell Raines. But if you can't stand Sopsy, or you've no idea who he/she/it is, then it's a little bit disturbing that the site is skewing your Google rankings. There are significant political consequences to the Blogger Effect: Because the blogging community contains a disproportionate number of libertarians, it's possible that Google searches on certain hot-button issues will start skewing toward libertarian-friendly pages. Given Google's increasing prominence, this libertarian slant could prove to be more significant than the more familiar concerns about liberal bias in the major networks, and conservative bias on Fox News. No sensible person thinks "The O'Reilly Factor" is free of political slant (save O'Reilly himself). But the great oracle of Google is supposed to be above such partisan concerns.

The solution is not to eliminate the bloggers from Google. The solution is to create more Googles. Or, even better, to transform the data generated by the bloggers into something that rivals what Google does -- to extract some new kind of collective wisdom out of a universe of armchair opinion leaders.

Think about those bloggers pointing to Sopsy and causing the site to rise in the Google rankings: Are they providing a journalistic function with those links? On some level, perhaps. But they are also doing something closer to information management, more librarian or archivist than Woodward and Bernstein. The bloggers are helping Google learn what pages should be connected to other pages, or to particular text strings. They are helping Google transform the Web from a disorganized mess into a more coherent universe of useful data. But their contributions to this noble cause have been limited to date, partially because the bloggers themselves have been too busy boxing with the phantoms of traditional journalism.

Beyond the unspoken collective effect on Google's results, the blog world has already been mined for global patterns in a number of interesting experiments, like Blogdex, which creates a kind of alternative headline news by tracking popular URLs in recent posts. Then there's Weblog Bookwatch, which scans for Amazon URLs in new blog entries, and constructs a regularly updated list of books that are "top of mind" with bloggers. (An interesting corrective to ordinary bestseller lists, in that it measures which books get talked about, rather than which ones get bought.)

But both Blogdex and Bookwatch share a conceptual limitation with most individual blogs, a limitation that is hard-wired into the software used by the great majority of webloggers: They are organized around time.

Time is central to the philosophical DNA the blogs share with journalism: Both compulsively feature today's link, today's controversy, today's top books. This might seem like an obvious organizational principle, but it comes with great restrictions. Google, for instance, is largely oblivious to time: When you use Google, you're usually not looking for up-to-the-minute info, you're looking for authority and depth. (Try getting a useful stock quote directly from Google and you'll understand immediately.) Many of the bloggers that I follow comment on links that are time-sensitive on the scale of a year or two: Someone's rant on the latest XML spec revisions is just as relevant next week, though probably not nearly so relevant a decade from now. But because those links fall off the front door every few days, they effectively enter a de facto oblivion, where I have to hunt them down actively three weeks later when I'm looking around for useful assessments of XML. The beautiful thing about most information captured by the bloggers is that it has an extensive shelf life. The problem is that it's being featured on a rotating shelf.

If there's a time element that I do care about, it's not the just-off-the-wires time of today's news. It's my time. It's what I'm doing right now. I don't always want to know what über-blogger Jason Kottke happens to be thinking about this morning -- I want to know what he thinks about the page I'm currently reading, or the paragraph I just wrote. If I stumble across a page 10 weeks after Jason wrote up a description of it on Kottke.org, his description is just as valuable to me as it was 10 weeks before -- in fact, it's probably more valuable, because I've come across the page on my own personal journey. But as it stands now, to figure out if Jason's referenced the page I have to copy the URL and paste it into the search engine on Kottke.org. If I've got 20 or 30 bloggers that I'm following, I've got to paste that URL into 20 or 30 separate input fields.

But the bloggers needn't be anchored to the headline-news mentality. Think of them as less like a newspaper substitute and more a kind of guardian angel, hovering over your shoulder as you surf. (The Alexa software created by Brewster Kahle relied on a similar approach: He called it a "surf engine.") Punch up a URL and if Jason, or Andrew Sullivan, or Sopsy has an opinion about that page, you see their comments in a floating window alongside your main browser window. It's a simple enough trick: Sites like Blogdex are already tracking blog-borne references to different URLs. All your browser would have to do is send an additional request to a database of blogged URLs anytime you pulled up a page: If there's a match -- if one of the bloggers you're following has referenced the URL -- their comments get sent back to your machine and appear in the floating palette.
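A minimal sketch of that lookup, assuming a single shared index of blogged URLs. The class `BloggedUrlIndex`, the `normalize` helper, the blogger names and the example.com addresses are all hypothetical; a real service would sit behind a network request rather than an in-memory dictionary.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Reduce trivially different spellings of a URL to one lookup key."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    # Lowercase the host, drop any #fragment, keep the query string.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


class BloggedUrlIndex:
    """Hypothetical database of which bloggers have commented on which URLs."""

    def __init__(self):
        self._comments = defaultdict(list)

    def add_post(self, blogger, url, comment):
        self._comments[normalize(url)].append((blogger, comment))

    def comments_for(self, url, followed):
        """Return (blogger, comment) pairs for this URL, limited to the
        bloggers the reader has chosen to follow."""
        key = normalize(url)
        return [(b, c) for b, c in self._comments.get(key, []) if b in followed]


index = BloggedUrlIndex()
index.add_post("blogger_a", "http://Example.com/essay/", "Worth reading.")
index.add_post("blogger_b", "http://example.com/essay", "I disagree.")

# What the floating palette would show a reader who follows only blogger_a.
palette = index.comments_for("http://example.com/essay#top", {"blogger_a"})
```

The browser's only job is to call something like `comments_for` each time a page loads; an empty list means the palette stays quiet.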

The critical standardized part in this machine is the URL: Because pages -- and Amazon products -- have distinct identifying text strings, you can assemble references to them into new higher-level forms of information: bookblogs and blogdexes and guardian blogs. But the URL is only one potential component part among many. If we had standardized tags for just five or six additional elements, you could start mining the blog space for on-the-fly information resources that would truly rival Google's. You'd need fixed categories describing who is doing the linking and who his or her "friends" are; you'd need a summary of the response to the link, alongside the full text of the response; you'd need keywords, as well as the number of comments generated in an active thread responding to the link.

Perhaps most important, you'd also need a way to distinguish between positive and negative links. Right now, systems like Google's PageRank presume that the decision to link to a page is by definition an endorsement of the page linked to. You need only think of how many times Andrew Sullivan has linked to the Op-Ed columns of his arch-nemesis Paul Krugman to recognize the flaw in this logic. Positive linking should certainly be the default, but if bloggers are going to be organizing the Web for us, they need to be able to point to pages that suck without giving those pages an even higher standing on Google.
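To make the wish list concrete, here is one way those standardized elements might look as a record, with a flag for negative links. No such standard exists in the text above; the `BlogLink` class, its field names and the `endorsement_counts` helper are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BlogLink:
    """One blogged link, carrying the elements listed above (hypothetical format)."""
    url: str                                       # the page being linked to
    blogger: str                                   # who is doing the linking
    friends: list = field(default_factory=list)    # that blogger's "friends"
    summary: str = ""                              # short summary of the response
    full_text: str = ""                            # full text of the response
    keywords: list = field(default_factory=list)
    comment_count: int = 0                         # size of the discussion thread
    endorses: bool = True                          # positive linking is the default


def endorsement_counts(links):
    """Count links per URL, ignoring links explicitly marked as negative."""
    counts = {}
    for link in links:
        counts.setdefault(link.url, 0)
        if link.endorses:
            counts[link.url] += 1
    return counts


sample = [
    BlogLink(url="http://example.com/column", blogger="blogger_a", endorses=False),
    BlogLink(url="http://example.com/column", blogger="blogger_b"),
    BlogLink(url="http://example.com/rant", blogger="blogger_a", endorses=False),
]
counts = endorsement_counts(sample)
```

With the flag in place, a page that is linked to only in order to be mocked stays at zero instead of climbing the rankings.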

If the blog space were to standardize around these categories, what kind of information-management tools might we be able to create? Here's one scenario. You define a few "guardian" bloggers, perhaps by checking a box when you visit their site. You also instruct your software to watch the activity on sites maintained by "friends" of those key bloggers. You tell the software that you want a medium level of intrusiveness: In other words, you want the system to point out useful information to you, but you don't want it constantly bombarding you with data at every turn. And then you start using your computer as you normally do: surfing, writing e-mail, drafting Word documents.

Behind the scenes as you write or read, the software on your machine scans the last few paragraphs for high-information text, the six or seven words that make that paragraph distinct from the average paragraph sitting on your machine. If there's a URL included in the text, it grabs that too. The software then sends a query to the blogs maintained by your guardian bloggers, as well as those maintained by their friends -- say 20 blogs in all -- and searches for posts that include those keywords. Since you've defined a medium level of intrusiveness, it might only grab the URL and summary text for posts that match at least half of your keywords, and that appear on at least 25 percent of the blogs you're tracking. Let's say Jason Kottke has linked to a related article; if four other bloggers you're following have also linked to that URL, Jason's description of the article pops up beside the paragraph you've just written.
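Here is a rough sketch of that scenario, under loud assumptions. "High-information" words are approximated with a simple rarity score against a set of background paragraphs, which is one plausible reading of the description rather than a known implementation; `distinctive_words`, `suggest`, and the post dictionaries with their `blog`, `url`, `summary` and `text` keys are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9']+", text.lower())


def distinctive_words(paragraph, background_docs, top_n=6):
    """Pick the words that set this paragraph apart from the background text."""
    doc_freq = Counter()
    for doc in background_docs:
        doc_freq.update(set(tokenize(doc)))
    n_docs = len(background_docs)
    counts = Counter(tokenize(paragraph))
    # Frequent here and rare elsewhere scores high; common words score near zero.
    scores = {
        word: count * math.log((1 + n_docs) / (1 + doc_freq[word]))
        for word, count in counts.items()
    }
    ranked = sorted(scores, key=lambda word: (-scores[word], word))
    return ranked[:top_n]


def suggest(keywords, posts, tracked_blogs, keyword_share=0.5, blog_share=0.25):
    """Return (url, summary) pairs for URLs whose posts match enough keywords
    on enough of the tracked blogs. The defaults mirror the "medium" setting."""
    if not keywords:
        return []
    needed_hits = math.ceil(len(keywords) * keyword_share)
    needed_blogs = math.ceil(len(tracked_blogs) * blog_share)
    blogs_by_url = {}
    summary_by_url = {}
    for post in posts:
        if post["blog"] not in tracked_blogs:
            continue
        words = set(tokenize(post["text"]))
        hits = sum(1 for keyword in keywords if keyword in words)
        if hits >= needed_hits:
            blogs_by_url.setdefault(post["url"], set()).add(post["blog"])
            # Keep the first matching blogger's summary for display.
            summary_by_url.setdefault(post["url"], post["summary"])
    return [
        (url, summary_by_url[url])
        for url, blogs in blogs_by_url.items()
        if len(blogs) >= needed_blogs
    ]
```

Raising `keyword_share` or `blog_share` makes the software quieter; lowering them makes it chattier, which is all the "level of intrusiveness" needs to mean in this sketch.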

This wouldn't be a recommendation engine so much as a connection machine, tracking the flow of words across your screen and linking them fluidly to other text residing on the Web. You can make those connections as loud or as soft as you want: Perhaps the software only suggests other URLs and blog posts when you request them. (Running your blog analyzer might be akin to running a spell checker when you're done with a draft.) Other users might set their thresholds around timeliness or "heat" -- only pop up a window when there's a related link that's been posted in the past 24 hours, or when there's a link that's generated a 20-post discussion thread.
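The timeliness and "heat" settings amount to simple filters applied before anything pops up. A minimal sketch, assuming each candidate post carries a `posted_at` timestamp and a `thread_posts` count (both hypothetical field names, as is the `should_pop_up` function):

```python
from datetime import datetime, timedelta


def should_pop_up(post, now, max_age=None, min_thread_posts=None):
    """Apply whichever thresholds the reader has set; None means 'don't care'."""
    if max_age is not None and now - post["posted_at"] > max_age:
        return False
    if min_thread_posts is not None and post["thread_posts"] < min_thread_posts:
        return False
    return True


now = datetime(2002, 5, 10, 12, 0)
fresh_post = {"posted_at": now - timedelta(hours=3), "thread_posts": 2}
hot_post = {"posted_at": now - timedelta(days=9), "thread_posts": 25}

# A reader who cares about timeliness sees only the fresh post...
fresh_only = [p for p in (fresh_post, hot_post)
              if should_pop_up(p, now, max_age=timedelta(hours=24))]
# ...while a reader who cares about heat sees only the busy thread.
hot_only = [p for p in (fresh_post, hot_post)
            if should_pop_up(p, now, min_thread_posts=20)]
```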

There are almost as many potential ways to manage that new flow of information as there are bloggers providing it. But to open up these new avenues, the bloggers are going to have to shed their dependence on the traditional journalistic models: Instead of our going to today's blog the way we pick up today's paper, the bloggers should follow us around, providing context and commentary, supplementing our libraries and our memory. Many blogs out there possess the standards and intelligence of conventional journalism, but there are already too many of them to keep track of the way we subscribe to old-style magazines or habitually tune in to favorite TV networks. If the blogging population expands at the current rate, soon enough you'll be able to spend an entire day just reading the front doors of all your bookmarked blogs. Better to do away with the dependence on front doors, and let your favorite bloggers come to you.

In an essay published in last month's Business 2.0, James Wolcott describes the blog experience as "a one-on-one unmediated relationship between writer and reader paradoxically made possible by the most mass of media, the Internet. Each blog is like a blinking neuron in the circuitry of an emerging, chatterbox superbrain." It's a typically well-crafted phrase, and there's something undeniably compelling about the description, but the fact that Wolcott tosses out both ideas -- one-on-one relationships and superbrains -- as though they were synonymous suggests that it's the poetry of the words that attracts him, rather than the underlying substance. There is a world of difference between the one-on-one encounter and the emerging superbrain. Blogs already excel at the former -- they're long on one-on-one encounters. But their emerging superbrains could use a little work.

The collective future of blogs lies not in dethroning the New York Times -- but in becoming a force that can make sense of the Web's infinity of links.

By Steven Johnson

Published May 10, 2002 7:30PM (EDT)
