Personalization Doesn’t Have to Make Search Perfect – Just Better

First published July 19, 2007 in Mediapost’s Search Insider

For the first time in a long time, I’ve been consistently frustrated with the results that Google’s been returning for several of my searches. It’s not that Google’s getting worse; it’s that the nature of my searches has changed significantly. My searches are getting fuzzier as I’m stepping into territory I don’t know very well. Google is not functioning terribly well as my “discovery” engine.

Aaron’s Ambient Findability

Aaron Goldman wrote an absolutely fascinating column last week about ambient findability, based on Peter Morville’s book. I’ll definitely be taking Aaron’s advice and ordering my copy from Amazon soon. The interesting thing was that I read Aaron’s column shortly after I did an interview with Jakob Nielsen where he expressed similar cynicism about the practicality of search personalization. To sum up, both instances pointed to the fact that personalization is very difficult to do right. It’s probably impossible to do perfectly. But then again, personalization shouldn’t be perfect because humans aren’t. There will always be the human element of variability and unpredictability.

Google’s limits as a discovery engine

As much as the topic of ambient findability fascinates me (I explored the territory myself in a previous Search Insider), I won’t steal Aaron’s thunder because I know he’s doing a follow-up column this week. I’ll take a more mundane path and talk about my increasing level of frustration with Google.

As I mentioned in last week’s column, I’m currently doing research for a book. Right now, what I’m researching is the nitty-gritty of why and how we make purchase decisions. By the way, Aaron suggested an interesting book, so I’ll do the same. Please do yourself a favor and pick up a copy of Clotaire Rapaille’s “The Culture Code.” This is one of the most fascinating marketing books I’ve read in some time. Rapaille talks about the challenge of doing traditional market research in trying to uncover people’s attitudes towards brands or other aspects of our culture, like food, healthcare and even the American presidency. The problem is that in most traditional market research vehicles (focus groups & surveys) we’re stuck with what people say. It’s almost impossible to uncover what people really feel. What people say comes directly from their cerebral cortex, the logical and rational part of their brain. But what they feel comes from the limbic and reptilian part of the brain, the dark, shadowy corners of our personas. The minute you ask them a question, no matter what the format, you immediately get the cortex in gear. This got me thinking about neural marketing and the actual mechanisms in our mind that click over when we make the decision to buy or not.

Rapaille’s book simply served to whet my appetite. I voraciously started looking for more of the same but books, research or articles that explore the primal reasons why we buy seem to be few and far between (hint: if you know of any, please pass them along in the Search Insider blog so we can all share). I turned to Google and tried a number of queries to try to dig up academic research or Web sites on the subject matter. I was definitely venturing into new territory and while Google usually acts as a reliable guide, it was leaving me stranded high and dry in these particular quests.

Personalization is an idea, not an algorithm

So, let’s get back to personalization. Would personalization in the form (Kamvar’s algorithm) that is currently being envisioned and rolled out by Google help me in this matter? Probably not. The signals (search and Web history) would be too few to help me zero in on the content I’m looking for. It wouldn’t really improve Google’s utility as a “discovery” engine. It would run into the same road blocks that Aaron and others consistently point out.

But here’s the thing. Google is making a huge bet on personalization. But personalization is not the only thing Google is working on. Personalization simply acts as a hub. MIT’s Technology Review recently did an interview with Peter Norvig, Google’s Director of Research. Norvig is, quite literally, a rocket scientist (he was head of computational sciences at NASA in a previous life) who is taking Google’s research in some interesting new directions. Speech recognition and machine translation are two notable areas. Speech recognition can overcome some major input obstacles not only on the desktop, but, more notably, on mobile devices and on a convergent home screen that fully integrates our online world and entertainment options. And machine translation can enable a number of automated systems that can power further online functionality. Both are very much aligned with Google’s engineering view of the universe, where introducing people into the equation just introduces friction in an otherwise perfect world.

But the really telling part of the interview came when the conversation turned to search. Norvig talks about the current imbalance of search, where there is an avalanche of data available but the only gate to that data is the few words the searcher chooses to share with the search engine. We’re trying to paint personalization into a corner based on Google’s current implementation of it. And that’s absolutely the wrong thing to do. Personalization is not a currently implemented algorithm, or even some future version of the same algorithm. It’s an area of development that will encompass many new technologies, some of which are under development right now in some corner of Google’s labs.

Personalization, in its simplest form, is simply knowing more about you as an individual and using that knowledge to better connect you to content and functionality on the Web. There are many paths you can take to that same end goal. Sep Kamvar’s algorithm is just one of them. By the way, Norvig’s particular area of expertise is artificial intelligence. Let’s for a moment stop talking about personalization and start talking instead about what the inclusion of true artificial intelligence could do for the search experience. But artificial intelligence requires signals, and personalization is a good bet to provide those signals. It doesn’t have to get it perfect every time; it just has to make it better.

Just as a last point, Marissa Mayer said in an interview that Google’s current forays into personalization serve no other purpose than to give Kamvar some data to play with to improve his algorithms. We’ve all quickly jumped on personalization (and yes, I’m probably the most guilty of this) as the new direction of search, but many of us (and I believe my guilt ends here) are making the assumption that personalization means a form of what we’re seeing today. It doesn’t. Not by a long shot. And, at the end of the day, what we’re looking for is a jump ahead in matching our needs with what the Web has to offer. To win, Google doesn’t have to do it perfectly. It just has to do it better than everyone else.

Ask: The Reasoning Behind Ask 3D

Last week in my interview with Jakob Nielsen, he called Ask’s 3D label “stupid”. Just to refresh your memory, here’s how the exchange went:

Gord: Like Ask is experimenting with right now with their 3D search. They’re actually breaking it up into 3 columns, and using the right rail and the left rail to show non-web based results.

Jakob: Exactly, except I really want to say that it’s 2 dimensional, it’s not 3 dimensional.

Gord: But that’s what they’re calling it.

Jakob: Yes I know, but that’s a stupid word. I don’t want to give them any credit for that. It’s 2 dimensional. It’s evolutionary in the sense that search results have been 1 dimensional, which is linear, just scroll down the page, and so potentially 2 dimensional (they can call it three but it is two) that is the big step, doing something differently and that may take off and more search engines may do that if it turns out to work well.

My friend Michael Ferguson at Ask (who has his own interview coming up soon) sent me a quick email with the reasoning behind the label:

The 3D label came from the 3 dimensions of search we folded onto one page: query expression in the left rail, results in the center, and content on the right (vs. the one dimension of returning solely results).

Interview with Jakob Nielsen on the Future of the SERP (and other stuff)

I recently had the opportunity to talk to Jakob Nielsen for a series I’m doing for Search Engine Land about what the search results page will look like in 2010. Jakob is called a “controversial guru of Web design” in Wikipedia (Jakob gets his own shots in at Wikipedia in this interview) because of his strongly held views on the use of graphics and flash in web design. I have a tremendous amount of respect for Jakob, even though we don’t agree on everything, because of his no frills, common sense approach to the user experience. And so I thought it was quite appropriate I sound him out on his feelings about the evolution of the search interface, now that with Universal search and Ask’s 3D Search we seem to be seeing more innovation in this area in the last 6 months than we’ve seen for the last 10 years. Jakob is not as optimistic about the pace of change as I am, but the conversation was fascinating. We touched on Universal Search, personalization, banner blindness on the SERP and scanning of the web in China, amongst other things. Usability geeks..enjoy!

Gord: For today I only really have one question, although I’m sure there’ll be lots of branch-offs from it. It revolves around what the search engine results page may look like in 2010. I thought you would be a great person to lend your insight on that.

Jakob: Ok, sure.

Gord: So why don’t we just start? Obviously there are some things that are happening now with personalization and universal search results. Let’s just open this up. What do you think we’ll be seeing on a search results page in 3 years?

Jakob: I don’t think there will be that big a change because 3 years is not that long a time. I think if you look back three years at 2004, there was not really that much difference from what there is today.  I think if you look back ten years there still isn’t that much difference.  I actually just took a look at some old screen shots in preparation before this call at some various search engines like Infoseek and Excite and those guys that were around at that time, and Google’s Beta release, and the truth is that they were pretty similar to what we have today as well.  The main difference, the main innovation seems to have been to abandon banner ads, which we all know now really do not work, and replace them with the text ads, and of course that affected the appearance of the page.  And of course now the text ads are driven by the key words, but in terms of the appearance of the page, they have been very static, very similar for 10 years.  I think that’s quite likely to continue. You could speculate the possible changes. Then I think there are three different big things that could happen.

One of them that will not make any difference to the appearance and that is a different prioritization scheme. Of course, the big thing that has happened in the last 10 years was a change from an information retrieval oriented relevance ranking to being more of a popularity relevance ranking. And I think we can see a change maybe being a more of a usefulness relevance ranking. I think there is a tendency now for a lot of not very useful results to be dredged up that happen to be very popular, like Wikipedia and various blogs. They’re not going to be very useful or substantial to people who are trying to solve problems. So I think that with counting links and all of that, there may be a change and we may go into a more behavioral judgment as to which sites actually solve people’s problems, and they will tend to be more highly ranked.

But of course from the user perspective, that’s not going to look any different. It’s just going to be that the top one is going to be the one that the various search engines, by whatever means they think of, will judge to be the best and that’s what people will tend to click first, and then the second one and so on. That behavior will stay the same, and the appearance will be the same, but the sorting might be different. That I think is actually very likely to happen.

Gord: So, as you say, those will be the relevancy changes at the back end. You’re not seeing the paradigm of the primarily text based interface with 10 organic results and  8-9 sponsored results where they are, you don’t see that changing much in the next 3 years?

Jakob: No. I think you can speculate on possible changes to this as well. There could be small changes, there could be big changes. I don’t think big changes. The small changes are, potentially, a change from the one dimensional linear layout to more of a two dimensional layout with different types of information, presented in different parts of the page so you could have more of a newspaper metaphor in terms of the layout. I’m not sure if that’s going to happen. It’s a huge dominant user behavior to scan a linear list and so this attempt to put other things on the side, to tamper with the true layout, the true design of the page, to move from it being just a list, it’s going to be difficult, but I think it’s a possibility. There’s a lot of things, types of information that the search engines are crunching on, and one approach is to unify them all into one list based on its best guess as to relevance or importance or whatever, and that is what I think is most likely to happen. But it could also be that they decide to split it up, and say, well, out here to the right we’ll put shopping results, and out here to the left we’ll put news results, and down here at the bottom we’ll put pictures, and so forth, and I think that’s a possibility.

Gord: Like Ask is experimenting with right now with their 3D search. They’re actually breaking it up into 3 columns, and using the right rail and the left rail to show non-web based results.

Jakob: Exactly, except I really want to say that it’s 2 dimensional, it’s not 3 dimensional.

Gord: But that’s what they’re calling it.

Jakob: Yes I know, but that’s a stupid word. I don’t want to give them any credit for that. It’s 2 dimensional. It’s evolutionary in the sense that search results have been 1 dimensional, which is linear, just scroll down the page, and so potentially 2 dimensional (they can call it three but it is two) that is the big step, doing something differently and that may take off and more search engines may do that if it turns out to work well.  But I think it’s more likely that they will work on ways on integrating all these different sources into a linear list. But those are two alternative possibilities, and it depends on how well they are able to produce a single sorted list of all these different data sources.  Can they really guess people’s intent that well?

All this stuff..all this talk about personalization, that is incredibly hard to do. Partly because it’s not just personalization, based on a user model, which is hard enough already. You have to guess that this person prefers this style of content and so on. But furthermore, you have to guess as to what this person’s “in this minute” interest is and that is almost impossible to do. I’m not too optimistic on the ability to do that. In many ways I think the web provides self personalization, you know, self service personalization. I show you my navigational scheme of things you can do on my site and you pick the one you want today, and the job of the web designer is to, first of all, design choices that adequately meet common user needs, and secondly, simply explain these choices so people can make the right ones for them. And that’s what most sites do very poorly. Both of those two steps are done very poorly on most corporate websites. But when it’s done well, that leads to people being able to click – click and they have what they want, because they know what they want, and it’s very difficult for the computer to guess what they want in this minute.

Gord:  When we bring it back to the search paradigm, giving people that kind of control to be able to determine the type of content that’s most relevant to them requires them interacting with the page in some way.

Jakob: Yes, exactly, and that’s actually my third possible change. My first one was changing to the ranking scheme; the second one was the potentially changing to two dimensional layouts. The third one is to add more tools to the search interface to provide query reformulation and query refinement options. I’m also very skeptical about this, because this has been tried a lot of times and it has always failed.  If you go back and look at old screen shots (you probably have more than I have) of all of the different search engines that have been out there over the last 15 years or so, there have been a lot of attempts to do things like this. I think Microsoft had one where you could prioritize one thing more, prioritize another thing more. There was another slider paradigm. I know that Infoseek, many, many years ago, had alternative query terms you could do just one click and you could search on them, which was very simple. Yet most people didn’t even do that.

People are basically lazy, and this makes sense.  The basic information foraging theory, which is, I think, the one theory that basically explains why the web is the way it is, says that people want to expend minimal effort to gain their benefits.  And this is an evolutionary point that has come about because the people, or the creatures, who don’t exert themselves, are the ones most likely to survive when there are bad times or a crisis of some kind. So people are inherently lazy and don’t want to exert themselves. Picking from a set of choices is one of the least effortful interaction styles which is why this point and click interaction in general seems to work very well. Where as tweaking sliders, operating pull down menus and all that stuff, that is just more work.

Gord: Right.

Jakob: But of course, this depends on whether we can make these tools useful enough, because it’s not that people will never exert themselves.  People do, after all, still get out of bed in the morning, so people will do something if the effort is deemed worthwhile.  But it just has to be the case that if you tweak the slider you get remarkably better results for your current needs.  And it has to be really easy to understand. I think this has been a problem for many of these ideas. They made sense to the search engine experts, but for the average user they had no idea about what would happen if they tweaked these various search settings and so people tended to not do them.

Gord: Right. When you look at where Google appears to be going, it seems like they’ve made the decision, “we’ll keep the functionality transparent in the background, we’ll use our algorithms and our science to try to improve the relevancy”, where as someone like Ask might be more likely to offer more functionality and more controls on the page. So if Google is going the other way, they seem to be saying that personalization is what they’re betting on to make that search experience better.  You’re not too optimistic that that will happen without some sort of interaction on the part of the user?

Jakob: Not, at least, in a small number of years. I think if you look very far ahead, you know 10, 20, 30 years or whatever, then I think there can be a lot of things happening in terms of natural language understanding and making the computer more clever than it is now. If we get to that level then it may be possible to have the computer better guess at what each person needs without the person having to say anything, but I think right now, it is very difficult.  The main attempt at personalization so far on the web is Amazon.com. They know so much about the user because they know what you’ve bought which is a stronger signal of interest than if you had just searched for something.  You search for a lot of things that you may never actually want, but actually paying money; that’s a very, very strong signal of interest.  Take myself, for example. I’m a very loyal shopper of Amazon. I’ve bought several hundred things from them and despite that they rarely recommend (successfully)…sometimes they actually recommend things I like but things I already have. I just didn’t buy it from them so they don’t know I have it. But it’s very, very rare that they recommend something where I say, “Oh yes, I really want that”. So I actually buy it from them.  And that’s despite the (fact that the) economic incentive is extreme, recommending things that people will buy. And they know what people have bought. Despite that and despite their work on this now for already 10 years (it’s always been one of their main dreams is to personalize shopping) they still don’t have it very well done. What they have done very well is this “just in time” relevance or “cross sell” as it’s normally called. So when you are on one book on one page, or one product in general, they will say, here are 5 other ones that are very similar to the one you’re looking at now. But that’s not saying, in general, I’m predicting that these 5 books will be of interest to you. 
They’re saying, “Given that you’re looking at this book, here are 5 other books that are similar,” and therefore the lead that you’re interested in these 5 books comes from your looking at that first book, not from them predicting or having a more elaborate theory about what I like.

Gord: Right.

Jakob: What “I like” tends not to be very useful.

Gord: Interesting. Jakob, I want to be considerate of your time but I do have one more question I’d love to run by you. As the search results move towards more types of images, we’re already seeing more images showing up on the actual search results page for a lot of searches. Soon we could be seeing video and different types of information presented on the page. First of all, how will that impact our scanning patterns? We’ve both done eye scanning research on search engine results, so we know there are very distinct patterns that we see. Second of all, Marissa Mayer in a statement not that long ago seemed to backpedal a bit about the fact that Google would never put display ads back on a search results page, seeming to open a door for non text ads. Would you mind commenting on those two things?

Jakob: Well they’re actually quite related. If they put up display ads, then they will start training people to exhibit more banner blindness, which will also cause them to not look at other types of multimedia on the page. So as long as the page is very clean and the only ads are the text ads that are keyword driven, then I think that putting pictures and probably even videos on there actually work well. The problem of course is they are inherently a more two dimensional media form, and video is 3 dimensional, because it’s two dimensional – graphic, and the third dimension is time, so they become more difficult to process in this linear type of scanned document “down the page” type of pattern. But on the other hand people can process images faster, with just one fixation and you can “grok” a lot of what’s in an image, so I think that if they can keep the pages clean, then it will be incorporated in people’s scanning pattern a little bit more. “Oh this can give me a quick idea of what this is all about and what type of information I can expect”. This of course assumes as well one more thing which is that they can actually select good pictures.

Gord: Right.

Jakob: I would be kind of conservative until higher tweaking with these algorithms, you know, what threshold should you cross before you put an image up. I would really say tweak it such so that you only put it up when you’re really sure that it’s a highly relevant good image. If there starts becoming that there are too many images, then we start seeing the obstacle course behavior. People scan around the images, as they do on a lot of corporate websites, where the images tend to be stock photos of glamour models that are irrelevant to what the user’s there for. And then people evolve behavior where they look around the images which is very contrary to first principles of perceptual psychology type of predicting which would be that the images would be attractive. Images turn out to be repelling if people start feeling like they are irrelevant. It’s a similar effect to banner blindness. If there’s any type of design element that people start perceiving as being irrelevant to their needs, then they will start to avoid that design element.

Gord: So, they could be running the risk of banner blindness, by incorporating those images if they’re not absolutely relevant…

Jakob: Exactly.

Gord: …to the query. Ok thank you so much.  Just out of interest have you done a lot of usability work with Chinese?

Jakob: Some. I actually read the article you had on your site. We haven’t done eye tracking studies, but we did some studies when we were in Hong Kong recently, and to that level the findings were very much the same. In terms of PDF was bad and how people go through shopping carts. So a lot of the transactional behavior, the interaction behavior, is very, very similar.

Gord: It was interesting to see how they were interacting with the search results page. We’re still trying to figure out what some of those interactions meant.

Jakob: I think it’s interesting. It can possibly be that the alphabet or character set is less scannable, but it is very hard to say because when you’re a foreigner, these characters look very blocky, and it looks very much like a lot of very similar scribbles.  But on the other hand, it could very well be the same, that people who don’t speak English would view a set of English words like a lot of little speck marks on the page, and yet words in English or in European languages are highly scannable because they have these shapes.

Gord: Right.

Jakob: So I think this is where more research is really called for to really find out.  But I think it’s possible, you know the hypothesis is that it’s just less scannable because the actual graphical or visual appearance of the words just don’t make the words pop as much.

Gord: There seems to be some conditioning effects as well and intent plays a huge part.  There’s a lot of moving pieces with that and we’re just trying to sort out. The relevancy of the results is a huge issue because the relevancy in China is really not that good so…

Jakob: It seems like it would have a lot to do with experience and amount of information.  If you compare back with uses of search in the 80’s, for example, before the web started, that was also a much more thorough reading of search results because people didn’t do search very well. Most people never did it actually, and when you did do it you would search through a very small set of information, and you had to carefully consider each probability. Then, as WebCrawler and Excite and AltaVista and people started, users got more used to scanning, they got more used to filtering out lots of junk. So the paradigm has completely changed from “find everything about my question” to “protect myself against overload of information”.  That paradigm shift requires you to have lived in a lot of information for awhile.

Gord: I was actually talking to the Chinese engineering team down at Yahoo! and that’s one thing I said. If you look at how the Chinese are using the internet, it’s very similar to North America in 99 or 2000. There’s a lot of searching for entertainment files and MP3s. They’re not using it for business and completing tasks nearly as much. It’s an entertainment medium for them, and that will impact how they’re browsing things like search results. It’ll be interesting to watch as that market matures and as users get more experienced, if that scanning pattern condenses and tightens up a lot.

Jakob: Exactly. And I would certainly predict it would. There could be a language difference, basically a character set as we just discussed, but I think the basic information foraging theory is still a universal truth. People have to protect themselves against information overload, if you have information overload. As long as you’re not accustomed to that scenario, then you don’t evolve those behaviors. But once you get it… I think a lot of those people have lived in an environment where there’s not a lot of information.  Only one state television channel and so forth and gradually they’re getting satellite television and they’re getting millions of websites. But gradually they are getting many places where they can shop for given things, but that’s going to be an evolution.

Gord: The other thing we saw was that there was a really quick scan right to the bottom of the page, within 5 seconds, just to determine how relevant these results were, were these legitimate results? And then there was a secondary pass through where they went back to the top and then started going through. So they’re very wary of what’s presented on the page, and I think part of it is lack of trust in the information source and part of it is the amount of spam on the results page.

Jakob: Oh, yes, yes.

Gord: Great thanks very much for your time Jakob.

Jakob: Oh and thank you!

Is Personalization the Path to Follow?

First published July 5, 2007 in Mediapost’s Search Insider

Aaron, Aaron, Aaron. Could I possibly leave you as a lone voice out in the wilderness, prophesying about personalized search? Of course not.

Last week, fellow Search Insider Aaron Goldman pointed out some loopholes in personalized search nirvana. It’s hard to find fault with his points. They’re all very real flaws in making personalization a credible evolution in search relevancy. Also, somewhere along the line, it appears that I’ve become the cheerleader for personalized search. I do admit I’m somewhat bullish on it, but I think I should clarify why I think personalization is important.

It’s Time to Break Search’s Paradigm

Search has hit the ceiling, at least in its current embodiment. We’ve pushed the paradigm as far as it will go. Search’s nose is smashed up against the window. (I should stop writing these columns late in the evening, after a 15-hour day!). Search needs to go somewhere, and after looking at the alternatives, I believe personalization is the most probable path.

All the improvements in search over the past decade have largely been in the background. The interface you and I use has hardly changed since I first discovered Infoseek and AlltheWeb back in 1995. Sure, the algorithms have been tweaked, but they’ve all been improvements down the same path, and that path is at a dead end. For search to evolve, it needs to move beyond a pure query-initiated, algorithmic-driven exercise. Even universal search, which is the biggest change we’ve seen to the results page in the past few years, is really still a tweak on the existing paradigm. It’s just mixing the bag of results, powered by the same algorithm.

So, when we look at where search can go, there are precious few alternatives. They all aim at the holy grail, disambiguating intent. We can look at human-powered search. The idea behind this is that real, live human beings can deliver greater relevancy than an algorithm ever could. Here tread Jason Calacanis (Mahalo) and Jimbo Wales (Wikia).  Then we have the very close cousin (and in some cases, a stand-in) social search. If we somehow tag results, or implicitly give our vote, even through a click-through, will others who share our interests find the same results more relevant? Finally, we have personalization.

Don’t Expect Perfection Anytime Soon

Each approach has potential flaws. Any time you break a paradigm, iterative failure is almost a given. Nobody is going to get it perfect out of the gate. Getting to the next evolution of search will involve trial and error. That’s why I think it’s particularly brave of Google, given its current market-leading position, to be moving aggressively down the personalization path. They’re eating their own lunch. It’s an inevitable move, but one that takes guts to make. And don’t judge the potential of personalization based on what you’re seeing today. It would be akin to trying to determine the eventual impact of the automobile based on your impression of the first horseless carriage that lurched through town. There’s a reason it’s in beta.

Aaron worries about the search “ruts” that may evolve with personalization. If we tend to go down the same paths again and again, what happens when we want to explore new territory? Will personalization have formed a groove so deep we can’t crawl out of it?

Aaron is also concerned about multiple profiles on the same machine within a household. Or for that matter, multiple profiles for the same person. I search differently at work than I do at home. How will a search engine reconcile this search schizophrenia?

Of course, we haven’t even touched on the biggest challenge facing personalization: the privacy issue. Personalization is powered by mountains of sensitive data. The potential pushback on this is the biggest red flag that personalization has to contend with.

Making the Leap

But no matter which path search chooses to follow, there will be monumental challenges to address. That’s the whole crux of innovation. If it was easy, everyone would do it. But search has no option. For it to evolve into its next stage, which is to take its rightful place as the fundamental glue that connects us all to the highly functional, highly personal semantic Web, search needs to break the current paradigm. And that’s why I’m bullish on personalization. As Google’s Matt Cutts said to me once (about a totally different topic), if I had a dozen eggs, I’d be putting 11 of them in this particular basket. Sure, personalization has some big hurdles to jump. So do the alternatives. And I think the potential wins for personalization are far bigger. I have the suspicion that if personalization works as well as I think it can, we’ll look back five years from now with bemusement at the concerns we had in 2007 around the issue.

That’s the problem when you come to the end of a development path — and fundamental change, rather than incremental change, is required. It’s very difficult to see what lies ahead.

Notes from China

First published May 31, 2007 in Mediapost’s Search Insider

I let Chris Sherman convince me that if I had to choose one overseas show this year, it should be SES China in Xiamen. Part of me is thanking Chris, and part of me is cursing the hell out of him. To be fair, he warned me that this is a cultural shock of significant magnitude. He was right.

I’ll leave the personal observations for my blog. One of the reasons I came was that I knew this was the most important online market in the world, and I had to dip my toe in for myself. For that, I do have to thank Chris. A few weeks ago I was in Florida for the Search Insider Summit, and made a note of some advice Esther Dyson passed in the keynote presentation to the ersatz “Bill Gates” (played by David Vise): “Make sure your kids learn Mandarin.” Xie Xie (thank you), Esther. You’re absolutely right.

Big, But Just Beginning

Let me give you some sense of the magnitude of this market. Right now, the Chinese Internet market is the second largest in the world, only a whisker behind the U.S.: 150 million users to the U.S.’s 154 million. But the U.S. has 68% penetration. That 150 million represents only about 10% of the Chinese market. At full saturation, the Chinese market will be almost seven times as large as that of the U.S.
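For anyone who wants to check the "almost seven times" figure, here is a rough back-of-the-envelope sketch using only the numbers quoted above. It assumes that "full saturation" means 100% penetration in both countries, which is my simplification, and the function and variable names are purely illustrative.

```python
# Figures quoted in the column (2007).
china_users = 150e6        # current Chinese Internet users
china_penetration = 0.10   # "only about 10% of the Chinese market"
us_users = 154e6           # current U.S. Internet users
us_penetration = 0.68      # "68% penetration"


def saturated_market(users, penetration):
    """Estimate market size at 100% penetration (an assumed definition of saturation)."""
    return users / penetration


china_full = saturated_market(china_users, china_penetration)
us_full = saturated_market(us_users, us_penetration)
ratio = china_full / us_full

print(f"China at saturation: {china_full / 1e6:.0f} million")
print(f"U.S. at saturation:  {us_full / 1e6:.0f} million")
print(f"Ratio: {ratio:.1f}x")
```

On these inputs the ratio comes out a little above 6.6, which squares with "almost seven times."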

But don’t make the mistake of projecting the U.S. experience onto the emerging Chinese market. Chinese culture is vastly different from ours, and their online community reflects this difference. For one thing, much of the Chinese online experience will likely happen through mobile devices, since the mobile market is much more mature here. While the number of Internet subscribers is 150 million, the number of cell phone subscribers is significantly higher, nearly 500 million (as of October, 2006) and is growing at the rate of 5.5 million subscribers per month. For another, the Sino mind just clicks at a different speed than ours.

Hot and Noisy Online

One of my favorite phrases I’ve learned while here is renao, which loosely translates into “hot and noisy.” It was explained to me by Deborah Fallows from the PEW Internet Group, a U.S. expat living in Shanghai for two years with her husband, author and journalist Jim Fallows. It sums up so much of what I’ve seen here. The Chinese like to be bombarded by visual stimuli. They operate at a frenetic pace, juggling several things at once, each loudly demanding attention. Some look at this as a lack of maturity in the Asian market. Western eyes see Chinese Web sites as garish, and we think this is because the designers aren’t very sophisticated yet. Perhaps it’s just designers catering to their audience, who like it “hot and noisy.”

Savoring Information

The other difference is how Western cultures treat information, compared to the Chinese. In the West, information is in no short supply, and for the most part, we inherently trust the source of that information. We believe most things we read online to be true. Our biggest challenge is to wade through the mountain of information available to us and to eliminate the irrelevant. The Chinese treasure information yet have a healthy skepticism as to its veracity. While Western Web users are ruthless in their filtering of information, particularly on a search page, the Chinese are more apt to gather and consider, taking time to digest and choose. They often have multiple windows open at the same time, both as a way to keep busy with the slower load times typical in China, and also because they like their desktop “hot and noisy.”

Keeping an Eye on the Market

One of the reasons I was here was to share preliminary findings from an eye-tracking study we did with Chinese users on the two main Chinese search properties, Baidu and Google.cn. This difference in user behavior became very apparent in the study. In North America, the average interaction with a search results page, from launch to first click, is generally less than 10 seconds. In the Chinese study, we saw averages of 30 seconds on Google and up to a minute on Baidu. While North American scan activity is condensed in the Golden Triangle, in China, it’s spread around the page.

It’s fascinating to watch an individual session. The eye zips around the page, picking up information in an apparently haphazard manner. Baidu has been taken to task for the opaque nature of its listings, where you can pay for placement. The results are also much more prone to affiliate spam (on both engines, but particularly Baidu) than we see in North America. But the Chinese don’t mind. Baidu has captured 62% of the search market here, compared to 20% for Google. After all, lack of trust in information is nothing new to the Chinese. Why should it be any different on a search engine?

Everyone I’ve talked to here agrees. This is a market ready to explode. Innovation is happening organically and at an incredibly rapid pace. The development cycle to turn out new functionality on Chinese sites is 30% to 50% as long as that of their North American rivals. As somebody told me, “In China, you point, shoot and then aim. Deliberation will kill you here.”

This is a lesson Google is learning the hard way. Chris noted that the level of sophistication has increased immensely from the last trade show here, in 2006. The Chinese Internet market is like a Beijing taxi: there may be no logic to its route, but it’s sure getting to wherever it’s going in a hurry!

An Intimate View of the World Through Google’s Eyes

First published May 24, 2007 in Mediapost’s Search Insider

The walls are coming crashing down at Google. They’re in the middle of tearing down silos and aggregating content. But that aggregation will likely come with a very unique viewpoint some day: yours.

Last week at Searchology (an event I couldn’t attend, due to a conflict) Google unveiled universal search, along with a few other assorted tidbits. David Berkowitz covered this in Tuesday’s Search Insider, so forgive me if some of this is redundant, but I think we’re covering unique ground in our approaches.

Mixing up Google’s Buckets

The key for universal search? Results that come from a number of different sources: the Web, blogs, video, news, images, maps, local, product, to name a few, all presented on the same results page. And yes, ads. Because, in the words of Google’s Marissa Mayer, “sometimes an ad is the right answer.” So, in effect, Google is no longer a search engine. It’s an “idea portal,” aggregated from Google’s vast Web reach around a specific query, on the fly and brought together for the user. And Google, in its infinite wisdom, will apply a universal ranking algorithm across disparate content to pull what it feels is the most relevant to the top of the page.

Universal search, in one fell swoop, makes the idea of vertical search irrelevant, because Google is making it all horizontal. The company will assemble a smorgasbord of content from its various buckets, prepared right in front of your eyes in 0.23 seconds.

Does One Score Fit All?

But here’s the challenge. The task of applying a content-agnostic relevancy score is daunting, and according to Google, it’s the reason it’s only now introducing universal search, after a number of years in the lab. In fact, it’s so daunting, you’ll probably only see other types of content creep onto your results page in the most obvious of cases. For example, a search for a specific video that’s suddenly very hot will bring back the video clip near the top. For most searches, the net impact of vertical search will be the appearance of some additional links to other vertical “buckets” near the top of the results set. Like most things that can impact the user experience, Google is treading carefully here.

Just Add Two Dashes of Personalization

So why bother? Because universal search becomes much more interesting when you combine it with personalization. In a recent interview I did with Mayer, she said she didn’t see a strong vertical angle for personalization in the near future. I can’t help but think that personalization will drive universal search. In fact, I don’t think universal search works very well without personalization. In both cases, we’re looking at an on-the-fly algorithm that works over and above the base Google algorithm, reordering results for you. Google will be able to be more confident in offering a much richer and more diverse set of universal results when you can tap into previous search and Web history. It will give them a lot more background to help them put context around your query. With personalization, every search becomes your customized portal, centered on what’s on the top of your mind right now. And that’s pretty interesting, both for the user and the advertiser.

And One Cup of Assorted Advertising

Obviously, Google’s mind is straying down this path as well, because at Searchology, Mayer did a pretty intense backpedal from her previous position that display or rich media ads would never appear on the search results page. The official position is now: “potentially… possibly… probably.” Google’s statements used to be much more unequivocal, but lately, they’re sounding much less adamant and much more political. No door shall remain unopened, even if it’s just a crack, because chances are, Google may have to squeeze through it in the future.

Increasingly, the puzzle pieces of Google’s empire are falling into place. When you take personalization, universal search, enhanced ad serving capabilities and outreach into the most popular Web communities and bring them together, you start to see a pretty compelling network emerge, and it’s all centered on the user, one user at a time.

Universal Search and Other Surprises from Google’s Searchology

When Google yesterday invited a number of reporters to come down to Mountain View for an event they called Searchology, I figured they had something in the works. I had to turn down the invitation because of other commitments, but we sent Enquiro’s Director of Technology and analytics blogger, Manoj Jasra, down in my stead. Sure enough, just after noon yesterday, I received a press release announcing the introduction of universal search. I haven’t had a chance to talk to Manoj about what else Google may have unveiled in Mountain View yesterday, but even just working my way through the official release from Google gave me plenty of food for thought. For the extensive list of the announcements and some running commentary, check out Danny’s post on Search Engine Land.

To me, the one thing that jumps out in this is the announcement of Universal Search. Basically, Universal Search is the breaking down of the information silos that currently exist on Google and blending them into a single set of results. The changes right now are very subtle. Web results still dominate the typical results page, and the primary thing the user would notice is the additional dynamically generated navigation links that sit just above the results.

[Image: universal search]

The key to universal search results is an on-the-fly algorithm that looks across all of Google’s information sources and prioritizes and ranks all the items coming from these disparate sources based on the user intent. Now, it’s in those last five words, “based on the user intent” that the really important piece of this comes out. Just a few weeks ago, I interviewed Marissa Mayer about the inclusion of Web history in the dataset to calculate personalized search results. This just gives Sep Kamvar and his personalization algorithm a lot more to chew on as they determine user intent. During the interview, I asked Marissa Mayer if personalization allows Google to be more confident in delivering vertical results. Marissa indicated that this was not an area they were currently looking at.

“There are a lot of different things that we could do with this data. I’ll be totally honest. Verticals isn’t something that has been first and foremost in our minds so I don’t really think there’s a strong vertical angle here at the moment.”

To me it just didn’t make sense. Couple that with yesterday’s announcement of Universal search results and I’ve got to conclude that Marissa was throwing up a smokescreen.

Personalized search is the engine that is going to drive universal search. The two are inextricably linked. When you look at the wording that Google throws around about the on-the-fly ranking of content from all the sources for Universal Search, that’s exactly the same wording they use for the personalization algorithm. It operates on-the-fly, looks at the content in the Google index and re-ranks it according to the perceived intent of the user, based on search history, Web history and other signals. It’s not a huge stretch to extend that same real-time categorization of content across all of Google’s information silos. That is, in fact, what Google’s announcement yesterday said. Call it a silo, call it a vertical, the end result is the same. As Google gains more confidence in disambiguating user intent, more specific types of search results, extending beyond Web results, will get included on the results page and presented to the user.
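To make the idea concrete, here is a toy sketch of what an on-the-fly re-ranking layer could look like in principle. It is emphatically not Google's algorithm, which hasn't been published; the `Result` type, the `rerank` function, the `source_affinity` scores and the `weight` parameter are all hypothetical names I've made up for illustration. The only idea taken from the discussion above is that results from different silos are blended and then re-ordered using signals from a user's history.

```python
from dataclasses import dataclass


@dataclass
class Result:
    title: str
    source: str        # which silo it came from, e.g. "web", "video", "news"
    base_score: float  # hypothetical relevance score from the base algorithm


def rerank(results, source_affinity, weight=0.5):
    """Re-order blended results: base score plus a per-user boost for each silo.

    source_affinity is an assumed stand-in for what search and Web history
    might say about the kinds of content this user tends to prefer.
    """
    def score(result):
        return result.base_score + weight * source_affinity.get(result.source, 0.0)

    return sorted(results, key=score, reverse=True)


blended = [
    Result("Honda Civic review", "web", 0.80),
    Result("Honda Civic test drive", "video", 0.70),
    Result("Honda recall announced", "news", 0.60),
]

# A user whose (hypothetical) history shows a preference for video content.
video_fan = {"video": 0.4}

default_order = [r.source for r in rerank(blended, {})]
personal_order = [r.source for r in rerank(blended, video_fan)]

print(default_order)
print(personal_order)
```

With no history, the Web result stays on top; with a history that leans toward video, the video clip moves ahead of it. That's the whole point: the same blended set, reordered for one user at a time.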

This introduces something else that opens up some interesting implications for Google. And again, if they choose to go down this path, it flies in the face of something that Marissa Mayer has previously stated. On the search results page as we know it, display or other types of advertising just don’t work that well. The search results page is heavily text-based. We look for text, we respond to text, we click on text. Anything that’s not text acts as an interruption and distraction. There’s no place on this page for display or rich media advertising.

But if you mix up the search results page and start including things like images, video clips, maps, icons for audio files, you move away from the common paradigm of the text-based search results page. The Google page becomes much more like a personalized, on-the-fly portal based around the intent of our query. As such, it includes stimuli from a lot of different sources, presented in a lot of different ways. There will be many things fighting for your attention. And in this paradigm, perhaps display and rich media advertising works better. In another announcement from Google, Marissa Mayer appears to have backtracked and opened the door for this.

Yesterday, Marissa responded to a question about possible inclusion of non-text-based ads in this way:

“Well we don’t have anything to announce on that today. I do think this opens the door for the introduction of richer media into the search results page. We are now going to understand how users interact with that. And as Alan always likes to say search is about finding the best answer, not just the best URL or the best textual snippet.

“For us ads are answers as well. Searching ads is just as hard as searching the Web, as searching images. And so I was hoping that we could bring some of these same advances in terms of the richness of media to ads.”

Greg Sterling, in his post on Search Engine Land, calls it something of a bombshell (Greg, I’m now regretting that I didn’t attend, as I would have loved to chat to you about this) and I agree. This is a significant retraction of Google’s long-running stand on keeping display ads off the SERP:

“There will be no banner ads on the Google homepage or web search results pages. There will not be crazy, flashy, graphical doodads flying and popping up all over the Google site. Ever.”

Google said in their announcements that the changes for the user will be subtle at first. In fact, the dynamically generated navigation links that appear above the search results will largely be ignored by most users. They won’t even know they exist. But in typical Google fashion, this tentative presentation of new functionality will be an incremental one. The typical path that Google takes when introducing new functionality is:

  • subtly introduce new navigation options in the way of links that tend to be out of the primary scan path
  • make it an opt-in experience for the user
  • gradually roll this functionality into a default opt-in
  • eventually integrate more fully into the standard presentation of results
  • move to full integration and remove the ability for the user to opt out

If Google goes down this path with both universal and personal search, you can expect to see a substantially different look for search results in the near future. And as with most things we’ve talked about that Google is looking at introducing, there will be a trade-off between overall functionality for most users and a relinquishing of control for a small number of users.

My final point for this post is the speed at which Google is introducing new search innovations. A few weeks ago I posted that Google may be treating search as the forgotten child, devoting more attention to the sexier new channels they were acquiring, including pretty much everything under the sun. Matt Cutts was quick to post a comment saying that Google was still very much involved with search and that there would be a number of new things rolling out in the near future. It appears that I didn’t know what the hell I was talking about and now have to eat my words, as the announcements over the last few weeks have indicated that Google is still very much in the search game and is moving forward at what, for them, is a breakneck pace.

I’ve often stated before that Google was the victim of their own success. Because they have such a large slice of the search market, any changes to the actual presentation of the search pages came with a lot of risk. It’s a major monetization channel for them, their biggest one by far, and any changes in user experience through the introduction of new functionality comes with the potential of dramatically reducing click-through on sponsored ads. I predicted that this would make it tough for Google to really innovate with search and we would probably be looking to the smaller players to aggressively pursue innovation. Interestingly, much of my recent conversation with Ask’s usability team lead, Michael Ferguson, revolved around this point. That interview will be running tomorrow on Search Engine Land, with full transcript posted to this blog. If you look at what Ask has been doing with AskX:

[Image: AskX]

It’s very similar to what Google says they will be doing with universal search results. It’s taking content from a number of different sources and rolling it into one combined search results page. It came as a complete surprise to me when I read the release indicating that Google is moving aggressively down the same path. Google will not be taking the path that Ask is, aggressively presenting new functionality on its main site. Instead, Google will introduce it incrementally, bit by bit. But expect the evolution of the search experience on Google to move fairly quickly.

All of Google’s announcements in the last few months point in the same direction. They all point to a highly personalized, highly relevant portal to all of Google’s information. Here’s my other prediction. While Marissa was very careful in past interviews to state that personalization is currently impacting only the organic search results, with no work being done on the personalized presentation of sponsored content, I smell another smokescreen. Personalized presentation of advertising content is just too huge a revenue opportunity for Google and we’ll be seeing it in the very near future.

Interview with Ask’s Michael Ferguson

I recently had the opportunity to chat with one of my favorite usability people, Michael Ferguson at Ask.com. You can find excerpts of the interview, along with commentary, on Search Engine Land in this week’s Just Behave column. Some of Michael’s comments are particularly timely now, given Google’s announcement of Universal search.

Gord: How does Ask.com approach the search user experience and in big terms, what is your general philosophy?

Michael: A lot of what we do is, to some extent, informed by core search needs but also by our relevant market share, understanding that people have often experienced other engines before they come to us, not necessarily in that session but generally on the web. People have at least done a few searches on Google and Yahoo, so they have some context coming from those search experiences. So often, we’re taking what we’ve learned from best practices from competitors and others and then, on top of that, trying to add a lot of product experience and relevance experiences that are differentiated. Of course, we’re coming from this longer history of the company where we’ve had various user experiences over the time that we’ve been around. We’ve marketed around natural language, in the late 90’s and answered people’s questions at the top of the page, but in the last year and a half or so, we’ve rebranded and really focused on getting the word out to the end users that we are a keyword search engine, an everyday search engine.

A lot of the things that we’ve done with users have been to try to, implicitly, if not explicitly, inform users that are coming to the site that you can use it very much like you can use any other kind of search engine you’ve been on before. Or, if they’re current users and people are coming back to the site, to let them know that the range of experiences and the type of information we bring back to them has greatly expanded. So that’s pretty much it. It’s informed by the context of not just a sense of pure search and information retrieval and all the research that’s gone on in that in the last 35 or 40 years but also the dynamics of the experiences that we’ve had before and people’s previous experiences with Ask. Then, an acknowledgement that they’ve often searched on other sites and looked for information.

Gord: You brought up a number of topics that I’d like to touch on, each in sequence. You mentioned that in a lot of cases, they’re coming to Ask and they’ve used Google or Yahoo or they’ve used another engine as one of their primary search tools. Does Ask’s role as a supplemental engine or an alternative engine give you a little more latitude? You can add things from a functionality point of view to really differentiate yourselves. I actually just did a search and see that you, at least on my computer here, have made the move to incorporate some of the things that you were testing on AskX into the main site. Maybe we’ll start there. Is that an ongoing test? Am I just part of a beta test on that or is this rollover complete now?

Michael: We’re still in testing with that and it will roll out. We have decided because of a lot of the user experience metrics that we’re getting from the beta test that we’re going to go for it. We have decided to move the full experience over to the AskX experience. Of course, there are variants to that, but the basic theme of, in a smart way, bringing together results from different search verticals and wrapping those around the core organic results (as well as) a sponsored experience. So that will happen sometime this year. We don’t know exactly when, but just a couple of days ago, we really decided we’ve seen enough and we’re pretty excited about that.

Google has a really great user experience going, and Yahoo does too, but they have so many different levers that move so much revenue and traffic and experience metrics that I think it’s harder for them to take chances and to move things around and get buy-offs at a bureaucratic level. To some extent, we see ourselves as having permission and a responsibility to really innovate on the user experience. It’s definitely a good time for us because we have such great support from IAC and they’re very much invested in us improving the user experience and getting more traffic and getting frequency and taking market share and they’re ready to very much invest in that. So we don’t need to cram the page with sponsored links and things like that. It’s mostly a transitional time when we’re getting people to reconsider the brand and the search engine as a full keyword based, everyday search engine that has lots to offer. I’m talking to people all the time about Ask and there’s definitely still people that say, “Hey, last night, it came up with my buddies at the bar, this trivia question about the Los Angeles Lakers, 1966 to 1972 (and I went to Ask and asked a question)”. Then there are other people that see us as evolving beyond that but still really surprised that we haven’t had image search.  Now with AskX we’ll have preview search and there’s lots of other stuff coming along now. So yes, it’s a great place to be. I love working with it. There are so many things that, in an informed way, we can take chances on, relative to our competitors.

Gord: So does this mean that the main site becomes more of an active site? Are you being more upfront with the testing on Ask.com rather than on AskX.com?

Michael: Well, I think the general sense of what we’re going to do is that, at some point this year, the AskX experience will, at least at a wireframe level, become the default experience and, of course, we have a lot of next generation “after that” stuff queued up that we’re thinking about and we’re actively testing right now but not in any live sense.  So potentially, things will slide in behind the move of the full interface going out and then AskX will remain a sandbox for another instance of, hopefully, new and really useful and differentiated search experience coming after that. A general thing that we’re going to try to do, instead of having 15 or 18 different product managers and engineering teams working on all these different facets of information retrieval and services, we’re going to stay search focused and just have one sandbox area where people go in and see multiple facets of what we’re thinking about.

Gord: Let’s talk about the sponsored ads for a bit. I notice that for a couple of searches that I’ve done while we’ve been talking that they’ve definitely been dialed down as far as the presence of sponsored on the page. I’m only seeing top sponsored appear, so you’re using the right rail to add additional search value or information value, whether it be suggested searches or on a local search, where it brought me back the current weather and time. So what’s the current strategy on Ask as far as presentation of sponsored results and the amount of real estate devoted to them?

Michael: Just to fit along with the logic of Eye Tracking II (Enquiro’s second eye tracking study), those ads are not a delineated part of the user experience for the end user and their relevance and their frequency can color the perception of the rest of the page and especially the organic listings below them. Right now, as I said, we’re very much focusing on improved user experience and building frequency and retention of customers, which all the companies are, I’m sure. But we’re really being, basically, cautious with the ads and getting them there when they’re appropriate and, as best we can, adjust them over time, so that when they’re there, they’re going to be valuable for the user and for the vendor.

Gord: That’s a fairly significant evolution in thinking about what the results page looks like from say, two years ago, with Ask. Is that purely a function of IAC knowing that this is a long term game and it begins with market share and after that comes the monetization opportunities?

Michael: Actually, I think way before we got acquired by IAC we knew that. We test like other engines would. We test lots of different ad configurations and presentations and things like that but definitely you want to balance that. Way before we got acquired we realized that there’s one thing that’s kind of fun about making the quarter and blowing through it a little bit and then there’s another thing about eroding customers. And definitely there’s a lifetime value that can be gained by giving people what you know is a better user experience over time, so once we did become part of the IAC family, we brought them up to speed with the results that we were finding that were pointing to taking that road and they’ve very much been in support of it. And, of course, their revenue is spread amongst a lot of different pieces of online and offline business so their ability to absorb it is probably more flexible than ours was as a stand alone company.

Gord: That brings me to my next question, which is, with all the different properties that IAC has and their deep penetration into some of the vertical areas, you had talked about the opportunity to bring some of that value to the search results page. What are we looking at as far as that goes? Are we going to see more and more information pulled from other IAC properties into the main AskX interface?

Michael: Maybe the most powerful thing about the internet is that you as an individual now have a very empowered position relative to other producers of information, other businesses where you can consume a bunch of different points of view. You have a bunch of different opportunities to do business and get the lowest price and read reviews that the company itself hasn’t sanctioned, or anything like that. You have access to your peer network and to your social networks. Search, like the internet, becomes, and it necessarily needs to be, a proxy for that neutral, unbiased view of all the information that’s available. This probably gets a little bit into what may or may not work with something like Google’s search history. Users over time have said again and again, “Don’t hide anything from me or don’t over think what you may think I might want. Give me all of the best stuff, use your algorithms to rank all that, but if I get the sense that anything’s biased or people are paying for this, then I’m not going to trust you and I’m going to go somewhere else where I can get that sense of empowerment again.”

As I’ve sat in user experience research over time, I’ve seen people… and I know this isn’t true of Google and I know it isn’t true of Ask right now with the retraction from paid inclusion… but you ask users why they think this came up first on Google, maybe with a navigational query like Honda or Honda Civic and Honda comes up first. They’ll say, “Oh, Honda paid for that.” So even with the engines that aren’t doing paid inclusion, there’s still this kind of wariness that consumers have of just generally somebody on the internet, somewhere, behind the curtains, trying to take advantage of them or steer them in some way. So as soon as we got acquired by IAC, we have made it very much part of their perception of this and their culture. Their product management point of view is that you can’t sacrifice that neutrality. You can’t load a bunch of IAC stuff all over the place. The relationship with IAC does give us access to proprietary databases that we can do lots of deep dives in and get lots of rich information out of that can help the user in their instance of their search needs that other companies wouldn’t be able to get access to, while maintaining access to everything else.

The way we approached AskCity was a great example of this. We had leveraged a lot of CitySearch data but at the same time, we know that when people go out and want to see reviews, they want to see reviews from AOL Neighborhoods, they want to see reviews from Yelp, they want to see reviews from all these other points of view too. So we go and scrape all those and fold them into the CitySearch stuff. We give access to all those results that come up on AskCity. If they’re, for instance, at a restaurant, you can get Open Table reviews and you can get movie reservations through Fandango and other stuff like that. Those companies have nothing to do with IAC. Those decisions were born from user needs and from us looking as individuals in particular urban areas, and saying “Hey, what would I want to come up?” We know from previous experience from AOL that the walled garden thing doesn’t work. It’s just not what people expect from search and not what they expect from the internet, so that lesson’s been learned. I don’t know how much it would be different if we had some dominant market share over search, but that’s even more reason for us to be appealing to as wide a population as possible. That’s my philosophy right now.

Gord: I guess the other thing that every major engine is struggling with right now is in this quest to disambiguate intent, where is the trade-off with user control? Like you said, just show me a lot of the best stuff and I’ll decide where I want to drill down and I’ll change the query based on what I’m seeing to filter down to what I want. In talking to Marissa at Google and their moves towards personalization and introducing web history, I think for anyone who understands how search engines work, it’s not that hard to see the benefits of personalization but from a user perspective there does seem to be some significant pushback against that. Some users are saying, “I don’t want a lot of things happening in the background that are not transparent to me. I want to stay in control.” How is Ask approaching that?

Michael: The other major thing that’s going on right now is that we have fully revamped how we’re taking this. We developed the Direct Hit technology in the late ’90s. And then the Teoma technology we acquired. And really, it’s not that we’re taking those to the next level, we got all of that stuff together and over the past three years, we’ve been saying, “Okay, what do we have and what’s unique and differentiated?” There’s a lot of great user behavior data that Direct Hit understands. We have a whole variety of things there and that’s unlocked, that’s across all the people coming in and out over time but not any personally identifiable type of stuff. And then there’s Teoma, which is good at seeing communities on the web, expertise within the communities and how communities relate. So right now, even though we have personalization stuff and My Stuff and other things that are coming up, we’re investing a lot more in the next version of the algorithm and the infrastructure for us to grow called Edison. And we started talking about that a week ago since A.G. (Apostolos Gerasoulis) mentioned it. Across a lot of user data it understands a lot about the context from the user intention side and because we’re constantly capturing the topology of the web and its communities and how they’re related, we then match the intention and the map of the web as it stands and the blogosphere as it stands and other domains as they stand. Our Zoom product, which is now on the left under the search box in the AskX experience and it’s on the right on the live site, is the big area that we’re going to more passively offer people different paths.

For example, just like with AskX, you search for U2, it’s going to bring up news, and product results, and video results and images, and a Smart Answer at the top of the page. It’s also going to know that there’s U2 as the entity, the music band and therefore search the blogosphere but just search within music blogs. So what it’s doing, over time, is trying to give a personalized experience that’s informed by lots of behavior and trying to capture the structure of the web, basically. So that’s where we are there.

There’s a book that came out in early 1999 called Net Worth, which you might want to read. I almost want to revisit it myself now. It’s a Harvard Business School book that Marc Singer and John Hagel came out with. It talked about infomediaries and it imagined this future where there’d be these trusted brands and companies. They were thinking along the lines of American Express or some other concurrent banking entity at the time, but these infomediaries would have outside vendors come to them and they would entrust all their information, as much as they wanted to, they could control that, both online and offline. You were talking in your latest blog post about understanding in the consideration phase where somebody is and presenting, potentially, websites that they hadn’t seen yet or ones that they might like at that point in the car purchase behavior. But the way that they were imagining it was that there would be a credit card that might show that someone had been taking trips from the San Francisco Bay area to the Tahoe region at a certain time of year and had maybe met with real estate agents up there and things like that. But these infomediaries, on top of not just web history but even offline stuff, would be a broker for all that information and there would be this nice marketplace where someone could come and say, “I want to pay $250 to talk to this person right now with this specific message.” So it seems that Google is doing a lot of that, especially with the DoubleClick acquisition. But I’m just wondering about the other side of it, keeping the end user aware of and empowered over that information and where it’s at. So Net Worth is a neat book to check out because the way they were describing it, the end user, even to the broker, would seep out exactly what they wanted to seep out at any given time. It wouldn’t be this passive recording device thing that’s silently taping.

My experience so far of using the Google Toolbar that’s allowing the collection of history is that it’s ambiguous to me how much of my behavior is getting taken up by that system and used. We’ll see where it goes but right now we don’t have strong plans to do anything with that for search.

Gord: It’s going to be really interesting because, up to now, the toolbar was collecting data but there was no transparency into what it was collecting. Now that they’ve added that, we’ll see what the user response is once people can go into their web history and have that initial shock of realizing how much Google actually does know about them.

One other question, and this is kind of a sidelight, but it’s always something that I’ve been interested in. Now that you have the search box along the left side there and it gives search suggestions as you’re typing, have you done any tracking to see how that’s altered your query logs? Have you noticed any trends in people searching differently now that you’re suggesting possible searches to them as they’re typing?

Michael: There are two broad things that are encouraging to us. One is that over time, the natural language queries are down tremendously. Our queries, because we promoted in the late nineties this “ask a question” thing, tended to be longer and more phrase based, more natural language based. That’s really gone down and is approaching what we would consider normal for an everyday search engine profile as far as the queries. And we really think that this zooming stuff has really helped that because it’s often keyword based. You will sometimes see some natural language stuff in there. There are communities on the web that are informing us that there’s an interest in this topic that’s related to the basic topic so it is helping change the user behavior on Ask.

And the other result of that is, as people use it more as an everyday keyword based search engine, the topics or the different categories of queries that people see are normalizing out too. Less and less they’re reference type stuff and more and more they’re transactional type queries, so that’s a good thing. And that’s just been happening as we rebranded and we presented Zoom.

And then with the AskX experience, we are definitely seeing that even more because of the fact that they’re just in proximity to the search box. We always knew that these suggestions should ideally be close to the search box so that people understand fully what we’re trying to offer them. For instance, on the current site, we do see users that will sometimes type a query in the search box on top and, because they’re used to seeing ads on the right rail on so many other sites and because they don’t necessarily know what “narrow and expand your search” is, they think those are just titles to other results or websites. It’s a relatively small portion. Most people get what it is, but there was that liability there. Now in the AskX experience, it’s close and visually grouped with the search box. It’s definitely getting used more and guiding queries and people seem even more comfortable putting general terms in. We’ve made it so that you can just arrow down to the one and hit return. It’s definitely driving the queries differently.

Gord: I’ve always liked what you guys have done on the search page. I think it’s some of the most innovative stuff with a major search property that I see out there and I think that there’s definitely a good place for that kind of initiative. So let me wrap up by asking, if you had your way, in two years, what part would Ask be playing in the total search landscape?

Michael: We’d definitely have significantly more than 10% market share. My point of view, from dealing with the user experience, is that I’ve been proud of the work that we’ve done and I really think that we’ve been very focused and innovative with a very talented team here and we’re really hoping that as we look at the rest of the year and we put out Edison and the AskX experience, that we become recognized for taking chances and presenting the user experience in a differentiated way that people have to respond to us in the market and start adopting some of the things that we’re doing. Because of the amount of revenue that Microsoft, Yahoo and Google are dealing with on the search side, they often get a lot of press but our hope is really to take share and to hopefully have a user experience that informs and improves the user experience of our competitors.

Gord: Thank you for your time, Michael.

Personalization: Google’s Defensible Trump Card?

A thought that came up in a conversation with Michael Ferguson, Ask’s usability guy (he always greases the mental machinery, which is probably why I like talking to him), was the defensible position that personalization offers Google.

Google is betting the farm on personalization. And really, they’re possibly the only search engine that can make this work. Here are the required components:

  • A high enough degree of additional user value to convince people to opt in to personalization. As I’ve talked about before, that’s why it’s being rolled out in organic search first. Expect a slew of other value adds in the near future, all powered by personalization and all aimed at getting you to hit the opt-in box.
  • An extensive network so you can maintain multiple touch points for the delivery of targeted advertising. Nobody has a bigger network than Google’s AdSense network.
  • Critical mass amongst users. With Google’s almost 65% market share and the highest penetration of installed toolbars (42% plus in a recent B-to-B study we did), Google also has the required components to tap into a significant slice of the available market. And future Gadgets and tools will likely either require personalization to be turned on, or will provide an enhanced level of functionality when it is. Expect Google to get aggressive with forcing adoption in the next year or so.

It came to light when I was talking to Michael about Ask’s algo and whether personalization will play a part (by the way, this is part of an interview that will be on Search Engine Land next week). After the interview, I realized it’s not an option for Ask, at least not at the level that Google’s contemplating. Even if they did move to personalization, they just don’t own enough of the total online user experience to push users to opt in. They’d never gain the critical mass needed to make it work.

Microsoft has an outside chance through Messenger, but it would be a long shot. Yahoo also has a long shot at it (although better than Microsoft’s) but they’d have to start gaining market share, and there are a number of huge obstacles in their way. Google is by far the best bet to force personalization on the market and have it be adopted at significant rates.

So what are the options for the other engines? Well, again, there’s an interesting twist there as well. One thing that’s touted heavily by the contenders is social search. I have severe doubts about the scalability of anything that requires a human element, and I’ve written about this in the past. But then I realized that personalization gives Google social search in a way that others just can’t touch.

If Google is collecting both web and search history, they’re collecting implicit votes for the quality of every property on the web. They create their own community, and with every click, that community votes for the quality and relevance of every site they visit. It’s social search in a very powerful and completely transparent form. In this form, social search requires no additional action on the part of the user (one of the critical risk areas of social search) and is completely scalable, because there’s no human bottleneck (the other critical risk area).
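To make the idea of clicks as implicit votes concrete, here is a minimal sketch in Python. It is purely illustrative and not a description of how Google or any engine actually works: the log format, the function names `tally_implicit_votes` and `rank_by_votes`, and the one-vote-per-user rule are all assumptions made for the example.

```python
from collections import defaultdict


def tally_implicit_votes(click_log):
    """Count one implicit vote per (user, query, url) click.

    click_log is a hypothetical list of (user, query, url) tuples.
    """
    votes = defaultdict(lambda: defaultdict(int))
    seen = set()
    for user, query, url in click_log:
        key = (user, query, url)
        if key in seen:
            # Assumption: a user's repeat clicks on the same result count once.
            continue
        seen.add(key)
        votes[query][url] += 1
    return votes


def rank_by_votes(votes, query):
    """Return URLs for a query, most-voted first (ties broken alphabetically)."""
    return sorted(votes[query], key=lambda url: (-votes[query][url], url))


# Made-up sample data, for illustration only.
click_log = [
    ("u1", "honda civic", "honda.com"),
    ("u2", "honda civic", "honda.com"),
    ("u2", "honda civic", "edmunds.com"),
    ("u1", "honda civic", "honda.com"),  # repeat click, ignored
]
votes = tally_implicit_votes(click_log)
ranking = rank_by_votes(votes, "honda civic")
```

Nobody in this toy community had to rate or tag anything: the votes fall out of ordinary clicking, which is the point about scalability made above.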

The more I think about personalization, the more I think that Google has just trumped the entire search space…again.

The Three Cs of Search

First published May 10, 2007 in Mediapost’s Search Insider

Since most of the Search Insiders are in Bonita Springs this week, chances are that you’ll be hearing a lot of what’s happening down here in the Florida Everglades (other than the brush fires which appear to have us surrounded). Aaron Goldman shared his Buzz-o-meter with us on Tuesday, where he measures the words that seem to be dropped with the greatest frequency. It appears that my opening remarks set a tone that has been picked up in a number of sessions, and two words breaking into the top 10 are “connection” and “community.” Aaron added a third “c”: “content.”

To me, these words sum up a transition that’s happening in search. Expect the activity of searching on a search engine to gradually disappear, to be replaced with the functionality of search as an underpinning to the workings of many things on the Web. Search will become the engine that drives the semantic web, which Esther Dyson talked about in her keynote. She’s looking for search to move beyond “search and fetch” to her ideal, “deliver, act and transact.”

Search will be the connector between what we want and what best matches our want out there on the Web. And rather than a singular task (i.e. go look for this query) it will become a self-guided series of tasks, with intelligent agents in between to set search on its new direction. An entire trip, including flight reservations, hotel bookings, ground transportation, notifications of friends in the area and restaurant reservations, could be booked by intelligent Web agents, powered by search. And as came up in a panel discussion with the Search Insiders, when the presentation of commercial messaging appears in this context, it’s not advertising, it’s a helpful recommendation.

The piece that drives this is personalization, and that’s why Google’s moves are potentially so important. They take us much closer to the semantic web that Dyson envisions. This is the first “c”: connections.

Redefining Community

The second “c” speaks to the very transformation of our society: community. The way we relate to each other is being totally rewired by the Internet. By sheer physical necessity, communities have previously been defined by geography. We shared a common space, which enabled communication, which created community. But today, the Internet has made physical distance irrelevant. Our communities are now defined by commonly held ideas or interests. Communities form around ideas, and search connects us to those communities. Every time we do online research for a product or service, we step into a community. In the course of a day, we can belong to several different communities. They are constantly shifting, as people move in and out of them, depending on the longevity of the engagement with the idea that forms the community.

Content Trails

And a third “c,” content, is the trail that the other members of that community leave behind through their conversations. These are the telltale signs that someone has already gone this way, and left a permanent record of his or her engagement with the community. Every Wikipedia entry is part of a community, as are many MySpace pages, blog posts and other virtual outposts. Search is the thread that loops them together at the user’s initiative. In fact, the algorithm of the engine is the de facto definer of community with each given search. The engine goes out, defines the landscape of community, and connects you with the citizens of that community and the content trails they leave behind.

It’s a fascinating world, which is being born as we speak. It’s a sociological experiment of vast magnitude in the making, and I don’t think we know what the repercussions will be. Whatever they are, it’s too late to turn back now. Technology moves fast, but people move slowly, and not in one mass. Small degrees of technological change can create seismic shifts in the sociological landscape. And we’re subjecting ourselves to a degree of technological change unparalleled in history. Who knows what we’ve unleashed?