
Posts Tagged ‘microsoft’

Bing! Microsoft Prepares For War With A Revamped Search Engine (Screenshots)

May 28th, 2009

[Image: kumo-tribe]
Today, Microsoft publicly unveiled its soon-to-launch search engine Bing. It will become available over the next few days, and be fully launched by June 3. On the surface, Bing has a distinct gloss. The home page features a rotation of stunning photography, for instance, which can be clicked on to produce related image search results. But the most significant changes are under the covers. “We have taken the algorithmic programming up an order of magnitude,” says Microsoft senior vice president Yusuf Mehdi. Each search result page is customized according to what type of search you do (health, travel, shopping, news, sports). The algorithms determine not only the order of results on the page, but the layout of the page itself, including which sections appear. These sections can include anything from guided refinements and a list of related searches in the left-hand pane to images, videos, and local results.

I’ve been playing around with a preview version of Bing for about a week. It is designed to be “more of a decision engine,” says Mehdi. Bing helps people make decisions through guided search and a focus on task completion. In a time when a new Website is created every 4.5 seconds, information overload is becoming a real problem. “People are getting hundreds of thousands of links but not getting what they want,” says Mehdi. Bing tries to alleviate this problem by offering up different experiences depending on the search.

The internal codename for Bing is Kumo (which is what you see in the screenshots), and the current release is called Kiev. Rather than a spare, blank screen, Bing’s homepage surrounds the search box with a single beautiful image, such as the one of the tribesmen above or a kinkajou. You can hover over parts of the image to get factoids about the image or click through to an image search result page to explore more. The left-hand pane offers the option to narrow your search on images, videos, shopping, news, maps, or travel. Each of these has a different look and feel. A travel search will turn up a page based on Microsoft’s Farecast technology asking you where you want to go, with flights, hotels, and destination information. A news search offers up headlines, photos, videos, and local news in a column on the right. A shopping search will bring up products and is tied into Microsoft’s Cashback program.

Every search also generates a guide on the left to help you refine your search. A search for “kinkajou,” for example, lets you refine by images, facts, sale, breeders, care, diseases, and videos. A search for “Samsung LCD TVs” brings up an entirely different set of guided results: shopping, review, manual, repair, buy, stand, images, and videos. If you search for images of “butterflies,” it lets you sift to show just Monarch, Swallowtail, Viceroy, Owl, and other types of butterflies. All of this categorization and concept-matching is Microsoft’s early attempt to bring some basic semantic search technologies into a mainstream search engine. Each guided option is dynamically generated, just like the different sections of the search results page. “Google tried to preempt this,” says Mehdi, referring to the new search refinement options Google launched last week, which are also in the left pane. Those Google options, which include the ability to search across different time periods or for related keywords, are “completely static,” criticizes Mehdi. “There is nothing new about it. It is a very minor rev, not as sophisticated as what we are doing. For us every query is special.”

Bing also takes advantage of Microsoft’s acquisition of Powerset to provide better previews and snippets of text when you hover over a result. Also, whenever a search brings up a “reference” tab in the guided exploration pane, clicking on that will bring up an enhanced Wikipedia article with semantic tags.

Onstage at the D7 conference, Steve Ballmer acknowledges: “There is no way to change the whole game in one step.” But search “deserves a good feature war.” And Bing will be rolling out new features as it goes forward. But is it enough to get people to switch? Bing is certainly not a game-changer, but it does cut out a lot of the back and forth that happens with so many searches today. If Bing can help people find what they are looking for faster, it will put pressure on Google to keep advancing the ball as well.

[Screenshots: kumo-screen-annotated, kumo-news, kumo-maps, kumo-kinkajou, kumo-images, kumo-farechase, homepage-630x315]

Author: admin Categories: General

Microsoft launching new search engine Bing (logo leaked)

May 26th, 2009

Within the next few days, Microsoft is expected to unveil its latest attempt at trying to be a player in the world of web search. After failing to get live.com any traction against Google, it will apparently launch a new engine called “Bing,” the project formerly known by its working title “Kumo.” This should be unveiled at the D conference, which starts today in Carlsbad, CA, but it looks like Microsoft may be giving us a peek at the logo a tad early.

While it appears that Microsoft may have already taken it down, I visited bing.com in my browser about 10 minutes ago and sure enough saw the favicon you see above. It’s a lowercase “b” with a yellow/orange dot in the middle. It would appear that this will be at least a part of the Bing logo. The light blue and yellow/orange color combination matches that of Kumo. I find that combination to be quite ugly (sort of like the Cleveland Cavaliers basketball uniforms from the 1990s, below), but hey, that’s just personal taste. All that really matters is how the search engine actually performs.

This favicon, which again, may only be a part of the logo, also looks a lot like the logo for Blinkx, the video search engine. That features a red lowercase “b” with an eye in the middle.

Microsoft is spending some $80 to $100 million on a marketing campaign for Bing, according to Ad Age. That’s huge by any standard, but especially when you consider that Google only spent $25 million on all of its marketing last year. I don’t know what Microsoft plans to spend all that money on, but I get the sneaking suspicion that Bing Crosby will be involved in some way or another.

Do Search Engines Love Blogs?

May 5th, 2009

Microsoft Explores an Algorithm to Increase PageRank for Pages Linked to by Blogs.

In the new patent document, they ask if the rankings of web pages in search results would be improved by providing a slight increase in the PageRank of pages linked to by blogs. They tell us that:

This idea is based on the assumption (or hope) that blogs are still mostly human-authored, and that links from blogs generally represent sincere endorsements on the part of the authors.

 

The December post explored how a search engine might be able to identify blog pages and distinguish them from non-blog pages, and told us that:

Search engines are increasingly implementing features that restrict the results for queries to be from blog pages.

But limiting the number of blogs that show up in search results doesn’t necessarily mean that a search engine doesn’t like blogs. It may mean that search engines would prefer to show a diversified set of search results, including blog pages and other results.

Ranking Algorithms

Search engines often look at a couple of different kinds of ranking factors when determining the order in which search results are shown to searchers.

Query-Independent and Query-Dependent

One way to classify ranking algorithms is as either query-dependent (or dynamic) or query-independent (or static).

Query-dependent ranking algorithms rely upon the query terms someone uses to rank pages. Query-independent ranking algorithms look at other factors, such as how important a page appears to be based upon whether important pages link to it. PageRank is an example of a query-independent ranking algorithm.

Query-independent ranking algorithms assign a quality score to each document on the web, and can be run ahead of time. Query-dependent ranking algorithms depend upon the query used, and have to be run when a user submits a query.
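To make the split concrete, here is a minimal sketch in Python. The toy pages, the link graph, and the function names are all made up for illustration, and a simple inbound-link count stands in for a real static score such as PageRank. The point is only that the static score is computed once, ahead of time, while the dynamic score has to wait for the query.

```python
# Hypothetical toy corpus: page id -> text.
PAGES = {
    "a": "kinkajou care and feeding guide",
    "b": "samsung lcd tv review",
    "c": "kinkajou facts",
}

# Hypothetical link graph: page id -> pages it links to.
LINKS = {"a": ["c"], "b": ["c"], "c": ["a"]}


def static_scores(links):
    """Query-independent: count inbound links once, ahead of time."""
    scores = {page: 0 for page in links}
    for targets in links.values():
        for target in targets:
            scores[target] = scores.get(target, 0) + 1
    return scores


def dynamic_score(query, text):
    """Query-dependent: count how many query terms appear in the text."""
    words = set(text.lower().split())
    return sum(1 for term in query.lower().split() if term in words)


STATIC = static_scores(LINKS)  # computed offline, before any query arrives


def search(query):
    """Rank matching pages by (dynamic score, static score), best first."""
    hits = [(dynamic_score(query, text), STATIC[p], p) for p, text in PAGES.items()]
    return [p for dyn, stat, p in sorted(hits, reverse=True) if dyn > 0]
```

Calling `search("kinkajou")` only runs the cheap term-matching step at query time; the link-based scores in `STATIC` are simply looked up.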

Content, Usage, and Link Based Ranking Algorithms

It’s also possible to classify ranking algorithms as content-based, usage-based, and link-based.

Content-based ranking algorithms - use the words in a document to rank the document among other documents. For instance, a higher score might be assigned to a document that contains the query terms at the beginning of a document, in a prominent font, or in a certain kind of HTML element.

Usage-based ranking algorithms - may assign a score based on estimates of how often documents are viewed, drawn from web proxy logs or from click-throughs on search engine results pages.

Link-based ranking algorithms - look at the hyperlinks between web pages to rank those pages, assigning a score to each page based upon the links pointing to it, with each link treated as an endorsement of the page.
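As a rough illustration of the content-based category only, the sketch below scores a document by where the query terms appear. The weights, the `early_window` cutoff, and the function name are assumptions chosen for the example; they do not come from any particular search engine.

```python
def content_score(query, title, body, early_window=20):
    """Score a document by where the query terms appear.

    Weights (all hypothetical): 3 points for a term in the title,
    2 for a term within the first `early_window` body words,
    1 for a term that only appears later in the body.
    """
    title_words = set(title.lower().split())
    body_words = body.lower().split()
    early = set(body_words[:early_window])
    rest = set(body_words[early_window:])

    score = 0
    for term in set(query.lower().split()):
        if term in title_words:
            score += 3
        if term in early:
            score += 2
        elif term in rest:
            score += 1
    return score
```

A page with the query term in its title and opening words outscores one that mentions it only far down the page, which is the behavior the content-based description above calls for.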

PageRank - an example of a query-independent link-based ranking algorithm.

The PageRank formula is often explained as follows. Consider a web surfer who is performing a random walk on the web. At every step along the walk, the surfer moves from one web page to another, using the following algorithm.

With some probability d, the surfer selects a web page uniformly at random and jumps to it; otherwise, the surfer selects one of the outgoing hyperlinks in the current page uniformly at random and follows it. Because of this metaphor, the number d is sometimes called the “jump probability,” namely the probability that the surfer will jump to a completely random page.

If the web surfer jumps with probability d and there are |V| web pages, the probability of jumping to a particular page is d/|V|. Since any page can be reached by jumping, every page is guaranteed a score of at least d/|V|. The PageRank of a particular web page is then the fraction of time that the random surfer will spend at that page.
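The random-surfer description can be sketched as a short power iteration. This is a simplified illustration, not how any production search engine computes PageRank: the default jump probability of 0.15, the iteration count, and the handling of pages with no outgoing links are assumptions, and every link target is assumed to appear as a key in `links`.

```python
def pagerank(links, d=0.15, iterations=50):
    """Estimate PageRank by power iteration over the random-surfer model.

    links: dict mapping each page to the list of pages it links to.
    d: jump probability (chance of jumping to a uniformly random page).
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page receives d/|V| from random jumps.
        new_rank = {p: d / n for p in pages}
        for page, targets in links.items():
            if targets:
                # Otherwise the surfer follows one outgoing link uniformly at random.
                share = (1 - d) * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Assumption: a page with no outgoing links forces a random jump.
                share = (1 - d) * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank
```

A page that nothing links to ends up with exactly the floor score d/|V|, while pages with inbound links accumulate more of the surfer's time.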

But what if that surfer started favoring pages that were linked to by blogs a little more?

Splitting PageRank

One of the problems behind using PageRank is that some commercial web sites try to inflate PageRank by creating links that point to a page solely for the purpose of endorsing that page, artificially increasing the value of the page.

This patent filing describes in some detail how a portion of PageRank from a page might be split (or distributed) equally amongst the links found on the pages of a site, and how the distribution of PageRank could be slightly altered to favor (or show a bias towards) pages that are linked to by blogs.

If blogs are, as the authors note in the patent, “still mostly human authored, and generally represent sincere endorsements of their authors,” then this bias might help counteract the artificial inflation of PageRank scores by people who would create links pointing to pages solely for the purpose of artificially increasing the PageRank of pages.

The patent filing is:

Ranking Method using Hyperlinks in Blogs
Inventors: Steve Chien and Dennis Fetterly
Assigned to Microsoft
US Patent Application 20080243812
Published October 2, 2008
Filed March 30, 2007

Abstract

A method for static ranking of web documents is disclosed. Search engines are typically configured such that search results having a higher PageRank® score are listed first. A modified scoring technique is provided whereby the score includes a reset vector that is biased toward web pages linked to blogs. This requires identifying web pages as either blogs or non-blogs.
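One way to picture a biased reset vector: when the random surfer jumps, the landing page is no longer chosen uniformly, and pages that receive a link from a blog are picked a little more often. The sketch below is only an illustration of that idea. The `bias` weighting, the function name, and the treatment of pages without outgoing links are assumptions, not the formula from the patent filing, and every link target is assumed to appear as a key in `links`.

```python
def blog_biased_pagerank(links, blogs, d=0.15, bias=1.0, iterations=50):
    """PageRank variant whose random jumps favor pages linked to by blogs.

    links: dict mapping each page to the list of pages it links to.
    blogs: set of pages that have been identified as blog pages.
    bias: extra jump weight (hypothetical) for blog-endorsed pages.
    """
    pages = list(links)
    # Pages that receive at least one link from a page flagged as a blog.
    endorsed = {t for b in blogs for t in links.get(b, [])}

    # Reset vector: the distribution the surfer lands on after a random jump.
    weights = {p: 1.0 + (bias if p in endorsed else 0.0) for p in pages}
    total = sum(weights.values())
    reset = {p: w / total for p, w in weights.items()}

    rank = dict(reset)
    for _ in range(iterations):
        new_rank = {p: d * reset[p] for p in pages}
        for page, targets in links.items():
            if targets:
                share = (1 - d) * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Assumption: pages with no outgoing links jump via the reset vector.
                for p in pages:
                    new_rank[p] += (1 - d) * rank[page] * reset[p]
        rank = new_rank
    return rank
```

With `bias=0` the reset vector is uniform and this reduces to ordinary PageRank; raising `bias` shifts score toward blog-endorsed pages and, through their links, to the pages they point at.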

Identifying Blogs

Some of the kinds of things that a search engine crawling program might look at when deciding whether a page is from a blog might include:

  1. Whether a page is hosted in a known blog hosting DNS domain such as blogspot or wordpress.com
  2. What features appear in the non-markup words and phrases contained in the page
  3. What the targets of outgoing links might be in the page, and
  4. Whether the string “blog” occurs in the URL
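The four signals above could be combined in a simple heuristic like the following sketch. The host list, the phrase list, the voting threshold, and the reading of the outgoing-link signal as "links to known blog hosts" are all assumptions made for the example; the patent filing summarized here does not spell out these details.

```python
from urllib.parse import urlparse

# Hypothetical list of blog-hosting domains.
BLOG_HOSTS = ("blogspot.com", "wordpress.com")
# Hypothetical phrases that often show up in the visible text of blog pages.
BLOG_PHRASES = ("posted by", "permalink", "comments", "trackback")


def on_blog_host(url):
    """True if the URL's host is one of BLOG_HOSTS or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in BLOG_HOSTS)


def blog_signals(url, page_text, outgoing_links):
    """Evaluate the four signals for one page; returns a dict of booleans."""
    text = page_text.lower()
    return {
        "blog_host": on_blog_host(url),
        "blog_phrases": sum(phrase in text for phrase in BLOG_PHRASES) >= 2,
        "links_to_blog_hosts": any(on_blog_host(link) for link in outgoing_links),
        "blog_in_url": "blog" in url.lower(),
    }


def looks_like_blog(url, page_text, outgoing_links, threshold=2):
    """Flag the page as a blog when at least `threshold` signals fire."""
    return sum(blog_signals(url, page_text, outgoing_links).values()) >= threshold
```

A real classifier would likely weight these signals rather than count them, but a simple vote is enough to show how the pieces fit together.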

Experimenting with a Bias Towards Pages Linked to by Blogs

The authors of this patent performed experiments in which they downloaded over 472 million pages, and found links to an additional 6 billion pages within those pages.

They recomputed the PageRank of these pages using a bias towards pages that they identified as linked to by blogs, with a preference towards using blog pages that had higher PageRanks, which they tell us tend to be “frequently updated, more informational rather than personal, and free of spam.”

They also tell us that some other characteristics of blogs may prove useful in refining this technique, such as looking at the number of subscribers to a particular blog, and associating a higher endorsement value to blogs with greater numbers of subscribers.

Conclusion

Can sending more PageRank to pages that are linked to by blogs increase the relevance and importance of pages that show up in search results? Are links to pages from blogs still actual endorsements from the authors of those blogs?

Do search engines love blogs?