The Content Is The Query

According to a U.S. Census Bureau report, there are over 50 million “knowledge workers” in the United States. These knowledge workers span fields and demographics, ranging from secretaries to CEOs and from students to scientists. A recent report from Outsell found that 89% of knowledge workers share documents with colleagues on a weekly basis.

The belief at DeepDyve is that traditional search engines are great for quick questions (finding someone’s name, locating the URL for a company, reading today’s headline news) that can easily be described in 3 words or fewer.  But for knowledge workers who are looking for more depth, the strength of today’s search engines, namely finding the most popular material, becomes their weakness.  Run a search on any 3 words and you will be flooded with noise from millions of sites; expand your query and your results start to suffer, because the engine returns only results that contain ALL of the words, as opposed to any combination of the words. Furthermore, today’s search is optimized for finding what you are looking for on your own, by going to a search engine site.  However, in the case of knowledge workers, search (or research) is highly collaborative and social (as evidenced by the 89% who share documents weekly) as well as decentralized: it (ideally) takes place wherever the user is.  Researching a topic is not impulsive, but the impetus to quickly search for something is often spontaneous: something that is read triggers deeper thoughts and ideas that require further investigation.
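To make the “ALL of the words” versus “any combination of the words” distinction concrete, here is a toy sketch in Python. It is only an illustration under simple assumptions: the tokenizer, the function names, the sample documents and the overlap-count ranking are all hypothetical, and it does not represent how DeepDyve or any other search engine actually works.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return set(text.lower().split())


def match_all(query, docs):
    """Strict matching: keep only documents containing every query word."""
    q = tokenize(query)
    return [d for d in docs if q <= tokenize(d)]


def match_any_ranked(query, docs):
    """Loose matching: keep documents sharing at least one word with the
    query, ordered by how many query words they share (most first)."""
    q = tokenize(query)
    scored = [(len(q & tokenize(d)), d) for d in docs]
    scored = [(score, d) for score, d in scored if score > 0]
    # sort() is stable, so ties keep their original document order
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored]


# Hypothetical documents and query, invented for this illustration
DOCS = [
    "protein folding in yeast cells",
    "gene expression in yeast under heat stress",
    "heat stress response in mammalian cells",
]
QUERY = "yeast gene expression heat stress response cells"

if __name__ == "__main__":
    print("ALL words:", match_all(QUERY, DOCS))
    print("ANY words, ranked:", match_any_ranked(QUERY, DOCS))
```

With this long query, strict matching returns nothing because no single document contains all seven words, while the loose version still surfaces every document and puts the closest match first.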

At DeepDyve, we are seeing some interesting statistics and trends that indirectly point to knowledge worker-type searches, i.e., long-tail searches.  According to HitWise’s Feb 2009 analysis (see below), back in 2004 over 50% of queries were just 1-2 words, and fewer than 5% were 6 words or more. In 2009, you can see the ‘long tail’ forming: slightly less than 45% of queries are 1-2 words, and roughly 10% of queries are 6 words or more.

DeepDyve users already appear to be gravitating to this ‘long-tail’ query behavior. In a recent sample of user data, we see that 43% of our queries are 1-2 words, which probably reflects our users’ “Google muscle memory”. What’s interesting is that 20% of our queries are 6 words or more, double the percentage seen in ‘traditional’ search engines. We believe this indicates that some of our users start with simple searches and then gradually adopt longer ones; another possibility is that a segment of our users actively writes long queries because they quickly recognized the benefits for their particular needs. Either way, this preliminary data reinforces the trend in the chart above and, we believe, points to a desire among users for more flexibility in describing what they want.

That leaves the issue of ‘decentralized search’, or searching from everywhere.  In the coming weeks, we will be announcing some new capabilities that are meant to address this exciting possibility. As we like to say, “the content is the query”.
