Archive for April, 2009

DeepDyve Launches Publisher Tools

Wednesday, April 29th, 2009

Yesterday we announced several exciting new tools and products for the Publishing industry. As you know, we work closely with many publishers to make their expert content discoverable in our search engine. With the release of these tools, DeepDyve will now enable publishers to leverage the DeepDyve technology on their own sites. These tools are specifically designed for publishers whose content is already in the DeepDyve index and who want to bring custom search results (i.e. just their articles) to visitors on their site:
•    Publisher Landing Pages — Publishers can leverage DeepDyve’s technology to drive more traffic to their sites. DeepDyve will dynamically build landing pages for EVERY article in the collection. Each landing page is information rich with related article links and is optimized to be more easily found in the major search engines.

•    Search API — DeepDyve’s search technology can easily be added to a publisher’s site to make their content more findable. Through a web services Application Programming Interface (API), simply ‘cut and paste’ the API code and allow DeepDyve to power the search.
•    More Like This Document API — This tool allows websites to interface directly with the DeepDyve database to search for articles that are related to any designated document within our index. The user simply clicks a More Like This (MLT) link or button and DeepDyve will return related article results using the designated document as the query (alternatively, the results can be streamed in automatically).
•    Content Highlight Widget — The Highlight Widget allows users at the publisher’s site to highlight any block of text up to 5,000 characters, then simply run that selection as a query.

“We are looking forward to DeepDyve powering the search on,” said Peter Jerram, CEO of The Public Library of Science.  “DeepDyve’s cutting-edge technology and ability to use entire sentences as a query will make it much easier for our users to find and discover new original research in science and medicine.”

Each of these tools is available for free with advertising revenue sharing, or for a fee which varies depending on volume. To learn more, please visit the Publisher section of our website.   You can also click here to read the entire press release.


About The Public Library of Science

The Public Library of Science is a nonprofit organization committed to making the world’s scientific and medical literature a freely available public resource. Its goals are to:
•    Open the doors to the world’s library of scientific knowledge by giving any scientist, physician, patient or student - anywhere in the world - unlimited access to the latest scientific research.
•    Facilitate research, informed medical practice and education by making it possible to freely search the full text of every published article to locate specific ideas, methods, experimental results and observations.
•    Enable scientists, librarians, publishers and entrepreneurs to develop innovative ways to explore and use the world’s treasury of scientific ideas and discoveries.

The Content Is The Query

Thursday, April 2nd, 2009

According to a U.S. Census Bureau report, there are over 50 million “knowledge workers” in the United States. These knowledge workers span all demographics and range from secretaries to CEOs, students to scientists. A recent report from Outsell found that 89% of knowledge workers share documents with colleagues on a weekly basis.

The belief at DeepDyve is that traditional search engines are great for the quick questions (finding someone’s name, locating the URL for a company, reading today’s headline news) that can easily be described in 3 words or less. But for knowledge workers who are looking for more depth, the strength of today’s search engine, namely finding the most popular stuff, becomes its weakness. Run a search on any 3 words and you will be flooded with noise from millions of sites; expand your query and your results start to suffer, because the engine returns only results that contain ALL of the words, as opposed to any combination of the words. Furthermore, today’s search is optimized for independently finding what you are looking for by going to a search engine site. However, in the case of knowledge workers, search (or research) is highly collaborative and social (as evidenced by the 89% document sharing) as well as decentralized: it (ideally) takes place wherever the user is. Researching a topic is not impulsive, but the impetus to quickly search for something is often spontaneous: something that is read triggers deeper thoughts and ideas that require further investigation.
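The difference between requiring ALL query words and accepting any combination of them can be illustrated with a small sketch. This is only a toy model under stated assumptions: the documents, the whitespace tokenizer, and the function names `match_all` and `rank_by_overlap` are all hypothetical, and nothing here describes how DeepDyve or any other search engine is actually implemented.

```python
def tokenize(text):
    """Lowercase the text and split on whitespace (a deliberately naive tokenizer)."""
    return set(text.lower().split())


def match_all(query, docs):
    """Strict matching: keep only documents that contain EVERY query word."""
    q = tokenize(query)
    return [d for d in docs if q <= tokenize(d)]


def rank_by_overlap(query, docs):
    """Loose matching: rank documents by how many query words they share."""
    q = tokenize(query)
    scored = [(len(q & tokenize(d)), d) for d in docs]
    # Sort by descending overlap; drop documents that share no words at all.
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [d for score, d in ranked if score > 0]


# Hypothetical three-document collection, used only for illustration.
docs = [
    "sleep deprivation impairs working memory",
    "memory consolidation during sleep",
    "working memory capacity in adults",
]

short_query = "sleep memory"
long_query = "sleep deprivation working memory adults"

print(match_all(short_query, docs))       # two documents contain both words
print(match_all(long_query, docs))        # no document contains all five words
print(rank_by_overlap(long_query, docs))  # all three documents, best overlap first
```

In this toy collection the longer query returns nothing under strict matching, while overlap ranking still surfaces every related document.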

At DeepDyve, we are seeing some interesting statistics and trends that indirectly point to knowledge worker-type searches, i.e. long-tail searches.  According to HitWise’s Feb 2009 analysis (see below), back in 2004, over 50% of queries were just 1-2 words, and less than 5% were for 6 words or more. In 2009, you can see the ‘long tail’ forming as slightly less than 45% of queries are 1-2 words, and roughly 10% of queries are 6 words or more.

DeepDyve users appear to be gravitating to this ‘long-tail’ query behavior already. In a recent sample of user data, we see that 43% of our queries are 1-2 words, which probably reflects our users’ “Google muscle memory”. What’s interesting is that 20% of our queries are 6 words or more, double the percentage in ‘traditional’ search engines. We believe this indicates that some of our users start with simple searches and then gradually adopt longer ones; another possibility is that a segment of our users actively write long queries because they have quickly seen the benefits for their particular needs. Either way, this preliminary data reinforces the trend in the chart above and, we believe, points to a desire among users to have more flexibility in describing what they want.
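For readers who want to run the same kind of breakdown on their own query logs, here is a minimal sketch. The bucket boundaries follow the 1-2 word and 6+ word groups quoted above; the middle 3-5 word bucket, the function names, and the sample queries are assumptions made for illustration, not DeepDyve's or Hitwise's actual methodology or data.

```python
from collections import Counter

BUCKETS = ("1-2 words", "3-5 words", "6+ words")


def length_bucket(query):
    """Assign a query to a word-count bucket, splitting on whitespace."""
    n = len(query.split())
    if n <= 2:
        return BUCKETS[0]
    if n <= 5:
        return BUCKETS[1]
    return BUCKETS[2]


def bucket_shares(queries):
    """Return the fraction of non-empty queries that falls in each bucket."""
    counts = Counter(length_bucket(q) for q in queries if q.strip())
    total = sum(counts.values())
    if total == 0:
        return {b: 0.0 for b in BUCKETS}
    return {b: counts[b] / total for b in BUCKETS}


# Made-up sample log, for illustration only.
sample = [
    "deepdyve",
    "protein folding",
    "gene therapy for cystic fibrosis",
    "effects of sleep deprivation on working memory in adults",
]

shares = bucket_shares(sample)
for bucket in BUCKETS:
    print(f"{bucket}: {shares[bucket]:.0%}")
```

Splitting on whitespace is the simplest possible definition of a "word"; a real analysis would need to decide how to treat punctuation, quoted phrases, and search operators.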

That leaves the issue of ‘decentralized search’, or searching from everywhere. In the coming weeks, we will be announcing some new capabilities that are meant to address this exciting possibility. As we like to say, “the content is the query”.