DeepDyve Introduces $0.99 Rental Service for Research Articles

October 27th, 2009

Today, we are pleased to unveil the NEW DeepDyve - the world’s largest online rental service for scientific, technical and medical research. From a growing database spanning thousands of journals, DeepDyve now gives you access to the full text of more than 30 million articles from prestigious publishers for as little as $0.99 per article.

According to a recent report from the Publishing Research Consortium, many knowledge workers face significant challenges when trying to find and discover high quality, authoritative information.  Much of this research is difficult to unearth from today’s search engines, and can be quite time consuming and expensive to purchase.

DeepDyve’s rental service builds on our initial research platform. Users can search across millions of articles from thousands of journals all in one place. Once articles are discovered, users can rent and read the full text of premium articles for as little as $0.99, or join a monthly plan with greater discounts and more flexibility. Of course, users can continue to view any “open-access” article for free.

To read the full press release, please click here.

If you’d like to learn more and try a risk-free 14-day trial, please visit our site.

New publisher partners

August 18th, 2009

DeepDyve is pleased to announce that the following publishers are now available for searching and discovery:

  • American Physiological Society
    With over 10,500 members, APS is devoted to fostering education, scientific research, and dissemination of information in the physiological sciences.  The APS produces 14 journals for its members who have doctoral degrees in physiology and/or medicine (or other health professions).
  • American Society for Microbiology
    ASM journals are the most prominent publications in the field, delivering up-to-date and authoritative coverage of both basic and clinical microbiology. Highly cited and well regarded, ASM’s 11 journals represent over 40% of all citations in Microbiology, according to ISI Journal Citation Reports.
  • FASEB (Federation of American Societies for Experimental Biology) FASEB advances health and welfare by promoting progress and education in biological and biomedical sciences through service to its member societies and collaborative advocacy.  The FASEB Journal ranks among the top biology journals worldwide (according to Thomson Scientific’s 2007 Journal Citation Reports). This monthly journal publishes peer-reviewed, multidisciplinary original research articles as well as editorials, reviews, and news of the life sciences.  FASEB also publishes the Journal of Leukocyte Biology.
  • Genetics Society of America
    Genetics Society of America (GSA) members are researchers, scientists, teachers, engineers, breeders, and geneticists in training. The purposes of the Society are 1) to facilitate communication between geneticists, 2) to promote research that will bring new discoveries in genetics, 3) to foster the training of the next generation of geneticists so they can effectively respond to the opportunities provided by our discoveries and the challenges posed by them, and 4) to educate the public and their government representatives about advances in genetics and the consequences to individuals and to society. The GSA endeavors to be the collective voice of its members on subjects where a deep knowledge of genetics and biological science is critically important.
  • Health Affairs
    Health Affairs is the leading journal of health policy thought and research. The peer-reviewed journal was founded in 1981 under the aegis of Project HOPE, a nonprofit international health education organization. Health Affairs explores health policy issues of current concern in both domestic and international spheres.
  • Hindawi Press
    Hindawi Press is a rapidly growing academic publisher with more than one hundred journals in science, technology, and medicine. All articles published in Hindawi journals are open access and distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
  • Information Sources, Inc. (TecTrends)
    TecTrends is a premier guide to the ever-changing world of intelligent technology. TecTrends combines detailed company information with thousands of abstracts of independent, third-party reviews and analyses across the full spectrum of technology, including coverage of computers and information technology, the Internet, biotechnology, nanotechnology, consumer electronics, and wireless technologies.
  • MIT Press
    Founded in 1926, MIT Press is the only university press in the United States whose list is based in science and technology. MIT Press publishes about 200 new books a year and over 40 journals.
  • Springer Science+Business Media:  Springer Protocols
    Springer Protocols is the largest subscription-based electronic database of reproducible laboratory protocols in the Life and Biomedical Sciences.  Comprising protocols from Humana’s successful books and handbooks on methods in molecular biology, Springer Protocols is published by the world’s second-largest publisher of journals in the STM (Science, Technology, Medicine) sector, the largest publisher of STM books, and the largest business-to-business publisher in the German-language area.

With these new partners, DeepDyve’s search engine now makes over 32 million documents discoverable, both premium and free (open access).

Hope on Twitter Search

June 18th, 2009

Today’s post has been written by Hope Leman.  Hope is a research information technologist and also contributes regularly to

This will be a very unscientific, random stroll through science-related and search aspects of Twitter. There is a lot of talk about Twitter replacing this or that technology: Twitter is destroying whatever prospects RSS had for ever gaining traction among the general Web-using public; Twitter is the new Google and so on.

What I would like to do this morning (and I do most of my explorations of search tools in the early morning before I go to my job as a research information technologist—which tends to entail trying to find announcements of grants and scholarships in the health sciences to list on ScanGrants, a free listing for such—and I am about to try to determine how Twitter does when it comes to finding research funding as a case study here) is to try to discourse knowledgeably on using Twitter in science search without writing long, hard to follow sentences like the one you and I are both enmeshed in at this point. That is the thing about Twitter—you can get both absorbed in what you are doing and increasingly scatterbrained and unable to think or express yourself coherently because there is so much fascinating stuff that you bounce along hither and thither sounding increasingly like an exceedingly eccentric person.

For example, I had hoped to simply go to the Search page of Twitter in order to see what I could come up with by searching for terms such as “grants” and “scholarships” and “funding.” But once I opened Twitter, all hopes of sticking to my proposed project vaporized immediately because I made the crucial mistake of glancing down at the tweets on my home page and got immediately distracted by items the titles of which sounded edifying.

For example, one of the most useful things I have found about Twitter is the fact that you learn about industries and fields you knew little about simply because people in them start to follow you and then you follow them and pretty soon you are starting to learn about marketing strategies in pharma and just now I have received an email from Twitter saying that I am being followed by this gentleman, Justin Johnson:

whom I had already been following on Twitter probably having found him via the Life Sciences room of FriendFeed.

That is one frustrating aspect of Twitter—there is often no record of how I came across a person to follow. I do save the emails from Twitter saying someone is following me. But are such people doing so because I followed them or because they came across my Twitter feed in the same random fashion that I came across theirs? And does it matter how one finds people to follow? To search professionals, marketers and social anthropologists parsing the intricacies of social networking and its societal implications it probably does.

But as someone just trying to learn as much as I can on a very superficial level (no time for depth in Twitterdom) as quickly as possible about subjects such as search, Science 2.0, Open Science, Big Science and so on, I just have to leverage my ability to read quickly and not stop to think these things through lest I find myself entrapped in yet another meandering sentence of my own devising.

And what do I read through as quickly as I can in order to find things to read, ideally in a thoughtful, contemplative frame of mind? I read my home page of Twitter, looking for intriguingly titled items such as the one I found this morning, “Ok, say you get a genome. What next?”

See here for what I saw.


That is the greatest danger of Twitter—the power of cleverly titled tweets.

This one was irresistible. It appealed to me as a non-scientist interested in science. In a few simple words, it promised to elucidate an important subject (genomics) in an approachable fashion.

That is what endows Twitter with its power as a tool for public education in science. Would I in the pre-Twitter era have visited something called the OpenHelix Blog or cared that there was a blog with this self-proclaimed mandate, “Here on the OpenHelix blog you will find a genomics resources news portal with daily postings about genomics resources, genomics news and research, science and more. Our goal is to keep you, the researcher, informed about the overwhelming amount of genomics data out there and how to access it through the tools, databases and resources that are publicly available to you.”

Would I have even known that such a blog existed? That is one of the reasons Twitter is a search story—I keep pushing scientist-bloggers to add Twitter buttons to their blogs so as to render their incredibly useful content discoverable. But they cling to RSS and email subscriptions as the primary modes of dissemination of their writings and seem to regard Twitter as beneath them. Major miscalculation. Increasingly, Twitter-generated material is appearing in Google results. Like it or not, if you aren’t in Google you are missing an opportunity to garner readers.

And on the matter of whether material gets read. It is this simple: I scan the homepage on Twitter. I notice a fascinating item such as the one on genomics. I note bits of wording that look significant and worthy of my time to retweet for the benefit of others who might, like me, need a lightning-fast glimpse into abstruse matters. (“To get appropriate data to display you need to annotate your genome. You need to curate your genome.”) By retweeting it, I have simultaneously saved it as a social bookmark and thereby created a personal library of useful items for my own use later on. And therein lies a problem—how do I search my own updates in Twitter? There must be a way to do so, but if so I have not found it. There are third-party Twitter-related apps galore (and searching for those would itself entail using such apps to find other apps in a never-ending cycle of appiness). Is there one for organizing one’s updates?

Okay, I have now written a great deal and never did accomplish my aim of investigating the potential utility of Twitter as a way of finding grants and scholarships. I have spent the past month working on getting ScanGrants Twitterized and that has been much more difficult than I anticipated. I have had it done by a real pro, thank goodness. But listing your own material (in my case grants) is one thing—searching through Twitter is another, and I will have to address that another day; it is a problem that smart people like the guys at DeepDyve are probably working on even as I prepare to end this sentence.

Real-Time, Integrated Search

May 28th, 2009

In prior blogs, we’ve discussed our vision of search, that it will evolve from a fairly isolated function today that takes place mainly at “portals”, to a function that is integrated  into a larger activity, such as research.  Users do not want interruptions in whatever they are doing and today, search is often just that, particularly as it relates to more in-depth information seeking.

Yesterday, DeepDyve announced a suite of free tools and widgets that allows website owners and bloggers to embed DeepDyve capability directly into their pages.  These tools will allow content owners to more seamlessly let their users read, browse and discover related articles and search results from DeepDyve.  In addition, DeepDyve also contributed a guest blog on the evolving role of search, which is impacting not just the technology industry but also the information and publishing industries.  As users find and consume information in new ways due to the Web, major industries will be transformed in how they compete…or don’t.

DeepDyve Launches Publisher Tools

April 29th, 2009

Yesterday we announced several exciting new tools and products for the Publishing industry. As you know, we work closely with many publishers to make their expert content discoverable in our search engine. With the release of these tools, DeepDyve will now enable publishers to leverage the DeepDyve technology on their own sites. These tools are specifically designed for publishers whose content is already in the DeepDyve index and who want to bring custom search results (i.e. just their articles) to visitors on their site:
•    Publisher Landing Pages — Publishers can leverage DeepDyve’s technology to drive more traffic to their sites. DeepDyve will dynamically build landing pages for EVERY article in the collection. Each landing page is information rich with related article links and is optimized to be more easily found in the major search engines.

•    Search API — DeepDyve’s search technology can easily be added to a publisher’s site to make their content more findable. Through a web services Application Programming Interface (API), publishers can simply ‘cut and paste’ the API code and allow DeepDyve to power the search.
•    More Like This Document API — This tool allows websites to directly interface with the DeepDyve database to search for articles that are related to any designated document within our index. The user simply clicks a MLT link or button and DeepDyve will return related article results using the designated document as the query (alternatively, the results can be streamed in automatically).
•    Content Highlight Widget — The Highlight Widget allows users at the publisher’s site to highlight any block of text up to 5,000 characters, then simply run that selection as a query.
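As a rough sketch of what the ‘cut and paste’ integration above might look like, the snippet below builds request URLs for the two API styles described: a search call and a More Like This call. The base URL, paths, and parameter names here are invented for illustration and are not DeepDyve’s actual API.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the post does not document the real DeepDyve
# API, so the host, paths, and parameter names below are illustrative.
API_BASE = "http://api.example.com"

def build_search_url(query, page=1, size=10):
    """Build a GET URL for a document-search web service.

    Even a paragraph-length query is simply URL-encoded, which is
    what lets whole passages of content serve as the query.
    """
    return API_BASE + "/search?" + urlencode({"q": query, "page": page, "size": size})

def build_more_like_this_url(doc_id, limit=10):
    """Build a GET URL asking for articles related to a known document,
    i.e. the designated document itself acts as the query."""
    return API_BASE + "/mlt?" + urlencode({"docid": doc_id, "limit": limit})

url = build_search_url("influenza antiviral prophylaxis in diabetic patients")
mlt = build_more_like_this_url("doc-123", limit=5)
```

A publisher page could then request the More Like This URL for each article it displays and render the returned results as related-reading links.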

“We are looking forward to DeepDyve powering the search on,” said Peter Jerram, CEO of The Public Library of Science.  “DeepDyve’s cutting-edge technology and ability to use entire sentences as a query will make it much easier for our users to find and discover new original research in science and medicine.”

Each of these tools is available for free with advertising revenue sharing, or for a fee which varies depending on volume. To learn more, please visit the Publisher section of our website.   You can also click here to read the entire press release.


About The Public Library of Science

The Public Library of Science is a nonprofit organization committed to making the world’s scientific and medical literature a freely available public resource. Its goals are to:
•    Open the doors to the world’s library of scientific knowledge by giving any scientist, physician, patient or student - anywhere in the world - unlimited access to the latest scientific research.
•    Facilitate research, informed medical practice and education by making it possible to freely search the full text of every published article to locate specific ideas, methods, experimental results and observations.
•    Enable scientists, librarians, publishers and entrepreneurs to develop innovative ways to explore and use the world’s treasury of scientific ideas and discoveries.

The Content Is The Query

April 2nd, 2009

According to a U.S. Census Bureau report, there are over 50 million “knowledge workers” in the United States. These knowledge workers span all demographics, ranging from secretaries to CEOs and from students to scientists. A recent report from Outsell found that 89% of knowledge workers share documents with colleagues on a weekly basis.

The belief at DeepDyve is that traditional search engines are great for quick questions — finding someone’s name; locating the URL for a company; reading today’s headline news — which can easily be described in 3 words or less.  But for knowledge workers who are looking for more depth, the strength of today’s search engine, namely finding the most popular content, becomes its weakness.  Run a search on any 3 words and you will be flooded with noise from millions of sites; expand your query and your results start to suffer, as engines only return results that contain ALL of the words, rather than any combination of them. Furthermore, today’s search is optimized for independently finding what you are looking for by going to a search engine site.  In the case of knowledge workers, however, search (or research) is highly collaborative and social (as evidenced by the 89% document sharing) as well as decentralized: it (ideally) takes place wherever the user is.  Researching a topic is not impulsive, but the impetus to quickly search something is often spontaneous: something that is read triggers deeper thoughts and ideas that require further investigation.
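The all-words-must-match behavior described above can be seen with a toy index. The documents and matching logic below are invented purely for illustration (real engines are far more sophisticated), but they show how each extra query word can only shrink, never grow, the result set:

```python
# A tiny "corpus" of document ids mapped to their text.
docs = {
    1: "flu vaccine prevention guidelines for adults",
    2: "flu season vaccine supply news",
    3: "antiviral flu treatment and prevention in clinics",
    4: "stock market news headlines",
}

def and_search(query, corpus):
    """Return ids of documents containing EVERY query word,
    mimicking the ALL-words behavior described in the post."""
    words = query.lower().split()
    return {doc_id for doc_id, text in corpus.items()
            if all(w in text.lower().split() for w in words)}

# Each added word narrows the result set:
assert and_search("flu", docs) == {1, 2, 3}
assert and_search("flu prevention", docs) == {1, 3}
assert and_search("flu prevention vaccine", docs) == {1}
```

This is the trade-off the paragraph describes: short queries flood you with matches, while long queries under ALL-words semantics quickly over-constrain the results.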

At DeepDyve, we are seeing some interesting statistics and trends that indirectly point to knowledge worker-type searches, i.e. long-tail searches.  According to HitWise’s Feb 2009 analysis (see below), back in 2004, over 50% of queries were just 1-2 words, and less than 5% were for 6 words or more. In 2009, you can see the ‘long tail’ forming as slightly less than 45% of queries are 1-2 words, and roughly 10% of queries are 6 words or more.

DeepDyve users appear already to be gravitating to this ‘long-tail’ query behavior. In looking at a recent sample of user data, we see that 43% of our queries are 1-2 words, which probably represents our users exercising their “Google muscle memory”. What’s interesting is that 20% of our queries are 6 words or more, double the percentage in ‘traditional’ search engines. We believe this indicates that some of our users start with simple searches and gradually adopt longer ones; alternatively, a segment of our users may already be writing long queries, having quickly recognized the benefits for their particular needs. Either way, this preliminary data reinforces the trend in the chart above and, we believe, points to a desire among users for more flexibility in describing what they want.

That leaves the issue of ‘decentralized search’, or searching from everywhere.  In the coming weeks, we will be announcing some exciting new capabilities that are meant to address this possibility. As we like to say, “the content is the query”.

The Great Unbundling of Content

March 11th, 2009

There is no question that Web searching has changed the business of publishing and marketing content. The old watchword among publishers that “Content is King” has been replaced by the web2.0 phrase “The User is King”. Just as iTunes allows music buyers to obtain a song unbundled from a record album or CD, so too does Google enable any Web searcher to very quickly identify and download articles on nearly any topic of interest from any of thousands of sources.

Google and iTunes weren’t the first to deconstruct publisher “packaging”. The unbundling of full text content first emerged in early online services like Dialog and LexisNexis. Initially, this was a great deal for publishers, since a single article from a newspaper that originally cost 25 cents on the street could be sold thousands of times online for as much as $3 a pop. In effect, researchers in the early days of the online industry were willing to pay a premium to use technology that could help them find that “needle in a haystack” article they needed. However, the rapidly declining cost of technology, combined with publishers’ desires to use that technology to better serve their audiences, has driven the price of an individual article, in many cases, to zero or close to it. Publishers still sell “bundled” content – print journals and institutional subscriptions are still alive and well – but increasing numbers of consumers have been trained by Google to think that many articles can be had for free.

This ability to obtain an individual article, song, or other piece of content has broad implications for both users and businesses. For users, while they are no longer required to buy more than they may actually want, they also lose the opportunity for discovery. How many times have you bought an album or CD because of a popular single, only to realize there are lots of other great songs now in your possession? iTunes and the like would argue that they create additional avenues for discovery through their ‘related artists’ and ‘recommended songs’ capability – and we agree, but more on that later. For content owners, this unbundling trend is of course very serious as it can cannibalize the sale of journals, books and CDs. The strategic question therefore is whether to ride and somehow maximize this unbundling wave, or resist the wave and potentially cede the end-user relationship to the platform providers, in this case Apple and Google.

Publishers for the most part have already begun preparing as if this process is irreversible. They are implementing new ways to replace revenues that are lost when customers no longer buy the whole ‘package’. This new repackaging can take many forms. Subscriptions to popular online journals, such as the New England Journal of Medicine, increasingly offer unique features, including email alerts, user communities, access to unique data sets, and multimedia content designed to make the value proposition of buying the “whole package” greater than the sum of the parts. Obviously, this can be an expensive proposition and requires technical expertise for adding and managing new features on the Website – not necessarily a core competency of scholarly publishers. Likewise, iTunes, Rhapsody and other online music sites also want to encourage larger transactions. As mentioned above, they are leveraging ‘related songs’ technologies and recommendation engines to encourage more discovery and therefore more purchases while still allowing the user to maintain their sense of control in determining what they want to buy, i.e the soft sell.

DeepDyve is jumping into the fray by offering publishers other ways to “repackage” their content that are very easy and inexpensive to implement. Similar to the features at the music sites, DeepDyve’s More Like This technology enables a user to discover related articles. The technology takes the contents from a single article and uses it to reach deep into a publisher’s archives to find additional articles that represent a more complete offering of the publisher’s work on the topic. Publishers benefit when more of the right content is presented to a prospective customer. More Like This can also be used across whole collections of journals, and often enables users to discover otherwise hidden relationships between subjects in different disciplines.

Because DeepDyve is not dependent on any metadata or taxonomy, the implementation of this tool is a snap. More Like This functionality is now available to any Website as an API – by copying and pasting a short programming script a publisher can now, in a sense, rebundle an ad hoc “journal” from any set of content, customized for the user’s needs at that moment. Rather than fight the momentum of unbundling, it’s possible that tools such as More Like This and ‘related artist’ will introduce a new means of bundling – one that is aligned with the evolving needs of the user.

The Future of Blogs

March 2nd, 2009

Occasionally, we will be inviting noted experts in the field of search and information to guest-blog for us.  Today, I’m pleased to introduce our first such entry from Joseph J. Esposito who also serves as an advisor to DeepDyve.

I was very pleased when Bill Park, the DeepDyve CEO, asked me to write a guest blog for DeepDyve.  It comes at a good time for me, as I have been blogging for a while now on a number of sites, principally Pubfrontier, but also at O’Reilly’s Tools of Change, The Scholarly Kitchen, and Teleread — blogging all over the place, but there is a feeling of same old, same old setting in.  It really is time to reinvent the blog, and DeepDyve is the place to do it.

No, DeepDyve is not suddenly going to make me smarter or any easier to get along with, but DeepDyve could bring its technology to bear on blogging to give us something new–and, as the B-school types like to say, something “value-added.”

Here’s my problem with blogs: they all pretty much look alike, do the same things.  Blogs differ because writers differ, but the blog as a form, well, it’s a short piece with some links; a blog roll in the margin; an archive; and maybe a snapshot of the family dog.  You read, you click on a link, follow it to the target, and maybe come home again.  There is a lot of horsepower under the hood of computers these days, and blogs just aren’t taking advantage of it.

So here is what I would like DeepDyve to do:  I would like DeepDyve to embed its search bar into blogging software so that when someone writes a blog entry (like this one), it automatically generates a search using DeepDyve’s KeyPhrase technology.  So imagine that you are working in WordPress or Movable Type.  You draft your post, tweak it a bit, add a few snarky comments, and then click on “publish.”  But instead of the post going directly to the Web, it first passes through the DD search system.  The entire text of the blog is used as a DeepDyve search query.  The results that come back can then be posted alongside the entry itself.

Imagine a blogger working in the life sciences, for example.  Her post concerns a recent news story about a connection between certain diets and dementia.  She clicks to publish and then sees her post next to a list of related articles and stories to further inform her audience.  The blog post, in other words, has been re-created as part of a network of information all interlinked by related concepts.  This DeepDyve blog “plug-in” doesn’t only dive deep; it also brings up a net of related resources.  Contrast this with an ordinary blog post, which relies on the painstaking work of the author to identify links.  Links should not be work; they should be automated.

This new kind of blog is something that pretty much only DeepDyve could build.  This is because unlike other search engines (or research engines, as Bill likes to say), DeepDyve allows you to use a query of any length.  Your blog post could be as long as a full-length article.  You couldn’t use such a long query with Google.  Try it and you will see.  Google puts an upper limit of 32 words on a query, and even queries that long tend to retrieve far too many pages to be useful.  But a DeepDyve blog would retrieve only the truly relevant material and put your post into the context of all other documents that resemble it.

Now this isn’t to say that DeepDyve cannot be a useful tool in blog writing today.  For example, I can achieve the same benefits of above by simply copying and pasting my entire blog entry into the DeepDyve search bar and running a query.  It will bring back all related articles and links that I can then incorporate into my blog as needed.  But if DeepDyve offered that seamless plug-in, well, that would make DeepDyve an essential and easy step in the blog-writing world.
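The proposed publish-time hook could be sketched as follows: the entire post text is used as the search query and the top results are attached alongside the post. The search function is injected (and faked below) because the plug-in and any real API details are hypothetical:

```python
def publish_with_related(post_text, search_fn, max_links=3):
    """Run the full post as a query and append the top related titles,
    mimicking the proposed blog plug-in's behavior."""
    results = search_fn(post_text)[:max_links]
    related = "\n".join("- " + title for title, url in results)
    return post_text + "\n\nRelated reading:\n" + related

def fake_search(query):
    # Stand-in for a real search call; returns (title, url) pairs.
    return [
        ("Diet and dementia: a review", "http://example.com/1"),
        ("Nutrition and the aging brain", "http://example.com/2"),
    ]

page = publish_with_related(
    "A recent story links certain diets to dementia.", fake_search)
```

In a real WordPress or Movable Type integration, `publish_with_related` would sit behind the platform’s publish hook and `search_fn` would call the search service over HTTP.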

So, Bill, let’s give it a shot.  It’s great that DeepDyve has pushed search technology to a new frontier, but let’s now have DeepDyve change the nature of communications about research as well.

Joseph J. Esposito is an independent consultant providing strategy assessment and interim management to the information industries. He has served as an executive at Simon & Schuster and Random House, as President of Merriam-Webster, and CEO of Encyclopaedia Britannica, where he was responsible for the launch of the first Internet service of its kind.

“The Flu Season Is Coming – Tips to Research Prevention and Treatment”

February 18th, 2009

Earlier this week, CNN Health reported that the flu season started later than normal but is now on the rise based on data from Google flu trends. The story also referenced an article that outlined how doctors diagnose the flu.

We decided to look into this further and found some fantastic information on how to protect ourselves from the flu.  We did this by copying the entire 176 words from the article and pasting it into DeepDyve. The top result was an article from Sage Publications called “Cold and Flu Survivor Guide” (The Diabetes Educator, Volume 30(1): 80-90).  It mentions “Three of the antiviral drugs (amantadine, rimantadine, and oseltamivir) have been approved for prevention of the flu. These drugs are not, however, a substitute for influenza vaccination. All of these drugs are prescription drugs, and a doctor should be consulted before the drugs are used…”.  Please note that this article is fee-based and can be purchased at the Sage website for $30.

Other useful results included:
• “An Ounce of Prevention Is Worth a Pound of Cure” (Patricia T. Alpert; “Home Health Care Management & Practice”, Volume Prepublished; SAGE Publications) also available for $30.
• “What three-letter word spells m-i-s-e-r-y? The CDC recommends a yearly flu vaccine as the first and most important step in defending against the virus” which is available for free.

Empowered Patients, Empowered Consumers

February 13th, 2009

“Here are the important stories and why”. “5 experts surveyed say this product is the best”. “Our analysts rate this stock a Buy”. And perhaps most importantly, “Here’s your diagnosis, here’s your treatment”.

Historically, information dissemination was a top-down affair where large institutions and so-called experts analyzed hard-to-get-to data and provided their authoritative voice on what we should do. The general masses lacked the tools and until relatively recently, lacked the education, to probe or even challenge the expert advice. For example, according to a recent study from the U.S. Department of Education, in 1960 just 41.1% of Americans age 25 and over had completed high school or higher, whereas by 2007 that figure had increased to 85.5%.

In today’s highly educated and highly digital world, consumers are not only well-schooled but also well-trained to find and analyze information for themselves through the Internet. And increasingly they are also well-connected through online forums and social networks. In the area of healthcare, this empowerment is taking an even further turn as patients become more active in their own diagnosis and treatment. Last December, BusinessWeek ran an article titled “Can Patients Cure Healthcare?” (Catherine Arnst, 12/15/08), in which critically-ill patients joined together, shared their medical history and even collaborated on experiments to treat their particular disorder. They would not or could not passively accept their fate as prescribed by the doctors, drug companies, and government regulators.

This “patients-as-partners” model is often called Health 2.0 and represents the broader trend of individual empowerment that is enabled by the wealth of information available on the Internet and the easy-to-use tools with which to search this data. There are already precedents or parallels to this macro-trend, most notably in entertainment, where the digitization of music has led to massive, and often illegal, distribution. The small number of gate-keepers, i.e. the music labels, could not stop the millions of persistent consumers who found new keys to getting their content. If not wisdom of the masses, perhaps it’s innovation of the masses. The question, which we will discuss in a future blog, is what this means for the information industry. Can it fight the trend of consumer empowerment and digital distribution of content, or will it need to rapidly adopt new strategies and technologies to capitalize on this seemingly inevitable movement?

More to come…