MultiGrain: Web-Scale Discovery in Context

by Patrick Carr, East Carolina University

Web-scale discovery tools like EBSCO Discovery Service and Summon by Serials Solutions enable patrons to instantly search a vast range of subscription databases, catalogs, and repositories. As a result, these tools are being widely implemented by libraries.

But, with their powerful searching capabilities, web-scale discovery tools also introduce problems. Becoming accustomed to the convenience of a simple, Google-like search experience, patrons might neglect to search specialized databases that would give them better results. Moreover, the fact that the leading vendors of these tools are also leading vendors of subscription databases raises questions about the extent to which the vendors’ interests in promoting use of their databases might lead them to rank search results for content in these databases over results in competing databases.

How should libraries address these potential problems with web-scale discovery?

6 thoughts on “MultiGrain: Web-Scale Discovery in Context”

  1. Pingback: Against the Grain Announces “MultiGrain,” conversations linking librarians, publishers, vendors, and others

  2. I’m not sure there is a solution. Moving from medical librarianship to general liberal arts undergraduate librarianship, I have had to come to the hard fact that we cannot (and should not) make some users “search better” than they are willing to. That doesn’t mean that we can’t improve their skills, but there is a clear limit. Searching more than one database is not something some users are willing or able to do. As for biases within search results in for-profit companies’ products… With virtually no power, there is really nothing we can do.

    The only thing we can do, which I’ve dreamt of for years, is that libraries take “search” into their own hands. With open source catalogs like Evergreen, there is the potential for libraries and librarians to have direct input into the search functions and targets. It would be great to bring this over to article searching as well. It is to dream!

  3. Patrick poses two interesting dilemmas with web scale discovery. Both were of importance to Serials Solutions in the development of the Summon service and drove us to develop new technology rather than trying a hybrid approach.

    Getting to specialized databases is a primary concern because it’s these databases that can distinguish the unique qualities of the library. They’re often the gems in carefully curated collections designed to serve the particular research needs of its constituents. Summon’s goal is to enable libraries to leverage these investments, no matter how specialized, and one way we’ve done that is through our “Database Recommender.” Database Recommender considers the user’s search, returns results, but leads those results with advice on other databases to consider. Take a look at this page from Grand Valley State University to see how it works:

    Patrick’s concern with favoritism is legitimate because it can sully the accuracy of the results. Libraries are best known for the quality and accuracy of their content and it’s essential that all of us who participate in this community work to support that brand value. Summon protects against favoritism by using a single, unified index where each record is treated identically. The single index also provides remarkable speed, but more to the point, it enables Summon to bypass federated search, where the agility of one vendor’s search engine can trump the relevancy of the content of another vendor. Those slower engines may come from a smaller, more specialized database… which brings us right back to the first dilemma.

    We can’t change how students and faculty want to search… we must meet their needs for simple, easy, fast access to library collections. But to protect the values of the library we need an all-new approach… one that gets it right and brings the user back to the library for keeps. Of course we believe that solution is Summon. We won’t succeed with a patchwork of past interfaces that drove users to Google.

  4. The posting raises two issues. Both are important and we appreciate you asking us to respond.

    1.) You mention that, due to discovery services, patrons might neglect to search specialized databases that would “give them better results”.

    This implies that the best specialized databases (e.g., subject indexes) are not in discovery services. It is true that the overwhelming majority of subject indexes are not part of the unified indexes in discovery services. With that said, for libraries that buy their subject indexes via EBSCOhost, EBSCO uses “platform blending” to allow the end user to actually retrieve a single result list from both the unified index and individual subject indexes, with lightning-fast speed. This is done while still allowing the individual information providers to retain their autonomy and highly detailed statistical capabilities. It means the usage of specialized databases actually goes up and their content is discovered even more often. In addition, we highlight each specialized database in the Content Providers facet, allowing the end user to limit results to a particular specialized database after its results have contributed to the main result list.

    2.) You also mentioned that “the vendors’ interests in promoting use of their databases might lead them to rank search results for content in these databases over results in competing databases”.

    While this is a reasonable concern, let me state unequivocally that EBSCO Discovery Service does not favor EBSCO content over content owned or provided by other organizations. Further, it is worth noting that EBSCO Discovery Service has a lot more non-EBSCO content than any other discovery service, so it is by no means dependent on EBSCO content. True competitors haven’t been sharing content. For example, ProQuest doesn’t provide content to EDS and EBSCO doesn’t provide content to Summon (owned by ProQuest). When a provider doesn’t work with EDS, we can’t include their metadata in the unified index or via platform blending, so instead we use “integrated federation”. This allows results of our competitors’ content to be added by the end user. Clearly, when a discovery service finds no way to incorporate particular resources, that discovery service has a bias against those resources. We avoid this scenario by trying to include all content in EDS in one of three ways: 1.) The unified index (e.g., Web of Science, JSTOR, LexisNexis, primary publishers of journals, and many others); 2.) Platform blending (any subject index accessed via EBSCOhost); 3.) Integrated federation (almost anything that can’t be integrated through the unified index or platform blending).

  5. Our library (Arizona State University) has used Summon since January 2009. Excluding a few stumbles and hiccups in the roll-out, it has assumed the status of an important tool for library research. We feature it as “Library One Search”, giving it some prime real estate on our main page.

    At debut, use was significant. We figure this generation of searchers tries everything and Summon proved the point. As with our other major search platforms, use continues to grow and develop. What do we hear from users? As we know, they are a diverse and fickle union. There are power users of precision and power users of recall. We set Summon to display by relevance, and any type of search easily retrieves useful and irrelevant results. This is the meme of discovery: show me what may be there. For power users intent on something this is valuable; it provides choices.

    For other users, though, results are often too many and not on point. They seek the certitude of the golden age of searching, although some may never have heard of Dialog or STN. “Too many hits,” one student messaged me. “I don’t want to deal with this.”

    For those of us who search for a living and help find stuff, we can easily lead this user in a better direction. Two cheers for Summon’s database recommender that helps these users to escape Summon’s botanical garden of search in a way less than SearchShank Redemption. But it isn’t enough.

    In the next generation of discovery, Summon and its competitors need to do something, even magic, to balance precision and recall. As a user of Summon, I suspect this lies in better and more complete metadata, which allows clustering results into myriad groups of subject, date, format, relevance.

    In the next generation, all discovery providers also need to be more open and clear about search algorithms and who is getting what first. We are puzzled. In the golden age of search, Dialog’s One Search normalized the search of its over 500 databases and submitted results, if duplicative, randomly. They did this to make the data providers happy; it was, after all, transactional pricing. We need to know now how you do it, perhaps to make better content choices and, who knows, get a leg up on the discounting.

  6. Pingback: ATG Article of the Week: “Calling International Rescue: Knowledge Lost in Literature and Data Landslide!”