Data Expeditions: Mining Data for Effective Decision Making

Nov 7, 2018

(L-R) Ivy Anderson, Gwen Evans, Ann Michael (moderator)

This session was a panel discussion with Ann Michael, CEO of Delta Think (moderator); Ivy Anderson, Associate Executive Director, California Digital Library (CDL); and Gwen Evans, Executive Director, OhioLINK. Ann introduced the panel with the observation that we must use data as a tool and not let the perfect be the enemy of the good: work with the best that you have. No model is perfect; data is an asset that requires time and a special skill set.

Ivy described the CDL’s journal value algorithm.

CDL Journal Value Algorithm

The combined algorithm scores at the package level are used to identify packages that are ripe for review, cancellation, or renegotiation. Progression analysis is used to set a target price. In the area of open access (OA), the questions are what CDL authors are spending on article processing charges (APCs) and what an appropriate APC level is. 80% of UC’s publication output is with just 25 publishers, so those publishers are candidates for a potential OA agreement.
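The concentration figure above (a small number of publishers accounting for most of an institution's output) is the kind of result a simple cumulative count can produce. The following is only a minimal sketch under stated assumptions, not CDL's actual method: the function name, the input format (one publisher name per published article), and the sample data are all hypothetical.

```python
from collections import Counter


def publishers_covering(article_publishers, target_share=0.80):
    """Return the top publishers needed to reach target_share of all articles.

    article_publishers: iterable with one publisher name per article
    (hypothetical input format, assumed for illustration).
    Returns (list_of_publishers, share_actually_covered).
    """
    counts = Counter(article_publishers)
    total = sum(counts.values())
    covered = 0
    selected = []
    # Walk publishers from largest to smallest output until the target is met.
    for publisher, n in counts.most_common():
        if covered >= target_share * total:
            break
        selected.append(publisher)
        covered += n
    return selected, (covered / total if total else 0.0)


# Made-up sample: 10 articles spread over 4 publishers.
sample = ["A"] * 5 + ["B"] * 3 + ["C"] + ["D"]
top, share = publishers_covering(sample, target_share=0.80)
```

With real publication data, the length of the returned list would show how many publishers a consortium would need to approach to cover a chosen share of its authors' output.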

Gwen discussed OhioLINK’s services, resources, and problems. OhioLINK coordinates five regional depositories, which hold about 8 million items and are nearing capacity. 75% of the items are duplicated in more than 99 libraries. Costs for retrieval are increasing at an alarming rate. How can the space be made more valuable? De-duplication of high-density facilities is neither quick nor effective, and removing an item creates a gap in the locations. Instead of focusing on de-duplication, they redefined the problem: what is the minimum set of items that have to be touched? The repository can be compressed by focusing on what is unique rather than what is duplicated: 60,000 unique items were identified and moved out to a storage facility. The risk is shifted from inadvertently keeping duplicates to accidentally discarding unique titles.
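The reframing from "find the duplicates" to "find the unique items" can be illustrated with a short sketch. This is not OhioLINK's actual procedure: the function name, the data layout (a mapping from depository name to item identifiers), and the sample identifiers are assumptions, and uniqueness here is measured only among the depositories passed in, not against member library holdings.

```python
from collections import Counter


def unique_items(holdings):
    """Map each depository to the items that no other depository holds.

    holdings: dict of depository name -> iterable of item identifiers
    (hypothetical layout, assumed for illustration).
    """
    # Count in how many depositories each item appears (once per depository).
    counts = Counter(
        item for items in holdings.values() for item in set(items)
    )
    # An item is unique when exactly one depository holds it.
    return {
        depository: sorted(i for i in set(items) if counts[i] == 1)
        for depository, items in holdings.items()
    }


# Made-up sample holdings for three depositories.
sample_holdings = {
    "NE": ["a", "b", "c"],
    "SW": ["b", "c", "d"],
    "NW": ["c"],
}
result = unique_items(sample_holdings)
```

Only the items returned by such a pass need to be handled individually, which is the sense in which the set of items to be touched becomes much smaller than the set of duplicates.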

Unique monographs

Statewide negotiations are being undertaken with textbook publishers to resolve high prices to students. Bookstores control the data, but it is not easily exposed. Over 30 separate institutions must be contacted, and it is hard to determine who to ask for the data. The collection was eventually defined as everything from major textbook publishers. To monitor pricing, OhioLINK is simulating a unified bookstore and determining an OhioLINK price that is used to negotiate with publishers.

The ability to collect data is a competency of libraries. We must consider its quality as well as the ability to manage it. Subject experts want to see the whole range of data; when you are explaining data to a legislature, it must be at a third-grade level!

Don Hawkins blogs about conferences for Information Today and Against The Grain. He also maintains the Conference Calendar on the Information Today website and is the Editor of Personal Archiving: Preserving Our Digital Heritage, published by Information Today in 2013, and Co-Editor of Public Knowledge: Access and Benefits, published by Information Today in 2016. He received his Ph.D. degree from the University of California, Berkeley, and has worked in the information industry for over 45 years.
