During the 2018 Charleston Conference, we had a panel of several of the “Chefs” from The Scholarly Kitchen (TSK). TSK is a blog that publishes five days a week on, as the tagline says, “What’s hot and cooking in scholarly publishing”. David Crotty (Oxford University Press) was the moderator of the panel, which included Alice Meadows (ORCID), Lettie Conrad (Maverick Publishing Specialists), Robert Harington (American Mathematical Society), Judy Luther (Informed Strategies), Lisa Hinchliffe (University of Illinois Urbana-Champaign), Ann Michael (DeltaThink), and Joe Esposito (Clarke & Esposito).
The following is Part 2 of an edited transcript of the panel. (Click here to read Part 1)
David Crotty: Thanks, Lisa. One of the things you’ve written a lot about lately is the question of privacy and how this is an increasing concern with the growth of digital platforms. Libraries have for a long time been stalwart protectors of privacy, and libraries are likely to play a key role in privacy going forward. Ann, your answer to the question addressed the balance of trust between users and digital platforms. We hear about the great potential of targeted, relevant services, but those often require the user to disclose an increasing amount of data to feed the algorithm and the A.I. Can you talk about that sort of dilemma? Are these services worth the loss of privacy that they seem to require?
Ann Michael: A common theme that we often deal with is people looking at a problem or challenge and trying to come up with a single answer to it. How we consider and manage privacy is one example of this. What I’ve observed personally and professionally is that when you have control over what data you give to whom, when, where, and how, and when you understand the benefit that you’re getting back for providing that data, you don’t really have a problem with it. Most people would acknowledge that the right people having their data can save them time, can save them money, can save them aggravation. But, by the same token, they would also tell you that a lot of their data is floating around; they don’t know where it is, what it is, how it was assembled, or what it’s being used for. That’s really the core issue — our ability to process and use data in new and innovative and sometimes scary ways has outpaced our understanding of how to manage it and our technical ability to manage it.
One of the things that really scares me is our ability in the US to stay competitive. I just came back from China, where I was exposed to some of the neat applications being developed. When you think about the richness of some of these applications, it hinges on the prevalence of available data. Many of the same uses of data would make people here in the US cringe. They are moving into uses of data in e-commerce, including things like chips and detailed tracking of what you’ve purchased, that would scare many people I know to death.
But what scared me was if we don’t find a balance, if we don’t figure out how to decide what we’re willing to let people use, in what way, and how we’re going to leverage data as an individual, as a company, and as a country, we might not have such a rich and flourishing economy 15 years from now. Someone else will have become really good at things that we were too nervous about pursuing (for good reason).
So, I don’t want to say that I don’t think people’s privacy concerns are valid. They are. I have them too. But, you know what? My identity was stolen 20 years ago from paper. So, there’s always a downside to anything we do, and anything we share. I think that we have to figure out (and I know this is an easy thing to say and a hard thing to do) how are we going to catch up or at least get closer to the abilities that people have to use our data. How are we going to get closer to being able to manage, and at least have some say in, that process? The problem isn’t that data can be used in all these different ways. It’s that we feel helpless.
David Crotty: I think you’re right. You can look at any challenge as “well, this is the end of the world” or “this is an opportunity,” and I think that that brings us to Joe, your answer was about a different kind of trust – in an era of uncertainty, it’s hard to sort out all of the threats and opportunities that present themselves. You work with a lot of clients, particularly the not-for-profit research societies that Robert talked about. How do you help them build enough knowledge to trust in the decisions that they’re making? Can you talk about how the sort of self-analysis that the different stakeholders in the community can be doing to help them find a path forward?
Joseph Esposito: As Robert mentioned, when we were going through our preparation for this, David started to get very anxious. There was gloom and doom, and everyone was feeling oppressed, and I said, “gosh, you’ve got to find something positive to say”. Certainly there are some very difficult circumstances that we have been living under for some time. But I have to make a confession here today: my name is Joe Esposito, and I am an optimist. What keeps me up at night, I can say very easily, is just that there are so many opportunities out there.
It is astounding, and it is hard to sleep because you want to jump out of bed and go after all of the latest. The sense of gloom and doom is, to my mind, simply a lack of imagination, and really what we need to do is focus on imagination and innovation, on our best qualities and not our worst qualities, in order to go forward. I believe that’s in my DNA; this is just the way I am constructed, and I can’t help it. I don’t want to belittle the people who are feeling anxious and feeling that there are these issues of trust out there. But, I have to say, I really think that is looking at the wrong problem.
When we have a political figure who talks about, you know, “the failing New York Times”, that is not a statement of a lack of trust in authoritative information. It is rather very much the opposite. It is the awareness that that authoritative information is trusted by so many people. It poses a threat to a bully, and in the discussions of fake news, or things like climate science denial, what we always hear are desperate attempts to overturn authority structures that have served us very, very well. So, I have to say that rather than thinking about a failure of trust or decline of trust, I think we should be thinking about a failure of nerve, which is, I think, more really the cultural issue that we’re dealing with.
So, my view is that that is the key problem we’re facing today, and I want to narrow this from a more global discussion into scholarly communications. I think the real issue that we’re facing today is the sheer explosion of information and knowledge that is being created. We hear the figure that we’re up to 3 million scientific articles a year now. That is an astonishing amount, and what it means is that every day the amount of information just grows and grows and grows. It becomes harder and harder to find what you’re looking for. That puts a premium on discovery services. We have all these new tools to help us. I happen to think that it’s a little bit like the novel Tristram Shandy: if you try to narrate your life, you realize you’ll never catch up, that your life keeps going on faster than the narration. We will not catch up with this knowledge explosion, and that is, I think, a very, very real concern.
So, where does one go for this? I would say this: in our work with professional societies what we say to them is that you have to be aware of the fact that you happen to possess the greatest search engine that’s ever been found, and that is a reliable brand. You have a name that is associated with particularly important quality — qualitatively superior information in a particular field — and people go to that brand in order to find that information. The second thing we tell them is that you have to stop thinking of yourselves as publishers. You have to begin to think of yourselves as brand managers and you have to find opportunities to put this brand on other kinds of high quality information, all of which in fact is a message about the quality of your brand and the authority that you bring.
I would say that there is something we should not do, and by this I mean not just publishers, but the community at large, and that is to do anything that undercuts the brands that are associated with quality publication. And we know that there are so many attempts to do that. We recognize, of course, that some people think the journals are not important and that they are just accidents, and we talk about editorial boards and the review process as just being rigged situations. But, if you work with them you realize they’re not perfect, but they’re pretty good. So, I think what I would say about this issue of trust is that we have forgotten that all of us do in fact put our trust in certain kinds of things, and we go back to them over and over again. The brand is the best search engine that has ever been created. It’s far better than Google, and what we encourage societies to do is to have a strategy to reinforce the strength of their brand and to extend it.
David Crotty: Thanks, Joe. That’s the end of my prepared questions.
A follow-up question for Lisa — Ann talked about privacy and you’ve written a lot about privacy. You and I often argue about whether the cat is out of the bag already as every one of your students in your library is signed into Facebook and Google while they’re using your library tools, so someone is tracking them already. But, you’ve made a lot of really compelling arguments that we still need to make the effort to protect their privacy and the idea that a user should control the information about themselves is I think a strong thread in what you’ve written. Can you talk a little about that?
Lisa Hinchliffe: If you’re a librarian, you know that this is an issue that’s really roiling through our community, as the historic way that we’ve understood protecting people’s privacy, with the library in the role of gatekeeper to privacy, is in contrast with the reality of today’s digital age. We might still delete your circulation record, but at the same time you’re going off to third party platforms all the time, all of which are tracking you. For my own thinking, as David alluded to, and I think the really strong point Ann was making here as well, it is particularly problematic when people’s data is being collected and tracked and used in ways that they are not even informed about, or honestly, in some cases, is intentionally kept from them. The obscuring of the way that data is being collected and the sort of language that is sometimes put forward seems very reassuring until you read it.
These are some very interesting things. It’s easy for us to look at the positives. It’s easy to also tell the scary story. I think though, that the issues are much more challenging when we actually don’t know how reality is being shaped for us. SSP, our sponsor, also sponsored a pre-conference at this event. We had a panel of researchers and it was very interesting because when they were asked about privacy, they said, “I assume I don’t have any privacy”. They said that was particularly the case since they work for a state institution. And I work for a state institution. Any of you can FOI (Freedom of Information) my email, you can see what my salary is. So, there’s a degree to which this is absolutely the case, but, I then posed to them the question of if you could actually roll that back would you want to? And they said, “Yes, absolutely”.
Acknowledging that your privacy has essentially been taken away from you does not mean you like it. And so I think that’s the other question which is what would it look like? Because I agree with David, it’s kind of gone. So, if we wanted to roll it back what would that look like? I don’t think we have a good answer for that. I can say that I do have a grant with a couple of people at Montana State University where we’re trying to have this kind of conversation and if you happen to be coming to the Coalition for Networked Information Conference in December, we’ll be looking at questions of what would library license language look like if the publisher community and library community could agree on how data was going to be collected and used in ways that librarians could say “yes, we can feel good about this from our privacy protecting standpoint,” and publishers can feel good about it in the sense of it allowing that kind of innovative platform development that our users seem to want. So, it’ll be interesting to see if we can come up with something.
Ann Michael: I was actually going to make one little comment about your horror story on the social score. So, I probably spoke to at least 50 different people in China about the social score, not a representative or random sample at all, I admit. Although I was horrified, 48 of them thought it was a great idea. And the reason they thought it was a great idea was because of the firm belief that it would only benefit them, that it might be a problem for other people, but it wouldn’t be a problem for them. And now we can chuckle and say “well, that’s really naïve.” And that may very well be naïve. But, I bet you there will be people for whom it is a very positive thing. The problem is arbitrary analysis and the governmental use of their data in ways that they have not given permission for, and the fact that they are not given the option to participate or not. It is being forced on them. I think that there is value in sharing data and there are also many horror stories. We need to be able to manage and balance privacy versus benefit. We need to be able to take more ownership. People need to be accountable. There needs to be some regulation around how people can use data, but we can’t just be afraid of the data itself or we are going to find ourselves really far behind in this world.
Lisa Hinchliffe: I think we actually agree quite a bit — like 90 percent — here.
David Crotty: But, you both talked about this idea of “invisible forces,” something like the social score without a clear explanation of how exactly it works. In the world of scholarly communications we’re really turning towards A.I. [artificial intelligence] as a growing factor. How do I discover material, and what role are algorithms playing? A.I. is creeping into the peer review process: you’re sending in your article to us, and we’re going to run it through our A.I. and see what it thinks of it. These are, to most of us, impenetrable black boxes. What does that mean when we know they can be biased, and we know they can be gamed? What does that do for this idea of trust? Is this something we need to be concerned about, or are there ways we can be transparent with this?
Lettie Conrad: Can I give a shout out for transparency? Alice and I had a chance to chat a bit last week ahead of this event and we said, “Okay, well, we want to be positive,” and one of the surefire things that we came up with was transparency. I think it earns quite a bit of trust to label privacy policies correctly as data management policies, and specifically for organizations to be very upfront about their process. This is our mission; this is how we’re getting it done. It comes back to that individual integrity piece. You know, if each of us were getting up every day and doing our best and being willing to share exactly what’s going on in our organizations, then there’s no reason to be ruled by fear. And I really liked what Joe said about the crisis of nerve, or the lack of nerve; there really is something to that. It takes some bravery to be able to say, “Yeah, here’s my messy reality. Here’s the transparent reality of how I do what I do.”
Ann Michael: One of the things that we do in my organization is we pool together a lot of data from disparate sources. We do surveys and collect other data and do lots of interesting things to look at the open access market. What astounds me is that publishers really don’t like to share their data, and it’s very difficult for us to get a handle on what’s going on, when, if it was all shared, even if anonymized, it would benefit everybody. Earlier at this meeting there was a talk about the Metadata 2020 initiative and the fact that our data is so poorly prepared; we don’t have an OA indicator at the article level. So, people guess which papers are hybrid and which aren’t just based on what the license is and where it was published and what not. So, just the idea of getting better data quality, of being brave enough to share it in an anonymous fashion, I think is really important so we all kind of understand where we are and where things are going.
Alice Meadows: That goes back to transparency again, doesn’t it? You want transparent metadata on top of everything else. I just want to add to Lettie’s point and to go back to your original question which is about AI. Transparency is the answer for AI. I know never in a million years is Google going to tell us how their algorithms work. But, you know, wouldn’t it be great if they did because then we would understand why when you type in some search terms you get whatever horrible stuff it is that you get. And then we could all help, hopefully, make search results better. There are also awful stories about Amazon, I think, using AI to screen job applications and basically this is a sort of self-reinforcing, “only ever hire men” algorithm. You know, this is terrible. But if it is transparent at least you can see it and you can fix it.
David Crotty: Joe, Lettie just mentioned this idea of nerve that you’ve brought up and, one of the things that’s frustrating when you see the different publishers get together in these industry groups and through these informational projects is that nobody wants to disclose any information about what they do because there’s this fear that runs through most organizations — that if I tell you what I am spending to run the air conditioner in my building, Elsevier is somehow going to use that and do something that hurts my business. It is a fear that if I dare let my secrets out of the bag, it’s going to destroy me. And how do you know where you can be open and transparent, Joe?
Joseph Esposito: I’d like to address this from a slightly different angle. So, we were talking about this incredibly scary world of living under surveillance all the time, but I’d like to share an anecdote. When my daughter was in high school, like most kids in the United States, she was asked to read Aldous Huxley’s Brave New World, a novel that I imagine most of the people in this room have read. After she read the book she had to write a paper on it, and she wrote the paper and said, “Dad, would you read my paper and tell me if it’s any good?” I said, “sure.”
And I sat down and read the paper and was utterly astonished to find out that, “gosh, my little daughter didn’t realize that this was a dystopian novel, full of sex, drugs, and rock and roll.” She read through this book and said, “oh, my gosh, what a great place to be.” And as I was reading her paper I started to reflect, and I went back and read the book again and said, you know, this is interesting. She’s got a point. I have to say that as a Dad I don’t like the sex part too much. But what we realized was that what was dystopian in the 1930’s, I’m not sure just what the date of publication was, didn’t really seem quite so scary today. So, what I would suggest here is that some of the anxieties that we’re expressing here are not going to be solved. They’re simply going to be superseded. We once had the philosophical problem of how many angels can dance on the head of a pin; there is no answer to that question, but we don’t ask that question anymore. And I think a lot of the things that are making people feel very troubled today are questions that will resolve themselves in due course, not by being answered, but because we’ll move on to another level of question that will engage us at that time.
David Crotty: Robert a last word before we’re out of time?
Robert Harington: A slight twist on this as a publisher. As mathematicians, one of the things we’ve been involved with is the development of open source tools. I think that as a publisher, you could have this in mind at the beginning of the publishing process, and start being transparent and building and providing tools to the community that can be used by others. I’m thinking of things like MathJax, for example, here. But, just thinking about transparency, not just in terms of AI, not just in terms of privacy or piracy, however you say it, but in terms of open source tools that you can actually deploy and then share with others. I think that’s another way of looking at it.
David Crotty: I think that’s a huge — we’re at a huge moment where that’s really becoming increasingly important to the community. I think we are a couple minutes over. But thanks for your attention and thanks to the Chefs for their opinions.