In this episode, Nikesh Gosalia talks to David Worlock, who started one of the first online publishing services in the mid-80s. David gives his take on how different access to information is today and how it will evolve in the near future. David calls this “The Digital Revolution”. He explains how AI will help obtain information, how metadata is a critical factor, and what role “nanopublishing” will play.
David Worlock is a Cambridge History graduate. He was CEO of the pioneer development of EUROLEX (1980–1985), the UK’s first online service for lawyers. In 2013, the Professional Publishers Association (PPA) honored him with the George Henderson Award for lifetime achievement for his work in publishing and information marketplaces.
All Things SciComm is a weekly podcast brought to you by ScienceTalks.
Hey, everyone. Welcome to All Things SciComm. All Things SciComm is a weekly podcast brought to you by ScienceTalks, a media platform that aims to make science accessible to everyone. In this program, we will dive into the latest from the sci-tech world.
A little bit about myself. My name is Nikesh Gosalia, and I am joining you from London. I have been a part of the Science Communication and Scholarly Publishing industry for more than 14 years. I have had the privilege of working with researchers, academic publishers, journals, societies, and universities. It is my absolute pleasure to be hosting this podcast about the latest in science, tech, and research, an area that I am tremendously passionate about.
Let's get started with our first episode.
Today, I am chatting with David Worlock. David is one of the veterans of scholarly publishing, having begun his career in the 1960s. He has had the privilege of starting one of the first online publishing services way back in 1985. David has spent decades creating strategies and solutions for publishers and content companies. David is a very dear friend as well. Welcome, David!
Thank you very much.
I don't know if I have done justice to the introduction, David. But if you would like to elaborate on that and talk a little bit about your journey in the publishing industry, I think that would really be beneficial for the listeners.
Well, for me, Nikesh, listening to your kind introduction, I reflect that I am pre-digital man. I have lived through the digital revolution in all its various stages. I do reflect that one of the great gulfs in understanding, in our society really, is the idea that when we took the world of print, and put it into digital form, we had accomplished something revolutionary. We had not. We had simply transformed the medium of communication. Now, we are deep in what is a real digital revolution. That is beginning to flow through scholarly communications. It suggests to me wholly different ways in which the processes, which we inherited from the print world, from the pre-digital world, may be transformed, truncated, or developed or improved in a genuinely digital environment, a born-digital environment.
Very interesting! Thank you, David.
Let's start at what seems like the very basis of scholarly communication as we know it today, the article. I have obviously read a lot of your blog posts. I have heard you say that the article is not for reading. Would you tell us a little bit more about that and why you say that?
Yes, I say that simply to keep my readers awake if you have got this far. But look, we live in a wholly different world when it comes to accessing information. We have inherited a world where articles appeared in print and people read them in order to stay up to date, or they read the A&I abstracts, they read intelligence about articles in order to keep up to date.
Now, patently, we are living in a world where it is impossible to stay up to date in that readership way. I don't know what the latest estimate is. But the last one I saw, I think, from Delta Think was around 470,000 articles published in the 2 years of the pandemic on COVID research. No COVID researcher has read all of that. We have to use other intelligent methodologies to scan this information and look at it.
Add to that, there is a huge time pressure now in our systems. Getting research to the frontline as fast as we possibly can becomes extremely important. On the 25th anniversary of arXiv, we see that what started out almost as a science-based server for putting up articles in preparation has led to the full development of preprint servers. That, again, suggests to me that the digital age will craft out its own ways of making intelligence available for artificial intelligence searching and for other forms of inquiry without necessarily going through the publication process as we know it in our present ecosystem.
Here are revolutionary changes on our doorstep. They come about not just because we have the tech, they come about because of the pressure of a research-based highly R&D-orientated society, not just in Europe and America, but of course, in China, Japan, and India as well.
Absolutely! Well, that's very interesting, David, and maybe I'll come back to your point about scanning intelligently all the research that's available. But we'll come back to that. I know that just an extension of what you have mentioned, you have also said that researchers do not write articles. Can you elaborate a little bit on that?
Yes, indeed, I can.
Well, just as a researcher cannot read all of the material available, it does seem to me increasingly likely that we will use intelligent means to assemble research findings. At this stage, I am beginning to call them research findings rather than articles. We have a familiar pattern from the print world, from the pre-digital world of what an article contains. It begins with an abstract, of course, and then it has the laying out of a claim, a hypothesis. Then, we generally follow that with a methodology for exploring that hypothesis and demonstrating its truth or its falsehood. Then, we have findings arising from that, maybe statistical, maybe evidential findings from the experimentation.
Then, we have a literature review. We have all of these structures, which we have built up. Then, we have a citation index. We have built all of these structures into the publishing ecosystem. Actually, of course, that material is available at different times in the course of a research process.
If you were strict followers, I guess, of the world of Open Science, you would say, well, there is an important break in the middle of that, that one should at the beginning of an experimental activity, publish the hypothesis and publish the methodology. Then, one should wait till the end of the research process and publish the results, knowing that the initial parts being publicly available, could not be altered, and that people would see whether you had hit your objectives or whether you had been unable to demonstrate that. Whether that methodology will become prevalent, I do not know.
But it does seem to me that much of what we are going to publish in the future as now comes off the lab notebook, comes off the automated structures that we have put in place when we made the research grant, comes off the machinery of input which accompanies our research processes, posting that material to a preprint server or to a repository. I think only PLOS has a preregistration repository at the moment, but I expect to see more of those. Then, it breaks up the article in its structure, opens it up to a much greater automation of filing of the various parts of this activity. Therefore, the article may itself fragment once in time, but be united in linkage online.
Very interesting! I know that you briefly mentioned Open Science and maybe I will come to that in a minute. But just going back to a very interesting point that you made, David, around scanning just the huge volumes of research that we have, intelligently. Are there one or two tips or emerging trends that you see that you would like to talk about and then we can move to the other topic?
I think the vital thing about this whole world is time and time again I see a sort of metadata neglect taking place. This makes life very, very difficult for the linked world that we are talking about. I was fascinated at this year's APE Conference in Berlin last week to hear the conversation around metadata, especially where it was concerned with tracking retractions and articles, which made false claims, or which might mislead other researchers. You are never going to be able to find that material or have it delineated for you unless the metadata is in place. It is a potent way for people to mislead themselves. Adding that metadata, adding it in full, adding it stage by stage, I think, becomes the critical factor here.
I look back to the old days of the beginnings of Crossref and realize that we thought we had done what we needed to do in metadata terms when we simply attached a number, when we simply put a DOI on things. We forgot that we need to put a DOI on every part of these things, forgot what we needed to do to extract the terms, and forgot what we needed to do to make sure that we could make claims identifiable.
Here, again, there is a group in Germany beginning to talk about nanopublishing so that you can distinguish claims digitally across bodies of material. That is where we are going. I am absolutely convinced of it. It's the only way in which machine-to-machine interoperability takes place. It's much more important now that our machines are fully able to access everything and know what it is than that we are.
Absolutely! Regarding Open Science, you have said in the past that Open Science is not a lens but a prism. In other words, people have very different emotions about "open". Can we talk a little bit more about Open Science, open research and in your opinion, what are your thoughts about it?
Let me use another metaphor.
I think Open Science is a climate. It is one which is increasingly pervasive, in that I think the younger generation of researchers will be intolerant of some of the practices which marked journal publishing in the worst of the bad old days. By which I mean the clear favoritism of editorial boards for a particular line of research, or the inability of people who queried that approach to get published in certain leading outlets: the sort of, shall we say, academic but very human bullying which has taken place, and which has sometimes shaded real results coming to the fore.
I also think we have to have a timeframe here. We love to sort of stamp things with a stamp of approval, "peer reviewed", and that means something to many people and less than something in a lot of different contexts.
I take as an example, the two physicists who won the Nobel Prize in 2021 – two of the three physicists who won the prize in 2021. Two of them are in their late 70s, early 80s. The articles for which they are cited as having won the prize were almost unrecognized at the time. It took 20 years for those articles to come to the fore. We think that at point of publication we know everything we need to know in judgment terms about whether this is worthy science, distinguished science, or pretentious rubbish.
It is absurd. What we need to do is to track the progress of articles over 20 years. We need to have our systems in place to be able to tell you what is rising in estimation, what is now being recognized as being a real line of inquiry while it was dismissed at publication. We don't have that in place in our systems. But in a digital world, we can have that in place in our systems.
Very interesting! Just staying with Open Science, I think another trend that we have been seeing, David, is also a lot of conversations around impact. Right through the entire process, all the stakeholders have been talking about what is the real impact of all the research that we do. There is a lot of work also starting to happen around dissemination and engagement. Any thoughts on that, David? Any trends that you see?
We are living in an open access world. Even more than in the world of paid publication, it is common in many places that dissemination ends at publication. I have put it into the system and therefore anybody who needs it can find it. What a falsity that is in the attention economy of the internet. That is a hopeless position to take, in my view. Therefore, we need to disseminate, and we need to do everything we can to ensure high impact.
I think in a few years' time, young researchers will be demanding of open access publishers: give me the assurance that if I put this through your outlet, you will, personally or through a service, alert all of the leading players in my sector. In the age of knowledge graphs, it is very simple to map a research sector, to find out where the researchers are and who is doing leading or indeed competitive work, and to make sure that everybody knows. Disseminating in the flaccid way in which we are at the moment has no real excuse, it seems to me. Then, we can begin to measure impact. We can begin to say who used that and what sort of response it got.
Then, we can begin to monitor the blogs and the conferences. Here, again, we are seeing with companies like Cassini and Morrissey [ph] that the conferences and the seminars, the in-house seminars of the research groups, are now getting monitored. Here we are in a position where we can see what the impact is, if only we can devise the tools to measure it. I think we are going to go through an impact and dissemination revolution in the next 5 years. Scholarship will benefit hugely from that. Hugely!
Absolutely! No, I fully agree with you, David. I think it's a trend that you are probably seeing. We have had the Research Excellence Framework here in the UK as well. The importance of impact is just growing. I agree with you. I think there is going to be a revolution, which is probably much needed.
Well, I would say to you, and you mentioned the UK researchers, it was fascinating to me that the UK research body chose to fund a scheme called Octopus to the tune of £650,000 to do initial research studies into how to improve publication processes. One of those improvements, of course, is about impact. It seems to me that even the award of that grant is calling out subscription-based publishing as inadequate, not doing the job, not getting the articles to the people who need them.