Insights Xchange: Conversations Shaping Academic Research

The Story Behind Figshare with Mark Hahnel

January 09, 2023 ScienceTalks Season 2 Episode 9

Join host Nikesh Gosalia as he chats with Mark Hahnel from Figshare about the creation of the Figshare platform. Mark talks about his background in genomics and how a gap year spent travelling eventually led him to do a PhD in stem cell biology. As a PhD student, he experienced struggles while trying to publish his research findings, and this inspired him to build his own publishing platform, which is now known as Figshare. For Figshare’s 10th anniversary, Mark reflects on Figshare’s collaboration with Digital Science and how the visionary thinking of their CEO led to Figshare being created before open science became common practice. Mark talks about the different versions of Figshare, from the free figshare.com which is open to everyone, to purpose-built versions of Figshare meant for specific organizations. This ties in with the open science concept of “as open as possible, as closed as necessary”, where not all datasets should be publicly available for safety. Mark shares his thoughts on what academics want out of publishing: something fast, good, open, and possibly free. He also addresses some drawbacks of Figshare, such as quality control, lack of data curation, and content monitoring issues.

 

Mark Hahnel is the Founder and CEO of Figshare, the all-in-one repository for papers, FAIR data, and nontraditional research outputs. He is passionate about open science and its potential to bring positive change to the research community. Mark has acted as an advisor for the Springer Nature master classes and is currently on the advisory board for the Directory of Open Access Journals (DOAJ). He can be reached on Twitter or LinkedIn.

Insights Xchange is a fortnightly podcast brought to you by Cactus Communications (CACTUS).

Figshare turned 10 this year. I called it Figshare because it's the smallest unit of academia you can share. And so, fig one, fig two, fig three. But now, everybody thinks it's about the fruit, which is fine by me. I loved it. It never felt like work. It just felt like a continuation of doing the stuff that I like.
Wow! That's fascinating.
All Things SciComm. What does the future of science look like? What's happening in science communication? Here's your host, Nikesh Gosalia.
Hi, everyone. Welcome back to All Things SciComm. Today's guest is a visionary. He's the Founder and CEO of Figshare, the all-in-one repository for papers, FAIR data, and nontraditional research outputs. His passion for open science is revolutionizing the research community. His experience as both a scientist and a businessman enables him to provide unique points of view in this space. Everyone, please welcome today's guest, Mark Hahnel. Good to have you today, Mark.
Thank you, Nikesh, it's great to be here.
Let's get started at the very beginning, Mark. What got you into the research field in the first place?
I did okay at school, where I was good at mathematics. I always intended to go to university, and when I got to that stage, I ended up doing Biological Sciences. It was, kind of, a nice little funnel. I started doing Biological Sciences up in Newcastle, and then you specialize. From the beginning, I was really interested in doing something that I like. I had this choice: do maths, which I am good at, or do something that I am more interested in. I am really happy that I chose to go into the life sciences at that point. I am amazed that a 17-year-old me made some good decisions. I started doing biological sciences and you get to do a little taster of several different modules. I knew I wasn't very good at pharmacology, but I knew I really liked the genetics side of things.
It was around the time Dolly the Sheep was big in the news. And so, it all seemed so, kind of, like sci-fi and bleeding edge. And so, I ended up majoring in genomics, as they say in America. My undergrad was genomics. I did a master's in human genetics. I didn't ever intend to go into postgrad research. But then, I went away travelling for a bit, as people occasionally do in the UK when they take a year out. On that trip, I met somebody who did genomics and was going to do a master's and a PhD in genomics. He had a plan. I said, why are you going to do that? He said, well, I like doing it. I don't know what else to do. I thought that sounds like a good idea. Universities weren't as expensive as they are now. It seemed that it was much more accessible. I did a master's in human genetics and managed to fall into a PhD in stem cell biology as a result of it. Yes, I loved it. It never felt like work, it just felt like a continuation of doing the stuff that I liked. I never intended to leave academia, but I am a good 10 years out of academia now. The best-laid plans don't always go to plan.
Wow, that's fascinating, Mark. Like you mentioned, to do something that you really like is generally a privilege. Could you just maybe talk a little bit about what Figshare is all about and where you are in the journey right now?
Yes, sure thing. During my PhD, as I said, I didn't plan to leave academia. I was enjoying stem cell research. There was a lot of cool stuff going on in that field at the time. We had the dawn of iPS cells, which were, again, this new kind of magic that was happening. At the time, I was not familiar with academic publishing. I didn't really know how it worked. I wasn't aware of what an Elsevier was, or what a Wiley was. I knew that it was good to publish in Nature and things like this. I was just going about my work, trying to start publishing some papers.
When I first tried to publish a paper, I had a lot of videos of stem cells moving from one side of the screen to the other. I was trying to get them to move. If I showed you this video, you'd understand that the blue ones are moving faster than the red ones, so something is happening to the blue ones. You can understand that research is very visual. When I went to publish my first paper, they said, sorry, we can't accept those files, even as supporting information, because they are too big. They were about five megabytes. I was upset not because I wanted all research to be open and everybody to share all of their research outputs, which is a noble endeavor. I was upset because I had spent my whole weekend making sure these videos were in the right file format and edited to the right size and everything like this. I was upset that I could not publish these outputs. Then, the paper that I had written had to have this extra section, which was the blue cells moving at 2.6 microns per second and things like this. At the time, I was a hacky coder at best; I still am. I was doing some science blogging, because I liked that, and I decided to start making my outputs available online. I started getting feedback from other people in the science blogging community that, yes, if you are going to start publishing your other research outputs, there are a few rules you should really follow because they need to be persistent, right? You could upload those videos to YouTube. But then, you could delete them at any time. It's more difficult to cite. Ideally, they'd have a DOI so you could cite them and get some credit for them. And so, I just listened to the good advice from the community and started building my own version of a publishing platform, which turned out to be what is now Figshare. I allowed other people to upload their files as well. You could sign up, upload your files, publish them, and get a DOI.
At the time, Digital Science had just started down the road. I was in London, I was at Imperial College, and Digital Science was an incubator and investor in academic tech ideas and academic tech companies. I was at a science blogging conference and bumped into Daniel Hook, who at the time was CEO of a company called Symplectic, which is one of the Digital Science portfolio companies, and he said, you should come in and talk to Digital Science about Figshare and what you are doing. I called it Figshare because Datashare was already taken. Video Share was already taken. I was trying to think of what is the smallest unit of academia you can share. And so, fig one, fig two, fig three. But now, everybody thinks it's about the fruit, which is fine by me, because it's a memorable name. And so, that is how Figshare was born. Since then, we have scaled it up. Figshare.com, as we call it, is the free platform. Anyone can upload and publish any research output. We track citations and Altmetric scores and things like this. We sustain ourselves by building white-labeled versions of that, white-labeled repositories for hundreds of universities, funders, and publishers around the world.
Wow, that's well ahead of its time to be honest, because now, within the scholarly publishing industry, over maybe the last three to five years, the whole conversation has been about open access, and not just open access but open science. Obviously, with Plan S and the OSTP memo, all of those conversations are coming in. I guess, for lack of a better term, everyone's warming up to the fact that we have to make everything more accessible. But that's, I mean, really, ahead of its time, and perhaps a lot of it is because of your personal experience and the frustrations that you went through.
It's interesting as well because I have a lot of thoughts on this. Figshare turned 10 this year, which is a really good time, especially with lockdown. I think a lot of people were reflecting on many things.
We had the great resignation and everyone going off to become a gardener and all of these things. Ten years is a great time to look back on things and then look forward on things and see how fast things have moved along. It was actually a guy called Timo Hannay, who was the original CEO of Digital Science. We have had two, there was Timo and now there's Daniel. Timo used to run nature.com. And so, he was very on the ball with both technology and what was coming through in terms of the trends in academia. When Digital Science first took on Figshare, it was with what they call a catalyst grant, and Digital Science still have these catalyst grants. They will give you some funding for an idea but without taking any equity for it, just to see if it develops as an idea. I remember Timo saying to me, we know open data is going to be a thing in the next few years, we know it's coming down the track, but we don't know how it's going to be sustainable. We like Figshare. We like the idea of it. We think it's early to the space, but we think there's enough time to develop a model around it. It was really him being aware of what was coming down the track when it comes to funder mandates and things like this. We have the National Institutes of Health mandating that, for all research that they fund, the data has to be made open as of January 2023. In that time, we have gone from nobody really saying you have to do this to every big funder saying you have to do this, peaking with the NIH. UNESCO came out with a statement this year. And so, it was a combination of his visionary thinking on it and just my frustration: well, I want to get credit for all of my research outputs, my egotistical academic "I need to compete against my peers" mindset. It was just, on that level, right time, right place. But yes, in 2019, we saw more open access publications. More than 50% of the publications coming out that year were open access as default.
That doesn't mean you don't still have people who don't think that open access or open data is a good thing. But I think on the open data side, in particular, we have seen that COVID, for all of its awful things, was a great thing for people understanding that open data is important for moving academia forward further, faster. Drug discovery, and everything like that which has lives attached to it, is a really good way for people to understand. If everything is just open, we can build on top of the research that's gone before a lot more easily, without any gatekeeping.
This is absolutely fantastic, Mark, that all of this was thought about, implemented, and now we are discussing it, and I see that in all of our conversations that have happened over the last few months. Just to kind of talk about Figshare a little bit. As I was scrolling through the site, I noticed a lot of the research posted is accessible even to lay audiences. How much of the product is available to the public?
I am very happy that we have been able to remain true to our roots on a lot of stuff. But obviously, in making sure these things are sustainable and making sure that we can grow in the space, it has been tweaked a little bit. In the beginning, I thought, oh, it's free to make things public, and you pay to keep things private or something like that, kind of like the freemium GitHub model. But then, why wouldn't you just use GitHub? They have 86 million users. There are others. Why wouldn't you use Dropbox? I use Dropbox. And so, we have two concepts. There is what I think of as the free figshare.com, as we call it, where anyone can come along, upload some files, add some metadata, and publish their content. That has some rules around it, which we dictate. It can only be licensed under two Creative Commons licenses, CC BY or CC0, which means it's open by default.
Then, as I say, we build versions of this software for anyone. The Department of Homeland Security in North America has a repository for publishing coast guard data and things like this. Also, hundreds of universities. Each university has its own country rules, and then its own university rules. And so, the universities will use it for publishing all different types of content. If you have to make a dataset available with the publication, they can help you do that. If you have to publish your thesis, they can help you do that. But then universities will have rules like you can only publish the thesis so it's available on campus. If you are on the campus, you go to their website and you get full access to that dissertation. Whereas if I sat at home in London, I wouldn't have access to that content. If you had got me in 2012, I'd have said, open all the things, everything needs to be open. Now, the European Commission has a great line, which is, as open as possible, as closed as necessary, because not all content can and should be open. We shouldn't be making datasets around the locations of at-risk species available because they are at risk for a reason. It's usually poachers or something like that. We shouldn't be sharing their location. As open as possible, as closed as necessary is a great line. But the free version of Figshare, I think, is very important for equity and allowing anybody to have access to a platform to make their research available. What I will say about this is I have spent, as I said, a lot of time reflecting. I have spent a lot of time thinking about this idea of what academics want in this new Open Access world. I think it's fast publishing, good publishing, and open publishing. I'd also contend that maybe free belongs in this mix as well, because not everybody has money to pay $12,000 for a Nature Neuroscience paper.
But this idea of fast, good, open is at the crux of where a lot of the issues are, I think, with open access publishing and open science and open research. In that, academia works, you publish papers, we understand the credit system. Is it good? Yes. Because there's peer review. Is it open? Well, we are moving to that model so it's becoming more open, and I think we'll get to 95% in the next 10 years, or whatever we can get to as a max. Is it fast? No, it's not fast. You have preprints. Fast? Yes. Open? Yes. Good? Right? This is the problem with data as well. Data and preprints are very similar to each other, and so publishing data using the free figshare.com, or publishing any nontraditional research output (posters, presentations, code), is fast and open, but there is the quality issue, even down to the metadata. There's no curation. We have six million files across the Figshare infrastructure. It's very hard to have a sustainable way to curate that or to check the content even. We have rules about taking down certain bits of content. If it's advertising yourself too much or advertising something in general, we take it down. Whereas it does open up this non-peer-reviewed landscape, which is an interesting one when it comes to deciding who is the gatekeeper of what can and can't be published. We have been struggling with that a lot. But anybody can upload any content, make it openly available, and then anybody can come and consume all of that content. We have open APIs. People build cool stuff on it. People index it in certain places. That's been really fun to see.
Great. That's all the time we have today. Catch more of this episode in the next part with your host, Nikesh Gosalia. See you soon.