In February 2023 the Crossref blog published a post, “The more the merrier or how more registered grants means more relationships with outputs”, highlighting the progress made by the European Union, as a research funder, in issuing Crossref DOI-based grant IDs. DOI-based grant IDs have been minted for thousands of H2020 and Horizon Europe projects. We talked to Pavel Zbornik, former lead for this initiative at the European Commission, to learn more about this grant ID-issuing effort, conducted in collaboration with the EU Publications Office.
A brief glossary is included at the bottom of the post.
Pavel Zbornik: “At present Wikidata ID is the most suitable ID for organisations in terms of coverage” (interview by Pablo de Castro)
[with additional input by Stephane Ndong at the EC RTD]
What led the EC to take this step? What main use cases do you see for grant IDs?
There was an internal discussion on the use of grant IDs which resulted in an agreement that the EC should issue grant IDs for Horizon Europe. The overall purpose for the Commission is to contribute to the general development of PIDs in a mid- and long-term timescale. The idea underpinning this effort is that the availability of PIDs will become an important factor in the steering and evaluation of R&I Policies.
One major use case for grant IDs is linked to the evaluation and monitoring of the impact of the framework programme. The expectation is that better links between grants and related publications – plus other research outputs such as patents, prototypes, software components, etc – will simplify the analysis when the link is declared via a PID instead of as free-form text.
Are you aware of any duplicate 6-digit grant numbers for EC-funded projects with grant numbers issued by other funders? Did this play any role in the decision to switch to 9-digit grant numbers?
The switch from 6-digit to 9-digit numbers was rather technical in nature: the eGrants system issues identifiers for submitted proposals, and the same ID is then used for the resulting grants. Given that the eGrants system has been applied to all centrally-managed programmes in the EC (40+ programmes), the remaining pool of available 6-digit IDs was not sufficient, and a switch to 9-digit IDs was subsequently made.
How many grant IDs have been minted so far? Have all Framework Programmes (FPs) been covered already?
Almost 36k for Horizon 2020 – which should already be fully covered – and Horizon Europe, for which grant IDs are being minted as new grants are signed. Discussions were also held on the possibility of issuing grant IDs for the FP7 programme too, but to my knowledge no final decision on this has yet been made.
It is relatively easy to follow the statistics of the minted grant IDs via the Crossref API: the query https://api.crossref.org/types/grant/works?rows=0&query.publisher-name=European%20Union&facet=funder-name:* returns a summary count of grant IDs issued by a specific funder. In the case of the EC, these are broken down by pillars for each Framework Programme.
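As a minimal sketch of how that query could be scripted, the Python below rebuilds the same URL and sorts the per-funder counts. The function names are illustrative, and the response layout it reads (message, facets, funder-name, values) is an assumption about how the API returns facet counts, so it is worth checking against a live response before relying on it.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

API_ROOT = "https://api.crossref.org/types/grant/works"


def build_url(publisher_name: str) -> str:
    """Rebuild the facet query quoted in the interview for a given publisher name."""
    params = {
        "rows": 0,  # no individual records, only the summary
        "query.publisher-name": publisher_name,
        "facet": "funder-name:*",  # count grants per funder name
    }
    # Keep ':' and '*' unescaped so the URL matches the one quoted in the text.
    return f"{API_ROOT}?{urlencode(params, quote_via=quote, safe=':*')}"


def funder_counts(payload: dict) -> list[tuple[str, int]]:
    """Return (funder name, grant count) pairs, largest count first.

    Assumes the facet counts sit under message -> facets -> funder-name -> values.
    Returns an empty list if that path is missing.
    """
    values = (
        payload.get("message", {})
        .get("facets", {})
        .get("funder-name", {})
        .get("values", {})
    )
    return sorted(values.items(), key=lambda item: item[1], reverse=True)


def fetch_counts(publisher_name: str = "European Union") -> list[tuple[str, int]]:
    """Call the live API (network access required) and summarise the result."""
    with urlopen(build_url(publisher_name), timeout=30) as response:
        return funder_counts(json.load(response))
```

Calling `fetch_counts()` and printing each pair would give one line per funder name, which in the EC case should correspond to the pillars mentioned above.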
Did you have splash pages in CORDIS for all EC-funded projects, including those that were part of the earliest FPs?
Yes, all projects dating back to FP1 have their own page on CORDIS; this is a principal function of the CORDIS website.
How complex was the technical workflow for issuing these grant IDs? Could you provide an idea of the time the process took since the early conversations started?
Complexity is a matter of perspective and technical maturity. In principle you have two options: a more technical one via a call to the Crossref web service (submitting XML via a REST API), and a simpler one involving the Crossref registration form, which generates the XML in the background. More information is available at https://www.crossref.org/documentation/research-nexus/grants/.
I would say the most complex challenge in this process is not technical but operational, especially to map the metadata needed by the grant ID to the internally-held data about the grants. The schema reference is available at https://gitlab.com/crossref/schema/-/blob/master/schemas/grant_id0.0.1.xsd. Understanding the different award types, finding the funder identifiers to which each grant will be linked and organising the metadata registration are all complex tasks.
To make our case a bit more special, the Publications Office of the European Union (OP) is also a DOI registration agency, so in our case the OP minted the DOIs first and then registered them with Crossref.
From a timeline perspective, the process took several months. Most of the effort was related to the internal discussions and to discussions with Crossref – as a change in the metadata was required in order to enable registration of non-personal grant IDs (grants not linked to the Principal Investigator (PI) but to an organisation). A change in the grant ID metadata is going to be needed in the future, as it is still rather centred on PI grant types and it is not possible to encode beneficiary organisations without a PI.
Did the collaboration with Crossref make things simpler? To what extent were you able to rely on a cross-funder technical collaboration network?
I can’t comment much on this as I was not involved in a direct discussion with Crossref. These discussions – and the actual implementation – were carried out by the Publications Office of the European Union (OP).
What was the role of the Publications Office of the EU in the process?
The role of the OP was crucial in the process. They maintain the public presence of EU data about funded grants via CORDIS and they are the DOI registration agency for the EU, so their technical and subject expertise was very valuable.
Together with their contractor mEDRA, the OP developed all the technical provisions needed to issue the grant IDs, and it continues to issue them.
According to the recent (Feb 2023) Crossref blog post (which may already be outdated), the European Union tops the list of funders by percentage of grants with a persistent ID. Do you think your pioneering work could provide a best-practice case study for other funders, both in and outside the Crossref Funder Advisory Group?
Given the number of grants funded by H2020 and Horizon Europe, when we started minting grant IDs for H2020 we estimated that the total number of existing grant IDs would double. This would make the EU one of the main funders issuing grant IDs.
I would personally hope that other funders take this work as an inspiration for their own PID strategy.
Could you elaborate on the possible use of PICs as persistent identifiers for organisations holding EC-funded projects? Do you think PICs might suit your purposes better than RORs or Ringgold IDs?
PIC started in FP7, and it quickly became the de facto standard identifier for organisations in eGrants for all centrally managed grants. Given that it is also used by other non-R&I-related programmes, I would not compare it to R&I organisation PIDs like ROR or Ringgold.
The main reason would be coverage: ROR has a bit over 100k organisations and the public list of Ringgold IDs has about 500k. Both have worldwide coverage, but the PIC registry contains more than 600k organisations. When a matching exercise was performed between PIC and ROR, only about 30k PIC-ROR links were found. Although the matching could certainly be improved, the coverage is not very high. In this particular matching exercise, we found that Wikidata ID is currently the best-suited ID for organisations in terms of coverage, and it also contains links to both ROR and Ringgold where available.
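To put those figures side by side, here is a rough back-of-the-envelope calculation in Python. It uses only the rounded numbers quoted above, which are approximations from the interview rather than exact registry counts, and the variable and function names are purely illustrative.

```python
# Rounded registry sizes quoted in the interview (approximate, not exact counts).
REGISTRY_SIZES = {"ROR": 100_000, "Ringgold": 500_000, "PIC": 600_000}

# Approximate number of PIC-ROR links found in the matching exercise.
PIC_ROR_LINKS = 30_000


def share(matched: int, total: int) -> float:
    """Fraction of a registry covered by the matched links."""
    return matched / total


# Share of each registry covered by the PIC-ROR links.
pic_share = share(PIC_ROR_LINKS, REGISTRY_SIZES["PIC"])
ror_share = share(PIC_ROR_LINKS, REGISTRY_SIZES["ROR"])

print(f"PIC entries with a ROR match: {pic_share:.0%}")
print(f"ROR entries with a PIC match: {ror_share:.0%}")
```

With these rounded inputs the links cover about 5% of the PIC registry and about 30% of ROR, which illustrates why the answer describes the coverage as not very high.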
Have grant holders been told about this development already?
Communication with the beneficiaries about grant IDs still has some room for improvement. For the moment it has mostly been a silent release, making the machinery work and then starting to raise awareness.
Have the workflows for including these new grant IDs in manuscript submission systems already been discussed with publishers – even preliminarily – or is this rather the role of Crossref?
Not to my knowledge. One could hope that, since Crossref is in most cases the DOI issuer, it would take on the role of linking grant IDs in the publication DOI metadata.
What would be your advice on issuing grant IDs for funders with little technical resources at hand?
Use the registration form from Crossref, which doesn’t need major technical resources.
What linked entities to grant IDs within the PID Graph would you expect to eventually provide more value? Publications, datasets, patents, research equipment and facilities?
This is a rather hard question. Given that PIDs linked to grant IDs are mainly DOIs for publications and datasets at the moment, I would expect the highest value to come from publications. This is also one of the main drivers for the initiative: to be able to see the publications linked to the grants. Having this information for patents too would be very valuable, although I remain quite sceptical about when this link could eventually become usable for analytical work.
PIDs of all kinds are particularly valuable for the European Open Science Cloud. Has the process for minting grant IDs involved any sort of collaboration with the EOSC Association?
Not with the EOSC Association directly, but the RTD Unit responsible for Open Science and the EOSC has been involved in the internal process and discussions.
Are you familiar with RAiDs (Research Activity IDs)? This comprehensive exercise for issuing grant IDs for EC-funded projects could become a key building block for the soon-to-be-minted RAiDs. Do you have any thoughts on this “building the plane as we fly it” character of the current PID landscape, where everything is being built at the same time, including bits that may be needed for other PIDs to build on?
I do know RAiDs exist, although I’ve never seen them applied and, to my limited knowledge, they are mainly used in Australia. I have also heard of some possible use in the UK by UKRI, but I’m not sure how far it went.
For the second part of the question, I would say that if grant IDs achieve major adoption, the fact that they contain links to other PIDs such as Funder ID, ORCID or ROR would facilitate the use of these PIDs in other contexts, in a sort of network effect.
CORDIS: CORDIS stands for Community Research and Development Information Service and is the database for EU-funded projects and their results since 1990, https://cordis.europa.eu/en. Project factsheets held in CORDIS typically had their 6-digit grant number (later expanded to 9-digit codes). Now they also include the DOI-based grant IDs (for H2020 and Horizon Europe projects for the time being).
EC RTD: Directorate-General for Research and Innovation of the European Commission, https://en.wikipedia.org/wiki/Directorate-General_for_Research_and_Innovation.
Grant ID: DOI-based persistent identifier for grants issued by a specific funder. These DOIs are usually issued in collaboration with Crossref, whose Funder Registry provides the so-called DOI prefix.
Grant number: Traditional system used by funders to identify the grants they award. Grant numbers are typically combinations of text and numbers specific to each funder and they are usually not resolvable to a grant page, meaning they are technically not persistent identifiers. Grant numbers are at risk of being duplicated across funders (see an example below). Grant IDs effectively remove this risk.
Participant Identification Code (PIC): 9-digit number serving as a unique identifier for organisations (legal entities) participating in EU funding programmes/procurements.
- “At present Wikidata ID is the most suitable ID for organisations in terms of coverage” - 22 May 2023