The CERIFy Data Surgery brought together those involved in business intelligence, marketing, research information management and analysis from the four partner institutions (Queen’s University Belfast, Aberystwyth University, University of Bath, and University of Huddersfield) and the project's commercial partner, Thomson Reuters. The event was designed to follow a series of site visits and stakeholder interviews, in which the project team collected information about the partner institutions' current business processes associated with research information management and their key strategic objectives. The results of these site visits were formalised into generic "as is" and "to be" processes, which the group was asked to analyse and evaluate as part of the workshop activity. The four priority areas identified for examination were:
- InCites Exchange
- Measures of esteem
- Pre-Awards Management
- Benchmarking
"The project is pretty agnostic about CERIF ... the idea is to get an honest picture of what the issues are." - Mahendra Mahey, UKOLN, Project Manager
The Business Case for CERIF
In this video interview, Josh Brown from JISC explains the history and context of the CERIFy project within JISC's Research Management funding strand and argues the business case for greater engagement with CERIF...
In addressing the group, Brown noted that the aim behind CERIFy is to encourage a community of practice. Research reporting is emerging as a key issue for institutions, so the project is designed to prove the efficiency savings and show the benefits of more integrated research information sharing through the partner institutions. He also emphasised that the work of the CERIFy project is a key resource for the development of a national infrastructure and will be affecting policy by feeding into the RIM group, informing their next round of funding.
Overview of User Requirements from Site Visits
"Users should be at the heart of any good system: this is the philosophy of the CERIFy project." - Stephanie Taylor, UKOLN
Prior to the Data Surgery workshop, the CERIFy team conducted site visits to each of the partner institutions, where they spoke to all of “the usual suspects” working in research information management, and anyone else who had some kind of interest in research information, to get an idea of what they wanted out of their systems. In the first of a series of presentations, Stephanie Taylor provided a snapshot and summary of the user requirements identified during these site visits to provide a context for the work of the Data Surgery...
Available on Slideshare.
Introduction to the CERIF Standard
Rosemary Russell from UKOLN presented an overview of the CERIF standard, beginning with the origin of CERIF as an EU recommendation to member states, now maintained by a task group within euroCRIS. CERIF enables institutions to collect research information in granular detail. It avoids the current necessity for institutions to enter data multiple times into different systems by allowing them to enter standardised data once and use it multiple times in different contexts. CERIF also enables you to create relationships between information, and interfaces with institutional repositories. Beyond internal efficiencies, CERIF also has wider implications. HEFCE has announced that the REF will have a CERIF-compatible import/export interface option. Russell observed that other research councils are also looking seriously at the standard. In support of Josh Brown's earlier comments, she pointed us to the infoNet savings calculation to illustrate the business case for a UK standard. Russell went on to take us through the CERIF conceptual structure, which is based around the core entities: "person", "project", and "organisation". She admitted that CERIF is big, so most people only implement a small subset of the full model. However, the huge number of options makes it very flexible. The task group is very responsive to the things happening within the community, so the standard is constantly being extended, but there are plans for some contractions, including removing Dublin Core fields.
Project Perspective
Mahendra Mahey, UKOLN
In this short video interview, Mahendra Mahey explains what he sees as the benefits of CERIF and the CERIFy project...
CERIF Mapping and the Health Check
Talat Chaudhri, UKOLN
The site visits made it obvious that institutions have a large array of data sources, many of which are just spreadsheets, with more coming out of the woodwork the more you look. For the data mapping element of the CERIFy project, the team need to know what all those data sources are, establish the entities and the field names, then map to CERIF. Chaudhri took the group through examples of person, organisation and publication records, demonstrating how the CERIF fields correspond to the fields institutions may already be using, to illustrate the process that will be used to create CERIF health checks for each of the partner institutions. He emphasised that the aim is to establish how well existing data maps to CERIF to see if it is a good fit for research information management data. However, it is not just about the individual fields, the relationships between entities and how they map to specific business processes, but also about how well the entities in the CERIF model fit with the sorts of business processes the partner institutions use, to see if the fundamental structure of CERIF is right.
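As a rough illustration of the mapping step described here, the sketch below takes the field names found in one local data source, looks each one up in a mapping table, and reports how much of the source has a CERIF-side target. It is only a sketch under stated assumptions: the local field names, the target names such as `person.familyNames`, and the `health_check` helper are all hypothetical placeholders, not an extract from the CERIF standard or from the project's actual health check.

```python
from dataclasses import dataclass


@dataclass
class HealthCheck:
    """Summary of how well one local data source maps to the target model."""
    source: str
    mapped: dict[str, str]   # local field name -> target field name
    unmapped: list[str]      # local fields with no target found

    @property
    def coverage(self) -> float:
        total = len(self.mapped) + len(self.unmapped)
        return len(self.mapped) / total if total else 0.0


# Hypothetical mapping table. The right-hand names are illustrative
# placeholders, NOT verified CERIF attribute names.
FIELD_MAP = {
    "surname": "person.familyNames",
    "forename": "person.firstNames",
    "staff_no": "person.identifier",
    "department": "organisation.name",
    "paper_title": "publication.title",
}


def health_check(source: str, local_fields: list[str]) -> HealthCheck:
    """Split a source's fields into those with a mapping and those without."""
    mapped = {f: FIELD_MAP[f] for f in local_fields if f in FIELD_MAP}
    unmapped = [f for f in local_fields if f not in FIELD_MAP]
    return HealthCheck(source, mapped, unmapped)


# Example: a made-up HR spreadsheet with one field that has no mapping.
report = health_check(
    "hr_spreadsheet", ["surname", "forename", "staff_no", "office_room"]
)
print(f"{report.source}: {report.coverage:.0%} mapped, unmapped={report.unmapped}")
# prints: hr_spreadsheet: 75% mapped, unmapped=['office_room']
```

A field-level count like this only covers the first half of the question raised in the session. Whether the entities themselves fit the institutions' business processes still needs the kind of discussion the workshop was set up for.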
"We are trying to link CERIF back to reality." - Niamh Brennan, Trinity College Dublin
InCites Exchange
Niamh Brennan, Trinity College Dublin
InCites is a way of accessing Web of Science data through an online interface, through which you can run reports, pull information into your local system, create custom reports, and make comparisons between institutions or groups of institutions. Brennan ran a mock report comparing two institutions to show the type of output produced, discussing the amount of time and institutional overhead involved in getting this information manually. She also outlined how Irish universities, which have a national subscription to the service, exert influence on Thomson Reuters to make the most out of the product. Brennan emphasised that the CERIFy project aims to help institutions make the most out of existing subscriptions to commercial services, rather than promoting Thomson Reuters through this partnership. CERIFy are in talks with other commercial providers and are keen to involve as many as possible, particularly at the data exchange stage. She noted that Thomson Reuters started looking at CERIF as a result of being approached by the project, so CERIFy has already effected positive change towards an environment where commercial products can be used interchangeably.
"As Is"
Queen's University are already using the InCites product, so representatives Gavin Mitchell and Ricky Rankin were able to explain some of the practical issues they have encountered and ways they are using the system internally. This includes using the service to compare expected numbers of citations with actual citations per research cluster, and receiving a database download that they can analyse and map to suit their needs, rather than being restricted to the online tools within InCites. There was much interest from the group about ways of resolving citation issues when academics use different naming conventions, and the level of coverage offered by the Web of Science data. There was also an uneasiness about giving a commercial organisation the “crown jewels”. Mitchell and Rankin agreed that improving the quality of their data within InCites gives them more accurate reporting on their own activities, but there is a risk that other institutions may not provide accurate data, so any comparisons may not be completely realistic.
Partner Perspective
Ricky Rankin, Queen's University, Belfast
In this short video interview, Ricky Rankin explains how he came to the CERIF standard...
Site Visit Evidence
Stephanie Taylor summed up much of this discussion when she provided the following overview of the site visit responses about this issue:
Available on Slideshare.
"To Be"
Niamh Brennan discussed the generic InCites "to be" process, much of which focussed on the benefits and potential pitfalls of using an employee ID so that citation information can also be reported together with financial information, student numbers and other data. Participants were also asked to look at a first draft, produced by Thomson Reuters, of a mapping from InCites field names to CERIF fields. The aim of this exercise was to expose participants to the CERIF fields and identify how they describe these locally to see how much of this information they can already provide.
Partner Perspective
Patricia Brennan, Thomson Reuters
Patricia Brennan explains why Thomson Reuters were keen to get involved with the CERIFy project and what they hope to bring to the table in this short video interview...
During the event, Patricia also provided the group with an overview of Thomson Reuters' suite of scientific and scholarly research business tools, and was on hand throughout to discuss specific queries regarding InCites from project partners, including issues relating to author disambiguation and institution address consolidation.
Measures of Esteem
Talat Chaudhri, UKOLN
The nature of academia, with its wide range of different disciplines and different institutional alignments, means that measures of esteem are varied and difficult to formalise. Chaudhri produced some general categories for esteem, based on his research, which included:
- Memberships;
- Awards, prizes;
- Publications and outputs, including named lectures and major research projects;
- Significant funding grants;
- Commercial collaborations or consultancy.
He drew attention to the MICE (Measuring Impact Under CERIF) project, which is currently working to quantify measures of impact (which will include measures of esteem). Josh Brown elaborated on this by noting that they are struggling to find a 1:1 mapping between indicators and measures, demonstrating that this is an extremely difficult area of work.
"As Is"
Partner institutions shared details about their processes for collecting data about esteem and impact within their institutions. These ranged from paper-based systems requesting free-form information from researchers manually each year, through to more structured quantitative surveys. There were advancements towards pre-population with known data, such as numbers of PhD students, and using pattern matching to identify words known to be associated with esteem from staff webpages. Brennan observed that all of these systems were collecting similar types of information, but there is still a lot of effort involved to extract most of it, particularly in the face of what she termed "British reticence". The group agreed it would be interesting to see if more of this information could be collected from other sources.
Site Visit Evidence
The site visit interviews provided the following perspectives on these issues:
Available on Slideshare.
In presenting these results, Stephanie Taylor emphasised that the most frequent description of this process was "woolly", highlighting the difficulty in providing meaningful data.
"To Be"
The group discussed additions to the generic aspirational process, including innovative suggestions involving Google Alerts to identify positive news stories, which could then be tagged by the PR department so that information can be gathered automatically. The group also discussed the Australian measures of esteem in more detail. It was felt that an adaptation of this model, with the accompanying criteria, would be a useful system to standardise esteem measurement. The practical work of this session focussed on identifying the data fields currently used to collect information about esteem and discussing the nuances of this data. For example, the group drilled down into "Editorship of a Journal" to identify the data fields they may need to measure the associated esteem more accurately. These included:
- Title of journal
- Type of journal
- Impact factor of journal
- Journal ranking
- Field classification
- Peer-reviewed or not
- Start and end date of editorship
- Place of publication
- Career stage of the academic
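The fields listed above could be held in a single structured record. The following is a minimal sketch, assuming a Python dataclass is an acceptable stand-in for whatever system an institution actually uses. The class name, attribute names, choice of optional fields and the example values are all hypothetical and were not agreed by the group.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class JournalEditorship:
    """One 'Editorship of a Journal' esteem record (hypothetical layout)."""
    journal_title: str
    journal_type: str
    field_classification: str
    peer_reviewed: bool
    start_date: date
    career_stage: str                        # career stage of the academic
    end_date: Optional[date] = None          # None while the editorship is ongoing
    impact_factor: Optional[float] = None
    journal_ranking: Optional[str] = None
    place_of_publication: Optional[str] = None

    def is_current(self, on: date) -> bool:
        """True if the editorship covers the given date."""
        return self.start_date <= on and (self.end_date is None or on <= self.end_date)


# Invented example values, for illustration only.
record = JournalEditorship(
    journal_title="Example Journal of Research Information",
    journal_type="academic journal",
    field_classification="information science",
    peer_reviewed=True,
    start_date=date(2009, 1, 1),
    career_stage="senior lecturer",
)
print(record.is_current(date(2011, 6, 1)))
```

Keeping the start and end dates as separate fields, rather than a free-text period, is what makes a question such as "who currently holds an editorship?" answerable without manual checking.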
There was also discussion about the fact that esteem data is often stored in different systems throughout the institution, concluding that whilst the aim of the RIM system must be to integrate and reduce duplication of data entry, it will not necessarily reduce duplication of data storage.
Project Perspective
Niamh Brennan, Trinity College Dublin
In this short video interview, Niamh Brennan outlines her hopes for reduced duplication of data and improved efficiency that she believes CERIF can bring to research information management...
Pre-Awards Management
There were a lot of variations between partner institutions' approaches to this process, though there were some common processes too, and many had never had the need to drill down into it before. However, pro-vice chancellors at all of the partner institutions were interested in getting more business intelligence out of this process. They required information not only about the successful proposals, but also about the rejected proposals.
"As Is"
The group discussed activity involved with the bidding process in terms of streams corresponding to activities undertaken by the academic, the funder and internal systems. This was guided by core processes used by University of Bath, but fleshed out with variations and additional steps from other institutions. This discussion naturally extended to desired improvements – particularly how to make existing systems more joined up internally. There was a strong desire to get more feedback about failed applications to help improve systems and support, so all parties were interested to hear about what, if anything, others currently have in place to collect this data.
Site Visit Evidence
The site visit interviews provided the following evidence:
Available on Slideshare.
In presenting these findings, Stephanie Taylor emphasised that the overall aim was to obtain transparent awards information throughout the whole life cycle of the project from application to completion. However, she noted that this type of reporting had never been required from these business processes before.
"To Be"
The group analysed a proposed "to be" process, which placed automation and digitisation as key shifts to avoid human error and improve the availability of data. Additional suggestions included increased links between research proposals and strategic themes of the institution, increased use of funding alerts to provide data relating to the earliest stages of the bidding process, some form of numeric coding for funding councils and grants categories to avoid human error, specification of the type of grant (e.g. first grants), and records of the co-investigators involved to help institutions manage staffing commitments more effectively.
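One way to picture the numeric coding suggestion is a fixed code list that rejects anything not on it, so that a mistyped funder or grant category is caught at entry rather than discovered in a later report. The sketch below is an assumption about how that might look. The code numbers, the category names and the `parse_bid` helper are invented for illustration and do not come from the workshop or from any funder's own scheme.

```python
from enum import IntEnum


class Funder(IntEnum):
    """Hypothetical numeric codes for funding bodies (numbers are made up)."""
    EPSRC = 1
    AHRC = 2
    ESRC = 3


class GrantType(IntEnum):
    """Hypothetical numeric codes for grant categories (numbers are made up)."""
    FIRST_GRANT = 10
    STANDARD = 20


def parse_bid(funder_code: int, grant_code: int) -> tuple[Funder, GrantType]:
    """Convert raw codes to named values; unknown codes raise ValueError."""
    return Funder(funder_code), GrantType(grant_code)


funder, grant = parse_bid(1, 10)
print(funder.name, grant.name)  # prints: EPSRC FIRST_GRANT

try:
    parse_bid(99, 10)  # 99 is not a defined funder code
except ValueError as err:
    print("rejected:", err)
```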
Partner Perspective
Katy McKen, University of Bath
In this short video interview, Katy McKen discusses the University of Bath's involvement with the CERIFy project...
Benchmarking
Rosemary Russell provided an introduction to known benchmarking processes, including screen scraping data, retrieving comparative data from HESA, using RAE 2008 results and research council publications. She highlighted some of the tools available, including InCites, Research Professional and benchmarkinginHE.co.uk, which is partly funded by HEFCE. She also noted the range of statistical data sources used by these tools, and observed that many institutions are doing comparisons manually using these same sources. Niamh Brennan observed that benchmarking is a diverse area that can take in a number of subprocesses, including everything covered so far in the data surgery. Even something as process-orientated as the pre-awards management system can feed into benchmarking data.
"As Is"
Kirsty Taylor from the University of Huddersfield provided an overview of their benchmarking process, which involves retrieving data from HESA and identifying other new universities to narrow down their comparisons on key criteria. However, she emphasised that there is insufficient information publicly available to allow them to drill down beyond an institutional level. The group explored this area further by discussing their own experiences and knowledge of other projects working in the field to establish if this data might be more forthcoming. Patricia Brennan discussed some of the work Thomson Reuters have undertaken on the Global Institutional Profiles Project, associated with the Times HE ranking. A generic “as is” process was discussed, but it was observed that this is a particularly sensitive area in terms of partners sharing their current processes.
Partner Perspective
Kirsty Taylor, University of Huddersfield
In this short video interview, Kirsty Taylor discusses the University of Huddersfield's involvement with the CERIFy project...
Site Visit Evidence
The site visit interviews confirmed that current tools are not providing sufficient detail for specific disciplines, so the differences between them are not necessarily being considered. The full summary of these results included:
Available on Slideshare.
"To Be"
The group reviewed the generic "to be" process, which boiled down to wanting “a button to pull everything together” and allow institutions to drill down into the data to the level of granularity required. The group also examined Research Professional in more detail to see whether the fields supplied are sufficient or if there is anything that needs to be added.
Conclusions
The event concluded with a summary of the project plan following the Data Surgery. This will involve providing each partner institution with a health check to help inform their decisions about CRIS systems or their current status in terms of engaging with the CERIF standard, before focussing on two institutions to take in a selected area of data, map it to CERIF, and exchange data between two different systems.