SciVal’s Field-Weighted Citation Impact: Sample size matters!

There’s been a buzz on social media recently about Field-Weighted Citation Impact (FWCI), particularly around the recent leak from the University of Manchester indicating that FWCI is one of the measures suggested for assessing which academics are most at risk of redundancy.

In his recent post on The Bibliomagician blog (reposted here with permission), Ian Rowlands, a Research Information & Intelligence Specialist at King’s College London and a member of the LIS-Bibliometrics committee, questions the stability of the FWCI indicator for sets of fewer than 10,000 documents. Ian invites others to use his methodology to further test his theory…

SciVal’s field-weighted citation impact (FWCI) is an article-level metric that takes the form of a simple ratio: actual citations to a given output divided by the expected rate for outputs of similar age, subject and publication type.  FWCI has the dual merits of simplicity and ease of interpretation: a value of 2 indicates that an output has achieved twice the expected impact relative to the world literature.  It is a really useful addition to the benchmarking toolkit.
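To make the ratio concrete, here is a minimal Python sketch. The function names `fwci` and `mean_fwci` are my own labels for illustration, not anything SciVal exposes, and the expected citation rate is simply passed in as a number rather than derived from a real benchmark.

```python
def fwci(actual_citations, expected_citations):
    """Article-level ratio: actual citations divided by the expected rate."""
    if expected_citations <= 0:
        raise ValueError("expected_citations must be positive")
    return actual_citations / expected_citations


def mean_fwci(fwci_values):
    """FWCI for a set of outputs, taken here as the arithmetic mean."""
    if not fwci_values:
        raise ValueError("need at least one FWCI value")
    return sum(fwci_values) / len(fwci_values)


# An output with 10 citations where 5 were expected has twice the expected impact.
example = fwci(10, 5)
```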

The trouble is that, typically, the distribution of citations to outputs is highly skewed, with most outputs achieving minimal impact at one end and a small number of extreme statistical outliers at the other.  Applying the arithmetic mean to data distributed like this, as does FWCI, is not ideal because the outliers can exert a strong leveraging effect, “inflating” the average for the whole set.  This effect is likely to be more marked the smaller the sample size.
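A toy example shows the leverage a single outlier can have. The ten FWCI values below are invented purely for illustration and do not come from the King’s data.

```python
import statistics

# Hypothetical set: nine modest outputs and one extreme outlier.
typical = [0.5] * 9
with_outlier = typical + [20.0]

mean_without = statistics.fmean(typical)        # mean of the nine modest outputs
mean_with = statistics.fmean(with_outlier)      # mean once the outlier is included
median_with = statistics.median(with_outlier)   # median barely notices the outlier

print(mean_without, mean_with, median_with)
```

In this made-up set, one paper in ten is enough to lift the mean from 0.5 to 2.45 while the median stays at 0.5. With only ten outputs, whether that paper happens to be in the set decides the headline figure.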

I explored this effect in a simple experiment.  I downloaded SciVal FWCI values for 52,118 King’s College London papers published up until 2014.  I then calculated mean FWCI and 95% confidence (or stability) intervals for the whole sample using the bootstrapping[1] feature in SPSS.  Then I took progressively smaller random samples (99%, 98%, and so on to 1%, then 0.1%), recalculating mean FWCI and stability intervals each time.
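For readers without SPSS, the procedure can be approximated in a few lines of Python. This is a sketch under assumptions, not a port of my syntax file: it uses a plain percentile bootstrap with 1,000 resamples (SPSS settings may differ), the function names are hypothetical, and the demonstration data are synthetic log-normal values standing in for a skewed FWCI distribution.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=1000, confidence=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement, recording the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    alpha = (1 - confidence) / 2
    lower = means[int(alpha * n_resamples)]
    upper = means[min(n_resamples - 1, int((1 - alpha) * n_resamples))]
    return statistics.fmean(values), lower, upper


def intervals_by_sample_size(values, fractions, seed=0):
    """Draw a random subsample for each fraction and bootstrap its mean."""
    rng = random.Random(seed)
    results = {}
    for fraction in fractions:
        k = max(2, round(len(values) * fraction))
        sample = rng.sample(values, k)  # sampling without replacement
        mean, lower, upper = bootstrap_mean_ci(sample, seed=seed)
        results[fraction] = (k, mean, lower, upper)
    return results


# Synthetic, skewed stand-in data (an assumption, not real FWCI values).
_rng = random.Random(42)
synthetic_fwci = [_rng.lognormvariate(0, 1.5) for _ in range(2000)]

results = intervals_by_sample_size(synthetic_fwci, [1.0, 0.1, 0.01])
for fraction, (k, mean, lower, upper) in results.items():
    print(f"{fraction:>5.0%}  n={k:<5} mean={mean:.2f}  95% interval=({lower:.2f}, {upper:.2f})")
```

Swapping `synthetic_fwci` for a column of exported FWCI values, and extending the list of fractions, would give a rough equivalent of the experiment described above.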

The findings show how mean FWCI becomes less stable as sample size decreases. Highly cited outliers are relatively uncommon, but their chance inclusion or exclusion makes a big difference, especially as the number of outputs decreases. In this experiment, FWCI values range across four orders of magnitude, from 0.03 to 398.28.

[Figure: FWCI chart]

What does this mean for interpreting FWCI, especially when benchmarking? The table below offers some guidance.  It shows typical stability intervals around FWCI at different scales.  The final column assumes that SciVal spits out a value of 2.20 and shows how that figure should be interpreted in terms of its stability.

[Table: FWCI stability intervals at different scales]

It’s pretty clear from this analysis that you need to know when to stop drilling down in SciVal! Another implication is that there is no sensible justification for quoting FWCI to two, let alone three, decimal places of precision. I’ve kept the second decimal place above simply for purposes of demonstration.

I am well aware that the guidance above is based on data from just one institution, and may not travel well. If you would like to replicate this experiment using your own data, I’m happy to share my SPSS Syntax file.  It automates the whole thing, so you just have to load and go off on a short holiday! Just drop me an email.

Ian Rowlands is a Research Information & Intelligence Specialist at King’s College London and a member of the LIS-Bibliometrics committee.

ian.rowlands@kcl.ac.uk