JuFo ranking: A Worse Impact Factor?

JuFo (by DALL·E)

Changes to Finnish PhD education
All Finnish universities are lowering the requirements for a doctoral thesis. Instead of 3 to 4 publications, you can now defend your thesis with 2 to 3. One argument favoring this change was to create a level playing field with other countries. None of the big European countries requires publications to get a PhD: neither Great Britain, Germany, France, nor Italy does. The other Scandinavian countries are the only other European countries that have (or have had) similar requirements with respect to the number of publications, with Sweden being almost on par with the Finnish system.

If international harmonization was the goal, the change should have been more radical, i.e., getting rid of any publication requirement. However, the new rules just ensure that we lose in quality what we gain in quantity. I prefer fewer but better-educated PhDs, but our current government apparently just wants to bump up the number of PhDs that Finland produces.

 
Publish or perish
I fear that the push towards a shorter PhD might decrease the quality of the publications. PhD students are now under pressure to get three papers out in 3 to 4 years. As soon as the "smallest publishable unit" has been produced, it is pushed out into the world. Few research groups have the luxury of postponing the publication of research results for long because the publish-or-perish culture is still flourishing. While Finland has abandoned the impact factor (good), we have replaced it with a less transparent system (bad). The metric currently used in Finland is the JuFo system, short for Julkaisufoorumi ("Publication Forum"), in which journals are ranked into classes 0 to 3 based on "expert consensus opinion."

Since everybody (including experts) has their preferred go-to journals, this opens the door to bias and even manipulation. There are plenty of examples of highly respected journals with impact factors above 10 and astronomically high rejection rates that were classified as JuFo 1 for a long time, until - after a rotation of members in the respective JuFo panel - they were finally bumped to level 2. Why do JuFo rankings exist in the first place? One argument is that the Journal Impact Factor (JIF) does not account for the differences in publication and citation cultures between research fields. These differences are huge, but after classifying all journals into categories (e.g., clinical medicine versus languages versus agriculture), one could develop a compensation formula (perhaps even including manually assigned compensation factors) to adjust the JIF for such inter-discipline differences. A software script could do all the work of the JuFo panels automatically and transparently. Bibliometrics researchers have already proposed such better alternatives to the JIF. Please drop me an email if you think that I am missing something here!
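To make this concrete, here is a minimal sketch of such a script, assuming each journal carries a field label and a JIF. The journal names, numbers, and the manual compensation factor are hypothetical, and dividing by the field mean is just one of several normalization schemes discussed in the bibliometrics literature:

```python
# Minimal sketch: field-normalized journal scores (all data hypothetical).
from collections import defaultdict

journals = [
    # (name, field, JIF)
    ("Journal A", "clinical medicine", 8.2),
    ("Journal B", "clinical medicine", 3.1),
    ("Journal C", "languages", 1.4),
    ("Journal D", "languages", 0.6),
    ("Journal E", "agriculture", 2.9),
]

# 1. Compute the mean JIF per field to capture citation-culture differences.
by_field = defaultdict(list)
for _, field, jif in journals:
    by_field[field].append(jif)
field_mean = {f: sum(v) / len(v) for f, v in by_field.items()}

# 2. Manually assigned compensation factors (hypothetical) for fields whose
#    citation culture the automatic normalization still misrepresents.
compensation = {"languages": 1.2}

# 3. Adjusted score: JIF relative to its field, times the compensation factor.
for name, field, jif in journals:
    score = jif / field_mean[field] * compensation.get(field, 1.0)
    print(f"{name}: {score:.2f}")
```

The point is not this particular formula but that every step would be auditable: anyone could re-run the script and see exactly why a journal received its score, which is the transparency the panel process lacks.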

 
Anecdotes versus data
Having a system that allows for "manual" downgrading of journals is certainly a good thing. I also endorse JuFo's attempt to solicit individual researchers' feedback. However, when you count the number of journals JuFo needs to evaluate, you will see that gathering feedback from 180 individual scientists and then downgrading 60 journals based on this feedback - which is what happened a few years back - has issues. Even if we assume that ALL feedback was negative and ALL negative feedback resulted in a downgrade, that averages out to just three reports per downgraded journal. In the parlance of evidence-based medicine, some of these cases might be anecdotes.

 
Don't get me wrong: These decisions are probably not wrong, and neither is it wrong to gather feedback from researchers. In fact, my own experience with one of the downgraded journals (International Journal of Molecular Sciences) is pretty much indicative of predatory publishing. However, the data underlying these decisions is not the best, and it would be surprising if every single one of these decisions held up to scrutiny. It is also difficult to imagine how such feedback could avoid bias: it is self-reported, and researchers might be incentivized to report negative experiences.

 
Grading on a curve
Another little-known fact is that JuFo is a zero-sum game. Journals cannot be freely distributed over the four categories; instead, JuFo uses a variation of "grading on a curve": No more than 10% of the articles published in the journals of a specific JuFo panel (e.g., Chemical sciences) are allowed to fall into JuFo category 3. Counting articles rather than journals accounts for differences in publication volume between fields. However, if we - the scientific community - improve the quality of our publication output across the board, this will not be reflected in the JuFo ranking. That is, in my opinion, wrong (for the same pedagogical reasons that schools are discouraged from grading pupils on a curve).
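To see why this is zero-sum, here is a minimal sketch of one way such a volume-capped quota could work under the 10% rule mentioned above. The journal names, article counts, and merit scores are hypothetical, and the real JuFo procedure is of course more involved:

```python
# Minimal sketch: promoting journals to level 3 under a 10% article-volume cap
# (all journal data hypothetical).

journals = [
    # (name, articles_per_year, panel_merit_score)
    ("Journal A", 400, 9.1),
    ("Journal B", 1200, 8.7),
    ("Journal C", 300, 8.5),
    ("Journal D", 2500, 7.9),
    ("Journal E", 800, 6.2),
]

total = sum(articles for _, articles, _ in journals)
cap = 0.10 * total  # level 3 may hold at most 10% of the panel's article volume

# Promote journals in order of merit until the cap would be exceeded.
level3, used = [], 0
for name, articles, merit in sorted(journals, key=lambda j: -j[2]):
    if used + articles <= cap:
        level3.append(name)
        used += articles

print(f"Level 3 holds {used:.0f} of at most {cap:.0f} articles: {level3}")
```

Note that only the relative order of the merit scores matters here: if every journal's quality doubled tomorrow, the level 3 list would stay exactly the same.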

 
Decisions behind closed doors?
The second big issue is transparency. The fact that I have to speculate above about the numbers that led to the downgrades is worrisome:

  • How much negative feedback results in a downgrade?
  • What exactly counts as negative feedback?
  • Is all negative feedback treated equally?
  • Do we trust the feedback, or is evidence of misconduct required? And how much evidence?
  • Is the feedback anonymous or not?
  • Does JuFo communicate with the journal/publisher to allow them to respond to accusations of misconduct?

I cannot find satisfactory answers to these questions. As long as these decisions are made behind closed doors and without oversight, we just exchange old problems for new ones. Goodhart's law applies not only to the Impact Factor but also to JuFo: "When a measure becomes a target, it ceases to be a good measure." JuFo has become a target because our Ministry of Education uses it to distribute money.

 
Protected by irrelevance
Confirmed cases of editorial misconduct have been seen in journals from all JuFo classes. Anecdotes are even more abundant. But the threshold of evidence ought to be higher than just anecdotes. JuFo enjoys some protection from the fact that it is relevant to less than 0.1% of the world's population (about 5.5 million people live in Finland). This protection by size is, at the same time, JuFo's weakness: Does it make sense to spend that many resources on a classification system used by only 5.5 million people? If every population of that size ran its own system, the world would need more than 1,000 such classification systems to offer protection by size. The only country I know of that maintains a similar manual "journal ranking" is Norway (https://kanalregister.hkdir.no/publiseringskanaler/Om). Unsurprisingly, some journals that are ranked low in JuFo are ranked high in the Norwegian Register and vice versa (e.g., Transactions of the Association for Computational Linguistics is in the highest category in the Norwegian Register but only in JuFo 1; you can find many more examples by comparing the databases).

Currently, most publishers won't bother to specifically game JuFo, but with the rise of AI, that could change rather soon. And there are dangers that do not require AI: What if a handful of Finnish researchers teamed up to downgrade a journal they don't like? I wouldn't be surprised to learn that this has already happened. The lack of transparency does not instill confidence.

 
Disclosure: I have applied twice to join the JuFo expert groups to gain insight into JuFo's inner workings, but I have never been selected. That raises the next question: How does JuFo ensure a balanced composition of its expert panels, and how are their members selected? The list of panels and their members is public. It is clear without much research that they are not a representative selection of the researchers working in Finland, because foreign names are conspicuously underrepresented. This was even more pronounced in previous years; JuFo is aware of the issue and is trying to correct it.