The Tyranny of Academic Metrics
February 2, 2018


The quest to quantify everything undermines higher education

By Jerry Z. Muller JANUARY 21, 2018
Tim Foley, for The Chronicle Review

Copyright © 2018 The Chronicle of Higher Education
A cultural pattern has become ubiquitous in recent decades, engulfing an ever-widening range of institutions. Now it has come for the university. Call it a meme, a discourse, a paradigm, or a fashion. I call it metric fixation. It affects the way people talk about the world, and thus how they think and how they act. The key components of metric fixation are:

the belief that it is possible and desirable to replace judgment, acquired by experience and talent, with numerical indicators based upon standardized data.
the belief that making such metrics public assures that institutions are carrying out their purposes.
the belief that the best way to motivate people is by attaching rewards and penalties to their measured performance.

These assumptions have been on the march for several decades, and their assumed truth goes marching on.

The pernicious spillover effects became clear to me during my time as chair of the history department at the Catholic University of America. Such a job has many facets: mentoring and hiring; ensuring that necessary courses get taught; maintaining relations with the university administration. Those responsibilities were in addition to my roles as a faculty member: teaching, researching, and keeping up with my field. I was quite satisfied.

Then, things began to change. Like all colleges, Catholic gets evaluated every decade by an accrediting body. For my university, that body is the Middle States Commission on Higher Education. It issued a report that included demands for more metrics on which to base future “assessment” — a buzzword in higher education that usually means more measurement of performance. Soon, I found my time increasingly devoted to answering requests for more and more statistics about the activities of the department, which diverted my time from research, teaching, and mentoring faculty members. New scales for evaluating the achievements of our graduating majors added no useful insights to our previous measuring instrument: grades.

Gathering and processing all this data required the university to hire ever more specialists. Some of their reports were useful; for example, spreadsheets that showed the average grade awarded in each course. But much of the information was of no real use, and read by no one. Yet once the culture of performance-documentation caught on, department chairs found themselves in a data arms-race. I led a required yearlong departmental self-assessment — a useful exercise, as it turned out. But before sending it up the bureaucratic chain, I was urged to add more statistical appendices — because if I didn’t, the report would look less rigorous than that of other departments.

My experience left me wondering about the forces fueling this waste of time and effort. The Middle States Commission operates with a mandate from the Department of Education. Under the leadership of Margaret Spellings, the department had convened a Commission on the Future of Higher Education, which published a report in 2006 emphasizing the need for greater accountability and the gathering of more data, and directing the regional accrediting agencies to make “performance outcomes” the core of their assessment. That mandate filtered down to the Middle States Commission, and from there, ultimately, to me.

Metric fixation, which seems immune to evidence that it frequently doesn’t work, has elements of a cult. Studies that demonstrate its lack of effectiveness are either ignored or met with the claim that what is needed are more data. Metric fixation, which aspires to imitate science, resembles faith.

Not that metrics are always useless or intrinsically pernicious. They can be genuinely useful. But not everything that is important is measurable, and much that is measurable is unimportant. (Or, in the words of the familiar dictum, “Not everything that can be counted counts, and not everything that counts can be counted.”) Universities, like most organizations, have multiple purposes, and those which are measured and rewarded tend to become the focus of attention, at the expense of other essential goals. Similarly, many jobs have multiple facets, and measuring only a few of them creates incentives to neglect the rest. When universities wake up to this fact, they typically add more performance measures. That creates a cascade of data — information that becomes ever less useful — while gathering it sucks up more and more time and resources.

In the process, the nature of academic work is transformed in ways that are often harmful. Like most professionals, academics resent the imposition of goals that may conflict with their professional ethos and judgment, thus lowering morale. And they inevitably become adept at manipulating performance indicators through a variety of methods, many of which are ultimately harmful to the health of a university.

In the attempt to replace judgments of quality with standardized measurement, some rankings, government institutions, and university administrators have adopted as a standard the number of scholarly publications produced by a college’s faculty, and determined these publications using commercial databases. Here is a case where standardizing information can degrade its quality.

The first problem is that these databases are frequently unreliable: Having been designed to measure production in the natural sciences, they often provide distorted information in the humanities and social sciences. In the natural sciences and some of the behavioral sciences, new research is disseminated primarily in the form of articles in peer-reviewed journals. But that is not the case in fields such as history, in which books remain the pre-eminent form of publication, so a measurement of the number of published articles presents a distorted picture. But that is only the beginning of the problem.

When individual faculty members, or whole departments, are judged by the number of publications, whether in the form of articles or books, the incentive is to produce more publications, rather than better ones. Really important books may take many years to research and write. But if the system rewards speed and volume, the result is likely to be a decline in truly significant scholarship. That is what seems to have happened in Britain as a result of its Research Assessment Exercise: a great stream of publications that are both uninteresting and unread. Nor is the problem confined to the humanities. In the sciences as well, evaluation by measured performance favors short-term publication over long-term research capacity.

In academe, as elsewhere, that which gets measured gets gamed. Take impact factors. Once developers recognized that not all articles were of equal significance, they created techniques to measure each article’s impact. That took two forms: counting the number of times the article was cited, and considering the prestige — or impact factor — of the journal in which it was published, a factor determined in turn by the frequency with which articles in the journal are cited. (This method, mind you, cannot distinguish between the following citations: “Jerry Z. Muller’s illuminating and wide-ranging article on the tyranny of metrics effectively slaughters the sacred cows of so many organizations” and “Jerry Z. Muller’s poorly conceived screed deserves to be ignored by all managers and social scientists.” From the point of view of tabulated impact, the two statements are equivalent.)
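
The two measures described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any commercial database actually computes its figures: the `Citation` record, the function names, and the sample data are hypothetical, and the journal score here is a plain ratio of citations to articles published, whereas real impact factors are computed over specific time windows. The sketch shows that the citing sentence itself is never consulted, so praise and dismissal add the same amount to the tally.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    cited_article: str  # title of the article being cited
    journal: str        # journal in which the cited article appeared
    text: str           # the citing sentence (stored, but never read by the tally)


def citation_counts(citations):
    """Count how many times each article is cited."""
    counts = {}
    for c in citations:
        counts[c.cited_article] = counts.get(c.cited_article, 0) + 1
    return counts


def journal_scores(citations, articles_per_journal):
    """Simplified journal score: citations received divided by articles published."""
    totals = {}
    for c in citations:
        totals[c.journal] = totals.get(c.journal, 0) + 1
    return {j: totals.get(j, 0) / n for j, n in articles_per_journal.items()}


# Hypothetical sample data: one admiring citation and one dismissive one.
citations = [
    Citation("The Tyranny of Metrics", "Journal A", "illuminating and wide-ranging"),
    Citation("The Tyranny of Metrics", "Journal A", "poorly conceived screed"),
]

counts = citation_counts(citations)
scores = journal_scores(citations, {"Journal A": 4})
```

Both citations raise the article's count and the journal's score by the same amount, which is the equivalence the parenthetical above points to.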

Moreover, in an attempt to raise their citation scores, some scholars formed informal citation circles, the members of which made a point of citing one another’s work as much as possible. Some lower-ranked journals requested that authors include additional citations to articles in the journal, in an attempt to improve its “impact factor.”

What, you might ask, is the alternative to tallying up the number of publications, the times they were cited, and the reach of the journals in which articles are published? Professional judgment. In a department, evaluation of faculty productivity can be done by the chair or by a small committee of colleagues, who, consulting with other faculty members when necessary, draw upon their knowledge of what constitutes significance. In the case of major decisions, such as tenure and promotion, scholars in the candidate’s area of expertise are called upon to provide confidential evaluations, a more elaborate form of peer review.

Citation databases may be of some use in that process, but numbers also require judgment grounded in experience to evaluate their worth. That judgment is precisely what is eliminated by too great a reliance on metrics. As Carl T. Bergstrom, a biologist at the University of Washington, puts it, “All too often, ranking systems are used as a cheap and ineffective method of assessing the productivity of individual scientists. Not only does this practice lead to inaccurate assessment, it lures scientists into pursuing high rankings first and good science second. There is a better way to evaluate the importance of a paper or the research output of an individual scholar: read it.”

Among the strongholds of metrics is the Department of Education, under a succession of presidents, Republican and Democratic. During President Obama’s second term, his Department set out to develop an elaborate “Postsecondary Institution Ratings System.” It was intended to grade all colleges, to disaggregate its data by “gender, race-ethnicity and other variables,” and eventually to tie federal funds to the ratings, which were to focus on access, affordability, and outcomes, including expected earnings upon graduation. The plan ran into opposition from colleges and Congress. In the end, the Department settled on a stripped-down version, the College Scorecard, unveiled in September 2015.

It was the product of good intentions, meant to address real problems in the provision of higher education, especially the extremely spotty record of for-profit institutions offering career-oriented education in fields like automotive repair, culinary arts, or health aides, which had been expanding by leaps and bounds. But in reaction to a genuine problem at the low end of the for-profit sector, the department responded with far-reaching demands that had consequences for all colleges.

What the advocates of greater accountability metrics overlook is how the increasing cost of college is due in part to the expanding cadres of administrators, many of whom are required to comply with government mandates. Reward for measured performance in higher education is touted by its boosters as making universities “more like a business.” But businesses have a built-in restraint on devoting too much time and money to measurement — at some point, it cuts into profits. Ironically, since universities have no such bottom line, government or accrediting agencies or the university’s administrative leadership can extend metrics endlessly. The effect is to increase costs or to divert spending from the doers to the administrators — which usually suits the latter just fine. It is hard to find a university where the ratio of administrators to professors and of administrators to students has not risen astronomically in recent decades. Metric fixation contributes to the mushrooming of administrators.

In the case of the College Scorecard, some of the suggested objectives of the original plan (the Postsecondary Institution Ratings System) were mutually exclusive, while others were simply absurd. The goal of increasing college graduation rates, for example, is at odds with increasing access, since less-advantaged students tend to be not only financially poorer but also worse prepared. The better prepared the student, the more likely she is to graduate on time. It might be possible to admit more economically and academically ill-prepared students and to ensure that more of them graduate; but only at great expense, which is at odds with another goal of the Department of Education: holding down costs.

Another metric that colleges were to supply was the average earnings of students after graduation. Not only is this information expensive to gather and highly unreliable, it is downright distortive. Many of the best students will go on to one or another form of professional education, ensuring that their earnings will be low for at least the time they remain in school. Thus a graduate who proceeds immediately to become a greeter at Walmart would show a higher score than her fellow student who goes on to medical school. But there would be numbers to show, and hence “accountability.”

Even if you leave aside the accuracy and reliability of these metrics, consider the message they convey. Initiatives like the College Scorecard treat higher education in purely economic terms: Its sole concern is return on investment, understood as the relationship between the monetary costs of college and the increase in earnings that a degree will ultimately provide. Those are, of course, legitimate considerations. College costs eat up an increasing percentage of family income or require the student to take on debt; and making a living is among the most important tasks in life.

But it is not the only task in life, and it is an impoverished conception of college that regards it purely in terms of its ability to enhance earnings. If we distinguish training, which is oriented to production and survival, from education, which is oriented to making survival meaningful, then metrics are only about the former.

The sort of lifelong satisfaction that comes from an art-history course that allows you to understand a work of art; or a music course that trains you to listen for the theme and variations of a symphony; or a literature course that heightens your appreciation of poetry; or a biology course that opens your eyes to the wonders of the human body — none of these is captured by the metrics of return on investment. Nor is the fact that college is a place where lifelong friendships are made, often including that most important of friendships, marriage. All of these benefits should be factored in when considering “return on investment”: but because they can’t be quantified, they are ignored.

The hazard of metrics so purely focused on monetary considerations is that, like so many metrics, they influence behavior. Universities at the very top of the rankings already send a huge portion of their graduates into investment banking, consulting, and high-end law firms. Those are honorable professions, but is it really in the best interests of the nation to encourage universities to direct their best and brightest to choose those careers?

A capitalist society depends on a variety of institutions to provide a counterweight to the market and its focus on monetary gain. To prepare students for their roles as citizens, as friends, and above all to equip them for a life of intellectual richness — those are among the proper roles of college. Conveying marketable skills is a proper role as well. But to subordinate higher education to what can be quantified is to measure with a dangerously crooked yardstick.

Jerry Z. Muller is a professor of history at the Catholic University of America. He is the author, most recently, of The Tyranny of Metrics (Princeton University Press), from which this essay is adapted.

A version of this article appeared in the January 26, 2018 issue.

———————————————————————————————————————————-

Productivity Metrics

What is the best way to assess faculty activity?

By Vimal Patel FEBRUARY 29, 2016
Eric Petersen for The Chronicle

Questions about faculty productivity are nothing new. But the growing use of metrics to assess faculty activity has raised the stakes at a time when colleges already face growing pressure to demonstrate accountability and compete with peer institutions.

Meanwhile, questions about how to measure a scholar’s influence in social media, known as “altmetrics,” are expected to add to the debate over faculty productivity.

One company that colleges turn to for help with metrics is Academic Analytics, which allows colleges to compare the peer-reviewed publications, journal citations, federal research grants, and other honors of their faculty members with those at peer institutions. The company says it has more than doubled the number of colleges it works with, to about 100, over the past five years.

At public colleges, the pressure to measure faculty productivity often comes from legislators. But all types of institutions are increasingly paying more attention to remaining competitive with their peers, says Peter Lange, chief academics adviser at the South Carolina-based company and a former provost of Duke University.

Such efforts haven’t always been executed with finesse. Texas A&M University, for example, issued a report in 2011 that listed faculty members’ names in red or black, like a corporate balance sheet, depending on whether the research and tuition dollars they generated covered their salary and expenses. Such heavy-handed efforts usually crumble under faculty opposition. Texas A&M abandoned its plan amid faculty objections to the perceived corporatization of the university as well as the accuracy of the data.

Winning over faculty members, or at least avoiding a revolt, is key to the long-term success of evaluation efforts, and administrators must strike a balance between their needs and faculty concerns.

“The most important thing,” says Gary A. Olson, president of Daemen College, “is to make sure you have faculty buy-in. If you have them helping in the production of the measurement instrument, you have the best chance of coming up with an instrument that everybody’s happy with.”

Pleasing everyone, though, may be impossible. Many faculty members, especially in the arts and humanities, are distrustful of faculty analytics.

“They’re trying to run creative thinking through a machine,” says Mark Usher, chair of the classics department at the University of Vermont. While the university works with Academic Analytics, it does not require the use of company data for evaluation, and recently asked each of its academic units to develop its own faculty-productivity metrics.

Mr. Usher says the metrics aren’t meaningful without context and often aren’t even accurate. He echoes faculty members on many campuses who have complained that reports based on metrics often show deflated grant awards and incorrect journal citations, and omit publications that should be included (and vice versa).

For example, he says, Academic Analytics had included Acta Astronautica, a publication he’d never heard of, as a classics journal. “What is that?” he asks, “Like, the study of UFOs?” (It’s an astronautics journal.)

Company officials acknowledge that assessing faculty productivity in the arts and humanities is tougher than in the sciences and engineering, where quantified measurements are the norm.

TAKEAWAY

Can Metrics Measure Professors?

Colleges are increasingly using data to measure their faculty members’ productivity and to compare them with professors at peer institutions. Professors complain that such metrics provide an inaccurate and incomplete picture of their activities, but colleges say the careful use of data from an outside source provides credibility.
Gaining faculty approval is key to the long-term success of any effort to measure faculty productivity.
Administrators need to be sensitive to disciplinary differences: A metric that works in civil engineering might not work in English or, for that matter, chemical engineering.

At the Massachusetts Institute of Technology, visiting committees, comprising members from academe, business, industry, and government, have reviewed each department since the 19th century. These days, those committees receive data compiled by Academic Analytics, which allows comparisons with peers. But administrators must be cognizant of disciplinary differences, says Lydia Snover, director of institutional research at MIT, and must put the data in context with other indicators of productivity, which the company’s data don’t measure.

Peer-reviewed publications may be an effective productivity metric for some departments. For others, like computer science, which produce fewer papers, the metrics might be citations per publication, or conferences per faculty member, or honors and awards. Federal grants could be a productivity metric for a department like chemistry, but not for engineering, since engineering faculty members at MIT receive a large share of grants from private sources that aren’t captured by Academic Analytics’ data.

Moreover, Ms. Snover says, colleges must make clear that faculty-productivity metrics will be placed in the context of a faculty member’s broader body of work. “A lot of this,” she says, “is just to be able to provide some comparative data that isn’t hearsay. It’s not perfect. It’s impossible to be perfect in these areas.”

Some say the next faculty-productivity battlefield might be altmetrics, a term used to describe alternative methods of gauging scholarly impact, including the use of blogs, news coverage, and social media. How many times was a tweet about your research retweeted, or “liked” on Facebook? Such measures have made headway in Britain but are still a gray area in the United States, says Anthony J. Olejniczak, chief knowledge officer of Academic Analytics.

“It’s not exactly clear where that line between what is scholarly and what is media is ultimately going to be settled,” he says. “But a blurring of that line is clearly something that has been happening in the last few years.”

Proponents of faculty analytics say the quality and accuracy of data have improved and will continue to do so as technology evolves. But the debate over how meaningful those data are won’t be settled anytime soon.

Vimal Patel covers graduate education. Follow him on Twitter @vimalpatel232.
