Ten Reasons Not To Measure Court Performance

This post is based on a December 9, 2008, presentation to a seminar of Michigan Chief Judges and Court Administrators sponsored by the Michigan Judicial Institute, at the Michigan Hall of Justice Conference Center in Lansing, Michigan. It is an updated and expanded version of the Made2Measure post, Eight Reasons Not to Measure Court Performance, April 5, 2006.

It is not sufficient simply to proclaim the benefits of court performance measurement – accountability, transparency, focus, attention, understanding, control, predictability, influence, and strategy development – and expect acceptance and effective implementation.
Performance measurement, like any tool, has shortcomings and introduces disruptions of the status quo that should not be dismissed or ignored.

These shortcomings and disruptions can be minimized and even eliminated, however, when they are identified, clearly understood, thoroughly and candidly explored, and addressed in specific terms. Unfortunately, they are often framed as absolute deterrents or broad indictments of performance measurement that prevent honest debate.

In his insightful (and irreverent) 1998 book, Measuring Up: Governing’s Guide to Performance Measurement for Geniuses [and Other Public Managers] (Governing Books, Washington, DC), Jonathan Walters tallies up the reasons why you can’t possibly do performance measurement, especially in the public sector, reasons that those around you will use to argue why you can’t. With a nod to Walters, here are ten overlapping reasons not to measure court performance that you’re likely to encounter.

Reason 1: Performance Measurement Threatens Judicial Independence

The very idea of court performance measurement – this wild-card argument goes – is antithetical to the principles of judicial independence and separation of powers. It rests most of its weight on the assumption that performance measurement is imposed by forces external to the court – legislatures, state, county or city executives. Even if a court has a strong hand in the design and use of the performance measures, the argument goes, performance measurement will expose a court to external scrutiny and meddling and, in so doing, hand over control in ways that will erode independence and blur the separation of powers.

No doubt, this reason resonates with many judges and court managers. Not so with the Nation’s state court leadership.

In various policy statements and resolutions, state court leaders link independence to accountability, and accountability to performance measurement. For example, with Joint Resolution 14, In Support of Measuring Court Performance, adopted on August 3, 2005, the Conference of Chief Justices (CCJ) and the Conference of State Court Administrators (COSCA) first join independence and accountability by recognizing that accountability fosters an environment where legislators, executive agencies, and the public understand the judiciary’s role and are less likely to interfere with the judiciary’s ability to govern itself. Institutional strength and integrity are protected, the members of the conferences suggest, not by isolation but instead by openness, transparency and collaboration. The conferences next link judicial independence to performance measurement by declaring that “judiciaries need performance standards and measures that provide a balanced view of court performance in terms of prompt and efficient case administration, public access and service, equity and fairness, and effective and efficient management.” In short, the counterargument is that performance measurement strengthens independence through accountability rather than weakening it.

Reason 2: Performance Measurement Is Nothing More Than a “Gotcha” Game

This reason is rooted in the belief that the purpose of performance measurement is finding fault and playing the “gotcha” game, the fear that performance measurement is used to control, justify, audit and determine who went wrong, rather than what went wrong. People don’t like being judged and graded. For many, the word “performance” conjures up beliefs and fears about tests and races, about zero-sum games, about “beating” others or “beating” a standard. Only one is a “winner” in these games; the rest are “losers” or “sub-standard.” It’s not something most of us like to contemplate.

The purpose for which performance measurement is used is the most important determinant of people’s reaction to it. A major point in Dean R. Spitzer’s insightful 2007 book Transforming Performance Measurement: Rethinking the Way We Measure and Drive Organizational Success (New York: American Management Association) is that people actually like measuring and being measured. Think of young readers who proudly report how many pages they’ve read. What people don’t like is being judged.

Spitzer maintains that it all depends on the context, on how performance measurement is pitched and experienced. Is it presented and experienced as a steering tool or as a grading tool? The highest purposes of performance measurement are to learn and to improve. Performance measurement tends to be much more positively accepted if it is used to assist in learning and improvement.

Reason 3: Performance Measurement Is Inherently Misguided Because Courts Have Little Control of Outcomes

Jonathan Walters calls this the “cause-and-effect and long term consequences conundra of performance measurement” in the public sector. Government actions are often hard to link to results, and the ultimate results of many government programs won’t be known for a long time, if ever. In the courts, the argument goes something like this: Because it’s hard to say with any certainty that what a court does leads to better access, improved timeliness, and more fairness and, in any event, the court ultimately has so little power over them, measures of these outcomes are inherently unfair to the courts. Why should courts be held accountable for things over which they have little control?

No doubt, it’s easy to get hung up on arguments over cause-and-effect and over control of long-term consequences. But are these arguments enough to drop the whole idea of performance measurement?

There is a growing number of court leaders who believe just the opposite: that these arguments, in fact, make the case that it’s high time to forge ahead with performance measurement. They know one thing for sure: we need to know where we are (baselines) and where we’re going (trends) before we have even a hope of knowing what causes what. How else, for example, will courts be able to distinguish between internal (e.g., shifting resources) and external (e.g., legislative mandates) causes of increases in cost-per-case?

It may be that for some measures, courts can do quite a bit by themselves to impact outcomes and, for others, they need a little help from their friends. Either way, courts are better off knowing than not knowing.

Reason 4: Uniqueness of My Court Defies Performance Measurement (There’s No Way to Measure What We Do)

We need to have a sensitivity to the subtleties of context. No two courts are alike. No two jurisdictions are the same. Our court is unique. In fact, what we do is so unique that it can’t be captured by performance measurement.

Such assertions, especially when linked to judicial independence, shape Reason 4. Those who make it are not at all convinced by Peter Drucker’s admonition, “You can’t manage what you can’t measure.” They are smart and thoughtful people who are just philosophically resistant to performance measurement. “We don't work on an assembly line making widgets,” they might say. “We don’t work with machines. We deal with unique cases involving human beings who have complex problems that require individual attention. You can’t just send in the bean counters and number crunchers and crank out a bunch of performance indicators to capture what we do on a spreadsheet. It just can’t be done.” (This reason is what Bob Behn, a lecturer at Harvard University’s John F. Kennedy School of Government, calls the “direct assault on the spread-sheet guys.”)

This reason for not doing performance measurement will not get much traction. First, if there truly is no way to measure a court’s performance in a meaningful way, maybe it’s fair to ask whether the court really is contributing in areas of importance to the court’s stakeholders – access to justice, fairness, system integrity, and public trust and confidence – and to suggest that the court should turn its attention and energies to something that does contribute. Second, the assertion that we can’t measure success should not preclude any attempts to try. Courts today are measuring performance thought “unmeasurable” just a few years ago. Finally, too many courts hide behind their differences. While it is true that substantial differences exist among courts and jurisdictions even within a state, there are more similarities than differences. Every court provides citizens access to a forum for resolving legal disputes, each processes and decides cases, each must take and preserve a record of its proceedings, each must work cooperatively with its justice system partners such as law enforcement agencies, and each must engender public trust and confidence.

Comparative performance measurement involves serious efforts to identify courts that are “roughly” comparable, those that are most similar to each other. Once this rough comparability is established (and accepted by courts applying the comparative measurement), remaining unique characteristics among the courts become factors that may account for differences in performances.

Reason 5: Performance Measurement Is an Invitation to Unfavorable and Unfair Comparisons

This reason is based in fears that performance measurement will show that “my” court does not measure up or “stack up” to other courts. Many court executives may prefer not to know how their courts compare. After all, as long as you don’t know how your court’s performance compares, and as long as nobody else knows, there’s no pressure for change. Even courts that are happy to learn how to improve their own performance may balk at having their performance compared with other courts.

While unfavorable comparisons are inevitable in performance measurement, the negative impact of such comparisons on individual courts need not be. Limitations of performance measurement can be minimized and attendant risks controlled. An AOC, for example, can shield individual courts from the negative impact of cross-court comparisons on a measure of court user-citizen satisfaction by reporting only state median, top (90th percentile), and bottom (10th percentile) performances. Directed by a general policy favoring self-improvement, the identities of individual courts would be revealed only by agreement with the individual courts.
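As a rough illustration of that kind of shielded reporting, the sketch below reduces a set of per-court satisfaction scores to the three statewide figures mentioned above. The court names, the scores, the function names, and the choice of linear interpolation for the percentiles are all assumptions made for the example; an AOC would settle on its own definitions.

```python
from statistics import median


def percentile(values, p):
    """Linear-interpolation percentile (p from 0 to 100) of a non-empty list."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def statewide_summary(scores_by_court):
    """Return aggregate figures only; no court name appears in the result."""
    scores = list(scores_by_court.values())
    return {
        "median": median(scores),
        "top_90th_percentile": percentile(scores, 90),
        "bottom_10th_percentile": percentile(scores, 10),
    }


# Hypothetical satisfaction scores (percent of court users satisfied).
example_scores = {
    "Court A": 62.0,
    "Court B": 71.0,
    "Court C": 75.0,
    "Court D": 80.0,
    "Court E": 88.0,
}

summary = statewide_summary(example_scores)
print(summary)
```

Each court can still compare its own score privately against the three published figures, which is the point of the self-improvement policy described above.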

Comparisons among courts on valid and meaningful performance measures must be made – and seen to be made – in their political and practical contexts. Performance measurement, by definition, invites both favorable and unfavorable comparisons.

Why would an elected court clerk, for example, embrace cross-court comparisons on a measure of case file readiness and integrity (i.e., the proportion of case files that are delivered when requested and are accurate and complete)? Undoubtedly, the court clerk correctly views his or her personal performance and, by extension, the performance of the court clerk’s office as something to be assessed primarily by the electorate. While the court clerk’s reluctance to support comparative performance measurement may be based in historic differences in methods of truth-seeking, the issue is as simple as this: Why upset the apple cart? If he or she is a current office holder, the verdict on the performance of the clerk’s office already has been rendered. What is to be gained from comparative performance measurement, particularly when focused on a measure that is at the heart of the work of a court clerk’s office? What if the court clerk’s office is identified as a relatively poor performer? Who will have access to the results of the comparative performance measurement? How can the court clerk’s office be insulated from unfavorable press coverage? These are not unreasonable questions.

But Jonathan Walters accurately describes the bottom line when he writes this about so-called unfair comparisons in government performance: "The fact is, such issues as who has the most efficient social service system, the smartest kids, the best cops, the quickest snowplows, the cleanest drinking water or even the most reliable street lighting are of intense interest to citizens. And pretty soon jurisdictions not producing performance data in such areas are going to be asked why they’re not."

Reason 6: Performance Measures Will Be Misused

Like any tool, comparative performance measurement must be used correctly to be effective. As part of any performance measurement, efforts should be made to ensure that comparisons are complete, that they are based on consistent operational definitions and accurate calculations of performance measures, and that they rely on high quality data. These requirements may seem so obvious on their face that it hardly seems necessary to describe them as shortcomings of performance measurement. However, what may appear obvious at first glance becomes more subtle and possibly problematic upon closer look.

For example, differences between two courts -- court A and court B -- in the operational definition and calculation of the measure of the age of pending caseloads can lead to incomplete or invalid comparisons. Both courts may define case processing time consistently in terms of the total number of elapsed days from filing to entry of judgment in a case. But only one court may take into account the days that cases are inactive. That is, court A makes no distinction, whereas court B makes a clear distinction, between those cases that move through the court without interruption, on the one hand, and episodic cases that may have been placed on inactive status for a variety of reasons (e.g., a defendant absconds and a bench warrant is issued for his or her return), on the other hand. Court A includes in its calculation of elapsed time the 60 days a defendant is out on warrant, for example, whereas court B excludes those 60 days.

Therefore, the differences between the courts in the age of pending caseloads (as well as the speed of case processing) may be due solely to inclusion or exclusion of the time a case is inactive in the calculations of measures of the age of pending cases, not to any differences in the efficiency of the courts’ case processing. Making comparisons of these two courts on the measures requiring the calculation of the ages of pending and completed cases can still be instructive, but doing so without taking into account the differences in definitions and calculation of the performance measure would paint an incomplete and inaccurate picture, and constitute a misuse of comparative performance measurement.
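The effect of the two definitions can be sketched with a single hypothetical case. The dates, the function name, and the parameter names below are invented for illustration; the only detail taken from the example above is the 60 days a defendant spends out on warrant.

```python
from datetime import date


def elapsed_days(filed, end_date, inactive_periods=(), exclude_inactive=False):
    """Days from filing to end_date (judgment or measurement date).

    With exclude_inactive=True, days spent on inactive status are subtracted.
    Assumes every inactive period falls between filed and end_date.
    """
    total = (end_date - filed).days
    if not exclude_inactive:
        return total
    inactive = sum((end - start).days for start, end in inactive_periods)
    return total - inactive


# One hypothetical case, measured the same way except for inactive time.
filed = date(2008, 1, 2)
end_date = date(2008, 7, 1)
on_warrant = [(date(2008, 3, 1), date(2008, 4, 30))]  # 60 days on bench warrant

court_a_age = elapsed_days(filed, end_date, on_warrant)  # counts inactive days
court_b_age = elapsed_days(filed, end_date, on_warrant, exclude_inactive=True)

print(court_a_age, court_b_age)  # 181 121
```

The same case appears 60 days older under court A's definition than under court B's, with no difference at all in how the case was actually handled.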

Reason 7: Performance Measures Will Be Used to Hurt, Not Help

This argument puts a sharper point on some of the preceding reasons not to measure court performance. The traditional command-and-control managerial mindset in many courts is: “You screw up, you fall short, and you get punished.”

Why should we get into performance measurement if the media will completely misinterpret the data and proceed to beat the court black and blue on the evening news and on the front pages of the local paper? This is a legitimate worry.

It might be a good idea to start thinking less cynically about what performance measurement might do for you, rather than to you. Jonathan Walters is absolutely brilliant on this point. He correctly notes that it’s a mistake to assume that a poor showing on a performance measure will only lead to a “public upbraiding for shoddy work, as opposed to being a catalyst for some reasoned discussion and debate about how to improve performance (in addition to perhaps winning additional resources).” Good managers, he says, turn the tables on bad performance numbers.

He cites the example of Charlie Dean, chief of police in Prince William County, Virginia, an example that should ring true for court managers who have put the measure of case clearance to good use. After getting criticized for his department’s poor showing in clearing cases, Dean turned the tables by using the same performance data to show that his department’s performance was quite good considering that his was one of the most understaffed departments in the region. The key to his success was that he didn’t attack performance measurement as unfair. Instead, he used performance measurement as a useful management tool and used the performance data to get what he needed.

Reason 8: Performance Measurement Is Good for Problem Diagnosis But Not Cure

This argument stems from the belief that performance trend data, by themselves, do not tell us why things are different, only that they are different.

True enough, as a general rule, comparisons of performance data – whether made among different courts or within a court over time or among different units of a single court – do not tell us why the differences occur. They will not, by themselves, identify causes, provide a prescription for how to improve performance or tell us why it may not be at the level we would like.

Of course, comparisons can help pinpoint the location of trouble spots that may suggest why particular units of a court or two courts differ in their performance (e.g., when the civil division of a court declines dramatically in a survey of employee engagement compared to the other divisions of a court). However, without additional inquiries like those described for determining evidence-based best practices, performance measurement does not tell us why performance is good or bad relative to other courts.

Before conclusions about differences in performances are made, possible differences that are not attributable to performance differences among the courts must be explored and, if possible, eliminated. Such differences may include the inconsistent definitions of data elements, and different counting and calculations mentioned above. As noted by the Urban Institute, which has done as much as anyone to advance performance measurement in the public and non-profit sectors, even with the most careful effort to collect consistent comparative performance data, such variations will inevitably occur. Other differences not attributable to differences in performances include demographic or regional differences, changes in resources available, and differences in the number or characteristics of cases filed.

Reason 9: Performance Measurement Is Too Expensive

This is the worry that performance measurement takes too much time, effort and money. Making comparisons with other courts, for example, adds cost and effort to “internal” performance measurement that is restricted to one court. Comparable courts and performance measures must be identified, and data must be collected, analyzed and reported. This additional cost and effort must not only be well understood but also weighed against specific benefits.

Court leaders and managers, as well as the leadership of the court's justice system partners, must have both clear expectations and political will to secure the necessary resources to support effective performance measurement. While one person -- the head of a court's research and planning division or, in smaller courts, the court administrator -- is likely to lead the process of planning and implementing the performance measurement system, a work group of several individuals will need to be involved over a period of months. Multiple stakeholders inside and outside of the court must work together to reach agreements on key success factors and performance measures aligned with the court's mission. Both the development and use of the performance measurement system must be supported by financial and political resources. As Kathryn Newcomer, professor of public administration at George Washington University, warns: "As additional yet uncompensated work, data collection to support performance measurement will simply not get done. Authority and resources must accompany responsibility for performance measurement."

Given this caveat, it is important to stress that the cost and effort of performance measurement, like any initiative, can and should be kept to a minimum. Performance measurement need not be prohibitively expensive. New pieces should be built on the base of existing parts, processes in place in one or more courts or in agencies allied with courts should be extended to other courts, the scope of the effort should be made to fit needs, automated information systems and other technology should be explored to minimize costs, and finally, specific benefits and payoffs should be weighed against specific costs.

Reason 10: Performance Measurement Is a Vulgarization of Jurisprudence

Professor Godfrey St. Peter, the protagonist in Pulitzer Prize-winning author Willa Cather’s 1925 novel, The Professor’s House, resisted with all his might what he saw as the new commercialism at the time, the aim to “show results” that he saw as undermining and vulgarizing education in his Midwestern university.

Harried judges, as well as defense attorneys and prosecutors, facing overwhelming caseloads, joke about “McJustice” and being on an “assembly line.” “I don’t much care how fast the train is travelling, if it’s not going where I want to go,” one judge quipped.

This reason for not embracing performance measurement may not withstand close scrutiny but it, nonetheless, resonates with many judges. It is their mental model of the whole enterprise of “prying results from everything we do” and it shapes their resistance to performance measurement.

Rather than dismissing or dispelling this mental model of “McJustice,” a response to Reason 10 should counter the notion that “as fast as possible” and “as cheap as possible” are requirements of court performance measurement. To the contrary, a “balanced scorecard” of court performance measures might well demonstrate that “McJustice” has its price in declining trust and confidence in the courts and disengagement of court employees who see the court losing respect in the community.

The Bottom Line: Performance Measurement Is No Longer an Option for Courts

Court performance measurement inevitably suggests impending change that must be managed effectively. Powerful mindsets or mental models may impede this change. Clearly, courts as learning organizations must continually clarify and improve the mindsets and mental models of court performance measurement to ensure its success in court improvement. For some, it may be as easy as striking the word “performance” from discussions and speaking instead of “indicators of service quality” or “adherence to mission,” or some phrase that moves the focus from personal to organizational responsibility and accountability. For other courts, it may take much more than language to alter powerful mental models inconsistent with performance measurement.

In any event, rather than filibuster against it, it is best to acknowledge the limitations of performance measurement, strive to minimize them in specific ways, and demonstrate in specific terms that the benefits outweigh the real risks that these limitations may pose for individuals and courts. Chances are that performance measurement won’t be an optional exercise for courts for too much longer.

See also:

Getting Started with Performance Measurement – Breaking Down Resistance, Made2Measure, December 5, 2005

Comparing Apples and Oranges ....And Learning From It, Made2Measure, April 9, 2006

Fear of Misuse of Cost Per Case Measure, Made2Measure, February 1, 2006

“How Do We Stack Up Against Other Courts? The Challenges of Comparative Performance Measurement,” The Court Manager, Vol. 19, No. 4 (Winter 2004-2005)

Independence, Accountability and Performance Measurement, Made2Measure, September 29, 2005

Top 10 Reasons for Performance Measurement, Made2Measure, September 26, 2005


© Copyright CourtMetrics 2008. All rights reserved.
