Impediments in the Advance of Performance Measurement
The question How Are We Performing? lies at the heart of self-governance and effective leadership of courts. The capacity and political will to address this self-directed question regularly and continuously, using the tools of performance measurement, is the hallmark of a successful court organization. More and more court leaders and managers are turning to performance measurement to drive success. The trend, however, is a slow march impeded by two entrenched ways the judicial sector has tended to measure its success: (1) reliance on third-party monitoring and evaluations of court performance; and (2) adherence to a research paradigm for assessing court performance.
Self-governance, transparency, and accountability of courts will depend on the degree to which these two impediments are attenuated. Court leaders will need to champion and support the self-assessment of performance by courts instead of relying on third-party evaluations. They will need to seek to replace the methodologies of research and program evaluation (or evaluation research), which have dominated justice sector assessment in the past, with those of performance measurement and management. These suggestions are consistent with the values, principles, and tools of the National Center for State Courts’ High Performing Courts Framework, which is oriented toward United States courts, and the International Framework for Court Excellence.
Self-Assessment of Court Performance Versus Third-Party Evaluations
As with most changes in life, change that is self-directed is the most meaningful and long lasting. Self-assessment, and not third-party monitoring and program evaluation (or evaluation research), is the hallmark of successful courts. It is integral to effective self-governance. A successful court has the capacity and the political will for self-directed rigorous performance measurement and management that addresses the question How are we doing? The “we” in the question suggests the critical differences between court performance measurement and third-party evaluations of courts.
Monitoring and Evaluation of Court Programs and Processes. Performance measurement is not yet the norm in courts, though it has a strong foothold and is gaining strength in the United States and large parts of the developed and developing world. Most assessments of programs, processes, and reform initiatives in courts are accomplished instead by monitoring and evaluations instigated and conducted by third parties, including funding agencies, donors, aid providers, and their agents (researchers, analysts, and consultants). The abiding concern of these third parties is the return on their investments, and this concern does not necessarily align with the expressed purposes and fundamental responsibilities of courts, at least as these might be conceived by court leaders and managers and in seminal authorities like the Trial Court Performance Standards. For the most part, the focus of these third-party assessments is a specific initiative, program, or process (e.g., juvenile drug courts, small claims mediation, and summary jury trials).
The results are, at best, a limited and generally unsatisfactory response to the question How are we doing? A further weakness, addressed in the next section of this essay, is that the arrangements of the typical research paradigm within which third-party monitoring and evaluation operate (e.g., long delays and ineffective sharing of information with courts) make the results even less responsive to the question. Performance data produced by monitoring and evaluation efforts are used primarily in service of decisions to increase, decrease, or redirect funding or other support. While some performance data may be shared with courts and justice systems, as a practical matter most are collected, analyzed, interpreted, and used, first and foremost, by third parties, i.e., funders, donors, aid providers, and their agents. The measures of success are defined, and the results interpreted, by these third parties with little or no input from the courts or justice systems implementing the programs, processes, and reform initiatives.
The Indexing of Performance. Another type of third-party monitoring and evaluation of court performance takes the form of indexing of justice sector performance. Indexes are useful tools. The idea that complex things can be pinned down and quantified on a simple scale seems universally appealing. Indexes that reduce justice sector performance to a single number for purposes of comparing, ranking, and rating countries are being used throughout the world to understand everything from governance, corruption, economic vitality, health, and education to the quality of life. Governments and reformers take such indexes seriously. They are closely watched. “There is nothing,” wrote the Economist in an October 9, 2010 report on the results of the Mo Ibrahim Foundation’s Index of African Governance, “like a bit of naming, shaming and praising.”
The crowded field of performance indexes includes comprehensive indicators and indices that encompass entire countries but include aspects of justice, like the Mo Ibrahim Foundation’s Index, and others more narrowly focused on the rule of law, like the World Justice Project’s WJP Rule of Law Index™ and the American Bar Association’s Judicial Reform Index. The WJP Rule of Law Index™ incorporates ten elements of the rule of law -- such as limited government powers, fundamental rights, and clear, publicized laws -- and 49 associated sub-factors that make general reference to various justice “systems”; reference to courts, however, is conspicuous by its absence.
While the indexes succeed in getting people’s attention by naming, shaming, and praising the jurisdictions rated and ranked, buy-in from the leaders and managers of courts and court systems may be limited. Well-known indexes like the World Justice Project’s WJP Rule of Law Index™ and the American Bar Association’s (ABA) Rule of Law Initiative’s Judicial Reform Index might be seen to reflect the ethos of the sponsoring organizations, and not necessarily the values, purposes, and fundamental responsibilities of courts. Both of these indexes rely heavily on polls of commissioned experts who assess the factors the third parties deem important to judicial reform. Perhaps stemming in part from a broad interpretation of judicial independence that does not embrace transparency and accountability, many judges have a viscerally negative reaction to public reporting on the quality of judicial services, and some may be appalled that their judicial systems are ranked numerically by outside “experts” on the basis of what they perceive as misinformation.
Third-party assessment of performance, whether it takes the form of indexing or of monitoring and evaluation, is the antithesis of self-governance. Because it is not self-directed by courts, it is less likely to be embraced by their leaders and managers or to lead to reform. As might be expected, when performance assessment is, or is perceived to be, wholly initiated and executed by external organizations, more energy and resources may be expended by the leaders of courts and court systems on refuting poor evaluation results or low rankings than on developing strategies for sustained reform.
Performance Measurement Versus Evaluation Research
The disciplines of performance measurement and research both adhere to the scientific method and use statistical thinking to draw conclusions. But, as suggested above, performance measurement and research (or evaluation research) differ vastly in their purposes, functions, sponsorship, uses, and the way they are funded and structured. Sorting out the differences between them, and choosing to employ one over the other, is not just academic hair-splitting.
First, the purposes of the two disciplines are quite different. At the most fundamental level, the purpose of performance measurement is to answer the question How are we doing? in response to stakeholders’ demands for transparency and accountability, and to provide a basis for improvement. Performance measurement can give clues to why outcomes are as good or bad as they are, but it does not go the full distance of determining why things are as they are. It can help to identify variations in performance and to isolate where and when those variations occur (e.g., an upward trend in the public’s rating of the courts is largely due to attorneys’ increased satisfaction with case processing timeliness after the court initiated electronic filing) so that decisions and actions can target improvements. It does not determine causes. That is the domain of research, which helps us understand why something has occurred. (Of course, an important value of performance data is to trigger in-depth evaluation research.) As noted above, the purpose of evaluation research in the justice sector, for the most part, is to answer the question “What has worked and what has not, and why?” in order to justify donors’ or funders’ investments in various initiatives, programs, and processes.
Second, performance measurement and research differ in their sponsorship and audience -- i.e., who is doing it, and for (or “to”) whom. Performance measurement is done by the courts, for the courts. Results are made known, first and foremost, to court leaders and managers. Distribution of performance data to the public and other stakeholders is done at the discretion and direction of the court, preferably in real (or near-real) time and in a wholly transparent manner. Evaluation research, on the other hand, is more often than not sponsored or instigated by third parties (e.g., administrative offices of the courts or outside funding agencies). In extreme circumstances, courts and other justice sector institutions are mere “subjects” of the research. Those conducting the research are under no obligation to share the results with court leaders or managers except as a courtesy or as a quid pro quo for the courts’ participation in the research.
Third, the functions of performance measurement are specific and targeted: establishing a baseline for current performance; setting organizational goals and assessing whether performance is within determined boundaries or tolerances (controls); identifying and diagnosing problems; determining trends; and planning. Performance measurement is done for the utilitarian and practical purpose of improving court programs, services, and policies. Specifically, court leaders and managers might use performance measurement to: translate vision, mission, and broad goals into clear performance targets; communicate progress and success succinctly in the language of performance measures and indicators; respond to legislative and executive branch representatives’ and the public’s demands for accountability; formulate and justify budget requests; respond quickly to performance downturns (corrections) and upturns (celebrations); provide incentives and motivate court staff to improve programs and services; make resource allocation decisions; set future performance expectations based on past and current performance levels; insulate the court from inappropriate performance audits and appraisals imposed by external agencies or groups; and communicate better with the public to build confidence and trust. Evaluation research, on the other hand, seeks truth about the worth and merit of an initiative, program, or process, and is intended to add to our general knowledge and understanding, especially regarding future investments in those initiatives, programs, or processes.
Fourth, performance measurement and evaluation research adhere to different design and data interpretation protocols. Consistent with self-governance, performance measurement is focused on the performance of individual courts with the aim of individual accountability. Researchers, on the other hand, are interested in the generalizability of findings to all courts.
Of course, both performance measurement and evaluation research must adhere to the requirements of the scientific method. Both use quantitative and qualitative methods including surveys and questionnaires, interviews, direct observation, recording, descriptive methods, tests and assessments, and statistical analysis. But these requirements and methods are applied differently in performance measurement and evaluation research. For example, sample sizes may be smaller and levels of confidence lower in performance measurement primarily because replication of results is done on a regular and continuous basis as a critical matter of design. Evaluation research, on the other hand, is episodic. It is done when time and funds permit.
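The trade-off between sample size and precision can be illustrated with a small sketch. The survey scenario, sample sizes, and 95% confidence level below are hypothetical illustrations, not figures from this essay; the point is only that tightening the margin of error roughly quadruples the required sample, which is why frequent, replicated measurement can tolerate smaller samples than a one-shot study.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a sample proportion at the
    confidence level implied by z (1.96 ~ 95%), using the worst-case
    p = 0.5 and the normal approximation."""
    return z * math.sqrt(p * (1 - p) / n)

# A hypothetical monthly court-user satisfaction survey of 100 respondents,
# repeated every month, versus a one-shot study of 400 respondents.
print(round(margin_of_error(100), 3))  # about +/- 0.098 (roughly 10 points)
print(round(margin_of_error(400), 3))  # about +/- 0.049 (roughly 5 points)
```

A court replicating the smaller survey monthly can corroborate or correct any single month's estimate, whereas the one-shot study must buy its precision up front.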
The matter of replication of results highlights a critical design difference between performance measurement and evaluation research. Basically, replication means repeating the performance measurement or evaluation research to corroborate the results and to safeguard against overgeneralizations and other false claims. Repeated measurements -- i.e., replication -- on a regular and continuous basis are part of the required methodology of performance measurement. Analyzing trends beyond the initial baseline measurement requires replication of the same data collection and analysis on a monthly, weekly, or daily basis or, in the case of automated systems, on a near real-time basis. In contrast, replication in research is a methodological safeguard that is universally lauded by scientists but seldom done in practice.
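As a concrete sketch of how regular, repeated measurement can feed simple controls, the fragment below derives a baseline and a tolerance band from a year of monthly measurements and flags later months that fall outside it. The clearance-rate figures and the two-standard-deviation band are hypothetical illustrations, not data or methods from this essay.

```python
import statistics

def control_limits(baseline, k=2.0):
    """Return (lower, upper) tolerances: baseline mean +/- k standard deviations."""
    mean = statistics.mean(baseline)
    sd = statistics.stdev(baseline)
    return mean - k * sd, mean + k * sd

def flag_out_of_tolerance(series, lower, upper):
    """Return the (period, value) pairs falling outside the tolerances."""
    return [(period, value) for period, value in series
            if not (lower <= value <= upper)]

# Hypothetical monthly clearance rates (cases resolved / cases filed)
# for a baseline year, then the current year's replicated measurements.
baseline_year = [0.97, 0.99, 1.01, 0.98, 1.00, 0.99,
                 1.02, 0.98, 1.00, 0.99, 1.01, 0.97]
current_year = [("Jan", 0.99), ("Feb", 1.00), ("Mar", 0.91), ("Apr", 0.98)]

lower, upper = control_limits(baseline_year)
for period, value in flag_out_of_tolerance(current_year, lower, upper):
    print(f"{period}: clearance rate {value:.2f} is outside "
          f"[{lower:.2f}, {upper:.2f}]")
```

Because the same measurement is replicated every month, a single out-of-tolerance reading (here, a hypothetical March dip) triggers inquiry or correction rather than waiting for an episodic evaluation.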
© Copyright CourtMetrics and the National Center for State Courts 2012. All rights reserved.