Liver Allograft Rejection: Current Status of Classification and Grading

Kenneth P. Batts, M.D.

Department of Laboratory Medicine and Pathology, Mayo Clinic and Mayo Foundation, Rochester, MN



The morphologic features of rejection in human hepatic allografts have been well recognized since Porter's elegant review in 1969 [1]. Of contention, however, have been the classification, nomenclature, and grading of these well recognized histologic changes. Since clinical findings and laboratory values are poor indicators of rejection, the interpretation of liver biopsy specimens and successful communication of that interpretation to the clinician are of considerable importance. The purpose of this manuscript is to review current thoughts on the classification and grading of rejection in liver allografts.


Nomenclature of Hepatic Allograft Rejection

Nomenclature is a structured system of names used in a scientific specialty. These names should reflect the current state of knowledge, be linguistically correct, readily encodable, easily memorized, and internationally acceptable. Because achieving a nomenclature for liver allograft rejection that meets these criteria is a difficult task, a panel of an International Working Party on the Terminology of Chronic Hepatitis, Hepatic Allograft Rejection, and Nodular Lesions of the Liver was organized and funded by the World Congresses of Gastroenterology, Los Angeles, in 1994 [2].

This task force recommended using the terms "humoral rejection", "acute" or "cellular" rejection, and "chronic" or "ductopenic" rejection to refer to the three major types of liver allograft rejection [Table 1]. The terms "cellular" and "ductopenic" rejection will be used herein, with the recognition that they are synonymous with "acute" and "chronic" rejection, respectively. Humoral rejection refers to the uncommon antibody- and complement-mediated rejection that occurs in the first week post-transplant in pre-sensitized individuals. Cellular rejection refers to predominantly inflammatory cell-mediated rejection which primarily affects interlobular bile ducts and vascular endothelia and is generally reversible with additional immunosuppression. This category of rejection has generated the most discussion on grading (see below). Ductopenic rejection refers to a relatively uncommon form of rejection that typically occurs several months or more following transplant, is usually irreversible, and is defined by loss of bile ducts and the presence of obliterative vasculopathy. The term "rejection, indefinite for chronicity (indefinite for bile duct loss)" was used to address cases in which cellular rejection is present with possible but not definite loss of ducts. Since most cases of ductopenic rejection are associated with antecedent episodes of cellular rejection, cases in this category may represent examples of "early" ductopenic rejection which may be better candidates for alterations in immunosuppressive regimen to effect reversal of this usually irreversible process.


Grading of Rejection

The rationale for the morphologic grading of hepatic allograft rejection is to predict unfavorable clinical outcome. An ideal grading system would assign individuals demonstrating a certain morphologic type of rejection to clinically relevant categories—for example, those who do not need additional immunosuppression, those who need additional immunosuppression but are likely to respond, and those who will likely not respond to increased immunosuppression ("irreversible" rejection). In addition to demonstrating clinical relevance, grading systems should be simple and reproducible. In the remainder of this manuscript, some of the existing grading systems for liver allograft rejection are reviewed in the context of the aforementioned criteria. The primary focus is on cellular rejection.

Porter's review in 1969 described in detail various morphologic aspects of liver allograft rejection but did not address the issue of grading [1]. Little was written about liver allograft rejection until 1983 [3, 4]. In the next several years, a number of authors further clarified the histopathologic features of rejection, largely in the context of distinguishing rejection qualitatively from other disease processes, but most authors again did not address the concept of grading [5-8].

Grading was addressed peripherally in 1984 by Eggink et al, who divided cases of cellular rejection according to the severity of inflammation; however, all forms responded to additional immunosuppression and thus grading did not appear to have clinical relevance [9]. In 1985, Williams et al more formally addressed grading with a defined I-III system for cellular rejection; however, only six patients were studied and all responded to additional immunosuppression [10].

The first study that attempted to assess the clinical relevance of grading liver allograft rejection was by Snover et al in 1987 [11]. Using a four-grade system [Table 2], they noted that rejection with arteritis, ballooning degeneration with confluent necrosis, or paucity of ducts was associated with unfavorable outcomes. Reproducibility of the classification was not addressed, however, and the study was based on only 36 patients with rejection.

In 1989, Kemnitz et al used a modification of Williams et al's [10] grading system for rejection [Table 2] in examining 329 biopsy specimens from 81 patients [12]. They found that patients who experienced multiple episodes of rejection were more likely to have a severe grade of rejection and a severe grade of bile duct injury. Again, reproducibility of grading was not addressed. Furthermore, clinical outcome was not assessed: the study was retrospective and used the number of biopsy-proven episodes of rejection as the measure of unfavorable outcome against which the grade of rejection was judged; graft and patient survival were not addressed.

The first detailed study of the intra- and inter-observer variation in the histopathologic assessment of liver allograft rejection was by Demetris et al in 1991 using the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) national liver transplant data base [13]. In that study, a panel of experienced pathologists showed that the qualitative diagnosis of cellular rejection, as well as the grading of the individual histologic features important in establishing that diagnosis, was highly reproducible. The diagnosis of ductopenic rejection was less reproducible. This study did not address grading of rejection in a detailed fashion, however, instead focusing on grading of a large number of individual histologic features of rejection and dividing rejection into just acute and chronic forms. Furthermore, clinical relevance of the grading of individual features was not addressed.

The next major attempt at grading liver allograft rejection, by Gupta et al from the Royal Free Hospital, was published in 1995 [14]. This was a retrospective study which involved statistical analysis of the results of single biopsy specimens taken from 106 patients between 1 and 8 days post-transplant. Using stepwise logistic discriminant analysis on twenty histologic features potentially identifying cellular rejection, they confirmed the diagnostic importance of "Snover's triad" (portal inflammation, bile duct damage, and endotheliitis) as well as eosinophils in the diagnosis of cellular rejection. They also applied a new 0-12 grading system for cellular rejection [Table 3] and found that it correlated well with those of Snover [11] and Demetris [15]. An advantage is that the grading criteria are restricted to Snover's triad and are not dependent on hepatocellular necrosis, since distinguishing rejection-related necrosis from coincident necrosis can be difficult.

This study has several drawbacks, however. It involved only specimens from the first 8 days post-transplant, a period of time in which the differential diagnosis of portal lymphoid processes does not include viral hepatitis or recurrent primary disease, making the diagnosis of cellular rejection much more straightforward. Furthermore, the ability of the grade to predict outcome could not be assessed, since the diagnosis of cellular rejection was based, in part, on response to therapy. Thus, cellular rejection which spontaneously resolved or was resistant to therapy would have been miscategorized. Thirdly, the inter- or intra-observer reproducibility of the system was not assessed. Lastly, this 0-12 system is the most complex of the proposed systems to date, making it perhaps more useful for study purposes than in daily practice.

The latest grading system was published by Demetris et al in 1995, again utilizing the NIDDK liver transplant data base [16]. This study addressed both reproducibility of grading and prognostic value of grading in separate arms of the study, using a grading system which divided cellular rejection into 4 grades (none, mild, moderate, and severe) [Table 2].

For the reproducibility arm, fifty pre-selected cases were reviewed on three separate occasions in a blinded fashion by five pathologists at four different institutions. For the third reading the pathologists were given rudimentary clinical data. A number of individual histologic features were recorded and rejection, if thought to be present, was graded using the aforementioned system. The minimum threshold for cellular rejection was the presence of two of the three components of Snover's triad. The "gold standard" assessment of the diagnosis had been pre-determined based on the original biopsy interpretation and the clinical context. Intra-rater agreement for the individual histologic features of portal inflammation, bile duct inflammation/damage, and endotheliitis was good (kappas 0.59 to 0.69), as was that for the grading of cellular rejection (kappas 0.55 and 0.58). The addition of clinical information helped only slightly. Inter-rater agreement for the same three individual histologic features was moderate to good (kappas 0.48 to 0.61), as was that for the grading of cellular rejection (kappas 0.40 to 0.55).
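The kappa statistic cited above expresses agreement beyond that expected by chance. As a rough illustration only, the following sketch computes Cohen's kappa for two readings of the same cases. The function name and the ten grades in each reading are hypothetical and are not taken from the NIDDK study, and the source does not specify which kappa variant (for example, a multi-rater or weighted form) was actually used.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two sets of categorical ratings of the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of cases given the same grade both times.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the marginal frequencies, summed over grades.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[g] * counts_b[g] for g in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical grades for ten biopsies read on two occasions (illustrative only).
reading_1 = ["none", "mild", "mild", "moderate", "severe",
             "mild", "none", "moderate", "mild", "moderate"]
reading_2 = ["none", "mild", "moderate", "moderate", "severe",
             "mild", "mild", "moderate", "mild", "severe"]

kappa = cohen_kappa(reading_1, reading_2)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the two readings agree on seven of ten cases, but because 29% agreement would be expected by chance alone, the kappa is lower than the raw 70% agreement would suggest.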

For the prognostic arm, 295 patients from three institutions were utilized. The ability of the grade of cellular rejection or a number of graded individual histologic findings to predict an unfavorable outcome was assessed. Unfavorable short-term outcomes were defined as one of the following: rejection-related graft failure, OKT3- or ALG-requiring rejection, requirement for any secondary treatment (after initial steroid therapy), or failure of complete resolution to occur within 21 days. Long-term unfavorable outcome was defined as rejection-related graft failure or death within six months of the rejection episode. A statistically significant trend was noted between grade of cellular rejection and short-term (p=0.005) or long-term (p=0.006) unfavorable outcomes. Mild, moderate, and severe cellular rejection were associated with 37%, 48%, and 75% unfavorable short-term outcomes and 1%, 12%, and 14% unfavorable long-term outcomes, respectively.
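A trend between an ordered grade and a binary outcome of this kind can be examined with a test for trend in proportions. The sketch below implements the Cochran-Armitage test as one common choice; the source does not state which test the NIDDK investigators used, so this is an assumption. The per-grade counts are hypothetical, chosen only so that the proportions resemble the short-term figures quoted above, and the resulting statistic is not a reproduction of the published p-values.

```python
import math


def cochran_armitage_trend(events, totals, scores=None):
    """Two-sided Cochran-Armitage test for trend in proportions.

    events[i] is the number of unfavorable outcomes in ordered group i,
    totals[i] is the group size; scores default to 0, 1, 2, ...
    Returns (z, p_value) using the normal approximation.
    """
    if scores is None:
        scores = list(range(len(totals)))
    if not (len(events) == len(totals) == len(scores)):
        raise ValueError("events, totals, and scores must have equal length")
    n = sum(totals)
    p = sum(events) / n  # pooled proportion of unfavorable outcomes
    # Score-weighted sum of (observed - expected) events per group.
    t = sum(s * (r - m * p) for s, r, m in zip(scores, events, totals))
    sum_ms = sum(m * s for s, m in zip(scores, totals))
    sum_mss = sum(m * s * s for s, m in zip(scores, totals))
    variance = p * (1.0 - p) * (sum_mss - sum_ms ** 2 / n)
    if variance <= 0:
        raise ValueError("variance is zero; trend test is undefined")
    z = t / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p_value


# Hypothetical counts (NOT the study data): mild, moderate, severe.
unfavorable = [37, 24, 15]
group_sizes = [100, 50, 20]

z, p_value = cochran_armitage_trend(unfavorable, group_sizes)
print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
```

A positive z indicates that the proportion of unfavorable outcomes rises with grade. With real data, the choice of scores and the adequacy of the normal approximation in small groups would need to be considered.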


Grading in Perspective

Adoption of a unified worldwide grading system for liver allograft rejection is clearly desirable [14, 16]. As noted by Gupta et al, the advantages of such a system would be the reliable follow-up of serial biopsies, facilitation of comparison of inter-institutional results, and perhaps help in fine-tuning immunosuppressive therapy [14]. At the present time, the NIDDK scheme for cellular rejection [16] has the distinct advantages of having been shown to have acceptable inter- and intra-rater reproducibility, to demonstrate clinical relevance by predicting unfavorable outcomes, and to be relatively simple through its derivation from and similarity to the previous schemes of others [Table 4][9-12].

Given the apparent utility of the NIDDK system, it would seem reasonable to direct further attempts at grading cellular rejection toward validating this system in additional medical centers. Given the relative infrequency with which actual graft loss is associated with cellular rejection, however, devoting extensive ongoing efforts to grading these lesions may have diminishing returns.

A perhaps insurmountable obstacle for any grading system will be the presence of confounding histologic variables that, if considered indiscriminately, would skew the grading. For example, the coexistence of mild cellular rejection plus centrilobular necrosis related to ischemia, drug effect, or preservation injury might be regarded as severe rejection in many of the schemes.

Of more immediate practical interest is the issue of whether additional immunosuppression is necessary in patients with mild forms of rejection. A recent series reported eight patients with spontaneously resolving acute cellular rejection [17]. There is a need for randomized, prospective trials addressing the necessity of additional immunosuppression in patients with mild or moderate grades of cellular rejection. Utilization of both the detailed scheme of Kemnitz et al [12] and the simpler scheme of Demetris et al [16] would seem reasonable in defining patients and tracking outcomes.





