Key words: Liver Transplantation; Rejection; Grading
Department of Laboratory Medicine and Pathology, Mayo Clinic and Mayo Foundation, Rochester, MN
This task force recommended using the terms "humoral rejection", "acute" or "cellular" rejection, and "chronic" or "ductopenic" rejection to refer to the three major types of liver allograft rejection [Table 1]. The terms "cellular" and "ductopenic" rejection will be used herein, with the recognition that they are synonymous with "acute" and "chronic" rejection. Humoral rejection refers to the uncommon antibody- and complement-mediated rejection that occurs in the first week post-transplant in pre-sensitized individuals. Cellular rejection refers to predominantly inflammatory cell-mediated rejection that primarily affects interlobular bile ducts and vascular endothelia and is generally reversible with additional immunosuppression. This category of rejection has generated the most discussion on grading (see below). Ductopenic rejection refers to a relatively uncommon form of rejection that typically occurs several months or more following transplant, is usually irreversible, and is defined by loss of bile ducts and the presence of obliterative vasculopathy. The term "rejection, indefinite for chronicity (indefinite for bile duct loss)" was used to address cases in which cellular rejection is present with possible but not definite loss of ducts. Since most cases of ductopenic rejection are associated with antecedent episodes of cellular rejection, cases in this category may represent examples of "early" ductopenic rejection, which may be better candidates for alterations in the immunosuppressive regimen to effect reversal of this usually irreversible process.
Porter's review in 1969 described in detail various morphologic aspects of liver allograft rejection but did not address the issue of grading. Little was written about liver allograft rejection until 1983 [3, 4]. In the next several years, a number of authors further clarified the histopathologic features of rejection, largely in the context of distinguishing rejection qualitatively from other disease processes, but most authors again did not address the concept of grading [5-8].
Grading was addressed peripherally in 1984 by Eggink et al, who divided cases of cellular rejection according to the severity of inflammation; however, all forms responded to additional immunosuppression, and thus grading did not appear to have clinical relevance. In 1985, Williams et al more formally addressed grading with a defined I-III system for cellular rejection; however, only six patients were studied and all responded to additional immunosuppression.
The first study that attempted to assess the clinical relevance of grading liver allograft rejection was by Snover et al in 1987. Using a four-grade system [Table 2], it was noted that rejection with arteritis, ballooning degeneration with confluent necrosis, or paucity of ducts was associated with unfavorable outcomes. Reproducibility of the classification was not addressed, however, and the study was based on only 36 patients with rejection.
In 1989, Kemnitz et al used a modification of Williams et al's grading system for rejection [Table 2] in examining 329 biopsy specimens from 81 patients. They found that patients who experienced multiple episodes of rejection were more likely to have a severe grade of rejection and a severe grade of bile duct injury. Again, reproducibility of grading was not addressed. Furthermore, clinical outcome was not assessed: the study was retrospective and used the number of biopsy-proven episodes of rejection as the measure of unfavorable outcome against which the grade of rejection was judged; graft and patient survival were not addressed.
The first detailed study of the intra- and inter-observer variation in the histopathologic assessment of liver allograft rejection was by Demetris et al in 1991 using the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) national liver transplant data base. In that study, a panel of experienced pathologists showed that the qualitative diagnosis of cellular rejection, as well as grading of the individual histologic features important in establishing that diagnosis, was highly reproducible. The diagnosis of ductopenic rejection, however, was less reproducible. This study did not address grading of rejection in a detailed fashion, focusing instead on grading a large number of individual histologic features of rejection and dividing rejection into just acute and chronic forms. Furthermore, clinical relevance of the grading of individual features was not addressed.
The next major attempt at grading liver allograft rejection, by Gupta et al from the Royal Free Hospital, was published in 1995. This was a retrospective study involving statistical analysis of the results of single biopsy specimens taken from 106 patients between 1 and 8 days post-transplant. Using stepwise logistic discriminant analysis on twenty histologic features potentially identifying cellular rejection, they confirmed the diagnostic importance of "Snover's triad" as well as eosinophils in the diagnosis of cellular rejection. They also applied a new 0-12 grading system for cellular rejection [Table 3]. They found that their new grading system correlated well with those of Snover and Demetris. An advantage is that the grading criteria are restricted to Snover's triad and do not depend on hepatocellular necrosis, since distinguishing rejection-related necrosis from coincident necrosis can be difficult.
This study has several drawbacks, however. It involved only specimens from the first 8 days post-transplant, a period of time in which the differential diagnosis of portal lymphoid processes does not include viral hepatitis or recurrent primary disease, making the diagnosis of cellular rejection much more straightforward. Furthermore, the ability of the grade to predict outcome could not be assessed, since the diagnosis of cellular rejection was based, in part, on response to therapy. Thus, cellular rejection which spontaneously resolved or was resistant to therapy would have been miscategorized. Thirdly, the inter- or intra-observer reproducibility of the system was not assessed. Lastly, this 0-12 system is the most complex of the proposed systems to date, making it perhaps more useful for study purposes than in daily practice.
The latest grading system was published by Demetris et al in 1995, again utilizing the NIDDK liver transplant data base. This study addressed both reproducibility of grading and prognostic value of grading in separate arms of the study, using a grading system which divided cellular rejection into 4 grades (none, mild, moderate, and severe) [Table 2].
For the reproducibility arm, fifty pre-selected cases were reviewed on three separate occasions in a blinded fashion by five pathologists at four different institutions. For the third reading the pathologists were given rudimentary clinical data. A number of individual histologic features were recorded and rejection, if thought to be present, was graded using the aforementioned system. The minimum threshold for cellular rejection was the presence of two of the three components of Snover's triad. The "gold standard" assessment of the diagnosis had been pre-determined based on the original biopsy interpretation and the clinical context. Intra-rater agreement for the individual histologic features of portal inflammation, bile duct inflammation/damage, and endotheliitis was good (kappas 0.59 to 0.69), as was that for the grading of cellular rejection (kappas 0.55 and 0.58). The addition of clinical information helped only slightly. Inter-rater agreement for the same three individual histologic features was moderate to good (kappas 0.48 to 0.61), as was that for the grading of cellular rejection (kappas 0.40 to 0.55).
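The kappa statistics quoted above measure agreement beyond what would be expected by chance. As an illustration only, the following minimal Python sketch computes an unweighted Cohen's kappa for two readers grading the same biopsies. The function name `cohen_kappa`, the lists `reader_a` and `reader_b`, and the grades they contain are hypothetical and are not taken from the NIDDK study, which may have used a different form of the statistic (for example, a weighted or multi-rater kappa).

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")

    n = len(ratings_a)

    # Observed agreement: fraction of cases given the same grade by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: for each grade, the product of the two raters'
    # marginal proportions, summed over all grades.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[g] * freq_b[g] for g in freq_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical grade for every case.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical grades for ten biopsies (invented for illustration).
reader_a = ["none", "none", "mild", "mild", "moderate",
            "moderate", "severe", "severe", "mild", "none"]
reader_b = ["none", "mild", "mild", "mild", "moderate",
            "severe", "severe", "severe", "moderate", "none"]

# The readers agree on 7 of 10 cases; chance agreement is 0.25.
print(round(cohen_kappa(reader_a, reader_b), 2))
```

With these invented data the two readers agree on 70% of cases, yet kappa is lower than the raw agreement because a quarter of the cases would be expected to match by chance alone. This is why kappa, rather than percent agreement, is the usual summary in reproducibility studies of this kind.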
For the prognostic arm, 295 patients from three institutions were utilized. The ability of the grade of cellular rejection or a number of graded individual histologic findings to predict an unfavorable outcome was assessed. Unfavorable short-term outcomes were defined as one of the following: rejection-related graft failure, OKT3- or ALG-requiring rejection, requirement for any secondary treatment (after initial steroid therapy), or failure of complete resolution to occur within 21 days. Long-term unfavorable outcome was defined as rejection-related graft failure or death within six months of the rejection episode. A statistically significant trend was noted between grade of cellular rejection and short-term (p=0.005) or long-term (p=0.006) unfavorable outcomes. Mild, moderate, and severe cellular rejection were associated with 37%, 48%, and 75% unfavorable short-term outcomes and 1%, 12%, and 14% unfavorable long-term outcomes, respectively.
Given the apparent utility of the NIDDK system, it would seem reasonable to direct further attempts at grading cellular rejection toward validating this system in additional medical centers. Given the relative infrequency with which actual graft loss is associated with cellular rejection, however, devoting extensive ongoing efforts to grading these lesions may have diminishing returns.
A perhaps insurmountable obstacle for any grading system will be the presence of confounding histologic variables that, if considered indiscriminately, would skew the grading. For example, the coexistence of mild cellular rejection plus centrilobular necrosis related to ischemia, drug effect, or preservation injury might be regarded as severe rejection in many of the schemes.
Of more immediate practical interest is the issue of whether additional immunosuppression is necessary in patients with mild forms of rejection. A recent series reported eight patients with spontaneously resolving acute cellular rejection. There is a need for randomized, prospective trials addressing the necessity of additional immunosuppression in patients with mild or moderate grades of cellular rejection. Utilization of both the detailed scheme of Kemnitz et al and the simpler scheme in Demetris et al would seem reasonable in defining patients and tracking outcomes.