## Pii: s0029-7844(99)00480-9

Progesterone, Inhibin, and hCG Multiple Marker Strategy to Differentiate Viable From Nonviable Pregnancies

*MAUREEN GLENNON PHIPPS, MD, JOSEPH W. HOGAN, ScD, JEFFREY F. PEIPERT, MD, MPH, GERALYN M. LAMBERT-MESSERLIAN, PhD, JACOB A. CANICK, PhD, AND DAVID B. SEIFER, MD*

**Objective:** To determine whether a combination of serum and urine biomarkers drawn from symptomatic pregnant women will help early differentiation of viable from nonviable pregnancies.

**Methods:** We conducted a prospective cohort study of 220 women who presented in the first trimester of pregnancy with complaints of pain, cramping, bleeding, or spotting. Serum samples for progesterone, inhibin A, and hCG, and urine beta-core hCG, were collected at presentation. To evaluate whether those biomarkers could predict viable and nonviable outcomes in pregnancy, we used likelihood ratios to compare operating characteristics of single and multiple biomarker strategies.

**Results:** Of 220 pregnancies studied, 98 were viable and 122 nonviable. Among single biomarkers, progesterone alone appears to have the greatest utility (area under the receiver operator characteristic curve = 0.923). Among dual-biomarker strategies, progesterone plus hCG and progesterone plus inhibin A improved specificity but not sensitivity. At 95% sensitivity, the combination of progesterone and hCG improved specificity from 0.29 to 0.66 (improvement = 0.37 [95% confidence interval 0.23, 0.52]). A triple-biomarker combination did not show substantial improvement over the dual-biomarker strategy. Also, combinations that used urine beta-core hCG did not improve diagnostic accuracy.

**Conclusion:** Serum progesterone appeared to be the single most specific biomarker for distinguishing viable from nonviable pregnancies. When a dual-biomarker strategy was applied, combining serum progesterone with hCG, specificity improved significantly, which suggests that a multiple biomarker strategy might help distinguish viable from nonviable pregnancies in early gestation. (Obstet Gynecol 2000;95:227–31. © 2000 by The American College of Obstetricians and Gynecologists.)

Many biomarkers in serum, including hCG, progesterone, estradiol (E2), alpha-fetoprotein, fetal fibronectin, and inhibin A, have been studied to determine whether they could help diagnose ectopic pregnancies.1–4 Progesterone’s utility as a biomarker has been well demonstrated.5–7 Inhibin A was shown to be lower in ectopic pregnancies compared with intrauterine pregnancies.3,8 Urine β-core hCG, the major metabolite of hCG in maternal urine, has been studied as a potential biomarker for determining ectopic pregnancies compared with normal pregnancies.9 Combinations of biomarkers have also been studied to support rapid diagnosis of ectopic pregnancies.1

The purpose of this study was to determine whether a combination of multiple serum and urine biomarkers from symptomatic women at first-trimester clinical presentation could differentiate a viable from a nonviable pregnancy. We conducted a prospective cohort study of women who presented with complaints of pain, cramping, bleeding, or spotting. Serum quantitative hCG, serum progesterone, serum inhibin A, and urine β-core hCG were evaluated independently and in combination to determine their accuracy in predicting viability of a pregnancy.

*From the University of Michigan Health System, Robert Wood Johnson Clinical Scholars Program and Department of Obstetrics and Gynecology, Ann Arbor, Michigan; the Center for Statistical Sciences, Department of Community Health, Brown University, Providence, Rhode Island; Department of Obstetrics and Gynecology and Department of Pathology and Laboratory Medicine, Women and Infants Hospital of Rhode Island, Providence, Rhode Island; and the University of Medicine and Dentistry of New Jersey-Robert Wood Johnson Medical School, Department of Obstetrics and Gynecology, New Brunswick, New Jersey.*

*The following companies provided assay reagents for this study: Diagnostic Products Corp., Los Angeles, California and Chiron Diagnostics Corp., Alameda, California.*
We used an observation cohort of 238 pregnant women who presented to the Women and Infants Hospital of Rhode Island urgent care unit between June 1996 and March 1997. Women were eligible if they presented with complaints of bleeding, spotting, pain, or cramping in the first trimester (less than 13 weeks’ gestation). We limited our sample to spontaneously conceived pregnancies. Pregnancy outcomes determined by review of the medical records were known in all subjects included in analysis. Of 238 women initially enrolled, 18 were excluded from analysis because pregnancy outcomes could not be definitively established in 12 and gestational age was unknown in six. Seven women who had therapeutic abortions were included in analysis because there was documentation of a viable pregnancy before termination. The Women and Infants Hospital Institutional Review Board approved the research in […]

To determine marker values, each eligible subject had serum drawn and a urine sample collected at presentation to the urgent care unit. Serum and urine samples were placed in aliquots and frozen at −20°C until assays were done. Commercially available assays were used to analyze samples for quantitative hCG (Immulite; Diagnostic Products Corp., Los Angeles, CA), progesterone (Immulite; Diagnostic Products Corp.), inhibin A (enzyme-linked immunosorbent assay by Serotec, Ltd., Oxford, United Kingdom), and urine β-core hCG (Titron; Chiron Diagnostics Corp., Alameda, CA). Assays were done without knowledge of pregnancy outcomes and results did not influence treatment. Besides pregnancy status and marker data, we recorded several demographic and historical variables including maternal age, gestational age, gravidity, parity, race, insurance status, and reproductive history.

For our primary analysis, we compared operating characteristics (ie, measures of diagnostic accuracy) of progesterone only (P), progesterone plus inhibin A (P+I), progesterone plus serum hCG (P+H), and progesterone plus serum hCG plus inhibin A (P+H+I). Our goals were to quantify information gained using multiple biomarker strategies compared with progesterone alone and to determine the nature of differences between single- and multiple-biomarker strategies. Following Haddow et al,10 we used likelihood ratios to quantify the maternal- and gestational-age-adjusted odds of nonviable pregnancies for a given combination of marker values. We used multivariate normal regression models11 to adjust for variability in biomarkers due to differences in maternal and gestational age and to account for correlations between biomarkers. Spearman nonparametric correlation was determined for each marker combination.

We estimated the area under the receiver operator characteristic (ROC) curve based on the likelihood ratios calculated for each screening strategy. To understand the differences between strategies, we compared specificity for given values of sensitivity and compared sensitivity for given values of specificity. For example, because progesterone alone was a very sensitive test, we evaluated various multiple biomarker strategies by comparing their specificity at 95% sensitivity. That involved determining the likelihood ratio cutoff values that corresponded to 95% sensitivity, estimating specificity for the cutoff, and estimating the difference in specificity (and associated 95% confidence interval [CI]) between biomarker combinations. Comparisons of sensitivity at given values of specificity were done similarly. Confidence intervals for the differences in sensitivity and specificity were calculated using the normal approximation to the sampling distribution of each estimated difference, making proper adjustment for […]

An important component of our analysis is the calculation of maternal- and gestational-age-adjusted likelihood ratios. Conditional on maternal and gestational age, the likelihood ratio for a woman with a particular set of marker values is the odds that she will have a nonviable pregnancy, denoted by $\mathrm{LR} = \Pr(\mathrm{NV} \mid M, C)/\Pr(\mathrm{V} \mid M, C)$, where $\Pr(A \mid B)$ is the conditional probability of $A$ given $B$, NV and V, respectively, denote nonviable and viable, $M$ represents a set of marker values, and $C$ denotes individual characteristics that might affect marker values (maternal and gestational age). Likelihood ratios can be computed by logistic regression,13 but using correlated markers can lead to loss of efficiency and problems with collinearity. Instead, we computed likelihood ratios by modeling marker distribution separately by viability status. Using Bayes’ theorem, the likelihood ratio can be expressed in terms of the (multivariate) marker distributions, so that $\mathrm{LR} = [\Pr(M \mid \mathrm{NV}, C)\,\Pr(\mathrm{NV} \mid C)]/[\Pr(M \mid \mathrm{V}, C)\,\Pr(\mathrm{V} \mid C)]$. In general, covariates such as maternal and gestational age might affect marker distribution and viability status. If we assume that maternal and gestational age do not affect belief about viability status before collecting marker data, the likelihood ratio is proportional to $\Pr_{\theta_1}(M \mid \mathrm{NV}, C)/\Pr_{\theta_0}(M \mid \mathrm{V}, C)$. The parameters $\theta_1$ and $\theta_0$ emphasize that marker values follow separate models by viability status.

Individual likelihood ratios are calculated by estimating $\theta_1$ and $\theta_0$, then evaluating $\Pr_{\hat\theta_1}(M \mid \mathrm{NV}, C)/\Pr_{\hat\theta_0}(M \mid \mathrm{V}, C)$ using individual marker values, maternal age, and gestational age. Models for the numerator (NV) and denominator (V) of the likelihood ratios were fit under the assumptions that, after a suitable transformation, each set of marker values follows a multivariate normal distribution and that the mean marker value varies linearly with maternal age and as a polynomial of up to cubic order in gestational age. We used natural log transformations for progesterone and inhibin A, and a 1/6 power transformation for hCG. The estimated parameters $\hat\theta_1$ and $\hat\theta_0$ contain regression parameters, marker standard deviations, and marker correlations for the fitted multivariate models. Each woman’s likelihood ratio is $\mathrm{LR}_i = f_{\hat\theta_1}(M_i \mid \mathrm{NV}, C_i)/f_{\hat\theta_0}(M_i \mid \mathrm{V}, C_i)$, where $i$ indexes subject and $f$ is the multivariate normal density function, with dimension equal to the number of markers. Under our assumptions, $\mathrm{LR}_i$ is proportional to (ie, has the same rank-ordering across individuals as) the odds of having a nonviable pregnancy for a given set of marker values, adjusted for maternal and gestational age. The implication is that it can be used for nonparametric calculations of sensitivity […]

Fitted models used for likelihood ratio calculations were checked using residual plots, and orthogonal polynomials were used to avoid overfitting of the marker distributions to covariates. All analyses were done using SAS Version 6.12 (SAS Institute, Cary, NC); multivariate models were fit using SAS Proc Mixed.

Pregnancy outcomes included 98 viable intrauterine pregnancies, 85 first-trimester spontaneous abortions, and 37 ectopic pregnancies. Viable intrauterine pregnancies were defined as viable pregnancies (45%); spontaneous abortions and ectopic pregnancies were defined as nonviable pregnancies (55%). The mean age of subjects with viable pregnancies was lower than that of subjects with nonviable pregnancies (24.1 versus 27.9 years, *P* < .01). Other demographic information and clinical characteristics are given in Table 1. There were no significant differences in race, gravidity, or parity.

**Table 1.** Demographic and Clinical Characteristics

Table entries are total *n* (%) except for age, which is mean ± […]

The overall distribution for each biomarker, given outcomes of viable or nonviable pregnancy, was divided into percentiles (Table 2). For progesterone, the viable pregnancy value for the 25th percentile was well above the nonviable pregnancy value at the 75th percentile. There was more overlap between other biomarkers. The greatest disparity between nonviable and viable pregnancies was in the distribution of progesterone values. As a single marker, progesterone had the greatest area under the ROC curve (0.923, compared with 0.795 for inhibin A, 0.646 for urine β-core hCG, and […]

In formulating multiple-marker strategies, we chose serum hCG rather than urine β-core hCG because the two correlated so strongly (Spearman rank correlation 0.90) and because serum hCG is the biomarker used most commonly by practitioners for evaluating pregnancy viability. For other biomarker combinations, the pairwise rank correlation ranged from 0.32 for progesterone and urine β-core hCG to 0.73 for serum hCG and […]

Multiple-marker strategies resulted in improvement in the area under the ROC curve. After incorporating maternal and gestational age–adjusted likelihood ratios, the area under the ROC curve for P was .91. When serum hCG was added it increased to .95 (P+H), with inhibin A to .94 (P+I), and the triple marker combination had an area of .95 (P+I+H).
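The likelihood-ratio calculation described in the Methods can be illustrated with a minimal Python sketch for the two-marker strategy P+H. It is not the authors' SAS code. Every parameter value in `THETA_NV` and `THETA_V` is hypothetical, the mean model is simplified to be linear in both covariates (the paper allows up to a cubic in gestational age), and the function names are invented for illustration. Only the transformations (natural log for progesterone, 1/6 power for hCG) and the ratio of two multivariate normal densities come from the text.

```python
import math

def bivariate_normal_logpdf(x, mean, sd, rho):
    """Log-density of a bivariate normal with given means, SDs, and correlation."""
    z1 = (x[0] - mean[0]) / sd[0]
    z2 = (x[1] - mean[1]) / sd[1]
    one_minus_r2 = 1.0 - rho * rho
    quad = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / one_minus_r2
    log_norm = math.log(2.0 * math.pi * sd[0] * sd[1]) + 0.5 * math.log(one_minus_r2)
    return -0.5 * quad - log_norm

# HYPOTHETICAL parameter sets (not estimates from the study).
# coef rows: (intercept, slope for maternal age, slope for gestational age)
# for transformed progesterone and transformed hCG, in that order.
THETA_NV = {"coef": [(1.5, 0.0, 0.05), (2.6, 0.0, 0.10)], "sd": (0.9, 0.6), "rho": 0.4}
THETA_V = {"coef": [(3.0, 0.0, 0.02), (3.0, 0.0, 0.15)], "sd": (0.4, 0.5), "rho": 0.3}

def transform(progesterone, hcg):
    """Natural log for progesterone, 1/6 power for hCG, as stated in the text."""
    return (math.log(progesterone), hcg ** (1.0 / 6.0))

def mean_vector(theta, maternal_age, gest_age_weeks):
    """Covariate-adjusted mean of each transformed marker (simplified linear model)."""
    return tuple(b0 + b_age * maternal_age + b_ga * gest_age_weeks
                 for b0, b_age, b_ga in theta["coef"])

def log_likelihood_ratio(progesterone, hcg, maternal_age, gest_age_weeks):
    """log LR_i = log f(M_i | NV, C_i) - log f(M_i | V, C_i)."""
    m = transform(progesterone, hcg)
    log_nv = bivariate_normal_logpdf(
        m, mean_vector(THETA_NV, maternal_age, gest_age_weeks),
        THETA_NV["sd"], THETA_NV["rho"])
    log_v = bivariate_normal_logpdf(
        m, mean_vector(THETA_V, maternal_age, gest_age_weeks),
        THETA_V["sd"], THETA_V["rho"])
    return log_nv - log_v

# With these made-up parameters, low progesterone favors "nonviable".
low = log_likelihood_ratio(5.0, 5000.0, maternal_age=25, gest_age_weeks=7)
high = log_likelihood_ratio(30.0, 5000.0, maternal_age=25, gest_age_weeks=7)
print(low > 0, high < 0)
```

Working on the log scale avoids underflow when densities are small, and because only the rank order of the ratios matters for sensitivity and specificity, the log ratio can be used directly as the score.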

**Table 2. **Biomarker Distribution Percentile

Area under the ROC curve is a global measure for
characterizing utility, and interpreting clinical implica-
tions of differences in area under the ROC curve can be
difficult. In more detail, we compared sensitivities at
fixed specificity, and specificity at fixed sensitivities.
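As a rough illustration of comparing specificity at a fixed sensitivity, the sketch below takes one score per subject (for example, a likelihood ratio), finds the cutoff that reaches a target sensitivity, and reports the specificity at that cutoff, alongside a rank-based area under the ROC curve. The function names, the toy scores, the choice of which outcome counts as "positive", and the handling of ties are all assumptions made for this example and are not taken from the study.

```python
import math

def specificity_at_sensitivity(pos_scores, neg_scores, target_sensitivity):
    """Pick the highest cutoff whose sensitivity reaches the target, then
    return (cutoff, sensitivity, specificity). A subject is called positive
    when its score is >= cutoff."""
    ranked = sorted(pos_scores, reverse=True)
    # Smallest number of positives that must be captured to reach the target;
    # the small epsilon guards against floating-point round-up in the product.
    needed = max(1, math.ceil(target_sensitivity * len(ranked) - 1e-9))
    cutoff = ranked[needed - 1]
    sensitivity = sum(s >= cutoff for s in pos_scores) / len(pos_scores)
    specificity = sum(s < cutoff for s in neg_scores) / len(neg_scores)
    return cutoff, sensitivity, specificity

def roc_auc(pos_scores, neg_scores):
    """Rank-based (Mann-Whitney) AUC: P(pos > neg) + 0.5 * P(tie)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Toy scores, invented for illustration only.
pos = [0.9, 0.8, 0.7, 0.4, 0.2]
neg = [0.6, 0.5, 0.3, 0.1, 0.05]
print(specificity_at_sensitivity(pos, neg, 0.8))  # (0.4, 0.8, 0.6)
print(roc_auc(pos, neg))                          # 0.8
```

Both functions depend only on the rank order of the scores, so any monotone rescaling of the score gives the same sensitivity, specificity, and area.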

Table 3 compares specificity for fixed sensitivities. At 95% sensitivity, the specificity for P was 29%. At the same sensitivity, using the dual biomarker strategy of P+I, the specificity increased to 57% and the specificity for P+H was 66%. Based on 95% confidence intervals for difference in specificity, those combinations represent statistically significant gains in specificity over P. Specificity for the combination P+H was 9 percentage points higher than for P+I (95% CI −0.02, 0.20). Table 4 shows comparisons between specificity for P and combinations with the other biomarkers when sensitivity is 85% and 95%, which indicated that P+H is the superior choice. At 95% sensitivity, P+I+H did not show an appreciable gain in specificity over P+H (0.69 compared with 0.66).

At 95% specificity, P had sensitivity of 83%. Table 5 shows minimal gains in sensitivity among the various biomarker strategies at fixed specificity values between […]

We evaluated diagnostic usefulness of biomarkers for distinguishing viable from nonviable early pregnancies in symptomatic women who presented for urgent care. It has been well established that measuring serial serum quantitative hCG is helpful in treating symptomatic women in early gestation.14 In clinical practice, the time delay necessary for distinguishing a viable from a nonviable pregnancy is often distressing to women and practitioners when women present with symptoms in an emergency setting. Adding serum progesterone to the diagnostic tests can be helpful in clinical management if the level is under 10 ng/mL, suggesting a nonviable pregnancy, or at least 25 ng/mL, suggesting a viable pregnancy.4,5 However, in our study, 41% of subjects had progesterone levels in the range of 10–25 ng/mL.

A highly sensitive and specific test to determine viability would be useful in a clinical setting when women present with acute symptoms and a decision regarding treatment must be made quickly. A highly sensitive biomarker more accurately identifies viable pregnancies (true positives) and a highly specific biomarker more accurately identifies nonviable pregnancies (true negatives). Treatment of women who present with cramping and spotting in the first trimester of pregnancy would be better guided by a sensitive and specific test that would reliably categorize prognoses for pregnancies. Thus far, there is no single test that accurately predicts pregnancy outcome in urgent situations.

Of the single biomarkers, progesterone has the greatest utility as measured by area under the ROC curve. Although dual-biomarker strategies improved the area under the ROC curve, we found that applying the multiple-biomarker strategies improved specificity but not sensitivity. For improving specificity, the dual biomarker combination progesterone plus serum hCG (P+H) is better than progesterone plus inhibin A (P+I) and as good as the triple biomarker combination progesterone plus inhibin A plus serum hCG (P+I+H). The addition of the urine β-core hCG was not helpful in […]

Our study had several limitations. It was conducted from a convenience sample of symptomatic women in early gestation who conceived spontaneously. Thus, the results cannot be generalized to asymptomatic pregnant women or women who pursued ovulation induction and assisted reproductive technologies to achieve pregnancy. Our computation of likelihood ratios relied on a (carefully constructed) parametric model to adjust for differences in maternal and gestational ages, both of which have underlying associations with various marker values. The assumptions are needed because of limited sample size. Having data from large, population-based samples would reduce the need to rely on […]

Our study suggests the need for formalizing a mul[…]

**Table 3.** Estimated Specificity at Given Sensitivity for […]

**Table 5.** Estimated Sensitivity at Given Specificity for […]

**Table 4.** Difference in Specificity

P = progesterone alone; P+I = progesterone plus inhibin A; P+H = progesterone plus serum hCG; P+I+H = progesterone plus inhibin A plus serum hCG.

95% confidence interval widths adjusted for the six multiple comparisons using the Bonferroni method.
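A Bonferroni-adjusted interval based on the normal approximation, as mentioned in the table footnote, might be computed along the following lines. This is a sketch under stated assumptions: the function name and the standard error of 0.05 are hypothetical, and the text does not say how the standard errors of the differences were estimated, so the standard error is simply an input here.

```python
from statistics import NormalDist

def bonferroni_ci(estimate, std_error, n_comparisons=6, family_alpha=0.05):
    """Normal-approximation CI whose per-comparison alpha is family_alpha / n_comparisons."""
    alpha_each = family_alpha / n_comparisons
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - alpha_each / 2.0)
    half_width = z * std_error
    return estimate - half_width, estimate + half_width

# Hypothetical standard error; 0.37 is the improvement quoted in the abstract.
unadjusted = bonferroni_ci(0.37, 0.05, n_comparisons=1)
adjusted = bonferroni_ci(0.37, 0.05, n_comparisons=6)
print(unadjusted, adjusted)
```

With six comparisons the critical value rises from about 1.96 to about 2.64, so each adjusted interval is roughly a third wider than its unadjusted counterpart.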
Source: http://health.bsd.uchicago.edu/thisted/epor/Papers/Hogan-Progesterone.pdf

