Proceedings of the 45th annual meeting of the association of computational linguistics

Interpreting Comparative Constructions in Biomedical Text
Marcelo Fiszman,1 Dina Demner-Fushman,2
Francois M. Lang,2 Philip Goetz,2
Thomas C. Rindflesch2
1University of Tennessee – GSM, Knoxville, TN 37920
2Lister Hill National Center for Biomedical Communications, National Library of Medicine, Bethesda, MD 20894
{ddemner|goetzp|flang|trindflesch}@mail.nih.gov
BioNLP 2007: Biological, translational, and clinical language processing, pages 137–144, Prague, June 2007. © 2007 Association for Computational Linguistics

Abstract

[...] underspecified semantic interpretation to [...] structures that are prevalent in the research literature reporting on clinical trials for [...] which constructs predications based on the Unified Medical Language System. Results of a preliminary evaluation were recall of [...] 81%. We discuss the generalization of the [...] therapeutic and diagnostic procedures. The available structures in computable format [...]

1 Introduction

As natural language processing (NLP) is increasingly able to support advanced information management techniques for research in medicine and biology, it is being incrementally improved to provide extended coverage and more accurate results. In this paper, we discuss the extension of an existing semantic interpretation system to address comparative structures. These structures provide a way of explicating the characteristics of one entity in terms of a second, thereby enhancing the description of the first. This phenomenon is important in clinical research literature reporting the results of clinical trials.

In the abstracts of these reports, a treatment for some disease is typically discussed using two types of comparative structures. The first announces that the (primary) therapy focused on in the study will be compared to some other (secondary) therapy. A [...] An outcome statement (2) often appears near the end of the abstract, asserting results in terms of the relative merits of the primary therapy compared to [...]

The processing of comparative expressions such as (1) and (2) was incorporated into an existing system, SemRep [Rindflesch and Fiszman, 2003; Rindflesch et al., 2005], which constructs semantic predications by mapping assertions in biomedical text to the Unified Medical Language System® (UMLS)® [Humphreys et al., 1998].

2 Background

2.1 Comparative structures in English

The range of comparative expressions in English is extensive and complex. Several linguistic studies have investigated their characteristics, with differing assumptions about syntax and semantics (for example [Ryan, 1981; Rayner and Banks, 1990; Staab and Hahn, 1997; Huddleston and Pullum, 2002]). Our study concentrates on structures in which two drugs are compared with respect to a shared attribute (e.g. how well they treat some disease). An assessment of their relative merit in this regard is indicated by their positions on a scale. The compared terms are expressed as noun phrases, which can be considered to be conjoined. The shared characteristic focused on is expressed as a predicate outside the comparative structure. An adjective or noun is used to denote the scale, and words such as than, as, with, and to serve as cues to identify the compared terms, the scale, and the relative position of the terms on the scale.

The first type of structure we address (called comp1 and illustrated in (3)) merely asserts that the primary and secondary terms (in bold) are being compared. A possible cue for identifying these structures is a form of compare. A further characteristic is that the compared terms are separated by a conjunction, or a preposition, as in (3).

(3) To compare misoprostol with dinoprostone for cervical ripening

As shown in (4), a scale may be mentioned (efficacy); however, in this study, we only identify the compared terms in structures of this type.

(4) [...] misoprostol with dinoprostone for [...]

In the more complex comparative expression we accommodate (called comp2), the relative ranking of two compared terms is indicated on a scale denoted by an adjective (e.g. effective in (5)). The relative position of the compared terms in scalar comparative structures of this type expresses either equality or inequality. Inequality is further divided into superiority, where the primary compared term is higher on the scale than the secondary, and inferiority, where the opposite is true. Cues associated with the adjective designating the scale signal these phenomena (e.g. as ADJ as in (5) for equality, ADJer than in (6) for superiority, and less ADJ than in (7) for inferiority).

(5) Azithromycin is as effective as erythromycin estolate for the [...]

(6) Naproxen is safer than aspirin in the treatment of the [...]

(7) Sodium valproate was [...] prochlorperazine in reducing pain

In examples (3) through (7), the characteristic the compared drugs have in common is treatment of some disorder, for example treatment of pertussis [...]

Few studies describe an implemented automatic analysis of comparatives; however, Friedman [Friedman, 1989] is a notable exception. Jindal and Liu [Jindal and Liu, 2006] use machine learning to identify some comparative structures, but do not provide a semantic interpretation. We exploit SemRep machinery to interpret the aspects of comparative structures just described.

SemRep [Rindflesch and Fiszman, 2003; Rindflesch et al., 2005] recovers underspecified semantic propositions in biomedical text based on a partial syntactic analysis and structured domain knowledge from the UMLS. Several systems that extract entities and relations are under development in both the clinical and molecular biology domains. Examples of systems for clinical text are described in [Friedman et al., 1994], [Johnson et al., 1993], [Hahn et al., 2002], and [Christensen et al., 2002]. In molecular biology, examples include [Yen et al., 2006], [Chun et al., 2006], [Blaschke et al., 1999], [Leroy et al., 2003], [Rindflesch et al., 2005], [Friedman et al., 2001], and [Lussier et al., 2006].

During SemRep processing, a partial syntactic parse is produced that depends on lexical look-up in the SPECIALIST lexicon [McCray et al., 1994] and a part-of-speech tagger [Smith et al., 2004]. MetaMap [Aronson, 2001] then matches noun phrases to concepts in the Metathesaurus® and determines the semantic type for each concept. For example, the structure in (9), produced for (8), allows both syntactic and semantic information to be used in further SemRep processing that [...]
(8) [...] treatment of gastroesophageal reflux disease

Predicates are derived from indicator rules that map syntactic phenomena (such as verbs and nominalizations) to relationships in the UMLS Semantic Network. Argument identification is guided by dependency grammar rules as well as constraints imposed by the Semantic Network. In processing (8), for example, an indicator rule links the nominalization treatment with the Semantic Network relation “Pharmacologic Substance TREATS Disease or Syndrome.” Since the semantic types of the syntactic arguments identified for treatment in this sentence (‘Pharmacologic Substance’ for “lansoprazole” and ‘Disease or Syndrome’ for “Gastroesophageal reflux disease”) match the corresponding semantic types in the relation from the Semantic Network, the predication in (10) is constructed, where subject and object are Metathesaurus concepts.

3.1 Linguistic patterns

We extracted sentences for developing comparative processing from a set of some 10,000 MEDLINE citations reporting on the results of clinical trials, a rich source of comparative structures. In this sample, the most frequent patterns for comp1 (only announces that two terms are compared) and comp2 (includes a scale and positions on that scale) are given in (11) and (12). In the patterns, Term1 and Term2 refer to the primary and secondary compared terms, respectively. “{BE}” means that some form of be is optional, and slash indicates disjunction. These patterns served as guides for enhancing SemRep argument identification machinery but were not implemented as such. That is, they indicate necessary components but do not preclude [...]

(11) comp1: Compared terms
C1: Term1 {BE} compare with/to Term2
C2: compare Term1 with/to Term2
C3: compare Term1 and/versus Term2
C4a: Term1 comparison with/to Term2
C4b: comparison of Term1 with/to Term2
C4c: comparison of Term1 and/versus Term2
[...]

(12) comp2: Scalar patterns
S1: Term1 BE as ADJ as {BE} Term2
S2a: Term1 BE more ADJ than {BE} Term2
S2b: Term1 BE ADJer than {BE} Term2
S2c: Term1 BE less ADJ than {BE} Term2
[...]
S4: Term1 BE superior to Term2
[...]

As with SemRep in general, the interpretation of comparative structures exploits underspecified syntactic structure enhanced with Metathesaurus concepts and semantic types. Semantic groups [McCray et al., 2001] from the Semantic Network are also available. For this project, we exploit the group Chemicals & Drugs, which contains such semantic types as ‘Pharmacologic Substance’, ‘Antibiotic’, and ‘Immunologic Factor’. (The principles used here also apply to compared terms with semantic types from other semantic groups, such as ‘Procedures’.)

In the comp1 patterns, a form of compare acts as an indicator of a comparative predication. In comp2, the adjective serves that function. Other words appearing in the patterns cue the indicator word (in comp2) and help identify the compared terms (in both comp1 and comp2). The conjunction versus is special in that it cues the secondary compared term (Term2) in comp1, but may also indicate a comp1 structure in the absence of a form of compare (C5).

3.2 Interpreting comp1 patterns

When SemRep encounters a form of compare, it assumes a comp1 structure and looks to the right for the first noun phrase immediately preceded by with, to, and, or versus. If the head of this phrase is mapped to a concept having a semantic type in the group Chemicals & Drugs, it is marked as the secondary compared term. The algorithm then looks to the left of that term for a noun phrase having a semantic type also in the group Chemicals & Drugs, which becomes the primary compared term. When this processing is applied to (13), the semantic predication (14) is produced, in which the [...] argument is the primary compared term and the other is the secondary. As noted earlier, although a scale is sometimes asserted in these structures (as in (13)), SemRep does not retrieve it. An assertion regarding position on the scale never appears in comp1 structures.

(13) [...] tolerability of Hypericum perforatum with imipramine in [...]

SemRep considers noun phrases occurring immediately to the right and left of versus as being compared terms if their heads have been mapped to Metathesaurus concepts having semantic types belonging to the group Chemicals & Drugs. Such noun phrases are interpreted as part of a comp1 structure, even if a form of compare has not occurred. The predication (16) is derived from (15).

(15) Intravenous lorazepam versus dimenhydrinate for treatment of [...]

SemRep treats compared terms as being coordinated. For example, this identification allows both “Lorazepam” and “Dimenhydrinate” to function as arguments of TREATS in (15). Consequently, in addition to (16), the predications in (17) are returned as the semantic interpretation of (15). Such processing is done for all comp1 and comp2 structures (although these results are not given for (13) and are not further discussed in this paper).

3.3 Interpreting comp2 patterns

In addition to identifying two compared terms when processing comp2 patterns, a scale must be named and the relative position of the terms on that scale indicated. The algorithm for finding compared terms in comp2 structures begins by locating one of the cues as, than, or to and then examines the next noun phrase to the right. If its head has been mapped to a concept with a semantic type in the group Chemicals & Drugs, it is marked as the secondary compared term. As in comp1, the algorithm then looks to the left for the first noun phrase having a head in the same semantic group, and that phrase is marked as the primary compared term.

To find the scale name, SemRep examines the secondary compared term and then locates the first adjective to its left. The nominalization of that adjective (as found in the SPECIALIST Lexicon) is designated as the scale and serves as an argument of the predicate SCALE in the interpretation. For adjectives superior and inferior (patterns S4 and S5 in (12)) the scale name is “goodness.”

In determining relative position on the scale, equality is contrasted with inequality. If the adjective of the construction is immediately preceded by as (pattern S1 in (12) above), the two compared terms have the same position on the scale (equality), and are construed as arguments of a predication with predicate SAME_AS. In all other comp2 constructions, the compared terms are in a relationship of inequality. The primary compared term is considered higher on the scale unless the adjective is inferior or is preceded by less, in which case the secondary term is higher. HIGHER_THAN and LOWER_THAN are used to construct predications with the compared terms to interpret position on the scale. The equality construction in (18) is expressed as the predications in (19).

(18) Candesartan is as effective as lisinopril once daily in [...]

(19) Candesartan COMPARED_WITH [...]

The superiority construction in (20) is expressed as the predications in (21).

(20) Losartan was more effective than atenolol in reducing [...] hypertension, diabetes, and LVH.

(21) Losartan COMPARED_WITH Atenolol [...]
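The comp1 and comp2 procedures described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SemRep code: the `Item` class, the pre-chunked input, the tuple shape of the output predications, the list of forms of compare, and the small `NOMINALIZATION` table (standing in for the SPECIALIST Lexicon lookup) are all hypothetical. The handling of versus without a form of compare is left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

DRUGS = "Chemicals & Drugs"                       # semantic group named in the text
COMPARE_FORMS = {"compare", "compared", "comparing"}   # illustrative list only
COMP1_CUES = {"with", "to", "and", "versus"}
COMP2_CUES = {"as", "than", "to"}
# Hypothetical stand-in for the SPECIALIST Lexicon nominalization lookup.
NOMINALIZATION = {"effective": "effectiveness", "safer": "safety"}


@dataclass
class Item:
    text: str
    kind: str                     # "np" (noun phrase), "adj", or "word"
    group: Optional[str] = None   # semantic group of the noun phrase head


def is_drug(item: Item) -> bool:
    return item.kind == "np" and item.group == DRUGS


def find_terms(items: List[Item], cues: List[int]) -> Optional[Tuple[int, int]]:
    """Secondary term: drug NP right after a cue. Primary: nearest drug NP to its left."""
    for i in cues:
        if i + 1 < len(items) and is_drug(items[i + 1]):
            for j in range(i - 1, -1, -1):
                if is_drug(items[j]):
                    return j, i + 1
    return None


def interpret_comp1(items: List[Item]) -> list:
    start = next((i for i, it in enumerate(items)
                  if it.kind == "word" and it.text.lower() in COMPARE_FORMS), None)
    if start is None:
        return []
    cues = [i for i in range(start + 1, len(items))
            if items[i].kind == "word" and items[i].text.lower() in COMP1_CUES]
    terms = find_terms(items, cues)
    if terms is None:
        return []
    pri, sec = terms
    return [(items[pri].text, "COMPARED_WITH", items[sec].text)]


def interpret_comp2(items: List[Item]) -> list:
    cues = [i for i, it in enumerate(items)
            if it.kind == "word" and it.text.lower() in COMP2_CUES]
    terms = find_terms(items, cues)
    if terms is None:
        return []
    pri, sec = terms
    # Scale: first adjective to the left of the secondary compared term.
    adj = next((k for k in range(sec - 1, -1, -1) if items[k].kind == "adj"), None)
    if adj is None:
        return []
    a = items[adj].text.lower()
    scale = "goodness" if a in {"superior", "inferior"} else NOMINALIZATION.get(a, a)
    prev = items[adj - 1].text.lower() if adj > 0 else ""
    if prev == "as":                          # pattern S1: equality
        rel = "SAME_AS"
    elif a == "inferior" or prev == "less":   # inferiority
        rel = "LOWER_THAN"
    else:                                     # superiority
        rel = "HIGHER_THAN"
    p, s = items[pri].text, items[sec].text
    return [(p, "COMPARED_WITH", s), ("SCALE", scale), (p, rel, s)]


# Sentence (3), chunked by hand for the example.
s3 = [Item("To", "word"), Item("compare", "word"), Item("misoprostol", "np", DRUGS),
      Item("with", "word"), Item("dinoprostone", "np", DRUGS),
      Item("for", "word"), Item("cervical ripening", "np")]
print(interpret_comp1(s3))
```

Working from a flat list of chunks mirrors the underspecified analysis the text describes: only the cue words, the adjective, and the semantic group of each noun phrase head are consulted.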
The inferiority construction in (22) is expressed as [...]

(22) Morphine-6-glucoronide was [...] morphine in producing pupil [...]

3.4 Accommodating negation

Negation in comparative structures affects the position of the compared terms on the scale, and is accommodated differently for equality and for inequality. When a scalar comparison of equality (pattern S1, as ADJ as) is negated, the primary term is lower on the scale than the secondary (rather than being at least equal). For example, in interpreting the negated equality construction in [...]

(24) Amoxicillin-clavulanate was not as effective as ciprofloxacin [...]

For patterns of inequality, SemRep negates the predication indicating position on the scale. For example, the predications in (27) represent the negated superiority comparison in (26). Negation of inferiority comparatives (e.g. “X is not less effective than Y”) is extremely rare in our sample.

(26) These data show that celecoxib is not better than diclofenac (P = 0.414) in terms of ulcer complications.

(27) [...] NEG_HIGHER_THAN [...]

4 Evaluation

To evaluate the effectiveness of the developed methods we created a test set of 300 sentences containing comparative structures. These were extracted by the second author (who did not participate in the development of the methodology) from 3000 MEDLINE citations published later in date than the citations used to develop the methodology. The citations were retrieved with a PubMed query specifying randomized controlled studies and comparative studies on drug therapy. Sentences containing direct comparisons of the pharmacological actions of two drugs expressed in the target structures (comp1 and comp2) were extracted starting from the latest retrieved citation [...] comparative structures had been examined. These were annotated with the PubMed ID of the citation, names of two drugs (COMPARED_WITH predication), the scale on which they are compared (SCALE), and the relative position of the primary drug with respect to the secondary (SAME_AS, [...] SemRep and evaluated against the annotated test set.

We then computed recall and precision in several ways: overall for all comparative structures, for comp1 structures only, and for comp2 structures only. To understand how the overall identification of comparatives is influenced by the components of the construction, we also computed recall and precision separately for drug names, scale, and position on scale (SAME_AS, [...] together). Recall measures the proportion of manually annotated categories that have been correctly identified automatically. Precision measures what proportion of the automatically annotated categories is correct.

In addition, the overall identification of comparative structures was evaluated using the F-measure [Rijsbergen, 1979], which combines recall and precision. The F-measure was computed using macro-averaging and micro-averaging. Macro-averaging was computed over each category first and then averaged over the three categories (drug names, scale, and position on scale). This approach gives equal weight to each category. In micro-averaging (which gives an equal weight to the performance on each sentence) recall and precision were obtained by summing over all individual sentences. Because it is impossible to enumerate all entities and relations which are not drugs, scale, or position we did not use the classification error rate and other metrics that require computing of true negative values.

Upon inspection of the SemRep processing results we noticed that the test set contained nine duplicates. In addition, four sentences were not processed for various technical reasons. We report the results for the remaining 287 sentences, which contain 288 comparative structures occurring in 168 MEDLINE citations. Seventy-four citations contain 85 comp2 structures. The remaining 203 structures are comp1.

Correct identification of comparative structures of both types depends on two factors: 1) recognition of both drugs being compared, and 2) recognition of the presence of a comparative structure itself. In addition, correct identification of the comp2 structures depends on recognition of the scale on which the drugs are compared and the relative position of the drugs on the scale. Table 1 presents recall, precision, and F-score reflecting [...]

Task Recall Precision [...]

We considered drug identification to be correct only if both drugs participating in the relationship were identified correctly. The recall results indicate that approximately 30% of the drugs and comparative structures of comp1, as well as 40% of comp2 structures, remain unrecognized; however, all components are identified with high precision. Macro-averaging over compared drug names, scale, and position on scale categories we achieve an F-score = 0.78. The micro-average score for 287 comparative sentences is 0.5.

5 Discussion

In examining SemRep errors, we determined that more than 60% of the false negatives (for both comp1 and comp2) were due to “empty heads” [Chodorow et al., 1985; Guthrie et al., 1990], in which the syntactic head of a noun phrase does not reflect semantic thrust. Such heads prevent SemRep from accurately determining the semantic type and group of the noun phrase. In our sample, expressions interpreted as empty heads include those referring to drug dosage and formulations, such as extended release (the latter often abbreviated as XR). Examples of missed interpretations are in sentences (28) and (29), where the empty heads are in bold. Ahlers et al. [Ahlers et al., 2007] discuss enhancements to SemRep for accommodating empty heads. These mechanisms are being incorporated into the processing for comparative structures.

(28) Oxybutynin 15 mg was more effective than propiverine 20 mg [...] patients.

(29) [...] effective as oxybutynin immediate release for increasing bladder [...]

False positives were due exclusively to word sense ambiguity. For example, in (30) bid (twice a day) was mapped to the concept “BID protein”, which belongs to the semantic group Chemicals & Drugs. The most recent version of MetaMap, which will soon be called by comparative processing, exploits word sense disambiguation [Humphrey et al., 2006] and will likely resolve [...]

(30) Retapamulin ointment 1% (bid) [...] oral cephalexin (bid) for 10 days in treatment of patients with SID, and was well tolerated.

Although, in this paper, we tested the method on structures in which the compared terms belong to the semantic group Chemicals & Drugs, we can straightforwardly generalize the method by adding other semantic groups to the algorithm. For example, if SemRep recognized the noun phrases in bold in (31) and (32) as belonging to the group Procedures, comparative processing could proceed [...]

(31) Comparison of multi-slice spiral CT and magnetic resonance imaging in evaluation of the un-[...]

(32) Dynamic multi-slice spiral CT is better than dynamic magnetic resonance to some extent in [...]

The semantic predications returned by SemRep to represent comparative expressions can be considered a type of executable knowledge that supports reasoning. Since the arguments in these predications have been mapped to the UMLS, a structured knowledge source, they can be manipulated using that knowledge. It is also possible to compute the transitive closure of all SemRep output for a collection of texts to determine which drug was asserted in that collection to be the best with respect to some characteristic. This ability could be very useful in supporting question-answering applications.

As noted earlier, it is common in reporting on the results of randomized clinical trials and systematic reviews that a comp1 structure appears early in the discourse to announce the objectives of the study and that a comp2 structure often appears near the end to give the results. Another example of this phenomenon appears in (33) and (34) (from [...]

(33) To compare the efficacy of famotidine and omeprazole in [...]

(34) Omeprazole is more effective than famotidine for the control of [...]

We suggest one example of an application that can benefit from the information provided by the knowledge inherent in the semantic interpretation of comparative structures, and that is the interpretation of outcome statements in MEDLINE citations, as a method for supporting automatic access to the latest results from clinical trials [...]

6 Conclusion

We expanded a symbolic semantic interpreter to identify comparative constructions in biomedical text. The method relies on underspecified syntactic analysis and domain knowledge from the UMLS. We identify two compared terms and scalar comparative structures in MEDLINE citations. Although we restricted the method to comparisons of drug therapies, the method can be easily generalized to other entities such as diagnostic and therapeutic procedures. The availability of this information in computable format can support the identification of outcome sentences in MEDLINE, which in turn supports translation of biomedical research into improvements in quality of patient care.

Acknowledgement

This study was supported in part by the Intramural Research Programs of the National Institutes of Health, National Library of Medicine.

References

Ahlers C, Fiszman M, Demner-Fushman D, Lang F, Rindflesch TC. 2007. Extracting semantic [...]
Aronson AR. 2001. Effective mapping of biomedical text to the UMLS Metathesaurus: The MetaMap [...]
Blaschke C, Andrade MA, Ouzounis C, and Valencia A. 1999. Automatic extraction of biological information from scientific text: protein-protein interactions. Proceedings of the 7th International Conference on Intelligent Systems for Molecular Biology. Morgan [...]
Christensen L, Haug PJ, and Fiszman M. 2002. [...] understanding system. Proceedings of the Workshop on Natural Language Processing in the Biomedical Domain, Association for Computational Linguistics, [...]
Chodorow MS, Byrd RJ, and Heidorn GE. 1985. Extracting Semantic Hierarchies from a Large On-Line Dictionary. Proceedings of the 23rd Annual Meeting of the Association for Computational [...]
Chun HW, Tsuruoka Y, Kim J-D, Shiba R, Nagata N, Hishiki T, and Tsujii J. 2006. Extraction of gene-disease relations from Medline using domain dictionaries and machine learning. Pac Symp [...]
Friedman C. 1989. A general computational treatment of the comparative. Proc 27th Annual Meeting Assoc [...]
Friedman C, Alderson PO, Austin JH, Cimino JJ, and Johnson SB. 1994. A general natural-language text processor for clinical radiology. J Am Med Inform [...]
Friedman C, Kra P, Yu H, Krauthammer M, and Rzhetsky A. 2001. GENIES: a natural-language processing system for the extraction of molecular pathways from journal articles. Bioinformatics, 17 [...]
Guthrie L, Slater BM, Wilks Y, Bruce R. 1990. Is there content in empty heads? Proceedings of the 13th Conference on Computational Linguistics, v3:138 – [...]
[...] MEDSYNDIKATE--a natural language system for the extraction of medical information from findings reports. Int J Med Inf, 67(1-3):63-74.
Huddleston R and Pullum GK. 2002. The Cambridge Grammar of the English Language. Cambridge [...]
Humphrey SM, Rogers WJ, Kilicoglu H, Demner-Fushman D, Rindflesch TC. 2006. Word sense disambiguation by selecting the best semantic type based on Journal Descriptor Indexing: Preliminary experiment. J Am Soc Inf SciTech 57(1):96-113.
Humphreys BL, Lindberg DA, Schoolman HM, and Barnett OG. 1998. The Unified Medical Language System: An informatics research collaboration. J Am Med Inform Assoc, 5(1):1-11.
Jindal N and Liu B. 2006. Identifying comparative sentences in text documents. Proceedings of the 29th Annual International ACM SIGIR Conference on Research & Development on [...]
Johnson SB, Aguirre A, Peng P, and Cimino J. 1993. Interpreting natural language queries using the UMLS. Proc Annu Symp Comput Appl Med Care, [...]
Leroy G, Chen H, and Martinez JD. 2003. A shallow parser based on closed-class words to capture relations in biomedical text. J Biomed Inform, 36(3):145-158.
Lussier YA, Borlawsky T, Rappaport D, Liu Y, and Friedman C. 2006. PhenoGO: assigning phenotypic context to Gene Ontology annotations with natural language processing. Pac Symp Biocomput, 64-75.
McCray AT, Srinivasan S, and Browne AC. 1994. Lexical methods for managing variation in biomedical terminologies. Proc Annu Symp Comput Appl Med Care, 235-9.
McCray AT, Burgun A, and Bodenreider O. 2001. Aggregating UMLS semantic types for reducing conceptual complexity. Medinfo, 10(Pt 1): 216-20.
Rayner M and Banks A. 1990. An implementable semantics for comparative constructions. Computational Linguistics, 16(2):86-112.
Rindflesch TC. 1995. Integrating natural language processing and biomedical domain knowledge for increased information retrieval effectiveness. Proc 5th Annual Dual-use Technologies and Applications Conference, [...]
Rindflesch TC and Fiszman M. 2003. The interaction of domain knowledge and linguistic structure in natural language processing: Interpreting hypernymic propositions in biomedical text. J Biomed Inform, [...]
Rindflesch TC, Fiszman M, and Libbus B. 2005. Semantic interpretation for the biomedical research literature. Medical informatics: Knowledge management and data mining in biomedicine. [...]
Rijsbergen V. 1979. Information Retrieval, [...]
Ryan K. 1981. Corepresentational grammar and parsing English comparatives. Proc 19th Annual Meeting [...]
Smith L, Rindflesch T, and Wilbur WJ. 2004. MedPost: a part-of-speech tagger for biomedical text. [...]
Staab S and Hahn U. 1997. Comparatives in context. Proc 14th National Conference on Artificial Intelligence and 9th Innovative Applications of Artificial Intelligence Conference, 616-621.
Yen YT, Chen B, Chiu HW, Lee YC, Li YC, and Hsu CY. 2006. Developing an NLP and IR-based algorithm for analyzing gene-disease relationships. [...]
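The macro- and micro-averaged F-measure described in the Evaluation section can be illustrated with a short Python sketch. It assumes that each category (or sentence) is summarized by counts of true positives, false positives, and false negatives, and that the macro score is the mean of per-category F-scores; the paper does not spell out either detail, and the function names are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Counts = Tuple[int, int, int]  # (true positives, false positives, false negatives)


def prf(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall, and balanced F-measure from raw counts (no true negatives needed)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f


def macro_f(per_category: Dict[str, Counts]) -> float:
    """Score each category first, then average: every category weighs the same."""
    scores = [prf(*counts)[2] for counts in per_category.values()]
    return sum(scores) / len(scores)


def micro_f(per_sentence: Iterable[Counts]) -> float:
    """Sum the counts over all sentences, then score once."""
    tp = fp = fn = 0
    for s_tp, s_fp, s_fn in per_sentence:
        tp, fp, fn = tp + s_tp, fp + s_fp, fn + s_fn
    return prf(tp, fp, fn)[2]
```

None of these measures requires true negatives, which is consistent with the reason given in the text for not reporting the classification error rate.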

Source: http://acl.ldc.upenn.edu/W/W07/W07-1018.pdf
