... covalent transfer of a methyl group to either N-6 of adenine or C-5 or N-4 of cytosine. We note that while the definition may appear restrictive, methylation of adenine N-6 or cytosine C-5/N-4 encompasses the entire set of ways in which DNA can be methylated. The GO definition could thus be adopted without limitation to the scope of the annotation.

Figure 2 Event representation. BioNLP Shared Task representation for annotation of phosphorylation events (above) and representation applied for DNA methylation (below).

Ohta et al. Journal of Biomedical Semantics 2011, 2(Suppl 5):S2 http://www.jbiomedsem.com/content/2/S5/S

Document selection

The selection of source documents for an annotated corpus is critical for ensuring that the corpus provides relevant and representative material for studying the phenomena of interest. Domain corpora frequently consist of documents from a particular subdomain of interest: for example, the GENIA corpus focuses on documents concerning transcription factors in human blood cells [31]. Methods trained and evaluated on such focused resources will not necessarily generalize well to broader domains. However, there has been little study of the effect of document selection on event extraction performance. Here, we applied two distinct strategies to obtain a representative sample of the full scope of DNA methylation events in the literature, to ensure that our annotations are relevant to the interests of biologists, and to make our results applicable to the overall distribution of DNA methylation events in the literature.

In the first strategy, we aimed in particular to select a representative sample of documents relevant to the targeted event types. For this purpose, we directly searched the PubMed literature database. We further decided not to include any text-based query in the search, to avoid biasing the selection toward particular entities or forms of event expression.
Instead, we only queried for the single MeSH term DNA Methylation. This term has the following PubMed annotation scope definition: "Addition of methyl groups to DNA. DNA methyltransferases (DNA methylases) perform this reaction using S-ADENOSYLMETHIONINE as the methyl group donor." While this definition of DNA Methylation takes a different perspective than the GO definition adopted for the event specification, in practice it identifies the same concept: by definition, DNA methylation is only performed by DNA methyltransferases, and the mentioned donor is the only one presently known. We can thus expect that PubMed queries for this concept match a complete and unbiased set of documents involving the targeted concepts.

While a search for documents that are indexed by humans with the MeSH term DNA Methylation is expected to provide high-precision results for the full topic, not all such documents necessarily discuss events where specific genes are methylated. In initial efforts to annotate a random sample of these documents, we found that many did not mention specific gene names. To reduce wasted effort in examining documents that contain no markable events, we added a filter requiring a minimum number of (likely) gene mentions. We first processed all citations indexed with DNA Methylation that have an abstract in PubMed (14350 at the time of selection) using the BANNER protein/gene name tagger [33] trained on the GENETAG corpus [34]. We found that while the overwhelmingly most frequent number of tagged mentions per document is zero, a substantial mass of abstracts have large.
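
The selection steps described above (a MeSH-only query followed by a minimum gene-mention filter) might be sketched roughly as follows. This is an illustrative sketch, not the authors' actual pipeline: the function names, the document IDs, the mention counts, and the threshold value are all hypothetical, and the tagger output is assumed to have been reduced to one integer count per abstract. The query URL is only constructed, never fetched.

```python
from __future__ import annotations

from collections import Counter
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (the URL is built here but never fetched).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_mesh_query_url(mesh_term: str, retmax: int = 100) -> str:
    """Build a search URL restricted to one MeSH term, with no free-text part."""
    # "hasabstract" limits the results to citations that have an abstract.
    term = f'"{mesh_term}"[MeSH Terms] AND hasabstract'
    params = {"db": "pubmed", "term": term, "retmax": retmax}
    return ESEARCH_URL + "?" + urlencode(params)


def mention_histogram(mention_counts: dict[str, int]) -> Counter:
    """Map each mention count to the number of abstracts with that count."""
    return Counter(mention_counts.values())


def select_documents(mention_counts: dict[str, int], min_mentions: int) -> list[str]:
    """Return the IDs of abstracts with at least `min_mentions` tagged mentions."""
    return sorted(doc for doc, n in mention_counts.items() if n >= min_mentions)


if __name__ == "__main__":
    # Made-up IDs and counts standing in for gene tagger output (hypothetical).
    counts = {"doc1": 0, "doc2": 0, "doc3": 3, "doc4": 7, "doc5": 1}
    print(build_mesh_query_url("DNA Methylation"))
    print(mention_histogram(counts))
    # The threshold of 3 is an arbitrary illustration, not a value from the text.
    print(select_documents(counts, min_mentions=3))
```

Inspecting the histogram before fixing a threshold mirrors the observation above that zero is the most common mention count, so even a small minimum would discard a large share of abstracts.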

Author: Sodium channel