sentence

autocorpus.ac_bioc.sentence

This module defines the BioCSentence class.

Classes

BioCSentence(text, offset, infons=dict(), annotations=list(), relations=list()) dataclass

Bases: DataClassJsonMixin

Represents a sentence in the BioC format.

Functions

from_xml(elem) classmethod

Create a BioCSentence instance from an XML element.

Parameters:

elem (Element, required): An XML element representing a sentence.

Returns:

BioCSentence: An instance of BioCSentence created from the XML element.

Source code in autocorpus/ac_bioc/sentence.py
@classmethod
def from_xml(cls, elem: ET.Element) -> BioCSentence:
    """Create a BioCSentence instance from an XML element.

    Args:
        elem (ET.Element): An XML element representing a sentence.

    Returns:
        BioCSentence: An instance of BioCSentence created from the XML element.
    """
    offset = int(elem.findtext("offset", default="0"))
    text = elem.findtext("text", default="")

    infons = {
        e.attrib["key"]: e.text for e in elem.findall("infon") if e.text is not None
    }

    annotations = [
        BioCAnnotation.from_xml(a_elem) for a_elem in elem.findall("annotation")
    ]

    return cls(
        text=text,
        offset=offset,
        infons=infons,
        annotations=annotations,
    )
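
As a rough illustration of the parsing steps in from_xml, the sketch below applies the same ElementTree calls to a hand-written sentence element. The XML snippet and its values are invented for the example, and the code deliberately does not import autocorpus so that it runs on its own. Going by the listing above, BioCSentence.from_xml(elem) would be expected to produce the same offset, text, and infons, with relations left at their default because from_xml does not read them.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: a BioC-style <sentence> element written by hand.
xml_source = """
<sentence>
  <infon key="section">abstract</infon>
  <infon key="empty"/>
  <offset>42</offset>
  <text>BRCA1 is associated with breast cancer.</text>
</sentence>
"""

elem = ET.fromstring(xml_source)

# Same lookups as in from_xml: missing children fall back to "0" and "".
offset = int(elem.findtext("offset", default="0"))
text = elem.findtext("text", default="")

# Infons whose text is None (such as the empty one above) are skipped.
infons = {
    e.attrib["key"]: e.text for e in elem.findall("infon") if e.text is not None
}

print(offset)  # 42
print(text)    # BRCA1 is associated with breast cancer.
print(infons)  # {'section': 'abstract'}
```

Note that an infon element without a key attribute would raise a KeyError in the dictionary comprehension, since the key is read with e.attrib["key"].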
to_xml()

Convert the BioCSentence instance to an XML element.

Returns:

Element: An XML element representing the sentence.

Source code in autocorpus/ac_bioc/sentence.py
def to_xml(self) -> ET.Element:
    """Convert the BioCSentence instance to an XML element.

    Returns:
        ET.Element: An XML element representing the sentence.
    """
    sentence_elem = ET.Element("sentence")

    for k, v in self.infons.items():
        infon = ET.SubElement(sentence_elem, "infon", {"key": k})
        infon.text = v

    offset_elem = ET.SubElement(sentence_elem, "offset")
    offset_elem.text = str(self.offset)

    text_elem = ET.SubElement(sentence_elem, "text")
    text_elem.text = self.text

    for ann in self.annotations:
        sentence_elem.append(ann.to_xml())

    return sentence_elem
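
To show the element layout that to_xml builds, here is a minimal standalone sketch that repeats the same steps without importing autocorpus. The helper name sentence_to_xml and the sample values are hypothetical and exist only for this example; annotations are left out because serialising them depends on BioCAnnotation.to_xml, which is not shown on this page.

```python
import xml.etree.ElementTree as ET


def sentence_to_xml(text: str, offset: int, infons: dict[str, str]) -> ET.Element:
    """Hypothetical helper mirroring the to_xml steps, minus annotations."""
    sentence_elem = ET.Element("sentence")

    # Infons come first, one <infon key="..."> child per dictionary entry.
    for k, v in infons.items():
        infon = ET.SubElement(sentence_elem, "infon", {"key": k})
        infon.text = v

    # The offset is stored as text, so the integer is converted to a string.
    offset_elem = ET.SubElement(sentence_elem, "offset")
    offset_elem.text = str(offset)

    text_elem = ET.SubElement(sentence_elem, "text")
    text_elem.text = text

    return sentence_elem


elem = sentence_to_xml(
    text="BRCA1 is associated with breast cancer.",
    offset=42,
    infons={"section": "abstract"},
)
xml_string = ET.tostring(elem, encoding="unicode")
print(xml_string)
# <sentence><infon key="section">abstract</infon><offset>42</offset><text>BRCA1 is associated with breast cancer.</text></sentence>
```

The child order (infons, then offset, then text) follows the listing above. The listing also shows that to_xml appends annotations after the text element and does not write the relations field.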