bioc_passage

autocorpus.bioc_passage

BioC Passage builder script.

Classes

BioCPassage(offset, infons, text, sentences=list(), annotations=list(), relations=list()) dataclass

Represents a BioC passage.

Functions

from_dict(passage, offset) classmethod

Create a BioCPassage from a passage dict and an offset.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| passage | dict[str, Any] | dict containing info about passage | required |
| offset | int | Passage offset | required |

Returns:

| Type | Description |
|---|---|
| BioCPassage | BioCPassage object |

Source code in autocorpus/bioc_passage.py
@classmethod
def from_dict(cls, passage: dict[str, Any], offset: int) -> BioCPassage:
    """Create a BioCPassage from a passage dict and an offset.

    Args:
        passage: dict containing info about passage
        offset: Passage offset

    Returns:
        BioCPassage object
    """
    infons = {k: v for k, v in passage.items() if k not in _DEFAULT_KEYS}

    # TODO: Doesn't account for subsubsection headings which might exist
    if heading := passage.get("section_heading", None):
        infons["section_title_1"] = heading
    if subheading := passage.get("subsection_heading", None):
        infons["section_title_2"] = subheading
    for i, section_type in enumerate(passage["section_type"]):
        infons[f"iao_name_{i + 1}"] = section_type["iao_name"]
        infons[f"iao_id_{i + 1}"] = section_type["iao_id"]

    return cls(offset, infons, passage["body"])
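
To make the key mapping in from_dict easier to follow, here is a minimal standalone sketch of the infons-building step. It does not import autocorpus: `build_infons` is a hypothetical helper, the contents of `_DEFAULT_KEYS` are an assumption (the real set is defined elsewhere in bioc_passage.py and is not shown on this page), and the sample passage, including its IAO values, is placeholder data.

```python
from typing import Any

# Assumption: the real _DEFAULT_KEYS lives in autocorpus/bioc_passage.py and may
# differ; this set only covers the keys that from_dict handles explicitly.
_DEFAULT_KEYS = {"section_heading", "subsection_heading", "body", "section_type"}


def build_infons(passage: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper mirroring the infons logic shown in from_dict."""
    # Any key outside the default set is copied to infons unchanged.
    infons = {k: v for k, v in passage.items() if k not in _DEFAULT_KEYS}

    # Headings are only recorded when present and non-empty.
    if heading := passage.get("section_heading"):
        infons["section_title_1"] = heading
    if subheading := passage.get("subsection_heading"):
        infons["section_title_2"] = subheading

    # Each section type yields a numbered (1-based) name/id pair.
    for i, section_type in enumerate(passage["section_type"]):
        infons[f"iao_name_{i + 1}"] = section_type["iao_name"]
        infons[f"iao_id_{i + 1}"] = section_type["iao_id"]
    return infons


# Placeholder data for illustration only.
sample = {
    "section_heading": "Methods",
    "subsection_heading": "",  # empty, so no section_title_2 is written
    "body": "Example passage text.",
    "section_type": [{"iao_name": "example section", "iao_id": "IAO:0000000"}],
    "source": "example",  # extra key, passed through to infons
}
infons = build_infons(sample)
```

Note that `passage["section_type"]` and `passage["body"]` are accessed with square brackets in from_dict, so a passage dict missing either key would raise a KeyError, whereas the two heading keys are optional.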

from_title(title, offset) classmethod

Create a BioCPassage from a title and offset.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| title | str | Passage title | required |
| offset | int | Passage offset | required |

Returns:

| Type | Description |
|---|---|
| BioCPassage | BioCPassage object |

Source code in autocorpus/bioc_passage.py
@classmethod
def from_title(cls, title: str, offset: int) -> BioCPassage:
    """Create a BioCPassage from a title and offset.

    Args:
        title: Passage title
        offset: Passage offset

    Returns:
        BioCPassage object
    """
    infons = {"iao_name_1": "document title", "iao_id_1": "IAO:0000305"}
    return cls(offset, infons, title)
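
As a usage illustration, the sketch below shows what a title passage might look like once built. `PassageSketch` is a hypothetical stand-in rather than the real BioCPassage class; its fields follow the signature listed above, and the use of `field(default_factory=list)` for the list defaults is an assumption based on the `list()` defaults shown there.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class PassageSketch:
    """Hypothetical stand-in for BioCPassage, for illustration only."""

    offset: int
    infons: dict[str, Any]
    text: str
    # Assumption: the real class uses default factories for these lists.
    sentences: list[Any] = field(default_factory=list)
    annotations: list[Any] = field(default_factory=list)
    relations: list[Any] = field(default_factory=list)

    @classmethod
    def from_title(cls, title: str, offset: int) -> PassageSketch:
        # Same fixed infons as in the from_title source shown above.
        infons = {"iao_name_1": "document title", "iao_id_1": "IAO:0000305"}
        return cls(offset, infons, title)


# The title text here is placeholder data.
title_passage = PassageSketch.from_title("An example article title", 0)
```

Unlike from_dict, from_title takes no section information from its input: every title passage receives the same fixed pair of IAO infons, and only the text and offset vary.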