config

autocorpus.config

Contains a loader for default configuration files.

Classes

DefaultConfig(filename)

Bases: Enum

An enumeration representing different configuration files for various datasets.

Attributes:

| Name | Description |
| --- | --- |
| LEGACY_PMC | Configuration file for legacy PMC data (pre-October 2024). |
| PMC | Configuration file for current PMC data. |
| PLOS_GENETICS | Configuration file for PLOS Genetics data. |
| NATURE_GENETICS | Configuration file for Nature Genetics data. |

Methods:

| Name | Description |
| --- | --- |
| load_config | Loads and returns the configuration from the associated file. The configuration is lazy-loaded and cached upon first access. |

Initializes the DefaultConfig enum with the given filename.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| filename | str | The name of the configuration file to load. | required |

Source code in autocorpus/config.py
def __init__(self, filename: str) -> None:
    """Initializes the DefaultConfig enum with the given filename.

    Args:
        filename: The name of the configuration file to load.
    """
    self._filename = filename
    self._config: dict[str, Any] = {}  # Lazy-loaded cache

Functions

load_config()

Loads the configuration file when first accessed.

Returns:

| Type | Description |
| --- | --- |
| dict[str, Any] | The configuration file as a dictionary. |

Source code in autocorpus/config.py
def load_config(self) -> dict[str, Any]:
    """Loads the configuration file when first accessed.

    Returns:
        The configuration file as a dictionary.
    """
    if self._config == {}:
        config_path = resources.files("autocorpus.configs") / self._filename
        with config_path.open("r", encoding="utf-8") as f_in:
            self._config = json.load(f_in)["config"]
    return self._config
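
The lazy-load-and-cache behaviour of load_config can be sketched outside the package as follows. This is a minimal illustration, not the library's code: `DemoConfig`, `CONFIG_DIR`, and `demo.json` are hypothetical stand-ins for `DefaultConfig` and the packaged `autocorpus.configs` resources, and a temporary directory replaces `importlib.resources`.

```python
import json
import tempfile
from enum import Enum
from pathlib import Path
from typing import Any

# Hypothetical stand-in for the packaged "autocorpus.configs" directory.
CONFIG_DIR = Path(tempfile.mkdtemp())
(CONFIG_DIR / "demo.json").write_text(
    json.dumps({"config": {"title": "h1"}}), encoding="utf-8"
)


class DemoConfig(Enum):
    """Hypothetical enum that follows the same pattern as DefaultConfig."""

    DEMO = "demo.json"

    def __init__(self, filename: str) -> None:
        self._filename = filename
        self._config: dict[str, Any] = {}  # Empty until the first load

    def load_config(self) -> dict[str, Any]:
        # Read the file only while the cache is still empty.
        if self._config == {}:
            with (CONFIG_DIR / self._filename).open("r", encoding="utf-8") as f_in:
                self._config = json.load(f_in)["config"]
        return self._config


first = DemoConfig.DEMO.load_config()

# Delete the file: a second call can only succeed if it uses the cache.
(CONFIG_DIR / "demo.json").unlink()
second = DemoConfig.DEMO.load_config()
```

Because the cache is tested with `== {}`, a file whose "config" value is an empty object would be re-read on every call under this pattern.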

Functions

read_config(config_path)

Reads a configuration file and returns its contents.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| config_path | str | The path to the configuration file. | required |

Returns:

| Type | Description |
| --- | --- |
| dict[str, Any] | The contents of the configuration file. |

Raises:

| Type | Description |
| --- | --- |
| FileNotFoundError | If the configuration file does not exist. |
| JSONDecodeError | If the configuration file is not valid JSON. |
| KeyError | If the configuration file does not contain the expected "config" key. |


Source code in autocorpus/config.py
def read_config(config_path: str) -> dict[str, Any]:
    """Reads a configuration file and returns its contents.

    Args:
        config_path: The path to the configuration file.

    Returns:
        dict: The contents of the configuration file.

    Raises:
        FileNotFoundError: If the configuration file does not exist.
        json.JSONDecodeError: If the configuration file is not a valid JSON.
        KeyError: If the configuration file does not contain the expected "config" key.
    """
    with open(config_path, encoding="utf-8") as f:
        ## TODO: validate config file here if possible
        content = json.load(f)
        return content["config"]
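
As a rough illustration of the three documented failure modes, the sketch below re-declares read_config locally (mirroring the listing above) so it runs without the package installed. The `try_read` helper, the status strings, and the temporary file names are assumptions made for this example only.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def read_config(config_path: str) -> dict[str, Any]:
    """Local copy of the function listed above, kept minimal."""
    with open(config_path, encoding="utf-8") as f:
        return json.load(f)["config"]


def try_read(config_path: str) -> str:
    """Hypothetical helper: map each documented exception to a status string."""
    try:
        read_config(config_path)
    except FileNotFoundError:
        return "missing"
    except json.JSONDecodeError:
        return "invalid-json"
    except KeyError:
        return "no-config-key"
    return "ok"


tmp = Path(tempfile.mkdtemp())
(tmp / "good.json").write_text('{"config": {"a": 1}}', encoding="utf-8")
(tmp / "broken.json").write_text("{not json", encoding="utf-8")
(tmp / "no_key.json").write_text('{"other": 1}', encoding="utf-8")

results = {
    "good": try_read(str(tmp / "good.json")),
    "missing": try_read(str(tmp / "absent.json")),  # Never created
    "broken": try_read(str(tmp / "broken.json")),
    "no_key": try_read(str(tmp / "no_key.json")),
}
print(results)
```

Catching the exceptions separately lets a caller tell a bad path apart from a malformed file or one that lacks the top-level "config" key.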