Helpers for Computing Quantities

Utilities for precomputing the priors and posteriors.

pydantic settings lyscripts.compute.utils.BaseComputeCLI

Bases: BaseCLI

Common command line settings for the submodule compute.

JSON schema:

{
   "title": "BaseComputeCLI",
   "description": "Common command line settings for the submodule ``compute``.",
   "type": "object",
   "properties": {
      "configs": {
         "default": [
            "config.yaml"
         ],
         "description": "Path to the YAML file(s) that contain the configuration(s). Configs from YAML files may be overwritten by command line arguments. When multiple files are specified, the configs are merged in the order they are given. Note that every config file must have a `version: 1` key in it.",
         "items": {
            "format": "path",
            "type": "string"
         },
         "title": "Configs",
         "type": "array"
      },
      "graph": {
         "$ref": "#/$defs/GraphConfig"
      },
      "model": {
         "$ref": "#/$defs/ModelConfig",
         "default": {
            "external_file": null,
            "class_name": "Unilateral",
            "constructor": "binary",
            "max_time": 10,
            "named_params": null,
            "kwargs": {}
         }
      },
      "distributions": {
         "additionalProperties": {
            "$ref": "#/$defs/DistributionConfig"
         },
         "default": {},
         "description": "Mapping of model T-categories to predefined distributions over diagnose times.",
         "title": "Distributions",
         "type": "object"
      },
      "cache_dir": {
         "default": "/home/docs/checkouts/readthedocs.org/user_builds/lyscripts/checkouts/latest/docs/source/.cache",
         "description": "Cache directory for storing function calls.",
         "format": "path",
         "title": "Cache Dir",
         "type": "string"
      },
      "scenarios": {
         "default": [],
         "description": "List of scenarios to compute risks for.",
         "items": {
            "$ref": "#/$defs/ScenarioConfig"
         },
         "title": "Scenarios",
         "type": "array"
      },
      "sampling": {
         "$ref": "#/$defs/SamplingConfig"
      }
   },
   "$defs": {
      "DiagnosisConfig": {
         "description": "Defines an ipsi- and contralateral diagnosis pattern.",
         "properties": {
            "ipsi": {
               "additionalProperties": {
                  "additionalProperties": {
                     "anyOf": [
                        {
                           "enum": [
                              false,
                              0,
                              "healthy",
                              true,
                              1,
                              "involved",
                              "micro",
                              "macro",
                              "notmacro"
                           ]
                        },
                        {
                           "type": "null"
                        }
                     ]
                  },
                  "type": "object"
               },
               "default": {},
               "description": "Observed diagnoses by different modalities on the ipsi neck.",
               "examples": [
                  {
                     "CT": {
                        "II": true,
                        "III": false
                     }
                  }
               ],
               "title": "Ipsi",
               "type": "object"
            },
            "contra": {
               "additionalProperties": {
                  "additionalProperties": {
                     "anyOf": [
                        {
                           "enum": [
                              false,
                              0,
                              "healthy",
                              true,
                              1,
                              "involved",
                              "micro",
                              "macro",
                              "notmacro"
                           ]
                        },
                        {
                           "type": "null"
                        }
                     ]
                  },
                  "type": "object"
               },
               "default": {},
               "description": "Observed diagnoses by different modalities on the contra neck.",
               "title": "Contra",
               "type": "object"
            }
         },
         "title": "DiagnosisConfig",
         "type": "object"
      },
      "DistributionConfig": {
         "description": "Configuration defining a distribution over diagnose times.",
         "properties": {
            "kind": {
               "default": "frozen",
               "description": "Parametric distributions may be updated.",
               "enum": [
                  "frozen",
                  "parametric"
               ],
               "title": "Kind",
               "type": "string"
            },
            "func": {
               "const": "binomial",
               "default": "binomial",
               "description": "Name of predefined function to use as distribution.",
               "title": "Func",
               "type": "string"
            },
            "params": {
               "additionalProperties": {
                  "anyOf": [
                     {
                        "type": "integer"
                     },
                     {
                        "type": "number"
                     }
                  ]
               },
               "default": {},
               "description": "Parameters to pass to the predefined function.",
               "title": "Params",
               "type": "object"
            }
         },
         "title": "DistributionConfig",
         "type": "object"
      },
      "GraphConfig": {
         "description": "Specifies how the tumor(s) and LNLs are connected in a DAG.",
         "properties": {
            "tumor": {
               "additionalProperties": {
                  "items": {
                     "type": "string"
                  },
                  "type": "array"
               },
               "description": "Define the name of the tumor(s) and which LNLs it/they drain to.",
               "title": "Tumor",
               "type": "object"
            },
            "lnl": {
               "additionalProperties": {
                  "items": {
                     "type": "string"
                  },
                  "type": "array"
               },
               "description": "Define the name of the LNL(s) and which LNLs it/they drain to.",
               "title": "Lnl",
               "type": "object"
            }
         },
         "required": [
            "tumor",
            "lnl"
         ],
         "title": "GraphConfig",
         "type": "object"
      },
      "InvolvementConfig": {
         "description": "Config that defines an ipsi- and contralateral involvement pattern.",
         "properties": {
            "ipsi": {
               "additionalProperties": {
                  "anyOf": [
                     {
                        "enum": [
                           false,
                           0,
                           "healthy",
                           true,
                           1,
                           "involved",
                           "micro",
                           "macro",
                           "notmacro"
                        ]
                     },
                     {
                        "type": "null"
                     }
                  ]
               },
               "default": {},
               "description": "Involvement pattern for the ipsilateral side of the neck.",
               "examples": [
                  {
                     "II": true,
                     "III": false
                  }
               ],
               "title": "Ipsi",
               "type": "object"
            },
            "contra": {
               "additionalProperties": {
                  "anyOf": [
                     {
                        "enum": [
                           false,
                           0,
                           "healthy",
                           true,
                           1,
                           "involved",
                           "micro",
                           "macro",
                           "notmacro"
                        ]
                     },
                     {
                        "type": "null"
                     }
                  ]
               },
               "default": {},
               "description": "Involvement pattern for the contralateral side of the neck.",
               "title": "Contra",
               "type": "object"
            }
         },
         "title": "InvolvementConfig",
         "type": "object"
      },
      "ModelConfig": {
         "description": "Define which of the ``lymph`` models to use and how to set them up.",
         "properties": {
            "external_file": {
               "anyOf": [
                  {
                     "format": "file-path",
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Path to a Python file that defines a model.",
               "title": "External File"
            },
            "class_name": {
               "default": "Unilateral",
               "description": "Name of the model class to use.",
               "enum": [
                  "Unilateral",
                  "Bilateral",
                  "Midline"
               ],
               "title": "Class Name",
               "type": "string"
            },
            "constructor": {
               "default": "binary",
               "description": "Trinary models differentiate between micro- and macroscopic disease.",
               "enum": [
                  "binary",
                  "trinary"
               ],
               "title": "Constructor",
               "type": "string"
            },
            "max_time": {
               "default": 10,
               "description": "Max. number of time-steps to evolve the model over.",
               "title": "Max Time",
               "type": "integer"
            },
            "named_params": {
               "default": null,
               "description": "Subset of valid model parameters a sampler may provide in the form of a dictionary to the model instead of as an array. Or, after sampling, with this list, one may safely recover which parameter corresponds to which index in the sample.",
               "items": {
                  "type": "string"
               },
               "title": "Named Params",
               "type": "array"
            },
            "kwargs": {
               "additionalProperties": true,
               "default": {},
               "description": "Additional keyword arguments to pass to the model constructor.",
               "title": "Kwargs",
               "type": "object"
            }
         },
         "title": "ModelConfig",
         "type": "object"
      },
      "SamplingConfig": {
         "description": "Settings to configure the MCMC sampling.",
         "properties": {
            "storage_file": {
               "description": "Path to the HDF5 file used to store results or load the last state.",
               "format": "path",
               "title": "Storage File",
               "type": "string"
            },
            "history_file": {
               "anyOf": [
                  {
                     "format": "path",
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Path to store the burn-in metrics (as CSV file).",
               "title": "History File"
            },
            "dataset": {
               "default": "mcmc",
               "description": "Name of the dataset in the HDF5 file.",
               "title": "Dataset",
               "type": "string"
            },
            "cores": {
               "anyOf": [
                  {
                     "exclusiveMinimum": 0,
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": 2,
               "description": "Number of cores to use for parallel sampling. If `None`, no parallel processing is used.",
               "title": "Cores"
            },
            "seed": {
               "default": 42,
               "description": "Seed for the random number generator.",
               "title": "Seed",
               "type": "integer"
            },
            "walkers_per_dim": {
               "default": 20,
               "description": "Number of walkers per parameter space dimension.",
               "title": "Walkers Per Dim",
               "type": "integer"
            },
            "check_interval": {
               "default": 50,
               "description": "Check for convergence each time after this many steps.",
               "title": "Check Interval",
               "type": "integer"
            },
            "trust_factor": {
               "default": 50.0,
               "description": "Trust the autocorrelation time only when it's smaller than this factor times the length of the chain.",
               "title": "Trust Factor",
               "type": "number"
            },
            "relative_thresh": {
               "default": 0.05,
               "description": "Relative threshold for convergence.",
               "title": "Relative Thresh",
               "type": "number"
            },
            "burnin_steps": {
               "anyOf": [
                  {
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Number of burn-in steps to take. If None, burn-in runs until convergence.",
               "title": "Burnin Steps"
            },
            "num_steps": {
               "anyOf": [
                  {
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": 100,
               "description": "Number of steps to take in the MCMC sampling.",
               "title": "Num Steps"
            },
            "thin_by": {
               "default": 10,
               "description": "How many samples to draw before saving one.",
               "title": "Thin By",
               "type": "integer"
            },
            "inverse_temp": {
               "default": 1.0,
               "description": "Inverse temperature for thermodynamic integration. Note that this is not yet fully implemented.",
               "title": "Inverse Temp",
               "type": "number"
            }
         },
         "required": [
            "storage_file"
         ],
         "title": "SamplingConfig",
         "type": "object"
      },
      "ScenarioConfig": {
         "description": "Define a scenario for which e.g. prevalences and risks may be computed.",
         "properties": {
            "t_stages": {
               "description": "List of T-stages to marginalize over in the scenario.",
               "examples": [
                  [
                     "early"
                  ],
                  [
                     3,
                     4
                  ]
               ],
               "items": {
                  "anyOf": [
                     {
                        "type": "integer"
                     },
                     {
                        "type": "string"
                     }
                  ]
               },
               "title": "T Stages",
               "type": "array"
            },
            "t_stages_dist": {
               "default": [
                  1.0
               ],
               "description": "Distribution over T-stages to use for marginalization.",
               "examples": [
                  [
                     1.0
                  ],
                  [
                     0.6,
                     0.4
                  ]
               ],
               "items": {
                  "type": "number"
               },
               "title": "T Stages Dist",
               "type": "array"
            },
            "midext": {
               "anyOf": [
                  {
                     "type": "boolean"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Whether the patient's tumor extends over the midline.",
               "title": "Midext"
            },
            "mode": {
               "default": "HMM",
               "description": "Which underlying model architecture to use.",
               "enum": [
                  "HMM",
                  "BN"
               ],
               "title": "Mode",
               "type": "string"
            },
            "involvement": {
               "$ref": "#/$defs/InvolvementConfig",
               "default": {
                  "ipsi": {},
                  "contra": {}
               }
            },
            "diagnosis": {
               "$ref": "#/$defs/DiagnosisConfig",
               "default": {
                  "ipsi": {},
                  "contra": {}
               }
            }
         },
         "required": [
            "t_stages"
         ],
         "title": "ScenarioConfig",
         "type": "object"
      }
   },
   "required": [
      "graph",
      "sampling"
   ]
}

field graph: GraphConfig [Required]

field model: ModelConfig = ModelConfig(external_file=None, class_name='Unilateral', constructor='binary', max_time=10, named_params=None, kwargs={})

field distributions: dict[str, DistributionConfig] = {}

Mapping of model T-categories to predefined distributions over diagnose times.

field cache_dir: Path = PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/lyscripts/checkouts/latest/docs/source/.cache')

Cache directory for storing function calls.

field scenarios: list[ScenarioConfig] = []

List of scenarios to compute risks for.

field sampling: SamplingConfig [Required]
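To make the schema above more concrete, here is a minimal sketch of a settings dictionary in plain Python, together with a check for the two required top-level keys, graph and sampling. The LNL names, the file name samples.hdf5, and the helper missing_required are hypothetical illustrations and not part of lyscripts; the real validation is done by the pydantic settings class.

```python
# Hypothetical minimal configuration mirroring the JSON schema above.
# LNL names and the storage file name are made up for illustration.
config = {
    "version": 1,  # every YAML config file must carry this key
    "graph": {
        "tumor": {"T": ["II", "III"]},      # the tumor drains to LNLs II and III
        "lnl": {"II": ["III"], "III": []},  # LNL II drains to LNL III
    },
    "model": {"class_name": "Unilateral", "constructor": "binary", "max_time": 10},
    "sampling": {"storage_file": "samples.hdf5"},  # the only required sampling key
    "scenarios": [
        {"t_stages": ["early"], "involvement": {"ipsi": {"II": True, "III": False}}},
    ],
}

# Top-level keys listed under "required" in the schema.
REQUIRED_TOP_LEVEL = ("graph", "sampling")


def missing_required(cfg: dict) -> list[str]:
    """Return the required top-level keys that are absent from cfg."""
    return [key for key in REQUIRED_TOP_LEVEL if key not in cfg]


print(missing_required(config))  # []
```

All other keys fall back to the defaults listed in the schema, so a configuration this small would already satisfy the required fields.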

lyscripts.compute.utils.is_hdf5_compatible(value: Any) → bool

Check if the given value can be stored in an HDF5 file.

lyscripts.compute.utils.to_hdf5_attrs(mapping: dict[str, Any]) → dict[str, str]

Convert attrs to a dictionary of HDF5-compatible attributes or strings.

lyscripts.compute.utils.from_hdf5_attrs(mapping: AttributeManager) → dict[str, Any]

Convert the HDF5 attributes to a dictionary of Python objects.
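The idea of keeping compatible values and falling back to strings for everything else can be sketched with the standard library alone. This is an assumption-laden stand-in, not the lyscripts implementation: the names is_simple_sketch and to_attrs_sketch are hypothetical, and the set of types treated as compatible here (bool, int, float, str) is a guess that is likely narrower than what the real check accepts.

```python
from typing import Any

# Assumption: only plain scalars and strings count as "compatible" here.
# The real check may accept more types (for example NumPy arrays).
_SIMPLE_TYPES = (bool, int, float, str)


def is_simple_sketch(value: Any) -> bool:
    """Hypothetical stand-in for a compatibility check."""
    return isinstance(value, _SIMPLE_TYPES)


def to_attrs_sketch(mapping: dict[str, Any]) -> dict[str, Any]:
    """Keep compatible values as they are and stringify all others."""
    return {
        key: (value if is_simple_sketch(value) else str(value))
        for key, value in mapping.items()
    }


attrs = to_attrs_sketch({"seed": 42, "t_stages": ["early", "late"], "midext": None})
print(attrs)
```

In this sketch the list and the None value end up as their string representations, while the integer is stored unchanged.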

lyscripts.compute.utils.extract_modalities(diagnosis: dict[str, Any]) → set[str]

Get the set of modalities used in the diagnosis.

No longer used in the main apps, but kept because it may still be useful.

>>> diagnosis = {
...     "ipsi": {
...         "MRI": {"II": True, "III": False},
...         "PET": {"II": False, "III": True},
...      },
...     "contra": {"MRI": {"II": False, "III": None}},
... }
>>> sorted(extract_modalities(diagnosis))
['MRI', 'PET']
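One way to obtain the behavior shown in the doctest above is to collect the keys found under each side of the neck. The function below is a sketch under that assumption, with the hypothetical name extract_modalities_sketch; the actual implementation in lyscripts may differ.

```python
from typing import Any


def extract_modalities_sketch(diagnosis: dict[str, Any]) -> set[str]:
    """Collect the modality names found under any side of the neck."""
    modalities: set[str] = set()
    for side_diagnosis in diagnosis.values():
        # Each side maps modality names to per-LNL observations.
        modalities.update(side_diagnosis.keys())
    return modalities


diagnosis = {
    "ipsi": {
        "MRI": {"II": True, "III": False},
        "PET": {"II": False, "III": True},
    },
    "contra": {"MRI": {"II": False, "III": None}},
}
print(sorted(extract_modalities_sketch(diagnosis)))  # ['MRI', 'PET']
```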

lyscripts.compute.utils.ensure_parent_dir(path: Path) → Path

Create the parent directory of the given path.

lyscripts.compute.utils.HasParentPath

Type hint for path whose parent dir is created if it doesn’t exist.

alias of Annotated[Path, AfterValidator(func=ensure_parent_dir)]
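A validator of this kind can be sketched with pathlib alone. The function name ensure_parent_dir_sketch and the example path are hypothetical; the sketch only assumes what the descriptions above state, namely that the parent directory is created when missing and that the path itself is returned.

```python
import tempfile
from pathlib import Path


def ensure_parent_dir_sketch(path: Path) -> Path:
    """Create the parent directory of path if needed and return path unchanged."""
    path.parent.mkdir(parents=True, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "results" / "run1" / "samples.hdf5"
    returned = ensure_parent_dir_sketch(target)
    parent_exists = target.parent.is_dir()  # the directories were created
    file_exists = target.exists()           # the file itself was not

print(parent_exists, file_exists)  # True False
```

Returning the path is what makes such a function usable with AfterValidator, since pydantic stores whatever value the validator returns.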

pydantic model lyscripts.compute.utils.HDF5FileStorage

Bases: BaseModel

HDF5 file storage for in- and outputs of computations.

JSON schema:

{
   "title": "HDF5FileStorage",
   "description": "HDF5 file storage for in- and outputs of computations.",
   "type": "object",
   "properties": {
      "file": {
         "description": "Path to the HDF5 file. Parent directories are created if needed.",
         "format": "path",
         "title": "File",
         "type": "string"
      },
      "dataset": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Name of the dataset in the HDF5 file. Save/load methods can override this.",
         "title": "Dataset"
      }
   },
   "required": [
      "file"
   ]
}

field file: Annotated[Path, AfterValidator(func=ensure_parent_dir)] [Required]

Path to the HDF5 file. Parent directories are created if needed.

field dataset: str | None = None

Name of the dataset in the HDF5 file. Save/load methods can override this.

load(dataset: str | None = None) → ndarray

Load the dataset with the name dataset.

get_attrs(dataset: str | None = None) → dict[str, Any]

Get the attributes of the dataset with the name dataset.

save(values: ndarray, dataset: str | None = None) → None

Set the values of the dataset with the name dataset.

set_attrs(attrs: dict[str, Any], dataset: str | None = None) → None

Update the attributes of the dataset with the name dataset using attrs.

lyscripts.compute.utils.reduce_pattern(pattern: dict[str, dict[str, bool]]) → dict[str, dict[str, bool]]

Reduce a pattern by removing all entries that are None.

The reduced pattern is shorter to store and should be completely recoverable with the complete_pattern function.

Unused but maybe useful for some cases. Keeping it in here for now.

>>> full = {
...     "ipsi": {"I": None, "II": True, "III": None},
...     "contra": {"I": None, "II": None, "III": None},
... }
>>> reduce_pattern(full)
{'ipsi': {'II': True}}
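The doctest above suggests two steps: drop every LNL whose state is None, then drop any side that ends up empty. The following sketch, with the hypothetical name reduce_pattern_sketch, reproduces that output; it is not necessarily how lyscripts implements it.

```python
from typing import Optional

Pattern = dict[str, dict[str, Optional[bool]]]


def reduce_pattern_sketch(pattern: Pattern) -> Pattern:
    """Drop None entries, and drop sides that have no entries left."""
    reduced: Pattern = {}
    for side, lnl_states in pattern.items():
        kept = {lnl: state for lnl, state in lnl_states.items() if state is not None}
        if kept:  # omit sides without any remaining entry
            reduced[side] = kept
    return reduced


full = {
    "ipsi": {"I": None, "II": True, "III": None},
    "contra": {"I": None, "II": None, "III": None},
}
print(reduce_pattern_sketch(full))  # {'ipsi': {'II': True}}
```

Note that False is kept, because only None marks an unspecified state.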

lyscripts.compute.utils.complete_pattern(pattern: dict[str, dict[str, bool]] | None, lnls: list[str]) → dict[str, dict[str, bool]]

Make sure the provided involvement pattern is correct.

In the end, the pattern should contain True, False, or None for each side of the neck and for each of the LNLs in lnls.

Unused but maybe useful for some cases. Keeping it in here for now.

>>> pattern = {"ipsi": {"II": True}}
>>> lnls = ["II", "III"]
>>> complete_pattern(pattern, lnls)
{'ipsi': {'II': True, 'III': None}, 'contra': {'II': None, 'III': None}}
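Completing a pattern amounts to filling in None wherever a side or an LNL is missing. The sketch below, with the hypothetical name complete_pattern_sketch, reproduces the doctest output above. It assumes the two sides are always named ipsi and contra and does not coerce values to booleans, which the real function might do.

```python
from typing import Optional

Pattern = dict[str, dict[str, Optional[bool]]]


def complete_pattern_sketch(pattern: Optional[Pattern], lnls: list[str]) -> Pattern:
    """Return a pattern with an entry for both sides and every LNL in lnls."""
    pattern = pattern or {}  # treat a missing pattern as an empty one
    return {
        side: {lnl: pattern.get(side, {}).get(lnl) for lnl in lnls}
        for side in ("ipsi", "contra")
    }


completed = complete_pattern_sketch({"ipsi": {"II": True}}, ["II", "III"])
print(completed)
# {'ipsi': {'II': True, 'III': None}, 'contra': {'II': None, 'III': None}}
```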

lyscripts.compute.utils.get_cached(func: callable, cache_dir: Path) → callable

Return cached func with a cache at cache_dir.
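To illustrate what a disk-backed cache for a function looks like, here is a standard-library sketch based on pickle and hashlib. It is a deliberately simple stand-in with the hypothetical name get_cached_sketch; the source does not say which caching mechanism lyscripts uses, so nothing here should be read as its actual implementation.

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path
from typing import Callable


def get_cached_sketch(func: Callable, cache_dir: Path) -> Callable:
    """Wrap func so that results are stored as pickle files in cache_dir."""
    cache_dir.mkdir(parents=True, exist_ok=True)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Derive a file name from the function name and its arguments.
        key_bytes = pickle.dumps((func.__qualname__, args, sorted(kwargs.items())))
        cache_file = cache_dir / (hashlib.sha256(key_bytes).hexdigest() + ".pkl")
        if cache_file.exists():
            return pickle.loads(cache_file.read_bytes())
        result = func(*args, **kwargs)
        cache_file.write_bytes(pickle.dumps(result))
        return result

    return wrapper


calls = []


def slow_square(x: int) -> int:
    calls.append(x)  # record every real evaluation
    return x * x


with tempfile.TemporaryDirectory() as tmp:
    cached_square = get_cached_sketch(slow_square, Path(tmp))
    first = cached_square(4)
    second = cached_square(4)  # served from the cache directory

print(first, second, len(calls))  # 16 16 1
```

The second call does not evaluate slow_square again, which is the behavior one would want for expensive computations.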
