Compute Risk of Involvement

Predict risks of involvement for scenarios using drawn MCMC samples.

Like the computation of the priors and posteriors, this computation uses caching. It may skip those two initial steps if the cache directory is the same as the one used when they were computed.
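The caching behaviour described above can be illustrated with a small disk cache. This is only a sketch under stated assumptions: it uses the standard library rather than whatever mechanism lyscripts actually employs, and `cached_call` and `expensive_posterior` are hypothetical names introduced for illustration.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def cached_call(cache_dir, func, *args):
    """Call `func(*args)`, reusing a result stored in `cache_dir` if present.

    Returns the result and a flag telling whether it came from the cache.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Key the cache entry on the function name and its pickled arguments.
    key = hashlib.sha256(pickle.dumps((func.__name__, args))).hexdigest()
    entry = cache_dir / f"{key}.pkl"
    if entry.exists():
        return pickle.loads(entry.read_bytes()), True  # cache hit
    result = func(*args)
    entry.write_bytes(pickle.dumps(result))
    return result, False  # cache miss


def expensive_posterior(scale):
    # Hypothetical stand-in for a costly prior or posterior computation.
    return [scale * x for x in (0.1, 0.2, 0.7)]


with tempfile.TemporaryDirectory() as tmp:
    # Same cache directory and same arguments: the second call is skipped.
    first, hit_first = cached_call(tmp, expensive_posterior, 2)
    second, hit_second = cached_call(tmp, expensive_posterior, 2)
```

With a different cache directory, the second call would find no stored entry and recompute, which is why reusing the same directory matters.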

lyscripts.compute.risks.compute_risks(model_config: ModelConfig, graph_config: GraphConfig, dist_configs: dict[str, DistributionConfig], modality_configs: dict[str, ModalityConfig], posteriors: ndarray, involvement: dict[Literal['ipsi', 'contra'], dict], progress_desc: str = 'Computing risks from posteriors') → ndarray

Compute the risk of involvement from each of the posteriors.

Essentially, this only calls the model’s lymph.models.Model.marginalize() method, as nothing more is necessary than to marginalize the full posterior state distribution over the states that correspond to the involvement of interest.
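To make the marginalization step concrete, here is a minimal sketch in plain Python. It does not call the lymph library. The function `marginalize`, the two-level graph, and the posterior values are assumptions made up for illustration, and the state ordering is simply the one produced by `itertools.product`.

```python
import itertools


def marginalize(state_dist, lnls, involvement):
    """Sum the probabilities of all hidden states that match `involvement`.

    `involvement` maps LNL names to True/False. LNLs that are missing or
    set to None are unconstrained and therefore summed over.
    """
    # Enumerate all binary states in the same order as `state_dist`.
    states = itertools.product([False, True], repeat=len(lnls))
    total = 0.0
    for prob, state in zip(state_dist, states):
        matches = all(
            involvement.get(lnl) is None or state[i] == involvement[lnl]
            for i, lnl in enumerate(lnls)
        )
        if matches:
            total += prob
    return total


lnls = ["II", "III"]
# Hypothetical posterior over the states (II, III) in the order
# (False, False), (False, True), (True, False), (True, True).
posterior = [0.5, 0.1, 0.3, 0.1]

# Risk that level II is involved, regardless of the state of level III.
risk = marginalize(posterior, lnls, {"II": True, "III": None})
```

Here the risk is the sum of the two states in which level II is involved. An empty involvement pattern matches every state and returns the total probability mass.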

pydantic settings lyscripts.compute.risks.RisksCLI

Bases: BaseComputeCLI

Predict the risk of involvement scenarios from model samples given diagnoses.

JSON schema:

{
   "title": "RisksCLI",
   "description": "Predict the risk of involvement scenarios from model samples given diagnoses.",
   "type": "object",
   "properties": {
      "configs": {
         "default": [
            "config.yaml"
         ],
         "description": "Path to the YAML file(s) that contain the configuration(s). Configs from YAML files may be overwritten by command line arguments. When multiple files are specified, the configs are merged in the order they are given. Note that every config file must have a `version: 1` key in it.",
         "items": {
            "format": "path",
            "type": "string"
         },
         "title": "Configs",
         "type": "array"
      },
      "graph": {
         "$ref": "#/$defs/GraphConfig"
      },
      "model": {
         "$ref": "#/$defs/ModelConfig",
         "default": {
            "external_file": null,
            "class_name": "Unilateral",
            "constructor": "binary",
            "max_time": 10,
            "named_params": null,
            "kwargs": {}
         }
      },
      "distributions": {
         "additionalProperties": {
            "$ref": "#/$defs/DistributionConfig"
         },
         "default": {},
         "description": "Mapping of model T-categories to predefined distributions over diagnose times.",
         "title": "Distributions",
         "type": "object"
      },
      "cache_dir": {
         "default": "/home/docs/checkouts/readthedocs.org/user_builds/lyscripts/checkouts/latest/docs/source/.cache",
         "description": "Cache directory for storing function calls.",
         "format": "path",
         "title": "Cache Dir",
         "type": "string"
      },
      "scenarios": {
         "default": [],
         "description": "List of scenarios to compute risks for.",
         "items": {
            "$ref": "#/$defs/ScenarioConfig"
         },
         "title": "Scenarios",
         "type": "array"
      },
      "sampling": {
         "$ref": "#/$defs/SamplingConfig"
      },
      "modalities": {
         "additionalProperties": {
            "$ref": "#/$defs/ModalityConfig"
         },
         "default": {},
         "description": "Maps names of diagnostic modalities to their specificity/sensitivity.",
         "title": "Modalities",
         "type": "object"
      },
      "risks": {
         "$ref": "#/$defs/HDF5FileStorage",
         "description": "Storage for the computed risks."
      }
   },
   "$defs": {
      "DiagnosisConfig": {
         "description": "Defines an ipsi- and contralateral diagnosis pattern.",
         "properties": {
            "ipsi": {
               "additionalProperties": {
                  "additionalProperties": {
                     "anyOf": [
                        {
                           "enum": [
                              false,
                              0,
                              "healthy",
                              true,
                              1,
                              "involved",
                              "micro",
                              "macro",
                              "notmacro"
                           ]
                        },
                        {
                           "type": "null"
                        }
                     ]
                  },
                  "type": "object"
               },
               "default": {},
               "description": "Observed diagnoses by different modalities on the ipsi neck.",
               "examples": [
                  {
                     "CT": {
                        "II": true,
                        "III": false
                     }
                  }
               ],
               "title": "Ipsi",
               "type": "object"
            },
            "contra": {
               "additionalProperties": {
                  "additionalProperties": {
                     "anyOf": [
                        {
                           "enum": [
                              false,
                              0,
                              "healthy",
                              true,
                              1,
                              "involved",
                              "micro",
                              "macro",
                              "notmacro"
                           ]
                        },
                        {
                           "type": "null"
                        }
                     ]
                  },
                  "type": "object"
               },
               "default": {},
               "description": "Observed diagnoses by different modalities on the contra neck.",
               "title": "Contra",
               "type": "object"
            }
         },
         "title": "DiagnosisConfig",
         "type": "object"
      },
      "DistributionConfig": {
         "description": "Configuration defining a distribution over diagnose times.",
         "properties": {
            "kind": {
               "default": "frozen",
               "description": "Parametric distributions may be updated.",
               "enum": [
                  "frozen",
                  "parametric"
               ],
               "title": "Kind",
               "type": "string"
            },
            "func": {
               "const": "binomial",
               "default": "binomial",
               "description": "Name of predefined function to use as distribution.",
               "title": "Func",
               "type": "string"
            },
            "params": {
               "additionalProperties": {
                  "anyOf": [
                     {
                        "type": "integer"
                     },
                     {
                        "type": "number"
                     }
                  ]
               },
               "default": {},
               "description": "Parameters to pass to the predefined function.",
               "title": "Params",
               "type": "object"
            }
         },
         "title": "DistributionConfig",
         "type": "object"
      },
      "GraphConfig": {
         "description": "Specifies how the tumor(s) and LNLs are connected in a DAG.",
         "properties": {
            "tumor": {
               "additionalProperties": {
                  "items": {
                     "type": "string"
                  },
                  "type": "array"
               },
               "description": "Define the name of the tumor(s) and which LNLs it/they drain to.",
               "title": "Tumor",
               "type": "object"
            },
            "lnl": {
               "additionalProperties": {
                  "items": {
                     "type": "string"
                  },
                  "type": "array"
               },
               "description": "Define the name of the LNL(s) and which LNLs it/they drain to.",
               "title": "Lnl",
               "type": "object"
            }
         },
         "required": [
            "tumor",
            "lnl"
         ],
         "title": "GraphConfig",
         "type": "object"
      },
      "HDF5FileStorage": {
         "description": "HDF5 file storage for in- and outputs of computations.",
         "properties": {
            "file": {
               "description": "Path to the HDF5 file. Parent directories are created if needed.",
               "format": "path",
               "title": "File",
               "type": "string"
            },
            "dataset": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Name of the dataset in the HDF5 file. Save/load methods can override this.",
               "title": "Dataset"
            }
         },
         "required": [
            "file"
         ],
         "title": "HDF5FileStorage",
         "type": "object"
      },
      "InvolvementConfig": {
         "description": "Config that defines an ipsi- and contralateral involvement pattern.",
         "properties": {
            "ipsi": {
               "additionalProperties": {
                  "anyOf": [
                     {
                        "enum": [
                           false,
                           0,
                           "healthy",
                           true,
                           1,
                           "involved",
                           "micro",
                           "macro",
                           "notmacro"
                        ]
                     },
                     {
                        "type": "null"
                     }
                  ]
               },
               "default": {},
               "description": "Involvement pattern for the ipsilateral side of the neck.",
               "examples": [
                  {
                     "II": true,
                     "III": false
                  }
               ],
               "title": "Ipsi",
               "type": "object"
            },
            "contra": {
               "additionalProperties": {
                  "anyOf": [
                     {
                        "enum": [
                           false,
                           0,
                           "healthy",
                           true,
                           1,
                           "involved",
                           "micro",
                           "macro",
                           "notmacro"
                        ]
                     },
                     {
                        "type": "null"
                     }
                  ]
               },
               "default": {},
               "description": "Involvement pattern for the contralateral side of the neck.",
               "title": "Contra",
               "type": "object"
            }
         },
         "title": "InvolvementConfig",
         "type": "object"
      },
      "ModalityConfig": {
         "description": "Define a diagnostic or pathological modality.",
         "properties": {
            "spec": {
               "description": "Specificity of the modality.",
               "maximum": 1.0,
               "minimum": 0.5,
               "title": "Spec",
               "type": "number"
            },
            "sens": {
               "description": "Sensitivity of the modality.",
               "maximum": 1.0,
               "minimum": 0.5,
               "title": "Sens",
               "type": "number"
            },
            "kind": {
               "default": "clinical",
               "description": "Clinical modalities cannot detect microscopic disease.",
               "enum": [
                  "clinical",
                  "pathological"
               ],
               "title": "Kind",
               "type": "string"
            }
         },
         "required": [
            "spec",
            "sens"
         ],
         "title": "ModalityConfig",
         "type": "object"
      },
      "ModelConfig": {
         "description": "Define which of the ``lymph`` models to use and how to set them up.",
         "properties": {
            "external_file": {
               "anyOf": [
                  {
                     "format": "file-path",
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Path to a Python file that defines a model.",
               "title": "External File"
            },
            "class_name": {
               "default": "Unilateral",
               "description": "Name of the model class to use.",
               "enum": [
                  "Unilateral",
                  "Bilateral",
                  "Midline"
               ],
               "title": "Class Name",
               "type": "string"
            },
            "constructor": {
               "default": "binary",
               "description": "Trinary models differentiate btw. micro- and macroscopic disease.",
               "enum": [
                  "binary",
                  "trinary"
               ],
               "title": "Constructor",
               "type": "string"
            },
            "max_time": {
               "default": 10,
               "description": "Max. number of time-steps to evolve the model over.",
               "title": "Max Time",
               "type": "integer"
            },
            "named_params": {
               "default": null,
               "description": "Subset of valid model parameters a sampler may provide in the form of a dictionary to the model instead of as an array. Or, after sampling, with this list, one may safely recover which parameter corresponds to which index in the sample.",
               "items": {
                  "type": "string"
               },
               "title": "Named Params",
               "type": "array"
            },
            "kwargs": {
               "additionalProperties": true,
               "default": {},
               "description": "Additional keyword arguments to pass to the model constructor.",
               "title": "Kwargs",
               "type": "object"
            }
         },
         "title": "ModelConfig",
         "type": "object"
      },
      "SamplingConfig": {
         "description": "Settings to configure the MCMC sampling.",
         "properties": {
            "storage_file": {
               "description": "Path to HDF5 file store results or load last state.",
               "format": "path",
               "title": "Storage File",
               "type": "string"
            },
            "history_file": {
               "anyOf": [
                  {
                     "format": "path",
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Path to store the burn-in metrics (as CSV file).",
               "title": "History File"
            },
            "dataset": {
               "default": "mcmc",
               "description": "Name of the dataset in the HDF5 file.",
               "title": "Dataset",
               "type": "string"
            },
            "cores": {
               "anyOf": [
                  {
                     "exclusiveMinimum": 0,
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": 2,
               "description": "Number of cores to use for parallel sampling. If `None`, no parallel processing is used.",
               "title": "Cores"
            },
            "seed": {
               "default": 42,
               "description": "Seed for the random number generator.",
               "title": "Seed",
               "type": "integer"
            },
            "walkers_per_dim": {
               "default": 20,
               "description": "Number of walkers per parameter space dimension.",
               "title": "Walkers Per Dim",
               "type": "integer"
            },
            "check_interval": {
               "default": 50,
               "description": "Check for convergence each time after this many steps.",
               "title": "Check Interval",
               "type": "integer"
            },
            "trust_factor": {
               "default": 50.0,
               "description": "Trust the autocorrelation time only when it's smaller than this factor times the length of the chain.",
               "title": "Trust Factor",
               "type": "number"
            },
            "relative_thresh": {
               "default": 0.05,
               "description": "Relative threshold for convergence.",
               "title": "Relative Thresh",
               "type": "number"
            },
            "burnin_steps": {
               "anyOf": [
                  {
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Number of burn-in steps to take. If None, burn-in runs until convergence.",
               "title": "Burnin Steps"
            },
            "num_steps": {
               "anyOf": [
                  {
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": 100,
               "description": "Number of steps to take in the MCMC sampling.",
               "title": "Num Steps"
            },
            "thin_by": {
               "default": 10,
               "description": "How many samples to draw before for saving one.",
               "title": "Thin By",
               "type": "integer"
            },
            "inverse_temp": {
               "default": 1.0,
               "description": "Inverse temperature for thermodynamic integration. Note that this is not yet fully implemented.",
               "title": "Inverse Temp",
               "type": "number"
            }
         },
         "required": [
            "storage_file"
         ],
         "title": "SamplingConfig",
         "type": "object"
      },
      "ScenarioConfig": {
         "description": "Define a scenario for which e.g. prevalences and risks may be computed.",
         "properties": {
            "t_stages": {
               "description": "List of T-stages to marginalize over in the scenario.",
               "examples": [
                  [
                     "early"
                  ],
                  [
                     3,
                     4
                  ]
               ],
               "items": {
                  "anyOf": [
                     {
                        "type": "integer"
                     },
                     {
                        "type": "string"
                     }
                  ]
               },
               "title": "T Stages",
               "type": "array"
            },
            "t_stages_dist": {
               "default": [
                  1.0
               ],
               "description": "Distribution over T-stages to use for marginalization.",
               "examples": [
                  [
                     1.0
                  ],
                  [
                     0.6,
                     0.4
                  ]
               ],
               "items": {
                  "type": "number"
               },
               "title": "T Stages Dist",
               "type": "array"
            },
            "midext": {
               "anyOf": [
                  {
                     "type": "boolean"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Whether the patient's tumor extends over the midline.",
               "title": "Midext"
            },
            "mode": {
               "default": "HMM",
               "description": "Which underlying model architecture to use.",
               "enum": [
                  "HMM",
                  "BN"
               ],
               "title": "Mode",
               "type": "string"
            },
            "involvement": {
               "$ref": "#/$defs/InvolvementConfig",
               "default": {
                  "ipsi": {},
                  "contra": {}
               }
            },
            "diagnosis": {
               "$ref": "#/$defs/DiagnosisConfig",
               "default": {
                  "ipsi": {},
                  "contra": {}
               }
            }
         },
         "required": [
            "t_stages"
         ],
         "title": "ScenarioConfig",
         "type": "object"
      }
   },
   "required": [
      "graph",
      "sampling",
      "risks"
   ]
}
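As a rough guide to what a configuration satisfying this schema might contain, the sketch below builds one as a Python dictionary and checks it against the `required` lists shown in the schema. All concrete values (LNL names, file names, sensitivity and specificity) are hypothetical, and `missing_keys` is an illustrative helper, not part of lyscripts. The real tool validates configurations through pydantic.

```python
# Required keys taken from the schema: top level first, then nested objects.
REQUIRED = {
    "": ["graph", "sampling", "risks"],
    "graph": ["tumor", "lnl"],
    "sampling": ["storage_file"],
    "risks": ["file"],
}


def missing_keys(config):
    """Return dotted paths of required keys that are absent from `config`."""
    missing = []
    for section, keys in REQUIRED.items():
        if section and section not in config:
            continue  # already reported as missing at the top level
        obj = config[section] if section else config
        prefix = f"{section}." if section else ""
        missing.extend(prefix + key for key in keys if key not in obj)
    return missing


# Hypothetical configuration; in practice this would live in a YAML file.
config = {
    "version": 1,
    "graph": {
        "tumor": {"T": ["II", "III"]},
        "lnl": {"II": ["III"], "III": []},
    },
    "model": {"class_name": "Unilateral", "constructor": "binary"},
    "modalities": {"CT": {"spec": 0.76, "sens": 0.81}},
    "scenarios": [
        {
            "t_stages": ["early"],
            "involvement": {"ipsi": {"II": True}},
            "diagnosis": {"ipsi": {"CT": {"II": True, "III": False}}},
        }
    ],
    "sampling": {"storage_file": "samples.hdf5"},
    "risks": {"file": "risks.hdf5"},
}

problems = missing_keys(config)
```

An empty `problems` list means every key marked as required in the schema is present. It does not check types or value ranges such as the 0.5 to 1.0 bounds on `spec` and `sens`.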

field modalities: dict[str, ModalityConfig] = {}

Maps names of diagnostic modalities to their specificity/sensitivity.

field risks: HDF5FileStorage [Required]

Storage for the computed risks.

cli_cmd() → None

Start the risks subcommand.

Command Help

Usage: lyscripts compute risks [-h] [--configs list[Path]] [--graph [JSON]]
                               [--graph.tumor dict[str,list[str]]]
                               [--graph.lnl dict[str,list[str]]]
                               [--model [JSON]]
                               [--model.external-file {Path,null}]
                               [--model.class-name {Unilateral,Bilateral,Midline}]
                               [--model.constructor {binary,trinary}]
                               [--model.max-time int]
                               [--model.named-params Sequence[str]]
                               [--model.kwargs dict[str,Any]]
                               [--distributions dict[str,JSON]]
                               [--cache-dir Path] [--scenarios list[JSON]]
                               [--sampling [JSON]]
                               [--sampling.storage-file Path]
                               [--sampling.history-file {Path,null}]
                               [--sampling.dataset str]
                               [--sampling.cores {int,null}]
                               [--sampling.seed int]
                               [--sampling.walkers-per-dim int]
                               [--sampling.check-interval int]
                               [--sampling.trust-factor float]
                               [--sampling.relative-thresh float]
                               [--sampling.burnin-steps {int,null}]
                               [--sampling.num-steps {int,null}]
                               [--sampling.thin-by int]
                               [--sampling.inverse-temp float]
                               [--modalities dict[str,JSON]] [--risks [JSON]]
                               [--risks.file Path]
                               [--risks.dataset {str,null}]

Predict the risk of involvement scenarios from model samples given diagnoses.

Options:
  -h, --help            show this help message and exit
  --configs list[Path]  Path to the YAML file(s) that contain the
                        configuration(s). Configs from YAML files may be
                        overwritten by command line arguments. When multiple
                        files are specified, the configs are merged in the
                        order they are given. Note that every config file must
                        have a `version: 1` key in it. (default:
                        ['config.yaml'])
  --distributions dict[str,JSON]
                        Mapping of model T-categories to predefined
                        distributions over diagnose times. (default: {})
  --cache-dir Path      Cache directory for storing function calls. (default:
                        /home/docs/checkouts/readthedocs.org/user_builds/lyscr
                        ipts/checkouts/latest/docs/source/.cache)
  --scenarios list[JSON]
                        List of scenarios to compute risks for. (default: [])
  --modalities dict[str,JSON]
                        Maps names of diagnostic modalities to their
                        specificity/sensitivity. (default: {})

Graph Options:
  Specifies how the tumor(s) and LNLs are connected in a DAG.

  --graph [JSON]        set graph from JSON string (default: {})
  --graph.tumor dict[str,list[str]]
                        Define the name of the tumor(s) and which LNLs it/they
                        drain to. (required)
  --graph.lnl dict[str,list[str]]
                        Define the name of the LNL(s) and which LNLs it/they
                        drain to. (required)

Model Options:
  Define which of the ``lymph`` models to use and how to set them up.

  --model [JSON]        set model from JSON string (default: {})
  --model.external-file {Path,null}
                        Path to a Python file that defines a model. (default:
                        None)
  --model.class-name {Unilateral,Bilateral,Midline}
                        Name of the model class to use. (default: Unilateral)
  --model.constructor {binary,trinary}
                        Trinary models differentiate btw. micro- and
                        macroscopic disease. (default: binary)
  --model.max-time int  Max. number of time-steps to evolve the model over.
                        (default: 10)
  --model.named-params Sequence[str]
                        Subset of valid model parameters a sampler may provide
                        in the form of a dictionary to the model instead of as
                        an array. Or, after sampling, with this list, one may
                        safely recover which parameter corresponds to which
                        index in the sample. (default: None)
  --model.kwargs dict[str,Any]
                        Additional keyword arguments to pass to the model
                        constructor. (default: {})

Sampling Options:
  Settings to configure the MCMC sampling.

  --sampling [JSON]     set sampling from JSON string (default: {})
  --sampling.storage-file Path
                        Path to HDF5 file store results or load last state.
                        (required)
  --sampling.history-file {Path,null}
                        Path to store the burn-in metrics (as CSV file).
                        (default: null)
  --sampling.dataset str
                        Name of the dataset in the HDF5 file. (default: mcmc)
  --sampling.cores {int,null}
                        Number of cores to use for parallel sampling. If
                        `None`, no parallel processing is used. (default: 2)
  --sampling.seed int   Seed for the random number generator. (default: 42)
  --sampling.walkers-per-dim int
                        Number of walkers per parameter space dimension.
                        (default: 20)
  --sampling.check-interval int
                        Check for convergence each time after this many steps.
                        (default: 50)
  --sampling.trust-factor float
                        Trust the autocorrelation time only when it's smaller
                        than this factor times the length of the chain.
                        (default: 50.0)
  --sampling.relative-thresh float
                        Relative threshold for convergence. (default: 0.05)
  --sampling.burnin-steps {int,null}
                        Number of burn-in steps to take. If None, burn-in runs
                        until convergence. (default: null)
  --sampling.num-steps {int,null}
                        Number of steps to take in the MCMC sampling.
                        (default: 100)
  --sampling.thin-by int
                        How many samples to draw before for saving one.
                        (default: 10)
  --sampling.inverse-temp float
                        Inverse temperature for thermodynamic integration.
                        Note that this is not yet fully implemented. (default:
                        1.0)

Risks Options:
  HDF5 file storage for in- and outputs of computations.

  --risks [JSON]        set risks from JSON string (default: {})
  --risks.file Path     Path to the HDF5 file. Parent directories are created
                        if needed. (required)
  --risks.dataset {str,null}
                        Name of the dataset in the HDF5 file. Save/load
                        methods can override this. (default: null)
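The options above can be combined into an invocation. The following sketch only assembles the argument list and does not run the command. The helper `build_risks_command` and the file names are hypothetical, and passing a single path to `--configs` is an assumption about how the list-valued option is parsed.

```python
import shlex


def build_risks_command(config_file, risks_file, dataset=None):
    """Assemble an argument list for `lyscripts compute risks`."""
    argv = [
        "lyscripts", "compute", "risks",
        "--configs", config_file,    # YAML file with graph, sampling, etc.
        "--risks.file", risks_file,  # HDF5 file to store the computed risks
    ]
    if dataset is not None:
        argv += ["--risks.dataset", dataset]
    return argv


argv = build_risks_command("config.yaml", "risks.hdf5", dataset="risks")
# Render the list as a single shell-safe string, e.g. for logging.
command_line = shlex.join(argv)
```

The resulting list could be handed to `subprocess.run(argv, check=True)` in an environment where lyscripts is installed. Values given on the command line take precedence over those in the YAML file, as stated in the `--configs` help text.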
