Computes an entropy-based uncertainty score for each observation from its membership vector, then identifies uncertain observations (locations) using either a mean-plus-SD threshold or a quantile threshold on those scores.
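For orientation, the quantities involved can be written out as below. This is the standard Shannon entropy and is a sketch of what the description implies, not a statement of the function's internals: the logarithm base, the convention \(0 \log 0 = 0\), and normalization by \(\log k\) are assumptions. Here \(u_{ij}\) is the membership of observation \(i\) in cluster \(j\), \(c\) stands for threshold_scale, and \(p\) stands for quantile_prob.

\[
H_i = -\sum_{j=1}^{k} u_{ij} \log u_{ij},
\qquad
H_i^{\text{norm}} = \frac{H_i}{\log k}
\]

\[
\tau_{\text{mean\_sd}} = \bar{H} + c \, s_H,
\qquad
\tau_{\text{quantile}} = Q_H(p)
\]

where \(\bar{H}\) and \(s_H\) are the mean and standard deviation of the entropy values over all \(n\) observations, and \(Q_H(p)\) is their \(p\)-th quantile. Under these assumptions the normalized entropy lies in \([0, 1]\), with 0 for a crisp membership vector and 1 for a uniform one.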

Usage

compute_uncertainty(
  u,
  normalized = TRUE,
  threshold_method = c("mean_sd", "quantile"),
  threshold_scale = 0.5,
  quantile_prob = NULL
)

Arguments

u

A numeric matrix of dimension \(n \times k\), where \(n\) is the number of observations and \(k\) is the number of clusters. Each row should be the membership vector of one observation (typically non-negative entries that sum to 1).

normalized

A logical value indicating whether to return normalized entropy. Defaults to TRUE.

threshold_method

A character string specifying the thresholding method. Must be one of "mean_sd" or "quantile".

threshold_scale

A numeric value specifying the multiplier applied to the standard deviation of the entropy values when threshold_method = "mean_sd". Defaults to 0.5.

quantile_prob

A numeric value between 0 and 1 specifying the quantile probability when threshold_method = "quantile". Defaults to NULL and is ignored when threshold_method = "mean_sd".

Value

A list containing:

  • entropy: A numeric vector of entropy values.

  • threshold: The threshold for entropy used to define uncertainty.

  • loc_uncertain: Indices of uncertain observations.

  • is_uncertain: A logical vector indicating uncertain observations.

  • threshold_method: The thresholding method used.

  • threshold_scale: The SD multiplier used when threshold_method = "mean_sd".

  • quantile_prob: The quantile probability used when threshold_method = "quantile", and NULL otherwise.

  • normalized: Whether entropy was normalized.
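
The following is a minimal, self-contained sketch of how a function with this interface and return value could be written. It is not the package's source code: the name sketch_uncertainty is hypothetical, and the natural logarithm, the \(0 \log 0 = 0\) convention, normalization by \(\log k\), the default quantile type, and the strict "greater than" comparison are all assumptions.

```r
# Hypothetical re-implementation for illustration only; not the package's code.
sketch_uncertainty <- function(u,
                               normalized = TRUE,
                               threshold_method = c("mean_sd", "quantile"),
                               threshold_scale = 0.5,
                               quantile_prob = NULL) {
  threshold_method <- match.arg(threshold_method)
  u <- as.matrix(u)
  k <- ncol(u)

  # Shannon entropy per row, treating 0 * log(0) as 0 (assumption).
  terms <- ifelse(u > 0, u * log(u), 0)
  entropy <- -rowSums(terms)

  # Divide by log(k) so values fall in [0, 1] (assumption; requires k > 1).
  if (normalized) entropy <- entropy / log(k)

  if (threshold_method == "mean_sd") {
    threshold <- mean(entropy) + threshold_scale * sd(entropy)
    quantile_prob <- NULL
  } else {
    if (is.null(quantile_prob)) stop("quantile_prob must be supplied")
    threshold <- unname(quantile(entropy, probs = quantile_prob))
  }

  # Strict inequality is an assumption; the real function may use >=.
  is_uncertain <- entropy > threshold

  list(
    entropy = entropy,
    threshold = threshold,
    loc_uncertain = which(is_uncertain),
    is_uncertain = is_uncertain,
    threshold_method = threshold_method,
    threshold_scale = threshold_scale,
    quantile_prob = quantile_prob,
    normalized = normalized
  )
}

# Toy membership matrix: rows 1 and 4 are nearly crisp, row 3 is nearly uniform.
u <- rbind(c(0.98, 0.01, 0.01),
           c(0.90, 0.05, 0.05),
           c(0.34, 0.33, 0.33),
           c(0.00, 1.00, 0.00))

res <- sketch_uncertainty(u)
res$loc_uncertain   # 3 under the assumptions above

res_q <- sketch_uncertainty(u, threshold_method = "quantile", quantile_prob = 0.5)
res_q$loc_uncertain # 2 3 under the assumptions above
```

With the mean-plus-SD rule only the nearly uniform row is flagged, while the median-based quantile rule flags the upper half of the entropy values, which shows how the choice of threshold_method changes the number of observations reported as uncertain.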