cluster_labels {stratifiedyh}    R Documentation

Cluster Sampling and Labeling

Description

Performs cluster sampling on a data frame: a subset of the clusters defined by group_col is selected, and each row is labeled "Yes" if its cluster was selected and "No" otherwise.

Usage

cluster_labels(df, group_col, yes_percentage)

Arguments

df

A data frame containing the data.

group_col

A character string naming the column of df whose distinct values define the clusters.

yes_percentage

A numeric value between 0 and 100 indicating the percentage of clusters to label as "Yes".

Value

A data frame with an additional column "Clustered_Yes_No" containing the cluster-sampled "Yes"/"No" labels.

Examples

result <- cluster_labels(iris, group_col = "Species", yes_percentage = 50)
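The behaviour described above can be approximated in base R. The following is a minimal sketch, not the package's source: the name cluster_labels_sketch is hypothetical, and both the rounding rule for the number of "Yes" clusters and the use of simple random selection are assumptions, since this page does not specify them.

```r
# Hypothetical illustration only; not the implementation shipped in stratifiedyh.
cluster_labels_sketch <- function(df, group_col, yes_percentage) {
  stopifnot(is.data.frame(df),
            group_col %in% names(df),
            yes_percentage >= 0, yes_percentage <= 100)

  # Each distinct value of the grouping column is one cluster.
  clusters <- unique(df[[group_col]])

  # Assumption: the cluster count is rounded with round(); the real
  # function may use floor() or ceiling() instead.
  n_yes <- round(length(clusters) * yes_percentage / 100)

  # Assumption: clusters are chosen by simple random sampling without
  # replacement. Indexing via sample.int() avoids the surprising
  # behaviour of sample() on a length-one numeric vector.
  yes_clusters <- clusters[sample.int(length(clusters), n_yes)]

  # Every row inherits the label of its cluster.
  df$Clustered_Yes_No <- ifelse(df[[group_col]] %in% yes_clusters, "Yes", "No")
  df
}

set.seed(42)  # fix the random selection so the run is reproducible
result <- cluster_labels_sketch(iris, group_col = "Species", yes_percentage = 50)

# Cross-tabulate to confirm that labels are constant within each cluster.
table(result$Species, result$Clustered_Yes_No)
```

Because whole clusters are labeled, the share of rows marked "Yes" can differ from yes_percentage when clusters are unequal in size or when the requested percentage does not correspond to a whole number of clusters.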

[Package stratifiedyh version 0.1.0 Index]