clean_lab_result {lab2clean}    R Documentation

Clean and Standardize Laboratory Result Values

Description

This function cleans and standardizes laboratory result values. It adds two new columns, "clean_result" and "scale_type", without altering the original result values. The function is part of an R package for cleaning laboratory datasets.

Usage

clean_lab_result(
  lab_data,
  raw_result,
  locale = "NO",
  report = TRUE,
  n_records = NA
)
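
A minimal sketch of a call, for illustration. It assumes that raw_result is given as the column name in a string; the toy data frame and its values are hypothetical, and the call is only made when the package is installed.

```r
# Hypothetical toy data; the values are made up for illustration.
lab_data <- data.frame(
  raw_result = c("1.5", "<0,5", "Positive", "3+"),
  stringsAsFactors = FALSE
)

# Only call the function when lab2clean is available.
if (requireNamespace("lab2clean", quietly = TRUE)) {
  cleaned <- lab2clean::clean_lab_result(
    lab_data,
    raw_result = "raw_result",  # assumed: column name passed as a string
    locale = "NO",
    report = TRUE
  )
  print(names(cleaned))
}
```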

Arguments

lab_data

A data frame containing laboratory data.

raw_result

The column in lab_data that contains raw result values to be cleaned.

locale

A string representing the locale for the laboratory data. Defaults to "NO".

report

Logical. If TRUE, a report is written to the console. Defaults to TRUE.

n_records

If lab_data is a grouped list of distinct results, set n_records to the column that contains the frequency of each distinct result. Defaults to NA.
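
The n_records argument can be illustrated with a short sketch that collapses repeated results into distinct values with their counts. The column names raw_result and frequency are hypothetical, and passing them as strings is an assumption.

```r
# Hypothetical raw results with many repeats.
raw <- c("Negative", "1.5", "Negative", "1.5", "Negative", "<5")

# Collapse to distinct values with their frequencies (base R only).
counts <- table(raw)
distinct_results <- data.frame(
  raw_result = names(counts),
  frequency  = as.integer(counts),
  stringsAsFactors = FALSE
)

# Pass the frequency column through n_records, if lab2clean is installed.
if (requireNamespace("lab2clean", quietly = TRUE)) {
  cleaned <- lab2clean::clean_lab_result(
    distinct_results,
    raw_result = "raw_result",  # assumed: column names given as strings
    n_records  = "frequency"
  )
}
```

Working on distinct values keeps the input small when the same result string occurs many times.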

Details

The function applies the following steps:

  1. Clear Typos: Removes typographical errors and extraneous characters.

  2. Handle Extra Variables: Identifies and separates extra variables from result values.

  3. Detect and Assign Scale Types: Identifies and assigns the scale type using regular expressions.

  4. Number Formatting: Standardizes number formats based on predefined rules and locale.

  5. Mining Text Results: Identifies common words and patterns in text results.
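
Steps 3 and 4 can be sketched in base R as follows. This is an illustration of the general idea only, not the package's actual rules: the helper names parse_number and guess_scale_type, the patterns, and the scale labels are all hypothetical.

```r
# Illustrative only: NOT the package's actual rules or patterns.
# Parse a numeric string, treating "," as the decimal mark when asked.
parse_number <- function(x, decimal_mark = ".") {
  if (decimal_mark == ",") {
    x <- gsub(".", "", x, fixed = TRUE)   # drop thousands separators
    x <- sub(",", ".", x, fixed = TRUE)   # comma becomes the decimal point
  } else {
    x <- gsub(",", "", x, fixed = TRUE)   # drop thousands separators
  }
  as.numeric(x)
}

# Guess a scale type from the shape of the string (hypothetical labels).
guess_scale_type <- function(x) {
  x <- trimws(x)
  ifelse(
    grepl("^[<>]?=?[[:space:]]*[0-9]+([.,][0-9]+)?$", x), "quantitative",
    ifelse(grepl("^[0-9]\\+$|^(pos|neg)", x, ignore.case = TRUE),
           "ordinal", "other")
  )
}

guess_scale_type(c("1,5", "< 0.5", "3+", "Negative", "see comment"))
parse_number("1.234,5", decimal_mark = ",")
```

The same string can be read differently depending on the decimal mark, which is why number formatting needs locale information.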

Internal Datasets: The function uses an internal dataset, common_words_languages.csv, which contains common words in various languages. These words are used to identify patterns in text result values.
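
As a rough sketch of how a word table can support pattern identification, the example below maps free-text results to a standard label. The table contents, the labels, and the helper standardize_text are hypothetical stand-ins and do not reflect the structure of the package's internal dataset.

```r
# Hypothetical stand-in for a common-words table; not the package's dataset.
common_words <- data.frame(
  word    = c("negative", "negativ", "positive", "positiv"),
  meaning = c("Neg", "Neg", "Pos", "Pos"),
  stringsAsFactors = FALSE
)

# Map free-text results to a standard label by exact, case-insensitive match.
standardize_text <- function(x, words = common_words) {
  key <- tolower(trimws(x))
  words$meaning[match(key, words$word)]   # NA when no word matches
}

standardize_text(c("Positive", " NEGATIV ", "hemolyzed"))
```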

Value

A modified lab_data data frame with two additional columns, "clean_result" and "scale_type".

Note

This function is part of a larger data cleaning pipeline and should be evaluated in that context. The package framework includes functions for cleaning result values and validating quantitative results for each test identifier.

Run time depends on the size of lab_data. For large datasets, pre-processing may be needed, for example collapsing the data to distinct results and passing their frequencies through n_records.

Author(s)

Ahmed Zayed ahmed.zayed@kuleuven.be

See Also

Function 2 for result validation.


[Package lab2clean version 1.0.0 Index]