Data cleaning: Why is the default threshold set at 80%?

Most statistical tests require complete data without missing values. In the WebIDQ workflow, data cleaning is therefore followed by imputation, which substitutes missing values with estimates that are as realistic as possible.

However, too many missing, and therefore imputed, values make the statistical analysis unreliable, and conclusions should not rest mainly on imputed data. The data cleaning step ensures that statistical comparisons are performed only for metabolites with enough valid concentrations. While 80% is a widely used threshold for data cleaning, this value is not strictly defined. Lower thresholds (e.g. 70%) or higher thresholds (e.g. 90%) can also be used, depending on the study design and sample size.
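
To illustrate how such a threshold acts on a data set, here is a minimal Python sketch. It is not the WebIDQ implementation. It assumes that the threshold is the minimum fraction of valid concentrations a metabolite needs across all samples, and that missing values are marked as None. The function names and example data are hypothetical, and WebIDQ may apply the rule differently (see the user manual).

```python
def valid_fraction(values):
    """Return the fraction of entries holding a valid (non-missing) concentration."""
    if not values:
        return 0.0
    return sum(v is not None for v in values) / len(values)


def clean(data, threshold=0.8):
    """Keep only metabolites whose valid fraction reaches the threshold.

    Assumption: the threshold is evaluated per metabolite over all samples.
    """
    return {
        name: values
        for name, values in data.items()
        if valid_fraction(values) >= threshold
    }


# Hypothetical example data; None marks a missing concentration.
concentrations = {
    "Metabolite A": [1.2, 1.4, None, 1.1, 1.3, 1.0, 1.5, 1.2, 1.3, 1.1],   # 9 of 10 valid
    "Metabolite B": [0.4, None, None, 0.5, None, 0.6, 0.4, 0.5, 0.3, 0.4],  # 7 of 10 valid
}

kept_80 = clean(concentrations, threshold=0.8)  # default threshold
kept_70 = clean(concentrations, threshold=0.7)  # more permissive threshold
print(sorted(kept_80))
print(sorted(kept_70))
```

With the 80% threshold only "Metabolite A" is kept. Lowering the threshold to 70% also keeps "Metabolite B", at the cost of more imputed values entering the statistics.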

Info
For additional information, refer to the WebIDQ user manual > Data cleaning.