How the datasets are built

This page explains how samples are collected, reviewed, and packaged before they are listed as downloadable bundles.

Overview

  • Samples are collected from a defined generator or model family.
  • Files are reviewed, cleaned, and checked for duplicates or bad outputs.
  • Edge cases are identified and retained intentionally when they add evaluation value.
  • Bundles are versioned so updates can be tracked clearly over time.
  • Only the files intended for release are included in the final package.
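As a rough illustration of the duplicate check, versioning, and packaging steps above, the sketch below builds a release manifest for one bundle. It is an assumption-laden example, not the actual tooling behind these datasets: the function names, the manifest fields, and the version string format are all hypothetical, and it only catches byte-identical duplicates (near-duplicates and bad outputs would still need separate review).

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_manifest(sample_dir: Path, version: str) -> dict:
    """List unique files under sample_dir, skipping byte-identical duplicates.

    Hypothetical helper: the manifest layout is an assumption for illustration.
    """
    seen = {}        # digest -> relative path of the first file with that content
    duplicates = []  # relative paths skipped as exact duplicates
    # Sort so the same input directory always yields the same manifest.
    for path in sorted(p for p in sample_dir.rglob("*") if p.is_file()):
        rel = path.relative_to(sample_dir).as_posix()
        digest = sha256_of(path)
        if digest in seen:
            duplicates.append(rel)
        else:
            seen[digest] = rel
    files = [{"path": rel, "sha256": digest} for digest, rel in seen.items()]
    return {"version": version, "files": files, "duplicates_skipped": duplicates}
```

Calling build_manifest(Path("samples"), version="1.0.0") on a hypothetical samples folder would return a dictionary that lists only the files intended for release, each with a checksum, alongside the version label. Storing the checksums makes it possible to compare two bundle versions file by file.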

This page is public so buyers can review how the datasets are assembled before purchasing.