A dataset is a curated, versioned view of your data, prepared so it can be analyzed and visualized. Where an asset holds your files, a dataset adds versioning and preprocessing: DataErai analyzes each version to understand its structure so it’s ready to explore.

Datasets and versions

A dataset is the logical identity of your data. Each time you provide new content, DataErai adds a new version under the same dataset rather than creating a separate dataset, so v1, v2, and so on form a history. Each version is an immutable snapshot: once created, its data doesn’t change.
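
To make the dataset/version relationship concrete, here is a minimal sketch of the idea in Python. It is an illustration only, not DataErai's implementation or API: the `Dataset` and `Version` classes, their fields, and the `add_version` method are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Version:
    """Hypothetical immutable snapshot: fields cannot be reassigned after creation."""
    number: int
    content: bytes


@dataclass
class Dataset:
    """Hypothetical logical identity that accumulates versions over time."""
    name: str
    versions: List[Version] = field(default_factory=list)

    def add_version(self, content: bytes) -> Version:
        # New content becomes the next version of the same dataset,
        # rather than a separate dataset.
        version = Version(number=len(self.versions) + 1, content=content)
        self.versions.append(version)
        return version


sales = Dataset(name="sales")
v1 = sales.add_version(b"id,amount\n1,10\n")
v2 = sales.add_version(b"id,amount\n1,10\n2,25\n")
```

In this sketch, `frozen=True` stands in for immutability: trying to assign to `v1.content` raises an error, while `sales.versions` keeps growing as a history.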

The version lifecycle

When a version is created, DataErai preprocesses it. You can watch its status as it moves through the pipeline:
Status | Meaning
Uploaded | The version exists; preprocessing hasn’t started.
Preprocessing queued | Waiting for a preprocessing run.
Preprocessing running | DataErai is analyzing the data.
Ready | Preprocessing succeeded; the version is ready to explore and visualize.
Preprocessing failed | Something went wrong; see the error and retry.
Deprecated | Superseded or retired.
A failed version can be retried, which puts it back in the queue. Older versions can be deprecated when they’re no longer current.
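
The lifecycle above can be read as a small state machine. The following Python sketch models it under stated assumptions: the status names come from the table, but the `Status` enum, the `ALLOWED` transition map, and the `advance` and `retry` helpers are hypothetical and are not part of any DataErai API. In particular, the sketch assumes that only a Ready version can be deprecated, which the text does not state explicitly.

```python
from enum import Enum


class Status(Enum):
    UPLOADED = "Uploaded"
    QUEUED = "Preprocessing queued"
    RUNNING = "Preprocessing running"
    READY = "Ready"
    FAILED = "Preprocessing failed"
    DEPRECATED = "Deprecated"


# Allowed transitions, as inferred from the status descriptions.
ALLOWED = {
    Status.UPLOADED: {Status.QUEUED},
    Status.QUEUED: {Status.RUNNING},
    Status.RUNNING: {Status.READY, Status.FAILED},
    Status.FAILED: {Status.QUEUED},      # a retry puts the version back in the queue
    Status.READY: {Status.DEPRECATED},   # assumption: only Ready versions are deprecated
    Status.DEPRECATED: set(),            # terminal in this sketch
}


def advance(current: Status, target: Status) -> Status:
    """Return the target status if the transition is allowed, else raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value!r} to {target.value!r}")
    return target


def retry(current: Status) -> Status:
    """Retry a failed version by sending it back to the queue."""
    if current is not Status.FAILED:
        raise ValueError("only a failed version can be retried")
    return advance(current, Status.QUEUED)


# Walk a version through a failure followed by a successful retry.
status = Status.UPLOADED
for step in (Status.QUEUED, Status.RUNNING, Status.FAILED):
    status = advance(status, step)
status = retry(status)
for step in (Status.RUNNING, Status.READY):
    status = advance(status, step)
```

Keeping the transitions in a single map makes it easy to adjust the sketch if the real pipeline allows other moves, such as deprecating a failed version.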
For tabular data, DataErai can prepare a dataset for you automatically. When you select a tabular asset (such as a CSV), it promotes the asset to a dataset, preprocesses it in the background, and shows a chart as soon as the version is Ready.

What you get when a version is Ready

Once a version reaches Ready, DataErai has analyzed it so you can:

Next steps

Understand your data

Schema inference and summary statistics for tabular data.

Visualize your data

Histograms and category charts, right in DataErai.