Datasets
Arkimet supports different dataset formats, which offer different performance and indexing features to match various ways in which data is stored and queried.
Contents:
Dataset configuration
Datasets are configured with simple key = value configuration options.
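As a rough illustration of the key = value format, the sketch below parses a made-up dataset configuration with Python's standard configparser module. The section name `example` and every value are hypothetical, chosen only to show the shape of the options. Arkimet does not necessarily read its configuration this way.

```python
import configparser

# Hypothetical dataset configuration: the section name and all values
# are illustrative assumptions, not taken from a real dataset.
SAMPLE = """
[example]
type = iseg
format = grib
step = daily
unique = reftime, area, product
index = reftime, product
replace = no
archive age = 30
delete age = 365
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE)
options = dict(parser["example"])

# Comma-separated options are split into lists of names.
unique = [name.strip() for name in options["unique"].split(",")]

# Age options are expressed as a number of days.
archive_age_days = int(options["archive age"])
```

Option names that contain a space, such as archive age, are ordinary keys: everything before the = sign is the name.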
These are all the supported options:
archive age
: data older than this number of days will be moved to the dataset archive during maintenance.
delete age
: data older than this number of days will be deleted during maintenance.
eatmydata
: disable fsync/fdatasync operations while writing data to the dataset, and disable sqlite's journaling and other data integrity features. This makes acquiring data very fast, but an interrupted import or a concurrent import may cause data corruption.
format
: format of data in the dataset (one of grib, bufr, odimh5, vm2).
index
: comma-separated list of names of metadata to index for faster queries.
path
: path to the dataset, or URL for remote datasets.
replace
: when yes, importing duplicate data will replace the existing version. When no, importing duplicate data will be rejected. When usn, importing duplicate BUFR data will replace the existing version only if the BUFR Update Sequence Number is greater than the one currently in the dataset.
restrict
: comma-separated list of names that have access to the dataset. This allows filtering with the --restrict option on the command line.
smallfiles
: yes or no. When yes, the file contents are also saved in the index, to speed up extraction of data with tiny payloads like vm2.
step
: segmentation step for the dataset (one of daily, weekly, biweekly, monthly, yearly).
type
: dataset type (one of iseg, simple, error, duplicates, remote, outbound, discard, file).
unique
: comma-separated list of names of metadata that, taken together, uniquely identify a data item in the dataset.
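The three values of the replace option amount to a small decision rule, sketched below. This is only an illustration of the behaviour described above, not Arkimet's implementation. The function name `should_replace` and its parameters are hypothetical, and raising an error when a Update Sequence Number is missing is an assumption made for the sketch.

```python
from typing import Optional


def should_replace(
    policy: str,
    existing_usn: Optional[int] = None,
    incoming_usn: Optional[int] = None,
) -> bool:
    """Decide whether a duplicate import replaces the stored version.

    Hypothetical helper mirroring the documented `replace` values.
    """
    if policy == "yes":
        # Duplicates always replace the existing version.
        return True
    if policy == "no":
        # Duplicates are rejected.
        return False
    if policy == "usn":
        # Assumption: both BUFR Update Sequence Numbers must be known.
        if existing_usn is None or incoming_usn is None:
            raise ValueError("usn policy needs both sequence numbers")
        # Replace only when the incoming number is strictly greater.
        return incoming_usn > existing_usn
    raise ValueError(f"unknown replace policy: {policy!r}")
```

For example, under the usn policy an incoming message with sequence number 2 would replace a stored one with sequence number 1, while an incoming message with the same or a lower number would not.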