For information on how to acknowledge usage of data sets in SPARCL, see the acknowledgments page.
This page documents the provenance of the data sets and the expected number
of records in each data set before and after applying selection criteria
(where applicable). It further includes descriptions of possible problems
encountered when ingesting the records into SPARCL.
Upon ingest, SPARCL performs a number of checks on all data sets.
- ... (specid in SPARCL) is less than -9223372036854775808 or greater than 9223372036854775807 (that is, outside the signed 64-bit integer range)

Any failures are tracked as part of the audit process and stored in an audit database.
We also document statistics on problematic cases at the end of the How-to use SPARCL Notebook.
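The bounds quoted above are the limits of a signed 64-bit integer. As a minimal sketch of such a range check (the helper name `fits_in_int64` is hypothetical and this is not SPARCL's actual ingest code):

```python
# Limits of a signed 64-bit integer, matching the bounds quoted in the text.
INT64_MIN = -(2**63)     # -9223372036854775808
INT64_MAX = 2**63 - 1    #  9223372036854775807


def fits_in_int64(value: int) -> bool:
    """Return True if `value` can be stored as a signed 64-bit integer."""
    return INT64_MIN <= value <= INT64_MAX
```

An identifier for which this returns False would fall into the failure case described above.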
The SDSS-DR16 and BOSS-DR16 data sets are both part of SDSS Data Release 16 (DR16) from the SDSS Collaboration
but are ingested separately in SPARCL because their catalog and spectra files follow different data models (SDSS vs.
BOSS spectrograph). One can query them jointly using datasetgroup='SDSS_BOSS'.
For BOSS-DR16 and SDSS-DR16, the input files are the following:
- specObj-dr16.fits (reference spectroscopic catalog used to create the list of selected records and to store information in the AUX fields)
- spPlate-{plate}-{mjd}.fits (spectra files used to ingest the SPECTRA fields; data models for the SDSS and BOSS instruments)
- spZbest-{plate}-{mjd}.fits (files with the best-fit model and some CORE fields; data models for the SDSS and BOSS instruments)

There are two main reasons why SDSS-DR16/BOSS-DR16 records are not ingested:
Reason rejected | # of records |
---|---|
Not found in specObj-dr16 file | 168000 |
No corresponding spZbest file | 0 |
NaN/Inf values in field(s) | 27 |
Reason rejected | # of records |
---|---|
Not found in specObj-dr16 file | 130560 |
No corresponding spZbest file | 871 |
NaN/Inf values in field(s) | 61 |
The original Dark Energy Spectroscopic Instrument (DESI)
redshift catalog for healpix spectra (zall-pix-fuji.fits; data model) contains 2,847,435 rows.
Selecting only rows for DESI targets (objtype='TGT'), we obtain 2,044,588 expected records.
The rejected entries are a combination of sky fibers and/or problematic fibers (objtype='SKY', 'BAD' or blank).
We opt to exclude them from SPARCL.
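The objtype selection described above can be sketched as follows. This is an illustration on toy rows, not the actual ingest code: the `targetid` values, the helper name `split_by_objtype`, and the in-memory list of dictionaries are assumptions made for the example (a real workflow would read the catalog file instead).

```python
from collections import Counter

# Toy catalog rows standing in for the redshift catalog (values are made up).
rows = [
    {"targetid": 1, "objtype": "TGT"},
    {"targetid": 2, "objtype": "SKY"},
    {"targetid": 3, "objtype": "BAD"},
    {"targetid": 4, "objtype": ""},
    {"targetid": 5, "objtype": "TGT"},
]


def split_by_objtype(rows):
    """Keep rows with objtype 'TGT'; tally the rejected rows by objtype."""
    selected = [r for r in rows if r["objtype"].strip() == "TGT"]
    rejected = Counter(
        r["objtype"].strip() or "blank"   # label empty objtype as 'blank'
        for r in rows
        if r["objtype"].strip() != "TGT"
    )
    return selected, rejected


selected, rejected = split_by_objtype(rows)
```

Here `selected` holds the rows that would be kept, and `rejected` counts the sky, bad, and blank entries separately.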
We do not apply any further quality cuts, but we recommend that users familiarize themselves with
quality flags of interest such as coadd_fiberstatus (non-zero if there is a warning or error
with the fiber) and zwarn (non-zero when there are possible issues with determining the redshift
or fitting the spectra, or when issues are carried over from the fiber status).
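Because both flags are zero when no issue is reported, a conservative user-side cut keeps only records where both are zero. A minimal sketch on toy records, assuming each record is a dictionary carrying the two flags (the helper name `is_clean` and the non-zero values are arbitrary placeholders, and whether such a strict cut suits a given science case is left to the user):

```python
def is_clean(record) -> bool:
    """True when neither the fiber status nor the redshift fit raised a flag."""
    return record["coadd_fiberstatus"] == 0 and record["zwarn"] == 0


# Toy records; the non-zero flag values are arbitrary placeholders.
records = [
    {"id": "a", "coadd_fiberstatus": 0, "zwarn": 0},
    {"id": "b", "coadd_fiberstatus": 8, "zwarn": 0},
    {"id": "c", "coadd_fiberstatus": 0, "zwarn": 4},
]

clean = [r for r in records if is_clean(r)]
```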
SPARCL only includes healpix-coadded spectra after they were joined across cameras, such that each
spectrum contains a single flux array for the full wavelength range (rather than separate vectors for
the B, R, Z spectrograph arms). Similarly, the inverse variance (ivar) and best-fit Redrock
template (model) were joined across cameras into single arrays.
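To illustrate what joining across cameras means, the sketch below merges per-arm arrays into one spectrum. It assumes that arms share identical wavelength values where they overlap and combines overlapping pixels with an inverse-variance weighted mean; that overlap rule, the helper name `join_cameras`, and the toy numbers are assumptions for illustration and not necessarily what the DESI pipeline or SPARCL does.

```python
def join_cameras(arms):
    """Merge (wave, flux, ivar) tuples, one per camera, into single arrays.

    Pixels that appear in more than one arm are combined with an
    inverse-variance weighted mean; their inverse variances are summed.
    """
    weighted_flux = {}   # wavelength -> sum of flux * ivar
    total_ivar = {}      # wavelength -> sum of ivar
    for wave, flux, ivar in arms:
        for w, f, iv in zip(wave, flux, ivar):
            weighted_flux[w] = weighted_flux.get(w, 0.0) + f * iv
            total_ivar[w] = total_ivar.get(w, 0.0) + iv

    wave_out = sorted(total_ivar)
    ivar_out = [total_ivar[w] for w in wave_out]
    # Pixels with zero total ivar carry no information; set their flux to 0.
    flux_out = [
        weighted_flux[w] / total_ivar[w] if total_ivar[w] > 0 else 0.0
        for w in wave_out
    ]
    return wave_out, flux_out, ivar_out


# Toy arms with one shared wavelength (5760.8); values are made up.
b_arm = ([5760.0, 5760.8], [1.0, 2.0], [1.0, 1.0])
r_arm = ([5760.8, 5761.6], [4.0, 5.0], [3.0, 1.0])
wave, flux, ivar = join_cameras([b_arm, r_arm])
```

In this toy case the shared pixel gets flux (2.0 * 1.0 + 4.0 * 3.0) / 4.0 = 3.5 and inverse variance 4.0.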
Information on data access through the Astro Data Lab and example notebooks showing how to use Astro Data Lab databases jointly with SPARCL are available here.
Reason rejected | # of records |
---|---|
Invalid objtype | 406737 |
NaN/Inf values in field(s) | 0 |
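
The "NaN/Inf values in field(s)" rows in the tables above count records containing non-finite values. A minimal sketch of such a test, assuming a hypothetical helper applied to a flat list of floats (the actual SPARCL check is not described on this page):

```python
import math


def has_nonfinite(values) -> bool:
    """Return True if any value is NaN, +Inf, or -Inf."""
    return any(not math.isfinite(v) for v in values)
```

A record whose array fields make this return True would be counted under that rejection reason.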