IRIS Data Reduction System design

Purpose

The IRIS Data Reduction System is planned to perform:

  • real-time (< 1 minute) and offline processing of IRIS imaging and spectroscopic data with the iris_pipeline Python package, which is based on stpipe, the pipeline package of JWST; see the documentation.

  • raw readout processing from the IRIS imager and spectrograph into raw science quality frames with the C library iris_readout at https://github.com/oirlab/iris_readout, which will be used directly during real-time operations and will be wrapped into Python modules in iris_pipeline for offline processing.

  • visualization of raw and reduced data to facilitate data assessment and analysis for real-time and offline use. These tools will be developed later and will possibly be based on existing community software tools like DS9 or cubeviz.

Software infrastructure

We rely on the excellent work, mostly by the Space Telescope Science Institute (STScI), to grow the Python in Astronomy ecosystem around the astropy package. STScI also developed a suite of open-source tools to operate JWST, based on its experience operating the Hubble Space Telescope.

The jwst Python package bundles several tools:

  • a jwst.datamodels package to handle custom schemas for complex hierarchical metadata

  • a stpipe package to configure and execute processing pipelines

  • a large array of data processing modules to analyze data from all instruments on board JWST

We leverage this effort by:

  • building a custom schema for IRIS

  • using stpipe to execute our pipelines

  • starting from JWST processing modules, customizing them for IRIS, and publishing them in the iris_pipeline repository https://github.com/oirlab/iris_pipeline.

File format

All data will be stored in FITS file format, following the JWST conventions as closely as possible, see https://jwst-docs.stsci.edu/understanding-data-files.

The formats of all the FITS files used by iris_pipeline are defined in the iris_pipeline.datamodels package and are encoded as schemas in YAML format.

For example, the FITS file format used for raw and reduced science frames is IRISImageModel. The model name is recorded in the FITS keyword DATAMODL:

DATAMODL= 'IRISImageModel'

The names and data types of all the extensions are encoded in the iris_image.schema.yaml file. Schema files can also reference other schema files. For example, iris_image.schema.yaml internally references tmt_core.schema.yaml, which includes all the metadata available as FITS headers, e.g. acquisition time, pupil, detector name.
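As a minimal sketch of how the DATAMODL keyword ties a file to its schema, the snippet below parses a string-valued FITS header card using only the Python standard library and looks up the schema file. The helper names parse_string_card, schema_for_card and the SCHEMA_FOR_MODEL table are hypothetical and are not part of iris_pipeline; only the IRISImageModel to iris_image.schema.yaml pairing comes from the text above. In practice a FITS library such as astropy would read the header.

```python
from typing import Optional, Tuple

# Only the pairing stated in this document; other models are deliberately omitted.
SCHEMA_FOR_MODEL = {"IRISImageModel": "iris_image.schema.yaml"}


def parse_string_card(card: str) -> Tuple[str, str]:
    """Split a FITS header card holding a quoted string into (keyword, value)."""
    keyword = card[:8].strip()  # the keyword occupies the first 8 characters
    if card[8:10] != "= ":      # value cards carry "= " right after the keyword
        raise ValueError(f"not a value card: {card!r}")
    body = card[10:].lstrip()
    if not body.startswith("'"):
        raise ValueError(f"expected a quoted string value: {card!r}")
    chars = []
    i = 1
    while i < len(body):
        if body[i] == "'":
            if body[i + 1:i + 2] == "'":  # a doubled quote is an escaped quote
                chars.append("'")
                i += 2
                continue
            break                          # closing quote found
        chars.append(body[i])
        i += 1
    else:
        raise ValueError(f"unterminated string value: {card!r}")
    # Trailing blanks inside the quotes are not significant in FITS strings.
    return keyword, "".join(chars).rstrip()


def schema_for_card(card: str) -> Optional[str]:
    """Return the schema file for a DATAMODL card, or None if it is unknown."""
    keyword, value = parse_string_card(card)
    if keyword != "DATAMODL":
        return None
    return SCHEMA_FOR_MODEL.get(value)


print(schema_for_card("DATAMODL= 'IRISImageModel'"))
```

Anything after the closing quote, such as a "/ comment" field, is ignored by this sketch.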

The currently implemented datamodels are:

  • IRISImageModel: raw and reduced frames from the imagers

  • TMTRampModel: raw readouts

  • TMTFlatModel: flats

  • TMTDarkModel: darks

All models are defined in iris_pipeline.datamodels, and their schemas are available within the package itself. Some of these models are only abstractions that group similar functionality and are never used directly.

Example run

The best way to understand how iris_pipeline works is to check out an example reduction of a raw science frame to a reduced science frame with flat-fielding and background subtraction.

Access calibration files via the Calibration Reference Data System (CRDS)

See the section about Calibration

Metadata

iris_pipeline requires a set of metadata from TMT and from other subsystems to process the data; see the list of required metadata.

Moreover, iris_pipeline will add keywords to the header of processed FITS files that categorize the data as follows:

OBSTYPE             OBSNAME                       Description
Calibration (CAL)   IMG1-NFF, SLI-NFF             Flat field
                    LEN-SPX                       Lenslet Spectral Extraction
                    IMG1-DRK, SLI-DRK, LEN-DRK    Master dark
                    IMG1-TEL, SLI-TEL, LEN-TEL    Telluric Star
Engineering (ENG)   SLI-IDP, LEN-IDP              Instrumental dispersion
Science (SCI)       IMG1-SCI, LEN-SCI, SLI-SCI    Science
                    IMG1-SKY, LEN-SKY, SLI-SKY    Sky
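The OBSTYPE and OBSNAME categories listed above can be held in a small lookup table. The following is only a sketch of one possible representation: the names ROWS, OBSNAME_INDEX and classify are hypothetical and do not come from iris_pipeline, and the pairing of each OBSNAME group with its description assumes the descriptions are listed in the same order as the groups.

```python
# Each row: (OBSTYPE code, OBSNAME values, description), copied from the list above.
ROWS = [
    ("CAL", ["IMG1-NFF", "SLI-NFF"], "Flat field"),
    ("CAL", ["LEN-SPX"], "Lenslet Spectral Extraction"),
    ("CAL", ["IMG1-DRK", "SLI-DRK", "LEN-DRK"], "Master dark"),
    ("CAL", ["IMG1-TEL", "SLI-TEL", "LEN-TEL"], "Telluric Star"),
    ("ENG", ["SLI-IDP", "LEN-IDP"], "Instrumental dispersion"),
    ("SCI", ["IMG1-SCI", "LEN-SCI", "SLI-SCI"], "Science"),
    ("SCI", ["IMG1-SKY", "LEN-SKY", "SLI-SKY"], "Sky"),
]

# Flatten the rows into a dictionary keyed by OBSNAME for direct lookup.
OBSNAME_INDEX = {
    name: (obstype, description)
    for obstype, names, description in ROWS
    for name in names
}


def classify(obsname):
    """Return (OBSTYPE, description) for an OBSNAME; raise KeyError if unknown."""
    return OBSNAME_INDEX[obsname.strip().upper()]


print(classify("IMG1-SKY"))
```

Keeping the rows in the same shape as the list makes it easy to check the lookup against the documentation when categories are added.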