This is gnuastro.info, produced by makeinfo version 6.8 from
gnuastro.texi.

This book documents version 0.16 of the GNU Astronomy Utilities
(Gnuastro).  Gnuastro provides various programs and libraries for
astronomical data manipulation and analysis.

   Copyright © 2015-2021, Free Software Foundation, Inc.

     Permission is granted to copy, distribute and/or modify this
     document under the terms of the GNU Free Documentation License,
     Version 1.3 or any later version published by the Free Software
     Foundation; with no Invariant Sections, no Front-Cover Texts, and
     no Back-Cover Texts.  A copy of the license is included in the
     section entitled “GNU Free Documentation License”.
INFO-DIR-SECTION Astronomy
START-INFO-DIR-ENTRY
* Gnuastro: (gnuastro).       GNU Astronomy Utilities.
* libgnuastro: (gnuastro)Gnuastro library. Full Gnuastro library doc.

* help-gnuastro: (gnuastro)help-gnuastro mailing list. Getting help.

* bug-gnuastro: (gnuastro)Report a bug. How to report bugs

* Arithmetic: (gnuastro)Arithmetic. Arithmetic operations on pixels.
* astarithmetic: (gnuastro)Invoking astarithmetic. Options to Arithmetic.

* BuildProgram: (gnuastro)BuildProgram. Compile and run programs using Gnuastro’s library.
* astbuildprog: (gnuastro)Invoking astbuildprog. Options to BuildProgram.

* ConvertType: (gnuastro)ConvertType. Convert different file types.
* astconvertt: (gnuastro)Invoking astconvertt. Options to ConvertType.

* Convolve: (gnuastro)Convolve. Convolve an input file with kernel.
* astconvolve: (gnuastro)Invoking astconvolve. Options to Convolve.

* CosmicCalculator: (gnuastro)CosmicCalculator. For cosmological params.
* astcosmiccal: (gnuastro)Invoking astcosmiccal. Options to CosmicCalculator.

* Crop: (gnuastro)Crop. Crop region(s) from image(s).
* astcrop: (gnuastro)Invoking astcrop. Options to Crop.

* Fits: (gnuastro)Fits. View and manipulate FITS extensions and keywords.
* astfits: (gnuastro)Invoking astfits. Options to Fits.

* MakeCatalog: (gnuastro)MakeCatalog. Make a catalog from labeled image.
* astmkcatalog: (gnuastro)Invoking astmkcatalog. Options to MakeCatalog.

* MakeNoise: (gnuastro)MakeNoise. Make (add) noise to an image.
* astmknoise: (gnuastro)Invoking astmknoise. Options to MakeNoise.

* MakeProfiles: (gnuastro)MakeProfiles. Make mock profiles.
* astmkprof: (gnuastro)Invoking astmkprof. Options to MakeProfiles.

* Match: (gnuastro)Match. Match two separate catalogs.
* astmatch: (gnuastro)Invoking astmatch. Options to Match.

* NoiseChisel: (gnuastro)NoiseChisel. Detect signal in noise.
* astnoisechisel: (gnuastro)Invoking astnoisechisel. Options to NoiseChisel.

* Segment: (gnuastro)Segment. Segment detections based on signal structure.
* astsegment: (gnuastro)Invoking astsegment. Options to Segment.

* Query: (gnuastro)Query. Access remote databases for downloading data.
* astquery: (gnuastro)Invoking astquery. Options to Query.

* Statistics: (gnuastro)Statistics. Get image Statistics.
* aststatistics: (gnuastro)Invoking aststatistics. Options to Statistics.

* Table: (gnuastro)Table. Read and write FITS binary or ASCII tables.
* asttable: (gnuastro)Invoking asttable. Options to Table.

* Warp: (gnuastro)Warp. Warp a dataset to a new grid.
* astwarp: (gnuastro)Invoking astwarp. Options to Warp.

* astscript: (gnuastro)Installed scripts. Gnuastro’s installed scripts.
* astscript-sort-by-night: (gnuastro)Invoking astscript-sort-by-night. Options to this script
* astscript-radial-profile: (gnuastro)Invoking astscript-radial-profile. Options to this script
* astscript-ds9-region: (gnuastro)Invoking astscript-ds9-region. Options to this script

END-INFO-DIR-ENTRY


File: gnuastro.info,  Node: Top,  Next: Introduction,  Prev: (dir),  Up: (dir)

GNU Astronomy Utilities
***********************

* Menu:

* Introduction::                General introduction.
* Tutorials::                   Tutorials or Cookbooks.
* Installation::                Requirements and installation.
* Common program behavior::     Common behavior in all programs.
* Data containers::             Tools to operate on extensions and tables.
* Data manipulation::           Tools for basic image manipulation.
* Data analysis::               Analyze images.
* Modeling and fittings::       Make and fit models.
* High-level calculations::     Physical calculations.
* Installed scripts::           Installed scripts that operate like programs.
* Library::                     Gnuastro’s library of useful functions.
* Developing::                  The development environment.
* Gnuastro programs list::      List and short summary of Gnuastro.
* Other useful software::       Installing other useful software.
* GNU Free Doc. License::       Full text of the GNU Free Documentation License.
* GNU General Public License::  Full text of the GNU General Public License.
* Index::                       Index of terms

 — The Detailed Node Listing —

Introduction

* Quick start::                 A quick start to installation.
* Science and its tools::       Some philosophy and history.
* Your rights::                 User rights.
* Naming convention::           About names of programs in Gnuastro.
* Version numbering::           Understanding version numbers.
* New to GNU/Linux?::           Suggested GNU/Linux distribution.
* Report a bug::                Search and report the bug you found.
* Suggest new feature::         How to suggest a new feature.
* Announcements::               How to stay up to date with Gnuastro.
* Conventions::                 Conventions used in this book.
* Acknowledgments::             People who helped in the production.

Version numbering

* GNU Astronomy Utilities 1.0::  Plans for version 1.0 release

New to GNU/Linux?

* Command-line interface::      Introduction to the command-line

Tutorials

* Sufi simulates a detection::  Simulating a detection.
* General program usage tutorial::  Tutorial on many programs in generic scenario.
* Detecting large extended targets::  NoiseChisel for huge extended targets.

Sufi simulates a detection

* General program usage tutorial::

General program usage tutorial

* Calling Gnuastro's programs::  Easy way to find Gnuastro’s programs.
* Accessing documentation::     Access to manual of programs you are running.
* Setup and data download::     Setup this template and download datasets.
* Dataset inspection and cropping::  Crop the flat region to use in next steps.
* Angular coverage on the sky::  Measure the field size on the sky.
* Cosmological coverage::       Measure the field size at different redshifts.
* Building custom programs with the library::  Easy way to build new programs.
* Option management and configuration files::  Dealing with options and configuring them.
* Warping to a new pixel grid::  Transforming/warping the dataset.
* NoiseChisel and Multiextension FITS files::  Running NoiseChisel and having multiple HDUs.
* NoiseChisel optimization for detection::  Check NoiseChisel’s operation and improve it.
* NoiseChisel optimization for storage::  Dramatically decrease output’s volume.
* Segmentation and making a catalog::  Finding true peaks and creating a catalog.
* Working with catalogs estimating colors::  Estimating colors using the catalogs.
* Column statistics color-magnitude diagram::  Visualizing column correlations.
* Aperture photometry::         Doing photometry on a fixed aperture.
* Matching catalogs::           Easily find corresponding rows from two catalogs.
* Finding reddest clumps and visual inspection::  Selecting some targets and inspecting them.
* Writing scripts to automate the steps::  Scripts will greatly help in re-doing things fast.
* Citing and acknowledging Gnuastro::  How to cite and acknowledge Gnuastro in your papers.

Detecting large extended targets

* Downloading and validating input data::  How to get and check the input data.
* NoiseChisel optimization::    Detect the extended and diffuse wings.
* Image surface brightness limit::  Standards to quantify the noise level.
* Achieved surface brightness level::  Calculate the outer surface brightness.
* Extract clumps and objects::  Find sub-structure over the detections.

Installation

* Dependencies::                Necessary packages for Gnuastro.
* Downloading the source::      Ways to download the source code.
* Build and install::           Configure, build and install Gnuastro.

Dependencies

* Mandatory dependencies::      Gnuastro will not install without these.
* Optional dependencies::       Adding more functionality.
* Bootstrapping dependencies::  If you have the version controlled source.
* Dependencies from package managers::  Installing from OS package managers.

Mandatory dependencies

* GNU Scientific Library::      Installing GSL.
* CFITSIO::                     C interface to the FITS standard.
* WCSLIB::                      C interface to the WCS standard of FITS.

Downloading the source

* Release tarball::             Download a stable official release.
* Version controlled source::   Get and use the version controlled source.

Version controlled source

* Bootstrapping::               Adding all the automatically generated files.
* Synchronizing::               Keep your local clone up to date.

Build and install

* Configuring::                 Configure Gnuastro
* Separate build and source directories::  Keeping derived/build files separate.
* Tests::                       Run tests to see if it is working.
* A4 print book::               Customize the print book.
* Known issues::                Issues you might encounter.

Configuring

* Gnuastro configure options::  Configure options particular to Gnuastro.
* Installation directory::      Specify the directory to install.
* Executable names::            Changing executable names.
* Configure and build in RAM::  For minimal use of HDD or SSD, and clean source.

Common program behavior

* Command-line::                How to use the command-line.
* Configuration files::         Values for unspecified variables.
* Getting help::                Getting more information on the go.
* Multi-threaded operations::   How threads are managed in Gnuastro.
* Numeric data types::          Different types and how to specify them.
* Memory management::           How memory is allocated (in RAM or HDD/SSD).
* Tables::                      Recognized table formats.
* Tessellation::                Tile the dataset into non-overlapping bins.
* Automatic output::            About automatic output names.
* Output FITS files::           Common properties when outputs are FITS.

Command-line

* Arguments and options::       Different ways to specify inputs and configuration.
* Common options::              Options that are shared between all programs.
* Shell TAB completion::        Customized TAB completion in Gnuastro.
* Standard input::              Using output of another program as input.

Arguments and options

* Arguments::                   For specifying the main input files/operations.
* Options::                     For configuring the behavior of the program.

Common options

* Input output options::        Common input/output options.
* Processing options::          Options for common processing steps.
* Operating mode options::      Common operating mode options.

Configuration files

* Configuration file format::   ASCII format of configuration file.
* Configuration file precedence::  Precedence of configuration files.
* Current directory and User wide::  Local and user configuration files.
* System wide::                 System wide configuration files.

Getting help

* --usage::                     View option names and value formats.
* --help::                      List all options with description.
* Man pages::                   Man pages generated from --help.
* Info::                        View complete book in terminal.
* help-gnuastro mailing list::  Contacting experienced users.

Multi-threaded operations

* A note on threads::           Caution and suggestion on using threads.
* How to run simultaneous operations::  How to run things simultaneously.

Tables

* Recognized table formats::    Table formats that are recognized in Gnuastro.
* Gnuastro text table format::  Gnuastro’s convention plain text tables.
* Selecting table columns::     Identify/select certain columns from a table

Recognized table formats

* Gnuastro text table format::  Reading plain text tables

Data containers

* Fits::                        View and manipulate extensions and keywords.
* ConvertType::                 Convert data to various formats.
* Table::                       Read and Write FITS tables to plain text.
* Query::                       Import data from external databases.

Fits

* Invoking astfits::            Arguments and options to Fits.

Invoking Fits

* HDU information and manipulation::  Learn about the HDUs and move them.
* Keyword inspection and manipulation::  Manipulate metadata keywords in a HDU

ConvertType

* Recognized file formats::     Recognized file formats
* Color::                       Some explanations on color.
* Aligning images with small WCS offsets::  When the WCS slightly differs.
* Annotations for figure in paper::  Adding coordinates or physical scale.
* Invoking astconvertt::        Options and arguments to ConvertType.

Annotations for figure in paper

* Full script of annotations on figure::  All the steps in one script

Table

* Column arithmetic::           How to do operations on table columns.
* Operation precedence in Table::  Order of running options in Table.
* Invoking asttable::           Options and arguments to Table.

Query

* Available databases::         List of available databases to Query.
* Invoking astquery::           Inputs, outputs and configuration of Query.

Data manipulation

* Crop::                        Crop region(s) from a dataset.
* Arithmetic::                  Arithmetic on input data.
* Convolve::                    Convolve an image with a kernel.
* Warp::                        Warp/Transform an image to a different grid.

Crop

* Crop modes::                  Basic modes to define crop region.
* Crop section syntax::         How to define a section to crop.
* Blank pixels::                Pixels with no value.
* Invoking astcrop::            Calling Crop on the command-line

Invoking Crop

* Crop options::                A list of all the options with explanation.
* Crop output::                 The outputs of Crop.
* Crop known issues::           Known issues in running Crop.

Arithmetic

* Reverse polish notation::     The current notation style for Arithmetic
* Arithmetic operators::        List of operators known to Arithmetic
* Invoking astarithmetic::      How to run Arithmetic: options and output

Arithmetic operators

* Basic mathematical operators::  For example +, -, /, log, pow, etc.
* Trigonometric and hyperbolic operators::  sin, cos, atan, asinh, etc.
* Unit conversion operators::   Magnitudes to counts, parsecs to AUs, etc.
* Statistical operators::       Statistics of a single dataset (for example mean).
* Stacking operators::          Coadding or combining multiple datasets into one.
* Filtering operators::         Smoothing a dataset through mixing pixel with neighbors.
* Interpolation operators::     Giving blank pixels a value.
* Dimensionality changing operators::  Collapse or expand a dataset.
* Conditional operators::       Select certain pixels within the dataset.
* Mathematical morphology operators::  Work on binary images, for example erode.
* Bitwise operators::           Work on bits within one pixel.
* Numerical type conversion operators::  Convert the numeric datatype of a dataset.
* Adding noise operators::      Add noise to a dataset.
* Elliptical shape operators::  Operations that are focused on an ellipse.
* Building new dataset::        How to construct an empty dataset from scratch.
* Operand storage in memory or a file::  Tools for complex operations in one command.

Convolve

* Spatial domain convolution::  Only using the input image values.
* Frequency domain and Fourier operations::  Using frequencies in input.
* Spatial vs. Frequency domain::  When to use which?
* Convolution kernel::          How to specify the convolution kernel.
* Invoking astconvolve::        Options and argument to Convolve.

Spatial domain convolution

* Convolution process::         More basic explanations.
* Edges in the spatial domain::  Dealing with the edges of an image.

Frequency domain and Fourier operations

* Fourier series historical background::  Historical background.
* Circles and the complex plane::  Interpreting complex numbers.
* Fourier series::              Fourier Series definition.
* Fourier transform::           Fourier Transform definition.
* Dirac delta and comb::        Dirac delta and Dirac comb.
* Convolution theorem::         Derivation of Convolution theorem.
* Sampling theorem::            Sampling theorem (Nyquist frequency).
* Discrete Fourier transform::  Derivation and explanation of DFT.
* Fourier operations in two dimensions::  Extend to 2D images.
* Edges in the frequency domain::  Interpretation of edge effects.

Warp

* Warping basics::              Basics of coordinate transformation.
* Merging multiple warpings::   How to merge multiple matrices.
* Resampling::                  Warping an image is re-sampling it.
* Invoking astwarp::            Arguments and options for Warp.

Data analysis

* Statistics::                  Calculate dataset statistics.
* NoiseChisel::                 Detect objects in an image.
* Segment::                     Segment detections based on signal structure.
* MakeCatalog::                 Catalog from input and labeled images.
* Match::                       Match two datasets.

Statistics

* Histogram and Cumulative Frequency Plot::  Basic definitions.
* 2D Histograms::               Plotting the distribution of two variables.
* Sigma clipping::              Definition of $\sigma$-clipping.
* Sky value::                   Definition and derivation of the Sky value.
* Invoking aststatistics::      Arguments and options to Statistics.

2D Histograms

* 2D histogram as a table::     Format and usage in table format.
* 2D histogram as an image::    Format and usage in image format

Sky value

* Sky value definition::        Definition of the Sky/reference value.
* Sky value misconceptions::    Wrong methods to estimate the Sky value.
* Quantifying signal in a tile::  Method to estimate the presence of signal.

NoiseChisel

* NoiseChisel changes after publication::  Updates since published papers.
* Invoking astnoisechisel::     Options and arguments for NoiseChisel.

Invoking NoiseChisel

* NoiseChisel input::           NoiseChisel’s input options.
* Detection options::           Configure detection in NoiseChisel.
* NoiseChisel output::          NoiseChisel’s output options and format.

Segment

* Invoking astsegment::         Inputs, outputs and options to Segment

Invoking Segment

* Segment input::               Input files and options.
* Segmentation options::        Parameters of the segmentation process.
* Segment output::              Outputs of Segment

MakeCatalog

* Detection and catalog production::  Discussing why/how to treat these separately
* Brightness flux magnitude::   More on magnitudes, surface brightness, etc.
* Quantifying measurement limits::  For comparing different catalogs.
* Measuring elliptical parameters::  Estimating elliptical parameters.
* Adding new columns to MakeCatalog::  How to add new columns.
* Invoking astmkcatalog::       Options and arguments to MakeCatalog.

Quantifying measurement limits

* Magnitude measurement error of each detection::  Derivation of mag error equation
* Surface brightness error of each detection::  Error in measuring the Surface brightness.
* Completeness limit of each detection::  Possibility of detecting similar objects?
* Upper limit magnitude of each detection::  How reliable is your magnitude?
* Surface brightness limit of image::  How deep is your data?
* Upper limit magnitude of image::  How deep is your data for certain footprint?

Invoking MakeCatalog

* MakeCatalog inputs and basic settings::  Input files and basic settings.
* Upper-limit settings::        Settings for upper-limit measurements.
* MakeCatalog measurements::    Available measurements in MakeCatalog.
* MakeCatalog output::          File names of MakeCatalog’s output table.

Match

* Invoking astmatch::           Inputs, outputs and options of Match

Modeling and fitting

* MakeProfiles::                Making mock galaxies and stars.
* MakeNoise::                   Make (add) noise to an image.

MakeProfiles

* Modeling basics::             Astronomical modeling basics.
* If convolving afterwards::    Considerations for convolving later.
* Profile magnitude::           Definition of total profile magnitude.
* Invoking astmkprof::          Inputs and Options for MakeProfiles.

Modeling basics

* Defining an ellipse and ellipsoid::  Definition of these important shapes.
* PSF::                         Radial profiles for the PSF.
* Stars::                       Making mock star profiles.
* Galaxies::                    Radial profiles for galaxies.
* Sampling from a function::    Sample a function on a pixelated canvas.
* Oversampling::                Oversampling the model.

Invoking MakeProfiles

* MakeProfiles catalog::        Required catalog properties.
* MakeProfiles profile settings::  Configuration parameters for all profiles.
* MakeProfiles output dataset::  The canvas/dataset to build profiles over.
* MakeProfiles log file::       A description of the optional log file.

MakeNoise

* Noise basics::                Noise concepts and definitions.
* Invoking astmknoise::         Options and arguments to MakeNoise.

Noise basics

* Photon counting noise::       Poisson noise
* Instrumental noise::          Readout, dark current and other sources.
* Final noised pixel value::    How the final noised value is calculated.
* Generating random numbers::   How random numbers are generated.

High-level calculations

* CosmicCalculator::            Calculate cosmological variables

CosmicCalculator

* Distance on a 2D curved space::  Distances in 2D for simplicity
* Extending distance concepts to 3D::  Going to 3D (our real universe).
* Invoking astcosmiccal::       How to run CosmicCalculator

Invoking CosmicCalculator

* CosmicCalculator input options::  Options to specify input conditions.
* CosmicCalculator basic cosmology calculations::  For example, distance modulus and distances.
* CosmicCalculator spectral line calculations::  How they get affected by redshift.

Installed scripts

* Sort FITS files by night::    Sort many files by date.
* Generate radial profile::     Radial profile of an object in an image.
* SAO DS9 region files from table::  Create ds9 region file from a table.

Sort FITS files by night

* Invoking astscript-sort-by-night::  Inputs and outputs to this script.

Generate radial profile

* Invoking astscript-radial-profile::  How to call astscript-radial-profile

SAO DS9 region files from table

* Invoking astscript-ds9-region::  How to call astscript-ds9-region

Library

* Review of library fundamentals::  Guide on libraries and linking.
* BuildProgram::                Link and run source files with this library.
* Gnuastro library::            Description of all library functions.
* Library demo programs::       Demonstration for using libraries.

Review of library fundamentals

* Headers::                     Header files included in source.
* Linking::                     Linking the compiled source files into one.
* Summary and example on libraries::  A summary and example on using libraries.

BuildProgram

* Invoking astbuildprog::       Options and examples for using this program.

Gnuastro library

* Configuration information::   General information about library config.
* Multithreaded programming::   Tools for easy multi-threaded operations.
* Library data types::          Definitions and functions for types.
* Pointers::                    Wrappers for easy working with pointers.
* Library blank values::        Blank values and functions to deal with them.
* Library data container::      General data container in Gnuastro.
* Dimensions::                  Dealing with coordinates and dimensions.
* Linked lists::                Various types of linked lists.
* Array input output::          Reading and writing images or cubes.
* Table input output::          Reading and writing table columns.
* FITS files::                  Working with FITS data.
* File input output::           Reading and writing to various file formats.
* World Coordinate System::     Dealing with the world coordinate system.
* Arithmetic on datasets::      Arithmetic operations on a dataset.
* Tessellation library::        Functions for working on tiles.
* Bounding box::                Finding the bounding box.
* Polygons::                    Working with the vertices of a polygon.
* Qsort functions::             Helper functions for Qsort.
* K-d tree::                    Space partitioning in K dimensions.
* Permutations::                Re-order (or permute) the values in a dataset.
* Matching::                    Matching catalogs based on position.
* Statistical operations::      Functions for basic statistics.
* Binary datasets::             Datasets that can only have values of 0 or 1.
* Labeled datasets::            Working with Segmented/labeled datasets.
* Convolution functions::       Library functions to do convolution.
* Interpolation::               Interpolate (over blank values possibly).
* Git wrappers::                Wrappers for functions in libgit2.
* Unit conversion library::     Converting between recognized units.
* Spectral lines library::      Functions for operating on Spectral lines.
* Cosmology library::           Cosmological calculations.
* SAO DS9 library::             Take inputs from files generated by SAO DS9.

Multithreaded programming (‘threads.h’)

* Implementation of pthread_barrier::  Some systems don’t have pthread_barrier
* Gnuastro's thread related functions::  Functions for managing threads.

Data container (‘data.h’)

* Generic data container::      Definition of Gnuastro’s generic container.
* Dataset allocation::          Allocate, initialize and free a dataset.
* Arrays of datasets::          Functions to help with array of datasets.
* Copying datasets::            Functions to copy a dataset to a new one.

Linked lists (‘list.h’)

* List of strings::             Singly linked list of strings.
* List of int32_t::             Singly linked list of int32_ts.
* List of size_t::              Singly linked list of size_ts.
* List of float::               Singly linked list of floats.
* List of double::              Singly linked list of doubles.
* List of void::                Singly linked list of void * pointers.
* Ordered list of size_t::      Singly linked, ordered list of size_t.
* Doubly linked ordered list of size_t::  Definition and functions.
* List of gal_data_t::          Singly linked list of Gnuastro’s generic datatype.

FITS files (‘fits.h’)

* FITS macros errors filenames::  General macros, errors and checking names.
* CFITSIO and Gnuastro types::  Conversion between FITS and Gnuastro types.
* FITS HDUs::                   Opening and getting information about HDUs.
* FITS header keywords::        Reading and writing FITS header keywords.
* FITS arrays::                 Reading and writing FITS images/arrays.
* FITS tables::                 Reading and writing FITS tables.

File input output

* Text files::                  Reading and writing from/to plain text files.
* TIFF files::                  Reading and writing from/to TIFF files.
* JPEG files::                  Reading and writing from/to JPEG files.
* EPS files::                   Writing to EPS files.
* PDF files::                   Writing to PDF files.

Tessellation library (‘tile.h’)

* Independent tiles::           Work on or check independent tiles.
* Tile grid::                   Cover a full dataset with non-overlapping tiles.

Library demo programs

* Library demo - reading a image::  Read a FITS image into memory.
* Library demo - inspecting neighbors::  Inspect the neighbors of a pixel.
* Library demo - multi-threaded operation::  Doing an operation on threads.
* Library demo - reading and writing table columns::  Simple Column I/O.

Developing

* Why C::                       Why Gnuastro is designed in C.
* Program design philosophy::   General ideas behind the package structure.
* Coding conventions::          Gnuastro coding conventions.
* Program source::              Conventions for the code.
* Documentation::               Documentation is an integral part of Gnuastro.
* Building and debugging::      Build and possibly debug during development.
* Test scripts::                Understanding the test scripts.
* Bash programmable completion::  Auto-completions for better user experience.
* Developer's checklist::       Checklist to finalize your changes.
* Gnuastro project webpage::    Central hub for Gnuastro activities.
* Developing mailing lists::    Stay up to date with Gnuastro’s development.
* Contributing to Gnuastro::    Share your changes with all users.

Program source

* Mandatory source code files::  Description of files common to all programs.
* The TEMPLATE program::        Template for easy creation of a new program.

Bash programmable completion

* Bash TAB completion tutorial::  Fast tutorial to get you started on concepts.
* Implementing TAB completion in Gnuastro::  How Gnuastro uses Bash auto-completion features.

Contributing to Gnuastro

* Copyright assignment::        Copyright has to be assigned to the FSF.
* Commit guidelines::           Guidelines for commit messages.
* Production workflow::         Submitting your commits (work) for inclusion.
* Forking tutorial::            Tutorial on workflow steps with Git.

Other useful software

* SAO DS9::                     Viewing FITS images.
* PGPLOT::                      Plotting directly in C

SAO DS9

* Viewing multiextension FITS images::  Configure SAO DS9 for multiextension images.


File: gnuastro.info,  Node: Introduction,  Next: Tutorials,  Prev: Top,  Up: Top

1 Introduction
**************

GNU Astronomy Utilities (Gnuastro) is an official GNU package consisting
of separate programs and libraries for the manipulation and analysis of
astronomical data.  All the programs share the same basic command-line
user interface for the comfort of both users and developers.  Gnuastro
is written to comply fully with the GNU coding standards, so it
integrates well with the GNU/Linux operating system.  Astronomers can
therefore expect the same familiar experience in the source code,
building, installing and command-line user interaction that they have
seen in the other GNU software they use.  The official and always
up-to-date version of this book (or manual) is freely available under
*note GNU Free Doc. License:: in various formats (PDF, HTML, plain text,
info, and as its Texinfo source) at
<http://www.gnu.org/software/gnuastro/manual/>.

   If you are new to the GNU/Linux environment, note that, unless
otherwise specified, most of the topics in *note Installation:: and
*note Common program behavior:: are common to all GNU software, for example
installation, managing command-line options or getting help (also see
*note New to GNU/Linux?::).  So if you are new to this empowering
environment, we encourage you to go through these chapters carefully.
They can be a starting point from which you can continue to learn more
from each program’s own manual and fully benefit from and enjoy this
wonderful environment.  Gnuastro also comes with a large set of
libraries, so you can write your own programs using Gnuastro’s building
blocks, see *note Review of library fundamentals:: for an introduction.

   In Gnuastro, no change to any program or library will be committed to
its history before it has been fully documented here.  As discussed in
*note Science and its tools:: this is a founding principle of Gnuastro.

* Menu:

* Quick start::                 A quick start to installation.
* Science and its tools::       Some philosophy and history.
* Your rights::                 User rights.
* Naming convention::           About names of programs in Gnuastro.
* Version numbering::           Understanding version numbers.
* New to GNU/Linux?::           Suggested GNU/Linux distribution.
* Report a bug::                Search and report the bug you found.
* Suggest new feature::         How to suggest a new feature.
* Announcements::               How to stay up to date with Gnuastro.
* Conventions::                 Conventions used in this book.
* Acknowledgments::             People who helped in the production.


File: gnuastro.info,  Node: Quick start,  Next: Science and its tools,  Prev: Introduction,  Up: Introduction

1.1 Quick start
===============

The latest official release tarball is always available as
‘gnuastro-latest.tar.gz’
(http://ftp.gnu.org/gnu/gnuastro/gnuastro-latest.tar.gz).  For better
compression (faster download), and robust archival features, an Lzip
(http://www.nongnu.org/lzip/lzip.html) compressed tarball is also
available at ‘gnuastro-latest.tar.lz’
(http://ftp.gnu.org/gnu/gnuastro/gnuastro-latest.tar.lz), see *note
Release tarball:: for more details on the tarball release(1).

   Let’s assume the downloaded tarball is in the ‘TOPGNUASTRO’
directory.  The first two commands below can be used to decompress the
source.  If you download ‘tar.lz’ and your Tar implementation doesn’t
recognize Lzip (the second command fails), run the third and fourth
lines(2).  Note that lines starting with ‘##’ don’t need to be typed.

     ## Go into the download directory.
     $ cd TOPGNUASTRO

     ## Also works on `tar.gz'. GNU Tar recognizes both formats.
     $ tar xf gnuastro-latest.tar.lz

     ## Only when the previous command fails.
     $ lzip -d gnuastro-latest.tar.lz
     $ tar xf gnuastro-latest.tar

   Gnuastro has three mandatory dependencies and some optional
dependencies for extra functionality, see *note Dependencies:: for the
full list.  In *note Dependencies from package managers:: we have
prepared the command to easily install Gnuastro’s dependencies using the
package manager of some operating systems.  When the mandatory
dependencies are ready, you can configure, compile, check and install
Gnuastro on your system with the following commands.  See *note Known
issues:: if you confront any complications.

     $ cd gnuastro-X.X                  # Replace X.X with version number.
     $ ./configure
     $ make -j8                         # Replace 8 with no. CPU threads.
     $ make check -j8                   # Replace 8 with no. CPU threads.
     $ sudo make install
     $ echo "source /usr/local/share/gnuastro/completion.bash" >> ~/.bashrc

The last command enables Gnuastro’s custom TAB completion in Bash.
For more on this useful feature, see *note Shell TAB completion::.

   For each program there is an ‘Invoking ProgramName’ sub-section in
this book which explains how the program should be run on the
command-line (for example *note Invoking asttable::).

   Some complete Tutorials are also available in this book with common
Gnuastro usage scenarios in astronomical research.  They even contain
links to download the necessary data, and thoroughly describe every step
of the process (the science, statistics and optimal usage of the
command-line).  We therefore strongly recommend following the tutorials
before starting to use Gnuastro, see *note Tutorials::.

   ---------- Footnotes ----------

   (1) The Gzip library and program are commonly available on most
systems.  However, Gnuastro recommends Lzip as described above and the
beta-releases are also only distributed in ‘tar.lz’.  You can download
and install Lzip’s source (in ‘.tar.gz’ format) from its web page and
follow the same process as below: Lzip has no dependencies, so simply
decompress, then run ‘./configure’, ‘make’, ‘sudo make install’.

   (2) In case Tar doesn’t directly uncompress your ‘.tar.lz’ tarball,
you can merge the separate calls to Lzip and Tar (shown in the main body
of text) into one command by directly piping the output of Lzip into Tar
with a command like this: ‘$ lzip -cd gnuastro-0.5.tar.lz | tar -xf -’


File: gnuastro.info,  Node: Science and its tools,  Next: Your rights,  Prev: Quick start,  Up: Introduction

1.2 Gnuastro manifesto: Science and its tools
=============================================

The history of science indicates that there are always unseen faults,
hidden assumptions, simplifications and approximations in all
our theoretical models, data acquisition and analysis techniques.  It is
precisely these that will ultimately allow future generations to advance
the existing experimental and theoretical knowledge through their new
solutions and corrections.

   In the past, scientists would gather data and process them
individually to achieve an analysis, thus having a much more intricate
knowledge of the data and analysis.  The theoretical models also
required few (if any) simulations to compare with the data.  Today
both methods are becoming increasingly dependent on pre-written
software.  Scientists are dissociating themselves from the intricacies
of reducing raw observational data in experimentation or from bringing
the theoretical models to life in simulations.  These ‘intricacies’ are
precisely those unseen faults, hidden assumptions, simplifications and
approximations that define scientific progress.

     Unfortunately, most persons who have recourse to a computer for
     statistical analysis of data are not much interested either in
     computer programming or in statistical method, being primarily
     concerned with their own proper business.  Hence the common use of
     library programs and various statistical packages.  ...  It’s time
     that was changed.
  — _F.J. Anscombe. The American Statistician, Vol. 27, No. 1. 1973_

   Anscombe’s quartet
(http://en.wikipedia.org/wiki/Anscombe%27s_quartet) demonstrates how
four data sets with widely different shapes (when plotted) give nearly
identical output from standard regression techniques.  Anscombe uses
this (now famous) quartet, which was introduced in the paper quoted
above, to argue that “_Good statistical analysis is not a purely routine
matter, and generally calls for more than one pass through the
computer_”.  Echoing Anscombe’s concern after 44 years, some of the
highly recognized statisticians of our time (Leek, McShane, Gelman,
Colquhoun, Nuijten and Goodman), wrote in Nature that:

     We need to appreciate that data analysis is not purely
     computational and algorithmic – it is a human
     behaviour....Researchers who hunt hard enough will turn up a result
     that fits statistical criteria – but their discovery will probably
     be a false positive.
        — _Five ways to fix statistics, Nature, 551, Nov 2017._

   Users of statistical (scientific) methods (software) are therefore
not passive (objective) agents in their result.  It is thus necessary
to actually understand the method, not just use it as a black box.  The
subjective experience gained by frequently using a
method/software is not sufficient to claim an understanding of how the
tool/method works and how relevant it is to the data and analysis.  This
kind of subjective experience is prone to serious misunderstandings
about the data, what the software/statistical-method really does
(especially as it gets more complicated), and thus the scientific
interpretation of the result.  This attitude is further encouraged
through non-free software(1), poorly written (or non-existent)
scientific software manuals, and non-reproducible papers(2).  This
approach to scientific software and methods only helps in producing
dogmas and an “_obscurantist faith in the expert’s special skill, and in
his personal knowledge and authority_”(3).

     Program or be programmed.  Choose the former, and you gain access
     to the control panel of civilization.  Choose the latter, and it
     could be the last real choice you get to make.
   — _Douglas Rushkoff. Program or be programmed, O/R Books (2010)._

   It is obviously impractical for any one human being to gain the
intricate knowledge explained above for every step of an analysis.  On
the other hand, scientific data can be large and numerous, for example
images produced by telescopes in astronomy.  This requires efficient
algorithms.  To make things worse, natural scientists have generally not
been trained in the advanced software techniques, paradigms and
architecture that are taught in computer science or engineering courses
and thus used in most software.  The GNU Astronomy Utilities are an
effort to tackle this issue.

   Gnuastro is not just software: this book is as important to the
idea behind Gnuastro as the source code (software).  This book has tried
to learn from the success of the “Numerical Recipes” book in educating
those who are not software engineers and computer scientists but still
heavy users of computational algorithms, like astronomers.  There are
two major differences.

   The first difference is that Gnuastro’s code and the background
information are segregated: the code is moved within the actual Gnuastro
software source code and the underlying explanations are given here in
this book.  In the source code, every non-trivial step is heavily
commented and correlated with this book, it follows the same logic of
this book, and all the programs follow a similar internal data, function
and file structure, see *note Program source::.  Complementing the code,
this book focuses on thoroughly explaining the concepts behind those
codes (history, mathematics, science, software and usage advice when
necessary) along with detailed instructions on how to run the programs.
At the expense of frustrating “professionals” or “experts”, this book
and the comments in the code also intentionally avoid jargon and
abbreviations.  The source code and this book are thus intimately
linked, and when considered as a single entity can be thought of as a
real “Numerical Recipes” for astronomy (with actual software
accompanying the algorithms).

   The second major, and arguably more important, difference is that
“Numerical Recipes” does not allow you to distribute any code that you
have learned from it.  In other words, it does not allow you to release
your software’s source code if you have used their codes, you can only
publicly release binaries (a black box) to the community.  Therefore,
while it empowers the privileged individual who has access to it, it
exacerbates social ignorance.  Exactly at the opposite end of the
spectrum, Gnuastro’s source code is released under the GNU General
Public License (GPL) and this book is released under the GNU Free
Documentation License.  You are therefore free to distribute any
software you create using parts of Gnuastro’s source code or text, or
figures from this book, see *note Your rights::.

   With these principles in mind, Gnuastro’s developers aim to impose
the minimum requirements on you (in computer science, engineering and
even the mathematics behind the tools) to understand and modify any step
of Gnuastro if you feel the need to do so, see *note Why C:: and *note
Program design philosophy::.

   Without prior familiarity and experience with optics, it is hard to
imagine how Galileo could have come up with the idea of modifying the
Dutch military telescope optics to use in astronomy.  Astronomical
objects could not be seen with the Dutch military design of the
telescope.  In other words, it is unlikely that Galileo could have asked
a random optician to make modifications (not understood by Galileo) to
the Dutch design, to do something no astronomer of the time took
seriously.  In the paradigm of the day, what could be the purpose of
enlarging geometric spheres (planets) or points (stars)?  In that
paradigm only the position and movement of the heavenly bodies was
important, and that had already been accurately studied (recently by
Tycho Brahe).

   In the beginning of his “The Sidereal Messenger” (published in 1610)
he cautions the readers on this issue and _before_ describing his
results/observations, Galileo instructs us on how to build a suitable
instrument.  Without a detailed description of _how_ he made his tools
and carried out his observations, no reasonable person would believe his
results.  Before he actually saw the moons of Jupiter, the mountains on
the Moon or the crescent of Venus, Galileo was “evasive”(4) to Kepler.
Science is defined by its tools/methods, _not_ its raw results(5).

   The same is true today: science cannot progress with a black box, or
poorly released code.  The source code of a research project is the new
(abstractified) communication language in science, understandable by
humans _and_ computers.  Source code (in any programming language) is a
language/notation designed to express all the details that would be too
tedious/long/frustrating to report in spoken languages like English,
similar to mathematical notation.

     An article about computational science [almost all sciences today]
     ...  is not the scholarship itself, it is merely advertising of the
     scholarship.  The Actual Scholarship is the complete software
     development environment and the complete set of instructions which
     generated the figures.
   — _Buckheit & Donoho, Lecture Notes in Statistics, Vol 103, 1996_

   Today, the quality of the source code that goes into a scientific
result (and the distribution of that code) is as critical to scientific
vitality and integrity, as the quality of its written language/English
used in publishing/distributing its paper.  A scientific paper will not
even be reviewed by any respectable journal if it is written in poor
language/English.  A similar level of quality assessment is thus
increasingly becoming necessary regarding the codes/methods used to
derive the results of a scientific paper.  For more on this, please see
Akhlaghi et al.  (2021) at arXiv:2006.03018
(https://arxiv.org/abs/2006.03018).

   Bjarne Stroustrup (creator of the C++ language) says: “_Without
understanding software, you are reduced to believing in magic_”.  Ken
Thompson (the designer of the Unix operating system) says “_I abhor a
system designed for the ‘user’ if that word is a coded pejorative
meaning ‘stupid and unsophisticated’_.” Certainly no scientist (user of
a scientific software) would want to be considered a believer in magic,
or stupid and unsophisticated.

   This can happen when scientists get too distant from the raw data and
methods, and are mainly discussing results.  In other words, when they
feel they have tamed Nature into their own high-level (abstract) models
(creations), and are mainly concerned with scaling up, or
industrializing those results.  Roughly five years before special
relativity, and about two decades before quantum mechanics fundamentally
changed Physics, Lord Kelvin is quoted as saying:

     There is nothing new to be discovered in physics now.  All that
     remains is more and more precise measurement.
                — _William Thomson (Lord Kelvin), 1900_

A few years earlier Albert A. Michelson made the following statement:

     The more important fundamental laws and facts of physical science
     have all been discovered, and these are now so firmly established
     that the possibility of their ever being supplanted in consequence
     of new discoveries is exceedingly remote....  Our future
     discoveries must be looked for in the sixth place of decimals.
— _Albert A. Michelson, dedication of Ryerson Physics Lab, U. Chicago 1894_

   If scientists are considered to be more than mere puzzle solvers(6)
(simply adding to the decimals of existing values or observing a feature
in 10, 100, or 100000 more galaxies or stars, as Kelvin and Michelson
clearly believed), they cannot just passively sit back and uncritically
repeat the previous (observational or theoretical) methods/tools on new
data.  Today there is a wealth of raw telescope images ready (mostly for
free) at the fingertips of anyone who is interested and has a fast
enough internet connection to download them.  The only thing lacking is
new ways to analyze this data and dig out the treasure that lies hidden
in them from existing methods and techniques.

     New data that we insist on analyzing in terms of old ideas (that
     is, old models which are not questioned) cannot lead us out of the
     old ideas.  However many data we record and analyze, we may just
     keep repeating the same old errors, missing the same crucially
     important things that the experiment was competent to find.
— _Jaynes, Probability theory, the logic of science. Cambridge U. Press (2003)._

   ---------- Footnotes ----------

   (1) <https://www.gnu.org/philosophy/free-sw.html>

   (2) Where the authors omit many of the analysis/processing “details”
from the paper by arguing that they would make the paper too
long/unreadable.  However, software engineers have been dealing with
such issues for a long time.  There are thus software management
solutions that allow us to supplement papers with all the details
necessary to exactly reproduce the result.  For example see Akhlaghi et
al.  (2021, arXiv:2006.03018 (https://arxiv.org/abs/2006.03018)).

   (3) Karl Popper.  The logic of scientific discovery.  1959.  Larger
quote is given at the start of the PDF (for print) version of this book.

   (4) Galileo G. (Translated by Maurice A. Finocchiaro).  _The
essential Galileo_.  Hackett Publishing Company, first edition, 2008.

   (5) For example, take the following two results on the age of the
universe: roughly 14 billion years (suggested by the current consensus
of the standard model of cosmology) and less than 10,000 years
(suggested from some interpretations of the Bible).  Both these numbers
are _results_.  What distinguishes these two results, is the
tools/methods that were used to derive them.  Therefore, as the term
“Scientific method” also signifies, a scientific statement is defined by
its _method_, not its result.

   (6) Thomas S. Kuhn.  _The Structure of Scientific Revolutions_,
University of Chicago Press, 1962.


File: gnuastro.info,  Node: Your rights,  Next: Naming convention,  Prev: Science and its tools,  Up: Introduction

1.3 Your rights
===============

The paragraphs below, in this section, belong to the GNU Texinfo(1)
manual and are not written by us!  The name “Texinfo” is just changed to
“GNU Astronomy Utilities” or “Gnuastro” because they are released under
the same licenses and it is beautifully written to inform you of your
rights.

   GNU Astronomy Utilities is “free software”; this means that everyone
is free to use it and free to redistribute it on certain conditions.
Gnuastro is not in the public domain; it is copyrighted and there are
restrictions on its distribution, but these restrictions are designed to
permit everything that a good cooperating citizen would want to do.
What is not allowed is to try to prevent others from further sharing any
version of Gnuastro that they might get from you.

   Specifically, we want to make sure that you have the right to give
away copies of the programs that relate to Gnuastro, that you receive
the source code or else can get it if you want it, that you can change
these programs or use pieces of them in new free programs, and that you
know you can do these things.

   To make sure that everyone has such rights, we have to forbid you to
deprive anyone else of these rights.  For example, if you distribute
copies of the Gnuastro related programs, you must give the recipients
all the rights that you have.  You must make sure that they, too,
receive or can get the source code.  And you must tell them their
rights.

   Also, for our own protection, we must make certain that everyone
finds out that there is no warranty for the programs that relate to
Gnuastro.  If these programs are modified by someone else and passed on,
we want their recipients to know that what they have is not what we
distributed, so that any problems introduced by others will not reflect
on our reputation.

   The full text of the licenses for the Gnuastro software and book can
be respectively found in *note GNU General Public License::(2) and *note
GNU Free Doc. License::(3).

   ---------- Footnotes ----------

   (1) Texinfo is the GNU documentation system.  It is used to create
this book in all the various formats.

   (2) Also available in <http://www.gnu.org/copyleft/gpl.html>

   (3) Also available in <http://www.gnu.org/copyleft/fdl.html>


File: gnuastro.info,  Node: Naming convention,  Next: Version numbering,  Prev: Your rights,  Up: Introduction

1.4 Naming convention
=====================

Gnuastro is a package of independent programs and a collection of
libraries; here we are mainly concerned with the programs.  Each program
has an official name which consists of one or two words, describing what
it does.  The words are joined with no space, for example NoiseChisel
or Crop.  On the command-line, you can run them with their executable
names which start with an ‘ast’ and might be an abbreviation of the
official name, for example ‘astnoisechisel’ or ‘astcrop’, see *note
Executable names::.

   We will use “ProgramName” for a generic official program name and
‘astprogname’ for a generic executable name.  In this book, the programs
are classified based on what they do and thoroughly explained.  An
alphabetical list of the programs that are installed on your system with
this installation is given in *note Gnuastro programs list::.  That
list also contains the executable names and version numbers along with a
one line description.


File: gnuastro.info,  Node: Version numbering,  Next: New to GNU/Linux?,  Prev: Naming convention,  Up: Introduction

1.5 Version numbering
=====================

Gnuastro can have two formats of version numbers, for official and
unofficial releases.  Official Gnuastro releases are announced on the
‘info-gnuastro’ mailing list, they have a version control tag in
Gnuastro’s development history, and their version numbers are formatted
like “‘A.B’”.  ‘A’ is a major version number, marking a significant
planned achievement (for example see *note GNU Astronomy Utilities
1.0::), while ‘B’ is a minor version number, see below for more on the
distinction.  Note that the numbers are not decimals, so version 2.34 is
much more recent than version 2.5, which is not equal to 2.50.
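
   The non-decimal ordering described above can be checked on the
command-line.  The following is only a minimal sketch and is not part
of Gnuastro: ‘newer_version’ is a hypothetical helper name, and it
assumes that your ‘sort’ program supports the ‘-V’ (version sort)
option, as the one in GNU Coreutils does.

```shell
# Hypothetical helper (not part of Gnuastro): print the more recent
# of two 'A.B' version strings.  Assumes 'sort -V' (version sort),
# which compares each dot-separated number as an integer.
newer_version() {
    printf '%s\n' "$1" "$2" | sort -V | tail -n 1
}

newer_version 2.5 2.34     # 34 is larger than 5.
newer_version 2.5 2.50     # 50 is larger than 5.
```

   The first call should print ‘2.34’ and the second ‘2.50’, because
the minor number is compared as a whole integer and not as a decimal
fraction.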

   Gnuastro also allows a unique version number for unofficial releases.
Unofficial releases can mark any point in Gnuastro’s development
history.  This is done to allow astronomers to easily use any point in
the version controlled history for their data-analysis and research
publication.  See *note Version controlled source:: for a complete
introduction.  This section is not just for developers and is intended
to be straightforward and easy to read, so please have a look if you are
interested in the cutting-edge.  This unofficial version number is a
meaningful and easy to read string of characters, unique to that
particular point of history.  With this feature, users can easily stay
up to date with the most recent bug fixes and additions that are
committed between official releases.

   The unofficial version number is formatted like: ‘A.B.C-D’.  ‘A’ and
‘B’ are the most recent official version number.  ‘C’ is the number of
commits that have been made after version ‘A.B’.  ‘D’ is the first 4 or
5 characters of the commit hash number(1).  Therefore, the unofficial
version number ‘3.92.8-29c8’ corresponds to the 8th commit after the
official version ‘3.92’ and its commit hash begins with ‘29c8’.  The
unofficial version number is sort-able (unlike the raw hash) and as
shown above is descriptive of the state of the unofficial release.  Of
course an official release is preferred for publication (since its
tarballs are easily available and it has gone through more tests, making
it more stable), so if an official release is announced prior to your
publication’s final review, please consider updating to the official
release.
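
   To make the ‘A.B.C-D’ format concrete, the sketch below splits such
a string into its parts using only POSIX shell parameter expansion.
It is an illustration under stated assumptions, not a Gnuastro
feature: ‘split_version’ is a hypothetical name, and the input is
assumed to always contain exactly the ‘A.B.C-D’ structure described
above.

```shell
# Hypothetical helper (not part of Gnuastro): split an unofficial
# version string of the form 'A.B.C-D' into its three parts.
split_version() {
    version=$1
    hash=${version##*-}       # 'D': everything after the last '-'.
    rest=${version%-*}        # 'A.B.C': everything before the last '-'.
    commits=${rest##*.}       # 'C': everything after the last '.'.
    official=${rest%.*}       # 'A.B': everything before the last '.'.
    printf 'official=%s commits=%s hash=%s\n' \
           "$official" "$commits" "$hash"
}

split_version 3.92.8-29c8
```

   For the example version used above, this prints
‘official=3.92 commits=8 hash=29c8’.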

   The major version number is set by a major goal which is defined by
the developers and user community beforehand, for example see *note GNU
Astronomy Utilities 1.0::.  The incremental work done in minor releases
is commonly a small step in achieving the major goal.  Therefore, there
is no limit on the number of minor releases and the difference between
the (hypothetical) versions 2.927 and 3.0 can be a small (negligible to
the user) improvement that finalizes the defined goals.

* Menu:

* GNU Astronomy Utilities 1.0::  Plans for version 1.0 release

   ---------- Footnotes ----------

   (1) Each point in Gnuastro’s history is uniquely identified with a 40
character long hash which is created from its contents and previous
history for example: ‘5b17501d8f29ba3cd610673261e6e2229c846d35’.  So the
string ‘D’ in the version for this commit could be ‘5b17’, or ‘5b175’.


File: gnuastro.info,  Node: GNU Astronomy Utilities 1.0,  Prev: Version numbering,  Up: Version numbering

1.5.1 GNU Astronomy Utilities 1.0
---------------------------------

Currently (prior to Gnuastro 1.0), the aim of Gnuastro is to have a
complete system for data manipulation and analysis at least similar to
IRAF(1). So an astronomer can take all the standard data analysis steps
(starting from raw data to the final reduced product and standard
post-reduction tools) with the various programs in Gnuastro.

   The maintainers of each camera or detector on a telescope can provide
a completely transparent shell script or Makefile to the observer for
data analysis.  This script can set configuration files for all the
required programs to work with that particular camera.  The script can
then run the proper programs in the proper sequence.  The user/observer
can easily follow the standard shell script to understand (and modify)
each step and the parameters used.  Bash (or another modern GNU/Linux
shell) is powerful and made for this gluing job.  This will
simultaneously improve performance and transparency.  Shell scripts (or
Makefiles) are also basic constructs that are easy to learn and readily
available as part of the Unix-like operating systems.
If there is no program to do a desired step, Gnuastro’s libraries can be
used to build specific programs.

   The main factor is that all observatories or projects can freely
contribute to Gnuastro and all simultaneously benefit from it (since it
doesn’t belong to any particular one of them), much like how for-profit
organizations (for example RedHat, or Intel and many others) are major
contributors to free and open source software for their shared benefit.
Gnuastro’s copyright has been fully awarded to GNU, so it doesn’t belong
to any particular astronomer or astronomical facility or project.

   ---------- Footnotes ----------

   (1) <http://iraf.noao.edu/>


File: gnuastro.info,  Node: New to GNU/Linux?,  Next: Report a bug,  Prev: Version numbering,  Up: Introduction

1.6 New to GNU/Linux?
=====================

Some astronomers initially install and use a GNU/Linux operating system
because their necessary tools can only be installed in this environment.
However, the transition is not necessarily easy.  To encourage you in
investing the patience and time to make this transition, and actually
enjoy it, we will first start with a basic introduction to GNU/Linux
operating systems.  Afterwards, in *note Command-line interface:: we’ll
discuss the wonderful benefits of the command-line interface, how it
beautifully complements the graphic user interface, and why it is worth
the (apparently steep) learning curve.  Finally a complete chapter
(*note Tutorials::) is devoted to real world scenarios of using Gnuastro
(on the command-line).  Therefore if you don’t yet feel comfortable with
the command-line we strongly recommend going through that chapter after
finishing this section.

   You might have already noticed that we are not using the name
“Linux”, but “GNU/Linux”.  Please take the time to have a look at the
following essays and FAQs for a complete understanding of this very
important distinction.

   • <https://www.gnu.org/gnu/gnu-users-never-heard-of-gnu.html>

   • <https://www.gnu.org/gnu/linux-and-gnu.html>

   • <https://www.gnu.org/gnu/why-gnu-linux.html>

   • <https://www.gnu.org/gnu/gnu-linux-faq.html>

   In short, the Linux kernel(1) is built using the GNU C library
(glibc) and GNU compiler collection (gcc).  The Linux kernel software
alone is just a means for other software to access the hardware
resources, it is useless alone: to say “running Linux”, is like saying
“driving your carburetor”.

   To have an operating system, you need lower-level (to build the
kernel), and higher-level (to use it) software packages.  The majority
of such software in most Unix-like operating systems is GNU software:
“the whole system is basically GNU with Linux loaded”.  Therefore to
acknowledge GNU’s instrumental role in the creation and usage of the
Linux kernel and the operating systems that use it, we should call these
operating systems “GNU/Linux”.

* Menu:

* Command-line interface::      Introduction to the command-line

   ---------- Footnotes ----------

   (1) In Unix-like operating systems, the kernel connects software and
hardware worlds.


File: gnuastro.info,  Node: Command-line interface,  Prev: New to GNU/Linux?,  Up: New to GNU/Linux?

1.6.1 Command-line interface
----------------------------

One aspect of Gnuastro that might be a little troubling to new GNU/Linux
users is that (at least for the time being) it only has a command-line
user interface (CLI). This might be contrary to the mostly graphical
user interface (GUI) experience with proprietary operating systems.
Since the various actions available aren’t always on the screen, the
command-line interface can be complicated, intimidating, and frustrating
for a first-time user.  This is understandable and also experienced by
anyone who started using the computer (from childhood) in a graphical
user interface (this includes most of Gnuastro’s authors).  Here we hope
to convince you of the unique benefits of this interface which can
greatly enhance your productivity while complementing your GUI
experience.

   Through GNOME 3(1), most GNU/Linux based operating systems now have
an advanced and useful GUI. Since the GUI was created long after the
command-line, some wrongly consider the command line to be obsolete.
Both interfaces are useful for different tasks.  For example you can’t
view an image, video, pdf document or web page on the command-line.  On
the other hand you can’t reproduce your results easily in the GUI.
Therefore they should not be regarded as rivals but as complementary
user interfaces.  Here we will outline how the CLI can be useful in
scientific programs.

   You can think of the GUI as a veneer over the CLI to facilitate a
small subset of all the possible CLI operations.  Each click you make in
the GUI can be thought of as internally running a different CLI
command.  So asymptotically (if a good designer can design a GUI which
is able to show you all the possibilities to click on) the GUI is only
as powerful as the command-line.  In practice, such graphical designers
are very hard to find for every program, so the GUI operations are
always a subset of the internal CLI commands.  For programs that are
only made for the GUI, this means that lots of potentially useful
operations are not included.  It also makes ‘interface design’ a
crucially important part of any GUI program.  Scientists don’t usually
have enough resources to hire a graphical designer.  Also, GUI code is
far more complex than CLI code, which is harmful for scientific
software, see *note Science and its tools::.

   For programs that have a GUI, one action on the GUI (moving and
clicking a mouse, or tapping a touchscreen) might be more efficient and
easier than its CLI counterpart (typing the program name and your
desired configuration).  However, if you have to repeat that same action
more than once, the GUI will soon become frustrating and prone to
errors.  Unless the designers of a particular program decided to design
such a system for a particular GUI action, there is no general way to
run any possible series of actions automatically on the GUI.

   On the command-line, you can run any series of actions, coming from
any CLI-capable programs you choose, in any possible permutation with
one command(2).  This allows for much more creativity, and for exact
reproducibility that is not available to a GUI user.  For technical and
scientific operations, where the same operation (using various programs)
has to be done on a large set of data files, this is crucially
important: exact reproducibility is a founding principle of scientific
results.  The most common CLI (which is also known as a shell) in
GNU/Linux is GNU Bash.  We strongly encourage you to put aside several
hours and go through this beautifully explained web page:
<https://flossmanuals.net/command-line/>.  You don’t need to read or
even fully understand the whole thing; a general knowledge of the first
few chapters is enough to get you going.
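
   As a minimal sketch of running one operation on many files, the
short shell script below loops over a set of files and applies the same
command to each.  The file names are hypothetical and ‘wc’ is only a
stand-in for a real analysis program; both are assumptions made for
illustration, not part of Gnuastro.

```shell
#!/bin/sh
# Create a scratch directory holding three small stand-in "data" files
# (hypothetical names, used only for this illustration).
workdir=$(mktemp -d)
for i in 1 2 3; do
    printf 'line a\nline b\n' > "$workdir/data-$i.txt"
done

# Apply the same operation to every file.  'wc -l' (count lines) stands
# in for whatever analysis program you would really run.
count=0
for f in "$workdir"/data-*.txt; do
    lines=$(wc -l < "$f")
    echo "$(basename "$f"): $lines lines"
    count=$((count + 1))
done
echo "Processed $count files"

# Clean up the scratch directory.
rm -r "$workdir"
```

   Because the whole series of actions lives in one file, running the
script again repeats exactly the same steps, which is the kind of
reproducibility described above.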

   Since the operations in the GUI are limited and they are visible,
reading a manual is not that important in the GUI (most programs don’t
even have any!).  However, to give you the creative power explained
above, with a CLI program, it is best if you first read the manual of
any program you are using.  You don’t need to memorize any details, only
an understanding of the generalities is needed.  Once you start working,
there are easier ways to remember a particular option or operation
detail, see *note Getting help::.

   To experience the command-line in its full glory and not in the GUI
terminal emulator, press the following keys together: <CTRL+ALT+F4>(3)
to access the virtual console.  To return to your GUI, press the same
keys above replacing <F4> with <F7> (or <F1>, or <F2>, depending on your
GNU/Linux distribution).  In the virtual console, the GUI, with all its
distracting colors and information, is gone, enabling you to focus
entirely on your actual work.

   For operations that use a lot of your system’s resources (processing
a large number of large astronomical images for example), the virtual
console is the place to run them.  This is because the GUI is not
competing with your research work for your system’s RAM and CPU. Since
the virtual consoles are completely independent, you can even log out of
your GUI environment to give even more of your hardware resources to the
programs you are running and thus reduce the operating time.

   Since it uses far less system resources, the CLI is also convenient
for remote access to your computer.  Using secure shell (SSH) you can
log in securely to your system (similar to the virtual console) from
anywhere even if the connection speeds are low.  There are apps for
smart phones and tablets which allow you to do this.

   ---------- Footnotes ----------

   (1) <http://www.gnome.org/>

   (2) By writing a shell script and running it, for example see the
tutorials in *note Tutorials::.

   (3) Instead of <F4>, you can use any of the keys from <F1> to <F6>
for different virtual consoles depending on your GNU/Linux distribution;
try them all out.  You can also run a separate GUI from within this
console if you want to.


File: gnuastro.info,  Node: Report a bug,  Next: Suggest new feature,  Prev: New to GNU/Linux?,  Up: Introduction

1.7 Report a bug
================

According to Wikipedia “a software bug is an error, flaw, failure, or
fault in a computer program or system that causes it to produce an
incorrect or unexpected result, or to behave in unintended ways”.  So
when you see that a program is crashing, not reading your input
correctly, giving the wrong results, or not writing your output
correctly, you have found a bug.  In such cases, it is best if you
report the bug to the developers.  The programs will also inform you if
known impossible situations occur (which are caused by something
unexpected) and will ask the users to report the bug issue.

   Prior to actually filing a bug report, it is best to search previous
reports.  The issue might have already been found and even solved.  The
best place to check if your bug has already been discussed is the bug
tracker on *note Gnuastro project webpage:: at
<https://savannah.gnu.org/bugs/?group=gnuastro>.  In the top search
fields (under “Display Criteria”) set the “Open/Closed” drop-down menu
to “Any” and choose the respective program or general category of the
bug in “Category” and click the “Apply” button.  The results colored
green have already been solved and the status of those colored in red is
shown in the table.

   Recently corrected bugs are probably not yet publicly released
because they are scheduled for the next Gnuastro stable release.  If the
bug is solved but not yet released and it is an urgent issue for you,
you can get the version controlled source and compile that, see *note
Version controlled source::.

   To solve the issue as readily as possible, please follow the
guidelines below in your bug report.  The How to Report Bugs
Effectively (http://www.chiark.greenend.org.uk/~sgtatham/bugs.html) and
How To Ask Questions The Smart Way
(http://catb.org/~esr/faqs/smart-questions.html) essays also provide
some good generic advice for all software (don’t contact their authors
for Gnuastro’s problems).  Mastering the art of giving good bug reports
(like asking good questions) can greatly enhance your experience with
any free and open source software.  So investing the time to read
through these essays will greatly reduce your frustration when
something doesn’t work the way you feel it is supposed to, for a large
range of software, not just Gnuastro.

*Be descriptive*
     Please provide as many details as possible and be very descriptive.
     Explain what you expected and what the output was: it might be that
     your expectation was wrong.  Also please clearly state which
     sections of the Gnuastro book (this book), or other references you
     have studied to understand the problem.  This can be useful in
     correcting the book (adding links to likely places where users will
     check).  But more importantly, it will be encouraging for the
     developers, since you are showing how serious you are about the
     problem and that you have actually put some thought into it.  “To
     be able to ask a question clearly is two-thirds of the way to
     getting it answered.” – John Ruskin (1819-1900).

*Individual and independent bug reports*
     If you have found multiple bugs, please send them as separate (and
     independent) bugs (as much as possible).  This will significantly
     help us in managing and resolving them sooner.

*Reproducible bug reports*
     If we cannot exactly reproduce your bug, then it is very hard to
     resolve it.  So please send us a Minimal working example(1) along
     with the description.  For example in running a program, please
     send us the full command-line text and the output with the ‘-P’
     option, see *note Operating mode options::.  If the bug only occurs
     for a certain input, also send us that input file.  In case the
     input FITS is large, please use Crop to only crop the problematic
     section and make it as small as possible so it can easily be
     uploaded and downloaded and not waste the archive’s storage, see
     *note Crop::.

There are generally two ways to inform us of bugs:

   • Send a mail to ‘bug-gnuastro@gnu.org’.  Any mail you send to this
     address will be distributed through the bug-gnuastro mailing
     list(2).  This is the simplest way to send us bug reports.  The
     developers will then register the bug into the project web page
     (next choice) for you.

   • Use the Gnuastro project web page at
     <https://savannah.gnu.org/projects/gnuastro/>: There are two ways
     to get to the submission page as listed below.  Fill in the form as
     described below and submit it (see *note Gnuastro project webpage::
     for more on the project web page).

        • Using the top horizontal menu items, immediately under the top
          page title.  Hovering your mouse on “Support” will open a
          drop-down list.  Select “Submit new”.

        • In the main body of the page, under the “Communication tools”
          section, click on “Submit new item”.

   Once an item has been registered in the mailing list or web page,
the developers will add it to either the “Bug Tracker” or “Task Manager”
trackers of the Gnuastro project web page.  These two trackers can only
be edited by the Gnuastro project developers, but they can be browsed by
anyone, so you can follow the progress on your bug.  You are most
welcome to join us in developing Gnuastro, and fixing the bug you have
found may be a good starting point.  Gnuastro is designed to be easy for
anyone to develop (see *note Science and its tools::) and there is a
full chapter devoted to developing it: *note Developing::.

   ---------- Footnotes ----------

   (1) <http://en.wikipedia.org/wiki/Minimal_Working_Example>

   (2) <https://lists.gnu.org/mailman/listinfo/bug-gnuastro>


File: gnuastro.info,  Node: Suggest new feature,  Next: Announcements,  Prev: Report a bug,  Up: Introduction

1.8 Suggest new feature
=======================

We would always be happy to hear of suggested new features.  For every
program there are already lists of features that we are planning to add.
You can see the current list of plans from the Gnuastro project web page
at <https://savannah.gnu.org/projects/gnuastro/> and following
“Tasks”→“Browse” on the horizontal menu at the top of the page
immediately under the title, see *note Gnuastro project webpage::.  If
you want to request a feature for an existing program, click on the
“Display Criteria” above the list and under “Category”, choose that
particular program.  Under “Category” you can also see the existing
suggestions for new programs or other cases like installation,
documentation or libraries.  Also be sure to set the “Open/Closed” value
to “Any”.

   If the feature you want to suggest is not already listed in the task
manager, then follow the steps that are fully described in *note Report
a bug::.  Please keep in mind that the developers are all busy with
their own astronomical research, implementing existing “task”s, and
resolving bugs.  Gnuastro is a volunteer effort and none of the
developers are paid for their hard work.  So, although we will try our
best, please don’t expect your suggested feature to be immediately
included (with the next release of Gnuastro).

   The best person to apply the exciting new feature you have in mind is
you, since you have the motivation and need.  In fact Gnuastro is
designed for making it as easy as possible for you to hack into it (add
new features, change existing ones and so on), see *note Science and its
tools::.  Please have a look at the chapter devoted to developing (*note
Developing::) and start applying your desired feature.  Once you have
added it, you can use it for your own work and if you feel you want
others to benefit from your work, you can request for it to become part
of Gnuastro.  You can then join the developers and start maintaining
your own part of Gnuastro.  If you choose to take this path of action
please contact us beforehand (*note Report a bug::) so we can avoid
possible duplicate activities and get interested people in contact.

*Gnuastro is a collection of low level programs:* As described in *note
Program design philosophy::, a founding principle of Gnuastro is that
each library or program should be basic and low-level.  High level jobs
should be done by running the separate programs or using separate
functions in succession through a shell script or calling the libraries
by higher level functions, see the examples in *note Tutorials::.  So
when making the suggestions please consider how your desired job can
best be broken into separate steps and modularized.


File: gnuastro.info,  Node: Announcements,  Next: Conventions,  Prev: Suggest new feature,  Up: Introduction

1.9 Announcements
=================

Gnuastro has a dedicated mailing list for making announcements
(‘info-gnuastro’).  Anyone can subscribe to this mailing list.  Anytime
there is a new stable or test release, an email will be circulated
there.  The email contains a summary of the overall changes along with a
detailed list (from the ‘NEWS’ file).  This mailing list is thus the
best way to stay up to date with new releases, easily learn about the
updated/new features, or dependencies (see *note Dependencies::).

   To subscribe to this list, please visit
<https://lists.gnu.org/mailman/listinfo/info-gnuastro>.  Traffic (number
of mails per unit time) in this list is designed to be low: only a
handful of mails per year.  Previous announcements are available on its
archive (http://lists.gnu.org/archive/html/info-gnuastro/).


File: gnuastro.info,  Node: Conventions,  Next: Acknowledgments,  Prev: Announcements,  Up: Introduction

1.10 Conventions
================

In this book we have the following conventions:

   • All commands that are to be run on the shell (command-line) prompt
     as the user start with a ‘$’.  In case they must be run as a
     super-user or system administrator, they will start with a single
     ‘#’.  If the command is on a separate line and the next line ‘is
     also in the code type face’, but doesn’t start with ‘$’ or ‘#’,
     then it is the output of the command after it is run.  As a
     user, you don’t need to type those lines.  A line that starts with
     ‘##’ is just a comment for explaining the command to a human reader
     and must not be typed.

   • If the command becomes larger than the page width a <\> is inserted
     in the code.  If you are typing the code by hand on the
     command-line, you don’t need to use multiple lines or add the extra
     space characters, so you can omit them.  If you want to copy and
     paste these examples (highly discouraged!)  then the <\> should
     stay.

     The <\> character is a shell escape character which is commonly
     used to make characters which have special meaning for the shell
     lose that special meaning (the shell will not treat them specially
     if there is a <\> immediately before them).  When it is the last
     character in a line (the next character is a new-line character),
     the new-line character loses its meaning and the shell continues
     reading the command from the next line, enabling you to use
     multiple lines to write your commands.
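
   The two roles of <\> described above can be tried with a minimal
sketch like the one below.  It only uses ‘echo’; the variable names
‘joined’ and ‘literal’ are hypothetical and only hold the results for
printing.

```shell
# One command written over two lines: the backslash at the end of the
# first line makes the shell continue reading on the next line.
joined=$(echo one two \
              three)
echo "$joined"     # prints: one two three

# A backslash before a special character removes its special meaning:
# the '$' is kept literally instead of starting a variable expansion.
literal=$(echo \$HOME)
echo "$literal"    # prints: $HOME
```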

   This is not a convention, but a by-product of the PDF building
process of the manual: In the PDF version of this manual, a single quote
(or apostrophe) character in the commands or codes is shown like this:
‘'’.  Single quotes are sometimes necessary in combination with commands
like ‘awk’ or ‘sed’, or when using Column arithmetic in Gnuastro’s own
Table (see *note Column arithmetic::).  Therefore when typing
(recommended) or copy-pasting (not recommended) the commands that have a
‘'’, please correct it to the single-quote (or apostrophe) character,
otherwise the command will fail.
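
   As a small illustration of why the plain single-quote character
matters, the sketch below passes a short program to ‘awk’ inside single
quotes.  The two input rows are made-up numbers used only for this
example.

```shell
# The awk program is wrapped in plain ASCII single quotes (') so the
# shell hands '$1' and '$2' to awk untouched, instead of trying to
# expand them as shell variables.
sum=$(printf '1 2\n3 4\n' | awk '{ print $1 + $2 }')

# Prints the sum of the two columns of each row: 3, then 7.
echo "$sum"
```

   If the quotes were replaced by a typographic quote character, the
shell would no longer see them as quotes and the command would not
behave as intended.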


File: gnuastro.info,  Node: Acknowledgments,  Prev: Conventions,  Up: Introduction

1.11 Acknowledgments
====================

Gnuastro would not have been possible without scholarships and grants
from several funding institutions.  We thus ask that if you used
Gnuastro in any of your papers/reports, please add the proper citation
and acknowledge the funding agencies/projects.  For details of which
papers to cite (may be different for different programs) and get the
acknowledgment statement to include in your paper, please run the
relevant programs with the common ‘--cite’ option like the example
commands below (for more on ‘--cite’, please see *note Operating mode
options::).

     $ astnoisechisel --cite
     $ astmkcatalog --cite

   Here, we’ll acknowledge all the institutions (and their grants) along
with the people who helped make Gnuastro possible.  The full list of
Gnuastro authors is available at the start of this book and the
‘AUTHORS’ file in the source code (both are generated automatically from
the version controlled history).  The plain text file ‘THANKS’, which is
also distributed along with the source code, contains the list of people
and institutions who played an indirect role in Gnuastro (without
committing any code to the Gnuastro version controlled history).

   The Japanese Ministry of Education, Culture, Sports, Science, and
Technology (MEXT) scholarship for Mohammad Akhlaghi’s Masters and PhD
degree in Tohoku University Astronomical Institute had an instrumental
role in the long term learning and planning that made the idea of
Gnuastro possible.  The very critical view points of Professor Takashi
Ichikawa (Mohammad’s adviser) were also instrumental in the initial
ideas and creation of Gnuastro.  Afterwards, the European Research
Council (ERC) advanced grant 339659-MUSICOS (Principal investigator:
Roland Bacon) was vital in the growth and expansion of Gnuastro.
Working with Roland at the Centre de Recherche Astrophysique de Lyon
(CRAL), enabled a thorough re-write of the core functionality of all
libraries and programs, turning Gnuastro into the large collection of
generic programs and libraries it is today.  Work on improving Gnuastro
and making it mature is now continuing primarily in the Instituto de
Astrofisica de Canarias (IAC) and in particular in collaboration with
Johan Knapen and Ignacio Trujillo.

   In general, we would like to gratefully thank the following people
for their useful and constructive comments and suggestions (in
alphabetical order by family name): Valentina Abril-melgarejo, Marjan
Akbari, Carlos Allende Prieto, Hamed Altafi, Roland Bacon, Roberto Baena
Gallé, Zahra Bagheri, Karl Berry, Leindert Boogaard, Nicolas Bouché,
Stefan Brüns, Fernando Buitrago, Adrian Bunk, Rosa Calvi, Mark
Calabretta, Nushkia Chamba, Benjamin Clement, Nima Dehdilani, Antonio
Diaz Diaz, Alexey Dokuchaev, Pierre-Alain Duc, Elham Eftekhari, Paul
Eggert, Sepideh Eskandarlou, Gaspar Galaz, Andrés García-Serra Romero,
Zohre Ghaffari, Thérèse Godefroy, Giulia Golini, Madusha Gunawardhana,
Bruno Haible, Stephen Hamer, Leslie Hunt, Takashi Ichikawa, Raúl Infante
Sainz, Brandon Invergo, Oryna Ivashtenko, Aurélien Jarno, Lee Kelvin,
Brandon Kelly, Mohammad-Reza Khellat, Johan Knapen, Geoffry Krouchi,
Martin Kuemmel, Clotilde Laigle, Floriane Leclercq, Alan Lefor, Javier
Licandro, Sebastián Luna Valero, Alberto Madrigal, Guillaume Mahler,
Juan Miro, Alireza Molaeinezhad, Javier Moldon, Juan Molina Tobar,
Francesco Montanari, Raphael Morales, Carlos Morales Socorro, Sylvain
Mottet, Dmitrii Oparin, François Ochsenbein, Bertrand Pain, William
Pence, Mamta Pommier, Marcel Popescu, Bob Proulx, Joseph Putko, Samane
Raji, Teymoor Saifollahi, Joanna Sakowska, Elham Saremi, Markus Schaney,
Yahya Sefidbakht, Alejandro Serrano Borlaff, Zahra Sharbaf, David Shupe,
Leigh Smith, Jenny Sorce, Lee Spitler, Richard Stallman, Michael Stein,
Ole Streicher, Alfred M. Szmidt, Michel Tallon, Juan C. Tello, Vincenzo
Testa, Éric Thiébaut, Ignacio Trujillo, David Valls-Gabaud, Aaron
Watkins, Richard Wilbur, Michael H.F. Wilkinson, Christopher Willmer,
Xiuqin Wu, Sara Yousefi Taemeh, Johannes Zabl.  The GNU French
Translation Team is also managing the French version of the top Gnuastro
web page which we highly appreciate.  Finally we should thank all the
(sometimes anonymous) people in various online forums who patiently
answered all our small (but important) technical questions.

   All work on Gnuastro has been voluntary, but the authors are most
grateful to the following institutions (in chronological order) for
hosting/supporting us in our research.  Where necessary, these
institutions have disclaimed any ownership of the parts of Gnuastro that
were developed there, thus ensuring the freedom of Gnuastro for the
future (see *note Copyright assignment::).  We highly appreciate their
support for free software, and thus free science, and therefore a free
society.

     Tohoku University Astronomical Institute, Sendai, Japan.
     University of Salento, Lecce, Italy.
     Centre de Recherche Astrophysique de Lyon (CRAL), Lyon, France.
     Instituto de Astrofisica de Canarias (IAC), Tenerife, Spain.
     Google Summer of Code 2020


File: gnuastro.info,  Node: Tutorials,  Next: Installation,  Prev: Introduction,  Up: Top

2 Tutorials
***********

To help new users have a smooth and easy start with Gnuastro, in this
chapter several thoroughly elaborated tutorials, or cookbooks, are
provided.  These tutorials demonstrate the capabilities of different
Gnuastro programs and libraries, along with tips and guidelines for the
best practices of using them in various realistic situations.

   We strongly recommend going through these tutorials to get a good
feeling of how the programs are related (built in a modular design to be
used together in a pipeline), very similar to the core Unix-based
programs that they were modeled on.  Therefore these tutorials will
greatly help you use Gnuastro’s programs (and generally, the Unix-like
command-line environment) effectively for your research.

   In *note Sufi simulates a detection::, we’ll start with a
fictional(1) tutorial explaining how Abd al-rahman Sufi (903 – 986 A.D.,
the first recorded description of “nebulous” objects in the heavens is
attributed to him) could have used some of Gnuastro’s programs for a
realistic simulation of his observations and see if his detection of
nebulous objects was trustworthy.  Because all conditions are under
control in a simulated/mock environment/dataset, mock datasets can be a
valuable tool to inspect the limitations of your data analysis and
processing.  But they need to be as realistic as possible, so the first
tutorial is dedicated to this important step of an analysis.

   The next two tutorials (*note General program usage tutorial:: and
*note Detecting large extended targets::) use real input datasets from
some of the deep Hubble Space Telescope (HST) images and the Sloan
Digital Sky Survey (SDSS) respectively.  Their aim is to demonstrate
some real-world problems that many astronomers often face and how they
can be solved with Gnuastro’s programs.

   The ultimate aim of *note General program usage tutorial:: is to
detect galaxies in a deep HST image, measure their positions and
brightness and select those with the strongest colors.  In the process,
it takes many detours to introduce you to the useful capabilities of
many of the programs.  So please be patient in reading it.  If you don’t
have much time and can only try one of the tutorials, we recommend this
one.

   *note Detecting large extended targets:: deals with a major problem
in astronomy: effectively detecting the faint outer wings of bright (and
large) nearby galaxies to extremely low surface brightness levels
(roughly one quarter of the local noise level in the example discussed).
Besides the interesting scientific questions in these low-surface
brightness features, failure to properly detect them will bias the
measurements of the background objects and the survey’s noise estimates.
This is an important issue, especially in wide surveys, because
bright/large galaxies and stars(2) cover a significant fraction of the
survey area.

   In these tutorials, we have intentionally avoided too many cross
references to make them easier to read.  For more information about a
particular program, you can visit the section with the same name as the
program in this book.  Each program section in the subsequent chapters
starts by explaining the general concepts behind what it does, for
example see *note Convolve::.  If you only want practical information on
running a program, for example its options/configuration, input(s) and
output(s), please consult the subsection titled “Invoking ProgramName”,
for example see *note Invoking astnoisechisel::.  For an explanation of
the conventions we use in the example codes through the book, please see
*note Conventions::.

* Menu:

* Sufi simulates a detection::  Simulating a detection.
* General program usage tutorial::  Tutorial on many programs in generic scenario.
* Detecting large extended targets::  NoiseChisel for huge extended targets.

   ---------- Footnotes ----------

   (1) The historically motivated tutorial (*note Sufi simulates a
detection::) is not intended to be a historical reference (the
historical facts of this fictional tutorial used Wikipedia as a
reference).  This form of presenting a tutorial was influenced by the
PGF/TikZ and Beamer manuals.  They are both packages in TeX and LaTeX:
the first is a high-level vector graphic programming environment, while
with the second you can make presentation slides.  On a similar topic,
there are also
some nice words of wisdom for Unix-like systems called Rootless Root
(http://catb.org/esr/writings/unix-koans).  These also have a similar
style but they use a mythical figure named Master Foo.  If you already
have some experience in Unix-like systems, you will definitely find
these Unix Koans entertaining/educative.

   (2) Stars also have similarly large and extended wings due to the
point spread function, see *note PSF::.


File: gnuastro.info,  Node: Sufi simulates a detection,  Next: General program usage tutorial,  Prev: Tutorials,  Up: Tutorials

2.1 Sufi simulates a detection
==============================

It is the year 953 A.D. and Abd al-rahman Sufi (903 – 986 A.D.)(1) is in
Shiraz as a guest astronomer.  He had come there to use the advanced 123
centimeter astrolabe for his studies on the Ecliptic.  However,
something had been bothering him for a long time.  While mapping the
constellations, there were several non-stellar objects that he had
detected in the sky; one of them was in the Andromeda constellation.
During a trip he had to Yemen, Sufi had seen another such object in the
southern skies looking over the Indian ocean.  He wasn’t sure if such
cloud-like non-stellar objects (which he was the first to call ‘Sahābi’
in Arabic or ‘nebulous’) were real astronomical objects or if they were
only the result of some bias in his observations.  Could such diffuse
objects actually be detected at all with his detection technique?

   He still had a few hours left until nightfall (when he would continue
his studies on the ecliptic) so he decided to find an answer to this
question.  He had thoroughly studied Claudius Ptolemy’s (90 – 168 A.D)
Almagest and had made lots of corrections to it, in particular in
measuring the brightness.  Using his same experience, he was able to
measure a magnitude for the objects and wanted to simulate his
observation to see if a simulated object with the same brightness and
size could be detected in a simulated noise with the same detection
technique.  The general outline of the steps he wants to take is:

  1. Make some mock profiles in an over-sampled image.  The initial mock
     image has to be over-sampled prior to convolution or other forms of
     transformation in the image.  Through his experiences, Sufi knew
     that this is because the image of heavenly bodies is actually
     transformed by the atmosphere or other sources outside the
     atmosphere (for example gravitational lenses) prior to being
     sampled on an image.  Since that transformation occurs on a
     continuous grid, to best approximate it, he should do all the work
     on a finer pixel grid.  In the end he can re-sample the result to
     the initially desired grid size.

  2. Convolve the image with a point spread function (PSF, see *note
     PSF::) that is over-sampled to the same resolution as the mock
     image.  Since he wants to finish in a reasonable time and the PSF
     kernel will be very large due to oversampling, he has to use
     frequency domain convolution which has the side effect of dimming
     the edges of the image.  So in the first step above he also has to
     build the image to be larger by at least half the width of the PSF
     convolution kernel on each edge.

  3. With all the transformations complete, the image should be
     re-sampled to the pixel size of his detector.

  4. He should remove those extra pixels on all edges to remove
     frequency domain convolution artifacts in the final product.

  5. He should add noise to the (until now, noise-less) mock image.
     After all, all observations have noise associated with them.

   Fortunately Sufi had heard of GNU Astronomy Utilities from a
colleague in Isfahan (where he worked) and had installed it on his
computer a year before.  It had tools to do all the steps above.  He had
used MakeProfiles before, but wasn’t sure which columns he had chosen in
his user or system wide configuration files for which parameters, see
*note Configuration files::.  So to start his simulation, Sufi runs
MakeProfiles with the ‘-P’ option to check which columns in a catalog
MakeProfiles currently recognizes, along with the output image
parameters.  In particular, Sufi is interested in the recognized columns
(shown below).

     $ astmkprof -P

     [[[ ... Truncated lines ... ]]]

     # Output:
      type         float32     # Type of output: e.g., int16, float32, etc...
      mergedsize   1000,1000   # Number of pixels along first FITS axis.
      oversample   5           # Scale of oversampling (>0 and odd).

     [[[ ... Truncated lines ... ]]]

     # Columns, by info (see `--searchin'), or number (starting from 1):
      ccol         2           # Coord. columns (one call for each dim.).
      ccol         3           # Coord. columns (one call for each dim.).
      fcol         4           # sersic (1), moffat (2), gaussian (3),
                               # point (4), flat (5), circumference (6),
                               # distance (7), radial-table (8).
      rcol         5           # Effective radius or FWHM in pixels.
      ncol         6           # Sersic index or Moffat beta.
      pcol         7           # Position angle.
      qcol         8           # Axis ratio.
      mcol         9           # Magnitude.
      tcol         10          # Truncation in units of radius or pixels.

     [[[ ... Truncated lines ... ]]]


In Gnuastro, column counting starts from 1, so the columns are ordered
such that the first column (number 1) can be an ID he specifies for each
object (and MakeProfiles ignores); each subsequent column is used for
another property of the profile.  It is also possible to use column
names for the values of these options and change these defaults, but
Sufi preferred to stick to the defaults.  Fortunately MakeProfiles can
also make the PSF which is to be used on the mock image, and with the
‘--prepforconv’ option, it can also make the mock image larger by the
correct amount and shift all the sources by the correct amount.

   For his initial check he decides to simulate the nebula in the
Andromeda constellation.  On the night he was observing, the PSF had a
FWHM of about 5 pixels.  So in the first row (profile), he defines the
PSF parameters: he sets the radius column (‘rcol’ above, fifth column)
to ‘5.000’ and chooses a Moffat function for its functional form.
Remembering how diffuse the nebula in the Andromeda
constellation was, he decides to simulate it with a mock Sérsic index
1.0 profile.  He wants the output to be 499 pixels by 499 pixels, so he
can put the center of the mock profile in the central pixel of the image
(note that an even number doesn’t have a central element).

   Looking at his drawings of it, he decides that a reasonable effective
radius would be 40 pixels on this image’s pixel scale.  He also sets the
axis ratio and position angle to approximately correct values, and
finally sets the total magnitude of the profile to 3.44, which he had
accurately measured.  Sufi also decides to truncate both the mock
profile and PSF at 5 times the respective radius parameters.  In the end
he decides to put four stars on the four corners of the image at very
low magnitudes as a visual scale.  While he was preparing the catalog,
one of his students approached him and was also following the steps.

   Using all the information above, he creates the catalog of mock
profiles he wants in a file named ‘cat.txt’ (short for catalog) using
his favorite text editor and stores it in a directory named
‘simulationtest’ in his home directory.  [The ‘cat’ command prints the
contents of a file; its name is short for “concatenation”.  When the
editor opens in the steps below, please copy-paste the lines shown after
“‘cat cat.txt’” into ‘cat.txt’.  Note that there are 7 lines, the first
one starting with <#>.  Be careful when copying from the PDF format; the
Info, web, or text formats shouldn’t have any problem]:

     $ mkdir ~/simulationtest
     $ cd ~/simulationtest
     $ pwd
     /home/rahman/simulationtest
     $ emacs cat.txt
     $ ls
     cat.txt
     $ cat cat.txt
     # Column 4: PROFILE_NAME [,str6] Radial profile's functional name
      1  0.0000   0.0000  moffat  5.000  4.765  0.0000  1.000  30.000  5.000
      2  250.00   250.00  sersic  40.00  1.000  -25.00  0.400  3.4400  5.000
      3  50.000   50.000  point   0.000  0.000  0.0000  0.000  6.0000  0.000
      4  450.00   50.000  point   0.000  0.000  0.0000  0.000  6.5000  0.000
      5  50.000   450.00  point   0.000  0.000  0.0000  0.000  7.0000  0.000
      6  450.00   450.00  point   0.000  0.000  0.0000  0.000  7.5000  0.000

The zero point magnitude for his observation was 18.  Now he has all the
necessary parameters and runs MakeProfiles with the following command:


     $ astmkprof --prepforconv --mergedsize=499,499 --zeropoint=18.0 cat.txt
     MakeProfiles started on Sat Oct  6 16:26:56 953
       - 6 profiles read from cat.txt
       - Random number generator (RNG) type: mt19937
       - Using 8 threads.
       ---- row 2 complete, 5 left to go
       ---- row 3 complete, 4 left to go
       ---- row 4 complete, 3 left to go
       ---- row 5 complete, 2 left to go
       ---- ./0_cat.fits created.
       ---- row 0 complete, 1 left to go
       ---- row 1 complete, 0 left to go
       - ./cat_profiles.fits created.                       0.041651 seconds
     MakeProfiles finished in 0.267234 seconds

     $ ls
     0_cat.fits  cat_profiles.fits  cat.txt

The file ‘0_cat.fits’ is the PSF Sufi had asked for, and
‘cat_profiles.fits’ is the image containing the main objects in the
catalog.  The size of ‘cat_profiles.fits’ surprised the student:
instead of 499 by 499 (as we had requested), it was 2615 by 2615 pixels
(from the command below):

     $ astfits cat_profiles.fits -h1 | grep NAXIS

So Sufi explained why oversampling is important in modeling, especially
for parts of the image where the flux change is significant over a
pixel.  Recall that when you oversample the model (for example by 5
times), for every desired pixel, you get 25 pixels ($5\times5$).  Sufi
then explained that after convolving (next step below) we will
down-sample the image to get our originally desired size/resolution.
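
   The 2615 pixel width can be reproduced with shell arithmetic.  This
is only a sketch: it assumes that ‘--prepforconv’ pads each side by half
the PSF width (12 pixels at the final resolution, the value that appears
later in this walkthrough) before the oversampling factor of 5 is
applied.  The variable names are illustrative.

```shell
# Illustrative names; the values are the ones used in this tutorial.
width=499        # Requested width (--mergedsize), in final pixels.
edge=12          # Padding per side in final pixels (assumed from the PSF).
oversample=5     # Oversampling factor reported by 'astmkprof -P'.

# Padded width at the final resolution, then at the oversampled one.
padded=$(( width + 2 * edge ))
oversampled=$(( padded * oversample ))
echo "$padded $oversampled"
```

The second number is the width that ‘astfits’ reports for
‘cat_profiles.fits’.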

   Sufi then opened ‘cat_profiles.fits’ [you can use any FITS viewer,
for example, ‘ds9’].  After seeing the image, the student complained
that only the large elliptical model for the Andromeda nebula can be
seen in the center.  He couldn’t see the four stars that we had also
requested in the catalog.  So Sufi had to explain that the stars are
there in the image, but the reason that they aren’t visible when looking
at the whole image at once, is that they only cover a single pixel!  To
prove it, he centered the image around the coordinates 2308 and 2308,
where one of the stars is located in the over-sampled image [you can do
this in ‘ds9’ by selecting “Pan” in the “Edit” menu, then clicking
around that position].  Sufi then zoomed in to that region and soon, the
star’s non-zero pixel could be clearly seen.

   Sufi explained that the stars will take the shape of the PSF (cover
an area of more than one pixel) after convolution.  If we didn’t have an
atmosphere and we didn’t need an aperture, then stars would only cover a
single pixel with normal CCD resolutions.  So Sufi convolved the image
with this command:

     $ astconvolve --kernel=0_cat.fits cat_profiles.fits \
                   --output=cat_convolved.fits
     Convolve started on Sat Oct  6 16:35:32 953
       - Using 8 CPU threads.
       - Input: cat_profiles.fits (hdu: 1)
       - Kernel: 0_cat.fits (hdu: 1)
       - Input and Kernel images padded.                    0.075541 seconds
       - Images converted to frequency domain.              6.728407 seconds
       - Multiplied in the frequency domain.                0.040659 seconds
       - Converted back to the spatial domain.              3.465344 seconds
       - Padded parts removed.                              0.016767 seconds
       - Output: cat_convolved.fits
     Convolve finished in:  10.422161 seconds

     $ ls
     0_cat.fits  cat_convolved.fits  cat_profiles.fits  cat.txt

When convolution finished, Sufi opened ‘cat_convolved.fits’ and the four
stars could be easily seen now.  It was interesting for the student that
all the flux in that single pixel is now distributed over so many pixels
(the sum of all the pixels in each convolved star is actually equal to
the value of the single pixel before convolution).  Sufi explained how a
PSF with a larger FWHM would make the points even wider than this
(distributing their flux in a larger area).  With the convolved image
ready, they were prepared to re-sample it to the original pixel scale
Sufi had planned [from the ‘$ astmkprof -P’ command above, recall that
MakeProfiles had over-sampled the image by 5 times].  Sufi explained the
basic concepts of warping the image to his student and ran Warp with the
following command:

     $ astwarp --scale=1/5 --centeroncorner cat_convolved.fits
     Warp started on Sat Oct  6 16:51:59 953
      Using 8 CPU threads.
      Input: cat_convolved.fits (hdu: 1)
      matrix:
             0.2000   0.0000   0.4000
             0.0000   0.2000   0.4000
             0.0000   0.0000   1.0000

     $ ls
     0_cat.fits          cat_convolved_scaled.fits     cat.txt
     cat_convolved.fits  cat_profiles.fits

     $ astfits -p cat_convolved_scaled.fits | grep NAXIS
     NAXIS   =                    2 / number of data axes
     NAXIS1  =                  523 / length of data axis 1
     NAXIS2  =                  523 / length of data axis 2

‘cat_convolved_scaled.fits’ now has the correct pixel scale.  However,
the image is still larger than what we had wanted, it is 523
($499+12+12$) by 523 pixels.  The student is slightly confused, so Sufi
also re-samples the PSF with the same scale by running

     $ astwarp --scale=1/5 --centeroncorner 0_cat.fits
     $ astfits -p 0_cat_scaled.fits | grep NAXIS
     NAXIS   =                    2 / number of data axes
     NAXIS1  =                   25 / length of data axis 1
     NAXIS2  =                   25 / length of data axis 2

Sufi notes that $25=(2\times12)+1$ and goes on to explain how frequency
space convolution will dim the edges and that is why he added the
‘--prepforconv’ option to MakeProfiles, see *note If convolving
afterwards::.  Now that convolution is done, Sufi can remove those extra
pixels using Crop with the command below.  Crop’s ‘--section’ option
accepts coordinates inclusively and counting from 1 (according to the
FITS standard), so the crop region’s first pixel has to be 13, not 12.

     $ astcrop cat_convolved_scaled.fits --section=13:*-12,13:*-12    \
               --mode=img --zeroisnotblank
     Crop started on Sat Oct  6 17:03:24 953
       - Read metadata of 1 image.                          0.001304 seconds
       ---- ...nvolved_scaled_cropped.fits created: 1 input.
     Crop finished in:  0.027204 seconds

     $ ls
     0_cat.fits          cat_convolved_scaled_cropped.fits  cat_profiles.fits
     cat_convolved.fits  cat_convolved_scaled.fits          cat.txt

Finally, ‘cat_convolved_scaled_cropped.fits’ is $499\times499$ pixels
and the mock Andromeda galaxy is centered on the central pixel (open the
image in a FITS viewer and confirm this by zooming into the center, note
that an even-width image wouldn’t have a central pixel).  These are the
same dimensions that Sufi had desired in the beginning.  All this trouble
was certainly worth it because now there is no dimming on the edges of
the image and the profile centers are more accurately sampled.
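
   The section arithmetic used in the crop above can be checked with a
few lines of shell.  This is a minimal sketch, assuming that ‘*’ in
‘--section’ stands for the full length of the axis; the variable names
are illustrative.

```shell
full=523     # Width of cat_convolved_scaled.fits (NAXIS1 above).
edge=12      # Pixels to discard on each side.

first=$(( edge + 1 ))          # First kept pixel (counting from 1).
last=$(( full - edge ))        # Last kept pixel, what '*-12' refers to.
kept=$(( last - first + 1 ))   # Both ends are included in the section.
echo "--section=$first:$last,$first:$last keeps $kept pixels per axis"
```

Starting the section at 12 instead of 13 would keep one pixel too many
along each axis.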

   The final step to simulate a real observation would be to add noise
to the image.  Sufi set the zero point magnitude to the same value that
he had used when making the mock profiles.  Looking again at his
observation log, he saw that the background flux he had measured near
the nebula had a magnitude of 7 that night.  So using these values he
ran MakeNoise:

     $ astmknoise --zeropoint=18 --background=7 --output=out.fits    \
                  cat_convolved_scaled_cropped.fits
     MakeNoise started on Sat Oct  6 17:05:06 953
       - Generator type: ranlxs1
       - Generator seed: 1428318100
     MakeNoise finished in:  0.033491 (seconds)

     $ ls
     0_cat.fits         cat_convolved_scaled_cropped.fits cat_profiles.fits
     cat_convolved.fits cat_convolved_scaled.fits         cat.txt  out.fits
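
   To get a feeling for the two numbers given to MakeNoise above, a
magnitude can be converted to a flux with the standard relation
flux = 10^(-0.4 x (magnitude - zeropoint)).  The following is a minimal
sketch, assuming that relation and an available AWK; ‘mag2flux’ is a
hypothetical helper name, not a Gnuastro program, and how MakeNoise
itself uses these values is not shown here.

```shell
# mag2flux MAGNITUDE ZEROPOINT (hypothetical helper): print the flux,
# in pixel-value units, that corresponds to the given magnitude.
mag2flux() {
    awk -v m="$1" -v zp="$2" \
        'BEGIN { printf "%.2f\n", exp(-0.4 * (m - zp) * log(10)) }'
}

mag2flux 7 18      # The background used above.
mag2flux 3.44 18   # The total magnitude of the mock nebula.
```

A magnitude equal to the zero point gives a flux of exactly one, and
every 2.5 magnitudes brighter multiplies the flux by ten.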

The ‘out.fits’ file now contains the noised image of the mock catalog
Sufi had asked for.  Seeing how the ‘--output’ option allows the user to
specify the name of the output file, the student was confused and wanted
to know why Sufi hadn’t used it more regularly before.  Sufi then
explained to him that for intermediate steps, you can rely on the
automatic output of the programs, see *note Automatic output::.  Doing
so will give all the intermediate files a similar basic name structure,
so in the end you can simply remove them all with the Shell’s
capabilities, and it will be familiar for other users.  So Sufi decided
to show this to the student by making a shell script from the commands
he had used before.
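
   The naming pattern visible in the ‘ls’ listings above, where each
program appends its own suffix before ‘.fits’, can be imitated with
shell parameter expansion.  This sketch only reproduces the file names
seen in this tutorial; it is not how Gnuastro builds them internally,
and the variable names are illustrative.

```shell
# Imitate the names seen in the listings above (not Gnuastro's own code).
name=cat_convolved.fits
for suffix in _scaled _cropped; do
    # Strip the '.fits' ending, append the suffix, put '.fits' back.
    name="${name%.fits}$suffix.fits"
    echo "$name"
done
```

Since every intermediate file keeps the same leading name, one pattern
like ‘cat*.fits’ is enough to select all of them later.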

   The command-line shell has the capability to read all the separate
input commands from a file.  This is useful when you want to do the same
thing multiple times, with only the names of the files or minor
parameters changing between the different instances.  Using the shell’s
history (by pressing the up keyboard key) Sufi reviewed all the commands
and then he retrieved the last 5 commands with the ‘$ history 5’
command.  He selected all those lines he had input and put them in a
text file named ‘mymock.sh’.  Then he defined the ‘edge’ and ‘base’
shell variables for easier customization later.  Finally, before every
command, he added some comments (lines starting with <#>) for future
readability.

     edge=12
     base=cat

     # Stop running next commands if one fails.
     set -e

     # Remove any (possibly) existing output (from previous runs)
     # before starting.
     rm -f out.fits

     # Run MakeProfiles to create an oversampled FITS image.
     astmkprof --prepforconv --mergedsize=499,499 --zeropoint=18.0 \
               "$base".txt

     # Convolve the created image with the kernel.
     astconvolve --kernel=0_"$base".fits "$base"_profiles.fits \
                 --output="$base"_convolved.fits

     # Scale the image back to the intended resolution.
     astwarp --scale=1/5 --centeroncorner "$base"_convolved.fits

     # Crop the edges out (dimmed during convolution). ‘--section’
     # accepts inclusive coordinates, so the first pixel of the section
     # must be one larger than the number of edge pixels to remove.
     st_edge=$(( edge + 1 ))
     astcrop "$base"_convolved_scaled.fits --zeroisnotblank \
             --mode=img --section=$st_edge:*-$edge,$st_edge:*-$edge

     # Add noise to the image.
     astmknoise --zeropoint=18 --background=7 --output=out.fits \
                "$base"_convolved_scaled_cropped.fits

     # Remove all the temporary files.
     rm 0*.fits "$base"*.fits

   He used this chance to remind the student of the importance of
comments in code or shell scripts: when writing the code, you have a
good mental picture of what you are doing, so writing comments might
seem superfluous and excessive.  However, in one month when you want to
re-use the script, you have lost that mental picture and remembering it
can be time-consuming and frustrating.  The importance of comments is
further amplified when you want to share the script with a
friend/colleague.  So it is good to accompany any script/code with
useful comments while you are writing it (create a good mental picture
of what/why you are doing something).

   Sufi then explained to the eager student that you define a variable
by giving it a name, followed by an ‘=’ sign and the value you want.
Then you can reference that variable from anywhere in the script by
calling its name with a ‘$’ prefix.  So in the script whenever you see
‘$base’, the value we defined for it above is used.  If you use advanced
editors like GNU Emacs or even simpler ones like Gedit (part of the
GNOME graphical user interface) the variables will become a different
color which can really help in understanding the script.  We have put
all the ‘$base’ variables in double quotation marks (‘"’) so the
variable name and the following text do not get mixed; the shell
removes the ‘"’ after replacing the variable with its value.  To make the
script executable, Sufi ran the following command:

     $ chmod +x mymock.sh

Then finally, Sufi ran the script, simply by calling its file name:

     $ ./mymock.sh

   After the script finished, the only file remaining is the ‘out.fits’
file that Sufi had wanted in the beginning.  Sufi then explained to the
student how he could run this script anywhere that he has a catalog if
the script is in the same directory.  The only thing the student had to
modify in the script was the name of the catalog (the value of the
‘base’ variable in the start of the script) and the value to the ‘edge’
variable if he changed the PSF size.  The student was also happy to hear
that he won’t need to make it executable again when he makes changes
later, it will remain executable unless he explicitly changes the
executable flag with ‘chmod’.
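
   The effect of the quotation marks around ‘$base’ described earlier
can be tried directly on the command-line.  The following is a minimal
sketch; the variables ‘quoted’, ‘unquoted’ and ‘braced’ are only for
illustration and are not part of ‘mymock.sh’.

```shell
base=cat
unset base_profiles   # Make sure no such variable exists.

# Quoted: the variable name ends at the closing quote.
quoted="$base"_profiles.fits

# Unquoted: the shell looks for a variable called 'base_profiles',
# which is not defined, so it expands to nothing.
unquoted=$base_profiles.fits

# Braces are another way to mark where the name ends.
braced=${base}_profiles.fits

echo "$quoted | $unquoted | $braced"
```

Only the quoted and braced forms give the intended file name, because
an underscore is a valid character in a shell variable name.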

   The student was really excited, since now, through simple shell
scripting, he could really speed up his work and run any command in any
fashion he likes, allowing him to be much more creative in his work.
Until now he had been using a graphical user interface, which doesn’t
have such a facility; doing repetitive things on it was really
frustrating and sometimes he would make mistakes.  So he left to go and
try scripting on his own computer.

   Sufi could now get back to his own work and see if the simulated
nebula which resembled the one in the Andromeda constellation could be
detected or not.  Although it was extremely faint(2), fortunately it
passed his detection tests and he wrote it in the draft manuscript that
would later become “Book of fixed stars”.  He still had to check the
other nebula he saw from Yemen and several other such objects, but they
could wait until tomorrow (thanks to the shell script, he only has to
define a new catalog).  It was nearly sunset and they had to begin
preparing for the night’s measurements on the ecliptic.

* Menu:

* General program usage tutorial::

   ---------- Footnotes ----------

   (1) In Latin Sufi is known as Azophi.  He was an Iranian astronomer.
His manuscript “Book of fixed stars” contains the first recorded
observations of the Andromeda galaxy, the Large Magellanic Cloud and
seven other non-stellar or ‘nebulous’ objects.

   (2) The brightness of a diffuse object is added over all its pixels
to give its final magnitude, see *note Brightness flux magnitude::.  So
although the magnitude 3.44 (of the mock nebula) is roughly ten times
brighter than 6 (of the brightest star), the central galaxy looks much
fainter.  Put another way, the brightness is distributed over a large
area in the case of a nebula.


File: gnuastro.info,  Node: General program usage tutorial,  Next: Detecting large extended targets,  Prev: Sufi simulates a detection,  Up: Tutorials

2.2 General program usage tutorial
==================================

Measuring colors of astronomical objects in broad-band or narrow-band
images is one of the most basic and common steps in astronomical
analysis.  Here, we will use Gnuastro’s programs to get a physical scale
(area at certain redshifts) of the field we are studying, detect objects
in a Hubble Space Telescope (HST) image, measure their colors and
identify the ones with the strongest colors, do a visual inspection of
these objects and inspect their spatial position in the image.  After this
tutorial, you can also try the *note Detecting large extended targets::
tutorial which goes into a little more detail on detecting very low
surface brightness signal.

   During the tutorial, we will take many detours to explain, and
practically demonstrate, the many capabilities of Gnuastro’s programs.
In the end you will see that the things you learned during this tutorial
are much more generic than this particular problem and can be used in
solving a wide variety of problems involving the analysis of data
(images or tables).  So please don’t rush, and go through the steps
patiently to optimally master Gnuastro.

   In this tutorial, we’ll use the HST eXtreme Deep Field
(https://archive.stsci.edu/prepds/xdf) dataset.  Like almost all
astronomical surveys, this dataset is free for download and usable by
the public.  You will need the following tools in this tutorial:
Gnuastro, SAO DS9(1), GNU Wget(2), and AWK (most common implementation
is GNU AWK(3)).

   This tutorial was first prepared for the “Exploring the Ultra-Low
Surface Brightness Universe” workshop (November 2017) at the ISSI in
Bern, Switzerland.  It was further extended in the “4th Indo-French
Astronomy School” (July 2018) organized by LIO, CRAL CNRS UMR5574, UCBL,
and IUCAA in Lyon, France.  We are very grateful to the organizers of
these workshops and the attendees for the very fruitful discussions and
suggestions that made this tutorial possible.

*Write the example commands manually:* Try to type the example commands
on your terminal manually and use the history feature of your
command-line (by pressing the “up” button to retrieve previous
commands).  Don’t simply copy and paste the commands shown here.  This
will help simulate future situations when you are processing your own
datasets.

* Menu:

* Calling Gnuastro's programs::  Easy way to find Gnuastro’s programs.
* Accessing documentation::     Access to manual of programs you are running.
* Setup and data download::     Setup this template and download datasets.
* Dataset inspection and cropping::  Crop the flat region to use in next steps.
* Angular coverage on the sky::  Measure the field size on the sky.
* Cosmological coverage::       Measure the field size at different redshifts.
* Building custom programs with the library::  Easy way to build new programs.
* Option management and configuration files::  Dealing with options and configuring them.
* Warping to a new pixel grid::  Transforming/warping the dataset.
* NoiseChisel and Multiextension FITS files::  Running NoiseChisel and having multiple HDUs.
* NoiseChisel optimization for detection::  Check NoiseChisel’s operation and improve it.
* NoiseChisel optimization for storage::  Dramatically decrease output’s volume.
* Segmentation and making a catalog::  Finding true peaks and creating a catalog.
* Working with catalogs estimating colors::  Estimating colors using the catalogs.
* Column statistics color-magnitude diagram::  Visualizing column correlations.
* Aperture photometry::         Doing photometry on a fixed aperture.
* Matching catalogs::           Easily find corresponding rows from two catalogs.
* Finding reddest clumps and visual inspection::  Selecting some targets and inspecting them.
* Writing scripts to automate the steps::  Scripts will greatly help in re-doing things fast.
* Citing and acknowledging Gnuastro::  How to cite and acknowledge Gnuastro in your papers.

   ---------- Footnotes ----------

   (1) See *note SAO DS9::, available at
<http://ds9.si.edu/site/Home.html>

   (2) <https://www.gnu.org/software/wget>

   (3) <https://www.gnu.org/software/gawk>


File: gnuastro.info,  Node: Calling Gnuastro's programs,  Next: Accessing documentation,  Prev: General program usage tutorial,  Up: General program usage tutorial

2.2.1 Calling Gnuastro’s programs
---------------------------------

A handy feature of Gnuastro is that all program names start with ‘ast’.
This will allow your command-line processor to easily list and
auto-complete Gnuastro’s programs for you.  Try typing the following
command (press <TAB> key when you see ‘<TAB>’) to see the list:

     $ ast<TAB><TAB>

Any program that starts with ‘ast’ (including all Gnuastro programs)
will be shown.  By choosing the subsequent characters of your desired
program and pressing <TAB><TAB> again, the list will narrow down and
the program name will auto-complete once your input characters are
unambiguous.  In short, you often don’t need to type the full name of
the program you want to run.


File: gnuastro.info,  Node: Accessing documentation,  Next: Setup and data download,  Prev: Calling Gnuastro's programs,  Up: General program usage tutorial

2.2.2 Accessing documentation
-----------------------------

Gnuastro contains a large number of programs and it is natural to forget
the details of each program’s options or inputs and outputs.  Therefore,
before starting the analysis steps of this tutorial, let’s review how
you can access this book to refresh your memory any time you want,
without having to take your hands off the keyboard.

   When you install Gnuastro, this book is also installed on your system
along with all the programs and libraries, so you don’t need an internet
connection to access/read it.  Also, by accessing this book as
described below, you can be sure that it corresponds to your installed
version of Gnuastro.

   GNU Info(1) is the program in charge of displaying the manual on the
command-line (for more, see *note Info::).  To see this whole book on
your command-line, please run the following command and press subsequent
keys.  Info has its own mini-environment, therefore we’ll show the keys
that must be pressed in the mini-environment after a ‘->’ sign.  You can
also ignore anything after the ‘#’ sign in the middle of the line; these
comments are only for your information.

     $ info gnuastro                # Open the top of the manual.
     -> <SPACE>                     # All the book chapters.
     -> <SPACE>                     # Continue down: show sections.
     -> <SPACE> ...                 # Keep pressing space to go down.
     -> q                           # Quit Info, return to the command-line.

   The thing that greatly simplifies navigation in Info is the links
(regions with an underline).  You can immediately go to the next link in
the page with the <TAB> key and press <ENTER> on it to go into that
part of the manual.  Try the commands above again, but this time also
use <TAB> to go to the links and press <ENTER> on them to go to the
respective section of the book.  Then follow a few more links and go
deeper into the book.  To return to the previous page, press <l> (small
L).  If you are searching for a specific phrase in the whole book (for
example an option name), press <s> and type your search phrase and end
it with an <ENTER>.

   You don’t need to start from the top of the manual every time.  For
example, to get to *note Invoking astnoisechisel::, run the following
command.  In general, all programs have such an “Invoking ProgramName”
section in this book.  These sections are specifically for the
description of inputs, outputs and configuration options of each
program.  You can access them directly for each program by giving its
executable name to Info.

     $ info astnoisechisel

   The other sections don’t have such shortcuts.  To directly access
them from the command-line, you need to tell Info to look into
Gnuastro’s manual, then look for the specific section (an unambiguous
title is necessary).  For example, if you only want to review/remember
NoiseChisel’s *note Detection options::, just run the following
command.  Note how case is irrelevant for Info when calling a title in
this manner.

     $ info gnuastro "Detection options"

   In general, Info is a powerful and convenient way to access this
whole book with detailed information about the programs you are running.
If you are not already familiar with it, please run the following
command and just read along and do what it says to learn it.  Don’t stop
until you feel sufficiently fluent in it.  Please invest the half an
hour’s time necessary to start using Info comfortably.  It will greatly
improve your productivity and you will start reaping the rewards of this
investment very soon.

     $ info info

   As a good scientist you need to feel comfortable experimenting with
the features/options, and to be critical of default values rather than
relying on them blindly.  On the other hand, human memory is limited, so
it is important to be able to quickly access any part of this book and
recall the option names, what they do and their acceptable values.

   If you just want the option names and a short description, calling
the program with the ‘--help’ option might also be a good solution like
the first example below.  If you know a few characters of the option
name, you can feed the output to ‘grep’ like the second or third example
commands.

     $ astnoisechisel --help
     $ astnoisechisel --help | grep quant
     $ astnoisechisel --help | grep check

   ---------- Footnotes ----------

   (1) GNU Info is already available on almost all Unix-like operating
systems.


File: gnuastro.info,  Node: Setup and data download,  Next: Dataset inspection and cropping,  Prev: Accessing documentation,  Up: General program usage tutorial

2.2.3 Setup and data download
-----------------------------

The first step in the analysis of the tutorial is to download the
necessary input datasets.  First, to keep things clean, let’s create a
‘gnuastro-tutorial’ directory and continue all future steps in it:

     $ mkdir gnuastro-tutorial
     $ cd gnuastro-tutorial

   We will be using the near infra-red Wide Field Camera
(http://www.stsci.edu/hst/wfc3) dataset.  If you already have them in
another directory (for example ‘XDFDIR’, with the same FITS file names),
you can set the ‘download’ directory to be a symbolic link to ‘XDFDIR’
with a command like this:

     $ ln -s XDFDIR download

Otherwise, when the following images aren’t already present on your
system, you can make a ‘download’ directory and download them there.

     $ mkdir download
     $ cd download
     $ xdfurl=http://archive.stsci.edu/pub/hlsp/xdf
     $ wget $xdfurl/hlsp_xdf_hst_wfc3ir-60mas_hudf_f105w_v1_sci.fits
     $ wget $xdfurl/hlsp_xdf_hst_wfc3ir-60mas_hudf_f125w_v1_sci.fits
     $ wget $xdfurl/hlsp_xdf_hst_wfc3ir-60mas_hudf_f160w_v1_sci.fits
     $ cd ..

In this tutorial, we’ll just use these three filters.  Later, you may
need to download more filters.  To do that, you can use the shell’s
‘for’ loop to download them all in series (one after the other(1)) with
one command like the one below for the WFC3 filters.  Use this command
instead of the three ‘wget’ commands above.  Recall that all the extra
spaces, back-slashes (‘\’), and new lines can be ignored if you are
typing the command on a single line in the terminal.

     $ for f in f105w f125w f140w f160w; do \
         wget $xdfurl/hlsp_xdf_hst_wfc3ir-60mas_hudf_"$f"_v1_sci.fits; \
       done
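
   Before starting a long download, it can help to first print what the
loop would fetch.  The following is a minimal sketch of such a dry run:
it only prints the URLs and skips files that already exist in the
current directory.  The ‘xdf_url’ function is a hypothetical helper
introduced here for illustration; the URL pattern is the one used in
the commands above.

```shell
xdfurl=http://archive.stsci.edu/pub/hlsp/xdf

# xdf_url FILTER (hypothetical helper): print the URL of one filter.
xdf_url() {
    echo "$xdfurl/hlsp_xdf_hst_wfc3ir-60mas_hudf_${1}_v1_sci.fits"
}

# Dry run: report what would be downloaded, skipping existing files.
for f in f105w f125w f140w f160w; do
    url=$(xdf_url "$f")
    file=$(basename "$url")
    if [ -f "$file" ]; then
        echo "have:  $file"
    else
        echo "fetch: $url"    # To really download: wget "$url"
    fi
done
```

Once the printed list looks right, the ‘echo’ line in the ‘else’ branch
can be replaced with the ‘wget’ call shown in its comment.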

   ---------- Footnotes ----------

   (1) Note that you only have one port to the internet, so downloading
in parallel will actually be slower than downloading in series.


File: gnuastro.info,  Node: Dataset inspection and cropping,  Next: Angular coverage on the sky,  Prev: Setup and data download,  Up: General program usage tutorial

2.2.4 Dataset inspection and cropping
-------------------------------------

First, let’s visually inspect the datasets we downloaded in *note Setup
and data download::.  Let’s take the F160W image as an example.  Do the
steps below with the other image(s) too (and later with any dataset that
you want to work on).  It is very important to get a good visual feeling
of the dataset you intend to use.  Also, note how SAO DS9 (used here for
visual inspection of FITS images) doesn’t follow the GNU style of
options where “long” and “short” options are preceded by ‘--’ and ‘-’
respectively (for example ‘--width’ and ‘-w’, see *note Options::).

   Run the command below to see the F160W image with DS9.  DS9’s
‘-zscale’ scaling is good to visually highlight the low surface
brightness regions, and as the name suggests, ‘-zoom to fit’ will fit
the whole dataset in the window.  If the window is too small, expand it
with your mouse, then press the “zoom” button on the top row of buttons
above the image.  Afterwards, in the bottom row of buttons, press “zoom
fit”.  You can also zoom in and out by scrolling your mouse or the
respective operation on your touch-pad when your cursor/pointer is over
the image.

     $ ds9 download/hlsp_xdf_hst_wfc3ir-60mas_hudf_f160w_v1_sci.fits     \
           -zscale -zoom to fit

   As you hover your mouse over the image, notice how the “Value” and
positional fields on the top of the ds9 window get updated.  The first
thing you might notice is that when you hover the mouse over the regions
with no data, they have a value of zero.  The next thing might be that
the dataset actually has two “depth”s (see *note Quantifying measurement
limits::).  Recall that this is a combined/reduced image of many
exposures, and the parts that have more exposures are deeper.  In
particular, the exposure time of the deep inner region is more than 4
times that of the outer (shallower) parts.

   To simplify the analysis in this tutorial, we’ll only be working on
the deep field, so let’s crop it out of the full dataset.  Fortunately
the XDF survey web page (above) contains the vertices of the deep flat
WFC3-IR field.  With Gnuastro’s Crop program(1), you can use those
vertices to cut out this deep region from the larger image.  But before
that, to keep things organized, let’s make a directory called ‘flat-ir’
and keep the flat (single-depth) regions in that directory (with an
‘xdf-’ prefix for a shorter and easier filename).

     $ mkdir flat-ir
     $ astcrop --mode=wcs -h0 --output=flat-ir/xdf-f105w.fits \
               --polygon="53.187414,-27.779152 : 53.159507,-27.759633 : \
                          53.134517,-27.787144 : 53.161906,-27.807208" \
               download/hlsp_xdf_hst_wfc3ir-60mas_hudf_f105w_v1_sci.fits

     $ astcrop --mode=wcs -h0 --output=flat-ir/xdf-f125w.fits \
               --polygon="53.187414,-27.779152 : 53.159507,-27.759633 : \
                          53.134517,-27.787144 : 53.161906,-27.807208" \
               download/hlsp_xdf_hst_wfc3ir-60mas_hudf_f125w_v1_sci.fits

     $ astcrop --mode=wcs -h0 --output=flat-ir/xdf-f160w.fits \
               --polygon="53.187414,-27.779152 : 53.159507,-27.759633 : \
                          53.134517,-27.787144 : 53.161906,-27.807208" \
               download/hlsp_xdf_hst_wfc3ir-60mas_hudf_f160w_v1_sci.fits

   The only thing varying in the three calls to Gnuastro’s Crop program
is the filter name!  Note how everything else is the same.  In such
cases, you should generally avoid repeating a command manually: it is
prone to many bugs, and as you see, it is very hard to read (didn’t you
suddenly write a ‘7’ as an ‘8’?).  To simplify the command, and later
allow work on more filters, we can use the shell’s ‘for’ loop as shown
below.  Notice how the places where the filter names (‘f105w’, ‘f125w’
and ‘f160w’) are used above have been replaced with ‘$f’ (the shell
variable that ‘for’ will update in every loop) below.

     $ rm flat-ir/*.fits
     $ for f in f105w f125w f160w; do \
         astcrop --mode=wcs -h0 --output=flat-ir/xdf-$f.fits \
                 --polygon="53.187414,-27.779152 : 53.159507,-27.759633 : \
                            53.134517,-27.787144 : 53.161906,-27.807208" \
                 download/hlsp_xdf_hst_wfc3ir-60mas_hudf_"$f"_v1_sci.fits; \
       done
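
   Before running a loop that overwrites files, it can help to preview
what the shell will substitute in each iteration.  The sketch below is
only a dry run: it prints the file names and does not call Crop.  The
function name ‘preview_crops’ is made up for this illustration.  It also
hints at why the loop above writes ‘"$f"’ in the input name: in
‘hudf_$f_v1_sci.fits’ the shell would look for a variable called
‘f_v1_sci’, because ‘_’ is a valid character in variable names.

```shell
# Dry run: print the input and output name for every filter.
preview_crops() {
  for f in f105w f125w f160w; do
    # The braces (like the quotes in the loop above) end the variable
    # name before the underscore that follows it.
    echo "download/hlsp_xdf_hst_wfc3ir-60mas_hudf_${f}_v1_sci.fits -> flat-ir/xdf-$f.fits"
  done
}

preview_crops
```

If the three printed lines look right, the same loop body can be trusted
with the real command.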

   Please open these images and inspect them with the same ‘ds9’ command
you used above.  You will see how each one is nicely flat now and doesn’t
have varying depths.  Another important result of this crop is that
regions with no data now have a NaN (Not-a-Number, or blank) value.  In
the downloaded files, such regions had a value of zero.  However, zero
is a number, and is thus meaningful, especially when you later want to
run NoiseChisel(2).  Generally, when you want to ignore some pixels in a
dataset, and avoid higher-level ambiguities or complications, it is
always best to give them blank values (not zero, or some other absurdly
large or small number).  Gnuastro has the Arithmetic program for such
cases, and we’ll introduce it later in this tutorial.
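
   The effect of zero-filled regions on even a simple statistic can be
sketched with plain AWK.  This is only a toy illustration: the pixel
values are invented, the string ‘nan’ merely stands in for a blank
pixel, and the two function names are hypothetical (this is not how
Gnuastro stores or reads blank values).

```shell
# Mean of all the given values (zeros are counted like any number).
mean_all() {
  printf '%s\n' "$@" | awk '{ s += $1; n++ } END { print s / n }'
}

# Mean of the given values, skipping those marked as blank ("nan").
mean_nonblank() {
  printf '%s\n' "$@" \
    | awk '$1 != "nan" { s += $1; n++ } END { print s / n }'
}

mean_all      2 4 0 0        # Zero-filled: empty pixels lower the mean.
mean_nonblank 2 4 nan nan    # Blank-filled: only real data is used.
```

The first call prints 1.5 and the second prints 3: the same two real
measurements give different answers depending on how the empty pixels
were marked.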

   ---------- Footnotes ----------

   (1) To learn more about the Crop program see *note Crop::.

   (2) As you will see below, unlike most other detection algorithms,
NoiseChisel detects the objects from their faintest parts, it doesn’t
start with their high signal-to-noise ratio peaks.  Since the Sky is
already subtracted in many images and noise fluctuates around zero, zero
is commonly higher than the initial threshold applied.  Therefore, not
ignoring zero-valued pixels in this image will cause them to be part of
the detections!


File: gnuastro.info,  Node: Angular coverage on the sky,  Next: Cosmological coverage,  Prev: Dataset inspection and cropping,  Up: General program usage tutorial

2.2.5 Angular coverage on the sky
---------------------------------

This is the deepest image we currently have of the sky.  The first thing
that comes to mind may be this: “How large is this field on the sky?”.
You can get a fast and crude answer with Gnuastro’s Fits program using
this command:

     astfits flat-ir/xdf-f160w.fits --skycoverage

   It will print the sky coverage in two formats (all numbers are in
units of degrees for this image): 1) the image’s central RA and Dec and
full width around that center, 2) the range of RA and Dec covered by
this image.  You can use these values in various online query systems.
You can also use this option to automatically calculate the area covered
by this image.  With the ‘--quiet’ option, the printed output of
‘--skycoverage’ will not contain human-readable text, making it easier
for further processing:

     astfits flat-ir/xdf-f160w.fits --skycoverage --quiet

   The second row is the coverage range along RA and Dec (compare with
the outputs before using ‘--quiet’).  We can thus simply subtract the
first column from the second and multiply it with the difference of the
fourth and third columns to calculate the image area.  We’ll also
multiply each by 60 to have the area in arc-minutes squared.

     astfits flat-ir/xdf-f160w.fits --skycoverage --quiet \
             | awk 'NR==2{print ($2-$1)*60*($4-$3)*60}'

   The returned value is $9.06711$ arcmin$^2$.  *However, this method
ignores the fact that many of the image pixels are blank!*  In other
words, the image does cover this area, but there is no data in more than
half of the pixels.  So let’s calculate the area over which we actually
have data.
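
   To see what the AWK one-liner above does with each column, the sketch
below feeds it two hand-written rows instead of real ‘astfits’ output.
The row layout is assumed from the description above, and the numbers
are simply the extremes of the crop polygon used earlier, so the result
is close to (but not the same as) the value quoted above.  The function
name ‘sample_coverage’ is hypothetical.

```shell
# Hand-written stand-in for the two rows of `--skycoverage --quiet':
#   row 1: center RA, center Dec, width in RA, width in Dec
#   row 2: RA min, RA max, Dec min, Dec max
# The ranges are the extremes of the crop polygon, not real output.
sample_coverage() {
  echo "53.1609655 -27.7834205 0.052897 0.047575"
  echo "53.134517 53.187414 -27.807208 -27.759633"
}

# Same AWK command as above: only the second row (NR==2) is used.
area=$(sample_coverage | awk 'NR==2{print ($2-$1)*60*($4-$3)*60}')
echo "$area"
```

The printed number is the area of the bounding box in arc-minutes
squared, which is why it says nothing about how many pixels inside that
box are blank.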

   The FITS world coordinate system (WCS) meta data standard contains
the key to answering this question.  Run the following command to see
all the FITS keywords (metadata) for one of the images (almost identical
with the other images because they are scaled to the same region of
Sky):

     astfits flat-ir/xdf-f160w.fits -h1

   Look into the keywords grouped under the ‘‘World Coordinate System
(WCS)’’ title.  These keywords define how the image relates to the
outside world.  In particular, the ‘CDELT*’ keywords (or ‘CDELT1’ and
‘CDELT2’ in this 2D image) contain the “Coordinate DELTa” (or change in
coordinate units) with a change in one pixel.  But what are the units of
each “world” coordinate?  The ‘CUNIT*’ keywords (for “Coordinate UNIT”)
have the answer.  In this case, both ‘CUNIT1’ and ‘CUNIT2’ have a value
of ‘deg’, so both “world” coordinates are in units of degrees.  We can
thus conclude that the value of ‘CDELT*’ is in units of
degrees-per-pixel(1).

   With the commands below, we’ll use ‘CDELT’ (along with the image
size) to find the answer to our initial question: “how much of the sky
does this image cover?”.  The lines starting with ‘##’ are just comments
for you to read and understand each command.  Don’t type them on the
terminal.  The commands are intentionally repetitive in some places to
better understand each step and also to demonstrate the beauty of
command-line features like history, variables, pipes and loops (which
you will commonly use as you master the command-line).

*Use shell history:* Don’t forget to make effective use of your shell’s
history: you don’t have to re-type a previous command to add something
to it.  This is especially convenient when you just want to make a small
change to your previous command.  Press the “up” key on your keyboard
(possibly multiple times) to see your previous command(s) and modify
them accordingly.

     ## If your system language uses ',' (not '.') as decimal separator.
     $ export LANG=C

     ## See the general statistics of non-blank pixel values.
     $ aststatistics flat-ir/xdf-f160w.fits

     ## We only want the number of non-blank pixels.
     $ aststatistics flat-ir/xdf-f160w.fits --number

     ## Keep the result of the command above in the shell variable `n'.
     $ n=$(aststatistics flat-ir/xdf-f160w.fits --number)

     ## See what is stored in the shell variable `n'.
     $ echo $n

     ## Show all the FITS keywords of this image.
     $ astfits flat-ir/xdf-f160w.fits -h1

     ## The resolution (in degrees/pixel) is in the `CDELT' keywords.
     ## Only show lines that contain these characters, by feeding
     ## the output of the previous command to the `grep' program.
     $ astfits flat-ir/xdf-f160w.fits -h1 | grep CDELT

     ## Since the resolution of both dimensions is (approximately) equal,
     ## we'll only use one of them (CDELT1).
     $ astfits flat-ir/xdf-f160w.fits -h1 | grep CDELT1

     ## To extract the value (third token in the line above), we'll
     ## feed the output to AWK. Note that the first two tokens are
     ## `CDELT1' and `='.
     $ astfits flat-ir/xdf-f160w.fits -h1 | grep CDELT1 | awk '{print $3}'

     ## Save it as the shell variable `r'.
     $ r=$(astfits flat-ir/xdf-f160w.fits -h1 | grep CDELT1   \
                   | awk '{print $3}')

     ## Print the values of `n' and `r'.
     $ echo $n $r

     ## Use the number of pixels (first number passed to AWK) and
     ## length of each pixel's edge (second number passed to AWK)
     ## to estimate the area of the field in arc-minutes squared.
     $ echo $n $r | awk '{print $1 * ($2*60)^2}'

   The output of the last command (area of this field) is 4.03817 (or
approximately 4.04) arc-minutes squared.  Just for comparison, this is
roughly 175 times smaller than the Moon’s average angular area (with a
diameter of 30 arc-minutes or half a degree).
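
   The arithmetic of the last steps can be checked without any FITS
file.  In the sketch below, the header card and the pixel count are
stand-ins written by hand: the pixel scale is assumed to be the 60
milli-arcseconds in the file names, and the pixel count is back-derived
from the area quoted above.  Neither is copied from real program output.

```shell
# Hypothetical header card; real ones come from `astfits ... -h1'.
card="CDELT1  =   1.66666666666667E-05 / [deg] hypothetical comment"

# Hypothetical pixel count, standing in for `aststatistics --number'.
n=4038170

# The third token of the card is the value (the first two are
# `CDELT1' and `=').
r=$(echo "$card" | awk '{print $3}')

# Area in arcmin^2, with the same expression as the last command above.
area=$(echo $n $r | awk '{print $1 * ($2*60)^2}')

# Ratio to a disk with a diameter of 30 arcmin (radius of 15 arcmin);
# atan2(0,-1) is a portable way to get pi in AWK.
ratio=$(echo $area | awk '{print atan2(0,-1) * 15^2 / $1}')
echo "$area $ratio"
```

The second printed number is the “roughly 175” of the comparison above:
the area of a 30 arc-minute wide disk divided by the area of the field.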

   Some FITS writers don’t use the ‘CDELT’ convention, making it hard to
use the steps above.  In such cases, you can extract the pixel scale
with the ‘--pixelscale’ option of Gnuastro’s Fits program like the
command below.  Like the ‘--skycoverage’ option above, you can also use
the ‘--quiet’ option to allow easy usage of the values in scripts.

     $ astfits flat-ir/xdf-f160w.fits --pixelscale

*AWK for table/value processing:* As you saw above, AWK is a powerful and
simple tool for text processing.  You will see it often in shell
scripts.  GNU AWK (the most common implementation) comes with a free and
wonderful book (https://www.gnu.org/software/gawk/manual/) in the same
format as this book which will allow you to master it nicely.  Just like
this manual, you can also access GNU AWK’s manual on the command-line
whenever necessary without taking your hands off the keyboard.  Just run
‘info awk’.

*Your locale doesn’t use ‘.’ as decimal separator:* the input/output of
some core operating system tools like ‘awk’ or ‘seq’ depend on the
system locale
(https://en.wikipedia.org/wiki/Locale_(computer_software)).  For example
in Spanish and some other languages the decimal separator (symbol used
to separate the integer and fractional part of a number), is a comma.
Therefore in systems that have Spanish as their default locale, ‘seq’
will print half of unity as ‘‘0,5’’ (instead of ‘‘0.5’’).  This can
cause problems for parsing the printed numbers in other programs.  You
can check your current locale with the ‘locale’ command.  You can test
your default decimal separator with this command:

     seq 0.5 1

   To avoid these kinds of locale-specific problems (for example another
program not being able to read ‘‘0,5’’ as half of unity), you can change
the locale by setting the ‘LANG’ environment variable (or the
lower-level/generic ‘LC_ALL’).  You can do it only for a single command
(the first one below), or all commands within the running session (the
second command below):

     ## Change the locale to the standard, only for this 'seq' command.
     $ LANG=C seq 0.5 1

     ## Change the locale to the standard, for all commands after it.
     $ export LANG=C

   If you want to change it generally for all future sessions, you can
put the second command in your shell’s startup file.  For more on
startup files, please see *note Installation directory::.
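
   One detail that may matter on some systems: in the POSIX locale
model, ‘LC_ALL’ takes precedence over the more specific ‘LC_*’
variables, which in turn take precedence over ‘LANG’.  So if ‘LC_ALL’
or ‘LC_NUMERIC’ is already set in your environment, changing ‘LANG’
alone may not change the decimal separator.  The sketch below sets
‘LC_ALL’ for a single AWK call; the variable name ‘half’ is only for
illustration.

```shell
# Force the standard locale for this one command, whatever the values
# of LANG or LC_NUMERIC are in the environment.
half=$(LC_ALL=C awk 'BEGIN { printf "%.1f", 1/2 }')
echo "$half"
```

With the standard ‘C’ locale this prints ‘0.5’.  Running ‘locale’
without arguments lists the current value of every category, which
helps in finding which variable is actually in effect.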

   ---------- Footnotes ----------

   (1) With the FITS ‘CDELT’ convention, rotation (‘PC’ or ‘CD’
keywords) and scales (‘CDELT’) are separated.  In the FITS standard the
‘CDELT’ keywords are optional.  When ‘CDELT’ keywords aren’t present,
the ‘PC’ matrix is assumed to contain _both_ the coordinate rotation and
scales.  Note that not all FITS writers use the ‘CDELT’ convention.  So
you might not find the ‘CDELT’ keywords in the WCS meta data of some
FITS files.  However, all Gnuastro programs (which use the default FITS
keyword writing format of WCSLIB) write their output WCS with the
‘CDELT’ convention, even if the input doesn’t have it.  If your dataset
doesn’t use the ‘CDELT’ convention, you can feed it to any (simple)
Gnuastro program (for example Arithmetic) and the output will have the
‘CDELT’ keyword.  See Section 8 of the FITS standard
(https://fits.gsfc.nasa.gov/standard40/fits_standard40aa-le.pdf) for
more.


File: gnuastro.info,  Node: Cosmological coverage,  Next: Building custom programs with the library,  Prev: Angular coverage on the sky,  Up: General program usage tutorial

2.2.6 Cosmological coverage
---------------------------

Having found the angular coverage of the dataset in *note Angular
coverage on the sky::, we can now use Gnuastro to answer a more
physically motivated question: “How large is this area at different
redshifts?”.  To get a feeling of the tangential area that this field
covers at redshift 2, you can use Gnuastro’s CosmicCalculator program
(*note CosmicCalculator::).  In particular, you need the tangential
distance covered by 1 arc-second as raw output.  Combined with the
field’s area that was measured before, we can calculate the tangential
area in Mega Parsecs squared ($Mpc^2$).

     ## If your system language uses ',' (not '.') as decimal separator.
     $ export LANG=C

     ## Print general cosmological properties at redshift 2 (for example).
     $ astcosmiccal -z2

     ## When given a "Specific calculation" option, CosmicCalculator
     ## will just print that particular calculation. To see all such
     ## calculations, add a `--help' token to the previous command
     ## (under the same title). Note that with `--help', no processing
     ## is done, so you can always simply append it to remember
     ## something without modifying the command you want to run.
     $ astcosmiccal -z2 --help

     ## Only print the "Tangential dist. covered by 1arcsec at z (kpc)".
     ## in units of kpc/arc-seconds.
     $ astcosmiccal -z2 --arcsectandist

     ## But it's easier to use the short version of this option (which
     ## can be appended to other short options).
     $ astcosmiccal -sz2

     ## Convert this distance to kpc^2/arcmin^2 and save in `k'.
     $ k=$(astcosmiccal -sz2 | awk '{print ($1*60)^2}')

     ## Re-calculate the area of the dataset in arcmin^2.
     $ n=$(aststatistics flat-ir/xdf-f160w.fits --number)
     $ r=$(astfits flat-ir/xdf-f160w.fits -h1 | grep CDELT1   \
                   | awk '{print $3}')
     $ a=$(echo $n $r | awk '{print $1 * ($2^2) * 3600}')

     ## Multiply `k' and `a' and divide by 10^6 for value in Mpc^2.
     $ echo $k $a | awk '{print $1 * $2 / 1e6}'

At redshift 2, this field therefore covers approximately 1.07 $Mpc^2$.
If you would like to see how this tangential area changes with redshift,
you can use a shell loop like below.

     $ for z in 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0; do        \
         k=$(astcosmiccal -sz$z);                                  \
         echo $z $k $a | awk '{print $1, ($2*60)^2 * $3 / 1e6}';   \
       done

Fortunately, the shell has a useful tool/program to print a sequence of
numbers that is nicely called ‘seq’.  You can use it instead of typing
all the different redshifts in this example.  For example the loop below
will calculate and print the tangential coverage of this field across a
larger range of redshifts (0.1 to 5) and with finer increments of 0.1.

     ## If your system language uses ',' (not '.') as decimal separator.
     $ export LANG=C

     ## The loop over the redshifts
     $ for z in $(seq 0.1 0.1 5); do                                  \
         k=$(astcosmiccal -z$z --arcsectandist);                      \
         echo $z $k $a | awk '{print $1, ($2*60)^2 * $3 / 1e6}';   \
       done
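
   The unit conversion inside these loops can be checked in isolation.
The sketch below wraps it in a hypothetical shell function
(‘tangential_area’ is not part of Gnuastro).  The value of 8.58 kpc per
arc-second is a made-up stand-in for the output of ‘astcosmiccal -sz2’,
chosen only to be consistent with the area quoted above; your own run
may print a slightly different number.

```shell
# Convert a tangential scale and an angular area to Mpc^2:
#   $1: tangential distance covered by 1 arcsec (kpc)
#   $2: area of the field (arcmin^2)
tangential_area() {
  # kpc/arcsec -> kpc/arcmin (x60), squared -> kpc^2/arcmin^2,
  # times the area -> kpc^2, divided by 10^6 -> Mpc^2.
  echo "$1 $2" | awk '{print ($1*60)^2 * $2 / 1e6}'
}

# 8.58 is a made-up stand-in for the output of `astcosmiccal -sz2'.
tangential_area 8.58 4.03817
```

Keeping the conversion in one place like this makes it harder for the
two loops above to drift apart when one of them is edited.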


File: gnuastro.info,  Node: Building custom programs with the library,  Next: Option management and configuration files,  Prev: Cosmological coverage,  Up: General program usage tutorial

2.2.7 Building custom programs with the library
-----------------------------------------------

In *note Cosmological coverage::, we repeated a certain
calculation/output of a program multiple times using the shell’s ‘for’
loop.  This simple way of repeating a calculation is great when it is only
necessary once.  However, if you commonly need this calculation and
possibly for a larger number of redshifts at higher precision, the
command above can be slow (try it out to see).

   This slowness of the repeated calls to a generic program (like
CosmicCalculator), is because it can have a lot of overhead on each
call.  To be generic and easy to operate, it has to parse the
command-line and all configuration files (see *note Option management
and configuration files::) which contain human-readable characters and
need a lot of pre-processing to be ready for processing by the computer.
Afterwards, CosmicCalculator has to check the sanity of its inputs and
check which of its many options you have asked for.  All of this
pre-processing takes as much time as the high-level calculation you are
requesting, and it has to re-do all of these for every redshift in your
loop.

   To greatly speed up the processing, you can directly access the core
work-horse of CosmicCalculator without all that overhead by designing
your custom program for this job.  Using Gnuastro’s library, you can
write your own tiny program particularly designed for this exact
calculation (and nothing else!).  To do that, copy and paste the
following C program in a file called ‘myprogram.c’.

     #include <math.h>
     #include <stdio.h>
     #include <stdlib.h>
     #include <gnuastro/cosmology.h>

     int
     main(void)
     {
       double area=4.03817;          /* Area of field (arcmin^2). */
       double z, adist, tandist;     /* Temporary variables.      */

       /* Constants from Planck 2018 (arXiv:1807.06209, Table 2) */
       double H0=67.66, olambda=0.6889, omatter=0.3111, oradiation=0;

       /* Do the same thing for all redshifts (z) from 0.1 to 5.  The
          limit is 5.05 (not 5) so round-off in `z+=0.1' can't drop the
          last step. */
       for(z=0.1; z<5.05; z+=0.1)
         {
           /* Calculate the angular diameter distance. */
           adist=gal_cosmology_angular_distance(z, H0, olambda,
                                                omatter, oradiation);

           /* Calculate the tangential distance of one arcsecond. */
           tandist = adist * 1000 * M_PI / 3600 / 180;

           /* Print the redshift and area. */
           printf("%-5.2f %g\n", z, pow(tandist * 60,2) * area / 1e6);
         }

       /* Tell the system that everything finished successfully. */
       return EXIT_SUCCESS;
     }

Then run the following command to compile your program and run it.

     $ astbuildprog myprogram.c

In the command above, you used Gnuastro’s BuildProgram program.  Its job
is to greatly simplify the compilation, linking and running of simple C
programs that use Gnuastro’s library (like this one).  BuildProgram is
designed to manage Gnuastro’s dependencies, compile and link your custom
program and then run it.

   Did you notice how your custom program was much faster than the
repeated calls to CosmicCalculator in the previous section?  You might
have noticed that a new file called ‘myprogram’ is also created in the
directory.  This is the compiled program that was created and run by the
command above (it’s in binary machine code format, not human-readable any
more).  You can run it again to get the same results with a command like
this:

     $ ./myprogram

   The efficiency of your custom ‘myprogram’ compared to repeated calls
to CosmicCalculator is because in the latter, the requested processing
is comparable to the necessary overheads.  For other programs that take
large input datasets and do complicated processing on them, the overhead
is usually negligible compared to the processing.  In such cases, the
libraries are only useful if you want a different/new processing
compared to the functionalities in Gnuastro’s existing programs.

   Gnuastro has a large library which is used extensively by all the
programs.  In other words, the library is like the skeleton of Gnuastro.
For the full list of available functions classified by context, please
see *note Gnuastro library::.  Gnuastro’s library and BuildProgram are
created to make it easy for you to use these powerful features as you
like.  This gives you a high level of creativity, while also providing
efficiency and robustness.  Several other complete working examples
(involving images and tables) of Gnuastro’s libraries can be seen in
*note Library demo programs::.

   But for this tutorial, let’s stop discussing the libraries at this
point and get back to Gnuastro’s already built programs which don’t
need any programming.  But before continuing, let’s clean up the files
we don’t need any more:

     $ rm myprogram*


File: gnuastro.info,  Node: Option management and configuration files,  Next: Warping to a new pixel grid,  Prev: Building custom programs with the library,  Up: General program usage tutorial

2.2.8 Option management and configuration files
-----------------------------------------------

None of Gnuastro’s programs keep a default value internally within their
code.  However, when you ran CosmicCalculator only with the ‘-z2’ option
(not specifying the cosmological parameters) in *note Cosmological
coverage::, it completed its processing and printed results.  Where did
the cosmological parameters (like the matter density) that are
necessary for its calculations come from?  Short answer: the
values come from a configuration file (see *note Configuration file
precedence::).

   CosmicCalculator is a small program with a limited set of
parameters/options.  Therefore, let’s use it to discuss configuration
files in Gnuastro (for more, you can always see *note Configuration
files::).  Configuration files are an important part of all Gnuastro’s
programs, especially the ones with a large number of options, so it’s
important to understand this part well.

   Once you get comfortable with configuration files here, you can make
good use of them in all Gnuastro programs (for example, NoiseChisel).
For example, to do optimal detection on various datasets, you can have
configuration files for different noise properties.  The configuration
of each program (besides its version) is vital for the reproducibility
of your results, so it is important to manage them properly.

   As we saw above, the full list of the options in all Gnuastro
programs can be seen with the ‘--help’ option.  Try calling it with
CosmicCalculator as shown below.  Note how options are grouped by
context to make it easier to find your desired option.  Within each
group, options are ordered alphabetically.

     $ astcosmiccal --help

The options that need a value have an <=> sign after their long version
and ‘FLT’, ‘INT’ or ‘STR’ for floating point numbers, integer numbers,
and strings (filenames for example) respectively.  All options have a
long format and some have a short format (a single character), for more
see *note Options::.

   When you are using a program, it is often necessary to check the
value the option has just before the program starts its processing.  In
other words, after it has parsed the command-line options and all
configuration files.  You can see the values of all options that need
one with the ‘--printparams’ or ‘-P’ option.  ‘--printparams’ is common
to all programs (see *note Common options::).  In the command below, try
replacing ‘-P’ with ‘--printparams’ to see how both do the same
operation.

     $ astcosmiccal -P

   Let’s say you want a different Hubble constant.  Try running the
following command (just adding ‘--H0=70’ after the command above) to see
how the Hubble constant in the output of the command above has changed.

     $ astcosmiccal -P --H0=70

Afterwards, delete the ‘-P’ and add a ‘-z2’ to see the calculations with
the new cosmology (or configuration).

     $ astcosmiccal --H0=70 -z2

   From the output of the ‘--help’ option, note how the option for
Hubble constant has both short (‘-H’) and long (‘--H0’) formats.  One
final note is that the equal (<=>) sign is not mandatory.  In the short
format, the value can stick to the actual option (the short option name
is just one character after-all, thus easily identifiable) and in the
long format, a white-space character is also enough.

     $ astcosmiccal -H70    -z2
     $ astcosmiccal --H0 70 -z2 --arcsectandist

When an option doesn’t need a value, and has a short format (like
‘--arcsectandist’), you can easily place it _before_ other short
options.  So the last command above can also be written as:

     $ astcosmiccal --H0 70 -sz2
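
   How a token like ‘-sz2’ gets split can be tried out with the shell’s
own ‘getopts’ builtin.  This is only an illustration of the general
short-option convention: it is not the parser that Gnuastro’s programs
use, ‘getopts’ has no support for long options like ‘--H0’, and the
function name ‘show_opts’ is hypothetical.

```shell
# Print how `getopts' splits the given command-line tokens.
# `s' takes no value; `z' and `H' each need one (marked by the `:').
show_opts() {
  OPTIND=1        # Reset, so the function can be called more than once.
  while getopts "sz:H:" opt; do
    case $opt in
      s) echo "s" ;;
      z) echo "z=$OPTARG" ;;
      H) echo "H=$OPTARG" ;;
      *) return 1 ;;
    esac
  done
}

show_opts -sz2
show_opts -H70 -z2
```

In the first call, ‘s’ is recognized as an option without a value, so
the rest of the token (‘z2’) is read as the next short option and its
value.  This is why a value-less option has to come before (not after)
an option that takes a value within the same token.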

   Let’s assume that in one project, you want to only use rounded
cosmological parameters (H0 of 70 km/s/Mpc, cosmological constant
density of 0.7 and matter density of 0.3).  You should therefore run
CosmicCalculator like this:

     $ astcosmiccal --H0=70 --olambda=0.7 --omatter=0.3 -z2

   But having to type these extra options every time you run
CosmicCalculator will be prone to errors (typos in particular),
frustrating and slow.  Therefore in Gnuastro, you can put all the
options and their values in a “Configuration file” and tell the programs
to read the option values from there.

   Let’s create a configuration file...  With your favorite text editor,
make a file named ‘my-cosmology.conf’ (or ‘my-cosmology.txt’, the suffix
doesn’t matter, but a more descriptive suffix like ‘.conf’ is
recommended).  Then put the following lines inside of it.  One space
between the option value and name is enough; the values are just aligned
under each other to help in readability.  Also note that you can only
use long option names in configuration files.

     H0       70
     olambda  0.7
     omatter  0.3

You can now tell CosmicCalculator to read this file for option values
immediately using the ‘--config’ option as shown below.  Do you see how
the output of the following command corresponds to the option values in
‘my-cosmology.conf’, and is therefore identical to the previous command?

     $ astcosmiccal --config=my-cosmology.conf -z2

   But still, having to type ‘--config=my-cosmology.conf’ every time is
annoying, isn’t it?  If you need this cosmology every time you are
working in a specific directory, you can use Gnuastro’s default
configuration file names and avoid having to type it manually.

   The default configuration files (that are checked if they exist) must
be placed in the hidden ‘.gnuastro’ sub-directory (in the same directory
you are running the program).  Their file name (within ‘.gnuastro’) must
also be the same as the program’s executable name.  So in the case of
CosmicCalculator, the default configuration file in a given directory is
‘.gnuastro/astcosmiccal.conf’.

   Let’s do this.  We’ll first make a directory for our custom
cosmology, then build a ‘.gnuastro’ within it.  Finally, we’ll move the
custom configuration file there:

     $ mkdir my-cosmology
     $ mkdir my-cosmology/.gnuastro
     $ mv my-cosmology.conf my-cosmology/.gnuastro/astcosmiccal.conf

   Once you run CosmicCalculator within ‘my-cosmology’ (as shown below),
you will see how your custom cosmology has been implemented without
having to type anything extra on the command-line.

     $ cd my-cosmology
     $ astcosmiccal -P
     $ cd ..
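
   The same directory layout can also be created without a text editor.
The sketch below is one possible way to do it, assuming your system has
the common ‘mktemp’ utility; it works inside a throw-away directory
(kept in the hypothetical variable ‘dir’) so that none of the files of
this tutorial are touched.

```shell
# Work in a throw-away directory so nothing in the tutorial is touched.
dir=$(mktemp -d)

# `-p' creates the parent and the hidden sub-directory in one call.
mkdir -p "$dir/my-cosmology/.gnuastro"

# Write the three options with a here-document instead of an editor.
cat > "$dir/my-cosmology/.gnuastro/astcosmiccal.conf" <<'EOF'
H0       70
olambda  0.7
omatter  0.3
EOF

# Show the file: this is what a later `astcosmiccal -P' would read
# when it is run inside that `my-cosmology' directory.
cat "$dir/my-cosmology/.gnuastro/astcosmiccal.conf"
```

When you are done experimenting, the whole test directory can be removed
with ‘rm -rf "$dir"’.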

   To further simplify the process, you can use the ‘--setdirconf’
option.  If you are already in your desired working directory, calling
this option with the others will automatically write the final values
(along with descriptions) in ‘.gnuastro/astcosmiccal.conf’.  For example
try the commands below:

     $ mkdir my-cosmology2
     $ cd my-cosmology2
     $ astcosmiccal -P
     $ astcosmiccal --H0 70 --olambda=0.7 --omatter=0.3 --setdirconf
     $ astcosmiccal -P
     $ cd ..

   Gnuastro’s programs also have default configuration files for a
specific user (when run in any directory).  This allows you to set a
special behavior every time a program is run by a specific user.  Only
the directory and filename differ from the above, the rest of the
process is similar to before.  Finally, there are also system-wide
configuration files that can be used to define the option values for all
users on a system.  See *note Configuration file precedence:: for a more
detailed discussion.

   We’ll stop the discussion on configuration files here, but you can
always read about them in *note Configuration files::.  Before
continuing the tutorial, let’s delete the two extra directories that we
don’t need any more:

     $ rm -rf my-cosmology*


File: gnuastro.info,  Node: Warping to a new pixel grid,  Next: NoiseChisel and Multiextension FITS files,  Prev: Option management and configuration files,  Up: General program usage tutorial

2.2.9 Warping to a new pixel grid
---------------------------------

We are now ready to start processing the downloaded images.  The XDF
datasets we are using here are already aligned to the same pixel grid.
However, warping to a different/matched pixel grid is commonly needed
before higher-level analysis when you are using datasets from different
instruments.  So let’s have a look at Gnuastro’s warping features
here.

   Gnuastro’s Warp program should be used for warping the pixel-grid
(see *note Warp::).  For example, try rotating one of the images by 20
degrees:

     $ astwarp flat-ir/xdf-f160w.fits --rotate=20

Open the output (‘xdf-f160w_rotated.fits’) and see how it is rotated.
If you want your final image to be aligned with RA and Dec, you can simply
use the ‘--align’ option and let Warp calculate the necessary rotation
and apply it.  For example, try aligning the rotated image back to the
standard orientation (just note that because of the two rotations, the
NaN parts of the image are larger now):

     $ astwarp xdf-f160w_rotated.fits --align

   Warp can generally be used for many kinds of pixel grid manipulation
(warping), not just rotations.  For example the outputs of the commands
below will respectively have larger pixels (new resolution being one
quarter the original resolution), get shifted by 2.8 pixels (a sub-pixel
shift), get a shear of 0.2, and be tilted (projected).  Run each of them
and open the output file to see the effect; they will come in handy for
you in the future.

     $ astwarp flat-ir/xdf-f160w.fits --scale=0.25
     $ astwarp flat-ir/xdf-f160w.fits --translate=2.8
     $ astwarp flat-ir/xdf-f160w.fits --shear=0.2
     $ astwarp flat-ir/xdf-f160w.fits --project=0.001,0.0005

If you need to do multiple warps, you can combine them in one call to
Warp.  For example to first rotate the image, then scale it, run this
command:

     $ astwarp flat-ir/xdf-f160w.fits --rotate=20 --scale=0.25

   If you have multiple warps, do them all in one command.  Don’t warp
them in separate commands because the correlated noise will become too
strong.  As you see in the matrix that is printed when you run Warp, it
merges all the warps into a single warping matrix (see *note Merging
multiple warpings::) and simply applies that (mixes the pixel values)
just once.  However, if you run Warp multiple times, the pixels will be
mixed multiple times, creating a strong artificial blur/smoothing, or
stronger correlated noise.

   Recall that the merging of multiple warps is done through matrix
multiplication, therefore the order of the separate operations matters.  At
a lower level, through Warp’s ‘--matrix’ option, you can directly
request your desired final warp and don’t have to break it up into
different warps like above (see *note Invoking astwarp::).
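
   That the order of matrix multiplication matters can be checked with
two small matrices in AWK.  The sketch below uses generic 2x2 matrices
(written row by row) for a scaling of only the first axis and a shear;
they are hypothetical examples and not necessarily the matrices, or the
multiplication order, that Warp uses internally.  The function name
‘matmul’ is also made up for this illustration.

```shell
# Multiply two 2x2 matrices; each is given row by row as "a b c d".
matmul() {
  echo "$1 $2" | awk '{ printf "%g %g %g %g\n",
      $1*$5 + $2*$7, $1*$6 + $2*$8,
      $3*$5 + $4*$7, $3*$6 + $4*$8 }'
}

scale="0.25 0 0 1"     # Hypothetical: shrink only the first axis.
shear="1 0.2 0 1"      # Hypothetical: shear of 0.2 on the first axis.

matmul "$scale" "$shear"
matmul "$shear" "$scale"
```

The two printed matrices differ in their second element (0.05 in the
first, 0.2 in the second), so applying the same two operations in a
different order gives a different final warp.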

   Fortunately these datasets are already aligned to the same pixel
grid, so you don’t actually need the files that were just generated.
You can safely delete them all with the following command.  Here, you see
why we put the processed outputs that we need later into a separate
directory.  In this way, the top directory can be used for temporary
files for testing that you can simply delete with a generic command like
below.

     $ rm *.fits


File: gnuastro.info,  Node: NoiseChisel and Multiextension FITS files,  Next: NoiseChisel optimization for detection,  Prev: Warping to a new pixel grid,  Up: General program usage tutorial

2.2.10 NoiseChisel and Multiextension FITS files
------------------------------------------------

Having completed a review of the basics in the previous sections, we are
now ready to separate the signal (galaxies or stars) from the background
noise in the image.  We will be using the results of *note Dataset
inspection and cropping::, so be sure you already have them.  Gnuastro
has NoiseChisel for this job.  But NoiseChisel’s output is a
multi-extension FITS file, therefore to better understand how to use
NoiseChisel, let’s take a look at multi-extension FITS files and how you
can interact with them.

   In the FITS format, each extension contains a separate dataset (image
in this case).  You can get basic information about the extensions in a
FITS file with Gnuastro’s Fits program (see *note Fits::).  To start
with, let’s run NoiseChisel without any options, then use Gnuastro’s
Fits program to inspect the number of extensions in this file.

     $ astnoisechisel flat-ir/xdf-f160w.fits
     $ astfits xdf-f160w_detected.fits

   From the output list, we see that NoiseChisel’s output contains 5
extensions and the first (counting from zero, with name
‘NOISECHISEL-CONFIG’) is empty: it has a value of ‘0’ in the last column
(which shows its size).  The first extension in all the outputs of
Gnuastro’s programs only contains meta-data: data about/describing the
datasets within (all) the output’s extensions.  This is recommended by
the FITS standard, see *note Fits:: for more.  In the case of Gnuastro’s
programs, this generic zero-th/meta-data extension (for the whole file)
contains all the configuration options of the program that created the
file.

   The second extension of NoiseChisel’s output (numbered 1, named
‘INPUT-NO-SKY’) is the Sky-subtracted input that you provided.  The
third (‘DETECTIONS’) is NoiseChisel’s main output which is a binary
image with only two possible values for all pixels: 0 for noise and 1
for signal.  Since it only has two values, to avoid taking too much
space on your computer, its numeric datatype is an unsigned 8-bit
integer (or ‘uint8’)(1).  The fourth and fifth (‘SKY’ and ‘SKY_STD’)
extensions have the Sky and its standard deviation values for the input
on a tile grid and were calculated over the undetected regions (for more
on the importance of the Sky value, see *note Sky value::).

   Metadata regarding how the analysis was done (or a dataset was
created) is very important for higher-level analysis and
reproducibility.  Therefore, let’s first take a closer look at the
‘NOISECHISEL-CONFIG’ extension.  If you specify a particular extension
of the FITS file, Gnuastro’s Fits program will print the header keywords
(metadata) of that extension.  You can either specify the HDU/extension
counter (starting from 0), or name.  Therefore, the two commands below
are equivalent for this file:

     $ astfits xdf-f160w_detected.fits -h0
     $ astfits xdf-f160w_detected.fits -hNOISECHISEL-CONFIG

   The first group of FITS header keywords are standard keywords
(containing the ‘SIMPLE’ and ‘BITPIX’ keywords, before the first empty
line).  They are required by the FITS standard and must be present in
any FITS extension.  The second group contains the input file and all
the options with their values in that run of NoiseChisel.  Finally, the
last group contains the date and version information of Gnuastro and its
dependencies.  The “versions and date” group of keywords is present in
all Gnuastro’s FITS extension outputs; for more, see *note Output FITS
files::.

   Note that if a keyword name is longer than 8 characters, it is
preceded by a ‘HIERARCH’ keyword and that all keyword names are in
capital letters.  Therefore, if you want to see only one keyword’s value
by feeding the output to Grep, you should ask Grep to ignore case with
its ‘-i’ option (short name for ‘--ignore-case’).  For example, below
we’ll check the value of the ‘--snminarea’ option.  Note how we don’t
need Grep’s ‘-i’ option when it is fed with ‘astnoisechisel -P’ since it
is already in lower-case there.  The extra white spaces in the first
command are only to help in readability, you can ignore them when
typing.

     $ astnoisechisel -P                   | grep    snminarea
     $ astfits xdf-f160w_detected.fits -h0 | grep -i snminarea

The metadata (that is stored in the output) can later be used to exactly
reproduce/understand your result, even if you have lost/forgotten the
command you used to create the file.  This feature is present in all of
Gnuastro’s programs, not just NoiseChisel.
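
   The effect of Grep’s ‘-i’ option can be checked without any FITS
file.  The sketch below is only an illustration: the header line in
‘hdr’ is made up for this example (it is not real output of ‘astfits’),
and the two ‘n_...’ variables simply count the matching lines.

```shell
# A made-up header line (hypothetical, not real astfits output):
# the keyword name is in capital letters, as in a FITS header.
hdr='HIERARCH SNMINAREA = 10 / mock line for this sketch'

# Case-sensitive search for the lower-case option name.
# 'grep -c' returns non-zero when nothing matches, hence '|| true'.
n_cs=$(echo "$hdr" | grep -c snminarea || true)

# Case-insensitive search with -i.
n_ci=$(echo "$hdr" | grep -ci snminarea || true)

echo "without -i: $n_cs matching line(s); with -i: $n_ci"
```

Without ‘-i’, the lower-case search string does not match the
upper-case keyword name.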

   Let’s continue with the extensions in NoiseChisel’s output that
contain a dataset by visually inspecting them (here, we’ll use SAO DS9).
Since the file contains multiple related extensions, the easiest way to
view all of them in DS9 is to open the file as a “Multi-extension data
cube” with the ‘-mecube’ option as shown below(2).

     $ ds9 -mecube xdf-f160w_detected.fits -zscale -zoom to fit

   A “cube” window opens along with DS9’s main window.  The buttons and
horizontal scroll bar in this small new window can be used to navigate
between the extensions.  In this mode, all DS9’s settings (for example
zoom or color-bar) will be identical between the extensions.  Try
zooming in to one part and flipping through the extensions to see how
the galaxies were detected along with the Sky and Sky standard deviation
values for that region.  Just keep in mind that NoiseChisel’s job is
_only_ detection (separating signal from noise).  We’ll do segmentation
on this result later to find the individual galaxies/peaks over the
detected pixels.

   Each HDU/extension in a FITS file is an independent dataset (image or
table) which you can delete from the FITS file, or copy/cut to another
file.  For example, with the command below, you can copy NoiseChisel’s
‘DETECTIONS’ HDU/extension to another file:

     $ astfits xdf-f160w_detected.fits --copy=DETECTIONS -odetections.fits

   There are similar options to conveniently cut (‘--cut’, copy, then
remove from the input) or delete (‘--remove’) HDUs from a FITS file
also.  See *note HDU information and manipulation:: for more.

   ---------- Footnotes ----------

   (1) To learn more about numeric data types see *note Numeric data
types::.

   (2) You can configure your graphical user interface to open DS9 in
multi-extension cube mode by default when using the GUI (double clicking
on the file).  If your graphical user interface is GNOME (another GNU
software, it is most common in GNU/Linux operating systems), a full
description is given in *note Viewing multiextension FITS images::.


File: gnuastro.info,  Node: NoiseChisel optimization for detection,  Next: NoiseChisel optimization for storage,  Prev: NoiseChisel and Multiextension FITS files,  Up: General program usage tutorial

2.2.11 NoiseChisel optimization for detection
---------------------------------------------

In *note NoiseChisel and Multiextension FITS files::, we ran NoiseChisel
and reviewed NoiseChisel’s output format.  Now that you have a better
feeling for multi-extension FITS files, let’s optimize NoiseChisel for
this particular dataset.

   One good way to see if you have missed any signal (small galaxies, or
the wings of brighter galaxies) is to mask all the detected pixels and
inspect the noise pixels.  For this, you can use Gnuastro’s Arithmetic
program (in particular its ‘where’ operator, see *note Arithmetic
operators::).  The command below will produce ‘mask-det.fits’.  In it,
all the pixels in the ‘INPUT-NO-SKY’ extension that are flagged 1 in the
‘DETECTIONS’ extension (dominated by signal, not noise) will be set to
NaN.

   Since the various extensions are in the same file, for each dataset
we need the file and extension name.  To make the command easier to
read/write/understand, let’s use shell variables: ‘‘in’’ will be used
for the Sky-subtracted input image and ‘‘det’’ will be used for the
detection map.  Recall that a shell variable’s value can be retrieved by
adding a ‘$’ before its name.  Also note that the double quotations are
necessary when we have white-space characters in a variable’s value
(like this case).

     $ in="xdf-f160w_detected.fits -hINPUT-NO-SKY"
     $ det="xdf-f160w_detected.fits -hDETECTIONS"
     $ astarithmetic $in $det nan where --output=mask-det.fits

To invert the result (only keep the detected pixels), you can flip the
detection map (from 0 to 1 and vice-versa) by adding a ‘‘not’’ after
‘$det’:

     $ astarithmetic $in $det not nan where --output=mask-sky.fits
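
   To see how the shell treats the white-space inside such a variable,
here is a minimal sketch that needs no FITS file.  The ‘count_args’
function is a helper defined only for this illustration; it prints how
many arguments it received.

```shell
# Same value as in the tutorial: a file name and an option in one string.
in="xdf-f160w_detected.fits -hINPUT-NO-SKY"

# Hypothetical helper: print the number of arguments received.
count_args() { echo $#; }

# Unquoted: the shell splits the value on the white-space.
unquoted=$(count_args $in)

# Quoted: the whole value is passed as a single argument.
quoted=$(count_args "$in")

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
```

Leaving ‘$in’ unquoted in the Arithmetic commands above therefore passes
the file name and the ‘-h’ option as two separate arguments.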

   Look again at the ‘DETECTIONS’ extension, in particular the long
worm-like structure around (1) pixel 1650 (X) and 1470 (Y). These types
of long wiggly structures show that we have dug too deep into the noise,
and are a signature of correlated noise.  Correlated noise is created
when we warp (for example rotate) individual exposures (that are each
slightly offset compared to each other) into the same pixel grid before
adding them into one deeper image.  During the warping, nearby pixels
are mixed and the effect of this mixing on the noise (which is in every
pixel) is called “correlated noise”.  Correlated noise is a form of
convolution and it slightly smooths the image.

   In terms of the number of exposures (and thus correlated noise), the
XDF dataset is by no means an ordinary dataset.  It is the result of
warping and adding roughly 80 separate exposures which can create strong
correlated noise/smoothing.  In common surveys the number of exposures
is usually 10 or less.  Therefore the default parameters need to be
slightly customized.  See Figure 2 of Akhlaghi [2019]
(https://arxiv.org/abs/1909.11230) and the discussion on
‘--detgrowquant’ there for more on how NoiseChisel “grow”s the detected
objects and the patterns caused by correlated noise.

   Let’s tweak NoiseChisel’s configuration a little to get a better
result on this dataset.  Don’t forget that “_Good statistical analysis
is not a purely routine matter, and generally calls for more than one
pass through the computer_” (Anscombe 1973, see *note Science and its
tools::).  A good scientist must have a good understanding of her tools
to make a meaningful analysis.  So don’t hesitate to play with the
default configuration and review the manual when you have a new
dataset (from a new instrument) in front of you.  Robust data analysis
is an art, therefore a good scientist must first be a good artist.  Once
you have found a good configuration for that particular noise pattern
(instrument) you can safely use it for all new data that have a similar
noise pattern.

   NoiseChisel can produce “Check images” to help you visualize and
inspect how each step is done.  You can see all the check images it can
produce with this command.

     $ astnoisechisel --help | grep check

   Let’s check the overall detection process to get a better feeling of
what NoiseChisel is doing with the following command.  To learn about
NoiseChisel in more detail, please see *note NoiseChisel::,
Akhlaghi and Ichikawa [2015] (https://arxiv.org/abs/1505.01664) and
Akhlaghi [2019] (https://arxiv.org/abs/1909.11230).

     $ astnoisechisel flat-ir/xdf-f160w.fits --checkdetection

   The check images/tables are also multi-extension FITS files.  As you
saw from the command above, when check datasets are requested,
NoiseChisel won’t go to the end.  It will abort as soon as all the
extensions of the check image are ready.  Please list the extensions of
the output with ‘astfits’ and then open it with ‘ds9’ as we did
above.  If you have read the paper, you will see why there are so many
extensions in the check image.

     $ astfits xdf-f160w_detcheck.fits
     $ ds9 -mecube xdf-f160w_detcheck.fits -zscale -zoom to fit

   In order to understand the parameters and their biases (especially as
you are starting to use Gnuastro, or running it on a new dataset), it is
_strongly_ encouraged to play with the different parameters and use the
respective check images to see which step is affected by your changes
and how.  For example, see *note Detecting large extended targets::.

   Let’s focus on one step: the ‘OPENED_AND_LABELED’ extension shows the
initial detection step of NoiseChisel.  We see the seeds of that
correlated noise structure with many small detections (a relatively
early stage in the processing).  Such connections at the lowest surface
brightness limits usually occur when the dataset is too smoothed, the
threshold is too low, or the final “growth” is too much.

   As you see from the 2nd (‘CONVOLVED’) extension, the first operation
that NoiseChisel does on the data is to slightly smooth it.  However,
the natural correlated noise of this dataset is already one level of
artificial smoothing, so further smoothing it with the default kernel
may be the culprit.  To see the effect, let’s use a sharper kernel as a
first step to convolve/smooth the input.

   By default NoiseChisel uses a Gaussian with full-width-half-maximum
(FWHM) of 2 pixels.  We can use Gnuastro’s MakeProfiles to build a
kernel with FWHM of 1.5 pixels (truncated at 5 times the FWHM, like the
default) using the following command.  MakeProfiles is a powerful tool
to build any number of mock profiles on one image or independently.  To
learn more of its features and capabilities, see *note MakeProfiles::.

     $ astmkprof --kernel=gaussian,1.5,5 --oversample=1

Please open the output ‘kernel.fits’ and have a look (it is very small
and sharp).  We can now tell NoiseChisel to use this instead of the
default kernel with the following command (we’ll keep the
‘--checkdetection’ to continue checking the detection steps):

     $ astnoisechisel flat-ir/xdf-f160w.fits --kernel=kernel.fits  \
                      --checkdetection

   Open the output ‘xdf-f160w_detcheck.fits’ as a multi-extension FITS
file and go to the last extension (‘DETECTIONS-FINAL’, it is the same
pixels as the final NoiseChisel output without ‘--checkdetection’).
Look again at the position mentioned above (1650,1470): you will see
that the long wiggly structure is gone.  This shows we are making
progress :-).

   Looking at the new ‘OPENED_AND_LABELED’ extension, we see that the
thin connections between smaller peaks have now significantly decreased.
Going two extensions/steps ahead (in the first ‘HOLES-FILLED’), you can
see that during the process of finding false pseudo-detections, too many
holes have been filled: do you see how many of the brighter galaxies
are connected?  At this stage all holes are filled, irrespective of
their size.

   Try looking two extensions ahead (in the first ‘PSEUDOS-FOR-SN’): you
can see that there aren’t too many pseudo-detections because of all
those extended filled holes.  If you look closely, you can see the
number of pseudo-detections in the printed outputs of NoiseChisel
(around 6400).  This is another side-effect of correlated noise.  To
address it, we should slightly increase the pseudo-detection threshold
(before changing ‘--dthresh’, run with ‘-P’ to see the default value):

     $ astnoisechisel flat-ir/xdf-f160w.fits --kernel=kernel.fits \
                      --dthresh=0.1 --checkdetection

   Before visually inspecting the check image, you can already see the
effect of this small change in NoiseChisel’s command-line output: notice
how the number of pseudo-detections has increased to more than 7100!
Open the check image now and have a look, you can see how the
pseudo-detections are distributed much more evenly in the blank sky
regions of the ‘PSEUDOS-FOR-SN’ extension.

*Maximize the number of pseudo-detections:* When using NoiseChisel on
datasets with a new noise-pattern (for example going to a Radio
astronomy image, or a shallow ground-based image), play with ‘--dthresh’
until you get a maximal number of pseudo-detections: the total number of
pseudo-detections is printed on the command-line when you run
NoiseChisel, you don’t even need to open a FITS viewer.

   In this particular case, try ‘--dthresh=0.2’ and you will see that
the total printed number decreases to around 6700 (recall that with
‘--dthresh=0.1’, it was roughly 7100).  So for this type of very deep
HST images, we should set ‘--dthresh=0.1’.

   As discussed in Section 3.1.5 of Akhlaghi and Ichikawa [2015]
(https://arxiv.org/abs/1505.01664), the signal-to-noise ratios of
pseudo-detections are critical to identifying/removing false detections.
Getting them right is very important for an optimal detection (where you
want to detect the faintest and smallest objects in the image
successfully).  Let’s have a look at their signal-to-noise distribution
with ‘--checksn’.

     $ astnoisechisel flat-ir/xdf-f160w.fits --kernel=kernel.fits  \
                      --dthresh=0.1 --checkdetection --checksn

   The output (‘xdf-f160w_detsn.fits’) contains two extensions, each
holding a two-column table for the pseudo-detections: one for those over
the undetected regions (‘SKY_PSEUDODET_SN’) and one for those over
detections (‘DET_PSEUDODET_SN’).  With the first command below you can
see the HDUs of this file, and with the second you can see the
information of the table in the first HDU (which is the default when you
don’t use ‘--hdu’):

     $ astfits xdf-f160w_detsn.fits
     $ asttable xdf-f160w_detsn.fits -i

You can see the table columns with the first command below and get a
feeling of the signal-to-noise value distribution with the second
command (the Table and Statistics programs will be discussed later
in the tutorial):

     $ asttable xdf-f160w_detsn.fits -hSKY_PSEUDODET_SN
     $ aststatistics xdf-f160w_detsn.fits -hSKY_PSEUDODET_SN -c2
     ... [output truncated] ...
     Histogram:
      |           *
      |          ***
      |         ******
      |        *********
      |        **********
      |       *************
      |      *****************
      |     ********************
      |    **************************
      |   ********************************
      |*******************************************************   * **       *
      |----------------------------------------------------------------------

   The correlated noise is again visible in the signal-to-noise
distribution of sky pseudo-detections!  Do you see how skewed this
distribution is?  In an image with less correlated noise, this
distribution would be much more symmetric.  A small change in the
quantile will translate into a big change in the S/N value.  For example
see the difference between the three 0.99, 0.95 and 0.90 quantiles with
this command:

     $ aststatistics xdf-f160w_detsn.fits -hSKY_PSEUDODET_SN -c2      \
                     --quantile=0.99 --quantile=0.95 --quantile=0.90

   We get a change of almost 2 units (which is very significant).  If
you run NoiseChisel with ‘-P’, you’ll see the default signal-to-noise
quantile ‘--snquant’ is 0.99.  In effect with this option you specify
the purity level you want (contamination by false detections).  With the
‘aststatistics’ command above, you see that a small number of extra
false detections (impurity) in the final result causes a big change in
completeness (you can detect more lower signal-to-noise true
detections).  So let’s loosen-up our desired purity level, remove the
check-image options, and then mask the detected pixels like before to
see if we have missed anything.

     $ astnoisechisel flat-ir/xdf-f160w.fits --kernel=kernel.fits  \
                      --dthresh=0.1 --snquant=0.95
     $ in="xdf-f160w_detected.fits -hINPUT-NO-SKY"
     $ det="xdf-f160w_detected.fits -hDETECTIONS"
     $ astarithmetic $in $det nan where --output=mask-det.fits

   Overall it seems good, but if you play a little with the color-bar
and look closer in the noise, you’ll see a few very sharp, but faint,
objects that have not been detected.  For example the object around
pixel (456, 1662).  Despite its high valued pixels, this object was lost
because erosion ignores the precise pixel values.  Losing small/sharp
objects like this only happens for under-sampled datasets like HST
(where the pixel size is larger than the point spread function FWHM). So
this won’t happen on ground-based images.

   To address this problem of sharp objects, we can use NoiseChisel’s
‘--noerodequant’ option.  All pixels above this quantile will not be
eroded, thus allowing us to preserve small/sharp objects (that cover a
small area, but have a lot of signal in it).  Check its default value,
then run NoiseChisel like below and make the mask again.

     $ astnoisechisel flat-ir/xdf-f160w.fits --kernel=kernel.fits     \
                      --noerodequant=0.95 --dthresh=0.1 --snquant=0.95

   This seems to be fine and the object above is now detected.  We’ll
stop the configuration here, but please feel free to keep looking into
the data to see if you can improve it even more.

   Once you have found the proper customization for the type of images
you will be using, you don’t need to change it any more.  The same
configuration can be used for any dataset that has been similarly
produced (and has a similar noise pattern).  But entering all these
options on every call to NoiseChisel is annoying and prone to bugs
(mistakenly typing the wrong value for example).  To simplify things,
we’ll make a configuration file in a visible ‘config’ directory.  Then
we’ll define the hidden ‘.gnuastro’ directory (that all Gnuastro’s
programs will look into for configuration files) as a symbolic link to
the ‘config’ directory.  Finally, we’ll write the finalized values of
the options into NoiseChisel’s standard configuration file within that
directory.  We’ll also put the kernel in a separate directory to keep
the top directory clean of any files we later need.

     $ mkdir kernel config
     $ ln -s config/ .gnuastro
     $ mv kernel.fits kernel/noisechisel.fits
     $ echo "kernel kernel/noisechisel.fits" > config/astnoisechisel.conf
     $ echo "noerodequant 0.95"             >> config/astnoisechisel.conf
     $ echo "dthresh      0.1"              >> config/astnoisechisel.conf
     $ echo "snquant      0.95"             >> config/astnoisechisel.conf
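
   As an alternative to the four ‘echo’ commands, the same file can be
written in one step with a shell here-document.  This is only a sketch
of an equivalent way to create the file; it assumes you are in the same
top directory and writes the same four option names and values.

```shell
# Make sure the directory exists (harmless if it already does).
mkdir -p config

# Write all four options in one command; this overwrites the file.
cat > config/astnoisechisel.conf <<'EOF'
kernel       kernel/noisechisel.fits
noerodequant 0.95
dthresh      0.1
snquant      0.95
EOF

# Show the result for a visual check.
cat config/astnoisechisel.conf
```

Quoting the ‘EOF’ delimiter stops the shell from expanding anything
inside the here-document, so the lines are written exactly as typed.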

We are now ready to finally run NoiseChisel on the three filters and
keep the output in a dedicated directory (which we’ll call ‘nc’ for
simplicity).

     $ rm *.fits
     $ mkdir nc
     $ for f in f105w f125w f160w; do \
         astnoisechisel flat-ir/xdf-$f.fits --output=nc/xdf-$f.fits; \
       done

   ---------- Footnotes ----------

   (1) To find a particular coordinate easily in DS9, you can do this:
Click on the “Edit” menu, and select “Region”.  Then click on any random
part of the image to see a circle show up in that location (this is the
“region”).  Double-click on the region and a “Circle” window will open.
If you have celestial coordinates, keep the default “fk5” in the
scroll-down menu after the “Center”.  But if you have pixel/image
coordinates, click on the “fk5” and select “Image”.  Now you can set the
“Center” coordinates of the region (‘1650’ and ‘1470’ in this case) by
manually typing them in the two boxes in front of “Center”.  Finally,
when everything is ready, click on the “Apply” button and your region
will go over your requested coordinates.  You can zoom out (to see the
whole image) and visually find it.


File: gnuastro.info,  Node: NoiseChisel optimization for storage,  Next: Segmentation and making a catalog,  Prev: NoiseChisel optimization for detection,  Up: General program usage tutorial

2.2.12 NoiseChisel optimization for storage
-------------------------------------------

As we showed before (in *note NoiseChisel and Multiextension FITS
files::), NoiseChisel’s output is a multi-extension FITS file with
several images the same size as the input.  As the input datasets get
larger this output can become hard to manage and waste a lot of storage
space.  Fortunately there is a solution to this problem (which is also
useful for Segment’s outputs).

   In this small section we’ll take a short detour to show this feature.
Please note that the outputs generated here are not needed for the rest
of the tutorial.  But first, let’s have a look at the contents/HDUs and
volume of NoiseChisel’s output from *note NoiseChisel optimization for
detection:: (fast answer, it’s larger than 100 mega-bytes):

     $ astfits nc/xdf-f160w.fits
     $ ls -lh nc/xdf-f160w.fits

   Two options can drastically decrease NoiseChisel’s output file size:
1) With the ‘--rawoutput’ option, NoiseChisel won’t create a
Sky-subtracted input.  After all, it is redundant: you can always
generate it by subtracting the ‘SKY’ extension from the input image
(which you have in your database) using the Arithmetic program.  2) With
the ‘--oneelempertile’ option, you can tell NoiseChisel to store its Sky
and Sky standard deviation results with one pixel per tile (instead of
many pixels per tile).  So let’s run NoiseChisel with these options,
then have another look at the HDUs and the over-all file size:

     $ astnoisechisel flat-ir/xdf-f160w.fits --oneelempertile --rawoutput \
                      --output=nc-for-storage.fits
     $ astfits nc-for-storage.fits
     $ ls -lh nc-for-storage.fits

See how ‘nc-for-storage.fits’ has four HDUs, while ‘nc/xdf-f160w.fits’
had five HDUs?  As explained above, the missing extension is
‘INPUT-NO-SKY’.  Also, look at the sizes of the ‘SKY’ and ‘SKY_STD’
HDUs, unlike before, they aren’t the same size as ‘DETECTIONS’, they
only have one pixel for each tile (group of pixels in raw input).
Finally, you see that ‘nc-for-storage.fits’ is just under 8 mega-bytes
(while ‘nc/xdf-f160w.fits’ was 100 mega-bytes)!

   But we are not finished!  You can be even more efficient in
storage, archival or transferring NoiseChisel’s output by compressing
this file.  Try the command below to see how NoiseChisel’s output has
now shrunk to about 250 kilo-bytes while keeping all the necessary
information of the original 100 mega-byte output.

     $ gzip --best nc-for-storage.fits
     $ ls -lh nc-for-storage.fits.gz

   We can get this wonderful level of compression because NoiseChisel’s
output is binary with only two values: 0 and 1.  Compression algorithms
are highly optimized in such scenarios.
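
   You can get a feeling for this effect without any FITS file.  The
toy sketch below is not NoiseChisel output: it assumes a file of one
million zero-valued bytes as an extreme stand-in for a map that has very
few distinct values, compresses it with ‘gzip’, and compares the two
sizes.  The file names and the ‘raw’ and ‘zip’ variables are only used
in this illustration.

```shell
# Work in a temporary directory to keep the top directory clean.
tmp=$(mktemp -d)

# One million zero-valued bytes (a toy stand-in, not a FITS file).
head -c 1000000 /dev/zero > "$tmp/binary.dat"

# Compress to a separate file so both sizes can be compared.
gzip --best -c "$tmp/binary.dat" > "$tmp/binary.dat.gz"

raw=$(wc -c < "$tmp/binary.dat")
zip=$(wc -c < "$tmp/binary.dat.gz")
echo "raw: $raw bytes, compressed: $zip bytes"

# Remove the temporary files.
rm -r "$tmp"
```

A real detection map has both 0 and 1 valued pixels, so it will not
compress quite as much as this all-zero toy file.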

   You can open ‘nc-for-storage.fits.gz’ directly in SAO DS9 or feed it
to any of Gnuastro’s programs without having to decompress it.
Higher-level programs that take NoiseChisel’s output (for example
Segment or MakeCatalog) can also deal with this compressed image where
the Sky and its Standard deviation are one pixel-per-tile.  You just
have to give the “values” image as a separate option, for more, see
*note Segment:: and *note MakeCatalog::.

   Segment (the program we will introduce in the next section for
identifying sub-structure), also has similar features to optimize its
output for storage.  Since this file was only created for a fast detour
demonstration, let’s keep our top directory clean and move to the next
step:

     $ rm nc-for-storage.fits.gz


File: gnuastro.info,  Node: Segmentation and making a catalog,  Next: Working with catalogs estimating colors,  Prev: NoiseChisel optimization for storage,  Up: General program usage tutorial

2.2.13 Segmentation and making a catalog
----------------------------------------

The main output of NoiseChisel is the binary detection map (‘DETECTIONS’
extension, see *note NoiseChisel optimization for detection::), which
only has two values of 1 or 0.  This is useful when studying the noise
or background properties, but hardly of any use when you actually want
to study the targets/galaxies in the image, especially in such a deep
field where almost everything is connected.  To find the galaxies over
the detections, we’ll use Gnuastro’s *note Segment:: program:

     $ mkdir seg
     $ astsegment nc/xdf-f160w.fits -oseg/xdf-f160w.fits
     $ astsegment nc/xdf-f125w.fits -oseg/xdf-f125w.fits
     $ astsegment nc/xdf-f105w.fits -oseg/xdf-f105w.fits
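
   The three calls above only differ in the filter name, so they can
also be written as a shell loop, like the NoiseChisel loop in the
previous section.  The sketch below wraps the loop in a function called
‘preview_segment’ (a name chosen only for this illustration) and uses
‘echo’ so that the commands are printed instead of being run.

```shell
# Hypothetical helper: print the Segment command for each filter.
# Remove the 'echo' to actually run Segment instead of printing.
preview_segment() {
    for f in f105w f125w f160w; do
        echo astsegment nc/xdf-$f.fits -oseg/xdf-$f.fits
    done
}

preview_segment
```

Printing the commands first is a simple way to check the file names
before running a loop over many files.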

   Segment’s operation is very much like NoiseChisel (in fact, prior to
version 0.6, it was part of NoiseChisel).  For example the output is a
multi-extension FITS file, it has check images and uses the undetected
regions as a reference.  Please have a look at Segment’s multi-extension
output with ‘ds9’ to get a good feeling of what it has done.

     $ ds9 -mecube seg/xdf-f160w.fits -zscale -zoom to fit

   Like NoiseChisel, the first extension is the input.  The ‘CLUMPS’
extension shows the true “clumps” with values that are ≥1, and the
diffuse regions labeled as -1.  Please flip between the first
extension and the clumps extension and zoom-in on some of the clumps to
get a feeling of what they are.  In the ‘OBJECTS’ extension, we see that
the large detections of NoiseChisel (that may have contained many
galaxies) are now broken up into separate labels.  Play with the
color-bar and hover your mouse over the various detections to see their
different labels.

   Since the clumps are not affected by the hard-to-deblend and low
signal-to-noise diffuse regions, they are more robust for calculating
the colors (compared to objects).  From this step onward, we’ll continue
with clumps.

   Having localized the regions of interest in the dataset, we are ready
to do measurements on them with *note MakeCatalog::.  MakeCatalog is
specialized and optimized for doing measurements over labeled regions of
an image.  In other words, through MakeCatalog, you can “reduce” an
image to a table (catalog of certain properties of objects in the
image).  Each requested measurement (over each label) will be given a
column in the output table.  To see the full set of available
measurements run it with ‘--help’ like below (and scroll up).  Note that
measurements are classified by context.

     $ astmkcatalog --help

   So let’s select the properties we want to measure in this tutorial.
First of all, we need to know which measurement belongs to which object
or clump, so we’ll start with the ‘--ids’ (read as: IDs(1)).  We also
want to measure (in this order) the Right Ascension (with ‘--ra’),
Declination (‘--dec’), magnitude (‘--magnitude’), and signal-to-noise
ratio (‘--sn’) of the objects and clumps.  Furthermore, as mentioned
above, we also want measurements on clumps, so we also need to call
‘--clumpscat’.  The following command will make these measurements on
Segment’s F160W output and write them in a catalog for each object and
clump in a FITS table.

     $ mkdir cat
     $ astmkcatalog seg/xdf-f160w.fits --ids --ra --dec --magnitude --sn \
                    --zeropoint=25.94 --clumpscat --output=cat/xdf-f160w.fits

From the printed statements on the command-line, you see that
MakeCatalog read all the extensions in Segment’s output for the various
measurements it needed.  To calculate colors, we also need magnitude
measurements on the other filters.  So let’s repeat the command above on
them, just changing the file names and zeropoint (which we got from the
XDF survey web page):

     $ astmkcatalog seg/xdf-f125w.fits --ids --ra --dec --magnitude --sn \
                    --zeropoint=26.23 --clumpscat --output=cat/xdf-f125w.fits

     $ astmkcatalog seg/xdf-f105w.fits --ids --ra --dec --magnitude --sn \
                    --zeropoint=26.27 --clumpscat --output=cat/xdf-f105w.fits

   However, the galaxy properties might differ between the filters
(which is the whole purpose behind observing in different filters!).
Also, the noise properties and depth of the datasets differ.  You can
see the effect of these factors in the resulting clump catalogs, with
Gnuastro’s Table program.  We’ll go deep into working with tables in the
next section, but in summary: the ‘-i’ option will print information
about the columns and number of rows.  To see the column values, just
remove the ‘-i’ option.  In the output of each command below, look at
the ‘Number of rows:’, and note that they are different.

     $ asttable cat/xdf-f105w.fits -hCLUMPS -i
     $ asttable cat/xdf-f125w.fits -hCLUMPS -i
     $ asttable cat/xdf-f160w.fits -hCLUMPS -i

   Matching the catalogs is possible (for example with *note Match::).
However, the measurements of each column are also done on different
pixels: the clump labels can/will differ from one filter to another for
one object.  Please open them and focus on one object to see for
yourself.  This can bias the result if you match catalogs.

   An accurate color calculation can only be done when magnitudes are
measured from the same pixels on all images and this can be done easily
with MakeCatalog.  In fact this is one of the reasons that NoiseChisel
or Segment don’t generate a catalog like most other
detection/segmentation software.  This gives you the freedom of
selecting the pixels for measurement in any way you like (from other
filters, other software, manually, etc.).  Fortunately in these
images, the Point spread function (PSF) is very similar, allowing us to
use a single labeled image output for all filters(2).

   The F160W image is deeper, thus providing better
detection/segmentation, and redder, thus observing smaller/older stars
and representing more of the mass in the galaxies.  We will thus use the
F160W filter as a reference and use its segment labels to identify which
pixels to use for which objects/clumps.  But we will do the measurements
on the sky-subtracted F105W and F125W images (using MakeCatalog’s
‘--valuesfile’ option) as shown below.  Notice that the only difference
between these calls and the call to generate the raw F160W catalog
(excluding the zero point and the output name) is the ‘--valuesfile’.

     $ astmkcatalog seg/xdf-f160w.fits --ids --ra --dec --magnitude --sn \
                    --valuesfile=nc/xdf-f125w.fits --zeropoint=26.23 \
                    --clumpscat --output=cat/xdf-f125w-on-f160w-lab.fits

     $ astmkcatalog seg/xdf-f160w.fits --ids --ra --dec --magnitude --sn \
                    --valuesfile=nc/xdf-f105w.fits --zeropoint=26.27 \
                    --clumpscat --output=cat/xdf-f105w-on-f160w-lab.fits

   After running the commands above, look into what MakeCatalog printed
on the command-line.  You can see that (as requested) the object and
clump pixel labels in both were taken from the respective extensions in
‘seg/xdf-f160w.fits’.  However, the pixel values and pixel Sky standard
deviation were respectively taken from ‘nc/xdf-f125w.fits’ and
‘nc/xdf-f105w.fits’.  Since we used the same labeled image on all
filters, the number of rows in both catalogs is now identical.  Let’s
have a look:

     $ asttable cat/xdf-f105w-on-f160w-lab.fits -hCLUMPS -i
     $ asttable cat/xdf-f125w-on-f160w-lab.fits -hCLUMPS -i
     $ asttable cat/xdf-f160w.fits -hCLUMPS -i

   Finally, MakeCatalog also does basic calculations on the full dataset
(independent of each labeled region but related to whole data), for
example pixel area or per-pixel surface brightness limit.  They are
stored as keywords in the FITS headers (or lines starting with ‘#’ in
plain text).  You can see them with this command (for more, see *note
Image surface brightness limit:: in the next tutorial):

     $ astfits cat/xdf-f160w.fits -h1

   ---------- Footnotes ----------

   (1) This option is plural because we need two ID columns for
identifying “clumps” in the clumps catalog/table: the first column will
be the ID of the host “object”, and the second one will be the ID of the
clump within that object.  In the “objects” catalog/table, only a single
column will be returned for this option.

   (2) When the PSFs between two images differ largely, you would have
to PSF-match the images before using the same pixels for measurements.


File: gnuastro.info,  Node: Working with catalogs estimating colors,  Next: Column statistics color-magnitude diagram,  Prev: Segmentation and making a catalog,  Up: General program usage tutorial

2.2.14 Working with catalogs (estimating colors)
------------------------------------------------

In the previous step we generated catalogs of objects and clumps over
our dataset (see *note Segmentation and making a catalog::).  The
catalogs are available in the two extensions of the single FITS file(1).
Let’s see the extensions and their basic properties with the Fits
program:

     $ astfits  cat/xdf-f160w.fits              # Extension information

   Let’s inspect the table in each extension with Gnuastro’s Table
program (see *note Table::).  We could have used ‘-hOBJECTS’ and
‘-hCLUMPS’ instead of ‘-h1’ and ‘-h2’ respectively.  The numbers are
just used here to convey that both names and numbers are possible; in
the next commands, we’ll just use names.

     $ asttable cat/xdf-f160w.fits -h1 --info   # Objects catalog info.
     $ asttable cat/xdf-f160w.fits -h1          # Objects catalog columns.
     $ asttable cat/xdf-f160w.fits -h2 -i       # Clumps catalog info.
     $ asttable cat/xdf-f160w.fits -h2          # Clumps catalog columns.

As you see above, when given a specific table (file name and extension),
Table will print the full contents of all the columns.  To see the basic
metadata about each column (for example name, units and comments),
simply append a ‘--info’ (or ‘-i’) to the command.

   To print the contents of specific column(s), just give the column
number(s) (counting from ‘1’) or the column name(s) (if they have one)
to the ‘--column’ (or ‘-c’) option.  For example, if you just want the
magnitude and signal-to-noise ratio of the clumps (in the clumps
catalog), you can get it with any of the following commands:

     $ asttable cat/xdf-f160w.fits -hCLUMPS --column=5,6
     $ asttable cat/xdf-f160w.fits -hCLUMPS -c5,SN
     $ asttable cat/xdf-f160w.fits -hCLUMPS -c5         -c6
     $ asttable cat/xdf-f160w.fits -hCLUMPS -cMAGNITUDE -cSN

Similar to HDUs, when the columns have names, always use the name: it is
so common to mis-write numbers or forget the order later!  Using column
names instead of numbers has many advantages:
  1. You don’t have to worry about the order of columns in the table.
  2. It acts as documentation in the script.
  3. Column meta-data (including a name) aren’t just limited to FITS
     tables and can also be used in plain text tables, see *note
     Gnuastro text table format::.

Table also has tools to limit the displayed rows.  For example with the
first command below only rows with a magnitude in the range of 28 to 29
will be shown.  With the second command, you can further limit the
displayed rows to rows with an S/N larger than 10 (a range from 10 to
infinity).  You can further sort the output rows, only show the top (or
bottom) N rows, etc.; for more see *note Table::.

     $ asttable cat/xdf-f160w.fits -hCLUMPS --range=MAGNITUDE,28:29
     $ asttable cat/xdf-f160w.fits -hCLUMPS \
                --range=MAGNITUDE,28:29 --range=SN,10:inf

   Now that you are comfortable in viewing table columns and rows, let’s
look into merging columns of multiple tables into one table (which is
necessary for measuring the color of the clumps).  Since
‘cat/xdf-f160w.fits’ and ‘cat/xdf-f125w-on-f160w-lab.fits’ have exactly
the same number of rows and the rows correspond to the same clump, let’s
merge them to have one table with magnitudes in both filters.

   We can merge columns with the ‘--catcolumnfile’ option like below.
You give this option a file name (which is assumed to be a table that
has the same number of rows as the main input), and all the table’s
columns will be concatenated/appended to the main table.  So please try
it out with the commands below.  We’ll first look at the metadata of the
first table (only the ‘CLUMPS’ extension).  With the second command,
we’ll concatenate the two tables and write them in ‘two-in-one.fits’,
and finally we’ll check the new catalog’s metadata.

     $ asttable cat/xdf-f160w.fits -i -hCLUMPS
     $ asttable cat/xdf-f160w.fits -hCLUMPS --output=two-in-one.fits \
                --catcolumnfile=cat/xdf-f125w-on-f160w-lab.fits \
                --catcolumnhdu=CLUMPS
     $ asttable two-in-one.fits -i

   By comparing the metadata of the two tables, we see that both have
the same number of rows.  But what might have attracted your attention
more is that ‘two-in-one.fits’ has double the number of columns (as
expected: after all, you merged both tables into one file, and didn’t
ask for any specific column).  In fact you can concatenate any number
of other tables in one command, for example:

     $ asttable cat/xdf-f160w.fits -hCLUMPS --output=three-in-one.fits \
                --catcolumnfile=cat/xdf-f125w-on-f160w-lab.fits \
                --catcolumnfile=cat/xdf-f105w-on-f160w-lab.fits \
                --catcolumnhdu=CLUMPS --catcolumnhdu=CLUMPS
     $ asttable three-in-one.fits -i

   As you see, to avoid confusion in column names, Table has
intentionally appended a ‘-1’ to the column names of the first
concatenated table (so for example we have the original ‘RA’ column, and
another one called ‘RA-1’).  Similarly a ‘-2’ has been added for the
columns of the second concatenated table.

   However, this example clearly shows a problem with this full
concatenation: some columns are identical (for example ‘HOST_OBJ_ID’ and
‘HOST_OBJ_ID-1’), or not needed (for example ‘RA-1’ and ‘DEC-1’ which
are not necessary here).  In such cases, you can use ‘--catcolumns’ to
only concatenate certain columns, not the whole table.  For example this
command:

     $ asttable cat/xdf-f160w.fits -hCLUMPS --output=two-in-one-2.fits \
                --catcolumnfile=cat/xdf-f125w-on-f160w-lab.fits \
                --catcolumnhdu=CLUMPS --catcolumns=MAGNITUDE
     $ asttable two-in-one-2.fits -i

   You see that we have now only appended the ‘MAGNITUDE’ column of
‘cat/xdf-f125w-on-f160w-lab.fits’.  This is what we needed to be able to
later subtract the magnitudes.  Let’s go ahead and add the F105W
magnitudes also with the command below.  Note how we need to call
‘--catcolumnhdu’ once for every table that should be appended, but we
only call ‘--catcolumns’ once (assuming all the tables that should be
appended have this column).

     $ asttable cat/xdf-f160w.fits -hCLUMPS --output=three-in-one-2.fits \
                --catcolumnfile=cat/xdf-f125w-on-f160w-lab.fits \
                --catcolumnfile=cat/xdf-f105w-on-f160w-lab.fits \
                --catcolumnhdu=CLUMPS --catcolumnhdu=CLUMPS \
                --catcolumns=MAGNITUDE
     $ asttable three-in-one-2.fits -i

   But we aren’t finished yet!  There is a very big problem: it’s not
immediately clear which one of the ‘MAGNITUDE’, ‘MAGNITUDE-1’ or
‘MAGNITUDE-2’ columns belongs to which filter!  Right now, you know this
because you just ran this command.  But in one hour, you’ll start
doubting yourself and will be forced to go through your command
history, trying to figure out if you added F105W first, or F125W.  You
should never torture your future-self (or your colleagues) like this!
So, let’s rename these confusing columns in the matched catalog.

   Fortunately, with the ‘--colmetadata’ option, you can correct the
column metadata of the final table (just before it is written).  It
takes four values: 1) the original column name or number, 2) the new
column name, 3) the column unit and 4) the column comments.  Since the
comments are usually human-friendly sentences and contain space
characters, you should put them in double quotations like below.  For
example by adding three calls of this option to the previous command, we
write the filter name in the magnitude column name and description.

     $ asttable cat/xdf-f160w.fits -hCLUMPS --output=three-in-one-3.fits \
             --catcolumnfile=cat/xdf-f125w-on-f160w-lab.fits \
             --catcolumnfile=cat/xdf-f105w-on-f160w-lab.fits \
             --catcolumnhdu=CLUMPS --catcolumnhdu=CLUMPS \
             --catcolumns=MAGNITUDE \
             --colmetadata=MAGNITUDE,MAG-F160W,log,"Magnitude in F160W." \
             --colmetadata=MAGNITUDE-1,MAG-F125W,log,"Magnitude in F125W." \
             --colmetadata=MAGNITUDE-2,MAG-F105W,log,"Magnitude in F105W."
     $ asttable three-in-one-3.fits -i

   We now have all three magnitudes in one table and can start doing
arithmetic on them (to estimate colors, which are just a subtraction of
magnitudes).  To use column arithmetic, simply call the column selection
option (‘--column’ or ‘-c’), put the value in single quotations and
start the value with ‘arith’ (followed by a space) like the example
below.  Column arithmetic uses the same “reverse polish notation” as the
Arithmetic program (see *note Reverse polish notation::), with almost
all the same operators (see *note Arithmetic operators::), and some
column-specific operators (that aren’t available for images).  In
column-arithmetic, you can identify columns by number (prefixed with a
‘$’) or name, for more see *note Column arithmetic::.
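
   As a rough illustration of how reverse polish notation is evaluated,
the sketch below implements a tiny stack-based evaluator in AWK. It is
not Gnuastro’s implementation: the ‘rpn’ function is a hypothetical
helper, it only knows four operators, and the magnitudes are made-up
values.  Like the Arithmetic program, it uses ‘x’ for multiplication.

```shell
# Hypothetical helper: evaluate a space-separated RPN expression.
rpn() {
    echo "$1" | awk '{
        n = 0                                  # Current stack height.
        for (i = 1; i <= NF; i++) {
            if ($i == "+" || $i == "-" || $i == "x" || $i == "/") {
                # An operator pops its two operands (second one first).
                b = stack[n--]; a = stack[n--]
                if      ($i == "+") stack[++n] = a + b
                else if ($i == "-") stack[++n] = a - b
                else if ($i == "x") stack[++n] = a * b
                else                stack[++n] = a / b
            } else {
                stack[++n] = $i                # An operand is pushed.
            }
        }
        print stack[n]                         # Result is on top.
    }'
}

# A color from two made-up magnitudes: first operand minus second.
rpn "27.5 26.0 -"
```

The operands come first and the operator last, so ‘27.5 26.0 -’ means
27.5 minus 26.0.  Swapping the two operands flips the sign of the
result.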

   So let’s estimate one color from ‘three-in-one-3.fits’ using column
arithmetic.  All the commands below will produce the same output, try
them each and focus on the differences.  Note that column arithmetic can
be mixed with other ways to choose output columns (the ‘-c’ option).

     $ asttable three-in-one-3.fits -ocolor-cat.fits \
                -c1,2,3,4,'arith $7 $5 -'

     $ asttable three-in-one-3.fits -ocolor-cat.fits \
                -c1,2,RA,DEC,'arith MAG-F125W MAG-F160W -'

     $ asttable three-in-one-3.fits -ocolor-cat.fits -c1,2 \
                -cRA,DEC --column='arith MAG-F125W MAG-F160W -'

   This example again highlights the important point on using column
names: if you haven’t seen the earlier commands, you have no way of
making sense of the first command: what is in columns 5 and 7?  Why not
subtract columns 3 and 4 from each other?  Do you see how cryptic the
first one is?  Then look at the last one: even if you have no idea how
this table was created, you immediately understand the desired
operation.  *When you have column names, please use them.*  If your
table doesn’t have column names, give them names with the
‘--colmetadata’ option (described above) as you are creating them.  But
how about the metadata for the column you just created with column
arithmetic?  Have a look at the column metadata of the table produced
above:

     $ asttable color-cat.fits -i

   The name of the column produced by column arithmetic is ‘ARITH_1’!
This is natural: Arithmetic has no idea what the modified column is!
You could have multiplied two columns, or done much more complex
transformations with many columns.  _Metadata can’t be set
automatically, your (the human) input is necessary._  To add metadata,
you can use ‘--colmetadata’ like before:

     $ asttable three-in-one-3.fits -ocolor-cat.fits -c1,2,RA,DEC \
              --column='arith MAG-F105W MAG-F160W -' \
              --colmetadata=ARITH_1,F105W-F160W,log,"Magnitude difference"
     $ asttable color-cat.fits -i

   We are now ready to make our final table.  We want it to have the
magnitudes in all three filters, as well as the three possible colors.
Recall that by convention in astronomy colors are defined by subtracting
the redder magnitude from the bluer magnitude.  In this way a larger
color value corresponds to a redder object.  So from the three
magnitudes, we can produce three colors (as shown below).  Also, because
this is the final table we are creating here and want to use it later,
we’ll store it in ‘cat/’ and we’ll also give it a clear name and use the
‘--range’ option to only print rows with a signal-to-noise ratio
(‘SN’ column, from the F160W filter) above 5.

     $ asttable three-in-one-3.fits --range=SN,5,inf -c1,2,RA,DEC,SN \
              -cMAG-F160W,MAG-F125W,MAG-F105W \
              -c'arith MAG-F125W MAG-F160W -' \
              -c'arith MAG-F105W MAG-F125W -' \
              -c'arith MAG-F105W MAG-F160W -' \
              --colmetadata=SN,SN-F160W,ratio,"F160W signal to noise ratio" \
              --colmetadata=ARITH_1,F125W-F160W,log,"Color F125W and F160W" \
              --colmetadata=ARITH_2,F105W-F125W,log,"Color F105W and F125W" \
              --colmetadata=ARITH_3,F105W-F160W,log,"Color F105W and F160W" \
              --output=cat/mags-with-color.fits
     $ asttable cat/mags-with-color.fits -i

   The table now has all the columns we need and it has the proper
metadata to let us safely use it later (without getting frustrated over
column orders!) or pass it to colleagues.
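
   To make the row selection and the three subtractions concrete, here
is a minimal AWK sketch of the same operations on a plain-text table.
The three rows and their values are hypothetical, and the column order
(ID, S/N, then the F160W, F125W and F105W magnitudes) is an assumption
of this sketch, not the layout of ‘cat/mags-with-color.fits’.

```shell
# Hypothetical rows: ID  SN  MAG-F160W  MAG-F125W  MAG-F105W
colors=$(printf '%s\n' \
    '1  12.0  25.0  25.5  26.5' \
    '2   3.0  28.0  28.1  28.3' \
    '3   8.0  26.0  26.2  26.1' |
  awk '$2 > 5 {
      # Bluer magnitude minus redder magnitude, for each filter pair:
      # F125W-F160W, F105W-F125W and F105W-F160W.
      printf "%d %.2f %.2f %.2f\n", $1, $4-$3, $5-$4, $5-$3
  }')
echo "$colors"
```

The second row is dropped because its S/N is below 5.  In the last row
the middle color is negative: that clump is brighter in F105W than in
F125W.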

   Let’s finish this section of the tutorial with a useful tip on
modifying column metadata.  Above, updating/changing column metadata was
done with the ‘--colmetadata’ in the same command that produced the
newly created Table file.  But in many situations, the table is already
made and you just want to update the metadata of one column.  In such
cases using ‘--colmetadata’ is over-kill (wasting CPU/RAM energy or time
if the table is large) because it will load the full table data and
metadata into memory, just change the metadata and write it back into a
file.

   In scenarios when the table’s data doesn’t need to be changed and you
just want to set or update the metadata, it is much more efficient to
use basic FITS keyword editing.  For example, in the FITS standard,
column names are stored in the ‘TTYPE’ header keywords, so let’s have a
look:

     $ asttable two-in-one.fits -i
     $ astfits two-in-one.fits -h1 | grep TTYPE

   Changing/updating the column names is as easy as updating the values
to these keywords.  You don’t need to touch the actual data!  With the
command below, we’ll just update the ‘MAGNITUDE’ and ‘MAGNITUDE-1’
columns (which are respectively stored in the ‘TTYPE5’ and ‘TTYPE11’
keywords) by modifying the keyword values and checking the effect by
listing the column metadata again:

     $ astfits two-in-one.fits -h1 \
               --update=TTYPE5,MAG-F160W \
               --update=TTYPE11,MAG-F125W
     $ asttable two-in-one.fits -i

   You can see that the column names have indeed been changed without
touching any of the data.  You can do the same for the column units or
comments by modifying the keywords starting with ‘TUNIT’ or ‘TCOMM’.

   Generally, Gnuastro’s Table is a very useful program in data analysis
and what you have seen so far is just the tip of the iceberg.  But to
avoid making the tutorial even longer, we’ll stop reviewing the features
here; for more, please see *note Table::.  Before continuing, let’s just
delete all the temporary FITS tables we placed in the top project
directory:

     $ rm *.fits

   ---------- Footnotes ----------

   (1) MakeCatalog can also output plain text tables.  However, in the
plain text format you can only have one table per file.  Therefore, if
you also request measurements on clumps, two plain text tables will be
created (suffixed with ‘_o.txt’ and ‘_c.txt’).


File: gnuastro.info,  Node: Column statistics color-magnitude diagram,  Next: Aperture photometry,  Prev: Working with catalogs estimating colors,  Up: General program usage tutorial

2.2.15 Column statistics (color-magnitude diagram)
--------------------------------------------------

In *note Working with catalogs estimating colors:: we created a single
catalog containing the magnitudes of our desired clumps in all three
filters, and their colors.  To start with, let’s inspect the
distribution of the three colors with the Statistics program.

     $ aststatistics cat/mags-with-color.fits -cF105W-F125W
     $ aststatistics cat/mags-with-color.fits -cF105W-F160W
     $ aststatistics cat/mags-with-color.fits -cF125W-F160W

   This tiny and cute ASCII histogram (and the general information
printed above it) gives you a crude (but very useful and fast) feeling
on the distribution.  You can later use Gnuastro’s Statistics program
with the ‘--histogram’ option to build a much more fine-grained
histogram as a table to feed into your favorite plotting program for a
much more accurate/appealing plot (for example with PGFPlots in LaTeX).
If you just want specific measures, for example the mean, median and
standard deviation, you can ask for them explicitly, like below:

     $ aststatistics cat/mags-with-color.fits -cF105W-F160W \
                     --mean --median --std

   The basic statistics we measured above were just on one column.  In
many scenarios this is fine, but things get much more exciting if you
look at the correlation of two columns with each other.  For example,
let’s create the color-magnitude diagram for our measured targets.

   In many papers, the color-magnitude diagram is usually plotted as a
scatter plot.  However, scatter plots have a major limitation when there
are a lot of points and they cluster together in one region of the plot:
the possible correlation in that dense region is lost (because the
points fall over each other).  In such cases, it’s much better to use a
2D histogram.  In a 2D histogram, the full range in both columns is
divided into discrete 2D bins (or pixels!)  and we count how many
objects fall in each 2D bin.
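
   The binning itself can be sketched in a few lines of AWK. This is
only an illustration of the idea, not how Statistics is implemented: the
five (magnitude, color) pairs are hypothetical, and the grid is a coarse
4 by 2 bins so the result is easy to check by eye.

```shell
# Hypothetical (magnitude, color) pairs, one per line.
hist=$(printf '%s\n' '22.5 0.2' '22.9 0.4' '25.1 1.5' '29.9 1.9' '31.0 0.5' |
  awk -v xmin=22 -v xmax=30 -v nx=4 -v ymin=0 -v ymax=2 -v ny=2 '
    # Only keep points with min <= value < max in both dimensions.
    $1 >= xmin && $1 < xmax && $2 >= ymin && $2 < ymax {
        i = int(($1 - xmin) / (xmax - xmin) * nx)   # Horizontal bin.
        j = int(($2 - ymin) / (ymax - ymin) * ny)   # Vertical bin.
        count[i, j]++
    }
    END {
        # Print the top row first, so the text reads like an image.
        for (j = ny - 1; j >= 0; j--)
            for (i = 0; i < nx; i++)
                printf "%d%s", count[i, j], (i < nx - 1 ? " " : "\n")
    }')
echo "$hist"
```

Each printed number is the count in one bin: two points fall in the
bottom-left bin, and the last input point is ignored because its
magnitude is outside the requested range.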

   Since a 2D histogram is a pixelated space, we can simply save it as a
FITS image and view it in a FITS viewer.  Let’s do this in the command
below.  As is common with color-magnitude plots, we’ll put the redder
magnitude on the horizontal axis and the color on the vertical axis.
We’ll set both dimensions to have 100 bins (with ‘--numbins’ for the
horizontal and ‘--numbins2’ for the vertical).  Also, to avoid strong
outliers in any of the dimensions, we’ll manually set the range of each
dimension with the ‘--greaterequal’, ‘--greaterequal2’, ‘--lessthan’ and
‘--lessthan2’ options.

     $ aststatistics cat/mags-with-color.fits -cMAG-F160W,F105W-F160W \
                     --histogram2d=image --manualbinrange \
                     --numbins=100  --greaterequal=22  --lessthan=30 \
                     --numbins2=100 --greaterequal2=-1 --lessthan2=3 \
                     --output=cmd.fits

You can now open this FITS file as a normal FITS image, for example with
the command below.  Try hovering/zooming over the pixels: not only will
you see the number of objects in our catalog that fall in each
bin/pixel, but you will also see the ‘F160W’ magnitude and color of that
pixel (in the same place you usually see RA and Dec when hovering over
an astronomical image).

     $ ds9 cmd.fits -cmap sls -zoom to fit

   Having a 2D histogram as a FITS image with WCS has many great
advantages.  For example, just like FITS images of the night sky, you
can “match” many 2D histograms that were created independently.  You can
add two histograms with each other, or you can use advanced features of
FITS viewers to find structure in the correlation of your columns.

With the first command below, you can activate the grid feature of DS9
to actually see the coordinate grid, as well as values on each line.
With the second command, DS9 will even read the labels of the axes and
use them to generate an almost publication-ready plot.

     $ ds9 cmd.fits -cmap sls -zoom to fit -grid yes
     $ ds9 cmd.fits -cmap sls -zoom to fit -grid yes -grid type publication

   If you are happy with the grid, coloring, etc., you can also use
ds9 to save this as a JPEG image to directly use in your
documents/slides with these extra DS9 options (DS9 will write the image
to ‘cmd-2d.jpeg’ and quit immediately afterwards):

     $ ds9 cmd.fits -cmap sls -zoom 4 -grid yes -grid type publication \
           -saveimage cmd-2d.jpeg -quit

   This is good for a fast progress update.  But for your paper or more
official report, you want to show something with higher quality.  For
that, you can use the PGFPlots package in LaTeX to add axes in the
same font as your text, sharp grids and many other elegant/powerful
features (like over-plotting interesting points, lines, etc.).  But to
load the 2D histogram into PGFPlots first you need to convert the FITS
image into a more standard format, for example PDF. We’ll use Gnuastro’s
*note ConvertType:: for this, and use the ‘sls-inverse’ color map (which
will map the pixels with a value of zero to white):

     $ astconvertt cmd.fits --colormap=sls-inverse --borderwidth=0 -ocmd.pdf

Below you can see a minimal working example of how to add axis
numbers, labels and a grid to the PDF generated above.  First, let’s
create a new ‘report’ directory to keep the LaTeX outputs, then put the
minimal report’s source in a file called ‘report.tex’.  Notice the
‘xmin’, ‘xmax’, ‘ymin’, ‘ymax’ values and how they are the same as the
range specified above.

     $ mkdir report
     $ mv cmd.pdf report/
     $ cat report/report.tex
     \documentclass{article}
     \usepackage{pgfplots}
     \dimendef\prevdepth=0
     \begin{document}

     You can write all you want here...\par

     \begin{tikzpicture}
       \begin{axis}[
           enlargelimits=false,
           grid,
           axis on top,
           width=\linewidth,
           height=\linewidth,
           xlabel={Magnitude (F160W)},
           ylabel={Color (F105W-F160W)}]

         \addplot graphics[xmin=22, xmax=30, ymin=-1, ymax=3] {cmd.pdf};
       \end{axis}
     \end{tikzpicture}
     \end{document}

Run this command to build your PDF (assuming you have LaTeX and
PGFPlots).

     $ cd report
     $ pdflatex report.tex

   Open the newly created ‘report.pdf’ and enjoy the exquisite quality.
The improved quality, blending in with the text, vector-graphics
resolution and other features make this plot pleasing to the eye, and
let your readers focus on the main point of your scientific argument.
PGFPlots can also build the PDF of the plot separately from the rest of
the paper/report, see *note 2D histogram as a table:: for the necessary
changes in the preamble.

   We won’t go much deeper into the Statistics program here, but there
is so much more you can do with it.  After finishing the tutorial, see
*note Statistics::.


File: gnuastro.info,  Node: Aperture photometry,  Next: Matching catalogs,  Prev: Column statistics color-magnitude diagram,  Up: General program usage tutorial

2.2.16 Aperture photometry
--------------------------

The colors we calculated in *note Working with catalogs estimating
colors:: used a different segmentation map for each object.  This might
not satisfy some science cases that need the flux within a fixed
area/aperture.  Fortunately Gnuastro’s modular programs make it very
easy to do this type of measurement (photometry).  To do this, we can
ignore the labeled images of NoiseChisel or Segment; we can just build
our own labeled image!  That labeled image can then be given to
MakeCatalog.

   To generate the apertures catalog we’ll use Gnuastro’s MakeProfiles
(see *note MakeProfiles::).  But first we need a list of positions
(aperture photometry needs a-priori knowledge of your target positions).
So we’ll first read the clump positions from the F160W catalog, then use
AWK to set the other parameters of each profile to be a fixed circle of
radius 5 pixels (recall that we want all apertures to have an identical
size/area in this scenario).

     $ rm *.fits *.txt
     $ asttable cat/xdf-f160w.fits -hCLUMPS -cRA,DEC \
                | awk '!/^#/{print NR, $1, $2, 5, 5, 0, 0, 1, NR, 1}' \
                > apertures.txt
     $ cat apertures.txt
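
   To see what the AWK step produces without running Table, the sketch
below feeds it two hypothetical RA and Dec rows in place of the
‘asttable’ output.  The column meanings given in the comments follow our
reading of MakeProfiles’ default column order; please confirm them in
*note Invoking astmkprof:: before relying on them.

```shell
# Two hypothetical "RA Dec" rows standing in for the asttable output.
apertures=$(printf '%s\n' '53.16 -27.78' '53.17 -27.79' |
  awk '!/^#/{print NR, $1, $2, 5, 5, 0, 0, 1, NR, 1}')

# Columns: ID, RA, Dec, profile code (5: flat), radius (5 pixels),
# Sersic index (unused), position angle, axis ratio (1: circle),
# magnitude column (here the label NR), truncation.
echo "$apertures"
```

Note that ‘NR’ counts every input line, including any comment line that
the ‘!/^#/’ condition skips, so the labels are only consecutive when the
input has no lines starting with ‘#’.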

   We can now feed this catalog into MakeProfiles using the command
below to build the apertures over the image.  The most important option
for this particular job is ‘--mforflatpix’, it tells MakeProfiles that
the values in the magnitude column should be used for each pixel of a
flat profile.  Without it, MakeProfiles would build the profiles such
that the _sum_ of the pixels of each profile would have a _magnitude_
(in log-scale) of the value given in that column (what you would expect
when simulating a galaxy for example).  See *note Invoking astmkprof::
for details on the options.

     $ astmkprof apertures.txt --background=flat-ir/xdf-f160w.fits \
                 --clearcanvas --replace --type=int16 --mforflatpix \
                 --mode=wcs --output=apertures.fits

   Open ‘apertures.fits’ with a FITS image viewer (like SAO DS9) and
look around at the circles placed over the targets.  Also open the input
image and Segment’s clumps image and compare them with the positions of
these circles.  Where the apertures overlap, you will notice that one
label has replaced the other (because of the ‘--replace’ option).  In
the future, MakeCatalog will be able to work with overlapping labels,
but currently it doesn’t.  If you are interested, please join us in
completing Gnuastro with added improvements like this (see task 14750
(1)).

   We can now feed the ‘apertures.fits’ labeled image into MakeCatalog
instead of Segment’s output as shown below.  In comparison with the
previous MakeCatalog call, you will notice that there is no more
‘--clumpscat’ option, since there is no more separate “clump” image now,
each aperture is treated as a separate “object”.

     $ astmkcatalog apertures.fits -h1 --zeropoint=26.27 \
                    --valuesfile=nc/xdf-f105w.fits \
                    --ids --ra --dec --magnitude --sn \
                    --output=cat/xdf-f105w-aper.fits

   This catalog has the same number of rows as the catalog produced from
clumps in *note Working with catalogs estimating colors::.  Therefore
similar to how we found colors, you can compare the aperture and clump
magnitudes for example.

   You can also change the filter name and zero point magnitudes and run
this command again to have the fixed aperture magnitude in the F160W
filter and measure colors on apertures.

   ---------- Footnotes ----------

   (1) <https://savannah.gnu.org/task/index.php?14750>


File: gnuastro.info,  Node: Matching catalogs,  Next: Finding reddest clumps and visual inspection,  Prev: Aperture photometry,  Up: General program usage tutorial

2.2.17 Matching catalogs
------------------------

In the example above, we had the luxury to generate the catalogs
ourselves, and were thus able to generate them in a way that the rows
match.  But this isn’t generally the case.  In many situations, you need
to use catalogs from many different telescopes, or catalogs with
high-level calculations that you can’t simply regenerate with the same
pixels without spending a lot of time or using heavy computation.  In
such cases, when each catalog has the coordinates of its own objects,
you can use the coordinates to match the rows with Gnuastro’s Match
program (see *note Match::).

   As the name suggests, Gnuastro’s Match program will match rows based
on distance (or aperture in 2D) in one, two, or three columns.  For this
tutorial, let’s try matching the two catalogs that weren’t created from
the same labeled images, recall how each has a different number of rows:

     $ asttable cat/xdf-f105w.fits -hCLUMPS -i
     $ asttable cat/xdf-f160w.fits -hCLUMPS -i

   You give Match two catalogs (from the two different filters we
derived above) as arguments, and the HDUs containing them (if they are
FITS files) with the ‘--hdu’ and ‘--hdu2’ options.  The ‘--ccol1’ and
‘--ccol2’ options specify the coordinate-columns which should be matched
with which in the two catalogs.  With ‘--aperture’ you specify the
acceptable error (radius in 2D), in the same units as the columns.

     $ astmatch cat/xdf-f160w.fits           cat/xdf-f105w.fits \
                --hdu=CLUMPS                 --hdu2=CLUMPS \
                --ccol1=RA,DEC               --ccol2=RA,DEC \
                --aperture=0.5/3600 \
                --output=matched.fits
     $ astfits matched.fits

   From the second command, you see that the output has two extensions
and that both have the same number of rows.  The rows in each extension
are the matched rows of the respective input table: those in the first
HDU come from the first input and those in the second HDU come from the
second.  However, their order may be different from the input tables
because the rows match: the first row in the first HDU matches with the
first row in the second HDU, etc.  You can also see which objects
didn’t match with the ‘--notmatched’ option, like below.  Note how each
extension of the output now has a different number of rows.

     $ astmatch cat/xdf-f160w.fits           cat/xdf-f105w.fits \
                --hdu=CLUMPS                 --hdu2=CLUMPS \
                --ccol1=RA,DEC               --ccol2=RA,DEC \
                --aperture=0.5/3600 \
                --output=not-matched.fits    --notmatched
     $ astfits not-matched.fits

   The ‘--outcols’ of Match is a very convenient feature: you can use it
to specify which columns from the two catalogs you want in the output
(merge two input catalogs into one).  If the first character is an
‘<a>’, the respective matched column (number or name, similar to Table
above) in the first catalog will be written in the output table.  When
the first character is a ‘<b>’, the respective column from the second
catalog will be written in the output.  Also, if the first character is
followed by ‘_all’, then all the columns from the respective catalog
will be put in the output.

     $ astmatch cat/xdf-f160w.fits           cat/xdf-f105w.fits \
                --hdu=CLUMPS                 --hdu2=CLUMPS \
                --ccol1=RA,DEC               --ccol2=RA,DEC \
                --aperture=0.35/3600 \
                --outcols=a_all,bMAGNITUDE,bSN \
                --output=matched.fits
     $ astfits matched.fits


File: gnuastro.info,  Node: Finding reddest clumps and visual inspection,  Next: Writing scripts to automate the steps,  Prev: Matching catalogs,  Up: General program usage tutorial

2.2.18 Finding reddest clumps and visual inspection
---------------------------------------------------

As a final step, let’s go back to the original clumps-based color
measurement we generated in *note Working with catalogs estimating
colors::.  We’ll find the objects with the strongest color and make a
cutout to inspect them visually and finally, we’ll see how they are
located on the image.  With the command below, we’ll select the reddest
objects (those with a color larger than 1.5):

     $ asttable cat/mags-with-color.fits --range=F105W-F160W,1.5,inf

You can see how many they are by piping it to ‘wc -l’:

     $ asttable cat/mags-with-color.fits --range=F105W-F160W,1.5,inf | wc -l

   Let’s crop the F160W image around each of these objects, but we first
need a unique identifier for them.  We’ll define this identifier using
the object and clump labels (with an underscore between them) and feed
the output of the command above to AWK to generate a catalog.  Note that
since we are making a plain text table, we’ll define the necessary (for
the string-type first column) metadata manually (see *note Gnuastro text
table format::).

     $ echo "# Column 1: ID [name, str10] Object ID" > reddest.txt
     $ asttable cat/mags-with-color.fits --range=F105W-F160W,1.5,inf \
                | awk '{printf("%d_%-10d %f %f\n", $1, $2, $3, $4)}' \
                >> reddest.txt
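
   To see what the AWK ‘printf’ format above produces, here is a minimal
sketch on two made-up rows.  The values are hypothetical: they only
stand in for the first four columns of the selected table, assumed here
to be the object label, clump label, RA and Dec.

```shell
# Hypothetical rows: object label, clump label, RA, Dec.
sample_rows='12 3 53.161906 -27.787144
7 1 53.159507 -27.779152'

# Same AWK command as in the tutorial: join the two labels with an
# underscore and print the two coordinates as floating point numbers.
sample_ids=$(printf '%s\n' "$sample_rows" \
             | awk '{printf("%d_%-10d %f %f\n", $1, $2, $3, $4)}')
echo "$sample_ids"
```

The first (string) column is the unique identifier, like ‘12_3’, and
the ‘%-10d’ format pads it with spaces on the right so the coordinate
columns line up.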

   We can now feed ‘reddest.txt’ into Gnuastro’s Crop program to see
what these objects look like.  To keep things clean, we’ll make a
directory called ‘crop-red’ and ask Crop to save the crops in this
directory.  We’ll also add a ‘-f160w.fits’ suffix to the crops (to
remind us which filter they came from).  The width of the crops will be
15 arc-seconds (or 15/3600 degrees, which is the units of the WCS).

     $ mkdir crop-red
     $ astcrop flat-ir/xdf-f160w.fits --mode=wcs --namecol=ID \
               --catalog=reddest.txt --width=15/3600,15/3600  \
               --suffix=-f160w.fits --output=crop-red
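
   As a quick check of the units, the arc-second to degree conversions
used in this tutorial (‘15/3600’ for the crop width here, and ‘0.5/3600’
for the aperture given to Match) can be evaluated with AWK. This is only
a worked arithmetic sketch; the variable names are arbitrary.

```shell
# Convert arc-seconds to degrees (1 degree = 3600 arc-seconds).
aperture_deg=$(awk 'BEGIN{printf "%.8f", 0.5/3600}')
width_deg=$(awk 'BEGIN{printf "%.8f", 15/3600}')
echo "0.5 arcsec = $aperture_deg deg"
echo "15 arcsec  = $width_deg deg"
```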

   You can see all the cropped FITS files in the ‘crop-red’ directory.
Like the MakeProfiles command in *note Aperture photometry::, you might
notice that the crops aren’t made in order.  This is because each crop
is independent of the rest, therefore crops are done in parallel, and
parallel operations are asynchronous.  In the command above, you can
change ‘f160w’ to ‘f105w’ to make the crops in both filters.

   To view the crops more easily (not having to open ds9 for each
image), you can convert the FITS crops into the JPEG format with a shell
loop like below.

     $ cd crop-red
     $ for f in *.fits; do                                                  \
         astconvertt $f --fluxlow=-0.001 --fluxhigh=0.005 --invert -ojpg;   \
       done
     $ cd ..
     $ ls crop-red/

   You can now use your general graphic user interface image viewer to
flip through the images more easily, or import them into your
papers/reports.

   The ‘for’ loop above to convert the images will do the job in series:
each file is converted only after the previous one is complete.  If you
have GNU Parallel (https://www.gnu.org/s/parallel), you can greatly
speed up this conversion.  GNU Parallel will run the separate commands
simultaneously on different CPU threads in parallel.  For more
information on efficiently using your threads, see *note Multi-threaded
operations::.  Here is a replacement for the shell ‘for’ loop above
using GNU Parallel.

     $ cd crop-red
     $ parallel astconvertt --fluxlow=-0.001 --fluxhigh=0.005 --invert   \
                -ojpg ::: *.fits
     $ cd ..

Did you notice how much faster this one was?  When possible, it’s always
very helpful to do your analysis in parallel.  But the problem is that
many operations are not as simple as this.  For such cases, you can use
Make (https://en.wikipedia.org/wiki/Make_(software)) which will greatly
help in designing workflows.  But that is beyond the topic here.

   As the final action, let’s see how these objects are positioned over
the dataset.  DS9 has the “Region”s concept for this purpose.  You just
have to convert your catalog into a “region file” to feed into DS9.  To
do that, you can use AWK again as shown below.

     $ awk 'BEGIN{print "# Region file format: DS9 version 4.1";      \
                  print "global color=green width=2";                 \
                  print "fk5";}                                       \
            !/^#/{printf "circle(%s,%s,1\") # text={%s}\n",$2,$3,$1;}'\
           reddest.txt > reddest.reg
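
   To see the structure of the resulting region file without the real
catalog, the same AWK program can be run on a small stand-in table.  In
this sketch the two rows (IDs and coordinates) and the file names
‘sample.txt’ and ‘sample.reg’ are hypothetical, and everything is
written into a temporary directory.

```shell
# A hypothetical two-row catalog in the same layout as 'reddest.txt'
# (ID, RA, Dec), written into a temporary directory.
regdir=$(mktemp -d)
printf '%s\n' "# Column 1: ID [name, str10] Object ID" \
              "12_3 53.161906 -27.787144" \
              "7_1  53.159507 -27.779152" > "$regdir/sample.txt"

# Same AWK program as above: three header lines, then one circle of
# 1 arc-second radius per non-comment row.
awk 'BEGIN{print "# Region file format: DS9 version 4.1";
           print "global color=green width=2";
           print "fk5";}
     !/^#/{printf "circle(%s,%s,1\") # text={%s}\n",$2,$3,$1;}' \
    "$regdir/sample.txt" > "$regdir/sample.reg"
cat "$regdir/sample.reg"
```

The ‘!/^#/’ pattern skips the metadata line that starts with ‘#’, so
only the data rows become circles, each labeled with its ID.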

   This region file can be loaded into DS9 with its ‘-regions’ option to
display over any image (that has world coordinate system).  In the
example below, we’ll open Segment’s output and load the regions over all
the extensions (to see the image and the respective clump):

     $ ds9 -mecube seg/xdf-f160w.fits -zscale -zoom to fit    \
           -regions load all reddest.reg


File: gnuastro.info,  Node: Writing scripts to automate the steps,  Next: Citing and acknowledging Gnuastro,  Prev: Finding reddest clumps and visual inspection,  Up: General program usage tutorial

2.2.19 Writing scripts to automate the steps
--------------------------------------------

In the previous sub-sections, we went through a series of steps like
downloading the necessary datasets (in *note Setup and data download::),
detecting the objects in the image, and finally selecting a particular
subset of them to inspect visually (in *note Finding reddest clumps and
visual inspection::).  To benefit most effectively from this subsection,
please go through the previous sub-sections, and if you haven’t actually
run their commands, we recommend doing so before continuing here.

   Each step in the sub-sections above involved several commands on the
command-line.  Therefore, if you want to reproduce the previous results
(for example to only change one part, and see its effect), you’ll have
to go through all the sections above and read through them again.  If
you ran the commands recently, you may also
have them in the history of your shell (command-line environment).  You
can see many of your previous commands on the shell (even if you have
closed the terminal) with the ‘history’ command, like this:

     $ history

   Try it in your terminal to see for yourself.  By default in GNU Bash,
it shows the last 500 commands.  You can also save this “history” of
previous commands to a file using shell redirection (to have it after
your next 500 commands), with this command:

     $ history > my-previous-commands.txt

   This is a good way to temporarily keep track of every single command
you ran.  But in the middle of all the useful commands, you will have
many extra commands, like tests that you did before/after the good
output of a step (that you decided to continue working on), or an
unrelated job you had to do in the middle of this project.  Because of
these impurities, after a few days (when you have forgotten the context:
tests you didn’t end-up using, or unrelated jobs) reading this full
history will be very frustrating.

   Keeping the final commands that were used in each step of an analysis
is a common problem for anyone who is doing something serious with the
computer.  But simply keeping the most important commands in a text file
is not enough: the small steps in the middle (like making a directory to
keep the outputs of one step) are also important.  In other words, the
only way you can be sure that you are in control of your processing
(and actually understand how you produced your final result) is to run
the commands automatically.

   Fortunately, typing commands interactively with your fingers isn’t
the only way to operate the shell.  The shell can also take its
orders/commands from a plain-text file, which is called a _script_.
When given a script, the shell will read it line-by-line as if you had
actually typed it manually.

   Let’s continue with an example: try typing the commands below in your
shell.  With these commands we are making a text file (‘a.txt’)
containing a simple $3\times3$ matrix, converting it to a FITS image and
computing its basic statistics.  After the first three commands, open
‘a.txt’ with a text editor to actually see the values we wrote in it,
and after the fourth, open the FITS file to see the matrix as an image.
‘a.txt’ is created through the shell’s redirection feature: ‘‘>’’
overwrites the existing contents of a file, and ‘‘>>’’ appends the new
contents after the old contents.

     $ echo "1 1 1" > a.txt
     $ echo "1 2 1" >> a.txt
     $ echo "1 1 1" >> a.txt
     $ astconvertt a.txt --output=a.fits
     $ aststatistics a.fits
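
   The difference between the two redirection operators is easy to
check on a throw-away file.  In this minimal sketch the file name
‘demo.txt’ is hypothetical, and it is written in a temporary directory
so it doesn’t interfere with ‘a.txt’.

```shell
redirdir=$(mktemp -d)
echo "first"  >  "$redirdir/demo.txt"   # File now has one line.
echo "second" >> "$redirdir/demo.txt"   # Appended: two lines.
lines_appended=$(wc -l < "$redirdir/demo.txt" | tr -d ' ')
echo "third"  >  "$redirdir/demo.txt"   # Overwritten: back to one line.
lines_overwritten=$(wc -l < "$redirdir/demo.txt" | tr -d ' ')
echo "after '>>': $lines_appended lines; after '>': $lines_overwritten line"
```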

   To automate this series of commands, you should put them in a text
file.  But that text file must have two special features: 1) It should
tell the shell what program should interpret the script.  2) The
operating system should know that the file can be directly executed.

   For the first, Unix-like operating systems define the _shebang_
concept (also known as _sha-bang_ or _hashbang_).  In the shebang
convention, the first two characters of a file should be ‘‘#!’’.  When
the operating system finds these characters, it will run the script with
the program that follows them.  In this case, we want to write a shell
script and the most common shell program is GNU Bash which is usually
installed in ‘/bin/bash’.  So the first line of your script should be
‘‘#!/bin/bash’’(1).

   It may happen (rarely) that GNU Bash is in another location on your
system.  In other cases, you may prefer to use a non-standard version of
Bash installed in another location (that has higher priority in your
‘PATH’, see *note Installation directory::).  In such cases, you can use
the ‘‘#!/usr/bin/env bash’’ shebang instead.  Through the ‘env’ program,
this shebang will look in your ‘PATH’ and use the first ‘bash’ it finds
to run your script.  But for simplicity in the rest of the tutorial,
we’ll continue with the ‘‘#!/bin/bash’’ shebang.
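
   If you are not sure which of the two shebangs fits your system, you
can ask the shell where it finds ‘bash’.  This is only a small sketch
using the shell’s standard ‘command -v’ built-in; the variable name is
arbitrary.

```shell
# The first 'bash' found in PATH: this is what '#!/usr/bin/env bash' runs.
bash_in_path=$(command -v bash)
echo "bash found in PATH: $bash_in_path"

# Compare with the fixed location used by '#!/bin/bash'.
if [ -x /bin/bash ]; then
  echo "/bin/bash exists and is executable"
else
  echo "/bin/bash is not available here"
fi
```

When the two locations differ, the ‘env’ shebang will use the one
printed first, while ‘‘#!/bin/bash’’ will always use the fixed one.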

   Using your favorite text editor, make a new empty file, let’s call it
‘my-first-script.sh’.  Write the GNU Bash shebang (above) as its first
line.  After the shebang, copy the series of commands we ran above.  Just
note that the ‘‘$’’ sign at the start of every line above is the prompt
of the interactive shell (you never actually typed it, remember?).
Therefore, commands in a shell script should not start with a ‘‘$’’.
Once you add the commands, close the text editor and run the ‘cat’
command to confirm its contents.  It should look like the example below.
Recall that you should only type the line that starts with a ‘‘$’’, the
lines without a ‘‘$’’, are printed automatically on the command-line
(they are the contents of your script).

     $ cat my-first-script.sh
     #!/bin/bash
     echo "1 1 1" > a.txt
     echo "1 2 1" >> a.txt
     echo "1 1 1" >> a.txt
     astconvertt a.txt --output=a.fits
     aststatistics a.fits

   The script contents are now ready, but to run it, you should activate
the script file’s _executable flag_.  In Unix-like operating systems,
every file has three types of flags: _read_ (or ‘r’), _write_ (or ‘w’)
and _execute_ (or ‘x’).  To toggle a file’s flags, you should use the
‘chmod’ (for “change mode”) command.  To activate a flag, you put a
‘‘+’’ before the flag character (for example ‘+x’).  To deactivate it,
you put a ‘‘-’’ (for example ‘-x’).  In this case, you want to activate
the script’s executable flag, so you should run

     $ chmod +x my-first-script.sh

   Your script is now ready to run/execute the series of commands.  To
run it, you should call it while specifying its location in the file
system.  Since you are currently in the same directory as the script,
it’s easiest to use relative addressing like below (where ‘‘./’’ means
the current directory).  But before running your script, first delete
the two ‘a.txt’ and ‘a.fits’ files that were created when you
interactively ran the commands.

     $ rm a.txt a.fits
     $ ls
     $ ./my-first-script.sh
     $ ls

The script immediately prints the statistics while doing all the
previous steps in the background.  With the last ‘ls’, you see that it
automatically re-built the ‘a.txt’ and ‘a.fits’ files.  Open them and
have a look at their contents.
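
   The effect of ‘chmod +x’ can be verified with the shell’s ‘-x’ test,
which is true only when a file is executable.  The sketch below uses a
hypothetical throw-away script (‘flag-demo.sh’, created in a temporary
directory with a here-document) that doesn’t need Gnuastro.

```shell
flagdir=$(mktemp -d)
demo="$flagdir/flag-demo.sh"

# Write a two-line script with a here-document.
cat > "$demo" <<'EOF'
#!/bin/bash
echo "hello from the script"
EOF

# A newly created file normally doesn't have the executable flag.
if [ -x "$demo" ]; then before=yes; else before=no; fi

chmod +x "$demo"
if [ -x "$demo" ]; then after=yes; else after=no; fi
echo "executable before chmod: $before, after chmod: $after"

# Now it can be run by giving its location.
demo_output=$("$demo")
echo "$demo_output"
```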

   An extremely useful feature of shell scripts is that the shell will
ignore anything after a ‘‘#’’ character.  You can thus add
descriptions/comments to the commands and make them much more useful for
the future.  For example, after adding comments, your script might look
like this:

     $ cat my-first-script.sh
     #!/bin/bash

     # This script is my first attempt at learning to write shell scripts.
     # As a simple series of commands, I am just building a small FITS
     # image, and calculating its basic statistics.

     # Write the matrix into a file.
     echo "1 1 1" > a.txt
     echo "1 2 1" >> a.txt
     echo "1 1 1" >> a.txt

     # Convert the matrix to a FITS image.
     astconvertt a.txt --output=a.fits

     # Calculate the statistics of the FITS image.
     aststatistics a.fits

Isn’t this much easier to read now?  Comments help to provide
human-friendly context to the raw commands.  At the time you make a
script, comments may seem like an extra effort and slow you down.  But
in one year, you will forget almost everything about your script and you
will appreciate the effort so much!  Think of the comments as an email
to your future-self and always put a well-written description of the
context/purpose (most importantly, things that aren’t directly clear by
reading the commands) in your scripts.

   The example above was a very basic and mostly redundant series of
commands, to show the basic concepts behind scripts.  You can put any
(arbitrarily long and complex) series of commands in a script by
following the two rules: 1) add a shebang, and 2) enable the executable
flag.  In fact, as you continue your own research projects, you will
find that any time you are dealing with more than two or three commands,
keeping them in a script (and modifying that script, and running it) is
much easier, and more future-proof, than typing the commands directly on
the command-line and relying on things like ‘history’.

   As a more realistic example, let’s have a look at a script that will
do the steps of *note Setup and data download:: and *note Dataset
inspection and cropping::.  In particular note how often we are using
variables to avoid repeating fixed strings of characters (usually
file/directory names).  This greatly helps in scaling up your project,
and avoiding hard-to-find bugs that are caused by typos in those fixed
strings.
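
   Before reading the full script, here is a minimal sketch of the
variable technique it relies on.  The prefix and suffix values are
copied from the script below; the ‘for’ loop over the two filter names
is only an illustration and is not part of that script.

```shell
# Fixed parts of the file names (copied from the script below).
xdfprefix=hlsp_xdf_hst_wfc3ir-60mas_hudf_
xdfsuffix=_v1_sci.fits

# Sandwich the filter name between the prefix and suffix.
f105w_in=$xdfprefix"f105w"$xdfsuffix
echo "$f105w_in"

# With a loop, the same pattern covers any number of filters.
for filter in f105w f160w; do
  echo "$xdfprefix$filter$xdfsuffix"
done
```

Since the long fixed strings are typed only once, a typo in them can
only happen (and be fixed) in one place.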

     $ cat gnuastro-tutorial-1.sh
     #!/bin/bash


     # Download the input datasets
     # ---------------------------
     #
     # The default file names have this format (where `FILTER' differs for
     # each filter):
     #   hlsp_xdf_hst_wfc3ir-60mas_hudf_FILTER_v1_sci.fits
     # To make the script easier to read, a prefix and suffix variable are
     # used to sandwich the filter name into one short line.
     downloaddir=download
     xdfsuffix=_v1_sci.fits
     xdfprefix=hlsp_xdf_hst_wfc3ir-60mas_hudf_
     xdfurl=http://archive.stsci.edu/pub/hlsp/xdf

     # The file name and full URLs of the input data.
     f105w_in=$xdfprefix"f105w"$xdfsuffix
     f160w_in=$xdfprefix"f160w"$xdfsuffix
     f105w_full=$xdfurl/$f105w_in
     f160w_full=$xdfurl/$f160w_in

     # Go into the download directory and download the images there,
     # then come back up to the top running directory.
     mkdir $downloaddir
     cd $downloaddir
     wget $f105w_full
     wget $f160w_full
     cd ..


     # Only work on the deep region
     # ----------------------------
     #
     # To help in readability, each vertex of the deep/flat field is stored
     # as a separate variable. They are then merged into one variable to
     # define the polygon.
     flatdir=flat-ir
     vertice1="53.187414,-27.779152"
     vertice2="53.159507,-27.759633"
     vertice3="53.134517,-27.787144"
     vertice4="53.161906,-27.807208"
     f105w_flat=$flatdir/xdf-f105w.fits
     f160w_flat=$flatdir/xdf-f160w.fits
     deep_polygon="$vertice1:$vertice2:$vertice3:$vertice4"

     mkdir $flatdir
     astcrop --mode=wcs -h0 --output=$f105w_flat \
             --polygon=$deep_polygon $downloaddir/$f105w_in
     astcrop --mode=wcs -h0 --output=$f160w_flat \
             --polygon=$deep_polygon $downloaddir/$f160w_in

   The first thing you may notice is that even if you already have the
downloaded input images, this script will always try to re-download
them.  Also, if you re-run the script, you will notice that ‘mkdir’
prints an error message that the download directory already exists.
Therefore, the script above isn’t too useful and some modifications are
necessary to make it more generally useful.  Here are some general tips
that are often very useful when writing scripts:

*Stop script if a command crashes*
     By default, if a command in a script crashes (aborts and fails to
     do what it was meant to do), the script will continue onto the next
     command.  In GNU Bash, you can tell the shell to stop a script in
     the case of a crash by adding this line at the start of your
     script:

          set -e

*Check if a file/directory exists to avoid re-creating it*
     Conditionals are a very useful feature in scripts.  One common
     conditional is to check if a file exists or not.  Assuming the
     file’s name is ‘FILENAME’, you can check its existence (to avoid
     re-doing the commands that build it) like this:
          if [ -f FILENAME ]; then
            echo "FILENAME exists"
          else
            # Some commands to generate the file
            echo "done" > FILENAME
          fi
     To check the existence of a directory instead of a file, use ‘-d’
     instead of ‘-f’.  To negate a conditional, use ‘‘!’’ and note that
     conditionals can be written in one line also (useful when they are
     short).

     One common scenario in which you’ll need to check the existence of
     directories is when you are making them: the default ‘mkdir’
     command will crash if the desired directory already exists.  On
     some systems (including GNU/Linux distributions), ‘mkdir’ has
     options to deal with such cases.  But if you want your script to be
     portable, it’s best to check yourself like below:

          if ! [ -d DIRNAME ]; then mkdir DIRNAME; fi
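
   Both tips can be tried safely on the command-line.  In the sketch
below, two tiny scripts are run in child Bash processes with ‘bash -c’
(so a failure there doesn’t close your own shell), and the directory
check is done on a hypothetical ‘outputs’ directory inside a temporary
directory.

```shell
# Run two tiny scripts in child Bash processes, so a failure there
# does not close the shell you are typing in. 'false' is a standard
# command that always fails.
without_e=$(bash -c 'echo "before"; false; echo "after"')
with_e=$(bash -c 'set -e; echo "before"; false; echo "after"' || true)
echo "without 'set -e': $without_e"
echo "with 'set -e':    $with_e"

# The directory check is safe to repeat: the second time it does nothing.
checkdir=$(mktemp -d)/outputs
if ! [ -d "$checkdir" ]; then mkdir "$checkdir"; fi
if ! [ -d "$checkdir" ]; then mkdir "$checkdir"; fi
```

Without ‘set -e’ the child script continues to the last ‘echo’ after
the failed command; with it, the script stops at the failure.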

Taking these tips into consideration, we can write a better version of
the script above that includes checks on every step to avoid repeating
steps/commands.  Please compare this script with the previous one
carefully to spot the differences.  These are very important points that
you will definitely encounter during your own research, and knowing them
can greatly help your productivity, so pay close attention (even in the
comments).

     $ cat gnuastro-tutorial-2.sh
     #!/bin/bash
     set -e


     # Download the input datasets
     # ---------------------------
     #
     # The default file names have this format (where `FILTER' differs for
     # each filter):
     #   hlsp_xdf_hst_wfc3ir-60mas_hudf_FILTER_v1_sci.fits
     # To make the script easier to read, a prefix and suffix variable are
     # used to sandwich the filter name into one short line.
     downloaddir=download
     xdfsuffix=_v1_sci.fits
     xdfprefix=hlsp_xdf_hst_wfc3ir-60mas_hudf_
     xdfurl=http://archive.stsci.edu/pub/hlsp/xdf

     # The file name and full URLs of the input data.
     f105w_in=$xdfprefix"f105w"$xdfsuffix
     f160w_in=$xdfprefix"f160w"$xdfsuffix
     f105w_full=$xdfurl/$f105w_in
     f160w_full=$xdfurl/$f160w_in

     # Go into the download directory and download the images there,
     # then come back up to the top running directory.
     if ! [ -d $downloaddir ]; then mkdir $downloaddir; fi
     cd $downloaddir
     if ! [ -f $f105w_in ]; then wget $f105w_full; fi
     if ! [ -f $f160w_in ]; then wget $f160w_full; fi
     cd ..


     # Only work on the deep region
     # ----------------------------
     #
     # To help in readability, each vertex of the deep/flat field is stored
     # as a separate variable. They are then merged into one variable to
     # define the polygon.
     flatdir=flat-ir
     vertice1="53.187414,-27.779152"
     vertice2="53.159507,-27.759633"
     vertice3="53.134517,-27.787144"
     vertice4="53.161906,-27.807208"
     f105w_flat=$flatdir/xdf-f105w.fits
     f160w_flat=$flatdir/xdf-f160w.fits
     deep_polygon="$vertice1:$vertice2:$vertice3:$vertice4"

     if ! [ -d $flatdir ]; then mkdir $flatdir; fi
     if ! [ -f $f105w_flat ]; then
         astcrop --mode=wcs -h0 --output=$f105w_flat \
                 --polygon=$deep_polygon $downloaddir/$f105w_in
     fi
     if ! [ -f $f160w_flat ]; then
         astcrop --mode=wcs -h0 --output=$f160w_flat \
                 --polygon=$deep_polygon $downloaddir/$f160w_in
     fi
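
   Since both filters go through identical steps, the repeated blocks
could also be written as a loop.  This is only a sketch of that idea and
not part of the tutorial’s script: the variable values are copied from
the script above, and to keep it runnable without the input images, the
‘astcrop’ command is only printed with ‘echo’ (remove the ‘echo’ to
actually run it).

```shell
flatdir=flat-ir
downloaddir=download
xdfsuffix=_v1_sci.fits
xdfprefix=hlsp_xdf_hst_wfc3ir-60mas_hudf_
deep_polygon="53.187414,-27.779152:53.159507,-27.759633:53.134517,-27.787144:53.161906,-27.807208"

for filter in f105w f160w; do
  input=$downloaddir/$xdfprefix$filter$xdfsuffix
  output=$flatdir/xdf-$filter.fits
  if ! [ -f "$output" ]; then
    # Dry run: print the command instead of running it.
    echo astcrop --mode=wcs -h0 --output="$output" \
         --polygon="$deep_polygon" "$input"
  fi
done
```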

   ---------- Footnotes ----------

   (1) When the script is to be run by the same shell that is calling it
(like this script), the shebang is optional.  But it is still
recommended, because it ensures that even if the user isn’t using GNU
Bash, the script will be run in GNU Bash: given the differences between
various shells, writing truly portable shell scripts, that can be run
by many shell programs/implementations, isn’t easy (sometimes not
possible!).


File: gnuastro.info,  Node: Citing and acknowledging Gnuastro,  Prev: Writing scripts to automate the steps,  Up: General program usage tutorial

2.2.20 Citing and acknowledging Gnuastro
----------------------------------------

In conclusion, we hope this extended tutorial has been a good starting
point to help in your exciting research.  If this book or any of the
programs in Gnuastro have been useful for your research, please cite the
respective papers, and acknowledge the funding agencies that made all of
this possible.  Without citations, we won’t be able to secure future
funding to continue working on Gnuastro or improving it, so please take
software citation seriously (for all the scientific software you use,
not just Gnuastro).

   To help you in this aspect as well, all Gnuastro programs have a
‘--cite’ option to facilitate the citation and acknowledgment.  Just
note that it may be necessary to cite additional papers for different
programs, so please try it out on all the programs that you used, for
example:

     $ astmkcatalog --cite
     $ astnoisechisel --cite


File: gnuastro.info,  Node: Detecting large extended targets,  Prev: General program usage tutorial,  Up: Tutorials

2.3 Detecting large extended targets
====================================

The outer wings of large and extended objects can sink into the noise
very gradually and can have a large variety of shapes (for example due
to tidal interactions).  Therefore separating the outer boundaries of
the galaxies from the noise can be particularly tricky.  Besides causing
an under-estimation in the total estimated brightness of the target,
failure to detect such faint wings will also cause a bias in the noise
measurements, thereby hampering the accuracy of any measurement on the
dataset.  Therefore even if they don’t constitute a significant fraction
of the target’s light, or aren’t your primary target, these regions must
not be ignored.  In this tutorial, we’ll walk you through the strategy
of detecting such targets using *note NoiseChisel::.

*Don’t start with this tutorial:* If you haven’t already completed *note
General program usage tutorial::, we strongly recommend going through
that tutorial before starting this one.  Basic features like access to
this book on the command-line, the configuration files of Gnuastro’s
programs, benefiting from the modular nature of the programs, viewing
multi-extension FITS files, or using NoiseChisel’s outputs are discussed
in more detail there.

   We’ll try to detect the faint tidal wings of the beautiful M51
group(1) in this tutorial.  We’ll use a dataset/image from the public
Sloan Digital Sky Survey (http://www.sdss.org/), or SDSS. Due to its
more peculiar low surface brightness structure/features, we’ll focus on
the dwarf companion galaxy of the group (or NGC 5195).

* Menu:

* Downloading and validating input data::  How to get and check the input data.
* NoiseChisel optimization::    Detect the extended and diffuse wings.
* Image surface brightness limit::  Standards to quantify the noise level.
* Achieved surface brightness level::  Calculate the outer surface brightness.
* Extract clumps and objects::  Find sub-structure over the detections.

   ---------- Footnotes ----------

   (1) <https://en.wikipedia.org/wiki/M51_Group>


File: gnuastro.info,  Node: Downloading and validating input data,  Next: NoiseChisel optimization,  Prev: Detecting large extended targets,  Up: Detecting large extended targets

2.3.1 Downloading and validating input data
-------------------------------------------

To get the image, you can use SDSS’s Simple field search
(https://dr12.sdss.org/fields) tool.  As long as it is covered by the
SDSS, you can find an image containing your desired target either by
providing a standard name (if it has one), or its coordinates.  To
access the dataset we will use here, write ‘NGC5195’ in the “Object
Name” field and press the “Submit” button.

*Type the example commands:* Try to type the example commands on your
terminal and use the history feature of your command-line (by pressing
the “up” button to retrieve previous commands).  Don’t simply copy and
paste the commands shown here.  This will help simulate future
situations when you are processing your own datasets.

   You can see the list of available filters under the color image.  For
this demonstration, we’ll use the r-band filter image.  By clicking on
the “r-band FITS” link, you can download the image.  Alternatively, you
can just run the following command to download it with GNU Wget(1).  To
keep things clean, let’s also put it in a directory called ‘ngc5195’.
With the ‘-O’ option, we are asking Wget to save the downloaded file
with a more manageable name: ‘r.fits.bz2’ (this is an r-band image of
NGC 5195, which was the directory name).

     $ mkdir ngc5195
     $ cd ngc5195
     $ topurl=https://dr12.sdss.org/sas/dr12/boss/photoObj/frames
     $ wget $topurl/301/3716/6/frame-r-003716-6-0117.fits.bz2 -Or.fits.bz2

   When you want to reproduce a previous result (a known analysis, on a
known dataset, to get a known result: like the case here!)  it is
important to verify that the file is correct: that the input file hasn’t
changed (on the remote server, or in your own archive), or there was no
downloading problem.  Otherwise, if the data have changed in your
server/archive, and you use the same script, you will get a different
result, causing a lot of confusion!

   One good way to verify the contents of a file is to store its
_Checksum_ in your analysis script and check it before any other
operation.  The _Checksum_ algorithms look into the contents of a file
and calculate a fixed-length string from them.  If any change (even in a
bit or byte) is made within the file, the resulting string will change.
For more, see Wikipedia (https://en.wikipedia.org/wiki/Checksum).  There
are many common algorithms, but a simple one is the SHA-1 algorithm
(https://en.wikipedia.org/wiki/SHA-1) (Secure Hash Algorithm 1) that you
can calculate easily with the command below (the second line is the
output, and the checksum is the first/long string: it is independent of
the file name)

     $ sha1sum r.fits.bz2
     5fb06a572c6107c72cbc5eb8a9329f536c7e7f65  r.fits.bz2

   If the checksum on your computer is different from this, either the
file has been incorrectly downloaded (most probable), or it has changed
on SDSS servers (very unlikely(2)).  To get a better feeling for
checksums, open your favorite text editor and make a test file by writing
something in it.  Save it and calculate the text file’s SHA-1 checksum
with ‘sha1sum’.  Try renaming that file, and you’ll see the checksum
hasn’t changed (checksums only look into the contents, not the
name/location of the file).  Then open the file with your text editor
again, make a change and re-calculate its checksum: you’ll see the
checksum string has changed.
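
   The same experiment can also be done entirely on the command-line.
This is a minimal sketch with hypothetical file names (‘test.txt’ and
‘renamed.txt’) in a temporary directory; it assumes the ‘sha1sum’
program used in this section is available.

```shell
sumdir=$(mktemp -d)
echo "some test text" > "$sumdir/test.txt"
sum_original=$(sha1sum "$sumdir/test.txt" | awk '{print $1}')

# Renaming doesn't change the contents, so the checksum stays the same.
mv "$sumdir/test.txt" "$sumdir/renamed.txt"
sum_renamed=$(sha1sum "$sumdir/renamed.txt" | awk '{print $1}')

# Changing even one character gives a completely different checksum.
echo "some test texT" > "$sumdir/renamed.txt"
sum_changed=$(sha1sum "$sumdir/renamed.txt" | awk '{print $1}')

echo "original: $sum_original"
echo "renamed:  $sum_renamed"
echo "changed:  $sum_changed"
```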

   It’s always good to keep this short checksum string with your
project’s scripts and validate your input data before using them.  You
can do this with a shell conditional like this:

     filename=r.fits.bz2
     expected=5fb06a572c6107c72cbc5eb8a9329f536c7e7f65
     sum=$(sha1sum "$filename" | awk '{print $1}')
     if [ "$sum" = "$expected" ]; then
       echo "$filename: validated"
     else
       echo "$filename: wrong checksum!"
       exit 1
     fi
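
   As an alternative to comparing the strings yourself, the ‘sha1sum’
program of GNU Coreutils can do the comparison with its ‘--check’
option, which reads lines in the same format that it prints (checksum,
two spaces, file name).  The sketch below assumes the GNU version of
‘sha1sum’ and uses a hypothetical test file (‘check-demo.txt’), not the
SDSS image.

```shell
chkdir=$(mktemp -d)
chkfile="$chkdir/check-demo.txt"
echo "contents to protect" > "$chkfile"

# Store the checksum line (CHECKSUM, two spaces, FILENAME).
sha1sum "$chkfile" > "$chkdir/expected.sha1"

# '--status' suppresses the printed report: only the exit status is used.
if sha1sum --check --status "$chkdir/expected.sha1"; then
  check_ok=yes
else
  check_ok=no
fi

# After modifying the file, the same check fails.
echo "tampered" >> "$chkfile"
if sha1sum --check --status "$chkdir/expected.sha1"; then
  check_tampered=yes
else
  check_tampered=no
fi
echo "untouched file passes: $check_ok, modified file passes: $check_tampered"
```

Keeping the ‘expected.sha1’ file next to your scripts lets one command
validate many input files at once, since it can hold one line per file.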

Now that we know you have the same data that we wrote this tutorial
with, let’s continue.  The SDSS server keeps the files in a Bzip2
compressed file format (that has a ‘.bz2’ suffix).  So we’ll first
decompress it with the following command to use it as a normal FITS
file.  By convention, compression programs delete the original file
(compressed when uncompressing, or uncompressed when compressing).  To
keep the original file, you can use the ‘--keep’ or ‘-k’ option which is
available in most compression programs for this job.  Here, we don’t
need the compressed file any more, so we’ll just let ‘bunzip2’ delete it
for us and keep the directory clean.

     $ bunzip2 r.fits.bz2

   ---------- Footnotes ----------

   (1) To make the command easier to view on screen or in a page, we
have defined the top URL of the image as the ‘topurl’ shell variable.
You can just replace ‘$topurl’ in the ‘wget’ command with the value of
this variable.

   (2) If your checksum is different, try uncompressing the file with
the ‘bunzip2’ command after this, and open the resulting FITS file.  If
it opens and you see the image of M51 and NGC5195, then there was no
download problem, and the file has indeed changed on the SDSS servers!
In this case, please contact us at ‘bug-gnuastro@gnu.org’.


File: gnuastro.info,  Node: NoiseChisel optimization,  Next: Image surface brightness limit,  Prev: Downloading and validating input data,  Up: Detecting large extended targets

2.3.2 NoiseChisel optimization
------------------------------

In *note Detecting large extended targets:: we downloaded the single
exposure SDSS image.  Let’s see how NoiseChisel operates on it with its
default parameters:

     $ astnoisechisel r.fits -h0

   As described in *note NoiseChisel and Multiextension FITS files::,
NoiseChisel’s default output is a multi-extension FITS file.  Open the
output ‘r_detected.fits’ file and have a look at the extensions.  The
0-th extension is only meta-data and contains NoiseChisel’s
configuration parameters.  The rest are the Sky-subtracted input, the
detection map, Sky values and Sky standard deviation.

     $ ds9 -mecube r_detected.fits -zscale -zoom to fit

   Flipping through the extensions in a FITS viewer, you will see that
the first image (Sky-subtracted image) looks reasonable: there are no
major artifacts due to bad Sky subtraction compared to the input.  The
second extension also seems reasonable with a large detection map that
covers the whole of NGC5195, but also extends towards the bottom of the
image where we actually see faint and diffuse signal in the input image.

   Now try flipping between the ‘DETECTIONS’ and ‘SKY’ extensions.  In
the ‘SKY’ extension, you’ll notice that there is still significant
signal beyond the detected pixels.  You can tell that this signal
belongs to the galaxy because the far-right side of the image (away from
M51) is dark (has lower values) and the brighter parts in the Sky image
(with larger values) are just under the detections and follow a similar
pattern.

   The fact that signal from the galaxy remains in the ‘SKY’ HDU shows
that NoiseChisel can be optimized for a much better result.  The ‘SKY’
extension must not contain any light around the galaxy.  Generally, any
time your target is much larger than the tile size and the signal is
very diffuse and extended at low signal-to-noise values (like this
case), this _will_ happen.  Therefore, when there are large objects in
the dataset, *the best place* to check the accuracy of your detection is
the estimated Sky image.

   When dominated by the background, noise has a symmetric distribution.
However, signal is not symmetric (we don’t have negative signal).
Therefore when non-constant(1) signal is present in a noisy dataset, the
distribution will be positively skewed.  For a demonstration, see Figure
1 of Akhlaghi and Ichikawa [2015] (https://arxiv.org/abs/1505.01664).
This skewness is a good measure of how much faint signal we have in the
distribution.  The skewness can be accurately measured by the difference
in the mean and median (assuming no strong outliers): the more distant
they are, the more skewed the dataset is.  For more see *note
Quantifying signal in a tile::.
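
   As a rough illustration of the mean-median proxy described above,
the shell sketch below computes (mean - median) / std for a short list
of numbers.  The ‘skew_proxy’ function and its input values are
hypothetical (they are not part of Gnuastro and are not taken from the
SDSS image); on real data you would get these statistics from the
Statistics program.

```shell
# Hypothetical helper (not a Gnuastro command): print the skewness
# proxy (mean - median) / std for the numbers given as arguments.
skew_proxy() {
    printf '%s\n' "$@" | sort -n | awk '
        { v[NR] = $1; sum += $1; sumsq += $1 * $1 }
        END {
            mean = sum / NR
            # Median of the sorted values (average the two central
            # values when the number of elements is even).
            if (NR % 2) median = v[(NR + 1) / 2]
            else        median = (v[NR / 2] + v[NR / 2 + 1]) / 2
            std = sqrt(sumsq / NR - mean * mean)
            printf "%.3f\n", (mean - median) / std
        }'
}

# Symmetric "noise": the mean and median coincide.
skew_proxy -2 -1 0 1 2
# Same values, but one element carries extra positive signal.
skew_proxy -2 -1 0 1 7
```

   The symmetric list gives zero, while the list with one brighter
element gives a clearly positive value.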

   However, skewness is only a proxy for signal when the signal has
structure (varies per pixel).  Therefore, when it is approximately
constant over a whole tile, or sub-set of the image, the constant
signal’s effect is just to shift the symmetric center of the noise
distribution to the positive and there won’t be any skewness (major
difference between the mean and median).  This positive(2) shift that
preserves the symmetric distribution is the Sky value.  When there is a
gradient over the dataset, different tiles will have different constant
shifts/Sky-values, for example see Figure 11 of Akhlaghi and Ichikawa
[2015] (https://arxiv.org/abs/1505.01664).

   To make this very large diffuse/flat signal detectable, you will
therefore need a larger tile to contain a larger change in the values
within it (and improve number statistics, for less scatter when
measuring the mean and median).  So let’s play with the tessellation a
little to see how it affects the result.  In Gnuastro, you can see the
option values (‘--tilesize’ in this case) by adding the ‘-P’ option to
your last command.  Try running NoiseChisel with ‘-P’ to see its default
tile size.

   You can clearly see that the default tile size is indeed much smaller
than this (huge) galaxy and its tidal features.  As a result,
NoiseChisel was unable to identify the skewness within the tiles under
the outer parts of M51 and NGC 5195 and the threshold has been
over-estimated on those tiles.  To see which tiles were used for
estimating the quantile threshold (those where no skewness was
measured), you can use NoiseChisel’s ‘--checkqthresh’ option:

     $ astnoisechisel r.fits -h0 --checkqthresh

   Did you see how NoiseChisel aborted after finding and applying the
quantile thresholds?  When you call any of NoiseChisel’s ‘--check*’
options, by default, it will abort as soon as all the check steps have
been written in the check file (a multi-extension FITS file).  This
allows you to focus on the problem you wanted to check as soon as
possible (you can disable this feature with the ‘--continueaftercheck’
option).

   To optimize the threshold-related settings for this image, let’s play
with this quantile threshold check image a little.  Don’t forget that
“_Good statistical analysis is not a purely routine matter, and
generally calls for more than one pass through the computer_” (Anscombe
1973, see *note Science and its tools::).  A good scientist must have a
good understanding of her tools to make a meaningful analysis.  So don’t
hesitate to play with the default configuration and review the
manual when you have a new dataset (from a new instrument) in front of
you.  Robust data analysis is an art, therefore a good scientist must
first be a good artist.  So let’s open the check image as a
multi-extension cube:

     $ ds9 -mecube r_qthresh.fits -zscale -cmap sls -zoom to fit

   The first extension (called ‘CONVOLVED’) of ‘r_qthresh.fits’ is the
convolved input image on which the thresholds are defined (and to which
they are later applied).  For more on the effect of convolution and
thresholding, see Sections 3.1.1 and 3.1.2 of Akhlaghi and Ichikawa
[2015] (https://arxiv.org/abs/1505.01664).  The second extension
(‘QTHRESH_ERODE’) has a blank/white value for all the pixels of any tile
that was identified as having significant signal.  The other tiles have
the measured threshold over them.  The next two extensions
(‘QTHRESH_NOERODE’ and ‘QTHRESH_EXPAND’) are the other two quantile
thresholds that are necessary in NoiseChisel’s later steps.  Every step
in this file is repeated on the three thresholds.

   Play a little with the color bar of the ‘QTHRESH_ERODE’ extension and
you will clearly see how the non-blank tiles around NGC 5195 have a
gradient.  As one line of attack against discarding too much signal
below the threshold, NoiseChisel rejects outlier tiles.  Go forward by
three extensions to ‘VALUE1_NO_OUTLIER’ and you will see that many of
the tiles over the galaxy have been removed in this step.  For more on
the outlier rejection algorithm, see the latter half of *note
Quantifying signal in a tile::.

   Even though much of the galaxy’s footprint has been rejected as
outliers, there are still tiles with signal remaining: play with the DS9
color-bar and you still see a gradient near the outer tidal feature of
the galaxy.  Before trying to correct this, let’s look at the other
extensions of this check image.  We will use a ‘*’ as a wild-card that
can be 1, 2 or 3.  In the ‘THRESH*_INTERP’ extensions, you see that all
the blank tiles have been interpolated using their nearest neighbors
(the relevant option here is ‘--interpnumngb’).  In the following
‘THRESH*_SMOOTH’ extensions, you can see the tile values after smoothing
(configured with ‘--smoothwidth’ option).  Finally, in
‘QTHRESH-APPLIED’, you see the thresholded image: pixels with a value of
1 will be eroded later, but pixels with a value of 2 will pass the
erosion step un-touched.

   Let’s get back to the problem of optimizing the result.  You have two
strategies for detecting the outskirts of the merging galaxies: 1)
Increase the tile size to get more accurate measurements of skewness.
2) Strengthen the outlier rejection parameters to discard more of the
tiles with signal.  Fortunately in this image we have a sufficiently
large region on the right of the image that the galaxy doesn’t extend
to.  So we can use the more robust first solution.  In situations where
this doesn’t happen (for example if the field of view in this image was
shifted to the left to have more of M51 and less sky) you are limited to
a combination of the two solutions or just to the second solution.

*Skipping convolution for faster tests:* The slowest step of NoiseChisel
is the convolution of the input dataset.  Therefore when your dataset is
large (unlike the one in this test), and you are not changing the input
dataset or kernel in multiple runs (as in the tests of this tutorial),
it is faster to do the convolution separately once (using *note
Convolve::) and use NoiseChisel’s ‘--convolved’ option to directly feed
the convolved image and avoid convolution.  For more on ‘--convolved’,
see *note NoiseChisel input::.

   To better identify the skewness caused by the flat NGC 5195 and M51
tidal features on the tiles under them, we have to choose a larger tile
size.  Let’s try a tile size of 100 by 100 pixels and inspect the check
image.

     $ astnoisechisel r.fits -h0 --tilesize=100,100 --checkqthresh
     $ ds9 -mecube r_qthresh.fits -zscale -cmap sls -zoom to fit

   You can clearly see the effect of this increased tile size: the tiles
are much larger and when you look into ‘VALUE1_NO_OUTLIER’, you see that
all the tiles are nicely grouped on the right side of the image (the
farthest from M51, where we don’t see a gradient in ‘QTHRESH_ERODE’).
Things look good now, so let’s remove ‘--checkqthresh’ and let
NoiseChisel proceed with its detection.

     $ astnoisechisel r.fits -h0 --tilesize=100,100
     $ ds9 -mecube r_detected.fits -zscale -cmap sls -zoom to fit

   The detected pixels of the ‘DETECTIONS’ extension have expanded a
little, but not by much.  Also, the gradient in the ‘SKY’ image is
almost fully removed (and doesn’t fall over M51 anymore).  However, on
the bottom-right of the M51 detection, we see many holes gradually
increasing in size.  This hints that there is still signal out there.
Let’s check the next series of detection steps by adding the
‘--checkdetection’ option this time:

     $ astnoisechisel r.fits -h0 --tilesize=100,100 --checkdetection
     $ ds9 -mecube r_detcheck.fits -zscale -cmap sls -zoom to fit

   The output now has 16 extensions, showing every step that is taken by
NoiseChisel.  The first and second (‘INPUT’ and ‘CONVOLVED’) are clear
from their names.  The third (‘THRESHOLDED’) is the thresholded image
after finding the quantile threshold (last extension of the output of
‘--checkqthresh’).  The fourth HDU (‘ERODED’) is new: it is the namesake
of NoiseChisel, that is, eroding pixels that are above the threshold.  By
erosion, we mean that all pixels with a value of ‘1’ (above the
threshold) that are touching a pixel with a value of ‘0’ (below the
threshold) will be flipped to zero (or “carved” out)(3).  You can see
its effect directly by going back and forth between the ‘THRESHOLDED’
and ‘ERODED’ extensions.
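
   The erosion rule described above can be sketched on a tiny text
“image” with the commands below.  This is only an illustration and not
NoiseChisel’s implementation: the ‘erode_once’ function is hypothetical,
we assume that “touching” means the four nearest neighbors, and pixels
outside the image are simply ignored.

```shell
# Hypothetical sketch (not NoiseChisel's code): erode a binary image once.
# Input: rows of space-separated pixel values on standard input.
# Assumption: 4-connectivity; pixels outside the image are ignored.
erode_once() {
    awk '
        { ncol = NF; for (c = 1; c <= NF; c++) img[NR, c] = $c }
        END {
            for (r = 1; r <= NR; r++) {
                line = ""
                for (c = 1; c <= ncol; c++) {
                    out = img[r, c]
                    # Only 1-valued pixels are eroded (2-valued pixels
                    # pass untouched), and only when touching a 0.
                    if (out == 1) {
                        if (r > 1    && img[r - 1, c] == 0) out = 0
                        if (r < NR   && img[r + 1, c] == 0) out = 0
                        if (c > 1    && img[r, c - 1] == 0) out = 0
                        if (c < ncol && img[r, c + 1] == 0) out = 0
                    }
                    sep = (c > 1) ? " " : ""
                    line = line sep out
                }
                print line
            }
        }'
}

# A 3x3 block of above-threshold pixels inside a 5x5 image.
printf '%s\n' "0 0 0 0 0" \
              "0 1 1 1 0" \
              "0 1 1 1 0" \
              "0 1 1 1 0" \
              "0 0 0 0 0" | erode_once
```

   After one erosion only the central pixel of the block survives,
because every other pixel of the block touches a 0-valued pixel.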

   In the fifth extension (‘OPENED-AND-LABELED’) the image is “opened”,
which is a name for eroding once, then dilating (dilation is the inverse
of erosion).  This is good to remove thin connections that are only due
to noise.  Each separate connected group of pixels is also given its
unique label here.  Do you see how just beyond the large M51 detection,
there are many smaller detections that get smaller as you go farther
out?  This hints at the solution: the default number of erosions is too
high.  Let’s see how many erosions take place by default (by adding
‘-P | grep erode’ to the previous command):

     $ astnoisechisel r.fits -h0 --tilesize=100,100 -P | grep erode

We see that the value of ‘erode’ is ‘2’.  The default NoiseChisel
parameters are primarily targeted to processed images (where there is
correlated noise due to all the processing that has gone into the
warping and stacking of raw images, see *note NoiseChisel optimization
for detection::).  In those scenarios 2 erosions are commonly necessary.
But here, we have a single-exposure image where there is no correlated
noise (the pixels aren’t mixed).  So let’s see how things change with
only one erosion:

     $ astnoisechisel r.fits -h0 --tilesize=100,100 --erode=1 \
                      --checkdetection
     $ ds9 -mecube r_detcheck.fits -zscale -cmap sls -zoom to fit

   Looking at the ‘OPENED-AND-LABELED’ extension again, we see that the
main/large detection is now much larger than before.  While the
immediately-outer connected regions are still present, they have
decreased dramatically, so we can pass this step.

   After the ‘OPENED-AND-LABELED’ extension, NoiseChisel goes on to
find false detections using the undetected pixels.  The process is
fully described in Section 3.1.5 (Defining and Removing False
Detections) of arXiv:1505.01664 (https://arxiv.org/pdf/1505.01664.pdf).
Please compare the extensions to what you read there and things will be
very clear.  In the last HDU (‘DETECTION-FINAL’), we have the final
detected pixels that will be used to estimate the Sky and its Standard
deviation.  We see that the main detection has indeed been detected very
far out, so let’s see how the full NoiseChisel will estimate the Sky and
its standard deviation (by removing ‘--checkdetection’):

     $ astnoisechisel r.fits -h0 --tilesize=100,100 --erode=1
     $ ds9 -mecube r_detected.fits -zscale -cmap sls -zoom to fit

   The ‘DETECTIONS’ extension of ‘r_detected.fits’ closely follows the
‘DETECTION-FINAL’ extension of the check image (looks good!).  If you go
ahead to the ‘SKY’ extension, things still look good.  But it can still
be improved.

   Look at the ‘DETECTIONS’ extension again: you will see that the
right-ward edges of M51’s detected pixels have many “holes” that are
fully surrounded by signal (value of ‘1’) and that the signal stretches
out in the noise very thinly (the size of the holes increases as we go
out).  This suggests that there is still undetected signal and that we
can still dig deeper into the noise.

   With the ‘--detgrowquant’ option, NoiseChisel will “grow” the
detections into the noise.  Its value is the ultimate limit of the
growth in units of quantile (between 0 and 1).  Therefore
‘--detgrowquant=1’ means no growth and ‘--detgrowquant=0.5’ means an
ultimate limit of the Sky level (which is usually too much and will
cover the whole image!).  See Figure 2 of arXiv:1909.11230
(https://arxiv.org/pdf/1909.11230.pdf) for more on this option.  Try
running the previous command with various values (from 0.6 to higher
values) to see this option’s effect on this dataset.  For this
particularly huge galaxy (with signal that extends very gradually into
the noise), we’ll set it to ‘0.75’:

     $ astnoisechisel r.fits -h0 --tilesize=100,100 --erode=1 \
                      --detgrowquant=0.75
     $ ds9 -mecube r_detected.fits -zscale -cmap sls -zoom to fit

   Beyond this level (smaller ‘--detgrowquant’ values), you see many of
the smaller background galaxies (towards the right side of the image)
starting to create thin spider-leg-like features, showing that we are
following correlated noise too far.  Please try it for yourself by
changing it to ‘0.6’ for example.
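
   To compare several growth limits side by side, a loop like the
sketch below can help.  It only prints the commands (a dry run); remove
the ‘echo’ to actually run them.  The ‘grow_trials’ function and the
‘r_grow_*.fits’ output names are our own illustrative choices, not
Gnuastro conventions.

```shell
# Print (dry run) one NoiseChisel command per growth quantile.
# Drop the leading "echo" to really run them (this needs Gnuastro).
# The output names are hypothetical, chosen to keep each result.
grow_trials() {
    for q in "$@"; do
        echo astnoisechisel r.fits -h0 --tilesize=100,100 --erode=1 \
             --detgrowquant="$q" --output="r_grow_$q.fits"
    done
}

grow_trials 0.6 0.7 0.75 0.8 0.9
```

   Keeping one output file per value lets you flip between the
‘DETECTIONS’ extensions of the different runs in your FITS viewer.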

   When you look at the ‘DETECTIONS’ extension of the command shown
above, you see the wings of the galaxy being detected much farther out,
but you also see many holes which are clearly just caused by noise.
After growing the objects, NoiseChisel also allows you to fill such
holes when they are smaller than a certain size through the
‘--detgrowmaxholesize’ option.  In this case, a maximum area/size of
10,000 pixels seems to be good:

     $ astnoisechisel r.fits -h0 --tilesize=100,100 --erode=1 \
                      --detgrowquant=0.75 --detgrowmaxholesize=10000
     $ ds9 -mecube r_detected.fits -zscale -cmap sls -zoom to fit

   When looking at the raw input image (which is very “shallow”: less
than a minute exposure!), you don’t see anything so far out of the
galaxy.  You might just think to yourself that “this is all noise, I
have just dug too deep and I’m following systematics”!  If you feel like
this, have a look at the deep images of this system in Watkins et al.
[2015] (https://arxiv.org/abs/1501.04599), or a 12 hour deep image of
this system (with a 12-inch telescope):
<https://i.redd.it/jfqgpqg0hfk11.jpg>(4).  In these deeper images you
clearly see how the outer edges of the M51 group follow this exact
structure.  Below, in *note Achieved surface brightness level::, we’ll
measure the exact level.

   As the gradient in the ‘SKY’ extension shows, and the deep images
cited above confirm, the galaxy’s signal extends even beyond this.  But
this is already far deeper than what most (if not all) other tools can
detect.  Therefore, we’ll stop configuring NoiseChisel at this point in
the tutorial and let you play with the other options a little more,
while reading more about it in the papers (Akhlaghi and Ichikawa [2015]
(https://arxiv.org/abs/1505.01664) and Akhlaghi [2019]
(https://arxiv.org/abs/1909.11230)) and *note NoiseChisel::.  When you
do find a better configuration feel free to contact us for feedback.
Don’t forget that good data analysis is an art, so like a sculptor,
master your chisel for a good result.

   To avoid typing all these options every time you run NoiseChisel on
this image, you can use Gnuastro’s configuration files, see *note
Configuration files::.  For an applied example of setting/using them,
see *note Option management and configuration files::.
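
   As a minimal sketch, the commands below write the options derived in
this section into a directory-level configuration file.  We are
assuming here that NoiseChisel reads ‘.gnuastro/astnoisechisel.conf’ in
the running directory and that each line holds an option’s long name
followed by its value; please confirm both in *note Configuration
files:: before relying on them.

```shell
# Assumed location of the per-directory NoiseChisel configuration.
mkdir -p .gnuastro
cat > .gnuastro/astnoisechisel.conf <<'EOF'
# NoiseChisel settings derived in this tutorial for the SDSS image.
# They are specific to this image: do not reuse them blindly.
 tilesize            100,100
 erode               1
 detgrowquant        0.75
 detgrowmaxholesize  10000
EOF
```

   With this file in place, running ‘astnoisechisel r.fits -h0’ in the
same directory should be equivalent to the long command above.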

*This NoiseChisel configuration is NOT GENERIC:* Don’t use the
configuration derived above, on another instrument’s image _blindly_.
If you are unsure, just use the default values.  As you saw above, the
reason we chose this particular configuration for NoiseChisel to detect
the wings of the M51 group was strongly influenced by the noise
properties of this particular image.  Remember *note NoiseChisel
optimization for detection::, where we looked into the very deep XDF
image which had strong correlated noise?

   As long as your other images have similar noise properties (from the
same data-reduction step of the same instrument), you can use your
configuration on any of them.  But for images from other instruments,
please follow a similar logic to what was presented in these tutorials
to find the optimal configuration.

*Smart NoiseChisel:* As you saw during this section, there is a clear
logic behind the optimal parameter value for each dataset.  Therefore,
we plan to add capabilities to (optionally) automate some of the choices
made here based on the actual dataset.  Please join us in doing this if
you are interested.  However, given the many problems in existing
“smart” solutions, such automatic changing of the configuration may
cause more problems than it solves.  So even when they are implemented,
we would strongly recommend quality checks for a robust analysis.

   ---------- Footnotes ----------

   (1) By constant, we mean that it has a single value in the region we
are measuring.

   (2) In processed images, where the Sky value can be over-estimated,
this constant shift can be negative.

   (3) Pixels with a value of ‘2’ are very high signal-to-noise pixels.
They are not eroded, to preserve sharp and bright sources.

   (4) The image is taken from this Reddit discussion:
<https://www.reddit.com/r/Astronomy/comments/9d6x0q/12_hours_of_exposure_on_the_whirlpool_galaxy/>


File: gnuastro.info,  Node: Image surface brightness limit,  Next: Achieved surface brightness level,  Prev: NoiseChisel optimization,  Up: Detecting large extended targets

2.3.3 Image surface brightness limit
------------------------------------

In *note NoiseChisel optimization:: we showed how to customize
NoiseChisel for a single-exposure SDSS image of the M51 group.  When
presenting your detection results in a paper or scientific conference,
usually the first thing that someone will ask (if you don’t explicitly
say it!), is the dataset’s _surface brightness limit_ (a standard
measure of the noise level), and your target’s surface brightness (a
measure of the signal, either in the center or outskirts, depending on
context).  For more on the basics of these important concepts please see
*note Quantifying measurement limits::.  Here, we’ll measure these
values for this image.

   Let’s start measuring the surface brightness limit by masking all the
detected pixels and having a look at the noise distribution with the
‘astarithmetic’ and ‘aststatistics’ commands below.

     $ astarithmetic r_detected.fits -hINPUT-NO-SKY set-in \
                     r_detected.fits -hDETECTIONS set-det \
                     in det nan where -odet-masked.fits
     $ ds9 det-masked.fits
     $ aststatistics det-masked.fits

From the ASCII histogram, we see that the distribution is roughly
symmetric.  We can also quantify this by measuring the skewness
(difference between mean and median, divided by the standard deviation):

     $ aststatistics det-masked.fits --mean --median --std \
                     | awk '{print ($1-$2)/$3}'

This shows that the mean is larger than the median by $0.08\sigma$.  In
other words, as we saw in *note NoiseChisel optimization::, a very small
residual signal still remains in the undetected regions (improving the
detection further was left to you as an exercise).  The difference is
small, so let’s continue with this result.  Now, we will use the masked
image and the surface brightness limit equation in *note Quantifying
measurement limits:: to measure the $3\sigma$ surface brightness limit
over an area of $25 \rm{arcsec}^2$:

     $ nsigma=3
     $ zeropoint=22.5
     $ areaarcsec2=25
     $ std=$(aststatistics det-masked.fits --sigclip-std)
     $ pixarcsec2=$(astfits det-masked.fits --pixelscale --quiet \
                            | awk '{print $3*3600*3600}')
     $ astarithmetic --quiet $nsigma $std x \
                     $areaarcsec2 $pixarcsec2 x \
                     sqrt / $zeropoint counts-to-mag
     26.0241
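
   The arithmetic done by the last command above can also be written
out explicitly.  The sketch below follows the same order of operations
($n\sigma$ divided by the square root of the area times the pixel area,
then converted to magnitudes with the zero point).  The ‘sb_limit’
function is hypothetical and the numbers given to it are illustrative
only, they are not measured from ‘r_detected.fits’.

```shell
# Hypothetical helper: n-sigma surface brightness limit (mag/arcsec^2).
# Arguments: nsigma, per-pixel std, area (arcsec^2), pixel area
# (arcsec^2) and zero point.
sb_limit() {
    awk -v n="$1" -v std="$2" -v area="$3" -v pix="$4" -v zp="$5" '
        BEGIN {
            # Noise extrapolated to the requested area, per arcsec^2.
            sb = n * std / sqrt(area * pix)
            # Convert to magnitudes: -2.5 log10(sb) + zeropoint.
            printf "%.4f\n", -2.5 * log(sb) / log(10) + zp
        }'
}

# Illustrative numbers only (not taken from the SDSS image).
sb_limit 3 0.01 25 0.16 22.5
```

   Replacing the illustrative values with your own ‘$std’ and
‘$pixarcsec2’ should reproduce the Arithmetic result, up to rounding.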

   The customizable steps above are good for any type of mask.  For
example your field of view may contain a very deep part so you need to
mask all the shallow parts _as well as_ the detections before these
steps.  But when your image is flat (like this), there is a much simpler
method to obtain the same value through MakeCatalog (when the standard
deviation image is made by NoiseChisel).  NoiseChisel has already
calculated the minimum (‘MINSTD’), maximum (‘MAXSTD’) and median
(‘MEDSTD’) standard deviation within the tiles during its processing and
has stored them as FITS keywords within the ‘SKY_STD’ HDU.  You can see
them by piping all the keywords in this HDU into ‘grep’.  In Grep, each
‘.’ represents one character that can be anything, so ‘M..STD’ will
match all three keywords mentioned above.

     $ astfits r_detected.fits --hdu=SKY_STD | grep 'M..STD'

   The ‘MEDSTD’ value is very similar to the standard deviation derived
above, so we can safely use it instead of having to mask and run
Statistics.  In fact, MakeCatalog also uses this keyword and will report
the dataset’s $n\sigma$ surface brightness limit as keywords in the
output (not as measurement columns, since it is related to the noise,
not labeled signal):

     $ astmkcatalog r_detected.fits -hDETECTIONS --output=sbl.fits \
                    --forcereadstd --ids

Before looking into the measured surface brightness limits, let’s review
some important points about this call to MakeCatalog first:
   • We are only concerned with the noise (not the signal), so we don’t
     ask for any further measurements, because they can un-necessarily
     slow it down.  However, MakeCatalog requires at least one column,
     so we’ll only ask for the ‘--ids’ column (which doesn’t need any
     measurement!).  The output catalog will therefore have a single row
     and a single column, with 1 as its value(1).
   • If we don’t ask for any noise-related column (for example the
     signal-to-noise ratio column with ‘--sn’, among other noise-related
     columns), MakeCatalog is not going to read the noise standard
     deviation image (again, to speed up its operation when it is
     redundant).  We are thus using the ‘--forcereadstd’ option (short
     for “force read standard deviation image”) here so it is ready for
     the surface brightness limit measurements that are written as
     keywords.

   With the command below you can see all the keywords that were
measured with the table.  Notice the group of keywords that are under
the “Surface brightness limit (SBL)” title.

     $ astfits sbl.fits -h1

Since all the keywords of interest here start with ‘SBL’, we can get a
cleaner view with this command.

     $ astfits sbl.fits -h1 | grep ^SBL

   Notice how the ‘SBLSTD’ has the same value as NoiseChisel’s ‘MEDSTD’
above.  Using ‘SBLSTD’, MakeCatalog has determined the $n\sigma$ surface
brightness limiting magnitude in these header keywords.  The multiple of
$\sigma$, or $n$, is the value of the ‘SBLNSIG’ keyword which you can
change with the ‘--sfmagnsigma’ option.  Two surface brightness limiting
magnitudes are reported: one within a pixel, and one (in ‘SBLMAG’)
within a pixel-agnostic area of ‘SBLAREA’ arcsec$^2$.

   You will notice that the two surface brightness limiting magnitudes
above have values around 3 and 4 (which is not correct!).  This is
because we haven’t given a zero point magnitude to MakeCatalog, so it
uses the default value of ‘0’.  SDSS image pixel values are calibrated
in units of “nanomaggy” which are defined to have a zero point magnitude
of 22.5(2).  So with the first command below we give the zero point
value and with the second we can see the surface brightness limiting
magnitudes with the correct values (around 25 and 26):

     $ astmkcatalog r_detected.fits -hDETECTIONS --zeropoint=22.5 \
                    --output=sbl.fits --forcereadstd --ids
     $ astfits sbl.fits -h1 | grep ^SBL

   As you see from ‘SBLNSIG’ and ‘SBLAREA’, the default multiple of
sigma is 1 and the default area is 1 arcsec$^2$.  Usually higher values
are used for these two parameters.  Following the manual example we did
above, you can ask for the multiple of sigma to be 3 and the area to be
25 arcsec$^2$:

     $ astmkcatalog r_detected.fits -hDETECTIONS --zeropoint=22.5 \
                    --output=sbl.fits --sfmagarea=25 --sfmagnsigma=3 \
                    --forcereadstd --ids
     $ astfits sbl.fits -h1 | awk '/^SBLMAG /{print $3}'
     26.02296

   You see that the value is effectively identical to the custom surface
brightness limiting magnitude we measured above (a difference of
$0.00114$ magnitudes is negligible and hundreds of times smaller than
the typical errors in the zero point magnitude or magnitude
measurements).  But it is much easier to have MakeCatalog do this
measurement, because these values will be appended (as keywords) into
your final catalog of objects within that image.

*Custom STD for MakeCatalog’s Surface brightness limit:* You can
manually change/set the value of the ‘MEDSTD’ keyword in your input STD
image with *note Fits:::

     $ std=$(aststatistics masked.fits --sigclip-std)
     $ astfits noisechisel.fits -hSKY_STD --update=MEDSTD,$std

   With this change, MakeCatalog will use your custom standard deviation
for the surface brightness limit.  This is necessary in scenarios where
your image has multiple depths and during your masking, you also mask
the shallow regions (as well as the detections of course).

   We have successfully measured the image’s $3\sigma$ surface
brightness limiting magnitude over 25 arcsec$^2$.  However, as discussed
in *note Quantifying measurement limits:: this value is just an
extrapolation of the per-pixel standard deviation.  Issues like
correlated noise will cause the real noise over a large area to be
different.  So for a more robust measurement, let’s use the upper-limit
magnitude of a similarly sized region.  For more on the upper-limit
magnitude, see the respective item in *note Quantifying measurement
limits::.

   In summary, the upper-limit measurements involve randomly placing the
footprint of an object in undetected parts of the image many times.
This results in a random distribution of brightness measurements.  The
standard deviation of that distribution is then converted into
magnitudes.  To be comparable with the results above, let’s make a
circular aperture that has an area of 25 arcsec$^2$ (thus with a radius
of $2.82095$ arcsec).
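
   The radius quoted above follows from the area of a circle
($A=\pi r^2$, so $r=\sqrt{A/\pi}$).  As a quick check, the sketch below
does this calculation with AWK; the ‘circle_radius’ function name is
hypothetical.

```shell
# Radius (in arcsec) of a circle with the given area (in arcsec^2).
# The helper name is hypothetical; pi is obtained from atan2.
circle_radius() {
    awk -v area="$1" 'BEGIN {
        pi = atan2(0, -1)
        printf "%.5f\n", sqrt(area / pi)
    }'
}

circle_radius 25
```

   You can use the same function to find the radius for any other
aperture area that you want to compare with.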

     zeropoint=22.5
     r_arcsec=2.82095

     ## Convert the radius (in arcseconds) to pixels.
     r_pixel=$(astfits r_detected.fits --pixelscale -q \
                       | awk '{print '$r_arcsec'/($1*3600)}')

     ## Make circular aperture at pixel (100,100); position is irrelevant.
     echo "1 100 100 5 $r_pixel 0 0 1 1 1" \
          | astmkprof --background=r_detected.fits \
                      --clearcanvas --mforflatpix --type=uint8 \
                      --output=lab.fits

     ## Do the upper-limit measurement, using all NoiseChisel's
     ## detections as a mask for the upper-limit measurements.
     astmkcatalog lab.fits -h1 --zeropoint=$zeropoint -osbl.fits \
                  --sfmagarea=25 --sfmagnsigma=3 --forcereadstd \
                  --valuesfile=r_detected.fits --valueshdu=INPUT-NO-SKY \
                  --upmaskfile=r_detected.fits --upmaskhdu=DETECTIONS \
                  --upnsigma=3 --checkuplim=1 --upnum=1000 \
                  --ids --upperlimitsb

   The ‘sbl.fits’ catalog now contains the upper-limit surface
brightness for a circle with an area of 25 arcsec$^2$.  You can check
the value with the command below, but the great thing is that now you
have both the surface brightness limiting magnitude in the headers
discussed above, and the upper-limit surface brightness within the
table.  You can also add more profiles with different shapes and sizes
if necessary.  Of course, you can also use ‘--upperlimitsb’ in your
actual science objects and clumps to get an object-specific or
clump-specific value.

     $ asttable sbl.fits -cUPPERLIMIT_SB
     25.9119

You will get a slightly different value from the one shown above.  In
fact, if you run the MakeCatalog command again and look at the measured
upper-limit surface brightness, it will be slightly different from your
first trial!  Please try exactly the same MakeCatalog command above a
few times to see how it changes.

   This is because of the _random_ factor in the upper-limit
measurements: every time you run it, different random points will be
checked, resulting in a slightly different distribution.  You can
decrease the random scatter by increasing the number of random checks
(for example setting ‘--upnum=100000’, compared to 1000 in the command
above).  But this will be slower and the results won’t be exactly
reproducible.  The only way to ensure you get an identical result later
is to fix the random number generator function and seed like the command
below(3).  This is a very important point regarding any statistical
process involving random numbers, please see *note Generating random
numbers::.

     export GSL_RNG_TYPE=ranlxs1
     export GSL_RNG_SEED=1616493518
     astmkcatalog lab.fits -h1 --zeropoint=$zeropoint -osbl.fits \
                  --sfmagarea=25 --sfmagnsigma=3 --forcereadstd \
                  --valuesfile=r_detected.fits --valueshdu=INPUT-NO-SKY \
                  --upmaskfile=r_detected.fits --upmaskhdu=DETECTIONS \
                  --upnsigma=3 --checkuplim=1 --upnum=1000 \
                  --ids --upperlimitsb --envseed
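
   To get a feeling for the role of the seed, the toy Monte Carlo below
mimics the idea of the upper-limit measurement with AWK.  It is not
MakeCatalog’s implementation: instead of placing apertures on the real
image, the hypothetical ‘upper_limit_sigma’ function sums pure Gaussian
(uncorrelated, unit standard deviation) noise pixels, and it uses AWK’s
own random number generator rather than GSL’s.

```shell
# Toy sketch (not MakeCatalog): standard deviation of NUM random
# "aperture" sums, each adding NPIX Gaussian noise pixels (std of 1).
# Arguments: SEED NUM NPIX.
upper_limit_sigma() {
    awk -v seed="$1" -v num="$2" -v npix="$3" '
        # One Gaussian random value with the Box-Muller transform.
        function gauss() {
            return sqrt(-2 * log(1 - rand())) * cos(6.283185307179586 * rand())
        }
        BEGIN {
            srand(seed)
            for (i = 0; i < num; i++) {
                s = 0
                for (j = 0; j < npix; j++) s += gauss()
                sum += s; sumsq += s * s
            }
            mean = sum / num
            printf "%.4f\n", sqrt(sumsq / num - mean * mean)
        }'
}

# The same seed gives the same value on every run (with the same AWK).
upper_limit_sigma 1616493518 1000 16
upper_limit_sigma 1616493518 1000 16
```

   Changing the seed changes the printed value slightly, and for
uncorrelated noise it stays close to the square root of the number of
pixels in the aperture (4 in this example).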

   But where do all the random apertures of the upper-limit measurement
fall on the image?  It is good to actually inspect their location to get
a better understanding for the process and also detect possible
bugs/biases.  When MakeCatalog is run with the ‘--checkuplim’ option, it
will print all the random locations and their measured brightness as a
table in a file with the suffix ‘_upcheck.fits’.  With the first command
below you can use Gnuastro’s ‘asttable’ and ‘astscript-ds9-region’ to
convert the successful aperture locations into a DS9 region file, and
with the two ‘ds9’ commands after it you can load the region file over
the sky-subtracted image and the detections to see where they are.

     ## Create a DS9 region file from the check table (activated
     ## with '--checkuplim')
     asttable lab_upcheck.fits --noblank=RANDOM_SUM \
              | astscript-ds9-region -c1,2 --mode=img \
                                     --radius=$r_pixel

     ## Have a look at the regions in relation to NoiseChisel's
     ## detections.
     ds9 r_detected.fits[INPUT-NO-SKY] -regions load ds9.reg
     ds9 r_detected.fits[DETECTIONS] -regions load ds9.reg

   In this example, we were looking at a single-exposure image that has
no correlated noise.  Because of this, the surface brightness limit and
the upper-limit surface brightness are very close.  They will have a
bigger difference on deep datasets with stronger correlated noise (that
are the result of stacking many individual exposures).  As an exercise,
please try measuring the upper-limit surface brightness level and
surface brightness limit for the deep HST data that we used in the
previous tutorial (*note General program usage tutorial::).

   ---------- Footnotes ----------

   (1) Recall that NoiseChisel’s output is a binary image: 0-valued
pixels are noise and 1-valued pixels are signal.  NoiseChisel doesn’t
identify sub-structure over the signal; this is the job of Segment, see
*note Extract clumps and objects::.

   (2) From <https://www.sdss.org/dr12/algorithms/magnitudes>

   (3) You can use any integer for the seed.  One recommendation is to
run MakeCatalog without ‘--envseed’ once and use the randomly generated
seed that is printed on the terminal.


File: gnuastro.info,  Node: Achieved surface brightness level,  Next: Extract clumps and objects,  Prev: Image surface brightness limit,  Up: Detecting large extended targets

2.3.4 Achieved surface brightness level
---------------------------------------

In *note NoiseChisel optimization:: we customized NoiseChisel for a
single-exposure SDSS image of the M51 group and in *note Image surface
brightness limit:: we measured the surface brightness limit and the
upper-limit surface brightness level (which are both measures of the
noise level).  In this section, let’s do some measurements on the
outer-most edges of the M51 group to see how they relate to the noise
measurements found in the previous section.

   For this measurement, we’ll need to estimate the average flux on the
outer edges of the detection.  Fortunately all this can be done with a
few simple commands using *note Arithmetic:: and *note MakeCatalog::.
First, let’s separate each detected region, or give a unique
label/counter to all the connected pixels of NoiseChisel’s detection map
with the command below.  Recall that with the ‘set-’ operator, the
popped operand will be given a name (‘det’ in this case) for easy usage
later.

     $ astarithmetic r_detected.fits -hDETECTIONS set-det \
                     det 2 connected-components -olabeled.fits

   You can find the label of the main galaxy visually (by opening the
image and hovering your mouse over the M51 group’s label).  But to have
a little more fun, let’s do this automatically (which is necessary in a
general scenario).  The M51 group detection is by far the largest
detection in this image, which allows us to find its ID/label easily.
We’ll first run MakeCatalog to find the area of all the labels, then
we’ll use Table to find the ID of the largest object and keep it as a
shell variable (‘id’):

     # Run MakeCatalog to find the area of each label.
     $ astmkcatalog labeled.fits --ids --geoarea -h1 -ocat.fits

     ## Sort the table by the area column.
     $ asttable cat.fits --sort=AREA_FULL

     ## The largest object is the last one, so we'll use '--tail'.
     $ asttable cat.fits --sort=AREA_FULL --tail=1

     ## We only want the ID, so let's only ask for that column:
     $ asttable cat.fits --sort=AREA_FULL --tail=1 --column=OBJ_ID

     ## Now, let's put this result in a variable (instead of printing)
     $ id=$(asttable cat.fits --sort=AREA_FULL --tail=1 --column=OBJ_ID)

     ## Just to confirm everything is fine.
     $ echo $id

We can now use the ‘id’ variable to reject all other detections:

     $ astarithmetic labeled.fits $id eq -oonly-m51.fits

   Open the image and have a look.  To separate the outer edges of the
detections, we’ll need to “erode” the M51 group detection.  So in the
same Arithmetic command as above, we’ll erode three times (to have more
pixels and thus less scatter), using a maximum connectivity of 2
(8-connected neighbors).  We’ll then save the output in ‘eroded.fits’.

     $ astarithmetic labeled.fits $id eq 2 erode 2 erode 2 erode \
                     -oeroded.fits

In ‘labeled.fits’, we can now set all the 1-valued pixels of
‘eroded.fits’ to 0 using Arithmetic’s ‘where’ operator added to the
previous command.  We’ll need the pixels of the M51 group in
‘labeled.fits’ two times: once to do the erosion, another time to find
the outer pixel layer.  To do this (and be efficient and more readable)
we’ll use the ‘set-i’ operator (to give this image the name ‘i’).  In
this way we can use it any number of times afterwards, while only
reading it from disk and finding M51’s pixels once.

     $ astarithmetic labeled.fits $id eq set-i i \
                     i 2 erode 2 erode 2 erode 0 where -oedge.fits
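
   The logic of this command (erode, then clear the eroded pixels from
the original) can be tried on a toy image without any FITS file.  The
sketch below is not how Arithmetic is implemented: ‘outer_layer’ is a
hypothetical awk helper that reads a binary image as rows of ‘0’ and ‘1’
characters, erodes it only once (not three times) with 8-connected
neighbors, and treats positions outside the image as 0, which is an
assumption of this toy.

```shell
# Print the outer one-pixel layer of a binary image given as rows of
# 0/1 characters: erode once, then keep only the pixels that were 1 in
# the input but did not survive the erosion.
outer_layer() {
    awk '
    { h = NR; w = length($0)
      for (x = 1; x <= w; x++) img[NR, x] = substr($0, x, 1) + 0 }
    END {
      for (y = 1; y <= h; y++) {
        row = ""
        for (x = 1; x <= w; x++) {
          eroded = img[y, x]
          # A pixel survives erosion only if it and all 8 neighbors
          # are 1; positions outside the image count as 0 (assumption).
          for (dy = -1; dy <= 1; dy++)
            for (dx = -1; dx <= 1; dx++)
              if (!((y + dy, x + dx) in img) || img[y + dy, x + dx] == 0)
                eroded = 0
          bit = (img[y, x] == 1 && eroded == 0) ? 1 : 0
          row = row bit
        }
        print row
      }
    }'
}

# A 5x5 block of signal inside a 7x7 image.
printf '%s\n' 0000000 0111110 0111110 0111110 \
              0111110 0111110 0000000 | outer_layer
```

   Only the one-pixel ring around the block is left as 1.  Eroding more
times before clearing, as in the command above, gives a thicker ring
and therefore more pixels to average over.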

   Open the image and have a look.  You’ll see that the detected edge of
the M51 group is now clearly visible.  You can use ‘edge.fits’ to mark
(set to blank) this boundary on the input image and get a visual feeling
of how far it extends:

     $ astarithmetic r.fits -h0 edge.fits nan where -oedge-masked.fits

   To quantify how deep we have detected the low-surface brightness
regions (in units of signal-to-noise ratio), we’ll use the command
below.  In short, it divides the Sky-subtracted input (first extension
of NoiseChisel’s output) by the Sky standard deviation pixel by pixel,
and keeps only the pixels that are non-zero in ‘edge.fits’.  This gives
us a signal-to-noise ratio image of the edge.  The mean value of this
image shows the level of surface brightness that we have achieved.  You
can also break
the command below into multiple calls to Arithmetic and create temporary
files to understand it better.  However, if you have a look at *note
Reverse polish notation:: and *note Arithmetic operators::, you should
be able to easily understand what your computer does when you run this
command(1).

     $ astarithmetic edge.fits -h1                  set-edge \
                     r_detected.fits -hSKY_STD      set-skystd \
                     r_detected.fits -hINPUT-NO-SKY set-skysub \
                     skysub skystd / edge not nan where meanvalue --quiet
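
   The same divide, mask and average steps can be followed on a few
made-up pixels.  In the sketch below each input line is one pixel with
three columns (edge flag, sky-subtracted value and sky standard
deviation); the values and the ‘masked_mean_sn’ helper are hypothetical
and only mirror the logic of the command above, they are not part of
Gnuastro.

```shell
# Mean per-pixel S/N over the pixels flagged as edge.
# Input columns: edge flag (0 or 1), sky-subtracted value, sky STD.
masked_mean_sn() {
    awk '$1 == 1 { sum += $2 / $3; n++ }
         END { if (n) printf "%.3f\n", sum / n; else print "nan" }'
}

# Four toy pixels: only the first and third are on the edge, so the
# bright second and fourth pixels do not enter the mean.
printf '%s\n' '1 0.5 1.0' '0 9.0 1.0' '1 0.6 2.0' '0 7.0 1.0' \
    | masked_mean_sn
```

   Here the two edge pixels have S/N values of 0.5 and 0.3, so the
printed mean is 0.400; pixels off the edge play the role of the blank
(NaN) pixels that ‘meanvalue’ ignores.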

   We have thus detected the wings of the M51 group down to roughly
1/3rd of the noise level in this image, which is a very good
achievement!  But the per-pixel S/N is a relative measurement.  Let’s
also measure the depth of our detection in absolute surface brightness
units, or magnitudes per square arcsecond (see *note Brightness flux
magnitude::).  We’ll also ask for the S/N and magnitude of the full edge
we have defined.  Fortunately doing this is very easy with Gnuastro’s
MakeCatalog:

     $ astmkcatalog edge.fits -h1 --valuesfile=r_detected.fits \
                    --zeropoint=22.5 --ids --surfacebrightness --sn \
                    --magnitude
     $ asttable edge_cat.fits
     1      25.6971       55.2406       15.8994

   We have thus reached an outer surface brightness of 25.70
magnitudes/arcsec^2 (second column in ‘edge_cat.fits’) on this
single-exposure SDSS image!  This is very similar to the surface
brightness limit measured in *note Image surface brightness limit::
(which is a big achievement!).  But another point in the result above is
very interesting: the total S/N of the edge is 55.24 with a total edge
magnitude(2) of 15.90!  This is very large for such a faint signal
(recall that the mean S/N per pixel was 0.32) and shows a very important
point in the study of galaxies: while the per-pixel signal in their
outer edges may be very faint (and invisible to the eye in noise), a lot
of signal hides deeply buried in the noise.
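
   A rough way to see how a small per-pixel S/N can add up to a large
total is the textbook case of independent pixels with equal noise: the
summed signal grows with the number of pixels, while the noise only
grows with its square root.  The sketch below assumes exactly that and
ignores the noise of the signal itself; it is not MakeCatalog’s
definition of ‘--sn’, and ‘total_sn’ is a hypothetical helper.

```shell
# Integrated S/N of n pixels that all have the same per-pixel S/N,
# assuming independent noise of equal variance in every pixel:
#     total = per_pixel * sqrt(n)
total_sn() {
    awk -v sn="$1" -v n="$2" 'BEGIN { printf "%.2f\n", sn * sqrt(n) }'
}

# Per-pixel S/N of 0.32 (the mean found above) over growing areas.
for n in 100 10000 30000; do
    echo "$n pixels: $(total_sn 0.32 "$n")"
done
```

   Under these assumptions a few tens of thousands of pixels at a
per-pixel S/N of 0.32 already reach a total of several tens, the same
order as the value reported above.  This is only an order-of-magnitude
check and not a measurement of the edge’s area.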

   In interpreting this value, you should just have in mind that
NoiseChisel works based on the contiguity of signal in the pixels.
Therefore the larger the object, the deeper NoiseChisel can carve it out
of the noise (for the same outer surface brightness).  In other words,
this reported depth is the depth we have reached for this object in
this dataset, processed with this particular NoiseChisel configuration.
If the M51 group in this image was larger/smaller than this (the field
of view was smaller/larger), or if the image was from a different
instrument, or if we had used a different configuration, we would go
deeper/shallower.

   ---------- Footnotes ----------

   (1) The command first divides the sky-subtracted input image
(‘r_detected.fits’, extension ‘INPUT-NO-SKY’, which we called ‘skysub’)
by the sky standard deviation (‘r_detected.fits’, extension ‘SKY_STD’,
which we called ‘skystd’).  This gives the signal-to-noise ratio (S/N)
of every pixel.  ‘edge.fits’ (extension ‘1’) is a binary (0 or 1 valued)
image.  Applying the ‘not’ operator on it just flips all its pixels
(from ‘0’ to ‘1’ and vice-versa).  Using the ‘where’ operator, we then
set all the newly 1-valued pixels (pixels that aren’t on the edge) to
NaN/blank in the S/N image, so only the pixels on the boundary remain.
Finally, with the ‘meanvalue’ operator, we take the mean value of all
the non-blank pixels and report that as a single number.

   (2) You can run MakeCatalog on ‘only-m51.fits’ instead of ‘edge.fits’
to see the full magnitude of the M51 group in this image.