This is gnuastro.info, produced by makeinfo version 6.8 from
gnuastro.texi.

This book documents version 0.16 of the GNU Astronomy Utilities
(Gnuastro).  Gnuastro provides various programs and libraries for
astronomical data manipulation and analysis.

Copyright © 2015-2021, Free Software Foundation, Inc.

     Permission is granted to copy, distribute and/or modify this
     document under the terms of the GNU Free Documentation License,
     Version 1.3 or any later version published by the Free Software
     Foundation; with no Invariant Sections, no Front-Cover Texts, and
     no Back-Cover Texts.  A copy of the license is included in the
     section entitled “GNU Free Documentation License”.
INFO-DIR-SECTION Astronomy
START-INFO-DIR-ENTRY
* Gnuastro: (gnuastro).                  GNU Astronomy Utilities.
* libgnuastro: (gnuastro)Gnuastro library.  Full Gnuastro library doc.
* help-gnuastro: (gnuastro)help-gnuastro mailing list.  Getting help.
* bug-gnuastro: (gnuastro)Report a bug.  How to report bugs.
* Arithmetic: (gnuastro)Arithmetic.  Arithmetic operations on pixels.
* astarithmetic: (gnuastro)Invoking astarithmetic.  Options to Arithmetic.
* BuildProgram: (gnuastro)BuildProgram.  Compile and run programs using Gnuastro’s library.
* astbuildprog: (gnuastro)Invoking astbuildprog.  Options to BuildProgram.
* ConvertType: (gnuastro)ConvertType.  Convert different file types.
* astconvertt: (gnuastro)Invoking astconvertt.  Options to ConvertType.
* Convolve: (gnuastro)Convolve.  Convolve an input file with kernel.
* astconvolve: (gnuastro)Invoking astconvolve.  Options to Convolve.
* CosmicCalculator: (gnuastro)CosmicCalculator.  For cosmological params.
* astcosmiccal: (gnuastro)Invoking astcosmiccal.  Options to CosmicCalculator.
* Crop: (gnuastro)Crop.  Crop region(s) from image(s).
* astcrop: (gnuastro)Invoking astcrop.  Options to Crop.
* Fits: (gnuastro)Fits.  View and manipulate FITS extensions and keywords.
* astfits: (gnuastro)Invoking astfits.  Options to Fits.
* MakeCatalog: (gnuastro)MakeCatalog.  Make a catalog from labeled image.
* astmkcatalog: (gnuastro)Invoking astmkcatalog.  Options to MakeCatalog.
* MakeNoise: (gnuastro)MakeNoise.  Make (add) noise to an image.
* astmknoise: (gnuastro)Invoking astmknoise.  Options to MakeNoise.
* MakeProfiles: (gnuastro)MakeProfiles.  Make mock profiles.
* astmkprof: (gnuastro)Invoking astmkprof.  Options to MakeProfiles.
* Match: (gnuastro)Match.  Match two separate catalogs.
* astmatch: (gnuastro)Invoking astmatch.  Options to Match.
* NoiseChisel: (gnuastro)NoiseChisel.  Detect signal in noise.
* astnoisechisel: (gnuastro)Invoking astnoisechisel.  Options to NoiseChisel.
* Segment: (gnuastro)Segment.  Segment detections based on signal structure.
* astsegment: (gnuastro)Invoking astsegment.  Options to Segment.
* Query: (gnuastro)Query.  Access remote databases for downloading data.
* astquery: (gnuastro)Invoking astquery.  Options to Query.
* Statistics: (gnuastro)Statistics.  Get image Statistics.
* aststatistics: (gnuastro)Invoking aststatistics.  Options to Statistics.
* Table: (gnuastro)Table.  Read and write FITS binary or ASCII tables.
* asttable: (gnuastro)Invoking asttable.  Options to Table.
* Warp: (gnuastro)Warp.  Warp a dataset to a new grid.
* astwarp: (gnuastro)Invoking astwarp.  Options to Warp.
* astscript: (gnuastro)Installed scripts.  Gnuastro’s installed scripts.
* astscript-sort-by-night: (gnuastro)Invoking astscript-sort-by-night.  Options to this script.
* astscript-radial-profile: (gnuastro)Invoking astscript-radial-profile.  Options to this script.
* astscript-ds9-region: (gnuastro)Invoking astscript-ds9-region.  Options to this script.
END-INFO-DIR-ENTRY


File: gnuastro.info, Node: Bounding box, Next: Polygons, Prev: Tessellation library, Up: Gnuastro library

11.3.16 Bounding box (‘box.h’)
------------------------------

Functions related to reporting the bounding box of certain inputs are declared in ‘gnuastro/box.h’.
All coordinates in this header are in the FITS format (the first axis is the horizontal and the second axis is the vertical).

 -- Function: void gal_box_bound_ellipse_extent (double a, double b, double theta_deg, double *extent)
     Return the maximum extent along each dimension of the given ellipse from the center of the ellipse.  Therefore, this is half the extent of the box in each dimension.  ‘a’ is the ellipse semi-major axis, ‘b’ is the semi-minor axis, and ‘theta_deg’ is the position angle in degrees.  The extent in each dimension is in floating point format and is stored in ‘extent’, which must already be allocated before calling this function.

 -- Function: void gal_box_bound_ellipse (double a, double b, double theta_deg, long *width)
     Any ellipse can be enclosed in a rectangular box.  This function will write the height and width of that box where ‘width’ points to.  It assumes the center of the ellipse is located within the central pixel of the box.  ‘a’ is the ellipse semi-major axis length, ‘b’ is the semi-minor axis, and ‘theta_deg’ is the position angle in degrees.  The ‘width’ array will contain the output size in long integer type: ‘width[0]’ and ‘width[1]’ are the number of pixels along the first and second FITS axis.  Since the ellipse center is assumed to be in the center of the box, all the values in ‘width’ will be odd integers.

 -- Function: void gal_box_bound_ellipsoid_extent (double *semiaxes, double *euler_deg, double *extent)
     Return the maximum extent along each dimension of the given ellipsoid from its center.  Therefore, this is half the extent of the box in each dimension.  The semi-axis lengths of the ellipsoid must be present in the 3-element ‘semiaxes’ array.  The ‘euler_deg’ array contains the three ellipsoid Euler angles in degrees.  For a description of the Euler angles, see the description of ‘gal_box_bound_ellipsoid’ below.  The extent in each dimension is in floating point format and is stored in ‘extent’, which must already be allocated before calling this function.
 -- Function: void gal_box_bound_ellipsoid (double *semiaxes, double *euler_deg, long *width)
     Any ellipsoid can be enclosed in a rectangular volume/box.  The purpose of this function is to give the integer size/width of that box.  The semi-axis lengths of the ellipsoid must be in the ‘semiaxes’ array (with three elements).  The major axis length must be the first element of ‘semiaxes’.  The only other condition is that the next two semi-axes must both be smaller than the first.  The orientation of the major axis is defined through three proper Euler angles (ZXZ order, in degrees) that are given in the ‘euler_deg’ array.  The ‘width’ array will contain the output size in long integer type (in FITS axis order).  Since the ellipsoid center is assumed to be in the center of the box, all the values in ‘width’ will be odd integers.

     The proper Euler angles can be defined in many ways (which axes to rotate about).  For a full description of the Euler angles, please see Wikipedia (https://en.wikipedia.org/wiki/Euler_angles).  Here we adopt the ZXZ (or $Z_1X_2Z_3$) proper Euler angles, where the first rotation is done around the Z axis, the second one about the (rotated) X axis, and the third about the (rotated) Z axis.

 -- Function: void gal_box_border_from_center (double *center, size_t ndim, long *width, long *fpixel, long *lpixel)
     Given the center coordinates in ‘center’ and the ‘width’ (along each dimension) of a box, return the coordinates of the first (‘fpixel’) and last (‘lpixel’) pixels.  All arrays must have ‘ndim’ elements (one for each dimension).

 -- Function: int gal_box_overlap (long *naxes, long *fpixel_i, long *lpixel_i, long *fpixel_o, long *lpixel_o, size_t ndim)
     An ‘ndim’-dimensional dataset of size ‘naxes’ (along each dimension, in FITS order) and a box with first and last (inclusive) coordinates of ‘fpixel_i’ and ‘lpixel_i’ is given.  This box doesn’t necessarily have to lie within the dataset; it can be outside of it, or only partially overlap.
This function will change the values of ‘fpixel_i’ and ‘lpixel_i’ to exactly cover the overlap in the input dataset’s coordinates.  It will return 1 if there is an overlap and 0 if there isn’t.  When there is an overlap, the coordinates of the first and last pixels of the overlap will be put in ‘fpixel_o’ and ‘lpixel_o’.


File: gnuastro.info, Node: Polygons, Next: Qsort functions, Prev: Bounding box, Up: Gnuastro library

11.3.17 Polygons (‘polygon.h’)
------------------------------

Polygons are commonly necessary in image processing.  For example, in Crop they are used for cutting out non-rectangular regions of an image (see *note Crop::), and in Warp, for mapping different pixel grids over each other (see *note Warp::).

Polygons come in two classes: convex and concave (or generally, non-convex!), see below for a demonstration.  Convex polygons are those where all inner angles are less than 180 degrees.  By contrast, a concave polygon is one where an inner angle may be more than 180 degrees.

                 Concave Polygon        Convex Polygon

                  D --------C          D------------- C
                   \        |        E /              |
                    \E      |          \              |
                    /       |           \             |
                   A--------B            A ----------B

In all the functions here the vertices (and points) are defined as an array.  So a polygon with 4 vertices will be identified with an array of 8 elements, with the first two elements keeping the 2D coordinates of the first vertex, and so on.

 -- Macro: GAL_POLYGON_MAX_CORNERS
     The largest number of vertices a polygon can have in this library.

 -- Macro: GAL_POLYGON_ROUND_ERR
     We have to consider floating point round-off errors when dealing with polygons.  For example, we will take ‘A’ as the maximum of ‘A’ and ‘B’ when ‘A>B-GAL_POLYGON_ROUND_ERR’.

 -- Function: void gal_polygon_vertices_sort_convex (double *in, size_t n, size_t *ordinds)
     We have a simple polygon (one whose edges do not intersect and which has no holes, as can result from a projection) and we want to order its vertices in an anticlockwise fashion.  This is necessary for clipping it and finding its area later.
     The input vertices can have practically any order.  The input (‘in’) is an array containing the coordinates (two values) of each vertex.  ‘n’ is the number of vertices, so ‘in’ should have ‘2*n’ elements.  The output (‘ordinds’) is an array with ‘n’ elements specifying the indices in order.  This array must have been allocated before calling this function.

     The indices are output for more generic usage.  For example, in a homographic transform (necessary when warping an image, see *note Warping basics::), the necessary order of vertices is the same for all the pixels.  In other words, only the positions of the vertices change, not the way they need to be ordered.  Therefore, this function would only be necessary once.

     As a summary, the input is unchanged; only ‘n’ values will be put in the ‘ordinds’ array, such that calling the input coordinates in the following fashion will give an anticlockwise order when there are 4 vertices:

          1st vertex: in[ordinds[0]*2], in[ordinds[0]*2+1]
          2nd vertex: in[ordinds[1]*2], in[ordinds[1]*2+1]
          3rd vertex: in[ordinds[2]*2], in[ordinds[2]*2+1]
          4th vertex: in[ordinds[3]*2], in[ordinds[3]*2+1]

     The implementation of this is very similar to the Graham scan for finding the convex hull.  However, in projection we will never have a concave polygon (the left case in the demonstration at the start of this section, where this algorithm would get to E before D); we will always have a convex polygon (the right case), or E won’t exist!  This is because we are always going to be calculating the area of the overlap between a quadrilateral and the pixel grid, or the quadrilateral itself.

     The ‘GAL_POLYGON_MAX_CORNERS’ macro is defined so there will be no need to allocate these temporary arrays separately.  Since we are dealing with pixels, the polygon can’t really have too many vertices.

 -- Function: int gal_polygon_is_convex (double *v, size_t n)
     Returns ‘1’ if the polygon with vertices defined by ‘v’ is convex and ‘0’ if it is a concave polygon.
     Note that the vertices of the polygon should be sorted in an anticlockwise manner.

 -- Function: double gal_polygon_area (double *v, size_t n)
     Find the area of a polygon with vertices defined in ‘v’.  ‘v’ points to an array of doubles which keeps the positions of the vertices, such that ‘v[0]’ and ‘v[1]’ are the positions of the first vertex to be considered.

 -- Function: int gal_polygon_is_inside (double *v, double *p, size_t n)
     Returns ‘1’ if the point ‘p’ is inside the polygon whose vertices are defined by ‘v’ (which can be either convex or concave), and ‘0’ otherwise.  The vertices have to be ordered in an anticlockwise manner.  This function uses the winding number algorithm (https://en.wikipedia.org/wiki/Point_in_polygon#Winding_number_algorithm) to check the points.  Note that this is a generic function (working on both concave and convex polygons), so if you know beforehand that your polygon is convex, it is much more efficient to use ‘gal_polygon_is_inside_convex’.

 -- Function: int gal_polygon_is_inside_convex (double *v, double *p, size_t n)
     Returns ‘1’ if the point ‘p’ is within the polygon whose vertices are defined by ‘v’.  The polygon is assumed to be convex; for a more generic function that deals with both concave and convex polygons, see ‘gal_polygon_is_inside’.  Note that the vertices of the polygon have to be sorted in an anticlockwise manner.

 -- Function: int gal_polygon_ppropin (double *v, double *p, size_t n)
     Similar to ‘gal_polygon_is_inside_convex’, except that if the point ‘p’ is on one of the edges of the polygon, this will return ‘0’.

 -- Function: int gal_polygon_is_counterclockwise (double *v, size_t n)
     Returns ‘1’ if the sorted polygon has a counter-clockwise orientation and ‘0’ otherwise.  This function uses the concept of “winding”, which defines the relative order in which the vertices of a polygon are listed, to determine the orientation of the vertices.  For complex polygons (where edges, or sides, intersect), the most significant orientation is returned.
     In a complex polygon, when the alternative windings are equal (for example an ‘8’-shape), it will return ‘1’ (as if it were counter-clockwise).  Note that the polygon vertices have to be sorted before calling this function.

 -- Function: int gal_polygon_to_counterclockwise (double *v, size_t n)
     Arrange the vertices of the sorted polygon in place, so that they are in a counter-clockwise direction.  If the input polygon already has a counter-clockwise direction, the input won’t be touched.  The return value is ‘1’ on successful execution.  This function is just a wrapper over ‘gal_polygon_is_counterclockwise’, and will reverse the order of the vertices when necessary.

 -- Function: void gal_polygon_clip (double *s, size_t n, double *c, size_t m, double *o, size_t *numcrn)
     Clip (find the overlap of) two polygons.  This function uses the Sutherland-Hodgman (https://en.wikipedia.org/wiki/Sutherland%E2%80%93Hodgman_algorithm) polygon-clipping algorithm.  Note that the vertices of both polygons have to be sorted in an anticlockwise manner.

 -- Function: void gal_polygon_vertices_sort (double *vertices, size_t n, size_t *ordinds)
     Sort the indices of the un-ordered ‘vertices’ array into a counter-clockwise polygon, writing them in the already allocated space of ‘ordinds’.  It is assumed that there are ‘n’ vertices, and thus that ‘vertices’ contains ‘2*n’ elements, where the two coordinates of the first vertex occupy the first two elements of the array, and so on.

     The polygon can be both concave and convex (see the start of this section).  However, note that for concave polygons there is no unique sort from an un-ordered set of vertices.  So after calling this function, you may want to use ‘gal_polygon_is_convex’ and print a warning to check the output if the polygon was concave.  Note that the contents of the ‘vertices’ array are left untouched by this function.
     If you want to write the ordered vertex coordinates into another array with the same size, you can use a loop like this:

          for(i=0;i<n;++i)
            {
              ordered[i*2  ] = vertices[ ordinds[i]*2   ];
              ordered[i*2+1] = vertices[ ordinds[i]*2+1 ];
            }


File: gnuastro.info, Node: Qsort functions, Next: K-d tree, Prev: Polygons, Up: Gnuastro library

11.3.18 Qsort functions (‘qsort.h’)
-----------------------------------

The C standard library’s ‘qsort’ (quick sort) function needs a comparison function that knows the type of the data being sorted.  The functions declared in ‘gnuastro/qsort.h’ are comparison functions that can be passed to ‘qsort’ for this job.

 -- Function: int gal_qsort_index_single_TYPE_d (const void *a, const void *b)
     When passed to ‘qsort’, these functions will sort a ‘size_t’ array of indices based on the values in the array that the ‘gal_qsort_index_single’ global pointer points to, in decreasing order.  ‘TYPE’ is the numeric data type of the values (for example ‘float’, as in ‘gal_qsort_index_single_float_d’ used below).  For example:

          #include <stdio.h>
          #include <stdlib.h>     /* qsort is defined in stdlib.h. */
          #include <gnuastro/qsort.h>

          int
          main (void)
          {
            size_t s[4]={0, 1, 2, 3};
            float f[4]={1.3,0.2,1.8,0.1};
            gal_qsort_index_single=f;
            qsort(s, 4, sizeof(size_t), gal_qsort_index_single_float_d);
            printf("%zu, %zu, %zu, %zu\n", s[0], s[1], s[2], s[3]);
            return EXIT_SUCCESS;
          }

     The output will be: ‘2, 0, 1, 3’.

 -- Function: int gal_qsort_index_single_TYPE_i (const void *a, const void *b)
     Similar to ‘gal_qsort_index_single_TYPE_d’, but will sort the indices such that the values of ‘gal_qsort_index_single’ can be parsed in increasing order.

 -- Function: int gal_qsort_index_multi_d (const void *a, const void *b)
     When passed to ‘qsort’ with an array of ‘gal_qsort_index_multi’, this function will sort the array based on the values of the given indices.  The sorting will be ordered according to the ‘values’ pointer of ‘gal_qsort_index_multi’.  Note that ‘values’ must point to the same place in all the structures of the ‘gal_qsort_index_multi’ array.

     This function is only useful when the indices of multiple arrays on multiple threads are to be sorted.  If your program is single-threaded, or all the indices belong to a single array (sorting different sub-sets of indices in a single array on multiple threads), it is recommended to use ‘gal_qsort_index_single_d’.

 -- Function: int gal_qsort_index_multi_i (const void *a, const void *b)
     Similar to ‘gal_qsort_index_multi_d’, but the result will be sorted in increasing order (the first element will have the smallest value).


File: gnuastro.info, Node: K-d tree, Next: Permutations, Prev: Qsort functions, Up: Gnuastro library

11.3.19 K-d tree (‘kdtree.h’)
-----------------------------

A k-d tree is a space-partitioning binary search tree for organizing points in a k-dimensional space.
K-d trees are a very useful data structure for multidimensional searches like range searches and nearest neighbor searches.  For a more formal and complete introduction, see the Wikipedia page (https://en.wikipedia.org/wiki/K-d_tree).

Each non-leaf node in a k-d tree divides the space into two parts, known as half-spaces.  To select the top/root node for partitioning, we find the median of the points and make a hyperplane normal to the first dimension.  The points to the left of this hyperplane are represented by the left subtree of that node and the points to the right of it are represented by the right subtree.  This is then repeated for all the points in the input, thus associating a “left” and a “right” branch with each input point.

Gnuastro uses the standard algorithms of the k-d tree with one small difference that makes it much more memory- and CPU-efficient.  The set of input points that define the tree nodes are given as a list of Gnuastro’s data container type, see *note List of gal_data_t::.  Each ‘gal_data_t’ in the list represents the points’ coordinates in one dimension, and the first element in the list is the first dimension.  Hence the number of data values in each ‘gal_data_t’ (which must be equal in all of them) represents the number of points.  This is the same format in which Gnuastro’s Table reading/writing functions read/write columns of tables, see *note Table input output::.

The output k-d tree is a list of two ‘gal_data_t’s, representing the input’s row-number (or index, counting from 0) of the left and right subtrees of each row.  Each ‘gal_data_t’ thus has the same number of rows (or points) as the input, but only contains integers of type ‘uint32_t’ (unsigned 32-bit integer).  If a node has no left or right subtree, then ‘GAL_BLANK_UINT32’ will be used.

Below you can see the simple tree for 2D points from Wikipedia.  The input point coordinates are represented as two input ‘gal_data_t’s (‘X’ and ‘Y’, where ‘X->next=Y’ and ‘Y->next=NULL’).
If you had three-dimensional points, you could define an extra ‘gal_data_t’ such that ‘Y->next=Z’ and ‘Z->next=NULL’.  The output is always a list of two ‘gal_data_t’s, where the first one contains the index of the left sub-tree in the input, and the second one the index of the right sub-tree.  The index of the root node (‘0’ in the case below(1)) is also returned as a single number.

     INDEX        INPUT      OUTPUT            K-D Tree
     (as guide)   X --> Y    LEFT --> RIGHT    (visualized)
     ----------   -------    --------------    ------------------
     0            5     4    1        2              (5,4)
     1            2     3    BLANK    4             /     \
     2            7     2    5        3         (2,3)      \
     3            9     6    BLANK  BLANK          \      (7,2)
     4            4     7    BLANK  BLANK        (4,7)    /   \
     5            8     1    BLANK  BLANK             (8,1)   (9,6)

This format is therefore scalable to any number of dimensions: the number of dimensions is determined from the number of nodes in the input list of ‘gal_data_t’s (for example, using ‘gal_list_data_number’).  In Gnuastro’s k-d tree implementation there are thus no special structures to keep every tree node (which would take extra memory and would need to be moved around as the tree is being created).  Everything is done internally on the index of each point in the input dataset: the only thing that is flipped/sorted during tree creation is the index to the input row, for any number of dimensions.  As a result, Gnuastro’s k-d tree implementation is very memory- and CPU-efficient, and its two output columns can directly be written into a standard table (without having to define any special binary format).

 -- Function: gal_data_t * gal_kdtree_create (gal_data_t *coords_raw, size_t *root)
     Create a k-d tree in a bottom-up manner (from leaves to the root).  This function returns two ‘gal_data_t’s connected as a list, see the description above.  The first dataset contains the indices of the left and right nodes of the subtrees for each input node.  The index of the root node is written into the memory that ‘root’ points to.  ‘coords_raw’ is the list of the input points (one ‘gal_data_t’ per dimension, see above).
     For example, assume you have the simple set of points below (from the visualized example at the start of this section) in a plain-text file called ‘coordinates.txt’:

          $ cat coordinates.txt
          5     4
          2     3
          7     2
          9     6
          4     7
          8     1

     With the program below, you can calculate the k-d tree and write it into a FITS file (while keeping the root index as a FITS keyword inside of it).

          #include <stdio.h>
          #include <stdlib.h>
          #include <gnuastro/fits.h>
          #include <gnuastro/table.h>
          #include <gnuastro/kdtree.h>

          int
          main (void)
          {
            gal_data_t *input, *kdtree;
            char kdtreefile[]="kd-tree.fits";
            char inputfile[]="coordinates.txt";

            /* To write the root within the saved file. */
            size_t root;
            char *unit="index";
            char *keyname="KDTROOT";
            gal_fits_list_key_t *keylist=NULL;
            char *comment="k-d tree root index (counting from 0).";

            /* Read the input table. Note: this assumes the table only
             * contains your input point coordinates (one column for each
             * dimension). If it contains more columns with other properties
             * for each point, you can specify which columns to read by
             * name or number, see the documentation of 'gal_table_read'. */
            input=gal_table_read(inputfile, "1", NULL, NULL,
                                 GAL_TABLE_SEARCH_NAME, 0, -1, 0, NULL);

            /* Construct a k-d tree. The index of root is stored in `root` */
            kdtree=gal_kdtree_create(input, &root);

            /* Write the k-d tree to a file and write root index and input
             * name as FITS keywords ('gal_table_write' frees 'keylist'). */
            gal_fits_key_list_title_add(&keylist, "k-d tree parameters", 0);
            gal_fits_key_write_filename("KDTIN", inputfile, &keylist, 0);
            gal_fits_key_list_add_end(&keylist, GAL_TYPE_SIZE_T, keyname, 0,
                                      &root, 0, comment, 0, unit, 0);
            gal_table_write(kdtree, &keylist, NULL, GAL_TABLE_FORMAT_BFITS,
                            kdtreefile, "kdtree", 0);

            /* Clean up and return. */
            gal_list_data_free(input);
            gal_list_data_free(kdtree);
            return EXIT_SUCCESS;
          }

     You can inspect the saved k-d tree FITS table with Gnuastro’s *note Table:: (first command below), and you can see the keywords containing the root index with *note Fits:: (second command below):

          asttable kd-tree.fits
          astfits kd-tree.fits -h1

 -- Function: size_t gal_kdtree_nearest_neighbour (gal_data_t *coords_raw, gal_data_t *kdtree, size_t root, double *point, double *least_dist)
     Returns the index of the nearest input point to the query point (‘point’, assumed to be an array with the same number of elements as there are ‘gal_data_t’s in ‘coords_raw’).  The distance between the query point and its nearest neighbor is stored in the space that ‘least_dist’ points to.  This search is efficient due to the constant checking for the presence of possible better points in other branches: if it isn’t possible for the other branch to have a better nearest neighbor, that branch is not searched.

     As an example, let’s use the k-d tree that was created in the example of ‘gal_kdtree_create’ (above) and find the nearest row to a given coordinate (‘point’).  This will be a very common scenario, especially in large and multi-dimensional datasets where the k-d tree creation can take a long time and you don’t want to re-create the k-d tree every time.  In the ‘gal_kdtree_create’ example output, we also wrote the k-d tree root index as a FITS keyword (‘KDTROOT’), so after loading the two tables (input coordinates and k-d tree), we’ll read the root from the FITS keyword.  This is a very simple example, but the scalability is clear: for example, it is trivial to parallelize (see *note Library demo - multi-threaded operation::).

          #include <stdio.h>
          #include <stdlib.h>
          #include <gnuastro/fits.h>
          #include <gnuastro/table.h>
          #include <gnuastro/kdtree.h>

          int
          main (void)
          {
            /* INPUT: desired point. */
            double point[2]={8.9,5.9};

            /* Same as example in description of 'gal_kdtree_create'. */
            gal_data_t *input, *kdtree;
            char kdtreefile[]="kd-tree.fits";
            char inputfile[]="coordinates.txt";

            /* Processing variables of this function. */
            char kdtreehdu[]="1";
            double *in_x, *in_y, least_dist;
            size_t root, nkeys=1, nearest_index;
            gal_data_t *rkey, *keysll=gal_data_array_calloc(nkeys);

            /* Read the input coordinates, see comments in example of
             * 'gal_kdtree_create' for more. */
            input=gal_table_read(inputfile, "1", NULL, NULL,
                                 GAL_TABLE_SEARCH_NAME, 0, -1, 0, NULL);

            /* Read the k-d tree contents (created before). */
            kdtree=gal_table_read(kdtreefile, "1", NULL, NULL,
                                  GAL_TABLE_SEARCH_NAME, 0, -1, 0, NULL);

            /* Read the k-d tree root index from the header keyword.
             * See example in description of 'gal_fits_key_read_from_ptr'. */
            keysll[0].name="KDTROOT";
            keysll[0].type=GAL_TYPE_SIZE_T;
            gal_fits_key_read(kdtreefile, kdtreehdu, keysll, 0, 0);
            keysll[0].name=NULL;    /* Since we didn't allocate it. */
            rkey=gal_data_copy_to_new_type(&keysll[0], GAL_TYPE_SIZE_T);
            root=((size_t *)(rkey->array))[0];

            /* Find the nearest neighbour of the point. */
            nearest_index=gal_kdtree_nearest_neighbour(input, kdtree, root,
                                                       point, &least_dist);

            /* Print the results. */
            in_x=input->array;
            in_y=input->next->array;
            printf("(%g, %g): nearest is (%g, %g), with a distance of %g\n",
                   point[0], point[1],
                   in_x[nearest_index], in_y[nearest_index], least_dist);

            /* Clean up and return. */
            gal_data_free(rkey);
            gal_list_data_free(input);
            gal_list_data_free(kdtree);
            gal_data_array_free(keysll, nkeys, 1);
            return EXIT_SUCCESS;
          }

     ---------- Footnotes ----------

     (1) This example input table is the same as the example in Wikipedia (as of December 2020).  However, in the Wikipedia output, the root node is (7,2), not (5,4).  The difference is primarily because there are 6 rows here, and the median element of an even number of elements can differ between integer calculation strategies.  Here, we use 0-based indices for finding the median and round to the smaller integer.


File: gnuastro.info, Node: Permutations, Next: Matching, Prev: K-d tree, Up: Gnuastro library

11.3.20 Permutations (‘permutation.h’)
--------------------------------------

Permutation is the technical name for the re-ordering of values.  The need for permutations occurs a lot during (mainly low-level) processing.  To apply a permutation, you must provide two inputs: an array of values (that you want to re-order in place) and a permutation array, which contains the new index of each element (let’s call it ‘perm’).  The diagram below shows the input array before and after the re-ordering.

     permute:    AFTER[ i       ]   =   BEFORE[ perm[i] ]     i = 0 .. N-1
     inverse:    AFTER[ perm[i] ]   =   BEFORE[ i       ]     i = 0 .. N-1

The functions here are a re-implementation of the GNU Scientific Library’s ‘gsl_permute’ function.  The reason we didn’t use that function is that it uses system-specific types (like ‘long’ and ‘int’), which can have different widths on different systems and hence are not easily convertible to Gnuastro’s fixed-width types (see *note Numeric data types::).  There is also a separate function for each type, heavily using macros to allow a ‘base’ function to work on all the types; this makes it hard to read and understand.  Hence, Gnuastro contains a re-write of their steps in a new type-agnostic method: a single function that can work on any type.

As described in GSL’s source code and manual, this implementation comes from Donald Knuth’s _Art of Computer Programming_, in the “Sorting and Searching” chapter of Volume 3 (3rd ed).  Exercise 10 of Section 5.2 defines the problem and, in the answers, Knuth describes the solution.  So if you are interested, please have a look there for more.

We are in contact with the GSL developers and in the future(1) we will submit these implementations to GSL.  If they are finally incorporated there, we will delete this section in future versions.
 -- Function: void gal_permutation_check (size_t *permutation, size_t size)
     Print how ‘permutation’ will re-order an array that has ‘size’ elements, with one line for each element.

 -- Function: void gal_permutation_apply (gal_data_t *input, size_t *permutation)
     Apply ‘permutation’ on the ‘input’ dataset (which can have any type), see above for the definition of a permutation.

 -- Function: void gal_permutation_apply_inverse (gal_data_t *input, size_t *permutation)
     Apply the inverse of ‘permutation’ on the ‘input’ dataset (which can have any type), see above for the definition of a permutation.

     ---------- Footnotes ----------

     (1) Gnuastro’s Task 14497 (http://savannah.gnu.org/task/?14497).  If this task is still “postponed” when you are reading this and you are interested in helping, your help would be very welcome.  Both Gnuastro and GSL developers are very busy, hence both would appreciate your help.


File: gnuastro.info, Node: Matching, Next: Statistical operations, Prev: Permutations, Up: Gnuastro library

11.3.21 Matching (‘match.h’)
----------------------------

Matching is often necessary when two measurements of the same points have been done using different instruments (or hardware), different software, or different configurations of the same software.  In other words, you have two catalogs or tables, and each has N columns containing the N-dimensional “positional” values of each point.  Each can have other columns too; for example, one can have brightness measurements in one filter, and the other can have brightness measurements in another filter as well as morphology measurements, etc.  The matching functions here will use the positional columns to find the permutation necessary to apply to both tables.  This will enable you to match by the positions, then apply the permutation to the brightness or morphology columns in the example above.

The input and output data formats of the functions below are the same, and are described here before the actual functions.
Each function also has extra arguments due to the particular algorithm it uses for the matching.

The two inputs of the functions (‘coord1’ and ‘coord2’) must be a *note List of gal_data_t::.  Each ‘gal_data_t’ node in ‘coord1’ or ‘coord2’ should be a single-dimensional dataset (a column in a table), and all the nodes must have the same number of elements (rows).  In other words, each column can be visualized as containing the coordinates of each point in its respective dimension.  The dimension of the coordinates is determined by the number of ‘gal_data_t’ nodes in the two input lists (which must be equal).  The number of rows (or the number of elements in each ‘gal_data_t’) in the columns of ‘coord1’ and ‘coord2’ can be different.  These conditions will all be satisfied if you use ‘gal_table_read’ to read the two coordinate columns, see *note Table input output::.

The functions below return a simply-linked list of three 1D datasets (see *note List of gal_data_t::); let’s call the returned dataset ‘ret’.  The first two (‘ret’ and ‘ret->next’) are permutations.  In other words, the ‘array’ elements of both have a type of ‘size_t’, see *note Permutations::.  The third node (‘ret->next->next’) is the calculated distance for each match and its array has a type of ‘double’.  The number of matches will be put in the space pointed to by the ‘nummatched’ argument.  If there wasn’t any match, the functions will return ‘NULL’.

The two permutations can be applied to the rows of the two inputs: the first one (‘ret’) should be applied to the rows of the table containing ‘coord1’ and the second one (‘ret->next’) to the table containing ‘coord2’.  After applying the returned permutations to the inputs, the top ‘nummatched’ elements of both will match with each other.  The ordering of the rest of the elements is undefined (it depends on the matching function used).  The third node is the distance between the respective matches (which may be an elliptical distance, see the discussion of “aperture” below).
The functions will not simply return the nearest neighbor as a match: the nearest neighbor may be too far away to be meaningful. They will check the distance to the nearest neighbor of each point and only return a match when it is within an acceptable N-dimensional distance (or “aperture”). The matching aperture is defined by the ‘aperture’ array that is an input argument to the functions. If several points of one catalog lie within this aperture of a point in the other, the nearest is taken as the match. In a 2D situation (where the input lists have two nodes), for the most generic case, ‘aperture’ must have three elements: the major axis length, axis ratio and position angle (see *note Defining an ellipse and ellipsoid::). If ‘aperture[1]==1’, the aperture will be a circle of radius ‘aperture[0]’ and the third value won’t be used. When the aperture is an ellipse, distances between the points are also calculated as the respective elliptical distances ($r_{el}$ in *note Defining an ellipse and ellipsoid::).

 -- Function: gal_data_t * gal_match_coordinates (gal_data_t *coord1, gal_data_t *coord2, double *aperture, int sorted_by_first, int inplace, size_t minmapsize, int quietmmap, size_t *nummatched)
     Use a basic sort-based match to find the matching points of two input coordinates. See the descriptions above on the format of the inputs and outputs. To speed up the search, this function will sort the input coordinates by their first column (first axis). If _both_ are already sorted by their first column, you can avoid the sorting step by giving a non-zero value to ‘sorted_by_first’.

     When sorting is necessary and ‘inplace’ is non-zero, the actual input columns will be sorted. Otherwise, an internal copy of the inputs will be made, used (sorted) and later freed before returning. Therefore, when ‘inplace==0’, the inputs will remain untouched, but this function will take more time and memory.
If internal allocation is necessary and the space is larger than ‘minmapsize’, the space will not be allocated in RAM, but in a file; see the description of ‘--minmapsize’ and ‘--quietmmap’ in *note Processing options::.

*Output permutations ignore internal sorting*: the output permutations will correspond to the initial inputs. Therefore, even when ‘inplace!=0’ (and this function re-arranges the inputs in place), the output permutation will correspond to the original (possibly non-sorted) inputs. The reason for this is that you rarely want to permute the actual positional columns after the match. Usually, you also have other columns (for example brightness or morphology) and you want to find how they differ between the objects that match. Once you have the permutations, they can be applied to those other columns (see *note Permutations::) and the higher-level processing can continue. So if you don’t need the coordinate columns for the rest of your analysis, it is better to set ‘inplace=1’.


File: gnuastro.info, Node: Statistical operations, Next: Binary datasets, Prev: Matching, Up: Gnuastro library

11.3.22 Statistical operations (‘statistics.h’)
-----------------------------------------------

After reading a dataset into memory from a file or fully simulating it with another process, the most common operations that will be done on it are statistical, to let you quantify different aspects of the data. The functions in this section describe Gnuastro’s current set of tools for this job. All these functions can work on any numeric data type natively (see *note Numeric data types::) and can also work on tiles over a dataset. Hence the inputs and outputs are in Gnuastro’s *note Generic data container::.

 -- Macro: GAL_STATISTICS_SIG_CLIP_MAX_CONVERGE
     The maximum number of clips when $\sigma$-clipping should be done by convergence. If the clipping does not converge before making this many clips, all $\sigma$-clipping outputs will be NaN.
 -- Macro: GAL_STATISTICS_MODE_GOOD_SYM
     The minimum acceptable symmetricity of the mode calculation. If the symmetricity of the derived mode is less than this value, all the values returned by ‘gal_statistics_mode’ will be NaN.

 -- Macro: GAL_STATISTICS_BINS_INVALID
 -- Macro: GAL_STATISTICS_BINS_REGULAR
 -- Macro: GAL_STATISTICS_BINS_IRREGULAR
     Macros used to identify the regularity of the bins when defining them.

 -- Function: gal_data_t * gal_statistics_number (gal_data_t *input)
     Return a single-element dataset with type ‘size_t’ which contains the number of non-blank elements in ‘input’.

 -- Function: gal_data_t * gal_statistics_minimum (gal_data_t *input)
     Return a single-element dataset containing the minimum non-blank value in ‘input’. The numerical datatype of the output is the same as ‘input’.

 -- Function: gal_data_t * gal_statistics_maximum (gal_data_t *input)
     Return a single-element dataset containing the maximum non-blank value in ‘input’. The numerical datatype of the output is the same as ‘input’.

 -- Function: gal_data_t * gal_statistics_sum (gal_data_t *input)
     Return a single-element (‘double’ or ‘float64’) dataset containing the sum of the non-blank values in ‘input’.

 -- Function: gal_data_t * gal_statistics_mean (gal_data_t *input)
     Return a single-element (‘double’ or ‘float64’) dataset containing the mean of the non-blank values in ‘input’.

 -- Function: gal_data_t * gal_statistics_std (gal_data_t *input)
     Return a single-element (‘double’ or ‘float64’) dataset containing the standard deviation of the non-blank values in ‘input’.

 -- Function: gal_data_t * gal_statistics_mean_std (gal_data_t *input)
     Return a two-element (‘double’ or ‘float64’) dataset containing the mean and standard deviation of the non-blank values in ‘input’. The first element of the returned dataset is the mean and the second is the standard deviation. This function will calculate both values in one pass over the dataset.
Hence when both the mean and standard deviation of a dataset are necessary, this function is much more efficient than calling ‘gal_statistics_mean’ and ‘gal_statistics_std’ separately.

 -- Function: gal_data_t * gal_statistics_median (gal_data_t *input, int inplace)
     Return a single-element dataset containing the median of the non-blank values in ‘input’. The numerical datatype of the output is the same as ‘input’. Calculating the median involves sorting the dataset and removing blank values; for better performance (and less memory usage), you can give a non-zero value to the ‘inplace’ argument. In this case, the sorting and removal of blank elements will be done directly on the input dataset. However, after this function the original dataset may be changed (if it wasn’t already sorted or had blank values).

 -- Function: size_t gal_statistics_quantile_index (size_t size, double quantile)
     Return the index of the element that has a quantile of ‘quantile’, assuming the dataset has ‘size’ elements.

 -- Function: gal_data_t * gal_statistics_quantile (gal_data_t *input, double quantile, int inplace)
     Return a single-element dataset containing the value at quantile ‘quantile’ of the non-blank values in ‘input’. The numerical datatype of the output is the same as ‘input’. See ‘gal_statistics_median’ for a description of ‘inplace’.

 -- Function: size_t gal_statistics_quantile_function_index (gal_data_t *input, gal_data_t *value, int inplace)
     Return the index of the quantile function (inverse quantile) of ‘input’ at ‘value’. In other words, this function will return the index of the nearest element of a sorted and non-blank ‘input’ to ‘value’. If the value is outside the range of the input, then this function will return ‘GAL_BLANK_SIZE_T’.
 -- Function: gal_data_t * gal_statistics_quantile_function (gal_data_t *input, gal_data_t *value, int inplace)
     Return a single-element dataset containing the quantile function of the non-blank values in ‘input’ at ‘value’ (a single-element dataset). The numerical data type of the returned dataset is ‘float64’ (or ‘double’). In other words, this function will return the quantile of ‘value’ in ‘input’. ‘value’ has to have the same type as ‘input’. See ‘gal_statistics_median’ for a description of ‘inplace’. When all elements are blank, the returned value will be NaN. If the value is smaller than the input’s smallest element, the returned value will be negative infinity. If the value is larger than the input’s largest element, the returned value will be positive infinity.

 -- Function: gal_data_t * gal_statistics_unique (gal_data_t *input, int inplace)
     Return a 1D dataset with the same numeric data type as the input, but only containing its unique elements and without any (possible) blank/NaN elements. Note that the input’s number of dimensions is irrelevant for this function. If ‘inplace’ is not zero, then the unique values will over-write the allocated space of the input; otherwise a new space will be allocated and the input will not be touched.

 -- Function: gal_data_t * gal_statistics_mode (gal_data_t *input, float mirrordist, int inplace)
     Return a four-element (‘double’ or ‘float64’) dataset that contains the mode of the ‘input’ distribution. This function implements the non-parametric algorithm to find the mode that is described in Appendix C of Akhlaghi and Ichikawa [2015] (https://arxiv.org/abs/1505.01664). In short, it compares the actual distribution and its “mirror distribution” to find the mode. To be efficient, you can determine how far the comparison goes away from the mirror through the ‘mirrordist’ parameter (think of it as a multiple of sigma/error). See ‘gal_statistics_median’ for a description of ‘inplace’.
The output array has the following elements (in the given order; note that counting in C starts from 0):

          array[0]: mode
          array[1]: mode quantile
          array[2]: symmetricity
          array[3]: value at the end of symmetricity

 -- Function: gal_data_t * gal_statistics_mode_mirror_plots (gal_data_t *input, gal_data_t *value, size_t numbins, int inplace, double *mirror_val)
     Make a mirrored histogram and cumulative frequency plot (with ‘numbins’) with the mirror distribution of the ‘input’ having a value in ‘value’. If all the input elements are blank, or the mirror value is outside the range of the input, this function will return a ‘NULL’ pointer. The output is a list of data structures (see *note List of gal_data_t::): the first is the bins with one bin at the mirror point, the second is the histogram with a maximum of one and the third is the cumulative frequency plot (with a maximum of one).

 -- Function: int gal_statistics_is_sorted (gal_data_t *input, int updateflags)
     Return ‘0’ if the input is not sorted; if it is sorted, this function will return ‘1’ or ‘2’ if it is increasing or decreasing, respectively. This function will abort with an error if ‘input’ has zero elements, and will return ‘1’ (sorted, increasing) when there is only one element. This function will only look into the dataset if the ‘GAL_DATA_FLAG_SORT_CH’ bit of ‘input->flag’ is ‘0’, see *note Generic data container::. When the flags don’t indicate a previous check _and_ ‘updateflags’ is non-zero, this function will set the flags appropriately to avoid having to re-check the dataset in future calls (this can be very useful when repeated checks are necessary). When ‘updateflags==0’, this function has no side-effects on the dataset: it will not toggle the flags. If you want to re-check a dataset with the sort-check flag already set (for example if you have made changes to it), then explicitly set the ‘GAL_DATA_FLAG_SORT_CH’ bit to zero before calling this function.
When there are no other flags, you can simply set the flags to zero (with ‘input->flag=0’); otherwise you can use this expression:

          input->flag &= ~GAL_DATA_FLAG_SORT_CH;

 -- Function: void gal_statistics_sort_increasing (gal_data_t *input)
     Sort the input dataset (in place) in an increasing order and toggle the sort-related bit flags accordingly.

 -- Function: void gal_statistics_sort_decreasing (gal_data_t *input)
     Sort the input dataset (in place) in a decreasing order and toggle the sort-related bit flags accordingly.

 -- Function: gal_data_t * gal_statistics_no_blank_sorted (gal_data_t *input, int inplace)
     Remove all the blanks and sort the input dataset. If ‘inplace’ is non-zero, this will happen on the input dataset (in the allocated space of the input dataset). However, if ‘inplace’ is zero, this function will allocate a new copy of the dataset and work on that. Therefore when ‘inplace==0’, the input dataset will not be modified.

     This function uses the bit flags of the input, so if you have modified the dataset, set ‘input->flag=0’ before calling this function. Also note that ‘inplace’ is only for the dataset elements. Therefore even when ‘inplace==0’, if the input is already sorted _and_ has no blank values, the flags will be updated to show this.

     If all the elements were blank, then the returned dataset’s ‘size’ will be zero. This is thus a good parameter to check after calling this function, to see whether there actually were any non-blank elements in the input and to take the appropriate measure; this can help avoid strange bugs in later steps. The flags of a zero-sized returned dataset will indicate that it has no blanks and is sorted in an increasing order. Even though having blank values or being sorted is not well defined for a zero-element dataset, it is up to the caller to choose what to do with such a dataset; the flags have to be set after this function anyway.
 -- Function: gal_data_t * gal_statistics_regular_bins (gal_data_t *input, gal_data_t *inrange, size_t numbins, double onebinstart)
     Generate an array of regularly spaced elements as a 1D array (column) of type ‘double’ (i.e., ‘float64’; it has to be double to account for small differences on the bin edges). The input arguments are described below:

     ‘input’
          The dataset you want to apply the bins to. This is only necessary if the range argument is not complete, see below. If ‘inrange’ has all the necessary information, you can pass a ‘NULL’ pointer for this.

     ‘inrange’
          This dataset keeps the desired range along each dimension of the input data structure; it has to be in ‘float’ (i.e., ‘float32’) type.

          • If you want the full range of the dataset (in any dimension), then just set ‘inrange’ to ‘NULL’ and the range will be specified from the minimum and maximum value of the dataset (‘input’ cannot be ‘NULL’ in this case).

          • If there is one element for each dimension in range, then it is viewed as a quantile (Q), and the range will be: ‘Q to 1-Q’.

          • If there are two elements for each dimension in range, then they are assumed to be your desired minimum and maximum values. When either of the two is NaN, the minimum or maximum will be calculated for it.

     ‘numbins’
          The number of bins: must be larger than 0.

     ‘onebinstart’
          A desired value for the edge of one bin to fall on. Note that with this option, the bins won’t start and end exactly on the given range values; they will be slightly shifted to accommodate this request.

 -- Function: gal_data_t * gal_statistics_histogram (gal_data_t *input, gal_data_t *bins, int normalize, int maxone)
     Make a histogram of all the elements in the given dataset with bin values that are defined in the ‘bins’ structure (see ‘gal_statistics_regular_bins’; they currently have to be equally spaced). ‘bins’ is not mandatory: if you pass a ‘NULL’ pointer, the bins structure will be built within this function based on the ‘numbins’ input.
As a result, when you have already defined the bins, ‘numbins’ is not used.

Let’s write the center of the $i$th element of the bin array as $b_i$, and the fixed half-bin width as $h$. Then element $j$ of the input array ($in_j$) will be counted in $b_i$ if $(b_i-h) \le in_j < (b_i+h)$. However, if $in_j$ is somewhere in the last bin, the condition changes to $(b_i-h) \le in_j \le (b_i+h)$.

If ‘normalize!=0’, the histogram will be “normalized” such that the sum of the counts column will be one. In other words, all the counts in every bin will be divided by the total number of counts. If ‘maxone!=0’, the histogram’s maximum count will be 1. In other words, the counts in every bin will be divided by the value of the maximum.

 -- Function: gal_data_t * gal_statistics_histogram2d (gal_data_t *input, gal_data_t *bins)
     This function is very similar to ‘gal_statistics_histogram’, but will build a 2D histogram (count how many of the elements of ‘input’ have a value within a 2D box). The bins comprising the first dimension of the 2D box are defined by ‘bins’. The bins of the second dimension are defined by ‘bins->next’ (‘bins’ is a *note List of gal_data_t::). Both ‘bins’ and ‘bins->next’ can be created with ‘gal_statistics_regular_bins’.

     This function returns a list of ‘gal_data_t’ with three nodes/columns, so you can directly write them into a table (see *note Table input output::). Assuming ‘bins’ has $N1$ bins and ‘bins->next’ has $N2$ bins, each node/column of the returned output is a 1D array with $N1\times N2$ elements. The first and second columns are the centers of the 2D bins along the first and second dimensions and have a ‘double’ data type. The third column is the 2D histogram (the number of input elements that have a value within that 2D bin) and has a ‘uint32’ data type (see *note Numeric data types::).
 -- Function: gal_data_t * gal_statistics_cfp (gal_data_t *input, gal_data_t *bins, int normalize)
     Make a cumulative frequency plot (CFP) of all the elements in ‘input’ with bin values that are defined in the ‘bins’ structure (see ‘gal_statistics_regular_bins’). The CFP is built from the histogram: in each bin, the value is the sum of all previous bins in the histogram. Thus, if you have already calculated the histogram before calling this function, you can pass it to this function as the data structure in ‘bins->next’ (see *note List of gal_data_t::). If ‘bins->next!=NULL’, then it is assumed to be the histogram. If it is ‘NULL’, then the histogram will be calculated internally and freed after the job is finished. When a histogram is given and it is normalized, the CFP will also be normalized (even if the normalize flag is not set here): note that a normalized CFP’s maximum value is 1.

 -- Function: gal_data_t * gal_statistics_sigma_clip (gal_data_t *input, float multip, float param, int inplace, int quiet)
     Apply $\sigma$-clipping on a given dataset and return a dataset that contains the results. For a description of $\sigma$-clipping see *note Sigma clipping::. ‘multip’ is the multiple of the standard deviation (or $\sigma$) that is used to define outliers in each round of clipping. The role of ‘param’ is determined based on its value. If ‘param’ is larger than ‘1’ (one), it must be an integer and will be interpreted as the number of clips to do. If it is less than ‘1’ (one), it is interpreted as the tolerance level to stop the iteration.

     The returned dataset (let’s call it ‘out’) contains a four-element array with type ‘GAL_TYPE_FLOAT32’. The final number of clips is stored in ‘out->status’.

          float *array=out->array;
          array[0]: Number of points used.
          array[1]: Median.
          array[2]: Mean.
          array[3]: Standard deviation.

     If the $\sigma$-clipping doesn’t converge, or all input elements are blank, this function will return NaN values for all the elements above.
 -- Function: gal_data_t * gal_statistics_outlier_bydistance (int pos1_neg0, gal_data_t *input, size_t window_size, float sigma, float sigclip_multip, float sigclip_param, int inplace, int quiet)
     Find the first positive outlier (if ‘pos1_neg0!=0’) in the ‘input’ distribution. When ‘pos1_neg0==0’, the same algorithm goes to the start of the dataset. The returned dataset contains a single element: the first positive outlier. It is one of the dataset’s elements, in the same type as the input. If the process fails for any reason (for example no outlier was found), a ‘NULL’ pointer will be returned.

     All (possibly existing) blank elements are first removed from the input dataset, then it is sorted. A sliding window of ‘window_size’ elements is parsed over the sorted dataset, starting from its ‘window_size’-th element and moving in the direction of increasing values; this window is used as a reference. The first element whose distance to the previous (sorted) element is ‘sigma’ units away from the distribution of distances in its window is considered an outlier and returned by this function.

     Formally, assume there are $N$ non-blank elements. They are first sorted. Searching for the outlier starts on element $W$. Let’s take $v_i$ to be the $i$-th element of the sorted input (with no blank values), and $m$ and $\sigma$ as the $\sigma$-clipped median and standard deviation of the distances of the previous $W$ elements (not including $v_i$). If the value given to ‘sigma’ is denoted by $s$, the $i$-th element is considered as an outlier when the condition below is true.

          $${(v_i-v_{i-1})-m\over \sigma}>s$$

     The ‘sigclip_multip’ and ‘sigclip_param’ arguments specify the properties of the $\sigma$-clipping (see *note Sigma clipping:: for more). By this definition, the outlier cannot be any of the lower-half elements.
The advantage of this algorithm compared to $\sigma$-clipping is that it only looks backwards (in the sorted array) and parses it in one direction.

If ‘inplace!=0’, the removal of blank elements and sorting will be done within the input dataset’s allocated space. Otherwise, this function will internally allocate (and later free) the necessary space to keep the intermediate data that this process requires.

If ‘quiet==0’, this function will report the parameters every time it moves the window, as a separate line with several columns. The first column is the value, the second (in square brackets) is the sorted index, and the third is the distance of this element from the previous one. The fourth and fifth (in parentheses) are the median and standard deviation of the $\sigma$-clipped distribution within the window, and the last column is the difference between the third and fourth, divided by the fifth.

 -- Function: gal_data_t * gal_statistics_outlier_flat_cfp (gal_data_t *input, size_t numprev, float sigclip_multip, float sigclip_param, float thresh, size_t numcontig, int inplace, int quiet, size_t *index)
     Return the first element in the given dataset where the cumulative frequency plot first becomes significantly flat for a sufficient number of elements. The returned dataset only has one element (with the same type as the input). If ‘index!=NULL’, the index (counting from zero, after sorting the dataset and removing any blanks) is written in the space that ‘index’ points to. If no sufficiently flat portion is found, the returned pointer will be ‘NULL’.

     The flatness of the cumulative frequency plot is defined like this (see *note Histogram and Cumulative Frequency Plot::): on the sorted dataset, for every point ($a_i$), we calculate $d_i=a_{i+2}-a_{i-2}$. This is done on the first $N$ elements (the value of ‘numprev’). After element $a_{N+2}$, we start estimating the flatness as follows: for every element, we use the $N$ $d_i$ measurements before it as the reference.
Let’s call this set $D_i$ for element $i$. The $\sigma$-clipped median ($m$) and standard deviation ($s$) of $D_i$ are then calculated. The $\sigma$-clipping can be configured with the two ‘sigclip_param’ and ‘sigclip_multip’ arguments. Taking $t$ as the significance threshold (the value given to ‘thresh’), a point is considered flat when $d_i>m+ts$. But a single point satisfying this condition will probably just be due to noise. To make a more robust estimate, this significance/condition has to hold for ‘numcontig’ contiguous elements after $a_i$. When this is satisfied, $a_i$ is returned as the point where the distribution’s cumulative frequency plot becomes flat.

To get a good estimate of $m$ and $s$, it is thus recommended to set ‘numprev’ as large as possible. However, be careful not to set it too high: the checks in the paragraph above are not done on the first ‘numprev’ elements, and this function assumes the flatness occurs after them. Also, be sure that the value given to ‘numcontig’ is much less than ‘numprev’; otherwise $\sigma$-clipping may not be able to remove the immediate outliers in $D_i$ near the boundary of the flat region. When ‘quiet==0’, the basic measurements done on each element are printed on the command-line (good for finding the best parameters). When ‘inplace!=0’, the sorting and removal of blank elements are done on the input dataset, so the input may be altered after this function.


File: gnuastro.info, Node: Binary datasets, Next: Labeled datasets, Prev: Statistical operations, Up: Gnuastro library

11.3.23 Binary datasets (‘binary.h’)
------------------------------------

Binary datasets only have two (usable) values: 0 (also known as background) or 1 (also known as foreground). They are created after some binary classification is applied to the dataset. The most common is thresholding: for example in an image, pixels with a value above the threshold are given a value of 1 and those with a value less than the threshold are assigned a value of 0.
Since there are only two values, in the processing of binary images you are usually concerned with the position of an element and its vicinity (neighbors). When a dataset has more than one dimension, multiple classes of immediate neighbors (that are touching the element) can be defined for each data-element. To separate these different classes of immediate neighbors, we define _connectivity_. The classification is done by the distance from the element’s center to the neighbor’s center. The nearest immediate neighbors have a connectivity of 1, the second-nearest class of neighbors has a connectivity of 2, and so on. In total, the largest possible connectivity for data with ‘ndim’ dimensions is ‘ndim’. For example in a 2D dataset, 4-connected neighbors (that share an edge and have a distance of 1 pixel) have a connectivity of 1. The other 4 neighbors that only share a vertex (with a distance of $\sqrt{2}$ pixels) have a connectivity of 2. Conventionally, the class of connectivity-2 neighbors also includes the connectivity-1 neighbors, so for example we call them 8-connected neighbors in 2D datasets.

Ideally, one bit is sufficient for each element of a binary dataset. However, CPUs are not designed to work on individual bits; the smallest addressable unit of memory is a byte (containing 8 bits on modern CPUs). Therefore, in Gnuastro, the type used for binary datasets is ‘uint8_t’ (see *note Numeric data types::). Although it does take 8 times more memory, this choice offers much better performance and some extra (useful) features. The advantage of using a full byte for each element of a binary dataset is that you can also have other values (that will be ignored in the processing). One such common “other” value in real datasets is a blank value (to mark regions that should not be processed because there is no data). The constant ‘GAL_BLANK_UINT8’ value must be used in these cases (see *note Library blank values::).
Another is some temporary value(s) that can be given to a processed pixel to avoid having another copy of the dataset, as with the ‘GAL_BINARY_TMP_VALUE’ macro that is described below.

 -- Macro: GAL_BINARY_TMP_VALUE
     The functions described below work on a ‘uint8_t’ type dataset with values of 1 or 0 (no other pixel will be touched). However, in some cases, it is necessary to put temporary values in each element during the processing of the functions. This temporary value has a special meaning for the operation and will be operated on. So if your input datasets have values other than 0 and 1 that you don’t want these functions to work on, be sure they are not equal to this macro’s value. Note that this value is also different from ‘GAL_BLANK_UINT8’, so your input datasets may also contain blank elements.

 -- Function: gal_data_t * gal_binary_erode (gal_data_t *input, size_t num, int connectivity, int inplace)
     Do ‘num’ erosions on the ‘connectivity’-connected neighbors of ‘input’ (see above for the definition of connectivity). If ‘inplace’ is non-zero _and_ the input’s type is ‘GAL_TYPE_UINT8’, then the erosion will be done within the input dataset and the returned pointer will be ‘input’. Otherwise, ‘input’ is copied (and converted if necessary) to ‘GAL_TYPE_UINT8’ and the erosion will be done on this new dataset, which will also be returned. This function will only work on the elements with a value of 1 or 0; it will leave all the rest unchanged.

     Erosion (the inverse of dilation) is an operation in mathematical morphology where each foreground pixel that is touching a background pixel is flipped (changed to background). The ‘connectivity’ value determines the definition of “touching”. Erosion will thus decrease the area of the foreground regions by one layer of pixels.
 -- Function: gal_data_t * gal_binary_dilate (gal_data_t *input, size_t num, int connectivity, int inplace)
     Do ‘num’ dilations on the ‘connectivity’-connected neighbors of ‘input’ (see above for the definition of connectivity). For more on ‘inplace’ and the output, see ‘gal_binary_erode’.

     Dilation (the inverse of erosion) is an operation in mathematical morphology where each background pixel that is touching a foreground pixel is flipped (changed to foreground). The ‘connectivity’ value determines the definition of “touching”. Dilation will thus increase the area of the foreground regions by one layer of pixels.

 -- Function: gal_data_t * gal_binary_open (gal_data_t *input, size_t num, int connectivity, int inplace)
     Do ‘num’ openings on the ‘connectivity’-connected neighbors of ‘input’ (see above for the definition of connectivity). For more on ‘inplace’ and the output, see ‘gal_binary_erode’.

     Opening is an operation in mathematical morphology which is defined as erosion followed by dilation (see above for the definitions of erosion and dilation). Opening will thus remove the outer structure of the foreground. In this implementation, ‘num’ erosions are applied on the dataset, then ‘num’ dilations.

 -- Function: size_t gal_binary_connected_components (gal_data_t *binary, gal_data_t **out, int connectivity)
     Return the number of connected components in ‘binary’ through the breadth-first search algorithm (finding all pixels belonging to one component before going on to the next). The connection between two pixels is defined based on the value given to ‘connectivity’. ‘out’ is a dataset with the same size as ‘binary’ and a ‘GAL_TYPE_INT32’ type. Every pixel in ‘out’ will have the label of the connected component it belongs to. The labeling of connected components starts from 1, so a label of zero is given to the input’s background pixels. When ‘*out!=NULL’ (its space is already allocated), it will be cleared (to zero) at the start of this function.
Otherwise, when ‘*out==NULL’, the necessary dataset to keep the output will be allocated by this function. ‘binary’ must have a type of ‘GAL_TYPE_UINT8’, otherwise this function will abort with an error. Other than blank pixels (with a value of ‘GAL_BLANK_UINT8’, defined in *note Library blank values::), all other non-zero pixels in ‘binary’ will be considered as foreground (and will be labeled). Blank pixels in the input will also be blank in the output.

 -- Function: gal_data_t * gal_binary_connected_indexs (gal_data_t *binary, int connectivity)
     Build a ‘gal_data_t’ linked list, where each node of the list contains an array with the indices of one connected region. Therefore the arrays of the different nodes can have different sizes. Note that the indices will only be calculated on the pixels with a value of 1, and internally, the function will temporarily change their values to 2 (and return them back to 1 in the end).

 -- Function: gal_data_t * gal_binary_connected_adjacency_matrix (gal_data_t *adjacency, size_t *numconnected)
     Find the number of connected labels and new labels based on an adjacency matrix, which must be a square binary array (type ‘GAL_TYPE_UINT8’). The returned dataset is a list of new labels for each old label. In other words, this function will find the objects that are connected (possibly through a third object) and, in the output array, the respective elements for all input labels are going to have the same value. The total number of connected labels is put into the space that ‘numconnected’ points to.

     An adjacency matrix defines the connection between two labels. For example, let’s assume we have 5 labels and we know that labels 1 and 5 are connected to label 3, but are not connected with each other. Also, labels 2 and 4 are not touching any other label. So in total we have 3 final labels: one combined object (merged from labels 1, 3, and 5) and the initial labels 2 and 4.
The input adjacency matrix would look like this (note the extra row and column for a label 0 which is ignored):

                  INPUT                          OUTPUT
                  =====                          ======
              in_lab  1  2  3  4  5    |
                                       |   numconnected = 3
              0   0   0  0  0  0  0    |
     in_lab 1 --> 0   0  0  1  0  0    |
     in_lab 2 --> 0   0  0  0  0  0    |   Returned: new labels for the
     in_lab 3 --> 0   1  0  0  0  1    |             5 initial objects
     in_lab 4 --> 0   0  0  0  0  0    |   | 0 | 1 | 2 | 1 | 3 | 1 |
     in_lab 5 --> 0   0  0  1  0  0    |

Although the adjacency matrix as used here is symmetric, currently this function assumes that it is filled on both sides of the diagonal.

 -- Function: gal_data_t * gal_binary_connected_adjacency_list (gal_list_sizet_t **listarr, size_t number, size_t minmapsize, int quietmmap, size_t *numconnected)
Find the number of connected labels and new labels based on an adjacency list. The output of this function is identical to that of ‘gal_binary_connected_adjacency_matrix’. The major difference is that it uses a list of the labels connected to each label instead of a square adjacency matrix. This is done because when the number of labels becomes very large (for example on the scale of 100,000), the adjacency matrix can consume more than 10GB of RAM!

The input list has the following format: it is an array of pointers to ‘gal_list_sizet_t *’ (or ‘gal_list_sizet_t **’). The array has ‘number’ elements and each ‘listarr[i]’ is a linked list of ‘gal_list_sizet_t *’. As a demonstration, the input of the same example in ‘gal_binary_connected_adjacency_matrix’ would look like below and the output of this function will be identical to there.

     listarr[0] = NULL
     listarr[1] = 3
     listarr[2] = NULL
     listarr[3] = 1 -> 5
     listarr[4] = NULL
     listarr[5] = 3

From this example, it is already clear that this method will consume far less memory. But because it needs to parse lists (and cannot easily jump between array elements), it can be slower.
But in scenarios with so many objects that the adjacency matrix may exceed the whole system’s RAM+SWAP, this function is a good alternative and the drop in processing speed is worth getting the job done.

Similar to ‘gal_binary_connected_adjacency_matrix’, this function will write the final number of connected labels in ‘numconnected’. But since it takes no ‘gal_data_t *’ argument (from which it could inherit the ‘minmapsize’ and ‘quietmmap’ parameters), it also needs these as input. For more on ‘minmapsize’ and ‘quietmmap’, see *note Memory management::.

 -- Function: gal_data_t * gal_binary_holes_label (gal_data_t *input, int connectivity, size_t *numholes)
Label all the holes in the foreground (non-zero elements in input) as independent regions. Holes are background regions (zero-valued in input) that are fully surrounded by the foreground, as defined by ‘connectivity’. The returned dataset has a 32-bit signed integer type with the size of the input. All holes in the input will have labels/counters greater than or equal to ‘1’. The rest of the background regions will still have a value of ‘0’ and the initial foreground pixels will have a value of ‘-1’. The total number of holes will be written where ‘numholes’ points to.

 -- Function: void gal_binary_holes_fill (gal_data_t *input, int connectivity, size_t maxsize)
Fill all the holes (0 valued pixels surrounded by 1 valued pixels) of the binary ‘input’ dataset. The connectivity of the holes can be set with ‘connectivity’. Holes larger than ‘maxsize’ are not filled. This function currently only works on a 2D dataset.

File: gnuastro.info, Node: Labeled datasets, Next: Convolution functions, Prev: Binary datasets, Up: Gnuastro library

11.3.24 Labeled datasets (‘label.h’)
------------------------------------

A labeled dataset is one where each element/pixel has an integer label (or counter). The label identifies the group/class that the element belongs to.
This form of labeling allows the higher-level study of all pixels within a certain class. For example, to detect objects/targets in an image/dataset, you can apply a threshold to separate the noise from the signal (to detect diffuse signal, a threshold is useless and more advanced methods are necessary, for example *note NoiseChisel::). But the output of detection is a binary dataset (which is just a very low-level labeling of ‘0’ for noise and ‘1’ for signal). The raw detection map is therefore hardly useful for any kind of analysis on objects/targets in the image.

One solution is to use a connected-components algorithm (see ‘gal_binary_connected_components’ in *note Binary datasets::). It is a simple and useful way to separate/label connected patches in the foreground. This higher-level (but still elementary) labeling therefore allows you to count how many connected patches of signal there are in the dataset and is a major improvement compared to the raw detection. However, when your objects/targets are touching, the simple connected-components algorithm is not enough and a still higher-level labeling mechanism is necessary. This brings us to the necessity of the functions in this part of Gnuastro’s library.

The main inputs to the functions in this section are already labeled datasets (for example with the connected-components algorithm above). Each of the labeled regions is independent of the others (the labels specify different classes of targets). Therefore, especially in large datasets, it is often useful to process each label on an independent CPU thread in parallel rather than in series. Hence the functions of this section actually use an array of pixel/element indices (belonging to each label/class) as the main identifier of a region. Using indices will also allow processing of overlapping labels (for example in deblending problems). Just note that overlapping labels are not yet implemented, but planned.
You can use ‘gal_label_indexs’ to generate lists of indices belonging to separate classes from the labeled input.

 -- Macro: GAL_LABEL_INIT
 -- Macro: GAL_LABEL_RIVER
 -- Macro: GAL_LABEL_TMPCHECK
Special negative integer values used internally by some of the functions in this section. Recall that meaningful labels are considered to be positive integers ($\geq1$). Zero is conventionally kept for regions with no labels, therefore negative integers can be used for any extra classification in the labeled datasets.

 -- Function: gal_data_t * gal_label_indexs (gal_data_t *labels, size_t numlabs, size_t minmapsize, int quietmmap)
Return an array of ‘gal_data_t’ containers, each containing the pixel indices of the respective label (see *note Generic data container::). ‘labels’ contains the label of each element and has to have a ‘GAL_TYPE_INT32’ type (see *note Library data types::). Only positive (greater than zero) values in ‘labels’ will be used/indexed, other elements will be ignored. Meaningful labels start from ‘1’ and not ‘0’, therefore the output array of ‘gal_data_t’ will contain ‘numlabs+1’ elements. The first (zero-th) element of the output (‘indexs[0]’ in the example below) will be initialized to a dataset with zero elements. This will allow easy (non-confusing) access to the indices of each (meaningful) label.

‘numlabs’ is the number of labels in the dataset. If it is given a value of zero, then the maximum value in the input (largest label) will be found and used. Therefore if it is given, but smaller than the actual number of labels, this function may/will crash (it will write in unallocated space). ‘numlabs’ is therefore useful in a highly optimized/checked environment.

For example, if the returned array is called ‘indexs’, then ‘indexs[10].size’ contains the number of elements that have a label of ‘10’ in ‘labels’ and ‘indexs[10].array’ is an array (after casting to ‘size_t *’) containing the indices of each one of those elements/pixels.
By _index_ we mean the 1D position: the input number of dimensions is irrelevant (any dimensionality is supported). In other words, each element’s index is the number of elements/pixels between it and the dataset’s first element/pixel. Therefore it is always greater than or equal to zero and stored in the ‘size_t’ type.

 -- Function: size_t gal_label_watershed (gal_data_t *values, gal_data_t *indexs, gal_data_t *label, size_t *topinds, int min0_max1)
Use the watershed algorithm(1) to “over-segment” the pixels in the ‘indexs’ dataset based on values in the ‘values’ dataset. Internally, each local extremum (maximum or minimum, based on ‘min0_max1’) and its surrounding pixels will be given a unique label. For demonstration, see Figures 8 and 9 of Akhlaghi and Ichikawa [2015] (http://arxiv.org/abs/1505.01664). If ‘topinds!=NULL’, it is assumed to point to an already allocated space to write the index of each clump’s local extremum, otherwise, it is ignored.

The ‘values’ dataset must have a 32-bit floating point type (‘GAL_TYPE_FLOAT32’, see *note Library data types::) and will only be read by this function. ‘indexs’ must contain the indices of the elements/pixels that will be over-segmented by this function and have a ‘GAL_TYPE_SIZE_T’ type, see the description of ‘gal_label_indexs’, above. The final labels will be written in the respective positions of ‘label’, which must have a ‘GAL_TYPE_INT32’ type and be the same size as ‘values’.

When ‘indexs’ is already sorted, this function will ignore ‘min0_max1’. To judge if the dataset is sorted or not (by the values the indices correspond to in ‘values’, not the actual indices), this function will look into the bits of ‘indexs->flag’; for the respective bit flags, see *note Generic data container::. If ‘indexs’ is not already sorted, this function will sort it according to the values of the respective pixel in ‘values’. The increasing/decreasing order will be determined by ‘min0_max1’.
Note that if this function is called on multiple threads _and_ ‘values’ points to a different array on each thread, this function will not return a reasonable result. In this case, please sort ‘indexs’ prior to calling this function (see ‘gal_qsort_index_multi_d’ in *note Qsort functions::).

When ‘indexs’ is decreasing (increasing), or ‘min0_max1’ is ‘1’ (‘0’), local minima (maxima) are considered rivers (watersheds) and given a label of ‘GAL_LABEL_RIVER’ (see above).

Note that rivers/watersheds will also be formed on the edges of the labeled regions or when the labeled pixels touch a blank pixel. Therefore this function will need to check for the presence of blank values. To be most efficient, it is thus recommended to use ‘gal_blank_present’ (with ‘updateflag=1’) prior to calling this function (see *note Library blank values::). Once the flag has been set, no other function (including this one) that needs special behavior for blank pixels will have to parse the dataset to see if it has blank values any more. If you are sure your dataset doesn’t have blank values (by the design of your software), to avoid an extra parsing of the dataset and improve performance, you can set the two bits manually (see the description of ‘flags’ in *note Generic data container::):

     input->flag |= GAL_DATA_FLAG_BLANK_CH;  /* Set bit to 1. */
     input->flag &= ~GAL_DATA_FLAG_HASBLANK; /* Set bit to 0. */

 -- Function: void gal_label_clump_significance (gal_data_t *values, gal_data_t *std, gal_data_t *label, gal_data_t *indexs, struct gal_tile_two_layer_params *tl, size_t numclumps, size_t minarea, int variance, int keepsmall, gal_data_t *sig, gal_data_t *sigind)
This function is usually called after ‘gal_label_watershed’, and is used as a measure to identify which over-segmented “clumps” are real and which are noise. A measurement is done on each clump (using the ‘values’ and ‘std’ datasets, see below).
To help in multi-threaded environments, the operation is only done on pixels which are indexed in ‘indexs’. It is expected for ‘indexs’ to be sorted by their values in ‘values’. If not sorted, the measurement may not be reliable. If sorted in a decreasing order, then clump building will start from their highest value and vice-versa. See the description of ‘gal_label_watershed’ for more on ‘indexs’.

Each “clump” (identified by a positive integer) is assumed to be surrounded by at least one river/watershed pixel (with a non-positive label). This function will parse the pixels identified in ‘indexs’ and make a measurement on each clump and over all the river/watershed pixels. The number of clumps (‘numclumps’) must be given as an input argument and any clump that is smaller than ‘minarea’ is ignored (because of scatter). If ‘variance’ is non-zero, then the ‘std’ dataset is interpreted as variance, not standard deviation.

The ‘values’ and ‘std’ datasets must have a ‘float’ (32-bit floating point) type. Also, ‘label’ and ‘indexs’ must respectively have ‘int32’ and ‘size_t’ types. ‘values’ and ‘label’ must have the same size, but ‘std’ can have three possible sizes: 1) a single element (which will be used for the whole dataset), 2) the same size as ‘values’ (so a different error can be assigned to every pixel), 3) a single value for each tile, based on the ‘tl’ tessellation (see *note Tile grid::). In the last case, a tile/value will be associated to each clump based on its flux-weighted (only positive values) center.

The main output is an internally allocated, 1-dimensional array with one value per label. The array information (length, type, etc.) will be written into the ‘sig’ generic data container. Therefore ‘sig->array’ must be ‘NULL’ when this function is called.
After this function, the details of the array (number of elements, type and size, etc.) will be written into the various components of ‘sig’; see the definition of ‘gal_data_t’ in *note Generic data container::. Therefore ‘sig’ must already be allocated before calling this function. Optionally (when ‘sigind!=NULL’, similar to ‘sig’) the clump labels of each measurement in ‘sig’ will be written in ‘sigind->array’. If ‘keepsmall’ is zero, small clumps (where no measurement is made) will not be included in the output table.

This function is initially intended for a multi-threaded environment. In such cases, you will be writing arrays of clump measures from different regions in parallel into an array of ‘gal_data_t’s. You can simply allocate (and initialize) such an array with the ‘gal_data_array_calloc’ function in *note Arrays of datasets::. For example if the ‘gal_data_t’ array is called ‘array’, you can pass ‘&array[i]’ as ‘sig’.

Along with some other functions in ‘label.h’, this function was initially written for *note Segment::. The description of the parameter used to measure a clump’s significance is fully given in Akhlaghi [2019] (https://arxiv.org/abs/1909.11230).

 -- Function: void gal_label_grow_indexs (gal_data_t *labels, gal_data_t *indexs, int withrivers, int connectivity)
Grow the (positive) labels of ‘labels’ over the pixels in ‘indexs’ (see the description of ‘gal_label_indexs’). The pixels (position in ‘indexs’, values in ‘labels’) that must be “grown” must have a value of ‘GAL_LABEL_INIT’ in ‘labels’ before calling this function. For a demonstration see Columns 2 and 3 of Figure 10 in Akhlaghi and Ichikawa [2015] (http://arxiv.org/abs/1505.01664).

In many aspects, this function is very similar to over-segmentation (the watershed algorithm, ‘gal_label_watershed’). The big difference is that in over-segmentation, local maxima (that are not touching any already labeled pixel) get a separate label. Here, however, the final number of labels will not change.
All pixels that aren’t directly touching a labeled pixel just get pushed back to the start of the loop, and the loop iterates until its size doesn’t change any more. This is because in a generic scenario some of the indexed pixels might not be reachable through other indexed pixels.

The next major difference with over-segmentation is that when there is only one label in the growth region(s), it is not mandatory for ‘indexs’ to be sorted by values. If there are multiple labeled regions in the growth region(s), then values are important and you can use ‘qsort’ with ‘gal_qsort_index_single_d’ to sort the indices by values in a separate array (see *note Qsort functions::).

This function looks for positive-valued neighbors of each pixel in ‘indexs’ and will label a pixel if it touches one. Therefore, it is very important that only pixels/labels that are intended for growth have positive values in ‘labels’ before calling this function. Any non-positive (zero or negative) value will be ignored as a label by this function. Thus, it is recommended that while filling in the ‘indexs’ array values, you initialize all the pixels that are in ‘indexs’ with ‘GAL_LABEL_INIT’, and set non-labeled pixels that you don’t want to grow to ‘0’.

This function will write into both the input datasets. After this function, some of the non-positive ‘labels’ pixels will have a new positive label and the number of useful elements in ‘indexs’ will have decreased. The indices of those pixels that couldn’t be labeled will remain inside ‘indexs’. If ‘withrivers’ is non-zero, then pixels that are immediately touching more than one positive value will be given a ‘GAL_LABEL_RIVER’ label.

Note that the ‘indexs->array’ is not re-allocated to its new size at the end(2). But since ‘indexs->dsize[0]’ and ‘indexs->size’ have new values after this function is returned, the extra elements just won’t be used until they are ultimately freed by ‘gal_data_free’.
Connectivity is a value between ‘1’ (fewest number of neighbors) and the number of dimensions in the input (most number of neighbors). For example in a 2D dataset, connectivities of ‘1’ and ‘2’ correspond to 4-connected and 8-connected neighbors, respectively.

---------- Footnotes ----------

(1) The watershed algorithm was initially introduced by Vincent and Soille (https://doi.org/10.1109/34.87344). It starts from the minima and puts the pixels in, one by one, to grow them until they touch (create a watershed). For more, also see the Wikipedia article: .

(2) Note that according to the GNU C Library, even a ‘realloc’ to a smaller size can also cause a re-write of the whole array, which is not a cheap operation.

File: gnuastro.info, Node: Convolution functions, Next: Interpolation, Prev: Labeled datasets, Up: Gnuastro library

11.3.25 Convolution functions (‘convolve.h’)
--------------------------------------------

Convolution is a very common operation during data analysis and is thoroughly described as part of Gnuastro’s *note Convolve:: program, which is fully devoted to this job. Because of the complete introduction that was presented there, we will directly skip onto the currently available convolution functions in Gnuastro’s library.

As of this version, only spatial domain convolution is available in Gnuastro’s libraries. We haven’t had the time to liberate the frequency domain convolution and de-convolution functions that are available in the Convolve program(1).

 -- Function: gal_data_t * gal_convolve_spatial (gal_data_t *tiles, gal_data_t *kernel, size_t numthreads, int edgecorrection, int convoverch)
Convolve the given ‘tiles’ dataset (possibly a list of tiles, see *note List of gal_data_t:: and *note Tessellation library::) with ‘kernel’ on ‘numthreads’ threads. When ‘edgecorrection’ is non-zero, it will correct for the edge dimming effects as discussed in *note Edges in the spatial domain::.
‘tiles’ can be a single/complete dataset, but in that case the speed will be very slow. Therefore, for larger images, it is recommended to give a list of tiles covering a dataset. To create a tessellation that fully covers an input image, you may use ‘gal_tile_full’, or ‘gal_tile_full_two_layers’ to also define channels over your input dataset. These functions are discussed in *note Tile grid::. You may then pass the list of tiles to this function. This is the recommended way to call this function because spatial domain convolution is slow: breaking the job into many small tiles and working simultaneously on several threads can greatly speed up the processing.

If the tiles are defined within a channel (a larger tile), by default convolution will be done within the channel, so pixels on the edge of a channel will not be affected by their neighbors that are in another channel. See *note Tessellation:: for the necessity of channels in astronomical data analysis. This behavior may be disabled when ‘convoverch’ is non-zero. In this case, it will ignore channel borders (if they exist) and mix all pixels that cover the kernel within the dataset.

 -- Function: void gal_convolve_spatial_correct_ch_edge (gal_data_t *tiles, gal_data_t *kernel, size_t numthreads, int edgecorrection, gal_data_t *tocorrect)
Correct the edges of channels in an already convolved image when it was initially convolved with ‘gal_convolve_spatial’ and ‘convoverch==0’. In that case, strong boundaries might exist on the channel edges. So if you need to remove those boundaries at later steps of your processing, you can call this function. It will only do convolution on the tiles that are near the edge and were affected by the channel borders. Other pixels in the image will not be touched. Hence, it is much faster.

---------- Footnotes ----------

(1) Hence any help would be greatly appreciated.
File: gnuastro.info, Node: Interpolation, Next: Git wrappers, Prev: Convolution functions, Up: Gnuastro library

11.3.26 Interpolation (‘interpolate.h’)
---------------------------------------

During data analysis, it happens that parts of the data cannot be given a value, but one is necessary for the higher-level analysis. For example a very bright star may have saturated part of your image and you need to fill in the saturated pixels with some values. Another common usage case is masked sky-lines in 1D spectra that similarly need to be assigned a value for higher-level analysis. In other situations, you might want a value in an arbitrary point: between the elements/pixels where you have data. The functions described in this section are for such operations.

The parametric interpolations discussed below are wrappers around the interpolation functions of the GNU Scientific Library (or GSL, see *note GNU Scientific Library::). To identify the different GSL interpolation types, Gnuastro’s ‘gnuastro/interpolate.h’ header file contains macros that are discussed below. The GSL wrappers provided here are not yet complete because we are too busy. If you need them, please consider helping us in adding them to Gnuastro’s library. Your help would be very welcome and appreciated.

 -- Macro: GAL_INTERPOLATE_NEIGHBORS_METRIC_RADIAL
 -- Macro: GAL_INTERPOLATE_NEIGHBORS_METRIC_MANHATTAN
 -- Macro: GAL_INTERPOLATE_NEIGHBORS_METRIC_INVALID
The metric used to find the distance for nearest neighbor interpolation. A radial metric uses the simple Euclidean function to find the distance between two pixels. A manhattan metric will always be an integer and is like steps (but is also much faster to calculate than the radial metric because it doesn’t need a square root calculation).
 -- Macro: GAL_INTERPOLATE_NEIGHBORS_FUNC_MIN
 -- Macro: GAL_INTERPOLATE_NEIGHBORS_FUNC_MAX
 -- Macro: GAL_INTERPOLATE_NEIGHBORS_FUNC_MEDIAN
 -- Macro: GAL_INTERPOLATE_NEIGHBORS_FUNC_INVALID
The various types of nearest-neighbor interpolation functions for ‘gal_interpolate_neighbors’. The names are descriptive for the operation they do, so we won’t go into much more detail here. The median operator will be one of the most used, but operators like the maximum are good to fill the center of saturated stars.

 -- Function: gal_data_t * gal_interpolate_neighbors (gal_data_t *input, struct gal_tile_two_layer_params *tl, uint8_t metric, size_t numneighbors, size_t numthreads, int onlyblank, int aslinkedlist, int function)
Interpolate the values in the input dataset using a statistic calculated from the distribution of their ‘numneighbors’ closest neighbors. The desired statistic is determined from the ‘function’ argument, which takes any of the ‘GAL_INTERPOLATE_NEIGHBORS_FUNC_’ macros (see above). This function is non-parametric and thus agnostic to the input’s number of dimensions or the shape of the distribution. Distance can be defined on different metrics that are identified through ‘metric’ (taking values determined by the ‘GAL_INTERPOLATE_NEIGHBORS_METRIC_’ macros described above). If ‘onlyblank’ is non-zero, then only blank elements will be interpolated and pixels that already have a value will be left untouched. This function is multi-threaded and will run on ‘numthreads’ threads (see ‘gal_threads_number’ in *note Multithreaded programming::).

‘tl’ is Gnuastro’s tessellation structure used to define tiles over an image and is fully described in *note Tile grid::. When ‘tl!=NULL’, it is assumed that the ‘input->array’ contains one value per tile and interpolation will respect certain tessellation properties, for example to not interpolate over channel borders.

If several datasets have the same set of blank values, you don’t need to call this function multiple times.
When ‘aslinkedlist’ is non-zero, ‘input’ will be seen as a *note List of gal_data_t::. In this case, the same neighbors will be used for all the datasets in the list. Of course, the values for each dataset will be different, so a different value will be written in each dataset, but the neighbor checking, which is the most CPU-intensive part, will only be done once.

This is a non-parametric and robust function for interpolation. The interpolated values are always within the range of the non-blank values and strong outliers do not get created. However, this type of interpolation must be used with care when there are gradients: because it is non-parametric, if there aren’t enough neighbors, step-like features can be created.

 -- Macro: GAL_INTERPOLATE_1D_INVALID
This is just a placeholder to manage errors.

 -- Macro: GAL_INTERPOLATE_1D_LINEAR
[From GSL:] Linear interpolation. This interpolation method does not require any additional memory.

 -- Macro: GAL_INTERPOLATE_1D_POLYNOMIAL
[From GSL:] Polynomial interpolation. This method should only be used for interpolating small numbers of points because polynomial interpolation introduces large oscillations, even for well-behaved datasets. The number of terms in the interpolating polynomial is equal to the number of points.

 -- Macro: GAL_INTERPOLATE_1D_CSPLINE
[From GSL:] Cubic spline with natural boundary conditions. The resulting curve is piecewise cubic on each interval, with matching first and second derivatives at the supplied data-points. The second derivative is chosen to be zero at the first point and last point.

 -- Macro: GAL_INTERPOLATE_1D_CSPLINE_PERIODIC
[From GSL:] Cubic spline with periodic boundary conditions. The resulting curve is piecewise cubic on each interval, with matching first and second derivatives at the supplied data-points. The derivatives at the first and last points are also matched.
Note that the last point in the data must have the same y-value as the first point, otherwise the resulting periodic interpolation will have a discontinuity at the boundary. -- Macro: GAL_INTERPOLATE_1D_AKIMA [From GSL:] Non-rounded Akima spline with natural boundary conditions. This method uses the non-rounded corner algorithm of Wodicka. -- Macro: GAL_INTERPOLATE_1D_AKIMA_PERIODIC [From GSL:] Non-rounded Akima spline with periodic boundary conditions. This method uses the non-rounded corner algorithm of Wodicka. -- Macro: GAL_INTERPOLATE_1D_STEFFEN [From GSL:] Steffen’s method(1) guarantees the monotonicity of the interpolating function between the given data points. Therefore, minima and maxima can only occur exactly at the data points, and there can never be spurious oscillations between data points. The interpolated function is piecewise cubic in each interval. The resulting curve and its first derivative are guaranteed to be continuous, but the second derivative may be discontinuous. -- Function: gsl_spline * gal_interpolate_1d_make_gsl_spline (gal_data_t *X, gal_data_t *Y, int type_1d) Allocate and initialize a GNU Scientific Library (GSL) 1D ‘gsl_spline’ structure using the non-blank elements of ‘Y’. ‘type_1d’ identifies the interpolation scheme and must be one of the ‘GAL_INTERPOLATE_1D_*’ macros defined above. If ‘X==NULL’, the X-axis is assumed to be integers starting from zero (the index of each element in ‘Y’). Otherwise, the values in ‘X’ will be used to initialize the interpolation structure. Note that when given, ‘X’ must _not_ contain any blank elements and it must be sorted (in increasing order). Each interpolation scheme needs a minimum number of elements to successfully operate. If the number of non-blank values in ‘Y’ is less than this number, this function will return a ‘NULL’ pointer. To be as generic and modular as possible, GSL’s tools are low-level. 
Therefore before doing the interpolation, many steps are necessary (like preparing your dataset, then allocating and initializing ‘gsl_spline’). The metadata available in Gnuastro’s *note Generic data container:: make it easy to hide all those preparations within this function.

Once ‘gsl_spline’ has been initialized by this function, the interpolation can be evaluated for any X value within the non-blank range of the input using ‘gsl_spline_eval’ or ‘gsl_spline_eval_e’.

For example in the small program below, we read the first two columns of the table in ‘table.txt’ and feed them to this function to later estimate the values in the second column for three selected points. You can use *note BuildProgram:: to compile and run this function, see *note Library demo programs:: for more.

     #include <stdio.h>
     #include <stdlib.h>
     #include <gnuastro/table.h>
     #include <gnuastro/interpolate.h>

     int
     main(void)
     {
       size_t i;
       gal_data_t *X, *Y;
       gsl_spline *spline;
       gsl_interp_accel *acc;
       gal_list_str_t *cols=NULL;

       /* Change the values based on your input table. */
       double points[]={1.8, 2.5, 10.3};

       /* Read the first two columns from `table.txt'.
          IMPORTANT: the list is first-in-first-out, so the output
          column order is the inverse of the input order. */
       gal_list_str_add(&cols, "1", 0);
       gal_list_str_add(&cols, "2", 0);
       Y=gal_table_read("table.txt", NULL, cols, GAL_TABLE_SEARCH_NAME,
                        0, -1, 1, NULL);
       X=Y->next;

       /* Allocate the GSL interpolation accelerator and make the
          `gsl_spline' structure. */
       acc=gsl_interp_accel_alloc();
       spline=gal_interpolate_1d_make_gsl_spline(X, Y,
                                        GAL_INTERPOLATE_1D_STEFFEN);

       /* Calculate the respective value for all the given points,
          if `spline' could be allocated. */
       if(spline)
         for(i=0; i<(sizeof points)/(sizeof *points); ++i)
           printf("%f: %f\n", points[i],
                  gsl_spline_eval(spline, points[i], acc));

       /* Clean up and return. */
       gal_data_free(X);
       gal_data_free(Y);
       gsl_spline_free(spline);
       gsl_interp_accel_free(acc);
       gal_list_str_free(cols, 0);
       return EXIT_SUCCESS;
     }

 -- Function: void gal_interpolate_1d_blank (gal_data_t *in, int type_1d)
Fill the blank elements of ‘in’ using the rest of the elements and the given interpolation. The interpolation scheme can be set through ‘type_1d’, which accepts any of the ‘GAL_INTERPOLATE_1D_*’ macros above. The interpolation is internally done in 64-bit floating point type (‘double’). However the evaluated/interpolated values (originally blank) will be written (in ‘in’) with its original numeric datatype, using C’s standard type conversion.

By definition, interpolation is only defined “between” valid points. Therefore, if any number of elements on the start or end of the 1D array are blank, those elements will not be interpolated and will remain blank. To see if any blank (non-interpolated) elements remain, you can use ‘gal_blank_present’ on ‘in’ after this function is finished.

---------- Footnotes ----------

(1)

File: gnuastro.info, Node: Git wrappers, Next: Unit conversion library, Prev: Interpolation, Up: Gnuastro library

11.3.27 Git wrappers (‘git.h’)
------------------------------

Git is one of the most common tools for version control and it can often be useful during development, for example see the ‘COMMIT’ keyword in *note Output FITS files::. At installation time, Gnuastro will also check for the existence of libgit2, and store the value in the ‘GAL_CONFIG_HAVE_LIBGIT2’ macro, see *note Configuration information:: and *note Optional dependencies::. ‘gnuastro/git.h’ includes ‘gnuastro/config.h’ internally, so you won’t have to include both for this macro.

 -- Function: char * gal_git_describe ( )
When libgit2 is present and the program is called within a directory that is version controlled, this function will return a string containing the commit description (similar to Gnuastro’s unofficial version number, see *note Version numbering::).
     If there are uncommitted changes in the running directory, it will add a ‘-dirty’ suffix to the description. When there is no tag in the commit history, this function will return a uniquely abbreviated commit object as a fallback. This function is used for generating the value of the ‘COMMIT’ keyword in *note Output FITS files::. The output string is similar to the output of the following command:

          $ git describe --dirty --always

     Space for the output string is allocated within this function, so after using the value you have to ‘free’ the output string. If libgit2 is not installed, or the program calling this function is not within a version controlled directory, then the output will be the ‘NULL’ pointer.

File: gnuastro.info, Node: Unit conversion library, Next: Spectral lines library, Prev: Git wrappers, Up: Gnuastro library

11.3.28 Unit conversion library (‘units.h’)
-------------------------------------------

Datasets can contain values in various formats or units. The functions in this section are defined to facilitate easy conversion between them and are declared in ‘gnuastro/units.h’. If there are certain conversions that are useful for your work, please get in touch.

 -- Function: int gal_units_extract_decimal (char *convert, const char *delimiter, double *args, size_t n)
     Parse the input ‘convert’ string with a certain delimiter (for example ‘01:23:45’, where the delimiter is ‘":"’) as multiple numbers (for example 1, 23, 45) and write them as an array in the space that ‘args’ points to. The expected number of values in the string is specified by the ‘n’ argument (3 in the example above). If the function succeeds, it will return 1, otherwise it will return 0 and the values may not be fully written into ‘args’. If the number of values parsed in the string is different from ‘n’, this function will fail.
 -- Function: double gal_units_ra_to_degree (char *convert)
     Convert the input Right Ascension (RA) string (in the format of hours, minutes and seconds, either as ‘_h_m_s’ or ‘_:_:_’) to degrees (a single floating point number).

 -- Function: double gal_units_dec_to_degree (char *convert)
     Convert the input Declination (Dec) string (in the format of degrees, arc-minutes and arc-seconds, either as ‘_d_m_s’ or ‘_:_:_’) to degrees (a single floating point number).

 -- Function: char * gal_units_degree_to_ra (double decimal, int usecolon)
     Convert the input Right Ascension (RA) degree (a single floating point number) to old/standard notation (in the format of hours, minutes and seconds: ‘_h_m_s’). If ‘usecolon!=0’, then the delimiters between the components will be colons: ‘_:_:_’.

 -- Function: char * gal_units_degree_to_dec (double decimal, int usecolon)
     Convert the input Declination (Dec) degree (a single floating point number) to old/standard notation (in the format of degrees, arc-minutes and arc-seconds: ‘_d_m_s’). If ‘usecolon!=0’, then the delimiters between the components will be colons: ‘_:_:_’.

 -- Function: double gal_units_counts_to_mag (double counts, double zeropoint)
     Convert counts to magnitudes through the given zero point. For more on the equation, see *note Brightness flux magnitude::.

 -- Function: double gal_units_mag_to_counts (double mag, double zeropoint)
     Convert magnitudes to counts through the given zero point. For more on the equation, see *note Brightness flux magnitude::.

 -- Function: double gal_units_counts_to_jy (double counts, double zeropoint_ab)
     Convert counts to Janskys through an AB magnitude-based zero point. For more on the equation, see *note Brightness flux magnitude::.

 -- Function: double gal_units_au_to_pc (double au)
     Convert the input value (assumed to be in Astronomical Units) to Parsecs. For the conversion equation, see the description of the ‘au-to-pc’ operator in *note Arithmetic operators::.
 -- Function: double gal_units_pc_to_au (double pc)
     Convert the input value (assumed to be in Parsecs) to Astronomical Units (AUs). For the conversion equation, see the description of the ‘au-to-pc’ operator in *note Arithmetic operators::.

 -- Function: double gal_units_ly_to_pc (double ly)
     Convert the input value (assumed to be in Light-years) to Parsecs. For the conversion equation, see the description of the ‘ly-to-pc’ operator in *note Arithmetic operators::.

 -- Function: double gal_units_pc_to_ly (double pc)
     Convert the input value (assumed to be in Parsecs) to Light-years. For the conversion equation, see the description of the ‘ly-to-pc’ operator in *note Arithmetic operators::.

 -- Function: double gal_units_ly_to_au (double ly)
     Convert the input value (assumed to be in Light-years) to Astronomical Units. For the conversion equation, see the description of the ‘ly-to-pc’ operator in *note Arithmetic operators::.

 -- Function: double gal_units_au_to_ly (double au)
     Convert the input value (assumed to be in Astronomical Units) to Light-years. For the conversion equation, see the description of the ‘ly-to-pc’ operator in *note Arithmetic operators::.

File: gnuastro.info, Node: Spectral lines library, Next: Cosmology library, Prev: Unit conversion library, Up: Gnuastro library

11.3.29 Spectral lines library (‘speclines.h’)
----------------------------------------------

Gnuastro’s library has the following macros and functions for dealing with spectral lines. All these functions are declared in ‘gnuastro/speclines.h’.
 -- Macro: GAL_SPECLINES_INVALID
 -- Macro: GAL_SPECLINES_SIIRED
 -- Macro: GAL_SPECLINES_SII
 -- Macro: GAL_SPECLINES_SIIBLUE
 -- Macro: GAL_SPECLINES_NIIRED
 -- Macro: GAL_SPECLINES_NII
 -- Macro: GAL_SPECLINES_HALPHA
 -- Macro: GAL_SPECLINES_NIIBLUE
 -- Macro: GAL_SPECLINES_OIIIRED
 -- Macro: GAL_SPECLINES_OIII
 -- Macro: GAL_SPECLINES_OIIIBLUE
 -- Macro: GAL_SPECLINES_HBETA
 -- Macro: GAL_SPECLINES_HEIIRED
 -- Macro: GAL_SPECLINES_HGAMMA
 -- Macro: GAL_SPECLINES_HDELTA
 -- Macro: GAL_SPECLINES_HEPSILON
 -- Macro: GAL_SPECLINES_NEIII
 -- Macro: GAL_SPECLINES_OIIRED
 -- Macro: GAL_SPECLINES_OII
 -- Macro: GAL_SPECLINES_OIIBLUE
 -- Macro: GAL_SPECLINES_BLIMIT
 -- Macro: GAL_SPECLINES_MGIIRED
 -- Macro: GAL_SPECLINES_MGII
 -- Macro: GAL_SPECLINES_MGIIBLUE
 -- Macro: GAL_SPECLINES_CIIIRED
 -- Macro: GAL_SPECLINES_CIII
 -- Macro: GAL_SPECLINES_CIIIBLUE
 -- Macro: GAL_SPECLINES_HEIIBLUE
 -- Macro: GAL_SPECLINES_LYALPHA
 -- Macro: GAL_SPECLINES_LYLIMIT
 -- Macro: GAL_SPECLINES_INVALID_MAX
     Internal values/identifiers for specific spectral lines, as is clear from their names. Note the first and last macros: neither corresponds to any line, but their integer values are just before and after those of the first and last line identifiers. ‘GAL_SPECLINES_INVALID’ has a value of zero, and allows you to have a fixed integer which never corresponds to a line. ‘GAL_SPECLINES_INVALID_MAX’ is the total number of pre-defined lines, plus one. So you can parse all the known lines with a ‘for’ loop like this:

          for(i=1; i<GAL_SPECLINES_INVALID_MAX; ++i)

File: gnuastro.info, Node: Library demo programs, Prev: Gnuastro library, Up: Library

11.4 Library demo programs
==========================

In this final section of *note Library::, we give some example Gnuastro programs to demonstrate various features in the library.
All these programs have been tested and, once Gnuastro is installed, you can compile and run them with Gnuastro’s *note BuildProgram:: program, which will take care of linking issues. If you don’t have any FITS file to experiment on, you can use those that are generated by Gnuastro after ‘make check’ in the ‘tests/’ directory, see *note Quick start::.

* Menu:

* Library demo - reading a image::      Read a FITS image into memory.
* Library demo - inspecting neighbors:: Inspect the neighbors of a pixel.
* Library demo - multi-threaded operation:: Doing an operation on threads.
* Library demo - reading and writing table columns:: Simple Column I/O.

File: gnuastro.info, Node: Library demo - reading a image, Next: Library demo - inspecting neighbors, Prev: Library demo programs, Up: Library demo programs

11.4.1 Library demo - reading a FITS image
------------------------------------------

The following simple program demonstrates how to read a FITS image into memory and use the ‘void *array’ pointer of *note Generic data container::. For easy linking/compilation of this program along with a first run, see *note BuildProgram::. Before running, also change the ‘filename’ and ‘hdu’ variable values to specify an existing FITS file and/or extension/HDU.

This is just intended to demonstrate how to use the ‘array’ pointer of ‘gal_data_t’. Hence it doesn’t do important sanity checks; for example, real datasets may also have blank pixels, in which case this program will return a NaN value (see *note Blank pixels::). So for general statistical information of a dataset, it is much better to use Gnuastro’s *note Statistics:: program, which can deal with blank pixels and many other issues in a generic dataset.

     #include <stdio.h>
     #include <stdlib.h>

     #include <gnuastro/fits.h> /* includes gnuastro's data.h and type.h */

     int
     main(void)
     {
       size_t i;
       float *farray;
       double sum=0.0f;
       gal_data_t *image;
       char *filename="img.fits", *hdu="1";

       /* Read `img.fits' (HDU: 1) as a float32 array. */
       image=gal_fits_img_read_to_type(filename, hdu, GAL_TYPE_FLOAT32,
                                       -1, 1);

       /* Use the allocated space as a single precision floating
        * point array (recall that `image->array' has `void *'
        * type, so it is not directly usable). */
       farray=image->array;

       /* Calculate the sum of all the values. */
       for(i=0; i<image->size; ++i)
         sum += farray[i];

       /* Report the sum. */
       printf("Sum of values in %s (hdu %s) is: %f\n",
              filename, hdu, sum);

       /* Clean up and return. */
       gal_data_free(image);
       return EXIT_SUCCESS;
     }

File: gnuastro.info, Node: Library demo - inspecting neighbors, Next: Library demo - multi-threaded operation, Prev: Library demo - reading a image, Up: Library demo programs

11.4.2 Library demo - inspecting neighbors
------------------------------------------

The following simple program shows how you can inspect the neighbors of a pixel using the ‘GAL_DIMENSION_NEIGHBOR_OP’ function-like macro that was introduced in *note Dimensions::. For easy linking/compilation of this program along with a first run, see *note BuildProgram::. Before running, also change the file name and HDU (first and second arguments to ‘gal_fits_img_read_to_type’) to specify an existing FITS file and/or extension/HDU.

     #include <stdio.h>
     #include <stdlib.h>

     #include <gnuastro/fits.h>
     #include <gnuastro/dimension.h>

     int
     main(void)
     {
       double sum;
       float *array;
       size_t i, num, *dinc;
       gal_data_t *input=gal_fits_img_read_to_type("input.fits", "1",
                                                   GAL_TYPE_FLOAT32,
                                                   -1, 1);

       /* To avoid the `void *' pointer and have `dinc'. */
       array=input->array;
       dinc=gal_dimension_increment(input->ndim, input->dsize);

       /* Go over all the pixels. */
       for(i=0; i<input->size; ++i)
         {
           num=0;
           sum=0.0f;
           GAL_DIMENSION_NEIGHBOR_OP( i, input->ndim, input->dsize,
                                      input->ndim, dinc,
                                      {++num; sum+=array[nind];} );
           printf("%zu: num: %zu, sum: %f\n", i, num, sum);
         }

       /* Clean up and return. */
       gal_data_free(input);
       return EXIT_SUCCESS;
     }

File: gnuastro.info, Node: Library demo - multi-threaded operation, Next: Library demo - reading and writing table columns, Prev: Library demo - inspecting neighbors, Up: Library demo programs

11.4.3 Library demo - multi-threaded operation
----------------------------------------------

The following simple program shows how to use Gnuastro to simplify spinning off threads and distributing different jobs between them. The relevant thread-related functions are defined in *note Gnuastro's thread related functions::. For easy linking/compilation of this program, along with a first run, see Gnuastro’s *note BuildProgram::. Before running, also change the ‘filename’ and ‘hdu’ variable values to specify an existing FITS file and/or extension/HDU.

This is a very simple program: it opens a FITS image, distributes its pixels between different threads and prints the value of each pixel along with the thread it was assigned to. The actual operation is very simple (and would not usually be done with threads in a real-life program). It is intentionally chosen to put more focus on the important steps in spinning off threads and on how the worker function (which is called by each thread) can identify the job-IDs it should work on.

For example, instead of an array of pixels, you can define an array of tiles or any other context-specific structures as separate targets. The important thing is that each action should have its own unique ID (counting from zero, as is done in an array in C). You can then follow the process below and use each thread to work on all the targets that are assigned to it. Recall that spinning off threads is itself an expensive process and we don’t want to spin off one thread for each target (see the description of ‘gal_threads_dist_in_threads’ in *note Gnuastro's thread related functions::).
There are many (more complicated, real-world) examples of using ‘gal_threads_spin_off’ in Gnuastro’s actual source code; you can see them by searching for the ‘gal_threads_spin_off’ function from the top source directory (after unpacking the tarball), for example with this command:

     $ grep -r gal_threads_spin_off ./

The code of this demonstration program is shown below. This program was also built and run when you ran ‘make check’ during the building of Gnuastro (‘tests/lib/multithread.c’), so it is already tested for your system and you can safely use it as a guide.

     #include <stdio.h>
     #include <stdlib.h>

     #include <gnuastro/fits.h>
     #include <gnuastro/threads.h>


     /* This structure can keep all information you want to pass onto
      * the worker function on each thread. */
     struct params
     {
       gal_data_t *image;            /* Dataset to print values of. */
     };


     /* This is the main worker function which will be called by the
      * different threads. `gal_threads_params' is defined in
      * `gnuastro/threads.h' and contains the pointer to the parameter
      * we want. Note that the input argument and returned value of
      * this function always must have `void *' type. */
     void *
     worker_on_thread(void *in_prm)
     {
       /* Low-level definitions to be done first. */
       struct gal_threads_params *tprm=(struct gal_threads_params *)in_prm;
       struct params *p=(struct params *)tprm->params;

       /* Subsequent definitions. */
       float *array=p->image->array;
       size_t i, index, *dsize=p->image->dsize;

       /* Go over all the actions (pixels in this case) that were
        * assigned to this thread. */
       for(i=0; tprm->indexs[i] != GAL_BLANK_SIZE_T; ++i)
         {
           /* For easy reading. */
           index = tprm->indexs[i];

           /* Print the information. */
           printf("(%zu, %zu) on thread %zu: %g\n", index%dsize[1]+1,
                  index/dsize[1]+1, tprm->id, array[index]);
         }

       /* Wait for all the other threads to finish, then return. */
       if(tprm->b) pthread_barrier_wait(tprm->b);
       return NULL;
     }


     /* High-level function (called by the operating system). */
     int
     main(void)
     {
       struct params p;
       char *filename="input.fits", *hdu="1";
       size_t numthreads=gal_threads_number();

       /* We are using `-1' for `minmapsize' to ensure that the image
        * is read into memory and `1' for `quietmmap' (which can also
        * be zero), see the "Memory management" section in the book. */
       int quietmmap=1;
       size_t minmapsize=-1;

       /* Read the image into memory as a float32 data type. */
       p.image=gal_fits_img_read_to_type(filename, hdu, GAL_TYPE_FLOAT32,
                                         minmapsize, quietmmap);

       /* Print some basic information before the actual contents: */
       printf("Pixel values of %s (HDU: %s) on %zu threads.\n", filename,
              hdu, numthreads);
       printf("Used to check the compiled library's capability in "
              "opening a FITS file, and also spinning-off threads.\n");

       /* A small sanity check: this is only intended for 2D arrays (to
        * print the coordinates of each pixel). */
       if(p.image->ndim!=2)
         {
           fprintf(stderr, "only 2D images are supported.");
           exit(EXIT_FAILURE);
         }

       /* Spin-off the threads and do the processing on each thread. */
       gal_threads_spin_off(worker_on_thread, &p, p.image->size,
                            numthreads, minmapsize, quietmmap);

       /* Clean up and return. */
       gal_data_free(p.image);
       return EXIT_SUCCESS;
     }

File: gnuastro.info, Node: Library demo - reading and writing table columns, Prev: Library demo - multi-threaded operation, Up: Library demo programs

11.4.4 Library demo - reading and writing table columns
-------------------------------------------------------

Tables are some of the most common inputs to, and outputs of, programs. This section contains a small program for reading and writing tables using the constructs described in *note Table input output::. For easy linking/compilation of this program, along with a first run, see Gnuastro’s *note BuildProgram::. Before running, also set the following file and column names in the first two lines of ‘main’.
The input and output names may be ‘.txt’ or ‘.fits’ tables; ‘gal_table_read’ and ‘gal_table_write’ can read from, and write to, both formats. For plain text tables, see *note Gnuastro text table format::.

This example program reads three columns from a table. The first two columns are selected by their name (‘NAME1’ and ‘NAME2’) and the third is selected by its number: column 10 (counting from 1). Gnuastro’s column selection is discussed in *note Selecting table columns::. The first and second columns can be any type, but this program will convert them to ‘int32_t’ and ‘float’ respectively for its internal usage. However, the third column must be double for this program; if it isn’t, the program will abort with an error. Having the columns in memory, it will print them out along with their sum (just a simple application; you can do whatever you want at this stage). Reading the table finishes here.

The rest of the program is a demonstration of writing a table. While parsing the rows, this program will change the first column (to be counters) and multiply the second by 10 (so the output will be different). Then it will define the order of the output columns by setting the ‘next’ element (to create a *note List of gal_data_t::). Before writing, this program will also set names for the columns (units and comments can be defined in a similar manner). Writing the columns to a file is then done through a simple call to ‘gal_table_write’.

The operations that are shown in this example program are not always necessary. For example, in many cases, you know the numerical data type of the column before writing your program (see *note Numeric data types::), so type checking and copying to a specific type won’t be necessary.

     #include <stdio.h>
     #include <stdlib.h>

     #include <gnuastro/table.h>

     int
     main(void)
     {
       /* File names and column names (which may also be numbers). */
       char *c1_name="NAME1", *c2_name="NAME2", *c3_name="10";
       char *inname="input.fits", *hdu="1", *outname="out.fits";

       /* Internal parameters. */
       float *array2;
       double *array3;
       int32_t *array1;
       size_t i, counter=0;
       gal_data_t *c1, *c2;
       gal_data_t tmp, *col, *columns;
       gal_list_str_t *column_ids=NULL;

       /* Define the columns to read. */
       gal_list_str_add(&column_ids, c1_name, 0);
       gal_list_str_add(&column_ids, c2_name, 0);
       gal_list_str_add(&column_ids, c3_name, 0);

       /* The columns were added in reverse, so correct it. */
       gal_list_str_reverse(&column_ids);

       /* Read the desired columns. */
       columns = gal_table_read(inname, hdu, column_ids,
                                GAL_TABLE_SEARCH_NAME, 1, -1, 1, NULL);

       /* Go over the columns, we'll assume that you don't know their
        * type a-priori, so we'll check. */
       counter=1;
       for(col=columns; col!=NULL; col=col->next)
         switch(counter++)
           {
           case 1:  /* First column: we want it as int32_t. */
             c1=gal_data_copy_to_new_type(col, GAL_TYPE_INT32);
             array1 = c1->array;
             break;

           case 2:  /* Second column: we want it as float. */
             c2=gal_data_copy_to_new_type(col, GAL_TYPE_FLOAT32);
             array2 = c2->array;
             break;

           case 3:  /* Third column: it MUST be double. */
             if(col->type!=GAL_TYPE_FLOAT64)
               {
                 fprintf(stderr, "Column %s must be float64 type, "
                         "it is %s\n", c3_name,
                         gal_type_name(col->type, 1));
                 exit(EXIT_FAILURE);
               }
             array3 = col->array;
             break;
           }

       /* As an example application we'll just print them out. In the
        * meantime (just for a simple demonstration), change the first
        * array value to the counter and multiply the second by 10. */
       for(i=0; i<c1->size; ++i)
         {
           printf("%zu: %d + %f + %f = %f\n", i+1, array1[i], array2[i],
                  array3[i], array1[i]+array2[i]+array3[i]);
           array1[i]  = i+1;
           array2[i] *= 10;
         }

       /* Link the first two columns as a list. */
       c1->next = c2;
       c2->next = NULL;

       /* Set names for the columns and write them out. */
       c1->name = "COUNTER";
       c2->name = "VALUE";
       gal_table_write(c1, NULL, NULL, GAL_TABLE_FORMAT_BFITS,
                       outname, "MY-COLUMNS", 0);

       /* The names weren't allocated, so to avoid cleaning-up
        * problems, we'll set them to NULL. */
       c1->name = c2->name = NULL;

       /* Clean up and return. */
       gal_data_free(c1);
       gal_data_free(c2);
       gal_list_data_free(columns);
       gal_list_str_free(column_ids, 0); /* strings weren't allocated. */
       return EXIT_SUCCESS;
     }

File: gnuastro.info, Node: Developing, Next: Gnuastro programs list, Prev: Library, Up: Top

12 Developing
*************

The basic idea of GNU Astronomy Utilities is for an interested astronomer to be able to easily understand the code of any of the programs or libraries, to be able to modify the code if s/he feels there is an improvement, and finally, to be able to add new programs or libraries for their own benefit, and for the larger community if they are willing to share it. In short, we hope that, at least from the software point of view, the “obscurantist faith in the expert’s special skill and in his personal knowledge and authority” can be broken, see *note Science and its tools::.

With this aim in mind, Gnuastro was designed to have a very basic, simple, and easy to understand architecture for any interested inquirer. This chapter starts with very general design choices, in particular *note Why C:: and *note Program design philosophy::. It will then get a little more technical about the Gnuastro code and file/directory structure in *note Coding conventions:: and *note Program source::. *note The TEMPLATE program:: discusses a minimal (and working) template to help in creating new programs or easier learning of a program’s internal structure. Some other general issues about documentation, building and debugging are then discussed. This chapter concludes with how you can learn about the development and get involved in *note Gnuastro project webpage::, *note Developing mailing lists:: and *note Contributing to Gnuastro::.

* Menu:

* Why C::                        Why Gnuastro is designed in C.
* Program design philosophy::    General ideas behind the package structure.
* Coding conventions::           Gnuastro coding conventions.
* Program source::               Conventions for the code.
* Documentation::                Documentation is an integral part of Gnuastro.
* Building and debugging::       Build and possibly debug during development.
* Test scripts::                 Understanding the test scripts.
* Bash programmable completion:: Auto-completions for better user experience.
* Developer's checklist::        Checklist to finalize your changes.
* Gnuastro project webpage::     Central hub for Gnuastro activities.
* Developing mailing lists::     Stay up to date with Gnuastro’s development.
* Contributing to Gnuastro::     Share your changes with all users.

File: gnuastro.info, Node: Why C, Next: Program design philosophy, Prev: Developing, Up: Developing

12.1 Why C programming language?
================================

Currently, the programming languages that are commonly used in scientific applications are C++(1), Java(2), Python(3), and Julia(4) (which is a newcomer but swiftly gaining ground). One of the main reasons behind choosing these is their high-level abstractions. However, GNU Astronomy Utilities is fully written in the C programming language(5). The reasons can be summarized as simplicity, portability and efficiency/speed. All three are very important in scientific software and we will discuss them below.

Simplicity can best be demonstrated in a comparison of the main books of C++ and C. The “C programming language”(6) book, written by the authors of C, is only 286 pages and covers a very good fraction of the language; it has also remained unchanged since 1988. C is the main programming language of nearly all operating systems and there is no plan of any significant update. On the other hand, the most recent “C++ programming language”(7) book, also written by its author, has 1366 pages and its fourth edition came out in 2013! As discussed in *note Science and its tools::, it is very important for other scientists to be able to readily read the code of a program at their will with minimum requirements.
In C++ or Java, inheritance in the object-oriented programming paradigm and their internal functions make the code very easy to write for a programmer who is deeply invested in those objects and understands all their relations well. But it simultaneously makes reading the program for a first-time reader (a curious scientist who wants to know only how a small step was done) extremely hard. Before understanding the methods, the scientist has to invest a lot of time and energy in understanding those objects and their relations. But in C, everything is done with basic language types, for example ‘int’s or ‘float’s and their pointers to define arrays. So when an outside reader is only interested in one part of the program, that part is all they have to understand.

Recently it has also become common to write scientific software in Python, or in a combination of it with C or C++. Python is a high-level scripting language which doesn’t need compilation. It is very useful when you want to do something on the go and don’t want to be halted by the troubles of compiling, linking, memory checking, etc. When the datasets are small and the job is temporary, this ability of Python is great and is highly encouraged. A very good example might be plotting, in which Python is undoubtedly one of the best.

But as the datasets increase in size and the processing becomes more complicated, the speed of Python scripts significantly decreases. So when the program doesn’t change too often and is widely used in a large community, mostly on large datasets (like astronomical images), using Python will waste a lot of valuable research-hours. It is possible to wrap C or C++ functions with Python to fix the speed issue. But this creates further complexity, because the interested scientist has to master two programming languages and their connection (which is not trivial).
Like C++, Python is object-oriented, so as explained above, it needs a high level of experience with that particular program to reasonably understand its inner workings. To make things worse, since it is mainly for on-the-go programming(8), it can undergo significant changes. One recent example is how Python 2.x and Python 3.x are not compatible. Lots of research teams that invested heavily in Python 2.x cannot benefit from Python 3.x or future versions any more. Some converters are available, but since they are automatic, lots of complications might arise in the conversion(9). If a research project begins using Python 3.x today, there is no telling how compatible their investments will be when Python 4.x or 5.x comes out.

Java is also fully object-oriented, but uses a different paradigm: its compilation generates a hardware-independent _bytecode_, and a _Java Virtual Machine_ (JVM) is required for the actual execution of this bytecode on a computer. Java also evolved with time, and tried to remain backward compatible, but inevitably some evolutions required discontinuities and replacements of a few Java components, which were first declared as becoming _deprecated_, and removed from later versions.

This stems from the core principles of high-level languages like Python or Java: they evolve significantly on the scale of roughly 5 to 10 years. They are therefore useful when you want to solve a short-term problem and are ready to pay the high cost of keeping your software up to date with all the changes in the language. This is fine for private companies, but usually too expensive for scientific projects that have limited funding for a fixed period. As a result, the reproducibility of the result (the ability to regenerate the result in the future, which is a core principle of any scientific result) and the re-usability of all the investments that went into the science software will be lost to future generations!
Rebuilding all the dependencies of a software in an obsolete language is not easy, and may even be impossible. Future-proof code (as long as current operating systems are used) is therefore written in C.

The portability of C is best demonstrated by the fact that C++, Java and Python are part of the C-family of programming languages, which also includes Julia, Perl, and many other languages. C libraries can be immediately included in C++, and it is easy to write wrappers for them in all C-family programming languages. This will allow other scientists to benefit from C libraries using any C-family language that they prefer. As a result, Gnuastro’s library is already usable in C and C++, and wrappers will be(10) added for higher-level languages like Python, Julia and Java.

The final reason was speed. This is another very important aspect of C which is not independent of simplicity (the first reason discussed above). The abstractions provided by the higher-level languages (which also make learning them harder for a newcomer) come at the cost of speed. Since C is a low-level language(11) (closer to the hardware), it has direct access to the CPU(12), is generally faster in its execution, and is much less complex for both the human reader _and_ the computer. The benefits of simplicity for a human were discussed above. Simplicity for the computer translates into more efficient (faster) programs. This creates a much closer relation between the scientist/programmer (or their program) and the actual data and processing. The GNU coding standards(13) also encourage the use of C over all other languages when generality of usage and “high speed” are desired.

   ---------- Footnotes ----------

   (1) 

   (2) 

   (3) 

   (4) 

   (5) 

   (6) Brian Kernighan, Dennis Ritchie. _The C programming language_. Prentice Hall, Inc., Second edition, 1988. It is also commonly known as K&R and is based on the ANSI C and ISO C90 standards.

   (7) Bjarne Stroustrup. _The C++ programming language_.
Addison-Wesley Professional, 4th edition, 2013.

   (8) Note that Python is good for fast programming, not fast programs.

   (9) For example, see Jenness (2017) (https://arxiv.org/abs/1712.00461), which describes how LSST is managing the transition.

   (10) 

   (11) Low-level languages are those that directly operate the hardware, like assembly languages. So C is actually a high-level language, but it can be considered one of the lowest-level languages among all high-level languages.

   (12) For instance, the _long double_ numbers with at least 64-bit mantissa are not accessible in Python or Java.

   (13) 

File: gnuastro.info, Node: Program design philosophy, Next: Coding conventions, Prev: Why C, Up: Developing

12.2 Program design philosophy
==============================

The core processing functions of each program (and all libraries) are written mostly with the basic ISO C90 standard. We do make lots of use of the GNU additions to the C language in the GNU C library(1), but these functions are mainly used in the user-interface functions (reading your inputs and preparing them prior to, or after, the analysis). The actual algorithms, which most scientists would be more interested in, are much closer to ISO C90. For this reason, program source files that deal with user-interface issues and those doing the actual processing are clearly separated, see *note Program source::. If anything particular to the GNU C library is used in the processing functions, it is explained in the comments in between the code.

All the Gnuastro programs provide very low-level and modular operations (modeled on GNU Coreutils). Almost all the basic command-line programs like ‘ls’, ‘cp’ or ‘rm’ on GNU/Linux operating systems are part of GNU Coreutils. This enables you to use shell scripting languages (for example GNU Bash) to operate on a large number of files or do very complex things through the creative combinations of these tools that the authors had never dreamed of.
We have put a few simple examples in *note Tutorials::. For example, all the analysis output can be saved as ASCII tables which can be fed into your favorite plotting program to inspect visually. Python’s Matplotlib is very useful for fast plotting of the tables to immediately check your results. If you want to include the plots in a document, you can use the PGFplots package within LaTeX; no attempt is made to include such operations in Gnuastro. In short, Bash can act as a glue to connect the inputs and outputs of all these various Gnuastro programs (and other programs) in any fashion. Of course, Gnuastro’s programs are just front-ends to the main workhorse (*note Gnuastro library::), allowing a user to create their own programs (for example with *note BuildProgram::). So once the functions within programs become mature enough, they will be moved into the libraries for even more general applications. The advantage of this architecture is that the programs become small and transparent: the starting and finishing point of every program is clearly demarcated. For nearly all operations of a modest level of complexity on a modern computer (with fast file input/output), the read/write speed is insignificant compared to the actual processing a program does. Therefore the complexity which arises from sharing memory in a large application is simply not worth the speed gain. Gnuastro’s design is heavily influenced by Eric Raymond’s “The Art of Unix Programming”(2) which beautifully describes the design philosophy and practice which led to the success of Unix-based operating systems(3). ---------- Footnotes ---------- (1) Gnuastro uses many GNU additions to the C library. However, thanks to the GNU Portability library (Gnulib) which is included in the Gnuastro tarball, users of non-GNU/Linux operating systems can also benefit from all these features when using Gnuastro. (2) Eric S. Raymond, 2004, _The Art of Unix Programming_, Addison-Wesley Professional Computing Series. 
(3) KISS principle: Keep It Simple, Stupid!  File: gnuastro.info, Node: Coding conventions, Next: Program source, Prev: Program design philosophy, Up: Developing 12.3 Coding conventions ======================= In Gnuastro, we try our best to follow the GNU coding standards. Added to those, Gnuastro defines the following conventions. It is very important for readability that the whole package follows the same convention. • The code must be easy to read by eye. So when the order of several lines within a function does not matter (for example when defining variables at the start of a function), you should put the lines in the order of increasing length and group the variables with similar types, such that this half-pyramid of declarations becomes most visible. If the reader is interested, a simple search will show them the variable they are interested in. However, this visual aid greatly helps in general inspections of the code and helps the reader get a grip of the function’s processing. • A function that cannot be fully displayed (vertically) in your monitor is probably too long and may be more useful if it is broken up into multiple functions. 40 lines is usually a good reference. When the start and end of a function are clearly visible in one glance, the function is much easier to understand. This is most important for low-level functions (which usually define a lot of variables). Low-level functions do most of the processing; they will also be the most interesting part of a program for an inquiring astronomer. This convention is less important for higher-level functions that don’t define too many variables and whose only purpose is to run the lower-level functions in a specific order and with checks. In general you can be very liberal in breaking up the functions into smaller parts; the GNU Compiler Collection (GCC) will automatically compile the functions as inline functions when the optimizations are turned on. 
So you don’t have to worry about decreasing the speed. By default Gnuastro will compile with the ‘-O3’ optimization flag. • All Gnuastro hand-written text files (C source code, Texinfo documentation source, and version control commit messages) should not exceed *75* characters per line. Monitors today are certainly much wider, but with this limit, reading the functions becomes much easier. Also for the developers, it allows multiple files (or multiple views of one file) to be displayed beside each other on wide monitors. Emacs’s buffers are excellent for this capability; setting a buffer width of 80 will allow you to view and work on several files or different parts of one file using the wide monitors common today. Emacs buffers can also be used as a shell prompt to compile the program, and 80 characters is the default width in most terminal emulators. If you use Emacs, Gnuastro sets the 75 character ‘fill-column’ variable automatically for you, see the cartouche below. For long comments, Emacs can automatically separate them into multiple lines. For long literal strings, you can use the fact that in C, two strings immediately after each other are concatenated, for example ‘"The first part, " "and the second part."’. Note the space character at the end of the first part. Since they are now separated, you can easily break a long literal string into several lines and adhere to the maximum 75 character line length policy. • The headers required by each source file (ending with ‘.c’) should be included inside of it. All the headers a complete program needs should _not_ be stacked in another header to include in all source files (for example ‘main.h’). Although most ‘professional’ programmers choose this single header method, Gnuastro is primarily written for professional/inquisitive astronomers (who are generally amateur programmers). The list of header files included provides valuable general information and helps the reader. 
‘main.h’ may only include the header file(s) that define types that the main program structure needs, see ‘main.h’ in *note Program source::. Those particular header files that are included in ‘main.h’ can of course be ignored (not included) in separate source files. • The headers should be classified (by an empty line) into separate groups: 1. ‘#include <config.h>’: This must be the first code line (not commented or blank) in each source file _within Gnuastro_. It sets macros that the GNU Portability Library (Gnulib) will use for a unified environment (GNU C Library), even when the user is building on a system that doesn’t use the GNU C library. 2. The C library header files, for example ‘stdio.h’, ‘stdlib.h’, or ‘math.h’. 3. Installed library header files, including Gnuastro’s installed headers (for example ‘cfitsio.h’ or ‘gsl/gsl_rng.h’, or ‘gnuastro/fits.h’). 4. Gnuastro’s internal headers (that are not installed), for example ‘gnuastro-internal/options.h’. 5. For programs, the ‘main.h’ file (which is needed by the next group of headers). 6. That particular program’s header files, for example ‘mkprof.h’, or ‘noisechisel.h’. Within each group, where the order of inclusion does not matter, sort the headers by length, as described above. • All function names, variables, etc. should be in lower case. Macros and constant global ‘enum’s should be in upper case. • For the naming of exported header files, functions, variables, macros, and library functions, we adopt similar conventions to those used by the GNU Scientific Library (GSL)(1). In particular, in order to avoid clashes with the names of functions and variables coming from other libraries, the name-space ‘gal_’ is prefixed to them. GAL stands for _G_NU _A_stronomy _L_ibrary. • All installed header files should be in the ‘lib/gnuastro’ directory (under the top Gnuastro source directory). 
After installation, they will be put in the ‘$prefix/include/gnuastro’ directory (see *note Installation directory:: for ‘$prefix’). Therefore with this convention Gnuastro’s headers can be included in internal (to Gnuastro) and external (a library user) source files with the same line # include <gnuastro/headername.h> Note that the GSL convention for header file names is ‘gsl_specialname.h’, so your include directive for a GSL header must be something like ‘#include <gsl/gsl_specialname.h>’. Gnuastro doesn’t follow this GSL guideline because of the repeated ‘gsl’ in the include directive. It can be confusing and cause bugs for beginners. All Gnuastro (and GSL) headers must be located within a unique directory and will not be mixed with other headers. Therefore the ‘gsl_’ prefix to the header file names is redundant(2). • All installed functions and variables should also include the base-name of the file in which they are defined as a prefix, using underscores to separate words(3). The same applies to exported macros, but in upper case. For example in Gnuastro’s top source directory, the prototype of function ‘gal_box_border_from_center’ is in ‘lib/gnuastro/box.h’, and the macro ‘GAL_POLYGON_MAX_CORNERS’ is defined in ‘lib/gnuastro/polygon.h’. This is necessary to give any user (who is not familiar with the library structure) the ability to follow the code. This convention does make the function names longer (a little harder to write), but the extra documentation it provides plays an important role in Gnuastro and is worth the cost. • There should be no trailing white space in a line. To do this automatically every time you save a file in Emacs, add the following line to your ‘~/.emacs’ file. (add-hook 'before-save-hook 'delete-trailing-whitespace) • There should be no tabs in the indentation(4). • Individual, contextually similar functions in a source file are separated by 5 blank lines, so that related groups of functions are easily seen when parsing the source code by eye. 
• One group of contextually similar functions in a source file is separated from another with 20 blank lines. Each group of functions has a short descriptive title of the functions in that group. This title is surrounded by asterisks (*) to make it clearly distinguishable. Such contextual grouping and clear titles are very important for easily understanding the code. • Always read the comments before the patch of code under it. Similarly, try to add as many comments as you can regarding every patch of code. Effectively, we want someone to get a good feeling of the steps, without having to read the C code and only by reading the comments. This follows similar principles as Literate programming (https://en.wikipedia.org/wiki/Literate_programming). The last two conventions are not common and might benefit from a short discussion here. For a professional developer with good experience in advanced text editor operations, the last two are redundant. However, recall that Gnuastro aspires to be friendly to unfamiliar and inexperienced (in programming) eyes. In other words, as discussed in *note Science and its tools::, we want the code to appear welcoming to someone who is completely new to coding (and text editors) and only has a scientific curiosity. Newcomers to coding and development, who are curious enough to venture into the code, will probably not be using (or have any knowledge of) advanced text editors. They will see the raw code in the web page or on a simple text editor (like Gedit) as plain text. Trying to learn and understand a file with dense functions that are all spaced with one or two blank lines can be very daunting for a newcomer. But when they scroll through the file and see clear titles and meaningful spaces for similar functions, we are helping them find and focus on the part they are most interested in sooner and more easily. 
*GNU Emacs, the recommended text editor:* GNU Emacs is an extensible and easily customizable text editor which many programmers rely on for development due to its countless features. Among them, it allows specification of certain settings that are applied to a single file or to all files in a directory and its sub-directories. In order to harmonize code coming from different contributors, Gnuastro comes with a ‘.dir-locals.el’ file which automatically configures Emacs to satisfy most of the coding conventions above when you are using it within Gnuastro’s directories. Thus, Emacs users can readily start hacking into Gnuastro. If you are new to developing, we strongly recommend this editor. Emacs was the first project released by GNU and is still one of its flagship projects. Some resources can be found at: Official manual This is a great and very complete manual which has been improved for over 30 years and is the best starting point to learn it. It just requires a little patience and practice, but rest assured that you will be rewarded. If you install Emacs, you also have access to this manual on the command-line with the following command (see *note Info::). $ info emacs A guided tour of Emacs A short visual tour of Emacs, officially maintained by the Emacs developers. Unofficial mini-manual A shorter manual which contains nice animated images of using Emacs. ---------- Footnotes ---------- (1) (2) For GSL, this prefix has an internal technical application: GSL’s architecture mixes installed and not-installed headers in the same directory. This prefix is used to identify their installation status. Therefore this filename prefix in GSL is a technical internal issue (for developers, not users). (3) The convention of using underscores to separate words is called “snake case” (or “snake_case”). This is also recommended by the GNU coding standards. (4) If you use Emacs, Gnuastro’s ‘.dir-locals.el’ file will automatically never use tabs for indentation. 
To make this a default in all your Emacs sessions, you can add the following line to your ‘~/.emacs’ file: ‘(setq-default indent-tabs-mode nil)’  File: gnuastro.info, Node: Program source, Next: Documentation, Prev: Coding conventions, Up: Developing 12.4 Program source =================== Besides the fact that all the programs share some functions that were explained in *note Library::, everything else about each program is completely independent. Recall that Gnuastro is written for an active astronomer/scientist (not a passive one who just uses software). It must thus be easily navigable. Hence there are fixed source files (that contain fixed operations) that must be present in all programs; these are discussed fully in *note Mandatory source code files::. To easily understand the explanations in this section you can use *note The TEMPLATE program:: which contains the bare minimum code for one working program. This template can also be used to easily add new utilities: just copy and paste the directory and replace ‘TEMPLATE’ with your program’s name. * Menu: * Mandatory source code files:: Description of files common to all programs. * The TEMPLATE program:: Template for easy creation of a new program.  File: gnuastro.info, Node: Mandatory source code files, Next: The TEMPLATE program, Prev: Program source, Up: Program source 12.4.1 Mandatory source code files ---------------------------------- Some programs might need lots of source files and, if there is no fixed convention, navigating them can become very hard for a new inquirer into the code. The following source files exist in every program’s source directory (which is located in ‘bin/progname’). For small programs, these files are enough. Larger programs will need more files and developers are encouraged to define any number of new files. It is just important that the following list of files exist and do what is described here. 
When creating other source files, please choose filenames that are a complete single word: don’t abbreviate (abbreviations are cryptic). For a minimal program containing all these files, see *note The TEMPLATE program::. ‘main.c’ Each executable has a ‘main’ function, which is located in ‘main.c’. Therefore this file is the starting point when reading any program’s source code. No actual processing functions should be defined in this file; the function(s) in this file are only meant to connect the most high-level steps of each program. Generally, ‘main’ will first call the top user interface function to read user input and make all the preparations. Then it will pass control to the top processing function for that program. The functions to do both these jobs must be defined in other source files. ‘main.h’ All the major parameters which will be used in the program must be stored in a structure which is defined in ‘main.h’. The name of this structure is usually ‘prognameparams’, for example ‘cropparams’ or ‘noisechiselparams’. So ‘#include "main.h"’ will be a staple in all of the program’s source files. It is also regularly the first (and only) argument of most of the program’s functions, which greatly helps in readability. Keeping all the major parameters of a program in this structure has the major benefit that most functions will only need one argument: a pointer to this structure. This will significantly facilitate the job of the programmer, the inquirer and the computer. All the programs in Gnuastro are designed to be low-level, small and independent parts, so this structure should not get too large. The main root structure of all programs contains at least one instance of the ‘gal_options_common_params’ structure. This structure will keep the values of all common options in Gnuastro’s programs (see *note Common options::). 
This top root structure is conveniently called ‘p’ (short for parameters) by all the functions in the programs, and the common options parameters within it are called ‘cp’. With this convention any reader can immediately understand where to look for the definition of one parameter. For example you know that ‘p->cp.output’ is in the common parameters while ‘p->threshold’ is in the program’s parameters. With this basic root structure, the source code of functions can potentially become full of structure de-reference operators (‘->’) which can make the code very unreadable. In order to avoid this, whenever a structure element is used more than a couple of times in a function, a variable of the same type and with the same name (so it can be searched) as the desired structure element should be defined, and given the structure element’s value, at definition time. Here is an example. char *hdu=p->cp.hdu; float threshold=p->threshold; ‘args.h’ The options particular to each program are defined in this file. Each option is defined by a block of parameters in ‘program_options’. These blocks are all you should modify in this file; leave the bottom group of definitions untouched. These are fed directly into the GNU C library’s Argp facilities and it is recommended to have a look at the Argp documentation to better understand what is going on, although this is not required here. Each element of the block defining an option is described under ‘argp_option’ in ‘bootstrapped/lib/argp.h’ (from Gnuastro’s top source file). Note that the last few elements of this structure are Gnuastro additions (not documented in the standard Argp manual). The values of these last elements are defined in ‘lib/gnuastro/type.h’ and ‘lib/gnuastro-internal/options.h’ (from Gnuastro’s top source directory). ‘ui.h’ Besides declaring the exported functions of ‘ui.c’, this header also keeps the “key”s to every program-specific option. 
The first class of keys is for the options that have a short-option version (single letter, see *note Options::). The character that is defined here is the option’s short option name. The list of available alphabet characters can be seen in the comments. Recall that some common options also take some characters; for those, see ‘lib/gnuastro-internal/options.h’. The second group of options are those that don’t have a short option alternative. Only the first in this group needs a value (‘1000’); the rest will be given a value by C’s ‘enum’ definition, so the actual value is irrelevant and must never be used: always use the name. ‘ui.c’ Everything related to reading the user input arguments and options, checking the configuration files and checking the consistency of the input parameters before the actual processing is run should be done in this file. Since most functions are the same, with only the internal checks and structure parameters differing, we recommend going through the ‘ui.c’ of *note The TEMPLATE program::, or several other programs, for a better understanding. The most high-level function in ‘ui.c’ is named ‘ui_read_check_inputs_setup’. It accepts the raw command-line inputs and a pointer to the root structure for that program (see the explanation for ‘main.h’). This is the function that ‘main’ calls. The basic idea of the functions in this file is that the processing functions should need a minimum number of such checks. With this convention an inquirer who wants to understand only one part (mostly the processing part, not the user input details and sanity checks) of the code can easily do so in the later files. It also makes all the errors related to input appear before the processing begins, which is more convenient for the user. ‘progname.c, progname.h’ The high-level processing functions in each program are in a file named ‘progname.c’, for example ‘crop.c’ or ‘noisechisel.c’. 
The function within these files which ‘main’ calls is also named after the program, for example void crop(struct cropparams *p) or void noisechisel(struct noisechiselparams *p) In this manner, if an inquirer is interested in the processing steps, they can immediately come and check this file for the first processing step without having to go through ‘main.c’ and ‘ui.c’ first. In most situations, any failure in any step of the programs will result in an informative error message and an immediate abort of the program. So there is usually no need for return values. Under more complicated situations where a return value might be necessary, ‘void’ will be replaced with an ‘int’ in the examples above. This value must be directly returned by ‘main’, so it has to be an ‘int’. ‘authors-cite.h’ This header file keeps the global variable for the program’s authors and its BibTeX record for citation. They are used in the outputs of the common options ‘--version’ and ‘--cite’, see *note Operating mode options::. ‘progname-complete.bash’ This shell script is used for implementing auto-completion features when running Gnuastro’s programs within GNU Bash. For more on the concept of shell auto-completion and how it is managed in Gnuastro, see *note Bash programmable completion::. These files assume a set of common shell functions that have the prefix ‘_gnuastro_autocomplete_’ in their name and are defined in ‘bin/complete.bash.in’ (of the source directory, and under version control) and ‘bin/complete.bash.built’ (built during the building of Gnuastro in the build directory). During Gnuastro’s build, all these Bash completion files are merged into one file that is installed, and the user can ‘source’ them into their Bash startup file; for example see *note Quick start::.  
File: gnuastro.info, Node: The TEMPLATE program, Prev: Mandatory source code files, Up: Program source 12.4.2 The TEMPLATE program --------------------------- The extra creativity offered by libraries comes at a cost: you have to actually write your ‘main’ function and get your hands dirty in managing user inputs: are all the necessary parameters given a value? Is the input in the correct format? Do the options and the inputs correspond? And many other similar checks. So when an operation has well-defined inputs and outputs and is commonly needed, it is much more worthwhile to simply use all the great features that Gnuastro has already defined for such operations. To make it easier to learn/apply the internal program infrastructure discussed in *note Mandatory source code files::, in the *note Version controlled source::, Gnuastro ships with a template program. This template program is not available in the Gnuastro tarball, so it doesn’t confuse people using the tarball. The ‘bin/TEMPLATE’ directory in Gnuastro’s Git repository contains the bare-minimum files necessary to define a new program and all the basic/necessary files/functions are pre-defined there. Below you can see a list of initial steps to take for customizing this template. We just assume that after cloning Gnuastro’s history, you have already bootstrapped Gnuastro; if not, please see *note Bootstrapping::. 1. Select a name for your new program (for example ‘myprog’). 2. Copy the ‘TEMPLATE’ directory to a directory with your program’s name: $ cp -R bin/TEMPLATE bin/myprog 3. As with all source files in Gnuastro, all the files in the template also have a copyright notice at their top. Open all the files and correct these notices: 1) The first line contains a single-line description of the program. 2) In the second line only the name of your program needs to be fixed and 3) Add your name and email as a “Contributing author”. 
As your program grows, you will need to add new files; don’t forget to add this notice in those new files too, just put your name and email under “Original author” and correct the copyright years. 4. Open ‘configure.ac’ in the top Gnuastro source. This file manages the operations that are done when a user runs ‘./configure’. Going down the file, you will notice repetitive parts for each program. You will notice that the program names follow an alphabetic ordering in each part. There is also a commented line/patch for the ‘TEMPLATE’ program in each part. You can copy one line/patch (from the program above or below your desired name for example) and paste it in the proper place for your new program. Then correct the names of the copied program to your new program name. There are multiple places where this has to be done, so be patient and go down to the bottom of the file. Ultimately add ‘bin/myprog/Makefile’ to ‘AC_CONFIG_FILES’; only here the ordering depends on the length of the name (it isn’t alphabetical). 5. Open ‘Makefile.am’ in the top Gnuastro source. Similar to the previous step, add your new program similar to all the other programs. Here there are only two places: 1) at the top where we define the conditionals (three lines per program), and 2) immediately under it as part of the value for ‘SUBDIRS’. 6. Open ‘doc/Makefile.am’ and, similar to ‘Makefile.am’ (above), add the proper entries for the man-page of your program to be created (here, the variable that keeps all the man-pages to be created is ‘dist_man_MANS’). Then scroll down and add a rule to build the man-page similar to the other existing rules (in alphabetical order). Don’t forget to add a short one-line description here; it will be displayed on top of the man-page. 7. Change ‘TEMPLATE.c’ and ‘TEMPLATE.h’ to ‘myprog.c’ and ‘myprog.h’ in the file names: $ cd bin/myprog $ mv TEMPLATE.c myprog.c $ mv TEMPLATE.h myprog.h 8. 
Correct all occurrences of ‘TEMPLATE’ in the input files to ‘myprog’ (in short or long format). You can get a list of all occurrences with the following command. If you use Emacs, it will be able to parse the Grep output and open the proper file and line automatically. So this step can be very easy. $ grep --color -nHi -e template * 9. Run the following commands to re-build the configuration and build system, and then to configure and build Gnuastro (which now includes your exciting new program). $ autoreconf -f $ ./configure $ make 10. You are done! You can now start customizing your new program to do your special processing. When it’s complete, just don’t forget to also add tests, so it can be checked at least once on a user’s system with ‘make check’, see *note Test scripts::. Finally, if you would like to share it with all Gnuastro users, inform us so we can merge it into Gnuastro’s main history.  File: gnuastro.info, Node: Documentation, Next: Building and debugging, Prev: Program source, Up: Developing 12.5 Documentation ================== Documentation (this book) is an integral part of Gnuastro (see *note Science and its tools::). Documentation is not considered a separate project and must be written by its developers. Users can make edits/corrections, but the initial writing must be by the developer. So, no change is considered valid for implementation unless the respective parts of the book have also been updated. The following procedure can be a good suggestion to take when you have a new idea and are about to start implementing it. The steps below are not a requirement; the important thing is that when you send your work to be included in Gnuastro, the book and the code have to both be fully up-to-date and compatible, with the purpose of the update very clearly explained. You can follow any strategy you like; the following is what we have found to be most useful until now. 1. 
Edit the book and fully explain your desired change, such that your idea is completely embedded in the general context of the book with no sense of discontinuity for a first-time reader. This will allow you to plan the idea much more accurately and in the general context of Gnuastro (a particular program or library). Later on, when you are coding, this general context will significantly help you as a road-map. A very important part of this process is the program/library introduction. These first few paragraphs explain the purposes of the program or library and are fundamental to Gnuastro. Before actually starting to code, explain your idea’s purpose thoroughly at the start of the respective/new section you wish to work on. While actually writing its purpose for a new reader, you will probably get some valuable and interesting ideas that you hadn’t thought of before. This has occurred several times during the creation of Gnuastro. If an introduction already exists, embed or blend your idea’s purpose with the existing introduction. We emphasize that doing this is equally useful for you (as the programmer) as it is for the user (reader). Recall that the purpose of a program is very important, see *note Program design philosophy::. As you have already noticed, for every program/library it is very important that the basics of the science and technique be explained in separate subsections prior to the ‘Invoking Programname’ subsection. If you are writing a new program, or your addition to an existing program involves a new concept, also include such subsections and explain the concepts so a person completely unfamiliar with the concepts can get a general initial understanding. You don’t have to go deep into the details; just enough to get an interested person (with absolutely no background) started, with some good pointers/links to where they can continue studying if they are more interested. 
If you feel you can’t do that, then you have probably not understood the concept yourself. If you feel you don’t have the time, then think about yourself as the reader in one year: you will forget almost all the details, so now that you have done all the theoretical preparations, add a few more hours and document it. Then in one year, when you find a bug or want to add a new feature, you won’t have to prepare as much. Bear in mind that your only limitation in length is the fatigue of the reader after reading a long text, nothing else. So as long as you keep it relevant/interesting for the reader, there is no page number limit/cost. It might also help if you start discussing the usage of your idea in the ‘Invoking ProgramName’ subsection (explaining the options and arguments you have in mind) at this stage too. Actually starting to write it here will really help you later when you are coding. 2. After you have finished adding your initial intended plan to the book, start coding your change or new program within the Gnuastro source files. While you are coding, you will notice that some things should be different from what you wrote in the book (your initial plan). So correct them as you are actually coding, but don’t worry too much about missing a few things (see the next step). 3. After your work has been fully implemented, read the section’s documentation from the start to check that you didn’t miss any change made during coding, and that the context is fairly continuous for a first-time reader (who hasn’t seen the book or known Gnuastro before you made your change). 4. If the change is notable, also update the ‘NEWS’ file.  File: gnuastro.info, Node: Building and debugging, Next: Test scripts, Prev: Documentation, Up: Developing 12.6 Building and debugging =========================== To build the various programs and libraries in Gnuastro, the GNU build system is used, which defines the steps in *note Quick start::. 
It consists of GNU Autoconf, GNU Automake and GNU Libtool, which are collectively known as GNU Autotools. They provide a very portable system to check the host’s environment and compile Gnuastro based on that. They also make installing everything in its standard place very easy for the programmer. Most of the small-caps files that you see in the top source directory of the tarball are created by these three tools (see *note Version controlled source::).

To facilitate the building and testing of your work during development, Gnuastro comes with two useful scripts:

‘developer-build’
     This is more fully described in *note Configure and build in RAM::. During development, you will usually run this command only once (at the start of your work).

‘tests/during-dev.sh’
     This script is designed to be run each time you make a change and want to test your work (with some possible input and output). The script itself is heavily commented and thoroughly describes the best way to use it, so we won’t repeat it here. As a short summary: you specify the build directory, an output directory (for the built program to be run in, which also contains the inputs), the program’s short name, and the arguments and options that it should be run with. This script will then build Gnuastro, go to the output directory and run the built executable from there. One option for the output directory might be your desktop, so you can easily see the output files and delete them when you are finished.

The main purpose of these scripts is to keep your source directory clean and facilitate your development. By default, all the programs are compiled with optimization flags for increased speed. A side effect of optimization is that valuable debugging information is lost. All the libraries are also linked as shared libraries by default. Shared libraries further complicate the debugging process and significantly slow down the compilation (the ‘make’ command).
So during development it is recommended to configure Gnuastro as follows:

     $ ./configure --enable-debug

In ‘developer-build’ you can ask for this behavior through the ‘--debug’ option, see *note Separate build and source directories::.

In order to understand the building process, you can go through the Autoconf, Automake and Libtool manuals; like all GNU manuals, they provide both a great tutorial and technical documentation. The “A small Hello World” section in Automake’s manual (in chapter 2) can be a good starting guide after you have read the separate introductions.


File: gnuastro.info, Node: Test scripts, Next: Bash programmable completion, Prev: Building and debugging, Up: Developing

12.7 Test scripts
=================

As explained in *note Tests::, for every program some simple tests are written to check the various independent features of the program. All the tests are placed in the ‘tests/’ directory. The ‘tests/prepconf.sh’ script is the first ‘test’ that will be run. It will copy all the configuration files from the various directories to a ‘tests/.gnuastro’ directory (which it will make), so the various tests can set the default values. This script will also make sure the programs don’t go searching for user or system-wide configuration files, to avoid mixing in values from a different Gnuastro version on the system.

For each program, the tests are placed inside directories with the program’s name. Each test is written as a shell script. The last line of this script is the test, which runs the program with certain parameters. The return value of this script determines the fate of the test; see the “Support for test suites” chapter of the Automake manual for a very nice and complete explanation. In every script, two variables are defined at first: ‘prog’ and ‘execname’. The first specifies the program name and the second the location of the executable.
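As a hedged sketch (the program name and paths here are hypothetical illustrations, not copied from Gnuastro’s actual test suite), such a script might look like this:

```shell
# Hypothetical sketch of a Gnuastro-style test script; the real scripts
# live under 'tests/' and are run by Automake's test harness.
prog=table                      # the program's short name
execname=../bin/$prog/ast$prog  # location of the built executable

if test -x "$execname"; then
    # The exit status of the last command decides the fate of the
    # test: 0 means pass, anything else means fail.
    "$execname" --version
else
    # Automake's convention: a test that exits with code 77 is
    # reported as skipped (useful when the program was not built).
    echo "ast$prog not built; a real test script would 'exit 77'"
fi
```

The ‘exit 77’ skip convention is Automake’s; see the “Support for test suites” chapter of its manual mentioned above.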
The most important thing to have in mind about all the test scripts is that they are run from inside the ‘tests/’ directory in the “build tree”, which can be different from the directory they are stored in (known as the “source tree”)(1). This distinction is made by GNU Autoconf and Automake (which configure, build and install Gnuastro) so that you can install the program even if you don’t have write access to the directory keeping the source files. See “Parallel build trees (a.k.a VPATH builds)” in the Automake manual for a nice explanation.

Because of this, any necessary inputs that are distributed in the tarball(2), for example the catalogs necessary for checks in MakeProfiles and Crop, must be identified with the ‘$topsrc’ prefix instead of ‘../’ (for the top source directory that is unpacked). This ‘$topsrc’ variable points to the source tree where the script can find the source data (it is defined in ‘tests/Makefile.am’). The executables and other test products were built in the build tree (where they are being run), so they don’t need to be prefixed with that variable. This is also true for images or files that were produced by other tests.

---------- Footnotes ----------

(1) The ‘developer-build’ script also uses this feature to keep the source and build directories separate (see *note Separate build and source directories::).

(2) In many cases, the inputs of a test are outputs of previous tests; this doesn’t apply to this class of inputs, because all outputs of previous tests are in the “build tree”.


File: gnuastro.info, Node: Bash programmable completion, Next: Developer's checklist, Prev: Test scripts, Up: Developing

12.8 Bash programmable completion
=================================

*Under development:* While work on TAB completion is ongoing, it is not yet fully ready; please see the notice at the start of *note Shell TAB completion::.

Gnuastro provides programmable completion facilities in Bash.
This greatly helps users reach their desired result with minimal keystrokes, and lets them spend less time figuring out the option names and their acceptable values. Gnuastro’s completion script not only completes half-written commands, it also prints suggestions based on the previous arguments.

Imagine a scenario where we need to download three columns containing the right ascension, declination, and parallax from the GAIA EDR3 dataset. But we first have to check how these columns are abbreviated or spelled. So we can call the command below, and store the column names in a file such as ‘gaia-edr3-columns.txt’:

     $ astquery gaia --information > gaia-edr3-columns.txt

Then we need to memorize or copy the column names of interest, and specify an output FITS file name such as ‘gaia.fits’:

     $ astquery gaia --dataset=edr3 --output=gaia.fits \
                --column=ra,dec,parallax

However, this is much easier using the auto-completion feature:

     $ astquery gaia --dataset=edr3 --output=gaia.fits --column=<[TAB]>

After pressing <[TAB]>, a full list of the GAIA EDR3 dataset’s column names will be displayed. Typing the first letter of the desired column and pressing <[TAB]> again will limit the displayed list to only the matching ones, until the desired column is found.

* Menu:

* Bash TAB completion tutorial::  Fast tutorial to get you started on concepts.
* Implementing TAB completion in Gnuastro::  How Gnuastro uses Bash auto-completion features.


File: gnuastro.info, Node: Bash TAB completion tutorial, Next: Implementing TAB completion in Gnuastro, Prev: Bash programmable completion, Up: Bash programmable completion

12.8.1 Bash TAB completion tutorial
-----------------------------------

When a user presses the <[TAB]> key while typing commands, Bash will inspect the input to find a relevant “completion specification”, or ‘compspec’. If available, the ‘compspec’ will generate a list of possible suggestions to complete the current word.
A custom ‘compspec’ can be generated for any command using Bash’s completion builtins(1) and the Bash variables that start with the ‘COMP’ keyword(2). First, let’s see a quick example of how you can make a completion script in just one line of code. With the command below, we are asking Bash to give us three suggestions for ‘echo’: ‘foo’, ‘bar’ and ‘bAr’. Please run it in your terminal for the next steps.

     $ complete -W "foo bar bAr" echo

The possible completion suggestions are fed into ‘complete’ using the ‘-W’ option, followed by a list of space-delimited words. Let’s see it in action:

     $ echo <[TAB][TAB]>
     bar  bAr  foo

Nicely done! Just note that the strings are sorted alphabetically, not in the original order, and that an arbitrary number of space characters are printed between them (depending on the number of suggestions and the terminal size). Now, if you type ‘f’ and press <[TAB]>, Bash will automatically figure out that you wanted ‘foo’ and it will be completed right away:

     $ echo f<[TAB]>
     $ echo foo

However, nothing will happen if you type ‘b’ and press <[TAB]> only once. This is because of the ambiguity: there isn’t enough information to figure out which suggestion you want: ‘bar’ or ‘bAr’? So, if you press <[TAB]> twice, it will print out all the options that start with ‘b’:

     $ echo b<[TAB][TAB]>
     bar  bAr
     $ echo ba<[TAB]>
     $ echo bar

Not bad for a simple program. But what if you need more control? By passing the ‘-F’ option to ‘complete’ instead of ‘-W’, it will run a function for generating the suggestions, instead of using a static string. For example, let’s assume that the expected value after ‘foo’ is the number of files in the current directory.
Since the logic is getting more complex, let’s write and save the commands below into a shell script with an arbitrary name such as ‘completion-tutorial.sh’:

     $ cat completion-tutorial.sh
     _echo(){
         if [ "$3" == "foo" ]; then
             COMPREPLY=( $(ls | wc -l) )
         else
             COMPREPLY=( $(compgen -W "foo bar bAr" -- "$2") )
         fi
     }
     complete -F _echo echo

We will look at it in detail soon. But for now, let’s ‘source’ the file into your current terminal and check if it works as expected:

     $ source completion-tutorial.sh
     $ echo <[TAB][TAB]>
     foo bar bAr
     $ echo foo <[TAB]>
     $ touch empty.txt
     $ echo foo <[TAB]>

Success! As you see, this allows for setting up highly customized completion scripts. Now let’s have a closer look at the ‘completion-tutorial.sh’ completion script from above. First, the ‘-F’ option in front of the ‘complete’ command indicates that we want the shell to execute the ‘_echo’ function whenever ‘echo’ is called. As a convention, the function name should be the same as the program name, but prefixed with an underscore (‘_’).

Within the ‘_echo’ function, we’re checking if ‘$3’ is equal to ‘foo’. In Bash’s auto-completion, ‘$3’ is the word before the current cursor position. In fact, these are the arguments that the ‘_echo’ function is receiving:

‘$1’
     The name of the command, here it is ‘echo’.

‘$2’
     The current word being completed (empty unless we are in the middle of typing a word).

‘$3’
     The word before the word being completed.

To tell the completion script what to reply with, we use the ‘COMPREPLY’ array. This array holds all the suggestions that ‘complete’ will show the user in the end. In the example above, we simply give it the string output of ‘ls | wc -l’.

Finally, we have the ‘compgen’ command. According to Bash’s programmable completion builtins manual, the command ‘compgen [OPTION] [WORD]’ generates possible completion matches for ‘[WORD]’ according to ‘[OPTION]’. Using the ‘-W’ option asks ‘compgen’ to generate a list of words from an input string.
This is known as Word Splitting(3). ‘compgen’ will automatically use the ‘$IFS’ variable to split the string into a list of words. You can check the default delimiters by calling:

     $ printf %q "$IFS"

The default value of ‘$IFS’ might be ‘ \t\n’. This means the space, tab, and newline characters. Finally, notice the ‘-- "$2"’ in this command:

     COMPREPLY=( $(compgen -W "foo bar bAr" -- "$2") )

Here, the ‘--’ instructs ‘compgen’ to only reply with the words that match ‘$2’, i.e., the current word being completed. That is why when you type the letter ‘b’, ‘complete’ will reply only with its matches (‘bar’ and ‘bAr’), and will exclude ‘foo’.

Let’s get a little more realistic, and develop a very basic completion script for one of Gnuastro’s programs. Since the ‘--help’ option will list all the options available in Gnuastro’s programs, we are going to use its output and create a very basic TAB completion for it. Note that the actual TAB completion in Gnuastro is a little more complex than this, and is fully described in *note Implementing TAB completion in Gnuastro::. But this is a good exercise to get started.

We’ll use ‘asttable’ as the demo, and the goal is to suggest all the options that this program has to offer. You can print all of them (with a lot of extra information) with this command:

     $ asttable --help

Let’s write an ‘awk’ script that prints all of the long options. When printing the option names we can safely ignore the short options, because if a user knows about the short options, s/he already knows exactly what they want! Also, due to their single-character length, they would be too cryptic without their descriptions. One way to catch the long options is through ‘awk’, as shown below. We only keep the lines that 1) start with a space, 2) have ‘-’ as their first non-white character, and 3) contain ‘--’ followed by any number of digits or characters.
Within those lines, if the first word ends in a comma (‘,’), the first word is the short option, so we want the second word (which is the long option); otherwise, the first word is the long option. For options that take a value, this will also include the format of the value (for example ‘--column=STR’). So with a ‘sed’ command, we remove everything that is after the equal sign, but keep the equal sign itself (to highlight to the user that this option should have a value).

     $ asttable --help \
         | awk '/^ / && $1 ~ /^-/ && /--+[a-zA-Z0-9]*/ { \
                  if($1 ~ /,$/) name=$2; \
                  else name=$1; \
                  print name}' \
         | sed -e's|=.*|=|'

If we wanted to show all the options to the user, we could simply feed the values of the command above to ‘compgen’ and ‘COMPREPLY’ subsequently. But we need _smarter_ completions: we want to offer suggestions based on the previous options that have already been typed in.

Just beware! Sometimes the program might not act as you expected. In that case, debug messages can clear things up: you can add an ‘echo’ command before the completion function ends, and check all the current variables. This can save a lot of headaches, since things can get complex.

Take the option ‘--wcsfile=’ for example. This option accepts a FITS file. Usually, the user is trying to feed a FITS file from the current directory. So it would be nice if we could help them and print only a list of the FITS files sitting in the current directory (or whatever directory they have typed in so far). But there’s a catch: when splitting the user’s input line, Bash will consider ‘=’ as a separate word. To avoid getting caught in changing the ‘IFS’ or ‘WORDBREAKS’ values, we will simply check for ‘=’ and act accordingly. That is, if the previous word is a ‘=’, we’ll ignore it and take the word before that as the previous word. Also, if the current word is a ‘=’, we ignore it completely.
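As an aside, you can exercise the option-extraction pipeline from above even before Gnuastro is built, by running it on a saved snippet of help output. The snippet below is a hypothetical sample (not ‘asttable’’s real ‘--help’ output), chosen only to cover the three cases the ‘awk’ script handles: a short+long option pair, a long-only option with a value, and a long option without a value format.

```shell
# Hypothetical sample of '--help' output (NOT real asttable output),
# covering the cases the awk script from the text has to handle.
help_sample='  -c, --column=STR       Column number or name.
      --wcsfile=FITS     File with WCS for output.
  -i, --information      Print input metadata.'

# Same pipeline as in the text, reading the sample instead of the
# program's real '--help' output.
printf '%s\n' "$help_sample" \
    | awk '/^ / && $1 ~ /^-/ && /--+[a-zA-Z0-9]*/ { \
             if($1 ~ /,$/) name=$2; \
             else name=$1; \
             print name}' \
    | sed -e 's|=.*|=|'
```

The three lines printed are ‘--column=’, ‘--wcsfile=’ and ‘--information’: the short options and value formats are gone, but the trailing ‘=’ survives for the options that expect a value.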
Taking all of that into consideration, the code below might serve well:

     _asttable(){
         if [ "$2" = "=" ]; then word=""
         else                    word="$2"
         fi

         if [ "$3" = "=" ]; then prev="${COMP_WORDS[COMP_CWORD-2]}"
         else                    prev="${COMP_WORDS[COMP_CWORD-1]}"
         fi

         case "$prev" in
             --wcsfile)
                 COMPREPLY=( $(compgen -f -X "!*.[fF][iI][tT][sS]" \
                                       -- "$word") )
                 ;;
         esac
     }
     complete -o nospace -F _asttable asttable

To test the code above, write it into ‘asttable-tutorial.sh’, and load it into your running terminal with this command:

     $ source asttable-tutorial.sh

If you then go to a directory that has at least one FITS file (with a ‘.fits’ suffix, among other files), you can check the function by typing the following command. You will see that only files ending in ‘.fits’ are shown, not any other file.

     $ asttable --wcsfile=<[TAB][TAB]>

The code above first identifies the current and previous words. It then checks if the previous word is equal to ‘--wcsfile’ and, if so, fills the ‘COMPREPLY’ array with the necessary suggestions. We are using ‘case’ here (instead of ‘if’) because in a real scenario we need to check many more values, and ‘case’ is far better suited for such situations (cleaner and more efficient code).

The ‘-f’ option of ‘compgen’ indicates we’re looking for a file. The ‘-X’ option _filters out_ the filenames that match the next regular expression pattern; therefore we start the regular expression with ‘!’ to keep only the files that match it. The ‘-- "$word"’ component collects only the filenames that match the current word being typed. And last but not least, the ‘-o nospace’ option in the ‘complete’ command instructs the completion script to _not_ append a white space after each suggestion. That is important because in the long format of an option, the value is clearer when it sticks to the option name with a ‘=’ sign.
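You can also try the ‘compgen’ call from that function in isolation (a standalone sketch; run it under Bash, since ‘compgen’ is a Bash builtin, and note that the file names below are arbitrary examples):

```shell
# Sketch: '-f' makes compgen generate filenames, and '-X' filters OUT
# names matching its pattern; the leading '!' inverts that, keeping
# only names ending in '.fits' with any capitalization.
tmpdir=$(mktemp -d)
cd "$tmpdir"
touch a.fits b.txt c.FITS notes.md
compgen -f -X '!*.[fF][iI][tT][sS]' -- ""
```

Of the four files, only ‘a.fits’ and ‘c.FITS’ are printed. Replacing the empty string after ‘--’ with a prefix (for example ‘a’) restricts the matches further, exactly as typing that prefix before pressing <[TAB]> would.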
You have now written a very basic, working TAB completion script that can easily be generalized to include more options (good enough for a single, simple program). However, Gnuastro has many programs that share many similar things, and the options are not independent. Also, complex situations often come up: for example, some people use a ‘.fit’ suffix for FITS files, and others don’t use a suffix at all! So in practice things need to get a little more complicated, but the core concept is what you learnt in this section. We just modularize the process (breaking logically independent steps into separate functions to use in different situations). In *note Implementing TAB completion in Gnuastro::, we review the generalities of Gnuastro’s implementation of Bash TAB completion.

---------- Footnotes ----------

(1)

(2)

(3)


File: gnuastro.info, Node: Implementing TAB completion in Gnuastro, Prev: Bash TAB completion tutorial, Up: Bash programmable completion

12.8.2 Implementing TAB completion in Gnuastro
----------------------------------------------

The basics of Bash auto-completion were reviewed in *note Bash TAB completion tutorial::. Gnuastro is a very complex package of many programs that have many similar features, so implementing those principles in an easy-to-maintain manner requires a modular solution. As a result, Bash’s TAB completion is implemented as multiple files in Gnuastro:

‘bin/completion.bash.built’ (in build directory, automatically created)
     This file contains the values of all Gnuastro options or arguments that take fixed strings as values (not file names). For example the names of Arithmetic’s operators (see *note Arithmetic operators::), or spectral line names (like ‘--obsline’ in *note CosmicCalculator input options::).

     This file is created automatically during the building of Gnuastro. The recipe to build it is available in Gnuastro’s top-level ‘Makefile.am’ (under the target ‘bin/completion.bash’).
It parses the respective Gnuastro source file that contains the necessary user-specified strings. All the acceptable values are then stored as shell variables (within a function).

‘bin/completion.bash.in’ (in source directory, under version control)
     All the low-level completion functions that are common to all programs are stored here. It thus contains functions that will parse the command-line or files, or suggest the completion replies.

‘PROGNAME-complete.bash’ (in source directory, under version control)
     All Gnuastro programs contain a ‘PROGNAME-complete.bash’ script within their source (for more on the fixed files of each program, see *note Program source::). This file contains the very high-level (program-specific) Bash programmable completion features, which build on the functions defined in the Gnuastro-generic Bash completion file (‘bin/completion.bash.in’).

     The top-level function that is called by Bash should be called ‘_gnuastro_autocomplete_PROGNAME’, and its last line should be the ‘complete’ command of Bash, which calls this function. The contents of ‘_gnuastro_autocomplete_PROGNAME’ are almost identical for all the programs: it is just a very high-level function that either calls ‘_gnuastro_autocomplete_PROGNAME_arguments’ to manage suggestions for the program’s arguments, or ‘_gnuastro_autocomplete_PROGNAME_option_value’ to manage suggestions for the program’s option values.

The scripts above follow these conventions (after reviewing the list, please also look into the functions for examples of each point):

   • No global shell variables in any completion script: the contents of the files above are directly loaded into the user’s environment. So to keep the user’s environment clean and avoid annoying the users, everything should be defined as shell functions, and any variable within the functions should be set as ‘local’.
   • All the function names should start with ‘_gnuastro_autocomplete_’, again to avoid populating the user’s function name-space with possibly conflicting names.

   • Outputs of functions should be written in the ‘local’ variables of the higher-level functions that called them.


File: gnuastro.info, Node: Developer's checklist, Next: Gnuastro project webpage, Prev: Bash programmable completion, Up: Developing

12.9 Developer’s checklist
==========================

This is a checklist of things to do after applying your changes/additions in Gnuastro:

  1. If the change is non-trivial, write test(s) in the ‘tests/progname/’ directory to test the change(s)/addition(s) you have made. Then add their file names to ‘tests/Makefile.am’.

  2. If your change involves a change in the command-line behavior of a Gnuastro program or script (for example, adding a new option or argument), create or update the respective ‘bin/PROGNAME/completion.sh’ file described under the *note Bash programmable completion:: section.

  3. Run ‘$ make check’ to make sure everything is working correctly.

  4. Make sure the documentation (this book) is completely up to date with your changes, see *note Documentation::.

  5. Commit the change to your issue branch (see *note Production workflow:: and *note Forking tutorial::). Afterwards, run Autoreconf to generate the appropriate version number:

          $ autoreconf -f

  6. Finally, to make sure everything will be built, installed and checked correctly, run the following command (after re-configuring and re-building). To greatly speed up the process, use multiple threads (8 in the example below, change it appropriately):

          $ make distcheck -j8

     This command will create a distribution file (ending with ‘.tar.gz’) and try to compile it in the most general cases, then it will run the tests on what it has built in its own mini-environment. If ‘$ make distcheck’ finishes successfully, then you can safely send your changes to us for inclusion, or use them for your own purposes.
See *note Production workflow:: and *note Forking tutorial::.


File: gnuastro.info, Node: Gnuastro project webpage, Next: Developing mailing lists, Prev: Developer's checklist, Up: Developing

12.10 Gnuastro project webpage
==============================

Gnuastro’s central management hub (https://savannah.gnu.org/projects/gnuastro/)(1) is located on GNU Savannah (https://savannah.gnu.org/)(2). Savannah is the central software development management system for many GNU projects. Through this central hub, you can view the list of activities that the developers are engaged in, their activity on the version controlled source, and other things.

Each defined activity in the development cycle is known as an ‘issue’ (or ‘item’). An issue can be a bug (see *note Report a bug::), a suggested feature (see *note Suggest new feature::), an enhancement, or generally any _one_ job that is to be done. In Savannah, issues are classified into three categories, or ‘trackers’:

Support
     This tracker is a way that (possibly anonymous) users can get in touch with the Gnuastro developers. It is a complement to the bug-gnuastro mailing list (see *note Report a bug::). Anyone can post an issue to this tracker. The developers will not submit an issue to this list; they will only reassign the issues in this list to the other two trackers if they are valid(3). Ideally (when the developers have time to put on Gnuastro, please don’t forget that Gnuastro is a volunteer effort), there should be no open items in this tracker.

Bugs
     This tracker contains all the known bugs in Gnuastro (problems with the existing tools).

Tasks
     The items in this tracker contain the future plans (or new features/capabilities) that are to be added to Gnuastro.

All the trackers can be browsed by a (possibly anonymous) visitor, but to edit and comment on the Bugs and Tasks trackers, you have to be registered on Savannah.
When posting an issue to a tracker, it is very important to choose the ‘Category’ and ‘Item Group’ options accurately. The first contains a list of all Gnuastro’s programs, along with ‘Installation’, ‘New program’ and ‘Webpage’. The ‘Item Group’ contains the nature of the issue, for example a ‘Crash’ in the software (a bug), a problem in the documentation (also a bug), a feature request or an enhancement.

The set of horizontal links on the top of the page (starting with ‘Main’ and ‘Homepage’ and finishing with ‘News’) are the easiest way to access these trackers (and other major aspects of the project) from any part of the project web page. Hovering your mouse over them will open a drop-down menu that will link you to the different things you can do on each tracker (for example, ‘Submit new’ or ‘Browse’). When you browse each tracker, you can use the “Display Criteria” link above the list to limit the displayed issues to what you are interested in. The ‘Category’ and ‘Item Group’ (explained above) are a good starting point.

Any new issue that is submitted to any of the trackers, and any comment that is posted for an issue, is directly forwarded to the gnuastro-devel mailing list (see *note Developing mailing lists:: for more). This will allow anyone interested to be up to date on the overall development activity in Gnuastro, and will also provide an alternative (to Savannah) archive of the development discussions. Therefore, it is not recommended to directly post an email to this mailing list; do all the activities (for example adding new issues, or commenting on existing ones) on Savannah.

*Do I need to be a member in Savannah to contribute to Gnuastro?*

No. The full version controlled history of Gnuastro is available for anonymous download or cloning. See *note Production workflow:: for a description of Gnuastro’s Integration-Manager Workflow. In short, you can either send in patches, or make your own fork.
If you choose the latter, you can push your changes to your own fork and inform us. We will then pull your changes and merge them into the main project. Please see *note Forking tutorial:: for a tutorial.

---------- Footnotes ----------

(1)

(2)

(3) Some of the issues registered here might be due to a mistake on the user’s side, not an actual bug in the program.


File: gnuastro.info, Node: Developing mailing lists, Next: Contributing to Gnuastro, Prev: Gnuastro project webpage, Up: Developing

12.11 Developing mailing lists
==============================

To keep the developers and interested users up to date with the activity and discussions within Gnuastro, there are two mailing lists which you can subscribe to:

‘gnuastro-devel@gnu.org’
     All the posts made in the support, bugs and tasks discussions of the *note Gnuastro project webpage:: are also sent to this mailing address and archived. By subscribing to this list you can stay up to date with the discussions that are going on between the developers before, during and (possibly) after working on an issue. All discussions are either in the context of bugs or tasks, which are done on Savannah and circulated to all interested people through this mailing list. Therefore it is not recommended to post anything directly to this mailing list. Any mail that is sent to it from Savannah has a link under the title “Reply to this item at:”; that link will take you directly to the issue’s discussion page, where you can read the discussion history or join it.

     While you are posting comments on the Savannah issues, be sure to update the meta-data. For example, if the task/bug is not assigned to anyone and you would like to take it, change the “Assigned to” box, or if you want to report that it has been applied, change the status, and so on. All these changes will also be circulated with the email very clearly.
‘gnuastro-commits@gnu.org’
     This mailing list circulates all the commits that are made in Gnuastro’s version controlled source, see *note Version controlled source::. If you have any ideas or suggestions on the commits, please use the bug and task trackers on Savannah to follow up on the discussion; do not post to this list. All the commits that are made for an already defined issue or task will state the respective ID, so you can find it easily.


File: gnuastro.info, Node: Contributing to Gnuastro, Prev: Developing mailing lists, Up: Developing

12.12 Contributing to Gnuastro
==============================

You have this great idea, or have found a good fix to a problem, which you would like to implement in Gnuastro. You have also become familiar with the general design of Gnuastro in the previous sections of this chapter (see *note Developing::), and want to start working on and sharing your new addition/change with the whole community as part of the official release. This is great, and your contribution is most welcome. This section and the next (see *note Developer's checklist::) are written in the hope of making it as easy as possible for you to share your great idea with the community.

In this section we discuss the final steps you have to take: legal and technical. From the legal perspective, the copyright of any work you do on Gnuastro has to be assigned to the Free Software Foundation (FSF) and the GNU operating system, or you have to sign a disclaimer. We do this to ensure that Gnuastro can remain free in the future, see *note Copyright assignment::. From the technical point of view, in this section we also discuss commit guidelines (*note Commit guidelines::), and the general version control workflow of Gnuastro in *note Production workflow::, along with a tutorial in *note Forking tutorial::.
Recall that before starting the work on your idea, be sure to check out the bugs and tasks trackers in *note Gnuastro project webpage:: and announce your work there, so you don’t end up spending time on something others have already worked on, and also to attract similarly interested developers to help you.

* Menu:

* Copyright assignment::        Copyright has to be assigned to the FSF.
* Commit guidelines::           Guidelines for commit messages.
* Production workflow::         Submitting your commits (work) for inclusion.
* Forking tutorial::            Tutorial on workflow steps with Git.


File: gnuastro.info, Node: Copyright assignment, Next: Commit guidelines, Prev: Contributing to Gnuastro, Up: Contributing to Gnuastro

12.12.1 Copyright assignment
----------------------------

Gnuastro’s copyright is owned by the Free Software Foundation (FSF) to ensure that Gnuastro always remains free. The FSF has also provided a Contributor FAQ (https://www.fsf.org/licensing/contributor-faq) to further clarify the reasons, so we encourage you to read it. Professor Eben Moglen, of the Columbia University Law School, has given a nice summary of the reasons for this. Below we are copying it verbatim for self-consistency (in case you are offline or reading in print).

Under US copyright law, which is the law under which most free software programs have historically been first published, there are very substantial procedural advantages to registration of copyright. And despite the broad right of distribution conveyed by the GPL, enforcement of copyright is generally not possible for distributors: only the copyright holder or someone having assignment of the copyright can enforce the license. If there are multiple authors of a copyrighted work, successful enforcement depends on having the cooperation of all authors.
     In order to make sure that all of our copyrights can meet the record keeping and other requirements of registration, and in order to be able to enforce the GPL most effectively, FSF requires that each author of code incorporated in FSF projects provide a copyright assignment, and, where appropriate, a disclaimer of any work-for-hire ownership claims by the programmer’s employer. That way we can be sure that all the code in FSF projects is free code, whose freedom we can most effectively protect, and therefore on which other developers can completely rely.

   Please get in touch with the Gnuastro maintainer (currently Mohammad Akhlaghi, mohammad -at- akhlaghi -dot- org) to follow the procedures. It is possible to do this for each change (good for a single contribution), and also more generally for all the changes/additions you do in the future within Gnuastro. Note that even if you have already assigned the copyright of your work on another GNU software to the FSF, it has to be done again for Gnuastro. The FSF has staff working on these legal issues and the maintainer will get you in touch with them to do the paperwork. The maintainer will just be informed in the end, so your contributions can be merged within the Gnuastro source code.

   Gnuastro will gratefully acknowledge (see *note Acknowledgments::) all the people who have assigned their copyright to the FSF and have thus helped to guarantee the freedom and reliability of Gnuastro. The Free Software Foundation will also acknowledge your copyright contributions in the Free Software Supporter, which is circulated to a very large community (225,910 people in July 2021). See the archives for some examples and subscribe to receive interesting updates. The very active code contributors (or developers) will also be recognized as project members on the Gnuastro project web page (see *note Gnuastro project webpage::) and can be given a ‘gnu.org’ email address.
So your very valuable contribution and copyright assignment will not be forgotten and is highly appreciated by a very large community. If you are reluctant to sign an assignment, a disclaimer is also acceptable.

*Do I need a disclaimer from my university or employer?* It depends on the contract with your university or employer. From the FSF’s ‘/gd/gnuorg/conditions.text’: “If you are employed to do programming, or have made an agreement with your employer that says it owns programs you write, we need a signed piece of paper from your employer disclaiming rights to” Gnuastro. The FSF’s copyright clerk will kindly help you decide; please consult the following email address: “assign -at- gnu -dot- org”.

File: gnuastro.info, Node: Commit guidelines, Next: Production workflow, Prev: Copyright assignment, Up: Contributing to Gnuastro

12.12.2 Commit guidelines
-------------------------

To be able to cleanly integrate your work with the other developers, *never commit on the ‘master’ branch* (see *note Production workflow:: for a complete discussion and *note Forking tutorial:: for a cookbook example). In short, leave ‘master’ only for changes you fetch, or pull from the official repository (see *note Synchronizing::).

   In the Gnuastro commit messages, we strive to follow these standards. Note that in the early phases of Gnuastro’s development, we are experimenting, so if you notice that earlier commits don't satisfy some of the guidelines below, it is because they predate that guideline.

Commit title
     The commits have to start with one short descriptive title. The title is separated from the body with one blank line. Run ‘git log’ to see some of the most recent commit messages as an example. In general, the title should satisfy the following conditions:

     • It is best for the title to be short, about 60 (or even 50) characters. Most emulated command-line terminals are about 80 characters wide.
However, we should also allow for the commit hashes which are printed in ‘git log --oneline’, and also branch names or the graph structure outputs of ‘git log’, which are also commonly used.

     • The title should not finish with any full-stops or periods (‘<.>’).

Commit body
     The body of the commit message is separated from the title by one empty line. Recall that anyone who has subscribed to the ‘gnuastro-commits’ mailing list will get the commit in their email after it has been pushed to ‘master’. People will also read them when they synchronize with the main Gnuastro repository (see *note Synchronizing::). Finally, the commit messages will later be used to update the ‘NEWS’ file on each release. Therefore the commit message body plays a very important role in the development of Gnuastro, so please adhere to the following guidelines.

     • The body should be very descriptive. Start the commit message body by explaining what changes your commit makes from a user's perspective (added, changed, or removed options or arguments to programs or libraries; modified algorithms; a new installation step; etc).

     • Try to explain the committed contents as well as you can. Recall that the readers of your commit message do not necessarily have your current background. After some time you will also forget the context, so this request is not just for others(1). Therefore be very descriptive and explain as much as possible: what the bug/task was, a justification of the way you fixed it, and a discussion of other possible solutions that you might not have included. For the last item, it is best to discuss them thoroughly as comments in the appropriate section of the code, and only give a short summary in the commit message. Note that all added and removed source code lines will also be circulated in the ‘gnuastro-commits’ mailing list.

     • Like all other Gnuastro text files, the lines in the commit body should not be longer than 75 characters, see *note Coding conventions::.
This is to ensure that on standard terminal emulators (with an 80 character width), the ‘git log’ output can be cleanly displayed (note that the commit message is indented in the output of ‘git log’). If you use Emacs, Gnuastro’s ‘.dir-locals.el’ file will ensure that your commits satisfy this condition.

     • When the commit is related to a task or a bug, please include the respective ID (in the format of ‘bug/task #ID’, note the space) in the commit message (from *note Gnuastro project webpage::), so interested people can follow up the discussion that took place there. If the commit fixes a bug or finishes a task, the recommended way is to add a line after the body with ‘‘This fixes bug #ID.’’ or ‘‘This finishes task #ID.’’. Don't assume that the reader has internet access to check the bug's full description when reading the commit message, so give a short introduction too.

   Below you can see a good commit message example (don't forget to read it, it has tips for you). After reading this, please run ‘git log’ on the ‘master’ branch and read some of the recent commits for more realistic examples.

     The first line should be the title of the commit

     An empty line is necessary after the title so Git doesn't confuse
     lines. This top paragraph of the body of the commit usually
     describes the reason this commit was done. Therefore it usually
     starts with "Until now ...". It is very useful to explain the
     reason behind the change, things that aren't immediately obvious
     when looking into the code. You don't need to list the names of
     the files, or what lines have been changed, don't forget that the
     code changes are fully stored within Git :-).

     In the second paragraph (or any later paragraph!) of the body, we
     describe the solution and why (not "how"!) the particular solution
     was implemented. So we usually start this part of the commit body
     with "With this commit ...".
     Again, you don't need to go into the details that can be seen from
     the 'git diff' command (like the file names that have been changed
     or the code that has been implemented). The important thing here
     is the things that aren't immediately obvious from looking into
     the code. You can continue the explanation and it is encouraged to
     be very explicit about the "human factor" of the change as much as
     possible, not technical details.

---------- Footnotes ----------

(1)

File: gnuastro.info, Node: Production workflow, Next: Forking tutorial, Prev: Commit guidelines, Up: Contributing to Gnuastro

12.12.3 Production workflow
---------------------------

Fortunately ‘Pro Git’ has done a wonderful job in explaining the different workflows in Chapter 5(1), and in particular the “Integration-Manager Workflow” explained there. The implementation of this workflow is nicely explained in Section 5.2(2) under “Forked-Public-Project”. We have also prepared a short tutorial in *note Forking tutorial::. Anything on the master branch should always be tested and ready to be built and used. As described in ‘Pro Git’, there are two methods for you to contribute to Gnuastro in the Integration-Manager Workflow:

  1. You can send commit patches by email as fully explained in ‘Pro Git’. This is good for your first few contributions. Just note that raw patches (containing only the diff) do not have any meta-data (author name, date, etc). Therefore they will not allow us to fully acknowledge your contributions as an author in Gnuastro: in the ‘AUTHORS’ file and at the start of the PDF book. These author lists are created automatically from the version controlled source. To receive full acknowledgment when submitting a patch, it is thus advised to use Git's ‘format-patch’ tool. See Pro Git's Public project over email (https://git-scm.com/book/en/v2/Distributed-Git-Contributing-to-a-Project#Public-Project-over-Email) section for a nice explanation.
     If you would like to get more heavily involved in Gnuastro's development, then you can try the next solution.

  2. You can have your own forked copy of Gnuastro on any hosting site you like (GitHub, GitLab, BitBucket, etc) and inform us when your changes are ready, so we can merge them into Gnuastro. This is more suited for people who commonly contribute to the code (see *note Forking tutorial::).

   In both cases, your commits (with your name and information) will be preserved and your contributions will thus be fully recorded in the history of Gnuastro and in the ‘AUTHORS’ file and this book (second page in the PDF format) once they have been incorporated into the official repository. Needless to say, in such cases, be sure to follow the bug or task trackers (or subscribe to the ‘gnuastro-devel’ mailing list) and contact us beforehand, so you don't do something that someone else is already working on. In that case, you can get in touch with them and help the job go on faster, see *note Gnuastro project webpage::. This workflow is currently mostly borrowed from the general recommendations of Git(3) and GitHub. But since Gnuastro is currently under heavy development, these might change and evolve to better suit our needs.

---------- Footnotes ----------

(1)

(2)

(3)

File: gnuastro.info, Node: Forking tutorial, Prev: Production workflow, Up: Contributing to Gnuastro

12.12.4 Forking tutorial
------------------------

This is a tutorial on the second suggested method (commonly known as forking) by which you can submit your modifications to Gnuastro (see *note Production workflow::).

   To start, please create an empty repository on your hosting service web page (we recommend GitLab(1)). If this is your first hosted repository on the web page, you also have to upload your public SSH key(2) for the ‘git push’ command below to work. Here we'll assume you use the name ‘janedoe’ to refer to yourself everywhere and that you choose ‘gnuastro-janedoe’ as the name of your Gnuastro fork.
Any online hosting service will give you an address (similar to the ‘‘git@gitlab.com:...’’ below) of the empty repository you have created using their web page; use that address in the third line below.

     $ git clone git://git.sv.gnu.org/gnuastro.git
     $ cd gnuastro
     $ git remote add janedoe git@gitlab.com:janedoe/gnuastro-janedoe.git
     $ git push janedoe master

   The full Gnuastro history is now pushed onto your hosting service and the ‘janedoe’ remote is now also following your ‘master’ branch. If you run ‘git remote show REMOTENAME’ for the ‘origin’ and ‘janedoe’ remotes, you will see their difference: you can only pull from the first, but you can also push to the second. This nicely summarizes the main idea behind this workflow: you push to your remote repository, we pull from it and merge it into ‘master’, then you finalize it by pulling from the main repository.

   To test (compile) your changes during your work, you will need to bootstrap the version controlled source, see *note Bootstrapping:: for a full description. The cloning process above is only necessary for your first time setup; you don't need to repeat it. However, please repeat the steps below for each independent issue you intend to work on.

   Let's assume you have found a bug in ‘lib/statistics.c’'s median-calculating function. Before actually doing anything, please announce it (see *note Report a bug::), so everyone knows you are working on it and to make sure others aren't already working on it. With the commands below, you make a branch, check it out, correct the bug, check if it is indeed fixed, add it to the staging area, commit it to the new branch and push it to your hosting service. But before all of them, make sure that you are on the ‘master’ branch and that your ‘master’ branch is up to date with the main Gnuastro repository with the first two commands.
     $ git checkout master
     $ git pull
     $ git checkout -b bug-median-stats   # Choose a descriptive name
     $ emacs lib/statistics.c
     $ # do your checks here
     $ git add lib/statistics.c
     $ git commit
     $ git push janedoe bug-median-stats

   Your new branch is now on your hosted repository. Through the respective tracker on Savannah (see *note Gnuastro project webpage::) you can then let the other developers know that your ‘bug-median-stats’ branch is ready. They will pull your work, test it themselves and, if it is ready to be merged into the main Gnuastro history, they will merge it into the ‘master’ branch. After that is done, you can simply checkout your local ‘master’ branch and pull all the changes from the main repository. After the pull you can run ‘‘git log’’ as shown below, to see how ‘bug-median-stats’ was merged with ‘master’. To finalize, you can push all the changes to your hosted repository and delete the branch:

     $ git checkout master
     $ git pull
     $ git log --oneline --graph --decorate --all
     $ git push janedoe master
     $ git branch -d bug-median-stats              # delete local branch
     $ git push janedoe --delete bug-median-stats  # delete remote branch

   Just as a reminder, always keep your work on each issue in a separate local and remote branch, so work can progress on them independently. After you make your announcement, other people might contribute to the branch before merging it into ‘master’, so this is very important. As a final reminder: before starting each issue branch from ‘master’, be sure to run ‘git pull’ in ‘master’ as shown above. This will enable you to start your branch (work) from the most recent commit and thus simplify the final merging of your work.

---------- Footnotes ----------

(1) See for an evaluation of the major existing repositories. Gnuastro uses GNU Savannah (which also has the highest ranking in the evaluation), but for starters, GitLab may be easier.

(2) For example see this explanation provided by GitLab: .
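If you would like to see this branch-and-review cycle in action without touching your real clone, the following sketch builds a throwaway repository (the identity, branch name and commit messages are purely illustrative, not part of Gnuastro) and shows how a revision range like ‘master..bug-median-stats’ lists only the commits your issue branch adds, which is exactly what the other developers will review:

```shell
# Illustration only: a throwaway repository to preview which commits
# an issue branch adds on top of 'master' before pushing it.
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
git symbolic-ref HEAD refs/heads/master    # make sure the branch is 'master'
git config user.email jane@example.com     # hypothetical identity
git config user.name "Jane Doe"
git commit -q --allow-empty -m "Initial commit on master"
git checkout -q -b bug-median-stats
git commit -q --allow-empty -m "Fix median of even-sized arrays"

# Only the branch's own commits are listed, not those of 'master':
git log --oneline master..bug-median-stats
```

The ‘master..bug-median-stats’ range syntax ("commits reachable from the second, but not the first") is also what ‘git push’ effectively sends, so this is a quick sanity check before announcing your branch.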
File: gnuastro.info, Node: Gnuastro programs list, Next: Other useful software, Prev: Developing, Up: Top

Appendix A Gnuastro programs list
*********************************

GNU Astronomy Utilities 0.16 contains the following programs. They are sorted in alphabetical order and a short description is provided for each program. The description starts with the executable names in ‘thisfont’ followed by a pointer to the respective section in parentheses. Throughout this book, they are ordered based on their context; please see the top-level contents for contextual ordering (based on what they do).

Arithmetic (‘astarithmetic’, see *note Arithmetic::)
     For arithmetic operations on an arbitrary (theoretically unlimited) number of datasets (images). It has a large and growing set of arithmetic, mathematical, and even statistical operators (for example ‘+’, ‘-’, ‘*’, ‘/’, ‘sqrt’, ‘log’, ‘min’, ‘average’, ‘median’).

BuildProgram (‘astbuildprog’, see *note BuildProgram::)
     Compile, link and run programs that depend on the Gnuastro library (see *note Gnuastro library::). This program will automatically link with the libraries that Gnuastro depends on, so there is no need to explicitly mention them every time you are compiling a Gnuastro library dependent program.

ConvertType (‘astconvertt’, see *note ConvertType::)
     Convert astronomical data files (FITS or IMH) to and from several other standard image and data formats, for example TXT, JPEG, EPS or PDF.

Convolve (‘astconvolve’, see *note Convolve::)
     Convolve (blur or smooth) data with a given kernel in the spatial or frequency domain on multiple threads. Convolve can also do de-convolution to find the appropriate kernel to PSF-match two images.

CosmicCalculator (‘astcosmiccal’, see *note CosmicCalculator::)
     Do cosmological calculations, for example the luminosity distance, distance modulus, comoving volume and many more.

Crop (‘astcrop’, see *note Crop::)
     Crop region(s) from an image and stitch several images if necessary.
     Inputs can be in pixel coordinates or world coordinates.

Fits (‘astfits’, see *note Fits::)
     View and manipulate FITS file extensions and header keywords.

MakeCatalog (‘astmkcatalog’, see *note MakeCatalog::)
     Make a catalog of a labeled image (output of NoiseChisel). The catalogs are highly customizable and adding new calculations/columns is very straightforward.

MakeNoise (‘astmknoise’, see *note MakeNoise::)
     Make (add) noise to an image, with a large set of random number generators and any seed.

MakeProfiles (‘astmkprof’, see *note MakeProfiles::)
     Make mock 2D profiles in an image. The central regions of radial profiles are made with a configurable 2D Monte Carlo integration. It can also build the profiles on an over-sampled image.

Match (‘astmatch’, see *note Match::)
     Given two input catalogs, find the rows that match with each other within a given aperture (may be an ellipse).

NoiseChisel (‘astnoisechisel’, see *note NoiseChisel::)
     Detect signal in noise. It uses a technique to detect very faint and diffuse, irregularly shaped signal in noise (galaxies in the sky), using thresholds that are below the Sky value, see arXiv:1505.01664 (http://arxiv.org/abs/1505.01664).

Query (‘astquery’, see *note Query::)
     High-level interface to query pre-defined remote, or external databases, and directly download the required sub-tables on the command-line.

Segment (‘astsegment’, see *note Segment::)
     Segment detected regions based on the structure of signal and the input dataset's noise properties.

Statistics (‘aststatistics’, see *note Statistics::)
     Statistical calculations on the input dataset (column in a table, image or datacube).

Table (‘asttable’, see *note Table::)
     Convert FITS binary and ASCII tables into other such tables, print them on the command-line, save them in a plain text file, or get the FITS table information.

Warp (‘astwarp’, see *note Warp::)
     Warp an image to a new pixel grid. Any projective transformation or homography can be applied to the input images.
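As a small taste of the shared command-line interface of the programs above, the sketch below calls Arithmetic on a hypothetical ‘image.fits’ (the file name and output name are illustrative; it assumes Gnuastro is installed). Arithmetic reads its tokens in reverse Polish notation, so the operand comes before the operator:

```shell
# Sketch (assumes Gnuastro is installed and 'image.fits' exists):
# replace every pixel of 'image.fits' with its square root. The
# operand ('image.fits') comes first, then the operator ('sqrt');
# the result is written to 'sqrt.fits'.
astarithmetic image.fits sqrt --output=sqrt.fits
```

All the programs accept the same style of long options (for example ‘--output’), as described in the body of this book.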
The programs listed above are designed to be highly modular and generic, so they are naturally suited to lower-level operations. In Gnuastro, higher-level operations (combining multiple programs, or running a program in a special way) are done with installed Bash scripts (all prefixed with ‘astscript-’). They can be run just like a program and behave very similarly (with minor differences, see *note Installed scripts::).

‘astscript-ds9-region’
     (See *note SAO DS9 region files from table::) Given a table (either as a file or from standard input), create an SAO DS9 region file from the requested positional columns (WCS or image coordinates).

‘astscript-radial-profile’
     (See *note Generate radial profile::) Calculate the radial profile of an object within an image. The object can be at any location in the image, various measures (median, sigma-clipped mean, etc.) can be used, and the radial distance can be measured along any general ellipse.

‘astscript-sort-by-night’
     (See *note Sort FITS files by night::) Given a list of FITS files, and a HDU and keyword name (for a date), this script separates the files belonging to the same night (possibly spanning two calendar days).

File: gnuastro.info, Node: Other useful software, Next: GNU Free Doc. License, Prev: Gnuastro programs list, Up: Top

Appendix B Other useful software
********************************

In this appendix the installation of programs and libraries that are not direct Gnuastro dependencies is discussed. However, they can be useful for working with Gnuastro.

* Menu:

* SAO DS9::        Viewing FITS images.
* PGPLOT::         Plotting directly in C.

File: gnuastro.info, Node: SAO DS9, Next: PGPLOT, Prev: Other useful software, Up: Other useful software

B.1 SAO DS9
===========

SAO DS9(1) is not a requirement of Gnuastro; it is a FITS image viewer, and one of the best options for checking your inputs and outputs. Like the other packages, it might already be available in your distribution's repositories.
It is already pre-compiled in the download section of its web page. Once you download it, you can unpack and install it (move it to a system recognized directory) with the following commands (‘x.x.x’ is the version number):

     $ tar xf ds9.linux64.x.x.x.tar.gz
     $ sudo mv ds9 /usr/local/bin

   Once you run it, there might be a complaint about the Xss library, which you can find in your distribution's package management system. You might also get an ‘XPA’ related error. In this case, you have to add the following line to your ‘~/.bashrc’ and ‘~/.profile’ files (you will have to log out and back in again for the latter):

     export XPA_METHOD=local

* Menu:

* Viewing multiextension FITS images::  Configure SAO DS9 for multiextension images.

---------- Footnotes ----------

(1)

File: gnuastro.info, Node: Viewing multiextension FITS images, Prev: SAO DS9, Up: SAO DS9

B.1.1 Viewing multiextension FITS images
----------------------------------------

The FITS definition allows for multiple extensions inside one FITS file; each extension can have a completely independent dataset inside of it. If you just double-click on a multi-extension FITS file or run ‘$ ds9 foo.fits’, SAO DS9 will only show you the first extension.

   If you have a multi-extension file containing 2D images, one way to load and switch between the 2D extensions is to take the following steps in the SAO DS9 window: “File”→“Open Other”→“Open Multi Ext Cube” and then choose the multi-extension FITS file in your computer's file structure.

   The method above is a little tedious to do every time you want to view a multi-extension FITS file. A different series of steps is also necessary if the extensions are 3D data cubes. Fortunately SAO DS9 also provides command-line options that you can use to specify a particular behavior. One of those options is ‘-mecube’, which opens a FITS image as a multi-extension data cube (treating each 2D extension as a slice in a 3D cube).
This allows you to flip through the extensions easily while keeping all the settings similar. Try running ‘$ ds9 -mecube foo.fits’ to see the effect (for example on the output of *note NoiseChisel::). If the file has multiple extensions, a small window will also be opened along with the main ds9 window. This small window allows you to slide through the image extensions of ‘foo.fits’. If ‘foo.fits’ only consists of one extension, then SAO DS9 will open as usual.

   Just to avoid confusion, note that SAO DS9 does not follow the GNU style of separating long and short options as explained in *note Arguments and options::. In the GNU style, this ‘long’ (multi-character) option would have been called ‘--mecube’, but SAO DS9 follows its own conventions.

   Recall that ‘-mecube’ opens each 2D input extension as a slice in 3D. Therefore, when you want to inspect a multi-extension FITS file containing a 3D dataset, the ‘-mecube’ option is no longer useful (it only opens the first slice of the 3D cube in each extension). In that case, we have to use SAO DS9's ‘-multiframe’ option to open each extension as a separate frame. Since the input is a 3D dataset, we get the same small window as in the 2D case above for scrolling through the 3D slices. We then have to also ask ds9 to match the frames and lock the slices, so that, for example, zooming into one will also zoom the others.

   We can use a script to automate this process and make work much easier (and save a lot of time) when opening any generic 2D or 3D dataset. After taking the following steps, ds9 will open in the respective 2D or 3D mode when you double-click a FITS file in the graphic user interface, and an executable will also be available to open ds9 similarly on the command-line. Note that the following solution assumes you already have Gnuastro installed (and in particular the *note Fits:: program).
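As an aside, the heart of the script below is simply reading the value of the ‘NAXIS’ keyword from the header listing that ‘astfits’ prints, using a one-line awk filter. You can see what that filter does with a mock header listing (the text below is plain shell standing in for real ‘astfits’ output):

```shell
# Mock header listing; in the real script this text comes from
# 'astfits file.fits -h0'. The awk filter picks the value (third
# field) of the line whose first field is the keyword name 'NAXIS'.
header='SIMPLE  =            T
BITPIX  =          -32
NAXIS   =            2'
printf '%s\n' "$header" | awk '$1=="NAXIS"{print $3}'    # prints: 2
```

A value of ‘0’ in the zeroth HDU is why the script falls back to reading ‘NAXIS’ from the first HDU.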
Let's assume that you want to store this script in ‘BINDIR’ (that is in your ‘PATH’ environment variable, see *note Installation directory::). [Tip: a good place would be ‘~/.local/bin’, just don't forget to make sure it is in your ‘PATH’]. Using your favorite text editor, put the following script into a file called ‘BINDIR/ds9-multi-ext’. You can change the size of the opened ds9 window by changing the ‘1800x3000’ part of the script below.

     #! /bin/bash

     # To allow generic usage, if no input file is given (the `if' below
     # is true), then just open an empty ds9.
     if [ "x$1" == "x" ]; then
         ds9
     else
         # Make sure we are dealing with a FITS file. We are using shell
         # redirection here to make sure that nothing is printed in the
         # terminal (to standard output when we have a FITS file, or to
         # standard error when we don't). Since we've used redirection,
         # we'll also have to echo the return value of `astfits'.
         check=$(astfits "$1" -h0 > /dev/null 2>&1; echo $?)

         # If the file was a FITS file, then `check' will be 0.
         if [ "$check" == "0" ]; then

             # Read the number of dimensions.
             n0=$(astfits "$1" -h0 | awk '$1=="NAXIS"{print $3}')

             # Find the number of dimensions.
             if [ "$n0" == "0" ]; then
                 ndim=$(astfits "$1" -h1 | awk '$1=="NAXIS"{print $3}')
             else
                 ndim=$n0
             fi;

             # Open DS9 based on the number of dimensions.
             if [ "$ndim" = "2" ]; then
                 # 2D multi-extension file: use the "Cube" window to
                 # flip/slide through the extensions.
                 ds9 -zscale -geometry 1800x3000 -mecube "$1" \
                     -zoom to fit -wcs degrees
             else
                 # 3D multi-extension file: The "Cube" window will slide
                 # between the slices of a single extension. To flip
                 # through the extensions (not the slices), press the top
                 # row "frame" button; the last four buttons of the
                 # bottom row ("first", "previous", "next" and "last")
                 # can then be used to switch through the extensions
                 # (while keeping the same slice).
                 ds9 -zscale -geometry 1800x3000 -wcs degrees \
                     -multiframe "$1" -match frame image \
                     -lock slice image -lock frame image -single \
                     -zoom to fit
             fi
         else
             if [ -f "$1" ]; then
                 echo "'$1' isn't a FITS file."
             else
                 echo "'$1' doesn't exist."
             fi
         fi
     fi

   As described above (also in the comments of the script), if you have opened a multi-extension 2D dataset (image), the “Cube” window can be used to slide/flip through each extension. But when the input is a 3D data cube, the “Cube” window will slide/flip through the slices in each extension (a separate 3D dataset). To flip through the extensions (while keeping the slice fixed), click the “frame” button on the top row of buttons, then use the last four buttons of the bottom row (“first”, “previous”, “next” and “last”) to change between the extensions.

   To run this script, you have to activate its executable flag with this command:

     $ chmod +x BINDIR/ds9-multi-ext

   If ‘BINDIR’ is within your system's ‘PATH’ environment variable (see *note Installation directory::), you can now open ds9 conditionally using the script above with this command:

     $ ds9-multi-ext foo.fits

   For the graphic user interface, we'll assume you are using GNOME (the most popular graphic user interface for GNU/Linux systems), version 3. For GNOME 2, see below. You can customize GNOME to open specific files with ‘.desktop’ files. For each user, they are stored in ‘~/.local/share/applications/’. In case you don't have the directory, make it yourself (with ‘mkdir’). Using your favorite text editor, you can now create ‘~/.local/share/applications/saods9.desktop’ with the following contents. Just don't forget to correct ‘BINDIR’. If you would also like to have ds9's logo/icon in GNOME, download it, uncomment the ‘Icon’ line, and write its address in the value.
     [Desktop Entry]
     Type=Application
     Version=1.0
     Name=SAO DS9
     Comment=View FITS images
     Terminal=false
     Categories=Graphics;RasterGraphics;2DGraphics;3DGraphics
     #Icon=/PATH/TO/DS9/ICON/ds9.png
     Exec=BINDIR/ds9-multi-ext %f

   The steps above will add SAO DS9 as one of your applications. To make it the default, take the following steps (just once is enough): right-click on a FITS file and select “Open with other application”→“View all applications”→“SAO DS9”.

   In case you are using GNOME 2, you can take the following steps: right-click on a FITS file and choose “Properties”→“Open With”→“Add” button. A list of applications will show up; ds9 might already be present in the list, but don't choose it because it will run with no options. Below the list is an option “Use a custom command”. Click on it, write the command ‘BINDIR/ds9-multi-ext’ in the box and click “Add”. Then finally choose the command you just added as the default and click the “Close” button.

File: gnuastro.info, Node: PGPLOT, Prev: SAO DS9, Up: Other useful software

B.2 PGPLOT
==========

PGPLOT is a package for making plots in C. It is not directly needed by Gnuastro, but can be used by WCSLIB, see *note WCSLIB::. As explained in *note WCSLIB::, you can install WCSLIB without it too. It is very old (the most recent version was released early 2001!), but remains one of the main packages for plotting directly in C. WCSLIB uses this package to make plots, if you ask it to. If you are interested, you can also use it for your own purposes.

   If you want plotting calls embedded within your C program, PGPLOT is currently one of your best options. The recommended alternative to this method is to write the raw data for the plots into text files and input them into any of the various more modern and capable plotting tools separately, for example the Matplotlib library in Python or PGFplots in LaTeX. This will also significantly help code readability. Let's get back to PGPLOT for the sake of WCSLIB.
Installing it is a little tricky (mainly because it is so old!). You can download the most recent version from the FTP link on its web page(1). You can unpack it with the ‘tar xf’ command. Let's assume the directory you have unpacked it to is ‘PGPLOT’; most probably it is ‘/home/username/Downloads/pgplot/’. Open the ‘drivers.list’ file:

     $ gedit drivers.list

   Remove the ‘!’ from the following lines and save the file in the end:

     PSDRIV 1 /PS
     PSDRIV 2 /VPS
     PSDRIV 3 /CPS
     PSDRIV 4 /VCPS
     XWDRIV 1 /XWINDOW
     XWDRIV 2 /XSERVE

   Don't choose GIF or VGIF; there is a problem in their codes.

   Open the ‘PGPLOT/sys_linux/g77_gcc.conf’ file:

     $ gedit PGPLOT/sys_linux/g77_gcc.conf

change the line saying ‘FCOMPL="g77"’ to ‘FCOMPL="gfortran"’, and save it. This is a very important step during the compilation of the code if you are in GNU/Linux.

   You now have to create a folder in ‘/usr/local’; don't forget to replace ‘PGPLOT’ with your unpacked address:

     $ su
     # mkdir /usr/local/pgplot
     # cd /usr/local/pgplot
     # cp PGPLOT/drivers.list ./

   To make the Makefile, type the following command:

     # PGPLOT/makemake PGPLOT linux g77_gcc

   It should finish by saying ‘Determining object file dependencies’. You have done the hard part! The rest is easy: run these three commands in order:

     # make
     # make clean
     # make cpg

   Finally you have to place the position of this directory you just made into the ‘LD_LIBRARY_PATH’ environment variable and define the environment variable ‘PGPLOT_DIR’. To do that, you have to edit your ‘.bashrc’ file:

     $ cd ~
     $ gedit .bashrc

   Copy these lines into the text editor and save it:

     PGPLOT_DIR="/usr/local/pgplot/"; export PGPLOT_DIR
     LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/pgplot/
     export LD_LIBRARY_PATH

   You need to log out and log back in again so these definitions take effect. After you have logged back in, you want to see the result of all this labor, right?
Tim Pearson has done that for you. Create a temporary directory in your home directory and copy all the demonstration files into it:

     $ cd ~
     $ mkdir temp
     $ cd temp
     $ cp /usr/local/pgplot/pgdemo* ./
     $ ls

You will see a lot of ‘pgdemoXX’ files, where XX is a number. To execute them, type the following command and drink your coffee while looking at all the beautiful plots! You are now ready to create your own.

     $ ./pgdemoXX

   ---------- Footnotes ----------

   (1) 

File: gnuastro.info, Node: GNU Free Doc. License, Next: GNU General Public License, Prev: Other useful software, Up: Top

Appendix C GNU Free Doc. License
********************************

Version 1.3, 3 November 2008

Copyright © 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc. Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.

0. PREAMBLE

The purpose of this License is to make a manual, textbook, or other functional and useful document “free” in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others. This License is a kind of “copyleft”, which means that derivative works of the document must themselves be free in the same sense. It complements the GNU General Public License, which is a copyleft license designed for free software. We have designed this License in order to use it for manuals for free software, because free software needs free documentation: a free program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals; it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. 
We recommend this License principally for works whose purpose is instruction or reference. 1. APPLICABILITY AND DEFINITIONS This License applies to any manual or other work, in any medium, that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. Such a notice grants a world-wide, royalty-free license, unlimited in duration, to use that work under the conditions stated herein. The “Document”, below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as “you”. You accept the license if you copy, modify or distribute the work in a way requiring permission under copyright law. A “Modified Version” of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or translated into another language. A “Secondary Section” is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document’s overall subject (or to related matters) and contains nothing that could fall directly within that overall subject. (Thus, if the Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial, philosophical, ethical or political position regarding them. The “Invariant Sections” are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License. If a section does not fit the above definition of Secondary then it is not allowed to be designated as Invariant. The Document may contain zero Invariant Sections. If the Document does not identify any Invariant Sections then there are none. 
The “Cover Texts” are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License. A Front-Cover Text may be at most 5 words, and a Back-Cover Text may be at most 25 words. A “Transparent” copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, that is suitable for revising the document straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup, or absence of markup, has been arranged to thwart or discourage subsequent modification by readers is not Transparent. An image format is not Transparent if used for any substantial amount of text. A copy that is not “Transparent” is called “Opaque”. Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML, PostScript or PDF designed for human modification. Examples of transparent image formats include PNG, XCF and JPG. Opaque formats include proprietary formats that can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not generally available, and the machine-generated HTML, PostScript or PDF produced by some word processors for output purposes only. The “Title Page” means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. 
For works in formats which do not have any title page as such, “Title Page” means the text near the most prominent appearance of the work’s title, preceding the beginning of the body of the text. The “publisher” means any person or entity that distributes copies of the Document to the public. A section “Entitled XYZ” means a named subunit of the Document whose title either is precisely XYZ or contains XYZ in parentheses following text that translates XYZ in another language. (Here XYZ stands for a specific section name mentioned below, such as “Acknowledgements”, “Dedications”, “Endorsements”, or “History”.) To “Preserve the Title” of such a section when you modify the Document means that it remains a section “Entitled XYZ” according to this definition. The Document may include Warranty Disclaimers next to the notice which states that this License applies to the Document. These Warranty Disclaimers are considered to be included by reference in this License, but only as regards disclaiming warranties: any other implication that these Warranty Disclaimers may have is void and has no effect on the meaning of this License. 2. VERBATIM COPYING You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3. You may also lend copies, under the same conditions stated above, and you may publicly display copies. 3. 
COPYING IN QUANTITY If you publish printed copies (or copies in media that commonly have printed covers) of the Document, numbering more than 100, and the Document’s license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects. If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages. If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of added material. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public. It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document. 
4. MODIFICATIONS You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version: A. Use in the Title Page (and on the covers, if any) a title distinct from that of the Document, and from those of previous versions (which should, if there were any, be listed in the History section of the Document). You may use the same title as a previous version if the original publisher of that version gives permission. B. List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this requirement. C. State on the Title page the name of the publisher of the Modified Version, as the publisher. D. Preserve all the copyright notices of the Document. E. Add an appropriate copyright notice for your modifications adjacent to the other copyright notices. F. Include, immediately after the copyright notices, a license notice giving the public permission to use the Modified Version under the terms of this License, in the form shown in the Addendum below. G. Preserve in that license notice the full lists of Invariant Sections and required Cover Texts given in the Document’s license notice. H. Include an unaltered copy of this License. I. Preserve the section Entitled “History”, Preserve its Title, and add to it an item stating at least the title, year, new authors, and publisher of the Modified Version as given on the Title Page. 
If there is no section Entitled “History” in the Document, create one stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence. J. Preserve the network location, if any, given in the Document for public access to a Transparent copy of the Document, and likewise the network locations given in the Document for previous versions it was based on. These may be placed in the “History” section. You may omit a network location for a work that was published at least four years before the Document itself, or if the original publisher of the version it refers to gives permission. K. For any section Entitled “Acknowledgements” or “Dedications”, Preserve the Title of the section, and preserve in the section all the substance and tone of each of the contributor acknowledgements and/or dedications given therein. L. Preserve all the Invariant Sections of the Document, unaltered in their text and in their titles. Section numbers or the equivalent are not considered part of the section titles. M. Delete any section Entitled “Endorsements”. Such a section may not be included in the Modified Version. N. Do not retitle any existing section to be Entitled “Endorsements” or to conflict in title with any Invariant Section. O. Preserve any Warranty Disclaimers. If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version’s license notice. These titles must be distinct from any other section titles. 
You may add a section Entitled “Endorsements”, provided it contains nothing but endorsements of your Modified Version by various parties—for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard. You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one. The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version. 5. COMBINING DOCUMENTS You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice, and that you preserve all their Warranty Disclaimers. The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. 
Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work. In the combination, you must combine any sections Entitled “History” in the various original documents, forming one section Entitled “History”; likewise combine any sections Entitled “Acknowledgements”, and any sections Entitled “Dedications”. You must delete all sections Entitled “Endorsements.” 6. COLLECTIONS OF DOCUMENTS You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects. You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document. 7. AGGREGATION WITH INDEPENDENT WORKS A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, is called an “aggregate” if the copyright resulting from the compilation is not used to limit the legal rights of the compilation’s users beyond what the individual works permit. When the Document is included in an aggregate, this License does not apply to the other works in the aggregate which are not themselves derivative works of the Document. If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one half of the entire aggregate, the Document’s Cover Texts may be placed on covers that bracket the Document within the aggregate, or the electronic equivalent of covers if the Document is in electronic form. 
Otherwise they must appear on printed covers that bracket the whole aggregate. 8. TRANSLATION Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License, and all the license notices in the Document, and any Warranty Disclaimers, provided that you also include the original English version of this License and the original versions of those notices and disclaimers. In case of a disagreement between the translation and the original version of this License or a notice or disclaimer, the original version will prevail. If a section in the Document is Entitled “Acknowledgements”, “Dedications”, or “History”, the requirement (section 4) to Preserve its Title (section 1) will typically require changing the actual title. 9. TERMINATION You may not copy, modify, sublicense, or distribute the Document except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense, or distribute it is void, and will automatically terminate your rights under this License. However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation. 
Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice. Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, receipt of a copy of some or all of the same material does not give you any rights to use it. 10. FUTURE REVISIONS OF THIS LICENSE The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See . Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License “or any later version” applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation. If the Document specifies that a proxy can decide which future versions of this License can be used, that proxy’s public statement of acceptance of a version permanently authorizes you to choose that version for the Document. 11. RELICENSING “Massive Multiauthor Collaboration Site” (or “MMC Site”) means any World Wide Web server that publishes copyrightable works and also provides prominent facilities for anybody to edit those works. 
A public wiki that anybody can edit is an example of such a server. A “Massive Multiauthor Collaboration” (or “MMC”) contained in the site means any set of copyrightable works thus published on the MMC site. “CC-BY-SA” means the Creative Commons Attribution-Share Alike 3.0 license published by Creative Commons Corporation, a not-for-profit corporation with a principal place of business in San Francisco, California, as well as future copyleft versions of that license published by that same organization. “Incorporate” means to publish or republish a Document, in whole or in part, as part of another Document. An MMC is “eligible for relicensing” if it is licensed under this License, and if all works that were first published under this License somewhere other than this MMC, and subsequently incorporated in whole or in part into the MMC, (1) had no cover texts or invariant sections, and (2) were thus incorporated prior to November 1, 2008. The operator of an MMC Site may republish an MMC contained in the site under CC-BY-SA on the same site at any time before August 1, 2009, provided the MMC is eligible for relicensing. ADDENDUM: How to use this License for your documents ==================================================== To use this License in a document you have written, include a copy of the License in the document and put the following copyright and license notices just after the title page: Copyright (C) YEAR YOUR NAME. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled ``GNU Free Documentation License''. 
If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts, replace the “with...Texts.” line with this: with the Invariant Sections being LIST THEIR TITLES, with the Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST. If you have Invariant Sections without Cover Texts, or some other combination of the three, merge those two alternatives to suit the situation. If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software.  File: gnuastro.info, Node: GNU General Public License, Next: Index, Prev: GNU Free Doc. License, Up: Top Appendix D GNU Gen. Pub. License v3 *********************************** Version 3, 29 June 2007 Copyright © 2007 Free Software Foundation, Inc. Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble ======== The GNU General Public License is a free, copyleft license for software and other kinds of works. The licenses for most software and other practical works are designed to take away your freedom to share and change the works. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change all versions of a program—to make sure it remains free software for all its users. We, the Free Software Foundation, use the GNU General Public License for most of our software; it applies also to any other work released this way by its authors. You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. 
Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things. To protect your rights, we need to prevent others from denying you these rights or asking you to surrender the rights. Therefore, you have certain responsibilities if you distribute copies of the software, or if you modify it: responsibilities to respect the freedom of others. For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. Developers that use the GNU GPL protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License giving you legal permission to copy, distribute and/or modify it. For the developers’ and authors’ protection, the GPL clearly explains that there is no warranty for this free software. For both users’ and authors’ sake, the GPL requires that modified versions be marked as changed, so that their problems will not be attributed erroneously to authors of previous versions. Some devices are designed to deny users access to install or run modified versions of the software inside them, although the manufacturer can do so. This is fundamentally incompatible with the aim of protecting users’ freedom to change the software. The systematic pattern of such abuse occurs in the area of products for individuals to use, which is precisely where it is most unacceptable. Therefore, we have designed this version of the GPL to prohibit the practice for those products. 
If such problems arise substantially in other domains, we stand ready to extend this provision to those domains in future versions of the GPL, as needed to protect the freedom of users. Finally, every program is threatened constantly by software patents. States should not allow patents to restrict development and use of software on general-purpose computers, but in those that do, we wish to avoid the special danger that patents applied to a free program could make it effectively proprietary. To prevent this, the GPL assures that patents cannot be used to render the program non-free. The precise terms and conditions for copying, distribution and modification follow. TERMS AND CONDITIONS ==================== 0. Definitions. “This License” refers to version 3 of the GNU General Public License. “Copyright” also means copyright-like laws that apply to other kinds of works, such as semiconductor masks. “The Program” refers to any copyrightable work licensed under this License. Each licensee is addressed as “you”. “Licensees” and “recipients” may be individuals or organizations. To “modify” a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. The resulting work is called a “modified version” of the earlier work or a work “based on” the earlier work. A “covered work” means either the unmodified Program or a work based on the Program. To “propagate” a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy. Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well. To “convey” a work means any kind of propagation that enables other parties to make or receive copies. 
Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying. An interactive user interface displays “Appropriate Legal Notices” to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License. If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion. 1. Source Code. The “source code” for a work means the preferred form of the work for making modifications to it. “Object code” means any non-source form of a work. A “Standard Interface” means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language. The “System Libraries” of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form. A “Major Component”, in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it. The “Corresponding Source” for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. 
However, it does not include the work’s System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work. The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source. The Corresponding Source for a work in source code form is that same work. 2. Basic Permissions. All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program. The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law. You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force. You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you. 
Conveying under any other circumstances is permitted solely under the conditions stated below. Sublicensing is not allowed; section 10 makes it unnecessary.

3. Protecting Users’ Legal Rights From Anti-Circumvention Law.

No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures.

When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work’s users, your or third parties’ legal rights to forbid circumvention of technological measures.

4. Conveying Verbatim Copies.

You may convey verbatim copies of the Program’s source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program.

You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee.

5. Conveying Modified Source Versions.

You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions:

a. The work must carry prominent notices stating that you modified it, and giving a relevant date.

b.
The work must carry prominent notices stating that it is released under this License and any conditions added under section 7. This requirement modifies the requirement in section 4 to “keep intact all notices”.

c. You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged. This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it.

d. If the work has interactive user interfaces, each must display Appropriate Legal Notices; however, if the Program has interactive interfaces that do not display Appropriate Legal Notices, your work need not make them do so.

A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an “aggregate” if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation’s users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate.

6. Conveying Non-Source Forms.

You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways:

a. Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange.

b.
Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by a written offer, valid for at least three years and valid for as long as you offer spare parts or customer support for that product model, to give anyone who possesses the object code either (1) a copy of the Corresponding Source for all the software in the product that is covered by this License, on a durable physical medium customarily used for software interchange, for a price no more than your reasonable cost of physically performing this conveying of source, or (2) access to copy the Corresponding Source from a network server at no charge.

c. Convey individual copies of the object code with a copy of the written offer to provide the Corresponding Source. This alternative is allowed only occasionally and noncommercially, and only if you received the object code with such an offer, in accord with subsection 6b.

d. Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge. You need not require recipients to copy the Corresponding Source along with the object code. If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source. Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements.

e. Convey the object code using peer-to-peer transmission, provided you inform other peers where the object code and Corresponding Source of the work are being offered to the general public at no charge under subsection 6d.
A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work.

A “User Product” is either (1) a “consumer product”, which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, “normally used” refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product. A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product.

“Installation Information” for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made.

If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information.
But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM).

The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network.

Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying.

7. Additional Terms.

“Additional permissions” are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law. If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions.

When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms:

a. Disclaiming warranty or limiting liability differently from the terms of sections 15 and 16 of this License; or

b. Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it; or

c. Prohibiting misrepresentation of the origin of that material, or requiring that modified versions of such material be marked in reasonable ways as different from the original version; or

d. Limiting the use for publicity purposes of names of licensors or authors of the material; or

e. Declining to grant rights under trademark law for use of some trade names, trademarks, or service marks; or

f. Requiring indemnification of licensors and authors of that material by anyone who conveys the material (or modified versions of it) with contractual assumptions of liability to the recipient, for any liability that these contractual assumptions directly impose on those licensors and authors.

All other non-permissive additional terms are considered “further restrictions” within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms.

Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way.

8. Termination.

You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11).

However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation.

Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice.

Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10.

9. Acceptance Not Required for Having Copies.

You are not required to accept this License in order to receive or run a copy of the Program.
Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so.

10. Automatic Licensing of Downstream Recipients.

Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License.

An “entity transaction” is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations. If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party’s predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts.

You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it.

11. Patents.

A “contributor” is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based.
The work thus licensed is called the contributor’s “contributor version”.

A contributor’s “essential patent claims” are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, “control” includes the right to grant patent sublicenses in a manner consistent with the requirements of this License.

Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor’s essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version.

In the following three paragraphs, a “patent license” is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To “grant” such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party.

If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients.
“Knowingly relying” means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient’s use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid.

If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it.

A patent license is “discriminatory” if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007.

Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law.

12. No Surrender of Others’ Freedom.
If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program.

13. Use with the GNU Affero General Public License.

Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU Affero General Public License into a single combined work, and to convey the resulting work. The terms of this License will continue to apply to the part which is the covered work, but the special requirements of the GNU Affero General Public License, section 13, concerning interaction through a network will apply to the combination as such.

14. Revised Versions of this License.

The Free Software Foundation may publish revised and/or new versions of the GNU General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns.

Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU General Public License “or any later version” applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU General Public License, you may choose any version ever published by the Free Software Foundation.
If the Program specifies that a proxy can decide which future versions of the GNU General Public License can be used, that proxy’s public statement of acceptance of a version permanently authorizes you to choose that version for the Program.

Later license versions may give you additional or different permissions. However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version.

15. Disclaimer of Warranty.

THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

16. Limitation of Liability.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee.

END OF TERMS AND CONDITIONS
===========================

How to Apply These Terms to Your New Programs
=============================================

If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms.

To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively state the exclusion of warranty; and each file should have at least the “copyright” line and a pointer to where the full notice is found.

     ONE LINE TO GIVE THE PROGRAM'S NAME AND A BRIEF IDEA OF WHAT IT DOES.
     Copyright (C) YEAR NAME OF AUTHOR

     This program is free software: you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation, either version 3 of the
     License, or (at your option) any later version.

     This program is distributed in the hope that it will be useful, but
     WITHOUT ANY WARRANTY; without even the implied warranty of
     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
     General Public License for more details.

     You should have received a copy of the GNU General Public License
     along with this program. If not, see <https://www.gnu.org/licenses/>.

Also add information on how to contact you by electronic and paper mail.
If the program does terminal interaction, make it output a short notice like this when it starts in an interactive mode:

     PROGRAM Copyright (C) YEAR NAME OF AUTHOR
     This program comes with ABSOLUTELY NO WARRANTY; for details type
     ‘show w’. This is free software, and you are welcome to
     redistribute it under certain conditions; type ‘show c’ for details.

The hypothetical commands ‘show w’ and ‘show c’ should show the appropriate parts of the General Public License. Of course, your program’s commands might be different; for a GUI interface, you would use an “about box”.

You should also get your employer (if you work as a programmer) or school, if any, to sign a “copyright disclaimer” for the program, if necessary. For more information on this, and how to apply and follow the GNU GPL, see <https://www.gnu.org/licenses/>.

The GNU General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. But first, please read <https://www.gnu.org/philosophy/why-not-lgpl.html>.