.. _rfc-12:

================================================================================
RFC 12: Improved File Management
================================================================================

Author: Frank Warmerdam

Contact: warmerdam@pobox.com

Status: Adopted / Implemented

Summary
-------

Some applications using GDAL need to provide file management operations
through the GUI. This includes deleting, renaming, moving and packaging
up datasets, which often requires operating on several associated files.
This RFC introduces an operation on a GDALDataset to identify all the
dataset files, and operations to move or copy them.

GetFileList()
-------------

The following new virtual method is added on the GDALDataset class, with
an analogous C function.

::

   virtual char **GDALDataset::GetFileList(void);

The method is intended to return a list of files associated with this
open dataset. The return value is a NULL-terminated string list that
becomes owned by the caller and should be deallocated with CSLDestroy().

The default implementation tests whether the name of the datasource is a
file; if so, that name is returned, otherwise an empty list is returned.
If the default overview manager is active, and has overviews, those will
also be included in the file list. The default implementation also
checks for world files, but only those with extensions based on the
original file's extension (e.g. .tfw or .tifw for .tif). It does not
search for .wld since that extension is not very specific.

The GDALPamDataset::GetFileList() method will add the ability to find
.aux and .aux.xml files associated with a dataset to the core default
behavior.

pfnRename()
-----------

The following new function is added to the GDALDriver class.

::

   CPLErr (*pfnRename)( const char *pszNewName, const char *pszOldName );

A corresponding function is also added to the C API.

::

   CPLErr GDALRenameDataset( GDALDriverH hDriver, const char *pszNewName, const char *pszOldName );

Note that renaming is done by the driver, but the dataset to be operated
on should *not* be open at the time. GDALRenameDataset() will invoke
pfnRename if it is non-NULL.

If pfnRename is NULL the default implementation will be used, which will
open the dataset, fetch the file list, close the dataset, and then try
to rename all the files (based on shared basenames). The default rename
operation will fail if it is unable to establish a relationship between
the files (i.e. a common basename or stem) to indicate how the group of
files should be renamed to the new pattern.

Optionally a NULL hDriver argument may be passed in, in which case the
appropriate driver will be selected by first opening the datasource.

CPLMoveFile()
-------------

The POSIX rename() function on which VSIRename() is usually based does
not normally allow renaming files between file systems or between
different kinds of file systems (e.g. /vsimem to C:/abc). In order to
implement GDALRenameDataset() such that it works efficiently within a
file system, but still works between file systems, a new operation will
be added to gdal/port. This is the CPLMoveFile() function, which will
first try a VSIRename(). If that fails it will use CPLCopyFile() to copy
the whole file and then VSIUnlink() to get rid of the old file.

::

   int CPLMoveFile( const char *pszNewFilename, const char *pszOldFilename );

The return value will be zero on success, otherwise an errno style
value.
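
The two mechanisms described above, mapping each file onto the new
basename and moving it with a rename-then-copy fallback, can be
illustrated with a small standalone sketch. It is not GDAL's
implementation: StemOf(), CorrespondingNamesSketch(), CopyFileSketch()
and MoveFileSketch() are hypothetical names, the standard library's
rename() and remove() stand in for VSIRename() and VSIUnlink(), and the
stem match is assumed to be a plain prefix comparison.

```cpp
#include <cerrno>
#include <cstdio>
#include <string>
#include <vector>

// Directory plus basename without the final extension (the shared "stem").
static std::string StemOf(const std::string &osPath)
{
    const std::size_t nSlash = osPath.find_last_of("/\\");
    const std::size_t nDot = osPath.rfind('.');
    if (nDot == std::string::npos ||
        (nSlash != std::string::npos && nDot < nSlash))
        return osPath;  // no extension in the last path component
    return osPath.substr(0, nDot);
}

// Map each old file onto the new name. Returns false if a file does not
// start with the old stem, i.e. no relationship can be established.
// Assumption: a plain prefix comparison is enough for this illustration.
bool CorrespondingNamesSketch(const std::string &osOldName,
                              const std::string &osNewName,
                              const std::vector<std::string> &aosOldFiles,
                              std::vector<std::string> &aosNewFiles)
{
    const std::string osOldStem = StemOf(osOldName);
    const std::string osNewStem = StemOf(osNewName);
    aosNewFiles.clear();
    for (const std::string &osFile : aosOldFiles)
    {
        if (osFile.compare(0, osOldStem.size(), osOldStem) != 0)
            return false;
        aosNewFiles.push_back(osNewStem + osFile.substr(osOldStem.size()));
    }
    return true;
}

// Byte-for-byte copy standing in for CPLCopyFile(). Returns 0 on success,
// otherwise an errno-style value; a partial destination file is removed.
static int CopyFileSketch(const char *pszNew, const char *pszOld)
{
    std::FILE *fpIn = std::fopen(pszOld, "rb");
    if (fpIn == nullptr)
        return errno != 0 ? errno : EIO;
    std::FILE *fpOut = std::fopen(pszNew, "wb");
    if (fpOut == nullptr)
    {
        const int nOpenErr = errno != 0 ? errno : EIO;
        std::fclose(fpIn);
        return nOpenErr;
    }

    char abyBuffer[8192];
    std::size_t nRead = 0;
    int nErr = 0;
    while (nErr == 0 &&
           (nRead = std::fread(abyBuffer, 1, sizeof(abyBuffer), fpIn)) > 0)
    {
        if (std::fwrite(abyBuffer, 1, nRead, fpOut) != nRead)
            nErr = errno != 0 ? errno : EIO;  // e.g. destination is full
    }
    if (nErr == 0 && std::ferror(fpIn))
        nErr = EIO;
    std::fclose(fpIn);
    if (std::fclose(fpOut) != 0 && nErr == 0)
        nErr = errno != 0 ? errno : EIO;
    if (nErr != 0)
        std::remove(pszNew);
    return nErr;
}

// Rename first (cheap within one file system), then copy and delete.
int MoveFileSketch(const char *pszNew, const char *pszOld)
{
    if (std::rename(pszOld, pszNew) == 0)
        return 0;
    const int nErr = CopyFileSketch(pszNew, pszOld);
    if (nErr != 0)
        return nErr;
    return std::remove(pszOld) == 0 ? 0 : (errno != 0 ? errno : EIO);
}
```

With an old name of data/a.tif and a new name of out/b.tif, the file
list data/a.tif, data/a.tfw, data/a.tif.ovr maps to out/b.tif,
out/b.tfw, out/b.tif.ovr, while a list containing data/other.hdr is
rejected. Each resulting pair would then be handed to the move routine.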

It should be noted that in some error conditions, such as the
destination file system running out of space during a copy, some files
of a dataset may get renamed while others do not, leaving things in an
inconsistent state.

pfnCopyFiles()
--------------

The following new function is added to the GDALDriver class.

::

   CPLErr (*pfnCopyFiles)( const char *pszNewName, const char *pszOldName );

A corresponding function is also added to the C API.

::

   CPLErr GDALCopyDatasetFiles( GDALDriverH hDriver, const char *pszNewName, const char *pszOldName );

Note that copying is done by the driver. The dataset may be open, but
if it is open in update mode it may be prudent to first do a flush to
synchronize the in-process state with what is on disk.
GDALCopyDatasetFiles() will invoke pfnCopyFiles if it is non-NULL.

If pfnCopyFiles is NULL the default implementation will be used, which
will open the dataset, fetch the file list, close the dataset, and then
try to copy all the files (based on shared basenames). The default copy
operation will fail if it is unable to establish a relationship between
the files (i.e. a common basename or stem) to indicate how the group of
files should be renamed to the new pattern.

Optionally a NULL hDriver argument may be passed in, in which case the
appropriate driver will be selected by first opening the datasource.

Copy is essentially the same as Rename, but the original files are
unaltered. Note that this form of copy is distinct from CreateCopy() in
that it preserves the exact binary files on disk in the new location,
while CreateCopy() just attempts to reproduce a new dataset with
essentially the same data as modelled and carried through GDAL.

pfnDelete()
-----------

The delete operation's default implementation will be extended to use
the GetFileList() results.
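
As a rough illustration of a delete driven by a file list, the sketch
below walks a NULL-terminated string list of the kind GetFileList()
returns and removes each entry. DeleteFileListSketch() is a hypothetical
name, the standard library's remove() stands in for VSIUnlink(), and the
choice to continue after a failure rather than stop is an assumption of
this sketch, not something this RFC specifies.

```cpp
#include <cstdio>

// Remove every file named in a NULL-terminated string list (the layout
// GetFileList() returns). Assumption: keep going after a failure so that
// as many files as possible are removed. Returns the number of files that
// could not be deleted; 0 means the whole list was removed.
int DeleteFileListSketch(char **papszFileList)
{
    int nFailures = 0;
    for (int i = 0; papszFileList != nullptr && papszFileList[i] != nullptr;
         ++i)
    {
        if (std::remove(papszFileList[i]) != 0)
            ++nFailures;
    }
    return nFailures;
}
```

A non-zero return signals that some files survived, which is the
inconsistent state a caller would need to report.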

Supporting Functions
--------------------

Supporting functions should be provided to make it easy to identify
world files, .aux files and .prj files associated with a file.

Drivers Updated
---------------

It is anticipated that a majority of the commonly used drivers will be
updated with custom GetFileList() methods that account for world files
and other idiosyncratic files. Particular emphasis will be placed on
handling the various formats in gdal/frmts/raw that consist of a header
file and a raw binary file.

Drivers for "one file formats" that are not updated will still use the
default logic, which should work fairly well but might neglect auxiliary
world files.

- VRT: I do not anticipate updating the VRT driver at this time since
  it gets quite complicated to collect a file list for some kinds of
  virtual files. It is also not exactly clear whether related files
  should be considered "owned" by the virtual dataset or not.
- AIGRID: I will implement a custom rename operation in an attempt to
  handle this directory oriented format gracefully.

Additional Notes
----------------

- Subdatasets will generally return an empty file list from
  GetFileList(), and will not be manageable via Rename or Delete, though
  a very sophisticated driver could implement these operations.
- There is no mechanism anticipated to ensure that files are closed
  before they are removed. If an application does not ensure this,
  rename/move operations may fail on win32 since it doesn't allow
  rename/delete operations on open files. Things could easily be left
  in an inconsistent state.
- Datasets without associated files in the file system will return an
  empty file list. This essentially identifies them as "unmanageable".

Implementation Plan
-------------------

This change will be implemented by Frank Warmerdam in trunk in time for
the 1.5.0 release.

SWIG Implications
-----------------

The GDALRenameDataset() and GDALCopyDatasetFiles() operations on the
driver, and the GetFileList() operation on the dataset, will need to be
exposed through SWIG.

Testing
-------

Rename and CopyFiles testing will be added to the regression tests for a
few representative formats. These rename operations will be between one
directory and another, and will not test cross file system copying, which
will have to be tested manually.

A small gdalmanage utility will be implemented, allowing use and testing
of the identify, rename, copy and delete operations from the command
line in a convenient fashion.