.. _rfc-12:

================================================================================
RFC 12: Improved File Management
================================================================================

Author: Frank Warmerdam

Contact: warmerdam@pobox.com

Status: Adopted / Implemented

Summary
-------

Some applications using GDAL have a requirement to provide file
management operations through the GUI. This includes deleting, renaming,
moving and packaging up datasets, which often requires operating on
several associated files. This RFC introduces an operation on a
GDALDataset to identify all the dataset files, and operations to move or
copy them.

GetFileList()
-------------

The following new virtual method is added on the GDALDataset class, with
an analogous C function.

::

      virtual char   **GDALDataset::GetFileList(void);

The method is intended to return a list of files associated with this
open dataset. The return value is a NULL terminated string list which
becomes owned by the caller and should be deallocated with CSLDestroy().
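As an illustration of the return convention only, the sketch below walks a
NULL terminated string list of the kind GetFileList() is described as
returning. The helper names CountFiles and ToVector are hypothetical and are
not part of GDAL. In real code the list would come from the dataset and would
be released with CSLDestroy() once it has been read.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of GDAL): count the entries of a
// NULL terminated string list. A NULL list pointer counts as empty.
std::size_t CountFiles(const char *const *papszFileList)
{
    std::size_t nCount = 0;
    if (papszFileList != nullptr)
    {
        while (papszFileList[nCount] != nullptr)
            ++nCount;
    }
    return nCount;
}

// Hypothetical helper: copy the entries into a std::vector so the caller can
// release the original list as soon as it has been read.
std::vector<std::string> ToVector(const char *const *papszFileList)
{
    std::vector<std::string> aosFiles;
    const std::size_t nCount = CountFiles(papszFileList);
    for (std::size_t i = 0; i < nCount; ++i)
        aosFiles.emplace_back(papszFileList[i]);
    return aosFiles;
}
```

Copying the names out first keeps the lifetime of the caller-owned list short,
which makes it harder to forget the deallocation step.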

The default implementation tests the name of the datasource to see if it
is a file; if so, that name is returned, otherwise an empty list is
returned. If the default overview manager is active, and has overviews,
those will also be included in the file list. The default implementation
also checks for world files, but only those with extensions based on the
original file's extension (e.g. .tfw or .tifw for .tif). It does not
search for .wld since that is not very specific.
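The extension rule can be sketched as follows. This is an assumption
generalized from the .tif example above (first and last letter of the
extension plus "w", and the full extension plus "w"), not a copy of GDAL's
implementation, and the function name WorldFileCandidates is hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not GDAL's implementation): derive candidate world
// file names from a file name, following the ".tif" -> ".tfw" / ".tifw"
// example. Returns an empty list when there is no usable extension.
std::vector<std::string> WorldFileCandidates(const std::string &osFilename)
{
    std::vector<std::string> aosCandidates;

    // Ignore dots that belong to a directory name rather than the file name.
    const std::size_t nSep = osFilename.find_last_of("/\\");
    const std::size_t nDot = osFilename.find_last_of('.');
    if (nDot == std::string::npos ||
        (nSep != std::string::npos && nDot < nSep))
        return aosCandidates;

    const std::string osStem = osFilename.substr(0, nDot);
    const std::string osExt = osFilename.substr(nDot + 1);
    if (osExt.empty())
        return aosCandidates;

    // First and last letter of the extension plus 'w' (tif -> tfw).
    aosCandidates.push_back(osStem + "." + osExt.front() + osExt.back() + "w");
    // Full extension plus 'w' (tif -> tifw).
    aosCandidates.push_back(osStem + "." + osExt + "w");
    return aosCandidates;
}
```

A caller would then test each candidate for existence and add the ones found
to the file list. The generic .wld name is deliberately not generated.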

The GDALPamDataset::GetFileList() method will add the ability to find
.aux and .aux.xml files associated with a dataset to the core default
behavior.

pfnRename()
-----------

The following new function is added to the GDALDriver class.

::

       CPLErr       (*pfnRename)( const char *pszNewName, const char *pszOldName );

Also a corresponding function is added to the C API.

::

       CPLErr        GDALRenameDataset( GDALDriverH hDriver, const char *pszNewName, const char *pszOldName );

Note that renaming is done by the driver, but the dataset to be operated
on should *not* be open at the time. GDALRenameDataset() will invoke
pfnRename if it is non-NULL.

If pfnRename is NULL the default implementation will be used, which will
open the dataset, fetch the file list, close the dataset, and then try
to rename all the files (based on shared basenames). The default rename
operation will fail if it is unable to establish a relationship between
the files (i.e. a common basename or stem) to indicate how the group of
files should be renamed to the new pattern.
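One way to picture the shared-basename step is the sketch below. It assumes
the stem is simply the dataset name with its final extension removed, which is
a simplification chosen for illustration. PlanRename and StripExtension are
hypothetical names, not GDAL functions.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not GDAL's implementation): strip the final extension
// of the file name component, leaving any directory part untouched.
static std::string StripExtension(const std::string &osName)
{
    const std::size_t nSep = osName.find_last_of("/\\");
    const std::size_t nDot = osName.find_last_of('.');
    if (nDot == std::string::npos ||
        (nSep != std::string::npos && nDot < nSep))
        return osName;
    return osName.substr(0, nDot);
}

// Build (old, new) name pairs for every file in the list. Returns false and
// clears the plan if some file does not start with the old stem, since there
// is then no obvious way to name it under the new pattern.
bool PlanRename(const std::vector<std::string> &aosFiles,
                const std::string &osOldName, const std::string &osNewName,
                std::vector<std::pair<std::string, std::string>> &aoPlan)
{
    const std::string osOldStem = StripExtension(osOldName);
    const std::string osNewStem = StripExtension(osNewName);

    aoPlan.clear();
    for (const std::string &osFile : aosFiles)
    {
        if (osFile.compare(0, osOldStem.size(), osOldStem) != 0)
        {
            aoPlan.clear();
            return false;
        }
        // Keep whatever followed the old stem (".tif", ".tif.ovr", ".tfw").
        aoPlan.emplace_back(osFile,
                            osNewStem + osFile.substr(osOldStem.size()));
    }
    return true;
}
```

Building the complete plan before touching any file means a dataset whose
files do not share a stem is rejected up front rather than half renamed.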

Optionally a NULL hDriver argument may be passed in, in which case the
appropriate driver will be selected by first opening the datasource.

CPLMoveFile()
-------------

The POSIX rename() function on which VSIRename() is usually based does
not normally allow renaming files between file systems or between
different kinds of file systems (e.g. /vsimem to C:/abc). In order to
implement GDALRenameDataset() such that it works efficiently within a
file system, but still works between file systems, a new operation will
be added to gdal/port. This is the CPLMoveFile() function, which will
first try a VSIRename(). If that fails it will use CPLCopyFile() to copy
the whole file and then VSIUnlink() to get rid of the old file.

::

     int CPLMoveFile( const char *pszNewFilename, const char *pszOldFilename );

The return value will be zero on success, otherwise an errno style
value.
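The rename-then-copy fallback can be sketched with the standard library alone.
This is not the CPLMoveFile() implementation: std::rename(), a stream copy and
std::remove() stand in for VSIRename(), CPLCopyFile() and VSIUnlink(), and the
names MoveFileSketch and CopyWholeFile are hypothetical. The error codes
returned by the copy step are approximations picked for the sketch.

```cpp
#include <cerrno>
#include <cstdio>
#include <fstream>

// Hypothetical stand-in for the copy step: stream the whole file across.
// Returns 0 on success, otherwise an errno style value (approximate).
static int CopyWholeFile(const char *pszNew, const char *pszOld)
{
    std::ifstream oSrc(pszOld, std::ios::binary);
    if (!oSrc)
        return ENOENT;
    std::ofstream oDst(pszNew, std::ios::binary | std::ios::trunc);
    if (!oDst)
        return EACCES;

    char achBuffer[8192];
    // The final, partial read fails but still reports its byte count.
    while (oSrc.read(achBuffer, sizeof(achBuffer)) || oSrc.gcount() > 0)
        oDst.write(achBuffer, oSrc.gcount());
    oDst.close();
    return oDst ? 0 : EIO;
}

// Argument order follows the signature above: new name first, old name second.
int MoveFileSketch(const char *pszNewFilename, const char *pszOldFilename)
{
    // Fast path: a plain rename works within a single file system.
    if (std::rename(pszOldFilename, pszNewFilename) == 0)
        return 0;

    // Fallback: copy the contents, then remove the original.
    const int nErr = CopyWholeFile(pszNewFilename, pszOldFilename);
    if (nErr != 0)
        return nErr;
    if (std::remove(pszOldFilename) != 0)
        return errno != 0 ? errno : EIO;
    return 0;
}
```

The sketch does not clean up a partially written destination when the copy
fails, which is one way a dataset could end up in an inconsistent state.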

It should be noted that in some error conditions, such as the
destination file system running out of space during a copy, some files
for a dataset may get renamed while others do not, leaving things in an
inconsistent state.

pfnCopyFiles()
--------------

The following new function is added to the GDALDriver class.

::

       CPLErr       (*pfnCopyFiles)( const char *pszNewName, const char *pszOldName );

Also a corresponding function is added to the C API.

::

       CPLErr        GDALCopyDatasetFiles( GDALDriverH hDriver, const char *pszNewName, const char *pszOldName );

Note that copying is done by the driver. The dataset may be opened, but
if opened in update mode it may be prudent to first do a flush to
synchronize the in-process state with what is on disk.
GDALCopyDatasetFiles() will invoke pfnCopyFiles if it is non-NULL.

If pfnCopyFiles is NULL the default implementation will be used, which
will open the dataset, fetch the file list, close the dataset, and then
try to copy all the files (based on shared basenames). The default copy
operation will fail if it is unable to establish a relationship between
the files (i.e. a common basename or stem) to indicate how the group of
files should be named under the new pattern.
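The hook-or-default dispatch described above can be sketched as follows. The
types here are simplified stand-ins invented for the example: DriverSketch is
not GDALDriver, plain int replaces CPLErr, and the default routine is a stub
that only records that it was selected.

```cpp
#include <string>

// Hypothetical, simplified stand-ins for this sketch; these are not the
// GDAL types. Return 0 for success and non-zero for failure.
typedef int (*CopyFilesFunc)(const char *pszNewName, const char *pszOldName);

struct DriverSketch
{
    CopyFilesFunc pfnCopyFiles = nullptr;  // optional driver specific hook
};

// Records which path handled the most recent request, for illustration.
static std::string osLastHandler;

static int DefaultCopyFiles(const char * /*pszNewName*/,
                            const char * /*pszOldName*/)
{
    // A real default would open the dataset, fetch its file list, close it,
    // and copy each file; this stub only records that it was chosen.
    osLastHandler = "default";
    return 0;
}

int CopyDatasetFilesSketch(const DriverSketch &oDriver,
                           const char *pszNewName, const char *pszOldName)
{
    // Prefer the driver's own implementation when it provides one.
    if (oDriver.pfnCopyFiles != nullptr)
        return oDriver.pfnCopyFiles(pszNewName, pszOldName);
    return DefaultCopyFiles(pszNewName, pszOldName);
}
```

The same shape applies to the rename operation: a driver only needs to supply
a hook when the generic file list based logic is not adequate for its format.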

Optionally a NULL hDriver argument may be passed in, in which case the
appropriate driver will be selected by first opening the datasource.

Copy is essentially the same as Rename, but the original files are
unaltered. Note that this form of copy is distinct from CreateCopy() in
that it preserves the exact binary files on disk in the new location,
while CreateCopy() just attempts to reproduce a new dataset with
essentially the same data as modelled and carried through GDAL.

pfnDelete()
-----------

The delete operation's default implementation will be extended to use
the GetFileList() results.

Supporting Functions
--------------------

Some sort of supporting functions should be provided to make it easy to
identify world files, .aux files and .prj files associated with a file.

Drivers Updated
---------------

It is anticipated that a majority of the commonly used drivers will be
updated with custom GetFileList() methods that account for world files
and other idiosyncratic files. Particular emphasis will be placed on
handling the various formats in gdal/frmts/raw that consist of a header
file and a raw binary file.

Drivers for "one file formats" that are not updated will still use the
default logic, which should work fairly well but might neglect auxiliary
world files.

-  VRT: I do not anticipate updating the VRT driver at this time since
   it gets quite complicated to collect a file list for some kinds of
   virtual files. It is also not exactly clear whether related files
   should be considered "owned" by the virtual dataset or not.
-  AIGRID: I will implement a custom rename operation in an attempt to
   handle this directory oriented format gracefully.

Additional Notes
----------------

-  Subdatasets will generally return an empty file list from
   GetFileList(), and will not be manageable via Rename or Delete, though
   a very sophisticated driver could implement these operations.
-  There is no mechanism anticipated to ensure that files are closed
   before they are removed. If an application does not ensure this,
   rename/move operations may fail on win32 since it doesn't allow
   rename/delete operations on open files. Things could easily be left
   in an inconsistent state.
-  Datasets without associated files in the file system will return an
   empty file list. This essentially identifies them as "unmanageable".

Implementation Plan
-------------------

This change will be implemented by Frank Warmerdam in trunk in time for
the 1.5.0 release.

SWIG Implications
-----------------

The GDALRenameDataset() and GDALCopyDatasetFiles() operations on the
driver, and the GetFileList() operation on the dataset, will need to be
exposed through SWIG.

Testing
-------

Rename and CopyFiles testing will be added to the regression tests for a
few representative formats. These rename operations will be between one
directory and another, and will not test cross file system copying, which
will have to be tested manually.

A small gdalmanage utility will be implemented allowing use and testing
of the identify, rename, copy and delete operations from the command
line in a convenient fashion.