/*
  This file is part of MADNESS.

  Copyright (C) 2015 Stony Brook University

  This program is free software; you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation; either version 2 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program; if not, write to the Free Software
  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

  For more information please contact:

  Robert J. Harrison
  Oak Ridge National Laboratory
  One Bethel Valley Road
  P.O. Box 2008, MS-6367

  email: harrisonrj@ornl.gov
  tel:   865-241-3937
  fax:   865-572-0680
*/

/**
 \file serialization.dox
 \brief Overview of the interface templates for archives (serialization).
 \addtogroup serialization

The programmer should not need to include madness/world/archive.h directly. Instead, include the header file for the actual archive (binary file, text/xml file, vector in memory, etc.) that is desired.

\par Background

The interface and implementation are deliberately modelled, albeit loosely, upon the Boost serialization library (thanks, Boost!). The major differences are that this archive class does \em not break cycles and does \em not automatically store unique copies of data referenced by multiple objects. Also, classes are responsible for managing their own version information. At the lowest level, the interface to an archive also differs in order to facilitate vectorization and high-bandwidth data transfer. The implementation employs templates that are almost entirely inlined. This should enable low-overhead use of archives in applications such as interprocess communication.

\par How to use an archive?

An archive is a uni-directional stream of typed data to/from disk, memory, or another process. Whether the stream is for input or for output, you can use the \c & operator to transfer data to/from the stream. If you really want, you can also use \c << for output or \c >> for input, but there is no reason to do so. The \c & operator chains just like \c << for \c cout or \c >> for \c cin. You may discover other interfaces in \c archive.h, but you should \em not use them; use the \c & operator! The lower-level interfaces will probably not incorporate type information, or will do so only inconsistently, and may even appear to work when they do not.

If an archive implements type checking, a C-string exception is thrown on a type mismatch when deserializing. Some archives (e.g., message passing) omit type checking for reasons of efficiency. End-of-file, out-of-memory, and other errors also generate C-string exceptions.

Fundamental types (see below), STL complex, vectors, strings, pairs and maps, and tensors (int, long, float, double, float_complex, double_complex) all work without you doing anything, as do fixed-dimension arrays of the same (STL allocators are not presently accommodated). For example,
\code
  bool finished = false;
  int info[3] = {1, 33, 2};
  map<int, double> fred;
  fred[0] = 55.0; fred[1] = 99.0;

  // The file name is a string literal, so it needs double quotes.
  BinaryFstreamOutputArchive ar("restart.dat");
  ar & fred & info & finished;
\endcode
Deserializing is identical, except that you need to use an input archive and must read the objects in the order in which they were written. For example,
\code
  bool finished;
  int info[3];
  map<int, double> fred;

  BinaryFstreamInputArchive ar("restart.dat");
  ar & fred & info & finished;
\endcode

Variable-dimension and dynamically allocated arrays do not have their dimension encoded in their type. The best way to (de)serialize them is to wrap them in an \c archive_array as follows.
\code
  int a[n]; // n is not known at compile time
  double *p = new double[n];
  ar & wrap(a,n) & wrap(p,n);
\endcode
The \c wrap() function template is a factory function that simplifies instantiation of a correctly typed \c archive_array template. Note that when deserializing, you must have already allocated the array, so the above code can be used for both serializing and deserializing. If you want the memory to be allocated automatically, consider using either an STL vector or a MADNESS tensor.
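
The idea of pairing a raw pointer with a run-time length can be sketched without MADNESS. The following is a minimal, self-contained illustration and is \em not the MADNESS implementation: \c toy::ArrayRef, \c toy::wrap_array and \c toy::ByteBuffer are hypothetical names invented for this sketch, and the buffer handles only trivially copyable element types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

namespace toy {

// Hypothetical stand-in for an array wrapper: a pointer plus a run-time length.
template <class T>
struct ArrayRef {
    T* ptr;
    std::size_t n;
};

// Hypothetical factory: deduces T so the caller never spells the wrapper type.
template <class T>
ArrayRef<T> wrap_array(T* ptr, std::size_t n) {
    return ArrayRef<T>{ptr, n};
}

// Toy in-memory byte buffer (assumes trivially copyable T).
struct ByteBuffer {
    std::vector<unsigned char> bytes;
    std::size_t pos = 0;

    // Append the raw bytes of the n wrapped elements.
    template <class T>
    void store(const ArrayRef<T>& a) {
        const unsigned char* p = reinterpret_cast<const unsigned char*>(a.ptr);
        bytes.insert(bytes.end(), p, p + a.n * sizeof(T));
    }

    // Copy bytes back into memory that the caller has already allocated.
    template <class T>
    void load(const ArrayRef<T>& a) {
        assert(pos + a.n * sizeof(T) <= bytes.size());
        std::memcpy(a.ptr, bytes.data() + pos, a.n * sizeof(T));
        pos += a.n * sizeof(T);
    }
};

// Round-trips n doubles (n > 0); the destination is allocated before loading.
inline bool array_roundtrip(std::size_t n) {
    std::vector<double> src(n), dst(n, 0.0);
    for (std::size_t i = 0; i < n; ++i) src[i] = 0.5 * static_cast<double>(i);
    ByteBuffer buf;
    buf.store(wrap_array(src.data(), n));
    buf.load(wrap_array(dst.data(), n));
    return src == dst;
}

}  // namespace toy
```

The same wrapper expression is used on both the store and the load side, which mirrors the point above that one line of code can serve both directions provided the destination memory already exists.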

To transfer the actual value of a pointer to a stream (is this really what you want?), store an \c archive_ptr wrapping it. The factory function \c wrap_ptr() assists in doing this, e.g., here for a function pointer:
\code
  int foo();
  ar & wrap_ptr(foo);
\endcode

\par User-defined types

User-defined types require a little more effort. Three cases are distinguished.
- symmetric load and store
  - intrusive
  - non-intrusive
- non-symmetric load and store

We will examine each in turn, but we first need to discuss a little about the implementation.

When transferring an object \c obj to/from an archive \c ar with `ar & obj`, you invoke the templated function
\code
  template <class Archive, class T>
  inline const Archive& operator&(const Archive& ar, T& obj);
\endcode
that then invokes other templated functions to redirect to input or output streams as appropriate, manage type checking, etc. We would now like to customize the behavior of these functions in order to accommodate your fancy object. However, function templates cannot be partially specialized. Following the technique recommended <a href="http://www.gotw.ca/publications/mill17.htm">here</a> (look for Moral #2), each of the templated functions directly calls a static member of a templated class. Classes, unlike functions, can be partially specialized, so it is easy to control and predict what is happening. Thus, in order to change the behavior of all archives for an object, you just have to provide a partial specialization of the appropriate class(es). Do \em not overload any of the function templates.
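
The forwarding technique itself can be sketched in a few lines that do not depend on MADNESS. This is a toy under stated assumptions: \c toy::Log and \c toy::StoreImpl are hypothetical names, the "archive" only appends text to a string, and the real class templates carry more machinery than shown here.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace toy {

// Hypothetical toy archive: records a textual log of what was stored.
struct Log {
    mutable std::string out;
};

// Primary class template: default behavior for any T that std::to_string accepts.
template <class Archive, class T>
struct StoreImpl {
    static void store(const Archive& ar, const T& t) {
        ar.out += std::to_string(t) + ";";
    }
};

// Partial specialization: Archive stays generic, only the object type is
// constrained. A function template cannot be partially specialized like this.
template <class Archive, class T>
struct StoreImpl<Archive, std::vector<T>> {
    static void store(const Archive& ar, const std::vector<T>& v) {
        ar.out += "[" + std::to_string(v.size()) + "]";
        for (const T& x : v) StoreImpl<Archive, T>::store(ar, x);
    }
};

// The one function template is never overloaded; it only forwards to the class.
template <class Archive, class T>
const Archive& operator&(const Archive& ar, const T& t) {
    StoreImpl<Archive, T>::store(ar, t);
    return ar;
}

// Stores an int and a vector through the same operator and returns the log.
inline std::string dispatch_demo() {
    Log ar;
    assert(ar.out.empty());  // nothing stored yet
    std::vector<int> v{4, 5};
    ar & 7 & v;
    return ar.out;
}

}  // namespace toy
```

Adding support for a new type then means adding one more specialization of the class template; the forwarding operator is left untouched.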

<em>Symmetric intrusive method</em>

Many classes can use the same code for serializing and deserializing. If such a class can be modified, the cleanest way of enabling serialization is to add a templated method as follows.
\code
  class A {
      float a;

  public:
      A(float a = 0.0) : a(a) {}

      // One method serves both input and output archives.
      template <class Archive>
      inline void serialize(const Archive& ar) {
          ar & a;
      }
  };
\endcode

<em>Symmetric non-intrusive method</em>

If a class with symmetric serialization cannot be modified, then you can provide a partial specialization of the \c ArchiveSerializeImpl class template with the following signature in the \c madness::archive namespace (where \c Obj is the name of your type).
\code
  namespace madness {
      namespace archive {
          template <class Archive>
          struct ArchiveSerializeImpl<Archive,Obj> {
              static inline void serialize(const Archive& ar, Obj& obj);
          };
      }
  }
\endcode

For example,
\code
  class B {
  public:
      bool b;
      B(bool b = false)
          : b(b) {}
  };

  namespace madness {
      namespace archive {
          template <class Archive>
          struct ArchiveSerializeImpl<Archive, B> {
              static inline void serialize(const Archive& ar, B& b) {
                  ar & b.b;
              }
          };
      }
  }
\endcode

<em>Non-symmetric non-intrusive</em>

For classes that do not have symmetric (de)serialization, you must define separate partial specializations of the class templates that provide \c load and \c store, with these signatures and again in the \c madness::archive namespace.
\code
  namespace madness {
      namespace archive {
          template <class Archive>
          struct ArchiveLoadImpl<Archive, Obj> {
              static inline void load(const Archive& ar, Obj& obj);
          };

          template <class Archive>
          struct ArchiveStoreImpl<Archive, Obj> {
              static inline void store(const Archive& ar, const Obj& obj);
          };
      }
  }
\endcode

First, a simple but artificial example.
\code
  class C {
  public:
      long c;
      C(long c = 0)
          : c(c) {}
  };

  namespace madness {
      namespace archive {
          template <class Archive>
          struct ArchiveLoadImpl<Archive, C> {
              static inline void load(const Archive& ar, C& c) {
                  ar & c.c;
              }
          };

          template <class Archive>
          struct ArchiveStoreImpl<Archive, C> {
              static inline void store(const Archive& ar, const C& c) {
                  ar & c.c;
              }
          };
      }
  }
\endcode

Now a more complicated example that genuinely requires asymmetric load and store. First, a class definition for a simple linked list.
\code
  class linked_list {
      int value;
      linked_list *next;

  public:
      linked_list(int value = 0)
          : value(value), next(0) {}

      // Appends a new node holding value at the tail of the list.
      void append(int value) {
          if (next)
              next->append(value);
          else
              next = new linked_list(value);
      }

      void set_value(int val) {
          value = val;
      }

      int get_value() const {
          return value;
      }

      linked_list* get_next() const {
          return next;
      }
  };
\endcode
And this is how you (de)serialize it.
\code
  namespace madness {
      namespace archive {
          template <class Archive>
          struct ArchiveStoreImpl<Archive, linked_list> {
              static void store(const Archive& ar, const linked_list& c) {
                  // Write the value, then a flag saying whether another node follows.
                  ar & c.get_value() & bool(c.get_next());
                  if (c.get_next())
                      ar & *c.get_next();
              }
          };

          template <class Archive>
          struct ArchiveLoadImpl<Archive, linked_list> {
              static void load(const Archive& ar, linked_list& c) {
                  int value;
                  bool flag;

                  ar & value & flag;
                  c.set_value(value);
                  if (flag) {
                      // Create the next node, then fill it in recursively.
                      c.append(0);
                      ar & *c.get_next();
                  }
              }
          };
      }
  }
\endcode

Given the above implementation of a linked list, you can (de)serialize an entire list using a single statement.
\code
  linked_list list(0);
  for (int i=1; i<=10; ++i)
      list.append(i);

  BinaryFstreamOutputArchive ar("list.dat");
  ar & list;
\endcode

\par Non-default constructor

There are various options for objects that do not have a default constructor. The most appealing and totally non-intrusive approach is to define load/store functions for a pointer to the object. Then in the load method you can deserialize all of the information necessary to invoke the constructor and return a pointer to a new object.
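
As a rough illustration of this idea, independent of MADNESS, the sketch below stores the constructor argument first and rebuilds the object from it on load. Everything here is an assumption made for the example: \c toy::Grid is a hypothetical class, a \c std::stringstream stands in for the archive, and the loader returns a \c std::unique_ptr rather than a raw pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <sstream>
#include <vector>

namespace toy {

// Hypothetical class with no default constructor.
class Grid {
    std::vector<double> data;

public:
    explicit Grid(std::size_t n) : data(n, 0.0) {}
    std::size_t size() const { return data.size(); }
    double& operator[](std::size_t i) { return data[i]; }
    double operator[](std::size_t i) const { return data[i]; }
};

// Store through a pointer: constructor argument first, then the state.
inline void store(std::ostream& os, const Grid* g) {
    assert(g != nullptr);
    os << g->size() << ' ';
    for (std::size_t i = 0; i < g->size(); ++i) os << (*g)[i] << ' ';
}

// Load: read what the constructor needs, construct, then fill in the state.
inline std::unique_ptr<Grid> load(std::istream& is) {
    std::size_t n = 0;
    is >> n;
    std::unique_ptr<Grid> g(new Grid(n));
    for (std::size_t i = 0; i < n; ++i) is >> (*g)[i];
    return g;
}

// Round-trips a three-element grid through a text stream.
inline bool grid_roundtrip() {
    Grid a(3);
    a[0] = 1.5; a[1] = -2.0; a[2] = 4.25;
    std::stringstream ss;
    store(ss, &a);
    std::unique_ptr<Grid> b = load(ss);
    return b->size() == 3 && (*b)[0] == 1.5 && (*b)[1] == -2.0 && (*b)[2] == 4.25;
}

}  // namespace toy
```

The caller never needs a default-constructed \c Grid to load into, because the loader owns construction.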

Things that you know are contiguously stored in memory and are painful to serialize with full type safety can be serialized by wrapping opaquely as byte streams using the \c wrap_opaque() interface. However, this should be regarded as a last resort.

\par Type checking and registering your own types

Type-checked archives (currently all except the MPI archive) store a cookie (a byte with value 0-255) with each datum. To enable type checking for user-defined types you must register them with the system; there are 64 empty slots for user types beginning at cookie=128. Unregistered user-defined types all end up with the same cookie indicating "unknown", i.e., there is no type checking unless you register.

Two steps are required to register your own types (e.g., here for the types \c %Foo and \c Bar).
-# In a header file, after including madness/world/archive.h, associate your types and pointers to them with cookie values.
  \code
    namespace madness {
        namespace archive {
            ARCHIVE_REGISTER_TYPE_AND_PTR(Foo,128);
            ARCHIVE_REGISTER_TYPE_AND_PTR(Bar,129);
        }
    }
  \endcode
-# In a single source file containing your initialization routine, define a macro to force instantiation of relevant templates.
  \code
    #define ARCHIVE_REGISTER_TYPE_INSTANTIATE_HERE
  \endcode
  Then, in the initialization routine, register the names of your types as follows.
  \code
    ARCHIVE_REGISTER_TYPE_AND_PTR_NAMES(Foo);
    ARCHIVE_REGISTER_TYPE_AND_PTR_NAMES(Bar);
  \endcode
Have a look at the test in \c madness/world/test_ar.cc to see things in action.
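
To make the cookie mechanism concrete, here is a toy sketch of a buffer that writes one type byte ahead of each datum and throws a C-string on mismatch. It is not MADNESS code: \c toy::Cookie, \c toy::TypedBuffer and the cookie values 1, 2 and 255 are all invented for the illustration, and only trivially copyable types are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

namespace toy {

// Hypothetical cookie table; the numeric values are arbitrary.
template <class T> struct Cookie { static unsigned char value() { return 255; } };  // unknown
template <> struct Cookie<int>    { static unsigned char value() { return 1; } };
template <> struct Cookie<double> { static unsigned char value() { return 2; } };

struct TypedBuffer {
    std::vector<unsigned char> bytes;
    std::size_t pos = 0;

    // Write the cookie byte, then the raw bytes of the datum.
    template <class T>
    void store(const T& t) {
        bytes.push_back(Cookie<T>::value());
        const unsigned char* p = reinterpret_cast<const unsigned char*>(&t);
        bytes.insert(bytes.end(), p, p + sizeof(T));
    }

    // Check the cookie before touching the payload; errors are C-strings.
    template <class T>
    void load(T& t) {
        if (pos >= bytes.size()) throw "end of buffer";
        if (bytes[pos] != Cookie<T>::value()) throw "type mismatch";
        if (pos + 1 + sizeof(T) > bytes.size()) throw "truncated datum";
        std::memcpy(&t, &bytes[pos + 1], sizeof(T));
        pos += 1 + sizeof(T);
    }
};

// Stores two ints, reads one back correctly, then reads the other as a double.
inline std::string mismatch_demo() {
    TypedBuffer buf;
    buf.store(42);
    buf.store(43);

    int ok = 0;
    buf.load(ok);
    assert(ok == 42);

    try {
        double wrong = 0.0;
        buf.load(wrong);  // the stored cookie says int
    } catch (const char* msg) {
        return msg;
    }
    return "no error";
}

}  // namespace toy
```

In this toy, any type without its own \c Cookie specialization shares the value 255, so two different unregistered types would pass the check against each other, which is the same limitation described above for unregistered user types.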

\par Types of archive

Presently provided are
- madness/world/text_fstream_archive.h: (text \c std::fstream) a file in text (XML).
- madness/world/binary_fstream_archive.h: (binary \c std::fstream) a file in binary.
- madness/world/vector_archive.h: binary in memory using a \c std::vector<unsigned char>.
- madness/world/buffer_archive.h: binary in a memory buffer (this is rather heavily specialized for internal use, so applications should use a vector instead).
- madness/world/mpi_archive.h: binary stream for point-to-point communication using MPI (non-typesafe for efficiency).
- madness/world/parallel_archive.h: parallel archive to a binary file with multiple readers/writers. This is here mostly to support efficient transfer of large \c WorldContainer (madness/world/worlddc.h) and MADNESS \c Function (mra/mra.h) objects, though any serializable object can employ it.

The buffer and \c vector archives are bitwise identical to the binary file archive.

\par Implementing a new archive

Minimally, an archive must derive from either \c BaseInputArchive or \c BaseOutputArchive and define for arrays of fundamental types either a \c load or \c store method, as appropriate. Additional methods can be provided to manipulate the target stream. Here is a simple, but functional, implementation of a binary file archive.
\code
  #include <fstream>
  #include <madness/world/archive.h>
  using namespace std;
  using namespace madness::archive; // for BaseOutputArchive and BaseInputArchive

  class OutputArchive : public BaseOutputArchive {
      mutable ofstream os;

  public:
      OutputArchive(const char* filename)
          : os(filename, ios_base::binary | ios_base::out | ios_base::trunc)
      {}

      // Writes n objects of type T as raw bytes.
      template <class T>
      void store(const T* t, long n) const {
          os.write(reinterpret_cast<const char*>(t), n*sizeof(T));
      }
  };

  class InputArchive : public BaseInputArchive {
      mutable ifstream is;

  public:
      InputArchive(const char* filename)
          : is(filename, ios_base::binary | ios_base::in)
      {}

      // Reads n objects of type T as raw bytes into memory owned by the caller.
      template <class T>
      void load(T* t, long n) const {
          is.read(reinterpret_cast<char*>(t), n*sizeof(T));
      }
  };
\endcode
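
The example above does not check the stream state after reading or writing. One possible way to add such checks is sketched below as a standalone toy that uses the same \c store(const T*, long) and \c load(T*, long) shapes. The assumptions are that string streams replace the file, the MADNESS base classes are left out so the block compiles on its own, and \c toy::CheckedOutput and \c toy::CheckedInput are hypothetical names.

```cpp
#include <cassert>
#include <sstream>
#include <string>

namespace toy {

// Output side: raw bytes go to an in-memory stream; failures become C-strings.
class CheckedOutput {
    mutable std::ostringstream os;

public:
    template <class T>
    void store(const T* t, long n) const {
        os.write(reinterpret_cast<const char*>(t), n * sizeof(T));
        if (!os) throw "write failed";
    }

    std::string str() const { return os.str(); }
};

// Input side: a short read is reported instead of silently leaving garbage.
class CheckedInput {
    mutable std::istringstream is;

public:
    explicit CheckedInput(const std::string& bytes) : is(bytes) {}

    template <class T>
    void load(T* t, long n) const {
        is.read(reinterpret_cast<char*>(t), n * sizeof(T));
        if (!is) throw "end of file";
    }
};

// Writes two doubles, reads them back, then tries to read one more.
inline std::string short_read_demo() {
    CheckedOutput out;
    const double x[2] = {1.0, 2.0};
    out.store(x, 2);

    CheckedInput in(out.str());
    double y[2] = {0.0, 0.0};
    in.load(y, 2);
    assert(y[0] == 1.0 && y[1] == 2.0);

    try {
        double extra = 0.0;
        in.load(&extra, 1);  // nothing is left to read
    } catch (const char* msg) {
        return msg;
    }
    return "no error";
}

}  // namespace toy
```

Throwing a C-string follows the exception convention described earlier in this document; a real archive could equally report the failure in another way.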
*/
