xref: /dragonfly/sys/vfs/hammer2/TODO (revision 2e0c716d)

* bulkfree - sync between passes and enforce serialization of operation

* bulkfree - signal check, allow interrupt

* bulkfree - sub-passes when kernel memory block isn't large enough

* bulkfree - limit kernel memory allocation for bmap space

* bulkfree - must include any detached vnodes in scan so open unlinked files
	     are not ripped out from under the system.

* bulkfree - must include all volume headers in scan so they can be used
	     for recovery or automatic snapshot retrieval.

* bulkfree - snapshot duplicate sub-tree cache and tests needed to reduce
	     unnecessary re-scans.
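
  Sketch for the sub-pass and memory-limit items above, assuming the scan
  can be split by bmap byte range.  bulkfree_pass_count() and
  bulkfree_pass_range() are made-up names, not existing hammer2 functions,
  and the sizes are whatever the caller measures:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper.  Given the bytes of bmap space a full scan
 * would need and the cap on kernel memory for that space, return
 * how many sub-passes are required.
 */
static uint64_t
bulkfree_pass_count(uint64_t bmap_bytes_needed, uint64_t bmap_bytes_limit)
{
	assert(bmap_bytes_limit > 0);
	if (bmap_bytes_needed == 0)
		return 0;
	/* round up without overflowing on large inputs */
	return (bmap_bytes_needed - 1) / bmap_bytes_limit + 1;
}

/*
 * Byte range [*begp, *endp) of bmap space covered by sub-pass 'pass'
 * (0-based).  The last pass may be short.
 */
static void
bulkfree_pass_range(uint64_t bmap_bytes_needed, uint64_t bmap_bytes_limit,
    uint64_t pass, uint64_t *begp, uint64_t *endp)
{
	uint64_t beg = pass * bmap_bytes_limit;
	uint64_t end = beg + bmap_bytes_limit;

	if (end > bmap_bytes_needed)
		end = bmap_bytes_needed;
	*begp = beg;
	*endp = end;
}
```

  E.g. 10 units of bmap space under a limit of 4 takes three passes, the
  last one covering [8,10).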

* Currently the check code (bref.methods / crc, sha, etc) is being checked
  every single blasted time a chain is locked, even if the underlying buffer
  was previously checked for that chain.  This needs an optimization to
  (significantly) improve performance.
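
  One possible shape for this, as a sketch only: remember per chain that
  the buffer already passed, and drop that state when the buffer is
  replaced.  The struct, the CHAIN_CHECKED flag and toy_check() are
  stand-ins invented for illustration, not hammer2 structures or the
  real crc/sha code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHAIN_CHECKED	0x0001u	/* hypothetical: data already verified */

struct chain_sketch {
	uint32_t	flags;
	const uint8_t	*data;		/* underlying buffer contents */
	size_t		bytes;
	uint32_t	stored_check;	/* check value recorded for the block */
};

static unsigned long check_runs;	/* counts full check computations */

/* Stand-in checksum, only here so the sketch is self-contained. */
static uint32_t
toy_check(const uint8_t *data, size_t bytes)
{
	uint32_t sum = 0;
	size_t i;

	++check_runs;
	for (i = 0; i < bytes; ++i)
		sum = sum * 31 + data[i];
	return sum;
}

/*
 * Called when the chain is locked.  Returns 1 if the data is good.
 * The expensive check only runs if this chain has not already
 * verified the buffer.
 */
static int
chain_lock_check(struct chain_sketch *chain)
{
	assert(chain != NULL);
	if (chain->flags & CHAIN_CHECKED)
		return 1;
	if (toy_check(chain->data, chain->bytes) != chain->stored_check)
		return 0;
	chain->flags |= CHAIN_CHECKED;
	return 1;
}

/* The buffer was replaced or reloaded, so the cached result is stale. */
static void
chain_buffer_replaced(struct chain_sketch *chain, const uint8_t *data,
    size_t bytes)
{
	chain->data = data;
	chain->bytes = bytes;
	chain->flags &= ~CHAIN_CHECKED;
}
```

  Locking the same chain twice then costs one check instead of two.  A
  failed check is deliberately not cached, so it is retried on the next
  lock.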

* flush synchronization boundary crossing check and current flush chain
  interlock needed.

* snapshot creation must allocate and separately pass a new pmp for the pfs
  degenerate 'cluster' representing the snapshot.  This theoretically will
  also allow a snapshot to be generated inside a cluster of more than one
  node.

* snapshot copy currently also copies uuids and can confuse cluster code

* hidden dir or other dirs/files/modifications made to PFS before
  additional cluster entries added.

* transaction on cluster - multiple trans structures, subtrans

* inode always contains target cluster/chain, not hardlink

* chain refs in cluster, cluster refs

* check inode shared lock ... can end up in endless loop if following
  hardlink because ip->chain is not updated in the exclusive lock cycle
  when following hardlink.

cpdup /build/boomdata/jails/bleeding-edge/usr/share/man/man4 /mnt/x3


        * The block freeing code.  At the very least a bulk scan is needed
          to implement freeing blocks.

        * Crash stability.  Right now the allocation table on-media is not
          properly synchronized with the flush.  This needs to be adjusted
          such that H2 can do an incremental scan on mount to fix up
          allocations as part of its crash recovery mechanism.

        * We actually have to start checking and acting upon the CRCs being
          generated.

        * Remaining known hardlink issues need to be addressed.

        * Core 'copies' mechanism needs to be implemented to support multiple
          copies on the same media.

        * Core clustering mechanism needs to be implemented to support
          mirroring and basic multi-master operation from a single host
          (multi-host requires additional network protocols and won't
          be as easy).

* make sure we aren't using a shared lock during RB_SCANs?

* overwrite in write_file case w/compression - if device block size changes
  the block has to be deleted and reallocated.  See hammer2_assign_physical()
  in vnops.

* freemap / clustering.  Set block size on 2MB boundary so the cluster code
  can be used for reading.

* need API layer for shared buffers (unfortunately).

* add magic number to inode header, add parent inode number too, to
  help with brute-force recovery.
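
  Rough idea of how a recovery scan could use those two fields.  The
  layout, the magic value and probe_inode() are all invented for this
  sketch and do not describe the on-media inode format:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INODE_MAGIC_SKETCH	0x48324e4f44453031ULL	/* made-up value */

/* Hypothetical header fields, not the real inode layout. */
struct inode_hdr_sketch {
	uint64_t	magic;		/* identifies the block as an inode */
	uint64_t	inum;		/* this inode's number */
	uint64_t	parent_inum;	/* directory that contained it */
};

/*
 * Brute-force recovery probe: does the raw block at 'blk' look like
 * an inode?  The header is copied out so that, on a match, the caller
 * can re-link the inode under out->parent_inum.
 */
static int
probe_inode(const void *blk, size_t bytes, struct inode_hdr_sketch *out)
{
	assert(blk != NULL && out != NULL);
	if (bytes < sizeof(*out))
		return 0;
	memcpy(out, blk, sizeof(*out));	/* avoids unaligned access */
	return out->magic == INODE_MAGIC_SKETCH;
}
```

  A scan would call this on each candidate block.  A magic match alone
  is only a hint, so a real tool would still want to verify the check
  code before trusting the block.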

* modifications past our flush point do not adjust vchain.
  need to make vchain dynamic so we can (see flush_scan2).??

* MINIOSIZE/RADIX set to 1KB for now to avoid buffer cache deadlocks
  on multiple locked inodes.  Fix so we can use LBUFSIZE!  Or,
  alternatively, allow a smaller I/O size based on the sector size
  (not optimal though).

* When making a snapshot, do not allow the snapshot to be mounted until
  the in-memory chain has been freed in order to break the shared core.

* Snapshotting a sub-directory does not snapshot any
  parent-directory-spanning hardlinks.

* Snapshot / flush-synchronization point.  Remodified data that crosses
  the synchronization boundary is not currently reallocated.  See
  hammer2_chain_modify(), explicit check (requires logical buffer cache
  buffer handling).

* On a fresh mount with multiple hardlinks present, separate lookups will
  result in separate vnodes pointing to separate inodes pointing to a
  common chain (the hardlink target).

  When the hardlink target consolidates upward only one vp/ip will be
  adjusted.  We need code to fix up the other chains (probably put in
  inode_lock_*()) which will be pointing to an older deleted hardlink
  target.

* Filesystem must ensure that modify_tid is not too large relative to
  the iterator in the volume header, on load, or flush sequencing will
  not work properly.  We should be able to just override it, but we
  should complain if it happens.
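
  Sketch of the override-and-complain behavior.  How much slack "too
  large" allows is not stated here, so this assumes none: any
  modify_tid beyond the volume header iterator gets overridden.
  sanitize_modify_tid() and 'voltid' are illustrative names only:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical load-time check.  'voltid' stands for the transaction
 * id iterator read from the volume header.  Returns the tid to use
 * and sets *overridden if the loaded value had to be replaced.
 */
static uint64_t
sanitize_modify_tid(uint64_t modify_tid, uint64_t voltid, int *overridden)
{
	assert(overridden != NULL);
	*overridden = 0;
	if (modify_tid > voltid) {
		/* complain, then override */
		fprintf(stderr,
		    "modify_tid %llu beyond volume iterator %llu, overriding\n",
		    (unsigned long long)modify_tid,
		    (unsigned long long)voltid);
		*overridden = 1;
		return voltid;
	}
	return modify_tid;
}
```

  In the kernel the complaint would go to the console rather than
  stderr.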

* Kernel-side needs to clean up transaction queues and make appropriate
  callbacks.

* Userland side needs to do the same for any initiated transactions.

* Nesting problems in the flusher.

* Inefficient vfsync due to thousands of file buffers, one per-vnode.
  (need to aggregate using a device buffer?)

* Use bp->b_dep to interlock the buffer with the chain structure so the
  strategy code can calculate the crc and assert that the chain is marked
  modified (not yet flushed).

* Deleted inode not reachable via tree for volume flush but still reachable
  via fsync/inactive/reclaim.  Its tree can be destroyed at that point.

* The direct write code needs to invalidate any underlying physical buffers.
  Direct write needs to be implemented.

* Make sure a resized block (hammer2_chain_resize()) calculates a new
  hash code in the parent bref.

* The freemap allocator needs to getblk/clrbuf/bdwrite any partial
  block allocations (less than 64KB) that allocate out of a new 64KB
  block, to avoid causing a read-before-write I/O.

* Check flush race upward recursion setting SUBMODIFIED vs downward
  recursion checking SUBMODIFIED then locking (must clear before the
  recursion and might need additional synchronization)

* There is definitely a flush race in the hardlink implementation between
  the forwarding entries and the actual (hidden) hardlink inode.

  This will require us to associate a small hard-link-adjust structure
  with the chain whenever we create or delete hardlinks, on top of
  adjusting the hardlink inode itself.  Any actual flush to the media
  has to synchronize the correct nlinks value based on whether related
  created or deleted hardlinks were also flushed.
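
  What the adjust structure could look like, as a sketch.  The struct
  and hl_nlinks_for_flush() are hypothetical; the only point is that
  the nlinks written to media counts just those link changes whose
  forwarding entries were flushed too:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record attached to the chain per pending link change. */
struct hl_adjust_sketch {
	int	delta;		/* +1 hardlink created, -1 hardlink deleted */
	int	flushed;	/* nonzero if the related entry reached media */
};

/*
 * nlinks value to write with the hardlink inode: the count already on
 * media plus only those adjustments whose entries were also flushed.
 */
static uint64_t
hl_nlinks_for_flush(uint64_t media_nlinks,
    const struct hl_adjust_sketch *adj, size_t count)
{
	int64_t nlinks = (int64_t)media_nlinks;
	size_t i;

	for (i = 0; i < count; ++i) {
		if (adj[i].flushed)
			nlinks += adj[i].delta;
	}
	assert(nlinks >= 0);
	return (uint64_t)nlinks;
}
```

  With 2 links on media, one flushed create, one unflushed create and
  one flushed delete, the flush writes 2, while the in-memory inode
  would say 3.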

* When a directory entry is created and also if an indirect block is
  created and entries moved into it, the directory seek position can
  potentially become incorrect during a scan.

* When a directory entry is deleted a directory seek position depending
  on that key can cause readdir to skip entries.

* TWO PHASE COMMIT - store two data offsets in the chain, and
  hammer2_chain_delete() needs to leave the chain intact if MODIFIED2 is
  set on its buffer until the flusher gets to it?


				OPTIMIZATIONS

* If a file is unlinked but its descriptor is left open and used, we
  should allow data blocks on-media to be reused since there is no
  topology left to point at them.
