# Files, Trees and Packages

Files, Trees, Packages and Lib are 4 proposed mechanisms for Curv source files
to reference external resources.

Curv needs a package manager. We can define a package as an encapsulated module
composed of a number of files, and then focus on mechanisms for referencing
external packages.

These features support modular programming in Curv, wherein a large system is
partitioned into encapsulated modules. One conventional property of a module is
that its dependencies on external modules are all defined in one place. Within
the body of the module, simple names are used to refer to these dependencies.
This style should be *possible* in Curv, even if it isn't enforced.

## File Syntax

File Syntax is a set of rules for interpreting a regular file as a Curv value
(based on its extension), and for interpreting a directory as a Curv value.

Some file types that might be supported:
* `*.curv` -- a Curv expression, which evaluates to an arbitrary value.
* `*.cdef` -- a list of Curv definitions, which are textually included
  by the parent directory module. Recursive dependencies allowed between
  `*.cdef` files. Can't directly import this file type, it's not an expression.
* `*.json`
* `*.toml`
* `*.rsdf` -- a Regularly Sampled Distance Field -- a voxel grid of distance
  values, in binary.
* *directory* -- if a local filename names a directory, then Tree syntax is
  used to interpret the directory as a Curv value.
* `*.vstor` -- Value Store: A compressed binary file representing an arbitrary
  Curv value. A ZIP file containing a Tree (similar to `*.ODT` or `*.3MF`).
  The primary use case is to represent a Curv shape as a single file,
  where we want to package Curv source code together with some binary files.

The shell command `curv filename` interprets `filename` using File Syntax,
reading and evaluating the file and then displaying the resulting Curv value.
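
The extension-based dispatch described in this section can be sketched roughly as follows. This is a minimal Python illustration, not Curv's implementation: the function name `classify` and the returned labels are hypothetical, and the sketch only decides which rule would apply without evaluating anything.

```python
from pathlib import Path

# Hypothetical labels for the file types listed above; none of these
# names are part of Curv itself.
FILE_SYNTAX = {
    ".curv": "curv expression",
    ".json": "JSON value",
    ".toml": "TOML value",
    ".rsdf": "regularly sampled distance field",
    ".vstor": "value store (ZIP containing a Tree)",
}

def classify(path):
    """Return the File Syntax rule that would apply to `path`."""
    p = Path(path)
    if p.is_dir():
        return "tree"  # directories are interpreted using Tree syntax
    if p.suffix == ".cdef":
        # A list of definitions is not an expression, so it cannot be
        # imported directly.
        raise ValueError(f"{p.name}: *.cdef files can't be imported directly")
    try:
        return FILE_SYNTAX[p.suffix]
    except KeyError:
        raise ValueError(f"{p.name}: unsupported file type") from None
```

Keeping the table separate from the lookup means that adding a file type is a one-line change, which suits a list that is still marked "might be supported".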

MIME types:
* `*.curv` == text/curv
* `*.rsdf` == application/curv.rsdf
* `*.vstor` == application/curv.vstor

## The `file` Function

This will not be part of Curv. Case analysis:
* `file relative_pathname`: Replaced by `file.name`.
  Avoids tricky code that restricts the use of `..` to escape from a
  package boundary.
* `file absolute_pathname`: This is potentially useful in a local workspace.
  But you could also use a symlink, and reference the symlink with `file.name`.
  This feature is a security hole if used in a package or `*.curv` file
  downloaded from the internet.
* `file URL`: How important is this, when we have `package URL`?
  Potentially more susceptible than `package` to being used as a backchannel
  for malware to "phone home".

This also means I won't have 'parameterized file readers'.

## Parameterized File Readers

Suppose that additional parameters must be supplied in order to interpret
the contents of a file. How are these parameters specified?

* The original plan was to provide type-specific file import functions with
  extra parameters beyond the pathname. Eg, `svg_file` or `dxf_file`.
  But, I want to deprecate the `file` function.
* Put the parameters into an optional separate file, with the same basename
  as the file being imported, but with a `.opts` file extension.
  This contains a JSON or Curv record literal.
  This is compatible with using the `file.identifier` syntax
  for referencing file-based components within a package.
* A file reader for something like an SVG or DXF file can return a subtype
  of Shape that provides rich access to the format-specific data.
* An external tool can convert one of these files into an alternate form
  that can be read by Curv without parameterization. For example,
  mesh files are not directly readable in Curv; you must instead convert
  the mesh to an RSDF file, and provide the mesh conversion parameters
  to this external tool.
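
The `.opts` option from the list above can be sketched as follows. This is a minimal Python sketch that assumes the options file holds a JSON record (the Curv record literal variant is not handled); `read_opts` is a hypothetical helper name, not an existing Curv API.

```python
import json
from pathlib import Path

def read_opts(data_file):
    """Return the import parameters for `data_file` as a dict.

    Looks for a sibling file with the same basename and a `.opts`
    extension, e.g. `logo.svg` -> `logo.opts`. The sidecar is optional,
    so a missing file simply means "no parameters".
    """
    opts_path = Path(data_file).with_suffix(".opts")
    if not opts_path.is_file():
        return {}
    opts = json.loads(opts_path.read_text(encoding="utf-8"))
    if not isinstance(opts, dict):
        raise ValueError(f"{opts_path}: expected a record (JSON object)")
    return opts
```

Because the sidecar is found from the basename alone, the importing code never has to pass parameters explicitly, which is what keeps this approach compatible with `file.identifier`.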

## Trees

Curv has a 'directory syntax', which interprets a directory tree as a Curv
value: by default as a nested record value. Directory entries are interpreted
as record members. Regular files named 'identifier.extension' are interpreted
using File Syntax. Subdirectories named 'identifier' (no extension) are
interpreted using Directory Syntax. Entries that don't match these patterns
are ignored.

The root of the directory tree is marked, possibly by an empty file `.curvroot`.

Trees are encapsulated. You must use the Package mechanism to reference
resources outside of the Tree. If `file` is used by a `*.curv` file in a Tree,
you can only use relative pathnames, and you can't use `..` to reference files
outside the Tree.

The purpose of Tree syntax is to provide a local file system representation
of `*.cpkg` files and Packages. That's why Trees are encapsulated.

Within `*.curv` files in the Tree, other members of the tree can be referenced
using lexically scoped identifiers. A directory containing files 'foo', 'bar',
etc, is semantically equivalent to a record '{foo=..., bar=..., ...}'. The
parent scope of the root directory is 'std', the standard namespace. This
reference mechanism doesn't provide any additional expressive power over
`file`; it's just nicer and more convenient.

Trees may be nested. A directory tree with a `.curvroot` may be nested inside
another directory tree.
* This could be used for multi-package repos, or to ship a package with its
  dependencies.
* How does one subtree reference another sibling subtree as a dependency?
  Let's review the existing external reference mechanisms:
  * Lexical scoping. Nope.
  * `file`. Nope.
  * `package` + URL. Nope.
  * `lib`. Nope.

  What to do?
  * Maybe the `.curvroot` file contains definitions of dependencies,
    evaluated in the scope of the parent tree.
    Use lexically scoped variables
    to reference sibling packages, and `package` for Internet-scoped packages.

A Tree can evaluate to a Shape. That's a requirement for `*.cpkg` files.
We will extend the directory syntax with an optional file that contains a Curv
expression that is evaluated to the directory's value. This can occur in any
directory, not just the Tree root. Call it `main.curv`.

To export only 'public' members of a directory, use a `main.curv` file
that contains
```
{
  foo : foo,
  bar : bar,
}
```

A possible extension: `.include.curv` evaluates to a record whose members are
added to the record denoted by the directory.
Can get the same effect using `main.curv`, as shown above.

Many modern languages now have a standard tree/package/project manager
that will create a project tree for you, then perform operations on that
project tree. Often with git integration. Examples:
* Rust, `cargo`
* Clojure, `lein`

## Trees (version 2)

Maybe it's too weird that an identifier `foo` not defined anywhere in a
`*.curv` file is implicitly defined by a sibling file `foo.curv`. So, files
aren't converted into Curv bindings unless they are explicitly declared in
a `*.curv` source file. Extraneous files and directories that aren't
explicitly referenced are ignored.
* The value of a directory `foo` is specified by `foo/main.curv`.
* An explicit file reference or declaration for a file `foo.*` is:
  1. `use foo;`. Can also write `use foo.bar.baz;`.
  2. That makes it cumbersome to include a file into a scope (need two
     definitions). `file.foo` is an expression.
     `use a.b.c` is equivalent to `c=a.b.c`.
     So now we have `use file.foo` or `include file.foo`. (Orthogonality.)

The benefit of an explicit gesture like `file.foo` is that you get an
explicit error message "File not found".

`file.foo` is interpreted at compile time, because it is intended to behave
like an identifier. `file` is not a record, despite the use of dot notation;
it is a mechanism for doing lexically scoped identifier-like lookups
in a Directory Syntax document.
* Mutually recursive references between two `*.curv` scripts are illegal,
  because of implementation restrictions (ref counting, not garbage
  collection). This is enforced at compile time.
* Using a fancier compiler, we could permit mutual recursion between files,
  with an implementation that still requires `file.foo` to be resolved at
  compile time.
* No immediate plan to implement `file."${foo}"`, `defined(file.foo)`,
  or `fields file`.

Under this interface, a record field could be represented by two files with
the same basename and different extensions. Eg, one contains raw data in some
standard non-Curv format, the other contains Curv metadata. (This is an
alternative to "parameterized file readers".) Or, one contains geometry
and the other contains colour.

Can relax the requirement that directories contain a `main.curv` file.
If there is none, construct a record from every suitable directory entry.

Can relax the requirement for a `.curvroot` file. `file.foo` means: search
for `foo.*` in the current directory, then in the parent, recursively until
either a `.curvroot` file is found, or the filesystem root is reached.

## Packages

A Package is a versioned collection of Curv source files that are distributed
over the internet as a unit. Packages explicitly declare their dependencies on
other packages. Inspired by package management in Debian and many other systems.

A Curv program can reference an external package using a URL and a version #.
Eg, `package{repo:"https://github.com/doug-moen/laser-curv",version:"1.0"}`.
Inspired by Rust and crates, it's distributed and decentralized.

The package mechanism is heavyweight. Extend `file` to accept a URL
argument, so that there is a simple way to reference remote resources?
(But I have a security concern: when and how often are these URLs fetched?)

When you evaluate a Curv program containing Package references, the UI
notifies you if you have unsatisfied dependencies, and asks you if they can
be downloaded. There is an 'upgrade' command for updating local copies of
packages. No internet access without an explicit user action is a security
feature of Curv.

Questions:
* How do I nest one package inside another? It's one way to satisfy a package
  dependency.
* Can a package be a shape? Or are they only meant to be libraries?
  How do I distribute a shape that consists of multiple files (eg, a Curv
  file and some 'assets' such as texture files)? A zip file is the
  best approach: you want shapes to be single files, and zip is the standard
  mechanism, eg OpenDocument `*.odf` or 3MF.
* How do I develop, test, run a package on my local file system?

Package metadata:
* **In-value metadata**. If the value of a Tree is a record, then metadata
  can be incorporated into the record value, using a naming convention.
  Use cases? Control how a shape is rendered. BOM metadata in a shape.
  These are shape-specific use cases, and not 'package' metadata.
* **Out-of-value metadata**. The most obvious consumer of 'package' metadata is
  the package manager, which doesn't need in-value metadata. A full description
  of the package, with author, licence, description text, keywords, an image,
  could be used to populate entries in a Curv package website (curvhub.org).
  Use a file `.metadata.json`.

## Local Packages

The Package mechanism uses URLs to name external packages.
What if you are disconnected from the internet and want to maintain a collection
of packages on a file system, old school?

You could use `package "file:/usr/local/curv/foo"`.
Or, use `file "/usr/local/curv/foo"`.
But those pathnames are not portable across systems maintained by different
administrators. That is an important consideration for portability across
heterogeneous systems with no internet access.

Alternatively, `CURVPATH` is an environment variable containing a list of
absolute pathnames of directories. Eg, `CURVPATH=/usr/local/curv`.

`lib.foo` searches for a file with basename `foo` in `CURVPATH`, as specified
by the `file` function. If found, the file is loaded and evaluated and the
resulting value is returned.

## Standard Packages

What makes sense is to have a small standard library (std), then put
the remaining library abstractions into a collection of standard packages.
The standard library forms the outer scope of all source files, while
standard packages must be referenced explicitly. The standard library is
harder to evolve than the standard packages, since a package can be deprecated
as a whole and replaced by a new package with a different name. So it makes
sense to keep the standard library small.

How are standard packages referenced? Should we use `lib` (they are installed
on the local file system, as part of the Curv installation process),
or should we use `package` (they are referenced using URLs)?

In the long term, standard packages should be referenced by URLs, because
if they are part of the default install, then it becomes hard to abandon them
or remove them from the default install (backward compatibility reasons).
(Eg, the Python standard library is notoriously full of abandonware.)
But then, in the long term, we would want a stable URL for these packages.

I want standard packages now. What do I do?
* Put standard packages on GitHub.
* `noise = package "https://github.com/doug-moen/noise.curv"`
* The `package` function will initially use a simple package manager that
  caches packages in `~/.cache/curv/`
* `curvpkg` subcommands: list, install, remove, upgrade

## Mutual Recursion
Mutually recursive references between two Curv files within a package
are not supported: an error is reported. OTOH, what does work is defining a
library in a file (exporting a record value), and including that library in
another file. That is a required feature.

Directory syntax is supposedly modelled as a record literal (w.r.t. scoping).
This suggests that mutual recursion could/should be supported. But that
creates technical difficulties, especially when we need to support including
another file. Curv does not let a record include a variable defined elsewhere
in the same file.

## Synthesis

Implement this combination of features:
* `file.foo` references a file `foo.*` or a directory `foo`,
  relative to the current directory, using "lexical scoping" lookup.
* File syntax: rules for converting files into Curv values, based on file
  extension.
* Directory syntax: rules for converting a directory into a Curv value.
  Optional `main.curv` entry. Optional `.curvroot` entry.
* `package{repo,version}`: Versioned, encapsulated packages, referenced using
  absolute https: or file: URLs, represented as git repositories.

How do you use these features, as a user?
* Create a single hierarchical workspace for all Curv projects: `~/curv`.
  Use `file.name` to reference external files.
* Lollipop tutorial.

      $ mkdir lollipop; cd lollipop
      $ create main.curv
      $ create param.curv
      $ create lib.curv

  Use `include file.param` and `include file.lib`.
* The curv/examples directory will change: it gains a `.curvroot` file,
  and uses `file.lib.experimental` to reference the library.
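
To make the `file.foo` lookup rule concrete, here is a minimal Python sketch of one possible reading: look for a file `foo.*` or a directory `foo` in the current directory, then in each parent, stopping at the directory that contains `.curvroot` (or at the filesystem root). The function name `resolve` and the choice to raise `FileNotFoundError` are assumptions for illustration. These notes leave open what happens when several files share a basename, so the sketch simply returns every match.

```python
from pathlib import Path

def resolve(name, start_dir):
    """Find the entries that `file.<name>` could refer to.

    Searches `start_dir`, then each ancestor, for a directory called
    `name` or a regular file called `name.<ext>`. The search does not
    continue above a directory that contains `.curvroot`.
    """
    directory = Path(start_dir).resolve()
    while True:
        matches = sorted(
            entry for entry in directory.iterdir()
            if (entry.is_dir() and entry.name == name)
            or (entry.is_file() and entry.stem == name and entry.suffix != "")
        )
        if matches:
            return matches
        at_tree_root = (directory / ".curvroot").exists()
        if at_tree_root or directory.parent == directory:
            # The explicit error that an explicit `file.foo` makes possible.
            raise FileNotFoundError(f"file.{name}: File not found")
        directory = directory.parent
```

Stopping at `.curvroot` is what keeps a Tree encapsulated in this reading: a file that lives above the root is never found, even when it has the right basename.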

## Bibliography
https://medium.com/@sdboyer/so-you-want-to-write-a-package-manager-4ae9c17d9527