

CXX &mdash; safe FFI between Rust and C++
=========================================

[<img alt="github" src="https://img.shields.io/badge/github-dtolnay/cxx-8da0cb?style=for-the-badge&labelColor=555555&logo=github" height="20">](https://github.com/dtolnay/cxx)
[<img alt="crates.io" src="https://img.shields.io/crates/v/cxx.svg?style=for-the-badge&color=fc8d62&logo=rust" height="20">](https://crates.io/crates/cxx)
[<img alt="docs.rs" src="https://img.shields.io/badge/docs.rs-cxx-66c2a5?style=for-the-badge&labelColor=555555&logoColor=white&logo=" height="20">](https://docs.rs/cxx)
[<img alt="build status" src="https://img.shields.io/github/workflow/status/dtolnay/cxx/CI/master?style=for-the-badge" height="20">](https://github.com/dtolnay/cxx/actions?query=branch%3Amaster)

This library provides a **safe** mechanism for calling C++ code from Rust and
Rust code from C++, not subject to the many ways that things can go wrong when
using bindgen or cbindgen to generate unsafe C-style bindings.

This doesn't change the fact that 100% of C++ code is unsafe. When auditing a
project, you would be on the hook for auditing all the unsafe Rust code and
*all* the C++ code. The core safety claim under this new model is that auditing
just the C++ side would be sufficient to catch all problems, i.e. the Rust side
can be 100% safe.

```toml
[dependencies]
cxx = "0.5"

[build-dependencies]
cxx-build = "0.5"
```

*Compiler support: requires rustc 1.43+ and C++11 or newer*<br>
*[Release notes](https://github.com/dtolnay/cxx/releases)*

<br>

## Overview

The idea is that we define the signatures of both sides of our FFI boundary
embedded together in one Rust module (the next section shows an example). From
this, CXX receives a complete picture of the boundary to perform static analyses
against the types and function signatures to uphold both Rust's and C++'s
invariants and requirements.

If everything checks out statically, then CXX uses a pair of code generators to
emit the relevant `extern "C"` signatures on both sides together with any
necessary static assertions for later in the build process to verify
correctness. On the Rust side this code generator is simply an attribute
procedural macro. On the C++ side it can be a small Cargo build script if your
build is managed by Cargo, or for other build systems like Bazel or Buck we
provide a command line tool which generates the header and source file and
should be easy to integrate.

The resulting FFI bridge operates at zero or negligible overhead, i.e. no
copying, no serialization, no memory allocation, no runtime checks needed.

The FFI signatures are able to use native types from whichever side they please,
such as Rust's `String` or C++'s `std::string`, Rust's `Box` or C++'s
`std::unique_ptr`, Rust's `Vec` or C++'s `std::vector`, etc. in any combination.
CXX guarantees an ABI-compatible signature that both sides understand, based on
builtin bindings for key standard library types to expose an idiomatic API on
those types to the other language. For example when manipulating a C++ string
from Rust, its `len()` method becomes a call of the `size()` member function
defined by C++; when manipulating a Rust string from C++, its `size()` member
function calls Rust's `len()`.

<br>

## Example

In this example we are writing a Rust application that wishes to take advantage
of an existing C++ client for a large-file blobstore service. The blobstore
supports a `put` operation for a discontiguous buffer upload. For example we
might be uploading snapshots of a circular buffer which would tend to consist of
2 chunks, or fragments of a file spread across memory for some other reason.

A runnable version of this example is provided under the *demo* directory of
this repo. To try it out, run `cargo run` from that directory.

```rust
#[cxx::bridge]
mod ffi {
    // Any shared structs, whose fields will be visible to both languages.
    struct BlobMetadata {
        size: usize,
        tags: Vec<String>,
    }

    extern "Rust" {
        // Zero or more opaque types which both languages can pass around but
        // only Rust can see the fields.
        type MultiBuf;

        // Functions implemented in Rust.
        fn next_chunk(buf: &mut MultiBuf) -> &[u8];
    }

    extern "C++" {
        // One or more headers with the matching C++ declarations. Our code
        // generators don't read it but it gets #include'd and used in static
        // assertions to ensure our picture of the FFI boundary is accurate.
        include!("demo/include/blobstore.h");

        // Zero or more opaque types which both languages can pass around but
        // only C++ can see the fields.
        type BlobstoreClient;

        // Functions implemented in C++.
        fn new_blobstore_client() -> UniquePtr<BlobstoreClient>;
        fn put(&self, parts: &mut MultiBuf) -> u64;
        fn tag(&self, blobid: u64, tag: &str);
        fn metadata(&self, blobid: u64) -> BlobMetadata;
    }
}
```

Now we simply provide Rust definitions of all the things in the `extern "Rust"`
block and C++ definitions of all the things in the `extern "C++"` block, and get
to call back and forth safely.

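As a rough illustration of the Rust half, the `extern "Rust"` items from the
bridge could be backed by something like the following. This is only a sketch:
the bridge fixes the names `MultiBuf` and `next_chunk` and the function
signature, while the `chunks` and `pos` fields and the `MultiBuf::new`
constructor are assumptions made up for this example.

```rust
// Opaque to C++: only Rust can see these fields.
// (Field names are assumptions for this sketch.)
pub struct MultiBuf {
    chunks: Vec<Vec<u8>>,
    pos: usize,
}

impl MultiBuf {
    // Hypothetical constructor, not part of the bridge.
    pub fn new(chunks: Vec<Vec<u8>>) -> Self {
        MultiBuf { chunks, pos: 0 }
    }
}

// Same signature as `fn next_chunk(buf: &mut MultiBuf) -> &[u8];` in the bridge.
// Hands out the chunks in order, then an empty slice once they are exhausted.
pub fn next_chunk(buf: &mut MultiBuf) -> &[u8] {
    let next = buf.chunks.get(buf.pos);
    buf.pos += 1;
    match next {
        Some(chunk) => chunk.as_slice(),
        None => &[],
    }
}
```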
Here are links to the complete set of source files involved in the demo:

- [demo/src/main.rs](demo/src/main.rs)
- [demo/build.rs](demo/build.rs)
- [demo/include/blobstore.h](demo/include/blobstore.h)
- [demo/src/blobstore.cc](demo/src/blobstore.cc)

To look at the code generated in both languages for the example by the CXX code
generators:

```console
   # run Rust code generator and print to stdout
   # (requires https://github.com/dtolnay/cargo-expand)
$ cargo expand --manifest-path demo/Cargo.toml

   # run C++ code generator and print to stdout
$ cargo run --manifest-path gen/cmd/Cargo.toml -- demo/src/main.rs
```

<br>

## Details

As seen in the example, the language of the FFI boundary involves 3 kinds of
items:

- **Shared structs** &mdash; their fields are made visible to both languages.
  The definition written within cxx::bridge is the single source of truth.

- **Opaque types** &mdash; their fields are secret from the other language.
  These cannot be passed across the FFI by value but only behind an indirection,
  such as a reference `&`, a Rust `Box`, or a `UniquePtr`. Can be a type alias
  for an arbitrarily complicated generic language-specific type depending on
  your use case.

- **Functions** &mdash; implemented in either language, callable from the other
  language.

Within the `extern "Rust"` part of the CXX bridge we list the types and
functions for which Rust is the source of truth. These all implicitly refer to
the `super` module, the parent module of the CXX bridge. You can think of the
two items listed in the example above as being like `use super::MultiBuf` and
`use super::next_chunk` except re-exported to C++. The parent module will either
contain the definitions directly for simple things, or contain the relevant
`use` statements to bring them into scope from elsewhere.

Within the `extern "C++"` part, we list types and functions for which C++ is the
source of truth, as well as the header(s) that declare those APIs. In the future
it's possible that this section could be generated bindgen-style from the
headers but for now we need the signatures written out; static assertions will
verify that they are accurate.

Your function implementations themselves, whether in C++ or Rust, *do not* need
to be defined as `extern "C"` ABI or no\_mangle. CXX will put in the right shims
where necessary to make it all work.

<br>

## Comparison vs bindgen and cbindgen

Notice that with CXX there is repetition of all the function signatures: they
are typed out once where the implementation is defined (in C++ or Rust) and
again inside the cxx::bridge module, though compile-time assertions guarantee
these are kept in sync. This is different from [bindgen] and [cbindgen] where
function signatures are typed by a human once and the tool consumes them in one
language and emits them in the other language.

[bindgen]: https://github.com/rust-lang/rust-bindgen
[cbindgen]: https://github.com/eqrion/cbindgen/

This is because CXX fills a somewhat different role. It is a lower level tool
than bindgen or cbindgen in a sense; you can think of it as being a replacement
for the concept of `extern "C"` signatures as we know them, rather than a
replacement for a bindgen. It would be reasonable to build a higher level
bindgen-like tool on top of CXX which consumes a C++ header and/or Rust module
(and/or IDL like Thrift) as source of truth and generates the cxx::bridge,
eliminating the repetition while leveraging the static analysis safety
guarantees of CXX.

But note in other ways CXX is higher level than the bindgens, with rich support
for common standard library types. Frequently with bindgen when we are dealing
with an idiomatic C++ API we would end up manually wrapping that API in C-style
raw pointer functions, applying bindgen to get unsafe raw pointer Rust
functions, and replicating the API again to expose those idiomatically in Rust.
That's a much worse form of repetition because it is unsafe all the way through.

By using a CXX bridge as the shared understanding between the languages, rather
than `extern "C"` C-style signatures as the shared understanding, common FFI use
cases become expressible using 100% safe code.

It would also be reasonable to mix and match, using CXX bridge for the 95% of
your FFI that is straightforward and doing the remaining few oddball signatures
the old-fashioned way with bindgen and cbindgen, if for some reason CXX's static
restrictions get in the way. Please file an issue if you end up taking this
approach so that we know what ways it would be worthwhile to make the tool more
expressive.

<br>

## Cargo-based setup

For builds that are orchestrated by Cargo, you will use a build script that runs
CXX's C++ code generator and compiles the resulting C++ code along with any
other C++ code for your crate.

The canonical build script is as follows. The indicated line returns a
[`cc::Build`] instance (from the usual widely used `cc` crate) on which you can
set up any additional source files and compiler flags as normal.

[`cc::Build`]: https://docs.rs/cc/1.0/cc/struct.Build.html

```toml
# Cargo.toml

[build-dependencies]
cxx-build = "0.5"
```

```rust
// build.rs

fn main() {
    cxx_build::bridge("src/main.rs")  // returns a cc::Build
        .file("src/demo.cc")
        .flag_if_supported("-std=c++11")
        .compile("cxxbridge-demo");

    println!("cargo:rerun-if-changed=src/main.rs");
    println!("cargo:rerun-if-changed=src/demo.cc");
    println!("cargo:rerun-if-changed=include/demo.h");
}
```

<br>

## Non-Cargo setup

For use in non-Cargo builds like Bazel or Buck, CXX provides an alternate way of
invoking the C++ code generator as a standalone command line tool. The tool is
packaged as the `cxxbridge-cmd` crate on crates.io or can be built from the
*gen/cmd* directory of this repo.

```bash
$ cargo install cxxbridge-cmd

$ cxxbridge src/main.rs --header > path/to/mybridge.h
$ cxxbridge src/main.rs > path/to/mybridge.cc
```

<br>

## Safety

Be aware that the design of this library is intentionally restrictive and
opinionated! It isn't a goal to be powerful enough to handle arbitrary
signatures in either language. Instead this project is about carving out a
reasonably expressive set of functionality about which we can make useful safety
guarantees today and maybe extend over time. You may find that it takes some
practice to use CXX bridge effectively as it won't work in all the ways that you
are used to.

Some of the considerations that go into ensuring safety are:

- By design, our paired code generators work together to control both sides of
  the FFI boundary. Ordinarily in Rust writing your own `extern "C"` blocks is
  unsafe because the Rust compiler has no way to know whether the signatures
  you've written actually match the signatures implemented in the other
  language. With CXX we achieve that visibility and know what's on the other
  side.

- Our static analysis detects and prevents passing types by value that shouldn't
  be passed by value from C++ to Rust, for example because they may contain
  internal pointers that would be screwed up by Rust's move behavior.

- To many people's surprise, it is possible to have a struct in Rust and a
  struct in C++ with exactly the same layout / fields / alignment / everything,
  and still not the same ABI when passed by value. This is a longstanding
  bindgen bug that leads to segfaults in absolutely correct-looking code
  ([rust-lang/rust-bindgen#778]). CXX knows about this and can insert the
  necessary zero-cost workaround transparently where needed, so go ahead and
  pass your structs by value without worries. This is made possible by owning
  both sides of the boundary rather than just one.

- Template instantiations: for example in order to expose a UniquePtr\<T\> type
  in Rust backed by a real C++ unique\_ptr, we have a way of using a Rust trait
  to connect the behavior back to the template instantiations performed by the
  other language.

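The first consideration above can be seen in plain Rust, without CXX. The
sketch below hand-writes a binding to the C standard library's `strlen`, chosen
only because it is available wherever `std` links against a C runtime; the
wrapper name `c_string_len` is made up for this example. Nothing checks the
declared signature against the C side, which is why the call has to sit inside
an `unsafe` block.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

// Hand-written declaration: the compiler takes this signature on faith.
// (On the 2024 edition the block itself must be spelled `unsafe extern "C"`.)
extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

// Hypothetical safe wrapper; the author of the wrapper, not the compiler,
// vouches that the declaration above matches the real C function.
pub fn c_string_len(s: &str) -> usize {
    let owned = CString::new(s).expect("input must not contain NUL bytes");
    // SAFETY: `owned` is a valid NUL-terminated string that outlives the call.
    unsafe { strlen(owned.as_ptr()) }
}
```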
[rust-lang/rust-bindgen#778]: https://github.com/rust-lang/rust-bindgen/issues/778

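The second consideration, about types with internal pointers, can also be
sketched in plain Rust. `SelfRef` below is a hypothetical stand-in for a C++
class that keeps a pointer into its own storage; the names `SelfRef`,
`points_to_self` and `pointer_survives_move` are invented for this example. A
Rust move is a plain bitwise copy that runs no user code, so the stored pointer
keeps referring to the old location.

```rust
// Hypothetical stand-in for a C++ class with an internal pointer.
pub struct SelfRef {
    data: [u8; 16],
    ptr_to_data: *const u8,
}

impl SelfRef {
    // True only while the stored pointer still refers to this value's own data.
    pub fn points_to_self(&self) -> bool {
        self.ptr_to_data == self.data.as_ptr()
    }
}

// Returns whether the internal pointer is valid (before, after) a move.
pub fn pointer_survives_move() -> (bool, bool) {
    let mut value = SelfRef {
        data: [0; 16],
        ptr_to_data: std::ptr::null(),
    };
    value.ptr_to_data = value.data.as_ptr();
    let before = value.points_to_self();

    // Moving into a Box copies the bytes to the heap; nothing fixes up the
    // pointer, so it still holds the address of the old stack location.
    let boxed = Box::new(value);
    let after = boxed.points_to_self();

    (before, after)
}
```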
<br>

## Builtin types

In addition to all the primitive types (i32 &lt;=&gt; int32_t), the following
common types may be used in the fields of shared structs and the arguments and
returns of functions.

<table>
<tr><th>name in Rust</th><th>name in C++</th><th>restrictions</th></tr>
<tr><td>String</td><td>rust::String</td><td></td></tr>
<tr><td>&amp;str</td><td>rust::Str</td><td></td></tr>
<tr><td>&amp;[u8]</td><td>rust::Slice&lt;uint8_t&gt;</td><td><sup><i>arbitrary &amp;[T] not implemented yet</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/0.5/cxx/struct.CxxString.html">CxxString</a></td><td>std::string</td><td><sup><i>cannot be passed by value</i></sup></td></tr>
<tr><td>Box&lt;T&gt;</td><td>rust::Box&lt;T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/0.5/cxx/struct.UniquePtr.html">UniquePtr&lt;T&gt;</a></td><td>std::unique_ptr&lt;T&gt;</td><td><sup><i>cannot hold opaque Rust type</i></sup></td></tr>
<tr><td>Vec&lt;T&gt;</td><td>rust::Vec&lt;T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/0.5/cxx/struct.CxxVector.html">CxxVector&lt;T&gt;</a></td><td>std::vector&lt;T&gt;</td><td><sup><i>cannot be passed by value, cannot hold opaque Rust type</i></sup></td></tr>
<tr><td>fn(T, U) -&gt; V</td><td>rust::Fn&lt;V(T, U)&gt;</td><td><sup><i>only passing from Rust to C++ is implemented so far</i></sup></td></tr>
<tr><td>Result&lt;T&gt;</td><td>throw/catch</td><td><sup><i>allowed as return type only</i></sup></td></tr>
</table>

The C++ API of the `rust` namespace is defined by the *include/cxx.h* file in
this repo. You will need to include this header in your C++ code when working
with those types.

The following types are intended to be supported "soon" but are just not
implemented yet. I don't expect any of these to be hard to make work but it's a
matter of designing a nice API for each in its non-native language.

<table>
<tr><th>name in Rust</th><th>name in C++</th></tr>
<tr><td>BTreeMap&lt;K, V&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>HashMap&lt;K, V&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>Arc&lt;T&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>Option&lt;T&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::map&lt;K, V&gt;</td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::unordered_map&lt;K, V&gt;</td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::shared_ptr&lt;T&gt;</td></tr>
</table>

<br>

## Remaining work

This is still early days for CXX; I am releasing it as a minimum viable product
to collect feedback on the direction and invite collaborators. Please check the
open issues.

Especially please report issues if you run into trouble building or linking any
of this stuff. I'm sure there are ways to make the build aspects friendlier or
more robust.

Finally, I know more about Rust library design than C++ library design so I
would appreciate help making the C++ APIs in this project more idiomatic where
anyone has suggestions.

<br>

#### License

<sup>
Licensed under either of <a href="LICENSE-APACHE">Apache License, Version
2.0</a> or <a href="LICENSE-MIT">MIT license</a> at your option.
</sup>

<br>

<sub>
Unless you explicitly state otherwise, any contribution intentionally submitted
for inclusion in this project by you, as defined in the Apache-2.0 license,
shall be dual licensed as above, without any additional terms or conditions.
</sub>
