
README.md

![MNN](doc/banner.png)

[Chinese Version](README_CN.md)

[MNN Homepage](http://www.mnn.zone)
## Intro
MNN is a highly efficient and lightweight deep learning framework. It supports inference and training of deep learning models and has industry-leading performance for on-device inference and training. At present, MNN is integrated in more than 20 Alibaba apps, such as Taobao, Tmall, Youku, DingTalk, and Xianyu, covering more than 70 usage scenarios such as live broadcast, short-video capture, search recommendation, product search by image, interactive marketing, benefit distribution, and security risk control. MNN is also used on embedded and IoT devices.

The design principles and performance data of MNN have been published in an MLSys 2020 paper, available [here](https://proceedings.mlsys.org/static/paper_files/mlsys/2020/7-Paper.pdf). Please cite MNN in your publications if it helps your research:

    @inproceedings{alibaba2020mnn,
      author = {Jiang, Xiaotang and Wang, Huan and Chen, Yiliu and Wu, Ziqi and Wang, Lichuan and Zou, Bin and Yang, Yafeng and Cui, Zongyang and Cai, Yu and Yu, Tianhang and Lv, Chengfei and Wu, Zhihua},
      title = {MNN: A Universal and Efficient Inference Engine},
      booktitle = {MLSys},
      year = {2020}
    }

## Documentation and Tools
MNN's docs are placed in the [Yuque docs here](https://www.yuque.com/mnn/en).

MNN Workbench can be downloaded from [MNN's homepage](http://www.mnn.zone); it provides pretrained models, visualized training tools, and one-click deployment of models to devices.

## Key Features
### High performance
- Implements core computing with a large amount of optimized assembly code to make full use of the ARM CPU.
- On iOS, GPU acceleration (Metal) can be turned on, which is faster than Apple's native CoreML on common models.
- On Android, `OpenCL`, `Vulkan`, and `OpenGL` backends are available and deeply tuned for mainstream GPUs (`Adreno` and `Mali`).
- Convolution and transposed-convolution algorithms are efficient and stable; the Winograd convolution algorithm is widely used to speed up symmetric convolutions from 3x3 up to 7x7.
- Roughly a 2x speedup on the new ARM v8.2 architecture thanks to its FP16 half-precision calculation support.
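The Winograd technique named above can be made concrete with the classic F(2,3) case: two outputs of a 3-tap convolution computed with four multiplications instead of six. The following is a minimal pure-Python illustration of the idea, not MNN's implementation; the function names are ours:

```python
def winograd_f23(d, g):
    """Winograd F(2,3): two outputs of a 3-tap convolution in 4 multiplies."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Filter transform (precomputable once per filter).
    G = (g0 + g1 + g2) / 2.0
    H = (g0 - g1 + g2) / 2.0
    # Four multiplications instead of the naive six.
    m1 = (d0 - d2) * g0
    m2 = (d1 + d2) * G
    m3 = (d2 - d1) * H
    m4 = (d1 - d3) * g2
    return [m1 + m2 + m3, m2 - m3 - m4]

def conv1d_naive(d, g):
    """Direct sliding-window convolution (correlation) for reference."""
    return [sum(d[i + k] * g[k] for k in range(len(g)))
            for i in range(len(d) - len(g) + 1)]

d, g = [1.0, 2.0, 3.0, 4.0], [0.5, 1.0, -1.0]
assert winograd_f23(d, g) == conv1d_naive(d, g)  # both give [-0.5, 0.0]
```

In 2-D, MNN applies the same transform idea to input tiles and filter channels, which is where the large savings for 3x3-to-7x7 convolutions come from.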

### Lightweight
- Optimized for end devices, with no dependencies; easily deployed to mobile devices and a variety of embedded devices.
- iOS: the static library for armv7+arm64 is about 5MB, linking it increases executable size by about 620KB, and the metallib file is about 600KB.
- Android: the core .so is about 400KB, the OpenCL .so about 400KB, and the Vulkan .so about 400KB.

### Versatility
- Supports `Tensorflow`, `Caffe`, and `ONNX` model formats, and common networks such as `CNN`, `RNN`, and `GAN`.
- The MNN model converter supports 149 `Tensorflow` OPs, 58 `TFLite` OPs, 47 `Caffe` OPs, and 74 `ONNX` OPs. OPs supported per MNN hardware backend: 111 for CPU, 6 for ARM V8.2, 55 for Metal, 43 for OpenCL, and 32 for Vulkan.
- Supports iOS 8.0+, Android 4.3+, and embedded devices with a POSIX interface.
- Supports hybrid computing on multiple devices; currently CPU and GPU.

### Ease of use
- Efficient image-processing module that speeds up affine transforms and color-space transforms without libyuv or opencv.
- Provides callbacks throughout the workflow to extract data or control execution precisely.
- Provides options for selecting an inference branch and for running branches in parallel on CPU and GPU.
- (BETA) The MNN Python API helps ML engineers easily build, train, and quantize models with MNN, without dipping their toes in C++ code.
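To make the image-processing bullet above concrete: a color-space transform is just fixed per-pixel arithmetic, which is why a hand-optimized module can replace a general-purpose library for it. The sketch below shows the math of the standard BT.601 RGB-to-gray conversion in plain Python for illustration; it is not MNN's API:

```python
def rgb_to_gray(pixels):
    """BT.601 luma: Y = 0.299*R + 0.587*G + 0.114*B for each (R, G, B) pixel."""
    return [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in pixels]

# A pure red pixel maps to a luma of about 76.245; white stays at 255.
gray = rgb_to_gray([(255, 0, 0), (255, 255, 255)])
```

An affine transform is similar in spirit: a fixed 2x3 matrix applied to every output coordinate, so both operations vectorize well with NEON assembly.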

## Architecture
![architecture](doc/architecture.png)

MNN can be divided into two parts: Converter and Interpreter.

Converter consists of Frontends and Graph Optimize. The former supports different training frameworks; MNN currently supports Tensorflow, Tensorflow Lite, Caffe, and ONNX (PyTorch/MXNet models can be converted to ONNX first). The latter optimizes graphs by operator fusion, operator substitution, and layout adjustment.
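One of those graph optimizations, operator fusion, can be illustrated with the common Conv+BatchNorm fold: at convert time the BN scale and shift are absorbed into the convolution's weights and bias, leaving a single op at runtime. The following is a toy 1-D sketch under our own naming, not the converter's actual code:

```python
import math

def fold_bn_into_conv(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold y = gamma * (conv(x) - mean) / sqrt(var + eps) + beta into a
    single convolution with rescaled weights and a shifted bias."""
    s = gamma / math.sqrt(var + eps)
    return [wi * s for wi in w], b * s + beta - mean * s

def conv1d(x, w, b):
    """Direct 1-D convolution (correlation) with a scalar bias."""
    return [sum(x[i + k] * w[k] for k in range(len(w))) + b
            for i in range(len(x) - len(w) + 1)]

x = [1.0, 2.0, 3.0, 4.0]
w, b = [0.5, -1.0, 0.25], 0.1
gamma, beta, mean, var = 2.0, 0.3, 0.4, 1.0

# Unfused: Conv followed by BatchNorm (two ops at runtime).
y_ref = [gamma * (v - mean) / math.sqrt(var + 1e-5) + beta
         for v in conv1d(x, w, b)]
# Fused: one Conv with folded parameters (what the optimizer would emit).
w_f, b_f = fold_bn_into_conv(w, b, gamma, beta, mean, var)
y_fused = conv1d(x, w_f, b_f)
assert all(abs(u - v) < 1e-9 for u, v in zip(y_ref, y_fused))
```

The fused graph computes the same outputs with one fewer op and no intermediate tensor, which is the general payoff of fusion.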

Interpreter consists of Engine and Backends. The former is responsible for loading the model and scheduling the computation graph; the latter contains the memory allocation and the Op implementations for each computing device. In Engine and Backends, MNN applies a variety of optimizations, including the Winograd algorithm in convolution and deconvolution, the Strassen algorithm in matrix multiplication, low-precision calculation, Neon optimization, hand-written assembly, multi-threading, memory reuse, and heterogeneous computing.
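The Strassen step mentioned above can be sketched in miniature: for a 2x2 block product it trades 8 multiplications for 7 plus extra additions, and the same recursion applies when the entries are large sub-blocks. A toy pure-Python version for scalar 2x2 matrices (illustrative only, not MNN's tuned implementation):

```python
def strassen_2x2(A, B):
    """Multiply 2x2 matrices with 7 multiplications (Strassen) instead of 8.
    For large matrices, a..h would themselves be sub-blocks multiplied recursively."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    p1 = a * (f - h)
    p2 = (a + b) * h
    p3 = (c + d) * e
    p4 = d * (g - e)
    p5 = (a + d) * (e + h)
    p6 = (b - d) * (g + h)
    p7 = (a - c) * (e + f)
    return [[p5 + p4 - p2 + p6, p1 + p2],
            [p3 + p4, p1 + p5 - p3 - p7]]

def matmul_naive(A, B):
    """Reference 8-multiplication product."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A, B = [[1, 2], [3, 4]], [[5, 6], [7, 8]]
assert strassen_2x2(A, B) == matmul_naive(A, B) == [[19, 22], [43, 50]]
```

The saved multiplication per level compounds across recursion levels, which is why the scheme pays off for the large matrix products inside convolution layers.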

## How to Discuss and Get Help From the MNN Community

Scan one of the following QR codes to join the DingTalk discussion groups. Discussions are predominantly in Chinese, but English speakers are welcome and will get help.

Group #1 (full):

<img src="doc/DingTalkQR1.png" height="256"/>

Group #2 (full):

<img src="doc/DingTalkQR2.png" height="256"/>

Group #3:

<img src="doc/DingTalkQR3.png" height="256"/>

## License
Apache 2.0

## Acknowledgement
MNN participants: Taobao Technology Department, Search Engineering Team, DAMO Academy team, Youku, and other Alibaba Group employees.

MNN refers to the following projects:
- [Caffe](https://github.com/BVLC/caffe)
- [flatbuffers](https://github.com/google/flatbuffers)
- [gemmlowp](https://github.com/google/gemmlowp)
- [Google Vulkan demo](http://www.github.com/googlesamples/android-vulkan-tutorials)
- [Halide](https://github.com/halide/Halide)
- [Mace](https://github.com/XiaoMi/mace)
- [ONNX](https://github.com/onnx/onnx)
- [protobuf](https://github.com/protocolbuffers/protobuf)
- [skia](https://github.com/google/skia)
- [Tensorflow](https://github.com/tensorflow/tensorflow)
- [ncnn](https://github.com/Tencent/ncnn)
- [paddle-mobile](https://github.com/PaddlePaddle/paddle-mobile)
- [stb](https://github.com/nothings/stb)
- [rapidjson](https://github.com/Tencent/rapidjson)
- [pybind11](https://github.com/pybind/pybind11)
- [pytorch](https://github.com/pytorch/pytorch)
- [bolt](https://github.com/huawei-noah/bolt)
- [libyuv](https://chromium.googlesource.com/libyuv/libyuv)

README_CN.md

![MNN](doc/banner.png)

[English Version](README.md)

[MNN Homepage](http://www.mnn.zone)

## Intro
MNN is an efficient and lightweight deep learning framework. It supports inference and training of deep models, and its on-device inference and training performance is industry-leading. At present, MNN is used in more than 20 Alibaba apps, including Mobile Taobao, Mobile Tmall, Youku, DingTalk, and Xianyu, covering more than 70 scenarios such as live broadcast, short-video capture, search recommendation, product search by image, interactive marketing, benefit distribution, and security risk control. It also has several applications in IoT scenarios.

The design philosophy and performance data of MNN were published at MLSys 2020; the paper is available [here](https://proceedings.mlsys.org/static/paper_files/mlsys/2020/7-Paper.pdf). If MNN helps your research, please cite the MNN paper:

    @inproceedings{alibaba2020mnn,
      author = {Jiang, Xiaotang and Wang, Huan and Chen, Yiliu and Wu, Ziqi and Wang, Lichuan and Zou, Bin and Yang, Yafeng and Cui, Zongyang and Cai, Yu and Yu, Tianhang and Lv, Chengfei and Wu, Zhihua},
      title = {MNN: A Universal and Efficient Inference Engine},
      booktitle = {MLSys},
      year = {2020}
    }

## Documentation and Tools
MNN's documentation is hosted on Yuque; please see the [Yuque docs](https://www.yuque.com/mnn/cn).

MNN Workbench, the MNN team's latest work, can be downloaded from the [MNN homepage](http://www.mnn.zone); it offers ready-to-use models and visualized training tools, and can deploy models to devices with one click.

## Key Features
### High performance
- Depends on no third-party compute library; core operations are implemented with a large amount of hand-written assembly to make full use of ARM CPU compute power.
- On iOS devices, GPU acceleration (Metal) can be enabled, which is faster than Apple's native CoreML on common models.
- On Android, the `OpenCL`, `Vulkan`, and `OpenGL` backends are provided to cover as many devices as possible, deeply tuned for mainstream GPUs (`Adreno` and `Mali`).
- Convolution and transposed-convolution algorithms are efficient and stable and run well for convolutions of any shape; the Winograd convolution algorithm is widely used, with efficient implementations for symmetric convolutions from 3x3 up to 7x7.
- Extra optimizations target the new ARM v8.2 architecture, where newer devices gain a 2x speedup from FP16 half-precision computation.

### Lightweight
- Deeply customized and trimmed for end-side devices, with no dependencies; can be conveniently deployed to mobile devices and a variety of embedded devices.
- iOS: the armv7+arm64 static library is about 5MB, the linked executable grows by about 620KB, and the metallib file is about 600KB.
- Android: the core .so is about 400KB, the OpenCL library about 400KB, and the Vulkan library about 400KB.

### Versatility
- Supports mainstream model formats such as `Tensorflow`, `Caffe`, and `ONNX`, and common networks such as `CNN`, `RNN`, and `GAN`.
- The converter supports 149 `Tensorflow` OPs, 58 `TFLite` OPs, 47 `Caffe` OPs, and 74 `ONNX` OPs; MNN OPs supported per compute backend: 111 for CPU, 6 for ARM V8.2, 55 for Metal, 43 for OpenCL, and 32 for Vulkan.
- Supports iOS 8.0+, Android 4.3+, and embedded devices with a POSIX interface.
- Supports hybrid computing across heterogeneous devices; currently CPU and GPU.

### Ease of use
- An efficient image-processing module covers common warping and format-conversion needs; in most cases, images can be handled without pulling in libyuv or opencv.
- A callback mechanism allows callbacks to be inserted while the network runs, to extract data or steer execution.
- Supports running only part of a network, or running parts in parallel on CPU and GPU.
- (BETA) The MNN Python API lets algorithm engineers easily build graphs, train, and do quantization-aware training with MNN, without writing C++.

## Architecture
![architecture](doc/architecture.png)

MNN can be divided into two parts: Converter and Interpreter.

Converter consists of Frontends and Graph Optimize. The former supports different training frameworks; MNN currently supports Tensorflow (Lite), Caffe, and ONNX (PyTorch/MXNet models can be converted to ONNX first, then to MNN). The latter optimizes the graph via operator fusion, operator substitution, and layout adjustment.

Interpreter consists of Engine and Backends. The former handles model loading and computation-graph scheduling; the latter contains the memory allocation and Op implementations for each compute device. In Engine and Backends, MNN applies a variety of optimizations, including the Winograd algorithm in convolution and deconvolution, the Strassen algorithm in matrix multiplication, low-precision computation, Neon optimization, hand-written assembly, multi-threading, memory reuse, and heterogeneous computing.

## Community and Feedback
Scan a QR code below to join the DingTalk discussion groups.

Group #1 (full):

<img src="doc/DingTalkQR1.png" height="256"/>

Group #2 (full):

<img src="doc/DingTalkQR2.png" height="256"/>

Group #3:

<img src="doc/DingTalkQR3.png" height="256"/>

## License
Apache 2.0

## Acknowledgement
MNN participants: Taobao Technology Department, Search Engineering Team, DAMO Academy team, Youku, and other Alibaba Group employees.

MNN refers to and draws on the following projects:
- [Caffe](https://github.com/BVLC/caffe)
- [flatbuffers](https://github.com/google/flatbuffers)
- [gemmlowp](https://github.com/google/gemmlowp)
- [Google Vulkan demo](http://www.github.com/googlesamples/android-vulkan-tutorials)
- [Halide](https://github.com/halide/Halide)
- [Mace](https://github.com/XiaoMi/mace)
- [ONNX](https://github.com/onnx/onnx)
- [protobuf](https://github.com/protocolbuffers/protobuf)
- [skia](https://github.com/google/skia)
- [Tensorflow](https://github.com/tensorflow/tensorflow)
- [ncnn](https://github.com/Tencent/ncnn)
- [paddle-mobile](https://github.com/PaddlePaddle/paddle-mobile)
- [stb](https://github.com/nothings/stb)
- [rapidjson](https://github.com/Tencent/rapidjson)
- [pybind11](https://github.com/pybind/pybind11)
- [pytorch](https://github.com/pytorch/pytorch)
- [bolt](https://github.com/huawei-noah/bolt)
- [libyuv](https://chromium.googlesource.com/libyuv/libyuv)