<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements. See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership. The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License. You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied. See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# TOPI: TVM Operator Inventory

TOPI is the operator collection library for TVM, aimed at sharing the effort of crafting
and optimizing TVM-generated kernels. The goals are to:

- Provide syntactic sugar for operator declaration.
- Give common primitives for fused op creation.
- Provide commonly used schedules for each architecture.

## Organization
- [include](include) C++ library, header only
- [python](python) Python library
- [recipe](recipe) Recipe collection containing useful operator examples

## Guidelines
- Use numpy-style naming conventions for known ops.
- Separate operator declaration from schedule when possible.
  - This can be inconvenient but enables more general scheduling across ops.
  - We can always recover the input tensors from the outputs by traversing the compute tree.
  - A minimal sketch of this separation is included at the end of this document.
- Deliberately assert requirements.
  - Some kernels have requirements on shape and data layout; assert them.
- Be data-layout aware: if the layout is not specified in an argument or in the function name, assume NCHW by default.

## Testcases
- Add testcases that exercise the schedule and dataflow in the TOPI workflow; a sketch of such a test appears at the end of this document.
- Only do correctness testing, without attaching compiler flags, and run each test once.

## Performance Tuning Workflow
Since TVM is a work in progress, some optimizations might not be perfect.
One quick way I find useful is codegen plus manual modification.
The workflow is:

- Generate the GPU kernels and write them into a file, say ```perf/matexp_generated.cu```.
- Copy the generated file into another one, say ```perf/matexp_manual.cu```,
  and modify it according to your intuition.
- Set the use_manual flag in the script to continue the codegen workflow as normal, but piggyback the manually written code instead (a sketch of this hook appears at the end of this document).
- Observe the performance difference.
- If the performance improves, keep the manual code and think of an optimization pass
  that can generate the desired target code.
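
## Example Sketches

The sketch below illustrates the declaration/schedule separation from the Guidelines. It assumes the classic ```tvm.compute```/```tvm.create_schedule``` API; ```relu_declare``` and ```schedule_relu_cuda``` are illustrative names, not existing TOPI functions.

```python
import tvm

def relu_declare(x):
    """Declaration only: describe what to compute, with no scheduling decisions."""
    return tvm.compute(
        x.shape, lambda *i: tvm.max(x(*i), tvm.const(0, x.dtype)), name="relu")

def schedule_relu_cuda(out):
    """Schedule only: decide how the declared computation runs on a GPU."""
    s = tvm.create_schedule(out.op)
    bx, tx = s[out].split(out.op.axis[0], factor=64)
    s[out].bind(bx, tvm.thread_axis("blockIdx.x"))
    s[out].bind(tx, tvm.thread_axis("threadIdx.x"))
    return s

A = tvm.placeholder((1024,), name="A")
B = relu_declare(A)            # the declaration is reusable across backends
s = schedule_relu_cuda(B)      # all CUDA-specific decisions live here
# f = tvm.build(s, [A, B], target="cuda")  # requires a CUDA-enabled build of tvm
```

Because the schedule only needs the output tensor, a backend can recover everything it has to schedule by traversing the compute tree from that output, which is what makes the separation pay off across ops.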
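
A testcase in the style described above might look like the following sketch: default schedule, no extra compiler flags, one run checked against a numpy reference. The op is declared inline to keep the sketch self-contained; the names are illustrative.

```python
import numpy as np
import tvm

def test_relu():
    A = tvm.placeholder((1024,), name="A")
    B = tvm.compute(A.shape, lambda i: tvm.max(A[i], tvm.const(0, A.dtype)),
                    name="relu")
    s = tvm.create_schedule(B.op)      # default schedule: correctness only
    f = tvm.build(s, [A, B], target="llvm")

    ctx = tvm.cpu(0)
    x = np.random.uniform(-1, 1, size=1024).astype(A.dtype)
    a = tvm.nd.array(x, ctx)
    b = tvm.nd.array(np.zeros(1024, dtype=A.dtype), ctx)
    f(a, b)                            # run once; no performance flags attached
    np.testing.assert_allclose(b.asnumpy(), np.maximum(x, 0), rtol=1e-5)

test_relu()
```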
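
For the performance tuning workflow, one way to wire up the use_manual swap is through the ```tvm_callback_cuda_postproc``` hook, which the CUDA codegen invokes on the generated source when it is registered; this mirrors the pattern used in the recipes. ```USE_MANUAL``` and the ```perf/matexp_*.cu``` paths follow the example above.

```python
import os
import tvm

USE_MANUAL = False  # flip to True after hand-editing perf/matexp_manual.cu

@tvm.register_func
def tvm_callback_cuda_postproc(code):
    """Dump the generated CUDA kernel, optionally piggybacking the manual one."""
    if not os.path.exists("perf"):
        os.mkdir("perf")
    with open("perf/matexp_generated.cu", "w") as f:
        f.write(code)            # keep the generated kernel for comparison
    if USE_MANUAL:
        with open("perf/matexp_manual.cu") as f:
            code = f.read()      # compile the hand-edited kernel instead
    return code
```

With this callback registered, the build proceeds as normal; only the CUDA source handed to the device compiler changes, so timing the resulting function measures the manual kernel directly.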