<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# TOPI Recipe: TVM Operator Optimization Recipes

TOPI is the operator collection library for TVM, intended to share the effort of crafting
and optimizing TVM-generated kernels. The goals (illustrated by the sketch after this list):

- Provide syntactic sugar for operator declaration.
- Give common primitives for fused op creation.
- Provide commonly used schedules for each architecture.

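A minimal sketch of these goals, assuming TVM's Python API (the tensors and the CUDA target here are illustrative): the ```topi``` wrappers keep declarations terse, elementwise ops compose into one fused chain, and a single generic schedule covers the result.

```python
import tvm
from tvm import te, topi

n = te.var("n")
A = te.placeholder((n,), name="A")
B = te.placeholder((n,), name="B")

# declaration sugar: no explicit te.compute lambda needed
C = topi.add(A, B)
D = topi.nn.relu(C)  # composes with the add above into one fused chain

# one generic schedule covers the whole fused elementwise chain
with tvm.target.Target("cuda"):
    s = topi.cuda.schedule_injective(D)
f = tvm.build(s, [A, B, D], "cuda")  # needs a CUDA-enabled TVM build
```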
26
## Guidelines
- Use numpy-style naming conventions for known ops.
- Separate operator declaration from schedule when possible (see the sketch after this list).
  - This can be inconvenient but enables more general scheduling across ops.
  - We can always recover the input tensors from the outputs by traversing the compute tree.
- Deliberately assert the requirements.
  - Some kernels have requirements on shape and data layout; assert them.
- Be data-layout aware: if the layout is not specified in an argument or by the function, assume NCHW by default.

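For the declaration/schedule separation above, a minimal sketch using TVM's ```te``` API (the operator and function names are made up for illustration): the declaration states *what* to compute, and the schedule decides *how* to run it.

```python
import tvm
from tvm import te

def scale_decl(n):
    """Declaration only: describes what to compute."""
    A = te.placeholder((n,), name="A")
    B = te.compute((n,), lambda i: A[i] * 2.0, name="B")
    return A, B

def scale_schedule_cuda(B):
    """Schedule only: decides how to run it on a GPU."""
    s = te.create_schedule(B.op)
    bx, tx = s[B].split(B.op.axis[0], factor=64)
    s[B].bind(bx, te.thread_axis("blockIdx.x"))
    s[B].bind(tx, te.thread_axis("threadIdx.x"))
    return s

# The schedule only needs the output op; the input tensor A is
# recovered by traversing the compute tree, as noted above.
A, B = scale_decl(te.var("n"))
s = scale_schedule_cuda(B)
f = tvm.build(s, [A, B], "cuda")
```
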
## Performance Tuning Workflow
Since TVM is a work in progress, some optimizations might not be perfect.
One quick approach I find useful is codegen plus manual modification.
The workflow is:

- Generate the GPU kernels and write them into a file, say ```perf/matexp_generated.cu```.
- Copy the generated file into another one, say ```perf/matexp_manual.cu```, and
  modify it according to your intuition.
- Set the ```use_manual``` flag in the script to continue the codegen workflow as normal, but piggyback the manually written code instead.
- Observe the performance difference.
- If the performance improves, mark the manual code and think of an optimization pass
  that can generate the desired target code.

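A sketch of how such a flag can hook into codegen (this mirrors the pattern the recipes in this directory use; TVM's CUDA backend passes each generated kernel source through the registered ```tvm_callback_cuda_postproc``` function before compiling it, and the ```USE_MANUAL_CODE``` name here is illustrative):

```python
import os
import tvm

TASK = "matexp"
USE_MANUAL_CODE = False  # the use_manual flag from the workflow above

@tvm.register_func("tvm_callback_cuda_postproc")
def tvm_callback_cuda_postproc(code):
    """Called by TVM's CUDA codegen on every generated kernel source."""
    if not os.path.exists("perf"):
        os.mkdir("perf")
    # dump the generated kernel so it can be copied and hand-edited
    with open("perf/%s_generated.cu" % TASK, "w") as f:
        f.write(code)
    if USE_MANUAL_CODE:
        # piggyback the hand-modified kernel through the normal workflow
        with open("perf/%s_manual.cu" % TASK) as f:
            code = f.read()
    return code
```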