namespace tf {

/** @page release-3-2-0 Release 3.2.0 (2021/07/29)

%Taskflow 3.2.0 is the 3rd release in the 3.x line!
This release brings several changes, including CPU-GPU tasking, a collection of algorithms,
an enhanced web-based profiler, documentation, and unit tests.

@tableofcontents

@section release-3-2-0_download Download

%Taskflow 3.2.0 can be downloaded from <a href="https://github.com/taskflow/taskflow/releases/tag/v3.2.0">here</a>.

@section release-3-2-0_system_requirements System Requirements

To use %Taskflow v3.2.0, you need a compiler that supports C++17:

@li GNU C++ Compiler at least v8.4 with -std=c++17
@li Clang C++ Compiler at least v6.0 with -std=c++17
@li Microsoft Visual Studio at least v19.27 with /std:c++17
@li AppleClang Xcode Version at least v12.0 with -std=c++17
@li Nvidia CUDA Toolkit and Compiler (nvcc) at least v11.1 with -std=c++17
@li Intel C++ Compiler at least v19.0.1 with -std=c++17
@li Intel DPC++ Clang Compiler at least v13.0.0 with -std=c++17 and SYCL 2020

%Taskflow works on Linux, Windows, and Mac OS X.

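As a quick way to confirm that a build actually enables C++17, a translation unit can check the language level at compile time. The following is a minimal sketch and not part of %Taskflow: the macro `DEMO_CPLUSPLUS` and the function `has_cpp17` are hypothetical names chosen for illustration. It relies on the fact that Microsoft's compiler reports the selected standard through `_MSVC_LANG`, while other compilers report it through `__cplusplus`.

```cpp
// Hypothetical helper macro (not a Taskflow macro): pick whichever
// predefined macro reports the language standard on this compiler.
#if defined(_MSVC_LANG)
#  define DEMO_CPLUSPLUS _MSVC_LANG
#else
#  define DEMO_CPLUSPLUS __cplusplus
#endif

// Fail the build early when C++17 is not enabled.
static_assert(DEMO_CPLUSPLUS >= 201703L,
              "C++17 is required: pass -std=c++17 or /std:c++17");

// Hypothetical helper: true when compiled as C++17 or newer.
constexpr bool has_cpp17() {
  return DEMO_CPLUSPLUS >= 201703L;
}
```

If the flag is missing, the `static_assert` stops the build with a readable message instead of a cascade of template errors.
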
@section release-3-2-0_working_items Working Items

@li enhancing support for SYCL with Intel DPC++
@li enhancing parallel CPU and GPU algorithms
@li designing pipeline interface and its scheduling algorithms

@section release-3-2-0_new_features New Features

@subsection release-3-2-0_taskflow_core Taskflow Core

@li added tf::SmallVector to optimize the dependency storage in a graph
@li added move constructor and move assignment operator for tf::Taskflow
  + tf::Taskflow::Taskflow(Taskflow&&)
  + tf::Taskflow::operator=(Taskflow&&)
@li added move-based run methods in tf::Executor that let the executor manage a taskflow's lifetime
  + tf::Executor::run(Taskflow&&)
  + tf::Executor::run(Taskflow&&, C&&)
  + tf::Executor::run_n(Taskflow&&, size_t)
  + tf::Executor::run_n(Taskflow&&, size_t, C&&)
  + tf::Executor::run_until(Taskflow&&, P&&)
  + tf::Executor::run_until(Taskflow&&, P&&, C&&)

@subsection release-3-2-0_cudaflow cudaFlow

@li improved the execution flow of tf::cudaFlowCapturer when updates involve

New algorithms in tf::cudaFlow and tf::cudaFlowCapturer:

@li added tf::cudaFlow::reduce
@li added tf::cudaFlow::transform_reduce
@li added tf::cudaFlow::uninitialized_reduce
@li added tf::cudaFlow::transform_uninitialized_reduce
@li added tf::cudaFlow::inclusive_scan
@li added tf::cudaFlow::exclusive_scan
@li added tf::cudaFlow::transform_inclusive_scan
@li added tf::cudaFlow::transform_exclusive_scan
@li added tf::cudaFlow::merge
@li added tf::cudaFlow::merge_by_key
@li added tf::cudaFlow::sort
@li added tf::cudaFlow::sort_by_key
@li added tf::cudaFlow::find_if
@li added tf::cudaFlow::min_element
@li added tf::cudaFlow::max_element
@li added tf::cudaFlowCapturer::reduce
@li added tf::cudaFlowCapturer::transform_reduce
@li added tf::cudaFlowCapturer::uninitialized_reduce
@li added tf::cudaFlowCapturer::transform_uninitialized_reduce
@li added tf::cudaFlowCapturer::inclusive_scan
@li added tf::cudaFlowCapturer::exclusive_scan
@li added tf::cudaFlowCapturer::transform_inclusive_scan
@li added tf::cudaFlowCapturer::transform_exclusive_scan
@li added tf::cudaFlowCapturer::merge
@li added tf::cudaFlowCapturer::merge_by_key
@li added tf::cudaFlowCapturer::sort
@li added tf::cudaFlowCapturer::sort_by_key
@li added tf::cudaFlowCapturer::find_if
@li added tf::cudaFlowCapturer::min_element
@li added tf::cudaFlowCapturer::max_element
@li added tf::cudaLinearCapturing

@subsection release-3-2-0_syclflow syclFlow

@subsection release-3-2-0_cuda_std_algorithms CUDA Standard Parallel Algorithms

@li added tf::cuda_for_each
@li added tf::cuda_for_each_index
@li added tf::cuda_transform
@li added tf::cuda_reduce
@li added tf::cuda_uninitialized_reduce
@li added tf::cuda_transform_reduce
@li added tf::cuda_transform_uninitialized_reduce
@li added tf::cuda_inclusive_scan
@li added tf::cuda_exclusive_scan
@li added tf::cuda_transform_inclusive_scan
@li added tf::cuda_transform_exclusive_scan
@li added tf::cuda_merge
@li added tf::cuda_merge_by_key
@li added tf::cuda_sort
@li added tf::cuda_sort_by_key
@li added tf::cuda_find_if
@li added tf::cuda_min_element
@li added tf::cuda_max_element

@subsection release-3-2-0_utilities Utilities

@li added CUDA meta programming
@li added SYCL meta programming

@subsection release-3-2-0_profiler Taskflow Profiler (TFProf)

@section release-3-2-0_bug_fixes Bug Fixes

@li fixed compilation errors in constructing tf::cudaRoundRobinCapturing
@li fixed compilation errors of TLS worker pointer in tf::Executor
@li fixed compilation errors of nvcc v11.3 in auto template deduction
  + std::scoped_lock
  + tf::Serializer and tf::Deserializer
@li fixed memory leak when moving a tf::Taskflow

@section release-3-2-0_breaking_changes Breaking Changes

There are no breaking changes in this release.

@section release-3-2-0_deprecated_items Deprecated and Removed Items

@li removed tf::cudaFlow::kernel_on method
@li removed explicit partitions in parallel iterations and reductions
@li removed tf::cudaFlowCapturerBase
@li removed tf::cublasFlowCapturer
@li renamed update and rebind methods in tf::cudaFlow and tf::cudaFlowCapturer to overloads

@section release-3-2-0_documentation Documentation

@li revised @ref StaticTasking
  + @ref MoveATaskflow
@li revised @ref ExecuteTaskflow
  + @ref ExecuteATaskflowWithTransferredOwnership
@li revised @ref cudaFlowReduce
@li added @ref cudaFlowAlgorithms
  + @ref cudaFlowReduce
  + @ref cudaFlowScan
  + @ref cudaFlowMerge
  + @ref cudaFlowSort
@li added @ref cudaStandardAlgorithms
  + @ref CUDASTDExecutionPolicy
  + @ref CUDASTDReduce
  + @ref CUDASTDScan
  + @ref CUDASTDMerge
  + @ref CUDASTDSort
  + @ref CUDASTDFind

@section release-3-2-0_miscellaneous_items Miscellaneous Items

We have published tf::cudaFlow at the following conference:
  + Dian-Lun Lin and Tsung-Wei Huang, &quot;Efficient GPU Computation using %Task Graph Parallelism,&quot; <i>European Conference on Parallel and Distributed Computing (Euro-Par)</i>, 2021

*/

}
