@@ -1,7 +1,7 @@
 ---
-title: "Compiler Research Research Areas"
+title: "Automatic Differentiation"
 layout: gridlay
-excerpt: "Automatic Differentiation (AD) is a general and powerful technique
+excerpt: "Automatic Differentiation is a general and powerful technique
 of computing partial derivatives (or the complete gradient) of a function inputted as a
 computer program."
 sitemap: true
@@ -40,27 +40,28 @@ differentiation:
 scale well with the number of inputs in the function.
 
 - **Symbolic Differentiation**: This approach uses symbolic manipulation to
-  compute derivatives analytically. It provides accurate results but can lead to
-  lengthy expressions for large computations. It requires the computer program
-  to be representable in a closed-form mathematical expression, and thus doesn't
-  work well with control flow scenarios (if conditions and loops) in the
-  program.
+  compute derivatives analytically. It provides accurate results but can lead
+  to lengthy expressions for large computations. It requires the computer
+  program to be representable in a closed-form mathematical expression, and
+  thus doesn't work well with control flow scenarios (if conditions and loops)
+  in the program.
 
 - **Automatic Differentiation (AD)**: Automatic Differentiation is a general
-  and efficient technique that works by repeated application of the chain rule
-  over the computation graph of the program. Given its composable nature, it
-  can easily scale for computing gradients over a very large number of inputs.
+  and efficient technique that works by repeated application of the chain
+  rule over the computation graph of the program. Given its composable nature,
+  it can easily scale for computing gradients over a very large number of
+  inputs.
 
 ### Forward and Reverse mode AD
 Automatic Differentiation works by applying the chain rule and merging the
 derivatives at each node of the computation graph. The direction of this graph
-traversal and derivative accumulation results in two modes of operation:
+traversal and derivative accumulation results in two approaches:
 
-- Forward Mode: starts at an input to the graph and moves towards all the
-  output nodes. For every node, it adds up all the paths feeding in. By adding
-  them up, we get the total way in which the node is affected by the input.
-  Hence, it calculates derivatives of output(s) with respect to a single input
-  variable.
+- Forward Mode (Tangent Mode): starts the accumulation from the input
+  parameters towards the output parameters in the graph. This means that we
+  apply the chain rule to the inner functions first. That approach calculates
+  derivatives of output(s) with respect to a single input variable.
 
 ![Forward Mode](/images/ForwardAccumulationAutomaticDifferentiation.png)
 
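As a minimal illustration of the forward-mode accumulation described in the hunk above, here is a toy dual-number sketch in Python. It is only a sketch: the names (`Dual`, `f`) are illustrative and this is not how Clad implements AD for C++.

```python
# Toy forward-mode AD via dual numbers. Each value carries (val, dot):
# the primal value and its derivative w.r.t. one seeded input, so the
# chain rule is applied to inner operations first, inputs -> outputs.

class Dual:
    def __init__(self, val, dot=0.0):
        self.val = val   # primal value
        self.dot = dot   # derivative w.r.t. the seeded input

    def _coerce(self, other):
        return other if isinstance(other, Dual) else Dual(float(other))

    def __add__(self, other):
        other = self._coerce(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = self._coerce(other)
        # product rule: d(uv) = du*v + u*dv
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)
    __rmul__ = __mul__


def f(x, y):
    # ordinary code, including control flow, differentiates unchanged
    z = x * x + 3 * y
    return z * y if z.val > 0 else y


# Seed dx = 1, dy = 0: one forward sweep gives df/dx at (2, 1).
out = f(Dual(2.0, 1.0), Dual(1.0, 0.0))
print(out.val, out.dot)   # here f = x**2*y + 3*y**2, so df/dx = 2*x*y
```

Note the limitation the page states: each forward sweep is seeded with respect to a single input, so computing a full gradient this way costs one sweep per input variable.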
@@ -77,16 +78,17 @@ traversal and derivative accumulation results in two modes of operation:
 Automated Differentiation implementations are based on [two major techniques]:
 Operator Overloading and Source Code Transformation. Compiler Research Group's
 focus has been on exploring the [Source Code Transformation] technique, which
-involves constructing the computation graph and producing a derivative at 
+involves constructing the computation graph and producing a derivative at
 compile time.
 
 [The source code transformation approach] enables optimization by retaining
 all the complex knowledge of the original source code. The compute graph is
 constructed during compilation and then transformed to generate the derivative
-code. It typically uses a custom parser to build code representation and
-produce the transformed code. It is difficult to implement (especially in
-C++), but it is very efficient, since many computations and optimizations are
-done ahead of time.
+code. The drawback of that approach in many implementations is that it
+typically uses a custom parser to build code representation and produce the
+transformed code. It is difficult to implement (especially in C++), but it is
+very efficient, since many computations and optimizations can be done ahead of
+time.
 
 
 ### Advantages of using Automatic Differentiation
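The source-code-transformation idea in the hunk above can be sketched in miniature: walk the parsed representation of a function and emit *new derivative code* ahead of execution. This Python/`ast` toy (names `d`, `differentiate` are invented for the example) is only an analogy; Clad performs the equivalent transformation on Clang's AST for C++.

```python
import ast

def d(node, wrt):
    """Build an AST computing the derivative of `node` w.r.t. name `wrt`."""
    if isinstance(node, ast.Name):
        return ast.Constant(1.0 if node.id == wrt else 0.0)
    if isinstance(node, ast.Constant):
        return ast.Constant(0.0)
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        # sum rule
        return ast.BinOp(d(node.left, wrt), ast.Add(), d(node.right, wrt))
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Mult):
        # product rule, emitted as new source-level code
        return ast.BinOp(
            ast.BinOp(d(node.left, wrt), ast.Mult(), node.right), ast.Add(),
            ast.BinOp(node.left, ast.Mult(), d(node.right, wrt)))
    raise NotImplementedError(ast.dump(node))

def differentiate(src, wrt):
    ret = ast.parse(src).body[0].body[-1].value   # the return expression
    return ast.fix_missing_locations(ast.Expression(d(ret, wrt)))

deriv = differentiate("def f(x, y):\n    return x * x + 3.0 * y", "x")
print(ast.unparse(deriv))    # the generated derivative expression, as source
print(eval(compile(deriv, "<deriv>", "eval"), {"x": 2.0, "y": 5.0}))
```

Because the derivative exists as code before any evaluation happens, it can be optimized (here, by Python's compiler; in Clad's case, by Clang/LLVM) together with the rest of the program.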
@@ -98,7 +100,7 @@ done ahead of time.
 - It can take derivatives of algorithms involving conditionals, loops, and
   recursion.
 
-- It can be easily scaled for functions with very large number of inputs.
+- It can be easily scaled for functions with a very large number of inputs.
 
 ### Automatic Differentiation Implementation with Clad - a Clang Plugin
 
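The scaling advantage listed above is easiest to see in reverse mode, the complement of the forward mode shown earlier (the page's full reverse-mode discussion falls outside this excerpt). One recorded forward evaluation plus one backward sweep yields the derivative with respect to *every* input at once. A tiny tape sketch, with `Var` and its layout invented for illustration rather than taken from any library:

```python
# Toy reverse-mode AD: each Var records its parents and the local
# derivative of the operation that produced it; backward() then applies
# the chain rule from the output back towards all inputs in one sweep.

class Var:
    def __init__(self, val):
        self.val = val
        self.grad = 0.0
        self._parents = []     # (parent, local derivative) pairs

    def __add__(self, other):
        out = Var(self.val + other.val)
        out._parents = [(self, 1.0), (other, 1.0)]
        return out

    def __mul__(self, other):
        out = Var(self.val * other.val)
        out._parents = [(self, other.val), (other, self.val)]
        return out

    def backward(self, seed=1.0):
        # accumulate d(output)/d(self), then push contributions upstream
        self.grad += seed
        for parent, local in self._parents:
            parent.backward(seed * local)

# f(x_1, ..., x_100) = sum of x_i**2: all 100 derivatives in ONE sweep
xs = [Var(float(i)) for i in range(1, 101)]
out = xs[0] * xs[0]
for x in xs[1:]:
    out = out + x * x
out.backward()
print(xs[0].grad, xs[9].grad)   # each gradient entry is 2*x_i
```

With forward mode this gradient would need 100 separately seeded sweeps; the one-sweep behavior here is what makes reverse-mode AD practical for functions with very many inputs.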