- builtins.object
    - ComputationalGraphPrimer
    - Exp
class ComputationalGraphPrimer(builtins.object)
    ComputationalGraphPrimer(*args, **kwargs)
Methods defined here:
- __init__(self, *args, **kwargs)
- Initialize self. See help(type(self)) for accurate signature.
- backprop_and_update_params_multi_neuron_model(self, predictions, y_errors)
- First note that the loop index variable 'back_layer_index' starts with the index of
the last layer. For the 3-layer example shown for 'forward', back_layer_index
starts with a value of 2, its next value is 1, and that's it.
In the code below, the outermost loop is over the data samples in a batch. As shown
on Slide 73 of my Week 3 lecture, in order to calculate the partials of Loss with
respect to the learnable params, we need to backprop the prediction errors and
the gradients of the Sigmoid. For the purpose of satisfying the requirements of
SGD, the backprop of the prediction errors and the gradients needs to be carried
out separately for each training data sample in a batch. That's what the outer
loop is for.
After we exit the outermost loop, we average over the results obtained from each
training data sample in a batch.
Pay attention to the variable 'vars_in_layer': it stores the node variables in
the layer currently being processed during backpropagation. The loop structure
described here is illustrated by the sketch below.
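The loop structure just described reduces to the following sketch. This is not the
actual implementation of the method; the names backprop_loop_sketch, per_sample_errors,
and per_sample_sigmoid_derivs are hypothetical stand-ins used only to show the ordering
of the loops and the batch averaging.

    def backprop_loop_sketch(per_sample_errors, per_sample_sigmoid_derivs, num_layers=3):
        # Outer loop: one training sample at a time, as SGD requires.
        batch_size = len(per_sample_errors)
        avg_backprop_signal = 0.0
        for sample_index in range(batch_size):
            signal = per_sample_errors[sample_index]
            # Inner loop: back_layer_index takes the values 2 and then 1 for a 3-layer network.
            for back_layer_index in reversed(range(1, num_layers)):
                signal *= per_sample_sigmoid_derivs[sample_index][back_layer_index]
            avg_backprop_signal += signal
        # After exiting the outer loop, average over the samples in the batch.
        return avg_backprop_signal / batch_size

    errors = [0.1, -0.2]                                   # one prediction error per sample
    derivs = [[None, 0.25, 0.20], [None, 0.30, 0.15]]      # sigmoid derivatives indexed by layer (index 0 unused)
    print(backprop_loop_sketch(errors, derivs))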
- backprop_and_update_params_one_neuron_model(self, data_tuples_in_batch, predictions, y_errors_in_batch, deriv_sigmoids)
- This function implements the equations shown on Slide 61 of my Week 3 presentation in our DL
class at Purdue. All four parameters listed above are lists whose elements were either supplied to
the forward-prop function or calculated by it, one element per training data sample in the batch.
A generic sketch of such an update is shown below.
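Since the Slide 61 equations are not reproduced here, what follows is only a generic sketch of an
SGD update for a single sigmoid neuron with a batch-averaged gradient. The function name and the
convention that each data tuple is keyed by the name of the parameter it multiplies are assumptions
made just for this illustration.

    def one_neuron_update_sketch(params, data_tuples_in_batch, y_errors_in_batch,
                                 deriv_sigmoids, learning_rate=1e-3):
        batch_size = len(data_tuples_in_batch)
        for name in params:
            grad = 0.0
            for x_vals, y_error, deriv in zip(data_tuples_in_batch, y_errors_in_batch, deriv_sigmoids):
                grad += y_error * deriv * x_vals[name]          # contribution of one training sample
            params[name] += learning_rate * grad / batch_size   # average over the batch, then take a step
        return params

    params = {'ab': 0.5, 'bc': 0.5}
    batch  = [{'ab': 1.0, 'bc': 2.0}, {'ab': 0.5, 'bc': 1.5}]   # inputs keyed by the parameter they multiply
    print(one_neuron_update_sketch(params, batch, [0.2, -0.1], [0.25, 0.20]))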
- calculate_loss(self, predicted_val, true_val)
- ######################################################################################################
###################################### Utility Functions ############################################
- display_DAG(self)
- The network visualization code in this script should work for any general DAG defined in
an instance of CGP. For an example, see the script
graph_based_dataflow.py
in the Examples directory of the module.
- display_multi_neuron_network(self)
- In version 1.1.0, I made this network visualization more general and (if it has no bugs) it should
work with any multi-layer network graph, such as the one shown in
multi_neuron_classifier.py
in the Examples directory of the module.
- display_network1(self)
- display_network2(self)
- Provides a fancier display of the network graph
- display_one_neuron_network(self)
- In version 1.1.0, I generalized this code to work on any one-neuron network as defined in the
example:
one_neuron_classifier.py
in the Examples directory of the module.
- eval_expression(self, exp, vals_for_vars, vals_for_learnable_params, ind_vars=None)
- forward_prop_multi_neuron_model(self, data_tuples_in_batch)
- During forward propagation, we push each batch of the input data through the
network. In order to explain the logic of forward, consider the following network
layout with 4 nodes in the input layer, 2 nodes in the hidden layer, and 1 node in
the output layer.
           input

              x                              x = node

              x         x|                   | = sigmoid activation
                                  x|
              x         x|

              x

           layer_0    layer_1   layer_2
In the code shown below, the expressions to evaluate for computing the
pre-activation values at a node are stored at the layer in which the nodes reside.
That is, the dictionary look-up "self.layer_exp_objects[layer_index]" returns the
Expression objects whose left-side dependent variable resides in the layer pointed
to by layer_index. So, for the example shown above, "self.layer_exp_objects[1]"
will return two Expression objects, one for each of the two nodes in the second
layer of the network (that is, the layer indexed 1).
The pre-activation values obtained by evaluating the expressions at each node are
then subject to Sigmoid activation, followed by the calculation of the partial
derivative of the output of the Sigmoid function with respect to its input.
In the forward pass, the values calculated for the nodes in each layer are stored in
the dictionary
self.forw_prop_vals_at_layers[ layer_index ]
and the gradient values calculated at the same nodes are stored in the dictionary
self.gradient_vals_for_layers[ layer_index ]
A simplified sketch of this bookkeeping follows.
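The layer-by-layer bookkeeping described above is illustrated by the following sketch. It is not
the actual forward_prop_multi_neuron_model() code: the lambda-based layer_exprs dictionary and the
hard-coded parameter values are stand-ins for the Expression objects and the learnable parameters.

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    # Hypothetical stand-in for self.layer_exp_objects: each layer maps a node name
    # to a callable that computes that node's pre-activation value from the
    # previous layer's values.
    layer_exprs = {
        1: {'xw': lambda prev: 0.4 * prev['xa'] + 0.3 * prev['xb'],
            'xz': lambda prev: 0.2 * prev['xc'] + 0.1 * prev['xd']},
        2: {'xo': lambda prev: 0.6 * prev['xw'] + 0.5 * prev['xz']},
    }

    forw_prop_vals_at_layers = {0: {'xa': 1.0, 'xb': 2.0, 'xc': 3.0, 'xd': 4.0}}
    gradient_vals_for_layers = {}
    for layer_index in sorted(layer_exprs):
        prev_vals = forw_prop_vals_at_layers[layer_index - 1]
        vals, grads = {}, {}
        for node, expr in layer_exprs[layer_index].items():
            out = sigmoid(expr(prev_vals))            # post-activation value at the node
            vals[node] = out
            grads[node] = out * (1.0 - out)           # partial of the Sigmoid output w.r.t. its input
        forw_prop_vals_at_layers[layer_index] = vals
        gradient_vals_for_layers[layer_index] = grads

    print(forw_prop_vals_at_layers[2], gradient_vals_for_layers[2])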
- forward_prop_one_neuron_model(self, data_tuples_in_batch)
- Forward propagates the batch data through the neural network according to the equations on
Slide 50 of my Week 3 slides.
As the one-neuron model is characterized by a single expression, the main job of this function is
to evaluate that expression for each data tuple in the incoming batch. The resulting output is
fed into the sigmoid activation function, and the partial derivative of the sigmoid with respect
to its input is also calculated, as in the sketch shown below.
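The per-sample computation described above amounts to the following sketch, with a hypothetical
single-neuron expression 'xw = ab*xa + bc*xb' standing in for whatever expression was supplied to
the constructor.

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    params = {'ab': 0.7, 'bc': -0.3}                            # current values of the learnable params
    batch  = [{'xa': 1.0, 'xb': 2.0}, {'xa': 0.5, 'xb': 1.5}]   # data tuples in the incoming batch

    predictions, deriv_sigmoids = [], []
    for data_tuple in batch:
        pre_activation = params['ab'] * data_tuple['xa'] + params['bc'] * data_tuple['xb']
        y = sigmoid(pre_activation)
        predictions.append(y)
        deriv_sigmoids.append(y * (1.0 - y))                    # partial of the sigmoid w.r.t. its input
    print(predictions, deriv_sigmoids)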
- forward_propagate_one_input_sample_with_partial_deriv_calc(self, sample_index, input_vals_for_ind_vars)
- If you want to look at how the information flows in the DAG when you don't have to worry about
estimating the partial derivatives, see the method gen_gt_dataset(). As you will notice in the
implementation code for that method, there is nothing much to pushing the input values through
the nodes and the arcs of a computational graph if we are not concerned about estimating the
partial derivatives.
On the other hand, if you want to see how one might also estimate the partial derivatives
during the forward flow of information in a computational graph, the forward_propagate...()
method presented here is the one to examine. We first split the expression that the node
variable depends on into its constituent parts on the basis of '+' and '-' operators and
subsequently, for each part, we estimate the partial of the node variable with respect
to the variables and the learnable parameters in that part.
The needed partial derivatives are all calculated using the finite difference method in which
you add a small grad_delta value to the value of the variable with respect to which you are
calculating the partial, and you then estimate the resulting change at the node in question.
The change divided by grad_delta is the partial derivative you are looking for, as the sketch
below illustrates.
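The finite-difference estimate described in the last paragraph boils down to the following sketch;
finite_difference_partial() is a hypothetical helper written for this illustration, not a method
of CGP.

    def finite_difference_partial(f, var_vals, var_name, grad_delta=1e-4):
        # f takes a dict of variable values and returns the value at the node in question.
        base = f(var_vals)
        bumped = dict(var_vals)
        bumped[var_name] += grad_delta                  # perturb only the variable of interest
        return (f(bumped) - base) / grad_delta          # change at the node divided by grad_delta

    # Example: the partial of xz = bc*xx + xy with respect to the learnable param 'bc'
    xz = lambda v: v['bc'] * v['xx'] + v['xy']
    print(finite_difference_partial(xz, {'bc': 0.5, 'xx': 2.0, 'xy': 1.0}, 'bc'))   # approximately 2.0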
- gen_gt_dataset(self, vals_for_learnable_params={})
- This method illustrates that it is trivial to forward-propagate the information through
the computational graph if you are not concerned about estimating the partial derivatives
at the same time. This method is used to generate 'dataset_size' number of input/output
values for the computational graph for given values for the learnable parameters.
- gen_gt_dataset_with_activations(self, vals_for_learnable_params={})
- This method illustrates that it is trivial to forward-propagate the information through
the computational graph if you are not concerned about estimating the partial derivatives
at the same time. This method is used to generate 'dataset_size' number of input/output
values for the computational graph for given values for the learnable parameters.
- gen_training_data(self)
- This 2-class dataset is used for the demos in the following Examples directory scripts:
one_neuron_classifier.py
multi_neuron_classifier.py
The classes are labeled 0 and 1. All of the data for class 0 is simply a list of
numbers associated with the key 0. Similarly all the data for class 1 is another list of
numbers associated with the key 1.
For each class, the dataset starts out as samples from a standard normal distribution (zero mean
and unit variance). For class 0, we add a mean value of 2.0 to those numbers; for class 1, we
square the original numbers and add a mean value of 4.0.
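A minimal stand-in for this data generation is sketched below; the function name and the use of
Python's random module are assumptions of the sketch rather than details of gen_training_data()
itself.

    import random

    def gen_training_data_sketch(dataset_size=1000):
        dataset = {0: [], 1: []}
        for _ in range(dataset_size // 2):
            dataset[0].append(random.gauss(0.0, 1.0) + 2.0)        # class 0: standard normal shifted to mean 2.0
            dataset[1].append(random.gauss(0.0, 1.0) ** 2 + 4.0)   # class 1: squared standard-normal sample plus 4.0
        return dataset

    data = gen_training_data_sketch(10)
    print(data[0], data[1])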
- parse_expressions(self)
- This method creates a DAG from a set of expressions that involve variables and learnable
parameters. The expressions are based on the assumption that a symbolic name that starts
with the letter 'x' is a variable, with all other symbolic names being learnable parameters.
The computational graph is represented by two dictionaries, 'depends_on' and 'leads_to'.
To illustrate the meaning of the dictionaries, something like "depends_on['xz']" would be
set to a list of all other variables whose outgoing arcs end in the node 'xz'. So
something like "depends_on['xz']" is best read as "node 'xz' depends on ...." where the
dots stand for the array of nodes that is the value of "depends_on['xz']". On the other
hand, the 'leads_to' dictionary has the opposite meaning. That is, something like
"leads_to['xz']" is set to the array of nodes at the ends of all the arcs that emanate
from 'xz'.
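The meaning of the two dictionaries can be reproduced with the short sketch below. It follows the
naming conventions stated above but is not the module's actual parser; the regex-based splitting
is an assumption made for this illustration.

    import re

    expressions = ['xx=xa^2', 'xy=ab*xx+ac*xa', 'xz=bc*xx+xy', 'xw=cd*xx+xz^3']

    depends_on, leads_to = {}, {}
    for exp in expressions:
        lhs, rhs = exp.split('=')
        rhs_vars = [name for name in re.findall(r'[a-zA-Z_]+', rhs)
                    if name.startswith('x')]          # names starting with 'x' are node variables
        depends_on[lhs] = rhs_vars
        for var in rhs_vars:
            leads_to.setdefault(var, []).append(lhs)

    print(depends_on['xz'])    # ['xx', 'xy']        -- node 'xz' depends on these nodes
    print(leads_to['xx'])      # ['xy', 'xz', 'xw']  -- arcs out of 'xx' end at these nodes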
- parse_general_dag_expressions(self)
- This method is a modification of the previous expression parser and is meant specifically
for the case when a given set of expressions is supposed to define a general DAG. The
naming conventions for the variables, which designate the nodes in the layers
of the network, and the learnable parameters remain the same as in the previous function.
- parse_multi_layer_expressions(self)
- This method is a modification of the previous expression parser and is meant specifically
for the case when a given set of expressions is supposed to define a multi-layer neural
network. The naming conventions for the variables, which designate the nodes in the layers
of the network, and the learnable parameters remain the same as in the previous function.
- plot_loss(self)
- run_training_loop_multi_neuron_model(self, training_data)
- ######################################################################################################
######################################## multi neuron model ##########################################
- run_training_loop_one_neuron_model(self, training_data)
- The training loop must first initialize the learnable parameters. Remember, these are the
symbolic names in your input expressions for the neural layer that do not begin with the
letter 'x'. In this case, we initialize them with random numbers drawn from a uniform
distribution over the interval (0,1), as in the sketch shown below.
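The initialization just described looks roughly like the sketch below. The example expression is
hypothetical; only the convention that names not beginning with 'x' are learnable parameters comes
from the documentation above.

    import random, re

    expressions = ['xw=ab*xa+bc*xb+cd*xc+ac*xd']       # hypothetical one-neuron expression
    learnable_params = sorted({name for exp in expressions
                                    for name in re.findall(r'[a-zA-Z_]+', exp.split('=')[1])
                                    if not name.startswith('x')})
    vals_for_learnable_params = {param: random.uniform(0, 1) for param in learnable_params}
    print(vals_for_learnable_params)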
- run_training_with_torchnn(self, option, training_data)
- The value of the parameter 'option' must be either 'one_neuron' or 'multi_neuron'.
For either option, the number of input nodes is determined by the expressions supplied to the
constructor of the class ComputationalGraphPrimer.
When the option value is 'one_neuron', we use OneNeuronNet for the learning network, and
when the option is 'multi_neuron', we use MultiNeuronNet.
Assuming that the number of input nodes specified by the expressions is 4, the MultiNeuronNet
class creates the following network layout in which we have 2 nodes in the hidden layer and
one node for the final output:
           input

              x                              x = node

              x         x|                   | = ReLU activation
                                  x|
              x         x|

              x

           layer_0    layer_1   layer_2
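In torch.nn terms, the two layouts described above correspond roughly to the following sketch;
OneNeuronNet and MultiNeuronNet themselves are defined inside the module and may differ in detail.

    import torch.nn as nn

    one_neuron_sketch = nn.Linear(4, 1)       # the 'one_neuron' option: 4 inputs feeding a single output node

    multi_neuron_sketch = nn.Sequential(      # the 'multi_neuron' option sketched above
        nn.Linear(4, 2),                      # layer_0 -> layer_1 (2 hidden nodes)
        nn.ReLU(),                            # ReLU activation at the hidden nodes
        nn.Linear(2, 1),                      # layer_1 -> layer_2 (1 output node)
    )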
- train_on_all_data(self)
- The purpose of this method is to call forward_propagate_one_input_sample_with_partial_deriv_calc()
repeatedly on all input/output ground-truth training data pairs generated by the method
gen_gt_dataset(). The call to the forward_propagate...() method returns the predicted value
at the output nodes from the supplied values at the input nodes. The "train_on_all_data()"
method calculates the error associated with the predicted value. The call to
forward_propagate...() also returns the partial derivatives estimated by using the finite
difference method in the computational graph. Using these partial derivatives, the
"train_on_all_data()" method backpropagates the loss to the interior nodes in the computational
graph and updates the values of the learnable parameters.
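The sequence of calls described above can be summarized by the sketch below; forward_fn stands in
for forward_propagate_one_input_sample_with_partial_deriv_calc(), and the simple additive update
is only illustrative, not the module's actual update rule.

    def train_on_all_data_sketch(gt_pairs, params, forward_fn, learning_rate=1e-4):
        for input_vals, true_val in gt_pairs:
            # forward_fn returns the predicted output value and a dict of partial
            # derivatives of that output with respect to each learnable parameter.
            predicted_val, partials = forward_fn(input_vals, params)
            error = true_val - predicted_val                       # error at the output node
            for name, partial in partials.items():
                params[name] += learning_rate * error * partial    # push the error back into the params
        return params

    # Toy usage with a single-parameter graph xw = aa * xa:
    fwd = lambda inp, p: (p['aa'] * inp['xa'], {'aa': inp['xa']})
    print(train_on_all_data_sketch([({'xa': 2.0}, 1.0)] * 100, {'aa': 0.0}, fwd, learning_rate=0.1))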
Data descriptors defined here:
- __dict__
- dictionary for instance variables (if defined)
- __weakref__
- list of weak references to the object (if defined)
Data and other attributes defined here:
- AutogradCustomization = <class 'ComputationalGraphPrimer.ComputationalGraphPrimer.AutogradCustomization'>
- This class illustrates how you can add functionality to Autograd by
following the instructions posted at
https://pytorch.org/docs/stable/notes/extending.html
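The kind of extension the class refers to follows the standard torch.autograd.Function pattern
from the page linked above. The custom ReLU below is the textbook example of that pattern, not the
specific customization implemented by AutogradCustomization.

    import torch

    class MyReLU(torch.autograd.Function):
        @staticmethod
        def forward(ctx, inp):
            ctx.save_for_backward(inp)        # stash the input for use in backward()
            return inp.clamp(min=0)

        @staticmethod
        def backward(ctx, grad_output):
            inp, = ctx.saved_tensors
            grad_input = grad_output.clone()
            grad_input[inp < 0] = 0           # zero gradient where the input was negative
            return grad_input

    x = torch.randn(5, requires_grad=True)
    MyReLU.apply(x).sum().backward()
    print(x.grad)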
class Exp(builtins.object)
Exp(exp, body, dependent_var, right_vars, right_params)
With CGP, you can handcraft a neural network (actually you can handcraft any DAG) by designating
the nodes and the links between them with expressions like
expressions = [ 'xx=xa^2',
'xy=ab*xx+ac*xa',
'xz=bc*xx+xy',
'xw=cd*xx+xz^3' ]
In these expressions, names beginning with 'x' denote the nodes in the DAG, and names beginning with
other lowercase letters, such as 'a', 'b', and 'c', designate the learnable parameters. The variable on the
left of the '=' symbol is considered to be the dependent_var and those on the right are, as you guessed,
the right_vars. Since the learnable parameters are always on the right of the equality sign, we refer
to them as right_params in what is shown below. The expressions shown above are parsed by the
parser function in CGP. The parser outputs an instance of the Exp class for each expression of the
sort shown above. The listing above has 4 expressions for creating a DAG; of course, you can have
any number of them.
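As a hand-worked example (and only a guess at the exact argument formats, since the parser's output
is not reproduced here), the third expression in the list above would give rise to an Exp instance
constructed roughly as follows, assuming Exp is accessible from the ComputationalGraphPrimer module.

    # Hand-worked illustration; the actual parser may format these arguments differently.
    exp_instance = Exp(exp='xz=bc*xx+xy',
                       body='bc*xx+xy',           # everything to the right of the '=' sign
                       dependent_var='xz',        # the node variable on the left of the '='
                       right_vars=['xx', 'xy'],   # node variables on the right (names starting with 'x')
                       right_params=['bc'])       # learnable parameters on the right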
Methods defined here:
- __init__(self, exp, body, dependent_var, right_vars, right_params)
- Initialize self. See help(type(self)) for accurate signature.
Data descriptors defined here:
- __dict__
- dictionary for instance variables (if defined)
- __weakref__
- list of weak references to the object (if defined)