
DLStudio (version 1.0.9, 2020-March-21)

DLStudio.py
Version: 1.0.9 Author: Avinash Kak (kak@purdue.edu) Date: 2020-March-21


CHANGES:

Version 1.0.9:

With this version, you can now use DLStudio for experiments in semantic segmentation of images. The code added to the module is in a new inner class that, as you might guess, is named SemanticSegmentation. The workhorse of this inner class is a new implementation of the famous Unet that I have named mUnet --- the prefix "m" stands for "multi" for the ability of the network to segment out multiple objects simultaneously. This version of DLStudio also comes with a new dataset, PurdueShapes5MultiObject, for experimenting with mUnet. Each image in this dataset contains a random number of selections from five different shapes --- rectangle, triangle, disk, oval, and star --- that are randomly scaled, oriented, and located in each image.

Version 1.0.7:

The main reason for creating this version of DLStudio is to be able to use the module for illustrating how to simultaneously carry out classification and regression (C&R) with the same convolutional network. The specific C&R problem that is solved in this version is the problem of object detection and localization. You want a CNN to categorize the object in an image and, at the same time, estimate the bounding-box for the detected object. Estimating the bounding-box is referred to as regression. All of the code related to object detection and localization is in the inner class DetectAndLocalize of the main module file. Training a CNN to solve the detection and localization problem requires a dataset that, in addition to the class labels for the objects, also provides bounding-box annotations for the objects. Towards that end, this version also comes with a new dataset called PurdueShapes5. Another new inner class, CustomDataLoading, that is also included in Version 1.0.7 has the dataloader for the PurdueShapes5 dataset.

Version 1.0.6:

This version has the bugfix for a bug in SkipBlock that was spotted by a student as I was demonstrating in class the concepts related to the use of skip connections in deep neural networks.

Version 1.0.5:

This version includes an inner class, SkipConnections, for experimenting with skip connections to improve the performance of a deep network. The Examples subdirectory of the distribution includes a script, playing_with_skip_connections.py, that demonstrates how you can experiment with SkipConnections. The network class used by SkipConnections is named BMEnet with an easy-to-use interface for experimenting with networks of arbitrary depth.

Version 1.0.4:

I have added one more inner class, AutogradCustomization, to the module that illustrates how to extend Autograd if you want to endow it with additional functionality. And, most importantly, this version fixes an important bug that caused wrong information to be written out to the disk when you tried to save the learned model at the end of a training session. I have also cleaned up the comment blocks in the implementation code.

Version 1.0.3:

This is the first public release version of this module.

INTRODUCTION:

Every design activity involves mixing and matching things and doing so repeatedly until you have achieved the desired results. The same thing is true of modern deep learning networks. When you are working with a new data domain, it is likely that you would want to experiment with different network layouts that you may have dreamed of yourself or that you may have seen somewhere in a publication or at some web site.

The goal of this module is to make it easier to engage in this process. The idea is that you would drop in the module a new network and you would be able to see right away the results you would get with the new network.

This module also allows you to specify a network with a configuration string. The module parses the string and creates the network.
In upcoming revisions of this module, I am planning to add additional features to this approach in order to make it more general and more useful for production work.

Extending Autograd:

Version 1.0.4 of DLStudio incorporates a new inner class, AutogradCustomization, for illustrating how you can write your own code for customizing the behavior of PyTorch's Autograd module. Your starting point for understanding the code in AutogradCustomization should be the following script in the Examples directory of the distro:

    extending_autograd.py

Extending Autograd requires that you define a new verb class --- as I have with the class DoSillyWithTensor shown in the main module file --- with definitions for two static methods, "forward()" and "backward()". Note that an instance constructed from this class is callable.

Skip Connections:

Starting with Version 1.0.6, you can now experiment with skip connections in a CNN to see how a deep network with this feature might yield improved classification results. Deep networks suffer from the problem of vanishing gradients that degrades their performance. Vanishing gradients means that the gradients of the loss calculated in the early layers of a network become increasingly muted as the network becomes deeper. An important mitigation strategy for addressing this problem consists of creating a CNN using blocks with skip connections.

The code for using skip connections is in the inner class SkipConnections of the module. And the network that allows you to construct a CNN with skip connections is named BMEnet. As shown in the script playing_with_skip_connections.py in the Examples directory of the distribution, you can easily create a CNN with arbitrary depth just by using the constructor option "depth" for BMEnet. The basic block of the network constructed in this manner is called SkipBlock which, very much like the BasicBlock in ResNet-18, has a couple of convolutional layers whose output is combined with the input to the block.

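The effect of combining a block's output with its input can be seen in a toy scalar model. The sketch below is not the SkipBlock or BMEnet code; it assumes each "layer" is just tanh(w*x) with a hypothetical weight w, and compares the chain-rule gradient of the output with respect to the input for a plain stack against a stack in which every layer adds its input back in.

```python
import math

def layer(x, w):
    """One toy 'layer': a scalar weight followed by tanh (illustrative only)."""
    return math.tanh(w * x)

def layer_grad(x, w):
    """Derivative of tanh(w*x) with respect to x."""
    return w * (1.0 - math.tanh(w * x) ** 2)

def input_gradient(x, weights, skip):
    """Chain-rule gradient d(output)/d(input) through a stack of toy layers."""
    grad = 1.0
    for w in weights:
        local = layer_grad(x, w)
        if skip:
            # y = x + f(x)  implies  dy/dx = 1 + f'(x)
            grad *= 1.0 + local
            x = x + layer(x, w)
        else:
            # y = f(x)      implies  dy/dx = f'(x)
            grad *= local
            x = layer(x, w)
    return grad

weights = [0.5] * 20          # 20 layers, hypothetical weights
plain = input_gradient(0.3, weights, skip=False)
skipped = input_gradient(0.3, weights, skip=True)
print("gradient without skip connections:", plain)
print("gradient with skip connections:   ", skipped)
```

In the plain stack every factor is at most 0.5, so the product shrinks geometrically with depth. With the identity shortcut every factor is at least 1, so the gradient that reaches the input does not vanish.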
Note that the value given to the "depth" constructor option for the BMEnet class does NOT translate directly into the actual depth of the CNN. [Again, see the script playing_with_skip_connections.py in the Examples directory for how to use this option.] The value of "depth" is translated into how many instances of SkipBlock to use for constructing the CNN.

If you want to use DLStudio for learning how to create your own versions of SkipBlock-like shortcuts in a CNN, your starting point should be the following script in the Examples directory of the distro:

    playing_with_skip_connections.py

This script illustrates how to use the inner class BMEnet of the module for experimenting with skip connections in a CNN. As the script shows, the constructor of the BMEnet class comes with two options: skip_connections and depth. By turning the first on and off, you can directly illustrate in a classroom setting the improvement you can get with skip connections. And by giving an appropriate value to the "depth" option, you can show results for networks of different depths.

Object Detection and Localization:

The code for how to solve the problem of object detection and localization with a CNN is in the inner classes DetectAndLocalize and CustomDataLoading. This code was developed for version 1.0.7 of the module. In general, object detection and localization problems are more challenging than pure classification problems because solving the localization part requires regression for the coordinates of the bounding box that localizes the object. If at all possible, you would want the same CNN to provide answers to both the classification and the regression questions and do so at the same time. This calls for a CNN to possess two different output layers, one for classification and the other for regression. A deep network that does exactly that is illustrated by the LOADnet classes that are defined in the inner class DetectAndLocalize of the DLStudio module.
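A network with a classification output and a regression output needs a loss for each. The following pure-Python sketch shows one way the two could be computed for a single image: a cross-entropy loss on the class scores and a mean-squared-error loss on the four bounding-box coordinates. The function names and the weighted sum used to combine the two losses are assumptions made for illustration; they are not taken from the LOADnet code.

```python
import math

def cross_entropy(logits, target_index):
    """Cross-entropy of softmax(logits) against an integer class label."""
    m = max(logits)                                   # subtract max for stability
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[target_index]

def mse(predicted, target):
    """Mean squared error between two equal-length coordinate lists."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

def combined_loss(class_logits, true_class, bbox_pred, bbox_true, bbox_weight=1.0):
    """Classification loss plus weighted regression loss (the weighting is an assumption)."""
    return cross_entropy(class_logits, true_class) + bbox_weight * mse(bbox_pred, bbox_true)

# Five class scores (one per shape) and a predicted box [x1, y1, x2, y2]
logits = [2.0, 0.1, -1.0, 0.3, 0.0]
loss = combined_loss(logits, 0, [5.0, 6.0, 20.0, 22.0], [4.0, 6.0, 21.0, 24.0])
print("combined loss:", loss)
```

A single scalar of this kind lets one backward pass update the shared convolutional layers from both tasks at once.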
[By the way, the acronym "LOAD" in "LOADnet" stands for "LOcalization And Detection".] Although you will find three versions of the LOADnet class inside DetectAndLocalize, for now only pay attention to the LOADnet2 class since that is the one I have worked with the most for creating the 1.0.7 distribution.

As you would expect, training a CNN for object detection and localization requires a dataset that, in addition to the class labels for the images, also provides bounding-box annotations for the objects in the images. Out of my great admiration for the CIFAR-10 dataset as an educational tool for solving classification problems, I have created small-image-format training and testing datasets for illustrating the code devoted to object detection and localization in this module. The training dataset is named PurdueShapes5-10000-train.gz and it consists of 10,000 images, with each image of size 32x32 containing one of five possible shapes --- rectangle, triangle, disk, oval, and star. The shape objects in the images are randomized with respect to size, orientation, and color. The testing dataset is named PurdueShapes5-1000-test.gz and it contains 1000 images generated by the same randomization process as used for the training dataset. You will find these datasets in the "data" subdirectory of the "Examples" directory in the distribution.

Providing a new dataset for experiments with detection and localization meant that I also needed to supply a custom dataloader for the dataset. Toward that end, Version 1.0.7 also includes another inner class named CustomDataLoading where you will find my implementation of the custom dataloader for the PurdueShapes5 dataset.

If you want to use DLStudio for learning how to write your own PyTorch code for object detection and localization, your starting point should be the following script in the Examples directory of the distro:

    object_detection_and_localization.py

Execute the script and understand what functionality of the inner class DetectAndLocalize it invokes for object detection and localization.

Semantic Segmentation:

The code for how to carry out semantic segmentation is in the inner class that is appropriately named SemanticSegmentation. At its simplest, the purpose of semantic segmentation is to assign correct labels to the different objects in a scene, while localizing them at the same time. At a more sophisticated level, a system that carries out semantic segmentation should also output a symbolic expression based on the objects found in the image and their spatial relationships with one another. The code in the new inner class is based on only the simplest possible definition of what is meant by semantic segmentation.

The convolutional network that carries out semantic segmentation in DLStudio is named mUnet, where the letter "m" is short for "multi", which, in turn, stands for the fact that mUnet is capable of segmenting out multiple objects simultaneously from an image. The mUnet network is based on the now famous Unet network that was first proposed by Ronneberger, Fischer and Brox in the paper "U-Net: Convolutional Networks for Biomedical Image Segmentation". Their Unet extracts binary masks for the cell pixel blobs of interest in biomedical images. The output of Unet can therefore be treated as a pixel-wise binary classifier at each pixel position. The mUnet class, on the other hand, is intended for segmenting out multiple objects simultaneously from an image. [A weaker reason for "m" in the name of the class is that it uses skip connections in multiple ways --- such connections are used not only across the two arms of the "U", but also along the arms. The skip connections in the original Unet are only between the two arms of the U.]

mUnet works by assigning a separate channel in the output of the network to each different object type. After the network is trained, for a given input image, all you have to do is examine the different channels of the output for the presence or the absence of the objects corresponding to the channel index.

This version of DLStudio also comes with a new dataset, PurdueShapes5MultiObject, for experimenting with mUnet. Each image in this dataset contains a random number of selections from five different shapes, with the shapes being randomly scaled, oriented, and located in each image. The five different shapes are: rectangle, triangle, disk, oval, and star.

Your starting point for learning how to use the mUnet network for segmenting images should be the following script in the Examples directory of the distro:

    semantic_segmentation.py

Execute the script and understand how it uses the functionality packed in the inner class SemanticSegmentation for segmenting out the objects in an image.

INSTALLATION:

The DLStudio class was packaged using setuptools.
For installation, execute the following command in the source directory (this is the directory that contains the setup.py file after you have downloaded and uncompressed the package):

    sudo python setup.py install

and/or, for the case of Python3,

    sudo python3 setup.py install

On Linux distributions, this will install the module file at a location that looks like

    /usr/local/lib/python2.7/dist-packages/

and, for the case of Python3, at a location that looks like

    /usr/local/lib/python3.6/dist-packages/

If you do not have root access, you have the option of working directly off the directory in which you downloaded the software by simply placing the following statements at the top of your scripts that use the DLStudio class:

    import sys
    sys.path.append( "pathname_to_DLStudio_directory" )

To uninstall the module, simply delete the source directory, locate where the DLStudio module was installed with "locate DLStudio" and delete those files. As mentioned above, the full pathname to the installed version is likely to look like /usr/local/lib/python2.7/dist-packages/DLStudio*

If you want to carry out a non-standard install of the DLStudio module, look up the on-line information on Distutils by pointing your browser to

    http://docs.python.org/dist/dist.html

USAGE:

If you want to specify a network with just a configuration string, your usage of the module is going to look like:

    from DLStudio import *

    convo_layers_config = "1x[128,3,3,1]-MaxPool(2) 1x[16,5,5,1]-MaxPool(2)"
    fc_layers_config = [-1,1024,10]

    dls = DLStudio(
              dataroot = "/home/kak/ImageDatasets/CIFAR-10/",
              image_size = [32,32],
              convo_layers_config = convo_layers_config,
              fc_layers_config = fc_layers_config,
              path_saved_model = "./saved_model",
              momentum = 0.9,
              learning_rate = 1e-3,
              epochs = 2,
              batch_size = 4,
              classes = ('plane','car','bird','cat','deer',
                         'dog','frog','horse','ship','truck'),
              use_gpu = True,
              debug_train = 0,
              debug_test = 1,
          )

    configs_for_all_convo_layers = dls.parse_config_string_for_convo_layers()
    convo_layers = dls.build_convo_layers( configs_for_all_convo_layers )
    fc_layers = dls.build_fc_layers()
    model = dls.Net(convo_layers, fc_layers)
    dls.show_network_summary(model)
    dls.load_cifar_10_dataset()
    dls.run_code_for_training(model)
    dls.run_code_for_testing(model)

or, if you would rather experiment with a drop-in network, your usage of the module is going to look something like:

    dls = DLStudio(
              dataroot = "/home/kak/ImageDatasets/CIFAR-10/",
              image_size = [32,32],
              path_saved_model = "./saved_model",
              momentum = 0.9,
              learning_rate = 1e-3,
              epochs = 2,
              batch_size = 4,
              classes = ('plane','car','bird','cat','deer',
                         'dog','frog','horse','ship','truck'),
              use_gpu = True,
              debug_train = 0,
              debug_test = 1,
          )

    exp_seq = DLStudio.ExperimentsWithSequential( dl_studio = dls )   ## for your drop-in network
    exp_seq.load_cifar_10_dataset_with_augmentation()
    model = exp_seq.Net()
    dls.show_network_summary(model)
    exp_seq.run_code_for_training(model)
    exp_seq.run_code_for_testing(model)

This assumes that you copy-and-pasted the network you want to experiment with in a class like ExperimentsWithSequential that is included in the module.

CONSTRUCTOR PARAMETERS:

batch_size: Carries the usual meaning in the neural network context.

classes: A list of the symbolic names for the classes.

convo_layers_config: This parameter allows you to specify a convolutional network with a configuration string. Must be formatted as explained in the comment block associated with the method "parse_config_string_for_convo_layers()".

dataroot: This points to where your dataset is located.

debug_test: Setting it allows you to see images being used and their predicted class labels every 2000 batch-based iterations of testing.

debug_train: Does the same thing during training that debug_test does during testing.

epochs: Specifies the number of epochs to be used for training the network.

fc_layers_config: This parameter allows you to specify the final fully-connected portion of the network with just a list of the number of nodes in each layer of this portion. The first entry in this list must be the number '-1', which stands for the fact that the number of nodes in the first layer will be determined by the final activation volume of the convolutional portion of the network.

image_size: The height x width size of the images in your dataset.

learning_rate: Again carries the usual meaning.

momentum: Carries the usual meaning and needed by the optimizer.

path_saved_model: The path to where you want the trained model to be saved in your disk so that it can be retrieved later for inference.

use_gpu: You must set it to True if you want the GPU to be used for training.

PUBLIC METHODS:

(1) build_convo_layers()

This method creates the convolutional layers from the parameters in the configuration string that was supplied through the constructor option 'convo_layers_config'. The output produced by the call to 'parse_config_string_for_convo_layers()' is supplied as the argument to build_convo_layers().

(2) build_fc_layers()

From the list of ints supplied through the constructor option 'fc_layers_config', this method constructs the fully-connected portion of the overall network.

(3) check_a_sampling_of_images()

Displays the first batch_size number of images in your dataset.

(4) display_tensor_as_image()

This method will display any tensor of shape (3,H,W), (1,H,W), or just (H,W) as an image. If any further data normalization is needed for constructing a displayable image, the method takes care of that. It has two input parameters: one for the tensor you want displayed as an image and the other for a title for the image display. The latter parameter is default initialized to an empty string.

(5) load_cifar_10_dataset()

This is just a convenience method that calls on Torchvision's functionality for creating a data loader.

(6) load_cifar_10_dataset_with_augmentation()

This convenience method also creates a data loader but it also includes the syntax for data augmentation.

(7) parse_config_string_for_convo_layers()

As mentioned in the Introduction, the DLStudio module allows you to specify a convolutional network with a string provided the string obeys the formatting convention described in the comment block of this method. This method is for parsing such a string. The string itself is presented to the module through the constructor option 'convo_layers_config'.

(8) run_code_for_testing()

This is the method that runs the trained model on the test data. Its output is a confusion matrix for the classes and the overall accuracy for each class. The method has one input parameter which is set to the network to be tested. The learnable parameters in the network are initialized with the disk-stored version of the trained model.

(9) run_code_for_training()

This is the method that does all the training work. If a GPU was detected at the time an instance of the module was created, this method takes care of making the appropriate calls in order to transfer the tensors involved into the GPU memory.

(10) save_model()

Writes the model out to the disk at the location specified by the constructor option 'path_saved_model'. Has one input parameter for the model that needs to be written out.

(11) show_network_summary()

Displays a print representation of your network and calls on the torchsummary module to print out the shape of the tensor at the output of each layer in the network. The method has one input parameter which is set to the network whose summary you want to see.

INNER CLASSES OF THE MODULE:

The purpose of the following two inner classes is to demonstrate how you can create a custom class for your own network and test it within the framework provided by the DLStudio module.

(1) class ExperimentsWithSequential

This class is my demonstration of experimenting with a network that I found on GitHub.
I copy-and-pasted it in this class to test its capabilities. How to call on such a custom class is shown by the following script in the Examples directory:

    playing_with_sequential.py

(2) class ExperimentsWithCIFAR

This is very similar to the previous inner class, but uses a common example of a network for experimenting with the CIFAR-10 dataset. Consisting of 32x32 images, this is a great dataset for creating classroom demonstrations of convolutional networks. How you should use this class is shown in the following script in the Examples directory of the distribution:

    playing_with_cifar10.py

(3) class AutogradCustomization

The purpose of this class is to illustrate how to extend Autograd with additional functionality. What's shown is an implementation of the recommended approach at the following documentation page:

    https://pytorch.org/docs/stable/notes/extending.html

(4) class SkipConnections

This class is for investigating the power of skip connections in deep networks. Skip connections are used to mitigate a serious problem associated with deep networks --- the problem of vanishing gradients. It has been argued theoretically and demonstrated empirically that as the depth of a neural network increases, the gradients of the loss become more and more muted for the early layers in the network.

(5) class DetectAndLocalize

The code in this inner class is for demonstrating how the same convolutional network can simultaneously solve the twin problems of object detection and localization. Note that, unlike the previous four inner classes, class DetectAndLocalize comes with its own implementations for the training and testing methods. The main reason for that is that the training for detection and localization must use two different loss functions simultaneously, one for classification of the objects and the other for regression.
The function for testing is also a bit more involved since it must now compute two kinds of errors, the classification error and the regression error on the unseen data. Although you will find a couple of different choices for the training and testing functions for detection and localization inside DetectAndLocalize, the ones I have worked with the most are the following two, as used by the scripts in the Examples directory:

    run_code_for_training_with_CrossEntropy_and_MSE_Losses()
    run_code_for_testing_detection_and_localization()

(6) class CustomDataLoading

This is a testbed for experimenting with a completely ground-up attempt at designing a custom data loader. Ordinarily, if the basic format of how the dataset is stored is similar to one of the datasets that Torchvision knows about, you can go ahead and use that for your own dataset. At worst, you may need to carry out some light customizations depending on the number of classes involved, etc. However, if the underlying dataset is stored in a manner that does not look like anything in Torchvision, you have no choice but to supply yourself all of the data loading infrastructure. That is what this inner class of the DLStudio module is all about.

(7) class SemanticSegmentation

This inner class is for working with the mUnet convolutional network for semantic segmentation of images. This network allows you to segment out multiple objects simultaneously from an image. Each object type is assigned a different channel in the output of the network. So, for segmenting out the objects of a specified type in a given input image, all you have to do is examine the corresponding channel in the output.

THE Examples DIRECTORY:

The Examples subdirectory in the distribution contains the following scripts:

(1) playing_with_reconfig.py

Shows how you can specify a convolution network with a configuration string. The DLStudio module parses the string and constructs the network.

(2) playing_with_sequential.py

Shows you how you can call on a custom inner class of the 'DLStudio' module that is meant to experiment with your own network. The name of the inner class in this example script is ExperimentsWithSequential.

(3) playing_with_cifar10.py

This is very similar to the previous example script but is based on the inner class ExperimentsWithCIFAR which uses more common examples of networks for playing with the CIFAR-10 dataset.

(4) extending_autograd.py

This provides a demonstration example of the recommended approach for giving additional functionality to Autograd --- as mentioned in the comment made above about the inner class AutogradCustomization.

(5) playing_with_skip_connections.py

This script illustrates how to use the inner class BMEnet of the module for experimenting with skip connections in a CNN. As the script shows, the constructor of the BMEnet class comes with two options: skip_connections and depth. By turning the first on and off, you can directly illustrate in a classroom setting the improvement you can get with skip connections. And by giving an appropriate value to the "depth" option, you can show results for networks of different depths.

(6) custom_data_loading.py

This script shows how to use the custom dataloader in the inner class CustomDataLoading of the DLStudio module. That custom dataloader is meant specifically for the PurdueShapes5 dataset that is used in object detection and localization experiments in DLStudio.

(7) object_detection_and_localization.py

This script shows how you can use the functionality provided by the inner class DetectAndLocalize of the DLStudio module for experimenting with object detection and localization. Detecting and localizing (D&L) objects in images is a more difficult problem than just classifying the objects. D&L requires that your CNN make two different types of inferences simultaneously, one for classification and the other for localization.
For the localization part, the CNN must carry out what is known as regression. What that means is that the CNN must output the numerical values for the bounding box that encloses the object that was detected. Generating these two types of inferences requires two different loss functions, one for classification and the other for regression.

(8) semantic_segmentation.py

This script should be your starting point if you wish to learn how to use the mUnet neural network for semantic segmentation of images. As mentioned elsewhere in this documentation page, mUnet assigns an output channel to each different type of object that you wish to segment out from an image. So, given a test image at the input to the network, all you have to do is to examine each channel at the output for segmenting out the objects that correspond to that output channel.

THE DATASETS INCLUDED:

Object Detection and Localization:

Training a CNN for object detection and localization requires training and testing datasets that come with bounding-box annotations. This module comes with the PurdueShapes5 dataset for that purpose. I created this small-image-format dataset out of my admiration for the CIFAR-10 dataset as an educational tool for demonstrating classification networks in a classroom setting. You will find the following dataset archive files in the "data" subdirectory of the "Examples" directory of the distro:

    (1) PurdueShapes5-10000-train.gz
    (2) PurdueShapes5-1000-test.gz
    (3) PurdueShapes5-20-train.gz
    (4) PurdueShapes5-20-test.gz

The number that follows the main name string "PurdueShapes5-" is for the number of images in the dataset. You will find the last two datasets, with 20 images each, useful for debugging your logic for object detection and bounding-box regression.

As to how the image data is stored in the archives, please see the main comment block for the inner class CustomDataLoading in this file.

Semantic Segmentation:

Showing interesting results with semantic segmentation requires images that contain multiple objects of different types. A good semantic segmenter would then allow for each object type to be segmented out separately from an image. A network that can carry out such segmentation needs training and testing datasets in which the images come with multiple objects of different types in them. Towards that end, I have created the following datasets:

    (5) PurdueShapes5MultiObject-10000-train.gz
    (6) PurdueShapes5MultiObject-1000-test.gz
    (7) PurdueShapes5MultiObject-20-train.gz
    (8) PurdueShapes5MultiObject-20-test.gz

The number that follows the main name string "PurdueShapes5MultiObject-" is for the number of images in the dataset. You will find the last two datasets, with 20 images each, useful for debugging your logic for semantic segmentation.

As to how the image data is stored in the archive files listed above, please see the main comment block for the class PurdueShapes5MultiObjectDataset. As explained there, in addition to the RGB values at the pixels that are stored in the form of three separate lists called R, G, and B, the shapes themselves are stored in the form of an array of masks, each of size 64x64, with each mask array representing a particular shape. For illustration, the rectangle shape is represented by the first such array. And so on.

BUGS:

Please notify the author if you encounter any bugs. When sending email, please place the string 'DLStudio' in the subject line to get past the author's spam filter.

ABOUT THE AUTHOR:

The author, Avinash Kak, is a professor of Electrical and Computer Engineering at Purdue University. For all issues related to this module, contact the author at kak@purdue.edu. If you send email, please place the string "DLStudio" in your subject line to get past the author's spam filter.

COPYRIGHT:

Python Software Foundation License

Copyright 2020 Avinash Kak

Imported Modules

torch.nn.functional
PIL.ImageFilter
copy
gzip
math

torch.nn
numpy
numbers
torch.optim
os

pickle
matplotlib.pyplot
pymsgbox
random
re

sys
torch
torchvision
torchvision.transforms

Classes

__builtin__.object

DLStudio

class DLStudio(__builtin__.object)

Methods defined here:

__init__(self, *args, **kwargs)

build_convo_layers(self, configs_for_all_convo_layers)

build_fc_layers(self)

check_a_sampling_of_images(self): Displays the first batch_size number of images in your dataset.

display_tensor_as_image(self, tensor, title=''): This method converts the argument tensor into a photo image that you can display in your terminal screen. It can convert tensors of three different shapes into images: (3,H,W), (1,H,W), and (H,W), where H, for height, stands for the number of pixels in the vertical direction and W, for width, for the same along the horizontal direction. When the first element of the shape is 3, that means that the tensor represents a color image in which each pixel in the (H,W) plane has three values for the three color channels. On the other hand, when the first element is 1, that stands for a tensor that will be shown as a grayscale image. And when the shape is just (H,W), that is automatically taken to be for a grayscale image.
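The shape rules described above amount to a small dispatch on the tensor's shape. The sketch below is only an illustration of that rule using a plain tuple in place of a tensor shape; the helper name image_mode is hypothetical and is not part of the DLStudio module.

```python
def image_mode(shape):
    """Classify a tensor shape as 'color' or 'grayscale' for display purposes."""
    if len(shape) == 3 and shape[0] == 3:
        return "color"        # (3,H,W): three color channels per pixel
    if len(shape) == 3 and shape[0] == 1:
        return "grayscale"    # (1,H,W): a single channel
    if len(shape) == 2:
        return "grayscale"    # (H,W): taken to be a grayscale image
    raise ValueError(f"unsupported tensor shape: {shape}")

for shape in [(3, 32, 32), (1, 32, 32), (32, 32)]:
    print(shape, "->", image_mode(shape))
```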

imshow(self, img): called by display_tensor_as_image() for displaying the image

load_cifar_10_dataset(self): We make sure that the transformations applied to the images end with the images being normalized. Consider this call to normalize: "Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))". The three numbers in the first tuple affect the means in the three color channels and the three numbers in the second tuple affect the standard deviations. In this case, we want the image value in each channel to be changed to:

    image_channel_val = (image_channel_val - mean) / std

So with mean and std both set to 0.5 for all three channels, if the image tensor originally was between 0 and 1.0, after this normalization, the tensor will be between -1.0 and +1.0. If needed we can do inverse normalization by

    image_channel_val = (image_channel_val * std) + mean
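The two formulas can be checked on individual channel values with a few lines of plain Python. The function names below are hypothetical and the sketch operates on single floats rather than on tensors, so it only illustrates the arithmetic, not the Torchvision call itself.

```python
def normalize(value, mean=0.5, std=0.5):
    """Forward normalization: (value - mean) / std."""
    return (value - mean) / std

def denormalize(value, mean=0.5, std=0.5):
    """Inverse normalization: value * std + mean."""
    return value * std + mean

for v in (0.0, 0.25, 1.0):
    n = normalize(v)
    print(f"{v} -> {n} -> {denormalize(n)}")
```

With mean and std both at 0.5, the endpoints 0 and 1.0 map to -1.0 and +1.0, and applying the inverse recovers the original value.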

load_cifar_10_dataset_with_augmentation(self): In general, we want to do data augmentation for training:

parse_config_string_for_convo_layers(self): Each collection of 'n' otherwise identical layers in a convolutional network is specified by a string that looks like:

    "nx[a,b,c,d]-MaxPool(k)"

where

    n   = num of this type of convo layer
    a   = number of out_channels [in_channels determined by prev layer]
    b,c = kernel for this layer is of size (b,c) [b along height, c along width]
    d   = stride for convolutions
    k   = maxpooling over kxk patches with stride of k

Example:

    "n1x[a1,b1,c1,d1]-MaxPool(k1) n2x[a2,b2,c2,d2]-MaxPool(k2)"
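One way to parse a string in this format is sketched below. It is not the module's own parser: the function name parse_config, the dictionary keys, and the numeric values in the sample string are all illustrative assumptions, and the sketch requires every layer group to carry a MaxPool(k) suffix as in the format shown above.

```python
import re

# One layer group: "nx[a,b,c,d]-MaxPool(k)"
LAYER_RE = re.compile(r"^(\d+)x\[(\d+),(\d+),(\d+),(\d+)\]-MaxPool\((\d+)\)$")


def parse_config(config):
    """Parse whitespace-separated layer groups into a list of dicts."""
    layers = []
    for token in config.split():
        match = LAYER_RE.match(token)
        if match is None:
            raise ValueError("bad layer spec: %r" % token)
        n, a, b, c, d, k = (int(g) for g in match.groups())
        layers.append({
            "count": n,          # n: how many identical layers
            "out_channels": a,   # a: number of output channels
            "kernel": (b, c),    # (b,c): kernel height and width
            "stride": d,         # d: convolution stride
            "maxpool": k,        # k: pooling over kxk patches, stride k
        })
    return layers


# Sample values chosen only for illustration.
parsed = parse_config("1x[128,3,3,1]-MaxPool(2) 2x[16,5,5,1]-MaxPool(2)")
for layer in parsed:
    print(layer)
```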

plot_loss(self)

run_code_for_testing(self, net)

run_code_for_training(self, net)

save_model(self, model): Save the trained model to a disk file

show_network_summary(self, net)

Data descriptors defined here:

__dict__: dictionary for instance variables (if defined)

__weakref__: list of weak references to the object (if defined)

Data and other attributes defined here:

AutogradCustomization = <class 'DLStudio.AutogradCustomization'>: This class illustrates how you can extend the functionality of Autograd by following the instructions posted at https://pytorch.org/docs/stable/notes/extending.html

CustomDataLoading = <class 'DLStudio.CustomDataLoading'>: This is a testbed for experimenting with a completely ground-up attempt at designing a custom data loader. Ordinarily, if the basic format of how the dataset is stored is similar to one of the datasets that the Torchvision module knows about, you can go ahead and use that for your own dataset. At worst, you may need to carry out some light customizations depending on the number of classes involved, etc. However, if the underlying dataset is stored in a manner that does not look like anything in Torchvision, you have no choice but to supply all of the data loading infrastructure yourself. That is what this inner class of the DLStudio module is all about.

The custom data loading exercise here is related to a dataset called PurdueShapes5 that contains 32x32 images of binary shapes belonging to the following five classes:

    1. rectangle
    2. triangle
    3. disk
    4. oval
    5. star

The dataset was generated by randomizing the sizes and the orientations of these five patterns. Since the patterns are rotated with a very simple non-interpolating transform, just the act of random rotations can introduce boundary and even interior noise in the patterns.

Each 32x32 image is stored in the dataset as the following list:

    [R, G, B, Bbox, Label]

where

    R     : a 1024-element list of the values for the red component of the color at all the pixels
    G     : the same as above but for the green component of the color
    B     : the same as above but for the blue component of the color
    Bbox  : a list like [x1,y1,x2,y2] that defines the bounding box for the object in the image
    Label : the shape of the object

I serialize the dataset with Python's pickle module and then compress it with the gzip module.

You will find the following dataset archives in the "data" subdirectory of Examples in the DLStudio distro:

    PurdueShapes5-10000-train.gz
    PurdueShapes5-1000-test.gz
    PurdueShapes5-20-train.gz
    PurdueShapes5-20-test.gz

The number that follows the main name string "PurdueShapes5-" is the number of images in the dataset. You will find the last two datasets, with 20 images each, useful for debugging your logic for object detection and bounding-box regression.
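The per-image record format and the pickle-plus-gzip serialization can be sketched with the standard library alone. Several things here are assumptions made for illustration: the pixel values and bounding box are synthetic, the label is stored as a string, the records are wrapped in a plain list, and the 1024 values of each channel are taken to be in row-major order. The top-level layout of the distributed .gz files is not described above, so this sketch does not claim to match it.

```python
import gzip
import pickle

H = W = 32

# A synthetic all-red image (hypothetical data, 1024 values per channel).
R = [255] * (H * W)
G = [0] * (H * W)
B = [0] * (H * W)
record = [R, G, B, [4, 6, 20, 25], "rectangle"]   # [R, G, B, Bbox, Label]

# Serialize with pickle, then compress with gzip (in memory here).
blob = gzip.compress(pickle.dumps([record]))

# Reverse the two steps to get the records back.
dataset = pickle.loads(gzip.decompress(blob))


def channel_to_rows(flat, width=W):
    # Assumes row-major order: consecutive runs of `width` values are rows.
    return [flat[i:i + width] for i in range(0, len(flat), width)]


r, g, b, bbox, label = dataset[0]
red_plane = channel_to_rows(r)
print(len(red_plane), len(red_plane[0]), bbox, label)
```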

DetectAndLocalize = <class 'DLStudio.DetectAndLocalize'>: The purpose of this inner class is to focus on object detection in images --- as opposed to image classification. Most people would say that object detection is a more challenging problem than image classification because, in general, the former also requires localization. The simplest interpretation of what is meant by localization is that the code that carries out object detection must also output a bounding-box rectangle for the object that was detected. You will find in this inner class some examples of LOADnet classes meant for solving the object detection and localization problem. The acronym "LOAD" in "LOADnet" stands for "LOcalization And Detection". The different network examples included here are LOADnet1, LOADnet2, and LOADnet3. For now, only pay attention to LOADnet2 since that's the class I have worked with the most for the 1.0.7 distribution.

ExperimentsWithCIFAR = <class 'DLStudio.ExperimentsWithCIFAR'>

ExperimentsWithSequential = <class 'DLStudio.ExperimentsWithSequential'>: Demonstrates how to use the torch.nn.Sequential container class

Net = <class 'DLStudio.Net'>

SemanticSegmentation = <class 'DLStudio.SemanticSegmentation'>: The purpose of this inner class is to be able to use the DLStudio module for experiments with semantic segmentation. At its simplest level, the purpose of semantic segmentation is to assign correct labels to the different objects in a scene, while localizing them at the same time. At a more sophisticated level, a system that carries out semantic segmentation should also output a symbolic expression based on the objects found in the image and their spatial relationships with one another. The workhorse of this inner class is the mUnet network that is based on the UNET network that was first proposed by Ronneberger, Fischer and Brox in the paper "U-Net: Convolutional Networks for Biomedical Image Segmentation". Their Unet extracts binary masks for the cell pixel blobs of interest in biomedical images. The output of their Unet can therefore be treated as a pixel-wise binary classifier at each pixel position. The mUnet class, on the other hand, is intended for segmenting out multiple objects simultaneously from an image. [A weaker reason for "Multi" in the name of the class is that it uses skip connections not only across the two arms of the "U", but also along the arms. The skip connections in the original Unet are only between the two arms of the U.] In mUnet, each object type is assigned a separate channel in the output of the network. This version of DLStudio also comes with a new dataset, PurdueShapes5MultiObject, for experimenting with mUnet. Each image in this dataset contains a random number of selections from five different shapes, with the shapes being randomly scaled, oriented, and located in each image. The five different shapes are: rectangle, triangle, disk, oval, and star.
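The idea of one output channel per object type can be illustrated with a toy example. The integer label map used below (0 for background, 1 to 5 for the five shapes) is a hypothetical encoding chosen for this sketch; it is not the storage format of PurdueShapes5MultiObject, and the function name masks_from_label_map is likewise illustrative.

```python
SHAPES = ["rectangle", "triangle", "disk", "oval", "star"]


def masks_from_label_map(label_map, num_classes=len(SHAPES)):
    """Split an integer label map into one binary mask per shape.

    label_map[i][j] is 0 for background or k (1..num_classes) for the
    k-th shape. The result has shape (num_classes, H, W).
    """
    return [[[1 if pixel == k else 0 for pixel in row] for row in label_map]
            for k in range(1, num_classes + 1)]


# A tiny 2x3 "image": rectangle pixels (1), one disk pixel (3), one star pixel (5).
label_map = [[0, 1, 1],
             [5, 0, 3]]
masks = masks_from_label_map(label_map)

for name, mask in zip(SHAPES, masks):
    print(name, mask)
```

With this layout, each channel can be read as a pixel-wise binary classifier for a single shape, which is how the multi-object output generalizes the binary mask of the original Unet.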

SkipConnections = <class 'DLStudio.SkipConnections'>: This educational class is meant for illustrating the concepts related to the use of skip connections in neural networks. It is now well known that deep networks are difficult to train because of the vanishing gradients problem. What that means is that as the depth of the network increases, the loss gradients calculated for the early layers become more and more muted, which suppresses the learning of the parameters in those layers. An important mitigation strategy for addressing this problem consists of creating a CNN using blocks with skip connections. With the code shown in this inner class of the module, you can now experiment with skip connections in a CNN to see how a deep network with this feature might improve the classification results. As you will see in the code shown below, the network that allows you to construct a CNN with skip connections is named BMEnet. As shown in the script playing_with_skip_connections.py in the Examples directory of the distribution, you can easily create a CNN with arbitrary depth just by using the "depth" constructor option for the BMEnet class. The basic block of the network constructed by BMEnet is called SkipBlock which, very much like the BasicBlock in ResNet-18, has a couple of convolutional layers whose output is combined with the input to the block. Note that the value given to the "depth" constructor option for the BMEnet class does NOT translate directly into the actual depth of the CNN. [Again, see the script playing_with_skip_connections.py in the Examples directory for how to use this option.] The value of "depth" is translated into how many instances of SkipBlock to use for constructing the CNN.
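Why adding the block input back to its output helps with vanishing gradients can be seen in a toy scalar model. This is only a sketch under strong simplifying assumptions and has nothing to do with the actual BMEnet or SkipBlock code: each "layer" is a scalar map y = w*x, and the skip version is y = x + w*x, so the per-layer derivative is w without the skip and 1 + w with it.

```python
def input_gradient(depth, local_grad, skip):
    """Gradient of the output with respect to the input of a scalar chain.

    Without skip connections each layer multiplies the gradient by
    local_grad; with a skip connection (y = x + f(x)) the factor is
    1 + local_grad.
    """
    grad = 1.0
    for _ in range(depth):
        grad *= (1.0 + local_grad) if skip else local_grad
    return grad


plain = input_gradient(depth=20, local_grad=0.1, skip=False)
with_skip = input_gradient(depth=20, local_grad=0.1, skip=True)

print("plain chain:      %.3e" % plain)
print("with skip blocks: %.3e" % with_skip)
```

In the plain chain the gradient reaching the first layer shrinks geometrically with depth, while the identity path in each skip block keeps the per-layer factor from dropping below 1 in this toy model.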

		__author__ = 'Avinash Kak (kak@purdue.edu)'
		__copyright__ = '(C) 2020 Avinash Kak. Python Software Foundation.'
		__date__ = '2020-March-21'
		__url__ = 'https://engineering.purdue.edu/kak/distDLS/DLStudio-1.0.9.html'
		__version__ = '1.0.9'

Author
		Avinash Kak (kak@purdue.edu)