The "WK xx SL yy" entry shown for each index item
refers to Slide yy of the Week xx lecture.
For example, "WK 12 SL 10" in the very first entry
means Slide 10 of Avi Kak's Week 12 lecture.
A
Accenture says WK 12 SL 10
acceptance test WK 11 SL 2
action (RL) WK 15 SL 13
action (RL Vocab) WK 15 SL 49
activation function WK 3 SL 28
activation function nonlinearity WK 6 SL 1
AdaGrad (for Adaptive Gradients) WK 3 SL 113
Adam Optimizer WK 3 SL 117
Advantage Function (PPO) WK 15 SL 40
Adversarial Learning WK 11 SL 4
AdversarialLearning (DLStudio module) WK 4 SL 5
AdversarialLearning module code organization WK 11 SL 67
AdversarialLearning module of DLStudio WK 4 SL 30
Adversarial Loss (definition) WK 11 SL 127
Adversarial Loss (for Performance Improvement) WK 11 SL 128
Adversarial Loss (min-max criterion) WK 11 SL 127
Affine Homography WK 2 SL 44
affine parameters (BN) WK 6 SL 33
Affine vs. Projective Distortions WK 2 SL 45
agent (RL Vocab) WK 15 SL 49
agent's best policy (RL) WK 15 SL 58
all_annotations WK 7 SL 98
ALLIES (Prompting LLMs with Beam Search) WK 14 SL 85
alpha_beta_calc.py WK 11 SL 170
Amazon User Feedback Dataset WK 12 SL 12
anchor boxes (YOLO) WK 7 SL 104
Anchor-Positive pairs WK 9 SL 27
Andrew Barto and Richard Sutton WK 15 SL 1
ANN (Approximate Nearest Neighbor) WK 9 SL 64
annotation archive (annotation.p ) WK 7 SL 98
annotation.p (annotation archive) WK 7 SL 98
apply_positional_encoding() (Transformer) WK 13 SL 64
apply_tokenizer.py (babyGPT) WK 14 SL 22
Approximate Nearest Neighbor (ANN) WK 9 SL 64
architectural highlights of GPT-2 WK 14 SL 75
Architecture of BERT WK 14 SL 31
ArticleDataset class (babyGPT) WK 14 SL 95
ArticleGatherer class (babyGPT) WK 14 SL 94
aspect ratios for anchor boxes (YOLO) WK 7 SL 105
ASPP (Atrous Spatial Pyramid Pooling) WK 8 SL 15
atrous convolution WK 8 SL 10, WK 8 SL 50
atrous convolution (definition) WK 8 SL 11
atrous convolution (Rate 2) WK 8 SL 13
Atrous Spatial Pyramid Pooling (ASPP) WK 8 SL 15
attention head WK 13 SL 29
AttentionHead class (DLStudio) WK 13 SL 32
attention layer WK 13 SL 1
Attention Map (definition) WK 14 SL 2
attributes (PythonOO) WK 1 SL 31
attributes (system-supplied, Python) WK 1 SL 34
Austin Powers movies WK 7 SL 77
Autoencoder class (DLStudio) WK 16 SL 38
Autoencoder (root class for autoencoding) WK 16 SL 38
autoencoding WK 4 SL 3
Autoencoding inner class (DLStudio) WK 4 SL 27
Autograd WK 3 SL 91
Autograd (extending the module) WK 3 SL 95
Autograd module (PyTorch) WK 3 SL 3
automatic garbage collection (Python) WK 1 SL 54
automatic translation WK 13 SL 1
Automation Levels WK 4 SL 12
autoregressive calculation of attention WK 14 SL 2
autoregressive language model WK 14 SL 13
autoregressive masking WK 14 SL 68
Autoregressive Modeling WK 15 SL 11
autoregressive modeling (definition) WK 14 SL 66
Autoregressive Modeling in Latent Space WK 16 SL 90
autoregressive modeling with codebook learning WK 16 SL 82
Averaging Required by SGD WK 3 SL 61
axis dimensionality WK 5 SL 20
axis of a tensor WK 5 SL 20
B
babyGPT WK 14 SL 4
babyGPT components WK 14 SL 93
109_babygpt_tokenizer_49275.json WK 14 SL 96
112_babygpt_tokenizer_50002.json WK 14 SL 22
backpropagated loss WK 6 SL 1
backpropagation WK 3 SL 31
backpropagation (loss) WK 3 SL 3
backprop_and_update_params_one_neuron_model() WK 3 SL 57
backpropagation of loss for stride=2 WK 8 SL 45
backward() WK 7 SL 2
Base Vocab (Tokenization) WK 14 SL 18
BasicDecoderWithMasking class (DLStudio) WK 13 SL 55
BasicEncoder class (DLStudio) WK 13 SL 46
basic idea of gating WK 12 SL 49
batch based estimation of loss WK 3 SL 61
batch-level histograms WK 2 SL 66
Batch Normalization (BN) WK 6 SL 31, WK 6 SL 4
BCELoss WK 7 SL 36
(B,C,H,W)-shape WK 2 SL 20
Beam Search for Prompting LLMs WK 14 SL 85
behaviorism in psychology WK 15 SL 1
BERT WK 14 SL 1, WK 14 SL 31
BERT_base WK 14 SL 33
BERT input format (Question, Answer) WK 14 SL 34
BERT input format (Sentence, Next Sentence) WK 14 SL 34
BERT_large WK 14 SL 33
BERT Network (a PyTorch implementation) WK 14 SL 39
BERT Pre-Training WK 14 SL 46
BERT Tokenizer WK 14 SL 41
B. F. Skinner WK 15 SL 1
bias (learnable) WK 3 SL 60
bidirectional (in what BERT stands for) WK 14 SL 2
BiLingual Evaluation Understudy (BLEU) WK 13 SL 72
Binary Cross Entropy Loss WK 7 SL 27, WK 7 SL 33, WK 7 SL 36
binary masks for objects WK 7 SL 81
bit-patterns for Unicode numbers WK 14 SL 18
BLEU metric (language translation) WK 13 SL 72
BMEnet WK 6 SL 11, WK 6 SL 20, WK 6 SL 6
BN (Batch Normalization) WK 6 SL 31, WK 6 SL 4
BookCorpus dataset (BERT) WK 14 SL 48
Boolean array WK 9 SL 53
bounding box WK 7 SL 40
bounding-box prediction WK 7 SL 61
bounding-box prediction (YOLO) WK 7 SL 105
BPE (Byte-Pair Encoding) WK 14 SL 28
BPE (Byte Pair Encoding) WK 14 SL 20
broadcasting (tensor) WK 9 SL 50
brute-force similarity search WK 9 SL 63
building-block class WK 6 SL 11
Byte-Pair Encoding (BPE) WK 14 SL 28
Byte Pair Encoding (BPE) WK 14 SL 20
C
calculating attention with Q,K,V WK 13 SL 19
callable WK 1 SL 6
callable instance WK 1 SL 26
callables vs. function objects WK 1 SL 24
calling a superclass method (PythonOO) WK 1 SL 61
calling base-class constructor (PythonOO) WK 1 SL 67
__call__() method WK 1 SL 24
Cartpole_DQL.v2.py WK 15 SL 78
causes of vanishing gradients WK 6 SL 48
cells (YOLO) WK 7 SL 104
centroids of Voronoi cells WK 9 SL 68
CGP (Computational Graph Primer) WK 3 SL 33, WK 3 SL 35
CGP (ComputationalGraphPrimer) WK 3 SL 4
cgp.gen_gt_dataset() WK 3 SL 45
cgp.parse_expressions() WK 3 SL 45
cgp.train_on_all_data() WK 3 SL 46
chains of dependencies (RNN) WK 12 SL 3
channels (color) WK 2 SL 8
Character Pair Encoding (CPE) WK 14 SL 20
checkpoint performance WK 13 SL 72
chr() for returning char for a Unicode number WK 14 SL 19
CIFAR-10 WK 7 SL 13
CIFAR10 dataset WK 1 SL 9
CIFAR-10 dataset WK 7 SL 31
class ArticleDataset (babyGPT) WK 14 SL 95
class ArticleGatherer (babyGPT) WK 14 SL 94
class AttentionHead (DLStudio) WK 13 SL 32
class Autoencoder (DLStudio) WK 16 SL 38
class BasicDecoderWithMasking (DLStudio) WK 13 SL 55
class BasicEncoder (DLStudio) WK 13 SL 46
class CrossAttention (DLStudio) WK 13 SL 50
class CrossAttentionHead (DLStudio) WK 13 SL 52
class (definition) WK 1 SL 29
class hierarchy (PythonOO) WK 1 SL 60
Classification Accuracy WK 7 SL 73
classification (extreme) WK 9 SL 3
classification token (BERT) WK 14 SL 36
class MasterDecoderWithMasking (babyGPT) WK 14 SL 98
class MasterDecoderWithMasking (DLStudio) WK 13 SL 59
class MasterEncoder (DLStudio) WK 13 SL 43
class method WK 1 SL 14
class PatchEmbeddingGenerator (visTransformer) WK 13 SL 102
class PositionalEmbedding (BERT) WK 14 SL 40
class PromptResponder (babyGPT) WK 14 SL 99
class-qualified syntax (PythonOO) WK 1 SL 47
class SegmentEmbedding (BERT) WK 14 SL 40
class SelfAttention (DLStudio) WK 13 SL 35
class TrainTokenizer (babyGPT) WK 14 SL 96
class TransformerFG (babyGPT) WK 14 SL 97
class variable WK 1 SL 56
class variables (PythonOO) WK 1 SL 36
class VectorQuantizer (DLStudio) WK 16 SL 75
class visTransformer (DLStudio) WK 13 SL 101, WK 13 SL 99
class VQGAN (DLStudio) WK 16 SL 96
CLIP WK 9 SL 11, WK 9 SL 153
CLIP (DLStudio experiments) WK 9 SL 160
CLIP loss WK 9 SL 156
clipped gradients (PPO) WK 15 SL 41
CLS (BERT classification token) WK 14 SL 36
clutter in images WK 7 SL 75
CNN (dual-inferencing) WK 7 SL 12
codebook (definition) WK 16 SL 1, WK 16 SL 6
Codebook Learning (definition) WK 16 SL 63
codebook learning for autoregressive modeling WK 16 SL 82
codebook vectors (representation) WK 16 SL 87
coding issue (Supervised Metric Learning) WK 9 SL 36
color channels WK 2 SL 8
commitment loss (VQVAE) WK 16 SL 67
Compose WK 2 SL 33
Compose, a PyTorch class WK 1 SL 6
computational graph WK 3 SL 31
computational graph (dynamic) WK 3 SL 3
Computational Graph Primer (CGP) WK 3 SL 33, WK 3 SL 35
ComputationalGraphPrimer (CGP) WK 3 SL 4
computational graph (static) WK 3 SL 3
computational graph: static vs dynamic WK 3 SL 93
conditional entropy WK 7 SL 25
confusion matrix WK 7 SL 73
confusion matrix and prediction accuracy (BMEnet) WK 6 SL 27
confusion matrix (Dr.EvalDataset) WK 7 SL 94
construct_dataframes_from_datafiles WK 12 SL 100
constructing random tensors WK 2 SL 55
constructor (default) WK 1 SL 29
constructor initializer WK 1 SL 36
container classes (PyTorch) WK 4 SL 13
Context Buffer (babyGPT) WK 14 SL 100
context window size in GPT-5.4 WK 14 SL 90
contiguous tensor WK 5 SL 61
continuous state space (RL) WK 15 SL 4
convergence of gradient descent WK 3 SL 109
convo2d() WK 5 SL 6
convo layer with 1x1 kernel WK 8 SL 15
convolution WK 5 SL 1
Convolution Arithmetic WK 8 SL 52
convolution as a matrix-vector product WK 8 SL 38
convolution (atrous) WK 8 SL 10, WK 8 SL 11, WK 8 SL 50
convolution (input specs) WK 5 SL 13
convolution (kernel) WK 5 SL 8
convolution (kernel specs) WK 5 SL 13
convolutions (multi-channel) WK 5 SL 41
convolution (transpose) WK 8 SL 37
convolution vs correlation WK 5 SL 2
convolution (with dilation) WK 8 SL 50
cost function WK 3 SL 1
cost-function surface and gradient descent WK 3 SL 107
cost of moving a pixel in a histogram WK 11 SL 31
CPE (Character Pair Encoding) WK 14 SL 20
creating a new instance (PythonOO) WK 1 SL 40
creating context for next input WK 12 SL 1
Critic for learning 1-Lipschitz function WK 11 SL 91, WK 11 SL 95
Critic-Generator pairs (WGAN) WK 11 SL 67
Critic-Generator pair (WGAN) WK 11 SL 88
Critic (WGAN) WK 11 SL 88
cross-attention WK 13 SL 2, WK 8 SL 30
CrossAttention class (DLStudio) WK 13 SL 50
cross attention (definition) WK 13 SL 4
CrossAttentionHead class (DLStudio) WK 13 SL 52
cross entropy (definition) WK 7 SL 21
cross-entropy loss WK 3 SL 10
Cross Entropy Loss WK 7 SL 27
cross-entropy loss (YOLO) WK 7 SL 111
cuda() WK 2 SL 59
CUDA (Compute-Unified Device Architecture) WK 2 SL 4
custom_data_loading.py (DLStudio) WK 4 SL 49
D
DAG (Directed Acyclic Graph) WK 3 SL 5
data augmentation WK 2 SL 3, WK 2 SL 38
data chunking (Time Series Data) WK 12 SL 92
data chunking (time-series prediction) WK 4 SL 40
DataFrame (Pandas) WK 12 SL 100
data generator (NCE demo) WK 9 SL 115
DataLoader (CGP) WK 3 SL 54
Dataloader for PurdueDrEvalDataset WK 7 SL 84
Dataloader for PurdueDrEvalMultiDataset WK 7 SL 101
Dataloader (PyTorch) WK 7 SL 58
dataloader requirements for InfoNCE experiments WK 9 SL 144
data normalization (time-series prediction) WK 4 SL 41
data_normalizer() WK 12 SL 105
Data Prediction WK 12 SL 90
data prediction WK 12 SL 3
DataPrediction (DLStudio module) WK 4 SL 5
DataPrediction module of DLStudio WK 4 SL 40
Dataset (PyTorch) WK 7 SL 58
DatasetServerForUnsupervised() (InfoNCE) WK 9 SL 145
datasets_for_AdversarialNetworks.tar.gz WK 11 SL 59, WK 4 SL 60
datasets_for_DLStudio.tar.gz (DLStudio) WK 4 SL 58
datasets_for_YOLO.tar.gz WK 7 SL 80
datasets (Map-style) WK 7 SL 58
datetime attribute (Time Series data) WK 12 SL 91
datetime conditioning for prediction WK 12 SL 99
dcgan_DG1.py (DLStudio) WK 4 SL 52
dcgan_DG2.py (DLStudio) WK 4 SL 52
DCGAN (DLStudio) WK 11 SL 61
DCGAN training loop WK 11 SL 73
DDPM (Denoising Diffusion Probabilistic Model) WK 11 SL 133
dealing with variable length input (RNN) WK 12 SL 19
decision tree introspection WK 9 SL 107
DecoderForAutoenc (inner class of Autoencoder) WK 16 SL 41
Decoder network WK 8 SL 4
deep fakes WK 11 SL 3
deep networks WK 6 SL 1
default constructor WK 1 SL 29
defining a method (PythonOO) WK 1 SL 47
__del__() (object destruction, PythonOO) WK 1 SL 54
demo_bert_tokenizer.py WK 14 SL 43
denoising (definition) WK 11 SL 132
denoising Diffusion WK 11 SL 132
Denoising Diffusion for Generative Modeling WK 11 SL 135
Denoising Diffusion Probabilistic Model (DDPM) WK 11 SL 133
denoising process WK 11 SL 6
depends_on WK 3 SL 42
derivatives of loss wrt learnable params WK 3 SL 60
deriving a class from a super class (PythonOO) WK 1 SL 38
Designing with Objects (DwO) WK 1 SL 2
destruction of instance objects (PythonOO) WK 1 SL 54
DetectAndLocalize (DLStudio) WK 7 SL 49
DetectAndLocalize inner class (DLStudio) WK 4 SL 26
DetectAndLocalize (inner class of DLStudio) WK 7 SL 61
diamond hierarchy (PythonOO) WK 1 SL 74
Dice Loss WK 8 SL 34
Dice Loss (a variant of IoU Loss) WK 8 SL 34
dictionary (namespace) WK 1 SL 43
__dict__ (Python) WK 1 SL 33
diff between numpy.zeros() and torch.zeros() WK 5 SL 37
differentiability of distance functions WK 11 SL 43, WK 11 SL 46, WK 11 SL 6
differentiability of JS-Divergence WK 11 SL 49
differentiability of KL-Divergence WK 11 SL 47
differentiability of Total Variation WK 11 SL 51
differentiability of Wasser Distance WK 11 SL 46
Diffusion WK 11 SL 4
diffusion (definition) WK 11 SL 132
diffusion process WK 11 SL 6
digrams (Language Modeling Simulator) WK 15 SL 25
dilation (convolution) WK 8 SL 50
dilation (nn.Conv2d constructor parameter) WK 8 SL 14
dimensionality along an axis of a tensor WK 5 SL 20
dimensionality reduction with VQ WK 9 SL 72
DIoU (Distance-IoU) WK 7 SL 44
DIoULoss (DLStudio) WK 7 SL 49
dir() WK 1 SL 38
Directed Acyclic Graph (DAG) WK 3 SL 5
Direct Preference Optimization (DPO) WK 15 SL 3
dir() (Python) WK 1 SL 33
Discounted Expected Rewards WK 15 SL 45
discrete state space (RL) WK 15 SL 4
discriminative training WK 14 SL 10
Discriminator (DCGAN) WK 11 SL 62
Discriminator-Generator pairs (DCGAN) WK 11 SL 67
discriminator (NCE demo) WK 9 SL 115, WK 9 SL 119
Discriminator 4-2-1 network (DCGAN) WK 11 SL 68
Discriminator (PatchGAN) WK 16 SL 98
Discriminator vs Generator WK 11 SL 5
Disentanglement learning WK 16 SL 17
distance between two probability distributions WK 11 SL 12
distance between two prob distributions WK 11 SL 6
Distance IoU (DIoU) WK 7 SL 44
distance (L_2) WK 9 SL 22
Distance Matrix WK 9 SL 44
Distance Matrix (naive approach) WK 9 SL 44
Distance Matrix (tensor based approach) WK 9 SL 46
distance (Wasser) WK 11 SL 88
distinctive and yet non-descriptive WK 9 SL 109
dist_{JS}(p,q) WK 11 SL 24
divergence between two prob distributions WK 11 SL 6
division by sqrt(M) normalization of Q.K^T WK 13 SL 20
DLStudio WK 4 SL 2
DLStudio (inner classes) WK 4 SL 4
DLStudio (inner classes vs modules) WK 4 SL 23
DLStudio platform WK 4 SL 22
DLStudio (specialized modules) WK 4 SL 5
DoSillyWithTensor (CGP) WK 3 SL 101
Dot-Product Attention (the basic idea) WK 13 SL 12
downsampler WK 6 SL 15
DPO (Direct Preference Optimization) WK 15 SL 3
Dr. Eval WK 7 SL 77
Dr. Eval Dataset WK 7 SL 77
Dr_Eval house watertower (OOI) WK 7 SL 97
Dr. Evil vs Dr. Eval WK 7 SL 77
dual-inferencing CNN WK 7 SL 12
Dynamic Computational Graph WK 3 SL 93
dynamic computational graph WK 3 SL 3
E
E_A and E_B input tokens (BERT) WK 14 SL 36
Earth Mover's Distance WK 11 SL 27
Earth Mover's Distance (EMD) WK 11 SL 27
ELBO (Evidence Lower BOund) WK 16 SL 22
elements b and h (yolo_vector) WK 7 SL 110
embeddings WK 9 SL 1, WK 9 SL 16
embeddings (based on ResNet-50) WK 9 SL 36
EmbeddingsGenerator() (Reward Modeling) WK 15 SL 33
embedding vectors WK 9 SL 1, WK 9 SL 16
embedding-vectors for text WK 12 SL 17
embedding (word) WK 13 SL 12
EMD (Earth Mover's Distance) WK 11 SL 27
encapsulation WK 1 SL 15
encapsulation issues (PythonOO) WK 1 SL 55
Encoder-Decoder architecture WK 8 SL 3
Encoder-Decoder architecture (Codebook Learning) WK 16 SL 6
Encoder-Decoder Architecture (Transformer) WK 13 SL 37
EncoderForAutoenc (inner class of Autoencoder) WK 16 SL 39
EncoderRNN WK 1 SL 7
Encoding Positional (Transformer) WK 13 SL 61
en_es_corpus_for_seq2sq_learning_....tar.gz WK 4 SL 62
en_es_xformer_8_90000.tar.gz (DLStudio) WK 13 SL 7
English-to-Spanish translation WK 13 SL 1
English Wikipedia dataset (2.5 Billion words) WK 14 SL 48
entropy (definition) WK 7 SL 20
environment (RL Vocab) WK 15 SL 49
episode (RL Vocab) WK 15 SL 49
epoch WK 6 SL 26
essence of a training dataset WK 16 SL 17
estimating InfoNCE Loss WK 9 SL 149
Estimating the Loss (YOLO) WK 7 SL 114
eval mode WK 6 SL 36
evaluating multi-instance detector WK 7 SL 132
evaluation challenges (MIOD) WK 7 SL 132
Evidence Lower BOund (ELBO) WK 16 SL 22
example_for_InfoNCE_loss_unsupervised.py WK 9 SL 10
example_for_pairwise_contrastive_loss.py WK 4 SL 56, WK 9 SL 102
example_for_triplet_loss.py (DLStudio) WK 4 SL 56, WK 9 SL 102
ExamplesAdversarialLearning Directory (DLStudio) WK 4 SL 52
ExamplesDataPrediction Directory (DLStudio) WK 4 SL 54
Examples directory (DLStudio) WK 4 SL 49
ExamplesMetricLearning Directory (DLStudio) WK 4 SL 56
ExamplesSeq2SeqLearning Directory (DLStudio) WK 4 SL 53
ExamplesTransformers Directory (DLStudio) WK 4 SL 55
Exp class (CGP) WK 3 SL 65
explainability problem of neural networks WK 9 SL 107
exponential acceleration in gradient descent WK 3 SL 111
extending a class (PythonOO) WK 1 SL 60
extending Autograd WK 3 SL 95
extreme classification WK 9 SL 3
F
Facebook AI Similarity Search (FAISS) WK 9 SL 80
face recognition WK 2 SL 49, WK 9 SL 2
face verification WK 9 SL 2
FAISS (Facebook AI Similarity Search) WK 9 SL 80
faiss.IndexFlatL2() WK 9 SL 80
feature map WK 8 SL 16
feature maps WK 8 SL 37
find_best_ngram_and_update_word_tokens_dict WK 14 SL 27
fine detail in images WK 8 SL 5
finite difference method WK 3 SL 46
flattening a tensor WK 5 SL 60
flatten.py WK 5 SL 60
float WK 2 SL 29
Focal Loss WK 8 SL 34
Focal Loss (cross-entropy for unbalanced data) WK 8 SL 34
formulating the loss for VQVAE WK 16 SL 66
forward() (inherited from nn.Module) WK 4 SL 16
forward prop through one-neuron classifier WK 3 SL 58
Fredrik Lundh WK 2 SL 10
functional WK 4 SL 16
function call operator '()' WK 1 SL 25
function object WK 1 SL 6
function objects vs. callables WK 1 SL 24
function (stand-alone, PythonOO) WK 1 SL 37
function (user-defined, PythonOO) WK 1 SL 37
G
gain and bias parameters for normalization WK 6 SL 46
garbage collection (Python) WK 1 SL 54
Gartner says WK 12 SL 10
Gated Recurrent Unit (GRU) WK 12 SL 4, WK 12 SL 48, WK 12 SL 53, WK 12 SL 55
gate semantics (GRU) WK 12 SL 59
gating mechanisms WK 12 SL 4
Gating Mechanisms (mitigating vanishing grads) WK 12 SL 48
Gaussian distribution WK 11 SL 159
Gauss-Newton (GN) WK 3 SL 20
GD WK 3 SL 18
Generalized IoU (GIoU) WK 7 SL 44
generating latent vector for VAE Decoder WK 16 SL 34
generating new samples WK 11 SL 1
generating token sequences WK 15 SL 24
Generation 5 network (Semantic Segmentation) WK 8 SL 25
Generation 3 network (Semantic Segmentation) WK 8 SL 20
Generation 2 network (Semantic Segmentation) WK 8 SL 19
Generation 1 network (Semantic Segmentation) WK 8 SL 18
Generation 4 network (Semantic Segmentation) WK 8 SL 23
generative data modeling WK 11 SL 5
GenerativeDiffusion (DLStudio module) WK 4 SL 5
GenerativeDiffusion (module in DLStudio) WK 11 SL 175
Generative Diffusion module of DLStudio WK 4 SL 33
generative modeling with VAE WK 16 SL 18
generative training WK 14 SL 11
Generative vs. Discriminative Training WK 14 SL 10
Generator (DCGAN) WK 11 SL 62
Generator 4-2-1 network (DCGAN) WK 11 SL 68
Generator vs Discriminator WK 11 SL 5
gen_gt_dataset() WK 3 SL 45
gen_training_data(self) WK 3 SL 53
get and set methods (PythonOO) WK 1 SL 55
getdata() (Image) WK 7 SL 56
__getitem__() WK 7 SL 82
__getitem__() (Dataset) WK 7 SL 58
getpixel(x,y) WK 2 SL 15
GIoU (Generalized IoU) WK 7 SL 44
global minimum WK 3 SL 1
GPT-3 WK 14 SL 4
GPT-5.4 WK 14 SL 90
GPT-2 WK 14 SL 1, WK 14 SL 4
GPT-5 WK 14 SL 4
GPT-2 architectural highlights WK 14 SL 75
GPT-3 (In-Context Learning) WK 14 SL 83
GPT-3: Unsupervised Learning at Scale WK 14 SL 77
Gradient Descent WK 3 SL 1
gradient descent and cost-function surface WK 3 SL 107
Gradient Penalty for improving WGAN WK 11 SL 103
graph_based_dataflow.py WK 3 SL 34, WK 3 SL 40
groups option for multi-channel convolutions WK 5 SL 48
GRU API WK 12 SL 69
GRU Based Classes (DLStudio) WK 12 SL 62
GRU (Gated Recurrent Unit) WK 12 SL 4, WK 12 SL 48, WK 12 SL 53, WK 12 SL 55
GRU (Minimally Gated) WK 12 SL 95
GRUnet network WK 12 SL 63
GRU (what makes it frustrating) WK 12 SL 70
gzip.open() WK 7 SL 56
gzip (Python module) WK 7 SL 55
H
Hard Negative Mining WK 9 SL 30
head (Attention) WK 13 SL 29
hidden state and its time evolution WK 12 SL 50, WK 12 SL 52
hidden state initialization (RNN) WK 12 SL 25
hidden state (RNN) WK 12 SL 21
highly unbalanced training data WK 9 SL 3
High Resolution Computed Tomography (HRCT) WK 9 SL 108
hill climbing for policy-based RL WK 15 SL 98
histogramming (batch data) WK 2 SL 66
histogramming color channels WK 2 SL 53
histogramming (per-channel data) WK 2 SL 68
histogramming_the_image() WK 2 SL 53
histograms WK 11 SL 35
homogeneous coordinates WK 2 SL 41
homographies for image transformations WK 2 SL 40
HRCT (High Resolution Computed Tomography) WK 9 SL 108
I
IBM says WK 12 SL 10
ICS (Internal Covariate Shift) WK 6 SL 3
identity based similarity WK 9 SL 3
illumination angle dependence WK 2 SL 49
illumination issues (data augmentation) WK 2 SL 38
Image WK 2 SL 11
ImageChops WK 2 SL 11
Image class (PIL) WK 7 SL 56
Image Datasets (DLStudio) WK 4 SL 58
ImageDraw WK 2 SL 11
ImageFilter WK 2 SL 11
ImageFont WK 2 SL 11
Image.open() WK 2 SL 52
image processing with Torchvision WK 2 SL 51
image recognition with visTransformer WK 13 SL 93
image_recog_with_visTransformer.py WK 13 SL 103
image_recog_with_visTransformer.py (DLStudio) WK 13 SL 7
image retrieval (CLIP) WK 9 SL 183
images.max() WK 2 SL 26
images.min() WK 2 SL 25
Image Synthesis WK 11 SL 5
ImageTk WK 2 SL 11
image_to_image_retrieval_with_clip.py WK 9 SL 168
image upsampling WK 8 SL 51
implementing Wasser distance WK 11 SL 90
importance of scale in unsupervised learning WK 14 SL 79
improving search-engine performance with RL WK 15 SL 102
In-Context Learning (GPT-3) WK 14 SL 83
incorporate human preferences WK 15 SL 1
index_dataset() WK 7 SL 83
indexer (Faiss) WK 9 SL 80
Index Set (VQ) WK 9 SL 73
infimum supremum (Wasser) WK 11 SL 38
InfoNCE and patchNCE Losses WK 9 SL 127
InfoNCE Loss WK 9 SL 128, WK 9 SL 132
inheritance WK 1 SL 16
IN (Instance Normalization) WK 6 SL 38, WK 6 SL 5
__init__() (PythonOO) WK 1 SL 36, WK 1 SL 40
inner classes (DLStudio) WK 4 SL 26, WK 4 SL 4
input data chunking (Time Series Data) WK 12 SL 92
input formatting (BERT) WK 14 SL 34
input/output size shapes (nn.ConvTranspose) WK 11 SL 69
input specs (convolution) WK 5 SL 13
input (vectorizing) WK 8 SL 40
instance method WK 1 SL 14
Instance Normalization (IN) WK 6 SL 38, WK 6 SL 5
instance variable WK 1 SL 14
integer-index based representation for text WK 12 SL 18
Internal Covariate Shift (ICS) WK 6 SL 3
Inter-Pixel Attention with QKV WK 13 SL 106
Intersection-over-Union (IoU) WK 7 SL 43
intra-batch interference WK 6 SL 5
inverse data normalization (Time Series) WK 12 SL 93
Inverted Index for a dataset (VQ) WK 9 SL 73
IoU_calculator WK 7 SL 148
IoU (Intersection-over-Union) WK 7 SL 43
IoU Loss WK 7 SL 43
is_contiguous() (predicate) WK 5 SL 64
isotropic Gaussian WK 11 SL 135
isotropic Gaussian noise WK 11 SL 6
__iter__ WK 1 SL 83
iterable class instance (PythonOO) WK 1 SL 83
iterative processing with loops WK 9 SL 37
J
Jacobian WK 3 SL 16, WK 3 SL 17
Jensen-Shannon (JS) WK 11 SL 24
Jensen-Shannon (JS) Divergence and Distance WK 11 SL 23
JS (Jensen-Shannon) WK 11 SL 24
K
Kaggle Power-Load Dataset WK 12 SL 108
KD-Tree WK 9 SL 64
kernel (convolution) WK 5 SL 8
kernel specs (convolution) WK 5 SL 13
kernel (vectorizing) WK 8 SL 39
KL Divergence WK 11 SL 17
KL-Divergence formula for Gaussian distros WK 16 SL 25
KL-Divergence (VAE Loss) WK 16 SL 22
K-Means algorithm WK 9 SL 71
K-Means for initializing the prototype (InfoNCE) WK 9 SL 147
K-Means (InfoNCE) WK 9 SL 146
L
LAION WK 9 SL 160
L1 and L2 norms WK 7 SL 39
Lang Modeling Simulator for Reward Prediction WK 15 SL 23
Large Language Modeling (LLM) WK 14 SL 1
Latent Diffusion Models WK 11 SL 5
Latent Space WK 16 SL 17
Latent Space Autoregressive Modeling WK 16 SL 90
latent-space diffusion WK 11 SL 132
Layer Normalization (LN) WK 6 SL 42, WK 6 SL 5
L_2 distance WK 9 SL 22
leads_to WK 3 SL 42
learnable bias WK 3 SL 60
learnable matrices W_q, W_k, W_v WK 13 SL 14
learnable parameters WK 3 SL 27
learnable tensors W_Q, W_K, W_V (attention) WK 13 SL 16
learning a policy (RL) WK 15 SL 13
learning image and text embeddings together WK 9 SL 153
learning rate WK 3 SL 16
learning-rate warm-up WK 13 SL 75
learning (self-supervised) WK 9 SL 5
learning without supervision WK 14 SL 1
lemmatization WK 14 SL 16
Levels of Automation WK 4 SL 12
levels of continuity properties (Wasser) WK 11 SL 40
Levenberg-Marquardt (LM) WK 3 SL 20
1-Lipschitz continuous function (Wasser) WK 11 SL 91
Lipschitz function (Wasser) WK 11 SL 40
LLM as environment (RL) WK 15 SL 13
LLM (Large Language Modeling) WK 14 SL 1
LM (Levenberg-Marquardt) WK 3 SL 20
LN (Layer Normalization) WK 6 SL 42, WK 6 SL 5
L_n (loss for negative pairs) WK 9 SL 24
L_2 norm WK 9 SL 17
LOADnet WK 7 SL 61
LOADnet2 class for object detection WK 7 SL 88
LOADnet2 (DLStudio) WK 7 SL 62
Locality Sensitive Hashing (LSH) WK 9 SL 64
local minimum WK 3 SL 20
log-likelihood lower bound (VAE) WK 16 SL 23
log-likelihood of input at output WK 16 SL 22
LogSoftmax WK 7 SL 29
Long Short-Term Memory (LSTM) WK 12 SL 48, WK 12 SL 53
Loss (Adversarial) WK 11 SL 127
Loss (Adversarial, VQGAN) WK 16 SL 87
loss (backpropagated) WK 6 SL 1
loss backpropagation WK 3 SL 3
loss backprop using transpose convo WK 8 SL 43
Loss (backprop when stride > 1) WK 8 SL 45
loss.backward() WK 3 SL 92
Loss (batch based estimation) WK 3 SL 61
Loss (Binary Cross Entropy) WK 7 SL 27, WK 7 SL 36
Loss (calling backward()) WK 3 SL 92
Loss (CLIP) WK 9 SL 156
Loss (Cross Entropy) WK 3 SL 10, WK 7 SL 27
Loss (cross-entropy for YOLO) WK 7 SL 111
Loss (Denoising Diffusion) WK 11 SL 143
Loss (derivatives wrt learnable params) WK 3 SL 60
Loss (Dice) WK 8 SL 34
Loss (DIoULoss) WK 7 SL 49
Loss (estimating InfoNCE Loss) WK 9 SL 149
Loss (estimation for YOLO) WK 7 SL 114
Loss (estimation without for-loops) WK 7 SL 115
Loss (Focal) WK 8 SL 34
Loss (for learning in denoising p-chain) WK 11 SL 153
Loss Function for training a VAE WK 16 SL 22
Loss (InfoNCE) WK 9 SL 128, WK 9 SL 132
Loss (IoU) WK 7 SL 43
Loss (KL-Divergence) WK 16 SL 22, WK 16 SL 24
Loss (KL-Divergence in denoising p-chain) WK 11 SL 153
loss_kld (PyTorch for VAE Loss) WK 16 SL 27
loss_kld (VAE) WK 16 SL 24
Loss (negative pairs) WK 9 SL 24
Loss (nn.BCELoss for YOLO) WK 7 SL 113
Loss (nn.CrossEntropyLoss for YOLO) WK 7 SL 113
Loss (nn.MSELoss for YOLO) WK 7 SL 113
Loss (nn.NLLLoss()) WK 7 SL 29
Loss (Pairwise Contrastive) WK 4 SL 45, WK 9 SL 18, WK 9 SL 36, WK 9 SL 5
Loss (partials wrt pre-activation values) WK 6 SL 50
Loss (PatchNCE) WK 9 SL 136
Loss (positive pairs) WK 9 SL 22
loss_recon (PyTorch for VAE Loss) WK 16 SL 29
Loss (Reconstruction) WK 16 SL 24
loss_reconstruction (VAE) WK 16 SL 24
Loss (Regression) WK 7 SL 39
Loss (Reward Modeling) WK 15 SL 20
Loss (SAM) WK 8 SL 34
loss surface WK 3 SL 12, WK 3 SL 25
Loss (torch.nn.L1Loss) WK 7 SL 39
Loss (torch.nn.MSELoss) WK 7 SL 39
Loss (Triplet) WK 4 SL 45, WK 9 SL 27, WK 9 SL 5
Loss (VAE) WK 16 SL 22
Loss (visualization of the loss surface) WK 6 SL 62
Loss (VQGAN) WK 16 SL 88
Loss (VQVAE) WK 16 SL 66
Loss (VQVAE, commitment) WK 16 SL 67
Loss (VQVAE, quantization) WK 16 SL 67
Loss (with Sigmoid activation) WK 3 SL 62
Loss (with Transpose Convolution) WK 8 SL 43
L_p loss for positive pairs WK 9 SL 22
LSH (Locality Sensitive Hashing) WK 9 SL 64
LSTM (Long Short-Term Memory) WK 12 SL 48, WK 12 SL 53
M
Map-style datasets WK 7 SL 58
margin m WK 9 SL 24
margin (Pairwise Contrastive Loss) WK 9 SL 6
Markov-Chain Monte-Carlo (MCMC) WK 11 SL 2
Markov chain vs Markov process WK 11 SL 135
Markovian Assumption (RL Agent) WK 15 SL 56
Markov process WK 11 SL 6
Masked Language Modeling (BERT MLM) WK 14 SL 46
Masked Language Modeling (MLM) WK 14 SL 13
masking for autoregressive learning WK 14 SL 67
Mask (Triplet) WK 9 SL 52
MasterDecoderWithMasking class (babyGPT) WK 14 SL 98
MasterDecoderWithMasking class (DLStudio) WK 13 SL 59
MasterEncoder class (DLStudio) WK 13 SL 43
Matrix Calculus WK 3 SL 30
matrix-vector product (for convolution) WK 8 SL 38
max() WK 7 SL 32
maximization of TER WK 15 SL 14
Maximum-Likelihood classifier WK 8 SL 2
max-likelihood calculations (Language Modeling) WK 14 SL 17
max-likelihood image recovery in p-chain WK 11 SL 143
max_seq_length limit in BERT WK 14 SL 35
max_seq_length (Transformer) WK 13 SL 39
MCMC (Markov-Chain Monte-Carlo) WK 11 SL 2
measuring checkpoint performance WK 13 SL 72
merge rules (tokenizer) WK 14 SL 22
method WK 1 SL 13
method definition outside class (PythonOO) WK 1 SL 49
method (formal definition) WK 1 SL 22
Metric learning WK 9 SL 1
MetricLearning (DLStudio module) WK 4 SL 5
MetricLearning module in DLStudio WK 4 SL 45
metric learning (unsupervised) WK 9 SL 10
Metropolis-Hastings algorithm WK 11 SL 3
MI (Mutual Information) WK 9 SL 128, WK 9 SL 132
Minimally Gated GRU WK 12 SL 95
minimization of loss for reward modeling WK 15 SL 20
mining (Hard Negative) WK 9 SL 30
mining (Metric Learning) WK 9 SL 5
mining without for-loops WK 9 SL 41
mining with simple Boolean logic WK 9 SL 41
MIOD (definition) WK 7 SL 104
MIOD (evaluation) WK 7 SL 132
MIOD (Multi-Instance Object Detection) WK 8 SL 1
MIOD (testing) WK 7 SL 139
misconceptions (tokenization) WK 14 SL 16
mitigation (vanishing gradients) WK 6 SL 2
ML Classifier WK 8 SL 2
MLM (Masked Language Modeling) WK 14 SL 13, WK 14 SL 46
modeling neighborhoods with conditional probs WK 9 SL 86
modeling the reward (PPO) WK 15 SL 16
Module class (PyTorch) WK 4 SL 16
momentum WK 3 SL 109
__mro__ (PythonOO) WK 1 SL 75
MSELoss WK 7 SL 39
multi-channel convolutions WK 5 SL 41
multi_gaussian_source() WK 11 SL 106
Multi-Headed Attention WK 13 SL 27
Multi-Instance Object Detection WK 7 SL 104
Multi-Modality Embedding Space WK 9 SL 153
multi_neuron_classifier.py WK 3 SL 4
multi_neuron_classifier.py WK 3 SL 64
multi-object detection WK 7 SL 3
multiple-inheritance class hierarchy (PythonOO) WK 1 SL 74
multi-scale representation WK 8 SL 14
mUnet (building blocks) WK 8 SL 71
mUnet (DLStudio) WK 8 SL 5
mUnet for semantic segmentation WK 8 SL 76
mUnet Network (semantic segmentation) WK 8 SL 71
Mutual Information (MI) WK 9 SL 127
N
namespace dictionary WK 1 SL 43
NCE_for_learning_point_distro.py WK 9 SL 120
NCE (Noise Contrastive Estimation) WK 9 SL 111
Nearest Neighbor search (NN) WK 9 SL 63
Negative Log Likelihood Loss (NLLLoss) WK 12 SL 30
NetForYolo (YOLOLogic) WK 7 SL 121
networks with feedback WK 12 SL 1
__new__() (PythonOO) WK 1 SL 40
Next Sentence Prediction (NSP) WK 14 SL 12
next_serial_num WK 1 SL 56
N_for_neg_samples_for_each_pos (NCE demo) WK 9 SL 115
NLLLoss WK 7 SL 29
NLLLoss() WK 12 SL 28
nn.BCELoss WK 7 SL 36
nn.BCELoss (YOLO) WK 7 SL 113
nn.Conv2d WK 5 SL 29
nn.CrossEntropyLoss WK 12 SL 30
nn.CrossEntropyLoss (YOLO) WK 7 SL 113
nn.functional WK 4 SL 16
nn.GRU API WK 12 SL 69
nn.LogSoftmax WK 12 SL 30
nn.Module class (PyTorch) WK 4 SL 16
nn.MSELoss WK 7 SL 39
nn.MSELoss (YOLO) WK 7 SL 113
NN (Nearest Neighbor) WK 9 SL 63
nn.NLLLoss() WK 12 SL 30
nn.Sequential WK 4 SL 13, WK 7 SL 88
nn.Sigmoid WK 3 SL 75
nn.Sigmoid (GRU update gate) WK 12 SL 56
nn.Softmax normalization of Q.K^T product WK 13 SL 20
nn.Softmax normalization (definition) WK 13 SL 21
nn.tanh (GRU reset gate) WK 12 SL 58
Noam Chomsky WK 15 SL 1
Noise Contrastive Estimation (NCE) WK 9 SL 111
noisy_object_detection_and_localization.py WK 4 SL 50, WK 7 SL 51
None in PyTorch array access syntax WK 9 SL 42, WK 9 SL 51
non-standard organization data WK 7 SL 83
nonzero() WK 2 SL 24
Normalize WK 2 SL 33
normally distributed floats in (0,1.0) interval WK 2 SL 56
notation: clutter-10 WK 7 SL 96
notation e_i for codebook vectors WK 16 SL 13
notation: noise-20 WK 7 SL 96
notation z_i for Encoder channel dim WK 16 SL 13
NSP (Next Sentence Prediction) WK 14 SL 12
numpy array WK 2 SL 16
numpy.asarray() WK 2 SL 52
numpy binary op for dissimilarly shaped args WK 9 SL 41
numpy.cumprod (cumulative products) WK 11 SL 170
numpy.ndarray WK 2 SL 4
numpy.random.seed() WK 2 SL 72
numpy.reshape() WK 3 SL 105
numpy.save() WK 2 SL 52
numpy.zeros() WK 5 SL 7
O
object detection and localization WK 4 SL 3
object_detection_and_localization_iou.py WK 7 SL 51
object_detection_and_localization.py WK 7 SL 51
object_detection_and_localization.py (DLStudio) WK 4 SL 49
object detection (multi-instance) WK 7 SL 104
object detection vs image classification WK 7 SL 1
Object Oriented Python WK 1 SL 1
Objects of Interest in Dr. Eval images WK 7 SL 79
obtaining reproducible results WK 2 SL 72
one-hot vectors for text WK 12 SL 18
one_neuron_classifier.py WK 3 SL 4, WK 3 SL 34, WK 3 SL 53
OpenCLIP WK 9 SL 160
Optimal Transport Theory WK 11 SL 31
optimization (step size) WK 3 SL 107
optimizer (Adam) WK 3 SL 117
optimum flow matrix (EMD) WK 11 SL 32
ord() and chr() in range 0-127 WK 14 SL 18
ord() for returning Unicode number for a char WK 14 SL 19
os.environ['PYTHONHASHSEED'] WK 2 SL 73
ost_training_cleanup() (babyGPT) WK 14 SL 29
overfitting to training data WK 13 SL 83
0xhhhh notation for Unicode numbers WK 14 SL 18
P
Pairwise Contrastive Loss WK 9 SL 5, WK 9 SL 18, WK 9 SL 36
Pandas (Python module) WK 12 SL 100
parameter hyperplane WK 3 SL 1, WK 3 SL 12
parametric data model WK 11 SL 1
parse_expressions() WK 3 SL 45
(partial x, partial y) displacements WK 7 SL 109
partitioning a token sequence WK 15 SL 26
PatchEmbeddingGenerator class (visTransformer) WK 13 SL 102
PatchGAN Discriminator (VQGAN) WK 16 SL 98
PatchNCE WK 9 SL 10
PatchNCE Loss WK 9 SL 136
pathnames to images WK 7 SL 83
p-chain (denoising) WK 11 SL 138
performance improvement with adversarial loss WK 11 SL 126
per-instance learning of Affine Parameters (IN) WK 6 SL 40
Perplexity (definition) WK 16 SL 70
Perplexity (roughly how many codebook vecs used) WK 16 SL 78
Person (a PythonOO class) WK 1 SL 30
phenomenon of sparse gradients WK 3 SL 112
pickle.dump() WK 7 SL 99
pickle.dumps() WK 7 SL 56
pickle (Python module) WK 7 SL 56
Pillow Fork (PIL) WK 2 SL 10
PIL (Python Imaging Library) WK 2 SL 10
pixel blobs WK 8 SL 10
pixel value normalization WK 2 SL 33
pixel value scaling WK 2 SL 21, WK 2 SL 27
playing_with_cifar10.py (DLStudio) WK 4 SL 49
playing_with_reconfig.py (DLStudio) WK 4 SL 49
playing_with_sequential.py (DLStudio) WK 4 SL 49
playing_with_skip_connections.py WK 6 SL 10, WK 6 SL 26
playing_with_skip_connections.py (DLStudio) WK 4 SL 49
pmGRU (poor man's GRU) WK 12 SL 4, WK 12 SL 97
pole balancing (RL) WK 15 SL 59
policy based methods (RL) WK 15 SL 5
policy-based RL WK 15 SL 97
policy-based RL and hill climbing WK 15 SL 98
policy-based RL and maximization of TER WK 15 SL 98
policy (definition) WK 15 SL 3
Policy Gradient methods (RL) WK 15 SL 5
policy-gradients (RL) WK 15 SL 101
Policy Network WK 15 SL 37
policy (RL Vocab) WK 15 SL 49
polymorphism WK 1 SL 16
poor man's GRU (DLStudio) WK 12 SL 97
poor man's GRU (pmGRU) WK 12 SL 4
population mean (BN) WK 6 SL 35
population standard-deviation (BN) WK 6 SL 35
Population Stats versus Batch Stats (BN) WK 6 SL 35
PositionalEmbedding class (BERT) WK 14 SL 40
Positional Encoding (Transformer) WK 13 SL 61
posterior distro WK 16 SL 15
power_load_prediction_with_pmGRU.py (DLStudio) WK 4 SL 54
PPO goal WK 15 SL 39
PPO (Proximal Policy Optimization) WK 15 SL 3, WK 15 SL 37
PQ (Product Quantization) WK 9 SL 64
PQ representation of an embedding vector WK 9 SL 78
pre-defined attributes WK 1 SL 20
pre-defined attributes (instance, PythonOO) WK 1 SL 32
pre-defined attributes (PythonOO) WK 1 SL 31
pre-defined methods WK 1 SL 20
predicting bounding-boxes WK 7 SL 61
predicting bounding-boxes (YOLO) WK 7 SL 105
predicting ethnic origin (RNN) WK 12 SL 23
prediction error vector WK 3 SL 16
pre-tokenization WK 14 SL 20
probabilistic model for data WK 11 SL 1
probabilities of pos and neg pairs WK 9 SL 43
probability distribution over token sequences WK 15 SL 10
probability distro over actions (RL) WK 15 SL 98
processing text with neural networks WK 12 SL 17
Product Quantization WK 9 SL 75
Product Quantization (PQ) WK 9 SL 64
programmer-supplied attribute WK 1 SL 20
programmer-supplied methods WK 1 SL 20
Projective Homography WK 2 SL 44
prompt completion probability WK 14 SL 84
prompt (definition) WK 15 SL 11
PromptResponder class (babyGPT) WK 14 SL 99
prompt_response_model WK 15 SL 28
prompt (RL) WK 15 SL 11
proposal distribution WK 11 SL 2
providers and consumers (EMD) WK 11 SL 32
Proximal Policy Optimization (PPO) WK 15 SL 3, WK 15 SL 37
'.pt' data archive (PyTorch) WK 7 SL 83
public interface WK 1 SL 15
pulmonary radiologists WK 9 SL 108
PurdueDrEvalDataset WK 7 SL 78
PurdueDrEvalMultiDataset WK 7 SL 96
PurdueDrEvalMultiDataset (Dataloader) WK 7 SL 101
PurdueShapes5 dataset WK 7 SL 2, WK 7 SL 51, WK 7 SL 55
PurdueShapes5Dataset WK 7 SL 59
PurdueShapes5GAN Dataset WK 11 SL 54
PurdueShapes5GAN-20000.tar.gz WK 11 SL 59
PurdueShapes5GAN-20000.tar.gz (DLStudio) WK 4 SL 60
PurdueShapes5MultiObject dataset WK 8 SL 80
PurdueShapes5-1000-test.gz WK 7 SL 54
PurdueShapes5-1000-test.gz (DLStudio) WK 4 SL 59
PurdueShapes5-1000-test-noise-20.gz (DLStudio) WK 4 SL 59
PurdueShapes5-10000-train.gz WK 7 SL 54
PurdueShapes5-10000-train.gz (DLStudio) WK 4 SL 59
PurdueShapes5-10000-train-noise-20.gz (DLStudio) WK 4 SL 59
pyramid representation of images WK 8 SL 15
Python Imaging Library (PIL) WK 2 SL 10
Python implementation of Q-Learning WK 15 SL 75
Python, Object Oriented WK 1 SL 1
Python OO WK 1 SL 1
Q
q-chain (diffusion) WK 11 SL 138
Q,K,V for calculating attention WK 13 SL 19
QKV for Inter-Pixel Attention WK 13 SL 106
Q,K,V tensors (attention) WK 13 SL 18
(Q,K,V) tensors for attention WK 13 SL 13
(q,k,v) vectors for attention WK 13 SL 13
Q-Learning for the Discrete Case (RL) WK 15 SL 68
Q-Learning (RL) WK 15 SL 5, WK 15 SL 61
Quality Index Q (RL) WK 15 SL 59
quantization loss (VQVAE) WK 16 SL 67
query (definition) WK 9 SL 62
(Question, Answer) format (BERT) WK 14 SL 34
R
ramifications of using Gaussian noise WK 11 SL 163
rand() WK 2 SL 56
randint() WK 2 SL 56
randn() WK 2 SL 56
random initialization of learnable params WK 14 SL 3
random noise vector (DCGAN) WK 11 SL 69
random.seed() WK 2 SL 72
random tensors WK 2 SL 3, WK 2 SL 55
random walk WK 11 SL 2
Rate 2 atrous convolution WK 8 SL 13
R-CNN WK 7 SL 4
Recurrent Neural Network (RNN) WK 12 SL 1
reducing the burden of supervised learning WK 14 SL 3
reduction (option for loss calculation) WK 7 SL 118
regression WK 7 SL 1
regression for bounding-box prediction WK 7 SL 61
regression for bounding-box prediction (YOLO) WK 7 SL 105
Regression Loss WK 7 SL 39
Regression Loss with torch.nn.MSELoss WK 7 SL 41
Reinforcement Learning from Human Feedback (RLHF) WK 15 SL 10
Reinforcement Learning (RL) WK 15 SL 2
relationship between variances (forward prop) WK 6 SL 51
ReLU() WK 3 SL 76
repackaging vs reshaping (tensor) WK 5 SL 17
reparameterize() (VAE) WK 16 SL 36
replication study (of BERT) WK 14 SL 51
representational vocabulary WK 9 SL 106
Representation Learning WK 9 SL 5
Representation Learning (definition) WK 9 SL 107
representation with embeddings WK 9 SL 16
representing an embedding with PQ WK 9 SL 78
representing codebook vectors WK 16 SL 87
representing human preference (RL) WK 15 SL 16
representing the bounding box WK 7 SL 40
requires_grad WK 2 SL 59, WK 4 SL 1
reset gate (GRU) WK 12 SL 55, WK 12 SL 58
reshape() WK 5 SL 53
reshaping a tensor WK 5 SL 55
ResNet WK 6 SL 6
reward (definition) WK 15 SL 15
Reward Modeling Loss WK 15 SL 20
Reward Modeling Network WK 15 SL 31
RewardNetwork (babyGPT) WK 15 SL 35
reward (RL Vocab) WK 15 SL 49
RL agent (stochastic) WK 15 SL 55
RLHF (Reinforcement Learning from Human Feedback) WK 15 SL 10
RL (Reinforcement Learning) WK 15 SL 2
RL with Continuous State Space WK 15 SL 84
RMSProp's modification to AdaGrad WK 3 SL 116
RNN (Recurrent Neural Network) WK 12 SL 1
RoBERTa (a Higher-Performance BERT) WK 14 SL 50
root class named 'object' (PythonOO) WK 1 SL 38
root class (PythonOO) WK 1 SL 38
run_autoencoder.py (DLStudio) WK 4 SL 50
run_gan_code() (AdversarialLearning) WK 11 SL 69
run_training_loop_multi_neuron_model() WK 3 SL 67
run_training_loop_one_neuron_model() WK 3 SL 57
run_vae_for_image_generation.py (DLStudio) WK 4 SL 50
run_vae.py (DLStudio) WK 4 SL 50
run_vqgan.py WK 16 SL 93
run_vqgan_transformer.py WK 16 SL 93
S
SAM architecture WK 8 SL 33
SAM (Loss Functions) WK 8 SL 34
SAM (Segment Anything Model) WK 8 SL 30
sanitizing an LLM WK 15 SL 2
saturating ports of nonlinearities WK 6 SL 3
scale and shift parameters (BN) WK 6 SL 33
scale invariance (Semantic Segmentation) WK 8 SL 10
scipy.signal.convolve2d() WK 5 SL 10
Scripting with Objects (SwO) WK 1 SL 2
Segment Anything Model (SAM) WK 8 SL 30
SegmentEmbedding class (BERT) WK 14 SL 40
self-attention WK 13 SL 2
SelfAttention class (DLStudio) WK 13 SL 35
self-attention (definition) WK 13 SL 4
self (keyword in PythonOO) WK 1 SL 47
self-supervised learning WK 9 SL 5, WK 14 SL 1
self-supervision for variational modeling WK 16 SL 20
semantic segmentation WK 4 SL 3
SemanticSegmentation inner class (DLStudio) WK 4 SL 27
semantic_segmentation.py (DLStudio) WK 4 SL 50
semantic segmentation with mUnet WK 8 SL 76
(Sentence-A, Sentence-B) input format (BERT) WK 14 SL 36
(Sentence, Next Sentence) format (BERT) WK 14 SL 34
Sentiment Analysis dataset WK 12 SL 5
Sentiment Analysis Network WK 12 SL 35
Sentiment Analytics WK 12 SL 5
sentiment_dataset_test_200.tar.gz WK 12 SL 15
sentiment_dataset_test_200.tar.gz (DLStudio) WK 4 SL 61
sentiment_dataset_test_40.tar.gz (DLStudio) WK 4 SL 61
sentiment_dataset_test_400.tar.gz WK 12 SL 15
sentiment_dataset_test_400.tar.gz (DLStudio) WK 4 SL 61
sentiment_dataset_train_200.tar.gz WK 12 SL 15
sentiment_dataset_train_200.tar.gz (DLStudio) WK 4 SL 61
sentiment_dataset_train_40.tar.gz (DLStudio) WK 4 SL 61
sentiment_dataset_train_400.tar.gz WK 12 SL 15
sentiment_dataset_train_400.tar.gz (DLStudio) WK 4 SL 61
Separator Token (Reward Modeling) WK 15 SL 34
SEP (separator input token BERT) WK 14 SL 35
seq2seq learning WK 13 SL 2
Seq2SeqLearning (DLStudio module) WK 4 SL 5
Seq2SeqLearning module of DLStudio WK 4 SL 36
seq2seq (sequence-to-sequence) WK 12 SL 20
seq2seq_with_learnable_embeddings.py (DLStudio) WK 4 SL 53
seq2seq_with_pretrained_embeddings.py WK 4 SL 53
seq2seq_with_transformerFG.py (DLStudio) WK 4 SL 55, WK 13 SL 7
seq2seq_with_transformerPreLN.py (DLStudio) WK 4 SL 55, WK 13 SL 7
sequence of states, actions, and rewards WK 15 SL 52
sequence-to-sequence learning WK 13 SL 1
sequence-to-sequence (seq2seq) learning WK 12 SL 20
Sequential WK 4 SL 13
serialization WK 7 SL 56
set of prompts (RL) WK 15 SL 11
set of sequence of tokens (RL) WK 15 SL 10
SGD (required averaging) WK 3 SL 61
SGD (Stochastic Gradient Descent) WK 3 SL 2, WK 3 SL 22
shape of a tensor WK 5 SL 20
shortcuts WK 6 SL 2
Shreeram Abhyankar WK 2 SL 40
Sigmoid WK 12 SL 50
Sigmoid (definition) WK 3 SL 75
Sigmoid (derivative) WK 3 SL 75
similarity (identity based) WK 9 SL 3
similarity search (definition) WK 9 SL 62
SimpleClass WK 1 SL 29
single_instance_detection.py WK 7 SL 88
Single Shot Detector WK 7 SL 6
size scaling issues (data augmentation) WK 2 SL 38
sketch_and_retrieve.py WK 9 SL 177
SkipBlock assumptions WK 6 SL 15
SkipBlock definition WK 6 SL 16
SkipBlock (DLStudio) WK 6 SL 6, WK 7 SL 61
SkipBlockDN (mUnet) WK 8 SL 72
SkipBlock downsampler WK 6 SL 23
SkipBlock for creating shortcuts WK 6 SL 18
SkipBlock naming conventions in BMEnet WK 6 SL 22
SkipBlockUP (mUnet) WK 8 SL 72
skip connections WK 6 SL 1
skip connections for recovering fine detail WK 8 SL 5
SkipConnections inner class (DLStudio) WK 4 SL 26
slice op with None WK 9 SL 42, WK 9 SL 51
sMAPE metric WK 12 SL 114
SNE WK 9 SL 84
SNE (Stochastic Neighbor Embedding) WK 9 SL 85
solving cart-pole problem with a neural network WK 15 SL 88
sparse gradients WK 3 SL 112
specialized modules (DLStudio) WK 4 SL 5
Specialized Modules of DLStudio WK 4 SL 30
squared norms off the diagonal (Distance Matrix) WK 9 SL 48
squeeze() WK 5 SL 25
SSD WK 7 SL 6
stabilizing transformer learning WK 13 SL 75
Stable Diffusion WK 11 SL 133
Stable Diffusion and CLIP text embeddings WK 11 SL 133
Stanford Online Products dataset WK 9 SL 4
state (RL Vocab) WK 15 SL 49
state-space discretization WK 15 SL 69
state transition (RL Vocab) WK 15 SL 49
Static Computational Graph WK 3 SL 93
static computational graph WK 3 SL 3
static method WK 1 SL 14
static method (PythonOO) WK 1 SL 40
static (PythonOO) WK 1 SL 56
static variables (PythonOO) WK 1 SL 36
steepest descent WK 3 SL 107
stemming WK 14 SL 16
step size and sparse gradients WK 3 SL 113
Step Size Optimization WK 3 SL 107
Stochastic Gradient Descent (SGD) WK 3 SL 2, WK 3 SL 22
Stochastic Neighbor Embedding (SNE) WK 9 SL 85
Stochastic RL Agent WK 15 SL 55
stride (conv2d() param) WK 5 SL 27
stride parameter for upsampling WK 8 SL 63
string.ascii_letters WK 12 SL 27
structured clutter in images WK 7 SL 76
stuffing yolo_vector in yolo_tensor WK 7 SL 111
subclassing a class (PythonOO) WK 1 SL 38
subquantizer (PQ) WK 9 SL 78
superclass WK 1 SL 16
super() for calling parent class method WK 1 SL 64
super() (PythonOO) WK 1 SL 60
Supervised Metric Learning WK 9 SL 27
super() with new style syntax (PythonOO) WK 1 SL 71
Symmetric Mean Absolute Percentage Error (sMAPE) WK 12 SL 114
system-supplied attributes (Python) WK 1 SL 34
T
tensor (definition) WK 2 SL 4
tensor.dtype WK 2 SL 57
tensor flattening WK 5 SL 60
Tensorflow WK 3 SL 93
tensor repackaging vs reshaping WK 5 SL 17
tensor reshaping WK 5 SL 55
tensor.shape WK 5 SL 17
tensor shape WK 5 SL 20
tensor.type() WK 2 SL 51, WK 2 SL 57
tensor.type() vs type() WK 2 SL 51
TER (Total Expected Reward) WK 15 SL 14, WK 15 SL 37, WK 15 SL 57
test_checkpoint_for_visTransformer.py WK 13 SL 103
test_checkpoint_for_visTransformer.py (DLStudio) WK 13 SL 7
testing MIOD WK 7 SL 139
text classification WK 4 SL 3, WK 12 SL 3
TextClassification inner class (DLStudio) WK 4 SL 27
text_classification_with_GRU.py (DLStudio) WK 4 SL 51
text_classification_with_TEXTnetOrder2.py WK 4 SL 50
text_classification_with_TEXTnet.py WK 4 SL 50
text_class ... _with_TEXTnetOrder2_word2vec.py WK 4 SL 51
text_class ... _with_TEXTnet_word2vec.py WK 4 SL 50
text_datasets_for_DLStudio.tar.gz (DLStudio) WK 4 SL 61
TEXTnet WK 12 SL 35
TEXTnetOrder2 (stepping stone to gating) WK 12 SL 41
TEXTnetOrder2 (TEXTnet with cells) WK 12 SL 45
text retrieval (CLIP) WK 9 SL 183
text_sample_for_testing.txt (babyGPT) WK 14 SL 22
time-complexity of brute-force search WK 9 SL 63
time evolution of hidden state WK 12 SL 52
Time Series Data WK 12 SL 90
time-series data prediction WK 4 SL 40
time-series forecasting WK 12 SL 1
Tokenization: Old vs. New WK 14 SL 16
Tokenizer in BERT WK 14 SL 41
tokenizer vocab (RL) WK 15 SL 10
token_seq_to_int() (Reward Modeling) WK 15 SL 33
tok_seq_list_to_tensor() (Reward Modeling) WK 15 SL 33
ToPILImage WK 2 SL 31
torch.backends.cudnn.benchmark WK 2 SL 73
torch.backends.cudnn.deterministic WK 2 SL 73
torch.cuda.FloatTensor WK 2 SL 60
torch.cuda.is_available() WK 2 SL 60
torch.device() WK 2 SL 60
torch.float WK 2 SL 59
torch.FloatTensor WK 2 SL 60
torch.from_numpy() WK 5 SL 14
torch.gather (estimating InfoNCE loss) WK 9 SL 149
torch.int64 WK 2 SL 58
torch.LongTensor WK 2 SL 58
torch.manual_seed() WK 2 SL 73
torch.max() WK 12 SL 28
torch.mm() WK 3 SL 105
torch.nn WK 3 SL 33, WK 4 SL 1
torch.nn.BCELoss WK 7 SL 36
torch.nn.BCELoss (YOLO) WK 7 SL 113
torch.nn.Conv2d WK 5 SL 29, WK 6 SL 14
torch.nn.CrossEntropyLoss WK 7 SL 13, WK 7 SL 27, WK 12 SL 30
torch.nn.CrossEntropyLoss (YOLO) WK 7 SL 113
torch.nn.functional WK 4 SL 16
torch.nn.functional.conv2d() WK 5 SL 15, WK 5 SL 27
torch.nn.GRU API WK 12 SL 69
torch.nn.Linear WK 6 SL 14
torch.nn.L1Loss WK 7 SL 39
torch.nn.LogSoftmax WK 12 SL 30
torch.nn.Module class (PyTorch) WK 4 SL 16
torch.nn.MSELoss WK 7 SL 13, WK 7 SL 39
torch.nn.MSELoss (YOLO) WK 7 SL 113
torch.nn.NLLLoss() WK 12 SL 30
torch.nn.Parameter (learnable params) WK 5 SL 39
torch.nn.parameter.Parameter WK 5 SL 33
torch.nn.Sequential WK 4 SL 13, WK 7 SL 61, WK 7 SL 88
torch.nn.Softmax normalization of Q.K^T product WK 13 SL 20
torch.nn.Softmax normalization (definition) WK 13 SL 21
torch.nonzero() WK 9 SL 60
torch.rand() WK 2 SL 56
torch.randint WK 2 SL 19
torch.randint() WK 2 SL 56, WK 2 SL 59
torch.randint_like() WK 2 SL 56
torch.rand_like() WK 2 SL 56
torch.randn() WK 2 SL 56
torch.randn_like() WK 2 SL 56
torch.Size() WK 5 SL 17
torch.squeeze() WK 5 SL 25
torch.tensor.unfold() WK 13 SL 98
torch.uint8 WK 2 SL 21
torch.unsqueeze() WK 5 SL 22
torch.utils.data.DataLoader WK 1 SL 9, WK 7 SL 58, WK 7 SL 83
torch.view() WK 3 SL 105
Torchvision for image processing WK 2 SL 51
torchvision.transforms WK 2 SL 2, WK 2 SL 9, WK 2 SL 19
torchvision.transforms as tvt WK 1 SL 6
torchvision.transforms.functional.perspective WK 2 SL 47
torchvision.transforms.RandomAffine WK 2 SL 46
torchvision.transforms.ToTensor WK 2 SL 2
torch.zeros WK 2 SL 16
Total Expected Reward (TER) WK 15 SL 14, WK 15 SL 37, WK 15 SL 57
Total Variation Distance WK 11 SL 14
ToTensor WK 2 SL 2, WK 2 SL 9, WK 2 SL 35
training data (unbalanced) WK 9 SL 3
training loop (Reward Modeling) WK 15 SL 35
train_on_all_data() WK 3 SL 46
TrainTokenizer (babyGPT) WK 14 SL 21
TrainTokenizer class (babyGPT) WK 14 SL 24, WK 14 SL 28, WK 14 SL 96
train_tokenizer.py (babyGPT) WK 14 SL 21
Transformer Decoder for autoregressive modeling WK 14 SL 70
TransformerFG WK 13 SL 5
TransformerFG class (babyGPT) WK 14 SL 97
TransformerFG class in Transformers module WK 4 SL 44
TransformerFG (DLStudio) WK 13 SL 66
TransformerPreLN WK 13 SL 5
TransformerPreLN class in Transformers module WK 4 SL 44
TransformerPreLN (DLStudio) WK 13 SL 66
Transformers (DLStudio module) WK 4 SL 5
Transformers module in DLStudio WK 4 SL 43
Transformers.py (DLStudio) WK 13 SL 6
transforming a noise vector WK 11 SL 5
transportation simplex WK 11 SL 31
Transpose Convo dependence on kernel size WK 8 SL 54
Transpose Convo dependence on output size WK 8 SL 54
Transpose Convo dependence on padding WK 8 SL 54
transpose convo for upsampling WK 8 SL 51
Transpose Convolution WK 8 SL 4
transpose convolution WK 8 SL 37
trigrams (Language Modeling Simulator) WK 15 SL 25
Triplet Loss WK 9 SL 5, WK 9 SL 27
Triplet Mask WK 9 SL 52
t-SNE WK 9 SL 84
t-SNE (SNE with Student-t in visualization space) WK 9 SL 85
tulips.jpg WK 2 SL 12
tvt WK 1 SL 6
tvt.Compose WK 1 SL 6, WK 2 SL 33
tvt.Normalize WK 2 SL 33
TV (Total Variation) WK 11 SL 14
tvt.ToPILImage WK 2 SL 31
tvt.ToTensor WK 2 SL 21, WK 2 SL 30, WK 2 SL 35
two-headed network WK 7 SL 1
TwoLayerNet WK 1 SL 8
type() (Python) WK 2 SL 51
type() vs. tensor.type() WK 2 SL 51
U
UMAP WK 9 SL 84, WK 9 SL 85
U-Net WK 8 SL 4
Unicode number representation WK 14 SL 18
Unicode numbers WK 14 SL 18
uniformly distributed floats in (0,1.0) interval WK 2 SL 56
uniformly distributed integers WK 2 SL 56
unraveling a deep network with skip connections WK 6 SL 58
unsigned 8-bit integer WK 2 SL 8
unsqueeze() WK 5 SL 22
Unsupervised Learning at Scale (GPT-3) WK 14 SL 77
Unsupervised Learning (GPT Models) WK 14 SL 66
unsupervised metric learning WK 9 SL 10
Unsupervised Metric Learning (CLIP) WK 9 SL 153
update gate (GRU) WK 12 SL 55, WK 12 SL 56
updating the Q table (RL) WK 15 SL 63
upsampling with stride in Transpose Convo WK 8 SL 63
upsampling with transpose convo WK 8 SL 51
Using Adversarial Loss as additional loss WK 11 SL 126
using Gaussian noise for diffusion WK 11 SL 159
using stride and padding together WK 8 SL 67
utf-8 WK 14 SL 18
V
VAE class (DLStudio) WK 16 SL 46
VAE Decoder (role) WK 16 SL 36
VaeDecoder (VAE) WK 16 SL 47
VAE Encoder output WK 16 SL 32
VaeEncoder (VAE) WK 16 SL 47
VAE (generating the latent vector for Decoder) WK 16 SL 34
VAE inner class (DLStudio) WK 4 SL 28
VAE reparameterization trick WK 16 SL 36
VAE (Variational Autoencoder) WK 16 SL 2, WK 16 SL 17
value-function based RL WK 15 SL 97
vanishing gradient problem (RNN) WK 12 SL 3, WK 12 SL 20
vanishing gradients WK 6 SL 1
vanishing gradients causes WK 6 SL 48
vanishing gradients mitigation WK 6 SL 2
variable length input (RNN) WK 12 SL 19
variance as signal energy WK 6 SL 52
Variational Autoencoder (VAE) WK 16 SL 2, WK 16 SL 17
vectorizing the input WK 8 SL 40
vectorizing the kernel WK 8 SL 39
Vector Quantization (VQ) WK 9 SL 65
VectorQuantizer class (DLStudio) WK 16 SL 75
verify_with_torchnn.py WK 3 SL 5, WK 3 SL 34
verify_with_torchnn.py (CGP) WK 3 SL 84
view() WK 5 SL 53
viewpoint dependence of face images WK 2 SL 49
viewpoint effects (data augmentation) WK 2 SL 38
Vision Transformer WK 13 SL 92
visTransformer WK 13 SL 5
visTransformer class (DLStudio) WK 13 SL 99, WK 13 SL 101
visTransformer (DLStudio) WK 13 SL 92
visualizing similarity clusters WK 9 SL 84
Voronoi cell centroids WK 9 SL 68
Voronoi cells WK 9 SL 68
Voronoi diagram (definition) WK 9 SL 68
Voronoi partitioning of a plane WK 9 SL 70
VQ for dimensionality reduction WK 9 SL 72
VQGAN WK 16 SL 2
VQGAN class (DLStudio) WK 16 SL 96
VQGAN inner class (DLStudio) WK 4 SL 28
VQ-VAE WK 16 SL 2
VQVAE WK 16 SL 2
VQVAE inner class (DLStudio) WK 4 SL 28
VQVAE (VAE with codebook learning) WK 16 SL 63
VQ (Vector Quantization) WK 9 SL 65
W
warm-up phase (learning-rate) WK 13 SL 75
Wasser distance WK 11 SL 88
Wasser distance between histograms WK 2 SL 71
Wasserstein distance WK 11 SL 7
Wasserstein Distance WK 11 SL 36
Wasserstein GAN (WGAN) WK 11 SL 7, WK 11 SL 87
'.weight' for accessing the kernel WK 5 SL 33
wgan_CG1.py (DLStudio) WK 4 SL 52
WGAN-GP for Learning a Point Distro WK 11 SL 107
WGAN (Wasserstein GAN) WK 11 SL 7, WK 11 SL 87
wgan_with_gp_CG2.py (DLStudio) WK 4 SL 52
workers for data download WK 7 SL 83
working_with_hsv_color_space() WK 2 SL 52
Y
YOLO WK 7 SL 5
yolo_interval (definition) WK 7 SL 105
YOLO Logic WK 7 SL 104
YOLOLogic Python module WK 7 SL 120
yolo_tensor WK 7 SL 111
yolo_tensor_aug WK 7 SL 112
yolo_vector WK 7 SL 107
Z
zero-mean isotropic Gaussian noise WK 11 SL 6
zeros_like() WK 2 SL 29
zombie mode WK 1 SL 1
Last updated: May 2026