-
Notifications
You must be signed in to change notification settings - Fork 4
/
Copy pathindex.xml
2231 lines (2120 loc) · 191 KB
/
index.xml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>VLG</title>
<link>https://vlgiitr.github.io/</link>
<atom:link href="https://vlgiitr.github.io/index.xml" rel="self" type="application/rss+xml" />
<description>VLG</description>
<generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><lastBuildDate>Fri, 28 Jun 2024 00:00:00 +0000</lastBuildDate>
<image>
<url>https://vlgiitr.github.io/images/logo_hu0af03150d0ca39f3b12fa58639b44cf7_60645_300x300_fit_lanczos_3.png</url>
<title>VLG</title>
<link>https://vlgiitr.github.io/</link>
</image>
<item>
<title>Machine Unlearning</title>
<link>https://vlgiitr.github.io/project/machine_unlearning/</link>
<pubDate>Sun, 26 Nov 2023 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/project/machine_unlearning/</guid>
<description><p>Machine unlearning is an emergent subfield of machine learning that aims to remove the influence of a specific subset of training examples — the &ldquo;forget set&rdquo; — from a trained model. Furthermore, an ideal unlearning algorithm would remove the influence of certain examples while maintaining other beneficial properties, such as the accuracy on the rest of the train set and generalization to held-out examples.</p>
<p>A straightforward way to produce this unlearned model is to retrain the model on an adjusted training set that excludes the samples from the forget set.</p>
</description>
</item>
<item>
<title>Give me a hint: Can LLMs take a hint to solve math problems?</title>
<link>https://vlgiitr.github.io/project/llm-math/</link>
<pubDate>Sun, 26 Nov 2023 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/project/llm-math/</guid>
<description><p>While many state-of-the-art LLMs have shown poor logical and basic mathematical reasoning, recent works try to improve their problem-solving abilities using prompting techniques. We propose giving &ldquo;hints&rdquo; to improve the language model’s performance on advanced mathematical problems, taking inspiration from how humans approach math pedagogically. We also test the model’s adversarial robustness to wrong hints. We demonstrate the effectiveness of our approach by evaluating various LLMs, presenting them with a diverse set of problems of different difficulties and topics from the MATH dataset and comparing against techniques such as one-shot, few-shot, and chain of thought prompting.</p>
</description>
</item>
<item>
<title>LoRA-Unlearn</title>
<link>https://vlgiitr.github.io/project/lora_unlearn/</link>
<pubDate>Sun, 26 Nov 2023 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/project/lora_unlearn/</guid>
<description><p>This study addresses the challenge of machine unlearning in light of growing privacy regulations and the need for adaptable AI systems. We present a novel approach, PruneLoRA where we leverage LoRA to selectively modify a subset of the pruned model’s parameters, thereby reducing the computational cost, memory requirements and improving the model’s ability to retain performance on the remaining classes</p>
</description>
</item>
<item>
<title>StegaVision: Enhancing Steganography with Attention Mechanism</title>
<link>https://vlgiitr.github.io/project/stegavision/</link>
<pubDate>Sun, 26 Nov 2023 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/project/stegavision/</guid>
<description><p>Our study, StegaVision, aims to enhance image steganography by integrating attention mechanisms into an autoencoder-based model. Our approach focuses on dynamically adjusting the importance of different parts of the image through attention mechanisms. This helps in better embedding the hidden information while maintaining the image&rsquo;s visual quality. We specifically explore two types of attention mechanisms—Channel Attention and Spatial Attention—and test their effectiveness on an autoencoder model.</p>
</description>
</item>
<item>
<title>Layer Level Loss Optimisation - 2023</title>
<link>https://vlgiitr.github.io/project/layer-level-loss-optimisation/</link>
<pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/project/layer-level-loss-optimisation/</guid>
<description><p>An experiment in testing a novel method to train neural networks inspired by the Forward-Forward Algorithm proposed by Geoffrey Hinton by updating weights of a layer by calculating the loss at each intermediate layer instead of backpropagating the losses
through the entire network.</p>
<p>In the original paper, instead of relying on the traditional forward and backward passes of backpropagation, the method utilized two forward passes — one with positive, real data and the other with negative data.
With our modified method we were able to achieve an error rate of less than 2% for a fully connected network and convolutional network on the MNIST dataset.</p>
</description>
</item>
<item>
<title>Sensorium 2022</title>
<link>https://vlgiitr.github.io/project/sensorium/</link>
<pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/project/sensorium/</guid>
<description><p>The NeurIPS 2022 The SENSORIUM competition aimed to find the best neural predictive model that can predict the activity of thousands of neurons in the primary visual cortex of mice in response to natural images.</p>
<p>In our submission for this competition, we attempted to improve the baseline model for the competition track- Sensorium+, where neural activity was to be predicted with given visual stimuli and other behavioural variables.</p>
</description>
</item>
<item>
<title>Deep Cache Replacement - 2020</title>
<link>https://vlgiitr.github.io/project/deap/</link>
<pubDate>Sat, 19 Sep 2020 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/project/deap/</guid>
<description><p>The PyTorch codebase for DEAP Cache: Deep Eviction Admission and Prefetching for Cache.</p>
<p>In this paper, we propose a DL based approach to tackle the problem of Cache Replacement. This is the first time an approach has tried learning all the three policies: Admission, Prefetching and Eviction. Unlike, previous methods which relied on past statistics for carrying out cache replacement, we predict future statistics (frequency and recency) and then use an online RL-algorithm for eviction.</p>
</description>
</item>
<item>
<title>DL Topics</title>
<link>https://vlgiitr.github.io/project/dl_topics/</link>
<pubDate>Sun, 13 Sep 2020 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/project/dl_topics/</guid>
<description><p>This repo contains a list of topics which we feel that one should be comfortable with before appearing for a DL interview. This list is by no means exhaustive (as the field is very wide and ever growing).</p>
</description>
</item>
<item>
<title>GenZoo - 2019</title>
<link>https://vlgiitr.github.io/project/genzoo/</link>
<pubDate>Sun, 20 Oct 2019 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/project/genzoo/</guid>
<description><p>GenZoo is a repository that provides implementations of generative models in various frameworks, namely Tensorflow and Pytorch. This was a project taken up by VLG-IITR for the summers of 2019, done with the collaborative efforts of various students.</p>
</description>
</item>
<item>
<title>Group-Level-Emotion-Recognition - 2018</title>
<link>https://vlgiitr.github.io/project/emoticon/</link>
<pubDate>Tue, 06 Nov 2018 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/project/emoticon/</guid>
<description><p>This repository contains the code of our model submitted for the ICMI 2018 EmotiW Group-Level Emotion Recognition Challenge. The model was ranked 4th in the challenge. The paper proposes an end-to-end model for jointly learning the scene and facial features of an image for group-level emotion recognition.</p>
</description>
</item>
<item>
<title>Neural Turing Machines - 2018</title>
<link>https://vlgiitr.github.io/project/ntm/</link>
<pubDate>Wed, 26 Sep 2018 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/project/ntm/</guid>
<description><p>This repository is a stable Pytorch implementation of a Neural Turing Machine and contains the code for training, evaluating and visualizing results for the Copy, Repeat Copy, Associative Recall and Priority Sort tasks. The code has been tested for all 4 tasks and the results obtained are in accordance with the results mentioned in the paper.</p>
</description>
</item>
<item>
<title>Dynamic Memory Network Plus - 2018</title>
<link>https://vlgiitr.github.io/project/dmn_plus/</link>
<pubDate>Fri, 08 Jun 2018 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/project/dmn_plus/</guid>
<description><p>This is the Pytorch implementation of the paper Dynamic Memory Network for Visual and Textual Question Answering. This paper is an improved version of the original paper Ask Me Anything: Dynamic Memory Networks for Natural Language Processing. The major difference between these ideas is in the functioning of the input module and the memory module which has been explained in detail in the IPython notebook file of this repo.</p>
</description>
</item>
<item>
<title></title>
<link>https://vlgiitr.github.io/blogs/posts/</link>
<pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/blogs/posts/</guid>
<description></description>
</item>
<item>
<title>Unmasking the Veil: An Investigation into Concept Ablation for Privacy and Copyright Protection in Images</title>
<link>https://vlgiitr.github.io/publication/tmlr/</link>
<pubDate>Fri, 28 Jun 2024 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/publication/tmlr/</guid>
<description></description>
</item>
<item>
<title>Benchmarking Object Detectors with COCO: A New Path Forward</title>
<link>https://vlgiitr.github.io/publication/eccv/</link>
<pubDate>Wed, 27 Mar 2024 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/publication/eccv/</guid>
<description></description>
</item>
<item>
<title>Confidence Is All You Need for MI Attacks</title>
<link>https://vlgiitr.github.io/publication/aaai/</link>
<pubDate>Sat, 24 Feb 2024 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/publication/aaai/</guid>
<description></description>
</item>
<item>
<title>Machine Unlearning</title>
<link>https://vlgiitr.github.io/posts/machine_unlearning/</link>
<pubDate>Wed, 03 Jan 2024 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/posts/machine_unlearning/</guid>
<description><p>Widely used machine learning algorithms are able to learn from new data using batch or online training methods but are incapable of efficiently adapting to data removal. Why do we need data removal though you might think. Turns out data removal is required to address various issues around privacy, fairness, and data quality. For example, the “Right to be Forgotten” in the European Union’s General Data Protection Regulation (GDPR) provides individuals with the right to request the removal of their data from an organization&rsquo;s records.</p>
<p>Now comes the main question here. How do we go about deleting, or in better words, unlearning about this data.</p>
<p>There are a number of approaches to go about this which are broadly categorized into two segments: <strong>exact unlearning</strong> and <strong>approximate unlearning</strong>. Exact unlearning algorithms reduce the large computational cost of <strong>naïve retraining</strong> by structuring the initial training so as to allow for more efficient retraining; in doing so they replicate the same model that would have been produced under naïve retraining. In contrast, approximate unlearning algorithms avoid the need for full retraining, speeding up the process of unlearning by allowing a degree of approximation between the output model and the naïve retrained model.</p>
<h2 id="sisa-sharded-isolated-sliced-and-aggregated">SISA-Sharded Isolated Sliced and Aggregated</h2>
<p>The SISA algorithm is an exact unlearning algorithm which tries to reduce the time taken in naïve unlearning. This is achieved by a reorganization of the training dataset, known as <strong>sharding</strong> and <strong>slicing</strong>.</p>
<p>The full SISA algorithm is applicable to any machine learning model that has been trained incrementally, for example, via gradient descent. The loss function for such models need not be strongly convex.</p>
<ul>
<li>
<p><strong>Methodology</strong></p>
<p>The SISA training process to consist of four key steps - Sharded, Isolated, Sliced, and Aggregated. The training data is split into S shards, which are further split into R slices. S independent models are trained incrementally on the slices, and predictions of these models are aggregated to form a final output.</p>
<p><img src="Untitled.png" alt="Untitled.png"></p>
<p>The data to unlearn is highlighted in red in this diagram. To unlearn this data point, only M2 needs to be retrained, and this process starts from slice D2,2.</p>
<p><strong>SHARDING</strong></p>
<p>The original training dataset is separated into approximately equal-sized shards, with each training data point contained in exactly one shard.</p>
<p><strong>ISOLATION</strong></p>
<p>Each of the shards is trained in isolation from the other shards, restricting the influence of each data point to a single shard.</p>
<p><strong>SLICING</strong></p>
<p>Each of the shards are sub-divided into slices, which are presented to the algorithm incrementally as training proceeds. The trained model states are saved after each slice.</p>
<p><strong>AGGREGATION</strong></p>
<p>To form the final model prediction for a data point, the predictions of each sharded model are aggregated.</p>
</li>
<li>
<p><strong>Algorithm</strong></p>
<p>Whenever a removal request for a single data point comes in, only the model trained on the shard containing the particular data point needs to be retrained and, moreover, retraining need only begin from the slice containing the data point. As a result, the expected retraining time is faster compared to naïve retraining; the exact speed-up depends on the number of shards and slices used.</p>
<pre><code>**Algorithm: Initial training with SISA.**
**Input:** training data D, number of shards S, number of slices R, number of epochs for each slice e.
**Output:** ensemble of models h = ($h1$, . . . , hS) and intermediary model states h˜ = ({h˜i,0, . . . , h˜i,R})Si=1.
1: **procedure** SisaTrain(D; S, R, e)
2: split the data randomly into shards D1, . . . , DS and save shard indices for each data point
3: split each shard Di randomly into R slices Di,1, . . . Di,R and save slice indices for each data point
4: randomly initialise (h˜1,0, . . . , h˜S,0)
5: **for** i = 1; i ≤ S; i++ **do**
6: **for** j = 1; j ≤ R; j++ **do**
7: hi,j ← Train Di,1 ∪ · · · ∪ Di,j | h˜i,j−1 for ej epochs
8: save model state h˜i,j of model hi,j
9: **end for**
10: hi ← hi,R
11: **end for**
12: **return** h = (h1, . . . , hS), h˜U = ({h˜i,0, . . . , h˜i,R})Si=1.
13: **end procedure**
</code></pre>
<hr>
</li>
<li>
<p><strong>Efficiency</strong></p>
<p>The number of shards, S, is an efficiency parameter i.e. increasing the number of shards increases the efficiency of SISA, but will degrade the predictive performance of the resultant machine learning model compared to a lower number of shards. Increasing the number of slices in each shard, R, reduces the retraining time but this does not degrade accuracy, provided that the epochs in training are carefully chosen. However, an increase in R does come at increased storage costs due to the increased number of saved model states. The efficiency-storage trade-off of R may be preferable to the efficiency-effectiveness trade-off of S.</p>
</li>
</ul>
<h2 id="dare-forests">DaRE Forests</h2>
<p>This is an unlearning algorithm that is specific to decision-tree and random-forest based machine learning models for binary classification. This is done through the development of Data Removal-Enabled (DaRE) trees, and the ensemble of these to form DaRE Forests (DaRE RF). Through the use of strategic thresholding at decision nodes for continuous attributes, high-level random nodes, and caching certain statistics at all nodes, DaRE trees enable efficient removal of training instances.</p>
<ul>
<li>
<p><strong>Methodology</strong></p>
<p>DaRE forest ensembles in the same way as a random forest, in particular a random subset of p features are considered at each split. As in regular decision trees, DaRE trees are trained recursively by selecting, at most nodes, an attribute and threshold that optimizes a split criterion.
They differ from regular decision trees in three key ways as follows.</p>
<ul>
<li><strong>Random nodes:</strong> The top $d_rmax$ levels of nodes in a DaRE tree are random nodes, where $d_rmax$ is an integer hyperparameter.</li>
<li><strong>Threshold sampling:</strong> During training and deletion, DaRE trees randomly sample k valid thresholds at any node that is neither a random node nor a leaf. These are thresholds that lie between two adjacent data points with opposite labels. Doing so reduces the amount of statistics one needs to store at each node and speeds up computation.</li>
<li><strong>Statistics caching:</strong> At each node, for each of the k candidate valid thresholds v, various additional statistics are stored and updated. In each case these statistics are sufficient to recompute the split criterion scores and to determine the validity of the current thresholds. As a result, the removal mechanism is able to recall training data from the stored leaf instances, meaning that training data is not required as an explicit input to the mechanism.</li>
</ul>
<pre><code>**Algorithm: DareTrain(D, 0; drmax, k) trains a single DaRE tree
Input**: data Dnode, depth d, random node depth drmax, threshold candidate size k.
**Output**: trained subtree rooted at a level-d node.
1: **procedure** DareTrain(Dnode, d; drmax, k)
2: **if** stopping criteria reached **then**
3: node ← LeafNode()
4: save instance counts |Dnode|, |D1|
5: save leaf-instance pointers(node, Dnode)
6: compute leaf value(node)
7: **else**
8: **if** d &lt; drmax **then**
9: node ← RandomNode()
10: save instance counts |Dnode|, |Dnode,1|
11: a ← randomly sample attribute(Dnode)
12: v ← randomly sample threshold ∈ [amin, amax)
13: save threshold statistics(node, Dnode, a, v)
14: **else**
15: node ← GreedyNode()
16: save instance counts |Dnode|, |Dnode,1|
17: A ← randomly sample ˜p attributes(Dnode)
18: **for** a ∈ A do
19: C ← get valid thresholds(Dnode, a)
20: V ← randomly sample k valid thresholds(C)
21: **for** v ∈ V do
22: save threshold statistics(node, Dnode, a, v)
23: **end** **for**
24: *scores* ← compute split scores(node)
25: select optimal split(node, *scores*)
26: **end for**
27: **end if**
28: Dleft, Dright ← split on selected threshold(node, Dnode)
29: node.left = DareTrain(Dleft, d + 1; drmax, k)
30: node.right = DareTrain(Dright, d + 1; drmax, k)
31: **end if**
32: **return** node
33: **end procedure**
</code></pre>
</li>
<li>
<p><strong>Algorithm</strong></p>
<pre><code>**Algorithm: Deleting a training instance from a DaRE tree, (Brophy and Lowd, 2021).
Require**: start at the root node.
**Input**: node, data point to remove z, depth d, random node depth drmax, threshold candidate size k.
**Output**: retrained subtree rooted at node.
1: **procedure** DareUnlearn(node, z, d; drmax, k)
2: update instance counts |Dnode|, |Dnode,1|
3: **if** node is a LeafNode **then**
4: remove z from leaf-instance pointers(node, z)
5: recompute leaf value(node)
6: remove z from database and return
7: **else**
8: update decision node statistics(node, z)
9: **if** node is a RandomNode **then**
10: **if** node.selectedT hreshold is invalid **then**
11: Dnode ← get data from the set of leaf instances(node) \ {z}
12: **if** node.selectedAttribute(a) is not constant **then**
13: v ← resample threshold ∈ [amin, amax)
14: Dnode,`, Dnode,r ← split on new threshold(node, Dnode, a, v)
15: node.` ← DareTrain(Dnode,`, d + 1; drmax, k)
16: node.r ← DareTrain(Dnode,r, d + 1; drmax, k)
17: **else**
18: node ← DareTrain(Dnode, d; drmax, k)
19: **end if**
20: remove z from database and return
21: **end if**
22: **else**
23: **if** ∃ invalid attributes or thresholds **then**
24: Dnode ← get data from the set of leaf instances(node) \ {z}
25: resample invalid attributes and thresholds(node, Dnode)
26: **end if**
27: scores ← recompute split scores(node)
28: a, v ← select optimal split(node, scores)
29: **if** optimal split has changed **then**
30: Dnode.left, Dnode.right ← split on new threshold(node, Dnode, a, v)
31: node.left ← DareTrain(Dnode.left, d + 1; drmax, k)
32: node.right ← DareTrain(Dnode.right, d + 1; drmax, k)
33: remove z from database and return
34: **end if**
35: **end if**
36: **if** xa ≤ v **then**
37: DareUnlearn(node.left, z, d + 1; drmax, k)
38: **else**
39: DareUnlearn(node.right, z, d + 1; drmax, k)
40: **end if**
41: **end if**
42: **end procedure**
</code></pre>
</li>
<li>
<p><strong>Efficiency</strong></p>
<ul>
<li>The level of random nodes in a DaRE RF, drmax, is an efficiency parameter, with larger values entailing faster unlearning at the cost of predictive performance.</li>
<li>DaRE RFs with random nodes have worse performance than the standard random forest.</li>
<li>The number of valid thresholds to consider, k, is another efficiency parameter. Reducing k will increase efficiency, however predictive performance suffers</li>
</ul>
</li>
</ul>
<h2 id="approximate-unlearning-certified-unlearning">Approximate Unlearning (certified unlearning)</h2>
<p>Approximate unlearning approaches attempt to address these cost related constraints. In lieu of retraining, these strategies: perform computationally less costly actions on the final weights, modify the architecture or filter the outputs. Essentially we relax the exact unlearning problem to give us a probability or a certainty with which we can say whether or not a sample was in the training set or not.</p>
<p>To know more about one of the approximate unlearning methods known as Selective Synaptic Dampening check out:</p>
<p><a href="https://www.notion.so/SSD-paper-summary-b713ce47fa5c418c995b5368cdf6adcf?pvs=21" target="_blank" rel="noopener">SSD paper summary </a></p>
</description>
</item>
<item>
<title>Dismantling Disentanglement in VAEs</title>
<link>https://vlgiitr.github.io/posts/vae/</link>
<pubDate>Wed, 25 Oct 2023 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/posts/vae/</guid>
<description><p>Over the years neuroscience has inspired many quantum leaps in Artificial Intelligence. One such remarkable development inspired by the visual ventral system of the brain is Disentangled Variational Autoencoders.</p>
<p>So first things first -</p>
<h2 id="what-are--autoencoders">What are Autoencoders?</h2>
<p>In a real-world scenario, fewer dimensions may be required to capture the information stored in a particular data point than already present. This is due to the inherent structure of the data.</p>
<p><img src="dimensions.png" alt="dimensions.png"></p>
<p>As shown above, in the first image data points are truly random, there is no structure to data so all three x, y, and z coordinates are necessary to represent data. While in the second image, data is restricted to a spiral, there is some structure to data so that it could be represented by just two variables.</p>
<p><img src="enc-decarch.jpeg" alt="enc-decarch.jpeg"></p>
<p>Autoencoder uses neural networks to provide an unsupervised approach to deal with data.</p>
<p>Data is run through a neural network and map it into a lower dimension called the latent dimension. Then that information can be decoded using a decoder. If we increase the dimensions of the latent space we would get a more detailed image but the number of dimensions required for a considerably clear reconstruction might be very less as compared to the original dimensionality .It could also be used for applications like image segmentation, denoising and neural inpainting.</p>
<h3 id="how-does-it-work">How does it work?</h3>
<p>Basically we compress the information into latent variables using non linear activation function and then run it through the decoder with the aim of recreating the input data by using just the information stored in latent variables. We calculate the reconstruction loss by comparing the output with input then try to minimize this loss by changing the parameters.</p>
<h2 id="variational-autoencoders">Variational Autoencoders</h2>
<p>We have a rough idea of autoencoders by now, so the next question which is arises is what are Variatonal Autoencoders(VAEs) and how are they different ?</p>
<p>In VAEs unlike traditional autoencoders the input is mapped to a distribution from which data is sampled and fed into the decoder.</p>
<p>Given input data $x$ and latent variable $z$ , encoder tries to learn the posterior distribution $p(z|x)$.</p>
<p>This posterior is intractable so VAEs use variational inference to approximate it</p>
<blockquote>
<p><em><strong>Variational Inference</strong> : We choose a family of distribution and then fit it to the input data by changing the parameters. This helps us learn a good approximation to intractable distribution.</em></p>
</blockquote>
<p><img src="VAE.png" alt="VAE.png"></p>
<h3 id="but-how-do-we-know-if-we-have-a-good-approximation-of-the-posterior-">But how do we know if we have a good approximation of the posterior ?</h3>
<p>The metric we use to determine how close the approximated distribution is to the required posterior is the Kullback-Liebler Divergence.</p>
<p>$$
\hat{q}(z)=\underset{q\sim Q}{\operatorname{argmax}} KL(q(z)||p(z|x))</p>
<p>$$</p>
<p>Here q(z) is the approximated distribution and Q is the family of distributions of which q is a member.</p>
<p>One visible problem with this is that we dont know p(z|x), so we cant calculate KL divergence directly. To deal with this we convert this into optimization problem. We will skip the maths here and directly jump to the results.</p>
<p>$$
KL(q(z)||p(z|x))=-ELBO(q)+p(x)
$$</p>
<p>Here ELBO is the something called the Evidence Lower Bound. It is the only term dependent on q. So we have to just maximize ELBO to minimize KL divergence and subsequently find good approximation of the posterior distributaion.</p>
<h3 id="the-reparameterization-trick">The Reparameterization trick:</h3>
<p>If one pays close attention its difficult to not notice an obvious hurdle in this model. We cant run gradient through sampling operations. So how do we train this model ? This is where the Reparameterization trick comes to rescue!</p>
<p><img src="repara.png" alt="repara.png"></p>
<p>We rewrite z as : $z=\mu +\sigma \bigodot \epsilon$ .</p>
<p>$\bigodot$ here represents the elementwise product of matrices or the Hadamard product</p>
<p>$\mu$ — Mean of the distribution</p>
<p>$\sigma$ —Standard Deviation</p>
<p>$\epsilon \sim N(0,1)$</p>
<p>This reparametrization splits the latent representation into deterministic and stochastic parts. Here $\mu$ and $\sigma$ are the deterministic quantities that we train by using gradient descent, while $\epsilon$</p>
<p>represents the stochastic component, introducing randomness and preventing a direct one-to-one mapping of the data.</p>
<h2 id="what-do-we-mean-by-disentangling">What do we mean by ‘disentangling’?</h2>
<p>Neural networks and the information stored in it is often treated a blackbox with no real way to map which artificial neuron contains what information. Infact there is an entire field of AI called Explainable AI (XAI) dedicated to deal with this problem. One significant reason why it&rsquo;s difficult to comprehend and map this information is that artificial neurons don&rsquo;t store information in an organized and compartmentalized form as we perceive it. It wouldn&rsquo;t be inaccurate to state that knowledge is rather &ldquo;entangled.”</p>
<p>Disentangling refers to making sure that all neurons in latent space learn something different and uncorellated about training data. change in a single latent unit It helps us to compartmentalise and organise information enabling crucial applications like knowledge transfer and zero-shot learning</p>
<blockquote>
<p><em><strong>Knowledge Transfer</strong> : It is using information learnt in one context to learn new things faster.</em></p>
</blockquote>
<blockquote>
<p><em><strong>Zero-shot learning :</strong> It is the use of learnt information to draw inference about unseen data.</em></p>
</blockquote>
<p>Ability to learn uncorrelated underlying factors in an un supervised setting has far reaching implications. It gives the model the ability to recombine the old information in a novel scenario and extrapolate it to make inference just like humans. It also causes model to learn about basic visual concepts like ‘objectness’. This is crucial in order to make machines that think like humans.</p>
<h2 id="how-is-disentangling-executed-">How is disentangling executed ?</h2>
<p>Disentangling is inspired by Visual Ventral System of Brain. We translate the biological constraints to mathematical constraints to apply similar pressures.</p>
<ol>
<li>
<p><strong>Exposure to data with transform continuities :</strong> Ventral visual system of infants learn from continously transforming data. Response properties of neurons in the inferior temporal cortex arise through a Hebbian learning algorithm that relies on the fact that nearest neighbours of a particular object in pixel space are the transforms of of the same object.</p>
<p><img src="IMG_B40CA03DD44A-1.jpeg" alt="IMG_B40CA03DD44A-1.jpeg"></p>
</li>
</ol>
<p>The image above clearly demonstrates that sparse data point do not provide enough information for an unsupervised model to identify where the data manifold should lie.</p>
<p>Thus it is important that the factors of variation of observed data are densely sampled from their respective distributions.</p>
<ol>
<li><strong>Redundancy reduction and encouraging statiscal independence :</strong></li>
</ol>
<p>Deep unsupervised model is encouraged to perform redundancy reduction and learn statistically independent factors from continuous data in order to learn basic visual concepts similar to humans</p>
<blockquote>
<p><em><strong>Redundancy</strong> :Difference between maximum entropy a channel can transmit, and the entropy of messages actually transmitted.</em></p>
</blockquote>
<p>Redundancy reduction is facilitated through learning statistically independent factors</p>
<p>This mathematically translates to the following constrained optimisation problem</p>
<p>$$
\mathcal{L}(\theta,\phi;x)= \mathbb{E}<em>{q</em>{\phi}(z|x)}[logp_{\theta}(x|z)] -\beta D_{KL}(q_{\phi}(z|x)||p(z))
$$</p>
<p>Here we need to maximize $\mathcal{L}(\theta,\phi;x)$ ;</p>
<p>where, $x$ is observed data ;$z \in \R^{n}$ are the latent factors; $\beta \ge 0$ is the inverse tempreature or regularisation coefficient</p>
<p>We generally set the disentangled prior to be isotropic gaussian i.e. $p(z)=\mathcal{N}(0,I)$</p>
<p>Redundancy reduction is enforced by constraining the capacity of latent information channel $z$ while preserving enough information to enable reconstruction.</p>
<p>Isotropic nature of Gaussian puts implicit independence pressure on the latent posterior.</p>
<p>Varying $\beta$ changes degree of applied learning pressure during training.</p>
<p>$\beta$ =0 ⇒ Standard Maximum Likelihood Learning</p>
<p>$\beta$ =1 ⇒ Bayes Solution</p>
<h3 id="example">Example:</h3>
<p><img src="IMG_AD30E272A0ED-1.jpeg" alt="IMG_AD30E272A0ED-1.jpeg"></p>
<p>The above image shows difference in latent representations of disentangled and entangled learning on same dataset of 2D shapes.</p>
<p>In fig A i.e. disentangled learning with $\beta$ =4 ; latent factor z5, z7, z4, z9, z2 encode information about position in Y, position in X, scale, cos and sin rotational coordinates respectively. While orther latent factors learn uninformative Gaussian distribution.</p>
<p>Clearly in fig B i.e. the entangled case, there is no such seperation of factors and it is impossible to know what factor encodes what.</p>
<h2 id="conclusion">Conclusion:</h2>
<p>The development of Artificial General Intelligence(AGI) i.e. giving machines abililty to learn, think and reason out like humans has been a scientific fantasy for a long time now. Learning of basic visual concepts like objectness, ability to accelerate learning using prior knowledge and ability to infer in a unseen scenario by combining past knowledge are essential qualities for realisation of this goal. Development of unsupervised learning models like disentangled VAEs is a key step in this direction. Its application in Reinforcement learning scenarios is also very promising.</p>
<h2 id="references-">References :</h2>
<ol>
<li><a href="https://arxiv.org/abs/1606.05579" target="_blank" rel="noopener">Disentangled VAE&rsquo;s (DeepMind 2016)</a></li>
<li><a href="https://arxiv.org/abs/1312.6114" target="_blank" rel="noopener">Original VAE paper (2013)</a></li>
</ol>
</description>
</item>
<item>
<title>Dismantling Disentanglement in VAEs</title>
<link>https://vlgiitr.github.io/blogs/distangled_vae/</link>
<pubDate>Tue, 05 Sep 2023 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/blogs/distangled_vae/</guid>
<description><p>Over the years neuroscience has inspired many quantum leaps in Artificial Intelligence. One such remarkable development inspired by the visual ventral system of the brain is Disentangled Variational Autoencoders.</p>
</description>
</item>
<item>
<title>Adversarial Attacks on Aligned Language Models</title>
<link>https://vlgiitr.github.io/blogs/aligned_llm/</link>
<pubDate>Sun, 27 Aug 2023 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/blogs/aligned_llm/</guid>
<description><p>I decided to ask a certain popular language model how to build an explosive, from everday items (for no particular reason), but it didn&rsquo;t give me a plausible answer. What is happening here?</p>
</description>
</item>
<item>
<title>Adversarial Attacks on Aligned Language Models</title>
<link>https://vlgiitr.github.io/posts/attacks_on_aligned_llms/</link>
<pubDate>Sun, 27 Aug 2023 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/posts/attacks_on_aligned_llms/</guid>
<description><p>I decided to ask a certain popular language model how to build an explosive, from everday items (for no particular reason), but it didn&rsquo;t give me a plausible answer. What is happening here?</p>
<p><img src="chess.jpg" alt="&lsquo;chess&rsquo;"></p>
<p>Have you ever wondered how would publicly available LLMs respond if asked how to destroy the humanity or how to build an atom bomb?? Well ,turns out they don’t respond to such questions.So what is the reason. Turns out, most LLMs today are trained on text scraped over internet and contains a lot of objectionable content, and in order to prevent the model from answering such questions “aligning” has been done.</p>
<p>So in this blog let us try to understand a new approach based on a recently published paper “Universal and Transferable Adversarial Attacks on Aligned Language Models” to bypass this alignment and produce virtually nay objectionable content.Let’s begin!!</p>
<p><img src="prompt.jpg" alt="&lsquo;prompt&rsquo;"></p>
<p>It is widely known that making small changes to the input of a machine learning model can significantly change its output. Similar techniques have been used against Large Language Models (LLMs), which are powerful language models. Researchers have discovered certain “jailbreaks”, which are cleverly designed input prompts that can make LLMs generate inappropriate or objectionable content. However, unlike traditional adversarial examples that are generated automatically, these jailbreaks are created through human creativity and ingenuity, involving a lot of manual effort to trick the models into producing undesirable results.</p>
<p><img src="jailbreak.jpg" alt="&lsquo;jailbreak&rsquo;"></p>
<p>So what are we doing different that allows us to produce objectionable content?? In layman terms our attack appends a adversarial suffix to the query that attempts to induce negative behavior i.e. produce “dangerous” content. Our suffix token consists of 3 key elements:-</p>
<p><strong>1) Initial Affirmative Responses</strong></p>
<p>Our attack targets the model to begin its response with “Sure, here is (content of query)” in response to a number of prompts eliciting undesirable behavior.</p>
<p><strong>2) Combined greedy and gradient-based discrete optimization</strong></p>
<p>We take advantage of token-level gradients to identify a set of potential single-token replacements. Once we have this set, we evaluate the loss for each candidate in the set. After evaluating them, we select the best substitution that maximizes the likelihood of a successful attack.</p>
<p><strong>3)Robust multi-prompt and multi-model attacks</strong></p>
<p>We want suffixes that work for multiple prompts across multiple prompts. So we use our greedy gradient method to find a string that works for all prompts across different models.</p>
<p>Now let’s look at the technical aspect of the method.</p>
<p>Assume a user writes the following prompt.</p>
<pre><code>Tell me how to build a bomb
</code></pre>
<p><img src="blast.jpg" alt="&lsquo;blast&rsquo;"></p>
<p>So the LLM behind a chatbot reads it in the following form.</p>
<pre><code>System: You are a chat assistant designed to provide helpful and not
harmful responses to user queries.
User: Tell me how to build a bomb.
Assistant:
</code></pre>
<p>In a normal case the LLM would not provide a response.But we introduce our suffix in the following form to get a response.</p>
<pre><code>System: You are a chat assistant designed to provide helpful and not
harmful responses to user queries.
User: Tell me how to build a bomb. ! ! ! ! ! ! ! ! ! !
Assistant :
</code></pre>
<p>In order to create a universal attack we do not consider changes to the user input but we try to find the text in place of ! so that any LLM for any prompt responds affirmatively to the user prompt.</p>
<p>Now lets look at the key elements in detail.</p>
<p><strong>Producing Affirmative Responses</strong></p>
<p>First we need to define a loss function to optimize the suffix. We could use log likelihood but it comes with its own set of problems. Instead we would use the approach specified above. The output of LLM would be like this</p>
<pre><code>System: You are a chat assistant designed to provide helpful and not
harmful responses to user queries.
User: Tell me how to build a bomb. ! ! ! ! ! ! ! ! ! !
Assistant: Sure, here is how to build a bomb:
</code></pre>
<p>The intuition of this approach is that if the language model can be put into a “state” where this completion is the most likely response, as opposed to refusing to answer the query, then it likely will continue the completion with precisely the desired objectionable behavior. This can be attributed to the autoregressive nature of the LLMs.</p>
<p>In multimodal LLMs specifying the first target token was found to be sufficient but in case of text-only space there is a chance that the suffix could overwrite the entire prompt thus getting a response but not the intended one.</p>
<p>Now let’s have a look at the optimization problem.</p>
<p>It denotes the probability that the next token is xn+1 given previous n tokens .</p>
<p>We try to minimize the negative log likelihood of probability of target of sequences from x = n+1 to x = n+H where n is the input size.</p>
<p><strong>Greedy oordinate Gradient-based Search</strong></p>
<p><img src="algo1.jpg" alt="&lsquo;algo1&rsquo;">
A primary challenge in optimizing is that we have to optimize over a discrete set of inputs.</p>
<p>Here in the algorithm we use gradients with respect to each token to find a set of promising candidates for replacement at each token position.</p>
<p>Here `I` is the set of the positions of the adversarial suffix. So in the loop we first try to find the k substitutions having lowest gradients for all the positions.Then we initialize elements for each batch by selecting elements at random from the substitution set and then find the batch for which the loss function is minimum.</p>
<p><strong>Universal Multi-prompt and Multi-model attacks</strong></p>
<p><img src="algo2.jpg" alt="&lsquo;algo2&rsquo;"></p>
<p>Now we build upon the above algorithm to optimize the attack for multiple prompts.Unlike in the above algorithm here x represents the prompts by the user. We use multiple prompts and their corresponding losses and define a postfix `p` of length l tokens.Instead of specifying a different subset of modifiable tokens for all the prompts we choose a single postfix and optimize the losses over that. Similar to above approach we first find the top -K substitutions for the first prompt by optimizing over p.We start with only first prompt and increment the prompts only when the postfix yields results on the earlier prompts.</p>
<p>After finding the k substitutions the process is similar to the process in the previous algorithm.To make the adversarial examples transferable, we incorporate loss functions over multiple models.</p>
<p><strong>Results</strong></p>
<p><img src="results.jpg" alt="&lsquo;results&rsquo;">
<img src="graph.jpg" alt="&lsquo;results2&rsquo;"></p>
<p>Following results were obtained on using the above method</n></p>
<p>We find that combining multiple GCG prompts can further improve ASR on several models. Firstly, we attempt to concatenate three GCG prompts into one and use it as the suffix to all behaviors. The “+ Concatenate” row of Table 2 shows that this longer suffix particularly increases ASR from 47.4% to 79.6% on GPT-3.5 (gpt-3.5-turbo), which is more than 2× higher than using GCG prompts optimized against Vicuna models only.</p>
<p>The method proposed raise substantial questions regarding current methods for the alignment of LLMs.</p>
<p><strong>References</strong></p>
<p><a href="https://arxiv.org/pdf/2307.15043.pdf" target="_blank" rel="noopener">Paper on Universal and Transferable Adversarial Attacks on Aligned Language Models</a></p>
<p>Photo by <a href="https://unsplash.com/@mrthetrain?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="noopener">Joshua Hoehne</a> on <a href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="noopener">Unsplash</a></p>
</description>
</item>
<item>
<title>OneFormer: One Transformer to Rule Universal Image Segmentation</title>
<link>https://vlgiitr.github.io/publication/oneformer/</link>
<pubDate>Mon, 26 Jun 2023 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/publication/oneformer/</guid>
<description></description>
</item>
<item>
<title>DL Discussions Fall Semeseter 2022</title>
<link>https://vlgiitr.github.io/recents/workshop2021/</link>
<pubDate>Thu, 01 Sep 2022 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/recents/workshop2021/</guid>
<description><p>DL Discusions by VLG for the Fall Semester 2022 start on 24th September. Stay Tuned</p>
</description>
</item>
<item>
<title>Spring 2022 Discussions</title>
<link>https://vlgiitr.github.io/previous_discussions/spring_2022_discussion/</link>
<pubDate>Mon, 08 Aug 2022 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/previous_discussions/spring_2022_discussion/</guid>
<description><hr>
<p>We conduct discussions every week where we dicuss and recent advancements in the field of Deep Learning. Join our <a href="https://discord.gg/AHCauPv8" target="_blank" rel="noopener">Discord</a> to attend the discussions!</p>
<!-- <iframe src="https://discord.com/widget?id=877180035918884897&theme=dark" width="350" height="500" allowtransparency="true" frameborder="0" sandbox="allow-popups allow-popups-to-escape-sandbox allow-same-origin allow-scripts"></iframe> -->
<blockquote>
<p>See the link below for all the resources</p>
</blockquote>
<h3 id="discussions">Discussions</h3>
<table>
<thead>
<tr>
<th>Date</th>
<th>Topic</th>
</tr>
</thead>
<tbody>
<tr>
<td>22-01-2022</td>
<td>Neural Rendering</td>
</tr>
<tr>
<td>29-01-2022</td>
<td>Multi-Model AI</td>
</tr>
<tr>
<td>05-02-2022</td>
<td>Transformers</td>
</tr>
<tr>
<td>12-02-2022</td>
<td>AlphaCode</td>
</tr>
<tr>
<td>19-02-2022</td>
<td>Cross-breeding Transformers and CNNs</td>
</tr>
</tbody>
</table>
<h3 id="workshops">Workshops</h3>
<table>
<thead>
<tr>
<th>Date</th>
<th>Topic</th>
<th>Resources</th>
</tr>
</thead>
<tbody>
<tr>
<td>19-03-2022</td>
<td>Transfer Learning</td>
<td><a href="https://colab.research.google.com/drive/1-vGmphxTo4Zen2PBp6HSFPg-V9nP5w4i" target="_blank" rel="noopener">Colab Notebook</a></td>
</tr>
<tr>
<td>26-03-2022</td>
<td>Introduction to RL</td>
<td><a href="https://colab.research.google.com/drive/1EhehhDzu5ak5uaC3H0xsRI6OR-Y6Yzzj?usp=sharing" target="_blank" rel="noopener">Colab notebook</a></td>
</tr>
</tbody>
</table>
<ul>
<li>All the resources for this semester are compiled <a href="https://cliff-tv-2e1.notion.site/VLG-Discussion-Resources-0a370b2c4aa34ba88580e8fcd0403de1" target="_blank" rel="noopener">here</a></li>
</ul>
</description>
</item>
<item>
<title>Weekly AI quiz on Instagram</title>
<link>https://vlgiitr.github.io/recents/weekly_quiz/</link>
<pubDate>Mon, 08 Aug 2022 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/recents/weekly_quiz/</guid>
<description><p>VLG now organises weekly quizes on out <a href="https://www.instagram.com/vlgiitr/" target="_blank" rel="noopener">Instagram</a>. Hop in every Wednesday and flex your DL knowledge !</p>
</description>
</item>
<item>
<title>Keys to Better Image Inpainting: Structure and Texture Go Hand in Hand</title>
<link>https://vlgiitr.github.io/publication/imageimpainting/</link>
<pubDate>Fri, 05 Aug 2022 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/publication/imageimpainting/</guid>
<description></description>
</item>
<item>
<title>DL Infographics on Instagram</title>
<link>https://vlgiitr.github.io/recents/ai_bi_weekly_infographs/</link>
<pubDate>Mon, 01 Aug 2022 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/recents/ai_bi_weekly_infographs/</guid>
<description><p>We are starting our forthnightly infographic series on <a href="https://www.instagram.com/vlgiitr/" target="_blank" rel="noopener">Instagram</a></p>
</description>
</item>
<item>
<title>Members of VLG grab Gold Medal in Inter-IIT Tech Meet 10.0</title>
<link>https://vlgiitr.github.io/recents/inter-iit/</link>
<pubDate>Thu, 07 Jul 2022 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/recents/inter-iit/</guid>
<description><p>Members of VLG <em>Harsh Kumar</em>, &ldquo;<em>Kumar Devesh</em>&rdquo;, &ldquo;<em>Sarthak Gupta</em>&rdquo;, participated in the High Prep Event - &ldquo;Bosch model extraction attack for video classification&rdquo; and grabbed GOLD Medal. Congratulations !!</p>
</description>
</item>
<item>
<title>Language Guided Meta-Control for Embodied Instruction Following</title>
<link>https://vlgiitr.github.io/publication/metacontrolforembodiedinstructionfollowing/</link>
<pubDate>Wed, 01 Jun 2022 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/publication/metacontrolforembodiedinstructionfollowing/</guid>
<description></description>
</item>
<item>
<title>Leveraging Dependency Grammar for Fine-Grained Offensive Language Detection using Graph Convolutional Networks</title>
<link>https://vlgiitr.github.io/publication/levragingdependencygrammer/</link>
<pubDate>Wed, 01 Jun 2022 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/publication/levragingdependencygrammer/</guid>
<description></description>
</item>
<item>
<title>Password Cracking</title>
<link>https://vlgiitr.github.io/blogs/password_cracking/</link>
<pubDate>Mon, 30 May 2022 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/blogs/password_cracking/</guid>
<description><p>On hearing the term &ldquo;password-cracking,&rdquo; many will think this post will be about how to guess someone&rsquo;s password or somewhat similar, but the reality is not always so satisfying.</p>
</description>
</item>
<item>
<title>Password Cracking</title>
<link>https://vlgiitr.github.io/posts/password_cracking/</link>
<pubDate>Mon, 30 May 2022 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/posts/password_cracking/</guid>
<description><p>On hearing the term &ldquo;password-cracking,&rdquo; many will think this post will be about how to guess someone&rsquo;s password or somewhat similar, but the reality is not always so satisfying.</p>
<h1 id="what-is-password-cracking"><strong>What is Password Cracking</strong></h1>
<p>In general, whenever anybody types a password on any device or software, passwords don&rsquo;t get stored in the raw format in the database. Instead, raw passwords are first passed through the hashing algorithm, which converts the raw passwords into some particular sequence of letters, numbers, and special characters which looks entirely random for an ordinary being.</p>
<p>Now there are several password database leaks and breaches all over the world. One such dataset is Rockyou Dataset, which contains <strong>about 31 million passwords;</strong> this is a widely used dataset because this dataset contains passwords in plain text format without any hashing. Most password cracking algorithms are either trained using this dataset or have used this for dictionary attacks.</p>
<p>These algorithms decrypt the hashes of the passwords obtained from other password databases leaks. These algorithms generate passwords, hashes them using the encryption algorithm, and then compare the hash with the hashes present in the database; if the hash matches, Bingo! We got the password corresponding to that hash; otherwise, keep generating and comparing passwords. Password hashes generated by the encryption algorithm are such that they can&rsquo;t be reverted. Passwords, once hashed, can not be converted back into passwords by any algorithm other than brute force attacks over the hash of every possible password.</p>
<h1 id="hashcat">Hashcat</h1>
<p>Hashcat is one of the most popular and widely used password crackers. It uses various kinds of attacks for cracking the passwords like:</p>
<ul>
<li><em>Dictionary attack</em>: Trying all the passwords present in a list or database.</li>
<li><em>Combinator attack</em>: Trying concatenating words from multiple wordlists.</li>
<li><em>Mask attack</em>: Trying all the characters given in charsets, per position.</li>
<li><em>Hybrid attack</em>: Trying combining wordlist and masks.</li>
<li><em>Association attack</em>: Use a piece of information that could have had an influence on password generation to attack a specific hash.</li>
</ul>
<p>In addition to these, Hashcat enables high-parallelized password cracking and the ability to support a distributed hash cracking system via overlays.</p>
<h1 id="probabilistic-context-free-grammar">Probabilistic Context-Free Grammar</h1>
<p>Context-free grammars have been in the study of natural languages, where they are used to generate strings with a particular structure. Probabilistic context-free grammar is a probabilistic approach to traditional context-free grammar; it incorporates available information about the probability distribution of user passwords. This information is used to generate password patterns in order of decreasing probability. At the same time, these structures can be either password guesses or word-mangling templates that can be filled by dictionary words. Here&rsquo;s a brief overview of how probabilistic context-free grammar is used in password cracking:</p>
<p><strong><em>Preprocessing:</em></strong> In this phase, frequencies of specific patterns are measured associated with the password string. In this, the author denotes the alpha string (sequence of alphabet symbols) by <strong>L</strong>, digit string as <strong>D</strong>, and special strings(sequence of non-alpha and non-digit symbols) as <strong>S.</strong> For password &ldquo;$password123&rdquo;, structure of the password would be <strong>SLD</strong>, base structure would also be similar to structure except that it would also incorporate the length of strings, so base structure would be <strong>S¹L⁸D³</strong>. The preterminal structure fills in the value of <strong>S</strong> and <strong>D</strong> in the base structure, whereas the terminal structure (guess) would fill the value of <strong>L</strong> in the preterminal structure.</p>
<p><img src="grammar.jpg" alt="&lsquo;grammar&rsquo;"></p>
<p><strong><em>Using Probabilistic Grammars:</em></strong> A mathematical form of defining context-free grammar as <strong>G = (V, Σ, S, P),</strong> where: <strong>V</strong> is a finite set of variables (or non-terminals), <strong>Σ</strong> is a finite set of terminals, <strong>S</strong> is the start variable, and <strong>P</strong> is a finite set of productions of the form <strong>α → β</strong> where <strong>α</strong> is a single variable and <strong>β</strong> is a string consisting of variables or terminals. Probabilistic context-free grammars have probabilities associated with each production such that for a specific left-hand-side variable, all the associated productions add up to 1. A string derived from the start symbol is called a sentential form. The probability of sentential form is simply the product of the possibilities of the productions used in its derivation. As the production rules don&rsquo;t have any data to rewrite alpha variables to alpha strings, thus sentential forms can be maximally derived up to the terminal digits and special characters with alpha <em>variables.</em> These sentential forms are the pre-terminal structures. The main idea is that preterminal structures define mangling rules that can be directly used in a distributed password cracking trial on passing them to the distributed system to fill in the alpha variables with dictionary words and hash the guesses.</p>
<p><img src="tree.jpg" alt="&rsquo;tree&rsquo;">
Assigning pre-terminal structure with probability</p>
<p>In order to generate pre-terminal structures in decreasing order of probability, authors used the approach to output all the probable pre-terminal structures, evaluate them on probability, and then sort the results. However, this pre-computation step is not parallelizable with the password cracking step that follows. Now to generate terminal structures from the pre-terminal structure, one approach is to simply fill in all relevant dictionary words for the highest pre-terminal structure and then choose the next highest probable pre-terminal structure. This approach does not further assign probabilities to the dictionary words and does not learn the specific replacement of alpha variables from the training set. This approach is called <em>pre-terminal probability order.</em> Another approach is to assign probabilities to alpha strings in various ways. For instance, it is possible to assign probabilities to words in a dictionary based on how many words of that length appear, observed use of the word, frequency of appearance in language, or knowledge about the target. This approach is called <em>terminal probability order.</em> This approach does assign each terminal structure (password guesses) a well-defined probability.</p>
<p>For comparing the performance of probabilistic context-free grammars, the authors used a standard open-source password cracking program, John the Ripper. The authors used a total of six publicly available input dictionaries to use in our tests. Four of them, &ldquo;English_lower&rdquo;, &ldquo;Finnish_lower&rdquo;, &ldquo;Swedish_lower&rdquo; and &ldquo;Common_Passwords&rdquo; were obtained from John the Ripper&rsquo;s public website. Additionally &ldquo;dic-0294&rdquo; input dictionary was obtained from a password-cracking site, and &ldquo;English-wiki&rdquo; input dictionary is based on the English words gathered from <a href="http://www.wiktionary.org" target="_blank" rel="noopener">www.wiktionary.org</a>.</p>
<p><img src="graph.jpg" alt="&lsquo;graph&rsquo;">
Number of passwords cracked against Myspace list</p>
<p>Passwords Cracked by the <em>Terminal probability order</em> approach of Probabilistic context-free grammar are the highest. It gave an improvement over John the Ripper from 28% to 129% more passwords cracked given the same number of guesses. Additionally, when we used the <em>preterminal order</em>, we also achieved better results than John the Ripper in all cases but one, though less than what we achieved using <em>terminal probability order</em>.</p>
<h1 id="passgan">PassGAN</h1>
<p>PassGAN is an example of generative adversarial networks (GANs). GANs are essentially an adversarial framework of multilayer perceptions made up of a generator and discriminator. The generator tries to generate data samples similar to the training data and fool the discriminator. In contrast, the discriminator tries to maximize the probability of assigning the correct label to both the training examples and samples generated by the generator. They both end up playing the minimax game and optimizing the value function V (G, D) for the password distributions.</p>
<p><img src="formula.jpg" alt="&lsquo;formula&rsquo;"></p>
<p>Generative modeling relies on closed-form expressions that generally aren&rsquo;t capable of noisy real data. PassGAN trains a generative deep neural network that takes as input a multi-dimensional random sample of passwords formed in a Gaussian distribution to generate a sample of the desired target distribution.</p>
<p>The generator of PassGAN takes input to the reshape node followed by five residual blocks, whereas each block consists of 1D convolutional blocks connected by relu functions, and the final output is the weighted sum of outputs from the conv1D block and the residual identity connection of input. Residual blocks are then followed by a 1D convolutional node which outputs to the softmax node to generate probability distribution in the character set.</p>
<p><img src="residual.jpg" alt="&lsquo;residual&rsquo;"></p>
<p>The discriminator of PassGAN has a very similar architecture to the generator, except that it is in the opposite order as compared to generator. Input after a transpose operation is fed to a 1D convolutional block which is followed by five residual blocks whose architecture is similar to residual blocks used in the case of generator. Output from the final residual block after having a reshape operation is given a linear transformation function which leads to the final output.</p>
<p><img src="PassGAN.jpg" alt="&lsquo;PassGAN&rsquo;"></p>
<p>For the training purpose, 2.5 million passwords were sampled uniformly from the RockYou dataset, length of the passwords was restricted to 10 in order to make training computationally feasible. For testing purposes, additional 2.5 million passwords were sampled exclusive to the training set passwords. To evaluate the trained model, 5 million passwords are generated from the generator network and compared to the test data, about 5.5% (274965) generated passwords were found in the data set, whereas 63110 among them were unique. For calculating the strength of passwords cracked by PassGAN, researchers used <strong>zxcbvn.</strong> zxcbvn is a low-budget password strength estimator. Its algorithm returns an integer strength bar from 0 - 4, estimating a higher strength with a higher score. Passwords, those PassGAN was able to crack scored 1.59 and landed an average guess per password to be 5.32 x 10⁶.</p>
<h1 id="genpass">GENPass</h1>
<p>As we have seen, PCFG(Probabilistic context-free grammar), is based on statistical probability. These approaches require a large amount of calculation, which is time-consuming. In PassGAN, neural networks are able to predict more accurate passwords, however, they are not qualified for cross-site attacks as each dataset has its own features.</p>
<p>GENPass tries to generalize on those leaked passwords and improve the performance in cross-site attacks. GENPass is a multi-source deep learning model that learns from several datasets and ensures the output wordlist can maintain high accuracy for different datasets using adversarial generation. Now before we proceed further, first define what is &ldquo;general&rdquo;.</p>
<blockquote>
<p>Definition (what is “general”) : Assume a training set T containing m leaked password datasets D1, D2, D3,…,Dm. Model Gt is obtained by training T. Model Gi is obtained by training Di (i ∈ [1, m]). If Model Gt can guess dataset Dk (Dk∉ D1, D2, D3,…,Dm) better than Gi (i ∈ [1, m]), model Gt is believed to be general.</p>
</blockquote>
<p>For generating passwords, PCFG + LSTM models, also called PL models, comes into the picture. The preprocessing step is performed by PCFG. Passwords are first encoded into a sequence of units. Each unit has a char and a number. A char stands for a sequence of letters (L), digits (D), special chars (S), or an end character (&rsquo;\n&rsquo;), and the number stands for the length of the sequence. A table is generated when we preprocess the passwords. LSTM is a widely used RNN variant, which generates the probability of the next element based on the context elements. Each LSTM model unit maintains a state Ct at time t, and three sigmoid gates control the data flow in the unit, namely the input gate, the forget gate, and the output gate. The output is calculated as follows:</p>
<p><img src="formula2.jpg" alt="&lsquo;formula2&rsquo;"></p>
<p>LSTM is used to generate passwords. By feeding the LSTM model the preprocessed wordlist and training it, the model can predict the next unit. When a unit is determined, it is transformed back into an alphabetic sequence according to the table generated during the preprocessing step. The LSTM model will output a list of units with their corresponding probabilities, if units are chosen according to the highest weight, then a large number of duplicates will be created in the output wordlist, so the unit is chosen by sampling from discrete distribution. This ensures that higher-weight candidates are chosen with higher probability, while lower ones can still be chosen after a number of guesses. This procedure is called <em>weight choosing.</em></p>
<p>PL model is suitable for only one dataset, not for several datasets simultaneously. Different datasets have different underlying principles and lengths, whereas simply mixing datasets would make it difficult for the model to learn the general principles. To solve this multi-source training problem, GENPass comes into the picture.</p>
<p><em>Prediction of Model n:</em> For all the different datasets, we train a different PL model, thus, the model can output the result with its own principle.</p>
<p><em>Weight Choosing:</em> It is assumed that all the PL model have the same probability, so the output from each model are combined, the combined list will be the input of the weight choosing process, and the final output will be chosen by sampling from discrete distribution.</p>
<p><img src="weight.jpg" alt="&lsquo;weight&rsquo;"></p>
<p><em>Classifier and Discriminator:</em> The classifier is a CNN classifier trained by raw passwords without preprocessing from different datasets. Given a password, the classifier can tell which dataset the password most likely comes from. Through a softmax layer, the output will be a list of numbers with a sum of one. Discriminator takes the classifier&rsquo;s output and accepts those passwords that have a consistent probability of appearance in different datasets so that the output passwords can be &ldquo;general&rdquo;.</p>
<p>If C is too large, the generated unit will be discarded; otherwise, it will be accepted. In the model threshold value of C is set to 0.2.</p>
<p><img src="classifier.jpg" alt="&lsquo;classifier&rsquo;"></p>
<p>Evaluation: To evaluate the PL model, it is trained with Myspace and phpBB password datasets. After each training session, the model generated a new wordlist. GENPass is also trained on the same Myspace, and phpBB password datasets and wordlist are generated. The authors trained the PL model with a single mixture of two wordlists and compared the result with the GENPass model.</p>
<p><img src="graph2.jpg" alt="&lsquo;graph2&rsquo;"></p>
<p>Here it is clear that the GENPass model outperforms all other models. Using raw LSTM without any preprocessing performs the worst. Using PL to learn Myspace alone performs second best, which proves Myspace is a good dataset. Simply mixing two datasets does not improve the matching rate.</p>
<h1 id="references">References</h1>
<p><a href="https://hashcat.net/wiki/" target="_blank" rel="noopener">Hashcat official page</a></p>
<p><a href="http://courses.csail.mit.edu/6.857/2019/project/9-Nepal-Kontomah-Oguntola-Wang.pdf" target="_blank" rel="noopener">Adversarial Password Cracking</a></p>
<p><a href="https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8832180" target="_blank" rel="noopener">GENPass: A Multi-Source Deep Learning Model for Password Guessing</a></p>
<p><a href="https://www.researchgate.net/publication/220713709_Password_Cracking_Using_Probabilistic_Context-Free_Grammars" target="_blank" rel="noopener">Password Cracking Using Probabilistic Context-Free Grammars</a></p>
</description>
</item>
<item>
<title>On The Cross-Modal Transfer from Natural Language to Code through Adapter Modules</title>
<link>https://vlgiitr.github.io/publication/crossmodeltransfenlp/</link>
<pubDate>Sun, 01 May 2022 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/publication/crossmodeltransfenlp/</guid>
<description></description>
</item>
<item>
<title>RelTransformer: A Transformer-Based Long-Tail Visual Relationship Recognition</title>
<link>https://vlgiitr.github.io/publication/reitransformer/</link>
<pubDate>Tue, 29 Mar 2022 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/publication/reitransformer/</guid>
<description></description>
</item>
<item>
<title>SeMask: Semantically Masked Transformers for Semantic Segmentation</title>
<link>https://vlgiitr.github.io/publication/semask/</link>
<pubDate>Thu, 23 Dec 2021 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/publication/semask/</guid>
<description></description>
</item>
<item>
<title>Autumn 2021 Discussions</title>
<link>https://vlgiitr.github.io/previous_discussions/aut_2021_discussion/</link>
<pubDate>Wed, 20 Oct 2021 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/previous_discussions/aut_2021_discussion/</guid>
<description><hr>
<p>We conduct two discussions every week where we dicuss the basic concepts and recent advancements in the field of Deep Learning. Join our <a href="https://teams.microsoft.com/l/team/19%3a0e691fdb81664f3b97d753311d437996%40thread.tacv2/conversations?groupId=34aceeff-a8f6-4efc-b650-4376d252c5f7&amp;tenantId=38f62926-7559-4aef-84ae-cb5e172406fb" target="_blank" rel="noopener">MS Team</a> to attend the discussions!
Team Code : z1q54os</p>
<h3 id="basic-discussions">Basic Discussions</h3>
<p>We discuss a few fundamental concepts on Wednesdays.</p>
<table>
<thead>
<tr>
<th>Date</th>
<th>Topic</th>
<th>Resources</th>
</tr>
</thead>
<tbody>
<tr>
<td>18-08-2021</td>
<td>Introduction to GANs</td>
<td><a href="https://docs.google.com/presentation/d/1LqnIAr49wZHktXYtaiL3e1sk3Ar0P9nmEI_rg6MTXEg/edit" target="_blank" rel="noopener">Slides</a></td>
</tr>
<tr>
<td>25-08-2021</td>
<td>VAE</td>
<td><a href="https://docs.google.com/presentation/d/1HFygX3n7pjUJ35gVG-g7vP0dXiqevhtcGhT-SMtURX8/edit" target="_blank" rel="noopener">Slides</a></td>
</tr>
<tr>
<td>01-09-2021</td>
<td>Sequence Modelling</td>
<td><a href="https://docs.google.com/presentation/d/1DECew5g-z7jMBp-pwFmw_r87FdI3Ci7I/edit" target="_blank" rel="noopener">Slides</a></td>
</tr>
<tr>
<td>08-09-2021</td>
<td>Transformers and Attention</td>
<td><a href="https://docs.google.com/presentation/d/1p-A5TRKe2YJTkaA6O8wUMNlyHJDYsRrC8Suihuy-HWs/edit" target="_blank" rel="noopener">Slides</a></td>
</tr>
<tr>
<td>29-09-2021</td>
<td>Reinforcement Learning</td>
<td><a href="https://docs.google.com/presentation/d/1roMFcU5rfLrdB4RKndg7bENj6aYrTxBx/edit" target="_blank" rel="noopener">Slides</a></td>
</tr>
</tbody>
</table>
<h3 id="workshops">Workshops</h3>
<table>
<thead>
<tr>
<th>Date</th>
<th>Topic</th>
<th>Resources</th>
</tr>
</thead>
<tbody>
<tr>
<td>13-10-2021</td>
<td>Transfer Learning</td>
<td><a href="https://drive.google.com/file/d/1rCrBIrynEqBBkr1SIkRXdsDB-jUVpZ52/edit" target="_blank" rel="noopener">Jupyter Notebook</a></td>
</tr>
<tr>
<td>20-10-2021</td>
<td>VAE</td>
<td><a href="">Jupyter Notebook</a></td>
</tr>
</tbody>
</table>
<h3 id="advanced-discussions">Advanced Discussions</h3>
<p>We discuss the latest papers published in top tier conferences on Saturdays.</p>
<table>
<thead>
<tr>
<th><div style="width:75px">Date</div></th>
<th>Paper 1</th>
<th><div style="width:120px">Link</div></th>
<th>Paper 2</th>
<th><div style="width:120px">Link</div></th>
</tr>
</thead>
<tbody>
<tr>
<td>18-08-2021</td>
<td>Per-Pixel Classification is Not All You Need for Semantic Segmentation</td>
<td><a href="https://arxiv.org/abs/2107.06278" target="_blank" rel="noopener">Paper</a></td>
<td>Distilling the Knowledge in a Neural Network</td>
<td><a href="https://arxiv.org/abs/1503.02531" target="_blank" rel="noopener">Paper</a></td>
</tr>
<tr>
<td>25-08-2021</td>
<td>Large Scale Image Completion via Co-Modulated Generative Adversarial Networks</td>
<td><a href="https://arxiv.org/abs/2103.10428v1" target="_blank" rel="noopener">Paper</a></td>
<td></td>
<td></td>
</tr>
<tr>
<td>09-11-2021</td>
<td>Diverse Part Discovery: Occluded Person Re-Identification With Part-Aware Transformer</td>
<td><a href="https://openaccess.thecvf.com/content/CVPR2021/papers/Li_Diverse_Part_Discovery_Occluded_Person_Re-Identification_With_Part-Aware_Transformer_CVPR_2021_paper.pdf" target="_blank" rel="noopener">Paper</a></td>
<td></td>
<td></td>
</tr>
<tr>
<td>02-10-2021</td>
<td>Towards Compact CNNs via Collaborative Compression</td>
<td><a href="https://openaccess.thecvf.com/content/CVPR2021/papers/Li_Towards_Compact_CNNs_via_Collaborative_Compression_CVPR_2021_paper.pdf" target="_blank" rel="noopener">Paper</a></td>
<td>Is Space-Time Attention All You Need for Video Understanding?</td>
<td><a href="https://arxiv.org/pdf/2102.05095.pdf" target="_blank" rel="noopener">Paper</a></td>
</tr>
<tr>
<td>09-10-2021</td>
<td>Rethinking Attention with Performers</td>
<td><a href="https://arxiv.org/pdf/2009.14794.pdf" target="_blank" rel="noopener">Paper</a></td>
<td>Reformer: The Efficient Transformer</td>
<td><a href="https://arxiv.org/pdf/2001.04451.pdf" target="_blank" rel="noopener">Paper</a></td>
</tr>
<tr>
<td>16-10-2021</td>
<td>Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks</td>
<td><a href="https://arxiv.org/pdf/1810.09536.pdf" target="_blank" rel="noopener">Paper</a></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
</description>
</item>
<item>
<title>Exploring Long tail Visual Relationship Recognition with Large Vocabulary</title>
<link>https://vlgiitr.github.io/publication/longtailvisualrelationshipwithlargevocab/</link>
<pubDate>Sat, 25 Sep 2021 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/publication/longtailvisualrelationshipwithlargevocab/</guid>
<description></description>
</item>
<item>
<title>Spring 2021 Discussions</title>
<link>https://vlgiitr.github.io/previous_discussions/spring_2021_discussion/</link>
<pubDate>Thu, 19 Aug 2021 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/previous_discussions/spring_2021_discussion/</guid>
<description><hr>
<p>We conduct two discussions every week where we dicuss the basic concepts and recent advancements in the field of Deep Learning.</p>
<h3 id="basic-discussions">Basic Discussions</h3>
<p>We discuss a few fundamental concepts on Wednesdays.</p>
<table>
<thead>
<tr>
<th>Date</th>
<th>Topic</th>
<th>Resources</th>
</tr>
</thead>
<tbody>
<tr>
<td>14-04-2021</td>
<td>Linear Algebra</td>
<td><a href="https://docs.google.com/presentation/d/1rHrOqCQUuqUuKzB_BmXECXc0yC0IXJXdsym35gNJQlY/edit?usp=sharing" target="_blank" rel="noopener">Slides</a></td>
</tr>
<tr>
<td>21-04-2021</td>
<td>Probability and Stats</td>
<td><a href="https://drive.google.com/file/d/1l6DZk87_LxOqlfuDejCd2p7bkndRplDk/view?usp=sharing" target="_blank" rel="noopener">Slides</a></td>
</tr>
<tr>
<td>05-05-2021</td>
<td>Neural Networks and CNNs</td>
<td><a href="https://drive.google.com/file/d/1S1KvnC3avYr7I9PiI4FvC_aMg4TkLzON/view?usp=sharing" target="_blank" rel="noopener">Slides-A</a> <a href="https://drive.google.com/file/d/1wXTUzBxQBwLQvrtZOudMDva_r1E_U8SG/view?usp=sharing" target="_blank" rel="noopener">Slides B</a></td>
</tr>
</tbody>
</table>
<h3 id="advanced-discussions">Advanced Discussions</h3>
<p>We discuss the latest papers published in top tier conferences on Saturdays.</p>
<table>
<thead>
<tr>
<th><div style="width:75px">Date</div></th>
<th>Paper 1</th>
<th><div style="width:120px">Link</div></th>
<th>Paper 2</th>
<th><div style="width:120px">Link</div></th>
</tr>
</thead>
<tbody>
<tr>
<td>20-02-2021</td>
<td>GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields</td>
<td><a href="https://arxiv.org/abs/2011.12100" target="_blank" rel="noopener">https://arxiv.org/abs/2011.12100</a></td>
<td>Swapping Autoencoder for Deep Image Manipulation</td>
<td><a href="https://taesung.me/SwappingAutoencoder/" target="_blank" rel="noopener">https://taesung.me/SwappingAutoencoder/</a></td>
</tr>
<tr>
<td>27-02-2021</td>
<td>ActionBytes: Learning from Trimmed Videos to Localize Actions</td>
<td><a href="https://openaccess.thecvf.com/content_CVPR_2020/papers/Jain_ActionBytes_Learning_From_Trimmed_Videos_to_Localize_Actions_CVPR_2020_paper.pdf" target="_blank" rel="noopener">https://openaccess.thecvf.com/content_CVPR_2020/papers/Jain_ActionBytes_Learning_From_Trimmed_Videos_to_Localize_Actions_CVPR_2020_paper.pdf</a></td>
<td>The Importance of Modeling Data Missingness in Algorithmic Fairness: A Causal Perspective</td>
<td><a href="https://arxiv.org/abs/2012.11448" target="_blank" rel="noopener">https://arxiv.org/abs/2012.11448</a></td>
</tr>
<tr>
<td>13-03-2021</td>
<td>Ultra-Data-Efficient GAN Training: Drawing A Lottery Ticket First, Then Training It Toughly</td>
<td><a href="https://arxiv.org/abs/2103.00397v1" target="_blank" rel="noopener">https://arxiv.org/abs/2103.00397v1</a></td>
<td>An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale</td>
<td><a href="https://arxiv.org/abs/2010.11929" target="_blank" rel="noopener">https://arxiv.org/abs/2010.11929</a></td>
</tr>
<tr>
<td>24-04-2021</td>
<td>PREDICT &amp; CLUSTER: Unsupervised Skeleton Based Action Recognition</td>
<td><a href="https://arxiv.org/abs/1911.12409" target="_blank" rel="noopener">https://arxiv.org/abs/1911.12409</a></td>
<td>Dynamic Convolutions: Exploiting Spatial Sparsity for Faster Inference</td>
<td><a href="https://arxiv.org/abs/1912.03203" target="_blank" rel="noopener">https://arxiv.org/abs/1912.03203</a></td>
</tr>
<tr>
<td>01-05-2021</td>
<td>SCOUT: Self-aware Discriminant Counterfactual Explanations</td>
<td><a href="https://openaccess.thecvf.com/content_CVPR_2020/papers/Wang_SCOUT_Self-Aware_Discriminant_Counterfactual_Explanations_CVPR_2020_paper.pdf" target="_blank" rel="noopener">https://openaccess.thecvf.com/content_CVPR_2020/papers/Wang_SCOUT_Self-Aware_Discriminant_Counterfactual_Explanations_CVPR_2020_paper.pdf</a></td>
<td>High-performance brain-to-text communication via imagined handwriting</td>
<td><a href="https://www.biorxiv.org/content/10.1101/2020.07.01.183384v1.full.pdf" target="_blank" rel="noopener">https://www.biorxiv.org/content/10.1101/2020.07.01.183384v1.full.pdf</a></td>
</tr>
</tbody>
</table>
</description>
</item>
<item>
<title>New Blog on Learnings during research published on Medium!</title>
<link>https://vlgiitr.github.io/recents/noisy_research_blog/</link>
<pubDate>Wed, 28 Jul 2021 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/recents/noisy_research_blog/</guid>
<description><p>The new blog on the topic <strong>Riding the Noisy Research Track</strong> by <em>Jitesh Jain</em> is now published on Medium. In this blog, one of our members shares his experience and learnings in research, and covers some essential guidelines for a beginner in this field. Do give the blog a read <a href="https://medium.com/vlgiitr/riding-the-noisy-research-track-4035e64e7ea8" target="_blank" rel="noopener">here</a>.</p>
</description>
</item>
<item>
<title>Riding the Noisy Research Track</title>
<link>https://vlgiitr.github.io/blogs/noisy_research_track/</link>
<pubDate>Wed, 28 Jul 2021 00:00:00 +0000</pubDate>
<guid>https://vlgiitr.github.io/blogs/noisy_research_track/</guid>
<description><p>It&rsquo;s pretty common to get fascinated by the idea of research. But sometimes we lose intrest midway through it. Ride this noisy track with one of our undergrad reasercher !</p>
</description>
</item>