In this notebook I look at factorizing large kernels into a sequence of smaller ones. Specifically, I take a particular 7×7 image filter that I want to implement as a convolution of two 3×3 filters. Along the way I talk about nonlinear optimization and show how to set up this kind of problem in PyTorch.
Check it out here.
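To give a flavor of the approach, here is a minimal sketch of kernel factorization as a nonlinear least-squares problem in PyTorch. It is not the notebook's code: the `compose` helper, the random target, the optimizer, and the hyperparameters are all assumptions made for illustration. One size assumption in particular: chaining n 3×3 kernels by full convolution produces a (2n+1)×(2n+1) kernel, so this sketch uses three factors to reach 7×7, and the notebook's own setup may differ.

```python
import torch
import torch.nn.functional as F

def compose(factors):
    """Full 2-D convolution of a list of small kernels into one larger kernel.

    Hypothetical helper for this sketch, not taken from the notebook.
    """
    out = factors[0]
    for k in factors[1:]:
        pad = k.shape[-1] - 1  # "full" padding so the support grows by k-1
        # conv2d computes cross-correlation, so flip k to get a true convolution
        out = F.conv2d(out[None, None], k.flip(0, 1)[None, None], padding=pad)[0, 0]
    return out

torch.manual_seed(0)

# Assumed target: a 7x7 kernel built from three hidden 3x3 factors,
# so an exact factorization is known to exist.
target = compose([torch.randn(3, 3) for _ in range(3)])

# Trainable 3x3 factors, randomly initialized.
factors = [torch.randn(3, 3, requires_grad=True) for _ in range(3)]
opt = torch.optim.Adam(factors, lr=0.02)

with torch.no_grad():
    initial_loss = ((compose(factors) - target) ** 2).sum().item()

for step in range(1000):
    opt.zero_grad()
    loss = ((compose(factors) - target) ** 2).sum()  # squared Frobenius error
    loss.backward()
    opt.step()

with torch.no_grad():
    final_loss = ((compose(factors) - target) ** 2).sum().item()

print(f"loss: {initial_loss:.4f} -> {final_loss:.6f}")
```

The objective is a polynomial in the factor entries rather than a quadratic, so the problem is non-convex and gradient descent may settle in a local minimum; restarting from several random initializations is a common way to check the result.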