Deciphering the inner workings of PyTorch code

I am struggling to understand one part of a ResNet implementation. The complete code is available here: https://github.com/yunjey/pytorch-tutorial/blob/master/tutorials/02-intermediate/deep_residual_network/main-gpu.py. Please note that my Python knowledge is limited.

# Residual Block
class ResidualBlock(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1, downsample=None):
        super(ResidualBlock, self).__init__()
        self.conv1 = conv3x3(in_channels, out_channels, stride)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = conv3x3(out_channels, out_channels)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.downsample = downsample

    def forward(self, x):
        residual = x
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        out = self.conv2(out)
        out = self.bn2(out)
        if self.downsample:
            residual = self.downsample(x)
        out += residual
        out = self.relu(out)
        return out

# ResNet Module
class ResNet(nn.Module):
    def __init__(self, block, layers, num_classes=10):
        super(ResNet, self).__init__()
        self.in_channels = 16
        self.conv = conv3x3(3, 16)
        self.bn = nn.BatchNorm2d(16)
        self.relu = nn.ReLU(inplace=True)
        self.layer1 = self.make_layer(block, 16, layers[0])
        self.layer2 = self.make_layer(block, 32, layers[1], 2)
        self.layer3 = self.make_layer(block, 64, layers[2], 2)
        self.avg_pool = nn.AvgPool2d(8)
        self.fc = nn.Linear(64, num_classes)

    def make_layer(self, block, out_channels, blocks, stride=1):
        downsample = None
        if (stride != 1) or (self.in_channels != out_channels):
            downsample = nn.Sequential(
                conv3x3(self.in_channels, out_channels, stride=stride),
                nn.BatchNorm2d(out_channels))
        layers = []
        layers.append(block(self.in_channels, out_channels, stride, downsample))
        self.in_channels = out_channels
        for i in range(1, blocks):
            layers.append(block(out_channels, out_channels))
        return nn.Sequential(*layers)

    def forward(self, x):
        out = self.conv(x)
        out = self.bn(out)
        out = self.relu(out)
        out = self.layer1(out)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.avg_pool(out)
        out = out.view(out.size(0), -1)
        out = self.fc(out)
        return out

resnet = ResNet(ResidualBlock, [3, 3, 3])

My main question is why 'block' has to be passed as an argument each time, specifically in this function:

def make_layer(self, block, out_channels, blocks, stride=1):

Instead of passing 'block', why can't we instantiate 'ResidualBlock' and append it to layers like so?

   block = ResidualBlock(self.in_channels, out_channels, stride, downsample)
   layers.append(block)

Answer №1

The ResNet module is designed to be generic, so that networks can be built from different kinds of blocks. If you do not pass the block class in as a parameter, you have to hard-code the block's class name, as shown below.

# Residual Block
class ResidualBlock(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1, downsample=None):
        super(ResidualBlock, self).__init__()
        self.conv1 = conv3x3(in_channels, out_channels, stride)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = conv3x3(out_channels, out_channels)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.downsample = downsample

    def forward(self, x):
        residual = x
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        out = self.conv2(out)
        out = self.bn2(out)
        if self.downsample:
            residual = self.downsample(x)
        out += residual
        out = self.relu(out)
        return out

# ResNet Module
class ResNet(nn.Module):
    def __init__(self, layers, num_classes=10):
        super(ResNet, self).__init__()
        self.in_channels = 16
        self.conv = conv3x3(3, 16)
        self.bn = nn.BatchNorm2d(16)
        self.relu = nn.ReLU(inplace=True)
        self.layer1 = self.make_layer(16, layers[0])
        self.layer2 = self.make_layer(32, layers[1], 2)
        self.layer3 = self.make_layer(64, layers[2], 2)
        self.avg_pool = nn.AvgPool2d(8)
        self.fc = nn.Linear(64, num_classes)

    def make_layer(self, out_channels, blocks, stride=1):
        downsample = None
        if (stride != 1) or (self.in_channels != out_channels):
            downsample = nn.Sequential(
                conv3x3(self.in_channels, out_channels, stride=stride),
                nn.BatchNorm2d(out_channels))
        layers = []
        layers.append(ResidualBlock(self.in_channels, out_channels, stride, downsample))   # Major change here
        self.in_channels = out_channels
        for i in range(1, blocks):
            layers.append(ResidualBlock(out_channels, out_channels))    # Major change here
        return nn.Sequential(*layers)

    def forward(self, x):
        out = self.conv(x)
        out = self.bn(out)
        out = self.relu(out)
        out = self.layer1(out)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.avg_pool(out)
        out = out.view(out.size(0), -1)
        out = self.fc(out)
        return out

resnet = ResNet([3, 3, 3])

This limits the flexibility of your ResNet module and ties it exclusively to ResidualBlock. If you later introduce a new type of block (say, ResidualBlock2), you would have to write another dedicated module such as ResNet2 just for that block. It is therefore better to write a generic ResNet module that accepts the block as a parameter, so that it can be used with different block types. Note that 'block' in the original code is the class itself, not an instance: make_layer calls it several times, and each call creates a new block object with its own arguments.
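The same idea can be sketched without PyTorch at all. The two block classes and the build_stage helper below are hypothetical stand-ins, not code from the tutorial; they only show that a class can be passed around like any other value and then called to create fresh instances.

```python
class BasicBlock:
    """Hypothetical stand-in for ResidualBlock (no real layers inside)."""

    def __init__(self, in_channels, out_channels, stride=1):
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.stride = stride


class WideBlock(BasicBlock):
    """A second, hypothetical block type with the same constructor signature."""


def build_stage(block, in_channels, out_channels, blocks, stride=1):
    """Rough mirror of make_layer: `block` is a class, called once per block."""
    # The first block changes the channel count and may use a stride.
    stage = [block(in_channels, out_channels, stride)]
    # The remaining blocks keep the channel count; each call makes a new object.
    for _ in range(1, blocks):
        stage.append(block(out_channels, out_channels))
    return stage


stage_a = build_stage(BasicBlock, 16, 32, blocks=3, stride=2)
stage_b = build_stage(WideBlock, 16, 32, blocks=3, stride=2)

print([type(b).__name__ for b in stage_a])
print([type(b).__name__ for b in stage_b])
```

build_stage works for both classes because it never names a block type itself. Passing in a single ready-made instance would not be enough here, since each stage needs several distinct blocks built with different arguments.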

An illustrative Python analogy for clarity

Imagine you want to write a function that applies a mathematical operation to every element of a list and returns the results. Initially, you might write functions like these:

import math

def exp(inp_list):
    out_list = []
    for num in inp_list:
        out_list.append(math.exp(num))
    return out_list

def floor(inp_list):
    out_list = []
    for num in inp_list:
        out_list.append(math.floor(num))
    return out_list

Here, exponentiation and flooring are each applied to an input list by a separate function. A more flexible approach is to write one generic function that takes the operation as a parameter:

def apply_func(fn, inp_list):
    out_list = []
    for num in inp_list:
        out_list.append(fn(num))
    return out_list

You can now call apply_func(math.exp, inp_list) to exponentiate and apply_func(math.floor, inp_list) to floor. Any other function that takes a single number can be passed in the same way.
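As a quick illustration, apply_func also accepts functions you define yourself. The sketch below repeats apply_func so that it runs on its own; square and the sample values are made up for this example and are not part of the original answer.

```python
import math


def apply_func(fn, inp_list):
    # Apply fn to every element and collect the results.
    out_list = []
    for num in inp_list:
        out_list.append(fn(num))
    return out_list


def square(num):
    # Hypothetical user-defined operation.
    return num * num


values = [1.5, 2.5, 3.0]

floored = apply_func(math.floor, values)     # library function
squared = apply_func(square, values)         # user-defined function
negated = apply_func(lambda n: -n, values)   # anonymous function

print(floored, squared, negated)
```

In each call the function is passed by name, without parentheses, exactly as ResidualBlock is passed to ResNet.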

Note: This is not a practical example, since Python already offers map and list comprehensions for this purpose, but it illustrates the concept.
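For completeness, the built-in alternatives mentioned in the note might look like this. It is only a rough sketch, and the sample list is arbitrary.

```python
import math

inp_list = [0.0, 1.5, 2.5]

# map takes the function as its first argument, much like apply_func.
floored_with_map = list(map(math.floor, inp_list))

# A list comprehension spells the operation out inline instead.
floored_with_comprehension = [math.floor(num) for num in inp_list]

print(floored_with_map)
print(floored_with_comprehension)
```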
