GitHub - Mountchicken/CodeCookbook: Cookbook for Crafting Good Code

Cookbook to Craft Good Code

Think of good code like a classic, well-fitting piece of clothing: it never goes out of style. Coding is both a science and an art, where neatness and logic come together. What makes some code stand out as great? Three key aspects stand out: readability, simplicity and efficiency, and maintainability.

In this guide, we'll dive into the essentials of crafting great code. We'll cover everything from naming things clearly to tools that make coding better and easier.


1. Readability

Readability in code is akin to clear handwriting in a letter. It's not just about what you write, but how you present it. A well-written piece of code should speak to its reader, guiding them through its logic as effortlessly as a well-told story. Let's delve into some of the key practices that make code readable.

Docstring

A docstring, short for "documentation string," is a string literal that occurs as the first statement in a module, function, class, or method definition. Here are the three most important definitions from PEP 257, Python's official docstring conventions.

Python PEP257
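Because a docstring is the first statement of a definition, Python stores it on the object as its `__doc__` attribute, which is what `help()` and documentation tools read. The sketch below illustrates this with a hypothetical `scale` function (not part of the original guide) and uses the standard `doctest` module to re-run the examples embedded in the docstring.

```python
import doctest


def scale(value: float, factor: float = 2.0) -> float:
    """Multiply a value by a scaling factor.

    Examples:
        >>> scale(3.0)
        6.0
        >>> scale(3.0, factor=0.5)
        1.5
    """
    return value * factor


# The docstring is stored on the function object itself.
summary_line = scale.__doc__.splitlines()[0]
print(summary_line)

# Parse the ">>>" examples out of the docstring and execute them.
parser = doctest.DocTestParser()
test = parser.get_doctest(scale.__doc__, {"scale": scale}, "scale", None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed, results.attempted)
```

Keeping runnable examples in the docstring means the documentation can be checked automatically, so it is less likely to drift away from the code.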

Here are two Python docstring templates, one for functions and one for classes, that may give you a more concrete idea of how to write a docstring.

Template for function

def function_name(param1, param2, ...):
    """A brief description of what the function does.

    A more detailed description of the function if necessary.

    Inputs:
        param1 (Type): Description of param1.
        param2 (Type): Description of param2.

    Returns:
        ReturnType: Description of the return value.

    Raises: (Optional)
        ExceptionType: Explanation of when and why the exception is raised.

    Notes: (Optional)
        Additional notes or examples of usage, if necessary.

    Examples: (Optional)
        >>> function_name(value1, value2)
        Expected return value
    """
    # Function implementation
    ...

Template for class

class ClassName:
    """Brief description of the class's purpose and behavior.

    A more detailed description if necessary.

    Args:
        arg1 (Type): Description of arg1.
        arg2 (Type): Description of arg2.
        ...

    Attributes: (Optional)
        attribute1 (Type): Description of attribute1.
        attribute2 (Type): Description of attribute2.
        ...

    Methods: (Optional)
        method1: Brief description of method1.
        method2: Brief description of method2.
        ...

    Examples: (Optional)
        >>> instance = ClassName(arg1, arg2)
        >>> instance.method1()

    Notes:
        Additional information about the class, if necessary.
    """

    def __init__(self, arg1, arg2, ...):
        # Constructor implementation
        ...

Here are some more detailed examples of docstrings that you can check out:

Detailed examples for docstring

from typing import Union

import torch
import torch.nn as nn
from torchvision.ops.boxes import box_area


# simple functions
def box_iou(boxes1, boxes2):
    """Compute the intersection over union (IoU) between two sets of bounding boxes.

    Inputs:
        boxes1 (Tensor): Bounding boxes in format (x1, y1, x2, y2). Shape (N, 4).
        boxes2 (Tensor): Bounding boxes in format (x1, y1, x2, y2). Shape (M, 4).

    Returns:
        Tuple[Tensor, Tensor]: A tuple containing two tensors:
            iou (Tensor): The IoU between the two sets of bounding boxes. Shape (N, M).
            union (Tensor): The area of the union between the two sets of bounding boxes.
                Shape (N, M).
    """
    area1 = box_area(boxes1)
    area2 = box_area(boxes2)

    # Top-left and bottom-right corners of each pairwise intersection.
    lt = torch.max(boxes1[:, None, :2], boxes2[:, :2])  # [N,M,2]
    rb = torch.min(boxes1[:, None, 2:], boxes2[:, 2:])  # [N,M,2]

    wh = (rb - lt).clamp(min=0)  # [N,M,2]
    inter = wh[:, :, 0] * wh[:, :, 1]  # [N,M]

    union = area1[:, None] + area2 - inter

    # The small epsilon avoids division by zero for empty boxes.
    iou = inter / (union + 1e-6)
    return iou, union

# simple function with dict as input
def create_conv_layer(layer_config):
    """Create a convolutional layer for a neural network based on the provided configuration.

    Inputs:
        layer_config (dict): A dictionary with the following keys:
            'in_channels' (int): The number of channels in the input.
            'out_channels' (int): The number of channels produced by the convolution.
            'kernel_size' (int or tuple): Size of the convolving kernel.
            'stride' (int or tuple, optional): Stride of the convolution. Default: 1
            'padding' (int or tuple, optional): Zero-padding added to both sides of the input.
                Default: 0

    Returns:
        nn.Module: A PyTorch convolutional layer configured according to layer_config.

    Example:
        >>> config = {'in_channels': 1, 'out_channels': 16, 'kernel_size': 3, 'stride': 1, 'padding': 0}
        >>> conv_layer = create_conv_layer(config)
        >>> isinstance(conv_layer, nn.Module)
        True
    """
    return nn.Conv2d(**layer_config)

# simple class
class SimpleConvNet(nn.Module):
    """A simple convolutional neural network wrapper class extending PyTorch's nn.Module.

    This class creates a neural network with a single convolutional layer.

    Args:
        in_channels (int): The number of channels in the input.
        out_channels (int): The number of channels produced by the convolution.
        kernel_size (int or tuple): Size of the convolving kernel.

    Attributes:
        conv_layer (nn.Module): A convolutional layer as defined in the __init__ method.

    Methods:
        forward(x): Defines the forward pass of the network.

    Example:
        >>> net = SimpleConvNet(1, 16, 3)
        >>> isinstance(net, nn.Module)
        True
    """

    def __init__(self, in_channels, out_channels, kernel_size):
        super(SimpleConvNet, self).__init__()
        self.conv_layer = nn.Conv2d(in_channels, out_channels, kernel_size)

    def forward(self, x):
        """Defines the forward pass of the neural network.

        Inputs:
            x (Tensor): The input tensor to the network.

        Returns:
            Tensor: The output tensor after passing through the convolutional layer.
        """
        return self.conv_layer(x)

Type Hinting

Type hinting is like attaching labels to your produce in the grocery store; you know exactly what you're getting. It enhances readability, facilitates debugging, and enables better tooling. Type hinting in Python is a formal way to statically indicate the type of a variable, parameter, or return value. It was introduced in Python 3.5 and is supported by most IDEs and code editors. Let's look at an example:

def add_numbers(a: int, b: int) -> int:
    return a + b

Anyone reading this function signature can quickly understand that the function expects two integers as inputs and will return an integer, and that's the beauty of type hinting. It makes code more readable and self-documenting. It's crucial to understand that type hints in Python do not change the dynamic nature of the language. The interpreter does not enforce them, so on their own they do not prevent runtime type errors; a static type checker such as mypy is what catches mismatches before the code runs.
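To make the "hints, not enforcement" point concrete, here is a small sketch that reuses `add_numbers` from above. It shows that the annotations are stored as metadata on the function, and that a call with the "wrong" types still runs.

```python
def add_numbers(a: int, b: int) -> int:
    return a + b


# The hints are kept as metadata on the function object.
annotations = add_numbers.__annotations__
print(annotations)

# The interpreter does not check them: two strings are accepted and
# concatenated, even though the signature asks for integers.
unchecked = add_numbers("1", "2")
print(unchecked)  # prints 12 (the string "12", not the integer 3)
```

A static type checker would flag the second call, which is why type hints pay off most when a checker or a type-aware editor is part of the workflow.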

Almost all built-in types are supported for type hinting. Let's start with some of Python's built-in types.

- int: Integer number. Example: param: int = 5

- float: Floating point number. Example: param: float = 3.14

- bool: Boolean value (True or False). Example: param: bool = True

- str: String. Example: param: str = "researcher"

We can also use type hinting for more complex types by importing them from the typing module.

Generic Types: List, Tuple, Dict, Set

from typing import List, Tuple, Dict, Set

param: List[int] = [1, 2, 3]
param: Dict[str, int] = {"Time": 12, "Money": 13}
param: Set[int] = {1, 2, 3}
param: Tuple[float, float] = (1.0, 2.0)

Specialized Types: Any, Union, Optional

- Optional: For optional values.

- Union: To indicate that a value can be of multiple types.

- Any: For values of any type.

from typing import Union, Optional, Any

param: Optional[int] = None
param: Union[int, str] = 5
param: Any = "Hello"

Callable Types: For functions and methods.

from typing import Callable

param: Callable[[int], str] = lambda x: str(x)

These are the most common types you'll encounter in Python. For a complete list of supported types, check out the official documentation.
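As a rough sketch of how these pieces fit together in one signature, the hypothetical helper below (`apply_to_all` and the `Number` alias are illustrative names, not from the original guide) combines `List`, `Dict`, `Union`, `Optional`, and `Callable`.

```python
from typing import Callable, Dict, List, Optional, Union

# A type alias keeps long signatures readable.
Number = Union[int, float]


def apply_to_all(values: List[Number],
                 transform: Callable[[Number], Number],
                 label: Optional[str] = None) -> Dict[str, List[Number]]:
    """Apply transform to every value and store the results under a label."""
    key = label if label is not None else "result"
    return {key: [transform(v) for v in values]}


doubled = apply_to_all([1, 2.5, 3], lambda v: v * 2)
named = apply_to_all([1, 2], lambda v: v + 1, label="incremented")
print(doubled)
print(named)
```

The signature alone tells the reader that `label` may be omitted, that `transform` takes one number and returns one, and what shape the result has.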

Now let's look at some examples that combine type hints and docstrings in action.

Type hinting Example

import torch
import torch.nn as nn
import torch.optim as optim
from typing import Tuple, List, Optional


def find_max(numbers: List[int]) -> Optional[int]:
    """Find the maximum number in a list. Returns None if the list is empty.

    Inputs:
        numbers (List[int]): A list of integers.

    Returns:
        Optional[int]: The maximum number in the list, or None if the list is empty.
    """
    return max(numbers) if numbers else None


class SimpleNet(nn.Module):
    """A simple neural network with one fully connected layer.

    Args:
        input_size (int): The size of the input features.
        output_size (int): The size of the output features.
    """

    def __init__(self, input_size: int, output_size: int) -> None:
        super(SimpleNet, self).__init__()
        self.fc = nn.Linear(input_size, output_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """Perform a forward pass of the network.

        Inputs:
            x (torch.Tensor): The input tensor.

        Returns:
            torch.Tensor: The output tensor after passing through the network.
        """
        return self.fc(x)


def train_network(network: nn.Module,
                  data: List[Tuple[torch.Tensor, torch.Tensor]],
                  epochs: int,
                  learning_rate: float) -> None:
    """Train a neural network.

    Inputs:
        network (nn.Module): The neural network to train.
        data (List[Tuple[torch.Tensor, torch.Tensor]]): Training data, a list of tuples with
          input and target tensors.
        epochs (int): The number of epochs to train for.
        learning_rate (float): The learning rate for the optimizer.

    Returns:
        None
    """
    criterion = nn.MSELoss()
    optimizer = optim.Adam(network.parameters(), lr=learning_rate)

    for epoch in range(epochs):
        for inputs, targets in data:
            optimizer.zero_grad()
            outputs = network(inputs)
            loss = criterion(outputs, targets)
            loss.backward()
            optimizer.step()

Type hints in Python enhance code clarity, readability, and maintainability. Python remains dynamically typed, but combined with a static type checker or a type-aware editor, type hints bring many of the benefits of static typing, which is particularly useful in large codebases and complex applications like deep learning. Incorporating them is a straightforward way to make Python code more robust and easier to understand.

Naming Conventions

In programming, naming conventions are as crucial as the code itself. They are the first layer of documentation for anyone who reads your code. Good naming conventions in Python enhance readability and maintainability, and they are essential for conveying the intent of the code. Let's look at some of the best practices for naming things in Python.
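As a minimal sketch, the snippet below follows the naming styles recommended by PEP 8: `UPPER_CASE` for constants, `CapWords` for classes, and `snake_case` for functions, methods, and variables. The class and function names are hypothetical and exist only to illustrate the styles; the `is_` prefix for boolean helpers is a common habit rather than a PEP 8 rule.

```python
MAX_RETRIES = 3  # constant: UPPER_CASE_WITH_UNDERSCORES


class DataLoaderConfig:  # class: CapWords
    """Hypothetical configuration holder used only to illustrate naming."""

    def __init__(self, batch_size: int) -> None:
        self.batch_size = batch_size  # public attribute: snake_case
        self._cache = {}  # leading underscore: intended for internal use

    def num_batches(self, dataset_size: int) -> int:  # method: snake_case
        """Return how many batches are needed to cover the dataset."""
        return -(-dataset_size // self.batch_size)  # ceiling division


def is_valid_batch_size(batch_size: int) -> bool:  # boolean helper: is_ prefix
    """Return True if the batch size is a positive integer."""
    return batch_size > 0


config = DataLoaderConfig(batch_size=4)
print(config.num_batches(10))  # prints 3
```

A name like `num_batches` says what the value is, whereas a name like `n` or `tmp` forces the reader to reconstruct the meaning from context.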

Formatting

Code formatting is about more than just aesthetics; it's a crucial aspect of writing readable and maintainable code. In Python, adhering to a consistent formatting style helps developers understand and navigate the code more effectively. Before we dive into the details of code formatting, let's look at two code snippets and see which style you prefer.

The Python Enhancement Proposal 8 (PEP8) is the de facto code style guide for Python. It covers various aspects of code formatting like indentation, line length, whitespace usage, and more. The rules are complex and detailed, but here are two off-the-shelf tools that can help you format your code according to PEP8, automatically.

Once you have installed the two extensions, yapf and isort, you can automatically format your Python code and organize your import statements with minimal effort.

2. Simplicity and Efficiency

KISS Principle

Keep It Simple, Stupid (KISS) is a design principle that emphasizes the importance of simplicity in software development. The core idea is that systems work best if they are kept simple rather than made complex. Simplicity here means avoiding unnecessary complexity, which can lead to code that is more reliable, easier to understand, maintain, and extend. Here are some of the key aspects of the KISS principle that you can apply to your code:
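As one small illustration of the principle, the two hypothetical functions below compute the same result. The first uses manual indexing and a redundant flag; the second states the intent in a single expression.

```python
from typing import List


def count_positive_complex(numbers: List[int]) -> int:
    """Overly elaborate: manual indexing, a flag, and extra branching."""
    count = 0
    index = 0
    while index < len(numbers):
        is_positive = False
        if numbers[index] > 0:
            is_positive = True
        if is_positive:
            count = count + 1
        index = index + 1
    return count


def count_positive(numbers: List[int]) -> int:
    """Simple: one expression that states the intent directly."""
    return sum(1 for n in numbers if n > 0)


sample = [3, -1, 0, 7]
print(count_positive_complex(sample), count_positive(sample))  # prints 2 2
```

Both versions are correct, but the second has fewer moving parts to misread or break when the code is later changed.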

Write Efficient Code

Writing efficient code means optimizing both for speed and resource usage, ensuring that the application runs smoothly and responds quickly, even as the complexity of tasks or the size of data increases.
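A common example of this, sketched here under the assumption that the items are hashable, is choosing the right data structure for membership tests. Checking `x in some_list` scans the list each time, while checking `x in some_set` is a hash lookup. The function names are hypothetical.

```python
from typing import List


def find_common_slow(a: List[int], b: List[int]) -> List[int]:
    """Roughly O(len(a) * len(b)): each 'in' test scans the list b."""
    return [x for x in a if x in b]


def find_common_fast(a: List[int], b: List[int]) -> List[int]:
    """Roughly O(len(a) + len(b)): build a set once, then use hash lookups."""
    b_set = set(b)
    return [x for x in a if x in b_set]


first = list(range(0, 1000, 2))   # even numbers below 1000
second = list(range(0, 1000, 3))  # multiples of 3 below 1000
slow_result = find_common_slow(first, second)
fast_result = find_common_fast(first, second)
print(slow_result == fast_result)  # prints True
```

The results are identical; only the cost changes, and the gap widens as the inputs grow.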

3. Maintainability

Maintainability refers to how easily software can be maintained over time. This includes fixing bugs, improving functionality, and updating it to meet new requirements or work with new technologies. High maintainability is crucial for the long-term success and adaptability of a deep learning project.

Use Registry Mechanism to Manage Your Code

Quoting from MMEngine, "The registry can be considered as a union of a mapping table and a build function of modules. The mapping table maintains a mapping from strings to classes or functions, allowing the user to find the corresponding class or function with its name/notation. For example, the mapping from the string "ResNet" to the ResNet class. The module build function defines how to find the corresponding class or function based on a string and how to instantiate the class or call the function."

Now let's look at an example of how the registry mechanism works:

import torch
import torch.nn as nn
from transformers import CLIPVisionModel

# BACKBONES is assumed to be a registry instance defined elsewhere in the project.


@BACKBONES.register_module()
class ClipViTWrapper(nn.Module):

    def __init__(self, return_pool_feature: bool = True, freeze: bool = False):
        super(ClipViTWrapper, self).__init__()
        self.model = CLIPVisionModel.from_pretrained(
            'openai/clip-vit-base-patch32')
        self.return_pool_feature = return_pool_feature
        if freeze:
            for param in self.model.parameters():
                param.requires_grad = False

    def forward(self, x: torch.Tensor):
        ...


# W/O Registry <<<<<<<<<<<<<<<<<<<<<<
from ...backbones import ClipViTWrapper
model = ClipViTWrapper(return_pool_feature=True, freeze=False)

# W/ Registry <<<<<<<<<<<<<<<<<<<<<<<<
model = BACKBONES.build(dict(type='ClipViTWrapper', return_pool_feature=True, freeze=False))
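To make the "mapping table plus build function" idea concrete, here is a toy registry in plain Python. It is a simplified sketch, not MMEngine's actual implementation: the `Registry` class below only mimics the `register_module` and `build` calls used above, and `TinyBackbone` is a hypothetical stand-in for a real model.

```python
from typing import Any, Callable, Dict, Optional


class Registry:
    """A toy registry: a name-to-class mapping table plus a build function."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._module_dict: Dict[str, type] = {}

    def register_module(self, name: Optional[str] = None) -> Callable[[type], type]:
        """Return a decorator that records a class under its (or a given) name."""

        def decorator(cls: type) -> type:
            key = name or cls.__name__
            if key in self._module_dict:
                raise KeyError(f"{key} is already registered in {self.name}")
            self._module_dict[key] = cls
            return cls

        return decorator

    def build(self, cfg: Dict[str, Any]) -> Any:
        """Look up cfg['type'] and instantiate it with the remaining keys."""
        kwargs = dict(cfg)  # copy so the caller's config is not mutated
        cls = self._module_dict[kwargs.pop("type")]
        return cls(**kwargs)


BACKBONES = Registry("backbone")


@BACKBONES.register_module()
class TinyBackbone:
    """Hypothetical stand-in for a real backbone network."""

    def __init__(self, depth: int = 2, freeze: bool = False) -> None:
        self.depth = depth
        self.freeze = freeze


model = BACKBONES.build(dict(type="TinyBackbone", depth=4, freeze=True))
print(type(model).__name__, model.depth, model.freeze)
```

Because the model is selected by a string, swapping one backbone for another becomes a change to a config dictionary or file rather than a change to import statements and constructor calls throughout the code.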

Organize Your Code by Functionality

Organizing code by functionality is a crucial aspect of maintainability. It involves grouping related code into modules, packages, and libraries, and separating unrelated code into distinct sections. This makes it easier to find and update code, and also helps avoid conflicts between different parts of the codebase. Generally, an AI project is composed of three parts: model, dataset, and computation, as depicted in the following figure.

Based on the above three parts, we can organize our code by functionality as follows: