Introduction to Pytorch in Raspberry Pi Environment

What is PyTorch?

PyTorch is an open-source machine learning framework developed by Facebook's AI Research lab (FAIR). It is known for its flexibility, dynamic computation graphs, and strong community support. PyTorch simplifies the development of deep learning models, making it a popular choice for researchers and practitioners alike.

Brief History

  • 2016: PyTorch was released by Facebook as an open-source library, building on Torch (a Lua-based framework) with a Python-first interface for ease of use.

  • 2017: Gained significant momentum when Facebook partnered with Microsoft to create ONNX (Open Neural Network Exchange) for model interoperability.

  • 2022: PyTorch became part of the PyTorch Foundation, ensuring community-driven development.

Why Use PyTorch?

  • Dynamic Graphs: PyTorch uses dynamic computation graphs, allowing flexibility in building and debugging models.

  • Pythonic: Integrates seamlessly with Python, making it intuitive for Python developers.

  • Community Support: A vibrant ecosystem with numerous tutorials, forums, and open-source projects.

  • Accelerated Research: Its ease of use accelerates model experimentation and implementation.

What Are Dynamic Computation Graphs?

  • A computation graph represents the operations performed on data (e.g., tensors) in a deep learning model.
  • Dynamic Graphs (PyTorch): The computation graph is built on the fly as operations are executed.
    • Each forward pass can construct a different graph, allowing for greater flexibility and adaptability.
    • You don’t need to define the entire graph beforehand; it evolves during runtime.
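The idea can be seen in a few lines of PyTorch: the number of recorded operations depends on the input values, so each forward pass can trace a different graph. This is a minimal sketch; the loop condition and the input values are only illustrative.

```python
import torch

def forward(x):
    # The graph is recorded as this code runs: how many doublings happen
    # depends on the input, so each call can build a different graph.
    y = x
    while y.norm() < 10:
        y = y * 2
    return y

x = torch.tensor([1.0, 2.0], requires_grad=True)
out = forward(x).sum()
out.backward()   # autograd replays exactly the ops that actually ran
print(x.grad)    # gradient reflects however many doublings occurred
```

Because the graph is rebuilt every call, ordinary Python control flow (`while`, `if`, recursion) works naturally, which is what makes debugging with standard tools like `print` and `pdb` straightforward.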

Who Uses PyTorch?

  • Research Organizations: MIT, Stanford, OpenAI, and FAIR.

  • Companies: Facebook (Meta), Tesla (autonomous driving), Disney (AI for animation), and Microsoft.

  • Domains: Used in computer vision, natural language processing, reinforcement learning, and more.

PyTorch vs TensorFlow: Feature Comparison

Feature             | PyTorch                             | TensorFlow
--------------------|-------------------------------------|------------------------------------------
Computation Graph   | Dynamic (easier for debugging)      | Static (optimized for deployment)
Ease of Use         | Intuitive and Pythonic              | Steeper learning curve
Community           | Popular in academia and research    | Widely used in production and enterprises
Frameworks Built On | Lightning, Detectron2, Hugging Face | Keras, TFX, TensorFlow Lite
Deployment          | TorchServe, ONNX                    | TensorFlow Serving, TensorFlow.js
Performance         | Efficient, but depends on tuning    | Better optimization for large-scale tasks

What is QNNPACK?

QNNPACK (Quantized Neural Network PACKage) is a high-performance kernel library developed by Facebook for running quantized neural networks efficiently on ARM CPUs. It is optimized for low-power devices, such as mobile phones and Raspberry Pi, and is a critical component for executing PyTorch's quantized models. It supports operations like convolutions, fully connected layers, and more, tailored for low-precision inference.
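As a quick sanity check, you can list which quantized engines your PyTorch build supports and select qnnpack before loading any quantized model. This is a small sketch; note that 'qnnpack' only appears if your PyTorch build was compiled with QNNPACK support, which the standard ARM wheels normally are.

```python
import torch

# Engines compiled into this PyTorch build, e.g. ['none', 'qnnpack', 'fbgemm']
print(torch.backends.quantized.supported_engines)

# Pick qnnpack when available (the usual choice on ARM CPUs like the Pi's)
if 'qnnpack' in torch.backends.quantized.supported_engines:
    torch.backends.quantized.engine = 'qnnpack'

print(torch.backends.quantized.engine)
```

Setting the engine must happen before the first quantized operation runs; doing it right after `import torch`, as in the full script below, is the safest place.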

Setting Up the Environment for PyTorch Classification

Create a Virtual Environment

mkdir my_pytorch_course
cd my_pytorch_course
python -m venv --system-site-packages env
source env/bin/activate

Install Required Libraries

pip install torch torchvision torchaudio opencv-python numpy

Prepare Your Directory

  • Create a folder on your Desktop named pytorch.
  • Inside the pytorch folder, create the following files: pytorch_test.py (the script below) and imagenet-classes.txt (the ImageNet class labels, one per line).

Python Code (pytorch_test.py)

Copy the provided Python code into the file pytorch_test.py:

import time

import torch
import cv2
from torchvision import models, transforms

# Use the default number of CPU threads for inference
torch.set_num_threads(torch.get_num_threads())

# Ensure the qnnpack backend is used for quantized models (ARM CPUs)
torch.backends.quantized.engine = 'qnnpack'

# Load the ImageNet class labels
with open("imagenet-classes.txt", "r") as f:
    classes = [line.strip() for line in f]

# Initialize webcam
cap = cv2.VideoCapture(0, cv2.CAP_V4L2)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 224)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 224)
cap.set(cv2.CAP_PROP_FPS, 36)

# Preprocessing pipeline
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Load the quantized MobileNetV2 model
net = models.quantization.mobilenet_v2(pretrained=True, quantize=True)
net = torch.jit.script(net)  # Optimize model for inference

# Performance logging
started = time.time()
last_logged = time.time()
frame_count = 0

# Real-time inference
with torch.no_grad():
    while True:
        # Read frame from webcam
        ret, frame = cap.read()
        if not ret:
            print("Failed to capture frame. Exiting...")
            break

        # Convert BGR to RGB
        image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

        # Preprocess image
        input_tensor = preprocess(image)
        input_batch = input_tensor.unsqueeze(0)

        # Perform inference
        output = net(input_batch)
        probabilities = output[0].softmax(dim=0)

        # Get top-3 predictions
        top = list(enumerate(probabilities))
        top.sort(key=lambda x: x[1], reverse=True)
        top_predictions = [(classes[idx], val.item()) for idx, val in top[:3]]

        # Display predictions on the frame
        for i, (label, prob) in enumerate(top_predictions):
            text = f"{prob * 100:.2f}% {label}"
            cv2.putText(frame, text, (10, 25 + i * 20), cv2.FONT_HERSHEY_SIMPLEX,
                        1, (0, 255, 0), 2, cv2.LINE_AA)

        # Show the frame
        cv2.imshow("Real-time Object Recognition", frame)

        # Log FPS once per second
        frame_count += 1
        now = time.time()
        if now - last_logged > 1:
            print(f"{frame_count / (now - last_logged):.2f} fps")
            last_logged = now
            frame_count = 0

        # Exit on pressing 'q'
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

# Release resources
cap.release()
cv2.destroyAllWindows()


How to Run the Code

Navigate to the pytorch directory

cd /home/pi/Desktop/pytorch

Run the Python script

python pytorch_test.py

Check Output

A window will open showing the real-time webcam feed. The top-3 predictions (with confidence percentages) will be displayed on the video feed.

Further references:

PyTorch Documentation

PyTorch Course