🤖Real-Time Object Detection using Python🚀

Learn how to build a real-time object detection system with Python using popular libraries like OpenCV and YOLO. Try the code yourself in Google Colab! 💻📷

🏢About

Hello! I'm Usha Pithani, and I led this real-time object detection project.

πŸ”How it Works

  1. βš™οΈ Set up your Python environment with required libraries (OpenCV, PyTorch, etc.).
  2. 🀩 Load a pre-trained object detection model (e.g., YOLOv11).
  3. πŸŽ₯ Process camera or video feeds to identify and classify objects.
  4. πŸ“¦ Display results with bounding boxes and labels in real time.

💡Understanding Real-Time Object Detection

Real-time object detection involves identifying and localizing objects within video streams as they happen, with minimal latency. This technology is at the heart of many modern applications.

Key Components:

This project specifically leverages **YOLOv11** for its balance of speed and accuracy, and **OpenCV** for camera interaction and display.
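"Minimal latency" is easier to reason about when it is measured. The sketch below is one possible way to track frames per second with an exponential moving average. `FpsMeter` is a hypothetical helper written for this example, not part of OpenCV or Ultralytics, and the smoothing factor of 0.9 is an arbitrary assumption.

```python
import time


class FpsMeter:
    """Hypothetical helper: exponentially smoothed frames-per-second estimate."""

    def __init__(self, smoothing=0.9):
        self.smoothing = smoothing  # weight given to the previous estimate
        self.fps = 0.0
        self._last = None

    def tick(self, now=None):
        """Call once per processed frame; returns the current FPS estimate."""
        now = time.perf_counter() if now is None else now
        if self._last is not None:
            elapsed = now - self._last
            if elapsed > 0:
                instant = 1.0 / elapsed
                if self.fps == 0.0:
                    # The first measurement seeds the average directly
                    self.fps = instant
                else:
                    self.fps = (self.smoothing * self.fps
                                + (1 - self.smoothing) * instant)
        self._last = now
        return self.fps


meter = FpsMeter()
for _ in range(5):
    time.sleep(0.01)  # stand-in for reading a frame and running detection
    meter.tick()
print(f"Estimated FPS: {meter.fps:.1f}")
```

In a detection loop you would call `meter.tick()` once per frame and, if you like, draw the value on the frame with `cv2.putText` before displaying it.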

💻Running the Project in VS Code

Follow these steps to set up and run the real-time object detection project using your laptop's webcam in Visual Studio Code.

1. Prerequisites:

2. Environment Setup:

It's crucial to use a virtual environment to manage project dependencies. Open VS Code, navigate to your desired project folder, and open a new terminal (Ctrl+Shift+`).

Create and activate a virtual environment:

```bash
# Create virtual environment
python -m venv venv

# Activate on Windows (PowerShell)
.\venv\Scripts\Activate.ps1

# Activate on Windows (Command Prompt)
venv\Scripts\activate.bat

# Activate on macOS/Linux
source venv/bin/activate
```

Once activated (you'll see (venv) in your terminal prompt), install the required libraries:

```bash
# Install PyTorch (CPU version - suitable for most laptops)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

# Install OpenCV and other utilities
pip install opencv-python numpy matplotlib seaborn

# Install Ultralytics (includes YOLOv11 support)
pip install ultralytics
```

Note: If you have an NVIDIA GPU, visit PyTorch's website for the CUDA-enabled installation command for faster inference.
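To check which device your installation can actually use, a small sketch like the following may help. `pick_device` is a hypothetical helper name. It relies only on `torch.cuda.is_available()` and falls back to the CPU when PyTorch is missing or no CUDA GPU is visible.

```python
def pick_device():
    """Return "cuda:0" if PyTorch can see an NVIDIA GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so stay on the CPU
    return "cuda:0" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Inference device: {device}")
```

You could then pass the result to Ultralytics through its `device` argument, for example `model.predict(source=frame, device=device)`.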

3. Create the Python Script:

3.1 In VS Code, create a new file named yolo_realtime_webcam.py and paste the following code for Real-Time Object Detection Using Webcam📷:

```python
import cv2
from ultralytics import YOLO

# Print OpenCV version
print(f"OpenCV version: {cv2.__version__}")

# Load the latest YOLO11 model (choose model size: n, s, m, l, x)
model = YOLO("yolo11s.pt")  # Alternatives: 'yolo11n.pt', 'yolo11m.pt', etc.

# Open the laptop webcam
cap = cv2.VideoCapture(0)
if not cap.isOpened():
    # Exit with a non-zero status; the message is written to stderr
    raise SystemExit("Error: Could not access the webcam.")

print("Webcam opened successfully. Press 'q' to quit the detection window.")

try:
    while True:
        ret, frame = cap.read()
        if not ret:
            print("Failed to grab frame.")
            break

        # Run detection; YOLO expects BGR frames (default for OpenCV)
        results = model.predict(source=frame, show=False, stream=False)

        # Overlay boxes; results[0].plot() returns BGR image
        annotated_frame = results[0].plot()

        # Display result
        cv2.imshow("YOLO11 Real-time Detection", annotated_frame)

        # Exit on 'q'
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
finally:
    cap.release()
    cv2.destroyAllWindows()
    print("Webcam released and windows closed.")

```
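Beyond drawing boxes, each result also exposes the raw detections. As a rough sketch, the hypothetical helper below (`summarize_detections` is not part of Ultralytics) counts detections per label. It assumes you feed it `results[0].boxes.cls.tolist()` and `results[0].names` from the loop above. The sample data here is made up so the snippet runs without a camera or model.

```python
from collections import Counter


def summarize_detections(class_ids, names):
    """Count detections per class label.

    class_ids: iterable of numeric class IDs (floats are accepted)
    names: mapping from class ID to label, e.g. {0: "person"}
    """
    return Counter(names[int(cid)] for cid in class_ids)


# Stand-in data shaped like results[0].boxes.cls.tolist() and results[0].names
example_ids = [0.0, 2.0, 2.0, 0.0, 2.0]
example_names = {0: "person", 2: "car"}
print(summarize_detections(example_ids, example_names))
```

Printing this summary every frame would flood the console, so in practice you might log it only every few seconds.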

3.2 In VS Code, create a new file named traffic_analyzer.py and paste the following code for Real-Time Object Detection Using Video File🎥:

```python
import cv2
from ultralytics import YOLO


def traffic_analysis_yolo_bytetrack(video_path, output_path="output_traffic.mp4"):
    """Detect and track objects in a video file and save an annotated copy."""
    model = YOLO("yolo11n.pt")

    # Open the video file
    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        print(f"Error: Could not open video file {video_path}")
        return

    # Get video properties; fall back to 30 FPS if the file reports none
    frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0

    # Define the codec and create the VideoWriter object
    fourcc = cv2.VideoWriter_fourcc(*"mp4v")  # 'XVID' or 'MJPG' are typically used with .avi output
    out = cv2.VideoWriter(output_path, fourcc, fps, (frame_width, frame_height))

    print("Starting vehicle detection and tracking...")

    try:
        while True:
            ret, frame = cap.read()
            if not ret:
                print("End of video or error reading frame.")
                break

            # Track objects across frames with ByteTrack
            results = model.track(frame, persist=True, tracker="bytetrack.yaml",
                                  conf=0.3, iou=0.5, show=False)

            # Draw boxes and track IDs; with no detections the frame is returned unannotated
            annotated_frame = results[0].plot()

            # Show and write every frame so the output keeps the input's length
            cv2.imshow("Traffic Analysis", annotated_frame)
            out.write(annotated_frame)

            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
    finally:
        # Release resources even if tracking raises an error
        cap.release()
        out.release()
        cv2.destroyAllWindows()

    print(f"Traffic analysis completed. Output saved to {output_path}")


if __name__ == "__main__":
    # Replace with your video file path
    input_video = "road_traffic.mp4"

    traffic_analysis_yolo_bytetrack(input_video)
```
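Track IDs make it possible to count each object once rather than once per frame. The sketch below shows one way to do that. `UniqueTrackCounter` is a hypothetical helper, and the frames fed to it are made-up stand-ins. In the script above you would, under the assumption that `results[0].boxes.id` is not `None`, pass `results[0].boxes.id.int().tolist()` together with labels looked up from `results[0].boxes.cls` and `results[0].names`.

```python
from collections import defaultdict


class UniqueTrackCounter:
    """Hypothetical helper: counts distinct track IDs seen per class label."""

    def __init__(self):
        self._ids_by_label = defaultdict(set)

    def update(self, track_ids, labels):
        """Record one frame's track IDs and their matching class labels."""
        for track_id, label in zip(track_ids, labels):
            self._ids_by_label[label].add(int(track_id))

    def totals(self):
        """Return {label: number of distinct track IDs seen so far}."""
        return {label: len(ids) for label, ids in self._ids_by_label.items()}


counter = UniqueTrackCounter()
# Stand-in frames: (track IDs, labels). Car 1 appears twice but counts once.
counter.update([1, 2], ["car", "truck"])
counter.update([1, 3], ["car", "car"])
print(counter.totals())
```

Keep in mind that a tracker can assign a new ID to an object it briefly lost, so counts produced this way are an upper estimate rather than an exact tally.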

4. Run the Project:

4.1 Save your yolo_realtime_webcam.py file. With your virtual environment still active in the VS Code terminal, run:

```bash
python yolo_realtime_webcam.py
```

4.2 Save your traffic_analyzer.py file. With your virtual environment still active in the VS Code terminal, run:

```bash
python traffic_analyzer.py
```

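Each script opens a window showing the annotated frames: your webcam feed for yolo_realtime_webcam.py, or the video file for traffic_analyzer.py, which also saves an annotated copy to output_traffic.mp4. To exit, click on the detection window and press the q key.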
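If you would rather choose the input on the command line than edit the scripts, a sketch along these lines could work. The `--source` and `--weights` flags and the `parse_source` helper are assumptions made for this example, not options the scripts above already support. The idea relies on `cv2.VideoCapture` accepting either a camera index or a file path.

```python
import argparse


def parse_source(value):
    """Turn "0" into camera index 0; leave anything else as a file path."""
    return int(value) if value.isdigit() else value


def build_parser():
    parser = argparse.ArgumentParser(
        description="YOLO11 detection on a webcam or video file")
    parser.add_argument("--source", type=parse_source, default=0,
                        help="camera index (e.g. 0) or path to a video file")
    parser.add_argument("--weights", default="yolo11n.pt",
                        help="YOLO11 weights file to load")
    return parser


# Demo with an explicit argument list; a real script would call parse_args()
args = build_parser().parse_args(["--source", "road_traffic.mp4"])
print(f"source={args.source!r}, weights={args.weights}")
```

The parsed `args.source` can be handed straight to `cv2.VideoCapture(args.source)`, and `args.weights` to `YOLO(args.weights)`.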

πŸ“Run Code in Google Colab

Click the badge below to open and run the object detection code in Google Colab! 👉

Open in Colab

(Replace your-notebook-url with your Colab notebook's shareable link if you create a new one.)

Colab Code Snippet:

Here's the typical setup and inference code you'd run in a Google Colab notebook cell to detect objects in a single image with YOLOv11:

```python
# Step 1: Install the Ultralytics package (YOLO11 requires ultralytics >= 8.3.0)
!pip install ultralytics --upgrade --quiet

# Step 2: Import the library
from ultralytics import YOLO

# Step 3: Load a YOLOv11 pretrained model (various sizes available, e.g., 'yolo11n.pt', 'yolo11s.pt', 'yolo11m.pt', 'yolo11x.pt')
model = YOLO('yolo11s.pt')

# Step 4: Inference on an image; the call returns a list with one Results object per image
results = model('path/to/image.jpg')
result = results[0]

# Step 5: Show results (in a notebook or Python script)
result.show()                        # Visualize detection output
result.save(filename='result.jpg')   # Save the annotated image to 'result.jpg'
print(result.boxes)                  # Print box coordinates, classes, and confidences
```

Note: In Colab, results[0].show() typically displays the image directly within the notebook output. Webcam access requires Colab-specific code (JavaScript integration to capture frames, with cv2_imshow for display), which is more complex than a simple Python script.