Real-time Gesture Detection using CUDA-accelerated OpenCV in Python. Leverages GPU for high-performance image processing tasks, ensuring efficient and responsive gesture recognition.
- Python : Popular language for implementing neural networks
- Jupyter Notebook : Tool for running Python code cell by cell
- Google Colab : Hosted environment for running Jupyter Notebooks
- Math : Python's built-in library for basic mathematical operations and functions
- OpenCV : Library for working with images and video
- NumPy : Library for working with arrays in Python
- CUDA : NVIDIA's toolkit for GPU acceleration, used here to get a better frame rate
- Matplotlib : Library for plotting charts in Python
- GDown : Library for downloading resources from Google Drive
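math and time ship with Python's standard library; the remaining packages (assuming their standard PyPI names) can be installed with pip. Note that the stock opencv-python wheel does not include CUDA support, so OpenCV itself must be built from source as described later in this guide:
pip install numpy gdown matplotlib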
You can easily run this code on Google Colab by clicking this badge.
There is no Colab version of the live demo, because the live webcam loop would crash Google Colab.
We use the Caffe OpenPose model; the code for downloading it appears in the walkthrough below.
gdown.download(id="1D3ytIZ-ZMMd5MbvVbf2Sn5oZ1L0aQ9IG", output="pose_deploy_linevec_faster_4_stages.prototxt", quiet=False)
gdown.download(id="1f-fCSTg7qFHRVKGIptyPJsgNwRs4XDsK", output="pose_iter_160000.caffemodel", quiet=False)
The first step is to have an NVIDIA GPU, then download and install the CUDA Toolkit (I used version 11.7.0).
Check CUDA version:
nvcc --version
After that, you should download and install cuDNN (I used version 8.6.0, built for CUDA 11.x).
After these installations, make sure you copy the DLL files from the cuDNN zip archive into the CUDA installation path (the cuDNN bin, include, and lib folders map onto the folders of the same names in the CUDA directory).
Now that CUDA and cuDNN are installed, we use Visual Studio and CMake to build OpenCV with CUDA support; you can watch this video.
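For orientation, a CUDA-enabled configure step looks roughly like the following (a sketch, not the exact command from the video: it assumes the opencv and opencv_contrib sources sit side by side, and CUDA_ARCH_BIN must match your GPU's compute capability):
cmake -G "Visual Studio 17 2022" ^
  -D WITH_CUDA=ON -D WITH_CUDNN=ON -D OPENCV_DNN_CUDA=ON ^
  -D CUDA_ARCH_BIN=8.6 ^
  -D OPENCV_EXTRA_MODULES_PATH=../opencv_contrib/modules ^
  ../opencv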
We need to import these libraries: math, numpy, cv2, gdown, time.
import math
import numpy as np
import cv2 as cv
import gdown
import time
To run the code and check the installation, we verify that OpenCV can see the CUDA device:
print("CUDA Enabled:", cv.cuda.getCudaEnabledDeviceCount())
print("OpenCV Build Info:")
print(cv.getBuildInformation())
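In the printed build information, the CUDA and cuDNN entries should both read YES. A minimal guard (my addition, not part of the original notebook) can fail fast when the build is wrong:
# Stop early if this OpenCV build cannot see a CUDA device (added check)
if cv.cuda.getCudaEnabledDeviceCount() == 0:
    raise RuntimeError("OpenCV was built without CUDA support or no NVIDIA GPU is visible")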
We need to download the Caffe model files. They are hosted on my Google Drive so the download keeps working for this repository in the future.
gdown.download(id="1D3ytIZ-ZMMd5MbvVbf2Sn5oZ1L0aQ9IG", output="pose_deploy_linevec_faster_4_stages.prototxt", quiet=False)
gdown.download(id="1f-fCSTg7qFHRVKGIptyPJsgNwRs4XDsK", output="pose_iter_160000.caffemodel", quiet=False)
We load the Caffe model for pose detection and set CUDA as the DNN backend and target here:
protoFile = "pose_deploy_linevec_faster_4_stages.prototxt"
weightsFile = "pose_iter_160000.caffemodel"
net = cv.dnn.readNetFromCaffe(protoFile, weightsFile)
net.setPreferableBackend(cv.dnn.DNN_BACKEND_CUDA)
net.setPreferableTarget(cv.dnn.DNN_TARGET_CUDA)
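If you want the demo to stay runnable on machines without a CUDA build, a sketch like the following (an assumption on my part; the original always targets CUDA) falls back to the CPU backend:
# Fall back to the (much slower) CPU backend when no CUDA device is present
if cv.cuda.getCudaEnabledDeviceCount() > 0:
    net.setPreferableBackend(cv.dnn.DNN_BACKEND_CUDA)
    net.setPreferableTarget(cv.dnn.DNN_TARGET_CUDA)
else:
    net.setPreferableBackend(cv.dnn.DNN_BACKEND_OPENCV)
    net.setPreferableTarget(cv.dnn.DNN_TARGET_CPU)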
Here we define the structure of the body skeleton by listing which body points connect to each other:
nPoints = 15
POSE_PAIRS = [[0,1], [1,2], [2,3], [3,4], [1,5], [5,6], [6,7], [1,14], [14,8], [8,9], [9,10], [14,11], [11,12], [12,13]]
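For reference, the 15 indices map to body parts as follows (the mapping is taken from how classify_pose reads the points list further below):
# Index -> body part, as consumed by classify_pose
KEYPOINT_NAMES = ["Head", "Neck", "RShoulder", "RElbow", "RWrist",
                  "LShoulder", "LElbow", "LWrist", "RHip", "RKnee",
                  "RAnkle", "LHip", "LKnee", "LAnkle", "Center"]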
This section prepares the frame as a blob for the GPU-backed network.
def convert_image_to_blob(frame):
    # Scale pixel values to [0, 1], resize to the 256x256 network input,
    # keep the BGR channel order, and do not crop.
    return cv.dnn.blobFromImage(frame, 1.0/255, (256, 256), (0, 0, 0), swapRB=False, crop=False)
In this section we run a forward pass through the model for pose detection; this section uses the GPU.
def run_inference():
    # The first CUDA forward pass pays a one-time initialization cost,
    # so a throwaway warm-up pass runs on the first call only.
    if not hasattr(run_inference, '_warmed_up'):
        net.forward()
        run_inference._warmed_up = True
    return net.forward()
In this section we extract all body points (15 points here) from the network output; this section uses the CPU.
def extract_keypoints(output, height, width):
    points = []
    for i in range(15):
        # Probability map for body part i; its peak is the keypoint location.
        probMap = output[0, i, :, :]
        _, _, _, max_loc = cv.minMaxLoc(probMap)
        # Rescale from the network's output resolution to the frame size.
        points.append((int(max_loc[0] * width / output.shape[3]),
                       int(max_loc[1] * height / output.shape[2])))
    return points
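Note that this always returns a point for every part, even when the model is unsure. A possible refinement (my sketch, not in the original) is to threshold on the peak confidence and store None for weak detections, which the None checks in draw_skeleton and classify_pose already accommodate:
def extract_keypoints_thresholded(output, height, width, threshold=0.1):
    # Hypothetical variant: keep a keypoint only if its peak confidence
    # exceeds the threshold; otherwise store None for that body part.
    points = []
    for i in range(15):
        probMap = output[0, i, :, :]
        _, max_val, _, max_loc = cv.minMaxLoc(probMap)
        if max_val > threshold:
            points.append((int(max_loc[0] * width / output.shape[3]),
                           int(max_loc[1] * height / output.shape[2])))
        else:
            points.append(None)
    return points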
In this section we use POSE_PAIRS to draw the skeleton of the body; this section uses the CPU.
# CPU based
def draw_skeleton(frame, points, POSE_PAIRS):
    image_skeleton = frame.copy()
    for pair in POSE_PAIRS:
        partA, partB = pair
        # Draw a limb only when both endpoints were detected.
        if points[partA] and points[partB]:
            cv.line(image_skeleton, points[partA], points[partB], (255, 255, 0), 2)
            cv.circle(image_skeleton, points[partA], 8, (255, 0, 0), thickness=-1, lineType=cv.FILLED)
    return image_skeleton
This section tries to classify the gesture from the body points; this section uses the CPU.
def calculate_angle(line):
    point1 = line[0]
    point2 = line[1]
    # A missing endpoint means the angle is undefined.
    if not point1 or not point2:
        return None
    x1, y1 = point1
    x2, y2 = point2
    dx = x2 - x1
    dy = y2 - y1
    angle_rad = math.atan2(dy, dx)
    angle_deg = math.degrees(angle_rad)
    # Normalize to [0, 360)
    if angle_deg < 0:
        angle_deg += 360
    return angle_deg
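As a quick sanity check on the convention (image y coordinates grow downward, so a segment pointing straight down the image has an angle of 90 degrees, which the classifier below treats as vertical):
# (100, 50) -> (100, 150): dx = 0, dy = 100, so atan2 yields exactly 90 degrees
assert calculate_angle(((100, 50), (100, 150))) == 90.0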
def classify_pose(points): # CPU based
    try:
        # each body point is an (x, y) pair, i.e. (width, height)
        head = points[0]
        neck = points[1]
        right_shoulder = points[2]
        right_elbow = points[3]
        right_wrist = points[4]
        left_shoulder = points[5]
        left_elbow = points[6]
        left_wrist = points[7]
        right_hip = points[8]
        right_knee = points[9]
        right_ankle = points[10]
        left_hip = points[11]
        left_knee = points[12]
        left_ankle = points[13]
        center = points[14]
        # each line (segment) of the body
        head_line = (head, neck)
        right_shoulder_line = (neck, right_shoulder)
        left_shoulder_line = (neck, left_shoulder)
        torso_line = (neck, center)
        right_upper_arm_line = (right_shoulder, right_elbow)
        right_lower_arm_line = (right_elbow, right_wrist)
        left_upper_arm_line = (left_shoulder, left_elbow)
        left_lower_arm_line = (left_elbow, left_wrist)
        right_thigh_line = (right_hip, right_knee)
        right_shin_line = (right_knee, right_ankle)
        left_thigh_line = (left_hip, left_knee)
        left_shin_line = (left_knee, left_ankle)
        right_hip_line = (center, right_hip)
        left_hip_line = (center, left_hip)
        if not neck or not center:
            return "none"
        # Detect whether the torso and thighs are vertical or horizontal
        torso_angle = calculate_angle(torso_line)
        vertical_torso = False
        horizontal_torso = False
        if (70 < torso_angle < 110):
            vertical_torso = True
        if (150 < torso_angle < 210) or (330 < torso_angle) or (torso_angle < 30):
            horizontal_torso = True
        right_thigh_angle = calculate_angle(right_thigh_line)
        vertical_right_thigh = False
        horizontal_right_thigh = False
        if (70 < right_thigh_angle < 110):
            vertical_right_thigh = True
        if (150 < right_thigh_angle < 210) or (330 < right_thigh_angle) or (right_thigh_angle < 30):
            horizontal_right_thigh = True
        left_thigh_angle = calculate_angle(left_thigh_line)
        vertical_left_thigh = False
        horizontal_left_thigh = False
        if (70 < left_thigh_angle < 110):
            vertical_left_thigh = True
        if (150 < left_thigh_angle < 210) or (330 < left_thigh_angle) or (left_thigh_angle < 30):
            horizontal_left_thigh = True
        # Standing gesture: torso and both thighs vertical
        if vertical_torso and vertical_right_thigh and vertical_left_thigh:
            return "standing"
        # Sitting gesture: torso vertical, both thighs horizontal
        if horizontal_left_thigh and horizontal_right_thigh and vertical_torso:
            return "sitting"
        # Laying gesture: torso horizontal
        if horizontal_torso:
            return "laying"
    except Exception:
        # Missing points or undefined angles fall through to "none"
        return "none"
    return "none"
def detect_gesture(frame, POSE_PAIRS):
    blob = convert_image_to_blob(frame)
    net.setInput(blob)
    output = run_inference()
    points = extract_keypoints(output, frame.shape[0], frame.shape[1])
    frame_with_skeleton = draw_skeleton(frame, points, POSE_PAIRS)
    label = classify_pose(points)
    return frame_with_skeleton, label
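Before wiring up the camera, you can try the pipeline on a single still image (the file names here are just placeholders):
frame = cv.imread("person.jpg")           # placeholder file name
output_frame, label = detect_gesture(frame, POSE_PAIRS)
print("Detected gesture:", label)
cv.imwrite("person_skeleton.jpg", output_frame)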
If you want to use your laptop camera, use the first line, cap = cv.VideoCapture(0). If you want to use your phone camera instead, you can install IP Webcam and use the second line, cap = cv.VideoCapture('http://21.118.71.170:8080/video', cv.CAP_FFMPEG), replacing the URL with your phone's stream address.
cap = cv.VideoCapture(0)
# cap = cv.VideoCapture('http://21.118.71.170:8080/video', cv.CAP_FFMPEG)
cap.set(cv.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 480)
frame_count = 0
skip_frames = 3   # process every 3rd frame
prev_time = time.time()
while True:
    ret, frame = cap.read()
    if not ret:
        break
    frame_count += 1
    # Skip frames to keep the loop responsive
    if frame_count % skip_frames != 0:
        continue
    frame = cv.flip(frame, 1)
    frame = cv.resize(frame, (640, 480))
    output_frame, label = detect_gesture(frame, POSE_PAIRS)
    # FPS is measured over processed (non-skipped) frames
    curr_time = time.time()
    fps = 1 / (curr_time - prev_time)
    prev_time = curr_time
    cv.putText(output_frame, f"FPS: {fps:.1f}", (520, 30),
               cv.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)
    cv.putText(output_frame, label, (10, 30),
               cv.FONT_HERSHEY_SIMPLEX, 1, (255, 0, 0), 2)
    cv.imshow('Gesture Detection', output_frame)
    # Exit on the Esc key
    k = cv.waitKey(5) & 0xFF
    if k == 27:
        break
cap.release()
cv.destroyAllWindows()
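As a possible further optimization (my sketch, assuming your CUDA build exposes the cv.cuda module), the flip and resize could also be moved onto the GPU:
gpu_frame = cv.cuda_GpuMat()
gpu_frame.upload(frame)                           # copy the frame to GPU memory
gpu_frame = cv.cuda.resize(gpu_frame, (640, 480))
gpu_frame = cv.cuda.flip(gpu_frame, 1)            # mirror horizontally on the GPU
frame = gpu_frame.download()                      # copy back for blob creation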
This project is licensed under the MIT License.