Human Position Recognition With Camera and Raspberry Pi 4 With the Use of OpenCV
by WilliamLudeke in Circuits > Cameras
Have you ever wanted to make a painting whose eyes literally track where someone is, or a doll whose head follows you around? I do, but I needed a way to detect the position of a person through a webcam on small hardware such as a Raspberry Pi. There were a few problems to overcome, however:
- The Raspberry Pi isn't a very powerful computer, so I needed a fast but accurate way of tracking someone.
  - It turns out that OpenCV includes a HOG-based people detector which, with some fine tuning, can do just this.
- How do I get the position of the person that is detected?
  - HOG detection outputs a bounding box, and I can take the center of that box as the person's position.
- How do I know where the eyes/head are currently looking?
  - Instead of keeping track of where the eyes/head are looking, I can put the camera in the eyes/head so it rotates with them. If the person is left of center, I rotate the camera and eyes/head toward them to slowly center the person on camera.
- What happens if there is more than one person on camera? Who do I track?
  - Simple: whoever is closest to the center.
- How do I make use of the limited outputs of the Raspberry Pi to make things happen?
  - Simple: I don't. Instead I send data to an Arduino and let it handle the output.
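To make the "camera rides in the head" idea a bit more concrete, here is a small simulation of it. This is only a sketch and not part of the project code: the function name steps_to_center and the step value (how many pixels the view shifts per small rotation) are made up for illustration. Because the camera turns with the head, each small rotation toward the person moves them closer to the middle of the frame, so no record of the head angle is needed.

```python
def steps_to_center(offset, tolerance=5, step=3, max_steps=1000):
    """Simulate turning the head until the person is within tolerance of center.

    offset: horizontal distance of the person from the frame center in pixels
            (negative means left of center, positive means right).
    step:   hypothetical number of pixels the view shifts per small rotation.
    Returns the number of rotations made and the final offset.
    """
    steps = 0
    while abs(offset) > tolerance and steps < max_steps:
        # Rotate toward the person; the camera moves with the head,
        # so the person appears closer to the center of the frame.
        if offset > 0:
            offset -= step
        else:
            offset += step
        steps += 1
    return steps, offset


print(steps_to_center(40))   # person starts 40 pixels right of center
print(steps_to_center(-20))  # person starts 20 pixels left of center
```

Note that the step needs to be smaller than the width of the tolerance zone, otherwise the head could overshoot and swing back and forth around the person.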
After a bit of time I finally got the test setup running, and I am stunned at how well it works. It's accurate (with one person at least; I haven't been able to test with more than one person, but I think it should work), it's fast relative to the hardware running it, and most importantly it works.
In this Instructable I'm going to explain how I did it and how it works to the best of my ability.
Supplies
What is needed:
- 1x Raspberry Pi 4 B (I used the 4 GB model, but the 2 GB model is the minimum required)
- Peripherals to run the Raspberry Pi
- 1x webcam (I used the Logitech C615)
- 1x Arduino Uno
- 1x USB type A to USB type B cable
- 1x breadboard
- 3x LEDs
- Jumper wires for the breadboard
Wiring
We'll start by connecting all of the hardware. Here is a list of the connections needed for the Raspberry Pi:
- Raspberry Pi to AC outlet using a USB-C power adapter
- Raspberry Pi HDMI 0 port to the monitor's HDMI port
- Raspberry Pi to Arduino using the USB A to USB B cable
Next, plug in the LEDs according to the diagram above. First place the 3 LEDs on the breadboard so that none of the anodes or cathodes share a lane. Then run a jumper from the left LED's anode (the long pin) to digital pin 4 on the Arduino, a jumper from the center LED's anode to digital pin 8, and a jumper from the right LED's anode to digital pin 12. Finally, everything needs to be connected to ground on the Arduino. To do this, run a jumper from the pin labeled GND on the Arduino to the rail on the side of the breadboard marked "-", and connect the cathode of each LED to that rail with a jumper wire.
Software Prep: Installing Python
First things first: update your Raspberry Pi by running the following commands in the terminal:
sudo apt update
sudo apt upgrade
Next, check whether Python is installed on the Raspberry Pi and whether it is the right version. To do this, run the following command in the terminal:
python --version
or
python3 --version
If a version number shows up, make sure it is the latest version. I will be using 3.10.6 for this project.
If you do not have the latest version of Python, go to the following website and select the most recent version: https://www.python.org/downloads/
You might also need to install the following dependencies:
sudo apt-get install libreadline-dev libncursesw5-dev libssl-dev libsqlite3-dev tk-dev libgdbm-dev libc6-dev libbz2-dev
On the Python website, click Download, then scroll down to "XZ compressed source tarball" and click it to download. Next, navigate to your Downloads folder in the file explorer. Right-click the ".tar.xz" archive and click "Extract Here" to extract the files. Finally, right-click the folder that was created and select "Open in Terminal". From there, run the following commands:
./configure
make
make test
sudo make install
Python is now installed! This installs it as 'python3', so whenever you run a command with it make sure to type 'python3' instead of just 'python'. For example, to check that it was installed, run:
python3 --version
Software Prep: Installing Necessary Libraries
You're going to need to start by installing pip for Python 3. To do this, simply run the following command in the terminal:
sudo apt-get install python3-pip
Next you will need to install the following libraries if they are not already installed:
- NumPy:
pip3 install numpy
- OpenCV:
pip3 install opencv-python
- PySerial:
pip3 install pyserial
Once all of these commands have been run, all the Python libraries you need for this project will be installed!
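If you want to double check that the installs worked before moving on, a quick script like the one below can help. This is just a sketch I'd suggest, not part of the project code: check_modules is a made-up helper name, and it only relies on Python's standard importlib. Note that the import names differ from the pip package names: opencv-python is imported as cv2 and pyserial as serial.

```python
import importlib


def check_modules(names):
    """Return a dict mapping each module name to its version string,
    "unknown version" if it has no __version__ attribute,
    or None if the module cannot be imported."""
    results = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            results[name] = None
        else:
            results[name] = getattr(module, "__version__", "unknown version")
    return results


# numpy, cv2 (opencv-python) and serial (pyserial) are the import names
for name, version in check_modules(["numpy", "cv2", "serial"]).items():
    print(name, "->", version if version else "NOT INSTALLED")
```

Save it to a file and run it with python3. Any library reported as NOT INSTALLED needs its pip3 command run again.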
Software Prep: Installing Arduino IDE
To install the Arduino IDE for programming the Arduino, first run the following command to determine which architecture your Raspberry Pi is running (arm64 or arm32):
uname -m
If the result is aarch64, you are running arm64; if it is armv7l (or another name starting with armv), you are running arm32. Now go to the following website and download the Arduino IDE, selecting the Linux version for either arm32 or arm64 based on the result of uname -m.
https://www.arduino.cc/en/software
Once it has downloaded, go to your Downloads folder, unzip the file, and open the folder. From there, run "install.sh" to install the IDE.
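If you would rather check the architecture from Python, the standard platform module reports the same machine string that uname -m prints. The sketch below is only an illustration: arduino_ide_build is a made-up helper name, and it assumes that any machine name starting with "arm" other than arm64 (for example armv7l) means a 32-bit system.

```python
import platform


def arduino_ide_build(machine):
    """Map a machine string (as printed by `uname -m`) to arm64 or arm32."""
    machine = machine.lower()
    if machine in ("aarch64", "arm64"):
        return "arm64"
    if machine.startswith("arm"):
        # assumption: remaining arm* names such as armv7l are 32-bit
        return "arm32"
    return "not an ARM system (" + machine + ")"


print(arduino_ide_build(platform.machine()))
```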
Code: Arduino
For the Arduino code we start by declaring which pins the Arduino is using, along with the variable that will hold the incoming serial data.
// variable that incoming data is stored in
int x;

// declare pin numbers
int LED_Left = 4;
int LED_Center = 8;
int LED_Right = 12;
Next comes the setup code, which only runs once. It sets all of the pin modes to output and then starts serial communication and sets the serial timeout.
void setup() {
  // set pins to output
  pinMode(LED_Left, OUTPUT);
  pinMode(LED_Center, OUTPUT);
  pinMode(LED_Right, OUTPUT);

  // start serial com
  Serial.begin(115200);
  Serial.setTimeout(1);
}
Finally comes a loop that runs forever. It starts by waiting in a while statement until serial data is received. Once data arrives, it exits the while loop and reads the data that just came in, which is a string (the Python code sends each command as text). We convert it to an integer and then use if statements to set each LED's output to either on or off. At the end, the value is echoed back over serial.
void loop() {
  while (!Serial.available()) {
    // loop while nothing is being sent and stop when data is received
  }

  // read serial data and convert to int
  x = Serial.readString().toInt();

  // nothing detected
  if (x == 0) {
    digitalWrite(LED_Left, LOW);
    digitalWrite(LED_Center, LOW);
    digitalWrite(LED_Right, LOW);
  }
  // left of center
  if (x == 1) {
    digitalWrite(LED_Left, HIGH);
    digitalWrite(LED_Center, LOW);
    digitalWrite(LED_Right, LOW);
  }
  // centered
  if (x == 2) {
    digitalWrite(LED_Left, LOW);
    digitalWrite(LED_Center, HIGH);
    digitalWrite(LED_Right, LOW);
  }
  // right of center
  if (x == 3) {
    digitalWrite(LED_Left, LOW);
    digitalWrite(LED_Center, LOW);
    digitalWrite(LED_Right, HIGH);
  }
  Serial.print(x);
}
Here is the final code:
// variable that incoming data is stored in
int x;

// declare pin numbers
int LED_Left = 4;
int LED_Center = 8;
int LED_Right = 12;

void setup() {
  // set pins to output
  pinMode(LED_Left, OUTPUT);
  pinMode(LED_Center, OUTPUT);
  pinMode(LED_Right, OUTPUT);

  // start serial com
  Serial.begin(115200);
  Serial.setTimeout(1);
}

void loop() {
  while (!Serial.available()) {
    // loop while nothing is being sent and stop when data is received
  }

  // read serial data and convert to int
  x = Serial.readString().toInt();

  // nothing detected
  if (x == 0) {
    digitalWrite(LED_Left, LOW);
    digitalWrite(LED_Center, LOW);
    digitalWrite(LED_Right, LOW);
  }
  // left of center
  if (x == 1) {
    digitalWrite(LED_Left, HIGH);
    digitalWrite(LED_Center, LOW);
    digitalWrite(LED_Right, LOW);
  }
  // centered
  if (x == 2) {
    digitalWrite(LED_Left, LOW);
    digitalWrite(LED_Center, HIGH);
    digitalWrite(LED_Right, LOW);
  }
  // right of center
  if (x == 3) {
    digitalWrite(LED_Left, LOW);
    digitalWrite(LED_Center, LOW);
    digitalWrite(LED_Right, HIGH);
  }
  Serial.print(x);
}
Also note the port that the Arduino is connected to, shown in the bottom right corner of the IDE window, as it will be needed later for the Python code.
Don't forget to upload the code to the Arduino!
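Once the sketch is uploaded, it can be handy to test the wiring on its own before any camera code is involved. The snippet below is a rough sketch of how that could look in Python, not part of the original project: cycle_leds is a made-up helper that sends the codes "1", "2", "3" and "0" one after another, which should light the left, center, and right LEDs in turn and then switch them all off. It accepts any object with write() and readline(), such as a serial.Serial instance from PySerial.

```python
import time


def cycle_leds(port, codes=("1", "2", "3", "0"), delay=0.5):
    """Send each code to the Arduino and collect whatever it echoes back.

    port is any object with write() and readline(), such as a
    serial.Serial instance from pyserial.
    """
    replies = []
    for code in codes:
        port.write(code.encode("utf-8"))
        time.sleep(delay)  # leave the LED on long enough to see it
        replies.append(port.readline())
    return replies


# With the real board (assumes pyserial is installed; replace the
# placeholder with your own port name):
#   import serial
#   arduino = serial.Serial(port="<Your Arduino Port Here>", baudrate=115200, timeout=0.1)
#   time.sleep(2)  # the Uno resets when the port is opened, so give it a moment
#   print(cycle_leds(arduino))
```

If an LED stays dark, check that its long pin goes to the Arduino pin and its short pin to the "-" rail.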
Code: Python
You're going to start by importing the necessary libraries like so:
# import the necessary packages
import numpy as np
import cv2
import serial
From here we need to set three values: the port the Arduino is connected to, the Arduino serial connection, and the distance from the center, in either direction, within which someone is considered centered. You will need to replace <Your Arduino Port Here> with the port name you noted earlier.
# port that the arduino is connected to, can be found in arduino IDE
arduino_port = '<Your Arduino Port Here>'
arduino = serial.Serial(port=arduino_port, baudrate=115200, timeout=0.01)

# sets how many pixels away from the center a person needs to be before the head stops
center_tolerance = 5
Next we initialize HOG, the part of OpenCV that we use to detect people, which is explained in the next step. We set the HOG descriptor to detect humans, which is one of the default options.
# initialize the HOG descriptor/person detector
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
Then we start the video capture:
# open webcam video stream
cap = cv2.VideoCapture(0)
The last thing to do before starting the loop that detects people frame by frame is to define the function that actually communicates with the Arduino. It writes a string and then reads the reply.
def write_read(x):
    arduino.write(bytes(x, 'utf-8'))
    data = arduino.readline()
    return data
We then start a while loop and put the following in it. The code starts by grabbing a single frame from the camera and resizing it so it processes faster. The frame is then passed to HOG, which returns each detection as an x, y position plus a width and height. We convert each one into the two opposite corners of a rectangle that encompasses the person and store them in an array called boxes. We also initialize a list called centers.
Next it loops through each box and works out its x coordinate relative to the center of the frame and its distance from the center, adds this data to the list, and moves on to the next detected box. To make sure the code doesn't try to do anything with an empty list, an if statement checks that at least one box was detected before we sort the list by distance from the center. It then draws the rectangles by iterating through the sorted list: the first one, which is closest to the center, is drawn green and the others red.
Next the code checks whether the position is within the tolerance zone, left of it, or right of it, or whether nothing was detected, and sends a string: "1" for left, "2" for center, "3" for right, and "0" for nothing. Finally the image is scaled back up for better viewing on the screen and shown in a window. If the q key is pressed, the program stops.
while(True):
    # Capture frame-by-frame
    ret, frame = cap.read()

    # resizing for faster detection
    frame = cv2.resize(frame, (140, 140))

    # detect people in the image
    # returns the bounding boxes for the detected objects
    boxes, weights = hog.detectMultiScale(frame, winStride=(1,1), scale = 1.05)
    boxes = np.array([[x, y, x + w, y + h] for (x, y, w, h) in boxes])
    centers = []
    for box in boxes:
        # get the distance from each box's center x coordinate to the center of the screen and add it to a list
        center_x = ((box[2]-box[0])/2)+box[0]
        x_pos_rel_center = (center_x-70)
        dist_to_center_x = abs(x_pos_rel_center)
        centers.append({'box': box, 'x_pos_rel_center': x_pos_rel_center, 'dist_to_center_x':dist_to_center_x})

    if len(centers) > 0:
        # sorts the list by distance to center
        sorted_boxes = sorted(centers, key=lambda i: i['dist_to_center_x'])

        # draws the boxes
        center_box = sorted_boxes[0]['box']
        for box in range(len(sorted_boxes)):
            # display the detected boxes in the colour picture
            if box == 0:
                cv2.rectangle(frame, (sorted_boxes[box]['box'][0],sorted_boxes[box]['box'][1]), (sorted_boxes[box]['box'][2],sorted_boxes[box]['box'][3]), (0,255, 0), 2)
            else:
                cv2.rectangle(frame, (sorted_boxes[box]['box'][0],sorted_boxes[box]['box'][1]), (sorted_boxes[box]['box'][2],sorted_boxes[box]['box'][3]),(0,0,255),2)

        # retrieves the distance from center from the list and determines if the head should turn left, right, or stay put and turn lights on
        Center_box_pos_x = sorted_boxes[0]['x_pos_rel_center']
        if -center_tolerance <= Center_box_pos_x <= center_tolerance:
            # turn on eye light
            print("center")
            result = write_read("2")
        elif Center_box_pos_x >= center_tolerance:
            # turn head to the right
            print("right")
            result = write_read("3")
        elif Center_box_pos_x <= -center_tolerance:
            # turn head to the left
            print("left")
            result = write_read("1")
        print(str(Center_box_pos_x))
    else:
        # prints out that no person has been detected
        result = write_read("0")
        print("nothing detected")

    # resizes the video so it's easier to see on the screen
    frame = cv2.resize(frame,(720,720))

    # Display the resulting frame
    cv2.imshow("frame",frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
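The decision part of the loop can also be tried out without a camera or an Arduino. The sketch below pulls it out into two small functions so it can be run on made-up boxes. The names closest_to_center and direction_code are hypothetical and not part of the project code; the 140 pixel frame width, the tolerance of 5, and the "0" to "3" codes are the ones used in this Instructable.

```python
def closest_to_center(boxes, frame_width=140):
    """Return (box, offset) for the box whose horizontal center is nearest
    the middle of the frame, or None if the list is empty.

    Each box is (x1, y1, x2, y2); offset is negative left of center.
    """
    best = None
    for box in boxes:
        center_x = (box[0] + box[2]) / 2
        offset = center_x - frame_width / 2
        if best is None or abs(offset) < abs(best[1]):
            best = (box, offset)
    return best


def direction_code(offset, tolerance=5):
    """Translate an offset into the one-character code sent to the Arduino."""
    if offset is None:
        return "0"  # nothing detected
    if offset < -tolerance:
        return "1"  # left of center
    if offset > tolerance:
        return "3"  # right of center
    return "2"      # centered


# three made-up detections in a 140 pixel wide frame
boxes = [(10, 20, 50, 120), (60, 10, 90, 130), (100, 30, 130, 125)]
box, offset = closest_to_center(boxes)
print(box, offset, direction_code(offset))
```

With these example boxes the middle one wins: its center is at x = 75, which is 5 pixels right of the frame center and still inside the tolerance zone, so the code sent would be "2".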
The last bit we need to add stops the camera from capturing video and then closes the window.
# When everything is done, release the capture
cap.release()
# finally, close the window
cv2.destroyAllWindows()
cv2.waitKey(1)
Here is the final code:
# import the necessary packages
import numpy as np
import cv2
import serial

# port that the arduino is connected to, can be found in arduino IDE
arduino_port = '<Your_Port_Here>'
arduino = serial.Serial(port=arduino_port, baudrate=115200, timeout=0.01)

# sets how many pixels away from the center a person needs to be before the head stops
center_tolerance = 5

# initialize the HOG descriptor/person detector
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

cv2.startWindowThread()

# open webcam video stream
cap = cv2.VideoCapture(0)

def write_read(x):
    arduino.write(bytes(x, 'utf-8'))
    data = arduino.readline()
    return data

while(True):
    # Capture frame-by-frame
    ret, frame = cap.read()

    # resizing for faster detection
    frame = cv2.resize(frame, (140, 140))

    # detect people in the image
    # returns the bounding boxes for the detected objects
    boxes, weights = hog.detectMultiScale(frame, winStride=(1,1), scale = 1.05)
    boxes = np.array([[x, y, x + w, y + h] for (x, y, w, h) in boxes])
    centers = []
    for box in boxes:
        # get the distance from each box's center x coordinate to the center of the screen and add it to a list
        center_x = ((box[2]-box[0])/2)+box[0]
        x_pos_rel_center = (center_x-70)
        dist_to_center_x = abs(x_pos_rel_center)
        centers.append({'box': box, 'x_pos_rel_center': x_pos_rel_center, 'dist_to_center_x':dist_to_center_x})

    if len(centers) > 0:
        # sorts the list by distance to center
        sorted_boxes = sorted(centers, key=lambda i: i['dist_to_center_x'])

        # draws the boxes
        center_box = sorted_boxes[0]['box']
        for box in range(len(sorted_boxes)):
            # display the detected boxes in the colour picture
            if box == 0:
                cv2.rectangle(frame, (sorted_boxes[box]['box'][0],sorted_boxes[box]['box'][1]), (sorted_boxes[box]['box'][2],sorted_boxes[box]['box'][3]), (0,255, 0), 2)
            else:
                cv2.rectangle(frame, (sorted_boxes[box]['box'][0],sorted_boxes[box]['box'][1]), (sorted_boxes[box]['box'][2],sorted_boxes[box]['box'][3]),(0,0,255),2)

        # retrieves the distance from center from the list and determines if the head should turn left, right, or stay put and turn lights on
        Center_box_pos_x = sorted_boxes[0]['x_pos_rel_center']
        if -center_tolerance <= Center_box_pos_x <= center_tolerance:
            # turn on eye light
            print("center")
            result = write_read("2")
        elif Center_box_pos_x >= center_tolerance:
            # turn head to the right
            print("right")
            result = write_read("3")
        elif Center_box_pos_x <= -center_tolerance:
            # turn head to the left
            print("left")
            result = write_read("1")
        print(str(Center_box_pos_x))
    else:
        # prints out that no person has been detected
        result = write_read("0")
        print("nothing detected")

    # resizes the video so it's easier to see on the screen
    frame = cv2.resize(frame,(720,720))

    # Display the resulting frame
    cv2.imshow("frame",frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything is done, release the capture
# (no video writer named "out" is ever created, so there is nothing else to release)
cap.release()
# finally, close the window
cv2.destroyAllWindows()
cv2.waitKey(1)
What Is HOG?
Histogram of oriented gradients, or HOG for short, is an image processing method that converts a given image into a vector representation of that image. The vector is then fed into a machine learning algorithm (here, the default people detector that ships with OpenCV), which classifies it and finds the bounding box of the desired object.
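To give a feel for what "a histogram of oriented gradients" means, here is a heavily simplified sketch for a single small cell of an image. It is not how OpenCV implements HOG (the real thing also splits votes between neighbouring bins and normalizes over blocks of cells), and cell_histogram is a name I made up. Each pixel looks at how the brightness changes around it, and votes for the direction of that change with a weight equal to its strength.

```python
import math


def cell_histogram(cell, bins=9):
    """Build an orientation histogram for one small grayscale cell.

    cell is a list of rows of pixel intensities. Gradients are taken with
    simple central differences on interior pixels, and each pixel votes for
    its unsigned gradient direction (0 to 180 degrees) weighted by magnitude.
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, len(cell) - 1):
        for x in range(1, len(cell[0]) - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal change
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical change
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


# A 4x4 cell with a vertical edge: dark on the left, bright on the right.
edge = [[0, 0, 255, 255]] * 4
print(cell_histogram(edge))
```

For the vertical edge above, all of the votes land in the first bin (directions near 0 degrees), because the brightness only changes from left to right. Stringing together the histograms of many cells gives the vector that the classifier looks at.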