{"id":2148,"date":"2024-08-07T22:51:01","date_gmt":"2024-08-07T22:51:01","guid":{"rendered":"https:\/\/www.w3computing.com\/articles\/?p=2148"},"modified":"2024-08-08T11:14:43","modified_gmt":"2024-08-08T11:14:43","slug":"real-time-object-detection-with-yolo-opencv","status":"publish","type":"post","link":"https:\/\/www.w3computing.com\/articles\/real-time-object-detection-with-yolo-opencv\/","title":{"rendered":"Real-Time Object Detection with YOLO and OpenCV"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">Object detection is a critical task in the field of computer vision. It involves identifying and localizing objects within an image or video frame. One of the most efficient and widely used techniques for real-time object detection is YOLO (You Only Look Once). Combined with OpenCV, a powerful library for computer vision tasks, YOLO can be implemented efficiently to perform real-time object detection. This tutorial will delve into the details of setting up and using YOLO with OpenCV to achieve robust object detection in real-time applications.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Introduction to YOLO<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">YOLO is a state-of-the-art, real-time object detection system. Unlike previous detection systems that repurpose classifiers or localizers to perform detection, YOLO frames object detection as a single regression problem, straight from image pixels to bounding box coordinates and class probabilities. 
This makes it extremely fast and efficient.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Key Features of YOLO:<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Speed<\/strong>: YOLO is fast because it predicts every box with a single network evaluation, rather than running a classifier over sliding windows or region proposals.<\/li>\n\n\n\n<li><strong>End-to-End Training<\/strong>: YOLO can be trained end-to-end directly on detection performance.<\/li>\n\n\n\n<li><strong>High Accuracy<\/strong>: YOLO achieves high accuracy with fewer background errors (false positives on background regions) in object localization.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Setting Up Your Environment<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Before we dive into the code, we need to set up our environment. We will need Python, OpenCV, and the YOLO weights and configuration files.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Prerequisites<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Python<\/strong>: Make sure you have Python installed. You can download it from <a href=\"https:\/\/www.python.org\/downloads\/\">python.org<\/a>.<\/li>\n\n\n\n<li><strong>OpenCV<\/strong>: Install OpenCV using pip:<\/li>\n<\/ol>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-1\" data-shcb-language-name=\"Bash\" data-shcb-language-slug=\"bash\"><span><code class=\"hljs language-bash\">   pip install opencv-python<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-1\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Bash<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">bash<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<ol start=\"3\" class=\"wp-block-list\">\n<li><strong>YOLO Files<\/strong>: Download the YOLO configuration and weights from the official YOLO website or the <a href=\"https:\/\/github.com\/AlexeyAB\/darknet\">Darknet GitHub repository<\/a>. 
You will need:<\/li>\n<\/ol>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>yolov3.cfg<\/code>: The configuration file.<\/li>\n\n\n\n<li><code>yolov3.weights<\/code>: The pre-trained weights file.<\/li>\n\n\n\n<li><code>coco.names<\/code>: The file containing the class names.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Directory Structure<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Create a project directory and organize your files as follows:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-2\" data-shcb-language-name=\"plaintext\" data-shcb-language-slug=\"plaintext\"><span><code class=\"hljs language-plaintext\">project\/\n\u251c\u2500\u2500 yolov3.cfg\n\u251c\u2500\u2500 yolov3.weights\n\u251c\u2500\u2500 coco.names\n\u2514\u2500\u2500 detect.py<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-2\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">plaintext<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">plaintext<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<h2 class=\"wp-block-heading\">Loading YOLO with OpenCV<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">OpenCV provides an interface to load YOLO directly. We will start by loading the model and the class names.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Step 1: Load the YOLO Network<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">OpenCV\u2019s <code>dnn<\/code> module makes it straightforward to load the YOLO network. 
Here\u2019s how to do it:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-3\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-keyword\">import<\/span> cv2\n<span class=\"hljs-keyword\">import<\/span> numpy <span class=\"hljs-keyword\">as<\/span> np\n\n<span class=\"hljs-comment\"># Load YOLO<\/span>\nnet = cv2.dnn.readNet(<span class=\"hljs-string\">\"yolov3.weights\"<\/span>, <span class=\"hljs-string\">\"yolov3.cfg\"<\/span>)\n\n<span class=\"hljs-comment\"># Load class names<\/span>\n<span class=\"hljs-keyword\">with<\/span> open(<span class=\"hljs-string\">\"coco.names\"<\/span>, <span class=\"hljs-string\">\"r\"<\/span>) <span class=\"hljs-keyword\">as<\/span> f:\n    classes = &#91;line.strip() <span class=\"hljs-keyword\">for<\/span> line <span class=\"hljs-keyword\">in<\/span> f.readlines()]\n\nlayer_names = net.getLayerNames()\noutput_layers = &#91;layer_names&#91;i - <span class=\"hljs-number\">1<\/span>] <span class=\"hljs-keyword\">for<\/span> i <span class=\"hljs-keyword\">in<\/span> net.getUnconnectedOutLayers()]<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-3\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<h3 class=\"wp-block-heading\">Step 2: Prepare the Input Image<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">YOLO expects the input image to be preprocessed into a blob. 
This involves resizing the image, normalizing it, and rearranging its dimensions.<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-4\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">preprocess_image<\/span><span class=\"hljs-params\">(image)<\/span>:<\/span>\n    blob = cv2.dnn.blobFromImage(image, <span class=\"hljs-number\">0.00392<\/span>, (<span class=\"hljs-number\">416<\/span>, <span class=\"hljs-number\">416<\/span>), (<span class=\"hljs-number\">0<\/span>, <span class=\"hljs-number\">0<\/span>, <span class=\"hljs-number\">0<\/span>), <span class=\"hljs-literal\">True<\/span>, crop=<span class=\"hljs-literal\">False<\/span>)\n    <span class=\"hljs-keyword\">return<\/span> blob<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-4\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<h3 class=\"wp-block-heading\">Step 3: Perform Forward Pass<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Perform a forward pass through the network to get the detection results.<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-5\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">get_detections<\/span><span class=\"hljs-params\">(net, blob)<\/span>:<\/span>\n    net.setInput(blob)\n    outputs = net.forward(output_layers)\n    <span class=\"hljs-keyword\">return<\/span> outputs<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-5\"><span 
class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<h2 class=\"wp-block-heading\">Processing YOLO Outputs<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Each output layer of the YOLO network returns one row per candidate detection: the box center x and y, the box width and height (all normalized to the range 0 to 1), an objectness score, and then one score per class. We need to process these rows to extract bounding boxes, class IDs, and confidence scores.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Step 4: Extract Bounding Boxes<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">We will iterate over the outputs and extract bounding boxes, confidences, and class IDs. Note that the function reads <code>width<\/code> and <code>height<\/code>, the frame dimensions, as global variables, so they must be set before it is called.<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-6\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">extract_boxes_confidences_classids<\/span><span class=\"hljs-params\">(outputs, confidence_threshold)<\/span>:<\/span>\n    boxes = &#91;]\n    confidences = &#91;]\n    class_ids = &#91;]\n\n    <span class=\"hljs-keyword\">for<\/span> output <span class=\"hljs-keyword\">in<\/span> outputs:\n        <span class=\"hljs-keyword\">for<\/span> detection <span class=\"hljs-keyword\">in<\/span> output:\n            <span class=\"hljs-comment\"># The first five values are the box and objectness; the rest are class scores<\/span>\n            scores = detection&#91;<span class=\"hljs-number\">5<\/span>:]\n            class_id = np.argmax(scores)\n            confidence = scores&#91;class_id]\n            <span class=\"hljs-keyword\">if<\/span> confidence &gt; confidence_threshold:\n                <span class=\"hljs-comment\"># Scale the normalized box back to pixel coordinates<\/span>\n                center_x = int(detection&#91;<span class=\"hljs-number\">0<\/span>] * width)\n                center_y = int(detection&#91;<span class=\"hljs-number\">1<\/span>] * height)\n                w = int(detection&#91;<span class=\"hljs-number\">2<\/span>] * width)\n                h = 
int(detection&#91;<span class=\"hljs-number\">3<\/span>] * height)\n                x = int(center_x - w \/ <span class=\"hljs-number\">2<\/span>)\n                y = int(center_y - h \/ <span class=\"hljs-number\">2<\/span>)\n                boxes.append(&#91;x, y, w, h])\n                confidences.append(float(confidence))\n                class_ids.append(class_id)\n\n    <span class=\"hljs-keyword\">return<\/span> boxes, confidences, class_ids<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-6\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<h3 class=\"wp-block-heading\">Step 5: Non-Maximum Suppression<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Non-Maximum Suppression (NMS) is used to reduce the number of overlapping bounding boxes and keep the best ones.<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-7\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">apply_nms<\/span><span class=\"hljs-params\">(boxes, confidences, score_threshold, nms_threshold)<\/span>:<\/span>\n    indices = cv2.dnn.NMSBoxes(boxes, confidences, score_threshold, nms_threshold)\n    <span class=\"hljs-keyword\">return<\/span> indices<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-7\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<h3 class=\"wp-block-heading\">Step 6: Draw Bounding Boxes<\/h3>\n\n\n\n<p 
class=\"wp-block-paragraph\">We will draw the final bounding boxes on the image along with class names and confidence scores.<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-8\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">draw_bounding_boxes<\/span><span class=\"hljs-params\">(image, boxes, confidences, class_ids, indices, classes)<\/span>:<\/span>\n    <span class=\"hljs-comment\"># NMSBoxes returns an Nx1 array in older OpenCV releases and a flat array in newer ones; flatten() handles both<\/span>\n    <span class=\"hljs-keyword\">for<\/span> i <span class=\"hljs-keyword\">in<\/span> np.array(indices).flatten():\n        box = boxes&#91;i]\n        x, y, w, h = box&#91;<span class=\"hljs-number\">0<\/span>], box&#91;<span class=\"hljs-number\">1<\/span>], box&#91;<span class=\"hljs-number\">2<\/span>], box&#91;<span class=\"hljs-number\">3<\/span>]\n        label = str(classes&#91;class_ids&#91;i]])\n        confidence = confidences&#91;i]\n        color = np.random.uniform(<span class=\"hljs-number\">0<\/span>, <span class=\"hljs-number\">255<\/span>, size=(<span class=\"hljs-number\">3<\/span>,))\n        cv2.rectangle(image, (x, y), (x + w, y + h), color, <span class=\"hljs-number\">2<\/span>)\n        cv2.putText(image, <span class=\"hljs-string\">f\"<span class=\"hljs-subst\">{label}<\/span> <span class=\"hljs-subst\">{confidence:<span class=\"hljs-number\">.2<\/span>f}<\/span>\"<\/span>, (x, y - <span class=\"hljs-number\">10<\/span>), cv2.FONT_HERSHEY_SIMPLEX, <span class=\"hljs-number\">0.5<\/span>, color, <span class=\"hljs-number\">2<\/span>)<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-8\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span 
class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<h2 class=\"wp-block-heading\">Real-Time Object Detection<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Now that we have all the pieces in place, we can perform real-time object detection using a video stream from a webcam or a video file.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Step 7: Capture Video Stream<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Capture video stream using OpenCV\u2019s <code>VideoCapture<\/code>.<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-9\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\">cap = cv2.VideoCapture(<span class=\"hljs-number\">0<\/span>)  <span class=\"hljs-comment\"># Use 0 for webcam. For a video file, provide the path to the file.<\/span>\n\n<span class=\"hljs-keyword\">while<\/span> <span class=\"hljs-literal\">True<\/span>:\n    ret, frame = cap.read()\n    <span class=\"hljs-keyword\">if<\/span> <span class=\"hljs-keyword\">not<\/span> ret:\n        <span class=\"hljs-keyword\">break<\/span>\n\n    height, width, channels = frame.shape\n\n    blob = preprocess_image(frame)\n    outputs = get_detections(net, blob)\n\n    boxes, confidences, class_ids = extract_boxes_confidences_classids(outputs, <span class=\"hljs-number\">0.5<\/span>)\n    indices = apply_nms(boxes, confidences, <span class=\"hljs-number\">0.5<\/span>, <span class=\"hljs-number\">0.4<\/span>)\n    draw_bounding_boxes(frame, boxes, confidences, class_ids, indices, classes)\n\n    cv2.imshow(<span class=\"hljs-string\">\"Real-Time Object Detection\"<\/span>, frame)\n\n    key = cv2.waitKey(<span class=\"hljs-number\">1<\/span>)\n    <span class=\"hljs-keyword\">if<\/span> key == <span class=\"hljs-number\">27<\/span>:  <span class=\"hljs-comment\"># Press 'Esc' to exit<\/span>\n        <span class=\"hljs-keyword\">break<\/span>\n\ncap.release()\ncv2.destroyAllWindows()<\/code><\/span><small 
class=\"shcb-language\" id=\"shcb-language-9\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<h3 class=\"wp-block-heading\">Step 8: Putting It All Together<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Combine all the steps into a single script for real-time object detection.<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-10\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-keyword\">import<\/span> cv2\n<span class=\"hljs-keyword\">import<\/span> numpy <span class=\"hljs-keyword\">as<\/span> np\n\n<span class=\"hljs-comment\"># Load YOLO<\/span>\nnet = cv2.dnn.readNet(<span class=\"hljs-string\">\"yolov3.weights\"<\/span>, <span class=\"hljs-string\">\"yolov3.cfg\"<\/span>)\n\n<span class=\"hljs-comment\"># Load class names<\/span>\n<span class=\"hljs-keyword\">with<\/span> open(<span class=\"hljs-string\">\"coco.names\"<\/span>, <span class=\"hljs-string\">\"r\"<\/span>) <span class=\"hljs-keyword\">as<\/span> f:\n    classes = &#91;line.strip() <span class=\"hljs-keyword\">for<\/span> line <span class=\"hljs-keyword\">in<\/span> f.readlines()]\n\nlayer_names = net.getLayerNames()\noutput_layers = &#91;layer_names&#91;i - <span class=\"hljs-number\">1<\/span>] <span class=\"hljs-keyword\">for<\/span> i <span class=\"hljs-keyword\">in<\/span> net.getUnconnectedOutLayers()]\n\n<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">preprocess_image<\/span><span class=\"hljs-params\">(image)<\/span>:<\/span>\n    blob = cv2.dnn.blobFromImage(image, <span class=\"hljs-number\">0.00392<\/span>, (<span class=\"hljs-number\">416<\/span>, <span class=\"hljs-number\">416<\/span>), (<span 
class=\"hljs-number\">0<\/span>, <span class=\"hljs-number\">0<\/span>, <span class=\"hljs-number\">0<\/span>), <span class=\"hljs-literal\">True<\/span>, crop=<span class=\"hljs-literal\">False<\/span>)\n    <span class=\"hljs-keyword\">return<\/span> blob\n\n<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">get_detections<\/span><span class=\"hljs-params\">(net, blob)<\/span>:<\/span>\n    net.setInput(blob)\n    outputs = net.forward(output_layers)\n    <span class=\"hljs-keyword\">return<\/span> outputs\n\n<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">extract_boxes_confidences_classids<\/span><span class=\"hljs-params\">(outputs, confidence_threshold)<\/span>:<\/span>\n    boxes = &#91;]\n    confidences = &#91;]\n    class_ids = &#91;]\n\n    <span class=\"hljs-keyword\">for<\/span> output <span class=\"hljs-keyword\">in<\/span> outputs:\n        <span class=\"hljs-keyword\">for<\/span> detection <span class=\"hljs-keyword\">in<\/span> output:\n            scores = detection&#91;<span class=\"hljs-number\">5<\/span>:]\n            class_id = np.argmax(scores)\n            confidence = scores&#91;class_id]\n            <span class=\"hljs-keyword\">if<\/span> confidence &gt; confidence_threshold:\n                center_x = int(detection&#91;<span class=\"hljs-number\">0<\/span>] * width)\n                center_y = int(detection&#91;<span class=\"hljs-number\">1<\/span>] * height)\n                w = int(detection&#91;<span class=\"hljs-number\">2<\/span>] * width)\n                h = int(detection&#91;<span class=\"hljs-number\">3<\/span>] * height)\n                x = int(center_x - w \/ <span class=\"hljs-number\">2<\/span>)\n                y = int(center_y - h \/ <span class=\"hljs-number\">2<\/span>)\n                boxes.append(&#91;x, y, w, h])\n                confidences.append(float(confidence))\n                class_ids.append(class_id)\n\n    
<span class=\"hljs-keyword\">return<\/span> boxes, confidences, class_ids\n\n<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">apply_nms<\/span><span class=\"hljs-params\">(boxes, confidences, score_threshold, nms_threshold)<\/span>:<\/span>\n    indices = cv2.dnn.NMSBoxes(boxes, confidences, score_threshold, nms_threshold)\n    <span class=\"hljs-keyword\">return<\/span> indices\n\n<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">draw_bounding_boxes<\/span><span class=\"hljs-params\">(image, boxes, confidences, class_ids, indices, classes)<\/span>:<\/span>\n    <span class=\"hljs-comment\"># NMSBoxes returns an Nx1 array in older OpenCV releases and a flat array in newer ones; flatten() handles both<\/span>\n    <span class=\"hljs-keyword\">for<\/span> i <span class=\"hljs-keyword\">in<\/span> np.array(indices).flatten():\n        box = boxes&#91;i]\n        x, y, w, h = box&#91;<span class=\"hljs-number\">0<\/span>], box&#91;<span class=\"hljs-number\">1<\/span>], box&#91;<span class=\"hljs-number\">2<\/span>], box&#91;<span class=\"hljs-number\">3<\/span>]\n        label = str(classes&#91;class_ids&#91;i]])\n        confidence = confidences&#91;i]\n        color = np.random.uniform(<span class=\"hljs-number\">0<\/span>, <span class=\"hljs-number\">255<\/span>, size=(<span class=\"hljs-number\">3<\/span>,))\n        cv2.rectangle(image, (x, y), (x + w, y + h), color, <span class=\"hljs-number\">2<\/span>)\n        cv2.putText(image, <span class=\"hljs-string\">f\"<span class=\"hljs-subst\">{label}<\/span> <span class=\"hljs-subst\">{confidence:<span class=\"hljs-number\">.2<\/span>f}<\/span>\"<\/span>, (x, y - <span class=\"hljs-number\">10<\/span>), cv2.FONT_HERSHEY_SIMPLEX, <span class=\"hljs-number\">0.5<\/span>, color, <span class=\"hljs-number\">2<\/span>)\n\ncap = cv2.VideoCapture(<span class=\"hljs-number\">0<\/span>)\n\n<span class=\"hljs-keyword\">while<\/span> <span class=\"hljs-literal\">True<\/span>:\n    ret, frame = cap.read()\n    <span 
class=\"hljs-keyword\">if<\/span> <span class=\"hljs-keyword\">not<\/span> ret:\n        <span class=\"hljs-keyword\">break<\/span>\n\n    height, width, channels = frame.shape\n\n    blob = preprocess_image(frame)\n    outputs = get_detections(net, blob)\n\n    boxes, confidences, class_ids = extract_boxes_confidences_classids(outputs, <span class=\"hljs-number\">0.5<\/span>)\n    indices = apply_nms(boxes, confidences, <span class=\"hljs-number\">0.5<\/span>, <span class=\"hljs-number\">0.4<\/span>)\n    draw_bounding_boxes(frame, boxes, confidences, class_ids, indices, classes)\n\n    cv2.imshow(<span class=\"hljs-string\">\"Real-Time Object Detection\"<\/span>, frame)\n\n    key = cv2.waitKey(<span class=\"hljs-number\">1<\/span>)\n    <span class=\"hljs-keyword\">if<\/span> key == <span class=\"hljs-number\">27<\/span>:\n        <span class=\"hljs-keyword\">break<\/span>\n\ncap.release()\ncv2.destroyAllWindows()<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-10\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<h2 class=\"wp-block-heading\">Optimizing Performance<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Real-time object detection can be computationally intensive. Here are some tips to optimize performance:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Use a GPU<\/strong>: Offload computations to a GPU. 
OpenCV&#8217;s <code>dnn<\/code> module can use CUDA, but only if OpenCV itself was built with CUDA support. The standard <code>opencv-python<\/code> pip wheels are CPU-only.<\/li>\n<\/ol>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-11\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\">net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)\nnet.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-11\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<ol start=\"2\" class=\"wp-block-list\">\n<li><strong>Reduce Input Size<\/strong>: Smaller input sizes reduce computational load. 
However, this may impact accuracy.<\/li>\n<\/ol>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-12\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\">   blob = cv2.dnn.blobFromImage(image, <span class=\"hljs-number\">0.00392<\/span>, (<span class=\"hljs-number\">320<\/span>, <span class=\"hljs-number\">320<\/span>), (<span class=\"hljs-number\">0<\/span>, <span class=\"hljs-number\">0<\/span>, <span class=\"hljs-number\">0<\/span>), <span class=\"hljs-literal\">True<\/span>, crop=<span class=\"hljs-literal\">False<\/span>)<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-12\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<ol start=\"3\" class=\"wp-block-list\">\n<li><strong>Batch Processing<\/strong>: Process multiple frames in a batch if latency is not critical.<\/li>\n\n\n\n<li><strong>Model Optimization<\/strong>: Use a lighter YOLO model like YOLOv3-tiny for faster inference.<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\">Practice Exercise: Real-Time Object Detection with Custom Objects and Tracking<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Objective:<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The goal of this exercise is to implement a real-time object detection system using YOLO and OpenCV that can detect custom objects and track their movements across frames. 
The system will also log the coordinates of the detected objects in a file for further analysis.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Requirements:<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Detect and track multiple custom objects (e.g., cars and pedestrians).<\/li>\n\n\n\n<li>Use YOLO for object detection.<\/li>\n\n\n\n<li>Use OpenCV for video capture, preprocessing, and displaying results.<\/li>\n\n\n\n<li>Log the coordinates of the detected objects in each frame to a CSV file.<\/li>\n\n\n\n<li>Implement a basic tracking mechanism to assign unique IDs to detected objects.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Steps:<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">1. <strong>Prepare the Dataset:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Collect or download a dataset of the custom objects you want to detect.<\/li>\n\n\n\n<li>Use a tool like LabelImg to annotate the images and create training data.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">2. <strong>Train YOLO Model:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Train a YOLO model on your custom dataset using Darknet.<\/li>\n\n\n\n<li>Save the trained weights, configuration file, and class names.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">3. 
<strong>Implement Detection and Tracking:<\/strong><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Solution:<\/h3>\n\n\n\n<h4 class=\"wp-block-heading\">Step 1: Prepare the Dataset and Train YOLO<\/h4>\n\n\n\n<p class=\"wp-block-paragraph\">(Assuming you have already trained the YOLO model and have the following files: <code>yolov3_custom.cfg<\/code>, <code>yolov3_custom.weights<\/code>, and <code>custom.names<\/code>)<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Step 2: Implement Detection and Tracking<\/h4>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-13\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-keyword\">import<\/span> cv2\n<span class=\"hljs-keyword\">import<\/span> numpy <span class=\"hljs-keyword\">as<\/span> np\n<span class=\"hljs-keyword\">import<\/span> csv\n<span class=\"hljs-keyword\">import<\/span> time\n<span class=\"hljs-keyword\">from<\/span> collections <span class=\"hljs-keyword\">import<\/span> deque\n\n<span class=\"hljs-comment\"># Load YOLO<\/span>\nnet = cv2.dnn.readNet(<span class=\"hljs-string\">\"yolov3_custom.weights\"<\/span>, <span class=\"hljs-string\">\"yolov3_custom.cfg\"<\/span>)\n\n<span class=\"hljs-comment\"># Load class names<\/span>\n<span class=\"hljs-keyword\">with<\/span> open(<span class=\"hljs-string\">\"custom.names\"<\/span>, <span class=\"hljs-string\">\"r\"<\/span>) <span class=\"hljs-keyword\">as<\/span> f:\n    classes = &#91;line.strip() <span class=\"hljs-keyword\">for<\/span> line <span class=\"hljs-keyword\">in<\/span> f.readlines()]\n\nlayer_names = net.getLayerNames()\noutput_layers = &#91;layer_names&#91;i - <span class=\"hljs-number\">1<\/span>] <span class=\"hljs-keyword\">for<\/span> i <span class=\"hljs-keyword\">in<\/span> net.getUnconnectedOutLayers()]\n\n<span class=\"hljs-comment\"># Initialize video capture<\/span>\ncap = cv2.VideoCapture(<span class=\"hljs-number\">0<\/span>)\n\n<span 
class=\"hljs-comment\"># List of (tracker, class_id) pairs for the objects currently being tracked<\/span>\nobject_trackers = &#91;]\n\n<span class=\"hljs-comment\"># Dictionary mapping each object ID to the recent center positions of that object<\/span>\nobject_paths = {}\nmax_path_length = <span class=\"hljs-number\">30<\/span>\n\n<span class=\"hljs-comment\"># Create a CSV file to log coordinates<\/span>\ncsv_file = open(<span class=\"hljs-string\">\"object_tracking_log.csv\"<\/span>, mode=<span class=\"hljs-string\">'w'<\/span>, newline=<span class=\"hljs-string\">''<\/span>)\ncsv_writer = csv.writer(csv_file)\ncsv_writer.writerow(&#91;<span class=\"hljs-string\">\"Frame\"<\/span>, <span class=\"hljs-string\">\"ObjectID\"<\/span>, <span class=\"hljs-string\">\"Class\"<\/span>, <span class=\"hljs-string\">\"X\"<\/span>, <span class=\"hljs-string\">\"Y\"<\/span>, <span class=\"hljs-string\">\"Width\"<\/span>, <span class=\"hljs-string\">\"Height\"<\/span>])\n\n<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">preprocess_image<\/span><span class=\"hljs-params\">(image)<\/span>:<\/span>\n    blob = cv2.dnn.blobFromImage(image, <span class=\"hljs-number\">0.00392<\/span>, (<span class=\"hljs-number\">416<\/span>, <span class=\"hljs-number\">416<\/span>), (<span class=\"hljs-number\">0<\/span>, <span class=\"hljs-number\">0<\/span>, <span class=\"hljs-number\">0<\/span>), <span class=\"hljs-literal\">True<\/span>, crop=<span class=\"hljs-literal\">False<\/span>)\n    <span class=\"hljs-keyword\">return<\/span> blob\n\n<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">get_detections<\/span><span class=\"hljs-params\">(net, blob)<\/span>:<\/span>\n    net.setInput(blob)\n    outputs = net.forward(output_layers)\n    <span class=\"hljs-keyword\">return<\/span> outputs\n\n<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">extract_boxes_confidences_classids<\/span><span 
class=\"hljs-params\">(outputs, confidence_threshold)<\/span>:<\/span>\n    boxes = &#91;]\n    confidences = &#91;]\n    class_ids = &#91;]\n\n    <span class=\"hljs-keyword\">for<\/span> output <span class=\"hljs-keyword\">in<\/span> outputs:\n        <span class=\"hljs-keyword\">for<\/span> detection <span class=\"hljs-keyword\">in<\/span> output:\n            scores = detection&#91;<span class=\"hljs-number\">5<\/span>:]\n            class_id = np.argmax(scores)\n            confidence = scores&#91;class_id]\n            <span class=\"hljs-keyword\">if<\/span> confidence &gt; confidence_threshold:\n                center_x = int(detection&#91;<span class=\"hljs-number\">0<\/span>] * width)\n                center_y = int(detection&#91;<span class=\"hljs-number\">1<\/span>] * height)\n                w = int(detection&#91;<span class=\"hljs-number\">2<\/span>] * width)\n                h = int(detection&#91;<span class=\"hljs-number\">3<\/span>] * height)\n                x = int(center_x - w \/ <span class=\"hljs-number\">2<\/span>)\n                y = int(center_y - h \/ <span class=\"hljs-number\">2<\/span>)\n                boxes.append(&#91;x, y, w, h])\n                confidences.append(float(confidence))\n                class_ids.append(class_id)\n\n    <span class=\"hljs-keyword\">return<\/span> boxes, confidences, class_ids\n\n<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">apply_nms<\/span><span class=\"hljs-params\">(boxes, confidences, score_threshold, nms_threshold)<\/span>:<\/span>\n    indices = cv2.dnn.NMSBoxes(boxes, confidences, score_threshold, nms_threshold)\n    <span class=\"hljs-keyword\">return<\/span> indices\n\n<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">draw_bounding_boxes<\/span><span class=\"hljs-params\">(image, boxes, confidences, class_ids, indices, classes)<\/span>:<\/span>\n    <span class=\"hljs-keyword\">global<\/span> 
object_trackers, object_paths\n\n    new_object_trackers = &#91;]\n    new_object_paths = {}\n\n    <span class=\"hljs-comment\"># Flatten so the loop works whether NMSBoxes returns an Nx1 array (older OpenCV) or a flat array (newer OpenCV)<\/span>\n    <span class=\"hljs-keyword\">for<\/span> i <span class=\"hljs-keyword\">in<\/span> np.array(indices).flatten():\n        box = boxes&#91;i]\n        x, y, w, h = box&#91;<span class=\"hljs-number\">0<\/span>], box&#91;<span class=\"hljs-number\">1<\/span>], box&#91;<span class=\"hljs-number\">2<\/span>], box&#91;<span class=\"hljs-number\">3<\/span>]\n        label = str(classes&#91;class_ids&#91;i]])\n        confidence = confidences&#91;i]\n        color = np.random.uniform(<span class=\"hljs-number\">0<\/span>, <span class=\"hljs-number\">255<\/span>, size=(<span class=\"hljs-number\">3<\/span>,))\n        cv2.rectangle(image, (x, y), (x + w, y + h), color, <span class=\"hljs-number\">2<\/span>)\n        cv2.putText(image, <span class=\"hljs-string\">f\"<span class=\"hljs-subst\">{label}<\/span> <span class=\"hljs-subst\">{confidence:<span class=\"hljs-number\">.2<\/span>f}<\/span>\"<\/span>, (x, y - <span class=\"hljs-number\">10<\/span>), cv2.FONT_HERSHEY_SIMPLEX, <span class=\"hljs-number\">0.5<\/span>, color, <span class=\"hljs-number\">2<\/span>)\n\n        <span class=\"hljs-comment\"># Log the coordinates<\/span>\n        csv_writer.writerow(&#91;frame_count, i, label, x, y, w, h])\n\n        <span class=\"hljs-comment\"># Initialize a fresh KCF tracker for this detection<\/span>\n        tracker = cv2.TrackerKCF_create()\n        tracker.init(image, tuple(box))\n        new_object_trackers.append((tracker, class_ids&#91;i]))\n\n        <span class=\"hljs-comment\"># Update paths<\/span>\n        <span class=\"hljs-keyword\">if<\/span> i <span class=\"hljs-keyword\">in<\/span> object_paths:\n            object_paths&#91;i].append((x + w \/\/ <span class=\"hljs-number\">2<\/span>, y + h \/\/ <span class=\"hljs-number\">2<\/span>))\n            <span class=\"hljs-keyword\">if<\/span> len(object_paths&#91;i]) &gt; max_path_length:\n                
object_paths&#91;i].popleft()\n        <span class=\"hljs-keyword\">else<\/span>:\n            object_paths&#91;i] = deque(&#91;(x + w \/\/ <span class=\"hljs-number\">2<\/span>, y + h \/\/ <span class=\"hljs-number\">2<\/span>)], maxlen=max_path_length)\n\n        new_object_paths&#91;i] = object_paths&#91;i]\n\n    object_trackers = new_object_trackers\n    object_paths = new_object_paths\n\n    <span class=\"hljs-comment\"># Draw paths<\/span>\n    <span class=\"hljs-keyword\">for<\/span> object_id, path <span class=\"hljs-keyword\">in<\/span> object_paths.items():\n        <span class=\"hljs-keyword\">for<\/span> j <span class=\"hljs-keyword\">in<\/span> range(<span class=\"hljs-number\">1<\/span>, len(path)):\n            <span class=\"hljs-keyword\">if<\/span> path&#91;j - <span class=\"hljs-number\">1<\/span>] <span class=\"hljs-keyword\">is<\/span> <span class=\"hljs-literal\">None<\/span> <span class=\"hljs-keyword\">or<\/span> path&#91;j] <span class=\"hljs-keyword\">is<\/span> <span class=\"hljs-literal\">None<\/span>:\n                <span class=\"hljs-keyword\">continue<\/span>\n            thickness = int(np.sqrt(max_path_length \/ float(j + <span class=\"hljs-number\">1<\/span>)) * <span class=\"hljs-number\">2.5<\/span>)\n            cv2.line(image, path&#91;j - <span class=\"hljs-number\">1<\/span>], path&#91;j], color, thickness)\n\nframe_count = <span class=\"hljs-number\">0<\/span>\n\n<span class=\"hljs-keyword\">while<\/span> <span class=\"hljs-literal\">True<\/span>:\n    ret, frame = cap.read()\n    <span class=\"hljs-keyword\">if<\/span> <span class=\"hljs-keyword\">not<\/span> ret:\n        <span class=\"hljs-keyword\">break<\/span>\n\n    height, width, channels = frame.shape\n\n    blob = preprocess_image(frame)\n    outputs = get_detections(net, blob)\n\n    boxes, confidences, class_ids = extract_boxes_confidences_classids(outputs, <span class=\"hljs-number\">0.5<\/span>)\n    indices = apply_nms(boxes, confidences, <span 
class=\"hljs-number\">0.5<\/span>, <span class=\"hljs-number\">0.4<\/span>)\n    draw_bounding_boxes(frame, boxes, confidences, class_ids, indices, classes)\n\n    <span class=\"hljs-comment\"># Update trackers<\/span>\n    new_object_trackers = &#91;]\n    <span class=\"hljs-keyword\">for<\/span> tracker, class_id <span class=\"hljs-keyword\">in<\/span> object_trackers:\n        success, box = tracker.update(frame)\n        <span class=\"hljs-keyword\">if<\/span> success:\n            x, y, w, h = &#91;int(v) <span class=\"hljs-keyword\">for<\/span> v <span class=\"hljs-keyword\">in<\/span> box]\n            color = np.random.uniform(<span class=\"hljs-number\">0<\/span>, <span class=\"hljs-number\">255<\/span>, size=(<span class=\"hljs-number\">3<\/span>,))\n            cv2.rectangle(frame, (x, y), (x + w, y + h), color, <span class=\"hljs-number\">2<\/span>)\n            cv2.putText(frame, classes&#91;class_id], (x, y - <span class=\"hljs-number\">10<\/span>), cv2.FONT_HERSHEY_SIMPLEX, <span class=\"hljs-number\">0.5<\/span>, color, <span class=\"hljs-number\">2<\/span>)\n            new_object_trackers.append((tracker, class_id))\n\n            <span class=\"hljs-comment\"># Log the coordinates<\/span>\n            csv_writer.writerow(&#91;frame_count, class_id, classes&#91;class_id], x, y, w, h])\n\n            <span class=\"hljs-comment\"># Update paths<\/span>\n            <span class=\"hljs-keyword\">if<\/span> class_id <span class=\"hljs-keyword\">in<\/span> object_paths:\n                object_paths&#91;class_id].append((x + w \/\/ <span class=\"hljs-number\">2<\/span>, y + h \/\/ <span class=\"hljs-number\">2<\/span>))\n                <span class=\"hljs-keyword\">if<\/span> len(object_paths&#91;class_id]) &gt; max_path_length:\n                    object_paths&#91;class_id].popleft()\n            <span class=\"hljs-keyword\">else<\/span>:\n                object_paths&#91;class_id] = deque(&#91;(x + w \/\/ <span class=\"hljs-number\">2<\/span>, y + h 
\/\/ <span class=\"hljs-number\">2<\/span>)], maxlen=max_path_length)\n\n    object_trackers = new_object_trackers\n\n    <span class=\"hljs-comment\"># Draw paths in a fixed color so this step never depends on the tracker loop having set a color<\/span>\n    <span class=\"hljs-keyword\">for<\/span> object_id, path <span class=\"hljs-keyword\">in<\/span> object_paths.items():\n        <span class=\"hljs-keyword\">for<\/span> j <span class=\"hljs-keyword\">in<\/span> range(<span class=\"hljs-number\">1<\/span>, len(path)):\n            <span class=\"hljs-keyword\">if<\/span> path&#91;j - <span class=\"hljs-number\">1<\/span>] <span class=\"hljs-keyword\">is<\/span> <span class=\"hljs-literal\">None<\/span> <span class=\"hljs-keyword\">or<\/span> path&#91;j] <span class=\"hljs-keyword\">is<\/span> <span class=\"hljs-literal\">None<\/span>:\n                <span class=\"hljs-keyword\">continue<\/span>\n            thickness = int(np.sqrt(max_path_length \/ float(j + <span class=\"hljs-number\">1<\/span>)) * <span class=\"hljs-number\">2.5<\/span>)\n            cv2.line(frame, path&#91;j - <span class=\"hljs-number\">1<\/span>], path&#91;j], (<span class=\"hljs-number\">0<\/span>, <span class=\"hljs-number\">255<\/span>, <span class=\"hljs-number\">0<\/span>), thickness)\n\n    cv2.imshow(<span class=\"hljs-string\">\"Real-Time Object Detection and Tracking\"<\/span>, frame)\n\n    frame_count += <span class=\"hljs-number\">1<\/span>\n\n    key = cv2.waitKey(<span class=\"hljs-number\">1<\/span>)\n    <span class=\"hljs-keyword\">if<\/span> key == <span class=\"hljs-number\">27<\/span>:  <span class=\"hljs-comment\"># Press 'Esc' to exit<\/span>\n        <span class=\"hljs-keyword\">break<\/span>\n\ncap.release()\ncsv_file.close()\ncv2.destroyAllWindows()<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-13\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<h3 class=\"wp-block-heading\">Explanation:<\/h3>\n\n\n\n<ol 
class=\"wp-block-list\">\n<li><strong>Preprocess Image<\/strong>: <code>preprocess_image()<\/code> converts each frame to a 416x416 blob, scaling pixel values by 1\/255 and swapping BGR to RGB; <code>get_detections()<\/code> then passes the blob through the YOLO network.<\/li>\n\n\n\n<li><strong>Extract Boxes, Confidences, and Class IDs<\/strong>: Parse the network outputs, keep detections whose best class score exceeds the confidence threshold (0.5 here), and convert the normalized center coordinates to pixel bounding boxes.<\/li>\n\n\n\n<li><strong>Apply Non-Maximum Suppression<\/strong>: Filter out overlapping bounding boxes (overlap threshold 0.4), keeping only the most confident box for each object.<\/li>\n\n\n\n<li><strong>Draw Bounding Boxes<\/strong>: Draw each surviving box with its class label and confidence, and log its coordinates.<\/li>\n\n\n\n<li><strong>Initialize and Update Trackers<\/strong>: Use OpenCV&#8217;s <code>TrackerKCF_create()<\/code> to track the objects across frames. In this solution the trackers are re-created from the fresh detections on every frame. In recent OpenCV releases the KCF tracker is typically provided by the contrib build, so you may need <code>pip install opencv-contrib-python<\/code> instead of <code>opencv-python<\/code>.<\/li>\n\n\n\n<li><strong>Draw Paths<\/strong>: Store up to 30 recent center points per object in a <code>deque<\/code> and draw them on the frame as a trail.<\/li>\n\n\n\n<li><strong>Log Data<\/strong>: Write the frame number, object ID, class, and box coordinates of each detected object to <code>object_tracking_log.csv<\/code> for further analysis.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Conclusion:<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">This exercise walks through a real-time object detection and tracking system built with YOLO and OpenCV. It covers detection, tracking, logging, and visualization, and can serve as a starting point for your own real-time applications.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Object detection is a critical task in the field of computer vision. It involves identifying and localizing objects within an image or video frame. One of the most efficient and widely used techniques for real-time object detection is YOLO (You Only Look Once). 
Combined with OpenCV, a powerful library for computer vision tasks, YOLO can [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_genesis_hide_title":false,"_genesis_hide_breadcrumbs":false,"_genesis_hide_singular_image":false,"_genesis_hide_footer_widgets":false,"_genesis_custom_body_class":"","_genesis_custom_post_class":"","_genesis_layout":"","_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[18,4,6],"tags":[],"class_list":["post-2148","post","type-post","status-publish","format-standard","category-artificial-intelligence","category-programming-languages","category-python","entry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.6 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Real-Time Object Detection with YOLO and OpenCV<\/title>\n<meta name=\"description\" content=\"One of the most efficient and widely used techniques for real-time object detection is YOLO (You Only Look Once). Combined with OpenCV,\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.w3computing.com\/articles\/real-time-object-detection-with-yolo-opencv\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Real-Time Object Detection with YOLO and OpenCV\" \/>\n<meta property=\"og:description\" content=\"One of the most efficient and widely used techniques for real-time object detection is YOLO (You Only Look Once). 
Combined with OpenCV,\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.w3computing.com\/articles\/real-time-object-detection-with-yolo-opencv\/\" \/>\n<meta property=\"article:published_time\" content=\"2024-08-07T22:51:01+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-08-08T11:14:43+00:00\" \/>\n<meta name=\"author\" content=\"w3compadmin\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"w3compadmin\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"TechArticle\",\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/real-time-object-detection-with-yolo-opencv\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/real-time-object-detection-with-yolo-opencv\\\/\"},\"author\":{\"name\":\"w3compadmin\",\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/#\\\/schema\\\/person\\\/a550b3e20d78bb4f79b7c6b7b53f0561\"},\"headline\":\"Real-Time Object Detection with YOLO and OpenCV\",\"datePublished\":\"2024-08-07T22:51:01+00:00\",\"dateModified\":\"2024-08-08T11:14:43+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/real-time-object-detection-with-yolo-opencv\\\/\"},\"wordCount\":937,\"articleSection\":[\"Artificial Intelligence\",\"Programming Languages\",\"Python\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/real-time-object-detection-with-yolo-opencv\\\/\",\"url\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/real-time-object-detection-with-yolo-opencv\\\/\",\"name\":\"Real-Time Object Detection with YOLO and 
OpenCV\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/#website\"},\"datePublished\":\"2024-08-07T22:51:01+00:00\",\"dateModified\":\"2024-08-08T11:14:43+00:00\",\"author\":{\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/#\\\/schema\\\/person\\\/a550b3e20d78bb4f79b7c6b7b53f0561\"},\"description\":\"One of the most efficient and widely used techniques for real-time object detection is YOLO (You Only Look Once). Combined with OpenCV,\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/real-time-object-detection-with-yolo-opencv\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/real-time-object-detection-with-yolo-opencv\\\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/real-time-object-detection-with-yolo-opencv\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Articles Home\",\"item\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Artificial Intelligence\",\"item\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/artificial-intelligence\\\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Real-Time Object Detection with YOLO and OpenCV\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/#website\",\"url\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/\",\"name\":\"Developer Articles Hub\",\"description\":\"\",\"alternateName\":\"Developer 
Articles\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/#\\\/schema\\\/person\\\/a550b3e20d78bb4f79b7c6b7b53f0561\",\"name\":\"w3compadmin\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/wp-content\\\/litespeed\\\/avatar\\\/bd481d404e42caa2763662a3bfe825f8.jpg?ver=1780141266\",\"url\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/wp-content\\\/litespeed\\\/avatar\\\/bd481d404e42caa2763662a3bfe825f8.jpg?ver=1780141266\",\"contentUrl\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/wp-content\\\/litespeed\\\/avatar\\\/bd481d404e42caa2763662a3bfe825f8.jpg?ver=1780141266\",\"caption\":\"w3compadmin\"},\"sameAs\":[\"http:\\\/\\\/w3computing.com\\\/articles\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Real-Time Object Detection with YOLO and OpenCV","description":"One of the most efficient and widely used techniques for real-time object detection is YOLO (You Only Look Once). Combined with OpenCV,","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.w3computing.com\/articles\/real-time-object-detection-with-yolo-opencv\/","og_locale":"en_US","og_type":"article","og_title":"Real-Time Object Detection with YOLO and OpenCV","og_description":"One of the most efficient and widely used techniques for real-time object detection is YOLO (You Only Look Once). 
Combined with OpenCV,","og_url":"https:\/\/www.w3computing.com\/articles\/real-time-object-detection-with-yolo-opencv\/","article_published_time":"2024-08-07T22:51:01+00:00","article_modified_time":"2024-08-08T11:14:43+00:00","author":"w3compadmin","twitter_card":"summary_large_image","twitter_misc":{"Written by":"w3compadmin","Est. reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"TechArticle","@id":"https:\/\/www.w3computing.com\/articles\/real-time-object-detection-with-yolo-opencv\/#article","isPartOf":{"@id":"https:\/\/www.w3computing.com\/articles\/real-time-object-detection-with-yolo-opencv\/"},"author":{"name":"w3compadmin","@id":"https:\/\/www.w3computing.com\/articles\/#\/schema\/person\/a550b3e20d78bb4f79b7c6b7b53f0561"},"headline":"Real-Time Object Detection with YOLO and OpenCV","datePublished":"2024-08-07T22:51:01+00:00","dateModified":"2024-08-08T11:14:43+00:00","mainEntityOfPage":{"@id":"https:\/\/www.w3computing.com\/articles\/real-time-object-detection-with-yolo-opencv\/"},"wordCount":937,"articleSection":["Artificial Intelligence","Programming Languages","Python"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.w3computing.com\/articles\/real-time-object-detection-with-yolo-opencv\/","url":"https:\/\/www.w3computing.com\/articles\/real-time-object-detection-with-yolo-opencv\/","name":"Real-Time Object Detection with YOLO and OpenCV","isPartOf":{"@id":"https:\/\/www.w3computing.com\/articles\/#website"},"datePublished":"2024-08-07T22:51:01+00:00","dateModified":"2024-08-08T11:14:43+00:00","author":{"@id":"https:\/\/www.w3computing.com\/articles\/#\/schema\/person\/a550b3e20d78bb4f79b7c6b7b53f0561"},"description":"One of the most efficient and widely used techniques for real-time object detection is YOLO (You Only Look Once). 
Combined with OpenCV,","breadcrumb":{"@id":"https:\/\/www.w3computing.com\/articles\/real-time-object-detection-with-yolo-opencv\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.w3computing.com\/articles\/real-time-object-detection-with-yolo-opencv\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.w3computing.com\/articles\/real-time-object-detection-with-yolo-opencv\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Articles Home","item":"https:\/\/www.w3computing.com\/articles\/"},{"@type":"ListItem","position":2,"name":"Artificial Intelligence","item":"https:\/\/www.w3computing.com\/articles\/artificial-intelligence\/"},{"@type":"ListItem","position":3,"name":"Real-Time Object Detection with YOLO and OpenCV"}]},{"@type":"WebSite","@id":"https:\/\/www.w3computing.com\/articles\/#website","url":"https:\/\/www.w3computing.com\/articles\/","name":"Developer Articles Hub","description":"","alternateName":"Developer 
Articles","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.w3computing.com\/articles\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.w3computing.com\/articles\/#\/schema\/person\/a550b3e20d78bb4f79b7c6b7b53f0561","name":"w3compadmin","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.w3computing.com\/articles\/wp-content\/litespeed\/avatar\/bd481d404e42caa2763662a3bfe825f8.jpg?ver=1780141266","url":"https:\/\/www.w3computing.com\/articles\/wp-content\/litespeed\/avatar\/bd481d404e42caa2763662a3bfe825f8.jpg?ver=1780141266","contentUrl":"https:\/\/www.w3computing.com\/articles\/wp-content\/litespeed\/avatar\/bd481d404e42caa2763662a3bfe825f8.jpg?ver=1780141266","caption":"w3compadmin"},"sameAs":["http:\/\/w3computing.com\/articles"]}]}},"featured_image_src":null,"featured_image_src_square":null,"author_info":{"display_name":"w3compadmin","author_link":"https:\/\/www.w3computing.com\/articles\/author\/w3compadmin\/"},"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.w3computing.com\/articles\/wp-json\/wp\/v2\/posts\/2148","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.w3computing.com\/articles\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.w3computing.com\/articles\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.w3computing.com\/articles\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.w3computing.com\/articles\/wp-json\/wp\/v2\/comments?post=2148"}],"version-history":[{"count":3,"href":"https:\/\/www.w3computing.com\/articles\/wp-json\/wp\/v2\/posts\/2148\/revisions"}],"predecessor-version":[{"id":2156,"href":"https:\/\/www.w3computing.com\/articles\/wp-json\/wp\/v2\/posts\/2148\/revisions\/2
156"}],"wp:attachment":[{"href":"https:\/\/www.w3computing.com\/articles\/wp-json\/wp\/v2\/media?parent=2148"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.w3computing.com\/articles\/wp-json\/wp\/v2\/categories?post=2148"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.w3computing.com\/articles\/wp-json\/wp\/v2\/tags?post=2148"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}