Mastering Computer Vision with OpenCV and Python: A Deep Dive into Advanced Techniques with Code Demonstrations

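In the ever-evolving technology landscape, computer vision stands out as a transformative force that lets machines interpret and understand visual information. OpenCV (Open Source Computer Vision Library) is a cornerstone of the field, offering a rich set of tools and functions for image and video processing. In this article we cover the basics of OpenCV and then work through ten Python code examples that show its versatility and power.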

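Understanding OpenCV

OpenCV is an open-source computer vision and machine learning library that provides tools for image and video analysis. It is written in C++ and was later extended with Python bindings. OpenCV supports a wide range of computer vision tasks, including image and video processing, object detection, and face recognition. That versatility has made it a first choice for researchers, developers, and hobbyists.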

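Installing OpenCV

Before diving into the code examples, make sure OpenCV is installed. Most of the examples also use Matplotlib for display, which can be installed the same way. Install OpenCV in your Python environment with the following command: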

pip install opencv-python

After installation, import OpenCV in your Python script or Jupyter notebook:

import cv2

1. Loading and Displaying an Image

Let's start with a simple example: load an image with OpenCV and display it:

2. Converting to Grayscale

Convert an image to grayscale with OpenCV:

# Convert the image to grayscale
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Display the grayscale image
plt.imshow(gray_image, cmap='gray')
plt.axis('off')
plt.show()

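3. Blurring an Image

Apply a Gaussian blur to an image to reduce noise:

# Apply Gaussian blur with a 5x5 kernel; a sigma of 0 lets OpenCV derive it from the kernel size
blurred_image = cv2.GaussianBlur(image, (5, 5), 0)

# Display the blurred image
plt.imshow(cv2.cvtColor(blurred_image, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()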

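4. Edge Detection

Highlight the edges in an image with the Canny edge detection algorithm:

# Apply Canny edge detection (lower threshold 50, upper threshold 150)
edges = cv2.Canny(gray_image, 50, 150)

# Display the edges
plt.imshow(edges, cmap='gray')
plt.axis('off')
plt.show()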

5. Object Detection

Detect faces in an image with a pre-trained Haar cascade:

# Load the pre-trained face cascade
faceCascade = cv2.CascadeClassifier('./opencv-master/data/haarcascades/' + 'haarcascade_frontalface_default.xml')

# Detect faces in the image
# minNeighbors=0 keeps every raw candidate box; raise it (Example 8 uses 5) to filter out false positives
faces = faceCascade.detectMultiScale(
    gray_image,
    scaleFactor=1.1,
    minNeighbors=0,
    minSize=(10, 10)
)
how_many_faces = len(faces)

# Draw rectangles around the detected faces
for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x+w, y+h), (255, 0, 0), 2)

# Display the image with face detection
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

6. Image Histogram

Compute and plot a histogram for an image:

# Calculate the histogram of channel 0 (the blue channel for a BGR image)
hist = cv2.calcHist([image], [0], None, [256], [0, 256])

# Plot the histogram
plt.plot(hist)
plt.title('Image Histogram')
plt.xlabel('Pixel Value')
plt.ylabel('Frequency')
plt.show()

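Reading the histogram:

The x-axis shows pixel values (intensity levels) from 0 to 255.
The y-axis shows how many pixels in the image have each value.
Peaks mark the intensity values that occur most often in the image.
The histogram summarizes how pixel intensities are distributed, which helps in judging the image's overall brightness and contrast.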

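7. Image Stitching

Stitch several images together into a panoramic view:

import cv2

stitcher = cv2.Stitcher_create()
image1 = cv2.imread("./foo.png")
image2 = cv2.imread("./bar.png")

# stitch() returns a status code and the panorama; check the status before saving
status, panorama = stitcher.stitch([image1, image2])
if status == cv2.Stitcher_OK:
    cv2.imwrite("./result.jpg", panorama)
else:
    print("Stitching failed with status", status)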

8. Real-Time Face Detection with a Webcam

Detect faces in real time with OpenCV and a webcam:

# Open a connection to the webcam
cap = cv2.VideoCapture(0)

while True:
    # Capture frame-by-frame
    ret, frame = cap.read()
    # Stop if no frame could be read (camera missing or disconnected)
    if not ret:
        break

    # Convert the frame to grayscale for face detection
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Detect faces in the frame (faceCascade is the classifier loaded in Example 5)
    faces = faceCascade.detectMultiScale(gray_frame, scaleFactor=1.1, minNeighbors=5, minSize=(30, 30))

    # Draw rectangles around the detected faces
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)

    # Display the frame
    cv2.imshow('Real-time Face Detection', frame)

    # Break the loop when the 'q' key is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the webcam and close all windows
cap.release()
cv2.destroyAllWindows()

9. Document Scanner

Build a document scanner that extracts the page from a photo of a document. This example shows how to apply a perspective transform to get a top-down view of the document:

import cv2
import numpy as np
import matplotlib.pyplot as plt

# Load the image of the document
document_path = 'path/to/your/document.jpg'
document_image = cv2.imread(document_path)

# Convert the image to grayscale
gray_document = cv2.cvtColor(document_image, cv2.COLOR_BGR2GRAY)

# Apply Gaussian blur to reduce noise and improve edge detection
blurred_document = cv2.GaussianBlur(gray_document, (5, 5), 0)

# Use Canny edge detection to find edges in the image
edges_document = cv2.Canny(blurred_document, 50, 150)

# Find contours in the edge image
contours, _ = cv2.findContours(edges_document, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Pick the contour with the largest area (assuming it's the document)
largest_contour = max(contours, key=cv2.contourArea)

# Calculate the perimeter of the contour
perimeter = cv2.arcLength(largest_contour, True)

# Approximate the contour with a simpler polygon
approx = cv2.approxPolyDP(largest_contour, 0.02 * perimeter, True)

# Ensure the approximated contour has four points (a quadrilateral)
if len(approx) == 4:
    # getPerspectiveTransform needs float32 points; the corners are assumed to arrive
    # in the same order as the destination points
    # (top-left, top-right, bottom-right, bottom-left)
    src_points = approx.reshape(4, 2).astype(np.float32)
    dst_points = np.float32([[0, 0], [800, 0], [800, 1200], [0, 1200]])

    # Apply the perspective transform to obtain a top-down view of the document
    matrix = cv2.getPerspectiveTransform(src_points, dst_points)
    transformed_document = cv2.warpPerspective(document_image, matrix, (800, 1200))

    # Display the original and transformed document side by side
    plt.figure(figsize=(10, 5))
    plt.subplot(1, 2, 1)
    plt.imshow(cv2.cvtColor(document_image, cv2.COLOR_BGR2RGB))
    plt.title('Original Document')
    plt.axis('off')

    plt.subplot(1, 2, 2)
    plt.imshow(cv2.cvtColor(transformed_document, cv2.COLOR_BGR2RGB))
    plt.title('Transformed Document')
    plt.axis('off')
    plt.show()
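The transform above pairs each detected corner with a destination corner by position, but `cv2.approxPolyDP` does not guarantee which corner comes first. A common way to handle this, sketched here as an assumption rather than as part of the original example, is to sort the four points into top-left, top-right, bottom-right, bottom-left order using coordinate sums and differences. The helper name `order_corners` is illustrative, and the trick assumes the document is not rotated close to 45 degrees.

```python
import numpy as np


def order_corners(points):
    """Order four (x, y) points as top-left, top-right, bottom-right, bottom-left."""
    pts = np.asarray(points, dtype=np.float32).reshape(4, 2)
    sums = pts.sum(axis=1)          # x + y: smallest at top-left, largest at bottom-right
    diffs = pts[:, 0] - pts[:, 1]   # x - y: largest at top-right, smallest at bottom-left
    return np.float32([
        pts[np.argmin(sums)],   # top-left
        pts[np.argmax(diffs)],  # top-right
        pts[np.argmax(sums)],   # bottom-right
        pts[np.argmin(diffs)],  # bottom-left
    ])


# Corners in arbitrary order, shaped like the output of cv2.approxPolyDP
approx = np.array([[[410, 620]], [[30, 40]], [[20, 600]], [[400, 50]]])
print(order_corners(approx))
```

In the scanner, `order_corners(approx)` could be passed to `cv2.getPerspectiveTransform` as the source points in place of the reshaped `approx` array.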

10. Optical Character Recognition (OCR)

Perform optical character recognition with the Tesseract OCR engine through the pytesseract library. This example extracts text from an image:

import cv2
import pytesseract
import matplotlib.pyplot as plt
from PIL import Image

# Path to the Tesseract executable (replace with your path)
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

# Load an image containing text
text_image_path = 'path/to/your/text_image.jpg'
text_image = cv2.imread(text_image_path)

# Convert the image to grayscale
gray_text_image = cv2.cvtColor(text_image, cv2.COLOR_BGR2GRAY)

# Use thresholding to emphasize the text
_, thresholded_text = cv2.threshold(gray_text_image, 150, 255, cv2.THRESH_BINARY)

# Use pytesseract to perform OCR on the thresholded image
text = pytesseract.image_to_string(Image.fromarray(thresholded_text))

# Display the original image and the extracted text
plt.figure(figsize=(8, 6))
plt.imshow(cv2.cvtColor(text_image, cv2.COLOR_BGR2RGB))
plt.title('Original Image')
plt.axis('off')
plt.show()

print("Extracted Text:")
print(text)

Extracted Text:
Tesseract at UB Mannheim
The Mannheim University Library (UB Mannheim) uses Tesseract to perform text recognition (OCR = optical character recognition) for historical German newspapers ( ' ). The latest results with text from more than 700000 pages are available
Tesseract installer for Windows
Normally we run Tesseract on Debian GNU Linux, but there was also the need for a Windows version. That's why we have built a Tesseract installer for Windows.
WARNING: Tesseract should be either installed in the directory which is suggested during the installation or in a new directory. The uninstaller removes the whole installation directory. If you installed Tesseract in an existing directory, that directory will be removed with all its subdirectories and files.

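These examples show the range of tasks OpenCV can handle, from document scanning to text extraction via OCR. By bringing these techniques into your own projects, you can apply computer vision to real-world problems.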

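Conclusion

OpenCV gives developers and researchers the means to explore the wide world of computer vision. This article introduced OpenCV and walked through Python code examples covering image and video processing, object detection, and more. As the technology continues to evolve, OpenCV remains an invaluable tool for anyone looking to push the boundaries of computer vision.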
