Question

I have a few hundred images (scanned documents), most of which are skewed. I want to deskew them using Python.
Here is the code I used:

import numpy as np
import cv2

from skimage.transform import radon


filename = 'path_to_filename'
# Load file, converting to grayscale
img = cv2.imread(filename)
I = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
h, w = I.shape
# If the resolution is high, resize the image to reduce processing time.
if (w > 640):
    I = cv2.resize(I, (640, int((h / w) * 640)))
I = I - np.mean(I)  # Demean; make the brightness extend above and below zero
# Do the radon transform
sinogram = radon(I)
# Find the RMS value of each row and find "busiest" rotation,
# where the transform is lined up perfectly with the alternating dark
# text and white lines
r = np.array([np.sqrt(np.mean(np.abs(line) ** 2)) for line in sinogram.transpose()])
rotation = np.argmax(r)
print('Rotation: {:.2f} degrees'.format(90 - rotation))

# Rotate and save with the original resolution
M = cv2.getRotationMatrix2D((w/2,h/2),90 - rotation,1)
dst = cv2.warpAffine(img,M,(w,h))
cv2.imwrite('rotated.jpg', dst)

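This code works well for most documents, except at certain angles: (180 and 0) and (90 and 270) are often detected as the same angle, i.e. the code makes no distinction between 180 and 0, or between 90 and 270. As a result, I get a lot of upside-down documents.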
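One way to see why this happens: rotating a page by 180 degrees only reverses its projection profile, so an order-independent statistic such as the RMS comes out the same. The following is a minimal NumPy sketch of that effect for a single projection angle, not the Radon-based code above; the synthetic `page` array and the helper `profile_rms` are made up for illustration.

```python
import numpy as np

# Synthetic "page": eight text rows with random ink on a blank background.
rng = np.random.default_rng(0)
page = np.zeros((60, 40))
page[5:50:6, 3:35] = rng.random((8, 32))

def profile_rms(img):
    """RMS of the demeaned row sums (the projection at 0 degrees)."""
    profile = img.sum(axis=1)
    profile = profile - profile.mean()
    return np.sqrt(np.mean(profile ** 2))

upright = profile_rms(page)
flipped = profile_rms(np.rot90(page, 2))  # rotate by 180 degrees

# The two values agree, so this statistic cannot tell 0 from 180 degrees.
print(np.isclose(upright, flipped))
```

Any fix therefore needs a second cue that is not symmetric under a 180-degree rotation, which is what the answers below try to supply.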

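Here is an example:

The resulting image I get is the same as the input image.

Is there any suggestion for detecting whether an image is upside down using OpenCV and Python?
PS: I tried checking the orientation using EXIF data, but it didn't lead to any solution.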

EDIT:
It is possible to detect the orientation using Tesseract (pytesseract for Python), but only when the image contains a lot of characters.
For anyone who may need this:

import cv2
import pytesseract


print(pytesseract.image_to_osd(cv2.imread(file_name)))

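If the document contains enough characters, Tesseract can detect the orientation. However, when the image has only a few lines, the orientation angle suggested by Tesseract is usually wrong. So this is not a 100% solution.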
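Since the OSD result is unreliable on sparse pages, it may help to read the reported confidence before acting on the angle. The sketch below parses an OSD text report and ignores low-confidence results. It assumes the report contains `Rotate:` and `Orientation confidence:` lines; the helper names, the cutoff of 2.0, and the sample report are all illustrative choices, not values from the question.

```python
import re

def parse_osd(osd_text):
    """Extract (rotation, confidence) from an OSD text report, or None."""
    rotate = re.search(r"Rotate:\s*(\d+)", osd_text)
    conf = re.search(r"Orientation confidence:\s*([\d.]+)", osd_text)
    if rotate is None or conf is None:
        return None
    return int(rotate.group(1)), float(conf.group(1))

def trusted_rotation(osd_text, min_confidence=2.0):
    """Return the rotation only if its confidence reaches the cutoff.
    The cutoff is an arbitrary assumption and should be tuned on real data."""
    parsed = parse_osd(osd_text)
    if parsed is None:
        return None
    angle, confidence = parsed
    return angle if confidence >= min_confidence else None

# Illustrative report; the numbers are invented for this example.
sample = """Page number: 0
Orientation in degrees: 180
Rotate: 180
Orientation confidence: 9.31
Script: Latin
Script confidence: 3.50
"""
print(trusted_rotation(sample))
```

In practice the text would come from `pytesseract.image_to_osd(...)` as in the snippet above, and pages for which the function returns `None` could be routed to one of the fallback methods in the answers.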

Answer 1

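Python3/OpenCV4 script to align scanned documents.

Rotate the document and sum the rows. When the document is rotated by 0 or 180 degrees, there will be a lot of black pixels in the image:

Use a score-keeping method. Score each image on how closely it resembles a zebra pattern. The image with the best score has the correct rotation. The image you linked to was off by 0.5 degrees. I omitted some functions for readability; the full code can be found here.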

# Rotate the image around in a circle
scores = []
angle = 0
while angle <= 360:
    # Rotate the source image
    img = rotate(src, angle)    
    # Crop the center 1/3rd of the image (roi is filled with text)
    h,w = img.shape
    buffer = min(h, w) - int(min(h,w)/1.15)
    roi = img[int(h/2-buffer):int(h/2+buffer), int(w/2-buffer):int(w/2+buffer)]
    # Create background to draw transform on
    bg = np.zeros((buffer*2, buffer*2), np.uint8)
    # Compute the sums of the rows
    row_sums = sum_rows(roi)
    # Low score (few non-empty rows) --> Zebra stripes
    score = np.count_nonzero(row_sums)
    scores.append(score)
    # Image has best rotation
    if score <= min(scores):
        # Save the rotated image
        print('found optimal rotation')
        best_rotation = img.copy()
    k = display_data(roi, row_sums, buffer)
    if k == 27: break
    # Increment angle and try again
    angle += .75
cv2.destroyAllWindows()
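The helper functions are omitted above, so the following is only a guess at the idea and not the answer's actual code. It sketches a hypothetical `sum_rows` and a `zebra_score` on a synthetic binary image (white text on black), where fewer non-empty rows means cleaner stripes.

```python
import numpy as np

def sum_rows(roi):
    """Hypothetical stand-in: total intensity of each pixel row."""
    return roi.sum(axis=1, dtype=np.int64)

def zebra_score(roi):
    """Number of rows containing any text pixel; lower means cleaner stripes."""
    return int(np.count_nonzero(sum_rows(roi)))

# Synthetic ROI: white (255) text lines, 4 px tall, every 10 px, on black.
roi = np.zeros((100, 100), np.uint8)
for top in range(5, 100, 10):
    roi[top:top + 4, 10:90] = 255

aligned = zebra_score(roi)     # text lines run horizontally
sideways = zebra_score(roi.T)  # transposed, so the lines run vertically
print(aligned, sideways)       # 40 80
```

With horizontal lines, the gaps between lines produce empty rows and a low count; once the lines run vertically almost every row hits some text. Note that this score is identical at 0 and 180 degrees, which is why a separate upside-down check is still needed.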

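How do you tell whether the document is upside down? Fill in the area from the top of the document to the first non-black pixel in the image. Measure that area (shown in yellow). The image with the smallest area is the one that is right side up: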

# Find the area from the top of page to top of image
_, bg = area_to_top_of_text(best_rotation.copy())
right_side_up = sum(sum(bg))
# Flip image and try again
best_rotation_flipped = rotate(best_rotation, 180)
_, bg = area_to_top_of_text(best_rotation_flipped.copy())
upside_down = sum(sum(bg))
# Check which area is larger
if right_side_up < upside_down: aligned_image = best_rotation
else: aligned_image = best_rotation_flipped
# Save aligned image
cv2.imwrite('/home/stephen/Desktop/best_rotation.png', 255-aligned_image)
cv2.destroyAllWindows()
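`area_to_top_of_text` is not shown in the answer. One possible reading of the description, offered purely as an assumption, is to count for each column the black pixels above the first text pixel and add up the counts. The function name `area_above_text` and the synthetic page are hypothetical.

```python
import numpy as np

def area_above_text(binary):
    """Sum, over columns, of the rows above the first nonzero pixel.
    Columns without any text contribute the full image height."""
    has_text = binary.any(axis=0)
    first_row = np.argmax(binary > 0, axis=0)  # 0 for empty columns, fixed below
    first_row = np.where(has_text, first_row, binary.shape[0])
    return int(first_row.sum())

# Synthetic page: a full-width line near the top, a short last line near the bottom.
page = np.zeros((50, 40), np.uint8)
page[5:8, :] = 255       # full-width first line
page[40:43, :10] = 255   # short last line of a paragraph

upright = area_above_text(page)
flipped = area_above_text(np.rot90(page, 2))
print(upright, flipped)  # 200 1330
```

The measure presumably works because paragraphs tend to end with a short line: when the page is flipped, that short line ends up on top and leaves a large empty area above the rest of the text. Pages whose first and last lines are equally long would give no usable difference.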

Answer 2

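Assuming you have already run the angle correction on the image, you can try the following to find out whether it is flipped:

1. Project the corrected image onto the y-axis, so that you get a "peak" for each line. Important: there are actually almost always two sub-peaks!
2. Smooth this projection by convolving it with a Gaussian in order to eliminate fine structure, noise, etc.
3. For each peak, check whether the stronger sub-peak is at the top or at the bottom.
4. Calculate the fraction of peaks that have the sub-peak on the bottom side. This is your scalar value that tells you how confident you can be that the image is oriented correctly.

The peak finding in step 3 is done by finding sections with above-average values. The sub-peaks are then found via argmax.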

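Here is a figure to illustrate the approach, showing a few lines of your example image:

Blue: original projection
Orange: smoothed projection
Horizontal line: average of the smoothed projection over the whole image.

Here is some code that does this: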

import cv2
import numpy as np

# load image, convert to grayscale, threshold it at 127 and invert.
page = cv2.imread('Page.jpg')
page = cv2.cvtColor(page, cv2.COLOR_BGR2GRAY)
page = cv2.threshold(page, 127, 255, cv2.THRESH_BINARY_INV)[1]

# project the page to the side and smooth it with a gaussian
projection = np.sum(page, 1)
gaussian_filter = np.exp(-(np.arange(-3, 3, 0.1)**2))
gaussian_filter /= np.sum(gaussian_filter)
smooth = np.convolve(projection, gaussian_filter)

# find the pixel values where we expect lines to start and end
mask = smooth > np.average(smooth)
edges = np.convolve(mask, [1, -1])
line_starts = np.where(edges == 1)[0]
line_endings = np.where(edges == -1)[0]

# count lines with peaks on the lower side
lower_peaks = 0
for start, end in zip(line_starts, line_endings):
    line = smooth[start:end]
    if np.argmax(line) < len(line)/2:
        lower_peaks += 1

print(lower_peaks / len(line_starts))

This prints 0.125 for the given image, so it is not oriented correctly and must be flipped.
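Acting on that score could look like the sketch below. The 0.5 threshold and the helper name `orient` are assumptions for illustration; the answer only states that 0.125 indicates a flipped page.

```python
import numpy as np

def orient(page, lower_peak_fraction, threshold=0.5):
    """Rotate the page by 180 degrees when the score falls below the threshold.
    The threshold is an assumed midpoint, not a value from the answer."""
    if lower_peak_fraction < threshold:
        return np.rot90(page, 2)
    return page

# Tiny stand-in array so the effect of the rotation is easy to see.
page = np.arange(6).reshape(2, 3)
fixed = orient(page, 0.125)
print(fixed)
```

Scores close to the threshold are ambiguous, so it may be safer to leave such pages untouched or to check them with a second method.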

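Note that this approach might break badly if the image contains pictures or anything else that is not organized in lines (math, for example). Another problem is having too few lines, which results in poor statistics.

Different fonts might also result in different distributions. You can try this on a few images and see whether the approach works. I don't have enough data.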

Answer 3

You can use the Alyn module. To install it:

pip install alyn

Then use it to deskew images (taken from the project's home page):

from alyn import Deskew
d = Deskew(
    input_file='path_to_file',
    display_image='preview the image on screen',
    output_file='path_for_deskewed image',
    r_angle='offset_angle_in_degrees_to_control_orientation')
d.run()

Note that Alyn is only meant for deskewing text.

Answer 4

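If there are faces in the image, it is easy to detect. I created the code below to detect whether a face is upright. In the upside-down case, we won't get any face encodings.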

# first install face_recognition
# pip install --upgrade face_recognition
def is_image_upside_down(img):
    import face_recognition
    face_locations = face_recognition.face_locations(img)
    encodings = face_recognition.face_encodings(img, face_locations)
    image_is_upside_down = (len(encodings) == 0)
    return image_is_upside_down

import cv2
filename = 'path_to_filename'
# Load file and convert from OpenCV's BGR order to the RGB order face_recognition expects
img = cv2.cvtColor(cv2.imread(filename), cv2.COLOR_BGR2RGB)
if is_image_upside_down(img):
    print("rotate to 180 degree")
else:
    print("image is straight")
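One limitation of the check above is that an image with no faces at all is also reported as upside down. A possible refinement, sketched here under assumptions, is to run the detector on both orientations and compare. The function `is_upside_down` and its `count_faces` parameter are hypothetical, and the toy detector exists only so that the example runs without a face model.

```python
import numpy as np

def is_upside_down(img, count_faces):
    """Compare face counts for the image as given and rotated by 180 degrees.
    Returns True or False, or None when neither orientation yields a face."""
    upright = count_faces(img)
    flipped = count_faces(np.rot90(img, 2))
    if upright == 0 and flipped == 0:
        return None
    return flipped > upright

# Toy detector for demonstration only: "sees a face" if the top-left pixel is bright.
def fake_detector(img):
    return int(img[0, 0] > 0)

img = np.zeros((4, 4), np.uint8)
img[3, 3] = 255  # the "face" only becomes visible after a 180-degree rotation
print(is_upside_down(img, fake_detector))  # True
```

With the library from the answer, `count_faces` could be something like `lambda im: len(face_recognition.face_locations(np.ascontiguousarray(im)))`; the `np.ascontiguousarray` call is a precaution, since `np.rot90` returns a non-contiguous view.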
