I am trying to create a pandas DataFrame by reading data from the image below, but I cannot get the data to read correctly.
Here is my code:
import cv2
import pytesseract
import numpy as np
img = cv2.imread('image.png')  # the table image
# convert to a grayscale image (cv2.imread returns channels in BGR order)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# apply Otsu thresholding (the first return value is the computed threshold, not an image)
_, img_bin = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
# invert the binary image so that black and white are swapped
gray = cv2.bitwise_not(img_bin)
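# erode, then dilate, to remove noise (with a 1x1 kernel both steps leave the image unchanged)
kernel = np.ones((1, 1), np.uint8)
img = cv2.erode(gray, kernel, iterations=1)
img = cv2.dilate(img, kernel, iterations=1)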
out_below = pytesseract.image_to_string(img)
out_below
'foe jo 0 14\n° ey)\nee ec ac)\n2 entrepreneur 205 128\n3 housemaid 165 109\nTeme ED\nB Cy\nCe eG\nee)\n8 student 91 269\nCee\nCC ed\nee\n\x0c'
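The few rows that do come through cleanly in the output above look like `index label number number`. As a minimal sketch, assuming that row shape holds for the whole table, the readable rows could be pulled out of the OCR text as shown here. The names `parse_rows`, `ROW_RE`, `value_1`, and `value_2` are hypothetical, since the real column headers are not recoverable from the output.

```python
import re

# Hypothetical row pattern, inferred only from the clean OCR rows:
# "<integer index> <text label> <integer> <integer>"
ROW_RE = re.compile(r"^(\d+)\s+([A-Za-z][A-Za-z.\-]*)\s+(\d+)\s+(\d+)$")


def parse_rows(ocr_text):
    """Return a list of dicts for the lines that match ROW_RE; skip the rest."""
    rows = []
    for line in ocr_text.splitlines():
        match = ROW_RE.match(line.strip())
        if match:
            index, label, first, second = match.groups()
            rows.append({
                "index": int(index),
                "label": label,
                "value_1": int(first),   # placeholder column name
                "value_2": int(second),  # placeholder column name
            })
    return rows


# The OCR output copied from the post.
sample = ('foe jo 0 14\n° ey)\nee ec ac)\n2 entrepreneur 205 128\n'
          '3 housemaid 165 109\nTeme ED\nB Cy\nCe eG\nee)\n'
          '8 student 91 269\nCee\nCC ed\nee\n\x0c')
rows = parse_rows(sample)
```

The resulting list of dicts can be passed to `pandas.DataFrame(rows)`. This only recovers the rows that Tesseract already read correctly; the garbled rows would still need better preprocessing before OCR.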
Any help with this would be appreciated.