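Yesterday I needed to test OCR text recognition, that is, extracting text from supplied images or scanned PDFs. My first thought was to use open-source Python libraries (pytesseract, OCRopus, OpenCV and the like). However, given how our company's "digital employee" scenarios will use it, image recognition will need a fairly high recognition rate in the future, so a third-party OCR API seemed the safer choice.
I had a quick look at Baidu's AI Open Platform. Its text recognition service gives individual users a monthly call quota, so I signed up and tried it out. Overall, recognition felt quite efficient to me.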

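Test steps

Log in to the Baidu AI Open Platform and enable the text recognition service.
Create an application in the console to obtain the AppID, API Key, and Secret Key.
Follow Baidu's technical documentation to integrate and test the API.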
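
Once the console has issued the three values, one way to keep them out of the source file is to read them from environment variables. The sketch below is only an illustration: the variable names `BAIDU_OCR_APP_ID`, `BAIDU_OCR_API_KEY`, `BAIDU_OCR_SECRET_KEY` and the helper `load_credentials` are my own assumptions, not something the platform defines.

```python
import os

# Hypothetical variable names; pick whatever fits your environment.
ENV_NAMES = ("BAIDU_OCR_APP_ID", "BAIDU_OCR_API_KEY", "BAIDU_OCR_SECRET_KEY")


def load_credentials(env=None):
    """Return (app_id, api_key, secret_key) read from environment variables."""
    env = os.environ if env is None else env
    # Collect every missing or empty variable so the error names them all at once.
    missing = [name for name in ENV_NAMES if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in ENV_NAMES)
```

The returned tuple can be unpacked straight into the client constructor, for example `AipOcr(*load_credentials())`.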

Source code

# AipOcr ships with the baidu-aip package; pdf2image needs a local Poppler install
from aip import AipOcr
import os
from pdf2image import convert_from_path

# Create the Baidu AI OCR client
def apiMessage(app_id,app_key,secret_key):
    return AipOcr(app_id, app_key, secret_key)

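# Recognize the text in an image
def imgOcr(client,filename):
    with open(filename, 'rb') as fp:  # must be opened in binary read mode
        image = fp.read()
    dic_result = client.basicGeneral(image)
    # A failed call returns error details instead of 'words_result'
    if 'words_result' not in dic_result:
        raise RuntimeError('OCR failed for {}: {}'.format(filename, dic_result))
    return dic_result['words_result']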

# Save the recognized text
def imgContent(res,resultpath):
    # Append to result.txt, one recognized line of text per row
    with open(os.path.join(resultpath,'result.txt'),'a',encoding='utf-8') as f:
        for i in res:
            f.write(i['words'] + '\n')

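# Convert a PDF into one PNG image per page
def pdfToimg(pdfFile,outputPath):
    # poppler_path must point to the bin directory of a local Poppler install
    images = convert_from_path(pdfFile,poppler_path=r'C:\Users\admin\Documents\workspace\otherapi\poppler-0.68.0\bin')
    for i, img in enumerate(images):
        img.save(os.path.join(outputPath,f'page_{i+1}.png'),'PNG')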

if __name__ == '__main__':
    # The APP_ID, API_KEY and SECRET_KEY values are available once the application has been created
    APP_ID = 'xx'
    API_KEY = 'xx'
    SECRET_KEY = 'x'
    
    # # Image processing
    # filename = r'C:\Users\a.jpg'
    # resultpath = r'C:\Users'
    # client = apiMessage(APP_ID,API_KEY,SECRET_KEY)
    # content = imgOcr(client,filename)
    # imgContent(content,resultpath)

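    # PDF processing
    pdfFile = r'C:\Users\A.pdf'
    outputPath = r'C:\Users'
    # pdfToimg(pdfFile,outputPath)  # uncomment to generate the page images first
    client = apiMessage(APP_ID,API_KEY,SECRET_KEY)  # one client is enough for all pages
    for i in sorted(os.listdir(outputPath)):
        # Only process PNG images; the folder may also hold the PDF and result.txt
        if not i.lower().endswith('.png'):
            continue
        filenamei = os.path.join(outputPath,i)
        content = imgOcr(client,filenamei)
        imgContent(content,outputPath)
        print('Finished processing {}'.format(i))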
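
One thing to watch in the PDF loop: `os.listdir` returns names in arbitrary order, and plain alphabetical sorting puts `page_10.png` before `page_2.png`, so the text in result.txt can end up out of page order for longer documents. A minimal sketch of one way to handle this, assuming the `page_N.png` naming produced by `pdfToimg` above (the helper names `page_number` and `ordered_pages` are hypothetical):

```python
import os
import re


def page_number(name):
    """Return the page number from names like 'page_12.png', or None otherwise."""
    match = re.fullmatch(r"page_(\d+)\.png", name, flags=re.IGNORECASE)
    return int(match.group(1)) if match else None


def ordered_pages(folder):
    """List the page images in a folder, sorted by numeric page index."""
    # Skip anything that is not a page image (the source PDF, result.txt, ...).
    names = [n for n in os.listdir(folder) if page_number(n) is not None]
    return [os.path.join(folder, n) for n in sorted(names, key=page_number)]
```

The main loop could then iterate over `ordered_pages(outputPath)` and pass each full path to `imgOcr`.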