官网
PaddleOCR - 文档解析与智能文字识别 | 支持API调用与MCP服务 - 飞桨星河社区
api调用:https://ai.baidu.com/ai-doc/AISTUDIO/2mh4okm66
官方的学习文档:
直接选择用vllm推理加速方案
python -m pip install paddlepaddle-gpu==3.2.0 -i <https://www.paddlepaddle.org.cn/packages/stable/cu126/>
python -m pip install -U "paddleocr[doc-parser]"
python -m pip install <https://paddle-whl.bj.bcebos.com/nightly/cu126/safetensors/safetensors-0.6.2.dev0-cp38-abi3-linux_x86_64.whl>
使用docker镜像
docker run \\
-it \\
--rm \\
--gpus all \\
--network host \\
ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddlex-genai-vllm-server \\
paddlex_genai_server --model_name PaddleOCR-VL-0.9B --host 0.0.0.0 --port 8118 --backend vllm
解析pdf文档
from pathlib import Path
from paddleocr import PaddleOCRVL
input_file = "./your_pdf_file.pdf"
output_path = Path("./output")
pipeline = PaddleOCRVL(vl_rec_backend="vllm-server", vl_rec_server_url="<http://127.0.0.1:8118/v1>")
output = pipeline.predict(input=input_file)
markdown_list = []
markdown_images = []
for res in output:
md_info = res.markdown
markdown_list.append(md_info)
markdown_images.append(md_info.get("markdown_images", {}))
markdown_texts = pipeline.concatenate_markdown_pages(markdown_list)
mkd_file_path = output_path / f"{Path(input_file).stem}.md"
mkd_file_path.parent.mkdir(parents=True, exist_ok=True)
with open(mkd_file_path, "w", encoding="utf-8") as f:
f.write(markdown_texts)
for item in markdown_images:
if item:
for path, image in item.items():
file_path = output_path / path
file_path.parent.mkdir(parents=True, exist_ok=True)
image.save(file_path)