
An ASP.NET Core Application for AI Invoice Recognition Based on PaddleOCR

Published: 2022-02-15 14:22    Source: TechWeb    Editor: 笑笑    Views: 12327

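Overview

Users upload the photos to be recognized in batches. Once an upload succeeds, the system starts a Hangfire background job that calls the PaddleOCR service and receives the recognition result. The process is somewhat similar to a microservice architecture.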

PaddleOCR

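PaddleOCR is an open-source project from Baidu's AI team, and probably the best-performing of the free open-source OCR projects currently available; see the PaddleOCR project for details. If you have no Python development experience, you may run into some problems when deploying the environment, but solutions to almost all of them can be found.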

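Users upload the files to be recognized in batches. Because my virtual machine is very slow, the files can only be uploaded first and then recognized automatically by the system in the background.

When recognition finishes, the system automatically notifies the user and updates the status, and the user can preview the recognition result.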

PaddleOCR has been tested and runs on glibc 2.23. You can also test other glibc versions, or install glibc 2.23.

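PaddleOCR working environment

PaddlePaddle 2.0.0
Python 3.7
glibc 2.23
cuDNN 7.6+

Running PaddleOCR with the Docker image provided by the PaddleOCR project is recommended; for Docker and nvidia-docker usage, refer to the linked documentation.

If you want to run the prediction code directly on macOS or Windows, you can start from step 2.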

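Parsing the invoice information still relies on a rather crude method: regular expressions match the required fields, such as the invoice amount, the issue date, and the invoice number. Because PaddleOCR is free, it does not offer the smarter field matching that paid services provide. I think that with enough data it should also be possible to train a model for smarter recognition, so I kept a Label field. The idea is to have people assign each text block to its field manually first, and then train on those labels together with the coordinate data.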

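// Handle the PaddleOCR response and extract the invoice fields with regular expressions.
// Conditions and loop headers written as /* ... */ are not preserved in the source text.
if (/* ... */)
{
    var result = response.Content.ReadAsStringAsync().Result;
    var ocr_result = JsonSerializer.Deserialize<ocr_result>(result);
    var ocr_status = "";
    invoice.Status = "Done";
    invoice.Result = ocr_result.status;
    if (/* ... */)
    {
        foreach (/* ... */)
        {
            foreach (/* ... */)
            {
                // Store every recognized text block with its confidence and coordinates.
                var rawdata = new InvoiceRawData
                {
                    Confidence = item.confidence,
                    InvoiceId = id,
                    Text = item.text,
                    // The argument is lost in the source; item.text_region is assumed.
                    Text_Region = JsonSerializer.Serialize(item.text_region)
                };
                if (/* ... */)
                {
                    // Invoice number: the digits at the end of the text.
                    var regex = new Regex(@"\d+$");
                    var mc = regex.Match(item.text);
                    if (mc.Success) invoice.InvoiceNo = mc.Value;
                }
                if (item.text.Contains("开票日期"))
                {
                    // Issue date, for example 2021年03月15日.
                    var regex = new Regex(@"\d{4}年\d{2}月\d{2}日");
                    var mc = regex.Match(item.text);
                    if (mc.Success)
                        invoice.InvoiceDate = Convert.ToDateTime(
                            mc.Value.Replace("年", "/").Replace("月", "/").Replace("日", ""));
                }
                if (item.text.Contains("%"))
                {
                    // Tax rate: the leading number, with an optional decimal part.
                    var regex = new Regex(@"^\d+(\.\d+)?");
                    var mc = regex.Match(item.text);
                    if (mc.Success) invoice.TaxRate = decimal.Parse(mc.Value);
                }
                if (item.text.Contains("￥"))
                {
                    // Amount: the first number in the text, with an optional decimal part.
                    var regex = new Regex(@"\d+(\.\d+)?");
                    var mc = regex.Match(item.text);
                    if (mc.Success) invoice.Amount = decimal.Parse(mc.Value);
                }
                _context.InvoiceRawDatas.Add(rawdata);
            }
        }
        ocr_status = ocr_result.status;
    }
    _context.SaveChangesAsync(default).Wait();
    // Notify connected clients through SignalR that the OCR task has finished.
    _hubContext.Clients.All.SendAsync(SignalR.OCRTaskCompleted, new { invoiceNo = invoice.InvoiceNo });
}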
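The field-matching rules can also be tried out in isolation. The following is a minimal sketch in JavaScript, not the application's own code: the function name `parseInvoiceFields`, the shape of the returned object, and the "发票号码" label check for the invoice number are all assumptions made for illustration (the source does not preserve the condition used for the invoice number). As in the approach described above, each rule looks only at a single OCR text line, and a later matching line overwrites an earlier one.

```javascript
// Hypothetical helper: extract invoice fields from an array of OCR text lines.
function parseInvoiceFields(lines) {
  const invoice = { invoiceNo: null, invoiceDate: null, taxRate: null, amount: null };
  for (const text of lines) {
    // Assumption: the invoice number line carries the label "发票号码".
    if (text.includes("发票号码")) {
      const m = text.match(/\d+$/); // digits at the end of the line
      if (m) invoice.invoiceNo = m[0];
    }
    if (text.includes("开票日期")) {
      const m = text.match(/(\d{4})年(\d{2})月(\d{2})日/);
      if (m) invoice.invoiceDate = `${m[1]}-${m[2]}-${m[3]}`; // ISO-style date string
    }
    if (text.includes("%")) {
      const m = text.match(/^\d+(\.\d+)?/); // leading number before the percent sign
      if (m) invoice.taxRate = Number(m[0]);
    }
    if (text.includes("￥")) {
      const m = text.match(/\d+(\.\d+)?/); // first number in the line
      if (m) invoice.amount = Number(m[0]);
    }
  }
  return invoice;
}

// Example input: made-up lines in the style of OCR output.
const sample = ["发票号码：12345678", "开票日期：2021年03月15日", "13%", "￥1060.00"];
console.log(parseInvoiceFields(sample));
```

Keeping the rules in one pure function like this makes them easy to check against real OCR output before they are wired into the background job.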

Drawing boxes on a Canvas to annotate the recognition results

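data.map((item, index) => {
    // Append one table row per recognized text block.
    // The row markup is lost in the source; a plain two-cell row is assumed.
    $('#rawdata_table > tbody').append(`<tr><td>${index + 1}</td><td>${item.Text}</td></tr>`);
    // Text_Region holds the four corner points of the text box.
    var points = JSON.parse(item.Text_Region);
    ctx.lineWidth = 5;
    ctx.strokeStyle = "#00ff00";
    ctx.textAlign = 'left';
    ctx.textBaseline = 'top';
    ctx.fillStyle = "#ff0000";
    ctx.font = "bold 13px verdana, sans-serif";
    // Draw the recognized text just above the top-left corner of the box.
    ctx.fillText(item.Text, points[0][0], points[0][1] - 15);
    // Trace the four corners and close the path to outline the box.
    ctx.beginPath();
    ctx.moveTo(points[0][0], points[0][1]);
    ctx.lineTo(points[1][0], points[1][1]);
    ctx.lineTo(points[2][0], points[2][1]);
    ctx.lineTo(points[3][0], points[3][1]);
    ctx.closePath();
    ctx.stroke();
});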

Simple and pretty cool, isn't it?

Finally

Give a Star!

If you like or are using this project please give it a star. Thanks!


