BallonsTranslator

Poor optimization

Open stalker3331111 opened this issue 2 years ago • 5 comments

Hello, I have a question: how can I optimize the program as much as possible? Right now it takes almost an hour to translate 15 pages of manga.

stalker3331111, May 25 '23 09:05

Do you have a GPU? If not, you could run this repo on Google Colab to get a free GPU (I haven't tried it, but it's probably possible).

tak2hu, May 25 '23 09:05

How much could a GPU improve the performance? On my computer it takes 30s per page. I am using an AMD 3600 CPU, but my GPU is also AMD, which means I cannot use CUDA.

aqssxlzc, May 25 '23 11:05

How much could GPU improving the performance?

It helps because a GPU accelerates the deep learning models used for text detection/segmentation, OCR, and inpainting (and translation too, if you use the Sugoi translator).

See how fast this website is: https://cotrans.touhou.ai/ uses the same deep learning models (this BallonsTranslator project is based on that manga-image-translator project).

In my computer it cost 30s per page, I am using AMD 3600 CPU

Same here, I'm using an Intel Core i5 7300U. It is what it is.

but my GPU is AMD that means I could not use CUDA.

Pretty sure there is a HIP option when selecting the device. See https://pytorch.org/docs/stable/notes/hip.html, but I haven't bothered with HIP because I don't have the hardware.
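For anyone who wants to check what their PyTorch install can actually use, here is a minimal sketch. It relies on the fact (from the PyTorch HIP notes linked above) that ROCm builds reuse the `torch.cuda` API and set `torch.version.hip`. The function name `describe_backend` is made up for this example and is not part of BallonsTranslator.

```python
import importlib.util


def describe_backend():
    """Return (device, backend) for the PyTorch install on this machine.

    Hypothetical helper for illustration; not part of BallonsTranslator.
    """
    if importlib.util.find_spec("torch") is None:
        return ("cpu", "torch not installed")

    import torch

    if torch.cuda.is_available():
        # ROCm builds expose AMD GPUs through torch.cuda as well;
        # torch.version.hip is only set on those builds.
        backend = "ROCm/HIP" if getattr(torch.version, "hip", None) else "CUDA"
        return ("cuda", backend)

    return ("cpu", "no supported GPU found")


device, backend = describe_backend()
print(f"device={device}, backend={backend}")
```

If this reports `cpu` on a machine with an AMD card, the installed PyTorch build most likely has no working ROCm support for that setup.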

run this repo using google colab

Google Colab is the only place I know of that offers a free GPU. Just see these scripts for how to make a script using this project

So how this app works (simplified):

  • it takes an image (jpeg, png, webp) as input
  • input image gets passed to TextDetector function
  • TextDetector outputs 2 things: image mask and TextBlock
  • Image mask is an input for Inpainter function
  • TextBlock stores text boxes, bounding boxes, and a whole plethora of other data
  • Inpainter function takes image mask and original image to output redrawn/cleaned image
  • OCR function takes TextBlock and original image to output TextBlock with text that was scanned
  • And when you edit text, it edits a TextBlock
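The data flow above can be sketched roughly like this. It is a toy illustration only: the function names, signatures, and `TextBlock` fields are assumptions made for the example, and the "models" are trivial stand-ins rather than the project's real detector, inpainter, or OCR.

```python
from dataclasses import dataclass
from typing import List, Tuple

Image = List[List[int]]  # toy grayscale image: rows of pixel values 0..255
Mask = List[List[int]]   # 1 where text was detected, 0 elsewhere


@dataclass
class TextBlock:
    # Hypothetical fields; the real TextBlock stores much more.
    bbox: Tuple[int, int, int, int]  # (x1, y1, x2, y2), x2/y2 exclusive
    text: str = ""
    translation: str = ""


def detect_text(image: Image) -> Tuple[Mask, List[TextBlock]]:
    """Stand-in detector: treats dark pixels (< 128) as text."""
    mask = [[1 if px < 128 else 0 for px in row] for row in image]
    coords = [(x, y) for y, row in enumerate(mask) for x, m in enumerate(row) if m]
    if not coords:
        return mask, []
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    block = TextBlock(bbox=(min(xs), min(ys), max(xs) + 1, max(ys) + 1))
    return mask, [block]


def inpaint(image: Image, mask: Mask) -> Image:
    """Stand-in inpainter: paints every masked pixel white."""
    return [
        [255 if m else px for px, m in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


def run_ocr(image: Image, blocks: List[TextBlock]) -> List[TextBlock]:
    """Stand-in OCR: fills each block with placeholder text."""
    for i, block in enumerate(blocks):
        block.text = f"<text {i}>"
    return blocks


page = [
    [255, 255, 255, 255],
    [255, 0, 0, 255],
    [255, 255, 255, 255],
]
mask, blocks = detect_text(page)    # detector outputs the mask and the TextBlocks
cleaned = inpaint(page, mask)       # inpainter uses the mask plus the original image
blocks = run_ocr(page, blocks)      # OCR fills in the text of each TextBlock
blocks[0].translation = "edited by the user"  # editing text edits a TextBlock
```

The point is just that the detector runs first and everything else consumes its two outputs, so each stage is a separate model that has to run for every page.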

Here is a picture to describe each model:

IMHO the current models are not optimized for CPU inference (except LamaMPE for small masks, such as when using the healing brush). A more optimized or lightweight model should be implemented, and a pull request would probably be welcomed by this project's maintainer.

tak2hu, May 25 '23 16:05

Thanks for the explanation. After a few experiments, I found that HIP does not work on my RX6700XT video card, because it needs ROCm, which may not support this device yet, and ROCm is not available on Windows.

aqssxlzc, May 25 '23 18:05

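Thanks for the explanation. After a few experiments, I found that HIP does not work on my RX6700XT video card, because it needs ROCm, which may not support this device yet, and ROCm is not available on Windows. ↑ ROCm can now be used on Windows.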

qianzhou123, Jul 17 '24 14:07