When the amount of data is large, exporting data will be very slow. Is there any solution?

Open LBZHK opened this issue 3 years ago • 6 comments

thank you!

LBZHK avatar Jul 14 '22 03:07 LBZHK

@LBZHK are you talking about timeouts? If yes, please try to use snapshots via SDK:

https://labelstud.io/sdk/project.html#label_studio_sdk.project.Project.export_snapshot_create

https://labelstud.io/guide/export.html#Export-snapshots-using-the-API
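For anyone who prefers plain HTTP over the SDK, the snapshot flow described in the guide above could be sketched roughly as follows. This is only a sketch under assumptions: the endpoint paths (`/api/projects/<id>/exports/` and its `/download?exportType=JSON` variant), the `Token` authorization header, and the `status` values `completed` and `failed` should be checked against the linked guide for your LS version. The helper names `api_call`, `wait_for_snapshot` and `export_snapshot` are made up for illustration.

```python
import json
import shutil
import time
import urllib.request


def api_call(base_url, token, path, method="GET", payload=None):
    """Send one authenticated request and decode the JSON reply."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(
        base_url.rstrip("/") + path,
        data=data,
        method=method,
        headers={"Authorization": f"Token {token}", "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_snapshot(fetch_status, poll_seconds=5.0, timeout_seconds=3600.0, sleep=time.sleep):
    """Poll fetch_status() until it returns 'completed'; raise on 'failed' or timeout."""
    waited = 0.0
    while True:
        status = fetch_status()
        if status == "completed":
            return waited
        if status == "failed":
            raise RuntimeError("snapshot export failed")
        if waited >= timeout_seconds:
            raise TimeoutError(f"snapshot not ready after {waited} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


def export_snapshot(base_url, token, project_id, out_path):
    """Create a snapshot, wait for it, then download it as JSON (assumed endpoints)."""
    exports = f"/api/projects/{project_id}/exports/"
    snapshot = api_call(base_url, token, exports, method="POST", payload={})
    snapshot_path = f"{exports}{snapshot['id']}"
    wait_for_snapshot(lambda: api_call(base_url, token, snapshot_path)["status"])
    download = urllib.request.Request(
        f"{base_url.rstrip('/')}{snapshot_path}/download?exportType=JSON",
        headers={"Authorization": f"Token {token}"},
    )
    with urllib.request.urlopen(download) as response, open(out_path, "wb") as f:
        shutil.copyfileobj(response, f)
```

The point of the snapshot approach is that the export is prepared in the background and fetched later, so a large project does not have to finish within a single request timeout.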

makseq avatar Jul 14 '22 18:07 makseq

Is there an updated Docker version that supports that?

yangboz avatar Jul 21 '22 10:07 yangboz

@yangboz yes, we have supported export snapshots for a long time.

makseq avatar Jul 26 '22 00:07 makseq

For Full HD images, the export takes about 5 minutes for 300 of them (3-4 bounding boxes each). That's a LOT of time.

ntakouris avatar Aug 04 '22 13:08 ntakouris

@ntakouris

As a workaround, you can do the following:

  1. Go to the console where you run Label Studio (LS) and run:

     label-studio shell

     This starts the LS shell (Django shell_plus), where you can execute Django ORM commands.

  2. In shell_plus (replace <ID> with your project ID):

     from tasks.serializers import TaskWithAnnotationsSerializer

     tasks = Task.objects.filter(project=<ID>)
     export_json_data = TaskWithAnnotationsSerializer(tasks, many=True).data

  3. Optionally, write the result to a file:

     import json
     with open('export.json', 'w') as f:
         json.dump(export_json_data, f)
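If the project is large enough that holding the whole serialized list in memory is a concern, one possible variant is to write the tasks out one at a time. The helper below is a minimal sketch, not part of Label Studio: `write_json_array` is a hypothetical name, and it only assumes that each item is JSON-serializable.

```python
import json


def write_json_array(items, path):
    """Stream an iterable of JSON-serializable items into one JSON array file.

    Items are written one by one, so the full list never has to sit in memory.
    Returns the number of items written.
    """
    written = 0
    with open(path, "w", encoding="utf-8") as f:
        f.write("[")
        for item in items:
            if written:
                f.write(",\n")  # separator between array elements
            json.dump(item, f, ensure_ascii=False)
            written += 1
        f.write("]")
    return written
```

In the LS shell this could presumably be fed with a generator such as `(TaskWithAnnotationsSerializer(t).data for t in tasks.iterator())`, which serializes each task individually. This mainly bounds memory use; whether it is also faster than `many=True` would need to be measured on your data.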

makseq avatar Aug 08 '22 23:08 makseq

Roger that. I'll try it over here. Thank you

LBZHK avatar Aug 09 '22 00:08 LBZHK

You can also use this console command: https://labelstud.io/guide/export.html#Export-using-console-command

makseq avatar Jan 31 '23 13:01 makseq