Vincenzo Lavorini
Hi! I have trained a model which uses 7 labels: `set(clusterer.labels_)  # result: {-1, 0, 1, 2, 3, 4, 5, 6, 7}` but if I use the model with unseen...
Hi, I'm using the pipeline executor locally, and I would appreciate the possibility of executing only a subset of a pipeline's nodes. Sometimes it happens that you build a...
Hello, I tried the code on a text (680 kB, words with abbreviations and non-text characters), and for some of the pairs I get that in the glove thread, in glove.c,...
From a [discussion on Slack](https://datahubspace.slack.com/archives/CUMUWQU66/p1643927238620069): Maybe it is better to switch to GreatExpectations for profiling files in Data Lake ingestion as well.
Hi folks, I tried to run it, but the app exits with a 'killed' message. `dmesg` states that this is because of an Out Of Memory error. Can I run this app...
Hi there, what I suggest is a simple 'turn on' button, whose function is to reuse the last configuration (e.g. cooling function, automatic fan spin, etc.) and just turn on...
I am trying to parse [this PDF](https://ec.europa.eu/eurostat/documents/2995521/11081093/3-10072020-AP-EN.pdf/d2f799bf-4412-05cc-a357-7b49b93615f1) using PaddleOCR 2.7.3. I tried converting the pages to images and then running PPStructure on them. I tried with the following options: engine...
Hello, I tried to force the installation of the 2.7.5 version, but while importing it I get an error: ```python --------------------------------------------------------------------------- NameError Traceback (most recent call last) Cell In[6], line...
Maybe related to [this](https://github.com/Unstructured-IO/unstructured/issues/3076). When used with a binary file object, an error is thrown. Example: with open("./that.pdf", 'rb') as f: elements = partition_pdf( file=f, strategy='hi_res', is_image=False,...
Hello, I see that deserializing a serialized NumPy array gives back a list. Is there a way to get a NumPy array back, apart from calling an explicit conversion afterwards?...