Pimeyes.com and lenso.ai
I've previously raised the topic of these two websites, and I'd like to do so again: pimeyes.com and lenso.ai.
1. pimeyes.com
Pros:
- The paid version can be bypassed quite easily; you simply need to convert the hex from the obtained URL to ASCII, and we get the necessary data.
Cons:
- The number of requests from a single IP address is still limited, but this is bypassed by using personal and/or public proxies. At least I haven't found a way to bypass this without using proxies.
- Searching by URL is not possible, or at least I haven't found a way to do it. Only searching by file works.
2. lenso.ai
Pros:
- For some reason, they no longer encrypt the data obtained during the request as they did before (it is only obscured visually), so we can parse it.
- Even if it says "you have used free searches," it still searches and still gives results. I'm not entirely sure about this, though, as I haven't conducted in-depth tests.
- It was previously discussed that there was no search by URL, but the method I mentioned in the previous issue still works. That is, it can search by both URL and file.
Cons:
- MAYBE the same as with pimeyes.com. The number of requests from a single IP address is still limited, but this is bypassed by using personal and/or public proxies. But this is highly doubtful, as it seems to return results even if the limit is exhausted.
- The site is young and might frequently change its internal API, so the method might stop working altogether or become very limited.
I'll leave it up to you to decide whether or not to proceed with adding these engines.
P.S. If you're interested, when converting from hex to ASCII in lenso.ai, we get a link to lenso.ai's server IP address. I haven't figured out how to use this to our advantage, but it's amusing.
Pimeyes has already encrypted it...
Yes, you are right. Honestly, I'm very surprised, because I saw this method in another project and it worked for quite a while. I highly doubt it's due to the issue I created, but it is what it is.
I think they are still returning data in hex, just with an offset or something similar. It seems to me this can still be exploited, but I'm not sure this fix will last long.
Anyway, it's a pity, just yesterday everything was working...
Yes, they encrypted it, but after researching and analyzing it, I believe it's just another encoding in hexadecimal. For example, d8ab19d5c7a0f27c10fa57540506ac68 is equivalent to 7b2275726c223a2268747470733a2f2f, which decodes to {"url":"https://
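For reference, the plain hex-to-ASCII step mentioned above can be sketched in a few lines of Python. This only covers the unencrypted form (the 7b22... string); the helper name `hex_to_ascii` is made up for illustration and is not part of any library discussed here.

```python
def hex_to_ascii(hex_string: str) -> str:
    """Decode a hex string to text; bytes outside ASCII become U+FFFD."""
    return bytes.fromhex(hex_string).decode("ascii", errors="replace")


# The unencrypted sample quoted in this thread.
decoded = hex_to_ascii("7b2275726c223a2268747470733a2f2f")
print(decoded)  # {"url":"https://
```

The 32 hex digits are 16 bytes, which is why the decoded text stops right after the URL scheme.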
I support adding lenso.ai to the library, as it currently has fewer access restrictions.
Regarding pimeyes.com, since they have started encrypting data and may have implemented offsets or other anti-bypass measures, I believe it is better to continue monitoring the situation and assess whether new methods emerge to bypass the current restrictions.
Hey, I was wondering if you know the exact encoding used to convert d8ab19d5c7a0f27c10fa57540506ac68 to 7b2275726c223a2268747470733a2f2f?
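One quick way to test the "hex with an offset" idea on that pair is to compare the two byte strings position by position. The sketch below is only a sanity check under the assumption that the two strings really are the same 16 bytes before and after; `compare_bytes` is a hypothetical helper name.

```python
CIPHER_HEX = "d8ab19d5c7a0f27c10fa57540506ac68"
PLAIN_HEX = "7b2275726c223a2268747470733a2f2f"


def compare_bytes(cipher_hex: str, plain_hex: str) -> tuple[list[int], list[int]]:
    """Return the per-byte XOR and the per-byte difference modulo 256."""
    cipher = bytes.fromhex(cipher_hex)
    plain = bytes.fromhex(plain_hex)
    if len(cipher) != len(plain):
        raise ValueError("inputs must decode to the same number of bytes")
    xor_stream = [c ^ p for c, p in zip(cipher, plain)]
    diff_stream = [(c - p) % 256 for c, p in zip(cipher, plain)]
    return xor_stream, diff_stream


xor_stream, diff_stream = compare_bytes(CIPHER_HEX, PLAIN_HEX)
# A single repeated value would indicate a one-byte XOR key or a fixed offset.
print("constant XOR key:", len(set(xor_stream)) == 1)
print("constant additive offset:", len(set(diff_stream)) == 1)
print("xor:", bytes(xor_stream).hex())
```

For this pair neither stream is constant (the first two XOR values already differ), so a single-byte XOR key or a fixed additive offset can be ruled out. That says nothing about what the scheme actually is.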
Someone shared the data they recorded before and after encryption: https://github.com/addycb/Pimeyes-Free-POC/issues/7#issuecomment-2589236731
I think the encryption process is far more complex than previously imagined.
It looks like lenso.ai has updated its algorithm, and uploads now always fail for some reason:
  File "/Users/dev/developing/PicImageSearch/PicImageSearch/engines/lenso.py", line 125, in search
    result_hash = await self._upload_image(image_base64)
                        │    │             └ '/9j/4AAQSkZJRgABAQAAAQABAAD/4gIcSUNDX1BST0ZJTEUAAQEAAAIMbGNtcwIQAABtbnRyUkdCIFhZWiAH3AABABkAAwApADlhY3NwQVBQTAAAAAAAAAAAAAAA...
                        │    └ <function Lenso._upload_image at 0x10536efc0>
                        └ <PicImageSearch.engines.lenso.Lenso object at 0x105381eb0>
  File "/Users/dev/developing/PicImageSearch/PicImageSearch/engines/lenso.py", line 72, in _upload_image
    resp_json = json_loads(resp.text)
                │          │    └ _tuplegetter(0, 'Alias for field number 0')
                │          └ RESP(text='', url='https://lenso.ai/api/upload', status_code=403)
                └ <function loads at 0x104063420>
  File "/Users/dev/.asdf/installs/python/3.12.1/lib/python3.12/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           │                │      └ ''
           │                └ <function JSONDecoder.decode at 0x104062de0>
           └ <json.decoder.JSONDecoder object at 0x1031c3350>
  File "/Users/dev/.asdf/installs/python/3.12.1/lib/python3.12/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               │    │          │      │  └ ''
               │    │          │      └ <built-in method match of re.Pattern object at 0x1040276b0>
               │    │          └ ''
               │    └ <function JSONDecoder.raw_decode at 0x104062e80>
               └ <json.decoder.JSONDecoder object at 0x1031c3350>
  File "/Users/dev/.asdf/installs/python/3.12.1/lib/python3.12/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
          │                                  └ ''
          └ <class 'json.decoder.JSONDecodeError'>
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
It seems to be getting nasty now.
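The traceback shows the real cause only indirectly: the server answered with HTTP 403 and an empty body, and the JSON decoder then failed on the empty string. A minimal sketch of checking the status before decoding is below. `UploadRejected` and `parse_json_response` are hypothetical names for illustration, not PicImageSearch's actual API, and treating anything other than 200 as a failure is an assumption.

```python
import json
from typing import Any


class UploadRejected(Exception):
    """Raised when the server does not return a usable JSON body."""


def parse_json_response(status_code: int, text: str) -> Any:
    """Decode a JSON body, reporting HTTP-level failures explicitly."""
    if status_code != 200 or not text.strip():
        raise UploadRejected(
            f"upload failed: HTTP {status_code}, body length {len(text)}"
        )
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise UploadRejected(f"non-JSON body from server: {exc}") from exc


# Reproduce the situation from the traceback: status 403, empty body.
try:
    parse_json_response(403, "")
except UploadRejected as exc:
    message = str(exc)
    print(message)
```

With a check like this, the error message would name the 403 directly instead of surfacing as "Expecting value: line 1 column 1 (char 0)".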
@eminarcissus 6e2cbe4 That’s why we mark it as deprecated.
I see. I tested lenso a month or two ago and it worked like a charm. I need to see whether it can be circumvented or not, lol.
Circumvention probably won't do any good regardless. As of a few days ago it seems like both imageUrl and sourceUrl, while still given in the response, are pretty close to useless now. It looks like everything except the protocol and domain name in those parameters is completely randomised. For example '/media/jbgonlsb/image-1.jpg' will now report something like '/cdg5F/7zhLEYlx/ljtDggf2woh'.
I was hoping it was just rot13 or something equally naive, but that does not seem to be the case. Resubmitting the same network request over and over will return randomised results for the same URL. There is no key or anything to reverse it. I've done a statistical analysis on over 1000 samples of the randomised 'cipher' -> known plaintext and it looks like it's not doing any reversible mutations on the plaintext at all, just a substitution for each character with a randomly chosen replacement in the range 0-9A-Za-z. If it was at the very least choosing a random offset each time, it should have unveiled a nice little bell curve with the plaintext characters sat in the middle.
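The kind of check described above can be sketched roughly as follows: for each aligned plaintext/cipher character pair, record the offset within the 0-9A-Za-z alphabet and look at how the offsets are distributed. This is not the exact analysis that was run, only an illustration; the function names are made up, and the chi-square statistic against a uniform distribution is my own choice of summary.

```python
import string
from collections import Counter

# Replacement alphabet named in the comment above: 0-9A-Za-z (62 symbols).
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def offset_histogram(pairs):
    """Count (cipher - plain) mod 62 over aligned alphanumeric characters."""
    counts = Counter()
    for plain, cipher in pairs:
        for p, c in zip(plain, cipher):
            if p in INDEX and c in INDEX:  # skip '/', '-', '.' and the like
                counts[(INDEX[c] - INDEX[p]) % len(ALPHABET)] += 1
    return counts


def chi_square_uniform(counts):
    """Chi-square statistic of the offsets against a flat distribution."""
    total = sum(counts.values())
    expected = total / len(ALPHABET)
    return sum(
        (counts.get(k, 0) - expected) ** 2 / expected
        for k in range(len(ALPHABET))
    )


# The single known pair from this thread; far too little data on its own.
sample = [("/media/jbgonlsb/image-1.jpg", "/cdg5F/7zhLEYlx/ljtDggf2woh")]
hist = offset_histogram(sample)
print(sorted(hist.items()))
print(chi_square_uniform(hist))
```

A fixed or narrowly distributed offset would pile the counts onto a few values and give a large statistic, while a uniformly random replacement should give a roughly flat histogram once enough samples are collected. One pair proves nothing either way; something like the 1000+ samples mentioned above is needed for the histogram to mean anything.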