CommunityScrapers Scraper for .nfo exports from kodi/plex

Adds a scraper for Kodi/Plex etc. .nfo export files in the directory of the video file. Resolves #484, #429

Aug 15 '21 12:08 Phasetime

Going into draft again until I change get_file_path to use GraphQL instead of direct SQLite access.

Aug 16 '21 02:08 Phasetime

I am getting this error when trying to use the Kodi scraper:

"Error running scraper script: exec: "python": executable file not found in %PATH%"
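The message says that an executable named `python` could not be found on the PATH. A minimal sketch for checking which interpreter names are actually available, run with whatever Python launcher does work on your machine (the helper name `find_interpreters` is hypothetical):

```python
import shutil


def find_interpreters(names=("python", "python3")) -> dict:
    """Map each interpreter name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


for name, path in find_interpreters().items():
    print(f"{name}: {path or 'not found on PATH'}")
```

If only `python3` is reported, pointing the scraper at that name, or adding a `python` alias to the PATH, is one thing to try. That is an assumption about your setup, not something confirmed in this thread.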

Sep 03 '21 13:09 AiWABR

Hey, thanks for making this. I adapted some of it for my own use, but wanted to let you know it's possible to get images from the folder, though not directly.

The short of it is, if you convert the image to a base64 string, you no longer need to return a public url via the scraper results, and can just return that instead. This is how I did it based on your code:

#new import
import base64

# Changes done to the bottom of your script, so the first executed part:
imagePath = os.path.dirname(videoFilePath) + "/poster.jpg" # you already had the videoFilePath variable, I'm just getting an image on that same directory named "poster.jpg".
lookup_xml(nfoFilePath, fragment['title'], imagePath) # Added an extra parameter to pass this path to the next method

# Changes done to your query_xml def:
res = {'title': title}
with open(imagePath, "rb") as image_file:
    res['image'] = "data:image/jpeg;base64," + base64.b64encode(image_file.read()).decode()

This is a non-smart, hardcoded demonstration, but it should give you the gist of what you can do. Might be a good idea to add it.
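A slightly less hardcoded version of the same idea might look like the sketch below. It is not code from this PR: the helper names `find_poster` and `image_to_data_url` and the list of candidate file names are assumptions for illustration.

```python
import base64
import mimetypes
from pathlib import Path
from typing import Optional

# Hypothetical list of artwork names to look for next to the video file.
CANDIDATE_NAMES = ("poster.jpg", "poster.png", "folder.jpg")


def find_poster(video_path: str) -> Optional[Path]:
    """Return the first candidate image found in the video's directory, or None."""
    directory = Path(video_path).parent
    for name in CANDIDATE_NAMES:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None


def image_to_data_url(image_path: Path) -> str:
    """Encode an image file as a base64 data URL."""
    mime, _ = mimetypes.guess_type(image_path.name)
    if mime is None:
        mime = "image/jpeg"  # assumed fallback when the type cannot be guessed
    encoded = base64.b64encode(image_path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

The result of `image_to_data_url` can be assigned to `res['image']` in place of a public URL, as in the snippet above.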

Sep 19 '21 04:09 Emilaia

Got this Error Message:

21-09-19 15:03:44 Error could not unmarshal json: EOF
2021-09-19 15:03:44 Error scraper: KeyError: 'tags'
2021-09-19 15:03:44 Error scraper: if res["tags"] is not None:
2021-09-19 15:03:44 Error scraper: File "kodi.py", line 39, in query_xml
2021-09-19 15:03:44 Error scraper: res = query_xml(nfoFilePath, fragment["title"])
2021-09-19 15:03:44 Error scraper: File "kodi.py", line 103, in
2021-09-19 15:03:44 Error scraper: Traceback (most recent call last):
2021-09-19 15:03:44 Error scraper: Exact match found for Defiance
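The traceback points at `res["tags"]` being read before the key exists, which would happen for an .nfo without any tag elements. A minimal sketch of one way to avoid that, assuming `res` is a plain dict as the traceback suggests (the helper name `add_tags` is hypothetical):

```python
def add_tags(res: dict, names: list) -> dict:
    """Append tag names to res["tags"], creating the list on first use."""
    if names:
        # setdefault avoids the KeyError raised by res["tags"] on a missing key
        res.setdefault("tags", []).extend({"name": n} for n in names)
    return res
```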

Sep 19 '21 13:09 peterpannimmerland

@peterpannimmerland should be fixed now

Sep 25 '21 22:09 Phasetime

Great, Thanks a lot. But now there is a new error:

21-09-26 08:56:13 Error could not unmarshal json: EOF
2021-09-26 08:56:13 Error scraper: AttributeError: 'NoneType' object has no attribute 'text'
2021-09-26 08:56:13 Error scraper: if actor.find("type").text == "Actor":
2021-09-26 08:56:13 Error scraper: File "kodi.py", line 46, in query_xml
2021-09-26 08:56:13 Error scraper: res = query_xml(nfoFilePath, fragment["title"])
2021-09-26 08:56:13 Error scraper: File "kodi.py", line 103, in
2021-09-26 08:56:13 Error scraper: Traceback (most recent call last):
2021-09-26 08:56:13 Error scraper: No exact match found for brazzersexxtra.21.04.17.kristina.rose.and.tru.kait.two.wives.one.cock. Matching with Two Wives One Cock!
2021-09-26 08:56:13 Debug Scraper script started
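The AttributeError arises because `actor.find("type")` returns None when an actor element has no type child. A minimal sketch of a more tolerant lookup using ElementTree's `findtext`; the function name `performer_names` and the sample XML are assumptions, not taken from the attached .nfo:

```python
import xml.etree.ElementTree as ET


def performer_names(root: ET.Element) -> list:
    """Collect performer names, tolerating missing <type> and <name> children."""
    names = []
    for actor in root.findall("actor"):
        actor_type = actor.findtext("type")  # None when <type> is absent
        if actor_type is not None and actor_type != "Actor":
            continue  # skip entries explicitly typed as something else
        name = actor.findtext("name") or actor.text
        if name and name.strip():
            names.append(name.strip())
    return names


# Made-up sample covering the shapes discussed in this thread.
sample = ET.fromstring(
    "<movie>"
    "<actor><name>A</name><type>Actor</type></actor>"
    "<actor><name>B</name></actor>"
    "<actor>C</actor>"
    "<actor><name>D</name><type>Director</type></actor>"
    "</movie>"
)
print(performer_names(sample))
```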

I have no idea why the scraper can't find the files. In my directory, all files are sorted like this:

Screen from Stash:

regards,

Sep 26 '21 07:09 peterpannimmerland

@peterpannimmerland can you provide that nfo file for me? Seems like it differs from the schema I thought was universal... BTW you can test it again, I should have fixed that bug as well.

Oct 02 '21 20:10 Phasetime

Hi, in the attachment you find the .nfo file. Please rename .txt to .nfo

brazzersexxtra.21.04.17.kristina.rose.and.tru.kait.two.wives.one.cock.txt

I updated your script and am getting new errors :-)

021-10-03 09:56:53 Error could not unmarshal json: EOF
2021-10-03 09:56:53 Error scraper: SyntaxError: invalid syntax
2021-10-03 09:56:53 Error scraper: ^
2021-10-03 09:56:53 Error scraper: else if actor.find("name") != None:
2021-10-03 09:56:53 Error scraper: File "kodi.py", line 49
2021-10-03 09:56:41 Error Error loading scraper /root/.stash/scrapers/javdb.yml: yaml: unmarshal errors: line 21: field sceneByName not found in type scraper.config line 25: field sceneByQueryFragment not found in type scraper.config
2021-10-03 09:56:41 Error Error loading scraper /root/.stash/scrapers/ThePornDB.yml: yaml: unmarshal errors: line 16: field sceneByName not found in type scraper.config line 20: field sceneByQueryFragment not found in type scraper.config
2021-10-03 09:56:41 Error Error loading scraper /root/.stash/scrapers/SARJ-LLC.yml: yaml: unmarshal errors: line 3: field sceneByName not found in type scraper.config line 11: field sceneByQueryFragment not found in type scraper.config
2021-10-03 09:56:41 Error Error loading scraper /root/.stash/scrapers/JavLibrary.yml: yaml: unmarshal errors: line 22: field sceneByName not found in type scraper.config line 26: field sceneByQueryFragment not found in type scraper.config

Thanks for your great work

regards

Oct 03 '21 08:10 peterpannimmerland

Hi, thanks for making this. I'm getting the following error:

ERRO[2021-12-02 14:37:33] [Scrape / Kodi XML] File "/root/.stash/scrapers/kodi.py", line 49
ERRO[2021-12-02 14:37:33] [Scrape / Kodi XML] else if actor.find("name") != None:
ERRO[2021-12-02 14:37:33] [Scrape / Kodi XML] ^
ERRO[2021-12-02 14:37:33] [Scrape / Kodi XML] SyntaxError: invalid syntax
ERRO[2021-12-02 14:37:33] could not unmarshal json: EOF

I get it with all nfo files, not just one. A representative nfo is here: Natasha Nice - My First Sex Teacher

All NFOs were created using this:

https://forum.kodi.tv/showthread.php?tid=360299

Thanks in advance for any help.

EDIT: I guess since I'm here, just confirming: my data is in /volume1/Adult. Docker has /volume1 mounted as /data. My base directory before should then be /volume1 and my base after should be /data, correct?

Dec 02 '21 23:12 CapgrasDelusion2

@CapgrasDelusion2 for your specific error, try changing the else if on line 49 of kodi.py to elif. Not sure if it will work afterwards, but that should take care of the specific error. Bear in mind that this PR is still a draft, so it might need some more fixes. For your second question: if the storage volumes are mapped /volume1 -> /data in Docker, you should now see /data/Adult if you try to add a library from Stash.
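The volume mapping described here can be pictured as a simple prefix swap. A minimal sketch, assuming POSIX-style paths; the function name `to_container_path` is hypothetical and not part of Stash or this scraper:

```python
from pathlib import PurePosixPath


def to_container_path(host_path: str, host_root: str, container_root: str) -> str:
    """Translate a host path into the path seen inside the container."""
    # relative_to raises ValueError if host_path is outside the mounted root
    relative = PurePosixPath(host_path).relative_to(host_root)
    return str(PurePosixPath(container_root) / relative)


print(to_container_path("/volume1/Adult", "/volume1", "/data"))
```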

Dec 03 '21 17:12 bnkai

Worked like a charm, thank you very much

Dec 03 '21 20:12 CapgrasDelusion2

The below seems to work OK for me on Linux. With the above assumptions, only the .py file was modified.

import sys
import pathlib
import mimetypes
import base64
import json
from urllib.parse import urlparse

import xml.etree.ElementTree as ET

try:
    import py_common.graphql as graphql
    import py_common.log as log
except ModuleNotFoundError:
    print(
        "You need to download the folder 'py_common' from the community repo (CommunityScrapers/tree/master/scrapers/py_common)",
        file=sys.stderr)
    sys.exit(1)
"""
This script parses Kodi .nfo files for metadata. The .nfo file must be in the same directory as the video file and share its base name.
"""


def query_xml(path, title):
    res = {"title": title}
    try:
        tree = ET.parse(path)
    except Exception as e:
        log.error(f'xml parsing failed:{e}')
        print(json.dumps(res))
        exit(1)

    if title == tree.find("title").text:
        log.info("Exact match found for " + title)
    else:
        log.info("No exact match found for " + title + ". Matching with " +
                 tree.find("title").text + "!")

    # Extract metadata from xml
    if tree.find("title") != None:
        res["title"] = tree.find("title").text

    if tree.find("plot") != None:
        res["details"] = tree.find("plot").text

    if tree.find("releasedate") != None:
        res["date"] = tree.find("releasedate").text
    elif tree.find("premiered") != None:
        res["date"] = tree.find("premiered").text

    if tree.find("tag") != None:
        res["tags"] = [{"name": x.text} for x in tree.findall("tag")]
    if tree.find("genre") != None:
        if "tags" in res:
            res["tags"] += [{"name": x.text} for x in tree.findall("genre")]
        else:
            res["tags"] = [{"name": x.text} for x in tree.findall("genre")]

    if tree.find("actor") != None:
        res["performers"] = []
        for actor in tree.findall("actor"):
            if actor.find("type") != None:
                if actor.find("type").text == "Actor":
                    res["performers"].append({"name": actor.find("name").text})
            elif actor.find("name") != None:
                res["performers"].append({"name": actor.find("name").text})
            else:
                res["performers"].append({"name": actor.text})

    if tree.find("studio") != None:
        res["studio"] = {"name": tree.find("studio").text}

    if tree.find("art") != None:
        if tree.find("art").find("poster") != None:
            posterElem = tree.find("art").find("poster")
            if posterElem.text != None:
                if uri_validator(posterElem.text):
                    # if image is a valid url return the url
                    res["image"] = posterElem.text
                elif pathlib.Path(posterElem.text).is_file(
                ):  # if image is a file return its base64 string
                    res["image"] = make_image_data_url(posterElem.text)
                else:  # non valid image text
                    log.warning(f"Non valid image data <{posterElem.text}>")
    return res


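def uri_validator(u):
    # Treat the text as a URL only if it has a scheme, a host and a path
    try:
        result = urlparse(u)
        return all([result.scheme, result.netloc, result.path])
    except ValueError:  # urlparse raises ValueError on malformed input
        return False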


def make_image_data_url(image_path):
    # type: (str,) -> str
    mime, _ = mimetypes.guess_type(image_path)
    with open(image_path, 'rb') as img:
        encoded = base64.b64encode(img.read()).decode()
    return 'data:{0};base64,{1}'.format(mime, encoded)


if sys.argv[1] == "query":
    fragment = json.loads(sys.stdin.read())
    s_id = fragment.get("id")
    if not s_id:
        log.error(f"No ID found")
        sys.exit(1)

    # Assume that .nfo/.xml is named exactly alike the video file and is at the same location
    # Query graphQL for the file path
    scene = graphql.getScene(s_id)
    if scene:
        scene_path = scene.get("path")
        if scene_path:
            p = pathlib.Path(scene_path)
            res = {"title": fragment["title"]}
            f = p.with_suffix(".nfo")
            if f.is_file():
                pass
            elif p.with_suffix(".NFO").is_file():
                f = p.with_suffix(".NFO")
            else:
                log.info(f"No nfo/xml files found for the scene: {p}")
                print("{}")
                exit(0)
            res = query_xml(f, fragment["title"])
            print(json.dumps(res))
            exit(0)
    log.error(f"No scene found for {s_id}")
    exit(1)

Jan 10 '22 16:01 bnkai

Would it be a possibility to locate images that also share the same path as the video/nfo? For instance:

video.mkv video.nfo video-fanart.jpg
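One way this could work is to probe for files that share the video's base name plus an artwork suffix. This is only a sketch: the suffix and extension lists and the function name `sibling_images` are assumptions, not something the scraper currently does.

```python
from pathlib import Path
from typing import List

# Assumed artwork suffixes and extensions; adjust to whatever your exporter writes.
ART_SUFFIXES = ("-poster", "-fanart", "-thumb")
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def sibling_images(video_path: str) -> List[Path]:
    """Return existing images named <video stem><suffix><ext> beside the video."""
    video = Path(video_path)
    found = []
    for suffix in ART_SUFFIXES:
        for ext in IMAGE_EXTENSIONS:
            candidate = video.with_name(video.stem + suffix + ext)
            if candidate.is_file():
                found.append(candidate)
    return found
```

For the layout above, `sibling_images("video.mkv")` would pick up `video-fanart.jpg` if it exists next to the video.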

Mar 23 '22 18:03 nymeras

How would someone like myself, who has only a limited experience dabbling with some very basic coding, implement this scraper into their setup? Thank you in advance

May 17 '22 05:05 adultsesamestreet

Getting the following error message

2022-05-27 17:34:06 Error could not unmarshal json from script output: EOF
2022-05-27 17:34:06 Error [Scrape / Kodi XML] TypeError: can only concatenate str (not "NoneType") to str
2022-05-27 17:34:06 Error [Scrape / Kodi XML]     log.info("No exact match found for " + title + ". Matching with " + tree.find("title").text + "!")
2022-05-27 17:34:06 Error [Scrape / Kodi XML]   File "F:\Stash\scrapers\kodi.py", line 34, in query_xml
2022-05-27 17:34:06 Error [Scrape / Kodi XML]     res = query_xml(f, fragment["title"])
2022-05-27 17:34:06 Error [Scrape / Kodi XML]   File "F:\Stash\scrapers\kodi.py", line 110, in <module>
2022-05-27 17:34:06 Error [Scrape / Kodi XML] Traceback (most recent call last):
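The TypeError indicates that one of the values being concatenated was None: either the scene had no title, or the title element in the .nfo was missing or empty. A minimal sketch of a message builder that tolerates both cases (the function name `describe_match` is hypothetical):

```python
import xml.etree.ElementTree as ET


def describe_match(root: ET.Element, title: str) -> str:
    """Build the log message without assuming either title is a string."""
    nfo_title = root.findtext("title")  # None if <title> is missing
    if nfo_title is not None and nfo_title == title:
        return f"Exact match found for {title}"
    # f-strings format None as "None" instead of raising TypeError
    return f"No exact match found for {title}. Matching with {nfo_title}!"
```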

May 28 '22 00:05 edgar1016

Hi, this scraper looks like it would be very useful but am I right in thinking it does not work at the moment? Has it been abandoned?

I've tried really hard to get it working but I always get the same error: "scraper kodi: could not unmarshal json from script output: EOF"
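The EOF part of that message suggests that no JSON reached Stash on stdout, which is consistent with the script raising an exception before printing anything, as in the tracebacks earlier in this thread. A minimal sketch of a wrapper that always yields a JSON document; the name `run_safely` and the `scrape` callable are assumptions for illustration:

```python
import json
import sys


def run_safely(scrape, fragment: dict) -> str:
    """Run a scrape callable and always return a JSON document as text."""
    try:
        result = scrape(fragment)
    except Exception as exc:  # report on stderr, keep stdout valid JSON
        print(f"scrape failed: {exc!r}", file=sys.stderr)
        result = {}
    return json.dumps(result or {})
```

Printing the returned string as the script's only stdout output would turn a crash into an empty result plus a readable error on stderr, which tends to be easier to diagnose.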

Jul 19 '22 15:07 jake4dave4