linkml-runtime Optimize `get_classes_by

Waiting for tests to run on the array branch and browsing through issues and PRs, noticed there was one for optimizing get_classes_by_slot https://github.com/linkml/linkml-runtime/pull/300 and any time there's an induced_slot perf thing i gotta check it out now.

For non-induced slots, #300 is a perf wash (for 100 rounds through all slots in biolink, 6.77 before and 6.81 after), but when i went to profile it with induced slots i was surprised at how long the estimated time for a single round through was supposed to take - roughly 3 hours (~160,000x slower vs non-induced slots).

Using sets is a great idea for removing the step of making something unique at the end. we can take advantage of that further by breaking after the first set addition (since future set additions will do nothing).

the problem is that we are still making basically the most expensive call you can do with schemaview, which is to call induced_slot on every combination of slot and class for a schema. induced_slot is expensive in itself, but it's usually fine to do because you're only calling it in the context of a few classes and the caching can handle how slow it is, but since the call is every slot in the context of every class, no caching can be done.

The fix is relatively simple - since get_classes_by_slot only cares about the existence of the slot on the class, not on its full induced form, there is no need to use induced_slot at all. class_induced_slots iterates over the slots given by class_slots: https://github.com/linkml/linkml-runtime/blob/7c311d975422ddf990ae524fe6d969326c1b9261/linkml_runtime/utils/schemaview.py#L1368

so any slots that class_induced_slots knows about, class_slots knows about. The implementation in this PR also avoids iterating over the classes twice, although that's also a perf wash because iterating through a list and comparing strings takes basically no time.

So for the simple timing function

from linkml_runtime.utils.schemaview import SchemaView
import cProfile

def get_slot_classes():
    sv = SchemaView('biolink.yaml')
    for i in range(100):
        for slot in sv.all_slots(imports=True).values():
            _ = sv.get_classes_by_slot(slot, include_induced=False)

def get_induced_slot_classes():
    sv = SchemaView('biolink.yaml')
    for i in range(1):
        for i, slot in enumerate(sv.all_slots(imports=True).values()):
            _ = sv.get_classes_by_slot(slot, include_induced=True)
            if i >= 4:
                break

cProfile.run('get_slot_classes()', 'before_stats.prof')
cProfile.run('get_induced_slot_classes()', 'before_induced_stats.prof')

where i break out of getting the induced slots after 5 slots because it takes so long,

you get:

version	regular per-call	induced per-call (s)
before 300	8.234e-05	21.59
after 300	8.092e-05	21.85
this PR	6.672e-05	0.01946

So i'll call the regular difference in the noise and the induced slot call roughly 1100x faster. A full run over biolink takes 12s vs ~3h.

attached profiling results where beforest is before 300, before is 300 and after is this PR.

profiling-get_classes_by_slot.zip

Mar 05 '24 04:03 sneakers-the-rat

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 62.88%. Comparing base (27b9158) to head (ee4e4b2). Report is 31 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #304   +/-   ##
=======================================
  Coverage   62.88%   62.88%           
=======================================
  Files          62       62           
  Lines        8528     8528           
  Branches     2436     2435    -1     
=======================================
  Hits         5363     5363           
  Misses       2554     2554           
  Partials      611      611

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

Mar 05 '24 05:03 codecov[bot]

would that be helpful? would that look like just running the main linkml tests with the PR'd version of linkml-runtime, or did you have something else in mind? for now I can just run them and post results

Mar 05 '24 19:03 sneakers-the-rat

would that look like just running the main linkml tests with the PR'd version of linkml-runtime

That would work! IMO it's a bit clunky and it's a big ask any time there is a schemaview change but induced slots are central enough it's worth doing here, thx!

Mar 05 '24 20:03 cmungall

So yes test failures, but i honestly can't tell if it's from this or not.. i'll investigate more thoroughly later, but would advise not merging yet. Tests also took 2 hours to run and i can't rly fathom why. will also profile that later.

====================================================== short test summary info =======================================================
FAILED tests/test_issues/test_issue_494.py::test_jsonschema_validation - AssertionError: Regex pattern did not match.
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_config_extends_recommended - AssertionError: 1 != 2
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_directory_of_files - AssertionError: 'schemas/schema_a.yaml' not found in "/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/tmp0wrjounp/schema...
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_no_config - AssertionError: 1 != 2
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_no_schema_errors - AssertionError: 1 != 0
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_processes_dot_files_in_directory_when_a_option_provided - AssertionError: 'schemas/schema_a.yaml' not found in "/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/tmpf87woivi/schema...
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_processes_dot_files_in_directory_when_all_option_provided - AssertionError: 'schemas/schema_a.yaml' not found in "/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/tmps6ym618p/schema...
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_validate_schema - AssertionError: 1 != 2
FAILED tests/test_linter/test_rule_canonical_prefixes.py::TestCanonicalPrefixesRule::test_custom_context - AttributeError: 'NoneType' object has no attribute 'append'
FAILED tests/test_linter/test_rule_canonical_prefixes.py::TestCanonicalPrefixesRule::test_default_merged_context - AttributeError: 'NoneType' object has no attribute 'append'
ERROR tests/test_workspaces/test_example_runner.py::test_load_from_dict - AttributeError: 'NoneType' object has no attribute 'append'
ERROR tests/test_workspaces/test_example_runner.py::test_example_runner - AttributeError: 'NoneType' object has no attribute 'append'
ERROR tests/test_workspaces/test_example_runner.py::test_example_runner_non_defaults - AttributeError: 'NoneType' object has no attribute 'append'
========== 10 failed, 6151 passed, 964 skipped, 3 xfailed, 4 warnings, 3 errors, 261 subtests passed in 6882.77s (1:54:42) ===========

expand/collapse pytest output

=============================================================== ERRORS ===============================================================
_______________________________________________ ERROR at setup of test_load_from_dict ________________________________________________

input_path = <function input_path.<locals>.get_path at 0x144044e00>
tmp_path = PosixPath('/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/pytest-of-jonny/pytest-69/test_load_from_dict0')

    @pytest.fixture
    def example_runner(input_path, tmp_path):
        schemaview = SchemaView(input_path("personinfo.yaml"))
>       ctxt = load_multi_context(["obo", "linked_data", "prefixcc"])

tests/test_workspaces/test_example_runner.py:19: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/io/parser.py:46: in load_multi_context
    ctxt.combine(load_context(n, refresh=refresh))
../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/datamodel/context.py:161: in combine
    self.add_prefix(pe.prefix, pe.namespace, pe.status)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = Context(name='obo+linked_data+prefixcc', description=None, prefix_expansions=[], comments=None, location=None, format=..., merged_from=None, upper=None, lower=None, _prefixes=None, _prefixes_lower=[], _namespaces=None, _namespaces_lower=[])
prefix = 'AAO', namespace = 'http://purl.obolibrary.org/obo/AAO_', status = <StatusType.canonical: 'canonical'>, preferred = False

    def add_prefix(
        self,
        prefix: PREFIX,
        namespace: NAMESPACE,
        status: StatusType = StatusType.canonical,
        preferred: bool = False,
    ):
        """
        Adds a prefix expansion to this context.
    
        The current context stays canonical. Additional prefixes
        added may be classified as non-canonical.
    
        If upper or lower is set for this context, the
        prefix will be auto-case normalized,
        UNLESS preferred=True
    
        :param prefix: prefix to be added
        :param namespace: namespace to be added
        :param status: the status of the prefix being added
        :param preferred:
        :return:
        """
        # TODO: check status
        if not preferred:
            if self.upper:
                prefix = prefix.upper()
                if self.lower:
                    raise ValueError("Cannot set both upper AND lower")
            if self.lower:
                prefix = prefix.lower()
        prefixes = self.prefixes(lower=True)
        namespaces = self.namespaces(lower=True)
        if prefix.lower() in prefixes:
            if namespace.lower() in namespaces:
                return
                # status = StatusType.multi_alias
            else:
                status = StatusType.prefix_alias
                self._namespaces.append(namespace)
                self._namespaces_lower.append(namespace.lower())
        else:
>           self._prefixes.append(prefix)
E           AttributeError: 'NoneType' object has no attribute 'append'

../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/datamodel/context.py:205: AttributeError
_______________________________________________ ERROR at setup of test_example_runner ________________________________________________

input_path = <function input_path.<locals>.get_path at 0x144044a40>
tmp_path = PosixPath('/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/pytest-of-jonny/pytest-69/test_example_runner0')

    @pytest.fixture
    def example_runner(input_path, tmp_path):
        schemaview = SchemaView(input_path("personinfo.yaml"))
>       ctxt = load_multi_context(["obo", "linked_data", "prefixcc"])

tests/test_workspaces/test_example_runner.py:19: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/io/parser.py:46: in load_multi_context
    ctxt.combine(load_context(n, refresh=refresh))
../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/datamodel/context.py:161: in combine
    self.add_prefix(pe.prefix, pe.namespace, pe.status)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = Context(name='obo+linked_data+prefixcc', description=None, prefix_expansions=[], comments=None, location=None, format=..., merged_from=None, upper=None, lower=None, _prefixes=None, _prefixes_lower=[], _namespaces=None, _namespaces_lower=[])
prefix = 'AAO', namespace = 'http://purl.obolibrary.org/obo/AAO_', status = <StatusType.canonical: 'canonical'>, preferred = False

    def add_prefix(
        self,
        prefix: PREFIX,
        namespace: NAMESPACE,
        status: StatusType = StatusType.canonical,
        preferred: bool = False,
    ):
        """
        Adds a prefix expansion to this context.
    
        The current context stays canonical. Additional prefixes
        added may be classified as non-canonical.
    
        If upper or lower is set for this context, the
        prefix will be auto-case normalized,
        UNLESS preferred=True
    
        :param prefix: prefix to be added
        :param namespace: namespace to be added
        :param status: the status of the prefix being added
        :param preferred:
        :return:
        """
        # TODO: check status
        if not preferred:
            if self.upper:
                prefix = prefix.upper()
                if self.lower:
                    raise ValueError("Cannot set both upper AND lower")
            if self.lower:
                prefix = prefix.lower()
        prefixes = self.prefixes(lower=True)
        namespaces = self.namespaces(lower=True)
        if prefix.lower() in prefixes:
            if namespace.lower() in namespaces:
                return
                # status = StatusType.multi_alias
            else:
                status = StatusType.prefix_alias
                self._namespaces.append(namespace)
                self._namespaces_lower.append(namespace.lower())
        else:
>           self._prefixes.append(prefix)
E           AttributeError: 'NoneType' object has no attribute 'append'

../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/datamodel/context.py:205: AttributeError
_________________________________________ ERROR at setup of test_example_runner_non_defaults _________________________________________

input_path = <function input_path.<locals>.get_path at 0x1440474c0>
tmp_path = PosixPath('/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/pytest-of-jonny/pytest-69/test_example_runner_non_defaul0')

    @pytest.fixture
    def example_runner(input_path, tmp_path):
        schemaview = SchemaView(input_path("personinfo.yaml"))
>       ctxt = load_multi_context(["obo", "linked_data", "prefixcc"])

tests/test_workspaces/test_example_runner.py:19: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/io/parser.py:46: in load_multi_context
    ctxt.combine(load_context(n, refresh=refresh))
../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/datamodel/context.py:161: in combine
    self.add_prefix(pe.prefix, pe.namespace, pe.status)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = Context(name='obo+linked_data+prefixcc', description=None, prefix_expansions=[], comments=None, location=None, format=..., merged_from=None, upper=None, lower=None, _prefixes=None, _prefixes_lower=[], _namespaces=None, _namespaces_lower=[])
prefix = 'AAO', namespace = 'http://purl.obolibrary.org/obo/AAO_', status = <StatusType.canonical: 'canonical'>, preferred = False

    def add_prefix(
        self,
        prefix: PREFIX,
        namespace: NAMESPACE,
        status: StatusType = StatusType.canonical,
        preferred: bool = False,
    ):
        """
        Adds a prefix expansion to this context.
    
        The current context stays canonical. Additional prefixes
        added may be classified as non-canonical.
    
        If upper or lower is set for this context, the
        prefix will be auto-case normalized,
        UNLESS preferred=True
    
        :param prefix: prefix to be added
        :param namespace: namespace to be added
        :param status: the status of the prefix being added
        :param preferred:
        :return:
        """
        # TODO: check status
        if not preferred:
            if self.upper:
                prefix = prefix.upper()
                if self.lower:
                    raise ValueError("Cannot set both upper AND lower")
            if self.lower:
                prefix = prefix.lower()
        prefixes = self.prefixes(lower=True)
        namespaces = self.namespaces(lower=True)
        if prefix.lower() in prefixes:
            if namespace.lower() in namespaces:
                return
                # status = StatusType.multi_alias
            else:
                status = StatusType.prefix_alias
                self._namespaces.append(namespace)
                self._namespaces_lower.append(namespace.lower())
        else:
>           self._prefixes.append(prefix)
E           AttributeError: 'NoneType' object has no attribute 'append'

../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/datamodel/context.py:205: AttributeError
============================================================== FAILURES ==============================================================
_____________________________________________________ test_jsonschema_validation _____________________________________________________

input_path = <function input_path.<locals>.get_path at 0x13de5f880>

    def test_jsonschema_validation(input_path):
        """Test case that tries to infer the target class for a schema
        that has a local import and doesn't explicitly specify the
        target class"""
        SCHEMA = input_path("issue_494/slot_usage_example.yaml")
        sv = SchemaView(schema=SCHEMA)
    
        with pytest.raises(
            RuntimeError,
            match="Multiple potential target "
            r"classes found: \['Container', 'annotation'\]. "
            "Please specify a target using --target-class "
            "or by adding tree_root: true to the relevant class in the schema",
        ):
            # infer target_class rather than explicit specification
>           infer_root_class(sv)

tests/test_issues/test_issue_494.py:22: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

sv = SchemaView(schema=SchemaDefinition(name='personinfo', id_prefixes=[], id_prefixes_are_closed=None, definition_uri=None...one, slot_names_unique=None, settings={})}, importmap={}, modifications=0, uuid='70a0f3ff-b715-4b8b-83f3-9856da459a2f')

    def infer_root_class(sv: SchemaView) -> Optional[ClassDefinitionName]:
        """
        Infer the class that should be at the root of the object tree
    
        (Note this is distinct from the root of the class hierarchy)
    
        If a class is explicitly designated with tree_root, use this.
        Otherwise use the class that is not referenced as a range in any other class.
        """
        for c in sv.all_classes().values():
            if c.tree_root:
                return c.name
        refs = defaultdict(int)
        for cn in sv.all_classes().keys():
            for sn in sv.class_slots(cn):
                slot = sv.induced_slot(sn, cn)
                r = slot.range
                if r in sv.all_classes():
                    for a in sv.class_ancestors(r):
                        refs[a] += 1
        candidates = [cn for cn in sv.all_classes().keys() if cn not in refs]
    
        # throw Exception if unambiguous root cannot be inferred
        if len(candidates) > 1:
>           raise RuntimeError(
                f"Multiple potential target classes found: {candidates}. "
                "Please specify a target using --target-class or by adding "
                "tree_root: true to the relevant class in the schema."
            )
E           RuntimeError: Multiple potential target classes found: ['annotation', 'Container']. Please specify a target using --target-class or by adding tree_root: true to the relevant class in the schema.

linkml/utils/datautils.py:93: RuntimeError

During handling of the above exception, another exception occurred:

input_path = <function input_path.<locals>.get_path at 0x13de5f880>

    def test_jsonschema_validation(input_path):
        """Test case that tries to infer the target class for a schema
        that has a local import and doesn't explicitly specify the
        target class"""
        SCHEMA = input_path("issue_494/slot_usage_example.yaml")
        sv = SchemaView(schema=SCHEMA)
    
>       with pytest.raises(
            RuntimeError,
            match="Multiple potential target "
            r"classes found: \['Container', 'annotation'\]. "
            "Please specify a target using --target-class "
            "or by adding tree_root: true to the relevant class in the schema",
        ):
E       AssertionError: Regex pattern did not match.
E        Regex: "Multiple potential target classes found: \\['Container', 'annotation'\\]. Please specify a target using --target-class or by adding tree_root: true to the relevant class in the schema"
E        Input: "Multiple potential target classes found: ['annotation', 'Container']. Please specify a target using --target-class or by adding tree_root: true to the relevant class in the schema."

tests/test_issues/test_issue_494.py:14: AssertionError
--------------------------------------------------------- Captured log call ----------------------------------------------------------
INFO     root:schemaview.py:206 Importing testme as testme from source /Users/jonny/git/forks/linkml/tests/test_issues/input/issue_494/slot_usage_example.yaml; base_dir=/Users/jonny/git/forks/linkml/tests/test_issues/input/issue_494
INFO     root:schemaview.py:206 Importing linkml:types as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/types from source /Users/jonny/git/forks/linkml/tests/test_issues/input/issue_494/slot_usage_example.yaml; base_dir=None
___________________________________________ TestLinterCli.test_config_extends_recommended ____________________________________________

self = <tests.test_linter.test_cli.TestLinterCli testMethod=test_config_extends_recommended>

    def test_config_extends_recommended(self):
        config_file = "config.yaml"
        with self.runner.isolated_filesystem():
            write_schema_file()
            write_config_file(config_file, extends_recommended=True)
    
            result = self.runner.invoke(main, ["--config", config_file, SCHEMA_FILE])
>           self.assertEqual(result.exit_code, 2)
E           AssertionError: 1 != 2

tests/test_linter/test_cli.py:110: AssertionError
--------------------------------------------------------- Captured log call ----------------------------------------------------------
INFO     root:schemaview.py:206 Importing linkml:units as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/units from source /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/meta.yaml; base_dir=None
INFO     root:schemaview.py:206 Importing linkml:mappings as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/mappings from source /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/meta.yaml; base_dir=None
INFO     root:schemaview.py:206 Importing linkml:types as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/types from source /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/meta.yaml; base_dir=None
INFO     root:schemaview.py:206 Importing linkml:annotations as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/annotations from source /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/meta.yaml; base_dir=None
INFO     root:schemaview.py:206 Importing linkml:extensions as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/extensions from source /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/meta.yaml; base_dir=None
_______________________________________________ TestLinterCli.test_directory_of_files ________________________________________________

self = <tests.test_linter.test_cli.TestLinterCli testMethod=test_directory_of_files>

        def test_directory_of_files(self):
            schema_dir = Path("schemas")
            schema_a = schema_dir / "schema_a.yaml"
            schema_b = schema_dir / "schema_b.yaml"
            with self.runner.isolated_filesystem():
                schema_dir.mkdir()
                schema_a.write_text(
                    """
    id: http://example.org/test_a
    name: test_a
    
    classes:
      person:
        description: An individual human
    """
                )
                schema_b.write_text(
                    """
    id: http://example.org/test_b
    name: test_b
    
    slots:
      a slot:
        description: A slot to hold thing
    """
                )
    
                result = self.runner.invoke(main, [str(schema_dir)])
                self.assertEqual(result.exit_code, 1)
>               self.assertIn(str(schema_a), result.stdout)
E               AssertionError: 'schemas/schema_a.yaml' not found in "/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/tmp0wrjounp/schemas/schema_b.yaml\n  warning  Slot has name 'a slot'  (standard_naming)\n"

tests/test_linter/test_cli.py:215: AssertionError
____________________________________________________ TestLinterCli.test_no_config ____________________________________________________

self = <tests.test_linter.test_cli.TestLinterCli testMethod=test_no_config>

    def test_no_config(self):
        with self.runner.isolated_filesystem():
            write_schema_file()
    
            result = self.runner.invoke(main, [SCHEMA_FILE])
>           self.assertEqual(result.exit_code, 2)
E           AssertionError: 1 != 2

tests/test_linter/test_cli.py:65: AssertionError
________________________________________________ TestLinterCli.test_no_schema_errors _________________________________________________

self = <tests.test_linter.test_cli.TestLinterCli testMethod=test_no_schema_errors>

        def test_no_schema_errors(self):
            with self.runner.isolated_filesystem():
                with open(SCHEMA_FILE, "w") as f:
                    f.write(
                        """
    id: http://example.org/test
    name: test
    
    classes:
      Person:
        description: An individual human
    """
                    )
    
                result = self.runner.invoke(main, [SCHEMA_FILE])
>               self.assertEqual(result.exit_code, 0)
E               AssertionError: 1 != 0

tests/test_linter/test_cli.py:184: AssertionError
_____________________________ TestLinterCli.test_processes_dot_files_in_directory_when_a_option_provided _____________________________

self = <tests.test_linter.test_cli.TestLinterCli testMethod=test_processes_dot_files_in_directory_when_a_option_provided>

        def test_processes_dot_files_in_directory_when_a_option_provided(self):
            schema_dir = Path("schemas")
            schema_a = schema_dir / "schema_a.yaml"
            schema_b = schema_dir / ".schema_b.yaml"
            with self.runner.isolated_filesystem():
                schema_dir.mkdir()
                schema_a.write_text(
                    """
    id: http://example.org/test_a
    name: test_a
    
    classes:
      person:
        description: An individual human
    """
                )
                schema_b.write_text(
                    """
    id: http://example.org/test_b
    name: test_b
    
    slots:
      a slot:
        description: A slot to hold thing
    """
                )
    
                result = self.runner.invoke(main, ["-a", str(schema_dir)])
                self.assertEqual(result.exit_code, 1)
>               self.assertIn(str(schema_a), result.stdout)
E               AssertionError: 'schemas/schema_a.yaml' not found in "/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/tmpf87woivi/schemas/.schema_b.yaml\n  warning  Slot has name 'a slot'  (standard_naming)\n"

tests/test_linter/test_cli.py:283: AssertionError
____________________________ TestLinterCli.test_processes_dot_files_in_directory_when_all_option_provided ____________________________

self = <tests.test_linter.test_cli.TestLinterCli testMethod=test_processes_dot_files_in_directory_when_all_option_provided>

        def test_processes_dot_files_in_directory_when_all_option_provided(self):
            schema_dir = Path("schemas")
            schema_a = schema_dir / "schema_a.yaml"
            schema_b = schema_dir / ".schema_b.yaml"
            with self.runner.isolated_filesystem():
                schema_dir.mkdir()
                schema_a.write_text(
                    """
    id: http://example.org/test_a
    name: test_a
    
    classes:
      person:
        description: An individual human
    """
                )
                schema_b.write_text(
                    """
    id: http://example.org/test_b
    name: test_b
    
    slots:
      a slot:
        description: A slot to hold thing
    """
                )
    
                result = self.runner.invoke(main, ["--all", str(schema_dir)])
                self.assertEqual(result.exit_code, 1)
>               self.assertIn(str(schema_a), result.stdout)
E               AssertionError: 'schemas/schema_a.yaml' not found in "/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/tmps6ym618p/schemas/.schema_b.yaml\n  warning  Slot has name 'a slot'  (standard_naming)\n"

tests/test_linter/test_cli.py:317: AssertionError
_________________________________________________ TestLinterCli.test_validate_schema _________________________________________________

self = <tests.test_linter.test_cli.TestLinterCli testMethod=test_validate_schema>

        def test_validate_schema(self):
            with self.runner.isolated_filesystem():
                with open(SCHEMA_FILE, "w") as f:
                    f.write(
                        """
    id: http://example.org/test
    classes:
        person:
            description: a person
    """
                    )
    
                result = self.runner.invoke(main, ["--validate", SCHEMA_FILE])
>               self.assertEqual(result.exit_code, 2)
E               AssertionError: 1 != 2

tests/test_linter/test_cli.py:335: AssertionError
--------------------------------------------------------- Captured log call ----------------------------------------------------------
INFO     root:generator.py:199 Using SchemaView with im=None
INFO     root:schemaview.py:206 Importing linkml:units as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/units from source /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/meta.yaml; base_dir=None
INFO     root:schemaview.py:206 Importing linkml:mappings as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/mappings from source /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/meta.yaml; base_dir=None
INFO     root:schemaview.py:206 Importing linkml:types as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/types from source /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/meta.yaml; base_dir=None
INFO     root:schemaview.py:206 Importing linkml:annotations as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/annotations from source /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/meta.yaml; base_dir=None
INFO     root:schemaview.py:206 Importing linkml:extensions as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/extensions from source /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/meta.yaml; base_dir=None
___________________________________________ TestCanonicalPrefixesRule.test_custom_context ____________________________________________

self = <tests.test_linter.test_rule_canonical_prefixes.TestCanonicalPrefixesRule testMethod=test_custom_context>

    def test_custom_context(self):
        builder = SchemaBuilder()
        builder.add_prefix("CHEBI", "http://purl.obolibrary.org/obo/CHEBI_")
        builder.add_prefix("GEO", "http://identifiers.org/geo/")
        builder.add_prefix("UPHENO", "http://example.com/wrong/upheno_")
        builder.add_prefix("WRONG", "http://identifiers.org/orcid/")
    
        schema_view = SchemaView(builder.schema)
        config = CanonicalPrefixesConfig(
            level=RuleLevel.error,
            prefixmaps_contexts=["bioregistry.upper", "prefixcc", "obo"],
        )
    
        rule = CanonicalPrefixesRule(config)
>       problems = list(rule.check(schema_view))

tests/test_linter/test_rule_canonical_prefixes.py:52: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
linkml/linter/rules.py:256: in check
    context = load_multi_context(self.config.prefixmaps_contexts)
../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/io/parser.py:46: in load_multi_context
    ctxt.combine(load_context(n, refresh=refresh))
../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/datamodel/context.py:161: in combine
    self.add_prefix(pe.prefix, pe.namespace, pe.status)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = Context(name='bioregistry.upper+prefixcc+obo', description=None, prefix_expansions=[], comments=None, location=None, f..., merged_from=None, upper=None, lower=None, _prefixes=None, _prefixes_lower=[], _namespaces=None, _namespaces_lower=[])
prefix = '3DMET', namespace = 'http://identifiers.org/3dmet/', status = <StatusType.canonical: 'canonical'>, preferred = False

    def add_prefix(
        self,
        prefix: PREFIX,
        namespace: NAMESPACE,
        status: StatusType = StatusType.canonical,
        preferred: bool = False,
    ):
        """
        Adds a prefix expansion to this context.
    
        The current context stays canonical. Additional prefixes
        added may be classified as non-canonical.
    
        If upper or lower is set for this context, the
        prefix will be auto-case normalized,
        UNLESS preferred=True
    
        :param prefix: prefix to be added
        :param namespace: namespace to be added
        :param status: the status of the prefix being added
        :param preferred:
        :return:
        """
        # TODO: check status
        if not preferred:
            if self.upper:
                prefix = prefix.upper()
                if self.lower:
                    raise ValueError("Cannot set both upper AND lower")
            if self.lower:
                prefix = prefix.lower()
        prefixes = self.prefixes(lower=True)
        namespaces = self.namespaces(lower=True)
        if prefix.lower() in prefixes:
            if namespace.lower() in namespaces:
                return
                # status = StatusType.multi_alias
            else:
                status = StatusType.prefix_alias
                self._namespaces.append(namespace)
                self._namespaces_lower.append(namespace.lower())
        else:
>           self._prefixes.append(prefix)
E           AttributeError: 'NoneType' object has no attribute 'append'

../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/datamodel/context.py:205: AttributeError
_______________________________________ TestCanonicalPrefixesRule.test_default_merged_context ________________________________________

self = <tests.test_linter.test_rule_canonical_prefixes.TestCanonicalPrefixesRule testMethod=test_default_merged_context>

    def test_default_merged_context(self):
        builder = SchemaBuilder()
        builder.add_prefix("CHEBI", "http://purl.obolibrary.org/obo/CHEBI_")
        builder.add_prefix("GEO", "http://purl.obolibrary.org/obo/GEO_")
        builder.add_prefix("UPHENO", "http://example.com/wrong/upheno_")
        builder.add_prefix("WRONG", "http://identifiers.org/orcid/")
    
        schema_view = SchemaView(builder.schema)
        config = CanonicalPrefixesConfig(level=RuleLevel.error, prefixmaps_contexts=["merged"])
    
        rule = CanonicalPrefixesRule(config)
>       problems = list(rule.check(schema_view))

tests/test_linter/test_rule_canonical_prefixes.py:22: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
linkml/linter/rules.py:256: in check
    context = load_multi_context(self.config.prefixmaps_contexts)
../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/io/parser.py:46: in load_multi_context
    ctxt.combine(load_context(n, refresh=refresh))
../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/datamodel/context.py:161: in combine
    self.add_prefix(pe.prefix, pe.namespace, pe.status)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = Context(name='merged', description=None, prefix_expansions=[], comments=None, location=None, format=None, merged_from=None, upper=None, lower=None, _prefixes=None, _prefixes_lower=[], _namespaces=None, _namespaces_lower=[])
prefix = '3DMET', namespace = 'http://identifiers.org/3dmet/', status = <StatusType.canonical: 'canonical'>, preferred = False

    def add_prefix(
        self,
        prefix: PREFIX,
        namespace: NAMESPACE,
        status: StatusType = StatusType.canonical,
        preferred: bool = False,
    ):
        """
        Adds a prefix expansion to this context.
    
        The current context stays canonical. Additional prefixes
        added may be classified as non-canonical.
    
        If upper or lower is set for this context, the
        prefix will be auto-case normalized,
        UNLESS preferred=True
    
        :param prefix: prefix to be added
        :param namespace: namespace to be added
        :param status: the status of the prefix being added
        :param preferred:
        :return:
        """
        # TODO: check status
        if not preferred:
            if self.upper:
                prefix = prefix.upper()
                if self.lower:
                    raise ValueError("Cannot set both upper AND lower")
            if self.lower:
                prefix = prefix.lower()
        prefixes = self.prefixes(lower=True)
        namespaces = self.namespaces(lower=True)
        if prefix.lower() in prefixes:
            if namespace.lower() in namespaces:
                return
                # status = StatusType.multi_alias
            else:
                status = StatusType.prefix_alias
                self._namespaces.append(namespace)
                self._namespaces_lower.append(namespace.lower())
        else:
>           self._prefixes.append(prefix)
E           AttributeError: 'NoneType' object has no attribute 'append'

../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/datamodel/context.py:205: AttributeError
========================================================== warnings summary ==========================================================
linkml/utils/generator.py:685
  /Users/jonny/git/forks/linkml/linkml/utils/generator.py:685: DeprecationWarning: invalid escape sequence '\_'
    """

../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/pyshexc/parser/ShExDocLexer.py:4
  /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/pyshexc/parser/ShExDocLexer.py:4: DeprecationWarning: typing.io is deprecated, import directly from typing instead. typing.io will be removed in Python 3.12.
    from typing.io import TextIO

linkml/linter/linter.py:38
  /Users/jonny/git/forks/linkml/linkml/linter/linter.py:38: DeprecationWarning: Importing Validator directly from the jsonschema package is deprecated and will become an ImportError. Import it from jsonschema.protocols instead.
    def get_metamodel_validator() -> jsonschema.Validator:

../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/jupyter_client/connect.py:22
  /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/jupyter_client/connect.py:22: DeprecationWarning: Jupyter is migrating its paths to use standard platformdirs
  given by the platformdirs library.  To remove this warning and
  see the appropriate new directories, set the environment variable
  `JUPYTER_PLATFORM_DIRS=1` and then run `jupyter --paths`.
  The use of platformdirs will be the default in `jupyter_core` v6
    from jupyter_core.paths import jupyter_data_dir, jupyter_runtime_dir, secure_write

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
====================================================== short test summary info =======================================================
FAILED tests/test_issues/test_issue_494.py::test_jsonschema_validation - AssertionError: Regex pattern did not match.
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_config_extends_recommended - AssertionError: 1 != 2
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_directory_of_files - AssertionError: 'schemas/schema_a.yaml' not found in "/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/tmp0wrjounp/schema...
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_no_config - AssertionError: 1 != 2
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_no_schema_errors - AssertionError: 1 != 0
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_processes_dot_files_in_directory_when_a_option_provided - AssertionError: 'schemas/schema_a.yaml' not found in "/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/tmpf87woivi/schema...
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_processes_dot_files_in_directory_when_all_option_provided - AssertionError: 'schemas/schema_a.yaml' not found in "/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/tmps6ym618p/schema...
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_validate_schema - AssertionError: 1 != 2
FAILED tests/test_linter/test_rule_canonical_prefixes.py::TestCanonicalPrefixesRule::test_custom_context - AttributeError: 'NoneType' object has no attribute 'append'
FAILED tests/test_linter/test_rule_canonical_prefixes.py::TestCanonicalPrefixesRule::test_default_merged_context - AttributeError: 'NoneType' object has no attribute 'append'
ERROR tests/test_workspaces/test_example_runner.py::test_load_from_dict - AttributeError: 'NoneType' object has no attribute 'append'
ERROR tests/test_workspaces/test_example_runner.py::test_example_runner - AttributeError: 'NoneType' object has no attribute 'append'
ERROR tests/test_workspaces/test_example_runner.py::test_example_runner_non_defaults - AttributeError: 'NoneType' object has no attribute 'append'

Mar 05 '24 23:03 sneakers-the-rat

OK figured out what was wrong - it was just that i had installed my local version of prefixmaps in my linkml environment to test https://github.com/linkml/prefixmaps/pull/69 and hadn't fixed a bug in it.

tests pass now, except for test_issue_494.py which fails because of this, which is pending in another PR: https://github.com/linkml/linkml/pull/1704/commits/f148a1882df1fd162f24b6dc8b192491116dfa16

this is good 2 merge i think, but go ahead and test it urself!

Mar 16 '24 04:03 sneakers-the-rat

triggering a manual build to test the new workflow in

#316

Apr 02 '24 21:04 cmungall

Omg I hope it works, lmk if it doesnt and can debug

Apr 02 '24 21:04 sneakers-the-rat

Ok now that the action is in main i can test it on my fork. Sorry for the runaround with these CI actions, its specifically a weird thing for forks. Ill make it work

Apr 02 '24 22:04 sneakers-the-rat

It looks to me like the issue in this ticket got resolved (checks are all successful), and we can merge it in. Re-requested review from Chris as it is currently blocking merge (I could also re-review if Chris is swamped).

Dec 20 '24 17:12 sierra-moxon

forgot about this one - re: ongoing conversation in other PR "do we consider attributes to be slots" there is one functional difference that i think we would need to change. in the include_induced leg we don't check for attributes that are directly declared on the class, so if we want that then i would just have to add an additional check for whether slot_name in c.attributes.

But since the thing that is passed is an actual SlotDefinition rather than just the name of a slot/attribute, it seems like that indicates that one is looking for that exact slot in particular - e.g. if someone passed in the slot with slot_name, then i wouldn't expect to receive an attribute with slot_name because that's a different object. so it also seems like this should take a string rather than a SlotDefinition to avoid that ambiguity

Dec 20 '24 21:12 sneakers-the-rat

Optimize `get_classes_by_slot` round 2

Codecov Report