
Autogen per-test docs for discoverability

ShadowJonathan opened this issue 4 years ago · 3 comments

While working on importing sytests, I found myself Ctrl+Shift+F-ing on keywords to see which tests already existed and whether a given sytest would fit a file. I think this points to poor test discoverability.

While documenting some sytests (and their behaviour), I arrived at something like the following format:

sytest:
  - name: "Non-present room members cannot ban others"
    type:
      - power_level
      - ban
    desc:
      When a user is not present in a room but has power levels there, they should not be able to ban users.
    variants:
      - user has never entered the room
      - user has left the room
      - user has been kicked while ban is sent out
      - user server has been ACL-d
  - name: "New federated private chats get full presence information (SYN-115)"
    # ...

I think it would be useful to programmatically document (and maybe link) these tests, so that a quick glance shows which tests exist, what each one does, which variants of behaviour it also covers, and which areas of Matrix testing it belongs to.


As a final addition to the above schema, I think a path (something like tests/csapi/rooms_state_test.go:TestRoomCreationReportsEventsToMyself) would help auto-identify the corresponding test function.
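
A hypothetical set of Go types that could load such a manifest, including the proposed path field. The yaml.v3 dependency, the file name, and the Manifest wrapper are assumptions on my part, not an agreed format:

```go
package main

import (
	"fmt"
	"os"

	"gopkg.in/yaml.v3"
)

// SytestDoc mirrors one entry of the schema sketched above.
type SytestDoc struct {
	Name     string   `yaml:"name"`
	Type     []string `yaml:"type"`     // tags such as power_level, ban
	Desc     string   `yaml:"desc"`     // human-readable behaviour summary
	Variants []string `yaml:"variants"` // behavioural variants the test covers
	Path     string   `yaml:"path"`     // e.g. tests/csapi/rooms_state_test.go:TestName
}

// Manifest is the top-level document: a list of documented sytests.
type Manifest struct {
	Sytest []SytestDoc `yaml:"sytest"`
}

func main() {
	raw, err := os.ReadFile("sytest-docs.yaml") // hypothetical file name
	if err != nil {
		panic(err)
	}
	var m Manifest
	if err := yaml.Unmarshal(raw, &m); err != nil {
		panic(err)
	}
	for _, t := range m.Sytest {
		fmt.Printf("%s %v\n  %s\n", t.Name, t.Type, t.Desc)
	}
}
```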

ShadowJonathan · Dec 01 '21 17:12

So there have been attempts by me in the past to group sytests into coherent sections that are useful to end users: this is what "are we synapse yet?" is. It assigns each test a three-letter code that determines the grouping, resulting in output like:

Are We Synapse Yet?
===================

Non-Spec APIs: 91% (39/43 tests)
--------------
  Non-Spec API             :  80% (4/5 tests)
  MSCs                     :  78% (7/9 tests)
  Unknown API (no group specified):  97% (28/29 tests)

Client-Server APIs: 65% (396/608 tests)
-------------------
  Registration             :  79% (27/34 tests)
  Login                    :  58% (11/19 tests)
  Misc CS APIs             : 100% (1/1 tests)
  Profile                  : 100% (6/6 tests)
  Devices                  :  92% (11/12 tests)
  Presence                 :   7% (1/15 tests)
  Create Room              : 100% (13/13 tests)
  Sync API                 :  67% (53/79 tests)
  Room Membership          :  86% (12/14 tests)
  Room State APIs          :  93% (14/15 tests)
  Public Room APIs         : 100% (7/7 tests)
  Room Aliases             :  80% (12/15 tests)
  Joining Rooms            :  88% (7/8 tests)
  Leaving Rooms            : 100% (1/1 tests)
  Inviting users to Rooms  : 100% (13/13 tests)
  Banning users            :  67% (2/3 tests)
  Sending events           : 100% (2/2 tests)
  Getting events for Rooms :  80% (8/10 tests)
  Typing API               : 100% (5/5 tests)
  Receipts                 : 100% (2/2 tests)
  Read markers             : 100% (1/1 tests)
  Media APIs               :  90% (19/21 tests)
  Capabilities API         : 100% (2/2 tests)
  Logout                   : 100% (4/4 tests)
  Push APIs                :   8% (5/59 tests)
  Account APIs             :  90% (9/10 tests)
  V1 CS APIs               :  50% (1/2 tests)
  Ephemeral Events         :   0% (0/1 tests)
  Power Levels             : 100% (6/6 tests)
  Redaction                : 100% (5/5 tests)
  Third-Party ID APIs      :  16% (3/19 tests)
  Guest APIs               :  50% (12/24 tests)
  Room Auth                :  44% (8/18 tests)
  Forget APIs              : 100% (5/5 tests)
  Context APIs             :   0% (0/4 tests)
  Room Upgrade APIs        :   0% (0/21 tests)
  Room Versions            :  98% (51/52 tests)
  Device Keys              :  82% (14/17 tests)
  Device Key Backup        : 100% (10/10 tests)
  Cross-signing Keys       :  75% (6/8 tests)
  Tagging APIs             :  75% (6/8 tests)
  Search APIs              :   0% (0/6 tests)
  OpenID API               : 100% (3/3 tests)
  Send-to-Device APIs      : 100% (10/10 tests)
  Server Admin API         : 100% (1/1 tests)
  Ignore Users             :   0% (0/3 tests)
  User Directory APIs      :  36% (4/11 tests)
  Enforced canonical JSON  : 100% (3/3 tests)

Federation APIs: 91% (101/111 tests)
----------------
  State APIs               :  92% (12/13 tests)
  Device Key APIs          :  67% (6/9 tests)
  Send-to-Device APIs      : 100% (2/2 tests)
  Key API                  : 100% (6/6 tests)
  Query API                : 100% (5/5 tests)
  send_join API            : 100% (10/10 tests)
  make_join API            : 100% (3/3 tests)
  Auth                     : 100% (19/19 tests)
  room versions            : 100% (7/7 tests)
  Federation API           :  77% (10/13 tests)
  get_missing_events API   :  57% (4/7 tests)
  Backfill API             : 100% (5/5 tests)
  Invite API               : 100% (10/10 tests)
  send_leave API           : 100% (1/1 tests)
  Public Room API          : 100% (1/1 tests)

Application Services APIs: 52% (13/25 tests)
--------------------------
  Application Services API :  52% (13/25 tests)
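
For illustration, the tallying behind a report like this boils down to counting passes per group code. Here is a minimal sketch in Go under that assumption; the real AWSY tooling is a separate script, and all names here are invented:

```go
package main

import "fmt"

// Result is a single test outcome tagged with its group code.
type Result struct {
	Group  string // three-letter code, e.g. "reg" for Registration
	Passed bool
}

// report prints per-group pass percentages in the style shown above.
func report(groupNames map[string]string, results []Result) {
	pass, total := map[string]int{}, map[string]int{}
	for _, r := range results {
		total[r.Group]++
		if r.Passed {
			pass[r.Group]++
		}
	}
	for code, name := range groupNames {
		if total[code] == 0 {
			continue
		}
		p, t := pass[code], total[code]
		fmt.Printf("  %-25s: %3d%% (%d/%d tests)\n", name, 100*p/t, p, t)
	}
}

func main() {
	report(
		map[string]string{"reg": "Registration", "pre": "Presence"},
		[]Result{{"reg", true}, {"reg", false}, {"pre", true}},
	)
}
```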

The intention of this grouping was to automatically give end users who may want to run Dendrite a rough idea of which features were implemented and which were not. There are other ways you can group tests, though, which are useful to different audiences:

  • By the endpoints used in the tests. Useful for HS developers to know which endpoints they need to implement in order to test a feature.
  • By the spec section. Useful for spec compliance.
  • By feature dependency. For example, you cannot test changing the room name until you can set state events, which requires being able to create a room, which in turn requires being able to create a user (see the sketch after this list).
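
A rough sketch of that dependency ordering: prerequisites are modelled as a graph, and a depth-first topological sort yields an order in which features can be tested. The feature names come from the example above; everything else is invented:

```go
package main

import "fmt"

// deps maps each feature to the features it depends on.
var deps = map[string][]string{
	"create user":      {},
	"create room":      {"create user"},
	"set state events": {"create room"},
	"change room name": {"set state events"},
}

// order returns the features so that prerequisites come first
// (a depth-first topological sort).
func order() []string {
	var out []string
	seen := map[string]bool{}
	var visit func(f string)
	visit = func(f string) {
		if seen[f] {
			return
		}
		seen[f] = true
		for _, d := range deps[f] {
			visit(d)
		}
		out = append(out, f)
	}
	for f := range deps {
		visit(f)
	}
	return out
}

func main() { fmt.Println(order()) }
```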

It's unclear to me how the tests should be grouped in order for it to be most useful to the target audience: test writers.

kegsay · Dec 01 '21 18:12

(Keep in mind that this is a different issue from https://github.com/matrix-org/complement/issues/241, which is more about file layout; this one is about creating a "lookup/search directory".)

I think that attaching behaviour and variant information to tests could already help test writers, but your comment on the three different categories of categorisation gives me an idea: include that additional information (endpoints, spec area, features) alongside this listing of tests, so that it could be displayed on a website much like AWSY, only somewhat more akin to the Clippy lint directory.

In short: I think that for test writers it could be good enough to provide a central reference list of tests and some corresponding documentation, but that for homeserver developers, more metadata about what exactly is tested would be helpful.

I don't think such a list should be hierarchical; rather, it should tag the areas of interest that a particular test touches.
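
A minimal sketch of that tag-based lookup, assuming hypothetical types: an inverted index from tag to test names, rather than a tree:

```go
package main

import "fmt"

// TestDoc is a documented test with the tags it touches.
type TestDoc struct {
	Name string
	Tags []string // e.g. "power_level", "ban", "federation"
}

// indexByTag builds an inverted index so a directory page can list
// every test touching a given area.
func indexByTag(tests []TestDoc) map[string][]string {
	idx := map[string][]string{}
	for _, t := range tests {
		for _, tag := range t.Tags {
			idx[tag] = append(idx[tag], t.Name)
		}
	}
	return idx
}

func main() {
	idx := indexByTag([]TestDoc{
		{Name: "Non-present room members cannot ban others", Tags: []string{"power_level", "ban"}},
	})
	fmt.Println(idx["ban"])
}
```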

ShadowJonathan · Dec 01 '21 19:12

> In short: I think that for test writers it could be good enough to provide a central reference list of tests and some corresponding documentation,

I think the solution here is to just autogenerate a directory listing of tests from (package, test name), and we can automatically pull out the top-level test comment as well. It'll make it slightly easier to Ctrl+F. We already do something like this for https://github.com/matrix-org/complement/blob/main/ENVIRONMENT.md
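
A minimal sketch of such a generator, using Go's standard go/parser to pull the doc comment off every top-level TestXxx function. The tests directory and the output format are assumptions on my part:

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"os"
	"path/filepath"
	"strings"
)

func main() {
	root := "tests" // assumed location of Complement's test packages
	filepath.WalkDir(root, func(path string, d os.DirEntry, err error) error {
		if err != nil || d.IsDir() || !strings.HasSuffix(path, "_test.go") {
			return err
		}
		fset := token.NewFileSet()
		// ParseComments keeps doc comments attached to declarations.
		f, err := parser.ParseFile(fset, path, nil, parser.ParseComments)
		if err != nil {
			return err
		}
		for _, decl := range f.Decls {
			fn, ok := decl.(*ast.FuncDecl)
			if !ok || !strings.HasPrefix(fn.Name.Name, "Test") {
				continue
			}
			// fn.Doc.Text() returns "" for a nil comment group.
			fmt.Printf("### %s (%s)\n%s\n", fn.Name.Name, path, fn.Doc.Text())
		}
		return nil
	})
}
```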

Renaming issue.

kegsay · Oct 04 '23 09:10