Populate OSS Directory with description fields
Describe the feature you'd like to request
As the number of projects grows, it will get harder for users to ensure they are getting data from the project they care about. We should start using the description field in OSS Directory and/or auto-populating from GitHub org descriptions.
Describe the solution you'd like
If the description field is provided in OSS Directory, then that is the source of truth.
If not, then we auto-generate based on GitHub artifact(s):
- If the project is instantiated via a GitHub org, it should pull the description from the GitHub org space.
- If it's a single repo, then it can be pulled from the repo description instead.
- If it's a list of repos, then we can auto-generate text describing the number of repos (e.g., "This project reflects contributions made across X repos").
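The fallback order above could be sketched roughly as follows. This is only an illustration, not the actual OSO implementation: `resolve_description`, its parameters, and the lookup dicts keyed by org name and `owner/repo` are all hypothetical. It also assumes that a GitHub URL with one path segment is an org and one with two segments is a repo, and that an org description wins when a project lists both, which the issue does not specify.

```python
from typing import Dict, List, Optional, Sequence
from urllib.parse import urlparse


def _path_parts(url: str) -> List[str]:
    """Split a GitHub URL path into its non-empty segments."""
    return [part for part in urlparse(url).path.split("/") if part]


def resolve_description(
    directory_description: Optional[str],
    github_urls: Sequence[str],
    org_descriptions: Dict[str, str],   # hypothetical lookup: org name -> description
    repo_descriptions: Dict[str, str],  # hypothetical lookup: "owner/repo" -> description
) -> Optional[str]:
    # 1. OSS Directory is the source of truth when a description is provided.
    if directory_description and directory_description.strip():
        return directory_description.strip()

    # Classify artifacts: one path segment = org, two or more = repo (assumption).
    orgs: List[str] = []
    repos: List[str] = []
    for url in github_urls:
        parts = _path_parts(url)
        if len(parts) == 1:
            orgs.append(parts[0].lower())
        elif len(parts) >= 2:
            repos.append(f"{parts[0]}/{parts[1]}".lower())

    # 2. Project defined via a GitHub org: use the org description.
    for org in orgs:
        description = org_descriptions.get(org)
        if description:
            return description

    # 3. Single repo: use the repo description.
    if len(repos) == 1:
        return repo_descriptions.get(repos[0]) or None

    # 4. List of repos: generate a summary sentence.
    if len(repos) > 1:
        return f"This project reflects contributions made across {len(repos)} repos"

    return None
```

Returning `None` when nothing is found keeps "no description" distinguishable from an empty string downstream, for example in the mart.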
The description should also be included in the projects.sql mart and in the API.
Describe alternatives you've considered
Only using OSS Directory descriptions
Fair enough. It probably makes sense to join it into the projects intermediate model in a dbt model that runs after importOssDirectory.
Starting with enabling an optional description field in the project or collection files in oss-directory
https://github.com/opensource-observer/oss-directory/pull/274
I think the cloudquery plugin needs to be updated as well. I wonder if we can just have the cloudquery plugin use the JSON schema directly, rather than duck-typing it.
importOssDirectory from cloudquery grabs the description here: https://github.com/opensource-observer/oso/pull/1360/files
Follow-up work is to script a PR into oss-directory.
We've gotten contributions from growthepie and RF4 so far. Maybe we just want a transform that will look for empty Project descriptions and query GitHub for found orgs
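A minimal sketch of such a transform, under stated assumptions: the project record shape (`description`, `github_orgs`) and the function names are hypothetical and do not reflect the real oss-directory schema. The only external call is the public GitHub REST endpoint `GET /orgs/{org}`, whose JSON response includes a `description` field. The fetcher is injectable so the transform can be exercised without network access.

```python
import json
import urllib.request
from typing import Callable, Dict, List, Optional


def fetch_org_description(org: str) -> Optional[str]:
    """Fetch an org's description from the GitHub REST API (unauthenticated)."""
    request = urllib.request.Request(
        f"https://api.github.com/orgs/{org}",
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response).get("description") or None


def fill_empty_descriptions(
    projects: List[Dict],
    fetch: Callable[[str], Optional[str]] = fetch_org_description,
) -> List[Dict]:
    """Return copies of the projects, filling empty descriptions from GitHub orgs."""
    filled = []
    for project in projects:
        updated = dict(project)  # shallow copy; the input records are not mutated
        if not (updated.get("description") or "").strip():
            # Hypothetical field: list of GitHub org names found for the project.
            for org in updated.get("github_orgs", []):
                description = fetch(org)
                if description:
                    updated["description"] = description
                    break
        filled.append(updated)
    return filled
```

Projects that already have a description are left untouched, which keeps OSS Directory as the source of truth. Unauthenticated GitHub requests are rate limited, so a real run over many projects would likely need a token.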