Refactor Cluster and Catalog mods
Describe This Problem
A rough dependency graph of related structures:
Cluster and CatalogManager are public; they may be used by Client or CeresMeta to modify catalog information (create, open, or delete tables). This graph shows two paths for creating/dropping a table: the green one is triggered by the server itself, and the orange one is triggered by CeresMeta.
- Green path
CatalogManager can create/drop tables when it has write access (the leader role). Step 3-1 in this path notifies CeresMeta that a new table has been created, and step 3-2 happens simultaneously, creating/dropping that table in the leader itself.
- Orange path
This path can be treated as the counterpart of the green one. CeresMeta may notify a non-leader node that something has changed. It also goes through EventHandler to notify Cluster and let it take action.
Proposal
From my first impression, the dependency and call graph is complicated. I'd like to change the related parts in the following ways:
- [ ] Combine VolatileCatalog and Cluster #245
Cluster is responsible for managing Tables and Shards, while Catalog is responsible for managing Schemas, which group tables from a perspective different from Shard. Their functionalities overlap substantially. It's natural to implement both the Cluster and Catalog traits over one struct, so they can share one in-memory state and possibly one operation logic.
- [x] Remove SchemaIdAlloc and TableIdAlloc #238
IDs should be allocated by CeresMeta, not the server. Furthermore, the server does not need to allocate an ID and then create the table; the entire procedure is accomplished by CeresMeta.
- [ ] Split MetaClient
The current MetaClient provides two functionalities: polling updates from CeresMeta and invoking CeresMeta's methods. These correspond to the two paths above: green arrows go from the server to CeresMeta and orange arrows go from CeresMeta to the server. We can split these two aspects by making MetaClient an actual client that invokes RPCs to CeresMeta, plus an Eventloop that keeps fetching updates from CeresMeta. This can also resolve the cyclic dependency between Cluster and MetaClient.
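The MetaClient split can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the trait names EventSource and EventSink, the MetaEvent enum, and the in-memory FakeMeta, QueuedEvents, and RecordingCluster stand-ins are all assumptions made for the example. The request side (server to CeresMeta) and the polling side (CeresMeta to server) become separate types, and the event loop only knows a sink trait, so Cluster and MetaClient no longer need to reference each other.

```rust
use std::collections::VecDeque;

/// Hypothetical event pushed from CeresMeta to a server (orange path).
#[derive(Debug, Clone, PartialEq)]
pub enum MetaEvent {
    OpenShard(u32),
    CloseShard(u32),
}

/// Request side only: the server invokes CeresMeta (green path).
pub trait MetaClient {
    /// Asks CeresMeta to create a table and returns the ID it allocated.
    fn create_table(&mut self, schema: &str, table: &str) -> Result<u64, String>;
}

/// Polling side only: anything the event loop can fetch updates from.
pub trait EventSource {
    fn poll(&mut self) -> Option<MetaEvent>;
}

/// Receiver of events; a real `Cluster` would implement this.
pub trait EventSink {
    fn on_event(&mut self, event: MetaEvent);
}

/// Keeps fetching updates and hands them to a sink.
pub struct EventLoop<S: EventSource> {
    source: S,
}

impl<S: EventSource> EventLoop<S> {
    pub fn new(source: S) -> Self {
        Self { source }
    }

    /// Drains all currently pending events and returns how many were delivered.
    pub fn run_once(&mut self, sink: &mut dyn EventSink) -> usize {
        let mut delivered = 0;
        while let Some(event) = self.source.poll() {
            sink.on_event(event);
            delivered += 1;
        }
        delivered
    }
}

/// In-memory stand-in for the RPC client (assumption for the sketch).
pub struct FakeMeta {
    pub next_id: u64,
}

impl MetaClient for FakeMeta {
    fn create_table(&mut self, _schema: &str, _table: &str) -> Result<u64, String> {
        let id = self.next_id;
        self.next_id += 1;
        Ok(id)
    }
}

/// In-memory stand-in for the update stream (assumption for the sketch).
pub struct QueuedEvents(pub VecDeque<MetaEvent>);

impl EventSource for QueuedEvents {
    fn poll(&mut self) -> Option<MetaEvent> {
        self.0.pop_front()
    }
}

/// Stand-in for `Cluster` that only records what it was told.
#[derive(Default)]
pub struct RecordingCluster {
    pub seen: Vec<MetaEvent>,
}

impl EventSink for RecordingCluster {
    fn on_event(&mut self, event: MetaEvent) {
        self.seen.push(event);
    }
}
```

With this shape, the component that owns both halves wires them together at startup; neither trait mentions the other, which is one way the cycle could be avoided.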
Additional Context
No response
@waynexia I can't agree more with this proposal.
During development, I found it hard to combine VolatileCatalog and Cluster, because the create/drop/open/close procedures are too complex for the combined module to handle in cluster mode. For example, in the create table procedure, the combined module would have to handle two different cases: creation by the user (the request is forwarded to ceresmeta) and creation by ceresmeta (which does the real creation). That is ugly.
Here is the new proposal. It still keeps VolatileCatalog and Cluster separate, but the event handler will be removed, and the TableManager in Cluster will be exposed and held by VolatileCatalog:
┌───────────┐
│  Cluster  │───────Create/Drop/Open/Close table
└───────────┘                  │
      │                        ▼
      │               ┌─────────────────┐
      │               │TableManipulator │───────────┐
      │               └─────────────────┘           │
      │                                             ▼
      │                                     ┌──────────────┐
      │                                     │CatalogManager│
      │                                     └──────────────┘
      ▼                                             │
┌────────────┐                                      │
│TableManager│◀─────────────────────────────────────┘
└────────────┘
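One way to read "exposed and held by VolatileCatalog" is a shared, reference-counted TableManager. The sketch below assumes that reading; the method names (`table_manager`, `open_table`, `table_id`) and the use of a plain name-to-ID map are illustrative assumptions, not the project's real API.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical table registry shared between Cluster and the catalog.
#[derive(Default)]
pub struct TableManager {
    tables: RwLock<HashMap<String, u64>>,
}

impl TableManager {
    /// Registers a table; returns false if the name is already taken.
    pub fn insert(&self, name: &str, id: u64) -> bool {
        let mut tables = self.tables.write().unwrap();
        if tables.contains_key(name) {
            return false;
        }
        tables.insert(name.to_string(), id);
        true
    }

    /// Removes a table and returns its ID if it existed.
    pub fn remove(&self, name: &str) -> Option<u64> {
        self.tables.write().unwrap().remove(name)
    }

    /// Looks up a table ID by name.
    pub fn get(&self, name: &str) -> Option<u64> {
        self.tables.read().unwrap().get(name).copied()
    }
}

/// Cluster owns the manager and exposes a handle to it.
pub struct Cluster {
    table_manager: Arc<TableManager>,
}

impl Cluster {
    pub fn new() -> Self {
        Self { table_manager: Arc::new(TableManager::default()) }
    }

    /// Exposes the shared manager so the catalog can hold it.
    pub fn table_manager(&self) -> Arc<TableManager> {
        Arc::clone(&self.table_manager)
    }

    /// Opens a table, e.g. on instruction from CeresMeta.
    pub fn open_table(&self, name: &str, id: u64) -> bool {
        self.table_manager.insert(name, id)
    }

    /// Closes a table and returns its ID if it was open.
    pub fn close_table(&self, name: &str) -> Option<u64> {
        self.table_manager.remove(name)
    }
}

/// The catalog reads through the same manager instead of keeping its own copy.
pub struct VolatileCatalog {
    table_manager: Arc<TableManager>,
}

impl VolatileCatalog {
    pub fn new(table_manager: Arc<TableManager>) -> Self {
        Self { table_manager }
    }

    pub fn table_id(&self, name: &str) -> Option<u64> {
        self.table_manager.get(name)
    }
}
```

Because both sides hold the same `Arc`, a table opened through Cluster becomes visible to the catalog without an event handler relaying the change.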
As for the create/drop table procedure, a new proxy called TableCreator will be added to handle different cases:
         ┌─────────────┐
      ┌──│TableCreator │────┐
      │  └─────────────┘    │
      │                     │
      ▼                     ▼
┌───────────┐         ┌───────────┐
│ From user │         │ From meta │
└───────────┘         └───────────┘
      │                     │
      ▼                     ▼
┌───────────┐         ┌───────────┐
│Meta client│         │  Cluster  │
└───────────┘         └───────────┘
                            │
                            ▼
                      ┌───────────┐
                      │CatalogMana│
                      └───────────┘
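The TableCreator dispatch might look something like the sketch below. Everything here is hypothetical: the Origin and Outcome enums, the two small traits, and the recording stand-ins exist only to show the shape of the proxy, with user requests forwarded to the meta client and meta requests applied locally through the cluster.

```rust
/// Who asked for the table to be created (assumed for this sketch).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Origin {
    User,
    Meta,
}

/// What the proxy ended up doing with the request.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    ForwardedToMeta,
    CreatedLocally,
}

/// Request side towards CeresMeta (hypothetical minimal interface).
pub trait MetaClient {
    fn request_create(&mut self, table: &str) -> Result<(), String>;
}

/// Local side that performs the real creation (hypothetical minimal interface).
pub trait Cluster {
    fn create_local(&mut self, table: &str) -> Result<(), String>;
}

/// Proxy that hides the two cases from callers.
pub struct TableCreator<M, C> {
    pub meta: M,
    pub cluster: C,
}

impl<M: MetaClient, C: Cluster> TableCreator<M, C> {
    pub fn create(&mut self, origin: Origin, table: &str) -> Result<Outcome, String> {
        match origin {
            // A user request is only forwarded; CeresMeta decides what happens next.
            Origin::User => {
                self.meta.request_create(table)?;
                Ok(Outcome::ForwardedToMeta)
            }
            // A request from CeresMeta triggers the real creation on this node.
            Origin::Meta => {
                self.cluster.create_local(table)?;
                Ok(Outcome::CreatedLocally)
            }
        }
    }
}

/// Stand-in meta client that records forwarded requests.
#[derive(Default)]
pub struct RecordingMeta {
    pub requested: Vec<String>,
}

impl MetaClient for RecordingMeta {
    fn request_create(&mut self, table: &str) -> Result<(), String> {
        self.requested.push(table.to_string());
        Ok(())
    }
}

/// Stand-in cluster that records local creations and rejects duplicates.
#[derive(Default)]
pub struct RecordingCluster {
    pub created: Vec<String>,
}

impl Cluster for RecordingCluster {
    fn create_local(&mut self, table: &str) -> Result<(), String> {
        if self.created.iter().any(|t| t == table) {
            return Err(format!("table {table} already exists"));
        }
        self.created.push(table.to_string());
        Ok(())
    }
}
```

A TableDropper could follow the same pattern, with the two branches calling drop operations instead.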
Here are the tracking tasks:
- [ ] Remove the current catalog based on cluster, and add table manager to volatile catalog;
- [ ] Add TableCreator & TableDropper, and refactor the create/drop table procedure;
- [ ] Implement the open/close shards of cluster;