Refactor Cluster and Catalog mods

Open waynexia opened this issue 3 years ago • 1 comments

Describe This Problem

A rough dependency graph of related structures:

[Figure: cluster dependency graph]

Cluster and CatalogManager are public, so they may be used by Client or CeresMeta to modify catalog information (create, open, or delete tables). The graph shows two paths for creating/dropping a table: the green one is triggered by the server itself, and the orange one is triggered by CeresMeta.

  • Green path: CatalogManager can create/drop tables when it has write access (the leader role). Step 3-1 in this path notifies CeresMeta that a new table has been created, and step 3-2 happens simultaneously, creating/dropping that table in the leader itself.
  • Orange path: This path can be treated as the counterpart of the green one. CeresMeta may notify a non-leader node that something has changed. It also goes through EventHandler to notify Cluster and let it take action.

Proposal

From my first impression, the dependency and call graph is complicated. I'd like to change the related parts in the following ways:

  • [ ] Combine VolatileCatalog and Cluster #245: Cluster is responsible for managing Tables and Shards, while Catalog is responsible for managing Schemas, which group tables from a different perspective than Shards. A large part of their functionality overlaps. It's natural to implement both the Cluster and Catalog traits on one struct, so they can share one in-memory state and possibly one operation logic.
  • [x] Remove SchemaIdAlloc and TableIdAlloc #238: IDs should be allocated by CeresMeta, not the server. Furthermore, the server does not need to allocate an ID and then create the table; the entire procedure is carried out by CeresMeta.
  • [ ] Split MetaClient: The current MetaClient provides two functionalities: polling updates from CeresMeta and invoking CeresMeta's methods. These correspond to the two paths above: green arrows go from the server to CeresMeta, and orange arrows go from CeresMeta to the server. We can split these two aspects by making MetaClient an actual client that invokes RPCs on CeresMeta, plus an Eventloop that keeps fetching updates from CeresMeta. This can also resolve the cyclic dependency between Cluster and MetaClient.
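
To make the MetaClient split concrete, here is a minimal sketch of the two directions as separate pieces. Every name in it (`MetaRpcClient`, `EventSource`, `EventSink`, `EventLoop`, `MetaEvent`, `FixedSource`) is hypothetical and only illustrates the shape of the idea; it is not the actual horaedb API, and the real version would be async and RPC-backed.

```rust
use std::collections::BTreeSet;

/// Hypothetical update pushed from CeresMeta to a server node.
#[derive(Debug, Clone, PartialEq)]
pub enum MetaEvent {
    TableCreated(String),
    TableDropped(String),
}

/// Server -> CeresMeta direction: a plain RPC client (assumed interface).
pub trait MetaRpcClient {
    fn create_table(&mut self, name: &str) -> Result<(), String>;
}

/// CeresMeta -> server direction: something that can be polled for updates.
pub trait EventSource {
    fn poll(&mut self) -> Vec<MetaEvent>;
}

/// Receiver of events. `Cluster` implements this instead of owning the poller.
pub trait EventSink {
    fn on_event(&mut self, event: MetaEvent);
}

/// The loop only knows the source and the sink, so `Cluster` never needs a
/// reference back to the polling half of the old `MetaClient`.
pub struct EventLoop<S: EventSource> {
    source: S,
}

impl<S: EventSource> EventLoop<S> {
    pub fn new(source: S) -> Self {
        Self { source }
    }

    /// Runs one polling round and returns how many events were delivered.
    pub fn tick(&mut self, sink: &mut dyn EventSink) -> usize {
        let events = self.source.poll();
        let delivered = events.len();
        for event in events {
            sink.on_event(event);
        }
        delivered
    }
}

/// In-memory stand-in for the cluster state.
#[derive(Default)]
pub struct Cluster {
    pub tables: BTreeSet<String>,
}

impl EventSink for Cluster {
    fn on_event(&mut self, event: MetaEvent) {
        match event {
            MetaEvent::TableCreated(name) => {
                self.tables.insert(name);
            }
            MetaEvent::TableDropped(name) => {
                self.tables.remove(&name);
            }
        }
    }
}

/// Test double that hands out a fixed batch of events exactly once.
pub struct FixedSource(pub Vec<MetaEvent>);

impl EventSource for FixedSource {
    fn poll(&mut self) -> Vec<MetaEvent> {
        std::mem::take(&mut self.0)
    }
}
```

With this layout, Cluster would depend only on `MetaRpcClient` for outgoing calls, and the event loop would depend only on the `EventSink` trait, so neither side needs to hold the other.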

Additional Context

No response

waynexia, Sep 06 '22 11:09

@waynexia 👍 I can't agree more with this proposal.

ShiKaiWi, Sep 06 '22 11:09

During development, I found it hard to combine VolatileCatalog and Cluster, because the create/drop/open/close procedures are too complex to be handled by a combined module in cluster mode. For example, in the create table procedure, the combined module would have to handle two different cases: creation requested by a user (the request is forwarded to ceresmeta) and creation requested by ceresmeta (the real creation is performed), and that is ugly.

Here is the new proposal: VolatileCatalog and Cluster stay separate, but the event handler is removed, and the TableManager in Cluster is exposed and held by VolatileCatalog:

 ┌───────────┐
 │  Cluster  │───────Create/Drop/Open/Close table
 └───────────┘                │
       │                      ▼
       │             ┌─────────────────┐
       │             │TableManipulator │───────────┐
       │             └─────────────────┘           │
       │                                           ▼
       │                                   ┌──────────────┐
       │                                   │CatalogManager│
       │                                   └──────────────┘
       ▼                                           │
┌────────────┐                                     │
│TableManager│◀────────────────────────────────────┘
└────────────┘
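
The arrangement where Cluster exposes its TableManager and VolatileCatalog holds it could look roughly like the sketch below. It assumes a shared handle via `Arc` and a simple name-to-ID map; the fields and methods (`table_manager()`, `open_table`, `close_table`, `table_id`) are illustrative assumptions, not the real horaedb types.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical shared table registry (table name -> table ID).
#[derive(Default)]
pub struct TableManager {
    tables: RwLock<HashMap<String, u64>>,
}

impl TableManager {
    pub fn insert(&self, name: &str, id: u64) {
        self.tables.write().unwrap().insert(name.to_string(), id);
    }

    pub fn remove(&self, name: &str) -> Option<u64> {
        self.tables.write().unwrap().remove(name)
    }

    pub fn table_id(&self, name: &str) -> Option<u64> {
        self.tables.read().unwrap().get(name).copied()
    }
}

/// Cluster owns the manager and mutates it when tables are opened or closed.
pub struct Cluster {
    table_manager: Arc<TableManager>,
}

impl Cluster {
    pub fn new() -> Self {
        Self { table_manager: Arc::new(TableManager::default()) }
    }

    /// Exposes the manager so that the catalog can hold a handle to it.
    pub fn table_manager(&self) -> Arc<TableManager> {
        Arc::clone(&self.table_manager)
    }

    /// Stand-in for handling an "open table" instruction.
    pub fn open_table(&self, name: &str, id: u64) {
        self.table_manager.insert(name, id);
    }

    /// Stand-in for "close table"; returns whether the table was open.
    pub fn close_table(&self, name: &str) -> bool {
        self.table_manager.remove(name).is_some()
    }
}

/// The catalog reads through the same manager, with no event handler between.
pub struct VolatileCatalog {
    table_manager: Arc<TableManager>,
}

impl VolatileCatalog {
    pub fn new(table_manager: Arc<TableManager>) -> Self {
        Self { table_manager }
    }

    pub fn table_id(&self, name: &str) -> Option<u64> {
        self.table_manager.table_id(name)
    }
}
```

Because both sides share one handle, a table opened through Cluster becomes visible to the catalog immediately, which is the role the removed event handler used to play.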

As for the create/drop table procedure, a new proxy called TableCreator will be added to handle different cases:

         ┌─────────────┐
      ┌──│TableCreator │────┐
      │  └─────────────┘    │
      │                     │
      ▼                     ▼
┌───────────┐         ┌───────────┐
│ From user │         │ From meta │
└───────────┘         └───────────┘
      │                     │
      ▼                     ▼
┌───────────┐         ┌───────────┐
│Meta client│         │  Cluster  │
└───────────┘         └───────────┘
                            │
                            ▼
                      ┌───────────┐
                      │CatalogMana│
                      └───────────┘
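
The dispatch in the diagram can be sketched as a small proxy that branches on where the request came from. The `Origin` and `Outcome` enums, the two traits, and the test doubles are assumptions made for illustration; the actual TableCreator may be shaped quite differently.

```rust
/// Where a create-table request came from (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Origin {
    User,
    Meta,
}

/// What the proxy did with the request.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    ForwardedToMeta,
    CreatedLocally,
}

/// Assumed interface for sending a create request to ceresmeta.
pub trait MetaClient {
    fn request_create_table(&mut self, name: &str) -> Result<(), String>;
}

/// Assumed interface for performing the real creation on this node.
pub trait Cluster {
    fn create_table(&mut self, name: &str) -> Result<(), String>;
}

/// Proxy that hides the two cases from callers.
pub struct TableCreator<M, C> {
    pub meta: M,
    pub cluster: C,
}

impl<M: MetaClient, C: Cluster> TableCreator<M, C> {
    pub fn create(&mut self, origin: Origin, name: &str) -> Result<Outcome, String> {
        match origin {
            // A user request is only forwarded; nothing is created locally yet.
            Origin::User => {
                self.meta.request_create_table(name)?;
                Ok(Outcome::ForwardedToMeta)
            }
            // A request from ceresmeta triggers the real creation.
            Origin::Meta => {
                self.cluster.create_table(name)?;
                Ok(Outcome::CreatedLocally)
            }
        }
    }
}

/// Test double that records forwarded requests.
#[derive(Default)]
pub struct RecordingMeta {
    pub requested: Vec<String>,
}

impl MetaClient for RecordingMeta {
    fn request_create_table(&mut self, name: &str) -> Result<(), String> {
        self.requested.push(name.to_string());
        Ok(())
    }
}

/// Test double that keeps created tables in memory and rejects duplicates.
#[derive(Default)]
pub struct LocalCluster {
    pub tables: Vec<String>,
}

impl Cluster for LocalCluster {
    fn create_table(&mut self, name: &str) -> Result<(), String> {
        if self.tables.iter().any(|t| t == name) {
            return Err(format!("table {name} already exists"));
        }
        self.tables.push(name.to_string());
        Ok(())
    }
}
```

A TableDropper could follow the same pattern, with the drop request either forwarded or applied locally depending on its origin.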

Here are the tracking tasks:

  • [ ] Remove the current catalog based on cluster, and add the table manager to the volatile catalog;
  • [ ] Add TableCreator & TableDropper, and refactor the create/drop table procedure;
  • [ ] Implement opening/closing of shards in the cluster.

ShiKaiWi, Oct 12 '22 07:10