
Dynomite local cluster with docker compose

Open · emiteze opened this issue 6 years ago · 2 comments

Hi guys,

First of all, thanks for the great work that has been done here! We came across Dynomite after we started using Conductor, and we decided to use it as the database for our own application as well, both to avoid installing another database and to enjoy the advantages of Dynomite :)

I'm currently running some tests against a local Dynomite cluster using docker compose, and I'm seeing some strange behavior that I don't think is expected for a Dynomite cluster.

The topology used for the tests was dynomite01 and dynomite02 in rack1 and dynomite03 in rack2. I used the Python script to calculate the tokens; below are the docker-compose file and the YML configuration for each node.
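
For reference, a minimal sketch of the kind of token calculation involved, assuming the usual Dynomite scheme of one evenly spaced token per node within each rack (the helper function is mine, not necessarily the exact script used):

MAX_TOKEN = 4294967295  # 2**32 - 1, the top of the hash ring

def rack_tokens(node_count):
    # One token per node, spread evenly over the ring for a single rack.
    # Hypothetical helper; it reproduces the tokens used in the configs below.
    return [MAX_TOKEN // node_count * (i + 1) for i in range(node_count)]

print(rack_tokens(2))  # rack1 -> [2147483647, 4294967294]
print(rack_tokens(1))  # rack2 -> [4294967295]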

docker-compose.yml

version: '3'
services:
  dynomite01:
    image: v1r3n/dynomite
    ports:
      - 8102:8102
      - 22222:22222
    volumes:
      - ./node01.yml:/dynomite/conf/redis_single.yml
    networks:
      - custom

  dynomite02:
    image: v1r3n/dynomite
    ports:
      - 8103:8102
      - 22223:22222
    volumes:
      - ./node02.yml:/dynomite/conf/redis_single.yml
    networks:
      - custom

  dynomite03:
    image: v1r3n/dynomite
    ports:
      - 8104:8102
      - 22224:22222
    volumes:
      - ./node03.yml:/dynomite/conf/redis_single.yml
    networks:
      - custom

networks:
  custom:
    driver: bridge
    ipam:
      driver: default
      config:
        - subnet: 192.167.60.0/24  # was 192.167.60.0/16, which has host bits set; /24 is a valid CIDR

node01.yml

dyn_o_mite:
  datacenter: dc1
  rack: dc1-rack1
  dyn_listen: 0.0.0.0:8101
  dyn_seeds:
    - dynomite02:8101:dc1-rack1:dc1:4294967294
    - dynomite03:8101:dc1-rack2:dc1:4294967295
  listen: 0.0.0.0:8102
  dyn_port: 8101
  dyn_seed_provider: simple_provider
  servers:
    - 0.0.0.0:22122:1
  tokens: 2147483647
  secure_server_option: datacenter
  pem_key_file: conf/dynomite.pem
  data_store: 0
  stats_listen: 0.0.0.0:22222

node02.yml

dyn_o_mite:
  datacenter: dc1
  rack: dc1-rack1
  dyn_listen: 0.0.0.0:8101
  dyn_seeds:
    - dynomite01:8101:dc1-rack1:dc1:2147483647
    - dynomite03:8101:dc1-rack2:dc1:4294967295
  listen: 0.0.0.0:8102
  dyn_port: 8101
  dyn_seed_provider: simple_provider
  servers:
    - 0.0.0.0:22122:1
  tokens: 4294967294
  secure_server_option: datacenter
  pem_key_file: conf/dynomite.pem
  data_store: 0
  stats_listen: 0.0.0.0:22222

node03.yml

dyn_o_mite:
  datacenter: dc1
  rack: dc1-rack1
  dyn_listen: 0.0.0.0:8101
  dyn_seeds:
    - dynomite01:8101:dc1-rack1:dc1:2147483647
    - dynomite02:8101:dc1-rack1:dc1:4294967294
  listen: 0.0.0.0:8102
  dyn_port: 8101
  dyn_seed_provider: simple_provider
  servers:
    - 0.0.0.0:22122:1
  tokens: 4294967295
  secure_server_option: datacenter
  pem_key_file: conf/dynomite.pem
  data_store: 0
  stats_listen: 0.0.0.0:22222
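
As an aside, each dyn_seeds entry above packs five colon-separated fields: host, dyn_port, rack, datacenter, and token. A throwaway parser to make the format explicit (the helper is mine, not part of Dynomite):

def parse_seed(seed):
    # Split a dyn_seeds entry into its named fields.
    host, port, rack, dc, token = seed.split(":")
    return {"host": host, "port": int(port), "rack": rack,
            "dc": dc, "token": int(token)}

print(parse_seed("dynomite03:8101:dc1-rack2:dc1:4294967295"))
# {'host': 'dynomite03', 'port': 8101, 'rack': 'dc1-rack2', 'dc': 'dc1', 'token': 4294967295}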

All tests were run with the Java client (dyno).
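
(Since Dynomite speaks the Redis protocol on its listen port, the same insert/get checks can also be scripted without the Java stack. A minimal sketch using redis-py against the host port mappings from the compose file above; this is just an illustration, not the dyno client the tests used:)

import redis

# Host ports mapped to each node's client-facing listen port (8102).
nodes = {"dynomite01": 8102, "dynomite02": 8103, "dynomite03": 8104}

# Write one key through each node...
for i, (name, port) in enumerate(nodes.items(), start=1):
    redis.Redis(host="localhost", port=port).set(f"test{i}", f"via-{name}")

# ...then read every key back through every node to check replication.
for name, port in nodes.items():
    conn = redis.Redis(host="localhost", port=port)
    for i in range(1, len(nodes) + 1):
        print(name, f"test{i}", conn.get(f"test{i}"))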

The first test used read_consistency and write_consistency set to DC_ONE:

  • Insert test1 on dynomite01 - OK
  • Insert test2 on dynomite02 - OK
  • Insert test3 on dynomite03 - OK
  • Get test1, test2 and test3 on all nodes via redis-cli - OK - Found on all nodes
  • Stop dynomite01 and dynomite02
  • Insert test4 on dynomite03 - JedisDataException: ERR Peer: Unknown error -1
  • Start dynomite01 and dynomite02
  • Get test1, test2 and test3 on all nodes via redis-cli - Not found on any node
  • Insert test4 on dynomite03 - OK
  • Get test4 on all nodes via redis-cli - OK - Found on all nodes
  • Insert test5 on dynomite01 - OK
  • Get test5 on all nodes via redis-cli - OK - Found on all nodes

The second test used read_consistency and write_consistency set to DC_QUORUM:

  • Insert test1 on dynomite01 - OK
  • Insert test2 on dynomite02 - OK
  • Insert test3 on dynomite03 - OK
  • Get test1, test2 and test3 on all nodes via redis-cli - OK - Found on all nodes
  • Stop dynomite01 and dynomite02
  • Insert test4 on dynomite03 - JedisDataException: ERR Peer: No route to host
  • Start dynomite01 and dynomite02
  • Get test1, test2 and test3 on all nodes via redis-cli - Not found on any node
  • Insert test4 on dynomite03 - OK
  • Get test4 on all nodes via redis-cli - Found only on dynomite03
  • Insert test5 on dynomite01 - OK
  • Get test5 on all nodes via redis-cli - Found only on dynomite01 and dynomite02

The third test used read_consistency and write_consistency set to DC_SAFE_QUORUM:

  • Insert test1 on dynomite01 - OK
  • Insert test2 on dynomite02 - OK
  • Insert test3 on dynomite03 - JedisDataException: ERR Dynomite: Failed to achieve Quorum
  • Get test1 and test2 on all nodes via redis-cli - Found on dynomite01 and dynomite02. ERR Dynomite: Failed to achieve Quorum on dynomite03
  • Stop dynomite01 and dynomite02
  • Insert test3 on dynomite03 - JedisDataException: ERR Peer: Unknown error -1
  • Start dynomite01 and dynomite02
  • Get test1 and test2 on all nodes via redis-cli - ERR Dynomite: Failed to achieve Quorum on all nodes
  • Insert test3 on dynomite03 - ERR Dynomite: Failed to achieve Quorum
  • Insert test3 on dynomite01 - OK
  • Get test3 on all nodes via redis-cli - Found on dynomite01 and dynomite02. ERR Dynomite: Failed to achieve Quorum on dynomite03

Do you guys notice the strange behavior in all 3 tests? Things like:

  1. Reading a record after stopping rack1 and getting (nil) as the response
  2. Inserting a record on dynomite03 after stopping rack1 and getting an exception, even though dynomite03 was up and available at the time
  3. Reading a record with DC_QUORUM consistency after stopping rack1 and getting a response from only some nodes of the cluster
  4. Quorum never being achieved for inserts or reads on dynomite03, even before rack1 was stopped (see the quorum sketch after this list)
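
For what it's worth, my working assumption (not verified against the source) is that DC_QUORUM needs responses from a quorum of the racks in the local datacenter, which would make failures unavoidable once rack1 is fully stopped:

# Assumed quorum arithmetic for DC_QUORUM over racks in the local DC
# (my reading of the docs, not verified in the Dynomite source).
racks_in_dc = 2                       # dc1-rack1 and dc1-rack2
quorum = racks_in_dc // 2 + 1         # -> 2: both racks must answer
racks_up_after_stop = 1               # only rack2 survives the stop
print(racks_up_after_stop >= quorum)  # False: quorum cannot be met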

Have you faced issues like these as well?

emiteze · Jul 02 '19 17:07

@emiteze Sorry for the slow response. Let me have a look at this and get back to you.

smukil · Jul 12 '19 18:07

@emiteze The node03.yml 'rack:' field points to 'dc1-rack1'. You'd need to change it to 'dc1-rack2', right? Right now, it thinks all 3 nodes are part of the same rack and 2 nodes are overlapping on the token range.
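
For later readers, the fix described above is a one-field change in node03.yml (only the relevant lines shown; the rest of the file stays as posted):

dyn_o_mite:
  datacenter: dc1
  rack: dc1-rack2  # was dc1-rack1; matches the dc1-rack2 the other nodes advertise for dynomite03 in their dyn_seeds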

smukil · Aug 19 '19 17:08