gracefully remove a node from the ensemble

Luigi Tagliamonte
Hello all!
Is there any document that describes how to remove a ZK node from the
ensemble?

Re: gracefully remove a node from the ensemble

Alexander Shraer-2
Hi Luigi,

In 3.5.X yes: https://zookeeper.apache.org/doc/trunk/zookeeperReconfig.html

For previous releases (3.4 etc.) you would need to do a rolling restart,
where for each server you change the config file to exclude that member
and bounce the server. Preferably do this one server at a time, and let the
ensemble become operational again before bouncing the next server. Bounce
the leader last. I wouldn't call this graceful, though :)
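
The ordering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an official procedure: the function names and the `roles`/`servers` dictionaries are hypothetical, and the role of each server is assumed to be known already (for instance from the Mode line reported by the `stat` four-letter command).

```python
# Hypothetical helpers illustrating the 3.4-style rolling restart order.
# Nothing here talks to ZooKeeper; it only computes what to do and in which order.

def static_server_lines(servers, removed_id):
    """Return the server.N lines for zoo.cfg with the removed member excluded.

    servers: dict mapping server id -> "host:quorum_port:election_port"
    """
    return [
        f"server.{sid}={addr}"
        for sid, addr in sorted(servers.items())
        if sid != removed_id
    ]


def rolling_restart_order(roles, removed_id):
    """Return the ids of the remaining servers, followers first, leader last.

    roles: dict mapping server id -> "leader" or "follower"
    """
    remaining = {sid: role for sid, role in roles.items() if sid != removed_id}
    followers = sorted(sid for sid, role in remaining.items() if role != "leader")
    leaders = sorted(sid for sid, role in remaining.items() if role == "leader")
    return followers + leaders


if __name__ == "__main__":
    # Assumed example ensemble; the host names are made up.
    servers = {
        1: "zk1.example.com:2888:3888",
        2: "zk2.example.com:2888:3888",
        3: "zk3.example.com:2888:3888",
    }
    roles = {1: "follower", 2: "leader", 3: "follower"}
    print(static_server_lines(servers, removed_id=3))
    print(rolling_restart_order(roles, removed_id=3))
```

Each server in the returned order would get the new list of server lines and then be bounced, waiting for the ensemble to be operational again before moving on to the next one.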


Alex

Re: gracefully remove a node from the ensemble

Luigi Tagliamonte
Thank you, Alexander!!!
I'm wondering if it would be a good idea to use 3.5 instead of 3.4... but
since it is a beta I'm afraid to use it in production.
I'm a Cassandra user and I'm basically looking for the same level of
reliability and orchestration I have there.
Thank you!!
Regards
L.

Re: gracefully remove a node from the ensemble

Alexander Shraer-2
Well, I totally understand you. But on the other hand, I know that the
dynamic membership code has been running in production since 2012: link
<https://issues.apache.org/jira/browse/ZOOKEEPER-107?focusedCommentId=13566886&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13566886>
Of course it might have problems, but it's been out for a while and I don't
think it's less stable than 3.4. That's my personal opinion though :)

Cheers,
Alex

Re: gracefully remove a node from the ensemble

Luigi Tagliamonte
Hello Alexander,
thank you for the link. I read the comment and the white paper, and it seems
really promising.
I found, though, that Kafka isn't yet able to automatically reconfigure its
ZK node list. Do you happen to know otherwise?
Regards
L.

Re: gracefully remove a node from the ensemble

Luigi Tagliamonte
Hello again Alexander,
so only the Java and C clients support the new ZK node discovery, right?
Is there a specific version I need in order to use this feature?
Regards
L.

Re: gracefully remove a node from the ensemble

Alexander Shraer-2
I'd suggest using 3.5.3. ZK only officially supports a Java and a C client,
as far as I know. I know these two support it;
I'm not sure if anyone has ported it to other clients.

Alex


Re: gracefully remove a node from the ensemble

Luigi Tagliamonte
Thanks, Alexander,
I'm giving 3.5.3 a shot.
I have 2 servers; the first one has:

-zoo.cfg :
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper/data
reconfigEnabled=true
standaloneEnabled=false
dynamicConfigFile=/etc/zookeeper/bin/conf/zoo_replicated1.cfg.dynamic

-zoo_replicated1.cfg.dynamic:
server.1=zook.mydomain.com:2888:3888

- myid: 1

On the second server I'm using the same zoo.cfg and zoo_replicated1.cfg.dynamic
and I only changed the id to 2.
I'm getting the following in the logs:

2017-07-14 19:43:25,515 - INFO  [main:QuorumPeerConfig@117] - Reading configuration from: /etc/zookeeper/zoo.cfg
2017-07-14 19:43:25,518 - INFO  [main:QuorumPeerConfig@317] - clientPort is not set
2017-07-14 19:43:25,519 - INFO  [main:QuorumPeerConfig@331] - secureClientPort is not set
2017-07-14 19:43:25,579 - WARN  [main:QuorumPeerConfig@590] - No server failure will be tolerated. You need at least 3 servers.
2017-07-14 19:43:25,583 - INFO  [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
2017-07-14 19:43:25,583 - INFO  [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
2017-07-14 19:43:25,583 - INFO  [main:DatadirCleanupManager@101] - Purge task is not scheduled.
2017-07-14 19:43:25,584 - INFO  [main:ManagedUtil@46] - Log4j found with jmx enabled.
2017-07-14 19:43:25,594 - INFO  [main:QuorumPeerMain@138] - Starting quorum peer
2017-07-14 19:43:25,617 - INFO  [main:Log@186] - Logging initialized @388ms
2017-07-14 19:43:25,661 - WARN  [main:ContextHandler@1339] - o.e.j.s.ServletContextHandler@6d78f375{/,null,null} contextPath ends with /*
2017-07-14 19:43:25,661 - WARN  [main:ContextHandler@1350] - Empty contextPath
2017-07-14 19:43:25,673 - INFO  [main:QuorumPeer@1349] - Local sessions disabled
2017-07-14 19:43:25,673 - INFO  [main:QuorumPeer@1360] - Local session upgrading disabled
2017-07-14 19:43:25,673 - INFO  [main:QuorumPeer@1327] - tickTime set to 2000
2017-07-14 19:43:25,673 - INFO  [main:QuorumPeer@1371] - minSessionTimeout set to 4000
2017-07-14 19:43:25,674 - INFO  [main:QuorumPeer@1382] - maxSessionTimeout set to 40000
2017-07-14 19:43:25,674 - INFO  [main:QuorumPeer@1397] - initLimit set to 10
2017-07-14 19:43:25,685 - ERROR [main:QuorumPeerMain@98] - Unexpected exception, exiting abnormally
java.lang.RuntimeException: My id 2 not in the peer list
    at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:770)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:185)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:120)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:79)

What am I doing wrong? Should the second server reach the first one, get
the list of the other servers in the ensemble, and join it?
Or do I have to implement automation on top of this?
Regards
L.


Re: gracefully remove a node from the ensemble

Alexander Shraer-2
> java.lang.RuntimeException: My id 2 not in the peer list

If the server's id is 2, a line for server 2 should be in the config file.
More generally, the dynamic config file should be the same on both servers
and include both servers. The documentation should be helpful.

https://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html
https://zookeeper.apache.org/doc/trunk/zookeeperReconfig.html

-zoo_replicated1.cfg.dynamic:
server.1=zook.mydomain.com:2888:3888;<client port>
server.2=zook.mydomain.com:yyyy:zzzz;<client port>

-zoo_replicated2.cfg.dynamic:
server.1=zook.mydomain.com:2888:3888;<client port>
server.2=zook.mydomain.com:yyyy:zzzz;<client port>
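
The rule above (the server's own id must have a line in the dynamic config file) can be checked before starting a server. The following is a minimal sketch, assuming the file only contains `server.N=...` lines like the ones shown in this thread; the function names are hypothetical and this is not part of ZooKeeper itself.

```python
import re

# Matches lines such as "server.1=zook.mydomain.com:2888:3888;2181"
# and captures the numeric server id.
SERVER_LINE = re.compile(r"^server\.(\d+)=(.+)$")


def parse_peer_ids(dynamic_config_text):
    """Return the set of server ids listed in a dynamic config file's text."""
    ids = set()
    for line in dynamic_config_text.splitlines():
        match = SERVER_LINE.match(line.strip())
        if match:
            ids.add(int(match.group(1)))
    return ids


def myid_in_peer_list(myid, dynamic_config_text):
    """True if this server's id has a matching server.N line."""
    return myid in parse_peer_ids(dynamic_config_text)


if __name__ == "__main__":
    # The dynamic file from the earlier post only lists server 1,
    # so a server whose myid is 2 is not in the peer list.
    original = "server.1=zook.mydomain.com:2888:3888\n"
    print(myid_in_peer_list(2, original))
```

With the file from the earlier post the check fails for id 2, which corresponds to the "My id 2 not in the peer list" exception in the log.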




Re: gracefully remove a node from the ensemble

Luigi Tagliamonte
Thank you for the reply.
I was thinking that this whole automatic reconfiguration was something like
in Cassandra: you have a seed node, and when a new node boots it gets the
info from the seed. Is something like that available?
Regards
L.

Re: gracefully remove a node from the ensemble

Alexander Shraer-2
Well, first of all you need to bootstrap a system, so all the nodes should
know of each other. This hasn't changed in 3.5.
When you add a new server, you also need to bootstrap its config file with
something (there are a few suggestions in the manual). It doesn't need to
be the latest config, but it has to include the leader and it must be
specified in a way that avoids a split brain. Once the new server talks
with the leader, it syncs the latest configuration (something like what
you're saying). Then you can issue a command to formally add it to the
cluster.
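
As a rough sketch of the steps above, the Python below builds a bootstrap config for a joining server and the strings one would type into the 3.5 command-line client (`reconfig -add ...` to add a member, `reconfig -remove <id>` to remove one). The helper names, host names, and ports are made up for illustration, and listing the existing members plus the joiner is just one possible bootstrap choice; the reconfig manual remains the reference for the exact syntax and options.

```python
# Hypothetical helpers; they only build strings and do not contact ZooKeeper.

def server_spec(server_id, host, quorum_port, election_port, client_port,
                role="participant"):
    """Format a dynamic-config server line: quorum ports, role, then client port."""
    return (f"server.{server_id}={host}:{quorum_port}:{election_port}"
            f":{role};{client_port}")


def joiner_bootstrap_config(current_specs, joiner_spec):
    """One possible bootstrap file for a new server: existing members plus itself."""
    return "\n".join(list(current_specs) + [joiner_spec]) + "\n"


def reconfig_add(spec):
    """Client command to formally add a server to the ensemble."""
    return f"reconfig -add {spec}"


def reconfig_remove(server_id):
    """Client command to remove a server from the ensemble by id."""
    return f"reconfig -remove {server_id}"


if __name__ == "__main__":
    current = [
        server_spec(1, "zk1.example.com", 2888, 3888, 2181),
        server_spec(2, "zk2.example.com", 2888, 3888, 2181),
    ]
    joiner = server_spec(3, "zk3.example.com", 2888, 3888, 2181)
    print(joiner_bootstrap_config(current, joiner))
    print(reconfig_add(joiner))
    print(reconfig_remove(2))
```

The remove command is the 3.5 counterpart to the rolling restart discussed at the start of the thread.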

