From standalone ZK instance to 3 instances


michael.boom
Hi,

I currently have only one ZooKeeper instance (ZK1) and would like to upgrade to a 3-instance quorum (ZK1, ZK2, ZK3).
I understand that this can be done by rolling restarts, but since ZK1 was in standalone mode until now, it's missing the myid file.
Do I need to create the myid file for ZK1 before adding the other nodes?
If so, are there any other modifications to be made to ZK1?

If anyone has a tutorial or some insight on this scenario, I would be grateful!

Thank you!

Re: From standalone ZK instance to 3 instances

michael.boom
Any ideas here?

Re: From standalone ZK instance to 3 instances

German Blanco
Could you please read the thread "safely upgrade single instance to ensemble"
and see if the answer is there?


On Wed, Nov 20, 2013 at 10:46 AM, michael.boom <[hidden email]> wrote:

> Any idea here ?

Re: From standalone ZK instance to 3 instances

michael.boom
Thanks German,

I had a look and gave it a try on a test system, but I ran into a problem.

The config for server 1 and server 2 is the same:
tickTime=2000

initLimit=10

syncLimit=5

dataDir=/data/zookeeper

clientPort=9983

server.1=zk1:2888:3888
server.2=zk2:2888:2888

Both instances were stopped, then
I started zk1 like this: ./zkServer.sh start zoo.cfg
In zkCli I found the following error:
/opt/zookeeper/bin$ ./zkCli.sh
Connecting to localhost:2181
2013-11-20 14:23:55,956 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
2013-11-20 14:23:55,962 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=solr1.productdb.internal
2013-11-20 14:23:55,963 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.7.0_25
2013-11-20 14:23:55,964 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2013-11-20 14:23:55,965 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/usr/lib/jvm/java-7-openjdk-amd64/jre
2013-11-20 14:23:55,965 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/opt/zookeeper/bin/../build/classes:/opt/zookeeper/bin/../build/lib/*.jar:/opt/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/opt/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/opt/zookeeper/bin/../lib/netty-3.2.2.Final.jar:/opt/zookeeper/bin/../lib/log4j-1.2.15.jar:/opt/zookeeper/bin/../lib/jline-0.9.94.jar:/opt/zookeeper/bin/../zookeeper-3.4.5.jar:/opt/zookeeper/bin/../src/java/lib/*.jar:/opt/zookeeper/bin/../conf:
2013-11-20 14:23:55,966 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib/jni:/lib:/usr/lib
2013-11-20 14:23:55,967 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2013-11-20 14:23:55,968 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2013-11-20 14:23:55,968 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2013-11-20 14:23:55,969 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2013-11-20 14:23:55,970 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=3.2.0-4-amd64
2013-11-20 14:23:55,970 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=solr
2013-11-20 14:23:55,971 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/home/solr
2013-11-20 14:23:55,972 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/opt/zookeeper/bin
2013-11-20 14:23:55,974 [myid:] - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@36bc7722
Welcome to ZooKeeper!
2013-11-20 14:23:56,018 [myid:] - INFO  [main-SendThread(localhost.localdomain:2181):ClientCnxn$SendThread@966] - Opening socket connection to server localhost.localdomain/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2013-11-20 14:23:56,029 [myid:] - WARN  [main-SendThread(localhost.localdomain:2181):ClientCnxn$SendThread@1089] - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:708)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)



Any idea why it tries to connect to localhost on port 2181, since I have specified a different client port?
Thanks!
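
One detail in the config above may be worth a second look: server.2 lists 2888 for both the quorum port and the election port, while server.1 uses 2888 and 3888. Whether that is just a typo in the post is not clear from the thread. A minimal Python sketch of a sanity check for this follows; the helper name find_port_clashes is made up for illustration, and it assumes the plain server.N=host:port:port form shown above.

```python
import re

# Matches lines such as "server.1=zk1:2888:3888"
# (server id, host, quorum port, election port).
SERVER_LINE = re.compile(r"^server\.(\d+)=([^:]+):(\d+):(\d+)$")

def find_port_clashes(cfg_text):
    """Return ids of server entries whose quorum and election ports are equal."""
    clashes = []
    for line in cfg_text.splitlines():
        m = SERVER_LINE.match(line.strip())
        if m and m.group(3) == m.group(4):
            clashes.append(int(m.group(1)))
    return clashes

# The server lines are copied from the post; the rest is trimmed for brevity.
cfg = """
tickTime=2000
clientPort=9983
server.1=zk1:2888:3888
server.2=zk2:2888:2888
"""
print(find_port_clashes(cfg))
```

The check only flags entries where the two ports are identical; it says nothing about whether the hosts are reachable.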

RE: From standalone ZK instance to 3 instances

Rakesh R
>>>>>> Any idea why does it try to connect to localhost on port 2181, since i have specified a different port as a client port?

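zkCli.sh connects to localhost on the default ZooKeeper client port, 2181, unless told otherwise.

You can connect to the server as follows (replace ClientPort with the clientPort value from your zoo.cfg):

$ ./zkCli.sh -server zk1:ClientPort

By the way, what is the status of the ZooKeeper servers? Have they started successfully?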

-Rakesh
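
To avoid typing the port by hand, the client port could be read from zoo.cfg and turned into the command above. This is only a sketch, assuming the simple key=value layout posted earlier in the thread; read_client_port is a hypothetical helper, not part of ZooKeeper, and the 2181 fallback simply mirrors the port zkCli.sh uses when no server is given.

```python
def read_client_port(cfg_text, default=2181):
    """Return the clientPort value from zoo.cfg text, or the fallback port."""
    for line in cfg_text.splitlines():
        line = line.strip()
        # Skip comments and anything that is not a key=value pair.
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "clientPort":
            return int(value.strip())
    return default

cfg = "tickTime=2000\ndataDir=/data/zookeeper\nclientPort=9983\n"
port = read_client_port(cfg)
# Build the command line suggested in the reply above.
print("./zkCli.sh -server zk1:%d" % port)
```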



RE: From standalone ZK instance to 3 instances

michael.boom
Thank you, Rakesh!

Yes, zk seems to have started correctly.

$ sudo ./zkServer.sh start zoo.cfg
JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
$

I tried
./zkCli.sh -server zk1:9983
and here's the output:
2013-11-20 15:15:19,220 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib/jni:/lib:/usr/lib
2013-11-20 15:15:19,220 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2013-11-20 15:15:19,221 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2013-11-20 15:15:19,222 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2013-11-20 15:15:19,222 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2013-11-20 15:15:19,223 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=3.2.0-4-amd64
2013-11-20 15:15:19,224 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=solr
2013-11-20 15:15:19,224 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/home/solr
2013-11-20 15:15:19,225 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/opt/zookeeper/bin
2013-11-20 15:15:19,227 [myid:] - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:9983 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@7b9dd3cd
Welcome to ZooKeeper!
2013-11-20 15:15:19,272 [myid:] - INFO  [main-SendThread(localhost.localdomain:9983):ClientCnxn$SendThread@966] - Opening socket connection to server localhost.localdomain/127.0.0.1:9983. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2013-11-20 15:15:19,288 [myid:] - INFO  [main-SendThread(localhost.localdomain:9983):ClientCnxn$SendThread@849] - Socket connection established to localhost.localdomain/127.0.0.1:9983, initiating session
2013-11-20 15:15:19,297 [myid:] - INFO  [main-SendThread(localhost.localdomain:9983):ClientCnxn$SendThread@1085] - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
[zk: localhost:9983(CONNECTING) 0] 2013-11-20 15:15:20,611 [myid:] - INFO  [main-SendThread(localhost.localdomain:9983):ClientCnxn$SendThread@966] - Opening socket connection to server localhost.localdomain/127.0.0.1:9983. Will not attempt to authenticate using SASL (unknown error)
2013-11-20 15:15:20,612 [myid:] - INFO  [main-SendThread(localhost.localdomain:9983):ClientCnxn$SendThread@849] - Socket connection established to localhost.localdomain/127.0.0.1:9983, initiating session
2013-11-20 15:15:20,614 [myid:] - INFO  [main-SendThread(localhost.localdomain:9983):ClientCnxn$SendThread@1085] - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect

RE: From standalone ZK instance to 3 instances

Rakesh R
As you mentioned earlier, you are upgrading a single instance to an ensemble of size 3.

Could you tell me whether all three servers are fine, that is, whether one has become LEADER and the other two are in FOLLOWER state?

If the server is running fine, could you check the zk server logs for the client session establishment? They will give more information, like:

2013-11-19 12:35:14,780 [myid:1] - INFO  [NIOServerCxn.Factory:/10.18.40.137:25000:ZooKeeperServer@868] - Client attempting to establish new session at /10.18.40.137:53318



RE: From standalone ZK instance to 3 instances

michael.boom
I started the test with the standalone instance ZK1 and tried to add one more instance, ZK2, to it.

Both ZK1 and ZK2 have the same configuration described in an earlier post.

I started ZK1 and it reports that it started as it should, but it creates no log file.
If I connect to the instance using zkCli, it outputs this every second:

2013-11-20 16:06:24,302 [myid:] - INFO  [main-SendThread(localhost.localdomain:9983):ClientCnxn$SendThread@849] - Socket connection established to localhost.localdomain/127.0.0.1:9983, initiating session
2013-11-20 16:06:24,307 [myid:] - INFO  [main-SendThread(localhost.localdomain:9983):ClientCnxn$SendThread@1085] - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect



After also starting ZK2, it shows the same behaviour: it says it started correctly, but no log file is created.
If I connect to the instance using zkCli, it outputs this every second:

2013-11-20 16:06:58,626 [myid:] - WARN  [main-SendThread(localhost.localdomain:9983):ClientCnxn$SendThread@1089] - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:708)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2013-11-20 16:06:59,727 [myid:] - INFO  [main-SendThread(localhost.localdomain:9983):ClientCnxn$SendThread@966] - Opening socket connection to server localhost.localdomain/127.0.0.1:9983. Will not attempt to authenticate using SASL (unknown error)

Re: From standalone ZK instance to 3 instances

Ted Dunning
On Wed, Nov 20, 2013 at 4:07 PM, michael.boom <[hidden email]> wrote:

> 2013-11-20 16:06:24,302 [myid:] - INFO
> [main-SendThread(localhost.localdomain:9983):ClientCnxn$SendThread@849] -
> Socket connection established to localhost.localdomain/127.0.0.1:9983,
> initiating session
> 2013-11-20 16:06:24,307 [myid:] - INFO
> [main-SendThread(localhost.localdomain:9983):ClientCnxn$SendThread@1085] -
> Unable to read additional data from server sessionid 0x0, likely server has
> closed socket, closing socket connection and attempting reconnect
>

This makes it look like you haven't used correct hostnames in your
configuration.
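
A quick way to test that suspicion is to check whether the names used in the server.N lines resolve on each machine. The following is a rough sketch, not a diagnosis: the helper name unresolvable is invented here, and the demonstration uses a stand-in resolver so it runs without DNS. In real use you would leave the resolver argument at its default, socket.gethostbyname.

```python
import socket

def unresolvable(hosts, resolver=socket.gethostbyname):
    """Return the hostnames that the resolver cannot turn into an address."""
    bad = []
    for host in hosts:
        try:
            resolver(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            bad.append(host)
    return bad

def fake_resolver(host):
    # Stand-in for DNS so the example runs anywhere; pretends only zk1 is known.
    if host != "zk1":
        raise socket.gaierror("unknown host: " + host)
    return "10.0.0.1"

# zk1 and zk2 are the names used in the server.N lines earlier in the thread.
print(unresolvable(["zk1", "zk2"], resolver=fake_resolver))
```

A name that resolves can still point at the wrong address (for example a loopback entry in /etc/hosts), so an empty result does not rule out a hostname problem.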

Re: From standalone ZK instance to 3 instances

michael.boom
Hmm, I think you were right. Thank you!
I changed the conf from hostnames to IPs, and now it looks better.

However, when I connect with zkCli I get the following message from both instances:

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:9983(CONNECTED) 0]


Shouldn't one of them have become LEADER? Or am I looking in the wrong place?
Also, note that the second ZK has replicated the state from the first.

Re: From standalone ZK instance to 3 instances

Bryan Thompson
Michael,

I've been monitoring this thread.  Do you plan to post a procedure for
doing this?  That would be quite useful.

Thanks,
Bryan


Re: From standalone ZK instance to 3 instances

michael.boom
Hi Bryan,

As soon as I am satisfied with the procedure and have resolved any remaining problems, I will post a detailed step-by-step "tutorial".
Hopefully tomorrow.

RE: From standalone ZK instance to 3 instances

Rakesh R
Hi Michael,

Nice to hear about the progress, and all the best for your efforts.
One more point that will help you see general information about the ZooKeeper server:

There is an interesting set of "4 letter words" in ZooKeeper.
Please refer to this section: http://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html#sc_zkCommands

For example, the 'stat' command gives details like the following, including the mode the server is running in.

Zookeeper version: 3.4.5-21786, built on 10/17/2013 02:02 GMT
Clients:
 /10.11.41.1:53372[1](queued=0,recved=98923,sent=98923)
 /10.11.41.1:53374[1](queued=0,recved=224100,sent=224100)

Latency min/avg/max: 0/0/309
Received: 1106514
Sent: 1106515
Connections: 6
Outstanding: 0
Zxid: 0x1000037f3
Mode: leader
Node count: 2756

-Rakesh
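
The same check can be scripted. Below is a minimal Python sketch that sends a four-letter word over a plain TCP socket and pulls the Mode line out of the reply. The function names are illustrative only, and the demonstration parses a shortened copy of the sample output above rather than contacting a live server.

```python
import socket

def four_letter_word(host, port, word="stat", timeout=5.0):
    """Send a four-letter command to a ZooKeeper server and return the reply text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(word.encode("ascii"))
        chunks = []
        # The server closes the connection once it has sent its reply.
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")

def parse_mode(stat_text):
    """Extract the value of the 'Mode:' line, or None if there is none."""
    for line in stat_text.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None

sample = "Zxid: 0x1000037f3\nMode: leader\nNode count: 2756\n"
print(parse_mode(sample))
```

Against a running server this would be used as parse_mode(four_letter_word("zk1", 9983)), with the host and port taken from your own setup.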


RE: From standalone ZK instance to 3 instances

michael.boom
Thank you for the tip Rakesh, it's indeed very helpful.

So after checking the ZK instances with the "stat" command, the two instances seem to be working fine, one as a leader and the other as a follower. However, this was achieved only by stopping both instances and having the quorum unavailable for a little while.
When I try to add the second instance without restarting/stopping the first, it doesn't really work, because the first is still in standalone mode:
$ echo stat | nc ZK1 9983
Zookeeper version: 3.4.5-1392090, built on 09/30/2012 17:52 GMT
Clients:
 /<ZK1 IP>:35517[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0/0
Received: 7
Sent: 6
Connections: 1
Outstanding: 0
Zxid: 0x0
Mode: standalone
Node count: 4

and the second replies to the stat command with a:

$ echo stat | nc ZK2 9983
This ZooKeeper instance is not currently serving requests

Is there a way to switch from standalone mode to quorum without compromising availability?
Thank you!


Re: From standalone ZK instance to 3 instances

Bryan Thompson
Michael,

I am interested in this procedure, but I have never attempted it myself.
It seems that the concept advanced in [1] is to manually replicate the
data, then start a new ensemble with different ports using the replicated
data, and finally instruct your clients to talk to the new ensemble.  This
procedure would definitely cause any ephemeral znodes to be lost during
the migration since the client connections could not be transferred to the
new ensemble without being dropped.  Essentially, each client becomes
disconnected from the standalone server instance and then is reconnected
to the new ensemble of highly available server instances.

Given that the clients must become disconnected, at least temporarily, it
seems that you cannot obtain 100% uptime from a client perspective
during this migration.  I.e., each client would have to be either
restarted or (if you architect for it in your client) it would have to be
instructed through an API that it should disconnect from one zk server
configuration and connect to another.  Either way, the client would be
disconnected during its transition.

However, the service provided by your clients could remain up as long as
that service was able to move transparently from clients connected to the
standalone node to clients connected to the ensemble.

But at the service level, there would have to be some point at which you
stopped relying on the data in the standalone instance and began to rely
on the data in the new ensemble.  If there are writes on the standalone
instance after you manually replicate its data, then those writes would
not be present in the ensemble.  From a service level, those writes would
have been lost.  

I would be interested in a procedure to make this migration seamless, but
I can't see how it would be accomplished without:

- halt writes on zookeeper.
- replicate zookeeper standalone server state to a zookeeper ensemble with
at least two instances (a quorum can meet with two servers). The services
will need their myid files. If you start one of these servers on the same
machine, then you need to use a different client port for the new ensemble.
- start the servers in the new ensemble. Quorum should meet. Leader should
be elected, etc.
- change the client configuration to point to the servers in the new
ensemble.
- restart the clients. This moves them from the old standalone zookeeper
instance (which nobody should be writing on) to the new ensemble (which is
read/write).
- terminate the old standalone zookeeper server instance
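
Two details from the steps above, the myid files and the size of a quorum, can be sketched in a few lines of Python. This is an untested-in-production illustration: write_myid and quorum_size are names invented here, and the demonstration writes into a scratch directory rather than a real dataDir. The myid file itself is just the server's id on a single line inside that server's dataDir, matching the N in its server.N entry.

```python
import os
import tempfile

def write_myid(data_dir, server_id):
    """Write the server id into <data_dir>/myid, as an ensemble member expects."""
    os.makedirs(data_dir, exist_ok=True)
    path = os.path.join(data_dir, "myid")
    with open(path, "w") as f:
        f.write("%d\n" % server_id)
    return path

def quorum_size(ensemble_size):
    """Smallest majority of an ensemble with the given number of voting servers."""
    return ensemble_size // 2 + 1

# Demonstration in a scratch directory instead of a real dataDir.
scratch = tempfile.mkdtemp()
path = write_myid(os.path.join(scratch, "zk1"), 1)
with open(path) as f:
    print(f.read().strip(), quorum_size(3))
```

With three servers the majority is two, which is why two running servers are enough for the quorum to meet; with only two servers configured, both must be up.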

I think that a procedure to increase the replication count of a zookeeper
ensemble would be similar:

- start a new service in zookeeper ensemble. This service should know
about the original servers plus itself.
- for each existing zookeeper service, change the server configurations to
include the new server and restart the service (rolling restart). This
makes the services mutually aware of the new server.
- for each client, change the client configuration to include the new
zookeeper ensemble list and restart that client.

Given all of this, I suggest that the right way to move from a single node
deployment to a highly available deployment is to begin with a zookeeper
ensemble running on the initial node.

- Begin with a single node with 3 zookeeper server instances configured as
an ensemble (there are instructions somewhere for running multiple zk
instances on the same node - the ports need to be specified such that they
do not conflict).
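
As a rough illustration of what non-conflicting ports could look like, the Python sketch below generates zoo.cfg text for three instances on one host. Everything specific in it is an assumption made for the example: the helper name single_host_configs, the use of localhost, the port ranges, and the data directory layout. Each instance would also need its own myid file in its dataDir.

```python
def single_host_configs(n=3, base_client=2181, base_quorum=2888,
                        base_election=3888, data_root="/data/zookeeper"):
    """Build zoo.cfg text for n instances sharing one host, keyed by server id."""
    # Each instance gets its own quorum and election port so nothing collides.
    servers = [
        "server.%d=localhost:%d:%d" % (i, base_quorum + i, base_election + i)
        for i in range(1, n + 1)
    ]
    configs = {}
    for i in range(1, n + 1):
        lines = [
            "tickTime=2000",
            "initLimit=10",
            "syncLimit=5",
            # Separate dataDir (and therefore separate myid) per instance.
            "dataDir=%s/zk%d" % (data_root, i),
            "clientPort=%d" % (base_client + i - 1),
        ] + servers
        configs[i] = "\n".join(lines) + "\n"
    return configs

print(single_host_configs()[2])
```

The server.N lines are identical in all three files; only dataDir and clientPort differ per instance.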

To move from a single node to multiple nodes:

- Configure and start a new zookeeper server instance on another node.  It
should know about 2 of the original instances.
- Rolling reconfigure and restart of the zookeeper services.  The server
instance that is being migrated is terminated rather than being restarted.
- Rolling reconfigure and restart of the zookeeper clients.  On restart,
the client will know about the new zookeeper ensemble locations.

This would leave you with two zookeeper server instances on the original
node and one somewhere else.

You would then repeat that procedure to migrate one of the two remaining
zookeeper server instances to another node.  That would give you one
zookeeper service per node.

You could then follow the procedure to increase the replication count if
you wanted to increase the availability of zookeeper beyond those three
nodes.


I have not tested any of this.  This is just the way I could see it
working based on my understanding of zookeeper.  I am interested in a
procedure for managing this because we have a service that uses zookeeper
to coordinate failover.  We can manage the increase of replication in our
own services and their durable state easily enough, but I am not sure how
to manage this for zookeeper.  All of the above is complicated enough that
it seems it would be easier to begin with three VMs running zookeeper and
then migrate the VMs if necessary, ideally without changing their IPs.

Thanks,
Bryan

[1] http://zookeeper-user.578899.n2.nabble.com/safely-upgrade-single-instance-to-ensemble-td7578716.html
