
Running Zookeeper in 2 machines

erolagnab
Hi,

Just wondering: is it possible to run ZooKeeper on 2 machines so that when one machine dies, ZooKeeper is still up?
I've tried running 3 ZooKeeper instances on each machine, but it doesn't work.

Thanks,

Ero
Re: Running Zookeeper in 2 machines

Cameron McKenzie
You need a minimum of 3 ZK instances to have any redundancy, because when
an instance fails the remaining instances need to be able to form a quorum.
To do this, more than half of the instances need to be able to communicate
with each other.
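
As a rough illustration of the majority rule described above (a sketch only, not ZooKeeper code; the function names `quorum_size` and `tolerated_failures` are made up for this example):

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of instances that is strictly more than half."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many instances can fail while a majority can still be formed."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in range(1, 7):
        print(f"{n} instances: quorum={quorum_size(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

With 2 instances both are needed for a quorum, so nothing can fail; 3 is the smallest ensemble that tolerates the loss of one instance.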

You can certainly run multiple instances on a single host though if that's
what you want, you just need to modify the configuration so that the ports
don't conflict between instances.
cheers
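
A minimal sketch of what "ports don't conflict" could look like when several instances share one host. It only generates config text; the helper name `render_configs`, the port numbering scheme, and the `tickTime`/`initLimit`/`syncLimit` values are assumptions for illustration and should be adapted to your environment:

```python
def render_configs(host: str, instances: int, data_root: str) -> dict:
    """Return {file name: zoo.cfg text} for several instances on one host.

    Each instance gets its own dataDir, clientPort, quorum port and
    election port so that nothing collides on the shared host.
    """
    # The same server list goes into every file; only dataDir and
    # clientPort differ per instance.
    servers = [
        f"server.{i}={host}:{2887 + i}:{3887 + i}"
        for i in range(1, instances + 1)
    ]
    configs = {}
    for i in range(1, instances + 1):
        lines = [
            "tickTime=2000",   # assumed value, not from the thread
            "initLimit=10",    # assumed value, not from the thread
            "syncLimit=5",     # assumed value, not from the thread
            f"dataDir={data_root}/instance{i}",
            f"clientPort={2180 + i}",
            *servers,
        ]
        configs[f"zoo{i}.cfg"] = "\n".join(lines) + "\n"
    return configs


if __name__ == "__main__":
    for name, text in render_configs("machine1", 3, "/var/tmp/zookeeper").items():
        print(f"--- {name} ---")
        print(text)
```

Each instance's data directory would also need a `myid` file whose content matches the number in its `server.N` line.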


On Tue, Nov 5, 2013 at 3:38 PM, erolagnab <[hidden email]> wrote:

> Hi,
>
> Just wondering is it possible to run zookeeper in 2 machine so when one
> machine dies, zookeeper is still up?
> I've tried to run 3 zookeeper instances each machine but it doesn't work.
>
> Thanks,
>
> Ero
Re: Running Zookeeper in 2 machines

erolagnab
I have 2 machines and have tried to set up 3 ZK instances per machine. Configs are below (I have the appropriate myid file in each respective data dir, and both machines use the same configs):

zoo1.cfg
--------
dataDir=/var/tmp/zookeeper/instance1
clientPort=2181
server.11=machine1:2888:3888
server.21=machine2:2888:3888
server.12=machine1:2889:3889
server.22=machine2:2889:3889
server.13=machine1:2890:3890
server.23=machine2:2890:3890

zoo2.cfg
--------
dataDir=/var/tmp/zookeeper/instance2
clientPort=2182
server.11=machine1:2888:3888
server.21=machine2:2888:3888
server.12=machine1:2889:3889
server.22=machine2:2889:3889
server.13=machine1:2890:3890
server.23=machine2:2890:3890

zoo3.cfg
--------
dataDir=/var/tmp/zookeeper/instance3
clientPort=2183
server.11=machine1:2888:3888
server.21=machine2:2888:3888
server.12=machine1:2889:3889
server.22=machine2:2889:3889
server.13=machine1:2890:3890
server.23=machine2:2890:3890

On machine1, I run zkServer.sh start for all 3 configs, but ZK is not able to start. All 3 ZK instances try to open a channel to the machine2 election addresses and not to machine1's.

If I start all 3 ZK instances on machine2, the ZK cluster starts nicely, but if I then shut down all of them on machine2, the ZK cluster dies.
Re: Running Zookeeper in 2 machines

Cameron McKenzie
The servers on machine1 will still try to connect to machine2 to form a
quorum because they cannot form one by themselves: they make up only half
of the cluster, and a quorum needs more than half of the instances (at
least 4 of your 6).


Re: Running Zookeeper in 2 machines

erolagnab
Thanks, I get the idea now. So is it fair to say that a ZK cluster cannot provide any redundancy with only 2 physical machines? If so, is there any workaround?
Re: Running Zookeeper in 2 machines

Cameron McKenzie
I have a similar problem to you. I have more than 2 machines, but only 2
geographically redundant sites.

In your situation, you could get some redundancy by running 2 instances on
one host, and 1 instance on the other host. This would protect you from
temporary network glitches (because the machine with 2 instances can still
form a quorum), and will protect you from failure of the machine with the
single instance. It will not help you if the machine with 2 instances
crashes.
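
To make the trade-off concrete, here is a small sketch that checks which single-host failures each layout can survive. The function name `survives_host_loss` and the layout dictionaries are illustrative only:

```python
def survives_host_loss(layout: dict, lost_host: str) -> bool:
    """True if the instances left after losing one host are a majority."""
    total = sum(layout.values())
    remaining = total - layout[lost_host]
    return remaining > total / 2


if __name__ == "__main__":
    layouts = {
        "3 + 3 instances": {"machine1": 3, "machine2": 3},
        "1 + 2 instances": {"A": 1, "B": 2},
    }
    for name, layout in layouts.items():
        for host in layout:
            ok = survives_host_loss(layout, host)
            print(f"{name}: lose {host} -> quorum still possible: {ok}")
```

With 3 + 3, losing either host leaves exactly half, which is not a majority. With 1 + 2, only the loss of the single-instance host is survivable.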

If the 2-instance machine dies, you can temporarily configure the
1-instance machine to be a single-instance cluster. When the 2-instance
machine has recovered, you can reconfigure the single instance to be part
of the 3-instance cluster again. This process is manual and somewhat
dangerous: if you restart nodes in the wrong order, you can lose data.
This is the approach that I have tested, and it seems to work, but I'd
recommend testing it yourself as well.

Machine A has ZK instance 1
Machine B has ZK instances 2 and 3

1. Machine B dies.
2. Reconfigure ZK instance 1 so that it has only itself in the cluster.
   There is no redundancy at this point, but it can form a quorum because
   it's the only instance in the cluster.
3. Restart ZK instance 1 to pick up the config changes.
4. Fix up Machine B.
5. Reconfigure ZK instance 1 to have ZK instances 2 and 3 in its
   configuration again.
6. Restart ZK instance 1 to pick up the config changes.
7. Start ZK instance 2 on Machine B.
8. Wait for ZK instance 1 on Machine A and ZK instance 2 on Machine B to
   form a quorum. This is vitally important. If you start instance 3
   before a quorum is formed, it is possible that instances 2 and 3 will
   form a quorum with each other, which will cause any updates made via
   instance 1 during the outage of Machine B to be lost.
9. Start ZK instance 3 on Machine B.
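
One possible way to check the "wait for a quorum" step before starting instance 3 is to ask each instance for its mode with the `srvr` four-letter command. This is a sketch under assumptions: the helper names are made up, the host and port are whatever your configs use, and depending on the ZooKeeper version and configuration the four-letter commands may need to be enabled first.

```python
import socket


def parse_mode(srvr_output: str):
    """Extract the value of the 'Mode:' line, or None if it is absent."""
    for line in srvr_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def query_mode(host: str, port: int, timeout: float = 2.0):
    """Send the 'srvr' four-letter command and return the reported mode."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"srvr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return parse_mode(b"".join(chunks).decode("utf-8", errors="replace"))


def in_quorum(mode) -> bool:
    """An instance that reports leader or follower has joined a quorum."""
    return mode in ("leader", "follower")
```

For example, you could poll `query_mode` for instance 1 and instance 2 and start instance 3 only once `in_quorum` is true for both. An instance that is not serving requests will not report a mode, so `parse_mode` returns `None` for it.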

This process should become easier once dynamic reconfiguration is
implemented (in ZK 3.5 I believe?) because restarts won't be required.
cheers
Cam

Re: Running Zookeeper in 2 machines

Alexander Shraer-2
I don't think reconfiguration will help you here as it requires a
quorum of the old and a quorum of the new ensembles, and here you're
missing a quorum of the old one.

The problem is that you may have some committed operations on the B
servers that A doesn't know about (writes are done to a quorum).
Moreover, B may just be slow and may be still operational.
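
The point about committed operations can be illustrated by enumerating the possible majorities of the 3-instance ensemble (instance 1 on Machine A, instances 2 and 3 on Machine B). This is a simplified model, not ZooKeeper's actual protocol, and the function names are hypothetical:

```python
from itertools import combinations


def majorities(instances):
    """All subsets that contain strictly more than half of the instances."""
    need = len(instances) // 2 + 1
    return [
        set(combo)
        for size in range(need, len(instances) + 1)
        for combo in combinations(instances, size)
    ]


def may_miss_commits(instances, survivor) -> bool:
    """True if some majority excludes the survivor, so a write could have
    been committed without the survivor ever seeing it."""
    return any(survivor not in quorum for quorum in majorities(instances))


if __name__ == "__main__":
    print(majorities([1, 2, 3]))
    print(may_miss_commits([1, 2, 3], survivor=1))
```

Because `{2, 3}` is a majority on its own, a write acknowledged only by the instances on Machine B counts as committed even though instance 1 has not seen it.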

To solve the problem, I think you need either a tie breaker, a reliable
failure detection mechanism (such as when you do this manually because
you are sure that B is down), or some kind of stronger synchrony
assumption (e.g., if A hasn't heard from B for 3 seconds, then B has
crashed). ZK doesn't make such assumptions, so that it stays robust to
network delays.

Since this scenario seems very common, it may be interesting to
implement some kind of tie-breaker quorum system in ZooKeeper.

Alex

Re: Running Zookeeper in 2 machines

Cameron McKenzie
Yes, I guess in this situation there's no guarantee that A has the latest
data. I think that this is just an inherent limitation of the quorum based
writes though. Unless you have three separate machines at geographically
redundant sites, I don't think that you have true redundancy.
cheers
Cam


Re: Running Zookeeper in 2 machines

Alexander Shraer-2
Yes, and more than that: it's theoretically impossible to tolerate the
failure of half or more of the servers in a completely asynchronous
environment if you want to provide strongly consistent replicated
storage. This is similar to the CAP theorem: when there is a partition,
ZK chooses C over A.

