Linking two sites via two Zookeeper instances

Linking two sites via two Zookeeper instances

Christian Schuhegger
Hello,

Up to now I have not worked with ZooKeeper itself and have only been
reading documentation. From the paper "ZooKeeper: Wait-free coordination
for Internet-scale systems" I understand that ZooKeeper uses a
single-writer (the leader) approach for a ZooKeeper cluster, i.e. all
writes go through the leader.

 From the documentation about Observers:
http://zookeeper.apache.org/doc/trunk/zookeeperObservers.html
I understand that "Observers have other advantages. Because they do not
vote, they are not a critical part of the ZooKeeper ensemble. Therefore
they can fail, or be disconnected from the cluster, without harming the
availability of the ZooKeeper service. The benefit to the user is that
Observers may connect over less reliable network links than Followers.
In fact, Observers may be used to talk to a ZooKeeper server from
another data center."

The use case I have in mind is to use ZooKeeper within one data center
and synchronize data via the Observer mechanism to another data center
halfway around the world. I would need to do the same symmetrically in
the other direction.

I could imagine doing this by setting up two independent ZooKeeper
clusters: one with its leader and other voters in data center A and its
Observers in data center B, and another with its leader and other voters
in data center B and its Observers in data center A.
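
The two-cluster layout described above could be written down roughly as follows. This is a minimal sketch, not a tested deployment: every host name and server id is hypothetical, and the timing values are placeholders that would need tuning for a 300 ms link. It relies only on the documented Observer configuration, where an observer sets peerType=observer in its own zoo.cfg and every server lists it with an :observer suffix.

```python
# Sketch only: host names, server ids and timing values are hypothetical.

# Ensemble A: leader and voters in data center A, observer in data center B.
ENSEMBLE_A = {
    "voters": {
        1: "zk-a1.dc-a.example",
        2: "zk-a2.dc-a.example",
        3: "zk-a3.dc-a.example",
    },
    "observers": {4: "zk-a-obs1.dc-b.example"},
}

# Ensemble B: the mirror image, voters in data center B, observer in A.
ENSEMBLE_B = {
    "voters": {
        1: "zk-b1.dc-b.example",
        2: "zk-b2.dc-b.example",
        3: "zk-b3.dc-b.example",
    },
    "observers": {4: "zk-b-obs1.dc-a.example"},
}


def render_zoo_cfg(ensemble, my_id):
    """Return the zoo.cfg text for server `my_id` of one ensemble."""
    lines = [
        "tickTime=2000",   # placeholder values, not tuned for a WAN link
        "initLimit=10",
        "syncLimit=5",
        "dataDir=/var/lib/zookeeper",
        "clientPort=2181",
    ]
    # Only the observer itself declares its peer type.
    if my_id in ensemble["observers"]:
        lines.append("peerType=observer")
    # Every server, voter or observer, carries the full server list.
    for sid, host in sorted(ensemble["voters"].items()):
        lines.append(f"server.{sid}={host}:2888:3888")
    for sid, host in sorted(ensemble["observers"].items()):
        lines.append(f"server.{sid}={host}:2888:3888:observer")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Config for the observer of ensemble A, which runs in data center B.
    print(render_zoo_cfg(ENSEMBLE_A, 4))
```

Each ensemble is rendered separately, so the maintenance overhead mentioned in this post shows up as two server lists that have to be kept in step by hand.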

This would mean an increased maintenance overhead, I believe.

My question is whether it is somehow possible to do this with a single
ZooKeeper cluster purely by configuration, e.g. by defining that all
writes that go to znode /A and its children (e.g. /A/a, /A/b, ...) are
handled by a group of voters in data center A and all writes that go to
znode /B and its children are handled by a group of voters in data center
B. All ZooKeeper servers would be at least observers of all znodes in the
cluster, i.e. the group of servers that is not a voter for a given
top-level node would at least be an observer of that top-level node.

I would be interested in extending this concept to more than two data
centers with ping times between the data centers on the order of 300 ms.

Many thanks and best regards,
--
Christian Schuhegger



Re: Linking two sites via two Zookeeper instances

Alexander Shraer-2
Hi Christian,

I don't think this is currently possible. I believe there has been some
work on building a hierarchy of ZooKeeper clusters @ Facebook, but I don't
know the details. I don't believe that this would mean less management
overhead though, since you'd still need several voting servers in each
datacenter.

But I actually wanted to ask you about your use case. Do you have
consistency requirements among data items mastered in different
datacenters? For example, do you require that all clients (no matter where
they are) see changes to /A/* and /B/* in the same order? Could you share
some more details? Or, let's say you have 3 datacenters, one mastering
/A/*, another /B/* and the third /C/*. Suppose that the first datacenter
sees a change to /C/x and afterwards /A/y is updated. Is it possible that
someone in datacenter B sees the new /A/y before the new /C/x?

The reason I'm asking is that some time ago others and I made this
initial proposal:
http://wiki.apache.org/hadoop/ZooKeeper/MountRemoteZookeeper
which didn't get enough support for lack of a compelling use case (among
other things).

Thanks,
Alex

On Sun, Jan 27, 2013 at 5:56 AM, Christian Schuhegger <
[hidden email]> wrote:

> Hello,
>
> up to now I did not work with Zookeeper itself and am only reading
> documentation. From the document "ZooKeeper: Wait-free coordination for
> Internet-scale systems" I understand that ZooKeeper uses a single writer
> (the leader) approach for a ZooKeeper cluster, e.g. all writes go through
> the leader.
>
> From the documentation about Observers:
> http://zookeeper.apache.org/doc/trunk/zookeeperObservers.html
> I understand that "Observers have other advantages. Because they do not
> vote, they are not a critical part of the ZooKeeper ensemble. Therefore
> they can fail, or be disconnected from the cluster, without harming the
> availability of the ZooKeeper service. The benefit to the user is that
> Observers may connect over less reliable network links than Followers. In
> fact, Observers may be used to talk to a ZooKeeper server from another data
> center."
>
> The use-case I have in mind is to use ZooKeeper within one data-center and
> synchronize data via the Observer mechanism to another data center half way
> around the world. I would need to do the same symmetrically the other way
> round.
>
> I could imagine to do this by setting up two independent ZooKeeper
> clusters, one which has the leader and the other voters in data center A
> and its Observers in data center B and another ZooKeeper cluster which has
> the leader and the other voters in data center B and its Observers in data
> center A.
>
> This would mean an increased maintenance overhead, I believe.
>
> My question is now if it is somehow possible to do this with one ZooKeeper
> cluster only by configuration, e.g. defining that all writes that go to
> znode /A and its children (e.g. /A/a, /A/b, ...) are handled by a group of
> voters in data center A and all writes that go to znode /B and its children
> are handled by a group of voters in data center B. All ZooKeeper servers
> would be at least observers of all znodes in the cluster, e.g. the group of
> ZooKeeper servers that is not voter for a given top-level node would at
> least be observer of that top-level node.
>
> I would be interested to extend this concept to more than two data centers
> with ping times between the data centers in the order of 300ms.
>
> Many thanks and best regards,
> --
> Christian Schuhegger
>
>
>

Re: Linking two sites via two Zookeeper instances

Jordan Zimmerman-3
In reply to this post by Christian Schuhegger
IMO ZooKeeper is not well suited to a cross-data-center scenario. Instead, have single-data-center islands and use the methodology described by Camille Fournier in this article (http://whilefalse.blogspot.com/2012/12/building-global-highly-available.html).

-Jordan


Re: Linking two sites via two Zookeeper instances

Christian Schuhegger
In reply to this post by Christian Schuhegger
Hi Alexander,

Alexander Shraer wrote:
> I don't think this is currently possible. I believe there has been some
> work on building a hierarchy of ZooKeeper clusters @ Facebook, but I don't
> know the details. I don't believe that this would mean less management
> overhead though, since you'd still need several voting servers in each
> datacenter.

ok, I understand.

> But I actually wanted to ask you about your usecase. Do you have
> consistency requirements among data items mastered in different datacenters
> ? For example - do you require that all clients (no matter where they are)
> see changes to /A/* and /B/* in the same order ? could you share some more
> details ? or, lets say you have 3 datacenters, one mastering /A/* another
> /B/* and the third /C/*. Suppose that the first datacenter sees a change to
> /C/x and afterwards /A/y is updated. Is it possible that someone in
> datacenter B sees the new /A/y  before the new /C/x  ?

The two things that you might need in a distributed set-up are agreement
and/or order. Agreement means that all participants in the distributed
set-up get ALL updates, and order means that they get all updates in the
same sequential order.

ZooKeeper implements both.

For several of my use cases, agreement and order are required within
one data center, because we structure (shard) our services and user
groups in such a way that users who need both agreement and order access
services within one data center. Across data centers I would only need
agreement. It would be nice to have agreement and order across data
centers, but because of latency requirements I guess this would be
prohibitively expensive.

Now to your question: yes, it would be fine if clients saw /C/x and
/A/y in a different order.
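
The distinction between agreement and order drawn above can be illustrated with a small simulation. It is a sketch under assumptions, not ZooKeeper code: each data center is assumed to master a disjoint set of paths and to ship its own updates in order, while a remote replica may interleave the streams from different data centers arbitrarily. The update values are made up for illustration.

```python
# Sketch only: a toy model of per-data-center update logs, not ZooKeeper code.

def interleavings(a, b):
    """Yield every merge of lists a and b that keeps each list's own order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def apply_updates(updates):
    """Apply (path, value) updates in sequence and return the final state."""
    state = {}
    for path, value in updates:
        state[path] = value
    return state


# Hypothetical updates: data center A masters /A/*, data center C masters /C/*.
log_a = [("/A/y", 1), ("/A/y", 2)]
log_c = [("/C/x", 1)]

# Every order in which a replica in data center B might see the two streams.
merges = list(interleavings(log_a, log_c))

# Agreement: every replica ends in the same state, whatever the interleaving.
final_states = {tuple(sorted(apply_updates(m).items())) for m in merges}

if __name__ == "__main__":
    for m in merges:
        print(m)
    print("distinct final states:", len(final_states))
```

Because the two logs touch disjoint paths, all interleavings converge to one final state even though replicas may pass through different intermediate states, which is the "agreement without order" behaviour accepted here for cross-data-center traffic.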

> The reason I'm asking is that some time in the past me and others made this
> initial proposal:
> http://wiki.apache.org/hadoop/ZooKeeper/MountRemoteZookeeper
> which didn't get enough support for lack of a compelling use-case (among
> other things).

It would be nice if ZooKeeper offered agreement and order in a more
granular fashion. I could imagine that write throughput could benefit
even within one data center if you have a use case that only needs
agreement but still pays for order, i.e. all writes are threaded
through a single writer.

Thanks for your thoughts!
--
Christian Schuhegger



Re: Linking two sites via two Zookeeper instances

Christian Schuhegger
In reply to this post by Christian Schuhegger
Jordan Zimmerman wrote:
> IMO ZooKeeper is not well suited to a cross-data center scenario. Instead, have single data center islands and use methodology described by Camille Fournier in this article (http://whilefalse.blogspot.com/2012/12/building-global-highly-available.html).

Thanks for the link!

My current picture of how to do it is very similar to the one that
Camille described, except that I would only need an Observer node in a
remote data center, i.e. this node would only be there to speed up read
access through local caching.

In addition, I would not need one cluster that guarantees agreement and
order across all data centers. Within one data center I need agreement
and order, but across data centers I only need agreement, and I can live
with a delay between the original write in one data center and the
availability of that write in the remote data center.

I can solve this use case by putting a ZooKeeper cluster in each data
center with observer node(s) in the remote data centers. I only thought
that it would reduce the maintenance overhead if I could realize this
use case via one ZooKeeper cluster and configuration instead of setting
up several of them.
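
With one cluster per data center, clients have to pick the right cluster for each path themselves. The following is a minimal sketch of such a lookup, assuming that the top-level znode name identifies the mastering cluster and that each cluster exposes observers in the remote data center. All host names, data center names and the helper itself are hypothetical; only the comma-separated host:port connect-string format is the one ZooKeeper clients accept.

```python
# Sketch only: host names and the routing rule are assumptions.

# cluster name -> {data center -> client endpoints of that cluster located there}
CLUSTER_ENDPOINTS = {
    "A": {
        "dc-a": ["zk-a1.dc-a.example:2181", "zk-a2.dc-a.example:2181",
                 "zk-a3.dc-a.example:2181"],          # voters
        "dc-b": ["zk-a-obs1.dc-b.example:2181"],      # observer
    },
    "B": {
        "dc-b": ["zk-b1.dc-b.example:2181", "zk-b2.dc-b.example:2181",
                 "zk-b3.dc-b.example:2181"],          # voters
        "dc-a": ["zk-b-obs1.dc-a.example:2181"],      # observer
    },
}


def mastering_cluster(path):
    """Map a znode path such as /A/a to the cluster that masters it."""
    if not path.startswith("/"):
        raise ValueError(f"not an absolute znode path: {path!r}")
    top_level = path.split("/")[1]
    if top_level not in CLUSTER_ENDPOINTS:
        raise ValueError(f"no cluster masters {path!r}")
    return top_level


def connect_string(path, local_dc):
    """Return the connect string of the owning cluster's servers in local_dc."""
    servers = CLUSTER_ENDPOINTS[mastering_cluster(path)][local_dc]
    return ",".join(servers)


if __name__ == "__main__":
    # A client in data center B reading /A/a talks to cluster A's local observer.
    print(connect_string("/A/a", "dc-b"))
```

A client would keep one session per cluster and use this lookup to choose between them, so reads of remote data stay local while writes still end up at the mastering cluster's leader.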

--
Christian Schuhegger

