Zookeeper client reverse lookup issue


Ben Wood
Hey Folks,

My team and I are working on a containerized ZooKeeper service on top of
DC/OS. We're running into an issue with Kerberos in the following scenario.

Simplified, we have a ZK server with the DNS address zk-server.dcos (i.e.
the DNS address of the ZK task) and the actual hostname zk-server.aws
(shortened here, but really a standard resolvable AWS private DNS address),
and a Kafka broker, kafka.dcos.

We can easily set up our ZooKeeper and Kafka services to work together
until we try to enable Kerberos. ZK itself works just fine with Kerberos,
but the Kafka broker is not able to connect to the ZK server:

0. kafka.dcos is started with a ZK server list of zk-server.dcos.
1. kafka.dcos starts up, initializing its ZK client.
2. kafka.dcos then attempts to retrieve a ticket from the KDC in order to
talk to zk-server.aws; however, the only ZK principal known to the KDC is
the one for zk-server.dcos.
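
The mismatch in step 2 can be sketched roughly as follows. This is only an illustration, not ZooKeeper or Kerberos code: the "zookeeper" service name, the omitted realm, and the in-memory set standing in for the KDC are all assumptions made for the example.

```java
import java.util.Collections;
import java.util.Set;

public class PrincipalMismatchSketch {

    // Composes a Kerberos-style service principal of the form "service/host".
    // The realm part is left out to keep the sketch short.
    public static String servicePrincipal(String service, String host) {
        return service + "/" + host;
    }

    public static void main(String[] args) {
        // Hypothetical KDC contents: only the DC/OS name has a principal.
        Set<String> kdcPrincipals =
                Collections.singleton("zookeeper/zk-server.dcos");

        // Name the client was configured with (step 0).
        String configured = servicePrincipal("zookeeper", "zk-server.dcos");
        // Name the client ends up asking a ticket for (step 2).
        String requested = servicePrincipal("zookeeper", "zk-server.aws");

        System.out.println(configured + " in KDC: "
                + kdcPrincipals.contains(configured));
        System.out.println(requested + " in KDC: "
                + kdcPrincipals.contains(requested));
    }
}
```

Under these assumptions the configured name is found and the requested name is not, which is the failure described in this message.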

From reading the source (
https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/client/StaticHostProvider.java#L112),
it appears that the ZK client is winding up with the actual hostname of the
ZK server.

I'm new to the codebase: is this because of a client-side reverse lookup, or
because the ZK server is telling the client about its hostname? It appears
to be the former.

Thanks!
Ben
Reply | Threaded
Open this post in threaded view
|

Re: Zookeeper client reverse lookup issue

Abraham Fine
Hi Ben-

What version of ZooKeeper are you using? In my testing it looks like 3.4
does a reverse lookup when creating the server principal
(https://github.com/apache/zookeeper/blob/branch-3.4/src/java/main/org/apache/zookeeper/ClientCnxn.java#L1011)
but 3.5/master do not
(https://github.com/apache/zookeeper/blob/branch-3.5/src/java/main/org/apache/zookeeper/ClientCnxn.java#L1104).
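
As a rough illustration of the difference (not the actual ZooKeeper code in either branch), the JDK offers two ways to get a host name out of an InetSocketAddress: getHostName() may ask the resolver for a name when the address only carries an IP, while getHostString() returns whatever is already stored without a lookup. The class and method names below are hypothetical, and 10.0.0.5 is a made-up address.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

public class HostNameSketch {

    // Returns the host exactly as stored in the address, without touching DNS.
    public static String configuredHost(InetSocketAddress addr) {
        return addr.getHostString();
    }

    // May trigger a reverse DNS lookup if the address holds only an IP.
    public static String possiblyReverseResolvedHost(InetSocketAddress addr) {
        return addr.getHostName();
    }

    public static void main(String[] args) throws UnknownHostException {
        // Unresolved address: keeps the configured name as-is.
        InetSocketAddress byName =
                InetSocketAddress.createUnresolved("zk-server.dcos", 2181);
        System.out.println(configuredHost(byName));

        // Address built from raw IP bytes: no host name is attached to it.
        InetAddress ipOnly = InetAddress.getByAddress(new byte[] {10, 0, 0, 5});
        InetSocketAddress byIp = new InetSocketAddress(ipOnly, 2181);
        // getHostString() falls back to the literal IP instead of asking DNS.
        System.out.println(configuredHost(byIp));

        // possiblyReverseResolvedHost(byIp) is deliberately not called here,
        // so the example stays offline; on a real network it could return
        // whatever name the resolver maps 10.0.0.5 back to.
    }
}
```

If the name used for the server principal comes from a call like the second method, it can differ from the name in the connect string, which matches the symptom Ben describes.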

Let me know if that helps.

Thanks,
Abe

On Fri, Nov 17, 2017, at 12:01, Ben Wood wrote:

> Hey Folks,
>
> My team and I are working on a containerized Zookeeper service on top of
> DC/OS. We're running into an issue with Kerberos in the following
> scenario.
>
> Simplified, we have a zk server with the DNS address zk-server.dcos (e.g.
> the dns address of the ZK task) and actual hostname zk-server.aws
> (Shortened here, but really a standard resolvable AWS private dns
> address)
> and a kafka broker, kafka.dcos.
>
> We can easily setup our Zookeeper and Kafka services to work together,
> until we try to enable Kerberos. ZK itself works just fine with Kerberos,
> but the Kafka broker is not able to connect to the ZK server:
>
> 0. kafka.dcos is started with a zk server list of zk-server.dcos.
> 1. kafka.dcos starts up, initializing its ZK client.
> 2. kafka.dcos then attempts to retrieve a ticket from the KDC in order to
> talk to zk-server.aws, however the only zk principal known to the kdc is
> zk-server.dcos.
>
> From reading the source (
> https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/client/StaticHostProvider.java#L112)
> it appears that the zk client is winding up with the actual hostname of
> the
> ZK server.
>
> Being new to the codebase, is this because of a client reverse lookup? Or
> because the zk server is telling the client about its hostname? It
> appears
> to be the former.
>
> Thanks!
> Ben

Re: Zookeeper client reverse lookup issue

Abraham Fine
This change occurred due to
https://issues.apache.org/jira/browse/ZOOKEEPER-2171

On Fri, Nov 17, 2017, at 15:10, Abraham Fine wrote:

> Hi Ben-
>
> What version of ZooKeeper are you using? In my testing it looks like 3.4
> does a reverse lookup when creating the server principal
> (https://github.com/apache/zookeeper/blob/branch-3.4/src/java/main/org/apache/zookeeper/ClientCnxn.java#L1011)
> but 3.5/master do not
> (https://github.com/apache/zookeeper/blob/branch-3.5/src/java/main/org/apache/zookeeper/ClientCnxn.java#L1104).
>
> Let me know if that helps.
>
> Thanks,
> Abe

Re: Zookeeper client reverse lookup issue

Ben Wood
Hey Abraham,

We've been using 3.4.11.

That's great news that the reverse lookup has already been taken out in 3.5!
We had found another ticket, https://issues.apache.org/jira/browse/ZOOKEEPER-2858,
that suggested there were some workarounds, but hadn't found that one (and
hadn't had a chance to test 3.5).

This raises an interesting question for us as admitted ZK neophytes: is ZK
3.5 "safe" to use in production?

On Fri, Nov 17, 2017 at 3:11 PM, Abraham Fine <[hidden email]> wrote:

> This change occurred due to
> https://issues.apache.org/jira/browse/ZOOKEEPER-2171



--
Ben Wood
Software Engineer - Data Agility
Mesosphere

Re: Zookeeper client reverse lookup issue

Abraham Fine
Hi Ben-

Unfortunately 3.5 is still in beta and not recommended for production.

Thanks,
Abe

On Mon, Nov 20, 2017, at 11:13, Ben Wood wrote:

> Hey Abraham,
>
> We've been using 3.4.11.
>
> That's great news the reverse lookup has already been taken out in 3.5!
> We
> found another ticket https://issues.apache.org/jira/browse/ZOOKEEPER-2858
> that suggested there were some workarounds but hadn't found that one (and
> hadn't had a chance to test 3.5).
>
> This raises an interesting question for us, admitted ZK neophytes, is ZK
> 3.5 "safe" to use in production?

Re: Zookeeper client reverse lookup issue

Ben Wood
Do you think a patch targeting 3.4.11 or 3.4.12 that backports this lookup
change (as minimally as possible) would be accepted?

For running ZK in containers (yes, slightly terrifying), it's not
_necessary_, but it would make things a lot easier!

On Mon, Nov 20, 2017 at 12:34 PM, Abraham Fine <[hidden email]> wrote:

> Hi Ben-
>
> Unfortunately 3.5 is still in beta and not recommended for production.
>
> Thanks,
> Abe

Re: Zookeeper client reverse lookup issue

Abraham Fine
I don't think a backport would be possible here (since existing
deployments may break when upgraded within the 3.4 line). If a fix can
be created without changing existing ZooKeeper behavior by default, that
would be much more likely to be considered, perhaps in the form of a
configuration switch of some kind.

In short, we need to consider this from two angles: 3.4 is broken because
this is unexpected behavior, and 3.5 is broken because ZOOKEEPER-2171
broke backwards compatibility. So I think we should add a switch with
opposite default values for 3.4 and 3.5.

Would you be interested in submitting a patch for this, Ben?

Thanks,
Abe

On Mon, Nov 20, 2017, at 15:11, Ben Wood wrote:

> Do you think a patch targeting 3.4.11 or 3.4.12 that backports this
> lookup
> change (as minimally as possible) would be accepted?
>
> In order to do ZK in containers (yes, slightly terrifying), it's not
> _necessary_ but a lot easier!

Re: Zookeeper client reverse lookup issue

Ben Wood
Definitely.

Can you point me in the direction of similar configuration gating?

On Tue, Nov 21, 2017 at 2:18 PM, Abraham Fine <[hidden email]> wrote:

> I don't think a backport would be possible here (since existing
> deployments may break when upgraded within the 3.4 line). If a fix can
> be created without changing existing ZooKeeper behavior by default that
> would be much more likely to be considered. Perhaps in the form of a
> configuration switch of some kind.
>
> In short we need to consider this from two angles. 3.4 is broken because
> this is unexpected behavior and 3.5 is broken because ZOOKEEPER-2171
> broke backwards compatibility. So I think we should add in a switch with
> opposite default values for 3.4 and 3.5.
>
> Would you be interested in submitting a patch for this Ben?
>
> Thanks,
> Abe

Re: Zookeeper client reverse lookup issue

Abraham Fine
Hi Ben-

I'm not completely sure what you mean by "similar configuration gating",
but I think a system property with a boolean switch would be the
appropriate way of handling this. Take a look at "zookeeper.forceSync".
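
A minimal sketch of what such a switch could look like, assuming a boolean system property whose default differs per branch. The property name below is invented for the example and is not an existing ZooKeeper option; the class is likewise hypothetical.

```java
public class ReverseLookupSwitchSketch {

    // Hypothetical property name, made up for this sketch.
    public static final String PROPERTY =
            "zookeeper.client.example.reverseLookup";

    // Reads the switch. 'branchDefault' is the value used when the property
    // is not set: e.g. true on a branch that must keep the reverse lookup
    // for compatibility, false on a branch that already dropped it.
    public static boolean reverseLookupEnabled(boolean branchDefault) {
        String value = System.getProperty(PROPERTY);
        if (value == null) {
            return branchDefault;
        }
        return Boolean.parseBoolean(value);
    }

    public static void main(String[] args) {
        // With the property unset, each branch keeps its own default.
        System.out.println("3.4-style default: " + reverseLookupEnabled(true));
        System.out.println("3.5-style default: " + reverseLookupEnabled(false));

        // An operator can override the default, e.g. via
        // -Dzookeeper.client.example.reverseLookup=false on the command line.
        System.setProperty(PROPERTY, "false");
        System.out.println("overridden: " + reverseLookupEnabled(true));
    }
}
```

Keeping the default as a parameter means the same code could ship on both branches with only the default flipped, so existing deployments see no change unless they set the property.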

Hopefully this helps.

Thanks,
Abe

On Tue, Nov 21, 2017, at 20:06, Ben Wood wrote:

> Definitely.
>
> Can you point me in the direction of similar configuration gating?

Re: Zookeeper client reverse lookup issue

Ben Wood
I just meant somewhere similar that's using a flag, so that's perfect!

Thanks!