Zookeeper session expiration

Anthony Shaya
Hi,

I have a question about ZooKeeper sessions (within a cluster). On our production servers, some of our clients lose their connection to ZooKeeper, and then our client application goes down (we automatically bring it back up, and it reconnects to ZK just fine). From the investigation so far, it appears to be a session expiration failure.

2017-09-14 09:26:11,516 org.apache.zookeeper.ClientCnxn INFO [main-SendThread(***:2181)] Opening socket connection to server *** /***:2181. Will not attempt to authenticate using SASL (unknown error)
2017-09-14 09:26:11,517 org.apache.zookeeper.ClientCnxn INFO [main-SendThread(***:2181)] Socket connection established to *** /***:2181, initiating session
2017-09-14 09:26:11,519 org.apache.zookeeper.ClientCnxn DEBUG [main-SendThread(***:2181)] Session establishment request sent on *** /***:2181
2017-09-14 09:26:11,520 org.apache.zookeeper.ClientCnxn INFO [main-SendThread(***:2181)] Unable to reconnect to ZooKeeper service, session 0x3256a2e5b6090079 has expired, closing socket connection


My question is about how session expiration works. I noticed that the clocks on many of the client machines were off (by anywhere from 1 to 20 minutes; this was corrected once discovered, though I haven't fully verified the fix yet). Can this directly affect session expiration within the ZooKeeper cluster?


  *   I read the following in https://wiki.apache.org/hadoop/ZooKeeper/FAQ : "Expirations happens when the cluster does not hear from the client within the specified session timeout period (i.e. no heartbeat)." So if the clocks were wrong across the machines, is it possible that one of the clients effectively sent a heartbeat "in the past" (not sure about this, to be honest), after which the cluster expired the session?


  *   I don't have the zookeeper node log for the above time to see what was going on in zookeeper when the cluster determined the session expired.

  *   Is there any additional logging I can turn on to troubleshoot zk session expiration issues?



Thanks!



This message is intended exclusively for the individual or entity to which it is addressed. This communication may contain information that is proprietary, privileged, confidential or otherwise legally exempt from disclosure. If you are not the named addressee, or have been inadvertently and erroneously referenced in the address line, you are not authorized to read, print, retain, copy or disseminate this message or any part of it. If you have received this message in error, please notify the sender immediately by e-mail and delete all copies of the message. (ID m031214)

Re: Zookeeper session expiration

Shawn Heisey
On 12/4/2017 8:22 AM, Anthony Shaya wrote:
> My question is related to how session expiration works, I noticed on many of the client machines the times across these machines were all off (by anywhere from 1 minute to 20 minutes - which was resolved after discovery - haven't verified this completely yet). Can this directly affect session expiration within the zookeeper cluster?
>
>    *   I read the following in https://wiki.apache.org/hadoop/ZooKeeper/FAQ , "Expirations happens when the cluster does not hear from the client within the specified session timeout period (i.e. no heartbeat).". So in some case it seems like if the times were wrong across the machines its possible one of the clients could of effectively sent a heart beat in the past (not sure about this tbh) and then the cluster expires the session?

I make these comments without any knowledge of what ZK code actually
does.  I am a member of this list because I'm a representative of the
Apache Solr project, which uses the ZK client in order to maintain a
cluster.

IMHO, any software which makes actual decisions based on the timestamps
in messages from another system is badly designed.  I would hope that
the ZK designers know this, and always make any decisions related to
time using the clock in the local system only.

If ZK's designers did the right thing, then a session timeout would
indicate that quite literally no heartbeats were received in X seconds,
as measured by the local clock, and the local clock ONLY ... NOT from
timestamp information received from another system.

Although such a lack of communication could be caused by any number of
things, including network hardware failure, one of the most common
reasons I have seen for problems like this is extreme Java garbage
collection pauses in the client software.

Situations where the heap is a little bit too small can cause a Java
program to basically be doing garbage collection constantly, so it
doesn't have much time to do anything else, like send heartbeats to ZK
servers.

Situations where the heap is HUGE and garbage collection is not well
tuned can lead to pauses of a minute or longer while Java does a massive
full GC.
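
One rough way to check from inside the client whether stalls like these
are happening is sketched below. It is a minimal illustration, assuming
a hypothetical helper class named PauseMonitor (it is not part of
ZooKeeper or Solr). It sleeps for a fixed interval and measures, with
System.nanoTime(), how much longer than that the sleep really took. A
large overshoot suggests the whole JVM (or the host) stalled, though it
cannot say whether GC was the cause.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: measures how far a timed sleep overshoots its target. */
final class PauseMonitor {
    private final long intervalMs;
    private final long warnThresholdMs;

    PauseMonitor(long intervalMs, long warnThresholdMs) {
        this.intervalMs = intervalMs;
        this.warnThresholdMs = warnThresholdMs;
    }

    /** Extra delay in ms beyond the expected interval; never negative. */
    static long overshootMs(long startNanos, long endNanos, long expectedMs) {
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos);
        return Math.max(0L, elapsedMs - expectedMs);
    }

    /** Sleeps once and returns the observed overshoot in ms. */
    long sampleOnce() throws InterruptedException {
        long start = System.nanoTime();
        Thread.sleep(intervalMs);
        return overshootMs(start, System.nanoTime(), intervalMs);
    }

    /** Runs until interrupted, printing a warning for each long stall. */
    void runForever() {
        try {
            while (true) {
                long extra = sampleOnce();
                if (extra >= warnThresholdMs) {
                    System.err.println("Possible GC or host pause: ~" + extra + " ms");
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // Interval and threshold are arbitrary example values.
        PauseMonitor monitor = new PauseMonitor(500, 1000);
        Thread t = new Thread(monitor::runForever, "pause-monitor");
        t.setDaemon(true); // do not keep the JVM alive just for monitoring
        t.start();
        // ... application work would go here ...
    }
}
```

If warnings from a monitor like this line up with the times of the
session expirations, that points at the client side rather than at the
ZK servers.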

>    *   I don't have the zookeeper node log for the above time to see what was going on in zookeeper when the cluster determined the session expired.
>
>    *   Is there any additional logging I can turn on to troubleshoot zk session expiration issues?

Hopefully your ZK clients also have logging.  Failing that, you could
turn on GC logging for the software with the ZK client (assuming it's a
Java client) and find a program or website that can examine the log and
give you statistics or a graph of GC pauses.
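
If GC log files are awkward to get at, the JVM also exposes cumulative
GC counts and times through the standard java.lang.management API. The
sketch below is only an illustration (the class name GcStats is made
up). Note that it reports approximate totals per collector, not the
length of individual pauses, so it complements GC logging rather than
replacing it.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: summarizes cumulative GC activity of this JVM. */
final class GcStats {

    /** Approximate accumulated collection time in ms across all collectors. */
    static long totalGcTimeMs() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when undefined for a collector
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** One line per collector, followed by the overall total. */
    static String report() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": count=").append(gc.getCollectionCount())
              .append(", timeMs=").append(gc.getCollectionTime())
              .append(System.lineSeparator());
        }
        sb.append("total GC time ms: ").append(totalGcTimeMs());
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

Logging the report periodically and looking at the difference between
two samples shows how much time went to collection in that window. GC
logging itself is enabled with JVM flags such as -Xloggc:<file> together
with -XX:+PrintGCDetails on Java 8, or -Xlog:gc* on Java 9 and later.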

If there is a problem in software using the client and whatever logging
is available doesn't help you figure out what's wrong, you're generally
going to need to talk to whoever wrote that software for help
troubleshooting it.

Thanks,
Shawn

RE: Zookeeper session expiration

Anthony Shaya
Thanks Shawn, should I message the developer mailing list for a more definitive answer?

Thanks again for the reply.


Re: Zookeeper session expiration

Shawn Heisey
On 12/4/2017 12:51 PM, Anthony Shaya wrote:
> Thanks Shawn, should I message the developer mailing list for a more definitive answer?

The ZK dev list is for discussion around the development of ZK itself,
NOT for development of software that uses ZK.  For the latter kind of
development, you want THIS list.

If you're talking about the dev list for whatever software is using the
ZK client, then that would be the right place to go.

Although a bug in ZK is always possible, I don't think it's very likely
for the session timeouts you are seeing.  Even if it does turn out to be
a bug in ZK, this list would still be the correct place to discuss it,
and further action would then be taken as an issue in Jira.

For most usages, there will be at least three ZK servers, and each
client will know about all of them.  If there are no problems on the
client side, then the client would only lose connectivity to one of the
servers and would be able to communicate with the others.  If there ARE
problems on the client side, then it would probably lose connection with
all the servers at nearly the same time.

Thanks,
Shawn

Re: Zookeeper session expiration

Abraham Fine
In reply to this post by Anthony Shaya
Hello Anthony and Shawn-

To the best of my knowledge ZooKeeper does not use the "wall clock" time
anywhere. So that should not be the problem.

Please consider enabling debug logging, which should allow you to track
the "pings".

Thanks,
Abe

On Mon, Dec 4, 2017, at 11:51, Anthony Shaya wrote:

> Thanks Shawn, should I message the developer mailing list for a more
> definitive answer?
>
> Thanks again for the reply.

Re: Zookeeper session expiration

Jordan Zimmerman-3
ZooKeeper, indeed, does not use wall clock time. It uses System.nanoTime() for most operations. Further, all operations go through the Leader node so only the Leader's notion of time matters. The Leader manages the session via a "SessionTracker" instance. The code is in SessionTrackerImpl.java. There is a sessionExpiryQueue which is a kind of priority queue that returns expired sessions based on System.nanoTime().
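
To make the idea concrete, here is a simplified sketch of expiry tracking on a monotonic clock. It is not ZooKeeper's SessionTrackerImpl: the class MonotonicSessionTracker and its methods are invented for illustration, and it uses plain maps rather than a bucketed expiry queue. The point is only that every deadline is computed from System.nanoTime()-style readings taken on one machine, so the clients' wall clocks never enter the calculation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Illustrative only: tracks session deadlines using a monotonic clock. */
final class MonotonicSessionTracker {
    private final LongSupplier nanoClock; // e.g. System::nanoTime
    private final Map<Long, Long> deadlineNanos = new HashMap<>();
    private final Map<Long, Long> timeoutNanos = new HashMap<>();

    MonotonicSessionTracker(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
    }

    /** Registers a session and starts its timeout window. */
    void addSession(long sessionId, long timeoutMs) {
        timeoutNanos.put(sessionId, TimeUnit.MILLISECONDS.toNanos(timeoutMs));
        touch(sessionId);
    }

    /** Called for each heartbeat; returns false for unknown or expired sessions. */
    boolean touch(long sessionId) {
        Long timeout = timeoutNanos.get(sessionId);
        if (timeout == null) {
            return false;
        }
        deadlineNanos.put(sessionId, nanoClock.getAsLong() + timeout);
        return true;
    }

    /** Removes and returns the ids of all sessions whose deadline has passed. */
    List<Long> expire() {
        long now = nanoClock.getAsLong();
        List<Long> expired = new ArrayList<>();
        Iterator<Map.Entry<Long, Long>> it = deadlineNanos.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Long> e = it.next();
            // Compare by subtraction, as the nanoTime() docs recommend,
            // so the check survives numeric overflow of the counter.
            if (now - e.getValue() >= 0) {
                expired.add(e.getKey());
                timeoutNanos.remove(e.getKey());
                it.remove();
            }
        }
        return expired;
    }

    public static void main(String[] args) throws InterruptedException {
        MonotonicSessionTracker tracker = new MonotonicSessionTracker(System::nanoTime);
        tracker.addSession(0x1L, 50);  // 50 ms timeout, example value
        Thread.sleep(100);             // no heartbeat arrives in time
        System.out.println("expired: " + tracker.expire());
    }
}
```

Because the clock is passed in as a LongSupplier, the same logic can be exercised with a fake clock, which is also a convenient way to convince yourself that changing the system date has no effect on it.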

-JZ

> On Dec 4, 2017, at 12:09 PM, Abraham Fine <[hidden email]> wrote:
>
> Hello Anthony and Shawn-
>
> To the best of my knowledge ZooKeeper does not use the "wall clock" time
> anywhere. So that should not be the problem.
>
> Please consider enabling debug logging, which should allow you to track
> the "pings".
>
> Thanks,
> Abe

Re: Zookeeper session expiration

Patrick Hunt
What Jordan said + time use is only in the relative sense, not the
absolute. Session tracking (expiration) is relative to the start of
leadership.

Patrick

On Mon, Dec 4, 2017 at 12:21 PM, Jordan Zimmerman <
[hidden email]> wrote:

> ZooKeeper, indeed, does not use wall clock time. It uses System.nanoTime()
> for most operations. Further, all operations go through the Leader node so
> only the Leader's notion of time matters. The Leader manages the session
> via a "SessionTracker" instance. The code is in SessionTrackerImpl.java.
> There is a sessionExpiryQueue which is a kind of priority queue that
> returns expired sessions based on System.nanoTime().
>
> -JZ

RE: Zookeeper session expiration

Kathryn Hogg
I'm pretty new to ZooKeeper but have a fair amount of experience with virtual synchrony going back many years. Even though time is relative, a sudden forward jump of the server's clock could cause timeouts to be declared expired prematurely. I'm not sure how ZooKeeper handles that, but in Isis, if two consecutive calls to gettimeofday differed by too much, it was considered fishy.

Of course, this is why we use ntp with adjtime to avoid clocks going backwards or making large jumps forward.
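
A sanity check of that kind can be sketched in Java by comparing how far the wall clock moved against how far a monotonic clock moved over the same interval. This is an illustration only, not a description of what Isis or ZooKeeper actually does; the class ClockJumpDetector and its tolerance parameter are hypothetical.

```java
import java.util.function.LongSupplier;

/** Illustrative only: flags wall-clock jumps by comparing against a monotonic clock. */
final class ClockJumpDetector {
    private final LongSupplier wallMillis; // e.g. System::currentTimeMillis
    private final LongSupplier monoNanos;  // e.g. System::nanoTime
    private final long toleranceMs;
    private long lastWallMs;
    private long lastMonoNanos;

    ClockJumpDetector(LongSupplier wallMillis, LongSupplier monoNanos, long toleranceMs) {
        this.wallMillis = wallMillis;
        this.monoNanos = monoNanos;
        this.toleranceMs = toleranceMs;
        this.lastWallMs = wallMillis.getAsLong();
        this.lastMonoNanos = monoNanos.getAsLong();
    }

    /**
     * Returns the wall-clock jump in ms since the previous call:
     * positive for a forward jump, negative for a backward one,
     * and 0 when the two clocks agree within the tolerance.
     */
    long check() {
        long wall = wallMillis.getAsLong();
        long mono = monoNanos.getAsLong();
        long wallDeltaMs = wall - lastWallMs;
        long monoDeltaMs = (mono - lastMonoNanos) / 1_000_000L;
        lastWallMs = wall;
        lastMonoNanos = mono;
        long drift = wallDeltaMs - monoDeltaMs;
        return Math.abs(drift) > toleranceMs ? drift : 0L;
    }

    public static void main(String[] args) throws InterruptedException {
        ClockJumpDetector detector =
                new ClockJumpDetector(System::currentTimeMillis, System::nanoTime, 50);
        Thread.sleep(200);
        System.out.println("wall-clock jump (ms): " + detector.check());
    }
}
```

Code that measures timeouts purely on the monotonic clock does not need such a check; it matters mainly for software that reads the wall clock for timing decisions.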

-----Original Message-----
From: Patrick Hunt [mailto:[hidden email]]
Sent: Wednesday, December 06, 2017 5:18 PM
To: UserZooKeeper <[hidden email]>
Subject: Re: Zookeeper session expiration

What Jordan said + time use is only in the relative sense, not the absolute. Session tracking (expiration) is relative to the start of leadership.

Patrick

On Mon, Dec 4, 2017 at 12:21 PM, Jordan Zimmerman < [hidden email]> wrote:

> ZooKeeper, indeed, does not use wall clock time. It uses
> System.nanoTime() for most operations. Further, all operations go
> through the Leader node so only the Leader's notion of time matters.
> The Leader manages the session via a "SessionTracker" instance. The code is in SessionTrackerImpl.java.
> There is a sessionExpiryQueue which is a kind of priority queue that
> returns expired sessions based on System.nanoTime().
>
> -JZ
>
> > On Dec 4, 2017, at 12:09 PM, Abraham Fine <[hidden email]> wrote:
> >
> > Hello Anthony and Shawn-
> >
> > To the best of my knowledge ZooKeeper does not use the "wall clock"
> > time anywhere. So that should not be the problem.
> >
> > Please consider enabling debug logging, which should allow you to
> > track the "pings".
> >
> > Thanks,
> > Abe
> >
> > On Mon, Dec 4, 2017, at 11:51, Anthony Shaya wrote:
> >> Thanks Shawn, should I message the developer mailing list for a
> >> more definitive answer?
> >>
> >> Thanks again for the reply.
> >>
> >> -----Original Message-----
> >> From: Shawn Heisey [mailto:[hidden email]]
> >> Sent: Monday, December 4, 2017 2:49 PM
> >> To: [hidden email]
> >> Subject: Re: Zookeeper session expiration
> >>
> >> On 12/4/2017 8:22 AM, Anthony Shaya wrote:
> >>> My question is related to how session expiration works, I noticed
> >>> on
> many of the client machines the times across these machines were all
> off (by anywhere from 1 minute to 20 minutes - which was resolved
> after discovery - haven't verified this completely yet). Can this
> directly affect session expiration within the zookeeper cluster?
> >>>
> >>>   *   I read the following in https://na01.safelinks.
> protection.outlook.com/?url=https%3A%2F%2Fwiki.apache.org%
> 2Fhadoop%2FZooKeeper%2FFAQ&data=02%7C01%7C%7C6d6643860a4e4a8194c808d53
> b50 23ec%7Cc61157e903cb47589165ee7845cb0ca3%7C0%7C0%
> 7C636480137750841475&sdata=RwGGH19FLeYFmXMrg5GBkSLJ65ANj1
> EXkTvwyk6OLd4%3D&reserved=0 , "Expirations happens when the cluster
> does not hear from the client within the specified session timeout period (i.e.
> no heartbeat).". So in some case it seems like if the times were wrong
> across the machines its possible one of the clients could of
> effectively sent a heart beat in the past (not sure about this tbh)
> and then the cluster expires the session?

Re: Zookeeper session expiration

Jordan Zimmerman-3
System.nanoTime() is not affected by clock changes. Really everyone - this is simply not an issue in ZooKeeper.
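
To illustrate the point, here is a minimal sketch of expiry tracking that depends only on elapsed time from a monotonic clock. It is not ZooKeeper's SessionTrackerImpl; the class name SessionExpirySketch and its methods are hypothetical and exist only for this example. The client never supplies a timestamp, so a wrong wall clock on a client machine has nothing to feed into the decision.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; NOT ZooKeeper's SessionTrackerImpl. */
class SessionExpirySketch {
    private final long timeoutNanos;
    // session id -> System.nanoTime() reading taken when the last heartbeat arrived
    private final Map<Long, Long> lastHeard = new HashMap<>();

    SessionExpirySketch(long timeoutMillis) {
        this.timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    }

    /** Record a heartbeat. nowNanos is read on the server, never sent by the client. */
    void touch(long sessionId, long nowNanos) {
        lastHeard.put(sessionId, nowNanos);
    }

    /** True if no heartbeat was recorded within the timeout, by elapsed local time. */
    boolean isExpired(long sessionId, long nowNanos) {
        Long last = lastHeard.get(sessionId);
        // Compare the difference, not the raw values: nanoTime readings may wrap around.
        return last == null || nowNanos - last > timeoutNanos;
    }

    public static void main(String[] args) throws InterruptedException {
        SessionExpirySketch tracker = new SessionExpirySketch(100);
        tracker.touch(42L, System.nanoTime());
        System.out.println("expired right away? " + tracker.isExpired(42L, System.nanoTime()));
        Thread.sleep(150); // stay silent for longer than the 100 ms timeout
        System.out.println("expired after silence? " + tracker.isExpired(42L, System.nanoTime()));
    }
}
```

Changing the system date while this runs should not alter the result, because System.nanoTime() is only meaningful as a difference between two readings taken in the same JVM.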

====================
Jordan Zimmerman

> On Dec 7, 2017, at 7:43 AM, Kathryn Hogg <[hidden email]> wrote:
>
> I'm pretty new to ZooKeeper but have a fair amount of experience with virtual synchrony going back many years.  Even though time is relative, a clock that suddenly jumps forward on the server could cause timeouts to be declared expired prematurely.  I'm not sure how ZooKeeper handles that, but in Isis, if two consecutive calls to gettimeofday differed by too much, it was considered fishy.
>
> Of course, this is why we use ntp with adjtime to avoid clocks going backwards or making large jumps forward.
>
> -----Original Message-----
> From: Patrick Hunt [mailto:[hidden email]]
> Sent: Wednesday, December 06, 2017 5:18 PM
> To: UserZooKeeper <[hidden email]>
> Subject: Re: Zookeeper session expiration
>
> What Jordan said + time use is only in the relative sense, not the absolute. Session tracking (expiration) is relative to the start of leadership.
>
> Patrick
>
>> On Mon, Dec 4, 2017 at 12:21 PM, Jordan Zimmerman < [hidden email]> wrote:
>>
>> ZooKeeper, indeed, does not use wall clock time. It uses
>> System.nanoTime() for most operations. Further, all operations go
>> through the Leader node so only the Leader's notion of time matters.
>> The Leader manages the session via a "SessionTracker" instance. The code is in SessionTrackerImpl.java.
>> There is a sessionExpiryQueue which is a kind of priority queue that
>> returns expired sessions based on System.nanoTime().
>>
>> -JZ
>>
>>> On Dec 4, 2017, at 12:09 PM, Abraham Fine <[hidden email]> wrote:
>>>
>>> Hello Anthony and Shawn-
>>>
>>> To the best of my knowledge ZooKeeper does not use the "wall clock"
>>> time anywhere. So that should not be the problem.
>>>
>>> Please consider enabling debug logging, which should allow you to
>>> track the "pings".
>>>
>>> Thanks,
>>> Abe
>>>
>>>> On Mon, Dec 4, 2017, at 11:51, Anthony Shaya wrote:
>>>> Thanks Shawn, should I message the developer mailing list for a
>>>> more definitive answer?
>>>>
>>>> Thanks again for the reply.
>>>>
>>>> -----Original Message-----
>>>> From: Shawn Heisey [mailto:[hidden email]]
>>>> Sent: Monday, December 4, 2017 2:49 PM
>>>> To: [hidden email]
>>>> Subject: Re: Zookeeper session expiration
>>>>
>>>>> On 12/4/2017 8:22 AM, Anthony Shaya wrote:
>>>>> My question is related to how session expiration works. I noticed
>>>>> on many of the client machines that the times across these machines
>>>>> were all off (by anywhere from 1 minute to 20 minutes - which was
>>>>> resolved after discovery - haven't verified this completely yet).
>>>>> Can this directly affect session expiration within the zookeeper
>>>>> cluster?
>>>>>
>>>>>  *   I read the following in
>>>>> https://wiki.apache.org/hadoop/ZooKeeper/FAQ , "Expirations happens
>>>>> when the cluster does not hear from the client within the specified
>>>>> session timeout period (i.e. no heartbeat).". So in some cases it
>>>>> seems like if the times were wrong across the machines, it's
>>>>> possible one of the clients could have effectively sent a heartbeat
>>>>> in the past (not sure about this tbh) and then the cluster expires
>>>>> the session?
>>>>
>>>> I make these comments without any knowledge of what ZK code
>>>> actually does.  I am a member of this list because I'm a
>>>> representative of the Apache Solr project, which uses the ZK client
>>>> in order to maintain a cluster.
>>>>
>>>> IMHO, any software which makes actual decisions based on the
>>>> timestamps in messages from another system is badly designed.  I
>>>> would hope that the ZK designers know this, and always make any
>>>> decisions related to time using the clock in the local system only.
>>>>
>>>> If ZK's designers did the right thing, then a session timeout would
>>>> indicate that quite literally no heartbeats were received in X
>>>> seconds, as measured by the local clock, and the local clock ONLY
>>>> ... NOT from timestamp information received from another system.
>>>>
>>>> Although such a lack of communication could be caused by any number
>>>> of things, including network hardware failure, one of the most
>>>> common reasons I have seen for problems like this is extreme Java
>>>> garbage collection pauses in the client software.
>>>>
>>>> Situations where the heap is a little bit too small can cause a
>>>> Java program to basically be doing garbage collection constantly,
>>>> so it doesn't have much time to do anything else, like send
>>>> heartbeats to ZK servers.
>>>>
>>>> Situations where the heap is HUGE and garbage collection is not
>>>> well tuned can lead to pauses of a minute or longer while Java does
>>>> a massive full GC.
>>>>
>>>>>  *   I don't have the zookeeper node log for the above time to see
>>>>> what was going on in zookeeper when the cluster determined the
>>>>> session expired.
>>>>>
>>>>>  *   Is there any additional logging I can turn on to troubleshoot
>>>>> zk session expiration issues?
>>>>
>>>> Hopefully your ZK clients also have logging.  Failing that, you
>>>> could turn on GC logging for the software with the ZK client
>>>> (assuming it's a Java client) and find a program or website that
>>>> can examine the log and give you statistics or a graph of GC pauses.
>>>>
>>>> If there is a problem in software using the client and whatever
>>>> logging is available doesn't help you figure out what's wrong,
>>>> you're generally going to need to talk to whoever wrote that
>>>> software for help troubleshooting it.
>>>>
>>>> Thanks,
>>>> Shawn

RE: Zookeeper session expiration

Anthony Shaya
Thanks for the replies. I will investigate further and follow up with more detail if I believe it's ZooKeeper, or if the client application is truly not responding with heartbeats.
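
One way to check whether the client JVM itself is stalling (for example during the long GC pauses mentioned earlier in the thread) is a small watchdog thread that sleeps for a short interval and reports when the sleep overshoots badly. The sketch below is only an illustration: PauseDetectorSketch, its method names, and the 100 ms / 1000 ms values are assumptions, not part of the ZooKeeper client. An overshoot shows that the JVM was not running, but not why (GC, swapping, or a frozen VM all look the same), so it complements GC logging rather than replacing it.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper; not part of the ZooKeeper client library. */
class PauseDetectorSketch {
    /** How much longer than requested a sleep took, in milliseconds (never negative). */
    static long overshootMillis(long startNanos, long endNanos, long sleepMillis) {
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos);
        return Math.max(0L, elapsedMillis - sleepMillis);
    }

    /** Starts a daemon thread that logs whenever a short sleep overshoots the threshold. */
    static Thread start(long sleepMillis, long thresholdMillis) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    long begin = System.nanoTime();
                    Thread.sleep(sleepMillis);
                    long overshoot = overshootMillis(begin, System.nanoTime(), sleepMillis);
                    if (overshoot > thresholdMillis) {
                        System.err.println("JVM stalled for about " + overshoot + " ms");
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop quietly when interrupted
            }
        }, "pause-detector");
        t.setDaemon(true); // do not keep the application alive just for the watchdog
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        start(100, 1000);  // poll every 100 ms, report stalls longer than 1 s (assumed values)
        Thread.sleep(500); // stand-in for the real application's work
    }
}
```

If reported stalls approach the negotiated session timeout and line up with the expiration messages in the client log, the client was most likely unable to send pings in time.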


Re: Zookeeper session expiration

Patrick Hunt
In reply to this post by Jordan Zimmerman-3
Easy enough to try out. Give it a shot and file a JIRA issue if you find a problem.

Regards,

Patrick

On Thu, Dec 7, 2017 at 5:47 AM, Jordan Zimmerman <[hidden email]> wrote:

> System.nanoTime() is not affected by clock changes. Really everyone - this
> is simply not an issue in ZooKeeper.