
shutdown Observer


shutdown Observer

rammohan ganapavarapu
Hi,

We have a multi-data-center ZK cluster with all the followers in one
data center and observers in the other data centers. For some reason the
observers are going down with the following exception. I am not sure what
the reason could be or how to avoid this issue. Any thoughts?

Ram



2017-03-09 09:00:18,305 - WARN [QuorumPeer[myid=41]/0:0:0:0:0:0:0:0:2181:Observer@79] - Exception when observing the leader
java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:152)
        at java.net.SocketInputStream.read(SocketInputStream.java:122)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
        at java.io.DataInputStream.readInt(DataInputStream.java:387)
        at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
        at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
        at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
        at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
        at org.apache.zookeeper.server.quorum.Observer.observeLeader(Observer.java:75)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:727)
2017-03-09 09:00:18,306 - INFO [QuorumPeer[myid=41]/0:0:0:0:0:0:0:0:2181:Observer@137] - shutdown called
java.lang.Exception: shutdown Observer
        at org.apache.zookeeper.server.quorum.Observer.shutdown(Observer.java:137)

Re: shutdown Observer

hanm
The log indicates that the socket on your observer timed out after
syncing with the leader. It could simply be that the latency between your
DCs exceeds the socket timeout ZK uses. The timeout is calculated as
tickTime * syncLimit, so you might want to tweak these values to
fit the latency between your DCs.
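
To make the arithmetic above concrete, here is a minimal Java sketch of the tickTime * syncLimit calculation. The class name, the method names and the sample values (tickTime=2000, syncLimit=5, a 30 second target) are illustrative assumptions, not values taken from this cluster or from the ZooKeeper code base.

```java
// Hypothetical helper, not part of ZooKeeper: it only reproduces the
// "tickTime * syncLimit" arithmetic described in the reply above.
final class ObserverTimeoutSketch {

    /** Read timeout in milliseconds, computed as tickTime * syncLimit. */
    static int syncTimeoutMs(int tickTimeMs, int syncLimit) {
        // multiplyExact throws instead of silently overflowing.
        return Math.multiplyExact(tickTimeMs, syncLimit);
    }

    /** Smallest syncLimit whose timeout is at least requiredMs (ceiling division). */
    static int minSyncLimit(int tickTimeMs, int requiredMs) {
        return (requiredMs + tickTimeMs - 1) / tickTimeMs;
    }

    public static void main(String[] args) {
        int tickTimeMs = 2000; // assumed sample value
        int syncLimit = 5;     // assumed sample value
        System.out.println("read timeout = "
                + syncTimeoutMs(tickTimeMs, syncLimit) + " ms");
        // How large syncLimit would need to be to tolerate 30 s of silence.
        System.out.println("syncLimit for 30 s = "
                + minSyncLimit(tickTimeMs, 30_000));
    }
}
```

With these sample values the observer would give up after 10 seconds without data from the leader. Raising syncLimit (or tickTime) widens that window, at the cost of slower detection of a leader that is really gone.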




--
Cheers
Michael.

Re: shutdown Observer

Dan Benediktson
It's also likely you have a fair bit of packet loss between your
datacenters, unless you know you have a solid network between them. If your
observers are falling offline "randomly", packet loss is a pretty likely
culprit.


Re: shutdown Observer

Jai Bheemsen Rao Dhanwada
If there is packet loss, does increasing the initLimit value help?

ref: http://efod.se/blog/archive/2013/02/09/zookeeper-initlimit

Any thoughts?


Re: shutdown Observer

hanm
It helps. The extreme case is a network partition, where packet loss is
100%. ZK relies on TCP for communication between quorum peers, so lost
packets are retransmitted by TCP; unless your network is partitioned
forever, the system will move forward once the partition heals. There is no
need to worry about a packet being lost forever, because of the TCP
guarantee. In this case the timeout could be set to infinite (pass 0 to
setSoTimeout) so that socket IO blocks until the partition heals.

The socket timeout is really just there to give the ZK server an
opportunity to take action when we think we should bail out because of a
bad network condition rather than block indefinitely, as ZK needs to
satisfy some basic liveness guarantees.
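
As an illustration of the setSoTimeout behaviour described above, here is a small self-contained Java sketch using only java.net. The class and method names are hypothetical and this is not ZooKeeper code: a local listener accepts a connection and then stays silent, standing in for a leader that cannot be reached in time.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class: shows a blocking read giving up once SO_TIMEOUT expires.
final class ReadTimeoutSketch {

    /** Returns true if a read on a silent connection times out after timeoutMs. */
    static boolean readTimesOut(int timeoutMs) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS for any free port; the peer never writes any data.
        try (ServerSocket silentPeer = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, silentPeer.getLocalPort());
             Socket accepted = silentPeer.accept()) {
            // A positive value bounds how long read() may block.
            // Passing 0 would mean "no timeout": read() would block indefinitely.
            client.setSoTimeout(timeoutMs);
            InputStream in = client.getInputStream();
            try {
                in.read();
                return false; // data or end-of-stream arrived (not expected here)
            } catch (SocketTimeoutException e) {
                return true;  // same exception type as in the observer log
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timed out: " + readTimesOut(200));
    }
}
```

The positive timeout is what lets the caller regain control and decide what to do next, which is the liveness point made above. The sketch deliberately never calls setSoTimeout(0), since that read would not return.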




--
Cheers
Michael.

Re: shutdown Observer

Mike Richardson
In reply to this post by Dan Benediktson
Unsubscribe


Mike Richardson

Senior Software Engineer



MoTuM N.V. | Dellingstraat 34 | B-2800 MECHELEN | Belgium


T +32(0)15 28 16 63
M +41 7943 69538


www.motum.be
