NullPointerException stopping and starting Zookeeper servers


Thomas Vinod Johnson
Hi,
I have a replicated ZooKeeper service consisting of three ZooKeeper (3.0.1)
servers, all running on the same host for testing purposes. I've created
exactly one znode in this ensemble. At this point, I stop and then restart
a single ZooKeeper server, moving on to the next one a few seconds later.
After a few restarts (about 4 is usually sufficient), I get the
following exception on one of the servers, at which point it exits:
java.lang.NullPointerException
        at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:447)
        at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:358)
        at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:333)
        at org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:250)
        at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:102)
        at org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:183)
        at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:245)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:421)
2008-12-08 14:14:24,880 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2183:Leader@336] - Shutdown called
java.lang.Exception: shutdown Leader! reason: Forcing shutdown
        at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:336)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:427)
Exception in thread "QuorumPeer:/0:0:0:0:0:0:0:0:2183" java.lang.NullPointerException
        at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:339)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:427)

The inputStream field is null, apparently because next() is called again
at line 358 even after it has already returned false. Having very little
knowledge of the implementation, I don't know whether the existence of a
transaction with hdr.getZxid() >= zxid is supposed to be an invariant
across all invocations of the server; however, the following change to
FileTxnLog.java seems to make the problem go away.
diff FileTxnLog.java /tmp/FileTxnLog.java
358c358,359
<                 next();
---
>                 if (!next())
>                     return;
447c448,450
<                 inputStream.close();
---
>                 if (inputStream != null) {
>                     inputStream.close();
>                 }
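
To show the failure mode in isolation, here is a minimal sketch in plain Java. It is not the ZooKeeper code: SeekSketch, ToyLogIterator, and seek are hypothetical names, and the assumption that the real iterator drops its stream at end of log is only inferred from the stack trace above. The sketch applies the same two guards as the diff: next() tolerates an already-released stream, and the seek loop stops as soon as next() returns false.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a transaction-log iterator; not the FileTxnLog source. */
class SeekSketch {

    /** Walks a list of zxids; 'stream' becomes null once the log is exhausted. */
    static final class ToyLogIterator {
        private Iterator<Long> stream;  // plays the role of inputStream
        private Long current;           // plays the role of hdr.getZxid()

        ToyLogIterator(List<Long> zxids) {
            this.stream = zxids.iterator();
        }

        /** Advances to the next record; returns false at end of log. */
        boolean next() {
            if (stream == null) {       // guard: a repeated call after exhaustion is harmless
                return false;
            }
            if (!stream.hasNext()) {
                stream = null;          // release the source at end of log
                current = null;
                return false;
            }
            current = stream.next();
            return true;
        }

        Long current() {
            return current;
        }
    }

    /** Returns the first zxid >= target, or null if the log holds no such record. */
    static Long seek(List<Long> zxids, long target) {
        ToyLogIterator it = new ToyLogIterator(zxids);
        if (!it.next()) {
            return null;                // empty log
        }
        while (it.current() < target) {
            if (!it.next()) {
                return null;            // stop instead of advancing past the end
            }
        }
        return it.current();
    }

    public static void main(String[] args) {
        System.out.println(seek(Arrays.asList(1L, 2L, 5L), 3L)); // prints 5
        System.out.println(seek(Arrays.asList(1L, 2L), 3L));     // prints null
    }
}
```

The second call in main is the interesting case: no record satisfies zxid >= target, which appears to be the situation the restarts produce. Without the return-value check, the loop would keep calling next() on an exhausted iterator.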

Is this a bug?

Thanks.

Re: NullPointerException stopping and starting Zookeeper servers

Mahadev Konar
Hi Thomas,
This looks like a bug. Can you open a JIRA issue describing the problem
and how to reproduce it?

Thanks
mahadev


On 12/8/08 11:33 AM, "Thomas Vinod Johnson" <[hidden email]> wrote:

> Hi,
> I have a replicated ZooKeeper service consisting of three ZooKeeper (3.0.1)
> servers, all running on the same host for testing purposes. I've created
> exactly one znode in this ensemble. At this point, I stop and then restart
> a single ZooKeeper server, moving on to the next one a few seconds later.
> After a few restarts (about 4 is usually sufficient), I get the
> following exception on one of the servers, at which point it exits:
> java.lang.NullPointerException
>         at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:447)
>         at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:358)
>         at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:333)
>         at org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:250)
>         at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:102)
>         at org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:183)
>         at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:245)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:421)
> 2008-12-08 14:14:24,880 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2183:Leader@336] - Shutdown called
> java.lang.Exception: shutdown Leader! reason: Forcing shutdown
>         at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:336)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:427)
> Exception in thread "QuorumPeer:/0:0:0:0:0:0:0:0:2183" java.lang.NullPointerException
>         at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:339)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:427)
>
> The inputStream field is null, apparently because next() is called again
> at line 358 even after it has already returned false. Having very little
> knowledge of the implementation, I don't know whether the existence of a
> transaction with hdr.getZxid() >= zxid is supposed to be an invariant
> across all invocations of the server; however, the following change to
> FileTxnLog.java seems to make the problem go away.
> diff FileTxnLog.java /tmp/FileTxnLog.java
> 358c358,359
> <                 next();
> ---
> >                 if (!next())
> >                     return;
> 447c448,450
> <                 inputStream.close();
> ---
> >                 if (inputStream != null) {
> >                     inputStream.close();
> >                 }
>
> Is this a bug?
>
> Thanks.