How to find the actual size of zk data

How to find the actual size of zk data

rammohan ganapavarapu
Hi,

How can I find the actual size of the ZooKeeper data, given that it is
stored in and served from memory? The snapshot may not give me the actual
size. I used zktreeutil to dump the data; is that the actual size?

One more question: since it is an in-memory datastore, what happens if my
zk data grows beyond the available RAM? Does it swap?

Thanks,
Ram

Re: How to find the actual size of zk data

Jordan Zimmerman-3
> How can I find the actual size of the ZooKeeper data, given that it is
> stored in and served from memory? The snapshot may not give me the
> actual size.

ZooKeeper's transaction logs and snapshots are stored on disk as a series of files. You can use your file system to get the combined size of these files. The directory is set by the "dataDir" value in zoo.cfg.
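
As a rough illustration of the file-system approach, here is a minimal Python sketch that walks a data directory and sums the sizes of the snapshot and transaction log files. It assumes the usual layout, where files are named `snapshot.<zxid>` and `log.<zxid>` (zxid in hex) under a `version-2` subdirectory; the function name `zk_file_sizes` is made up for this example. If `dataLogDir` is set in zoo.cfg, the logs live there instead, so you would run it on both directories. Also keep in mind that transaction log files are preallocated, so their size on disk can overstate what has actually been written.

```python
import os
from collections import defaultdict


def zk_file_sizes(data_dir):
    """Sum the on-disk bytes of snapshot.* and log.* files under data_dir.

    Returns (totals, latest_snapshot), where totals maps "snapshot"/"log"
    to a byte count and latest_snapshot is (zxid, path, size) or None.
    """
    totals = defaultdict(int)
    latest_snapshot = None
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            prefix, _, suffix = name.partition(".")
            if prefix not in ("snapshot", "log") or not suffix:
                continue  # skip myid, pid files, and anything else
            path = os.path.join(root, name)
            size = os.path.getsize(path)
            totals[prefix] += size
            if prefix == "snapshot":
                try:
                    zxid = int(suffix, 16)  # assumed: suffix is the zxid in hex
                except ValueError:
                    continue
                if latest_snapshot is None or zxid > latest_snapshot[0]:
                    latest_snapshot = (zxid, path, size)
    return dict(totals), latest_snapshot
```

Calling `zk_file_sizes("/path/to/dataDir")` returns the combined sizes plus the newest snapshot. The size of that newest snapshot is probably the closest on-disk proxy for the in-memory tree, though it is a serialized form and not the heap footprint.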

> One more question: since it is an in-memory datastore, what happens if
> my zk data grows beyond the available RAM? Does it swap?

No. ZooKeeper does not swap. You will get OutOfMemory errors if you store too much data. The ZooKeeper database is always limited by memory.

-JZ

> On Mar 15, 2018, at 5:57 PM, rammohan ganapavarapu <[hidden email]> wrote:
>
> Hi,
>
> How can I find the actual size of the ZooKeeper data, given that it is
> stored in and served from memory? The snapshot may not give me the
> actual size. I used zktreeutil to dump the data; is that the actual size?
>
> One more question: since it is an in-memory datastore, what happens if
> my zk data grows beyond the available RAM? Does it swap?
>
> Thanks,
> Ram

Re: How to find the actual size of zk data

rammohan ganapavarapu
Jordan,

Thank you. Even if there are multiple snapshot files in dataDir, ZooKeeper
uses only the latest one, right? If my latest snapshot file is 1 GB, does
that mean my total zk data size is 1 GB? But when I use zktreeutil to dump
to the filesystem, the dump takes up many more GB than the snapshot.

Ram

On Thu, Mar 15, 2018, 4:02 PM Jordan Zimmerman <[hidden email]> wrote:

> > How can I find the actual size of the ZooKeeper data, given that it is
> > stored in and served from memory? The snapshot may not give me the
> > actual size.
>
> ZooKeeper's transaction logs and snapshots are stored on disk as a series
> of files. You can use your file system to get the combined size of these
> files. The directory is set by the "dataDir" value in zoo.cfg.
>
> > One more question: since it is an in-memory datastore, what happens if
> > my zk data grows beyond the available RAM? Does it swap?
>
> No. ZooKeeper does not swap. You will get OutOfMemory errors if you store
> too much data. The ZooKeeper database is always limited by memory.
>
> -JZ

Re: How to find the actual size of zk data

Mark Fenes
Hi Ram,

If the snapshot file and the zktreeutil dump use different formats, they
will have different sizes. Does zktreeutil export the zk data in XML
format?

Mark


On Fri, Mar 16, 2018 at 2:58 AM, rammohan ganapavarapu <[hidden email]> wrote:

> Jordan,
>
> Thank you. Even if there are multiple snapshot files in dataDir, ZooKeeper
> uses only the latest one, right? If my latest snapshot file is 1 GB, does
> that mean my total zk data size is 1 GB? But when I use zktreeutil to dump
> to the filesystem, the dump takes up many more GB than the snapshot.
>
> Ram

Re: How to find the actual size of zk data

rammohan ganapavarapu
zktreeutil has an option to dump everything into a directory and create the
same tree in the filesystem under that directory. I am talking about that size.

Ram
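
One possible reason a filesystem dump looks much bigger than the snapshot is block allocation: if every znode becomes its own file or directory, each entry takes up at least one filesystem block no matter how few bytes it holds. The Python sketch below compares the logical bytes in a tree with the space actually allocated on disk. It makes no assumption about how zktreeutil lays out its dump (I have not checked), the name `tree_sizes` is invented for this example, and `st_blocks` is only available on Unix-like systems.

```python
import os


def _allocated(st):
    """Bytes allocated on disk; falls back to st_size where st_blocks is missing."""
    blocks = getattr(st, "st_blocks", None)
    return st.st_size if blocks is None else blocks * 512


def tree_sizes(root):
    """Return (logical_bytes, allocated_bytes, entries) for everything under root."""
    logical = allocated = entries = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            logical += st.st_size      # bytes of actual file content
            allocated += _allocated(st)
            entries += 1
        for name in dirnames:
            st = os.lstat(os.path.join(dirpath, name))
            allocated += _allocated(st)  # directories consume blocks too
            entries += 1
    return logical, allocated, entries
```

If `allocated_bytes` is far larger than `logical_bytes`, the extra gigabytes are mostly filesystem overhead from many small entries, not ZooKeeper data.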

On Mon, Mar 19, 2018 at 5:42 AM, Mark Fenes <[hidden email]> wrote:

>
> Hi Ram,
>
> If the snapshot file and the zktreeutil dump use different formats, they
> will have different sizes. Does zktreeutil export the zk data in XML
> format?
>
> Mark

Re: How to find the actual size of zk data

Mark Fenes
OK, understood. Maybe the best way to calculate the ZK data size would be
to count (or estimate) the number of znodes and their average data size.
This would eliminate size differences caused by the file format.
Maybe we could write a simple utility that reads a snapshot file and
calculates these statistics. Transactions applied after the latest snapshot
can also cause size differences. Or perhaps the ZK server itself could
publish live statistics about node count and average node size via JMX.
I'll check whether this is already implemented.

Mark
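
For what it's worth, one existing source of these numbers (to my knowledge) is the `mntr` four-letter command, which reports `zk_znode_count` and `zk_approximate_data_size` as tab-separated key/value lines. The Python sketch below sends `mntr` to a server and derives an average size per znode. The host and port defaults are assumptions (2181 is only the conventional client port), the function names are invented for this example, and on newer server versions `mntr` may need to be enabled through the `4lw.commands.whitelist` setting before the server will answer it.

```python
import socket


def parse_mntr(text):
    """Parse tab-separated 'key<TAB>value' lines into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if sep:
            stats[key.strip()] = value.strip()
    return stats


def fetch_mntr(host="localhost", port=2181, timeout=5.0):
    """Send the 'mntr' command and return the raw response text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"mntr")
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # server closes the connection after replying
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8", errors="replace")


def data_size_summary(stats):
    """Derive znode count, approximate bytes, and average bytes per znode."""
    count = int(stats["zk_znode_count"])
    size = int(stats["zk_approximate_data_size"])
    average = size / count if count else 0.0
    return {"znodes": count, "approx_bytes": size, "avg_bytes_per_znode": average}
```

Typical use would be `data_size_summary(parse_mntr(fetch_mntr("zk-host")))`. If the command is not enabled, the reply contains no tab-separated lines and the lookup raises `KeyError`. The reported size is the server's own approximation, not the exact heap usage.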

On Mon, Mar 19, 2018 at 4:05 PM, rammohan ganapavarapu <[hidden email]> wrote:

> zktreeutil has an option to dump everything into a directory and create the
> same tree in the filesystem under that directory. I am talking about that size.
>
> Ram

Re: How to find the actual size of zk data

rammohan ganapavarapu
Mark,

Thanks. Yes, it would be nice to have this as part of JMX. Please let me
know if it is already there, or if there is any tool that does this.

Ram

On Tue, Mar 20, 2018 at 2:27 AM, Mark Fenes <[hidden email]> wrote:

> OK, understood. Maybe the best way to calculate the ZK data size would be
> to count (or estimate) the number of znodes and their average data size.
> This would eliminate size differences caused by the file format.
> Maybe we could write a simple utility that reads a snapshot file and
> calculates these statistics. Transactions applied after the latest snapshot
> can also cause size differences. Or perhaps the ZK server itself could
> publish live statistics about node count and average node size via JMX.
> I'll check whether this is already implemented.
>
> Mark