|
Hi all,
I'm looking at using Zookeeper for distributed sequence number generation. What's the best way to do this currently? Is there a particular recipe available for this? My so far involve: a) Creating a node with PERSISTENT_SEQUENTIAL then deleting it - this gives me the monotonically increasing number, but the sequence number isn't contiguous b) Storing the sequence number in the data portion of a persistent node - then updating this (using the version number - aka optimistic locking). The problem with this is that under high load I'm assuming there'll be a lot of contention and hence failures with regards to updates. What are your thoughts on the above? Many thanks, Jon. |
|
(b)
BUT: Sequential numbering is a special case of "now". In large diameters, now gets very expensive. This is a special case of that assertion. If there is a way to get away from this presumption of the need for sequential numbering, you will be miles better off. HOWEVER: ZK can do better than you suggest. Incrementing a counter does involve potential contention, but you will very likely be able to get to pretty high rates before the optimistic locking begins to fail. If you code your update with a few tries at full speed followed by some form of retry back-off, you should get pretty close to the best possible performance. You might also try building a lock with an ephemeral file before updating the counter. I would expect that this will be slower than the back-off option if only because involves more transactions in ZK. IF you wanted to get too complicated for your own good, you could have a secondary strategy flag that is only sampled by all clients every few seconds and is updated whenever a client needs to back-off more than say 5 steps. If this flag has been updated recently, then clients should switch to the locking protocol. You might even have several locks so that you don't exclude all other updaters, merely thin them out a bit. This flagged strategy would run as fast as optimistic locking as long as optimistic locking is fast and then would limit the total number of transactions needed under very high load. On Thu, Aug 5, 2010 at 3:31 PM, Jonathan Holloway < [hidden email]> wrote: > My so far involve: > a) Creating a node with PERSISTENT_SEQUENTIAL then deleting it - this gives > me the monotonically increasing number, but the sequence number isn't > contiguous > b) Storing the sequence number in the data portion of a persistent node - > then updating this (using the version number - aka optimistic locking). > The > problem with this is that under high load I'm assuming there'll be a lot of > contention and hence failures with regards to updates. > |
|
Hi Ted,
Thanks for the comments. I might have overlooked something here, but is it also possible to do the following: 1. Create a PERSISTENT node 2. Have multiple clients set the data on the node, e.g. Stat stat = zookeeper.setData(SEQUENCE, ArrayUtils.EMPTY_BYTE_ARRAY, -1); 3. Use the version number from stat.getVersion() as the sequence (obviously I'm limited to Integer.MAX_VALUE) Are there any weird race conditions involved here which would mean that a client would receive the wrong Stat object back? Many thanks again, Jon. On 5 August 2010 16:09, Ted Dunning <[hidden email]> wrote: > (b) > > BUT: > > Sequential numbering is a special case of "now". In large diameters, now > gets very expensive. This is a special case of that assertion. If there > is > a way to get away from this presumption of the need for sequential > numbering, you will be miles better off. > > HOWEVER: > > ZK can do better than you suggest. Incrementing a counter does involve > potential contention, but you will very likely be able to get to pretty > high > rates before the optimistic locking begins to fail. If you code your > update > with a few tries at full speed followed by some form of retry back-off, you > should get pretty close to the best possible performance. > > You might also try building a lock with an ephemeral file before updating > the counter. I would expect that this will be slower than the back-off > option if only because involves more transactions in ZK. IF you wanted to > get too complicated for your own good, you could have a secondary strategy > flag that is only sampled by all clients every few seconds and is updated > whenever a client needs to back-off more than say 5 steps. If this flag > has > been updated recently, then clients should switch to the locking protocol. > You might even have several locks so that you don't exclude all other > updaters, merely thin them out a bit. This flagged strategy would run as > fast as optimistic locking as long as optimistic locking is fast and then > would limit the total number of transactions needed under very high load. > > On Thu, Aug 5, 2010 at 3:31 PM, Jonathan Holloway < > [hidden email]> wrote: > > > My so far involve: > > a) Creating a node with PERSISTENT_SEQUENTIAL then deleting it - this > gives > > me the monotonically increasing number, but the sequence number isn't > > contiguous > > b) Storing the sequence number in the data portion of a persistent node - > > then updating this (using the version number - aka optimistic locking). > > The > > problem with this is that under high load I'm assuming there'll be a lot > of > > contention and hence failures with regards to updates. > > > |
|
Sounds right to me. Much simpler as well.
On Thu, Aug 5, 2010 at 5:38 PM, Jonathan Holloway < [hidden email]> wrote: > Hi Ted, > > Thanks for the comments. > > I might have overlooked something here, but is it also possible to do the > following: > > 1. Create a PERSISTENT node > 2. Have multiple clients set the data on the node, e.g. Stat stat = > zookeeper.setData(SEQUENCE, ArrayUtils.EMPTY_BYTE_ARRAY, -1); > 3. Use the version number from stat.getVersion() as the sequence (obviously > I'm limited to Integer.MAX_VALUE) > > Are there any weird race conditions involved here which would mean that a > client would receive the wrong Stat object back? > > Many thanks again, > Jon. > > On 5 August 2010 16:09, Ted Dunning <[hidden email]> wrote: > > > (b) > > > > BUT: > > > > Sequential numbering is a special case of "now". In large diameters, now > > gets very expensive. This is a special case of that assertion. If there > > is > > a way to get away from this presumption of the need for sequential > > numbering, you will be miles better off. > > > > HOWEVER: > > > > ZK can do better than you suggest. Incrementing a counter does involve > > potential contention, but you will very likely be able to get to pretty > > high > > rates before the optimistic locking begins to fail. If you code your > > update > > with a few tries at full speed followed by some form of retry back-off, > you > > should get pretty close to the best possible performance. > > > > You might also try building a lock with an ephemeral file before updating > > the counter. I would expect that this will be slower than the back-off > > option if only because involves more transactions in ZK. IF you wanted > to > > get too complicated for your own good, you could have a secondary > strategy > > flag that is only sampled by all clients every few seconds and is updated > > whenever a client needs to back-off more than say 5 steps. If this flag > > has > > been updated recently, then clients should switch to the locking > protocol. > > You might even have several locks so that you don't exclude all other > > updaters, merely thin them out a bit. This flagged strategy would run as > > fast as optimistic locking as long as optimistic locking is fast and then > > would limit the total number of transactions needed under very high load. > > > > On Thu, Aug 5, 2010 at 3:31 PM, Jonathan Holloway < > > [hidden email]> wrote: > > > > > My so far involve: > > > a) Creating a node with PERSISTENT_SEQUENTIAL then deleting it - this > > gives > > > me the monotonically increasing number, but the sequence number isn't > > > contiguous > > > b) Storing the sequence number in the data portion of a persistent node > - > > > then updating this (using the version number - aka optimistic locking). > > > The > > > problem with this is that under high load I'm assuming there'll be a > lot > > of > > > contention and hence failures with regards to updates. > > > > > > |
|
In reply to this post by jph98
On 08/05/2010 06:31 PM, Jonathan Holloway wrote:
> Hi all, > > I'm looking at using Zookeeper for distributed sequence number generation. > What's the best way to do this currently? Is there a particular recipe > available for this? > > My so far involve: > a) Creating a node with PERSISTENT_SEQUENTIAL then deleting it - this gives > me the monotonically increasing number, but the sequence number isn't > contiguous > b) Storing the sequence number in the data portion of a persistent node - > then updating this (using the version number - aka optimistic locking). The > problem with this is that under high load I'm assuming there'll be a lot of > contention and hence failures with regards to updates. > > What are your thoughts on the above? > > Many thanks, > Jon. I just ran into this exact situation, and handled it like so: I wrote a library that uses the option (b) you described above. Only instead of requesting a single sequence number, you request a block of them at a time from Zookeeper, and then locally use them up one by one from the block you retrieved. Retrieving by block (e.g., by blocks of 10000 at a time) eliminates the contention issue. Then, if you're finished assigning ID's from that block, but still have a bunch of ID's left in the block, the library has another function to "push back" the unused ID's. They'll then get pulled again in the next block retrieval. We don't actually have this code running in production yet, so I can't vouch for how well it works. But the design was reviewed and given the thumbs up by the core developers on the team, and the implementation passes all my unit tests. HTH. Feel free to email back with specific questions if you'd like more details. DR |
|
In reply to this post by Ted Dunning
Great, I just need to overcome the Integer.MAX_VALUE bit, maybe by having
SEQUENCE_LOW and SEQUENCE_HIGH nodes, being careful with the rollover. Thanks again. On 5 August 2010 17:54, Ted Dunning <[hidden email]> wrote: > Sounds right to me. Much simpler as well. > > On Thu, Aug 5, 2010 at 5:38 PM, Jonathan Holloway < > [hidden email]> wrote: > > > Hi Ted, > > > > Thanks for the comments. > > > > I might have overlooked something here, but is it also possible to do the > > following: > > > > 1. Create a PERSISTENT node > > 2. Have multiple clients set the data on the node, e.g. Stat stat = > > zookeeper.setData(SEQUENCE, ArrayUtils.EMPTY_BYTE_ARRAY, -1); > > 3. Use the version number from stat.getVersion() as the sequence > (obviously > > I'm limited to Integer.MAX_VALUE) > > > > Are there any weird race conditions involved here which would mean that a > > client would receive the wrong Stat object back? > > > > Many thanks again, > > Jon. > > > > On 5 August 2010 16:09, Ted Dunning <[hidden email]> wrote: > > > > > (b) > > > > > > BUT: > > > > > > Sequential numbering is a special case of "now". In large diameters, > now > > > gets very expensive. This is a special case of that assertion. If > there > > > is > > > a way to get away from this presumption of the need for sequential > > > numbering, you will be miles better off. > > > > > > HOWEVER: > > > > > > ZK can do better than you suggest. Incrementing a counter does involve > > > potential contention, but you will very likely be able to get to pretty > > > high > > > rates before the optimistic locking begins to fail. If you code your > > > update > > > with a few tries at full speed followed by some form of retry back-off, > > you > > > should get pretty close to the best possible performance. > > > > > > You might also try building a lock with an ephemeral file before > updating > > > the counter. I would expect that this will be slower than the back-off > > > option if only because involves more transactions in ZK. IF you wanted > > to > > > get too complicated for your own good, you could have a secondary > > strategy > > > flag that is only sampled by all clients every few seconds and is > updated > > > whenever a client needs to back-off more than say 5 steps. If this > flag > > > has > > > been updated recently, then clients should switch to the locking > > protocol. > > > You might even have several locks so that you don't exclude all other > > > updaters, merely thin them out a bit. This flagged strategy would run > as > > > fast as optimistic locking as long as optimistic locking is fast and > then > > > would limit the total number of transactions needed under very high > load. > > > > > > On Thu, Aug 5, 2010 at 3:31 PM, Jonathan Holloway < > > > [hidden email]> wrote: > > > > > > > My so far involve: > > > > a) Creating a node with PERSISTENT_SEQUENTIAL then deleting it - this > > > gives > > > > me the monotonically increasing number, but the sequence number isn't > > > > contiguous > > > > b) Storing the sequence number in the data portion of a persistent > node > > - > > > > then updating this (using the version number - aka optimistic > locking). > > > > The > > > > problem with this is that under high load I'm assuming there'll be a > > lot > > > of > > > > contention and hence failures with regards to updates. > > > > > > > > > > |
|
In reply to this post by David Rosenstrauch
Hi David,
We did discuss potentially doing this as well. It would be nice to get some recipes for Zookeeper done for this area, if people think it's useful. Were you thinking of submitting this back as a recipe, if not then I could potentially work on such a recipe instead. Many thanks, Jon. > I just ran into this exact situation, and handled it like so: > > I wrote a library that uses the option (b) you described above. Only > instead of requesting a single sequence number, you request a block of them > at a time from Zookeeper, and then locally use them up one by one from the > block you retrieved. Retrieving by block (e.g., by blocks of 10000 at a > time) eliminates the contention issue. > > Then, if you're finished assigning ID's from that block, but still have a > bunch of ID's left in the block, the library has another function to "push > back" the unused ID's. They'll then get pulled again in the next block > retrieval. > > We don't actually have this code running in production yet, so I can't > vouch for how well it works. But the design was reviewed and given the > thumbs up by the core developers on the team, and the implementation passes > all my unit tests. > > HTH. Feel free to email back with specific questions if you'd like more > details. > > DR |
|
Perhaps. I'd have to ask my boss for permission to release the code.
Is this something that would be interesting/useful to other people? If so, I can ask about it. DR On 08/05/2010 11:02 PM, Jonathan Holloway wrote: > Hi David, > > We did discuss potentially doing this as well. It would be nice to get some > recipes for Zookeeper done for this area, if people think it's useful. Were > you thinking of submitting this back as a recipe, if not then I could > potentially work on such a recipe instead. > > Many thanks, > Jon. > > >> I just ran into this exact situation, and handled it like so: >> >> I wrote a library that uses the option (b) you described above. Only >> instead of requesting a single sequence number, you request a block of them >> at a time from Zookeeper, and then locally use them up one by one from the >> block you retrieved. Retrieving by block (e.g., by blocks of 10000 at a >> time) eliminates the contention issue. >> >> Then, if you're finished assigning ID's from that block, but still have a >> bunch of ID's left in the block, the library has another function to "push >> back" the unused ID's. They'll then get pulled again in the next block >> retrieval. >> >> We don't actually have this code running in production yet, so I can't >> vouch for how well it works. But the design was reviewed and given the >> thumbs up by the core developers on the team, and the implementation passes >> all my unit tests. >> >> HTH. Feel free to email back with specific questions if you'd like more >> details. >> >> DR > |
|
Hi David,
I think it would be really useful. It would be very helpful for someone looking for geenrating unique tokens/generations ids ( I can think of plenty of applications for this). Please do consider contributing it back to the community! Thanks mahadev On 8/6/10 7:10 AM, "David Rosenstrauch" <[hidden email]> wrote: > Perhaps. I'd have to ask my boss for permission to release the code. > > Is this something that would be interesting/useful to other people? If > so, I can ask about it. > > DR > > On 08/05/2010 11:02 PM, Jonathan Holloway wrote: >> Hi David, >> >> We did discuss potentially doing this as well. It would be nice to get some >> recipes for Zookeeper done for this area, if people think it's useful. Were >> you thinking of submitting this back as a recipe, if not then I could >> potentially work on such a recipe instead. >> >> Many thanks, >> Jon. >> >> >>> I just ran into this exact situation, and handled it like so: >>> >>> I wrote a library that uses the option (b) you described above. Only >>> instead of requesting a single sequence number, you request a block of them >>> at a time from Zookeeper, and then locally use them up one by one from the >>> block you retrieved. Retrieving by block (e.g., by blocks of 10000 at a >>> time) eliminates the contention issue. >>> >>> Then, if you're finished assigning ID's from that block, but still have a >>> bunch of ID's left in the block, the library has another function to "push >>> back" the unused ID's. They'll then get pulled again in the next block >>> retrieval. >>> >>> We don't actually have this code running in production yet, so I can't >>> vouch for how well it works. But the design was reviewed and given the >>> thumbs up by the core developers on the team, and the implementation passes >>> all my unit tests. >>> >>> HTH. Feel free to email back with specific questions if you'd like more >>> details. >>> >>> DR >> > > |
|
I'll run it by my boss next week.
DR On 08/06/2010 07:30 PM, Mahadev Konar wrote: > Hi David, > I think it would be really useful. It would be very helpful for someone > looking for geenrating unique tokens/generations ids ( I can think of plenty > of applications for this). > > Please do consider contributing it back to the community! > > Thanks > mahadev > > > On 8/6/10 7:10 AM, "David Rosenstrauch"<[hidden email]> wrote: > >> Perhaps. I'd have to ask my boss for permission to release the code. >> >> Is this something that would be interesting/useful to other people? If >> so, I can ask about it. >> >> DR >> >> On 08/05/2010 11:02 PM, Jonathan Holloway wrote: >>> Hi David, >>> >>> We did discuss potentially doing this as well. It would be nice to get some >>> recipes for Zookeeper done for this area, if people think it's useful. Were >>> you thinking of submitting this back as a recipe, if not then I could >>> potentially work on such a recipe instead. >>> >>> Many thanks, >>> Jon. >>> >>> >>>> I just ran into this exact situation, and handled it like so: >>>> >>>> I wrote a library that uses the option (b) you described above. Only >>>> instead of requesting a single sequence number, you request a block of them >>>> at a time from Zookeeper, and then locally use them up one by one from the >>>> block you retrieved. Retrieving by block (e.g., by blocks of 10000 at a >>>> time) eliminates the contention issue. >>>> >>>> Then, if you're finished assigning ID's from that block, but still have a >>>> bunch of ID's left in the block, the library has another function to "push >>>> back" the unused ID's. They'll then get pulled again in the next block >>>> retrieval. >>>> >>>> We don't actually have this code running in production yet, so I can't >>>> vouch for how well it works. But the design was reviewed and given the >>>> thumbs up by the core developers on the team, and the implementation passes >>>> all my unit tests. >>>> >>>> HTH. Feel free to email back with specific questions if you'd like more >>>> details. >>>> >>>> DR |
|
Tell him that we will all look over your code so he gets immediate free
consulting. On Fri, Aug 6, 2010 at 7:39 PM, David Rosenstrauch <[hidden email]>wrote: > I'll run it by my boss next week. > > DR > > > On 08/06/2010 07:30 PM, Mahadev Konar wrote: > >> Hi David, >> I think it would be really useful. It would be very helpful for someone >> looking for geenrating unique tokens/generations ids ( I can think of >> plenty >> of applications for this). >> >> Please do consider contributing it back to the community! >> >> Thanks >> mahadev >> >> >> On 8/6/10 7:10 AM, "David Rosenstrauch"<[hidden email]> wrote: >> >> Perhaps. I'd have to ask my boss for permission to release the code. >>> >>> Is this something that would be interesting/useful to other people? If >>> so, I can ask about it. >>> >>> DR >>> >>> On 08/05/2010 11:02 PM, Jonathan Holloway wrote: >>> >>>> Hi David, >>>> >>>> We did discuss potentially doing this as well. It would be nice to get >>>> some >>>> recipes for Zookeeper done for this area, if people think it's useful. >>>> Were >>>> you thinking of submitting this back as a recipe, if not then I could >>>> potentially work on such a recipe instead. >>>> >>>> Many thanks, >>>> Jon. >>>> >>>> >>>> I just ran into this exact situation, and handled it like so: >>>>> >>>>> I wrote a library that uses the option (b) you described above. Only >>>>> instead of requesting a single sequence number, you request a block of >>>>> them >>>>> at a time from Zookeeper, and then locally use them up one by one from >>>>> the >>>>> block you retrieved. Retrieving by block (e.g., by blocks of 10000 at >>>>> a >>>>> time) eliminates the contention issue. >>>>> >>>>> Then, if you're finished assigning ID's from that block, but still have >>>>> a >>>>> bunch of ID's left in the block, the library has another function to >>>>> "push >>>>> back" the unused ID's. They'll then get pulled again in the next block >>>>> retrieval. >>>>> >>>>> We don't actually have this code running in production yet, so I can't >>>>> vouch for how well it works. But the design was reviewed and given the >>>>> thumbs up by the core developers on the team, and the implementation >>>>> passes >>>>> all my unit tests. >>>>> >>>>> HTH. Feel free to email back with specific questions if you'd like >>>>> more >>>>> details. >>>>> >>>>> DR >>>>> >>>> |
|
Hi all,
we have something implementing the optimistic concurrency approach to sequence generation that we've been running in production for some time now. We don't see a huge amount of contention over the sequence counters as the nature of our app lends itself well to partitioned keys. Initially, we coded up the simplest thing we thought could work and deployed it, figuring that we'd have plenty of scope for improvement once we saw it running with real load. However, to date its been ticking over so well we've not really had cause to spend any further effort on it. There's plenty of scope for improvement though, two of the things we had thought we would need to do sooner rather than later are implement an exponential backoff scheme (like Ted describes) when there is contention over a given counter, and to add a more performant network interface than HTTP. Like I say though, this just hasn't been a high enough priority for us yet. Anyway, we've been meaning to open source this for a while now, and prompted by this thread, I just spent an afternoon tidying up a little and pushing to github. Its at http://github.com/talisplatform/H1 and any feedback would be gratefully received. Cheers, Sam On 7 August 2010 03:40, Ted Dunning <[hidden email]> wrote: > Tell him that we will all look over your code so he gets immediate free > consulting. > > On Fri, Aug 6, 2010 at 7:39 PM, David Rosenstrauch <[hidden email] > >wrote: > > > I'll run it by my boss next week. > > > > DR > > > > > > On 08/06/2010 07:30 PM, Mahadev Konar wrote: > > > >> Hi David, > >> I think it would be really useful. It would be very helpful for someone > >> looking for geenrating unique tokens/generations ids ( I can think of > >> plenty > >> of applications for this). > >> > >> Please do consider contributing it back to the community! > >> > >> Thanks > >> mahadev > >> > >> > >> On 8/6/10 7:10 AM, "David Rosenstrauch"<[hidden email]> wrote: > >> > >> Perhaps. I'd have to ask my boss for permission to release the code. > >>> > >>> Is this something that would be interesting/useful to other people? If > >>> so, I can ask about it. > >>> > >>> DR > >>> > >>> On 08/05/2010 11:02 PM, Jonathan Holloway wrote: > >>> > >>>> Hi David, > >>>> > >>>> We did discuss potentially doing this as well. It would be nice to > get > >>>> some > >>>> recipes for Zookeeper done for this area, if people think it's useful. > >>>> Were > >>>> you thinking of submitting this back as a recipe, if not then I could > >>>> potentially work on such a recipe instead. > >>>> > >>>> Many thanks, > >>>> Jon. > >>>> > >>>> > >>>> I just ran into this exact situation, and handled it like so: > >>>>> > >>>>> I wrote a library that uses the option (b) you described above. Only > >>>>> instead of requesting a single sequence number, you request a block > of > >>>>> them > >>>>> at a time from Zookeeper, and then locally use them up one by one > from > >>>>> the > >>>>> block you retrieved. Retrieving by block (e.g., by blocks of 10000 > at > >>>>> a > >>>>> time) eliminates the contention issue. > >>>>> > >>>>> Then, if you're finished assigning ID's from that block, but still > have > >>>>> a > >>>>> bunch of ID's left in the block, the library has another function to > >>>>> "push > >>>>> back" the unused ID's. They'll then get pulled again in the next > block > >>>>> retrieval. > >>>>> > >>>>> We don't actually have this code running in production yet, so I > can't > >>>>> vouch for how well it works. But the design was reviewed and given > the > >>>>> thumbs up by the core developers on the team, and the implementation > >>>>> passes > >>>>> all my unit tests. > >>>>> > >>>>> HTH. Feel free to email back with specific questions if you'd like > >>>>> more > >>>>> details. > >>>>> > >>>>> DR > >>>>> > >>>> > |
|
In reply to this post by David Rosenstrauch
Good news! I got approval to release this code! (Man, I love working
for a startup!!!) :-) So anyone know: what's the next step? Do I need to obtain commit privileges? Or do I deliver the code to someone who has commit privs who shepherds this for me? Also, what (if anything) do I need to tweak in the code to make it release-ready. (e.g., Change package names? Slap an Apache license on it? etc.) Thanks, DR On 08/06/2010 10:39 PM, David Rosenstrauch wrote: > I'll run it by my boss next week. > > DR > > On 08/06/2010 07:30 PM, Mahadev Konar wrote: >> Hi David, >> I think it would be really useful. It would be very helpful for someone >> looking for geenrating unique tokens/generations ids ( I can think of >> plenty >> of applications for this). >> >> Please do consider contributing it back to the community! >> >> Thanks >> mahadev >> >> >> On 8/6/10 7:10 AM, "David Rosenstrauch"<[hidden email]> wrote: >> >>> Perhaps. I'd have to ask my boss for permission to release the code. >>> >>> Is this something that would be interesting/useful to other people? If >>> so, I can ask about it. >>> >>> DR >>> >>> On 08/05/2010 11:02 PM, Jonathan Holloway wrote: >>>> Hi David, >>>> >>>> We did discuss potentially doing this as well. It would be nice to >>>> get some >>>> recipes for Zookeeper done for this area, if people think it's >>>> useful. Were >>>> you thinking of submitting this back as a recipe, if not then I could >>>> potentially work on such a recipe instead. >>>> >>>> Many thanks, >>>> Jon. >>>> >>>> >>>>> I just ran into this exact situation, and handled it like so: >>>>> >>>>> I wrote a library that uses the option (b) you described above. Only >>>>> instead of requesting a single sequence number, you request a block >>>>> of them >>>>> at a time from Zookeeper, and then locally use them up one by one >>>>> from the >>>>> block you retrieved. Retrieving by block (e.g., by blocks of 10000 >>>>> at a >>>>> time) eliminates the contention issue. >>>>> >>>>> Then, if you're finished assigning ID's from that block, but still >>>>> have a >>>>> bunch of ID's left in the block, the library has another function >>>>> to "push >>>>> back" the unused ID's. They'll then get pulled again in the next block >>>>> retrieval. >>>>> >>>>> We don't actually have this code running in production yet, so I can't >>>>> vouch for how well it works. But the design was reviewed and given the >>>>> thumbs up by the core developers on the team, and the >>>>> implementation passes >>>>> all my unit tests. >>>>> >>>>> HTH. Feel free to email back with specific questions if you'd like >>>>> more >>>>> details. >>>>> >>>>> DR |
|
Great!
Basic details are here (create a jira, attach a patch, click "submit" and someone will review and help you get it into a state which we can commit). Probably you'd put your code into src/recipes or src/contrib (recipes sounds reasonable). http://wiki.apache.org/hadoop/ZooKeeper/HowToContribute Patrick On 08/10/2010 09:59 AM, David Rosenstrauch wrote: > Good news! I got approval to release this code! (Man, I love working > for a startup!!!) :-) > > So anyone know: what's the next step? Do I need to obtain commit > privileges? Or do I deliver the code to someone who has commit privs who > shepherds this for me? > > Also, what (if anything) do I need to tweak in the code to make it > release-ready. (e.g., Change package names? Slap an Apache license on > it? etc.) > > Thanks, > > DR > > On 08/06/2010 10:39 PM, David Rosenstrauch wrote: >> I'll run it by my boss next week. >> >> DR >> >> On 08/06/2010 07:30 PM, Mahadev Konar wrote: >>> Hi David, >>> I think it would be really useful. It would be very helpful for someone >>> looking for geenrating unique tokens/generations ids ( I can think of >>> plenty >>> of applications for this). >>> >>> Please do consider contributing it back to the community! >>> >>> Thanks >>> mahadev >>> >>> >>> On 8/6/10 7:10 AM, "David Rosenstrauch"<[hidden email]> wrote: >>> >>>> Perhaps. I'd have to ask my boss for permission to release the code. >>>> >>>> Is this something that would be interesting/useful to other people? If >>>> so, I can ask about it. >>>> >>>> DR >>>> >>>> On 08/05/2010 11:02 PM, Jonathan Holloway wrote: >>>>> Hi David, >>>>> >>>>> We did discuss potentially doing this as well. It would be nice to >>>>> get some >>>>> recipes for Zookeeper done for this area, if people think it's >>>>> useful. Were >>>>> you thinking of submitting this back as a recipe, if not then I could >>>>> potentially work on such a recipe instead. >>>>> >>>>> Many thanks, >>>>> Jon. >>>>> >>>>> >>>>>> I just ran into this exact situation, and handled it like so: >>>>>> >>>>>> I wrote a library that uses the option (b) you described above. Only >>>>>> instead of requesting a single sequence number, you request a block >>>>>> of them >>>>>> at a time from Zookeeper, and then locally use them up one by one >>>>>> from the >>>>>> block you retrieved. Retrieving by block (e.g., by blocks of 10000 >>>>>> at a >>>>>> time) eliminates the contention issue. >>>>>> >>>>>> Then, if you're finished assigning ID's from that block, but still >>>>>> have a >>>>>> bunch of ID's left in the block, the library has another function >>>>>> to "push >>>>>> back" the unused ID's. They'll then get pulled again in the next >>>>>> block >>>>>> retrieval. >>>>>> >>>>>> We don't actually have this code running in production yet, so I >>>>>> can't >>>>>> vouch for how well it works. But the design was reviewed and given >>>>>> the >>>>>> thumbs up by the core developers on the team, and the >>>>>> implementation passes >>>>>> all my unit tests. >>>>>> >>>>>> HTH. Feel free to email back with specific questions if you'd like >>>>>> more >>>>>> details. >>>>>> >>>>>> DR > |
|
OK, will do, as soon as time permits. It'll take me a little while to
do the needed tweaks. (Plus I'm under some pretty heavy deadline on some other work right now.) Will email back once I've got this done. DR On 08/10/2010 01:10 PM, Patrick Hunt wrote: > Great! > > Basic details are here (create a jira, attach a patch, click "submit" > and someone will review and help you get it into a state which we can > commit). Probably you'd put your code into src/recipes or src/contrib > (recipes sounds reasonable). > http://wiki.apache.org/hadoop/ZooKeeper/HowToContribute > > Patrick > > On 08/10/2010 09:59 AM, David Rosenstrauch wrote: >> Good news! I got approval to release this code! (Man, I love working >> for a startup!!!) :-) >> >> So anyone know: what's the next step? Do I need to obtain commit >> privileges? Or do I deliver the code to someone who has commit privs who >> shepherds this for me? >> >> Also, what (if anything) do I need to tweak in the code to make it >> release-ready. (e.g., Change package names? Slap an Apache license on >> it? etc.) >> >> Thanks, >> >> DR |
|
In reply to this post by jph98
What happens during a network partition and different clients are
incrementing "different" counters, and then the partition goes away? Won't (potentially) the same sequence value be given out to two clients? .. Adam On Thu, Aug 5, 2010 at 5:38 PM, Jonathan Holloway <[hidden email]> wrote: > Hi Ted, > > Thanks for the comments. > > I might have overlooked something here, but is it also possible to do the > following: > > 1. Create a PERSISTENT node > 2. Have multiple clients set the data on the node, e.g. Stat stat = > zookeeper.setData(SEQUENCE, ArrayUtils.EMPTY_BYTE_ARRAY, -1); > 3. Use the version number from stat.getVersion() as the sequence (obviously > I'm limited to Integer.MAX_VALUE) > > Are there any weird race conditions involved here which would mean that a > client would receive the wrong Stat object back? > > Many thanks again, > Jon. > > On 5 August 2010 16:09, Ted Dunning <[hidden email]> wrote: > >> (b) >> >> BUT: >> >> Sequential numbering is a special case of "now". In large diameters, now >> gets very expensive. This is a special case of that assertion. If there >> is >> a way to get away from this presumption of the need for sequential >> numbering, you will be miles better off. >> >> HOWEVER: >> >> ZK can do better than you suggest. Incrementing a counter does involve >> potential contention, but you will very likely be able to get to pretty >> high >> rates before the optimistic locking begins to fail. If you code your >> update >> with a few tries at full speed followed by some form of retry back-off, you >> should get pretty close to the best possible performance. >> >> You might also try building a lock with an ephemeral file before updating >> the counter. I would expect that this will be slower than the back-off >> option if only because involves more transactions in ZK. IF you wanted to >> get too complicated for your own good, you could have a secondary strategy >> flag that is only sampled by all clients every few seconds and is updated >> whenever a client needs to back-off more than say 5 steps. If this flag >> has >> been updated recently, then clients should switch to the locking protocol. >> You might even have several locks so that you don't exclude all other >> updaters, merely thin them out a bit. This flagged strategy would run as >> fast as optimistic locking as long as optimistic locking is fast and then >> would limit the total number of transactions needed under very high load. >> >> On Thu, Aug 5, 2010 at 3:31 PM, Jonathan Holloway < >> [hidden email]> wrote: >> >> > My so far involve: >> > a) Creating a node with PERSISTENT_SEQUENTIAL then deleting it - this >> gives >> > me the monotonically increasing number, but the sequence number isn't >> > contiguous >> > b) Storing the sequence number in the data portion of a persistent node - >> > then updating this (using the version number - aka optimistic locking). >> > The >> > problem with this is that under high load I'm assuming there'll be a lot >> of >> > contention and hence failures with regards to updates. >> > >> > |
|
Can't happen.
In a network partition, the side without a quorum can't update the file version. On Wed, Aug 11, 2010 at 3:11 PM, Adam Rosien <[hidden email]> wrote: > What happens during a network partition and different clients are > incrementing "different" counters, and then the partition goes away? > Won't (potentially) the same sequence value be given out to two > clients? > > .. Adam > > On Thu, Aug 5, 2010 at 5:38 PM, Jonathan Holloway > <[hidden email]> wrote: > > Hi Ted, > > > > Thanks for the comments. > > > > I might have overlooked something here, but is it also possible to do the > > following: > > > > 1. Create a PERSISTENT node > > 2. Have multiple clients set the data on the node, e.g. Stat stat = > > zookeeper.setData(SEQUENCE, ArrayUtils.EMPTY_BYTE_ARRAY, -1); > > 3. Use the version number from stat.getVersion() as the sequence > (obviously > > I'm limited to Integer.MAX_VALUE) > > > > Are there any weird race conditions involved here which would mean that a > > client would receive the wrong Stat object back? > > > > Many thanks again, > > Jon. > > > > On 5 August 2010 16:09, Ted Dunning <[hidden email]> wrote: > > > >> (b) > >> > >> BUT: > >> > >> Sequential numbering is a special case of "now". In large diameters, > now > >> gets very expensive. This is a special case of that assertion. If > there > >> is > >> a way to get away from this presumption of the need for sequential > >> numbering, you will be miles better off. > >> > >> HOWEVER: > >> > >> ZK can do better than you suggest. Incrementing a counter does involve > >> potential contention, but you will very likely be able to get to pretty > >> high > >> rates before the optimistic locking begins to fail. If you code your > >> update > >> with a few tries at full speed followed by some form of retry back-off, > you > >> should get pretty close to the best possible performance. > >> > >> You might also try building a lock with an ephemeral file before > updating > >> the counter. I would expect that this will be slower than the back-off > >> option if only because involves more transactions in ZK. IF you wanted > to > >> get too complicated for your own good, you could have a secondary > strategy > >> flag that is only sampled by all clients every few seconds and is > updated > >> whenever a client needs to back-off more than say 5 steps. If this flag > >> has > >> been updated recently, then clients should switch to the locking > protocol. > >> You might even have several locks so that you don't exclude all other > >> updaters, merely thin them out a bit. This flagged strategy would run > as > >> fast as optimistic locking as long as optimistic locking is fast and > then > >> would limit the total number of transactions needed under very high > load. > >> > >> On Thu, Aug 5, 2010 at 3:31 PM, Jonathan Holloway < > >> [hidden email]> wrote: > >> > >> > My so far involve: > >> > a) Creating a node with PERSISTENT_SEQUENTIAL then deleting it - this > >> gives > >> > me the monotonically increasing number, but the sequence number isn't > >> > contiguous > >> > b) Storing the sequence number in the data portion of a persistent > node - > >> > then updating this (using the version number - aka optimistic > locking). > >> > The > >> > problem with this is that under high load I'm assuming there'll be a > lot > >> of > >> > contention and hence failures with regards to updates. > >> > > >> > > > |
|
Ah thanks, I forgot the "majority-commit" property because I also
forgot that all servers know what the cluster should look like, rather than act adaptively (which wouldn't make sense after all). .. Adam On Wed, Aug 11, 2010 at 3:23 PM, Ted Dunning <[hidden email]> wrote: > Can't happen. > > In a network partition, the side without a quorum can't update the file > version. > > On Wed, Aug 11, 2010 at 3:11 PM, Adam Rosien <[hidden email]> wrote: > >> What happens during a network partition and different clients are >> incrementing "different" counters, and then the partition goes away? >> Won't (potentially) the same sequence value be given out to two >> clients? >> >> .. Adam >> >> On Thu, Aug 5, 2010 at 5:38 PM, Jonathan Holloway >> <[hidden email]> wrote: >> > Hi Ted, >> > >> > Thanks for the comments. >> > >> > I might have overlooked something here, but is it also possible to do the >> > following: >> > >> > 1. Create a PERSISTENT node >> > 2. Have multiple clients set the data on the node, e.g. Stat stat = >> > zookeeper.setData(SEQUENCE, ArrayUtils.EMPTY_BYTE_ARRAY, -1); >> > 3. Use the version number from stat.getVersion() as the sequence >> (obviously >> > I'm limited to Integer.MAX_VALUE) >> > >> > Are there any weird race conditions involved here which would mean that a >> > client would receive the wrong Stat object back? >> > >> > Many thanks again, >> > Jon. >> > >> > On 5 August 2010 16:09, Ted Dunning <[hidden email]> wrote: >> > >> >> (b) >> >> >> >> BUT: >> >> >> >> Sequential numbering is a special case of "now". In large diameters, >> now >> >> gets very expensive. This is a special case of that assertion. If >> there >> >> is >> >> a way to get away from this presumption of the need for sequential >> >> numbering, you will be miles better off. >> >> >> >> HOWEVER: >> >> >> >> ZK can do better than you suggest. Incrementing a counter does involve >> >> potential contention, but you will very likely be able to get to pretty >> >> high >> >> rates before the optimistic locking begins to fail. If you code your >> >> update >> >> with a few tries at full speed followed by some form of retry back-off, >> you >> >> should get pretty close to the best possible performance. >> >> >> >> You might also try building a lock with an ephemeral file before >> updating >> >> the counter. I would expect that this will be slower than the back-off >> >> option if only because involves more transactions in ZK. IF you wanted >> to >> >> get too complicated for your own good, you could have a secondary >> strategy >> >> flag that is only sampled by all clients every few seconds and is >> updated >> >> whenever a client needs to back-off more than say 5 steps. If this flag >> >> has >> >> been updated recently, then clients should switch to the locking >> protocol. >> >> You might even have several locks so that you don't exclude all other >> >> updaters, merely thin them out a bit. This flagged strategy would run >> as >> >> fast as optimistic locking as long as optimistic locking is fast and >> then >> >> would limit the total number of transactions needed under very high >> load. >> >> >> >> On Thu, Aug 5, 2010 at 3:31 PM, Jonathan Holloway < >> >> [hidden email]> wrote: >> >> >> >> > My so far involve: >> >> > a) Creating a node with PERSISTENT_SEQUENTIAL then deleting it - this >> >> gives >> >> > me the monotonically increasing number, but the sequence number isn't >> >> > contiguous >> >> > b) Storing the sequence number in the data portion of a persistent >> node - >> >> > then updating this (using the version number - aka optimistic >> locking). >> >> > The >> >> > problem with this is that under high load I'm assuming there'll be a >> lot >> >> of >> >> > contention and hence failures with regards to updates. >> >> > >> >> >> > >> > |
| Powered by Nabble | Edit this page |
