[GitHub] accumulo pull request #199: DO NOT MERGE: PROOF OF CONCEPT

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

[GitHub] accumulo pull request #199: DO NOT MERGE: PROOF OF CONCEPT

ctubbsii
GitHub user phrocker opened a pull request:

    https://github.com/apache/accumulo/pull/199

    DO NOT MERGE: PROOF OF CONCEPT

    Hello this is a proof of concept that will hopefully begin the discussion of changing how we request resources.
   
    First of all there are a lot of changes here. That's primarily because I'm more focused on getting some ideas out there and getting feedback so that I can provide a solid design to the community. There are commits from 2015 in here because I took code from a previous commit that I wanted to include.
   
    The basic concept and what I want from this PR is how do we modify the API so that we can define resources intelligently AND allow users to request users within some bounds
   
    I spoke to @keith-turner and @ctubbsii  today. We can break this up into the following ideas.
   
    1) Allow configuration to define resource pools ( memory, cpu, etc ) What will this look like? How do we constrain this?
    2) Define a way of segmenting tables/pools. Right now I have included code I've used elsewhere to define table specific threadpools, but I think we need to be more fine grained. But how do we define resource groups? Right now I've created a stubbed out object called ResourceRequest. That needs to be better defined, which I will do in the coming days and welcome feedback.
    3) How do we  define limits on user request? Clearly I've modify the thrift calls so that we can allow scans to create resource requests, but is this a good idea? If so, should we create a new permission? How do we avoid abuse with resource requests?
    4) Can we eliminate some of our current resource constraints and merge them into this concept? I.e. can we get rid of max open files? Can we extend this concept to be aware of resources consumed by compactions?
    5) Could we potentially define more concrete critical sections?
   
    Please ignore the badness in the PR -- I can guarantee that it doesn't compile. I took a vacation and told @keith-turner  I'd put this up here because I think getting ideas now will serve us better. I'm going to clean this up a great deal in the next few days so that I can have others help on this. I will attach tickets to this PR as I have a previous ticket that corresponds to this; however, I'd welcome input into the list above.
   
    I will update this PR tomorrow with cleanup and begin stubbing out portions from the list above and other ideas that are presented. Bottom line, what's your sticking point with how tasks run in Accumulo and is there anything we can do to improve it?
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/phrocker/accumulo-1 master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/accumulo/pull/199.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #199
   
----
commit bbd2257a001d654a87b433bb387ad9fce402dbb8
Author: phrocker <[hidden email]>
Date:   2015-12-31T15:16:49Z

    ACCUMULO-3509: Allow cleanup state to be kept to avoid blocking tserver session sweep
   
    By enabling state ( true/false) from the cleanup method, the change will avoid blocking
    on a scan session being swept. if the session cleanup blocks because a ScanSession is
    still being read, we may block until the ScanBatch returns for that ScanSession.
   
    The change uses a simple semaphore ( purely because I like the word ) to attempt acquisition.
    If that fails, we return false from the cleanup and reintroduce that Session back into
    the queue to clean up.

commit 3cd07f03104ef8382afd740de82efdfa2e682f73
Author: Marc <[hidden email]>
Date:   2017-01-10T01:20:11Z

    THIS IS A POC PR

commit 0fcf266560461c04bcdf7511128d7f6c0a182c26
Author: Marc <[hidden email]>
Date:   2017-01-10T01:24:29Z

    Merge branch 'master' of https://github.com/phrocker/accumulo-1

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] accumulo issue #199: DO NOT MERGE: PROOF OF CONCEPT

ctubbsii
Github user phrocker commented on the issue:

    https://github.com/apache/accumulo/pull/199
 
    Oh and tests will fail. Again, just a thought provoking PR. I like the idea of submitting a PR to get ideas and if someone has ideas pointing to code this is a great way for them to point it out specifically.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] accumulo issue #199: DO NOT MERGE: PROOF OF CONCEPT

ctubbsii
In reply to this post by ctubbsii
Github user dlmarion commented on the issue:

    https://github.com/apache/accumulo/pull/199
 
    I see in Property.java the definition of queues / pools. Are the high water marks for the requested resources set there also? I could not tell from the PR, but I think an admin should be able to define the pools and max resources in the configuration, and then have the ability to allow users to use specific pools or something to that effect (thinking about multi-tenancy).
   
    Do pools have priority? If I define a resource pool for administrative things, can I make it so that it can preempt resources for another user-facing pool?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] accumulo issue #199: DO NOT MERGE: PROOF OF CONCEPT

ctubbsii
In reply to this post by ctubbsii
Github user phrocker commented on the issue:

    https://github.com/apache/accumulo/pull/199
 
    @dlmarion Some of these things aren't well defined in Property.java yet as I was worried about going in a direction and it being all wrong. I take your comment as a 'feature request' sort of. To better define pools resource queues will allow configurations to define thread priorities. We would rely on the operating system to do prioritization.
   
    I think this was a question I have. Should the configuration allow users to select from a set of configured pools of resources or allow them to override and select any set of resources that are defined? @keith-turner  thoughts?
   
    I'll try and better define some of these properties tomorrow so I can clarify this for the next person.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] accumulo issue #199: DO NOT MERGE: PROOF OF CONCEPT

ctubbsii
In reply to this post by ctubbsii
Github user dlmarion commented on the issue:

    https://github.com/apache/accumulo/pull/199
 
    @phrocker - Just thinking about multi-tenancy and competition for resources, allowing an administrator to define upper limits, and making sure that user operations don't trump an administrator trying to make a change on the running system. Another thing to think about is something that came up on the user list today, can/should an admin be able to define a pool of resources that can't be interrupted (iterator tear-down and re-setup)?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] accumulo issue #199: DO NOT MERGE: PROOF OF CONCEPT

ctubbsii
In reply to this post by ctubbsii
Github user joshelser commented on the issue:

    https://github.com/apache/accumulo/pull/199
 
    > Clearly I've modify the thrift calls so that we can allow scans to create resource requests, but is this a good idea
   
    I think I've heard that HDFS has the notion of a "resource coupon" that, I'm guessing, is similar in spirit. I'd have to see if I can find a design doc from them. Might be something useful to extract.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] accumulo issue #199: DO NOT MERGE: PROOF OF CONCEPT

ctubbsii
In reply to this post by ctubbsii
Github user joshelser commented on the issue:

    https://github.com/apache/accumulo/pull/199
 
    > I think an admin should be able to define the pools and max resources in the configuration, and then have the ability to allow users to use specific pools or something to that effect (thinking about multi-tenancy).
   
    If we move to bounded queues for resources in thrift servers, we might also have to start thinking about preemption or small timeouts to prevent starvation/denial of service.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] accumulo issue #199: DO NOT MERGE: PROOF OF CONCEPT

ctubbsii
In reply to this post by ctubbsii
Github user keith-turner commented on the issue:

    https://github.com/apache/accumulo/pull/199
 
    I am planning to pick this feature up and start work on it next week.  Would really like to see this in 2.0.0.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] accumulo pull request #199: DO NOT MERGE: PROOF OF CONCEPT

ctubbsii
In reply to this post by ctubbsii
Github user keith-turner commented on a diff in the pull request:

    https://github.com/apache/accumulo/pull/199#discussion_r106948790
 
    --- Diff: core/src/main/java/org/apache/accumulo/core/conf/Property.java ---
    @@ -295,6 +295,8 @@
       TSERV_READ_AHEAD_MAXCONCURRENT("tserver.readahead.concurrent.max", "16", PropertyType.COUNT,
           "The maximum number of concurrent read ahead that will execute. This effectively"
               + " limits the number of long running scans that can run concurrently per tserver."),
    +  TSERV_READ_AHEAD_PREFIX("tserver.readahead.concurrent.queue.", null, PropertyType.PREFIX,
    --- End diff --
   
    could also have a maxTime property on resource group...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] accumulo issue #199: DO NOT MERGE: PROOF OF CONCEPT

ctubbsii
In reply to this post by ctubbsii
Github user phrocker commented on the issue:

    https://github.com/apache/accumulo/pull/199
 
    @keith-turner  please have at it. After some initial tests I felt this wasn't an arbitrary feature, but by all means have fun with it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] accumulo pull request #199: DO NOT MERGE: PROOF OF CONCEPT

ctubbsii
In reply to this post by ctubbsii
Github user phrocker closed the pull request at:

    https://github.com/apache/accumulo/pull/199


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---