[DISCUSS] Hadoop3 support target?

[DISCUSS] Hadoop3 support target?

Josh Elser-2
What branch do we want to consider Hadoop3 support?

There is a 3.0.0-beta1 release that's been out for a while, and Hadoop
PMC has already done a 3.0.0 RC0. I think it's the right time to start
considering this.

In my poking so far, I've filed ACCUMULO-4753 which I'm working through
now. This does raise the question: where do we want to say we support
Hadoop3? 1.8 or 2.0? (have we "officially" deprecated 1.7?)

- Josh

https://issues.apache.org/jira/browse/ACCUMULO-4753

Re: [DISCUSS] Hadoop3 support target?

Christopher Tubbs-2
I don't think we can support it with 1.8 or earlier, because of some
serious incompatibilities (namely, ACCUMULO-4611/4753).
I think people are still patching 1.7, so I don't think we've "officially"
EOL'd it.
I think 2.0 could require Hadoop 3, if Hadoop 3 is sufficiently stable.

On Mon, Dec 4, 2017 at 1:14 PM Josh Elser <[hidden email]> wrote:

> What branch do we want to consider Hadoop3 support?
>
> There is a 3.0.0-beta1 release that's been out for a while, and Hadoop
> PMC has already done a 3.0.0 RC0. I think it's the right time to start
> considering this.
>
> In my poking so far, I've filed ACCUMULO-4753 which I'm working through
> now. This does raise the question: where do we want to say we support
> Hadoop3? 1.8 or 2.0? (have we "officially" deprecated 1.7?)
>
> - Josh
>
> https://issues.apache.org/jira/browse/ACCUMULO-4753
>

Re: [DISCUSS] Hadoop3 support target?

Josh Elser-2
Sorry, I don't follow. Why do you think 4611/4753 is a show-stopper?
Cuz, uh... I made it work already :)

Thanks for the JIRA cleanup. Forgot about that one.

On 12/4/17 5:55 PM, Christopher wrote:

> I don't think we can support it with 1.8 or earlier, because of some
> serious incompatibilities (namely, ACCUMULO-4611/4753)
> I think people are still patching 1.7, so I don't think we've "officially"
> EOL'd it.
> I think 2.0 could require Hadoop 3, if Hadoop 3 is sufficiently stable.
>
> On Mon, Dec 4, 2017 at 1:14 PM Josh Elser <[hidden email]> wrote:
>
>> What branch do we want to consider Hadoop3 support?
>>
>> There is a 3.0.0-beta1 release that's been out for a while, and Hadoop
>> PMC has already done a 3.0.0 RC0. I think it's the right time to start
>> considering this.
>>
>> In my poking so far, I've filed ACCUMULO-4753 which I'm working through
>> now. This does raise the question: where do we want to say we support
>> Hadoop3? 1.8 or 2.0? (have we "officially" deprecated 1.7?)
>>
>> - Josh
>>
>> https://issues.apache.org/jira/browse/ACCUMULO-4753
>>
>

Re: [DISCUSS] Hadoop3 support target?

Josh Elser-2
Ah, I'm seeing now -- didn't check my inbox appropriately.

I think the fact that code that we don't own has somehow been allowed to
be public API is the smell. That's something that needs to be rectified
sooner rather than later. By that measure, it can *only* land on Accumulo 2.0
(which is going to be a major issue for the project).

On 12/4/17 5:58 PM, Josh Elser wrote:

> Sorry, I don't follow. Why do you think 4611/4753 is a show-stopper?
> Cuz, uh... I made it work already :)
>
> Thanks for the JIRA cleanup. Forgot about that one.
>
> On 12/4/17 5:55 PM, Christopher wrote:
>> I don't think we can support it with 1.8 or earlier, because of some
>> serious incompatibilities (namely, ACCUMULO-4611/4753)
>> I think people are still patching 1.7, so I don't think we've
>> "officially"
>> EOL'd it.
>> I think 2.0 could require Hadoop 3, if Hadoop 3 is sufficiently stable.
>>
>> On Mon, Dec 4, 2017 at 1:14 PM Josh Elser <[hidden email]> wrote:
>>
>>> What branch do we want to consider Hadoop3 support?
>>>
>>> There is a 3.0.0-beta1 release that's been out for a while, and Hadoop
>>> PMC has already done a 3.0.0 RC0. I think it's the right time to start
>>> considering this.
>>>
>>> In my poking so far, I've filed ACCUMULO-4753 which I'm working through
>>> now. This does raise the question: where do we want to say we support
>>> Hadoop3? 1.8 or 2.0? (have we "officially" deprecated 1.7?)
>>>
>>> - Josh
>>>
>>> https://issues.apache.org/jira/browse/ACCUMULO-4753
>>>
>>

Re: [DISCUSS] Hadoop3 support target?

Christopher Tubbs-2
Agreed.

On Mon, Dec 4, 2017 at 6:01 PM Josh Elser <[hidden email]> wrote:

> Ah, I'm seeing now -- didn't check my inbox appropriately.
>
> I think the fact that code that we don't own has somehow been allowed to
> be public API is the smell. That's something that needs to be rectified
> sooner than later. By that measure, it can *only* land on Accumulo 2.0
> (which is going to be a major issue for the project).
>
> On 12/4/17 5:58 PM, Josh Elser wrote:
> > Sorry, I don't follow. Why do you think 4611/4753 is a show-stopper?
> > Cuz, uh... I made it work already :)
> >
> > Thanks for the JIRA cleanup. Forgot about that one.
> >
> > On 12/4/17 5:55 PM, Christopher wrote:
> >> I don't think we can support it with 1.8 or earlier, because of some
> >> serious incompatibilities (namely, ACCUMULO-4611/4753)
> >> I think people are still patching 1.7, so I don't think we've
> >> "officially"
> >> EOL'd it.
> >> I think 2.0 could require Hadoop 3, if Hadoop 3 is sufficiently stable.
> >>
> >> On Mon, Dec 4, 2017 at 1:14 PM Josh Elser <[hidden email]> wrote:
> >>
> >>> What branch do we want to consider Hadoop3 support?
> >>>
> >>> There is a 3.0.0-beta1 release that's been out for a while, and Hadoop
> >>> PMC has already done a 3.0.0 RC0. I think it's the right time to start
> >>> considering this.
> >>>
> >>> In my poking so far, I've filed ACCUMULO-4753 which I'm working through
> >>> now. This does raise the question: where do we want to say we support
> >>> Hadoop3? 1.8 or 2.0? (have we "officially" deprecated 1.7?)
> >>>
> >>> - Josh
> >>>
> >>> https://issues.apache.org/jira/browse/ACCUMULO-4753
> >>>
> >>
>

Re: [DISCUSS] Hadoop3 support target?

Josh Elser-2
Also, just to be clear for everyone else:

This means that we have *no roadmap* at all for Hadoop 3 support because
Accumulo 2.0 is languishing.

This is a severe enough problem to me that I would consider breaking API
compatibility and fixing the API leak in 1.7/1.8. I'm curious what
people other than Christopher think (assuming from his comments/JIRA
work that he disagrees with me).

On 12/4/17 6:12 PM, Christopher wrote:

> Agreed.
>
> On Mon, Dec 4, 2017 at 6:01 PM Josh Elser <[hidden email]> wrote:
>
>> Ah, I'm seeing now -- didn't check my inbox appropriately.
>>
>> I think the fact that code that we don't own has somehow been allowed to
>> be public API is the smell. That's something that needs to be rectified
>> sooner than later. By that measure, it can *only* land on Accumulo 2.0
>> (which is going to be a major issue for the project).
>>
>> On 12/4/17 5:58 PM, Josh Elser wrote:
>>> Sorry, I don't follow. Why do you think 4611/4753 is a show-stopper?
>>> Cuz, uh... I made it work already :)
>>>
>>> Thanks for the JIRA cleanup. Forgot about that one.
>>>
>>> On 12/4/17 5:55 PM, Christopher wrote:
>>>> I don't think we can support it with 1.8 or earlier, because of some
>>>> serious incompatibilities (namely, ACCUMULO-4611/4753)
>>>> I think people are still patching 1.7, so I don't think we've
>>>> "officially"
>>>> EOL'd it.
>>>> I think 2.0 could require Hadoop 3, if Hadoop 3 is sufficiently stable.
>>>>
>>>> On Mon, Dec 4, 2017 at 1:14 PM Josh Elser <[hidden email]> wrote:
>>>>
>>>>> What branch do we want to consider Hadoop3 support?
>>>>>
>>>>> There is a 3.0.0-beta1 release that's been out for a while, and Hadoop
>>>>> PMC has already done a 3.0.0 RC0. I think it's the right time to start
>>>>> considering this.
>>>>>
>>>>> In my poking so far, I've filed ACCUMULO-4753 which I'm working through
>>>>> now. This does raise the question: where do we want to say we support
>>>>> Hadoop3? 1.8 or 2.0? (have we "officially" deprecated 1.7?)
>>>>>
>>>>> - Josh
>>>>>
>>>>> https://issues.apache.org/jira/browse/ACCUMULO-4753
>>>>>
>>>>
>>
>

RE: [DISCUSS] Hadoop3 support target?

Dave Marion
There is no reason that you can't mark the offending API methods as deprecated in a 1.8.x release, then immediately branch off of that to create a 2.0 and remove the methods. Alternatively, we could decide to forgo the semver rules for a specific release and make sure to point it out in the release notes.
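
A minimal sketch of the deprecate-then-remove path described above, assuming a client class with one method that exposes a third-party type. `ThirdPartyConfig`, `ClientSettings`, and every method name here are hypothetical stand-ins for illustration, not Accumulo's actual classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a third-party configuration type that leaked into public API.
class ThirdPartyConfig {
  private final Map<String, String> values = new HashMap<>();

  void setProperty(String key, String value) {
    values.put(key, value);
  }

  String getString(String key) {
    return values.get(key);
  }
}

// Hypothetical client-facing class as it might look in the release that deprecates the leak.
class ClientSettings {
  private final ThirdPartyConfig backing = new ThirdPartyConfig();

  ClientSettings with(String key, String value) {
    backing.setProperty(key, value);
    return this;
  }

  /**
   * Exposes the third-party type directly, which ties callers to that library's version.
   *
   * @deprecated use {@link #get(String)}; this method would be removed in the next major release.
   */
  @Deprecated
  ThirdPartyConfig getUnderlyingConfig() {
    return backing;
  }

  // Replacement that uses only JDK types, so the backing library can later be swapped or shaded.
  String get(String key) {
    return backing.getString(key);
  }
}
```

In the deprecating release both methods work; the following major release would delete `getUnderlyingConfig()` and leave only `get(String)`.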

-----Original Message-----
From: Josh Elser [mailto:[hidden email]]
Sent: Monday, December 4, 2017 6:19 PM
To: [hidden email]
Subject: Re: [DISCUSS] Hadoop3 support target?

Also, just to be clear for everyone else:

This means that we have *no roadmap* at all for Hadoop 3 support because Accumulo 2.0 is in a state of languish.

This is a severe enough problem to me that I would consider breaking API compatibility and fixing the API leak in 1.7/1.8. I'm curious what people other than Christopher think (assuming from his comments/JIRA work that he disagrees with me).

On 12/4/17 6:12 PM, Christopher wrote:

> Agreed.
>
> On Mon, Dec 4, 2017 at 6:01 PM Josh Elser <[hidden email]> wrote:
>
>> Ah, I'm seeing now -- didn't check my inbox appropriately.
>>
>> I think the fact that code that we don't own has somehow been allowed
>> to be public API is the smell. That's something that needs to be
>> rectified sooner than later. By that measure, it can *only* land on
>> Accumulo 2.0 (which is going to be a major issue for the project).
>>
>> On 12/4/17 5:58 PM, Josh Elser wrote:
>>> Sorry, I don't follow. Why do you think 4611/4753 is a show-stopper?
>>> Cuz, uh... I made it work already :)
>>>
>>> Thanks for the JIRA cleanup. Forgot about that one.
>>>
>>> On 12/4/17 5:55 PM, Christopher wrote:
>>>> I don't think we can support it with 1.8 or earlier, because of
>>>> some serious incompatibilities (namely, ACCUMULO-4611/4753) I think
>>>> people are still patching 1.7, so I don't think we've "officially"
>>>> EOL'd it.
>>>> I think 2.0 could require Hadoop 3, if Hadoop 3 is sufficiently stable.
>>>>
>>>> On Mon, Dec 4, 2017 at 1:14 PM Josh Elser <[hidden email]> wrote:
>>>>
>>>>> What branch do we want to consider Hadoop3 support?
>>>>>
>>>>> There is a 3.0.0-beta1 release that's been out for a while, and
>>>>> Hadoop PMC has already done a 3.0.0 RC0. I think it's the right
>>>>> time to start considering this.
>>>>>
>>>>> In my poking so far, I've filed ACCUMULO-4753 which I'm working
>>>>> through now. This does raise the question: where do we want to say
>>>>> we support Hadoop3? 1.8 or 2.0? (have we "officially" deprecated
>>>>> 1.7?)
>>>>>
>>>>> - Josh
>>>>>
>>>>> https://issues.apache.org/jira/browse/ACCUMULO-4753
>>>>>
>>>>
>>
>


Re: [DISCUSS] Hadoop3 support target?

Christopher Tubbs-2
In reply to this post by Josh Elser-2
I'm not certain what I'm supposed to be in disagreement with. I think
you've presented a fair assessment of the situation, and I agree with the
severity of the issue.

My comments in JIRA about shading not working were simply an observation
that we also need to consider the API breakage, which the shading wouldn't
fix.

If we're resigned to doing the API breakage in 1.8, we can make that the
"bridge" version (supporting both Hadoop 2 and 3) by shading. Or, we can
wait until 2.0 and shade there to make that the "bridge" version (perhaps
even locking in a release timeline for 2.0... which hasn't seemed urgent
until now). Either way, shading seems the only way forward in order to
resolve this particular dependency issue.

The only other path I can see would be to not have a "bridge" version at
all and instead require upgrading Accumulo simultaneously with Hadoop. I
kind of like that option, but I don't think it's realistic for our
audiences, as it doesn't allow them to manage their upgrade risks, so the
shaded "bridge" version seems like the better option.

On Mon, Dec 4, 2017 at 6:19 PM Josh Elser <[hidden email]> wrote:

> Also, just to be clear for everyone else:
>
> This means that we have *no roadmap* at all for Hadoop 3 support because
> Accumulo 2.0 is in a state of languish.
>
> This is a severe enough problem to me that I would consider breaking API
> compatibility and fixing the API leak in 1.7/1.8. I'm curious what
> people other than Christopher think (assuming from his comments/JIRA
> work that he disagrees with me).
>
> On 12/4/17 6:12 PM, Christopher wrote:
> > Agreed.
> >
> > On Mon, Dec 4, 2017 at 6:01 PM Josh Elser <[hidden email]> wrote:
> >
> >> Ah, I'm seeing now -- didn't check my inbox appropriately.
> >>
> >> I think the fact that code that we don't own has somehow been allowed to
> >> be public API is the smell. That's something that needs to be rectified
> >> sooner than later. By that measure, it can *only* land on Accumulo 2.0
> >> (which is going to be a major issue for the project).
> >>
> >> On 12/4/17 5:58 PM, Josh Elser wrote:
> >>> Sorry, I don't follow. Why do you think 4611/4753 is a show-stopper?
> >>> Cuz, uh... I made it work already :)
> >>>
> >>> Thanks for the JIRA cleanup. Forgot about that one.
> >>>
> >>> On 12/4/17 5:55 PM, Christopher wrote:
> >>>> I don't think we can support it with 1.8 or earlier, because of some
> >>>> serious incompatibilities (namely, ACCUMULO-4611/4753)
> >>>> I think people are still patching 1.7, so I don't think we've
> >>>> "officially"
> >>>> EOL'd it.
> >>>> I think 2.0 could require Hadoop 3, if Hadoop 3 is sufficiently
> stable.
> >>>>
> >>>> On Mon, Dec 4, 2017 at 1:14 PM Josh Elser <[hidden email]> wrote:
> >>>>
> >>>>> What branch do we want to consider Hadoop3 support?
> >>>>>
> >>>>> There is a 3.0.0-beta1 release that's been out for a while, and
> Hadoop
> >>>>> PMC has already done a 3.0.0 RC0. I think it's the right time to
> start
> >>>>> considering this.
> >>>>>
> >>>>> In my poking so far, I've filed ACCUMULO-4753 which I'm working
> through
> >>>>> now. This does raise the question: where do we want to say we support
> >>>>> Hadoop3? 1.8 or 2.0? (have we "officially" deprecated 1.7?)
> >>>>>
> >>>>> - Josh
> >>>>>
> >>>>> https://issues.apache.org/jira/browse/ACCUMULO-4753
> >>>>>
> >>>>
> >>
> >
>

Re: [DISCUSS] Hadoop3 support target?

Keith Turner
In reply to this post by Dave Marion
If we are going to deprecate, then it would be nice to have a
replacement.  One thing that has irked me about the current Accumulo
entry point is that one cannot specify everything needed to connect
in a single props file.  Specifically, credentials cannot be
specified.  It would be really nice to have a new entry point that
allows this.
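
As a rough illustration of the idea (not the API in the commit linked below), here is a minimal sketch of an entry point that reads everything needed to connect, credentials included, from a single properties source. The class `ConnectionInfo` and the property keys are assumptions made up for this example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical holder for everything a client needs in order to connect.
final class ConnectionInfo {
  final String instanceName;
  final String zookeepers;
  final String principal;
  final String password;

  private ConnectionInfo(String instanceName, String zookeepers, String principal, String password) {
    this.instanceName = instanceName;
    this.zookeepers = zookeepers;
    this.principal = principal;
    this.password = password;
  }

  // Loads connection details and credentials from one properties source.
  // The key names below are invented for this sketch.
  static ConnectionInfo load(Reader source) {
    Properties props = new Properties();
    try {
      props.load(source);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return new ConnectionInfo(
        require(props, "instance.name"),
        require(props, "instance.zookeepers"),
        require(props, "auth.principal"),
        require(props, "auth.password"));
  }

  // Fails fast when a required key is absent or empty.
  private static String require(Properties props, String key) {
    String value = props.getProperty(key);
    if (value == null || value.isEmpty()) {
      throw new IllegalArgumentException("Missing required property: " + key);
    }
    return value;
  }
}
```

A caller would pass a `FileReader` (or any `Reader`) over the props file to `ConnectionInfo.load` and get back a fully populated object, or an exception naming the missing key.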

We could release a 1.9 bridge version.  This version would be based on
1.8 and only include a new entry point. Base it on 1.8 in order to
allow a low risk upgrade for anyone currently using 1.8.  Once people
start using 1.9 they can have code that uses the old and new entry
point running at the same time.  In 2.0 we can drop the problematic
entry point.

Below is a commit to 1.8 where I was experimenting with a new entry point.

https://github.com/keith-turner/accumulo/commit/1c07fa62e9c57bde7e60907595d50f898d03c9d5

This new API would need review; it's rough and there are some things I
don't like about it.  Just sharing for discussion of general concept,
not advocating for this specific API.

On Mon, Dec 4, 2017 at 6:27 PM, Dave Marion <[hidden email]> wrote:

> There is no reason that you can't mark the offending API methods as deprecated in a 1.8.x release, then immediately branch off of that to create a 2.0 and remove the method. Alternatively, we could decide to forego the semver rules for a specific release and make sure to point it out in the release notes.
>
> -----Original Message-----
> From: Josh Elser [mailto:[hidden email]]
> Sent: Monday, December 4, 2017 6:19 PM
> To: [hidden email]
> Subject: Re: [DISCUSS] Hadoop3 support target?
>
> Also, just to be clear for everyone else:
>
> This means that we have *no roadmap* at all for Hadoop 3 support because Accumulo 2.0 is in a state of languish.
>
> This is a severe enough problem to me that I would consider breaking API compatibility and fixing the API leak in 1.7/1.8. I'm curious what people other than Christopher think (assuming from his comments/JIRA work that he disagrees with me).
>
> On 12/4/17 6:12 PM, Christopher wrote:
>> Agreed.
>>
>> On Mon, Dec 4, 2017 at 6:01 PM Josh Elser <[hidden email]> wrote:
>>
>>> Ah, I'm seeing now -- didn't check my inbox appropriately.
>>>
>>> I think the fact that code that we don't own has somehow been allowed
>>> to be public API is the smell. That's something that needs to be
>>> rectified sooner than later. By that measure, it can *only* land on
>>> Accumulo 2.0 (which is going to be a major issue for the project).
>>>
>>> On 12/4/17 5:58 PM, Josh Elser wrote:
>>>> Sorry, I don't follow. Why do you think 4611/4753 is a show-stopper?
>>>> Cuz, uh... I made it work already :)
>>>>
>>>> Thanks for the JIRA cleanup. Forgot about that one.
>>>>
>>>> On 12/4/17 5:55 PM, Christopher wrote:
>>>>> I don't think we can support it with 1.8 or earlier, because of
>>>>> some serious incompatibilities (namely, ACCUMULO-4611/4753) I think
>>>>> people are still patching 1.7, so I don't think we've "officially"
>>>>> EOL'd it.
>>>>> I think 2.0 could require Hadoop 3, if Hadoop 3 is sufficiently stable.
>>>>>
>>>>> On Mon, Dec 4, 2017 at 1:14 PM Josh Elser <[hidden email]> wrote:
>>>>>
>>>>>> What branch do we want to consider Hadoop3 support?
>>>>>>
>>>>>> There is a 3.0.0-beta1 release that's been out for a while, and
>>>>>> Hadoop PMC has already done a 3.0.0 RC0. I think it's the right
>>>>>> time to start considering this.
>>>>>>
>>>>>> In my poking so far, I've filed ACCUMULO-4753 which I'm working
>>>>>> through now. This does raise the question: where do we want to say
>>>>>> we support Hadoop3? 1.8 or 2.0? (have we "officially" deprecated
>>>>>> 1.7?)
>>>>>>
>>>>>> - Josh
>>>>>>
>>>>>> https://issues.apache.org/jira/browse/ACCUMULO-4753
>>>>>>
>>>>>
>>>
>>
>

Re: [DISCUSS] Hadoop3 support target?

Josh Elser-2
In reply to this post by Christopher Tubbs-2
Sorry if I misinterpreted your actions/comments. I was led to believe
that you didn't consider this a problem that we should tackle prior to
2.0 (setting fixVersion=2.0, citing "unable to do this prior to 2.0", etc.).

On 12/4/17 10:41 PM, Christopher wrote:

> I'm not certain what I'm supposed to be in disagreement with. I think
> you've presented a fair assessment of the situation, and I agree with the
> severity of the issue.
>
> The only other path I can see would be to not have a "bridge" version at
> all and instead require upgrading Accumulo simultaneously with Hadoop. I
> kind of like that option, but I don't think it's realistic for our
> audiences, as it doesn't allow them to manage their upgrade risks, so the
> shaded "bridge" version seems like the better option.
>
> On Mon, Dec 4, 2017 at 6:19 PM Josh Elser<[hidden email]>  wrote:
>>
>> This is a severe enough problem to me that I would consider breaking API
>> compatibility and fixing the API leak in 1.7/1.8. I'm curious what
>> people other than Christopher think (assuming from his comments/JIRA
>> work that he disagrees with me).

Re: [DISCUSS] Hadoop3 support target?

Josh Elser-2
In reply to this post by Keith Turner
Ok, a bridge version seems to be a general path forward. Generally this
would be...

* 1.8 gets relevant commons-config classes/methods deprecated
* 1.9 is 1.8 with those deprecation points removed
* 1.9 has commons-config shaded (maybe?)

IMO, it's critical that we remove the commons-config stuff from our
public API (a shame this was somehow let in to begin with).

I think it would be a good idea to shade our use of commons-config and
narrow the scope of ClientConfiguration to reading from a file.
Trying to support the breadth of what commons-configuration can do will
just get us into more trouble.
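
A minimal sketch of what a narrowed configuration class could look like, assuming its only job is to read a file and answer lookups. `FileClientConfig` is a hypothetical name, and `java.util.Properties` stands in for whatever (possibly shaded) library does the parsing; nothing here is the real ClientConfiguration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical narrowed client configuration: it can be read from a file and queried, nothing more.
final class FileClientConfig {
  // Implementation detail; this could be a shaded third-party class without affecting callers,
  // because no method signature mentions it.
  private final Properties backing;

  private FileClientConfig(Properties backing) {
    this.backing = backing;
  }

  // The single supported way to build a configuration in this sketch.
  static FileClientConfig fromFile(Path file) {
    Properties props = new Properties();
    try (Reader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
      props.load(reader);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return new FileClientConfig(props);
  }

  // Returns the configured value, or defaultValue when the key is absent.
  String get(String key, String defaultValue) {
    return backing.getProperty(key, defaultValue);
  }
}
```

Because the public surface is only `fromFile` and `get`, the parsing library never appears in a signature and can change between releases without breaking callers.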

On 12/5/17 12:18 PM, Keith Turner wrote:

> If we are going to deprecate, then it would be nice to have a
> replacement.  One thing that has irked me about the current Accumulo
> entry point is that one can not specify everything needed to connect
> to in a single props file.  Specifically, credentials can not be
> specified.  It would be really nice to have a new entry point that
> allows this.
>
> We could release a 1.9 bridge version.  This version would be based on
> 1.8 and only include a new entry point. Base it on 1.8 in order to
> allow a low risk upgrade for anyone currently using 1.8.  Once people
> start using 1.9 they can have code that uses the old and new entry
> point running at the same time.  In 2.0 we can drop the problematic
> entry point.
>
> Below is a commit to 1.8 where I was experimenting with a new entry point.
>
> https://github.com/keith-turner/accumulo/commit/1c07fa62e9c57bde7e60907595d50f898d03c9d5
>
> This new API would need review, its rough and there are some things I
> don't like about it.  Just sharing for discussion of general concept,
> not advocating for this specific API.
>
> On Mon, Dec 4, 2017 at 6:27 PM, Dave Marion <[hidden email]> wrote:
>> There is no reason that you can't mark the offending API methods as deprecated in a 1.8.x release, then immediately branch off of that to create a 2.0 and remove the method. Alternatively, we could decide to forego the semver rules for a specific release and make sure to point it out in the release notes.
>>
>> -----Original Message-----
>> From: Josh Elser [mailto:[hidden email]]
>> Sent: Monday, December 4, 2017 6:19 PM
>> To: [hidden email]
>> Subject: Re: [DISCUSS] Hadoop3 support target?
>>
>> Also, just to be clear for everyone else:
>>
>> This means that we have *no roadmap* at all for Hadoop 3 support because Accumulo 2.0 is in a state of languish.
>>
>> This is a severe enough problem to me that I would consider breaking API compatibility and fixing the API leak in 1.7/1.8. I'm curious what people other than Christopher think (assuming from his comments/JIRA work that he disagrees with me).
>>
>> On 12/4/17 6:12 PM, Christopher wrote:
>>> Agreed.
>>>
>>> On Mon, Dec 4, 2017 at 6:01 PM Josh Elser <[hidden email]> wrote:
>>>
>>>> Ah, I'm seeing now -- didn't check my inbox appropriately.
>>>>
>>>> I think the fact that code that we don't own has somehow been allowed
>>>> to be public API is the smell. That's something that needs to be
>>>> rectified sooner than later. By that measure, it can *only* land on
>>>> Accumulo 2.0 (which is going to be a major issue for the project).
>>>>
>>>> On 12/4/17 5:58 PM, Josh Elser wrote:
>>>>> Sorry, I don't follow. Why do you think 4611/4753 is a show-stopper?
>>>>> Cuz, uh... I made it work already :)
>>>>>
>>>>> Thanks for the JIRA cleanup. Forgot about that one.
>>>>>
>>>>> On 12/4/17 5:55 PM, Christopher wrote:
>>>>>> I don't think we can support it with 1.8 or earlier, because of
>>>>>> some serious incompatibilities (namely, ACCUMULO-4611/4753) I think
>>>>>> people are still patching 1.7, so I don't think we've "officially"
>>>>>> EOL'd it.
>>>>>> I think 2.0 could require Hadoop 3, if Hadoop 3 is sufficiently stable.
>>>>>>
>>>>>> On Mon, Dec 4, 2017 at 1:14 PM Josh Elser <[hidden email]> wrote:
>>>>>>
>>>>>>> What branch do we want to consider Hadoop3 support?
>>>>>>>
>>>>>>> There is a 3.0.0-beta1 release that's been out for a while, and
>>>>>>> Hadoop PMC has already done a 3.0.0 RC0. I think it's the right
>>>>>>> time to start considering this.
>>>>>>>
>>>>>>> In my poking so far, I've filed ACCUMULO-4753 which I'm working
>>>>>>> through now. This does raise the question: where do we want to say
>>>>>>> we support Hadoop3? 1.8 or 2.0? (have we "officially" deprecated
>>>>>>> 1.7?)
>>>>>>>
>>>>>>> - Josh
>>>>>>>
>>>>>>> https://issues.apache.org/jira/browse/ACCUMULO-4753
>>>>>>>
>>>>>>
>>>>
>>>
>>

Re: [DISCUSS] Hadoop3 support target?

Keith Turner
I was thinking of a slightly different path forward.

 * Add new entry point and deprecate clientconfig in 1.9
 * Branch 1.9 off 1.8
 * Stop releasing 1.8.x in favor of 1.9.x (they are the same except
   for the new API)
 * Release 1.9 ASAP
 * Drop clientconfig in 2.0.0
 * Release 2.0.0 early next year... maybe target March

On Tue, Dec 5, 2017 at 12:51 PM, Josh Elser <[hidden email]> wrote:

> Ok, a bridge version seems to be a general path forward. Generally this
> would be...
>
> * 1.8 gets relevant commons-config classes/methods deprecated
> * 1.9 is 1.8 with those deprecation points removed
> * 1.9 has commons-config shaded (maybe?)
>
> IMO, it's critical that we remove the commons-config stuff from our public
> API (shame this somehow was let in to begin).
>
> I think shading our use of commons-config would be a good idea and lessen
> our ClientConfiguration scope to being able to read from a file. Trying to
> support the breadth of what commons-configuration can do will just get us
> into more trouble.
>
>
> On 12/5/17 12:18 PM, Keith Turner wrote:
>>
>> If we are going to deprecate, then it would be nice to have a
>> replacement.  One thing that has irked me about the current Accumulo
>> entry point is that one can not specify everything needed to connect
>> to in a single props file.  Specifically, credentials can not be
>> specified.  It would be really nice to have a new entry point that
>> allows this.
>>
>> We could release a 1.9 bridge version.  This version would be based on
>> 1.8 and only include a new entry point. Base it on 1.8 in order to
>> allow a low risk upgrade for anyone currently using 1.8.  Once people
>> start using 1.9 they can have code that uses the old and new entry
>> point running at the same time.  In 2.0 we can drop the problematic
>> entry point.
>>
>> Below is a commit to 1.8 where I was experimenting with a new entry point.
>>
>>
>> https://github.com/keith-turner/accumulo/commit/1c07fa62e9c57bde7e60907595d50f898d03c9d5
>>
>> This new API would need review, its rough and there are some things I
>> don't like about it.  Just sharing for discussion of general concept,
>> not advocating for this specific API.
>>
>> On Mon, Dec 4, 2017 at 6:27 PM, Dave Marion <[hidden email]> wrote:
>>>
>>> There is no reason that you can't mark the offending API methods as
>>> deprecated in a 1.8.x release, then immediately branch off of that to create
>>> a 2.0 and remove the method. Alternatively, we could decide to forego the
>>> semver rules for a specific release and make sure to point it out in the
>>> release notes.
>>>
>>> -----Original Message-----
>>> From: Josh Elser [mailto:[hidden email]]
>>> Sent: Monday, December 4, 2017 6:19 PM
>>> To: [hidden email]
>>> Subject: Re: [DISCUSS] Hadoop3 support target?
>>>
>>> Also, just to be clear for everyone else:
>>>
>>> This means that we have *no roadmap* at all for Hadoop 3 support because
>>> Accumulo 2.0 is in a state of languish.
>>>
>>> This is a severe enough problem to me that I would consider breaking API
>>> compatibility and fixing the API leak in 1.7/1.8. I'm curious what people
>>> other than Christopher think (assuming from his comments/JIRA work that he
>>> disagrees with me).
>>>
>>> On 12/4/17 6:12 PM, Christopher wrote:
>>>>
>>>> Agreed.
>>>>
>>>> On Mon, Dec 4, 2017 at 6:01 PM Josh Elser <[hidden email]> wrote:
>>>>
>>>>> Ah, I'm seeing now -- didn't check my inbox appropriately.
>>>>>
>>>>> I think the fact that code that we don't own has somehow been allowed
>>>>> to be public API is the smell. That's something that needs to be
>>>>> rectified sooner than later. By that measure, it can *only* land on
>>>>> Accumulo 2.0 (which is going to be a major issue for the project).
>>>>>
>>>>> On 12/4/17 5:58 PM, Josh Elser wrote:
>>>>>>
>>>>>> Sorry, I don't follow. Why do you think 4611/4753 is a show-stopper?
>>>>>> Cuz, uh... I made it work already :)
>>>>>>
>>>>>> Thanks for the JIRA cleanup. Forgot about that one.
>>>>>>
>>>>>> On 12/4/17 5:55 PM, Christopher wrote:
>>>>>>>
>>>>>>> I don't think we can support it with 1.8 or earlier, because of
>>>>>>> some serious incompatibilities (namely, ACCUMULO-4611/4753) I think
>>>>>>> people are still patching 1.7, so I don't think we've "officially"
>>>>>>> EOL'd it.
>>>>>>> I think 2.0 could require Hadoop 3, if Hadoop 3 is sufficiently
>>>>>>> stable.
>>>>>>>
>>>>>>> On Mon, Dec 4, 2017 at 1:14 PM Josh Elser <[hidden email]> wrote:
>>>>>>>
>>>>>>>> What branch do we want to consider Hadoop3 support?
>>>>>>>>
>>>>>>>> There is a 3.0.0-beta1 release that's been out for a while, and
>>>>>>>> Hadoop PMC has already done a 3.0.0 RC0. I think it's the right
>>>>>>>> time to start considering this.
>>>>>>>>
>>>>>>>> In my poking so far, I've filed ACCUMULO-4753 which I'm working
>>>>>>>> through now. This does raise the question: where do we want to say
>>>>>>>> we support Hadoop3? 1.8 or 2.0? (have we "officially" deprecated
>>>>>>>> 1.7?)
>>>>>>>>
>>>>>>>> - Josh
>>>>>>>>
>>>>>>>> https://issues.apache.org/jira/browse/ACCUMULO-4753
>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>

Re: [DISCUSS] Hadoop3 support target?

Josh Elser-2
Interesting. What makes you want to deprecate ClientConfig entirely?

I'd be worried about removing it without sufficient thought given to a
replacement. It would be a bit "churn-y" to introduce yet another way that
clients have to connect (since it was introduced in 1.6-ish?). Working
around the ClientConfig changes was irritating for the downstream
integrations (Hive, most notably).

On 12/5/17 1:13 PM, Keith Turner wrote:

> I was thinking of a slightly different path forward.
>
>   * Add new entry point and deprecate clientconfig in 1.9
>   * Branch 1.9 off 1.8
>   * Stop releasing 1.8.x in favor of 1.9.x (they are the same except
> for new API)
>   * Release 1.9 ASAP
>   * Drop clientconfig in 2.0.0
>   * Release 2.0.0 early next year... maybe target March
>
> On Tue, Dec 5, 2017 at 12:51 PM, Josh Elser <[hidden email]> wrote:
>> Ok, a bridge version seems to be a general path forward. Generally this
>> would be...
>>
>> * 1.8 gets relevant commons-config classes/methods deprecated
>> * 1.9 is 1.8 with those deprecation points removed
>> * 1.9 has commons-config shaded (maybe?)
>>
>> IMO, it's critical that we remove the commons-config stuff from our public
>> API (shame this somehow was let in to begin).
>>
>> I think shading our use of commons-config would be a good idea and lessen
>> our ClientConfiguration scope to being able to read from a file. Trying to
>> support the breadth of what commons-configuration can do will just get us
>> into more trouble.
>>
>>
>> On 12/5/17 12:18 PM, Keith Turner wrote:
>>>
>>> If we are going to deprecate, then it would be nice to have a
>>> replacement.  One thing that has irked me about the current Accumulo
>>> entry point is that one can not specify everything needed to connect
>>> to in a single props file.  Specifically, credentials can not be
>>> specified.  It would be really nice to have a new entry point that
>>> allows this.
>>>
>>> We could release a 1.9 bridge version.  This version would be based on
>>> 1.8 and only include a new entry point. Base it on 1.8 in order to
>>> allow a low risk upgrade for anyone currently using 1.8.  Once people
>>> start using 1.9 they can have code that uses the old and new entry
>>> point running at the same time.  In 2.0 we can drop the problematic
>>> entry point.
>>>
>>> Below is a commit to 1.8 where I was experimenting with a new entry point.
>>>
>>>
>>> https://github.com/keith-turner/accumulo/commit/1c07fa62e9c57bde7e60907595d50f898d03c9d5
>>>
>>> This new API would need review, its rough and there are some things I
>>> don't like about it.  Just sharing for discussion of general concept,
>>> not advocating for this specific API.
>>>
>>> On Mon, Dec 4, 2017 at 6:27 PM, Dave Marion <[hidden email]> wrote:
>>>>
>>>> There is no reason that you can't mark the offending API methods as
>>>> deprecated in a 1.8.x release, then immediately branch off of that to create
>>>> a 2.0 and remove the method. Alternatively, we could decide to forego the
>>>> semver rules for a specific release and make sure to point it out in the
>>>> release notes.
>>>>
>>>> -----Original Message-----
>>>> From: Josh Elser [mailto:[hidden email]]
>>>> Sent: Monday, December 4, 2017 6:19 PM
>>>> To: [hidden email]
>>>> Subject: Re: [DISCUSS] Hadoop3 support target?
>>>>
>>>> Also, just to be clear for everyone else:
>>>>
>>>> This means that we have *no roadmap* at all for Hadoop 3 support because
>>>> Accumulo 2.0 is in a state of languish.
>>>>
>>>> This is a severe enough problem to me that I would consider breaking API
>>>> compatibility and fixing the API leak in 1.7/1.8. I'm curious what people
>>>> other than Christopher think (assuming from his comments/JIRA work that he
>>>> disagrees with me).
>>>>
>>>> On 12/4/17 6:12 PM, Christopher wrote:
>>>>>
>>>>> Agreed.
>>>>>
>>>>> On Mon, Dec 4, 2017 at 6:01 PM Josh Elser <[hidden email]> wrote:
>>>>>
>>>>>> Ah, I'm seeing now -- didn't check my inbox appropriately.
>>>>>>
>>>>>> I think the fact that code that we don't own has somehow been allowed
>>>>>> to be public API is the smell. That's something that needs to be
>>>>>> rectified sooner than later. By that measure, it can *only* land on
>>>>>> Accumulo 2.0 (which is going to be a major issue for the project).
>>>>>>
>>>>>> On 12/4/17 5:58 PM, Josh Elser wrote:
>>>>>>>
>>>>>>> Sorry, I don't follow. Why do you think 4611/4753 is a show-stopper?
>>>>>>> Cuz, uh... I made it work already :)
>>>>>>>
>>>>>>> Thanks for the JIRA cleanup. Forgot about that one.
>>>>>>>
>>>>>>> On 12/4/17 5:55 PM, Christopher wrote:
>>>>>>>>
>>>>>>>> I don't think we can support it with 1.8 or earlier, because of
>>>>>>>> some serious incompatibilities (namely, ACCUMULO-4611/4753) I think
>>>>>>>> people are still patching 1.7, so I don't think we've "officially"
>>>>>>>> EOL'd it.
>>>>>>>> I think 2.0 could require Hadoop 3, if Hadoop 3 is sufficiently
>>>>>>>> stable.
>>>>>>>>
>>>>>>>> On Mon, Dec 4, 2017 at 1:14 PM Josh Elser <[hidden email]> wrote:
>>>>>>>>
>>>>>>>>> What branch do we want to consider Hadoop3 support?
>>>>>>>>>
>>>>>>>>> There is a 3.0.0-beta1 release that's been out for a while, and
>>>>>>>>> Hadoop PMC has already done a 3.0.0 RC0. I think it's the right
>>>>>>>>> time to start considering this.
>>>>>>>>>
>>>>>>>>> In my poking so far, I've filed ACCUMULO-4753 which I'm working
>>>>>>>>> through now. This does raise the question: where do we want to say
>>>>>>>>> we support Hadoop3? 1.8 or 2.0? (have we "officially" deprecated
>>>>>>>>> 1.7?)
>>>>>>>>>
>>>>>>>>> - Josh
>>>>>>>>>
>>>>>>>>> https://issues.apache.org/jira/browse/ACCUMULO-4753
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>

Re: [DISCUSS] Hadoop3 support target?

Keith Turner
On Tue, Dec 5, 2017 at 2:53 PM, Josh Elser <[hidden email]> wrote:
> Interesting. What makes you want to deprecate ClientConfig entirely?
>
> I'd be worried about removing without sufficient thought of replacement
> around. It would be a bit "churn-y" to introduce yet another way that
> clients have to connect (since it was introduced in 1.6-ish?). Working
> around the ClientConfig changes was irritating for the downstream
> integrations (Hive, most notably).

OK, maybe that's a bad idea; I'm not looking to cause pain.  Here were
some of my goals.

 * Remove commons config from the API completely via a deprecation cycle.
 * Introduce an API that supports putting all props needed to connect to
Accumulo in a single props file.

I suppose if we want to keep the ClientConfig class in the API, then there
is no way to remove commons config via a deprecation cycle?  We can't
deprecate the extension of commons config; all we can do is just drop
it at some point.
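
As a rough illustration of the second goal, a single source of properties
could carry everything a client needs, credentials included.  This is only a
sketch using java.util.Properties; the class name ConnectionProps and the keys
instance.name, instance.zookeepers, user and password are hypothetical and are
not taken from any Accumulo release or from the linked commit.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical holder for every property needed to connect, credentials included.
public final class ConnectionProps {
  private final Properties props = new Properties();

  private ConnectionProps() {}

  // Load all connection properties from one source (a file reader in practice).
  public static ConnectionProps load(Reader in) throws IOException {
    ConnectionProps cp = new ConnectionProps();
    cp.props.load(in);
    return cp;
  }

  // Return a required property, failing loudly if it is missing.
  public String require(String key) {
    String value = props.getProperty(key);
    if (value == null) {
      throw new IllegalArgumentException("missing property: " + key);
    }
    return value;
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical keys; a real entry point would define its own names.
    String text = "instance.name=test\n"
        + "instance.zookeepers=zk1:2181\n"
        + "user=root\n"
        + "password=secret\n";
    ConnectionProps cp = load(new StringReader(text));
    System.out.println(cp.require("instance.name") + " @ " + cp.require("instance.zookeepers"));
  }
}
```

Nothing here depends on commons config, so a class shaped like this could sit
in a public API without leaking a third-party type.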

>
>
> On 12/5/17 1:13 PM, Keith Turner wrote:
>>
>> I was thinking of a slightly different path forward.
>>
>>   * Add new entry point and deprecate clientconfig in 1.9
>>   * Branch 1.9 off 1.8
>>   * Stop releasing 1.8.x in favor of 1.9.x (they are the same except
>> for new API)
>>   * Release 1.9 ASAP
>>   * Drop clientconfig in 2.0.0
>>   * Release 2.0.0 early next year... maybe target March
>>
>> On Tue, Dec 5, 2017 at 12:51 PM, Josh Elser <[hidden email]> wrote:
>>>
>>> Ok, a bridge version seems to be a general path forward. Generally this
>>> would be...
>>>
>>> * 1.8 gets relevant commons-config classes/methods deprecated
>>> * 1.9 is 1.8 with those deprecation points removed
>>> * 1.9 has commons-config shaded (maybe?)
>>>
>>> IMO, it's critical that we remove the commons-config stuff from our
>>> public
>>> API (shame this somehow was let in to begin).
>>>
>>> I think shading our use of commons-config would be a good idea and lessen
>>> our ClientConfiguration scope to being able to read from a file. Trying
>>> to
>>> support the breadth of what commons-configuration can do will just get us
>>> into more trouble.
>>>
>>>
>>> On 12/5/17 12:18 PM, Keith Turner wrote:
>>>>
>>>>
>>>> If we are going to deprecate, then it would be nice to have a
>>>> replacement.  One thing that has irked me about the current Accumulo
>>>> entry point is that one can not specify everything needed to connect
>>>> to in a single props file.  Specifically, credentials can not be
>>>> specified.  It would be really nice to have a new entry point that
>>>> allows this.
>>>>
>>>> We could release a 1.9 bridge version.  This version would be based on
>>>> 1.8 and only include a new entry point. Base it on 1.8 in order to
>>>> allow a low risk upgrade for anyone currently using 1.8.  Once people
>>>> start using 1.9 they can have code that uses the old and new entry
>>>> point running at the same time.  In 2.0 we can drop the problematic
>>>> entry point.
>>>>
>>>> Below is a commit to 1.8 where I was experimenting with a new entry
>>>> point.
>>>>
>>>>
>>>>
>>>> https://github.com/keith-turner/accumulo/commit/1c07fa62e9c57bde7e60907595d50f898d03c9d5
>>>>
>>>> This new API would need review, its rough and there are some things I
>>>> don't like about it.  Just sharing for discussion of general concept,
>>>> not advocating for this specific API.
>>>>
>>>> On Mon, Dec 4, 2017 at 6:27 PM, Dave Marion <[hidden email]> wrote:
>>>>>
>>>>>
>>>>> There is no reason that you can't mark the offending API methods as
>>>>> deprecated in a 1.8.x release, then immediately branch off of that to
>>>>> create
>>>>> a 2.0 and remove the method. Alternatively, we could decide to forego
>>>>> the
>>>>> semver rules for a specific release and make sure to point it out in
>>>>> the
>>>>> release notes.
>>>>>
>>>>> -----Original Message-----
>>>>> From: Josh Elser [mailto:[hidden email]]
>>>>> Sent: Monday, December 4, 2017 6:19 PM
>>>>> To: [hidden email]
>>>>> Subject: Re: [DISCUSS] Hadoop3 support target?
>>>>>
>>>>> Also, just to be clear for everyone else:
>>>>>
>>>>> This means that we have *no roadmap* at all for Hadoop 3 support
>>>>> because
>>>>> Accumulo 2.0 is in a state of languish.
>>>>>
>>>>> This is a severe enough problem to me that I would consider breaking
>>>>> API
>>>>> compatibility and fixing the API leak in 1.7/1.8. I'm curious what
>>>>> people
>>>>> other than Christopher think (assuming from his comments/JIRA work that
>>>>> he
>>>>> disagrees with me).
>>>>>
>>>>> On 12/4/17 6:12 PM, Christopher wrote:
>>>>>>
>>>>>>
>>>>>> Agreed.
>>>>>>
>>>>>> On Mon, Dec 4, 2017 at 6:01 PM Josh Elser <[hidden email]> wrote:
>>>>>>
>>>>>>> Ah, I'm seeing now -- didn't check my inbox appropriately.
>>>>>>>
>>>>>>> I think the fact that code that we don't own has somehow been allowed
>>>>>>> to be public API is the smell. That's something that needs to be
>>>>>>> rectified sooner than later. By that measure, it can *only* land on
>>>>>>> Accumulo 2.0 (which is going to be a major issue for the project).
>>>>>>>
>>>>>>> On 12/4/17 5:58 PM, Josh Elser wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> Sorry, I don't follow. Why do you think 4611/4753 is a show-stopper?
>>>>>>>> Cuz, uh... I made it work already :)
>>>>>>>>
>>>>>>>> Thanks for the JIRA cleanup. Forgot about that one.
>>>>>>>>
>>>>>>>> On 12/4/17 5:55 PM, Christopher wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I don't think we can support it with 1.8 or earlier, because of
>>>>>>>>> some serious incompatibilities (namely, ACCUMULO-4611/4753) I think
>>>>>>>>> people are still patching 1.7, so I don't think we've "officially"
>>>>>>>>> EOL'd it.
>>>>>>>>> I think 2.0 could require Hadoop 3, if Hadoop 3 is sufficiently
>>>>>>>>> stable.
>>>>>>>>>
>>>>>>>>> On Mon, Dec 4, 2017 at 1:14 PM Josh Elser <[hidden email]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> What branch do we want to consider Hadoop3 support?
>>>>>>>>>>
>>>>>>>>>> There is a 3.0.0-beta1 release that's been out for a while, and
>>>>>>>>>> Hadoop PMC has already done a 3.0.0 RC0. I think it's the right
>>>>>>>>>> time to start considering this.
>>>>>>>>>>
>>>>>>>>>> In my poking so far, I've filed ACCUMULO-4753 which I'm working
>>>>>>>>>> through now. This does raise the question: where do we want to say
>>>>>>>>>> we support Hadoop3? 1.8 or 2.0? (have we "officially" deprecated
>>>>>>>>>> 1.7?)
>>>>>>>>>>
>>>>>>>>>> - Josh
>>>>>>>>>>
>>>>>>>>>> https://issues.apache.org/jira/browse/ACCUMULO-4753
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>
>

Re: [DISCUSS] Hadoop3 support target?

Josh Elser-2


On 12/5/17 3:28 PM, Keith Turner wrote:

> On Tue, Dec 5, 2017 at 2:53 PM, Josh Elser<[hidden email]>  wrote:
>> Interesting. What makes you want to deprecate ClientConfig entirely?
>>
>> I'd be worried about removing without sufficient thought of replacement
>> around. It would be a bit "churn-y" to introduce yet another way that
>> clients have to connect (since it was introduced in 1.6-ish?). Working
>> around the ClientConfig changes was irritating for the downstream
>> integrations (Hive, most notably).
> Ok maybe thats a bad idea, not looking to cause pain.  Here were some
> of my goals.
>
>   * Remove commons config from API completely via deprecation cycle.
>   * Introduce API that supports putting all props needed to connect to
> Accumulo in an API.
>
> I suppose if we want to keep ClientConfig class in API, then there is
> no way to remove commons config via a deprecation cycle??  We can't
> deprecate the extension of commons config, all we can do is just drop
> it at some point.
>

My line of thinking is that the majority of the time, we're creating a
ClientConfiguration by one of:

* ClientConfiguration#loadDefault()
* new ClientConfiguration(String)
* new ClientConfiguration(File)

Granted, we also inherit/expose a few other things (notably extending
CompositeConfiguration and throwing ConfigurationException). I would be
comfortable with dropping those w/o deprecation. I have not seen
evidence from anyone that they are widely in use by folks (although I've
not explicitly asked, either).
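
To make that concrete, here is a minimal sketch of a configuration class that
keeps the String and File creation paths but holds a delegate instead of
extending a third-party class.  The class name NarrowClientConfiguration is
hypothetical, java.util.Properties only stands in for a shaded commons-config
delegate, and throwing IOException in place of ConfigurationException is an
assumption about what a narrowed API might do.

```java
import java.io.File;
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Properties;

// Hypothetical replacement: holds a delegate instead of extending a third-party class.
public final class NarrowClientConfiguration {
  // java.util.Properties stands in for a shaded commons-config delegate.
  private final Properties delegate = new Properties();

  public NarrowClientConfiguration(String path) throws IOException {
    this(new File(path));
  }

  public NarrowClientConfiguration(File file) throws IOException {
    try (Reader r = Files.newBufferedReader(file.toPath(), StandardCharsets.UTF_8)) {
      delegate.load(r);
    }
  }

  // The only read access exposed; returns null when the key is absent.
  public String get(String key) {
    return delegate.getProperty(key);
  }

  public static void main(String[] args) throws IOException {
    File f = File.createTempFile("client", ".conf");
    f.deleteOnExit();
    Files.write(f.toPath(), "instance.name=test\n".getBytes(StandardCharsets.UTF_8));
    System.out.println(new NarrowClientConfiguration(f).get("instance.name"));
  }
}
```

Because the class no longer is-a commons-config type, none of the inherited
methods or exceptions remain part of the public surface.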

Re: [DISCUSS] Hadoop3 support target?

Keith Turner
If we do the following.

 * Drop ZooKeeperInstance.ZooKeeperInstance(Configuration config) method.
 * Drop extends from ClientConfig
 * Add a method ZooKeeperInstance.ZooKeeperInstance(ClientConfig config)

Then this will not be binary compatible, so it will still be painful
in many cases.   It may be source compatible.

For example the following will be source (but not binary) compatible.

  ClientConfiguration cc = new ClientConfiguration(file);
  // When compiled against an older version of Accumulo, this binds to the
  // method with the commons config signature.
  // When recompiled, it binds to the ClientConfiguration version of the method.
  ZooKeeperInstance zki = new ZooKeeperInstance(cc);

The following would not be source or binary compatible.

  Configuration cc = new ClientConfiguration(file);
  ZooKeeperInstance zki = new ZooKeeperInstance(cc);
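
The binary side of this can be approximated with reflection, since the JVM
links a call site by the exact parameter types recorded at compile time.  The
sketch below uses stand-in types declared inside a hypothetical CompatSketch
class (they are not the real Accumulo or commons-config classes) and models
the "after" state in which only the ClientConfiguration constructor exists.

```java
public class CompatSketch {
  // Stand-in for the commons-configuration type (not the real interface).
  interface Configuration {}

  // Stand-in for ClientConfiguration after "extends" has been dropped.
  static class ClientConfiguration {}

  // Stand-in for ZooKeeperInstance in the "after" state.
  static class ZooKeeperInstance {
    // Only the ClientConfiguration signature remains.
    ZooKeeperInstance(ClientConfiguration cc) {}
  }

  // True if ZooKeeperInstance declares a constructor with exactly this parameter type.
  static boolean hasConstructor(Class<?> paramType) {
    try {
      ZooKeeperInstance.class.getDeclaredConstructor(paramType);
      return true;
    } catch (NoSuchMethodException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Code recompiled against the new jar asks for this signature.
    System.out.println("new signature present: " + hasConstructor(ClientConfiguration.class));
    // Code compiled against the old jar still asks for this one.
    System.out.println("old signature present: " + hasConstructor(Configuration.class));
  }
}
```

A caller compiled against the old jar keeps asking for the Configuration
signature, so it would fail at link time even when its source recompiles
unchanged.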


On Tue, Dec 5, 2017 at 3:40 PM, Josh Elser <[hidden email]> wrote:

>
>
> On 12/5/17 3:28 PM, Keith Turner wrote:
>>
>> On Tue, Dec 5, 2017 at 2:53 PM, Josh Elser<[hidden email]>  wrote:
>>>
>>> Interesting. What makes you want to deprecate ClientConfig entirely?
>>>
>>> I'd be worried about removing without sufficient thought of replacement
>>> around. It would be a bit "churn-y" to introduce yet another way that
>>> clients have to connect (since it was introduced in 1.6-ish?). Working
>>> around the ClientConfig changes was irritating for the downstream
>>> integrations (Hive, most notably).
>>
>> Ok maybe thats a bad idea, not looking to cause pain.  Here were some
>> of my goals.
>>
>>   * Remove commons config from API completely via deprecation cycle.
>>   * Introduce API that supports putting all props needed to connect to
>> Accumulo in an API.
>>
>> I suppose if we want to keep ClientConfig class in API, then there is
>> no way to remove commons config via a deprecation cycle??  We can't
>> deprecate the extension of commons config, all we can do is just drop
>> it at some point.
>>
>
> My line of thinking is that the majority of the time, we're creating a
> ClientConfiguration by one of:
>
> * ClientConfiguration#loadDefault()
> * new ClientConfiguration(String)
> * new ClientConfiguration(File)
>
> Granted, we also inherit/expose a few other things (notably extending
> CompositeConfiguration and throwing ConfigurationException). I would be
> comfortable with dropping those w/o deprecation. I have not seen evidence
> from anyone that they are widely in use by folks (although I've not
> explicitly asked, either).

Re: [DISCUSS] Hadoop3 support target?

Christopher Tubbs-2
In reply to this post by Josh Elser-2
On Tue, Dec 5, 2017 at 3:40 PM Josh Elser <[hidden email]> wrote:

>
>
> On 12/5/17 3:28 PM, Keith Turner wrote:
> > On Tue, Dec 5, 2017 at 2:53 PM, Josh Elser<[hidden email]>  wrote:
> >> Interesting. What makes you want to deprecate ClientConfig entirely?
> >>
> >> I'd be worried about removing without sufficient thought of replacement
> >> around. It would be a bit "churn-y" to introduce yet another way that
> >> clients have to connect (since it was introduced in 1.6-ish?). Working
> >> around the ClientConfig changes was irritating for the downstream
> >> integrations (Hive, most notably).
> > Ok maybe thats a bad idea, not looking to cause pain.  Here were some
> > of my goals.
> >
> >   * Remove commons config from API completely via deprecation cycle.
> >   * Introduce API that supports putting all props needed to connect to
> > Accumulo in an API.
> >
> > I suppose if we want to keep ClientConfig class in API, then there is
> > no way to remove commons config via a deprecation cycle??  We can't
> > deprecate the extension of commons config, all we can do is just drop
> > it at some point.
> >
>
> My line of thinking is that the majority of the time, we're creating a
> ClientConfiguration by one of:
>
> * ClientConfiguration#loadDefault()
> * new ClientConfiguration(String)
> * new ClientConfiguration(File)
>
> Granted, we also inherit/expose a few other things (notably extending
> CompositeConfiguration and throwing ConfigurationException). I would be
> comfortable with dropping those w/o deprecation. I have not seen
> evidence from anyone that they are widely in use by folks (although I've
> not explicitly asked, either).
>

I would also be comfortable with that. I think the main API that we need to
deal with is the ZooKeeperInstance(Configuration) constructor. It should be
ZooKeeperInstance(ClientConfiguration) instead. If we assume that the only
way this method was used was with ClientConfiguration objects, then simply
changing the parameter type will continue to be source compatible. I don't
know if we want to assume that. Either way, it won't be binary compatible.

If we're not concerned with binary compatibility, and are willing to make
that assumption about this constructor only (or usually) being used with
ClientConfiguration, then we can just change the method signature now,
without deprecation cycles or a new alternative method.

Re: [DISCUSS] Hadoop3 support target?

Josh Elser-2
In reply to this post by Keith Turner
Another potential suggestion I forgot about: we try to just move to the
Hadoop shaded artifacts. This would obviate the need to do more, but
I have no idea how "battle-tested" those artifacts are.

On 12/5/17 3:52 PM, Keith Turner wrote:

> If we do the following.
>
>   * Drop ZooKeeperInstance.ZooKeeperInstance(Configuration config) method.
>   * Drop extends from ClientConfig
>   * Add a method ZooKeeperInstance.ZooKeeperInstance(ClientConfig config)
>
> Then this will not be binary compatible, so it will still be painful
> in many cases.   It may be source compatible.
>
> For example the following will be source (but not binary) compatible.
>
>    ClientConfiguration cc = new ClientConfiguration(file);
>    //when compiled against older version of Accumulo will bind to
> method with commons config signature
>    //when recompiled will bind to clientconfig version of method
>    ZooKeeperInstance zki = new ZooKeeperInstance(cc);
>
> The following would not be source or binary compatible.
>
>    Configuration cc = new ClientConfiguration(file);
>    ZooKeeperInstance zki = new ZooKeeperInstance(cc);
>
>
> On Tue, Dec 5, 2017 at 3:40 PM, Josh Elser <[hidden email]> wrote:
>>
>>
>> On 12/5/17 3:28 PM, Keith Turner wrote:
>>>
>>> On Tue, Dec 5, 2017 at 2:53 PM, Josh Elser<[hidden email]>  wrote:
>>>>
>>>> Interesting. What makes you want to deprecate ClientConfig entirely?
>>>>
>>>> I'd be worried about removing without sufficient thought of replacement
>>>> around. It would be a bit "churn-y" to introduce yet another way that
>>>> clients have to connect (since it was introduced in 1.6-ish?). Working
>>>> around the ClientConfig changes was irritating for the downstream
>>>> integrations (Hive, most notably).
>>>
>>> Ok maybe thats a bad idea, not looking to cause pain.  Here were some
>>> of my goals.
>>>
>>>    * Remove commons config from API completely via deprecation cycle.
>>>    * Introduce API that supports putting all props needed to connect to
>>> Accumulo in an API.
>>>
>>> I suppose if we want to keep ClientConfig class in API, then there is
>>> no way to remove commons config via a deprecation cycle??  We can't
>>> deprecate the extension of commons config, all we can do is just drop
>>> it at some point.
>>>
>>
>> My line of thinking is that the majority of the time, we're creating a
>> ClientConfiguration by one of:
>>
>> * ClientConfiguration#loadDefault()
>> * new ClientConfiguration(String)
>> * new ClientConfiguration(File)
>>
>> Granted, we also inherit/expose a few other things (notably extending
>> CompositeConfiguration and throwing ConfigurationException). I would be
>> comfortable with dropping those w/o deprecation. I have not seen evidence
>> from anyone that they are widely in use by folks (although I've not
>> explicitly asked, either).

Re: [DISCUSS] Hadoop3 support target?

Keith Turner
Another option, for the sake of discussion, is to have a Hadoop 3
version.  For example:

    <dependency>
      <groupId>org.apache.accumulo</groupId>
      <artifactId>accumulo-core</artifactId>
      <version>1.8.2-hadoop3</version>
    </dependency>

I think this is horrible in too many ways to enumerate, and I'm not
advocating for it.  Just trying to exhaustively think through the
solution space.


On Tue, Dec 5, 2017 at 3:58 PM, Josh Elser <[hidden email]> wrote:

> Another potential suggestion I forgot about: we try to just move to the
> Hadoop shaded artifacts. This would invalidate the need to do more, but I
> have no idea how "battle-tested" those artifacts are.
>
>
> On 12/5/17 3:52 PM, Keith Turner wrote:
>>
>> If we do the following.
>>
>>   * Drop ZooKeeperInstance.ZooKeeperInstance(Configuration config) method.
>>   * Drop extends from ClientConfig
>>   * Add a method ZooKeeperInstance.ZooKeeperInstance(ClientConfig config)
>>
>> Then this will not be binary compatible, so it will still be painful
>> in many cases.   It may be source compatible.
>>
>> For example the following will be source (but not binary) compatible.
>>
>>    ClientConfiguration cc = new ClientConfiguration(file);
>>    //when compiled against older version of Accumulo will bind to
>> method with commons config signature
>>    //when recompiled will bind to clientconfig version of method
>>    ZooKeeperInstance zki = new ZooKeeperInstance(cc);
>>
>> The following would not be source or binary compatible.
>>
>>    Configuration cc = new ClientConfiguration(file);
>>    ZooKeeperInstance zki = new ZooKeeperInstance(cc);
>>
>>
>> On Tue, Dec 5, 2017 at 3:40 PM, Josh Elser <[hidden email]> wrote:
>>>
>>>
>>>
>>> On 12/5/17 3:28 PM, Keith Turner wrote:
>>>>
>>>>
>>>> On Tue, Dec 5, 2017 at 2:53 PM, Josh Elser<[hidden email]>  wrote:
>>>>>
>>>>>
>>>>> Interesting. What makes you want to deprecate ClientConfig entirely?
>>>>>
>>>>> I'd be worried about removing without sufficient thought of replacement
>>>>> around. It would be a bit "churn-y" to introduce yet another way that
>>>>> clients have to connect (since it was introduced in 1.6-ish?). Working
>>>>> around the ClientConfig changes was irritating for the downstream
>>>>> integrations (Hive, most notably).
>>>>
>>>>
>>>> Ok maybe thats a bad idea, not looking to cause pain.  Here were some
>>>> of my goals.
>>>>
>>>>    * Remove commons config from API completely via deprecation cycle.
>>>>    * Introduce API that supports putting all props needed to connect to
>>>> Accumulo in an API.
>>>>
>>>> I suppose if we want to keep ClientConfig class in API, then there is
>>>> no way to remove commons config via a deprecation cycle??  We can't
>>>> deprecate the extension of commons config, all we can do is just drop
>>>> it at some point.
>>>>
>>>
>>> My line of thinking is that the majority of the time, we're creating a
>>> ClientConfiguration by one of:
>>>
>>> * ClientConfiguration#loadDefault()
>>> * new ClientConfiguration(String)
>>> * new ClientConfiguration(File)
>>>
>>> Granted, we also inherit/expose a few other things (notably extending
>>> CompositeConfiguration and throwing ConfigurationException). I would be
>>> comfortable with dropping those w/o deprecation. I have not seen evidence
>>> from anyone that they are widely in use by folks (although I've not
>>> explicitly asked, either).

Re: [DISCUSS] Hadoop3 support target?

Jorge Machado
In reply to this post by Keith Turner
This client config is a mess. Why can't we just do it like in HDFS, where the client either passes that info to the configuration object or we just read it from a resource? My idea is to get rid of this client.conf.

Jorge Machado
[hidden email]


On 05.12.2017 at 21:29, Keith Turner <[hidden email]> wrote:

> On Tue, Dec 5, 2017 at 2:53 PM, Josh Elser <[hidden email]> wrote:
>> Interesting. What makes you want to deprecate ClientConfig entirely?
>>
>> I'd be worried about removing without sufficient thought of replacement
>> around. It would be a bit "churn-y" to introduce yet another way that
>> clients have to connect (since it was introduced in 1.6-ish?). Working
>> around the ClientConfig changes was irritating for the downstream
>> integrations (Hive, most notably).
>
> Ok maybe thats a bad idea, not looking to cause pain.  Here were some
> of my goals.
>
> * Remove commons config from API completely via deprecation cycle.
> * Introduce API that supports putting all props needed to connect to
> Accumulo in an API.
>
> I suppose if we want to keep ClientConfig class in API, then there is
> no way to remove commons config via a deprecation cycle??  We can't
> deprecate the extension of commons config, all we can do is just drop
> it at some point.
