[DISCUSS] Any interest in separate client/server tarballs

Josh Elser
Hi,

$dayjob presented me with a request to break up the current tarball into
two: one suitable for "users" and another for the Accumulo services. The
ultimate goal is to make upgrade scenarios a bit easier by having
client- and server-centric packaging.

The "client" tarball would be something suitable for most users
providing the ability to do things like:

* Launch a Java app against Accumulo
* Launch a MapReduce job against Accumulo
* Launch the Accumulo shell

Essentially, the client tarball is just a pared down version of our
"current" tarball and the server-tarball is likely equivalent to our
"current" tarball (given that we have little code which would be
considered client-only).

Obviously, there are many ways to go about this. If there is buy-in from
other folks, adding some new assembly descriptors and making it a part
of the Maven build (perhaps, optionally generated) would be the easiest
in terms of maintenance. However, I don't want to push for that if it's
just going to be ignored by folks. I'll be creating something to support
this one way or another.

Any thoughts/opinions? Would this have any value to other folks?

- Josh

Re: [DISCUSS] Any interest in separate client/server tarballs

Christopher Tubbs-2
tl;dr: I would prefer not to add another tarball as part of our "official"
releases, but I'd be in favor of blog instructions, a script, or a build
profile, which users could read/execute/activate to create a client-centric
package.

I've long believed that supporting different downstream packaging scenarios
should be prioritized over upstream binary packaging. I have argued in
favor of removing our current tarball entirely, while supporting efforts to
enable downstream packaging by modularizing the server code, supporting a
client-API jar (future work), and decoupling code from launch scripts. I
think we should continue to do these kinds of improvements to support
different packaging scenarios downstream, but I'd prefer to avoid
additional "official" binary releases.

Rather than provide additional packages, I'd prefer to work with downstream
to make the source more "packagable" to suit the needs of these downstream
vendor/community packagers. One way we can do that here is by either
documenting what would be needed in a client-centric package, or by
providing a script or build profile to create it from source, so that your
$dayjob or any other downstream packager doesn't have to figure that out
from scratch.


Re: [DISCUSS] Any interest in separate client/server tarballs

Keith Turner
In reply to this post by Josh Elser
On Thu, Jan 4, 2018 at 7:16 PM, Josh Elser <[hidden email]> wrote:

> Hi,
>
> $dayjob presented me with a request to break up the current tarball into
> two: one suitable for "users" and another for the Accumulo services. The
> ultimate goal is to make upgrade scenarios a bit easier by having
> client- and server-centric packaging.
>
> The "client" tarball would be something suitable for most users providing
> the ability to do things like:
>
> * Launch a Java app against Accumulo
> * Launch a MapReduce job against Accumulo
> * Launch the Accumulo shell
>
> Essentially, the client tarball is just a pared down version of our
> "current" tarball and the server-tarball is likely equivalent to our
> "current" tarball (given that we have little code which would be considered
> client-only).
>
> Obviously, there are many ways to go about this. If there is buy-in from
> other folks, adding some new assembly descriptors and making it a part of
> the Maven build (perhaps, optionally generated) would be the easiest in
> terms of maintenance. However, I don't want to push for that if it's just
> going to be ignored by folks. I'll be creating something to support this one
> way or another.

Do you have anything to share?  I would be interested in reviewing this.

>
> Any thoughts/opinions? Would this have any value to other folks?

This is slightly unrelated, but it would be nice to lower the number
of dependencies for the client side code and possibly shade in
libthrift.

>
> - Josh

Re: [DISCUSS] Any interest in separate client/server tarballs

Josh Elser
On 1/5/18 9:55 AM, Keith Turner wrote:
>> Obviously, there are many ways to go about this. If there is buy-in from
>> other folks, adding some new assembly descriptors and making it a part of
>> the Maven build (perhaps, optionally generated) would be the easiest in
>> terms of maintenance. However, I don't want to push for that if it's just
>> going to be ignored by folks. I'll be creating something to support this one
>> way or another.
> Do you have anything to share?  I would be interested in reviewing this.

Nothing yet. My plan is to take the stock bin-tarball, split the files
up into two lists to make sure I have the separation correct (that
things actually work). Then, I can implement it however we want.
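A minimal sketch of that split, in Python. The path patterns and the name `split_file_list` are hypothetical placeholders, not an agreed-upon layout; the server list is simply every path, on the assumption that the server tarball stays equivalent to the current one.

```python
import fnmatch

# Hypothetical patterns: which paths count as "client" is an assumption
# made for illustration, not something decided in this thread.
CLIENT_PATTERNS = [
    "bin/accumulo",
    "conf/client.conf*",
    "lib/accumulo-core*.jar",
    "lib/libthrift*.jar",
]


def split_file_list(paths, client_patterns=CLIENT_PATTERNS):
    """Return (client, server) lists of tarball paths.

    The server list contains every path; the client list is the subset
    that matches at least one client pattern.
    """
    server = sorted(paths)
    client = [
        p for p in server
        if any(fnmatch.fnmatchcase(p, pat) for pat in client_patterns)
    ]
    return client, server
```

The input paths could come from `tarfile.open(...).getnames()` on the stock tarball, after stripping the top-level directory. Checking that the client list alone can run the shell would still have to be done by hand or by a separate test.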

>> Any thoughts/opinions? Would this have any value to other folks?
> This is slightly unrelated, but it would be nice to lower the number
> of dependencies for the client side code and possibly shade in
> libthrift.

Yup. Agreed.

Re: [DISCUSS] Any interest in separate client/server tarballs

Josh Elser
In reply to this post by Christopher Tubbs-2
I'd be worried about advertising something that we're not treating as
official, as it would languish (unless we create tests that can validate
the result for us).

Thanks for the input.


Re: [DISCUSS] Any interest in separate client/server tarballs

Keith Turner
In reply to this post by Christopher Tubbs-2
On Thu, Jan 4, 2018 at 7:43 PM, Christopher <[hidden email]> wrote:
> tl;dr : I would prefer not to add another tarball as part of our "official"

I am not opposed to replacing the current single tarball with client
and server tarballs.  What I find appealing about this is the prospect
of a client tarball with fewer deps.

However, I think a lot of thought should be put into the scripts if
this is done.  For example, the client tar and server tar should
probably not both have accumulo commands that do different things.
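One way to catch that situation early is a check over the two packages' scripts. The following Python sketch is only an illustration; the function name `conflicting_commands` and the dict-of-bytes input shape are assumptions, not part of any existing Accumulo tooling.

```python
def conflicting_commands(client_scripts, server_scripts):
    """Return names of scripts shipped in both packages with different contents.

    Each argument maps a script path (for example "bin/accumulo") to the
    script's contents as bytes. Scripts that are byte-for-byte identical
    in both packages are not reported.
    """
    shared = client_scripts.keys() & server_scripts.keys()
    return sorted(
        name for name in shared
        if client_scripts[name] != server_scripts[name]
    )
```

An empty result means no command name is reused with different behavior; a non-empty result lists the names that would need to be renamed or unified.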

> releases, but I'd be in favor of blog instructions, a script, or a build
> profile, which users could read/execute/activate to create a client-centric
> package.
>
> I've long believed that supporting different downstream packaging scenarios
> should be prioritized over upstream binary packaging. I have argued in

This "downstream" packaging could also be done within the Apache Accumulo
project, like accumulo-docker.  Creating other packaging projects within
Accumulo is something to consider.

> favor of removing our current tarball entirely, while supporting efforts to

Apache Accumulo needs some sort of tarball that makes it easy to run
the code on a cluster, otherwise how can we test Accumulo on a cluster
for releases?

> enable downstream packaging by modularizing the server code, supporting a
> client-API jar (future work), and decoupling code from launch scripts. I
> think we should continue to do these kinds of improvements to support
> different packaging scenarios downstream, but I'd prefer to avoid
> additional "official" binary releases.

I agree. I think if the Accumulo Java code made fewer assumptions about
its runtime env, it would result in code that is easier to maintain and
package for different environments.

In Fluo we have recently done a lot of work in order to support
Docker, Mesos, and Kubernetes.  This work has really cleaned up the
core Fluo code, making it easier to run in any environment.

I suspect pulling the Accumulo tarball out of the main repo and into a
separate git repo may help highlight some of the assumptions the
Accumulo Java code makes about the environment.

I think these cleanup issues are related to what Josh is suggesting,
but are not prerequisites.  So it makes sense to discuss them at this
point, but I don't think they should block work on two tarballs if
that seems like a good idea.


Re: [DISCUSS] Any interest in separate client/server tarballs

Mike Walch-2
In reply to this post by Josh Elser
I like the idea of a client tarball.  I think it will make things easier for
users. However, I agree with Keith that we are going to need to split the
accumulo command into accumulo-client & accumulo-server.  I am interested
in helping out with this as I have done a lot of work on the scripts in 2.0.


Re: [DISCUSS] Any interest in separate client/server tarballs

Keith Turner
On Fri, Jan 5, 2018 at 11:24 AM, Mike Walch <[hidden email]> wrote:
> I like the idea of a client tarball.  I think it will make things easier for
> users. However, I agree with Keith that we are going to need to split the
> accumulo command into accumulo-client & accumulo-server.  I am interested
> in helping out with this as I have done a lot of work on the scripts in 2.0.

2.0 would be a good time for disruptive script changes.

We could call the client script accumulo and the server script
accumulo-server.  The client script is used more often, so a shorter
name would be nice.


Re: [DISCUSS] Any interest in separate client/server tarballs

Josh Elser
One thing worth mentioning is that I will be doing this against
$dayjob's 1.7-based branch to start.

If the consensus is to only do this for a 2.0 Accumulo release, perhaps
I can use my work to seed that effort? I'm thinking something like a
document that lists what would be in such a client-tarball.


Re: [DISCUSS] Any interest in separate client/server tarballs

Christopher Tubbs-2
In reply to this post by Josh Elser
On Fri, Jan 5, 2018 at 10:01 AM Josh Elser <[hidden email]> wrote:

> I'd be worried about advertising something that we're not treating as
> official as it would languish (unless we create tests that can validate
> the result for us).
>
>
My concern is "packagability". That's what I'm concerned about languishing.
I see no reason why either should, though.

As for including it as another "convenience binary" with our releases, I'm
not sold on that yet, but I'm not entirely against it either. I'll withhold
judgment until we have something concrete to review.



Re: [DISCUSS] Any interest in separate client/server tarballs

Christopher Tubbs-2
In reply to this post by Keith Turner
On Fri, Jan 5, 2018 at 10:30 AM Keith Turner <[hidden email]> wrote:

> On Thu, Jan 4, 2018 at 7:43 PM, Christopher <[hidden email]> wrote:
> > tl;dr : I would prefer not to add another tarball as part of our "official"
>
> I am not opposed to replacing the current single tarball with client
> and server tarballs.  What I find appealing about this is the prospect
> of a client tarball with fewer deps.
>
> However, I think a lot of thought should be put into the scripts if
> this is done.  For example, the client tar and server tar should
> probably not both have accumulo commands that do different things.
>
>
Agreed on Keith's point about the scripts and it requiring some
consideration.


> > releases, but I'd be in favor of blog instructions, a script, or a build
> > profile, which users could read/execute/activate to create a client-centric
> > package.
> >
> > I've long believed that supporting different downstream packaging scenarios
> > should be prioritized over upstream binary packaging. I have argued in
>
> This "downstream" packaging could also be done within the Apache Accumulo
> project, like accumulo-docker.  Creating other packaging projects within
> Accumulo is something to consider.
>
>
+1; When I say "downstream", it's a role, not an entity. The point is that
it's a distinct activity. accumulo-docker is a perfect example of a
"downstream packaging" project maintained by the upstream community. I find
it frustrating sometimes when supporting users that they can't tell the
difference between what is "Accumulo" and what is "this specific
packaging/configuration/deployment of Accumulo", because we don't make
those lines clear. I think we can draw these lines a bit more clearly.


> > favor of removing our current tarball entirely, while supporting efforts to
>
> Apache Accumulo needs some sort of tarball that makes it easy to run
> the code on a cluster, otherwise how can we test Accumulo on a cluster
> for releases?
>
>
A binary tarball may be the best for this, but it's little more than the
jars in Maven Central and a few text files. It could be trivially replaced
with a simple script and manifest; it could also be replaced with an RPM, a
docker image, or any number of things. A tarball is just one type of
packaging for Accumulo's binaries.
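To illustrate the "simple script and manifest" idea, here is a rough Python sketch. It assumes the jars and text files have already been fetched into a local directory; the manifest format (one relative path per line, `#` for comments), the function name `build_tarball`, and the `accumulo` top-level directory are all hypothetical.

```python
import tarfile
from pathlib import Path


def build_tarball(manifest_lines, source_dir, output_path, prefix="accumulo"):
    """Pack the files named in a manifest into a gzipped tarball.

    manifest_lines: iterable of relative paths, one per entry; blank
    entries and entries starting with '#' are ignored.
    Returns the list of relative paths that were packed.
    """
    source_dir = Path(source_dir)
    entries = [line.strip() for line in manifest_lines]
    entries = [e for e in entries if e and not e.startswith("#")]
    with tarfile.open(str(output_path), "w:gz") as tar:
        for rel in entries:
            # arcname puts every file under a single top-level directory.
            tar.add(str(source_dir / rel), arcname=f"{prefix}/{rel}")
    return entries
```

A client-centric package would then just be a shorter manifest fed to the same function.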

In any case, I wasn't talking about removing the ability to produce a
binary tarball from source. Only removing it from our release artifacts and
downloads. It is not a popular opinion, but I still think it's reasonable,
with both pros and cons.


> > enable downstream packaging by modularizing the server code, supporting a
> > client-API jar (future work), and decoupling code from launch scripts. I
> > think we should continue to do these kinds of improvements to support
> > different packaging scenarios downstream, but I'd prefer to avoid
> > additional "official" binary releases.
>
> I agree. I think if the Accumulo Java code made fewer assumptions about
> its runtime env, it would result in code that is easier to maintain and
> package for different environments.
>
> In Fluo we have recently done a lot of work in order to support
> Docker, Mesos, and Kubernetes.  This work has really cleaned up the
> core Fluo code, making it easier to run in any environment.
>
> I suspect pulling the Accumulo tarball out of the main repo and into a
> separate git repo may help highlight some of the assumptions the
> Accumulo Java code makes about the environment.
>
>
This is basically what the assemble module is now. It's why I moved the bin
and conf directories into it, and have made its dependencies optional so
they wouldn't be resolved transitively, and why I made the assembly plugin
gather up the libs instead of the dependency plugin which used to drop them
in a lib directory at the root of the source checkout. This module is the
"downstream packaging" for the current "all-in-one" binary tarball package.


> I think these clean up issues are related to what Josh is suggesting,
> but are not prerequisites.  So it makes sense to discuss them at this
> point, but I don't think they should block work on two tarballs if
> that seems like a good idea.
>
>
Agreed. That discussion can be deferred. Much depends on how it is to be
split up.



Re: [DISCUSS] Any interest in separate client/server tarballs

Michael Wall
I like the idea of a client jar that has fewer dependencies.  Josh, where
are you thinking MiniAccumuloCluster fits in here?

On Fri, Jan 5, 2018 at 3:57 PM Christopher <[hidden email]> wrote:

> On Fri, Jan 5, 2018 at 10:30 AM Keith Turner <[hidden email]> wrote:
>
> > On Thu, Jan 4, 2018 at 7:43 PM, Christopher <[hidden email]> wrote:
> > > tl;dr : I would prefer not to add another tarball as part of our
> > "official"
> >
> > I am not opposed to replacing the current single tarball with client
> > and server tarballs.   What I find appealing about this is if the
> > client tarball has less deps.
> >
> > However I think a lot of thought should be put into the scripts if
> > this is done.  For example the client tar and server tar should
> > probably not both have accumulo commands that do different things.
> >
> >
> Agreed on Keith's point about the scripts and it requiring some
> consideration.
>
>
> > > releases, but I'd be in favor of a blog instructions, script, or build
> > > profile, which users could read/execute/activate to create a
> > client-centric
> > > package.
> > >
> > > I've long believed that supporting different downstream packaging
> > scenarios
> > > should be prioritized over upstream binary packaging. I have argued in
> >
> > These "downstream" packaging could be done within the Apache Accumulo
> > project also.  Like accumulo-docker.  Creating other packaging
> > projects within Accumulo is something to consider.
> >
> >
> +1; When I say "downstream", it's a role, not an entity. The point is that
> it's a distinct activity. accumulo-docker is a perfect example of a
> "downstream packaging" project maintained by the upstream community. I find
> it frustrating sometimes when supporting users that they can't tell the
> difference between what is "Accumulo" and what is "this specific
> packaging/configuration/deployment of Accumulo", because we don't make
> those lines clear. I think we can draw these lines a bit more clearly.
>
>
> > > favor of removing our current tarball entirely, while supporting efforts to
> >
> > Apache Accumulo needs some sort of tarball that makes it easy to run
> > the code on a cluster, otherwise how can we test Accumulo on a cluster
> > for releases?
> >
> >
> A binary tarball may be the best for this, but it's little more than the
> jars in Maven Central and a few text files. It could be trivially replaced
> with a simple script and manifest; it could also be replaced with an RPM, a
> docker image, or any number of things. A tarball is just one type of
> packaging for Accumulo's binaries.
>
> In any case, I wasn't talking about removing the ability to produce a
> binary tarball from source. Only removing it from our release artifacts and
> downloads. It is not a popular opinion, but I still think it's reasonable,
> with both pros and cons.
>
>
> > > enable downstream packaging by modularizing the server code, supporting a
> > > client-API jar (future work), and decoupling code from launch scripts. I
> > > think we should continue to do these kinds of improvements to support
> > > different packaging scenarios downstream, but I'd prefer to avoid
> > > additional "official" binary releases.
> >
> > I agree. I think if the Accumulo Java code made fewer assumptions about
> > its runtime env it would result in code that is easier to maintain and
> > package for different environments.
> >
> > In Fluo we have recently done a lot of work in order to support
> > Docker, Mesos, and Kubernetes.  This work has really cleaned up the
> > core Fluo code making it easier to run in any environment.
> >
> > I suspect pulling the Accumulo tarball into a separate git repo and
> > out of the main repo may help highlight some of the assumptions
> > Accumulo Java code makes about the environment.
> >
> >
> This is basically what the assemble module is now. It's why I moved the bin
> and conf directories into it, and have made its dependencies optional so
> they wouldn't be resolved transitively, and why I made the assembly plugin
> gather up the libs instead of the dependency plugin which used to drop them
> in a lib directory at the root of the source checkout. This module is the
> "downstream packaging" for the current "all-in-one" binary tarball package.
>
>
> > I think these clean up issues are related to what Josh is suggesting,
> > but are not prerequisites.  So it makes sense to discuss them at this
> > point, but I don't think they should block work on two tarballs if
> > that seems like a good idea.
> >
> >
> Agreed. That discussion can be deferred. Much depends on how it is to be
> split up.
>
>
> > >
> > > Rather than provide additional packages, I'd prefer to work with downstream
> > > to make the source more "packagable" to suit the needs of these downstream
> > > vendor/community packagers. One way we can do that here is by either
> > > documenting what would be needed in a client-centric package, or by
> > > providing a script or build profile to create it from source, so that your
> > > $dayjob or any other downstream packager doesn't have to figure that out
> > > from scratch.
> > >
> > > On Thu, Jan 4, 2018 at 7:17 PM Josh Elser <[hidden email]> wrote:
> > >
> > >> Hi,
> > >>
> > > >> $dayjob presented me with a request to break up the current tarball into
> > > >> two: one suitable for "users" and another for the Accumulo services. The
> > > >> ultimate goal is to make upgrade scenarios a bit easier by having client
> > > >> and server centric packaging.
> > >>
> > >> The "client" tarball would be something suitable for most users
> > >> providing the ability to do things like:
> > >>
> > >> * Launch a java app against Accumulo
> > >> * Launch a MapReduce job against Accumulo
> > >> * Launch the Accumulo shell
> > >>
> > >> Essentially, the client tarball is just a pared down version of our
> > >> "current" tarball and the server-tarball is likely equivalent to our
> > >> "current" tarball (given that we have little code which would be
> > >> considered client-only).
> > >>
> > > >> Obviously, there are many ways to go about this. If there is buy-in from
> > > >> other folks, adding some new assembly descriptors and making it a part
> > > >> of the Maven build (perhaps, optionally generated) would be the easiest
> > > >> in terms of maintenance. However, I don't want to push for that if it's
> > > >> just going to be ignored by folks. I'll be creating something to support
> > > >> this one way or another.
> > >>
> > >> Any thoughts/opinions? Would this have any value to other folks?
> > >>
> > >> - Josh
> > >>
> >
>

Re: [DISCUSS] Any interest in separate client/server tarballs

Josh Elser
MAC, in its common state, is probably not something we'd want to include
in this proposed tarball. The reasoning is that MAC (and its related
classes) isn't something people would need on your "Hadoop
Cluster" to talk to Accumulo. It's something that can just be obtained
via Maven.

However, if you're more referring to MAC as the generic
"AccumuloCluster" interface (an attempt to make running tests against
MAC and a real Accumulo cluster transparent --
StandaloneAccumuloCluster), then I could see some JAR that we'd include
which would contain the necessary classes (on top of
accumulo-client.jar) for users to run code seamlessly against a
traditional MAC or the StandaloneAccumuloCluster.
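
To make the second option concrete, here is a minimal sketch of user code written against a cluster abstraction rather than a specific kind of cluster. Everything in it is an assumption for illustration: `ClusterHandle`, `MiniClusterHandle`, `StandaloneClusterHandle`, and `ClusterDemo` are hypothetical names, not the real Accumulo classes, and the values they return are placeholders.

```java
import java.util.Objects;

// Hypothetical stand-in for the "AccumuloCluster" idea: just enough
// information for client code to locate a cluster. Not the real interface.
interface ClusterHandle {
    String instanceName();
    String zooKeepers();
}

// Hypothetical handle for a throwaway local test cluster (the MAC role).
final class MiniClusterHandle implements ClusterHandle {
    private final int zooKeeperPort;

    MiniClusterHandle(int zooKeeperPort) {
        this.zooKeeperPort = zooKeeperPort;
    }

    @Override
    public String instanceName() {
        return "miniInstance"; // placeholder name, chosen for this sketch
    }

    @Override
    public String zooKeepers() {
        return "localhost:" + zooKeeperPort;
    }
}

// Hypothetical handle for an already-running cluster (the standalone role).
final class StandaloneClusterHandle implements ClusterHandle {
    private final String instanceName;
    private final String zooKeepers;

    StandaloneClusterHandle(String instanceName, String zooKeepers) {
        this.instanceName = Objects.requireNonNull(instanceName);
        this.zooKeepers = Objects.requireNonNull(zooKeepers);
    }

    @Override
    public String instanceName() {
        return instanceName;
    }

    @Override
    public String zooKeepers() {
        return zooKeepers;
    }
}

// User code depends only on the interface, so the same method works
// whether it is handed a mini cluster or a standalone one.
final class ClusterDemo {
    static String connectionSummary(ClusterHandle cluster) {
        return cluster.instanceName() + "@" + cluster.zooKeepers();
    }

    public static void main(String[] args) {
        System.out.println(connectionSummary(new MiniClusterHandle(2181)));
        System.out.println(connectionSummary(
            new StandaloneClusterHandle("prod", "zk1:2181,zk2:2181")));
    }
}
```

In a layout like this, only the interface and the standalone handle would need to sit alongside the client jar; the mini-cluster implementation could still come from Maven at test time.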

On 1/5/18 4:22 PM, Michael Wall wrote:

> I like the idea of a client jar that has fewer dependencies.  Josh, where
> are you thinking the MiniAccumuloCluster fits in here?

Re: [DISCUSS] Any interest in separate client/server tarballs

Michael Wall
Yeah, I was thinking more like your second paragraph.  Thinking I would use
the proposed client jar to develop against the MiniAccumuloCluster
(typically the StandaloneMiniAccumuloCluster for me) and then deploy that
code to run against a real cluster.  Would like to flesh that use case out a
little more.  Do you think it has to be another jar on top of the client
jar?

On Fri, Jan 5, 2018 at 4:31 PM Josh Elser <[hidden email]> wrote:

> MAC, in its common state, is probably not something we'd want to include
> in this proposed tarball. The reasoning is that MAC (and its related
> classes) isn't something people would need on your "Hadoop
> Cluster" to talk to Accumulo. It's something that can just be obtained
> via Maven.
>
> However, if you're more referring to MAC as the generic
> "AccumuloCluster" interface (an attempt to make running tests against
> MAC and a real Accumulo cluster transparent --
> StandaloneAccumuloCluster), then I could see some JAR that we'd include
> which would contain the necessary classes (on top of
> accumulo-client.jar) for users to run code seamlessly against a
> traditional MAC or the StandaloneAccumuloCluster.

Re: [DISCUSS] Any interest in separate client/server tarballs

Josh Elser
I think it would depend on how much other "stuff" has to come in to support
the *Clusters. I assumed it would be a bit, but, if it's not, I have no
objections to a single jar.

On 1/5/18 4:38 PM, Michael Wall wrote:

> Yeah, I was thinking more like your second paragraph.  Thinking I would use
> the proposed client jar to develop against the MiniAccumuloCluster
> (typically the StandaloneMiniAccumuloCluster for me) and then deploy that
> code to run against a real cluster.  Would like to flesh that use case out a
> little more.  Do you think it has to be another jar on top of the client
> jar?