Accumulo on Google Cloud Storage

Accumulo on Google Cloud Storage

Maxim Kolchin
Hi all,

Does anyone have experience running Accumulo on top of Google Cloud Storage instead of HDFS? If you haven't heard about this feature, you can find some details in [1].

I've seen some discussion around this topic (see [2], [3]), but it seems less popular than I believe it should be.


Best regards,
Maxim

Re: Accumulo on Google Cloud Storage

Geoffry Roberts
I tried running Accumulo on Google. I first tried running it on Google's pre-made Hadoop, but found that the various file paths one must contend with differ from a straight Apache download; it seems they moved things around. To counter this, I installed my own Hadoop, along with Zookeeper and Accumulo, on a Google node. All went well until one fine day I could no longer log in: Google had pushed out changes overnight that broke my client-side Google Cloud installation, and referred those affected to a lengthy, error-prone procedure for resolving the issue.

I decided life was too short for this kind of thing and switched to Amazon.  

--
There are ways and there are ways, 

Geoffry Roberts

Re: Accumulo on Google Cloud Storage

Maxim Kolchin
Hi Geoffry,

Thank you for the feedback!

Thanks to [1, 2], I was able to run an Accumulo cluster on Google VMs with GCS instead of HDFS, and I used Google Dataproc to run Hadoop jobs on Accumulo. Almost everything worked well until I ran into connection issues with GCS: quite often, the connection to GCS breaks while writing or closing WALs.

To all,

Does Accumulo have a specific write pattern that a file system may not support? Are there Accumulo properties I can tune to adjust the write pattern?
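For context, these are the WAL-related properties in accumulo-site.xml that look relevant to me (names per the Accumulo 1.x documentation; I don't yet know which of these, if any, change the write pattern that GCS dislikes, and the values below are only illustrative):

```xml
<!-- accumulo-site.xml: WAL-related knobs (Accumulo 1.x property names) -->
<property>
  <name>tserver.walog.max.size</name>
  <value>1G</value> <!-- larger log files mean WALs are rolled less often -->
</property>
<property>
  <name>tserver.mutation.queue.max</name>
  <value>4M</value> <!-- buffer more mutation data before each WAL write -->
</property>
<property>
  <name>table.compaction.minor.logs.threshold</name>
  <value>3</value> <!-- WALs a tablet may use before minor compaction -->
</property>
```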


Thank you!
Maxim


Re: Accumulo on Google Cloud Storage

Geoffry Roberts
Maxim,

Interesting that you were able to run Accumulo on GCS. I never thought of that--good to know.

Since I am now an AWS guy (at least for the time being), and in light of the fact that Amazon purchased Sqrrl, I am interested to see what develops.


--
There are ways and there are ways, 

Geoffry Roberts

Re: Accumulo on Google Cloud Storage

Christopher Tubbs-2
For what it's worth, this is an Apache project, not a Sqrrl project. Amazon is free to contribute to Accumulo to improve its support of their platform, just as anybody is free to do. Amazon may start contributing more as a result of their acquisition... or they may not. There is no reason to expect that their acquisition will have any impact whatsoever on the platforms Accumulo supports, because Accumulo is not, and has not ever been, a Sqrrl project (although some Sqrrl employees have contributed), and thus will not become an Amazon project. It has been, and will remain, a vendor-neutral Apache project. Regardless, we welcome contributions from anybody which would improve Accumulo's support of any additional platform alternatives to HDFS, whether it be GCS, S3, or something else.

As for the WAL closing issue on GCS, I recall a previous thread about that... I think a simple patch might be possible to solve that issue, but to date, nobody has contributed a fix. If somebody is interested in using Accumulo on GCS, I'd like to encourage them to submit any bugs they encounter, and any patches (if they are able) which resolve those bugs. If they need help submitting a fix, please ask on the dev@ list.



Re: Accumulo on Google Cloud Storage

Stephen Meyles
I think we're seeing something similar, but in our case we're trying to run Accumulo atop ADLS. When we generate sufficient write load we start to see stack traces like the following:

[log.DfsLogger] ERROR: Failed to write log entries
java.io.IOException: attempting to write to a closed stream;
    at com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:88)
    at com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:77)
    at org.apache.hadoop.fs.adl.AdlFsOutputStream.write(AdlFsOutputStream.java:57)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:48)
    at java.io.DataOutputStream.write(DataOutputStream.java:88)
    at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
    at org.apache.accumulo.tserver.logger.LogFileKey.write(LogFileKey.java:87)
    at org.apache.accumulo.tserver.log.DfsLogger.write(DfsLogger.java:537)

We have developed a rudimentary LogCloser implementation that allows us to recover from this, but overall performance is significantly impacted.
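For what it's worth, ours boils down to "there is nothing to recover on ADLS, proceed immediately". A self-contained sketch of the idea is below; the real Accumulo 1.x interface is org.apache.accumulo.server.master.recovery.LogCloser, whose close() also receives the server configuration and a VolumeManager and may throw IOException, so the reduced signature and the AdlsLogCloser name here are ours:

```java
public class LogCloserSketch {

    // Reduced shape of the LogCloser contract: return 0 when the WAL is safe
    // to replay, or a positive number of milliseconds to wait before retrying.
    interface LogCloser {
        long close(String walPath);
    }

    // Hypothetical ADLS closer. Unlike HDFS there is no lease to recover via
    // recoverLease(); once the writer is gone the file is effectively closed,
    // so log recovery can proceed right away.
    static class AdlsLogCloser implements LogCloser {
        @Override
        public long close(String walPath) {
            return 0; // nothing to recover; replay the log immediately
        }
    }

    public static void main(String[] args) {
        LogCloser closer = new AdlsLogCloser();
        System.out.println(closer.close("adl://store/accumulo/wal/..."));
    }
}
```

If memory serves, the real implementation is wired in via the master's WAL-closer property (master.walog.closer.implementation in 1.x), but check the docs for your version.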

> As for the WAL closing issue on GCS, I recall a previous thread about that

I searched for this but wasn't able to find anything, nor anything similar regarding ADLS. I am also curious about the earlier question:

>> Does Accumulo have a specific write pattern [to WALs], so that file system may not support it?

as discussions with MS engineers have suggested that, much like in the GCS case, small writes at high volume are, at best, suboptimal for ADLS.

Regards

Stephen
 


Re: Accumulo on Google Cloud Storage

Maxim Kolchin
> If somebody is interested in using Accumulo on GCS, I'd like to encourage them to submit any bugs they encounter, and any patches (if they are able) which resolve those bugs.

I'd like to contribute a fix, but I don't know where to start. We tried to get help from Google Support about [1] over email, but they just say that GCS doesn't support such a write pattern. In the end, we can only guess at how to adjust Accumulo's behaviour to minimise broken connections to GCS.

BTW, although we observe this exception, the tablet server doesn't fail, which means that after some retries it is able to write WALs to GCS.

@Stephen,

> as discussions with MS engineers have suggested, similar to the GCS thread, that small writes at high volume are, at best, suboptimal for ADLS.

Did you try to adjust any Accumulo properties to do bigger writes less frequently or something like that?


Maxim


Re: Accumulo on Google Cloud Storage

Stephen Meyles
> Did you try to adjust any Accumulo properties to do bigger writes less frequently or something like that?

We're using BatchWriters and sending reasonably large batches of Mutations. Given that the stack traces in both our cases are related to WAL writes, it seems like batch size would be the only tweak available here (though, without reading the code carefully, it's not even clear to me that it is impactful), but if others have suggestions I'd be happy to try.
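For reference, the client-side knobs in question live on BatchWriterConfig (Accumulo 1.x client API; `connector` below stands for an existing Connector, and the values are illustrative, not recommendations):

```java
// Client-side batching knobs. Larger maxMemory/maxLatency mean fewer, bigger
// flushes from the client, though each tserver still drives its own WAL
// write cadence regardless of how the client batches.
import java.util.concurrent.TimeUnit;
import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.BatchWriterConfig;

BatchWriterConfig cfg = new BatchWriterConfig()
    .setMaxMemory(64 * 1024 * 1024)      // buffer up to 64 MB of mutations
    .setMaxLatency(30, TimeUnit.SECONDS) // hold mutations for up to 30 s
    .setMaxWriteThreads(4);              // concurrent sends to tservers
BatchWriter bw = connector.createBatchWriter("mytable", cfg);
```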

Given that we have this working well and stably in other clusters atop traditional HDFS, I'm currently pursuing this further with Microsoft to understand the variance with ADLS. Depending on what emerges from that, I may circle back with more details and a bug report and start digging more deeply into the relevant code in Accumulo.

S.



Re: Accumulo on Google Cloud Storage

Maxim Kolchin
Just FYI: A separate discussion was started in the GCS connector issue tracker to come up with a way to support Accumulo. See https://github.com/GoogleCloudPlatform/bigdata-interop/issues/104

It'd be great to draw more attention to the issue, so if you're interested, please give it a thumbs-up :)

Maxim

On Fri, Jun 22, 2018 at 4:09 PM Maxim Kolchin <[hidden email]> wrote:
> If somebody is interested in using Accumulo on GCS, I'd like to encourage them to submit any bugs they encounter, and any patches (if they are able) which resolve those bugs.

I'd like to contribute a fix, but I don't know where to start. We tried to get any help from the Google Support about [1] over email, but they just say that the GCS doesn't support such write pattern. In the end, we can only guess how to adjust the Accumulo behaviour to minimise broken connections to the GCS.

BTW although we observe this exception, the tablet server doesn't fail, so it means that after some retries it is able to write WALs to GCS.

@Stephen,

> as discussions with MS engineers have suggested, similar to the GCS thread, that small writes at high volume are, at best, suboptimal for ADLS.

Did you try to adjust any Accumulo properties to do bigger writes less frequently or something like that?


Maxim

On Thu, Jun 21, 2018 at 7:17 AM Stephen Meyles <[hidden email]> wrote:
I think we're seeing something similar but in our case we're trying to run Accumulo atop ADLS. When we generate sufficient write load we start to see stack traces like the following:

[log.DfsLogger] ERROR: Failed to write log entries
java.io.IOException: attempting to write to a closed stream;
at com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:88)
at com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:77)
at org.apache.hadoop.fs.adl.AdlFsOutputStream.write(AdlFsOutputStream.java:57)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:48)
at java.io.DataOutputStream.write(DataOutputStream.java:88)
at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
at org.apache.accumulo.tserver.logger.LogFileKey.write(LogFileKey.java:87)
at org.apache.accumulo.tserver.log.DfsLogger.write(DfsLogger.java:537)

We have developed a rudimentary LogCloser implementation that allows us to recover from this but overall performance is significantly impacted by this.

As for the WAL closing issue on GCS, I recall a previous thread about that

I searched more for this but wasn't able to find anything, nor similar re: ADL. I am also curious about the earlier question:

>> Does Accumulo have a specific write pattern [to WALs], so that file system may not support it?

as discussions with MS engineers have suggested, similar to the GCS thread, that small writes at high volume are, at best, suboptimal for ADLS.

Regards

Stephen
 

On Wed, Jun 20, 2018 at 11:20 AM, Christopher <[hidden email]> wrote:
For what it's worth, this is an Apache project, not a Sqrrl project. Amazon is free to contribute to Accumulo to improve its support of their platform, just as anybody is free to do. Amazon may start contributing more as a result of their acquisition... or they may not. There is no reason to expect that their acquisition will have any impact whatsoever on the platforms Accumulo supports, because Accumulo is not, and has not ever been, a Sqrrl project (although some Sqrrl employees have contributed), and thus will not become an Amazon project. It has been, and will remain, a vendor-neutral Apache project. Regardless, we welcome contributions from anybody which would improve Accumulo's support of any additional platform alternatives to HDFS, whether it be GCS, S3, or something else.

As for the WAL closing issue on GCS, I recall a previous thread about that... I think a simple patch might be possible to solve that issue, but to date, nobody has contributed a fix. If somebody is interested in using Accumulo on GCS, I'd like to encourage them to submit any bugs they encounter, and any patches (if they are able) which resolve those bugs. If they need help submitting a fix, please ask on the dev@ list.



On Wed, Jun 20, 2018 at 8:21 AM Geoffry Roberts <[hidden email]> wrote:
Maxim,

Interesting that you were able to run Accumulo on GCS.  I never thought of that--good to know.

Since I am now an AWS guy (at least for the time being), in light of the fact that Amazon purchased Sqrrl, I am interested to see what develops.


On Wed, Jun 20, 2018 at 5:15 AM, Maxim Kolchin <[hidden email]> wrote:
Hi Geoffry,

Thank you for the feedback!

Thanks to [1, 2], I was able to run an Accumulo cluster on Google VMs with GCS instead of HDFS, and I used Google Dataproc to run Hadoop jobs on Accumulo. Almost everything worked well until I faced some connection issues with GCS: quite often, the connection to GCS breaks while writing or closing WALs.

To all,

Does Accumulo have a specific write pattern that the file system may not support? Are there Accumulo properties I can play with to adjust the write pattern?


Thank you!
Maxim
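There are a handful of server-side properties that influence how mutations are batched and flushed to the WAL; a sketch of the ones that seem worth experimenting with (the values below are illustrative only, and names/defaults should be verified against your release's documentation):

```xml
<!-- accumulo-site.xml: knobs affecting WAL write behaviour (values are examples only) -->
<property>
  <name>tserver.mutation.queue.max</name>
  <value>4M</value> <!-- bytes of mutations queued before the WAL is flushed -->
</property>
<property>
  <name>tserver.wal.blocksize</name>
  <value>1G</value> <!-- block size used when creating WAL files -->
</property>
<property>
  <name>tserver.walog.max.size</name>
  <value>1G</value> <!-- roll to a new WAL once it reaches this size -->
</property>
```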



Re: Accumulo on Google Cloud Storage

Stephen Meyles
In reply to this post by Stephen Meyles
Knowing that HBase has been run successfully on ADLS, I went looking there (as it has the same WAL write pattern). This is informative:


which suggests a need to split the WALs off onto HDFS proper versus ADLS (or presumably GCS), barring changes in the underlying semantics of each. AFAICT you can't currently configure Accumulo to send WALs to a separate cluster - is this correct?

S.


On Fri, Jun 22, 2018 at 9:07 AM, Stephen Meyles <[hidden email]> wrote:
> Did you try to adjust any Accumulo properties to do bigger writes less frequently or something like that?

We're using BatchWriters and sending reasonably large batches of Mutations. Given the stack traces in both our cases are related to WAL writes, it seems like batch size would be the only tweak available here (though, without reading the code carefully, it's not even clear to me that it is impactful), but if others have suggestions I'd be happy to try them.

Given we have this working well and stably in other clusters atop traditional HDFS, I'm currently pursuing this further with MS to understand the variance on ADLS. Depending on what emerges from that, I may circle back with more details and a bug report and start digging more deeply into the relevant code in Accumulo.

S.


On Fri, Jun 22, 2018 at 6:09 AM, Maxim Kolchin <[hidden email]> wrote:
> If somebody is interested in using Accumulo on GCS, I'd like to encourage them to submit any bugs they encounter, and any patches (if they are able) which resolve those bugs.

I'd like to contribute a fix, but I don't know where to start. We tried to get help from Google Support about [1] over email, but they just say that GCS doesn't support such a write pattern. In the end, we can only guess at how to adjust Accumulo's behaviour to minimise broken connections to GCS.

BTW, although we observe this exception, the tablet server doesn't fail, which means that after some retries it is able to write WALs to GCS.

@Stephen,

> as discussions with MS engineers have suggested, similar to the GCS thread, that small writes at high volume are, at best, suboptimal for ADLS.

Did you try to adjust any Accumulo properties to do bigger writes less frequently or something like that?


Maxim
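One connector-level option that may be relevant here: newer releases of the GCS connector expose an `fs.gs.outputstream.type` setting, with a `SYNCABLE_COMPOSITE` mode added for hflush/hsync-heavy WAL workloads (HBase's, originally). A sketch for core-site.xml, assuming your gcs-connector version supports it:

```xml
<!-- core-site.xml: request a syncable output stream from the GCS connector -->
<!-- (available only in newer gcs-connector releases; check your version)   -->
<property>
  <name>fs.gs.outputstream.type</name>
  <value>SYNCABLE_COMPOSITE</value>
</property>
```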





Re: Accumulo on Google Cloud Storage

Christopher Tubbs-2
Unfortunately, that feature wasn't added until 2.0, which hasn't yet been released, but I'm hoping it will be later this year.

I'm not convinced this is a write pattern issue, though. I commented on https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103#issuecomment-399608543
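For reference, the 2.0-style setup would look roughly like the sketch below, keeping WALs on HDFS while table files live on GCS. Property names follow the 2.0 volume-chooser work and the URIs are placeholders; verify both against the released documentation:

```properties
# accumulo.properties (2.0 sketch): table data on GCS, WALs preferred onto HDFS
instance.volumes=gs://my-bucket/accumulo,hdfs://namenode:8020/accumulo
general.volume.chooser=org.apache.accumulo.server.fs.PreferredVolumeChooser
general.custom.volume.preferred.default=gs://my-bucket/accumulo
# scope the preference so write-ahead logs land on the HDFS volume
general.custom.volume.preferred.logger=hdfs://namenode:8020/accumulo
```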



Re: Accumulo on Google Cloud Storage

Stephen Meyles
> I'm not convinced this is a write pattern issue, though. I commented on..

The note there suggests the need for a LogCloser implementation; in my (ADLS) case I've written one and have it configured - the exception I'm seeing involves failures during writes, not during recovery (though it then leads to a need for recovery).

S.



Re: Accumulo on Google Cloud Storage

Christopher Tubbs-2
Ah, ok. One of the comments on the issue led me to believe that it was the same issue as the missing custom log closer.

On Sat, Jun 23, 2018, 01:10 Stephen Meyles <[hidden email]> wrote:
I'm not convinced this is a write pattern issue, though. I commented on..

The note there suggests the need for a LogCloser implementation; in my (ADLS) case I've written one and have it configured - the exception I'm seeing involves failures during writes, not during recovery (though it then leads to a need for recovery).

S.

On Fri, Jun 22, 2018 at 4:33 PM, Christopher <[hidden email]> wrote:
Unfortunately, that feature wasn't added until 2.0, which hasn't yet been released, but I'm hoping it will be later this year.

However, I'm not convinced this is a write pattern issue, though. I commented on https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103#issuecomment-399608543

On Fri, Jun 22, 2018 at 1:50 PM Stephen Meyles <[hidden email]> wrote:
Knowing that HBase has been run successfully on ADLS, went looking there (as they have the same WAL write pattern). This is informative:


which suggests a need to split the WALs off on HDFS proper versus ADLS (or presumably GCS) barring changes in the underlying semantics of each. AFAICT you can't currently configure Accumulo to send WAL logs to a separate cluster - is this correct?

S.


On Fri, Jun 22, 2018 at 9:07 AM, Stephen Meyles <[hidden email]> wrote:
Did you try to adjust any Accumulo properties to do bigger writes less frequently or something like that?

We're using BatchWriters and sending reasonable larges batches of Mutations. Given the stack traces in both our cases are related to WAL writes it seems like batch size would be the only tweak available here (though, without reading the code carefully it's not even clear to me that is impactful) but if there others have suggestions I'd be happy to try.

Given we have this working well and stable in other clusters atop traditional HDFS I'm currently pursuing this further with the MS to understand the variance to ADLS. Depending what emerges from that I may circle back with more details and a bug report and start digging in more deeply to the relevant code in Accumulo.

S.


On Fri, Jun 22, 2018 at 6:09 AM, Maxim Kolchin <[hidden email]> wrote:
> If somebody is interested in using Accumulo on GCS, I'd like to encourage them to submit any bugs they encounter, and any patches (if they are able) which resolve those bugs.

I'd like to contribute a fix, but I don't know where to start. We tried to get any help from the Google Support about [1] over email, but they just say that the GCS doesn't support such write pattern. In the end, we can only guess how to adjust the Accumulo behaviour to minimise broken connections to the GCS.

BTW although we observe this exception, the tablet server doesn't fail, so it means that after some retries it is able to write WALs to GCS.

@Stephen,

> as discussions with MS engineers have suggested, similar to the GCS thread, that small writes at high volume are, at best, suboptimal for ADLS.

Did you try to adjust any Accumulo properties to do bigger writes less frequently or something like that?


Maxim

On Thu, Jun 21, 2018 at 7:17 AM Stephen Meyles <[hidden email]> wrote:
I think we're seeing something similar but in our case we're trying to run Accumulo atop ADLS. When we generate sufficient write load we start to see stack traces like the following:

[log.DfsLogger] ERROR: Failed to write log entries
java.io.IOException: attempting to write to a closed stream;
at com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:88)
at com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:77)
at org.apache.hadoop.fs.adl.AdlFsOutputStream.write(AdlFsOutputStream.java:57)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:48)
at java.io.DataOutputStream.write(DataOutputStream.java:88)
at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
at org.apache.accumulo.tserver.logger.LogFileKey.write(LogFileKey.java:87)
at org.apache.accumulo.tserver.log.DfsLogger.write(DfsLogger.java:537)

We have developed a rudimentary LogCloser implementation that allows us to recover from this but overall performance is significantly impacted by this.

As for the WAL closing issue on GCS, I recall a previous thread about that

I searched more for this but wasn't able to find anything, nor similar re: ADL. I am also curious about the earlier question:

>> Does Accumulo have a specific write pattern [to WALs], so that file system may not support it?

as discussions with MS engineers have suggested, similar to the GCS thread, that small writes at high volume are, at best, suboptimal for ADLS.

Regards

Stephen
 

On Wed, Jun 20, 2018 at 11:20 AM, Christopher <[hidden email]> wrote:
For what it's worth, this is an Apache project, not a Sqrrl project. Amazon is free to contribute to Accumulo to improve its support of their platform, just as anybody is free to do. Amazon may start contributing more as a result of their acquisition... or they may not. There is no reason to expect that their acquisition will have any impact whatsoever on the platforms Accumulo supports, because Accumulo is not, and has not ever been, a Sqrrl project (although some Sqrrl employees have contributed), and thus will not become an Amazon project. It has been, and will remain, a vendor-neutral Apache project. Regardless, we welcome contributions from anybody which would improve Accumulo's support of any additional platform alternatives to HDFS, whether it be GCS, S3, or something else.

As for the WAL closing issue on GCS, I recall a previous thread about that... I think a simple patch might be possible to solve that issue, but to date, nobody has contributed a fix. If somebody is interested in using Accumulo on GCS, I'd like to encourage them to submit any bugs they encounter, and any patches (if they are able) which resolve those bugs. If they need help submitting a fix, please ask on the dev@ list.



On Wed, Jun 20, 2018 at 8:21 AM Geoffry Roberts <[hidden email]> wrote:
Maxim,

Interesting that you were able to run Accumulo on GCS.  I never thought of that - good to know.

Since I am now an AWS guy (at least for the time being), in light of the fact that Amazon purchased Sqrrl, I am interested to see what develops.


On Wed, Jun 20, 2018 at 5:15 AM, Maxim Kolchin <[hidden email]> wrote:
Hi Geoffry,

Thank you for the feedback!

Thanks to [1, 2], I was able to run an Accumulo cluster on Google VMs with GCS instead of HDFS, and I used Google Dataproc to run Hadoop jobs on Accumulo. Almost everything worked well until I faced some connection issues with GCS. Quite often, the connection to GCS breaks while writing or closing WALs.

[1]: https://github.com/cybermaggedon/accumulo-gs
[2]: https://github.com/cybermaggedon/accumulo-docker

To all,

Does Accumulo have a specific write pattern that some file systems may not support? Are there Accumulo properties I can tune to adjust the write pattern?
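
(For those digging into this question: a few properties that plausibly shape the WAL write pattern. This is a sketch only - the values are illustrative, not recommendations, and the availability of each property depends on the Accumulo version, so check the documentation first.)

```xml
<!-- accumulo-site.xml: properties that influence how much data each WAL
     write carries. Values are illustrative only. -->
<property>
  <name>tserver.mutation.queue.max</name>
  <value>16M</value> <!-- buffer more mutation data before each WAL write -->
</property>
<property>
  <name>tserver.walog.max.size</name>
  <value>1G</value> <!-- larger WALs mean fewer file create/close cycles -->
</property>
<property>
  <name>table.durability</name>
  <value>flush</value> <!-- per-table durability; 'sync' is safest, 'flush' is cheaper -->
</property>
```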


Thank you!
Maxim

On Tue, Jun 19, 2018 at 10:31 PM Geoffry Roberts <[hidden email]> wrote:
I tried running Accumulo on Google.  I first tried running it on Google's pre-made Hadoop.  I found the various file paths one must contend with are different on Google than on a straight download from Apache.  It seems they moved things around.  To counter this, I installed my own Hadoop along with Zookeeper and Accumulo on a Google node.  All went well until one fine day when I could no longer log in.  It seems Google had pushed out some changes overnight that broke my client-side Google Cloud installation.  Google referred affected users to a lengthy, error-prone procedure for resolving the issue.

I decided life was too short for this kind of thing and switched to Amazon.  

On Tue, Jun 19, 2018 at 7:34 AM, Maxim Kolchin <[hidden email]> wrote:
Hi all,

Does anyone have experience running Accumulo on top of Google Cloud Storage instead of HDFS? See [1] for some details if you haven't heard of this feature.

I see some discussion (see [2], [3]) around this topic, but it seems less popular than I believe it should be.

[1]: https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage
[2]: https://github.com/apache/accumulo/issues/428
[3]: https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103

Best regards,
Maxim



--
There are ways and there are ways, 

Geoffry Roberts



--
There are ways and there are ways, 

Geoffry Roberts



Re: Accumulo on Google Cloud Storage

Maxim Kolchin
Hi,

I just wanted to leave intermediate feedback on the topic.

So far, Accumulo works pretty well on top of Google Storage. The aforementioned issue still exists, but it doesn't break anything. However, I can't give you any useful performance numbers at the moment.

The cluster:

 - master (with zookeeper) (n1-standard-1) + 2 tservers (n1-standard-4)
 - 32+ billion entries
 - 5 tables (excluding system tables)

Some averaged numbers from two use cases:

 - batch write into pre-split tables with 40 client machines + 4 tservers (n1-standard-4) - max speed 1.5M entries/sec.
 - sequential read with 2 client iterators (one filters by key, the other by timestamp), with 5 client machines + 2 tservers (n1-standard-4) and fewer than 60k entries returned - max speed 1M+ entries/sec.

Maxim

On Mon, Jun 25, 2018 at 12:57 AM Christopher <[hidden email]> wrote:
Ah, ok. One of the comments on the issue led me to believe that it was the same issue as the missing custom log closer.

On Sat, Jun 23, 2018, 01:10 Stephen Meyles <[hidden email]> wrote:
> I'm not convinced this is a write pattern issue, though. I commented on...

The note there suggests the need for a LogCloser implementation; in my (ADLS) case I've written one and have it configured - the exception I'm seeing involves failures during writes, not during recovery (though it then leads to a need for recovery).

S.

On Fri, Jun 22, 2018 at 4:33 PM, Christopher <[hidden email]> wrote:
Unfortunately, that feature wasn't added until 2.0, which hasn't yet been released, but I'm hoping it will be later this year.

However, I'm not convinced this is a write pattern issue. I commented on https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103#issuecomment-399608543
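
(For reference, the 2.0 feature alluded to here is, as I understand it, a volume chooser that can prefer one volume for WALs and another for table files. A sketch under that assumption - the property names below are from memory and must be verified against the 2.0 documentation before use:)

```properties
# accumulo.properties (2.0 style) - sketch only; verify property names.
# Example layout: keep WALs on HDFS, table files on GCS.
instance.volumes=hdfs://namenode:8020/accumulo,gs://my-bucket/accumulo
general.volume.chooser=org.apache.accumulo.server.fs.PreferredVolumeChooser
general.custom.volume.preferred.default=gs://my-bucket/accumulo
general.custom.volume.preferred.logger=hdfs://namenode:8020/accumulo
```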

On Fri, Jun 22, 2018 at 1:50 PM Stephen Meyles <[hidden email]> wrote:
Knowing that HBase has been run successfully on ADLS, I went looking there (as it has the same WAL write pattern). This is informative:

    https://www.cloudera.com/documentation/enterprise/5-12-x/topics/admin_using_adls_storage_with_hbase.html

which suggests a need to split the WALs off onto HDFS proper versus ADLS (or presumably GCS), barring changes in the underlying semantics of each. AFAICT you can't currently configure Accumulo to send WALs to a separate cluster - is this correct?

S.


On Fri, Jun 22, 2018 at 9:07 AM, Stephen Meyles <[hidden email]> wrote:
> Did you try to adjust any Accumulo properties to do bigger writes less frequently or something like that?

We're using BatchWriters and sending reasonably large batches of Mutations. Given that the stack traces in both our cases relate to WAL writes, batch size seems like the only tweak available here (though, without reading the code carefully, it's not even clear to me that it is impactful), but if others have suggestions I'd be happy to try.

Given we have this working well and stably in other clusters atop traditional HDFS, I'm currently pursuing this further with Microsoft to understand the variance relative to ADLS. Depending on what emerges from that, I may circle back with more details and a bug report and start digging more deeply into the relevant code in Accumulo.

S.


On Fri, Jun 22, 2018 at 6:09 AM, Maxim Kolchin <[hidden email]> wrote:
> If somebody is interested in using Accumulo on GCS, I'd like to encourage them to submit any bugs they encounter, and any patches (if they are able) which resolve those bugs.

I'd like to contribute a fix, but I don't know where to start. We tried to get help from Google Support about [1] over email, but they just say that GCS doesn't support such a write pattern. In the end, we can only guess how to adjust Accumulo's behaviour to minimise broken connections to GCS.

[1]: https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103

BTW, although we observe this exception, the tablet server doesn't fail, which means that after some retries it is able to write WALs to GCS.
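
(That matches the usual transient-failure picture: the stream dies, the write is retried, and eventually succeeds. As a self-contained illustration - plain Java, not Accumulo's actual retry code - of retry-with-backoff around an unreliable write:)

```java
// A self-contained sketch (not Accumulo code) of the retry-with-backoff
// behaviour described above: keep retrying a failing write until it
// succeeds or the attempt budget runs out.
public class RetrySketch {

    /** A write operation that may fail transiently. */
    interface Write {
        void run() throws Exception;
    }

    /**
     * Attempts {@code write} up to {@code maxAttempts} times, doubling the
     * pause between attempts. Returns the number of attempts used.
     */
    static int writeWithRetries(Write write, int maxAttempts, long initialPauseMs)
            throws Exception {
        long pause = initialPauseMs;
        for (int attempt = 1; ; attempt++) {
            try {
                write.run();
                return attempt;
            } catch (Exception e) {
                if (attempt >= maxAttempts) {
                    throw e; // retries exhausted; surface the failure
                }
                Thread.sleep(pause);
                pause *= 2; // exponential backoff
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulate a write that fails twice before succeeding.
        int[] failuresLeft = {2};
        int attempts = writeWithRetries(() -> {
            if (failuresLeft[0]-- > 0) {
                throw new java.io.IOException("connection to object store broke");
            }
        }, 5, 1);
        System.out.println("succeeded after " + attempts + " attempts");
        // prints: succeeded after 3 attempts
    }
}
```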

@Stephen,

> as discussions with MS engineers have suggested, similar to the GCS thread, that small writes at high volume are, at best, suboptimal for ADLS.

Did you try to adjust any Accumulo properties to do bigger writes less frequently or something like that?


Maxim

On Thu, Jun 21, 2018 at 7:17 AM Stephen Meyles <[hidden email]> wrote:
I think we're seeing something similar but in our case we're trying to run Accumulo atop ADLS. When we generate sufficient write load we start to see stack traces like the following:

[log.DfsLogger] ERROR: Failed to write log entries
java.io.IOException: attempting to write to a closed stream;
at com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:88)
at com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:77)
at org.apache.hadoop.fs.adl.AdlFsOutputStream.write(AdlFsOutputStream.java:57)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:48)
at java.io.DataOutputStream.write(DataOutputStream.java:88)
at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
at org.apache.accumulo.tserver.logger.LogFileKey.write(LogFileKey.java:87)
at org.apache.accumulo.tserver.log.DfsLogger.write(DfsLogger.java:537)

We have developed a rudimentary LogCloser implementation that allows us to recover from this, but overall performance is still significantly impacted.

> As for the WAL closing issue on GCS, I recall a previous thread about that

I searched for this but wasn't able to find anything, nor anything similar re: ADL. I am also curious about the earlier question:

>> Does Accumulo have a specific write pattern [to WALs], so that file system may not support it?

since discussions with MS engineers have suggested, similar to the GCS thread, that small writes at high volume are, at best, suboptimal for ADLS.

Regards

Stephen
 

On Wed, Jun 20, 2018 at 11:20 AM, Christopher <[hidden email]> wrote:
For what it's worth, this is an Apache project, not a Sqrrl project. Amazon is free to contribute to Accumulo to improve its support of their platform, just as anybody is free to do. Amazon may start contributing more as a result of their acquisition... or they may not. There is no reason to expect that their acquisition will have any impact whatsoever on the platforms Accumulo supports, because Accumulo is not, and has not ever been, a Sqrrl project (although some Sqrrl employees have contributed), and thus will not become an Amazon project. It has been, and will remain, a vendor-neutral Apache project. Regardless, we welcome contributions from anybody which would improve Accumulo's support of any additional platform alternatives to HDFS, whether it be GCS, S3, or something else.

As for the WAL closing issue on GCS, I recall a previous thread about that... I think a simple patch might be possible to solve that issue, but to date, nobody has contributed a fix. If somebody is interested in using Accumulo on GCS, I'd like to encourage them to submit any bugs they encounter, and any patches (if they are able) which resolve those bugs. If they need help submitting a fix, please ask on the dev@ list.



On Wed, Jun 20, 2018 at 8:21 AM Geoffry Roberts <[hidden email]> wrote:
Maxim,

Interesting that you were able to run Accumulo on GCS.  I never thought of that - good to know.

Since I am now an AWS guy (at least for the time being), in light of the fact that Amazon purchased Sqrrl, I am interested to see what develops.


On Wed, Jun 20, 2018 at 5:15 AM, Maxim Kolchin <[hidden email]> wrote:
Hi Geoffry,

Thank you for the feedback!

Thanks to [1, 2], I was able to run an Accumulo cluster on Google VMs with GCS instead of HDFS, and I used Google Dataproc to run Hadoop jobs on Accumulo. Almost everything worked well until I faced some connection issues with GCS. Quite often, the connection to GCS breaks while writing or closing WALs.

To all,

Does Accumulo have a specific write pattern that some file systems may not support? Are there Accumulo properties I can tune to adjust the write pattern?


Thank you!
Maxim

On Tue, Jun 19, 2018 at 10:31 PM Geoffry Roberts <[hidden email]> wrote:
I tried running Accumulo on Google.  I first tried running it on Google's pre-made Hadoop.  I found the various file paths one must contend with are different on Google than on a straight download from Apache.  It seems they moved things around.  To counter this, I installed my own Hadoop along with Zookeeper and Accumulo on a Google node.  All went well until one fine day when I could no longer log in.  It seems Google had pushed out some changes overnight that broke my client-side Google Cloud installation.  Google referred affected users to a lengthy, error-prone procedure for resolving the issue.

I decided life was too short for this kind of thing and switched to Amazon.  

On Tue, Jun 19, 2018 at 7:34 AM, Maxim Kolchin <[hidden email]> wrote:
Hi all,

Does anyone have experience running Accumulo on top of Google Cloud Storage instead of HDFS? See [1] for some details if you haven't heard of this feature.

I see some discussion (see [2], [3]) around this topic, but it seems less popular than I believe it should be.

[1]: https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage
[2]: https://github.com/apache/accumulo/issues/428
[3]: https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103

Best regards,
Maxim



--
There are ways and there are ways, 

Geoffry Roberts



--
There are ways and there are ways, 

Geoffry Roberts



Re: Accumulo on Google Cloud Storage

Keith Turner
Maxim,

This is very interesting.  Would you be interested in writing an
Accumulo blog post about your experience?  If you are interested I can
help.

Keith

On Tue, Jan 15, 2019 at 10:03 AM Maxim Kolchin <[hidden email]> wrote:

>
> Hi,
>
> I just wanted to leave intermediate feedback on the topic.
>
> So far, Accumulo works pretty well on top of Google Storage. The aforementioned issue still exists, but it doesn't break anything. However, I can't give you any useful performance numbers at the moment.
>
> The cluster:
>
>  - master (with zookeeper) (n1-standard-1) + 2 tservers (n1-standard-4)
>  - 32+ billion entries
>  - 5 tables (excluding system tables)
>
> Some averaged numbers from two use cases:
>
>  - batch write into pre-split tables with 40 client machines + 4 tservers (n1-standard-4) - max speed 1.5M entries/sec.
>  - sequential read with 2 client iterators (one filters by key, the other by timestamp), with 5 client machines + 2 tservers (n1-standard-4) and fewer than 60k entries returned - max speed 1M+ entries/sec.
>
> Maxim
>
> On Mon, Jun 25, 2018 at 12:57 AM Christopher <[hidden email]> wrote:
>>
>> Ah, ok. One of the comments on the issue led me to believe that it was the same issue as the missing custom log closer.
>>
>> On Sat, Jun 23, 2018, 01:10 Stephen Meyles <[hidden email]> wrote:
>>>
>>> > I'm not convinced this is a write pattern issue, though. I commented on..
>>>
>>> The note there suggests the need for a LogCloser implementation; in my (ADLS) case I've written one and have it configured - the exception I'm seeing involves failures during writes, not during recovery (though it then leads to a need for recovery).
>>>
>>> S.
>>>
>>> On Fri, Jun 22, 2018 at 4:33 PM, Christopher <[hidden email]> wrote:
>>>>
>>>> Unfortunately, that feature wasn't added until 2.0, which hasn't yet been released, but I'm hoping it will be later this year.
>>>>
>>>> However, I'm not convinced this is a write pattern issue, though. I commented on https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103#issuecomment-399608543
>>>>
>>>> On Fri, Jun 22, 2018 at 1:50 PM Stephen Meyles <[hidden email]> wrote:
>>>>>
>>>>> Knowing that HBase has been run successfully on ADLS, went looking there (as they have the same WAL write pattern). This is informative:
>>>>>
>>>>>     https://www.cloudera.com/documentation/enterprise/5-12-x/topics/admin_using_adls_storage_with_hbase.html
>>>>>
>>>>> which suggests a need to split the WALs off on HDFS proper versus ADLS (or presumably GCS) barring changes in the underlying semantics of each. AFAICT you can't currently configure Accumulo to send WAL logs to a separate cluster - is this correct?
>>>>>
>>>>> S.
>>>>>
>>>>>
>>>>> On Fri, Jun 22, 2018 at 9:07 AM, Stephen Meyles <[hidden email]> wrote:
>>>>>>
>>>>>> > Did you try to adjust any Accumulo properties to do bigger writes less frequently or something like that?
>>>>>>
>>>>>> We're using BatchWriters and sending reasonably large batches of Mutations. Given that the stack traces in both our cases relate to WAL writes, batch size seems like the only tweak available here (though, without reading the code carefully, it's not even clear to me that it is impactful), but if others have suggestions I'd be happy to try.
>>>>>>
>>>>>> Given we have this working well and stable in other clusters atop traditional HDFS I'm currently pursuing this further with the MS to understand the variance to ADLS. Depending what emerges from that I may circle back with more details and a bug report and start digging in more deeply to the relevant code in Accumulo.
>>>>>>
>>>>>> S.
>>>>>>
>>>>>>
>>>>>> On Fri, Jun 22, 2018 at 6:09 AM, Maxim Kolchin <[hidden email]> wrote:
>>>>>>>
>>>>>>> > If somebody is interested in using Accumulo on GCS, I'd like to encourage them to submit any bugs they encounter, and any patches (if they are able) which resolve those bugs.
>>>>>>>
>>>>>>> I'd like to contribute a fix, but I don't know where to start. We tried to get any help from the Google Support about [1] over email, but they just say that the GCS doesn't support such write pattern. In the end, we can only guess how to adjust the Accumulo behaviour to minimise broken connections to the GCS.
>>>>>>>
>>>>>>> BTW although we observe this exception, the tablet server doesn't fail, so it means that after some retries it is able to write WALs to GCS.
>>>>>>>
>>>>>>> @Stephen,
>>>>>>>
>>>>>>> > as discussions with MS engineers have suggested, similar to the GCS thread, that small writes at high volume are, at best, suboptimal for ADLS.
>>>>>>>
>>>>>>> Did you try to adjust any Accumulo properties to do bigger writes less frequently or something like that?
>>>>>>>
>>>>>>> [1]: https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103
>>>>>>>
>>>>>>> Maxim
>>>>>>>
>>>>>>> On Thu, Jun 21, 2018 at 7:17 AM Stephen Meyles <[hidden email]> wrote:
>>>>>>>>
>>>>>>>> I think we're seeing something similar but in our case we're trying to run Accumulo atop ADLS. When we generate sufficient write load we start to see stack traces like the following:
>>>>>>>>
>>>>>>>> [log.DfsLogger] ERROR: Failed to write log entries
>>>>>>>> java.io.IOException: attempting to write to a closed stream;
>>>>>>>> at com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:88)
>>>>>>>> at com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:77)
>>>>>>>> at org.apache.hadoop.fs.adl.AdlFsOutputStream.write(AdlFsOutputStream.java:57)
>>>>>>>> at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:48)
>>>>>>>> at java.io.DataOutputStream.write(DataOutputStream.java:88)
>>>>>>>> at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
>>>>>>>> at org.apache.accumulo.tserver.logger.LogFileKey.write(LogFileKey.java:87)
>>>>>>>> at org.apache.accumulo.tserver.log.DfsLogger.write(DfsLogger.java:537)
>>>>>>>>
>>>>>>>> We have developed a rudimentary LogCloser implementation that allows us to recover from this but overall performance is significantly impacted by this.
>>>>>>>>
>>>>>>>> > As for the WAL closing issue on GCS, I recall a previous thread about that
>>>>>>>>
>>>>>>>> I searched more for this but wasn't able to find anything, nor similar re: ADL. I am also curious about the earlier question:
>>>>>>>>
>>>>>>>> >> Does Accumulo have a specific write pattern [to WALs], so that file system may not support it?
>>>>>>>>
>>>>>>>> as discussions with MS engineers have suggested, similar to the GCS thread, that small writes at high volume are, at best, suboptimal for ADLS.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>>
>>>>>>>> Stephen
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jun 20, 2018 at 11:20 AM, Christopher <[hidden email]> wrote:
>>>>>>>>>
>>>>>>>>> For what it's worth, this is an Apache project, not a Sqrrl project. Amazon is free to contribute to Accumulo to improve its support of their platform, just as anybody is free to do. Amazon may start contributing more as a result of their acquisition... or they may not. There is no reason to expect that their acquisition will have any impact whatsoever on the platforms Accumulo supports, because Accumulo is not, and has not ever been, a Sqrrl project (although some Sqrrl employees have contributed), and thus will not become an Amazon project. It has been, and will remain, a vendor-neutral Apache project. Regardless, we welcome contributions from anybody which would improve Accumulo's support of any additional platform alternatives to HDFS, whether it be GCS, S3, or something else.
>>>>>>>>>
>>>>>>>>> As for the WAL closing issue on GCS, I recall a previous thread about that... I think a simple patch might be possible to solve that issue, but to date, nobody has contributed a fix. If somebody is interested in using Accumulo on GCS, I'd like to encourage them to submit any bugs they encounter, and any patches (if they are able) which resolve those bugs. If they need help submitting a fix, please ask on the dev@ list.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Jun 20, 2018 at 8:21 AM Geoffry Roberts <[hidden email]> wrote:
>>>>>>>>>>
>>>>>>>>>> Maxim,
>>>>>>>>>>
>>>>>>>>>> Interesting that you were able to run A on GCS.  I never thought of that--good to know.
>>>>>>>>>>
>>>>>>>>>> Since I am now an AWS guy (at least for the time being), in light of the fact that Amazon purchased Sqrrl, I am interested to see what develops.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Jun 20, 2018 at 5:15 AM, Maxim Kolchin <[hidden email]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Geoffry,
>>>>>>>>>>>
>>>>>>>>>>> Thank you for the feedback!
>>>>>>>>>>>
>>>>>>>>>>> Thanks to [1, 2], I was able to run an Accumulo cluster on Google VMs with GCS instead of HDFS, and I used Google Dataproc to run Hadoop jobs on Accumulo. Almost everything worked well until I faced some connection issues with GCS. Quite often, the connection to GCS breaks while writing or closing WALs.
>>>>>>>>>>>
>>>>>>>>>>> To all,
>>>>>>>>>>>
>>>>>>>>>>> Does Accumulo have a specific write pattern, so that file system may not support it? Are there Accumulo properties which I can play with to adjust the write pattern?
>>>>>>>>>>>
>>>>>>>>>>> [1]: https://github.com/cybermaggedon/accumulo-gs
>>>>>>>>>>> [2]: https://github.com/cybermaggedon/accumulo-docker
>>>>>>>>>>>
>>>>>>>>>>> Thank you!
>>>>>>>>>>> Maxim
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jun 19, 2018 at 10:31 PM Geoffry Roberts <[hidden email]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> I tried running Accumulo on Google.  I first tried running it on Google's pre-made Hadoop.  I found the various file paths one must contend with are different on Google than on a straight download from Apache.  It seems they moved things around.  To counter this, I installed my own Hadoop along with Zookeeper and Accumulo on a Google node.  All went well until one fine day when I could no longer log in.  It seems Google had pushed out some changes over night that broke my client side Google Cloud installation.  Google referred the affected to a lengthy, easy-to-make-a-mistake procedure for resolving the issue.
>>>>>>>>>>>>
>>>>>>>>>>>> I decided life was too short for this kind of thing and switched to Amazon.
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Jun 19, 2018 at 7:34 AM, Maxim Kolchin <[hidden email]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Does anyone have experience running Accumulo on top of Google Cloud Storage instead of HDFS? In [1] you can see some details if you never heard about this feature.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I see some discussion (see [2], [3]) around this topic, but it looks to me that this isn't as popular as, I believe, should be.
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1]: https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage
>>>>>>>>>>>>> [2]: https://github.com/apache/accumulo/issues/428
>>>>>>>>>>>>> [3]: https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>> Maxim
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> There are ways and there are ways,
>>>>>>>>>>>>
>>>>>>>>>>>> Geoffry Roberts
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> There are ways and there are ways,
>>>>>>>>>>
>>>>>>>>>> Geoffry Roberts
>>>>>>>>
>>>>>>>>
>>>>>>

Re: Accumulo on Google Cloud Storage

Bob Thorman
And will you be so kind as to share the link with this email distro please?

On 1/16/19, 11:41 AM, "Keith Turner" <[hidden email]> wrote:

    Maxim,
   
    This is very interesting.  Would you be interested in writing an
    Accumulo blog post about your experience?  If you are interested I can
    help.
   
    Keith
   
    On Tue, Jan 15, 2019 at 10:03 AM Maxim Kolchin <[hidden email]> wrote:
    >
    > Hi,
    >
    > I just wanted to leave intermediate feedback on the topic.
    >
    > So far, Accumulo works pretty well on top of Google Storage. The aforementioned issue still exists, but it doesn't break anything. However, I can't give you any useful performance numbers at the moment.
    >
    > The cluster:
    >
    >  - master (with zookeeper) (n1-standard-1) + 2 tservers (n1-standard-4)
    >  - 32+ billion entries
    >  - 5 tables (excluding system tables)
    >
    > Some averaged numbers from two use cases:
    >
    >  - batch write into pre-split tables with 40 client machines + 4 tservers (n1-standard-4) - max speed 1.5M entries/sec.
    >  - sequential read with 2 client iterators (one filters by key, the other by timestamp), with 5 client machines + 2 tservers (n1-standard-4) and fewer than 60k entries returned - max speed 1M+ entries/sec.
    >
    > Maxim
    >
    > On Mon, Jun 25, 2018 at 12:57 AM Christopher <[hidden email]> wrote:
    >>
    >> Ah, ok. One of the comments on the issue led me to believe that it was the same issue as the missing custom log closer.
    >>
    >> On Sat, Jun 23, 2018, 01:10 Stephen Meyles <[hidden email]> wrote:
    >>>
    >>> > I'm not convinced this is a write pattern issue, though. I commented on..
    >>>
    >>> The note there suggests the need for a LogCloser implementation; in my (ADLS) case I've written one and have it configured - the exception I'm seeing involves failures during writes, not during recovery (though it then leads to a need for recovery).
    >>>
    >>> S.
    >>>
    >>> On Fri, Jun 22, 2018 at 4:33 PM, Christopher <[hidden email]> wrote:
    >>>>
    >>>> Unfortunately, that feature wasn't added until 2.0, which hasn't yet been released, but I'm hoping it will be later this year.
    >>>>
    >>>> However, I'm not convinced this is a write pattern issue, though. I commented on https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103#issuecomment-399608543
    >>>>
    >>>> On Fri, Jun 22, 2018 at 1:50 PM Stephen Meyles <[hidden email]> wrote:
    >>>>>
    >>>>> Knowing that HBase has been run successfully on ADLS, went looking there (as they have the same WAL write pattern). This is informative:
    >>>>>
    >>>>>     https://www.cloudera.com/documentation/enterprise/5-12-x/topics/admin_using_adls_storage_with_hbase.html
    >>>>>
    >>>>> which suggests a need to split the WALs off on HDFS proper versus ADLS (or presumably GCS) barring changes in the underlying semantics of each. AFAICT you can't currently configure Accumulo to send WAL logs to a separate cluster - is this correct?
    >>>>>
    >>>>> S.
    >>>>>
    >>>>>
    >>>>> On Fri, Jun 22, 2018 at 9:07 AM, Stephen Meyles <[hidden email]> wrote:
    >>>>>>
    >>>>>> > Did you try to adjust any Accumulo properties to do bigger writes less frequently or something like that?
    >>>>>>
    >>>>>> We're using BatchWriters and sending reasonably large batches of Mutations. Given that the stack traces in both our cases relate to WAL writes, batch size seems like the only tweak available here (though, without reading the code carefully, it's not even clear to me that it is impactful), but if others have suggestions I'd be happy to try.
    >>>>>>
    >>>>>> Given we have this working well and stable in other clusters atop traditional HDFS I'm currently pursuing this further with the MS to understand the variance to ADLS. Depending what emerges from that I may circle back with more details and a bug report and start digging in more deeply to the relevant code in Accumulo.
    >>>>>>
    >>>>>> S.
    >>>>>>
    >>>>>>
    >>>>>> On Fri, Jun 22, 2018 at 6:09 AM, Maxim Kolchin <[hidden email]> wrote:
    >>>>>>>
    >>>>>>> > If somebody is interested in using Accumulo on GCS, I'd like to encourage them to submit any bugs they encounter, and any patches (if they are able) which resolve those bugs.
    >>>>>>>
    >>>>>>> I'd like to contribute a fix, but I don't know where to start. We tried to get any help from the Google Support about [1] over email, but they just say that the GCS doesn't support such write pattern. In the end, we can only guess how to adjust the Accumulo behaviour to minimise broken connections to the GCS.
    >>>>>>>
    >>>>>>> BTW although we observe this exception, the tablet server doesn't fail, so it means that after some retries it is able to write WALs to GCS.
    >>>>>>>
    >>>>>>> @Stephen,
    >>>>>>>
    >>>>>>> > as discussions with MS engineers have suggested, similar to the GCS thread, that small writes at high volume are, at best, suboptimal for ADLS.
    >>>>>>>
    >>>>>>> Did you try to adjust any Accumulo properties to do bigger writes less frequently or something like that?
    >>>>>>>
    >>>>>>> [1]: https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103
    >>>>>>>
    >>>>>>> Maxim
    >>>>>>>
    >>>>>>> On Thu, Jun 21, 2018 at 7:17 AM Stephen Meyles <[hidden email]> wrote:
    >>>>>>>>
    >>>>>>>> I think we're seeing something similar but in our case we're trying to run Accumulo atop ADLS. When we generate sufficient write load we start to see stack traces like the following:
    >>>>>>>>
    >>>>>>>> [log.DfsLogger] ERROR: Failed to write log entries
    >>>>>>>> java.io.IOException: attempting to write to a closed stream;
    >>>>>>>> at com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:88)
    >>>>>>>> at com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:77)
    >>>>>>>> at org.apache.hadoop.fs.adl.AdlFsOutputStream.write(AdlFsOutputStream.java:57)
    >>>>>>>> at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:48)
    >>>>>>>> at java.io.DataOutputStream.write(DataOutputStream.java:88)
    >>>>>>>> at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
    >>>>>>>> at org.apache.accumulo.tserver.logger.LogFileKey.write(LogFileKey.java:87)
    >>>>>>>> at org.apache.accumulo.tserver.log.DfsLogger.write(DfsLogger.java:537)
    >>>>>>>>
    >>>>>>>> We have developed a rudimentary LogCloser implementation that allows us to recover from this but overall performance is significantly impacted by this.
    >>>>>>>>
    >>>>>>>> > As for the WAL closing issue on GCS, I recall a previous thread about that
    >>>>>>>>
    >>>>>>>> I searched more for this but wasn't able to find anything, nor similar re: ADL. I am also curious about the earlier question:
    >>>>>>>>
    >>>>>>>> >> Does Accumulo have a specific write pattern [to WALs], so that file system may not support it?
    >>>>>>>>
    >>>>>>>> as discussions with MS engineers have suggested, similar to the GCS thread, that small writes at high volume are, at best, suboptimal for ADLS.
    >>>>>>>>
    >>>>>>>> Regards
    >>>>>>>>
    >>>>>>>> Stephen
    >>>>>>>>
    >>>>>>>>
    >>>>>>>> On Wed, Jun 20, 2018 at 11:20 AM, Christopher <[hidden email]> wrote:
    >>>>>>>>>
    >>>>>>>>> For what it's worth, this is an Apache project, not a Sqrrl project. Amazon is free to contribute to Accumulo to improve its support of their platform, just as anybody is free to do. Amazon may start contributing more as a result of their acquisition... or they may not. There is no reason to expect that their acquisition will have any impact whatsoever on the platforms Accumulo supports, because Accumulo is not, and has not ever been, a Sqrrl project (although some Sqrrl employees have contributed), and thus will not become an Amazon project. It has been, and will remain, a vendor-neutral Apache project. Regardless, we welcome contributions from anybody which would improve Accumulo's support of any additional platform alternatives to HDFS, whether it be GCS, S3, or something else.
    >>>>>>>>>
    >>>>>>>>> As for the WAL closing issue on GCS, I recall a previous thread about that... I think a simple patch might be possible to solve that issue, but to date, nobody has contributed a fix. If somebody is interested in using Accumulo on GCS, I'd like to encourage them to submit any bugs they encounter, and any patches (if they are able) which resolve those bugs. If they need help submitting a fix, please ask on the dev@ list.
    >>>>>>>>>
    >>>>>>>>>
    >>>>>>>>>
    >>>>>>>>> On Wed, Jun 20, 2018 at 8:21 AM Geoffry Roberts <[hidden email]> wrote:
    >>>>>>>>>>
    >>>>>>>>>> Maxim,
    >>>>>>>>>>
    >>>>>>>>>> Interesting that you were able to run Accumulo on GCS.  I never thought of that--good to know.
    >>>>>>>>>>
    >>>>>>>>>> Since I am now an AWS guy (at least for the time being), in light of the fact that Amazon purchased Sqrrl, I am interested to see what develops.
    >>>>>>>>>>
    >>>>>>>>>>
    >>>>>>>>>> On Wed, Jun 20, 2018 at 5:15 AM, Maxim Kolchin <[hidden email]> wrote:
    >>>>>>>>>>>
    >>>>>>>>>>> Hi Geoffry,
    >>>>>>>>>>>
    >>>>>>>>>>> Thank you for the feedback!
    >>>>>>>>>>>
    >>>>>>>>>>> Thanks to [1, 2], I was able to run an Accumulo cluster on Google VMs with GCS instead of HDFS. And I used Google Dataproc to run Hadoop jobs on Accumulo. Almost everything worked well until I faced some connection issues with GCS. Quite often, the connection to GCS breaks while writing or closing WALs.
    >>>>>>>>>>>
    >>>>>>>>>>> To all,
    >>>>>>>>>>>
    >>>>>>>>>>> Does Accumulo have a specific write pattern that a file system may not support? Are there Accumulo properties I can play with to adjust the write pattern?
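On the tuning question above: in the Accumulo 1.x line a few properties influence how often, and how large, WAL writes and closes are. The names below existed in 1.x releases, but availability and defaults vary by version, so treat this as a hedged sketch and verify against your release's configuration reference:

```properties
# Hedged sketch: WAL-related properties from the Accumulo 1.x line.
# Verify names and defaults against your version's documentation.
tserver.mutation.queue.max=4M   # larger mutation queue -> fewer, larger WAL flushes
tserver.wal.blocksize=1G        # block size hint for WAL files on the FileSystem
tserver.walog.max.size=1G       # maximum WAL size before rolling to a new file

# Pluggable WAL recovery hook (default shown; a custom closer for a
# non-HDFS FileSystem, like the ADLS one mentioned below, would go here)
master.walog.closer.implementation=org.apache.accumulo.server.master.recovery.HadoopLogCloser
```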
    >>>>>>>>>>>
    >>>>>>>>>>> [1]: https://github.com/cybermaggedon/accumulo-gs
    >>>>>>>>>>> [2]: https://github.com/cybermaggedon/accumulo-docker
    >>>>>>>>>>>
    >>>>>>>>>>> Thank you!
    >>>>>>>>>>> Maxim
    >>>>>>>>>>>
    >>>>>>>>>>> On Tue, Jun 19, 2018 at 10:31 PM Geoffry Roberts <[hidden email]> wrote:
    >>>>>>>>>>>>
    >>>>>>>>>>>> I tried running Accumulo on Google.  I first tried running it on Google's pre-made Hadoop.  I found the various file paths one must contend with are different on Google than on a straight download from Apache.  It seems they moved things around.  To counter this, I installed my own Hadoop along with Zookeeper and Accumulo on a Google node.  All went well until one fine day when I could no longer log in.  It seems Google had pushed out some changes overnight that broke my client-side Google Cloud installation.  Google referred those affected to a lengthy, easy-to-make-a-mistake procedure for resolving the issue.
    >>>>>>>>>>>>
    >>>>>>>>>>>> I decided life was too short for this kind of thing and switched to Amazon.
    >>>>>>>>>>>>
    >>>>>>>>>>>> On Tue, Jun 19, 2018 at 7:34 AM, Maxim Kolchin <[hidden email]> wrote:
    >>>>>>>>>>>>>
    >>>>>>>>>>>>> Hi all,
    >>>>>>>>>>>>>
    >>>>>>>>>>>>> Does anyone have experience running Accumulo on top of Google Cloud Storage instead of HDFS? In [1] you can see some details if you never heard about this feature.
    >>>>>>>>>>>>>
    >>>>>>>>>>>>> I see some discussion (see [2], [3]) around this topic, but it looks to me that it isn't as popular as I believe it should be.
    >>>>>>>>>>>>>
    >>>>>>>>>>>>> [1]: https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage
    >>>>>>>>>>>>> [2]: https://github.com/apache/accumulo/issues/428
    >>>>>>>>>>>>> [3]: https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103
    >>>>>>>>>>>>>
    >>>>>>>>>>>>> Best regards,
    >>>>>>>>>>>>> Maxim
    >>>>>>>>>>>>
    >>>>>>>>>>>>
    >>>>>>>>>>>>
    >>>>>>>>>>>>
    >>>>>>>>>>>> --
    >>>>>>>>>>>> There are ways and there are ways,
    >>>>>>>>>>>>
    >>>>>>>>>>>> Geoffry Roberts
    >>>>>>>>>>
    >>>>>>>>>>
    >>>>>>>>>>
    >>>>>>>>>>
    >>>>>>>>>> --
    >>>>>>>>>> There are ways and there are ways,
    >>>>>>>>>>
    >>>>>>>>>> Geoffry Roberts
    >>>>>>>>
    >>>>>>>>
    >>>>>>
   

Reply | Threaded
Open this post in threaded view
|

Re: Accumulo on Google Cloud Storage

Josh Elser-2
In reply to this post by Maxim Kolchin
Thanks for sharing, Maxim.

What kind of failure/recovery testing did you do as a part of this? If
you haven't done any yet, are you planning to do some such testing?

- Josh

On 1/15/19 10:02 AM, Maxim Kolchin wrote:

> Hi,
>
> I just wanted to leave intermediate feedback on the topic.
>
> So far, Accumulo works pretty well on top of Google Storage. The
> aforementioned issue still exists, but it doesn't break anything.
> However, I can't give you any useful performance numbers at the moment.
>
> The cluster:
>
>   - master (with zookeeper) (n1-standard-1) + 2 tservers (n1-standard-4)
>   - 32+ billion entries
>   - 5 tables (excluding system tables)
>
> Some averaged numbers from two use cases:
>
>   - batch write into pre-split tables with 40 client machines + 4
> tservers (n1-standard-4) - max speed 1.5M entries/sec.
>   - sequential read with 2 client iterators (1 filters by key, 2
> filters by timestamp), with 5 client machines + 2 tservers
> (n1-standard-4) and less than 60k entries returned - max speed 1M+
> entries/sec.
>
> Maxim
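Normalized per tablet server, the numbers quoted above work out as follows (plain arithmetic on the quoted figures, nothing new measured):

```java
public class PerServerRates {
    public static void main(String[] args) {
        // Figures quoted in the message above
        double writeEntriesPerSec = 1_500_000; // 40 clients, 4 tservers
        int writeTservers = 4;
        double readEntriesPerSec = 1_000_000;  // 5 clients, 2 tservers
        int readTservers = 2;
        System.out.printf("~%.0fk writes/sec and ~%.0fk reads/sec per tserver%n",
                writeEntriesPerSec / writeTservers / 1000,
                readEntriesPerSec / readTservers / 1000);
    }
}
```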
>
> On Mon, Jun 25, 2018 at 12:57 AM Christopher <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     Ah, ok. One of the comments on the issue led me to believe that it
>     was the same issue as the missing custom log closer.
>
>     On Sat, Jun 23, 2018, 01:10 Stephen Meyles <[hidden email]
>     <mailto:[hidden email]>> wrote:
>
>          > I'm not convinced this is a write pattern issue, though. I
>         commented on..
>
>         The note there suggests the need for a LogCloser implementation;
>         in my (ADLS) case I've written one and have it configured - the
>         exception I'm seeing involves failures during writes, not during
>         recovery (though it then leads to a need for recovery).
>
>         S.
>
>         On Fri, Jun 22, 2018 at 4:33 PM, Christopher
>         <[hidden email] <mailto:[hidden email]>> wrote:
>
>             Unfortunately, that feature wasn't added until 2.0, which
>             hasn't yet been released, but I'm hoping it will be later
>             this year.
>
>             However, I'm not convinced this is a write pattern issue,
>             though. I commented on
>             https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103#issuecomment-399608543
>
>             On Fri, Jun 22, 2018 at 1:50 PM Stephen Meyles
>             <[hidden email] <mailto:[hidden email]>> wrote:
>
>                 Knowing that HBase has been run successfully on ADLS,
>                 went looking there (as they have the same WAL write
>                 pattern). This is informative:
>
>                 https://www.cloudera.com/documentation/enterprise/5-12-x/topics/admin_using_adls_storage_with_hbase.html
>
>                 which suggests a need to split the WALs off on HDFS
>                 proper versus ADLS (or presumably GCS) barring changes
>                 in the underlying semantics of each. AFAICT you can't
>                 currently configure Accumulo to send WAL logs to a
>                 separate cluster - is this correct?
>
>                 S.
>
>
>                 On Fri, Jun 22, 2018 at 9:07 AM, Stephen Meyles
>                 <[hidden email] <mailto:[hidden email]>> wrote:
>
>                     > Did you try to adjust any Accumulo properties to do
>                     bigger writes less frequently or something like that?
>
>                     We're using BatchWriters and sending reasonably
>                     large batches of Mutations. Given the stack traces
>                     in both our cases are related to WAL writes it seems
>                     like batch size would be the only tweak available
>                     here (though, without reading the code carefully
>                     it's not even clear to me that is impactful) but if
>                     there others have suggestions I'd be happy to try.
>
>                     Given we have this working well and stable in other
>                     clusters atop traditional HDFS I'm currently
>                     pursuing this further with the MS to understand the
>                     variance to ADLS. Depending what emerges from that I
>                     may circle back with more details and a bug report
>                     and start digging in more deeply to the relevant
>                     code in Accumulo.
>
>                     S.
>
>
>                     On Fri, Jun 22, 2018 at 6:09 AM, Maxim Kolchin
>                     <[hidden email] <mailto:[hidden email]>>
>                     wrote:
>
>                         > If somebody is interested in using Accumulo on GCS, I'd like to encourage them to submit any bugs they encounter, and any patches (if they are able) which resolve those bugs.
>
>                         I'd like to contribute a fix, but I don't know
>                         where to start. We tried to get help from
>                         Google Support about [1] over email, but they
>                         just say that GCS doesn't support such a write
>                         pattern. In the end, we can only guess how
>                         to adjust the Accumulo behaviour to minimise
>                         broken connections to the GCS.
>
>                         BTW, although we observe this exception, the
>                         tablet server doesn't fail, which means that
>                         after some retries it is able to write WALs to GCS.
>
>                         @Stephen,
>
>                         > as discussions with MS engineers have suggested,
>                         similar to the GCS thread, that small writes at
>                         high volume are, at best, suboptimal for ADLS.
>
>                         Did you try to adjust any Accumulo properties to
>                         do bigger writes less frequently or something
>                         like that?
>
>                         [1]:
>                         https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103
>
>                         Maxim
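The survival behaviour described above (the tserver keeps going and eventually writes the WAL after some retries) follows the usual retry-with-exponential-backoff shape. A self-contained toy sketch of that pattern, not Accumulo's actual code:

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Toy retry-with-backoff loop, similar in spirit to how a tserver
 *  rides out transient object-store failures (not Accumulo's actual code). */
public class Retry {
    static <T> T withBackoff(Callable<T> op, int maxAttempts, long baseDelayMs)
            throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return op.call();
            } catch (IOException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up after maxAttempts failures
                }
                Thread.sleep(baseDelayMs << (attempt - 1)); // exponential backoff
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated write that fails twice with a broken connection, then succeeds.
        String result = withBackoff(() -> {
            if (++calls[0] < 3) {
                throw new IOException("connection reset by peer");
            }
            return "WAL block written";
        }, 5, 1);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```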
>
>                         On Thu, Jun 21, 2018 at 7:17 AM Stephen Meyles
>                         <[hidden email] <mailto:[hidden email]>>
>                         wrote:
>
>                             I think we're seeing something similar but
>                             in our case we're trying to run Accumulo
>                             atop ADLS. When we generate sufficient write
>                             load we start to see stack traces like the
>                             following:
>
>                             [log.DfsLogger] ERROR: Failed to write log
>                             entries
>                             java.io.IOException: attempting to write to
>                             a closed stream;
>                             at
>                             com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:88)
>                             at
>                             com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:77)
>                             at
>                             org.apache.hadoop.fs.adl.AdlFsOutputStream.write(AdlFsOutputStream.java:57)
>                             at
>                             org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:48)
>                             at
>                             java.io.DataOutputStream.write(DataOutputStream.java:88)
>                             at
>                             java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
>                             at
>                             org.apache.accumulo.tserver.logger.LogFileKey.write(LogFileKey.java:87)
>                             at
>                             org.apache.accumulo.tserver.log.DfsLogger.write(DfsLogger.java:537)
>
>                             We have developed a rudimentary LogCloser
>                             implementation that allows us to recover
>                             from this but overall performance is
>                             significantly impacted by this.
>
>                              > As for the WAL closing issue on GCS, I
>                             recall a previous thread about that
>
>                             I searched more for this but wasn't able to
>                             find anything, nor similar re: ADL. I am
>                             also curious about the earlier question:
>
>                             >> Does Accumulo have a specific write pattern [to WALs], so that file system may not support it?
>
>                             as discussions with MS engineers have
>                             suggested, similar to the GCS thread, that
>                             small writes at high volume are, at best,
>                             suboptimal for ADLS.
>
>                             Regards
>
>                             Stephen
>
Reply | Threaded
Open this post in threaded view
|

Re: Accumulo on Google Cloud Storage

Maxim Kolchin
> Would you be interested in writing an Accumulo blog post about your experience?  If you are interested I can help.

I'd like to do that, but I'm not sure when I'll have time. And unfortunately, I didn't collect enough statistics to show any useful numbers, and the cluster is shut down now.

We discovered that Google Cloud Storage latency is too high for our use case, and we've decided to give another DB that doesn't depend on HDFS/GCS a chance. The average latency for a small file (much less than 1 MB) is around 60 ms to download it (even partially) from GCS and 30 ms to get its metadata.
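Those latency figures put a hard ceiling on serial access. A back-of-envelope sketch (the 60 ms figure is from the paragraph above; the target rate is a hypothetical):

```java
public class LatencyMath {
    public static void main(String[] args) {
        double readLatencyMs = 60.0;  // small-file read latency from the message
        double perThread = 1000.0 / readLatencyMs; // serial reads/sec per client thread
        double targetPerSec = 1000.0; // hypothetical target lookup rate
        // Little's law: required concurrency = rate * latency
        int concurrent = (int) Math.ceil(targetPerSec * readLatencyMs / 1000.0);
        System.out.printf("~%.1f reads/sec per serial client; "
                + "~%d in-flight requests needed for %.0f reads/sec%n",
                perThread, concurrent, targetPerSec);
    }
}
```

So a single serial reader caps out near 17 small reads/sec, and any higher rate has to come from proportional concurrency.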

Actually, there are several articles about GS performance:

> What kind of failure/recovery testing did you do as a part of this?

We didn't do any.

Maxim

On Thu, Jan 17, 2019 at 9:50 PM Josh Elser <[hidden email]> wrote:
Thanks for sharing, Maxim.

What kind of failure/recovery testing did you do as a part of this? If
you haven't done any yet, are you planning to do some such testing?

- Josh

On 1/15/19 10:02 AM, Maxim Kolchin wrote:
> Hi,
>
> I just wanted to leave intermediate feedback on the topic.
>
> So far, Accumulo works pretty well on top of Google Storage. The
> aforementioned issue still exists, but it doesn't break anything.
> However, I can't give you any useful performance numbers at the moment.
>
> The cluster:
>
>   - master (with zookeeper) (n1-standard-1) + 2 tservers (n1-standard-4)
>   - 32+ billlion entries
>   - 5 tables (excluding system tables)
>
> Some averaged numbers from two use cases:
>
>   - batch write into pre-splitted tables with 40 client machines + 4
> tservers (n1-standard-4) - max speed 1.5M entries/sec.
>   - sequential read with 2 client iterators (1 - filters by key, 2-
> filters by timestamp), with 5 client machines +  2 tservers
> (n1-standard-4 ) and less than 60k entries returned - max speed 1M+
> entries/sec.
>
> Maxim
>
> On Mon, Jun 25, 2018 at 12:57 AM Christopher <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     Ah, ok. One of the comments on the issue led me to believe that it
>     was the same issue as the missing custom log closer.
>
>     On Sat, Jun 23, 2018, 01:10 Stephen Meyles <[hidden email]
>     <mailto:[hidden email]>> wrote:
>
>          > I'm not convinced this is a write pattern issue, though. I
>         commented on..
>
>         The note there suggests the need for a LogCloser implementation;
>         in my (ADLS) case I've written one and have it configured - the
>         exception I'm seeing involves failures during writes, not during
>         recovery (though it then leads to a need for recovery).
>
>         S.
>
>         On Fri, Jun 22, 2018 at 4:33 PM, Christopher
>         <[hidden email] <mailto:[hidden email]>> wrote:
>
>             Unfortunately, that feature wasn't added until 2.0, which
>             hasn't yet been released, but I'm hoping it will be later
>             this year.
>
>             However, I'm not convinced this is a write pattern issue,
>             though. I commented on
>             https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103#issuecomment-399608543
>
>             On Fri, Jun 22, 2018 at 1:50 PM Stephen Meyles
>             <[hidden email] <mailto:[hidden email]>> wrote:
>
>                 Knowing that HBase has been run successfully on ADLS,
>                 went looking there (as they have the same WAL write
>                 pattern). This is informative:
>
>                 https://www.cloudera.com/documentation/enterprise/5-12-x/topics/admin_using_adls_storage_with_hbase.html
>
>                 which suggests a need to split the WALs off on HDFS
>                 proper versus ADLS (or presumably GCS) barring changes
>                 in the underlying semantics of each. AFAICT you can't
>                 currently configure Accumulo to send WAL logs to a
>                 separate cluster - is this correct?
>
>                 S.
>
>
>                 On Fri, Jun 22, 2018 at 9:07 AM, Stephen Meyles
>                 <[hidden email] <mailto:[hidden email]>> wrote:
>
>                     > Did you try to adjust any Accumulo properties to do
>                     bigger writes less frequently or something like that?
>
>                     We're using BatchWriters and sending reasonable
>                     larges batches of Mutations. Given the stack traces
>                     in both our cases are related to WAL writes it seems
>                     like batch size would be the only tweak available
>                     here (though, without reading the code carefully
>                     it's not even clear to me that is impactful) but if
>                     there others have suggestions I'd be happy to try.
>
>                     Given we have this working well and stable in other
>                     clusters atop traditional HDFS I'm currently
>                     pursuing this further with the MS to understand the
>                     variance to ADLS. Depending what emerges from that I
>                     may circle back with more details and a bug report
>                     and start digging in more deeply to the relevant
>                     code in Accumulo.
>
>                     S.
>
>
>                     On Fri, Jun 22, 2018 at 6:09 AM, Maxim Kolchin
>                     <[hidden email] <mailto:[hidden email]>>
>                     wrote:
>
>                         > If somebody is interested in using Accumulo on GCS, I'd like to encourage them to submit any bugs they encounter, and any patches (if they are able) which resolve those bugs.
>
>                         I'd like to contribute a fix, but I don't know
>                         where to start. We tried to get any help from
>                         the Google Support about [1] over email, but
>                         they just say that the GCS doesn't support such
>                         write pattern. In the end, we can only guess how
>                         to adjust the Accumulo behaviour to minimise
>                         broken connections to the GCS.
>
>                         BTW although we observe this exception, the
>                         tablet server doesn't fail, so it means that
>                         after some retries it is able to write WALs to GCS.
>
>                         @Stephen,
>
>                         > as discussions with MS engineers have suggested,
>                         similar to the GCS thread, that small writes at
>                         high volume are, at best, suboptimal for ADLS.
>
>                         Did you try to adjust any Accumulo properties to
>                         do bigger writes less frequently or something
>                         like that?
>
>                         [1]:
>                         https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103
>
>                         Maxim
>
>                         On Thu, Jun 21, 2018 at 7:17 AM Stephen Meyles
>                         <[hidden email] <mailto:[hidden email]>>
>                         wrote:
I think we're seeing something similar, but in our case we're trying to run Accumulo atop ADLS. When we generate sufficient write load we start to see stack traces like the following:

[log.DfsLogger] ERROR: Failed to write log entries
java.io.IOException: attempting to write to a closed stream;
    at com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:88)
    at com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:77)
    at org.apache.hadoop.fs.adl.AdlFsOutputStream.write(AdlFsOutputStream.java:57)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:48)
    at java.io.DataOutputStream.write(DataOutputStream.java:88)
    at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
    at org.apache.accumulo.tserver.logger.LogFileKey.write(LogFileKey.java:87)
    at org.apache.accumulo.tserver.log.DfsLogger.write(DfsLogger.java:537)

We have developed a rudimentary LogCloser implementation that allows us to recover from this, but overall performance is significantly impacted.

> As for the WAL closing issue on GCS, I recall a previous thread about that

I searched more for this but wasn't able to find anything, nor anything similar regarding ADL. I am also curious about the earlier question:

>> Does Accumulo have a specific write pattern [to WALs], so that file system may not support it?

as discussions with MS engineers have suggested, similar to the GCS thread, that small writes at high volume are, at best, suboptimal for ADLS.

Regards

Stephen
On Wed, Jun 20, 2018 at 11:20 AM, Christopher <[hidden email]> wrote:

For what it's worth, this is an Apache project, not a Sqrrl project. Amazon is free to contribute to Accumulo to improve its support of their platform, just as anybody is free to do. Amazon may start contributing more as a result of their acquisition... or they may not. There is no reason to expect that the acquisition will have any impact on the platforms Accumulo supports, because Accumulo is not, and has never been, a Sqrrl project (although some Sqrrl employees have contributed), and thus will not become an Amazon project. It has been, and will remain, a vendor-neutral Apache project. Regardless, we welcome contributions from anybody that would improve Accumulo's support of alternatives to HDFS, whether GCS, S3, or something else.

As for the WAL closing issue on GCS, I recall a previous thread about that... I think a simple patch might be possible to solve that issue, but to date, nobody has contributed a fix. If somebody is interested in using Accumulo on GCS, I'd encourage them to submit any bugs they encounter, and any patches (if they are able) which resolve those bugs. If they need help submitting a fix, please ask on the dev@ list.
On Wed, Jun 20, 2018 at 8:21 AM Geoffry Roberts <[hidden email]> wrote:

Maxim,

Interesting that you were able to run Accumulo on GCS. I never thought of that--good to know.

Since I am now an AWS guy (at least for the time being), in light of the fact that Amazon purchased Sqrrl, I am interested to see what develops.
On Wed, Jun 20, 2018 at 5:15 AM, Maxim Kolchin <[hidden email]> wrote:

Hi Geoffry,

Thank you for the feedback!

Thanks to [1, 2], I was able to run an Accumulo cluster on Google VMs with GCS instead of HDFS. And I used Google Dataproc to run Hadoop jobs on Accumulo. Almost everything was fine until I faced some connection issues with GCS. Quite often, the connection to GCS breaks on writing or closing WALs.

To all,

Does Accumulo have a specific write pattern, such that the file system may not support it? Are there Accumulo properties I can play with to adjust the write pattern?

[1]: https://github.com/cybermaggedon/accumulo-gs
[2]: https://github.com/cybermaggedon/accumulo-docker

Thank you!
Maxim
--
There are ways and there are ways,

Geoffry Roberts
>
Reply | Threaded
Open this post in threaded view
|

All tablets are down

pranav.puri
In reply to this post by Maxim Kolchin

Hi,

I have set up Accumulo on a two-node Hadoop HA cluster. In this setup, the RFile kept under +r (an RFile belonging to the accumulo.root table) has been corrupted. This was found with the fsck command, and because of it Accumulo was not starting up.

As a troubleshooting step, I removed this RFile and replaced it with an empty file of the same name. After this, Accumulo starts up and all the tables are online, but there are no entries present for these tables, and no tablets exist but one. Also, I am able to scan just the root table, probably because I replaced its file with the new one. If I try to scan any other table, the shell hangs.

Please let me know how to handle this, and please mention if any other details are required.

Regards
Pranav
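The fsck check described above can be reproduced against the root table's directory. A sketch, assuming the default /accumulo volume layout, where table id +r is the root table (adjust paths for your instance):

```
# Report block health for the root table's files (path is illustrative):
hdfs fsck /accumulo/tables/+r -files -blocks -locations
```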

Reply | Threaded
Open this post in threaded view
|

Re: All tablets are down

Michael Wall
Pranav,

Having a corrupt root table and replacing it with an empty file means Accumulo knows nothing about the metadata table. Without the metadata table, Accumulo knows nothing about the other tables.

If you can do so, I suggest starting over.  If you have data you need to keep, you can bulk import the existing RFiles into a new instance if you can associate the files to a table.  Having the old splits would be helpful but is not necessary.
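A minimal sketch of that bulk-import path in the Accumulo shell; the table name, split points, and directories are illustrative, and the recovered RFiles must first be copied somewhere the new instance can read:

```
createtable recovered_table
# Optional: recreate the old split points if you know them
addsplits g m t -t recovered_table
# Import RFiles from /tmp/bulk; files that fail to import land in the
# (initially empty) /tmp/bulk-fail directory
importdirectory /tmp/bulk /tmp/bulk-fail true
```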

What version of Accumulo are you using?  Take a look at https://accumulo.apache.org/1.9/accumulo_user_manual.html#_advanced_system_recovery.

It would be good to understand what led to the corruption in the accumulo.root table.

Mike

On Wed, Feb 27, 2019 at 7:27 AM pranav.puri <[hidden email]> wrote:

