Accumulo performance on various hardware configurations

23 messages
Accumulo performance on various hardware configurations

guy sharon
hi,

Continuing my performance benchmarks, I'm still trying to figure out if the results I'm getting are reasonable and why throwing more hardware at the problem doesn't help. What I'm doing is a full table scan on a table with 6M entries. This is Accumulo 1.7.4 with Zookeeper 3.4.12 and Hadoop 2.8.4. The table is populated by org.apache.accumulo.examples.simple.helloworld.InsertWithBatchWriter modified to write 6M entries instead of 50k. Reads are performed by "bin/accumulo org.apache.accumulo.examples.simple.helloworld.ReadData -i muchos -z localhost:2181 -u root -t hellotable -p secret". Here are the results I got:

1. 5 tserver cluster as configured by Muchos (https://github.com/apache/fluo-muchos), running on m5d.large AWS machines (2vCPU, 8GB RAM) running CentOS 7. Master is on a separate server. Scan took 12 seconds.
2. As above except with m5d.xlarge (4vCPU, 16GB RAM). Same results.
3. Splitting the table to 4 tablets causes the runtime to increase to 16 seconds.
4. 7 tserver cluster running m5d.xlarge servers. 12 seconds.
5. Single node cluster on m5d.12xlarge (48 cores, 192GB RAM), running Amazon Linux. Configuration as provided by Uno (https://github.com/apache/fluo-uno). Total time was 26 seconds.

Offhand I would say this is very slow. I'm guessing I'm making some sort of newbie (possibly configuration) mistake but I can't figure out what it is. Can anyone point me to something that might help me find out what it is?

thanks,
Guy.



Re: Accumulo performance on various hardware configurations

Marc
Guy,
  The ReadData example appears to use a sequential scanner. Can you
change that to a batch scanner and see if there is improvement [1]?
Also, while you are there can you remove the log statement or set your
log level so that the trace message isn't printed?

In this case we are reading the entirety of the data. If you were to
perform a query, you would likely prefer to do the work at the data
(e.g., with server-side iterators) instead of bringing all of it back
to the client.

What are your expectations, given that it appears very slow to you? Do
you want faster client-side access to the data? Certainly improvements
could be made -- of that I have no doubt -- but the time to bring 6M
entries to the client is a cost you will incur if you use the ReadData example.

[1] If you have four tablets it's reasonable to suspect that the RPC
time to access those servers may increase a bit.
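For reference, a minimal sketch of what Marc is suggesting: the ReadData loop rewritten around a BatchScanner. This assumes the instance name, table, and credentials from the original post and the 1.7 client API; it is an illustration, not the actual example code.

```java
import java.util.Collections;
import java.util.Map;

import org.apache.accumulo.core.client.BatchScanner;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;

public class BatchReadData {
    public static void main(String[] args) throws Exception {
        Connector conn = new ZooKeeperInstance("muchos", "localhost:2181")
                .getConnector("root", new PasswordToken("secret"));
        // 10 query threads; a single open-ended Range covers the whole table
        BatchScanner scanner =
                conn.createBatchScanner("hellotable", Authorizations.EMPTY, 10);
        try {
            scanner.setRanges(Collections.singletonList(new Range()));
            long count = 0;
            for (Map.Entry<Key, Value> entry : scanner) {
                count++; // no per-entry log statement, per the suggestion above
            }
            System.out.println(count + " entries");
        } finally {
            scanner.close();
        }
    }
}
```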

On Wed, Aug 29, 2018 at 8:05 AM guy sharon <[hidden email]> wrote:


Re: Accumulo performance on various hardware configurations

Marc
Guy,
  To clarify:

[1] If you have four tablets it's reasonable to suspect that the RPC
time to access those servers may increase a bit if you access them
sequentially versus in parallel.
On Wed, Aug 29, 2018 at 8:16 AM Marc <[hidden email]> wrote:


Re: Accumulo performance on various hardware configurations

Jeremy Kepner
In reply to this post by guy sharon
Your node is fairly underpowered (2 cores and 8 GB RAM); that's less
compute than most laptops.  That said,

6M / 12sec = 500K/sec

is good for a single-node Accumulo instance on this hardware.

Splitting might not help, since with only 2 cores the added parallelism
can't be exploited.

Why do you think 500K/sec is slow?

To determine slowness one would have to compare with other database technology on the same platform.
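Spelling that rate out, using the numbers reported in the thread:

```java
public class ScanRate {
    // Throughput of a full scan: entries read divided by elapsed seconds.
    static long entriesPerSecond(long entries, long seconds) {
        return entries / seconds;
    }

    public static void main(String[] args) {
        // 6M entries scanned in 12 seconds, as reported above
        System.out.println(entriesPerSecond(6_000_000L, 12) + " entries/sec"); // 500000 entries/sec
    }
}
```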


On Wed, Aug 29, 2018 at 03:04:51PM +0300, guy sharon wrote:


Re: Accumulo performance on various hardware configurations

guy sharon
In reply to this post by Marc
hi Marc,

Just ran the test again with the changes you suggested. Setup: 5 tservers on CentOS 7, 4 CPUs and 16 GB RAM, Accumulo 1.7.4, table with 6M rows. org.apache.accumulo.examples.simple.helloworld.ReadData now uses a BatchScanner with 10 threads. I got:

$ time install/accumulo-1.7.4/bin/accumulo org.apache.accumulo.examples.simple.helloworld.ReadData -i muchos -z localhost:2181 -u root -t hellotable -p secret

real    0m16.979s
user    0m13.670s
sys    0m0.599s

So this doesn't really improve things, which looks strange to me, as I'd expect Accumulo to use the threads to speed things up. Unless the full scan makes it use just one thread, on the assumption that the entries are adjacent on disk, so it's faster to read them sequentially than to jump back and forth between threads. What do you think?

BR,
Guy.
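A guess, not verified on this cluster: a BatchScanner parallelizes by farming ranges out to tablet servers, so with the whole table in one Range and only a few tablets there is little independent work for 10 client threads. One experiment is to add more table splits and hand the scanner one range per tablet; a sketch against the 1.7 API, reusing the connection details from the thread:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import org.apache.accumulo.core.client.BatchScanner;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;
import org.apache.hadoop.io.Text;

public class SplitScan {
    public static void main(String[] args) throws Exception {
        Connector conn = new ZooKeeperInstance("muchos", "localhost:2181")
                .getConnector("root", new PasswordToken("secret"));

        // Build one range per tablet from the table's current split points.
        List<Range> ranges = new ArrayList<>();
        Text prev = null;
        for (Text split : conn.tableOperations().listSplits("hellotable")) {
            ranges.add(new Range(prev, false, split, true));
            prev = split;
        }
        ranges.add(new Range(prev, false, null, true)); // tail range to +inf

        BatchScanner scanner =
                conn.createBatchScanner("hellotable", Authorizations.EMPTY, 10);
        try {
            scanner.setRanges(ranges);
            long count = 0;
            for (Map.Entry<Key, Value> e : scanner) {
                count++;
            }
            System.out.println(count + " entries");
        } finally {
            scanner.close();
        }
    }
}
```

Even so, client-side deserialization of 6M entries may remain the bottleneck, which would match the flat timings.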




On Wed, Aug 29, 2018 at 3:25 PM Marc <[hidden email]> wrote:

Re: Accumulo performance on various hardware configurations

guy sharon
In reply to this post by Jeremy Kepner
Well, in one experiment I used a machine with 48 cores and 192GB and the results actually came out worse. And in another I had 7 tservers on servers with 4 cores. I think I'm not configuring things correctly because I'd expect the improved hardware to improve performance and that doesn't seem to be the case.

On Wed, Aug 29, 2018 at 4:00 PM Jeremy Kepner <[hidden email]> wrote:

Re: Accumulo performance on various hardware configurations

James Srinivasan
In my limited experience of cloud services, I/O bandwidth seems to be pretty low. Can you run a disk benchmark, e.g. bonnie++?
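bonnie++ gives the fuller picture; as a crude stand-in (my own sketch, not something from the thread), raw sequential read throughput can be estimated with a few lines of Java:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class DiskReadBench {

    // Sequentially reads the file in 1 MB chunks; returns total bytes read.
    static long sequentialRead(Path file) throws IOException {
        byte[] buf = new byte[1 << 20];
        long bytes = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buf)) > 0) bytes += n;
        }
        return bytes;
    }

    public static void main(String[] args) throws IOException {
        // Write a 256 MB scratch file, then time a sequential re-read.
        Path tmp = Files.createTempFile("diskbench", ".dat");
        byte[] chunk = new byte[1 << 20];
        try (OutputStream out = Files.newOutputStream(tmp)) {
            for (int i = 0; i < 256; i++) out.write(chunk);
        }
        long start = System.nanoTime();
        long bytes = sequentialRead(tmp);
        double secs = (System.nanoTime() - start) / 1e9;
        System.out.printf("read %d MB in %.2fs (%.0f MB/s)%n",
                bytes >> 20, secs, bytes / (1024.0 * 1024.0) / secs);
        Files.delete(tmp);
    }
}
```

Caveat: a freshly written file will largely be served from the page cache, so this overstates disk speed; use a file larger than RAM (or drop caches first) for a true disk figure. bonnie++ handles this properly.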

On Wed, 29 Aug 2018, 14:39 guy sharon, <[hidden email]> wrote:

Re: Accumulo performance on various hardware configurations

Jeremy Kepner
In reply to this post by guy sharon
Why do you think 500K/sec is slow?

On Wed, Aug 29, 2018 at 04:39:32PM +0300, guy sharon wrote:


Re: Accumulo performance on various hardware configurations

Josh Elser-2
In reply to this post by guy sharon
Does Muchos actually change the Accumulo configuration when you are
changing the underlying hardware?

On 8/29/18 8:04 AM, guy sharon wrote:


Re: Accumulo performance on various hardware configurations

dlmarion
What's the value of table.scan.max.memory? I would re-run your tests with different values to see if there is a difference.
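For anyone following along: that property can be inspected and changed in the shell (`config -t hellotable -s table.scan.max.memory=4M`) or through the client API. A sketch of the API route, assuming the connection details from the original post; the 4M value is just an example to vary:

```java
import java.util.Map;

import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;

public class ScanBufferTuning {
    public static void main(String[] args) throws Exception {
        Connector conn = new ZooKeeperInstance("muchos", "localhost:2181")
                .getConnector("root", new PasswordToken("secret"));

        // Print the current effective value for the table...
        for (Map.Entry<String, String> prop :
                conn.tableOperations().getProperties("hellotable")) {
            if (prop.getKey().equals("table.scan.max.memory")) {
                System.out.println(prop.getKey() + " = " + prop.getValue());
            }
        }
        // ...then raise it before re-running the scan benchmark.
        conn.tableOperations().setProperty("hellotable",
                "table.scan.max.memory", "4M");
    }
}
```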

> On August 29, 2018 at 11:35 AM Josh Elser <[hidden email]> wrote:

Re: Accumulo performance on various hardware configurations

Josh Elser-2
In reply to this post by guy sharon
To answer your original question: YCSB is a standard benchmarking tool
for databases that provides various types of read/write workloads.

https://github.com/brianfrankcooper/YCSB/tree/master/accumulo1.7

On 8/29/18 8:04 AM, guy sharon wrote:


Re: Accumulo performance on various hardware configurations

Jonathan Yom-Tov
In reply to this post by Jeremy Kepner
I'm not 100% sure it's slow. Coming from an RDBMS background it seems it might be, but I wanted the opinion of others since I'm not experienced with Accumulo. From your reply I assume you think it's reasonable?

On Wed, Aug 29, 2018 at 6:33 PM, Jeremy Kepner <[hidden email]> wrote:


Re: Accumulo performance on various hardware configurations

Jonathan Yom-Tov
In reply to this post by Josh Elser-2
No, it doesn't. What would you recommend changing? Heap space, or something else?

On Wed, Aug 29, 2018 at 6:35 PM, Josh Elser <[hidden email]> wrote:

Reply | Threaded
Open this post in threaded view
|

Re: Accumulo performance on various hardware configurations

Jeremy Kepner
In reply to this post by Jonathan Yom-Tov
50K/sec on any SQL database on a single node would be very good.

On Aug 29, 2018, at 12:37 PM, Jonathan Yom-Tov <[hidden email]> wrote:

I'm not 100% sure it's slow. Coming from RDBMS it seems it might be, but I wanted the opinion of others since I'm not experienced with Accumulo. From your reply I assume you think it's reasonable?

On Wed, Aug 29, 2018 at 6:33 PM, Jeremy Kepner <[hidden email]> wrote:
Why do you think 500K/sec is slow?

On Wed, Aug 29, 2018 at 04:39:32PM +0300, guy sharon wrote:
> Well, in one experiment I used a machine with 48 cores and 192GB and the
> results actually came out worse. And in another I had 7 tservers on servers
> with 4 cores. I think I'm not configuring things correctly because I'd
> expect the improved hardware to improve performance and that doesn't seem
> to be the case.
>
> On Wed, Aug 29, 2018 at 4:00 PM Jeremy Kepner <[hidden email]> wrote:
>
> > Your node is fairly underpowered (2 cores and 8 GB RAM), less powerful than
> > most laptops.  That said,
> >
> > 6M / 12sec = 500K/sec
> >
> > is good for a single node Accumulo instance on this hardware.
> >
> > Splitting might not help since you only have 2 cores, so the added parallelism
> > can't be exploited.
> >
> > Why do you think 500K/sec is slow?
> >
> > To determine slowness one would have to compare with other database
> > technology on the same platform.
> >
> >

Re: Accumulo performance on various hardware configurations

Mike Walch-2
In reply to this post by Josh Elser-2
Muchos does not automatically change its Accumulo configuration to take advantage of better hardware. However, it does have a performance profile setting in its configuration (see link below) where you can select a profile (or create your own) based on the hardware you are using.

https://github.com/apache/fluo-muchos/blob/master/conf/muchos.props.example#L94


Re: Accumulo performance on various hardware configurations

guy sharon
Yes, I tried the high-performance configuration, which translates to a 4G heap size, but that didn't affect performance. Neither did setting table.scan.max.memory to 4096k (the default is 512k). Even if I accept that the read performance here is reasonable, I don't understand why none of the hardware configuration changes (except going to 48 cores, which made things worse) made any difference.
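(For anyone reproducing this: the property change can be made from the shell or programmatically. The snippet below is an illustrative sketch against the 1.7 Java client API, reusing the instance name, credentials and table from the ReadData command quoted earlier in the thread; it needs a live cluster to run.)

```java
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;

public class SetScanMemory {
  public static void main(String[] args) throws Exception {
    // Connection details taken from the ReadData invocation earlier in the thread
    Connector conn = new ZooKeeperInstance("muchos", "localhost:2181")
        .getConnector("root", new PasswordToken("secret"));
    // Shell equivalent: config -t hellotable -s table.scan.max.memory=4096k
    conn.tableOperations().setProperty("hellotable", "table.scan.max.memory", "4096k");
  }
}
```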


Re: Accumulo performance on various hardware configurations

Michael Wall
Guy,

Can you go into specifics about how you are measuring this?  Are you still using "bin/accumulo shell -u root -p secret -e "scan -t hellotable -np" | wc -l" as you mentioned earlier in the thread?  As Mike Miller suggested, serializing that back to the display and then counting 6M entries is going to take some time.  Try using a Batch Scanner directly.
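A minimal sketch of that (1.7 client API; instance name, credentials and table taken from earlier in the thread, with a single open-ended Range to cover the whole table — untested here, it needs the live cluster):

```java
import java.util.Collections;
import java.util.Map.Entry;

import org.apache.accumulo.core.client.BatchScanner;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;

public class BatchCount {
  public static void main(String[] args) throws Exception {
    Connector conn = new ZooKeeperInstance("muchos", "localhost:2181")
        .getConnector("root", new PasswordToken("secret"));
    // 10 query threads; an empty Range spans the entire table
    BatchScanner bs = conn.createBatchScanner("hellotable", Authorizations.EMPTY, 10);
    bs.setRanges(Collections.singletonList(new Range()));
    long count = 0;
    for (Entry<Key,Value> ignored : bs) {
      count++;  // count entries without serializing them to a display
    }
    bs.close();
    System.out.println(count + " entries");
  }
}
```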

Mike


Re: Accumulo performance on various hardware configurations

guy sharon
hi Mike,

As per Mike Miller's suggestion I started using org.apache.accumulo.examples.simple.helloworld.ReadData from Accumulo, with debugging turned off and a BatchScanner with 10 threads. I redid all the measurements and, although this was 20% faster than using the shell, there was no difference once I started playing with the hardware configurations.

Guy.


Re: Accumulo performance on various hardware configurations

Marc
Guy,
   In the case where you added servers and splits, did you check the
tablet locations to see if they migrated to separate hosts?
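One way to check is to scan the metadata table's "loc" column family, which records the tserver currently hosting each tablet. A rough sketch (same connection assumptions as the rest of the thread; requires the live cluster and read permission on accumulo.metadata):

```java
import java.util.Map;

import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;
import org.apache.hadoop.io.Text;

public class TabletLocations {
  public static void main(String[] args) throws Exception {
    Connector conn = new ZooKeeperInstance("muchos", "localhost:2181")
        .getConnector("root", new PasswordToken("secret"));
    // The "loc" family in accumulo.metadata maps each tablet to its current tserver
    Scanner s = conn.createScanner("accumulo.metadata", Authorizations.EMPTY);
    s.fetchColumnFamily(new Text("loc"));
    for (Map.Entry<Key,Value> e : s) {
      // row = tablet extent, value = host:port of the hosting tserver
      System.out.println(e.getKey().getRow() + " -> " + e.getValue());
    }
  }
}
```

If all four tablets show the same host:port, the splits did not spread the load.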


Re: Accumulo performance on various hardware configurations

Michael Wall
In reply to this post by guy sharon
A couple of things to look at/try:

1 - Is the data spread out amongst all the tablets and tservers when you have multiple tservers?
2 - How much of the data is in memory on the tablet server and how much is on disk.  You can try flushing the table before running your scan.
3 - You could also launch a compaction before running your scan to minimize the number of rfiles per tablet.
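For 2 and 3, the shell commands would be "flush -t hellotable -w" and "compact -t hellotable -w"; the same operations are available from the Java API, sketched here with the connection details used elsewhere in the thread (needs the live cluster):

```java
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;

public class FlushAndCompact {
  public static void main(String[] args) throws Exception {
    Connector conn = new ZooKeeperInstance("muchos", "localhost:2181")
        .getConnector("root", new PasswordToken("secret"));
    // Flush in-memory entries to RFiles over the whole table (null start/end row), waiting for completion
    conn.tableOperations().flush("hellotable", null, null, true);
    // Major-compact toward one RFile per tablet; flush first, wait for completion
    conn.tableOperations().compact("hellotable", null, null, true, true);
  }
}
```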

Mike
