[GitHub] accumulo pull request #244: Discuss ACCUMULO-3079: collapsing the iterator s...

classic Classic list List threaded Threaded
67 messages Options
1234
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[GitHub] accumulo pull request #244: Discuss ACCUMULO-3079: collapsing the iterator s...

joshelser
GitHub user milleruntime opened a pull request:

    https://github.com/apache/accumulo/pull/244

    Discuss ACCUMULO-3079: collapsing the iterator stack to improve performance

    Applied the changes from the patch submitted by Adam Fuchs on [ACCUMULO-3079](https://issues.apache.org/jira/browse/ACCUMULO-3079) to master. This PR is to bring these changes back into discussion.  Created JMH benchmark tests [here](https://github.com/milleruntime/jmh-test/blob/master/src/main/java/org/sample/MyBenchmark.java).  I tried to isolate the tests to only the system iterators acting on data in memory.  
   
    I have attached the results from the benchmark tests.
    [jmh-accumulo-benchmark-results.txt](https://github.com/apache/accumulo/files/906333/jmh-accumulo-benchmark-results.txt)
   
    The JMH tests can be built and run from [here](https://github.com/milleruntime/jmh-test). After cloning the repo, simply run:
    `mvn clean install`
    `java -jar target/benchmarks.jar -i 100 -f 1`

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/milleruntime/accumulo ACCUMULO-3079-all

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/accumulo/pull/244.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #244
   
----
commit 747bae1c737b9df2bbd3c7a834a536696d9e304a
Author: Mike Miller <[hidden email]>
Date:   2017-03-31T16:43:45Z

    ACCUMULO-3079: collapsed the iterator stack to improve performance

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[GitHub] accumulo issue #244: Discuss ACCUMULO-3079: collapsing the iterator stack to...

joshelser
Github user joshelser commented on the issue:

    https://github.com/apache/accumulo/pull/244
 
    > VM version: JDK 1.8.0_121, VM 25.121-b13
    > VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_121.jdk/Contents/Home/jre/bin/java
   
    We should definitely run these on Linux.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[GitHub] accumulo issue #244: Discuss ACCUMULO-3079: collapsing the iterator stack to...

joshelser
In reply to this post by joshelser
Github user milleruntime commented on the issue:

    https://github.com/apache/accumulo/pull/244
 
    Here are updated results with @keith-turner 's additional improvements:
    [testresults.txt](https://github.com/apache/accumulo/files/911586/testresults.txt)



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[GitHub] accumulo issue #244: Discuss ACCUMULO-3079: collapsing the iterator stack to...

joshelser
In reply to this post by joshelser
Github user joshelser commented on the issue:

    https://github.com/apache/accumulo/pull/244
 
    >  Created JMH benchmark tests here.
   
    Also, I would be super in-favor with including this in Accumulo. I think there are some licensing things that we would have to figure out (JMH is GPL IIRC), but I think this would be worthwhile as a benchmark is no good if no one else ever runs it :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[GitHub] accumulo issue #244: Discuss ACCUMULO-3079: collapsing the iterator stack to...

joshelser
In reply to this post by joshelser
Github user scubafuchs commented on the issue:

    https://github.com/apache/accumulo/pull/244
 
    Thanks for resurrecting this, folks! The benchmarked performance improvement still looks pretty great -- can't wait to see it in production!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[GitHub] accumulo issue #244: Discuss ACCUMULO-3079: collapsing the iterator stack to...

joshelser
In reply to this post by joshelser
Github user milleruntime commented on the issue:

    https://github.com/apache/accumulo/pull/244
 
    > Also, I would be super in-favor with including this in Accumulo. I think there are some licensing things that we would have to figure out (JMH is GPL IIRC), but I think this would be worthwhile as a benchmark is no good if no one else ever runs it :)
   
    I like the idea.  Any ideas how we would record benchmarks?  Have them run with the release plugin? Then publish with release notes?  
   
    We should also discuss alternatives to JMH.  I only chose it because it was the first one you suggested.   FYI I haven't been able to get the setup() and teardown() annotations to work.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[GitHub] accumulo issue #244: Discuss ACCUMULO-3079: collapsing the iterator stack to...

joshelser
In reply to this post by joshelser
Github user joshelser commented on the issue:

    https://github.com/apache/accumulo/pull/244
 
    > Any ideas how we would record benchmarks? Have them run with the release plugin? Then publish with release notes?
   
    I'm hesitant to suggest publishing them with release notes as the "norm". I'd think that we could just start a new part of the website to capture this (since they are likely only developer-related). If we feel like it is worthy of mentioning in the release notes (e.g. the numbers are not misleading or context-specific), then we can do so. Otherwise, the website can just serve as the record of "results". We could expand this out to include things like ContinuousIngest numbers too (which would be awesome -- don't have to build up a list of numbers from release notes..)
   
    > We should also discuss alternatives to JMH. I only chose it because it was the first one you suggested. FYI I haven't been able to get the setup() and teardown() annotations to work.
   
    Heh. I think I played around with JMH once. For the most part, I think JMH and Caliper are pretty equivalent in terms of functionality. @matthew-dailey, didn't I talk to you about this a long time ago and you had some experience with them both?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[GitHub] accumulo issue #244: Discuss ACCUMULO-3079: collapsing the iterator stack to...

joshelser
In reply to this post by joshelser
Github user scubafuchs commented on the issue:

    https://github.com/apache/accumulo/pull/244
 
    Here's a published benchmark example that I've found valuable in the past: https://home.apache.org/~mikemccand/lucenebench/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[GitHub] accumulo issue #244: Discuss ACCUMULO-3079: collapsing the iterator stack to...

joshelser
In reply to this post by joshelser
Github user matthew-dailey commented on the issue:

    https://github.com/apache/accumulo/pull/244
 
    @joshelser we did talk about this, but I only have JMH experience.  Looking through our convos, I found [this](http://www.google.com/url?q=http%3A%2F%2Fhg.openjdk.java.net%2Fcode-tools%2Fjmh%2Ffile%2Ffa510264b3f6%2Fjmh-samples%2Fsrc%2Fmain%2Fjava%2Forg%2Fopenjdk%2Fjmh%2Fsamples%2FJMHSample_11_Loops.java%23l47&sa=D&sntz=1&usg=AFQjCNF22nEbxn2-JkEhIy2LDeM3iimXpw) and [this](https://groups.google.com/forum/#!topic/mechanical-sympathy/m4opvy4xq3U) as references about JMH being good.  The gist is that JMH is made by OpenJDK developers, so they did a pretty good job designing JMH to get around things like JIT optimizations and properly measure performance.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[GitHub] accumulo issue #244: Discuss ACCUMULO-3079: collapsing the iterator stack to...

joshelser
In reply to this post by joshelser
Github user milleruntime commented on the issue:

    https://github.com/apache/accumulo/pull/244
 
    Aside from discussion of benchmark tests in Accumulo, any more input about the actual code changes?  I am currently running the tests again on my Ubuntu laptop to get a set of Linux system comparisons.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[GitHub] accumulo pull request #244: Discuss ACCUMULO-3079: collapsing the iterator s...

joshelser
In reply to this post by joshelser
Github user joshelser commented on a diff in the pull request:

    https://github.com/apache/accumulo/pull/244#discussion_r112494632
 
    --- Diff: core/src/main/java/org/apache/accumulo/core/iterators/ServerFilter.java ---
    @@ -0,0 +1,80 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.accumulo.core.iterators;
    +
    +import java.io.IOException;
    +import java.util.Collection;
    +import java.util.Map;
    +
    +import org.apache.accumulo.core.data.ByteSequence;
    +import org.apache.accumulo.core.data.Key;
    +import org.apache.accumulo.core.data.Range;
    +import org.apache.accumulo.core.data.Value;
    +
    +/**
    + * A SortedKeyValueIterator that filters entries from its source iterator.
    + *
    + * Subclasses must implement an accept method: public boolean accept(Key k, Value v);
    + *
    + * Key/Value pairs for which the accept method returns true are said to match the filter. By default, this class iterates over entries that match its filter.
    + * This iterator takes an optional "negate" boolean parameter that defaults to false. If negate is set to true, this class instead omits entries that match its
    + * filter, thus iterating over entries that do not match its filter.
    + */
    +public abstract class ServerFilter extends ServerWrappingIterator {
    +
    +  public ServerFilter(SortedKeyValueIterator<Key,Value> source) {
    +    super(source);
    +  }
    +
    +  @Override
    +  public abstract SortedKeyValueIterator<Key,Value> deepCopy(IteratorEnvironment env);
    +
    +  @Override
    +  public void next() throws IOException {
    +    source.next();
    +    findTop();
    +  }
    +
    +  @Override
    +  public void seek(Range range, Collection<ByteSequence> columnFamilies, boolean inclusive) throws IOException {
    +    source.seek(range, columnFamilies, inclusive);
    +    findTop();
    +  }
    +
    +  /**
    +   * Iterates over the source until an acceptable key/value pair is found.
    +   */
    +  private void findTop() throws IOException {
    +    while (source.hasTop()) {
    +      Key top = source.getTopKey();
    +      if (top.isDeleted() || (accept(top, source.getTopValue()))) {
    --- End diff --
   
    This condition seems strange. We want to accept deletes we see?
   
    I'd have expected `if (!top.isDeleted() && accept(top, source.getTopValue()))`, avoiding the `accept()` call when we know the key is actually a delete.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[GitHub] accumulo pull request #244: Discuss ACCUMULO-3079: collapsing the iterator s...

joshelser
In reply to this post by joshelser
Github user joshelser commented on a diff in the pull request:

    https://github.com/apache/accumulo/pull/244#discussion_r112495202
 
    --- Diff: core/src/main/java/org/apache/accumulo/core/iterators/ServerWrappingIterator.java ---
    @@ -0,0 +1,80 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.accumulo.core.iterators;
    +
    +import java.io.IOException;
    +import java.util.Collection;
    +import java.util.Map;
    +
    +import org.apache.accumulo.core.data.ByteSequence;
    +import org.apache.accumulo.core.data.Key;
    +import org.apache.accumulo.core.data.Range;
    +import org.apache.accumulo.core.data.Value;
    +
    +/**
    + * A convenience class for implementing iterators that select, but do not modify, entries read from a source iterator. Default implementations exist for all
    + * methods, but {@link #deepCopy} will throw an <code>UnsupportedOperationException</code>.
    + *
    + * This iterator has some checks in place to enforce the iterator contract. Specifically, it verifies that it has a source iterator and that {@link #seek} has
    + * been called before any data is read. If either of these conditions does not hold true, an <code>IllegalStateException</code> will be thrown. In particular,
    + * this means that <code>getSource().seek</code> and <code>super.seek</code> no longer perform identical actions. Implementors should take note of this and if
    + * <code>seek</code> is overridden, ensure that <code>super.seek</code> is called before data is read.
    + */
    +public abstract class ServerWrappingIterator implements SortedKeyValueIterator<Key,Value> {
    --- End diff --
   
    How/why does this need to exist when compared to the `WrappingIterator`. Going off memory, these are very similar.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[GitHub] accumulo pull request #244: Discuss ACCUMULO-3079: collapsing the iterator s...

joshelser
In reply to this post by joshelser
Github user joshelser commented on a diff in the pull request:

    https://github.com/apache/accumulo/pull/244#discussion_r112495020
 
    --- Diff: core/src/main/java/org/apache/accumulo/core/iterators/ServerWrappingIterator.java ---
    @@ -0,0 +1,80 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.accumulo.core.iterators;
    +
    +import java.io.IOException;
    +import java.util.Collection;
    +import java.util.Map;
    +
    +import org.apache.accumulo.core.data.ByteSequence;
    +import org.apache.accumulo.core.data.Key;
    +import org.apache.accumulo.core.data.Range;
    +import org.apache.accumulo.core.data.Value;
    +
    +/**
    + * A convenience class for implementing iterators that select, but do not modify, entries read from a source iterator. Default implementations exist for all
    + * methods, but {@link #deepCopy} will throw an <code>UnsupportedOperationException</code>.
    + *
    + * This iterator has some checks in place to enforce the iterator contract. Specifically, it verifies that it has a source iterator and that {@link #seek} has
    + * been called before any data is read. If either of these conditions does not hold true, an <code>IllegalStateException</code> will be thrown. In particular,
    + * this means that <code>getSource().seek</code> and <code>super.seek</code> no longer perform identical actions. Implementors should take note of this and if
    + * <code>seek</code> is overridden, ensure that <code>super.seek</code> is called before data is read.
    + */
    +public abstract class ServerWrappingIterator implements SortedKeyValueIterator<Key,Value> {
    +
    +  protected final SortedKeyValueIterator<Key,Value> source;
    +
    +  public ServerWrappingIterator(SortedKeyValueIterator<Key,Value> source) {
    +    this.source = source;
    +  }
    +
    +  @Override
    +  public SortedKeyValueIterator<Key,Value> deepCopy(IteratorEnvironment env) {
    +    throw new UnsupportedOperationException();
    +  }
    +
    +  @Override
    +  public Key getTopKey() {
    +    return source.getTopKey();
    +  }
    +
    +  @Override
    +  public Value getTopValue() {
    +    return source.getTopValue();
    +  }
    +
    +  @Override
    +  public boolean hasTop() {
    +    return source.hasTop();
    +  }
    +
    +  @Override
    +  public void init(SortedKeyValueIterator<Key,Value> source, Map<String,String> options, IteratorEnvironment env) throws IOException {
    +    throw new UnsupportedOperationException();
    --- End diff --
   
    Ditto WRT the `deepCopy()` comment


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[GitHub] accumulo pull request #244: Discuss ACCUMULO-3079: collapsing the iterator s...

joshelser
In reply to this post by joshelser
Github user joshelser commented on a diff in the pull request:

    https://github.com/apache/accumulo/pull/244#discussion_r112495854
 
    --- Diff: core/src/main/java/org/apache/accumulo/core/iterators/ServerSkippingIterator.java ---
    @@ -0,0 +1,47 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.accumulo.core.iterators;
    +
    +import java.io.IOException;
    +import java.util.Collection;
    +
    +import org.apache.accumulo.core.data.ByteSequence;
    +import org.apache.accumulo.core.data.Key;
    +import org.apache.accumulo.core.data.Range;
    +import org.apache.accumulo.core.data.Value;
    +
    +public abstract class ServerSkippingIterator extends ServerWrappingIterator {
    --- End diff --
   
    Similarly, to the same question as to "why do we need ServerWrappingIterator and WrappingIterator", why do we need "ServerSkippingIterator" and not just "SkippingIterator"?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[GitHub] accumulo pull request #244: Discuss ACCUMULO-3079: collapsing the iterator s...

joshelser
In reply to this post by joshelser
Github user joshelser commented on a diff in the pull request:

    https://github.com/apache/accumulo/pull/244#discussion_r112496258
 
    --- Diff: core/src/main/java/org/apache/accumulo/core/iterators/system/VisibilityFilter.java ---
    @@ -16,63 +16,59 @@
      */
     package org.apache.accumulo.core.iterators.system;
     
    +import org.apache.accumulo.core.data.ArrayByteSequence;
    +import org.apache.accumulo.core.data.ByteSequence;
     import org.apache.accumulo.core.data.Key;
     import org.apache.accumulo.core.data.Value;
    -import org.apache.accumulo.core.iterators.Filter;
     import org.apache.accumulo.core.iterators.IteratorEnvironment;
     import org.apache.accumulo.core.iterators.SortedKeyValueIterator;
    +import org.apache.accumulo.core.iterators.SynchronizedServerFilter;
     import org.apache.accumulo.core.security.Authorizations;
     import org.apache.accumulo.core.security.ColumnVisibility;
     import org.apache.accumulo.core.security.VisibilityEvaluator;
     import org.apache.accumulo.core.security.VisibilityParseException;
     import org.apache.accumulo.core.util.BadArgumentException;
    -import org.apache.accumulo.core.util.TextUtil;
     import org.apache.commons.collections.map.LRUMap;
    -import org.apache.hadoop.io.Text;
     import org.slf4j.Logger;
     import org.slf4j.LoggerFactory;
     
    -public class VisibilityFilter extends Filter {
    +public class VisibilityFilter extends SynchronizedServerFilter {
    --- End diff --
   
    This looks to me that, for every call to this Filter we're now grabbing a lock?
   
    That sounds *really* bad to me. What's the reasoning here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[GitHub] accumulo pull request #244: Discuss ACCUMULO-3079: collapsing the iterator s...

joshelser
In reply to this post by joshelser
Github user joshelser commented on a diff in the pull request:

    https://github.com/apache/accumulo/pull/244#discussion_r112494213
 
    --- Diff: core/src/main/java/org/apache/accumulo/core/file/rfile/RFile.java ---
    @@ -746,50 +746,45 @@ public boolean hasTop() {
         @Override
         public void next() throws IOException {
    --- End diff --
   
    Why the consolidation of `next()` and `_next()` into just `next()`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[GitHub] accumulo pull request #244: Discuss ACCUMULO-3079: collapsing the iterator s...

joshelser
In reply to this post by joshelser
Github user joshelser commented on a diff in the pull request:

    https://github.com/apache/accumulo/pull/244#discussion_r112494665
 
    --- Diff: core/src/main/java/org/apache/accumulo/core/iterators/ServerSkippingIterator.java ---
    @@ -0,0 +1,47 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.accumulo.core.iterators;
    +
    +import java.io.IOException;
    +import java.util.Collection;
    +
    +import org.apache.accumulo.core.data.ByteSequence;
    +import org.apache.accumulo.core.data.Key;
    +import org.apache.accumulo.core.data.Range;
    +import org.apache.accumulo.core.data.Value;
    +
    +public abstract class ServerSkippingIterator extends ServerWrappingIterator {
    --- End diff --
   
    Javadoc, please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[GitHub] accumulo pull request #244: Discuss ACCUMULO-3079: collapsing the iterator s...

joshelser
In reply to this post by joshelser
Github user joshelser commented on a diff in the pull request:

    https://github.com/apache/accumulo/pull/244#discussion_r112494944
 
    --- Diff: core/src/main/java/org/apache/accumulo/core/iterators/ServerWrappingIterator.java ---
    @@ -0,0 +1,80 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.accumulo.core.iterators;
    +
    +import java.io.IOException;
    +import java.util.Collection;
    +import java.util.Map;
    +
    +import org.apache.accumulo.core.data.ByteSequence;
    +import org.apache.accumulo.core.data.Key;
    +import org.apache.accumulo.core.data.Range;
    +import org.apache.accumulo.core.data.Value;
    +
    +/**
    + * A convenience class for implementing iterators that select, but do not modify, entries read from a source iterator. Default implementations exist for all
    + * methods, but {@link #deepCopy} will throw an <code>UnsupportedOperationException</code>.
    + *
    + * This iterator has some checks in place to enforce the iterator contract. Specifically, it verifies that it has a source iterator and that {@link #seek} has
    + * been called before any data is read. If either of these conditions does not hold true, an <code>IllegalStateException</code> will be thrown. In particular,
    + * this means that <code>getSource().seek</code> and <code>super.seek</code> no longer perform identical actions. Implementors should take note of this and if
    + * <code>seek</code> is overridden, ensure that <code>super.seek</code> is called before data is read.
    + */
    +public abstract class ServerWrappingIterator implements SortedKeyValueIterator<Key,Value> {
    +
    +  protected final SortedKeyValueIterator<Key,Value> source;
    +
    +  public ServerWrappingIterator(SortedKeyValueIterator<Key,Value> source) {
    +    this.source = source;
    +  }
    +
    +  @Override
    +  public SortedKeyValueIterator<Key,Value> deepCopy(IteratorEnvironment env) {
    +    throw new UnsupportedOperationException();
    --- End diff --
   
    Why the decision to throw this exception and not let the compiler tell the user (at compile time, not run time) that they need to implement this method?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[GitHub] accumulo issue #244: Discuss ACCUMULO-3079: collapsing the iterator stack to...

joshelser
In reply to this post by joshelser
Github user milleruntime commented on the issue:

    https://github.com/apache/accumulo/pull/244
 
    @scubafuchs Would you be able to provide feedback to Josh about your original changes?  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[GitHub] accumulo pull request #244: Discuss ACCUMULO-3079: collapsing the iterator s...

joshelser
In reply to this post by joshelser
Github user keith-turner commented on a diff in the pull request:

    https://github.com/apache/accumulo/pull/244#discussion_r112501284
 
    --- Diff: core/src/main/java/org/apache/accumulo/core/iterators/ServerFilter.java ---
    @@ -0,0 +1,80 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.accumulo.core.iterators;
    +
    +import java.io.IOException;
    +import java.util.Collection;
    +import java.util.Map;
    +
    +import org.apache.accumulo.core.data.ByteSequence;
    +import org.apache.accumulo.core.data.Key;
    +import org.apache.accumulo.core.data.Range;
    +import org.apache.accumulo.core.data.Value;
    +
    +/**
    + * A SortedKeyValueIterator that filters entries from its source iterator.
    + *
    + * Subclasses must implement an accept method: public boolean accept(Key k, Value v);
    + *
    + * Key/Value pairs for which the accept method returns true are said to match the filter. By default, this class iterates over entries that match its filter.
    + * This iterator takes an optional "negate" boolean parameter that defaults to false. If negate is set to true, this class instead omits entries that match its
    + * filter, thus iterating over entries that do not match its filter.
    + */
    +public abstract class ServerFilter extends ServerWrappingIterator {
    +
    +  public ServerFilter(SortedKeyValueIterator<Key,Value> source) {
    +    super(source);
    +  }
    +
    +  @Override
    +  public abstract SortedKeyValueIterator<Key,Value> deepCopy(IteratorEnvironment env);
    +
    +  @Override
    +  public void next() throws IOException {
    +    source.next();
    +    findTop();
    +  }
    +
    +  @Override
    +  public void seek(Range range, Collection<ByteSequence> columnFamilies, boolean inclusive) throws IOException {
    +    source.seek(range, columnFamilies, inclusive);
    +    findTop();
    +  }
    +
    +  /**
    +   * Iterates over the source until an acceptable key/value pair is found.
    +   */
    +  private void findTop() throws IOException {
    +    while (source.hasTop()) {
    +      Key top = source.getTopKey();
    +      if (top.isDeleted() || (accept(top, source.getTopValue()))) {
    --- End diff --
   
    If `top.isDeleted()` returns true, then would `||` short circuit and not execute accept?  I think so.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
1234
Loading...