Saturday, 2 May 2020

JAVA Streams

A main intention behind Java streams is performance improvement.
A stream keeps state about its source, for example whether it is sorted or distinct.
ArrayList : characteristics
public int characteristics() {
    return Spliterator.ORDERED | Spliterator.SIZED | Spliterator.SUBSIZED;
}

HashSet : characteristics
public int characteristics() {
    return (fence < 0 || est == map.size ? Spliterator.SIZED : 0) | Spliterator.DISTINCT;
}

Stream - Characteristics
ORDERED     Order matters
DISTINCT    No duplicates possible
SORTED      Elements are sorted
SIZED       The size is known
NONNULL     No null values
IMMUTABLE   The source cannot be modified
CONCURRENT  The source can be safely modified concurrently
SUBSIZED    The size is still known after the source is split
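
To see these characteristics on real collections, you can ask a collection's Spliterator directly. This is only a small sketch: the class name CharacteristicsDemo and the helpers has() and describe() are names I made up for illustration, and the exact flags a collection reports are a detail of the JDK you run on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Spliterator;
import java.util.TreeSet;

class CharacteristicsDemo {

    // True if the collection's spliterator reports the given characteristic.
    static boolean has(Collection<?> source, int characteristic) {
        return source.spliterator().hasCharacteristics(characteristic);
    }

    // Lists a few of the characteristics reported for the given collection.
    static String describe(Collection<?> source) {
        StringBuilder sb = new StringBuilder();
        if (has(source, Spliterator.ORDERED))  sb.append("ORDERED ");
        if (has(source, Spliterator.DISTINCT)) sb.append("DISTINCT ");
        if (has(source, Spliterator.SORTED))   sb.append("SORTED ");
        if (has(source, Spliterator.SIZED))    sb.append("SIZED ");
        if (has(source, Spliterator.SUBSIZED)) sb.append("SUBSIZED ");
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(3, 1, 2);
        System.out.println("ArrayList: " + describe(new ArrayList<>(values)));
        System.out.println("HashSet:   " + describe(new HashSet<>(values)));
        System.out.println("TreeSet:   " + describe(new TreeSet<>(values)));
    }
}
```

The ArrayList line should show ORDERED but not DISTINCT, the HashSet line DISTINCT but not SORTED, and the TreeSet line both DISTINCT and SORTED.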



employee.stream()
                .distinct() // returns a stream = intermediate method
                .sorted()   // the same as distinct()
                .forEach(System.out::println);

How do streams improve performance?

distinct() and sorted() are intermediate methods, which means no computation is performed until the terminal method forEach is invoked. What does that mean?
When the stream calls distinct(), it only updates the distinct state to either 0 or 1.
When the stream calls sorted(), it only updates the sorted state to either 0 or 1.
Now the terminal method forEach:
1) Checks the state of the stream, in our case (this is a simplified model; the real implementation tracks these states as bit flags):
if (distinct == 1) { perform the distinct operation }
if (sorted == 1) { perform the sorted operation }
2) After that, it processes each element in the stream.
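
The lazy behaviour described above can be checked with a small sketch. The class name LazyPipelineDemo, the peeked counter and the sample names are my own assumptions for illustration; peek() is used only to count how many elements have flowed through the pipeline.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class LazyPipelineDemo {

    // Counts how many elements have passed through peek().
    static final AtomicInteger peeked = new AtomicInteger();

    // Builds the pipeline but does not run it: there is no terminal operation here.
    static Stream<String> buildPipeline(List<String> names) {
        return names.stream()
                .peek(n -> peeked.incrementAndGet())
                .distinct()
                .sorted();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Ravi", "Anu", "Ravi", "Kiran");

        Stream<String> pipeline = buildPipeline(names);
        System.out.println("seen before forEach: " + peeked.get());

        // Only the terminal operation pulls elements through the pipeline.
        pipeline.forEach(System.out::println);
        System.out.println("seen after forEach: " + peeked.get());
    }
}
```

The counter stays at 0 until forEach runs, and only then do all four elements pass through peek().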

But where is the performance improvement here?

// HashSet
HashSet<Employee> employee= ... ;
employee.stream()
          .distinct() // no processing is triggered
          .sorted()   // sorting is triggered
          .forEach (System.out::println);

This is a HashSet stream. A HashSet does not allow duplicates, so in our case the distinct operation is not required.
distinct = 0
sorted = 1
Now forEach skips the distinct operation and performs only the sort, so performance improves a little here.


// SortedSet
SortedSet<Employee> employee= ... ;
employee.stream()
           .distinct() // no processing is triggered
           .sorted()   // no sorting is triggered
           .forEach(System.out::println);
This is a SortedSet stream. A SortedSet does not allow duplicates and is already sorted (sorted() can be skipped when the set uses its natural ordering),
hence
distinct = 0
sorted = 0
so forEach skips both the distinct and the sorted operations.
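
One way to observe the skipped sort is to count how often compareTo() is called while the stream runs. This is a rough sketch, not a benchmark: SortSkipDemo, the comparisons counter and this small Employee class are hypothetical stand-ins for the types above, and skipping the sort is an implementation detail of the JDK, not something the API promises.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicInteger;

class SortSkipDemo {

    // Counts every compareTo() call made on Employee instances.
    static final AtomicInteger comparisons = new AtomicInteger();

    // Hypothetical stand-in for the Employee type used in the post.
    static final class Employee implements Comparable<Employee> {
        final String name;

        Employee(String name) { this.name = name; }

        @Override
        public int compareTo(Employee other) {
            comparisons.incrementAndGet();
            return name.compareTo(other.name);
        }

        @Override
        public String toString() { return name; }
    }

    // Runs distinct().sorted() over the source and reports how many comparisons happened.
    static int comparisonsDuringStream(Collection<Employee> source) {
        comparisons.set(0); // ignore comparisons made while building the collection
        source.stream().distinct().sorted().forEach(e -> { });
        return comparisons.get();
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee("Ravi"), new Employee("Anu"), new Employee("Kiran"));

        System.out.println("ArrayList comparisons: "
                + comparisonsDuringStream(new ArrayList<>(employees)));
        System.out.println("TreeSet comparisons:   "
                + comparisonsDuringStream(new TreeSet<>(employees)));
    }
}
```

On the OpenJDK builds I am aware of, the ArrayList run reports some comparisons and the TreeSet run reports none, because the stream already knows the source is sorted in natural order.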


We know stateful and stateless objects. What are stateful and stateless operations?

I will explain.
// call parallel on an existing stream
List<Person> people = ... ;
 people.stream()
            .parallel()
            .filter(person -> person.getAge() > 20)
            .sorted()
            .forEach(System.out::println);

Here, the filter operation does not need to remember any state: each person is tested on their own. This is a stateless operation.

// call parallel on an existing stream
List<Person> people = ... ;
people.stream()
          .parallel()
          .skip(2)
          .limit(5)
          .forEach(System.out::println);

Here we need to maintain a counter: skip the first 2 elements, then keep the next 5. Conceptually, that means something like an AtomicLong instance variable at class level, updated by multiple threads.

The result depends on the state of that counter, so this is a stateful example.
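
The counter idea can be sketched as follows. SliceDemo, builtInSlice() and counterSlice() are names I made up; the hand-rolled version only illustrates the state that skip() and limit() have to track and is not how the JDK implements them. It is run sequentially on purpose, because a shared counter inside filter() gives unpredictable results on a parallel stream.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SliceDemo {

    // Built-in stateful operations: on an ordered source the result is the same even in parallel.
    static List<Integer> builtInSlice(List<Integer> source) {
        return source.stream()
                .parallel()
                .skip(2)
                .limit(5)
                .collect(Collectors.toList());
    }

    // Hand-rolled version with an explicit counter (sequential only).
    static List<Integer> counterSlice(List<Integer> source) {
        AtomicLong seen = new AtomicLong();
        return source.stream()
                .filter(x -> {
                    long index = seen.getAndIncrement();
                    return index >= 2 && index < 7; // skip 2, then keep 5
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = IntStream.rangeClosed(1, 10)
                .boxed()
                .collect(Collectors.toList());

        System.out.println(builtInSlice(numbers)); // [3, 4, 5, 6, 7]
        System.out.println(counterSlice(numbers)); // [3, 4, 5, 6, 7]
    }
}
```

Both methods depend on how many elements came before the current one, which is exactly what makes the operation stateful.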



















