One of the main intentions behind Java streams is performance improvement.
State of stream - Sorted, distinct
ArrayList : characteristics (reported by its spliterator)
public int characteristics() {
return Spliterator.ORDERED | Spliterator.SIZED | Spliterator.SUBSIZED; }
HashSet : characteristics (reported by the key spliterator of its backing HashMap)
public int characteristics() {
return (fence < 0 || est == map.size ? Spliterator.SIZED : 0) | Spliterator.DISTINCT; }
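Stream - Characteristic
ORDERED     Encounter order matters
DISTINCT    No duplicates possible
SORTED      Elements are sorted
SIZED       The size is known
NONNULL     No null values
IMMUTABLE   The source cannot be modified
CONCURRENT  The source can safely be modified concurrently
SUBSIZED    The size is known, also for the parts produced by splitting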
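The characteristics listed above can be inspected directly on a collection's spliterator. The following is a minimal sketch; the class name CharacteristicsDemo and the helper describe are made up for illustration and only check a few of the flags.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Spliterator;
import java.util.TreeSet;

public class CharacteristicsDemo {

    // Hypothetical helper: lists some of the flags reported by the source's spliterator.
    public static String describe(Collection<?> source) {
        Spliterator<?> sp = source.spliterator();
        StringBuilder sb = new StringBuilder();
        if (sp.hasCharacteristics(Spliterator.ORDERED))  sb.append("ORDERED ");
        if (sp.hasCharacteristics(Spliterator.DISTINCT)) sb.append("DISTINCT ");
        if (sp.hasCharacteristics(Spliterator.SORTED))   sb.append("SORTED ");
        if (sp.hasCharacteristics(Spliterator.SIZED))    sb.append("SIZED ");
        if (sp.hasCharacteristics(Spliterator.SUBSIZED)) sb.append("SUBSIZED ");
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        Collection<Integer> data = Arrays.asList(3, 1, 2);
        System.out.println("ArrayList : " + describe(new ArrayList<>(data)));
        System.out.println("HashSet   : " + describe(new HashSet<>(data)));
        System.out.println("TreeSet   : " + describe(new TreeSet<>(data)));
    }
}
```

A TreeSet is included because it is the usual SortedSet implementation and reports both DISTINCT and SORTED.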
employee.stream()
.distinct() // returns a stream = intermediate method
.sorted() // also returns a stream = intermediate method, like distinct()
.forEach(System.out::println);
How do streams improve performance?
distinct() and sorted() are intermediate methods, which means no computation is performed until a terminal method such as forEach is invoked. What does that mean?
Conceptually, when distinct() is called on a stream, it only records the distinct state as 0 or 1.
When sorted() is called on a stream, it only records the sorted state as 0 or 1.
Now the terminal method forEach:
1) Checks the state of the stream, in our case
if (distinct == 1) { it performs the distinct operation }
if (sorted == 1) { it performs the sort operation }
2) After that, it processes each element of the stream.
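The laziness described above can be observed with peek(), which records each element that is actually pulled from the source. This is only a sketch; the class LazyPipelineDemo and the method touchedBeforeAndAfter are hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

public class LazyPipelineDemo {

    // Returns { elements touched before the terminal op, elements touched after it }.
    public static int[] touchedBeforeAndAfter(List<String> input) {
        List<String> touched = new ArrayList<>();
        Stream<String> pipeline = input.stream()
                .peek(touched::add) // records every element pulled from the source
                .distinct()         // intermediate: nothing runs yet
                .sorted();          // intermediate: nothing runs yet
        int before = touched.size();  // the pipeline is only described at this point
        pipeline.forEach(e -> { });   // terminal operation starts the actual work
        int after = touched.size();
        return new int[] { before, after };
    }

    public static void main(String[] args) {
        int[] r = touchedBeforeAndAfter(Arrays.asList("b", "a", "b"));
        System.out.println("before terminal op: " + r[0] + ", after: " + r[1]);
    }
}
```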
But where is the performance improvement here?
// HashSet
HashSet<Employee> employee = ... ;
employee.stream()
.distinct() // no processing is triggered
.sorted() // sorting is triggered
.forEach(System.out::println);
A HashSet does not allow duplicates, so the distinct operation is not required for its stream.
distinct = 0
sorted = 1
forEach therefore skips the distinct step and performs only the sort, so there is a small performance improvement here.
// SortedSet
SortedSet<Employee> employee = ... ;
employee.stream()
.distinct() // no processing is triggered
.sorted() // no sorting is triggered
.forEach(System.out::println);
A SortedSet does not allow duplicates and is already sorted (this holds when it uses the natural ordering of its elements),
hence
distinct - 0
sorted - 0
so forEach skips both the distinct and the sorted operations.
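One way to check the two cases above is to count compareTo calls while the pipeline runs. The sketch below assumes an OpenJDK-style implementation; skipping the sort is an implementation detail, not something the API guarantees. The Employee class here is a hypothetical stand-in with only an id field, and TreeSet is used as the SortedSet implementation.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicInteger;

public class SortSkipDemo {

    // Counts every compareTo call so we can see whether a sort really happened.
    static final AtomicInteger COMPARISONS = new AtomicInteger();

    // Hypothetical stand-in for the Employee class used in the text.
    public static final class Employee implements Comparable<Employee> {
        final int id;
        public Employee(int id) { this.id = id; }
        @Override public int compareTo(Employee o) {
            COMPARISONS.incrementAndGet();
            return Integer.compare(id, o.id);
        }
        @Override public boolean equals(Object o) {
            return o instanceof Employee && ((Employee) o).id == id;
        }
        @Override public int hashCode() { return Integer.hashCode(id); }
    }

    public static List<Employee> sample() {
        return Arrays.asList(new Employee(3), new Employee(1), new Employee(5),
                             new Employee(2), new Employee(4));
    }

    // Runs distinct().sorted().forEach(...) and returns the comparisons made by the stream.
    public static int comparisonsDuringStream(Collection<Employee> source) {
        COMPARISONS.set(0); // ignore comparisons made while the collection was built
        source.stream().distinct().sorted().forEach(e -> { });
        return COMPARISONS.get();
    }

    public static void main(String[] args) {
        System.out.println("HashSet comparisons: " + comparisonsDuringStream(new HashSet<>(sample())));
        System.out.println("TreeSet comparisons: " + comparisonsDuringStream(new TreeSet<>(sample())));
    }
}
```

If the TreeSet were built with an explicit Comparator, its stream would not count as naturally sorted and sorted() would do the work again.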
We know stateful and stateless objects. What are stateful and stateless operations?
The two examples below explain the difference.
// call parallel on an existing stream
List<Person> people = ... ;
people.stream()
.parallel()
.filter(person -> person.getAge() > 20)
.sorted()
.forEach(System.out::println);
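Here, the filter operation does not need to remember any state: each element is tested on its own, so filter is a stateless operation. (sorted(), in contrast, has to see all elements before it can emit the first one.)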
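Because filter keeps no state, a sequential and a parallel run give the same result. The sketch below shows this with plain ages instead of Person objects, which is a simplification; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class StatelessFilterDemo {

    // Keeps the ages above 20; the predicate looks at one element at a time only.
    public static List<Integer> olderThan20(List<Integer> ages, boolean parallel) {
        return (parallel ? ages.parallelStream() : ages.stream())
                .filter(age -> age > 20)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(18, 25, 31, 7, 42);
        System.out.println(olderThan20(ages, false));
        System.out.println(olderThan20(ages, true));
    }
}
```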
// call parallel on an existing stream
List<Person> people = ... ;
people.stream()
.parallel()
.skip(2)
.limit(5)
.forEach(System.out::println);
Here we need to maintain a counter: skip the first 2 elements, then let 5 through. That means a shared variable at class level, for example an AtomicLong, which is updated by multiple threads.
The result depends on the state of this counter, so this is an example of a stateful operation.
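The sketch below shows skip and limit on a parallel stream, plus a hand-rolled counter to make the shared state visible. The AtomicLong filter is only an illustration of the idea and not necessarily how the JDK implements skip(); a stateful lambda like this is discouraged in real code because it is nondeterministic about which elements get dropped. All class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Collectors;

public class StatefulSliceDemo {

    // skip/limit must track how many elements came before, even when run in parallel.
    public static List<Integer> slice(List<Integer> numbers) {
        return numbers.stream()
                .parallel()
                .skip(2)
                .limit(5)
                .collect(Collectors.toList()); // collect keeps the encounter order of the list
    }

    // Illustration only: drops the first n elements that happen to arrive,
    // using a counter shared by all worker threads.
    public static long countAfterManualSkip(List<Integer> numbers, long n) {
        AtomicLong seen = new AtomicLong();
        return numbers.parallelStream()
                .filter(x -> seen.getAndIncrement() >= n)
                .count();
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(slice(numbers));
        System.out.println(countAfterManualSkip(numbers, 2));
    }
}
```

With forEach instead of collect, the same five elements would be selected, but a parallel stream could print them in any order.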