Java 8 Feature

    Stream API

    Process collections of data in a declarative, functional way

Data Pipeline Flow: Source → filter() → map() → collect()

numbers.stream().filter(n -> n % 2 == 0).map(n -> n * 2).collect(toList())

• 💤 Lazy Evaluation – processes only when needed
• 🔗 Pipeline – chain operations together
• Functional – declarative style

    What is a Stream?

    A Stream is a sequence of elements that supports various operations to process data in a declarative way. Instead of telling Java HOW to do something (loops, counters, conditions), you tell it WHAT you want.
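For instance, the HOW vs WHAT contrast can be sketched side by side. This is a minimal example; the method and variable names are illustrative:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class HowVsWhat {
    // HOW: imperative style - you manage the loop, the condition, and the result list
    static List<String> longNamesImperative(List<String> names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (name.length() > 3) {
                result.add(name.toUpperCase());
            }
        }
        return result;
    }

    // WHAT: declarative style - you describe the transformation, the stream does the work
    static List<String> longNamesDeclarative(List<String> names) {
        return names.stream()
                .filter(n -> n.length() > 3)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Alice", "Bob", "Charlie");
        System.out.println(longNamesImperative(names));  // [ALICE, CHARLIE]
        System.out.println(longNamesDeclarative(names)); // [ALICE, CHARLIE]
    }
}
```

Both methods produce the same result; the stream version simply states the intent without the loop machinery.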

    💡 Think of Streams Like a Factory Assembly Line:

    Data flows through the stream like products on a conveyor belt. At each station (operation), something happens: items get filtered out, transformed, or combined. The key insight: the conveyor belt doesn't start moving until someone at the end requests the final product!

    Streams Are NOT:

    • ❌ Data structures (they don't store data)
    • ❌ A replacement for collections
    • ❌ Reusable (can only be consumed once)
    • ❌ Modifying the source

    Streams ARE:

    • ✅ Pipelines for processing data
    • ✅ Lazy (compute on demand)
    • ✅ Easily parallelizable
    • ✅ Composable (chain operations)

    How to Create Streams

    There are many ways to create streams. Here are the most common:

    CreatingStreams.java
// 1. FROM A COLLECTION (most common)
List<String> names = Arrays.asList("Alice", "Bob", "Charlie");
Stream<String> streamFromList = names.stream();

// 2. FROM AN ARRAY
String[] array = {"a", "b", "c"};
Stream<String> streamFromArray = Arrays.stream(array);

// 3. USING Stream.of() - for a few elements
Stream<String> streamOf = Stream.of("a", "b", "c");

// 4. EMPTY STREAM
Stream<String> emptyStream = Stream.empty();

// 5. INFINITE STREAMS (be careful - use limit()!)
// generate: same value repeatedly
Stream<String> infiniteHellos = Stream.generate(() -> "Hello").limit(5);
// ["Hello", "Hello", "Hello", "Hello", "Hello"]

// iterate: apply function repeatedly
Stream<Integer> counting = Stream.iterate(0, n -> n + 1).limit(10);
// [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

// 6. PRIMITIVE STREAMS (avoid boxing overhead)
IntStream ints = IntStream.range(1, 5);             // [1, 2, 3, 4]
IntStream intsClosed = IntStream.rangeClosed(1, 5); // [1, 2, 3, 4, 5]
LongStream longs = LongStream.of(1L, 2L, 3L);
DoubleStream doubles = DoubleStream.of(1.5, 2.5);

// 7. FROM A STRING (characters)
IntStream chars = "Hello".chars(); // IntStream of char codes

// 8. FROM A FILE (throws IOException - handle or declare it)
Stream<String> lines = Files.lines(Paths.get("data.txt"));

    ⚠️ Important: Streams are Single-Use!

    Once you consume a stream (with a terminal operation), it's gone. You cannot reuse it. If you need to process data multiple times, create a new stream each time.
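A minimal sketch of what reuse looks like in practice (the class and variable names are illustrative); one common workaround is to hold a Supplier that builds a fresh stream each time:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Stream;

public class SingleUseDemo {
    // A terminal operation consumes the stream; any further use throws
    // IllegalStateException ("stream has already been operated upon or closed").
    static boolean secondUseThrows(List<String> names) {
        Stream<String> stream = names.stream();
        stream.count();          // first terminal operation - OK
        try {
            stream.count();      // second terminal operation - fails
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Alice", "Bob");
        System.out.println(secondUseThrows(names)); // true

        // Workaround: hold a Supplier and build a FRESH stream on each call
        Supplier<Stream<String>> fresh = names::stream;
        System.out.println(fresh.get().count()); // 2
        System.out.println(fresh.get().count()); // 2 - a new stream each time
    }
}
```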

    Understanding the Stream Pipeline

    Every stream pipeline has three parts:

Stream Pipeline Process: 01 Source (data from a collection, array, or file) → 02 Intermediate Operations (filtering, mapping, sorting) → 03 Terminal Operation (triggers the pipeline, producing a result or side-effect)

1. Source

    Where the data comes from (Collection, Array, File, etc.)

2. Intermediate Operations (0 or more)

    Transform the stream (filter, map, sort). These are lazy – they don't execute until a terminal operation is called. Each returns a new stream.

3. Terminal Operation (exactly 1)

    Produces a result or side-effect (collect, forEach, reduce). This triggers the pipeline and consumes the stream.

    StreamPipeline.java
List<String> names = Arrays.asList("Alice", "Bob", "Charlie", "David");

List<String> result = names.stream()      // SOURCE
        .filter(n -> n.length() > 3)      // INTERMEDIATE: keep if length > 3
        .map(String::toUpperCase)         // INTERMEDIATE: convert to uppercase
        .sorted()                         // INTERMEDIATE: sort alphabetically
        .collect(Collectors.toList());    // TERMINAL: collect to List

// Result: ["ALICE", "CHARLIE", "DAVID"]
// NOTHING happens until collect() is called!
// The filter, map, and sorted are "recipes" waiting to be executed.

    Intermediate Operations (Lazy)

    These operations transform a stream into another stream. They're lazy – nothing happens until you call a terminal operation.

Operation    | What It Does                     | Example
filter()     | Keep elements matching condition | .filter(n -> n > 5)
map()        | Transform each element           | .map(String::length)
flatMap()    | Flatten nested structures        | .flatMap(List::stream)
distinct()   | Remove duplicates                | .distinct()
sorted()     | Sort elements                    | .sorted(Comparator.reverseOrder())
limit(n)     | Take first n elements            | .limit(10)
skip(n)      | Skip first n elements            | .skip(5)
peek()       | Perform action without modifying | .peek(System.out::println)
    IntermediateOps.java
List<String> names = Arrays.asList("Alice", "Bob", "Charlie", "Alice", "David");

// Chaining multiple operations
List<String> processed = names.stream()
        .filter(n -> n.length() > 3)      // Remove short names
        .map(String::toUpperCase)         // Convert to uppercase
        .distinct()                       // Remove duplicates
        .sorted()                         // Sort alphabetically
        .limit(3)                         // Take first 3
        .collect(Collectors.toList());

// Nothing executes until collect!
// Then all operations run in a SINGLE PASS through the data.
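Of the operations in the table, flatMap() is often the trickiest, so a minimal sketch may help (the nested sample data is illustrative):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class FlatMapDemo {
    // Flatten a list of lists into a single flat list.
    // map(List::stream) would give a Stream<Stream<Integer>>;
    // flatMap(List::stream) merges each inner list into ONE Stream<Integer>.
    static List<Integer> flatten(List<List<Integer>> nested) {
        return nested.stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<Integer>> nested = Arrays.asList(
                Arrays.asList(1, 2),
                Arrays.asList(3, 4),
                Arrays.asList(5));
        System.out.println(flatten(nested)); // [1, 2, 3, 4, 5]
    }
}
```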

    Terminal Operations (Eager)

    These operations trigger the pipeline and produce a result. After a terminal operation, the stream is consumed and cannot be reused.

    TerminalOps.java
List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

// ===== COLLECTING =====
List<Integer> list = numbers.stream()
        .filter(n -> n % 2 == 0)
        .collect(Collectors.toList());    // [2, 4, 6, 8, 10]

Set<Integer> set = numbers.stream().collect(Collectors.toSet());

// ===== REDUCING =====
int sum = numbers.stream().reduce(0, (a, b) -> a + b);         // 55
Optional<Integer> max = numbers.stream().reduce(Integer::max); // Optional[10]

// ===== MATCHING =====
boolean allPositive = numbers.stream().allMatch(n -> n > 0);   // true
boolean anyEven = numbers.stream().anyMatch(n -> n % 2 == 0);  // true
boolean noneNegative = numbers.stream().noneMatch(n -> n < 0); // true

// ===== FINDING =====
Optional<Integer> first = numbers.stream().findFirst(); // Optional[1]
Optional<Integer> any = numbers.stream().findAny();     // Optional[1] (nondeterministic in parallel)

// ===== COUNTING =====
long count = numbers.stream().count(); // 10

// ===== MIN/MAX =====
Optional<Integer> min = numbers.stream().min(Integer::compareTo);
Optional<Integer> max2 = numbers.stream().max(Integer::compareTo);

// ===== FOR EACH (side effects) =====
numbers.stream().forEach(System.out::println); // Prints each number

// ===== TO ARRAY =====
Integer[] array = numbers.stream().toArray(Integer[]::new);
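Beyond toList() and toSet(), the Collectors class ships several richer collectors worth knowing. A minimal sketch with illustrative sample data:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class CollectorsDemo {
    // joining: concatenate elements into a single String with a separator
    static String joinNames(List<String> names) {
        return names.stream().collect(Collectors.joining(", "));
    }

    // groupingBy: bucket elements by a derived key (here: name length)
    static Map<Integer, List<String>> groupByLength(List<String> names) {
        return names.stream().collect(Collectors.groupingBy(String::length));
    }

    // partitioningBy: split into exactly two buckets, keyed true/false
    static Map<Boolean, List<String>> partitionLong(List<String> names) {
        return names.stream().collect(Collectors.partitioningBy(n -> n.length() > 3));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Alice", "Bob", "Charlie", "Dave");
        System.out.println(joinNames(names));     // Alice, Bob, Charlie, Dave
        System.out.println(groupByLength(names)); // {3=[Bob], 4=[Dave], 5=[Alice], 7=[Charlie]}
        System.out.println(partitionLong(names)); // {false=[Bob], true=[Alice, Charlie, Dave]}
    }
}
```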

    Understanding Lazy Evaluation

    Lazy evaluation is one of Stream's most powerful features. Operations don't execute immediately – they wait until absolutely necessary.

    🚀 Why is this powerful?

    • Efficiency: Only processes what's needed
    • Short-circuiting: Can stop early (findFirst, limit)
    • Optimization: JVM can fuse operations together
    • Infinite streams: Can work with potentially infinite data
    LazyDemo.java
// This looks like it processes ALL elements, but...
Optional<Integer> result = Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
        .filter(n -> {
            System.out.println("Filtering: " + n);
            return n % 2 == 0;
        })
        .map(n -> {
            System.out.println("Mapping: " + n);
            return n * 10;
        })
        .findFirst(); // We only need the FIRST even number!

// Output:
// Filtering: 1  (odd, filtered out)
// Filtering: 2  (even, passes filter)
// Mapping: 2    (first even found!)
//
// Elements 3-10 are NEVER processed!
// findFirst() short-circuits the stream.

System.out.println(result.get()); // 20
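The same laziness is what makes the infinite streams from Stream.iterate practical: a short-circuiting operation stops pulling elements once it has enough. A minimal sketch (the method names are illustrative):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class InfiniteLazyDemo {
    // Stream.iterate produces an INFINITE stream: 0, 2, 4, 6, ...
    // limit(n) short-circuits it, so only n elements are ever generated.
    static List<Integer> firstEvens(int n) {
        return Stream.iterate(0, i -> i + 2)
                .limit(n)
                .collect(Collectors.toList());
    }

    // findFirst() short-circuits too: generation stops at the first match,
    // even though the stream itself has no end.
    static int firstPowerOfTwoOver(int threshold) {
        return Stream.iterate(1, i -> i * 2)
                .filter(i -> i > threshold)
                .findFirst()
                .orElseThrow(IllegalStateException::new);
    }

    public static void main(String[] args) {
        System.out.println(firstEvens(5));            // [0, 2, 4, 6, 8]
        System.out.println(firstPowerOfTwoOver(100)); // 128
    }
}
```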

    Parallel Streams

    Streams can process data in parallel with just one method call. The work is automatically split across multiple CPU cores using the Fork/Join framework.

    ParallelStreams.java
List<Integer> numbers = IntStream.rangeClosed(1, 1_000_000)
        .boxed()
        .collect(Collectors.toList());

// SEQUENTIAL (single thread)
long start1 = System.currentTimeMillis();
long sum1 = numbers.stream()
        .mapToLong(Integer::longValue)
        .sum();
System.out.println("Sequential: " + (System.currentTimeMillis() - start1) + "ms");

// PARALLEL (multiple threads)
long start2 = System.currentTimeMillis();
long sum2 = numbers.parallelStream()      // Just change to parallelStream()!
        .mapToLong(Integer::longValue)
        .sum();
System.out.println("Parallel: " + (System.currentTimeMillis() - start2) + "ms");

// OR convert an existing stream to parallel:
long sum3 = numbers.stream()
        .parallel()                       // Makes this stream parallel
        .mapToLong(Integer::longValue)
        .sum();

    ✅ Good for Parallel:

    • Large datasets (10,000+ elements)
    • CPU-intensive operations
    • Independent computations
    • Stateless operations

    ❌ Bad for Parallel:

    • Small datasets (overhead > benefit)
    • I/O operations (waiting, not computing)
    • Order-dependent operations
    • Shared mutable state
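The shared-mutable-state pitfall from the list above can be made concrete. A minimal sketch showing the safe collect() approach, with the unsafe alternative left as a comment (the names are illustrative):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelStateDemo {
    // SAFE: collect() gives each worker thread its own intermediate container
    // and merges them at the end - no shared mutable state, and encounter
    // order is preserved for ordered streams even in parallel.
    static List<Integer> collectParallel(int n) {
        return IntStream.range(0, n)
                .parallel()
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // UNSAFE alternative (don't do this): calling forEach(list::add) on a
        // plain ArrayList from a parallel stream can drop elements or throw,
        // because ArrayList is not thread-safe.

        List<Integer> safe = collectParallel(10_000);
        System.out.println(safe.size()); // always 10000
        System.out.println(safe.get(0)); // 0 - order preserved
    }
}
```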

    💡 Tips & Best Practices

    ✅ DO: Filter Early

    Put filter() before map() to reduce the number of elements processed.

    // Good: filter then map (fewer elements to transform)
    .filter(n -> n > 0).map(expensiveOperation)
    // Bad: map then filter (wasted work)
    .map(expensiveOperation).filter(n -> n > 0)

    ✅ DO: Use Primitive Streams

    Use IntStream, LongStream, DoubleStream to avoid boxing overhead.

    // Slower (boxing)
    Stream<Integer> boxed = numbers.stream().map(n -> n * 2);
    // Faster (no boxing)
    IntStream primitive = numbers.stream().mapToInt(n -> n * 2);

    ❌ DON'T: Modify State in Lambdas

    Stream lambdas should be stateless and side-effect free.

    // BAD: Modifying external state
    List<String> results = new ArrayList<>();
    stream.forEach(s -> results.add(s));  // Thread-unsafe!
    // GOOD: Use collect
    List<String> results = stream.collect(Collectors.toList());

    💡 TIP: Debug with peek()

    Use peek() to see what's happening at each stage:

    stream
    .peek(n -> System.out.println("Before filter: " + n))
    .filter(n -> n > 5)
    .peek(n -> System.out.println("After filter: " + n))
    .collect(Collectors.toList());

    📝 Quick Summary

    Stream Basics:

    • Declarative data processing
    • Lazy evaluation
    • Single-use (not reusable)
    • Doesn't modify source

    Pipeline Pattern:

    • Source → Intermediate* → Terminal
    • Intermediate: filter, map, sorted, distinct
    • Terminal: collect, forEach, reduce, count
    • Nothing runs until terminal operation