Stream API
Process collections of data in a declarative, functional way
What is a Stream?
A Stream is a sequence of elements that supports various operations to process data in a declarative way. Instead of telling Java HOW to do something (loops, counters, conditions), you tell it WHAT you want.
💡 Think of Streams Like a Factory Assembly Line:
Data flows through the stream like products on a conveyor belt. At each station (operation), something happens: items get filtered out, transformed, or combined. The key insight: the conveyor belt doesn't start moving until someone at the end requests the final product!
Streams Are NOT:
- ❌ Data structures (they don't store data)
- ❌ A replacement for collections
- ❌ Reusable (can only be consumed once)
- ❌ Modifying the source
Streams ARE:
- ✅ Pipelines for processing data
- ✅ Lazy (compute on demand)
- ✅ Easily parallelizable
- ✅ Composable (chain operations)
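To make the "tell it WHAT, not HOW" contrast concrete before diving into the API, here is a minimal, illustrative sketch comparing a classic loop with the equivalent stream pipeline (the class and variable names are just for this example):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class DeclarativeVsImperative {
    public static void main(String[] args) {
        List<String> names = Arrays.asList("Alice", "Bob", "Charlie");

        // Imperative: spell out HOW (loop, condition, manual accumulation)
        List<String> longNamesLoop = new ArrayList<>();
        for (String name : names) {
            if (name.length() > 3) {
                longNamesLoop.add(name.toUpperCase());
            }
        }

        // Declarative: state WHAT you want (filter, transform, collect)
        List<String> longNamesStream = names.stream()
                .filter(name -> name.length() > 3)
                .map(String::toUpperCase)
                .collect(Collectors.toList());

        System.out.println(longNamesLoop);   // [ALICE, CHARLIE]
        System.out.println(longNamesStream); // [ALICE, CHARLIE]
    }
}

Both produce the same list; the stream version simply states the steps and leaves the iteration to the library.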
How to Create Streams
There are many ways to create streams. Here are the most common:
// 1. FROM A COLLECTION (most common)
List<String> names = Arrays.asList("Alice", "Bob", "Charlie");
Stream<String> streamFromList = names.stream();

// 2. FROM AN ARRAY
String[] array = {"a", "b", "c"};
Stream<String> streamFromArray = Arrays.stream(array);

// 3. USING Stream.of() - for a few elements
Stream<String> streamOf = Stream.of("a", "b", "c");

// 4. EMPTY STREAM
Stream<String> emptyStream = Stream.empty();

// 5. INFINITE STREAMS (be careful - use limit()!)
// generate: same value repeatedly
Stream<String> infiniteHellos = Stream.generate(() -> "Hello").limit(5);
// ["Hello", "Hello", "Hello", "Hello", "Hello"]

// iterate: apply function repeatedly
Stream<Integer> counting = Stream.iterate(0, n -> n + 1).limit(10);
// [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

// 6. PRIMITIVE STREAMS (avoid boxing overhead)
IntStream ints = IntStream.range(1, 5);             // [1, 2, 3, 4]
IntStream intsClosed = IntStream.rangeClosed(1, 5); // [1, 2, 3, 4, 5]
LongStream longs = LongStream.of(1L, 2L, 3L);
DoubleStream doubles = DoubleStream.of(1.5, 2.5);

// 7. FROM A STRING (characters)
IntStream chars = "Hello".chars(); // IntStream of char codes

// 8. FROM A FILE (throws IOException - use try-with-resources so the file is closed)
Stream<String> lines = Files.lines(Paths.get("data.txt"));
⚠️ Important: Streams are Single-Use!
Once you consume a stream (with a terminal operation), it's gone. You cannot reuse it. If you need to process data multiple times, create a new stream each time.
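As a quick illustration of the single-use rule (a minimal sketch, not part of the examples above): the second terminal operation on the same stream object throws an IllegalStateException.

import java.util.stream.Stream;

public class SingleUseDemo {
    public static void main(String[] args) {
        Stream<String> stream = Stream.of("a", "b", "c");

        stream.forEach(System.out::println); // OK: first terminal operation consumes the stream

        try {
            stream.count(); // second terminal operation on the same stream object
        } catch (IllegalStateException e) {
            // "stream has already been operated upon or closed"
            System.out.println("Cannot reuse a stream: " + e.getMessage());
        }

        // Fix: build a fresh stream (or keep a Supplier<Stream<String>>) each time you need one
        Stream.of("a", "b", "c").forEach(System.out::println);
    }
}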
Understanding the Stream Pipeline
Every stream pipeline has three parts:
Stream Pipeline Process:
- Data originates from a collection, array, or file.
- Data is transformed through filtering, mapping, or sorting.
- A terminal operation triggers the pipeline, producing a result or side effect.
Source
Where the data comes from (Collection, Array, File, etc.)
Intermediate Operations (0 or more)
Transform the stream (filter, map, sort). These are lazy – they don't execute until a terminal operation is called. Each returns a new stream.
Terminal Operation (exactly 1)
Produces a result or side-effect (collect, forEach, reduce). This triggers the pipeline and consumes the stream.
List<String> names = Arrays.asList("Alice", "Bob", "Charlie", "David");

List<String> result = names.stream()          // SOURCE
    .filter(n -> n.length() > 3)              // INTERMEDIATE: keep if length > 3
    .map(String::toUpperCase)                 // INTERMEDIATE: convert to uppercase
    .sorted()                                 // INTERMEDIATE: sort alphabetically
    .collect(Collectors.toList());            // TERMINAL: collect to List

// Result: ["ALICE", "CHARLIE", "DAVID"]

// NOTHING happens until collect() is called!
// The filter, map, and sorted are "recipes" waiting to be executed.
Intermediate Operations (Lazy)
These operations transform a stream into another stream. They're lazy – nothing happens until you call a terminal operation.
| Operation | What It Does | Example |
|---|---|---|
| filter() | Keep elements matching condition | .filter(n -> n > 5) |
| map() | Transform each element | .map(String::length) |
| flatMap() | Flatten nested structures (see example below) | .flatMap(List::stream) |
| distinct() | Remove duplicates | .distinct() |
| sorted() | Sort elements | .sorted(Comparator.reverseOrder()) |
| limit(n) | Take first n elements | .limit(10) |
| skip(n) | Skip first n elements | .skip(5) |
| peek() | Perform action without modifying | .peek(System.out::println) |
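Most of these operations are self-explanatory, but flatMap() tends to need a concrete picture. A small sketch, assuming the input is a list of lists that should be flattened into a single stream of elements:

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class FlatMapDemo {
    public static void main(String[] args) {
        List<List<Integer>> nested = Arrays.asList(
                Arrays.asList(1, 2, 3),
                Arrays.asList(4, 5),
                Arrays.asList(6)
        );

        // map() would give a Stream<List<Integer>>; flatMap() merges the
        // inner lists into one flat Stream<Integer>
        List<Integer> flat = nested.stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());

        System.out.println(flat); // [1, 2, 3, 4, 5, 6]
    }
}

The example below then chains several of these intermediate operations in a single pipeline.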
List<String> names = Arrays.asList("Alice", "Bob", "Charlie", "Alice", "David");

// Chaining multiple operations
List<String> processed = names.stream()
    .filter(n -> n.length() > 3)     // Remove short names
    .map(String::toUpperCase)        // Convert to uppercase
    .distinct()                      // Remove duplicates
    .sorted()                        // Sort alphabetically
    .limit(3)                        // Take first 3
    .collect(Collectors.toList());

// Nothing executes until collect!
// Then all operations run in a SINGLE PASS through the data.
Terminal Operations (Eager)
These operations trigger the pipeline and produce a result. After a terminal operation, the stream is consumed and cannot be reused.
List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

// ===== COLLECTING =====
List<Integer> list = numbers.stream()
    .filter(n -> n % 2 == 0)
    .collect(Collectors.toList());   // [2, 4, 6, 8, 10]

Set<Integer> set = numbers.stream().collect(Collectors.toSet());

// ===== REDUCING =====
int sum = numbers.stream().reduce(0, (a, b) -> a + b);         // 55
Optional<Integer> max = numbers.stream().reduce(Integer::max); // Optional[10]

// ===== MATCHING =====
boolean allPositive = numbers.stream().allMatch(n -> n > 0);   // true
boolean anyEven = numbers.stream().anyMatch(n -> n % 2 == 0);  // true
boolean noneNegative = numbers.stream().noneMatch(n -> n < 0); // true

// ===== FINDING =====
Optional<Integer> first = numbers.stream().findFirst(); // Optional[1]
Optional<Integer> any = numbers.stream().findAny();     // Optional[1] (may be any element in a parallel stream)

// ===== COUNTING =====
long count = numbers.stream().count(); // 10

// ===== MIN/MAX =====
Optional<Integer> min = numbers.stream().min(Integer::compareTo);
Optional<Integer> max2 = numbers.stream().max(Integer::compareTo);

// ===== FOR EACH (side effects) =====
numbers.stream().forEach(System.out::println); // Prints each number

// ===== TO ARRAY =====
Integer[] array = numbers.stream().toArray(Integer[]::new);
Understanding Lazy Evaluation
Lazy evaluation is one of Stream's most powerful features. Operations don't execute immediately – they wait until absolutely necessary.
🚀 Why is this powerful?
- Efficiency: Only processes what's needed
- Short-circuiting: Can stop early (findFirst, limit)
- Optimization: JVM can fuse operations together
- Infinite streams: Can work with potentially infinite data (see the sketch after this list)
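That last point deserves a quick illustration (a minimal sketch, separate from the traced example that follows): because findFirst() short-circuits, an unbounded Stream.iterate source stops generating as soon as a match is found.

import java.util.stream.Stream;

public class InfiniteStreamDemo {
    public static void main(String[] args) {
        // An unbounded source: 1, 2, 3, 4, ...
        int firstDivisibleBy7 = Stream.iterate(1, n -> n + 1)
                .filter(n -> n % 7 == 0)
                .findFirst()   // short-circuits: only elements 1..7 are ever generated
                .get();

        System.out.println(firstDivisibleBy7); // 7
    }
}

The traced example below walks through the same short-circuiting behavior element by element.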
// This looks like it processes ALL elements, but...
Optional<Integer> result = Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    .filter(n -> {
        System.out.println("Filtering: " + n);
        return n % 2 == 0;
    })
    .map(n -> {
        System.out.println("Mapping: " + n);
        return n * 10;
    })
    .findFirst(); // We only need the FIRST even number!

// Output:
// Filtering: 1   (odd, filtered out)
// Filtering: 2   (even, passes filter)
// Mapping: 2     (first even found!)
//
// Elements 3-10 are NEVER processed!
// findFirst() short-circuits the stream.

System.out.println(result.get()); // 20
Parallel Streams
Streams can process data in parallel with just one method call. The work is automatically split across multiple CPU cores using the Fork/Join framework.
List<Integer> numbers = IntStream.rangeClosed(1, 1_000_000)
    .boxed()
    .collect(Collectors.toList());

// SEQUENTIAL (single thread)
long start1 = System.currentTimeMillis();
long sum1 = numbers.stream()
    .mapToLong(Integer::longValue)
    .sum();
System.out.println("Sequential: " + (System.currentTimeMillis() - start1) + "ms");

// PARALLEL (multiple threads)
long start2 = System.currentTimeMillis();
long sum2 = numbers.parallelStream()   // Just change to parallelStream()!
    .mapToLong(Integer::longValue)
    .sum();
System.out.println("Parallel: " + (System.currentTimeMillis() - start2) + "ms");

// OR convert an existing stream to parallel:
long sum3 = numbers.stream()
    .parallel()                        // Makes this stream parallel
    .mapToLong(Integer::longValue)
    .sum();
✅ Good for Parallel:
- Large datasets (10,000+ elements)
- CPU-intensive operations
- Independent computations
- Stateless operations
❌ Bad for Parallel:
- Small datasets (overhead > benefit)
- I/O operations (waiting, not computing)
- Order-dependent operations (see the sketch after this list)
- Shared mutable state
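To illustrate the order-dependence caveat, here is a small sketch: on a parallel stream, forEach may emit elements in any order, while forEachOrdered restores the encounter order (at some cost to parallel speedup).

import java.util.stream.IntStream;

public class ParallelOrderDemo {
    public static void main(String[] args) {
        // forEach on a parallel stream: output order is nondeterministic
        IntStream.rangeClosed(1, 10)
                .parallel()
                .forEach(n -> System.out.print(n + " ")); // e.g. 7 6 3 1 ...
        System.out.println();

        // forEachOrdered preserves the encounter order: 1 2 3 ... 10
        IntStream.rangeClosed(1, 10)
                .parallel()
                .forEachOrdered(n -> System.out.print(n + " "));
        System.out.println();
    }
}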
💡 Tips & Best Practices
✅ DO: Filter Early
Put filter() before map() to reduce the number of elements processed.
// Good: filter then map (fewer elements to transform)
.filter(n -> n > 0).map(expensiveOperation)

// Bad: map then filter (wasted work)
.map(expensiveOperation).filter(n -> n > 0)
✅ DO: Use Primitive Streams
Use IntStream, LongStream, DoubleStream to avoid boxing overhead.
// Slower (boxing)
Stream<Integer> boxed = numbers.stream().map(n -> n * 2);

// Faster (no boxing)
IntStream primitive = numbers.stream().mapToInt(n -> n * 2);
❌ DON'T: Modify State in Lambdas
Stream lambdas should be stateless and side-effect free.
// BAD: Modifying external state
List<String> results = new ArrayList<>();
stream.forEach(s -> results.add(s)); // Thread-unsafe!

// GOOD: Use collect
List<String> results = stream.collect(Collectors.toList());
💡 TIP: Debug with peek()
Use peek() to see what's happening at each stage:
stream
.peek(n -> System.out.println("Before filter: " + n))
.filter(n -> n > 5)
.peek(n -> System.out.println("After filter: " + n))
    .collect(Collectors.toList());
📝 Quick Summary
Stream Basics:
- Declarative data processing
- Lazy evaluation
- Single-use (not reusable)
- Doesn't modify source
Pipeline Pattern:
- Source → Intermediate* → Terminal
- Intermediate: filter, map, sorted, distinct
- Terminal: collect, forEach, reduce, count
- Nothing runs until terminal operation