Rollups are the backbone of Ethereum scaling, but they come with their own set of performance traps. Many teams launch with excitement, only to discover that their rollup architecture hits unexpected walls at 50 TPS—far below the promised thousands. This guide from Upstate's Layer-2 Scaling Pitfalls series identifies three hidden bottlenecks that frequently sabotage throughput and offers a practical checklist to solve them.
We focus on the most common culprits: sequencer throughput bottlenecks, data availability compression limits, and proof generation latency. Each section includes a problem statement, diagnostic steps, and concrete solutions—no fluff, just what you need to keep your rollup running smoothly.
1. Who Needs This and What Goes Wrong Without It
This article is for developers and infrastructure leads who are building or maintaining a rollup—whether it's an optimistic rollup, a zk-rollup, or a hybrid. If you've ever seen your transaction queue grow while blocks stay half-empty, or watched gas costs spike without a clear cause, you've encountered one of these bottlenecks.
Without addressing these issues, your rollup will underperform in production. Users will face high fees, slow confirmations, or even failed transactions. In the worst cases, the sequencer becomes a single point of failure, and the entire network stalls. We've seen teams spend months optimizing smart contracts while ignoring the sequencer's memory limits—only to find that the real bottleneck was something as simple as batch size configuration.
The three bottlenecks we cover are:
- Sequencer throughput: The rate at which the sequencer can process and order transactions.
- Data availability compression: How efficiently transaction data is posted to L1, affecting gas costs and finality.
- Proof generation latency: For zk-rollups, the time to generate validity proofs; for optimistic rollups, the challenge period delays.
Ignoring any of these can lead to cascading failures. For example, a sequencer that processes 100 TPS but posts data in large batches only every 10 minutes leaves each transaction waiting up to 10 minutes before its data lands on L1, which is too slow for many DeFi applications that need more than the sequencer's own soft confirmation. Similarly, a zk-rollup that takes 30 minutes to generate a proof for a single block might as well be a sidechain.
By the end of this guide, you'll have a clear checklist to diagnose and fix each bottleneck, plus a framework for deciding which trade-offs matter most for your specific use case.
2. Prerequisites and Context You Should Settle First
Before diving into the bottlenecks, make sure you have a solid understanding of your rollup's architecture. You'll need:
- Basic metrics: Current TPS, block time, batch size, gas cost per transaction, and proof generation time (if applicable).
- Infrastructure details: Sequencer hardware specs (CPU, memory, disk I/O), network latency to L1, and whether you're using calldata or blobs for data availability.
- Rollup type: Optimistic or zk? Each has different bottlenecks. Optimistic rollups are limited by the challenge period and fraud proof generation; zk-rollups by proof computation.
If you haven't collected these metrics yet, start with a simple load test. Use a tool like eth_spam or a custom script to send transactions at increasing rates while monitoring sequencer CPU usage, memory, and batch submission times. This baseline will tell you which bottleneck hits first.
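One way to read the results of such a ramp is sketched below. It assumes you have already recorded, for each step of the load test, the rate the load generator offered and the rate the rollup actually confirmed. `LoadSample`, `find_saturation_rate`, and the 5% tolerance are illustrative names and values, not part of any tool mentioned above.

```python
from dataclasses import dataclass


@dataclass
class LoadSample:
    offered_tps: float    # rate the load generator sent at
    confirmed_tps: float  # rate the rollup actually included


def find_saturation_rate(samples, tolerance=0.05):
    """Return the highest offered rate the rollup kept up with.

    A step counts as sustained when the confirmed rate is within
    `tolerance` of the offered rate. Returns None if even the lowest
    step was not sustained.
    """
    sustained = None
    for sample in sorted(samples, key=lambda s: s.offered_tps):
        if sample.confirmed_tps >= sample.offered_tps * (1 - tolerance):
            sustained = sample.offered_tps
        else:
            break  # the first step that falls behind marks the ceiling
    return sustained


# Made-up measurements from a ramp of 10 -> 200 TPS.
ramp = [
    LoadSample(10, 10),
    LoadSample(25, 24.8),
    LoadSample(50, 49),
    LoadSample(100, 61),
    LoadSample(200, 60),
]
print(find_saturation_rate(ramp))  # 50
```

Whatever resource is closest to its limit at the last sustained step (CPU, memory, or batch submission time) is the one to investigate first.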
Another critical context is your target use case. A rollup for NFT minting has very different requirements than one for high-frequency trading. NFT mints can tolerate 10-minute finality; trading needs sub-second confirmations. Your bottleneck priorities shift accordingly.
Finally, be aware of common misconceptions. Many teams assume that data availability is always the bottleneck because it's the most talked-about. In practice, we've seen sequencer throughput fail first in most projects, especially those running on modest hardware. Proof generation is a close second for zk-rollups. Don't optimize for data availability compression until you've ruled out the other two.
3. Core Workflow: Diagnosing and Fixing Each Bottleneck
Follow these steps in order. Each step includes a diagnostic check and a solution.
Step 1: Check Sequencer Throughput
Monitor your sequencer's CPU and memory usage during peak load. If CPU is above 80% or memory is near capacity, the sequencer is likely the bottleneck. Also check the transaction pool size—if it's growing continuously, the sequencer can't keep up.
Solution: Scale vertically first. Upgrade to a machine with more cores and faster RAM. If that's not enough, consider horizontal scaling by sharding the sequencer (e.g., multiple sequencers handling different shards). A multi-sequencer setup needs an ordering protocol between the nodes (for example, a BFT protocol such as Tendermint), which a single sequencer draining a FIFO queue does not; that coordination adds its own latency, so measure before committing. Some rollups also group transactions into batches before sequencing to reduce per-transaction overhead.
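The diagnostic in this step can be automated with a small check like the one below. It is a sketch under stated assumptions: pool size samples arrive as `(timestamp_seconds, pool_size)` pairs from whatever monitoring you already run, and the function names and the growth threshold of 1 transaction per second are hypothetical.

```python
def pool_growth_rate(samples):
    """Least-squares slope of pool size over time, in transactions per second.

    samples: list of (timestamp_seconds, pool_size) pairs.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_p = sum(p for _, p in samples) / n
    covariance = sum((t - mean_t) * (p - mean_p) for t, p in samples)
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    if variance == 0:
        raise ValueError("samples must span more than one timestamp")
    return covariance / variance


def sequencer_is_saturated(cpu_percent, pool_samples,
                           cpu_threshold=80.0, min_growth=1.0):
    """True if CPU is above the threshold or the pool keeps growing."""
    return (cpu_percent > cpu_threshold
            or pool_growth_rate(pool_samples) > min_growth)


# Pool grows by 100 transactions every 10 seconds: slope is 10 tx/s.
growing_pool = [(0, 100), (10, 200), (20, 300)]
print(pool_growth_rate(growing_pool))
print(sequencer_is_saturated(45.0, growing_pool))
```

A slope is used instead of a single reading because a large but stable pool is not a problem, while a small pool that grows steadily is.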
Step 2: Evaluate Data Availability Compression
Look at your batch submission costs on L1. If a single batch costs more than 0.1 ETH, you're paying too much. The issue is often inefficient encoding—sending full transaction data instead of compressed calldata.
Solution: Use EIP-4844 blobs for data availability if your L1 supports it (e.g., Ethereum mainnet after the Dencun upgrade). Blobs are cheaper than calldata and allow larger batches. If blobs aren't available, optimize your compression: use a delta-based encoding that only sends state differences, or batch more transactions together to amortize fixed costs. Some teams also use data availability committees (DACs) to reduce L1 posting frequency, but this introduces trust assumptions.
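To get a rough feel for the calldata versus blob trade-off, the arithmetic can be sketched as follows. The per-byte calldata prices (4 gas per zero byte, 16 per nonzero byte) and the blob size (131,072 bytes, charged as 131,072 blob gas) are Ethereum protocol constants; the gas prices in the example are made up. The sketch deliberately ignores the 21,000 gas base transaction cost, the calldata floor price introduced by EIP-7623, and the fact that not every byte of a blob is usable for payload, so treat the output as an order-of-magnitude estimate.

```python
import math

CALLDATA_GAS_ZERO_BYTE = 4       # gas per zero byte of calldata
CALLDATA_GAS_NONZERO_BYTE = 16   # gas per nonzero byte of calldata
BLOB_SIZE_BYTES = 131_072        # bytes per EIP-4844 blob
GAS_PER_BLOB = 131_072           # blob gas charged per blob
GWEI = 10**9


def calldata_cost_wei(data: bytes, gas_price_wei: int) -> int:
    """Data cost of posting `data` as calldata (base tx cost excluded)."""
    zero_bytes = data.count(0)
    nonzero_bytes = len(data) - zero_bytes
    gas = (zero_bytes * CALLDATA_GAS_ZERO_BYTE
           + nonzero_bytes * CALLDATA_GAS_NONZERO_BYTE)
    return gas * gas_price_wei


def blob_cost_wei(data_len: int, blob_gas_price_wei: int) -> int:
    """Cost of posting `data_len` bytes in blobs; blobs are paid for whole."""
    blobs = math.ceil(data_len / BLOB_SIZE_BYTES)
    return blobs * GAS_PER_BLOB * blob_gas_price_wei


# Hypothetical 100 kB batch with no zero bytes, at assumed gas prices.
batch = bytes([1]) * 100_000
print(calldata_cost_wei(batch, 20 * GWEI))   # execution gas at 20 gwei
print(blob_cost_wei(len(batch), 1 * GWEI))   # blob gas at 1 gwei
```

Because blobs are billed whole, a batch that barely spills into a second blob pays for both, which is one reason to size batches against the blob boundary.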
Step 3: Address Proof Generation Latency
For zk-rollups, measure the time to generate a proof for a single block. If it's over 5 minutes, you'll have trouble scaling. For optimistic rollups, the challenge period (typically 7 days) is fixed, but fraud proof generation can be slow if disputes are frequent.
Solution: For zk-rollups, use parallel proof generation: split the block into smaller chunks and prove each chunk simultaneously. This requires a prover infrastructure with multiple GPUs. Alternatively, evaluate a different proof system such as PLONK or STARKs, which may offer faster proving times for your circuits; benchmark before switching, since the gains depend on the workload. For optimistic rollups, reduce the challenge period by using a faster fraud proof mechanism (e.g., interactive proving) or by running a permissioned validator set that you trust to act quickly, keeping in mind that a permissioned set adds a trust assumption.
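The chunk-and-prove-in-parallel idea has roughly the following shape. This is only a structural sketch: `prove_chunk` is a stand-in that hashes the chunk instead of proving it, threads stand in for GPU workers or separate prover processes, and a real system would still need an aggregation step to combine the chunk proofs. All function names are hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(txs, chunk_size):
    """Split a block's transactions into consecutive chunks."""
    return [txs[i:i + chunk_size] for i in range(0, len(txs), chunk_size)]


def prove_chunk(chunk):
    """Stand-in for a real prover call: hashes the chunk, proves nothing."""
    digest = hashlib.sha256()
    for tx in chunk:
        digest.update(tx)
    return digest.hexdigest()


def prove_block_in_parallel(txs, chunk_size, workers=4):
    """Prove all chunks concurrently; results come back in chunk order."""
    chunks = split_into_chunks(txs, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(prove_chunk, chunks))


block = [bytes([i]) for i in range(10)]
chunk_proofs = prove_block_in_parallel(block, chunk_size=4)
print(len(chunk_proofs))  # 3 chunks: 4 + 4 + 2 transactions
```

With a real prover, wall-clock time is bounded by the slowest chunk plus aggregation, so chunks of similar proving cost matter more than chunks of similar transaction count.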
4. Tools, Setup, and Environment Realities
You don't need expensive enterprise tools to diagnose these bottlenecks. Here's a practical toolkit:
- Prometheus + Grafana: Monitor sequencer CPU, memory, and transaction pool size. Set up alerts when CPU exceeds 70% for more than 5 minutes.
- Block explorer dashboards: Track batch submission times and gas costs on L1. Etherscan or a custom indexer works.
- Benchmarking scripts: Use tx-spammer or artemis to generate load and measure TPS. Run for at least 10 minutes to get stable numbers.
Environment matters. In staging, you might run on a single machine with low latency to L1. In production, your sequencer may be in a different region from L1 nodes, adding around 100ms of latency per round trip, which can noticeably lengthen batch submission times. Always test under realistic network conditions.
Another reality: most rollup frameworks (like Optimism's OP Stack or Arbitrum Nitro) come with default settings optimized for average use. These defaults may not suit your workload. For example, a default that closes a batch after roughly 100 transactions is fine for general use, but if you're processing 1000 TPS, each batch would cover only a tenth of a second of traffic; raising the limit to 1000 or more spreads the fixed L1 submission cost across more transactions. Check your framework's batcher documentation for the exact parameter names and defaults, and tune them based on your load tests.
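The effect of batch size on L1 cost comes down to amortizing a fixed overhead, which a few lines of arithmetic make concrete. The 21,000 gas base cost of an L1 transaction is a protocol constant; the 1,600 gas of data per rollup transaction is an assumed figure (about 100 nonzero calldata bytes), and the function name is illustrative.

```python
L1_TX_BASE_GAS = 21_000  # fixed gas cost of any L1 transaction


def l1_gas_per_rollup_tx(batch_size, data_gas_per_tx,
                         fixed_gas=L1_TX_BASE_GAS):
    """L1 gas attributable to one rollup transaction in a batch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return fixed_gas / batch_size + data_gas_per_tx


for size in (100, 1000):
    print(size, l1_gas_per_rollup_tx(size, data_gas_per_tx=1_600))
# 100 1810.0
# 1000 1621.0
```

The fixed share shrinks quickly and then flattens, so past a certain batch size the remaining savings have to come from compressing the per-transaction data instead.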
Finally, consider the cost of scaling. Vertical scaling (bigger machines) is simple but expensive. Horizontal scaling (multiple sequencers) is complex but can be cheaper in the long run. For proof generation, buying a cluster of GPUs may be necessary, but cloud GPU instances can be cost-prohibitive for small teams. Evaluate whether a third-party prover service (like the one from Aleo or zkSync) fits your budget and trust model.
5. Variations for Different Constraints
Not every rollup faces the same bottlenecks. Your specific constraints change the priority order.
Constraint: Low Budget for Infrastructure
If you can't afford high-end hardware or multiple GPUs, focus on sequencer optimization first. Use a single, well-tuned machine with fast NVMe storage and plenty of RAM. For data availability, rely on calldata compression and batch less frequently (e.g., every 5 minutes instead of every minute) to reduce L1 costs. For proof generation, consider a zk-rollup built on a proof system like Groth16, whose small proofs are cheap to verify on L1 (at the price of a circuit-specific trusted setup), or switch to an optimistic rollup with a longer challenge period.
Constraint: Low Latency Requirements (DeFi)
If your users need sub-second confirmations, sequencer throughput and proof generation are paramount. Use a high-performance sequencer with a custom consensus algorithm (e.g., HotStuff or a DAG-based model) to minimize ordering delays. For zk-rollups, invest in parallel proof generation with multiple GPUs. Consider a hybrid approach: use an optimistic rollup for most transactions but a zk-rollup for high-value trades that need fast finality.
Constraint: High Throughput (Gaming or Social)
Gaming and social apps can tolerate longer finality (minutes to hours) but need very high TPS (thousands). Here, data availability compression becomes the main bottleneck. Use EIP-4844 blobs aggressively and compress transaction data with custom encoding (e.g., only sending state diffs). Batch as many transactions as possible—up to 10,000 per batch. Sequencer throughput can be scaled horizontally with sharding, as the app doesn't need global ordering for all transactions.
6. Pitfalls, Debugging, and What to Check When It Fails
Even with the best planning, things go wrong. Here are common pitfalls and how to debug them.
Pitfall 1: Over-Provisioning Calldata
Many teams send raw transaction data to L1 without compression, thinking that more data means faster finality. In reality, this increases gas costs and can lead to batch submission failures if the gas limit is exceeded. Check: Look at your calldata size per transaction. If it's over 500 bytes, you're likely sending too much. Fix: Implement a compression scheme like Snappy or Brotli before posting. For zk-rollups, use a state diff approach that only sends the changed state slots.
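A quick way to run this check is to compress a sample batch and compare bytes per transaction before and after. The sketch below uses `zlib` from the Python standard library purely as a stand-in, since Snappy and Brotli need third-party packages; ratios will differ with the codec you actually deploy. The sample batch layout and the function name are made up for illustration.

```python
import zlib


def compression_report(batch: bytes, tx_count: int, level: int = 9):
    """Compare raw and compressed bytes per transaction for one batch."""
    if tx_count <= 0 or not batch:
        raise ValueError("need a non-empty batch and a positive tx_count")
    compressed = zlib.compress(batch, level)
    return {
        "raw_bytes_per_tx": len(batch) / tx_count,
        "compressed_bytes_per_tx": len(compressed) / tx_count,
        "ratio": len(compressed) / len(batch),
    }


# Synthetic batch: 200 identical 64-byte records with heavy zero padding,
# loosely imitating ABI-encoded fields. Real traffic compresses less well.
record = b"\x00" * 12 + b"\xaa" * 20 + b"\x00" * 32
sample_batch = record * 200
report = compression_report(sample_batch, tx_count=200)
print(report["raw_bytes_per_tx"], report["ratio"] < 1)
```

Run it on a batch captured from production traffic; synthetic data like the sample here overstates the achievable ratio.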
Pitfall 2: Ignoring Batch Timing
Posting batches too frequently (every 10 seconds) increases L1 costs and can cause the sequencer to spend more time on submission than on processing. Posting too rarely (every hour) delays finality and can lead to large batches that exceed L1 block gas limits. Check: Monitor batch submission success rate. If you see failed submissions due to 'out of gas', your batch is too large. Fix: Set a dynamic batch size based on current gas prices and L1 block space. Use a target of 10-30% of L1 block gas limit per batch.
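A dynamic batch size along these lines could be computed as shown below. This is a minimal sketch, not a framework setting: it caps the batch by both a share of the L1 block gas limit and a cost budget at the current gas price. The 30,000,000 gas limit, 1,600 gas per transaction, and 0.1 ETH budget in the example are assumptions, and the function name is hypothetical.

```python
def max_txs_per_batch(l1_block_gas_limit, gas_per_tx, gas_price_wei,
                      max_batch_cost_wei, target_percent=20,
                      fixed_gas=21_000):
    """Largest transaction count that fits both the space and cost budgets."""
    if not 0 < target_percent <= 100:
        raise ValueError("target_percent must be in (0, 100]")
    if gas_per_tx <= 0 or gas_price_wei <= 0:
        raise ValueError("gas_per_tx and gas_price_wei must be positive")
    # Share of the L1 block this batch is allowed to occupy.
    space_budget = l1_block_gas_limit * target_percent // 100
    # Gas affordable at the current price within the cost budget.
    cost_budget = max_batch_cost_wei // gas_price_wei
    gas_budget = min(space_budget, cost_budget) - fixed_gas
    return max(gas_budget // gas_per_tx, 0)


GWEI = 10**9
ETH = 10**18

# Cheap gas: the block-space share is the binding limit.
print(max_txs_per_batch(30_000_000, 1_600, 10 * GWEI, ETH // 10))  # 3736
# Expensive gas: the 0.1 ETH cost budget binds first.
print(max_txs_per_batch(30_000_000, 1_600, 20 * GWEI, ETH // 10))  # 3111
```

Recomputing this before each submission lets the batch shrink automatically when L1 gas prices spike, instead of failing with an out-of-gas error.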
Pitfall 3: Underestimating Proof Generation Time
In zk-rollups, proof generation time can increase non-linearly with transaction complexity. A simple transfer might take 1 second, but a complex DeFi swap could take 10 seconds. If you batch many complex transactions together, proof time can balloon to hours. Check: Profile proof generation per transaction type. Fix: Separate simple and complex transactions into different blocks, or use a prover queue with priority for simple transactions. Consider using a recursive proof scheme that combines multiple proofs into one, reducing total time.
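A prover queue that favors cheap transactions might look like the sketch below. The per-type proving costs reuse the illustrative figures from this pitfall (1 second for a transfer, 10 seconds for a swap); in practice they would come from your own profiling. `ProverQueue` and its methods are hypothetical names.

```python
import heapq
import itertools

# Illustrative per-type proving costs in seconds; replace with profiled values.
PROVE_SECONDS = {"transfer": 1.0, "swap": 10.0}


class ProverQueue:
    """Priority queue that hands out cheaper-to-prove transactions first."""

    def __init__(self):
        self._heap = []
        # Monotonic counter breaks ties so equal-cost items stay FIFO.
        self._counter = itertools.count()

    def push(self, tx_id, tx_type):
        cost = PROVE_SECONDS[tx_type]
        heapq.heappush(self._heap, (cost, next(self._counter), tx_id))

    def pop(self):
        """Return (tx_id, estimated_seconds) for the cheapest queued item."""
        cost, _, tx_id = heapq.heappop(self._heap)
        return tx_id, cost

    def __len__(self):
        return len(self._heap)


queue = ProverQueue()
queue.push("a", "swap")
queue.push("b", "transfer")
queue.push("c", "transfer")
order = [queue.pop() for _ in range(len(queue))]
print(order)  # [('b', 1.0), ('c', 1.0), ('a', 10.0)]
```

Strict cost ordering can starve complex transactions under sustained load, so a production queue would likely add an age-based boost or a reserved share of prover capacity for them.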
Pitfall 4: Neglecting Sequencer Memory
The sequencer keeps a transaction pool in memory. If the pool grows too large (e.g., 100,000 pending transactions), memory usage spikes and the sequencer may crash. Check: Monitor memory usage and pool size. Fix: Implement a pool size limit with a backpressure mechanism—reject new transactions when the pool exceeds a threshold. Also, increase the batch size to drain the pool faster.
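The pool limit with backpressure can be sketched in a few lines. This is an in-memory illustration only, with hypothetical class and method names; a real sequencer would also need locking, eviction rules, and per-sender limits.

```python
from collections import deque


class PoolFull(Exception):
    """Raised when the pool is at capacity and a transaction is rejected."""


class BoundedTxPool:
    """FIFO transaction pool that rejects new entries once it is full."""

    def __init__(self, max_size):
        if max_size <= 0:
            raise ValueError("max_size must be positive")
        self._txs = deque()
        self._max_size = max_size

    def add(self, tx):
        # Backpressure: refuse instead of growing without bound.
        if len(self._txs) >= self._max_size:
            raise PoolFull(f"pool is at its limit of {self._max_size}")
        self._txs.append(tx)

    def take_batch(self, batch_size):
        """Remove and return up to `batch_size` of the oldest transactions."""
        count = min(batch_size, len(self._txs))
        return [self._txs.popleft() for _ in range(count)]

    def __len__(self):
        return len(self._txs)


pool = BoundedTxPool(max_size=3)
for tx in ("tx1", "tx2", "tx3"):
    pool.add(tx)
try:
    pool.add("tx4")
    rejected = False
except PoolFull:
    rejected = True
print(rejected, pool.take_batch(2), len(pool))  # True ['tx1', 'tx2'] 1
```

Rejecting at the edge gives clients a clear signal to retry later, which is usually preferable to accepting transactions that then sit in memory until the process runs out of it.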
7. FAQ and Checklist in Prose
Here are answers to common questions, followed by a quick checklist you can use today.
FAQ
Q: How do I know which bottleneck is my primary one?
A: Run a load test while monitoring sequencer CPU, batch submission costs, and proof generation time. The first metric to reach 80% of its limit (CPU capacity, your per-batch cost budget, or your proof-time target) is your primary bottleneck. If none reaches 80%, your bottleneck is likely network latency to L1.
Q: Can I use a data availability layer like Celestia instead of L1?
A: Yes, but it adds trust assumptions and latency. Celestia can reduce L1 costs but introduces a separate consensus mechanism. Evaluate whether the trade-off is worth it for your use case.
Q: What's the best way to reduce proof generation time without buying more GPUs?
A: Optimize your circuit design—use smaller field sizes, reduce the number of constraints, and implement parallel proving within a single GPU. Also, consider using a proof system with faster proving times, like STARKs (which are slower to verify but faster to prove than Groth16).
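The rule from the first answer (the first resource to reach 80% is the primary bottleneck) can be applied mechanically to load-test data. The sketch below assumes each metric has been normalized to a 0 to 1 utilization, for example CPU percent divided by 100, batch cost divided by the 0.1 ETH budget, and proof time divided by the 5-minute target. The function name and sample numbers are hypothetical.

```python
def first_to_saturate(timeline, threshold=0.8):
    """Find the first resource to reach the threshold as load increases.

    timeline: list of (offered_tps, {resource_name: utilization}) entries,
    ordered by increasing offered_tps, with utilization in the range 0..1.
    Returns (resource_name, offered_tps), or None if nothing saturates.
    """
    for offered_tps, utilization in timeline:
        over = {name: u for name, u in utilization.items() if u >= threshold}
        if over:
            # If several cross at the same step, report the most loaded one.
            return max(over, key=over.get), offered_tps
    return None


# Made-up load-test readings at three load levels.
readings = [
    (50, {"sequencer_cpu": 0.40, "batch_cost": 0.30, "proof_time": 0.50}),
    (100, {"sequencer_cpu": 0.85, "batch_cost": 0.55, "proof_time": 0.70}),
    (200, {"sequencer_cpu": 0.99, "batch_cost": 0.90, "proof_time": 0.95}),
]
print(first_to_saturate(readings))  # ('sequencer_cpu', 100)
```

A `None` result corresponds to the case where no resource reaches the threshold, which is when network latency to L1 becomes the next suspect.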
Checklist for Your Rollup
- Measure sequencer CPU and memory at peak load. If above 80%, scale vertically or horizontally.
- Check batch submission costs on L1. If above 0.1 ETH per batch, optimize compression or use blobs.
- Measure proof generation time per block (zk-rollups). If above 5 minutes, implement parallel proving or switch proof systems.
- Monitor transaction pool size. If growing continuously, increase batch size or implement pool limits.
- Test with realistic network latency to L1. Adjust batch timing accordingly.
- Review default parameters in your rollup framework. Tune batch size, block time, and gas limits for your workload.
- Document your bottleneck hierarchy—know which one will hit first as you scale, and have a plan for each.
By following this checklist, you'll catch the hidden bottlenecks before they catch you. Start with the sequencer, then data availability, then proofs—that order has saved the most projects we've seen. Good luck, and may your rollup scale smoothly.