ESMFold2 Tools

A third-party ESMFold2 guide, code generator, input builder, and structure inspection toolkit.

© 2026 ESMFold2 Tools. All Rights Reserved.

First-party measurement: runtime and memory, not accuracy

ESMFold2 Benchmark: H100 Runtime, VRAM, and Max Length

We ran ESMFold2 on an NVIDIA H100 80GB via Modal serverless and measured what actually matters when you deploy it: cold start, warm latency, peak VRAM, and where it runs out of memory.

Length (aa)  Bucket  Cold start  First (JIT)  Warm    Peak VRAM  Result
200          256     53.7s       25.9s        0.26s   52.1 GB    OK
450          512     -           29.6s        1.26s   52.1 GB    OK
700          768     -           31.6s        2.66s   52.1 GB    OK
800          1024    -           -            -       -          OOM

Measured on June 8, 2026 · ESMFold2 (Biohub ESM) · NVIDIA H100 80GB · Modal serverless. All four sequence lengths were run in a single H100 container, so the cold start was paid once.

What we measured

  • Once the container is warm, inference is fast and grows faster than linearly with length: ~0.3s at 200 residues, ~1.3s at 450, and ~2.7s at 700, all under three seconds.
  • Peak VRAM stayed at about 52 GB across every length from 200 to 700 residues, so memory is dominated by fixed model allocations, not by sequence length within the working range.
  • Cold start (model load + weight convert) was ~54s and is paid once per container; the first prediction at each new padded length adds a ~26–32s JIT compile.
  • The one sequence we tested in the 1024 bucket (800 residues) ran out of memory even on an 80GB H100, so the practical single-GPU ceiling in our setup is the 768 bucket (tested at 700 residues).
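To put a rough number on how warm latency grows with length, the sketch below computes a pairwise scaling exponent from the three warm measurements in the table. It uses raw sequence length rather than the padded bucket, and three points are far too few to fit a scaling law, so treat the result as a back-of-the-envelope check rather than a property of the model.

```python
import math

# Warm latencies from the table above: (length in residues, seconds).
warm = [(200, 0.26), (450, 1.26), (700, 2.66)]


def scaling_exponent(p, q):
    """Return k such that time grows like length**k between points p and q."""
    (n1, t1), (n2, t2) = p, q
    return math.log(t2 / t1) / math.log(n2 / n1)


# One exponent per pair of adjacent measurements.
pairs = list(zip(warm, warm[1:]))
exponents = [scaling_exponent(a, b) for a, b in pairs]

for (a, b), k in zip(pairs, exponents):
    print(f"{a[0]} -> {b[0]} residues: exponent ~ {k:.2f}")
```

An exponent of 1 would mean linear growth; values above 1 mean latency rises faster than sequence length.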

How we measured this

These numbers come from running ESMFold2 on Modal serverless H100 GPUs with our own benchmark harness (bench/modal_benchmark.py). We are not the model authors; this measures runtime and memory, not prediction accuracy.

Cold start is the one-time cost of container init plus model load and weight convert, paid once per container. The first prediction at a new padded length adds a JIT compile; subsequent predictions at the same padded length are pure inference (the "warm" column).
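The split between the first call and warm calls can be reproduced with a small timing wrapper. The following is a minimal sketch and not our actual harness: `time_first_and_warm` and `make_fake_predict` are hypothetical names, and the stand-in predictor only simulates a compile delay with `time.sleep` so the example runs without a GPU.

```python
import time
from typing import Callable, Dict, List


def time_first_and_warm(predict: Callable[[str], object],
                        sequence: str,
                        warm_runs: int = 3) -> Dict[str, float]:
    """Time the first call (compile + inference) separately from warm calls."""
    start = time.perf_counter()
    predict(sequence)
    first = time.perf_counter() - start

    warm: List[float] = []
    for _ in range(warm_runs):
        start = time.perf_counter()
        predict(sequence)
        warm.append(time.perf_counter() - start)

    # Report the best warm run to reduce scheduling noise.
    return {"first_s": first, "warm_s": min(warm)}


def make_fake_predict(compile_s: float = 0.05, infer_s: float = 0.005):
    """Hypothetical stand-in: sleeps longer the first time it sees a length."""
    seen_lengths = set()

    def predict(sequence: str) -> None:
        if len(sequence) not in seen_lengths:
            seen_lengths.add(len(sequence))
            time.sleep(compile_s)  # simulated one-time compile
        time.sleep(infer_s)        # simulated inference

    return predict


result = time_first_and_warm(make_fake_predict(), "A" * 200)
print(result)
```

With a real model, the callable passed as `predict` would need to block until the GPU has finished its work, otherwise the timer stops before the prediction is complete.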

Because the model recompiles per shape, inputs are padded to fixed length buckets. A 700-residue sequence runs at the 768 bucket; an 800-residue sequence runs at the 1024 bucket, which ran out of memory even on an 80GB H100.
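The length-to-bucket mapping can be sketched as rounding up to the next multiple of 256. That step size is an assumption inferred from the four rows in our table (256, 512, 768, 1024), not a documented model setting, and the function names are hypothetical.

```python
# Bucket step inferred from the table (256, 512, 768, 1024); an assumption,
# not a documented ESMFold2 setting.
BUCKET_STEP = 256
# Largest bucket that fit on the 80GB H100 in our run.
MAX_OK_BUCKET = 768


def padded_bucket(length: int, step: int = BUCKET_STEP) -> int:
    """Round a sequence length up to the next multiple of `step`."""
    if length <= 0:
        raise ValueError("length must be positive")
    return -(-length // step) * step  # integer ceiling division


def fits_single_h100(length: int) -> bool:
    """True if the padded length is within the largest bucket we saw succeed."""
    return padded_bucket(length) <= MAX_OK_BUCKET


for n in (200, 450, 700, 800):
    print(n, padded_bucket(n), "OK" if fits_single_h100(n) else "OOM")
```

A check like this before submitting a job avoids paying a cold start and a JIT compile for a sequence that is likely to run out of memory.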

This dataset is intentionally small and is being expanded with more sequence lengths and GPU tiers. We only publish points we have actually measured.

FAQ

How much VRAM does ESMFold2 need?

In our run, a 700-residue prediction peaked near 52 GB of VRAM on an H100, so it needs a high-memory datacenter GPU (80GB A100/H100 class) rather than a consumer card. We have only measured the H100.

What is the maximum sequence length for ESMFold2 on one GPU?

We measured a 700-residue sequence (768 bucket) running fine, while an 800-residue sequence (1024 bucket) ran out of memory on an 80GB H100. So the practical single-GPU ceiling sits around the 768-length bucket in our setup.

How fast is ESMFold2 inference?

After warm-up, inference scales with length: about 0.3s at 200 residues, 1.3s at 450, and 2.7s at 700. The first call in a new container pays a ~54s cold start, and the first prediction at each new padded length adds a ~26–32s JIT compile.
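These costs can be combined into a rough end-to-end estimate for a batch of sequences. The sketch below assumes buckets are multiples of 256, that each bucket compiles once per container, and that every sequence in a bucket takes the warm time we measured for the single length tested in that bucket. `estimate_total_seconds` is a hypothetical helper, not part of any ESMFold2 API.

```python
from typing import Iterable

# Numbers from our table; one measurement per bucket.
COLD_START_S = 53.7
JIT_S = {256: 25.9, 512: 29.6, 768: 31.6}
WARM_S = {256: 0.26, 512: 1.26, 768: 2.66}


def bucket_for(length: int, step: int = 256) -> int:
    """Round up to the next multiple of `step` (assumed bucket rule)."""
    return -(-length // step) * step


def estimate_total_seconds(lengths: Iterable[int],
                           container_is_warm: bool = False) -> float:
    """Sum cold start (once), JIT (once per bucket), and warm inference."""
    total = 0.0 if container_is_warm else COLD_START_S
    compiled = set()
    for n in lengths:
        bucket = bucket_for(n)
        if bucket not in WARM_S:
            raise ValueError(f"no measurement for bucket {bucket}")
        if bucket not in compiled:
            total += JIT_S[bucket]  # first prediction at this padded length
            compiled.add(bucket)
        total += WARM_S[bucket]
    return total


estimate = estimate_total_seconds([200, 200, 700])
print(f"estimated total: {estimate:.2f}s")
```

The estimate makes the trade-off visible: for a handful of sequences the one-time costs dominate, while for long batches at a few fixed buckets the warm latency takes over.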

Is this an accuracy benchmark?

No. This page only measures runtime, latency, and memory. It does not compare prediction accuracy against ESMFold, AlphaFold3, or any other model.
