⛏️

Hashrate to Inference Converter

Turn your crypto-mining hashrate into an estimated local LLM inference capacity. See which models fit and how many tokens/sec you might generate.

Mining setup

Used by Ethereum Classic and, historically, Ethereum. Memory-bound, so FLOPs per hash are low.

Typical range: 0.18-0.7 J/MH

Real inference rarely uses 100% of peak FP16.

Estimated peak FP16 compute

Inference power

Power cost / month

Best-fit model

Estimated tok/s by model

Throughput depends on quantization, context length, and batch size. Treat these as rough directional estimates.

Model                        Use case                     VRAM needed   Est. tok/s   Fit
Llama 3.1 8B Q4              Fast local chat              6.5 GB
Llama 3.1 70B Q4             High-capability agent        42 GB
Qwen2.5 14B Q4               Balanced coding assistant    10 GB
DeepSeek-V3 / R1 Q4 (MoE)    Reasoning / coding heavy     75 GB
Mistral Small 22B Q4         Agentic reasoning            15 GB
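
The VRAM column above can be turned into a simple fit check. The sketch below is a minimal illustration, not the converter's actual logic: it assumes "best fit" means the largest listed model that still fits, and `headroom_gb` is a hypothetical safety margin for KV cache and runtime overhead.

```python
# VRAM requirements copied from the table above (GB, Q4 quantization).
MODEL_VRAM_GB = {
    "Llama 3.1 8B Q4": 6.5,
    "Llama 3.1 70B Q4": 42.0,
    "Qwen2.5 14B Q4": 10.0,
    "DeepSeek-V3 / R1 Q4 (MoE)": 75.0,
    "Mistral Small 22B Q4": 15.0,
}


def models_that_fit(available_vram_gb, headroom_gb=1.0):
    """Return names of models that fit in VRAM, largest requirement first.

    headroom_gb is an assumed margin, not a figure from the converter.
    """
    fitting = [
        (name, need)
        for name, need in MODEL_VRAM_GB.items()
        if need + headroom_gb <= available_vram_gb
    ]
    fitting.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in fitting]


def best_fit(available_vram_gb, headroom_gb=1.0):
    """Assumed definition of best fit: the largest model that still fits."""
    fitting = models_that_fit(available_vram_gb, headroom_gb)
    return fitting[0] if fitting else None
```

For a 24 GB card with the default margin, `best_fit(24)` picks Mistral Small 22B Q4, and the two largest models are excluded.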

How the estimate works

  • Hashrate → flops: we use the algorithm's approximate MH/s : GFLOPS ratio to back out compute.
  • Utilization: real inference uses 40-90% of peak FP16 depending on batch size and memory bandwidth.
  • VRAM fit: only models that fit in your available VRAM are marked green.
  • Power: inferred from efficiency and hashrate, then costed at your electricity rate.
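
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the converter's real code: `gflops_per_mhs` is a hypothetical per-algorithm ratio (the actual ratios are not given here), power is assumed to be the same under inference as under mining, and `hours_per_month` defaults to an assumed 730.

```python
def estimate_inference(hashrate_mhs, gflops_per_mhs, efficiency_j_per_mh,
                       utilization, price_per_kwh, hours_per_month=730.0):
    """Rough hashrate -> inference estimate following the listed steps.

    gflops_per_mhs is an assumed conversion ratio, not a published value.
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")

    # Hashrate -> flops: MH/s times GFLOPS per MH/s, converted to TFLOPS.
    peak_fp16_tflops = hashrate_mhs * gflops_per_mhs / 1000.0

    # Utilization: only a fraction of peak FP16 is usable in practice.
    usable_tflops = peak_fp16_tflops * utilization

    # Power: J/MH times MH/s gives J/s, which is watts.
    power_w = efficiency_j_per_mh * hashrate_mhs

    # Cost: watts -> kWh over the month, priced at the electricity rate.
    monthly_kwh = power_w / 1000.0 * hours_per_month
    monthly_cost = monthly_kwh * price_per_kwh

    return {
        "peak_fp16_tflops": peak_fp16_tflops,
        "usable_tflops": usable_tflops,
        "power_w": power_w,
        "monthly_cost": monthly_cost,
    }
```

For example, `estimate_inference(100, 300, 0.5, 0.6, 0.10)` treats 100 MH/s at 0.5 J/MH as a 50 W load; the ratio of 300 GFLOPS per MH/s is purely illustrative.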

Reality checks

  • Mining and inference stress different parts of the card. A mined card may have degraded memory.
  • Large models are memory-bandwidth bound; teraflops alone overstate speed.
  • Batch processing and prompt prefill can swing real tok/s by 2-5×.
  • Always verify VRAM headroom; quantization tables are approximate.
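
To see why teraflops alone overstate speed, compare a compute-bound ceiling with a memory-bandwidth ceiling for single-stream decoding. The sketch below uses two common rules of thumb, both assumptions rather than anything the converter states: about 2 FLOPs per parameter per generated token, and one full read of the model weights per token. The bandwidth and parameter figures in the usage note are hypothetical inputs.

```python
def compute_bound_tok_s(usable_tflops, params_billion):
    """Ceiling from compute, assuming ~2 FLOPs per parameter per token."""
    flops_per_token = 2.0 * params_billion * 1e9
    return usable_tflops * 1e12 / flops_per_token


def bandwidth_bound_tok_s(mem_bandwidth_gb_s, model_size_gb):
    """Ceiling from memory bandwidth, assuming all weights are read per token.

    This overstates the cost for MoE models, which read only active experts.
    """
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return mem_bandwidth_gb_s / model_size_gb


def estimated_tok_s(usable_tflops, params_billion,
                    mem_bandwidth_gb_s, model_size_gb):
    """Single-stream decode estimate: the lower of the two ceilings."""
    return min(
        compute_bound_tok_s(usable_tflops, params_billion),
        bandwidth_bound_tok_s(mem_bandwidth_gb_s, model_size_gb),
    )
```

With 18 usable TFLOPS, an 8B model at 6.5 GB, and an assumed 448 GB/s of memory bandwidth, the compute ceiling is over a thousand tok/s while the bandwidth ceiling is under 70, so the bandwidth figure is the one that matters.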
