Streaming 3D reconstruction from video demands bounded memory and compute, yet existing geometric foundation models either operate offline at quadratic cost or accumulate an ever-growing KV cache that exhausts GPU memory within a few hundred frames. We present OVGGT, a training-free framework that caps both memory and compute at a fixed budget regardless of sequence length. Our approach combines Self-Selective Caching, which compresses the KV cache based on FFN residuals and remains fully compatible with FlashAttention, with Dynamic Anchor Protection, which shields coordinate-critical tokens from eviction to suppress geometric drift. Experiments on indoor, outdoor, and ultra-long benchmarks show that OVGGT processes arbitrarily long videos within a constant VRAM envelope while surpassing existing methods in accuracy.
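To make the eviction policy concrete, the sketch below shows one plausible way to combine score-based KV cache compression with a protected anchor set. All names here (`evict_kv_cache`, the use of a precomputed per-token importance score, and NumPy instead of a GPU tensor library) are illustrative assumptions, not the paper's implementation; in OVGGT the importance score would come from the FFN residual magnitude and the anchor set from coordinate-critical tokens.

```python
import numpy as np

def evict_kv_cache(keys, values, scores, anchor_idx, budget):
    """Compress a KV cache to at most `budget` tokens (a sketch).

    Anchor tokens are always retained (Dynamic-Anchor-Protection-style
    shielding); the remaining slots go to the non-anchor tokens with the
    highest importance scores (here assumed precomputed, e.g. from the
    FFN residual norm of each token).
    """
    n = keys.shape[0]
    if n <= budget:
        return keys, values, np.arange(n)

    anchors = set(int(i) for i in anchor_idx)
    # Slots left over after all anchors are protected (never negative).
    free = max(budget - len(anchors), 0)

    # Non-anchor candidates, ranked by importance score, descending.
    cand = [i for i in range(n) if i not in anchors]
    cand.sort(key=lambda i: scores[i], reverse=True)

    # Keep anchors plus the top-scoring candidates, in original order
    # so relative token positions in the cache are preserved.
    keep = np.array(sorted(list(anchors) + cand[:free]))
    return keys[keep], values[keep], keep
```

Because the policy only selects row indices and never modifies attention itself, it stays compatible with fused kernels such as FlashAttention, which is the property the abstract highlights.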