FERRAMENTAS LINUX: Valkey 9.1: A Deep Dive into Performance, Security, and What It Means for Your In-Memory Database Strategy

Valkey 9.1 delivers 2.1M req/sec, database-level ACLs, atomic HGETDEL & MSETEX. Essential upgrade for Redis-compatible in-memory stores.

You’ve built a service that depends on sub-millisecond response times. Your in-memory database handles session storage, real-time analytics, or a caching layer that keeps your API responsive.

But when traffic surges, you see tail-latency spikes, memory fragmentation warnings, and the nagging feeling that your data store isn’t scaling as cleanly as the rest of your stack.

Valkey 9.1, the latest major release from the Linux Foundation’s Valkey community (a fork of Redis), directly addresses those pain points.

This isn’t a collection of experimental features. It’s a production-ready evolution that rethinks I/O threading, tightens access control, and introduces new commands that simplify your code. Let’s break down what actually matters.

What You Will Learn

How Valkey 9.1’s new I/O threading model achieves 2.1 million requests per second, and when you’ll actually see those gains

Why numbered database-level access control changes security boundaries for multi-tenant deployments

Three new commands (HGETDEL, MSETEX, CLUSTERSCAN) that eliminate common anti-patterns and round trips

Practical memory and rehashing improvements that prevent performance cliffs under write-heavy workloads

From Foundation to Advanced: Understanding Valkey 9.1

What Is Valkey and Why Does It Exist?

Valkey is an open-source, in-memory key-value store that forked from Redis in 2024 following licensing changes. It maintains wire protocol compatibility with Redis while being governed by a neutral foundation.

For developers, this means you can drop Valkey into existing Redis clients, connection pools, and infrastructure—no code changes required.

Version 9.1 represents the community’s first major feature release where the performance and security roadmap clearly diverges from its origin.

The changes aren’t cosmetic; they address fundamental bottlenecks in how a single-threaded event loop (the traditional Redis/Valkey model) handles modern multi-core hardware.

The Performance Leap: 2.1 Million Requests Per Second

The headline number—2.1 million requests per second using 512-byte payloads—comes from Valkey 9.1’s redesigned I/O threading architecture. Here’s what that means in practical terms.

The Old Way: Single-Threaded Event Loop

Traditional Redis and earlier Valkey versions processed commands on a single main thread. While this eliminates locking overhead, it also leaves CPU cores idle. Network I/O (reading requests, writing responses) ran on that same thread by default, so high-throughput scenarios were bottlenecked by system call overhead and network stack processing.

The New I/O Threading Model

Valkey 9.1 separates network I/O from command execution:

Multiple I/O threads handle reading client requests and writing responses

Command execution remains single-threaded (preserving the predictable, lock-free execution model)

The main thread delegates network operations to I/O threads, then processes commands sequentially
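The division of labor above can be pictured with a toy model. This is only a sketch of the idea, not Valkey's implementation: a thread pool stands in for the I/O threads, a plain dict stands in for the keyspace, and the function names are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_request(raw: bytes) -> list:
    """I/O-thread work: decode and tokenize one inline request."""
    return raw.decode().split()


def encode_reply(reply: str) -> bytes:
    """I/O-thread work: serialize one reply for the wire."""
    return (reply + "\r\n").encode()


def execute(store: dict, command: list) -> str:
    """Main-thread work: run one command against the keyspace."""
    name, *args = command
    if name.upper() == "SET":
        store[args[0]] = args[1]
        return "OK"
    if name.upper() == "GET":
        return store.get(args[0], "(nil)")
    return "ERR unknown command"


def serve(raw_requests: list, io_threads: int = 2) -> list:
    store = {}
    with ThreadPoolExecutor(max_workers=io_threads) as pool:
        # Reads are parsed in parallel; map() preserves request order.
        parsed = list(pool.map(parse_request, raw_requests))
        # Commands run one at a time, so no locks are needed on `store`.
        replies = [execute(store, command) for command in parsed]
        # Writes are serialized in parallel again.
        return list(pool.map(encode_reply, replies))


if __name__ == "__main__":
    print(serve([b"SET a 1", b"GET a", b"GET missing"]))
```

Only the parsing and encoding steps spread across threads here; the loop that touches `store` stays sequential, which is why heavier commands still bottleneck on one core.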

Real-world impact: For workloads dominated by GET commands with modest payload sizes (sessions, user profiles, configuration data), throughput can scale close to linearly as you add I/O threads, until the single command-execution thread becomes the bottleneck.

For write-heavy or complex command workloads (Lua scripts, SORT, aggregate operations), the single-threaded execution is still the limiting factor—but I/O threading reduces the overhead of just moving data in and out.

Configuration guideline: Treat the number of CPU cores minus one (leaving a core for the main thread) as the ceiling for I/O threads, and monitor CPU utilization as you raise the count. More threads don’t always mean more throughput, because of cache contention.
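As a rough illustration of that guideline, the helper below derives a starting value from the core count. The function names and the optional cap are assumptions made for this sketch; io-threads itself is the valkey.conf directive used later in this article.

```python
import os
from typing import Optional


def suggest_io_threads(cpu_cores: int, cap: Optional[int] = None) -> int:
    """Cores minus one, never below 1, optionally limited by a cap."""
    threads = max(1, cpu_cores - 1)
    if cap is not None:
        threads = min(threads, max(1, cap))
    return threads


def conf_line(threads: int) -> str:
    """Render the matching valkey.conf directive."""
    return f"io-threads {threads}"


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    # A conservative cap is a reasonable first benchmark point.
    print(conf_line(suggest_io_threads(cores, cap=4)))
```

The result is a starting point for benchmarking, not a tuned value.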

Additional Performance Optimizations

Hardware clock enabled by default: Valkey now reads time from hardware timestamp counters where available, falling back to clock_gettime() otherwise. This improves precision for expiration, eviction, and slow-log tracking with lower overhead than software fallbacks.

Faster sorted set queries: ZRANGE, ZRANK, and ZSCORE operations on skiplist-based sorted sets received micro-optimizations around pointer chasing and branch prediction.

Higher throughput for GET commands: The I/O threading change helps most here, but internal response buffer management was also refactored to reduce malloc() calls.

Security: Numbered Database-Level Access Control

This is the most underrated feature in Valkey 9.1.

The Problem Valkey Solves

Traditional Valkey (and Redis) supports multiple logical databases (indexes 0–15 by default). However, database access was all-or-nothing: any authenticated user who could SELECT 5 could also SELECT 0 and reach other tenants’ data. This forced many teams to run a separate instance for each tenant or application, wasting memory and adding connection overhead.

What’s New in Valkey 9.1

The ACL (Access Control List) system now supports numbered database-level restrictions:

ACL SETUSER app_user on >password ~* &* +@all -@dangerous
ACL SETUSER app_user DATABASES 5,6,7

After this configuration, app_user can only SELECT databases 5, 6, or 7. Any attempt to access database 0 or 8+ returns a permission error.

Why this matters for you:

Multi-tenant SaaS: Run a single Valkey instance with each tenant isolated to a database number

Staging vs. production in one instance: Database 0 = production, database 1 = staging, with different credentials for each

Compliance: Meet data segregation requirements without deploying dozens of tiny instances

Lua Scripting as a Module

Valkey 9.1 moves Lua scripting support into its own loadable module (valkey-lua.so). By default, Lua scripts continue to work exactly as before. However, you can now:

Disable Lua entirely for attack surface reduction on cache-only instances

Update the Lua engine independently of Valkey core releases

Replace Lua with WebAssembly or other embedded runtimes in future releases (the modular architecture enables this)

Operational note: If you don’t use Lua scripts, explicitly disable the module in your configuration. Every unused feature is a potential vulnerability.

TLS Integration Improvements

TLS handling received two practical upgrades:

Reduced handshake latency through session resumption and better certificate chain handling

Proper masking of TLS errors (no more leaking plaintext “invalid password” messages before TLS negotiation completes)

New Commands That Simplify Your Code

These three additions eliminate common round-trip patterns that developers have kludged around for years.

HGETDEL – Atomic Retrieve and Delete from a Hash

The old way (two round trips, race condition possible):

# Client 1
HGET session:12345 user_id
# Client 2 could modify or delete the hash here
HDEL session:12345 user_id

The Valkey 9.1 way (one atomic operation):

HGETDEL session:12345 user_id

Returns the value and removes the field in a single, atomic step.

Use case: Processing a queue stored in a hash field, consuming temporary session attributes exactly once, or implementing “peek and delete” for workflow state machines.
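To see why the two-step version is risky, here is a small in-process simulation. A Python dict stands in for the hash and a callback stands in for a second client running between the two commands; none of this talks to a real server, and the function names are invented for the sketch.

```python
def consume_non_atomic(hash_fields: dict, field: str, interleave=None):
    """HGET followed by HDEL, with room for another client in between."""
    value = hash_fields.get(field)      # step 1: HGET
    if interleave is not None:
        interleave()                    # another client runs here
    hash_fields.pop(field, None)        # step 2: HDEL
    return value


def consume_atomic(hash_fields: dict, field: str):
    """Read and delete as one indivisible step."""
    return hash_fields.pop(field, None)


def demo_race():
    session = {"token": "abc"}
    seen_by_other = []
    first = consume_non_atomic(
        session,
        "token",
        interleave=lambda: seen_by_other.append(
            consume_non_atomic(session, "token")
        ),
    )
    return first, seen_by_other[0]


def demo_atomic():
    session = {"token": "abc"}
    return consume_atomic(session, "token"), consume_atomic(session, "token")


if __name__ == "__main__":
    print(demo_race())    # both callers observe the value
    print(demo_atomic())  # only the first caller observes it
```

In the racy version both consumers receive the token, so a "consume exactly once" guarantee is lost. The atomic version hands it to exactly one caller.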

MSETEX – Set Multiple Keys with Shared Expiration

The old way:

MSET key1:user:100 "value1" key2:user:100 "value2" key3:user:100 "value3"
EXPIRE key1:user:100 300
EXPIRE key2:user:100 300
EXPIRE key3:user:100 300

Three extra commands, and the sequence isn’t atomic. Miss one EXPIRE and that key never expires, which in a cache amounts to a memory leak.

The Valkey 9.1 way:

MSETEX 300 key1:user:100 "value1" key2:user:100 "value2" key3:user:100 "value3"

Use case: Caching multiple related data points from a single API response, session initialization (user ID, permissions, preferences all expire together), or rate limiting counters across time windows.
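For servers or client libraries that do not offer MSETEX yet, the closest long-standing equivalent is a transaction of SET ... EX commands. The sketch below only builds the command sequence (the helper name is hypothetical); how you send it depends on your client library.

```python
def setex_many_commands(pairs: dict, ttl_seconds: int) -> list:
    """Build MULTI / SET key value EX ttl ... / EXEC as argument tuples."""
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    commands = [("MULTI",)]
    for key, value in pairs.items():
        # SET with EX stores the value and its expiry in one command,
        # so no key can be left behind without a TTL.
        commands.append(("SET", key, value, "EX", str(ttl_seconds)))
    commands.append(("EXEC",))
    return commands


if __name__ == "__main__":
    batch = {"key1:user:100": "value1", "key2:user:100": "value2"}
    for command in setex_many_commands(batch, 300):
        print(" ".join(command))
```

This still costs more protocol traffic than a single command, which is the overhead MSETEX is meant to remove.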

CLUSTERSCAN – Cluster-Wide Key Scanning

In clustered deployments (multiple Valkey nodes sharding data), the standard SCAN command only iterates over keys on the connected node. To look for keys across the cluster, you previously had to handle the routing yourself:

1. Ask the cluster for the slot mapping

2. Calculate the key’s hash slot (CRC16 modulo 16384) and look up which node owns it

3. Connect to that node

4. SCAN only on that node
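The slot calculation in step 2 can be sketched in a few lines. This assumes Python's binascii.crc_hqx with an initial value of 0, which computes the CRC16 variant used for cluster hash slots. The three-node slot map is a made-up example, not output from a real cluster.

```python
import binascii


def key_slot(key: str) -> int:
    """Hash slot for a key, honoring {hash tag} syntax."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % 16384


def owner(slot: int, slot_ranges: list) -> str:
    """Find the node whose inclusive slot range contains `slot`."""
    for first, last, node in slot_ranges:
        if first <= slot <= last:
            return node
    raise LookupError(f"no node owns slot {slot}")


# Hypothetical slot map for a three-node cluster.
EXAMPLE_RANGES = [
    (0, 5460, "node-a"),
    (5461, 10922, "node-b"),
    (10923, 16383, "node-c"),
]

if __name__ == "__main__":
    slot = key_slot("foo")
    print(slot, owner(slot, EXAMPLE_RANGES))
```

Keys that share a hash tag, such as {user:100}:profile and {user:100}:prefs, hash only the text inside the braces and therefore land in the same slot.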

CLUSTERSCAN automates this:

CLUSTERSCAN 0 MATCH user:* COUNT 100

Returns an iterator cursor plus results from all cluster nodes, handling redirection internally.

Use case: Operational debugging (“find all keys matching pattern across the entire cluster”), migration validation, or audit logging where you need a complete inventory.

Memory Usage and Rehashing Improvements

Valkey 9.1 reduces memory overhead in three specific scenarios:

Lower memory during hash operations: When hashes grow beyond the hash-max-listpack-value threshold (formerly hash-max-ziplist-value) and convert from the compact listpack encoding to a hash table, the temporary peak memory is reduced by ~30%

Faster rehashing performance: The incremental rehashing algorithm (which slowly migrates keys from old to new hash table) now processes more buckets per tick when the server is idle, completing rehashes faster and freeing memory sooner

Better fragmentation handling: INFO memory reports a more accurate mem_fragmentation_ratio when using jemalloc (Valkey’s default allocator on Linux)
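Incremental rehashing is easier to reason about with a toy model. The class below is an illustration only, not Valkey's data structure: it keeps an old and a new table side by side and migrates a configurable number of buckets per step, which is the knob the second bullet describes.

```python
class IncrementalTable:
    """Toy hash table that migrates a few buckets per step while resizing."""

    def __init__(self, size: int = 4):
        self.old = [[] for _ in range(size)]
        self.new = None      # set only while a rehash is in progress
        self.cursor = 0      # next bucket of `old` to migrate

    def _tables(self):
        return [t for t in (self.old, self.new) if t is not None]

    def get(self, key):
        # During a rehash a key may live in either table.
        for table in self._tables():
            for k, v in table[hash(key) % len(table)]:
                if k == key:
                    return v
        return None

    def set(self, key, value):
        for table in self._tables():
            bucket = table[hash(key) % len(table)]
            bucket[:] = [(k, v) for k, v in bucket if k != key]
        target = self.new if self.new is not None else self.old
        target[hash(key) % len(target)].append((key, value))

    def start_rehash(self):
        self.new = [[] for _ in range(len(self.old) * 2)]
        self.cursor = 0

    def rehash_step(self, buckets: int = 1) -> bool:
        """Migrate up to `buckets` buckets; return True once finished."""
        if self.new is None:
            return True
        for _ in range(buckets):
            if self.cursor == len(self.old):
                break
            for k, v in self.old[self.cursor]:
                self.new[hash(k) % len(self.new)].append((k, v))
            self.old[self.cursor] = []
            self.cursor += 1
        if self.cursor == len(self.old):
            self.old, self.new = self.new, None   # old table is released
            return True
        return False


def steps_to_finish(buckets_per_step: int) -> int:
    table = IncrementalTable(size=4)
    for i in range(20):
        table.set(i, i * i)
    table.start_rehash()
    steps = 0
    while True:
        steps += 1
        done = table.rehash_step(buckets_per_step)
        # Every key stays readable while the migration is in flight.
        assert all(table.get(i) == i * i for i in range(20))
        if done:
            return steps
```

Migrating more buckets per step shortens the window in which both tables are held in memory, at the cost of more work per tick.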

What this means for you: If you have workloads that constantly add and remove keys (session stores, real-time leaderboards, caches with high churn), you’ll see more predictable memory usage and fewer “latency spikes during rehash” incidents.

Common Mistakes to Avoid

1. Turning On All I/O Threads Blindly

Setting I/O threads to 16 on an 8-core machine creates context-switching overhead. Start with 2–4 threads on most production deployments and benchmark with your actual workload. Pipelined batch operations benefit less than many small, concurrent requests.

2. Using Database-Level ACLs as Your Only Isolation Mechanism

Numbered database access prevents a user from selecting another database, but commands like KEYS, FLUSHDB, and SCAN still see everything inside the databases that user is allowed to use. For true tenant isolation, combine database-level ACLs with key prefixes and key-pattern restrictions (the ~pattern rules in ACL SETUSER).
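The layered check can be modeled as two independent conditions that must both hold. This is a conceptual sketch with invented names; it uses Python's fnmatch as a stand-in for ACL glob matching and does not reproduce the server's actual rule evaluation.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class TenantRule:
    databases: frozenset   # database numbers the user may SELECT
    key_patterns: tuple    # glob patterns, like ~tenant42:* in an ACL rule


def is_allowed(rule: TenantRule, database: int, key: str) -> bool:
    """Both layers must pass: the database number and the key pattern."""
    in_database = database in rule.databases
    matches_key = any(fnmatchcase(key, p) for p in rule.key_patterns)
    return in_database and matches_key


# Hypothetical tenant confined to database 5 and its own key prefix.
EXAMPLE_RULE = TenantRule(frozenset({5}), ("tenant42:*",))

if __name__ == "__main__":
    print(is_allowed(EXAMPLE_RULE, 5, "tenant42:cart"))
    print(is_allowed(EXAMPLE_RULE, 5, "tenant7:cart"))
```

With only the database layer, the second call would succeed, which is exactly the gap the key pattern closes.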

3. Forgetting That MSETEX Is Atomic but Not Distributed

MSETEX sets multiple keys on the same Valkey node atomically. In a cluster, these keys must all belong to the same hash slot, which in practice means sharing a hash tag. Otherwise, you’ll get a CROSSSLOT error. Design your key naming around hash tags, for example {user:100}:profile and {user:100}:prefs, so that only the text inside the braces is hashed.

4. Assuming CLUSTERSCAN Is Free

Scanning an entire cluster generates load on every node. Use the COUNT parameter conservatively (100–1000). For production automation, maintain a separate key inventory rather than running CLUSTERSCAN every few seconds.
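One way to keep scan load bounded is a paced cursor loop. The sketch below assumes a client object with a redis-py style scan(cursor=..., match=..., count=...) method that returns a (cursor, keys) pair. _DemoNode is a hypothetical in-memory stand-in so the loop can be exercised without a server.

```python
import time
from typing import Iterator


def scan_node(client, pattern: str, count: int = 100,
              pause: float = 0.0) -> Iterator[str]:
    """Walk one node's keyspace in small pages, optionally pausing between pages."""
    cursor = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=count)
        yield from keys
        if cursor == 0:          # a zero cursor marks the end of the iteration
            break
        if pause:
            time.sleep(pause)    # give the server room between pages


class _DemoNode:
    """Hypothetical stub that serves two fixed pages of results."""

    def __init__(self):
        self.calls = 0
        self._pages = {0: (7, ["user:1"]), 7: (0, ["user:2", "user:3"])}

    def scan(self, cursor=0, match=None, count=None):
        self.calls += 1
        return self._pages[cursor]


def demo():
    node = _DemoNode()
    keys = list(scan_node(node, "user:*", count=100))
    return keys, node.calls


if __name__ == "__main__":
    print(demo())
```

A small COUNT with a short pause trades total scan time for a gentler load profile, which usually suits background audits better than a tight loop.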

FAQ

Is Valkey 9.1 a drop-in replacement for my existing Redis instance?

Yes, with very few caveats. Valkey maintains wire protocol compatibility, so Redis clients (redis-py, ioredis, StackExchange.Redis) connect without changes. Configuration options are largely the same. The main differences are new commands (which your client library may not support immediately) and modules: standard commands such as UNLINK are supported, but proprietary Redis modules are not. Test your application’s module dependencies before migrating.

How does Valkey 9.1 compare to Redis 7.4 on performance?

Early benchmarks show Valkey 9.1 outperforming Redis 7.4 on GET-heavy workloads by 15–30% due to I/O threading. On SET or mixed workloads, the difference is narrower (5–10%). The bigger differentiator is memory fragmentation—Valkey’s more aggressive rehashing shows lower long-term memory growth in write-heavy tests. As with any benchmark, test with your own data and access patterns.

When should I use HGETDEL vs. a Lua script?

HGETDEL is ideal for single-field operations. Use Lua scripts when you need to retrieve multiple fields, conditionally delete based on values, or perform operations across different data structures (e.g., get from hash, then increment a counter). HGETDEL gives you atomicity without script overhead.
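As an example of the conditional case, the sketch below deletes a field only when it holds an expected value. It assumes a client exposing a redis-py style eval(script, numkeys, *keys_and_args) method. The helper name and _RecordingClient stub are hypothetical and exist only to show the call shape.

```python
# Lua runs atomically on the server: read the field, delete it only on a match.
GET_AND_DELETE_IF = """
local current = redis.call('HGET', KEYS[1], ARGV[1])
if current == ARGV[2] then
  redis.call('HDEL', KEYS[1], ARGV[1])
  return current
end
return false
"""


def take_if_equals(client, key: str, field: str, expected: str):
    """Return the field's value and delete it, but only if it equals `expected`."""
    return client.eval(GET_AND_DELETE_IF, 1, key, field, expected)


class _RecordingClient:
    """Hypothetical stub that records how eval() was called."""

    def __init__(self):
        self.calls = []

    def eval(self, script, numkeys, *keys_and_args):
        self.calls.append((numkeys, keys_and_args))
        return None


def demo_call_shape():
    client = _RecordingClient()
    take_if_equals(client, "session:12345", "state", "pending")
    return client.calls


if __name__ == "__main__":
    print(demo_call_shape())
```

The key goes in KEYS and the field and expected value go in ARGV, which keeps the script compatible with cluster routing.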

How do I migrate from Redis to Valkey 9.1?

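Use valkey-cli --rdb <filename> to pull an RDB snapshot from your running Redis instance, then start Valkey 9.1 with that file as its data file. For a live migration with minimal downtime, configure Valkey as a replica of Redis (Valkey speaks the Redis replication protocol), wait for the initial sync to finish, then fail over by promoting Valkey and redirecting clients. Check version compatibility first: Valkey forked from Redis 7.2, so data files and replication streams from newer Redis releases may not load.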
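Before failing over, it helps to confirm the replica has actually caught up. The sketch below parses the text returned by INFO replication and checks two of its fields, master_link_status and master_sync_in_progress. The sample text is illustrative rather than captured from a real server, and the helper names are made up.

```python
def parse_info(text: str) -> dict:
    """Turn 'key:value' lines from INFO output into a dict, skipping section headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key] = value
    return fields


def replica_caught_up(info: dict) -> bool:
    """True when the link to the primary is up and no full sync is running."""
    return (
        info.get("master_link_status") == "up"
        and info.get("master_sync_in_progress") == "0"
    )


# Illustrative sample, not real server output.
SAMPLE = "# Replication\r\nrole:slave\r\nmaster_link_status:up\r\nmaster_sync_in_progress:0\r\n"
SYNCING = "# Replication\r\nrole:slave\r\nmaster_link_status:down\r\nmaster_sync_in_progress:1\r\n"

if __name__ == "__main__":
    print(replica_caught_up(parse_info(SAMPLE)))
```

In practice you would also compare replication offsets on both sides before redirecting writes. This check only covers link health.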

Conclusion

Valkey 9.1 isn’t a speculative rewrite—it’s a production-hardened fork that solves real operational problems. The I/O threading model alone makes it compelling for anyone running high-throughput caching layers. The security improvements around database-level ACLs finally make multi-tenant Valkey practical without container-per-tenant overhead.

Your next step: Download Valkey 9.1 from valkey.io and run a benchmark against your existing workload. Use valkey-benchmark -t get,set -n 1000000 -d 512 to measure baseline performance. Then enable I/O threads (io-threads 4 in valkey.conf) and run the same test. Compare tail latency (p99.9), not just averages.
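To compare tail latency rather than averages, you can compute percentiles over whatever per-request latencies your test harness records. The helper below is a minimal nearest-rank implementation, and the sample data is synthetic.

```python
import math


def percentile(samples: list, p: float) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # round() guards against float noise such as 999.0000000001 before ceil().
    rank = math.ceil(round(p * len(ordered) / 100, 9))
    return ordered[max(rank, 1) - 1]


def summarize(samples: list) -> dict:
    return {
        "mean": sum(samples) / len(samples),
        "p50": percentile(samples, 50),
        "p99.9": percentile(samples, 99.9),
    }


# Synthetic latencies in milliseconds: 990 fast requests and 10 slow ones.
SYNTHETIC_MS = [1.0] * 990 + [50.0] * 10

if __name__ == "__main__":
    print(summarize(SYNTHETIC_MS))
```

In this synthetic set the mean stays below 2 ms and the median is 1 ms, while p99.9 is 50 ms. That gap is what an average-only comparison hides.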

If you’re currently evaluating in-memory data stores for a new project, weigh Valkey, KeyDB, and native Redis against your consistency and performance requirements. For teams already running Valkey, the 9.1 upgrade is low-risk and high-reward, so schedule it for your next maintenance window.

The in-memory database landscape has matured. Valkey 9.1 proves that community-led, performance-first development can outpace vendor-backed roadmaps. Your infrastructure deserves the same pragmatism.

Visit Valkey's official site