Optimization Tips¶

Maximize rsylla performance with these optimization strategies.

Connection Optimization¶

Pool Size¶

# Tune based on workload
session = await (
    SessionBuilder()
    .pool_size(20)  # Per host
    .build()
)

Guidelines:

Start with 10-20 connections per host
Increase for high-throughput workloads
Monitor connection usage

Compression¶

session = await (
    SessionBuilder()
    .compression("lz4")  # Faster
    # .compression("snappy")  # Better ratio
    .build()
)

When to use:

Large payloads (>1KB)
Limited network bandwidth
Not for tiny operations (adds overhead)

TCP Nodelay¶

session = await (
    SessionBuilder()
    .tcp_nodelay(True)
    .build()
)

Disables Nagle's algorithm for lower latency.

Query Optimization¶

Use Prepared Statements¶

# Prepare once
stmt = await session.prepare("SELECT * FROM users WHERE id = ?")

# Execute many times
for user_id in user_ids:
    result = await session.execute_prepared(stmt, {"id": user_id})

Benefits:

Query parsed once
Reduced network overhead
~15-20% faster than unprepared

Appropriate Consistency¶

# Fast reads (eventual consistency)
query = Query("SELECT ...").with_consistency("LOCAL_ONE")

# Durable writes
query = Query("INSERT ...").with_consistency("LOCAL_QUORUM")

Guidelines:

Use LOCAL_ONE for non-critical reads
Use LOCAL_QUORUM for important writes
Avoid ALL unless necessary

Mark Idempotent Queries¶

stmt = await session.prepare("SELECT * FROM users WHERE id = ?")
stmt = stmt.set_idempotent(True)

Enables safe retries for transient failures.

Use Paging¶

query = Query("SELECT * FROM large_table").with_page_size(1000)
result = await session.query(query)

for row in result:  # Automatic paging
    process(row)

Guidelines:

Set page size based on row size
Smaller pages = more round trips
Larger pages = more memory

Data Model Optimization¶

Partition Design¶

# GOOD: Distribute data evenly
PRIMARY KEY ((user_id, date), event_time)

# BAD: Hot partitions
PRIMARY KEY (date, event_time)  # All data for one date in one partition

Clustering Order¶

# For queries that need latest first
WITH CLUSTERING ORDER BY (created_at DESC)

Use TTL¶

await session.execute(
    "INSERT INTO sessions (id, data) VALUES (?, ?) USING TTL ?",
    {"id": session_id, "data": data, "ttl": 3600}
)

Automatic cleanup of temporary data.

Batch Optimization¶

Batch Same Partition¶

# GOOD: Same partition
batch = Batch("logged")
batch.append_statement("INSERT INTO user_data (user_id, ...) VALUES (1, ...)")
batch.append_statement("INSERT INTO user_data (user_id, ...) VALUES (1, ...)")

# BAD: Different partitions
for user_id in range(1000):
    batch.append_statement("INSERT INTO users ...")

Use Unlogged Batches¶

# When atomicity not required
batch = Batch("unlogged")  # Faster

Keep Batches Small¶

Max 100 statements per batch
Large batches can timeout
Split into smaller batches if needed

Async Optimization¶

Concurrent Requests¶

import asyncio

async def fetch_users(session, user_ids):
    tasks = [
        session.execute("SELECT * FROM users WHERE id = ?", {"id": uid})
        for uid in user_ids
    ]
    return await asyncio.gather(*tasks)

Connection Reuse¶

# GOOD: Share session
class Database:
    _session = None

    @classmethod
    async def get_session(cls):
        if cls._session is None:
            cls._session = await Session.connect(["..."])
        return cls._session

Monitoring¶

Enable Tracing (Debug Only)¶

query = Query("SELECT ...").with_tracing(True)
result = await session.query(query)

if result.tracing_id():
    print(f"Trace: {result.tracing_id()}")

Check Warnings¶

result = await session.execute("...")
for warning in result.warnings():
    print(f"Warning: {warning}")

Anti-Patterns to Avoid¶

Don't Create Sessions Per Request¶

# BAD
async def get_user(user_id):
    session = await Session.connect([...])  # Slow!
    return await session.execute(...)

Don't Use ALLOW FILTERING¶

# BAD
await session.execute("SELECT * FROM users WHERE email = ? ALLOW FILTERING")

Don't Use SELECT COUNT(*)¶

# BAD
await session.execute("SELECT COUNT(*) FROM large_table")

Don't Use Large IN Clauses¶

# BAD
await session.execute("SELECT * FROM users WHERE id IN (1,2,3,...,1000)")

Performance Checklist¶

[ ] Use connection pooling
[ ] Enable compression for large data
[ ] Use prepared statements
[ ] Set appropriate consistency levels
[ ] Mark idempotent queries
[ ] Use paging for large results
[ ] Batch same-partition operations
[ ] Monitor latencies and errors