Programming guides for beginners...
Any comments are welcome...
I hope it helps! Thanks for dropping by...

Saturday, June 6, 2026

Your Smart TV Is a Node in the AI Scraping Economy


Disclosure: This post was researched and drafted with AI assistance. Primary source: buchodi / Include Security, The Smart TV in Your Living Room Is a Node in the AI Scraping Economy (June 5, 2026), cross-referenced against the Hacker News front-page discussion (85 points, 19 comments at time of writing). All claims, framework versions, endpoint hostnames, and per-country bandwidth tiers are taken directly from the buchodi write-up, which itself documents the reverse-engineering of a consent-installed partner app over 30 days. Analysis and framing are the author's.

The write-up of the week is buchodi's at Include Security: a forensic look at Bright Data's "consent SDK" for residential proxying, and an argument — backed by reverse-engineered binaries and 30 days of captured traffic — that the connected TV in your living room is the ideal exit node for the AI training data economy. The interesting part is not the SDK itself, but that the legal supply side of the residential-proxy market has been engineered to be invisible to the people whose homes it runs in. Most of the existing press is looking at the illegal supply side and missing it.

Why the TV, not the phone

The reason CTV (connected TV — any TV with a built-in internet connection and apps, including Roku, Apple TV, Fire TV, and smart TVs from Samsung, LG, etc.) matters more than the mobile phone — where the same SDK already lives in apps like EarnApp and XYO COIN — is form factor:

Factor | Mobile phone | Smart TV / CTV
Power | Battery most of the day | Always plugged in
Network | WiFi + cellular | Always WiFi, high-speed
Uptime | Intermittent | 24/7 in standby
Bandwidth ceiling | Low (cellular caps) | Effectively unlimited
User attention | Actively used | Often unattended
Corporate / family oversight | Higher (MDM, mobile EDR) | Virtually none

A phone hits 1% battery, gets locked, jumps networks, and has EDR (endpoint detection and response — software that monitors a device for suspicious behavior, common on corporate and BYOD phones) watching it. A TV in your guest room doesn't. Once the SDK is past its install screen, it owns a residential IP that is online every night while the user is asleep, on a fast unmetered connection, in a household that has no idea it's running.

How the SDK actually works

The protocol design is the part most people will find surprising, because the implementation choices are deliberately aimed at the mobile app-security tooling that would normally catch this kind of behavior.

The config endpoint is unauthenticated. On every launch, the SDK calls https://clientsdk.bright-sdk.com/sdk_config_ios.json?appid=<bundle>&ver=<sdk-version>&uuid=sdk-ios-<32hex>. The server only gates on appid (a bundle ID you can read off the App Store listing) and ver (an SDK version string). Pass any random UUID, get the same config a real device gets: feature flags, idle thresholds, country bandwidth caps, and the partner manifest.
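If you want to spot this config fetch in your own proxy or DNS-over-HTTPS logs, a small parser is enough. The sketch below only inspects a URL string and makes no network request. The hostname, path, and parameter names come from the URL quoted above; the helper name, the "/sdk_config_" path prefix match, and the case-insensitive hex check are assumptions made for illustration.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Hostname and parameter names are from the URL quoted in the post.
CONFIG_HOST = "clientsdk.bright-sdk.com"
# Assumption: the uuid is "sdk-ios-" followed by 32 hex digits, any case.
UUID_RE = re.compile(r"^sdk-ios-[0-9a-fA-F]{32}$")


def parse_config_fetch(url):
    """Return appid/ver/uuid if the URL looks like the SDK config fetch, else None.

    Hypothetical helper for log triage; it never contacts the server.
    """
    parts = urlsplit(url)
    # Assumption: other platforms would use a similar "/sdk_config_*" path.
    if parts.hostname != CONFIG_HOST or not parts.path.startswith("/sdk_config_"):
        return None
    query = parse_qs(parts.query)
    fields = {key: query.get(key, [""])[0] for key in ("appid", "ver", "uuid")}
    fields["uuid_well_formed"] = bool(UUID_RE.match(fields["uuid"]))
    return fields
```

Run it over the request URLs in a capture and anything that returns a non-None result is a device on your network asking for its relay config.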

The peer tunnel is a plain WebSocket. After config fetch, the SDK opens a persistent connection to wss://proxyjs.brdtnet.com:443. The TLS cert is CN=*.luminatinet.com, from Luminati, the corporate name Bright Data used before its 2021 rebrand. Active SDK infrastructure still runs on the legacy cert, which is a clean detection pivot: any *.luminatinet.com or *.brdtnet.com traffic on your network is specifically the peer-tunnel plane, not customer-side Bright Data usage.

No message signing, no client certificate, no device attestation. The server filters peers by IP reputation. The IPC envelope is plain JSON with commands like tunnel_init, cid_set, status_get, and cmd_tun. Once the device reports favorable idle state, the server pushes a cmd_tun frame, which the SDK executes as a real HTTP request against a third-party site, sourced from your residential IP.

The idle rules are not what you think they are

The config ships an explicit rulebook for when the device is eligible to relay someone else's traffic:

"idle_metrics": {
  "ignore_screen_on": true,
  "ignore_on_call": true,
  "max_bw_ratio": 1,
  "min_battery": 0.2,
  "wifi_on_battery": true,
  "min_battery_wifi": 0.2,
  "max_cpu_usage": 70,
  "max_mem_usage": 90,
  "mem_screen_off": true,
  "idle_timeout": 30,
  "not_idle_timeout": 10
}

The ignore_screen_on and ignore_on_call flags are the important ones. In the SDK's rulebook, "idle" means the device's CPU, memory, and battery are within thresholds — not that the user is away. A user actively on a phone call, reading the screen, counts as idle. So does a TV in the background during dinner.
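To make that concrete, here is a minimal sketch of how such a rulebook could be evaluated. The values are copied from the idle_metrics block; the evaluation logic, the DeviceState type, and the units (battery as a 0-1 fraction, CPU and memory as percentages) are my reading of the field names, not code recovered from the SDK.

```python
from dataclasses import dataclass

# Subset of the idle_metrics values shown in the config.
IDLE_METRICS = {
    "ignore_screen_on": True,
    "ignore_on_call": True,
    "min_battery": 0.2,
    "max_cpu_usage": 70,
    "max_mem_usage": 90,
}


@dataclass
class DeviceState:
    """Hypothetical snapshot of a device; units are assumptions."""
    screen_on: bool
    on_call: bool
    battery: float    # 0.0-1.0, assumed to match min_battery's scale
    cpu_usage: float  # percent
    mem_usage: float  # percent


def is_idle(state, rules=IDLE_METRICS):
    """Assumed reading: screen and call state only count when ignore_* is false."""
    if state.screen_on and not rules["ignore_screen_on"]:
        return False
    if state.on_call and not rules["ignore_on_call"]:
        return False
    return (
        state.battery >= rules["min_battery"]
        and state.cpu_usage <= rules["max_cpu_usage"]
        and state.mem_usage <= rules["max_mem_usage"]
    )


in_use = DeviceState(screen_on=True, on_call=True, battery=0.8, cpu_usage=35, mem_usage=60)
print(is_idle(in_use))  # True under this reading: an active call still counts as idle
```

Flip ignore_screen_on to False in a copy of the rules and the same device stops qualifying, which is the behavior most users would assume "idle" already means.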

"Consent" is a TV-remote problem

This is where most coverage is going to get the framing wrong. Petflix — a Roku app documented by The Verge and cited by buchodi as a representative consent-dialog example (not a partner-manifest entry) — has a consent screen that reads:

"To enjoy Petflix for free with fewer ads, you are allowing Bright Data to occasionally use your device's free resources and IP address to download public web data from the internet. Bright Data will only use your IP address for approved business-related use cases. None of your personal information is accessed or collected except your IP address. Period."

The word "occasionally" does a lot of work. The same SDK's publicly queryable config sets max_bw_monthly_wifi: 200,000,000,000 bytes — a 200 GB default monthly WiFi budget. Privacy-policy disclosure on a TV navigated by arrow keys is the wrong control surface.
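For a sense of scale, the monthly budget can be turned into a daily volume and a sustained rate. This is plain arithmetic on the quoted figure, assuming a 30-day month and decimal units.

```python
MONTHLY_BUDGET_BYTES = 200_000_000_000  # max_bw_monthly_wifi from the config
DAYS_PER_MONTH = 30                     # assumption for the estimate
SECONDS_PER_DAY = 86_400

per_day_gb = MONTHLY_BUDGET_BYTES / DAYS_PER_MONTH / 1e9
sustained_mbit = MONTHLY_BUDGET_BYTES * 8 / (DAYS_PER_MONTH * SECONDS_PER_DAY) / 1e6

print(f"{per_day_gb:.2f} GB/day, {sustained_mbit:.2f} Mbit/s sustained")
# 6.67 GB/day, 0.62 Mbit/s sustained
```

A budget is a ceiling, not measured usage, but a ceiling of several gigabytes a day is hard to square with "occasionally."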

The VPN bypass is the actual problem for security teams

The single technical finding that should change how enterprise security teams think about this SDK is the use_netifs flag, which triggers code in the binary that constructs its NWConnection with a specific requiredInterface, en0 (WiFi) or pdp_ip0 (cellular), rather than the system default route. On iOS, this bypasses any configured VPN's tun0 (the virtual network interface a VPN creates on the device) entirely. The peer tunnel does not cross a user-configured VPN, even when the rest of the app's HTTPS traffic does.

Buchodi verified this empirically with transparent TLS interception: every HTTPS call the SDK made was captured except the peer tunnel to proxyjs.brdtnet.com:443, despite port 443 being explicitly redirected to the inspector.

The SDK uses two independent inspection bypasses, one per plane:

  • Control plane (config fetch, telemetry): built on CFHTTPMessage primitives rather than URLSession. This defeats URLSession-level instrumentation (swizzling, network extensions, URLProtocol subclasses) commonly used in mobile app-security tooling.
  • Data plane (peer tunnel): built on NWConnection with requiredInterface set to the physical interface. This is what defeats VPNs and ensures the scraping is executed from a residential IP.

Both choices are legitimate Apple APIs. The combination is the interesting artifact: the data plane is invisible to VPN-based inspection and the control plane is invisible to URLSession-based hooks. Researchers who rely on either single technique see only half the SDK's behavior. For enterprise security teams running MDM (mobile device management — software that lets an organization enforce policy on phones and tablets, typically installed on company-issued or BYOD devices), corporate-VPN traffic inspection, or home-router parental controls: the most sensitive channel this SDK operates is designed to go around your visibility layer.

The original take: legal ≠ invisible

The wider story this drops into is the AI training data economy. Cloudflare's pay-per-crawl program, the Gemma 4 multimodal encoder consolidation we covered a few days ago, the rise of rate-limited retrieval-augmented agents — all of this is downstream of an LLM training pipeline that depends on scraping data that increasingly has owners who would prefer not to give it up. Residential proxies are how scrapers route around that resistance. They are the load-bearing infrastructure of the post-Cloudflare web.

Most of the press on residential proxies has focused on the illegal supply side: botnets like Aisuru and Kimwolf, trojanized apps like the HUMAN Security PROXYLIB disclosure, pre-infected IoT hardware in the Google/Mandiant IPIDEA takedown. The FBI issued a formal advisory earlier this year. These are the bad actors. They are also the ones that get reported on, because they have obvious victims and obvious villains.

Bright Data is the legal supply side. The SDK ships as a documented commercial product. The "consent" comes from a publisher that put it in their app's EULA. The user is told the device is being monetized, in language designed to be skimmed past on a TV. The scraping jobs that go through the network are bound to be "approved business-related use cases" because Bright Data is also the customer side and gets to define what that means.

What this changes is the defensive posture itself: the press, the takedowns, the FBI advisories have implicitly assumed the supply side is a thing that gets installed on a victim's device by an adversary, not a thing the victim consented to. The defensive posture does not currently distinguish between a TV that has been rooted by a botnet herder and a TV that has been enrolled in a "free ad-supported app." From the perspective of network telemetry, both are the same: an iOS device on a residential IP, opening a long-lived WebSocket to proxyjs.brdtnet.com, executing inbound HTTP jobs. The detection signal is the same. The remediation story is harder.

What this means for you

Home / small business / school network you control — the buchodi write-up gives you five DNS hostnames to block at the router. They will not affect any customer who legitimately uses Bright Data's customer-facing proxy service on a different domain.

# Block at your router's DNS — Pi-hole, NextDNS, Cloudflare Gateway, OpenWrt+dnsmasq, etc.
proxyjs.brdtnet.com
proxyjs.luminatinet.com
proxyjs.bright-sdk.com
clientsdk.bright-sdk.com
clientsdk.brdtnet.com

For deeper inspection: TLS SNI (Server Name Indication — the unencrypted hostname field in a TLS handshake, readable at the network boundary without decrypting the traffic) filtering on *.brdtnet.com, *.luminatinet.com, *.luminati.io works at the network boundary without TLS interception. The *.brdtnet.com and *.luminatinet.com TLS certificate fingerprints are stable until the next Sectigo rotation (current certs valid through mid-2026, per the write-up).
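If you already export DNS or SNI hostnames from your network, checking them against these lists takes a few lines. The sketch below uses the five hostnames and three wildcard suffixes named in this post; the function names are hypothetical, and it assumes you can supply bare hostnames, one per entry.

```python
# Exact hostnames and wildcard suffixes as listed in this post.
BLOCKED_HOSTS = {
    "proxyjs.brdtnet.com",
    "proxyjs.luminatinet.com",
    "proxyjs.bright-sdk.com",
    "clientsdk.bright-sdk.com",
    "clientsdk.brdtnet.com",
}
BLOCKED_SUFFIXES = ("brdtnet.com", "luminatinet.com", "luminati.io")


def is_flagged(hostname):
    """True if the hostname is on the exact list or under a wildcard suffix."""
    host = hostname.strip().rstrip(".").lower()
    if host in BLOCKED_HOSTS:
        return True
    # "*.example.com" matches subdomains only, so require a leading dot.
    return any(host.endswith("." + suffix) for suffix in BLOCKED_SUFFIXES)


def flagged(hostnames):
    """Filter an iterable of hostnames down to the ones worth a closer look."""
    return [h for h in hostnames if is_flagged(h)]
```

Matching on a leading dot keeps lookalikes such as notbrdtnet.com out of the results, and bright-sdk.com is matched only by its two exact hostnames because the post does not list it as a wildcard.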

Corporate security stack relying on VPN-based traffic inspection or MDM with URLSession-level instrumentation — the use_netifs + CFHTTPMessage combination is built to defeat both. Add a host-based or app-store binary check for the Swift symbols BrdWebSocketFacade and BrdNetwork.DNSResolver to your managed-fleet scanning.
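A first-pass binary check can be as simple as a byte search. Swift symbols are mangled inside compiled binaries, so the sketch below looks for the bare identifiers rather than the dotted name; that choice, the marker list, and the helper names are assumptions, and a hit is a reason to inspect further rather than proof.

```python
from pathlib import Path

# Bare identifiers drawn from the symbols named above. DNSResolver is left out
# on its own because it is generic enough to cause false positives (assumption).
MARKERS = (b"BrdWebSocketFacade", b"BrdNetwork")


def find_markers(data):
    """Return the marker names found in a blob of binary data."""
    return [marker.decode() for marker in MARKERS if marker in data]


def scan_file(path):
    """Read a binary from disk and report which markers it contains."""
    return find_markers(Path(path).read_bytes())
```

Point scan_file at the main executable inside an unpacked app bundle; an empty list means neither identifier appears in that file.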

If you build consumer apps or CTV platforms — the most uncomfortable finding is the per-country bandwidth tier table, which suggests deliberate market segmentation:

Country | Min battery to relay | Daily cap | Monthly cap
Uzbekistan | 1% | 1 GB | 30 GB
Oman | 1% | 1 GB | 30 GB
Qatar | 20% | 40 MB | 250 MB
UAE | 20% | 40 MB | 250 MB
Default (worldwide) | 20% | 50 MB | 500 MB

Uzbekistan and Oman devices are permitted to relay down to 1% battery, with daily caps 20× the default and monthly caps 60× the default. The default-worldwide allowance still permits 500 MB of someone else's traffic per month over the user's home internet. There is a market design choice being made here that the consumer-facing copy does not describe.
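The multipliers are easy to check from the table. The snippet below recomputes each tier's caps relative to the default, assuming decimal units (1 GB = 1000 MB).

```python
MB, GB = 10**6, 10**9  # decimal units assumed

# (daily cap, monthly cap) in bytes, from the table above.
caps = {
    "Uzbekistan": (1 * GB, 30 * GB),
    "Oman": (1 * GB, 30 * GB),
    "Qatar": (40 * MB, 250 * MB),
    "UAE": (40 * MB, 250 * MB),
    "Default": (50 * MB, 500 * MB),
}

default_daily, default_monthly = caps["Default"]
ratios = {
    country: (daily / default_daily, monthly / default_monthly)
    for country, (daily, monthly) in caps.items()
}

for country, (daily_x, monthly_x) in ratios.items():
    print(f"{country}: daily {daily_x:g}x, monthly {monthly_x:g}x")
```

The same table also shows the segmentation running the other way: Qatar and UAE sit below the default, at 0.8x daily and 0.5x monthly.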

What to do this week

The 30-day experiment in the buchodi write-up is reproducible without any special tooling. On a spare iOS device with mitmproxy and a partner app installed (XYO COIN is publicly named in the research), you can capture the same clientsdk.bright-sdk.com config fetch, the same wss://proxyjs.brdtnet.com:443 upgrade, and the same JSON envelopes — ipc_call with cmd=tunnel_init / cmd=cid_set. You will also see, in your own network logs, that the tunnel does not cross the iOS device's VPN if you have one configured. That is the part that is hard to argue with.

The bigger question — whether the consent-dialog model for residential-proxy enrollment survives the moment a regulator or a major platform holder decides to look at the SDK's actual config vs. its marketing copy — is one this post is not going to answer. But the buchodi write-up is now the public artifact that lets the question be asked in concrete terms, and that is the part that is going to matter.


Related on the blog: Cloudflare Just Bought the Build Tool That Runs the Web (the upstream half of the scraping-detection story), Redis 8.8: Your Lua Rate Limiter Is Now Obsolete (where rate-limited scrape traffic ends up), and Gemma 4 12B Just Killed the Multimodal Encoder (where the scraped data is going).

Key terms used in this post: CTV = connected TV (a TV with built-in internet and apps, including Roku, Apple TV, Fire TV, and most smart TVs); MDM = mobile device management (software that lets an organization enforce policy on phones and tablets, common on company-issued and BYOD devices); EDR = endpoint detection and response (software that monitors a device for suspicious behavior, common on corporate endpoints); SNI = Server Name Indication (the unencrypted hostname field in a TLS handshake, visible at the network boundary without decryption); tun0 = the virtual network interface a VPN creates on a device, which most traffic-inspection tools rely on for visibility.

Microsoft Just Put a Workflow Engine Inside Postgres


Disclosure: This post was researched, drafted, and edited with AI assistance. Microsoft's pg_durable GitHub repository and README were the primary source; the HN announcement thread (281 points, 72 comments at time of writing) was the secondary source. Opinions, framing, and analysis are the author's.

Microsoft open-sourced pg_durable on June 5th and most coverage will focus on the SQL DSL, the ~> and |=> operators, and the question of whether writing workflows as SQL strings is a good idea. That's the wrong story. The real story is that the author of pg_durable is the same person who built the orchestration layer for Durable Task Framework — the framework that has been running Microsoft-internal workflows and Azure Durable Entities for close to a decade — and the team is now putting that capability inside Postgres. If you've ever told someone "we need a workflow engine for this," and the answer was Temporal, or Airflow, or Step Functions, that answer just got weaker.

What pg_durable actually does

A pg_durable function is a graph of SQL steps that Postgres executes and checkpoints as it goes. If the database crashes, restarts, or a step fails, execution resumes from the last durable checkpoint instead of forcing you to reconstruct state by hand. You start one with a one-liner:

SELECT df.start(
  'SELECT id FROM documents WHERE processed = false LIMIT 100' |=>
  'batch' ~>
  'UPDATE documents SET processed = true WHERE id = ANY($batch)'
);

The runtime checkpoints between steps, so a restart in the middle of a long job doesn't rerun work that already succeeded. Status and results are queryable from standard Postgres tables (the README points to df.instances) — same auth model, same backup model, same observability tooling. There is no Redis, no Temporal cluster, no separate queue service. It installs as a PostgreSQL extension and ships as a Debian package for PG 17 and 18 on amd64.
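The checkpoint-and-resume idea is easier to see in miniature. The sketch below is a toy model in plain Python, not pg_durable's or duroxide's implementation: a dict stands in for durable storage, and a deliberately raised exception stands in for a crash. All names are hypothetical.

```python
class Crash(Exception):
    """Stands in for a database crash or restart."""


def run_durable(steps, checkpoints, crash_before=None):
    """Run steps in order, skipping any whose result is already checkpointed.

    steps: list of (name, function); each function receives the results so far.
    checkpoints: dict that survives "restarts" (a stand-in for durable storage).
    """
    for name, fn in steps:
        if name in checkpoints:
            continue  # finished before the crash, do not re-run
        if name == crash_before:
            raise Crash(name)
        checkpoints[name] = fn(checkpoints)  # record the result before moving on
    return checkpoints


calls = []


def select_batch(done):
    calls.append("select")
    return [1, 2, 3]


def mark_processed(done):
    calls.append("update")
    return len(done["batch"])


steps = [("batch", select_batch), ("mark", mark_processed)]
store = {}
try:
    run_durable(steps, store, crash_before="mark")  # crash after the first step
except Crash:
    pass
run_durable(steps, store)  # resume: "batch" is read from the checkpoint
print(calls)  # ['select', 'update']
```

The SELECT ran once even though the workflow was started twice. A real runtime also has to make the step and its checkpoint atomic, which is where living inside the database helps.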

Under the hood, pg_durable is built on duroxide, a Rust-based durable execution runtime that handles deterministic replay, checkpoints, sub-orchestrations, and timers. pg_durable is the Postgres-flavored wrapper (PostgreSQL License); duroxide is the engine (MIT). The two components carry different licenses.

The "Postgres is enough" thesis just got real

There's been a persistent argument in the Postgres community for years — most visibly at postgresisenough.dev — that you can replace a lot of operational machinery with Postgres if you reach for the right extensions. pg_durable is the most ambitious version of that argument yet: it claims that durable execution, the thing that has historically required a separate orchestrator like Temporal, is just another primitive the database should provide.

The README's own list of "what you're probably doing today" makes the displacement target explicit:

  • pg_cron plus a jobs table, status columns, retry counters, and a polling worker
  • An external orchestrator (Airflow, Temporal, Step Functions, Argo) calling back into Postgres
  • A queue plus workers plus a separate state table to coordinate retries
  • A plpgsql procedure that works until a crash or long-running transaction forces you to start over

That's the menu. If pg_durable works as advertised, several of those menu items become the same thing, and the "we need Temporal for this" justification gets harder to make.

The maintainer of postgresisenough.dev is already asking for a PR to add pg_durable to the site. That's the tell — the people who've been arguing "Postgres is enough" see this as a real entry in the catalog, not a marketing stunt.

The Microsoft stake is bigger than it looks

Two things are easy to miss. First, the lead committer, affandar, is also the author of Durable Task Framework, the orchestration library that has powered Azure Durable Functions and Durable Entities. This isn't a new team learning the durable-execution category. It's the same team shipping their next move in the open.

Second, the same repo's documentation points at Azure HorizonDB, Microsoft's new PostgreSQL cloud service, as the place to try pg_durable — and notes that it's "engineered for performance and built with pg_durable inside." This isn't a one-off OSS contribution. It's a positioning move. Microsoft is betting that the database is the right substrate for workflow orchestration, and the database they want to bet on is Postgres, not a proprietary service they control end-to-end. That tells you something about where they think the leverage is.

The honest counterargument: the SQL DSL is awkward

The most consistent pushback in the HN thread is that the workflow syntax is hard to read. One commenter, looking at the README example, called it "bizarre." Another pointed out that embedding SQL strings inside other SQL strings — which is what the df.start(...) syntax essentially is — is a maintainability hazard waiting to happen.

Both criticisms are fair, and the maintainers know it. gdecandia, a contributor, said: "Agree that the DSL ergonomics can be improved. Our pipelines use a higher level language and therefore simplified, but pg_durable is meant to solve a wider array of problems. We're happy to take suggestions for improvements." A committer also noted that the state-provider layer is an extensibility point — they're open to alternative backends like a pgmq-based state provider, rather than the default PostgreSQL one.

The DSL awkwardness is the price you pay for putting workflows inside a SQL-shaped runtime. The tradeoff is real: pure SQL workflows are more constrained than Temporal's TypeScript SDK, but they force the architecture into a shape that survives database restarts, which is the whole point. If you've been writing Temporal workflows in TypeScript and never worrying about the underlying state store, you may not feel the pain pg_durable is solving. If you've been writing plpgsql procedures and losing work to transaction timeouts, you will.

What it's not

  • It's not a replacement for Airflow if your workflows fan out across heterogeneous systems (S3 + Spark + Slack + a database). The README explicitly says: "if the workflow mostly lives outside Postgres and spans many heterogeneous systems," reach for a general-purpose orchestrator.
  • It's not a sub-millisecond request handler. It's for durable background work, not synchronous request paths.
  • It's not available everywhere. The first-class deployment is Azure HorizonDB. If you're on AWS RDS, Aurora, Supabase, or Neon, you'll need to install the extension yourself and check whether your provider's PG build allows it.
  • It's not the first durable-execution project on Postgres. pg-boss, pg-workflows, and several others have been filling this niche for years. pg_durable is the most ambitious and the first with a major-vendor seal.

The performance and architecture story that's still developing

The README lists workloads (vector embedding pipelines, ingest pipelines, scheduled maintenance, fan-out aggregation, external API workflows) but doesn't publish benchmark numbers as of the v0.2.2 release. That's reasonable for an early OSS drop, but it means the "is this faster than my current setup" question is one you'll have to answer with your own load tests. The engine is Rust (duroxide) and the integration is in-PG, so there's no obvious reason it should be slow — but the early numbers will tell.

The architectural claim most worth testing is the parallel-fanout story. The README says pg_durable supports "fan-out aggregation: run independent queries in parallel, then join the results." If this works inside a single Postgres connection without an external worker pool, it's a real differentiator from the queue-plus-workers pattern.

The original take: the orchestrator is being absorbed into the database

pg_durable doesn't beat Temporal feature-for-feature today: Temporal has versioning, signals, queries, and a TypeScript SDK that a generation of developers has already learned, and pg_durable's SQL surface offers no equivalent for those yet (sub-orchestrations are the exception, since duroxide handles them). The interesting question is what happens if a category of workflow tools gets pulled into the database itself over the next three to five years. Microsoft shipping pg_durable as a PG extension, embedded in their new cloud Postgres, is a strong signal that the answer to "where does the orchestrator live?" is shifting from "separate service" back to "the database." If this pattern holds, expect to see competing extensions in MySQL, MariaDB, and DuckDB within 24 months. The durable-execution category as a standalone product category gets thinner with each one.

The counter-trend is the continued rise of general-purpose orchestrators with mature SDKs (Temporal, Restate, Inngest) and the assumption that workflows will increasingly be written in application code, not SQL. If you're betting on that future, pg_durable is a 2026 data point, not a trend reversal. If you're betting on the database-absorbs-orchestration future, this is the most significant open-source release of the year so far.

What to do this week

-- Check what your current workflow stack actually is
SELECT count(*) FROM information_schema.tables
WHERE table_name IN ('jobs', 'job_runs', 'workflow_state', 'scheduled_tasks');
-- If you have more than 2 of these, you have a homegrown orchestrator.

-- Look at what extensions your Postgres allows
SELECT name, default_version, installed_version
FROM pg_available_extensions
WHERE name IN ('pg_cron', 'pg_durable', 'pgmq');
-- If pg_durable shows up with a version, your provider has built it in.
-- If it doesn't, ask them when it will.

If you have a Temporal deployment that's mostly doing "fetch some rows, update some rows, wait, update some more rows" — that's exactly the workload pg_durable is for, and it's worth a one-week prototype to see if you can drop the orchestrator from your architecture diagram.

If you're on Azure and you've been waiting for "modern" Postgres features to land on Azure, the HN commenter who said "I'm trapped on Azure" is the user you should be listening to. Azure HorizonDB is the response to that complaint, and pg_durable is one of the first things it ships with.

If you're a maintainer of an existing pg-boss or pg-workflows-style project: now is the time to make sure your README has a "how this compares to pg_durable" section. The displacement question is going to come up in every HN thread for the next quarter.

What this means for you

The story of pg_durable is that the most valuable open-source workflow orchestration capability — the kind that was, until now, the reason to deploy a separate service — is now an install command away from every team that already runs Postgres. The deployment cost of "I need durable execution" just went from "spin up a cluster" to "apt install pg-durable-postgresql-17." That's the same kind of leverage shift that Redis 8.8's array data type brought to in-memory data structures, and the same pattern Cloudflare applied in acquiring VoidZero — own the substrate, and the layers above it become someone else's problem to defend. (For more on what "owning the substrate" looks like on the model side, see how Gemma 4 12B dropped the multimodal encoder — different substrate, same play.)

The next time someone tells you "we need Temporal for this," the better question is: do you need a workflow engine, or do you need Postgres to remember what it was doing?

Friday, June 5, 2026

Redis 8.8: Your Lua Rate Limiter Is Now Obsolete


Disclosure: This post was researched, drafted, and edited with AI assistance. Redis's official announcement was the primary source; benchmark numbers and feature claims were verified against the markdown source of their post. Opinions, framing, and analysis are the author's.

Redis 8.8 shipped on June 2nd with six new features, and most coverage will lead with the array data type. That's a mistake. The real story is that Redis has quietly crossed the line from "in-memory data structure server" into "a different kind of database," and two of these features do most of the work to get it there.

The new array data type (and why it isn't the real story)

The new array data type is going to get most of the attention. It's an index-addressable, dynamic, sparse-friendly container that supports server-side SUM, MIN, MAX aggregations over index ranges and can act as a ring buffer with a single command (ARRING). For random-element access at 100K elements with 1KB values, the benchmarks show arrays running 5x faster than lists and 8–15% faster than hashes. For ring-buffer operations, ARRING is twice the throughput of the RPUSH+LTRIM idiom everyone has been using for years.

That's all real and worth knowing about. But the data type is the easy part. The hard part is the implicit claim embedded in the design: that the right place to do sliding-window aggregations, log-line searches, and sensor-data sum/min/max is inside Redis, not in your application code. That's a much bigger architectural shift than a new container.

The story nobody's writing: INCREX ends a decade of Lua

If you've built a production rate limiter in Redis at any point in the last eight years, you wrote a Lua script. Some combination of INCR, EXPIRE, conditional logic, maybe a sliding window via a sorted set, and a Lua wrapper to keep the whole thing atomic. It's the kind of code you copy from a 2014 blog post and never look at again.

Redis 8.8 introduces INCREX, a new generalized INCR-family command that does this natively:

INCREX key
       [<BYFLOAT|BYINT> increment]
       [LBOUND lowerbound] [UBOUND upperbound] [SATURATE]
       [EX sec | PX msec | EXAT unix-time-sec | PXAT unix-time-msec | PERSIST]
       [ENX]

Three things make this more than just "another increment command." First, it returns both the new counter value and the actual increment applied, so the caller knows immediately whether the request was allowed or rejected. Second, the ENX flag sets the expiration only if no expiration is already set, which means a window's TTL is anchored to its first request and not silently reset by every later call — a subtle bug that has bitten a lot of production rate limiters. Third, the SATURATE flag with UBOUND lets you clamp the counter at the limit rather than reject, which is the difference between a strict rate limiter and a graceful one.

If you maintain a Redis-backed rate limiter in production: your Lua script is now a one-liner. The pattern is no longer worth its complexity.
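To see why those three properties matter together, here is a pure-Python model of the behavior as described above. It is not a Redis client and does not claim to match the server's edge cases: the class, its method signature, and the choice to still arm the expiry on a rejected request are assumptions for illustration.

```python
class Counter:
    """Toy model of a bounded, expiring counter; not a Redis client."""

    def __init__(self):
        self.value = 0
        self.expires_at = None  # absolute time in seconds, or None

    def increx(self, now, by=1, ubound=None, saturate=False, ex=None, enx=False):
        """Return (new value, increment actually applied)."""
        if self.expires_at is not None and now >= self.expires_at:
            self.value, self.expires_at = 0, None  # window expired, start over
        target = self.value + by
        if ubound is not None and target > ubound:
            # SATURATE clamps at the bound; otherwise the increment is rejected.
            target = ubound if saturate else self.value
        applied = target - self.value
        self.value = target
        # ENX: only set an expiry when none exists, so the window stays
        # anchored to its first request.
        if ex is not None and not (enx and self.expires_at is not None):
            self.expires_at = now + ex
        return self.value, applied


limiter = Counter()
results = [limiter.increx(now=t, ubound=3, ex=60, enx=True) for t in (0, 10, 20, 30)]
print(results)             # [(1, 1), (2, 1), (3, 1), (3, 0)]
print(limiter.expires_at)  # 60
```

The fourth call returns an applied increment of 0, which is the rejection signal, and the expiry is still 60 rather than 90. Drop enx=True and every call pushes the window out, which is the bug described above.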

The "real" message queue story: XNACK

For two years the most-cited reason not to use Redis Streams as a serious message queue was the failure-recovery story. A consumer that couldn't process a message had two options: ACK it (lying about success) or leave it pending and wait for XAUTOCLAIM to redistribute it after the idle timeout. For anything latency-sensitive, the second option was a non-starter.

Redis 8.8 adds XNACK, a real negative-ack command with three modes designed for three failure patterns:

  • SILENT — failure was unrelated to the message (consumer shutting down, transient network error). The delivery counter is decremented, undoing the original increment. The message becomes immediately available to other consumers.
  • FAIL — message is too expensive for this consumer but might succeed elsewhere. Delivery counter stays incremented; the message returns to the head of the queue.
  • FATAL — poison message, malformed, or potentially malicious. Delivery counter is set to LLONG_MAX, making it easy to detect and route to a dead-letter queue downstream.

This is the missing piece. It transforms Redis Streams from "queue-ish, with caveats" into "queue, full stop," because the failure-handling primitives now match what RabbitMQ or Kafka consumers take for granted. If you were weighing Redis Streams against a heavier queue service for a new project, that calculation just changed.
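The three modes boil down to what happens to the delivery counter. The sketch below models only that bookkeeping, following the descriptions above; the function names are hypothetical, the floor at zero for SILENT is an assumption, and nothing here talks to Redis.

```python
LLONG_MAX = 2**63 - 1  # largest signed 64-bit integer


def delivery_count_after_nack(count, mode):
    """Model the delivery counter after a negative ack in the given mode."""
    if mode == "SILENT":
        return max(count - 1, 0)  # undo the increment (floor at 0 is assumed)
    if mode == "FAIL":
        return count              # the failed attempt still counts
    if mode == "FATAL":
        return LLONG_MAX          # sentinel that marks a poison message
    raise ValueError(f"unknown mode: {mode}")


def is_poison(count):
    """A downstream consumer can route these to a dead-letter queue."""
    return count == LLONG_MAX
```

A dead-letter router then needs no side table: it checks is_poison on the delivery count and moves the message.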

What the new array type is actually for

Two concrete things you can build with arrays + streams + 8.8 features:

  1. A self-hosted log aggregator. Arrays hold the last N lines per service, server-side SUM/MIN/MAX covers count-by-severity and min/max range queries (percentiles still need client-side work), and XNACK FATAL marks lines that crash the parser so they can be routed to a dead-letter queue. No Elasticsearch, no ClickHouse, no managed SaaS, and the same Redis instance you already operate for caching carries the workload.
  2. A sensor pipeline ingest layer. Array-as-ring-buffer holds the last 60 seconds of readings, SUM/MIN/MAX over an index range gives you windowed stats without bolting on a separate TSDB. Useful for the "alert me when max latency in the last 30 seconds crosses X" pattern that currently needs Prometheus or InfluxDB.

This is what I mean by "a different kind of database." Redis used to be a cache you put in front of your real database. With 8.8, you can plausibly make it the system of record for narrow, time-bounded use cases where you used to reach for something heavier.
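The sensor-pipeline pattern is simple enough to model locally before committing to it. This is a plain-Python stand-in built on collections.deque, not the Redis array commands; the class and method names are hypothetical.

```python
from collections import deque


class RingWindow:
    """Fixed-capacity buffer with windowed sum/min/max over the newest items."""

    def __init__(self, capacity):
        self.buf = deque(maxlen=capacity)  # oldest readings fall off automatically

    def push(self, value):
        self.buf.append(value)

    def stats(self, last_n):
        """Aggregate the most recent last_n readings."""
        if last_n < 1 or not self.buf:
            raise ValueError("need a non-empty buffer and last_n >= 1")
        window = list(self.buf)[-last_n:]
        return {"sum": sum(window), "min": min(window), "max": max(window)}


ring = RingWindow(capacity=5)
for reading in range(1, 8):  # 1..7; only 3..7 remain in the buffer
    ring.push(reading)
print(ring.stats(last_n=3))  # {'sum': 18, 'min': 5, 'max': 7}
```

The point of doing this server-side is that the window and the aggregation live next to the data, so the client sends a range and receives three numbers instead of pulling every reading.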

The performance numbers worth quoting

Beyond features, the 8.8 release is also a serious performance update. From the official benchmarks:

  • MGET pipelined with I/O-threads: up to 68% throughput improvement
  • XREADGROUP with COUNT 100: up to 83% improvement
  • ZADD/ZINCRBY/ZRANGEBYSCORE (sorted set operations): up to 74% improvement
  • Persistence and full synchronization: up to 60% faster
  • JSON numeric arrays (introduced in 8.4): up to 92% memory reduction, with new explicit control over BF16/FP16/FP32/FP64 storage for vector indexing needs

That last one is the AI angle nobody is connecting yet. Vector storage in Redis is now substantially cheaper than the marketing typically suggests, and the new precision control means you can store embeddings in the exact format your model expects — no casting, no precision loss, no awkward BF16 conversion layer. (For more on the model-side tradeoff, see how Gemma 4 12B dropped the multimodal encoder for the parallel argument that unified token spaces simplify AI plumbing.)
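Precision choice is mostly a storage-cost decision, and the raw numbers are quick to estimate. The element sizes below are the standard widths of each format; the one-million-vector, 768-dimension workload is a made-up example, and the figures cover raw payload only, not Redis's own per-key or index overhead.

```python
# Bytes per element for each storage format.
BYTES_PER_VALUE = {"BF16": 2, "FP16": 2, "FP32": 4, "FP64": 8}


def raw_vector_bytes(num_vectors, dims, precision):
    """Raw payload size of an embedding set, ignoring any storage overhead."""
    return num_vectors * dims * BYTES_PER_VALUE[precision]


# Hypothetical workload: one million 768-dimensional embeddings.
for precision in BYTES_PER_VALUE:
    size_gb = raw_vector_bytes(1_000_000, 768, precision) / 1e9
    print(f"{precision}: {size_gb:.2f} GB")
```

Going from FP32 to a 16-bit format halves the payload before any server-side optimization applies, which is why explicit control over the format matters.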

The meta-story: how the maintainers actually built it

There's been discussion on Hacker News (the announcement thread, 78 points at time of writing) about whether the array data type was implemented with LLM assistance. I won't make stronger claims about that than the public record supports — the announcement credits @antirez as the author, and the deeper "how it was built" question is best answered by reading the maintainer's own posts rather than by an outside observer guessing. Worth noting for context, but take second-hand claims with salt.

What's clear from the announcement itself is that the Redis project shipped a substantial new feature, benchmarked it, documented it, and put it in a numbered release. The takeaway for engineering managers who are still working out their AI policy isn't "use AI to write your database" — it's that AI is a tool, the verification step is the work, and a maintainer with a real test suite and benchmark suite can ship a major feature in a way that's documented and reproducible.

The trade-offs you should know about

  • Arrays are not free. They use about 18% more memory per element than a list. If your bottleneck is memory, not CPU, a list might still be the right choice. The benchmarks measure throughput, not footprint.
  • The new features are open-source-only. Redis 8.8 is the open-source release; managed Redis services (AWS ElastiCache, Azure Cache, Redis Cloud) will roll out these features on their own timelines. If you depend on a managed service, check the roadmap before planning around INCREX or XNACK.
  • The 92% JSON numeric array reduction is for a specific workload (homogeneous numeric arrays, especially vector embeddings). It's not a general-purpose JSON storage improvement.
  • The announcement thread on Hacker News was solid, not viral (78 points, 33 comments at time of writing — see the full discussion). Search volume for "Redis 8.8" will be real but bounded. The high-intent long-tail keywords (rate limiter, sliding window, streams NACK, array data type) are the realistic targets for organic search.

For comparison on what a more focused single-feature announcement looks like, see Cloudflare's recent VoidZero acquisition post — different topic, but the same pattern of one large headline news item generating a deeper, narrower technical conversation over the following week.

What to do this week

If you have a Lua rate limiter in production:

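# Check whether the script is cached on the server. Hash it locally:
# SCRIPT LOAD followed by SCRIPT EXISTS would always return 1.
SHA=$(sha1sum < rate_limiter.lua | cut -d' ' -f1)
redis-cli SCRIPT EXISTS "$SHA"
# 1 means this exact script body is loaded (the hash covers the exact bytes your app sends).
# If so, read the INCREX docs and start planning the migration.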
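For reference while you plan, this is the logic a typical Lua rate limiter wraps: increment a counter, and start an expiry the first time the key is hit. The sketch below models that in plain Python as a fixed-window counter, the simplest variant. It says nothing about how INCREX itself behaves (check the Redis docs for that), and the `FixedWindowLimiter` class and its injectable clock are hypothetical names of my own.

```python
import time


class FixedWindowLimiter:
    """In-memory model of the INCR + EXPIRE pattern often scripted in Lua."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self._clock = clock
        self._counters = {}  # key -> (count, expires_at)

    def allow(self, key):
        now = self._clock()
        count, expires_at = self._counters.get(key, (0, 0.0))
        if now >= expires_at:
            # Key is missing or expired: the first hit starts a new window (EXPIRE).
            count, expires_at = 0, now + self.window
        count += 1  # INCR
        self._counters[key] = (count, expires_at)
        return count <= self.limit


limiter = FixedWindowLimiter(limit=100, window_seconds=60)
print(limiter.allow("user:42"))
```

The reason this usually lives in a Lua script is atomicity: the increment and the expiry have to happen together, or a crash between them leaves a counter that never resets.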

If you're building anything message-queue-shaped and avoiding Redis Streams because of the failure-recovery story: that objection just got answered. Run the same load test against RabbitMQ and against Redis Streams + XNACK and see how close the numbers are.

If you're storing vectors in Redis: check what precision you're actually using and whether the new BF16/FP16/FP32/FP64 control lets you cut memory without losing model quality. For most embedding models the precision difference is in the noise.
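If you want a quick feel for that trade-off before touching Redis, the sketch below round-trips values through each format in plain Python and reports bytes and worst-case error. The 768-dimension vector and the one-million-vector count are made-up examples, and the byte totals cover raw payload only (no Redis overhead). The BF16 conversion here truncates instead of rounding, so it slightly overstates BF16 error.

```python
import random
import struct

BYTES_PER_VALUE = {"bf16": 2, "fp16": 2, "fp32": 4, "fp64": 8}
STRUCT_CODE = {"fp16": "<e", "fp32": "<f", "fp64": "<d"}


def roundtrip(x, fmt):
    """Store x in the given format and read it back as a Python float."""
    if fmt == "bf16":
        # bfloat16 is the top 16 bits of an FP32 value (truncation, an assumption).
        bits = struct.unpack("<I", struct.pack("<f", x))[0]
        return struct.unpack("<f", struct.pack("<I", (bits >> 16) << 16))[0]
    code = STRUCT_CODE[fmt]
    return struct.unpack(code, struct.pack(code, x))[0]


def footprint_bytes(dim, count, fmt):
    """Raw payload size for `count` vectors of `dim` values each."""
    return dim * count * BYTES_PER_VALUE[fmt]


def max_abs_error(vector, fmt):
    return max(abs(x - roundtrip(x, fmt)) for x in vector)


random.seed(0)
embedding = [random.uniform(-1.0, 1.0) for _ in range(768)]  # hypothetical vector
for fmt in ("fp64", "fp32", "fp16", "bf16"):
    size_gb = footprint_bytes(768, 1_000_000, fmt) / 1e9
    print(f"{fmt}: {size_gb:.2f} GB per million vectors, "
          f"max error {max_abs_error(embedding, fmt):.2e}")
```

Per-value error is only a proxy. The number that matters is retrieval quality on your own queries, so rerun your recall benchmark after changing precision.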

What this means for you

The story of Redis 8.8 isn't "here are six new features." It's that the project is now competing on three fronts it wasn't competing on a year ago: as a primary database for narrow, time-bounded use cases; as a message queue with proper failure handling; and as a vector store with explicit precision control. None of those is going to displace the best-in-class tool for any single use case. But the combination — one system you already operate that now does all three — is exactly the kind of leverage small teams have been waiting for.

The next time someone tells you Redis is "just a cache," ask them which cache ships its own sliding-window database, message queue, and vector store in a single binary.

Cloudflare Just Bought the Build Tool That Runs the Web

Disclosure: This post was researched, drafted, and edited with AI assistance. The Cloudflare announcement and VoidZero product pages were the primary references; technical claims and direct quotes were verified against them. Opinions, framing, and analysis are the author's.

Evan You — creator of Vue.js and founder of VoidZero, the company behind Vite — announced this morning that VoidZero is joining Cloudflare. The full team is moving over, all the projects stay MIT-licensed and community-driven, and Cloudflare is putting a million dollars into a Vite ecosystem fund. If you build anything for the web, this is the most consequential tooling acquisition in years. Here's why the deal matters, and why the buyer matters more than the size of the check.

The numbers that should change how you think about "build tools"

Vite — the build tool VoidZero is most famous for — pulls 129 million weekly downloads. That number is so large it stops meaning anything until you compare it. React gets about 35 million. Next.js about 25 million. Webpack, the previous generation's default, around 60 million. Vite is now the most-installed frontend build tool on the planet by a factor of two.

It's not just popular in raw numbers. Vite is the foundation under Vue, SvelteKit, Nuxt, Astro, Solid, Qwik, React Router, TanStack Start, and Angular (via the community-maintained Analog and @analogjs/vite-plugin-angular). Cloudflare's own vinext project is a drop-in Next.js reimplementation built on Vite. When you ship a SvelteKit app, you're shipping on Vite. When you ship an Astro site, you're shipping on Vite. When you ship a TanStack Start app, you're shipping on Vite. The web framework ecosystem in 2026 is, effectively, a Vite ecosystem.

That's the context for why this acquisition matters. Cloudflare didn't buy a popular dev tool. They bought the substrate almost every modern web framework compiles through. Whoever owns Vite has more influence over how the next decade of web apps get built than any other single entity in the JavaScript world.

Three reasons Cloudflare wanted this

The acquisition announcement frames it around three themes. They're worth reading carefully, because each one tells you something about where the web is going.

One: the "open internet" argument. Cloudflare's public framing is that Vite underpins so many frameworks that tilting it toward a single vendor would damage trust and slow the whole ecosystem. Buying it and then keeping it neutral is, in their telling, the most pro-developer outcome possible. Whether or not you take that at face value, it's the right argument to lead with. Anything else would scare the ecosystem.

Two: AI agents are the new heavy users of dev tooling. Cloudflare and the VoidZero team make the case that agent-coded applications are already choosing Vite — agents scaffold projects, run dev servers, read errors, write tests, and deploy, all of which Vite makes fast. If the agent-coding thesis is right (and the hiring data, the funding data, and the GitHub commit data all suggest it is), then owning the build tool that AI agents reach for first puts Cloudflare in the path of every agent-generated app. The same stack builds the app and deploys it.

Three: Vite is becoming the full-stack dev experience. Modern apps aren't just bundles anymore. They have serverless functions, agent backends, vector databases, edge runtimes. The Vite+ product wraps the VoidZero toolchain (Vite, Vitest, Rolldown, Oxc) into a single CLI with one configuration model. If Vite becomes the place where "code becomes a running app" happens, owning Vite means owning the most important transition in the developer experience. Cloudflare is the natural buyer here because they already have the deployment half.

Why Cloudflare is the best possible buyer

A year ago I would have bet on Microsoft or Amazon being the most likely acquirer of a tool this important. Both have deeper pockets and more existing dev-tool relationships. But neither would have been good news.

Microsoft buying Vite would have raised immediate GitHub Copilot and VS Code conflict-of-interest questions. Amazon buying Vite would have meant months of uncertainty about whether the AWS Amplify team would get preferential treatment. In either case, the trust that took Vite from "fast dev server" to "129 million weekly downloads" would have evaporated overnight. The community would have forked the project within a week, and the fork would have won.

Cloudflare is the right buyer for three reasons that don't get enough attention:

They have no existing dev-tools product to favor. Cloudflare sells CDN, Workers, R2, D1, Pages. They don't have a competing bundler, framework, or package manager. There's no internal team whose budget depends on Vite failing. The conflict-of-interest surface is genuinely small.

Their mission aligns with Vite's "open" stance. Cloudflare has historically been the buyer-of-record for open projects that need a corporate home without a corporate agenda — the Astro team joined Cloudflare earlier this year under exactly this model. The same playbook is being applied here. That's not a small thing: it's the single most important signal about what the next five years look like.

They're putting the right guardrails in writing. The announcement is unusually explicit about what won't change. Vite stays MIT-licensed. Vite's roadmap is community-driven. Features added to Vite itself should not be Cloudflare-specific. The Vite team continues leading the projects. The $1M ecosystem fund goes to the Vite core team to administer, not to a Cloudflare-controlled foundation. The Astro acquisition set this template, and VoidZero is being folded into the same structure.

What's in it for you, the developer

If you're already using Vite, the announcement is explicit: in the short term, nothing changes. Your existing project keeps working. Your existing plugins keep working. Your existing CI keeps working. Evan You and the VoidZero team are still running the show. Roadmap discussions happen in the same GitHub repos and Discord servers as before.

In the medium term — six to eighteen months — three things are likely to get noticeably better:

  1. Vite + Cloudflare gets a first-class full-stack story. The existing Cloudflare Vite plugin (14 million weekly downloads) runs your server code inside workerd locally, so the dev server runs on the same runtime as production instead of an approximation of it. With VoidZero's full engineering team now part of Cloudflare, expect that integration to deepen, with the cf CLI rebuilt to feel like Vite, not like a separate tool.
  2. Linting and formatting will get fast everywhere. Oxlint is already "saving days of engineering time" at Cloudflare per the announcement. With the VoidZero team inside the company, expect the oxlint/oxfmt tools to become deeply integrated into the Vite+ stack, replacing slower alternatives in the dev loop.
  3. AI-agent-friendly tooling becomes a first-class concern. Both the Cloudflare and VoidZero teams have been talking about agent ergonomics. Expect Vite's "what can the agent see and call" surface to expand significantly.

In the longer term, the bet is that Vite becomes the default runtime gateway for edge-deployed JavaScript: bundle, test, deploy through the same tool. Cloudflare's commercial strategy depends on this story, and Vite is now part of how they get there.

What's the catch

Every acquisition has a catch. Here it's the same one it always is: corporate priorities can drift over five-to-ten-year horizons. The founders and maintainers could leave. The community could fragment. A future Cloudflare leadership team could reinterpret "vendor-agnostic" loosely.

The mitigations are real, though. The Astro acquisition set the playbook: open source stays open source, deploy-anywhere stays deploy-anywhere, community governance stays community governance. The Vite team has explicitly committed to the same model. The MIT license is enforceable regardless of corporate intent. The community can fork if it has to, and the goodwill to do so exists.

The honest read: this is a 90th-percentile outcome. The only better one would have been Vite staying independent forever, which was never going to happen given the scale. Of the realistic buyers, Cloudflare was the best available.

The bottom line

If you build with Vite today, your workflow doesn't change this week. If you build with Vite next year, it gets better in ways you don't have to think about. If you're picking a frontend stack for a new project, Vite is now the lowest-risk default it has ever been — the company behind it is owned by a buyer whose business model depends on the open web, with the governance, the funding, and the explicit commitments to keep it that way.

The bundler wars are effectively over. Vite won, and the buyer it ended up with is the one the open web can live with. For now.

Thursday, June 4, 2026

Gemma 4 12B Just Killed the Multimodal Encoder — Here's What That Means for You

Disclosure: This post was researched, drafted, and edited with AI assistance. Google's announcement was the primary reference; technical claims and direct quotes were verified against it. Opinions, framing, and analysis are the author's.

Google released Gemma 4 12B yesterday, and if you only read the headline you might think "another mid-sized open model, whatever." Don't move on. This one is different under the hood, and the architectural choice Google made is going to ripple through every open-weights release for the rest of the year.

The thing nobody's talking about: drastically lighter encoders

Most popular multimodal models today — GPT-4o, Claude, Llama 3.2 Vision, Qwen-VL — still rely on substantial modality-specific encoders. A vision encoder turns images into a stream of "vision tokens." An audio encoder does the same for sound. The main language model then sits on top of all these encoded streams and reasons over them.

Gemma 4 12B takes that apart. The vision "encoder" is replaced with a lightweight embedding module — a single matrix multiplication plus positional embeddings and normalizations. The audio path is even leaner: the raw audio signal is projected directly into the same dimensional space as text tokens.

In Google's own words from yesterday's release post:

"We replaced Gemma 4's vision encoder with a lightweight embedding module consisting of a single matrix multiplication, positional embedding and normalizations."

"We removed the audio encoder entirely and projected the raw audio signal into the same dimensional space as text tokens."

Read that again. A single matrix multiplication for vision. A direct projection of raw audio into the same dimensional space as text tokens.

This is a notable architectural choice. Google is betting that modality-specific preprocessing can be dramatically reduced. The old way treats vision and audio as fundamentally different modalities that need specialized pre-processing. The new approach says: "tokens are tokens. Project everything into the same space, let the transformer figure it out."

(Some researchers will quibble with the "encoder-free" framing — a projection layer plus positional embeddings is still a form of encoding. Fair. What's different is the order of magnitude: the encoder is now a single matrix multiplication rather than a ViT or a Whisper.)
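As a rough picture of how small that module is, here is a toy version in plain Python: cut the image into patches, multiply each flattened patch by one weight matrix, add a positional embedding, normalize. Everything specific here is my assumption, not Gemma's actual design: the patch size, the dimensions, the random weights, the choice of RMS normalization, and the order of the steps. Google's post names the ingredients, not the recipe.

```python
import math
import random


def patchify(image, patch):
    """Split an HxW grayscale image (list of rows) into flattened patches."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            patches.append([image[top + r][left + c]
                            for r in range(patch) for c in range(patch)])
    return patches


def project(vector, weight):
    """The single matrix multiplication: (n,) times (n x d) gives (d,)."""
    dim = len(weight[0])
    return [sum(v * row[j] for v, row in zip(vector, weight)) for j in range(dim)]


def rms_norm(vector, eps=1e-6):
    scale = math.sqrt(sum(x * x for x in vector) / len(vector) + eps)
    return [x / scale for x in vector]


def embed_image(image, weight, positions, patch):
    """Turn an image into one token vector per patch."""
    tokens = []
    for index, flat_patch in enumerate(patchify(image, patch)):
        projected = project(flat_patch, weight)
        with_position = [a + b for a, b in zip(projected, positions[index])]
        tokens.append(rms_norm(with_position))
    return tokens


random.seed(0)
PATCH, DIM = 4, 8  # hypothetical sizes, far smaller than any real model
image = [[random.random() for _ in range(8)] for _ in range(8)]
weight = [[random.gauss(0, 0.1) for _ in range(DIM)] for _ in range(PATCH * PATCH)]
positions = [[random.gauss(0, 0.1) for _ in range(DIM)] for _ in range(4)]

tokens = embed_image(image, weight, positions, PATCH)
print(len(tokens), "tokens of dimension", len(tokens[0]))
```

Compare that to a ViT, which stacks many attention layers before the language model sees anything. Here the output vectors go straight into the same sequence as the text tokens, and the transformer does the rest.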

Why this matters if you build with AI

If you're using a hosted API, this change mostly shows up under the hood — you'll get slightly lower latency, slightly lower memory cost, and the same end-user experience. The interesting part is what it enables for the local crowd.

Quantized versions of Gemma 4 12B can run on many consumer laptops with 16GB of unified memory. Not 24GB. Not 48GB. Sixteen. A $1,200 MacBook Pro with the base M-series chip can run a quantized build. A year-old gaming laptop with an RTX 4060 can run one too. Full-precision multimodal inference at meaningful speed may still need more, but the floor for "useful local multimodal" just dropped a lot.
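The 16GB figure follows from simple arithmetic on weight storage. A back-of-the-envelope sketch, assuming a nominal 12 billion parameters and counting weights only (the KV cache, activations, the runtime, and your OS all need room on top of this):

```python
def weight_memory_gb(params, bits_per_weight):
    """Approximate size of the model weights alone, in gigabytes."""
    return params * bits_per_weight / 8 / 1e9


PARAMS = 12e9  # nominal parameter count, an assumption for illustration

for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gb(PARAMS, bits):.1f} GB")
```

At 16 bits per weight the weights alone overflow a 16GB machine, at 8 bits they fit with little to spare, and at 4 bits they leave real headroom. That is why the claim is specifically about quantized builds.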

Local multimodal models have existed for a while (Qwen2.5-VL 7B, Llama 3.2 Vision 11B, Phi-3 Vision, InternVL, MiniCPM-V), but they typically involved trade-offs in capability, latency, or hardware requirements. What Gemma 4 12B does differently is combine vision, audio, and chat in one model at the 12B size, on consumer hardware, with a single unified token space. That combination is what's new.

That's the actual story. Not "Google released a model." "Google released a model that makes truly local multimodal practical for the hardware people already own."

The 150 million download context

Google's release notes say Gemma models have been downloaded 150 million times across all platforms — a number that puts them in the same league as the major open-weights families, though exact comparisons to Llama's distribution are hard to pin down since different platforms count downloads differently.

What that tells me is that the developer gravity around Gemma is real, not hype. People are picking Gemma not because Google's marketing is loud, but because the licensing has been permissive (the team moved to Apache 2.0 for Gemma 4, and earlier versions used the Gemma license which was also commercial-friendly) and the model family covers the full hardware range from "phone" (2B, 4B) to "data center" (26B MoE, 31B dense).

Gemma 4 12B slots in the gap that was actually missing. Before this, you had a 4B model for phones and a 26B model for servers, and not much in between. The 12B is the one you'd want for a serious desktop dev machine.

What you can actually build with it

Yesterday's post highlighted two examples from the community: a wearable robotic arm that uses Gemma for physical assistance, and an enterprise security tool. Both are the kind of thing that needs multimodal input (vision for the robotic arm, audio for the security tool) but doesn't need a 200B parameter model.

Here are three concrete project types that became realistic yesterday:

  1. A local accessibility tool that reads a webpage aloud, watches the cursor, and adjusts UI in real time. Vision for the cursor, text for the page, audio output for the user. All on-device, no cloud roundtrip.
  2. A second-brain app that ingests voice memos, screenshots, and notes, and lets you search across all three with a single query. The unified token space means the model can reason about "the screenshot I took while talking about the API bug" without separate pipelines for image and audio.
  3. A coding assistant that sees your screen, not just your text. Click on a function, get an explanation, ask "what's the bug here" with the screenshot as context. Quantized runs of Gemma 4 12B on a $1,200 laptop make this kind of project much more tractable than it was last week.

The agentic angle nobody's writing about

Google quietly released something alongside the model that's worth more attention: the Gemma Skills Repository at github.com/google-gemma/gemma-skills. It's a library of "skills" — pre-built capabilities that agents can compose — designed specifically for Gemma.

This is Google's answer to Anthropic's Skills system, and it's a signal that the agent-building game is now firmly a model-family-level competition, not a model-level one. The differentiator isn't "we have a 12B model" anymore. It's "we have a 12B model plus a thousand working agent patterns you can copy."

If you're building anything agent-shaped, the right move is to spend 30 minutes browsing the skills repo even before you download the model. The patterns there will save you weeks of trial-and-error.

The license and what it means for commercial use

Per Google's announcement, Gemma 4 ships under Apache 2.0. That's a meaningful change from earlier Gemma releases, which used a custom "Gemma license" (also commercial-friendly but more restrictive). Apache 2.0 means you can ship a commercial product on top of Gemma 4 12B. You can fine-tune it and sell the fine-tune. You can distribute it embedded in your application. You don't owe Google anything beyond the license's standard conditions: ship a copy of the license and any NOTICE file with whatever you distribute, and mark the files you changed. (If you want to be paranoid, and you should be before building a company on it, pull the model card directly from Hugging Face and have your lawyer read it.)

The permissive licensing is part of why the 150M downloads number isn't a fluke. The license is the kind you can hand to a lawyer and get a "yes, ship it" answer in five minutes.

The trade-offs nobody will mention

It's not all good news. A few honest caveats:

  • A 12B parameter count puts a ceiling on capability. Compared to a 70B model, the 12B will be weaker on hard reasoning tasks. If you need the model to write production code from a vague spec, you'll want a bigger one. For "look at this and tell me what's wrong," 12B is plenty.
  • Lighter encoders don't mean "free." The matrix multiplication for vision is cheap, but the audio projection is still doing real work. Expect memory usage to scale with how much audio you feed it.
  • The benchmarks are Google's. "Performance nearing the 26B MoE at less than half the memory" is a claim, not an independent measurement. The community will publish independent benchmarks within a week, and those are the ones to trust.

What to do today

If you have a 16GB Mac or PC and 30 minutes:

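# Install Ollama (one-time, https://ollama.com)
ollama pull gemma4:12b
# Start the server (skip this if the Ollama app is already running)
ollama serve
# In another terminal, try the multimodal API.
# The API expects base64-encoded image data, not a file path, so the
# request body is piped in to avoid shell argument-length limits.
{
  printf '{"model": "gemma4:12b", "stream": false, '
  printf '"prompt": "Describe this image and tell me what app I have open.", '
  printf '"images": ["'
  base64 < screenshot.png | tr -d '\n'
  printf '"]}'
} | curl http://localhost:11434/api/generate -d @-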
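If you'd rather make the call from a script, here is a minimal Python sketch using only the standard library. Ollama's generate endpoint takes images as base64 strings, so the file is encoded before sending. The function names are my own, and the model tag and `screenshot.png` path are carried over from the example above as placeholders.

```python
import base64
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model, prompt, image_bytes):
    """Build the JSON body; images must be base64 text, not file paths."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for one JSON object instead of a stream
    }


def describe_image(path, model="gemma4:12b",
                   prompt="Describe this image and tell me what app I have open."):
    """Send one image to a locally running Ollama server and return its reply."""
    with open(path, "rb") as handle:
        payload = build_payload(model, prompt, handle.read())
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.loads(response.read())["response"]
```

With the server running, `print(describe_image("screenshot.png"))` prints the model's answer. The generous timeout is deliberate, since the first request also loads the model into memory.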

That's it. You now have a multimodal model running locally that fits in your laptop's memory. Two years ago this needed a $10,000 workstation. Yesterday it needed a model. Today it needs a curl command.

The barrier to running useful multimodal models locally has dropped substantially. The next interesting question is what gets built now that anyone with a laptop can run one.

Monday, April 3, 2023

How to Build Resilience: Strategies for Overcoming Life's Challenges - Practice Gratitude

 Practicing gratitude is an important part of maintaining a positive outlook on life. It involves focusing on the good things in life, rather than dwelling on the negative. Here are some key steps to practicing gratitude:

  1. Focus on the Present: To practice gratitude, start by focusing on the present moment. Take a few minutes each day to reflect on the good things in your life. This may include your health, your relationships, your job, or simply the beauty of the natural world.

  2. Keep a Gratitude Journal: One way to cultivate gratitude is to keep a gratitude journal. Each day, write down a few things that you are thankful for. This can help you to focus on the positive aspects of your life and to develop a habit of gratitude.

  3. Express Gratitude to Others: Another way to practice gratitude is to express it to others. This may involve thanking someone for a kind gesture or simply expressing your appreciation for their presence in your life. Not only does this benefit the person you are thanking, but it can also help to strengthen your relationships and cultivate a positive outlook.

  4. Practice Mindfulness: Mindfulness is the practice of being present in the moment and fully engaged in your surroundings. By practicing mindfulness, you can become more aware of the good things in your life and develop a greater sense of gratitude.

  5. Cultivate Positive Relationships: Surround yourself with people who support and uplift you. Positive relationships can help to boost your mood and cultivate feelings of gratitude.

By practicing gratitude, you can cultivate a more positive outlook on life and develop a greater sense of contentment and happiness. Remember to focus on the present moment, keep a gratitude journal, express gratitude to others, practice mindfulness, and cultivate positive relationships. With practice, you can develop a habit of gratitude that will benefit you in all areas of your life.


Generated by ChatGPT