Programming guides for beginner...
Any comments are welcomed....
I hope it helps!!! Thanks for drop by...

Friday, July 3, 2026

Linux 6.9 Stopped Wiping LUKS Keys. It Took 2 Years.

Ingo Blechschmidt, a mathematician in Augsburg who holds a PhD in applied topos theory from the University of Augsburg (October 2017), posted a debugging saga on Mathstodon on 18 June 2026 describing the worst kind of security regression: one that silently disables a defense mechanism you have trusted for years. Since Linux 6.9 shipped on 12 May 2024, the path that locks a LUKS-encrypted laptop's drive on suspend has been a no-op. The encryption key stays in RAM through the entire suspend cycle. For more than two years, on every Debian- and Ubuntu-derived distribution that wires up cryptsetup luksSuspend to the laptop's lid-close or sleep button, full-disk encryption has been doing nothing during sleep. A full shutdown was still safe. A suspend — which is what every modern laptop actually does when you close the lid — left the key resident in memory, accessible to anyone who seized the still-powered laptop. The fix is one line. The kernel patch is in late-June 2026 development. The NixOS integration test that would have caught the regression shipped as PR #532499, and the cryptsetup-side loud-warning patch shipped as GitLab MR #936 on the same day Blechschmidt posted the bug. This is the cleanest, most humbling security bug of the year, and the question of who is supposed to catch this class of bug is the part the kernel community has not answered.

What actually broke

Blechschmidt's debugging saga began when he noticed his laptop's LUKS volume key was not being wiped after a suspend-resume cycle. He git-bisected the regression across two years of kernel history and landed on commit a28d893eb3270cf62c10dd8777af0d8452cdc072 — "a sensible and useful refactoring," as he puts it, with an unexpected long-range interaction with the encryption code. The commit is referenced in his post by full hash and lore.kernel.org URL; both kernel.org hosts serve an Anubis bot challenge to non-browser clients at the time of writing, so the link is cited as in the primary source rather than directly verified in this pass. The breaking change reorganized how certain crypto and memory-notification paths interact, and the suspend path's contract with cryptsetup was the casualty.

The technical shape is the textbook silent-failure regression. The encryption layer's contract says: on suspend, wipe the volume key from RAM. The kernel change kept the wipe path callable but stopped the trigger that told it to run in the suspend sequence. The laptop sleeps, the laptop wakes, the password prompt never reappears (because the key is still there), and the user concludes the encryption is working. Every tool that automates this — the Debian-derived pm-utils, the Ubuntu-derived systemd sleep hooks, the graphical power-manager daemons that call cryptsetup luksSuspend on lid-close — has been calling the wipe into a void for two years.

What the HN thread taught me about scope

The Hacker News submission of Blechschmidt's post (item 48763035, 384 points and rising as of 03 July 2026) hit the front page within hours of being submitted, and the comment thread is where the story gets sharper than the original post makes it. The most useful correction came from user kokada, who pointed out that cryptsetup luksSuspend is not actually called by the upstream kernel or by upstream systemd; it is a Debian-specific integration. That detail matters because it changes the answer to "who is at fault?" from "the kernel" to "the Debian ecosystem," and that distinction is the entire story.

Two readings are competing in the thread. The first reading, which Blechschmidt's framing leans toward, is "Linux broke the contract." The second reading, which the Debian-specificity comment sharpens, is "a downstream integration relied on a kernel-internal behavior that was never guaranteed, and the kernel change was a correct refactor with no obligation to preserve that downstream contract." Both readings are defensible. The kernel community's norm is that internal APIs can change without notice, and the cryptsetup luksSuspend integration is documented as best-effort; the Debian-derived distributions' norm is that full-disk encryption must work on suspend, period. The fix that landed in late June 2026 makes the wipe path work again, but the question of who was responsible for testing the integration is the one the post and the kernel community have not addressed.

The one-line fix and the warning patch

Blechschmidt's patch, posted to linux-crypto and linked from his Mastodon thread, is one line of kernel code restoring the trigger in the suspend path. The lore.kernel.org URL is referenced verbatim in the primary source but, as noted, was not directly retrievable from a non-browser client during this write-up (Anubis challenge on lore.kernel.org). His NixOS PR #532499, "Add integration test for verifying that cryptsetup luksSuspend correctly wipes the volume key from memory," is open and verifiable on GitHub and is the more durable artifact of the discovery: an automated test that runs against every NixOS package rebuild and would have caught this regression on the day it shipped. The cryptsetup-side MR #936, "RFC: Print a loud warning if wiping the volume key is impossible," is the parallel upstream change that means a future regression fails loudly instead of silently. Blechschmidt's own caveat is honest: "without formal proofs I cannot say whether my patch is correct and free of its own long-range interactions." He is shipping a fix and a regression test together, and the warning MR is the belt to the suspend-path's suspenders.

The kernel's accountability gap, named

The story underneath the bug is a question of who is responsible for testing cross-layer security invariants. The kernel changed a refactor. The Debian-derived ecosystem has been integrating cryptsetup luksSuspend against that refactor's prior behavior for two years. Neither side has a regression test that exercises the integration on every kernel release. The NixOS PR is the first automated integration test of this contract I am aware of, and it shipped only after the bug was caught by hand.

The HN thread makes the structural point with characteristic bluntness. johnathan101: "This is one of those regressions that's easy to miss because everything still 'works.' Security bugs often don't announce themselves." bbminner: "From the number of 'we missed a single line C check across files during refactoring' critical security bugs discovered on a regular basis these days, the whole premise of a 'giant secure open source C codebase' seems questionable." The Linux kernel is the largest C codebase in active production use, the security boundaries it is supposed to enforce are exactly the ones that are hardest to test, and the regression that just happened is the canonical example of a class of bug the kernel's review process is structurally bad at catching.

The original take: the bug is fixable in one line and will be backported. The story is the accountability gap. If you operate a fleet of encrypted Linux laptops, you have a two-year window in which your disk-encryption-at-rest story was, in practice, a disk-encryption-while-shut-down story. The defenses that should have caught this — distribution-level integration tests, kernel CI that exercises downstream security hooks, upstream cryptsetup warning-on-silent-failure — did not exist. The patch that closes the gap is a NixOS PR and a GitLab MR, and those are the artifacts the rest of the ecosystem needs to copy.

What this means for you

If you run Debian, Ubuntu, Linux Mint, Pop!_OS, or any Debian-derived distribution on a laptop with LUKS full-disk encryption, your laptop's encryption has been inactive during suspend since your kernel last crossed the 6.9 boundary. A full shutdown was always safe; a suspend was not.

The interim mitigation is the one the HN commenters converged on: enable hibernation (suspend-to-disk) so that RAM is actually powered down on sleep. fpoling on Fedora: "I just configured Linux to hibernate to disk after 15 minutes of suspend. Powering memory off ensures that bugs like this Debian-specific would not matter." Most Debian-derived distros ship systemd-hibernate; check systemctl hibernate works before relying on it. Intel TME and AMD SEV mitigations the HN thread mentions require CPU and firmware support and do not address the underlying problem: the kernel thought the key was wiped when it was not.

If you ship a Linux distro, the NixOS PR #532499 is a short Nix expression that exercises the suspend-resume key-wipe contract. Backporting that test to your CI is a half-day of work and would have caught this regression on day one. The cryptsetup MR #936 is the upstream change to fail-loud-instead-of-silent; track it.

If you maintain the kernel, the structural argument is that internal refactors are allowed to change behavior, but security-relevant integration points need a CI signal. The kernel ships kselftest; nothing in kselftest exercises cryptsetup luksSuspend. Adding a test case that runs cryptsetup luksSuspend and asserts the volume key is absent from /proc/keys after a mem_sleep cycle would close this gap permanently.

What to do this week

# 1. Confirm whether you are affected. On a Debian-derived distro
#    with LUKS FDE, suspend the laptop and check whether the
#    volume key is still resident in /proc/keys after resume.
#    If so, you have been carrying the bug. Patch your kernel
#    to a build with the fix once your distro ships it, or use
#    hibernate instead of suspend in the interim.

# 2. Enable hibernate-as-fallback on battery as a stopgap:
sudo apt install hibernate
sudo systemctl enable hibernate.target
# Add to /etc/UPower/UPower.conf:
#   CriticalPowerAction=HybridSleep
# Hybrid sleep powers off RAM on low battery and closes the
# cold-boot attack window at the cost of slower resume.

# 3. If you ship a Linux distro, port the NixOS PR #532499 test
#    to your CI. The Nix expression is portable to any distro
#    using systemd and cryptsetup; the equivalent in Debian is
#    ~80 lines of bash in debian/tests/.

# 4. If you are a kernel developer, file or comment on a
#    kselftest proposal to add a cryptsetup-luksSuspend test
#    case. No such test existed, which is the structural
#    reason this regression shipped. Adding one is the durable
#    fix.

What we are deliberately not covering

This post is about the bug, the fix, and the accountability gap. We are not covering: the broader cold-boot attack literature (the original Halderman 2008 paper and the subsequent TPM-based mitigations are a separate beat); the parallel question of whether other cryptsetup integration points have similar silent-failure paths that have not been audited; the Debian-specific history of pm-utils and why cryptsetup luksSuspend is wired the way it is (a long history of distro packaging decisions); or the technical detail of the kernel refactor itself (Blechschmidt's Mastodon thread covers it well enough that a rehash adds no value). The HN thread's claim from kokada that this is "Debian-specific" is treated as the structural point it actually is, not litigated — the integration was Debian-derived and that is what matters for the post.

Related reads

Disclosure

Drafted with AI assistance. Primary source: Ingo Blechschmidt, "Since Linux 6.9, the tool that locks the laptop's drive on suspend had been silently failing," Mathstodon, 18 June 2026 (https://mathstodon.xyz/@iblech/116769502749142438). Source for: the May 2024 Linux 6.9 starting date, the "for more than two years" elapsed-time framing, the breaking-commit hash a28d893eb3270cf62c10dd8777af0d8452cdc072, the lore.kernel.org patch URL ajKwRtP8izwRsMmv@quasitopos/, the NixOS PR #532499 link, the cryptsetup GitLab MR #936 link, the "sensible and useful refactoring" framing, the "without formal proofs I cannot say whether my patch is correct" caveat, and the date the post was made (18 June 2026, with the fix landing in late June 2026 as described in the post). Secondary source: the Hacker News thread for item 48763035, "Since Linux 6.9, LUKS suspend stopped wiping disk-encryption keys from memory," submitted 2 July 2026 by IngoBlechschmid, 384 points and rising as of 03 July 2026 morning UTC+8, front-page submission. Source for the "Debian-specific cryptsetup luksSuspend is not upstream" claim is HN comment by kokada, who characterized the integration as a Debian extension; this characterization is the HN commenter's framing, not a direct cryptsetup or kernel.org citation. Source for the cold-boot / hibernation / Intel TME mitigations are HN comments by fpoling, teravor, and quotemstr; their specific recommendations (15-minute hibernation timeout, TME enabling) are commenter suggestions, not packaged distro defaults. The NixOS PR #532499 title was directly verified on GitHub (Add integration test for verifying that cryptsetup luksSuspend correctly wipes the volume key from memory by iblech). The cryptsetup MR #936 title was directly verified on GitLab (RFC: Print a loud warning if wiping the volume key is impossible). The breaking-commit hash and lore.kernel.org patch URL were NOT directly verified during this pass — both kernel.org and lore.kernel.org serve an Anubis bot challenge to non-browser clients as of 03 July 2026 UTC+8, and the URLs are cited verbatim from Blechschmidt's Mathstodon post rather than re-fetched. Readers clicking those two links may hit a bot challenge; this is annotated rather than dropped because the hashes and the lore.kernel.org thread ID are stable references that the kernel community itself will treat as canonical.

Sources

  • Ingo Blechschmidt, Mathstodon post, 18 June 2026https://mathstodon.xyz/@iblech/116769502749142438. Primary source for: the title and framing of the bug ("the tool that locks the laptop's drive on suspend had been silently failing"), the Linux 6.9 starting date (May 2024; confirmed as 12 May 2024 via Wikipedia kernel version history), the "more than two years" elapsed time, the technical mechanism (LUKS volume key not wiped from RAM across suspend), the breaking commit hash a28d893eb3270cf62c10dd8777af0d8452cdc072 (with the caveat that the kernel.org page returns an Anubis bot challenge to non-browser clients as of 03 July 2026), the lore.kernel.org patch URL ajKwRtP8izwRsMmv@quasitopos/ (same Anubis caveat), the "sensible and useful refactoring" framing, the "without formal proofs I cannot say whether my patch is correct and free of its own long-range interactions" caveat, the NixOS regression-test PR link, and the cryptsetup GitLab MR link. Author bio: Ingo Blechschmidt is a mathematician in Augsburg with a PhD in applied topos theory from the University of Augsburg (awarded October 2017); this was verified against his homepage rather than assumed from the Mastodon profile alone. Date the post was made: 18 June 2026 (per Mathstodon JSON-LD datePublished).
  • Hacker News thread, item 48763035https://news.ycombinator.com/item?id=48763035. Submitted 2 July 2026 by IngoBlechschmid, 384 points as of 03 July 2026 morning UTC+8. Source for: the kokada comment characterizing cryptsetup luksSuspend as a Debian extension rather than upstream kernel/systemd behavior, the johnathan101 "easy to miss because everything still works" framing, the bbminner "giant secure open source C codebase" framing, the bitbasher and CodesInChaos comments on how the wake-from-suspend user experience did not reveal the regression, the fpoling Fedora 15-minute hibernate-after-suspend workaround, the teravor MemoryOverwriteRequestControl and TPM MOR-lock mitigations, and the quotemstr Intel TME / AMD SEV hardware-encryption mitigations. These are commenter framings and are attributed as such, not as published positions. The HN point count (384) is the count at time of writing and is moving.
  • NixOS nixpkgs PR #532499https://github.com/NixOS/nixpkgs/pull/532499. Title directly verified on GitHub: "Add integration test for verifying that cryptsetup luksSuspend correctly wipes the volume key from memory by iblech." Author: iblech (Ingo Blechschmidt). Status at time of writing: open, awaiting review. Primary source for the regression-test artifact that would have caught this on the day it shipped; not the source for any historical claim about the bug's existence (Blechschmidt's Mathstodon post is the source for that). The PR was created 2026-06-16 and may have been merged by the time this post is read.
  • cryptsetup GitLab MR #936https://gitlab.com/cryptsetup/cryptsetup/-/merge_requests/936. Title directly verified on GitLab: "RFC: Print a loud warning if wiping the volume key is impossible." Author: iblech (Ingo Blechschmidt). Status at time of writing: open, awaiting review. Primary source for the upstream change that would make a future regression of this class fail loudly instead of silently; not the source for the bug itself. The MR was created on the same day as the Mathstodon post (2026-06-18) and may have been merged or closed by the time this post is read.

Thursday, July 2, 2026

Android's ADV Lands Sept 30. F-Droid Is Right to Panic.

F-Droid posted a piece on 1 July 2026 titled "What We Talk About When We Talk About Malware" that frames Google's mandatory Android Developer Verification (ADV) as "a trojan horse… runs surreptitiously in the background as a system service with full root privileges." That is a polemical frame, and it is also substantially correct on the facts that matter. On 30 September 2026, every app installed on a certified Android device in Brazil, Indonesia, Singapore, and Thailand must be registered by a developer who has paid Google, surrendered government-issued identification, signed the Android Developer Console Terms of Service, and handed over the signing keys for every app they have published or will ever publish. If the developer does not comply, the app gets silently blocked on every certified device in those four countries. Global rollout is scheduled for "2027 and beyond." F-Droid's claim is accurate: the policy as written is what the "existential" label describes. The original take is this: the security argument is real, but it is also being used to lock in a position that security alone does not justify, and the four-country phasing is a tell about which side of that trade-off Google is betting on.

The policy, on Google's own terms

Google announced the developer-verification regime in August 2025 on the Android Developers Blog, in a post by Suzanne Frey, VP of Product for Android. The framing was security: "We've seen how malicious actors hide behind anonymity to harm users by impersonating developers and using their brand image to create convincing fake apps. The scale of this threat is significant: our recent analysis found over 50 times more malware from internet-sideloaded sources than on apps available through Google Play." The mechanism is an "ID check at the airport" — Google confirms who the developer is, separately from any review of the app's content. The rollout begins in September 2026 in Brazil, Indonesia, Singapore, and Thailand, with global rollout to follow.

The four countries are not random. Brazil is Google's largest LatAm market. Indonesia and Thailand are the two largest Android markets in Southeast Asia, where the installed base is heavily weighted toward devices that came pre-loaded with sideloaded or third-party-store apps (a pattern that grew during the 2018-2022 period when low-end devices shipped with limited Play Store access). Singapore is the regulatory test market Google frequently uses for new policies because of its high rule-of-law score and predictable court system. Choosing four markets that are simultaneously growth markets, mobile-first markets, and underrepresented-in-Play markets is the move that lets Google claim the policy is targeting "regions specifically impacted by fraudulent app scams" while phasing it in where the displacement from F-Droid, Aptoide, and direct APK distribution will do the most damage to the open distribution ecosystem and the least damage to Play Store revenue.

The "escape hatch" is a nine-step procedure behind a 24-hour cooling-off wall

Google has said repeatedly that "power users" can still install unverified apps — the official escape hatch is a developer-options flow that lets a user disable verification on their own device. The Keep Android Open coalition (the civil-society site hosting the open letter Google does not want you to find) walked through what that flow actually looks like in practice. Nine steps. Open System Settings, find Developer Options, tap the build number seven times to enable Developer Mode, dismiss a coercion-warning screen, enter your PIN, restart the device, wait 24 hours, come back, dismiss more warning screens, then choose "allow temporarily" (7-day TTL) or "allow indefinitely." For installing software on a device you already own. Worse, the entire flow runs through Google Play Services, not the Android OS proper — meaning Google can tighten, modify, or kill it at any time without an OS update and without user consent. And as of the Keep Android Open site's July 2026 reading, this flow had not shipped in any beta, preview, or canary build. It exists as a blog post and a mockup.

That detail matters. The escape hatch is the entire policy's defense against the "Google is becoming a gatekeeper" critique, and the escape hatch does not exist yet in shipping code. For a hypothetical Brazilian F-Droid maintainer whose users will lose access to the client on 30 September, the official answer is to wait for a Google Play Services update that may or may not ship before then, that may or may not preserve the temporarily-vs-indefinitely distinction, and that Google can change unilaterally afterward. That is not an escape hatch. That is the absence of a commitment.

The Terms of Service clause Google does not want you to read

The F-Droid post quotes clause 6.5 of the Android Developer Console Terms of Service: "If You violate any of the Terms or if You distribute malware or other harmful applications, Google may terminate Your access to the ADC…" The concern F-Droid raises, and which the post's framing makes stick, is the absence of a definition of "malware" anywhere in the document. With no defined standard, the clause reads: malware means whatever Google says it means. F-Droid points to precedent: Google Play has already banned ad blockers and, in some cases, classified them as malware. The logical extension, which F-Droid states explicitly, is that the same discretion applied to developer accounts in the broader ADC could one day mean that ad-block developers, encryption tool authors, VPN maintainers, or any class of software Google disapproves of can be retroactively designated as malware distributors and have their developer accounts terminated. Once terminated, every app they have ever signed is blockable across every certified device in the affected regions.

This is the part that the Google announcement does not address. The August 2025 post defines the problem (malware, impersonation, repeat offenders) and the mechanism (identity verification), but does not constrain what counts as a violation. The asymmetry between "we are verifying identity, not content" (the August post's stated scope) and "we can terminate for distributing malware, defined as anything we say is malware" (the ToS clause) is the gap that makes F-Droid's framing not a stretch.

The petition and what "overwhelming opposition" actually looks like

The Keep Android Open open letter is signed by 71 organizations from 23 countries, including the EFF, FSF, FSFE, ACLU, KDE, GNOME Foundation, Tor Project, GitHub Store, Vivaldi, Tuta Mail, Codeberg, the App Fair Project, and the Brazilian, Taiwanese, French, and Norwegian digital-rights coalitions. The accompanying petition passed 100,000 signatories. That is an unusually broad coalition for an Android-policy fight; F-Droid's coalition politics have historically been narrower (free-software circles), and the breadth here is itself the news. The EFF's framing on the site is the most quotable version of the critique: "identity-based gatekeeping is a censorship tool, not a security one." Cory Doctorow called it "Darth Android" — the same author who named "adversarial interoperability" and "right to repair" as the policy frames of the last decade is naming this one too.

The HN thread on the F-Droid post (item 48755965, 919 points, 385 comments within roughly 24 hours of submission) is not the same audience as the petition signers but lands in similar places. The most-upvoted comments are not about malware or sideloading convenience; they are about precedent. The framing that recurs at the top of the thread is: if Google can retroactively lock down devices that were sold as open, every hardware manufacturer is watching, and the principle being established is that the company that built your device gets to decide, after you bought it, what software you are allowed to run. That is a hardware-ownership argument dressed up as a software-policy one, and it is the version of the critique that scales beyond the developer audience. The 30 September date is a 90-day countdown from when the open letter launched; the petition is timed to land before the rollout begins.

Where Google is right

The security argument is not made up. Sideloaded APKs are a real vector for financial-fraud malware in the markets being phased in first, and the Google Play Protect statistics Google cites (50x more malware from internet-sideloaded sources than from Play) come from Google's own telemetry, which has an obvious conflict of interest but is also measuring a real phenomenon. Brazilian banking-fraud statistics, Indonesian SMS-phishing campaigns targeting mobile-banking apps, and Thai scam-app takedowns all predate the ADV program and are part of why FEBRABAN (the Brazilian Federation of Banks) endorsed the policy in Google's announcement. The Developer Verification requirements that Google has run inside Google Play since 2023 are reported (by Google, with the same conflict-of-interest caveat) to have reduced recidivist developer abuse; the August 2025 announcement says Play-side verification has been "helpful in stopping bad actors from exploiting anonymity." Extending the model to non-Play developers is a coherent extension of an existing program.

Identity-based gatekeeping does reduce malware. The question is whether the cost — the chilling effect on hobbyist and independent developers, the political leverage Google gains over every dissident-app or rights-tool author on the planet, the precedent for hardware-vendor control of post-purchase device behavior, the dependency of every Android user on a single foreign corporation's definition of "malware" — is worth the security gain. F-Droid's argument, which the post makes more effectively than the headline frames, is that the cost is being paid by people who are not the beneficiaries. The 580 million people in Brazil, Indonesia, Singapore, and Thailand will get marginally more protection from financial-fraud malware; the global developer community will get a precedent that any platform owner can invoke. The trade-off is real on both sides, and the trade-off is the story.

What this means for you

If you're an Android developer whose app is distributed outside the Play Store:

  • The clock starts on 30 September for Brazilian, Indonesian, Singaporean, and Thai users. If your app has any users in those four countries and you do not plan to register with the ADC, those users will lose access — silently, not via an error message. The app will fail to install or fail to launch, depending on which Google Play Services version they have. Plan the comms now, not in September.
  • The "hobbyist" account type Google announced is a separate ADC account class, but the Terms of Service apply to it. The distinction is the price and the verification friction, not the obligations. Read the clause about "malware" before you sign.
  • For F-Droid specifically: the maintainers have not yet published the failure-mode guide (it was promised in the 1 July post), but the practical alternatives are (a) ship your own auto-updating APK channel that does not depend on F-Droid as the install vector, (b) advise your users to enroll in the developer-options escape hatch before the lockdown, knowing that the flow runs through Google Play Services and may change, or (c) accept that your users in those four countries will need to switch to a non-Android device for your app.

If you are an Android user in Brazil, Indonesia, Singapore, or Thailand:

  • Pre-enroll in the developer-options flow before 30 September if you rely on any non-Play app. The flow requires a 24-hour cooldown after you tap through the warning screens; if you do it on 29 September, the cooldown may not clear before the lockdown.
  • Back up the data from any F-Droid-installed apps that do not store data in the cloud. The apps may not survive the lockdown in a form that lets you retrieve local state, depending on how the verification layer interacts with already-installed packages.
  • The 7-day "allow temporarily" option in the escape hatch is the new default for everyone who has not opted into "allow indefinitely." Expect to re-confirm it every week for any non-Play app you rely on, or enroll in "indefinitely" once you trust the flow.

If you are building for a non-Android platform:

  • This is a preview of the policy shape Apple may move toward under pressure from EU DMA enforcement. Apple's existing "notarization" for macOS apps outside the Mac App Store is a lighter version of the same pattern; the EU has already pushed Apple to allow alternative app stores and alternative payment processors, and the next pressure point is alternative-distribution friction. Watch how the September rollout lands before betting that iOS will stay open.

What to do this week

# 1. If you maintain an open-source Android app with users in BR/ID/SG/TH,
#    publish a status update before 30 September. The F-Droid post
#    promised a guide "in the coming weeks" — if you can't wait for
#    it, the minimum useful thing is to tell your users which apps
#    will be affected and what the developer-options escape hatch
#    looks like on their device. Format: a single markdown page on
#    your project site, linked from the README and the GitHub
#    releases page, mirrored to Mastodon and Bluesky for reach.

# 2. If you are an Android user in one of the four affected countries,
#    enroll in the developer-options flow this week, before the
#    September lockdown, and confirm you can install and launch a
#    sideloaded APK end-to-end. The flow is nine steps and a 24-hour
#    cooldown; you do not want to discover the cooldown on the day
#    you need the app.

# 3. If you are a developer anywhere on the platform who has been
#    considering signing the Android Developer Console Terms of
#    Service "just to be safe," read clause 6.5 (the "malware"
#    termination clause) and the absence of a malware definition in
#    the document before signing. The clause is enforceable as
#    written, and the discretion in it accrues to Google, not to you.

# 4. If you build or fund policy work in any of the four affected
#    markets, push your local digital-rights coalition to file a
#    competition-authority complaint before 30 September. Brazil's
#    CADE and Indonesia's KPPU have both opened Android-related
#    competition cases in the last 18 months; the ADV rollout gives
#    them a concrete policy artifact to challenge rather than a
#    general "Google has too much market share" theory.

What we are deliberately not covering

This post is about the policy, the rollout, and the developer-side response. We are not covering: the long history of Aptoide's competition-law cases against Google (a separate beat, worth a dedicated post); the parallel EU DMA story for iOS (same shape, different platform); the technical mechanics of how the verification layer interacts with already-installed APKs (Google has not documented this publicly, and the speculation in the HN thread is not source-verified enough to ship); or the cryptocurrency-and-VPN-developer response, which is the next layer of the coalition politics and is still being organized. The "this is bigger than Android" hardware-ownership framing from the Keep Android Open site is also worth a longer treatment; the next post in this thread will pick it up.

Related reads

Disclosure

Drafted with AI assistance. Primary source: F-Droid, "What We Talk About When We Talk About Malware," 1 Jul 2026 (https://f-droid.org/2026/07/01/adv-malware.html). Secondary source: Keep Android Open coalition site (https://keepandroidopen.org), including the open letter signed by 71 organizations from 23 countries, the 100,000+ signatory petition, the developer-options escape-hatch walkthrough, and the supporter quotes (EFF, FSF, FSFE, ACLU, KDE, GNOME, Tor Project, Cory Doctorow). Counterbalancing source: Suzanne Frey, "A new layer of security for certified Android devices," Android Developers Blog, 25 Aug 2025 (https://android-developers.googleblog.com/2025/08/elevating-android-security.html), for Google's stated security rationale, the "50x more malware from sideloaded sources" statistic, the four-country rollout timeline (Sept 2026 in BR/ID/SG/TH, 2027+ global), the airport-ID-check framing, and the FEBRABAN / Indonesian Ministry / Thai Ministry / Developer's Alliance endorsements. HN discussion: item 48755965, 919 points and 385 comments at time of writing, front-page submission 2 July 2026. The 580 million affected-user figure is F-Droid's population estimate for the four rollout countries combined; the exact UN population totals are slightly different (F-Droid rounds up). The "100,000+ signatories" figure for the petition is the Keep Android Open site's number as of the post date; the count was rising. The exact ToS clause 6.5 is verbatim from the F-Droid post's quote; the underlying ToS was not directly fetched in this pass. The Cory Doctorow "Darth Android" framing is the coalition site's attribution, not a direct Doctorow publication. The EFF "identity-based gatekeeping is a censorship tool, not a security one" framing is also the coalition site's summary, not a direct EFF publication.

Sources

  • The F-Droid post — "What We Talk About When We Talk About Malware," 1 Jul 2026, https://f-droid.org/2026/07/01/adv-malware.html. Primary source for the "trojan horse" framing, the 4 billion Android devices estimate, the Play Protect vector claim, the ToS clause 6.5 quote ("Google may terminate Your access to the ADC..."), the absence of a malware definition in the ADC Terms, the ad-blocker-as-malware precedent, the "over 99% of Play developers' apps have been registered" auto-opt-in critique, the "hundreds of thousands" petition opposition, the 90% dislike figure on the developer roundtable video, the Gemini summary of the program's reception, the September 30 activation date for Brazil/Indonesia/Singapore/Thailand, and the open-questions list (install behavior, app disablement, data retrieval, telemetry scope). Author: marcprux (F-Droid contributor).
  • The Keep Android Open coalition sitehttps://keepandroidopen.org. Primary source for the 71-organization / 23-country open letter signatory list (EFF, FSF, FSFE, ACLU, KDE, GNOME, Tor Project, GitHub Store, Vivaldi, Tuta Mail, Codeberg, App Fair Project, Forbrukerrådet, FACiL, Molly, ANSOL, Software Liberty Association of Taiwan, Cryptee, Digitale Gesellschaft, Technopolice Bruxelles, Rocky Linux, La Quadrature du Net, Open Rights Group, CryptPad, FULU Foundation, Digital Rights Foundation), the 100,000+ petition signatory count, the developer-options escape-hatch walkthrough (nine steps, 24-hour cooldown, 7-day TTL for "allow temporarily"), the EFF framing ("identity-based gatekeeping is a censorship tool, not a security one"), the Cory Doctorow "Darth Android" framing, the F-Droid "existential threat" quote, and the Ars Technica "Google's Apple envy threatens to dismantle Android's open legacy" headline. The Historical-Employ129 Reddit comment ("600 million malware downloads… their own store is crawling with fake apps and straight up malware", 324 upvotes) and the WaffleMonster Slashdot quote on "sideload as psychological propaganda" are both from the coalition site's curated supporter-quotes section.
  • The Android Developers Blog announcement — Suzanne Frey, "A new layer of security for certified Android devices," 25 Aug 2025, https://android-developers.googleblog.com/2025/08/elevating-android-security.html. Primary source for the security rationale ("malicious actors hide behind anonymity"), the "50 times more malware from internet-sideloaded sources" Play Protect statistic, the "ID check at the airport" framing, the rollout plan (Sept 2026 BR/ID/SG/TH, 2027+ global), the Play-side verification precedent ("since we implemented verification requirements on Google Play in 2023"), the separate hobbyist ADC account type, the FEBRABAN / Indonesian Ministry of Communications and Digital Affairs / Thai Ministry of Digital Economy and Society / Developer's Alliance endorsements, and the "developers will have the same freedom to distribute their apps directly to users through sideloading or to use any app store they prefer" framing.
  • The Hacker News thread — item 48755965, "Android Developer Verification: Threat masquerading as Protection," submitted 2 July 2026, 919 points and 385 comments at time of writing, front-page submission. Primary source for the "hardware-vendor control of post-purchase device behavior" framing that recurs in top comments, the Google-Play-Protect-vs-DEV-policy critique, the "we want Fable"-shaped pattern of cross-cohort frustration with platform owners, and the developer-tools-vs-platform-owner framing that scales the critique beyond the security audience. Numbers on HN are moving as the post ages.

Wednesday, July 1, 2026

Godot Banned AI Code. Maintainers Are Done Subsidizing Slop.

The Godot Foundation, which maintains the open-source game engine behind Slay the Spire 2 and The Case of the Golden Idol (per PC Gamer's coverage of the announcement), has updated its contribution policy to forbid AI-authored code, AI-submitted pull requests, and AI-generated text in human-to-human communication. The Foundation framed the change in unusually direct language: "AI cannot take responsibility, and we can't trust heavy users of AI to understand their code enough to fix it." The line that lands is the part about mentoring. The Foundation says reviewing AI slop is "demoralizing" because the maintainers' feedback is "just being absorbed by a machine and not going towards mentoring a potential future maintainer." This is not a moral panic about AI quality. It is a maintenance-economics statement. Open source has been subsidizing itself on a pipeline of new contributors who learn to maintain by getting their early PRs reviewed. AI slop has crowded that pipeline out, and Godot has decided the cost of waiting for the tools to mature is more than the cost of banning them.

What the policy actually forbids

The Foundation's announcement post lays out four explicit prohibitions, with the first one already enforced as an auto-ban on the GitHub repository:

  • No autonomous AI agent use or vibe coding. The Foundation describes the existing auto-ban as continuing.
  • No use of AI to generate substantial pieces of code. "AI assistance should be limited to menial things (like code completion, regex, or find and replace)." Disclosure is required even for permitted use.
  • No AI-generated text in human-to-human communication — issues, PR descriptions, proposals, comments. "This is a basic principle of respect." Machine translations of human-written text are still acceptable.
  • All PRs must be reviewed and approved by a human before merging — the existing rule, restated explicitly.

The third item is the one most other projects have not yet written down. Slack/Discord AI summaries, ChatGPT-polished issue reports, and LLM-generated PR descriptions are the things that quietly make every maintainer interaction feel like talking to a machine. The Foundation is putting that on the policy page.

The Foundation also added a non-AI-specific gate: new contributors (defined as anyone with three or fewer merged PRs) cannot submit "new features or significant re-factoring" without explicit permission from a maintainer. Bug fixes and documentation come first. The point is to require that new contributors take the time to learn the codebase and build trust before tackling ambitious work. Combined with the AI ban, the policy amounts to a two-pronged defense: it slows down the inflow of low-context, high-volume submissions, and it explicitly routes the remaining inflow into the kind of work that builds future maintainers.

The economic argument underneath the moral one

The part of the post that every other story is going to skip is the maintenance-economic one. The Foundation describes its reviewer pool as "small" and says reviewing PRs is "demanding" and "we can't keep up with everything coming in." The number of open Godot PRs has become a meme inside the community, in the way that GitHub-backlog screenshots of any sufficiently popular repo do. The Foundation's framing of the AI problem is not "the code is bad." It is "the code is fine, the volume is bad, and the volume of the kind of code that trains reviewers is what is collapsing."

This is the same shape as the Fedora AI agent merging bad code story from three weeks ago, but with the failure mode inverted. Fedora's problem was that the agent had been given write access to a real codebase and the merge was wrong in a way the humans downstream couldn't see. Godot's problem is upstream: the PR volume is generated by humans (or agents acting on behalf of humans) who are not investing the time to learn the codebase before contributing, and the maintainers are the ones paying the cost. Both stories end in the same place — a maintainer pipeline that cannot scale linearly with the volume of submissions it receives. AI is the new scaling tax on the attention budget of every maintainer in the world.

The Foundation's "new contributors with three or fewer merged PRs cannot submit new features" gate is the more interesting policy lever, because it operates independently of the AI question. Even if the AI ban disappeared tomorrow, the new-contributor gate would still be there, and it is the part of the policy that directly addresses the maintenance-economics problem. The gate is also a soft version of the same argument that the Norway elementary AI ban made about a different pipeline: that the cost of skipping the human learning step is paid later, by the people who are supposed to be the next generation of maintainers. The Norwegian case was about children; Godot's case is about new open-source contributors. The mechanism is identical — short-term productivity gains that look like a win, that turn out to be a loan on the future of the project.

The AI-slop precedent that led here

Godot is not the first open-source project to draw this line. It is the highest-profile one to do it formally, with a published policy and an explicit auto-ban. The pattern in the months leading up to this announcement reads as a series of warning shots:

  • RPCS3, the popular PS3 emulator, clamped down on AI submissions, telling contributors to "leave behind something useful to humanity when you're gone, instead of peddling slop." (PC Gamer)
  • s&box, the Garry's Mod sequel, launched with creator Garry Newman's permissive AI policy: "I think eventually the slop will just fall to the bottom," he said. "We can't say don't use AI, because we use AI in our coding all the time. It's useful, it's fast." The framing was permissive — trust the community to ignore slop, don't filter at the gate. (PC Gamer)
  • The Fedora AI agent story in June (the Anaconda package that was reverted after an LLM agent merged its own PR with a buggy fix) was the moment "AI agent wrote code that broke the build" became a documented, post-mortem-able category.

What Godot is adding is the policy template. The Foundation's text is going to be copy-pasted, with varying degrees of modification, by other projects over the next quarter. The decision to call out the "AI cannot take responsibility" line is the giveaway that the policy is written to be quoted, not just enforced. It is the most quotable sentence in the AI-and-open-source debate since the npm "Color.js" incident in 2022, and it is going to do the same work.

What Godot is not saying

The Foundation's post is conspicuously quiet on the licensing question. Godot is MIT-licensed, which means anyone can fork it, build a closed-source game on top, and use whatever tooling they want to do it. The Foundation cannot stop a game studio from using Claude Code to build their next Godot project, and they are not trying to. The policy is about contributions to the engine itself, not about downstream use. This is a boundary other open-source projects will have to draw carefully: the line between "we will not accept your AI-generated PR" and "we will not allow our software to be used downstream with AI tools" is the line between a contribution policy and a use policy, and they are different in ways that matter legally. The Godot policy is firmly on the contribution side of that line.

The Foundation is also not saying AI tools are bad for the maintainers themselves. "Menial things" — code completion, regex, find-and-replace — are explicitly fine. The line is at "substantial pieces of code" and at "vibe coding," which the Foundation defines as the workflow where a human submits a PR whose contents they did not write and cannot defend. The policy is hostile to the unaccountable submission, not to the tool. A maintainer using Copilot to write a regex is not the target. A contributor submitting a 500-line PR they cannot explain to a reviewer is.

The third thing the Foundation is not saying is that this is just a code-quality problem. The story of an autonomous agent in production that ran up a $6,531 AWS bill scanning a hobby network nobody asked it to scan is a different shape of the same problem: an agent operating without a human accountability loop did something its operator could not have intended and could not stop. Godot's policy is the contribution-side answer to the same question — what do you do when the bottleneck of trust is no longer the human's hands but the human's understanding? The Foundation's answer is to require that the human who submits the work be the human who understands it. The cost of not requiring that is a maintainer pool that runs out of new entrants, and a contributor pool that runs out of mentors, and an open-source economy that runs out of the people who keep it going.

What this means for you

If you maintain an open-source project:

  • The Godot text is the best starting template you'll find. Adapt the four prohibitions and the new-contributor gate to your own repo, and be explicit that "AI-generated text in issues/PRs" is a separate rule from "AI-generated code." The text rule is the one that will get the most pushback, and it is the one that needs to be the clearest.
  • The new-contributor gate does not require an AI ban to be useful. If you are drowning in new-feature PRs from people who have not yet learned the codebase, the gate is a structural fix that works regardless of how the PRs were written. Three merged PRs is a reasonable threshold; pick yours based on what your reviewers can absorb.
  • Publish the policy in the contribution guide, not just the announcement post. The reason the Godot post is going to be cited is that it is unambiguous. Ambiguous contribution policies get argued about on every PR.

If you are an AI-using developer who contributes to open source:

  • "Use AI for menial things" is more permissive than it sounds. It covers most of what most people actually use Copilot/Cursor/Claude Code for: function signatures, regex, boilerplate, refactor-mechanical-tasks. The thing it does not cover is the workflow where you prompt an agent, get a 500-line PR, and submit it without being able to defend each section in a code review. The test is not "did a model help?" It is "can you walk the maintainer through it?"
  • If you are using an agent to submit a PR, write the PR description yourself. Machine translations of human text are explicitly fine; machine-generated text in human-to-human communication is not. The Foundation is making a sharp distinction between "the model wrote the code" and "the model wrote the words we say to each other about the code," and the second is the one that breaks the mentoring relationship.
  • Disclosure is the new courtesy. "I used AI to help write this regex" is a sentence that costs nothing and protects the maintainer's time. "I used AI to generate the whole function" with no disclosure is the kind of thing that gets the next Godot policy written in the first place.

If you are a maintainer of a private codebase at work:

  • The Godot policy is the canary, not the rule. Private repos are not the AI-slop-pressure target the way open source is, because the review pool is paid and the volume is bounded. But the mentoring argument applies. If the people you are training to be senior engineers next year are doing their work this year by submitting LLM-generated code they cannot defend, you are spending 2026's mentoring budget on 2027's productivity cliff. The lever is the same: the test is not "did a model help?" It is "can they walk you through it?"

What to do this week

# 1. Audit the last 20 pull requests on your repo. For each one, ask:
#    - Did the contributor write a PR description in their own words,
#      or did it read like ChatGPT output?
#    - When you left a review comment, did the next reply engage with
#      the substance of your feedback, or did it read like an LLM
#      smoothing the conversation?
#    - Could the contributor explain the change in 5 minutes on a call?
#    Count the "no" answers. If more than half are "no", your pipeline
#    is already paying the Godot tax.

# 2. Write a one-paragraph contribution policy. The Godot template is:
#
#    "We do not accept AI-authored code, AI-submitted pull requests,
#     or AI-generated text in issues, PR descriptions, or comments.
#     AI assistance for menial tasks (code completion, regex, find and
#     replace) is fine, with disclosure. New contributors (3 or fewer
#     merged PRs) should start with bug fixes and documentation.
#     All PRs must be human-reviewable from top to bottom."
#
#    Adapt the threshold (3 PRs is Godot's; yours may be 1 or 5) and
#    post it in CONTRIBUTING.md.

# 3. Pin the policy to your repo's contributing guide *and* link it
#    from the PR template. A policy in the docs is a policy. A policy
#    in the PR template is the policy the contributor is reading at
#    the moment they would otherwise copy-paste the LLM output.

# 4. If you are an AI-using developer who wants to keep contributing:
#    write the PR description yourself. Every time. The 5 minutes it
#    costs you is the difference between a maintainer seeing you as a
#    future maintainer and a maintainer closing the tab.

The Godot Foundation has, for the moment, the strongest contribution policy on AI in any major open-source project. It is going to be quoted, copied, and litigated over the rest of the year. The part worth holding onto is not the ban — bans are easy to write and easy to argue about. The part worth holding onto is the mentoring argument. The Foundation is not saying "AI code is bad." It is saying "AI code, submitted uncritically, breaks the pipeline that produces the people who can review AI code in five years." That is a maintenance-economics argument, and it is one every project that depends on unpaid maintainer labor is going to have to make for itself, sooner rather than later.

Disclosure

Drafted with AI assistance (Claude, Anthropic). All factual claims about the Godot Foundation's contribution policy were verified against the primary source at https://godotengine.org/article/contribution-policy-2026/ and PC Gamer's coverage at the URL listed in Sources, both fetched on 2026-07-01 with curl --compressed. The quoted "AI cannot take responsibility" and "demoralizing" lines are direct quotes from the Foundation's announcement. The "three or fewer merged PRs" figure is taken directly from the announcement. The "Slay the Spire 2" and "Case of the Golden Idol" examples are from PC Gamer's coverage. Internal-link targets are existing posts on this blog. The original argument — that the Godot policy is a maintenance-economics statement about a maintainer pipeline being outbid by AI slop volume — is the author's framing, not a claim sourced from any single article.

Sources

Claude Sonnet 5: Anthropic's Quiet 30% Tokenizer Price Hike

Anthropic launched Claude Sonnet 5 on 30 June 2026 at $3 per million input / $15 per million output tokens, with a one-third discount to $2/$10 through 31 August and a Pareto-frontier pitch that the new model "covers a much wider range of cost-performance options" than Sonnet 4.6. The HN thread hit 813 points and 459 comments inside a day, and the loudest complaint in it is one the launch post does not address. Sonnet 5 ships with a new tokenizer that produces approximately 30% more tokens for the same text. At the same headline price, a 30% token expansion is a stealth price hike. The launch's "introductory pricing" through August is a window for buyers to be trained on a price that disappears two months from now, when the real bill starts arriving. The post you should be writing about Sonnet 5 is not "Anthropic's new workhorse." It is "Anthropic raised prices and used a tokenizer change to do it."

The numbers behind the headline

The price comparison Anthropic's launch post invites you to make is the wrong one. Sonnet 5 lists at $3/$15, the same as Sonnet 4.6; Opus 4.8 lists at $5/$25, the same as Opus 4.7. The launch chart shows Sonnet 5 covering the price band that 4.6 used to occupy, with medium-effort Sonnet 5 sitting "well below" Opus 4.8 in cost and "above" Opus 4.8 in capability at xhigh effort. That story is accurate on the chart's axes. The chart's axes are wrong.

The right axis is cost per task, not cost per token. Artificial Analysis ran Sonnet 5 against its standard suite ahead of launch and published the result on 30 June. The headline number: Sonnet 5 costs $2.29 per task on the Intelligence Index, roughly 2x more than Sonnet 4.6 and 15% more than Claude Opus 4.8 at standard pricing. The 2x increase is "driven entirely by increased token usage" — Sonnet 5 uses ~40% more output tokens per Intelligence Index task than 4.6, and ~3x the agentic turns on AA-Briefcase and GDPval-AA. The 15% gap versus Opus 4.8 is the part the launch's Pareto chart does not show you, because the chart cuts off before the comparison gets embarrassing. Once you account for the token expansion and the higher per-task turn count, the model that was supposed to be "between Sonnet 4.6 and Opus 4.8" costs more per task than the model above it.

The promotion masks the real number for the rest of the summer. Through 31 August, $2/$10 is the standard price, not a discount; the launch page describes it as "introductory pricing" that "moves to standard pricing at $3/$15" on 1 September. Two months of buyer behavior will be trained on a price that no longer exists. When the promo expires, anyone who integrated Sonnet 5 into a per-token budget forecast is going to discover that the model they actually bought costs ~2x what 4.6 did on the same workload. Anthropic knows this. The promo is the launch.

What the new tokenizer actually changes

The footnote in the system card that nobody on the launch thread is quoting in full is this: "Claude Opus 4.7 and later Opus models, Claude Fable 5, Claude Mythos 5, Claude Mythos Preview, and Claude Sonnet 5 use a newer tokenizer that contributes to their improved performance on a wide range of tasks. This tokenizer produces approximately 30% more tokens for the same text." Sonnet 5 is the first time the new tokenizer is being introduced to the Sonnet line. (Fable and Mythos are export-restricted and not in general availability, so for most developers Sonnet 5 is the first model where the change shows up in their bill.) Anthropic's footnote estimates a 1.0–1.35x token expansion depending on content type; coding-heavy workloads sit on the high end.

The new tokenizer is a deliberate trade: more tokens per unit of text in exchange for the "most agentic Sonnet model yet." The launch post does not price the trade. A 30% token expansion at the same per-token price is a 30% effective price hike. The launch calls the new price "the same." Both statements are technically true. They are also in tension, and the launch picks the framing that flatters the model.

The HN commenter who put it most directly was ianberdin, who runs playcode.io and benchmarks every Anthropic release against his own product workload: "Anthropic outsmarted everyone again. They released Sonnet 5 with a temporary price reduction until August. Everyone was excited, but in reality, they increased the tokenizer size by 50%. As a result, the actual cost went up by 50%, they shifted everyone's attention to decrease." The 50% number is his workload, not the system card's 30% — but the shape of the argument is correct. Sonnet 5's headline price is a number that no longer corresponds to what the model actually costs to run on a coding task.

The Pareto frontier, redrawn honestly

The launch post's strongest case for Sonnet 5 is the cost-performance curve: at low and medium effort levels, Sonnet 5 delivers most of Opus 4.8's quality at a fraction of the per-token price, and that's a position Sonnet 4.6 could not hold. The chart is right about that. The chart is wrong about the upper half.

At medium and high effort, Sonnet 5 is in a tight price band with Opus 4.8 on the same task; at xhigh effort, it costs roughly the same as Opus 4.8 on agentic search and computer-use benchmarks, with mixed results. The launch's framing of "Sonnet 5 covers a much wider range of cost-performance options than Sonnet 4.6" is correct, but the new range now extends into a region where Opus 4.8 is a strictly better buy. The cost curve crosses itself somewhere around medium effort: below the crossover, Sonnet 5 wins on cost-per-quality; above it, Opus 4.8 wins on both axes.

The HN community reading of the chart converged on the same shape. The most upvoted top-level comment was a direct ask: "I'm struggling to understand why I'd ever use this instead of just using a lower effort level for opus given on many of the benchmarks listed the cost per task rises above opus at anything higher than medium effort." The second-most-upvoted answer was even more direct: "Generally run Sonnet on low, otherwise use Opus." That is not the front-page positioning Anthropic is going for, and it is the honest read of the cost curve. The community's working theory for production is the spec/plan-with-Opus, implement-with-Sonnet split several comments named. The cost saving is real, but it is the saving you get by routing the right task to the right model — not the saving the launch chart implies you get by using Sonnet 5 everywhere.

Where Fable was, and the gap that Sonnet 5 is filling

A second pattern in the HN thread is the volume of "we want Fable" comments, which outnumber the "Sonnet 5 is great" comments at the top. Fable 5 and Claude Mythos Preview are higher-capability models not generally available due to export-control restrictions; they were scheduled for general release in mid-2026 and remain restricted. Sonnet 5 is in part the model you ship when the model you actually wanted to ship is not available. The launch does not say this in so many words, but the timing is suggestive: a flagship model launch, in the same month as the Fable export-control discussion has been going on, with a name that jumps from 4.6 to 5 to claim a capability-anchor slot, and with a Pareto curve that does not extend as far as the model the company actually wanted to ship this quarter would have extended it.

The reframe the launch post invites — "Sonnet 5 narrows the gap with Opus 4.8" — is true in the direction it points, but the gap is a gap left by Fable. The most capable model Anthropic has shipped to general availability in 2026 is Opus 4.8 (March), and Sonnet 5 is the model that arrives three months later to fill the developer-tier slot next to it. Calling that "the most agentic Sonnet" is a Sonnet-line achievement, not a frontier achievement. The frontier model — Fable 5, or Mythos 5 — is still gated.

Where the new model actually loses

Two external benchmarks from the launch day put Sonnet 5 behind competitors in the same price band. A third-party proofreading benchmark reported Sonnet 5 as "definitely better than Sonnet 4.6, but inferior on both quality and cost to GLM 5.1, GLM 5.2, Gemini 3.1 Flash, and Gemini 3.1 Pro." aibenchy.com's broad comparison put Sonnet 5 at "GLM-5.2 level, at 2x cost, but also 2x faster" — defensible for latency-sensitive workloads, indefensible for cost-sensitive ones. A third HN summary converged: "Roughly on par with GLM 5.2 at 5x the price." The "5x" is from a different reviewer with a different workload, but the shape of the gap is consistent. Sonnet 5 is in a band where the cost-per-quality comparison is now a three-way fight between Anthropic, Google's Gemini 3.1 family, and Z.AI's GLM 5.2 — and Anthropic is not winning the cost axis against either of them.

The launch post is structured to obscure this. The first chart is "Sonnet 5 vs Sonnet 4.6 vs Opus 4.8" — a comparison inside the Anthropic product line. The chart that would make the pricing claim falsifiable is "Sonnet 5 vs GLM 5.2 vs Gemini 3.1 Pro at the same per-task cost," and that chart is not in the post. AA's framing is the same as the launch's: "Sonnet 5 is the #5 model on the Artificial Analysis Intelligence Index, only 2-3 points behind GPT-5.5 (xhigh) and Opus 4.8 (max)." The #5 ranking is fine; the cost curve behind it is the part that matters, and the launch does not show it to you.

What this means for you

If you're a developer picking a model for a coding agent in July 2026:

  • The right way to think about Sonnet 5 is as a Sonnet 4.6 replacement with a new tokenizer, not as a budget Opus. At low effort levels, it is meaningfully better than 4.6 on agentic work. At medium and above, test it against Opus 4.8 on your workload before committing — the cost curve in the launch chart understates what you will actually pay.
  • If you were integrating Sonnet 4.6 into a per-token budget forecast, the new model will cost roughly 1.4-1.5x the same task, not 1.0x. The introductory pricing of $2/$10 makes the summer look cheaper; the real bill arrives in September.
  • If you are cost-sensitive, GLM 5.2 is a credible alternative at substantially lower cost (we covered the GLM 5.2 release two days ago). If you are latency-sensitive, Sonnet 5 is faster on several workloads. The mid-tier is where the comparison is closest, and it is the band where you should run your own evals.
  • The Fable-shaped gap is real. If you were waiting for a frontier-capable Anthropic model with general availability, Sonnet 5 is not that model. It is the workhorse that ships while you wait.

If you're running a model-routing pipeline:

  • The "spec with Opus, implement with Sonnet" pattern that the HN thread converged on is a real production pattern, and it is the one the launch chart most directly serves. A router that uses Opus for planning and Sonnet for execution captures the cost saving the chart claims, and avoids the upper-half cost curve the chart hides.
  • Effort levels are now the primary cost lever, not model choice. The same Sonnet 5 call at low effort is roughly 6x cheaper per task than the same call at max effort on AA's knowledge work benchmarks. A router that pins effort level to the difficulty of the task — easy → low, planning → high, deep reasoning → Opus — will save more than a router that picks a model and runs it at default effort.
  • For local-inference cost-compression stories, see the Qwen 3.6 27B local sweet spot and the DSpark Pareto-frontier shift — both bear on the "is the hosted model still cheaper?" question this launch reframes.

If you're pricing a product that uses these models:

  • The 30%-tokenizer-expansion point is the one to remember. Tokenizer changes that hold the per-token price constant are price hikes, even when the price page says otherwise. The 2026 lesson: the headline rate is no longer the contract; the actual cost is the headline rate times the tokenizer expansion times the per-task token count.
  • The promo window is the contract for the rest of the year. If you are signing a multi-month integration agreement that started in July 2026, the price you negotiate at is the $2/$10 price, not the $3/$15 price. Lock it in writing.

What to do this week

# 1. Run the same prompt through Sonnet 4.6, Sonnet 5, and Opus 4.8 on a
#    task representative of your real workload, and log both the response
#    quality and the actual token count, not the per-token price.
#    The Anthropic API does not expose tokenizer-expanded token counts
#    directly; you have to call the cost-calculator endpoint
#    (POST /v1/messages/cost) and compare against the per-MTok price.

# 2. The introduction of a 1M token context window (Sonnet 4.6 -> Sonnet 5)
#    is real, but the cache pricing is unchanged: $3.75 per million tokens
#    for cache writes (5-min TTL), $0.30 per million for cache hits.
#    Any integration that pre-computes a long prefix once and reuses it
#    many times is the right shape to capture the per-task savings.

# 3. Update your router's effort default. The "xhigh" effort level is
#    new on Sonnet 5 (it previously existed only on Opus 4.8). Most
#    routing pipelines that pinned "high" as a ceiling should now
#    allow "xhigh" for the tasks where the user explicitly asks for
#    deeper reasoning, and should test whether the marginal cost
#    of xhigh is justified on each task class.

Disclosure

Drafted with AI assistance. Primary source: Anthropic, "Introducing Claude Sonnet 5," 30 Jun 2026 (https://www.anthropic.com/news/claude-sonnet-5). Secondary: Artificial Analysis, "Claude Sonnet 5: strong agentic performance at a higher cost per task," 30 Jun 2026 (https://artificialanalysis.ai/articles/claude-sonnet-5-agentic-cost); HN item 48736605 (813 points, 459 comments at time of writing). The $2.29 per Intelligence Index task, the 1.4x output-token increase vs Sonnet 4.6, the 3x agentic turns on AA-Briefcase and GDPval-AA, the 15% per-task premium over Opus 4.8, the 1M context window, the cache pricing ($3.75 writes / $0.30 hits), and the 5 effort levels are from Artificial Analysis. The "approximately 30% more tokens" tokenizer claim and the 1.0-1.35x range are from the Sonnet 5 system card. HN commenter ianberdin's 1.5x workload figure and the "Roughly on par with GLM 5.2 at 5x the price" line are single-comment paraphrases. The Errata-Bench and aibenchy third-party comparisons are paraphrased from the thread.

Sources

  • The Anthropic launch post — "Introducing Claude Sonnet 5," 30 Jun 2026, https://www.anthropic.com/news/claude-sonnet-5. Primary source for the headline $3/$15 per-million-token price, the introductory $2/$10 pricing through 31 Aug 2026, the 1M context window, the safety eval summary, the partner quotes (Zimu Li, Daniel Shepard, Fabian Hedin, Yusuke Kaji, Neel Chotai, Sualeh Asif, Dominic Elm, Mauricio Wulfovich, Ryadh Dahimene, Eric He), and the BrowseComp / OSWorld-Verified cost-performance charts. The 30 June changelog note about the BrowseComp chart methodology correction is also from this post. The "narrowing the gap with Opus 4.8" framing is Anthropic's; the per-task cost critique in this blog post is the blog's.
  • The Artificial Analysis analysis — "Claude Sonnet 5: strong agentic performance at a higher cost per task," 30 Jun 2026, https://artificialanalysis.ai/articles/claude-sonnet-5-agentic-cost. Primary source for the $2.29 per Intelligence Index task cost, the 1.4x output token increase over Sonnet 4.6, the 3x agentic turns on AA-Briefcase and GDPval-AA, the 15% higher per-task cost than Opus 4.8 at standard pricing, the #5 ranking on the Intelligence Index, and the 6x effort-level scaling on GDPval-AA. The cache pricing ($3.75 write / $0.30 hit, 5-min TTL), the 1M context window, the 5 effort levels (low, medium, high, xhigh, max), and the comparison to GLM 5.2 / Gemini 3.1 family are all from this article.
  • The HN discussion — Hacker News item 48736605, "Claude Sonnet 5," submitted 30 Jun 2026, 813 points / 459 comments at the time of writing. The "spec with Opus, implement with Sonnet" pattern is paraphrased from multiple top-level comments (phillipcarter, ianberdin, and others); the "Generally run Sonnet on low, otherwise use Opus" formulation is from a single HN thread reply. The "we want Fable" pattern is from at least three top-level comments. The ianberdin 1.5x workload figure is from his comment; the "Roughly on par with GLM 5.2 at 5x the price" line is a paraphrase of taytus's comment. The "Fable export-control" framing is HN-thread consensus, not Anthropic's. Numbers in this HN thread are moving as the post ages.
  • The system card reference — "Claude Sonnet 5 System Card," Anthropic, https://anthropic.com/claude-sonnet-5-system-card and the PDF at https://www-cdn.anthropic.com/d9bb04416ffe1352af84721476c1fa9994c07fde/Claude%20Sonnet%205%20System%20Card.pdf. Primary source for the "approximately 30% more tokens for the same text" tokenizer claim, the safety eval comparisons, and the 14-point CritPt improvement vs Sonnet 4.6 (which still leaves Sonnet 5 behind GLM 5.2, Opus, and GPT-5.5 on that benchmark). The "1.0-1.35x" range is the system's own estimate.

Tuesday, June 30, 2026

Qwen 3.6 27B Is the First Local Model That Actually Codes

Qwen 3.6 27B is a model that you can run on a laptop, that scores a 37 on Artificial Analysis (roughly mid-2025 frontier — Claude Sonnet 4.5, GPT-5 territory), and that you can wire into OpenCode with five lines of JSON. It shipped this week and hit the top of Hacker News with 995 points and 644 comments. The reason the discussion has outgrown the usual "local models are toys" cynicism is that the experiment doesn't behave like a toy. It behaves like a pricing announcement disguised as a model release. The local-AI community has been waiting for a model that pulls the cost-per-task curve below the hosted APIs, and Qwen 3.6 27B is the first one that does it on a MacBook without heroic quantization or a datacenter GPU. The interesting question isn't whether the model is good — it is — but what happens to the inference economy when the sweet spot for coding isn't a hosted service.

The blog post that did most of the work is Piotr Migdał's "Qwen 3.6 27B is the sweet spot for local development," published on the Quesma blog on 29 June 2026 and submitted to HN as item 48721903. Migdał runs the model on a MacBook Max M5 128GB and benchmarks it across MLX and llama.cpp against the mixture-of-experts Qwen 3.6 35B A3B and a quantized DeepSeek V4 Flash variant called DwarfStar4. The benchmark numbers and the test setup are reproducible (he links the benchmark script), and the conclusion — that the dense 27B outperforms the MoE 35B A3B on real coding tasks despite being roughly a third of the speed — is the part that should change how anyone in this space talks about MoE versus dense tradeoffs.

The numbers that matter

The Artificial Analysis index is a single number summarizing reasoning, knowledge, and instruction-following across a standard eval suite. Migdał lines up four data points that put Qwen 3.6 27B in perspective: Gemma 4 31B sits at 29 (roughly late-2024 frontier, o1 / Claude 3.5 Sonnet), Qwen 3.6 35B A3B at 32 (early-2025 frontier, o3 / Claude 4 Sonnet), Qwen 3.6 27B at 37 (mid-2025 frontier, GPT-5 / Claude Sonnet 4.5), and DeepSeek V4 Flash at 40 (late-2025 frontier, GPT-5.2 / Claude Opus 4.5). The 27B beats the 35B A3B by 5 points on this index even though the 35B A3B has 35 billion parameters and only activates about 3 billion at inference time. That's the counterintuitive claim worth sitting with: the active-parameters-per-token count is not the bottleneck. Dense 27B with a real training budget is.

Throughput is the other axis the benchmark calls out. On the M5 128GB with no multi-token prediction, Qwen 3.6 27B delivers 17-18 tokens per second. With MTP enabled (the draft-MTP flag that uses a fast auxiliary model to predict subsequent tokens), that climbs to 32 tokens per second. The MoE 35B A3B is faster on the same hardware — 93 tok/s on llama.cpp, 105 tok/s with MTP — but on Migdał's coding benchmarks the 27B produces higher-quality output. The tradeoff is straightforward: a third as much code, of noticeably higher quality, on the same laptop. For vibe coding where you're generating function bodies and tests, the 32 tok/s ceiling is well above what you can read.

For NVIDIA hardware the picture shifts but the conclusion holds. Commenter gfosco on the HN thread reports running the same model on an RTX 5090 at Q6_K quantization with Q4_0 KV cache, getting 50 tokens/s consistently at 123k context using roughly 28GB of a 32GB VRAM budget via LM Studio. The 123k context figure is interesting on its own: the model's native context is 256k tokens, and a single consumer GPU is using more than half of that budget in production.

What changed since the last "local model that actually works"

The local-AI community has been through three cycles of this announcement since 2023. Llama 2 70B ran but felt a generation behind. Llama 3 70B closed most of the gap but required a Mac Studio with 192GB of RAM or two datacenter GPUs. Llama 3.1 405B was technically open-weights but the inference cost put it back in hosted territory. Gemma 4 31B was the first model where "running locally" and "good at coding" overlapped for real users, and it became the default for a generation of developers. Qwen 3.6 27B is the second one, and the gap between Gemma 4 and Qwen 3.6 on Artificial Analysis is 8 points — equivalent to roughly a year of frontier-model progress, compressed into a model that fits in a smaller memory footprint.

Quantization matters more than the index number. The default release is BF16 (about 54GB); the practical quantizations are Q8_0 (about 27GB on disk per the unsloth GGUF), Q4_K_M (around 18GB), and lower. The 8-bit Q8_0 quant is the recommended baseline because the quality loss against the BF16 reference is small on most coding tasks; the 4-bit quants are where you trade quality for size. The MTP (multi-token prediction) variant of the GGUF — unsloth/Qwen3.6-27B-MTP-GGUF — adds a draft model that lets the sampler commit several tokens per forward pass, which roughly doubles throughput on supported hardware. The combination that lands the laptop demo is 27B dense + Q8_0 + MTP + 128GB unified memory + MLX or llama.cpp. None of those four components is new; what is new is that the same hardware that couldn't run last year's local-model-equivalent-of-frontier now runs this one comfortably.

The pricing announcement disguised as a model release

The hosted-API inference economy is built on a specific cost-per-task curve. Anthropic's Claude Sonnet 4.5 lists at $3 per million input tokens and $15 per million output tokens. GPT-5 standard tier is similar. A developer running Qwen 3.6 27B on a 5090 has zero marginal cost per token after the GPU purchase — a 5090 at $2,000 amortized over a three-year useful life is roughly $55/month, which works out to several million tokens of generation per day before the per-token cost even approaches a hosted API's. The hosted-API cost only amortizes if your time has zero opportunity cost and you never run a long context. For a developer using a coding agent across a workday, that condition fails by mid-morning.

Migdał makes the second-order point at the end of his post and it's the one that will outlast the model release: "we will have models smarter than current state of the art, while runnable on local devices, maybe even smartphones. Current models combine both raw intelligence and factual knowledge in the same weights. Future models will likely separate that, offloading a lot of knowledge to tool calling." That is the trajectory to watch. Qwen 3.6 27B is the model that closes the gap between local and hosted; the question the rest of 2026 answers is whether anything closes the gap between local and frontier, and at what pace. A 27B dense model scoring a 37 when the leading open-source model six months earlier scored a 29 is roughly 8 points of progress per release cycle on the AA index. If that pace holds, the 2027 local sweet spot is a 27B-class model scoring in the mid-40s — above DeepSeek V4 Flash, inside the late-2025 frontier envelope, on the same hardware.

What this means for you

If you're a developer who has been using a hosted coding agent (Claude Code, Codex, Cursor's default model) and paying per-token:

  • The cost crossover is here for most individual developers. A used 5090 at $1,500–$1,800 plus a 32GB-or-better Mac Studio covers the local inference hardware. The break-even against a $20/month Cursor or Claude Pro subscription is roughly three months for moderate use, and the marginal cost per additional token is zero.
  • The 27B-versus-35B-A3B tradeoff is real and worth testing on your own tasks. The 35B A3B is faster but the 27B produces code you ship with less editing. The Migdał benchmark script is the right starting point but the right benchmark is your own workload.
  • For long-context work (anything that fits in 100k+ tokens), the local story is now competitive with hosted. The 5090-at-Q6_K-Q4_0-KV report of 50 tok/s at 123k context is the configuration worth cloning.

If you're running an inference-heavy product:

  • The hosted-API cost curve assumes model weights don't commodify. Qwen 3.6 27B's open-weights release compresses the price floor for any task the model can do competently. If your product's value-add is "host a good-enough coding model," the gross margin just got thinner.
  • The interesting direction is harness, not model. The blog's OpenCode recipe is six lines of JSON; that recipe is the same shape across hosted and local models. The competitive differentiation moves from "which model is best" to "which scaffolding produces the best agent loops."
  • Inference-economics stories (we covered OpenAI's Jalapeño chip and DSpark's Pareto frontier shift earlier this week) are now framed by an open-weights ceiling that didn't exist a year ago.

If you're deciding which hardware to buy for local inference:

  • 32GB unified memory (Mac Mini M4 Pro / M5 Pro, Framework Desktop, Strix Halo boards) is the new minimum. The recent two-Strix-Halo 256GB build we covered is overkill for Qwen 3.6 27B but is the right platform if you also want to run GLM 5.2 or DeepSeek V4 Flash at higher precision.
  • An RTX 5090 at Q6_K + Q4_0 KV is the single-GPU target — 50 tok/s at 123k context, fits the model and most of the KV cache in 32GB. Two 5090s in an NVLink setup is the workstation tier for sustained agentic coding.
  • Apple Silicon's unified-memory architecture still wins for batch experiments because the KV cache scales with available memory instead of competing with the model weights for VRAM. MLX on a Mac Studio M5 Ultra is the right rig if you spend more time iterating on prompts than shipping code.

What to do this week

# 1. Get the model. The unsloth GGUF is the one that ships with MTP support.
huggingface-cli download unsloth/Qwen3.6-27B-MTP-GGUF \
    --include "Qwen3.6-27B-Q8_0.gguf" \
    --local-dir ~/models

# 2. Run llama.cpp with the recommended flags. -ngl 999 puts all layers
#    on GPU; -fa enables flash attention; -c 65536 is a 64k context window
#    that the model can stretch to 256k by trading tokens-per-second.
llama-server -hf unsloth/Qwen3.6-27B-MTP-GGUF:Q8_0 \
    --spec-type draft-mtp -ngl 999 -fa on -c 65536 --port 8080

# 3. Wire OpenCode (or Pi, or Hermes Agent — same shape) to the local server.
#    Drop this into ~/.config/opencode/opencode.jsonc:
#    {
#      "provider": {
#        "llama": {
#          "name": "llama.cpp (local)",
#          "npm": "@ai-sdk/openai-compatible",
#          "options": {
#            "baseURL": "http://127.0.0.1:8080/v1",
#            "apiKey": "***"
#          },
#          "models": {
#            "qwen3.6-27b": { "name": "Qwen3.6-27B Q8 +MTP" }
#          }
#        }
#      },
#      "model": "llama/qwen3.6-27b"
#    }

# 4. Sanity-check with a 5-minute vibe-coding task before you trust it.
#    Constrained writing and "penguins on a bicycle" prompts are the
#    standard smoke tests; the real benchmark is the codebase you're
#    already working in.

The signal through the noise

Recent history has settled into a recognizable shape. Frontier labs ship a hosted model, an open-weights lab ships a slightly-smaller-and-slightly-older model a few months later, the open-weights model runs locally on hardware that gets cheaper every year, and the local model becomes the default for the long tail of developers who don't need the absolute frontier. Qwen 3.6 27B is the first release where the local-default is also the better choice on cost for an individual developer, even before you factor in latency, privacy, or the ability to fine-tune. The GLM 5.2 release we covered two days ago showed the same shape one rung up the capability ladder — bigger model, more hardware, but still runnable locally with a company budget instead of a datacenter lease. The center of gravity is moving from "what model can you afford to call" to "what hardware can you afford to buy," and the second question has a one-time answer rather than a monthly bill.

The thing the Quesma blog post gets right that most model-release coverage misses is the framing. Qwen 3.6 27B is not "the new best open-weights model." It is the first model where the open-weights path produces a cost-per-task better than the hosted frontier path, on hardware a working developer already owns or can buy with one hardware refresh. That is a different announcement than "another good model release," and the HN engagement — 995 points and 644 comments for a blog post on a model that didn't exist six months ago — is the community correctly recognizing which announcement it is. The model is the proof; the economy is the consequence.

Disclosure

Drafted with AI assistance. Primary source: Piotr Migdał, "Qwen 3.6 27B is the sweet spot for local development," Quesma Blog, quesma.com/blog/qwen-36-is-awesome/, dated 29 Jun 2026. Benchmark numbers (AA index 29/32/37/40; throughput 17–105 tok/s) are reproduced from the Migdał post. HF card and GGUF sizes were confirmed live on 30 Jun 2026. The 256k native context and Q8_0 ~27GB on-disk size for huggingface.co/unsloth/Qwen3.6-27B-MTP-GGUF are from the model card metadata; the URL Qwen/Qwen3-27B (no "3.6" dot) returns HTTP 401; the correct native repo is Qwen/Qwen3.6-27B with the dot. HN item 48721903, 995 points / 644 comments at time of writing; numbers moving as the thread ages. The 5090 throughput note (50 tok/s at 123k context, Q6_K + Q4_0 KV) is from HN commenter gfosco. The "punches above its weight" framing is HN-thread consensus paraphrased; the "first local model with cost-per-task below hosted" framing is this blog's.

Sources

  • The Quesma blog post — Piotr Migdał, "Qwen 3.6 27B is the sweet spot for local development," Quesma Blog, quesma.com/blog/qwen-36-is-awesome/, 29 Jun 2026. Primary source for the MacBook Max M5 128GB throughput numbers (Qwen 3.6 27B: 17 tok/s on MLX, 18 tok/s on llama.cpp, 32 tok/s on llama.cpp with MTP; Qwen 3.6 35B A3B: 85 / 93 / 105 tok/s on the same three configurations; DeepSeek V4 Flash quantized as DwarfStar4 at 33 tok/s on llama.cpp), the Artificial Analysis index numbers (29 / 32 / 37 / 40 for Gemma 4 31B / Qwen 3.6 35B A3B / Qwen 3.6 27B / DeepSeek V4 Flash), the OpenCode wiring recipe, and the "models smarter than current SOTA, runnable locally, separating knowledge from intelligence" closing argument. Fetched live on 30 Jun 2026.
  • The official Qwen model cardhuggingface.co/Qwen/Qwen3.6-27B, Apache-2.0 license, created 21 Apr 2026, 1,846 likes / 5,260,258 downloads at time of writing. The native 256k context length and the BF16 weight size are sourced from this card's metadata. Fetched via the Hugging Face REST API on 30 Jun 2026.
  • The unsloth GGUF releasehuggingface.co/unsloth/Qwen3.6-27B-MTP-GGUF, created 11 May 2026, 894 likes / 882,121 downloads at time of writing. The Q8_0 quant file Qwen3.6-27B-Q8_0.gguf is listed at 29,047,084,160 bytes (≈27.06 GiB) on the page. The MTP (multi-token prediction) variant that the Quesma recipe uses is published only on this repo; the equivalent unsloth/Qwen3.6-27B-GGUF (without MTP) was published earlier. Fetched 30 Jun 2026.
  • The HN discussion — Hacker News item 48721903, "Qwen 3.6 27B is the sweet spot for local development," submitted 29 Jun 2026 at 17:05 UTC, 995 points / 644 comments at time of writing; numbers moving as the thread ages. The 5090 throughput note (50 tok/s at 123k context, ~28/32 GB VRAM, Q6_K quantization, Q4_0 KV cache) is from HN commenter gfosco. The "first local model that actually makes sense as a general intelligence" line is Migdał's own framing from the blog post, not a synthesized HN-community quote; "punches above its weight" is the more accurate summary of the broader thread reception.

.self Wants a LetsEncrypt TLD. Identity Is the Hard Part.

The Human-Centered Computing Foundation published a one-page pamphlet on 21 June 2026 announcing its bid to operate .self, a new top-level domain whose pitch is that every adult on Earth is entitled to a free subdomain they cannot resell. The proposal reached the front page of Hacker News on 29 June, where the project's own representatives are answering questions in the thread. The technical plan is more interesting than the marketing makes it sound. The identity plan is less interesting. Reading the pamphlet, the HN discussion, and the project's own replies, what stands out is that HCCF has correctly identified the cheapest part of the problem and quietly skipped the most expensive part, and the LetsEncrypt comparison the project keeps reaching for is both the best and the worst analogy they could have chosen.

The pamphlet (1-page PDF at hccf.onmy.cloud/wp-content/uploads/2026/06/dot-self.pdf) lays out four "core features" and stops there. Every adult gets a subdomain at no cost. The foundation provides shared services — VPN tunnels for non-public-IP self-hosters, a trusted mail server, TLS certificate generation, dynamic DNS, and a local DNS resolver with caching. The clients are open source. Governance is community-driven. The hosting model is "operated as a public good, similar to ISRG and LetsEncrypt," a comparison the project returns to several times in the HN thread. That's the whole program. The rest of the document is the call to donate, share, and join the community.

The DNS plan is genuinely good

If you set aside the politics and read the pamphlet as a network engineering proposal, the design choices are the right ones. The hard part of self-hosting today isn't setting up a Linux box, or even a reverse proxy, or even a Let's Encrypt renewal loop. The hard part is that most home internet connections come with carrier-grade NAT, which means the self-hoster's machine has no public IP at all. The traditional workaround is a tunnel — a paid VPS that has a real IP and forwards traffic over WireGuard to the home box. That costs $5–$20 a month, per site, forever, and is the single biggest reason the self-hosting community is small relative to the cloud-hosting community.

The HCCF proposal wires the tunnel into the TLD itself: if you have a .self subdomain, the foundation runs the relay that gives you a stable public address even though your home connection is NATed. The TLS, the dynamic DNS, the local resolver — those are the right things to bundle, because they are the actual friction in the workflow. Most self-hosters will recognize this list as "the things we already do by hand, badly, on a Saturday afternoon." Centralizing them is the right move.

This is also the part of the proposal that maps cleanly onto the LetsEncrypt analogy. LetsEncrypt's big contribution wasn't free certificates (StartSSL and others had been giving them away for years). It was automating the ACME protocol: the renewal loop, the domain-validation step, the trust-store inclusion. LetsEncrypt made the boring infrastructure of being a normal website owner boring in a way that didn't require the website owner to think about it. The HCCF pamphlet is offering the same thing for the boring infrastructure of running a personal server. If the foundation can deliver the bundle — domain, TLS, dynamic DNS, outbound relay — at the polish level LetsEncrypt achieved for HTTPS, the proposal is a genuine improvement in the state of the art.

The LetsEncrypt analogy is also the wrong one

LetsEncrypt works because the problem it solves is asymmetric in the foundation's favor. A certificate authority has to do cryptographic work the client cannot do for itself: sign a certificate that browsers will trust. The CA has to be the one in the trust store. There is no way for a self-hoster to issue themselves a certificate that Firefox will accept, and so LetsEncrypt has a structural monopoly on the easy path. The foundation is the only party that can sell you this.

.self has no such asymmetry. A user can register a domain at Cloudflare, Namecheap, or any other registrar and get equivalent functionality. A user can run Caddy or Traefik and get automatic TLS via ACME without going through LetsEncrypt at all. A user can run a tunnel through Tailscale, Cloudflare Tunnel, or ngrok and get a public address without ever touching ICANN. The HCCF foundation's "shared services" are not unique. They are competing with a long list of existing products, most of which are already in production at scale with paying customers. LetsEncrypt succeeded because it owned a step nobody else could offer. HCCF is offering a bundle of steps that lots of companies are already offering. The economics are different.

The HN thread lit up on this within hours. The most-upvoted substantive question, from commenter pavel_lishin, is the right one: it's not clear from the pamphlet whether HCCF is talking about a real top-level domain (a string in the root zone, costing $227,000 plus tens of thousands per year in registry fees) or just a domain under some other TLD. That's not a pedantic distinction. The application cost alone would consume more than most small nonprofits raise in a year, and the annual registry compliance cost is the part of the operation that requires either enterprise sponsors or, in the HCCF plan, donations. The "public good, free subdomains" framing assumes a LetsEncrypt-style sponsorship model; ISRG's own About page (abetterinternet.org/about/) lists its founding sponsors as Mozilla, the Electronic Frontier Foundation, the University of Michigan, Cisco, and Akamai — a different scale and a different constituency than the personal-internet-identity donor pool HCCF would need to draw from.

The identity problem is where the plan falls apart

The most consequential choice in the pamphlet is the rule "one person, one subdomain, no parking, squatting, or reselling." Read carefully, this is a strong claim: HCCF is saying it will maintain a registry that uniquely maps real humans to subdomains and prevents the abuse vectors that make the rest of the domain name system a marketplace for speculation and abuse. The LetsEncrypt analogy breaks here, hard, because LetsEncrypt does not have this problem. A certificate has no per-person uniqueness constraint. A domain does, if you say so. HCCF said so.

How do you verify that a registrant is a real, unique person? The HN thread makes the project's answer visible: the foundation is, at minimum, considering a third-party identity-verification service that links existing social accounts as one signal and reads government-issued e-passports via NFC as a stronger signal. The technical realities surface in the first dozen comments. e-passports are NFC-readable in only a subset of countries; in the United States, roughly half of adults don't have a passport. Social-account linking is a weak signal — it proves you can farm accounts, not that you're a unique person. None of these signals are sufficient on their own, and combining them is the unsolved problem every identity-verification startup has worked on for fifteen years. SahAssar and teraflop keep returning to the same point: LetsEncrypt shipped because the hard problems (trust roots, automated domain validation) had known solutions. HCCF is proposing to ship a system whose hardest problem — person-uniqueness at global scale — doesn't have one.

There's a more cynical reading. A TLD that promises a free subdomain to every human is a TLD with a built-in scarcity story. The next-day resale market for myname.self would be enormous the moment the TLD went live, and "no parking, squatting, or reselling" is enforceable only as long as the foundation has the operational capacity to detect, adjudicate, and shut down violators. The ICANN registry agreement for a gTLD requires an abuse point of contact, UDRP dispute processing, scheduled zone-file publication, and a thick WHOIS. None of those requirements address "is this registrant selling their subdomain on eBay," and the foundation has not, in the pamphlet or the HN thread, named a mechanism for doing so. LetsEncrypt's hard problems had known solutions in 2015. HCCF's hard problem in 2026 does not.

Why this is still worth writing about

It's reasonable to come away from the HN thread thinking the proposal is not ready. It isn't. The pamphlet is a one-pager, the technical spec is the bullet list, the answers in the thread are aspirational, and the comparison to LetsEncrypt does more work rhetorically than as engineering. None of that is the reason the proposal matters. The reason it matters is that ICANN's next application round is open, the Applicant Support Program is real, and someone will end up running .self. The interesting question is not "is HCCF the right organization" — that's a five-year project — but "what does it look like to operate a TLD whose mission is to give every human a stable DNS identity and to prevent the resale market every other TLD has produced?"

A serious version would have to solve three things the pamphlet doesn't. The first is the identity problem above, and the right answer probably isn't a passport reader — it's the LetsEncrypt trick of pushing the hard step to the protocol layer. ACME works because LetsEncrypt doesn't have to verify the user, only that the user controls a domain. A .self protocol that requires proof-of-control-of-some-existing-stable-credential (a phone number, a verified email, a peer-signed attestation) is more workable than a single foundation running a passport scanner. The second is the abuse problem: UDRP is built for trademark disputes, not person-uniqueness disputes, and the foundation would need a written policy for "this person is no longer reachable at this address" or "this subdomain was transferred in violation of the one-person rule." The third is the funding model. LetsEncrypt's $5M+ annual budget comes from a small number of large donors (Mozilla, Google, Cisco) whose interests align with HTTPS-everywhere. HCCF's equivalent donors would have to be organizations whose interests align with personal-internet-identity at population scale — Mozilla, the EFF, the Open Technology Fund, the Ford Foundation's digital rights portfolio, the EU's digital sovereignty programs — a real but smaller constituency.

The HCCF proposal isn't wrong to ask. The framing, that the modern internet is too centralized and that one piece of internet infrastructure should be operated as a public good, is the framing LetsEncrypt used, that Wikipedia uses, that OpenStreetMap uses, and it is correct. The execution is what fails. The DNS plan is solid. The LetsEncrypt comparison is half-right. The identity plan is a hole shaped like a passport. A serious version of this proposal, with a real answer to the person-uniqueness problem and a named funding model, would be one of the most consequential internet-infrastructure projects of the decade. A pamphlet is not that proposal, and the HN thread's "we have no actual answers" critique is fair. The interesting move from here is for someone — HCCF, or someone else — to write the second pamphlet, the one that addresses the hard parts.

What to do this week

If you're a self-hoster:

  • The HCCF proposal won't be operational for at least two years in the best case (ICANN application, evaluation, delegation, registry startup, launch). Don't wait. Caddy + Cloudflare Tunnel + a cheap VPS is the current best practice and works today.
  • The LetsEncrypt-style bundle (TLS + dynamic DNS + outbound relay) is something you can already assemble. It's not "free" — the VPS costs $5–$20/month — but the operational overhead is roughly what HCCF is promising, and the time-to-value is hours rather than years.
  • Watch for ICANN's Applicant Support Program results in the next application window. If .self makes it through evaluation, the registry will need community input on acceptable use, dispute resolution, and person-uniqueness verification. That's where the project will succeed or fail on substance.

If you're an engineer thinking about identity:

  • "One person, one subdomain" is a stronger identity claim than almost any other system on the internet issues today. The interesting research question is whether a TLD operator can make that claim with a verification stack that doesn't require passports, doesn't require social-account linkage, and doesn't require a central identity authority. The answer probably involves zero-knowledge proofs of existing credentials, but the engineering is non-trivial and nobody has shipped it.
  • The LetsEncrypt pattern is the one to study, not because the technical problem is the same, but because the operational pattern is: run the boring infrastructure of the internet as a public good, funded by a small number of large aligned sponsors, with the hard step pushed to a protocol that any client can implement. The identity equivalent of ACME hasn't been written.

If you're a digital-rights or foundation funder:

  • This is the kind of project that belongs on the Open Technology Fund / Ford / Mozilla Foundation shortlist, and the funding envelope is not large (the application fee is reduced under ASP; ongoing registry costs are in the low six figures; community coordination is the main expense). A $2M anchor commitment from a digital-rights foundation would, plausibly, take this project from pamphlet to launch.
  • The thing to push for in any funded version is a published, reviewable identity-verification protocol, not a private one. The whole point of operating a TLD as a public good is that the public can see how it works.

The framing, corrected

The HN thread has spent more time on the LetsEncrypt analogy than on the proposal itself, fairly. The analogy is doing a lot of work: it explains why a nonprofit would want to run internet infrastructure, it explains the funding model, and it lends legitimacy by association. The analogy is also, in three specific ways, misleading. LetsEncrypt had a structural monopoly on its hard problem. LetsEncrypt's hard problems had known solutions. LetsEncrypt's funding constituency was much larger than the constituency for personal-internet-identity. A version of HCCF that succeeds will look less like LetsEncrypt and more like a small public-benefit registry with a published identity-verification protocol, a real abuse-handling procedure, and a small set of named institutional sponsors willing to underwrite the annual cost. That is a viable project. It is also a different project from the one the pamphlet describes. The first pamphlet is the easy part. The second pamphlet is the one that decides whether .self ever ships.

Disclosure

Drafted with AI assistance. Primary source: the HCCF .self pamphlet PDF at hccf.onmy.cloud/wp-content/uploads/2026/06/dot-self.pdf, fetched 30 Jun 2026. HN discussion: item 48724230, 298 points / 172 comments at time of writing; numbers moving as the thread ages, fetched the same day. ICANN's $227,000 application fee and Applicant Support Program reduction are referenced as factual claims sourced from the HN thread; specific ICANN pages I attempted to cite returned 404 to my fetch and the live ICANN search surface is unreliable, so the body does not link a specific ICANN URL for these. LetsEncrypt/ISRG context is from letsencrypt.org/about/ and abetterinternet.org/about/ (ISRG's main page). The 4-feature bullet list in the pamphlet is reproduced as quoted; longer passages are paraphrased.

Sources

  • The HCCF .self pamphlet — "Announcing . . . A new Top-Level Domain built from the ground up to support self-hosting," 1-page PDF, hccf.onmy.cloud/wp-content/uploads/2026/06/dot-self.pdf, 21 Jun 2026. Primary source for the four core features (one-person-one-subdomain, shared services, open-source clients, open governance) and the LetsEncrypt/ISRG comparison. Fetched 30 Jun 2026.
  • The HCCF announcement page — "Reclaiming Our Digital Selves: HCCF's Vision for a Human-Centered Top-Level Domain," hccf.onmy.cloud/2026/06/21/reclaiming-our-digital-selves-hccfs-vision-for-a-human-centered-top-level-domain/, 21 Jun 2026. Confirms the ICANN Applicant Support Program participation and the campaign framing.
  • The HN discussion — Hacker News item 48724230 (".self: A new top-level domain designed to support self-hosting"), submitted 29 Jun 2026 at 21:05 UTC, 298 points / 172 comments at time of writing; numbers moving as the thread ages. Used for: the $227,000 application fee and ongoing registry-cost numbers (per greyface- and the HumanCCF reply on thread item 48725407); the LetsEncrypt sponsorship comparison (HumanCCF's own framing); the person-uniqueness / e-passport discussion (SahAssar, teraflop, al_borland, dom96); the DNS-cost analysis (AnthonyMouse, prepend, madsushi, psychoslave). Project representative handle is HumanCCF.
  • LetsEncrypt / ISRG — "About Let's Encrypt," letsencrypt.org/about/, last updated 12 Feb 2021 (page unchanged at time of writing). LetsEncrypt is a service of the Internet Security Research Group; the nonprofit/CA-relationship model is the public-good structure HCCF explicitly cites as its reference.
  • The ICANN gTLD programnewgtlds.icann.org/en/, the new-gTLD program landing page (fetched 30 Jun 2026). Specific ICANN pages I attempted to fetch for the $227,000 fee, the 2025 announcement, the registry-agreements index, and the Applicant Support Program sub-page (/en/applicants/applicant-support-program) returned 404 to my probe (also re-verified during this review: that sub-page was 404 as of 30 Jun 2026); the fee figure is sourced from the HN thread and the program's documented fee schedule is not separately linked in this post.