Programming guides for beginner...
Any comments are welcomed....
I hope it helps!!! Thanks for drop by...

Friday, June 19, 2026

10,000 GitHub Repos Distribute Trojans. Reddit Saw It First.

10,000 GitHub Repos Distribute Trojans. Reddit Saw It First.

A solo investigator who goes by the handle "theorchid" published a forensic writeup on 18 June 2026 documenting 10,000 GitHub repositories that distribute Trojan malware. The campaign is not new. A Reddit thread in r/github from February 2025 — sixteen months earlier — describes the same scheme, with the same file layout, and the same "this is the second time I've seen a clone of my repo with a malicious link in the README" complaint. GitHub has had the pattern on its own platform, in plain English, for over a year. The writeup is on Hacker News as item 48583928 (635 points, 144 comments as of 19 June 2026 09:00 UTC+8 via the Algolia API). The numbers that matter are in the article, and the gap between the warning and the response is the story.

The pattern, exactly

Each malicious repository is a clean clone of a real, recently-created public repository. The commits, contributor list, and project description are preserved verbatim. Two to ten times a day, a single automated commit is pushed: it deletes the previous README and re-pushes a new one that is byte-identical except for one change — a link to a ZIP archive, hosted off-platform, added inline to the description. The commit message is "Update README.md" every time. The commit author is the cloned repo's owner, whose credentials have been compromised, or a fresh account that has been added as a contributor.

The ZIP archive contains four files, with names that vary per campaign wave but the structure is stable:

  • Application.cmd or Launcher.cmd — a Windows batch file that runs the executable
  • loader.exe, luajit.exe, or another .exe — the actual payload, typically a LuaJIT-compiled dropper
  • random_name.cso or random_name.txt — an encrypted/encoded blob, opaque to static scanning
  • lua51.dll — the LuaJIT runtime the executable depends on

The trick the malware authors care about: the link in the README looks clean to most scanners. The OrchID investigator submitted the link itself to VirusTotal and got back zero detections. The same investigator submitted the file the link points to and got back multiple hits for a Trojan. The URL-as-delivery-vector is the gap. Anyone clicking the README link gets a clean "this URL is safe" verdict from a scanning service, and the ZIP lands on disk with the executable waiting to run.

This is the same pattern Hexastrike's Maurice Fielenbach documented on 18 April 2026 in a parallel campaign ("Cloned, Loaded, and Stolen: How 109 Fake GitHub Repositories Delivered SmartLoader and StealC") — 109 repos at that point, with the SmartLoader/StealC infostealer chain attached to the LuaJIT runtime. The OrchID writeup, published two months later, found the pattern at 100× the scale and traced it to a much wider set of payload families, not just SmartLoader/StealC. Two independent researchers, two months apart, two orders of magnitude apart in scope, the same scheme.

Why the campaign clones new repositories, not popular ones

The targeting decision is the part that should change how you think about GitHub discovery. The campaign does not clone torvalds/linux, facebook/react, or kubernetes/kubernetes. It clones new repos with no stars, no contributors, and project names that match low-volume long-tail search terms — exactly the population of repositories that Google and Bing surface for searches where the searcher is the only person who has ever made that exact query. The campaign does not need to outcompete react. It needs to outcompete the three other one-week-old projects with similar names.

The "high rank for low-volume terms" strategy is the SEO weaponization. A new repo with a unique name, a stolen commit history, and a clean contributor list is, to a search engine, indistinguishable from a legitimate new repo. The README link to the malware ZIP is, to the search engine, just a link. The user who clicks it is the target — and the user is typically a developer who is early in the search funnel, looking for an off-the-shelf implementation of something they want to build. The malware authors are not trying to phish the open-source-curious. They are trying to phish the developer who Googled "C++ WebSocket client implementation" at 11 PM and clicked the first result that was not a Stack Overflow answer.

This is also why the contributor list and commit history are preserved. When you visit a repository, the first thing you see is "Contributors: 4, Commits: 47." A real-looking contributor graph is the trust signal. The campaign's authors are not building a community — they are building a profile. The bot is doing the same work that a real maintainer does, on a tighter schedule, with the malware payload stapled to the README.

The Reddit thread that flagged it 16 months ago

The pattern is not novel. In February 2025, a Reddit thread in r/github titled "If you're creating new repositories, they are being spoofed to host malware" was posted (linked from the OrchID writeup, "Update 3"). The thread describes the same scheme: a developer's brand-new repo gets cloned, a malicious commit is added, the clone is reachable via the same long-tail search. The thread received comments, the comments received upvotes, GitHub Support was tagged in the thread by multiple commenters, and the campaign continued.

The 16-month gap between the Reddit thread and the OrchID writeup is the substantive part of the story. The pattern is recognizable, has been publicly named, and has been sitting on a platform GitHub actively moderates. The malware authors have not changed tactics. The defenders have not built a detector. The gap is not technical. The gap is organizational.

GitHub's automated abuse detection is good at catching the things it has been trained on: phishing landing pages in repo descriptions, secret-token commits, dependency-confusion attacks. The OrchID campaign slips through because the content of the README is clean — it is the same README as the cloned legitimate repo, plus a single URL. The URL is not on the GitHub platform. The download is not on the GitHub platform. From GitHub's perspective, the repository contains a README, source code, and a commit history. That is what a repository is.

The original take: rate limits are the wrong frame for the defender

The OrchID investigator's tooling is a strong read on the scale of the problem, and also a tell on what the real defender capability is. The investigator worked within the public GitHub API's 5,000 requests-per-hour rate limit, used gharchive.org to filter the event stream down to "repos with 1-24 commits per 24 hours from a non-bot author," and then made targeted API calls. The result: 10,000 matches out of 40,000 candidate repos, which is 25% of the high-frequency-commit population. The investigator is explicit: the script does not cover the long tail. The real number is larger.

GitHub, the investigator notes, does not have a 5,000-requests-per-hour rate limit. GitHub can scan all 500 million repositories, enumerate the URLs in every README, fetch every linked archive, and submit every archive to every antivirus engine. The cost of running that scan once is, in 2026, on the order of a single engineering team-week. The cost of not running that scan is, conservatively, the same 10,000 repos re-pushed every week for the next year.

The investigator is asking, correctly, for someone with direct access to the security team to forward the article. The investigator also acknowledges in "Update 2" that, by the time the writeup went to press, GitHub had begun deleting the repos the script found. The automated sweep is happening. It is happening 16 months after the first public report, and it is happening on a list a single investigator built with a public API key. The right takeaway is that the capability was always there. The decision to deploy it is the news.

What this means for you

If you ship open-source code, the immediate action is short. Pick the most recent repo you created — something from the last six months — and search for it on Google and Bing. If you find a clone with the same name, the same description, and a README that is "your README plus one link," that is the campaign. The link is the giveaway. Do not click it. The fix is the same one you would use for any other malicious clone: report it via the GitHub abuse form, link to the original repo, and explicitly call out the README-link as the vector. The "Update 2" in the OrchID writeup suggests the current response time, once a report is filed, is "weeks, not days." Build that into your timeline.

If you are a developer searching for code to use, the defensive move is to treat the first search-engine result for a niche term as a candidate, not a recommendation. The campaign specifically targets the population of searches where the legitimate answer is low-volume and the searcher is willing to click a result that is "good enough." Check the contributor graph, check the commit count, check the age of the repo. A repo that is three days old, with a clean commit history and a download link in the README, is the danger profile. Walk away, or git clone into a sandbox.

If you are a security team at a platform that hosts user content, the OrchID writeup is a public audit of a specific failure mode, and the failure mode generalizes. The 16-month delay is not a fluke. It is what happens when a platform's automated abuse pipeline is trained on the previous generation of attacks, the public report of the new generation is not on a channel the security team is monitoring, and the abuse team has no public metric for "repos with URLs in their README." The fix is not more scanning. The fix is one engineer spending a week on a "for every README URL, fetch and AV-scan the target" job, and then turning it on by default. The cost of doing it is small. The cost of not doing it is on a measurable clock.

What to do this week

STEP 1. Audit your own recent repos for clones you didn't make. Google "[your project name] github" and look for results that are not your repo. Click through. If the README is yours plus a link, that is the campaign. (Reference: the OrchID writeup, "Introduction" section, on what the comparison looks like in practice.)

STEP 2. Run the git-malware-finder script against a topic you care about. The investigator published the detection script as github.com/orchidfiles/git-malware-finder. It is read-only — it produces a list, it does not take action on the listed repos.

STEP 3. If you find a clone, file an abuse report. The pattern is identical across all 10,000 repos in the current set, so one good report is reusable as a template. Confirm the suspect with gh repo view <user>/<repo>, then file at github.com/contact/report-content → "Malicious content on a repository" → paste the repo URL, the original repo URL, the "this README link is the vector" note. Reference the OrchID writeup (orchidfiles.com/github-repositories-distributing-malware/) as the campaign's public documentation.

STEP 4. For platform security teams: spend the time. The 16-month gap is a known, named, repeatedly-reported failure mode. The detection job is a one-engineer-week. The next campaign will not wait for another solo investigator to publish a list.

STEP 5. If your CI runs a git clone of a third-party repo as part of an integration test, sandbox it. The current campaign's loaders are Windows executables, but the next one will not be. The cost of running an untrusted git clone inside a container with no network egress and a read-only filesystem is small. The cost of running it in your CI host's working directory is the same 10,000 repos the campaign is currently trying to get you to clone.

# Concrete, copy-pasteable audit (run from a clean machine).
gh repo view <your-handle>/<your-repo>
google_search="https://www.google.com/search?q=%22$(echo your-repo | tr ' ' '+')%22+site%3Agithub.com"
curl -sL --compressed --max-time 20 -A "Mozilla/5.0" "$google_search" \
  | grep -oE 'github\.com/[A-Za-z0-9_-]+/[A-Za-z0-9_.-]+' \
  | sort -u > /tmp/clone-candidates.txt
# Manually diff /tmp/clone-candidates.txt against your own repos.
# Anything that is not yours is a clone candidate; if the README
# has a download link, file an abuse report.

Disclosure

Drafted with AI assistance. Primary source: "I discovered a large-scale malware distribution campaign on GitHub," OrchID Files (handle: theorchid), 18 June 2026 — curl -sL --compressed on 2026-06-19. The 10,000 / 40,000 / 25% figures, the 5,000 requests-per-hour rate-limit note, the four-file ZIP layout (cmd / exe / cso-or-txt / lua51.dll), the VirusTotal link-vs-file detection-gap finding, the 16M-commit-pushes / 3,000 high-frequency-candidates figures, and the "Update 2" GitHub-sweep confirmation are all from the OrchID writeup. Hacker News item 48583928, "I found 10k GitHub repositories distributing Trojan malware," 635 points and 144 comments as of 2026-06-19 09:00 UTC+8 via the Algolia HN Search API (/api/v1/search endpoint; the /api/v1/items/<id> endpoint returns num_comments: null and only points, so the comment count was sourced from the search endpoint, not the items endpoint); the original HN submission timestamp is 2026-06-18T11:45:43Z. Secondary source: Maurice Fielenbach, "Cloned, Loaded, and Stolen: How 109 Fake GitHub Repositories Delivered SmartLoader and StealC," Hexastrike Cybersecurity, 18 April 2026 — 109 repos, SmartLoader/StealC infostealer, LuaJIT + Polygon-based C2. The Reddit thread (r/github, February 2025, "If you're creating new repositories, they are being spoofed to host malware") is linked from the OrchID writeup's "Update 3" but was not re-fetched for this post; the date and title are from the OrchID citation. The git-malware-finder script is referenced from the OrchID writeup; the script URL (github.com/orchidfiles/git-malware-finder) is the same. The "one engineer-week" cost estimate in the "What this means for you" section is this blog's directional read of the README-URL scan job, not a sourced claim from the OrchID article or from GitHub. The "weeks, not days" response-time figure is this blog's read of the OrchID timeline, where the original report took "two weeks" for an initial non-response and a further month-plus for the initial repo deletion; that is a sample size of one, not a verified SLA. The three internal "Related on this blog" cross-links were URL-verified via curl -sL --compressed -o /dev/null -w "%{http_code}" against tutorialoflife.blogspot.com on 2026-06-19; the Anubis, Miasma, and Recruiter URLs all returned HTTP 200.

Sources

  • "I discovered a large-scale malware distribution campaign on GitHub," OrchID Files, 18 June 2026, 10,000-repo forensic writeup, with the search pattern, the file layout, the VirusTotal link-vs-file test, the API rate-limit discussion, and the full repos list (linked from the article): https://orchidfiles.com/github-repositories-distributing-malware/
  • Hacker News, item 48583928, "I found 10k GitHub repositories distributing Trojan malware," 635 points and 144 comments as of 2026-06-19 09:00 UTC+8 (Algolia API value; numbers move as the thread ages) — https://news.ycombinator.com/item?id=48583928
  • Algolia HN Search API metadata for item 48583928 (canonical point/comment counts and the 2026-06-18T11:45:43Z submission timestamp) — https://hn.algolia.com/api/v1/items/48583928
  • Maurice Fielenbach, "Cloned, Loaded, and Stolen: How 109 Fake GitHub Repositories Delivered SmartLoader and StealC," Hexastrike Cybersecurity, 18 April 2026 — 109 repos, SmartLoader/StealC, LuaJIT + Polygon-based C2 (the prior, smaller-scale documentation of the same pattern): https://hexastrike.com/resources/blog/threat-intelligence/cloned-loaded-and-stolen-how-109-fake-github-repositories-delivered-smartloader-and-stealc/
  • git-malware-finder, the detection script OrchID published alongside the writeup, plus the full 10,000-repo list (read-only tooling, no automated action against the listed repos): https://github.com/orchidfiles/git-malware-finder
  • Related on this blog: "The Recruiter's Repo. The npm install Was the Backdoor." — supply-chain malware precedent on a different vector (npm, not git clone); the trust model failure is the shared theme: https://tutorialoflife.blogspot.com/2026/06/the-recruiters-repo-npm-install-was.html
  • Related on this blog: "Miasma Worm Just Hit Microsoft Azure. The 6/8 Post Was the Trailer." — the largest hyperscaler-side supply-chain compromise to date, same trust-model failure at a different layer (config files, not repos): https://tutorialoflife.blogspot.com/2026/06/miasma-worm-just-hit-microsoft-azure-68.html
  • Related on this blog: "Anubis Moved PoW to WebAssembly. The Compiler Broke It." — the reproducible-builds angle, distinct problem, same supply-chain-trust framing: https://tutorialoflife.blogspot.com/2026/06/anubis-moved-pow-to-webassembly.html

Thursday, June 18, 2026

Anubis Moved PoW to WebAssembly. The Compiler Broke It.

Xe Iaso's "I hate compilers" hit the front page of Hacker News on 18 June 2026 with 111 points, and the title undersells what is actually a reproducible-build horror story dressed up as a WASM-to-JavaScript engineering writeup. Anubis — the proof-of-work reverse proxy that this blog covered recently as the de facto answer to the LLM-scraper DDoS problem — is moving its challenge logic from SHA-256 to WebAssembly so administrators can swap in custom PoW schemes. The goal is clean: define the check logic once, run the same bytes on both client and server. The reality is that getting the same bytes out of clang twice in a row is the actual hard part.

The lesson generalizes well beyond Anubis — to anyone shipping compiled artifacts (WASM modules, native binaries, LLVM bitcode, kernel modules) from CI and expecting the bytes to be stable.

Angle 1: Why your WebAssembly binary has a different hash on every rebuild

The first demonstration in Xe's post is the reproducible-builds thesis in twenty lines of C++. The example defines __DATE__ and __TIME__ as compiler builtins that stamp the build timestamp into the output, then compiles the same hello.cpp twice in a row. The two outputs differ in the embedded timestamp. Identical source, different bytes — on every run, for a reason no one designing a "reproducible build" would have invented.

Compiler nondeterminism shows up in three places that the Anubis writeup hits in order: embedded timestamps via __DATE__ / __TIME__ (trivial); tooling the compiler shells out to, like Clang silently invoking wasm-opt from $PATH (surprising); and address-sensitive codegen, where pointer values leak into the order of try_table blocks in Clang's exception-handling path (genuinely hard). Xe observed the last one as a 29-byte drift between consecutive builds of the same wasm2js on the same machine with the same flags. Structurally meaningless, byte-for-byte meaningful.

@pertymcpert identified the mechanism in the HN comments: Clang iterating over a DenseMap (a hash-map with non-deterministic iteration order) on some code path when generating try_table blocks; the fix is to swap for a MapVector (preserves insertion order, with some runtime/memory cost). One-line fix in Clang. Until it ships, every WASM binary built from C++ with exception handling will drift on every build.

Angle 2: The tooling supply chain is the actual attack surface

The most operationally alarming finding is the chain clang → wasm-opt → binaryen → wasi-sdk → Clang's bundledwasm2js`. Every one has its own version, schedule, and vendoring story. Thewasm-optXe had on a DGX Spark ARM machine was 108. The version on his x86 workstation, from Homebrew, was 130. The version Clang reaches for depends on$PATH. When the installedwasm-optis too old to understand the WebAssembly Exceptions extension thatwasi-sdk` emits by default, the build fails silently — looks like a Clang bug, is a binaryen version mismatch.

The lesson: the compiler's "implicit dependencies" are not in your lockfile. Nix picks this up — @crvdgc pointed out in the comments that Nix sets the build time to epoch to make hash calculation stable — but most CI pipelines do not. Pinning clang alone is insufficient; pin every binary the compiler can shell out to.

For Anubis — where the WASM binary is the trust anchor for the entire proof-of-work challenge — the compiler's nondeterminism lands as a security boundary. Reproducible builds are the property that lets an independent party re-build your binary, compare hashes, and be confident they got what you shipped. Without it, the "is this WASM actually from the Anubis project?" question becomes unanswerable.

Angle 3: The fallback chain is more honest than most production stacks

The original WASM-based PoW challenge had one failure mode: a client with WebAssembly disabled (privacy settings, browser policy, an old embedded device, Tor Browser) cannot solve the challenge and gets locked out. Xe did not want to exclude those users, so:

  1. Primary: WASM check, runs on both client and server, fast.
  2. Fallback when WASM is disabled: wasm2js recompiles the same WASM module into JavaScript at build time. Slower, but it runs on any browser.
  3. Why both artifacts stay byte-equal: the WASM and the JS both encode the same source, so the PoW logic is identical. The browser picks one.

The original-recipe implementation uses wasm2js from the Linux distribution's package manager. That's where the reproducibility problem comes in: Debian's version is too old, Homebrew's produces different output, and the version Clang produces depends on $PATH. Xe's fix is to bundle a copy of wasm2js compiled to WASM with wasi-sdk, and ship it inside the Anubis repo. Single-architecture, single-toolchain, byte-stable (modulo the Clang bugs above).

A generic "WASM is the answer" stack would ship the WASM-only path and add a "supported browsers" list. Xe's stack is "if you can't run WASM, run our slower JS port, and we keep both artifacts under the same reproducibility guarantee." The fallback is part of the product, not a TODO.

Angle 4: This is the second anti-AI-bot arms escalation that depends on toolchain trust

The first escalation was the original Anubis PoW: a SHA-256 challenge that proves the client spent CPU. It works because SHA-256 is in WebCrypto on every browser and the CPU cost is honest. The second escalation moves the challenge itself into a WASM module, giving the server operator control over the PoW scheme — memory-hard, GPU-unfriendly, custom preimage format, all without coordinating with the Anubis core team.

The new attack surface is the WASM module itself. With SHA-256, the trust chain was Anubis project → npm package → your server → browser. With WASM, it is Anubis project → WASM binary built by someone → mirrored to a CDN → loaded by the browser. The honest defense is reproducible builds. Xe's whole post is an open admission that the reproducible-builds half of that defense is missing for the toolchain he is using, plus a working note on the patches he applied to make it so.

Angle 5: The HN thread shows the canonical mistakes

Three top comments identify the three common wrong responses to "this build is non-deterministic":

  • @charcircuit: byte-identical output is an arbitrary restriction, equivalent programs are equivalent regardless of the build hash, the right defense is signature verification. Cryptographically correct in the narrow sense. Wrong for Xe's use case: Anubis is community-run and the trust model is anyone can rebuild and verify, not trust the single signing key holder.
  • @dyauspitr: LLMs should be trained on and directly output binary. The "skip the compiler" position. The determinism problem goes away when the model is the compiler — except it does not, it just moves.
  • @ComputerGuru pushed back on the title as clickbait, noting that compilers literally made the project possible. The right read. Xe hates compilers the way a structural engineer hates gravity: gravity is a real force, and you design around it anyway.

All three replies are partially correct in isolation. None engages with the actual problem: "I need this WASM binary reproducible so downstream operators can verify it."

The original take: the compiler is the supply chain

The honest read of "I hate compilers" is that the modern compiled-artifact supply chain has the same trust properties as a software dependency graph, and most projects are not treating it that way. You pin npm versions. You audit container base images. You run cargo audit or npm audit. You do not, as a rule, audit your clang's implicit wasm-opt dependency.

The reproducible-builds community has been saying this for fifteen years. Debian's reproducible-builds project has been patching individual nondeterminism sources across the archive. Nix, Guix, and Bazel-with-remote-execution each take a swing at the hermetic-build problem. None of them is the default.

Xe's post is, in this reading, a public service announcement that the Anubis team is one of the few projects in the WASM ecosystem taking the question seriously. They ship their own vendored wasm2js, accept the 29-byte Clang-exception-handling drift as a known-unfixed upstream bug, and document the patch trail. That is not "I hate compilers." That is "I have read the source code of my compiler and I am not happy about what I found, but here is the patch."

What this means for you

If you ship a WASM module, native binary, or any compiled artifact that downstream parties verify, ask this week:

  1. Two consecutive builds on the same machine — same bytes? Run three times, sha256sum the outputs.
  2. Two different machines, both pinned — same bytes? Pin clang, pin wasm-opt, pin everything clang can shell out to. strace -f -e execve the build, read what it invokes.
  3. If a downstream operator runs your build today, do they get the same bytes you got last month? If the answer is no, your signing story is the only thing standing between "trust us" and "trust us, plus our key." Decide before the audit asks.

If you are using Anubis (or any tool that ships a WASM PoW check), ask your vendor whether the WASM module you load is reproducible from a clean checkout. If they cannot answer, the "is this WASM actually from the project?" question is one CDN compromise from being unanswerable.

What to do this week

Pick a compiled artifact you ship and run this three times — same source, fresh build each time, hash the output:

make clean && make my-wasm-module
sha256sum my-wasm-module
make clean && make my-wasm-module
sha256sum my-wasm-module
make clean && make my-wasm-module
sha256sum my-wasm-module

If the three hashes disagree, the artifact is non-reproducible. The usual culprits, in order of frequency: embedded timestamps (__DATE__, __TIME__, build epoch); source paths in debug info (-ffile-prefix-map helps); compiler-shelled-out-to tooling (strace your build); address-sensitive codegen (MapVector vs DenseMap, etc.).

For Nix users the fix is partially built in:

nix-build -A my-wasm-module
nix-build -A my-wasm-module  # second build, same hash?

If the two builds disagree and you are not on Nix, the path forward is either Nix (heavy lift, real fix) or a hand-pinned toolchain inside a container with the tool versions frozen in the Dockerfile (lighter lift, recurring maintenance). Xe chose the second path for Anubis. Most projects do not choose either, and ship non-reproducible binaries anyway.

Disclosure

Drafted with AI assistance. Primary source (Xe Iaso's "I hate compilers") and the HN thread (item 48581070) were both retrieved via direct HTTP fetches on 2026-06-18 around 13:30 UTC. All quoted comments are paraphrased, not blockquoted; the compiler-nondeterminism claims (__DATE__ / __TIME__, Clang's silent wasm-opt shell-out, DenseMap vs MapVector for try_table ordering, the 29-byte drift) are sourced from Xe's writeup, with the MapVector mechanism confirmed in the comment by @pertymcpert. The 111-point HN figure is from the Algolia API at the fetch timestamp (live-page counter was 113 at the same moment; the API value is the canonical figure for citation). Xe Iaso is the author of Anubis; weight that into any verification claims about the toolchain.

The compiler is the supply chain. You are not auditing it.

Sources

  • Xe Iaso, "I hate compilers" — the primary writeup, with the full reproducible-builds walkthrough (published 2026-06-18, 1665 words): https://xeiaso.net/notes/2026/anubis-wasm-vendor-binary/
  • HN discussion, item 48581070, "I hate compilers" (111 points per Algolia API as of 2026-06-18 13:30 UTC fetch; live-page counter was 113 at the same moment): https://news.ycombinator.com/item?id=48581070
  • Anubis project, the proof-of-work proxy whose WASM-port this post is about: https://github.com/TecharoHQ/anubis
  • Binaryen / wasm2js, the WebAssembly-to-JavaScript transpiler Xe is vendoring for the deterministic-builds fix: https://github.com/WebAssembly/binaryen
  • wasi-sdk, the WASI-flavored Clang toolchain Xe used to compile wasm2js to WASM: https://github.com/WebAssembly/wasi-sdk
  • Related on this blog: "An AI Agent Burned $6,531 on AWS to Scan a Hobby Network Nobody Asked It To" — covers Anubis as the standard answer to LLM-scraper DDoS: https://tutorialoflife.blogspot.com/2026/06/an-ai-agent-burned-6531-on-aws-to-scan.html
  • Related on this blog: "Linear Is Fast Because the Browser Is the Database" — different problem, same supply-chain-trust theme: https://tutorialoflife.blogspot.com/2026/06/linear-is-fast-because-browser-is.html

OpenAI's 2025 Books: $20B Loss, $10B to Microsoft

On 16 June 2026, the audited 2025 financial statements of OpenAI leaked via independent journalist Ed Zitron, were independently reviewed by the Financial Times, and made their way into an Ars Technica write-up that hit the front page of Hacker News within hours. The headline number — a $39 billion "net loss" — is misleading, and almost every angle in the post is downstream of one line item that the casual coverage has underweighted. The story is not that OpenAI is losing money. The story is the shape of the loss: where it goes, who it goes to, and what the trajectory implies about the IPO that the company is now filing for.

The 2025 numbers, as reported in the audited statements (revenue, R&D, cost of revenue, sales & marketing, loss from operations, headline net loss), tell a coherent story when you stack them. Revenue: $3.7B in 2024, $13.07B in 2025. Loss from operations: $8.78B in 2024, $20.92B in 2025. R&D: $7.81B in 2024, $19.18B in 2025. Of that 2025 R&D, $10.59B was paid to Microsoft as part of the cloud and compute partnership. Cost of revenue (inference-time compute, primarily): $2.65B in 2024, $7.5B in 2025. Sales and marketing: $1.11B in 2024, $5.73B in 2025. The headline net loss of $39B includes a roughly $30B one-time accounting charge tied to the company's 2025 conversion to a for-profit structure. Strip that out, per the FT's reporting, and the 2025 net loss is closer to $8B — which is still enormous, but the order of magnitude is different.

Angle 1: The headline $39B is a one-time charge, not a run-rate

This is the most important framing correction. The $39B "net loss" number that hit the front page is not what OpenAI is burning through 2026. It is a paper charge related to the conversion from a non-profit capped-profit structure to a fully for-profit one. The mechanism: when investor valuations shift during a structural reorganization, the accounting books revalue prior commitments, and the difference lands on the income statement as a one-time hit. The FT cited "a person familiar with the matter" putting the 2025 net loss at roughly $8B without that charge. $8B is still a 64% revenue multiple in losses. It is not the apocalyptic $39B figure that the Reddit threads are running with, and that distinction matters for how serious readers read the rest of the line items.

The $20.92B "loss from operations" number, by contrast, is a run-rate. That is the number that reflects what OpenAI spent, day-to-day, to operate in 2025 — and it grew 138% year-over-year, against revenue that grew 253%. As a percentage of revenue, operating losses improved from 237% in 2024 to 160% in 2025. The unit economics are getting less bad. They are not yet close to zero. The company has guided to profitability by 2030, and the loss-from-operations trajectory is consistent with that guidance if the cost-growth curve bends and the revenue-growth curve does not.

Angle 2: Microsoft is the single largest line item that is not a line item

The $10.59B of $19.18B R&D paid to Microsoft in 2025 is the story, and the Ars Technica write-up flags it but does not foreground it. That is more than half of OpenAI's entire R&D spend, going to one supplier, on a compute contract that is — per public reporting on the 2023 partnership extension — capacity-constrained and price-fixed through at least 2030. This is not a vendor relationship. It is a structural dependency.

The implication: OpenAI's "loss from operations" is, in a real sense, a Microsoft rent bill. The company can grow revenue as fast as it wants, but if its marginal inference cost is set by Azure compute pricing and the partnership cap is what it is, the operating-loss trajectory is bounded by the unit economics of the Azure deal. The 2025 numbers make this concrete. Cost of revenue went from $2.65B to $7.5B — a 183% jump — which tracks with the inference volume growth ChatGPT saw in the same window (900M weekly active users reported, of which roughly 50M are paid subscribers). Inference is now the second-largest cost line, behind R&D, and it is the one that scales with usage. R&D, by contrast, is mostly fixed (training runs) plus the Microsoft commitment.

Angle 3: The paid-subscriber math is the actual unit-economics story

OpenAI reports 900M weekly active ChatGPT users, of which roughly 50M are paid subscribers. At a blended subscription price point somewhere in the $20-$25/month range (the Plus tier, weighted by the smaller Pro and Team populations), the annual subscription revenue run-rate is plausibly in the $12-15B neighborhood. The remainder of the $13.07B 2025 revenue is API access (ChatGPT Enterprise, the OpenAI API for third parties) plus a smaller Microsoft Azure resale line. Of those three streams, the subscription one is the only one with positive gross margin at any reasonable scale; the API is inference-cost-heavy; the Microsoft resale is mostly a pass-through.

Per-paid-subscriber unit economics: $20.92B operating loss / 50M paid subs = roughly $418 of operating loss per paid subscriber per year. If you assume the average paid subscriber is generating around $240/year of subscription revenue (Plus tier at $20/month × 12), OpenAI is losing $1.74 for every $1 of subscription revenue. The unit economics are still deeply negative. The improvement from 2024 (where the multiple was worse, on a smaller subscriber base) is real. The gap to break-even is still large.

The strategic question this raises: what happens to the paid-subscriber base when local models cross the threshold for the "good enough" workflows? This blog covered the Vicki Boykis "running local models is good now" inflection two days ago; the implication there is that 25-50% of the workflows that currently route to ChatGPT Plus are now viable on a local Gemma 4 26B. If even 10% of paid subscribers migrate to local, the unit-economics curve bends the wrong way. The 2025 financials are the high-water mark for "people pay $20/month for a frontier chat." The 2026 and 2027 numbers will show whether that base holds.

Angle 4: The Microsoft $30B charge is a tax on the IPO structure, not a tax on the business

The single largest accounting event of 2025 was the conversion from capped-profit to for-profit, which is the structural prerequisite for the IPO paperwork OpenAI is now filing. The roughly $30B charge is the fair-value re-measurement of the prior investor commitments against the new equity structure. This is the kind of line item that shows up once, in the year of conversion, and never recurs. Auditors (and the SEC) will flag it. Analysts will adjust for it. The press will, eventually, stop quoting it.

The more durable read is the operating-loss line, the R&D-to-Microsoft line, and the cost-of-revenue growth rate. Those three are the things that compound. A company can absorb a one-time $30B accounting charge and survive. A company whose cost of revenue grows 183% year-over-year cannot, at this rate, sustain 160% operating losses indefinitely. The 2030 profitability guidance requires cost-of-revenue growth to slow, R&D-to-Microsoft to stay flat or decline (i.e., the Azure partnership terms to renegotiate), and the revenue line to keep compounding at 50%+ CAGR. Two of those three are within OpenAI's control. The middle one is not.

Angle 5: What the S&M jump tells you about the ChatGPT business

Sales and marketing went from $1.11B in 2024 to $5.73B in 2025 — a 5.16× increase, far outpacing the 3.53× revenue growth. As a percentage of revenue, S&M went from 30% to 44%. This is the line item that says the most about the underlying business. Frontier AI labs that are growing primarily by word-of-mouth and developer adoption (Anthropic, the open-weights tier) spend single-digit percentages of revenue on S&M. OpenAI is now spending nearly half of revenue on customer acquisition.

The HN thread had two comments that triangulated this from different angles. "iaaan" reported physical billboards for ChatGPT in the Portland, OR area, and asked what return those have. "themafia" replied at the top level: "I don't understand the 'sales and marketing' cost…It's so polarizing I can't imagine how that $5.7B is being spent." A follow-up reply by "dylan604" suggested the line item is paying for influencers to set up "kool-aid stands." Neither framed it in S&M-as-percentage terms, but both are pointing at the same phenomenon: OpenAI is now in the customer-acquisition-cost regime that consumer software companies enter when organic growth plateaus. The 900M weekly active number is large. The 50M paid conversion — 5.5% — is not. The reason the conversion rate is not improving is that the $20/month price point is now competing with a local tier that crossed the "good enough" threshold.

Angle 6: The IPO is the strategic context for the leak

OpenAI is filing SEC paperwork for an expected IPO. The leaked statements are from 2025; the IPO will price on 2026 numbers plus a forward projection. The question the prospectus has to answer is: at what 2027-2028 revenue and cost-of-revenue trajectory does the operating loss line bend to zero? The 2025 audited statements are the historical baseline; the S-1 will project forward. Every dollar of Microsoft R&D, every dollar of inference cost, every dollar of S&M is now a number that an underwriter has to defend at a roadshow.

This is the part of the story that is genuinely novel and that the front-page coverage has not emphasized. The leak is not a leak for its own sake; it is a leak into the middle of an SEC review. The numbers, the trends, and the trajectory are now public record in a way that constrains what the S-1 can claim. Operating losses improving from 237% to 160% of revenue is a real story and a defensible narrative. A $39B "net loss" that the average reader will not parse as a one-time charge is a story that hurts the IPO, and the company's communications team will spend the next 90 days working to reframe it.

The original take: the per-subscriber line is what the 2026 numbers will be judged on

The most common read of the 2025 financials in the press and the HN thread is that OpenAI is "losing billions." That is true and it is not useful. The more useful framing is: OpenAI is a $13B-revenue business that is losing $20.9B from operations, of which $10.6B is a single Microsoft contract. The 2026 numbers — when they leak, or when they appear in the S-1 — will be read against three questions, not one.

  1. Did paid-subscriber growth keep pace with 2025's pace, or did the 900M-weekly-active / 50M-paid gap close at all?
  2. Did cost of revenue grow slower than revenue, or faster? (The 2025 numbers had cost of revenue growing 183% against revenue at 253% — a favorable ratio, barely.)
  3. Did the Microsoft R&D line stay flat, or did the 2026 number push above $11B? If it pushed above $11B, the IPO narrative is "we are growing into a structural cost we cannot control." If it stayed flat or dropped, the narrative is "we are scaling past the fixed compute commitment."

The 2025 financials, read this way, are not a "losing billions" story. They are a story about a $13B business whose next 18 months will be read at the per-subscriber and per-inference-call level. The pre-2025 AI-lab financials (Anthropic, Mistral, Cohere) are private and not directly comparable. The closest public comp is Google's "Other Bets" line, which includes DeepMind and runs an operating loss on a much larger revenue base (the specific 2025 figure should be checked against Alphabet's most recent 10-K before quoting; the directional read is "comparable-scale operating loss, vastly larger revenue"). OpenAI is making the same bet — that the AI line will eventually be large enough to absorb its own R&D cost — on a tighter runway, with a single-supplier compute dependency that is not Google's.

What this means for you

If you are a developer or a small team paying for ChatGPT Plus, the 2025 financials do not change your short-term calculus. The price is not going up in 2026; if anything, the S&M line item is evidence the company has pricing room. The thing worth tracking is the paid-subscriber base: if 2026 shows a flattening or decline, the price-stability assumption breaks.

If you are a startup building on the OpenAI API, the cost-of-revenue trajectory is the line that matters. Inference pricing has reportedly been declining sharply year-over-year on the public benchmarks (the rule-of-thumb figure is in the 70-80% range, though the exact rate depends on which benchmark and which model family you anchor to); the question is whether OpenAI can keep pricing flat or pushing lower while its own cost-of-revenue grows. If cost-of-revenue growth in 2026 outpaces 2025's 183% rate, the unit economics on the API tighten, and either pricing has to rise (unlikely during an IPO year) or the company has to renegotiate the Microsoft deal.

If you are a founder or an enterprise buyer, the Microsoft dependency is the strategic line item. Every API call routed through OpenAI is, indirectly, routing through Azure. The diversification argument — "we are not locked into one cloud" — does not hold for OpenAI-routed workloads. The 2025 financials are the first time this dependency has been quantified in audited statements; it was speculated about for years, and the $10.59B number makes it concrete.

What to do this week

    # Step 1. Pull the full Ars Technica article (primary source) so the
    #    numbers above are not the only version of the story you are
    #    anchoring on:
    curl -sL --compressed --max-time 20 -A "Mozilla/5.0" \
      "https://arstechnica.com/ai/2026/06/leaked-financial-docs-show-openai-is-losing-billions-of-dollars-a-year/" \
      -o /tmp/openai-2025.html
    #    The audited numbers ($3.7B/$13.07B revenue, $7.81B/$19.18B R&D,
    #    $10.59B Microsoft, $2.65B/$7.5B cost of revenue, $1.11B/$5.73B
    #    S&M, $8.78B/$20.92B loss from operations) are all in the body
    #    of that article; FT's $8B-adjusted-net-loss framing is in the
    #    same write-up.

    # Step 2. If your stack runs on the OpenAI API, run a one-week
    #    shadow of token usage and pricing against a local-tier model
    #    (Gemma 4 26B or Qwen 3 30B-A3B). The point is not to migrate.
    #    The point is to know what fraction of your API bill is on
    #    workflows the local tier now covers — that fraction is the
    #    negotiating room you have if 2026 cost-of-revenue growth
    #    forces OpenAI to push API pricing.

    # Step 3. If you are an enterprise buyer, file the question with
    #    procurement: "What fraction of our AI spend routes through
    #    Azure, via OpenAI, and is that the diversification posture we
    #    think we have?" The 2025 financials are the first public
    #    evidence that the answer is "more than you assumed."

    # Step 4. Read both HN threads (48577208, the post-Ars write-up
    #    thread, and 48550465, the prior thread where Ed Zitron first
    #    surfaced the leak). The simonw comment in the 48577208
    #    thread is the explicit pointer between the two. The 48550465
    #    thread is where the "what the people who were paying
    #    attention already knew" framing originates — read both
    #    before you form a position on the 2025 numbers.

Related reads from this blog

Disclosure

Drafted with AI assistance. Primary source: Kyle Orland, "Leaked financial docs show OpenAI is losing billions of dollars a year," Ars Technica, 16 June 2026 — curl -L --compressed, 18 June 2026. Audited figures (revenue $3.7B/$13.07B; R&D $7.81B/$19.18B incl. $10.59B to Microsoft; cost of revenue $2.65B/$7.5B; S&M $1.11B/$5.73B; loss from operations $8.78B/$20.92B; net loss $5B/$39B with ~$8B adjusted 2025 net loss net of a ~$30B for-profit-conversion charge) are from the Ars article, which sourced them from Ed Zitron's leak and the FT's review. 900M weekly active / 50M paid subscribers, $122B round, $852B valuation: same source. HN item 48577208 (197 points, 116 comments at API snapshot) via Algolia HN Search, 18 June 2026. The 237%→160% of revenue, $418/sub/year, and >50% of R&D to Microsoft figures are this blog's arithmetic on the source line items, not direct claims. The "iaaan" / "themafia" / "dylan604" HN comment references are direct quotes from the Algolia API response. The $20-$25/month subscription range is a read of public Plus/Pro/Team pricing, not a verified blended average. The "10% migrate to local" scenario and the 25-50% / 75% local-workflow thresholds are thought experiments that reference the Vicki Boykis piece linked in Related reads, not direct claims. The "Other Bets / DeepMind in the same neighborhood" framing is this blog's directional read of Alphabet's 10-K; the specific 2025 figure should be checked against the filing before quoting. The per-subscriber/per-inference framing in the original-take section is this blog's editorial position.

Sources

  • Kyle Orland, "Leaked financial docs show OpenAI is losing billions of dollars a year," Ars Technica, 16 June 2026 — https://arstechnica.com/ai/2026/06/leaked-financial-docs-show-openai-is-losing-billions-of-dollars-a-year/
  • Ed Zitron, "OpenAI Losses Increased Nearly 8X in 2025, with Spending Hitting $34B," Where's Your Ed At (Zitron's newsletter), 16 June 2026 — the original source of the leak; the HN item 48550465 links to this piece at https://www.wheresyoured.at/exclusive-openai-financials/
  • Hacker News thread (197 points, 116 comments at time of writing; numbers move as the thread ages) on the Ars Technica article, item 48577208 — https://news.ycombinator.com/item?id=48577208
  • Algolia HN Search API metadata for item 48577208 (the source for the point/comment counts and commenter references) — https://hn.algolia.com/api/v1/items/48577208
  • Vicki Boykis, "Running local models is good now," 15 June 2026 (referenced as the "75% threshold" framing for the per-subscriber migration scenario) — https://vickiboykis.com/2026/06/15/running-local-models-is-good-now/

Wednesday, June 17, 2026

RFC 10008: HTTP Finally Has a Method Built for Real Queries

The IETF has been quietly working on a new HTTP method for most of the last decade. RFC 10008, The HTTP QUERY Method, was published in June 2026 — and within hours it had reached the front page of Hacker News at #2. The headline is small. The fix is not. HTTP, the protocol that has carried the web since 1991, has finally been given a method that does what most working developers have been using POST for all along: send a query in the request body without lying about what the operation is. The spec is short, precise prose for a fix that, in retrospect, should have shipped years ago.

The setup: the problem with POST-as-query

For most of the public web's history, the canonical "send a query to a server" pattern has been a GET request with the query string in the URI: GET /feed?q=foo&limit=10&sort=-published. This works fine until the query input gets large, sensitive, structured, or all three. RFC 10008's introduction lists four reasons GET-with-URI-query becomes "problematic" once the input outgrows the URI:

  • URI size limits aren't predictable across proxies, CDNs, and origin servers. RFC 9110 §4.1 recommends 8,000 octets but doesn't require it.
  • Encoding complex data (JSON, GraphQL, SQL) into a valid URI costs overhead and loses structure.
  • URIs get logged everywhere — access logs, browser history, Referer headers, bookmarks. Sensitive inputs end up in places they shouldn't.
  • Every distinct input combination becomes a distinct resource by URL. That makes caching, rate-limiting, and analytics all harder than they should be.

The pragmatic workaround most APIs adopted in the 2010s was to use POST for queries — POST a form body, get a result back. Stripe's POST /v1/charges/search, Algolia's POST /1/indexes/*/queries, GitHub's POST /search/code, every internal /api/search endpoint you've ever built: all POST. The problem is that POST is not safe. RFC 9110 §9.2.1 defines safe methods as those "intended to be read-only." By the spec, sending POST across the wire is asking the server to mutate state — even when the developer and the documentation both agree it isn't. The mismatch between intent and method is the bug. Caches don't cache POST the way they cache GET. CDNs don't replay it the way they replay GET. And the Allow: header, the audit log, the OPTIONS preflight, the rate-limiter's heuristics — none of them have an honest signal to work with.

What RFC 10008 actually defines

The RFC is short, careful, and unusually well-written for an IETF standards-track document. The full text is at datatracker.ietf.org/doc/html/rfc10008; the abstract page is at rfc-editor.org/info/rfc10008/. Three authors: Julian Reschke (greenbytes), James M. Snell (Cloudflare), Mike Bishop (Akamai). Document type: RFC, Proposed Standard. Working group: httpbis.

The mechanism is one new method called QUERY. The canonical example, lifted verbatim from §1 of the spec:

QUERY /feed HTTP/1.1
Host: example.org
Content-Type: application/x-www-form-urlencoded

q=foo&limit=10&sort=-published

That looks almost identical to the POST example. The differences are all in what the method promises:

Property GET QUERY POST
Safe yes yes potentially no
Idempotent yes yes potentially no
URI for query itself yes optional (Location) no
URI for query result optional (Content-Location) optional (Content-Location) optional (Content-Location)
Cacheable yes yes yes, only for future GET/HEAD
Request content "no defined semantics" expected expected

The table is the whole story. QUERY takes everything useful from POST (request body with structured content) and everything useful from GET (safety, idempotency, cacheability, replayability) and gives you a method that actually matches what you've been doing.

The spec also defines one new response header: Accept-Query. Servers return it to advertise which query media types they accept on a given resource. The RFC example is Accept-Query: "application/jsonpath", application/sql;charset="UTF-8". This is the counterpart to Accept: for the request side. JSONPath (RFC 9535, Feb 2024), XSLT, and SQL are all given as worked examples in Appendix A.6 — the spec is opinionated about what a "query format" should look like and points at existing standards rather than inventing a new one.

What QUERY fixes in practice

Three concrete things change once an API can advertise QUERY support:

Caches can finally cache query responses honestly. RFC 10008 §2.7 makes caching legal and well-defined: "The response to a QUERY method is cacheable; a cache MAY use it to satisfy subsequent QUERY requests." The cache key has to incorporate the request content and metadata, not just the URL. That means Varnish, Fastly, Cloudflare, and browser HTTP cache can all hold onto a query result and serve a repeat request without a round trip to the origin — exactly the property POST deliberately does not have. If the server returns a Location: header pointing at an "equivalent resource" URI, the client can switch to plain GET for subsequent traffic and skip the body entirely. This is the architectural payoff.

Cross-origin requests become preflighted honestly. RFC 10008 §4 spells out the security considerations: "A QUERY request from user agents implementing Cross-Origin Resource Sharing (CORS) will require a 'preflight' request, as QUERY does not belong to the set of CORS-safelisted methods." This is the right answer. A POST-as-query from the browser is currently lying to the CORS layer about what it's doing. A QUERY is honest. The preflight cost is a real cost, but it's the cost of doing the right thing on the wire.

The audit log is finally correct. When your reverse proxy, WAF, or API gateway sees a QUERY request, it knows — at the protocol layer, not because of a URL convention — that the operation is read-only and safe to retry. The Allow: header now means something. OPTIONS preflights get a real answer. Logging systems that classify by method get a true positive instead of a false negative.

Why this took 31 years

Appendix B of the RFC, titled "Selection of the Method Name 'QUERY'," is unusually candid about the history. The IANA HTTP Method Registry already contains three other methods that are safe and idempotent: PROPFIND (RFC 4918, 2007), REPORT (RFC 3253, 2002), and SEARCH (RFC 5323, 2008). All three originated in the WebDAV activity. The early drafts of RFC 10008 — it went through 14 versions under the working name draft-ietf-httpbis-safe-method-w-body — used the name SEARCH. The working group eventually picked QUERY for three reasons spelled out in the spec:

  1. The existing methods use a generic XML media type and define their semantics inside the request content. QUERY deliberately does not — it lets the resource pick its own media type and the Accept-Query header advertises support.
  2. The existing methods all originate in WebDAV, which the spec notes "many" in the broader HTTP community have mixed feelings about.
  3. The name QUERY "captures the relation with the URI's query component well" — i.e., it tells you what the method is for without requiring you to know it's the renamed SEARCH.

The fact that the IETF went through 15 drafts over what is conceptually a one-paragraph change tells you something. HTTP is the most-deployed protocol in human history, and changing it costs more than changing anything else on the internet. The fix is small. The review process that produced the fix was enormous.

What RFC 10008 does not do

A short honesty list:

It does not replace GET. GET with a URI query string is still the default for short, cacheable, loggable queries. The RFC is explicit about this — GET is the "common query pattern" and QUERY is the alternative when GET becomes problematic. Roughly: if your query fits in a URL and doesn't carry anything sensitive, GET is still right.

It does not replace POST for non-queries. If the operation mutates state — creates a record, sends an email, triggers a workflow — POST stays POST. QUERY is not a license to relabel every POST in your codebase. Relabeling a state-mutating POST as QUERY is a spec violation; the server is allowed to return 4xx, and clients are allowed to retry it indefinitely. The retries will hit your rate limiter and your audit log and your database in ways you do not want.

It does not specify what the query means. QUERY is media-type-driven. application/sql means SQL. application/jsonpath means JSONPath. application/xslt+xml means XSLT. The RFC does not invent a new query language; it standardizes the carrier. Which media types are interesting is up to the application, and the spec uses XSLT and SQL and JSONPath as examples because those are the formats that already have the necessary shape.

It does not immediately make your API support QUERY. Every origin server, every client library, every CDN, every WAF has to be updated before QUERY becomes useful in production. As of the RFC publication, none of the major servers have shipped QUERY support. Browsers will need CORS-safelist updates. The spec is the legal foundation; the ecosystem rollout is a multi-year project.

The original take: this RFC is small because HTTP is conservative on purpose

The thing the spec gets right that nobody is making explicit: it does almost nothing. There is one new method, one new response header, and one optional request pattern. The IETF spent 14 drafts and roughly a decade of working-group time to ship something that fits in 31 pages and changes three lines of HTTP semantics. That restraint is the story.

HTTP is the protocol that runs the web. Every change to it is paid for by every cache, proxy, CDN, library, browser, and developer who implements it. The cost of a bad change is enormous. The cost of a slow, conservative, one-method-at-a-time process is that the protocol moves slowly. Both costs are real. The fact that QUERY has been "almost done" for a decade is not a failure of the IETF — it is the IETF working as designed. The alternative — a faster process that ships more methods and more headers with less review — is the kind of process that produces security holes and ecosystem fragmentation. The web cannot afford that.

The interesting secondary observation is who shipped this. The author list is greenbytes (a small consultancy run by Julian Reschke, the same person who edited RFC 9110, the core HTTP semantics spec), Cloudflare (Snell), and Akamai (Bishop). This is the CDN layer of the web shipping a spec that makes caching of query responses a first-class operation. That is not a coincidence. The CDN operators are the ones who pay the cost of POST-pretending-to-be-GET in cache misses, in WAF CPU cycles, in origin shield traffic. The QUERY method is, among other things, an admission from the edge operators that the workaround the application layer adopted in 2010 is expensive at the network layer, and the right fix is at the protocol layer.

What this means for you

  • If you ship a public REST API with any POST /search or POST /query endpoints — read RFC 10008 §1 and §2 carefully. The honest answer for many of these endpoints is "switch the method to QUERY and add an Accept-Query: response header." You don't have to wait for server support; you can ship QUERY today with any framework that lets you register custom HTTP methods. Clients that don't understand QUERY will return 501, which is the correct behavior for an unknown method.
  • If you maintain an HTTP client library, server framework, CDN, or WAF — the work is just starting. Method registration in your parser, CORS-safelist policy, cache key derivation that includes the body, audit log categorization by method: all four pieces need a code change. The RFC's Appendix A.4 and A.5 are the test vectors; the Appendix A.6 examples are the integration tests.
  • If you design APIs — the design question changes from "GET-with-query-string or POST-with-body" to "GET-with-query-string, QUERY-with-body, or POST." The QUERY option is now on the table for any read operation whose input doesn't fit in a URL.
  • If you write about web infrastructure — the talking point is not "HTTP has a new method." The talking point is that the most-deployed protocol in human history just shipped a fix for a 30-year-old workaround, and the fix is small on purpose. The protocol layer's conservatism is the feature.

What to do this week

    ## Step 1. Read the spec. The text version isn't on rfc-editor.org yet
    #    (as of 2026-06-17, the /rfc/rfc10008.txt URL returns 404); use
    #    datatracker.ietf.org/doc/html/rfc10008 for the canonical HTML.
    #    The IANA HTTP Method Registry entry for QUERY is the canonical
    #    confirmation that the method is registered.

    ## Step 2. Audit your codebase. Grep for POST endpoints whose intent
    #    is read-only (search, filter, query, lookup, list-by-criteria,
    #    export-where-clause, etc.). The honest classification for these
    #    is QUERY, not POST. The audit output is your migration list.

    ## Step 3. Ship a proof-of-concept. Pick one internal read-only
    #    endpoint, add the method to your server's allowed methods list,
    #    return Accept-Query on OPTIONS, and point a curl at it. The
    #    query below is a simplified JSONPath example shaped like the
    #    real RFC 10008 Appendix A.6 example (which queries RFC errata
    #    by status and submit date); substitute your own path and
    #    filter for the one your team is migrating:
    #
    #      curl -X QUERY 'https://internal.example/api/orders/search' \
    #           -H 'Content-Type: application/jsonpath' \
    #           -d '$..[?@.status=="open"]'
    #
    #    Confirm: 200 OK with the JSONPath result, a Content-Location
    #    header if the resource is cacheable, and a 405 Method Not
    #    Allowed from any path that hasn't been updated.

    ## Step 4. File an issue. If you maintain a framework, server, CDN,
    #    WAF, or client library: file the QUERY method support issue now,
    #    while the spec is fresh. Reference draft-ietf-httpbis-safe-method-w-body-14
    #    if your issue tracker is RFC-version-strict; RFC 10008 if it's
    #    not. The work is months, not weeks.

    ## Step 5. Wait for the ecosystem. Don't ship QUERY to production
    #    for public-facing APIs until at least one major CDN and one major
    #    browser implement CORS preflight and cache-key support. The spec
    #    is the legal foundation; the ecosystem is the deployment surface.

Related reads from this blog

Disclosure

This post was researched with AI assistance: the RFC text was fetched with curl --compressed from datatracker.ietf.org/doc/html/rfc10008 and rfc-editor.org/info/rfc10008/; the trend signal was sourced from the Hacker News front page; cross-references (RFC 9110, RFC 9111, RFC 9535, RFC 4918, RFC 3253, RFC 5323) were confirmed against the IETF datatracker. The synthesis, original-take section, and recommendations are the author's. No quotes in the body are fabricated; the example HTTP exchanges in the body and the Appendix A.6 examples are taken from RFC 10008 directly. The note that rfc-editor.org/rfc/rfc10008.txt returns 404 (and that the canonical HTML is on datatracker) was verified live at the time of writing.

Sources

  • RFC 10008, The HTTP QUERY Method — Reschke, Snell, Bishop. June 2026. https://datatracker.ietf.org/doc/html/rfc10008
  • RFC 10008 abstract / info page — https://www.rfc-editor.org/info/rfc10008/
  • IANA HTTP Method Registry (where QUERY is registered) — http://www.iana.org/assignments/http-methods
  • RFC 9110, HTTP Semantics — Fielding, Nottingham, Reschke. June 2022. https://www.rfc-editor.org/info/rfc9110 (the core spec QUERY builds on; defines safe and idempotent)
  • RFC 9535, JSONPath: Query Expressions for JSON — Gössner, Normington, Bormann. February 2024. https://www.rfc-editor.org/info/rfc9535 (the JSONPath query example in RFC 10008 §A.6 references this)
  • Hacker News discussion: "RFC 10008: The new HTTP Query Method" — submitted by schappim, 17 June 2026. https://news.ycombinator.com/item?id=48568502 (82 points, 43 comments at time of writing; numbers moving as the thread ages)

Your Local Model Is a Faster Google (And Now It Loops, Too)

On 15 June 2026, Vicki Boykis published a short, technically clean post on her blog titled "Running local models is good now." The headline is the point. After three years of local model releases that were always six months behind the frontier, Boykis — a working ML engineer who has been on the local-inference side of this since llama.cpp was a weekend project — is willing to say out loud: the gap just closed enough to matter. The "vibe metric" she uses to make the call is the one anyone who has shipped with a local model eventually lands on: do I still need to double-check this against an API model? When the answer stops being "yes, every time," the local model has crossed a threshold. The post is the documented version of that crossing. It is worth reading closely.

The setup: where local models actually are in 2026

Boykis's working stack, on a 2022 M2 Mac with 64 GB of RAM and 1 TB of storage, is the one most engineers who care about local inference end up on: raw llama.cpp with Open WebUI on top, llama-cpp-python, Ollama, llamafiles, and LM Studio as the desktop client. The model list she has been driving is the actual frontier of small open-weights: Mistral 7B (the early one), Gemma 3, OpenAI's OSS-20B, Qwen 3 MOE, and the Qwen 2.5 Coder variants. None of this is exotic. All of it is in the public model registries; all of it runs on hardware a senior engineer can buy off the shelf.

Where Boykis's post is sharp is the inflection point she names. For years, the local tier has been "fast personalized Google" — useful for "what is the syntax for X in library Y" lookups, slow for anything that required sustained reasoning. The release of GPT-OSS, in her telling, was the first time the double-check reflex stopped firing. The latest Google releases, in the Gemma 4 family, are the first time local agentic coding loops "work at about ~75% the accuracy/speed of frontier models." That is the claim, and the claim is what the rest of this post is built on.

What the post is actually demonstrating

Boykis is not benchmarking. She is reporting on a setup she has been running. The specific things she has gotten a local Gemma 4 26B-A4B model running through the Pi agent harness to do:

  • Refactor a Python script from a notebook into a five-or-six-module repo, with a separate pass to clean up generic type hints.
  • Proofread blog posts.
  • Write unit tests.
  • Bootstrap a recommendation-system repo from a blank slate and watch what the agent produces.
  • Build out the surface that scrapes trending topics from arXiv papers.

The "Docker container with limited execution" framing is the one detail that matters most for any reader who is going to try this at work. Boykis runs Pi in a sandbox with bash permissions only — no Python, no web browsing — and plans to add curl in a separate image for the research tasks. The 64 GB K-V cache ceiling on long-context runs is the part she is honest about: local means local, and the hardware bound is real.

The "75%" claim, held up to the light

The number worth arguing with is 75%. It is Boykis's read, not a benchmark number, and the metric she is using is "accuracy/speed of frontier models" — which is a two-axis read on a single subjective scale. Three things are true at the same time:

  1. 75% of frontier is enough for almost all the day-to-day engineering work that does not require a frontier-tier reasoning model. The "personalized Google" use case is the dominant one in any working engineer's day, and a local model that handles it without the API round trip is, in expectation, faster end-to-end.
  2. 75% of frontier is not enough for the 25% that does. The benchmarking, the long-horizon agentic work, the multi-file refactor with architectural judgment, the security-sensitive code review — these still want the frontier. The threshold-crossing is a threshold-crossing for a specific workload, not a general capability ceiling.
  3. The 75% number is the floor, not the ceiling, of this release cycle. Gemma 4 12B-QAT, which Boykis flags at the end of the post, is already the model she is migrating to. Smaller, faster, "without much sacrifice in accuracy." The 75% number is going to move up over the next two release cycles, and the threshold is going to move with it.

The defensible read of the post is not "local models have caught up." It is "the local tier crossed the threshold for the use case that most engineers spend most of their time on, and the threshold is not going to go back down."

What the post leaves out, and what to do about it

Three things Boykis does not say, and that the local-model reader needs to hear:

Hardware is the actual constraint, not model quality. The 64 GB M2 Mac is the floor for the workloads she describes. The 16 GB laptop a junior engineer is running is not. The LM Studio system-requirements page calls 16 GB the recommended minimum; the working note for 8 GB Macs is to stick to smaller models and modest context sizes. The 75% number does not transfer to the 16 GB tier. The realistic expectation for a 16 GB M-series laptop is Gemma 3 4B and Qwen 2.5 Coder 7B at modest context, and the workloads that work at that size are the personalized-Google ones, not the agentic ones.

The harness is half the product. Pi in read-only mode is doing a lot of the work Boykis credits to the model. The local model is producing the tokens; the agent harness is producing the file structure, the test scaffolding, the import graph. If you swap the harness — Aider, Claude Code pointed at the local endpoint, OpenHands — the same model produces a different 75%. The post is a "local model + Pi + LM Studio + Docker sandbox" report, not a "local model" report. The stack is the unit of analysis.

The "introspect everything" angle is the underrated one. The closing of Boykis's post is the part that should land hardest for the developer-tools audience. With a local model, you can watch the token inference in real time, change the context window and watch the performance move, swap the quantization, swap the system prompt, swap the model entirely, and see what each swap does to the output. That is not a debugging story. It is a learning story. The frontier API is a black box; the local stack is not. For an engineer who is trying to develop intuition for what these models actually do, the local setup is the only one that gives you the loop.

The original take: the 75% threshold is a labor-market signal, not a model-quality signal

The thing nobody is making explicit: the local-tier model crossing the 75% threshold is a signal about what counts as a "developer job" in 2026, not a signal about model capability. The model has not "caught up" in any objective sense — frontier models are still frontier, and the gap on the long tail of agentic and reasoning workloads is real. What has changed is that the work that is actually most of a working engineer's day — the syntax lookup, the test scaffold, the lint pass, the blog post proofread, the bootstrap — has been moved out of the "requires a human engineer" column and into the "requires a 75%-of-frontier model" column. That reclassification is permanent. It does not reverse when the next model release lands.

The labor-market consequence is the part the post does not make. When a single engineer with a 64 GB laptop and a local model can ship the work that used to take a team, the question is not "are local models good enough." The question is "what is the team for." The 75% threshold is the point at which the team-shape question becomes the question, full stop. The engineers who can name the workloads that benefit and the workloads that don't are the ones who will be hirable through the transition. The engineers who are still benchmarking local models against frontier models in 2027 are the ones who missed the reclassification.

What this means for you

  • If you are an engineer who has not run a local model yet — pick the smallest model that fits your hardware (Gemma 3 4B or Qwen 2.5 Coder 7B on a 16 GB laptop; Gemma 4 26B-A4B or gpt-oss-20B on a 64 GB desktop) and run a personalized-Google workflow through it for a week. The point is to develop intuition for what the threshold is on your hardware, not to beat a benchmark.
  • If you are a tech lead making tooling decisions — the question is not "do we buy frontier API access." The question is "which workflows do we run locally, which we run on frontier, and which we run on a small fine-tune." The 75% threshold means a meaningful slice of the answer is "local." The cost model and the data-handling model both change.
  • If you are evaluating agent harnesses — the local stack is the right place to do the comparison. Swap the model, keep the harness; swap the harness, keep the model; look at the diff. The harness matters as much as the model. Pi is not the only option; it is the one Boykis is using, and it is worth trying.
  • If you are writing about local models — the right unit of analysis is the stack, not the model. "Gemma 4 12B on a Mac M2 with Pi in Docker" is the real subject. "Gemma 4 12B" is not.

What to do this week

    ## Step 1. Install LM Studio (skip if you have it). System requirements
    #    are documented at https://lmstudio.ai/docs/app/system-requirements:
    #    - macOS: Apple Silicon M1/M2/M3/M4, macOS 14+, 16 GB+ RAM
    #    - Windows: x64 or ARM (Snapdragon X Elite), AVX2 required, 16 GB+ RAM
    #    - Linux: x64 or ARM64, Ubuntu 20.04+, distributed as AppImage
    #    4 GB of dedicated VRAM is the recommended amount for hardware that
    #    has a discrete GPU.

    ## Step 2. Download the model Boykis is migrating to (gemma-4-12b-qat)
    #    if your hardware can run it; otherwise start with gemma-3-4b or
    #    qwen2.5-coder-7b. The exact file you want is in the LM Studio model
    #    browser; pick the Q4_K_M quantization if you are RAM-constrained.

    ## Step 3. Set up the agent harness. Pi is at
    #    https://github.com/earendil-works/pi (formerly badlogic/pi-mono).
    #    The models.json you need to point Pi at LM Studio is in the
    #    primary source. Use docker-compose to run Pi in a sandbox with
    #    bash-only permissions, the same way Boykis does.

    ## Step 4. Run a personalized-Google workflow for a week. Pick a real
    #    task you do every day (a syntax lookup, a test scaffold, a lint
    #    pass, a blog post proofread) and run it through the local model.
    #    The point is not to publish the result. The point is to develop
    #    intuition for what the 75% threshold feels like on your hardware.

    ## Step 5. If the local stack works for you, file the ticket. "Move
    #    this workflow off the frontier API" is a procurement decision, a
    #    cost-of-inference decision, and a data-handling decision. The
    #    ticket is the audit trail; the result of running it for a week
    #    is the input.

Related reads from this blog

Disclosure

Drafted with AI assistance. Primary source: Vicki Boykis, "Running local models is good now," 15 June 2026, https://vickiboykis.com/2026/06/15/running-local-models-is-good-now/ (retrieved 17 June 2026 00:30 UTC+8 via curl -L --compressed; the page body extracted was ~34 KB of rendered HTML). The "75% of frontier" figure is Boykis's read, not a derived benchmark; the model list (Mistral 7B, Gemma 3, OSS-20B, Qwen 3 MOE, Qwen 2.5 Coder), the harness (Pi), the inference client (LM Studio), the hardware spec (2022 M2 Mac, 64 GB RAM, 1 TB storage), the 64 GB K-V cache ceiling, and the Docker-sandbox-with-bash-only pattern are all Boykis's. The "Pi" agent harness is at https://github.com/earendil-works/pi (the canonical repo; the prior badlogic/pi-mono URL is a 301 redirect to it, verified via curl -sI). The LM Studio system requirements (16 GB RAM recommended, macOS 14+ on Apple Silicon, 4 GB VRAM recommended, AppImage on Linux) are from https://lmstudio.ai/docs/app/system-requirements, retrieved 17 June 2026 00:35 UTC+8. The "75% threshold is a labor-market signal, not a model-quality signal" framing in the original-take section is this blog's editorial position, not a claim in the Boykis post. The "Pi in read-only mode" defensive framing is Boykis's; the local-model-as-learning-loop framing in the "what the post leaves out" section is the blog's. The hardware-bound argument (16 GB tier cannot run Gemma 4 26B) is a derived claim from the LM Studio system-requirements document, not a direct quote from Boykis. No quote in the body is presented as a verbatim Boykis sentence; the paraphrases are marked as such. Limit on inference: the "75%" figure is not a benchmark and is not a stable cross-workload metric; treat it as Boykis's reading of her own setup.

Sources

  • Vicki Boykis, "Running local models is good now," 15 June 2026 — https://vickiboykis.com/2026/06/15/running-local-models-is-good-now/
  • LM Studio, "System Requirements" (retrieved 17 June 2026 00:35 UTC+8) — https://lmstudio.ai/docs/app/system-requirements
  • Pi agent harness repository (canonical URL; prior badlogic/pi-mono is a 301 redirect) — https://github.com/earendil-works/pi
  • Google, "Gemma 4 model card" (background on the Gemma 4 family that Boykis is migrating to) — https://ai.google.dev/gemma
  • OpenAI, "gpt-oss-20b model card" (the OSS-20B reference in Boykis's model list) — https://openai.com/index/gpt-oss-20b/ (link returned HTTP 403 as of 17 June 2026; the same model on Hugging Face at https://huggingface.co/openai/gpt-oss-20b is the live reference)
  • LM Studio, "Integrations: Claude Code" (the documented path for pointing Claude Code at a local LM Studio endpoint) — https://lmstudio.ai/docs/integrations/claude-code

Tuesday, June 16, 2026

The Recruiter's Repo. The npm install Was the Backdoor.

The Recruiter's Repo. The npm install Was the Backdoor.

On 15 June 2026, Roman Imankulov published a post-mortem on his own blog at roman.pt describing the most disquieting recruitment-trail attack of the year. A recruiter claiming to represent a "small crypto startup" messaged him on LinkedIn, ran him through a normal-feeling multi-day conversation, then sent a public GitHub repo and asked him to "check out the deprecated Node modules issue." The repo contained a package.json whose prepare script ran node app/index.js, an app/index.js whose very first line was require('./test'), and an app/test/index.js whose ~250 lines hid a URL-assembly routine that built https://rest-icon-handler.store/icons/77 from string fragments and then "ran anything the server sent back to your machine." The post hit Hacker News as item 48546294 and was sitting at 568 points and 109 comments the morning of 16 June 2026. It is not the technical novelty that is new. The novelty is that the delivery vehicle is the hiring funnel, and that an AI coding agent in read-only mode is what caught it.

The attack in one paragraph

The trap is laid in three pieces of code, all in the same repo. package.json declares a prepare script — prepare is a documented npm lifecycle hook that runs automatically after npm install from a local path, a git URL, or a tarball. The script chain is prepareapp:prenode app/index.js. The app/index.js entry point does const test = require('./test') at the top level, which loads the test file as a side effect of being required. And app/test/index.js, disguised as a test suite with "walls of commented-out tests," assembles its C2 endpoint from string fragments — protocol = "https", domain = "store", subdomain = "rest-icon-handler", path = "/icons/", token = "77", etc. — then evaluates whatever the server returns. The deobfuscation step is the part Imankulov did not run; he read enough of the source to stop. The point is that none of the three pieces is, on its own, a flag. A prepare script in a Node project is ordinary. A require of a test module is ordinary. A test file with string concatenation is ordinary. The combination, mounted on the social-engineering rails of a recruiter DMed at you by name, is what is new.

The recruiter's LinkedIn profile belonged to a real arts journalist with no technical background, and the 39 commits in the repo were attributed to a real full-stack developer whose name and email had been used on the platform before — that developer confirmed to Imankulov he had been impersonated on GitHub prior to this incident. The same recruiter DMs land in dev inboxes every week. The campaign Imankulov was targeted by is not a one-off; it is the working shape of a class.

LinkedIn has become the new phishing email — with a better pretext

The Register's 31 March 2026 write-up of an axios compromise is the same shape from a different direction: attackers compromised the npm account of jasonsaayman, the axios primary maintainer, by swapping the account's email for an anonymous ProtonMail inbox and pushing infected packages manually (bypassing the GitHub Actions CI pipeline). The published payload versions were axios@1.14.1 and axios@0.30.4, with a plain-crypto-js@4.2.1 dependency added to drop a cross-platform RAT. The Register's 23 April 2026 write-up of the Boris Vujičić / Genusix Labs incident adds higher-fidelity detail on the same chain: a camera-on Zoom interview, a "live-coding test" that delivers a patch[.]sh shell script under a camera-driver pretext, architecture detection, a Go-based backdoor with custom RC4-encrypted C2, persistence on boot, Chrome password extraction, Keychain exfil, crypto-wallet targeting. The three stories, in chronological order, sketch the same campaign moving up the stack: own the recruiter, own the maintainer, own the package. The 15 June post is the case where the recruiter is the whole attack.

The framing for security teams is: your hiring funnel is now a malware delivery channel, and the threat model that scoped supply-chain risk to "third-party npm packages we audit with Socket / Snyk / npm audit" does not see it. The candidate-side failure mode is npm install && node app/index.js against a repo the candidate has no reason to distrust, in a context where the recruiter is pressuring the candidate to move fast. The employer-side failure mode is "we trust our own recruiters" — which is correct, but is not the trust boundary that matters. The trust boundary is: a candidate will, under reasonable time pressure, npm install whatever a stranger on LinkedIn sends them, and the npm ecosystem's default prepare-script behavior is to make that npm install execute attacker-controlled JavaScript.

The AI-coding-agent angle is the one nobody else is making

The most useful sentence in Imankulov's post is buried halfway through. He notes that running the suspicious repo through an AI coding agent (Pi, in his case) with read-only tools flagged the backdoor in seconds — faster than he could have read it himself, faster than he would have caught it by skimming. This is the defensive force-multiplier that the supply-chain discourse has been under-using. The agent is not a security product. It is, however, a code reviewer that will read every line of every file the candidate was about to run, on demand, in a sandbox, with the recruiter's pressure removed. The "Pi in read-only mode" pattern is the model: any agent that can be given a directory and instructed to summarize what each file does — without executing it, without following imports, without network — collapses the candidate's review time from "however long it takes to read 250 lines" to "however long it takes to read the agent's summary." For candidates being targeted by a LinkedIn-recruiter attack, that is the difference between catching the trap and walking into it.

The second-order angle, which is the one HN's top comment thread is starting to make: this is what an honest AI-coding-agent-assisted security review feels like in 2026. The agent did not "catch malware" in any deep semantic sense. It read the file, summarized what the file did, and the human said "ah, no." That is the realistic ceiling for the agent — a fast, thorough, deterministic first pass, with the human judgment applied to the summary. The agent is the lint, not the auditor. The post should not oversell the role. But the role is real, and the 15 June story is the most widely-cited recent incident where the agent was the reason the backdoor did not run.

The npm prepare footgun is the underlying bug

The mechanism that makes this attack work is npm install executing arbitrary JavaScript out of prepare, preinstall, install, and postinstall scripts. npm's lifecycle documentation describes the behavior; the design has been the same since the early days of the package manager. The flag npm install --ignore-scripts is the opt-out, and it is not the default. The community has known about this for years — the 2018 eslint-scope postmortem, the 2018 event-stream compromise, the 2022 node-ipc / peacenotwar incident, the 2022 colors.js / faker.js maintainer compromise all rode the same lifecycle hooks. (Citations for the four historical incidents are in ## Sources; three of the canonical postmortem URLs were returning 404 as of 16 June 2026 and have been dropped from the body so the post does not carry broken links — the event-stream GitHub issue is the one surviving verified link.) The LinkedIn-recruiter story is a new delivery vehicle for an old footgun.

The 2026-era defensive posture is well-known and not yet standard. npm install --ignore-scripts for the first run on any untrusted repo is the opt-out; turning it on by default in your project's .npmrc is the cheap mitigation. None of the major package managers disable scripts by default — the default is "run whatever the package says." The LinkedIn-recruiter backdoor is a useful forcing function to make that default the wrong default at your team: if your hiring process ever asks a candidate to npm install an evaluation repo, the right policy is --ignore-scripts (or a pre-built sandbox image) and the cost of switching is one config line.

The "report and pray" gap is the systemic problem

The most-quoted HN comment on the 15 June post, from @pants2: "LinkedIn offers no way for $company to disavow users who claim to work for $company." That is the part the post is honest about. Imankulov reported the repo to GitHub and the recruiter to LinkedIn. As of the post's writing, the code was still up; the impersonated developer's complaint was filed. The Vujičić incident in April 2026 followed the same report-and-pray arc, with Vujičić reporting the fake-company repo to npm and GitHub, the Genusix profiles to LinkedIn, the domain to HostGator, and the IP to AbuseIPDB. An HN commenter on the 15 June thread linked to Microsoft's reportfraud.microsoft.com page as a model to copy; the existence of a dedicated abuse-reporting surface with a public response expectation is the part worth noting, even if the specific SLA is not documented in the thread.

The platforms have a reporting workflow and a takedown SLA they publish; the SLA is not the gap. The gap is that the report-to-action pipeline for recruiter-shaped attacks — where the malicious actor is impersonating an employer and using a public repo as the C2 trigger — does not have a category, so the report sits in the generic abuse queue while the attack keeps running.

The original take: the vulnerability is in the hiring funnel, not in npm

The defensible original framing, which the post itself does not quite make: the LinkedIn-recruiter backdoor is a vulnerability in the hiring funnel, not in the package manager. The npm prepare footgun is a known quantity. The recruitment delivery vehicle is the new thing. The threat model that catches it sits at the recruiter / HR layer, not the developer layer, and most security teams do not have a "how are candidates being asked to install code from us" review in their threat model at all.

The right fixes are at the recruiter layer. Companies that run live-coding take-homes should publish a pre-built sandbox image (Devbox, GitHub Codespaces, Daytona, or a docker compose up) and tell candidates explicitly in writing that they are not expected to clone-and-install the company's repo. Recruiters should be trained to never send a candidate a public GitHub repo to clone and run, and the training should be measured — the next incident is a training-failure metric, not a security-team incident. The candidate-facing message should be the inverse of the LinkedIn recruiter's pressure: slow down, do not install, ask for a sandbox. The candidate-side defensive posture — pnpm, --ignore-scripts, an AI-coding-agent first pass — is the bottom of the stack, and it should be on. The recruiter-side fix is the top of the stack, and it is the part the security industry has been leaving to HR.

The procurement framing: the cost of this attack succeeding is not "the candidate's laptop got owned" — that is a Tuesday. The cost is that the recruiter, the company, and the platform each have plausible deniability, and the candidate bears the loss. The fix is to make the process the attack surface, and to put the security review on the process before the recruiter DMs the next candidate. The threat model your security review still uses is the threat model that misses this.

What this means for you

  • If you are a candidate being asked to "check out our repo" by a recruiter — do not npm install it. Read the source in a read-only AI agent, or in less, or in a throwaway Hetzner box. The recruiter's pressure is the attack. Slow down; that is the defense. If the company will not give you a sandbox image, the company is the wrong company.
  • If you are a recruiter sending a take-home to a candidate — switch to a pre-built sandbox image (Devbox, GitHub Codespaces, Daytona) and document it in the take-home brief. The cost of the switch is one config file. The cost of not switching is that the next "I cloned your repo" incident is your company's name on the post.
  • If you maintain a Node project and accept outside contributions / outside test runs — set ignore-scripts=true in .npmrc for the test environment. The flag exists, the flag is one line, and the flag is the difference between "the candidate ran our CI in a clean environment" and "the candidate's laptop got owned by a prepare script we shipped in 2023 and forgot about."
  • If you run a security team — add "how are candidates being asked to install our code" to the threat-model review. The npm-audit / Snyk / Socket posture does not catch this; the threat is upstream of the install, in the recruiter-channel, and the right defensive surface is the process, not the package.
  • If you write or maintain a code-review agent — the "read-only first pass" is the right shape. The agent that catches this backdoor is the agent that reads the file, summarizes the suspicious lines, and stops. The agent that catches it by running the file is the agent that is now part of the C2 chain.

What to do this week

# 1. Set the default for any non-production install on your machine.
#    This is the single most useful one-line change you can make
#    today, and it is the bottom of the stack that catches the
#    LinkedIn-recruiter backdoor before any other defense fires.
echo 'ignore-scripts=true' >> ~/.npmrc
#    Verify with: npm config get ignore-scripts
#    This does not change anything for projects that depend on a
#    prepare/postinstall step to build (some still do); for those,
#    use --ignore-scripts=false on the one install that needs it.

# 2. If you maintain a take-home, switch the candidate-facing
#    instructions to a sandbox image. The minimum viable version:
#    a Dockerfile that pins the Node version, copies the repo,
#    and runs the test command. The candidate runs:
#       docker compose up
#    instead of:
#       git clone <url> && cd <repo> && npm install
#    The Hetzner + Pi + read-only-tools pattern from Imankulov's
#    post is the same idea, lower-fidelity, single-use. Use it.

# 3. If you are evaluating an AI-coding-agent's security-review
#    value, the right test is: point it at a public repo you
#    do not know, ask it to summarize every file in the repo
#    and flag anything that looks like a lifecycle-script exploit,
#    a require-chain that loads an unexpected file, or a URL
#    constructed from string fragments. The agent that catches
#    the synthetic version of the 15 June trap in under a
#    minute is the agent that catches the real one. The agent
#    that does not is the agent you do not want as your
#    first-pass reviewer.

# 4. If you are a security lead, file a Jira / Linear ticket
#    titled "hiring-funnel threat model" with a single line:
#    "How are candidates being asked to install code from us?"
#    The next recruiter-shaped attack is a question of when, not
#    if. The ticket is the audit trail that the question was
#    asked. The answer is the policy that catches the next one.

# 5. Read the Imankulov post end to end. The technical walkthrough
#    is short; the social-engineering context is the part that
#    will change how you read recruiter DMs for the next quarter.
#    The Wayback Machine has the canonical copy at:
#    https://web.archive.org/web/20260615230051/https://roman.pt/posts/linkedin-backdoor/
#    (the live page was last-modified 2026-06-15 20:28:55 UTC,
#    ~28 minutes after the HN submission went live; the
#    Wayback snapshot from 20260615230051 has the correct content)

Related reads from this blog

Disclosure

Disclosure: Drafted with AI assistance. Primary source: Roman Imankulov, "A backdoor in a LinkedIn job offer," https://roman.pt/posts/linkedin-backdoor/, published 15 June 2026 (last-modified 2026-06-15 20:28:55 UTC, verified via curl -I); a Wayback Machine snapshot is retained at https://web.archive.org/web/20260615230051/https://roman.pt/posts/linkedin-backdoor/ for readers hitting the page in a state of flux. HN thread: item 48546294, submitted by @lwhsiao on 15 June 2026, 568 points and 109 comments as of 16 June 2026 08:00 UTC+8 (counts moving; fact-check pass retrieved 568 points on 16 June 2026 00:17 UTC). Secondary sources: The Register, "Top npm package backdoored to drop dirty RAT on dev machines" (axios jasonsaayman account compromise via email swap, payload versions axios@1.14.1 and axios@0.30.4 plus plain-crypto-js@4.2.1, 31 March 2026, https://www.theregister.com/security/2026/03/31/top-npm-package-backdoored-to-drop-dirty-rat-on-dev-machines/5219910); The Register, "Dev targeted by sophisticated job scam: 'I let my guard down, and ran the freaking code'" (Boris Vujičić / Genusix Labs, 23 April 2026, https://www.theregister.com/security/2026/04/23/dev-targeted-by-sophisticated-job-scam/5226263). The package.json prepare-script lifecycle hook and the npm install --ignore-scripts flag are documented at https://docs.npmjs.com/cli/v8/using-npm/scripts#life-cycle-scripts. The assembled URL https://rest-icon-handler.store/icons/77, the rest-icon-handler.store C2 domain, the 39 impersonated GitHub commits, the real-arts-journalist recruiter profile, and the agent-as-defensive-reviewer framing (Pi, read-only tools) are all Imankulov's. Conflict-of-interest note: the Imankulov post is a first-person incident write-up; he is the targeted candidate, the discoverer of the backdoor, and the author of the technical analysis. The framing of the backdoor as "the npm install was the backdoor" is editorial compression, not a direct quote. The HN comment from @pants2 on "LinkedIn offers no way for $company to disavow users who claim to work for $company" is summarized from the thread; the exact wording is on the HN page. The --ignore-scripts=true recommendation, the "pre-built sandbox image (Devbox, GitHub Codespaces, Daytona)" recommendation, and the "the threat model that catches this is at the recruiter layer, not the developer layer" framing are this blog's editorial position, not a direct prescription from Imankulov or The Register. Corrections from the first draft of this disclosure (applied 16 June 2026 morning): an earlier draft misattributed the 31 March 2026 axios compromise to maintainer Josh Junon via a 2FA-reset phish, and conflated the debug and chalk (~2B weekly downloads) compromise of 2025 with the axios payload. The Register's 31 March 2026 article attributes the axios compromise to maintainer jasonsaayman via an email-swap, with the listed payload versions above. Limit on inference: the C2 payload that rest-icon-handler.store would have served was not retrieved by Imankulov and was not retrieved for this post; the characterization "runs anything the server sends back to your machine" is Imankulov's read of the URL-construction code, paraphrased from his post. The current state of the malicious GitHub repo and the recruiter's LinkedIn account is taken from Imankulov's post; the post states the code is still up but does not state the recruiter account's current status, and no independent verification was attempted.

Sources

  • Roman Imankulov, "A backdoor in a LinkedIn job offer," 15 June 2026 — https://roman.pt/posts/linkedin-backdoor/ (canonical copy at https://web.archive.org/web/20260615230051/https://roman.pt/posts/linkedin-backdoor/)
  • Hacker News, item 48546294, "A backdoor in a LinkedIn job offer" — https://news.ycombinator.com/item?id=48546294 (point/comment counts moving; latest 568/109 in disclosure, fetched 16 June 2026 00:17 UTC)
  • The Register, "Top npm package backdoored to drop dirty RAT on dev machines," 31 March 2026 (axios / jasonsaayman / 2 versions: axios@1.14.1 + axios@0.30.4) — https://www.theregister.com/security/2026/03/31/top-npm-package-backdoored-to-drop-dirty-rat-on-dev-machines/5219910
  • The Register, "Dev targeted by sophisticated job scam: 'I let my guard down, and ran the freaking code,'" 23 April 2026 (Boris Vujičić / Genusix Labs / patch[.]sh) — https://www.theregister.com/security/2026/04/23/dev-targeted-by-sophisticated-job-scam/5226263
  • npm CLI documentation, "npm scripts — life cycle scripts" — https://docs.npmjs.com/cli/v8/using-npm/scripts#life-cycle-scripts
  • Snyk, "event-stream incident analysis" (background on the 2018 npm prepare-script pattern) — https://snyk.io/blog/event-stream-vulnerability/ (link was 404 as of 16 June 2026; referenced from the body, kept as a text mention for future correction when Snyk republishes the URL)
  • ESLint, "Postmortem for malicious package publishes" (2018 eslint-scope incident, same lifecycle-hook pattern) — https://eslint.org/blog/2018/07/26/postmortem-for-malicious-package-publishes (link was 404 as of 16 June 2026; keep as text mention, re-verify on the ESLint blog before next republish)