Trending Tools, Models & APIs — Cihangir Bozdogan
Top 30 · this week
Trending for engineers
Tools, models, APIs & resources gaining traction · infra, AI, backend, devops.
Tools · 12
- 01
OpenAI Codex CLI
agent-cli
OpenAI's official Rust-based coding agent CLI — 350+ stars added this week with the GPT-5.5 release.
Codex's repo is now near the top of the daily Rust trending list, mirroring OpenAI's push to make Codex a first-class developer surface alongside the GPT-5.5 launch. New automations, scheduled triggers, and a plugins/skills system shipped in the academy docs the same week. Zed Industries also pushed codex-acp, an ACP wrapper that lets Zed editors host the Codex agent directly.
- 02
Hermes Agent
agent-framework
NousResearch's open agent framework — 19,000+ stars added this week, the runaway viral repo of the cycle.
Hermes is Nous's bet on "the agent that grows with you": a long-running agent runtime with persistent memory, built around their Hermes model line but provider-agnostic. Velocity is the story: 19K stars in seven days puts it among the fastest-growing AI repos of the year. Worth watching even if you don't use Nous models, since the harness pieces (memory, tool selection, persistence) are the parts most teams keep rebuilding.
- 03
OpenAI Agents Python
agent-framework
OpenAI's official Python SDK for multi-agent orchestration — 3,300+ stars this week, GA for production use.
openai-agents-python is OpenAI's blessed alternative to LangChain-style frameworks: handoffs, sessions, tool-calling, and built-in tracing into the OpenAI dashboard. The repo's velocity jumped after Codex Automations and the GPT-5.5 launch made the agent path the default in OpenAI's developer story. It is lighter than LangGraph and built around the OpenAI runtime, though it supports any LiteLLM-compatible backend.
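The handoff idea is easy to picture without the SDK: a triage step routes each request to a specialist agent, while a session object accumulates shared history. A minimal plain-Python sketch of that shape — the names and keyword routing below are illustrative stand-ins, not the SDK's API, where an LLM would do the triage:

```python
# Sketch of the handoff + session pattern that agent SDKs formalize.
# Plain Python for illustration; a real triage agent would ask a model
# which specialist should take the turn.
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    handles: set[str]  # topics this specialist accepts

    def run(self, message: str) -> str:
        return f"[{self.name}] handled: {message}"

@dataclass
class Session:
    # Shared conversation state that survives across handoffs.
    history: list[tuple[str, str]] = field(default_factory=list)

    def record(self, agent_name: str, result: str) -> None:
        self.history.append((agent_name, result))

def triage(message: str, specialists: list[Agent], fallback: Agent) -> Agent:
    # Keyword matching stands in for model-driven routing.
    for agent in specialists:
        if any(topic in message.lower() for topic in agent.handles):
            return agent
    return fallback

billing = Agent("billing", {"invoice", "refund"})
support = Agent("support", {"bug", "crash"})
general = Agent("general", set())
session = Session()

for msg in ["I need a refund", "the app crashed", "hello"]:
    agent = triage(msg, [billing, support], general)
    session.record(agent.name, agent.run(msg))
```

The SDK's value-add over this skeleton is exactly the parts that are hard to hand-roll: model-driven handoff decisions, persistence, and tracing.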
- 04
claude-context
code-context
Zilliz's MCP server for code-aware agent context — 2,800+ stars this week, plug-in for Claude Code.
claude-context indexes a repo into a vector store and exposes it to Claude Code (and other MCP clients) as a structured retrieval tool, so the agent stops re-reading files it has already seen. Zilliz built it on Milvus, but it works against any compatible vector backend. The traction this week reflects how much agent harnesses are now bottlenecked on retrieval, not raw model quality.
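The index-once, retrieve-per-query shape is worth seeing in miniature. In a real deployment the chunks are embedded into a vector store (Milvus in Zilliz's setup) and retrieval is exposed over MCP; in this sketch, token overlap stands in for embedding similarity purely to show the flow:

```python
# Sketch of the chunk -> index -> retrieve loop behind code-context tools.
# Token overlap substitutes for vector similarity; everything here is
# illustrative, not claude-context's implementation.

def chunk(path: str, text: str, size: int = 40) -> list[dict]:
    # Split a file into fixed-size word windows tagged with their source.
    words = text.split()
    return [
        {"path": path, "text": " ".join(words[i:i + size])}
        for i in range(0, len(words), size)
    ]

def build_index(files: dict[str, str]) -> list[dict]:
    index: list[dict] = []
    for path, text in files.items():
        index.extend(chunk(path, text))
    return index

def retrieve(index: list[dict], query: str, k: int = 2) -> list[dict]:
    # Score each chunk by word overlap with the query; return top-k.
    q = set(query.lower().split())
    scored = sorted(
        index,
        key=lambda c: len(q & set(c["text"].lower().split())),
        reverse=True,
    )
    return scored[:k]

files = {
    "auth.py": "def login(user, password): verify password hash and issue session token",
    "db.py": "def connect(url): open database connection pool with retries",
}
index = build_index(files)          # done once per repo
hits = retrieve(index, "how does password login work")
```

The payoff named in the blurb is visible here: the agent asks the index a question instead of re-reading `auth.py` and `db.py` on every turn.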
- 05
Sail
data-engine
LakeHQ's Rust-native Spark replacement — drop-in PySpark API, single binary, ~133 stars added today.
Sail is a Rust query engine that speaks the PySpark API and Spark Connect protocol, aimed at teams who want to keep their dataframe code but drop the JVM. It compiles to a single binary, claims 4–10x speedups on common workloads, and integrates with Iceberg, Delta, and Hudi. With Polars dominating single-node and DuckDB on analytical work, Sail is the most interesting attempt yet at the distributed-Spark slot.
- 06
TileLang
kernel-dsl
Python-embedded DSL for writing high-performance GPU kernels — used by DeepSeek's TileKernels release.
TileLang lets you write GPU kernels in tile-based Python and compile them to Hopper/Blackwell-grade CUDA. It jumped this week because DeepSeek published TileKernels, an open library of FlashAttention/MoE kernels written entirely in TileLang, alongside the V4 release. If you've been using Triton, this is worth tracking — TileLang gives you more explicit control over tile layout and pipeline scheduling.
- 07
FlashKDA
cuda-kernels
MoonshotAI's CUDA kernels for Kimi Delta Attention — high-performance attention primitives behind K2.6.
FlashKDA is the kernel implementation of the delta-attention variant Moonshot uses in K2.6. Released as a standalone repo, it lets other model developers reuse the kernels for similar architectures. ~390 stars in a week — small in absolute terms but unusually high for a low-level kernel project.
- 08
Roo Code
agent-ide
Open-source autonomous coding agent — fork of Cline with multi-mode prompting and richer context handling.
Roo Code is one of the more credible open coding agents, sitting between full IDE integrations like Cursor and CLI agents like Claude Code. It runs as a VS Code extension, lets users define mode-specific personas, and added support for the new agentic models this week. Velocity is steady rather than viral — the kind of repo that sticks.
- 09
Spinel
language-runtime
Matz's Ruby AOT native compiler — produces standalone binaries from Ruby source, no MRI dependency.
Spinel is Yukihiro Matsumoto's experimental AOT compiler for Ruby, which hit the HN front page this week with 328 points. It targets a subset of Ruby that compiles to a native binary without bundling the MRI interpreter. Still early, and not aimed at full Rails apps, but a notable signal about where Ruby's creator wants the language to go.
- 10
Raylib v6.0
graphics-framework
Major release of the C99 graphics library — six years in, with revamped renderer and platform layer.
Raylib 6.0 is the first major version bump since 2020. The release rewrites the platform layer (now backend-pluggable: GLFW, SDL3, Raylib-native) and modernizes the renderer for current GPUs. It remains one of the cleanest C99 codebases to study. Front-paged on HN with strong engagement.
- 11
Open WebUI
llm-frontend
Self-hosted ChatGPT-style frontend for Ollama and OpenAI-compatible APIs — 134K stars and accelerating.
Open WebUI continues to be the default self-hosted UI for local-LLM rigs, and it's added day-one support for Qwen3.6 and DeepSeek V4 served via Ollama. The project still lacks a truly clean MCP story, but for any team standing up a private Claude/ChatGPT alternative, it's the path of least resistance.
- 12
Langflow
agent-builder
Visual builder for agent workflows — Python-first, open-source, 147K stars, IBM-backed.
Langflow is the visual agent-workflow builder that survived the LangChain reshuffle. It still ships as a drop-in Python package, exposes an API for any flow you build, and added native support this week for OpenAI Codex Automations and DeepSeek V4 endpoints. Useful for teams that want a UI for non-engineers without giving up code-level control.
Models · 8
- 01
DeepSeek V4
frontier-llm
Open-weights reasoning and coding model with 1M-token context, near-frontier benchmarks at a fraction of API price.
DeepSeek's V4 release this week is the dominant model story of the cycle. Pro and Flash variants both ship MoE architectures with bitwise-deterministic kernels at temperature 0 — a first for open-weights at this scale. Practitioners on HN and r/LocalLLaMA report it matches or beats Claude/GPT-5 on long-context coding tasks while undercutting them on token cost. The model card and API docs are notably more readable than the equivalents from larger labs.
- 02
GPT-5.5
frontier-llm
OpenAI's flagship update — faster, cheaper, and pushed into the API and Codex on day one.
GPT-5.5 and GPT-5.5 Pro shipped together with a developer changelog and a system card. Early benchmarks put it at 82% on CyberGym, on par with Anthropic's gated Mythos preview, with the difference that GPT-5.5 is generally available. Rollout in ChatGPT and Codex is staged. Vercel's AI Gateway added it within hours alongside DeepSeek V4.
- 03
Qwen3.6-35B-A3B
moe-llm
Alibaba's MoE flagship — top of the Hugging Face trending board with ~1.5M weekly downloads.
Qwen3.6's 35B-A3B sparse MoE shipped with FP8, GGUF, and NVFP4 variants from the day of release, and within a week it was the most-downloaded image-text-to-text model on Hugging Face. Community KV-cache quantization tests show it holds quality at q4_0, putting flagship-grade behavior within reach of single-GPU rigs. The 27B dense sibling Qwen3.6-27B is the one cited as matching Sonnet 4.6 on planning tasks.
- 04
Kimi K2.6
agent-llm
Moonshot AI's K2 line refresh — agentic coding focus, image-text-to-text, ~290K weekly downloads.
Moonshot's K2.6 update lands as a multimodal agent model with strong tool-calling and code generation. The release pairs with Moonshot's FlashKDA repo (CUDA kernels for Kimi Delta Attention) which made GitHub trending the same week. Vercel added K2.6 to the AI Gateway on April 20.
- 05
Gemma 4 31B-it
open-llm
Google's open-weights workhorse — 31B instruction-tuned variant with 5.7M downloads, broad VLA tooling.
Gemma 4 has had the cleanest momentum of any Google open release in two years. The 31B-it variant tops the all-time-downloads board for the family, and Hugging Face shipped a Gemma 4 VLA demo on the Jetson Orin Nano Super this week — a useful signal that the model is being treated as a robotics/embedded option, not just a chat model. The smaller E4B tier is also widely re-tuned by the community.
- 06
GLM-5.1
open-llm
Zhipu's GLM update — text-generation, ~218K weekly downloads, strong showing on reasoning leaderboards.
GLM-5.1 from zai-org is the next iteration of the GLM line and trended above 1,500 likes on Hugging Face within days. It's gaining adoption on local-LLM rigs as a reasoning alternative to Qwen, and is one of the models reflected in the Qwopus-GLM merged checkpoints showing up in the trending list. Less hype than DeepSeek/Qwen but solid quality at the size.
- 07
MiniMax-M2.7
open-llm
MiniMax's M2 refresh — text-generation focus, ~477K weekly downloads, growing footprint outside China.
MiniMax-M2.7 landed quietly but is now firmly in the top trending tier on Hugging Face. The release continues MiniMax's bet on long-context, efficient inference, and several community variants, including distillations of Claude Opus reasoning traces, are showing up alongside it. Worth watching if DeepSeek's price pressure forces other Chinese labs to release frontier-class weights.
- 08
Hunyuan Hy3-preview
reasoning-llm
Tencent's 295B A21B reasoning + agent model — preview release, strong cost/perf for its size class.
Hy3-preview is a 295-billion parameter MoE with 21B active. Tencent positions it as a leading reasoning and agent model in its size class, with cost efficiency as the headline pitch. The HF repo and the GitHub mirror both gained traction this week as people tested it against DeepSeek V4 and Qwen 3.6.
APIs & Services · 7
- 01
Vercel AI Gateway
ai-gateway
Unified inference gateway — added GPT-5.5, DeepSeek V4, Kimi K2.6, and GPT Image 2 in five days.
Vercel's AI Gateway shipped four new model integrations between April 20 and 24: Kimi K2.6, GPT Image 2, DeepSeek V4, and GPT-5.5. The product is now a credible OpenRouter alternative for teams already on Vercel, with built-in observability, BYOK fallbacks, and per-request routing. Usage on the gateway has been doubling roughly every quarter since it went GA in August 2025.
- 02
DeepSeek API
model-api
DeepSeek's hosted V4 API — million-token context, deterministic kernels, headline-undercutting pricing.
DeepSeek opened the V4 API the same day the weights dropped. The headline numbers — million-token context, bit-exact determinism at temperature 0 — are unusual at this scale, and the docs are widely praised for being clearer than the equivalents at OpenAI or Google. Combined with the price, it's the API teams running long-context agents are testing this week.
- 03
OpenAI GPT-5.5 API
model-api
GPT-5.5 and GPT-5.5 Pro live in the OpenAI API, including via Codex and the Responses surface.
GPT-5.5 hit the API at launch (unusual for OpenAI), with both standard and Pro tiers, plus immediate Codex integration. The system card and changelog were published the same day, and OpenAI's docs are pushing the Responses API and tool-use surface as the recommended path forward. Anecdotally, latency is meaningfully better than GPT-5 at similar quality.
- 04
Google Cloud TPU v8 (Ironwood)
inference-cloud
Google's eighth-generation TPUs — "two chips for the agentic era", positioned for agent-style inference patterns.
Google announced its eighth-generation TPUs framed explicitly around agentic workloads — long sequential calls, tool use, and stateful sessions rather than one-shot inference. The pitch lands credibly because Gemini 3 already uses 5–10x fewer tokens per response than competing models, and Google attributes that efficiency to the chip pipeline. Available in Google Cloud first, broader Vertex availability rolling out.
- 05
Cloudflare Agents Week 2026
agentic-cloud
Cloudflare's annual Agents Week — durable agent runtimes, KV bindings, sandboxes, and bot/human routing.
Agents Week 2026 collected ~25 launches under the "agentic cloud" banner, covering Workers Agents primitives, sandboxes, and a more aggressive bot-management story ("moving past bots vs. humans"). The most operationally interesting piece was the panic-and-abort recovery system added to Rust Workers via wasm-bindgen, which makes long-running agent code on Workers genuinely production-shaped.
- 06
OpenAI Codex Automations
agent-platform
Codex tasks now run on schedules and triggers — recurring report and summary jobs without manual prompts.
OpenAI shipped Automations alongside GPT-5.5: scheduled and triggered Codex tasks that run unattended and write back to repos, docs, or webhooks. Combined with the new plugins/skills surface in Codex, this is OpenAI's serious answer to Anthropic's Claude Code skills ecosystem. Useful for cron-style team workflows that don't need a human in the loop.
- 07
OpenRouter
model-router
Multi-provider model gateway — every major release this week landed on OpenRouter within hours.
OpenRouter remains the cleanest "one API key for everything" option for hobbyists and prototypes — DeepSeek V4, Qwen 3.6, Kimi K2.6, GPT-5.5, and Gemma 4 are all routable behind a single OpenAI-compatible endpoint. The leaderboard at openrouter.ai/rankings is the most direct "what are devs actually calling right now" signal in the ecosystem.
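"One API key for everything" means one OpenAI-compatible request schema where only the model slug changes per provider. The sketch below only constructs the request rather than sending it, and the model slugs are illustrative; check openrouter.ai/models for the current catalog:

```python
# Building (not sending) an OpenAI-compatible chat request for OpenRouter.
# Swapping providers is just a different model slug; slugs are illustrative.
import json

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def chat_request(model: str, prompt: str, api_key: str) -> tuple[dict, bytes]:
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return headers, body

# The same function covers every routed model; nothing else changes.
for slug in ("deepseek/deepseek-v4", "openai/gpt-5.5", "qwen/qwen3.6"):
    headers, body = chat_request(slug, "ping", api_key="sk-or-...")
```

To actually call the endpoint you would POST `body` with `headers` to `OPENROUTER_URL` using any HTTP client.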
Resources · 3
- 01
An update on recent Claude Code quality reports
postmortem
Anthropic's engineering postmortem on the Claude Code quality dip — what broke, why, how it shipped.
Anthropic published a detailed engineering writeup on the Claude Code quality regressions users had been reporting for weeks. The root cause: a March 26 change meant to clear stale thinking from idle sessions ended up clearing it every turn for the rest of the session, silently degrading reasoning quality across long runs. The post is the most-read engineering writeup of the week and a useful reference for anyone running stateful LLM systems — particularly the discussion of why their existing eval suite missed it.
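The failure shape generalizes to any stateful LLM system and is easy to reproduce in miniature: a cleanup written for idle sessions loses its guard and clears working state every turn, and a single-turn eval cannot tell the healthy and broken paths apart. A hedged sketch of the bug class (illustrative names, not Anthropic's code):

```python
# Sketch of the bug class the postmortem describes: idle-session cleanup
# that fires on every turn. All names and thresholds are illustrative.
import time

IDLE_THRESHOLD_S = 600

class Session:
    def __init__(self) -> None:
        self.thinking: list[str] = []      # accumulated reasoning state
        self.last_active = time.monotonic()

    def maybe_clear_stale(self, buggy: bool) -> None:
        idle = time.monotonic() - self.last_active > IDLE_THRESHOLD_S
        # The buggy path effectively drops the idle check, so "stale"
        # cleanup runs unconditionally and silently discards reasoning.
        if buggy or idle:
            self.thinking.clear()

    def turn(self, step: str, buggy: bool) -> None:
        self.maybe_clear_stale(buggy)
        self.thinking.append(step)
        self.last_active = time.monotonic()

def run(n_turns: int, buggy: bool) -> int:
    s = Session()
    for i in range(n_turns):
        s.turn(f"step {i}", buggy)
    return len(s.thinking)   # how much reasoning state survived the run

# A one-turn eval sees identical behavior; only a long run separates them.
assert run(1, buggy=False) == run(1, buggy=True) == 1
healthy, broken = run(20, buggy=False), run(20, buggy=True)
```

The short-run/long-run gap at the end is the point: an eval suite built from short sessions passes on both code paths, which is one plausible reading of why the regression went undetected.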
- 02
I am building a cloud
long-form
David Crawshaw on what's actually wrong with Cloud 1.0 — and why he's building something different at Tailscale.
Crawshaw (Tailscale cofounder) lays out what he sees as the fundamental ergonomic failures of the AWS/GCP/Azure model: defaults that punish small teams, IOPS budgets a tenth of a laptop's, and Kubernetes as "high-quality lipstick on a pig." Top of HN with 1,000+ points and an unusually high-quality comment thread. Worth reading even if you disagree with the conclusions — the framing of the problem is sharper than the average cloud-skeptic essay.
- 03
Sabotaging projects by overthinking, scope creep, and structural diffing
long-form
Kevin Lynagh on the failure modes of senior engineers — and why "better is good" beats "perfect later."
Lynagh names three reliable ways engineers sabotage their own projects: overthinking, scope creep, and obsessive structural diffing against an idealized version that doesn't exist yet. The piece is short, specific, and pulls from his consulting work. Comments on HN are unusually substantive — including a useful Obama quote ("better is good") that captures the post's thesis better than most of the writing about incremental delivery does.