LogRocket

LogRocket’s The Replay (11/26/25) is a curated weekly newsletter highlighting an AI “reality check,” Prisma ORM’s v7 changes (moving from Rust to TypeScript), the importance of offline-first UX, Angular v21 updates (experimental Signal Forms), and a tip on “caveman compression” to reduce token usage, with links and subscription info.

Estuary

Vendor-neutral guide detailing key considerations for migrating off Fivetran: inventorying connectors and downstream dependencies, choosing ingestion patterns (batch/CDC/streaming), planning snapshots/backfills and CDC switchover, dual-run validation, handling schema evolution, and preparing monitoring and operational runbooks before cutover.

Grab

Grab describes Coban, a platform for real-time Kafka stream data-quality monitoring that lets teams declare data contracts (schemas and field-level semantic rules), auto-transforms those contracts into FlinkSQL tests, detects syntactic and semantic violations in real time, and provides observability and alerting via Genchi, Slack, and S3 sinks.
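
The contract-to-test idea above can be sketched in a few lines. This is a hypothetical illustration, assuming a simple (field, predicate) rule format and generic SQL output; Grab's actual contract schema and generated FlinkSQL are not shown in the post.

```python
# Hypothetical sketch: compiling a declared field-level semantic rule into a
# SQL test that counts violating rows in a stream. The contract format and
# the exact FlinkSQL emitted by Coban are assumptions, not Grab's real ones.
def rule_to_sql(topic: str, field: str, predicate: str) -> str:
    """Render one semantic rule as a violation-counting query."""
    return (f"SELECT COUNT(*) AS violations FROM {topic} "
            f"WHERE NOT ({field} {predicate})")

# A toy contract: two field-level rules on a hypothetical bookings stream.
contract = [("fare_amount", ">= 0"), ("country_code", "IN ('SG','MY','ID')")]
for field, pred in contract:
    print(rule_to_sql("bookings", field, pred))
```

A nonzero violation count would then feed the real-time alerting sinks the post describes.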

Dolthub

The author describes how Claude Code sometimes writes to the wrong Dolt branch because dolt checkout behaves differently when a Dolt SQL server is running, recounts attempts to fix it via AGENT.md, and reports that adding an MCP HTTP interface makes branch selection explicit for agents. The piece argues that agent failures reveal confusing tool interfaces that should be fixed.

Arpit Bhayani

A clear tutorial on transformer self-attention explaining the roles of Query, Key, and Value matrices: how to construct Q, K, V from input embeddings using learned weight matrices (Wq, Wk, Wv), compute attention scores (QK^T / sqrt(d_k)), apply softmax to get weights, and combine weighted values to produce outputs. Includes a small numpy example, discussion of projection dimensionality (d_k), and how multi-head attention and dimension choices affect capacity and computation.
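
The mechanics the tutorial walks through fit in a short numpy sketch (toy dimensions chosen here for illustration): project the input into Q, K, V, scale the dot products by sqrt(d_k), softmax row-wise, then take the weighted sum of values.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention: softmax(QK^T/sqrt(d_k)) V."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (seq, seq) attention logits
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # numerically stable softmax
    return weights @ V                               # weighted sum of values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))    # 4 tokens, embedding dim 8 (toy sizes)
Wq = rng.normal(size=(8, 4))   # learned projections down to d_k = 4
Wk = rng.normal(size=(8, 4))
Wv = rng.normal(size=(8, 4))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 4): one d_k-dimensional output per token
```

Multi-head attention repeats this with several independent (Wq, Wk, Wv) triples and concatenates the per-head outputs.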

Sean Goedecke

Practical advice for engineers to avoid getting blocked: keep multiple tasks available, sequence risky/dependent work early, invest in a stable developer environment (use CI/staging when needed), investigate errors in services you don’t own, cultivate cross-team relationships, and escalate to senior managers when necessary to remove organizational blockers.

All Things Distributed

A forward-looking essay predicting technology trends for 2026+: AI moves into the human loop via companion robots and personalized tutors, generative AI augments rather than replaces developers (creating “renaissance developers”), quantum advances force immediate post-quantum cryptography readiness, defense-driven technologies will reach civilian use faster, and organizations must invest in PQC, talent, and responsible AI deployment.

Spotify

Spotify describes how it built and refined a background coding agent for large-scale code migrations, covering early experiments, a homegrown agentic loop, and adoption of Claude Code. The post focuses on context engineering and prompt design, toolset choices (a verify tool, standardized Git, restricted Bash), lessons for writing prompts and chunking work, and operational trade-offs when scaling LLM-based agents across thousands of repositories.

Uber

Uber Eats built a multilingual semantic search platform using a two-tower Qwen-based embedding model (fine-tuned with Matryoshka Representation Learning), large-scale training (PyTorch, DeepSpeed, Ray), and offline embedding pipelines stored in feature tables. They index billions of candidates with HNSW graphs inside Lucene Plus, use quantization and shard-level k tuning (and embedding-dimension cuts) to trade off cost, latency, and recall, and add pre-filters and micro re-ranking. Productionization includes biweekly model/index refreshes, a blue/green embedding-column pattern, automated validation gates (completeness, backward compatibility, correctness), serving-time checks, gradual rollouts, and automatic rollback to ensure reliability.
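
The embedding-dimension cut that Matryoshka training enables can be sketched as follows. This is a generic illustration of the technique, not Uber's pipeline; the corpus, sizes, and brute-force search here are all toy assumptions.

```python
import numpy as np

def truncate_and_normalize(emb, dims):
    """Keep the first `dims` coordinates of a Matryoshka-trained embedding and
    re-normalize, so cosine similarity remains meaningful at the smaller size."""
    e = emb[..., :dims]
    return e / np.linalg.norm(e, axis=-1, keepdims=True)

rng = np.random.default_rng(1)
full = rng.normal(size=(1000, 768))   # toy corpus embeddings (hypothetical dims)
query = rng.normal(size=(768,))

# Retrieval cost shrinks roughly linearly with dims; MRL training is what
# keeps the leading coordinates informative enough to preserve recall.
for d in (768, 256, 64):
    corpus = truncate_and_normalize(full, d)
    q = truncate_and_normalize(query, d)
    top = np.argsort(-(corpus @ q))[:10]   # brute-force top-10 by cosine
    print(d, top[:3])
```

In production the truncated vectors would feed an ANN index (HNSW in Uber's case) rather than a brute-force scan.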

Twilio

Twilio warns customers about increased phishing and brand-impersonation attempts targeting Twilio and SendGrid users, explains common red flags (fake sender domains, copied branding, lookalike sites, urgent language), lists what Twilio will never request (passwords, 2FA codes, API keys, gift-card/crypto payments), and gives steps to verify messages and remediate (change passwords, rotate API keys, enable 2FA, check account activity, and report suspicious emails to fraud@twilio.com).

Allegro

A first-person account of three takeaways from the 9th World AI Summit: Karen Hao’s warning about the human, environmental and social costs of large AI models; Jason Snyder’s philosophical critique of efficiency and the importance of preserving human agency and “time worth wasting”; and Swaan Dekker’s example of Amsterdam’s human-centric, transparent, future‑proof municipal AI. The author argues for responsible choices — smaller or specialized models, sovereign/local approaches, and design that preserves human judgment and public well‑being.

Lyft

Lyft details evolving LyftLearn by moving offline ML compute from an in-house Kubernetes-based system to a hybrid architecture: AWS SageMaker for offline training, notebooks, and batch jobs, and Kubernetes for low-latency online serving. They built cross-platform base images and compatibility layers to preserve runtime parity, replaced fragile K8s watcher-based state management with EventBridge+SQS, closed gaps in credentials, metrics, hyperparameters, and startup latency (using SOCI and warm pools), and enabled cross-cluster Spark networking. The migration reduced operational complexity, improved reliability, and lowered TCO while keeping user workflows unchanged.

Datadog

Datadog built an eBPF-based File Integrity Monitoring system to capture real-time, high-fidelity file activity with process/container context. To handle more than 10 billion file events per minute they moved much of the filtering into the kernel (using static "approvers" and dynamic "discarders"), added Agent-side rules to reduce outbound traffic, and implemented a two-stage kernel/user-space evaluation—cutting noise by ~94% while preserving detection coverage.
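
The approver/discarder split can be sketched in user-space pseudocode. This is a conceptual sketch only: the rules, event shape, and discarder policy below are invented for illustration, and the real implementation runs the first stage in eBPF maps inside the kernel.

```python
# Two-stage filtering sketch: static "approvers" (known-interesting path
# prefixes) pass events cheaply; dynamic "discarders" learn (process, path)
# pairs that never match any rule and drop them before any further work.
APPROVED_PREFIXES = ("/etc/", "/usr/bin/")   # assumed static approver rules
discarders = set()                            # learned at runtime

def kernel_stage(event):
    path, proc = event["path"], event["comm"]
    if (proc, path) in discarders:
        return None                           # dropped early, near-zero cost
    if not path.startswith(APPROVED_PREFIXES):
        discarders.add((proc, path))          # remember: never interesting
        return None
    return event                              # forwarded for full evaluation

def user_stage(event):
    # finer, more expensive rule evaluation (toy rule for illustration)
    return event["op"] == "write" and event["path"] == "/etc/passwd"

events = [
    {"path": "/tmp/x", "comm": "bash", "op": "write"},       # not approved
    {"path": "/etc/passwd", "comm": "vi", "op": "write"},    # alert
    {"path": "/tmp/x", "comm": "bash", "op": "write"},       # hits a discarder
]
alerts = [e for e in events if (e2 := kernel_stage(e)) and user_stage(e2)]
print(len(alerts))  # 1
```

The payoff is that the overwhelming majority of the 10B+ events/minute never cross the kernel/user-space boundary at all.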

Dropbox

Dropbox’s Dash evolved from a retrieval-first RAG pipeline into an agentic LLM system. To make agents more reliable and efficient, the team focuses on context engineering: consolidating retrieval into a single Dash universal search index, filtering results with an index plus knowledge graph to surface only relevant context, and delegating complex operations (like query construction) to specialized agents. They discuss tradeoffs around MCP tool definitions, token/context-window limits, long-running jobs, and plans to refine user/company memory and smaller/faster models.

Netflix

Netflix describes three practical patterns for integrating its transformer-based Foundation Model into personalization: (1) producing profile and item embeddings and serving them via an Embedding Store (with stabilization and daily/near-real-time refreshes), (2) using the model decoder as a subgraph inside downstream models to eliminate staleness at the cost of added complexity and latency, and (3) fine-tuning the Foundation Model for product-specific objectives with a provided fine-tuning framework. The post discusses implementation details, infrastructure (pretraining cadence, embedding pipelines, feature generation), trade-offs (freshness, compute, SLAs), and ongoing work such as near-real-time embedding inference and model distillation.

Incident.io

A technical deep dive showing how incident.io cut an alerts-listing API P95 from ~5s to ~0.3s by pushing attribute filtering into Postgres using bloom-filter bitmaps (bit(512) with seven hashes), combined with a mandatory 30-day created_at filter. The post explains the original in-memory JSONB filtering bottleneck, compares GIN jsonb indexing vs bloom filters with benchmarks, describes implementation details (ULID pagination, false-positive handling), and explains why bloom filters were chosen.
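
The bit(512)/seven-hash setup can be illustrated in a few lines. The sizes match the post, but the hash derivation below is an assumption for illustration, not incident.io's exact scheme.

```python
import hashlib

BITS, HASHES = 512, 7   # matches the bit(512) / seven-hash setup described

def bloom_positions(value: str):
    """Derive 7 bit positions from one value (hash scheme here is assumed)."""
    digest = hashlib.sha256(value.encode()).digest()
    # take 7 independent 16-bit slices of the digest, reduced mod 512
    return {int.from_bytes(digest[2*i:2*i+2], "big") % BITS for i in range(HASHES)}

def build_filter(attributes):
    """Build one row's bitmap from its attribute values."""
    bitmap = 0
    for attr in attributes:
        for pos in bloom_positions(attr):
            bitmap |= 1 << pos
    return bitmap

def might_contain(bitmap, attr):
    # all 7 bits set -> probably present; any bit clear -> definitely absent
    return all(bitmap >> pos & 1 for pos in bloom_positions(attr))

row = build_filter(["team=payments", "severity=critical"])
print(might_contain(row, "team=payments"))   # True (no false negatives)
print(might_contain(row, "team=search"))     # probably False (false positives possible)
```

In Postgres this corresponds to a bit(512) column per row ANDed against the query's mask in the WHERE clause; the bounded false-positive rate is why the post pairs the filter with a recheck step.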

Expedia

The article recounts how a Kafka Streams app consuming two topics with matching partition counts failed to colocate same-index partitions because the inputs landed in separate sub-topologies, breaking an in-memory cache optimization. By replacing the local Guava cache with a shared Kafka Streams state store attached to both branches, the topology was unified, same-index partitions were colocated, redundant external API calls were eliminated, and performance improved. The key lesson: sub-topology design in Kafka Streams affects partition assignment and cache locality, so architects must design processing graphs deliberately when cross-topic coordination is required.

Indeed

Indeed's Ranking Models team evaluates which metrics to use for online modeling experiments. The post compares model-performance metrics (Normalized Entropy, ROC-AUC, calibration errors, nDCG) and product metrics (engagement, outcomes, relevance, revenue), discusses misalignment and dilution caused by system design and online/offline gaps, and recommends prioritizing product metrics for decisions while enforcing guardrails on model performance (NE, ROC-AUC) and monitoring calibration for production models.
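
Normalized Entropy, one of the guardrail metrics named above, is the model's average log loss divided by the log loss of always predicting the empirical positive rate; values below 1.0 beat the base rate. A minimal sketch (toy labels and predictions invented for illustration):

```python
import math

def normalized_entropy(labels, preds):
    """NE = model log loss / log loss of constantly predicting the base rate.
    < 1.0 means the model carries information beyond the background rate."""
    n = len(labels)
    ll = -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
              for y, p in zip(labels, preds)) / n
    base = sum(labels) / n                     # empirical positive rate
    base_ll = -(base * math.log(base) + (1 - base) * math.log(1 - base))
    return ll / base_ll

labels = [1, 0, 0, 1, 0]
good = [0.9, 0.1, 0.2, 0.8, 0.1]
print(round(normalized_entropy(labels, good), 3))       # well below 1.0
print(round(normalized_entropy(labels, [0.4] * 5), 3))  # base-rate predictor -> 1.0
```

Because it normalizes away the background rate, NE stays comparable across traffic segments with very different click-through rates, which is why it works as a guardrail.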

Pinterest

Pinterest built an in‑house Android E2E test platform (PinTestLab) on EC2 emulators and implemented a runtime‑aware sharding algorithm (LPT/min‑heap using historical runtimes from Metro) to balance tests by expected wall time. Moving from package/count-based sharding on Firebase to time‑based sharding cut CI build time by ~36%, reduced the slowest shard by ~55%, and tightened shard runtime variance, with fallbacks and plans for on‑demand sharding.
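
The LPT/min-heap idea is simple to sketch: sort tests by historical runtime descending, then always hand the next test to the currently lightest shard. Test names and runtimes below are invented; Pinterest feeds real historical runtimes from Metro.

```python
import heapq

def lpt_shard(tests, num_shards):
    """Longest-Processing-Time-first sharding: assign each test (longest
    first) to the shard with the smallest accumulated runtime (min-heap)."""
    heap = [(0.0, i, []) for i in range(num_shards)]   # (total_runtime, shard_id, tests)
    heapq.heapify(heap)
    for name, runtime in sorted(tests, key=lambda t: -t[1]):
        total, i, bucket = heapq.heappop(heap)         # lightest shard so far
        bucket.append(name)
        heapq.heappush(heap, (total + runtime, i, bucket))
    return sorted(heap)                                # shards with expected wall time

tests = [("login", 300), ("feed", 240), ("search", 120),
         ("profile", 90), ("pins", 60), ("boards", 30)]
for total, i, names in lpt_shard(tests, 2):
    print(f"shard {i}: {total}s {names}")              # both shards land on 420s
```

Balancing on expected wall time rather than test count is what shrinks the slowest shard, since a single long test no longer drags one package-based bucket far past the others.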

LogRocket

A hands-on review and tutorial of RippleJS — a TypeScript-first UI framework by Dominic Gannaway — covering its compiler-integrated reactivity (track/@), .ripple syntax, reactive arrays/objects, component model, scoped CSS, performance and memory-efficiency claims, and a Todo app example with comparisons to React. The article notes current limitations (no SSR, small ecosystem) and invites community contributions.