News Aggregator


AWS Glue Crawlers: Common Pitfalls, Schema Challenges, and Best Practices

Aggregated on: 2025-09-25 14:22:30

AWS Glue is a powerful serverless data integration service that simplifies data discovery, preparation, and transformation. However, as with any tool, real-world use reveals quirks and corner cases that are not clearly covered in the documentation. In this article, let's talk about some key challenges I observed firsthand while building data pipelines with Glue crawlers: handling CSV files, schema evolution, partitioning, and crawler update settings.

View more...

Digital Experience Monitoring and Endpoint Posture Checks Usage in SASE

Aggregated on: 2025-09-25 13:22:30

In this article, I will go through the concepts of digital experience monitoring (DEM) and Endpoint Posture Checks and discuss how these essential capabilities are integrated into the SASE framework to enforce the zero trust principle. Together, these capabilities empower enterprises’ security and IT teams to maintain optimal performance, a strong security posture, and trust, regardless of where users connect.

Digital Experience Monitoring

Digital experience monitoring (DEM) helps to monitor and provide observability across the entire path. It delivers granular, real-time telemetry across endpoints, network paths, and application services, regardless of user location. In the past, enterprises that adopted cloud resources had to deploy various tools to monitor problems within cloud applications, network infrastructure, or on-premises devices, to provide a consistent user experience for hybrid and remote workforces.

View more...

Is Anyone There? Listening to Your Users Through Conversational AI Observability

Aggregated on: 2025-09-25 12:22:30

You’ve done it. After months of development, your team has launched a state-of-the-art conversational AI assistant. It’s powered by the latest LLM, the interface is slick, and the potential is enormous. Then the first piece of user feedback lands in your inbox. It just says: "The bot is confusing."

View more...

Lessons Learned From Building Production-Scale Data Conversion Pipelines

Aggregated on: 2025-09-25 11:22:30

Building production-scale data pipelines usually involves wrangling outputs from multiple legacy systems. Whether you’re trying to build out business intelligence use cases, handle a system migration, or lay the foundations for a new data warehouse, chances are high that you’ll have to normalize and integrate the outputs of multiple systems that were never designed to talk to one another. Recently, we built a production-scale data pipeline converting one data set from one enterprise system (Health Information Exchanges) to be used as an input into another (a claims-powered risk stratification algorithm). Although these two formats fundamentally represented the same underlying event (clinical encounters), the two systems spoke completely different “languages” — different coding standards, field definitions, and expectations about what was required. The goal was not a one-off ETL script, but a reusable, production-ready pipeline that downstream applications could rely on.

View more...

Death by a Thousand YAMLs: Surviving Kubernetes Tool Sprawl

Aggregated on: 2025-09-24 18:22:30

Editor's Note: The following is an article written for and published in DZone's 2025 Trend Report, Kubernetes in the Enterprise: Optimizing the Scale, Speed, and Intelligence of Cloud Operations. Kubernetes is eating the world. 

View more...

The New API Economy With LLMs

Aggregated on: 2025-09-24 17:22:30

Large language models (LLMs) are becoming more advanced in understanding context in natural language. With this, a new paradigm is emerging: using LLMs as APIs. Traditionally, an API call would be GET /users/123/orders, and you would receive a JSON response containing the orders for user 123. APIs facilitate the interaction between different software systems.
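
To make the traditional style concrete, here is a minimal sketch of a handler for the GET /users/123/orders call mentioned above. The in-memory `ORDERS` table, the `handle_request` function, and the response shape are all hypothetical and chosen only for illustration; a real service would sit behind an HTTP framework and a database.

```python
import json
import re

# Hypothetical in-memory data standing in for a real orders database.
ORDERS = {
    123: [
        {"order_id": 1, "item": "keyboard"},
        {"order_id": 2, "item": "mouse"},
    ]
}

# The route is fixed and machine-readable: /users/<numeric id>/orders
ROUTE = re.compile(r"^/users/(\d+)/orders$")


def handle_request(method, path):
    """Return (status_code, json_body) for a request line such as GET /users/123/orders."""
    match = ROUTE.match(path)
    if method != "GET" or match is None:
        return 404, json.dumps({"error": "not found"})
    user_id = int(match.group(1))
    # Unknown users simply get an empty order list in this sketch.
    return 200, json.dumps({"user_id": user_id, "orders": ORDERS.get(user_id, [])})


status, body = handle_request("GET", "/users/123/orders")
print(status, body)
```

The point of the sketch is that the caller must know the exact route and parameter format in advance, which is the rigidity the LLM-as-API idea tries to relax.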

View more...

Key Principles of API-First Development for SaaS

Aggregated on: 2025-09-24 16:22:30

Having worked in software development for over 8 years, I have repeatedly watched developers struggle to integrate APIs into platforms as an afterthought. The situation is common. Someone builds a beautiful web app, then the business team asks for mobile support, third-party integrations, and suddenly you're reverse-engineering your own application to expose endpoints that make sense. Luckily, this is changing. With API-first development, we can design the architecture with the API as part of it from day one. This is especially beneficial for SaaS products as they rely on third-party integrations and ecosystem support. 

View more...

Using TanStack Query for Scalable React Applications

Aggregated on: 2025-09-24 15:07:30

When building React applications, data fetching often starts with the native fetch API or tools like Axios. While this approach works for small projects, larger applications require features such as caching, retries, synchronization, and request cancellation, and it is here that TanStack Query, formerly React Query, excels. It provides a battle-tested abstraction for CRUD operations with powerful state management built in. In this article, we’ll walk through fetching data with useQuery, performing mutations with useMutation, and highlighting some features that make TanStack Query a helpful tool for scaling React apps.

View more...

Resilient Data Pipelines in GCP: Handling Failures and Latency in Distributed Systems

Aggregated on: 2025-09-24 14:07:30

I have spent years designing and operating data pipelines in Google Cloud, and one thing has not changed: resilience is not optional. It does not matter how nice your design diagrams look or how scalable the architecture is. In practice, nodes die, quotas are exhausted, regions suffer outages, schemas change unannounced, and message queues clog up at the most unpredictable moments. The main distinction between a functional pipeline and a resilient pipeline is that the latter can withstand failures and still meet latency requirements. This article explains my philosophy on resilience in distributed data pipelines on GCP, based not only on my experience of running these systems, but also more broadly on systems research and Google's operational experience.

View more...

Why I Ditched Redis for Cloudflare Durable Objects in My Rate Limiter

Aggregated on: 2025-09-24 13:07:30

Have you ever watched your serverless application crumble under unexpected traffic? Last month, our AI-powered image generator went viral on social media, and within hours, we were drowning in requests. Our traditional rate-limiting setup couldn't keep up with the distributed load across Cloudflare's edge network. This experience taught me that rate limiting in serverless environments requires a fundamentally different approach. Here's how I built a production-ready rate limiter using Cloudflare Durable Objects that handles thousands of concurrent requests while running at the edge.

View more...

Shipping Responsible AI Without Slowing Down

Aggregated on: 2025-09-24 12:07:30

In software engineering, launch day rarely fails because a unit test was missing; in machine learning (ML), that’s not the case. Inputs far from training data, adversarial prompts, proxies that drift away from human goals, or an upstream artefact that isn’t what it claims to be can all sink a release. The question is not “can every failure be prevented?” but “can failures be bounded, detected quickly, and recovered from predictably?” Two research threads shape this approach. The first maps where ML goes wrong in production: robustness gaps, weak runtime monitoring, misalignment with real human objectives, and systemic issues across the stack (supply chain, access, blast radius). The second focuses on how teams make decisions that stand up to scrutiny: a deliberative loop that’s open, informed, multi-vocal, and responsive. Put together, the operating model feels like standard software engineering — just opinionated for ML.

View more...

Top 7 Mistakes When Testing JavaFX Applications

Aggregated on: 2025-09-24 11:07:30

JavaFX is a versatile tool for creating rich enterprise-grade GUI applications. Testing these applications is an integral part of the development lifecycle. However, online resources that define best practices and guidelines for testing JavaFX apps are scarce. Therefore, developers must rely on commercial offerings for JavaFX testing services or write their test suites by trial and error. This article summarises the seven most common mistakes programmers make when testing JavaFX applications and ways to avoid them.

View more...

LLMs at the Edge: Decentralized Power and Control

Aggregated on: 2025-09-23 19:07:29

Most applications of large language models (LLMs) have been deployed in centralized cloud environments, raising concerns about latency, privacy, and energy consumption. This chapter examines the potential application of LLMs in decentralized edge computing, where computing tasks are distributed across interconnected devices rather than centralized hosts. By applying approaches like quantization, model compression, distributed inference, and federated learning, LLMs can work within the limited computational and memory resources of edge devices, making them suitable for practical use in real-world settings. Several advantages of decentralization are outlined in the chapter, such as increased privacy, user control, and enhanced system robustness. Additionally, it focuses on the potential of employing energy-efficient methods and dynamic power modes to enhance edge systems. The conclusion re-emphasizes that edge AI is the way forward as a responsible and performant solution for the future of decentralized AI technologies, which would be privacy-centric, high-performing, and put the user first.

View more...

Running AI/ML on Kubernetes: From Prototype to Production — Use MLflow, KServe, and vLLM on Kubernetes to Ship Models With Confidence

Aggregated on: 2025-09-23 18:07:29

Editor's Note: The following is an article written for and published in DZone's 2025 Trend Report, Kubernetes in the Enterprise: Optimizing the Scale, Speed, and Intelligence of Cloud Operations. After training a machine learning model, the inference phase must be fast, reliable, and cost efficient in production. Serving inference at scale, however, brings difficult problems: GPU/resource management, latency and batching, model/version rollout, observability, and orchestration of ancillary services (preprocessors, feature stores, and vector databases). Running artificial intelligence and machine learning (AI/ML) on Kubernetes gives us a scalable, portable platform for training and serving models. Kubernetes schedules GPUs and other resources so that we can pack workloads efficiently and autoscale to match traffic for both batch jobs and real-time inference. It also coordinates multi-component stacks — like model servers, preprocessors, vector DBs, and feature stores — so that complex pipelines and low-latency endpoints run reliably. 

View more...

From Requirements to Results: Anchoring Agile With Traceability

Aggregated on: 2025-09-23 17:07:29

Agile is one of the most widely adopted project management methodologies in the field of software development because it enables teams to deliver incrementally, adapt quickly to changes, and prioritize collaboration over rigid processes. However, Agile’s fast-changing nature can also expose one of its weaknesses, which is traceability.  Traditional project management approaches, such as Waterfall, make sure that requirements are tied to design documents, test cases, and acceptance metrics. This pipeline ensures that every feature can be traced back to its origin. On the other hand, Agile prioritizes lightweight artifacts and fast iteration, which pose challenges to tracking how individual backlog items map to higher-level business objectives. As a project manager, I’ve seen this gap firsthand. Teams often run into questions like: Are we building the features that align with stakeholder needs? Do the tests validate the requirements? Did we guarantee full coverage across multiple sprints?  Without a clear system of traceability, the results are often uncertain. 

View more...

AI Readiness: Why Cloud Infrastructure Will Decide Who Wins the Next Wave

Aggregated on: 2025-09-23 16:52:29

Everywhere I go, cloud and DevOps teams are asking the same question: “Are we ready for AI?”

View more...

Model Evaluation Metrics Explained

Aggregated on: 2025-09-23 16:07:29

Measuring the true performance of machine learning models goes far beyond headline accuracy. The metrics you choose shape not only how you tweak your algorithms, but how your models impact users, businesses, and critical systems. In this article, we break down the most practical and widely used evaluation metrics: Accuracy, Precision, Recall, F1 Score, and ROC-AUC. Alongside technical definitions, we'll discuss their strategic importance: how these numbers map to real-world outcomes and business objectives. Whether you're shipping a product or publishing research, knowing how to evaluate model success is foundational to effective machine learning. We'll also look at common metric pitfalls and how to avoid them.
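
As a rough illustration of why headline accuracy can mislead, the sketch below computes four of the metrics named above from raw confusion-matrix counts. The function name `classification_metrics` and the toy labels are hypothetical and not taken from the article. ROC-AUC is left out because it needs ranked scores rather than hard predictions, and returning 0.0 when a denominator is zero is a convention assumed here.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Assumed convention: undefined ratios (zero denominator) are reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical imbalanced data: 8 negatives, 2 positives, and a model that always predicts 0.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0] * 10
metrics = classification_metrics(y_true, y_pred)
print(metrics)  # accuracy is 0.8, while precision, recall, and F1 are all 0.0
```

On this toy data the model never finds a positive case, yet accuracy alone would still report 80 percent.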

View more...

Mastering Fluent Bit: Top 3 Telemetry Pipeline Output Plugins for Developers (Part 7)

Aggregated on: 2025-09-23 15:07:29

This series is a general-purpose getting-started guide for those of us wanting to learn about the Cloud Native Computing Foundation (CNCF) project Fluent Bit.  Each article in this series addresses a single topic by providing insights into what the topic is, why we are interested in exploring that topic, where to get started with the topic, and how to get hands-on with learning about the topic as it relates to the Fluent Bit project.

View more...

Testing Automation Antipatterns: When Good Practices Become Your Worst Enemy

Aggregated on: 2025-09-23 14:07:29

Note: This article is a summary of a talk I gave at VLCTesting in 2023. Here's the recording (Spanish). Test automation is a fundamental tool for gaining confidence in what we build in a fast and efficient way. However, we often encounter practices that, while seemingly beneficial in the short term, generate significant problems in the long term: antipatterns.

View more...

Why the Principle of Least Privilege Is Critical for Non-Human Identities

Aggregated on: 2025-09-23 13:07:29

Attackers only really care about two aspects of a leaked secret: whether it still works and what privileges it grants once they are in. One of the takeaways from GitGuardian’s 2025 State of Secrets Sprawl Report was that the majority of GitLab and GitHub API keys leaked in public had been granted full read and write access to the associated repositories. Once an attacker controls access to a repository, they can do all sorts of nasty business. Both platforms allow for fine-grained access controls, enabling developers to tightly restrict what every token can and can't do. The question, then, is why teams are not following the principle of least privilege for their projects. And what can be done to better secure the enterprise against overpermissioned non-human identities (NHIs)?

View more...

Scaling ML Experiments: The High-Throughput Playbook

Aggregated on: 2025-09-23 12:07:29

From Guesswork to Growth: Why A/B Testing Is Non-Negotiable

Every product decision is a bet under uncertainty. A/B testing turns those bets into measurable, causal learning. By randomly assigning users to control versus treatment, you create two groups that are, on average, identical. Any difference in conversion, retention, revenue, or latency can be attributed to the change, not to seasonality, campaigns, or shifting user mix. Randomization gives you a credible counterfactual.
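
The random assignment described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `assign_groups` and `conversion_rate`, the fixed seed, and the simulated user IDs are hypothetical, and a production system would typically need sticky assignment and a proper significance test on top.

```python
import random


def assign_groups(user_ids, seed=42):
    """Randomly assign each user to 'control' or 'treatment' with equal probability."""
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    return {uid: rng.choice(["control", "treatment"]) for uid in user_ids}


def conversion_rate(assignments, converted_users, group):
    """Share of users in `group` who appear in the set of converted users."""
    members = [uid for uid, g in assignments.items() if g == group]
    if not members:
        return 0.0
    return sum(1 for uid in members if uid in converted_users) / len(members)


# Hypothetical usage: 1,000 simulated users, of which every tenth one converted.
assignments = assign_groups(range(1000))
converted = {uid for uid in range(1000) if uid % 10 == 0}
lift = (conversion_rate(assignments, converted, "treatment")
        - conversion_rate(assignments, converted, "control"))
print(f"observed difference in conversion rate: {lift:+.4f}")
```

Because conversions here do not depend on the assignment, any observed difference is pure noise, which is the baseline a real treatment effect has to stand out from.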

View more...

Top 5 RAD Platforms for Developers

Aggregated on: 2025-09-23 11:07:29

Rapid Application Development (RAD) platforms are more in demand as companies aim to deliver secure, scalable systems faster while adhering to a developer-first approach. This blog post reviews five popular RAD platforms that can meet the needs of professional developers: WaveMaker, OpenXava, OutSystems, Oracle APEX, and Jmix. I will break down each platform's team fit, productivity, security, support, lock-in, and licensing, exploring the advantages of each and how easy it is to get started.

View more...

AI Infrastructure for Agents and LLMs: Options, Tools, and Optimization

Aggregated on: 2025-09-22 19:22:29

Infrastructure, whether on cloud, on-premises, or in a hybrid cloud, plays a critical role in implementing the AI architecture. This article is part of a series of articles that explores the diverse infrastructure options available for deploying and optimizing AI agents and large language models (LLMs). It delves into the crucial role infrastructure plays in realizing AI architectures, particularly for inference. We'll examine various tools, including open-source solutions, and illustrate the inference flow with diagrams, highlighting key considerations for efficient and scalable AI deployments. Modern AI applications demand sophisticated infrastructure that can handle the computational intensity of large language models, the complexity of multi-agent systems, and the real-time requirements of interactive applications. The challenge lies not just in selecting the right tools, but in understanding how they integrate across the entire technology stack to deliver reliable, scalable, and cost-effective solutions.

View more...

Isolation Level for MongoDB Multi-Document Transactions (Strong Consistency)

Aggregated on: 2025-09-22 18:22:29

Many outdated or imprecise claims about transaction isolation levels in MongoDB persist. These claims are outdated because they are often based on MongoDB 4.0, the old version in which multi-document transactions were introduced (the old Jepsen report is one example), and the issues have been fixed since then. They are also imprecise because people attempt to map MongoDB's transaction isolation to SQL isolation levels, which is inappropriate, as the SQL Standard definitions ignore Multi-Version Concurrency Control (MVCC), utilized by most databases, including MongoDB. Martin Kleppmann has discussed this issue and provided tests to assess transaction isolation and potential anomalies. I will conduct these tests on MongoDB to explain how multi-document transactions work and avoid anomalies.

View more...

How to Build Secure Knowledge Base Integrations for AI Agents

Aggregated on: 2025-09-22 17:22:29

Done well, knowledge base integrations enable AI agents to deliver specific, context-rich answers without forcing employees to dig through endless folders. Done poorly, they introduce security gaps and permissioning mistakes that erode trust. The challenge for software developers building these integrations is that no two knowledge bases handle permissions the same way. One might gate content at the space level, another at the page level, and a third at the attachment level. 

View more...

Integrating AI Into Test Automation Frameworks With the ChatGPT API

Aggregated on: 2025-09-22 16:07:29

When I first tried to implement AI in a test automation framework, I expected it to be helpful only for a few basic use cases. A few experiments later, I noticed several areas where the ChatGPT API actually saved me time and gave the test automation framework more power: producing realistic test data, analyzing logs in white-box tests, and handling flaky tests in CI/CD.

Getting Started With the ChatGPT API

The ChatGPT API is a programming interface by OpenAI that operates on top of the HTTP(S) protocol. It allows sending requests and retrieving outputs from a pre-selected model as raw text, JSON, XML, or any other format you prefer to work with.

View more...

Spring REST API Client Flavors: From RestTemplate to RestClient

Aggregated on: 2025-09-22 15:07:29

Just as humans have always preferred co-existing and communicating ideas, looking for and providing pieces of advice from and to their fellow humans, applications nowadays find themselves in the same situation, where they need to exchange data in order to collaborate and fulfill their purposes. At a very high level, applications’ interactions are carried out either conversationally (the case of REST APIs), where the information is exchanged synchronously by asking and responding, or asynchronously via notifications (the case of event-driven APIs), where data is sent by producers and picked up by consumers as it becomes available and they are ready.

View more...

Stop Reactive Network Troubleshooting: Monitor These 5 Metrics to Prevent Downtime

Aggregated on: 2025-09-22 14:07:29

Downtime in sectors like manufacturing and healthcare isn’t just inconvenient — it’s potentially catastrophic. I’ve overseen ecosystems for years and realized that preventing such bottom-line disasters requires a watchful eye and a constant finger on the network pulse. This is possible with real monitoring across pinpointed variables: knowing which handful of key metrics predict problems in your specific environment, understanding the difference between normal fluctuations and actual performance issues, and translating technical problems into business impact before executives start asking uncomfortable questions about IT spending.

View more...

Azure IOT Cloud-to-Device Communication Methods

Aggregated on: 2025-09-22 13:22:29

Today, managing communication between the cloud and millions of smart devices is challenging. Suppose you are managing a huge number of devices in the field and need to push a critical device state update to all of them, but many are offline or have spotty connectivity. How do you make sure the message gets through? Azure IoT Hub provides three major cloud-to-device communication mechanisms: C2D messages, direct methods, and desired properties in the device twin. Each is designed for different use cases. This article explains how to choose among these methods, because knowing when to use each one helps you build reliable, scalable, and effective IoT solutions.

View more...

Benchmarking Instance Types for Amazon OpenSearch Workloads

Aggregated on: 2025-09-22 12:22:28

Choosing the optimal instance type for Amazon OpenSearch clusters is crucial for balancing performance and cost. With AWS offering both the OpenSearch-specialized OM2 instances and the newer general-purpose M7g instances, organizations face an important decision. While OM2 instances are tailored for OpenSearch with high memory-to-vCPU ratios, M7g instances bring the latest technology, promising enhanced overall performance. The best choice depends on your specific workload characteristics and requirements.

View more...

Think in Graphs, Not Just Chains: JGraphlet for TaskPipelines

Aggregated on: 2025-09-22 11:22:28

JGraphlet is a tiny, zero-dependency library for building task pipelines in Java. Its power comes not from a long list of features, but from a small set of core design principles that work together in harmony. At the heart of JGraphlet is simplicity, backed by a Graph. Add Tasks to a pipeline and connect them to create your graph. Each Task has an input and output. A TaskPipeline builds and executes a pipeline while managing the I/O for each Task. 

View more...

Your SDLC Has an Evil Twin — and AI Built It

Aggregated on: 2025-09-19 19:22:27

You think you know your SDLC like the back of your carpal-tunnel-riddled hand: You've got your gates, your reviews, your carefully orchestrated dance of code commits and deployment pipelines.  But here's a plot twist straight out of your auntie's favorite daytime soap: there's an evil twin lurking in your organization (cue the dramatic organ music). 

View more...

Tiny Deltas, Big Wins: Schema-Less Thrift Patching at Planet Scale

Aggregated on: 2025-09-19 18:22:27

Introduction: The Power of Tiny Deltas

Imagine this common scenario: you have a binary Thrift blob, perhaps holding crucial transaction data or image metadata, stored in a distributed cache. Suddenly, a single field within that blob needs an update: maybe a transaction status changes, or an image is flagged as sensitive. The catch? You don't have the Thrift IDL (Interface Definition Language) schema readily available on the serving layer, and redeploying the data producers is simply not an option due to the sheer scale and complexity of your operations. This is where the fbthrift library's parseObject/serializeObject API shines, offering a remarkably elegant solution. It enables you to deserialize, mutate, and re-emit a Thrift frame using only numeric field IDs, bypassing the need for code generation or schema uploads. This capability is invaluable for scenarios like hot-patches, rapid feature-flag flips, or compliance-driven data redactions, all without the overhead of re-sending or re-processing an entire message.

View more...

Distributed Cloud-Based Dynamic Configuration Management

Aggregated on: 2025-09-19 17:22:27

It is not uncommon for back-end software to have a configuration file to start up with. These are generally YAML or JSON files, which are loaded by the system while starting up, and are then used to set up initial configuration for a system. Values included here may affect business logic or infrastructure. Let us create a new service called DumplingSale (because I love dumplings, or as we call them, momos). This service is used for managing the sales of dumplings.
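
A startup configuration of the kind described above might be loaded as in the sketch below. The field names (`port`, `max_order_size`, `discount_enabled`), the `DumplingSaleConfig` class, and the `load_config` helper are hypothetical and are not taken from the article; the JSON is inlined as a string only to keep the example self-contained, where a real service would read it from a file.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class DumplingSaleConfig:
    port: int               # infrastructure setting (hypothetical)
    max_order_size: int     # business-logic setting (hypothetical)
    discount_enabled: bool  # business-logic toggle (hypothetical)


def load_config(raw_json: str) -> DumplingSaleConfig:
    """Parse the JSON once at startup and fail fast if a required key is missing."""
    data = json.loads(raw_json)
    return DumplingSaleConfig(
        port=int(data["port"]),
        max_order_size=int(data["max_order_size"]),
        discount_enabled=bool(data["discount_enabled"]),
    )


# Stand-in for the contents of a config file read during startup.
EXAMPLE_CONFIG = '{"port": 8080, "max_order_size": 50, "discount_enabled": true}'
config = load_config(EXAMPLE_CONFIG)
print(config)
```

The frozen dataclass reflects the static nature of this approach: once the service has started, changing a value means editing the file and restarting.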

View more...

Deep Dive into Distributed File System Permission Management: Linux Security Integration

Aggregated on: 2025-09-19 16:22:27

In multi-user environments with high-security requirements, robust permission controls are fundamental for resource isolation. Linux's file permission model provides a flexible access control mechanism, ensuring system security through user/group permission settings. For distributed file systems supporting Linux, compliance with this model is critical for consistent security. This article explores key Linux permission mechanisms and their implementation in a FUSE-based distributed file system.
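
The classic owner/group/other check at the heart of the Linux permission model can be sketched as follows. This is a simplified illustration, not the file system's actual implementation: the `may_access` function and the example IDs are hypothetical, and the sketch deliberately ignores root privileges, capabilities, and ACLs.

```python
import stat

# Permission bits per access type, ordered as (owner, group, other).
_BITS = {
    "r": (stat.S_IRUSR, stat.S_IRGRP, stat.S_IROTH),
    "w": (stat.S_IWUSR, stat.S_IWGRP, stat.S_IWOTH),
    "x": (stat.S_IXUSR, stat.S_IXGRP, stat.S_IXOTH),
}


def may_access(mode, file_uid, file_gid, uid, gids, want):
    """Return True if a process with `uid` and group set `gids` may perform `want` ('r', 'w', 'x')."""
    owner_bit, group_bit, other_bit = _BITS[want]
    if uid == file_uid:
        # The owner class is checked first and is final, even if group or other bits are more permissive.
        return bool(mode & owner_bit)
    if file_gid in gids:
        return bool(mode & group_bit)
    return bool(mode & other_bit)


# Hypothetical file: mode 0640 (rw-r-----), owned by uid 1000 and gid 100.
print(may_access(0o640, 1000, 100, uid=1000, gids={100}, want="w"))  # owner write
print(may_access(0o640, 1000, 100, uid=1001, gids={100}, want="w"))  # group member write
print(may_access(0o640, 1000, 100, uid=1002, gids={200}, want="r"))  # other read
```

A FUSE-based file system that wants consistent behavior has to reproduce this ordering, either itself or by delegating the check to the kernel.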

View more...

A Backend-First Approach to Production-Scale LLM Applications

Aggregated on: 2025-09-19 15:07:27

A few months ago, I launched the first version of my platform, which operated without AI functionality. It worked well for its initial purpose, but I knew it could do more. A few weeks ago, I rolled out version two, this time with large language models (LLMs) as its core component. It was designed to operate through a structured workflow in which the frontend sends requests to the backend, where the platform applies business logic before accessing OpenAI's API to generate responses. All operations performed as expected during controlled testing sessions. As more people started using the platform, new problems appeared. These were mostly caused by user actions and factors such as slow internet, accidental browser refreshes, and other interruptions that affected the user experience. Users will always do unexpected things in production, and not all of it is their fault. I had to accept that and find a way for the platform to handle these hiccups smoothly. The solution was to add safeguards, a safety net to catch problems and keep the system running gracefully. I redesigned the platform, putting the backend at the center of all large language model operations.

View more...

VS Code Agent Mode: An Architect's Perspective for the .NET Ecosystem

Aggregated on: 2025-09-19 14:07:27

GitHub Copilot agent mode had several enhancements in VS Code as part of its July 2025 release, further bolstering its capabilities. The supported LLMs are getting better iteratively; however, both personal experience and academic research remain divided on future capabilities and gaps. I've had my own learnings exploring agent mode for the last few months, ever since it was released, and had the best possible outcomes with Claude Sonnet Models. After 18 years of building enterprise systems — ranging from integrating siloed COTS to making clouds talk, architecting IoT telemetry data ingestions and eCommerce platforms — I've seen plenty of "revolutionary" tools come and go. I've watched us transition from monoliths to microservices, from on-premises to cloud, from waterfall to agile. I've learned Java 1.4, .NET 9, and multiple flavors of JavaScript. Each transition revealed fundamental flaws in how we think about software construction.

View more...

7 API Integration Patterns: REST, gRPC, SSE, WS, and Queues

Aggregated on: 2025-09-19 13:07:27

There are multiple API integration patterns. I have already mentioned and described some of the differences in different articles: gRPC vs REST, WebSockets vs SSE. This text is a kind of One Ring article — one to rule them all. I want you to have a single place where you can find a comparison of all the API integration patterns done in a clear and consistent manner. Thus, I have put here all the previous comparisons, and added some more into this text. 

View more...

Exploring Text-to-Cypher: Integrating Ollama, MCP, and Spring AI

Aggregated on: 2025-09-19 12:07:27

When text-to-query approaches (specifically, text2cypher) first entered the scene, I was unsure how useful they were, especially when existing models were hit-or-miss on result accuracy. It was hard to justify the benefits over a human expert in the domain and query language. However, as technologies have evolved over the last couple of years, I've started to see how a text-to-query approach adds flexibility to rigid applications that could previously only answer a set of pre-defined questions with limited parameters.

View more...

Spring Boot WebSocket: Building a Multichannel Chat in Java

Aggregated on: 2025-09-19 11:07:27

As you may have already guessed from the title, the topic for today will be Spring Boot WebSockets. Some time ago, I provided an example of WebSocket chat based on Akka toolkit libraries. However, this chat will have somewhat more features, and a quite different design. I will skip some parts so as not to duplicate too much content from the previous article. Here you can find a more in-depth intro to WebSockets. Please note that all the code that’s used in this article is also available in the GitHub repository.

View more...

Best Software Engineer Books: Build Your Personal Library

Aggregated on: 2025-09-19 04:15:00

I believe that every one of us, software engineers, should have our own personal library of software engineering books. Whether in old plain-text book form or in a newer, more eco-friendly electronic one is an open question. The important thing is to actually have one. I am one of those strange people who believe that we, in general, should read books. Doing so has multiple benefits, but let's not dive too deep into this and focus on software engineering. Well, there are a couple of problems with software engineer books: They get old rather quickly. There are a lot of them. They are expensive. They have varying levels of quality. Given our limited time, the obvious conclusion is that it is hard to find a book worthy of reading, one we will not waste our money on. Here comes this article. It will be the first in a series focused on what books I recommend you include in your professional library. This particular blog covers books that focus on the softer parts of our job:

View more...

LLMs for Debugging Code

Aggregated on: 2025-09-18 18:30:00

Large language models (LLMs) are transforming software development lifecycles through their usefulness in code understanding, code generation, debugging, and more. This article provides insights into how LLMs can be used to debug codebases, detailing their core capabilities, the methodologies used to train them, and how their applications might evolve in the future. Despite issues such as hallucinations, integrating LLMs into development environments through sophisticated, agentic debugging frameworks improves developers’ efficiency.

Introduction

The Evolving Role of LLMs in Coding

LLMs have already proven their capabilities beyond their initial applications in natural language processing to achieve remarkable performance in diverse code-related tasks, including code generation and translation. They power AI coding assistants like GitHub Copilot and Cursor, and have demonstrated near-human-level performance on standard benchmarks such as HumanEval and MBPP.

View more...

Disabling UseNUMA Flag When CPU and Memory Node Misalign in JDK

Aggregated on: 2025-09-18 17:30:00

Today, Java is still one of the most widely used languages for building and running applications, which is why organizations prioritize measuring its performance. When running a Java application on a server with multiple NUMA (Non-Uniform Memory Access) nodes, we need to pay attention to any remote memory accesses; otherwise, they can increase latency and reduce performance. The libnuma library provides several policies, including localalloc, preferred, membind, and interleave, which enable users to affinitize their applications and run them with optimal utilization of the server nodes as per their requirements.
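
As a rough illustration of the idea in the title, here is a minimal Python sketch that assembles a launch command. It assumes the application is started through the `numactl` command-line tool with its `--cpunodebind` and `--membind` options and that the JVM's `-XX:-UseNUMA` switch is the flag being disabled; the helper name `build_java_command` and the jar name are hypothetical and not taken from the article.

```python
from typing import List, Optional


def build_java_command(jar: str, cpu_node: int,
                       mem_node: Optional[int] = None) -> List[str]:
    """Assemble a numactl + java command line (hypothetical helper)."""
    # Assumption: with no explicit memory node, memory follows the CPU node.
    if mem_node is None:
        mem_node = cpu_node

    # Pin CPU and memory placement via numactl.
    cmd = ["numactl", f"--cpunodebind={cpu_node}", f"--membind={mem_node}", "java"]

    # When the CPU node and memory node differ, pass -XX:-UseNUMA so the
    # JVM's NUMA-aware allocation is switched off for this run.
    if mem_node != cpu_node:
        cmd.append("-XX:-UseNUMA")

    cmd += ["-jar", jar]
    return cmd


if __name__ == "__main__":
    # Misaligned placement: CPUs on node 0, memory on node 1.
    print(" ".join(build_java_command("app.jar", cpu_node=0, mem_node=1)))
```

When the two nodes match, the sketch adds no NUMA flag at all and leaves the JVM at its default setting.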

View more...

Blueprint for Agentic AI: Azure AI Foundry, AutoGen, and Beyond

Aggregated on: 2025-09-18 16:30:00

In 2025, AI isn’t just about individual models doing one thing at a time; it’s about intelligent agents working together like a well-coordinated team. Picture this: a group of AI systems, each with its own specialty, teaming up to solve complex problems in real time. Sounds futuristic? It’s already happening, thanks to multi-agent systems. Two tools that are making this possible in a big way are Azure AI Foundry and AutoGen.

View more...

Remote Android Management: A Step-by-Step Guide

Aggregated on: 2025-09-18 15:30:00

The Problem No One Talks About In an era where screens dominate bedtime routines, millions now fall asleep to YouTube videos, podcasts, or streaming apps. However, this habit has a hidden cost: uncontrolled volume exposure, especially for children. As a parent and developer, I faced this problem firsthand — my child’s late-night YouTube binges led to restless sleep and morning irritability. Free apps in the Google Play Store, like Volume Limiter and Volume Control, were a failure: They crashed, had no settings, or were too intrusive. Perhaps commercial apps would be better, but I haven't tested this since they cost money, often quite a bit.

View more...

FOSDEM 2025 Recap: Open Source Contributors Unite to Collaborate and Help Advance Apache Software Projects

Aggregated on: 2025-09-18 14:30:00

FOSDEM 2025 has come to a close, and it certainly was not without a lot of content and participation from Apache Software Foundation (ASF) members, committers, and contributors! We asked ASF participants to provide summaries and observations from this year’s premier free software event, to share a small part of the work that ASF community members do for open-source software development. This blog provides a brief overview of their talks, including links to the video recordings. Apache NuttX RTOS Talk: "SBOM Journey for an Open Source Project - Apache NuttX RTOS" (video)

View more...

Unified Checkout Experience Through Micro Frontend Architecture

Aggregated on: 2025-09-18 13:15:00

Large retail systems today, much like Walmart, operate multiple types of checkout registers across various services — pharmacy, auto care, fuel stations, photo centers, and more. These checkout points are not just limited to traditional frontend registers for scanning and payment, but encompass a broad array of service-specific interfaces. As the breadth of services grows, retailers are often left managing fragmented checkout solutions. This fragmentation leads to inconsistent user experiences, higher training overhead for staff, and slower development cycles. The need for a unified checkout experience across microapps — one that abstracts underlying service complexity and presents a consistent interface to customers and associates — has never been more critical.

View more...

Creating a Distributed Computing Cluster for a Data Base Management System: Part 1

Aggregated on: 2025-09-18 12:15:00

The idea of creating a distributed computing cluster (DCC) for database management systems (DBMS) has been on my mind for quite a long time. Put simply, the DCC software makes it possible to combine many servers into one super server (cluster) that balances all queries evenly between the individual servers. To an application running on the DCC, everything will appear as if it were working with one server and one database (DB). Instead of dispersed databases on distributed servers, the application will see a single virtual database. All network protocols, replication exchanges, and proxy redirections will be concealed inside the DCC. At the same time, all resources of the distributed servers, in particular RAM and CPU time, will be utilized evenly and efficiently. For example, in a cloud data processing center (DPC), it is possible to take one physical super server and divide it into a number of virtual DBMS servers. But the reverse procedure has not been possible until now, i.e., it has not been possible to take a number of physical servers and merge them into a single virtual DBMS super server. In a sense, DCC is a technology that makes it possible to merge physical servers into one virtual DBMS super server.
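
The "one virtual server" idea can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the excerpt does not say how queries are balanced, so plain round-robin dispatch is used here as a stand-in, and the `ClusterFacade` class, its `execute` method, and the fake servers are all hypothetical names.

```python
from itertools import cycle
from typing import Callable, Dict


class ClusterFacade:
    """Hypothetical facade: the application sees one 'server', while each
    query is forwarded to one of several backends in round-robin order."""

    def __init__(self, servers: Dict[str, Callable[[str], str]]) -> None:
        self._servers = servers
        # Fixed, repeating order of backend names (round-robin assumption).
        self._order = cycle(sorted(servers))
        # Per-backend counters, kept only to show that the load is even.
        self.counts = {name: 0 for name in servers}

    def execute(self, query: str) -> str:
        # Pick the next backend and forward the query to it.
        name = next(self._order)
        self.counts[name] += 1
        return self._servers[name](query)


def make_fake_server(name: str) -> Callable[[str], str]:
    # Stand-in for a real DBMS connection; it only echoes who handled the query.
    return lambda query: f"{name} handled: {query}"


if __name__ == "__main__":
    cluster = ClusterFacade({n: make_fake_server(n) for n in ("db1", "db2", "db3")})
    for i in range(6):
        print(cluster.execute(f"SELECT {i}"))
    print(cluster.counts)
```

The sketch shows only the dispatch side; the replication and proxying that the excerpt says are concealed inside the DCC are not modeled.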

View more...

Development of System Configuration Management: Summary and Reflections

Aggregated on: 2025-09-18 11:15:00

Series Overview This article is Part 4 of a multi-part series: "Development of system configuration management." The complete series:

View more...

Enable AWS Budget Notifications With SNS Using AWS CDK

Aggregated on: 2025-09-17 19:14:56

Keeping track of AWS spend is very important, especially since it’s so easy to create resources: you might forget to turn off an EC2 instance or container you started, or to remove a CDK stack you deployed for a specific experiment. Costs can creep up fast if you don’t put guardrails in place. Recently, I had to set up budgets across multiple AWS accounts for my team. Along the way, I learned a few gotchas (especially around SNS and KMS policies) that weren’t immediately clear to me when I started writing AWS CDK code. In this post, we’ll go through how to:

View more...