News Aggregator


High Fidelity Data: Balancing Privacy and Usage

Aggregated on: 2024-08-29 22:08:04

Effective de-identification algorithms that balance data usage and privacy are critical. Industries like healthcare, finance, and advertising rely on accurate and secure data analysis. However, existing de-identification methods often compromise either data usability or privacy protection, which limits advanced applications such as knowledge engineering and AI modeling. To address these challenges, we introduce High Fidelity (HiFi) data, a novel approach to meet the dual objectives of data usability and privacy protection. HiFi data maintains the original data's usability while ensuring compliance with stringent privacy regulations.

View more...

Achieving DevOps Harmony With Unified Log Monitoring for CI/CD

Aggregated on: 2024-08-29 20:53:04

In modern software development, DevOps methods have evolved into the pillar of dependable and effective product delivery. Two methods that particularly help automate and simplify the software release process are continuous integration (CI) and continuous deployment (CD). But as software systems get more complicated, so does the necessity for strong log monitoring systems that can unite and streamline log management at several CI/CD phases. This article explores the need for uniting log monitoring in DevOps, the associated difficulties, and approaches for achieving harmony in CI/CD processes.

View more...

When (Not) to Write an Apache APISIX Plugin

Aggregated on: 2024-08-29 19:08:04

When I introduce Apache APISIX in my talks, I mention the massive number of existing plugins, and that each of them implements a specific feature. One of the key features of Apache APISIX is its flexibility. If a feature is missing, you can create your own plugin in Lua or a language compiled into Wasm, showcasing the platform's adaptability to your specific needs. In this post, I aim to provide practical alternatives to writing a custom plugin, offering solutions you can quickly implement in your projects.

Cons of Writing a Plugin

Before describing alternatives, let me explain the issues of writing a plugin.

View more...

Team-as-Code: How to Apply Platform Engineering to DevOps’ Coding Stage

Aggregated on: 2024-08-29 17:53:04

Why Apply a Platform Approach to the Coding Stage?

While both are part of the modern software development lifecycle, DevOps and platform engineering target distinct challenges. DevOps focuses on continuous integration and continuous delivery (CI/CD), and teams track metrics such as deployment frequency, lead time for changes, and change failure rate.

View more...

Utilizing Multiple Vectors and Advanced Search Data Model Design for City Data

Aggregated on: 2024-08-29 16:08:04

Goal of This Application

In this article, we will build an advanced data model and use it for ingestion and various search options. For the notebook portion, we will run a hybrid multi-vector search, re-rank the results, and display the resulting text and images.

- Ingest data fields, enrich data with lookups, and format: Learn to ingest data, including JSON and images, and to format and transform it to optimize hybrid searches. This is done inside the streetcams.py application.
- Store data into Milvus: Learn to store data in Milvus, an efficient vector database designed for high-speed similarity searches and AI applications. In this step, we optimize the data model with scalar fields and multiple vector fields: one for text and one for the camera image. We do this in the streetcams.py application.
- Use open source models for data queries in a hybrid multi-modal, multi-vector search: Discover how to use scalars and multiple vectors to query data stored in Milvus and re-rank the final results in this notebook.
- Display resulting text and images: Build a quick output for validation and checking in this notebook.
- Simple Retrieval-Augmented Generation (RAG) with LangChain: Build a simple Python RAG application (streetcamrag.py) to use Milvus for asking about the current weather via Ollama. While outputting to the screen, we also send the results to Slack formatted as Markdown.

Summary

By the end of this application, you’ll have a comprehensive understanding of using Milvus, ingesting semi-structured and unstructured data, and using open source models to build a robust and efficient data retrieval system. For future enhancements, we can use these results to build prompts for LLMs, Slack bots, streaming data to Apache Kafka, and a street camera search engine.
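
The re-ranking step mentioned above can be illustrated without any Milvus dependency. The summary does not say which re-ranker the article uses, so the sketch below assumes reciprocal rank fusion (RRF), a common way to merge ranked lists from separate text and image searches. The function name `rrf_fuse`, the constant `k=60`, and the camera IDs are all hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Merge several ranked ID lists with reciprocal rank fusion.

    Each list contributes 1 / (k + rank) to an ID's score, where rank
    starts at 1. IDs that appear near the top of several lists win.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by ID so the order is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a text-vector search and an image-vector search.
text_hits = ["cam-7", "cam-2", "cam-9"]
image_hits = ["cam-2", "cam-5", "cam-7"]
fused = rrf_fuse([text_hits, image_hits])
```

Items found by both searches ("cam-2" and "cam-7") end up ahead of items found by only one, which is the behavior a hybrid multi-vector search typically wants from its final ranking.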

View more...

Go: Unit and Integration Tests

Aggregated on: 2024-08-29 14:53:04

Unit Tests

Unit testing is a fundamental part of software development that ensures individual components of your code work as expected. In Go, unit tests are straightforward to write and execute, making them an essential tool for maintaining code quality.

What Is a Unit Test?

A unit test is a small, focused test that validates the behavior of a single function or method. The goal is to ensure that the function works correctly in isolation, without depending on external systems like databases, file systems, or network connections. By isolating the function, you can quickly identify and fix bugs within a specific area of your code.

View more...

The Ultimate Guide To Evaluate RAG System Components: What You Need To Know

Aggregated on: 2024-08-29 13:08:04

Retrieval-Augmented Generation (RAG) systems have been designed to improve the response quality of a large language model (LLM). When a user submits a query, the RAG system extracts relevant information from a vector database and passes it to the LLM as context. The LLM then uses this context to generate a response for the user. This process significantly improves the quality of LLM responses with less “hallucination.” So, in the workflow above, there are two main components in a RAG system:

View more...

Java Concurrency: Visibility and Synchronized

Aggregated on: 2024-08-28 20:08:04

Previously, we examined the happens-before guarantee in Java. This guarantee gives us confidence when we write multithreaded programs, despite the statement reordering that can happen. In this post, we shall focus on variable visibility between two threads and what happens when we change a variable that is shared.

Code Examination

Let’s examine the following code snippet:

View more...

How to Dockerize a React App With Vite: Step-by-Step Guide

Aggregated on: 2024-08-28 19:08:04

In this article, I’ll show you how to Dockerize a React application built with Vite. We’ll go through:

- Configuring Vite for Docker
- Creating the Dockerfile
- Creating the Docker Compose file
- Building and running the Docker container

By the end of this article, you’ll have a portable React app ready to deploy in any environment.

View more...

Advanced Techniques in Automated Threat Detection

Aggregated on: 2024-08-28 18:08:04

In the fast-paced and constantly evolving digital landscape of today, bad actors are always looking for newer and better methods to launch their attacks. As cybercriminal tactics evolve, they develop more sophisticated malware, more convincing scams, and attacks that are designed specifically to evade known security measures. With this in mind, it is vital for organizations to invest in more advanced automated tools and solutions to go “from threat identification to eradication and remediation with as few humans in the loop as possible.” Taking advantage of emerging technologies and sophisticated measures can aid organizations in automating these processes to an extent and saving time, labor, and other resources that can run thin when relying solely on humans to handle threats.

View more...

The Evolution of Conversational AI: Blending Determinism With Dynamism

Aggregated on: 2024-08-28 17:23:04

Conversational AI agents have come a long way from their early days of simple, scripted interactions. With the explosion of large language models (LLMs) like GPT-3, Gemini, and beyond, the landscape of human-computer interaction is undergoing a significant transformation. These AI agents are increasingly expected to mimic human-like interactions, which demands a delicate balance between deterministic (convergent) workflows and dynamic, creative (divergent) responses. This dual approach is redefining how these agents function across various domains, including education, customer service, and personal assistance.

View more...

Why Replace External Database Caches?

Aggregated on: 2024-08-28 16:23:04

Teams often consider external caches when the existing database cannot meet the required service-level agreement (SLA). This is a clear performance-oriented decision. Putting an external cache in front of the database is commonly used to compensate for subpar latency stemming from various factors, such as inefficient database internals, driver usage, infrastructure choices, traffic spikes, and so on. Caching might seem like a fast and easy solution because the deployment can be implemented without tremendous hassle and without incurring the significant cost of database scaling, database schema redesign, or even a deeper technology transformation. However, external caches are not as simple as they are often made out to be. In fact, they can be one of the more problematic components of a distributed application architecture.
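
To make the trade-off above concrete, here is a minimal in-process sketch of the cache-aside pattern that an external cache usually implements. It is an illustration only: the class `CacheAside`, the `load_from_db` callback, and the TTL value are hypothetical names chosen for this example, and a real external cache would sit behind a network call rather than a Python dictionary.

```python
import time


class CacheAside:
    """Read-through lookup: try the cache first, fall back to the database."""

    def __init__(self, load_from_db, ttl_seconds=30.0, clock=time.monotonic):
        self._load = load_from_db      # hypothetical callback that queries the database
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}             # key -> (value, expires_at)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            self.hits += 1
            return entry[0]
        # Miss or expired entry: pay the full database latency, then cache.
        self.misses += 1
        value = self._load(key)
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key):
        """Must be called on every write, or readers see stale data until the TTL expires."""
        self._entries.pop(key, None)
```

Even this small sketch shows where the complexity hides: every write path has to remember to call `invalidate`, and each miss still costs a database round trip plus a cache update.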

View more...

Documenting a Java WebSocket API Using Smart-Doc

Aggregated on: 2024-08-28 15:23:04

Smart-Doc is a powerful documentation generation tool that helps developers easily create clear and detailed API documentation for Java projects. With the growing popularity of WebSocket technology, Smart-Doc has added support for WebSocket interfaces starting from version 3.0.7. This article will detail how to use Smart-Doc to generate Java WebSocket interface documentation and provide a complete example of a WebSocket server.

Overview of WebSocket Technology

First, let's briefly understand WebSocket technology. The WebSocket protocol provides a full-duplex communication channel, making data exchange between the client and server simpler and more efficient. In Java, developers can easily implement WebSocket servers and clients using JSR 356: Java API for WebSocket.

View more...

Effortless Concurrency: Leveraging the Actor Model in Financial Transaction Systems

Aggregated on: 2024-08-28 14:23:04

Introduction to the Problem

Managing concurrency in financial transaction systems is one of the most complex challenges faced by developers and system architects. Concurrency issues arise when multiple transactions are processed simultaneously, which can lead to potential conflicts and data inconsistencies. These issues manifest in various forms, such as overdrawn accounts, duplicate transactions, or mismatched records, all of which can severely undermine the system's reliability and trustworthiness. In the financial world, where the stakes are exceptionally high, even a single error can result in significant financial losses, regulatory violations, and reputational damage to the organization. Consequently, it is critical to implement robust mechanisms to handle concurrency effectively, ensuring the system's integrity and reliability.
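
The overdraft problem described above can be sketched with a tiny actor. This is not the article's implementation, which the summary does not show; it is a minimal illustration using only the standard library, and the names `AccountActor`, `ask`, and `demo` are hypothetical. The actor owns the balance and processes one mailbox message at a time, so concurrent withdrawals cannot interleave.

```python
import queue
import threading


class AccountActor:
    """Owns one account balance; all changes go through its mailbox."""

    def __init__(self, balance=0):
        self._balance = balance
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Messages are handled strictly one at a time, so no locks are needed.
        while True:
            message = self._mailbox.get()
            if message is None:          # stop signal
                break
            kind, amount, reply = message
            if kind == "withdraw":
                ok = amount <= self._balance
                if ok:
                    self._balance -= amount
                reply.put(ok)
            elif kind == "deposit":
                self._balance += amount
                reply.put(True)
            elif kind == "balance":
                reply.put(self._balance)
            else:
                reply.put(None)          # unknown message type

    def ask(self, kind, amount=0):
        """Send a message and wait for the actor's reply."""
        reply = queue.Queue(maxsize=1)
        self._mailbox.put((kind, amount, reply))
        return reply.get()

    def stop(self):
        self._mailbox.put(None)
        self._thread.join()


def demo():
    """Five threads each try to withdraw 30 from a balance of 100."""
    account = AccountActor(balance=100)
    results = []
    workers = [
        threading.Thread(target=lambda: results.append(account.ask("withdraw", 30)))
        for _ in range(5)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    final_balance = account.ask("balance")
    account.stop()
    return sorted(results), final_balance
```

Whatever order the threads run in, exactly three withdrawals succeed and the balance never goes negative, because the check and the update happen inside a single message handler.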

View more...

Enhancing Software Quality with Checkstyle and PMD: A Practical Guide

Aggregated on: 2024-08-28 13:23:04

It is widely agreed that maintaining a high-quality standard in software development is crucial for any project. However, the approach to achieving this level of quality needs further discussion. One highly effective method for ensuring quality is through software design or architecture governance. In this article, I will explain how you can use two powerful tools, Checkstyle and PMD, to establish and enforce coding standards, thus improving your project’s overall code quality and maintainability.

Understanding Checkstyle and PMD

Checkstyle is a development tool that helps you and your team establish a consistent code style standard across your project. By setting rules for code formatting, naming conventions, and other stylistic aspects, Checkstyle enforces a baseline for code quality that all team members must adhere to. This consistency is crucial, especially in large teams or projects with multiple contributors.

View more...

How to Setup Multi-Primary Istio in EKS and AKS for Production

Aggregated on: 2024-08-27 23:08:03

Many large enterprises, such as retailers and banks, are adopting the open-source Istio service mesh to abstract and better manage the security and networking of microservices. To tackle cost, achieve HA/DR, or improve latency, they apply multi-cloud and multi-cluster strategies in their production systems. Implementing Istio in a multi-cloud environment can be tricky, and architects often take time for experimentation. In this blog, we will discuss various ways to achieve multi-cloud and multi-cluster configuration for an Istio implementation and also guide you through the steps to set up primary-primary multi-cluster Istio in EKS and AKS.

View more...

Enhancing Accuracy in AI-Driven Mobile Applications: Tackling Hallucinations in Large Language Models

Aggregated on: 2024-08-27 22:08:03

In recent discussions around AI, hallucinations in Large Language Models (LLMs) have become a focal point. These hallucinations manifest when an LLM generates outputs that, while coherent and contextually appropriate, are factually incorrect. For instance, in a mobile app that provides technical support, an LLM might confidently assert that a certain deprecated API can still be used in a current version of Android, leading to potential application errors. This issue is particularly critical in my work, where precision in mobile app development is non-negotiable. Understanding why LLMs produce such hallucinations is essential, especially when deploying them in scenarios that require high trust and accuracy. It's important to recognize that an LLM is not a structured database; it functions more like a predictive text engine, generating content based on probabilistic patterns rather than factual data.

View more...

Are You Tired of Fragile Tests? Meet data-testid

Aggregated on: 2024-08-27 21:08:03

In the realm of front-end development, ensuring that your application is thoroughly tested and maintains high quality is paramount. One of the strategies that can significantly enhance both the development and testing processes is the use of the data-testid attribute. This attribute, specifically designed for testing purposes, offers numerous advantages, particularly from a QA perspective.

Benefits of Using data-testid

Stable and Reliable Locators

Benefit: One of the primary challenges in automated testing is ensuring that test scripts remain stable as the UI evolves. Typically, selectors like classes and IDs are used to locate elements in the DOM, but these can change frequently as the design or structure of the UI is updated. data-testid provides a stable and reliable way to locate elements, as it is intended solely for testing purposes and is less likely to be altered.
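
A small sketch can show why a data-testid locator survives a redesign. It uses only Python's standard `html.parser` to keep the example self-contained; a real suite would use its test framework's own locator API instead. The helper names `DataTestIdIndex` and `find_by_test_id`, and the two HTML snippets, are hypothetical.

```python
from html.parser import HTMLParser


class DataTestIdIndex(HTMLParser):
    """Collects every element that carries a data-testid attribute."""

    def __init__(self):
        super().__init__()
        self.by_test_id = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        test_id = attributes.get("data-testid")
        if test_id is not None:
            self.by_test_id[test_id] = (tag, attributes)


def find_by_test_id(html, test_id):
    """Return (tag, attributes) for the element with this data-testid, or None."""
    index = DataTestIdIndex()
    index.feed(html)
    return index.by_test_id.get(test_id)


# The same button before and after a hypothetical redesign:
# the CSS classes and the label change, the data-testid does not.
before = '<button class="btn btn-primary" data-testid="submit-order">Buy</button>'
after = '<button class="cta-large" data-testid="submit-order">Buy now</button>'
```

A test that looked the button up by `class="btn-primary"` would break on the second snippet, while `find_by_test_id(after, "submit-order")` still finds it.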

View more...

Building a Powerful AI and Machine Learning Pipeline: Best Practices and Tools

Aggregated on: 2024-08-27 20:08:03

Artificial intelligence and machine learning have evolved from experimental technologies to essential components of modern business strategies. Companies that effectively build and deploy AI/ML models gain a significant competitive advantage, but creating a fully functional AI system is complex and involves multiple stages.  Each stage, from raw data collection to the deployment of a final model, demands careful planning and execution. This article explores best practices for constructing a robust AI/ML pipeline, guiding you through every step — from data collection and processing to model deployment and monitoring.

View more...

LangChain Language Correctness Detector

Aggregated on: 2024-08-27 19:23:03

This project implements a simple LangChain language correctness detector that detects grammatical errors, sentiment, and aggressiveness, and provides solutions for the errors in the text.

Features

- Detects grammatical errors in the text
- Analyzes the sentiment of the text
- Measures the aggressiveness of the text
- Provides solutions for the detected errors

Stack Used

- Node.js: JavaScript runtime environment
- TypeScript: Typed superset of JavaScript
- LangChain: Language processing library
- OpenAI API: For language model capabilities
- Google Cloud: For additional language processing services

Installation

Clone the repository:

git clone https://github.com/xavidop/langchain-example.git
cd langchain-example

View more...

Overcoming the Retry Dilemma in Distributed Systems

Aggregated on: 2024-08-27 18:23:03

“Insanity is doing the same thing over and over again, but expecting different results” - Source unknown

As the quote above suggests, humans tend to retry things even when the results are not going to change. This tendency shows up in system design as well, where we carry the same bias into the systems we build. If you look closely, there are two broad categories of failures:

View more...

Beyond the Obvious: Uncovering the Hidden Challenges in Cybersecurity

Aggregated on: 2024-08-27 17:23:03

In the ever-evolving landscape of cybersecurity, staying ahead of threats requires more than just keeping up with the latest technologies. As we delve into the insights shared by industry experts at Black Hat 2024, it becomes clear that some of the most critical challenges facing security professionals today are often hidden in plain sight. This article explores these overlooked areas and their implications for developers, engineers, and security professionals.

The Human Element: The Overlooked Firewall

While cutting-edge technologies dominate cybersecurity discussions, several experts emphasized that the human factor remains both our greatest vulnerability and our strongest asset. Katie Paxton-Fear, API Researcher at Traceable AI, points out that "teams often fixate on what's new and shiny," potentially overlooking the crucial "human firewall."

View more...

Telemetry Pipelines Workshop: Integrating Fluent Bit With OpenTelemetry, Part 1

Aggregated on: 2024-08-27 16:23:03

Are you ready to get started with cloud-native observability with telemetry pipelines?  This article is part of a series exploring a workshop guiding you through the open source project Fluent Bit, what it is, a basic installation, and setting up the first telemetry pipeline project. Learn how to manage your cloud-native data from source to destination using the telemetry pipeline phases covering collection, aggregation, transformation, and forwarding from any source to any destination. 

View more...

Cybersecurity Career Paths: Bridging the Gap Between Red and Blue Team Roles

Aggregated on: 2024-08-27 15:23:03

In cybersecurity, professionals are often divided into two distinct groups: Red Teams, which focus on offense, and Blue Teams, which focus on defense. Red Teaming involves ethical hacking. Here, security experts simulate cyberattacks to find vulnerabilities in a system before malicious actors can exploit them. On the other hand, Blue Teaming is all about defending the system from such attacks. Blue Team members monitor, detect, and respond to security incidents. For developers, understanding the dynamics of both Red and Blue Teams is very important. Developers are often on the front lines of building and securing applications. They must consider how their work fits into the broader security landscape. Whether you are writing code for a new app or patching vulnerabilities in apps after a security breach, knowing the strategies and challenges of both teams can make you a more well-rounded professional.

View more...

Debugging Low Cache Hit Ratio

Aggregated on: 2024-08-27 14:23:03

Disk activity is much slower than reading data from RAM. With today’s performance characteristics, reading from DRAM takes around 100 nanoseconds, whereas reading from a physical drive takes between 10 microseconds (for an SSD) and 10 milliseconds (for an HDD). This is up to 100,000 times slower than accessing random access memory. Reading from the L1 cache is even faster and can take 3 CPU cycles, which is less than 1 nanosecond. Therefore, every read from a physical drive is a tremendous performance hit and should be avoided. In this blog post, we are going to see how to debug scenarios where we can’t utilize cached data and need to read from the physical drive. We’re going to see why it’s important, what to look for, and what tools and extensions to use.

How Databases Read Data

Databases are well aware of performance issues when reading data directly from the hard drive. Therefore, they incorporate many sophisticated techniques to boost performance and cache data where possible. Let’s see how the database can access the data and what happens next.
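
The latency figures quoted above can be checked with a few lines of arithmetic. The sketch below also shows why the hit ratio matters, using a simple weighted-average model in which hits are served from memory and misses from the drive. That model and the function names `slowdown` and `average_read_ns` are assumptions made for illustration, not something the article states.

```python
# Approximate latencies from the text, all expressed in nanoseconds.
DRAM_NS = 100              # around 100 ns
SSD_NS = 10_000            # 10 microseconds
HDD_NS = 10_000_000        # 10 milliseconds


def slowdown(device_ns, memory_ns=DRAM_NS):
    """How many times slower a device read is than a DRAM read."""
    return device_ns / memory_ns


def average_read_ns(hit_ratio, miss_ns, hit_ns=DRAM_NS):
    """Expected read latency when a fraction `hit_ratio` of reads hit the cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * hit_ns + (1.0 - hit_ratio) * miss_ns
```

With these numbers, `slowdown(SSD_NS)` is 100 and `slowdown(HDD_NS)` is 100,000, matching the "up to 100,000 times slower" claim. Under the assumed model, a 99% hit ratio on an SSD gives an average read of roughly 199 ns, about twice the pure DRAM latency, so even a small share of misses dominates the cost.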

View more...

Maximizing Enterprise Data: Unleashing the Productive Power of AI With the Right Approach

Aggregated on: 2024-08-26 22:08:03

In today's digital landscape, data has become the lifeblood of organizations, much like oil was in the industrial era. Yet, the genuine hurdle is converting data into meaningful insights that drive business success. With AI and generative AI revolutionizing data platforms, the critical question is: Are we ready to harness the transformative power of data to propel growth and innovation? The answer is a mixed bag. While we can derive some benefits from our data, unlocking its full potential requires a purposeful and multi-faceted approach grounded in several essential elements:

View more...

DORA Metrics: Tracking and Observability With Jenkins, Prometheus, and Observe

Aggregated on: 2024-08-26 21:08:03

DORA (DevOps Research and Assessment) metrics, developed by the DORA team, have become a standard for measuring the efficiency and effectiveness of DevOps implementations. As organizations start to adopt DevOps practices to accelerate software delivery, tracking performance and reliability becomes critical. DORA metrics help organizations address these critical tasks by providing a framework for understanding how well teams are delivering software and how quickly they can recover from failures. This article will delve into DORA metrics, demonstrate how to track them using Jenkins, and explore how to use Prometheus for collecting and displaying these metrics in Observe.

What Are DORA Metrics?

DORA metrics are a set of four key performance indicators (KPIs) that help organizations evaluate their software delivery performance. These metrics are:

View more...

Methodcentipede

Aggregated on: 2024-08-26 20:23:03

When I was a child, I used to lie on the bed and gaze for a long time at the patterns on an old Soviet rug, seeing animals and fantastical figures within them. Now, I more often look at code, but similar images still emerge in my mind. Like on the rug, these images form repetitive patterns. They can be either pleasing or repulsive. Today, I want to tell you about one such unpleasant pattern that can be found in programming.

Scenario

Imagine a service that processes a client registration request and sends an event about it to Kafka. In this article, I will show an implementation example that I consider an antipattern and suggest an improved version.

View more...

Anomaly Detection: The Dark Horse of Fraud Detection

Aggregated on: 2024-08-26 19:23:03

Today, machine learning-based fraud prediction has become a mainstay in most organizations. The two common types of machine learning are supervised and unsupervised machine learning. Of the two, supervised learning is the preferred choice for fraud prediction, for obvious reasons: a supervised model that learns patterns from known fraud cases yields more accurate predictions. Unsupervised learning, on the other hand, can be used even when we don’t have confirmed cases of fraud. The drawback is that its prediction accuracy is lower than that of supervised learning.

Supervised ML Models Won’t Know What We Don’t Know

Organizations today typically only implement supervised models. A common reason for this is the belief that if a supervised model can deliver the best performance, there is no need to have an unsupervised model. This school of thought could prove dangerous in some domains, fraud detection being one of them. Supervised models only learn what they are taught. They can’t evolve on their own to capture new fraud patterns. Fraudsters, conversely, are highly creative and constantly look for new ways of evading detection.
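
As a minimal illustration of the unsupervised side, the sketch below flags unusual transaction amounts without any fraud labels. The summary does not name a specific technique, so this assumes a modified z-score based on the median absolute deviation (MAD), with the commonly used scaling constant 0.6745 and cutoff 3.5. The function name `flag_anomalies` and the sample amounts are hypothetical.

```python
import statistics


def flag_anomalies(amounts, threshold=3.5):
    """Return the indexes of amounts whose modified z-score exceeds the threshold.

    The median and the median absolute deviation are used instead of the
    mean and standard deviation so that the outliers themselves do not
    distort the baseline.
    """
    median = statistics.median(amounts)
    mad = statistics.median(abs(amount - median) for amount in amounts)
    if mad == 0:
        # At least half of the values equal the median; the score is undefined.
        return []
    return [
        index
        for index, amount in enumerate(amounts)
        if 0.6745 * abs(amount - median) / mad > threshold
    ]


# Hypothetical card transactions: seven everyday amounts and one large one.
transactions = [20, 22, 19, 21, 23, 20, 22, 950]
suspicious = flag_anomalies(transactions)
```

No example of fraud was needed to flag the last transaction, which is the property that lets an unsupervised detector surface patterns a supervised model has never been taught.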

View more...

Multi-Agent System’s Architecture

Aggregated on: 2024-08-26 18:23:03

The distribution of decision-making and interaction among the various agents that make up the system principally distinguishes multi-agent systems from single-agent systems. In a single-agent system, a centralized agent makes all decisions, with other agents acting as remote slaves. It is customary for this one agent to decide depending on the circumstances. This can lead to the overlooking of alternative viewpoints and possibilities. On the other hand, multi-agent systems consist of several intelligent agents that interact with each other, each capable of making decisions and influencing the surrounding environment. The purpose of multi-agent architecture is to construct agents that are able to bring in multiple perspectives by virtue of the roles that they play. Different contexts facilitate the creation of these agents. Despite using the same LLM, each agent’s behavior is unique due to its specific function, objective, and context, just like a squad member.

View more...

You Don’t Get Paid to Practice Scrum

Aggregated on: 2024-08-26 17:23:03

TL;DR: Why Solving Customer Problems Instead Matters

Scrum is just a tool; your job is to solve real customer problems and deliver value. Stop focusing on perfecting frameworks and start prioritizing outcomes that matter. It’s time to reassess what truly drives your success, particularly given the challenging business environment.

Why Solving Customer Problems Matters More Than Perfecting Scrum

Agile practices, particularly within Scrum, often captivate practitioners with their events, roles, principles, rules, and stickies. However, practitioners, veterans and newcomers alike, tend to overlook two crucial truths:

View more...

Why You Should Migrate Microservices From Java to Kotlin: Experience and Insights

Aggregated on: 2024-08-26 16:23:03

I work at one of the largest private banks in Eastern Europe, developing the backend for a mobile application. Our cluster consists of more than 400 microservices, and peak loads on individual services can reach five-digit values. When we initially started transitioning to a microservices architecture, all our code was written in Java. However, over time, we began actively migrating microservices to Kotlin. Today, all new microservices are created exclusively in Kotlin, and the share of Java code has decreased to less than 20%.  In this article, I will explain why the migration to Kotlin has been so successful and why developers are eager to switch to this language, even with prior experience only in other JVM languages.

View more...

Better Search Results Through Intelligent Chunking and Metadata Integration

Aggregated on: 2024-08-26 15:23:03

Often, the knowledge bases over which we develop an LLM-based retrieval application contain a lot of data in various formats. To provide the LLM with the most relevant context to answer the question specific to a section within the knowledge base, we rely on chunking the text within the knowledge base and keeping it handy.

Chunking

Chunking is the process of slicing text into meaningful units to improve information retrieval. By ensuring each chunk represents a focused thought or idea, chunking assists in maintaining the contextual integrity of the content.
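
The idea of slicing text into meaningful units and keeping metadata alongside each unit could be sketched as follows. This is one simple strategy, assumed for illustration: it splits on blank lines, packs whole paragraphs into chunks up to a size limit, and records where each chunk came from. The names `Chunk` and `chunk_paragraphs` and the `max_chars` limit are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str        # the chunk content handed to the retriever
    source: str      # metadata: which document the chunk came from
    position: int    # metadata: order of the chunk within that document


def chunk_paragraphs(document, source, max_chars=200):
    """Pack whole paragraphs into chunks of at most `max_chars` characters.

    Paragraphs are never split, so a single paragraph longer than the
    limit becomes its own oversized chunk rather than being cut mid-thought.
    """
    chunks = []
    current = ""
    for paragraph in (part.strip() for part in document.split("\n\n")):
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > max_chars:
            # Adding this paragraph would overflow: close the current chunk.
            chunks.append(Chunk(current, source, len(chunks)))
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(Chunk(current, source, len(chunks)))
    return chunks
```

Keeping `source` and `position` on every chunk lets the retrieval layer filter by document or restore the original order later, which is where metadata starts to improve search results.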

View more...

Efficient and Fault-Tolerant Solutions for Distributed Mutual Exclusion

Aggregated on: 2024-08-26 14:23:03

In the realm of distributed systems, ensuring that only one process can access a shared resource at any given time is crucial; this is where mutual exclusion comes into play. Without a reliable way to enforce mutual exclusion, systems could easily run into issues like data inconsistency or race conditions, potentially leading to catastrophic failures.

Algorithms to Address the Challenge

Several algorithms have been developed over the years to address this challenge. One of the most well-known is the Majority Quorum Algorithm. It’s effective, no doubt, but it can be quite demanding in terms of communication, especially when you're dealing with a large network of nodes.
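
The core rule of a majority quorum can be sketched as a single-process simulation. This is a simplified illustration rather than the algorithm as the article presents it: there is no networking, no timeouts, and no failure handling, and the names `Node`, `try_enter`, and `leave` are hypothetical. Each node grants its vote to at most one requester at a time, and a requester may enter the critical section only with votes from more than half of the nodes.

```python
class Node:
    """A voter that grants permission to at most one requester at a time."""

    def __init__(self, name):
        self.name = name
        self.granted_to = None

    def request(self, requester):
        if self.granted_to in (None, requester):
            self.granted_to = requester
            return True
        return False

    def release(self, requester):
        if self.granted_to == requester:
            self.granted_to = None


def try_enter(requester, nodes):
    """Ask every node for its vote; succeed only with a strict majority."""
    grants = [node for node in nodes if node.request(requester)]
    if len(grants) > len(nodes) // 2:
        return True
    # Not enough votes: give them back so other requesters are not blocked.
    for node in grants:
        node.release(requester)
    return False


def leave(requester, nodes):
    """Release every vote held by the requester after the critical section."""
    for node in nodes:
        node.release(requester)
```

Because any two majorities of the same node set overlap in at least one node, two requesters can never both collect a majority at once. The sketch also hints at the communication cost mentioned above: every attempt contacts all nodes, so the number of messages grows with the size of the network.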

View more...

Securing the Future: Defending LLM-Based Applications in the Age of AI

Aggregated on: 2024-08-26 13:23:03

As artificial intelligence and large language models (LLMs) continue to revolutionize the tech landscape, they also introduce new security challenges that developers, engineers, architects, and security professionals must address. At Black Hat 2024, we spoke with Mick Baccio, Global Security Strategist at Splunk, about their research on the exploitation of LLM-based applications and how organizations can implement the OWASP Top 10 framework to better defend these systems.

The Evolving Threat Landscape

The rapid adoption of LLM-based applications has opened up new avenues for potential exploitation. Baccio emphasizes the importance of understanding these emerging threats:

View more...

A Hands-On Guide to OpenTelemetry: Exploring Telemetry Data With Jaeger

Aggregated on: 2024-08-23 23:23:01

Are you ready to start your journey on the road to collecting telemetry data from your applications? Great observability begins with great instrumentation!  In this series, you'll explore how to adopt OpenTelemetry (OTel) and how to instrument an application to collect tracing telemetry. You'll learn how to leverage out-of-the-box automatic instrumentation tools and understand when it's necessary to explore more advanced manual instrumentation for your applications. By the end of this series, you'll have an understanding of how telemetry travels from your applications to the OpenTelemetry Collector, and be ready to bring OpenTelemetry to your future projects. Everything discussed here is supported by a hands-on, self-paced workshop authored by Paige Cruz. 

View more...

Architectural Patterns for Enterprise Generative AI Apps: DSFT, RAG, RAFT, and GraphRAG

Aggregated on: 2024-08-23 21:23:01

A well-designed enterprise architecture is the backbone of any organization's IT systems, supporting the foundational building blocks needed to achieve the organization's business objectives. The architecture consists of best practices, clearly outlined strategies, common frameworks, and guidelines that help the engineering team and other stakeholders pick the right tool for each task. Enterprise architecture is mostly governed by the architecture team that supports the line of business. In most organizations, the architecture team is responsible for outlining the architecture patterns and common frameworks, so that engineering and product teams can design the core building blocks from established patterns instead of spending hours on proofs of concept. Since Generative AI has been transforming the entire landscape, most organizations are either building Generative AI-based applications or integrating Generative AI capabilities into their existing applications or products. In this article, we will take a deep dive into the common architectural patterns that are available for building Generative AI solutions. We will also discuss various enterprise-level strategies for picking the right framework for the right use case.

View more...

When Backstage Met Terraform: Exploring Platform Abstractions [Video]

Aggregated on: 2024-08-23 19:08:01

In a recent fireside chat, I delved into the intriguing convergence of Terraform and Backstage, two pivotal technologies reshaping the landscape of platform engineering. This session was particularly exciting as I had a special guest, Seve Kim, a product manager from Backstage, joining Abby Bangser and me!  Although the chat only lasted 30 minutes, we covered much ground. We explored the layers of platform architecture, talked about the goals of platform engineering, and discussed how best to leverage open-source tools like Backstage, Kratix, and Terraform to create robust, scalable internal developer platforms.

View more...

The Role of Data Governance in Data Strategy: Part 3

Aggregated on: 2024-08-23 17:08:01

Data Subject Access Rights (DSAR) In the previous articles (Part 1 and Part 2), we looked at BigID and how it enhances the data in an organization. In this article, let's see what Data Subject Access Rights (DSAR) are and how they correlate to individual rights in real time. Data Rights Fulfillment (DRF) is the set of steps/actions an organization takes to comply with data protection rules and to ensure that individual rights and personal data are respected.

View more...

Agile Practices That Developers Can Use to Create Better Projects

Aggregated on: 2024-08-23 15:08:01

No matter what the project is about, every software development team wants to create the best project: one that is bug-free, performs well, and meets the customer's needs. However, the software development cycle involves many things that can slow down the project or create mismatched expectations and communication problems between developers. One way to remedy these situations is by adopting Agile practices. In this article, we explain what Agile is, describe its best practices, and show how they can help software developers create better projects.

View more...

Integrating Apache Kafka in KRaft Mode With RisingWave for Event Stream Processing

Aggregated on: 2024-08-23 13:08:01

Over the past few years, Apache Kafka has emerged as the top event streaming platform for streaming data/event ingestion. However, in an earlier version of Apache Kafka, 3.5, ZooKeeper was an additional, mandatory component for managing and coordinating the Kafka cluster. Relying on ZooKeeper in an operational multi-node Kafka cluster introduced complexity and could be a single point of failure. ZooKeeper is a completely separate system with its own configuration file syntax, management tools, and deployment patterns. Deploying and managing two separate distributed systems, and ultimately getting a Kafka cluster up and running, requires in-depth skills and experience. Expertise in Kafka administration alone, without ZooKeeper expertise, is not enough to recover from a crisis, especially in a production environment where ZooKeeper runs in a completely isolated environment (cloud).

View more...

Parent Document Retrieval (PDR): Useful Technique in RAG

Aggregated on: 2024-08-22 22:23:01

What Is Parent Document Retrieval (PDR)? Parent Document Retrieval is a method used in state-of-the-art RAG models to recover the full parent documents from which relevant child passages or snippets were extracted. The recovered parent document enriches the context passed to the RAG model, enabling more comprehensive, information-rich responses to complex or nuanced questions. Major steps in parent document retrieval in RAG models include:
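
As a rough illustration of the retrieval idea described above (not the article's own implementation), the sketch below splits each parent document into small child chunks, scores the chunks against a query, and returns the full parent of the best-matching chunk. Every name here (`Chunk`, `split_into_chunks`, `overlap_score`, `retrieve_parent`) is hypothetical, and the simple word-overlap score is an assumption standing in for the embedding similarity a real RAG pipeline would likely use.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Chunk:
    """A child passage that remembers which parent document it came from."""
    parent_id: str
    text: str


def split_into_chunks(parent_id: str, text: str, size: int = 8) -> List[Chunk]:
    """Split a parent document into child chunks of at most `size` words."""
    words = text.split()
    return [
        Chunk(parent_id, " ".join(words[i:i + size]))
        for i in range(0, len(words), size)
    ]


def overlap_score(query: str, text: str) -> int:
    """Count distinct words shared by the query and the text.

    Assumption: a stand-in for vector similarity, chosen to keep the sketch
    dependency-free.
    """
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve_parent(query: str, parents: Dict[str, str], size: int = 8) -> str:
    """Match the query against child chunks, then return the whole parent."""
    chunks = [
        chunk
        for parent_id, text in parents.items()
        for chunk in split_into_chunks(parent_id, text, size)
    ]
    best = max(chunks, key=lambda chunk: overlap_score(query, chunk.text))
    return parents[best.parent_id]


if __name__ == "__main__":
    docs = {
        "kafka": "Apache Kafka is an event streaming platform used for ingestion of events",
        "cors": "CORS lets browsers access resources from a different origin in a controlled way",
    }
    print(retrieve_parent("what is event streaming", docs))
```

Matching on small chunks keeps the search precise, while returning the parent gives the model the surrounding context that a single chunk would lack.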

View more...

Setting Up CORS and Integration on AWS API Gateway Using CloudFormation

Aggregated on: 2024-08-22 20:08:01

Cross-Origin Resource Sharing (CORS) is an essential security mechanism used by web browsers that allows regulated access to server resources from origins that differ in domain, protocol, or port. For APIs, especially those built on AWS API Gateway, configuring CORS is crucial to give web applications on other domains access while mitigating potential security risks. This article aims to provide a comprehensive guide to setting up CORS and integration on AWS API Gateway through CloudFormation. It will emphasize the significance of CORS, the development of authorization including bearer tokens, and the advantages of selecting optional methods in place of standard GET requests.
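
To make the notion of "origin" concrete, here is a minimal sketch of how two URLs can be compared by protocol (scheme), domain (host), and port, the three parts named above. The helper names `origin_of` and `is_cross_origin` and the example URLs are hypothetical; the sketch only illustrates the comparison and is not how a browser or API Gateway implements CORS.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Ports implied by a scheme when the URL does not state one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str) -> Tuple[str, str, Optional[int]]:
    """Return the (scheme, host, port) triple that identifies a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return scheme, host, port


def is_cross_origin(page_url: str, resource_url: str) -> bool:
    """True when the two URLs differ in scheme, host, or port."""
    return origin_of(page_url) != origin_of(resource_url)


if __name__ == "__main__":
    page = "https://app.example.com/dashboard"
    for resource in (
        "https://app.example.com/api/items",      # same origin
        "http://app.example.com/api/items",       # different protocol
        "https://api.example.com/items",          # different domain
        "https://app.example.com:8443/api/items", # different port
    ):
        print(resource, is_cross_origin(page, resource))
```

Only requests for which this comparison reports a difference are subject to CORS checks; same-origin requests are not.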

View more...

Protect Your Alerts: The Importance of Independent Incident Alert Management

Aggregated on: 2024-08-22 18:08:01

In a world where IT infrastructure underpins countless businesses and organizations, maintaining operational integrity during critical failures or outages is non-negotiable. A key element in achieving this is ensuring that your incident alert management system remains active and accessible under all circumstances. Unfortunately, a significant vulnerability can arise when the incident alert management system shares the same cloud provider as your primary services. If that cloud provider experiences an outage, your alert management system could become unavailable just when it is needed most. This could lead to delayed responses, prolonged downtime, and potentially catastrophic consequences for your business operations. Understanding the Role of Redundancy in Incident Management Redundancy is a fundamental principle in IT management, especially when it comes to ensuring continuous operations. Consider a scenario where your services are hosted on a major cloud provider like Amazon Web Services (AWS), Microsoft Azure, or Google Cloud. While these platforms are robust and reliable, they are not infallible. They can experience, and have experienced, failures caused by factors such as Distributed Denial of Service (DDoS) attacks, major hardware failures, software bugs, or human error resulting in misconfigurations. In such situations, if your incident alert management system is hosted on the same cloud, the very tools you rely on to notify you of the outage might be compromised as well. This could leave your IT team in the dark, unaware of the issues and unable to respond promptly.

View more...

Kotlin Coroutines and OpenTelemetry Tracing

Aggregated on: 2024-08-22 16:08:01

I recently compared three OpenTelemetry approaches on the JVM: Java Agent v1, v2, and Micrometer. I used Kotlin and coroutines without overthinking. I received interesting feedback on the usage of @WithSpan with coroutines: Indeed, the @WithSpan annotation has worked flawlessly in conjunction with coroutines for some time already. However, it made me think about the underlying workings of OpenTelemetry. Here are my findings.

View more...

Day in the Life of a Developer With Google’s Gemini Code Assist: Part 1

Aggregated on: 2024-08-22 15:08:00

I started evaluating Google's Gemini Code Assist in December 2023, around the time of its launch. The aim of this article is to cover its usage and impact beyond basic code generation, across all the activities that developers are expected to perform in their daily work (especially given the additional responsibilities entrusted to developers these days with the advent of "Shift-Left" and full-stack development roles). Gemini Code Assist Gemini Code Assist can be tried at no cost until November 2024.

View more...

Multicluster Gateways With Kubernetes Gateway API

Aggregated on: 2024-08-22 14:08:01

Kubernetes Gateway API is the new specification released by CNCF to standardize Kubernetes Ingress traffic. Now, what if a service is configured for High Availability (HA)? (Say it is in a different cloud environment and you have to access it from the Gateway, i.e., a multicluster, multi-cloud scenario.) In this article, we will showcase how to use the Gateway API spec to configure gateways for a multicluster setup. Multicluster Kubernetes Gateway Demo Overview We have two clusters: one in EKS (primary) and the other in GKE (remote). I have deployed Istio in both clusters, using a primary-remote Istio installation. Istio is used as the controller to implement the Gateway API resources.

View more...

Evolution of Governance Framework With AI

Aggregated on: 2024-08-22 13:08:01

We have seen that running effective projects and conducting business both depend on the governance framework. This framework lays out the fundamentals of management. Its principles help bring all of the business's top stakeholders into alignment on guiding objectives, such as defining performance standards, determining acceptable risk levels, and determining the manner and content of reporting. All projects must adhere to these core ideas. It may take a lot of work to create such a framework, but once all stakeholders are aligned on it, it can be leveraged throughout the project management lifecycle. The framework plays an important role in managing and mitigating the risks that arise throughout the project lifecycle. We have seen many last-minute risks surface late in the project management cycle, where such a governance framework is helpful. The framework fosters trust among stakeholders and improves decision-making for each risk.

View more...

Improving Snowflake Performance by Mastering the Query Profile

Aggregated on: 2024-08-21 22:08:00

Having worked with over 50 Snowflake customers across Europe and the Middle East, I've analyzed hundreds of Query Profiles and identified many issues, including ones around performance and cost. In this article, I'll explain:

View more...

Automatic 1111: Sketch-to-Image Workflow

Aggregated on: 2024-08-21 21:08:00

In this article, we will discuss how to convert hand-drawn or digital sketches into photorealistic images using Stable Diffusion models with the help of ControlNet. We will extend Automatic 1111's txt2img feature to develop this custom workflow. Prerequisites Before diving in, let's make sure we have the following prerequisites covered:

View more...