In today’s data-driven world, effective data management is crucial for organizations that want to make well-informed decisions. As the importance of data grows, so does the significance of robust data management practices: the processes of ingesting, storing, organizing, and maintaining the data an organization generates and collects. Within data management, schema evolution is one of the most critical aspects. Businesses evolve over time, leading to changes in data and, consequently, in the corresponding schemas. Even when a schema is initially defined for your data, evolving business requirements inevitably demand schema modifications. Yet modifying data structures is no straightforward task, especially when dealing with distributed systems and teams. Downstream consumers of the data must be able to adapt seamlessly to new schemas, and coordinating these changes is critical to minimizing downtime and preventing production issues. Neglecting robust data management and schema evolution strategies can result in service disruptions, broken data pipelines, and significant future costs. In the context of Apache Kafka, schema evolution is managed through a schema registry. As producers share data with consumers via Kafka, the schema is stored in this registry. The schema registry enhances the reliability, flexibility, and scalability of systems and applications by providing a standardized approach to managing and validating schemas used by both producers and consumers. This blog post will walk you through the steps of using Amazon MSK in combination with AWS Glue Schema Registry and Terraform to build a cross-account streaming pipeline for Kafka, complete with built-in schema evolution. This approach provides a comprehensive solution to address your dynamic and evolving data requirements.
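To make the idea of consumers adapting to a new schema more concrete, here is a minimal sketch of a backward-compatibility check. It is a deliberate simplification and not the algorithm used by AWS Glue Schema Registry or Avro: schemas are modelled as plain lists of field dictionaries, and the function name `is_backward_compatible` and the sample fields are hypothetical, chosen only for illustration.

```python
def is_backward_compatible(old_fields, new_fields):
    """Simplified check: can a reader using new_fields decode data written with old_fields?

    Assumed rules (illustrative only, not a registry's real algorithm):
    - a field added in the new schema must declare a default value
    - a field present in both schemas must keep the same type
    - a field removed in the new schema is simply ignored by the reader
    """
    old_by_name = {field["name"]: field for field in old_fields}
    for field in new_fields:
        old = old_by_name.get(field["name"])
        if old is None:
            # New field: old records do not contain it, so a default is required.
            if "default" not in field:
                return False
        elif old["type"] != field["type"]:
            # Shared field: a type change would break decoding of old records.
            return False
    return True


# Hypothetical schema versions for an "order" record.
order_v1 = [
    {"name": "id", "type": "string"},
    {"name": "amount", "type": "double"},
]
order_v2_with_default = order_v1 + [
    {"name": "currency", "type": "string", "default": "EUR"},
]
order_v2_without_default = order_v1 + [
    {"name": "currency", "type": "string"},
]

print(is_backward_compatible(order_v1, order_v2_with_default))
print(is_backward_compatible(order_v1, order_v2_without_default))
```

Under these assumed rules, adding an optional field with a default is accepted, while adding a required field without one is rejected. A real schema registry applies richer, format-specific rules and typically offers several compatibility modes.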
Articles tagged with "iac"
When you build a hybrid hub and spoke network in the cloud that integrates on-premises networks and allows internet-based access, a network firewall is essential for robust security. It analyzes and filters traffic between these entities to safeguard against both internal and external cyber threats and exploits. By actively monitoring and inspecting the flow of traffic, a network firewall plays a crucial role in identifying and blocking vulnerability exploits and unauthorized access attempts. Within the AWS ecosystem, AWS Network Firewall is a service that is often used to achieve a high level of network security. As a stateful and fully managed network firewall, it includes intrusion detection and prevention capabilities, offering comprehensive protection for VPC-based network traffic. This blog post aims to guide you through the process of integrating AWS Network Firewall into your hybrid AWS hub and spoke network. By doing so, you can effectively analyze, monitor, and filter both incoming and outgoing network traffic among all involved parties, thereby enhancing the overall security of your infrastructure layer.
When leveraging AWS services such as EC2, ECS, or EKS, achieving standardized and automated image creation and configuration is essential for securely managing workloads at scale. The concept of a Golden AMI is often used in this context. Golden AMIs are pre-configured, hardened, and thoroughly tested machine images that contain a fully configured operating system, essential software packages, and customizations tailored to specific workloads. It is also strongly recommended to conduct comprehensive security scans during the image creation process to mitigate the risk of vulnerabilities. By adopting Golden AMIs, you can ensure consistent configuration across different environments, leading to shorter setup and deployment times, fewer configuration errors, and a reduced risk of security breaches. In this blog post, I would like to demonstrate how you can leverage AWS CodePipeline and AWS Step Functions, along with Terraform and Packer, to establish a fully automated pipeline for creating Golden AMIs.
Deploying resources with infrastructure as code is the recommended way to provision resources in AWS. The native AWS way of doing it is by using CloudFormation or CDK (Cloud Development Kit), and you should of course do this from day one. But in the real world, sometimes somebody has provisioned resources via the console, or there is a need to refactor your code and split your stack into multiple stacks. Luckily, cases where resources need to be imported are not very common.
When implementing a hybrid cloud solution and connecting your AWS VPCs with corporate data centers, setting up proper DNS resolution across the whole network is an important step to ensure full integration and functionality. To accomplish this, Route 53 inbound and outbound endpoints can be used. In combination with forwarding rules, they allow you to forward DNS traffic between your AWS VPC and on-premises data centers. In this blog post, I would like to show you how you can leverage Route 53 endpoints in combination with Terraform to establish seamless DNS query resolution across your entire hybrid network.
When setting up an IPsec VPN connection between your AWS network and your corporate data center, the fully managed AWS Site-to-Site VPN service is a popular choice that often comes to mind. AWS Site-to-Site VPN offers a highly available, scalable, and secure way to connect your on-premises users and workloads to AWS. In this blog post, I would like to show you how you can go beyond a simple, static AWS Site-to-Site VPN connection by leveraging dynamically routed Site-to-Site VPNs in combination with a Transit Gateway. This hub and spoke network setup will allow us to employ the Border Gateway Protocol (BGP) as well as equal-cost multi-path routing (ECMP) and AWS Global Accelerator to not only exchange routing information between AWS and the corporate data center automatically but also increase the overall VPN throughput and reliability.
Do you run software that provides locally available health checks via a webserver only reachable via localhost? In this blog post, I will show you an architecture that you can use to connect those local health checks to CloudWatch Logs and even receive alarms if things are not going to plan.
Every machine has recurring tasks: backups, updates, runs of configuration management software like Chef, small scripts, … But one of the problems in a cloud environment is visibility. Instead of scheduling dozens of cron jobs or tasks per instance, would it not be nice to have a central service for this? You already have one, and it’s called EventBridge…