AWS Big Data Blog
Secure access to a cross-account Amazon MSK cluster from Amazon MSK Connect using IAM authentication
Amazon Managed Streaming for Apache Kafka (MSK) Connect is a fully managed, scalable, and highly available service that enables the streaming of data between Apache Kafka and other data systems. Amazon MSK Connect is built on top of Kafka Connect, an open-source framework that provides a standard way to connect Kafka with external data systems. Kafka Connect supports a variety of connectors, which are used to stream data in and out of Kafka. MSK Connect extends the capabilities of Kafka Connect by providing a managed service with added security features, straightforward configuration, and automatic scaling capabilities, enabling businesses to focus on their data streaming needs without the overhead of managing the underlying infrastructure.
In some use cases, you might need to use an MSK cluster in one AWS account, but MSK Connect is located in a separate account. In this post, we demonstrate how to create a connector to achieve this use case. At the time of writing, MSK Connect connectors can be created only for MSK clusters that have AWS Identity and Access Management (IAM) role-based authentication or no authentication. We demonstrate how to implement IAM authentication after establishing network connectivity. IAM provides enhanced security measures, making sure your systems are protected against unauthorized access.
Solution overview
The connector can be configured for a variety of purposes, such as sinking data to an Amazon Simple Storage Service (Amazon S3) bucket, tracking source database changes, or serving as a migration tool. For example, MirrorMaker2 on MSK Connect can transfer data from a source cluster to a target cluster that is located in a different account.
The following diagram illustrates a use case using Debezium and Amazon S3 source connectors.
The following diagram illustrates using S3 Sink and migration to a cross-account failover cluster using a MirrorMaker connector deployed on MSK Connect.
The launch of multi-VPC private connectivity (powered by AWS PrivateLink) and cluster policy support for MSK clusters simplifies the connectivity of Kafka clients to brokers. By enabling this feature on the MSK cluster, you can use the cluster-based policy to manage all access control centrally in one place. In this post, we cover the process of enabling this feature on the source MSK cluster.
We don’t fully utilize the multi-VPC connectivity provided by this new feature because it requires different bootstrap URLs with port numbers (such as 14001-14003) that MSK Connect does not support at the time of writing. Instead, we explore a secure network connectivity solution that uses private connectivity patterns, as detailed in How Goldman Sachs builds cross-account connectivity to their Amazon MSK clusters with AWS PrivateLink.
Connecting to a cross-account MSK cluster from MSK Connect involves the following steps.
Steps to configure the MSK cluster in Account A:
- Enable the multi-VPC private connectivity (PrivateLink) feature for the IAM authentication scheme that is enabled on your MSK cluster.
- Configure the cluster policy to allow a cross-account connector.
- Implement one of the preceding network connectivity patterns according to your use case to establish the connectivity with the Account B VPC and make network changes accordingly.
Steps to configure the MSK connector in Account B:
- Create an MSK connector in private subnets using the AWS Command Line Interface (AWS CLI).
- Verify the network connectivity from Account A and make network changes accordingly.
- Check the destination service to verify the incoming data.
Prerequisites
To follow along with this post, you should have an MSK cluster in one AWS account and MSK Connect in a separate account.
Set up the MSK cluster in Account A:
In this post, we only show the important steps that are required to enable the multi-VPC feature on an MSK cluster:
- Create a provisioned MSK cluster in Account A’s VPC with the following considerations, which are required for the multi-VPC feature:
- Cluster version must be 2.7.1 or higher.
- Instance type must be m5.large or higher.
- Authentication should be IAM (you must not enable unauthenticated access for this cluster).
- After you create the cluster, go to the Networking settings section of your cluster and choose Edit. Then choose Turn on multi-VPC connectivity.
- Select IAM role-based authentication and choose Turn on selection.
It might take around 30 minutes to enable. This step is required to enable the cluster policy feature that allows the cross-account connector to access the MSK cluster.
- After it has been enabled, scroll down to Security settings and choose Edit cluster policy.
- Define your cluster policy and choose Save changes.
- The cluster policy editor offers a Basic and an Advanced option. The Basic option allows only the CreateVPCConnection, GetBootstrapBrokers, DescribeCluster, and DescribeClusterV2 actions, which are required for creating the cross-VPC connectivity to your cluster. However, we have to use the Advanced option to allow the additional actions that the MSK connector requires. The policy should be as follows:
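As a minimal sketch of what an Advanced cluster policy could look like, the following Python snippet builds the policy document and prints it as JSON for pasting into the policy editor. The cluster ARN, the connector role ARN, and the exact set of `kafka-cluster:*` actions are assumptions chosen for illustration (a sink connector that reads topics and manages its own internal topics and consumer groups); adjust them to your own resources.

```python
import json

# Hypothetical placeholders; replace with your own values.
CLUSTER_ARN = (
    "arn:aws:kafka:us-east-1:111111111111:cluster/"
    "demo-cluster/1234abcd-12ab-34cd-56ef-1234567890ab-1"
)  # MSK cluster in Account A (assumed)
CONNECTOR_ROLE_ARN = "arn:aws:iam::222222222222:role/msk-connector-role"  # Account B (assumed)


def build_cluster_policy(cluster_arn: str, principal_arn: str) -> dict:
    """Build an Advanced cluster policy for one cross-account principal."""
    # Topic and group ARNs reuse the cluster's "<name>/<uuid>" path.
    arn_prefix, cluster_path = cluster_arn.split(":cluster/", 1)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": [
                    # Actions that the Basic option already covers
                    "kafka:CreateVpcConnection",
                    "kafka:GetBootstrapBrokers",
                    "kafka:DescribeCluster",
                    "kafka:DescribeClusterV2",
                    # Data-plane actions (assumed set for a sink connector)
                    "kafka-cluster:Connect",
                    "kafka-cluster:DescribeCluster",
                    "kafka-cluster:DescribeTopic",
                    "kafka-cluster:CreateTopic",
                    "kafka-cluster:ReadData",
                    "kafka-cluster:WriteData",
                    "kafka-cluster:DescribeGroup",
                    "kafka-cluster:AlterGroup",
                ],
                "Resource": [
                    cluster_arn,
                    f"{arn_prefix}:topic/{cluster_path}/*",
                    f"{arn_prefix}:group/{cluster_path}/*",
                ],
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_cluster_policy(CLUSTER_ARN, CONNECTOR_ROLE_ARN), indent=2))
```

Using a role ARN as the principal limits access to a single connector role; an account ID in its place would admit any permitted principal in that account.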
You might need to modify the preceding permissions to limit access to specific resources (topics, groups). You can also restrict access to a specific connector by specifying the connector’s IAM role as the principal, or specify the account ID to allow all connectors in that account.
Now the cluster is ready. However, you need to make sure of the network connectivity between the cross-account connector VPC and the MSK cluster VPC.
If you’re using VPC peering or Transit Gateway while connecting to MSK Connect either from cross-account or the same account, do not configure your connector to reach the peered VPC resources with IPs in the following CIDR ranges (for more details, see Connecting from connectors):
- 10.99.0.0/16
- 192.168.0.0/16
- 172.21.0.0/16
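Before settling on a network design, it can help to check broker IPs or subnet CIDRs against the ranges listed above. The following is a small sketch using Python's standard `ipaddress` module; the function names and the sample addresses are hypothetical.

```python
import ipaddress

# CIDR ranges from the list above that connectors should not target
# when reaching resources over VPC peering or Transit Gateway.
RESTRICTED_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.99.0.0/16", "192.168.0.0/16", "172.21.0.0/16")
]


def ip_in_restricted_range(ip: str) -> bool:
    """Return True if a single IP address falls inside a restricted range."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in RESTRICTED_RANGES)


def cidr_overlaps_restricted_range(cidr: str) -> bool:
    """Return True if a subnet or VPC CIDR overlaps any restricted range."""
    network = ipaddress.ip_network(cidr)
    return any(network.overlaps(restricted) for restricted in RESTRICTED_RANGES)


if __name__ == "__main__":
    # Sample values for illustration only.
    for ip in ("10.99.4.20", "10.0.1.15"):
        print(ip, "restricted" if ip_in_restricted_range(ip) else "ok")
    for cidr in ("10.0.0.0/8", "172.31.0.0/16"):
        print(cidr, "overlaps" if cidr_overlaps_restricted_range(cidr) else "ok")
```

A broker subnet that overlaps one of these ranges is a signal to pick different subnets or a different connectivity pattern for the MSK cluster VPC.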
In the MSK cluster security group, make sure you allow inbound traffic on port 9098 (the IAM authentication port) from the Account B network resources, and update the subnets according to your network connectivity pattern.
Set up the MSK connector in Account B:
In this section, we demonstrate how to use the S3 Sink connector. However, you can use a different connector according to your use case and make the changes accordingly.
- Create an S3 bucket (or use an existing bucket).
- Make sure that the VPC that you’re using in this account has a security group and private subnets. If your connector for MSK Connect needs access to the internet, refer to Enable internet access for Amazon MSK Connect.
- Verify the network connectivity between Account A and Account B by using the telnet command to the broker endpoints with port 9098.
- Create an S3 VPC endpoint.
- Create a connector plugin according to your connector plugin provider (for example, Confluent or Lenses). Make a note of the custom plugin Amazon Resource Name (ARN) to use in a later step.
- Create an IAM role for your connector to allow access to your S3 bucket and the MSK cluster.
- The IAM role’s trust relationship should be as follows:
- Add the following S3 access policy to your IAM role:
- The following policy contains the required actions by the connector:
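The policy documents for the connector role can be sketched as follows. This is a minimal illustration, not a definitive policy set: the bucket name and cluster ARN are hypothetical placeholders, and the S3 and `kafka-cluster:*` action lists are assumptions for a sink connector that you should narrow to your use case.

```python
import json

BUCKET = "my-connector-bucket"  # hypothetical bucket name
CLUSTER_ARN = (
    "arn:aws:kafka:us-east-1:111111111111:cluster/"
    "demo-cluster/1234abcd-12ab-34cd-56ef-1234567890ab-1"
)  # Account A cluster (assumed)


def trust_policy() -> dict:
    """Trust relationship that lets MSK Connect assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "kafkaconnect.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }


def s3_access_policy(bucket: str) -> dict:
    """Bucket-level and object-level permissions for the S3 sink (assumed set)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation",
                           "s3:ListBucketMultipartUploads"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject",
                           "s3:AbortMultipartUpload", "s3:ListMultipartUploadParts"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


def msk_access_policy(cluster_arn: str) -> dict:
    """Kafka actions the connector needs on the cross-account cluster (assumed set)."""
    arn_prefix, cluster_path = cluster_arn.split(":cluster/", 1)
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": [
                "kafka-cluster:Connect", "kafka-cluster:DescribeCluster",
                "kafka-cluster:DescribeTopic", "kafka-cluster:CreateTopic",
                "kafka-cluster:ReadData", "kafka-cluster:WriteData",
                "kafka-cluster:DescribeGroup", "kafka-cluster:AlterGroup",
            ],
            "Resource": [
                cluster_arn,
                f"{arn_prefix}:topic/{cluster_path}/*",
                f"{arn_prefix}:group/{cluster_path}/*",
            ],
        }],
    }


if __name__ == "__main__":
    for doc in (trust_policy(), s3_access_policy(BUCKET), msk_access_policy(CLUSTER_ARN)):
        print(json.dumps(doc, indent=2))
```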
You might need to modify the preceding permissions to limit access to your resources (topics, groups).
Finally, it’s time to create the MSK connector. Because the Amazon MSK console doesn’t allow viewing MSK clusters in other accounts, we show you how to use the AWS CLI instead. We also use basic Amazon S3 configuration for testing purposes. You might need to modify the configuration according to your connector’s use case.
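As a rough sketch of a connector definition, the following Python function assembles a create-connector request for an S3 sink and prints it as JSON. The field names follow the MSK Connect `CreateConnector` request shape; the connector name, capacity, Kafka Connect version, and all ARNs, subnets, and endpoints are hypothetical placeholders, and the connector configuration assumes the Confluent S3 sink plugin.

```python
import json


def build_connector_request(bootstrap_servers, subnets, security_groups,
                            plugin_arn, role_arn, bucket, region, topic, log_group):
    """Build a create-connector request for an S3 sink using IAM authentication."""
    return {
        "connectorName": "cross-account-s3-sink",  # hypothetical name
        "kafkaConnectVersion": "2.7.1",
        "capacity": {"provisionedCapacity": {"mcuCount": 1, "workerCount": 1}},
        "connectorConfiguration": {
            # Basic Confluent S3 sink settings for testing (assumed plugin)
            "connector.class": "io.confluent.connect.s3.S3SinkConnector",
            "tasks.max": "1",
            "topics": topic,
            "s3.region": region,
            "s3.bucket.name": bucket,
            "flush.size": "1",
            "storage.class": "io.confluent.connect.s3.storage.S3Storage",
            "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
            "partitioner.class": "io.confluent.connect.storage.partitioner.DefaultPartitioner",
            "key.converter": "org.apache.kafka.connect.storage.StringConverter",
            "value.converter": "org.apache.kafka.connect.storage.StringConverter",
            "schema.compatibility": "NONE",
        },
        "kafkaCluster": {
            "apacheKafkaCluster": {
                # Account A broker endpoints on the IAM port (9098)
                "bootstrapServers": bootstrap_servers,
                # Account B private subnets and security group
                "vpc": {"subnets": subnets, "securityGroups": security_groups},
            }
        },
        "kafkaClusterClientAuthentication": {"authenticationType": "IAM"},
        "kafkaClusterEncryptionInTransit": {"encryptionType": "TLS"},
        "plugins": [{"customPlugin": {"customPluginArn": plugin_arn, "revision": 1}}],
        "serviceExecutionRoleArn": role_arn,
        "logDelivery": {
            "workerLogDelivery": {
                "cloudWatchLogs": {"enabled": True, "logGroup": log_group}
            }
        },
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    request = build_connector_request(
        bootstrap_servers="b-1.example.kafka.us-east-1.amazonaws.com:9098",
        subnets=["subnet-aaaa1111", "subnet-bbbb2222"],
        security_groups=["sg-cccc3333"],
        plugin_arn="arn:aws:kafkaconnect:us-east-1:222222222222:custom-plugin/s3-sink/abcd",
        role_arn="arn:aws:iam::222222222222:role/msk-connector-role",
        bucket="my-connector-bucket",
        region="us-east-1",
        topic="demo-topic",
        log_group="/msk-connect/cross-account-s3-sink",
    )
    print(json.dumps(request, indent=2))
```

One way to submit the result from Account B is to save the JSON to a file such as `connector.json` and pass it to `aws kafkaconnect create-connector --cli-input-json file://connector.json`.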
- Create a connector using the AWS CLI with the following command with the required parameters of the connector, along with Account A’s MSK cluster broker endpoints:
- After you create the connector, connect the producer to your topic and insert data into it. In the following code, we use a Kafka client to insert data for testing purposes:
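A Kafka client that produces test data to an IAM-authenticated cluster needs SASL/IAM client settings. As a minimal sketch, the following Python snippet writes a `client.properties` file with those settings; it assumes a Java-based Kafka client with the `aws-msk-iam-auth` library on its classpath, and the file name is a hypothetical choice.

```python
# Client settings for IAM authentication with the aws-msk-iam-auth library.
IAM_CLIENT_PROPERTIES = {
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "AWS_MSK_IAM",
    "sasl.jaas.config": "software.amazon.msk.auth.iam.IAMLoginModule required;",
    "sasl.client.callback.handler.class":
        "software.amazon.msk.auth.iam.IAMClientCallbackHandler",
}


def render_properties(properties: dict) -> str:
    """Render a dict as Java-style key=value lines."""
    return "".join(f"{key}={value}\n" for key, value in properties.items())


def write_client_properties(path: str = "client.properties") -> str:
    """Write the IAM client settings to a file and return its path."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(render_properties(IAM_CLIENT_PROPERTIES))
    return path


if __name__ == "__main__":
    print(render_properties(IAM_CLIENT_PROPERTIES), end="")
```

With that file in place, a console producer could be started with `bin/kafka-console-producer.sh --bootstrap-server <broker-endpoint>:9098 --producer.config client.properties --topic <topic-name>`, where the endpoint and topic are your own values.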
If everything is set up correctly, you should see the data in your destination S3 bucket. If not, check the troubleshooting tips in the following section.
Troubleshooting tips
After deploying the connector, if it’s in the CREATING state on the connector details page, access the Amazon CloudWatch log group specified in your connector creation request. Review the logs for any errors. If no errors are found, wait for the connector to complete its creation process.
Additionally, make sure the IAM roles have their required permissions, and check the security groups and NACLs for proper connectivity between VPCs.
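To separate network problems from permission problems, a plain TCP check against each broker endpoint is often enough. The following sketch does roughly what a telnet test does, using only the Python standard library; run it from a host in the connector's subnets in Account B. The function names and the sample endpoint are hypothetical.

```python
import socket


def can_connect(host: str, port: int = 9098, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_brokers(bootstrap_servers: str, timeout: float = 5.0) -> dict:
    """Check every host:port entry in a comma-separated bootstrap string."""
    results = {}
    for endpoint in bootstrap_servers.split(","):
        endpoint = endpoint.strip()
        host, _, port = endpoint.rpartition(":")
        results[endpoint] = can_connect(host, int(port), timeout)
    return results


if __name__ == "__main__":
    # Placeholder endpoint; use the Account A broker endpoints on port 9098.
    servers = "b-1.example.kafka.us-east-1.amazonaws.com:9098"
    for endpoint, reachable in check_brokers(servers).items():
        print(endpoint, "reachable" if reachable else "NOT reachable")
```

An unreachable endpoint points to security groups, NACLs, routing, or DNS; a reachable endpoint with a failing connector points to the cluster policy or the connector role instead.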
Clean up
When you’re done testing this solution, clean up any unwanted resources to avoid ongoing charges.
Conclusion
In this post, we demonstrated how to create an MSK connector when you need to use an MSK cluster in one AWS account, but MSK Connect is located in a separate account. This architecture includes an S3 Sink connector for demonstration purposes, but it can accommodate other types of sink and source connectors. Additionally, this architecture focuses solely on IAM authenticated connectors. If an unauthenticated connector is desired, the multi-VPC connectivity (PrivateLink) and cluster policy components can be ignored. The remaining process, which involves creating a network connection between the account VPCs, remains the same.
Try out the solution for yourself, and let us know your questions and feedback in the comments section.
About the Author
Venkata Sai Mahesh Swargam is a Cloud Engineer at AWS in Hyderabad. He specializes in Amazon MSK and Amazon Kinesis services. Mahesh is dedicated to helping customers by providing technical guidance and solving issues related to their Amazon MSK architectures. In his free time, he enjoys being with family and traveling around the world.