Tag Archives: Security Blog

Connect your on-premises Kubernetes cluster to AWS APIs using IAM Roles Anywhere

2025-02-24 Varun Sharma

Post Syndicated from Varun Sharma original https://aws.amazon.com/blogs/security/connect-your-on-premises-kubernetes-cluster-to-aws-apis-using-iam-roles-anywhere/

Many customers want to seamlessly integrate their on-premises Kubernetes workloads with AWS services, implement hybrid workloads, or migrate to AWS. Previously, a common approach involved creating long-term access keys, which posed security risks and is no longer recommended. While solutions such as Kubernetes secrets vault and third-party options exist, they fail to address the underlying issue effectively.

One option to connect your on-premises Kubernetes workloads to AWS APIs is to use the service account issuer discovery feature. This allows the Kubernetes API server to act as an OpenID Connect (OIDC) identity provider and be federated with AWS Identity and Access Management (IAM). However, this approach requires public internet access to the Kubernetes API server, which might not be desirable for some customers.

To help eliminate the need for long-term access keys or exposing the Kubernetes API server to the public internet, AWS has introduced AWS IAM Roles Anywhere. This feature enables secure, seamless integration of on-premises Kubernetes workloads with AWS services, promoting robust security practices and minimizing potential risks associated with long-term credentials or public exposure.

IAM Roles Anywhere enables workloads outside of AWS to access AWS resources by exchanging X.509 bound identities for temporary AWS credentials. With IAM Roles Anywhere, you can use the same IAM roles and policies as your AWS workloads to access AWS resources, promoting consistency.

IAM Roles Anywhere can be combined with a standard public key infrastructure solution. In this blog post, we use AWS Private Certificate Authority, which has several advantages over using a self-signed certificate authority (CA). First, it reduces operational and management overhead, because AWS manages the CA for you. Second, the cryptographic key material can be stored in hardware security modules or at least vaulted, which helps you protect your private CA against key compromises. Additionally, certificates can be short-lived, which aligns with dynamic Kubernetes environments where pod lifetimes are typically shorter than traditional servers.

We also demonstrate how to integrate IAM Roles Anywhere without modifying your existing workload Docker files, and how to automate the X.509 certificate lifecycle with cert-manager and an AWS Private CA backend in short-lived certificate mode. By using these capabilities, you can seamlessly integrate your on-premises Kubernetes workloads with AWS services, promoting robust security practices, minimizing risks associated with long-term credentials, and helping to ensure a streamlined, consistent access management experience.

This post is for customers who run their own Kubernetes cluster outside of AWS without using Amazon EKS Anywhere. If you’re using Amazon Elastic Kubernetes Service (Amazon EKS), use IAM roles for service accounts or Amazon EKS Pod Identity instead.

Background

“Why should I prefer X.509 certificates over IAM access keys?” Access keys are long-term credentials that must be rotated regularly to minimize the risk of unauthorized access. They need to be securely deployed onto servers hosting applications that use them, requiring procedures for secure transfer and deletion of transient copies. As the number of applications and access keys grows, tracking and managing them becomes operationally challenging.

In contrast, X.509 certificates use public key infrastructure (PKI). The private key is generated directly on the application server and doesn’t leave it. Only a certificate signing request, which doesn’t contain secrets, is sent to the CA for signing and returning the certificate. This alleviates the need for securely transmitting secret keys.

However, you can argue that X.509 certificates are also long-lived credentials. This concern is valid, but not necessarily true. As demonstrated by projects such as Let’s Encrypt, it’s possible to reduce certificate lifetimes from years to months by implementing automation for certificate renewal. After such a mechanism is in place, certificate lifetimes can be further limited to days or even hours.

In this post, we introduce mutually authenticated Transport Layer Security (mTLS), which uses certificates for high-assurance bidirectional authentication. Certificates are used to establish trust between the client and server, making sure that both parties are authenticated and authorized to communicate securely. By implementing mTLS, you can achieve a higher level of security and trust in your communication channels, mitigating potential risks associated with unauthorized access or man-in-the-middle attacks. Here, we implement ephemeral certificates that are tied to the lifecycle of pods. When a pod is started, a certificate is automatically created, and it expires after a short period of time unless it’s actively in use by the pod, in which case it’s automatically renewed by the cert-manager. This approach verifies that certificates are only valid for the duration of the pod’s lifetime, minimizing the potential risk associated with long-lived credentials. Additionally, IAM Roles Anywhere supports certificate revocation list (CRL) checks, allowing you to perform explicit revocation of certificates if required. This feature provides an additional layer of security, enabling you to revoke access promptly in case of compromised credentials or other security concerns.

Throughout this post, we assume that you have a basic understanding of IAM Roles Anywhere. For more information you can see this blog post. Furthermore, we assume that you are familiar with Kubernetes, kubectl, Helm, and cert-manager.

Solution overview

This solution assumes that you have an existing Kubernetes cluster running outside of AWS.

Figure 1 shows the high-level architecture of our solution. An on-premises Kubernetes cluster accessing AWS APIs using IAM Roles Anywhere with X.509 certificates issued by AWS Private CA in short-lived-certificate mode.

Figure 1: High level architecture of on-premises Kubernetes accessing AWS APIs

Here’s how the solution works, as shown in Figure 1:

An AWS Private CA in short-lived certificate mode issues X.509 certificates for your pods.
When you set up your AWS Private CA as a trusted source and establish a specific profile, IAM Roles Anywhere will validate and accept authentication requests that use certificates issued by your AWS Private CA.
cert-manager, deployed into your Kubernetes cluster, orchestrates the issuance of AWS Private CA certificates to authorized pods.
Each pod uses IAM Roles Anywhere to create an AWS session using its private key and X.509 certificate obtained from cert-manager.

Let’s explore the different parts of the architecture in more detail.

AWS Private CA short lived credentials

AWS Private CA offers a short-lived certificate, where the validity period is limited to 7 days or fewer. You can see this AWS Blog to learn how to use AWS Private CA short-lived certificates. This new mode can be used to issue certificates for your Kubernetes pods and benefit from lower costs of operations. By synchronizing the certificate lifecycle with the lifecycle of the pod, you can minimize the operational overhead for this solution. To help meet requirements for auditability and transparency, you can use the audit report feature to list the issued certificates in a machine readable format.

IAM Roles Anywhere

Figure 2 shows a detailed overview of the components involved in authentication with IAM Roles Anywhere.

Figure 2: Components of IAM Roles Anywhere

IAM Roles Anywhere allows you to obtain temporary security credentials for workloads that run outside of AWS. Your workloads must use a certificate issued by a trusted PKI CA to authenticate with IAM Roles Anywhere. You establish trust between IAM Roles Anywhere and your CA by creating a trust anchor that points to the root of the CA.

cert-manager

Figure 3 shows a detailed overview of the cert-manager setup used in this post, including the aws-privateca-issuer add-on for the integration of AWS Private CA.

Figure 3: Detailed overview of cert-manager setup

cert-manager is a tool for managing X.509 certificates in Kubernetes. As shown in Figure 3, cert-manager will make sure that certificates are valid and up-to-date and attempt to renew them before they expire. By using add-ons, you can configure different backends for issuing X.509 certificates. In this post, we explore how to integrate cert-manager with AWS Private CA using the aws-privateca-issuer add-on. The aws-privateca-issuer add-on defines two custom resources, AWSPCAIssuer and AWSPCAClusterIssuer, which are used to configure the link to AWS Private CA. They are similar to the Issuer and ClusterIssuer resources that come with cert-manager, but specific to aws-privateca-issuer.

After the AWSPCAIssuer or AWSPCAClusterIssuer is available, aws-privateca-issuer authenticates towards AWS APIs using temporary security credentials obtained from IAM Roles Anywhere. cert-manager watches for the certificate resource, which references to an AWSPCAIssuer, which in turn references to AWS Private CA. aws-privatca-issuer requests a certificate from AWS Private CA. The auto-generated private key and the signed certificate are stored in Kubernetes secrets.

Using certificates and secrets

cert-manager supports multiple ways of integrating into your Kubernetes workloads. You can use certificate resources, which represent a human-readable definition of a certificate signing request (CSR) and contain information on certificate lifespan and renewal time. When using a certificate, the auto-generated private key and the signed certificate are stored in Kubernetes secrets.

With this option, an X.509 certificate is issued manually and saved as a secret. After a PKI is configured as an issuer, a certificate resource is created to automate the renewal of the certificate. With the certificate resource, the lifecycle of certificates is decoupled from the lifecycle of the pods that use them. This allows you to bootstrap the X.509 certificate even before the trusted PKI is deployed.

Using the CSI driver

Another way of integrating cert-manager is by using a CSI driver. In this case, the certificate lifecycle is bound to the lifecycle of the pod. An X.509 certificate and private key are mounted into a predefined folder where your workloads can read them. On pod creation, cert-manager automatically creates a private key and requests a certificate for the configured trusted PKI. When the pod is deleted, the private key and certificate are also deleted and become invalid because they aren’t renewed by cert-manager.

In this post, we use the CSI driver approach for workloads to create ephemeral certificates for IAM Roles Anywhere.

Workload configuration

Figure 4 shows a detailed view of how pods can be configured to use IAM Roles Anywhere without needing to change the underlying Docker images by using a sidecar that provides an IMDSv2 endpoint that mimics the behavior in the Amazon Elastic Compute Cloud (Amazon EC2) instance metadata endpoint.

Figure 4: Pod configuration using a sidecar

As shown in Figure 4, when using a certificate resource, the auto-generated private key and the signed certificate are stored in Kubernetes secrets and mounted into the pod. When using the CSI driver, a private key is generated locally (for the pod), a certificate is requested from cert-manager based on the given attributes and is issued by AWSPCAIssuer, and the certificates are mounted directly into the pod with no intermediate secret being created.

IAM Roles Anywhere uses the CreateSession API to authenticate requests with a SigV4a signature using the private key and its associated X.509 certificate. This exchange provides a IAM role session credential, as if you had assumed the IAM role. The aws_signing_helper binary is provided to call the CreateSession API from the command line. In this post, a sidecar container that provides an IMDSv2 endpoint to the workload container is used. This container uses the aws_signing_helper binary and uses its serve command.

This way, applications using AWS SDKs can use the AWS_EC2_METADATA_SERVICE_ENDPOINT environment variable to set the instance metadata endpoint to the correct port on the localhost interface. The X.509 certificate and private key are provided as files to the sidecar container.

Solution deployment

In this section, we show the steps needed to deploy the solution in your AWS account.

Prerequisites

To deploy the solution in this post, make sure that you have the following in place:

AWS Command Line Interface (AWS CLI) v2
An AWS account and IAM permissions for IAM, IAM Roles Anywhere, and AWS Private CA
Latest stable Kubernetes
kubectl (matching your Kubernetes version)
Helm 3
jq

Note: As an alternative to using the AWS CLI, you can use the AWS Controllers for Kubernetes (ACK) service controller for AWS Private CA for creating and managing CertificateAuthority, Certificate, and CertificateAuthorityActivation resources directly within your Kubernetes cluster. After establishing your CA hierarchy using the ACK controller, you can proceed with the subsequent steps involving IAM Roles Anywhere integration, aws-privateca-issuer, and cert-manager as described in this post.

Step 1 – AWS Private CA

Set up a root CA in AWS Private CA, which will issue short lived certificates for your pods. In this example you use only one CA; for production environments, you should check the considerations for designing CA hierarchies. Start by using the AWS CLI to create a configuration.

cat <<EOF > ca-config.json
{
   "KeyAlgorithm":"RSA_2048",
   "SigningAlgorithm":"SHA256WITHRSA",
   "Subject":{
      "Country":"DE",
      "Organization":"Example Corp",
      "OrganizationalUnit":"SREs",
      "State":"HE",
      "Locality":"FRANKFURT",
      "CommonName":"Blogpost CA"
   }
}
EOF

Create the CA in AWS Private CA with short-lived certificates mode.

aws acm-pca create-certificate-authority \
  --certificate-authority-configuration file://ca-config.json \
  --certificate-authority-type "ROOT" \
  --usage-mode SHORT_LIVED_CERTIFICATE

The command will return a CertificateAuthorityArn, which you will need for further commands, so export it for later use. Replace <region> with your AWS Region.
```
export PCA_ARN=arn:aws:acm-pca:<region>:012345678912:certificate-authority/8213159d-cad0-481c-bf14-a0ced4d6d479
```

After creating the root CA, the CA is in a pending state. You need to create a CSR.

aws acm-pca get-certificate-authority-csr \
     --certificate-authority-arn ${PCA_ARN} \
     --output text > ca.csr

Now, the CSR needs to be signed by the root CA.

aws acm-pca issue-certificate \
     --certificate-authority-arn ${PCA_ARN} \
     --csr fileb://ca.csr \
     --signing-algorithm SHA256WITHRSA \
     --template-arn arn:aws:acm-pca:::template/RootCACertificate/V1 \
     --validity Value=365,Type=DAYS

This command returns a CertificateArn which you will need later. Export it.

export ROOT_CA_CERTIFICATE_ARN=arn:aws:acm-pca:<region>:012345678912:certificate-authority/8213159d-cad0-481c-bf14-a0ced4d6d479/certificate/5830e475088eee553bd409b7f4964613

Download the root CA certificate and upload it to your AWS Private CA.

aws acm-pca get-certificate \
    --certificate-authority-arn ${PCA_ARN} \
    --certificate-arn ${ROOT_CA_CERTIFICATE_ARN} \
    --output text > cert.pem

aws acm-pca import-certificate-authority-certificate \
     --certificate-authority-arn ${PCA_ARN} \
     --certificate fileb://cert.pem

Verify the status of the PCA, it should be ACTIVE.

aws acm-pca describe-certificate-authority \
    --certificate-authority-arn ${PCA_ARN} \
    --output json

Step 2 – IAM Roles Anywhere

At this point your root CA is set up and ready to use. The next step is to configure IAM Roles Anywhere.

Start by defining a trust anchor that will refer to your newly created AWS Private CA and export the trustAnchorArn. Replace <value-of-trustAnchorArn> with the Amazon Resource Name (ARN) value of your IAM Roles Anywhere trust anchor.
```
aws rolesanywhere create-trust-anchor \
--name onprem-k8s-issuer \
--enabled \
--source sourceType=AWS_ACM_PCA,sourceData={acmPcaArn=${PCA_ARN}}

export TRUST_ANCHOR_ARN=<value-of-trustAnchorArn>
```

Create an IAM role to be used by the aws-privateca-issuer cert-manager plugin. This role needs to include the actions sts:AssumeRole, sts:SetSourceIdentity and sts:TagSession, which are required by IAMRA. Replace <TA_ID> with your trust anchor.

Note: You should specify a PrincipalTag with the CN. Furthermore, it should be scoped to the IAMRA service principal. This further restricts authorization based on attributes that are extracted from the X.509 certificate and provides an additional layer of security by helping to ensure that even if an unauthorized party gains access to a valid certificate, they cannot assume the role unless the certificate’s CN matches the specified value.

cat <<EOF > trust-policy.json
{
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "Statement1",
        "Effect": "Allow",
        "Principal": {
            "Service": "rolesanywhere.amazonaws.com"
        },
        "Action": [
            "sts:AssumeRole",
            "sts:SetSourceIdentity",
            "sts:TagSession"
        ],
        "Condition": {
            "StringEquals": {
                "aws:PrincipalTag/x509Subject/CN": "iamra-issuer"
            },
            "ArnEquals": {
                "aws:SourceArn": [
                    "arn:aws:rolesanywhere:<region>:012345678912:trust-anchor/<TA_ID>"
                ]
            }

        }
    }]
}
EOF

Use the following to create the iamra-issuer role:

aws iam create-role --role-name iamra-issuer \
  --assume-role-policy-document file://trust-policy.json

The command will return a JSON document containing information about the newly created role. Export the ARN for later use.
```
export IAMRA_ISSUER_ROLE=arn:aws:iam::012345678912:role/iamra-issuer
```

Attach an inline policy that allows the role request certificates from your PCA and retrieve these. Note that there is a condition limiting the AWS Private CA templates to only allow EndEntityCertificate.

cat <<EOF > inline-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "awspcaissuerread",
      "Action": [
        "acm-pca:DescribeCertificateAuthority",
        "acm-pca:GetCertificate"
      ],
      "Effect": "Allow",
      "Resource": "$PCA_ARN"
    },
    {
      "Sid": "awspcaissuerwrite",
      "Action": [
        "acm-pca:IssueCertificate"
      ],
      "Effect": "Allow",
      "Resource": "$PCA_ARN",
      "Condition":{
        "StringEquals":{
          "acm-pca:TemplateArn":"arn:aws:acm-pca:::template/EndEntityCertificate/V1"
        }
      }
    }
  ]
}
EOF

Use the following to associate the inline policy (created in the preceding step) with the iamra-issuer role.

aws iam put-role-policy --role-name iamra-issuer \
  --policy-name iamra-issuer \
  --policy-document file://inline-policy.json

To finish, create a profile that defines which IAM roles can be assumed and then export the returned ARN.

aws rolesanywhere create-profile --name iamra-issuer \
  --role-arns ${IAMRA_ISSUER_ROLE} \
  --enabled

Export the returned ARN:

export IAMRA_PROFILE_ARN=arn:aws:rolesanywhere:<region>:012345678912:profile/<Profile_ID>

The created role iamra-issuer will only be used by the aws-privateca-issuer to integrate with AWS Private CA. You should repeat the process of creating IAM roles and IAMRA profiles for your workloads. it’s recommended to create a separate IAM role for each workload and limit its use with condition statements in the trust policy, checking for the workload identity and trust anchor (for example, matching the common name). Furthermore, it’s important that you add IAMRA to the trust policy and allow the aforementioned actions. Best practice with IAM roles is to apply least-privilege permissions.

Step 3 – Create the init container

To integrate IAM Roles Anywhere within your Kubernetes environment, you need to provide an IMDSv2 endpoint to your application containers by running the aws_signing_helper binary as a sidecar. You also need to configure your applications using an environment variable to use the new instance metadata endpoint. To do so, build a Docker image that works as a sidecar.

In this step, create a basic image that fulfills the preceding requirements. In your environment, you might want to adapt this example to use your own base image and implement your image hardening processes.

Copy the following script and save it as init.sh.

#!/bin/sh

if [[ -z "$TRUST_ANCHOR_ARN" ]]; then
  echo "Must provide TRUST_ANCHOR_ARN environment variable." 1>&2
  exit 1
fi

if [[ -z "$PROFILE_ARN" ]]; then
  echo "Must provide PROFILE_ARN environment variable." 1>&2
  exit 1
fi

if [[ -z "$ROLE_ARN" ]]; then
  echo "Must provide ROLE_ARN environment variable." 1>&2
  exit 1
fi

echo "starting IMDSv2 endpoint with aws_signing_helper ..."
/aws_signing_helper serve \
  --certificate /iamra/tls.crt         \
  --private-key /iamra/tls.key         \
  --trust-anchor-arn $TRUST_ANCHOR_ARN \
  --profile-arn $PROFILE_ARN           \
  --role-arn $ROLE_ARN

This script is the entry point of the sidecar container. It expects the environment variables TRUST_ANCHOR_ARN, PROFILE_ARN, and ROLE_ARN, which are required by aws_signing_helper. It also expects an X.509 certificate and its private key in the folder /iamra, which will be mounted in a later stage during pod initialization. Finally, it invokes the aws_signing_helper with the serve directive which creates an IMDSv2 endpoint listening on 9911 by default. This can be customized using the --port parameter.

Now let’s inspect the Docker file.

Note: At the time of writing, we used the alpine3.17.0 image. Use a hardened base image that’s designed to be secure and aligns with the requirements of your environment.

FROM alpine:3.17.0

COPY init.sh .
RUN apk add --no-cache libc6-compat libgcc wget
RUN wget https://rolesanywhere.amazonaws.com/releases/1.3.0/X86_64/Linux/aws_signing_helper
RUN chmod +x /aws_signing_helper /init.sh 
RUN ln -s /lib/libc.musl-x86_64.so.1 /lib/libresolv.so.2
ENTRYPOINT ["/bin/sh", "-c", "/init.sh"]

This Docker file copies the init.sh and downloads the aws_signing_helper binary. The init.sh script is defined as an entry point to the container. Dynamic libraries required by aws_signing_helper are installed using Alpine Linux package manager (Apk).

Now build the docker image, sign in to it, and push it for later use. For the following commands replace <my-docker-registry> with the hostname of your local registry or use an ECR Repository.

docker build . -t <my-docker-registry>/iamra-sidecar
docker login <my-docker-registry>
docker push <my-docker-registry>/iamra-sidecar

Step 4 – Install cert-manager

In this step, install cert-manager into your cluster and configure aws-privateca-issuer using a manually bootstrapped certificate. cert-manager-approver-policy is used to control which certificates can be requested by the workloads. Then, set up the cert-manager CSI driver to automatically provision X.509 certificates for your workload pods.

Start with the cert-manager setup:

Add the cert-manager repository to Helm and install the chart.

Note: At the time of writing, we used cert-manager version 1.16.2. Check for the latest stable version.

helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install \
  cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.16.2 \
  --set installCRDs=true \
  --set extraArgs={--controllers='*\,-certificaterequests-approver'}
  
helm install \
  cert-manager-approver-policy jetstack/cert-manager-approver-policy \
  --namespace cert-manager \
  --wait \
    --set app.approveSignerNames="{\
issuers.cert-manager.io/*,clusterissuers.cert-manager.io/*,\
awspcaclusterissuers.awspca.cert-manager.io/*,awspcaissuers.awspca.cert-manager.io/*\
}"


#make modifications in cert-manager-approver-policy and add below permissions

kubectl edit  Clusterrole cert-manager-approver-policy -n cert-manager -o yaml

- apiGroups:
  - awspca.cert-manager.io
  resources:
  - awspcaissuers
  - awspcaclusterissuers
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - cert-manager.io
  - awspca.cert-manager.io
  resources:
  - signers
  verbs:
  - approve

Now, install the cert-manager aws-privateca-issuer plugin. This integration connects cert-manager with AWS Private CA and lets you issue short-lived certificates automatically. Currently, aws-privateca-issuer Helm chart doesn’t support IAMRA natively. So, you’re going to use the same init-container to set up IAMRA as for the workload pods.

You need to issue the first X.509 certificate for aws-privateca-issuer IAMRA manually. Later, cert-manager will renew it automatically.

Create the bootstrap certificate. When asked for a common name, enter iamra-issuer.
```
openssl req -out iamra.csr -new -newkey rsa:2048 \
-nodes -keyout iamra.key
```
The previous command will create an RSA private key named iamra.key and a certificate signing request name iamra.csr. Now you need to call AWS Private CA to issue the bootstrap certificate.
Set the validity period of the certificate to 1 day so that cert-manager will replace it after it’s set up. The IAM role that’s performing this action must have permissions to AWS Certificate Manager (ACM), IAM, and IAM Roles Anywhere to complete the setup.
```
aws acm-pca issue-certificate \
      --certificate-authority-arn ${PCA_ARN} \
      --csr fileb://iamra.csr \
      --signing-algorithm "SHA256WITHRSA" \
      --validity Value=1,Type="DAYS"
```

The command will return a CertificateArn for your iamra-issuer certificate. Export it and save the certificate to a file.

export IAMRA_ISSUER_CERT_ARN=arn:aws:acm-pca:<region>:012345678912:certificate-authority/8213159d-cad0-481c-bf14-a0ced4d6d479/certificate/afc47911ed2ded9c2664fa597a33b9fb
aws acm-pca get-certificate \
      --certificate-authority-arn ${PCA_ARN} \
      --certificate-arn ${IAMRA_ISSUER_CERT_ARN} | \
      jq -r .'Certificate' > iamra-cert.pem

Create a Kubernetes secret that contains the certificate and private key.
```
kubectl create secret tls -n cert-manager iamra-issuer \
  --cert=iamra-cert.pem \
  --key=iamra.key
```
You’re ready to install the aws-privateca-issuer. You need to modify the Helm chart because it doesn’t currently support IAMRA. You will render the Helm chart into YAML manifests, which are then adapted for IAMRA.

Install the Helm repository and render the charts into a file.

helm repo add awspca https://cert-manager.github.io/aws-privateca-issuer
 helm template --release-name iamra --include-crds awspca/aws-privateca-issuer \
   -n cert-manager > privateca-issuer.yaml

Add your previously built image as a sidecar and replace the environment variables with your exported values. Search for the deployment definition and add the following section:

# Source: aws-privateca-issuer/templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: iamra-aws-privateca-issuer
  namespace: cert-manager
  labels:
    helm.sh/chart: aws-privateca-issuer-v1.4.0
    app.kubernetes.io/name: aws-privateca-issuer
    app.kubernetes.io/instance: iamra
    app.kubernetes.io/version: "v1.4.0"
    app.kubernetes.io/managed-by: Helm
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/name: aws-privateca-issuer
      app.kubernetes.io/instance: iamra
  template:
    metadata:
      labels:
        app.kubernetes.io/name: aws-privateca-issuer
        app.kubernetes.io/instance: iamra
    spec:
      serviceAccountName: iamra-aws-privateca-issuer
      securityContext:
        runAsUser: 65532
      volumes:
        - name: "iamra-secret"
          projected:
            sources:
              - secret:
                  name: iamra-issuer
      containers:
        - name: iamra-sidecar
          image: 012345678912.dkr.ecr.us-east-2.amazonaws.com/<replace-with-iamra-sidecar-repository>
          imagePullPolicy: Always
          env:
            - name: "TRUST_ANCHOR_ARN"
              value: "arn:aws:rolesanywhere:us-east-2:012345678912:trust-anchor/05d183f8-a34e-4f0c-ad2a-de6f803"
            - name: "PROFILE_ARN"
              value: "arn:aws:rolesanywhere:us-east-2:012345678912:profile/7b45f9a9-73fa-47f8-a20f-88aacbf57"
            - name: "ROLE_ARN"
              value: "arn:aws:iam::012345678912:role/iamra-issuer"
          volumeMounts:
            - name: iamra-secret
              mountPath: "/iamra"
              readOnly: true
        - name: aws-privateca-issuer
          securityContext:
            allowPrivilegeEscalation: false
          image: "public.ecr.aws/k1n1h4h4/cert-manager-aws-privateca-issuer:latest"
          env:
           - name: "AWS_EC2_METADATA_SERVICE_ENDPOINT"
             value: "http://localhost:9911/"
          imagePullPolicy: IfNotPresent
          command:
            - /manager
          args:
            - --leader-elect
          ports:
            - containerPort: 8080
              name: http
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8081
            initialDelaySeconds: 15
            periodSeconds: 20
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8081
            initialDelaySeconds: 5
            periodSeconds: 10
      terminationGracePeriodSeconds: 10

Apply your modified manifest to install aws-privateca-issuer and verify the deployment you have modified. It should show that one pod is ready and available.

kubectl apply -f privateca-issuer.yaml

kubectl get deployment -n cert-manager -l app.kubernetes.io/name=aws-privateca-issuer

NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
iamra-aws-privateca-issuer   1/1     1            1           4d10h

Define an AWSPCAIssuer, which will be used for renewal of the manually bootstrapped certificate for the aws-privateca-issuer add-on.

Note: At the time of writing, we used awspca cert-manager API version v1beta1. Check for the latest stable version.
```
export AWS_REGION=<region>
cat <<EOF | kubectl apply -f -
apiVersion: awspca.cert-manager.io/v1beta1
kind: AWSPCAIssuer
metadata:
  name: iamra-cm-issuer
  namespace: cert-manager
spec:
  arn: ${PCA_ARN}
  region: ${AWS_REGION}
EOF
```

After at least one AWSPCAIssuer or AWSPCAClusterIssuer is available, aws-privateca-issuer is going to authenticate towards AWS APIs by calling sts.get-caller-identity and verify the authentication method. You can verify this using its log files. It should print the assumed role.

kubectl logs -n cert-manager -l app.kubernetes.io/name=aws-privateca-issuer -c aws-privateca-issuer | grep -i getcalleridentity

Defaulted container "aws-privateca-issuer" out of: aws-privateca-issuer, iamra-init (init)
{"level":"info","ts":1669240040.2704494,"logger":"controllers.GenericIssuer","msg":"sts.GetCallerIdentity","genericissuer":"cert-manager/iamra-cm-issuer","arn":"arn:aws:sts::012345678912:assumed-role/iamra-issuer/5bafffcfb691969f0616a9b1a68032ec","account":"012345678912","user_id":"AROA2EIPPI5BVJ6SKBYOY:5bafffcfb691969f0616a9b1a68032ec"}

Now, you can create a cert-manager Certificate resource that represents a desired certificate that should be issued by the referenced cert-manager Issuer. It combines information of a CSR with details on the validity period and renewal.

Create the certificate object:

cat <<EOF | kubectl apply -f - 
  apiVersion: cert-manager.io/v1
  kind: Certificate
  metadata:
    name: iamra-privateca-issuer-cert
    namespace: cert-manager
  spec:
    secretName: iamra-issuer
    duration: 168h # 7d
    renewBefore: 24h # 15d
    subject:
      organizations:
        - "Example Corp."
      organizationalUnits:
        - "Admin"
    commonName: "iamra-issuer"
    isCA: false
    usages:
      - "client auth"
      - "server auth"
    issuerRef:
      group: awspca.cert-manager.io
      kind: AWSPCAIssuer
      name: iamra-cm-issuer
  EOF
  helm upgrade -i -n cert-manager cert-manager-csi-driver jetstack/cert-manager-csi-driver --wait
  -- > install policies:
  policy + role + role binding to allow service account to accept certs.
  cat <<EOF | kubectl apply -f - 
  apiVersion: policy.cert-manager.io/v1alpha1
  kind: CertificateRequestPolicy
  metadata:
    name: iamra-issuer-policy
  spec:
    allowed:
      commonName:
        required: true
        value: "iamra-issuer"
      subject:
        organizations:
          values: ["Example Corp."]
          required: true
        organizationalUnits:
          values: ["Admin"]
          required: true
      usages:
      - "server auth"
      - "client auth"
    selector:
      issuerRef:
        group: awspca.cert-manager.io
        kind: AWSPCAIssuer
        name: iamra-cm-issuer
  ---
  apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRole
  metadata:
    name: cert-manager-policy:iamra-issuer-policy
  rules:
    - apiGroups: ["policy.cert-manager.io"]
      resources: ["certificaterequestpolicies"]
      verbs: ["use"]
      resourceNames: ["iamra-issuer-policy"]
  ---
  apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRoleBinding
  metadata:
    name: cert-manager-policy:iamra-issuer-policy
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: cert-manager-policy:iamra-issuer-policy
  subjects:
  - kind: ServiceAccount
    name: cert-manager
    namespace: cert-manager
  EOF

Step 5 – Deploy your workload

In Step 4, sub-step 9, you created an AWSPCAIssuer named iamra-cm-issuer. You then used this AWSPCAIssuer to renew the manually bootstrapped certificate for the aws-privateca-issuer.

In Step 4, sub-step 11, you created the certificate iamra-privateca-issuer-cert, which is used by the aws-privateca-issuer.

In this step, you will deploy the sample workload. When deploying the sample workload, make sure to repeat the process of creating IAM roles and IAMRA profiles (from Step 2), the AWSPCAIssuer (Step 4, sub-step 9), and the CertificateRequestPolicy (Step 4, sub-step 11) for the certificate request.

For more information on certificate request policies, see the cert-manager documentation on approval policies.

Use the following code to deploy the workload.

cat <<EOF | kubectl apply -f -
  
apiVersion: v1
kind: Pod
metadata:
   creationTimestamp: null
   labels:
     run: acmpca-csi-test
   name: acmpca-csi-test
spec:
  containers:
      - name: iamra-sidecar
        image: 056930860237.dkr.ecr.us-east-2.amazonaws.com/aws_sighning:latest
        imagePullPolicy: Always
        env:
          - name: "TRUST_ANCHOR_ARN"
            value: "arn:aws:rolesanywhere:us-east-2:012345678912:trust-anchor/05d183f8-a34e-4f0c-ad2a-de6f803ac172"
          - name: "PROFILE_ARN"
            value: "arn:aws:rolesanywhere:us-east-2:012345678912:profile/7b45f9a9-73fa-47f8-a20f-88aacbf579d2"
          - name: "ROLE_ARN"
            value: "arn:aws:iam::012345678912:role/iam-roles-anywhere-s3-full-access"
        volumeMounts:
          - name: "iamra-csi"
            mountPath: "/iamra"
            readOnly: true
      - name: aws-cli
        image: amazon/aws-cli:latest
        env:
        - name: "AWS_EC2_METADATA_SERVICE_ENDPOINT"
          value: "http://127.0.0.1:9911/"
        command:
          - sleep
          - "3600"
  dnsPolicy: ClusterFirst
  restartPolicy: Never
  volumes:
    - name: "iamra-csi"
      csi:
        readOnly: true
        driver: csi.cert-manager.io
        volumeAttributes:
            csi.cert-manager.io/issuer-name: my-pca
            csi.cert-manager.io/issuer-group: awspca.cert-manager.io
            csi.cert-manager.io/issuer-kind: AWSPCAIssuer
            csi.cert-manager.io/common-name: "${SERVICE_ACCOUNT_NAME}.${POD_NAMESPACE}"
            csi.cert-manager.io/duration: 168h
            csi.cert-manager.io/renew-before: 24h
            csi.cert-manager.io/is-ca: "false"
            csi.cert-manager.io/key-usages: "client auth, server auth"
  EOF

Step 6 – Test your deployment

To test the deployment, you can use kubectl exec to access the iamra-sidecar container. Navigate to the iamra directory and check if the certificate and key are mounted.

Command:
kubectl exec -it acmpca-csi-test – sh ls | grep iamra

Output: iamra

Command:
cd iamra /iamra# ls

Output: ca.crt tls.crt tls.key

You can also exec into the aws-cli container and verify the caller identity and make API calls to Amazon Simple Storage Service (Amazon S3):

Command:
kubectl exec -it acmpca-csi-test -c aws-cli – sh $aws sts get-caller-identity

Output: You should see iam-roles-anywhere-s3-full-access in caller-identity.

Command:
$aws s3 ls

Output: You should be able to list the S3 bucket based on the permissions associated with the assumed role.

Summary

In this post, you learned about a solution for securely connecting on-premises Kubernetes workloads to AWS services using IAM Roles Anywhere. The approach alleviates the need for long-term access keys or public internet exposure of the Kubernetes API server. By using this solution for containerized and full stack applications, you can benefit from:

Enhanced security: Use short-lived X.509 certificates instead of long-term credentials.
Simplified management: Automate the certificate lifecycle with cert-manager and AWS Private CA.
Seamless integration: No modifications are required to existing workload Docker files.
Consistent policies: Use the same IAM roles and policies across AWS and on premises.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

From log analysis to rule creation: How AWS Network Firewall automates domain-based security for outbound traffic

2025-02-21 Mary Kay Sondecker

Post Syndicated from Mary Kay Sondecker original https://aws.amazon.com/blogs/security/from-log-analysis-to-rule-creation-how-aws-network-firewall-automates-domain-based-security-for-outbound-traffic/

When it comes to controlling incoming (ingress) and outgoing (egress) network traffic, organizations typically focus heavily on inbound traffic controls—carefully restricting what traffic can enter their network perimeter. However, this approach addresses only inbound security challenges. Modern applications rely heavily on third-party code through operating systems, libraries, and packages. This dependency can create potential security vulnerabilities. If these components are compromised, affected workloads might attempt to connect to unauthorized command and control servers or send sensitive data to unauthorized destinations on the internet.

This is why implementing strong outbound traffic controls—particularly through domain-based allowlisting—has become a critical security best practice. Rather than allowing unrestricted outbound access or maintaining an ever-growing denylist of low-reputation domains, many organizations are shifting to domain-based allowlisting. This approach restricts outbound communications to explicitly trusted domains, reduces potential risk surfaces, and helps to protect against both known and unknown threats. However, manually identifying and maintaining these allowlists has traditionally been a complex and time-consuming process.

AWS Network Firewall automated domain lists improve visibility into network traffic patterns and simplify outbound traffic control management. This feature provides analytics for HTTP and HTTPS network traffic, helping organizations understand domain usage patterns. It also automates firewall log analysis to create rules based on your network traffic. By combining increased visibility with automation, this feature enhances your security awareness and helps to improve the effectiveness of your firewall rules.

In this blog post, we’ll guide you through the implementation of the AWS Network Firewall automated domain list feature, providing a detailed overview, step-by-step instructions, and best practices to optimize your network security.

Overview of automated domain lists and traffic insights

Domain-based security allows you to control network traffic based on the domain names that your applications and users are trying to access. This approach offers a more intuitive and flexible way to create firewall rules, focusing on the destinations your network is trying to reach rather than just IP addresses. However, effectively configuring and managing firewall rules remains challenging for some customers, especially in large environments where connected devices, applications, and traffic patterns are continuously growing and changing. Organizations might struggle to keep up with these changes, leading to outdated or ineffective firewall rules and policies that are either too permissive, exposing the network to risks, or too restrictive, blocking legitimate traffic.

Let’s explore how automated domain lists address these challenges through various use cases and benefits:

Preventive and detective security controls

Domain control through allowlisting – Establishing domain allowlists aligns with the security principle of least privilege for network traffic. A least-privilege model adjusts the scope of what a workload can do across the network, from infinite and undefined to scoped-down and well-defined, enabling better insight into potentially risky behaviors. By limiting outbound connections to only approved domains, organizations can more effectively control and monitor workload communications.
Rule audit and compliance – Domain allowlisting makes it clear which domains are allowed, supporting alignment with standards like the Payment Card Industry Data Security Standard (PCI DSS), Health Insurance Portability and Accountability Act (HIPAA), Cybersecurity Maturity Model Certification (CMMC), and General Data Protection Regulation (GDPR).
Preventive controls enable detection – Preventive controls also act as detective controls, establishing a baseline for normal domain access patterns. With a domain allowlist in place, security teams can better detect workloads that show signs of unauthorized activity.
Incident response support – Domain reporting provides the latest list of workload domains accessed, enabling quick identification of potentially malicious domains during security incidents. This information helps teams prioritize which workloads may need immediate attention.

Operational value

Initial firewall setup and management – Automated allowlisting involves analyzing existing traffic patterns and recommending domain-based rules, which simplifies the process of establishing baseline firewall rules. This helps organizations quickly deploy effective security policies, potentially reducing the time and expertise needed for initial firewall configuration and ongoing management.
Application modernization – Allowlisting supports adjusting firewall rules to accommodate rapidly changing traffic patterns in microservices and containerized environments, helping security to keep pace with evolving architectures.
Cross-environment consistency – Allowlisting enables consistent firewall rule creation and management across multi-cloud and hybrid environments, regardless of where applications or data reside.

How the automated domain list feature works

Automated domain lists work by analyzing your HTTP and HTTPS traffic, generating reports on frequently accessed domains, and providing a convenient way to create rules based on actual network traffic patterns. To begin using automated domain lists in AWS Network Firewall, sign in to the AWS Management Console, access the Network Firewall service, and either work with an existing firewall or create a new one. Then follow the rest of the steps in this post.

Step 1: Enable traffic analysis mode to capture HTTP and HTTPS traffic domain logs

After you’ve selected a firewall, in the left navigation pane, choose Configure advanced settings. Select the Enable traffic analysis mode checkbox to enable it, as shown in Figure 1. Network Firewall uses this logging mode to collect data on observed domains for HTTP and HTTPS traffic to create domain reports.

Figure 1: Enabling traffic analysis mode for a firewall

To stop collecting data on frequently accessed domains in your network traffic, clear the checkbox to disable traffic analysis mode, as shown in Figure 2. Note that if you disable traffic analysis mode, you won’t be able to generate domain reports.

Figure 2: Disabling traffic analysis mode

Once traffic analysis mode is enabled, you’re ready to generate a domain report based on observed network traffic. Next, you can go to the Monitoring and observability tab and choose Create report.

Figure 3: Traffic analysis mode enabled: Now you’re ready to generate domain-based reports

Step 2: Create a domain report

The domain report summarizes the HTTP and HTTPS traffic observed by your firewall for up to 30 days (or for the duration since firewall activation if less than 30 days). Select the checkbox next to each traffic analysis type you want to include in the report—HTTP, HTTPS, or both.

Important: Use your monthly domain report to examine 30 days of traffic behavior. Each report type (HTTP, HTTPS) is available once every 30 days at no additional cost.

Figure 4: Create a domain report that includes traffic analysis types HTTP, HTTPS, or both

To see the status of your domain report, go to the Reports section in the console for your specific firewall. When the report is ready, you can review the report directly in the console or download it, as shown in Figure 5.

Figure 5: The list of domain reports in the Reports section of the console for your specific firewall

Step 3: Review the report details

The report details include the traffic type (HTTP or HTTPS) and the observation period (start and end dates). By default, the report covers the last 30 days, or the entire period since traffic analysis was enabled if that is less than 30 days. The report also shows these details:

The Domain list shows domains that are a fully qualified domain name (FQDN) observed in the network traffic, such as aws.com or subdomain.aws.com.
The Access attempt count refers to the overall count of connection requests to the domain, including both successful and failed attempts.
The Unique sources field shows the number of distinct source IP addresses connected to the domain, indicating its popularity. For example, if one workload connects to aws.com, then count = 1; if 1000 workloads connect to aws.com, then count = 1,000.
The First accessed field shows when the domain was first seen in your traffic, while Last accessed shows when it was most recently seen. This includes both successful and failed attempts to access the domain.
The Protocol field indicates how the domain was observed—through either HTTP or HTTPS traffic (in other words, HTTP headers or a TLS handshake).

An example report is shown in Figure 6.

Figure 6: Example domain report details: 30-day analysis

Step 4: (Optional) Create a domain list rule group

You can copy the list of observed domains from the report to a stateful domain list rule group and update your firewall policy. To do so, in the Report details section, choose Create domain list group to use the firewall policy wizard to create or update your firewall rules. The selected domains are automatically copied to a domain list rule group, as shown in Figure 7. For detailed instructions, see the AWS Network Firewall documentation.

Figure 7: Option to copy over the observed domain lists and create a domain list rule group using the firewall policy wizard

Best practices for implementing domain allowlists

When you implement domain allowlisting, consider the following guidelines for operational success. We recommend that you also consult your own internal compliance and security policies.

Start with a strategy of generous allowlisting:
- Begin with broader and more generous allowlist rules rather than a more refined list, initially, to reduce the risk of accidently blocking legitimate domains.
- Focus on getting to a Default Deny policy so that you can benefit from its risk surface reduction.
- Create flexible rules for trusted domains, including second-level domains and top-level domains, such as allowing access to subdomains under your registered second-level domain. Or allow access to second-level domains under top-level domains that your organization trusts—for example, .mil, .gov, or .edu.
- Use custom Suricata rules with regex capabilities to handle complex traffic efficiently. See Examples of stateful rules for Network Firewall.
- Remember that even a broad allowlist provides better security than having no allowlist at all.
Make iterative improvements:
- After you establish an initial generous allowlist and Default Deny rules, evaluate the rules to determine which areas you might want to start narrowing down further. Use alert rules before pass rules in order to log the specific domains a pass rule might be allowing access to.
- Adjust logging levels based on domain trust levels and monitoring requirements.
- Review and update rules based on operational insights and changing requirements.
- Take a pragmatic and iterative approach to rule refinement rather than attempting to make the ruleset very strict.
Set up robust logging:
- Enable Network Firewall alert logs to maintain visibility into traffic patterns.
- Use tools like Amazon CloudWatch Logs Contributor Insights for log analysis.
- Consider setting up proactive alerts for denied domains accessed by critical workloads.
- Monitor logs to identify potential allowlist additions or changes.
Additional considerations:
- After you enable traffic analysis mode, the automated domain lists feature provides visibility into your network traffic, reporting on observed connections. Although it doesn’t distinguish between allowed and blocked traffic, the domain list report can help you identify the most critical domains to include in your firewall rules.
- The domain traffic data used to generate the list of domain recommendations is available for up to the last 30 days after traffic analysis has been enabled. This allows you to focus on the most relevant and recent network activity when optimizing your firewall policies.
- Data collection for automated domain lists is opt-in and performed independently of the firewall policy and logging configuration. Enabling the feature doesn’t impact the performance of the firewall itself.

Conclusion

With AWS Network Firewall automated domain lists, you can simplify your firewall management process, create more effective rules based on actual traffic patterns, and maintain a strong security posture with less manual effort. This feature helps you address common challenges such as keeping up with rapidly changing application landscapes, managing security across complex environments, and adhering to compliance requirements. To learn more about Network Firewall and its features, see the product page and service documentation.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Network Firewall re:Post forum or contact AWS Support.

How to restrict Amazon S3 bucket access to a specific IAM role

2025-02-14 Chris Craig

Post Syndicated from Chris Craig original https://aws.amazon.com/blogs/security/how-to-restrict-amazon-s3-bucket-access-to-a-specific-iam-role/

February 14, 2025: This post was updated with the recommendation to restrict S3 bucket access to an IAM role by using the aws:PrincipalArn condition key instead of the aws:userid condition key.

April 2, 2021: In the section “Granting cross-account bucket access to a specific IAM role,” we updated the second policy to fix an error.

July 11, 2016: This post was first published.

Customers often ask how to limit access to an Amazon Simple Storage Service (Amazon S3) bucket to only a specific AWS Identity and Access Management (IAM) user or role. A popular approach has been to use the Principal element to list the users or roles who need access to the bucket. However, the Principal element needs the exact values of the user ARN, role ARN, or assumed-role ARN. It does not support using a wildcard (*) to include all role sessions, nor does it allow you to use policy variables.

In this blog post, we show how to restrict S3 bucket access to a specific IAM role or user within an account by using the Conditions element. Even if another user in the same account has an Admin policy or a policy with s3:*, they will be denied access if they are not explicitly listed in the Conditions element. You can use this approach, for example, to limit access to a bucket with sensitive content or additional security requirements.

Solution overview

The solution in this post uses a bucket policy to restrict access to an S3 bucket, even if an entity has access to the full API of S3 through an attached identity-based policy. The following diagram illustrates how this works for accessing an S3 bucket within the same account as your IAM user or IAM role. We recommend that you use IAM roles, and only use IAM users for use cases that aren’t supported by federated users.

Figure 1: Diagram illustrating how to access an S3 bucket within the same account as your IAM user or IAM role

The workflow in Figure 1 is as follows:

The IAM user’s policy and the IAM role’s identity-based policy grant access to “s3:*”.
The S3 bucket policy associated with Bucket B restricts access to only the IAM role. This means that only the IAM role is able to access its content.
Both the IAM user and the IAM role can access other S3 buckets (for example, Bucket A) in the account. The IAM role is able to access both buckets, but the user can access only the S3 buckets without the bucket policy attached to them. Even though both the role and the user have full “s3:*” permissions, the bucket policy negates access to the bucket for anyone that has not assumed the role.

The main difference in the cross-account approach is that every bucket must have a bucket policy attached to allow access to the IAM role from the other account. The following diagram illustrates how this works in a cross-account deployment scenario.

Figure 2: Diagram illustrating how to access an S3 bucket in a different account than your IAM role

The workflow in Figure 2 is as follows:

The IAM role’s identity-based policy and the IAM users’ policy in the bucket account both grant access to “s3:*”
Bucket policy B denies access to all IAM users and roles except the role specified, and the policy defines what the role is allowed to do with the bucket.
Bucket policy A allows access to the IAM role from the other account.
The IAM user and IAM role can both access Bucket A because the IAM user is in the same account and there is an explicit Allow in bucket policy A for the role. The role can access both buckets because the Deny in bucket policy B is only for principals other than the IAM role.

Using the `aws:PrincipalArn` condition

You can use different types of condition keys to compare details about the principal making the request with the principal properties that you specify in the policy. We recommend that you use the aws:PrincipalArn key. The aws:PrincipalArn key compares the Amazon Resource Name (ARN) of the principal that made the request with the ARN that you specify in the policy.

You could also use the aws:userid policy variable to uniquely identify a user or role in their explicit Deny statements. There is added complexity with using aws:userid to find the value because you have to perform an API call using valid credentials. When working with IAM roles this activity has additional complexity because you are required to get the AssumedRoleUser information, which will not only include the unique role ID, but also the role-session-name that was provided while assuming the role. For example, the aws:userid for an AssumedRoleUser will be as follows:

aws:userid – AROADBQP57FF2AEXAMPLE:role-session-name

It becomes inconvenient to manage and track these IDs when you have a large list of users and roles to be included in the policy.

To mitigate these challenges, we recommend that you use the aws:PrincipalArn condition key. For IAM roles, the request context returns the ARN of the role, not the ARN of the user that assumed the role. AWS recommends that you specify the ARN for resources in policies instead of unique IDs and that you perform IAM policy audits on a periodic basis. Let’s look at how to use the condition key in an IAM policy.

Granting same-account bucket access to a specific role

When accessing a bucket from within the same account, in most cases it is not necessary to use a bucket policy because the policy defines access that is already granted by the user’s direct IAM policy. S3 bucket policies are usually used for cross-account access, but you can also use them to restrict access through an explicit Deny. The Deny would be applied to all principals whether they were in the same account as the bucket or within a different account.

In this case, you use the IAM user or role ARN with the aws:PrincipalArn condition key in a StringNotEquals or StringNotLike condition with a wildcard string. In addition, you use the aws:PrincipalARN key to compare the ARN of the principal that made the request with the ARN that you specify in the policy. Using a conditional logic element allows for the use of a wildcard string to allow for any role session name to be accepted.

Once you have the ARN of the role to which you want to allow access, you need to block the access of other users from within the same account as the bucket. An example policy to block access to the bucket and its objects for users that are not using the IAM role credentials would look like the following.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::amzn-s3-demo-bucket",
        "arn:aws:s3:::amzn-s3-demo-bucket/*"
      ],
      "Condition": {
        "StringNotEquals": {
          "aws:PrincipalArn": [
            "arn:aws:iam::111122223333:role/<ROLE-NAME>"
          ]
        }
      }
    }
  ]
}

Use this same policy for IAM users as shown below.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::amzn-s3-demo-bucket",
        "arn:aws:s3:::amzn-s3-demo-bucket/*"
      ],
      "Condition": {
        "StringNotEquals": {
          "aws:PrincipalARN": [
            "arn:aws:iam::111122223333:role/<ROLE-NAME>”,
            “arn:aws:iam::111122223333:user/<USER-NAME>"
          ]
        }
      }
    }
  ]
}

Granting cross-account bucket access to a specific IAM role

When granting cross-account bucket access to an IAM user or role, you must define what the IAM user or role is allowed to do with the granted access. Learn more about the permissions needed to allow an IAM entity to access a bucket via the CLI/API and the console in Writing IAM Policies: How to Grant Access to an Amazon S3 Bucket. Using the information found in this blog post, an example bucket policy would look like the following.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::111122223333:role/<ROLE-NAME>"
            },
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::amzn-s3-demo-bucket"
        },
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::111122223333:role/<ROLE-NAME>"
            },
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject"
            ],
            "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/*"
        },
        {
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::amzn-s3-demo-bucket",
                "arn:aws:s3:::amzn-s3-demo-bucket/*"
            ],
            "Condition": {
                "StringNotEquals": {
                    "aws:PrincipalARN": [
                        "arn:aws:iam::111122223333:role/<ROLE-NAME>"
                    ]
                }
            }
        }
    ]
}

To grant access to an IAM user in another account, you need to add the ARN for the IAM user to the aws:PrincipalArn condition as outlined in the previous section of this blog post. In addition to the aws:PrincipalArn condition, you would also need to add the IAM user’s full ARN to the Principal element of these policies. An example policy is shown below.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": [
                {
                    "AWS": [
                        "arn:aws:iam::444455556666:role/<ROLE-NAME>”,
                        “arn:aws:iam::444455556666:user/<USER-NAME>"
                    ]
                }
            ],
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::amzn-s3-demo-bucket"
        },
        {
            "Effect": "Allow",
            "Principal": [
                {
                    "AWS": [
                        "arn:aws:iam::444455556666:role/<ROLE-NAME>”,
                        “arn:aws:iam::444455556666:user/<USER-NAME>"
                    ]
                }
            ],
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject"
            ],
            "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/*"
        },
        {
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::amzn-s3-demo-bucket",
                "arn:aws:s3:::amzn-s3-demo-bucket/*"
            ],
            "Condition": {
                "StringNotEquals": {
                    "aws:PrincipalARN": [
                        "arn:aws:iam::444455556666:role/<ROLE-NAME>”,
                        “arn:aws:iam::444455556666:user/<USER-NAME>"
                    ]
                }
            }
        }
    ]
}

In addition to including role permissions in the bucket policy, you need to define these permissions in the IAM user’s or role’s user policy. The permissions are added to a customer managed policy and attached to the role or user in the IAM console, with the following example policy document.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:ListAllMyBuckets",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::amzn-s3-demo-bucket"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/*"
    }
  ]
}

By following the guidance in this post, you restrict S3 bucket access to a specific IAM role or user in same-account and cross-account scenarios, even if the user has an Admin policy or a policy with “s3:*”. There are many applications of this logic in which requirements will vary across use cases. We recommend to employ the principle of least privilege wherever possible, and to grant only the minimum permissions that are required to perform necessary tasks.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Identity and Access Management re:Post or contact AWS Support.

The importance of encryption and how AWS can help

2025-02-12 Ken Beer

Post Syndicated from Ken Beer original https://aws.amazon.com/blogs/security/importance-of-encryption-and-how-aws-can-help/

February 12, 2025: This post was republished to include new services and features that have launched since the original publication date of June 11, 2020.

Encryption is a critical component of a defense-in-depth security strategy that uses multiple defensive mechanisms to protect workloads, data, and assets. As organizations look to innovate while building trust with customers, they need to meet critical compliance requirements and improve data security. Encryption, when used correctly, adds a layer of protection against unauthorized access that can help you strengthen data protection, adhere to regulations and standards, and enhance the security of communications.

How and why does encryption work?

Encryption works by using an algorithm with a key to convert data into unreadable data (ciphertext) that can only become readable again with the right key. For example, a simple phrase like “Hello World!” may look like “1c28df2b595b4e30b7b07500963dc7c” when encrypted. There are several different types of encryption algorithms, all using different types of keys. A strong encryption algorithm relies on mathematical properties to produce ciphertext that can’t be decrypted using any practically available amount of computing power without also having the necessary key. Therefore, protecting and managing the keys becomes a critical part of any encryption solution.

Encryption as part of your security strategy

An effective security strategy begins with stringent access control and continuous work to define the least privilege necessary for persons or systems accessing data. When using the AWS Cloud, you adopt the model of shared responsibility. You are responsible for managing your own access control policies. Encryption is a critical component of a defense-in-depth strategy because it can mitigate weaknesses in your primary access control mechanism. What if an access control mechanism fails and allows access to the raw data on disk or traveling along a network link? If the data is encrypted using a strong key, as long as the decryption key is not on the same system as your data, it is computationally infeasible for a bad actor to decrypt your data.

To show how infeasible this is, let’s consider the Advanced Encryption Standard (AES) with 256-bit keys (AES-256). It’s the strongest industry-adopted and government-approved algorithm for encrypting data. AES-256 is the technology we use to encrypt data in AWS, including Amazon Simple Storage Service (S3) server-side encryption. It would take at least a trillion years to break using current (and foreseeable future) computing technology. Current research suggests that even the future availability of quantum-based computing won’t sufficiently reduce the time it would take to break AES-256 encryption.

But what if you mistakenly create overly permissive access policies on your data? A well-designed encryption and key management system can also help prevent this from becoming an issue, because it separates access to the decryption key from access to your data.

Requirements for an encryption solution

To get the most from an encryption solution, you need to think about two things:

Protecting keys at rest: Are the systems using encryption keys secured so the keys can never be used outside the system? In addition, do these systems implement encryption algorithms correctly to produce strong ciphertexts that cannot be decrypted without access to the right keys?
Independent key management: Is the authorization to use encryption independent from how access to the underlying data is controlled?

There are third-party solutions that you can bring to AWS to help meet these requirements. However, these systems can be difficult and expensive to operate at scale. AWS offers a range of options to simplify encryption and key management.

Protecting keys at rest

When you use third-party key management solutions, it can be difficult to gauge the risk of your plaintext keys leaking and being used outside the solution. The keys have to be stored somewhere, and you can’t always know or audit all the ways those storage systems are secured from unauthorized access. The combination of technical complexity and the necessity of making the encryption usable without degrading performance or availability means that choosing and operating a key management solution can present difficult tradeoffs. The best practice to maximize key security is using a hardware security module (HSM). This is a specialized computing device that has several security controls built into it to help prevent encryption keys from leaving the device in a way that could allow an adversary to access and use those keys.

One such control in modern HSMs is tamper response, in which the device detects physical or logical attempts to access plaintext keys without authorization, and destroys the keys before the attack succeeds. Because you can’t install and operate your own hardware in AWS datacenters, AWS offers two services using HSMs with tamper response to protect customers’ keys: AWS Key Management Service (AWS KMS), which manages a fleet of HSMs on the customer’s behalf, and AWS CloudHSM, which gives customers the ability to manage their own HSMs. Each service can create keys on your behalf, or you can import keys from your on-premises systems to be used by each service.

The keys in AWS KMS or AWS CloudHSM can be used to encrypt data directly, or to protect other keys that are distributed to applications that directly encrypt data. The technique of encrypting encryption keys is called envelope encryption, and it enables encryption and decryption to happen on the computer where the plaintext customer data exists, rather than sending the data to the HSM each time. For very large data sets (e.g., a database), it’s not practical to move gigabytes of data between the data set and the HSM for every read/write operation. Instead, envelope encryption allows a data encryption key to be distributed to the application when it’s needed. The “master” keys in the HSM are used to encrypt a copy of the data key so the application can store the encrypted key alongside the data encrypted under that key. Once the application encrypts the data, the plaintext copy of data key can be deleted from its memory. The only way for the data to be decrypted is if the encrypted data key, which is only a few hundred bytes in size, is sent back to the HSM and decrypted.

The process of envelope encryption is used in AWS services in which data is encrypted on a customer’s behalf (which is known as server-side encryption) to minimize performance degradation. If you want to encrypt data in your own applications (client-side encryption), you’re encouraged to use envelope encryption with AWS KMS or AWS CloudHSM. Both services offer client libraries and SDKs to add encryption functionality to their application code and use the cryptographic functionality of each service. The AWS Encryption SDK is an example of a tool that can be used anywhere, not just in applications running in AWS. To make it easier for customers to encrypt data in databases like Amazon DynamoDB, we built the AWS Database Encryption SDK. The AWS Database Encryption SDK is a set of software libraries that enable you to use client-side encryption in your database design, including record-level encryption of database items. Today, the AWS Database Encryption SDK supports Amazon DynamoDB with attribute-level encryption.

Because implementing encryption algorithms and HSMs is critical to get right, all vendors of HSMs should have their products validated by a trusted third party. HSMs in both AWS KMS and AWS CloudHSM are validated under the National Institute of Standards and Technology’s FIPS 140 program, the standard for evaluating cryptographic modules. This validates the secure design and implementation of cryptographic modules, including functions related to ports and interfaces, authentication mechanisms, physical security and tamper response, operational environments, cryptographic key management, and electromagnetic interference/electromagnetic compatibility (EMI/EMC). Encryption using a FIPS 140 level 3 validated cryptographic module is often a requirement for other security-related compliance schemes like FedRamp and HIPAA-HITECH in the U.S., or the international payment card industry standard (PCI-DSS).

Independent key management

While AWS KMS and AWS CloudHSM can protect plaintext master keys on your behalf, you are still responsible for managing access controls to determine who can cause which encryption keys to be used under which conditions. One advantage of using AWS KMS is that the policy language you use to define access controls on keys is the same one you use to define access to all other AWS resources. Note that the language is the same, not the actual authorization controls. You need a mechanism for managing access to keys that is different from the one you use for managing access to your data. AWS KMS provides that mechanism by allowing you to assign one set of administrators who can only manage keys and a different set of administrators who can only manage access to the underlying encrypted data. Configuring your key management process in this way helps provide separation of duties you need to avoid accidentally escalating privilege to decrypt data to unauthorized users. For even further separation of control, AWS CloudHSM offers an independent policy mechanism to define access to keys.

In 2022, AWS KMS launched support for external key stores (XKS), a feature that allows you to store AWS KMS customer managed keys on an HSM that you operate on premises or at a location of your choice. At a high level, AWS KMS forwards requests for encryption and decryption to your HSM. Your key material never leaves your HSM. This can help you unblock use cases for a small portion of highly regulated workloads where encryption keys should be stored and used outside of an AWS data center. However, XKS forces a significant shift in the shared responsibility model—you now have responsibility for the durability, throughput, latency, and availability of your KMS key. If that key is lost or destroyed, you could permanently lose access to data, and if an XKS key becomes unavailable, all workloads in AWS that are dependent on that XKS key will be inaccessible.

Even with the ability to separate key management from data management, you can still verify that you have configured access to encryption keys correctly. AWS KMS is integrated with AWS CloudTrail so you can audit who used which keys, for which resources, and when. This provides granular vision into your encryption management processes, which is typically much more in-depth than on-premises audit mechanisms. Audit events from AWS CloudHSM can be sent to Amazon CloudWatch, the AWS service for monitoring and alarming third-party solutions you operate in AWS.

Encrypting data at rest and in transit

AWS services that handle customer data, encrypt data that is sent from one system to another—known as data in transit—provide options to encrypt data at rest. AWS services that offer encryption at rest using AWS KMS or AWS CloudHSM use AES-256. None of these services store plaintext encryption keys at rest—that’s a function that only AWS KMS and AWS CloudHSM may perform using their FIPS 140 level 3 validated HSMs. This architecture helps minimize the unauthorized use of keys.

When encrypting data in transit, AWS services use the Transport Layer Security (TLS) protocol to provide encryption between your application and the AWS service. Most commercial solutions use an open source project called OpenSSL for their TLS needs. OpenSSL has roughly 500,000 lines of code with at least 70,000 of those implementing TLS. The code base is large, complex, and difficult to audit. Moreover, when OpenSSL has bugs, the global developer community is challenged to not only fix and test the changes, but also to make sure that the resulting fixes themselves do not introduce new flaws.

AWS’s response to challenges with the TLS implementation in OpenSSL was to develop our own implementation of TLS, known as s2n, or signal to noise. We released s2n in June 2015, which we designed to be small and fast. The goal of s2n is to provide you with network encryption that is easier to understand and that is fully auditable. We released and licensed it under the Apache 2.0 license and hosted it on GitHub.

We also designed s2n to be analyzed using automated reasoning to test for safety and correctness using mathematical logic. Through this process, known as formal methods, we verify the correctness of the s2n code base every time we change the code. We also automated these mathematical proofs, which we regularly re-run to ensure the desired security properties are unchanged with new releases of the code. Automated mathematical proofs of correctness are an emerging trend in the security industry, and AWS uses this approach for a wide variety of our mission-critical software.

Similarly, in 2022, we released s2n-quic, an open-source Rust implementation of the QUIC protocol that was added to our set of AWS encryption open source libraries. QUIC is an encrypted transport protocol designed for performance and is the foundation of HTTP/3. It is specified in a set of IETF standards that were ratified in May 2021. Amazon CloudFront HTTP/3 support is built on top of s2n-quic, due to its emphasis on performance and efficiency. You can learn more about s2n-quic in this Security Blog post.

Implementing TLS requires using encryption keys and digital certificates that assert the ownership of those keys. AWS Certificate Manager and AWS Private Certificate Authority are two services that can simplify the issuance and rotation of digital certificates across your infrastructure that needs to offer TLS endpoints. Both services use a combination of AWS KMS and AWS CloudHSM to generate and/or protect the keys used in the digital certificates they issue.

Encrypting data in use

You might also have use cases for protecting data that is actively being used by federated learning models or other applications. Cryptographic computing—a set of technologies that allow computations to be performed on encrypted data, so that sensitive data is not exposed—is a methodology for protecting data in use.

Consider the example of an insurance company that works with other companies to develop machine learning models for insurance fraud detection. You might need to use sensitive data about your customers as training data for your models, but you don’t want to share your customer data in plaintext form with the other companies. Cryptographic computing gives organizations a way to train models collaboratively without exposing plaintext data about their customers to each other, or to a cloud provider like AWS. You can read more about cryptographic computing in this AWS Security Blog post.

Today, you can see cryptographic computing at work in AWS Clean Rooms, a service that helps companies and their partners more easily and securely analyze and collaborate on their collective datasets—all without sharing or copying one another’s underlying data. AWS Clean Rooms has a feature called Cryptographic Computing for AWS Clean Rooms (C3R) that cryptographically protects your data even while it is being processed by an AWS Clean Rooms collaboration.

The role of end-to-end encryption in secure communications

End-to-end encryption (E2EE) is a method of secure communication between two or more parties that combines encryption in transit and encryption at rest to protect data from unauthorized access, interception, or tampering. Decryption happens only on the parties you intend to communicate with, and no service providers in between. Every call, message, and file is encrypted with a unique private key and remains protected in transit. Unauthorized parties can’t access communication content, because they don’t have the private key required to decrypt the data.

AWS Wickr is an end-to-end encrypted messaging and collaboration service that protects one-to-one and group messaging, voice and video calling, file sharing, screen sharing, and location sharing with 256-bit encryption. With Wickr, each message gets a unique AES private encryption key and a unique Elliptic-curve Diffie–Hellman (ECDH) public key to negotiate the key exchange with recipients. Message content—including text, files, audio, or video—is encrypted on the sending device (your iPhone, for example) by using the message-specific AES key. This key is then exchanged by using the ECDH key exchange mechanism, so that only intended recipients can decrypt the message.

Quantum computing and post-quantum cryptography

Quantum computing is a field of technology that uses quantum mechanics to solve complex problems faster than on classical computers. Quantum computers are able to solve certain types of problems faster by taking advantage of quantum mechanical effects, such as superposition and quantum interference. For cryptography, this has implications that affect traditional encryption mechanisms such as asymmetric key encryption, which is often used for protecting data in transit (TLS) or creating hash-based signatures to verify the integrity and authenticity of a message or file. Quantum computers, if they are performant and stable enough, could theoretically compromise the security of asymmetric key algorithms like RSA, Elliptic Curve Cryptography (ECC), or Diffie-Hellman key agreement schemes. Based on current research, symmetric key algorithms like AES are not considered to be at risk from a quantum computer, because the key length of 256 bits is already sufficient to compensate for a decrease in cryptographic key strength posed by quantum algorithms.

AWS gives customers the option of evaluating post-quantum algorithms alongside traditional algorithms, using hybrid schemes that make use of both classic cryptography and newer post-quantum cryptographic (PQC) algorithms that are designed to be resistant to quantum computer threats. AWS has taken the first step in deploying PQC by implementing ML-KEM, a module lattice-based key encapsulation mechanism, within AWS-LC, our open source FIPS-140-3 validated cryptographic library. AWS-LC is the core cryptographic library used throughout AWS. Specifically, AWS-LC is used in s2n-tls, our open source TLS implementation used across AWS services with HTTPS-based endpoints.

We have also deployed post-quantum s2n-tls with AWS KMS, AWS Certificate Manager (ACM), and AWS Secrets Manager TLS endpoints—bringing the benefits of post-quantum cryptography to customers who enable hybrid post-quantum TLS in their AWS SDK to connect to those services. AWS Transfer Family also supports post-quantum, hybrid SFTP file transfers. You can read more about our efforts to migrate more AWS managed service endpoints to PQC over TLS in this AWS Security blog post. You can also find information about the work of Amazon and AWS in cryptographic research and improvements on the Amazon Science Blog.

Encrypt everything, everywhere

AWS provides customers the ability to encrypt everything, everywhere. Customers can encrypt data at rest, in transit, and in memory, with a few clicks in the AWS Management Console, or an AWS API call. Services like Amazon Simple Storage Service (Amazon S3) encrypt new objects by default, and also support the use of customer managed AWS KMS keys to give customers more control over their encryption keys. Importantly, AWS KMS uses techniques like envelope encryption and highly scalable key management infrastructure to enable AWS services like Amazon S3 or Amazon Elastic Block Store (Amazon EBS) to encrypt data with minimal performance impact to customer applications.

AWS is also consistently working to improve the performance and security of our customers’ data as it moves between networks or devices. As of June 2024, all AWS API endpoints support TLS 1.3 and require at least TLS 1.2 or higher. By using TLS 1.3, you can decrease your connection time by removing one network round trip for every connection request, and can benefit from some of the most modern and secure cryptographic cipher suites available today.

Customers who require memory encryption can use AWS Graviton, our custom-built family of processors based on ARM. AWS Graviton2, AWS Graviton3, and AWS Graviton3E support always-on memory encryption. The encryption keys are securely generated within the host system, do not leave the host system, and are destroyed when the host is rebooted or powered down. Memory encryption is also supported for other instance types; see the EC2 documentation for more details.

As part of our AWS Digital Sovereignty Pledge, we commit to continue to innovate and invest in additional controls for encryption features so that our customers can encrypt everything, everywhere with encryption keys managed inside or outside the AWS Cloud.

Summary

At AWS, security is our top priority. We are committed to helping you control how your data is used, who has access to it, and how it is protected. By building and supporting encryption tools that work both on and off the cloud, we help you secure your data and enable compliance across your environment. We put security at the center of everything we do to make sure that you can protect your data using best-of-breed security technology in a cost-effective way.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS KMS forum or the AWS CloudHSM forum, or contact AWS Support.

Implementing least privilege access for Amazon Bedrock

2025-02-11 Jonathan Jenkyn

Post Syndicated from Jonathan Jenkyn original https://aws.amazon.com/blogs/security/implementing-least-privilege-access-for-amazon-bedrock/

Generative AI applications often involve a combination of various services and features—such as Amazon Bedrock and large language models (LLMs)—to generate content and to access potentially confidential data. This combination requires strong identity and access management controls and is special in the sense that those controls need to be applied on various levels. In this blog post, you will review the scenarios and approaches where you can apply least privilege access to applications using Amazon Bedrock. To fully benefit from the guidance in this post, you need an understanding of AWS APIs, AWS Identity and Access Management (IAM) policies, and AWS security services.

Let’s start by defining the principle of least privilege (PoLP): The PoLP is a security concept that advises granting the minimal level of access—or permissions—necessary for users, programs, or systems to perform their tasks. The main idea is that the fewer permissions an entity has, the lower the risk of malicious or accidental damage. Applying the PoLP to your use of AWS serves two purposes:

Security: By limiting access, you reduce the potential impact of a security incident. If a user or service has minimal permissions, the scope for any damage can be significantly reduced.
Operational simplicity: Managing permissions can become complex if not properly managed and maintained. Applying the PoLP to your access controls early helps keep configurations as manageable as possible. Finally, there are regulatory frameworks that require separation of duty between roles and a documented strategy for access controls, which can be achieved in part by adhering to the PoLP.

Amazon Bedrock is a fully managed AWS service that makes high-performing foundation models (FMs) available through a single unified API. You use Amazon Bedrock through AWS APIs, which expose actions for the control plane and administration such as the configuration of Amazon Bedrock Guardrails and Amazon Bedrock Agents, in addition to data plane functional actions such as inference.

Generally, the path to using Amazon Bedrock for a production workload includes the following stages:

Model selection: Decide on the required features (Retrieval Augmented Generation (RAG), fine-tuning, and so on), evaluate and select a model, and approve a EULA if necessary.
Model adaptation: Prompt engineering, integration of Amazon Bedrock into the application, and addition of model customization if desired.
Model testing: Validate and test the solution.
Model operation: Deploy the solution and make it available. Monitor and operate the solution.

In the following sections, we go through each phase and outline how you can apply the PoLP.

Model selection

In this phase, you choose the features and models that are needed to fulfill your requirements and define how you will apply the PoLP. These can include, for example, model customization, Retrieval Augmented Generation (RAG) or the use of agents.

Security should be integrated into the design so that the defined controls can be implemented during the development phase. One approach to define the necessary security controls is threat modeling. Doing this exercise early in the process will simplify the upcoming phases. The results can be used later to decide on the required guardrails, potential changes to the architecture, and test cases.

In this phase, you will also decide how the solution should be deployed. Customers typically operate in a multi-account setup; therefore, the selection of target organizational units (OUs) and accounts is required. We recommend creating a new OU for generative AI applications. For details, see the deep-dive chapter on generative AI in the AWS Security Reference Architecture. We will talk later about service control policies (SCPs) and how they can be used to restrict permissions. The generative AI OU is a good place to enforce those guardrails.

Amazon Bedrock provides access to a variety of high-performing FMs from leading AI companies such as AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon. In this stage, you need to choose the models that you’ll use and approve them. With a third-party FM, approval might include accepting a EULA. You can limit identities and the models that they can subscribe to in order to follow compliance with EULAs that have been reviewed by your legal department. The following is an example of an SCP that allows account operators to enable all Anthropic FMs and a single Meta Llama FM.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowAcceptingModelEULAs",
      "Effect": "Allow",
      "Action": [
        "aws-marketplace:Subscribe"
      ],
      "Resource": "*",
      "Condition": {
        "ForAnyValue:StringEquals": {
          "aws-marketplace:ProductId": [
            "c468b48a-84df-43a4-8c46-8870630108a7",
            "b0eb9475-3a2c-43d1-94d3-56756fd43737",
            "prod-6dw3qvchef7zy",
            "prod-m5ilt4siql27k",
            "prod-ozonys2hmmpeu",
            "prod-fm3feywmwerog",
            "prod-2c2yc2s3guhqy"
          ]
        }
      }
    },
    {
      "Sid": "AllowUnsubscribingFromModels",
      "Effect": "Allow",
      "Action": [
        "aws-marketplace:Unsubscribe",
        "aws-marketplace:ViewSubscriptions"
      ],
      "Resource": "*"
    }
  ]
}

While this approach works well if you’re only allowlisting actions, you might have highly privileged users that already have broad access to AWS Marketplace APIs. In such a case, you can follow a deny all except a few approach. Such a policy, using the same models as before, would look like the following example:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyAcceptingAllExceptCertainModelEULAs",
      "Effect": "Deny",
      "Action": [
        "aws-marketplace:Subscribe"
      ],
      "Resource": "*",
      "Condition": {
        "ForAllValues:StringNotEquals": {
          "aws-marketplace:ProductId": [
            "c468b48a-84df-43a4-8c46-8870630108a7",
            "b0eb9475-3a2c-43d1-94d3-56756fd43737",
            "prod-6dw3qvchef7zy",
            "prod-m5ilt4siql27k",
            "prod-ozonys2hmmpeu",
            "prod-fm3feywmwerog",
            "prod-2c2yc2s3guhqy"
          ]
        }
      }
    },
    {
      "Sid": "DenyUnsubscribingAllExceptCertainModels",
      "Effect": "Deny",
      "Action": [
        "aws-marketplace:Unsubscribe",
        "aws-marketplace:ViewSubscriptions"
      ],
      "Resource": "*",
      "Condition": {
        "ForAllValues:StringNotEquals": {
          "aws-marketplace:ProductId": [
            "c468b48a-84df-43a4-8c46-8870630108a7",
            "b0eb9475-3a2c-43d1-94d3-56756fd43737",
            "prod-6dw3qvchef7zy",
            "prod-m5ilt4siql27k",
            "prod-ozonys2hmmpeu",
            "prod-fm3feywmwerog",
            "prod-2c2yc2s3guhqy"
          ]
        }
      }
    }
  ]
}

You can find the required product IDs used in the condition in Grant IAM permissions to request access to Amazon Bedrock foundation models.

Model adaptation

In this phase, the solution is built—that is, code is written. This is mostly identical to traditional software development, however there are some areas specific to generative AI, such as prompt engineering, prompt guardrails, model monitoring, and agent design. In this post, we focus solely on the identity and access management aspects.

Adaptation is the phase where the detailed permission sets are created. Data perimeters can be used as a conceptual tool to define and implement guardrails. Because data perimeters are typically coarse grained, they aren’t sufficient to achieve the goal of the PoLP. However, in combination with fine-grained policies, they support a defense-in-depth approach. The following data perimeters exist:

Identity: Only trusted identities are allowed in my network, only trusted identities can access my resources.
Resource: My identities can access only trusted resources, only trusted resources can be accessed from my network.
Network: My identities can access resources only from expected networks, my resources can only be accessed from expected networks.

For applications that use Amazon Bedrock, you can use a virtual private cloud (VPC) network construct with Amazon Virtual Private Cloud (Amazon VPC) to host them. Doing so means that you can then use AWS PrivateLink to create VPC endpoints for both data and control plane APIs. Using PrivateLink to create endpoints, it’s possible to provide access to Amazon Bedrock for VPC-bound compute resources (such as Amazon Elastic Compute Cloud (Amazon EC2), or AWS Lambda) without the need for an internet gateway. In other words, you can deploy these resources entirely in private subnets. By using resource-based policies on these endpoints, you can restrict the principals, actions, resources, and conditions related to making API calls.

Let’s assume you have a VPC with an EC2 instance running in a private subnet hosting an application that uses Amazon Bedrock model invocations and have created an interface VPC endpoint to connect to the Amazon Bedrock data plane. The EC2 instance is configured to use an instance profile using the <rolename> IAM role and needs to be able to invoke a single Anthropic’s Claude Instant FM through an Amazon Bedrock InvokeModel API call. You can apply the PoLP to the containing VPC, and thus the EC2 instance, with a custom policy on the Amazon Bedrock interface VPC endpoint. To use the following policy in your own account, replace the default interface VPC endpoint policy with the following example, replacing <rolename> with the role you want to allow and <account-id> with your 12-digit account number.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowInvokingClaudeInstantV1Models",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<account-id>:role/<rolename>"
      },
      "Action": [
        "bedrock:InvokeModel"
      ],
      "Resource": "arn:aws:bedrock:*::foundation-model/anthropic.claude-instant-v1"
    }
  ]
}

Check out Security Objective 2: Implement a data perimeter using VPC endpoint policies for more information about this data perimeter approach.

You can define the allowed models that can be used in Amazon Bedrock directly in the policy. However, if you have multiple applications that use Amazon Bedrock, you might have to update multiple policies when a new model is allowed to be used. To complement the data perimeter approach, you can add an SCP to limit the models that can be used for inference. Because Amazon Bedrock is using a simple API (InvokeModel and Converse) for inference, a condition element in an IAM policy can be used to deny the use of unapproved models. Note that while the two policies (the SCP and the VPC endpoint policy) look similar, they work differently: VPC endpoint policies are enforced provided that the network path through PrivateLink is enforced; SCPs are applied to principals within the account or OU they’re attached to. Be extra careful if the calling identity resides outside of your account, because only the VPC endpoint policy will apply.

For example, imagine that you wanted to block the invocation of all Anthropic FMs across your organizations within AWS Organizations, in all AWS Regions. The following SCP example applied to the OUs or AWS accounts in scope would achieve that outcome:

{
  "Version": "2012-10-17",
  "Statement": {
    "Sid": "DenyInferenceForAnthropicModels",
    "Effect": "Deny",
    "Action": [
      "bedrock:InvokeModel",
      "bedrock:InvokeModelWithResponseStream"
    ],
    "Resource": [
      "arn:aws:bedrock:*::foundation-model/anthropic.*"
    ]
  }
}

You can use the same pattern to access data that’s needed for your application, such as data residing in Amazon Simple Storage Service (Amazon S3).

Model customization

A solution built on Amazon Bedrock might include model customization. The common denominator of the different customization approaches is that they include data, which is assumed to be confidential and thus in-scope for applying the PoLP. Here, we take a scenario where data is stored in Amazon S3 and can be encrypted using a customer managed AWS Key Management Service (AWS KMS) key.

Measures can be taken on multiple levels, as conceptualized in data perimeters: network, identity, and resource. Amazon Bedrock model customization uses service roles, which allows you to apply fine-grained and least-privilege access end-to-end. These service roles will be assumed by the Amazon Bedrock service principal, so that it can execute actions on your behalf. To allow the Amazon Bedrock service principal to assume the role in your account, you need to attach a trust policy to the role.

Let’s imagine that you’re running an Amazon Bedrock customization job in the us-east-1 (N. Virginia) Region. Using the following trust policy example will allow only the Amazon Bedrock service principal to assume your role.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowBedrockServicePrincipalUnderConditions",
      "Effect": "Allow",
      "Principal": {
        "Service": "bedrock.amazonaws.com"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "aws:SourceAccount": "<account-id>"
        },
        "ArnEquals": {
          "aws:SourceArn": "arn:aws:bedrock:us-east-1:<account-id>:model-customization-job/*"
        }
      }
    }
  ]
}

Make sure to replace <account-id> in the preceding example trust policy with your own 12-digit account number. The policy contains a condition that provides cross-service confused deputy prevention by adding the aws:SourceAccount condition. The confused deputy problem is a situation where an entity that doesn’t have permission to perform an action can coerce a more privileged entity to perform the action. In AWS, cross-service impersonation can result in the confused deputy problem. Cross-service impersonation can occur when one service (the calling service) calls another service (the called service). AWS provides tools to help you protect your data for all services with service principals that have been given access to resources in your account. Both the aws:SourceArn and aws:SourceAccount global condition context keys in the role’s trust policy limit the permissions that Amazon Bedrock gives another service (in the preceding case, to the customization job) to the resource. aws:SourceArn is the more restrictive approach here, because it defines the specific source of the assume request, and not just the AWS account.

You should provide only the permissions that are required to fulfill the model customization task. For example, imagine that you want to limit access to your training data, the validation data bucket, and the output bucket (where Amazon Bedrock will deliver output metrics). The following policy, attached to that same service role, provides only those permissions. Replace the <training-bucket> placeholder with the bucket name that contains your training data, <validation-bucket> with your validation bucket, and <output-bucket> with the bucket where you want to store metrics.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowAccessToTrainingAndValidationBucket",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::<training-bucket>",
        "arn:aws:s3:::<training-bucket>/*",
        "arn:aws:s3:::<validation-bucket>",
        "arn:aws:s3:::<validation-bucket>/*"
      ]
    },
    {
      "Sid": "AllowAccessToOutputBucket",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::<output-bucket>",
        "arn:aws:s3:::<output-bucket>/*"
      ]
    }
  ]
}

Complementing this approach, we recommend using a VPC for the model customization job to restrict access to the training data. Technically, this again involves a VPC endpoint resource policy because the network is using interface VPC endpoints to access your S3 bucket. This allows you to define another network control, specifically an S3 bucket policy that only allows access through a specific VPC endpoint. So, for the situation where you want to limit access for the customization job itself, you can apply a bucket policy such as the following example, replacing <training-bucket> with the bucket name that contains your training data, and <vpce-id> with the ID of the VPC endpoint that resides in your VPC:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AccessToSpecificVPCEOnly",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",      
      "Resource": [
        "arn:aws:s3:::<training-bucket>",
        "arn:aws:s3:::<training-bucket>/*"
      ],
      "Condition": {
        "StringNotEquals": {
          "aws:SourceVpce": "<vpce-id>"
        }
      }
    }
  ]
}

In addition, you would restrict the principals that can access your VPC endpoint and the actions they’re allowed to take in Amazon S3. For simplicity, we’re omitting an example policy here because it’s very similar to the one we have in the Amazon Bedrock invocation section earlier in this post.

If you need to enforce encryption in Amazon S3 using a customer managed AWS KMS key (SSE-KMS), you will need to do the following:

Update the bucket policy with a statement denying unencrypted content being uploaded.
Update the KMS key policy to allow the service role to decrypt and describe the key.

The next policy example should be added to the bucket policy and demonstrates how to deny unencrypted objects being added to an S3 bucket. Again, replace <training-bucket> with the name of the S3 bucket that contains your training data:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyObjectsThatAreNotSSEKMS",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::<training-bucket>/*",
      "Condition": {
        "Null": {
          "s3:x-amz-server-side-encryption-aws-kms-key-id": "true"
        }
      }
    }
  ]
}

Finally, in the KMS key policy, you need a statement similar to the following to allow the Amazon Bedrock service role access to the KMS key. Replace <account-id> with your 12-digit account number and <bedrock-service-role> with the role you created, which will be assumed by the Amazon Bedrock service principal. Make sure to only give the required access to decrypt data with the KMS key to the IAM role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowUseOfKeyByBedrockRole",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<account-id>:role/<bedrock-service-role>"
      },
      "Action": [
        "kms:Decrypt",
        "kms:DescribeKey"
      ],
      "Resource": "*"
    }
  ]
}

Amazon Bedrock can also encrypt a customized model with a customer managed KMS key. Amazon Bedrock uses KMS key grants to encrypt the customized model and to decrypt it later when you deploy it for inference. Therefore, you need to grant the same IAM role permissions to create KMS key grants in the KMS key policy. The KMS key you use for this purpose is typically different than the one you used to encrypt the training data to allow fine-grained permissions on both keys.

So, let’s imagine that you want to use two different roles to encrypt and decrypt the customized models. To allow the role that executes the model customization job to use your KMS key, you need to add the following policy statements to the KMS key policy, replacing <account-id> with your 12-digit account number, <region> with the Region where you run Amazon Bedrock, <bedrock-model-customization-role> with the role name you use to run the model customization job, and <invocation-role> with the name of the role you use for inference.

{
  "Version": "2012-10-17",
  "Id": "PermissionsCustomModelKey",
  "Statement": [
    {
      "Sid": "PermissionsEncryptCustomModel",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::<account-id>:role/<bedrock-model-customization-role>"
        ]
      },
      "Action": [
        "kms:Decrypt",
        "kms:GenerateDataKey",
        "kms:DescribeKey",
        "kms:CreateGrant"
      ],
      "Resource": "*",
      "Condition": {
        "StringLike": {
          "kms:ViaService": [
            "bedrock.<region>.amazonaws.com"
          ]
        }
      }
    },
    {
      "Sid": "PermissionsDecryptModel",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::<account-id>:role/<invocation-role>"
        ]
      },
      "Action": [
        "kms:Decrypt"
      ],
      "Resource": "*",
      "Condition": {
        "StringLike": {
          "kms:ViaService": [
            "bedrock.<region>.amazonaws.com"
          ]
        }
      }
    }
  ]
}

By using KMS key grants, you can revoke the permissions you granted to the service role after the customization job is done, thus reducing the permissions to least privilege. Also, Amazon Bedrock uses secondary KMS key grants for model encryption, which means that they’re automatically retired as soon as the operation that Amazon Bedrock performs on behalf of the customer is completed. The Encryption of model customization jobs and artifacts describes in more detail how grants are used.

To completement these IAM policy guardrails, you can add network controls to reduce the scope of the permissions of the process. Because we focus on IAM policies in this post, we won’t go into details here but only mention how the process works.

When you start a model customization job, a model training job is triggered within the model deployment account. The training job takes a base model from its S3 bucket, then connects to the S3 bucket that holds the customization training data to start the customization. This can be done through your VPC, where you specify a VPC configuration such as subnets and security groups, and the training job places an elastic network interface (ENI) into that VPC as specified. A request to the S3 bucket to read the training data now adheres to whatever routing rules are present in the VPC for that ENI. The VPC routing and security group attached to the ENI can be used to limit networking access to the model customization job.

Amazon Bedrock Agents

Amazon Bedrock Agents offers the capability to build and configure autonomous agents for applications. You can find more information about Amazon Bedrock Agents in Automate tasks in your application using AI agents.

Using an Amazon Bedrock agent also provides certain security properties that are applied to an inference task. For example, at the time of writing, there is no IAM condition key for the bedrock:InvokeModel API to require an Amazon Bedrock guardrail being attached to that same call. However, you can require that inferences are invoked through a call to an agent that has specific Amazon Bedrock guardrails configured.

Let’s say that you want to create a role that explicitly is only allowed to invoke Amazon Bedrock models through a specific Amazon Bedrock agent. The following IAM principal permissions policy example implies that the Amazon Bedrock agent specified has approved Amazon Bedrock guardrails configured. Again replace <region>, <account-id>, <bedrock-agent-id>, and <bedrock-agent-alias-id> with the values of your Amazon Bedrock agent.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowAgentInvocation",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeAgent"
      ],
      "Resource": "arn:aws:bedrock:<region>:<account-id>:agent-alias/<bedrock-agent-id>/<bedrock-agent-alias-id>"
    },
    {
      "Sid": "DenyDirectInvocation",
      "Effect": "Deny",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream",
        "bedrock:CreateModelInvocationJob"
      ],
      "Resource": "arn:aws:bedrock:*::foundation-model/*"
    }
  ]
}

Provided that the Amazon Bedrock agent is configured by a systems administrator or operator with an approved Amazon Bedrock guardrail, the principal with the preceding policy attached to it will be able to invoke it with a prompt, and won’t directly invoke an Amazon Bedrock model. This strategy for making sure that Amazon Bedrock guardrails are applied to all Amazon Bedrock invocations is currently not possible with the bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream APIs, because they don’t have a condition key to match an Amazon Bedrock guardrail to. In addition, denying bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream also denies the Converse APIs and StartAsyncInvoke APIs, so there’s no need to add these separately to the Deny statement.

Because this strategy verifies the use of specific Amazon Bedrock guardrails, you can also use it for the enforcement of specific prompts, IAM service roles, knowledge bases, prompt and completion content restrictions, KMS keys, and FMs in inference invocations. For this approach to be effective, you need to also limit the principals that can create and update Amazon Bedrock agent configurations. Again, this can be restricted using an IAM policy, which is attached to only specific principals.

The following is an example IAM policy statement that gives an attached IAM principal the ability to update the configuration of a specific Amazon Bedrock agent, replacing <region>, <account-id>, and <agent-id> with the Region, account and identifier, and agent identifier that you’re using. If you want this to apply to all agents, replace <agent-id> with an asterisk (*).

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowUpdatingBedrockAgents",
      "Effect": "Allow",
      "Action": [
        "bedrock:DisassociateAgentKnowledgeBase",
        "bedrock:GetAgent*",
        "bedrock:ListAgent*",
        "bedrock:PrepareAgent",
        "bedrock:TagResource",
        "bedrock:UntagResource",
        "bedrock:UpdateAgent*"
      ],
      "Resource": [
        "arn:aws:bedrock:<region>:<account-id>:agent/<agent-id>"
      ]
    }
  ]
}

Where agents aren’t suitable, users or applications performing inference against the Amazon Bedrock models will need permissions to call the bedrock:InvokeModel, bedrock:InvokeModelWithResponseStream, or bedrock:CreateModelInvocationJob actions. In these cases, it can be desirable to limit the target models, following an allowlisting approach. Also, such permissions would only be attached to roles or applications that need to use them.

The following is an example of such a policy that restricts invocation to Anthropic’s Claude Instant.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowInvokationOnAnthropicClaudeInstantV1",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream",
        "bedrock:CreateModelInvocationJob"
      ],
      "Resource": "arn:aws:bedrock:*::foundation-model/anthropic.claude-instant-v1"
    }
  ]
}

You can include detective or reactive controls using Amazon CloudWatch EventBridge rules to detect model invocations that don’t use appropriate Amazon Bedrock guardrails, but that’s outside the scope of this post.

Model testing

Testing is the last step before the solution is deployed. Types of tests include unit tests, integration tests, user acceptance tests, penetration testing, and more. In this phase, you can again verify that the permissions that were assigned are indeed least privilege.

Especially in functional tests where data is involved, it’s important to consider that the data used for testing might be confidential. This is typically true when no synthetic test data is generated for the testing process. Controls to restrict access to the data and logs that are produced and might contain pieces of this data need to be the same as you will apply in a production environment. That you are only testing the solution doesn’t automatically mean that data access controls aren’t needed.

As discussed earlier in this post, controls are activated not only on identities, but also on the network and on resources. All of these should be validated and their effectiveness confirmed in this phase. Tests include but aren’t limited to:

Validate that you can only perform allowed actions in Amazon Bedrock through VPC endpoints, and that actions that don’t use VPC endpoints are blocked.
Validate the effectiveness of the resource policies on VPC endpoints by making sure that they can only be used by authenticated and authorized principals.
When using knowledge bases, validate that only the Amazon Bedrock service principal can access them.
When using Amazon Bedrock guardrails, evaluate their effectiveness. Because of the nature of generative AI applications, the diversity in input and output data can be big. Therefore, make sure to test guardrails with a reasonably large number of prompts.
If model invocation logging is activated, validate that logs are correctly written and protected with IAM permissions and encryption.
Validate that only the required personnel can access these logs, because they might contain sensitive data. Consider automatically sanitizing and forwarding them to a new CloudWatch log group.
Validate that all relevant Amazon Bedrock API calls are being properly logged in AWS CloudTrail, and that you can effectively monitor and alert on any suspicious activity.
Make sure that sensitive information—such as prompts and responses—isn’t being stored in the CloudTrail logs or in any trace output.

The threat modelling that you have potentially created in the design phase can provide valuable inputs that you can use for security-related test cases.

Model operation

In this phase, the solution is finally deployed into production. Operators need Amazon Bedrock control plane permissions to provision and manage Amazon Bedrock resources such as agents, guardrails, prompt libraries, and knowledge bases. They should only get the control plane permissions to provision and manage the Amazon Bedrock features that are being used. These same operators should have access to invoke or configure the Amazon Bedrock service features or their resources (Amazon Bedrock Agents, Amazon Bedrock Guardrails, and prompt libraries) only through an authorized pipeline. This immutable infrastructure approach restricts human users from creating situations or configurations that would otherwise allow unapproved access to the data plane, untracked changes to the control plane, or potentially disruptive updates to your application.

Alternatively, to reduce the assigned permissions to the absolute minimum, automated deployments using pipelines and pipeline roles can be used. This will not only provide versioned infrastructure but also adheres to PoLP by not providing access to human identities.

After deployment, the solution is live and being accessed by real users. At this point, topics such as monitoring, logging, and incident response become relevant. While Amazon Bedrock by default doesn’t store inference or response data, it’s recommended that you activate logging of those elements to constantly verify the accuracy of your generative AI application.

By using this solution, you reduce access to a minimum in the following areas:

Logged prompts and responses
Data available through knowledge bases, RAG, or similar sources
The ability to change the infrastructure

Use multiple, dedicated, least privileged roles for each task. This helps reduce the permissions scope to a minimum. Also, because least privilege enforces using a specific role for a specific task, it reduces the risk of unintended changes by requiring the assumption of a specific role.

By following the AWS Security Reference Architecture, security monitoring data is consolidated in a central security account. This allows a comprehensive, central overview of the security posture of your infrastructure.

Logging sensitive information

An important operational aspect is logging potentially sensitive data that’s sent to or received from the LLM. While Amazon Bedrock doesn’t store prompts or responses, you can use model invocation logging to collect invocation logs, model input data, and model output data for all invocations in your AWS account used in Amazon Bedrock. Model invocation logging isn’t enabled by default. After it’s enabled, prompts, completions, or both for all invocations of all approved models are then logged to the configured log destination. Valid log destinations for prompt and completion logs are Amazon S3 and CloudWatch. When writing logs to these destinations, they can optionally be encrypted using a supplied KMS key.

The contents of these logs might contain sensitive information in the prompt provided by the user or the reply generated by the model. As such, access to these logs should be restricted to personnel and machine processes that require and are authorized to access this classification of data. There are strategies such as using Amazon Macie on the Amazon S3 logs and CloudWatch Logs data protection capabilities to detect, monitor, and redact this sensitive information from logs, but that’s outside the scope of this post.

Even with Amazon Bedrock guardrails in place, the contents of these logs contain the pre-guardrailed user input, and so you must assume that these prompt and completion logs contain sensitive information. In this case, best practice is to encrypt log data with a KMS key, apply a data protection policy to the log group, and define at least three IAM roles:

BasicCompletionLogReviewer: An IAM role whose sole purpose is to access and review the redacted version of these logs.
SensitiveDataCompletionLogReviewer: A restricted IAM role whose sole purpose is to access and review the unredacted version of these logs.
CompletionLogAdmin: A restricted IAM role whose sole purpose is to create, view, and delete data protection policies that can send audit findings to Amazon S3 and CloudWatch destinations.

To allow reading log events in a specific log group, use a policy such as the following and attach it to the BasicCompletionLogReviewer role, replacing <region>, <account-id>, <log-group-name>, and <alias-name> with values that match your CloudWatch log group and the KMS key that encrypts it.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowReadingMaskedLogStream",
      "Effect": "Allow",
      "Action": [
        "logs:DescribeLogStreams",
        "logs:GetLogEvents"
      ],
      "Resource": "arn:aws:logs:<region>:<account-id>:log-group:<log-group-name>:*"
    },
    {
      "Sid": "AllowDecryptOfLogEvents",
      "Effect": "Allow",
      "Action": [
        "kms:Decrypt"
      ],
      "Resource": "arn:aws:kms:<region>:<account-id>:alias/<alias-name>"
    }

  ]
}

With an active data protection policy in place, the preceding policy won’t allow access to the unredacted version of these logs. To allow access to the unredacted versions to the SensitiveDataCompletionLogReviewer role, you need to add an additional action, replacing <region>, <account-id>, <log-group-name>, and <alias-name> with values that match your CloudWatch log group and the KMS key that encrypts it.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowReadingMaskedLogEvents",
      "Effect": "Allow",
      "Action": [
        "logs:DescribeLogStreams",
        "logs:GetLogEvents",
        "logs:Unmask"
      ],
      "Resource": "arn:aws:logs:<region>:<account-id>:log-group:<log-group-name>:*"
    },
    {
      "Sid": "AllowDecryptOfLogEvents",
      "Effect": "Allow",
      "Action": [
        "kms:Decrypt"
      ],
      "Resource": "arn:aws:kms:<region>:<account-id>:alias/<alias-name>"
    }
  ]
}

The policy for the CompletionLogAdmin role requires different permissions; the following sample policy allows a user to create, view, and delete data protection policies that can send audit findings to all three types of audit destinations. It doesn’t permit the user to view unmasked data. This policy will look like the following example, replacing <delivery-stream-id>, <bucket-name>, <log-group-name>, and <alias-name> with the values that match your setup. Note that this includes a statement that explicitly denies the attached role access to decrypt the logs with the configured KMS key, aligning with the PoLP:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowLogGroupsManagement1",
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogDelivery",
        "logs:PutResourcePolicy",
        "logs:DescribeLogGroups",
        "logs:DescribeResourcePolicies"
      ],
      "Resource": "*"
    },
    {
      "Sid": "AllowLogGroupsManagement2",
      "Effect": "Allow",
      "Action": [
        "logs:GetDataProtectionPolicy",
        "logs:DeleteDataProtectionPolicy",
        "logs:PutDataProtectionPolicy",
        "s3:PutBucketPolicy",
        "firehose:TagDeliveryStream",
        "s3:GetBucketPolicy"
      ],
      "Resource": [
        "arn:aws:firehose:::deliverystream/<delivery-stream-id>",
        "arn:aws:s3:::<bucket-name>",
        "arn:aws:logs:::log-group:<log-group-name>:*"
      ]
    },
    {
      "Sid": "AllowListKMSKeys",
      "Effect": "Allow",
      "Action": [
        "kms:ListKeys",
        "kms:ListAliases"
      ],
      "Resource": "*"
    },
    {
      "Sid": "DenyDecryptOfLogEvents",
      "Effect": "Deny",
      "Action": [
        "kms:Decrypt"
      ],
      "Resource": "arn:aws:kms:<region>:<account-id>:alias/<alias-name>"
    }

  ]
}

This approach helps to avoid inadvertent access to or exposure of this sensitive data source and upholds the PoLP by separating duties.

Review IAM permissions on a periodic basis

Managing permissions is an ongoing effort because requirements and functionality change over time. Therefore, we recommend regularly reviewing the assigned permissions and verifying that they aren’t overly permissive. For example, if you have a Lambda function that makes API calls to Amazon Bedrock, and changes are made to that function that require additional permissions (perhaps the use of a new model), then it’s acceptable to update the policy attached to the IAM role that the function uses. It’s not always obvious that permissions for using the earlier model are still needed in the same policy; or permissions might be widened unnecessarily to include all models. When applying the PoLP, it’s important that policies be tested at the time they’re deployed to make sure that they meet the exact application needs and no more, but also that the presumed needs are reviewed periodically.

Using AWS IAM Access Analyzer, you can review and simulate proposed changes to IAM policies to ensure their suitability for a given application or function. You can also use IAM Access Analyzer to review unused permissions over time. This gives system operators an opportunity to inspect and then inform the removal of unused permissions in policies used with Amazon Bedrock applications. Remember that some permissions are dormant and ready for periodic use, such as incident response, recovery and other rare use cases, so your review shouldn’t assume that an unused permission is unnecessary, but an opportunity to review the need for the permission.

Finally, align monitoring of new Amazon Bedrock APIs with your IAM strategy. Especially when using denylisting approaches, it’s important to consider that services will announce new APIs, capabilities, and FMs over time. An example for this was the announcement of the new Converse API. This API provides functionality similar to Invoke, but in a consistent and thus simpler way. Considering such changes is therefore an integral part of your regular policy review processes.

Strong identity and access management is a journey, not a one-time action.

Conclusion

In this post, we have demonstrated some ways that you can apply the principle of least privilege (PoLP) to large language model (LLM)-based applications that use Amazon Bedrock. We have discussed the security considerations of each phase of the development lifecycle and provided examples that you can use as a starting point to implement your own PoLP strategy. It’s important that security doesn’t start late in the process; think about risks and the required actions as early as possible to make sure that your strategy is effective when your application goes live.

Finally, remember that the field of generative AI is moving quickly. We believe that it has the potential to transform virtually every customer experience. From a security perspective, this means that the threat landscape will change and evolve over time. Make sure to constantly adapt to new risks; evaluate and integrate them into your PoLP strategy.

Your AWS account team and specialists are happy to assist you on this journey.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Announcing ASCP integration with Pod Identity: Enhanced security for secrets management in Amazon EKS

2025-02-10 Rodrigo Bersa

Post Syndicated from Rodrigo Bersa original https://aws.amazon.com/blogs/security/announcing-ascp-integration-with-pod-identity-enhanced-security-for-secrets-management-in-amazon-eks/

In 2021, Amazon Web Services (AWS) introduced the AWS Secrets and Configuration Provider (ASCP) for the Kubernetes Secrets Store Container Storage Interface (CSI) Driver, offering a reliable way to manage secrets in Amazon Elastic Kubernetes Service (Amazon EKS). Today, we’re excited to announce the integration of ASCP with Pod Identity, the new standard for AWS Identity and Access Management (IAM) integration with Amazon EKS. This integration provides enhanced security, simplified configuration, and improved Day-2 operation for applications running on EKS that need access to secrets stored in AWS Secrets Manager and AWS Systems Manager Parameter Store. In this post, we’ll cover use-case scenarios for single and cross-account secrets using Pod Identity integration with ASCP.

Amazon EKS introduced Pod Identity in 2023 as a feature that streamlines the process of configuring IAM permissions for Kubernetes applications. Pod Identity simplifies the identity management of applications running on top of Amazon EKS by allowing you to set up permissions directly through EKS interfaces, reducing the number of steps required and alleviating the need to switch between the EKS and IAM services. Pod Identity enables the use of a single IAM role across multiple clusters without updating trust policies, and it supports role session tags for more granular access control. This approach not only simplifies policy management by allowing the reuse of permission policies across roles, but also enhances security by enabling access to AWS resources based on matching tags.

Background

When your applications that are running on top of Amazon EKS require sensitive information like credentials to access a database, or a key to authenticate through an API, we call this kind of information secrets. You can secure, store, and manage secrets in AWS Secrets Manager. ASCP allows Kubernetes applications to securely retrieve the secrets stored in Secrets Manager and Systems Manager Parameter Store. Previously, ASCP relied on IAM roles for service accounts (IRSA) for authentication. Although IRSA provided improvements over previous methods, Pod Identity offers even greater security and simplicity.

Pod Identity provides a more secure and efficient way to integrate IAM roles with applications running on EKS, granting more granular AWS permissions to individual Pods, so that you don’t need instance-level credentials or IRSA.

This integration provides the following key benefits over using IRSA:

Enhanced security: Pod Identity provides a more granular and secure way to manage permissions at the Pod level, alleviating the need to expose the IAM role annotation on Kubernetes ServiceAccount objects.
Simplified configuration:Pod Identity streamlines and simplifies the setup process, reducing the potential for misconfiguration, especially in high-scale environments.
Improved operation: Pod Identity reduces the operational overhead compared to previous methods, centralizing the management with the AWS API.
Native EKS integration: Pod Identity is the new standard for IAM integration for applications running on EKS and provides a more cohesive experience.

Solution overview

With this integration, ASCP uses Pod Identity to authenticate and authorize access to AWS services. When a Pod requires access to a secret, the workflow is as shown in Figure 1.

Figure 1: The workflow performed by the Pod Identity and ASCP integration to provide access to a secret stored on AWS Secrets Manager for a Pod running on Amazon EKS

The workflow is as follows:

A user creates an IAM role and a Pod Identity association between the IAM role and the Kubernetes ServiceAccount assigned to the Pod.
EKS API validates the Pod Identity association, allowing ASCP to use this role to authenticate with AWS services.
If authorized, the Pod Identity agent allows ASCP to assume the IAM role assigned to the Pod through the use of a ServiceAccount token.
The Pod retrieves the requested secrets values and makes them available to the Pod through the use of a mounted volume.

Prerequisites

You need to have the following prerequisites in place in order to implement this solution:

An Amazon EKS cluster (version 1.24 or later)
Pod Identity enabled on your cluster
Access to two AWS accounts (for cross-account access)
AWS CLI installed
kubectl installed
Helm installed

Implement the solution

This guide presents two scenarios: single-account setup and cross-account setup. Complete the single-account steps in Account A before proceeding to the cross-account configuration, which involves Account B. The cross-account setup builds on the single-account foundation to demonstrate secure secrets management across AWS accounts.

Amazon EKS cluster setup

Before you start, you’ll need to set up an Amazon EKS cluster with the required add-ons in a single account (Account A).

(Optional) Use the following commands to set environment variables and create an Amazon EKS cluster:

export CLUSTER_NAME=<YOUR_CLUSTER_NAME>
export AWS_REGION=<YOUR_AWS_REGION>
export NAMESPACE=<YOUR_NAMESPACE>
export ACCOUNT_A=<ACCOUNT_ID_1>
export ACCOUNT_B=<ACCOUNT_ID_2>
export SECRET_NAME=<NAME_OF_YOUR_SECRET>
export SECRET_NAME_CROSS=<NAME_OF_YOUR_CROSS_SECRET>

Here are the environment variables for this demonstration:

export CLUSTER_NAME=ascp-podidentity
export SECRET_NAME=secret-a
export AWS_REGION=$(aws configure get region)
export NAMESPACE=default
export SECRET_NAME=secret-a
export SECRET_NAME_CROSS=secret-b

Create the EKS cluster:

eksctl create cluster —name $CLUSTER_NAME —version=1.31

Update your kubeconfig file to enable Pod Identity on your EKS cluster (if it’s not already enabled):
```
eksctl create addon --cluster $CLUSTER_NAME --name eks-pod-identity-agent
```

Install the Secrets Store CSI driver:

aws eks --region $AWS_REGION update-kubeconfig --name $CLUSTER_NAME
helm repo add secrets-store-csi-driver https://kubernetes-sigs.github.io/secrets-store-csi-driver/charts
helm install -n kube-system csi-secrets-store secrets-store-csi-driver/secrets-store-csi-driver

Verify the Secrets Store CSI Driver installation:

kubectl --namespace=kube-system get pods -l "app=secrets-store-csi-driver"

You should see this expected output:

NAME READY STATUS RESTARTS AGE
csi-secrets-store-secrets-store-csi-driver-fcxf6 3/3 Running 0 41s
csi-secrets-store-secrets-store-csi-driver-j9wqh 3/3 Running 0 41s

Install the ASCP plugin:

kubectl apply -f https://raw.githubusercontent.com/aws/secrets-store-csi-driver-provider-aws/main/deployment/aws-provider-installer.yaml

Consuming secrets in a single account

Now that you’ve set up your Amazon EKS cluster, in this use case, you’ll create an AWS Secrets Manager secret in the same account where the cluster and the applications reside (Account A).

Create a secret in AWS Secrets Manager with tags kubernetes-namespace and eks-cluster-name within the same AWS account as the EKS cluster:

aws secretsmanager create-secret --region $AWS_REGION \
--name $SECRET_NAME \
--tags Key=kubernetes-namespace,Value=$NAMESPACE Key=eks-cluster-name,Value=$CLUSTER_NAME \
--secret-string "{"user":"user1","password":"passwd1"}"

You should see this expected output:

{
"ARN": "arn:aws:secretsmanager:$AWS_REGION:$ACCOUNT_A:secret:dummy-secret-q1w2e3",
"Name": "secret-a",
"VersionId": "abcdefgh-1234-5678-ijklmnopqrst"
}

Note the Amazon Resource Name (ARN) of the new secret. You will use it in the next step.

Create an IAM role and trust policy with the required permissions to retrieve the newly created secret within the same AWS account:

SECRET_ARN=< NOTED_SECRET_ARN_FROM_PREVIOUS_STEP>
cat << EOF > policy.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["secretsmanager:GetSecretValue", "secretsmanager:DescribeSecret"],
            "Resource": ["$SECRET_ARN"],
            "Condition": {
                "StringEquals": {
                    "secretsmanager:ResourceTag/kubernetes-namespace": "\${aws:PrincipalTag/kubernetes-namespace}",
                    "secretsmanager:ResourceTag/eks-cluster-name": "\${aws:PrincipalTag/eks-cluster-name}"
                }
            }
        }
    ]
}
EOF

In the preceding example IAM role permission policy, the use of conditions with kubernetes-namespace and eks-cluster-name tags helps enforce fine-grained access control by specifying that secrets can only be accessed by Pods from specific namespace and clusters. This allows secretsmanager actions only if the secret is tagged with the matching kubernetes-namespace and eks-cluster-name values.

cat << EOF > trust.json
{
    "Version": "2012-10-17",
    "Statement": [
        {  
            "Effect": "Allow",
            "Principal": {
                "Service": "pods.eks.amazonaws.com" 
            },
            "Action": [
                "sts:AssumeRole",
                "sts:TagSession"
            ]
        }
    ]
}
EOF

Notice that the preceding example IAM role trust policy allows the Amazon EKS Pod Identity service (pods.eks.amazonaws.com) to assume the role and tag the session. These actions are necessary for Pod Identity to function correctly, enabling Pods to securely access AWS resources.

Finally, apply the role and trust policy:

aws iam create-role --role-name ascp-podidentity --assume-role-policy-document file://trust.json
aws iam put-role-policy --role-name ascp-podidentity --policy-name ascp-podidentity --policy-document file://policy.json

Note the ARN of the new role. You will use it in the next step.

Create a Kubernetes ServiceAccount and add the Pod Identity association between the ServiceAccount and the IAM role:

ROLE_ARN=<NOTED_ROLE_ARN_FROM_PREVIOUS_STEP>
kubectl create serviceaccount serviceaccount-a

aws eks create-pod-identity-association \
--cluster-name "$CLUSTER_NAME" \
--namespace "default" \
--service-account serviceaccount-a \
--role-arn $ROLE_ARN

Create a SecretProviderClass to use the newly created secret in your Amazon EKS cluster.

cat << EOF | kubectl apply -f -

apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: aws-secrets-manager
spec:
  provider: aws
  parameters:
    objects: |
      - objectName: "secret-a" # Secret name or ARN to be mounted to the Pod
        objectType: "secretsmanager"
    usePodIdentity: "true" # Indicator to use Pod Identity instead of IRSA
    
EOF

Create a new deployment to consume your newly created secret using Pod Identity as a mounted volume.

cat << EOF | kubectl apply -f -

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: application-a
  name: application-a
spec:
  replicas: 1
  selector:
    matchLabels:
      app: application-a
  template:
    metadata:
      labels:
        app: application-a
    spec:
      containers:
      - image: public.ecr.aws/amazonlinux/amazonlinux:2023-minimal
        name: amazonlinux
        args:
        - infinity
        command:
        - sleep
 volumeMounts:
 - name: secrets-store-inline
 mountPath: "/mnt/secret" # Directory where secret will be mounted to
 readOnly: true
 volumes:
 - name: secrets-store-inline
 csi:
 driver: secrets-store.csi.k8s.io
 readOnly: true
 volumeAttributes:
 secretProviderClass: "aws-secrets-manager" # Match SecretProviderClass name created
 serviceAccountName: serviceaccount-a # Specify service account created

EOF

Validate that the deployment was created successfully and confirm the secret was mounted correctly.

kubectl get pods -l app=application-a
kubectl exec -it $(kubectl get pods -l app=application-a -o name) -- cat /mnt/secret/secret-a

You should see this expected output:

NAME READY STATUS RESTARTS AGE
application-a-b98d44bb8-bzvn4 1/1 Running 0 3s

{"user":"user1","password":"passwd1"}%

Working with cross-account secrets through resource policies

For this second use case, you’ll also use the Amazon EKS cluster on Account A, and create a new Secrets Manager secret in a different account (Account B).

In order to access a secret in a different account, you can’t use the default AWS Key Management Service (AWS KMS) key, but will use aws/secretsmanager to encrypt this secret. So you need to first create a new AWS KMS key that allows cross-account access.

On Account B

Create a customer managed key on AWS KMS with cross-account permissions:

cat < key-policy.json
{
  "Version": "2012-10-17",
  "Id": "cross-account-access",
  "Statement": [
    {
      "Sid": "Enable IAM User Permissions",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::${ACCOUNT_B}:root"
      },
      "Action": "kms:*",
      "Resource": "arn:aws:kms:${AWS_REGION}:${ACCOUNT_B}:key/*"
    },
    {
      "Sid": "Allow access for Key Administrators",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::${ACCOUNT_B}:role/Admin"
      },
      "Action": [
        "kms:Create",
        "kms:Describe",
        "kms:Enable",
        "kms:List",
        "kms:Put",
        "kms:Update",
        "kms:Revoke",
        "kms:Disable",
        "kms:Get",
        "kms:Delete",
        "kms:TagResource",
        "kms:UntagResource",
        "kms:ScheduleKeyDeletion",
        "kms:CancelKeyDeletion",
        "kms:RotateKeyOnDemand"
      ],
      "Resource": "arn:aws:kms:${AWS_REGION}:${ACCOUNT_B}:key/*"
    },
    {
      "Sid": "Allow access through AWS Secrets Manager",
      "Effect": "Allow",
      "Principal": {
        "AWS": "*"
      },
      "Action": [
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:ReEncrypt",
        "kms:CreateGrant",
        "kms:DescribeKey"
      ],
      "Resource": "arn:aws:kms:${AWS_REGION}:${ACCOUNT_B}:key/*",
      "Condition": {
        "StringEquals": {
          "kms:ViaService": "secretsmanager.${AWS_REGION}.amazonaws.com",
          "kms:CallerAccount": [
            "$ACCOUNT_A",
            "$ACCOUNT_B"
          ]
        }
      }
    },
    {
      "Sid": "Allow access through AWS Secrets Manager for Generate Data Key",
      "Effect": "Allow",
      "Principal": {
        "AWS": "*"
      },
      "Action": "kms:GenerateDataKey",
      "Resource": "arn:aws:kms:${AWS_REGION}:${ACCOUNT_B}:key/*",
      "Condition": {
        "StringEquals": {
          "kms:CallerAccount": [
            "$ACCOUNT_A",
            "$ACCOUNT_B"
          ]
        },
        "StringLike": {
          "kms:ViaService": "secretsmanager.${AWS_REGION}.amazonaws.com"
        }
      }
    },
    {
      "Sid": "Allow direct access to key metadata to the account",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::${ACCOUNT_A}:root",
          "arn:aws:iam::${ACCOUNT_B}:root"
        ]
      },
      "Action": [
        "kms:Describe",
        "kms:Get",
        "kms:List",
        "kms:RevokeGrant"
      ],
      "Resource": "arn:aws:kms:${AWS_REGION}:${ACCOUNT_B}:key/*"
    }
  ]
}
EOF

Then create the KMS key:

$ aws kms create key --region $AWS_REGION --policy file://key-policy.json

You should see this expected output:

{
    "KeyMetadata": {
        "AWSAccountId": "$ACCOUNT_B",
        "KeyId": "abcdefgh-1234-5678-90ij-klmnopqrstuv",
        "Arn": "arn:aws:kms:${AWS_REGION}:${ACCOUNT_B}:key/abcdefgh-1234-5678-90ij-klmnopqrstuv",
        "CreationDate": "2025-01-23T18:00:00.756000-05:00",
        "Enabled": true,
        "Description": "",
        "KeyUsage": "ENCRYPT_DECRYPT",
        "KeyState": "Enabled",
        "Origin": "AWS_KMS",
        "KeyManager": "CUSTOMER",
        "CustomerMasterKeySpec": "SYMMETRIC_DEFAULT",
        "KeySpec": "SYMMETRIC_DEFAULT",
        "EncryptionAlgorithms": [
            "SYMMETRIC_DEFAULT"
        ],
        "MultiRegion": false
    }
}

Take note of the KeyID and the ARN of the key. You can use it to create an alias for the newly created key:

KEY_ID=<NOTED_KEY_ID_FROM_PREVIOUS_STEP>
aws kms create-alias --region $AWS_REGION —-alias-name alias/secretsmanager-cross —-target-key-id $KEY_ID

Create a new secret in Secrets Manager, using the new KMS key to encrypt it:

aws secretsmanager create-secret --region $AWS_REGION \
--name $SECRET_NAME_CROSS \
--tags Key=kubernetes-namespace,Value=$NAMESPACE Key=eks-cluster-name,Value=$CLUSTER_NAME \
--secret-string "This is a Cross Account Secret" \
--kms-key-id $KEY_ID

You should see this expected output:

{
"ARN": "arn:aws:secretsmanager:$AWS_REGION:$ACCOUNT_B:secret:secret-b-12345",
"Name": "secret-b",
"VersionId": "abcdefgh-1234-5678-ijklmnopqrst"
}

Note the ARN of the new secret. You will use it in the next step.

Add a resource policy that allows the IAM role on Account A to access the newly created secret:

SECRET_CROSS_ARN=< NOTED_SECRET_CROSS_ARN_FROM_PREVIOUS_STEP>
cat << EOF > resource-policy.json

{
  "Version" : "2012-10-17",
  "Statement" : [ {
    "Effect" : "Allow",
    "Principal" : {
      "AWS" : "arn:aws:iam::${ACCOUNT_A}:root"
    },
    "Action" : [ "secretsmanager:GetSecretValue", "secretsmanager:DescribeSecret" ],
    "Resource" : "$SECRET_CROSS_ARN"
    } 
  ]
}

EOF
aws secretsmanager put-resource-policy --secret-id $SECRET_NAME_CROSS --resource-policy file://resource-policy.json

Now let’s go back to Account A.

On Account A

Create a new IAM role and trust policy with the required permissions to retrieve the newly created secret within Account B:

KEY_ARN=< NOTED_KEY_ARN_FROM_PREVIOUS_STEP>
cat << EOF > cross-policy.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["secretsmanager:GetSecretValue", "secretsmanager:DescribeSecret"],
            "Resource": ["$SECRET_CROSS_ARN"],
            "Condition": {
                "StringEquals": {
                    "secretsmanager:ResourceTag/kubernetes-namespace": "\${aws:PrincipalTag/kubernetes-namespace}",
                    "secretsmanager:ResourceTag/eks-cluster-name": "\${aws:PrincipalTag/eks-cluster-name}"
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "kms:Decrypt",
                "kms:DescribeKey"
                ],
            "Resource": "$KEY_ARN"
        }
    ]
}
EOF

In the preceding example IAM role permission policy, note that the resource ARN is pointing to the secret created on Account B.

cat << EOF > trust.json
{
    "Version": "2012-10-17",
    "Statement": [
        {  
            "Effect": "Allow",
            "Principal": {
                "Service": "pods.eks.amazonaws.com" 
            },
            "Action": [
                "sts:AssumeRole",
                "sts:TagSession"
            ]
        }
    ]
}
EOF

The IAM role trust policy needs to have pods.eks.amazonaws.com as a trusted entity to perform the sts:AssumeRole and sts:TagSession actions.

Next, create the role and apply the policy:

aws iam create-role --role-name ascp-podidentity-cross --assume-role-policy-document file://trust.json
aws iam put-role-policy --role-name ascp-podidentity-cross --policy-name ascp-podidentity-cross --policy-document file://cross-policy.json

Note the ARN of the new role. You will use it in the next step.

Create the second Kubernetes ServiceAccount and add the cross-account Pod Identity association:

kubectl create serviceaccount serviceaccount-b

ROLE_CROSS_ARN=<NOTED_ROLE_CROSS_ARN_FROM_PREVIOUS_STEP>
aws eks create-pod-identity-association \
--cluster-name "$CLUSTER_NAME" \
--namespace "default" \
--service-account serviceaccount-b \
--role-arn $ROLE_CROSS_ARN

Create a SecretProviderClass to use the cross-account secret created in Account B in your Amazon EKS cluster:

cat << EOF | kubectl apply -f -

apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: aws-secrets-manager-cross
spec:
  provider: aws
  parameters:
    objects: |
      - objectName: "$SECRET_CROSS_ARN" # Full ARN of the Secret in the ACCOUNT B to be mounted on Pod
        objectType: "secretsmanager"
    usePodIdentity: "true" # Indicator to use Pod Identity instead of IRSA
    
EOF

Create a new deployment to consume your newly created secret using Pod Identity as a mounted volume:

cat << EOF | kubectl apply -f -

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: application-b
  name: application-b
spec:
  replicas: 1
  selector:
    matchLabels:
      app: application-b
  template:
    metadata:
      labels:
        app: application-b
    spec:
      containers:
      - image: public.ecr.aws/amazonlinux/amazonlinux:2023-minimal
        name: amazonlinux
        args:
        - infinity
        command:
        - sleep
 volumeMounts:
 - name: secrets-store-cross
 mountPath: "/mnt/secret" # Directory where secret will be mounted to readOnly: true
 volumes:
 - name: secrets-store-cross
 csi:
 driver: secrets-store.csi.k8s.io
 readOnly: true
 volumeAttributes:
 secretProviderClass: "aws-secrets-manager-cross" # Match SecretProviderClass name created for cross-account access serviceAccountName: serviceaccount-b # Specify service account created for cross-account access

EOF

Validate that the deployment was created successfully and confirm the mounted secret:

kubectl get pods -l app=application-b
kubectl exec -it $(kubectl get pods -l app=application-b -o name) -- cat /mnt/secret/$SECRET_CROSS_ARN

Note that the volume name is the secret ARN, because it’s a cross-account secret.

You should see this expected output:

NAME READY STATUS RESTARTS AGE
application-b-67b755444f-ngrhv 1/1 Running 0 8s

"This is a Cross Account Secret"%

Conclusion

The integration of ASCP with Pod Identity marks a significant step forward in secrets management for Amazon EKS. It offers enhanced security, simplified configuration, and improved operations. We encourage all EKS users to explore this new integration and take advantage of its benefits.

The integration of ASCP with Pod Identity offers these benefits over IRSA:

Simplified setup: With Pod Identity, you don’t need to create and manage service accounts for each application.
Enhanced security: Pod Identity provides more granular control over permissions at the Pod level.
Improved scalability: Pod Identity is easier to implement in large-scale environments.
Consistent AWS experience: Pod Identity aligns more closely with AWS best practices for IAM management.

For more information, see our documentation for: AWS Secrets Manager, AWS Secrets CSI Store Provider (ASCP), and Amazon EKS Pod Identity.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Enhancing telecom security with AWS

2025-02-07 Kal Krishnan

Post Syndicated from Kal Krishnan original https://aws.amazon.com/blogs/security/enhancing-telecom-security-with-aws/

If you’d like to skip directly to the detailed mapping between the CISA guidance and AWS security controls and best practices, visit our Github page.

Implementing CISA’s enhanced visibility and hardening guidance for communications infrastructure

In response to recent cybersecurity incidents attributed to actors from the People’s Republic of China, a number of cybersecurity agencies led by the U.S. Cybersecurity and Infrastructure Security Agency (CISA) have jointly released comprehensive guidance for securing communications infrastructure. As communications service providers (CSPs) migrate their workloads to the cloud, they must take steps to implement these security measures effectively in cloud environments.

This blog post describes how CSPs can use Amazon Web Services (AWS) capabilities to implement this guidance while benefiting from the advantages of the cloud.

The guidance focuses on two key domains:

Strengthening visibility: Enabling security teams to monitor, detect, and respond to potential threats through comprehensive visibility into digital assets
Hardening systems and devices: Implementing robust security controls and configurations to minimize vulnerabilities and help prevent unauthorized access

Overview of fundamental cloud concepts

Before exploring the specific guidance in this post, it’s important to understand how security recommendations apply differently to public cloud environments than to private infrastructure. A common tendency in the telecommunications industry is to treat public clouds as merely scaled-up versions of private clouds. This can result in a misunderstanding of security capabilities and underutilization of cloud-native security features of the public cloud.

The fundamental difference lies in how public clouds are architected—specifically for multi-tenancy, with strong tenant isolation as a cornerstone of their design. In AWS, virtual resources are isolated by default and require explicit configuration to interconnect. For example, when you create a virtual private cloud (VPC) with Amazon VPC, this logically isolated network does not permit inbound or outbound traffic until specific routes and ports are deliberately configured. Similarly, Amazon Simple Storage Service (Amazon S3) buckets are private by default, requiring explicit configuration to grant access. This isolation extends to the core of our virtualization infrastructure through the AWS Nitro System, which provides unprecedented workload isolation—even AWS operators with the highest privileges have no technical access to customer workloads. Furthermore, data moving between Nitro System based virtual machines or across our global backbone network is automatically encrypted, providing additional layers of protection beyond customer-implemented encryption.

This secure-by-design and secure-by-default philosophy permeates throughout AWS service design and operations. It isn’t merely a design choice—it’s a business imperative driven by the critical need for operational resilience and customer trust in the public cloud model. Our commitment to principles of this sort is reflected in our participation as a signatory to CISA’s Secure by Design Pledge.

When AWS customers operate in the public cloud, understanding the shared responsibility model is paramount. This model clearly delineates security responsibilities: AWS is responsible for security of the cloud, while customers are responsible for security in the cloud. This division of responsibilities significantly reduces your operational burden, because AWS assumes responsibility for securing everything about and inside the cloud services it provides, all the way down to the physical protection of data centers. As a result, you can concentrate your security resources where they matter most—protecting your applications and workloads—while AWS handles the undifferentiated heavy lifting of infrastructure security.

This shared responsibility model becomes even more advantageous when considering the economies of scale inherent to public cloud operations. The massive scale of AWS allows us to invest more in securing the foundations than a single enterprise could achieve independently, creating a security multiplier effect that benefits all customers. A compelling example of this scale advantage is our comprehensive threat intelligence program, which deploys honeypot sensors throughout our global network. These sensors observe more than 100 million potential threat interactions and probes daily. Using artificial intelligence and machine learning (AI/ML), we analyze this information and take swift, often automated actions to mitigate threats. In the first half of 2023 alone, this program enabled us to dismantle the sources of approximately 230,000 Layer 7 distributed denial of service (DDoS) events. We also provide this intelligence to customers through services like Amazon GuardDuty, extending the benefits of our scale to our customers.

The scale of AWS operations not only enables exemplary threat intelligence, but also necessitates extensive automation of our security operations. Several routine tasks, such as feature and patch deployments and configuration updates, are fully automated through deployment pipelines. Automation has the added benefit of taking humans out of the loop, thereby decreasing opportunities for mistakes.

Our scale also facilitates our comprehensive compliance with security standards across multiple industries and jurisdictions. Our global presence and diverse customer base necessitate adherence to the most stringent security requirements worldwide. Through the AWS Compliance Program, we’ve achieved 143 security standards and compliance certifications, ranging from ISO standards for cloud security and privacy to industry-specific regulations in finance and healthcare, as well as government security programs. This includes independent verification of our claims on the isolation properties of our Nitro System virtualization infrastructure. Consequently, you benefit from this scale-driven compliance, gaining access to a secure, certified infrastructure that implements state-of-the-art security systems.

These are a few reasons why, in a blog post titled Why cloud first is not a security problem, the UK’s National Cyber Security Centre concluded that “using the cloud securely should be your primary concern – not the underlying security of the public cloud.”

Private clouds, on the other hand, are typically within the control of a single organization and are single-tenanted, offering relatively weak workload isolation. Virtual resources in private clouds usually default to being interconnected upon creation, and so require explicit steps to increase isolation. Manual operations, with their opportunities for mistakes as well as potential involvement of threat actors, are often still a large part of private cloud workflows. Rarely do they undergo the level of security scrutiny that public clouds are routinely subjected to. These, and other differences, mean that security risks in each offering are inherently different, so correspondingly distinct security controls and solution architectures are needed to mitigate these risks.

Implementing hardening guidance with AWS

Your cloud resources and data are contained in an AWS account. An account acts as an identity and access management isolation boundary. When you need to share resources and data between two accounts, you must explicitly allow this access. This reduces the risk of lateral movement between accounts.

Designing your AWS environment correctly lays a strong foundation that can help you meet the CISA cybersecurity guidance. AWS Control Tower, working with AWS Organizations, enables you to establish a well-architected, multi-account environment based on security best practices.

For detailed guidance on creating a secure landing zone for telecom workloads, refer to our comprehensive blog post on this topic.

We’ve analyzed the recommendations in CISA’s guidance and grouped them into six categories across the two key domains. Refer to the GitHub page linked at the bottom of this post, which has further detailed guidance on the relevant AWS services that can assist your implementation of each individual security measure in the guidance.

Logging and monitoring

The guidance in this category emphasizes the importance of increasing visibility to understand network activity, which is essential for detecting anomalies and responding to incidents. Key security controls include the following: have a robust asset management capability, enable logging at various levels, centralize logging, protect the logging and monitoring infrastructure, and use security tools to detect anomalies and incidents.

Enhanced visibility is an inherent advantage of the public cloud model, particularly in AWS. This transparency is not just a bolted-on feature, but a fundamental necessity driven by the API-centric, pay-as-you-go business model. To accurately bill customers, AWS has built comprehensive tracking and logging capabilities into its core architecture. As a result, AWS provides robust tools that allow you to capture, centralize, and monitor logs across every layer of your network workload. This level of visibility extends far beyond what’s typically achievable in traditional infrastructure, offering you unprecedented insight into your IT assets and user activities.

Another key guidance is this area is to centralize security-related logging while isolating the logging infrastructure from other production environments. You can implement this guidance in AWS by using Amazon Security Lake together with OpenSearch implemented in separate accounts, with access restricted to just the security organization. Alternatively, this solution provides a best-practice implementation of creating collection and ingestion pipelines to allow for centralization and inspection of log sources across your AWS workloads without the use of Security Lake.

Configuration and change management

The guidance in this category emphasizes the centralization, security, and protection of configurations. It highlights the importance of detecting and providing alerts for unauthorized modifications, auditing configurations for compliance, and a change management process that automates routine changes to minimize unintended drift.

In AWS, you can implement infrastructure and configuration as code, which allows for central storage of configuration in repositories, tracking changes through version control, and implementing changes through approved change management processes. You can use code repositories and continuous integration and continuous delivery (CI/CD) pipelines to automate the implementation of these configurations, helping you increase deployment speed, maintain consistency, simplify management, and implement a rigorous and auditable change control process.

Regardless of how infrastructure is deployed and managed, you can use the AWS Config service to automatically track the current state and history of a wide set of configuration information across more than 100 AWS services and hundreds of their resource types. You can also write custom AWS Config rules to take automatic actions whenever sensitive resources are modified, or take advantage of more than 400 AWS managed rules in AWS Config that send alerts or create automatic remediations when critical resources change state.

Identity and access management

The guidance in this category emphasizes the importance of active account and permissions management, use of phishing-resistant authentication methods, implementing least privilege through role-based access controls, managing emergency access, and limiting sessions.

Authentication and authorization, which are critical components of access control, are managed through AWS Identity and Access Management (IAM), AWS IAM Identity Center, and AWS Organizations. AWS provides you with capabilities to manage permissions at scale across identities, resources, and services, including mandating the use of multi-factor authentication (MFA) for logins. Furthermore, these capabilities support customers adhering to the principle of least privilege by encouraging time-bound, session-based credential management by using AWS Security Token Service (AWS STS).

Software running in the cloud that needs to call cloud APIs receives its temporary and frequently rotated credentials automatically through IAM roles for Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Kubernetes Service (Amazon EKS), and AWS Lambda, helping to eliminate the need for long-term credentials that can leak or be compromised. Access to cloud APIs from on-premises software can be safely boot-strapped from enterprise identity management technologies by using AWS IAM Roles Anywhere. You can even protect authentication to non-cloud technologies with a combination of roles and the use of AWS Secrets Manager to protect and automatically rotate secrets such as database passwords.

Network and traffic management

The guidance in this category emphasizes segmenting workloads and networks to limit the potential for lateral movement and exposure to the internet, monitoring and regulating traffic flows by using policies, and securing unused ports.

You can achieve network micro-segmentation, a critical aspect of modern security architecture, through VPCs and subnets. You can, for example, segregate internet-facing components of your application from those that don’t require such access by placing them in separate VPCs and enabling internet access only on the VPC that requires it. You can control traffic flows within and between segments by using a variety of network services—routing tables, internet gateways, transit gateways, and firewall services, to name a few. This segmentation minimizes your risk from unauthorized activity that originates from the internet and limits the potential for lateral movement in the event of a breach.

To implement the guidance regarding out-of-band management, you can architect your network connections to separate management traffic from network signaling traffic by using subnets—for example, a single EC2 instance can have multiple elastic network interfaces (ENIs) attached to different subnets or even different VPCs: one that permits only management traffic and another that permits only signaling traffic.

Strong cryptography to encrypt data at rest and in traffic

The guidance in this category emphasizes using strong cipher suites, secure versions of encryption protocols, and PKI-based certificates to protect data at rest and in transit.

Encryption, a cornerstone of data protection, is comprehensively addressed in AWS offerings. API endpoints of AWS services support TLS 1.3 (and a minimum of TLS 1.2) with secure standards-based cipher suites, encryption keys, and advanced security features like HKDF (HMAC-based extract-and-expand key derivation function) for added security. AWS services that manage customer secrets sent over the wire also support post-quantum cryptography. For example, AWS Key Management Service (AWS KMS), AWS Certificate Manager, and AWS Secrets Manager support a hybrid post-quantum key exchange option for the TLS network encryption protocol. In its use of the Border Gateway Protocol (BGP), AWS uses Resource Public Key Infrastructure (RPKI) and Route Origin Authorization (ROA) to protect the Amazon IP address space and routes from misconfigurations and hijacking.

You can also use AWS cryptographic services such as AWS KMS, AWS CloudHSM, and AWS Certificate Manager to help secure your data in transit and at rest. Keys that you create in AWS KMS are protected by FIPS 140-2 Level 3 validated hardware security modules (HSMs), and there is no mechanism for anyone, including AWS service operators, to view, access, or export plaintext key material.

AWS Secrets Manager helps you securely manage, retrieve, and rotate database credentials, application credentials, OAuth tokens, API keys, and other secrets throughout their lifecycles. For more details on AWS cryptography solutions and best practices, refer to Encryption best practices for AWS services.

Vulnerability management

This guidance emphasizes minimizing exploitation risks through proper lifecycle management, regular patching, and elimination of insecure protocols. AWS helps address these requirements through both shared responsibility and innovative architectural approaches.

Under the shared responsibility model, AWS manages the security of underlying infrastructure. This includes maintaining up-to-date systems and services, disabling insecure protocols and unused ports, and providing Security Bulletins for timely vulnerability notifications. AWS services are supported through contractually defined terms, so that you don’t need to worry about end-of-life infrastructure components.

For your applications, AWS enables a transformative approach to vulnerability management through ephemeral resources and immutable infrastructure. Instead of maintaining long-lived instances that require continuous patching, you can maintain a single, hardened, and frequently updated Amazon Machine Image (AMI) as your golden image. When updates are needed, rather than patching running instances, you simply deploy new instances with your application code installed from an updated AMI. Similar approaches also apply to container-based workloads. Workloads based on AWS Lambda reduce your patching responsibility even further, because only the code that contains your business logic (and any supporting layers you have chosen) needs to be updated—AWS patches the underlying hypervisors, operating systems, and containers automatically. This approach enables you to keep your systems in a known, secure state while reducing both the threat surface and operational overhead. You can further enhance security by using AWS networking features like security groups to disable insecure protocols, such as enforcing HTTPS rather than HTTP.

Conclusion

The comprehensive guidance from cybersecurity agencies provides a crucial framework for securing telecommunications infrastructure. As demonstrated throughout this post, AWS offers a robust set of native services and capabilities that align with the recommendations from CISA and allied governments. From enhanced visibility through logging and monitoring, to strong identity management, network segmentation, encryption, and vulnerability management, AWS provides the tools you need to implement these security controls effectively while maintaining operational efficiency. The shared responsibility model, combined with AWS continuous innovation in security, enables telecommunications companies to build and maintain resilient, secure cloud environments.

Visit our GitHub page for detailed information on implementing CISA guidance with AWS services.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

2024 PiTuKri ISAE 3000 Type II attestation report available with 179 services in scope

2025-02-07 Tariro Dongo

Post Syndicated from Tariro Dongo original https://aws.amazon.com/blogs/security/2024-pitukri-isae-3000-type-ii-attestation-report-available-with-179-services-in-scope/

Amazon Web Services (AWS) is pleased to announce the issuance of the Criteria to Assess the Information Security of Cloud Services (PiTuKri) Type II attestation report with 179 services in scope.

The Finnish Transport and Communications Agency (Traficom) Cyber Security Centre published PiTuKri, which consists of 52 criteria that provide guidance across 11 domains for assessing the security of cloud service providers.

An independent third-party audit firm issued the report to assure customers that the AWS control environment is appropriately designed and operating effectively to demonstrate adherence with PiTuKri requirements. This attestation demonstrates the AWS commitment to adhere to security expectations for cloud service providers set by Traficom.

The latest report covers a 12-month period from October 1, 2023 to September 30, 2024. AWS has added the following 10 services to the current PiTuKri scope:

Customers can find the PiTuKri ISAE 3000 report on AWS Artifact. To learn more about the complete list of services in scope, see AWS Compliance Programs and AWS Services in Scope for PiTuKri.

AWS strives to continuously bring new services into the scope of its compliance programs to help you meet your architectural and regulatory needs. Contact your AWS account team for questions about the PiTuKri report.

To learn more about our compliance and security programs, see AWS Compliance Programs. As always, we value your feedback and questions; reach out to the AWS Compliance team through the Contact Us page.

If you have feedback about this post, submit comments in the Comments section below.

2024 FINMA ISAE 3000 Type II attestation report available with 179 services in scope

2025-02-07 Tariro Dongo

Post Syndicated from Tariro Dongo original https://aws.amazon.com/blogs/security/2024-finma-isae-3000-type-ii-attestation-report-available-with-179-services-in-scope/

Amazon Web Services (AWS) is pleased to announce the issuance of the Swiss Financial Market Supervisory Authority (FINMA) Type II attestation report with 179 services in scope.

The Swiss Financial Market Supervisory Authority (FINMA) has published several requirements and guidelines about engaging with outsourced services for the regulated financial services customers in Switzerland.

An independent third-party audit firm issued the report to assure customers that the AWS control environment is appropriately designed and operating effectively to support adherence with FINMA requirements.

The latest report covers the 12-month period from October 1, 2023 to September 30, 2024, for the following circulars:

2018/03 “Outsourcing – banks, insurance companies and selected financial institutions under FinIA”
2023/01 “Operational risks and resilience – banks”
Business Continuity Management (BCM) minimum standards proposed by the Swiss Insurance Association

AWS has added the following 10 services to the current FINMA scope:

Customers can find the FINMA ISAE 3000 report on AWS Artifact. To learn more about the complete list of services in scope, see AWS Compliance Programs and AWS Services in Scope for FINMA.

If you have feedback about this post, submit comments in the Comments section below.

AWS renews MTCS Level 3 certification under the SS584:2020 standard

2025-02-05 Joseph Goh

Post Syndicated from Joseph Goh original https://aws.amazon.com/blogs/security/aws-renews-mtcs-level-3-certification-under-the-ss5842020-standard/

Amazon Web Services (AWS) is pleased to announce the renewal of the Multi-Tier Cloud Security (MTCS) Level 3 certification under the SS584:2020 standard in December 2024 for the Asia Pacific (Singapore), Asia Pacific (Seoul), and United States AWS Regions, excluding AWS GovCloud (US) Regions. This achievement reaffirms our commitment to maintaining the highest security standards for our global customers, particularly those in Singapore and the Asia-Pacific.

AWS was the first cloud service provider (CSP) to attain MTCS Level 3 certification for Singapore in 2014. We continued this leadership by being among the first CSPs certified under the updated SS584:2020 Level 3 standard in 2021. Our dedication to expanding our security coverage is evident in the significant increase of in-scope services from 145 to 184, representing a 27% growth since 2021.

The MTCS standard is recognized as the world’s first cloud security standard to specify a multi-tiered management system for cloud security. This standard can be applied by CSPs to support differing cloud user needs for data sensitivity and business criticality, and the use of MTCS is mandated by the Singapore government as a requirement for public sector agencies and regulated organizations.

As part of our commitment to transparency, AWS fulfills the self-disclosure requirement for CSPs, providing detailed service-oriented information typically found in service level agreements. This allows our customers to make informed decisions about their cloud security needs.

The MTCS framework establishes three levels of security, with Level 3 being the most stringent:

Level 1: Designed for non-business-critical data and systems with baseline security controls.
Level 2: Addresses the needs of organizations that run business-critical data and systems in public or third-party cloud systems.
Level 3: Tailored for regulated organizations with specific and more stringent security requirements, including industry-specific regulations.

Benefits of the MTCS Level 3 certification

By achieving MTCS Level 3 certification, AWS helps Singapore customers in regulated industries to securely host applications and systems with highly sensitive information. This includes confidential business data, financial records, and medical records in a Level-3-compliant MTCS environment.

As cloud technology continues to evolve, AWS remains dedicated to maintaining and exceeding the highest security standards. Our renewed MTCS Level 3 certification under the SS584:2020 standard is a testament to this commitment, enabling our customers in Singapore and around the world to use AWS services with confidence for their most sensitive and critical workloads.

You can now download the latest MTCS certificates and the MTCS Self-Disclosure Form in AWS Artifact. Sign in to AWS Artifact in the AWS Management Console, or learn more at Getting Started with AWS Artifact. For a full list of AWS services that are certified under MTCS, see the AWS Multi-Tier Cloud Security (MTCS) page

If you have feedback about this post, submit comments in the Comments section below.

How AWS Network Firewall session state replication maximizes high availability for your application traffic

2025-02-05 Tushar Jagdale

Post Syndicated from Tushar Jagdale original https://aws.amazon.com/blogs/security/how-aws-network-firewall-session-state-replication-maximizes-high-availability-for-your-application-traffic/

AWS Network Firewall is a managed, stateful network firewall and intrusion protection service that you can use to implement firewall rules for fine grained control over your network traffic. With Network Firewall, you can filter traffic at the perimeter of your virtual private cloud (VPC); including filtering traffic going to and coming from an internet gateway, NAT gateway, over VPN or AWS Direct Connect. In this post, we show you how AWS Network Firewall uses session state replication to help maintain high availability for your application traffic.

Network Firewall session state replication

The stateful engine used by Network Firewall applies network security policies according to customer-defined rules. When inspecting stateful flows, Network Firewall maintains a reconstructed state—stored in a flow table—for each connection. Network Firewall is a distributed service, spreading traffic and the connection state over many backend firewall hosts using Gateway Load Balancer endpoints.

There are several operational reasons why backend hosts might be brought in and out of service, such as autoscaling events to adjust for changing traffic levels or installing software for security and other service updates. During these operations, if the traffic contains long-lived traffic flows, Network Firewall allows these flows to drain for several minutes before replacing the hosts and re-balancing them to the newer hosts.

When an existing flow is rebalanced onto a new host that doesn’t contain its connection state, the connection is handled according to the firewall policy’s configured stream exception policy. Network Firewall will either drop or reject these connections, or Network Firewall can be configured to continue applying the firewall policy’s rules without the context from earlier in the connection. Both choices have implications: using the drop or reject action maintains security by forcing connections to be restarted and re-inspected but at the cost of some broken connections, while the continue action requires writing firewall rules that can accept connections that are broken midstream.

In December 2024, AWS introduced the ability for Network Firewall to replicate the session state between backend hosts, reducing the number of cases where the stream exception policy needs to be applied to broken connections. Now, the majority of these failed-over flows go to a new host that already contains the correct flow state, allowing those connections to continue without interruption. This feature is automatically enabled by default on all firewalls and no action is required by you. The stream exception policy will continue to be applied in rare cases where the state cannot automatically be replicated or when connections are broken for other reasons such as routing changes in the network.

Figure 1: Session state replication flow

Figure 1 shows the sequence of events to maintain persistent connection during operations to replace a backend firewall host:

A network flow arrives at the Network Firewall endpoint and is forwarded to a firewall backend host (firewall host 1) by Gateway Load Balancer.
Firewall host 1 is de-registered from the Gateway Load Balancer target group, which causes Gateway Load Balancer to stop assigning new flows to the host but maintains existing ones.
The service exports the remaining session state table from backend firewall host 1.
The service replicates the session table data to other healthy backend hosts.
A flow flush operation causes Gateway Load Balancer to reassign the remaining flows on host 1 to other in-service hosts, where the ongoing flows will continue being inspected by the stateful inspection rules configured on the firewall.

Some of the key considerations for your own workloads:

This new feature only applies to the firewall stateful engine.
Both IPv4 and IPv6 connection states are supported.
Network Firewall doesn’t support asymmetric routing. See Avoiding asymmetric routing with AWS Network Firewall for more information..
The configured stream exception policy will continue to be applied in rare cases where state cannot be replicated and when connections are broken for other reasons. The StreamExceptionPolicyPackets Amazon CloudWatch metric continues to reflect the number of times that the exception policy is applied to arriving packets. For more information, see Stream exception policy options in your AWS Network Firewall firewall policy .

Conclusion

In this post, we outlined how AWS Network Firewall uses its ability to replicate connection state across multiple backend firewall hosts to maintain high availability for your application traffic. This feature is enabled by default for existing and new customers and there are no additional costs or configuration changes required to use this feature.

To learn more about AWS Network Firewall, see the AWS Network Firewall product page and the service documentation. To see which AWS Regions AWS Network Firewall is available in, see AWS Services by Region.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Implement effective data authorization mechanisms to secure your data used in generative AI applications – part 2

2025-02-03 Riggs Goodman III

Post Syndicated from Riggs Goodman III original https://aws.amazon.com/blogs/security/implement-effective-data-authorization-mechanisms-to-secure-your-data-used-in-generative-ai-applications-part-2/

In part 1 of this blog series, we walked through the risks associated with using sensitive data as part of your generative AI application. This overview provided a baseline of the challenges of using sensitive data with a non-deterministic large language model (LLM) and how to mitigate these challenges with Amazon Bedrock Agents. The next question that might come to mind is where and how to use sensitive data across LLM training, fine-tuning, vector databases, agents, and tooling. In this blog post, we build on part 1 with a more detailed discussion about data governance. Then, knowing the data governance baseline, we walk through how to use sensitive data across different data sources with the correct data authorization model. Lastly, we talk about how to implement data authorization mechanisms as part of a generative AI application that uses Retrieval Augmented Generation (RAG) as part of the architecture.

Data governance with LLMs

In this section, we’ll look into data governance as part of the overall data security landscape in more detail than we did in part 1. Many traditional workloads rely upon structured data stores, such as relational databases, for their data source. In contrast, one of the main benefits when you build a generative AI application is the ability to gain insight from massive amounts of both structured (schema-based) and unstructured data, including logs, documents, warehouse data, and other data sources. In the past, access to the unstructured data was limited to specific applications where authorization was granted to specific principals. In this type of architecture, a frontend application makes a decision whether to authorize the user’s access to the data and then uses a single AWS Identity and Access Management (IAM) role to the backend data source, providing access to the data in an object store, data warehouse, or other location. Access to the application incorporates authorization decisions to permit or deny access to the data or a subset of data. AWS has implemented access patterns tied to principal identity, including AWS trusted identity propagation and Amazon Simple Storage Service (Amazon S3) access grants.

Managing data access for generative AI applications presents challenges when you’re interested in using multiple data sources as part of the application, due to a lack of visibility into what data exists in what location and whether there is sensitive data as part of the data source. With data stored across locations, departments, and systems, many customers don’t know what data they have within each data source. And if you don’t know what data you have, it’s difficult to determine authorization policies to govern access to this data with generative AI applications, or whether the data source should be part of the generative AI application to begin with. From a data governance perspective, you need to look across four different pillars of the process: data visibility, access control, quality assurance, and ownership. Do the data sources include customer data? Is it internal data? Is it a combination of both? Do you need to remove certain objects or documents from the data source to align to the business goals of the application you’re building? Who should be authorized to access that data if you have different authorization levels within generative AI applications? AWS provides services in this area, including AWS Glue, Amazon DataZone, and AWS Lake Formation, to govern data to use with generative AI applications. Having a grasp on data governance is a critical prerequisite to implementing the data authorization capabilities we discuss in this post.

With that, how do you securely integrate sensitive data into your generative AI applications? Let’s walk through the different locations where sensitive data can exist: LLM training and fine-tuning, vector databases, tools, and agents.

LLM training and fine-tuning

The first location where sensitive data might reside within a generative AI application is in the LLM itself. The majority of foundation models (FMs) and LLMs are built and developed by third-party organizations, including Anthropic, Cohere, Meta, and other model providers. In these models, LLMs are becoming increasingly large, training on trillions of data points across both regular data and synthetic data created by other LLMs. However, most model providers today do not disclose the data sources used by the models because of privacy and proprietary reasons. FMs developed by third-party organizations are not trained on your private data, but if you are a large enterprise, you might train your own LLMs using sensitive data, licensed data, and public data for your use cases, or you might fine-tune existing models with additional data. This allows you to choose which data to include in training the model.

However, and as mentioned in part 1, LLMs do not make data authorization decisions, which causes challenges with granting access to different groups of principals. It is your application that will decide whether a given principal should be authorized to invoke the model. In addition, if you need to remove data from the LLM, the only way to remove training or fine-tuned data today is to retrain the model without that data. Although fine-tuning and prompt engineering can influence the completions the LLM returns, training data or fine-tuned data can be returned to whomever has access to query the model. Therefore, if you choose to fine-tune an existing model, carefully consider what data you use during training. Proprietary data that is included in training can be accessible to users who perform inference using that model. You should carefully evaluate training data to remove personally identifiable information (PII) or data that requires additional authorization above that which is required to access the model itself.

It’s important to note that there are LLM guardrails that support responsible AI mechanisms. For example, Amazon Bedrock Guardrails implementations remove certain content from prompts and completions. However, guardrails are non-deterministic and focus on filtering out harmful content, denied topics, word filters, or PII data from prompts and completions.

Important: You should not rely on responsible AI mechanisms such as guardrails or built-in model safety mechanisms for your data security, because they do not use identity as a signal as part of the filtering.

Retrieval-Augmented Generation (RAG)

The second location where sensitive data sits in generative AI applications is in vector databases. RAG implementations provide generative AI applications with access to contextual information from your organization’s private data sources to deliver relevant, accurate, and customized responses from LLMs. RAG allows you to add additional context to a prompt that is sent to an LLM and does not require you to train or fine-tune a model with your own data. When you use RAG as part of the generative AI application, you query the vector database to find documents or chunks of information similar to the principal’s prompt. Data that is returned from the vector database will be sent to the model with the original prompt as additional context for the request. For AWS services, we implement RAG using Amazon Bedrock Knowledge Bases and Amazon Q Connectors.

Figure 1 shows the RAG runtime execution flow with vector databases and models. When a user queries the application, the query is turned into embeddings that the vector database uses to find documents that are similar to the query. These documents or chunks are sent to the LLM to augment the original query from the user, so that the LLM can generate a response.

Figure 1: RAG runtime execution flow

In order to implement strong data authorization with RAG, you need to authorize the data before sending the additional content as part of the prompt to the LLM. This can be implemented at the generative AI application or the vector database. With RAG, you build your own authorization workflow within your application and perform authorization at different granularity levels. If you authorize the access to the vector database itself, then you allow a user with access to the application access to documents within the vector database. Therefore, for example, if you have two departments (such as finance and HR), you can create two vector databases, one for finance and one for HR. Principals who have the finance entitlement will be allowed access to the finance vector store, but not the one for HR, and vice versa.

What if you want to shift authorization granularity into the vector database itself? In a different deployment, if the vector database includes documents for separate groups of principals in the vector database, the API call to the vector database must include information on the group membership for the principal making the request. For example, if HR employees have access to certain documents within the vector database, the generative AI application or vector database must authorize whether the principal has access to the data that is returned. You can implement document-level filtering in Amazon Bedrock Knowledge Bases by using the retrievalConfiguration metadata field as part of the API call. As shown in the example in the next section, with metadata filtering, you add metadata key/value pairs that the vector database uses to filter the results that are returned, similar to group membership. Because the metadata filter is part of the API request and not the prompt, threat actors cannot use prompt injections to get access to data they are not authorized to access—authorization is tied to the principal’s identity that is passed to the frontend application and the metadata filters that are passed to the RAG implementation.

In order to build secure RAG implementations, it’s important that you use the correct authorization and data governance implementation. The data sent to the LLM should include only data the principal is authorized to have access to. LLM and guardrail features are probabilistic, and therefore they should not be used to make data authorization decisions.

Tools

A third pattern used by generative AI applications to interface with sensitive data is function or tool calling. With tools, the LLM doesn’t directly call the tool. Rather, when you send a request to an LLM, you also supply a definition for one or more tools that help the LLM generate a response. If the LLM determines it needs the tool to generate a response for the message, the LLM responds with a request for the application to call the tool. It also includes the input parameters to pass to the tool. Then, in the generative AI application, the application calls the tool on the LLM’s behalf, for example an API, an AWS Lambda function, or other software. The application continues the conversation with the LLM by providing the output from the tool as part of the prompt, and then the LLM generates a response based on the new data. This runtime execution flow is shown in Figure 2.

Figure 2: Tools runtime execution flow

Although the LLM decides whether a tool is required, the application code must perform security checks on the parameters passed back by the LLM and make authorization decisions on what tools can be called, what permissions the tool should have, and what actions can be taken. Traditional security mechanisms still apply. For example, tools should be sandboxed so that the side effects of running the tool will not affect future invocations. In addition, parameters generated by the LLM for use by the tool should be sanitized before they are passed into the tool to help avoid potential privilege escalation or remote code execution issues (for more information, see the OWASP top 10 for LLM, Improper Output Handling).

As with the other generative AI patterns mentioned earlier, the application also makes the tool authorization decisions. Similar to RAG implementations, the generative AI application decides on the appropriate authorization implementation, including application-level authorization, group-level authorization, or user-level authorization, or passes that decision to the tool through the use of an identity token, which was part of the discussion of agents in the part 1 post. With these capabilities, you can use multiple types of data sets (sensitive data, public data) in a function call implementation. However, as with authorization decisions with APIs today, authorization decisions in generative AI applications should be made based on the identity of the principal that is accessing the generative AI application and validated as part of every call to the tool. As mentioned previously, you should not allow the LLM to decide which authorization level a principal should have access to, because this can lead to excess agency (for more information, see the OWASP Top 10 for LLM Applications, Excess Agency).

Agents

The fourth pattern that we spoke about at length in the previous post is the use of agents. Here, we’ll discuss how to make use of multiple different data sources with agents. An agent helps principals complete multi-step actions based on principal input and data provided to the model. Agents, including Amazon Bedrock Agents, orchestrate between LLMs, data sources (RAG), software applications (tools), and principal conversations. With an agent, you choose an LLM that the agent invokes to interpret prompt input and subsequent prompts in its orchestration process, including generating follow-up steps. You configure the agent with actions, which might include eliciting clarification from the end user through additional questions, function calling for API operations, or RAG to augment the query with extra relevant context from knowledge bases. These actions are used during the orchestration process, which might take multiple steps, in order to answer the end user’s original query. These components are gathered to construct base prompts for the agent to perform orchestration until the principal request is complete, as shown in Figure 3.

Figure 3: Agent runtime execution flow

For agents and the use of external data sources, there are some additional considerations beyond the data authorization decisions we discussed earlier. First, in order to use the right data authorization context, identity information needs to be passed to the agent as part of the generative AI API call to the agent. With Amazon Bedrock Agents, this is done by using session attributes for tools and metadata filtering for vector databases. You use these attributes as part of calling different data sources within the agent configuration.

Second, the goal of using agents is to perform a task for the principal. Unlike RAG, these tasks may include making API calls to change data or take actions on behalf of the end user (principal). This differs from other data sources discussed previously, where the implementation for data access was data retrieval. With agents, the goal is to have the autonomous orchestration perform API actions, including the add, update, and delete categories of the function. You should take additional care when deciding the authorization you give principals as part of the execution flow of the agent. One option to consider when using agents is adding validation steps. This provides the principal (user) with validation steps for the work the agent performed before the agent changes data or makes calls to APIs to perform actions with data.

Now that we’ve discussed where and how to use data with generative AI applications, let’s walk through an example with a RAG implementation.

Data filtering and authorization with RAG

Let’s say you’re an enterprise that is interested in using a generative AI application for internal groups to retrieve information about policies and historical information. For this implementation, a single Amazon S3 data source for the vector database, which includes documents for both the Finance department and the HR department, is used as part of a RAG implementation. For our simplified example, users are interested in knowing what SECRET_KEY they need to use for their work. Each department has separate SECRET_KEY values that only users who are part of the respective groups have access to. The S3 bucket is the source of the Amazon Bedrock knowledge base, which the generative AI application uses as part of the implementation. This is shown in Figure 4.

Figure 4: Architecture overview with Finance and HR users accessing a generative AI application

Without any data authorization implemented, when an HR user queries the generative AI application, the Amazon Bedrock knowledge base will return the following results when using the Retrieve API call. (The Retrieve API call allows you to call the Amazon Bedrock knowledge base and have the results sent back to the generative AI application, in comparison to the RetrieveAndGenerate API call, which sends the results along with the prompt to the LLM without the generative AI application seeing the results from the knowledge base call until after the LLM responds to the prompt.)

aws bedrock-agent-runtime retrieve \
--knowledge-base-id FF6MZUZQMQ \
--retrieval-query text="What is the SECRET_KEY?"
{
    "retrievalResults": [
        {
            "content": {
                "text": "HR SECRET_KEY is HRBOT"
            },
            "location": {
                "s3Location": {
                    "uri": "s3://amzn-s3-demo-bucket/hr/hr.txt"
                },
                "type": "S3"
            },
            "metadata": {
                "x-amz-bedrock-kb-source-uri": "s3://amzn-s3-demo-bucket/hr/hr.txt",
                "x-amz-bedrock-kb-chunk-id": "1%3A0%3A5pe-v5IBdy11OzJ9mB2-",
                "x-amz-bedrock-kb-data-source-id": "OVJKWTMXQD",
                "group": "HR"
            },
            "score": 0.50864935
        },
        {
            "content": {
                "text": "Finance SECRET_KEY is FinanceBOT"
            },
            "location": {
                "s3Location": {
                    "uri": "s3://amzn-s3-demo-bucket/finance/finance.txt"
                },
                "type": "S3"
            },
            "metadata": {
                "x-amz-bedrock-kb-source-uri": "s3://amzn-s3-demo-bucket/finance/finance.txt",
                "x-amz-bedrock-kb-chunk-id": "1%3A0%3AvVK-v5IBeX5eb0Bilm5H",
                "x-amz-bedrock-kb-data-source-id": "OVJKWTMXQD"
            },
            "score": 0.4856355
        }
    ]
}

As shown, the SECRET_KEY for both the Finance department (FinanceBOT) and the HR department (HRBOT) are returned from the knowledge base, sourced from the respective prefixes in S3. However, to follow company policy, the Finance department and HR department do not want users outside the department to gain access to information within the S3 buckets that they are not authorized to view, including PII data for employees, unreleased financial data, internal HR policies, and other information that is only for users within each department. How would you go about implementing this restriction using the proper data authorization as described here?

There are two options for the solution. First, you could create two separate vector stores, one for Finance and one for HR. When a Finance user accesses the generative AI application, the application will only request data from the Finance vector store, because the user does not have authorization to the HR vector store. When an HR user accesses the generative AI application, it’s the opposite, with the application only allowing access to the HR vector store.

The second option is using a common vector store, where you might have common data for both departments in addition to sensitive data for the use of specific groups. Metadata filtering provides the generative AI application with a way to filter out context from the vector store at the vector store itself. When you add metadata as a *.metadata.json file that’s associated with an S3 object, you can apply filters within the Amazon Bedrock API call to filter out data that is returned by the knowledge base. For example, you can add metadata to both objects (hr.txt and finance.txt) within S3, by adding a hr.txt.metdata.json file and finance.txt.metadata.json file within the S3 bucket. When the vector database indexes from the S3 bucket, it will pull the metadata from the S3 bucket to allow you to filter on the metadata associated with the respective file. An example of the hr.txt.metadata.json file is shown following, along with the vectorSearchConfiguration filter that is used alongside the Retrieve API.

// hr.txt.metadata.json

{
    "metadataAttributes" : { 
        "group" : "HR"
    }
}

// retrieveconfiguration.json

{
    "vectorSearchConfiguration": {
        "filter": {
            "equals": {
                "key": "group",
                "value": "HR"
            }
        }
    }
}

With both of these metadata files in place, you will reindex the knowledge base to associate the metadata with each file. When you call the knowledge base with the filter as part of the API call, you get the following response:

aws bedrock-agent-runtime retrieve \
--knowledge-base-id FF6MZUZQMQ \
--retrieval-configuration="file://retrieveconfiguration.json" \
--retrieval-query text="What is the SECRET_KEY?"
{
    "retrievalResults": [
        {
            "content": {
                "text": "HR SECRET_KEY is HRBOT"
            },
            "location": {
                "s3Location": {
                    "uri": "s3://amzn-s3-demo-bucket/hr/hr.txt"
                },
                "type": "S3"
            },
            "metadata": {
                "x-amz-bedrock-kb-source-uri": "s3://amzn-s3-demo-bucket/hr/hr.txt",
                "x-amz-bedrock-kb-chunk-id": "1%3A0%3A5pe-v5IBdy11OzJ9mB2-",
                "x-amz-bedrock-kb-data-source-id": "OVJKWTMXQD",
                "group": "HR"
            },
            "score": 0.49277097
        }
    ]
}

As you can see, you only receive the chunks from the HR folder, because only the hr.txt object has the "group' : "HR" metadata applied to the objects. Due to this, the generative AI application can pass these chunks along with your prompt to the LLM for the user to receive the SECRET_KEY. You can find more information on metadata filtering in the blog post Amazon Bedrock Knowledge Bases now supports metadata filtering to improve retrieval accuracy.

Regardless of how you assign metadata to objects within the data source, the filter used with the API call is applied after the data authorization decision is made by the generative AI application. When a user logs in to the generative AI application, the application authenticates the user to identify who the user is and what department the user is in through the use of OpenID Connect (OIDC) or OAuth2, depending on the application. This step is required if you want your generative AI application to have strong authorization policies. After the generative AI application authenticates the user, it will authorize the user and apply the filters that are required when making API calls to the Amazon Bedrock knowledge base. It’s worth repeating that it’s the application that makes the data authorization decision, and the resulting API call to the knowledge base is post-authorization. By passing metadata through a secure side channel within the API and not the prompt, this practice helps to prevent threat actors and unintended users from gaining access to data they aren’t authorized to access.

Conclusion

Implementing the correct data authorization mechanisms is a foundational step that is required when you use sensitive data as part of generative AI applications. Depending on where the data sits as part of the generative AI application, you will need to use different implementations of data authorization, and there isn’t a one-size-fits-all solution. In this post, we walked through how to use sensitive data across these different data sources with the correct data authorization model. Then, we discussed how to implement data authorization mechanisms as part of a generative AI application and RAG by using metadata filtering. For additional information on generative AI security, take a look at other blog posts in the AWS Security Blog Channel and AWS blog posts covering generative AI.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Amazon Redshift enhances security by changing default behavior in 2025

2025-01-30 Yanzhu Ji

Post Syndicated from Yanzhu Ji original https://aws.amazon.com/blogs/security/amazon-redshift-enhances-security-by-changing-default-behavior-in-2025/

Today, I’m thrilled to announce that Amazon Redshift, a widely used, fully managed, petabyte-scale data warehouse, is taking a significant step forward in strengthening the default security posture of our customers’ data warehouses. Some default security settings for newly created provisioned clusters, Amazon Redshift Serverless workgroups, and clusters restored from snapshots have changed. These changes include disabling public accessibility, enabling database encryption, and enforcing secure connections.

Amazon Redshift already supports encryption in transit and at rest. Database encryption is crucial because it helps safeguard sensitive data from unauthorized access. Furthermore, restricting public access can be advantageous because it limits the threat surface and helps prevent unauthorized access to the database. By confining the Amazon Redshift cluster within the customer’s virtual private cloud (VPC), the cluster is isolated from the public internet, which significantly reduces the possibility of unauthorized parties discovering and accessing the data warehouse.

Enforcing secure connections is another essential security measure. This enforces encryption of communication between the applications and the database, reducing the risk of eavesdropping and man-in-the-middle exploits, which helps protect the confidentiality and integrity of the data being transmitted.

By implementing additional security defaults for newly created provisioned clusters, Serverless workgroups, and clusters restored from snapshots, Amazon Redshift helps customers adhere to best practices in data security without requiring additional setup, reducing the risk of potential misconfigurations.

These new security enhancements include three key changes:

Disabling public access by default – Public accessibility will be disabled by default for newly created or restored provisioned clusters. This means that the newly created clusters will be accessible only within your VPC and not accessible from the public internet. With this change, if you create a provisioned cluster from the AWS Management Console, then the cluster is created with public access disabled by default. Specifically, the PubliclyAccessible parameter will be set to false by default. This change will also be reflected in the CreateCluster and RestoreFromClusterSnapshot API operations and the corresponding console, AWS CLI, and AWS CloudFormation By default, connections to clusters will only be permitted from client applications within the same VPC. To access your data warehouse from applications in another VPC, you have to configure cross-VPC access.
If you still need public access, you must explicitly override the default and set the PubliclyAccessible parameter to true when you run the CreateCluster or RestoreFromClusterSnapshot API operations. With a publicly accessible cluster, we recommend that you always use security groups or network access control lists (network ACLs) to restrict access.
Enabling encryption by default – With this change, the ability to create unencrypted clusters will no longer be available in the Amazon Redshift console. When you use the console, CLI, API, or CloudFormation to create a provisioned cluster without specifying an AWS Key Management Service (AWS KMS) key, the cluster will automatically be encrypted with an AWS-owned key. The AWS-owned key is managed by AWS.
This update might impact you if you are creating unencrypted clusters by using automated scripts or using data sharing with unencrypted clusters. If you regularly create new unencrypted consumer clusters and use them for data sharing, review your configurations to verify that the producer and consumer clusters are both encrypted to reduce the chance that you will experience disruptions in your data-sharing workloads.
Enforcing secure connections by default – With this change, a new default parameter group named default.redshift-2.0 will be introduced for newly created or restored clusters, with the require_ssl parameter set to true by default. New clusters created without a specified parameter group will automatically use the default.redshift-2.0 parameter group. When you create a cluster through the console, the new default.redshift-2.0 parameter group will be automatically selected. This change will also be reflected in the CreateCluster and RestoreFromClusterSnapshot API operations, as well as in the corresponding console, AWS CLI, and AWS CloudFormation operations.
For customers who are using existing or custom parameter groups, the service will continue to honor the require_ssl value specified in your parameter group. However, we recommend that you update the require_ssl parameter to true in order to enhance the security of your connections. You continue to have the option to change the require_ssl value in your custom parameter groups as needed. You can follow the procedure in this topic in the Amazon Redshift Management Guide to configure security options for connections.

We recommend that all Amazon Redshift customers review their current configurations for this service and consider implementing the new security measures across their applications. These security enhancements could impact existing workflows that rely on public access, unencrypted clusters, or non-SSL connections. We recommend that you review and update your configurations, scripts, and tools to align with these new defaults.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Updated whitepaper available: Aligning to the NIST Cybersecurity Framework in the AWS Cloud

2025-01-29 Luca Iannario

Post Syndicated from Luca Iannario original https://aws.amazon.com/blogs/security/updated-whitepaper-available-aligning-to-the-nist-cybersecurity-framework-in-the-aws-cloud/

Today, we released an updated version of the Aligning to the NIST Cybersecurity Framework (CSF) in the AWS Cloud whitepaper to reflect the significant changes introduced in the National Institute of Standards and Technology (NIST) Cybersecurity Framework (CSF) 2.0, published in February 2024. This comprehensive update helps you understand how AWS services align with the enhanced framework and how you can use AWS capabilities to improve your cybersecurity posture.

The NIST CSF 2.0 provides guidance to industry, government agencies, and other organizations to manage cybersecurity risks. The updated version introduces important changes, including the following:

A new “Govern” Core Function, emphasizing procedural and organizational activities that have an impact on the management of cybersecurity risk within organizations.
An expanded scope, beyond critical infrastructure, to help organizations of many sizes and sectors.
Enhanced guidance for privacy risk management and supply chain security.
Updated Categories and Subcategories that better reflect current cybersecurity challenges.

In accordance with the AWS Shared Responsibility Model, the whitepaper provides a detailed mapping of AWS services to the six CSF Core Functions: Govern (New), Identify, Protect, Detect, Respond, and Recover. Organizations can use this whitepaper to understand how AWS services align with NIST CSF 2.0 requirements, implement AWS solutions to help achieve their security objectives, use AWS capabilities for automated security operations, and build resilient architectures that support their cybersecurity strategies.

Security and compliance remain our top priorities at AWS. This updated whitepaper demonstrates our commitment to helping customers align with the latest security frameworks while protecting their data and resources in the AWS Cloud. The whitepaper also includes practical guidance for implementing AWS services and features that support the CSF outcomes, whether you’re just starting your cloud journey or looking to enhance your existing security posture.

To learn more about implementing NIST CSF 2.0 in your organization by using AWS services, contact your AWS account team or download the whitepaper.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Testing and evaluating GuardDuty detections

2025-01-28 Marshall Jones

Post Syndicated from Marshall Jones original https://aws.amazon.com/blogs/security/testing-and-evaluating-guardduty-detections/

Amazon GuardDuty is a threat detection service that continuously monitors, analyzes, and processes Amazon Web Services (AWS) data sources and logs in your AWS environment. GuardDuty uses threat intelligence feeds, such as lists of malicious IP addresses and domains, file hashes, and machine learning (ML) models to identify suspicious and potentially malicious activity in your AWS environment. When GuardDuty identifies a potential security issue, it creates a GuardDuty finding that gives you information about what the potential security issue is, the resources involved, and contextualized information that’s key to remediating the issue. GuardDuty helps you monitor for the latest threats by continually expanding threat detection to emerging and common threats.

Whether you’re new to GuardDuty or are a long-time user, it’s recommended that you understand the different GuardDuty finding types and finding details and practice responding to them as suggested in the security pillar of AWS Well-Architected.

In this blog post, I dive deep into an open source tool for testing GuardDuty findings and then walk through three examples of how you can use this tool to test and improve your response to GuardDuty findings.

Overview

If you want to learn more about GuardDuty, you can read about the finding types in this AWS documentation. However, customers often want realistic findings in their environment to understand what a finding looks like and to practice responding hands on. While you can use GuardDuty to create sample findings in your environment, these findings are approximations populated with placeholder values and look different from real findings. Additionally, you cannot practice remediation with these findings because they’re not tied to actual resources in your account. This can be helpful if you only want to see what details are in a finding, but if you want to practice a real-world scenario, these sample findings might not be adequate.

To address this use case and provide customers with a secure and reliable way to test the threat detection capabilities of GuardDuty, the GuardDuty service team launched an open source project called GuardDuty Tester. The GuardDuty Tester creates infrastructure in your environment to simulate different security issues so that you can test GuardDuty findings that mirror actual security issues that you might encounter, such as crypto mining or a reverse shell being created on an Amazon Elastic Compute Cloud (Amazon EC2) instance. The GuardDuty Tester was originally released in 2018 as an AWS CloudFormation template and was focused more on testing investigation workflows than on a wide range of finding types. AWS has since released an updated version that uses the AWS Cloud Development Kit (AWS CDK) to make the infrastructure code easier to read and expanded the test coverage to over 100 unique finding types and resource combinations.

The ability to create findings across different resource types such as Amazon EC2, Amazon Simple Storage Service (Amazon S3), and Amazon Elastic Kubernetes Service (Amazon EKS) is a valuable resource for your security team, allowing them to simulate various types of threats with isolated infrastructure so that you don’t need to compromise your deployed workloads to improve response actions and techniques. Remember that the GuardDuty Tester doesn’t cover every possible scenario, but is instead focused on threat intelligence and rules-based findings. Anomaly-based findings, which require learning about how you operate your environment, aren’t included in the GuardDuty Tester.

Getting started with the GuardDuty Tester

The GuardDuty Tester is deployed by using the AWS CDK to create the required infrastructure and scripts to generate the GuardDuty findings. For safety, AWS recommends that you deploy the GuardDuty Tester in a nonproduction environment in an account that’s used specifically for this purpose. This way, your security team can differentiate between test GuardDuty findings and findings for other workloads that they’re monitoring.

In this post, I won’t walk through configuring the GuardDuty Tester because this is already documented in the GuardDuty documentation. Instead, I will go over what you need to know about the GuardDuty Tester and some of the benefits.

Figure 1 shows the GuardDuty Tester architecture, which includes the resources necessary to create GuardDuty findings for various protection plans such as Amazon S3 buckets, Amazon EC2 instances, and an Amazon EKS cluster. The tester also deploys a dedicated GuardDuty Tester instance where you will run the scripts needed to create the GuardDuty findings.

Figure 1: GuardDuty Tester architecture

The GuardDuty Tester provides key features including:

A wide range of threat scenario simulations: Resources that the GuardDuty Tester can create findings for include Amazon S3, AWS Identity and Access Management (IAM), Amazon Elastic Container Service (Amazon ECS) for both Amazon EC2 and AWS Fargate hosted workloads, Amazon EKS, and AWS Lambda and covers over 105 threat scenarios. This includes GuardDuty runtime monitoring as well as other GuardDuty protection plans.
Access through AWS Systems Manager: The GuardDuty Tester provides secure access by using Systems Manager to minimize open ports to the internet and allowing access only through Systems Manager.
Modular scripts: With an expanded library of tests available, the GuardDuty Tester accepts user parameters to set the scope of the tests to run, which gives you greater flexibility for different testing scenarios.

Setting up the GuardDuty Tester environment is straightforward and requires only a few commands. As outlined in the documentation and the README file in the repository, there are a number of prerequisites to set up the stack. These prerequisites include Python 3+, git, the AWS Command Line Interface (AWS CLI), AWS Systems Manager Session Manager plugin, npm, Docker, and a subscription to Kali Linux image for Amazon EC2. You will have to subscribe to the Kali Linux instance in AWS Marketplace, but will be charged for the instance only while the GuardDuty Tester is deployed. After these prerequisites are met, you can clone the repository, install the packages, and deploy the GuardDuty Tester to your AWS account.

Deploying the GuardDuty Tester can take 20–30 minutes, but if you’re following along with this post, I assume that you have deployed the GuardDuty Tester into your environment and have started your Systems Manager session as stated on Part A of Step 3 – Run tester scripts in the GuardDuty documentation. Now, I will dive into the first testing example.

Manual investigation

The first test use case is about getting familiar with what GuardDuty findings look like and the details that a finding gives you. This might be one of your first steps after turning on GuardDuty, or this might be an activity that you perform to help new team members understand GuardDuty findings.

To start a manual investigation:

Run the following command in your Systems Manager session to view the GuardDuty Tester options.
```
Python3 guardduty_tester.py --help
```
Run the following command in your Systems Manager session to create the first test finding.
```
Python3 guardduty_tester.py - -ec2 - -runtime only - -tactics impact
```
Before creating the findings, the GuardDuty Tester prompts you to confirm that it’s allowed to change GuardDuty settings in the environment. For example, if you’ve chosen to create findings related to the GuardDuty runtime monitoring feature but don’t have this feature enabled, the GuardDuty Tester will enable it for the tests and then disable it after testing is complete.

Note: This will start the 30-day trial of the enabled features in this account, in this AWS Region, even if the feature is disabled after testing is complete. More information about GuardDuty pricing and free trials can be found on the GuardDuty pricing page.
After choosing y which indicates “yes”, the GuardDuty Tester reports the number of domain reputation findings it’s expecting. Figure 2 shows an example of the expected findings. You can learn more about domain reputation findings in the GuardDuty finding documentation.

Figure 2: Generated GuardDuty findings in the console
After the GuardDuty Tester is finished, wait a few minutes and then go to the AWS Management Console for GuardDuty to see the findings. In this example, there are four new GuardDuty findings as expected from step 4 and shown in Figure 3. With the findings generated, you can start your manual investigation.

Figure 3: GuardDuty finding details

In the preceding figure, you can see some of the finding details presented—such as the action type and the process information—that can help you quickly identify what trigger started the suspicious communication. From here, I encourage you to use this finding to practice your runbooks for investigation and response. For example, you might start with validating and triaging the finding before moving into evidence collection and remediation. If you don’t have incident response runbooks already built, you can use this finding as an example to get started. There are multiple open source examples such as AWS incident response playbooks and AWS customer response playbook. A runbook will help your team evaluate the information provided in the GuardDuty finding and understand what else they need to know about your specific environment to properly respond to the finding. For example, in the finding, you will have resource and actor information but not things such as who is the account owner or point of contact for security for that account.

Creating alerts

The next use case highlights how to create alerts based on GuardDuty findings. When setting up alerting automation with tools such as Amazon Simple Notification Service (Amazon SNS) and Slack, you should create a finding using the GuardDuty Tester to test that you’ve configured your alert correctly. See Creating custom responses to GuardDuty findings for information about creating alerts with either of these tools. Figure 4 shows a sample EventBridge rule that will send GuardDuty findings to SNS.

Figure 4: EventBridge rule to send GuardDuty findings to SNS

For this post, I assume that you’ve already configured an Amazon EventBridge rule and Amazon SNS alert.

To test alerts:

Run the following command in your Systems Manager session to create a privileged container finding.
```
Python3 guardduty_tester.py - -finding ‘PrivilegedEscalation:Kubernetes/PrivilegedContainer’
```
Shortly after creating this finding, you should see an SNS alert based on the finding type.

Figure 5: SNS notification from a GuardDuty finding

If you’ve configured the alert correctly, you will see an email similar to Figure 5. The email demonstrates that SNS notifications were successfully configured and tested using the GuardDuty Tester. If this is a new finding, you will receive this SNS notification shortly after the GuardDuty Tester generates the finding, but if this is an updated finding, then the timing will be based on the notification frequency configured in the account.

There are many ways that customers consume GuardDuty findings in their environments. Whether you’re using Amazon SNS or another mechanism such as a chat application, ticketing system, or a security information and event management (SIEM) solution, you can use this example of an EventBridge rule and the GuardDuty Tester to test out your notification pipeline.

Automated response

For the third use case, I show you how to create an automated action based on a GuardDuty finding. In this example, I create a finding based on an EC2 instance connecting to a Bitcoin mining domain, then based on this finding, I use Lambda to tag the instance to assist with identification during that investigation steps that follow. Although this is a simple example, it shows you what you can do by combining EventBridge rules and Lambda functions. If you want to create an automated response for GuardDuty runtime monitoring findings that requires making a host-level modification, you can use EventBridge rules with AWS Systems Manager Run Command to run commands locally on a host to remediate a security issue.

Start by creating a Lambda function that will take a GuardDuty event delivered by EventBridge, pull out the instance ID information, and then use that as a parameter in the create_tags API call. See the following example code.

import json
import boto3
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    try:
        # Extract the necessary information from the GuardDuty finding
        instance_id = event['detail']['resource']['instanceDetails']['instanceId']
        account_id = event['detail']['accountId']
        region = event['detail']['region']

        # Create an EC2 client
        ec2 = boto3.client('ec2', region_name=region)

        # Add the "infected" and "cryptomining" tag value pair to the instance
        ec2.create_tags(
            Resources=[instance_id],
            Tags=[
                {
                    'Key': 'infected',
                    'Value': 'cryptomining'
                }
            ]
        )

        logger.info(f"Tagged instance {instance_id} with 'infected=cryptomining' in account {account_id} and region {region}")
        return {
            'statusCode': 200,
            'body': 'Instance tagged successfully'
        }
    except Exception as e:
        logger.error(f"Error tagging instance {instance_id}: {str(e)}")
        return {
            'statusCode': 500,
            'body': f"Error tagging instance: {str(e)}"
        }

Next, I create an EventBridge rule specific to the Bitcoin mining finding that I want to test, shown in Figure 6. The target is the Lambda function that I just created.

Figure 6: EventBridge rule for crypto mining GuardDuty finding

Now that the EventBridge rule is in place with the Lambda function as the target, I can use the GuardDuty Tester to trigger a Bitcoin mining finding and test my solution with the following command.

Python3 guardduty_tester.py - - finding ‘CryptoCurrency:EC2/BitcoinTool.B!DNS’

After the finding is generated, I go to my EC2 instance, where there’s a new instance tag with a key of infected and a value of cryptomining, shown in Figure 7.

Figure 7: Updated tags after automated response

Although this is a general example, you can use the same approach across various actions that you might take in response to a GuardDuty finding and then test them using the GuardDuty Tester. Examples include using Lambda to add logic in AWS WAF, a network access control list (network ACL), or AWS Network Firewall to block suspicious traffic, or use Systems Manager Run Command to end a malicious process that’s running on a host.

Conclusion

The updated GuardDuty Tester represents a significant advancement in helping organizations validate and gain confidence in GuardDuty threat detection. The GuardDuty Tester now provides more comprehensive coverage of GuardDuty runtime monitoring and protection plans across various AWS services.

By using the GuardDuty Tester and following the use cases in this post, you can proactively assess your threat detection readiness, identify potential gaps, and implement necessary measures to help you fortify your AWS environments against evolving cyber threats.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

AWS Firewall Manager retrofitting: Harmonizing central security with application team flexibility

2025-01-28 Ian Olson

Post Syndicated from Ian Olson original https://aws.amazon.com/blogs/security/aws-firewall-manager-retrofitting-harmonizing-central-security-with-application-team-flexibility/

AWS Firewall Manager is a powerful tool that organizations can use to define common AWS WAF rules with centralized security policies. These policies specify which accounts and resources are in scope. Firewall Manager creates a web access control list (web ACL) that adheres to the organization’s policy requirements and associates it with the in-scope resources. Figure 1 shows a Firewall Manager security policy and web ACL created in each in-scope account.

Figure 1: A Firewall Manager security policy and Firewall Manager created web ACLs in each in-scope account

In this post, we’ll talk about the benefits of retrofitting and how you can use this feature to allow Firewall Manager to manage existing web ACLs. When retrofitting is enabled, a Firewall Manager security policy doesn’t replace existing web ACLs. Instead, Firewall Manager adds the top and bottom rule sections to existing web ACLs associated with in-scope resources. For application teams, Firewall Manger no longer restricts how they configure and deploy AWS WAF. Teams can use either the AWS Management Console or infrastructure as code (IaC) tools to customize rules in web ACLs, even if those web ACLs are managed by Firewall Manager.

Firewall Manager before retrofitting

Firewall Manager offers significant benefits, but the existing approach results in several challenges:

Compatibility with infrastructure as code (IaC): Firewall Manager creates and associates AWS WAF web ACLs with in-scope resources. IaC tools expect to create and manage resources (in other words, own their lifecycle). Application teams cannot use IaC to manage the WAF rules and other web ACL configuration components that are created by Firewall Manager; there are custom solutions that inject locally defined rules into Firewall Manager-created web ACLs, but these are complex and have risks such as drift. For AWS WAF customers and application teams, this introduces an operational challenge.
Existing WAF migration: Customers who are already using AWS WAF must migrate existing rules to Firewall Manager-managed web ACLs.
Application-specific rule complexity: Forcing all in-scope resources in the same account and AWS Region to use the same web ACL makes application-specific or exception-based rules more complex. In addition, changes to one application’s rules could impact others that share the same web ACL.
Increased costs: When many applications share a single web ACL, application-specific rules are part of a single web ACL, which can increase total WAF capacity units (WCU) usage, sometimes resulting in higher AWS WAF request costs.

Firewall Manager with retrofitting addresses challenges

To address these challenges, Firewall Manager now offers the ability to retrofit existing web ACLs. Let’s get specific about when and how Firewall Manager retrofitting works:

Firewall Manager will only retrofit a web ACL when all associated resources (for example, Application Load Balancers, API Gateways, and Amazon CloudFront distributions) are in scope. If a not-in-scope resource is also associated, Firewall Manager will not retrofit or update future security policy changes to a retrofitted web ACL. In that scenario, associated in-scope resources and the web ACL are marked noncompliant with security policies.
Retrofitting only modifies customer-created web ACLs. Retrofitting will not act on web ACLs retrofitted by another security policy or managed by Firewall Manager. If either scenario occurs, the web ACL and associated resources are marked noncompliant with security policies.
Retrofitting adds the following on top of an existing web ACL that is associated with one or more in-scope resources:
- Retrofitting adds first rule groups and last rule groups defined in a security policy to the web ACL. Existing rules within the web ACL are not changed. The order of rule evaluation changes is the following:
  1. Security policy–defined first rule group rules
  2. Security policy–defined last rule group rules
- Retrofitting adds a WAF logging configuration if one is defined by the security policy. If the web ACL already has a logging configuration, Firewall Manager does not replace the existing logging configuration and marks the web ACL noncompliant.
- Retrofitting does not verify or configure other attributes that are defined by the security policy. This includes the following properties: default action, custom request headers, web ACL Captcha or challenge configurations, and token domain list. These properties are used only when Firewall Manager creates a web ACL.
If an in-scope resource that supports AWS WAF does not have a web ACL, Firewall Manager creates and associates a Firewall Manager-managed web ACL with that resource.

Figure 2 shows the rules and logging configuration before and after a Firewall Manager security policy retrofits a web ACL that is associated with in-scope resources.

Figure 2: Using retrofitting to update an existing web ACL

This new retrofitting capability solves the previous challenges in the following ways:

IaC compatible: Application teams can provision and manage AWS WAF with IAC tools. Firewall Manager retrofits existing web ACLs by adding rules defined in the security policy to web ACLs created by IaC tools. Application teams can manage AWS WAF exactly the same as when Firewall Manager was not in use.
Existing WAF integration: Customers with existing AWS WAF deployments can adopt Firewall Manager without the need to migrate existing WAF rules.
Application-specific rules: Multiple resources in the same account can use separate web ACLs, which simplifies application-specific rules.
Help prevent additional costs: Application-specific rules are applied only to the relevant web ACLs. This helps prevent increased AWS WAF request costs from shared web ACLs that have a high WCU usage.

By enhancing Firewall Manager to retrofit existing web ACLs, customers can use the power of centralized WAF management without restricting how AWS WAF is deployed and configured by application teams in member accounts.

Firewall Manager security policy for AWS WAF – Enabling retrofitting

Figure 3 shows an example of a Firewall Manager security policy.

Figure 3: An example Firewall Manager security policy that uses the new Retrofit existing webACLs feature

This security policy defines WAF rules in the first rule group section. It applies to all Application Load Balancers (ALBs) across your organization in AWS Organizations in the Region where this security policy is created. The policy action is set to automatically remediate. Under Web ACL management, there is a new section, Managed web ACL source, with two options, Default and Retrofit existing webACLs. Default is the existing behavior: Firewall Manager creates and associates a Firewall Manager managed web ACL. Retrofit existing webACLs applies the WAF rules and logging configuration (if any) defined by a security policy to existing web ACLs when they are associated with an in-scope resource. This policy specifies Retrofit existing webACLs. If an in-scope resource does not have a web ACL, Firewall Manager still creates and associates a web ACL by default.

Retrofitting in action

Let’s walk through what happens when you set Managed web ACL source to Retrofit existing webACLs. Figure 4 shows two ALBs that are in-scope of our security policy, LoadBalancer1 and LoadBalancer2.

Figure 4: Two existing ALBs, one with an existing web ACL and the other without

LoadBalancer1 has the following characteristics:

LoadBalancer1 does NOT have a web ACL associated.
After the Firewall Manager security policy applies, LoadBalancer1 is associated with a Firewall Manager created web ACL, as shown in Figure 5.

Figure 5: LoadBalancer1 is now associated with a Firewall Manager managed and created web ACL
The Firewall Manager-created web ACL contains the WAF rules defined in the security policy.

LoadBalancer2 has the following characteristics:

LoadBalancer2 has an existing customer created web ACL associated with it, as shown in Figure 6.

Figure 6: LoadBalancer2 is associated with a customer-created web ACL
This web ACL was created by the application team with an application-specific rule. The web ACL could have been created with the AWS Management Console, AWS CloudFormation, or other IaC tools like Terraform.
After the Firewall Manager security policy takes effect, LoadBalancer2 remains associated with the existing customer-created web ACL MyCustomWebACL.
Retrofitting adds WAF rules in the first rule group according to the security policy, as shown in Figure 7. Existing WAF rules are not changed, and rules and other aspects of the web ACL are not changed and can continue to be managed exactly as you would without Firewall Manager present.

Figure 7: Firewall Manager has retrofitted a customer-created web ACL and added the security policy–defined first rule groups rules

Figure 8 shows that both ALBs are now compliant with our security policy. LoadBalancer1 has a web ACL created by Firewall Manager; future ALBs that don’t have a web ACL associated would also become associated to this web ACL. LoadBalancer2 is associated with a web ACL the application team previously created. The application-defined WAF rules are not changed and the WAF rules defined by the security policy are added to this web ACL.

Figure 8: Multiple ALBs in scope for the same security policy

Associate a web ACL with Firewall Manager already in place

Let’s continue from our previous scenario. The application team for LoadBalancer1 now configures application-specific rules for their ALB by using the following process.

The application team creates a web ACL and defines their own WAF rules. They might do this with the console, AWS CloudFormation, or other IaC tools like Terraform.
The application team associates LoadBalancer1 with the custom web ACL.

Figure 9: From the web ACL, only LoadBalancer1 is associated with the app team’s web ACL
After a moment, Firewall Manager detects the change to an in-scope resource (LoadBalancer1). Firewall Manager retrofits the application team’s web ACL AppTeamNewWebACL, making it align with the security policy.

Figure 10: Firewall Manager has retrofitted the app team–created web ACL and added the security policy–defined first rule group rules

The diagram in Figure 11 shows the workflow that happens when the application team creates their own web ACL and associates it with LoadBalancer1. Firewall Manager detects the association change and retrofits the application team’s web ACL again, bringing LoadBalancer1 into alignment with security policies.

Figure 11: Firewall Manager retrofitting a web ACL in response to being associated with an existing in-scope resource

Out-of-scope resources associated with a retrofitted web ACL

Let’s make a change to our security policy. In Figure 12, the policy scope has been updated to only apply to resources when they have the resource tag Tier: Production. Application teams add this tag to LoadBalancer1 and LoadBalancer2.

Figure 12: The Firewall Manager security policy scope has been updated to include a resource tag

Later, an application team creates LoadBalancer3 and associates it with AppTeamNewWebACL (Figure 13).

Figure 13: A new ALB associated with a custom WAF web ACL

The application team does not tag LoadBalancer3, making it out of scope with the security policy, as shown in Figure 14.

Figure 14: A new ALB that is out of scope with a security policy

This web ACL is currently retrofitted by our security policy and associated with LoadBalancer2 (in-scope), as shown in Figure 15.

Figure 15: A retrofitted web ACL associated with in-scope and out-of-scope ALBs

Firewall Manager detects that an out-of-scope resource is associated with a retrofitted web ACL. As shown in Figure 16, Firewall Manager marks the web ACL AppTeamNewWebACL noncompliant. It also marks LoadBalancer2 noncompliant.

Figure 16: Shows retrofit-specific reasons a resource or web ACL will be marked noncompliant. In this case, a not-in-scope resource is associated with a web ACL where another associated resource is in-scope.

As long as the web ACL is noncompliant, Firewall Manager will not retrofit a non-retrofitted web ACL, and future security policy changes will not be applied to that web ACL. Existing retrofitted WAF rules for that web ACL are not modified or removed. When the web ACL later becomes compliant, it will again retrofit the latest state of the security policy. Figure 17 shows how Firewall Manager keeps the existing retrofitting but does not apply updates. Using our example, one of three things would need to happen for the web ACL to become compliant:

LoadBalancer3 can be made in-scope with the current security policy. The application team can then add the resource tag Tier: Production to this ALB.
The resource tag can be removed from the policy scope.
LoadBalancer3 can be disassociated with the web ACL.

Note: We recommend that you promptly address not-in-scope resources associated with web ACLs that are shared by in-scope resources. For example, if the preceding scenario was not addressed and LoadBalancer3 was deleted three months later, Firewall Manager would at that point retrofit the web ACL (3 months later). This is not necessarily a problem, but could trigger unexpected changes to a web ACL’s rules.

Figure 17: Firewall Manager retains existing but not future changes as long as a not-in-scope resource is associated with this web ACL

In summary: Firewall Manager initially retrofits and will apply future updates to existing web ACLs while all associated resources are in-scope. When an out-of-scope resource is associated, an initial retrofit is delayed and security policy retrofit updates are paused until the out-of-scope resource is addressed.

Figure 18 demonstrates how Firewall Manager will not perform an initial retrofit of a web ACL associated with both in-scope and not-in-scope resources.

Figure 18: Firewall Manger will only perform an initial retrofit when only in-scope resources are associated with a web ACL

Conclusion

Firewall Manager verifies that in-scope resources adhere to relevant security policies or are marked noncompliant. You can enable retrofitting for Firewall Manager for AWS WAF to seamlessly enforce these security policies without changing how your application teams manage the configuration of their WAF rules.

Note: Retrofitting is only available for AWS Firewall Manager security policies for AWS WAF; it is not available for AWS WAF Classic.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Firewall Manager re:Post or contact AWS Support.

Announcing upcoming changes to the AWS Security Token Service global endpoint

2025-01-28 Palak Arora

Post Syndicated from Palak Arora original https://aws.amazon.com/blogs/security/announcing-upcoming-changes-to-the-aws-security-token-service-global-endpoint/

AWS launched AWS Security Token Service (AWS STS) in August 2011 with a single global endpoint (https://sts.amazonaws.com), hosted in the US East (N. Virginia) AWS Region. To reduce dependency on a single Region, STS launched AWS STS Regional endpoints (https://sts.{Region_identifier}.{partition_domain}) in February 2015. These Regional endpoints allow you to use STS in the same Region as your workloads, which improves both performance and reliability.

However, many customers and third-party tools continue to call the STS global endpoint, and as a result, these customers don’t get the benefits of STS Regional endpoints. To help improve the resiliency and performance of your applications, we are making changes to the STS global endpoint, with no action required from customers. These changes will be released in the coming weeks.

In this blog post, we discuss the upcoming changes to the STS global endpoint and their benefits, and provide our recommendation on which STS endpoint to use going forward.

Upcoming changes to the STS global endpoint

The changes being made to the STS global endpoint will help enhance resiliency and improve performance. Today, all the requests to the STS global endpoint are served by the US East (N. Virginia) Region. Starting in early 2025, requests to the STS global endpoint will be automatically served in the same Region as your AWS deployed workloads. For example, if your application calls sts.amazonaws.com from the US West (Oregon) Region, your calls will be served locally in the US West (Oregon) Region instead of being served by the US East (N. Virginia) Region.

With this change, requests to the STS global endpoint will be served locally if your request originated from AWS Regions that are enabled by default.¹ However, requests to the STS global endpoint will continue to be served in US East (N. Virginia) Region if your request originated from opt-in Regions or if you used STS from outside AWS, such as in your on-premises network or data centers.

We will gradually roll out this change to AWS Regions that are enabled by default by mid-2025, starting with the Europe (Stockholm) Region.

We’ve taken the following measures to help avoid disruptions to your existing processes:

AWS CloudTrail logs for requests made to the STS global endpoints will be sent to the US East (N. Virginia) Region. CloudTrail logs for requests handled by STS Regional endpoints will continue to be logged to their respective Region in CloudTrail, even if the requests are served locally.
CloudTrail logs for operations performed by the STS global and Regional endpoints will have the additional fields endpointType and awsServingRegion to clarify which endpoint and Region served the request.
Requests made to the sts.amazonaws.com endpoints will have a value of us-east-1 for the aws:RequestedRegion condition key, regardless of which Region served the request.
Requests handled by the sts.amazonaws.com endpoints will not share a request quota with the Regional STS endpoints.

^{1. In addition, for your requests to be served locally, your DNS request for sts.amazonaws.com must be handled by an Amazon DNS Server in Amazon Virtual Private Cloud (Amazon VPC).}

Our recommendation

We continue to recommend that you use the appropriate STS Regional endpoints whenever possible. If you’re using STS from outside AWS, such as in your on-premises networks or data centers, we recommend you use the STS Regional endpoint that is hosted in the same Region as the AWS resource that you need STS credentials to access. If you’re building in opt-in Regions such as Asia Pacific (Hong Kong) or Asia Pacific (Jakarta), we recommend that you use the STS endpoint from the opt-in Region that is hosting your workload. By following the steps in the blog post How to use Regional AWS STS endpoints, you can identify workloads that are still using the global STS endpoint and get insights into how to reconfigure them when required.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Building a culture of security: AWS partners with the BBC

2025-01-27 Carter Spriggs

Post Syndicated from Carter Spriggs original https://aws.amazon.com/blogs/security/building-a-culture-of-security-aws-partners-with-the-bbc/

Cybersecurity isn’t just about technology—it’s about people. That’s why Amazon Web Services (AWS) partnered with the BBC to explore the human side of cybersecurity in our latest article, The Human Side of Cybersecurity: Building a Culture of Security, available on the BBC website.

In the piece, we spotlight the AWS Security Guardians program and how it helps security enable your business, not slow it down. Through the program, teams are provided tools, resources, and guidance to help address security concerns at every step of development. This enables developers to make the right decisions and embed security deeper in their solutions. Using our approach with their Security Champions initiative, the Commonwealth Bank of Australia shared how the program has helped transform their company culture and give ownership of services and security directly to their teams.

AWS and the Commonwealth Bank sought to understand how they could foster a culture where employees not only understand security, but care about it. They both needed a way to support new technology innovation while reducing security remediation work. This challenge offered both companies an opportunity to try something new with their Security Champions programs, enabling security expertise to be distributed across their respective organizations, which created a scalable solution with measurable results.

Read the full article on the BBC website to discover how AWS and our customers are creating a better environment for our teams, colleagues, customers, and communities.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

2024 C5 Type 2 attestation report available with 179 services in scope

2025-01-27 Tea Jioshvili

Post Syndicated from Tea Jioshvili original https://aws.amazon.com/blogs/security/2024-c5-type-2-attestation-report-available-with-179-services-in-scope/

Amazon Web Services (AWS) is pleased to announce a successful completion of the 2024 Cloud Computing Compliance Controls Catalogue (C5) attestation cycle with 179 services in scope. This alignment with C5 requirements demonstrates our ongoing commitment to adhere to the heightened expectations for cloud service providers. AWS customers in Germany and across Europe can run their applications in the AWS Regions that are in scope of the C5 report with the assurance that AWS aligns with C5 criteria.

The C5 attestation scheme is backed by the German government and was introduced by the Federal Office for Information Security (BSI) in 2016. AWS has adhered to the C5 requirements since their inception. C5 helps organizations demonstrate operational security against common cybersecurity threats when using cloud services.

Independent third-party auditors evaluated AWS for the period of October 1, 2023, through September 30, 2024. The C5 report illustrates the compliance status of AWS for both the basic and additional criteria of C5. Customers can download the C5 report through AWS Artifact, a self-service portal for on-demand access to AWS compliance reports. Sign in to AWS Artifact in the AWS Management Console, or learn more at Getting Started with AWS Artifact.

AWS has added the following 10 services to the current C5 scope:

The following AWS Regions are in scope of the 2024 C5 attestation: Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Milan), Europe (Paris), Europe (Stockholm), Europe (Spain), Europe (Zurich), and Asia Pacific (Singapore). For up-to-date information, see the C5 page of our AWS Services in Scope by Compliance Program.

AWS strives to continuously bring services into the scope of its compliance programs to help you meet your architectural and regulatory needs. If you have questions or feedback about C5 compliance, reach out to your AWS account team.

If you have feedback about this post, submit comments in the Comments section below.

CCN releases guide for Spain’s ENS landing zones using Landing Zone Accelerator on AWS

2025-01-23 Tomás Clemente Sánchez

Post Syndicated from Tomás Clemente Sánchez original https://aws.amazon.com/blogs/security/ccn-releases-guide-for-spains-ens-landing-zones-using-landing-zone-accelerator-on-aws/

Spanish version »

The Spanish National Cryptologic Center (CCN) has published a new STIC guide (CCN-STIC-887 Anexo A) that provides a comprehensive template and supporting artifacts for implementing landing zones that comply with Spain’s National Security Framework (ENS) Royal Decree 311/2022 using the Landing Zone Accelerator on AWS. Spain’s ENS establishes a common framework of basic principles and requirements of security for Spanish public sector organizations and their service providers, including supply chain providers. Over the years, the collaboration between Amazon Web Services (AWS) and the CCN has resulted in the publication of eight secure configuration guides (Series STIC 887) that provide comprehensive advice on the configuration of AWS services to align with the ENS. The guide CCN-STIC-887 Anexo A is the last addition to this series.

The centerpiece of this new guide is the ENS template for the Landing Zone Accelerator on AWS (LZA ENS). A landing zone serves as the initial setup of an organization’s cloud account or environment, including the implementation of security controls, access management, and compliance frameworks. The Landing Zone Accelerator on AWS is a powerful open source tool created by AWS for organizations that want to quickly customize and automate implementation of landing zones that align with AWS best practices and with regulatory compliance frameworks. This tool provides a comprehensive solution that, managed entirely by code, automatically configures over 35 AWS services using a simplified set of configuration files to manage and govern a multi-account environment, helping customers with highly regulated workloads and complex compliance requirements.

The CCN-STIC-887 Anexo A guide focuses on helping organizations implement landing zones that meet ENS security requirements from the ground up. It offers detailed instructions and templates for establishing a landing zone—the foundational infrastructure required for a secure, well-managed cloud environment—and a control matrix to demonstrate compliance with ENS controls.

Key components covered in the STIC 887H guide include:

Logging and monitoring: LZA ENS performs a default and scaled activation of the necessary logging and monitoring services required to meet ENS monitoring requirements in AWS services (such as AWS CloudTrail, Amazon CloudWatch, AWS Security Hub, and Amazon GuardDuty).
Access control: LZA ENS implements the management of identity and access management methods and policies at scale, which are aligned with the access control requirements of the ENS in a centralized manner using AWS IAM Identity Center.
Asset management: By default, LZA ENS activates inventory functions and resource and inventory tagging policies (for example, AWS Config) that support ENS asset management controls in the services.
Network topology: LZA ENS can be used to deploy a centralized network topology in accordance with ENS network security controls.
Cryptography: The encryption service activation capabilities built into LZA ENS can help organizations align with ENS data protection standards through mandatory encryption at rest, enforcement mechanisms with AWS Key Management Service (AWS KMS), and monitoring mechanisms to detect unencrypted data and communications with AWS Config rules.
Compliance and data residency: LZA ENS includes control policies to promote the use of AWS services with the ENS High certification and to provide processing on AWS in accordance with customers’ data residency requirements.

Organizations that require specific customizations to fully meet the requirements of the ENS can use LZA ENS to quickly modify and add customized security controls and then execute the scaled deployment of these controls to their accounts in the landing zone. One of the customizations included in LZA ENS is the integration of the open source security tool Prowler with Security Hub as an automated auditing tool with the objective of providing an up-to-date view of compliance with ENS controls. In addition, by providing a base designed for security and the flexibility to add custom controls, LZA ENS can support the process of achieving and maintaining compliance with the ENS in the AWS Cloud environment.

The CCN-STIC-887 Anexo A guide represents an important step forward in standardizing secure cloud deployments for Spanish public sector organizations and those working with government entities. This publication demonstrates the AWS commitment to support organizations in their secure cloud adoption journey while maintaining compliance with national security standards.

Spanish version

CCN publica la guía para las Zonas de Aterrizaje del ENS con AWS Landing Zone Accelerator

El Centro Criptológico Nacional de España (CCN) ha publicado una nueva guía STIC (CCN-STIC-887 Anexo A) que proporciona una plantilla de código y material de soporte para implementar zonas de aterrizaje (o landing zones) que cumplan con el Esquema Nacional de Seguridad del Real Decreto 311/2022 (ENS) mediante el Landing Zone Accelerator on AWS. El ENS establece un marco común de principios básicos, requisitos y medidas de seguridad para las organizaciones del sector público español y sus prestadores de servicios, incluyendo la cadena de suministro. A lo largo de los años, la colaboración entre Amazon Web Services (AWS) y el CCN se ha traducido en la publicación de ocho guías de configuración segura (serie STIC 887) que proporcionan consejo sobre la configuración de los servicios de AWS para alinearse con el ENS. La guía CCN-STIC-887 Anexo A es la última incorporación a esta serie.

La pieza central de la nueva guía es la plantilla ENS para el AWS Landing Zone Accelerator (LZA ENS). Una zona de aterrizaje (landing zone) sirve como la configuración inicial del entorno en la nube de una organización, e incluye la implementación inicial de controles de seguridad, la administración del acceso y los marcos de cumplimiento. El AWS Landing Zone Accelerator es una potente herramienta de código abierto creada por AWS para las organizaciones que desean implementar de forma rápida, segura, personalizada y automatizada zonas de aterrizaje alineadas con las prácticas recomendadas de AWS, así como con marcos de conformidad. Esta herramienta proporciona una solución integral que, mediante código, configura automáticamente más de 35 servicios de AWS con un conjunto simplificado de archivos de configuración para administrar y gobernar un entorno multicuenta, lo que ayuda a los clientes con cargas de trabajo altamente reguladas y requisitos de cumplimiento normativo.

La guía CCN-STIC-887 Anexo A se centra específicamente en ayudar a las organizaciones a implementar desde cero zonas de aterrizaje que cumplan con los requisitos de seguridad del ENS. Ofrece instrucciones y plantillas detalladas para establecer una zona de aterrizaje – la infraestructura básica necesaria para un entorno de nube seguro y bien administrado – así como una matriz de control para demostrar el cumplimiento de los controles del ENS.

Los componentes clave incluidos en la guía STIC 887H incluyen:

Registro y monitoreo: LZA ENS realiza una activación por defecto y a escala de los servicios de registro y monitoreo necesarios en AWS (como AWS CloudTrail, Amazon CloudWatch, AWS Security Hub, y AWS GuardDuty) para cumplir con los requisitos de monitoreo del ENS.
Control de acceso: LZA ENS implementa los métodos y políticas de administración de identidades y accesos a escala, que se alinean con los requisitos de control de acceso del ENS de manera centralizada mediante AWS IAM Identity Center..
Administración de activos: De forma predeterminada, el LZA ENS activa las funciones de inventario y las políticas de etiquetado de recursos e inventario (por ejemplo AWS Config) que soportan los controles de administración de activos del ENS.
Topología de red: LZA ENS se puede utilizar para implementar una topología de red centralizada de acuerdo con los controles de seguridad de red ENS.
Criptografía: las capacidades de activación de cifrado integradas en la LZA ayudan a organizaciones a alinearse con los estándares de protección de datos del ENS mediante el cifrado obligatorio en reposo, los mecanismos de aplicación con AWS Key Management Service (AWS KMS) y los mecanismos de supervisión para detectar datos y comunicaciones no cifrados con las reglas de AWS Config.
Cumplimiento y residencia de datos: LZA ENS incluye políticas de control para promover el uso de los servicios de AWS con la certificación del ENS Alto y realizar el procesamiento en AWS de acuerdo con los requisitos de residencia de datos del cliente.

Las organizaciones que requieren personalizaciones específicas para cumplir plenamente los requisitos del ENS pueden usar el LZA ENS para modificar rápidamente y añadir fácilmente controles de seguridad personalizados y ejecutar la implementación a escala de estos controles en sus cuentas de la zona de aterrizaje. Una de las personalizaciones que hemos incluido en el LZA ENS es la integración de Prowler con AWS Security Hub como una herramienta de auditoría automatizada, con el objetivo de proporcionar una visión actualizada del cumplimiento de los controles ENS de una manera fácil y eficaz. Además, al proporcionar una base diseñada para la seguridad y la flexibilidad de agregar controles personalizados, LZA ENS puede ayudar durante el proceso de obtener la conformidad con el ENS en el entorno de nube de AWS.

La guía CCN-STIC-887 Anexo A representa un importante paso adelante en la estandarización de las implementaciones seguras en la nube para las organizaciones del sector público español. Esta publicación demuestra el compromiso de AWS de apoyar a las organizaciones en su proceso de adopción segura de la nube, manteniendo al mismo tiempo el cumplimiento de las normas de seguridad nacionales.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Background

Solution overview

AWS Private CA short lived credentials

IAM Roles Anywhere

cert-manager

Using certificates and secrets

Using the CSI driver

Workload configuration

Solution deployment

Prerequisites

Step 1 – AWS Private CA

Step 2 – IAM Roles Anywhere

Step 3 – Create the init container

Step 4 – Install cert-manager

Step 5 – Deploy your workload

Step 6 – Test your deployment

Summary

Overview of automated domain lists and traffic insights

Preventive and detective security controls

Operational value

How the automated domain list feature works

Step 1: Enable traffic analysis mode to capture HTTP and HTTPS traffic domain logs

Step 2: Create a domain report

Step 3: Review the report details

Step 4: (Optional) Create a domain list rule group

Best practices for implementing domain allowlists

Conclusion

Solution overview

Using the aws:PrincipalArn condition

Granting same-account bucket access to a specific role

Granting cross-account bucket access to a specific IAM role

How and why does encryption work?

Encryption as part of your security strategy

Requirements for an encryption solution

Protecting keys at rest

Independent key management

Encrypting data at rest and in transit

Encrypting data in use

The role of end-to-end encryption in secure communications

Quantum computing and post-quantum cryptography

Encrypt everything, everywhere

Summary

Model selection

Model adaptation

Model customization

Amazon Bedrock Agents

Model testing

Model operation

Logging sensitive information

Review IAM permissions on a periodic basis

Conclusion

Background

Solution overview

Prerequisites

Implement the solution

Amazon EKS cluster setup

Consuming secrets in a single account

Working with cross-account secrets through resource policies

On Account B

On Account A

Conclusion

Implementing CISA’s enhanced visibility and hardening guidance for communications infrastructure

Overview of fundamental cloud concepts

Implementing hardening guidance with AWS

Logging and monitoring

Configuration and change management

Identity and access management

Network and traffic management

Strong cryptography to encrypt data at rest and in traffic

Vulnerability management

Conclusion

Benefits of the MTCS Level 3 certification

Network Firewall session state replication

Conclusion

Data governance with LLMs

LLM training and fine-tuning

Retrieval-Augmented Generation (RAG)

Tools

Agents

Data filtering and authorization with RAG

Using the `aws:PrincipalArn` condition