PyPI was subpoenaed

Post Syndicated from original https://lwn.net/Articles/932886/

It is, it seems, a week of Python Package Index (PyPI) news. On the PyPI blog, Director of Infrastructure at the Python Software Foundation (PSF), Ee Durbin, has posted an admirably detailed description of the organization’s response to three subpoenas it received for PyPI user information in March and April. The requests for information were quite broad and the PSF did produce the requested material (to the extent possible), which involved five PyPI user accounts, under the advice of counsel.

PyPI and the PSF are committed to the freedom, security, and privacy of our users.

This process has offered time to revisit our current data and privacy standards, which are minimal, to ensure they take into account the varied interests of the Python community. Though we collect very little personal data from PyPI users, any unnecessarily held data are still subject to these kinds of requests in addition to the baseline risk of data compromise via malice or operator error.

As a result we are currently developing new data retention and disclosure policies. These policies will relate to our procedures for future government data requests, how and for what duration we store personally identifiable information such as user access records, and policies that make these explicit for our users and community.

The post goes on to detail exactly which fields in the database tables were used to fulfill the request (without identifying the targets, naturally). Meanwhile, another statement in the post leaves open the possibility that further subpoenas have been received since that time:

We have waited for the string of subpoenas to subside, though we were committed from the beginning to write and publish this post as a matter of transparency, and as allowed by the lack of a non-disclosure order associated with the subpoenas received in March and April 2023.

Improve operational efficiencies of Apache Iceberg tables built on Amazon S3 data lakes

Post Syndicated from Avijit Goswami original https://aws.amazon.com/blogs/big-data/improve-operational-efficiencies-of-apache-iceberg-tables-built-on-amazon-s3-data-lakes/

Apache Iceberg is an open table format for large datasets in Amazon Simple Storage Service (Amazon S3) and provides fast query performance over large tables, atomic commits, concurrent writes, and SQL-compatible table evolution. When you build your transactional data lake using Apache Iceberg to solve your functional use cases, you need to focus on operational use cases for your S3 data lake to optimize the production environment. Some of the important non-functional use cases for an S3 data lake that organizations are focusing on include storage cost optimizations, capabilities for disaster recovery and business continuity, cross-account and multi-Region access to the data lake, and handling increased Amazon S3 request rates.

In this post, we show you how to improve operational efficiencies of your Apache Iceberg tables built on Amazon S3 data lake and Amazon EMR big data platform.

Optimize data lake storage

One of the major advantages of building modern data lakes on Amazon S3 is it offers lower cost without compromising on performance. You can use Amazon S3 Lifecycle configurations and Amazon S3 object tagging with Apache Iceberg tables to optimize the cost of your overall data lake storage. An Amazon S3 Lifecycle configuration is a set of rules that define actions that Amazon S3 applies to a group of objects. There are two types of actions:

  • Transition actions – These actions define when objects transition to another storage class; for example, Amazon S3 Standard to Amazon S3 Glacier.
  • Expiration actions – These actions define when objects expire. Amazon S3 deletes expired objects on your behalf.

Amazon S3 uses object tagging to categorize storage where each tag is a key-value pair. Apache Iceberg supports custom Amazon S3 object tags that can be added to S3 objects when writing to and deleting from the table. Iceberg also lets you configure a tag-based object lifecycle policy at the bucket level to transition objects to different Amazon S3 tiers. With the s3.delete.tags config property in Iceberg, objects are tagged with the configured key-value pairs before deletion. When the catalog property s3.delete-enabled is set to false, the objects are not hard-deleted from Amazon S3. This is expected to be used in combination with Amazon S3 delete tagging, so objects are tagged and removed using an Amazon S3 lifecycle policy. The s3.delete-enabled property is set to true by default.

The example notebook in this post shows an example implementation of S3 object tagging and lifecycle rules for Apache Iceberg tables to optimize storage cost.

Implement business continuity

Amazon S3 gives any developer access to the same highly scalable, reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of web sites. Amazon S3 is designed for 99.999999999% (11 9’s) of durability, S3 Standard is designed for 99.99% availability, and Standard – IA is designed for 99.9% availability. Still, to make your data lake workloads highly available in an unlikely outage situation, you can replicate your S3 data to another AWS Region as a backup. With S3 data residing in multiple Regions, you can use an S3 multi-Region access point as a solution to access the data from the backup Region. With Amazon S3 multi-Region access point failover controls, you can route all S3 data request traffic through a single global endpoint and directly control the shift of S3 data request traffic between Regions at any time. During a planned or unplanned regional traffic disruption, failover controls let you control failover between buckets in different Regions and accounts within minutes. Apache Iceberg supports access points to perform S3 operations by specifying a mapping of bucket to access points. We include an example implementation of an S3 access point with Apache Iceberg later in this post.

Increase Amazon S3 performance and throughput

Amazon S3 supports a request rate of 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per prefix in a bucket. The resources for this request rate aren’t automatically assigned when a prefix is created. Instead, as the request rate for a prefix increases gradually, Amazon S3 automatically scales to handle the increased request rate. For certain workloads that need a sudden increase in the request rate for objects in a prefix, Amazon S3 might return 503 Slow Down errors, also known as S3 throttling. It does this while it scales in the background to handle the increased request rate. Also, if supported request rates are exceeded, it’s a best practice to distribute objects and requests across multiple prefixes. Implementing this solution to distribute objects and requests across multiple prefixes involves changes to your data ingress or data egress applications. Using the Apache Iceberg table format for your S3 data lake can significantly reduce the engineering effort by enabling the ObjectStoreLocationProvider feature, which adds an S3 hash [0*7FFFFF] prefix in your specified S3 object path.

Iceberg by default uses the Hive storage layout, but you can switch it to use the ObjectStoreLocationProvider. This option is not enabled by default to provide flexibility to choose the location where you want to add the hash prefix. With ObjectStoreLocationProvider, a deterministic hash is generated for each stored file and a subfolder is appended right after the S3 folder specified using the parameter write.data.path (write.object-storage.path for Iceberg version 0.12 and below). This ensures that files written to Amazon S3 are evenly distributed across multiple prefixes in your S3 bucket, thereby minimizing the throttling errors. In the following example, we set the write.data.path value as s3://my-table-data-bucket, and Iceberg-generated S3 hash prefixes will be appended after this location:

CREATE TABLE my_catalog.my_ns.my_table
( id bigint,
data string,
category string)
USING iceberg OPTIONS
( 'write.object-storage.enabled'=true,
'write.data.path'='s3://my-table-data-bucket')
PARTITIONED BY (category);

Your S3 files will be arranged under MURMUR3 S3 hash prefixes like the following:

2021-11-01 05:39:24 809.4 KiB 7ffbc860/my_ns/my_table/00328-1642-5ce681a7-dfe3-4751-ab10-37d7e58de08a-00015.parquet
2021-11-01 06:00:10 6.1 MiB 7ffc1730/my_ns/my_table/00460-2631-983d19bf-6c1b-452c-8195-47e450dfad9d-00001.parquet
2021-11-01 04:33:24 6.1 MiB 7ffeeb4e/my_ns/my_table/00156-781-9dbe3f08-0a1d-4733-bd90-9839a7ceda00-00002.parquet
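To make the layout above concrete, here is a minimal Python sketch of how a deterministic hash prefix could be placed between the data path and the namespace. It is an illustration only: zlib.crc32 stands in for the hash that Iceberg actually uses, and the function name object_store_path is hypothetical.

```python
import zlib


def object_store_path(data_path: str, namespace: str, table: str, file_name: str) -> str:
    """Build an object-store style location with a hash prefix.

    zlib.crc32 is a stand-in for Iceberg's real hash function; it only
    illustrates that the prefix is deterministic for a given file name.
    """
    # Mask to 31 bits so the value behaves like a non-negative Java int.
    hash_value = zlib.crc32(file_name.encode("utf-8")) & 0x7FFFFFFF
    # Render the hash as 8 hex digits and place it right after the data path.
    return f"{data_path.rstrip('/')}/{hash_value:08x}/{namespace}/{table}/{file_name}"


example = object_store_path(
    "s3://my-table-data-bucket", "my_ns", "my_table", "00001-data.parquet"
)
```

Because the prefix depends only on the file name, the same file always maps to the same location, while different files tend to land under different prefixes.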

Using Iceberg ObjectStoreLocationProvider is not a foolproof mechanism to avoid S3 503 errors. You still need to set appropriate EMRFS retries to provide additional resiliency. You can adjust your retry strategy by increasing the maximum retry limit for the default exponential backoff retry strategy or enabling and configuring the additive-increase/multiplicative-decrease (AIMD) retry strategy. AIMD is supported for Amazon EMR releases 6.4.0 and later. For more information, refer to Retry Amazon S3 requests with EMRFS.
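The idea behind an additive-increase/multiplicative-decrease strategy can be sketched in a few lines of Python. This is not the EMRFS implementation; the class name, default values, and method names below are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class AimdRateLimiter:
    """Toy AIMD controller: grow the request rate slowly, shrink it quickly."""

    rate: float = 100.0           # current allowed requests per second (hypothetical start)
    increase_step: float = 10.0   # additive increase applied after a success
    decrease_factor: float = 0.5  # multiplicative decrease applied after a throttle
    min_rate: float = 1.0         # never drop below this rate

    def on_success(self) -> float:
        # Additive increase: probe for more capacity in small steps.
        self.rate += self.increase_step
        return self.rate

    def on_throttle(self) -> float:
        # Multiplicative decrease: back off sharply after a 503 Slow Down.
        self.rate = max(self.min_rate, self.rate * self.decrease_factor)
        return self.rate
```

The asymmetry is the point of the design: a throttled client backs off quickly, giving Amazon S3 time to scale, and then ramps up again gradually.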

In the following sections, we provide examples for these use cases.

Storage cost optimizations

In this example, we use Iceberg’s S3 tags feature with the write tag as write-tag-name=created and delete tag as delete-tag-name=deleted. This example is demonstrated on an EMR version emr-6.10.0 cluster with installed applications Hadoop 3.3.3, Jupyter Enterprise Gateway 2.6.0, and Spark 3.3.1. The examples are run on a Jupyter Notebook environment attached to the EMR cluster. To learn more about how to create an EMR cluster with Iceberg and use Amazon EMR Studio, refer to Use an Iceberg cluster with Spark and the Amazon EMR Studio Management Guide, respectively.

The following examples are also available in the sample notebook in the aws-samples GitHub repo for quick experimentation.

Configure Iceberg on a Spark session

Configure your Spark session using the %%configure magic command. You can use either the AWS Glue Data Catalog (recommended) or a Hive catalog for Iceberg tables. In this example, we use a Hive catalog, but we can change to the Data Catalog with the following configuration:

spark.sql.catalog.my_catalog.catalog-impl=org.apache.iceberg.aws.glue.GlueCatalog

Before you run this step, create an S3 bucket and an iceberg folder in your AWS account with the naming convention <your-iceberg-storage-blog>/iceberg/.

Update your-iceberg-storage-blog in the following configuration with the bucket that you created to test this example. Note the configuration parameters s3.write.tags.write-tag-name and s3.delete.tags.delete-tag-name, which will tag the new S3 objects and deleted objects with corresponding tag values. We use these tags in later steps to implement S3 lifecycle policies to transition the objects to a lower-cost storage tier or expire them based on the use case.

%%configure -f
{
  "conf": {
    "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
    "spark.sql.catalog.dev": "org.apache.iceberg.spark.SparkCatalog",
    "spark.sql.catalog.dev.catalog-impl": "org.apache.iceberg.hive.HiveCatalog",
    "spark.sql.catalog.dev.io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
    "spark.sql.catalog.dev.warehouse": "s3://<your-iceberg-storage-blog>/iceberg/",
    "spark.sql.catalog.dev.s3.write.tags.write-tag-name": "created",
    "spark.sql.catalog.dev.s3.delete.tags.delete-tag-name": "deleted",
    "spark.sql.catalog.dev.s3.delete-enabled": "false"
  }
}

Create an Apache Iceberg table using Spark-SQL

Now we create an Iceberg table for the Amazon Product Reviews Dataset:

spark.sql(""" DROP TABLE if exists dev.db.amazon_reviews_iceberg""")
spark.sql(""" CREATE TABLE dev.db.amazon_reviews_iceberg (
marketplace string,
customer_id string,
review_id string,
product_id string,
product_parent string,
product_title string,
star_rating int,
helpful_votes int,
total_votes int,
vine string,
verified_purchase string,
review_headline string,
review_body string,
review_date date,
year int)
USING iceberg
location 's3://<your-iceberg-storage-blog>/iceberg/db/amazon_reviews_iceberg'
PARTITIONED BY (years(review_date))""")

In the next step, we load the table with the dataset using Spark actions.

Load data into the Iceberg table

While inserting the data, we partition it by the year of review_date, as per the table definition. Run the following Spark commands in your PySpark notebook:

df = spark.read.parquet("s3://amazon-reviews-pds/parquet/product_category=Electronics/*.parquet")

df.sortWithinPartitions("review_date").writeTo("dev.db.amazon_reviews_iceberg").append()

Insert a single record into the same Iceberg table so that it creates a partition with the current review_date:

spark.sql("""insert into dev.db.amazon_reviews_iceberg values ("US", "99999999","R2RX7KLOQQ5VBG","B00000JBAT","738692522","Diamond Rio Digital",3,0,0,"N","N","Why just 30 minutes?","RIO is really great",date("2023-04-06"),2023)""")

You can check that a new snapshot was created after this append operation by querying the Iceberg snapshots metadata table:

spark.sql("""SELECT * FROM dev.db.amazon_reviews_iceberg.snapshots""").show()

You will see an output similar to the following showing the operations performed on the table.

Check the S3 tag population

You can use the AWS Command Line Interface (AWS CLI) or the AWS Management Console to check the tags populated for the new writes. Let’s check the tag corresponding to the object created by a single row insert.

On the Amazon S3 console, check the S3 folder s3://your-iceberg-storage-blog/iceberg/db/amazon_reviews_iceberg/data/ and point to the partition review_date_year=2023/. Then check the Parquet file under this folder to check the tags associated with the data file in Parquet format.

From the AWS CLI, run the following command to see that the tag is created based on the Spark configuration "spark.sql.catalog.dev.s3.write.tags.write-tag-name":"created":

aws s3api get-object-tagging --bucket your-iceberg-storage-blog --key iceberg/db/amazon_reviews_iceberg/data/review_date_year=2023/00000-43-2fb892e3-0a3f-4821-a356-83204a69fa74-00001.parquet

You will see output similar to the following, showing the associated tags for the file:

{ "VersionId": "null", "TagSet": [{ "Key": "write-tag-name", "Value": "created" } ] }

Delete a record and expire a snapshot

In this step, we delete a record from the Iceberg table and expire the snapshot corresponding to the deleted record. We delete the new single record that we inserted with the current review_date:

spark.sql("""delete from dev.db.amazon_reviews_iceberg where review_date = '2023-04-06'""")

We can now check that a new snapshot was created with the operation flagged as delete:

spark.sql("""SELECT * FROM dev.db.amazon_reviews_iceberg.snapshots""").show()

This is useful if we want to time travel and check the deleted row in the future. In that case, we have to query the table with the snapshot-id corresponding to the deleted row. However, we don’t discuss time travel as part of this post.

We expire the old snapshots from the table and keep only the last two. You can modify the query based on your specific requirements to retain the snapshots:

spark.sql("""CALL dev.system.expire_snapshots(table => 'dev.db.amazon_reviews_iceberg', older_than => DATE '2024-01-01', retain_last => 2)""")
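Instead of hard-coding the cutoff date, the procedure call can be generated from a retention window. The following is a minimal sketch; the helper name expire_snapshots_sql and the seven-day default are assumptions, and the timestamp literal is interpreted in the Spark session time zone.

```python
from datetime import datetime, timedelta


def expire_snapshots_sql(catalog: str, table: str, now: datetime,
                         keep_days: int = 7, retain_last: int = 2) -> str:
    """Build an expire_snapshots CALL statement with a computed cutoff."""
    # Snapshots older than this cutoff become candidates for expiration.
    cutoff = now - timedelta(days=keep_days)
    cutoff_literal = cutoff.strftime("%Y-%m-%d %H:%M:%S")
    return (
        f"CALL {catalog}.system.expire_snapshots("
        f"table => '{table}', "
        f"older_than => TIMESTAMP '{cutoff_literal}', "
        f"retain_last => {retain_last})"
    )


# In the notebook, the statement would be passed to the existing Spark session:
# spark.sql(expire_snapshots_sql("dev", "dev.db.amazon_reviews_iceberg", datetime.now()))
```

Keeping retain_last in the call means that the most recent snapshots survive even when they are older than the cutoff.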

If we run the same query on the snapshots, we can see that we have only two snapshots available:

spark.sql("""SELECT * FROM dev.db.amazon_reviews_iceberg.snapshots""").show()

From the AWS CLI, you can run the following command to see that the tag is created based on the Spark configuration "spark.sql.catalog.dev.s3.delete.tags.delete-tag-name":"deleted":

aws s3api get-object-tagging --bucket your-iceberg-storage-blog --key iceberg/db/amazon_reviews_iceberg/data/review_date_year=2023/00000-43-2fb892e3-0a3f-4821-a356-83204a69fa74-00001.parquet

You will see output similar to the following, showing the associated tags for the file:

{ "VersionId": "null", "TagSet": [ { "Key": "delete-tag-name", "Value": "deleted" }, { "Key": "write-tag-name", "Value": "created" } ] }
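If you want to check these tags in a script rather than by eye, the JSON returned by get-object-tagging can be flattened into a dictionary. This is a small sketch; the helper names tags_from_response and is_marked_deleted are hypothetical.

```python
import json


def tags_from_response(response_json: str) -> dict:
    """Convert the TagSet list of a get-object-tagging response into a dict."""
    response = json.loads(response_json)
    return {tag["Key"]: tag["Value"] for tag in response.get("TagSet", [])}


def is_marked_deleted(tags: dict, key: str = "delete-tag-name",
                      value: str = "deleted") -> bool:
    """Return True when the object carries the configured delete tag."""
    return tags.get(key) == value


sample = (
    '{ "VersionId": "null", "TagSet": ['
    ' { "Key": "delete-tag-name", "Value": "deleted" },'
    ' { "Key": "write-tag-name", "Value": "created" } ] }'
)
tags = tags_from_response(sample)
```

For the sample response, tags holds both the write tag and the delete tag, so is_marked_deleted(tags) is true.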

You can view the existing metadata files from the metadata log entries metatable after the expiration of snapshots:

spark.sql("""SELECT * FROM dev.db.amazon_reviews_iceberg.metadata_log_entries""").show()

The snapshots that have expired show the latest snapshot ID as null.

Create S3 lifecycle rules to transition the objects to a different storage tier

Create a lifecycle configuration for the bucket to transition objects with the delete-tag-name=deleted S3 tag to the Glacier Instant Retrieval class. Amazon S3 runs lifecycle rules one time every day at midnight Coordinated Universal Time (UTC), and new lifecycle rules can take up to 48 hours to complete the first run. S3 Glacier Instant Retrieval is well suited to archive data that needs immediate access (with retrieval in milliseconds). With S3 Glacier Instant Retrieval, you can save up to 68% on storage costs compared to using the S3 Standard-Infrequent Access (S3 Standard-IA) storage class, when the data is accessed once per quarter.
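As a rough sketch, a tag-filtered lifecycle rule of this kind can be expressed as the following Python dictionary. The rule ID and the one-day transition delay are assumptions for illustration; the resulting structure follows the shape accepted by the boto3 S3 client method put_bucket_lifecycle_configuration.

```python
def glacier_ir_rule(tag_key: str = "delete-tag-name",
                    tag_value: str = "deleted",
                    days: int = 1) -> dict:
    """Build a lifecycle configuration that moves tagged objects to Glacier IR."""
    return {
        "Rules": [
            {
                # Hypothetical rule name; any unique ID works.
                "ID": f"transition-{tag_value}-objects",
                "Status": "Enabled",
                # Only objects carrying this tag are affected by the rule.
                "Filter": {"Tag": {"Key": tag_key, "Value": tag_value}},
                # Move matching objects to S3 Glacier Instant Retrieval.
                "Transitions": [{"Days": days, "StorageClass": "GLACIER_IR"}],
            }
        ]
    }


lifecycle_configuration = glacier_ir_rule()

# With boto3 installed and credentials configured, it could be applied like this:
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="your-iceberg-storage-blog",
#     LifecycleConfiguration=lifecycle_configuration,
# )
```

Note that applying a lifecycle configuration replaces any existing configuration on the bucket, so existing rules would need to be included in the same Rules list.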

When you need to access the data again, you can bulk restore the archived objects. After you restore the objects to the S3 Standard class, you can register the metadata and data as an archival table for query purposes. The metadata file location can be fetched from the metadata log entries metatable as illustrated earlier. As mentioned before, entries whose latest snapshot ID is null indicate expired snapshots. After the bulk restore, we can take the metadata file of one of the expired snapshots and register it as an archive table:

spark.sql("""CALL dev.system.register_table(table => 'db.amazon_reviews_iceberg_archive', metadata_file => 's3://avijit-iceberg-storage-blog/iceberg/db/amazon_reviews_iceberg/metadata/00000-a010f15c-7ac8-4cd1-b1bc-bba99fa7acfc.metadata.json')""").show()

Capabilities for disaster recovery and business continuity, cross-account and multi-Region access to the data lake

Because Iceberg doesn’t support relative paths, you can use access points to perform Amazon S3 operations by specifying a mapping of buckets to access points. This is useful for multi-Region access, cross-Region access, disaster recovery, and more.

For cross-Region access points, we need to additionally set the use-arn-region-enabled catalog property to true to enable S3FileIO to make cross-Region calls. If an Amazon S3 resource ARN is passed in as the target of an Amazon S3 operation that has a different Region than the one the client was configured with, this flag must be set to ‘true’ to permit the client to make a cross-Region call to the Region specified in the ARN; otherwise, an exception will be thrown. However, for the same or multi-Region access points, the use-arn-region-enabled flag should be set to ‘false’.

For example, to use an S3 access point with multi-Region access in Spark 3.3, you can start the Spark SQL shell with the following code:

spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \
--conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket2/my/key/prefix \
--conf spark.sql.catalog.my_catalog.catalog-impl=org.apache.iceberg.aws.glue.GlueCatalog \
--conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \
--conf spark.sql.catalog.my_catalog.s3.use-arn-region-enabled=false \
--conf spark.sql.catalog.my_catalog.s3.access-points.my-bucket1=arn:aws:s3::123456789012:accesspoint:mfzwi23gnjvgw.mrap \
--conf spark.sql.catalog.my_catalog.s3.access-points.my-bucket2=arn:aws:s3::123456789012:accesspoint:mfzwi23gnjvgw.mrap

In this example, the objects in Amazon S3 on my-bucket1 and my-bucket2 buckets use the arn:aws:s3::123456789012:accesspoint:mfzwi23gnjvgw.mrap access point for all Amazon S3 operations.

For more details on using access points, refer to Using access points with compatible Amazon S3 operations.

Let’s say your table path is under my-bucket1, so both my-bucket1 in Region 1 and my-bucket2 in Region 2 have paths of my-bucket1 inside the metadata files. At the time of the S3 (GET/PUT) call, we replace the my-bucket1 reference with a multi-Region access point.
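When several buckets map to the same access point, it can help to generate the catalog properties programmatically so that every property sits under one catalog name. The following is a minimal sketch; the helper name access_point_conf is hypothetical, and the property keys are the ones used in the spark-sql example above.

```python
def access_point_conf(catalog: str, bucket_to_arn: dict,
                      cross_region: bool = False) -> dict:
    """Build Spark conf entries that map buckets to S3 access points."""
    prefix = f"spark.sql.catalog.{catalog}"
    # use-arn-region-enabled is true only for cross-Region access points.
    conf = {f"{prefix}.s3.use-arn-region-enabled": str(cross_region).lower()}
    for bucket, arn in bucket_to_arn.items():
        # One property per bucket, all under the same catalog prefix.
        conf[f"{prefix}.s3.access-points.{bucket}"] = arn
    return conf


mrap_arn = "arn:aws:s3::123456789012:accesspoint:mfzwi23gnjvgw.mrap"
conf = access_point_conf("my_catalog", {"my-bucket1": mrap_arn, "my-bucket2": mrap_arn})
```

Each key/value pair in conf corresponds to one --conf argument on the spark-sql command line.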

Handling increased S3 request rates

When using ObjectStoreLocationProvider (for more details, see Object Store File Layout), a deterministic hash is generated for each stored file, with the hash appended directly after the write.data.path. The problem with this is that the default hashing algorithm generates hash values up to Integer MAX_VALUE, which in Java is (2^31)-1. When this is converted to hex, it produces 0x7FFFFFFF, so the first character is restricted to the range [0-7]. As per Amazon S3 recommendations, we should have the maximum variance in these leading characters to mitigate throttling.
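The limited variance of the first character can be checked with a short Python experiment. This sketch uses random 31-bit values as a stand-in for real file hashes, which is an assumption made only to keep the example self-contained.

```python
import random


def first_hex_chars(samples: int = 10_000, seed: int = 0) -> set:
    """Collect the leading hex digit of many 31-bit values."""
    rng = random.Random(seed)
    chars = set()
    for _ in range(samples):
        # Keep 31 bits, like a non-negative Java int (at most 0x7FFFFFFF).
        value = rng.getrandbits(32) & 0x7FFFFFFF
        # Zero-pad to 8 hex digits and keep only the first one.
        chars.add(f"{value:08x}"[0])
    return chars


leading = first_hex_chars()
```

Every element of leading falls in the range 0 to 7, so only 8 of the 16 possible hex digits ever appear in the first position of the prefix.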

Starting from Amazon EMR 6.10, Amazon EMR added an optimized location provider that makes sure the generated prefix hash has uniform distribution in the first two characters using the character set from [0-9][A-Z][a-z].

This location provider has been recently open sourced by Amazon EMR via Core: Improve bit density in object storage layout and should be available starting from Iceberg 1.3.0.

To use it, make sure the iceberg.enabled classification is set to true, and write.location-provider.impl is set to org.apache.iceberg.emr.OptimizedS3LocationProvider.

The following is a sample Spark shell command:

spark-shell --conf spark.driver.memory=4g \
--conf spark.executor.cores=4 \
--conf spark.dynamicAllocation.enabled=true \
--conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \
--conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket/iceberg-V516168123 \
--conf spark.sql.catalog.my_catalog.catalog-impl=org.apache.iceberg.aws.glue.GlueCatalog \
--conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \
--conf spark.sql.catalog.my_catalog.table-override.write.location-provider.impl=org.apache.iceberg.emr.OptimizedS3LocationProvider

The following example shows that when you enable the object storage in your Iceberg table, it adds the hash prefix in your S3 path directly after the location you provide in your DDL.

Set the write.object-storage.enabled table parameter and provide the S3 path after which you want the hash prefix to be added, using the write.data.path parameter (for Iceberg version 0.13 and above) or the write.object-storage.path parameter (for Iceberg version 0.12 and below).

Insert data into the table you created.

The hash prefix is added right after the /current/ prefix in the S3 path as defined in the DDL.

Clean up

After you complete the test, clean up your resources to avoid any recurring costs:

  1. Delete the S3 buckets that you created for this test.
  2. Delete the EMR cluster.
  3. Stop and delete the EMR notebook instance.

Conclusion

As companies continue to build newer transactional data lake use cases using Apache Iceberg open table format on very large datasets on S3 data lakes, there will be an increased focus on optimizing those petabyte-scale production environments to reduce cost, improve efficiency, and implement high availability. This post demonstrated mechanisms to implement the operational efficiencies for Apache Iceberg open table formats running on AWS.

To learn more about Apache Iceberg and implement this open table format for your transactional data lake use cases, refer to the following resources:


About the Authors

Avijit Goswami is a Principal Solutions Architect at AWS specialized in data and analytics. He supports AWS strategic customers in building high-performing, secure, and scalable data lake solutions on AWS using AWS managed services and open-source solutions. Outside of his work, Avijit likes to travel, hike in the San Francisco Bay Area trails, watch sports, and listen to music.

Rajarshi Sarkar is a Software Development Engineer at Amazon EMR/Athena. He works on cutting-edge features of Amazon EMR/Athena and is also involved in open-source projects such as Apache Iceberg and Trino. In his spare time, he likes to travel, watch movies, and hang out with friends.

Prashant Singh is a Software Development Engineer at AWS. He is interested in Databases and Data Warehouse engines and has worked on Optimizing Apache Spark performance on EMR. He is an active contributor in open source projects like Apache Spark and Apache Iceberg. During his free time, he enjoys exploring new places, food and hiking.

Supermicro Shows Liquid-Cooled NVIDIA H100 Delta Next at ISC 2023

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/supermicro-shows-liquid-cooled-nvidia-h100-delta-next-at-isc-2023/

Supermicro showed off a liquid-cooled HGX H100 8x GPU system at ISC 2023, and it takes more engineering than we thought.

The post Supermicro Shows Liquid-Cooled NVIDIA H100 Delta Next at ISC 2023 appeared first on ServeTheHome.

How to use managed dedicated IPs for email sending

Post Syndicated from Tyler Holmes original https://aws.amazon.com/blogs/messaging-and-targeting/how-to-use-managed-dedicated-ips-for-email-sending/

Starting to use dedicated IPs has always been a long, complicated process driven by factors such as the large effort to monitor and maintain the IPs and the costs, both in infrastructure and management of IP and Domain reputation. The Dedicated IP (Managed) feature in Amazon SES eliminates much of the complexity of sending email via dedicated IPs and allows you to start sending through dedicated IPs much faster and with less management overhead.

What’s the Difference Between standard dedicated IPs and managed dedicated IPs?

You can use SES for dedicated IP addresses in two ways: standard and managed. Both allow you to lease dedicated IP addresses for an additional fee, but differ in how they’re configured and managed. Although they share some commonalities, each has unique advantages depending on your use case; see here for a comparison.

Standard dedicated IPs are manually set up and managed in SES. They allow you full control over your sending reputation but require you to fully manage your dedicated IPs, including warming them up, scaling them out, and managing your pools.

Managed dedicated IPs are a quick and easy way to start using dedicated IP addresses. These dedicated IP addresses leverage machine learning to remove the need to manage the IP warm-up process. The feature also handles the scaling of your IPs up or down as your volume increases (or decreases), providing a cost-efficient way to start using dedicated IP addresses that are managed by SES.

How Does the Managed Dedicated IP Feature Automate IP Warming?

Great deliverability through your dedicated IP address requires that you have a good reputation with the receiving ISPs, also known as mailbox providers. Mailbox providers will only accept a small volume of email from an IP that they don’t recognize. When you’re first allocated an IP, it’s new and won’t be recognized by the receiving mailbox provider because it doesn’t have any reputation associated with it. For an IP’s reputation to be established, it must gradually build trust with receiving mailbox providers; this gradual trust-building process is referred to as warming up. Adding to the complexity, each mailbox provider has its own concept of warming and accepts varying volumes of email as you work through the warm-up process.

The term IP warming has always been something of a misnomer: customers often think that once their IP is “warm” it stays that way, when in reality the process is ongoing, fluctuating as your recipient domain mix and your volume change. Ensuring that you have the best deliverability when sending via dedicated IPs requires monitoring more than just recipient engagement rates (opens, clicks, bounces, complaints, opt-ins, etc.); you also need to manage volume per mailbox provider. Understanding the volumes that recipient mailbox providers will accept is very difficult, if not impossible, for senders using standard dedicated IPs. Managing this aspect of the warm-up creates a risk of sending too little, meaning warm-up takes longer, or too much, which means receiving mailbox providers may throttle, reduce IP reputation, or even filter out email being sent by an IP that is not properly warming up.

This process is a costly, risky, and time consuming one that can be eliminated using the managed feature. Managed dedicated IPs will automatically apportion the right amount of traffic per mailbox provider to your dedicated IPs and any leftover email volume is sent over the shared network of IPs, allowing you to send as you normally would. As time goes on, the proportion of email traffic being sent over your dedicated IPs increases until they are warm, at which point all of your emails will be sent through your dedicated IPs. In later stages, any sending that is in excess of your normal patterns is proactively queued to ensure the best deliverability to each mailbox provider.

As an example, if you’ve been sending all your traffic to Gmail, the IP addresses are considered warmed up only for Gmail and cold for other mailbox providers. If your customer domain mix changes and includes a large proportion of email sends to Hotmail, SES ramps up traffic slowly for Hotmail as the IP addresses are not warmed up yet while continuing to deliver all the traffic to Gmail via your dedicated IPs. The warmup adjustment is adaptive and is based on your actual sending patterns.

The managed feature is a good fit for senders who want complete control of their reputation but don’t want to spend the time and effort to manage the warm-up process or the scaling of IPs as their volume grows. A full breakdown of the use cases that are a good fit for the managed feature can be found here.

How to Configure Managed Pools and Configuration Sets

Managed dedicated IPs can be enabled in just a few steps, either from the console or programmatically. The first step is to create a managed IP pool; the managed dedicated IPs feature then determines how many dedicated IPs you require based on your sending patterns, provisions them for you, and manages how they scale based on your sending requirements. Note that this process is not instantaneous: depending on your sending patterns, it may take more or less time for the dedicated IPs to be provisioned. You need consistent email volume coming from your account for the feature to provision the correct number of IPs.

Once enabled, you can utilize managed dedicated IPs in your email sending by associating the managed IP pool with a configuration set, and then specifying that configuration set when sending email. The configuration set can also be applied to a sending identity by using a default configuration set, which can simplify your sending, as anytime the associated sending identity is used to send email your managed dedicated IPs will be used.
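The programmatic route described above can be sketched in Python against the SES v2 API. The helper names enable_managed_pool and send_via_pool are hypothetical, and the pool, configuration set, and addresses in the usage comment are placeholders. The functions take the client as a parameter, so the snippet itself does not need AWS credentials.

```python
def enable_managed_pool(ses_client, pool_name: str, configuration_set: str) -> None:
    """Create a managed dedicated IP pool and attach it to a configuration set."""
    # ScalingMode="MANAGED" asks SES to provision and scale the IPs itself.
    ses_client.create_dedicated_ip_pool(PoolName=pool_name, ScalingMode="MANAGED")
    # Route email sent with this configuration set through the managed pool.
    ses_client.put_configuration_set_delivery_options(
        ConfigurationSetName=configuration_set,
        SendingPoolName=pool_name,
    )


def send_via_pool(ses_client, configuration_set: str, sender: str,
                  recipient: str, subject: str, body: str):
    """Send a simple text email that uses the given configuration set."""
    return ses_client.send_email(
        FromEmailAddress=sender,
        Destination={"ToAddresses": [recipient]},
        Content={"Simple": {"Subject": {"Data": subject},
                            "Body": {"Text": {"Data": body}}}},
        ConfigurationSetName=configuration_set,
    )


# With boto3 installed and credentials configured:
# import boto3
# client = boto3.client("sesv2")
# enable_managed_pool(client, "my-managed-pool", "my-configuration-set")
# send_via_pool(client, "my-configuration-set", "sender@example.com",
#               "recipient@example.com", "Hello", "Sent through a managed pool")
```

Creating the pool starts incurring charges, and the configuration set is assumed to exist already.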

Instructions

Configure Via The Console

To enable Dedicated IPs (Managed) via the Amazon SES console:

  • Sign in to the AWS Management Console and open the Amazon SES console at https://console.aws.amazon.com/ses/.
  • In the left navigation pane, choose Dedicated IPs.
  • Do one of the following (Note: You will begin to incur charges after creating a Dedicated IPs (Managed) pool below)
    • If you don’t have existing dedicated IPs in your account:
      • The Dedicated IPs onboarding page is displayed. In the Dedicated IPs (Managed) overview panel, choose Enable dedicated IPs. The Create IP Pool page opens.
    • If you have existing dedicated IPs in your account:
      • Select the Managed IP pools tab on the Dedicated IPs page.
      • In the All Dedicated IP (managed) pools panel, choose Create Managed IP pool. The Create IP Pool page opens.
  • In the Pool details panel,
    • Choose Managed (auto managed) in the Scaling mode field.
    • Enter a name for your managed pool in the IP pool name field.
    • Note
      • The IP pool name must be unique. It can’t be a duplicate of a standard dedicated IP pool name in your account.
      • You can have a mix of up to 50 pools split between both Standard and Dedicated IPs (Managed) per AWS Region in your account.
  • (Optional) You can associate this managed IP pool with a configuration set by choosing one from the dropdown list in the Configuration sets field.
    • Note
      • If you choose a configuration set that’s already associated with an IP pool, it will become associated with this managed pool, and no longer be associated with the previous pool.
      • To add or remove associated configuration sets after this managed pool is created, edit the configuration set’s Sending IP pool parameter in the General details panel.
      • If you haven’t created any configuration sets yet, see Using configuration sets in Amazon SES.
  • (Optional) You can add one or more Tags to your IP pool by including a tag key and an optional value for the key.
    • Choose Add new tag and enter the Key. You can also add an optional Value for the tag. You can add up to 50 tags. If you make a mistake, choose Remove.
    • To add the tags, choose Save changes. After you create the pool, you can add, remove, or edit tags by selecting the managed pool and choosing Edit.
  • Select Create pool.
    • Note
      • After you create a managed IP pool, it can’t be converted to a standard IP pool.
      • When using Dedicated IPs (Managed), you can’t have more than 10,000 sending identities (domains and email addresses, in any combination) per AWS Region in your account.
  • After you create your managed IP pool, a link is automatically generated in the CloudWatch metrics column of the All Dedicated IPs (Managed) pools table in the SES console. Selecting it opens the Amazon CloudWatch console and displays your sending reputation at an effective daily rate with specific mailbox providers for the managed pool, using the following metrics:

 

    • Available24HourSend: Indicates how much volume the managed pool has available to send towards a specific mailbox provider.
    • SentLast24Hours: Indicates how much volume of email has been sent through the managed pool by dedicated IPs towards a specific mailbox provider.

You can also track the managed pool’s sending performance by using event publishing.

Configure Via The API

You can also configure your managed dedicated IP pool through the API. A dedicated pool is specified as managed by setting the ScalingMode parameter to “MANAGED” in the CreateDedicatedIpPool request when creating the dedicated pool.
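As a rough illustration of the API route, the following Python sketch mirrors the CLI steps with the AWS SDK. It assumes a client object that behaves like boto3's SES v2 client; the helper name `create_managed_pool` and the pool and configuration set names are hypothetical and chosen only for this example.

```python
from typing import Any, Dict


def create_managed_pool(client: Any, pool_name: str, config_set_name: str) -> Dict[str, Any]:
    """Create a managed dedicated IP pool and a configuration set that sends through it.

    `client` is assumed to behave like boto3.client("sesv2"); it is passed in
    so the sketch can be exercised without AWS credentials.
    """
    # ScalingMode="MANAGED" is what makes this a managed pool rather than a standard one.
    pool_request = {"PoolName": pool_name, "ScalingMode": "MANAGED"}
    client.create_dedicated_ip_pool(**pool_request)

    # Associate a new configuration set with the pool via its delivery options.
    config_set_request = {
        "ConfigurationSetName": config_set_name,
        "DeliveryOptions": {"SendingPoolName": pool_name},
    }
    client.create_configuration_set(**config_set_request)

    # Return the requests that were sent, which is handy for logging.
    return {"pool": pool_request, "configuration_set": config_set_request}
```

With boto3 installed and credentials configured, the helper could be called as `create_managed_pool(boto3.client("sesv2", region_name="us-east-1"), "dedicated1", "cs_dedicated1")`. Keep in mind that creating a managed pool starts incurring charges.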

Configure Via The CLI

You can configure your SES resources through the CLI. A dedicated pool can be specified to be managed by setting the --scaling-mode MANAGED parameter when creating the dedicated pool.

  • # Specify which AWS Region to use
    • export AWS_DEFAULT_REGION='us-east-1'
  • # Create a managed dedicated pool
    • aws sesv2 create-dedicated-ip-pool --pool-name dedicated1 --scaling-mode MANAGED
  • # Create a configuration set that will send through the dedicated pool
    • aws sesv2 create-configuration-set --configuration-set-name cs_dedicated1 --delivery-options SendingPoolName=dedicated1
  • # Configure the configuration set as the default for your sending identity
    • aws sesv2 put-email-identity-configuration-set-attributes --email-identity {{YOUR-SENDING-IDENTITY-HERE}} --configuration-set-name cs_dedicated1
  • # Send SES email through the API or SMTP without requiring any code changes. Emails will be sent out through the dedicated pool.
    • aws sesv2 send-email --from-email-address "{{YOUR-SENDING-IDENTITY-HERE}}" --destination "ToAddresses={{YOUR-RECIPIENT-ADDRESS-HERE}}" --content '{"Simple": {"Subject": {"Data": "Sent from a Dedicated IP Managed pool"}, "Body": {"Text": {"Data": "Hello"}}}}'

Monitoring

We recommend that customers set up event destinations and track delivery delay events to monitor the sending performance of their dedicated sending more accurately.

Conclusion

In this blog post we explained the benefits of sending via the Dedicated IPs (Managed) feature, as well as how to configure a managed dedicated IP pool and begin sending through it. Dedicated IPs (Managed) pricing can be reviewed on the Amazon SES pricing page.

Let’s Architect! Designing microservices architectures

Post Syndicated from Luca Mezzalira original https://aws.amazon.com/blogs/architecture/lets-architect-designing-microservices-architectures/

In 2022, we published Let’s Architect! Architecting microservices with containers. We covered integration patterns and some approaches for implementing microservices using containers. In this Let’s Architect! post, we want to drill down into microservices only, by focusing on the main challenges that software architects and engineers face while working on large distributed systems structured as a set of independent services.

There are many considerations to cover in detail within a broad topic like microservices. We should reflect on the organizational structure, automation pipelines, multi-account strategy, testing, communication, and many other areas. With this post we dive deep into the topic by analyzing the options for discoverability and connectivity available through Amazon VPC Lattice; then, we focus on architectural patterns for communication, mainly on asynchronous communication, as it fits very well into the paradigm. Finally, we explore how to work with serverless microservices and analyze a case study from Amazon, coming directly from the Amazon Builder’s Library.

AWS Container Day featuring Kubernetes

Modern applications are often built using a distributed microservices approach, which involves dividing the application into smaller, specialized services. Each of these services implements its own subset of functionality or business logic. To facilitate communication between these services, it is essential to have a method to authorize, route, and monitor network traffic. It is also important, when something goes wrong, to be able to identify the root cause of an issue, whether it originates at the application, service, or network level.

Amazon VPC Lattice can offer a consistent way to connect, secure, and monitor communication between instances, containers, and serverless functions. With Amazon VPC Lattice, you can define policies for traffic management, network access, advanced routing, implement discoverability, and, at the same time, monitor how the traffic is flowing inside complex applications in near real time.

Take me to this video!

Amazon VPC Lattice service gives you a consistent way to connect, secure, and monitor communication between your services

Application integration patterns for microservices

Loosely coupled integration can help you design independent systems that can be developed and operated individually, plus increase the availability and reliability of the overall system landscape—particularly by using asynchronous communication. While there are many approaches for integration and conversation scenarios, it’s not always clear which approach is best for a given situation.

Join this re:Invent 2022 session to learn about foundational patterns for integration and conversation scenarios with an emphasis on loose coupling and asynchronous communication. Explore real-world use cases architected with cloud-native and serverless services, and receive guidance on choosing integration technology.

Take me to this re:Invent 2022 video!

Loosely coupled integration can help you design independent systems that can be developed and operated individually and can also increase the availability and reliability of the overall system

Design patterns for success in serverless microservices

Software engineers love patterns—proven approaches to well-known problems that make software development easier and set our projects up for success. In complex, distributed systems, such as microservices, patterns like CQRS and Event Sourcing help decouple and scale systems.

The first part of the video is all about introducing architectural patterns and their applications, while the second part contains a set of demos and examples from the AWS console.
In this session, we examine some typical patterns for building robust and performant serverless microservices, and how data access patterns can drive polyglot persistence.

Take me to this AWS Summit video!

With Event Sourcing data is stored as a series of events, instead of direct updates to data stores. Microservices replay events from an event store to compute the appropriate state of their own data stores
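To make the replay idea concrete, here is a minimal Python sketch of event sourcing under simple assumptions: events are immutable records in an append-only list, and a service derives its current state by folding over them. The `Event` type, the event kinds, and the account example are hypothetical and are not taken from the session.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Event:
    kind: str      # hypothetical event kinds: "Deposited" or "Withdrawn"
    account: str
    amount: int


def replay(events: Iterable[Event]) -> Dict[str, int]:
    """Compute current balances by replaying every event in order."""
    balances: Dict[str, int] = {}
    for event in events:
        if event.kind == "Deposited":
            delta = event.amount
        elif event.kind == "Withdrawn":
            delta = -event.amount
        else:
            raise ValueError(f"unknown event kind: {event.kind}")
        balances[event.account] = balances.get(event.account, 0) + delta
    return balances


# The event store is append-only: state changes are recorded, never overwritten.
EVENT_STORE = [
    Event("Deposited", "alice", 100),
    Event("Withdrawn", "alice", 30),
    Event("Deposited", "bob", 50),
]
```

Calling `replay(EVENT_STORE)` rebuilds the balances from scratch, which is the step each microservice would repeat to populate its own data store.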

Avoiding overload in distributed systems by putting the smaller service in control

If we don’t pay attention to the relative scale of a service and its clients, distributed systems with microservices can be at risk of overload. A common architecture pattern adopted by many AWS services consists of splitting the system into a control plane and a data plane.

This article drills down into this scenario to understand what could happen if the data plane fleet exceeds the scale of the control plane fleet by a factor of 100 or more. This can happen in a microservices-based architecture when service X recovers from an outage and starts sending a large number of requests to service Y. Without careful fine-tuning, this shift in behavior can overwhelm the smaller callee. With this resource, we want to share some mental models and design strategies that are beneficial for distributed systems and teams working on microservices architectures.

Take me to the Amazon Builders’ Library!

To stay up to date on the data plane’s operational state, the control plane can poll an Amazon S3 bucket into which data plane servers periodically write that information
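The polling arrangement described in the caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a plain dictionary stands in for the Amazon S3 bucket, and the key layout, function names, and the 60-second staleness threshold are all hypothetical.

```python
from typing import Dict, Tuple

# Stand-in for an S3 bucket: key -> (time written, status text).
Bucket = Dict[str, Tuple[float, str]]


def write_status(bucket: Bucket, server_id: str, status: str, now: float) -> None:
    """Called periodically by each data plane server to publish its state."""
    bucket[f"status/{server_id}"] = (now, status)


def poll_statuses(bucket: Bucket, now: float, max_age: float = 60.0) -> Dict[str, str]:
    """Called by the control plane on its own schedule.

    Reports older than `max_age` seconds are treated as stale and skipped.
    """
    fresh: Dict[str, str] = {}
    for key, (written_at, status) in bucket.items():
        if now - written_at <= max_age:
            server_id = key.split("/", 1)[1]
            fresh[server_id] = status
    return fresh
```

The design point is that the smaller control plane decides when to read, so a large data plane fleet recovering all at once changes how much is written to the bucket but not how often the control plane is called.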

See you next time!

Thanks for stopping by! Join us in two weeks when we’ll discuss multi-tenancy and patterns for SaaS on AWS.

To find all the blogs from this series, you can check out the Let’s Architect! list of content on the AWS Architecture Blog.

[$] Monitoring mount operations

Post Syndicated from original https://lwn.net/Articles/932648/

Amir Goldstein kicked off a session on monitoring mounts at the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit. In particular, there are problems when trying to efficiently monitor “a very large number of mounts in a mount namespace”; some user-space programs need an accurate view of the mount tree without having to constantly parse /proc/mounts or the like. There are a number of questions to be answered, including what the API should look like and what entity should be watched in order to get notifications of new mount operations.

Security updates for Wednesday

Post Syndicated from original https://lwn.net/Articles/932827/

Security updates have been issued by Debian (libssh and sofia-sip), Fedora (cups-filters, dokuwiki, qt5-qtbase, and vim), Oracle (git, python-pip, and python3-setuptools), Red Hat (git, kernel, kpatch-patch, rh-git227-git, and sudo), SUSE (openvswitch, rmt-server, and texlive), and Ubuntu (binutils, cinder, cloud-init, firefox, golang-1.13, Jhead, liblouis, ncurses, node-json-schema, node-xmldom, nova, python-glance-store, python-os-brick, and runc).

Healthcare Orgs: Do You Need an Outsourced SOC?

Post Syndicated from Rapid7 original https://blog.rapid7.com/2023/05/24/healthcare-orgs-do-you-need-an-outsourced-soc/

Gartner predicts that 50% of organizations will partner with an external MDR (Managed Detection and Response) service by 2025 for around-the-clock monitoring. What determines where healthcare organizations fall on that 50/50 split over using an outsourced SOC? It usually comes down to their ability to adapt to the current needs of the healthcare industry.

A growing demand for improved healthcare services means more healthcare providers are turning to the cloud. But for a world built on strict regulations and literal life-or-death situations, migrating too quickly to the cloud can be a serious challenge. When healthcare teams take on cloud adoption too fast, they run the risk of:

  • Accumulating cloud services that fall through security cracks—AKA shadow IT
  • Expanding their organization’s attack surface without a means of defense, opening up more opportunities for breaches and leaks

That’s where the help of an outsourced SOC comes in. With an extra team of experts on board, healthcare organizations can secure new ephemeral environments—without putting their security teams through resource strain or burnout.

Still, it can be tough for healthcare organizations to identify when it’s time to outsource, if ever at all. Here are some tell-tale signs that outsourcing a SOC and investing in managed services is the right call.

Your Teams Are Already Overwhelmed

While most healthcare organizations have a trusted team of a few security experts, they’re usually smaller than most security teams in tech enterprises, snappy startups, or other more cyber-savvy industries. That leads to a tricky cycle of needing to do more with fewer resources.

A day in the life of a security engineer in healthcare is marked by a seemingly endless game of catchup—one that doesn’t support speed, efficiency, or a successful migration to the cloud.

It may be time to consider an outsourced SOC if your organization’s security teams are:

  • Struggling to find qualified talent
  • Overwhelmed by firefighting every single incident on their plate
  • Tired of combing through seas of alerts—some of which are false positives
  • Burned out by carrying out repetitive and mundane tasks that could be automated

You’re Super New to the Cloud

Healthcare security teams are typically IT or network pros who are well-acquainted with, and well-trained to defend, traditional environments. However, there may be knowledge gaps when it comes to healthcare’s approach to cloud security. But with global cyber attacks on healthcare organizations rising 74% per week in 2022, security teams have no time to waste learning how to protect cloud environments.

Investing in the right education and training for healthcare’s traditional security pros simply takes time and effort that many organizations can’t afford to waste. But with an external SOC, security teams can:

  • Rely on cloud security experts to handle the trickiest parts of the process
  • Learn as they go with the guidance of seasoned professionals
  • Gain strategic guidance and insights to help take their security program to the next level

You’d Benefit From Automated Processes but Struggle To Implement Them

Automation is the key to boosting your cloud security program and iterating it at scale. For healthcare, automation provides the biggest benefit in ensuring that strict compliance regulations—like HIPAA—are met. That spells good news for stakeholders, who are typically most concerned with meeting standards and maintaining compliance.

With automation, security teams in healthcare can:

  • Configure guardrails ensuring new assets and environments adhere to regulations and compliance standards
  • Set up automated alerts that indicate when standards are not met

However, implementing automation, especially if your organization is new to it, can seem like a hefty investment and a daunting task. It’s time to enlist the help of an outsourced SOC if your security teams:

  • Have limited or no experience with automation
  • Are still manually handling a lot of rote but necessary tasks
  • Know where duties get repetitive but don’t know what to do about it

That way, external cyber experts can set up automated guardrails, teach your teams how they work, and eliminate tedious, manual work.

Next Steps With Outsourced SOCs

Organizations with limited resources and novice knowledge of the cloud can significantly benefit from teaming up with managed services. But in a sea of possible partners, knowing which experts to go with can be tough—especially when healthcare organizations have various security needs.

That’s why we built Managed Threat Complete, an always-on MDR with vulnerability management in a single subscription. Consolidate your investment in external SOCs by teaming up with our seasoned security pros today.

Learn More

For more information about healthcare cybersecurity, download our new ebook: In Healthcare (and Security) Early Detection is Key

In this eBook, you’ll learn:

  • The current state of threats in the healthcare industry
  • The top challenges in addressing those threats
  • How to overcome those challenges and implement defense strategies

Download it now!

Indiana, Iowa, and Tennessee Pass Comprehensive Privacy Laws

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2023/05/indiana-iowa-and-tennessee-pass-comprehensive-privacy-laws.html

It’s been a big month for US data privacy. Indiana, Iowa, and Tennessee all passed state privacy laws, bringing the total number of states with a privacy law up to eight. No private right of action in any of those, which means it’s up to the states to enforce the laws.

The Books We Fear

Post Syndicated from Milena Galunska original https://www.toest.bg/knigite-ot-koito-se-boim/

I have never read Marx’s Communist Manifesto. My contact with the Bible is limited to a few chapters of a “Bible for Children.” In 11th grade I only got as far as the middle of “Under the Yoke.” That is hardly surprising in our post-socialist, pseudo-religious, strongly nationalist society.

Not knowing Marx’s ideas does not stop the average Bulgarian from having at least one argument a day with another average Bulgarian on the topic “How good/bad things were under socialism.” Never having opened the Bible does not stem the flow of people crossing the church threshold at Easter. And the literature classes slept through are probably the main reason “Under the Yoke,” the novel that gets the most time in Bulgarian language and literature classes, is the one Bulgarians love most (according to a ranking conducted 14 years ago).

There are books that influence us even if we have not read them. They make up the canon. Books whose characters and motifs are well known even to those who have never heard of them. Whose plots are invoked so often that the mass audience finds them banal without even being able to name the original source.

Because the canon is defined not by the books that are most read, but by the ones people most often lie about having read.

It would be interesting to see what such a sociological survey would look like:

Which book do you most often lie about having read?

Most likely the results would be similar to those of the official surveys of favorites. And how would I answer this question? My answer coincides with that of the nation 14 years ago.

My first encounter with “Under the Yoke” was in the summer between 5th and 6th grade, when I was 11, but the legends of the fearsome Vazov novel had been circulating long before that. In the summers preceding that one, I witnessed my older friends’ “struggles” with “Under the Yoke.” When the fateful summer arrived, I was already mentally prepared. I knew exactly which chapters are studied in school, how the book begins, what the greatest difficulties in reading it are, and on which pages most people give up. But I was not going to give up; I would read the entire novel, I would overcome the archaic language, full of Turkisms.

I was an ambitious child, but even ambition could not help me get past page 40.

A second attempt four and a half years later. Again unsuccessful. But while for the first time I blame my tender age (I do not know what arguments could convince me that 11 or 12 is old enough to understand the language or the meaning of “Under the Yoke”), for the second I blame my Bulgarian language and literature teacher and the poorly built system of which she was a product.

The Bulgarian education system: this creature in constant metamorphosis which, despite all the skins it has shed, remains the same.

Judging from my own experience, one of the main problems with literature classes is that instead of raising interest in works defined as classics, they repel students from them. The works in question are placed on the highest and at the same time hardest-to-reach shelf of literature. So the ordinary person, who has no desire to look for a ladder, leaves them to gather dust at the top and reaches for a much less threatening, closer-at-hand alternative literature.

An alternative to the canon can be anything, from little-known works to mass-market or genre titles, anything different from what is studied. But that happens only in the best case. Sadly, many Bulgarians turn their backs on the bookshelf entirely and turn to reality television in search of entertainment.

My personal rebellion in literature classes took the form of a stubborn unwillingness to listen during the lesson.

I did not agree with my teacher’s refusal to acknowledge any interpretation other than the official one. According to her, there was only one way to read and make sense of a work. Only one angle from which it could be viewed. If I dared to voice a different opinion (such as my claim that “Two Beautiful Eyes” is not necessarily a love poem, but rather reflects the lyrical speaker’s reverence for the childlike and the innocent), I got only the teacher’s blank stare and a monotonously delivered “Anyone else?”

And so at some point I stopped listening. Instead,

I found it much more useful to quietly read works by the author being studied that, for some reason, the Ministry of Education and Science (MON) had left out of the curriculum.

While my classmates recited “Moan” (another “favorite” moment of literature classes, namely the compulsory memorization of poems), I was reading “Ring with an Opal,” followed by “I Love You,” “Don’t Be Afraid and Come,” “Curse,” and so on, until I reached the end of the book, where the cycle “Queens of the Night” awaited me. Why didn’t we study these poems too?

Perhaps because the love triangle Mina-Yavorov-Lora turns out to be more important than Yavorov’s poetry itself? And in this way a poet who is key to Bulgarian literature is presented to students in only one context: that of the man standing between two women? And through a romantic and at the same time tragic plot, often presented as if for the pages of a tabloid.

My favorite poet among those studied in the upper grades was Atanas Dalchev. This was due to the fact that during the week allotted to Dalchev’s work, our literature teacher was absent. A rather sad reason for an author to become your favorite.

In the substitute teacher I finally glimpsed the true face of the literature teacher.

A person who encourages you to voice your opinion, to argue better, to immerse yourself in the work. I liked Dalchev’s poems more than those of the authors studied until then. I would come to understand why a few years later.

In Dalchev’s poetry there is nothing superfluous; the language is contemporary, stripped of pomposity. The images this language builds are clear and precise. Dalchev’s poems stood out from the rest so much that I no longer counted them as part of the canon, which seemed to me so cold and lifeless, locked in the past.

And that seems to be the main problem with the canon. Instead of the work being viewed as a step on which contemporary literature stands, it is locked into the period to which it belongs. Then the next period is studied, and the next, and so we reach the first years of socialism, when literature ends (at least according to the literature anthology).

April comes, and with it begins the intensive review for the matura. The most important event for every student, or rather for every teacher who wants to hold his nose high in June because his students have higher grades than his colleague’s. Never mind that his students may never read another book for the rest of their lives.

In our society image is key, and at school it is built with high grades.

How else will we show them that we are the best?

It was because of this attitude toward the canon that I turned against it.

I did not want (and still do not want) to read something just because society has imposed it on me. My personal canon is made up of unread books.

Begun and unfinished, bought and waiting on some shelf, or ones that still lie ahead of me. This canon includes everything from the classics that I have, in some way, deemed important. The books that have had a great enough influence on the world that parts of them, their motifs, plots, and characters, have been mentioned around me for as long as I can remember.

Growing up, one of my favorite films was “Matilda.” I never did read the book, which was very strange given my love for Roald Dahl’s children’s books. The film was so special that I did not want to change my idea of the story by reading the book. Matilda would not have done that. She would read every book she heard of and had access to. As a child, she was my most reliable guide to what I had to do to become smart.

Matilda read “Moby-Dick,” so I thought I should read “Moby-Dick” too. And I still think so.

Another book that as a child I thought I absolutely had to read was “The Hunchback of Notre-Dame.” A book I received for my ninth birthday. A period in which I was obsessed with France: a mini-francophone who neither speaks French nor knows what the word “francophone” means.

I have started reading “The Hunchback of Notre-Dame” dozens of times and have never finished it.

A book I have finished, however, is “Gone with the Wind.” Until recently, it was my answer to the question “What is your favorite book?” I first read the novel when I was 16, and soon afterwards “Tomorrow is another day” became my most repeated phrase. What attracted me most strongly was the character of Scarlett O’Hara. In every other novel I had read until then, heroines like Scarlett were negative characters, created as the antipode of the sweet and good protagonist.

Scarlett has more in common with the evil stepmother than with Snow White, and yet you want her to succeed.

You turn page after page hoping that Scarlett will come home and everything will fall into place. That she will once again attain that carefree state on whose wings she floats at the beginning of the novel. Instead, however, the end of the story, it turns out, leaves a bitter taste. You realize that the bliss of childhood is now unattainable both for Scarlett and for you. I went through several catharses while reading “Gone with the Wind.” Something that would not have happened had I reached for this novel at another moment in my life.

In that period I wanted to keep reading about such heroines. So I started “Vanity Fair.” The blurb promised me “a story following the fate of the captivating and ruthless Becky Sharp […] the prototype and inspiration for the later-created Scarlett O’Hara.”

At least in the first 300 pages I saw no such resemblance. Perhaps at first glance it is there, but the themes of the novels, the style in which they are written, and the impressions they create are entirely different, if I can generalize at all about “Vanity Fair” on the basis of the first 300 pages, which is about a third of the whole book.

That is not why I stopped reading it, though. The multitude of characters and digressions, however interesting, proved a problem for my brain, which was overloaded by the German language.

Since 2018 I have begun every year with the promise that this will be the year I finish “Vanity Fair.”

Who knows, maybe in 2023 my promise will be fulfilled.

Two other novels that have recently become part of my canon of unfinished books are Joyce’s “Ulysses” and Tolstoy’s “Anna Karenina.” And while there is a very real chance that “Anna Karenina” will be finished soon, I harbor no needless illusions about “Ulysses.” There are books to which one must devote the time they deserve (not in days, but in years). And not only to read them, but also to make sense of them.

The titles mentioned above are only part of my personal literary canon. Books that came into my life by chance. Recommended to me by a friend, by a person I consider an authority, or by my favorite childhood film character. Books to which I have attached such great importance that I am now afraid to open them.

Perhaps this is another way we could characterize the canon. Books we fear.

Books about which we have heard this and that: we know the plot, or a quote or two, or the name of the main character. We know they are important for the development of world and/or national literature. That they are highly valued by critics and literary scholars, and at the same time have been read by at least one person close to us whose opinion we value. Books with which, one day, we collide head-on. Sometimes we even reach the end of them. Or we simply lie that we have read them.


The essay by Milena Galunska (originally titled “My Literary Canon”) was written as part of the course “The Bulgarian Literary Canon: Formation and Problematization,” taught by Assoc. Prof. Dr. Bilyana Kurtasheva at the New Bulgarian Studies Department, NBU.
