Tag Archives: encryption

Foundation report for 2015

Post Syndicated from Michael "Monty" Widenius original http://monty-says.blogspot.com/2016/02/foundation-report-for-2015.html

This is a repost of Otto Kekäläinen’s blog about the MariaDB Foundation’s work in 2015.

The mariadb.org website had over one million page views in 2015, a growth of about 9% since 2014. Good growth has been visible all over the MariaDB ecosystem, and we can conclude that 2015 was a successful year for MariaDB.

Increased adoption

MariaDB was included for the first time in an official Debian release (version 8.0 “Jessie”) and there has been strong adoption of MariaDB 10.0 in Linux distributions that already shipped 5.5. MariaDB is now available from all major Linux distributions including SUSE, Red Hat, Debian and Ubuntu. Adoption of MariaDB on other platforms also increased, and MariaDB is now available as a database option on, among others, Amazon RDS, 1&1, Azure and the Juju Charm Store (Ubuntu).

Active maintenance and active development

In 2015 there were 6 releases of the 5.5 series, 8 releases of the 10.0 series and 8 releases of the 10.1 series. The 10.1 series was announced for general availability in October 2015 with the release of 10.1.8. In addition, there were multiple releases of MariaDB Galera Cluster and the C, Java and ODBC connectors, as well as many other MariaDB tools. The announcements for each release can be read in the mariadb.org blog archives, with further details in the Knowledge Base.

Some of the notable new features in 10.1 include:

Galera clustering is now built in instead of being a separate server version, and can be activated with a simple configuration change.

Traditional replication was also improved and is much faster in certain scenarios.

Table, tablespace and log encryption were introduced.

New security hardening features are enabled by default, and authentication was improved.

Improved support for spatial reference systems for GIS data.

We are also proud that the release remains backwards compatible and that it is easy to upgrade to 10.1 from any previous MariaDB or MySQL release. 10.1 was also a success in terms of collaboration and included major contributions from multiple companies and developers.

MariaDB events and talks

The main event organized by the MariaDB Foundation in the year was the MariaDB Developer Meetup in Amsterdam in October, at the Booking.com offices. It was a success with over 60 attendees. In addition, there were about a dozen events in 2015 at which MariaDB Foundation staff spoke. We are planning a new MariaDB developer event in early April 2016 in Berlin. We will make a proper announcement as soon as we have the date and place fixed.

Staff, board and members

In 2015 the staff included:

Otto Kekäläinen, CEO

Michael “Monty” Widenius, Founder and core developer

Andrea Spåre-Strachan, personal assistant to Mr Widenius

Sergey Vojtovich, core developer

Alexander Barkov, core developer

Vicențiu Ciorbaru, developer

Ian Gilfillan, documentation writer and webmaster

Our staffing will slightly increase as Vicențiu will start working full time for the Foundation in 2016. Our developers worked a lot on performance and scalability issues, ported the best features from new MySQL releases, improved MariaDB portability for platforms like ARM, AIX, IBM s390 and POWER8, and fixed security issues and other bugs. A lot of time was also invested in cleaning up the code base, as the current 2.2 million lines of code include quite a lot of legacy code. Version control and issue tracker statistics show that the Foundation staff made 528 commits, reported 373 bugs or issues and closed 424 bugs or other issues.

In total there were 2400 commits made by 91 contributors in 2015.

The Board of Directors in 2015 consisted of:

Chairman Rasmus Johansson, VP Engineering at MariaDB Corporation

Michael “Monty” Widenius, Founder and CTO of MariaDB Corporation

Jeremy Zawodny, Software Engineer at Craigslist

Sergei Golubchik, Chief Architect at MariaDB Corporation

Espen Håkonsen, CIO of Visma and Managing Director of Visma IT & Communications

Eric Herman, Principal Developer at Booking.com

MariaDB Foundation CEO Otto Kekäläinen served as the secretary of the board. In 2015 we welcomed Booking.com, Visma and Verkkokauppa.com as new major sponsors. Acronis has just joined as a member for 2016. Please check out the full list of supporters. If you want to help the MariaDB Foundation in its mission to guarantee continuity and open collaboration, please support us with an individual or corporate sponsorship.

What will 2016 bring?

We expect steady growth in the adoption of MariaDB in 2016. There are many migrations from legacy database solutions underway, and as the world becomes increasingly digital, there are a ton of new software projects starting that use MariaDB for their SQL and NoSQL data needs. In 2016 many will upgrade to 10.1, and the quickest ones will start using MariaDB 10.2, which is scheduled to be released some time during 2016. MariaDB also has a lot of plugins and storage engines that are getting more and more attention, and we expect more buzz around them as software developers figure out new ways to manage data in fast, secure and scalable ways.

Turning Amazon EMR into a Massive Amazon S3 Processing Engine with Campanile

Post Syndicated from Michael Wallman original https://blogs.aws.amazon.com/bigdata/post/Tx1XU0OQAZER3MI/Turning-Amazon-EMR-into-a-Massive-Amazon-S3-Processing-Engine-with-Campanile

Michael Wallman is a senior consultant with AWS Professional Services.

Have you ever had to copy a huge Amazon S3 bucket to another account or region? Or create a list based on object name or size? How about mapping a function over millions of objects? Amazon EMR to the rescue! EMR allows you to deploy large managed clusters, tightly coupled with S3, transforming the problem from a single, unscalable process/instance to one of orchestration.

The Campanile framework is a collection of scripts and patterns that use standard building blocks like Hadoop MapReduce, Hadoop Streaming, Hive, and Python Boto. Customers have used Campanile to migrate petabytes of data from one account to another, run periodic sync jobs and large Amazon Glacier restores, enable server-side encryption (SSE), create indexes, and sync data before enabling cross-region replication (CRR).

Traditionally, you could perform these tasks with simple shell commands: aws s3 ls s3://bucket | awk '{ if($3 > 1024000) print $0 }'. More recently, you could use complex threaded applications, having single processes make hundreds of requests a second. Now, with object counts reaching the billions, these patterns no longer realistically scale. For example, how long would it take you to list a billion objects with a process listing 1,000 objects/sec? More than 11 days without interruption.
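As a quick sanity check on that figure, here is a minimal back-of-the-envelope calculation in Python (the 1,000 objects/sec listing rate is simply the assumption from the question above):

# time to list 1 billion objects at a sustained 1,000 objects per second
objects = 1000000000
rate_per_second = 1000
seconds = objects / float(rate_per_second)   # 1,000,000 seconds
days = seconds / 86400.0                     # seconds in a day
print("%.1f days" % days)                    # prints "11.6 days"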

This post examines how the Campanile framework can be used to implement a bucket copy at speeds in excess of 100 Gbps, or tens of thousands of transactions per second. Campanile also helps streamline the deployment of EMR by providing a number of bootstrap actions that install system dependencies like Boto, write instance-type-specific configuration files, and provide additional useful development and reporting tools like SaltStack and Syslog.

Each block below corresponds to an EMR step, and in most cases a single S3 API request.

Bucketlist

The first challenge of processing a large number of objects is getting the list of objects themselves. As mentioned earlier, it quickly becomes a scaling problem, and any manageable listing operation has to be distributed across the cluster. Therefore, the first and most important pattern of Campanile is the use of a part file, which is a newline-separated text file in <delimiter>,<prefix> format. Relying on S3's lexicographically sorted index and the ability to list keys hierarchically, Hadoop splits the part file across mapper tasks and lists the entire bucket in parallel. The more parts/nodes there are, the more concurrent listing operations. Using this model, can you list more than a million objects per second? YES.

For a bucket containing objects with UUID-style key names, a part file might look like the one below. Upon completion of the BucketList step, the list of objects is written to HDFS.
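The original post's sample part file is not reproduced here; as a hypothetical sketch only, a part file for UUID-named keys could use an empty delimiter field and one hexadecimal prefix per line, giving sixteen independent listing tasks:

,0
,1
,2
,3
,4
,5
,6
,7
,8
,9
,a
,b
,c
,d
,e
,f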

Distributed diff using Hive

Now that you have an easy method for listing very large buckets, deriving the difference between two buckets is no longer a tedious, difficult, or time consuming task. If a diff is required, simply list the destination and run the Hive script below. Finding one missing or changed object amongst millions or billions is no problem. 

NOTE: The Hive script relies on Hive variables, an additional feature of EMR where you can include variables in scripts dynamically by using the dollar sign and curly braces.

DROP TABLE IF EXISTS src;
DROP TABLE IF EXISTS dst;
DROP TABLE IF EXISTS diff;
CREATE EXTERNAL TABLE src(key STRING, etag STRING, size BIGINT, mtime STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' LOCATION '${SRC}';
CREATE EXTERNAL TABLE dst(key STRING, etag STRING, size BIGINT, mtime STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' LOCATION '${DST}';
CREATE EXTERNAL TABLE diff(key STRING, etag STRING, size BIGINT, mtime STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' LOCATION '${DIFF}';
INSERT OVERWRITE TABLE diff SELECT src.key,src.etag,src.size,src.mtime FROM src LEFT OUTER JOIN dst ON (src.key = dst.key) WHERE (dst.key IS NULL) OR (src.etag != dst.etag);

Multipartlist

Probably the most complex function of the group, multipartlist pre-processes objects that were uploaded to S3 using the MultiPart API (for more information, see Multipart Upload Overview). These objects have ETag values suffixed with a hyphen and the number of parts that made up the original object. While S3 supports a maximum of 5 GB per PUT operation, you can use MultiPart to tie together any number of parts between 1 and 10,000 and create an object up to 5 TB in size.

NOTE: Uploading 5 GB in a single PUT is both inefficient and error prone. This is why many of the S3 tools, including the AWS CLI and console, start using MultiPart on objects greater than 8 MB.

NOTE: MultipartUploadInitiate is the operation where the destination object’s metadata, cache control headers, ACLs, and encryption parameters are set.

To replicate ETags across buckets, you must mimic the original object part map. Campanile currently supports a few functions for this calculation (and is looking to expand). But splitting larger objects also has a positive effect on performance. While non-multipart objects are serviced by a single mapper, multipart objects can be processed asynchronously across independent mapper tasks: thousands of parts of the same object, all being processed at the same time.
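For context, the widely used calculation behind multipart ETags looks roughly like the sketch below. This is an illustration only, not Campanile's actual code; the 8 MB part size is an assumption that matches the CLI threshold mentioned above.

import hashlib

def multipart_etag(path, part_size=8 * 1024 * 1024):
    # S3-style multipart ETag: MD5 of the concatenated per-part MD5 digests,
    # suffixed with "-<number of parts>"
    part_digests = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(part_size)
            if not chunk:
                break
            part_digests.append(hashlib.md5(chunk).digest())
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return "%s-%d" % (combined, len(part_digests))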

NOTE: Objects without a multipart ETag value are simply passed through, as seen in object 00 below.

Objectcopy and Multipartcomplete

Corresponding to the S3 GET and PUT API operations, objectcopy does most of the work in terms of transactions per second (tps) and overall network throughput. Reading the output of Multipartlist, it downloads each object/part in the list and immediately uploads it to the destination bucket. For single-part objects, the destination's metadata, cache control headers, ACLs, and encryption parameters are set in this step.

To maximize performance, it relies on two configuration parameters set by Campanile's EMR bootstrap action. The first setting is ephemeral, a comma-separated list of the instance's ephemeral mount points. To distribute load across disks, one is randomly selected as the temporary location of the downloaded object/part. In most cases, though, a downloaded object/part never reaches the disk because of the second setting: maxtmpsize (dependent on the instance's memory size) tells the tempfile class to flush bytes to disk only when this size is reached. Objects or parts smaller than maxtmpsize stay entirely in memory. See an example of Campanile's configuration below.

$ cat /etc/campanile.cfg
[DEFAULT]
maxtmpsize=134217728
ephemeral=/mnt,/mnt1

For single-part objects, the copy completes here. Multipart uploads require a final UploadComplete request, which incorporates information about each individual part. Therefore, objectcopy outputs the results of each uploaded part for the reducer(s) to process. Remember, parts are being uploaded and processed asynchronously, so Campanile relies on Hadoop’s Reducer function to sort and group the parts that constitute a single object. Multipartcomplete orders the part data and signals that the upload is complete. This completes the copy process.

Conclusion

Within the aws-big-data-blog repo, you can find Campanile code samples, test files, detailed documentation, and a test procedure to exercise the steps covered above. From these samples, you can unlock the power of S3 and EMR. So clone the repo and get going!

If you have any questions or suggestions, please leave a comment below.

———————————————————–

Related

Nasdaq’s Architecture using Amazon EMR and Amazon S3 for Ad Hoc Access to a Massive Data Set

 

 

How to Help Protect Sensitive Data with AWS KMS

Post Syndicated from Matt Bretan original https://blogs.aws.amazon.com/security/post/Tx79IILINW04DC/How-to-Help-Protect-Sensitive-Data-with-AWS-KMS

AWS Key Management Service (AWS KMS) celebrated its one-year launch anniversary in November 2015, and organizations of all sizes are using it to effectively manage their encryption keys. KMS also successfully completed the PCI DSS 3.1 Level 1 assessment as well as the latest SOC assessment in August 2015.

One question KMS customers frequently ask is how to encrypt Primary Account Number (PAN) data within AWS, because PCI DSS sections 3.5 and 3.6 require the encryption of credit card data at rest and have stringent requirements around the management of encryption keys. One KMS encryption option is to encrypt your PAN data using customer data keys (CDKs) that are exportable out of KMS. Alternatively, you can use KMS to directly encrypt PAN data by using a customer master key (CMK). In this blog post, I will show you how to help protect sensitive PAN data by using KMS CMKs.

The use of a CMK to directly encrypt data removes some of the burden of having developers manage encryption libraries. Additionally, a CMK cannot be exported from KMS, which alleviates the concern about someone saving the encryption key in an insecure location. You can also leverage AWS CloudTrail so that you have logs of the key’s use.

For the purpose of this post, I have three different AWS Identity and Access Management (IAM) roles to help ensure the security of the PAN data being encrypted:

KeyAdmin – This is the general key administrator role, which has the ability to create and manage the KMS keys. A key administrator does not have the ability to directly use the keys for encrypt and decrypt functions. Keep in mind that because the administrator does have the ability to change a key’s policy, they could escalate their own privilege by changing this policy to give themselves encrypt/decrypt permissions.

PANEncrypt – This role allows the user only to encrypt an object using the CMK.

PANDecrypt – This role allows the user only to decrypt an object using the CMK.

If you don’t already have a CMK that you wish to use to encrypt the sensitive PAN data, you can create one with the following command. (Throughout this post, remember to replace the placeholder text with your account-specific information.)

$ aws kms create-key --profile KeyAdmin --description "Key used to encrypt and decrypt sensitive PAN data" --policy file://Key_Policy

Notice the use of --profile KeyAdmin in the previous command. This forces the command to be run as a role specified within my configuration file that has permissions to create a KMS key. We will be using different roles, as defined in the following key policy (to which file://Key_Policy in the previous command refers), to manipulate and use the KMS key. For additional details about how to assume roles within the CLI, see Assuming a Role. (A sample profile configuration is sketched after the key policy below.)

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowAccessForKeyAdministrators",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/KeyAdmin"   
      },
      "Action": [
        "kms:Create*",
        "kms:Describe*",
        "kms:Enable*",
        "kms:List*",
        "kms:Put*",
        "kms:Update*",
        "kms:Revoke*",
        "kms:Disable*",
        "kms:Get*",
        "kms:Delete*",
        "kms:ScheduleKeyDeletion",
        "kms:CancelKeyDeletion"
      ],
      "Resource": "*"
    },
    {
      "Sid": "AllowEncryptionWithTheKey",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/PANEncrypt"
      },
      "Action": [
        "kms:Encrypt",
        "kms:ReEncrypt*",
        "kms:GenerateDataKey*",
        "kms:DescribeKey",
        "kms:ListKeys"
      ],
      "Resource": "*"
    },
    {
      "Sid": "AllowDecryptionWithTheKey",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/PANDecrypt"
      },
      "Action": [
        "kms:Decrypt"
      ],
      "Resource": "*"
    }
  ]
}
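As an aside on the --profile flags used throughout this post, a minimal sketch of the corresponding named role profiles in ~/.aws/config might look like the following. The role ARNs mirror the principals in the key policy above, but the source_profile and the exact layout are assumptions for illustration.

[profile KeyAdmin]
role_arn = arn:aws:iam::123456789012:role/KeyAdmin
source_profile = default

[profile PANEncrypt]
role_arn = arn:aws:iam::123456789012:role/PANEncrypt
source_profile = default

[profile PANDecrypt]
role_arn = arn:aws:iam::123456789012:role/PANDecrypt
source_profile = default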

After the new CMK is created, I can then assign it an alias so that it will be easier to identify in the future. In this case, I will create the alias SensitivePANKey, as shown in the following command.

$ aws kms create-alias --profile KeyAdmin --alias-name alias/SensitivePANKey --target-key-id arn:aws:kms:us-east-1:123456789012:key/221c9ce1-9da8-44e9-801b-faf1EXAMPLE

Now that I have a CMK with least-privilege permissions to limit who can manage and use it, I can start to use it to encrypt PAN data. To keep things simple in this post, I will be using AWS CLI commands to accomplish this, but this can also be done through an AWS SDK and incorporated into an application.

Using the PANEncrypt role, the following CLI command takes in a string of data (in this case "Sensitive PAN Data"), encrypts it using the key I created earlier in this post, decodes the Base64-encoded ciphertext, and writes it to a new file called encrypted-123-1449012738. Notice that I also use EncryptionContext to further protect the encrypted data. The sensitive PAN data is sent to KMS over TLS (with ciphers that enforce perfect forward secrecy) and is then encrypted under AES-GCM using a 256-bit key.

$ aws kms encrypt --profile PANEncrypt --key-id alias/SensitivePANKey --plaintext "Sensitive PAN Data" --query CiphertextBlob --encryption-context [email protected],Date=1449012738 --output text | base64 --decode > encrypted-123-1449012738

Because the EncryptionContext must be the same when I decrypt the file, I gave the file a unique name that can help us rebuild the EncryptionContext when it comes time to decrypt the object. The file name structure is: encrypted-GUID-Date. The GUID allows us to look up the user’s user name within our directory, and then I use the date as part of the context. As Greg Rubin discussed in another AWS Security Blog post, the EncryptionContext can help ensure the integrity of the encrypted data.
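As a minimal illustration of that lookup, the helper below rebuilds a boto3-style EncryptionContext dictionary from the file name. The helper name and the directory lookup are hypothetical; the UserName/Date keys mirror the context used in the commands above.

def encryption_context_for(filename, lookup_username):
    # filename follows the encrypted-<GUID>-<Date> convention described above;
    # lookup_username is a stand-in for whatever directory/user store you use
    _, guid, date = filename.split("-", 2)
    return {"UserName": lookup_username(guid), "Date": date}

# Example: encryption_context_for("encrypted-123-1449012738", my_directory_lookup)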

From here, I can use the following command to put this encrypted object in an Amazon S3 bucket.

$ aws s3 cp encrypted-123-1449012738 s3://Secure-S3-Bucket-For-PAN/ --region us-west-2 --sse aws:kms

For this example, I chose server-side encryption with the default AWS-managed KMS key (aws:kms), but this could have been another KMS key as well. From here, I update a database with the location of this encrypted object within the S3 bucket.

When I need to retrieve the PAN data, I can make the following CLI call to get the encrypted object from my S3 bucket.

$ aws s3 cp s3://Secure-S3-Bucket-For-PAN/encrypted-123-1449012738 . --region us-west-2

Finally to decrypt the object, I run the following command using the PANDecrypt role.

$ echo "Decrypted PAN Data: $(aws kms decrypt --profile PANDecrypt --ciphertext-blob fileb://encrypted-123-1449012738 --encryption-context [email protected],Date=1449012738 --output text --query Plaintext | base64 --decode)"

Notice that I use the same EncryptionContext as I did when I encrypted the sensitive PAN data. To get this EncryptionContext, I again look up the UserName from the GUID and then include the Date. Then for the purpose of this example, I print this sensitive data to the screen, but in a real-world example, this can be passed to another application or service.

Now that I have shown that KMS can directly encrypt and decrypt sensitive PAN data, this can be rolled out as a service within an application environment. As a best practice, you should avoid using the same KMS CMK to encrypt more than 2 billion objects. After that point, the security of the resulting ciphertexts may be weakened. To mitigate this risk, you can choose to have KMS rotate your CMK annually or you can create multiple CMKs to handle your workloads safely. Additionally, this service should be composed of two different component services: one that provides encryption, and another that has enhanced controls around it and is used to decrypt the sensitive data. These component services would address storage of the ciphertext, metadata, error conditions, and so on. With the integration of CloudTrail into KMS and application logs, an organization can have detailed records of the calls into the service and the use of KMS keys across the organization.

If you have questions or comments about this post, either post them below or visit the KMS forum.

– Matt

The current state of boot security

Post Syndicated from Matthew Garrett original http://mjg59.dreamwidth.org/39339.html

I gave a presentation at 32C3 this week. One of the things I said was "If any of you are doing seriously confidential work on Apple laptops, stop. For the love of god, please stop." I didn’t really have time to go into the details of that at the time, but right now I’m sitting on a plane with a ridiculous sinus headache and the pseudoephedrine hasn’t kicked in yet, so here we go.

The basic premise of my presentation was that it’s very difficult to determine whether your system is in a trustworthy state before you start typing your secrets (such as your disk decryption passphrase) into it. If it’s easy for an attacker to modify your system such that it’s not trustworthy at the point where you type in a password, it’s easy for an attacker to obtain your password. So, if you actually care about your disk encryption being resistant to anybody who can get temporary physical possession of your laptop, you care about it being difficult for someone to compromise your early boot process without you noticing.

There are two approaches to this. The first is UEFI Secure Boot. If you cryptographically verify each component of the boot process, it’s not possible for a user to compromise the boot process. The second is a measured boot. If you measure each component of the boot process into the TPM, and if you use these measurements to control access to a secret that allows the laptop to prove that it’s trustworthy (such as Joanna Rutkowska’s Anti Evil Maid or my variant on the theme), an attacker can compromise the boot process but you’ll know that they’ve done so before you start typing.

So, how do current operating systems stack up here?

Windows: Supports UEFI Secure Boot in a meaningful way. Supports measured boot, but provides no mechanism for the system to attest that it hasn’t been compromised. Good, but not perfect.

Linux: Supports UEFI Secure Boot[1], but doesn’t verify signatures on the initrd[2]. This means that attacks such as Evil Abigail are still possible. Measured boot isn’t in a good state, but it’s possible to incorporate with a bunch of manual work. Vulnerable out of the box, but can be configured to be better than Windows.

Apple: Ha. Snare talked about attacking the Apple boot process in 2012 – basically everything he described then is still possible. Apple recently hired the people behind Legbacore, so there’s hope – but right now all shipping Apple hardware has no firmware support for UEFI Secure Boot and no TPM. This makes it impossible to provide any kind of boot attestation, and there’s no real way you can verify that your system hasn’t been compromised.

Now, to be fair, there are attacks that even Windows and properly configured Linux will still be vulnerable to. Firmware defects that permit modification of System Management Mode code can still be used to circumvent these protections, and the Management Engine is in a position to just do whatever it wants and fuck all of you. But that’s really not an excuse to just ignore everything else. Improving the current state of boot security makes it more difficult for adversaries to compromise a system, and if we ever do get to the point of systems which aren’t running any hidden proprietary code we’ll still need this functionality. It’s worth doing, and it’s worth doing now.

[1] Well, except Ubuntu’s signed bootloader will happily boot unsigned kernels, which kind of defeats the entire point of the exercise

[2] Initrds are built on the local machine, so we can’t just ship signed images

Month in Review

Post Syndicated from Andy Werth original https://blogs.aws.amazon.com/bigdata/post/TxV5JXBROJC8HK/Month-in-Review

Lots for big data enthusiasts in December on the AWS Big Data Blog. Take a look!

Top 10 Performance Tuning Techniques for Amazon Redshift

“This post takes you through the most common issues that customers find as they adopt Amazon Redshift, and gives you concrete guidance on how to address each.”

Migrating Metadata when Encrypting an Amazon Redshift Cluster

“The customer is acquiring a manufacturing company that is slightly smaller than they are. Each has a BI infrastructure and they believe consolidating platforms would lower expenses and simplify operations. They want to move the acquired organization’s warehouse into the existing Amazon Redshift cluster, but have a contractual obligation to encrypt data.”

Performance Tuning Your Titan Graph Database on AWS

“Graph databases can outperform an RDBMS and give much simpler query syntax for many use cases. In my last post, Building a Graph Database on AWS Using Amazon DynamoDB and Titan, I showed how a network of relationships can be stored and queried using a graph database. In this post, I show you how to tune the performance of your Titan database running on Amazon DynamoDB in AWS.”

Securely Access Web Interfaces on Amazon EMR Launched in a Private Subnet

“In this post, I outline two possible solutions to securely access web UIs on an EMR cluster running in a private subnet. These options cover scenarios such as connecting through a local network to your VPC or connecting through the Internet if your private subnet is not directly accessible.”

Query Routing and Rewrite: Introducing pgbouncer-rr for Amazon Redshift and PostgreSQL

“Have you ever wanted to split your database load across multiple servers or clusters without impacting the configuration or code of your client applications? Or perhaps you have wished for a way to intercept and modify application queries, so that you can make them use optimized tables (sorted, pre-joined, pre-aggregated, etc.), add security filters, or hide changes you have made in the schema?”

 

 

FROM THE ARCHIVE

(April 9, 2015)

Nasdaq’s Architecture using Amazon EMR and Amazon S3 for Ad Hoc Access to a Massive Data Set

Nate Sammons, a Principal Architect for Nasdaq, describes Nasdaq’s new data warehouse initiative: “Because we can now use Amazon S3 client-side encryption with EMRFS, we can meet our security requirements for data at rest in Amazon S3 and enjoy the scalability and ecosystem of applications in Amazon EMR.”

The Most Popular AWS Security Blog Posts in 2015

Post Syndicated from Craig Liebendorfer original https://blogs.aws.amazon.com/security/post/Tx4QX7W51NDSLO/The-Most-Popular-AWS-Security-Blog-Posts-in-2015

The following 20 posts are the most popular posts that were published in 2015 on the AWS Security Blog. You can use this list as a guide to do some catchup reading or even read a post again that you found particularly valuable.  

Introducing s2n, a New Open Source TLS Implementation

Customer Update—AWS and EU Safe Harbor

How to Connect Your On-Premises Active Directory to AWS Using AD Connector

How to Implement Federated API and CLI Access Using SAML 2.0 and AD FS

Privacy and Data Security

Enable a New Feature in the AWS Management Console: Cross-Account Access

PCI Compliance in the AWS Cloud

How to Help Prepare for DDoS Attacks by Reducing Your Attack Surface

How to Address the PCI DSS Requirements for Data Encryption in Transit Using Amazon VPC

How to Receive Alerts When Your IAM Configuration Changes

How to Receive Notifications When Your AWS Account’s Root Access Keys Are Used

How to Receive Alerts When Specific APIs Are Called by Using AWS CloudTrail, Amazon SNS, and AWS Lambda

New in IAM: Quickly Identify When an Access Key Was Last Used

2015 AWS PCI Compliance Package Now Available

An Easier Way to Manage Your Policies

New Whitepaper—Single Sign-On: Integrating AWS, OpenLDAP, and Shibboleth

New SOC 1, 2, and 3 Reports Available — Including a New Region and Service In-Scope

How to Create a Limited IAM Administrator by Using Managed Policies

How to Delegate Management of Multi-Factor Authentication to AWS IAM Users

Now Available: Videos and Slide Decks from the re:Invent 2015 Security and Compliance Track

Also, the following 20 posts are the most popular AWS Security Blog posts since its inception in April 2013. Some of these posts have been readers’ favorites year after year.

Introducing s2n, a New Open Source TLS Implementation

Writing IAM Policies: How to Grant Access to an Amazon S3 Bucket

Where’s My Secret Access Key?

Securely connect to Linux instances running in a private Amazon VPC

Enabling Federation to AWS Using Windows Active Directory, ADFS, and SAML 2.0

A New and Standardized Way to Manage Credentials in the AWS SDKs

IAM Policies and Bucket Policies and ACLs! Oh, My! (Controlling Access to S3 Resources)

Writing IAM Policies: Grant Access to User-Specific Folders in an Amazon S3 Bucket

Demystifying EC2 Resource-Level Permissions

Resource-Level Permissions for EC2–Controlling Management Access on Specific Instances

Controlling Network Access to EC2 Instances Using a Bastion Server

Customer Update—AWS and EU Safe Harbor

Granting Permission to Launch EC2 Instances with IAM Roles (PassRole Permission)

How Do I Protect Cross-Account Access Using MFA?

Building an App Using Amazon Cognito and an OpenID Connect Identity Provider

A safer way to distribute AWS credentials to EC2

How to Connect Your On-Premises Active Directory to AWS Using AD Connector

How to Implement Federated API and CLI Access Using SAML 2.0 and AD FS

Privacy and Data Security

How to Enable Cross-Account Access to the AWS Management Console

We thank you for visiting the AWS Security Blog in 2015 and hope you’ll return again regularly in 2016. Let us know in the comments section below if there is a specific security or compliance topic you would like us to cover in the new year. 

– Craig

Query Routing and Rewrite: Introducing pgbouncer-rr for Amazon Redshift and PostgreSQL

Post Syndicated from Bob Strahan original https://blogs.aws.amazon.com/bigdata/post/Tx3G7177U6YHY5I/Query-Routing-and-Rewrite-Introducing-pgbouncer-rr-for-Amazon-Redshift-and-Postg

Bob Strahan is a senior consultant with AWS Professional Services.

Have you ever wanted to split your database load across multiple servers or clusters without impacting the configuration or code of your client applications? Or perhaps you have wished for a way to intercept and modify application queries, so that you can make them use optimized tables (sorted, pre-joined, pre-aggregated, etc.), add security filters, or hide changes you have made in the schema?

The pgbouncer-rr project is based on pgbouncer, an open source, PostgreSQL connection pooler. It adds two new significant features:

Routing: Intelligently send queries to different database servers from one client connection; use it to partition or load balance across multiple servers or clusters.

Rewrite: Intercept and programmatically change client queries before they are sent to the server; use it to optimize or otherwise alter queries without modifying your application.

Pgbouncer-rr works the same way as pgbouncer. Any target application can be connected to pgbouncer-rr as if it were an Amazon Redshift or PostgreSQL server, and pgbouncer-rr creates a connection to the actual server, or reuses an existing connection.

You can deploy multiple instances of pgbouncer-rr to avoid throughput bottlenecks or single points of failure, or to support multiple configurations. It can live in an Auto Scaling group, and behind an Elastic Load Balancing load balancer. It can be deployed to a public subnet while your servers reside in private subnets. You can choose to run it as a bastion server using SSH tunneling, or you can use pgbouncer’s recently introduced SSL support for encryption and authentication.

Documentation and community support for pgbouncer can be easily found online;  pgbouncer-rr is a superset of pgbouncer.

Now I’d like to talk about the query routing and query rewrite feature enhancements.

Query Routing

The routing feature maps client connections to server connections using a Python routing function which you provide. Your function is called for each query, with the client username and the query string as parameters. Its return value must identify the target database server. How it does this is entirely up to you.

For example, you might want to run two Amazon Redshift clusters, each optimized to host a distinct data warehouse subject area. You can determine the appropriate cluster for any given query based on the names of the schemas or tables used in the query. This can be extended to support multiple Amazon Redshift clusters or PostgreSQL instances.

In fact, you can even mix and match Amazon Redshift and PostgreSQL, taking care to ensure that your Python functions correctly handle any server-specific grammar in your queries; your database will throw errors if your routing function sends it queries it can’t process. And, of course, any query must run entirely on the server to which it is routed; cross-database joins or multi-server transactions do not work!

Here’s another example: you might want to implement controlled load balancing (or A/B testing) across replicated clusters or servers. Your routing function can choose a server for each query based on any combination of the client username, the query string, random variables, or external input. The logic can be as simple or as sophisticated as you want.

Your routing function has access to the full power of the Python language and the myriad of available Python modules. You can use the regular expression module (re) to match words and patterns in the query string, or use the SQL parser module (sqlparse) to support more sophisticated/robust query parsing. You may also want to use the AWS SDK module (boto) to read your routing table configurations from Amazon DynamoDB.
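As a minimal sketch of what that might look like, the function below combines sqlparse with a regular expression: non-SELECT statements go to one pool and reads on a particular table go to another. This is illustration only, not part of pgbouncer-rr; the dbkey names dev.primary and dev.readonly are assumptions you would define in your own [databases] section.

import re
import sqlparse

def routing_rules(username, query):
    # sqlparse gives us the statement type without hand-rolled keyword matching
    parsed = sqlparse.parse(query)
    if not parsed:
        return None                    # nothing to route; keep current server
    stmt = parsed[0]
    if stmt.get_type() != "SELECT":
        return "dev.primary"           # assumed dbkey for writes and DDL
    if re.search(r"\btablea\b", query, re.IGNORECASE):
        return "dev.readonly"          # assumed dbkey for a read-optimized cluster
    return None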

The Python routing function is dynamically loaded by pgbouncer-rr from the file you specify in the configuration:

routing_rules_py_module_file = /etc/pgbouncer-rr/routing_rules.py

The file should contain the following Python function:

def routing_rules(username, query):

The function parameters provide the username associated with the client, and a query string. The function return value must be a valid database key name (dbkey) as specified in the configuration file, or None. When a valid dbkey is returned by the routing function, the client connection will be routed to a connection in the specified server connection pool. When None is returned by the routing function, the client remains routed to its current server connection.

The route function is called only for query and prepare packets, with the following restrictions:

All queries must run wholly on the assigned server. Cross-server joins do not work.

Ideally, queries should auto-commit each statement. Set pool_mode = statement in the configuration.

Multi-statement transactions work correctly only if statements are not rerouted by the routing_rules function to a different server pool mid-transaction. Set pool_mode = transaction in the configuration.

If your application uses database catalog tables to discover the schema, then the routing_rules function should direct catalog table queries to a database server that has all the relevant schema objects created.

Simple query routing example

Amazon Redshift cluster 1 has data in table ‘tablea’. Amazon Redshift cluster 2 has data in table ‘tableb’. You want a client to be able to issue queries against either tablea or tableb without needing to know which table resides on which cluster.

Create a (default) entry with a key, say, ‘dev’ in the [databases] section of the pgbouncer configuration. This entry determines the default cluster used for client connections to database ‘dev’. You can make either redshift1 or redshift2 the default, or even specify a third ‘default’ cluster. Create additional entries in the pgbouncer [databases] section for each cluster; give these unique key names such as ‘dev.1’, and ‘dev.2’.

[databases]
dev = host=<default-cluster-endpoint> port=5439 dbname=dev
dev.1 = host=<redshift1-cluster-endpoint> port=5439 dbname=dev
dev.2 = host=<redshift2-cluster-endpoint> port=5439 dbname=dev

Ensure that the configuration file setting routing_rules_py_module_file specifies the path to your Python routing function file, such as ~/routing_rules.py. The code in the file could look like the following:

def routing_rules(username, query):
    if "tablea" in query:
        return "dev.1"
    elif "tableb" in query:
        return "dev.2"
    else:
        return None

This is a toy example, but it illustrates the concept. If a client sends the query SELECT * FROM tablea, it matches the first rule and is assigned to server pool ‘dev.1’ (redshift1). If a client (and it could be the same client in the same session) sends the query SELECT * FROM tableb, it matches the second rule and is assigned to server pool ‘dev.2’ (redshift2). Any query that does not match either rule results in None being returned, and the server connection remains unchanged.

Below is an alternative function for the same use case, but the routing logic is defined in a separate, extensible data structure that uses regular expressions to find the table matches. The routing table structure could easily be externalized in a DynamoDB table (a sketch of this follows the function below).

# ROUTING TABLE
# ensure that all dbkey values are defined in the [databases] section of the pgbouncer ini file
routingtable = {
    'route': [{
        'usernameRegex': '.*',
        'queryRegex': '.*tablea.*',
        'dbkey': 'dev.1'
    }, {
        'usernameRegex': '.*',
        'queryRegex': '.*tableb.*',
        'dbkey': 'dev.2'
    }],
    'default': None
}

# ROUTING FN – CALLED FROM PGBOUNCER-RR – DO NOT CHANGE NAME
# IMPLEMENTS REGEX RULES DEFINED IN ROUTINGTABLE OBJECT
# RETURNS FIRST MATCH FOUND
import re
def routing_rules(username, query):
    for route in routingtable['route']:
        u = re.compile(route['usernameRegex'])
        q = re.compile(route['queryRegex'])
        if u.search(username) and q.search(query):
            return route['dbkey']
    return routingtable['default']
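As mentioned above, the routing table could be externalized in DynamoDB. A minimal sketch of loading it is shown below; the table name and attribute layout are assumptions, and boto3 is used here for illustration (the post refers to the older boto module).

import boto3

def load_routing_table(table_name="pgbouncer_rr_routes"):
    # Hypothetical: each DynamoDB item carries usernameRegex, queryRegex, and
    # dbkey attributes, mirroring the in-module routingtable structure above.
    table = boto3.resource("dynamodb").Table(table_name)
    routes = table.scan()["Items"]
    return {"route": routes, "default": None}

# Example: routingtable = load_routing_table()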

You will likely want to implement more robust and sophisticated rules, taking care to avoid unintended matches. Write test cases to call your function with different inputs and validate the output dbkey values.
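A minimal sketch of such tests, written against the regex routing table above (plain asserts; extend with your own real-world queries):

if __name__ == "__main__":
    # smoke tests for routing_rules(); expected dbkeys match the routing table above
    assert routing_rules("alice", "SELECT * FROM tablea") == "dev.1"
    assert routing_rules("bob", "SELECT count(*) FROM tableb") == "dev.2"
    assert routing_rules("carol", "SELECT 1") is None   # falls through to the default
    print("routing_rules tests passed")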

Query Rewrite

The rewrite feature provides you with the opportunity to manipulate application queries en route to the server without modifying application code. You might want to do this to:

Optimize an incoming query to use the best physical tables when you have replicated tables with alternative sort/dist keys and column subsets (emulate projections), or when you have stored pre-joined or pre-aggregated data (emulate ‘materialized views’).

Apply query filters to support row-level data partitioning/security.

Roll out new schemas, resolve naming conflicts, and so on, by changing identifier names on the fly.

The rewrite function is also implemented in a fully configurable Python function, dynamically loaded from an external module specified in the configuration: 

rewrite_query_py_module_file = /etc/pgbouncer-rr/rewrite_query.py

The file should contain the following Python function:

def rewrite_query(username, query):

The function parameters provide the username associated with the client, and a query string. The function return value must be a valid SQL query string which returns the result set that you want the client application to receive.

Implementing a query rewrite function is straightforward when the incoming application queries have fixed formats that are easily detectable and easily manipulated, perhaps using regular expression search/replace logic in the Python function. It is much more challenging to build a robust rewrite function to handle SQL statements with arbitrary format and complexity.

Enabling the query rewrite function triggers pgbouncer-rr to enforce that a complete query is contained in the incoming client socket buffer. Long queries are often split across multiple network packets. They should all be in the buffer before the rewrite function is called, which requires that the buffer size be large enough to accommodate the largest query. The default buffer size (2048) is likely too small, so specify a much larger size in the configuration: pkt_buf = 32768.

If a partially received query is detected, and there is room in the buffer for the remainder of the query, pgbouncer-rr waits for the remaining packets to be received before processing the query. If the buffer is not large enough for the incoming query, or if it is not large enough to hold the re-written query (which may be longer than the original), then the rewrite function will fail. By default, the failure is logged and the original query string will be passed to the server unchanged. You can force the client connection to terminate instead, by setting: rewrite_query_disconnect_on_failure = true.
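Putting those settings together, the relevant lines of the pgbouncer-rr ini file might look like the following sketch (assuming, as elsewhere in this post, that these settings live in the [pgbouncer] section; adjust pkt_buf to your largest expected query):

[pgbouncer]
pkt_buf = 32768
rewrite_query_py_module_file = /etc/pgbouncer-rr/rewrite_query.py
rewrite_query_disconnect_on_failure = true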

Simple query rewrite example

You have a star schema with a large fact table in Amazon Redshift (such as ‘sales’) with two related dimension tables (such as ‘store’ and ‘product’). You want to optimize equally for two different queries:

1> SELECT storename, SUM(total) FROM sales JOIN store USING (storeid)
GROUP BY storename ORDER BY storename
2> SELECT prodname, SUM(total) FROM sales JOIN product USING (productid)
GROUP BY prodname ORDER BY prodname

By experimenting, you have determined that the best possible solution is to have two additional tables, each optimized for one of the queries:

store_sales: store and sales tables denormalized, pre-aggregated by store, and sorted and distributed by store name

product_sales: product and sales tables denormalized, pre-aggregated by product, sorted and distributed by product name

So you implement the new tables, and take care of their population in your ETL processes. But you’d like to avoid directly exposing these new tables to your reporting or analytic client applications. This might be the best optimization today, but who knows what the future holds? Maybe you’ll come up with a better optimization later, or maybe Amazon Redshift will introduce cool new features that provide a simpler alternative.

So, you implement a pgbouncer-rr rewrite function to change the original queries on the fly. Ensure that the configuration file setting rewrite_query_py_module_file specifies the path to your Python function file, say ~/rewrite_query.py.

The code in the file could look like this:

import re
def rewrite_query(username, query):
    # regex patterns for the two original join queries (parentheses escaped
    # so they match literally)
    q1 = (r"SELECT storename, SUM\(total\) FROM sales JOIN store USING \(storeid\) "
          r"GROUP BY storename ORDER BY storename")
    q2 = (r"SELECT prodname, SUM\(total\) FROM sales JOIN product USING \(productid\) "
          r"GROUP BY prodname ORDER BY prodname")
    if re.match(q1, query):
        new_query = ("SELECT storename, SUM(total) FROM store_sales "
                     "GROUP BY storename ORDER BY storename;")
    elif re.match(q2, query):
        new_query = ("SELECT prodname, SUM(total) FROM product_sales "
                     "GROUP BY prodname ORDER BY prodname;")
    else:
        new_query = query
    return new_query

Again, this is a toy example to illustrate the concept. In any real application, your Python function needs to employ more robust query pattern matching and substitution.

Your reports and client applications use the same join query as before:

SELECT prodname, SUM(total) FROM sales JOIN product USING (productid) GROUP BY prodname ORDER BY prodname;

Now, when you look on the Amazon Redshift console Queries tab, you see that the query received by Amazon Redshift is the rewritten version that uses the new product_sales table, leveraging your pre-joined, pre-aggregated data and the targeted sort and dist keys:

SELECT prodname, SUM(total) FROM product_sales GROUP BY prodname ORDER BY prodname;

Getting Started

Here are the steps to start working with pgbouncer-rr.

Install

Download and install pgbouncer-rr by running the following commands (Amazon Linux/RHEL/CentOS):

# install required packages
sudo yum install libevent-devel openssl-devel git libtool python-devel -y

# download the latest pgbouncer distribution
git clone https://github.com/pgbouncer/pgbouncer.git

# download pgbouncer-rr extensions
git clone https://github.com/awslabs/pgbouncer-rr-patch.git

# merge pgbouncer-rr extensions into pgbouncer code
cd pgbouncer-rr-patch
./install-pgbouncer-rr-patch.sh ../pgbouncer

# build and install
cd ../pgbouncer
git submodule init
git submodule update
./autogen.sh
./configure …
make
sudo make install

Configure

Create a configuration file, using ./pgbouncer-example.ini as a starting point, adding your own database connections and Python routing rules and rewrite query functions.

Set up user authentication; for more information, see authentication file format.

NOTE: The recently added pgbouncer auth_query feature does not work with Amazon Redshift.

By default, pgbouncer-rr does not support SSL/TLS connections. However, you can experiment with pgbouncer’s newest TLS/SSL feature. Just add a private key and certificate to your pgbouncer-rr configuration:

client_tls_sslmode = allow
client_tls_key_file = ./pgbouncer-rr-key.key
client_tls_cert_file = ./pgbouncer-rr-key.crt

Hint: Here’s how to easily generate a test key with a self-signed certificate using openssl:

openssl req -newkey rsa:2048 -nodes -keyout pgbouncer-rr-key.key -x509 -days 365 -out pgbouncer-rr-key.crt
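To confirm that TLS is actually negotiated, you can point psql at pgbouncer-rr and require SSL in the connection string; the hostname below is the same placeholder used later in this post:

psql "host=pgbouncer-dnshostname port=5439 dbname=dev user=dbuser sslmode=require"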

Configure a firewall

Configure your Linux firewall to enable incoming connections on the configured pgbouncer-rr listening port. For example:

sudo firewall-cmd --zone=public --add-port=5439/tcp --permanent
sudo firewall-cmd --reload

If you are running pgbouncer-rr on an Amazon EC2 instance, the instance security group must also be configured to allow incoming TCP connections on the listening port.
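If you manage the security group with the AWS CLI, a rule like the following opens the listening port; the group ID and CIDR range are placeholders for your own values:

# allow inbound TCP 5439 from your client network (placeholder values)
aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol tcp \
    --port 5439 \
    --cidr 203.0.113.0/24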

Launch

Run pgbouncer-rr as a daemon using the command line pgbouncer <config_file> -d. See pgbouncer --help for command-line options. Hint: use -v -v to enable verbose logging. If you look carefully in the logfile, you will see evidence of the query routing and query rewrite features in action.
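For example, assuming the configuration file and logfile paths from the sketch above:

# start pgbouncer-rr in the background with extra-verbose logging
pgbouncer -d -v -v /etc/pgbouncer-rr/pgbouncer-rr.ini

# watch routing and rewrite decisions as queries arrive
tail -f /var/log/pgbouncer-rr.log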

Connect

Configure your client application as though you were connecting directly to an Amazon Redshift or PostgreSQL database, but be sure to use the pgbouncer-rr hostname and listening port.

Here’s an example using psql:

psql -h pgbouncer-dnshostname -U dbuser -d dev -p 5439

Here’s another example using a JDBC driver URL (Amazon Redshift driver):

jdbc:redshift://pgbouncer-dnshostname:5439/dev

Other uses for pgbouncer-rr

It can be used for lots of things, really. In addition to the examples shown above, here are some other use cases suggested by colleagues:

Serve small or repetitive queries from PostgreSQL tables consisting of aggregated results.

Parse SQL for job-tracking table names to implement workload management with job-tracking tables on PostgreSQL, which simplifies application development.

Leverage multiple Amazon Redshift clusters to serve dashboarding workloads with heavy concurrency requirements.

Determine the appropriate route based on the current workload or state of cluster resources (for example, always route to the cluster with the fewest running queries).

Use query rewrite to parse SQL for queries that do not leverage the nuances of Amazon Redshift query design or query best practices; either block these queries or rewrite them for performance.

Use SQL parsing to limit end users’ ability to run ad hoc queries against certain tables; for example, identify users scanning N+ years of data and respond with a rewritten query that blocks them, such as: SELECT 'WARNING: scans against v_all_sales must be limited to no more than 30 days' AS alert;

Use SQL parse to identify and rewrite queries which filter on certain criteria to direct them towards a specific table containing data matching that filter.

Actually, your use cases don’t need to be limited to just routing and query rewriting. You could design a routing function that leaves the route unchanged, but which instead implements purposeful side effects (a minimal sketch follows this list), such as:

Publishing custom CloudWatch metrics, enabling you to monitor specific query patterns and/or user interactions with your databases.

Capturing SQL DDL and INSERT/UPDATE statements and wrapping them into Amazon Kinesis put-records as input to the method described in Erik Swensson’s excellent post, Building Multi-AZ or Multi-Region Amazon Redshift Clusters.
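Here is a minimal sketch of the first idea: a routing function that publishes a per-user query counter to CloudWatch and never changes the route. The use of boto3, the metric namespace, and the dimension name are illustrative choices, not part of pgbouncer-rr.

import boto3

cloudwatch = boto3.client('cloudwatch')

def routing_rules(username, query):
    # side effect only: count queries per user in a custom CloudWatch metric
    try:
        cloudwatch.put_metric_data(
            Namespace='pgbouncer-rr',
            MetricData=[{
                'MetricName': 'QueriesReceived',
                'Dimensions': [{'Name': 'User', 'Value': username}],
                'Value': 1.0,
                'Unit': 'Count',
            }])
    except Exception:
        # never let monitoring failures interfere with query handling
        pass
    return None  # leave the client on its current server connection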

We’d love to hear your thoughts and ideas for pgbouncer-rr functions. If you have questions or suggestions, please leave a comment below.

Copyright 2015-2015 Amazon.com, Inc. or its affiliates. All Rights Reserved.

———————————————

Related:

Top 10 Performance Tuning Techniques for Amazon Redshift

 

 


OVH Sponsors Let’s Encrypt

Post Syndicated from Let's Encrypt - Free SSL/TLS Certificates original https://letsencrypt.org//2015/12/21/ovh-sponsorship.html

We’re pleased to announce that OVH has become a Platinum sponsor of Let’s Encrypt.

According to OVH CTO and Founder Octave Klaba, “OVH is delighted to become a Platinum sponsor. With Let’s Encrypt, OVH will be able to set a new standard for security by offering end-to-end encrypted communications by default to all its communities.”

The Web is an increasingly integral part of our daily lives, and encryption by default is critical in order to provide the degree of security and privacy that people expect. Let’s Encrypt’s mission is to encrypt the Web and our sponsors make pursuing that mission possible.

OVH’s sponsorship will help us to pay for staff and other operation costs in 2016.

If your company or organization would like to sponsor Let’s Encrypt, please email us at sponsor@letsencrypt.org.

What’s New in AWS Key Management Service: AWS CloudFormation Support and Integration with More AWS Services

Post Syndicated from Sreekumar Pisharody original https://blogs.aws.amazon.com/security/post/TxHY6YJA60MTUL/What-s-New-in-AWS-Key-Management-Service-AWS-CloudFormation-Support-and-Integrat

We’re happy to make two announcements about what’s new in AWS Key Management Service (KMS).

First, AWS CloudFormation has added a template for KMS that lets you quickly create KMS customer master keys (CMK) and set their properties. Starting today, you can use the AWS::KMS::Key resource to create a CMK in KMS. To get started, you can use AWS CloudFormation Designer to drag and drop a KMS key resource type into your template.

To learn more about using KMS with CloudFormation, see the “AWS::KMS::Key” section of the AWS CloudFormation User Guide.
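For reference, a minimal template fragment might look like the following sketch. The logical resource name is arbitrary, and the key policy shown simply grants the account root user full control; you would normally scope this to your own key administrators and key users.

{
  "Resources": {
    "MyEncryptionKey": {
      "Type": "AWS::KMS::Key",
      "Properties": {
        "Description": "Example CMK created by CloudFormation",
        "KeyPolicy": {
          "Version": "2012-10-17",
          "Statement": [{
            "Sid": "AllowAccountRootFullAccess",
            "Effect": "Allow",
            "Principal": {"AWS": {"Fn::Join": ["", ["arn:aws:iam::", {"Ref": "AWS::AccountId"}, ":root"]]}},
            "Action": "kms:*",
            "Resource": "*"
          }]
        }
      }
    }
  }
}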

Second, AWS Import/Export Snowball, AWS CloudTrail, Amazon SES, Amazon WorkSpaces, and Amazon Kinesis Firehose now support encryption of data within those services using keys in KMS. As with other KMS-integrated services, you can use CloudTrail to audit the use of your KMS key to encrypt or decrypt your data in SES, Amazon WorkSpaces, CloudTrail, Import/Export Snowball, and Amazon Kinesis Firehose. To see the complete list of AWS services integrated with KMS, see KMS Product Details. For more details about how these services encrypt your data with KMS, see the How AWS Services Use AWS KMS documentation pages.

If you have questions or comments, please add them in the “Comments” section below or on the KMS forum.

– Sree

Migrating Metadata when Encrypting an Amazon Redshift Cluster

Post Syndicated from John Loughlin original https://blogs.aws.amazon.com/bigdata/post/Tx3L6LQQ1Q6XXTK/Migrating-Metadata-when-Encrypting-an-Amazon-Redshift-Cluster

John Loughlin is a Solutions Architect with Amazon Web Services

A customer came to us asking for help expanding and modifying their Amazon Redshift cluster. In the course of responding to their request, we made use of several tools available in the AWSLabs GitHub repository. What follows is an account of how you can use some of these tools as we did (this is not intended to be an exhaustive description of the content of that library).

The customer is acquiring another manufacturing company that is only slightly smaller than they are. Each has a BI infrastructure and they believe consolidating platforms would lower expenses and simplify operations. They want to move the acquired organization’s warehouse into the existing Amazon Redshift cluster, but with a new requirement. Because of the nature of some of the projects the acquired company has, they have a contractual obligation to encrypt data.

Amazon Redshift supports the encryption of data at rest, in the database and the associated snapshots. To enable this, encryption must be selected when the database is created. To encrypt a database after it has been created, it is necessary to stand up a new database and move the content from the unencrypted cluster to the new cluster where it will be encrypted.

Moving the contents of your application’s data tables is straightforward, as Amazon Redshift provides an UNLOAD feature for this purpose.

To determine the tables to UNLOAD, consider running a query such as the following:

SELECT tablename FROM pg_tables WHERE schemaname = 'public';

Note that the list of schema names should be extended to reflect where you have created objects in your cluster. Running UNLOAD from the source cluster and COPY in the new one migrates application data. Simple enough.

UNLOAD ('SELECT * FROM sample_table') TO 's3://mybucket/sample/sample_Table_' credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>' manifest;

This command splits the results of a SELECT statement across a set of files, one or more files per node slice, to simplify parallel reloading of the data. It also creates a manifest file that can be used to ensure that the COPY command loads all of the required files, and only the required files, into the encrypted cluster. Using a manifest file on COPY is a recommended practice.
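On the new, encrypted cluster, the corresponding load references the manifest file that UNLOAD produced. A sketch, using the same placeholder bucket and credentials as above:

COPY sample_table
FROM 's3://mybucket/sample/sample_Table_manifest'
CREDENTIALS 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>'
MANIFEST;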

You can make this process simpler still by using the Amazon Redshift Unload/Copy utility. This tool exports data from a source cluster to a location on S3 and encrypts the data with the Amazon Key Management Service (Amazon KMS). It can also import the data into another Amazon Redshift cluster and clean up S3.

For many applications, the Amazon Redshift cluster contains more than just application data. Amazon Redshift supports the creation of database users, creation of groups, and assignment of privileges to both groups and users. Re-creating these accurately could be error-prone unless everything was created using scripts and every script source code–controlled. Fortunately, it is easy to create scripts that migrate this information directly from your source cluster and which can be run in the encrypted cluster to replicate the data that you require.

The best place to start before actually creating scripts is the AWSLABS GitHub repository. In the AdminViews directory, there are already several useful scripts. You can generate the DDL for schemas and tables and views. You can also get lists of schema, view, and table privileges by user and see the groups that a user belongs to. All this is useful information, but you want to generate SQL statements in your source database to run in your new, encrypted database.

You can pull a list of users from the pg_user table as follows: 

SELECT 'CREATE USER '|| usename || ';' FROM pg_user WHERE usename <> 'rdsdb';

Produces:

CREATE USER acctowner;
CREATE USER mrexample;
CREATE USER counterexample;
CREATE USER mrspremise;
CREATE USER mrsconclusion;

You should assign passwords to the accounts you create. There is no way to extract the existing passwords from the source database so it is necessary to create new ones.
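One way to do this is to run an ALTER USER statement for each account after it is created; the password below is just a placeholder that satisfies Amazon Redshift's complexity rules (at least one uppercase letter, one lowercase letter, and one digit):

ALTER USER mrexample PASSWORD 'Replace-me-1';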

Download the code from GitHub, expand the src directory, and find the scripts in the AdminViews directory. Create a schema called admin in your Amazon Redshift cluster and run each of the scripts starting with v_ to create the views. The resulting views can then be accessed in a SQL statement as follows:

SELECT * FROM admin.v_generate_schema_ddl;

Schema name: Admin
ddl: Create schema admin

Run the v_generate_group_DDL.SQL script to create the groups in the new database:

SELECT 'CREATE GROUP '|| groname ||';' FROM pg_group;

Produces:

CREATE GROUP chosen;
CREATE GROUP readonly;

Users belong to groups and you can capture these associations using the v_get_users_in_group script:

SELECT 'ALTER GROUP ' ||groname||' ADD USER '||usename||';' FROM admin.v_get_users_in_group;

Produces:

ALTER GROUP chosen ADD USER mrsconclusion;
ALTER GROUP readonly ADD USER mrexample;
ALTER GROUP readonly ADD USER mrspremise;

Schema, view, and table DDL can be generated directly from the appropriate scripts:

v_generate_schema_DDL.SQL, v_generate_table_DDL.SQL, and v_generate_view_DDL.SQL

You need to set appropriate privileges on the schemas in the new database and there is a script that you can use to capture the relevant information in the existing database:

SELECT * FROM admin.v_get_schema_priv_by_user
WHERE schemaname NOT LIKE 'pg%'
AND schemaname <> 'information_schema'
AND usename <> 'johnlou'
AND usename <> 'rdsdb';

Here you see multiple different permissions granted to each user who has been granted privileges on a schema. To generate SQL to run against the new database, you can use a user-defined function (UDF) to create a string of privileges for each row in the result set. One way of building this function is as follows:

create function
f_schema_priv_granted(cre boolean, usg boolean) returns varchar
STABLE
AS $$
priv = ''
if cre:
    priv = str('create')
if usg:
    priv = priv + str(', usage')
return priv
$$ LANGUAGE plpythonu;

The f_schema_priv_granted function returns a string of concatenated permissions. Run this in a query to generate SQL containing GRANT statements:

SELECT 'GRANT '|| f_schema_priv_granted(cre, usg) ||' ON schema '|| schemaname || ' TO ' || usename || ';'
FROM admin.v_get_schema_priv_by_user
WHERE schemaname NOT LIKE 'pg%'
AND schemaname <> 'information_schema'
AND usename <> 'rdsdb';

Produces:

GRANT CREATE, USAGE ON schema public TO mrexample;
GRANT CREATE, USAGE ON schema public TO counterexample;
GRANT CREATE, USAGE ON schema public TO mrspremise;
GRANT CREATE, USAGE ON schema public TO mrsconclusion;

Alternatively, if you prefer CASE statements to UDFs or are not comfortable with python, you can write something similar to the following:

SELECT 'grant '|| concat(CASE WHEN cre is true THEN 'create' ELSE ' ' END,
CASE WHEN usg is true THEN ', usage' ELSE ' ' END)
|| ' ON schema '|| schemaname || ' TO ' || usename || ';'
FROM admin.v_get_schema_priv_by_user
WHERE schemaname NOT LIKE 'pg%'
AND schemaname <> 'information_schema'
AND schemaname <> 'public'
AND usg = 'true';

Similarly, a UDF can be used to create a string of permissions used in a GRANT statement on each view and table. There is a wider range of privileges: SELECT, INSERT, UPDATE, DELETE, and REFERENCES. The UDF looks like the following:

create function
f_table_priv_granted(sel boolean, ins boolean, upd boolean, delc boolean, ref boolean) returns varchar
STABLE
AS $$
priv = ''
if sel:
    priv = str('select')
if ins:
    priv = priv + str(', insert')
if upd:
    priv = priv + str(', update')
if delc:
    priv = priv + str(', delete')
if ref:
    priv = priv + str(', references ')
return priv
$$ LANGUAGE plpythonu;

Note that in the function, the fourth argument does not match the column in the views. Python objects to the use of ‘del’ as it is a reserved word. Also note that you can construct a SQL statement with the same function using CASE statements if you prefer not to use the UDF.

You can generate privileges for tables with the following query:

SELECT 'GRANT '|| f_table_priv_granted(sel, ins, upd, del, ref) || ' ON '||
schemaname||'.'||tablename ||' TO '|| usename || ';' FROM admin.v_get_tbl_priv_by_user
WHERE schemaname NOT LIKE 'pg%'
AND schemaname <> 'information_schema'
AND usename <> 'rdsdb';

Produces:

GRANT SELECT ON public.old_sample TO mrexample;
GRANT SELECT ON public.old_sample TO mrspremise;
GRANT SELECT, INSERT, UPDATE, DELETE, REFERENCES ON public.old_sample TO mrsconclusion;
GRANT SELECT, INSERT, UPDATE, DELETE, REFERENCES ON public.sample TO mrexample;
GRANT SELECT ON public.sample TO mrspremise;

Similarly, run the following query for privileges on views:

SELECT 'GRANT '|| f_table_priv_granted(sel, ins, upd, del, ref) || ' ON '||
schemaname||'.'||tablename ||' TO '|| usename || ';' FROM admin.v_get_view_priv_by_user
WHERE schemaname NOT LIKE 'pg%'
AND schemaname <> 'information_schema'
AND usename <> 'rdsdb';

Produces:

GRANT SELECT, INSERT, UPDATE, DELETE, REFERENCES ON public.loadview TO johnlou;
GRANT SELECT, INSERT, UPDATE, DELETE, REFERENCES ON public.my_sample_view TO johnlou;

The scripts from the repository make migrating metadata to the new, encrypted cluster easier. Having moved the tables from the acquired company’s warehouse into a separate schema in Amazon Redshift, there are several other scripts that also proved useful.

The table_info.sql script shows the pct_unsorted and pct_stats_off columns to indicate the degree of urgency for running the vacuum and analyze processes.

The table_inspector.sql script is useful in validating that distribution keys chosen for the migrated tables are likely to be effective. The results include pct_skew_across_slices, the percentage of data distribution skew, and pct_slices_populated. Problematic tables are those with either a large value in the pct_skew_across_slices column or a low value in the pct_slices_populated column.

Summary

In this post, you saw examples of extending existing scripts to generate SQL that can be used to define users and groups in a new database, two examples of using the UDF feature to generate lists of privileges for various objects, sample queries to generate SQL to make this easier to do, and two scripts that help validate that new tables perform well.

Hopefully, these scripts can simplify work in your environment and suggest ways you can extend the existing scripts for more custom processing on Amazon Redshift clusters that are appropriate to your uses.

If you have questions or suggestions, please leave a comment below.

—————————–

Related:

Best Practices for Micro-Batch Loading on Amazon Redshift


Facebook Sponsors Let’s Encrypt

Post Syndicated from Let's Encrypt - Free SSL/TLS Certificates original https://letsencrypt.org//2015/12/03/facebook-sponsorship.html

We’re happy to share today that Facebook is the newest Gold sponsor of Let’s Encrypt. Facebook has taken multiple important steps to support and advance encryption this year, and we are glad to see Let’s Encrypt as the latest example.

According to Alex Stamos, Chief Security Officer at Facebook, “Making it easier for websites to deploy HTTPS encryption is an important step in improving the security of the whole internet, and Facebook is proud to support this effort.”

Facebook’s sponsorship will help us produce a greater impact as we open up our public beta today and usher in more participants over the coming months.

If your company or organization would like to sponsor Let’s Encrypt, please email us at sponsor@letsencrypt.org.

Big Data AWS Training Course Gets Big Update

Post Syndicated from Michael Stroh original https://blogs.aws.amazon.com/bigdata/post/Tx3FR6JXY0HVTS3/Big-Data-AWS-Training-Course-Gets-Big-Update

Michael Stroh is Communications Manager for AWS Training & Certification

AWS offers a number of in-depth technical training courses, which we’re regularly updating in response to student feedback and changes to the AWS platform. Today I want to tell you about some exciting changes to Big Data on AWS, our most comprehensive training course on the AWS big data platform.

The 3-day class is primarily aimed at data scientists, analysts, solutions architects, and anybody else who wants to use AWS to handle their big data workloads. The course teaches you how to leverage Amazon Elastic MapReduce (EMR), Amazon Redshift, Amazon Kinesis, Amazon DynamoDB and the rest of the AWS big data platform (as well as several popular third-party tools) to get useful answers from your data at a speed and cost that suits your needs.

What’s new

So what’s different? For starters, the course was completely reorganized to talk about the AWS big data platform like a story—from data ingestion, to storage, to visualization—and make it easier to follow.

Customers also said they really wanted to hear more about Amazon Redshift and understand the differences between Amazon Redshift and Amazon EMR—where these services overlap, and where they’re different. So the new version of the course adds about 150% more Amazon Redshift-related content, including course modules on cluster architecture and optimization, and concepts critical to understanding Amazon Redshift, such as data warehousing and columnar data storage.

Also in response to customer feedback, the AWS Training team beefed up coverage of Hadoop programming frameworks, especially for Hive, Presto, Pig, and Spark. The Spark module, for example, now includes details on MLlib, Spark Streaming, and GraphX. There’s also a new course module on Hue, the popular Hadoop web interface, and a new hands-on lab on running Hue on Amazon EMR.

Course updates were also a response to the fast evolving big data platform. So we added coverage of AWS Import/Export Snowball, Amazon Kinesis Firehose, and AWS QuickSight—the data ingestion, streaming, and visualization services (respectively) announced at re:Invent 2015.

Other notable highlights of the revised course include:

More explanation of how Amazon Kinesis and Amazon Kinesis Streams work.

More focus on three different types of server-side encryption of data stored in Amazon S3 (SSE-C, SSE-S3, and SSE-KMS).

New hands-on lab featuring TIBCO Spotfire, the popular visualization and analytics tool.

Additional reference architectures and patterns for creating and hosting big data environments on AWS.

The revised course also includes new or improved case studies of The Weather Channel, Nasdaq, Netflix, AdRoll, and Kaiten Sushiro, a conveyor-belt sushi chain that uses Amazon Kinesis and Amazon Redshift to help decide in real time which plates chefs should make next.

Taking the class

That’s just a sampling of the changes. To learn more, check out the course description for Big Data on AWS. Here’s a global list of upcoming classes.

If you’re thinking about taking the course, you should already have a basic familiarity with Apache Hadoop, SQL, MapReduce, and other common big data technologies and concepts—plus a working knowledge of core AWS services. Still ramping up? We recommend taking Big Data Technology Fundamentals and AWS Technical Essentials first.

If you have questions or suggestions, please leave a comment below.


Anti Evil Maid 2 Turbo Edition

Post Syndicated from Matthew Garrett original http://mjg59.dreamwidth.org/35742.html

The Evil Maid attack has been discussed for some time – in short, it’s the idea that most security mechanisms on your laptop can be subverted if an attacker is able to gain physical access to your system (for instance, by pretending to be the maid in a hotel). Most disk encryption systems will fall prey to the attacker replacing the initial boot code of your system with something that records and then exfiltrates your decryption passphrase the next time you type it, at which point the attacker can simply steal your laptop the next day and get hold of all your data.

There are a couple of ways to protect against this, and they both involve the TPM. Trusted Platform Modules are small cryptographic devices on the system motherboard[1]. They have a bunch of Platform Configuration Registers (PCRs) that are cleared on power cycle but otherwise have slightly strange write semantics – attempting to write a new value to a PCR will append the new value to the existing value, take the SHA-1 of that and then store this SHA-1 in the register. During a normal boot, each stage of the boot process will take a SHA-1 of the next stage of the boot process and push that into the TPM, a process called “measurement”. Each component is measured into a separate PCR – PCR0 contains the SHA-1 of the firmware itself, PCR1 contains the SHA-1 of the firmware configuration, PCR2 contains the SHA-1 of any option ROMs, PCR5 contains the SHA-1 of the bootloader and so on.

If any component is modified, the previous component will come up with a different measurement and the PCR value will be different. Because you can’t directly modify PCR values[2], this modified code will only be able to set the PCR back to the “correct” value if it’s able to generate a sequence of writes that will hash back to that value. SHA-1 isn’t yet sufficiently broken for that to be practical, so we can probably ignore that. The neat bit here is that you can then use the TPM to encrypt small quantities of data[3] and ask it to only decrypt that data if the PCR values match. If you change the PCR values (by modifying the firmware, bootloader, kernel and so on), the TPM will refuse to decrypt the material.

Bitlocker uses this to encrypt the disk encryption key with the TPM. If the boot process has been tampered with, the TPM will refuse to hand over the key and your disk remains encrypted. This is an effective technical mechanism for protecting against people taking images of your hard drive, but it does have one fairly significant issue – in the default mode, your disk is decrypted automatically. You can add a password, but the obvious attack is then to modify the boot process such that a fake password prompt is presented and the malware exfiltrates the data. The TPM won’t hand over the secret, so the malware flashes up a message saying that the system must be rebooted in order to finish installing updates, removes itself and leaves anyone except the most paranoid of users with the impression that nothing bad just happened. It’s an improvement over the state of the art, but it’s not a perfect one.

Joanna Rutkowska came up with the idea of Anti Evil Maid. This can take two slightly different forms. In both, a secret phrase is generated and encrypted with the TPM. In the first form, this is then stored on a USB stick. If the user suspects that their system has been tampered with, they boot from the USB stick. If the PCR values are good, the secret will be successfully decrypted and printed on the screen. The user verifies that the secret phrase is correct and reboots, satisfied that their system hasn’t been tampered with. The downside to this approach is that most boots will not perform this verification, and so you rely on the user being able to make a reasonable judgement about whether it’s necessary on a specific boot.

The second approach is to do this on every boot. The obvious problem here is that in this case an attacker simply boots your system, copies down the secret, modifies your system and simply prints the correct secret. To avoid this, the TPM can have a password set. If the user fails to enter the correct password, the TPM will refuse to decrypt the data. This can be attacked in a similar way to Bitlocker, but can be avoided with sufficient training: if the system reboots without the user seeing the secret, the user must assume that their system has been compromised and that an attacker now has a copy of their TPM password.

This isn’t entirely great from a usability perspective. I think I’ve come up with something slightly nicer, and certainly more Web 2.0[4]. Anti Evil Maid relies on having a static secret because expecting a user to remember a dynamic one is pretty unreasonable. But most security conscious people rely on dynamic secret generation daily – it’s the basis of most two factor authentication systems. TOTP is an algorithm that takes a seed, the time of day and some reasonably clever calculations and comes up with (usually) a six digit number. The secret is known by the device that you’re authenticating against, and also by some other device that you possess (typically a phone). You type in the value that your phone gives you, the remote site confirms that it’s the value it expected and you’ve just proven that you possess the secret. Because the secret depends on the time of day, someone copying that value won’t be able to use it later.

But instead of using your phone to identify yourself to a remote computer, we can use the same technique to ensure that your computer possesses the same secret as your phone. If the PCR states are valid, the computer will be able to decrypt the TOTP secret and calculate the current value. This can then be printed on the screen and the user can compare it against their phone. If the values match, the PCR values are valid. If not, the system has been compromised. Because the value changes over time, merely booting your computer gives your attacker nothing – printing an old value won’t fool the user[5]. This allows verification to be a normal part of every boot, without forcing the user to type in an additional password.

I’ve written a prototype implementation of this and uploaded it here. Do pay attention to the list of limitations – without a bootloader that measures your kernel and initrd, you’re still open to compromise. Adding TPM support to grub is on my list of things to do. There are also various potential issues like an attacker being able to use external DMA-capable devices to obtain the secret, especially since most Linux distributions still ship kernels that don’t enable the IOMMU by default. And, of course, if your firmware is inherently untrustworthy there’s multiple ways it can subvert this all. So treat this very much like a research project rather than something you can depend on right now. There’s a fair amount of work to do to turn this into a meaningful improvement in security.

[1] I wrote about them in more detail here, including a discussion of whether they can be used for general purpose DRM (answer: not really)
[2] In theory, anyway. In practice, TPMs are embedded devices running their own firmware, so who knows what bugs they’re hiding.
[3] On the order of 128 bytes or so. If you want to encrypt larger things with a TPM, the usual way to do it is to generate an AES key, encrypt your material with that and then encrypt the AES key with the TPM.
[4] Is that even a thing these days? What do we say instead?
[5] Assuming that the user is sufficiently diligent in checking the value, anyway
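To make the TOTP step concrete, here is a minimal standalone sketch of the calculation described above (RFC 6238 with HMAC-SHA-1 and a 30 second step). It is not taken from the prototype linked above: the secret below is a hard-coded placeholder, whereas in the scheme described here the secret would only be unsealed by the TPM if the PCR values are correct. It builds against OpenSSL's libcrypto (gcc totp.c -lcrypto).

#include <stdio.h>
#include <stdint.h>
#include <time.h>
#include <openssl/evp.h>
#include <openssl/hmac.h>

int main(void) {
        /* Placeholder secret: a real deployment would unseal this from the TPM,
         * bound to the expected PCR values, rather than hard-code it. */
        const unsigned char secret[] = "12345678901234567890";

        /* 30 second time step, as used by most TOTP deployments */
        uint64_t counter = (uint64_t) time(NULL) / 30;

        /* RFC 4226/6238: the counter is hashed as 8 big-endian bytes */
        unsigned char msg[8];
        for (int i = 7; i >= 0; i--) {
                msg[i] = counter & 0xff;
                counter >>= 8;
        }

        unsigned char digest[EVP_MAX_MD_SIZE];
        unsigned int digest_len = 0;
        HMAC(EVP_sha1(), secret, sizeof(secret) - 1, msg, sizeof(msg), digest, &digest_len);

        /* Dynamic truncation down to the usual six digit code */
        unsigned offset = digest[digest_len - 1] & 0x0f;
        uint32_t code = ((uint32_t) (digest[offset] & 0x7f) << 24) |
                        ((uint32_t) digest[offset + 1] << 16) |
                        ((uint32_t) digest[offset + 2] << 8) |
                        (uint32_t) digest[offset + 3];

        printf("%06u\n", (unsigned) (code % 1000000));
        return 0;
}

Run with the same secret on the laptop and on the phone, the two six digit values should match within the same 30 second window, which is exactly the comparison the boot-time check relies on.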

The new sd-bus API of systemd

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/the-new-sd-bus-api-of-systemd.html

With the new v221 release of systemd we are declaring the sd-bus API shipped with systemd stable. sd-bus is our minimal D-Bus IPC C library, supporting as back-ends both classic socket-based D-Bus and kdbus. The library has been part of systemd for a while, but has only been used internally, since we wanted to have the liberty to still make API changes without affecting external consumers of the library. However, now we are confident to commit to a stable API for it, starting with v221.
In this blog story I hope to provide you with a quick overview of sd-bus, a short reiteration of D-Bus and its concepts, as well as a few simple examples of how to write D-Bus clients and services with it.
What is D-Bus again?
Let’s start with a quick reminder what
D-Bus actually is: it’s a
powerful, generic IPC system for Linux and other operating systems. It
knows concepts like buses, objects, interfaces, methods, signals,
properties. It provides you with fine-grained access control, a rich
type system, discoverability, introspection, monitoring, reliable
multicasting, service activation, file descriptor passing, and
more. There are bindings for numerous programming languages that are
used on Linux.
D-Bus has been a core component of Linux systems for more than 10
years. It is certainly the most widely established high-level local
IPC system on Linux. Since systemd’s inception it has been the IPC
system it exposes its interfaces on. And even before systemd, it was
the IPC system Upstart used to expose its interfaces. It is used by
GNOME, by KDE and by a variety of system components.
D-Bus refers to both a specification, and a reference implementation. The
reference implementation provides both a bus server component, as well
as a client library. While there are multiple other, popular
reimplementations of the client library – for both C and other
programming languages –, the only commonly used server side is the
one from the reference implementation. (However, the kdbus project is
working on providing an alternative to this server implementation as a
kernel component.)
D-Bus is mostly used as local IPC, on top of AF_UNIX sockets. However,
the protocol may be used on top of TCP/IP as well. It does not
natively support encryption, hence using D-Bus directly on TCP is
usually not a good idea. It is possible to combine D-Bus with a
transport like ssh in order to secure it. systemd uses this to make
many of its APIs accessible remotely.
A frequently asked question about D-Bus is why it exists at all,
given that AF_UNIX sockets and FIFOs already exist on UNIX and have
been used for a long time successfully. To answer this question let’s
make a comparison with popular web technology of today: what
AF_UNIX/FIFOs are to D-Bus, TCP is to HTTP/REST. While AF_UNIX
sockets/FIFOs only shovel raw bytes between processes, D-Bus defines
actual message encoding and adds concepts like method call
transactions, an object system, security mechanisms, multicasting and
more.
From our 10+ years of experience with D-Bus we know today that while there
are some areas where we can improve things (and we are working on
that, both with kdbus and sd-bus), it generally appears to be a very
well designed system, that stood the test of time, aged well and is
widely established. Today, if we’d sit down and design a completely
new IPC system incorporating all the experience and knowledge we
gained with D-Bus, I am sure the result would be very close to what
D-Bus already is.
Or in short: D-Bus is great. If you hack on a Linux project and need a
local IPC, it should be your first choice. Not only because D-Bus is
well designed, but also because there aren’t many alternatives that
can cover similar functionality.
Where does sd-bus fit in?
Let’s discuss why sd-bus exists, how it compares with the other
existing C D-Bus libraries and why it might be a library to consider
for your project.
For C, there are two established, popular D-Bus libraries: libdbus, as
it is shipped in the reference implementation of D-Bus, as well as
GDBus, a component of GLib, the low-level tool library of GNOME.
Of the two libdbus is the much older one, as it was written at the
time the specification was put together. The library was written with
a focus on being portable and to be useful as back-end for higher-level
language bindings. Both of these goals required the API to be very
generic, resulting in a relatively baroque, hard-to-use API that lacks
the bits that make it easy and fun to use from C. It provides the
building blocks, but few tools to actually make it straightforward to
build a house from them. On the other hand, the library is suitable
for most use-cases (for example, it is OOM-safe making it suitable for
writing lowest level system software), and is portable to operating
systems like Windows or more exotic UNIXes.
GDBus is a much newer implementation. It has been written after considerable
experience with using a GLib/GObject wrapper around libdbus. GDBus is
implemented from scratch, shares no code with libdbus. Its design
differs substantially from libdbus, it contains code generators to
make it specifically easy to expose GObject objects on the bus, or
talking to D-Bus objects as GObject objects. It translates D-Bus data
types to GVariant, which is GLib’s powerful data serialization
format. If you are used to GLib-style programming then you’ll feel
right at home, hacking D-Bus services and clients with it is a lot
simpler than using libdbus.
With sd-bus we now provide a third implementation, sharing no code
with either libdbus or GDBus. For us, the focus was on providing kind
of a middle ground between libdbus and GDBus: a low-level C library
that actually is fun to work with, that has enough syntactic sugar to
make it easy to write clients and services with, but on the other hand
is more low-level than GDBus/GLib/GObject/GVariant. To be able to use
it in systemd’s various system-level components it needed to be
OOM-safe and minimal. Another major point we wanted to focus on was
supporting a kdbus back-end right from the beginning, in addition to
the socket transport of the original D-Bus specification (“dbus1”). In
fact, we wanted to design the library closer to kdbus’ semantics than
to dbus1’s, wherever they are different, but still cover both
transports nicely. In contrast to libdbus or GDBus portability is not
a priority for sd-bus, instead we try to make the best of the Linux
platform and expose specific Linux concepts wherever that is
beneficial. Finally, performance was also an issue (though a secondary
one): neither libdbus nor GDBus will win any speed records. We wanted
to improve on performance (throughput and latency) — but simplicity
and correctness are more important to us. We believe the result of our
work delivers our goals quite nicely: the library is fun to use,
supports kdbus and sockets as back-end, is relatively minimal, and the
performance is substantially better than both libdbus and GDBus.
To decide which of the three APIs to use for your C project, here are
short guidelines:

If you hack on a GLib/GObject project, GDBus is definitely your
first choice.

If portability to non-Linux kernels — including Windows, Mac OS and
other UNIXes — is important to you, use either GDBus (which more or
less means buying into GLib/GObject) or libdbus (which requires a
lot of manual work).

Otherwise, sd-bus would be my recommended choice.

(I am not covering C++ specifically here, this is all about plain C
only. But do note: if you use Qt, then QtDBus is the D-Bus API of
choice, being a wrapper around libdbus.)
Introduction to D-Bus Concepts
To the uninitiated D-Bus usually appears to be a relatively opaque
technology. It uses lots of concepts that appear unnecessarily complex
and redundant on first sight. But actually, they make a lot of
sense. Let’s have a look:

A bus is where you look for IPC services. There are usually two
kinds of buses: a system bus, of which there’s exactly one per
system, and which is where you’d look for system services; and a
user bus, of which there’s one per user, and which is where you’d
look for user services, like the address book service or the mail
program. (Originally, the user bus was actually a session bus — so
that you get multiple of them if you log in many times as the same
user –, and on most setups it still is, but we are working on
moving things to a true user bus, of which there is only one per
user on a system, regardless how many times that user happens to
log in.)

A service is a program that offers some IPC API on a bus. A
service is identified by a name in reverse domain name
notation. Thus, the org.freedesktop.NetworkManager service on the
system bus is where NetworkManager’s APIs are available and
org.freedesktop.login1 on the system bus is where
systemd-logind’s APIs are exposed.

A client is a program that makes use of some IPC API on a bus. It
talks to a service, monitors it and generally doesn’t provide any
services on its own. That said, lines are blurry and many services
are also clients to other services. Frequently the term peer is
used as a generalization to refer to either a service or a client.

An object path is an identifier for an object on a specific
service. In a way this is comparable to a C pointer, since that’s
how you generally reference a C object, if you hack object-oriented
programs in C. However, C pointers are just memory addresses, and
passing memory addresses around to other processes would make
little sense, since they of course refer to the address space of
the service, the client couldn’t make sense of it. Thus, the D-Bus
designers came up with the object path concept, which is just a
string that looks like a file system path. Example:
/org/freedesktop/login1 is the object path of the ‘manager’
object of the org.freedesktop.login1 service (which, as we
remember from above, is still the service systemd-logind
exposes). Because object paths are structured like file system
paths they can be neatly arranged in a tree, so that you end up
with a venerable tree of objects. For example, you’ll find all user
sessions systemd-logind manages below the
/org/freedesktop/login1/session sub-tree, for example called
/org/freedesktop/login1/session/_7,
/org/freedesktop/login1/session/_55 and so on. How services
precisely label their objects and arrange them in a tree is
completely up to the developers of the services.

Each object that is identified by an object path has one or more
interfaces. An interface is a collection of signals, methods, and
properties (collectively called members), that belong
together. The concept of a D-Bus interface is actually pretty
much identical to what you know from programming languages such as
Java, which also know an interface concept. Which interfaces an
object implements is up to the developers of the service. Interface
names are in reverse domain name notation, much like service
names. (Yes, that’s admittedly confusing, in particular since it’s
pretty common for simpler services to reuse the service name string
also as an interface name.) A couple of interfaces are standardized
though and you’ll find them available on many of the objects
offered by the various services. Specifically, those are
org.freedesktop.DBus.Introspectable, org.freedesktop.DBus.Peer
and org.freedesktop.DBus.Properties.

An interface can contain methods. The word “method” is more or
less just a fancy word for “function”, and is a term used pretty
much the same way in object-oriented languages such as Java. The
most common interaction between D-Bus peers is that one peer
invokes one of these methods on another peer and gets a reply. A
D-Bus method takes a couple of parameters, and returns others. The
parameters are transmitted in a type-safe way, and the type
information is included in the introspection data you can query
from each object. Usually, method names (and the other member
types) follow a CamelCase syntax. For example, systemd-logind
exposes an ActivateSession method on the
org.freedesktop.login1.Manager interface that is available on the
/org/freedesktop/login1 object of the org.freedesktop.login1
service.

A signature describes a set of parameters a function (or signal,
property, see below) takes or returns. It’s a series of characters
that each encode one parameter by its type. The set of types
available is pretty powerful. For example, there are simpler types
like s for string, or u for 32bit integer, but also complex
types such as as for an array of strings or a(sb) for an array
of structures consisting of one string and one boolean each. See
the D-Bus specification
for the full explanation of the type system. The
ActivateSession method mentioned above takes a single string as
parameter (the parameter signature is hence s), and returns
nothing (the return signature is hence the empty string). Of
course, the signature can get a lot more complex, see below for
more examples.

A signal is another member type that the D-Bus object system
knows. Much like a method it has a signature. However, they serve
different purposes. While in a method call a single client issues a
request on a single service, and that service sends back a response
to the client, signals are for general notification of
peers. Services send them out when they want to tell one or more
peers on the bus that something happened or changed. In contrast to
method calls and their replies they are hence usually broadcast
over a bus. While method calls/replies are used for duplex
one-to-one communication, signals are usually used for simplex
one-to-many communication (note however that that’s not a
requirement, they can also be used one-to-one). Example:
systemd-logind broadcasts a SessionNew signal from its manager
object each time a user logs in, and a SessionRemoved signal
every time a user logs out.

A property is the third member type that the D-Bus object system
knows. It’s similar to the property concept known by languages like
C#. Properties also have a signature, and are more or less just
variables that an object exposes, that can be read or altered by
clients. Example: systemd-logind exposes a property Docked of
the signature b (a boolean). It reflects whether systemd-logind
thinks the system is currently in a docking station of some form
(only applies to laptops …).

So much for the various concepts D-Bus knows. Of course, all these new
concepts might be overwhelming. Let’s look at them from a different
perspective. I assume many of the readers have an understanding of
today’s web technology, specifically HTTP and REST. Let’s try to
compare the concept of a HTTP request with the concept of a D-Bus
method call:

A HTTP request you issue on a specific network. It could be the
Internet, or it could be your local LAN, or a company
VPN. Depending on which network you issue the request on, you’ll be
able to talk to a different set of servers. This is not unlike the
“bus” concept of D-Bus.

On the network you then pick a specific HTTP server to talk
to. That’s roughly comparable to picking a service on a specific bus.

On the HTTP server you then ask for a specific URL. The “path” part
of the URL (by which I mean everything after the host name of the
server, up to the last “/”) is pretty similar to a D-Bus object path.

The “file” part of the URL (by which I mean everything after the
last slash, following the path, as described above), then defines
the actual call to make. In D-Bus this could be mapped to an
interface and method name.

Finally, the parameters of a HTTP call follow the path after the
“?”, they map to the signature of the D-Bus call.

Of course, comparing an HTTP request to a D-Bus method call is a bit
comparing apples and oranges. However, I think it’s still useful to
get a bit of a feeling of what maps to what.
From the shell
So much about the concepts and the gray theory behind them. Let’s make
this exciting, let’s actually see how this feels on a real system.
For a while now systemd has included a tool busctl that is useful to
explore and interact with the D-Bus object system. When invoked
without parameters, it will show you a list of all peers connected to
the system bus. (Use --user to see the peers of your user bus
instead):
$ busctl
NAME                                       PID PROCESS         USER             CONNECTION    UNIT                      SESSION    DESCRIPTION
:1.1                                         1 systemd         root             :1.1          -                         -          -
:1.11                                      705 NetworkManager  root             :1.11         NetworkManager.service    -          -
:1.14                                      744 gdm             root             :1.14         gdm.service               -          -
:1.4                                       708 systemd-logind  root             :1.4          systemd-logind.service    -          -
:1.7200                                  17563 busctl          lennart          :1.7200       session-1.scope           1          -
[…]
org.freedesktop.NetworkManager             705 NetworkManager  root             :1.11         NetworkManager.service    -          -
org.freedesktop.login1                     708 systemd-logind  root             :1.4          systemd-logind.service    -          -
org.freedesktop.systemd1                     1 systemd         root             :1.1          -                         -          -
org.gnome.DisplayManager                   744 gdm             root             :1.14         gdm.service               -          -
[…]

(I have shortened the output a bit, to keep things brief).
The list begins with a list of all peers currently connected to the
bus. They are identified by peer names like “:1.11”. These are called
unique names in D-Bus nomenclature. Basically, every peer has a
unique name, and they are assigned automatically when a peer connects
to the bus. They are much like an IP address if you so will. You’ll
notice that a couple of peers are already connected, including our
little busctl tool itself as well as a number of system services. The
list then shows all actual services on the bus, identified by their
service names (as discussed above; to discern them from the unique
names these are also called well-known names). In many ways
well-known names are similar to DNS host names, i.e. they are a
friendlier way to reference a peer, but on the lower level they just
map to an IP address, or in this comparison the unique name. Much like
you can connect to a host on the Internet by either its host name or
its IP address, you can also connect to a bus peer either by its
unique or its well-known name. (Note that each peer can have as many
well-known names as it likes, much like an IP address can have
multiple host names referring to it).
OK, that’s already kinda cool. Try it for yourself, on your local
machine (all you need is a recent, systemd-based distribution).
Let’s now take the next step. Let’s see which objects the
org.freedesktop.login1 service actually offers:
$ busctl tree org.freedesktop.login1
└─/org/freedesktop/login1
  ├─/org/freedesktop/login1/seat
  │ ├─/org/freedesktop/login1/seat/seat0
  │ └─/org/freedesktop/login1/seat/self
  ├─/org/freedesktop/login1/session
  │ ├─/org/freedesktop/login1/session/_31
  │ └─/org/freedesktop/login1/session/self
  └─/org/freedesktop/login1/user
    ├─/org/freedesktop/login1/user/_1000
    └─/org/freedesktop/login1/user/self

Pretty, isn’t it? What’s actually even nicer, and which the output
does not show is that there’s full command line completion
available: as you press TAB the shell will auto-complete the service
names for you. It’s a real pleasure to explore your D-Bus objects that
way!
The output shows some objects that you might recognize from the
explanations above. Now, let’s go further. Let’s see what interfaces,
methods, signals and properties one of these objects actually exposes:
$ busctl introspect org.freedesktop.login1 /org/freedesktop/login1/session/_31
NAME                                TYPE      SIGNATURE RESULT/VALUE                             FLAGS
org.freedesktop.DBus.Introspectable interface -         -                                        -
.Introspect                         method    -         s                                        -
org.freedesktop.DBus.Peer           interface -         -                                        -
.GetMachineId                       method    -         s                                        -
.Ping                               method    -         -                                        -
org.freedesktop.DBus.Properties     interface -         -                                        -
.Get                                method    ss        v                                        -
.GetAll                             method    s         a{sv}                                    -
.Set                                method    ssv       -                                        -
.PropertiesChanged                  signal    sa{sv}as  -                                        -
org.freedesktop.login1.Session      interface -         -                                        -
.Activate                           method    -         -                                        -
.Kill                               method    si        -                                        -
.Lock                               method    -         -                                        -
.PauseDeviceComplete                method    uu        -                                        -
.ReleaseControl                     method    -         -                                        -
.ReleaseDevice                      method    uu        -                                        -
.SetIdleHint                        method    b         -                                        -
.TakeControl                        method    b         -                                        -
.TakeDevice                         method    uu        hb                                       -
.Terminate                          method    -         -                                        -
.Unlock                             method    -         -                                        -
.Active                             property  b         true                                     emits-change
.Audit                              property  u         1                                        const
.Class                              property  s         "user"                                   const
.Desktop                            property  s         ""                                       const
.Display                            property  s         ""                                       const
.Id                                 property  s         "1"                                      const
.IdleHint                           property  b         true                                     emits-change
.IdleSinceHint                      property  t         1434494624206001                         emits-change
.IdleSinceHintMonotonic             property  t         0                                        emits-change
.Leader                             property  u         762                                      const
.Name                               property  s         "lennart"                                const
.Remote                             property  b         false                                    const
.RemoteHost                         property  s         ""                                       const
.RemoteUser                         property  s         ""                                       const
.Scope                              property  s         "session-1.scope"                        const
.Seat                               property  (so)      "seat0" "/org/freedesktop/login1/seat…   const
.Service                            property  s         "gdm-autologin"                          const
.State                              property  s         "active"                                 -
.TTY                                property  s         "/dev/tty1"                              const
.Timestamp                          property  t         1434494630344367                         const
.TimestampMonotonic                 property  t         34814579                                 const
.Type                               property  s         "x11"                                    const
.User                               property  (uo)      1000 "/org/freedesktop/login1/user/_1…   const
.VTNr                               property  u         1                                        const
.Lock                               signal    -         -                                        -
.PauseDevice                        signal    uus       -                                        -
.ResumeDevice                       signal    uuh       -                                        -
.Unlock                             signal    -         -                                        -

As before, the busctl command supports command line completion, hence
both the service name and the object path used are easily put together
on the shell simply by pressing TAB. The output shows the methods,
properties, signals of one of the session objects that are currently
made available by systemd-logind. There’s a section for each
interface the object knows. The second column tells you what kind of
member is shown in the line. The third column shows the signature of
the member. In case of method calls that’s the input parameters, the
fourth column shows what is returned. For properties, the fourth
column encodes the current value of them.
So far, we just explored. Let’s take the next step now: let’s become
active – let’s call a method:
# busctl call org.freedesktop.login1 /org/freedesktop/login1/session/_31 org.freedesktop.login1.Session Lock

I don’t think I need to mention this anymore, but anyway: again
there’s full command line completion available. The third argument is
the interface name, the fourth the method name, both can be easily
completed by pressing TAB. In this case we picked the Lock method,
which activates the screen lock for the specific session. And yupp,
the instant I pressed enter on this line my screen lock turned on
(this only works on DEs that correctly hook into systemd-logind;
GNOME works fine, and KDE should work too).
The Lock method call we picked is very simple, as it takes no
parameters and returns none. Of course, it can get more complicated
for some calls. Here’s another example, this time using one of
systemd’s own bus calls, to start an arbitrary system unit:
# busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager StartUnit ss "cups.service" "replace"
o "/org/freedesktop/systemd1/job/42684"

This call takes two strings as input parameters, as we denote in the
signature string that follows the method name (as usual, command line
completion helps you getting this right). Following the signature the
next two parameters are simply the two strings to pass. The specified
signature string hence indicates what comes next. systemd’s StartUnit
method call takes the unit name to start as first parameter, and the
mode in which to start it as second. The call returned a single object
path value. It is encoded the same way as the input parameter: a
signature (just o for the object path) followed by the actual value.
Of course, some method call parameters can get a ton more complex, but
with busctl it’s relatively easy to encode them all. See the man page for details.
busctl knows a number of other operations. For example, you can use
it to monitor D-Bus traffic as it happens (including generating a
.cap file for use with Wireshark!) or you can set or get specific
properties. However, this blog story was supposed to be about sd-bus,
not busctl, hence let’s cut this short here, and let me direct you
to the man page in case you want to know more about the tool.
busctl (like the rest of systemd) is implemented using the sd-bus
API. Thus it exposes many of the features of sd-bus itself. For
example, you can use it to connect to remote or container buses. It
understands both kdbus and classic D-Bus, and more!
sd-bus
But enough! Let’s get back on topic, let’s talk about sd-bus itself.
The sd-bus set of APIs is mostly contained in the header file
sd-bus.h.
Here’s a random selection of features of the library, that make it
compare well with the other implementations available.

Supports both kdbus and dbus1 as back-end.

Has high-level support for connecting to remote buses via ssh, and
to buses of local OS containers.

Powerful credential model, to implement authentication of clients
in services. Currently 34 individual fields are supported, from the
PID of the client to the cgroup or capability sets.

Support for tracking the life-cycle of peers in order to release
local objects automatically when all peers referencing them
have disconnected.

The client builds an efficient decision tree to determine which
handlers to deliver an incoming bus message to.

Automatically translates D-Bus errors into UNIX style errors and
back (this is lossy though), to ensure best integration of D-Bus
into low-level Linux programs. A short sketch of this mapping follows right after this list.

Powerful but lightweight object model for exposing local objects on
the bus. Automatically generates introspection as necessary.
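To make the error translation item above a bit more tangible, here is a tiny sketch (not part of the original feature list, just an illustration) that uses sd_bus_error_get_errno() to map a well-known D-Bus error name to a plain errno value; on current sd-bus versions the AccessDenied name should come back as EACCES:

#include <stdio.h>
#include <string.h>
#include <systemd/sd-bus.h>

int main(void) {
        sd_bus_error error = SD_BUS_ERROR_NULL;

        /* Fill in a well-known D-Bus error by name */
        sd_bus_error_set_const(&error, "org.freedesktop.DBus.Error.AccessDenied", "Not allowed");

        /* sd-bus translates the D-Bus error into a plain errno value */
        printf("%s -> %s\n", error.name, strerror(sd_bus_error_get_errno(&error)));

        sd_bus_error_free(&error);
        return 0;
}

It builds the same way as the larger examples further down.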

The API is currently not fully documented, but we are working on
completing the set of manual pages. For details
see all pages starting with sd_bus_.
Invoking a Method, from C, with sd-bus
So much about the library in general. Here’s an example for connecting
to the bus and issuing a method call:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <systemd/sd-bus.h>

int main(int argc, char *argv[]) {
        sd_bus_error error = SD_BUS_ERROR_NULL;
        sd_bus_message *m = NULL;
        sd_bus *bus = NULL;
        const char *path;
        int r;

        /* Connect to the system bus */
        r = sd_bus_open_system(&bus);
        if (r < 0) {
                fprintf(stderr, "Failed to connect to system bus: %s\n", strerror(-r));
                goto finish;
        }

        /* Issue the method call and store the response message in m */
        r = sd_bus_call_method(bus,
                               "org.freedesktop.systemd1",         /* service to contact */
                               "/org/freedesktop/systemd1",        /* object path */
                               "org.freedesktop.systemd1.Manager", /* interface name */
                               "StartUnit",                        /* method name */
                               &error,                             /* object to return error in */
                               &m,                                 /* return message on success */
                               "ss",                               /* input signature */
                               "cups.service",                     /* first argument */
                               "replace");                         /* second argument */
        if (r < 0) {
                fprintf(stderr, "Failed to issue method call: %s\n", error.message);
                goto finish;
        }

        /* Parse the response message */
        r = sd_bus_message_read(m, "o", &path);
        if (r < 0) {
                fprintf(stderr, "Failed to parse response message: %s\n", strerror(-r));
                goto finish;
        }

        printf("Queued service job as %s.\n", path);

finish:
        sd_bus_error_free(&error);
        sd_bus_message_unref(m);
        sd_bus_unref(bus);

        return r < 0 ? EXIT_FAILURE : EXIT_SUCCESS;
}

Save this example as bus-client.c, then build it with:
$ gcc bus-client.c -o bus-client `pkg-config --cflags --libs libsystemd`

This will generate a binary bus-client you can now run. Make sure to
run it as root though, since access to the StartUnit method is
privileged:
# ./bus-client
Queued service job as /org/freedesktop/systemd1/job/3586.

And that’s it already, our first example. It showed how we invoked a
method call on the bus. The actual function call of the method is very
close to the busctl command line we used before. I hope the code
excerpt needs little further explanation. It’s supposed to give you a
taste of how to write D-Bus clients with sd-bus. For more
information please have a look at the header file, the man page or
even the sd-bus sources.
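Reading a property works much the same way. As an additional sketch that is not part of the original example set, the following uses the sd_bus_get_property_trivial() convenience call to read the Docked boolean of systemd-logind mentioned earlier; note that D-Bus booleans are read into a plain C int:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <systemd/sd-bus.h>

int main(int argc, char *argv[]) {
        sd_bus_error error = SD_BUS_ERROR_NULL;
        sd_bus *bus = NULL;
        int docked = 0; /* D-Bus booleans arrive as a plain int */
        int r;

        /* Connect to the system bus */
        r = sd_bus_open_system(&bus);
        if (r < 0) {
                fprintf(stderr, "Failed to connect to system bus: %s\n", strerror(-r));
                goto finish;
        }

        /* Read the Docked property (signature "b") from systemd-logind's manager object */
        r = sd_bus_get_property_trivial(bus,
                                        "org.freedesktop.login1",         /* service to contact */
                                        "/org/freedesktop/login1",        /* object path */
                                        "org.freedesktop.login1.Manager", /* interface name */
                                        "Docked",                         /* property name */
                                        &error,
                                        'b',
                                        &docked);
        if (r < 0) {
                fprintf(stderr, "Failed to get property: %s\n", error.message);
                goto finish;
        }

        printf("Docked: %s\n", docked ? "yes" : "no");

finish:
        sd_bus_error_free(&error);
        sd_bus_unref(bus);

        return r < 0 ? EXIT_FAILURE : EXIT_SUCCESS;
}

From the shell, busctl get-property org.freedesktop.login1 /org/freedesktop/login1 org.freedesktop.login1.Manager Docked should print the same value.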
Implementing a Service, in C, with sd-bus
Of course, just calling a single method is a rather simplistic
example. Let’s have a look on how to write a bus service. We’ll write
a small calculator service, that exposes a single object, which
implements an interface that exposes two methods: one to multiply two
64bit signed integers, and one to divide one 64bit signed integer by
another.
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <systemd/sd-bus.h>

static int method_multiply(sd_bus_message *m, void *userdata, sd_bus_error *ret_error) {
        int64_t x, y;
        int r;

        /* Read the parameters */
        r = sd_bus_message_read(m, "xx", &x, &y);
        if (r < 0) {
                fprintf(stderr, "Failed to parse parameters: %s\n", strerror(-r));
                return r;
        }

        /* Reply with the response */
        return sd_bus_reply_method_return(m, "x", x * y);
}

static int method_divide(sd_bus_message *m, void *userdata, sd_bus_error *ret_error) {
        int64_t x, y;
        int r;

        /* Read the parameters */
        r = sd_bus_message_read(m, "xx", &x, &y);
        if (r < 0) {
                fprintf(stderr, "Failed to parse parameters: %s\n", strerror(-r));
                return r;
        }

        /* Return an error on division by zero */
        if (y == 0) {
                sd_bus_error_set_const(ret_error, "net.poettering.DivisionByZero", "Sorry, can't allow division by zero.");
                return -EINVAL;
        }

        return sd_bus_reply_method_return(m, "x", x / y);
}

/* The vtable of our little object, implements the net.poettering.Calculator interface */
static const sd_bus_vtable calculator_vtable[] = {
        SD_BUS_VTABLE_START(0),
        SD_BUS_METHOD("Multiply", "xx", "x", method_multiply, SD_BUS_VTABLE_UNPRIVILEGED),
        SD_BUS_METHOD("Divide",   "xx", "x", method_divide,   SD_BUS_VTABLE_UNPRIVILEGED),
        SD_BUS_VTABLE_END
};

int main(int argc, char *argv[]) {
        sd_bus_slot *slot = NULL;
        sd_bus *bus = NULL;
        int r;

        /* Connect to the user bus this time */
        r = sd_bus_open_user(&bus);
        if (r < 0) {
                fprintf(stderr, "Failed to connect to user bus: %s\n", strerror(-r));
                goto finish;
        }

        /* Install the object */
        r = sd_bus_add_object_vtable(bus,
                                     &slot,
                                     "/net/poettering/Calculator", /* object path */
                                     "net.poettering.Calculator",  /* interface name */
                                     calculator_vtable,
                                     NULL);
        if (r < 0) {
                fprintf(stderr, "Failed to install object: %s\n", strerror(-r));
                goto finish;
        }

        /* Take a well-known service name so that clients can find us */
        r = sd_bus_request_name(bus, "net.poettering.Calculator", 0);
        if (r < 0) {
                fprintf(stderr, "Failed to acquire service name: %s\n", strerror(-r));
                goto finish;
        }

        for (;;) {
                /* Process requests */
                r = sd_bus_process(bus, NULL);
                if (r < 0) {
                        fprintf(stderr, "Failed to process bus: %s\n", strerror(-r));
                        goto finish;
                }
                if (r > 0) /* we processed a request, try to process another one, right-away */
                        continue;

                /* Wait for the next request to process */
                r = sd_bus_wait(bus, (uint64_t) -1);
                if (r < 0) {
                        fprintf(stderr, "Failed to wait on bus: %s\n", strerror(-r));
                        goto finish;
                }
        }

finish:
        sd_bus_slot_unref(slot);
        sd_bus_unref(bus);

        return r < 0 ? EXIT_FAILURE : EXIT_SUCCESS;
}

Save this example as bus-service.c, then build it with:
$ gcc bus-service.c -o bus-service `pkg-config --cflags --libs libsystemd`

Now, let’s run it:
$ ./bus-service

In another terminal, let’s try to talk to it. Note that this service
is now on the user bus, not on the system bus as before. We do this
for simplicity reasons: on the system bus access to services is
tightly controlled so unprivileged clients cannot request privileged
operations. On the user bus however things are simpler: as only
processes of the user owning the bus can connect, no further policy
enforcement will complicate this example. Because the service is on
the user bus, we have to pass the --user switch on the busctl
command line. Let’s start with looking at the service’s object tree.
$ busctl --user tree net.poettering.Calculator
└─/net/poettering/Calculator

As we can see, there’s only a single object on the service, which is
not surprising, given that our code above only registered one. Let’s
see the interfaces and the members this object exposes:
$ busctl --user introspect net.poettering.Calculator /net/poettering/Calculator
NAME                                TYPE      SIGNATURE RESULT/VALUE FLAGS
net.poettering.Calculator           interface -         -            -
.Divide                             method    xx        x            -
.Multiply                           method    xx        x            -
org.freedesktop.DBus.Introspectable interface -         -            -
.Introspect                         method    -         s            -
org.freedesktop.DBus.Peer           interface -         -            -
.GetMachineId                       method    -         s            -
.Ping                               method    -         -            -
org.freedesktop.DBus.Properties     interface -         -            -
.Get                                method    ss        v            -
.GetAll                             method    s         a{sv}        -
.Set                                method    ssv       -            -
.PropertiesChanged                  signal    sa{sv}as  -            -

The sd-bus library automatically added a couple of generic interfaces,
as mentioned above. But the first interface we see is actually the one
we added! It shows our two methods, and both take “xx” (two 64bit
signed integers) as input parameters, and return one “x”. Great! But
does it work?
$ busctl --user call net.poettering.Calculator /net/poettering/Calculator net.poettering.Calculator Multiply xx 5 7
x 35

Woohoo! We passed the two integers 5 and 7, and the service actually
multiplied them for us and returned a single integer 35! Let’s try the
other method:
$ busctl --user call net.poettering.Calculator /net/poettering/Calculator net.poettering.Calculator Divide xx 99 17
x 5

Oh, wow! It can even do integer division! Fantastic! But let’s trick
it into dividing by zero:
$ busctl --user call net.poettering.Calculator /net/poettering/Calculator net.poettering.Calculator Divide xx 43 0
Sorry, can't allow division by zero.

Nice! It detected this nicely and returned a clean error about it. If
you look in the source code example above you’ll see how precisely we
generated the error.
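For completeness, talking to the calculator from C instead of from the shell looks exactly like the first client example in this post. Here is a minimal sketch (again not part of the original post) that calls Multiply on the service above, assuming bus-service is running on the user bus in another terminal:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <inttypes.h>
#include <systemd/sd-bus.h>

int main(int argc, char *argv[]) {
        sd_bus_error error = SD_BUS_ERROR_NULL;
        sd_bus_message *reply = NULL;
        sd_bus *bus = NULL;
        int64_t result;
        int r;

        /* The calculator service registered itself on the user bus */
        r = sd_bus_open_user(&bus);
        if (r < 0) {
                fprintf(stderr, "Failed to connect to user bus: %s\n", strerror(-r));
                goto finish;
        }

        /* Call Multiply(5, 7); the input signature is "xx", i.e. two 64bit signed integers */
        r = sd_bus_call_method(bus,
                               "net.poettering.Calculator",  /* service to contact */
                               "/net/poettering/Calculator", /* object path */
                               "net.poettering.Calculator",  /* interface name */
                               "Multiply",                   /* method name */
                               &error,
                               &reply,
                               "xx",
                               (int64_t) 5,
                               (int64_t) 7);
        if (r < 0) {
                fprintf(stderr, "Failed to issue method call: %s\n", error.message);
                goto finish;
        }

        /* The reply carries a single "x", the product */
        r = sd_bus_message_read(reply, "x", &result);
        if (r < 0) {
                fprintf(stderr, "Failed to parse reply: %s\n", strerror(-r));
                goto finish;
        }

        printf("5 * 7 = %" PRIi64 "\n", result);

finish:
        sd_bus_error_free(&error);
        sd_bus_message_unref(reply);
        sd_bus_unref(bus);

        return r < 0 ? EXIT_FAILURE : EXIT_SUCCESS;
}

Build it like the other examples, with pkg-config --cflags --libs libsystemd.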
And that’s really all I have for today. Of course, the examples I
showed are short, and I don’t get into detail here on what precisely
each line does. However, this is supposed to be a short introduction
into D-Bus and sd-bus, and it’s already way too long for that …
I hope this blog story was useful to you. If you are interested in
using sd-bus for your own programs, I hope this gets you started. If
you have further questions, check the (incomplete) man pages, and
inquire us on IRC or the systemd mailing list. If you need more
examples, have a look at the systemd source tree, all of systemd’s
many bus services use sd-bus extensively.

The new sd-bus API of systemd

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/the-new-sd-bus-api-of-systemd.html

With the new v221 release of
systemd

we are declaring the
sd-bus
API shipped with
systemd
stable. sd-bus is our minimal D-Bus
IPC
C library, supporting as
back-ends both classic socket-based D-Bus and
kdbus. The library has been been
part of systemd for a while, but has only been used internally, since
we wanted to have the liberty to still make API changes without
affecting external consumers of the library. However, now we are
confident to commit to a stable API for it, starting with v221.

In this blog story I hope to provide you with a quick overview on
sd-bus, a short reiteration on D-Bus and its concepts, as well as a
few simple examples how to write D-Bus clients and services with it.

What is D-Bus again?

Let’s start with a quick reminder what
D-Bus actually is: it’s a
powerful, generic IPC system for Linux and other operating systems. It
knows concepts like buses, objects, interfaces, methods, signals,
properties. It provides you with fine-grained access control, a rich
type system, discoverability, introspection, monitoring, reliable
multicasting, service activation, file descriptor passing, and
more. There are bindings for numerous programming languages that are
used on Linux.

D-Bus has been a core component of Linux systems since more than 10
years. It is certainly the most widely established high-level local
IPC system on Linux. Since systemd’s inception it has been the IPC
system it exposes its interfaces on. And even before systemd, it was
the IPC system Upstart used to expose its interfaces. It is used by
GNOME, by KDE and by a variety of system components.

D-Bus refers to both a
specification
,
and a reference
implementation
. The
reference implementation provides both a bus server component, as well
as a client library. While there are multiple other, popular
reimplementations of the client library – for both C and other
programming languages –, the only commonly used server side is the
one from the reference implementation. (However, the kdbus project is
working on providing an alternative to this server implementation as a
kernel component.)

D-Bus is mostly used as local IPC, on top of AF_UNIX sockets. However,
the protocol may be used on top of TCP/IP as well. It does not
natively support encryption, hence using D-Bus directly on TCP is
usually not a good idea. It is possible to combine D-Bus with a
transport like ssh in order to secure it. systemd uses this to make
many of its APIs accessible remotely.

A frequently asked question about D-Bus is why it exists at all,
given that AF_UNIX sockets and FIFOs already exist on UNIX and have
been used for a long time successfully. To answer this question let’s
make a comparison with popular web technology of today: what
AF_UNIX/FIFOs are to D-Bus, TCP is to HTTP/REST. While AF_UNIX
sockets/FIFOs only shovel raw bytes between processes, D-Bus defines
actual message encoding and adds concepts like method call
transactions, an object system, security mechanisms, multicasting and
more.

From our 10year+ experience with D-Bus we know today that while there
are some areas where we can improve things (and we are working on
that, both with kdbus and sd-bus), it generally appears to be a very
well designed system, that stood the test of time, aged well and is
widely established. Today, if we’d sit down and design a completely
new IPC system incorporating all the experience and knowledge we
gained with D-Bus, I am sure the result would be very close to what
D-Bus already is.

Or in short: D-Bus is great. If you hack on a Linux project and need a
local IPC, it should be your first choice. Not only because D-Bus is
well designed, but also because there aren’t many alternatives that
can cover similar functionality.

Where does sd-bus fit in?

Let’s discuss why sd-bus exists, how it compares with the other
existing C D-Bus libraries and why it might be a library to consider
for your project.

For C, there are two established, popular D-Bus libraries: libdbus, as
it is shipped in the reference implementation of D-Bus, as well as
GDBus, a component of GLib, the low-level tool library of GNOME.

Of the two libdbus is the much older one, as it was written at the
time the specification was put together. The library was written with
a focus on being portable and to be useful as back-end for higher-level
language bindings. Both of these goals required the API to be very
generic, resulting in a relatively baroque, hard-to-use API that lacks
the bits that make it easy and fun to use from C. It provides the
building blocks, but few tools to actually make it straightforward to
build a house from them. On the other hand, the library is suitable
for most use-cases (for example, it is OOM-safe making it suitable for
writing lowest level system software), and is portable to operating
systems like Windows or more exotic UNIXes.

GDBus
is a much newer implementation. It has been written after considerable
experience with using a GLib/GObject wrapper around libdbus. GDBus is
implemented from scratch, shares no code with libdbus. Its design
differs substantially from libdbus, it contains code generators to
make it specifically easy to expose GObject objects on the bus, or
talking to D-Bus objects as GObject objects. It translates D-Bus data
types to GVariant, which is GLib’s powerful data serialization
format. If you are used to GLib-style programming then you’ll feel
right at home, hacking D-Bus services and clients with it is a lot
simpler than using libdbus.

With sd-bus we now provide a third implementation, sharing no code
with either libdbus or GDBus. For us, the focus was on providing kind
of a middle ground between libdbus and GDBus: a low-level C library
that actually is fun to work with, that has enough syntactic sugar to
make it easy to write clients and services with, but on the other hand
is more low-level than GDBus/GLib/GObject/GVariant. To be able to use
it in systemd’s various system-level components it needed to be
OOM-safe and minimal. Another major point we wanted to focus on was
supporting a kdbus back-end right from the beginning, in addition to
the socket transport of the original D-Bus specification (“dbus1”). In
fact, we wanted to design the library closer to kdbus’ semantics than
to dbus1’s, wherever they are different, but still cover both
transports nicely. In contrast to libdbus or GDBus portability is not
a priority for sd-bus, instead we try to make the best of the Linux
platform and expose specific Linux concepts wherever that is
beneficial. Finally, performance was also an issue (though a secondary
one): neither libdbus nor GDBus will win any speed records. We wanted
to improve on performance (throughput and latency) — but simplicity
and correctness are more important to us. We believe the result of our
work delivers our goals quite nicely: the library is fun to use,
supports kdbus and sockets as back-end, is relatively minimal, and the
performance is substantially
better

than both libdbus and GDBus.

To decide which of the three APIs to use for you C project, here are
short guidelines:

  • If you hack on a GLib/GObject project, GDBus is definitely your
    first choice.

  • If portability to non-Linux kernels — including Windows, Mac OS and
    other UNIXes — is important to you, use either GDBus (which more or
    less means buying into GLib/GObject) or libdbus (which requires a
    lot of manual work).

  • Otherwise, sd-bus would be my recommended choice.

(I am not covering C++ specifically here, this is all about plain C
only. But do note: if you use Qt, then QtDBus is the D-Bus API of
choice, being a wrapper around libdbus.)

Introduction to D-Bus Concepts

To the uninitiated D-Bus usually appears to be a relatively opaque
technology. It uses lots of concepts that appear unnecessarily complex
and redundant on first sight. But actually, they make a lot of
sense. Let’s have a look:

  • A bus is where you look for IPC services. There are usually two
    kinds of buses: a system bus, of which there’s exactly one per
    system, and which is where you’d look for system services; and a
    user bus, of which there’s one per user, and which is where you’d
    look for user services, like the address book service or the mail
    program. (Originally, the user bus was actually a session bus — so
    that you get multiple of them if you log in many times as the same
    user –, and on most setups it still is, but we are working on
    moving things to a true user bus, of which there is only one per
    user on a system, regardless how many times that user happens to
    log in.)

  • A service is a program that offers some IPC API on a bus. A
    service is identified by a name in reverse domain name
    notation. Thus, the org.freedesktop.NetworkManager service on the
    system bus is where NetworkManager’s APIs are available and
    org.freedesktop.login1 on the system bus is where
    systemd-logind‘s APIs are exposed.

  • A client is a program that makes use of some IPC API on a bus. It
    talks to a service, monitors it and generally doesn’t provide any
    services on its own. That said, lines are blurry and many services
    are also clients to other services. Frequently the term peer is
    used as a generalization to refer to either a service or a client.

  • An object path is an identifier for an object on a specific
    service. In a way this is comparable to a C pointer, since that’s
    how you generally reference a C object, if you hack object-oriented
    programs in C. However, C pointers are just memory addresses, and
    passing memory addresses around to other processes would make
    little sense, since they of course refer to the address space of
    the service, the client couldn’t make sense of it. Thus, the D-Bus
    designers came up with the object path concept, which is just a
    string that looks like a file system path. Example:
    /org/freedesktop/login1 is the object path of the ‘manager’
    object of the org.freedesktop.login1 service (which, as we
    remember from above, is still the service systemd-logind
    exposes). Because object paths are structured like file system
    paths they can be neatly arranged in a tree, so that you end up
    with a venerable tree of objects. For example, you’ll find all user
    sessions systemd-logind manages below the
    /org/freedesktop/login1/session sub-tree, for example called
    /org/freedesktop/login1/session/_7,
    /org/freedesktop/login1/session/_55 and so on. How services
    precisely label their objects and arrange them in a tree is
    completely up to the developers of the services.

  • Each object that is identified by an object path has one or more
    interfaces. An interface is a collection of signals, methods, and
    properties (collectively called members), that belong
    together. The concept of a D-Bus interface is actually pretty
    much identical to what you know from programming languages such as
    Java, which also know an interface concept. Which interfaces an
    object implements are up the developers of the service. Interface
    names are in reverse domain name notation, much like service
    names. (Yes, that’s admittedly confusing, in particular since it’s
    pretty common for simpler services to reuse the service name string
    also as an interface name.) A couple of interfaces are standardized
    though and you’ll find them available on many of the objects
    offered by the various services. Specifically, those are
    org.freedesktop.DBus.Introspectable, org.freedesktop.DBus.Peer
    and org.freedesktop.DBus.Properties.

  • An interface can contain methods. The word “method” is more or
    less just a fancy word for “function”, and is a term used pretty
    much the same way in object-oriented languages such as Java. The
    most common interaction between D-Bus peers is that one peer
    invokes one of these methods on another peer and gets a reply. A
    D-Bus method takes a couple of parameters, and returns others. The
    parameters are transmitted in a type-safe way, and the type
    information is included in the introspection data you can query
    from each object. Usually, method names (and the other member
    types) follow a CamelCase syntax. For example, systemd-logind
    exposes an ActivateSession method on the
    org.freedesktop.login1.Manager interface that is available on the
    /org/freedesktop/login1 object of the org.freedesktop.login1
    service.

  • A signature describes a set of parameters a function (or signal,
    property, see below) takes or returns. It’s a series of characters
    that each encode one parameter by its type. The set of types
    available is pretty powerful. For example, there are simpler types
    like s for string, or u for 32bit integer, but also complex
    types such as as for an array of strings or a(sb) for an array
    of structures consisting of one string and one boolean each. See
    the D-Bus specification
    for the full explanation of the type system. The
    ActivateSession method mentioned above takes a single string as
    parameter (the parameter signature is hence s), and returns
    nothing (the return signature is hence the empty string). Of
    course, the signature can get a lot more complex, see below for
    more examples.

  • A signal is another member type that the D-Bus object system
    knows. Much like a method it has a signature. However, they serve
    different purposes. While in a method call a single client issues a
    request on a single service, and that service sends back a response
    to the client, signals are for general notification of
    peers. Services send them out when they want to tell one or more
    peers on the bus that something happened or changed. In contrast to
    method calls and their replies they are hence usually broadcast
    over a bus. While method calls/replies are used for duplex
    one-to-one communication, signals are usually used for simplex
    one-to-many communication (note however that that’s not a
    requirement, they can also be used one-to-one). Example:
    systemd-logind broadcasts a SessionNew signal from its manager
    object each time a user logs in, and a SessionRemoved signal
    every time a user logs out.

  • A property is the third member type that the D-Bus object system
    knows. It’s similar to the property concept known by languages like
    C#. Properties also have a signature, and are more or less just
    variables that an object exposes, that can be read or altered by
    clients. Example: systemd-logind exposes a property Docked of
    the signature b (a boolean). It reflects whether systemd-logind
    thinks the system is currently in a docking station of some form
    (only applies to laptops …).

So much for the various concepts D-Bus knows. Of course, all these new
concepts might be overwhelming. Let’s look at them from a different
perspective. I assume many of the readers have an understanding of
today’s web technology, specifically HTTP and REST. Let’s try to
compare the concept of a HTTP request with the concept of a D-Bus
method call:

  • A HTTP request you issue on a specific network. It could be the
    Internet, or it could be your local LAN, or a company
    VPN. Depending on which network you issue the request on, you’ll be
    able to talk to a different set of servers. This is not unlike the
    “bus” concept of D-Bus.

  • On the network you then pick a specific HTTP server to talk
    to. That’s roughly comparable to picking a service on a specific bus.

  • On the HTTP server you then ask for a specific URL. The “path” part
    of the URL (by which I mean everything after the host name of the
    server, up to the last “/”) is pretty similar to a D-Bus object path.

  • The “file” part of the URL (by which I mean everything after the
    last slash, following the path, as described above), then defines
    the actual call to make. In D-Bus this could be mapped to an
    interface and method name.

  • Finally, the parameters of a HTTP call follow the path after the
    “?”, they map to the signature of the D-Bus call.

Of course, comparing an HTTP request to a D-Bus method call is a bit
comparing apples and oranges. However, I think it’s still useful to
get a bit of a feeling of what maps to what.

From the shell

So much about the concepts and the gray theory behind them. Let’s make
this exciting, let’s actually see how this feels on a real system.

Since a while systemd has included a tool busctl that is useful to
explore and interact with the D-Bus object system. When invoked
without parameters, it will show you a list of all peers connected to
the system bus. (Use --user to see the peers of your user bus
instead):

$ busctl
NAME                                       PID PROCESS         USER             CONNECTION    UNIT                      SESSION    DESCRIPTION
:1.1                                         1 systemd         root             :1.1          -                         -          -
:1.11                                      705 NetworkManager  root             :1.11         NetworkManager.service    -          -
:1.14                                      744 gdm             root             :1.14         gdm.service               -          -
:1.4                                       708 systemd-logind  root             :1.4          systemd-logind.service    -          -
:1.7200                                  17563 busctl          lennart          :1.7200       session-1.scope           1          -
[…]
org.freedesktop.NetworkManager             705 NetworkManager  root             :1.11         NetworkManager.service    -          -
org.freedesktop.login1                     708 systemd-logind  root             :1.4          systemd-logind.service    -          -
org.freedesktop.systemd1                     1 systemd         root             :1.1          -                         -          -
org.gnome.DisplayManager                   744 gdm             root             :1.14         gdm.service               -          -
[…]

(I have shortened the output a bit, to make keep things brief).

The output begins with a list of all peers currently connected to the
bus. They are identified by peer names like “:1.11”. These are called
unique names in D-Bus nomenclature. Basically, every peer has a
unique name, assigned automatically when the peer connects to the
bus. They are much like an IP address, if you will. You’ll notice
that a couple of peers are already connected, including our little
busctl tool itself as well as a number of system services. The list
then shows all actual services on the bus, identified by their
service names (as discussed above; to distinguish them from the
unique names these are also called well-known names). In many ways
well-known names are similar to DNS host names, i.e. they are a
friendlier way to reference a peer, but on the lower level they just
map to a unique name, the same way a host name maps to an IP
address. Much like you can connect to a host on the Internet by
either its host name or its IP address, you can also connect to a bus
peer either by its unique or its well-known name. (Note that each
peer can have as many well-known names as it likes, much like an IP
address can have multiple host names referring to it.)
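
To see these two kinds of names from a program’s perspective, here is
a minimal sketch using the sd-bus API that the rest of this post is
about: it connects to the user bus, prints the unique name we were
assigned, and then claims a well-known name on top of it. The name
org.example.Demo is just a placeholder for this sketch:

#include <stdio.h>
#include <string.h>
#include <systemd/sd-bus.h>

int main(void) {
        sd_bus *bus = NULL;
        const char *unique = NULL;
        int r;

        /* Connect to the user bus */
        r = sd_bus_open_user(&bus);
        if (r < 0) {
                fprintf(stderr, "Failed to connect: %s\n", strerror(-r));
                return 1;
        }

        /* Every connected peer is automatically assigned a unique name */
        r = sd_bus_get_unique_name(bus, &unique);
        if (r >= 0)
                printf("Our unique name: %s\n", unique);

        /* On top of that, a peer may claim any number of well-known names */
        r = sd_bus_request_name(bus, "org.example.Demo", 0);
        if (r < 0)
                fprintf(stderr, "Failed to acquire name: %s\n", strerror(-r));
        else
                printf("Also reachable as org.example.Demo\n");

        sd_bus_unref(bus);
        return 0;
}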

OK, that’s already kinda cool. Try it for yourself, on your local
machine (all you need is a recent, systemd-based distribution).

Let’s now take the next step. Let’s see which objects the
org.freedesktop.login1 service actually offers:

$ busctl tree org.freedesktop.login1
└─/org/freedesktop/login1
  ├─/org/freedesktop/login1/seat
  │ ├─/org/freedesktop/login1/seat/seat0
  │ └─/org/freedesktop/login1/seat/self
  ├─/org/freedesktop/login1/session
  │ ├─/org/freedesktop/login1/session/_31
  │ └─/org/freedesktop/login1/session/self
  └─/org/freedesktop/login1/user
    ├─/org/freedesktop/login1/user/_1000
    └─/org/freedesktop/login1/user/self

Pretty, isn’t it? What’s even nicer, though the output does not show
it, is that there’s full command line completion available: as you
press TAB, the shell will auto-complete the service names for
you. It’s a real pleasure to explore your D-Bus objects that way!

The output shows some objects that you might recognize from the
explanations above. Now, let’s go further. Let’s see what interfaces,
methods, signals and properties one of these objects actually exposes:

$ busctl introspect org.freedesktop.login1 /org/freedesktop/login1/session/_31
NAME                                TYPE      SIGNATURE RESULT/VALUE                             FLAGS
org.freedesktop.DBus.Introspectable interface -         -                                        -
.Introspect                         method    -         s                                        -
org.freedesktop.DBus.Peer           interface -         -                                        -
.GetMachineId                       method    -         s                                        -
.Ping                               method    -         -                                        -
org.freedesktop.DBus.Properties     interface -         -                                        -
.Get                                method    ss        v                                        -
.GetAll                             method    s         a{sv}                                    -
.Set                                method    ssv       -                                        -
.PropertiesChanged                  signal    sa{sv}as  -                                        -
org.freedesktop.login1.Session      interface -         -                                        -
.Activate                           method    -         -                                        -
.Kill                               method    si        -                                        -
.Lock                               method    -         -                                        -
.PauseDeviceComplete                method    uu        -                                        -
.ReleaseControl                     method    -         -                                        -
.ReleaseDevice                      method    uu        -                                        -
.SetIdleHint                        method    b         -                                        -
.TakeControl                        method    b         -                                        -
.TakeDevice                         method    uu        hb                                       -
.Terminate                          method    -         -                                        -
.Unlock                             method    -         -                                        -
.Active                             property  b         true                                     emits-change
.Audit                              property  u         1                                        const
.Class                              property  s         "user"                                   const
.Desktop                            property  s         ""                                       const
.Display                            property  s         ""                                       const
.Id                                 property  s         "1"                                      const
.IdleHint                           property  b         true                                     emits-change
.IdleSinceHint                      property  t         1434494624206001                         emits-change
.IdleSinceHintMonotonic             property  t         0                                        emits-change
.Leader                             property  u         762                                      const
.Name                               property  s         "lennart"                                const
.Remote                             property  b         false                                    const
.RemoteHost                         property  s         ""                                       const
.RemoteUser                         property  s         ""                                       const
.Scope                              property  s         "session-1.scope"                        const
.Seat                               property  (so)      "seat0" "/org/freedesktop/login1/seat... const
.Service                            property  s         "gdm-autologin"                          const
.State                              property  s         "active"                                 -
.TTY                                property  s         "/dev/tty1"                              const
.Timestamp                          property  t         1434494630344367                         const
.TimestampMonotonic                 property  t         34814579                                 const
.Type                               property  s         "x11"                                    const
.User                               property  (uo)      1000 "/org/freedesktop/login1/user/_1... const
.VTNr                               property  u         1                                        const
.Lock                               signal    -         -                                        -
.PauseDevice                        signal    uus       -                                        -
.ResumeDevice                       signal    uuh       -                                        -
.Unlock                             signal    -         -                                        -

As before, the busctl command supports command line completion, hence
both the service name and the object path used are easily put together
on the shell simply by pressing TAB. The output shows the methods,
properties and signals of one of the session objects that are
currently made available by systemd-logind. There’s a section for each
interface the object implements. The second column tells you what kind
of member is shown in the line, and the third column shows the
signature of the member. In the case of method calls, that’s the input
parameters; the fourth column shows what is returned. For properties,
the fourth column shows their current value.
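
As a small preview of doing the same thing programmatically: the
properties listed above can be read directly with sd-bus’s property
helpers. The following is a minimal sketch that fetches the Id
property of that session object with sd_bus_get_property_string();
note that the session object path is the one from my listing above
and will be different on your system:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <systemd/sd-bus.h>

int main(void) {
        sd_bus_error error = SD_BUS_ERROR_NULL;
        sd_bus *bus = NULL;
        char *id = NULL;
        int r;

        /* Connect to the system bus, where systemd-logind lives */
        r = sd_bus_open_system(&bus);
        if (r < 0) {
                fprintf(stderr, "Failed to connect to system bus: %s\n", strerror(-r));
                goto finish;
        }

        /* Read the "Id" property (type "s") of the session object shown above */
        r = sd_bus_get_property_string(bus,
                                       "org.freedesktop.login1",              /* service name */
                                       "/org/freedesktop/login1/session/_31", /* object path (system-specific!) */
                                       "org.freedesktop.login1.Session",      /* interface name */
                                       "Id",                                  /* property name */
                                       &error,
                                       &id);
        if (r < 0) {
                fprintf(stderr, "Failed to get property: %s\n", error.message);
                goto finish;
        }

        printf("Session Id: %s\n", id);

finish:
        free(id);
        sd_bus_error_free(&error);
        sd_bus_unref(bus);
        return r < 0 ? EXIT_FAILURE : EXIT_SUCCESS;
}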

So far, we just explored. Let’s take the next step now: let’s become
active – let’s call a method:

# busctl call org.freedesktop.login1 /org/freedesktop/login1/session/_31 org.freedesktop.login1.Session Lock

I don’t think I need to mention this anymore, but anyway: again
there’s full command line completion available. The third argument is
the interface name and the fourth the method name; both can be easily
completed by pressing TAB. In this case we picked the Lock method,
which activates the screen lock for the specific session. And yup,
the instant I pressed enter on this line my screen lock turned on
(this only works on desktop environments that correctly hook into
systemd-logind; GNOME works fine, and KDE should work too).

The Lock method call we picked is very simple, as it takes no
parameters and returns none. Of course, it can get more complicated
for some calls. Here’s another example, this time using one of
systemd’s own bus calls, to start an arbitrary system unit:

# busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager StartUnit ss "cups.service" "replace"
o "/org/freedesktop/systemd1/job/42684"

This call takes two strings as input parameters, as we denote in the
signature string that follows the method name (as usual, command line
completion helps you get this right). Following the signature, the
next two parameters are simply the two strings to pass; the signature
string hence indicates what comes next. systemd’s StartUnit method
call takes the unit name to start as its first parameter, and the mode
in which to start it as its second. The call returned a single object
path value, encoded the same way as the input parameters: a signature
(just “o” for the object path) followed by the actual value.
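
Incidentally, the signature string plays exactly the same role at the
C level. Here is a rough sketch of the same StartUnit exchange done
with explicit sd-bus message calls (a convenience wrapper for all of
this is shown further down in this post); most error handling is
omitted for brevity:

#include <stdio.h>
#include <systemd/sd-bus.h>

/* Sketch only: build, send and parse the StartUnit call by hand */
int start_cups(sd_bus *bus) {
        sd_bus_message *request = NULL, *reply = NULL;
        sd_bus_error error = SD_BUS_ERROR_NULL;
        const char *job_path;
        int r;

        sd_bus_message_new_method_call(bus, &request,
                                       "org.freedesktop.systemd1",
                                       "/org/freedesktop/systemd1",
                                       "org.freedesktop.systemd1.Manager",
                                       "StartUnit");

        /* "ss": two strings, just like on the busctl command line */
        sd_bus_message_append(request, "ss", "cups.service", "replace");

        /* Send the message and wait for the reply */
        r = sd_bus_call(bus, request, 0, &error, &reply);
        if (r >= 0) {
                /* "o": a single object path, just like busctl printed */
                r = sd_bus_message_read(reply, "o", &job_path);
                if (r >= 0)
                        printf("Queued job as %s\n", job_path);
        }

        sd_bus_error_free(&error);
        sd_bus_message_unref(request);
        sd_bus_message_unref(reply);
        return r;
}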

Of course, some method call parameters can get a ton more complex, but
with busctl it’s relatively easy to encode them all. See the man page
for details.

busctl knows a number of other operations. For example, you can use
it to monitor D-Bus traffic as it happens (including generating a
.cap file for use with Wireshark!) or you can set or get specific
properties. However, this blog story was supposed to be about sd-bus,
not busctl, hence let’s cut this short here, and let me direct you
to the man page in case you want to know more about the tool.

busctl (like the rest of systemd) is implemented using the sd-bus
API. Thus it exposes many of the features of sd-bus itself. For
example, you can use it to connect to remote or container buses. It
understands both kdbus and classic D-Bus, and more!
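
For instance, connecting to the system bus of a local container is a
single call with sd-bus. Here is a minimal sketch; the container name
“mycontainer” is a placeholder, and sd_bus_open_system_remote() works
analogously for ssh connections to remote hosts:

#include <stdio.h>
#include <string.h>
#include <systemd/sd-bus.h>

int main(void) {
        sd_bus *bus = NULL;
        int r;

        /* Connect to the system bus inside the container "mycontainer" */
        r = sd_bus_open_system_machine(&bus, "mycontainer");
        if (r < 0) {
                fprintf(stderr, "Failed to connect to container bus: %s\n", strerror(-r));
                return 1;
        }

        printf("Connected to the container's system bus.\n");
        sd_bus_unref(bus);
        return 0;
}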

sd-bus

But enough! Let’s get back on topic, let’s talk about sd-bus itself.

The sd-bus set of APIs is mostly contained in the header file
sd-bus.h.

Here’s a random selection of features of the library that make it
compare well with the other implementations available.

  • Supports both kdbus and dbus1 as back-end.

  • Has high-level support for connecting to remote buses via ssh, and
    to buses of local OS containers.

  • Powerful credential model, to implement authentication of clients
    in services. Currently 34 individual fields are supported, from the
    PID of the client to the cgroup or capability sets (see the sketch
    right after this list).

  • Support for tracking the life-cycle of peers in order to release
    local objects automatically when all peers referencing them have
    disconnected.

  • The client builds an efficient decision tree to determine which
    handlers to deliver an incoming bus message to.

  • Automatically translates D-Bus errors into UNIX style errors and
    back (this is lossy though), to ensure best integration of D-Bus
    into low-level Linux programs.

  • Powerful but lightweight object model for exposing local objects on
    the bus. Automatically generates introspection as necessary.
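
To give you an idea of what the credential model looks like in
practice, here is a hedged sketch of a method handler (written in the
style of the service example further down) that asks the bus for the
caller’s UID and PID before replying; which fields are actually
available depends on your kernel and libsystemd version:

#include <stdio.h>
#include <stdint.h>
#include <sys/types.h>
#include <systemd/sd-bus.h>

static int method_whoami(sd_bus_message *m, void *userdata, sd_bus_error *ret_error) {
        sd_bus_creds *creds = NULL;
        uid_t uid = 0;
        pid_t pid = 0;
        int r;

        /* Ask the bus for the credentials of the sender of this message */
        r = sd_bus_query_sender_creds(m, SD_BUS_CREDS_UID | SD_BUS_CREDS_PID, &creds);
        if (r < 0)
                return r;

        sd_bus_creds_get_uid(creds, &uid);
        sd_bus_creds_get_pid(creds, &pid);
        fprintf(stderr, "Call from uid=%lu pid=%lu\n",
                (unsigned long) uid, (unsigned long) pid);

        sd_bus_creds_unref(creds);

        /* Reply with the caller's UID as an unsigned 32bit integer */
        return sd_bus_reply_method_return(m, "u", (uint32_t) uid);
}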

The API is currently not fully documented, but we are working on
completing the set of manual pages. For details
see all pages starting with sd_bus_.

Invoking a Method, from C, with sd-bus

So much for the library in general. Here’s an example of connecting
to the bus and issuing a method call:

#include <stdio.h>
#include <stdlib.h>
#include <systemd/sd-bus.h>

int main(int argc, char *argv[]) {
        sd_bus_error error = SD_BUS_ERROR_NULL;
        sd_bus_message *m = NULL;
        sd_bus *bus = NULL;
        const char *path;
        int r;

        /* Connect to the system bus */
        r = sd_bus_open_system(&bus);
        if (r < 0) {
                fprintf(stderr, "Failed to connect to system bus: %sn", strerror(-r));
                goto finish;
        }

        /* Issue the method call and store the response message in m */
        r = sd_bus_call_method(bus,
                               "org.freedesktop.systemd1",           /* service to contact */
                               "/org/freedesktop/systemd1",          /* object path */
                               "org.freedesktop.systemd1.Manager",   /* interface name */
                               "StartUnit",                          /* method name */
                               &error,                               /* object to return error in */
                               &m,                                   /* return message on success */
                               "ss",                                 /* input signature */
                               "cups.service",                       /* first argument */
                               "replace");                           /* second argument */
        if (r < 0) {
                fprintf(stderr, "Failed to issue method call: %sn", error.message);
                goto finish;
        }

        /* Parse the response message */
        r = sd_bus_message_read(m, "o", &path);
        if (r < 0) {
                fprintf(stderr, "Failed to parse response message: %sn", strerror(-r));
                goto finish;
        }

        printf("Queued service job as %s.n", path);

finish:
        sd_bus_error_free(&error);
        sd_bus_message_unref(m);
        sd_bus_unref(bus);

        return r < 0 ? EXIT_FAILURE : EXIT_SUCCESS;
}

Save this example as bus-client.c, then build it with:

$ gcc bus-client.c -o bus-client `pkg-config --cflags --libs libsystemd`

This will generate a binary bus-client you can now run. Make sure to
run it as root though, since access to the StartUnit method is
privileged:

# ./bus-client
Queued service job as /org/freedesktop/systemd1/job/3586.

And that’s it already, our first example. It showed how we invoked a
method call on the bus. The actual function call of the method is very
close to the busctl command line we used before. I hope the code
excerpt needs little further explanation. It’s supposed to give you a
taste of how to write D-Bus clients with sd-bus. For more
information, please have a look at the header file, the man page or
even the sd-bus sources.

Implementing a Service, in C, with sd-bus

Of course, just calling a single method is a rather simplistic
example. Let’s have a look at how to write a bus service. We’ll write
a small calculator service that exposes a single object, which
implements an interface with two methods: one to multiply two
64bit signed integers, and one to divide one 64bit signed integer by
another.

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <systemd/sd-bus.h>

static int method_multiply(sd_bus_message *m, void *userdata, sd_bus_error *ret_error) {
        int64_t x, y;
        int r;

        /* Read the parameters */
        r = sd_bus_message_read(m, "xx", &x, &y);
        if (r < 0) {
                fprintf(stderr, "Failed to parse parameters: %sn", strerror(-r));
                return r;
        }

        /* Reply with the response */
        return sd_bus_reply_method_return(m, "x", x * y);
}

static int method_divide(sd_bus_message *m, void *userdata, sd_bus_error *ret_error) {
        int64_t x, y;
        int r;

        /* Read the parameters */
        r = sd_bus_message_read(m, "xx", &x, &y);
        if (r < 0) {
                fprintf(stderr, "Failed to parse parameters: %sn", strerror(-r));
                return r;
        }

        /* Return an error on division by zero */
        if (y == 0) {
                sd_bus_error_set_const(ret_error, "net.poettering.DivisionByZero", "Sorry, can't allow division by zero.");
                return -EINVAL;
        }

        return sd_bus_reply_method_return(m, "x", x / y);
}

/* The vtable of our little object, implements the net.poettering.Calculator interface */
static const sd_bus_vtable calculator_vtable[] = {
        SD_BUS_VTABLE_START(0),
        SD_BUS_METHOD("Multiply", "xx", "x", method_multiply, SD_BUS_VTABLE_UNPRIVILEGED),
        SD_BUS_METHOD("Divide",   "xx", "x", method_divide,   SD_BUS_VTABLE_UNPRIVILEGED),
        SD_BUS_VTABLE_END
};

int main(int argc, char *argv[]) {
        sd_bus_slot *slot = NULL;
        sd_bus *bus = NULL;
        int r;

        /* Connect to the user bus this time */
        r = sd_bus_open_user(&bus);
        if (r < 0) {
                fprintf(stderr, "Failed to connect to system bus: %sn", strerror(-r));
                goto finish;
        }

        /* Install the object */
        r = sd_bus_add_object_vtable(bus,
                                     &slot,
                                     "/net/poettering/Calculator",  /* object path */
                                     "net.poettering.Calculator",   /* interface name */
                                     calculator_vtable,
                                     NULL);
        if (r < 0) {
                fprintf(stderr, "Failed to issue method call: %sn", strerror(-r));
                goto finish;
        }

        /* Take a well-known service name so that clients can find us */
        r = sd_bus_request_name(bus, "net.poettering.Calculator", 0);
        if (r < 0) {
                fprintf(stderr, "Failed to acquire service name: %sn", strerror(-r));
                goto finish;
        }

        for (;;) {
                /* Process requests */
                r = sd_bus_process(bus, NULL);
                if (r < 0) {
                        fprintf(stderr, "Failed to process bus: %sn", strerror(-r));
                        goto finish;
                }
                if (r > 0) /* we processed a request, try to process another one, right-away */
                        continue;

                /* Wait for the next request to process */
                r = sd_bus_wait(bus, (uint64_t) -1);
                if (r < 0) {
                        fprintf(stderr, "Failed to wait on bus: %sn", strerror(-r));
                        goto finish;
                }
        }

finish:
        sd_bus_slot_unref(slot);
        sd_bus_unref(bus);

        return r < 0 ? EXIT_FAILURE : EXIT_SUCCESS;
}

Save this example as bus-service.c, then build it with:

$ gcc bus-service.c -o bus-service `pkg-config --cflags --libs libsystemd`

Now, let’s run it:

$ ./bus-service

In another terminal, let’s try to talk to it. Note that this service
is now on the user bus, not on the system bus as before. We do this
for simplicity: on the system bus, access to services is tightly
controlled so that unprivileged clients cannot request privileged
operations. On the user bus, however, things are simpler: since only
processes of the user owning the bus can connect, no further policy
enforcement will complicate this example. Because the service is on
the user bus, we have to pass the --user switch on the busctl
command line. Let’s start by looking at the service’s object tree.

$ busctl --user tree net.poettering.Calculator
└─/net/poettering/Calculator

As we can see, there’s only a single object on the service, which is
not surprising, given that our code above only registered one. Let’s
see the interfaces and the members this object exposes:

$ busctl --user introspect net.poettering.Calculator /net/poettering/Calculator
NAME                                TYPE      SIGNATURE RESULT/VALUE FLAGS
net.poettering.Calculator           interface -         -            -
.Divide                             method    xx        x            -
.Multiply                           method    xx        x            -
org.freedesktop.DBus.Introspectable interface -         -            -
.Introspect                         method    -         s            -
org.freedesktop.DBus.Peer           interface -         -            -
.GetMachineId                       method    -         s            -
.Ping                               method    -         -            -
org.freedesktop.DBus.Properties     interface -         -            -
.Get                                method    ss        v            -
.GetAll                             method    s         a{sv}        -
.Set                                method    ssv       -            -
.PropertiesChanged                  signal    sa{sv}as  -            -

The sd-bus library automatically added a couple of generic interfaces,
as mentioned above. But the first interface we see is actually the one
we added! It shows our two methods, and both take “xx” (two 64bit
signed integers) as input parameters, and return one “x”. Great! But
does it work?

$ busctl --user call net.poettering.Calculator /net/poettering/Calculator net.poettering.Calculator Multiply xx 5 7
x 35

Woohoo! We passed the two integers 5 and 7, and the service actually
multiplied them for us and returned a single integer 35! Let’s try the
other method:

$ busctl --user call net.poettering.Calculator /net/poettering/Calculator net.poettering.Calculator Divide xx 99 17
x 5

Oh, wow! It can even do integer division! Fantastic! But let’s trick
it into dividing by zero:

$ busctl --user call net.poettering.Calculator /net/poettering/Calculator net.poettering.Calculator Divide xx 43 0
Sorry, can't allow division by zero.

Nice! The service detected this and returned a clean error about
it. If you look at the source code example above, you’ll see precisely
how we generated the error.
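
For completeness, here is a sketch of what that error looks like to a
C client: when the call fails, sd_bus_call_method() returns a negative
value and fills in the sd_bus_error structure with the very error name
and message our service set:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <systemd/sd-bus.h>

int main(void) {
        sd_bus_error error = SD_BUS_ERROR_NULL;
        sd_bus_message *reply = NULL;
        sd_bus *bus = NULL;
        int64_t result;
        int r;

        /* The calculator service lives on the user bus */
        r = sd_bus_open_user(&bus);
        if (r < 0)
                goto finish;

        /* Deliberately divide by zero */
        r = sd_bus_call_method(bus,
                               "net.poettering.Calculator",  /* service name */
                               "/net/poettering/Calculator", /* object path */
                               "net.poettering.Calculator",  /* interface name */
                               "Divide",                     /* method name */
                               &error,
                               &reply,
                               "xx", (int64_t) 43, (int64_t) 0);
        if (r < 0) {
                /* error.name is "net.poettering.DivisionByZero",
                 * error.message is the text the service set */
                fprintf(stderr, "Call failed: %s: %s\n", error.name, error.message);
                goto finish;
        }

        r = sd_bus_message_read(reply, "x", &result);
        if (r >= 0)
                printf("Result: %lld\n", (long long) result);

finish:
        sd_bus_error_free(&error);
        sd_bus_message_unref(reply);
        sd_bus_unref(bus);
        return r < 0 ? EXIT_FAILURE : EXIT_SUCCESS;
}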

And that’s really all I have for today. Of course, the examples I
showed are short, and I don’t go into detail here on what precisely
each line does. However, this is supposed to be a short introduction
to D-Bus and sd-bus, and it’s already way too long for that …

I hope this blog story was useful to you. If you are interested in
using sd-bus for your own programs, I hope this gets you started. If
you have further questions, check the (incomplete) man pages, and ask
us on IRC or the systemd mailing list. If you need more examples, have
a look at the systemd source tree: all of systemd’s many bus services
use sd-bus extensively.