Tag Archives: Amazon QuickSight

Tips and tricks for high-performant dashboards in Amazon QuickSight

Post Syndicated from Shekhar Kopuri original https://aws.amazon.com/blogs/big-data/tips-and-tricks-for-high-performant-dashboards-in-amazon-quicksight/

Amazon QuickSight is a cloud-native business intelligence (BI) service. QuickSight automatically optimizes queries and execution to help dashboards load quickly, but you can make your dashboards load even faster and ensure you’re getting the best possible performance by following the tips and tricks outlined in this post.

Data flow and execution of QuickSight dashboard loads

The data flow in QuickSight starts from the client browser to the web server and then flows to the QuickSight engine, which in some cases executes queries against SPICE—a Super-fast, Parallel, In-memory Calculation Engine—or in other cases directly against the database. SPICE uses a combination of columnar storage, in-memory technologies enabled through the latest hardware innovations, and machine code generation to run interactive queries on large datasets and get rapid responses.

The web server, QuickSight engine, and SPICE are auto scaled by QuickSight. This is a fully managed service—you don’t need to worry about provisioning or managing infrastructure when you want to scale up a particular dashboard from tens to thousands of users on SPICE. Dashboards built against direct query data sources may require provisioning or managing infrastructure on the customer side.

The following diagram illustrates the data flow:

Let’s look at the general execution process to understand the implications:

  • A request is triggered in the browser, leading to several static assets such as JavaScript, fonts, and images being downloaded.
  • All the metadata (such as visual configurations and layout) is fetched for the dashboard.
  • Queries are performed, which may include setting up row-level and column-level security, or fetching dynamic control values, default parameters, and all values of drop-downs in filter controls.
  • Up to your concurrency limit, the queries to render your visuals run in a specific sequence (described later in this post). If you’re using SPICE, the concurrency of queries is much higher. Pagination within visuals may lead to additional queries.

The actual execution is more complex and depends on how dashboards are configured, as well as other factors such as the data source type, direct query versus SPICE, the cardinality of fields, and how often the data is refreshed. Many operations run in parallel, and all visual-related queries run over WebSocket, as shown in the following screenshot. Many of the steps run in the end-user’s browser, so there are limitations on the number of sequences and workloads that can be pushed onto the browser. Performance may also differ slightly by browser type because each browser handles contention differently.

Now let’s look at many great tips that can improve your dashboard’s performance!

SPICE

Utilizing the capabilities of SPICE when possible is a great way to boost overall performance because SPICE manages scaling as well as caching results for you. We recommend using SPICE whenever possible.

Metadata

As seen in the preceding execution sequence, QuickSight fetches metadata up front for a given dashboard during the initial load. We recommend the following actions regarding metadata.

Remove unused datasets from analysis

Datasets that were used in the past but no longer have any visuals associated with them in the dashboard add to the metadata payload unnecessarily and are likely to impact dashboard performance.

Make sure your row-level and column-level security is performant

Row-level security, column-level security, and dynamic default parameters each require lookups to take place before the visual queries are issued. When possible, try to limit the number and complexity of your rules datasets to help these lookups run faster. Use SPICE for your rules dataset when possible. If you must use a direct query, make sure that the queries are optimal and that the data source you’re querying is scaled appropriately up front.

For embedded dashboards, a great way to optimize row-level security lookups is by utilizing session tags for row-level security paired with an anonymous identity. Similarly, dynamic default parameters, if used, can be evaluated in the host application up front and passed using the embedding SDK.
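For example, the following sketch uses the AWS SDK for JavaScript v3 on the backend to request an anonymous embed URL with a session tag; the region, account ID, dashboard ID, and tag key/value are placeholders, and it assumes your rules dataset is keyed on the tag:

    import { QuickSightClient, GenerateEmbedUrlForAnonymousUserCommand } from '@aws-sdk/client-quicksight';

    const client = new QuickSightClient({ region: 'us-east-1' }); // placeholder region

    export async function getAnonymousEmbedUrl() {
      const dashboardArn = 'arn:aws:quicksight:us-east-1:123456789012:dashboard/my-dashboard-id'; // placeholder ARN
      const { EmbedUrl } = await client.send(new GenerateEmbedUrlForAnonymousUserCommand({
        AwsAccountId: '123456789012',                 // placeholder account ID
        Namespace: 'default',
        AuthorizedResourceArns: [dashboardArn],
        ExperienceConfiguration: { Dashboard: { InitialDashboardId: 'my-dashboard-id' } }, // placeholder ID
        SessionTags: [{ Key: 'tenant_id', Value: 'tenant-42' }], // hypothetical row-level security tag
        SessionLifetimeInMinutes: 60,
      }));
      return EmbedUrl; // the rules dataset evaluates tenant_id without a per-user lookup
    }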

Calculated functions

In this section, we offer tips regarding calculated functions.

Move calculations to the data prep stage

QuickSight allows you to add calculated fields in the data prep or analysis experiences. We strongly encourage you to move as many calculations as possible to the data prep stage, which allows QuickSight to materialize calculations that don’t contain aggregations or parameters into the SPICE dataset. Materializing calculated fields in the dataset reduces runtime calculations, which improves query performance. Even if you’re using aggregations or parameters in your calculation, it might still be possible to move parts of the calculation to data prep. For instance, suppose you have a formula like the following (the field names are illustrative):
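    sum(ifelse(Category = 'Furniture', Sales, 0))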

You can remove the sum() and keep just the ifelse(), which allows QuickSight to materialize (precompute) it and save it as a real field in your SPICE dataset. Then you can either add another calculation that sums it up, or just use the sum aggregation once you add it to your visuals.
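For example, in data prep you could create a calculated field named FurnitureSales (a hypothetical name) with the following expression, which QuickSight can materialize into SPICE:

    ifelse(Category = 'Furniture', Sales, 0)

In the analysis, you then aggregate the materialized field at query time:

    sum(FurnitureSales)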

Generally, materializing calculations that use complex ifelse logic or perform string manipulations or lookups results in the greatest improvements in dashboard performance.

Implement the simplified ifelse syntax

The ifelse function supports simplified statements, which avoid deep nesting. For example, you might start with a nested statement like the following (the field and values are illustrative):
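    ifelse(Region = 'East', 'US East', ifelse(Region = 'West', 'US West', ifelse(Region = 'Central', 'US Central', 'Other')))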

The following simplified statement is more performant:
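    ifelse(Region = 'East', 'US East', Region = 'West', 'US West', Region = 'Central', 'US Central', 'Other')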

Use the toString() function judiciously

The toString() function performs far worse and is much heavier on the database engine than simple integer or number-based arithmetic calculations, so use it sparingly.
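For example, with a hypothetical numeric StatusCode field, the following comparison forces a string conversion on every row:

    ifelse(toString(StatusCode) = '200', 'OK', 'Error')

A direct numeric comparison avoids the conversion entirely:

    ifelse(StatusCode = 200, 'OK', 'Error')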

Know when nulls are returned by the system and use null value customization

Most authors make sure that null conditions in calculated fields are handled gracefully, but QuickSight often handles nulls gracefully for you. You can use that to your advantage to keep calculations simpler. In the following example (the field names are illustrative), the explicit guard is unnecessary because QuickSight already handles the division by 0 for you:
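    ifelse(TotalOrders = 0, NULL, Revenue / TotalOrders)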

You can write the preceding code as the following:
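    Revenue / TotalOrders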

If you need to represent nulls on visuals with a static string, QuickSight allows you to set custom values when a null value is returned in a visual configuration. In the preceding example, you could just set a custom value of 0 in the formatting option. Removing such handling from the calculated fields can significantly help query performance.

On-sheet filters vs. parameters

Parameters are seemingly a simple construct, but they can quickly get complicated, especially when used in nested calculation functions or in controls. Parameters are all evaluated on the fly, forcing all their dependencies to be handled in real time. Ask yourself whether each parameter is really required. In some cases, you may be able to replace them with a simple dropdown control, as shown in the following example for $market.

Instead of creating a control parameter to use in a calculated field, you might be able to use the field with a dropdown filter control.

Text field vs. Dropdown (or List) filter controls

When you design an analysis, you can add a filter control for the visuals you want to filter. If the data type of the field is string, you have several choices for the type of filter control. For better performance, use a Text field control, which displays a text box where you can enter one or more entries, rather than a Dropdown (or List) control, which has to fetch all the values to populate a list from which you select one or more values.

On-sheet controls

The control panel at the top of the dashboard is collapsible by default, but a setting allows you to publish the dashboard with the panel expanded. If this setting is enabled, QuickSight prioritizes the calls to fetch the controls’ values before the visuals load. If any of the controls have high cardinality, this could impact the performance of loading the dashboard. Evaluate this need against the fact that QuickSight persists last-used control values, so the reader might not actually need to adjust controls as a first step.

Visual types: Charts

In this section, we provide advice when using Charts.

Use ‘Hide the “other” category’ when your dimension has fewer values than the cutoff limit

You can choose to limit how many data points to display in your visual before they’re added to the other category. This category contains the aggregated data for all the data beyond the cutoff limit for the visual type you’re using, whether that’s a limit you impose or one based on the visual’s display limits. If you know your dimension has fewer values than the cutoff limit, use this option to improve your dashboard performance.

The other category does not show on scatter plots, heat maps, maps, tables (tabular reports), or key performance indicators (KPIs). It also doesn’t show on line charts when the x-axis is a date.

Visual types: Tables and pivot tables

In this section, we provide advice when using tables and pivot tables.

Use the Values field well when displaying a raw table view

If you want to output all the raw data into a table, you can use Group by fields, Values fields, or a mix of them. The most performant approach is to place every field in Values. When you use Group by, a query first runs under the hood, followed by the Group by function, so all the data is pulled from the database, which is expensive.

Deploy a minimal set of rows, columns, metrics, and table calculations

If you include too many combinations of rows, columns, metrics, and table calculations in one pivot table, you risk overwhelming the viewer. You can also run into the computational limitations of the underlying database. To reduce the level of complexity and potential errors, you can take the following actions:

  • Apply filters to reduce the data included in the visual
  • Use fewer fields in the Row and Column field wells
  • Use as few fields as possible in the Values field well
  • Create additional pivot tables so that each displays fewer metrics
  • Reduce subtotals, totals and conditional formatting when possible

Uncollapsed columns are the simplest case and will likely remain more performant in all but a few cases.

Visual queries sequence

Individual visuals run in sequence from left to right, then top to bottom. Understanding this sequence of execution can be helpful: you can rearrange visuals on your dashboard without losing context. Place heavier visuals farther down in the dashboard, and place lightweight KPI and insight visuals near the top to display “above-the-fold” content sooner, which improves your readers’ perception of dashboard performance.

Embedding

Our final set of recommendations concerns embedding.

Remove user management flows from the critical path

Most of the time, user management and authentication flows (such as the DescribeUser and RegisterUser APIs) can run asynchronously on the host application.

Consider registering the user in advance before the actual embedding, so that the overhead is removed from every analytics page visit.

Authenticate the user on your website in advance, and acquire any Amazon Cognito or AWS Security Token Service (Amazon STS) session tokens (if required) in advance (for example, at user login time or home page visit). This reduces additional runtime latency overhead when a user visits an analytics page.

Move workloads from clients to the web server or backend services

If a QuickSight dashboard is embedded on a webpage of the host application that performs other activities too, pay close attention to the sequence of API calls on the host. The QuickSight dashboard load might be gated by other heavy API calls on the host application. Move logic to the web server or backend services as much as possible to limit contention in the browser.

Don’t tear down the embedding iFrame when the user navigates away from the analytics section

When the user moves temporarily to a non-analytics page of your web application (especially in single-page applications), instead of removing the embedding iFrame from the DOM, you can hide it from the user while keeping the iFrame in the page’s DOM. This allows you to resume the same session when the user navigates back to the analytics section of your application, without waiting for a reload.
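The following is a minimal sketch of this pattern in the host application; the element ID and function names are hypothetical:

    // Hide the embedding iFrame instead of removing it from the DOM
    const container = document.getElementById('qs-embed-container'); // hypothetical element ID

    function leaveAnalyticsSection() {
      container.style.display = 'none';  // the iFrame and its QuickSight session stay alive
    }

    function returnToAnalyticsSection() {
      container.style.display = 'block'; // the same session resumes without a reload
    }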

Use navigateToDashboard() and navigateToSheet() whenever possible

If you have multiple dashboards on your host application that don’t need to load concurrently, you can optimize the authentication flow by utilizing two APIs we expose in our JavaScript SDK: navigateToDashboard() and navigateToSheet(). These APIs reuse the same iFrame for each load, while reusing the authentication token.
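The following sketch uses the embedding SDK to load one dashboard and then switch to another in the same iFrame; the container element ID and dashboard ID are placeholders, and embedUrl is assumed to come from your backend:

    import { embedDashboard } from 'amazon-quicksight-embedding-sdk';

    // Embed the first dashboard once; the iFrame and authentication token are then reused.
    const dashboard = embedDashboard({
      url: embedUrl,                     // embed URL generated by your backend
      container: '#qs-embed-container',  // hypothetical element ID
    });

    // Later, switch dashboards without tearing down the iFrame:
    dashboard.navigateToDashboard({ dashboardId: 'second-dashboard-id' });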

This technique has proven to be very effective for many of our embedding users.

For more information about these APIs, refer to Amazon QuickSight Embedding SDK.

Conclusion

In this post, we shared some tips and tricks for tuning the performance of your QuickSight dashboards. In 2021, we doubled our SPICE data limits to 500 million rows of data per dataset. In addition, incremental data refresh is available for SQL-based data sources such as Amazon Redshift, Amazon Athena, Amazon RDS, Amazon Aurora, PostgreSQL, MySQL, Oracle, SQL Server, MariaDB, Presto, Teradata, or Snowflake, as often as every 15 minutes, which cuts the time between data updates by 75%. In 2022, we continue to innovate on your behalf to make QuickSight dashboard loads even more performant.

We look forward to your feedback on how these tips and tricks helped your dashboards load faster.


About the Authors

Shekhar Kopuri is a Senior Software Development Manager for Amazon QuickSight. He leads the front-end platform engineering team that focuses on various aspects of the front-end experience, including website performance. Before joining AWS, Shekhar led development of multiple provisioning and activation network OSS applications for a large global telecommunications service provider.

Blake Carroll is a Senior Frontend Engineer for Amazon QuickSight. He works with the frontend platform engineering team with a focus on website performance and has previously been the frontend lead for initial reporting and theming functionality in QuickSight. Prior to joining Amazon, Blake was a co-founder in the digital interactive agency space working with national brands to produce creative web experiences.

Vijay Chaudhari is a Senior Software Development Engineer for Amazon QuickSight, AWS’ cloud-native, fully managed BI service. Vijay started his career with IBM, writing software for the Information Management group. At Amazon, he has built backend applications for retail systems, and near real-time data pre-computation, reporting and analytics systems at Amazon scale. He is passionate about learning and solving new customer problems, and helping them adopt cloud native technologies.

Wakana Vilquin-Sakashita is a Specialist Solutions Architect for Amazon QuickSight. She works closely with customers to help them make sense of their data through visualization. Previously, Wakana worked for S&P Global, assisting customers in accessing data, insights, and research relevant to their business.

Coming June 2022: An updated Amazon QuickSight dashboard experience

Post Syndicated from Rushabh Vora original https://aws.amazon.com/blogs/big-data/coming-june-2022-an-updated-amazon-quicksight-dashboard-experience/

Starting June 30, 2022, Amazon QuickSight is introducing a new look and feel for your dashboards. In this post, we walk through the changes to expect with the new look. The new dashboard experience includes the following improvements:

  • Simplified toolbar
  • Discoverable visual menu
  • Polished controls, menu, and submenus
  • Non-blocking right pane for secondary experiences like filters, threshold alerts, and downloads

Simplified toolbar

The simplified toolbar experience offers updated icons for key actions for better visual clarity.

The following screenshot shows the old look.


The following screenshot shows the new look with updated icons.


Discoverable visual menu

The visual menu is visible on-hover to improve the discoverability of drills, export, and filter restatements.

To use the visual menu with the old version, you needed to select the visual.


With the new look, you can view the menu by hovering over it.


Polished controls, menu, and submenus

The new look features improved controls, menus, submenus, toast notifications, and other dashboard elements to provide a more polished visual experience.

For example, the following screenshot shows the old look for calendar controls.


The following screenshot shows the new look.


The following screenshot shows the old look for the menu and submenus.


The following screenshot shows the new look.


Non-blocking right pane

The new look features a non-blocking right pane for secondary experiences like filters, threshold alerts, and downloads, to improve focus on the content of the dashboard.

The following animation shows the old look for ad-hoc filtering.


The filters are now moved to the right pane.


The following animation shows the old look for creating threshold alerts.


Threshold alert creation is now in the right pane.


The following animation shows the old look for the downloading experience.


The new look offers a downloads pane.


Summary

The new look for the QuickSight dashboard experience will be available starting June 30, 2022. If you have any questions or feedback, please reach out to us by leaving a comment.


About the Author

Rushabh Vora is a Senior Technical Product Manager for Amazon QuickSight, Amazon Web Service’s cloud-native, fully managed BI service. He is passionate about Data Visualization. Prior to QuickSight, he was working with Amazon Business as a Product Manager.

Top Amazon QuickSight features and updates launched Q1 2022

Post Syndicated from Mia Heard original https://aws.amazon.com/blogs/big-data/top-amazon-quicksight-features-and-updates-launched-q1-2022/

Amazon QuickSight is a serverless, cloud-based business intelligence (BI) service that brings data insights to your teams and end users through machine learning (ML)-powered dashboards and data visualizations, which can be accessed via QuickSight or embedded in apps and portals that your users access. This post shares the top QuickSight features and updates launched in Q1 2022.

Amazon QuickSight Community

In the new Amazon QuickSight Community, you can ask and answer questions, network with and learn from other BI users from across the globe, access learning content, and stay up to date with what’s new on QuickSight all in one place!

Join the global QuickSight Community today!

Groups Management UI

QuickSight now provides a user interface to manage user groups, allowing admins to efficiently and easily manage user groups via the QuickSight admin console. Groups Management UI is available to administrators with access to QuickSight admin console pages via AWS Identity and Access Management (IAM) credentials.

To learn more, see Creating and managing groups in Amazon QuickSight.

Comparative and cumulative date and time calculations

Amazon QuickSight authors can now quickly implement advanced date/time calculations without having to use complicated row offsets or pre-computed columns. You can add these calculations in regular business reporting, trend analysis, and time series analysis.

To learn more about the new period functions and their capabilities in various use cases, see Add comparative and cumulative date/time calculations in Amazon QuickSight.

Rich text formatting on visual titles and subtitles 

QuickSight authors can now add rich context to their visuals by choosing from various formatting options like font type, size, style, and color. You can also better organize the text by choosing from various alignment and ordering options. Visual titles and subtitles now also support hyperlinks as well as parameter-based dynamic text.

To learn more, see Formatting a visual title and subtitle.

Custom subtotals at all levels on pivot table

QuickSight allows you to customize how subtotals are displayed in pivot tables, with options for display at multiple levels and for both rows and columns.

To learn more, see Displaying Totals and Subtotals.

Auto refresh direct query controls

QuickSight now supports automatic refreshes of values displayed in drop-down, multi-select and other controls in dashboards that are in direct query mode. Values within controls are updated every 24 hours to ensure the data is automatically updated without any end-user intervention.

For further details, see Refreshing data in Amazon QuickSight.

Conclusion

QuickSight serves millions of dashboard views weekly, enabling data-driven decision-making in organizations of all sizes, including customers like the NFL, 3M, Accenture, and more.

To stay up to date on all things new with QuickSight, visit What’s New with Analytics!


About the Author

Mia Heard is a product marketing manager for Amazon QuickSight, AWS’ cloud-native, fully managed BI service.

Detecting data drift using Amazon SageMaker

Post Syndicated from Shibu Nair original https://aws.amazon.com/blogs/architecture/detecting-data-drift-using-amazon-sagemaker/

As companies continue to embrace the cloud and digital transformation, they use historical data to identify trends and insights. This data is foundational for powering tools such as data analytics and machine learning (ML) and for achieving high-quality results.

This is a time where major disruptions are not only lasting longer, but also happening more frequently, as discussed in a McKinsey article on risk and resilience. Any disruption—a pandemic, hurricane, or even blocked sailing routes—has a major impact on the patterns of data and can create anomalous behavior.

ML models are dependent on data insights to help plan and support production-ready applications. With any disruption, data drift can occur. Data drift is an unexpected and undocumented change to data structure, semantics, or infrastructure. If there is data drift, model performance degrades and no longer provides accurate guidance. To mitigate the effects of the disruption, data drift needs to be detected and the ML models quickly retrained and adjusted accordingly.

This blog post explains how to approach changing data patterns in the age of disruption and how to mitigate its effects on ML models. We also discuss the steps of building a feedback loop to capture the request data in the production environment and create a data pipeline to store the data for profiling and baselining. Then, we explain how Amazon SageMaker Clarify can help detect data drift.

How to detect data drift

There are three stages to detecting data drift: data quality monitoring, model quality monitoring, and drift evaluation (see Figure 1).


Figure 1. Stages in detecting data drift

Data quality monitoring establishes a profile of the input data during model training, and then continuously compares incoming data with the profile. Deviations in the data profile signal a drift in the input data.

You can also detect drift through model quality monitoring, which requires capturing actual values that can be compared with the predictions. For example, using weekly demand forecasting, you can compare the forecast quantities one week later with the actual demand. Some use cases can require extra steps to collect actual values. For example, product recommendations may require you to ask a selected group of consumers for their feedback to the recommendation.

SageMaker Clarify provides insights into your trained models, including the importance of model features and any biases towards certain segments of the input data. Changes in these attributes between retrained models also signal drift. Drift evaluation comprises the monitoring data and the mechanisms to detect changes and trigger consequent actions. With Amazon CloudWatch, you can define rules and thresholds that prompt drift notifications.

Figure 2 illustrates a basic architecture with the data sources for training and production (on the left) and the observed data concerning drift (on the right). You can use Amazon SageMaker Data Wrangler, a visual data preparation tool, to clean and normalize your input data for your ML task. You can store the features that you defined for your models in the Amazon SageMaker Feature Store, a fully managed, purpose-built repository to store, update, retrieve, and share ML features.

The white, rectangular boxes in the architecture diagram represent the tasks for detecting data and model drift. You can integrate those tasks into your ML workflow with Amazon SageMaker Pipelines.


Figure 2. Basic architecture on how data drift is detected using Amazon SageMaker

The drift observation data can be captured in tabular format, such as comma-separated values or Parquet, on Amazon Simple Storage Service (S3) and analyzed with Amazon Athena and Amazon QuickSight.

How to build a feedback loop

The baselining task establishes a data profile from training data. It uses Amazon SageMaker Model Monitor and runs before training or re-training the model. The baseline profile is stored on Amazon S3 to be referenced by the data drift monitoring job.

The data drift monitoring task continuously profiles the input data, compares it with the baseline, and captures the results in CloudWatch. This task runs on its own computation resources using Deequ, which ensures that the monitoring job doesn’t slow down your ML inference flow and scales with the data. The frequency of running this task can be adjusted to control cost, depending on how rapidly you anticipate the data may change.

The model quality monitoring task computes model performance metrics from actuals and predicted values. The origin of these data points depends on the use case. Demand forecasting use cases naturally capture actuals that can be used to validate past predictions. Other use cases can require extra steps to acquire ground-truth data.

CloudWatch is a monitoring and observability service with which you can define rules to act on deviations in model performance or data drift. With CloudWatch, you can set up alerts to users via email or SMS, and it can automatically start the ML model retraining process.

Run the baseline task on your updated dataset before retraining your model. Use the SageMaker model registry to catalog your ML models for production, manage model versions, and control the associated training metrics.

Gaining insight into data and models

SageMaker Clarify provides greater visibility into your training data and models, helping you identify and limit bias and explain predictions. For example, the trained models may weight some features more strongly than others when generating predictions. Compare the feature importance and bias between model versions for a better understanding of the changes.

Conclusion

As companies continue to use data analytics and ML to inform daily activity, data drift may become a more common occurrence. Recognizing that drift can have a direct impact on models and production-ready applications, it is important to architect your solution to identify potential data drift and avoid degrading the models and negatively impacting results. Failure to capture changes in data can result in loss of process confidence, degraded model accuracy, or a bottom-line impact to the business.

Amazon QuickSight 1-click public embedding available in preview

Post Syndicated from Kareem Syed-Mohammed original https://aws.amazon.com/blogs/big-data/amazon-quicksight-1-click-public-embedding-available-in-preview/

Amazon QuickSight is a fully managed, cloud-native business intelligence (BI) service that makes it easy to connect to your data, create interactive dashboards, and share these with tens of thousands of users, either directly within a QuickSight application, or embedded in web apps and portals.

QuickSight Enterprise Edition now supports 1-click public embedding, a feature that allows you to embed your dashboards into public applications, wikis, and portals without any coding or development needed. Anyone on the internet can start accessing these embedded dashboards with up-to-date information instantly, without any server deployments or infrastructure licensing needed! 1-click public embedding allows you to empower your end-users with access to insights.

In this post, we walk you through the steps to use this feature, demonstrate the end-user experience, and share sample use cases.

Solution overview

1-click public embedding requires an administrator of the QuickSight account to enable this feature and use session-based pricing. After you complete the prerequisites (see the following section), the 1-click embedding process involves three simple steps:

  1. Enable public access on the dashboard
  2. Allow list the domain where you want to embed the dashboard in QuickSight
  3. Embed the dashboard

Prerequisites

  1. The account needs to have session capacity pricing enabled
  2. As a prerequisite, make sure your QuickSight account is enabled for dashboards to be shared with the public. Because this feature allows dashboards to be enabled for any user, it must be enabled by an administrator of your QuickSight account with the following AWS Identity and Access Management (IAM) permissions:
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": "quicksight:UpdatePublicSharingSettings",
                "Resource": "*",
                "Effect": "Allow"
            }
         ]
    }

After you enable this policy, you can activate the account-level settings.

  1. Sign in to your QuickSight account.
  2. On the user name drop-down menu, choose Manage QuickSight.
  3. Choose Security & permissions in the navigation pane.
  4. Under Public access to dashboards, choose Manage.
  5. Enable the setting Everyone on the internet to allow public access.
  6. Confirm this setting and choose Accept.

Enabling this setting doesn’t automatically enable all the dashboards to be accessed by anyone on the internet. It gives the ability for authors of the dashboards to individually enable the dashboard to be accessed by anyone on the internet via the share link or when embedded.

Enable public access permissions on the dashboard

After you create a QuickSight dashboard, to enable public access, complete the following steps:

  1. On the QuickSight dashboard, on the Share menu, choose Share dashboard.

Only owners and co-owners of the dashboard can perform this action.

  2. Under Enable access for, turn on Anyone on the internet (public).

This setting allows you to share this dashboard with anyone on the internet via the share link or when embedded.

  3. Accept this setting and choose Confirm.

You now have the option to copy the link or the embed code to share the dashboard. Note that when this setting is enabled, the dashboard can only be accessed using the link or when embedded using the embed code.

After you enable the dashboard for public access, you can see badges on the dashboard as follows.

The dashboard toolbar has a PUBLIC badge.

The dashboards grid view has an eye icon for each dashboard.

The dashboards list view has an eye icon for each dashboard.

Disable the public access

You can disable the public access of your dashboards in two ways:

  1. Disable the Anyone on the internet setting for the dashboard (to be done by the owners and co-owners of the dashboard).
  2. Disable the Public access to dashboard setting at the account level (to be done by the IAM admin). You can disable the account-level settings even if the dashboard-level setting is enabled. This still prevents the dashboards from being accessed publicly.

Allow list the domain where you want to embed the dashboard in QuickSight

The domain where the dashboard is to be embedded must be allow listed in QuickSight. For instructions, see Adding domains for embedded users.

Embed the dashboard

After you set your desired access to the dashboard, you can choose Copy embed code, which copies the embed code for that dashboard. This code embeds the dashboard when added to your application.

The copied embed code is similar to the following code:

    <iframe
        width="960"
        height="720"
        src="https://quicksight.aws.amazon.com/sn/embed/share/accounts/accountid/dashboards/dashboardid">
    </iframe>

To embed the dashboard in an HTML page, open the HTML of the page where you want to embed the dashboard and enter the copied embed code into the HTML code.

If your public-facing applications are built on Google Sites, to embed your dashboard, open the page on Google Sites, then choose Insert and Embed. A pop-up window appears with a prompt to enter a URL or embed code. Choose Embed code and enter the copied embed code in the text box.

Make sure to allow list the following domains in QuickSight when embedding in Google Sites: https://googleusercontent.com (enable subdomains), https://www.gstatic.com, and https://sites.google.com.

After you embed the dashboard in your application, anyone who can access your application can now access the embedded dashboard.

Sample use cases

1-click public embedding enables you to embed your dashboards into public applications, wikis, and portals without any coding or development needed. In this section, we present two sample use cases.

For our first use case, a fictitious school district uses 1-click public embedding to report on the teachers’ enrollment in the district. They built a dashboard and used this feature to embed it on their public-facing site.

For our second use case, a fictitious fintech that provides investment solutions is using 1-click public embedding to show how their investment compares against other well-known indexes and commodities. They used this feature to add this comparison dashboard on their public-facing marketing pages.

Try out 1-click public embedding

To try out this feature, see Embed Amazon QuickSight dashboard in seconds. In this demo, you can change the dashboard between a logistics or sales dashboard by choosing Change Dashboard and entering the embed code for the dashboard you want to render on the site.

Conclusion

With 1-click public embedding, you can now embed rich and interactive QuickSight dashboards quickly and easily. Enable your end-users to dive deeper into their data through embedded dashboards with the click of a button, and scale to millions of users with no infrastructure setup or management. 1-click public embedding is now in preview; to access this feature, please contact [email protected].

QuickSight also supports embedding in SaaS apps without any user management needed. For more information, refer to Embed multi-tenant dashboards in SaaS apps using Amazon QuickSight without provisioning or managing users.

To stay up to date on QuickSight embedded analytics, check out what’s new with the QuickSight User Guide.


About the Authors

Kareem Syed-Mohammed is a Product Manager at Amazon QuickSight. He focuses on embedded analytics, APIs, and developer experience. Prior to QuickSight he has been with AWS Marketplace and Amazon retail as a PM. Kareem started his career as a developer and then PM for call center technologies, Local Expert and Ads for Expedia. He worked as a consultant with McKinsey and Company for a short while.

Srikanth Baheti is a Specialized World Wide Sr. Solutions Architect for Amazon QuickSight. He started his career as a consultant and worked for multiple private and government organizations. Later he worked for PerkinElmer Health and Sciences and eResearch Technology Inc., where he was responsible for designing and developing high-traffic web applications and highly scalable, maintainable data pipelines for reporting platforms using AWS services and serverless computing.

Enhance analytics with Google Trends data using AWS Glue, Amazon Athena, and Amazon QuickSight

Post Syndicated from Drew Philip original https://aws.amazon.com/blogs/big-data/enhance-analytics-with-google-trends-data-using-aws-glue-amazon-athena-and-amazon-quicksight/

In today’s market, business success often lies in the ability to glean accurate insights and predictions from data. However, data scientists and analysts often find that the data they have at their disposal isn’t enough to help them make accurate predictions for their use cases. A variety of factors might alter an outcome and should be taken into account when making a prediction model. Google Trends is an available option, presenting a broad source of data that reflects global trends more comprehensively. This can help enrich a dataset to yield a better model.

You can use Google Trends data for a variety of analytical use cases. For example, you can use it to learn about how your products or brands are faring among targeted audiences. You can also use it to monitor competitors and see how well they’re performing against your brand.

In this post, we show how to get Google Trends data programmatically, integrate it into a data pipeline, and use it to analyze data, using Amazon Simple Storage Service (Amazon S3), AWS Glue, Amazon Athena, and Amazon QuickSight. We use an example dataset of movies and TV shows and demonstrate how to get the search queries from Google Trends to analyze the popularity of movies and TV shows.

Solution overview

The following diagram shows a high-level architecture of the solution using Amazon S3, AWS Glue, the Google Trends API, Athena, and QuickSight.

The solution consists of the following components:

  1. Amazon S3 – The storage layer that stores the list of topics for which Google Trends data has to be gathered. It also stores the results returned by Google Trends.
  2. AWS Glue – The serverless data integration service that calls Google Trends for the list of topics to get the search results, aggregates the data, and loads it to Amazon S3.
  3. Athena – The query engine that allows you to query the data stored in Amazon S3. You can use it for supporting one-time SQL queries on Google Trends data and for building dashboards using tools like QuickSight.
  4. QuickSight – The reporting tool used for building visualizations.

In the following sections, we walk through the steps to set up the environment, download the libraries, create and run the AWS Glue job, and explore the data.

Set up your environment

Complete the following steps to set up your environment:

  1. Create an S3 bucket where you upload the list of movies and TV shows. For this post, we use a Netflix Movies and TV Shows public dataset from Kaggle.
  2. Create an AWS Identity and Access Management (IAM) service role that allows AWS Glue to read and write data to the S3 buckets you just created.
  3. Create a new QuickSight account with the admin/author role and access granted to Athena and Amazon S3.

Download the external libraries and dependencies for the AWS Glue Job

The AWS Glue job needs the following two external Python libraries: pytrends and awswrangler. pytrends is a library that provides a simple interface for automating the downloading of reports from Google Trends. awswrangler is a library provided by AWS to integrate data between a Pandas DataFrame and AWS repositories like Amazon S3.

Download the following .whl files for the libraries and upload them to Amazon S3:

Create and configure an AWS Glue job

To set up your AWS Glue job, complete the following steps:

  1. On the AWS Glue console, under ETL in the navigation pane, choose Jobs – New.
  2. For Create job, select Python Shell script editor.
  3. For Options, select Create a new script with boilerplate code.
  4. Choose Create.
  5. On the Script tab, enter the following script, replacing the source and target buckets with your bucket names:
    # Import external library TrendReq needed to connect to Google Trends API and library awswrangler to read/write from pandas to Amazon S3.
    
    from pytrends.request import TrendReq
    pytrend = TrendReq(hl='en-US', tz=360, timeout=10) 
    import pandas as pd
    import awswrangler as wr
    
    # Function get_gtrend, accepts a list of terms as input, calls Google Trends API for each term to get the search trends 
    def get_gtrend(terms):
      trends =[]
      for term in terms:
    # Normalizing the data using popular movie Titanic as baseline to get trends over time.
        pytrend.build_payload(kw_list=["Titanic",term.lower()])
        df = pytrend.interest_over_time()
        df["google_trend"] = round((df[term.lower()] /df['Titanic']) *100)
        
    # Transforming and filtering trends results to align with Analytics use case
        df_trend = df.loc[df.index >= "2018-1-1", "google_trend"].resample(rule="M").max().to_frame()
        df_trend["movie"] = term
        trends.append(df_trend.reset_index())
    
    # Last step in function to concatenate the results for each term and return an aggregated dataset 
      concat_df = pd.concat(trends)
      return concat_df
    
    def main():
      
    # Change the bucket and prefix name to Amazon S3 location where movie titles file from Kaggle has been downloaded. 
      source_bucket = "source_bucket"
      source_prefix = "source_prefix"
    
    # Awswrangler method s3.read_csv is called to load the titles from S3 location into a DataFrame and convert it to a list.
      df = wr.s3.read_csv(f's3://{source_bucket}/{source_prefix}/')
      movies = df['title'].head(20).values.tolist()
    
    # Call the get_gtrend function and pass the list of movies as an input. A Pandas DataFrame is returned with trend data for the movies.
      df = get_gtrend(terms=movies)
    
    # Change the prefix name to location where you want to store results. 
      target_bucket = "target_bucket" 
      target_prefix = "target_prefix" 
    
    # Use awswrangler to save pandas dataframe to Amazon S3. 
      wr.s3.to_csv(df,f's3://{target_bucket}/{target_prefix}/trends.csv',index= False)
    
    
    # Invoke the main function
    main()

  6. On the Job details tab, for Name, enter the name of the AWS Glue job.
  7. For IAM Role, choose the role that you created earlier with permissions to run the job and access Amazon S3.
  8. For Type, enter Python Shell to run the Python code.
  9. For Python Version, specify the Python version as Python 3.6.
  10. For Data processing units, choose 1 DPU.
  11. For Number of retries, enter .
  12. Expand Advanced properties and under Libraries, enter the location of the S3 bucket where the pytrends and awswrangler files were downloaded.
  13. Choose Save to save the job.

Run the AWS Glue job

Navigate to the AWS Glue console and run the AWS Glue job you created. When the job is complete, a CSV file with the Google Trends values is created in the target S3 bucket with the prefix specified in the main() function. In the next step, we create an AWS Glue table referring to the target bucket and prefix to allow queries to be run against the Google Trends data.

Create an AWS Glue table on the Google Trends data

In this step, we create a table in the AWS Glue Data Catalog using Athena. The table is created on top of the Google Trends data saved in the target S3 bucket.

In the Athena query editor, select default as the database and enter the following DDL command to create a table named trends. Replace the target bucket and prefix with your own values.

CREATE EXTERNAL TABLE `trends`(
  `date` date, 
  `google_trend` double, 
  `title` string)
ROW FORMAT DELIMITED 
  FIELDS TERMINATED BY ',' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3://<<target_bucket>>/<<target_prefix>>/'
TBLPROPERTIES (
  'has_encrypted_data'='false', 
  'skip.header.line.count'='1')

This table has three columns:

  • date – The time dimension for aggregating the data. In this example, the time period is monthly.
  • google_trend – The count of Google Trends values normalized on a scale of 0–100.
  • title – The name of the movie or TV show.

Query the data using Athena

Now you can run one-time queries to find the popularity of movies and TV shows.

In the first example, we find the top 10 most popular movies and TV shows for November 2021. In the Athena query editor, enter the following SQL command to query the trends table created in the previous step:

select title,google_trend
from trends 
where date = date_parse('2021-11-30','%Y-%m-%d')
order by google_trend desc
limit 10

In the second example, we find the top 10 movies and TV shows that grew the most in popularity in 2021 through November 30. In the Athena query editor, enter the following SQL command to query the trends table:

select  title,max(google_trend)-min(google_trend) trend_diff
from trends
where date between date_parse('2021-01-31','%Y-%m-%d') and date_parse('2021-11-30','%Y-%m-%d')
group by title
order by 2 desc
limit 10

Build a dashboard to visualize the data using QuickSight

We can use QuickSight to build a dashboard on the data downloaded from Google Trends to identify top movies and TV shows. Complete the following steps:

  1. Sign in to your QuickSight account.
  2. On the QuickSight console, choose Datasets and choose New dataset.
  3. Choose Athena as your data source.
  4. For Data source name, enter a name.
  5. For Athena workgroup, choose [primary].
  6. Choose Create data source.
  7. For Database, choose default.
  8. For Tables, select the trends table.
  9. Choose Select.
  10. Select Directly query your data.
  11. Choose Visualize.

For the first visual, we create a bar chart of the top movies or TV shows by title sorted in ascending order of aggregated Google Trends values.

  1. Choose the horizontal bar chart visual type.
  2. For Y axis, choose title.
  3. For Value, choose google_trend (Average).

Next, we create a time series plot of Google Trends count by month for titles.

  1. Add a new visual and choose the AutoGraph visual type.
  2. For X axis, choose date.
  3. For Value, choose google_trend (Sum).
  4. For Color, choose title.

Clean up

To avoid incurring future charges, delete the resources you created for AWS Glue, Amazon S3, IAM, and QuickSight.

  1. AWS Glue Catalog table
    • On the AWS Glue console, choose Tables under Databases in the navigation pane.
    • Select the AWS Glue Data Catalog table that you created.
    • On the Actions drop-down menu, choose Delete.
    • Choose Delete to confirm.
  2. AWS Glue Job
    • Choose Jobs in the navigation pane.
    • Select the AWS Glue job you created.
    • On the Actions drop-down menu, choose Delete.
  3. S3 bucket
    • On the Amazon S3 console, choose Buckets in the navigation pane.
    • Choose the bucket you created.
    • Choose Empty and enter your bucket name.
    • Choose Confirm.
    • Choose Delete and enter your bucket name.
    • Choose Delete bucket.
  4. IAM Role
    • On the IAM console, choose Roles in the navigation pane.
    • Choose the role you attached to AWS Glue job.
    • Choose Delete role.
    • Choose Yes.
  5. Amazon QuickSight
    • If you created a QuickSight user for trying out this blog and do not want to retain that access, please ask your QuickSight admin to delete your user.
    • If you created the QuickSight account itself just for trying this blog and no longer want to retain it, use the following steps to delete it.
    • Choose your user name on the application bar, and then choose Manage QuickSight.
    • Choose Account settings.
    • Choose Delete Account.

You can only have one QuickSight account active for each AWS account. Make sure that other users aren’t using QuickSight before you delete the account.

Conclusion

Integrating external data sources such as Google Trends via AWS Glue, Athena, and QuickSight can help you enrich your datasets to yield greater insights. You can use it in a data science context when the model is under-fit and requires more relevant data in order to make better predictions. In this post, we used movies as an example, but the solution extends to a wide breadth of industries, such as products in a retail context or commodities in a finance context. If the simple inventory histories or the transaction dates are available, you may find little correlation to future demand or prices. But with an integrated data pipeline using external data, new relationships in the dataset make the model more reliable.

In a business context, whether your team wants to test out a machine learning (ML) proof of concept more quickly or have limited access to pertinent data, Google Trends integration is a relatively quick way to enrich your data for the purposes of ML and data insights.

You can also extend this concept to other third-party datasets, such as social media sentiment, as your team’s expertise grows and your ML and analytics operations mature. Integrating external datasets such as Google Trends is just one part of the feature and data engineering process, but it’s a great place to start and, in our experience, most often leads to better models that businesses can innovate from.


About the Authors

Drew Philip is a Sr. Solutions Architect with AWS Private Equity. He has held senior technical leadership positions within key AWS partners such as Microsoft, Oracle, and Rackspace. Drew focuses on applied engineering that leverages AI-enabled digital innovation and development, application modernization, resiliency, and operational excellence for workloads at scale in the public and private sector. He sits on the board of Calvin University’s computer science department and is a contributing member of the AWS Machine Learning Technical Focus Community.

Gautam Prothia is a Senior Solutions Architect with AWS dedicated to strategic accounts. Gautam has more than 15 years of experience designing and implementing large-scale data management and analytical solutions. He has worked with many clients across industries to help them modernize their data platforms on the cloud.

Simon Zamarin is an AI/ML Solutions Architect whose main focus is helping customers extract value from their data assets. In his spare time, Simon enjoys spending time with family, reading sci-fi, and working on various DIY house projects.

ProLink uses Amazon QuickSight to enable states to deliver housing assistance to those in need

Post Syndicated from Ryan Kim original https://aws.amazon.com/blogs/big-data/prolink-uses-amazon-quicksight-to-enable-states-to-deliver-housing-assistance-to-those-in-need/

This is a joint post by ProLink Solutions and AWS. ProLink Solutions builds software solutions for emergency fund deployment to help state agencies distribute funds to homeowners in need. Over the past 20 years, ProLink Solutions has developed software for the affordable housing industry, designed to make the experience less complicated and easy to report on.

The COVID-19 pandemic has impacted homeowners across the United States who were unable to pay their mortgages, resulting in delinquencies, defaults, and foreclosures. The federal government acted quickly by establishing the Homeowner Assistance Fund (HAF) under the American Rescue Plan Act of 2021, granting nearly $10 billion to states to distribute to homeowners experiencing COVID-related financial hardships.

Distributing these funds quickly and efficiently required states to rapidly deploy new programs, workflows, and reporting. ProLink Solutions rose to the occasion with a new software as a service (SaaS) solution called ProLink+. Consisting of two parts—a homeowner portal that makes the funding application process easy for homeowners, and a back-office system to help state agencies review and approve funding applications—ProLink+ is a turnkey solution for state agencies looking to distribute their HAF dollars fast. For state agencies responsible for HAF programs, data reporting is key because it reinforces the organizational mission of the agency and helps shape public perception of how the program is progressing. Due to the emergency nature of the funding program, state agencies are continually in the public eye, and therefore access to real-time reporting is a must. As a result, ProLink uses Amazon QuickSight as their business intelligence (BI) solution to create and embed dashboards into the ProLink+ solution.

In this post, we share how ProLink+ uses QuickSight to enhance states’ capabilities to analyze and assess their fund deployment status.

Building the solution with AWS

Data-driven decision-making is critical in any industry or business today, and the affordable housing industry is no exception. As a primary technology player in the affordable housing industry for over two decades, ProLink Solutions supports state housing finance agencies by providing a comprehensive suite of software products. ProLink has been an AWS customer since 2012, utilizing AWS services to design and build its software in-house, maximizing the agility, scale, functionality, and speed to market of its solutions.

The ProLink+ SaaS solution is built using multiple AWS resources, including but not limited to Amazon Elastic Compute Cloud (Amazon EC2), Elastic Load Balancing, Amazon Simple Storage Service (Amazon S3), Amazon Relational Database Service (Amazon RDS), AWS Lambda, and Amazon Cognito. It also utilizes Amazon CloudFront for a secure and high-performance content delivery system to the end-user.

QuickSight’s inherent integrations with ProLink’s AWS resources and microservices make it a logical choice. In addition, QuickSight allows ProLink to deliver dashboard functionality efficiently and securely without incurring significant costs. Due to the intuitive design of QuickSight, ProLink business analysts are able to build rich, informative dashboards without writing any code or requiring engineering input, thereby shortening the time to client delivery and increasing the efficiency of the decision-making process. With its simple setup and ability to easily create embedded dashboards and visualization tools, QuickSight is yet another flexible and powerful tool that ProLink Solutions uses to deliver high-quality products and services to the market.

Intuitive and effective data visualization is key

The integration of QuickSight into ProLink+ offers a unique opportunity to create a seamless embedded solution for users. For example, the reporting integration isn’t a separate redirect to a different system. The solution exists in the main user interface, with visualizations directly associated with the unique activities. Relevant data can be displayed without adding unnecessary complexity to the solution. This experience adds additional value by reducing the learning curve for new customers.

State agencies use QuickSight’s embedded dashboard capabilities in ProLink+ for internal analytical purposes, as well as for real-time reporting to the public. The agencies are proactively thinking about what information needs to be made available to the public and how to best present it. These are big decisions that impact how the public sees the work of the government.

The Percent of Funds Disbursed chart in the following screenshot illustrates how much of the allocation the agency received from the US Department of the Treasury has been disbursed to homeowners in the state.

Blazing a trail for more easily accessible funding

Federal funding programs have traditionally faced challenges with distribution to citizens in need. ProLink Solutions seeks to provide an easy-to-adopt, easy-to-use, and repeatable solution for governments, powered by modern technology. ProLink+ simplifies the process of distributing the funds to the public through an intuitive interface. The dashboard capabilities of QuickSight are an asset to both ProLink Solutions as a solutions provider and state government agencies as end-users. Intuitive, effective data reporting and visualization provides critical insights that help governments communicate their work on behalf of the public, while continuing to improve delivery of their services.

"State agencies across the board are looking for visual reporting tools to tell their stories more effectively. I'm glad QuickSight was readily available to us and we were able to quickly develop a dashboard in our ProLink+ deployment," says Shawn McKenna, CEO of ProLink Solutions.

Learn more about how ProLink Solutions is helping states distribute housing assistance quickly to those in need.


About the Authors

Ryan Kim is the Director of Product Marketing at ProLink Solutions. Ryan leads industry partnerships and initiatives and the positioning of all ProLink Solutions' technology products that serve the affordable housing industry.

Scott Kirn is the Chief Information Officer at ProLink Solutions. Scott leads the Information Technology group at ProLink Solutions and drives all aspects of product development and delivery.

Walter McCain II is a Solutions Architect at Amazon Web Services, helping customers build operational best practices, application products, and technical solutions in the AWS Cloud. Walter is involved in evangelizing AWS Cloud computing architectures and development for various technologies such as serverless, media entertainment, migration strategies, and security, to name a few.

Announcing the new Amazon QuickSight Community

Post Syndicated from Lillie Atkins original https://aws.amazon.com/blogs/big-data/announcing-the-new-amazon-quicksight-community/

On February 22, 2022, we launched our new Amazon QuickSight Community. Here you can ask and answer questions, network with and learn from other Business Intelligence (BI) users from across the globe, access learning content, and stay up to date with what’s new on Amazon QuickSight—all in one place!

In this post, we discuss some of the features of the QuickSight Community and show you how to sign up, start posting, create a profile, and set up notifications.

Content on the Community

The QuickSight Community has three main sections:

  1. Question and Answer – A discussion forum where you can ask your QuickSight questions and get answers from community experts. You can also showcase your QuickSight expertise by sharing your knowledge with others.
  2. Learning Center – A hub for on-demand QuickSight content, including how-to videos, workshop videos, articles, and additional educational resources.
  3. Announcements – Stay up-to-date on the latest QuickSight launches, blogs, feature demo videos, and monthly newsletters. We continuously update this section as new content related to QuickSight is added.

Join the QuickSight Community

This QuickSight Community does not require any login to search or browse existing content. You only need to create an account if you want to interact with the community (such as liking posts, replying to posts, and posting your own questions).

This is a public community, so be careful not to post any confidential or private information.

Sign up for the Community

To sign up, complete the following steps:

  1. On the QuickSight Community home page, choose Sign Up.
  2. You can either use your existing Amazon.com account, or create a new login.
  3. Read the Community guidelines.

You can now contribute to the Community. Welcome!

Post a question

You need to be logged in with an account to start asking questions. Before you post a question, search the recommended questions to make sure your question hasn’t already been answered.

To post a question, complete the following steps:

  1. On the Question and answer page, choose New Question.
  2. Enter the information to create your post, including a title and tag.
  3. Choose New Question.

Set up your profile

To view your profile and change the settings, complete the following steps:

  1. Choose your user icon at the top of the page and choose the person icon.
  2. Select the section you are interested in.
  3. If you chose Preferences, you can control security and notification preferences, and update your profile details.

Set up notifications

You can set up notification preferences to be alerted on a specific post, channel, or tag.

To set up notifications on a post, choose the post and then choose the notification icon to set your notification preferences.

To get notified on a specific channel, go to the channel (in this case the Question and answer channel) and choose the notification bell to set your notification preferences.

To get notified on a specific tag, complete the following steps:

  1. Choose the additional options icon.
  2. Choose Tags.
  3. Choose the tag you are interested in.
  4. Choose the notification icon and set your notification preferences.

Conclusion

In this post, we discussed the new Amazon QuickSight Community, and how you can sign up for it, create a post, edit your profile, and set up notifications. The QuickSight Community is a one-stop shop for all of your QuickSight learning needs, and a place to network with other BI users from around the globe. Start exploring today!


About the Authors

Lillie Atkins is a Product Manager for Amazon QuickSight, AWS's cloud-native, fully managed BI service.

Mia Heard is a Product Marketing Manager for Amazon QuickSight, AWS's cloud-native, fully managed BI service.

Enable users to ask questions about data using natural language within your applications by embedding Amazon QuickSight Q

Post Syndicated from Deepak Murthy original https://aws.amazon.com/blogs/big-data/enable-users-to-ask-questions-about-data-using-natural-language-within-your-applications-by-embedding-amazon-quicksight-q/

Amazon QuickSight Q is a new machine learning-based capability in Amazon QuickSight that enables users to ask business questions in natural language and receive answers with relevant visualizations instantly to gain insights from data. QuickSight Q doesn’t depend on prebuilt dashboards or reports to answer questions, which removes the need for business intelligence (BI) teams to create or update dashboards every time a new business question arises. Users can ask questions and receive visual answers in seconds directly from within QuickSight or from web applications and portals. In this post, we look at how you can embed Q in your web applications or portals.

Solution overview

You can now embed Q in your application without any custom development. Q is a fully managed cloud-native BI offering that you can easily embed without requiring expertise within your team to develop and maintain this capability. You can easily democratize your data and scale your insights to a growing user base, while ensuring you only pay for usage with Q’s unique pay-per-question pricing model.

Applications can authenticate users with any identity provider of choice (such as Active Directory, Amazon Cognito, or any SAML-based federated SSO provider that your organization uses) and act on behalf of the user to get access to the Q question bar. This means that every user receives a secure, personalized question answering experience while requiring no user-facing QuickSight-specific authentication. This enables a novel experience to provide insights within your application with minimal upfront work and allows you to focus on your core application functionality! QuickSight Q embedding is available in QuickSight Enterprise Edition and Q-supported Regions.

To facilitate an easy embedding experience, AWS has also launched the Amazon QuickSight Embedding SDK (JavaScript) and a rich set of Q-specific functionalities. The QuickSight Embedding SDK lets you efficiently integrate Q in your application pages, set a default topic, enable topic selection, set themes, and control Q search bar behavior. This helps you roll out Q to your users faster.

To embed Q in your application, you must complete the following high-level steps:

  1. Set up permissions to generate embedded Q URLs.
  2. Generate a URL with the authentication code attached.
  3. Embed the Q search bar URL.

Set up permissions to generate embedded Q URLs

In this step, you set up permissions for your backend application or web server to embed the Q search bar. This task requires administrative access to AWS Identity and Access Management (IAM). Each user who accesses the Q search bar assumes a role that gives them QuickSight permissions to retrieve a Q-embedded URL.

To make this possible, create an IAM role in your AWS account. Associate an IAM policy with the role to provide permissions to any user who assumes it. The IAM role needs to provide permissions to retrieve embedding URLs for a specific user pool. With the help of the wildcard character *, you can grant the permissions to generate a URL for all users in a specific namespace. Or you can grant permissions to generate a URL for a subset of users in specific namespaces. For this, you add quicksight:GenerateEmbedUrlForRegisteredUser.

The following sample policy provides these permissions:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "quicksight:GenerateEmbedUrlForRegisteredUser"
            ],
            "Resource": "arn:partition:quicksight:region:accountId:user/namespace/userName"
        }
    ]
}

Also, if you’re creating first-time users who will be QuickSight readers, make sure to add the quicksight:RegisterUser permission in the policy.

The following sample policy provides permission to retrieve an embedding URL for first-time users who will be QuickSight readers:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": "quicksight:RegisterUser",
            "Resource": "*",
            "Effect": "Allow"
        },
        {
            "Effect": "Allow",
            "Action": [
              "quicksight:GenerateEmbedUrlForRegisteredUser"
            ],
            "Resource": [
              "arn:partition:quicksight:region:accountId:user/namespace/userName"
            ]
        }
    ]
}

Finally, your application's IAM identity must have a policy associated with it that allows it to assume the role you just created. This means that when a user accesses your application, your application assumes the role on the user's behalf and provisions the user in QuickSight.

The following example policy allows the application to assume a role called embedding_quicksight_q_search_bar_role, which has the preceding sample policy attached:

{
    "Version": "2012-10-17",
    "Statement": {
        "Effect": "Allow",
        "Action": "sts:AssumeRole",
        "Resource": "arn:aws:iam::11112222333:role/embedding_quicksight_q_search_bar_role"
    }
}
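
In practice, this means your backend exchanges its own credentials for temporary credentials scoped to that role before calling QuickSight. The following is a minimal Python (boto3) sketch of that exchange, not part of the original walkthrough; the role ARN and session name are placeholders:

import boto3

sts = boto3.client("sts")

# Assume the embedding role on behalf of the authenticated user. The session
# name doubles as the unique per-user identifier described in the next section.
credentials = sts.assume_role(
    RoleArn="arn:aws:iam::111122223333:role/embedding_quicksight_q_search_bar_role",
    RoleSessionName="john.doe@example.com",  # placeholder user identifier
)["Credentials"]

# Use the temporary credentials for subsequent QuickSight calls
quicksight = boto3.client(
    "quicksight",
    aws_access_key_id=credentials["AccessKeyId"],
    aws_secret_access_key=credentials["SecretAccessKey"],
    aws_session_token=credentials["SessionToken"],
)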

Generate a URL with the authentication code attached

In this step, you authenticate your user and get the embeddable Q topic URL on your application server. If you plan to embed the Q bar for IAM or QuickSight identity types, share the Q topic with those users. When a user accesses your app, the app assumes the IAM role on the user's behalf. If that user is new, the app adds the user to QuickSight, then passes an identifier as the unique role session ID.

These steps make sure that each viewer of the Q topic is uniquely provisioned in QuickSight, and they enforce per-user settings such as row-level security and dynamic defaults for parameters.

The following example code performs the IAM authentication on the user’s behalf. This code runs on your app server:

import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.quicksight.AmazonQuickSight;
import com.amazonaws.services.quicksight.AmazonQuickSightClientBuilder;
import com.amazonaws.services.quicksight.model.GenerateEmbedUrlForRegisteredUserRequest;
import com.amazonaws.services.quicksight.model.GenerateEmbedUrlForRegisteredUserResult;
import com.amazonaws.services.quicksight.model.RegisteredUserEmbeddingExperienceConfiguration;
import com.amazonaws.services.quicksight.model.RegisteredUserQSearchBarEmbeddingConfiguration;

/**
 * Class to call the QuickSight AWS SDK to get a URL for embedding the Q search bar.
 * Named differently from the imported SDK class
 * RegisteredUserQSearchBarEmbeddingConfiguration to avoid a name collision.
 */
public class GetQuicksightEmbedUrlQSearchBar {

    private final AmazonQuickSight quickSightClient;

    public GetQuicksightEmbedUrlQSearchBar() {
        this.quickSightClient = AmazonQuickSightClientBuilder
                .standard()
                .withRegion(Regions.US_EAST_1.getName())
                .withCredentials(new AWSCredentialsProvider() {
                                     @Override
                                     public AWSCredentials getCredentials() {
                                         // provide actual IAM access key and secret key here
                                         return new BasicAWSCredentials("access-key", "secret-key");
                                     }

                                     @Override
                                     public void refresh() {}
                                 }
                )
                .build();
    }

    public String getQuicksightEmbedUrl(
            final String accountId, // AWS Account ID
            final String topicId, // Topic ID to embed
            final String userArn // Registered user arn to use for embedding. Refer to Get Embed Url section in developer portal to find how to get user arn for a QuickSight user.
    ) throws Exception {
        final RegisteredUserEmbeddingExperienceConfiguration experienceConfiguration = new RegisteredUserEmbeddingExperienceConfiguration()
                .withQSearchBar(new RegisteredUserQSearchBarEmbeddingConfiguration().withInitialTopicId(topicId));
        final GenerateEmbedUrlForRegisteredUserRequest generateEmbedUrlForRegisteredUserRequest = new GenerateEmbedUrlForRegisteredUserRequest();
        generateEmbedUrlForRegisteredUserRequest.setAwsAccountId(accountId);
        generateEmbedUrlForRegisteredUserRequest.setUserArn(userArn);
        generateEmbedUrlForRegisteredUserRequest.setExperienceConfiguration(experienceConfiguration);

        final GenerateEmbedUrlForRegisteredUserResult generateEmbedUrlForRegisteredUserResult = quickSightClient.generateEmbedUrlForRegisteredUser(generateEmbedUrlForRegisteredUserRequest);

        return generateEmbedUrlForRegisteredUserResult.getEmbedUrl();
    }
}

To generate the URL that you can embed in your app, call the GenerateEmbedUrlForRegisteredUser API operation. This URL is valid for 5 minutes, and the resulting session is valid for up to 10 hours. You can configure the session validity by setting the SessionLifetimeInMinutes parameter of the API. The API operation provides the URL with an auth_code value that enables a single sign-on session. The following code shows an example response from generate-embed-url-for-registered-user:

//The URL returned is over 900 characters. For this example, we've shortened the string for
//readability and added ellipsis to indicate that it's incomplete.
{
 "Status": "200",
 "EmbedUrl": "https: //dashboards.example.com/embed/620bef10822743fab329fb3751187d2d...",
 "RequestId": "7bee030e-f191-45c4-97fe-d9faf0e03713"
} 
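
If your backend is written in Python rather than Java, the same call can be made with boto3. The following is a minimal sketch, not from the original post; the account ID, user ARN, and topic ID are placeholders:

import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

response = quicksight.generate_embed_url_for_registered_user(
    AwsAccountId="111122223333",
    UserArn="arn:aws:quicksight:us-east-1:111122223333:user/default/embedding-user",
    SessionLifetimeInMinutes=600,  # session valid for up to 10 hours
    ExperienceConfiguration={
        "QSearchBar": {"InitialTopicId": "your-topic-id"}  # placeholder topic ID
    },
)
print(response["EmbedUrl"])  # the URL itself expires 5 minutes after generation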

Embed the Q search bar URL

In this step, you embed the Q search bar URL in your website or application page. You can do this with the QuickSight Embedding SDK, which allows you to do the following:

  • Place the Q search bar on an HTML page
  • Pass parameters into the Q search bar
  • Handle error states with messages that are customized to your application

Embed the Q search bar in your webpage by using the QuickSight Embedding SDK or by adding the URL into an iframe. If you set a fixed height and width (in pixels), QuickSight uses those and doesn't change your visual as your window resizes. If you set a relative percent height and width, QuickSight provides a responsive layout that adjusts as your window size changes. When you use the QuickSight Embedding SDK, the Q search bar on your page is dynamically resized based on its state. The SDK also lets you control parameters within the Q search bar and receive callbacks for page load completion and errors.

The following example code shows how to use the generated URL. This code is generated on your app server:

<!DOCTYPE html>
<html>

<head>
    <title>QuickSight Q Search Bar Embedding</title>
    <!-- Replace <version> with the SDK version you installed -->
    <script src="https://unpkg.com/amazon-quicksight-embedding-sdk@<version>/dist/quicksight-embedding-js-sdk.min.js"></script>
    <script type="text/javascript">
        var session;

        function onError(payload) {
            console.log("Do something when the session fails loading");
        }

        function onOpen() {
            console.log("Do something when the Q search bar opens");
        }

        function onClose() {
            console.log("Do something when the Q search bar closes");
        }

        function embedQSearchBar() {
            var containerDiv = document.getElementById("embeddingContainer");
            var options = {
                url: "https://us-east-1.quicksight.aws.amazon.com/sn/dashboards/dashboardId?isauthcode=true&identityprovider=quicksight&code=authcode", // replace this dummy url with the one generated via embedding API
                container: containerDiv,
                width: "1000px",
                locale: "en-US",
                qSearchBarOptions: {
                    expandCallback: onOpen,
                    collapseCallback: onClose,
                    iconDisabled: false,
                    topicNameDisabled: false,
                    themeId: 'theme12345',
                    allowTopicSelection: true
                }
            };
            session = QuickSightEmbedding.embedQSearchBar(options);
            session.on("error", onError);
        }

        function onCountryChange(obj) {
            session.setParameters({country: obj.value});
        }
    </script>
</head>

<body onload="embedQSearchBar()">
    <div id="embeddingContainer"></div>
</body>

</html>

For this example to work, make sure to use the QuickSight Embedding SDK to load the embedded Q search bar on your website using JavaScript. You can get the SDK in the following ways:

  • Download the QuickSight Embedding SDK from GitHub. This repository is maintained by a group of QuickSight developers.
  • Download the latest QuickSight Embedding SDK version from npmjs.com.
  • If you use npm for JavaScript dependencies, download and install the SDK by running the following command:
    npm install amazon-quicksight-embedding-sdk

The following screenshot is an example of the embedded Q question bar and a QuickSight dashboard to support natural language questions and analysis of pharmaceutical clinical trial data. You can try out this topic and other such topics in an embedded application demo.

Summary

Enterprises can empower users to ask questions about data in plain English within their applications by embedding the QuickSight Q question bar. Embedding Q into your application is straightforward and requires no custom development from your team.

Get started with a free trial of QuickSight Q.


About the Authors

Deepak Murthy is a Senior Product Manager for Amazon QuickSight, AWS’s cloud-native, fully managed BI service. Deepak started his career with Staples, developing enterprise data warehouse solutions. Later, he was the architect of data warehouse and analytics solutions at Wells Fargo, AMC, and Blackhawk Network. Deepak is excited about the potential of self-service analytics and improving data accessibility by enabling new natural language interactions with data, and looks forward to helping customers leverage these latest analytics innovations.

Rob Foley is a Software Development Engineer for Amazon QuickSight, AWS’s cloud-native, fully managed BI service. Rob began his career with AWS, and has been a member of the QuickSight team for over 1.5 years. He has development experience in a breadth of services and stacks, primarily having worked on data-centric applications like Q.

Use AnalyticsIQ with Amazon QuickSight to gain insights for your business

Post Syndicated from Sumitha AP original https://aws.amazon.com/blogs/big-data/use-analyticsiq-with-amazon-quicksight-to-gain-insights-for-your-business/

Decisions are made every day in your organization that impact your business. Making the right decision at the right moment can deeply impact your organization’s growth and your customers. Likewise, having the right data and tools that generate insights into the data can empower your organization’s leaders to make the right decisions.

In the healthcare industry, where decisions directly impact an individual's wellness, having the right data to generate the right insight into the individual experience through the lens of social determinants of health can greatly improve health outcomes and save lives. Understanding the unique social situations of the individuals they serve, from access to transportation and technology to economic stability, food security, and more, allows healthcare providers to address disparities and give all their patients an equal opportunity to achieve their desired level of health.

For example, let’s say a healthcare organization or government agency wants to better understand the factors that affect public health in order to improve the quality of life for various ethnic groups, based on data.

In this post, we show you how to use AnalyticsIQ datasets and Amazon QuickSight to generate valuable insights that could improve your organization's decision-making. We use the AnalyticsIQ Social Determinants of Health Sample Data dataset to gain insights into the relationship between ethnicity and health, as well as how the social determinants impact the health and wellness of individuals.

Solution overview

The following architecture diagram outlines the components of this solution:

To implement the solution, you complete the following high-level steps:

  1. Export the dataset to an S3 bucket.
  2. Sign up for a QuickSight subscription.
  3. Create a QuickSight dataset.
  4. Create visualizations in QuickSight.

Prerequisites

To run this solution, you must have an AWS account. If you don’t already have one, you can create one.

Export the dataset to an S3 bucket

To start working with your dataset, you must subscribe to the dataset and then export the data to an S3 bucket. Complete the following steps:

  1. If you don’t already have a bucket, navigate to the Amazon S3 console, and choose Create bucket.
  2. Give a unique name for your bucket.

Make sure that you create the bucket in the us-east-1 Region.

  3. To subscribe to the sample dataset, follow this link. On the AWS Data Exchange console, choose Continue to subscribe.
  4. On the Complete subscription page, choose Subscribe.
  5. For Select Amazon S3 bucket folder destination, choose your S3 bucket.

The subscription process can take up to 2 minutes to complete.

  6. On the AWS Data Exchange console, under My subscriptions in the navigation pane, choose Entitled Data.
  7. Under Products, expand Social Determinants of Health Sample Data – Offline, and choose the AnalyticsIQ sample dataset.
  8. On the Revisions tab, select the revision and choose Export to Amazon S3.
  9. Enter the name of the S3 bucket you created for this dataset.
  10. Leave the other options as default.
  11. Choose Export.

You can view the dataset in your S3 bucket under the prefix Sample-Data.

Sign up for a QuickSight subscription

To sign up for a QuickSight subscription, complete the following steps:

  1. On the AWS Management Console, open QuickSight.
  2. Choose Sign up for QuickSight and choose Enterprise.
  3. For QuickSight account name, enter a unique name.
  4. Enter a valid email.
  5. Under Allow access and autodiscovery for these resources, select Amazon S3 and choose Select S3 buckets.
  6. Choose the S3 bucket that you created earlier, and choose Finish.
  7. After your QuickSight account is created, choose Go to QuickSight account.

Create a QuickSight dataset

To create your dataset, complete the following steps:

  1. Using a local text editor, create a JSON file. Copy the following content and replace the placeholder with the name of the bucket that you created earlier:
    {
        "fileLocations": [
            {
                "URIPrefixes": [
                    "https://<Your BucketName>.s3.amazonaws.com/Sample-Data/"
                ]
            }
        ],
        "globalUploadSettings": {
            "format": "TSV"
        }
    }

  2. On the QuickSight console, choose New data set on the Datasets page.
  3. Choose S3.
  4. For DataSource, enter a name.
  5. Choose Upload and upload the JSON file.
  6. Choose Connect.
  7. Choose Visualize.

The following screenshot shows your imported sample data:

Create visualizations in QuickSight

Let’s visualize the average number of cars by various ethnic groups. For more information about the fields, refer to the Key Data Points section on the AWS Marketplace listing.

  1. Choose the sheet and choose the vertical bar chart under Visual types.
  2. From the Fields list, drag EthnicIQ_v2 to X axis and Number_of_Autos to Value.
  3. Choose Aggregate as Average.

Now you can create a visualization for urgent care visits by ethnic groups.

  1. Choose +Add, and choose Add Visual.
  2. Choose a pivot table under Visual types.
  3. From the Fields list, drag EthnicIQ_v2 to Rows and HW_Urgent_Care_Visits_SC to Values.
  4. Choose Aggregate as Average.
  5. Choose the HW_Urgent_Care_Visits_SC field in the pivot table, and choose Sort descending.

Similarly, you can add more visualizations as shown in the following images.

From these visualizations created from sample data, you can see that a person’s use of healthcare services reduces when they have less access to transportation. The AA ethnic group has fewer cars compared to the other groups. The wellness score for the AA group is low when compared to the others. Transportation barriers could be a major factor here. Job satisfaction also contributes to wellness levels. Furthermore, the sample data indicates that the Hispanic community has the highest likelihood of recent urgent care visits. Does this mean these groups aren’t getting enough preventative care, leading to more urgent care visits?

Sleep and job satisfaction play a critical role in affecting stress levels, as well as overall health. This would be a critical factor for people who work shifts. What measures can be taken to increase the sleep quality for that set of people?

These are just a few of the innumerable valuable analyses that you can create from the AnalyticsIQ Social Determinants of Health Sample Data dataset. These insights are valuable for various groups of people, such as health professionals, preventative care, employee care, scientists, and governments, to empower communities and help build better public health and social determinant solutions.

Clean up

To avoid incurring ongoing charges, complete the following steps to clean up your resources:

  1. On the QuickSight console, on the Analyses page, choose the details icon on the analysis you created, and choose Delete.
  2. On the QuickSight start page, on the Datasets page, choose the dataset that you created earlier, then choose Delete Data Set.
  3. On the Amazon S3 console, on the Buckets page, select the option next to the name of your bucket, and then choose Delete at the top of the page.
  4. Confirm that you want to delete the bucket by entering the bucket name into the text field, then choose Delete bucket.

Conclusion

In this post, we showed you how you can use the AnalyticsIQ Social Determinants of Health Sample Data dataset to gain insights into society's health and wellness. We also showed you how you can generate easy-to-understand visualizations using QuickSight. Amazon QuickSight allows dashboards to be shared with thousands of users without any servers, and with pay-per-session pricing. QuickSight dashboards can also be easily embedded in SaaS applications or corporate portals for sharing insights with all users. You can explore the AnalyticsIQ dataset more on the AWS Data Exchange console. For queries related to the AnalyticsIQ dataset, you can reach out directly to the support team at [email protected]. To learn more about the features of QuickSight, refer to Amazon QuickSight Features.


About the Author

Sumitha AP is an AWS Solutions Architect based in Washington DC. She works with SMB customers to help them design secure, scalable, reliable, and cost-effective solutions in the AWS Cloud.

How to build a multi-Region AWS Security Hub analytic pipeline and visualize Security Hub data

Post Syndicated from David Hessler original https://aws.amazon.com/blogs/security/how-to-build-a-multi-region-aws-security-hub-analytic-pipeline/

AWS Security Hub is a service that gives you aggregated visibility into your security and compliance posture across multiple Amazon Web Services (AWS) accounts. By joining Security Hub with Amazon QuickSight—a scalable, serverless, embeddable, machine learning-powered business intelligence (BI) service built for the cloud—your senior leaders and decision-makers can use dashboards to empower data-driven decisions and facilitate a secure configuration of AWS resources.

In organizations that operate at cloud scale, being able to summarize and perform trend analysis is key to identifying and remediating problems early, which leads to the overall success of the organization. Additionally, QuickSight dashboards can be embedded in dashboard and reporting platforms that leaders are already familiar with, making the dashboards even more user friendly.

With the solution in this blog post, you can provide leaders with cross-AWS Region views of data to enable decision-makers to assess the health and status of an organization's IT infrastructure at a glance. You also can enrich the dashboard with data sources not available to Security Hub. Finally, this solution allows you the flexibility to have multiple administrator accounts across several AWS organizations and combine them into a single view.

In this blog post, you will learn how to build an analytics pipeline of your Security Hub findings, summarize the data with Amazon Athena, and visualize the data via QuickSight using the following steps:

  • Deploy an AWS Cloud Development Kit (AWS CDK) stack that builds the infrastructure you need to get started.
  • Create an Athena view that summarizes the raw findings.
  • Visualize the summary of findings in QuickSight.
  • Secure QuickSight using best practices.

For a high-level discussion without code examples, please see Visualize AWS Security Hub Findings using Analytics and Business Intelligence Tools.

Prerequisites

This blog post assumes that you:

  • Have a basic understanding of how to authenticate and access your AWS account.
  • Are able to run commands via a command line prompt on your local machine.
  • Have a basic understanding of Structured Query Language (SQL).

Solution overview

Figure 1 shows the flow of events and a high-level architecture diagram of the solution.

Figure 1. High level architecture diagram

The steps shown in Figure 1 include:

  • Detect
  • Collect
  • Aggregate
  • Ingest
  • Transform
  • Analyze
  • Visualize

Detect

AWS offers a number of tools to help detect security findings continuously.

In this blog, you will use two built-in security standards of Security Hub—CIS AWS Foundations Benchmark controls and AWS Foundational Security Best Practices Standard—and a serverless Prowler scanner that acts as a third-party partner product. In cases where AWS Organizations is used, member accounts send these findings to the member account's Security Hub.

Collect

Within a Region, security findings are centralized into a single administrator account using Security Hub.

Aggregate

Using the cross-Region aggregation feature within Security Hub, findings within each administrator account can be aggregated and continuously synchronized across multiple Regions.

Ingest

Security Hub not only provides a comprehensive view of security alerts and security posture across your AWS accounts, it also acts as a data sink for your security tools. Any tool that can expose data via AWS Security Finding Format (ASFF) can use the BatchImportFindings API action to push data to Security Hub. For more details, see Using custom product integration to send findings to AWS Security Hub and Available AWS service integrations in the Security Hub User Guide.
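
For example, the following Python (boto3) sketch pushes a single custom finding in ASFF via BatchImportFindings. This is an illustration rather than code from the solution; every identifier and field value shown is a placeholder, and a real integration would populate these from your scanner's output:

import boto3
from datetime import datetime, timezone

securityhub = boto3.client("securityhub", region_name="us-east-1")
now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")

response = securityhub.batch_import_findings(
    Findings=[{
        "SchemaVersion": "2018-10-08",
        "Id": "custom-finding-001",  # placeholder finding ID
        "ProductArn": "arn:aws:securityhub:us-east-1:111122223333:product/111122223333/default",
        "GeneratorId": "my-custom-scanner",  # placeholder generator name
        "AwsAccountId": "111122223333",
        "Types": ["Software and Configuration Checks"],
        "CreatedAt": now,
        "UpdatedAt": now,
        "Severity": {"Label": "MEDIUM"},
        "Title": "Example custom finding",
        "Description": "Pushed via the BatchImportFindings API.",
        "Resources": [{"Type": "AwsAccount", "Id": "AWS::::Account:111122223333"}],
    }]
)
print(response["SuccessCount"], "finding(s) imported")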

Transform

Data coming out of Security Hub is exposed via Amazon EventBridge. Unfortunately, it's not quite in a form that Athena can consume. EventBridge streams the data through Amazon Kinesis Data Firehose directly to Amazon Simple Storage Service (Amazon S3). Once the data lands in Amazon S3, an AWS Lambda function flattens it and fixes some of the column names, such as by removing special characters that Athena cannot recognize. The Lambda function then saves the results back to S3. Finally, an AWS Glue crawler dynamically discovers the schema of the data and creates or updates an Athena table.
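
To make the transform step concrete, here is a minimal Python sketch of the kind of flattening such a Lambda function performs. It is not the function the CDK stack actually deploys; the newline-delimited record layout and the processed/ output prefix are assumptions for illustration:

import json
import re
import boto3

s3 = boto3.client("s3")

def _flatten(obj, parent=""):
    """Recursively flatten nested dicts, sanitizing keys for Athena."""
    flat = {}
    for key, value in obj.items():
        safe = re.sub(r"[^0-9A-Za-z_]", "_", key).lower()
        name = f"{parent}_{safe}" if parent else safe
        if isinstance(value, dict):
            flat.update(_flatten(value, name))
        else:
            flat[name] = value
    return flat

def handler(event, context):
    # Triggered by S3 object-created events on the Firehose delivery bucket
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
        # Assumes one JSON finding per line
        flattened = [
            json.dumps(_flatten(json.loads(line)))
            for line in body.splitlines() if line.strip()
        ]
        s3.put_object(
            Bucket=bucket,
            Key=f"processed/{key}",
            Body="\n".join(flattened).encode("utf-8"),
        )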

Analyze

You will aggregate the raw findings data and create metrics along various grains or pivots by creating a simple yet meaningful Athena view. With Athena, you also can use views to join the data with other data sources, such as your organization’s configuration management database (CMDB) or IT service management (ITSM) system.

Visualize

Using QuickSight, you will register the data sources and build visualizations that can be used to identify areas where security can be improved or reduce risk. This post shares steps detailing how to do this in the Build QuickSight visualizations section below.

Use AWS CDK to deploy the infrastructure

In order to analyze and visualize security-related findings, you need to deploy the infrastructure required to detect, ingest, and transform those findings. You will use an AWS CDK stack to deploy this infrastructure to your account. To begin, review the prerequisites to make sure you have everything you need, then deploy the CDK stack. After the infrastructure has been deployed, you will build an Athena view and a QuickSight visualization.

Install the software to deploy the solution

For the solution in this blog post, you must have the following tools installed:

  • The solution in this blog post is written in Python, so you must install Python in addition to CDK. Instructions on how to install Python version 3.x can be found on the Python downloads page.
  • AWS CDK requires Node.js. Directions on how to install Node.js can be found on the Node.js downloads page.
  • This CDK application uses Docker for local bundling. Directions for using Docker can be found at Get Docker.
  • AWS CDK—a software development framework for defining cloud infrastructure in code and provisioning it through AWS CloudFormation. To install CDK, visit the AWS CDK Toolkit page.

To confirm you have everything you need

  1. Confirm you are running version 1.108.0 or later of CDK.

    $ cdk --version

  2. Download the code from GitHub by cloning the repository, then cd into the cloned directory.

    $ git clone git@github.com:aws-samples/aws-security-hub-analytic-pipeline.git

    $ cd aws-security-hub-analytic-pipeline

  3. Manually create a virtualenv.

    $ python3 -m venv .venv

  4. After the initialization process completes and the virtualenv is created, you can use the following step to activate your virtualenv.

    $ source .venv/bin/activate

  5. If you’re using a Windows platform, use the following command to activate virtualenv:

    % .venv\Scripts\activate.bat

  6. Once the virtualenv is activated, you can install the required dependencies.

    $ pip install -r requirements.txt

Use AWS CDK to deploy the infrastructure into your account

The following steps use AWS CDK to deploy the infrastructure. This infrastructure includes the various scanners, Security Hub, EventBridge, and Kinesis Firehose streams. When complete, the raw Security Hub data will already be stored in an S3 bucket.

To deploy the infrastructure using AWS CDK

  1. If you've never used AWS CDK in the account you're using, or if you've never used CDK in the us-east-1, us-east-2, or us-west-1 Regions, you must bootstrap the Regions via the command prompt.

    $ cdk bootstrap

  2. At this point, you can deploy the stack to your default AWS account via the command prompt.

    $ cdk deploy --all

  3. While cdk deploy is running, you will see the output in Figure 2. This is a prompt to ensure you’re aware that you’re making a security-relevant change and creating AWS Identity and Access Management (IAM) roles. Enter y when prompted to continue the deployment process:

    Figure 2. CDK approval prompt to create IAM roles

  4. Confirm cdk deploy is finished. When the deployment is finished, you should see three stack ARNs. It will look similar to Figure 3.

    Figure 3. Final output of CDK deploy

As a result of the deployed CDK code, Security Hub and the Prowler scanner will automatically scan your account, process the data, and send it to S3. While it takes less than an hour for some data to be processed and searchable in Athena, we recommend waiting 24 hours before proceeding to the next steps, to ensure enough data is processed to generate useful visualizations. This is because the remaining steps roll up findings by the hour. Also, it takes several minutes to get initial results from the Security Hub standards and up to an hour to get initial results from Prowler.

Build an Athena view

Now that you've deployed the infrastructure to detect, ingest, and transform security-related findings, it's time to use an Athena view to accomplish the analyze portion of the solution. The following view aggregates the number of findings for a given day. Athena views can be used to summarize data or enrich it with data from other sources. Use the following steps to build a simple example view. For more information on creating Athena views, see Working with Views.

To build an Athena view

  1. Open the AWS Management Console and ensure that the Region is set to us-east-1 (Northern Virginia).
  2. Navigate to the Athena service. If you’ve never used this service, choose Get Started to navigate to the Query Editor screen. Otherwise, the Query Editor screen is the default view.
  3. If you’re new to Athena, you also need to set up a query result location.
    1. Choose Settings in the top right of the Query Editor screen to open the settings panel.
    2. Choose Select to select a query result location.

      Figure 4. Athena settings

    3. Locate an S3 bucket in the list that starts with analyticsink-queryresults and choose the right-arrow icon.
    4. Choose Select to select a query results bucket.

      Figure 5. Select S3 location confirmation

  4. Select AwsDataCatalog as the Data source and security_hub_database as the Database. The Query Editor screen should look like Figure 6.

    Figure 6. Empty query editor

  5. Copy and paste the following SQL in the query window:

    CREATE OR REPLACE VIEW "security-hub-rolled-up-finding" AS
    SELECT
      "date_format"("from_iso8601_timestamp"(updatedat), '%Y-%m-%d %H:00') year_month_day
    , region
    , compliance_status
    , workflowstate
    , severity_label
    , COUNT(DISTINCT title) AS cnt
    FROM
      security_hub_database."security-hub-crawled-findings"
    GROUP BY
      "date_format"("from_iso8601_timestamp"(updatedat), '%Y-%m-%d %H:00'), compliance_status, workflowstate, severity_label, region

  6. Choose the Run query button.

If everything is correct, you should see Query successful in the Results, as shown in Figure 7.

Figure 7. Creating an Athena view

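
You can also query the view programmatically to confirm it returns data. The following Python (boto3) sketch is a supplement to the console steps above; the output bucket name is a placeholder for the analyticsink-queryresults bucket the CDK stack created:

import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")

query = athena.start_query_execution(
    QueryString='SELECT * FROM "security-hub-rolled-up-finding" LIMIT 10',
    QueryExecutionContext={"Database": "security_hub_database"},
    ResultConfiguration={"OutputLocation": "s3://analyticsink-queryresults-example/"},
)
execution_id = query["QueryExecutionId"]

# Poll until the query completes, then print the first page of results
while True:
    status = athena.get_query_execution(QueryExecutionId=execution_id)
    state = status["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    results = athena.get_query_results(QueryExecutionId=execution_id)
    for row in results["ResultSet"]["Rows"]:
        print([col.get("VarCharValue") for col in row["Data"]])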
Build QuickSight visualizations

Now that you've deployed the infrastructure to detect, ingest, and transform security-related findings, and have created an Athena view to analyze those findings, it's time to use QuickSight to visualize them. To use QuickSight, you must first grant QuickSight permissions to access S3 and Athena. Next, you create a QuickSight dataset, and then a QuickSight analysis. Optionally, you can publish the analysis as a dashboard when complete.

You will build a simple visualization that shows counts of findings over time separated by severity, though it’s also possible to use QuickSight to tell rich and compelling visual stories.

In order to use QuickSight, you need to sign up for a QuickSight subscription. Steps to do so can be found in Signing Up for an Amazon QuickSight Subscription.

The first thing you need to do once logged in to QuickSight is create the data source. If this is your first time logging in to the service, you will be greeted with an initial QuickSight page as shown in Figure 8.

Figure 8. Initial QuickSight page

Grant QuickSight access to S3 and Athena

While creating the Athena data source will enable QuickSight to query data from Athena, you also need to enable QuickSight to read from S3.

To grant QuickSight access to S3 and Athena

  1. Inside QuickSight, select your profile name (upper right). Choose Manage QuickSight, and then choose Security & permissions.
  2. Choose Add or remove.
  3. Ensure the checkbox next to Athena is selected.
  4. Ensure the checkbox next to Amazon S3 is selected.
  5. Choose Details and then choose Select S3 Buckets.
  6. Locate an S3 bucket in the list that starts with analyticsink-bucket and ensure the checkbox is selected.
    Figure 9. Example permissions

  7. Choose Finish to save changes.

Create a QuickSight dataset

Once you’ve given QuickSight the necessary permissions, you can create a new dataset.

To create a QuickSight dataset

  1. Choose Datasets from the navigation pane at left. Then choose New Dataset.

    Figure 10. Dataset page

  2. To create a new Athena connection profile, use the following steps:
    1. In the FROM NEW DATA SOURCES section, choose the Athena data source card.
    2. For Data source name, enter a descriptive name. For example: security-hub-rolled-up-finding.
    3. For Athena workgroup choose [ primary ].
    4. Choose Validate connection to test the connection. This also confirms encryption at rest.
    5. Choose Create data source.
  3. On the Choose your table screen, select:
    Catalog: AwsDataCatalog
    Database: security_hub_database
    Table: security-hub-rolled-up-finding
  4. Finally, select the Import to SPICE for quicker analytics option and choose Visualize.

Once you’re finished, the page to create your first analysis will automatically open. Figure 11 shows an example of the page.

Figure 11. Create an analysis page

Create a QuickSight analysis

A QuickSight analysis is more than just a visualization—it helps you uncover hidden insights and trends in your data, identify key drivers, and forecast business metrics. You can create rich analytic experiences with QuickSight. For more information, visit Working with Visuals in the QuickSight User Guide.

For simplicity, you’ll build a visualization that summarizes findings categories by severity and aggregated by hour.

To create a QuickSight analysis

  1. Choose Line Chart from the Visual Types.

    Figure 12. Visual types

  2. Select Fields. Figure 13 shows what your field wells should look like at the end of this step.
    1. Locate the year_month_day_hour field in the field list and drag it over to the X axis field well.
    2. Locate the cnt field in the field list and drag it over to the Value field well.
    3. Locate the severity_label field in the field list and drag it over to Color field well.

      Figure 13. Field wells

  3. Add Filters.
    1. Select Filter in the left navigation panel.

      Figure 14. Filters panel

    2. Choose Create one… and select the compliance_status field.
    3. Expand the filter and clear NOT_AVAILABLE and PASSED (Note: depending on your data, you might not have all of these statuses).
    4. Choose Apply to apply the filter.

      Figure 15. Filtering out findings that are not failing

You should now see a visualization that looks like Figure 16, which shows a summary count of events and their severity.

Figure 16. Example visualization (note: this visualization has five days’ worth of data.)

Publish a QuickSight analysis dashboard (optional)

Publishing a dashboard is a great way to share reports with leaders. This two-step process allows you to share visualizations as a dashboard.

To publish a QuickSight analysis

  1. Choose Share on the application bar, then choose Publish dashboard.
  2. Select Publish new dashboard as, and then enter a dashboard name, such as Security Hub Findings by Severity.

You can also embed dashboards into web applications. This requires using the AWS SDK or the AWS Command Line Interface (AWS CLI). For more information, see Embedding QuickSight Data Dashboards for Everyone.

Encouraged security posture in QuickSight

QuickSight has a number of security features. The AWS Security section of the QuickSight User Guide goes into detail on the standards that apply to this specific scenario; see AWS security in Amazon QuickSight within the QuickSight User Guide for more details.

Clean up (optional)

When done, you can clean up QuickSight by removing the Athena view and the CDK stack. Follow the detailed steps below to clean up everything.

To clean up QuickSight

  1. Open the console and choose Datasets in the left navigation pane.
  2. Select security-hub-rolled-up-finding then choose Delete dataset.
  3. Confirm dataset deletion by choosing Delete.
  4. Choose Analyses from the left navigation pane.
  5. Choose the menu in the lower right corner of the security-hub-rolled-up-finding card.

    Figure 17. Example analysis card

  6. Select Delete and confirm Delete.

To remove the Athena view

  1. Paste the following SQL in the query window:

    DROP VIEW "security-hub-rolled-up-finding"

  2. Choose the Run query button.

To remove the CDK stack

  1. Run the following command in your terminal:

    cdk destroy

    Note: If you experience errors, you might need to reactivate your Python virtual environment by completing steps 3–5 of Use AWS CDK to deploy the infrastructure.

Conclusion

In this blog, you used Security Hub and QuickSight to deploy a scalable analytic pipeline for your security tools. Security Hub allowed you to join and collect security findings from multiple sources. With QuickSight, you summarized data for your senior leaders and decision-makers to give them the right data in real time.

You ensured that your sensitive data remained protected by explicitly granting QuickSight the ability to read from a specific S3 bucket. By authorizing access only to the data sources needed to visualize your data, you ensure least privilege access. QuickSight supports many other AWS data sources, including Amazon RDS, Amazon Redshift, Lake Formation, and Amazon OpenSearch Service (successor to Amazon Elasticsearch Service). Because the data doesn’t live inside an Amazon Virtual Private Cloud (Amazon VPC), you didn’t need to grant access to any specific VPCs. Limiting access to VPCs is another great way to improve the security of your environment.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Security Hub forum. To start your 30-day free trial of Security Hub, visit AWS Security Hub.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

David Hessler

David is a senior cloud consultant with AWS Professional Services. He has over a decade of technical experience helping customers tackle their most challenging technical problems and providing tailor-made solutions using AWS services. He is passionate about DevOps, security automation, and how the two work together to allow customers to focus on what matters: their mission.

Add comparative and cumulative date/time calculations in Amazon QuickSight

Post Syndicated from Emily Zhu original https://aws.amazon.com/blogs/big-data/add-comparative-and-cumulative-date-time-calculations-in-amazon-quicksight/

Amazon QuickSight recently added native support for comparative (e.g., year-over-year) and cumulative (e.g., year-to-date) period functions, which allow you to easily introduce these calculations in business reporting, trend analysis, and time series analysis. This allows authors in QuickSight to implement advanced calculations without having to use complicated date offsets to achieve such datetime-aware comparisons.

In this post, we introduce the new period functions and their capabilities, and demonstrate several typical use cases. We also discuss several scenarios to extend the usage of the period functions, which will be useful in more advanced situations.

New period functions

Before we demonstrate use cases, let’s go over the new period function suite and see what new functions we now support. We can divide period functions into two main groups: comparative (period over period) functions and cumulative (period to date) functions.

Comparative (period over period) functions

You can use period over period functions to compare measures at different time periods, such as year, quarter, and month. For example, you can compute a year-over-year increase in sales, or week-over-week percentage revenue changes.

A typical comparative period function has the syntax periodOverPeriodDifference(measure, date, period, offset), with two optional arguments: period and offset.

You can use the period argument in the function to define the period granularity of the calculation. The granularity of YEAR means year-over-year computation, QUARTER means quarter-over-quarter, and so on. If the period argument is left empty, the calculation changes based on the period granularity that is chosen (in the field well) to be displayed in the visual.

You can also use the offset argument to specify how many periods apart you want to compute the comparison. For instance, a period of a quarter with an offset of 2 means comparing against the previous two quarters.

Note that period and offset have to be both specified or both left empty. You can’t specify just one of them.
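
For example, periodOverPeriodDifference(sum(Sales), {Order Date}, QUARTER, 2) compares each period's sales against the value from two quarters earlier, whereas periodOverPeriodDifference(sum(Sales), {Order Date}) leaves both optional arguments empty and adapts to whatever date granularity is displayed in the visual.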

The following table summarizes the three available period over period functions.

Function Name Function Type Description
periodOverPeriodDifference Table Calculation Calculates the difference of a measure over two different time periods as specified by period granularity and offset.
periodOverPeriodLastValue Table Calculation Calculates the last (previous) value of a measure from a previous time period as specified by period granularity and offset.
periodOverPeriodPercentDifference Table Calculation Calculates the percent difference of a measure over two different time periods as specified by period granularity and offset.

Cumulative (period to date) functions

You can use period to date functions to calculate metrics within a given period-to-date window. There are two main types of cumulative functions.

  • “OverTime” functions, which are table calculations and return outputs for each row in the visual.

E.g., you can use periodToDateCountOverTime with WEEK granularity to compute a series of week-to-date new customer counts to track the fluctuation of customer engagement.

  • Aggregation functions, which output an aggregated value for a fixed period-to-date time range.

E.g., periodToDateSum with YEAR granularity returns a single value for the total of the metric from the beginning of the year to the endDate provided in the formula. If left blank, the default endDate takes now(), which is the moment when users load the dashboard.
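
For example, a calculated field like periodToDateSum(Sales, {Order Date}, YEAR) returns a single year-to-date sales total as of the moment the dashboard loads, while supplying an explicit endDate as the fourth argument pins the year-to-date window to that date instead.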

The following table summarizes the period to date functions.

Function Name Function Type Description
periodToDateAvgOverTime Table Calculation Calculates the average of a measure for a given time granularity (for instance, a quarter) up to a point in time.
periodToDateCountOverTime Table Calculation Calculates the count of a dimension or measure for a given time granularity (for instance, a quarter) up to a point in time.
periodToDateMaxOverTime Table Calculation Calculates the maximum of a measure or date for a given time granularity (for instance, a quarter) up to a point in time.
periodToDateMinOverTime Table Calculation Calculates the minimum of a measure or date for a given time granularity (for instance, a quarter) up to a point in time.
periodToDateSumOverTime Table Calculation Calculates the sum of a measure for a given time granularity (for instance, a quarter) up to a point in time.
periodToDateAvg Aggregation Averages the set of numbers in the specified measure for a given time granularity (for instance, a quarter) up to a point in time.
periodToDateCount Aggregation Calculates the number of values in a dimension or measure for a given time granularity (for instance, a quarter) up to a point in time, including duplicates.
periodToDateMax Aggregation Returns the maximum value of the specified measure for a given time granularity (for instance, a quarter) up to a point in time.
periodToDateMedian Aggregation Returns the median value of the specified measure for a given time granularity (for instance, a quarter) up to a point in time.
periodToDateMin Aggregation Returns the minimum value of the specified measure or date for a given time granularity (for instance, a quarter) up to a point in time.
periodToDatePercentile Aggregation Calculates the percentile based on the actual numbers in the measure for a given time granularity (for instance, a quarter) up to a point in time.
periodToDatePercentileCont Aggregation Calculates the percentile based on a continuous distribution of the numbers in the measure for a given time granularity (for instance, a quarter) up to a point in time.
periodToDateStDev Aggregation Calculates the standard deviation of the set of numbers in the specified measure for a given time granularity (for instance, a quarter) up to a point in time based on a sample.
periodToDateStDevP Aggregation Calculates the population standard deviation of the set of numbers in the specified measure for a given time granularity (for instance, a quarter) up to a point in time based on a sample.
periodToDateSum Aggregation Adds the set of numbers in the specified measure for a given time granularity (for instance, a quarter) up to a point in time.
periodToDateVar Aggregation Calculates the sample variance of the set of numbers in the specified measure for a given time granularity (for instance, a quarter) up to a point in time.
periodToDateVarP Aggregation Calculates the population variance of the set of numbers in the specified measure for a given time granularity (for instance, a quarter) up to a point in time.

Use case 1: Using a period over period function to analyze sales growth in fixed and dynamic granularity

Let’s dive into how period over period functions can support typical business and financial use cases. The following example uses periodOverPeriodDifference to calculate YoY sales growth. Because we have Segment and Region in the visual, the YoY sales is calculated for each segment and region.

We define the measure of YoYSales with the following formula: YoYSales=periodOverPeriodDifference(sum(Sales),{Order Date},YEAR,1)

The first argument, sum(Sales), tells the function to calculate based on this measure. The second argument, Order Date, specifies the date/time column from which Year information is extracted. The third argument, YEAR, fixes the granularity of this calculation. When this optional argument is specified, this measure always returns YoY (not QoQ or MoM) no matter how Order Date is selected (in the field well) to be displayed in the visual. The fourth argument, 1, specifies the offset of the comparison. In this example, it means we want to compare the sales of each order date with the same date of the previous year. The measure returns empty for order dates of 2018, because no previous periods exist to be compared with.

Period functions work with totals and subtotals. By adding the total for columns into the visual, you can see the total sales and total YoYSales for each region.

If you leave the optional argument of period granularity empty, that is, change the formula to PoPSales=periodOverPeriodDifference(sum(Sales),{Order Date}) as shown in the following example, the time period of the calculation is then determined by the granularity of Order Date displayed on the visual. In the following example, Order Date is chosen to display at quarter level (in the field well), so PoPSales dynamically calculates the QoQ sales growth. Changing Order Date to the monthly level updates the measure to calculate MoM. For PoPSales, only Q1 2018 returns empty because that's the only quarter that doesn't have a previous quarter to compare with.

If we add YoYSales from the previous example to this visual, it calculates YoY sales growth at the quarter level (compares sales of Q1 2019 with Q1 2018). This demonstrates the difference between a fixed granularity and a dynamic granularity of period over period functions.

The period over period functions can differentiate between a positive change (increase) and negative change (decrease). Therefore, when we add the conditional formatting to the visual, it’s very straightforward to see the financial performance of each period (green is good, red is bad).

Similarly, you can use periodOverPeriodPercentDifference to calculate relative sales growth over time. You can add dimensions into the visual to dive further into business insights, such as analyzing the breakdown of each business segment’s sales change by quarter, and their contribution to the total sales increase. We use the formula PoPSales%=periodOverPeriodPercentDifference(sum(Sales),{Order Date}).

Use case 2: Using a period to date function to track YTD sales in table calculations and aggregations

Similar to period over period functions, the period to date function suite provides a quick and easy way to calculate year-to-date (YTD) or quarter-to-date (QTD) metrics. In the following example, we use the formula of YTDSumoverSales=periodToDateSumOverTime(sum(Sales),{Order Date},YEAR), and YTDtotalcountSales=periodToDateSumOverTime(count(Sales),{Order Date},YEAR) to calculate YTD sales and YTD total number of transactions.

Unlike period over period functions, the third argument of period to date functions, period, isn't optional. Therefore, the calculation granularity is always fixed. In this example, with the granularity defined as YEAR, this measure always calculates YTD, instead of QTD or MTD. Because Order Date is displayed at the monthly level, this calculation outputs the YTD sales of each month, and starts over again in January of the next year. As shown in the result table, YTDSumoverSales of January 2018 is the monthly sales of January 2018, and YTDSumoverSales of February 2018 is the monthly sales of January 2018 plus that of February 2018. YTDSumoverSales of January 2019 resets to the monthly sales of January 2019.
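
To build intuition for what periodToDateSumOverTime computes, the following pandas sketch reproduces the same restart-each-year logic outside QuickSight. It is an analogy only, not how QuickSight implements the function, and the dates and sales values are made up:

import pandas as pd

# Monthly sales; the dates and values are made-up for illustration
sales = pd.Series(
    [100.0, 150.0, 130.0, 90.0],
    index=pd.to_datetime(["2018-01-01", "2018-02-01", "2019-01-01", "2019-02-01"]),
)

# YTD sum: a cumulative sum that restarts at the beginning of each year
ytd = sales.groupby(sales.index.year).cumsum()
# 2018-01-01    100.0
# 2018-02-01    250.0
# 2019-01-01    130.0   <- resets in January of the next year
# 2019-02-01    220.0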

You can further dive into the details by populating the calculations in a line chart, and adding more dimensions into the analysis. The following example shows the YTD weekly sales growth trend for each region over the past four years, and uncovers some interesting sales competition between AMER and EMEA in 2021.

In addition to the table calculations, the aggregation period functions are particularly useful when you need to build KPI charts to evaluate YTD metrics in real time. In the following example, we use the aggregation period to date functions to build two KPI charts to track the YTD total sales and the YTD total number of transactions. For December 26, 2021, the KPI results match the corresponding table calculations for that date in the table. The following table summarizes the formulas.

Formula Formula Type
YTDSumoverSales=periodToDateSumOverTime(sum(Sales),{Order Date},YEAR) Table Calculation
YTDtotalcountSales=periodToDateSumOverTime(count(Sales),{Order Date},YEAR) Table Calculation
YTDSumSales=periodToDateSum(Sales,{Order Date},YEAR) Aggregation (KPI chart)
YTDCountSales=periodToDateCount(Sales,{Order Date},YEAR) Aggregation (KPI chart)

Advanced use case 1: Date/time awareness with period functions

Period functions are not only easier to define and read, they're also date/time-aware, meaning the functions are calculated based on a date/time-based offset instead of a fixed number of rows. This solves two major problems that previously couldn't be addressed.

Period functions can handle varying period duration

If you want to calculate the daily MoM sales increase, you can't use a fixed offset on each month because the number of days in each month differs (31 days for January and 28 or 29 days for February).

Period functions are calculated based on calendar dates instead of a fixed offset. In the following example, we use the formula MoMsalesDiff=periodOverPeriodDifference(sum(Sales),{Order Date},MONTH,1). The daily MoM increase is calculated correctly based on the day of the month. The sales of the first day of the month are compared with the first day of the previous month, and the same applies to all other days. (Visuals are duplicated for demonstration purposes.)

Period functions can handle sparse (missing) data points

Not all datasets can guarantee a complete set of dates. In the previous example, sales data of January 1, 2018, is missing. Using the workaround based on a fixed offset can cause a problem here because we compare February 1, 2018, with a different date instead of January 1, 2018. Period functions always compare measures by date/time offsets so that only desired dates are compared. In the previous example, MoMsalesDiff shows empty for February 1, 2018, because of the missing data of January 1, 2018.
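
As a rough outside-of-QuickSight analogy, both behaviors (varying period duration and sparse data) fall out naturally when the comparison is keyed on calendar offsets rather than row offsets. This pandas sketch is for intuition only, with made-up values:

import pandas as pd

# Daily sales; January 1 is deliberately missing to show sparse-data handling
sales = pd.Series(
    [10.0, 12.0, 11.0, 15.0],
    index=pd.to_datetime(["2018-01-02", "2018-01-31", "2018-02-01", "2018-02-02"]),
)

# Shift each observation's label forward one calendar month, then subtract.
# pd.DateOffset is calendar-aware: January 31 + 1 month lands on February 28.
prev = sales.shift(freq=pd.DateOffset(months=1))
mom = (sales - prev).reindex(sales.index)
# 2018-01-02    NaN   <- no previous month in the data
# 2018-01-31    NaN
# 2018-02-01    NaN   <- empty because January 1 is missing, not misaligned
# 2018-02-02    5.0   <- 15 - 10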

Advanced use case 2: Nesting period functions with other calculations

Now that we can use period over period and period to date functions to create calculated fields, we can nest these functions with other calculations to drive more advanced analysis.

For example, you may want to know, for each year, the top 10 weeks of the year in terms of week-over-week sales growth. You can do this by calculating WoWSales first: WoWSales=periodOverPeriodDifference(sum(Sales), {Order Date}, WEEK, 1). Then you nest it with the denseRank window function: RankWoWSales=denseRank([WoWSales DESC],[{YEAR}]). This wouldn't be possible using the fixed-row based workaround, which is implemented using on-visual calculations instead of calculated fields. In the following visual, the top 10 weeks of each year with the highest sales growth are fetched by a simple filter on RankWoWSales.
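
For intuition, here is the same nesting pattern (period difference, then dense rank within each year) expressed as a pandas sketch. It is an analogy, not QuickSight's implementation, and the weekly values are made up:

import pandas as pd

# Weekly sales keyed by week-start dates; values are made-up for illustration
weekly = pd.Series(
    [50.0, 70.0, 65.0, 90.0],
    index=pd.to_datetime(["2021-01-04", "2021-01-11", "2021-01-18", "2021-01-25"]),
)

# WoW difference by calendar-week offset, then dense rank within each year
wow = (weekly - weekly.shift(freq=pd.DateOffset(weeks=1))).reindex(weekly.index)
rank = wow.groupby(wow.index.year).rank(ascending=False, method="dense")
top = wow[rank <= 10]  # analogous to filtering on RankWoWSales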

You can even nest the period functions with other period functions to generate interesting insights. For example, you can calculate a monthly YoY growth based on the monthly YTD number of transactions. The following formula demonstrates the capability of nesting a YTD calculated field inside a YoY calculated field:

YTDtotalcountSales=periodToDateSumOverTime(count(Sales),{Order Date},YEAR)
YoYYTDSalesCount=periodOverPeriodDifference(YTDtotalcountSales,{Order Date},YEAR,1)

The results in the following visual show a YoY growth based on a YTD accumulated number of transactions instead of the absolute monthly numbers.

Advanced use case 3: Partial period comparisons

Finally, we discuss a third advanced use case: partial period comparison. Imagine it's November 15, 2021 (the 46th day of the last quarter of 2021), and you want to compare the performance of this quarter with past quarters, but only using the first 46 days of each quarter instead of the whole quarter. This requires a calculated field using periodOverPeriodDifference nested with the sumIf() window function.

The following example demonstrates using a nested calculated field to address this use case:

QuarterToDate=periodToDateSumOverTime(sum(Sales), {Order Date}, QUARTER)
PartialQTDSales=sumIf(Sales, dateDiff(truncDate("Q", {Order Date}), {Order Date}, "HH") <= dateDiff(truncDate("Q", now()), now(), "HH"))
PartialQoQQTDSales=periodOverPeriodDifference(sumIf(Sales, {Order Date} <= addDateTime(dateDiff(truncDate("Q", now()), now(), "HH"), "HH", truncDate("Q", {Order Date}))), {Order Date}, QUARTER, 1)

PartialQTDSales computes how many hours have elapsed from the beginning of the quarter to the current date and uses sumIf() to calculate the total sales of that period within each quarter. PartialQoQQTDSales then nests the periodOverPeriodDifference function with a sumIf() expression to find the partial QoQ differences. Such a comparison based on a partial period isn't feasible without the new date/time-aware period functions.
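
The hour arithmetic at the heart of PartialQTDSales can be checked with a few lines of plain Python. This is a verification sketch only; the dates mirror the example above:

from datetime import datetime

def hours_into_quarter(ts: datetime) -> float:
    """Hours elapsed between the start of ts's quarter and ts itself."""
    quarter_start = datetime(ts.year, 3 * ((ts.month - 1) // 3) + 1, 1)
    return (ts - quarter_start).total_seconds() / 3600

now = datetime(2021, 11, 15)      # the 46th day of Q4 2021
cutoff = hours_into_quarter(now)  # 1080.0 hours into the quarter

# A Q3 row dated August 5, 2021, is included in the partial-quarter total
# because it is fewer hours into its own quarter than "now" is into Q4
include = hours_into_quarter(datetime(2021, 8, 5)) <= cutoff  # True (840 <= 1080)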

Conclusion

In this blog post, we introduced the new QuickSight period functions, which enable quick and powerful date/time-based calculations. We reviewed the comparative and cumulative period functions (period over period and period to date), discussed two major use cases (fixed vs. dynamic granularity, and table calculation vs. aggregation), and extended the usage to three advanced use cases. Period functions are now generally available in all supported QuickSight Regions.

We look forward to your feedback and stories on how you apply these calculations to your business needs.


About the Authors

Emily Zhu is a Senior Product Manager at Amazon QuickSight, AWS's cloud-native, fully managed SaaS BI service. She leads the development of QuickSight's core analytics and calculations capability. Before joining AWS, she worked for several years in the Amazon Prime Air drone delivery program and at the Boeing Company as a senior strategist. Emily is passionate about the potential of cloud-based BI solutions and looks forward to helping customers advance in their data-driven strategy making.

Rajkumar Haridoss is a Senior Software Development Engineer for AWS QuickSight. He is the lead engineer on the Query Generation team and works on the back-end calculations, query planning, and query generation layer in QuickSight. Outside of work, he likes spending quality time with his family and his 4-year-old.

How Ribbon Communications Built a Scalable, Resilient Robocall Mitigation Platform

Post Syndicated from Siva Rajamani original https://aws.amazon.com/blogs/architecture/how-ribbon-communications-built-a-scalable-resilient-robocall-mitigation-platform/

Ribbon Communications provides communications software and IP and optical networking end-to-end solutions that deliver innovation, unparalleled scale, performance, and agility to service providers and enterprises.

Ribbon Communications is helping customers modernize their networks. In today’s data-hungry, 24/7 world, this equates to improved competitive positioning and business outcomes. Companies are migrating from on-premises equipment for telephony services and looking for equivalent as a service (aaS) offerings. But these solutions must still meet the stringent resiliency, availability, performance, and regulatory requirements of a telephony service.

The telephony world is inundated with robocalls. In the United States alone, there were an estimated 50.5 billion robocalls in 2021! In this blog post, we describe the Ribbon Identity Hub, a holistic solution for robocall mitigation. The Ribbon Identity Hub enables services that sign and verify caller identity, which is compliant with the ATIS standards under the STIR/SHAKEN framework. It also evaluates and scores calls for the probability of nuisance and fraud.

Ribbon Identity Hub is implemented in Amazon Web Services (AWS). It is a fully managed service for telephony service providers and enterprises. The solution is secure, multi-tenant, automatically scaling, and multi-Region, and enables Ribbon to offer managed services to a wide range of telephony customers. Ribbon ensures resiliency and performance with efficient use of resources in the telephony environment, where load ratios between busy and idle time can exceed 10:1.

Ribbon Identity Hub

The Ribbon Identity Hub services are separated into a data (call-transaction) plane, and a control plane.

Data plane (call-transaction)

The call-transaction processing is typically invoked on a per-call-setup basis where availability, resilience, and performance predictability are paramount. Additionally, due to high variability in load, automatic scaling is a prerequisite.

Figure 1. Data plane architecture

Figure 1. Data plane architecture

Several AWS services come together in a solution that meets all these important objectives:

  1. Amazon Elastic Container Service (ECS): The ECS services are set up for automatic scaling and span two Availability Zones. This provides the horizontal scaling capability, the self-healing capacity, and the resiliency across Availability Zones.
  2. Elastic Load Balancing – Application Load Balancer (ALB): This provides the ability to distribute incoming traffic to ECS services as the target. In addition, it also offers:
    • Seamless integration with the ECS Auto Scaling group. As the group grows, traffic is directed to the new instances only when they are ready. As traffic drops, traffic is drained from the target instances for graceful scale down.
    • Full support for canary and linear upgrades with zero downtime. This maintains full service availability without any changes visible to, or even perceived by, client devices.
  3. Amazon Simple Storage Service (S3): Transaction detail records associated with call-related requests must be securely and reliably maintained for over a year due to billing and other contractual obligations. Amazon S3 simplifies this task with high durability, lifecycle rules, and varied controls for retention.
  4. Amazon DynamoDB: Building resilient services is significantly easier when the compute processing can be stateless. Amazon DynamoDB facilitates such stateless architectures without compromise. Coupled with the availability of the Amazon DynamoDB Accelerator (DAX) caching layer, the solution can meet the extreme low latency operation requirements.
  5. AWS Key Management Service (KMS): Certain tenant configuration is highly confidential and requires elevated protection. Furthermore, the data is part of the state that must be recovered across Regions in disaster recovery scenarios. To meet the security requirements, KMS is used for envelope encryption with per-tenant keys (a minimal sketch follows this list). Multi-Region KMS keys facilitate the secure availability of this state across Regions without the need for application-level intervention when replicating encrypted data.
  6. Amazon Route 53: For telephony services, any non-transient service failure is unacceptable. In addition to providing a high degree of resiliency through its Multi-AZ architecture, Identity Hub also provides Region-level high availability through its multi-Region active-active architecture. Route 53 with health checks provides dynamic rerouting of requests to alternate Regions within minutes.
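
The following is a minimal sketch of the envelope encryption pattern described in item 5. The key alias, payload format, and use of the Fernet wrapper are our own assumptions for illustration, not Ribbon's actual implementation:

import base64
import boto3
from cryptography.fernet import Fernet  # assumption: any local AES wrapper works

kms = boto3.client("kms")
TENANT_KEY = "alias/tenant-acme"  # assumption: one KMS key (or alias) per tenant

def encrypt_config(plaintext: bytes) -> dict:
    # Ask KMS for a fresh data key under the tenant's key
    dk = kms.generate_data_key(KeyId=TENANT_KEY, KeySpec="AES_256")
    # Encrypt locally with the plaintext data key; store only the encrypted key
    f = Fernet(base64.urlsafe_b64encode(dk["Plaintext"]))
    return {"ciphertext": f.encrypt(plaintext), "encrypted_key": dk["CiphertextBlob"]}

def decrypt_config(record: dict) -> bytes:
    # With multi-Region keys, this decrypt also works in a replica Region
    key = kms.decrypt(CiphertextBlob=record["encrypted_key"])["Plaintext"]
    return Fernet(base64.urlsafe_b64encode(key)).decrypt(record["ciphertext"])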

Control plane

The Identity Hub control plane is used for customer configuration, status, and monitoring. The API is REST-based. Since this is not used on a call-by-call basis, the requirements around latency and performance are less stringent, though the requirements around high resiliency and dynamic scaling still apply. In this area, ease of implementation and maintainability are key.

Figure 2. Control plane architecture

Figure 2. Control plane architecture

The following AWS services implement our control plane:

  1. Amazon API Gateway: Coupled with a custom authorizer, API Gateway handles all the REST API credential verification and routing. Implementing the API is thereby reduced to implementing handlers for each resource, which form the application core of the API.
  2. AWS Lambda: All the REST API handlers are written as Lambda functions (a minimal handler sketch follows this list). By using Lambda's serverless and concurrency features, the application automatically gains self-healing and auto scaling capabilities. There is also a significant cost advantage because billing is per millisecond of actual compute time used. This is significant for a control plane where usage is typically sparse and unpredictable.
  3. Amazon DynamoDB: In a stateless architecture with Lambda and API Gateway, all persistent state must be stored in an external database. The database must match the resilience and auto scaling characteristics of the rest of the control plane. DynamoDB easily fits the requirements here.
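
A minimal sketch of this pattern follows. The table name, path parameter, and response shape are illustrative assumptions, not Ribbon's actual API:

import json
import boto3

table = boto3.resource("dynamodb").Table("TenantConfig")  # table name is an assumption

def handler(event, context):
    # With the API Gateway proxy integration, path parameters arrive in the event
    tenant_id = event["pathParameters"]["tenantId"]
    item = table.get_item(Key={"tenantId": tenant_id}).get("Item")
    if item is None:
        return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
    return {"statusCode": 200, "body": json.dumps(item, default=str)}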

The customer portal, in addition to providing the user interface for control plane REST APIs, also delivers a rich set of user-customizable dashboards and reporting capability. Here again, the availability of various AWS services simplifies the implementation, and remains non-intrusive to the central call-transaction processing.

Services used here include:

  1. AWS Glue: Enables extraction and transformation of raw transaction data into a format useful for reporting and dashboarding. AWS Glue is particularly useful here as the data available is regularly expanding, and the use cases for the reporting and dashboarding increase.
  2. Amazon QuickSight: Provides all the business intelligence (BI) functionality, including the ability for Ribbon to offer separate author and reader access to their users, and implements tenant-based access separation.

Conclusion

Ribbon has successfully deployed Identity Hub to enable cloud hosted telephony services to mitigate robocalls. Telephony requirements around resiliency, performance, and capacity were not compromised. Identity Hub offers the benefits of a 24/7 fully managed service requiring no additional customer on-premises equipment.

Choosing AWS services for Identity Hub gives Ribbon the ability to scale and meet future growth. The ability to dynamically scale the service in and out also brings significant cost advantages in telephony applications where busy hour traffic is significantly higher than idle time traffic. In addition, the availability of global AWS services facilitates the deployment of services in customer-local geographic locations to meet performance requirements or local regulatory compliance.

Enrich datasets for descriptive analytics with AWS Glue DataBrew

Post Syndicated from Daniel Rozo original https://aws.amazon.com/blogs/big-data/enrich-datasets-for-descriptive-analytics-with-aws-glue-databrew/

Data analytics remains a consistently hot topic. More and more businesses are beginning to understand the potential their data has to allow them to serve customers more effectively and gain a competitive advantage. However, for many small to medium businesses, gaining insight from their data can be challenging because they often lack in-house data engineering skills and knowledge.

Data enrichment is another challenge. Businesses that focus on analytics using only their internal datasets miss the opportunity to gain better insights by using reliable and credible public datasets. Small to medium businesses are no exception to this shortcoming, where obstacles such as not having sufficient data diminish their ability to make well-informed decisions based on accurate analytical insights.

In this post, we demonstrate how AWS Glue DataBrew enables businesses of all sizes to get started with data analytics with no prior coding knowledge. DataBrew is a visual data preparation tool that makes it easy for data analysts and scientists to clean and normalize data in preparation for analytics or machine learning. It includes more than 350 pre-built transformations for common data preparation use cases, enabling you to get started with cleaning, preparing, and combining your datasets without writing code.

For this post, we assume the role of a fictitious small Dutch solar panel distribution and installation company named OurCompany. We demonstrate how this company can prepare, combine, and enrich an internal dataset with publicly available data from the Dutch public entity, the Centraal Bureau voor de Statistiek (CBS), or in English, Statistics Netherlands. Ultimately, OurCompany desires to know how well they’re performing compared to the official reported values by the CBS across two important key performance indicators (KPIs): the amount of solar panel installations, and total energy capacity in kilowatt (kW) per region.

Solution overview

The architecture uses DataBrew for data preparation and transformation, Amazon Simple Storage Service (Amazon S3) as the storage layer of the entire data pipeline, and the AWS Glue Data Catalog for storing the dataset’s business and technical metadata. Following the modern data architecture best practices, this solution adheres to foundational logical layers of the Lake House Architecture.

The solution includes the following steps:

  1. We set up the storage layer using Amazon S3 by creating the following folders: raw-data, transformed-data, and curated-data. We use these folders to track the consumption readiness of the different stages of our data pipeline.
  2. Three CSV raw data files containing unprocessed data of solar panels as well as the external datasets from the CBS are ingested into the raw-data S3 folder.
  3. This part of the architecture incorporates both processing and cataloging capabilities:
    1. We use AWS Glue crawlers to populate the initial schema definition tables for the raw dataset automatically. For the remaining two stages of the data pipeline (transformed-data and curated-data), we use the functionality in DataBrew to create schema definition tables directly in the Data Catalog. Each table provides an up-to-date schema definition of the datasets we store on Amazon S3.
    2. We work with DataBrew projects as the centerpiece of our data analysis and transformation efforts. Here, we set up no-code data preparation and transformation steps, and visualize them through a highly interactive, intuitive user interface. Finally, we define DataBrew jobs to apply these steps and store transformation outputs on Amazon S3.
  4. To gain the benefits of granular access control and easily visualize data from Amazon S3, we take advantage of the seamless integration between Amazon Athena and Amazon QuickSight. This provides a SQL interface to query all the information we need from the curated dataset stored on Amazon S3 without the need to create and maintain manifest files.
  5. Finally, we construct an interactive dashboard with QuickSight to depict the final curated dataset alongside our two critical KPIs.

Prerequisites

Before beginning this tutorial, make sure you have the required Identity and Access Management (IAM) permissions to create the resources required as part of the solution. Your AWS account should also have an active subscription to QuickSight to create the visualization on processed data. If you don’t have a QuickSight account, you can sign up for an account.

The following sections provide a step-by-step guide to create and deploy the entire data pipeline for OurCompany without the use of code.

Data preparation steps

We work with the following files:

  • CBS Dutch municipalities and provinces (Gemeentelijke indeling op 1 januari 2021) – Holds all the municipalities and provinces names and codes of the Netherlands. Download the file gemeenten alfabetisch 2021. Open the file and save it as cbs_regions_nl.csv. Remember to change the format to CSV (comma-delimited).
  • CBS Solar power dataset (Zonnestroom; vermogen bedrijven en woningen, regio, 2012-2018) – This file contains the installed capacity in kilowatts and total number of installations for businesses and private homes across the Netherlands from 2012–2018. To download the file, go to the dataset page, choose the Onbewerkte dataset, and download the CSV file. Rename the file to cbs_sp_cap_nl.csv.
  • OurCompany’s solar panel historical data – Contains the reported energy capacity from all solar panel installations of OurCompany across the Netherlands from 2012 until 2018. Download the file.

As a result, the following are the expected input files we use to work with the data analytics pipeline:

  • cbs_regions_nl.csv
  • cbs_sp_cap_nl.csv
  • sp_data.csv

Set up the storage layer

We first need to create the storage layer for our solution to store all raw, transformed, and curated datasets. We use Amazon S3 as the storage layer of our entire data pipeline. If you prefer to script this setup, a boto3 sketch is provided after the following steps.

  1. Create an S3 bucket in the AWS Region where you want to build this solution. In our case, the bucket is named cbs-solar-panel-data. You can use the same name followed by a unique identifier.
  2. Create the following three prefixes (folders) in your S3 bucket by choosing Create folder:
    1. curated-data/
    2. raw-data/
    3. transformed-data/

  3. Upload the three raw files to the raw-data/ prefix.
  4. Create two prefixes within the transformed-data/ prefix named cbs_data/ and sp_data/.
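
The following boto3 sketch performs the same setup as the preceding steps. The bucket name suffix and Region are assumptions; adjust both to your environment:

import boto3

s3 = boto3.client("s3")
bucket = "cbs-solar-panel-data-<unique-id>"  # bucket names must be globally unique

s3.create_bucket(
    Bucket=bucket,
    CreateBucketConfiguration={"LocationConstraint": "eu-west-1"},  # omit in us-east-1
)
for prefix in ["curated-data/", "raw-data/", "transformed-data/",
               "transformed-data/cbs_data/", "transformed-data/sp_data/"]:
    s3.put_object(Bucket=bucket, Key=prefix)  # zero-byte objects act as folders
for name in ["cbs_regions_nl.csv", "cbs_sp_cap_nl.csv", "sp_data.csv"]:
    s3.upload_file(name, bucket, "raw-data/" + name)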

Create a Data Catalog database

After we set up the storage layer of our data pipeline, we need to create the Data Catalog to store all the metadata of the datasets hosted in Amazon S3. To do so, follow these steps:

  1. Open the AWS Glue console in the same Region of your newly created S3 bucket.
  2. In the navigation pane, choose Databases.
  3. Choose Add database.
  4. Enter the name for the Data Catalog to store all the dataset’s metadata.
  5. Name the database sp_catalog_db.

Create AWS Glue data crawlers

Now that we created the catalog database, it’s time to crawl the raw data prefix to automatically retrieve the metadata associated to each input file.

  1. On the AWS Glue console, choose Crawlers in the navigation pane.
  2. Add a crawler with the name crawler_raw and choose Next.
  3. For S3 path, select the raw-data folder of the cbs-solar-panel-data bucket.
  4. Create an IAM role and name it AWSGlueServiceRole-cbsdata.
  5. Leave the frequency as Run on demand.
  6. Choose the sp_catalog_db database created in the previous section, and enter the prefix raw_ to identify the tables that belong to the raw data folder.
  7. Review the parameters of the crawler and then choose Finish.
  8. After the crawler is created, select it and choose Run crawler.

After the crawler runs successfully, three tables are created in the sp_catalog_db database: raw_sp_data_csv, raw_cbs_regions_nl_csv, and raw_cbs_sp_cap_nl_csv.
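
The same crawler can also be created and run programmatically. Here is a minimal boto3 sketch; the IAM role must already exist, and the bucket name is the one you chose earlier:

import boto3

glue = boto3.client("glue")
glue.create_crawler(
    Name="crawler_raw",
    Role="AWSGlueServiceRole-cbsdata",  # the role created in the console steps
    DatabaseName="sp_catalog_db",
    TablePrefix="raw_",
    Targets={"S3Targets": [{"Path": "s3://cbs-solar-panel-data/raw-data/"}]},
)
glue.start_crawler(Name="crawler_raw")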

Create DataBrew raw datasets

To utilize the power of DataBrew, we need to connect datasets that point to the Data Catalog S3 tables we just created. Follow these steps to connect the datasets:

  1. On the DataBrew console, choose Datasets in the navigation pane.
  2. Choose Connect new dataset.
  3. Name the dataset cbs-sp-cap-nl-dataset.
  4. For Connect to new dataset, choose Data Catalog S3 tables.
  5. Select the sp_catalog_db database and the raw_cbs_sp_cap_nl_csv table.
  6. Choose Create dataset.

We need to create two more datasets following the same process. The following table summarizes the names and tables of the catalog required for the new datasets.

Dataset name Data catalog table
sp-dataset raw_sp_data_csv
cbs-regions-nl-dataset raw_cbs_regions_nl_csv

Import DataBrew recipes

A recipe is a set of data transformation steps. These transformations are applied to one or multiple datasets of your DataBrew project. For more information about recipes, see Creating and using AWS Glue DataBrew recipes.

We have prepared three DataBrew recipes, which contain the set of data transformation steps we need for this data pipeline. Some of these transformation steps include: renaming columns (from Dutch to English), removing null or missing values, aggregating rows based on specific attributes, and combining datasets in the transformation stage.

To import the recipes, follow these instructions:

  1. On the DataBrew console, choose Recipes in the navigation pane.
  2. Choose Upload recipe.
  3. Enter the name of the recipe: recipe-1-transform-cbs-data.
  4. Upload the following JSON recipe.
  5. Choose Create recipe.

Now we need to upload two more recipes that we use for transformation and aggregation projects in DataBrew.

  1. Follow the same procedure to import the following recipes:
Recipe name Recipe source file
recipe-2-transform-sp-data Download
recipe-3-curate-sp-cbs-data Download
  2. Make sure the recipes are listed in the Recipes section filtered by All recipes.

Set up DataBrew projects and jobs

After we successfully create the Data Catalog database, crawlers, DataBrew datasets, and import the DataBrew recipes, we need to create the first transformation project.

CBS external data transformation project

The first project takes care of transforming, cleaning, and preparing cbs-sp-cap-nl-dataset. To create the project, follow these steps:

  1. On the DataBrew console, choose Projects in the navigation pane.
  2. Create a new project with the name 1-transform-cbs-data.
  3. In the Recipe details section, choose Edit existing recipe and choose the recipe recipe-1-transform-cbs-data.
  4. Select the newly created cbs-sp-cap-nl-dataset under Select a dataset.
  5. In the Permissions section, choose Create a new IAM role.
  6. As suffix, enter sp-project.
  7. Choose Create project.

After you create the project, a preview dataset is displayed as a result of applying the selected recipe. When you choose 10 more recipe steps, the service shows the entire set of transformation steps.

After you create the project, you need to grant put and delete S3 object permissions to the created role AWSGlueDataBrewServiceRole-sp-project on IAM. Add an inline policy using the following JSON and replace the resource with your S3 bucket name:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:DeleteObject"
            ],
            "Resource": "arn:aws:s3:::<your-S3-bucket-name>/*"
        }
    ]
}

This role also needs permissions to access the Data Catalog. To grant these permissions, add the managed policy AWSGlueServiceRole to the role.
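If you script the role setup instead, a boto3 sketch might look like the following. The inline policy name is our own choice; the managed policy ARN is the standard one for AWSGlueServiceRole:

import json
import boto3

iam = boto3.client("iam")
role_name = "AWSGlueDataBrewServiceRole-sp-project"

s3_write_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:PutObject", "s3:DeleteObject"],
        "Resource": "arn:aws:s3:::<your-S3-bucket-name>/*",  # replace with your bucket
    }],
}
iam.put_role_policy(
    RoleName=role_name,
    PolicyName="databrew-s3-write",  # inline policy name is an assumption
    PolicyDocument=json.dumps(s3_write_policy),
)
iam.attach_role_policy(
    RoleName=role_name,
    PolicyArn="arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole",
)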

CBS external data transformation job

After we define the project, we need to configure and run a job to apply the transformation across the entire raw dataset stored in the raw-data folder of your S3 bucket. To do so, complete the following steps:

  1. On the DataBrew project page, choose Create job.
  2. For Job name, enter 1-transform-cbs-data-job.
  3. For Output to, choose Data Catalog S3 tables.
  4. For File type¸ choose Parquet.
  5. For Database name, choose sp_catalog_db.
  6. For Table name, choose Create new table.
  7. For Catalog table name, enter transformed_cbs_data.
  8. For S3 location, enter s3://<your-S3-bucket-name>/transformed-data/cbs_data/.
  9. In the job output settings section, choose Settings.
  10. Select Replace output files for each job run and then choose Save.
  11. In the permissions section, choose the automatically created role with the sp-project suffix; for example, AWSGlueDataBrewServiceRole-sp-project.
  12. Review the job details once more and then choose Create and run job.
  13. Back in the main project view, choose Job details.

After a few minutes, the job status changes from Running to Successful. Choose the output to go to the S3 location where all the generated Parquet files are stored.

Solar panels data transformation stage

We now create the second phase of the data pipeline. We create a project and a job using the same procedure described in the previous section.

  1. Create a DataBrew project with the following parameters:
    1. Project name – 2-transform-sp-data
    2. Imported recipe – recipe-2-transform-sp-data
    3. Dataset – sp-dataset
    4. Permissions role – AWSGlueDataBrewServiceRole-sp-project
  2. Create and run another DataBrew job with the following parameters:
    1. Job name – 2-transform-sp-data-job
    2. Output to – Data Catalog S3 tables
    3. File type – Parquet
    4. Database name – sp_catalog_db
    5. Create new table with table name – transformed_sp_data
    6. S3 location – s3://<your-S3-bucket-name>/transformed-data/sp_data/
    7. Settings – Replace output files for each job run
    8. Permissions role – AWSGlueDataBrewServiceRole-sp-project
  3. After the job is complete, create the DataBrew datasets with the following parameters:
Dataset name Data catalog table
transformed-cbs-dataset awsgluedatabrew_transformed_cbs_data
transformed-sp-dataset awsgluedatabrew_transformed_sp_data

You should now see five items in your list of DataBrew datasets.

Data curation and aggregation stage

We now create the final DataBrew project and job.

  1. Create a DataBrew project with the following parameters:
    1. Project name – 3-curate-sp-cbs-data
    2. Imported recipe – recipe-3-curate-sp-cbs-data
    3. Dataset – transformed-sp-dataset
    4. Permissions role – AWSGlueDataBrewServiceRole-sp-project
  2. Create a DataBrew job with the following parameters:
    1. Job name – 3-curate-sp-cbs-data-job
    2. Output to – Data Catalog S3 tables
    3. File type – Parquet
    4. Database name – sp_catalog_db
    5. Create new table with table name – curated_data
    6. S3 location – s3://<your-S3-bucket-name>/curated-data/
    7. Settings – Replace output files for each job run
    8. Permissions role – AWSGlueDataBrewServiceRole-sp-project

The last project defines a single transformation step: the join between the transformed-cbs-dataset and the transformed-sp-dataset, based on the municipality code and the year.

The DataBrew job should take a few minutes to complete.

Next, check your sp_catalog_db database. You should now have raw, transformed, and curated tables in your database. DataBrew automatically adds the prefix awsgluedatabrew_ to both the transformed and curated tables in the catalog.
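
Before building visuals, you can spot-check the curated table with a quick Athena query. The following boto3 sketch assumes the default awsgluedatabrew_ table prefix and an existing S3 location for query results:

import boto3

athena = boto3.client("athena")
athena.start_query_execution(
    QueryString="SELECT * FROM awsgluedatabrew_curated_data LIMIT 10",
    QueryExecutionContext={"Database": "sp_catalog_db"},
    ResultConfiguration={
        "OutputLocation": "s3://<your-S3-bucket-name>/athena-results/"
    },
)
# Results land in the OutputLocation as CSV; the Athena console shows them inline.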

Consume curated datasets for descriptive analytics

We’re now ready to build the consumption layer for descriptive analytics with QuickSight. In this section, we build a business intelligence dashboard that reflects OurCompany’s solar panel energy capacity and installations participation in contrast to the reported values by the CBS from 2012–2018.

To complete this section, you need to have the default primary workgroup already set up on Athena in the same Region where you implemented the data pipeline. If it’s your first time setting up workgroups on Athena, follow the instructions in Setting up Workgroups.

Also make sure that QuickSight has the right permissions to access Athena and your S3 bucket. Then complete the following steps:

  1. On the QuickSight console, choose Datasets in the navigation pane.
  2. Choose Create a new dataset.
  3. Select Athena as the data source.
  4. For Data source name, enter sp_data_source.
  5. Choose Create data source.
  6. Choose AWSDataCatalog as the catalog and sp_catalog_db as the database.
  7. Select the table curated_data.
  8. Choose Select.
  9. In the Finish dataset creation section, choose Directly query your data and choose Visualize.
  10. Choose the clustered bar combo chart from the Visual types list.
  11. Expand the field wells section and then drag and drop the following fields into each section as shown in the following screenshot.
  12. Rename the visualization as you like, and optionally filter the report by sp_year using the Filter option.

From this graph, we can already benchmark OurCompany against the regional values reported by the CBS across two dimensions: the total amount of installations and the total kW capacity generated by solar panels.

We went one step further and created two KPI visualizations to empower our descriptive analytics capabilities. The following is our final dashboard that we can use to enhance our decision-making process.

Clean up resources

To clean all the resources we created for the data pipeline, complete the following steps:

  1. Remove the QuickSight analyses you created.
  2. Delete the dataset curated_data.
  3. Delete all the DataBrew projects with their associated recipes.
  4. Delete all the DataBrew datasets.
  5. Delete all the AWS Glue crawlers you created.
  6. Delete the sp_catalog_db catalog database; this removes all the tables.
  7. Empty the contents of your S3 bucket and delete it.

Conclusion

In this post, we demonstrated how you can begin your data analytics journey. With DataBrew, you can prepare and combine the data you already have with publicly available datasets such as those from the Dutch CBS (Centraal Bureau voor de Statistiek) without needing to write a single line of code. Start using DataBrew today and enrich key datasets in AWS for enhanced descriptive analytics capabilities.


About the Authors

Daniel Rozo is a Solutions Architect with Amazon Web Services based out of Amsterdam, The Netherlands. He is devoted to working with customers and engineering simple data and analytics solutions on AWS. In his free time, he enjoys playing tennis and taking tours around the beautiful Dutch canals.

Maurits de Groot is an intern Solutions Architect at Amazon Web Services. He does research on startups with a focus on FinTech. Besides working, Maurits enjoys skiing and playing squash.


Terms of use: Gemeentelijke indeling op 1 januari 2021, Zonnestroom; vermogen bedrijven en woningen, regio (indeling 2018), 2012-2018, and copies of these datasets redistributed by AWS, are licensed under the Creative Commons 4.0 license (CC BY 4.0), sourced from Centraal Bureau voor de Statistiek (CBS). The datasets used in this solution are modified to rename columns from Dutch to English, remove null or missing values, aggregate rows based on specific attributes, and combine the datasets in the final transformation. Refer to the CC BY 4.0 use, adaptation, and attribution requirements for additional information.

ConexED uses Amazon QuickSight to empower its institutional partners by unifying and curating powerful insights using engagement data

Post Syndicated from Michael Gorham original https://aws.amazon.com/blogs/big-data/conexed-uses-amazon-quicksight-to-empower-its-institutional-partners-by-unifying-and-curating-powerful-insights-using-engagement-data/

This post was co-written with Michael Gorham, Co-Founder and CTO of ConexED.

ConexED is one of the country’s fastest-growing EdTech companies designed specifically for education to enhance the student experience and elevate student success. Founded as a startup in 2008 to remove obstacles that hinder student persistence and access to student services, ConexED provides advisors, counselors, faculty, and staff in all departments across campus the tools necessary to meet students where they are.

ConexED offers a student success and case management platform, HUB Kiosk – Queuing System, and now a business intelligence (BI) dashboard powered by Amazon QuickSight to empower its institutional partners.

ConexED strives to make education more accessible by providing tools that make it easy and convenient for all students to connect with the academic support services that are vital to their success in today's challenging and ever-evolving educational environment. ConexED's student- and user-friendly interface makes online academic communications intuitive and as personalized as face-to-face encounters, while also making on-campus meetings as streamlined and well reported as online meetings.

One of the biggest obstacles facing school administrators is getting meaningful data quickly so that informed, data-driven decisions can be made. Reports can be time-consuming to produce, so they are often generated infrequently, which leads to outdated data. In addition, reporting often lacks customization, and data is typically captured in spreadsheets, which don't provide a visual representation of the information that is easy to interpret. ConexED has always offered robust reporting features, but in providing this kind of data for our partners, our development team was spending more than half its time creating custom reporting for the constantly increasing breadth of data the ConexED system generates.

Every new feature we build requires at least two or three new reports, and therefore more of our development team's time. After we implemented QuickSight, not only can ConexED's development team focus all its energy on building competitive features and accelerating the rollout of new products, but the reporting and data visualization are now capabilities our customers can control and customize. QuickSight features such as drill-down filtering, predictive forecasting, and aggregation insights have given us the competitive edge that our customers expect from a modern, cloud-based solution.

New technology enables strategic planning

With QuickSight, we’re able to focus on building customer-facing solutions that capture data rather than spending a large portion of our development time solving data visualization and custom report problems. Our development team no longer has to spend its time creating reports for all the data generated, and our customers don’t need to wait. Partnering with QuickSight has enabled ConexED to develop its business intelligence dashboard, which is designed to create operational efficiencies, identify opportunities, and empower institutions by uniting critical data insights across cross-campus student support services. The QuickSight data used in ConexED’s BI dashboard analyzes collected information in real time, allowing our partners to project trends for the coming school year using predictive analytics to improve staff efficiency, enhance the student experience, and increase rates of retention and graduation.

The following image demonstrates heat mapping, which displays the recurring days and times when student requests for support services are most frequent, with the busiest hour segments appearing more saturated in color. This enables leadership to deploy staff efficiently so that students have the support services they need, when they need them, on their pathway to graduation. ConexED’s BI dashboard powered by QuickSight makes this kind of information available so that our partners can plan strategically.

QuickSight dashboards allow our customers to drill down on the data to glean even more insights of what is happening on their campus. In the following example, the pie chart depicts a whole-campus view of meetings by department, but leadership can choose one of the colored segments to drill down further for more information about a specific department. Whatever the starting point, leadership now has the ability to access more specific, real-time data to understand what’s happening on their campus or any part of it.

Dashboards provide data visualization

Our customers have been extremely impressed with our QuickSight dashboards because they provide data visualizations that make the information easier to comprehend and parse. The dynamic, interactive nature of the dashboards allows ConexED’s partners to go deeper into the data with just a click of the mouse, which immediately generates new data based on what was clicked and therefore new visuals.

With QuickSight, not only can we programmatically display boilerplate dashboards based on role type, but we can also allow our clients to branch off these dashboards and customize the reporting to their liking. The development team is now able to move quickly to build interesting features that ingest data and easily provide insightful visualizations and reports on the gathered data. ConexED’s BI dashboard powered by QuickSight enables leadership at our partner institutions to understand how users engage with support services on their campus – when they meet, why they meet, how they meet – so that they can make informed decisions to improve student engagement and services.

The right people with the right information

In education, giving the right level of data access to the right people is essential. With intuitive row- and column-level security and anonymous tagging in QuickSight, the ConexED development team was able to quickly build visualizations that correctly display partitioned data to thousands of different users with varying levels of access across our client base.

At ConexED, student success is paramount, and with QuickSight powering our BI dashboard, the right people get the right data, and our institutional customers can now easily analyze vast amounts of data to identify trends in student acquisition, retention, and completion rates. They can also solve student support staffing allocation problems and improve the student experience at their institutions.

QuickSight does the heavy lifting

The ability to securely pull and aggregate data from disparate sources with very little setup work has given ConexED a head start in the predictive analytics space in the EdTech market. Now building visualizations is intuitive, insightful, and fun. In fact, the development team built an internal QuickSight dashboard to view our own customers’ QuickSight usage in only 1 day. The data visualization combinations are seemingly endless and infinitely valuable to our customers.

ConexED’s partnership with AWS has enabled us to use QuickSight to drive our BI dashboard and provide our customers with the power and information needed for today’s dynamic modern student support services teams.


About the Author

Michael Gorham is Co-Founder and CTO of ConexED. Michael is a multidisciplinary software architect with over 20 years’ experience.

Visualize live analytics from Amazon QuickSight connected to Amazon OpenSearch Service

Post Syndicated from Lokesh Yellanur original https://aws.amazon.com/blogs/big-data/visualize-live-analytics-from-amazon-quicksight-connected-to-amazon-opensearch-service/

Live analytics refers to the process of preparing and measuring data as soon as it enters the database or persistent store. In other words, you get insights or arrive at conclusions immediately. Live analytics enables businesses to respond to events without delay. You can seize opportunities or prevent problems before they happen. Speed is the main benefit of live analytics. The faster a business can use data for insights, the faster they can act on critical decisions.

Some live analytics use cases include:

  • Analyzing access logs and application logs from servers to identify server performance issues that could lead to application downtime, or to detect unusual activity. For instance, analyzing monitoring data from a manufacturing line can enable early intervention before machinery malfunctions.
  • Targeting individual customers in retail outlets with promotions and incentives while the customers are in the store and close to the merchandise.

We see customers performing real-time analytics using the ELK stack. The ELK stack is an acronym used to describe a stack that comprises three popular open-source projects: Elasticsearch, Logstash, and Kibana. The ELK stack gives you the ability to aggregate logs from all your systems and applications, analyze these logs, and create visualizations for application and infrastructure monitoring, faster troubleshooting, security analytics, and more. In this post, we extend the live analytics visualizations using Amazon QuickSight.

Solution overview

Amazon OpenSearch Service (successor to Amazon Elasticsearch Service) is a fully managed service that makes it easy for you to deploy, secure, and run OpenSearch cost-effectively at scale. You can build, monitor, and troubleshoot your applications using the tools you love at the scale you need. The service provides support for open-source OpenSearch APIs, managed Kibana, integration with Logstash and other AWS services, and built-in alerting and SQL querying. In addition, Amazon OpenSearch Service lets you pay only for what you use—there are no upfront costs or usage requirements. With Amazon OpenSearch Service, you get the ELK stack you need without the operational overhead.

QuickSight is a scalable, serverless, embeddable, machine learning (ML)-powered business intelligence (BI) service built for the cloud. QuickSight lets you easily create and publish interactive BI dashboards that include ML-powered insights. QuickSight dashboards can be accessed from any device and seamlessly embedded into your applications, portals, and websites.

This post helps you visualize the Centralized Logging solution using QuickSight. Centralized logging helps organizations collect, analyze, and display Amazon CloudWatch logs in a single dashboard in QuickSight.

This solution consolidates, manages, and analyzes log files from various sources. You can collect CloudWatch logs from multiple accounts and AWS Regions. Access log information can be beneficial in security and access audits. It can also help you learn about your customer base and understand your Amazon Simple Storage Service (Amazon S3) bill.

The following diagram illustrates the solution architecture.

For more information about the solution, see Centralized Logging.

Prerequisites

Before you implement the solution, complete the prerequisite steps in this section.

Provision your resources

Launch the following AWS CloudFormation template to deploy the Centralized Logging solution:

After you create the stack, you receive an email (to the administrator email address) with your login information, as shown in the following screenshot.

Launch QuickSight in a VPC

Sign up for a QuickSight subscription with the Enterprise license.

QuickSight Enterprise Edition is fully integrated with Amazon Virtual Private Cloud (Amazon VPC). A VPC based on this service closely resembles a traditional network that you operate in your own data center. It enables you to secure and isolate traffic between resources.

Allow QuickSight to access Amazon OpenSearch Service

Make sure QuickSight has access to both the VPC and Amazon OpenSearch Service.

  1. On the QuickSight dashboard, choose the user icon and choose Manage QuickSight.
  2. Choose Security & permissions in the navigation pane.
  3. Choose Add or Remove to update QuickSight access to AWS services.
  4. For Allow access and autodiscovery for these resources, select Amazon OpenSearch Service.

Manage the VPC and security group connections

You need to give permissions on the QuickSight console to connect to Amazon OpenSearch Service. After you enable Amazon OpenSearch Service on the Security & permissions page, you add a VPC connection with the same VPC and subnet as your Amazon OpenSearch Service domain and create a new security group.

You first create a security group for QuickSight. (A scripted version of both security groups' rules is sketched after these steps.)

  1. Add an inbound rule to allow all communication from the Amazon OpenSearch Service domain.
  2. For Type, choose All TCP.
  3. For Source, select Custom, then enter the ID of the security group used by your Amazon OpenSearch Service domain.
  4. Add an outbound rule to allow all traffic to the Amazon OpenSearch Service domain.
  5. For Type, choose Custom TCP Rule.
  6. For Port Range, enter 443.
  7. For Destination, select Custom, then enter the ID of the security group used by your Amazon OpenSearch Service domain.

Next, you create a security group for the Amazon OpenSearch Service domain.

  1. Add an inbound rule that allows all incoming traffic from the QuickSight security group.
  2. For Type, choose Custom TCP.
  3. For Port Range, enter 443.
  4. For Source, select Custom, then enter the QuickSight security group ID.
  5. Add an outbound rule that allows all traffic to the QuickSight security group.
  6. For Type, choose All TCP.
  7. For Destination, select Custom, then enter the QuickSight security group ID.
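
For reference, the same rules can be created with boto3. Both security group IDs below are placeholders; substitute your own:

import boto3

ec2 = boto3.client("ec2")
QS_SG = "sg-quicksight-placeholder"   # QuickSight VPC connection security group
OS_SG = "sg-opensearch-placeholder"   # OpenSearch Service domain security group

all_tcp = {"IpProtocol": "tcp", "FromPort": 0, "ToPort": 65535}
https = {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443}

# QuickSight group: all TCP in from the domain, HTTPS out to the domain
ec2.authorize_security_group_ingress(
    GroupId=QS_SG,
    IpPermissions=[{**all_tcp, "UserIdGroupPairs": [{"GroupId": OS_SG}]}])
ec2.authorize_security_group_egress(
    GroupId=QS_SG,
    IpPermissions=[{**https, "UserIdGroupPairs": [{"GroupId": OS_SG}]}])
# Domain group: HTTPS in from QuickSight, all TCP out to QuickSight
ec2.authorize_security_group_ingress(
    GroupId=OS_SG,
    IpPermissions=[{**https, "UserIdGroupPairs": [{"GroupId": QS_SG}]}])
ec2.authorize_security_group_egress(
    GroupId=OS_SG,
    IpPermissions=[{**all_tcp, "UserIdGroupPairs": [{"GroupId": QS_SG}]}])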

Choose your datasets

To validate the connection and create the data source, complete the following steps:

  1. On the QuickSight console, choose Datasets.
  2. Choose Create dataset.
  3. Choose Amazon OpenSearch Service.
  4. For Data source name, enter a name.
  5. Depending on whether your Amazon OpenSearch Service domain is public or in a VPC, choose your connection type and Amazon OpenSearch Service domain.
  6. Choose Validate connection.
  7. Choose Create data source.

  1. Choose Tables.
  2. Select the table in the data source you created.
  3. Review your settings and choose Visualize.

Visualize the data loaded

QuickSight, with its wide array of visuals available, allows you to create meaningful visuals from Amazon OpenSearch Service data.

When you choose Visualize from the previous steps, you start creating an analysis. QuickSight provides a range of visual types to display data, such as graphs, tables, heat maps, scatter plots, line charts, pie charts, and more. The following steps allow you to add a visual type to display the data from the datasets.

  1. On the Add menu, choose Add visual.
  2. Choose your visual type.
  3. Add fields to the field wells to bring data into the visuals to be displayed.

The following screenshot shows a sample group of visuals.

Automatically refresh your data

You can access and visualize your data through direct queries. Your data is queried live each time a visual is rendered. This gives you live access to your data. Additionally, you can automatically refresh the visuals every 1–60 minutes, so that you don’t have to reload the page to see the most up-to-date information. The following screenshot shows the auto-refresh settings while preparing to publish your dashboard.

For more information about the auto-refresh option, see Using Amazon OpenSearch with Amazon QuickSight.

The following screenshot shows an example visualization.

Clean up

When you’re done using this solution, to avoid incurring future charges, delete the resources you created in this walkthrough, including your S3 buckets, Amazon OpenSearch Service domain, and other associated resources.

Summary

This post demonstrated how to extend your ELK stack with QuickSight in a secure way for analyzing access logs. The application logs help you identify any server performance issues that could lead to application downtime. They can also help detect unusual activity.

As always, AWS welcomes feedback. Please submit comments or questions in the comments section.


About the Authors

Lokesh Yellanur is a Solutions Architect at AWS. He helps customers with data and analytics solutions in AWS.

Joshua Morrison is a Senior Solutions Architect at AWS based in Richmond, Virginia. He spends time working with customers to help with their adoption of modern cloud technology and security best practices. He enjoys being a father and picking up heavy objects.

Suresh Patnam is a Sr Solutions Architect at AWS; He works with customers to build IT strategy, making digital transformation through the cloud more accessible, focusing on big data, data lakes, and AI/ML. In his spare time, Suresh enjoys playing tennis and spending time with his family. Connect him on LinkedIn.

How to set up an Amazon QuickSight dashboard for Amazon Pinpoint and Amazon SES engagement events

Post Syndicated from satyaso original https://aws.amazon.com/blogs/messaging-and-targeting/how-to-set-up-amazon-quicksight-dashboard-for-amazon-pinpoint-and-amazon-ses-events/

In this post, we walk through using Amazon Pinpoint and Amazon QuickSight to create customizable messaging campaign reports. Amazon Pinpoint is a flexible and scalable outbound and inbound marketing communications service that allows customers to connect with users over channels like email, SMS, push, or voice. Amazon QuickSight is a scalable, serverless, embeddable, machine learning-powered business intelligence (BI) service built for the cloud. This solution allows event and user data from Amazon Pinpoint to flow into Amazon QuickSight. Once in QuickSight, customers can build their own reports that show campaign performance at a more granular level.

Engagement Event Dashboard

Customers want to view the results of their messaging campaigns at ever-increasing levels of granularity and ensure their users see value from the email, SMS, or push notifications they receive. Customers also want to analyze how different user segments respond to different messages, and how to optimize subsequent user communication. Previously, customers could only view this data in Amazon Pinpoint analytics, which offers robust reporting on events, funnels, and campaigns. However, it does not allow analysis across these different parameters or the building of custom reports; for example, showing campaign revenue across different user segments, or showing what events were generated after a user viewed a campaign in a funnel analysis. Customers would need to extract this data themselves and do the analysis in Excel.

Prerequisites

  • The Digital User Engagement Events Database solution must be set up first.
  • Customers should be prepared to purchase Amazon QuickSight, which has its own costs that are not covered within the Amazon Pinpoint cost.

Solution Overview

This solution uses the Athena tables created by the Digital User Engagement Events Database solution. The AWS CloudFormation template provided in this post automatically sets up the different architecture components to capture detailed notifications about Amazon Pinpoint engagement events and expose them in Amazon Athena in the form of Athena views. You still need to manually configure Amazon QuickSight dashboards to link to these newly generated Athena views. Follow the steps below for further information.

Use case(s)

The event dashboard solution supports the following use cases:

  • Deep dives into engagement insights (for example, SMS events, email events, campaign events, and journey events).
  • The ability to view engagement events at the individual user level.
  • Data and process mining to turn raw event data into useful marketing insights.
  • User engagement benchmarking and end-user event funneling.
  • Computing campaign conversions (post-campaign user analysis to show campaign effectiveness).
  • Building funnels that show user progression.

Getting started with solution deployment

Prerequisite tasks to be completed before deploying the logging solution

Step 1 – Create an AWS account and Amazon Pinpoint project, and implement the Events Database solution.
As part of this step, customers need to implement the DUE Events Database solution, because the current solution (the DUE event dashboard) is an extension of it. The basic assumption here is that the customer has already configured an Amazon Pinpoint project or Amazon SES within the required AWS Region before implementing this step.

The steps required to implement an event dashboard solution are as follows.

a/ Follow the steps mentioned in the Events Database solution to implement the complete stack. Before installing the complete stack, copy and save the Athena events database name, as shown in the diagram. In my case it is due_eventdb. The database name is required as an input parameter for the current event dashboard solution.

b/ Once the solution is deployed, navigate to the Outputs tab of the CloudFormation stack, then copy and save the following information, which will be required as input parameters in Step 2 of the current event dashboard solution.

Step 2 – Deploy the CloudFormation template for the event dashboard solution
This step generates a number of new Amazon Athena views that will serve as data sources for Amazon QuickSight. Continue with the following actions (a programmatic alternative using boto3 is sketched after the list).

  • Download the CloudFormation template (“Event-dashboard.yaml”) from AWS Samples.
  • Navigate to the CloudFormation console, choose “Create stack” at the upper right, and select the option “With new resources (standard)”.
  • Leave “Prerequisite – Prepare template” set to “Template is ready”, and for the “Specify template” option, select “Upload a template file”. On the same page, choose “Choose file”, browse to the “Event-dashboard.yaml” file, and select it. Once the file is uploaded, choose “Next” and deploy the stack.

  • Enter the following information under the section “Specify stack details”:
    • EventAthenaDatabaseName – The Athena events database name saved in Step 1-a.
    • S3DataLogBucket – The S3 data log bucket saved in Step 1-b.
  • This solution creates five additional Athena views:
    • All_email_events
    • All_SMS_events
    • All_custom_events (custom events can be mobile app, web app, or push events)
    • All_campaign_events
    • All_journey_events
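
If you prefer to deploy the stack programmatically instead of through the console, the following is a minimal boto3 sketch of the same deployment. It is an illustration only: the stack name, Region, and template path are assumptions, and the parameter values are the ones you saved in Step 1.

```python
import boto3

# Minimal sketch: deploy the Event-dashboard.yaml template with boto3 instead
# of the CloudFormation console. Stack name, Region, and file path are
# illustrative assumptions; parameter keys mirror the stack details above.
cfn = boto3.client("cloudformation", region_name="us-west-2")

with open("Event-dashboard.yaml") as f:
    template_body = f.read()

cfn.create_stack(
    StackName="pinpoint-event-dashboard",
    TemplateBody=template_body,
    Parameters=[
        {"ParameterKey": "EventAthenaDatabaseName", "ParameterValue": "due_eventdb"},
        {"ParameterKey": "S3DataLogBucket", "ParameterValue": "<your-data-log-bucket>"},
    ],
)

# Block until the stack (and therefore the Athena views) has been created.
cfn.get_waiter("stack_create_complete").wait(StackName="pinpoint-event-dashboard")
```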

Step 3 – Create the Amazon QuickSight engagement dashboard
This step walks you through the process of creating an Amazon QuickSight dashboard for Amazon Pinpoint engagement events using the Athena views you created in Step 2.

  1. To set up Amazon QuickSight for the first time, follow this link (this step is not needed if you have already set up Amazon QuickSight). Make sure you are an Amazon QuickSight administrator.
  2. Search for Amazon QuickSight in the AWS console and open it.
  3. Choose “New analysis” and then select “New dataset”.
  4. Select Athena as the data source.
  5. Next, select the analyses you need for the respective events. This solution provides the option to create five different sets of analyses, as mentioned in Step 2: a/ all email events, b/ all SMS events, c/ all custom events (mobile app, web app, web push, and so on), d/ all campaign events, and e/ all journey events. Dashboards can be created from a QuickSight analysis and shared with stakeholders across the organization. The following are the steps to create analyses and dashboards for the different event types.
  6. Email events –
    • For all email events, name the analysis “All-emails-events” (any customer-preferred naming works), select the primary Athena workgroup, and then create the data source.
    • Once you create the data source, QuickSight lists all the views and tables available under the specified database (in our case, due_eventdb). Select the email events view as the data source. (To sanity-check a view outside QuickSight, see the Athena query sketch after this list.)
    • Select the event data location for the analysis. There are two main options: a/ Import to SPICE for quicker analytics, and b/ Directly query your data. Select the preferred option and then choose “Visualize”.
    • Import to SPICE for quicker analytics – SPICE is the Amazon QuickSight Super-fast, Parallel, In-memory Calculation Engine. It’s engineered to rapidly perform advanced calculations and serve data. In Enterprise Edition, data stored in SPICE is encrypted at rest. (1 GB of storage is available for free; extra storage incurs additional cost, as covered in the cost section of this post.)
    • Directly query your data – This option enables QuickSight to query the source database directly (in the current case, Athena); QuickSight does not store any data.
    • Now that you have selected a data source, you are taken to a blank QuickSight canvas (the blank analysis page), as shown in the following image. Drag and drop the visualization type you need onto the auto-graph pane. Note that Amazon QuickSight is a business intelligence platform, so customers are free to choose the visualization types they prefer to observe the individual engagement events.
    • In this post, we show how to create some simple analysis graphs to visualize the engagement events.
    • As an initial step, select the tabular visualization, as shown in the image.
    • Select all the event dimensions that you want to include as columns in the table. An Amazon QuickSight table can be extended to show as many columns as needed; how much data to visualize depends entirely on the business requirements of your marketers.
    • Further filtering on the table can be done using QuickSight filters; you can apply a filter on specific granular values. For example, to filter on the destination email ID: 1/ select the filter from the left-hand menu; 2/ add the destination field as the filtering criterion; 3/ tick the destination value you are filtering for, or search for the destination email ID; 4/ all results in the table are then filtered according to the criterion.
    • Next, add another visual from the top left corner (Add -> Add visual), then select the donut chart from the visual types pane. Donut charts are well suited to displaying aggregations.
    • Then select event_type as the group to visualize the aggregated events. This helps marketers and business users see how many email events occurred and the aggregated success, click, complaint, and bounce ratios for the emails and campaigns sent to end-users.
    • To create a QuickSight dashboard from the analysis, choose the Share menu at the top right corner, then select “Publish dashboard”. Provide the required dashboard name while publishing. The same dashboard can be shared with multiple audiences in the organization.
    • The following is the final version of the dashboard. As mentioned above, QuickSight dashboards can be shared with other stakeholders, and the complete dashboard can be exported as an Excel sheet.
  7. SMS events –
    • As shown above, SMS events can be analyzed using QuickSight, and dashboards can be created from the analysis. Repeat all of the sub-steps listed in step 6. The following is a sample SMS dashboard.
  8. Custom events –
    • After you integrate your application (app) with Amazon Pinpoint, Pinpoint can stream event data about user activity, custom events of different types, and message deliveries for the app (for example, _session.start, Product_page_view, and _session.stop). Repeat all of the sub-steps listed in step 6 to create custom event dashboards.
  9. Campaign events –
    • As shown before, campaign events can also be included in the same dashboard, or you can create a new dashboard only for campaign events.
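
Before wiring a view into QuickSight (or to sanity-check it afterward), you can query it directly in Athena. The following is a minimal boto3 sketch, assuming the due_eventdb database and the email events view created in Step 2; the S3 output location is a placeholder for your own Athena query results bucket.

```python
import time
import boto3

# Minimal sketch: aggregate email events by event_type straight from the
# Athena view, mirroring the donut chart built in step 6. The output
# location is a placeholder assumption.
athena = boto3.client("athena", region_name="us-west-2")

query_id = athena.start_query_execution(
    QueryString="""
        SELECT event_type, COUNT(*) AS events
        FROM all_email_events
        GROUP BY event_type
        ORDER BY events DESC
    """,
    QueryExecutionContext={"Database": "due_eventdb"},
    ResultConfiguration={"OutputLocation": "s3://<your-athena-results-bucket>/"},
)["QueryExecutionId"]

# Poll until the query finishes.
while True:
    status = athena.get_query_execution(QueryExecutionId=query_id)
    state = status["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

# Print event_type counts (the first row returned is the header).
if state == "SUCCEEDED":
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    for row in rows[1:]:
        print(row["Data"][0]["VarCharValue"], row["Data"][1]["VarCharValue"])
```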

Cost for Event dashboard solution
You are responsible for the cost of the AWS services used while running this solution. As of the date of publication, the cost of running this solution with default settings in the US West (Oregon) Region is approximately $65 a month. The cost estimate includes the cost of AWS Lambda, Amazon Athena, and Amazon QuickSight. The estimate assumes querying 1 TB of data per month, two authors managing Amazon QuickSight each month, four Amazon QuickSight readers viewing the events dashboard an unlimited number of times per month, and 50 GB of QuickSight SPICE capacity per month. Prices are subject to change. For full details, see the pricing webpage for each AWS service you use in this solution.

Clean up

When you’re done with this exercise, complete the following steps to delete your resources and stop incurring costs:

  1. On the CloudFormation console, select your stack and choose Delete. This cleans up all the resources created by the stack.
  2. Delete the Amazon QuickSight dashboards and datasets that you have created.

Conclusion

In this blog post, I demonstrated how marketers, business users, and business analysts can use Amazon QuickSight dashboards to evaluate and act on user engagement data from Amazon SES and Amazon Pinpoint event streams. Customers can also use this solution to understand how Amazon Pinpoint campaigns lead to business conversions, in addition to analyzing multi-channel communication metrics at the individual user level.

Next steps

The personas for this blog are both the tech team and the marketing analyst team, as it involves a code deployment to create very simple Athena views, as well as the steps to create an Amazon QuickSight dashboard to analyze Amazon SES and Amazon Pinpoint engagement events at the individual user level. Customers may then create their own Amazon QuickSight dashboards to illustrate conversion ratios and propensity trends in real time by integrating campaign events with app-level events such as purchase conversions, order placement, and so on.

Extending the solution

You can download the AWS CloudFormation templates and code for this solution from our public GitHub repository and modify them to fit your needs.


About the Author


Satyasovan Tripathy works at Amazon Web Services as a Senior Specialist Solutions Architect. He is based in Bengaluru, India, and specializes in the AWS Digital User Engagement product portfolio. Outside of work, he enjoys reading and traveling.

Lucerna Health uses Amazon QuickSight embedded analytics to help healthcare customers uncover new insights

Post Syndicated from David Atkins original https://aws.amazon.com/blogs/big-data/lucerna-health-uses-amazon-quicksight-embedded-analytics-to-help-healthcare-customers-uncover-new-insights/

This is a guest post by Lucerna Health. Founded in 2018, Lucerna Health is a data technology company that connects people and data to deliver value-based care (VBC) results and operational transformation.

At Lucerna Health, data is at the heart of our business. Every day, we use clinical, sales, and operational data to help healthcare providers and payers grow and succeed in the value-based care (VBC) environment. Through our HITRUST CSF® certified Healthcare Data Platform, we support payer-provider integration, health engagement, database marketing, and VBC operations.

As our business grew, we found that faster real-time analysis and reporting capabilities through our platform were critical to success. However, that was a challenge for our data analytics team, which was busier than ever developing our proprietary data engine and data model. No matter how many dashboards we built, we knew we could never keep up with user demand with our previous BI solutions. We needed a more scalable technology that could grow as our customer base continued to expand.

In this post, we will outline how Amazon QuickSight helped us overcome these challenges.

Embedding analytics with QuickSight

We had rising demand for business intelligence (BI) from our customers, and we needed a better tool to help us keep pace, one that met our security requirements, was covered by a comprehensive business associate agreement (BAA), and met HIPAA and other privacy standards. We were using several other BI solutions internally for impromptu analysis and reporting, but we realized we needed a fully embedded solution to provide more automation and an integrated experience within our Healthcare Data Platform. After trying out a different solution, we discovered it wasn’t cost-effective for us. That’s when we turned our attention to AWS.

Three years ago, we decided to go all-in on AWS, implementing a range of AWS services for compute, storage, and networking. Today, each of the building blocks in our IT infrastructure runs on AWS. For example, we use Amazon Redshift, AWS Glue, and Amazon EMR for our Spark data pipelines, data lake, and data analytics. Because of our all-in approach, we were pleased to find that AWS had a BI platform called QuickSight. QuickSight is a powerful and cost-effective BI service that offers a strong feature set, including self-service BI capabilities and interactive dashboards, and we liked the idea of staying all-in on AWS by implementing this service.

One of the QuickSight features we were most excited about was its ability to embed analytics deep within our Healthcare Data Platform. With this embedded analytics capability, we were able to integrate QuickSight dashboards directly into our own platform. For example, we offer our customers a portal where they can register a new analytical dashboard through our user interface. That interface connects to the QuickSight application programming interface (API) to enable embedding in a highly configurable and secure way.
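
To give a sense of what that API interaction looks like, the following is a minimal boto3 sketch of generating an embed URL for a registered QuickSight user. The account ID, user ARN, and dashboard ID are placeholders, not values from our platform.

```python
import boto3

# Minimal sketch: request a short-lived embed URL for a registered user.
# All identifiers below are placeholder assumptions.
qs = boto3.client("quicksight", region_name="us-east-1")

response = qs.generate_embed_url_for_registered_user(
    AwsAccountId="123456789012",
    SessionLifetimeInMinutes=60,
    UserArn="arn:aws:quicksight:us-east-1:123456789012:user/default/portal-user",
    ExperienceConfiguration={
        "Dashboard": {"InitialDashboardId": "<dashboard-id>"}
    },
)

# The URL is then loaded into an iframe inside the portal.
print(response["EmbedUrl"])
```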

With this functionality, our customers can ingest and visualize complex healthcare data, such as clinical data from electronic medical record (EMR) systems, eligibility and claims data, and CRM and digital interaction data. Our Insights data model is projected into QuickSight’s high-performance in-memory calculation engine, enabling high-performance analysis on massive datasets.

Creating a developer experience for customers

We have also embedded the QuickSight console into our platform. Through this approach, our healthcare data customers can build their own datasets and quickly share that data with a wider group of users through our platform. This gives our customers a developer experience that enables them to customize and share analytical reports with their colleagues. With only a few clicks, users can aggregate and compare data from their sales and EMR solutions.

QuickSight has also improved collaboration for our own teams when it comes to custom reports. In the past, teams could only do monthly or specialized reports, spending a lot of time building them, downloading them as PDFs, and sending them out to clients as slides. It was a time-consuming and inefficient way to share data. Now, our users can get easy access to data from previously siloed sources, and then simply publish reports and share access to that data immediately.

Helping healthcare providers uncover new insights

Because healthcare providers now have centralized data at their fingertips, they can make faster and more strategic decisions. For instance, management teams can look at dashboards on our platform to see updated demand data to plan more accurate staffing models. We’ve also created patient and provider data models that provide a 360-degree view of patient and payer data, increasing visibility. Additionally, care coordinators can reprioritize tasks and take action if necessary because they can view gaps in care through the dashboards. Armed with this data, care coordinators can work to improve the patient experience at the point of care.

Building and publishing reports twice as fast

QuickSight is a faster BI solution than anything we’ve used before. We can now craft a new dataset, apply permissions to it, build out an analysis, and publish and share it in a report twice as fast as we could before. The solution also gives our developers a better overall experience. For rapid development and deployment at scale, QuickSight performs extremely well at a very competitive price.

Because QuickSight is a serverless solution, we no longer need to worry about our BI overhead. With our previous solution, we had a lot of infrastructure, maintenance, and licensing costs. We have eliminated those challenges by implementing QuickSight. This is a key benefit because we’re an early stage company and our lean product development team can now focus on innovation instead of spinning up servers.

As our platform has become more sophisticated over the past few years, QuickSight has introduced a vast number of great features for data catalog management, security, ML integrations, and look and feel that have really improved on our original solution’s BI capabilities. We look forward to continuing to use this powerful tool to help our customers get more out of their data.


About the Authors

David Atkins is the Co-Founder & Chief Operating Officer at Lucerna Health. Before co-founding Lucerna Health in 2018, David held multiple leadership roles in healthcare organizations, including spending six years at Centene Corporation as the Corporate Vice President of Enterprise Data and Analytic Solutions. Additionally, he served as the Provider Network Management Director at Anthem. When he isn’t spending time with his family, he can be found on the ski slopes or admiring his motorcycle, which he never rides.

Adriana Murillo is the Co-Founder & Chief Marketing Officer at Lucerna Health. Adriana has been involved in the healthcare industry for nearly 20 years. Before co-founding Lucerna Health, she founded Andes Unite, a marketing firm primarily serving healthcare provider organizations and health insurance plans. In addition, Adriana held leadership roles across market segment leadership, product development, and multicultural marketing at not-for-profit health solutions company Florida Blue. Adriana is a passionate cook who loves creating recipes and cooking for her family.

Amazon QuickSight: 2021 in review

Post Syndicated from Jose Kunnackal original https://aws.amazon.com/blogs/big-data/amazon-quicksight-2021-in-review/

With AWS re:Invent just around the corner, we at the Amazon QuickSight team have put together this post to provide you with a handy list of all the key updates this year. We’ve broken this post into three key sections: insights for every user, embedded analytics with QuickSight, and scaling and governance.

Insights for every user

Amazon QuickSight allows every user in the organization to get a better understanding of data – through simple natural language questions and interactive dashboards for end-users, or ML-powered data exploration for business analysts. Developers can add embedded visualizations, dashboards, and Q to their apps to differentiate and enhance user experiences. Let’s take a look at the new experiences and features that you can deploy to your users from our 2021 updates.

Amazon QuickSight Q for true self-service for end-users

Earlier this year, Amazon QuickSight Q became generally available, making machine learning (ML)-powered Q&A available for end-users to simply ask questions of their data, with no training or preparation needed. End-users can go beyond what is presented in the dashboard with Q, avoiding the typical back-and-forth exchanges between the end-user and business intelligence (BI) teams, and the often weeks-long wait associated with adding a new dashboard or visual. It also allows end-users to more intuitively understand data, without having to interpret different visualizations or understand filters and other elements in a traditional BI dashboard.

For example, a sales manager can simply ask the question “What were monthly sales in California this year?” to get a response from Q. To go deeper, they could follow up with “Who were the top five customers by sales in California?” Q presents a visual response to the user, no manual changes or analysis needed. Q can also be embedded into applications, allowing developers to augment and differentiate their application’s experiences. For more information on Q and how to get started, see Amazon QuickSight Q – Business Intelligence Using Natural Language Questions.

Free-form layouts, new chart types, and much more for pixel-perfect, interactive dashboards

Authors of QuickSight dashboards can now use the new free-form layout, which allows precise placement and sizing of dashboard components, overlay of charts and images, and conditional rendering of elements based on parameters. The combination of these features, along with granular customization options now available across charts (such as hiding grid lines, axis labels, and more) allow dashboards in QuickSight to be highly tailored to specific use cases or designs. The following screenshots show examples of customized dashboards using the free-form layout.

Authors can also use new visual types in QuickSight, such as the dual-axis line chart and the Sankey chart, to quickly add new ways of presenting data in dashboards. Sankey charts in particular have been very popular among QuickSight users, allowing visualization of cash flows, process steps, or visitor flows on a website, without having to extensively customize charts or license external plug-ins. We also added the option to add custom web components in dashboards, which allows you to embed images, videos, web pages, or external apps. When combined with the ability to pass parameter values into the custom components, this provides dashboard authors with a very broad set of creative possibilities.

The screenshot below shows an example of a dual-axis line chart (on the left), where high and volume metrics are mapped to two different scales within the same chart.

The screenshot below shows a Sankey chart depicting consumption modes and channels for different energy sources.

The following screenshot shows an example of embedded web content (physical store navigation by different transit modes) within a QuickSight dashboard.

Tables and pivot tables have also received a broad set of updates, allowing authors to customize these extensively to meet organizational design standards, with new features allowing you to do the following:

  • Increase row height
  • Wrap text
  • Vertically align content
  • Customize background color, font color, borders, grid lines, and banding
  • Style and highlight your totals and subtotals
  • Style and hyperlink content to external resources
  • Add images within table cells

The following screenshot shows a customized table visual with links, images, font color, borders, grid lines, banding, text wrap, and custom row height.

The following screenshot shows a pivot table with custom styling for totals and sub-totals.

For deeper analytical exploration of data, we’ve enabled custom sorting of content in visualizations and pivot tables to allow well-defined presentation of content. Custom tooltips allow dashboard authors to add additional context beyond what’s readily available from the visual data on screen. You can now use parameters to dynamically populate titles and subtitles of visuals in dashboards. Time data can be aggregated to seconds, which is helpful for Internet of Things (IoT) and industrial use cases, and filters now allow exclusion of time fields completely to support business-facing use cases where day/month/year are the primary factors.

In filters, we’ve added wildcard search for faster filtering for authors and end-users, multi-line filters that allow multiple values to be easily pasted in for filtering, and an update to the relative date control that lets readers select a custom date range over a chosen relative period, in addition to selecting a time period relative to today.

Consume and collaborate on dashboards

For easier collaboration within an organization, QuickSight now supports 1-click embedding of dashboards in wikis, SharePoint, Google Sites, and more, requiring zero development effort. This makes embedding dashboards as easy as embedding your favorite music video. We’ve also introduced link-based sharing of dashboards, which means that, if desired, you can share a dashboard with all users in your organization without having to enable specific users or groups individually.

Threshold-based alerts in QuickSight allow dashboard readers to be notified when specific thresholds are breached by KPIs in a dashboard. Together with available ML-powered automated anomaly alerts, this allows readers to set up notification mechanisms when there are important expected or unexpected changes in data.

This year, we also launched the ability to share a view of a QuickSight dashboard, which allows readers to generate and provide a unique URL to others that captures the state of their filters. This allows for easy discussions around the shared view of data.

For offline access, readers can now receive PDF snapshots of their data, personalized to their specific roles and use cases. Authors set this up using the new personalized email reports feature, allowing unique emails to be sent to thousands of users at a predefined interval, each showing the end-user’s specific view of the data.

Create a reusable data architecture

Whether in a large organization or in a developer setting, creating and reusing datasets plays a significant role in ensuring that shared interpretations of data across the organization are accurate. To support this, QuickSight introduced dataset as a source, a new feature that allows a QuickSight dataset to be a source for creating another dataset. This creates a data lineage across the datasets. Updates related to calculated fields, data refreshes, row-level security, and column-level security can be configured to automatically propagate to datasets, providing a powerful data management tool. For more information, see Creating a Dataset Using an Existing Dataset in Amazon QuickSight.

As part of the logical information contained in the dataset, you can now create field folders to group fields, add metadata to fields, or include aggregate calculations in your dataset, which allows standardized calculations to be predefined and shared for easy inclusion in dashboards by authors.

Datasets are now also versioned, allowing authors and data owners to quickly switch from one version to another, with no API calls or changes needed.

The screenshot below shows an example of the version/publishing history of a dataset from the data preparation screen.

Lastly, QuickSight continues to add to its existing live analytics options across Amazon Redshift, Snowflake, SQL Server, Oracle, and other data warehouses with the addition of Exasol. This gives authors a range of options for exploring PB-scale datasets directly from the cloud.

Embed insights into apps

Customers such as 3M, Bolt, Blackboard, the NFL, Comcast, and Panasonic Avionics use QuickSight for embedded analytics that serve their customers and partners, saving the months or years of development and ongoing maintenance that would otherwise be needed to create a rich analytics layer in their products. QuickSight also lets customers introduce the latest advancements in BI, such as ML-powered insights and natural language querying, in end-user facing applications.

Getting started with embedded dashboards for proofs of concept in software as a service (SaaS) app integrations now only takes minutes, with our new 1-click embedding option. For deeper app integration with transparent authentication, we support server-side calls to QuickSight for embedding, including a new tag-based row-level security option so you can easily add non-modifiable filters to your dashboard. This means that you can embed a multi-tenant embedded dashboard for hundreds of thousands of users without all the heavy lifting needed to duplicate and manage these users in QuickSight or another BI product.
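
As a minimal sketch of that server-side flow, the following boto3 call generates an embed URL for an anonymous user and pins a session tag that drives tag-based row-level security; the tag is fixed server-side and cannot be modified by the end-user. All identifiers are placeholder assumptions.

```python
import boto3

# Minimal sketch: server-side embedding with a non-modifiable tenant filter
# applied through tag-based row-level security. Identifiers are placeholders.
qs = boto3.client("quicksight", region_name="us-east-1")

dashboard_arn = "arn:aws:quicksight:us-east-1:123456789012:dashboard/<dashboard-id>"

response = qs.generate_embed_url_for_anonymous_user(
    AwsAccountId="123456789012",
    Namespace="default",
    AuthorizedResourceArns=[dashboard_arn],
    SessionTags=[{"Key": "tenant_id", "Value": "customer-42"}],  # fixed server-side
    ExperienceConfiguration={
        "Dashboard": {"InitialDashboardId": "<dashboard-id>"}
    },
)
print(response["EmbedUrl"])
```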

Developers now also have the powerful differentiator of Q as part of QuickSight’s embedded feature set. Q can be embedded into applications, allowing end-users to simply ask questions of data, along with shared context of insights provided through embedded QuickSight dashboards in the app. Some sample embedded dashboards are available on DemoCentral.

For developers and independent software vendors looking to consolidate their interactive dashboards and email reports in QuickSight, we also introduced the ability to customize email reports. This allows customization of the from address, logo, background color, and footer in the email, as shown in the following screenshot.

When combined with the existing functionality of embedding the QuickSight authoring experience, QuickSight now provides developers with a strong suite of embedded analytics capabilities: embedded interactive dashboards, embedded ML-powered Q&A with Q, embedded authoring, and customized email reports.

Scaling and governance

The fully managed, cloud-native architecture of QuickSight has been a delighter for our broad customer base: no servers or nodes to set up, no software updates or patches to manage, and absolutely no infrastructure to think about.

SPICE, the in-memory calculation engine in QuickSight, has been a key pillar of this serverless architecture, allowing data to scale from tens of users to hundreds of thousands without any customer intervention. We have doubled our SPICE data limits to 500 million rows of data per dataset, and now support incremental data refreshes as often as every 15 minutes for SQL-based data sources such as Amazon Redshift, Amazon Athena, PostgreSQL, and Snowflake, which cuts the time between data updates by 75%. Incremental refreshes also update SPICE datasets in a fraction of the time a full refresh would take, enabling access to the most recent insights much sooner.
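
For illustration, a SPICE refresh can also be started on demand through the QuickSight API; the refresh schedule itself is configured on the dataset. The following is a minimal boto3 sketch with placeholder identifiers.

```python
import uuid
import boto3

# Minimal sketch: kick off an on-demand SPICE ingestion for a dataset.
# Account and dataset IDs are placeholder assumptions.
qs = boto3.client("quicksight", region_name="us-east-1")

response = qs.create_ingestion(
    AwsAccountId="123456789012",
    DataSetId="<dataset-id>",
    IngestionId=str(uuid.uuid4()),  # any unique ID for this refresh run
)
print(response["IngestionStatus"])  # for example, INITIALIZED
```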

This year, we introduced multiple simplifications and security mechanisms for creating your QuickSight account. Administrators signing up for QuickSight can pick an existing role in their AWS account instead of having QuickSight create a custom service role for the account. This allows you to set up your own role for a group of codependent AWS services, including QuickSight, that you want to work together.

Admins can now use service control policies (SCPs) to control QuickSight sign-up options within their organization. For example, admins can set up service control policies that deny sign-ups for QuickSight Standard Edition and turn off the ability to invite any users other than those provisioned via federated single sign-on (SSO).

Admins can also set up QuickSight with SSO such that email addresses for end-users are automatically synced at first-time login, avoiding any manual errors during entry, and preventing use of personal email addresses. See Secure and simplify account setup and access management with new Amazon QuickSight administrative controls to learn more.

QuickSight admins can now also enforce source IP restrictions on access to the QuickSight UI, the mobile app, and embedded pages. This allows you to secure your data within QuickSight and keep it accessible only from trusted sources. See Use IP restrictions to control access to Amazon QuickSight to learn more.
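
For illustration, IP restrictions can also be managed through the API. The following minimal boto3 sketch enables a rule map; the CIDR ranges and descriptions are placeholder assumptions.

```python
import boto3

# Minimal sketch: allow QuickSight access only from trusted network ranges.
# CIDRs and descriptions are placeholder assumptions.
qs = boto3.client("quicksight", region_name="us-east-1")

qs.update_ip_restriction(
    AwsAccountId="123456789012",
    IpRestrictionRuleMap={
        "10.0.0.0/16": "Corporate VPN",
        "203.0.113.0/24": "Office network",
    },
    Enabled=True,
)
```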

Lastly, adding to our existing certifications (SOC, PCI, HIPAA, and more), we’re now FedRAMP High compliant in AWS GovCloud (US-West), providing government workloads with the same serverless benefits that our customers have enjoyed.

Conclusion

QuickSight serves millions of dashboard views weekly, enabling data-driven decision-making in organizations of all sizes. Best Western Hotels & Resorts uses QuickSight to improve operations worldwide and provides hotel operators with a real-time look at key metrics that are critical to the business, with over 23,000 users of QuickSight. TrueBlue, a company focused on specialized workforce solutions, including staffing, talent management, and recruitment process outsourcing, uses QuickSight to deliver more accurate pricing and grow its business across over 500 locations. Vyaire Medical, a global company focused on breathing in every stage of life, used QuickSight to scale up production of ventilators by 20 times during the COVID-19 pandemic. Accelo, a leading cloud-based platform for managing client work from prospect to payment for professional services companies, chose QuickSight to provide embedded analytics to its end-users within its web application.

The features we discussed in this post provide a key summary of the changes over this year that have helped accelerate these and other customers adopt QuickSight.

At re:Invent 2021, you will hear from the NFL—the world’s biggest sports league—about how QuickSight powers their Next Gen Stats portal and provides NFL clubs, broadcasters, and researchers with real-time and historical stats. You’ll also learn how Q will revolutionize how data is consumed.

On the embedded analytics front, we will have 3M, a pioneer in global healthcare, and Bolt, which is redefining the online checkout space for millions of users, speak about how QuickSight powers analytics for their end-users and lets them scale to all of their users without any infrastructure overheads.

We also have Accenture and Amazon’s own finance team speaking about how QuickSight allows them to move away from legacy BI to a cloud-native future, while providing the governance and compliance needs typical in the finance world.

This year, you can simply register for re:Invent online and view these sessions from the comfort of your chair. We look forward to connecting with you at re:Invent, whether in-person at our booth and sessions or virtually, and as always look forward to your feedback.


About the Author

Jose Kunnackal is a principal product manager for Amazon QuickSight, AWS’s cloud-native, fully managed BI service. Jose started his career with Motorola, writing software for telecom and first-responder systems. Later, he was Director of Engineering at Trilibis Mobile, where he built a SaaS mobile web platform using AWS services. Jose is excited by the potential of cloud technologies and looks forward to helping customers with their transition to the cloud.

Sahitya Pandiri is a technical program manager with Amazon Web Services.

Iterate confidently on Amazon QuickSight datasets with new Dataset Versions capability

Post Syndicated from Shailesh Chauhan original https://aws.amazon.com/blogs/big-data/iterate-confidently-on-amazon-quicksight-datasets-with-new-dataset-versions-capability/

Amazon QuickSight allows data owners and authors to create and model their data in QuickSight using datasets, which contain logical and semantic information about the data. Datasets can be created from one or more data sources, shared across the organization with strong controls around data access (object-, row-, and column-level security) and metadata included, and created or modified programmatically. QuickSight now supports dataset versioning, which allows dataset owners to see how a dataset has progressed, preview a version, or revert to a stable working version in case something goes wrong. Dataset Versions gives you the confidence to experiment with your content, knowing that your older versions are available and you can easily revert to them if needed. For more details, see Dataset Versions.
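
Since datasets can be created and modified programmatically, a quick way to see what already exists in an account is the QuickSight API. The following is a minimal boto3 sketch with a placeholder account ID.

```python
import boto3

# Minimal sketch: list the datasets in an account with their basic metadata.
# The account ID is a placeholder assumption.
qs = boto3.client("quicksight", region_name="us-east-1")

paginator = qs.get_paginator("list_data_sets")
for page in paginator.paginate(AwsAccountId="123456789012"):
    for ds in page["DataSetSummaries"]:
        # ImportMode is SPICE or DIRECT_QUERY
        print(ds["DataSetId"], ds["Name"], ds.get("ImportMode"))
```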

In this post, we look at a use case of an author editing a dataset and how QuickSight makes it easy to iterate on your dataset definitions.

What is Dataset Versions?

Previously, changes made to a dataset weren’t tracked. Dataset authors would often make a change that broke the dependent dashboards, and they were often worried about the changes made to dataset definitions. Dataset authors spent time figuring out how to fix the dataset, which could take significant effort.

With Dataset Versions, each publish event associated with the dataset is tracked. Dataset authors can review previous versions of the dataset and see how it has progressed. Each time someone publishes a dataset, QuickSight creates a new version, which becomes the active version; the previously active version becomes the most recent entry in the version list. With Dataset Versions, authors can restore a previous version if they encounter any issue with the current version.

To help you understand versions better, let’s take the following scenario. Imagine you have a dataset and have iterated on it by making changes over time. You have multiple dashboards based on this dataset. You just added a new table called regions to this dataset. QuickSight saves a new version, and the dashboards that depend on the dataset break due to the addition of this table. You realize that you added the wrong table; you were supposed to add the stateandcity table instead. Let’s see how the Dataset Versions feature comes to your rescue.

Access versions

To access your dataset versions, choose the Manage menu, then Publishing History, on the data prep page of the dataset.

A panel opens on the right where you can see all the versions. In the following screenshot, the current active version of the dataset is version 38, published on November 10, 2021. This is the version that is breaking your dependent dashboards.

See publishing history

As you make changes to the dataset and publish them, QuickSight creates a timeline of all the publish events, each tracked as a tile in the publishing history. You can choose a tile to preview a particular version and see the dataset definition at that point in time. You know that the dataset was working fine on October 18, 2021 (the previous version), so you choose Preview to verify the dataset definition.

Revert back

After you confirm the dataset definition, choose Revert to go back to the previous stable version (published on October 18, 2021). QuickSight asks you to confirm, and you choose Publish. The dataset reverts to the old working definition, and the dependent dashboards are fixed.

Start a new version

Alternatively, as you’re previewing the previously published good version (version 37, published October 18, 2021), you can start fresh from that version. The previous version had just the retail_sales_new table, and you can add the correct table, stateandcity, to the dataset definition. When you choose Publish, a new version (version 39) is created, and all the dashboards use this new working version, thereby fixing them.

Conclusion

This post showed how the new Dataset Versions feature in QuickSight helps you easily iterate on your datasets, showing you how a dataset has progressed over time and allowing you to revert to a specific version. Dataset Versions gives you the freedom to experiment with your content, knowing that your older versions are available and you can revert to them if required. Dataset Versions is now generally available in QuickSight Standard and Enterprise Editions in all QuickSight Regions. For further details, see Dataset Versions.


About the Authors

Shailesh Chauhan is a product manager for Amazon QuickSight, AWS’s cloud-native, fully managed SaaS BI service. Before QuickSight, Shailesh was the global product lead at Uber for all data applications, built from the ground up. Earlier, he was a founding team member at ThoughtSpot, where he created the world’s first analytics search engine. Shailesh is passionate about building meaningful and impactful products from scratch. He looks forward to helping customers while working with people with great minds and big hearts.

Mayank Jain is a Software Development Manager at Amazon QuickSight. He leads the data preparation team, which delivers an enterprise-ready platform to transform, define, and organize data. Before QuickSight, he was a Senior Software Engineer at Microsoft Bing, where he developed core search experiences. Mayank is passionate about solving complex problems with simple user experiences that empower customers to be more productive.