Tag Archives: Kubernetes

UI Testing at Scale with AWS Lambda

Post Syndicated from Stas Neyman original https://aws.amazon.com/blogs/devops/ui-testing-at-scale-with-aws-lambda/

This is a guest blog post by Wes Couch and Kurt Waechter from the Blackboard Internal Product Development team about their experience using AWS Lambda.

One year ago, one of our UI test suites took hours to run. Last month, it took 16 minutes. Today, it takes 39 seconds. Here’s how we did it.

The backstory:

Blackboard is a global leader in delivering robust and innovative education software and services to clients in higher education, government, K12, and corporate training. We have a large product development team working across the globe in at least 10 different time zones, with an internal tools team providing support for quality and workflows. We have been using Selenium Webdriver to perform automated cross-browser UI testing since 2007. Because we are now practicing continuous delivery, the automated UI testing challenge has grown due to the faster release schedule. On top of that, every commit made to each branch triggers an execution of our automated UI test suite. If you have ever implemented an automated UI testing infrastructure, you know that it can be very challenging to scale and maintain. Although there are services that are useful for testing different browser/OS combinations, they don’t meet our scale needs.

It used to take three hours to synchronously run our functional UI suite, which revealed the obvious need for parallel execution. Previously, we used Mesos to orchestrate a Selenium Grid Docker container for each test run. This way, we were able to run eight concurrent threads for test execution, which took an average of 16 minutes. Although this setup is fine for a single workflow, the cracks started to show when we reached the scale required for Blackboard’s mature product lines. Going beyond eight concurrent sessions on a single container introduced performance problems that impact the reliability of tests (for example, issues in Webdriver or the browser popping up frequently). We tried Mesos and considered Kubernetes for Selenium Grid orchestration, but the answer to scaling a Selenium Grid was to think smaller, not larger. This led to our breakthrough with AWS Lambda.

The solution:

We started using AWS Lambda for UI testing because it doesn’t require costly infrastructure or countless man hours to maintain. The steps we outline in this blog post took one work day, from inception to implementation. By simply packaging the UI test suite into a Lambda function, we can execute these tests in parallel on a massive scale. We use a custom JUnit test runner that invokes the Lambda function with a request to run each test from the suite. The runner then aggregates the results returned from each Lambda test execution.

Selenium is the industry standard for testing UI at scale. Although there are other options to achieve the same thing in Lambda, we chose this mature suite of tools. Selenium is backed by Google, Firefox, and others to help the industry drive their browsers with code. This makes Lambda and Selenium a compelling stack for achieving UI testing at scale.

Making Chrome Run in Lambda

Currently, Chrome for Linux will not run in Lambda due to an absent mount point. By rebuilding Chrome with a slight modification, as Marco Lüthy originally demonstrated, you can run it inside Lambda anyway! It took about two hours to build the current master branch of Chromium to build on a c4.4xlarge. Unfortunately, the current version of ChromeDriver, 2.33, does not support any version of Chrome above 62, so we’ll be using Marco’s modified version of version 60 for the near future.

Required System Libraries

The Lambda runtime environment comes with a subset of common shared libraries. This means we need to include some extra libraries to get Chrome and ChromeDriver to work. Anything that exists in the java resources folder during compile time is included in the base directory of the compiled jar file. When this jar file is deployed to Lambda, it is placed in the /var/task/ directory. This allows us to simply place the libraries in the java resources folder under a folder named lib/ so they are right where they need to be when the Lambda function is invoked.

To get these libraries, create an EC2 instance and choose the Amazon Linux AMI.

Next, use ssh to connect to the server. After you connect to the new instance, search for the libraries to find their locations.

sudo find / -name libgconf-2.so.4
sudo find / -name libORBit-2.so.0

Now that you have the locations of the libraries, copy these files from the EC2 instance and place them in the java resources folder under lib/.

Packaging the Tests

To deploy the test suite to Lambda, we used a simple Gradle tool called ShadowJar, which is similar to the Maven Shade Plugin. It packages the libraries and dependencies inside the jar that is built. Usually test dependencies and sources aren’t included in a jar, but for this instance we want to include them. To include the test dependencies, add this section to the build.gradle file.

shadowJar {
   from sourceSets.test.output
   configurations = [project.configurations.testRuntime]
}

Deploying the Test Suite

Now that our tests are packaged with the dependencies in a jar, we need to get them into a running Lambda function. We use  simple SAM  templates to upload the packaged jar into S3, and then deploy it to Lambda with our settings.

{
   "AWSTemplateFormatVersion": "2010-09-09",
   "Transform": "AWS::Serverless-2016-10-31",
   "Resources": {
       "LambdaTestHandler": {
           "Type": "AWS::Serverless::Function",
           "Properties": {
               "CodeUri": "./build/libs/your-test-jar-all.jar",
               "Runtime": "java8",
               "Handler": "com.example.LambdaTestHandler::handleRequest",
               "Role": "<YourLambdaRoleArn>",
               "Timeout": 300,
               "MemorySize": 1536
           }
       }
   }
}

We use the maximum timeout available to ensure our tests have plenty of time to run. We also use the maximum memory size because this ensures our Lambda function can support Chrome and other resources required to run a UI test.

Specifying the handler is important because this class executes the desired test. The test handler should be able to receive a test class and method. With this information it will then execute the test and respond with the results.

public LambdaTestResult handleRequest(TestRequest testRequest, Context context) {
   LoggerContainer.LOGGER = new Logger(context.getLogger());
  
   BlockJUnit4ClassRunner runner = getRunnerForSingleTest(testRequest);
  
   Result result = new JUnitCore().run(runner);

   return new LambdaTestResult(result);
}

Creating a Lambda-Compatible ChromeDriver

We provide developers with an easily accessible ChromeDriver for local test writing and debugging. When we are running tests on AWS, we have configured ChromeDriver to run them in Lambda.

To configure ChromeDriver, we first need to tell ChromeDriver where to find the Chrome binary. Because we know that ChromeDriver is going to be unzipped into the root task directory, we should point the ChromeDriver configuration at that location.

The settings for getting ChromeDriver running are mostly related to Chrome, which must have its working directories pointed at the tmp/ folder.

Start with the default DesiredCapabilities for ChromeDriver, and then add the following settings to enable your ChromeDriver to start in Lambda.

public ChromeDriver createLambdaChromeDriver() {
   ChromeOptions options = new ChromeOptions();

   // Set the location of the chrome binary from the resources folder
   options.setBinary("/var/task/chrome");

   // Include these settings to allow Chrome to run in Lambda
   options.addArguments("--disable-gpu");
   options.addArguments("--headless");
   options.addArguments("--window-size=1366,768");
   options.addArguments("--single-process");
   options.addArguments("--no-sandbox");
   options.addArguments("--user-data-dir=/tmp/user-data");
   options.addArguments("--data-path=/tmp/data-path");
   options.addArguments("--homedir=/tmp");
   options.addArguments("--disk-cache-dir=/tmp/cache-dir");
  
   DesiredCapabilities desiredCapabilities = DesiredCapabilities.chrome();
   desiredCapabilities.setCapability(ChromeOptions.CAPABILITY, options);
  
   return new ChromeDriver(desiredCapabilities);
}

Executing Tests in Parallel

You can approach parallel test execution in Lambda in many different ways. Your approach depends on the structure and design of your test suite. For our solution, we implemented a custom test runner that uses reflection and JUnit libraries to create a list of test cases we want run. When we have the list, we create a TestRequest object to pass into the Lambda function that we have deployed. In this TestRequest, we place the class name, test method, and the test run identifier. When the Lambda function receives this TestRequest, our LambdaTestHandler generates and runs the JUnit test. After the test is complete, the test result is sent to the test runner. The test runner compiles a result after all of the tests are complete. By executing the same Lambda function multiple times with different test requests, we can effectively run the entire test suite in parallel.

To get screenshots and other test data, we pipe those files during test execution to an S3 bucket under the test run identifier prefix. When the tests are complete, we link the files to each test execution in the report generated from the test run. This lets us easily investigate test executions.

Pro Tip: Dynamically Loading Binaries

AWS Lambda has a limit of 250 MB of uncompressed space for packaged Lambda functions. Because we have libraries and other dependencies to our test suite, we hit this limit when we tried to upload a function that contained Chrome and ChromeDriver (~140 MB). This test suite was not originally intended to be used with Lambda. Otherwise, we would have scrutinized some of the included libraries. To get around this limit, we used the Lambda functions temporary directory, which allows up to 500 MB of space at runtime. Downloading these binaries at runtime moves some of that space requirement into the temporary directory. This allows more room for libraries and dependencies. You can do this by grabbing Chrome and ChromeDriver from an S3 bucket and marking them as executable using built-in Java libraries. If you take this route, be sure to point to the new location for these executables in order to create a ChromeDriver.

private static void downloadS3ObjectToExecutableFile(String key) throws IOException {
   File file = new File("/tmp/" + key);

   GetObjectRequest request = new GetObjectRequest("s3-bucket-name", key);

   FileUtils.copyInputStreamToFile(s3client.getObject(request).getObjectContent(), file);
   file.setExecutable(true);
}

Lambda-Selenium Project Source

We have compiled an open source example that you can grab from the Blackboard Github repository. Grab the code and try it out!

https://blackboard.github.io/lambda-selenium/

Conclusion

One year ago, one of our UI test suites took hours to run. Last month, it took 16 minutes. Today, it takes 39 seconds. Thanks to AWS Lambda, we can reduce our build times and perform automated UI testing at scale!

timeShift(GrafanaBuzz, 1w) Issue 22

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2017/11/17/timeshiftgrafanabuzz-1w-issue-22/

Welome to TimeShift

We hope you liked our recent article with videos and slides from the events we’ve participated in recently. With Thanksgiving right around the corner, we’re getting a breather from work-related travel, but only a short one. We have some events in the coming weeks, and of course are busy filling in the details for GrafanaCon EU.

This week we have a lot of articles, videos and presentations to share, as well as some important plugin updates. Enjoy!


Latest Release

Grafana 4.6.2 is now available and includes some bug fixes:

  • Prometheus: Fixes bug with new Prometheus alerts in Grafana. Make sure to download this version if your using Prometheus for alerting. More details in the issue. #9777
  • Color picker: Bug after using textbox input field to change/paste color string #9769
  • Cloudwatch: build using golang 1.9.2 #9667, thanks @mtanda
  • Heatmap: Fixed tooltip for “time series buckets” mode #9332
  • InfluxDB: Fixed query editor issue when using > or < operators in WHERE clause #9871

Download Grafana 4.6.2 Now


From the Blogosphere

Cloud Tech 10 – 13th November 2017 – Grafana, Linux FUSE Adapter, Azure Stack and more!: Mark Whitby is a Cloud Solution Architect at Microsoft UK. Each week he prodcues a video reviewing new developments with Microsoft Azure. This week Mark covers the new Azure Monitoring Plugin we recently announced. He also shows you how to get up and running with Grafana quickly using the Azure Marketplace.

Using Prometheus and Grafana to Monitor WebLogic Server on Kubernetes: Oracle published an article on monitoring WebLogic server on Kubernetes. To do this, you’ll use the WebLogic Monitoring Exporter to scrape the server metrics and feed them to Prometheus, then visualize the data in Grafana. Marina goes into a lot of detail and provides sample files and configs to help you get going.

Getting Started with Prometheus: Will Robinson has started a new series on monitoring with Prometheus from someone who has never touched it before. Part 1 introduces a number of monitoring tools and concepts, and helps define a number of monitoring terms. Part 2 teaches you how to spin up Prometheus in a Docker container, and takes a look at writing queries. Looking forward to the third post, when he dives into the visualization aspect.

Monitoring with Prometheus: Alexander Schwartz has made the slides from his most recent presentation from the Continuous Lifcycle Conference in Germany available. In his talk, he discussed getting started with Prometheus, how it differs from other monitoring concepts, and provides examples of how to monitor and alert. We’ll link to the video of the talk when it’s available.

Using Grafana with SiriDB: Jeroen van der Heijden has written an in-depth tutorial to help you visualize data from the open source TSDB, SiriDB in Grafana. This tutorial will get you familiar with setting up SiriDB and provides a sample dashboard to help you get started.

Real-Time Monitoring with Grafana, StatsD and InfluxDB – Artur Caliendo Prado: This is a video from a talk at The Conf, held in Brazil. Artur’s presentation focuses on the experiences they had building a monitoring stack at Youse, how their monitoring became more complex as they scaled, and the platform they built to make sense of their data.

Using Grafana & Inlfuxdb to view XIV Host Performance Metrics – Part 4 Array Stats: This is the fourth part in a series of posts about host performance metrics. This post dives in to array stats to identify workloads and maintain balance across ports. Check out part 1, part 2 and part 3.


GrafanaCon Tickets are Going Fast

Tickets are going fast for GrafanaCon EU, but we still have a seat reserved for you. Join us March 1-2, 2018 in Amsterdam for 2 days of talks centered around Grafana and the surrounding monitoring ecosystem including Graphite, Prometheus, InfluxData, Elasticsearch, Kubernetes, and more.

Get Your Ticket Now


Grafana Plugins

Plugin authors are often adding new features and fixing bugs, which will make your plugin perform better – so it’s important to keep your plugins up to date. We’ve made updating easy; for on-prem Grafana, use the Grafana-cli tool, or update with 1 click if you’re using Hosted Grafana.

UPDATED PLUGIN

Hawkular data source – There is an important change in this release – as this datasource is now able to fetch not only Hawkular Metrics but also Hawkular Alerts, the server URL in the datasource configuration must be updated: http://myserver:123/hawkular/metrics must be changed to http://myserver:123/hawkular

Some of the changes (see the release notes) for more details):

  • Allow per-query tenant configuration
  • Annotations can now be configured out of Availability metrics and Hawkular Alerts events in addition to string metrics
  • allows dot character in tag names

Update

UPDATED PLUGIN

Diagram Panel – This is the first release in a while for the popular Diagram Panel plugin.

In addition to these changes, there are also a number of bug fixes:

Update

UPDATED PLUGIN

Influx Admin Panel – received a number of improvements:

  • Fix issue always showing query results
  • When there is only one row, swap rows/cols (ie: SHOW DIAGNOSTICS)
  • Improved auto-refresh behavior
  • Fix query time sorting
  • show ‘status’ field (killed, etc)

Update


Upcoming Events:

In between code pushes we like to speak at, sponsor and attend all kinds of conferences and meetups. We have some awesome talks and events coming soon. Hope to see you at one of these!

How to Use Open Source Projects for Performance Monitoring | Webinar
Nov. 29, 1pm EST
:
Check out how you can use popular open source projects, for performance monitoring of your Infrastructure, Application, and Cloud faster, easier, and to scale. In this webinar, Daniel Lee from Grafana Labs, and Chris Churilo from InfluxData, will provide you with step by step instruction from download & configure, to collecting metrics and building dashboards and alerts.

RSVP

KubeCon | Austin, TX – Dec. 6-8, 2017: We’re sponsoring KubeCon 2017! This is the must-attend conference for cloud native computing professionals. KubeCon + CloudNativeCon brings together leading contributors in:

  • Cloud native applications and computing
  • Containers
  • Microservices
  • Central orchestration processing
  • And more

Buy Tickets

FOSDEM | Brussels, Belgium – Feb 3-4, 2018: FOSDEM is a free developer conference where thousands of developers of free and open source software gather to share ideas and technology. Carl Bergquist is managing the Cloud and Monitoring Devroom, and the CFP is now open. There is no need to register; all are welcome. If you’re interested in speaking at FOSDEM, submit your talk now!


Tweet of the Week

We scour Twitter each week to find an interesting/beautiful dashboard and show it off! #monitoringLove

We were glad to be a part of InfluxDays this year, and looking forward to seeing the InfluxData team in NYC in February.


Grafana Labs is Hiring!

We are passionate about open source software and thrive on tackling complex challenges to build the future. We ship code from every corner of the globe and love working with the community. If this sounds exciting, you’re in luck – WE’RE HIRING!

Check out our Open Positions


How are we doing?

I enjoy writing these weekly roudups, but am curious how I can improve them. Submit a comment on this article below, or post something at our community forum. Help us make these weekly roundups better!

Follow us on Twitter, like us on Facebook, and join the Grafana Labs community.

Staying Busy Between Code Pushes

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2017/11/16/staying-busy-between-code-pushes/

Staying Busy Between Code Pushes.

Maintaining a regular cadence of pushing out releases, adding new features, implementing bug fixes and staying on top of support requests is important for any software to thrive; but especially important for open source software due to its rapid pace. It’s easy to lose yourself in code and forget that events are happening all the time – in every corner of the world, where we can learn, share knowledge, and meet like-minded individuals to build better software, together. There are so many amazing events we’d like to participate in, but there simply isn’t enough time (or budget) to fit them all in. Here’s what we’ve been up to recently; between code pushes.

Recent Events

Øredev Conference | Malmö, Sweden: Øredev is one of the biggest developer conferences in Scandinavia, and Grafana Labs jumped at the chance to be a part of it. In early November, Grafana Labs Principal Developer, Carl Bergquist, gave a great talk on “Monitoring for Everyone”, which discussed the concepts of monitoring and why everyone should care, different ways to monitor your systems, extending your monitoring to containers and microservices, and finally what to monitor and alert on. Watch the video of his talk below.

InfluxDays | San Francisco, CA: Dan Cech, our Director of Platform Services, spoke at InfluxDays in San Francisco on Nov 14, and Grafana Labs sponsored the event. InfluxDB is a popular data source for Grafana, so we wanted to connect to the InfluxDB community and show them how to get the most out of their data. Dan discussed building dashboards, choosing the best panels for your data, setting up alerting in Grafana and a few sneak peeks of the upcoming Grafana 5.0. The video of his talk is forthcoming, but Dan has made his presentation available.

PromCon | Munich, Germany: PromCon is the Prometheus-focused event of the year. In August, Carl Bergquist, had the opportunity to speak at PromCon and take a deep dive into Grafana and Prometheus. Many attendees at PromCon were already familiar with Grafana, since it’s the default dashboard tool for Prometheus, but Carl had a trove of tricks and optimizations to share. He also went over some major changes and what we’re currently working on.

CNCF Meetup | New York, NY: Grafana Co-founder and CEO, Raj Dutt, particpated in a panel discussion with the folks of Packet and the Cloud Native Computing Foundation. The discussion focused on the success stories, failures, rationales and in-the-trenches challenges when running cloud native in private or non “public cloud” datacenters (bare metal, colocation, private clouds, special hardware or networking setups, compliance and security-focused deployments).

Percona Live | Dublin: Daniel Lee traveled to Dublin, Ireland this fall to present at the database conference Percona Live. There he showed the new native MySQL support, along with a number of upcoming features in Grafana 5.0. His presentation is available to download.

Big Monitoring Meetup | St. Petersburg, Russian Federation: Alexander Zobnin, our developer located in Russia, is the primary maintainer of our popular Zabbix plugin. He attended the Big Monitoring Meetup to discuss monitoring, Grafana dashboards and democratizing metrics.

Why observability matters – now and in the future | Webinar: Our own Carl Bergquist and Neil Gehani, Director of Product at Weaveworks, to discover best practices on how to get started with monitoring both your application and infrastructure. Start capturing metrics that matter, aggregate and visualize them in a useful way that allows for identifying bottlenecks and proactively preventing incidents. View Carl’s presentation.

Upcoming Events

We’re going to maintain this momentum with a number of upcoming events, and hope you can join us.

KubeCon | Austin, TX – Dec. 6-8, 2017: We’re sponsoring KubeCon 2017! This is the must-attend conference for cloud native computing professionals. KubeCon + CloudNativeCon brings together leading contributors in:

  • Cloud native applications and computing
  • Containers
  • Microservices
  • Central orchestration processing
  • And more.

Buy Tickets

How to Use Open Source Projects for Performance Monitoring | Webinar
Nov. 29, 1pm EST:
Check out how you can use popular open source projects, for performance monitoring of your Infrastructure, Application, and Cloud faster, easier, and to scale. In this webinar, Daniel Lee from Grafana Labs, and Chris Churilo from InfluxData, will provide you with step by step instruction from download & configure, to collecting metrics and building dashboards and alerts.

RSVP

FOSDEM | Brussels, Belgium – Feb 3-4, 2018: FOSDEM is a free developer conference where thousands of developers of free and open source software gather to share ideas and technology. Carl Bergquist is managing the Cloud and Monitoring Devroom, and the CFP is now open. There is no need to register; all are welcome. If you’re interested in speaking at FOSDEM, submit your talk now!

GrafanaCon EU

Last, but certainly not least, the next GrafanaCon is right around the corner. GrafanaCon EU (to be held in Amsterdam, Netherlands, March 1-2. 2018),is a two-day event with talks centered around Grafana and the surrounding ecosystem. In addition to the latest features and functionality of Grafana, you can expect to see and hear from members of the monitoring community like Graphite, Prometheus, InfluxData, Elasticsearch Kubernetes, and more. Head to grafanacon.org to see the latest speakers confirmed. We have speakers from Automattic, Bloomberg, CERN, Fastly, Tinder and more!

Conclusion

The Grafana Labs team is spread across the globe. Having a “post-geographic” company structure give us the opportunity to take part in events wherever they may be held in the world. As our team continues to grow, we hope to take part in even more events, and hope you can find the time to join us.

timeShift(GrafanaBuzz, 1w) Issue 21

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2017/11/10/timeshiftgrafanabuzz-1w-issue-21/

This week the Stockholm team was in Malmö, Sweden for Øredev – one of the biggest developer conferences in Scandinavia, while the rest of Grafana Labs had to live vicariously through Twitter posts. We also announced a collaboration with Microsoft’s Azure team to create an official Azure data source plugin for Grafana. We’ve also announced the next block of speakers at GrafanaCon. Awesome week!


Photos from Oredev


Latest Release

Grafana 4.6.1 adds some bug fixes:

  • Singlestat: Lost thresholds when using save dashboard as #96816
  • Graph: Fix for series override color picker #97151
  • Go: build using golang 1.9.2 #97134
  • Plugins: Fixed problem with loading plugin js files behind auth proxy #95092
  • Graphite: Annotation tooltip should render empty string when undefined #9707

Download Grafana 4.6.1 Now


From the Blogosphere

Grafana Launches Microsoft Azure Data Source: In this article, Grafana Labs co-founder and CEO Raj, Dutt talks about the new Azure data source for Grafana, the collaboration between teams, and how much he admires Microsoft’s embrace of open source software.

Monitor Azure Services and Applications Using Grafana: Continuing the theme of Microsoft Azure, the Azure team published an article about the collaboration and resulting plugin. Ashwin discusses what prompted the project and shares some links to dive in deeper into how to get up and running.

Monitoring for Everyone: It only took 1 day for the organizers of Oredev Conference to start publishing videos of the talks. Bravo! Carl Bergquist’s talk is a great overview of the whys, what’s, and how’s of monitoring.

Eight years of Go: This article is in honor of Go celebrating 8 years, and discusses the growth and popularity of the language. We are thrilled to be in such good company in the “Go’s impact in open source” section. Congrats, and we wish you many more years of success!

A DIY Dashboard with Grafana: Christoph wanted to experiment with how to feed time series from his own code into a Grafana dashboard. He wrote a proof of concept called grada to connect any Go code to a Grafana dashboard panel.

Visualize Time-Series Data with Open Source Grafana and InfluxDB: Our own Carl Bergquist co-authored an article with Gunnar Aasen from InfluxData on using Grafana with InfluxDB. This is a follow up to a webinar the two participated in earlier in the year.


GrafanaCon EU

Planning for GrafanaCon EU is rolling right along, and we’re excited to announce a new block of speakers! We’ll continue to confirm speakers regularly, so keep an eye on grafanacon.org. Here are the latest additions:

Stig Sorensen
HEAD OF TELEMETRY
BLOOMBERG

Sean Hanson
SOFTWARE DEVELOPER
BLOOMBERG

Utkarsh Bhatnagar
SR. SOFTWARE ENGINEER
TINDER

Borja Garrido
PROJECT ASSOCIATE
CERN

Abhishek Gahlot
SOFTWARE ENGINEER
Automattic

Anna MacLachlan
CONTENT MARKETING MANAGER
Fastly

Gerlando Piro
FRONT END DEVELOPER
Fastly

GrafanaCon Tickets are Available!

Now that you’re getting a glimpse of who will be speaking, lock in your seat for GrafanaCon EU today! Join us March 1-2, 2018 in Amsterdam for 2 days of talks centered around Grafana and the surrounding monitoring ecosystem including Graphite, Prometheus, InfluxData, Elasticsearch, Kubernetes, and more.

Get Your Ticket Now


Upcoming Events:

In between code pushes we like to speak at, sponsor and attend all kinds of conferences and meetups. We have some awesome talks lined up this November. Hope to see you at one of these events!


Tweet of the Week

We scour Twitter each week to find an interesting/beautiful dashboard and show it off! #monitoringLove

Pretty awesome to have the co-founder of Kubernetes tweet about Grafana!


Grafana Labs is Hiring!

We are passionate about open source software and thrive on tackling complex challenges to build the future. We ship code from every corner of the globe and love working with the community. If this sounds exciting, you’re in luck – WE’RE HIRING!

Check out our Open Positions


How are we doing?

Well, that wraps up another week! How we’re doing? Submit a comment on this article below, or post something at our community forum. Help us make these weekly roundups better!

Follow us on Twitter, like us on Facebook, and join the Grafana Labs community.

timeShift(GrafanaBuzz, 1w) Issue 20

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2017/11/03/timeshiftgrafanabuzz-1w-issue-20/

This week, in addition to rolling out a Grafana 4.6.1 release, we’ve been busy prepping for upcoming events. In Europe, we’ll be speaking at and sponsoring the sold-out Øredev Conference in Malmö, Sweden, Nov 7-11, and on the west coast, we’ll be speaking at and sponsoring InfluxDays, Nov 14 in San Francisco, CA. We hope to get a chance to say hi to you at one of these events.

We also closed the CFP window this week for talks at GrafanaCon EU. We received a tremendous number of great submissions, and will spend the next few weeks making our selections. As speakers are confirmed, we’ll add them to the website, so be sure to keep an eye out. We’re thrilled that the community is so excited to share their knowledge of Grafana and open source monitoring.


Latest Release

Grafana 4.6.1 adds some bug fixes:

  • Singlestat: Lost thresholds when using save dashboard as #96816
  • Graph: Fix for series override color picker #97151
  • Go: build using golang 1.9.2 #97134
  • Plugins: Fixed problem with loading plugin js files behind auth proxy #95092
  • Graphite: Annotation tooltip should render empty string when undefined #9707

Download Grafana 4.6.1 Now


From the Blogosphere

FOSDEM 2018 Monitoring & Cloud Devroom CFP: The CFP is now open for the Monitoring & Cloud Devroom at FOSDEM 2018, held in Brussels, Belgium, Feb 3-4, 2018. FOSDEM is the premier open source conference in europe, and covers a broad range of topics. The Monitoring and Cloud devroom was a popular devroom last year, so be sure to submit your talk before the November 26 deadline!

PRTG plus Grafana FTW!: @neuralfraud has written a plugin for PRTG that allows you to view PRTG data directly in Grafana. This article goes over the features of the plugin, beautiful dashboards and visualization options, and how to get started.

Grafana-based GUI for mgstat, a system monitoring tool for InterSystems Caché, Ensemble or HealthShare: This is a continuation of the previous article Making Prometheus Monitoring for InterSystems Caché where we examine how to visualize the results from the mgstat tool. This article is broken down into which stats are collected and how these stats are collected.

Using Grafana & InfluxDB to view XIV Host Performance Metrics: Allan wanted to get an unified view of what his hosts were doing, however, the XIV GUI only allowed 12 hosts to be displayed at a given time– which was extremely limiting. This is the first in a series of articles on how to gather and parse host data and visualize it in Grafana.

Service telemetry with Grafana and InfluxDB – Part I: Introduction: This is the first in a new series of posts which will walk you through the process of building a production-ready solution for monitoring real-time metrics for any application or service, complete with useful and beautiful dashboards.


GrafanaCon General Admission Now Available!

Early bird tickets are no longer available, but you can still lock in your seat for GrafanaCon! Join us March 1-2, 2018 in Amsterdam for 2 days of talks centered around Grafana and the surrounding monitoring ecosystem including Graphite, Prometheus, InfluxData, Elasticsearch, Kubernetes, and more.

Get Your Ticket Now


Grafana Plugins

Keeping your Grafana plugins up to date is important. Plugin authors are often adding new features and fixing bugs, which will make your plugin perform better. We’ve made updating easy; for on-prem Grafana, use the Grafana-cli tool, or update with 1 click if you’re using Hosted Grafana.

UPDATED PLUGIN

Piechart Panel – The latest version of the Piechart Panel has the following fixes:

  • Add “No data points” description for pie chart with 0 value
  • Donut now works with transparent panel
  • Can toggle to hide series from piechart
  • On graph legend can show values. Can choose how many decimals
  • Sort pie slices upon sorting of legend entries
  • Fix for color picker for Grafana 4.6

Update


Contribution of the Week:

Each week we highlight some of the important contributions from our amazing open source community. Thank you for helping make Grafana better!

@akshaychhajed
We got an amazing PR this week. Grafana has lots of docker-compose files for internal testing that were created using a very old version of docker-compose. @akshaychhajed sent a PR converting them all to the latest version of docker-compose. Huge thanks from the Grafana team!


Upcoming Events:

In between code pushes we like to speak at, sponsor and attend all kinds of conferences and meetups. We have some awesome talks lined up this November. Hope to see you at one of these events!


Tweet of the Week

We scour Twitter each week to find an interesting/beautiful dashboard and show it off! #monitoringLove

Beautiful – I want to build a real-life version of this using a block of wood, some nails and colored string… or maybe have it cross-stitched on a pillow 🙂


Grafana Labs is Hiring!

We are passionate about open source software and thrive on tackling complex challenges to build the future. We ship code from every corner of the globe and love working with the community. If this sounds exciting, you’re in luck – WE’RE HIRING!

Check out our Open Positions


How are we doing?

Well, that wraps up another week! How we’re doing? Submit a comment on this article below, or post something at our community forum. Help us make these weekly roundups better!

Follow us on Twitter, like us on Facebook, and join the Grafana Labs community.

timeShift(GrafanaBuzz, 1w) Issue 18

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2017/10/20/timeshiftgrafanabuzz-1w-issue-18/

Welcome to another issue of timeShift. This week we released Grafana 4.6.0-beta2, which includes some fixes for alerts, annotations, the Cloudwatch data source, and a few panel updates. We’re also gearing up for Oredev, one of the biggest tech conferences in Scandinavia, November 7-10. In addition to sponsoring, our very own Carl Bergquist will be presenting “Monitoring for everyone.” Hope to see you there – swing by our booth and say hi!


Latest Release

Grafana 4.6-beta-2 is now available! Grafana 4.6.0-beta2 adds fixes for:

  • ColorPicker display
  • Alerting test
  • Cloudwatch improvements
  • CSV export
  • Text panel enhancements
  • Annotation fix for MySQL

To see more details on what’s in the newest version, please see the release notes.

Download Grafana 4.6.0-beta-2 Now


From the Blogosphere

Screeps and Grafana: Graphing your AI: If you’re unfamiliar with Screeps, it’s a MMO RTS game for programmers, where the objective is to grow your colony through programming your units’ AI. You control your colony by writing JavaScript, which operates 247 in the single persistent real-time world filled by other players. This article walks you through graphing all your game stats with Grafana.

ntopng Grafana Integration: The Beauty of Data Visualization: Our friends at ntop created a tutorial so that you can graph ntop monitoring data in Grafana. He goes through the metrics exposed, configuring the ntopng Data Source plugin, and building your first dashboard. They’ve also created a nice video tutorial of the process.

Installing Graphite and Grafana to Display the Graphs of Centreon: This article, provides a step-by-step guide to getting your Centreon data into Graphite and visualizing the data in Grafana.

Bit v. Byte Episode 3 – Metrics for the Win: Bit v. Byte is a new weekly Podcast about the web industry, tools and techniques upcoming and in use today. This episode dives into metrics, and discusses Grafana, Prometheus and NGINX Amplify.

Code-Quickie: Visualize heating with Grafana: With the winter weather coming, Reinhard wanted to monitor the stats in his boiler room. This article covers not only the visualization of the data, but the different devices and sensors you can use to can use in your own home.

RuuviTag with C.H.I.P – BLE – Node-RED: Following the temperature-monitoring theme from the last article, Tobias writes about his journey of hooking up his new RuuviTag to Grafana to measure temperature, relative humidity, air pressure and more.


Early Bird will be Ending Soon

Early bird discounts will be ending soon, but you still have a few days to lock in the lower price. We will be closing early bird on October 31, so don’t wait until the last minute to take advantage of the discounted tickets!

Also, there’s still time to submit your talk. We’ll accept submissions through the end of October. We’re looking for technical and non-technical talks of all sizes. Submit a CFP now.

Get Your Early Bird Ticket Now


Grafana Plugins

This week we have updates to two panels and a brand new panel that can add some animation to your dashboards. Installing plugins in Grafana is easy; for on-prem Grafana, use the Grafana-cli tool, or with 1 click if you are using Hosted Grafana.

NEW PLUGIN

Geoloop Panel – The Geoloop panel is a simple visualizer for joining GeoJSON to Time Series data, and animating the geo features in a loop. An example of using the panel would be showing the rate of rainfall during a 5-hour storm.

Install Now

UPDATED PLUGIN

Breadcrumb Panel – This plugin keeps track of dashboards you have visited within one session and displays them as a breadcrumb. The latest update fixes some issues with back navigation and url query params.

Update

UPDATED PLUGIN

Influx Admin Panel – The Influx Admin panel duplicates features from the now deprecated Web Admin Interface for InfluxDB and has lots of features like letting you see the currently running queries, which can also be easily killed.

Changes in the latest release:

  • Converted to typescript project based on typescript-template-datasource
  • Select Databases. This only works with PR#8096
  • Added time format options
  • Show tags from response
  • Support template variables in the query

Update


Contribution of the week:

Each week we highlight some of the important contributions from our amazing open source community. Thank you for helping make Grafana better!

The Stockholm Go Meetup had a hackathon this week and sent a PR for letting whitelisted cookies pass through the Grafana proxy. Thanks to everyone who worked on this PR!


Tweet of the Week

We scour Twitter each week to find an interesting/beautiful dashboard and show it off! #monitoringLove

This is awesome – we can’t get enough of these public dashboards!

We Need Your Help!

Do you have a graph that you love because the data is beautiful or because the graph provides interesting information? Please get in touch. Tweet or send us an email with a screenshot, and we’ll tell you about this fun experiment.

Tell Me More


Grafana Labs is Hiring!

We are passionate about open source software and thrive on tackling complex challenges to build the future. We ship code from every corner of the globe and love working with the community. If this sounds exciting, you’re in luck – WE’RE HIRING!

Check out our Open Positions


How are we doing?

Please tell us how we’re doing. Submit a comment on this article below, or post something at our community forum. Help us make these weekly roundups better!

Follow us on Twitter, like us on Facebook, and join the Grafana Labs community.

timeShift(GrafanaBuzz, 1w) Issue 15

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2017/09/29/timeshiftgrafanabuzz-1w-issue-15/

This week the Grafana Labs team converged on Stockholm. In addition to taking advantage of the beautiful weather, which was perfect for team outings, we were also hard at work setting objectives for the next Grafana release, finalizing details for GrafanaCon EU, and enjoying some good old-fashioned face time in an otherwise post-geographic company. This issue of TimeShift covers a few recent and upcoming talks, monitoring Kubernetes and plugin updates.

All Systems Go! 2017 Schedule Published

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/all-systems-go-2017-schedule-published.html

The All Systems Go! 2017 schedule has been published!

I am happy to announce that we have published the All Systems Go! 2017 schedule!
We are very happy with the large number and the quality of the
submissions we got, and the resulting schedule is exceptionally
strong.

Without further ado:

Here’s the schedule for the first day (Saturday, 21st of October).

And here’s the schedule for the second day (Sunday, 22nd of October).

Here are a couple of keywords from the topics of the talks:
1password, azure, bluetooth, build systems,
casync, cgroups, cilium, cockpit, containers,
ebpf, flatpak, habitat, IoT, kubernetes,
landlock, meson, OCI, rkt, rust, secureboot,
skydive, systemd, testing, tor, varlink,
virtualization, wifi, and more.

Our speakers are from all across the industry: Chef CoreOS, Covalent,
Facebook, Google, Intel, Kinvolk, Microsoft, Mozilla, Pantheon,
Pengutronix, Red Hat, SUSE and more.

For further information about All Systems Go! visit our conference web site.

Make sure to buy your ticket for All Systems Go! 2017 now! A limited
number of tickets are left at this point, so make sure you get yours
before we are all sold out! Find all details here.

See you in Berlin!

timeShift(GrafanaBuzz, 1w) Issue 14

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2017/09/22/timeshiftgrafanabuzz-1w-issue-14/

Summer is officially in the rear-view mirror, but we at Grafana Labs are excited. Next week, the team will gather in Stockholm, Sweden where we’ll be discussing Grafana 5.0, GrafanaCon EU and setting other goals. If you’re attending Percona Live Europe 2017 in Dublin, be sure and catch Grafana developer, Daniel Lee on Tuesday, September 26. He’ll be showing off the new MySQL data source and a sneak peek of Grafana 5.0.

And with that – we hope you enjoy this issue of TimeShift!


Latest Release

Grafana 4.5.2 is now available! Various fixes to the Graphite data source, HTTP API, and templating.

To see details on what’s been fixed in the newest version, please see the release notes.

Download Grafana 4.5.2 Now


From the Blogosphere

A Monitoring Solution for Docker Hosts, Containers and Containerized Services: Stefan was searching for an open source, self-hosted monitoring solution. With an ever-growing number of open source TSDBs, Stefan outlines why he chose Prometheus and provides a rundown of how he’s monitoring his Docker hosts, containers and services.

Real-time API Performance Monitoring with ES, Beats, Logstash and Grafana: As APIs become a centerpiece for businesses, monitoring API performance is extremely important. Hiren recently configured real time API response time monitoring for a project and shares his implementation plan and configurations.

Monitoring SSL Certificate Expiry in GCP and Kubernetes: This article discusses how to use Prometheus and Grafana to automatically monitor SSL certificates in use by load balancers across GCP projects.

Node.js Performance Monitoring with Prometheus: This is a good primer for monitoring in general. It discusses what monitoring is, important signals to know, instrumentation, and things to consider when selecting a monitoring tool.

DIY Dashboard with Grafana and MariaDB: Mark was interested in testing out the new beta MySQL support in Grafana, so he wrote a short article on how he is using Grafana with MariaDB.

Collecting Temperature Data with Raspberry Pi Computers: Many of us use monitoring for tracking mission-critical systems, but setting up environment monitoring can be a fun way to improve your programming skills as well.


GrafanaCon EU CFP is Open

Have a big idea to share? A shorter talk or a demo you’d like to show off? We’re looking for technical and non-technical talks of all sizes. The proposals are rolling in, but we are happy to save a speaking slot for you!

I’d Like to Speak at GrafanaCon


Grafana Plugins

There were a lot of plugin updates to highlight this week, many of which were due to changes in Grafana 4.5. It’s important to keep your plugins up to date, since bug fixes and new features are added frequently. We’ve made the process of installing and updating plugins simple. On an on-prem instance, use the Grafana-cli, or on Hosted Grafana, install and update with 1-click.

NEW PLUGIN

Linksmart HDS Data Source – The LinkSmart Historical Data Store is a new Grafana data source plugin. LinkSmart is an open source IoT platform for developing IoT applications. IoT applications need to deal with large amounts of data produced by a growing number of sensors and other devices. The Historical Datastore is for storing, querying, and aggregating (time-series) sensor data.

Install Now

UPDATED PLUGIN

Simple JSON Data Source – This plugin received a bug fix for the query editor.

Update Now

UPDATED PLUGIN

Stagemonitor Elasticsearch App – Numerous small updates and the version updated to match the StageMonitor version number.

Update Now

UPDATED PLUGIN

Discrete Panel – Update to fix breaking change in Grafana 4.5.

Update Now

UPDATED PLUGIN

Status Dot Panel – Minor HTML Update in this version.

Update Now

UPDATED PLUGIN

Alarm Box Panel – This panel was updated to fix breaking changes in Grafana 4.5.

Update Now


This week’s MVC (Most Valuable Contributor)

Each week we highlight a contributor to Grafana or the surrounding ecosystem as a thank you for their participation in making open source software great.

Sven Klemm opened a PR for adding a new Postgres data source and has been very quick at implementing proposed changes. The Postgres data source is on our roadmap for Grafana 5.0 so this PR really helps. Thanks Sven!


Tweet of the Week

We scour Twitter each week to find an interesting/beautiful dashboard and show it off! #monitoringLove

Glad you’re finding Grafana useful! Curious about that annotation just before midnight 🙂

We Need Your Help

Last week we announced an experiment we were conducting, and need your help! Do you have a graph that you love because the data is beautiful or because the graph provides interesting information? Please get in touch. Tweet or send us an email with a screenshot, and we’ll tell you about this fun experiment.

I Want to Help


Grafana Labs is Hiring!

We are passionate about open source software and thrive on tackling complex challenges to build the future. We ship code from every corner of the globe and love working with the community. If this sounds exciting, you’re in luck – WE’RE HIRING!

Check out our Open Positions


What do you think?

What would you like to see here? Submit a comment on this article below, or post something at our community forum. Help us make these weekly roundups better!

Follow us on Twitter, like us on Facebook, and join the Grafana Labs community.

Manage Kubernetes Clusters on AWS Using CoreOS Tectonic

Post Syndicated from Arun Gupta original https://aws.amazon.com/blogs/compute/kubernetes-clusters-aws-coreos-tectonic/

There are multiple ways to run a Kubernetes cluster on Amazon Web Services (AWS). The first post in this series explained how to manage a Kubernetes cluster on AWS using kops. This second post explains how to manage a Kubernetes cluster on AWS using CoreOS Tectonic.

Tectonic overview

Tectonic delivers the most current upstream version of Kubernetes with additional features. It is a commercial offering from CoreOS and adds the following features over the upstream:

  • Installer
    Comes with a graphical installer that installs a highly available Kubernetes cluster. Alternatively, the cluster can be installed using AWS CloudFormation templates or Terraform scripts.
  • Operators
    An operator is an application-specific controller that extends the Kubernetes API to create, configure, and manage instances of complex stateful applications on behalf of a Kubernetes user. This release includes an etcd operator for rolling upgrades and a Prometheus operator for monitoring capabilities.
  • Console
    A web console provides a full view of applications running in the cluster. It also allows you to deploy applications to the cluster and start the rolling upgrade of the cluster.
  • Monitoring
    Node CPU and memory metrics are powered by the Prometheus operator. The graphs are available in the console. A large set of preconfigured Prometheus alerts are also available.
  • Security
    Tectonic ensures that cluster is always up to date with the most recent patches/fixes. Tectonic clusters also enable role-based access control (RBAC). Different roles can be mapped to an LDAP service.
  • Support
    CoreOS provides commercial support for clusters created using Tectonic.

Tectonic can be installed on AWS using a GUI installer or Terraform scripts. The installer prompts you for the information needed to boot the Kubernetes cluster, such as AWS access and secret key, number of master and worker nodes, and instance size for the master and worker nodes. The cluster can be created after all the options are specified. Alternatively, Terraform assets can be downloaded and the cluster can be created later. This post shows using the installer.

CoreOS License and Pull Secret

Even though Tectonic is a commercial offering, a cluster for up to 10 nodes can be created by creating a free account at Get Tectonic for Kubernetes. After signup, a CoreOS License and Pull Secret files are provided on your CoreOS account page. Download these files as they are needed by the installer to boot the cluster.

IAM user permission

The IAM user to create the Kubernetes cluster must have access to the following services and features:

  • Amazon Route 53
  • Amazon EC2
  • Elastic Load Balancing
  • Amazon S3
  • Amazon VPC
  • Security groups

Use the aws-policy policy to grant the required permissions for the IAM user.

DNS configuration

A subdomain is required to create the cluster, and it must be registered as a public Route 53 hosted zone. The zone is used to host and expose the console web application. It is also used as the static namespace for the Kubernetes API server. This allows kubectl to be able to talk directly with the master.

The domain may be registered using Route 53. Alternatively, a domain may be registered at a third-party registrar. This post uses a kubernetes-aws.io domain registered at a third-party registrar and a tectonic subdomain within it.

Generate a Route 53 hosted zone using the AWS CLI. Download jq to run this command:

ID=$(uuidgen) && \
aws route53 create-hosted-zone \
--name tectonic.kubernetes-aws.io \
--caller-reference $ID \
| jq .DelegationSet.NameServers

The command shows an output such as the following:

[
  "ns-1924.awsdns-48.co.uk",
  "ns-501.awsdns-62.com",
  "ns-1259.awsdns-29.org",
  "ns-749.awsdns-29.net"
]

Create NS records for the domain with your registrar. Make sure that the NS records can be resolved using a utility like dig web interface. A sample output would look like the following:

The bottom of the screenshot shows NS records configured for the subdomain.

Download and run the Tectonic installer

Download the Tectonic installer (version 1.7.1) and extract it. The latest installer can always be found at coreos.com/tectonic. Start the installer:

./tectonic/tectonic-installer/$PLATFORM/installer

Replace $PLATFORM with either darwin or linux. The installer opens your default browser and prompts you to select the cloud provider. Choose Amazon Web Services as the platform. Choose Next Step.

Specify the Access Key ID and Secret Access Key for the IAM role that you created earlier. This allows the installer to create resources required for the Kubernetes cluster. This also gives the installer full access to your AWS account. Alternatively, to protect the integrity of your main AWS credentials, use a temporary session token to generate temporary credentials.

You also need to choose a region in which to install the cluster. For the purpose of this post, I chose a region close to where I live, Northern California. Choose Next Step.

Give your cluster a name. This name is part of the static namespace for the master and the address of the console.

To enable in-place update to the Kubernetes cluster, select the checkbox next to Automated Updates. It also enables update to the etcd and Prometheus operators. This feature may become a default in future releases.

Choose Upload “tectonic-license.txt” and upload the previously downloaded license file.

Choose Upload “config.json” and upload the previously downloaded pull secret file. Choose Next Step.

Let the installer generate a CA certificate and key. In this case, the browser may not recognize this certificate, which I discuss later in the post. Alternatively, you can provide a CA certificate and a key in PEM format issued by an authorized certificate authority. Choose Next Step.

Use the SSH key for the region specified earlier. You also have an option to generate a new key. This allows you to later connect using SSH into the Amazon EC2 instances provisioned by the cluster. Here is the command that can be used to log in:

ssh –i <key> [email protected]<ec2-instance-ip>

Choose Next Step.

Define the number and instance type of master and worker nodes. In this case, create a 6 nodes cluster. Make sure that the worker nodes have enough processing power and memory to run the containers.

An etcd cluster is used as persistent storage for all of Kubernetes API objects. This cluster is required for the Kubernetes cluster to operate. There are three ways to use the etcd cluster as part of the Tectonic installer:

  • (Default) Provision the cluster using EC2 instances. Additional EC2 instances are used in this case.
  • Use an alpha support for cluster provisioning using the etcd operator. The etcd operator is used for automated operations of the etcd master nodes for the cluster itself, in addition to for etcd instances that are created for application usage. The etcd cluster is provisioned within the Tectonic installer.
  • Bring your own pre-provisioned etcd cluster.

Use the first option in this case.

For more information about choosing the appropriate instance type, see the etcd hardware recommendation. Choose Next Step.

Specify the networking options. The installer can create a new public VPC or use a pre-existing public or private VPC. Make sure that the VPC requirements are met for an existing VPC.

Give a DNS name for the cluster. Choose the domain for which the Route 53 hosted zone was configured earlier, such as tectonic.kubernetes-aws.io. Multiple clusters may be created under a single domain. The cluster name and the DNS name would typically match each other.

To select the CIDR range, choose Show Advanced Settings. You can also choose the Availability Zones for the master and worker nodes. By default, the master and worker nodes are spread across multiple Availability Zones in the chosen region. This makes the cluster highly available.

Leave the other values as default. Choose Next Step.

Specify an email address and password to be used as credentials to log in to the console. Choose Next Step.

At any point during the installation, you can choose Save progress. This allows you to save configurations specified in the installer. This configuration file can then be used to restore progress in the installer at a later point.

To start the cluster installation, choose Submit. At another time, you can download the Terraform assets by choosing Manually boot. This allows you to boot the cluster later.

The logs from the Terraform scripts are shown in the installer. When the installation is complete, the console shows that the Terraform scripts were successfully applied, the domain name was resolved successfully, and that the console has started. The domain works successfully if the DNS resolution worked earlier, and it’s the address where the console is accessible.

Choose Download assets to download assets related to your cluster. It contains your generated CA, kubectl configuration file, and the Terraform state. This download is an important step as it allows you to delete the cluster later.

Choose Next Step for the final installation screen. It allows you to access the Tectonic console, gives you instructions about how to configure kubectl to manage this cluster, and finally deploys an application using kubectl.

Choose Go to my Tectonic Console. In our case, it is also accessible at http://cluster.tectonic.kubernetes-aws.io/.

As I mentioned earlier, the browser does not recognize the self-generated CA certificate. Choose Advanced and connect to the console. Enter the login credentials specified earlier in the installer and choose Login.

The Kubernetes upstream and console version are shown under Software Details. Cluster health shows All systems go and it means that the API server and the backend API can be reached.

To view different Kubernetes resources in the cluster choose, the resource in the left navigation bar. For example, all deployments can be seen by choosing Deployments.

By default, resources in the all namespace are shown. Other namespaces may be chosen by clicking on a menu item on the top of the screen. Different administration tasks such as managing the namespaces, getting list of the nodes and RBAC can be configured as well.

Download and run Kubectl

Kubectl is required to manage the Kubernetes cluster. The latest version of kubectl can be downloaded using the following command:

curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/darwin/amd64/kubectl

It can also be conveniently installed using the Homebrew package manager. To find and access a cluster, Kubectl needs a kubeconfig file. By default, this configuration file is at ~/.kube/config. This file is created when a Kubernetes cluster is created from your machine. However, in this case, download this file from the console.

In the console, choose admin, My Account, Download Configuration and follow the steps to download the kubectl configuration file. Move this file to ~/.kube/config. If kubectl has already been used on your machine before, then this file already exists. Make sure to take a backup of that file first.

Now you can run the commands to view the list of deployments:

~ $ kubectl get deployments --all-namespaces
NAMESPACE         NAME                                    DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
kube-system       etcd-operator                           1         1         1            1           43m
kube-system       heapster                                1         1         1            1           40m
kube-system       kube-controller-manager                 3         3         3            3           43m
kube-system       kube-dns                                1         1         1            1           43m
kube-system       kube-scheduler                          3         3         3            3           43m
tectonic-system   container-linux-update-operator         1         1         1            1           40m
tectonic-system   default-http-backend                    1         1         1            1           40m
tectonic-system   kube-state-metrics                      1         1         1            1           40m
tectonic-system   kube-version-operator                   1         1         1            1           40m
tectonic-system   prometheus-operator                     1         1         1            1           40m
tectonic-system   tectonic-channel-operator               1         1         1            1           40m
tectonic-system   tectonic-console                        2         2         2            2           40m
tectonic-system   tectonic-identity                       2         2         2            2           40m
tectonic-system   tectonic-ingress-controller             1         1         1            1           40m
tectonic-system   tectonic-monitoring-auth-alertmanager   1         1         1            1           40m
tectonic-system   tectonic-monitoring-auth-prometheus     1         1         1            1           40m
tectonic-system   tectonic-prometheus-operator            1         1         1            1           40m
tectonic-system   tectonic-stats-emitter                  1         1         1            1           40m

This output is similar to the one shown in the console earlier. Now, this kubectl can be used to manage your resources.

Upgrade the Kubernetes cluster

Tectonic allows the in-place upgrade of the cluster. This is an experimental feature as of this release. The clusters can be updated either automatically, or with manual approval.

To perform the update, choose Administration, Cluster Settings. If an earlier Tectonic installer, version 1.6.2 in this case, is used to install the cluster, then this screen would look like the following:

Choose Check for Updates. If any updates are available, choose Start Upgrade. After the upgrade is completed, the screen is refreshed.

This is an experimental feature in this release and so should only be used on clusters that can be easily replaced. This feature may become a fully supported in a future release. For more information about the upgrade process, see Upgrading Tectonic & Kubernetes.

Delete the Kubernetes cluster

Typically, the Kubernetes cluster is a long-running cluster to serve your applications. After its purpose is served, you may delete it. It is important to delete the cluster as this ensures that all resources created by the cluster are appropriately cleaned up.

The easiest way to delete the cluster is using the assets downloaded in the last step of the installer. Extract the downloaded zip file. This creates a directory like <cluster-name>_TIMESTAMP. In that directory, give the following command to delete the cluster:

TERRAFORM_CONFIG=$(pwd)/.terraformrc terraform destroy --force

This destroys the cluster and all associated resources.

You may have forgotten to download the assets. There is a copy of the assets in the directory tectonic/tectonic-installer/darwin/clusters. In this directory, another directory with the name <cluster-name>_TIMESTAMP contains your assets.

Conclusion

This post explained how to manage Kubernetes clusters using the CoreOS Tectonic graphical installer.  For more details, see Graphical Installer with AWS. If the installation does not succeed, see the helpful Troubleshooting tips. After the cluster is created, see the Tectonic tutorials to learn how to deploy, scale, version, and delete an application.

Future posts in this series will explain other ways of creating and running a Kubernetes cluster on AWS.

Arun

timeShift(GrafanaBuzz, 1w) Issue 5

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2017/07/21/timeshiftgrafanabuzz-1w-issue-5/

We cover a lot of ground in this week’s timeShift. From diving into building your own plugin, finding the right dashboard, configuration options in the alerting feature, to monitoring your local weather, there’s something for everyone. Are you writing an article about Grafana, or have you come across an article you found interesting? Please get in touch, we’ll add it to our roundup.


From the Blogosphere

  • Going open-source in monitoring, part III: 10 most useful Grafana dashboards to monitor Kubernetes and services: We have hundreds of pre-made dashboards ready for you to install into your on-prem or hosted Grafana, but not every one will fit your specific monitoring needs. In part three of the series, Sergey discusses is experiences with finding useful dashboards and shows off ten of the best dashboards you can install for monitoring Kubernetes clusters and the services deployed on them.

  • Using AWS Lambda and API gateway for server-less Grafana adapters: Sometimes you’ll want to visualize metrics from a data source that may not yet be supported in Grafana natively. With the plugin functionality introduced in Grafana 3.0, anyone can create their own data sources. Using the SimpleJson data source, Jonas describes how he used AWS Lambda and AWS API gateway to write data source adapters for Grafana.

  • How to Use Grafana to Monitor JMeter Non-GUI Results – Part 2: A few issues ago we listed an article for using Grafana to monitor JMeter Non-GUI results, which required a number of non-trivial steps to complete. This article shows of an easier way to accomplish this that doesn’t require any additional configuration of InfluxDB.

  • Programming your Personal Weather Chart: It’s always great to see Grafana used outside of the typical dev-ops usecase. This article runs you through the steps to create your own weather chart and show off your local weather stats in Grafana. BONUS: Rob shows off a magic mirror he created, which can display this data.

  • vSphere Performance data – Part 6 – The Dashboard(s): This 6-part series goes into a ton of detail and walks you through the various methods of retrieving vSphere performance data, storing the data in a TSDB, and creating dashboards for the metrics. Part 6 deals specifically with Grafana, but I highly recommend reading all of the articles, as it chronicles the journey of metrics exploration, storage, and visualization from someone who had no prior experience with time series data.

  • Alerting in Grafana: Alerting in Grafana is a fairly new feature and one that we’re continuing to iterate on. We’re soon adding additional data source support, new notification channels, clustering, silencing rules, and more. This article steps you through all the configuration options to get you to your first alert.


Plugins and Dashboards

It can seem like work slows during July and August, but we’re still seeing a lot of activity in the community. This week we have a new graph panel to show off that gives you some unique looking dashboards, and an update to the Zabbix data source, which adds some really great features. You can install both of the plugins now on your on-prem Grafana via our cli, or with one-click on GrafanaCloud.

NEW PLUGIN

Bubble Chart Panel This super-cool looking panel groups your tag values into clusters of circles. The size of the circle represents the aggregated value of the time series data. There are also multiple color schemes to make those bubbles POP (pun intended)! Currently it works against OpenTSDB and Bosun, so give it a try!

Install Now

UPDATED PLUGIN

Zabbix Alex has been hard at work, making improvements on the Zabbix App for Grafana. This update adds annotations, template variables, alerting and more. Thanks Alex! If you’d like to try out the app, head over to http://play.grafana-zabbix.org/dashboard/db/zabbix-db-mysql?orgId=2

Install 3.5.1 Now


This week’s MVC (Most Valuable Contributor)

Open source software can’t thrive without the contributions from the community. Each week we’ll recognize a Grafana contributor and thank them for all of their PRs, bug reports and feedback.

mk-dhia (Dhia)
Thank you so much for your improvements to the Elasticsearch data source!


Tweet of the Week

We scour Twitter each week to find an interesting/beautiful dashboard and show it off! #monitoringLove

This week’s tweet comes from @geek_dave

Great looking dashboard Dave! And thank you for adding new features and keeping it updated. It’s creators like you who make the dashboard repository so awesome!


Upcoming Events

We love when people talk about Grafana at meetups and conferences.

Monday, July 24, 2017 – 7:30pm | Google Campus Warsaw


Ząbkowska 27/31, Warsaw, Poland

Iot & HOME AUTOMATION #3 openHAB, InfluxDB, Grafana:
If you are interested in topics of the internet of things and home automation, this might be a good occasion to meet people similar to you. If you are into it, we will also show you how we can all work together on our common projects.

RSVP


Tell us how we’re Doing.

We’d love your feedback on what kind of content you like, length, format, etc – so please keep the comments coming! You can submit a comment on this article below, or post something at our community forum. Help us make this better.

Follow us on Twitter, like us on Facebook, and join the Grafana Labs community.

timeShift(GrafanaBuzz, 1w) Issue 4

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2017/07/14/timeshiftgrafanabuzz-1w-issue-4/

The summer seems to be flying by! This week’s timeShift has a lot of great articles to share, including a Grafana presentation from one of our software engineers, Kubernetes monitoring, dashboard exports and backups via grafcli, scaling Graphite on AWS and a lot more. If you’ve come across a recent article about Grafana, or are writing one yourself, please get in touch, we’d be happy to feature it here. From the Blogosphere Democratizing Metrics with Grafana: Grafana Labs software developer Alexander Zobnin, recently gave a great talk at the Big Monitoring Meetup in St.

Manage Kubernetes Clusters on AWS Using Kops

Post Syndicated from Arun Gupta original https://aws.amazon.com/blogs/compute/kubernetes-clusters-aws-kops/

Any containerized application typically consists of multiple containers. There is a container for the application itself, one for database, possibly another for web server, and so on. During development, its normal to build and test this multi-container application on a single host. This approach works fine during early dev and test cycles but becomes a single point of failure for production where the availability of the application is critical. In such cases, this multi-container application is deployed on multiple hosts. There is a need for an external tool to manage such a multi-container multi-host deployment. Container orchestration frameworks provides the capability of cluster management, scheduling containers on different hosts, service discovery and load balancing, crash recovery and other related functionalities. There are multiple options for container orchestration on Amazon Web Services: Amazon ECS, Docker for AWS, and DC/OS.

Another popular option for container orchestration on AWS is Kubernetes. There are multiple ways to run a Kubernetes cluster on AWS. This multi-part blog series provides a brief overview and explains some of these approaches in detail. This first post explains how to create a Kubernetes cluster on AWS using kops.

Kubernetes and Kops overview

Kubernetes is an open source, container orchestration platform. Applications packaged as Docker images can be easily deployed, scaled, and managed in a Kubernetes cluster. Some of the key features of Kubernetes are:

  • Self-healing
    Failed containers are restarted to ensure that the desired state of the application is maintained. If a node in the cluster dies, then the containers are rescheduled on a different node. Containers that do not respond to application-defined health check are terminated, and thus rescheduled.
  • Horizontal scaling
    Number of containers can be easily scaled up and down automatically based upon CPU utilization, or manually using a command.
  • Service discovery and load balancing
    Multiple containers can be grouped together discoverable using a DNS name. The service can be load balanced with integration to the native LB provided by the cloud provider.
  • Application upgrades and rollbacks
    Applications can be upgraded to a newer version without an impact to the existing one. If something goes wrong, Kubernetes rolls back the change.

Kops, short for Kubernetes Operations, is a set of tools for installing, operating, and deleting Kubernetes clusters in the cloud. A rolling upgrade of an older version of Kubernetes to a new version can also be performed. It also manages the cluster add-ons. After the cluster is created, the usual kubectl CLI can be used to manage resources in the cluster.

Download Kops and Kubectl

There is no need to download the Kubernetes binary distribution for creating a cluster using kops. However, you do need to download the kops CLI. It then takes care of downloading the right Kubernetes binary in the cloud, and provisions the cluster.

The different download options for kops are explained at github.com/kubernetes/kops#installing. On MacOS, the easiest way to install kops is using the brew package manager.

brew update && brew install kops

The version of kops can be verified using the kops version command, which shows:

Version 1.6.1

In addition, download kubectl. This is required to manage the Kubernetes cluster. The latest version of kubectl can be downloaded using the following command:

curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/darwin/amd64/kubectl

Make sure to include the directory where kubectl is downloaded in your PATH.

IAM user permission

The IAM user to create the Kubernetes cluster must have the following permissions:

  • AmazonEC2FullAccess
  • AmazonRoute53FullAccess
  • AmazonS3FullAccess
  • IAMFullAccess
  • AmazonVPCFullAccess

Alternatively, a new IAM user may be created and the policies attached as explained at github.com/kubernetes/kops/blob/master/docs/aws.md#setup-iam-user.

Create an Amazon S3 bucket for the Kubernetes state store

Kops needs a “state store” to store configuration information of the cluster.  For example, how many nodes, instance type of each node, and Kubernetes version. The state is stored during the initial cluster creation. Any subsequent changes to the cluster are also persisted to this store as well. As of publication, Amazon S3 is the only supported storage mechanism. Create a S3 bucket and pass that to the kops CLI during cluster creation.

This post uses the bucket name kubernetes-aws-io. Bucket names must be unique; you have to use a different name. Create an S3 bucket:

aws s3api create-bucket --bucket kubernetes-aws-io

I strongly recommend versioning this bucket in case you ever need to revert or recover a previous version of the cluster. This can be enabled using the AWS CLI as well:

aws s3api put-bucket-versioning --bucket kubernetes-aws-io --versioning-configuration Status=Enabled

For convenience, you can also define KOPS_STATE_STORE environment variable pointing to the S3 bucket. For example:

export KOPS_STATE_STORE=s3://kubernetes-aws-io

This environment variable is then used by the kops CLI.

DNS configuration

As of Kops 1.6.1, a top-level domain or a subdomain is required to create the cluster. This domain allows the worker nodes to discover the master and the master to discover all the etcd servers. This is also needed for kubectl to be able to talk directly with the master.

This domain may be registered with AWS, in which case a Route 53 hosted zone is created for you. Alternatively, this domain may be at a different registrar. In this case, create a Route 53 hosted zone. Specify the name server (NS) records from the created zone as NS records with the domain registrar.

This post uses a kubernetes-aws.io domain registered at a third-party registrar.

Generate a Route 53 hosted zone using the AWS CLI. Download jq to run this command:

ID=$(uuidgen) && \
aws route53 create-hosted-zone \
--name cluster.kubernetes-aws.io \
--caller-reference $ID \
| jq .DelegationSet.NameServers

This shows an output such as the following:

[
"ns-94.awsdns-11.com",
"ns-1962.awsdns-53.co.uk",
"ns-838.awsdns-40.net",
"ns-1107.awsdns-10.org"
]

Create NS records for the domain with your registrar. Different options on how to configure DNS for the cluster are explained at github.com/kubernetes/kops/blob/master/docs/aws.md#configure-dns.

Experimental support to create a gossip-based cluster was added in Kops 1.6.2. This post uses a DNS-based approach, as that is more mature and well tested.

Create the Kubernetes cluster

The Kops CLI can be used to create a highly available cluster, with multiple master nodes spread across multiple Availability Zones. Workers can be spread across multiple zones as well. Some of the tasks that happen behind the scene during cluster creation are:

  • Provisioning EC2 instances
  • Setting up AWS resources such as networks, Auto Scaling groups, IAM users, and security groups
  • Installing Kubernetes.

Start the Kubernetes cluster using the following command:

kops create cluster \
--name cluster.kubernetes-aws.io \
--zones us-west-2a \
--state s3://kubernetes-aws-io \
--yes

In this command:

  • --zones
    Defines the zones in which the cluster is going to be created. Multiple comma-separated zones can be specified to span the cluster across multiple zones.
  • --name
    Defines the cluster’s name.
  • --state
    Points to the S3 bucket that is the state store.
  • --yes
    Immediately creates the cluster. Otherwise, only the cloud resources are created and the cluster needs to be started explicitly using the command kops update --yes. If the cluster needs to be edited, then the kops edit cluster command can be used.

This starts a single master and two worker node Kubernetes cluster. The master is in an Auto Scaling group and the worker nodes are in a separate group. By default, the master node is m3.medium and the worker node is t2.medium. Master and worker nodes are assigned separate IAM roles as well.

Wait for a few minutes for the cluster to be created. The cluster can be verified using the command kops validate cluster --state=s3://kubernetes-aws-io. It shows the following output:

Using cluster from kubectl context: cluster.kubernetes-aws.io

Validating cluster cluster.kubernetes-aws.io

INSTANCE GROUPS
NAME                 ROLE      MACHINETYPE    MIN    MAX    SUBNETS
master-us-west-2a    Master    m3.medium      1      1      us-west-2a
nodes                Node      t2.medium      2      2      us-west-2a

NODE STATUS
NAME                                           ROLE      READY
ip-172-20-38-133.us-west-2.compute.internal    node      True
ip-172-20-38-177.us-west-2.compute.internal    master    True
ip-172-20-46-33.us-west-2.compute.internal     node      True

Your cluster cluster.kubernetes-aws.io is ready

It shows the different instances started for the cluster, and their roles. If multiple cluster states are stored in the same bucket, then --name <NAME> can be used to specify the exact cluster name.

Check all nodes in the cluster using the command kubectl get nodes:

NAME                                          STATUS         AGE       VERSION
ip-172-20-38-133.us-west-2.compute.internal   Ready,node     14m       v1.6.2
ip-172-20-38-177.us-west-2.compute.internal   Ready,master   15m       v1.6.2
ip-172-20-46-33.us-west-2.compute.internal    Ready,node     14m       v1.6.2

Again, the internal IP address of each node, their current status (master or node), and uptime are shown. The key information here is the Kubernetes version for each node in the cluster, 1.6.2 in this case.

The kubectl value included in the PATH earlier is configured to manage this cluster. Resources such as pods, replica sets, and services can now be created in the usual way.

Some of the common options that can be used to override the default cluster creation are:

  • --kubernetes-version
    The version of Kubernetes cluster. The exact versions supported are defined at github.com/kubernetes/kops/blob/master/channels/stable.
  • --master-size and --node-size
    Define the instance of master and worker nodes.
  • --master-count and --node-count
    Define the number of master and worker nodes. By default, a master is created in each zone specified by --master-zones. Multiple master nodes can be created by a higher number using --master-count or specifying multiple Availability Zones in --master-zones.

A three-master and five-worker node cluster, with master nodes spread across different Availability Zones, can be created using the following command:

kops create cluster \
--name cluster2.kubernetes-aws.io \
--zones us-west-2a,us-west-2b,us-west-2c \
--node-count 5 \
--state s3://kubernetes-aws-io \
--yes

Both the clusters are sharing the same state store but have different names. This also requires you to create an additional Amazon Route 53 hosted zone for the name.

By default, the resources required for the cluster are directly created in the cloud. The --target option can be used to generate the AWS CloudFormation scripts instead. These scripts can then be used by the AWS CLI to create resources at your convenience.

Get a complete list of options for cluster creation with kops create cluster --help.

More details about the cluster can be seen using the command kubectl cluster-info:

Kubernetes master is running at https://api.cluster.kubernetes-aws.io
KubeDNS is running at https://api.cluster.kubernetes-aws.io/api/v1/proxy/namespaces/kube-system/services/kube-dns

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

Check the client and server version using the command kubectl version:

Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.4", GitCommit:"d6f433224538d4f9ca2f7ae19b252e6fcb66a3ae", GitTreeState:"clean", BuildDate:"2017-05-19T18:44:27Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.2", GitCommit:"477efc3cbe6a7effca06bd1452fa356e2201e1ee", GitTreeState:"clean", BuildDate:"2017-04-19T20:22:08Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}

Both client and server version are 1.6 as shown by the Major and Minor attribute values.

Upgrade the Kubernetes cluster

Kops can be used to create a Kubernetes 1.4.x, 1.5.x, or an older version of the 1.6.x cluster using the --kubernetes-version option. The exact versions supported are defined at github.com/kubernetes/kops/blob/master/channels/stable.

Or, you may have used kops to create a cluster a while ago, and now want to upgrade to the latest recommended version of Kubernetes. Kops supports rolling cluster upgrades where the master and worker nodes are upgraded one by one.

As of kops 1.6.1, upgrading a cluster is a three-step process.

First, check and apply the latest recommended Kubernetes update.

kops upgrade cluster \
--name cluster2.kubernetes-aws.io \
--state s3://kubernetes-aws-io \
--yes

The --yes option immediately applies the changes. Not specifying the --yes option shows only the changes that are applied.

Second, update the state store to match the cluster state. This can be done using the following command:

kops update cluster \
--name cluster2.kubernetes-aws.io \
--state s3://kubernetes-aws-io \
--yes

Lastly, perform a rolling update for all cluster nodes using the kops rolling-update command:

kops rolling-update cluster \
--name cluster2.kubernetes-aws.io \
--state s3://kubernetes-aws-io \
--yes

Previewing the changes before updating the cluster can be done using the same command but without specifying the --yes option. This shows the following output:

NAME                 STATUS        NEEDUPDATE    READY    MIN    MAX    NODES
master-us-west-2a    NeedsUpdate   1             0        1      1      1
nodes                NeedsUpdate   2             0        2      2      2

Using --yes updates all nodes in the cluster, first master and then worker. There is a 5-minute delay between restarting master nodes, and a 2-minute delay between restarting nodes. These values can be altered using --master-interval and --node-interval options, respectively.

Only the worker nodes may be updated by using the --instance-group node option.

Delete the Kubernetes cluster

Typically, the Kubernetes cluster is a long-running cluster to serve your applications. After its purpose is served, you may delete it. It is important to delete the cluster using the kops command. This ensures that all resources created by the cluster are appropriately cleaned up.

The command to delete the Kubernetes cluster is:

kops delete cluster --state=s3://kubernetes-aws-io --yes

If multiple clusters have been created, then specify the cluster name as in the following command:

kops delete cluster cluster2.kubernetes-aws.io --state=s3://kubernetes-aws-io --yes

Conclusion

This post explained how to manage a Kubernetes cluster on AWS using kops. Kubernetes on AWS users provides a self-published list of companies using Kubernetes on AWS.

Try starting a cluster, create a few Kubernetes resources, and then tear it down. Kops on AWS provides a more comprehensive tutorial for setting up Kubernetes clusters. Kops docs are also helpful for understanding the details.

In addition, the Kops team hosts office hours to help you get started, from guiding you with your first pull request. You can always join the #kops channel on Kubernetes slack to ask questions. If nothing works, then file an issue at github.com/kubernetes/kops/issues.

Future posts in this series will explain other ways of creating and running a Kubernetes cluster on AWS.

— Arun

Analyze your GitHub Project With Elasticsearch And Grafana

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2017/05/10/analyze-your-github-project-with-elasticsearch-and-grafana/

The Dream

I have, for a long time, wished there was a way to easily export GitHub issues and comments to
Elasticsearch. The standard GitHub graphs for commits and traffic are great but I have
really been missing graphs and analytics on issues and comments.

If we had issues & comments in Elasticsearch, with a well-defined index mapping, we could do some
interesting analytics. For example:

  • Look at project history in terms of issues created
  • Look at project history in terms of comments (can be a measure of community engagement)
  • See how different labels trend over time.
  • Look at distributions (histograms) on the number of issues or comments created per user. Are there a few very active users that represent 70% or 90% of all issues & comments?
  • How long do PRs stay open?
  • How long until issues get their first response?

Why Elasticsearch?

Grafana is most often used with time series databases like Graphite, but for this sort of use case,
it’s about much more than measurements. Part of the power of Grafana is bringing together data from
many different places, and leveraging the strengths of its diverse set of data sources.

Elasticsearch isn’t technically a time series database, but it’s been one of our fastest growing data source
because it really shines for use cases like this. Plus, Grafana’s support for Elasticsearch is getting
better and better.

Elasticsearch is not only a document search DB. Its real power is in the kinds of aggregations you can do. It’s not ideal
for the high volume & high-resolution time series workloads that most time series databases can handle, but for
data with high cardinality (like documents with usernames, issue numbers, etc) it can really shine. It also allows
you to do ad-hoc filtering in a way that time series would not allow, as it would require a unique time series
for every possible filter condition and value.

The GitHub API Crawler

So a few weekends ago I had some left over programming energy and spent a few hours hacking together
this node.js app that uses the GitHub API to crawl all issues and comments which it
then saves as separate documents in Elasticsearch.

It stores them in Elasticsearch with this index mapping:

"mappings": {
  "issue": {
    "properties": {
      "title":            { "type": "text"  },
        "state":            { "type": "keyword"  },
        "repo":             { "type": "keyword"  },
        "labels":           { "type": "keyword"  },
        "number":           { "type": "keyword"  },
        "comments":         { "type": "long"  },
        "assignee":         { "type": "keyword"  },
        "user_login":       { "type": "keyword"  },
        "milestone":        { "type": "keyword"  },
        "created_at":       { "type": "date"  },
        "closed_at":        { "type": "date"  },
        "updated_at":       { "type": "date"  },
        "is_pull_request":  { "type": "boolean"  },
    }
  },
    "comment": {
      "properties": {
        "issue":           { "type": "keyword"  },
        "repo":            { "type": "keyword"  },
        "user_login":      { "type": "keyword"  },
        "created_at":      { "type": "date"     },
      }
    }
}

There are some more numeric fields being saved for reactions that do not need to be defined
in the index mapping.

The Dashboards

With the data finally collected, I built two dashboards; one focused on issues and another one
focused on comments. Both dashboards are templated and allow you to specify which repository
to look at and the granularity (group by time) of the data. You can also add any ad-hoc filter. For example,
only look at issues created by a specific user, or only look at issues with no comments.

Check out the dasboard on our play site. I configured the
github-to-es collector to fetch issues and comments for the main Kubernetes repo, the
main Grafana repo and the Microsoft VS Code editor repository.

The second dashboard shows comment analytics:

Useful How?

I am not exactly sure how useful this data & dashboards are yet. It was mostly a fun hobby project to see some trends and stats
for issue and comment volume. But this could also be useful data that can help you track things like issue label stats. Stats that could
be used to improve categorizing issues and visualizing changes in labeling trends. For example, the graphs could answer questions like:
How did a concerted effort to improve docs change the trend of issues labeled question?

Try it and help me improve it

Check out the GitHub repo grafana/github-to-es it has a basic README with instructions
for how to get started.

Once you have the import working you need to add an Elasticsearch data source in Grafana. For index name you specify github
and for the Timestamp field you specify created_at. Then you can import the the two dashboards i published on Grafana.com:

There are some limitations for how many issues and comments that can be imported in the initial full import due to the paging limit
in the GitHub API. GitHub API returns a maximum of 100 issues or comments per “page” and has a page limit of a maximum of 400 pages. This
means that the full import can only handle 40,000 issues and 40,000 comments.

More data & more cool graphs

There are probably many more interesting queries you can build and the collector could also be improved to fetch and store more fields.

For example:

  • Collect stars & fork stats (needs to be recorded as snapshot docs as there is no API to get historical data for this)
  • Calculate time between issue created and first comment during issue fetching to have that as a field on the issue docs
  • PR details, currently issue API does not include merge status (only a flag if its a PR)
  • Commit docs

There are probably a lot more cool things you can collect & query.

Until next time, keep on graphing!
Torkel Ödegaard
Grafana Creator & Project Lead

HiveMQ 3.2.4 released

Post Syndicated from The HiveMQ Team original https://www.hivemq.com/blog/hivemq-3-2-4-released/

The HiveMQ team is pleased to announce the availability of HiveMQ 3.2.4. This is a maintenance release for the 3.2 series and brings the following improvements:

  • Fixed an issue with duplicate delivery for QoS=2 messages
  • Fixed an cluster issue that could cause the loss of queued messages with shared subscriptions
  • Fixed an authorization issue with retained messages
  • Fixed an issue that could cause the loss of queued messages in rare edge cases
  • Enabled the use of the publishToClient() for shared subscriptions in the Publish Service of the plugin SPI
  • Increased cluster stability in network split scenarios
  • Fixed wrongly calculated metrics for dropped messages
  • Fixed an issue preventing the OnPublishSendCallback being called for queued and retained messages
  • Deprecated client-bind-port for tcp transport*
  • Improved logging
  • Improved HiveMQ bootstrap behavior for Kubernetes and Openshift Environments
  • Performance improvements

You can download the new HiveMQ version here.
* see the upgrade guide if this effects your configuration.

We strongly recommend to upgrade if you are an HiveMQ 3.2.x user.

Have a great day,
The HiveMQ Team

Tor, TPMs and service integrity attestation

Post Syndicated from Matthew Garrett original https://mjg59.dreamwidth.org/45602.html

One of the most powerful (and most scary) features of TPM-based measured boot is the ability for remote systems to request that clients attest to their boot state, allowing the remote system to determine whether the client has booted in the correct state. This involves each component in the boot process writing a hash of the next component into the TPM and logging it. When attestation is requested, the remote site gives the client a nonce and asks for an attestation, the client OS passes the nonce to the TPM and asks it to provide a signed copy of the hashes and the nonce and sends them (and the log) to the remote site. The remoteW site then replays the log to ensure it matches the signed hash values, and can examine the log to determine whether the system is trustworthy (whatever trustworthy means in this context).

When this was first proposed people were (justifiably!) scared that remote services would start refusing to work for users who weren’t running (for instance) an approved version of Windows with a verifiable DRM stack. Various practical matters made this impossible. The first was that, until fairly recently, there was no way to demonstrate that the key used to sign the hashes actually came from a TPM[1], so anyone could simply generate a set of valid hashes, sign them with a random key and provide that. The second is that even if you have a signature from a TPM, you have no way of proving that it’s from the TPM that the client booted with (you can MITM the request and either pass it to a client that did boot the appropriate OS or to an external TPM that you’ve plugged into your system after boot and then programmed appropriately). The third is that, well, systems and configurations vary so much that outside very controlled circumstances it’s impossible to know what a “legitimate” set of hashes even is.

As a result, so far remote attestation has tended to be restricted to internal deployments. Some enterprises use it as part of their VPN login process, and we’ve been working on it at CoreOS to enable Kubernetes clusters to verify that workers are in a trustworthy state before running jobs on them. While useful, this isn’t terribly exciting for most people. Can we do better?

Remote attestation has generally been thought of in terms of remote systems requiring that clients attest. But there’s nothing that requires things to be done in that direction. There’s nothing stopping clients from being able to request that a server attest to its state, allowing clients to make informed decisions about whether they should provide confidential data. But the problems that apply to clients apply equally well to servers. Let’s work through them in reverse order.

We have no idea what expected “good” values are

Yes, and this is a problem. CoreOS ships with an expected set of good values, and we had general agreement at the Linux Plumbers Conference that other distributions would start looking at what it would take to do the same. But how do we know that those values are themselves trustworthy? In an ideal world this would involve reproducible builds, allowing anybody to grab the source code for the OS, build it locally and verify that they have the same hashes.

Ok. So we’re able to verify that the booted OS was good. But how about the services? The rkt container runtime supports measuring each container into the TPM, which means we can verify which container images were started. If container images are also built in such a way that they’re reproducible, users can grab the source code, rebuild the container locally and again verify that it has the same hashes. Users can then be sure that the remote site is running the code they’re looking at.

Or can they? Not really – a general purpose OS has all kinds of ways to inject code into containers, so an admin could simply replace the binaries inside the container after it’s been measured, or ptrace() the server, or modify rkt so it generates correct measurements regardless of the image or, well, there’s lots they could do. So a general purpose OS is probably a bad idea here. Instead, let’s imagine an immutable OS that does nothing other than bring up networking and then reads a config file that tells it which container images to download and run. This reduces the amount of code that needs to support reproducible builds, making it easier for a client to verify that the source corresponds to the code the remote system is actually running.

Is this sufficient? Eh sadly no. Even if we know the valid values for the entire OS and every container, we don’t know the legitimate values for the system firmware. Any modified firmware could tamper with the rest of the trust chain, making it possible for you to get valid OS values even if the OS has been subverted. This isn’t a solved problem yet, and really requires hardware vendor support. Let’s handwave this for now, or assert that we’ll have some sidechannel for distributing valid firmware values.

Avoiding TPM MITMing

This one’s more interesting. If I ask the server to attest to its state, it can simply pass that through to a TPM running on another system that’s running a trusted stack and happily serve me content from a compromised stack. Suboptimal. We need some way to tie the TPM identity and the service identity to each other.

Thankfully, we have one. Tor supports running services in the .onion TLD. The key used to identify the service to the Tor network is also used to create the “hostname” of the system. I wrote a pretty hacky implementation that generates that key on the TPM, tying the service identity to the TPM. You can ask the TPM to prove that it generated a key, and that allows you to tie both the key used to run the Tor service and the key used to sign the attestation hashes to the same TPM. You now know that the attestation values came from the same system that’s running the service, and that means you know the TPM hasn’t been MITMed.

How do you know it’s a TPM at all?

This is much easier. See [1].


There’s still various problems around this, including the fact that we don’t have this immutable minimal container OS, that we don’t have the infrastructure to ensure that container builds are reproducible, that we don’t have any known good firmware values and that we don’t have a mechanism for allowing a user to perform any of this validation. But these are all solvable, and it seems like an interesting project.

“Interesting” isn’t necessarily the right metric, though. “Useful” is. And I think this is very useful. If I’m about to upload documents to a SecureDrop instance, it seems pretty important that I be able to verify that it is a SecureDrop instance rather than something pretending to be one. This gives us a mechanism.

The next few years seem likely to raise interest in ensuring that people have secure mechanisms to communicate. I’m not emotionally invested in this one, but if people have better ideas about how to solve this problem then this seems like a good time to talk about them.

[1] More modern TPMs have a certificate that chains from the TPM’s root key back to the TPM manufacturer, so as long as you trust the TPM manufacturer to have kept control of that you can prove that the signature came from a real TPM

comment count unavailable comments

Your project’s RCS history affects ease of contribution (or: don’t squash PRs)

Post Syndicated from Matthew Garrett original https://mjg59.dreamwidth.org/42759.html

Github recently introduced the option to squash commits on merge, and even before then several projects requested that contributors squash their commits after review but before merge. This is a terrible idea that makes it more difficult for people to contribute to projects.

I’m spending today working on reworking some code to integrate with a new feature that was just integrated into Kubernetes. The PR in question was absolutely fine, but just before it was merged the entire commit history was squashed down to a single commit at the request of the reviewer. This single commit contains type declarations, the functionality itself, the integration of that functionality into the scheduler, the client code and a large pile of autogenerated code.

I’ve got some familiarity with Kubernetes, but even then this commit is difficult for me to read. It doesn’t tell a story. I can’t see its growth. Looking at a single hunk of this diff doesn’t tell me whether it’s infrastructural or part of the integration. Given time I can (and have) figured it out, but it’s an unnecessary waste of effort that could have gone towards something else. For someone who’s less used to working on large projects, it’d be even worse. I’m paid to deal with this. For someone who isn’t, the probability that they’ll give up and do something else entirely is even greater.

I don’t want to pick on Kubernetes here – the fact that this Github feature exists makes it clear that a lot of people feel that this kind of merge is a good idea. And there are certainly cases where squashing commits makes sense. Commits that add broken code and which are immediately followed by a series of “Make this work” commits also impair readability and distract from the narrative that your RCS history should present, and Github present this feature as a way to get rid of them. But that ends up being a false dichotomy. A history that looks like “Commit”, “Revert Commit”, “Revert Revert Commit”, “Fix broken revert”, “Revert fix broken revert” is a bad history, as is a history that looks like “Add 20,000 line feature A”, “Add 20,000 line feature B”.

When you’re crafting commits for merge, think about your commit history as a textbook. Start with the building blocks of your feature and make them one commit. Build your functionality on top of them in another. Tie that functionality into the core project and make another commit. Add client support. Add docs. Include your tests. Allow someone to follow the growth of your feature over time, with each commit being a chapter of that story. And never, ever, put autogenerated code in the same commit as an actual functional change.

People can’t contribute to your project unless they can understand your code. Writing clear, well commented code is a big part of that. But so is showing the evolution of your features in an understandable way. Make sure your RCS history shows that, otherwise people will go and find another project that doesn’t make them feel frustrated.

(Edit to add: Sarah Sharp wrote on the same topic a couple of years ago)

comment count unavailable comments

First Round of systemd.conf 2015 Sponsors

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/first-round-of-systemdconf-2015-sponsors.html

First Round of systemd.conf 2015 Sponsors

We are happy to announce the first round of systemd.conf
2015
sponsors!

Our first Gold sponsor is CoreOS!

CoreOS develops software for modern infrastructure that delivers a consistent operating environment for distributed applications. CoreOS’s commercial offering, Tectonic, is an enterprise-ready platform that combines Kubernetes and the CoreOS stack to run Linux containers. In addition CoreOS is the creator and maintainer of open source projects such as CoreOS Linux, etcd, fleet, flannel and rkt. The strategies and architectures that influence CoreOS allow companies like Google, Facebook and Twitter to run their services at scale with high resilience. Learn more about CoreOS here https://coreos.com/, Tectonic here, https://tectonic.com/ or follow CoreOS on Twitter @coreoslinux.

A Silver sponsor is Codethink:

Codethink is a software services consultancy, focusing on engineering reliable systems for long-term deployment with open source technologies.

A Bronze sponsor is Pantheon:

Pantheon is a platform for professional website development, testing, and deployment. Supporting Drupal and WordPress, Pantheon runs over 100,000 websites for the world’s top brands, universities, and media organizations on top of over a million containers.

A Bronze sponsor is Pengutronix:

Pengutronix provides consulting, training and development services for Embedded Linux to customers from the industry. The Kernel Team ports Linux to customer hardware and has more than 3100 patches in the official mainline kernel. In addition to lowlevel ports, the Pengutronix Application Team is responsible for board support packages based on PTXdist or Yocto and deals with system integration (this is where systemd plays an important role). The Graphics Team works on accelerated multimedia tasks, based on the Linux kernel, GStreamer, Qt and web technologies.

We’d like to thank our sponsors for their support! Without sponsors our conference would not be possible!

We’ll shortly announce our second round of sponsors, please stay tuned!

If you’d like to join the ranks of systemd.conf 2015 sponsors, please have a look at our Becoming a Sponsor page!

Reminder! The systemd.conf 2015 Call for Presentations ends on monday, August 31st! Please make sure to submit your proposals on the CfP page until then!

Also, don’t forget to register for the conference! Only a limited number of
registrations are available due to space constraints!
Register here!.

For further details about systemd.conf consult the conference website.