Tag Archives: Enterprise

DevOps Cafe Episode 75 – Barbara Bouldin

Post Syndicated from DevOpsCafeAdmin original http://devopscafe.org/show/2017/9/20/devops-cafe-episode-75-barbara-bouldin.html

A lot has changed (but some things haven’t) 

John and Damon chat with Barbara Bouldin about her first-hand view of the good — and the ugly — through the past few decades of the technology industry. From Bell Labs to the breakup of AT&T (“Ma Bell”) to enterprise software to transforming government agencies today, Barbara’s journey has been an interesting ride.

  

Direct download

Follow John Willis on Twitter: @botchagalupe
Follow Damon Edwards on Twitter: @damonedwards 
Follow Barbara Bouldin on Twitter: @bbouldin771

Notes:

 

Please tweet or leave comments or questions below and we’ll read them on the show!

Catching Up on Some Recent AWS Launches and Publications

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/catching-up-on-some-recent-aws-launches-and-publications/

As I have noted in the past, the AWS Blog Team is working hard to make sure that you know about as many AWS launches and publications as possible, without totally burying you in content! As part of our balancing act, we will occasionally publish catch-up posts to clear our queues and to bring more information to your attention. Here’s what I have in store for you today:

  • Monitoring for Cross-Region Replication of S3 Objects
  • Tags for Spot Fleet Instances
  • PCI DSS Compliance for 12 More Services
  • HIPAA Eligibility for WorkDocs
  • VPC Resizing
  • AppStream 2.0 Graphics Design Instances
  • AMS Connector App for ServiceNow
  • Regtech in the Cloud
  • New & Revised Quick Starts

Let’s jump right in!

Monitoring for Cross-Region Replication of S3 Objects
I told you about cross-region replication for S3 a couple of years ago. As I showed you at the time, you simply enable versioning for the source bucket and then choose a destination region and bucket. You can check the replication status manually, or you can create an inventory (daily or weekly) of the source and destination buckets.

The Cross-Region Replication Monitor (CRR Monitor for short) solution checks the replication status of objects across regions and gives you metrics and failure notifications in near real-time.

To learn more, read the CRR Monitor Implementation Guide and then use the AWS CloudFormation template to Deploy the CRR Monitor.

Tags for Spot Fleet Instances
Spot Instances and Spot Fleets (collections of Spot Instances) give you access to spare compute capacity. We recently gave you the ability to enter tags (key/value pairs) as part of your spot requests and to have those tags applied to the EC2 instances launched to fulfill the request:

To learn more, read Tag Your Spot Fleet EC2 Instances.

PCI DSS Compliance for 12 More Services
As first announced on the AWS Security Blog, we recently added 12 more services to our PCI DSS compliance program, raising the total number of in-scope services to 42. To learn more, check out our Compliance Resources.

HIPAA Eligibility for WorkDocs
In other compliance news, we announced that Amazon WorkDocs has achieved HIPAA eligibility and PCI DSS compliance in all AWS Regions where WorkDocs is available.

VPC Resizing
This feature allows you to extend an existing Virtual Private Cloud (VPC) by adding additional blocks of addresses. This gives you more flexibility and should help you to deal with growth. You can add up to four secondary /16 CIDRs per VPC. You can also edit the secondary CIDRs by deleting them and adding new ones. Simply select the VPC and choose Edit CIDRs from the menu:

Then add or remove CIDR blocks as desired:

To learn more, read about VPCs and Subnets.
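Before editing CIDRs in the console, it can help to sanity-check the plan offline. The following is a minimal sketch using only Python's standard `ipaddress` module. The limit of four secondary blocks and the /16 maximum block size come from the text above. The non-overlap check and the function name `validate_secondary_cidrs` are assumptions added for illustration, not part of any AWS API.

```python
import ipaddress

MAX_SECONDARY_CIDRS = 4    # limit quoted in the post
LARGEST_BLOCK_PREFIX = 16  # a /16 is the largest block the post mentions

def validate_secondary_cidrs(primary, secondaries):
    """Return a list of problems with a planned CIDR layout (empty if none).

    Hypothetical helper: it only checks the plan locally and never calls AWS.
    """
    problems = []
    if len(secondaries) > MAX_SECONDARY_CIDRS:
        problems.append(
            f"too many secondary CIDRs: {len(secondaries)} > {MAX_SECONDARY_CIDRS}"
        )

    networks = [ipaddress.ip_network(primary)]
    networks += [ipaddress.ip_network(cidr) for cidr in secondaries]

    # A smaller prefix length means a larger block, e.g. /8 is larger than /16.
    for net in networks[1:]:
        if net.prefixlen < LARGEST_BLOCK_PREFIX:
            problems.append(f"{net} is larger than a /{LARGEST_BLOCK_PREFIX}")

    # Assumption: blocks attached to one VPC must not overlap each other.
    for i, first in enumerate(networks):
        for second in networks[i + 1:]:
            if first.overlaps(second):
                problems.append(f"{first} overlaps {second}")
    return problems

if __name__ == "__main__":
    print(validate_secondary_cidrs("10.0.0.0/16", ["10.1.0.0/16", "10.0.128.0/17"]))
```

An empty list means the plan passed these local checks. The real service may enforce further restrictions that this sketch does not model.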

AppStream 2.0 Graphics Design Instances
Powered by AMD FirePro S7150x2 Server GPUs and equipped with AMD Multiuser GPU technology, the new Graphics Design instances for Amazon AppStream 2.0 will let you run and stream graphics applications more cost-effectively than ever. The instances are available in four sizes, with 2-16 vCPUs and 7.5 GB to 61 GB of memory.

To learn more, read Introducing Amazon AppStream 2.0 Graphics Design, a New Lower Costs Instance Type for Streaming Graphics Applications.

AMS Connector App for ServiceNow
AWS Managed Services (AMS) provides Infrastructure Operations Management for the Enterprise. Designed to accelerate cloud adoption, it automates common operations such as change requests, patch management, security and backup.

The new AMS integration App for ServiceNow lets you interact with AMS from within ServiceNow, with no need for any custom development or API integration.

To learn more, read Cloud Management Made Easier: AWS Managed Services Now Integrates with ServiceNow.

Regtech in the Cloud
Regtech (as I learned while writing this) is short for regulatory technology, and is all about using innovative technology such as cloud computing, analytics, and machine learning to address regulatory challenges.

Working together with APN Consulting Partner Cognizant, TABB Group recently published a thought leadership paper that explains why regulations and compliance pose huge challenges for our customers in financial services, and shows how AWS can help!

New & Revised Quick Starts
Our Quick Starts team has been cranking out new solutions and making significant updates to the existing ones. Here’s a roster:

Alfresco Content Services (v2) Atlassian Confluence Confluent Platform Data Lake
Datastax Enterprise GitHub Enterprise Hashicorp Nomad HIPAA
Hybrid Data Lake with Wandisco Fusion IBM MQ IBM Spectrum Scale Informatica EIC
Magento (v2) Linux Bastion (v2) Modern Data Warehouse with Tableau MongoDB (v2)
NetApp ONTAP NGINX (v2) RD Gateway Red Hat Openshift
SAS Grid SIOS Datakeeper StorReduce SQL Server (v2)

And that’s all I have for today!

Jeff;

Say Hello to the New Atlassian

Post Syndicated from Chris De Santis original https://www.anchor.com.au/blog/2017/09/hello-new-atlassian/

Who is Atlassian?

Atlassian is an Australian IT company that develops enterprise software, with its best-known products being its issue-tracking app, Jira, and team collaboration and wiki product, Confluence.

In December 2015, Atlassian went public and made their initial public offering (IPO) under the symbol TEAM, valuing them at $4.37 billion. In summary, they big.

What happened?

A facelift

It’s a nice sunny day in Sydney in mid-September of 2017, and Atlassian, after 15 years of consistency, has rebranded, swapping its dreary previous look for a brighter and funner one.

New Atlassian Branding Video

It’s a hell of a lot simpler and, as they show in the above video, it’s going to be used with a lot more creativity and flair in mind. It’s flexible in the sense that they can use it in a lot more ways than before, with a lot more colours than before.

Atlassian Logo Comparison

The blues they’re using now work super-well with the logos on a white background, whereas the white logos on their new champion brand colour, blue, can go both ways: some see it as a bold, daring step which is quite attractive, while others see it as off-putting and not very user-friendly.

New Atlassian Logo Versions

What’s it all mean?

Symbolism

In his announcement blog, Atlassian Co-Founder & Co-CEO, Mike Cannon-Brookes, mentions that the branding change reflects their newly-shifted focus on the concept of teamwork. He goes on to explain that their previous logo depicted the sky-holding Greek titan Atlas and symbolised legendary service and support. But, while it has become renowned, they’re shifting their focus to the concept of teamwork. Why focus on something you’ve already done right, right?

Atlassian Logo Evolution

The new logo contains more symbolism than meets the eye, as it can be interpreted as:

  • Two people high-fiving
  • A mountain to scale
  • The letter “A” (seen as two pillars reinforcing each other)

Product logos

Atlassian has created and acquired many products in their adventure so far, and they all seemed to have a similar art style, but something always felt off about their consistency. Well, needless to say, this was addressed with Atlassian’s very own “identity system”, which is a pretty cool term for a consistent logo-look for 14+ products, to fit them under one brand.

New Atlassian Product Logos

The result is a set of unique marks that “still feel very related to each other”. I, on the other hand, also see a new set of “unknown” Pokémon.

Typeface

New Atlassian Typeface

To add a cherry on top, Atlassian will be using their own custom-made typeface called Charlie Sans, specifically designed to balance legibility with personality; that’s probably the best way to describe it. Otherwise, I’d say, purely as constructive criticism, that there isn’t much difference between it and any of the other staple fonts, e.g. Arial, Verdana, etc. Then again, I’m not a professional designer.

It doesn’t look as distinct as their previous typeface, but, to be fair, it does look very slick next to the new product logos.

Well…

What do you think about it all?

 

Image credits: Atlassian

The post Say Hello to the New Atlassian appeared first on AWS Managed Services by Anchor.

AWS Earns Department of Defense Impact Level 5 Provisional Authorization

Post Syndicated from Chris Gile original https://aws.amazon.com/blogs/security/aws-earns-department-of-defense-impact-level-5-provisional-authorization/

AWS GovCloud (US) Region image

The Defense Information Systems Agency (DISA) has granted the AWS GovCloud (US) Region an Impact Level 5 (IL5) Department of Defense (DoD) Cloud Computing Security Requirements Guide (CC SRG) Provisional Authorization (PA) for six core services. This means that AWS’s DoD customers and partners can now deploy workloads for Controlled Unclassified Information (CUI) exceeding IL4 and for unclassified National Security Systems (NSS).

We have supported sensitive Defense community workloads in the cloud for more than four years, and this latest IL5 authorization is complementary to our FedRAMP High Provisional Authorization that covers 18 services in the AWS GovCloud (US) Region. Our customers now have the flexibility to deploy any range of IL 2, 4, or 5 workloads by leveraging AWS’s services, attestations, and certifications. For example, when the US Air Force needed compute scale to support the Next Generation GPS Operational Control System Program, they turned to AWS.

In partnership with a certified Third Party Assessment Organization (3PAO), an independent validation was conducted to assess both our technical and nontechnical security controls to confirm that they meet the DoD’s stringent CC SRG standards for IL5 workloads. Effective immediately, customers can begin leveraging the IL5 authorization for the following six services in the AWS GovCloud (US) Region:

AWS has been a long-standing industry partner with DoD, federal-agency customers, and private-sector customers to enhance cloud security and policy. We continue to collaborate on the DoD CC SRG, Defense Federal Acquisition Regulation Supplement (DFARS) and other government requirements to ensure that policy makers enact policies to support next-generation security capabilities.

In an effort to reduce the authorization burden of our DoD customers, we’ve worked with DISA to port our assessment results into a format that is easily ingestible by the Enterprise Mission Assurance Support Service (eMASS) system. Additionally, we undertook a separate effort to empower our industry partners and customers to efficiently solve their compliance, governance, and audit challenges by launching the AWS Customer Compliance Center, a portal providing a breadth of AWS-specific compliance and regulatory information.

We look forward to providing sustained cloud security and compliance support at scale for our DoD customers and adding additional services within the IL5 authorization boundary. See AWS Services in Scope by Compliance Program for updates. To request access to AWS’s DoD security and authorization documentation, contact AWS Sales and Business Development. For a list of frequently asked questions related to AWS DoD SRG compliance, see the AWS DoD SRG page.

To learn more about the announcement in this post, tune in for the AWS Automating DoD SRG Impact Level 5 Compliance in AWS GovCloud (US) webinar on October 11, 2017, at 11:00 A.M. Pacific Time.

– Chris Gile, Senior Manager, AWS Public Sector Risk & Compliance

 

 

ShareBeast & AlbumJams Operator Pleads Guilty to Criminal Copyright Infringement

Post Syndicated from Andy original https://torrentfreak.com/sharebeast-albumjams-operator-pleads-guilty-to-criminal-copyright-infringement-170911/

In September 2015, U.S. authorities announced action against a pair of sites involved in music piracy.

ShareBeast.com and AlbumJams.com were allegedly responsible for the distribution of “a massive library” of popular albums and tracks. Both were accused of offering thousands of tracks before their official release dates.

The U.S. Department of Justice (DOJ) placed their now familiar seizure notice on both domains, with the RIAA claiming ShareBeast was the largest illegal file-sharing site operating in the United States. Indeed, the site’s IP addresses at the time indicated at least some hosting taking place in Illinois.

“This is a huge win for the music community and legitimate music services. Sharebeast operated with flagrant disregard for the rights of artists and labels while undermining the legal marketplace,” RIAA Chairman & CEO Cary Sherman commented at the time.

“Millions of users accessed songs from Sharebeast each month without one penny of compensation going to countless artists, songwriters, labels and others who created the music.”

Now, a full two years later, former Sharebeast operator Artur Sargsyan has pleaded guilty to one felony count of criminal copyright infringement, admitting to the unauthorized distribution and reproduction of over 1 billion copies of copyrighted works.

“Through Sharebeast and other related sites, this defendant profited by illegally distributing copyrighted music and albums on a massive scale,” said U.S. Attorney John Horn.

“The collective work of the FBI and our international law enforcement partners have shut down the Sharebeast websites and prevented further economic losses by scores of musicians and artists.”

The Department of Justice says that from 2012 to 2015, 29-year-old Sargsyan used ShareBeast as a pirate music repository, infringing works produced by Ariana Grande, Katy Perry, Beyonce, Kanye West, and Justin Bieber, among others. He linked to that content from Newjams.net and Albumjams.com, two other sites under his control.

The DoJ says that Sargsyan was informed at least 100 times that there was infringing content on ShareBeast but despite the warnings, the content remained available. When those warnings produced no results, the FBI – assisted by law enforcement in the UK and the Netherlands – seized servers used by Sargsyan to distribute the material.

Brad Buckles, EVP, Anti-Piracy at the RIAA, welcomed the guilty plea.

“Sharebeast and its related sites represented the most popular network of infringing music sites operated out of the United States. The network was responsible for providing millions of downloads of popular music files including unauthorized pre-release albums and tracks. This illicit activity was a gut-punch to music creators who were paid nothing by the service,” Buckles said.

“We are incredibly grateful for the government’s commitment to protecting the rights of artists and labels. We especially thank the dedicated agents of the FBI who painstakingly unraveled this criminal enterprise, and U.S. Attorney John Horn and his team for their work and diligence in seeing this case to its successful conclusion.”

Sargsyan, of Glendale, California, will be sentenced December 4 before U.S. District Judge Timothy C. Batten.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Five Must-Watch Software Engineering Talks

Post Syndicated from Bozho original https://techblog.bozho.net/five-must-watch-software-engineering-talks/

We’ve all watched dozens of talks online. And we probably don’t remember many of them. But some do stick in our heads and we eventually watch them again (and again) because we know they are good and we want to remember the things that were said there. So I decided to compile a small list of talks that I find very insightful, useful and that have, in a way, shaped my software engineering practice or expanded my understanding of the software world.

1. How To Design A Good API and Why it Matters by Joshua Bloch – this is a must-watch (well, obviously all are). And don’t skip it because “you are not writing APIs” – everyone is writing APIs. Maybe not used by hundreds of other developers, but used by at least several, and that’s a good enough reason. Having watched this talk I ended up buying and reading one of the few software books that I have actually read end-to-end – “Effective Java” (the talk uses Java as an example, but the principles aren’t limited to Java)

2. How to write clean, testable code by Miško Hevery. Maybe there are tons of talks about testing code, maybe Uncle Bob has a more popular one, but I found this one particularly practical and to the point: writing testable code is a skill, and testable code is good code. (By the way, the speaker then wrote AngularJS)

3. Back to basics: the mess we’ve made of our fundamental data types by Jon Skeet. The title says it all, and it’s nice to be reminded of how fragile even the basics of programming languages are.

4. The Danger of Software Patents by Richard Stallman. This one strays a little from writing software, but puts software in its legal context: how legislation loopholes affect code reuse and the business practices related to it. It’s a bit long, but I think worth it.

5. Does my ESB look big in this? by Martin Fowler and Jim Webber. It’s about bloated enterprise architecture and how to actually do enterprise architecture without complex and expensive middleware. (Unfortunately it’s not on YouTube, so no embedding).

Although this is not a “ranking”, I’d like to add a few honourable mentions: the famous “WAT” lightning talk, showing some quirks of Ruby and JavaScript; “The future of programming” by Bret Victor; and “You suck at Excel” by Joel Spolsky, which isn’t really about creating software, but it’s cool. And a tiny shameless plug for my “Common sense driven development” talk.

I hope the compilation is useful and enlightening. Enjoy.

The post Five Must-Watch Software Engineering Talks appeared first on Bozho's tech blog.

AWS Hot Startups – August 2017

Post Syndicated from Tina Barr original https://aws.amazon.com/blogs/aws/aws-hot-startups-august-2017/

There’s no doubt about it – Artificial Intelligence is changing the world and how it operates. Across industries, organizations from startups to Fortune 500s are embracing AI to develop new products, services, and opportunities that are more efficient and accessible for their consumers. From driverless cars to better preventative healthcare to smart home devices, AI is driving innovation at a fast rate and will continue to play a more important role in our everyday lives.

This month we’d like to highlight startups using AI solutions to help companies grow. We are pleased to feature:

  • SignalBox – a simple and accessible deep learning platform to help businesses get started with AI.
  • Valossa – an AI video recognition platform for the media and entertainment industry.
  • Kaliber – innovative applications for businesses using facial recognition, deep learning, and big data.

SignalBox (UK)

In 2016, SignalBox founder Alain Richardt was hearing the same comments being made by developers, data scientists, and business leaders. They wanted to get into deep learning but didn’t know where to start. Alain saw an opportunity to commodify and apply deep learning by providing a platform that does the heavy lifting with an easy-to-use web interface, blueprints for common tasks, and just a single click to productize the models. With SignalBox, companies can start building deep learning models with no coding at all – they just select a data set, choose a network architecture, and go. SignalBox also offers step-by-step tutorials, tips and tricks from industry experts, and consulting services for customers that want an end-to-end AI solution.

SignalBox offers a variety of solutions that are being used across many industries for energy modeling, fraud detection, customer segmentation, insurance risk modeling, inventory prediction, real estate prediction, and more. Existing data science teams are using SignalBox to accelerate their innovation cycle. One innovative UK startup, Energi Mine, recently worked with SignalBox to develop deep networks that predict anomalous energy consumption patterns and do time series predictions on energy usage for businesses with hundreds of sites.

SignalBox uses a variety of AWS services including Amazon EC2, Amazon VPC, Amazon Elastic Block Store, and Amazon S3. The ability to rapidly provision EC2 GPU instances has been a critical factor in their success, both in keeping their operational expenses low and in speed to market. The Amazon API Gateway has allowed for operational automation, giving SignalBox the ability to control its infrastructure.

To learn more about SignalBox, visit here.

Valossa (Finland)

As students at the University of Oulu in Finland, the Valossa founders spent years doing research in the computer science and AI labs. During that time, the team witnessed how the world was moving beyond text, with video playing a greater role in day-to-day communication. This spawned an idea to use technology to automatically understand what an audience is viewing and share that information with a global network of content producers. Since 2015, Valossa has been building next generation AI applications to benefit the media and entertainment industry and is moving beyond the capabilities of traditional visual recognition systems.

Valossa’s AI is capable of analyzing any video stream. The AI studies a vast array of data within videos and converts that information into descriptive tags, categories, and overviews automatically. Basically, it sees, hears, and understands videos like a human does. The Valossa AI can detect people, visual and auditory concepts, and key speech elements, and it labels explicit content to make moderating and filtering content simpler. Valossa’s solutions are designed to provide value for the content production workflow, from media asset management to end-user applications for content discovery. AI-annotated content allows online viewers to jump directly to their favorite scenes or search specific topics and actors within a video.

Valossa leverages AWS to deliver the industry’s first complete AI video recognition platform. Using Amazon EC2 GPU instances, Valossa can easily scale their computation capacity based on customer activity. High-volume video processing with GPU instances provides the necessary speed for time-sensitive workflows. The geo-located Availability Zones in EC2 allow Valossa to bring resources close to their customers to minimize network delays. Valossa also uses Amazon S3 for video ingestion and to provide end-user video analytics, which makes managing and accessing media data easy and highly scalable.

To see how Valossa works, check out www.WhatIsMyMovie.com or enable the Alexa Skill, Valossa Movie Finder. To try the Valossa AI, sign up for free at www.valossa.com.

Kaliber (San Francisco, CA)

Serial entrepreneurs Ray Rahman and Risto Haukioja founded Kaliber in 2016. The pair had previously worked in startups building smart cities and online privacy tools, and teamed up to bring AI to the workplace and change the hospitality industry. Our world is designed to appeal to our senses – stores and warehouses have clearly marked aisles, products are colorfully packaged, and we use these designs to differentiate one thing from another. We tell each other apart by our faces, and previously that was something only humans could measure or act upon. Kaliber is using facial recognition, deep learning, and big data to create solutions for business use. Markets and companies that aren’t typically associated with cutting-edge technology will be able to use their existing camera infrastructure in a whole new way, making them more efficient and better able to serve their customers.

Computer video processing is rapidly expanding, and Kaliber believes that video recognition will extend to far more than security cameras and robots. Using the clients’ network of in-house cameras, Kaliber’s platform extracts key data points and maps them to actionable insights using their machine learning (ML) algorithm. Dashboards connect users to the client’s BI tools via the Kaliber enterprise APIs, and managers can view these analytics to improve their real-world processes, taking immediate corrective action with real-time alerts. Kaliber’s Real Metrics are aimed at combining the power of image recognition with ML to ultimately provide a more meaningful experience for all.

Kaliber uses many AWS services, including Amazon Rekognition, Amazon Kinesis, AWS Lambda, Amazon EC2 GPU instances, and Amazon S3. These services have been instrumental in helping Kaliber meet the needs of enterprise customers in record time.

Learn more about Kaliber here.

Thanks for reading and we’ll see you next month!

-Tina

 

Hard Drive Stats for Q2 2017

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/hard-drive-failure-stats-q2-2017/

Backblaze Drive Stats Q2 2017

In this update, we’ll review the Q2 2017 and lifetime hard drive failure rates for all our current drive models. We also look at how our drive migration strategy is changing the drives we use and we’ll check in on our enterprise class drives to see how they are doing. Along the way we’ll share our observations and insights and as always we welcome your comments and critiques.

Since our last report for Q1 2017, we have added 635 additional hard drives to bring us to the 83,151 drives we’ll focus on. In Q1 we added over 10,000 new drives to the mix, so adding just 635 in Q2 seems “odd.” In fact, we added 4,921 new drives and retired 4,286 old drives as we migrated from lower density drives to higher density drives. We cover more about migrations later on, but first let’s look at the Q2 quarterly stats.

Hard Drive Stats for Q2 2017

We’ll begin our review by looking at the statistics for the period of April 1, 2017 through June 30, 2017 (Q2 2017). This table includes 17 different 3 ½” drive models that were operational during the indicated period, ranging in size from 3 to 8 TB.

Quarterly Hard Drive Failure Rates for Q2 2017

When looking at the quarterly numbers, remember to look for those drives with at least 50,000 drive days for the quarter. That works out to about 550 drives running the entire 91-day quarter. That’s a good sample size. If the sample size is below that, the failure rates can be skewed based on a small change in the number of drive failures.
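The 550-drive figure can be checked with a line of arithmetic. This is a minimal sketch that assumes the threshold is counted in drive days (one drive in service for one day) and that Q2 2017 has 91 days. The function name is hypothetical.

```python
# Q2 2017 runs April 1 through June 30: 30 + 31 + 30 days.
DAYS_IN_Q2_2017 = 30 + 31 + 30

def drives_for_full_quarter(drive_days, days_in_quarter=DAYS_IN_Q2_2017):
    """Number of drives that, running every day of the quarter,
    would accumulate the given number of drive days."""
    return drive_days / days_in_quarter

if __name__ == "__main__":
    # Prints a value just under 550, matching the "about 550" in the text.
    print(f"{drives_for_full_quarter(50_000):.1f} drives")
```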

As noted previously, we use the quarterly numbers to look for trends. So this time we’ve included a trend indicator in the table. The “Q2Q Trend” column is short for quarter-to-quarter trend, i.e. last quarter to this quarter. We can add, change, or delete trend columns depending on community interest. Let us know what you think in the comments.

Good Migrations

In Q2 we continued with our data migration program. For us, a drive migration means we intentionally remove a good drive from service and replace it with another drive. Drives that are removed via migrations are not counted as failed. Once they are removed they stop accumulating drive hours and other stats in our system.

There are three primary drivers for our migration program.

  1. Increase Storage Density – For example, in Q3 we replaced 3 TB drives with 8 TB drives, more than doubling the amount of storage in a given Storage Pod for the same footprint. The cost of electricity was nominally more with the 8 TB drives, but the increase in density more than offset the additional cost. For those interested you can read more about the cost of cloud storage here.
  2. Backblaze Vaults – Our Vault architecture has proven to be more cost effective over the past two years than using stand-alone Storage Pods. A major goal of the migration program is to have the entire Backblaze cloud deployed on the highly efficient and resilient Backblaze Vault architecture.
  3. Balancing the Load – With our Phoenix data center online and accepting data, we have migrated some systems to the Phoenix DC. Don’t worry, we didn’t put your data on a truck and drive it to Phoenix. We simply built new systems there and transferred the data from our Northern California DC. In the process, we are gaining valuable insights as we move towards being able to replicate data between the two data centers.

During Q2 we migrated the data on 155 systems, giving nearly 30 petabytes of data a new, more durable, place to call home. There are still 644 individual Storage Pods (Storage Pod Classics, as we call them) left to migrate to the Backblaze Vault architecture.

Just in case you don’t know, a Backblaze Vault is a logical collection of 20 beefy Storage Pods (not Classics). Using our own Reed-Solomon erasure coding library, data is spread out across the 20 Pods into 17 data shards and 3 parity shards. The data and parity shards of each arriving data blob can be stored on different Storage Pods in a given Backblaze Vault.
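To make the 17 + 3 layout concrete, here is a small sketch of what it implies for storage efficiency and fault tolerance. It relies on the standard property of Reed-Solomon codes that any 17 of the 20 shards are enough to rebuild the data. It also assumes each of a blob's 20 shards lives on a different Pod, and it does not use or reproduce Backblaze's actual library. All names are hypothetical.

```python
DATA_SHARDS = 17    # from the text
PARITY_SHARDS = 3   # from the text
TOTAL_SHARDS = DATA_SHARDS + PARITY_SHARDS  # one shard per Pod (assumption)

def storage_efficiency():
    """Fraction of raw capacity that holds data rather than parity."""
    return DATA_SHARDS / TOTAL_SHARDS

def is_recoverable(failed_pods):
    """True if a blob can still be rebuilt when the given Pods are unavailable.

    With one shard per Pod, the blob survives as long as at least
    DATA_SHARDS shards remain readable.
    """
    surviving = TOTAL_SHARDS - len(set(failed_pods))
    return surviving >= DATA_SHARDS

if __name__ == "__main__":
    print(f"efficiency: {storage_efficiency():.0%}")
    print("3 Pods down:", is_recoverable({0, 5, 19}))
    print("4 Pods down:", is_recoverable({0, 5, 12, 19}))
```

In other words, under these assumptions a Vault gives up 15% of its raw capacity to parity in exchange for tolerating the loss of any three Pods.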

Lifetime Hard Drive Failure Rates for Current Drives

The table below shows the failure rates for the hard drive models we had in service as of June 30, 2017. This is over the period beginning in April 2013 and ending June 30, 2017. If you are interested in the hard drive failure rates for all the hard drives we’ve used over the years, please refer to our 2016 hard drive review.

Cumulative Hard Drive Failure Rates

Enterprise vs Consumer Drives

We added 3,595 enterprise class 8 TB drives in Q2 bringing our total to 6,054 drives. You may be tempted to compare the failure rates of the 8 TB enterprise drive (model: ST8000NM005) to the consumer 8 TB drive (model: ST8000DM002), and conclude the enterprise drives fail at a higher rate. Let’s not jump to that conclusion yet, as the average operational age of the enterprise drives is only 2.11 months.

There are some insights we can gain from the current data. The enterprise drives have 363,282 drive days and an annualized failure rate of 1.61%. If we look back at our data, we find that as of Q3 2016, the 8 TB consumer drives had 422,263 drive days with an annualized failure rate of 1.60%. That means that when both drive models had a similar number of drive days, they had nearly the same annualized failure rate. There are no conclusions to be made here, but the observation is worth considering as we gather data for our comparison.
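For readers who want to reproduce these percentages, a common way to compute an annualized failure rate is to divide the number of failures by the number of drive years in service. The sketch below assumes the 363,282 figure counts drive days. The failure count of 16 is not stated in this report; it is an assumption chosen because it reproduces the quoted 1.61%.

```python
def annualized_failure_rate(failures, drive_days):
    """Failures per drive year, expressed as a percentage."""
    drive_years = drive_days / 365
    return 100 * failures / drive_years

if __name__ == "__main__":
    # 16 failures is an assumed count, not a figure from the report.
    print(f"{annualized_failure_rate(16, 363_282):.2f}%")
    # One extra failure moves the rate noticeably at this sample size.
    print(f"{annualized_failure_rate(17, 363_282):.2f}%")
```

The second line shows why young drive populations deserve caution: with so few drive years accumulated, a single additional failure shifts the rate by about a tenth of a percentage point.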

Next quarter, we should have enough data to compare the 8 TB drives, but by then the 8 TB drives could be “antiques.” In the next week or so, we’ll be installing 12 TB hard drives in a Backblaze Vault. Each 60-drive Storage Pod in the Vault would have 720 TB of storage available and a 20-pod Backblaze Vault would have 14.4 petabytes of raw storage.
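The capacity figures follow directly from the drive and Pod counts. A quick sketch of the arithmetic, using decimal units (1 PB = 1,000 TB) and hypothetical function names:

```python
DRIVES_PER_POD = 60   # from the text
PODS_PER_VAULT = 20   # from the text
TB_PER_PB = 1000      # decimal units, as drive vendors use

def pod_capacity_tb(drive_tb):
    """Raw capacity of one Storage Pod in terabytes."""
    return DRIVES_PER_POD * drive_tb

def vault_capacity_pb(drive_tb):
    """Raw capacity of one Backblaze Vault in petabytes."""
    return PODS_PER_VAULT * pod_capacity_tb(drive_tb) / TB_PER_PB

if __name__ == "__main__":
    print(pod_capacity_tb(12), "TB per Pod")
    print(vault_capacity_pb(12), "PB per Vault")
    # The same Vault built from 8 TB drives, for comparison.
    print(vault_capacity_pb(8), "PB per Vault with 8 TB drives")
```

These are raw numbers; the space taken by parity shards is not subtracted.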

Better Late Than Never

Sorry for being a bit late with the hard drive stats report this quarter. We were ready to go last week, then this happened. Some folks here thought that was more important than our Q2 Hard Drive Stats. Go figure.

Drive Stats at the Storage Developers Conference

We will be presenting at the Storage Developers Conference in Santa Clara on Monday September 11th at 8:30am. We’ll be reviewing our drive stats along with some interesting observations from the SMART stats we also collect. The conference is the leading event for technical discussions and education on the latest storage technologies and standards. Come join us.

The Data For This Review

If you are interested in the data from the two tables in this review, you can download an Excel spreadsheet containing the two tables. Note: the domain for this download will be f001.backblazeb2.com.

You also can download the entire data set we use for these reports from our Hard Drive Test Data page. You can download and use this data for free for your own purposes. All we ask are three things: 1) you cite Backblaze as the source if you use the data, 2) you accept that you are solely responsible for how you use the data, and 3) you do not sell this data to anyone. It is free.

Good luck, and let us know if you find anything interesting.

The post Hard Drive Stats for Q2 2017 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

SUSE reaffirms support for Btrfs

Post Syndicated from corbet original https://lwn.net/Articles/731848/rss

SUSE has let it
be known
that it plans to continue developing and supporting the Btrfs
filesystem, regardless of what other distributors do. “If one of the rather small contributors to the btrfs filesystem announced to not support btrfs for production systems: should you wonder, whether SUSE, strongest contributor to btrfs today, would stop investing into btrfs?

You probably shouldn’t.

SUSE is committed to btrfs as the default filesystem for SUSE Linux Enterprise, and beyond.”

From Data Lake to Data Warehouse: Enhancing Customer 360 with Amazon Redshift Spectrum

Post Syndicated from Dylan Tong original https://aws.amazon.com/blogs/big-data/from-data-lake-to-data-warehouse-enhancing-customer-360-with-amazon-redshift-spectrum/

Achieving a 360°-view of your customer has become increasingly challenging as companies embrace omni-channel strategies, engaging customers across websites, mobile, call centers, social media, physical sites, and beyond. The promise of a web where online and physical worlds blend makes understanding your customers more challenging, but also more important. Businesses that are successful in this medium have a significant competitive advantage.

The big data challenge requires the management of data at high velocity and volume. Many customers have identified Amazon S3 as a great data lake solution that removes the complexities of managing a highly durable, fault tolerant data lake infrastructure at scale and economically.

AWS data services substantially lessen the heavy lifting of adopting technologies, allowing you to spend more time on what matters most—gaining a better understanding of customers to elevate your business. In this post, I show how a recent Amazon Redshift innovation, Redshift Spectrum, can enhance a customer 360 initiative.

Customer 360 solution

A successful customer 360 view benefits from using a variety of technologies to deliver different forms of insights. These could range from real-time analysis of streaming data from wearable devices and mobile interactions to historical analysis that requires interactive, on demand queries on billions of transactions. In some cases, insights can only be inferred through AI via deep learning. Finally, the value of your customer data and insights can’t be fully realized until it is operationalized at scale—readily accessible by fleets of applications. Companies are leveraging AWS for the breadth of services that cover these domains, to drive their data strategy.

A number of AWS customers stream data from various sources into an S3 data lake through Amazon Kinesis. They use Kinesis and technologies in the Hadoop ecosystem like Spark running on Amazon EMR to enrich this data. High-value data is loaded into an Amazon Redshift data warehouse, which allows users to analyze and interact with data through a choice of client tools. Redshift Spectrum expands on this analytics platform by enabling Amazon Redshift to blend and analyze data beyond the data warehouse and across a data lake.

The following diagram illustrates the workflow for such a solution.

This solution delivers value by:

  • Reducing complexity and time to value for deeper insights. For instance, an existing data model in Amazon Redshift may provide insights across dimensions such as customer, geography, time, and product on metrics from sales and financial systems. Down the road, you may gain access to streaming data sources like customer-care call logs and website activity that you want to blend in with the sales data on the same dimensions to understand how web and call center experiences may be correlated with sales performance. Redshift Spectrum can join these dimensions in Amazon Redshift with data in S3 to allow you to quickly gain new insights, and avoid the slow and more expensive alternative of fully integrating these sources with your data warehouse.
  • Providing an additional avenue for optimizing costs and performance. In cases like call logs and clickstream data where volumes could be many TBs to PBs, storing the data exclusively in S3 yields significant cost savings. Interactive analysis on massive datasets may now be economically viable in cases where data was previously analyzed periodically through static reports generated by inexpensive batch processes. In some cases, you can improve the user experience while simultaneously lowering costs. Redshift Spectrum is powered by a large-scale infrastructure external to your Amazon Redshift cluster, and excels at scanning and aggregating large volumes of data. For instance, your analysts may be performing data discovery on customer interactions across millions of consumers over years of data across various channels. On this large dataset, certain queries could be slow if you didn’t have a large Amazon Redshift cluster. Alternatively, you could use Redshift Spectrum to achieve a better user experience with a smaller cluster.

Proof of concept walkthrough

To make evaluation easier for you, I’ve conducted a Redshift Spectrum proof-of-concept (PoC) for the customer 360 use case. For those who want to replicate the PoC, the instructions, AWS CloudFormation templates, and public data sets are available in the GitHub repository.

The remainder of this post is a journey through the project, observing best practices in action, and learning how you can achieve business value. The walkthrough involves:

  • An analysis of performance data from the PoC environment involving queries that demonstrate blending and analysis of data across Amazon Redshift and S3. Observe that great results are achievable at scale.
  • Guidance by example on query tuning, design, and data preparation to illustrate the optimization process. This includes tuning a query that combines clickstream data in S3 with customer and time dimensions in Amazon Redshift, and aggregates ~1.9 B out of 3.7 B+ records in under 10 seconds with a small cluster!
  • Guidance and measurements to help you decide between two options: accessing and analyzing data exclusively in Amazon Redshift, or using Redshift Spectrum to access data left in S3.

Stream ingestion and enrichment

The focus of this post isn’t stream ingestion and enrichment on Kinesis and EMR, but be mindful of performance best practices on S3 to ensure good streaming and query performance:

  • Use random object keys: The data files provided for this project are prefixed with SHA-256 hashes to prevent hot partitions. This helps ensure optimal request rates for PUT requests from the incoming stream, as well as for queries from large Amazon Redshift clusters that could send a large number of parallel GET requests.
  • Micro-batch your data stream: S3 isn’t optimized for small random write workloads. Your datasets should be micro-batched into large files. For instance, the “parquet-1” dataset provided batches >7 million records per file. The optimal file size for Redshift Spectrum is usually in the 100 MB to 1 GB range.
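As a rough illustration of the first practice, the sketch below prepends a short SHA-256 prefix to each object key so that keys spread across the keyspace. The key layout, the `hashed_key` helper, and the 8-character prefix length are assumptions made for this example, not the exact scheme used by the project's data files:

```python
import hashlib


def hashed_key(filename: str, prefix_len: int = 8) -> str:
    """Prepend a short SHA-256 hex prefix to an object key (hypothetical layout)."""
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}-{filename}"


# Hypothetical file names for a few micro-batched objects.
for i in range(3):
    print(hashed_key(f"part-{i:05d}.parquet"))
```

Because the prefix is derived from the file name, the same name always maps to the same key, so writers and readers can compute it independently.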

If you have an edge case that may pose scalability challenges, AWS would love to hear about it. For further guidance, talk to your solutions architect.

Environment

The project consists of the following environment:

  • Amazon Redshift cluster: 4 X dc1.large
  • Data:
    • Time and customer dimension tables are stored on all Amazon Redshift nodes (ALL distribution style):
      • The data originates from the DWDATE and CUSTOMER tables in the Star Schema Benchmark
      • The customer table contains attributes for 3 million customers.
      • The time data is at the day-level granularity, and spans 7 years, from the start of 1992 to the end of 1998.
    • The clickstream data is stored in an S3 bucket, and serves as a fact table.
      • Various copies of this dataset in CSV and Parquet format have been provided, for reasons to be discussed later.
      • The data is a modified version of the uservisits dataset from AMPLab’s Big Data Benchmark, which was generated by Intel’s Hadoop benchmark tools.
      • Changes were minimal, so that existing test harnesses for this test can be adapted:
        • Increased the 751,754,869-row dataset 5X to 3,758,774,345 rows.
        • Added surrogate keys to support joins with customer and time dimensions. These keys were distributed evenly across the entire dataset to represent user visits from six customers over seven years.
        • Values for the visitDate column were replaced to align with the 7-year timeframe, and the added time surrogate key.

Queries across the data lake and data warehouse 

Imagine a scenario where a business analyst plans to analyze clickstream metrics like ad revenue over time and by customer, market segment and more. The example below is a query that achieves this effect: 

The query part highlighted in red retrieves clickstream data in S3, and joins the data with the time and customer dimension tables in Amazon Redshift through the part highlighted in blue. The query returns the total ad revenue for three customers over the last three months, along with info on their respective market segment.

Unfortunately, this query takes around three minutes to run, and doesn’t enable the interactive experience that you want. However, there are a number of performance optimizations that you can implement to achieve the desired performance.

Performance analysis

Two key utilities provide visibility into Redshift Spectrum:

  • EXPLAIN
    Provides the query execution plan, which includes info around what processing is pushed down to Redshift Spectrum. Steps in the plan that include the prefix S3 are executed on Redshift Spectrum. For instance, the plan for the previous query has the step “S3 Seq Scan clickstream.uservisits_csv10”, indicating that Redshift Spectrum performs a scan on S3 as part of the query execution.
  • SVL_S3QUERY_SUMMARY
    Statistics for Redshift Spectrum queries are stored in this table. While the execution plan presents cost estimates, this table stores actual statistics for past query runs.

You can get the statistics of your last query by inspecting the SVL_S3QUERY_SUMMARY table with the condition (query = pg_last_query_id()). Inspecting the previous query reveals that the entire dataset of nearly 3.8 billion rows was scanned to retrieve fewer than 66.3 million rows. Improving scan selectivity in your query could yield substantial performance improvements.
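To put that scan in perspective, here is a small Python calculation of how much of the scanned data the query actually needed. It uses the exact row counts quoted elsewhere in this post (3,758,774,345 rows in the dataset and 66,270,117 rows required); the variable names are illustrative:

```python
scanned_rows = 3_758_774_345   # rows in the full clickstream dataset
needed_rows = 66_270_117       # rows the query actually requires

selectivity = needed_rows / scanned_rows
wasted = 1 - selectivity

print(f"{selectivity:.2%} of the scanned rows were needed")
print(f"{wasted:.2%} of the scan was wasted work")
```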

Partitioning

Partitioning is a key means to improving scan efficiency. In your environment, the data and tables have already been organized, and configured to support partitions. For more information, see the PoC project setup instructions. The clickstream table was defined as:

CREATE EXTERNAL TABLE clickstream.uservisits_csv10
…
PARTITIONED BY(customer int4, visitYearMonth int4)

The entire 3.8 billion-row dataset is organized as a collection of large files where each file contains data exclusive to a particular customer and month in a year. This allows you to partition your data into logical subsets by customer and year/month. With partitions, the query engine can target a subset of files:

  • Only for specific customers
  • Only data for specific months
  • A combination of specific customers and year/months
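The effect of partition pruning can be sketched with a toy model in Python. It assumes one partition per (customer, year/month) pair, six customers, and the 84 months from 199201 to 199812; the `prune` helper and the choice of the final three months are assumptions made for illustration:

```python
from itertools import product

customers = range(1, 7)  # six customers
months = [y * 100 + m for y in range(1992, 1999) for m in range(1, 13)]

# One partition per (customer, visitYearMonth) pair.
partitions = list(product(customers, months))


def prune(parts, wanted_customers=None, wanted_months=None):
    """Keep only partitions that match the filters; None means no filter."""
    return [
        (c, ym)
        for c, ym in parts
        if (wanted_customers is None or c in wanted_customers)
        and (wanted_months is None or ym in wanted_months)
    ]


by_customer = prune(partitions, wanted_customers={1, 2, 3})
by_both = prune(partitions, {1, 2, 3}, {199810, 199811, 199812})

print(len(partitions), len(by_customer), len(by_both))
```

In this model, filtering on customer alone halves the partitions to read, while filtering on both keys leaves only a handful out of several hundred.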

You can use partitions in your queries. Instead of joining your customer data on the surrogate customer key (that is, c.c_custkey = uv.custKey), the partition key “customer” should be used instead:

SELECT c.c_name, c.c_mktsegment, t.prettyMonthYear, SUM(uv.adRevenue)
…
ON c.c_custkey = uv.customer
…
ORDER BY c.c_name, c.c_mktsegment, uv.yearMonthKey  ASC

This query should run approximately twice as fast as the previous query. If you look at the statistics for this query in SVL_S3QUERY_SUMMARY, you see that only half the dataset was scanned. This is expected because your query is on three out of six customers on an evenly distributed dataset. However, the scan is still inefficient, and you can benefit from using your year/month partition key as well:

SELECT c.c_name, c.c_mktsegment, t.prettyMonthYear, SUM(uv.adRevenue)
…
ON c.c_custkey = uv.customer
…
ON uv.visitYearMonth = t.d_yearmonthnum
…
ORDER BY c.c_name, c.c_mktsegment, uv.visitYearMonth ASC

All joins between the tables are now using partitions. Upon reviewing the statistics for this query, you should observe that Redshift Spectrum scans and returns the exact number of rows, 66,270,117. If you run this query a few times, you should see execution time in the range of 8 seconds, which is a 22.5X improvement on your original query!

Predicate pushdown and storage optimizations 

Previously, I mentioned that Redshift Spectrum performs processing through large-scale infrastructure external to your Amazon Redshift cluster. It is optimized for performing large scans and aggregations on S3. In fact, Redshift Spectrum may even out-perform a medium size Amazon Redshift cluster on these types of workloads with the proper optimizations. There are two important variables to consider for optimizing large scans and aggregations:

  • File size and count. As a general rule, use files 100 MB-1 GB in size, as Redshift Spectrum and S3 are optimized for reading this object size. However, the number of files a query operates on is directly correlated with the parallelism the query can achieve. There is an inverse relationship between file size and count: the bigger the files, the fewer files there are for the same dataset. Consequently, there is a trade-off between optimizing for object read performance, and the amount of parallelism achievable on a particular query. Large files are best for large scans as the query likely operates on a sufficiently large number of files. For queries that are more selective and operate on fewer files, you may find that smaller files allow for more parallelism.
  • Data format. Redshift Spectrum supports various data formats. Columnar formats like Parquet can sometimes lead to substantial performance benefits by providing compression and more efficient I/O for certain workloads. Generally, format types like Parquet should be used for query workloads involving large scans, and high attribute selectivity. Again, there are trade-offs as formats like Parquet require more compute power to process than plaintext. For queries on smaller subsets of data, the I/O efficiency benefit of Parquet is diminished. At some point, Parquet may perform the same or slower than plaintext. Latency, compression rates, and the trade-off between user experience and cost should drive your decision.
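The inverse relationship between file size and file count is simple arithmetic. The Python sketch below uses a hypothetical 100,000 MB dataset (not a figure from this project) to show how the number of files available for parallel reads shrinks as files grow:

```python
import math


def file_count(dataset_mb: float, file_mb: float) -> int:
    """Number of files needed to hold a dataset at a given file size."""
    return math.ceil(dataset_mb / file_mb)


dataset_mb = 100_000  # hypothetical dataset size
for file_mb in (100, 250, 1000):
    print(f"{file_mb:>5} MB files -> {file_count(dataset_mb, file_mb)} files")
```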

To help illustrate how Redshift Spectrum performs on these large aggregation workloads, run a basic query that aggregates the entire ~3.7 billion record dataset on Redshift Spectrum, and compare that with running the query exclusively on Amazon Redshift:

SELECT uv.custKey, COUNT(uv.custKey)
FROM <your clickstream table> as uv
GROUP BY uv.custKey
ORDER BY uv.custKey ASC

For the Amazon Redshift test case, the clickstream data is loaded, and distributed evenly across all nodes (even distribution style) with optimal column compression encodings prescribed by Amazon Redshift’s ANALYZE COMPRESSION command.

The Redshift Spectrum test case uses a Parquet data format with each file containing all the data for a particular customer in a month. This results in files mostly in the range of 220-280 MB, and in effect, is the largest file size for this partitioning scheme. If you run tests with the other datasets provided, you see that this data format and size is optimal and out-performs others by ~60X. 

Performance differences will vary depending on the scenario. The important takeaway is to understand the testing strategy and the workload characteristics where Redshift Spectrum is likely to yield performance benefits. 

The following chart compares the query execution time for the two scenarios. The results indicate that you would have to pay for 12 X DC1.Large nodes to get performance comparable to using a small Amazon Redshift cluster that leverages Redshift Spectrum. 

Chart showing simple aggregation on ~3.7 billion records

So you’ve validated that Spectrum excels at performing large aggregations. Could you benefit by pushing more work down to Redshift Spectrum in your original query? It turns out that you can, by making the following modification:

The clickstream data is stored at a day-level granularity for each customer while your query rolls up the data to the month level per customer. In the earlier query that uses the year/month partition key, you optimized the query so that it only scans and retrieves the data required, but the day-level data is still sent back to your Amazon Redshift cluster for joining and aggregation. The query shown here pushes aggregation work down to Redshift Spectrum as indicated by the query plan:

In this query, Redshift Spectrum aggregates the clickstream data to the month level before it is returned to the Amazon Redshift cluster and joined with the dimension tables. This query should complete in about 4 seconds, which is roughly twice as fast as only using the partition key. The speed increase is evident upon reviewing the SVL_S3QUERY_SUMMARY table:

  • Bytes scanned is 21.6X less because of the Parquet data format.
  • Only 90 records are returned back to the Amazon Redshift cluster as a result of the push-down, instead of ~66.2 million, leading to substantially less join overhead, and about 530 MB less data sent back to your cluster.
  • No adverse change in average parallelism.
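The idea behind the push-down can be sketched in Python with synthetic data. This is a toy model rather than the actual query or the Redshift Spectrum engine: the fact rows, the customer names, and both helper functions are made up for illustration. It shows that aggregating before the join gives the same totals while returning far fewer rows to the joining side:

```python
from collections import defaultdict

# Synthetic day-level fact rows: (customer, yearmonth, day, ad_revenue).
fact = [
    (c, ym, d, 1.0)
    for c in (1, 2, 3)
    for ym in (199810, 199811, 199812)
    for d in range(1, 29)
]

# Small dimension table: customer key -> name (hypothetical values).
dim_customer = {1: "Customer#1", 2: "Customer#2", 3: "Customer#3"}


def join_then_aggregate(rows):
    """Return every day-level row to the joining side, then join and sum."""
    totals = defaultdict(float)
    for c, ym, _day, rev in rows:
        totals[(dim_customer[c], ym)] += rev
    return dict(totals), len(rows)  # rows returned: all of them


def aggregate_then_join(rows):
    """Sum per (customer, month) at the scan layer, then join the small result."""
    partial = defaultdict(float)
    for c, ym, _day, rev in rows:
        partial[(c, ym)] += rev
    joined = {(dim_customer[c], ym): rev for (c, ym), rev in partial.items()}
    return joined, len(partial)  # rows returned: one per group


slow, returned_slow = join_then_aggregate(fact)
fast, returned_fast = aggregate_then_join(fact)
print(returned_slow, returned_fast, slow == fast)
```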

Assessing the value of Amazon Redshift vs. Redshift Spectrum

At this point, you might be asking yourself, why would I ever not use Redshift Spectrum? Well, you still get additional value for your money by loading data into Amazon Redshift, and querying in Amazon Redshift vs. querying S3.

In fact, it turns out that the last version of our query runs even faster when executed exclusively in native Amazon Redshift, as shown in the following chart:

Chart comparing Amazon Redshift vs. Redshift Spectrum with pushdown aggregation over 3 months of data

As a general rule, queries that aren’t dominated by I/O and which involve multiple joins are better optimized in native Amazon Redshift. For instance, the performance difference between running the partition key query entirely in Amazon Redshift versus with Redshift Spectrum is twice as large as that of the pushdown aggregation query, partly because the former case benefits more from better join performance.

Furthermore, the variability in latency in native Amazon Redshift is lower. For use cases where you have tight performance SLAs on queries, you may want to consider using Amazon Redshift exclusively to support those queries.

On the other hand, when you perform large scans, you could benefit from the best of both worlds: higher performance at lower cost. For instance, imagine that you wanted to enable your business analysts to interactively discover insights across a vast amount of historical data. In the example below, the pushdown aggregation query is modified to analyze seven years of data instead of three months:

SELECT c.c_name, c.c_mktsegment, t.prettyMonthYear, uv.totalRevenue
…
WHERE customer <= 3 and visitYearMonth >= 199201
… 
FROM dwdate WHERE d_yearmonthnum >= 199201) as t
…
ORDER BY c.c_name, c.c_mktsegment, uv.visitYearMonth ASC

This query requires scanning and aggregating nearly 1.9 billion records. As shown in the chart below, Redshift Spectrum substantially speeds up this query. A large Amazon Redshift cluster would have to be provisioned to support this use case. With the aid of Redshift Spectrum, you could use an existing small cluster, keep a single copy of your data in S3, and benefit from economical, durable storage while only paying for what you use via the pay per query pricing model.

Chart comparing Amazon Redshift vs. Redshift Spectrum with pushdown aggregation over 7 years of data

Summary

Redshift Spectrum lowers the time to value for deeper insights on customer data queries spanning the data lake and data warehouse. It can enable interactive analysis on datasets in cases that weren’t economically practical or technically feasible before.

There are cases where you can get the best of both worlds from Redshift Spectrum: higher performance at lower cost. However, there are still latency-sensitive use cases where you may want native Amazon Redshift performance. For more best practice tips, see the 10 Best Practices for Amazon Redshift post.

Please visit the Amazon Redshift Spectrum PoC Environment GitHub page. If you have questions or suggestions, please comment below.

 


Additional Reading

Learn more about how Amazon Redshift Spectrum extends data warehousing out to exabytes – no loading required.


About the Author

Dylan Tong is an Enterprise Solutions Architect at AWS. He works with customers to help drive their success on the AWS platform through thought leadership and guidance on designing well architected solutions. He has spent most of his career building on his expertise in data management and analytics by working for leaders and innovators in the space.

 

 

Amazon AppStream 2.0 Launch Recap – Domain Join, Simple Network Setup, and Lots More

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-appstream-2-0-launch-recap-domain-join-simple-network-setup-and-lots-more/

We (the AWS Blog Team) work to maintain a delicate balance between coverage and volume! On the one hand, we want to make sure that you are aware of as many features as possible. On the other, we don’t want to bury you in blog posts. As a happy medium between these two extremes we sometimes let interesting new features pile up for a couple of weeks and then pull them together in the form of a recap post such as this one.

Today I would like to tell you about the latest and greatest additions to Amazon AppStream 2.0, our application streaming service (read Amazon AppStream 2.0 – Stream Desktop Apps from AWS to learn more). We launched GPU-powered streaming instances just a month ago and have been adding features rapidly; here are some recent launches that did not get covered in individual posts at launch time:

  • Microsoft Active Directory Domains – Connect AppStream 2.0 streaming instances to your Microsoft Active Directory domain.
  • User Management & Web Portal – Create and manage users from within the AppStream 2.0 management console.
  • Persistent Storage for User Files – Use persistent, S3-backed storage for user home folders.
  • Simple Network Setup – Enable Internet access for image builder and instance fleets more easily.
  • Custom VPC Security Groups – Use VPC security groups to control network traffic.
  • Audio-In – Use microphones with your streaming applications.

These features were prioritized based on early feedback from AWS customers who are using or are considering the use of AppStream 2.0 in their enterprises. Let’s take a quick look at each one.

Domain Join
This much-requested feature allows you to connect your AppStream 2.0 streaming instances to your Microsoft Active Directory (AD) domain. After you do this you can apply existing policies to your streaming instances, and provide your users with single sign-on access to intranet resources such as web sites, printers, and file shares. Your users are authenticated using the SAML 2.0 provider of your choice, and can access applications that require a connection to your AD domain.

To get started, visit the AppStream 2.0 Console, create and store a Directory Configuration:

Newly created image builders and newly launched fleets can then use the stored Directory Configuration to join the AD domain in an Organizational Unit (OU) that you provide:

To learn more, read Using Active Directory Domains with AppStream 2.0 and follow the Setting Up the Active Directory tutorial. You can also learn more in the What’s New.

User Management & Web Portal
This feature makes it easier for you to give new users access to the applications that you are streaming with AppStream 2.0 if you are not using the Domain Join feature that I described earlier.

You can create and manage users, give them access to applications through a web portal, and send them welcome emails, all with a couple of clicks:

AppStream 2.0 sends each new user a welcome email that directs them to a web portal where they will be prompted to create a permanent password. Once they are logged in they are able to access the applications that have been assigned to them.

To learn more, read Using the AppStream 2.0 User Pool and the What’s New.

Persistent Storage
This feature allows users of streaming applications to store files for use in later AppStream 2.0 sessions. Each user is given a home folder which is stored in Amazon Simple Storage Service (S3) between sessions. The folder is made available to the streaming instance at the start of the session and changed files are periodically synced back to S3. To enable this feature, simply check Enable Home Folders when you create your next fleet:

All folders (and the files within) are stored in an S3 bucket that is automatically created within your account when the feature is enabled. There is no limit on total file storage but we recommend that individual files be limited to 5 gigabytes.

Regular S3 pricing applies; to learn more about this feature read about Persistent Storage with AppStream 2.0 Home Folders and check out the What’s New.

Simple Network Setup
Setting up Internet access for your image builder and your streaming instances was once a multi-step process. You had to create a Network Address Translation (NAT) gateway in a public subnet of one of your VPCs and configure traffic routing rules.

Now, you can do this by marking the image builder or the fleet for Internet access, selecting a VPC that has at least one public subnet, and choosing the public subnet(s), all from the AppStream 2.0 Console:

To learn more, read Network Settings for Fleet and Image Builder Instances and Enabling Internet Access Using a Public Subnet and check out the What’s New.

Custom VPC Security Groups
You can create VPC security groups and associate them with your image builders and your fleets. This gives you fine-grained control over inbound and outbound traffic to databases, license servers, file shares, and application servers. Read the What’s New to learn more.

Audio-In
You can use analog and USB microphones, mixing consoles, and other audio input devices with your streaming applications. Simply click on Enable Microphone in the AppStream 2.0 toolbar to get started. Read the What’s New to learn more.

Available Now
All of these features are available now and you can start using them today in all AWS Regions where Amazon AppStream 2.0 is available.

Jeff;

PS – If you are new to AppStream 2.0, try out some pre-installed applications. No setup needed and you’ll get to experience the power of streaming applications first-hand.

Oracle considers letting go of Java EE

Post Syndicated from corbet original https://lwn.net/Articles/731579/rss

Oracle has announced
that it is considering stepping back from management of the Java Enterprise
Edition. “We are discussing how we can improve the Java EE
development process following the delivery of Java EE 8. We believe that
moving Java EE technologies including reference implementations and test
compatibility kit to an open source foundation may be the right next step,
in order to adopt more agile processes, implement more flexible licensing,
and change the governance process. We plan on exploring this possibility
with the community, our licensees and several candidate foundations to see
if we can move Java EE forward in this direction.”

Wanted: Front End Developer

Post Syndicated from Yev original https://www.backblaze.com/blog/wanted-front-end-developer/

Want to work at a company that helps customers in over 150 countries around the world protect the memories they hold dear? Do you want to challenge yourself with a business that serves consumers, SMBs, Enterprise, and developers? If all that sounds interesting, you might be interested to know that Backblaze is looking for a Front End Developer!

Backblaze is a 10 year old company. Providing great customer experiences is the “secret sauce” that enables us to successfully compete against some of technology’s giants. We’ll finish the year at ~$20MM ARR and are a profitable business. This is an opportunity to have your work shine at scale in one of the fastest growing verticals in tech – Cloud Storage.

You will utilize HTML, ReactJS, CSS and jQuery to develop intuitive, elegant user experiences. As a member of our Front End Dev team, you will work closely with our web development, software design, and marketing teams.

On a day to day basis, you must be able to convert image mockups to HTML or ReactJS. There’s some production work that needs to get done. But you will also be responsible for helping build out new features, rethinking old processes, and enabling third party systems to empower our marketing, sales, and support teams.

Our Front End Developer must be proficient in:

  • HTML, ReactJS
  • UTF-8, Java Properties, and Localized HTML (Backblaze runs in 11 languages!)
  • JavaScript, CSS, Ajax
  • jQuery, Bootstrap
  • JSON, XML
  • Understanding of cross-browser compatibility issues and ways to work around them
  • Basic SEO principles and ensuring that applications will adhere to them
  • Learning about third party marketing and sales tools through reading documentation. Our systems include Google Tag Manager, Google Analytics, Salesforce, and Hubspot

Struts, Java, JSP, Servlet and Apache Tomcat are a plus, but not required.

We’re looking for someone that is:

  • Passionate about building friendly, easy to use Interfaces and APIs.
  • Likes to work closely with other engineers, support, and marketing to help customers.
  • Is comfortable working independently on a mutually agreed upon prioritization queue (we don’t micromanage, we do make sure tasks are reasonably defined and scoped).
  • Diligent with quality control. Backblaze prides itself on giving our team autonomy to get work done, do the right thing for our customers, and keep a pace that is sustainable over the long run. As such, we expect everyone to check in code that is stable. We also have a small QA team that operates as a secondary check when needed.

Backblaze Employees Have:

  • Good attitude and willingness to do whatever it takes to get the job done
  • Strong desire to work for a small, fast-paced company
  • Desire to learn and adapt to rapidly changing technologies and work environment
  • Comfort with well behaved pets in the office

This position is located in San Mateo, California. Regular attendance in the office is expected. Backblaze is an Equal Opportunity Employer and we offer competitive salary and benefits, including our no policy vacation policy.

If this sounds like you
Send an email to [email protected] with:

  1. Front End Dev in the subject line
  2. Your resume attached
  3. An overview of your relevant experience

The post Wanted: Front End Developer appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

AWS Migration Hub – Plan & Track Enterprise Application Migration

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-migration-hub-plan-track-enterprise-application-migration/

About once a week, I speak to current and potential AWS customers in our Seattle Executive Briefing Center. While I generally focus on our innovation process, we sometimes discuss other topics, including application migration. When enterprises decide to migrate their application portfolios, they want to do it in a structured, orderly fashion. These portfolios typically consist of hundreds of complex Windows and Linux applications, relational databases, and more. Customers find themselves eager yet uncertain as to how to proceed. After spending time working with these customers, we have learned that their challenges generally fall into three major categories:

Discovery – They want to make sure that they have a deep and complete understanding of all of the moving parts that power each application.

Server & Database Migration – They need to transfer on-premises workloads and database tables to the cloud.

Tracking / Management – With large application portfolios and multiple migrations happening in parallel, they need to track and manage progress in an application-centric fashion.

Over the last couple of years we have launched a set of tools that address the first two challenges. The AWS Application Discovery Service automates the process of discovering and collecting system information, the AWS Server Migration Service takes care of moving workloads to the cloud, and the AWS Database Migration Service moves relational databases, NoSQL databases, and data warehouses with minimal downtime. Partners like Racemi and CloudEndure also offer migration tools of their own.

New AWS Migration Hub
Today we are bringing this collection of AWS and partner migration tools together in the AWS Migration Hub. The hub provides access to the tools that I mentioned above, guides you through the migration process, and tracks the status of each migration, all in accord with the methodology and tenets described in our Migration Acceleration Program (MAP).

Here’s the main screen. It outlines the migration process (discovery, migration, and tracking):

Clicking on Start discovery reveals the flow of the migration process:

It is also possible to skip the Discovery step and begin the migration immediately:

The Servers list is populated using data from an AWS migration service (Server Migration Service or Database Migration Service), partner tools, or using data collected by the AWS Application Discovery Service:

I can click on Group as application to create my first application:

Once I identify some applications to migrate, I can track them in the Migrations section of the Hub:

The migration tools, if authorized, automatically send status updates and results back to Migration Hub, for display on the migration status page for the application. Here you can see that Racemi DynaCenter and CloudEndure Migration have played their parts in the migration:
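The application-centric roll-up described here can be pictured with a small in-memory model. This is only a sketch for intuition: the class, method names, and status labels are hypothetical and are not the Migration Hub API. It groups servers under an application, records the latest update each tool reports for a server, and derives one status per application.

```python
from dataclasses import dataclass, field

# Hypothetical status labels chosen for this sketch.
NOT_STARTED = "NOT_STARTED"
IN_PROGRESS = "IN_PROGRESS"
COMPLETED = "COMPLETED"


@dataclass
class MigrationTracker:
    """Toy in-memory model of application-centric migration tracking."""

    applications: dict = field(default_factory=dict)  # app name -> set of server ids
    updates: dict = field(default_factory=dict)       # server id -> (tool, status)

    def group_as_application(self, app, servers):
        """Group discovered servers under one application name."""
        self.applications.setdefault(app, set()).update(servers)

    def report(self, server, tool, status):
        """Record the latest status update a migration tool sent for a server."""
        self.updates[server] = (tool, status)

    def application_status(self, app):
        """Roll per-server updates up into a single status for the application."""
        servers = self.applications[app]  # raises KeyError for an unknown application
        statuses = [self.updates.get(s, (None, NOT_STARTED))[1] for s in servers]
        if statuses and all(s == COMPLETED for s in statuses):
            return COMPLETED
        if all(s == NOT_STARTED for s in statuses):
            return NOT_STARTED
        return IN_PROGRESS
```

With two servers grouped as one application, the application reads as in progress until both servers have reported completion, regardless of which tool migrated each one.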

I can track the status of my migrations by checking the Migration Hub Dashboard:

Migration Hub works with migration tools from AWS and our Migration Partners; see the list of integrated partner tools to learn more:

Available Now
AWS Migration Hub can manage migrations in any AWS Region that has the necessary migration tools available; the hub itself runs in the US West (Oregon) Region. There is no charge for the Hub; you pay only for the AWS services that you consume in the course of the migration.

If you are ready to begin your migration to the cloud and are in need of some assistance, please take advantage of the services offered by our Migration Acceleration Partners. These organizations have earned their migration competency by repeatedly demonstrating their ability to deliver large-scale migration.

Jeff;

Automating Blue/Green Deployments of Infrastructure and Application Code using AMIs, AWS Developer Tools, & Amazon EC2 Systems Manager

Post Syndicated from Ramesh Adabala original https://aws.amazon.com/blogs/devops/bluegreen-infrastructure-application-deployment-blog/

Previous DevOps blog posts have covered the following use cases for infrastructure and application deployment automation:

An AMI provides the information required to launch an instance, which is a virtual server in the cloud. You can use one AMI to launch as many instances as you need. It is a security best practice to customize and harden your base AMI with required operating system updates. If you are using AWS native services for continuous security monitoring and operations, you are also strongly encouraged to bake agents such as those for Amazon EC2 Systems Manager (SSM), Amazon Inspector, CodeDeploy, and CloudWatch Logs into the base AMI. A customized and hardened AMI is often referred to as a “golden AMI.” The use of golden AMIs to create EC2 instances in your AWS environment allows for fast and stable application deployment and scaling, secure application stack upgrades, and versioning.

In this post, I will show you how to use AWS CodePipeline, together with the DevOps automation capabilities of Systems Manager and the other AWS developer tools (CodeCommit, CodeBuild, and CodeDeploy), to orchestrate end-to-end blue/green deployments of a golden AMI and application code. Systems Manager Automation is a powerful security feature for enterprises that want to mature their DevSecOps practices.

Here are the high-level phases and primary services covered in this use case:

 

You can access the source code for the sample used in this post here: https://github.com/awslabs/automating-governance-sample/tree/master/Bluegreen-AMI-Application-Deployment-blog.

This sample creates a pipeline in AWS CodePipeline with the building blocks to support blue/green deployments of infrastructure and application. The sample includes a custom Lambda step in the pipeline that executes Systems Manager Automation to build a golden AMI and updates the Auto Scaling group with the golden AMI ID for every rollout of new application code. As a result, every new application version is deployed on a fully patched and customized AMI, and hardened AMI rollout is automated as part of the continuous integration and deployment model.
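To give a feel for what such a custom Lambda step might look like, here is a minimal sketch. It is not the code from the sample repository. It assumes boto3-style SSM and CodePipeline clients are passed in (so the logic can be exercised with stand-ins), the document name GoldenAmiAutomation is hypothetical, and the Auto Scaling group update is left out. Because a Lambda invocation should not block while an AMI builds, the sketch uses CodePipeline's continuation token so the step is re-invoked until the automation finishes.

```python
# Automation statuses that mean "still running"; any other non-success status is treated as failure.
IN_FLIGHT = {"Pending", "InProgress", "Waiting"}


def golden_ami_step(event, ssm, codepipeline, document_name="GoldenAmiAutomation"):
    """Start or poll a Systems Manager Automation execution for a CodePipeline job.

    `ssm` and `codepipeline` are expected to behave like boto3 clients.
    `document_name` is a hypothetical Automation document name.
    """
    job = event["CodePipeline.job"]
    job_id = job["id"]
    # On re-invocation, CodePipeline hands back the token we returned earlier.
    token = job.get("data", {}).get("continuationToken")
    try:
        if token is None:
            # First invocation: kick off the AMI build and ask to be called again.
            started = ssm.start_automation_execution(DocumentName=document_name)
            codepipeline.put_job_success_result(
                jobId=job_id, continuationToken=started["AutomationExecutionId"]
            )
            return "started"

        # Later invocations: the token carries the automation execution ID.
        execution = ssm.get_automation_execution(AutomationExecutionId=token)
        status = execution["AutomationExecution"]["AutomationExecutionStatus"]
        if status == "Success":
            codepipeline.put_job_success_result(jobId=job_id)
            return "succeeded"
        if status in IN_FLIGHT:
            codepipeline.put_job_success_result(jobId=job_id, continuationToken=token)
            return "waiting"
        raise RuntimeError(f"automation ended with status {status}")
    except Exception as exc:
        # Any error fails the pipeline stage so the rollout stops here.
        codepipeline.put_job_failure_result(
            jobId=job_id,
            failureDetails={"type": "JobFailed", "message": str(exc)},
        )
        return "failed"
```

In a real Lambda function, a thin handler would create the clients with boto3.client("ssm") and boto3.client("codepipeline") and pass them to this function.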

 

 

We will build and run this sample in three parts.

Part 1: Setting up the AWS developer tools and deploying a base web application

Part 1 of the AWS CloudFormation template creates the initial Java-based web application environment in a VPC. It also creates all the required components of Systems Manager Automation, CodeCommit, CodeBuild, and CodeDeploy to support the blue/green deployments of the infrastructure and application resulting from ongoing code releases.

Part 1 of the AWS CloudFormation stack creates these resources:

After Part 1 of the AWS CloudFormation stack creation is complete, go to the Outputs tab and click the Elastic Load Balancing link. You will see the following home page for the base web application:

Make sure you have all the outputs from the Part 1 stack handy. You need to supply them as parameters in Part 3 of the stack.

Part 2: Setting up your CodeCommit repository

In this part, you will commit and push your sample application code into the CodeCommit repository created in Part 1. To access the initial git commands to clone the empty repository to your local machine, click Connect to go to the AWS CodeCommit console. Make sure you have the IAM permissions required to access AWS CodeCommit from the command line interface (CLI).

After you’ve cloned the repository locally, download the sample application files from the part2 folder of the Git repository and place the files directly into your local repository. Do not include the aws-codedeploy-sample-tomcat folder. Go to the local directory and type the following commands to commit and push the files to the CodeCommit repository:

git add .
git commit -a -m "add all files from the AWS Java Tomcat CodeDeploy application"
git push

After all the files are pushed successfully, the repository should look like this:

 

Part 3: Setting up CodePipeline to enable blue/green deployments

Part 3 of the AWS CloudFormation template creates the pipeline in AWS CodePipeline and all the required components.

a) Source: The pipeline is triggered by any change to the CodeCommit repository.

b) BuildGoldenAMI: This Lambda step executes the Systems Manager Automation document to build the golden AMI. After the golden AMI is successfully created, the Auto Scaling group of the application deployment group is updated with a new launch configuration that references the new AMI. You can watch the progress of the automation in the EC2 console from the Systems Manager –> Automations menu.

c) Build: This step uses the application build spec file to build the application build artifact. Here are the CodeBuild execution steps and their status:

d) Deploy: This step clones the Auto Scaling group, launches the new instances with the new AMI, deploys the application changes, reroutes the traffic from the elastic load balancer to the new instances, and terminates the old Auto Scaling group. You can see the execution steps and their status in the CodeDeploy console.
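The ordering of the Deploy step is what makes the rollout blue/green, so it can help to see it as a toy simulation. The sketch below is purely illustrative and runs entirely in memory: the Group and LoadBalancer classes, the naming scheme, and the health check before the traffic switch are assumptions of this example, not the behavior of CodeDeploy itself.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """Stand-in for an Auto Scaling group."""
    name: str
    ami_id: str
    instances: list = field(default_factory=list)


@dataclass
class LoadBalancer:
    """Stand-in for a load balancer; `targets` holds instance ids."""
    targets: list = field(default_factory=list)


def blue_green_deploy(blue, lb, new_ami_id, app_version, healthy=lambda inst: True):
    """Replace the blue group with a green one built from the new AMI."""
    # 1. Clone the group definition, pointing it at the new golden AMI.
    green = Group(name=f"{blue.name}-green", ami_id=new_ami_id)
    # 2. Launch as many new instances as the blue group currently runs.
    green.instances = [
        {"id": f"{green.name}-{i}", "ami": new_ami_id, "app": None}
        for i in range(len(blue.instances))
    ]
    # 3. Deploy the application revision onto the new instances.
    for inst in green.instances:
        inst["app"] = app_version
    # Assumed safety check: keep traffic on blue if any green instance is unhealthy.
    if not all(healthy(inst) for inst in green.instances):
        return blue
    # 4. Reroute load balancer traffic to the green instances.
    lb.targets = [inst["id"] for inst in green.instances]
    # 5. Terminate the old (blue) instances only after traffic has moved.
    blue.instances.clear()
    return green
```

The point of the ordering is that the old group keeps serving traffic until the new one is fully deployed, so a failed rollout leaves the load balancer untouched.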

After the CodePipeline execution is complete, you can access the application by clicking the Elastic Load Balancing link. You can find it in the output of Part 1 of the AWS CloudFormation template. Any subsequent commit to the application code in the CodeCommit repository triggers the pipeline and deploys the new code on an updated AMI.

 

If you have feedback about this post, add it to the Comments section below. If you have questions about implementing the example used in this post, open a thread on the Developer Tools forum.


About the author

 

Ramesh Adabala is a Solutions Architect on the Southeast Enterprise Solution Architecture team at Amazon Web Services.

Red Hat Enterprise Linux 7.4 released

Post Syndicated from ris original https://lwn.net/Articles/729459/rss

Red Hat has released
the fourth update to Red Hat Enterprise Linux 7. “Red Hat Enterprise
Linux 7.4 offers new automation capabilities designed to limit IT
complexity while enhancing workload security and performance for
traditional and cloud-native applications. This provides a powerful,
flexible operating system backbone to address enterprise IT needs across
physical servers, virtual machines and hybrid, public and multi-cloud
footprints.” See the release notes for more details.