Elementary OS 7 released

Post Syndicated from corbet original https://lwn.net/Articles/921854/

Version 7 of the Ubuntu-based elementary OS distribution has been released.

In the latest version of AppCenter we’ve worked on making app
descriptions more engaging with more information, making it easier
to update to the latest versions of apps, and improving support for
sideloading and alt stores. We’ve also worked on improving
AppCenter’s responsiveness—making sure you can comfortably use it
when tiling and on small displays as well as better using space on
large displays.

Decreasing incident response time for OutSystems with AWS serverless technology

Post Syndicated from Ivo Pinto original https://aws.amazon.com/blogs/architecture/decreasing-incident-response-time-for-outsystems-with-aws-serverless-technology/

A leader in the modern application platform space, OutSystems is a low-code platform that provides tools for companies to develop, deploy, and manage omnichannel enterprise applications.

Security is a top priority at OutSystems. Their Security Operations Center (SOC) deals with thousands of incidents a year, each with a set of response actions that need to be executed as quickly as possible. Providing security at such large scale is a challenge, even for the most well-prepared organizations. Manual and repetitive tasks account for the majority of the response time involved in this process, and decreasing this key metric requires orchestration and automation.

Security orchestration, automation, and response (SOAR) systems are designed to translate security analysts’ manual procedures into automated actions, making them faster and more scalable.

In this blog post, we’ll explore how OutSystems lowered their incident response time by 99 percent by designing and deploying a custom SOAR built on serverless services on AWS.

Solution architecture

Security incidents happen with unknown frequency, making serverless services a natural fit to boost security at OutSystems because of their increased agility and capability to scale to zero.

There are two ways to trigger SOAR actions in this architecture:

  1. Automatically through Security Information and Event Management (SIEM) security incident findings
  2. On-demand through chat application

Using the first method, when a security incident is detected by the SIEM, an event is published to Amazon Simple Notification Service (Amazon SNS). This triggers an AWS Lambda function that creates a ticket in an internal ticketing system. The Playbooks Lambda function is then triggered to decide which playbook to run, based on the incident details.
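
The post does not include OutSystems’ code, but a minimal sketch of the SNS-triggered ticketing Lambda might look something like the following Python handler; the ticketing endpoint and payload shape are assumptions for illustration only.

import json
import os
import urllib.request

# Hypothetical internal ticketing endpoint, passed in as an environment variable
TICKETING_URL = os.environ["TICKETING_URL"]

def handler(event, context):
    # Each SNS record carries one SIEM finding as a JSON message
    for record in event["Records"]:
        finding = json.loads(record["Sns"]["Message"])
        ticket = {
            "title": finding.get("title", "Security incident"),
            "severity": finding.get("severity", "unknown"),
            "details": finding,
        }
        req = urllib.request.Request(
            TICKETING_URL,
            data=json.dumps(ticket).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            print("Created ticket, HTTP status:", resp.status)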

Each playbook is a set of actions that are executed in response to a trigger. Playbooks are the key component behind automated tasks. OutSystems uses AWS Step Functions to orchestrate the actions and Lambda functions to execute them.
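
As a rough illustration (the incident fields, playbook names, and ARNs below are invented), the Playbooks Lambda function could route an incident to the matching Step Functions state machine along these lines:

import json
import boto3

sfn = boto3.client("stepfunctions")

# Hypothetical mapping from incident type to playbook state machine ARN
PLAYBOOKS = {
    "impossible_travel": "arn:aws:states:eu-west-1:123456789012:stateMachine:ImpossibleTravel",
    "sql_injection": "arn:aws:states:eu-west-1:123456789012:stateMachine:SqlInjection",
}

def handler(event, context):
    incident = event["incident"]
    state_machine_arn = PLAYBOOKS.get(incident["type"])
    if state_machine_arn is None:
        print("No playbook registered for incident type:", incident["type"])
        return
    # Each playbook is a Step Functions state machine that orchestrates Lambda actions
    sfn.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps(incident),
    )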

But this solution does not exist in isolation. Depending on the playbook, Step Functions interacts with other components such as AWS Secrets Manager or external APIs.

Using the second method, the on-demand trigger for OutSystems SOAR relies on a chat application. This application calls a Lambda function URL that interacts with the playbooks we just discussed.
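
A minimal sketch of the function behind that URL is shown below; the chat payload format and the downstream function name are assumptions, not OutSystems’ actual implementation.

import json
import boto3

lambda_client = boto3.client("lambda")

def handler(event, context):
    # Lambda function URLs deliver an API Gateway v2-style event; the chat
    # application's payload shape here is assumed for illustration.
    body = json.loads(event.get("body") or "{}")
    incident = {"type": body.get("playbook"), "details": body.get("details", {})}

    # Hand off to the same Playbooks function used by the SIEM-driven path
    lambda_client.invoke(
        FunctionName="playbooks-router",    # hypothetical function name
        InvocationType="Event",             # asynchronous invocation
        Payload=json.dumps({"incident": incident}).encode("utf-8"),
    )
    return {"statusCode": 202, "body": json.dumps({"status": "accepted"})}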

Figure 1 represents the high-level architecture of OutSystems’ custom SOAR.

Figure 1. SOAR architecture for AWS

This architecture was deployed with Infrastructure as Code (IaC) using AWS CloudFormation and AWS CodePipeline.

This same IaC architecture is used when new playbooks or updates to existing ones are made. Code changes that are committed to a source control repository trigger the CodePipeline which uses AWS CodeBuild and CloudFormation change sets to deploy the updates to the affected resources.

Use cases

The use cases for which OutSystems has deployed playbooks to date include:

  • SQL injection
  • Unauthorized access to credentials
  • Issuance of new certificates
  • Login brute-force attempts
  • Impossible travel

Let’s explore the Impossible travel use case. Impossible travel happens when a user logs in from one location, and then later logs in from a different location that would be impossible to travel between within the elapsed time.

When the SIEM identifies this behavior, it triggers an alert and the following actions are performed:

  1. A ticket is created
  2. An IP address check is performed in reputation databases, such as AbuseIPDB or VirusTotal
  3. An IP address check is performed in the internal database, and the IP address is added if it is not found
  4. A search is performed for past events with the same IP address
  5. A WHOIS is performed on the IP address
  6. Recent logins of the user are identified in the SIEM, along with all related information
  7. All of this information is automatically added to the ticket

Every step listed here was previously performed manually, a task that took an average of 15 minutes. Now, the process takes just 8 seconds—a 99.1% incident response time improvement.

Additional remediation actions can also be automated; some of these are already in place, while others are in development.

Conclusion

At OutSystems, much like at AWS, security is considered “job zero.” It is not only important to be proactive in preventing security incidents, but when they happen, the response must be quick, effective, and as immune to human error as possible.

With the implementation of this custom SOAR, OutSystems reduced the average response time to security incidents by 99%. Tasks that previously took 76 hours of analysts’ time are now accomplished automatically within 31 minutes.

During the evaluation period, SOAR addressed hundreds of real-world incidents with some threat intel use cases being executed thousands of times.

An architecture composed of serverless services ensures OutSystems does not pay for systems that are standing by waiting for work, while at the same time not compromising on performance.

If you are interested in this topic—how to respond to security incidents using AWS serverless services—be sure you also read the Orchestrating a security incident response with AWS Step Functions and How to get started with security response automation on AWS blog posts.

A chronology of the unwillingness to have the chief prosecutor investigated

Post Syndicated from Bozho original https://blog.bozho.net/blog/4010

It is clear that the majority in the National Assembly does not want the chief prosecutor to be open to investigation by a prosecutor independent of him, and accordingly to be indicted. GERB, DPS, and BSP are trying to hide this unwillingness behind procedures and unexpected alliances. Let me go through the chronology of this unwillingness. I do not claim completeness for the steps before the previous parliament.

1. In 2009, the European Court of Human Rights ruled against Bulgaria in the case Kolevi v. Bulgaria over the untouchability of the chief prosecutor. Prosecutor Kolev had been murdered, and there were suspicions that the chief prosecutor was involved in the murder, but there was no one to investigate him because of career and organizational dependence.

2. At the end of GERB's term, in 2021, a month before the dissolution of the National Assembly, an amendment to the Criminal Procedure Code was adopted introducing a mechanism for investigating and indicting the chief prosecutor. But that mechanism was not well thought out (whether deliberately or not, I do not know) and was struck down by the Constitutional Court later that year.

3. After the mechanism was declared unconstitutional, we in Democratic Bulgaria campaigned for an effective mechanism for investigating the chief prosecutor in two consecutive election campaigns.

4. At the start of the 47th National Assembly, Vazrazhdane introduced a bill for the chief prosecutor to be investigated by the head of the investigative service, which contradicts the requirement for independence (for more detail, see the commentary by former prosecutor Yankulov).

5. The Ministry of Justice, under Minister Yordanova, drafted a bill for a mechanism to investigate the chief prosecutor. The mechanism became a condition in the Recovery and Resilience Plan, and in order to be adopted it had to receive an opinion from the Venice Commission. The Commission's session was in early autumn, which is why the bill was not introduced in the 47th National Assembly. It was published for public consultation on the Ministry of Justice's website. Besides the investigation mechanism, the bill also provides a procedure for appealing the prosecution's refusals to open pre-trial proceedings; in other words, it limits the prosecution's ability to "hold an umbrella" over (that is, shield) someone.

6. On the first day of the current National Assembly, we in Democratic Bulgaria introduced the bill, the Venice Commission having already given its opinion. After that, Vazrazhdane introduced theirs as well.

7. The Democratic Bulgaria bill was assigned to many committees, but was taken up by the lead committee two months later, on January 11. During that time there was no will to consider the bill. A possible excuse would be that everyone was waiting for the Council of Ministers' bill, which builds on the bill that Minister Yordanova "left behind" at the Ministry of Justice. But if there had been a will to pass it, the Ministry's improvements could have been introduced between first and second reading. There was no will, so that was not done.

8. From the 11th onward the stalling began: the first sitting of the Legal Affairs Committee on this bill was adjourned, and the next one was a week later. Six hours of reading out opinions that had already been sent to the committee, and with which it was already familiar, serve no purpose other than delay.

9. The same week (after the 11th), on Friday, we proposed adding our bill to the agenda of the plenary sitting without a report from the lead committee, since the two-month period after introduction had expired. The majority (with the participation of GERB, DPS, and BSP) rejected this, so that we could not make up for the delay that this very same majority had caused.

10. The bill was put on the agenda for January 26. Why not January 25? So that it could be delayed, as we will see in the following sentences. On the morning of January 26 the chamber swapped the two items: the one with the Vazrazhdane bill and the one with the bills of DB and the Council of Ministers. So on January 26, before lunch, the Vazrazhdane bill went through. Then the reading of reports began. The lead committee's report was read out by several people and took about an hour and a half (an absolutely pointless waste of time). Because of other amendments to the Criminal Procedure Code, there were reports from a total of 8 committees (the supporting committees had shorter reports, but still over 2 hours were spent reading reports that everyone should already have read). Around 5:20 p.m. we took a recess in order to hold a hearing of Meta and Telus on content moderation. When the end of the recess was announced, the committee chair Kalina Konstantinova declared the committee sitting closed. In the chamber, GERB and DPS called for a quorum check and did not register, thereby ending the sitting. They tried to blame the absent MPs from PP and DB for this, but even if everyone had been in the chamber there would have been no quorum, because when 100+ MPs from GERB and DPS do not register, there is no way to have a quorum. Thus GERB and DPS generated another day of delay.

11. The next day the chamber agreed to continue the consideration of the Criminal Procedure Code amendments and adopted the Council of Ministers' bill (on investigating the chief prosecutor, which builds on the DB bill). There is one detail, however: for the deadlines between first and second reading to start running when more than one bill amending the same law has been adopted at first reading, the committee must approve a consolidated bill. From the rostrum we called on the Legal Affairs Committee to prepare the consolidated bill within the day, but that did not happen. It did not happen on Monday either. The committee sitting was convened for Tuesday. By contrast, when a bill is being rushed through, such sittings for approving consolidated bills are convened during recesses and are entirely a formality. That is what happened with the Electoral Code, for example. Here another two days were lost.

12. On Tuesday (January 31) there was an extraordinary plenary sitting. For the deadlines for proposals to start running, the consolidated bill must be announced in the plenary chamber. Only then can shortened deadlines be proposed. For hours, either the chair of the Legal Affairs Committee did not send the notification to the Speaker of the National Assembly, or the notification was "wandering the corridors." The notification reached the presiding officer only once the sitting had gone into closed session to adopt the reports of the security services (due to the potential presence of classified information). Nadezhda Yordanova then proposed a three-day deadline for proposals. After the recess requested by DB, DPS also requested a recess. Perhaps to calculate exactly how many people needed to vote so that it would look as though the proposal could have passed had no MPs been absent? (Some day I would like to check whether in such situations data is pulled from the access-control system, to make clear how many people each group has in the building.)

In the end, there were 17 of us from DB and 40 from PP in the chamber (DB had two members absent for the whole day, one of them Ilina Mutafchieva, who had given birth the previous evening). The first vote did not pass, but GERB requested a re-vote. On the re-vote there was no change within GERB, but the last person from Vazrazhdane managed to vote, and a confused member of Bulgarian Rise changed his vote from "abstain" to "in favor." The presiding officer asked, "Mr. Yanev, did you manage to vote?", but the record shows that Yanev did not vote even though he was in the chamber. The bottom line is that if the entire PP and DB groups had been in the chamber, the proposal would have passed (though of course there was no way for Ilina, who had given birth the night before, to be there). Entirely predictably, immediately afterwards Kostadinov (Vazrazhdane) explained to journalists that the shortened deadline had failed because of us. Today he said the same in a declaration from the rostrum. This talking point has a single purpose: to mask the refusal of the GERB, DPS, and BSP majority to allow a real mechanism for investigating the chief prosecutor. Why are they using Kostadinov as the spokesman this time? Because last time (point 10) it looked rather hollow to blame others when you yourself had not registered.

Week after week, day after day, procedural trick after procedural trick, GERB and DPS, accompanied by BSP, are postponing the adoption of the mechanism for investigating the chief prosecutor. And not only that, but also quite a few other changes that would limit the chief prosecutor's unchecked power. So much effort concentrated on defending the chief prosecutor is not a good start to an election campaign for these colleagues, but let us hope that in the next parliament they do not keep inventing new ways to delay the inevitable.

The post A chronology of the unwillingness to have the chief prosecutor investigated was first published on БЛОГодаря.

Passwords Are Terrible (Surprising No One)

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2023/02/passwords-are-terrible-surprising-no-one.html

This is the result of a security audit:

More than a fifth of the passwords protecting network accounts at the US Department of the Interior—including Password1234, Password1234!, and ChangeItN0w!—were weak enough to be cracked using standard methods, a recently published security audit of the agency found.

[…]

The results weren’t encouraging. In all, the auditors cracked 18,174—or 21 percent—of the 85,944 cryptographic hashes they tested; 288 of the affected accounts had elevated privileges, and 362 of them belonged to senior government employees. In the first 90 minutes of testing, auditors cracked the hashes for 16 percent of the department’s user accounts.

The audit uncovered another security weakness—the failure to consistently implement multi-factor authentication (MFA). The failure extended to 25—or 89 percent—of 28 high-value assets (HVAs), which, when breached, have the potential to severely impact agency operations.

Original story:

To make their point, the watchdog spent less than $15,000 on building a password-cracking rig—a setup of a high-performance computer or several chained together—with the computing power designed to take on complex mathematical tasks, like recovering hashed passwords. Within the first 90 minutes, the watchdog was able to recover nearly 14,000 employee passwords, or about 16% of all department accounts, including passwords like ‘Polar_bear65’ and ‘Nationalparks2014!’.

Code to the beat of your own drum during Black History Month 2023

Post Syndicated from Kevin Johnson original https://www.raspberrypi.org/blog/coding-projects-black-history-month-2023/

When we think about a celebration, we also think about how important it is to be intentional about sound. And with this month of February being a celebration of Black history in the USA, we want to help you make some noise to amplify the voices, experiences, and achievements of the Black community.

Two young people using laptops at a Code Club session.

From the past and present, to those still to come in the future, countless remarkable achievements have been made by Black individuals who have chosen to move to the beat of their own drum. Music and sound can be tools to tell stories, to express ourselves, to promote change, to celebrate, and so much more. So take some time this month to make your own music with your young coders and start dancing.                

Of course, choosing to dance is not the same as choosing to devote your life to the equality and freedom of all people. But it reminds us that you can incite change by choosing to do what is right, even when you feel like you’re the only one moving to the music. It won’t be long before you see change and meet people you resonate with, and a new sound will develop in which everyone can find their rhythm.

So join us this month as we explore the power of code and music to celebrate Black History Month.

Projects to help you find your rhythm

We’ve selected three of our favourite music-related projects to help you bring a joyful atmosphere to your coding sessions this month. All of the projects are in Scratch, a programming language that uses blocks to help young people develop their confidence in computer programming while they experiment with colours and sounds to make their own projects.  

Drum star | Scratch

Find your rhythm with this clicker game where you earn points by playing the drums in different venues. The project is one of our Explore projects and it includes step-by-step instructions to help young creators develop their skills, confidence, and interest in programming. This makes it a great option for beginners who want to get started with Scratch and programming.

alt=""

Music maker | Scratch

Code to the beat of your own drum — or any instrument you like. Use this project to create your own virtual musical instrument and celebrate a Black musician you admire. Young people who have some experience with Scratch may enjoy expressing themselves with this Design project. Our Design projects give young people support to build on their experience to gain more independence coding their own ideas.

alt=""

Binary hero | Scratch

Can you keep up with the beat? Prove it in this game where you play the notes of a song while they scroll down the screen. You could choose to include a song associated with a moment in Black history that is meaningful to you. This project is a great opportunity for young people to expand their programming knowledge to create lists, while they also test their reaction skills with a fun game.

alt=""

For young creators who want to make projects that don’t involve music or sound, check out our wide range of other coding projects.

Let us know how you’re celebrating Black History Month in your community on Twitter, LinkedIn, Facebook, or Instagram all month long!

Black stories to inspire you to move

Learn about our partnership with Team4Tech and Kenya Connect, with whom we are empowering educators and students in rural Kenya to use the power of coding and computing to benefit their communities.

A young person uses a computer.
  • I Belong in Computer Science: Salome Tirado Okeze

Meet Salome, a computer science student from the UK who shares her experiences and advice for young people interested in finding out where computer science can lead them. Salome was one of the first people we interviewed for our ‘I belong’ campaign to celebrate young role models in computer science.

alt=""

Research to help set the tone  

We believe that creating inclusive and equitable learning environments is essential to supporting all young people to see computer science as an opportunity for them. To help engage young people, especially those who are underrepresented in computer science classrooms, we are carrying out research with teachers to make computing culturally relevant. Our work promoting culturally relevant pedagogy in educational settings in England has been informed by the work of many US researchers who have already contributed heavily to this area. You can learn about two of these projects in this blog post.

Educators who want to find out how they can use culturally relevant pedagogy with their learners can download our free guidelines today.

An educator explains a computing concept to a learner.

We would also like to invite you to our monthly research seminar on 7 February 2023, when we will be joined by Dr Jean Salac who will be sharing their research on Moving from equity to justice in computing instruction for youth. Dr Salac’s session is part of our current series of seminars that centres on primary school (K–5) teaching and learning of computing. The seminars are free and open to everyone interested in computing education. We hope to see you there! 

The post Code to the beat of your own drum during Black History Month 2023 appeared first on Raspberry Pi.

„Тоест“ turns five

Post Syndicated from Тоест original https://www.toest.bg/toest-5-godini/

„Тоест“ turns five

On 1 February 2018, a "group of dreamers," as we called ourselves back then, launched „Тоест“. Five years later, many more dreamers have joined our group. This publication would not have survived to this day without its authors, editors, donors, and supporters, and above all without you, our critical and loyal audience. „Тоест“ belongs to all of us. Happy anniversary to us!

In all these years we have not broken a single one of our promises to you: not to "chase" the news and not to strive to be "always first" to report "what happened," but to try to explain in depth "what follows from what happened"; to choose our topics, authors, and interviewees carefully, to check the facts, and to take great care in editing every piece; to publish only original content, not reposts of Facebook statuses, other outlets' articles, or ready-made press releases; and not to distract your reading with ad banners and pop-up messages.

That is how it will stay going forward.

As we promised before the pause for a major overhaul, we are marking the fifth anniversary of „Тоест“ with a fresh restart and plenty of pleasant surprises:

A new website

You have probably already noticed that „Тоест“ has moved into a new home. We very much hope you like it and that it serves you well. Our new site is much faster and more flexible (especially for readers abroad), its design is light, modern, and comfortable to read on any device, and the search finds articles, authors, and columns for your keywords in real time. There is automatic integration with the Stripe payment system, along with options for setting levels of reader financial support and various gifts to go with them (more on that a little later). There are also additional features that make the work of the editors and authors significantly easier.

New editors, authors, and topics

We are delighted that Boryana Telbis and Nadezhda Radulova have joined our editorial team. Boryana will be in charge of political analysis and commentary, Nadya of the articles and columns on culture, and the two of them will share all the other public-interest topics we cover in „Тоест“: civil rights and minority rights, disinformation and propaganda, education, healthcare, justice, the economy, and more.

The circle of „Тоест“ authors is also expanding, and with it our range of topics. The new authors come from varied backgrounds: among them are a political scientist, a molecular biologist, a literary translator, and journalism students. We will also soon introduce two new monthly columns.

More presence and communication with our audience

For this we will rely on Yoanna Elmy, whom you know well as a regular „Тоест“ author. In 2023, expect more visual content, stories, short videos, polls, and discussions on our social media profiles. And if you are one of our donors, you will very soon be invited to our dedicated forum, where you will be able to communicate directly with the authors and editors of „Тоест“, suggest topics, give feedback, and take a direct part in the process of improving the publication.

New forms of support

We are constantly looking for new ways to fund our work. In the coming months we expect the results of our first application to a media support program. Alongside that, we will be seeking support from businesses that clearly understand that an honest and fair environment for entrepreneurship cannot exist under corrupt government, a dysfunctional justice system, and dependent media.

Above all, though, we will continue to rely on the support of each of you. Choose one of the donor packages we have prepared for you; they include special gifts from us and our partners.

And for dessert, the sweetest news

There will be a „Тоест“ chocolate under the „Гайо“ brand! It will be part of the permanent assortment of the family-run artisan chocolate factory, and 1 lev from the price of every bar will support our efforts to produce quality journalism. We cannot wait to share with you very soon what it will be called, what it will look like, and what its flavor, created especially for „Тоест“, will taste like.

AWS Lake Formation 2022 year in review

Post Syndicated from Jason Berkowitz original https://aws.amazon.com/blogs/big-data/aws-lake-formation-2022-year-in-review/

Data governance is the collection of policies, processes, and systems that organizations use to ensure the quality and appropriate handling of their data throughout its lifecycle for the purpose of generating business value. Data governance is increasingly top-of-mind for customers as they recognize data as one of their most important assets. Effective data governance enables better decision-making by improving data quality, reducing data management costs, and ensuring secure access to data for stakeholders. In addition, data governance is required to comply with an increasingly complex regulatory environment with data privacy (such as GDPR and CCPA) and data residency regulations (such as in the EU, Russia, and China).

For AWS customers, effective data governance improves decision-making, increases business agility, provides a competitive advantage, and reduces the risk of fines due to non-compliance with regulatory obligations. We understand the unique opportunity to provide our customers a comprehensive end-to-end data governance solution that is seamlessly integrated into our portfolio of services, and AWS Lake Formation and the AWS Glue Data Catalog are key to solving these challenges.

In this post, we are excited to summarize the features that the AWS Glue Data Catalog, AWS Glue crawler, and Lake Formation teams delivered in 2022. We have collected some of the key talks and solutions on data governance, data mesh, and modern data architecture published and presented in AWS re:Invent 2022, and a few data lake solutions built by customers and AWS Partners for easy reference. Whether you are a data platform builder, data engineer, data scientist, or any technology leader interested in data lake solutions, this post is for you.

To learn more about how customers are securing and sharing data with Lake Formation, we recommend going deeper into GoDaddy’s decentralized data mesh, Novo Nordisk’s modern data architecture, and JPMorgan’s improvements to their Federated Data Lake, a governed data mesh implementation using Lake Formation. Also, you can learn how AWS Partners integrated with Lake Formation to help customers build unique data lakes, in Starburst’s data mesh solution, Informatica’s automated data sharing solution, Ahana’s Presto integration with Lake Formation, Ascending’s custom data governance system, how PBS used machine learning on their data lakes, and how hc1 provides personalized health insights for customers.

You can also review how customers use Lake Formation to build modern data architectures in several re:Invent 2022 talks.

The Lake Formation team listened to customer feedback and made improvements in the areas of cross-account data governance, expanding the source of data lakes, enabling unified data governance of a business data catalog, making secure business-to-business data sharing possible, and expanding the coverage area for fine-grained access controls to Amazon Redshift. In the rest of this post, we are happy to share the progress we made in 2022.

Enhancing cross-account governance

Lake Formation provides the foundation for customers to share data across accounts within their organization. You can share AWS Glue Data Catalog resources to AWS Identity and Access Management (IAM) principals within an account as well as other AWS accounts using two methods. The first one is called the named-resource method, where users can select the names of databases and tables and choose the type of permissions to share. The second method uses LF-Tags, where users can create and associate LF-Tags to databases and tables and grant permission to IAM principals using LF-Tag policies and expressions.
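
For a feel of what the two methods look like in practice, here is a hedged boto3 sketch; the account ID, database, table, and tag values are placeholders, not a recommended setup.

import boto3

lf = boto3.client("lakeformation")

# Named-resource method: grant SELECT on one table to another account
lf.grant_permissions(
    Principal={"DataLakePrincipalIdentifier": "111122223333"},  # placeholder account ID
    Resource={"Table": {"DatabaseName": "sales", "Name": "orders"}},
    Permissions=["SELECT"],
)

# LF-Tags method: grant SELECT on every table that carries domain=analytics
lf.grant_permissions(
    Principal={"DataLakePrincipalIdentifier": "111122223333"},
    Resource={
        "LFTagPolicy": {
            "ResourceType": "TABLE",
            "Expression": [{"TagKey": "domain", "TagValues": ["analytics"]}],
        }
    },
    Permissions=["SELECT"],
)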

In November 2022, Lake Formation introduced version 3 of its cross-account sharing feature. With this new version, Lake Formation users can share catalog resources using LF-Tags at the AWS Organizations level. Sharing data using LF-tags helps scale permissions and reduces the admin work for data lake builders. The cross-account sharing version 3 also allows you to share resources to specific IAM principals in other accounts, providing data owners control over who can access their data in other accounts. Lastly, we have removed the overhead of writing and maintaining Data Catalog resource policies by introducing AWS Resource Access Manager (AWS RAM) invites with LF-Tags-based policies in the cross-account sharing version 3. We encourage you to further explore cross-account sharing in Lake Formation.

Extending Lake Formation permissions to new data

Until re:Invent 2022, Lake Formation provided permissions management for IAM principals on Data Catalog resources with underlying data primarily on Amazon Simple Storage Service (Amazon S3). At re:Invent 2022, we introduced Lake Formation permissions management for Amazon Redshift data shares in preview mode. Amazon Redshift is a fully-managed, petabyte-scale data warehouse service in the AWS Cloud. The data sharing feature allows data owners to group databases, tables, and views in an Amazon Redshift cluster and share it with other Amazon Redshift clusters within or across AWS accounts. Data sharing reduces the need to keep multiple copies of the same data in different data warehouses to accelerate business decision-making across an organization. Lake Formation further enhances sharing data within Amazon Redshift data shares by providing fine-grained access control on tables and views.

For additional details on this feature, refer to AWS Lake Formation-managed Redshift datashares (preview) and How Redshift data share can be managed by Lake Formation.

Amazon EMR is a managed cluster platform to run big data applications using Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi, and Presto at scale. You can use Amazon EMR to run batch and stream processing analytics jobs on your S3 data lakes. Starting with Amazon EMR release 6.7.0, we introduced Lake Formation permissions management on a runtime IAM role used with the EMR Steps API. This feature enables you to submit Apache Spark and Apache Hive applications to an EMR cluster through the EMR Steps API, with Lake Formation enforcing table-level and column-level permissions for the IAM role that submits the application. This Lake Formation integration with Amazon EMR allows you to share an EMR cluster across multiple users in an organization with different permissions by isolating your applications through a runtime IAM role. We encourage you to try this feature in the Lake Formation workshop Integration with Amazon EMR using Runtime Roles. To explore a use case, see Introducing runtime roles for Amazon EMR steps: Use IAM roles and AWS Lake Formation for access control with Amazon EMR.
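
To make the Steps API flow concrete, a runtime-role step submission might look roughly like the following boto3 sketch; the cluster ID, role ARN, and script location are placeholders.

import boto3

emr = boto3.client("emr")

# Submit a Spark step under a runtime IAM role; Lake Formation enforces the
# table- and column-level permissions granted to that role.
emr.add_job_flow_steps(
    JobFlowId="j-XXXXXXXXXXXXX",  # placeholder cluster ID
    ExecutionRoleArn="arn:aws:iam::123456789012:role/analyst-runtime-role",
    Steps=[
        {
            "Name": "spark-report",
            "ActionOnFailure": "CONTINUE",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": ["spark-submit", "s3://my-bucket/jobs/report.py"],
            },
        }
    ],
)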

Amazon SageMaker Studio is a fully integrated development environment (IDE) for machine learning (ML) that enables data scientists and developers to prepare data for building, training, tuning, and deploying models. Studio offers a native integration with Amazon EMR so that data scientists and data engineers can interactively prepare data at petabyte scale using open-source frameworks such as Apache Spark, Presto, and Hive using Studio notebooks. With the release of Lake Formation permissions management on a runtime IAM role, Studio now supports table-level and column-level access with Lake Formation. When users connect to EMR clusters from Studio notebooks, they can choose the IAM role (called the runtime IAM role) that they want to connect with. If data access is managed by Lake Formation, users can enforce table-level and column-level permissions using policies attached to the runtime role. For more details, refer to Apply fine-grained data access controls with AWS Lake Formation and Amazon EMR from Amazon SageMaker Studio.

Ingest and catalog varied data

A robust data governance model includes data from an organization’s many data sources and methods to discover and catalog those varied data assets. AWS Glue crawlers provide the ability to discover data from sources including Amazon S3, Amazon Redshift, and NoSQL databases, and populate the AWS Glue Data Catalog.

In 2022, we launched AWS Glue crawler support for Snowflake and AWS Glue crawler support for Delta Lake tables. These integrations allow AWS Glue crawlers to create and update Data Catalog tables based on these popular data sources. This makes it even easier to create extract, transform, and load (ETL) jobs with AWS Glue based on these Data Catalog tables as sources and targets.
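
As an example, a crawler for a set of Delta Lake tables could be created and started with boto3 roughly as follows; the crawler name, IAM role, database, and S3 paths are placeholders.

import boto3

glue = boto3.client("glue")

# Create a crawler that catalogs Delta Lake tables stored in Amazon S3
glue.create_crawler(
    Name="delta-lake-crawler",
    Role="arn:aws:iam::123456789012:role/GlueCrawlerRole",  # placeholder role
    DatabaseName="lakehouse",
    Targets={
        "DeltaTargets": [
            {
                "DeltaTables": ["s3://my-bucket/delta/orders/"],
                "WriteManifest": False,
            }
        ]
    },
)

# Run it once to populate the Data Catalog
glue.start_crawler(Name="delta-lake-crawler")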

In 2022, the AWS Glue crawlers UI was redesigned to offer a better user experience. One of the main enhancements delivered as part of this revision is greater insight into AWS Glue crawler history. The crawler history UI provides an easy view of crawler runs, schedules, data sources, and tags. For each crawl, the crawler history offers a summary of changes in the database schema or Amazon S3 partition changes. Crawler history also provides detailed information about DPU hours and reduces the time spent analyzing and debugging crawler operations and costs. To explore the new functionalities added to the crawlers UI, refer to Set up and monitor AWS Glue crawlers using the enhanced AWS Glue UI and crawler history.

In 2022, we also extended support for crawlers based on Amazon S3 event notifications to support catalog tables. With this feature, incremental crawling can be offloaded from data pipelines to the scheduled AWS Glue crawler, reducing crawls to incremental S3 events. For more information, refer to Build incremental crawls of data lakes with existing Glue catalog tables.

More ways to share data beyond the data lake

During re:Invent 2022, we announced a preview of AWS Data Exchange for AWS Lake Formation, a new feature that enables data subscribers to find and subscribe to third-party datasets that are managed directly through Lake Formation. Until now, AWS Data Exchange subscribers could access third-party datasets by exporting providers’ files to their own S3 buckets, calling providers’ APIs through Amazon API Gateway, or querying producers’ Amazon Redshift data shares from their Amazon Redshift cluster. With the new Lake Formation integration, data providers curate AWS Data Exchange datasets using Lake Formation tags. Data subscribers are able to query and explore the databases and tables associated with those tags, just like any other AWS Glue Data Catalog resource. Organizations can apply resource-based Lake Formation permissions to share the licensed datasets within the same account or across accounts using AWS License Manager. AWS Data Exchange for Lake Formation streamlines data licensing and sharing operations by accelerating data onboarding, reducing the amount of ETL required for end-users to access third-party data, and centralizing governance and access controls for third-party data.

At re:Invent 2022, we also announced Amazon DataZone, a new data management service that makes it faster and easier for you to catalog, discover, share, and govern data stored across AWS, on-premises, and third-party sources. Amazon DataZone is a business data catalog service that supplements the technical metadata in the AWS Glue Data Catalog. Amazon DataZone is integrated with Lake Formation permissions management so that you can effectively manage and govern access to your data, and audit who is accessing what data and for what purpose. With the publisher-subscriber model of Amazon DataZone, data assets can be shared and accessed across Regions. For additional details about the service and its capabilities, refer to the Amazon DataZone FAQs and re:Invent launch.

Conclusion

Data is transforming every field and every business. However, with data growing faster than most companies can keep track of, collecting, securing, and getting value out of that data is a challenging thing to do. A modern data strategy can help you create better business outcomes with data. AWS provides the most complete set of services for the end-to-end data journey to help you unlock value from your data and turn it into insight.

At AWS, we work backward from customer requirements. From the Lake Formation team, we worked hard to deliver the features described in this post, and we invite you to check them out. With our continued focus to invent, we hope to play a key role in empowering organizations to build new data governance models that help you derive more business value at lightning speed.

You can get started with Lake Formation by exploring our hands-on workshop modules and Getting started tutorials. We look forward to hearing from you, our customers, on your data lake and data governance use cases. Please get in touch through your AWS account team and share your comments.


About the Authors

Jason Berkowitz is a Senior Product Manager with AWS Lake Formation. He comes from a background in machine learning and data lake architectures. He helps customers become data-driven.

Aarthi Srinivasan is a Senior Big Data Architect with AWS Lake Formation. She enjoys building data lake solutions for AWS customers and partners. When not on the keyboard, she explores the latest science and technology trends and spends time with her family.

Leonardo Gómez is a Senior Analytics Specialist Solutions Architect at AWS. Based in Toronto, Canada, he has over a decade of experience in data management, helping customers around the globe address their business and technical needs.

Cook: Bounded flexible arrays in C

Post Syndicated from corbet original https://lwn.net/Articles/921799/

Kees Cook has posted a detailed document describing the work to improve the safety of flexible-length arrays in the kernel.

Converting such codebases to use “modern” language features, like
those in C99 (still from the prior millennium), can be a major
challenge, but it is an entirely tractable problem. This post is a
deep dive into an effort underway in the Linux kernel to make array
index overflows (and more generally, buffer overflows) a thing of
the past, where they belong. Our success hinges on replacing
anachronistic array definitions with well-defined C99 flexible
arrays.

This work has been covered here as well.

Visualize multivariate data using a radar chart in Amazon QuickSight

Post Syndicated from Bhupinder Chadha original https://aws.amazon.com/blogs/big-data/visualize-multivariate-data-using-a-radar-chart-in-amazon-quicksight/

At AWS re:Invent 2022, we announced the general availability of two new Amazon QuickSight visuals: small multiples and text boxes. We are excited to add another new visual to QuickSight: radar charts. With radar charts, you can compare two or more items across multiple variables in QuickSight.

In this post, we explore radar charts, its use cases, and how to configure one.

What is a radar chart?

Radar charts (also known as spider charts, polar charts, web charts, or star plots) are a way to visualize multivariate data similar to a parallel coordinates chart. They are used to plot one or more groups of values over multiple common variables. They do this by providing an axis for each variable, and these axes are arranged radially around a central point and spaced equally. The center of the chart represents the minimum value, and the edges represent the maximum value on the axis. The data from a single observation is plotted along each axis and connected to form a polygon. Multiple observations can be placed in a single chart by displaying multiple polygons.

For example, consider an HR team wanting to compare employee satisfaction scores for departments like sales, marketing, and finance against various metrics such as work/life balance, diversity, inclusiveness, growth opportunities, and wages. As shown in the following radar chart, each employee metric forms an axis, with each department represented as an individual series.
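
QuickSight renders radar charts without any code, but if it helps to see the underlying geometry, here is a small matplotlib sketch (with made-up scores for a single department) of how the metrics become equally spaced axes around a center:

import numpy as np
import matplotlib.pyplot as plt

# Illustrative data for one department; the scores are invented
metrics = ["Work/life balance", "Diversity", "Inclusiveness", "Growth", "Wages"]
scores = [7, 8, 6, 5, 4]

# One axis per variable, spaced equally around a circle; repeat the first
# point at the end so the polygon closes
angles = np.linspace(0, 2 * np.pi, len(metrics), endpoint=False).tolist()
angles += angles[:1]
scores += scores[:1]

ax = plt.subplot(projection="polar")
ax.plot(angles, scores)
ax.fill(angles, scores, alpha=0.25)
ax.set_xticks(angles[:-1])
ax.set_xticklabels(metrics)
plt.show()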

Another effective way of comparing radar charts is to compare a given department against the average or baseline value. For instance, the sales department feels less compensated compared to the baseline, but ranks high on work/life balance.

When to use radar charts

Radar charts are a great option when space is a constraint and you want to compare multiple groups in a compact space. Radar charts are best used for the following:

  • Visualizing multivariate data, such as comparing cars across different stats like mileage, max speed, engine power, and driving pleasure
  • Comparative analysis (comparing two or more items across a list of common variables)
  • Spotting outliers and commonalities

Compared to parallel coordinates, radar charts are ideal when there are a few groups of items to be compared. You should also be mindful of not displaying too many variables, which can make the chart look cluttered and difficult to read.

Radar chart use cases

Radar charts have a wide variety of industry use cases, some of which are as follows:

  • Sports analytics – Compare athlete performance across different performance parameters for selection criteria
  • Strategy – Compare and measure different technology costs between various parameters, such as contact center, claims, massive claims, and others
  • Sales – Compare performance of sales reps across different parameters like deals closed, average deal size, net new customer wins, total revenue, and deals in the pipeline
  • Call centers – Compare call center staff performance against the staff average across different dimensions
  • HR – Compare company scores in terms of diversity, work/life balance, benefits, and more
  • User research and customer success – Compare customer satisfaction scores across different parts of the product

Different radar chart configurations

Let’s use an example of visualizing staff performance within a team, using the following sample data. The goal is to compare employee performance based on various qualities like communication, work quality, productivity, creativity, dependability, punctuality, and technical skills, ranging between a score of 0–10.

To add a radar chart to your analysis, choose the radar chart icon from the visual selector.

Depending on your use case and how the data is structured, you can configure radar charts in different ways.

Value as axis (UC1 and 2 tab from the dataset)

In this scenario, all qualities (communication, dependability, and so on) are defined as measures, and the employee is defined as a dimension in the dataset.

To visualize this data in a radar chart, drag all the variables to the Values field well and the Employee field to the Color field well.

Category as axis (UC1 and 2 tab from the dataset)

Another way to visualize the same data is to reverse the series and axis configuration, where each quality is displayed as a series and employees are displayed on the axis. For this, drag the Employee field to the Category field well and all the qualities to the Value field well.

Category as axis with color (UC3 tab from the dataset)

We can visualize the same use case with a different data structure, where all the qualities and employees are defined as a dimension and scores as values.

To achieve this use case, drag the field that you want to visualize as the axis to the Category field and individual series to the Color field. In our case, we chose Qualities as our axis, added Score to the Value field well, and visualized the values for each employee by adding Employee to the Color field well.

Styling radar charts

You can customize your radar charts with the following formatting options:

  • Series style – You can choose to display the chart as either a line (default) or area series

  • Start angle – By default, this is set to 90 degrees, but you can choose a different angle if you want to rotate the radar chart to better utilize the available real estate

  • Fill area – This option applies odd/even coloring for the plot area

  • Grid shape – Choose between circle or polygon for grid shape

Summary

In this post, we looked at how radar charts can help you visualize and compare items across different variables. We also learned about the different configurations supported by radar charts and styling options to help you customize its look and feel.

We encourage you to explore radar charts and leave a comment with your feedback.


About the author

Bhupinder Chadha is a senior product manager for Amazon QuickSight focused on visualization and front-end experiences. He is passionate about BI, data visualization, and low-code/no-code experiences. Prior to QuickSight, he was the lead product manager for Inforiver, responsible for building an enterprise BI product from the ground up. Bhupinder started his career in presales, followed by a short stint in consulting, and then served as PM for xViz, an add-on visualization product.

Migrate your indexes to Amazon OpenSearch Serverless with Logstash

Post Syndicated from Prashant Agrawal original https://aws.amazon.com/blogs/big-data/migrate-your-indexes-to-amazon-opensearch-serverless-with-logstash/

We recently announced the general availability of Amazon OpenSearch Serverless, a new option for Amazon OpenSearch Service that makes it easy to run large-scale search and analytics workloads without having to configure, manage, or scale OpenSearch clusters. With OpenSearch Serverless, you get the same interactive millisecond response times as OpenSearch Service with the simplicity of a serverless environment.

In this post, you’ll learn how to migrate your existing indices from an OpenSearch Service managed cluster domain to a serverless collection using Logstash.

With OpenSearch domains, you get dedicated, secure clusters configured and optimized for your workloads in minutes. You have full control over the configuration of compute, memory, and storage resources in clusters to optimize cost and performance for your applications. OpenSearch Serverless provides an even simpler way to run search and analytics workloads—without ever having to think about clusters. You simply create a collection and a group of indexes, and can start ingesting and querying the data.

Solution overview

Logstash is open-source software that provides ETL (extract, transform, and load) for your data. You can configure Logstash to connect to a source and a destination via input and output plugins. In between, you configure filters that can transform your data. This post walks you through the steps you need to set up Logstash to connect an OpenSearch Service domain (input) to an OpenSearch Serverless collection (output).

You set the source and destination plugins in Logstash’s config file. The config file has sections for Input, Filter, and Output. Once configured, Logstash will send a request to the OpenSearch Service domain and read the data according to the query you put in the input section. After data is read from OpenSearch Service, you can optionally send it to the next stage Filter for transformations such as adding or removing a field from the input data or updating a field with different values. In this example, you won’t use the Filter plugin. Next is the Output plugin. The open-source version of Logstash (Logstash OSS) provides a convenient way to use the bulk API to upload data to your collections. OpenSearch Serverless supports the logstash-output-opensearch output plugin, which supports AWS Identity and Access Management (IAM) credentials for data access control.

The following diagram illustrates our solution workflow.

Prerequisites

Before getting started, make sure you have completed the following prerequisites:

  1. Note down your OpenSearch Service domain’s ARN, user name, and password.
  2. Create an OpenSearch Serverless collection. If you’re new to OpenSearch Serverless, refer to Log analytics the easy way with Amazon OpenSearch Serverless for details on how to set up your collection.

Set up Logstash and the input and output plugins for OpenSearch

Complete the following steps to set up Logstash and your plugins:

  1. Download logstash-oss-with-opensearch-output-plugin. (This example uses the distro for macos-x64. For other distros, refer to the artifacts.)
    wget https://artifacts.opensearch.org/logstash/logstash-oss-with-opensearch-output-plugin-8.4.0-macos-x64.tar.gz

  2. Extract the downloaded tarball:
    tar -zxvf logstash-oss-with-opensearch-output-plugin-8.4.0-macos-x64.tar.gz
    cd logstash-8.4.0/

  3. Update the logstash-output-opensearch plugin to the latest version:
    <path/to/your/logstash/root/directory>/bin/logstash-plugin update logstash-output-opensearch

  4. Install the logstash-input-opensearch plugin:
    <path/to/your/logstash/root/directory>/bin/logstash-plugin install logstash-input-opensearch

Test the plugin

Let’s get into action and see how the plugin works. The following config file retrieves data from the movies index in your OpenSearch Service domain and indexes that data in your OpenSearch Serverless collection under the same index name, movies.

Create a new file and add the following content, then save the file as opensearch-serverless-migration.conf. Provide the values for the OpenSearch Service domain endpoint under HOST, USERNAME, and PASSWORD in the input section, and the OpenSearch Serverless collection endpoint details under HOST along with REGION, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY in the output section.

input {
    opensearch {
        hosts =>  ["https://<HOST>:443"]
        user  =>  "<USERNAME>"
        password  =>  "<PASSWORD>"
        index =>  "movies"
        query =>  '{ "query": { "match_all": {}} }'
    }
}
output {
    opensearch {
        ecs_compatibility => disabled
        index => "movies"
        hosts => "<HOST>:443"
        auth_type => {
            type => 'aws_iam'
            aws_access_key_id => '<AWS_ACCESS_KEY_ID>'
            aws_secret_access_key => '<AWS_SECRET_ACCESS_KEY>'
            region => '<REGION>'
            service_name => 'aoss'
            }
        legacy_template => false
        default_server_major_version => 2
    }
}

You can specify a query in the input section of the preceding config. The match_all query matches all data in the movies index. You can change the query if you want to select a subset of the data. You can also use the query to parallelize the data transfer by running multiple Logstash processes with configs that specify different data slices. You can also parallelize by running Logstash processes against multiple indexes if you have them.

Start Logstash

Use the following command to start Logstash:

<path/to/your/logstash/root/directory>/bin/logstash -f <path/to/your/config/file>

After you run the command, Logstash will retrieve the data from the source index from your OpenSearch Service domain, and write to the destination index in your OpenSearch Serverless collection. When the data transfer is complete, Logstash shuts down. See the following code:

[2023-01-24T20:14:28,965][INFO][logstash.agent] Successfully
started Logstash API endpoint {:port=>9600, :ssl_enabled=>false}
…
…
[2023-01-24T20:14:38,852][INFO][logstash.javapipeline][main] Pipeline terminated {"pipeline.id"=>"main"}
[2023-01-24T20:14:39,374][INFO][logstash.pipelinesregistry] Removed pipeline from registry successfully {:pipeline_id=>:main}
[2023-01-24T20:14:39,399][INFO][logstash.runner] Logstash shut down.

Verify the data in OpenSearch Serverless

You can verify that Logstash copied all your data by comparing the document count in your domain and your collection. Run the following query either from the Dev tools tab, or with curl, Postman, or a similar HTTP client. The following query searches all documents in the movies index and returns the top documents along with the count. By default, OpenSearch will return the document count up to a maximum of 10,000. Adding the track_total_hits flag helps you get the exact count of documents if the document count exceeds 10,000.

GET movies/_search
{
  "query": {
    "match_all": {}
  },
  "track_total_hits" : true
}
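
If you prefer to check the counts programmatically, a small opensearch-py sketch along the following lines should also work against the collection; the endpoint is a placeholder, and note the aoss service name used to sign requests to OpenSearch Serverless.

import boto3
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth

# Placeholder collection endpoint and Region
host = "xxxxxxxxxxxx.us-east-1.aoss.amazonaws.com"
region = "us-east-1"

credentials = boto3.Session().get_credentials()
auth = AWSV4SignerAuth(credentials, region, "aoss")  # sign requests for OpenSearch Serverless

client = OpenSearch(
    hosts=[{"host": host, "port": 443}],
    http_auth=auth,
    use_ssl=True,
    connection_class=RequestsHttpConnection,
)

# Compare this number with the document count in the source domain
print(client.count(index="movies"))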

Conclusion

In this post, you migrated data from your OpenSearch Service domain to your OpenSearch Serverless collection using Logstash’s OpenSearch input and output plugins.

Stay tuned for a series of posts focusing on the various options available for you to build effective log analytics and search solutions using OpenSearch Serverless. You can also refer to the Getting started with Amazon OpenSearch Serverless workshop to learn more about OpenSearch Serverless.

If you have feedback about this post, submit it in the comments section. If you have questions about this post, start a new thread on the Amazon OpenSearch Service forum or contact AWS Support.


About the authors

Prashant Agrawal is a Sr. Search Specialist Solutions Architect with Amazon OpenSearch Service. He works closely with customers to help them migrate their workloads to the cloud and helps existing customers fine-tune their clusters to achieve better performance and save on cost. Before joining AWS, he helped various customers use OpenSearch and Elasticsearch for their search and log analytics use cases. When not working, you can find him traveling and exploring new places. In short, he likes doing Eat → Travel → Repeat.

Jon Handler (@_searchgeek) is a Sr. Principal Solutions Architect at Amazon Web Services based in Palo Alto, CA. Jon works closely with the CloudSearch and Elasticsearch teams, providing help and guidance to a broad range of customers who have search workloads that they want to move to the AWS Cloud. Prior to joining AWS, Jon’s career as a software developer included four years of coding a large-scale, eCommerce search engine.

Serverless logging with Amazon OpenSearch Service and Amazon Kinesis Data Firehose

Post Syndicated from Jon Handler original https://aws.amazon.com/blogs/big-data/serverless-logging-with-amazon-opensearch-service-and-amazon-kinesis-data-firehose/

In this post, you will learn how you can use Amazon Kinesis Data Firehose to build a log ingestion pipeline to send VPC flow logs to Amazon OpenSearch Serverless. First, you create the OpenSearch Serverless collection you use to store VPC flow logs, then you create a Kinesis Data Firehose delivery pipeline that forwards the flow logs to OpenSearch Serverless. Finally, you enable delivery of VPC flow logs to your Firehose delivery stream. The following diagram illustrates the solution workflow.

OpenSearch Serverless is a new serverless option offered by Amazon OpenSearch Service. OpenSearch Serverless makes it simple to run petabyte-scale search and analytics workloads without having to configure, manage, or scale OpenSearch clusters. OpenSearch Serverless automatically provisions and scales the underlying resources to deliver fast data ingestion and query responses for even the most demanding and unpredictable workloads.

Kinesis Data Firehose is a popular service that delivers streaming data from over 20 AWS services to over 15 analytical and observability tools such as OpenSearch Serverless. Kinesis Data Firehose is great for those looking for a fast and easy way to send VPC flow log data to an OpenSearch Serverless collection in minutes without a single line of code and without building or managing their own data ingestion and delivery infrastructure.

VPC flow logs capture information about the traffic going to and from the network interfaces in your VPC. With the launch of Kinesis Data Firehose support for OpenSearch Serverless, you can analyze your VPC flow logs with just a few clicks. Kinesis Data Firehose provides a true end-to-end serverless mechanism to deliver your flow logs to OpenSearch Serverless, where you can use OpenSearch Dashboards to search through those logs, create dashboards, detect anomalies, and send alerts. VPC flow logs help you answer questions like:

  • What percentage of your traffic is getting dropped?
  • How much traffic is getting generated for specific sources and destinations?

Create your OpenSearch Serverless collection

To get started, you first create a collection. An OpenSearch Serverless collection is a logical grouping of one or more indexes that represent an analytics workload. Complete the following steps:

  1. On the OpenSearch Service console, choose Collections under Serverless in the navigation pane.
  2. Choose Create a collection.
  3. For Collection name, enter a name (for example, vpc-flow-logs).
  4. For Collection type, choose Time series.
  5. For Encryption, choose your preferred encryption setting:
    1. Choose Use AWS owned key to use an AWS managed key.
    2. Choose a different AWS KMS key to use your own AWS Key Management Service (AWS KMS) key.
  6. For Network access settings, choose your preferred setting:
    1. Choose VPC to use a VPC endpoint.
    2. Choose Public to use a public endpoint.

AWS recommends that you use a VPC endpoint for all production workloads. For this walkthrough, select Public.

  7. Choose Create.

It should take a couple of minutes to create the collection.

The following graphic gives a quick demonstration of creating the OpenSearch Serverless collection via the preceding steps.

At this point, you have successfully created a collection for OpenSearch Serverless. Next, you create a delivery pipeline for Kinesis Data Firehose.
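
If you prefer to script this step, the following boto3 sketch creates an equivalent collection. The console's create flow sets up the required security policies for you; via the API you create them explicitly before the collection. The policy names are placeholders, and the settings mirror the walkthrough (AWS owned key, public endpoint).

import json
import boto3

aoss = boto3.client("opensearchserverless")

# Encryption policy: required before the collection can be created
aoss.create_security_policy(
    name="vpc-flow-logs-encryption",   # placeholder name
    type="encryption",
    policy=json.dumps({
        "Rules": [{"ResourceType": "collection", "Resource": ["collection/vpc-flow-logs"]}],
        "AWSOwnedKey": True,           # or reference your own KMS key instead
    }),
)

# Network policy: public access for the collection and its dashboards
aoss.create_security_policy(
    name="vpc-flow-logs-network",      # placeholder name
    type="network",
    policy=json.dumps([{
        "Rules": [
            {"ResourceType": "collection", "Resource": ["collection/vpc-flow-logs"]},
            {"ResourceType": "dashboard", "Resource": ["collection/vpc-flow-logs"]},
        ],
        "AllowFromPublic": True,       # the walkthrough uses a public endpoint
    }]),
)

# Time series collection, matching the console choice above
aoss.create_collection(name="vpc-flow-logs", type="TIMESERIES")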

Create a Kinesis Data Firehose delivery stream

To set up a delivery stream for Kinesis Data Firehose, complete the following steps:

  1. On the Kinesis Data Firehose console, choose Create delivery stream.
  2. For Source, specify Direct PUT.

Check out Source, Destination, and Name to learn more about different sources supported by Kinesis Data Firehose.

  3. For Destination, choose Amazon OpenSearch Serverless.
  4. For Delivery stream name, enter a name (for example, vpc-flow-logs).
  5. Under Destination settings, in the OpenSearch Serverless collection settings, choose Browse.
  6. Select vpc-flow-logs.
  7. Choose Choose.

If your collection is still creating, wait a few minutes and try again.

  8. For Index, specify vpc-flow-logs.
  9. In the Backup settings section, select Failed data only for the Source record backup in Amazon S3.

Kinesis Data Firehose uses Amazon Simple Storage Service (Amazon S3) to back up failed data that it attempts to deliver to your chosen destination. If you want to keep all data, select All data.

  10. For S3 Backup Bucket, choose Browse to select an existing S3 bucket, or choose Create to create a new bucket.
  11. Choose Create delivery stream.

The following graphic gives a quick demonstration of creating the Kinesis Data Firehose delivery stream via the preceding steps.

At this point, you have successfully created a delivery stream for Kinesis Data Firehose, which you will use to stream data from your VPC flow logs and send it to your OpenSearch Serverless collection.
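
If you want to script this step as well, a minimal boto3 sketch is shown below. The role ARN, collection endpoint, and bucket ARN are placeholders; unlike the console wizard, the API expects the IAM role to already exist with permissions for the S3 bucket and the OpenSearch Serverless collection and a trust policy for Firehose.

import boto3

firehose = boto3.client("firehose")

firehose.create_delivery_stream(
    DeliveryStreamName="vpc-flow-logs",
    DeliveryStreamType="DirectPut",
    AmazonOpenSearchServerlessDestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",   # placeholder
        "CollectionEndpoint": "https://xxxxxx.us-east-1.aoss.amazonaws.com",  # placeholder
        "IndexName": "vpc-flow-logs",
        "S3BackupMode": "FailedDocumentsOnly",   # matches "Failed data only" in the console
        "S3Configuration": {
            "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",  # placeholder
            "BucketARN": "arn:aws:s3:::my-flow-logs-backup-bucket",              # placeholder
        },
    },
)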

Set up the data access policy for your OpenSearch Serverless collection

Before you send any logs to OpenSearch Serverless, you need to create a data access policy within OpenSearch Serverless that allows Kinesis Data Firehose to write to the vpc-flow-logs index in your collection. Complete the following steps:

  1. On the Kinesis Data Firehose console, choose the Configuration tab on the details page for the vpc-flow-logs delivery stream you just created.
  2. In the Permissions section, note down the AWS Identity and Access Management (IAM) role.
  3. Navigate to the vpc-flow-logs collection details page on the OpenSearch Serverless dashboard.
  4. Under Data access, choose Manage data access.
  5. Choose Create access policy.
  6. In the Name and description section, specify an access policy name, add a description, and select JSON as the policy definition method.
  7. Add the following policy in the JSON editor. Provide the collection name and index you specified during the delivery stream creation in the policy. Provide the IAM role name that you got from the permissions page of the Firehose delivery stream, and the account ID for your AWS account.
    [
      {
        "Rules": [
          {
            "ResourceType": "index",
            "Resource": [
              "index/<collection-name>/<index-name>"
            ],
            "Permission": [
              "aoss:WriteDocument",
              "aoss:CreateIndex",
              "aoss:UpdateIndex"
            ]
          }
        ],
        "Principal": [
          "arn:aws:sts::<aws-account-id>:assumed-role/<IAM-role-name>/*"
        ]
      }
    ]

  8. Choose Create.

The following graphic gives a quick demonstration of creating the data access policy via the preceding steps.
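
You can also apply the same data access policy programmatically. The sketch below wraps the JSON shown above in a boto3 call; the policy name, collection name, index, role name, and account ID are placeholders.

import json
import boto3

aoss = boto3.client("opensearchserverless")

# Same policy document as in the JSON editor above, with placeholder values
policy = [{
    "Rules": [{
        "ResourceType": "index",
        "Resource": ["index/vpc-flow-logs/vpc-flow-logs"],
        "Permission": ["aoss:WriteDocument", "aoss:CreateIndex", "aoss:UpdateIndex"],
    }],
    "Principal": ["arn:aws:sts::123456789012:assumed-role/firehose-delivery-role/*"],  # placeholder
}]

aoss.create_access_policy(
    name="firehose-write-access",   # placeholder policy name
    type="data",
    policy=json.dumps(policy),
)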

Set up VPC flow logs

In the final step of this post, you enable flow logs for your VPC with the destination as Kinesis Data Firehose, which sends the data to OpenSearch Serverless.

  1. Navigate to the AWS Management Console.
  2. Search for “VPC” and then choose Your VPCs in the search result (hover over the VPC rectangle to reveal the link).
  3. Choose the VPC ID link for one of your VPCs.
  4. On the Flow Logs tab, choose Create flow log.
  5. For Name, enter a name.
  6. Leave the Filter set to All. You can limit the traffic by selecting Accept or Reject.
  7. Under Destination, select Send to Kinesis Firehose in the same account.
  8. For Kinesis Firehose delivery stream name, choose vpc-flow-logs.
  9. Choose Create flow log.

The following graphic gives a quick demonstration of creating a flow log for your VPC following the preceding steps.
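
For a scripted alternative to the console steps above, the boto3 sketch below enables an equivalent flow log; the VPC ID and delivery stream ARN are placeholders, and it assumes the delivery stream is in the same account.

import boto3

ec2 = boto3.client("ec2")

ec2.create_flow_logs(
    ResourceType="VPC",
    ResourceIds=["vpc-0123456789abcdef0"],     # placeholder VPC ID
    TrafficType="ALL",                         # or ACCEPT / REJECT to limit the traffic
    LogDestinationType="kinesis-data-firehose",
    LogDestination="arn:aws:firehose:us-east-1:123456789012:deliverystream/vpc-flow-logs",  # placeholder ARN
)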

Examine the VPC flow logs data in your collection using OpenSearch Dashboards

You won’t be able to access your collection data until you configure data access. Data access policies allow users to access the actual data within a collection.

To create a data access policy for OpenSearch Dashboards, complete the following steps:

  1. Navigate to the vpc-flow-logs collection details page on the OpenSearch Serverless dashboard.
  2. Under Data access, choose Manage data access.
  3. Choose Create access policy.
  4. In the Name and description section, specify an access policy name, add a description, and select JSON as the policy definition method.
  5. Add the following policy in the JSON editor. Provide the collection name and index you specified during the delivery stream creation in the policy. Additionally, provide the IAM user and the account ID for your AWS account. Make sure you have the AWS access key and secret key for the IAM user that you specify as the principal.
    [
      {
        "Rules": [
          {
            "Resource": [
              "index/<collection-name>/<index-name>"
            ],
            "Permission": [
              "aoss:ReadDocument"
            ],
            "ResourceType": "index"
          }
        ],
        "Principal": [
          "arn:aws:iam::<aws-account-id>:user/<IAM-user-name>"
        ]
      }
    ]

  6. Choose Create.
  7. Navigate to OpenSearch Serverless and choose the collection you created (vpc-flow-logs).
  8. Choose the OpenSearch Dashboards URL and log in with your IAM access key and secret key for the user you specified under Principal.
  9. Navigate to dev tools within OpenSearch Dashboards and run the following query to retrieve the VPC flow logs for your VPC:
    GET <index-name>/_search
    {
      "query": {
        "match_all": {}
      }
    }

The query returns the data as shown in the following screenshot, which contains information such as account ID, interface ID, source IP address, destination IP address, and more.

Create dashboards

After the data is flowing into OpenSearch Serverless, you can easily create dashboards to monitor the activity in your VPC. The following example dashboard shows overall traffic, accepted and rejected traffic, bytes transmitted, and some charts with the top sources and destinations.
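
The dashboard panels boil down to aggregations over the flow log fields. As a rough example, the Python sketch below (using the same opensearch-py SigV4 setup as the earlier count-comparison sketch) computes accepted versus rejected traffic and the top source addresses. The field names action, srcaddr, and bytes, and the .keyword subfields, are assumptions based on the default flow log format and dynamic mapping; check your index mapping in OpenSearch Dashboards and adjust.

import boto3
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth

region = "us-east-1"   # placeholder: your Region
client = OpenSearch(
    hosts=[{"host": "xxxxxx.us-east-1.aoss.amazonaws.com", "port": 443}],  # placeholder collection endpoint
    http_auth=AWSV4SignerAuth(boto3.Session().get_credentials(), region, "aoss"),
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
)

resp = client.search(
    index="vpc-flow-logs",
    body={
        "size": 0,
        "aggs": {
            # Accepted vs. rejected traffic, with total bytes per bucket
            "by_action": {
                "terms": {"field": "action.keyword"},
                "aggs": {"total_bytes": {"sum": {"field": "bytes"}}},
            },
            # Top talkers by source address
            "top_sources": {"terms": {"field": "srcaddr.keyword", "size": 10}},
        },
    },
)

print(resp["aggregations"]["by_action"]["buckets"])
print(resp["aggregations"]["top_sources"]["buckets"])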

Clean up

If you don’t want to continue using the solution, be sure to delete the resources you created:

  1. Return to the AWS console and in the VPCs section, disable the flow logs for your VPC.
  2. In the OpenSearch Serverless dashboard, delete your vpc-flow-logs collection.
  3. On the Kinesis Data Firehose console, delete your vpc-flow-logs delivery stream.

Conclusion

In this post, you created an end-to-end serverless pipeline to deliver your VPC flow logs to OpenSearch Serverless using Kinesis Data Firehose. This example built a delivery pipeline for VPC flow logs, but you can also use Kinesis Data Firehose to ingest logs from Amazon Kinesis Data Streams and Amazon CloudWatch and deliver them to OpenSearch Serverless collections for analysis. With serverless solutions on AWS, you can focus on your application development rather than worrying about the ingestion pipeline and the tools to visualize your logs.

Get hands-on with OpenSearch Serverless by taking the Getting Started with Amazon OpenSearch Serverless workshop and check out other pipelines for analyzing your logs.

If you have feedback about this post, share it in the comments section. If you have questions about this post, start a new thread on the Amazon OpenSearch Service forum or contact AWS Support.


About the authors

Jon Handler (@_searchgeek) is a Principal Solutions Architect at Amazon Web Services based in Palo Alto, CA. Jon works closely with the CloudSearch and Elasticsearch teams, providing help and guidance to a broad range of customers who have search workloads that they want to move to the AWS Cloud. Prior to joining AWS, Jon’s career as a software developer included four years of coding a large-scale, eCommerce search engine.

Prashant Agrawal is a Sr. Search Specialist Solutions Architect with Amazon OpenSearch Service. He works closely with customers to help them migrate their workloads to the cloud and helps existing customers fine-tune their clusters to achieve better performance and save on cost. Before joining AWS, he helped various customers use OpenSearch and Elasticsearch for their search and log analytics use cases. When not working, you can find him traveling and exploring new places. In short, he likes doing Eat → Travel → Repeat.

A Customer Success Manager’s Journey to Cybersecurity

Post Syndicated from Rapid7 original https://blog.rapid7.com/2023/01/31/a-customer-success-managers-journey-to-cybersecurity/

A Customer Success Manager’s Journey to Cybersecurity

Originally planning to pursue a career in sports journalism, Blake Walters joined Rapid7 ready to roll up his sleeves and learn about an entirely new field—cybersecurity. Walters always had an interest in computer engineering. However, he craved the ability to connect with people and build relationships instead of working deep within coding.

Walters is a learner by nature and is not afraid to take on new challenges or face new risks. Living by the mindset, “If I don’t know, I will work to figure it out,” he began his journey as a recruiter in the technology space. This gave him a great opportunity to learn more about how software is built, which eventually led him to Customer Success, where he could build relationships with customers and help others.

Walters had his first personal brush with cybersecurity when a client he was working with, a small hospital, got hit with Wannacry ransomware in 2017. He became even more curious about cybersecurity as he witnessed firsthand the impact it had on his client.

“You know what cybersecurity is and you know people get hacked all the time, but unless you are in it, you don’t realize the ins and outs of what that impact is,” he said. “There were 4-5 weeks where they couldn’t access hospital records, patient information, company files, ANYTHING. That’s a big challenge for a small hospital, or any company.”

From there, the stars aligned, and Walters was approached with an opportunity to join Rapid7. He noted that during his interview there was less emphasis on having a vast amount of cybersecurity knowledge. Instead, the focus was on his ability to build relationships and proactively use the resources provided by Rapid7 to build the industry knowledge needed to be successful in the role.

According to Walters, joining Rapid7 felt like he had finally found a place where he could do what he loved, while being supported in continuing to learn a new industry and grow his career.

“With cybersecurity, it doesn’t matter what you did yesterday. Hackers are changing all the time. If we aren’t also helping our customers evolve and improve their security over time, we are doing them a disservice,” he said. “That’s why Customer Success is so important. It doesn’t matter how good you’ve been in the past, it’s about how good you’re going to be moving forward. That is an exciting and motivating mindset to have.”

One of the biggest misconceptions about cybersecurity is that you need to have specific knowledge to break into the field. According to Walters, that was not his experience.  

“Everyone has a day 1. You don’t wake up with knowledge of cybersecurity products,” he said. “If you are trying to break into the field, just start reading. There is plenty of information out there. Learn the basics, and then as you’re looking at companies and jobs, start tailoring your understanding of what that company does.”

In an environment where things change so rapidly, it is integral to have an open mind and willingness to adapt. In regard to Rapid7 specifically, Walters believes diversity is key to the company’s success.

“Having different types of people and backgrounds in an organization has a huge impact. It keeps you out of groupthink and lets people collaborate for a common good,” he said. “At Rapid7, that stood out to me early in the interview process. Everyone is challenging one another to be better. That’s what I was looking for in a company regardless of what industry or business it was.”

Overall, Walters wants others out there thinking about entering the cybersecurity space to know that with some effort, you can make it happen. Even without a technical background.

“Don’t be afraid to push yourself outside your comfort zone. I came into this with no cyber experience. It shows the ability of Rapid7 to take a risk on people who are willing to come in, devote themselves to learning and growth, put in the work, and make an impact,” he said. “It’s not about just finding a job, it’s about finding a home.”

To learn more about opportunities available at Rapid7, visit: careers.rapid7.com
