Security updates for Wednesday

Post Syndicated from corbet original https://lwn.net/Articles/954921/

Security updates have been issued by Debian (debian-security-support and xorg-server), Fedora (java-17-openjdk, libcmis, and libreoffice), Mageia (fish), Red Hat (buildah, containernetworking-plugins, curl, fence-agents, kernel, kpatch-patch, libxml2, pixman, podman, runc, skopeo, and tracker-miners), SUSE (kernel, SUSE Manager 4.3.10 Release Notes, and SUSE Manager Client Tools), and Ubuntu (gnome-control-center, linux-gcp, linux-kvm, linux-gkeop, linux-gkeop-5.15, linux-hwe-6.2, linux-lowlatency-hwe-6.2, linux-nvidia-6.2, linux-lowlatency, linux-lowlatency-hwe-5.15, linux-oracle, linux-oracle-5.4, linux-raspi, linux-raspi-5.4, netatalk, and pydantic).

Strengthening customer third-party due diligence with renewed AWS CyberGRX assessment

Post Syndicated from Naranjan Goklani original https://aws.amazon.com/blogs/security/strengthening-customer-third-party-due-diligence-with-renewed-aws-cybergrx-assessment/


Amazon Web Services (AWS) is pleased to announce the successful renewal of the AWS CyberGRX cyber risk assessment report. This third-party validated report helps customers perform effective cloud supplier due diligence on AWS and enhances customers’ third-party risk management process.

With the increase in adoption of cloud products and services across multiple sectors and industries, AWS has become a critical component of customers’ environments. Regulated customers are held to high standards by regulators and auditors when it comes to exercising effective due diligence on third parties.

Many customers use third-party cyber risk management (TPCRM) services such as CyberGRX to better manage risks from their evolving third-party environments and to drive operational efficiencies. To help with such efforts, AWS has completed the CyberGRX assessment of its security posture. CyberGRX security analysts perform the assessment and validate the results annually.

The CyberGRX assessment applies a dynamic approach to third-party risk assessment. This approach integrates advanced analytics, threat intelligence, and sophisticated risk models with vendors’ responses to provide an in-depth view of how a vendor’s security controls help protect against potential threats.

Vendor profiles are continuously updated as the risk level of cloud service providers changes, or as AWS updates its security posture and controls. This approach eliminates outdated static spreadsheets for third-party risk assessments, in which the risk matrices are not updated in near real time.

In addition, AWS customers can use the CyberGRX Framework Mapper to map AWS assessment controls and responses to well-known industry standards and frameworks, such as National Institute of Standards and Technology (NIST) 800-53, NIST Cybersecurity Framework, International Organization for Standardization (ISO) 27001, Payment Card Industry Data Security Standard (PCI DSS), and the U.S. Health Insurance Portability and Accountability Act (HIPAA). This mapping can reduce customers’ third-party supplier due-diligence burden.

Customers can access the AWS CyberGRX report at no additional cost. Customers can request access to the report by completing an access request form, available on the AWS CyberGRX page.

As always, we value your feedback and questions. Reach out to the AWS Compliance team through the Contact Us page. If you have feedback about this post, submit comments in the Comments section below. To learn more about our other compliance and security programs, see AWS Compliance Programs.

Want more AWS Security news? Follow us on Twitter.

Naranjan Goklani


Naranjan is an Audit Lead for Canada. He has experience leading audits, attestations, certifications, and assessments across the Americas. Naranjan has more than 13 years of experience in risk management, security assurance, and performing technology audits. He previously worked in one of the Big 4 accounting firms and supported clients from the financial services, technology, retail, and utilities industries.

The Year of Artificial Thinking

Post Syndicated from original https://www.toest.bg/godinata-na-izkustvenoto-mislene/


Along with the usual year-end reckonings, December gives us a chance to make sense of 2023 through the prism of words as well, or more precisely through those words that name the events and trends which occupied, excited, worried, or puzzled us over the past year.

The Bulgarian words of 2023 will be chosen in January through a campaign organized by the literacy platform „Как се пише?“ (“How Is It Spelled?”). Several of the most popular English-language dictionaries, however, have already announced the English words that, in their view, capture the spirit of the year. Although these particular words are unlikely to make the Bulgarian selection, they are still worth examining. On the one hand, because, as it happens, all of them exist in Bulgarian too. On the other, because the ever closer interconnection of global processes and trends means that they reflect themes which matter to the Bulgarian language community as well.

It is curious (but not surprising, given the direction in which these global processes and trends are moving) that

a large share of the English words singled out as key for 2023 are connected with the theme of, or rather the crisis of, truth and reality.

According to the most popular American dictionary, Merriam-Webster, for example, the word of the year for 2023 is the adjective authentic. Like its Bulgarian equivalent „автентичен“, the word comes from Latin (entering English via French, and Bulgarian via Russian), where it in turn appeared as a borrowing from the Ancient Greek αὐθεντικός (authentikós), that is, “pertaining to absolute authority; primary, original”.

According to one of the dictionary’s editors, the linguist Peter Sokolowski, quoted by the Associated Press:

In 2023 we are seeing a crisis of authenticity. We realize that, by questioning authenticity, we value it even more.

Sokolowski and his team track how particular world events drive increased lookups of particular words. In the case of authentic there were no such isolated spikes; instead, interest in the word grew steadily throughout the year.

The theme of authenticity also creeps, one way or another, into many of the “runner-up” words which, according to Merriam-Webster’s editors, likewise attracted unusual interest during the year, measured in traffic to the dictionary’s online edition.

Albeit indirectly, the meanings of many of them turn out to be conceptually linked to the theme of “reality vs. fiction”, for example: deepfake, dystopian, doppelganger, indict (lookups of which rose by 9,440% on March 30, when former US President Donald Trump was indicted over the payment of money in exchange for the silence of the porn actress Stormy Daniels), and deadname (“the name that a transgender person was given at birth and no longer uses”).

With a little imagination, we could even suppose that an artificially created reality of some kind also underlies the word coronation, which, thanks to the coronation of Charles III, enjoyed 15,681% more interest than in the previous year¹.

No particular imagination is needed to notice, and to note, that questions about reality and its substitution directly influenced the choices of other English-language dictionaries as well. The Cambridge Dictionary, the world’s most-visited free dictionary, for instance, declared the verb hallucinate its word of 2023.

The word entered English as early as the 17th century: it is believed that, like “authentic”, it came from Ancient Greek (from ἀλύω/alúō ‘to roam, to wander’) via Latin (from alucinor ‘to wander in mind, to rave, to babble’).

Although hallucinate has long been used in English² with its usual definition (“to see, hear, feel, or smell something that does not exist, usually because of a health condition or as a result of taking drugs”), the Cambridge Dictionary chose it as the word of 2023 because of its newly emerged meaning, which has already been added as a second definition: “when an artificial intelligence (= a computer system that has some of the qualities of the human brain, such as the ability to produce language in a way that seems human) hallucinates, it produces false information”.

Dr. Henry Shevlin, who studies the ethics of artificial intelligence at the University of Cambridge, adds:

The widespread use of the term “hallucinate” in connection with errors by systems such as ChatGPT provides a fascinating snapshot of how we make sense of and anthropomorphize artificial intelligence. Inaccurate or misleading information has, of course, long been with us, whether in the form of rumors, propaganda, or “fake news”.

As confirmation of this, if we look back we can see that the term fake news (once again thanks to Donald Trump, in whose statements it features constantly) was declared word of the year for 2017 by the British dictionary Collins. But whereas five years ago the sources of disinformation were still mostly of human origin, today we have (also) another serious, inexhaustible, and rather frightening source of charlatanry. The words mentioned so far are undoubtedly connected with it, albeit indirectly. The Collins lexicographers, however, name it quite directly and without beating around the bush, choosing it as the most significant word of 2023.

It is, of course, the term “artificial intelligence”, known in English by the abbreviation AI. Defined as “the modelling of human mental functions by computer programs”, the term was chosen, according to the dictionary’s publisher, because of its sharply accelerated development, which made it a major topic of conversation in 2023. Over the course of the year, use of the term (specifically in the form of the initials AI) quadrupled.

The fourth word chosen as word of the year in the English-speaking world, this time by the Oxford dictionary, is also a kind of abbreviation, one that, before the many media reports announcing the dictionary’s choice, was probably utterly unknown to a large share of grown-up readers. Adolescents, however, Bulgarian ones included, appear to use it actively.

The word in question is the colloquial rizz, a noun³ defined as “style, charm, attractiveness, as well as someone’s ability to attract or seduce romantic or sexual partners”.

At first glance the word rizz sounds great fun and far less threatening than the other words of the year, but on closer inspection we can see that it, too, carries an element of deception, or at least of some kind of enticement, in its very origin. Rizz in fact derives from charisma and comes from the Ancient Greek χάρισμα (khárisma, ‘favor, divine gift’), which in turn comes from χάρις (kháris, ‘favor, beauty, grace’).

The word charisma began to be used in the modern sense familiar to us sometime around the middle of the last century; before that, its meaning had a mainly religious tinge. On the one hand, it was used to denote a special divine gift thanks to which the recipient possesses certain abilities, qualities, or authority without having any substantial personal merit for them. In addition, the Christian understanding of charisma holds that the gifts of the Holy Spirit, received in sacraments such as baptism and the Eucharist, are charismatic as well.

It turns out that precisely through the idea of a “gift” the word “charisma” shares a surprising etymological link with another word in the Bulgarian language. This is the verb „харизвам“ (“to give away”), which derives from the Greek χαρίζω (charízo ‘to give as a gift’), although its Ancient Greek roots belong to a far loftier register than the colloquial way in which the Bulgarian word is used.

Given the approaching holidays and the rituals that accompany them, the subject of gifts seems to me a fitting way to end this text. And let us wish that the new year, besides being „честита“ (“happy”; from „част“, “share”, not from „чест“, “honor”!), be filled with more authenticity, all-round rizz, and only pleasant, purely human hallucinations.

1 The enormous jump in lookups of the word coronation is undoubtedly impressive, but we should still note that it is probably due in large part to the fact that Merriam-Webster is an American dictionary, not a British one.

2 Although it entered English in the 17th century with the meaning “to entertain illusions”, a query in Google’s Ngram Viewer, an online search tool that tracks how often particular words or phrases appear in published sources between 1500 and 2019, shows that the popularity of the word hallucinate rose visibly in the 1950s and grew steadily through the second half of that century and into the beginning of the 21st.

3 Besides being a noun describing something a person possesses, rizz is also used as a verb, both in English (usually combined with the adverb up) and, as it turns out, in Bulgarian (as „rizz-вам“), meaning “to attract, to chat up, to seduce, to charm”. Rizz first appeared as a slang term for the “bewitching” effect a man has on a woman, but the expression has evolved and is now used, regardless of gender, for anyone who possesses magnetism and charm.

4 Forming abbreviations by using part of a word is relatively rare in English, but there are precedents: for example, fridge from refrigerator and flu from influenza.

5 The word “charisma” is also etymologically related to the so-called Charites of Ancient Greek mythology, goddesses of grace, beauty, and charm, who correspond to the Graces in Roman mythology. Their number varies; it is usually held that there are three, most often encountered under the names Aglaea (“radiant”), Euphrosyne (“prudent”), and Thalia (“blooming”), but one of them sometimes also appears under the name Charis.

6 The word „подарък“ (“present”) derives from the Proto-Slavic *darъ, which is likewise etymologically related to the Ancient Greek word for “gift”, δῶρον (dôron). From there, of course, also comes the name Todor, or “gift of God”.


In the column „От дума на дума“ (“From Word to Word”), Ekaterina Petrova looks for topical, interesting, or newly coined words from our everyday life and traces their often surprising origins, the development of their meanings over time, and their connections with languages near and far.

Surveillance by the US Postal Service

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2023/12/surveillance-by-the-us-postal-service.html

This is not about mass surveillance of mail; it is about the sorts of targeted surveillance the US Postal Inspection Service uses to catch mail thieves:

To track down an alleged mail thief, a US postal inspector used license plate reader technology, GPS data collected by a rental car company, and, most damning of all, hid a camera inside one of the targeted blue post boxes which captured the suspect’s full face as they allegedly helped themselves to swathes of peoples’ mail.

“Биволъ” unmasks the forest mafia (PART ONE): Foresters in Sofia Province spread an umbrella over illegal logging

Post Syndicated from the Биволъ team original https://bivol.bg/nezakonna-sech-sofia-oblast.html

Wednesday, December 13, 2023


A lack of real oversight in the forests and thefts worth more than 1 million leva in Sofia Province are revealed by the first in a series of new “Биволъ” investigations into illegal logging, in partnership with…

Our role in supporting the nonprofit ecosystem

Post Syndicated from Let's Encrypt original https://letsencrypt.org/2023/12/13/ngos/

For more than ten years, we at the nonprofit Internet Security Research Group (ISRG) have been focused on our mission of building a more secure and privacy-respecting Internet for everyone, everywhere. As we touch on in our 2023 Annual Report, we now serve more than 360 million domains with free TLS certificates.

Beyond being a big number, what does that signify? What’s the importance of having TLS being widely adopted anyways? We’ll take a closer look at these questions through the lens of one group of Subscribers we can relate to particularly well: nonprofits.

Serving .org at Internet scale

Let’s Encrypt serves 57% of all websites using the .org top level domain (TLD), which is commonly used by nonprofits. In the US alone there are 1.8M registered nonprofit organizations. And while the focus of these organizations varies, all of them rely on the Internet in some capacity.

When a nonprofit uses a TLS certificate on their website, it protects their visitors and stakeholders from snoopers, MITM attacks, and surveillance. Without TLS, nonprofits’ content could be changed without their knowledge or their visitors’ private information could be compromised. Access to free and automated TLS via Let’s Encrypt means these nonprofits face as few barriers as possible to adopting TLS.
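As a rough illustration of what a visitor's browser or client sees, the sketch below uses Python's standard `ssl` module to read who issued a site's certificate. The helper names (`issuer_organization`, `fetch_certificate`) and the sample certificate values are assumptions made up for this example; the sample only mirrors the nested-tuple shape that `SSLSocket.getpeercert()` returns.

```python
import socket
import ssl
from typing import Optional


def issuer_organization(cert: dict) -> Optional[str]:
    """Return the issuer's organizationName from a getpeercert()-style dict."""
    # 'issuer' is a tuple of RDNs; each RDN is a tuple of (name, value) pairs.
    for rdn in cert.get("issuer", ()):
        for name, value in rdn:
            if name == "organizationName":
                return value
    return None


def fetch_certificate(hostname: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Open a verified TLS connection and return the server's certificate."""
    # The default context verifies the certificate chain and the hostname.
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()


# Hypothetical sample in the shape getpeercert() returns (values are made up).
SAMPLE_CERT = {
    "subject": ((("commonName", "example.org"),),),
    "issuer": (
        (("countryName", "US"),),
        (("organizationName", "Let's Encrypt"),),
        (("commonName", "R3"),),
    ),
}

if __name__ == "__main__":
    print(issuer_organization(SAMPLE_CERT))
```

Against a live site, `issuer_organization(fetch_certificate("example.org"))` would perform the same lookup over the network; the connection fails with an `ssl.SSLError` if the certificate cannot be verified.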

In short, something as fundamental as security and privacy should be as easy to access as possible. For nonprofits both large and small, Let’s Encrypt makes it easy to provide security and privacy for users of their websites, enabling them to remain focused on their missions.

Zooming in on three nonprofits we serve

The American Civil Liberties Union (ACLU) uses Let’s Encrypt as it works to realize its focus of being a “guardian of liberty” for US citizens. Using Let’s Encrypt protects ACLU’s constituents when they’re trying to know their rights or take action. With more than 4 million page views per month, ACLU’s website is a critical part of their mission.

Human Rights Watch (HRW) is an international nonprofit organization. With more than 500 individuals on staff around the world, HRW’s website is a trove of information empowering individuals and organizations alike to be informed and take action with a global perspective. Nearly 70% of HRW’s web traffic comes from people outside of the United States; that’s millions of page views per month secured by Let’s Encrypt—and by extension, millions of people around the world benefitting from a more secure and privacy-respecting Web.

The Center for Democracy & Technology (CDT) uses Let’s Encrypt to advance its mission to promote democratic values by shaping technology policy and architecture, with a focus on the rights of the individual. The CDT website offers updated and insightful information into the ways policy and innovation impact the digital space. Without a TLS certificate, the content of these pages could be intercepted and changed. What’s more, for those looking to financially support CDT, using TLS on their donation page encrypts the transaction, protecting user details such as credit card and other personal information. Mallory Knodel, CTO at CDT and longtime digital rights defender and advocate, commented, “Billions of people in over 60 countries access the internet with less censorship and surveillance because Let’s Encrypt hastened the adoption of web security measures by making certificates easy to obtain.”

Serving philanthropic foundations

In the United States, the work of nonprofits is made possible in large part through philanthropic foundations and organizations. When it comes to philanthropy’s web presence, Let’s Encrypt is there, too.

We provide TLS to billion dollar philanthropic organizations like the Hewlett Foundation, the Silicon Valley Community Foundation, Yield Giving, and many others. Taking a look at the top 50 philanthropic organizations around the world, Let’s Encrypt serves 36% of them. For large philanthropies, their website is the primary tool they have to communicate their focus areas for future funding as well as the impact they’ve made with past giving.

One of the leading philanthropists in the US, Craig Newmark, uses Let’s Encrypt and Digital Ocean for his website, craig newmark philanthropies. Commenting on our work, Craig recently shared, “The people at ISRG have been helping protect the Internet for over ten years, and continue to protect us all. They’re a necessary part of Cyber Civil Defense and national security.”

Overall, while Let’s Encrypt aims to build a better Internet, we’re particularly proud that our impact protects those seeking to build a better world.

Internet Security Research Group (ISRG) is the parent organization of Prossimo, Let’s Encrypt, and Divvi Up. ISRG is a 501(c)(3) nonprofit. If you’d like to support our work, please consider getting involved, donating, or encouraging your company to become a sponsor.

Our role in supporting the nonprofit ecosystem

Post Syndicated from Let's Encrypt original https://letsencrypt.org/2023/12/13/ngos.html

For more than ten years, we at the nonprofit Internet Security Research Group (ISRG) have been focused on our mission of building a more secure and privacy-respecting Internet for everyone, everywhere. As we touch on in our 2023 Annual Report, we now serve more than 360 million domains with free TLS certificates.

Beyond being a big number, what does that signify? What’s the importance of having TLS being widely adopted anyways? We’ll take a closer look at these questions through the lens of one group of Subscribers we can relate to particularly well: nonprofits.

Serving .org at Internet scale

Let’s Encrypt serves 51% of all websites using the .org top level domain (TLD), which is commonly used by nonprofits. In the US alone there are 1.8M registered nonprofit organizations. And while the focus of these organizations varies, all of them rely on the Internet in some capacity.

When a nonprofit uses a TLS certificate on their website, it protects their visitors and stakeholders from snoopers, MITM attacks, and surveillance. Without TLS, nonprofits’ content could be changed without their knowledge or their visitors’ private information could be compromised. Access to free and automated TLS via Let’s Encrypt means these nonprofits face as few barriers as possible to adopting TLS.

In short, something as fundamental as security and privacy should be as easy to access as possible. For nonprofits both large and small, Let’s Encrypt makes it easy to provide security and privacy for users of their websites, enabling them to remain focused on their missions.

Zooming in on three nonprofits we serve

The American Civil Liberties Union (ACLU) uses Let’s Encrypt as it works to realize its focus of being a “guardian of liberty” for US citizens. Using Let’s Encrypt protects ACLU’s constituents when they’re trying to know their rights or take action. With more than 4 million page views per month, ACLU’s website is a critical part of their mission.

Human Rights Watch (HRW) is an international nonprofit organization. With more than 500 individuals on staff around the world, HRW’s website is a trove of information empowering individuals and organizations alike to be informed and take action with a global perspective. Nearly 70% of HRW’s web traffic comes from people outside of the United States; that’s millions of page views per month secured by Let’s Encrypt—and by extension, millions of people around the world benefitting from a more secure and privacy-respecting Web.

The Center for Democracy & Technology (CDT) uses Let’s Encrypt to advance its mission to promote democratic values by shaping technology policy and architecture, with a focus on the rights of the individual. The CDT website offers updated and insightful information into the ways policy and innovation impact the digital space. Without a TLS certificate, the content of these pages could be intercepted and changed. What’s more, for those looking to financially support CDT, using TLS on their donation page encrypts the transaction, protecting user details such as credit card and other personal information. Mallory Knodel, CTO at CDT and longtime digital rights defender and advocate, commented, “Billions of people in over 60 countries access the internet with less censorship and surveillance because Let’s Encrypt hastened the adoption of web security measures by making certificates easy to obtain.”

Serving philanthropic foundations

In the United States, the work of nonprofits is made possible in large part through philanthropic foundations and organizations. When it comes to philanthropy’s web presence, Let’s Encrypt is there, too.

We provide TLS to billion dollar philanthropic organizations like the Hewlett Foundation, the Silicon Valley Community Foundation, and many others. Taking a look at the top 50 philanthropic organizations around the world, Let’s Encrypt serves 36% of them. For large philanthropies, their website is the primary tool they have to communicate their focus areas for future funding as well as the impact they’ve made with past giving.

One of the leading philanthropists in the US, Craig Newmark, uses Let’s Encrypt and Digital Ocean for his website, craig newmark philanthropies. Commenting on our work, Craig recently shared, “The people at ISRG have been helping protect the Internet for over ten years, and continue to protect us all. They’re a necessary part of Cyber Civil Defense and national security.”

Overall, while Let’s Encrypt aims to build a better Internet, we’re particularly proud that our impact protects those seeking to build a better world.

Internet Security Research Group (ISRG) is the parent organization of Prossimo, Let’s Encrypt, and Divvi Up. ISRG is a 501(c)(3) nonprofit. If you’d like to support our work, please consider getting involved, donating, or encouraging your company to become a sponsor.

The end of vger.kernel.org

Post Syndicated from corbet original https://lwn.net/Articles/954783/

Konstantin Ryabitsev has announced
that the movement of kernel mailing lists away from the venerable
vger.kernel.org system is nearly complete:

Over the past few months we’ve migrated all of the vger.kernel.org
mailing lists, with the exception of the Big One (linux-kernel, aka
LKML). This list alone is responsible for about 80% of all vger
mailing list traffic, so we left it for the last.

This Thursday, December 14, at 11AM Pacific (19:00 UTC), we will
switch the MX record for vger to point to the new location
(subspace.kernel.org), which will complete the mailing list
migration from the legacy vger server to the new infrastructure.

Graber: LXD now re-licensed and under a CLA

Post Syndicated from corbet original https://lwn.net/Articles/954777/

The story of Canonical’s takeover of the LXD container manager, and the
subsequent creation of the Incus fork, has been
simmering for a while. Now Incus developer Stéphane Graber reports
that Canonical has changed the license and contribution terms for LXD:

Per the commit message performing the re-licensing, all further
contributions will be under the AGPLv3 license and all
contributions from Canonical employees have been re-licensed to
AGPLv3.

However, Canonical does not own the copyright on any contribution
from non-employees, such as the many changes they have imported
from Incus over the past few months. Those therefore remain under
the Apache2 license that they were contributed under.

As a result, Canonical cannot release LXD under the AGPLv3 license
and likely never will be able to. LXD is now under a weird mix of
Apache2 and AGPLv3 with no clear metadata indicating what file or
what part of each file is under one license or the other.

He also notes that this change will put an end to the flow of patches — in
either direction — between the two projects.

Patch Tuesday – December 2023

Post Syndicated from Adam Barnett original https://blog.rapid7.com/2023/12/12/patch-tuesday-december-2023/


Microsoft is addressing 34 vulnerabilities this December Patch Tuesday, including a single zero-day vulnerability and three critical remote code execution (RCE) vulnerabilities. December Patch Tuesday has historically seen fewer patches than a typical month, and this trend continues in 2023. This total does not include eight browser vulnerabilities published earlier this month. At the time of writing, none of the vulnerabilities patched today has been added to the CISA KEV list.

Certain AMD processors: zero-day information disclosure

This month’s lone zero-day vulnerability is CVE-2023-20588, which describes a potential information disclosure due to a flaw in certain AMD processor models as listed on the AMD advisory. AMD states that a divide-by-zero on these processor models could potentially return speculative data. AMD believes the potential impact of the vulnerability is low since local access is required; however, Microsoft ranks severity as important under its own proprietary severity scale. The vulnerability is patched at the OS level in all supported versions of Windows, even as far back as Windows Server 2008 for Azure-hosted assets participating in the Extended Security Update (ESU) program.

Outlook: no-interaction critical RCE

CVE-2023-35628 describes a critical RCE vulnerability in the MSHTML proprietary browser engine still used by Outlook, among others, to render HTML content. Of particular note: the most concerning exploitation scenario leads to exploitation as soon as Outlook retrieves and processes the specially crafted malicious email. This means that exploitation could occur before the user interacts with the email in any way; not even the Preview Pane is required in this scenario. Other attack vectors exist: the user could also click a malicious link received via email, instant message, or other medium. Assets where Internet Explorer 11 has been fully disabled are still vulnerable until patched; the MSHTML engine remains installed within Windows regardless of the status of IE11.

Internet Connection Sharing: critical RCE

This month also brings patches for a pair of critical RCE vulnerabilities in Internet Connection Sharing. CVE-2023-35630 and CVE-2023-35641 share a number of similarities: a base CVSS v3.1 score of 8.8, Microsoft critical severity ranking, low attack complexity, and presumably execution in SYSTEM context on the target machine, although the advisories do not specify execution context. The descriptions of the exploitation method do differ between the two, however. CVE-2023-35630 requires the attacker to modify an option->length field in a DHCPv6 DHCPV6_MESSAGE_INFORMATION_REQUEST input message. Exploitation of CVE-2023-35641 is also via a maliciously crafted DHCP message to an ICS server, but the advisory gives no further clues. A broadly similar ICS vulnerability in September 2023 led to RCE in a SYSTEM context on the ICS server. In all three cases, a mitigating factor is the requirement for the attack to be launched from the same network segment as the ICS server. It seems improbable that either of this month’s ICS vulnerabilities is exploitable against a target on which ICS is not running, although Microsoft does not explicitly deny the possibility.
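The advisory for CVE-2023-35630 says little beyond naming an option length field in a DHCPv6 Information-request message, so the following is only a sketch of the general class of check involved, not a description of the ICS code or of the actual flaw. It assumes the message layout from RFC 8415 (a 1-byte message type, a 3-byte transaction ID, then options that each carry a 2-byte code and a 2-byte length in network byte order); the function and exception names are hypothetical.

```python
import struct
from typing import List, Tuple

INFORMATION_REQUEST = 11  # msg-type for Information-request (RFC 8415)
OPTION_ELAPSED_TIME = 8   # option code for Elapsed Time (RFC 8415)
HEADER_LEN = 4            # 1-byte msg-type + 3-byte transaction-id
OPTION_HEADER_LEN = 4     # 2-byte option-code + 2-byte option-len


class MalformedMessage(ValueError):
    """Raised when a length field does not fit the bytes actually received."""


def parse_options(message: bytes) -> List[Tuple[int, bytes]]:
    """Return (option-code, option-data) pairs, rejecting inconsistent lengths."""
    if len(message) < HEADER_LEN:
        raise MalformedMessage("message is shorter than the fixed header")
    options = []
    offset = HEADER_LEN
    while offset < len(message):
        if len(message) - offset < OPTION_HEADER_LEN:
            raise MalformedMessage("truncated option header")
        code, length = struct.unpack_from("!HH", message, offset)
        offset += OPTION_HEADER_LEN
        # Never trust option-len: compare it with the bytes that really remain.
        remaining = len(message) - offset
        if length > remaining:
            raise MalformedMessage(
                f"option {code} claims {length} bytes, only {remaining} remain"
            )
        options.append((code, message[offset:offset + length]))
        offset += length
    return options


def build_message(msg_type: int, transaction_id: bytes,
                  options: List[Tuple[int, bytes]]) -> bytes:
    """Assemble a DHCPv6 message; a helper for this demonstration only."""
    body = b"".join(struct.pack("!HH", code, len(data)) + data
                    for code, data in options)
    return bytes([msg_type]) + transaction_id + body


# A well-formed Information-request carrying one 2-byte Elapsed Time option.
good = build_message(INFORMATION_REQUEST, b"\x12\x34\x56",
                     [(OPTION_ELAPSED_TIME, b"\x00\x00")])
# The same message with option-len (bytes 6-7) overstating the payload size.
bad = good[:6] + b"\xff\xff" + good[8:]
```

Here `parse_options(good)` yields the single option, while `parse_options(bad)` raises `MalformedMessage` instead of reading past the end of the buffer. In a memory-unsafe language the equivalent missing check is a common source of out-of-bounds reads and writes.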

Holiday season update

Notable by their absence this month: no security patches for Exchange, SharePoint, Visual Studio/.NET, or SQL Server. There are also no lifecycle transitions for Microsoft products this month, although a number of Windows Server 2019 editions and Office components will transition out of mainstream support and into extended support from January 2024.

Summary Charts

Sharing is caring, unless it’s exploitative.
A rare occurrence: Remote Code Execution not in the top spot.
Fewer vulns this month overall means less variation in the heatmap.

Summary Tables

Azure vulnerabilities

CVE | Title | Exploited? | Publicly disclosed? | CVSSv3 base score
CVE-2023-35624 | Azure Connected Machine Agent Elevation of Privilege Vulnerability | No | No | 7.3
CVE-2023-35625 | Azure Machine Learning Compute Instance for SDK Users Information Disclosure Vulnerability | No | No | 4.7

Browser vulnerabilities

CVE | Title | Exploited? | Publicly disclosed? | CVSSv3 base score
CVE-2023-35618 | Microsoft Edge (Chromium-based) Elevation of Privilege Vulnerability | No | No | 9.6
CVE-2023-36880 | Microsoft Edge (Chromium-based) Information Disclosure Vulnerability | No | No | 4.8
CVE-2023-38174 | Microsoft Edge (Chromium-based) Information Disclosure Vulnerability | No | No | 4.3
CVE-2023-6512 | Chromium: CVE-2023-6512 Inappropriate implementation in Web Browser UI | No | No | N/A
CVE-2023-6511 | Chromium: CVE-2023-6511 Inappropriate implementation in Autofill | No | No | N/A
CVE-2023-6510 | Chromium: CVE-2023-6510 Use after free in Media Capture | No | No | N/A
CVE-2023-6509 | Chromium: CVE-2023-6509 Use after free in Side Panel Search | No | No | N/A
CVE-2023-6508 | Chromium: CVE-2023-6508 Use after free in Media Stream | No | No | N/A

ESU Windows vulnerabilities

CVE | Title | Exploited? | Publicly disclosed? | CVSSv3 base score
CVE-2023-36006 | Microsoft WDAC OLE DB provider for SQL Server Remote Code Execution Vulnerability | No | No | 8.8
CVE-2023-35639 | Microsoft ODBC Driver Remote Code Execution Vulnerability | No | No | 8.8
CVE-2023-35641 | Internet Connection Sharing (ICS) Remote Code Execution Vulnerability | No | No | 8.8
CVE-2023-35630 | Internet Connection Sharing (ICS) Remote Code Execution Vulnerability | No | No | 8.8
CVE-2023-35628 | Windows MSHTML Platform Remote Code Execution Vulnerability | No | No | 8.1
CVE-2023-21740 | Windows Media Remote Code Execution Vulnerability | No | No | 7.8
CVE-2023-35633 | Windows Kernel Elevation of Privilege Vulnerability | No | No | 7.8
CVE-2023-35632 | Windows Ancillary Function Driver for WinSock Elevation of Privilege Vulnerability | No | No | 7.8
CVE-2023-36011 | Win32k Elevation of Privilege Vulnerability | No | No | 7.8
CVE-2023-36005 | Windows Telephony Server Elevation of Privilege Vulnerability | No | No | 7.5
CVE-2023-36004 | Windows DPAPI (Data Protection Application Programming Interface) Spoofing Vulnerability | No | No | 7.5
CVE-2023-35622 | Windows DNS Spoofing Vulnerability | No | No | 7.5
CVE-2023-35643 | DHCP Server Service Information Disclosure Vulnerability | No | No | 7.5
CVE-2023-35638 | DHCP Server Service Denial of Service Vulnerability | No | No | 7.5
CVE-2023-35629 | Microsoft USBHUB 3.0 Device Driver Remote Code Execution Vulnerability | No | No | 6.8
CVE-2023-35642 | Internet Connection Sharing (ICS) Denial of Service Vulnerability | No | No | 6.5
CVE-2023-36012 | DHCP Server Service Information Disclosure Vulnerability | No | No | 5.3
CVE-2023-20588 | AMD: CVE-2023-20588 AMD Speculative Leaks Security Notice | No | Yes | N/A

Microsoft Dynamics vulnerabilities

CVE | Title | Exploited? | Publicly disclosed? | CVSSv3 base score
CVE-2023-36020 | Microsoft Dynamics 365 (on-premises) Cross-site Scripting Vulnerability | No | No | 7.6
CVE-2023-35621 | Microsoft Dynamics 365 Finance and Operations Denial of Service Vulnerability | No | No | 7.5

Microsoft Dynamics Azure vulnerabilities

CVE | Title | Exploited? | Publicly disclosed? | CVSSv3 base score
CVE-2023-36019 | Microsoft Power Platform Connector Spoofing Vulnerability | No | No | 9.6

Microsoft Office vulnerabilities

CVE | Title | Exploited? | Publicly disclosed? | CVSSv3 base score
CVE-2023-35636 | Microsoft Outlook Information Disclosure Vulnerability | No | No | 6.5
CVE-2023-36009 | Microsoft Word Information Disclosure Vulnerability | No | No | 5.5
CVE-2023-35619 | Microsoft Outlook for Mac Spoofing Vulnerability | No | No | 5.3

System Center vulnerabilities

CVE | Title | Exploited? | Publicly disclosed? | CVSSv3 base score
CVE-2023-36010 | Microsoft Defender Denial of Service Vulnerability | No | No | 7.5

Windows vulnerabilities

CVE | Title | Exploited? | Publicly disclosed? | CVSSv3 base score
CVE-2023-35634 | Windows Bluetooth Driver Remote Code Execution Vulnerability | No | No | 8
CVE-2023-35644 | Windows Sysmain Service Elevation of Privilege | No | No | 7.8
CVE-2023-36696 | Windows Cloud Files Mini Filter Driver Elevation of Privilege Vulnerability | No | No | 7.8
CVE-2023-35631 | Win32k Elevation of Privilege Vulnerability | No | No | 7.8
CVE-2023-36391 | Local Security Authority Subsystem Service Elevation of Privilege Vulnerability | No | No | 7.8
CVE-2023-36003 | XAML Diagnostics Elevation of Privilege Vulnerability | No | No | 6.7
CVE-2023-35635 | Windows Kernel Denial of Service Vulnerability | No | No | 5.5

Federate IAM-based single sign-on to Amazon Redshift role-based access control with Okta

Post Syndicated from Debu Panda original https://aws.amazon.com/blogs/big-data/federate-iam-based-single-sign-on-to-amazon-redshift-role-based-access-control-with-okta/

Amazon Redshift accelerates your time to insights with fast, easy, and secure cloud data warehousing at scale. Tens of thousands of customers rely on Amazon Redshift to analyze exabytes of data and run complex analytical queries.

You can use your preferred SQL clients to analyze your data in an Amazon Redshift data warehouse. By connecting with identity provider (IdP) or single sign-on (SSO) credentials, you can reuse existing user credentials and avoid additional user setup and configuration. Role-based access control (RBAC) simplifies the management of user privileges, database permissions, and security permissions in Amazon Redshift. You can also use Redshift database roles to define a set of elevated permissions, such as for a system monitor or database administrator.

Using AWS Identity and Access Management (IAM) with RBAC, organizations can simplify user management because they no longer need to create users and map them to database roles manually. You can define the mapped database roles as a principal tag for the IdP groups or IAM role, so that users who are members of those IdP groups are granted the corresponding Redshift database roles automatically.

Earlier in 2023, we launched support for Okta integration with Amazon Redshift Serverless using database roles. In this post, we focus on Okta as the IdP and provide step-by-step guidance to integrate a Redshift provisioned cluster with Okta using the Redshift Query Editor v2 and with SQL clients like SQL Workbench/J. You can use this mechanism with other IdPs such as Azure Active Directory or Ping with any applications or tools using Amazon’s JDBC, ODBC, or Python driver.

Recently we also announced Amazon Redshift integration with AWS IAM Identity Center, supporting trusted identity propagation, allowing you to use third-party identity providers (IdPs) such as Microsoft Entra ID (Azure AD), Okta, Ping, and OneLogin. This integration simplifies the authentication and authorization process for Amazon Redshift users using Query Editor V2 or Amazon QuickSight, making it easier for them to securely access your data warehouse. AWS IAM Identity Center offers automatic user and group provisioning from Okta to itself by utilizing the System for Cross-domain Identity Management (SCIM) 2.0 protocol. This integration allows for seamless synchronization of information between the two services, ensuring accurate and up-to-date information in AWS IAM Identity Center. Refer to the blog post Integrate Okta with Amazon Redshift Query Editor V2 using AWS IAM Identity Center for seamless Single Sign-On to learn more about setting up single sign-on (SSO) to Amazon Redshift using the IAM Identity Center (IdC) integration with Okta as the identity provider.

If you are interested in using IAM-based single sign-on with Amazon Redshift database roles, continue reading this post.

Solution overview

The following diagram illustrates the authentication flow of Okta with a Redshift provisioned cluster using federated IAM roles and automatic database role mapping.

Architecture Diagram

The workflow contains the following steps:

  1. Either the user chooses an IdP app in their browser, or the SQL client initiates a user authentication request to the IdP (Okta).
  2. Upon a successful authentication, Okta submits a request to the AWS federation endpoint with a SAML assertion containing the principal tags.
  3. The AWS federation endpoint validates the SAML assertion and invokes the AWS Security Token Service (AWS STS) API AssumeRoleWithSAML. The SAML assertion contains the IdP user and group information that is stored in the RedshiftDbUser and RedshiftDbRoles principal tags, respectively. Temporary IAM credentials are returned to the SQL client or, if using the Query Editor v2, the user’s browser is redirected to the Query Editor v2 console using the temporary IAM credentials.
  4. The temporary IAM credentials are used by the SQL client or Query Editor v2 to call the Redshift API GetClusterCredentialsWithIAM. This API uses the principal tags to determine the user and database roles that the user belongs to. An associated database user is created if the user is signing in for the first time and is granted the matching database roles automatically. A temporary password is returned to the SQL client.
  5. Using the database user and temporary password, the SQL client or Query Editor v2 connects to Amazon Redshift. Upon login, the user is authorized based on the Redshift database roles that were assigned in Step 4.
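
Steps 3 and 4 of the workflow can be sketched in Python as follows. This is a minimal illustration rather than what the JDBC driver or Query Editor v2 does internally: the function name and its arguments are hypothetical, and the clients are passed in so the sketch does not depend on any particular SDK setup. The two calls it makes, assume_role_with_saml and get_cluster_credentials_with_iam, are the Boto3 names for the AssumeRoleWithSAML and GetClusterCredentialsWithIAM APIs mentioned above.

```python
def federated_db_credentials(sts_client, redshift_client_factory, *, role_arn,
                             idp_arn, saml_assertion, cluster_id, db_name):
    """Trade a SAML assertion for a temporary Redshift database user and password.

    sts_client is assumed to behave like boto3.client("sts").
    redshift_client_factory is a hypothetical callable that receives the
    temporary IAM credentials and returns a Redshift client that uses them.
    """
    # Step 3: exchange the base64-encoded SAML assertion for temporary IAM credentials.
    sts_response = sts_client.assume_role_with_saml(
        RoleArn=role_arn,
        PrincipalArn=idp_arn,
        SAMLAssertion=saml_assertion,
    )
    iam_credentials = sts_response["Credentials"]

    # Step 4: call GetClusterCredentialsWithIAM with those temporary credentials.
    redshift_client = redshift_client_factory(iam_credentials)
    db_response = redshift_client.get_cluster_credentials_with_iam(
        ClusterIdentifier=cluster_id,
        DbName=db_name,
    )
    return db_response["DbUser"], db_response["DbPassword"]
```

With Boto3, the factory could be a small function that calls boto3.client("redshift", ...) with the AccessKeyId, SecretAccessKey, and SessionToken values from the returned credentials.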

Prerequisites

You need the following prerequisites to set up this solution:

Connect with a Redshift provisioned cluster as a federated user using Query Editor v2

To connect using Query Editor v2, complete the following steps:

  1. Follow all the steps described in the sections Set up your Okta application and Set up AWS configuration in the following post.
  2. For the Amazon Redshift access IAM policy, replace the policy with the following JSON to use the GetClusterCredentialsWithIAM API:
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": "redshift:GetClusterCredentialsWithIAM",
                "Resource": "arn:aws:redshift:us-west-2:123456789012:dbname:redshift-cluster-1/dev"
            }
        ]
    }
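
If you manage several clusters or databases, it can help to generate this policy rather than edit the ARN by hand. The following is a small sketch under that assumption; the helper name redshift_iam_policy is hypothetical, and the Region, account ID, cluster, and database values are the sample values from the policy above.

```python
import json


def redshift_iam_policy(region, account_id, cluster_id, db_name):
    """Build the policy document shown above for one cluster and database."""
    resource = f"arn:aws:redshift:{region}:{account_id}:dbname:{cluster_id}/{db_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": "redshift:GetClusterCredentialsWithIAM",
                "Resource": resource,
            }
        ],
    }


# Sample values taken from the policy in this post; replace them with your own.
policy = redshift_iam_policy("us-west-2", "123456789012", "redshift-cluster-1", "dev")
print(json.dumps(policy, indent=2))
```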

Now you’re ready to connect to your Redshift provisioned cluster using Query Editor v2 and federated login.

  1. Use the SSO URL from Okta and log in to your Okta account with your user credentials. For this demo, we log in with user Ethan.
  2. In Query Editor v2, choose your Redshift provisioned cluster (right-click) and choose Create connection.
  3. For Authentication, select Temporary credentials using your IAM identity.
  4. For Database, enter the database name you want to connect to.
  5. Choose Create connection.
  6. Run the following command to validate that you are logged in as a federated user and also to get the list of roles associated with that user for the current session:
SELECT current_user,* FROM pg_get_session_roles() eff_ro(name name, roleid integer);

Because Ethan is part of the sales group and has been granted permissions to access tables in the sales_schema, he should be able to access those tables without any issues. However, if he tries to access tables in the finance_schema, he would receive a permission denied error because Ethan is not part of the finance group in Okta.

Okta-QEV2-Federation

Connect with a Redshift provisioned cluster as a federated user via a third-party client

To connect as a federated user via a third-party client, complete the following steps:

  1. Follow steps 1 and 2 in the preceding section (Connect with a Redshift provisioned cluster as a federated user using Query Editor v2).
  2. Use the Redshift JDBC driver v2.1.0.18 or later because it supports authentication with IAM group federation. For the URL, enter jdbc:redshift:iam://<cluster endpoint>:<port>/<databasename>?groupfederation=true. For example, jdbc:redshift:iam://redshift-cluster-1.abdef0abc0ab.us-west-2.redshift.amazonaws.com:5439/dev?groupfederation=true

In the preceding URL, groupfederation is a mandatory parameter that allows you to authenticate with the IAM credentials for the Redshift provisioned cluster. Without the groupfederation parameter, the driver will not use Redshift database roles.

  3. For Username and Password, enter your Okta credentials.

SQL Workbench/J - Connection

  4. To set up extended properties, follow Steps 4–9 in the section Configure the SQL client (SQL Workbench/J) in the following post.
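
The JDBC URL from step 2 follows a fixed pattern, so it can be assembled programmatically when you configure many clients. The helper below is an illustrative sketch; its name and parameters are hypothetical, and the default port 5439 is simply the port used in the example URL above.

```python
from urllib.parse import urlencode


def redshift_iam_jdbc_url(endpoint, database, port=5439, group_federation=True):
    """Build a jdbc:redshift:iam:// URL with the groupfederation parameter set."""
    # The driver expects a lowercase true/false value in the query string.
    query = urlencode({"groupfederation": str(group_federation).lower()})
    return f"jdbc:redshift:iam://{endpoint}:{port}/{database}?{query}"


url = redshift_iam_jdbc_url(
    "redshift-cluster-1.abdef0abc0ab.us-west-2.redshift.amazonaws.com", "dev"
)
print(url)
```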

User Ethan will be able to access the sales_schema tables. If Ethan tries to access the tables in the finance_schema, he will get a permission denied error.

SQL Workbench/J Demo

Troubleshooting

If your connection didn’t work, consider the following:

  • Enable logging in the driver. For instructions, see Configure logging.
  • Make sure to use the latest Amazon Redshift JDBC driver version.
  • If you’re getting errors while setting up the application on Okta, make sure you have admin access.
  • If you can authenticate via the SQL client but get a permission issue or can’t see objects, grant the relevant permission to the role.

Clean up

When you’re done testing the solution, clean up the resources to avoid incurring future charges:

  1. Delete the Redshift provisioned cluster.
  2. Delete the IAM roles, IAM IdPs, and IAM policies.

Conclusion

In this post, we provided step-by-step instructions to integrate a Redshift provisioned cluster with Okta using the Redshift Query Editor v2 and SQL Workbench/J with the help of federated IAM roles and automatic database role mapping. You can use a similar setup with other SQL clients (such as DBeaver or DataGrip). We also showed how Okta group membership is mapped automatically with Redshift provisioned cluster roles to use role-based authentication seamlessly.

If you have any feedback or questions, please leave them in the comments.


About the Authors

Debu Panda is a Senior Manager, Product Management at AWS. He is an industry leader in analytics, application platform, and database technologies, and has more than 25 years of experience in the IT world.

Ranjan Burman is an Analytics Specialist Solutions Architect at AWS. He specializes in Amazon Redshift and helps customers build scalable analytical solutions. He has more than 16 years of experience in different database and data warehousing technologies. He is passionate about automating and solving customer problems with cloud solutions.

Maneesh Sharma is a Senior Database Engineer at AWS with more than a decade of experience designing and implementing large-scale data warehouse and analytics solutions. He collaborates with various Amazon Redshift Partners and customers to drive better integration.

Orchestrate Amazon EMR Serverless Spark jobs with Amazon MWAA, and data validation using Amazon Athena

Post Syndicated from Gaurav Parekh original https://aws.amazon.com/blogs/big-data/orchestrate-amazon-emr-serverless-spark-jobs-with-amazon-mwaa-and-data-validation-using-amazon-athena/

As data engineering becomes increasingly complex, organizations are looking for new ways to streamline their data processing workflows. Many data engineers today use Apache Airflow to build, schedule, and monitor their data pipelines.

However, as the volume of data grows, managing and scaling these pipelines can become a daunting task. Amazon Managed Workflows for Apache Airflow (Amazon MWAA) can help simplify the process of building, running, and managing data pipelines. By providing Apache Airflow as a fully managed platform, Amazon MWAA allows data engineers to focus on building data workflows instead of worrying about infrastructure.

Today, businesses and organizations require cost-effective and efficient ways to process large amounts of data. Amazon EMR Serverless is a cost-effective and scalable solution for big data processing that can handle large volumes of data. The Amazon Provider in Apache Airflow comes with EMR Serverless operators and is already included in Amazon MWAA, making it easy for data engineers to build scalable and reliable data processing pipelines. You can use EMR Serverless to run Spark jobs on the data, and use Amazon MWAA to manage the workflows and dependencies between these jobs. This integration can also help reduce costs by automatically scaling the resources needed to process data.

Amazon Athena is a serverless, interactive analytics service built on open-source frameworks, supporting open-table and file formats. You can use standard SQL to interact with your data without the need to manage complex infrastructure.

In this post, we use Amazon MWAA, EMR Serverless, and Athena to build a complete end-to-end data processing pipeline.

Solution overview

The following diagram illustrates the solution architecture.

The workflow includes the following steps:

  1. Create an Amazon MWAA workflow that retrieves data from your input Amazon Simple Storage Service (Amazon S3) bucket.
  2. Use EMR Serverless to process the data stored in Amazon S3. EMR Serverless automatically scales up or down based on the workload, so you don’t need to worry about provisioning or managing any infrastructure.
  3. Use EMR Serverless to transform the data using PySpark code and then store the transformed data back in your S3 bucket.
  4. Use Athena to create an external table based on the S3 dataset and run queries to analyze the transformed data. Athena uses the AWS Glue Data Catalog to store the table metadata.

Prerequisites

You should have the following prerequisites:

Data preparation

To illustrate using EMR Serverless jobs with Apache Spark via Amazon MWAA and data validation using Athena, we use the publicly available NYC taxi dataset. Download the following datasets to your local machine:

  • Green taxi and Yellow taxi trip records – Trip records for yellow and green taxis, which include information such as pick-up and drop-off dates and times, locations, trip distances, and payment types. In our example, we use the latest Parquet files for 2022.
  • Dataset for Taxi zone lookup – A dataset that provides location IDs and corresponding zone details for taxis.

In later steps, we upload these datasets to Amazon S3.

Create solution resources

This section outlines the steps for setting up data processing and transformation.

Create an EMR Serverless application

You can create one or more EMR Serverless applications that use open source analytics frameworks like Apache Spark or Apache Hive. Unlike Amazon EMR on EC2, you do not need to delete or terminate EMR Serverless applications. An EMR Serverless application is only a definition and, once created, can be reused for as long as needed. This makes the Amazon MWAA pipeline simpler because you only have to submit jobs to a pre-created EMR Serverless application.

By default, an EMR Serverless application auto-starts on job submission and auto-stops after being idle for 15 minutes to ensure cost efficiency. You can modify the amount of idle time or choose to turn the feature off.

To create an application using the EMR Serverless console, follow the instructions in “Create an EMR Serverless application”. Note the application ID, because we use it in the following steps.
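
As an alternative to the console, the application can also be created through the API. The following sketch assumes a client that behaves like boto3.client("emr-serverless"); the function name is hypothetical, and the release label is an input you choose rather than a value prescribed by this post. It spells out the auto-start and auto-stop defaults described above so they are easy to adjust.

```python
def create_spark_application(emr_client, name, release_label, idle_timeout_minutes=15):
    """Create a Spark application and return its application ID."""
    response = emr_client.create_application(
        name=name,
        releaseLabel=release_label,
        type="SPARK",
        # Start automatically when a job is submitted.
        autoStartConfiguration={"enabled": True},
        # Stop automatically after the application has been idle this long.
        autoStopConfiguration={
            "enabled": True,
            "idleTimeoutMinutes": idle_timeout_minutes,
        },
    )
    return response["applicationId"]
```

Passing a different idle_timeout_minutes value changes the idle window; setting "enabled" to False in autoStopConfiguration would turn the feature off.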

Create an S3 bucket and folders

Complete the following steps to set up your S3 bucket and folders:

  1. On the Amazon S3 console, create an S3 bucket to store the dataset.
  2. Note the name of the S3 bucket to use in later steps.
  3. Create an input_data folder for storing input data.
  4. Within that folder, create three separate folders, one for each dataset: green, yellow, and zone_lookup.

You can download and work with the latest datasets available. For our testing, we use the following files:

  • The green/ folder has the file green_tripdata_2022-06.parquet
  • The yellow/ folder has the file yellow_tripdata_2022-06.parquet
  • The zone_lookup/ folder has the file taxi_zone_lookup.csv

Set up the Amazon MWAA DAG scripts

Complete the following steps to set up your DAG scripts:

  1. Download the following scripts to your local machine:
    1. requirements.txt – A Python dependency is any package or distribution that is not included in the Apache Airflow base install for your Apache Airflow version on your Amazon MWAA environment. For this post, we use Boto3 version >=1.23.9.
    2. blog_dag_mwaa_emrs_ny_taxi.py – This script is a part of the Amazon MWAA DAG and consists of the following tasks: yellow_taxi_zone_lookup, green_taxi_zone_lookup, and ny_taxi_summary. These tasks run Spark jobs to look up taxi zones and generate a data summary.
    3. green_zone.py – This PySpark script reads data files for green taxi rides and zone lookup, performs a join operation to combine them, and generates an output file containing green taxi rides with zone information. It utilizes temporary views for the df_green and df_zone data frames, performs column-based joins, and aggregates data like passenger count, trip distance, and fare amount. Lastly, it creates the output_data folder in the specified S3 bucket to write the resulting data frame, df_green_zone, as Parquet files.
    4. yellow_zone.py – This PySpark script processes yellow taxi ride and zone lookup data files by joining them to generate an output file containing yellow taxi rides with zone information. The script accepts a user-provided S3 bucket name and initiates a Spark session with the application name yellow_zone. It reads the yellow taxi files and zone lookup file from the specified S3 bucket, creates temporary views, performs a join based on location ID, and calculates statistics such as passenger count, trip distance, and fare amount. Lastly, it creates the output_data folder in the specified S3 bucket to write the resulting data frame, df_yellow_zone, as Parquet files.
    5. ny_taxi_summary.py – This PySpark script processes the green_zone and yellow_zone files to aggregate statistics on taxi rides, grouping data by service zones and location IDs. It requires an S3 bucket name as a command line argument, creates a SparkSession named ny_taxi_summary, reads the files from S3, performs a join, and generates a new data frame named ny_taxi_summary. It creates an output_data folder in the specified S3 bucket to write the resulting data frame to new Parquet files.
  2. On your local machine, update the blog_dag_mwaa_emrs_ny_taxi.py script with the following information:
    • Update your S3 bucket name in the following two lines:
      S3_LOGS_BUCKET = "<<bucket_name_here>>"
      S3_BASE_BUCKET = "<<bucket_name_here>>"

    • Update your role ARN:
      JOB_ROLE_ARN = "<<emr_serverless_execution_role ARN here>>"
      # e.g. arn:aws:iam::<<ACCOUNT_ID>>:role/<<ROLE_NAME>>

    • Update the EMR Serverless application ID. Use the application ID created earlier.
      EMR_SERVERLESS_APPLICATION_ID = "<<emr serverless application ID here>>"

  3. Upload the requirements.txt file to the S3 bucket created earlier.
  4. In the S3 bucket, create a folder named dags and upload the updated blog_dag_mwaa_emrs_ny_taxi.py file from your local machine.
  5. On the Amazon S3 console, create a new folder named scripts inside the S3 bucket and upload the scripts to this folder from your local machine.
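
To make the relationship between these settings and the Spark jobs concrete, the sketch below builds the kind of request that the EMR Serverless StartJobRun API accepts for each script. It is not the content of the downloadable DAG: the helper name is hypothetical, and the logs/ prefix is an assumed location. The scripts/ folder and the bucket name argument follow the descriptions above.

```python
def spark_job_request(application_id, job_role_arn, base_bucket, logs_bucket, script_name):
    """Build StartJobRun arguments for one PySpark script stored under scripts/."""
    return {
        "applicationId": application_id,
        "executionRoleArn": job_role_arn,
        "jobDriver": {
            "sparkSubmit": {
                "entryPoint": f"s3://{base_bucket}/scripts/{script_name}",
                # The scripts take the S3 bucket name as a command line argument.
                "entryPointArguments": [base_bucket],
            }
        },
        "configurationOverrides": {
            "monitoringConfiguration": {
                # Assumed log prefix; any prefix in the logs bucket works.
                "s3MonitoringConfiguration": {"logUri": f"s3://{logs_bucket}/logs/"}
            }
        },
    }


# Placeholder values for illustration only.
requests = [
    spark_job_request("my-application-id", "my-job-role-arn", "my-bucket", "my-bucket", name)
    for name in ("yellow_zone.py", "green_zone.py", "ny_taxi_summary.py")
]
```

Each dictionary could be passed as keyword arguments to start_job_run on a Boto3 emr-serverless client; in Airflow, the EMR Serverless operators in the Amazon provider take the equivalent job driver and configuration overrides.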

Create an Amazon MWAA environment

To create an Airflow environment, complete the following steps:

  1. On the Amazon MWAA console, choose Create environment.
  2. For Name, enter mwaa_emrs_athena_pipeline.
  3. For Airflow version, choose the latest version (for this post, 2.5.1).
  4. For S3 Bucket, enter the path to your S3 bucket.
  5. For DAGs folder, enter the path to your dags folder.
  6. For Requirements file, enter the path to the requirements.txt file.
  7. Choose Next.
  8. For Virtual private cloud (VPC), choose a VPC that has a minimum of two private subnets.

This will populate two of the private subnets in your VPC.

  9. Under Web server access, select Public network.

This allows the Apache Airflow UI to be accessed over the internet by users granted access to the IAM policy for your environment.

  10. For Security group(s), select Create new security group.
  11. For Environment class, select mw1.small.
  12. For Execution role, choose Create a new role.
  13. For Role name, enter a name.
  14. Leave the other configurations as default and choose Next.
  15. On the next page, choose Create environment.

It may take about 20–30 minutes to create your Amazon MWAA environment.

  16. When the Amazon MWAA environment status changes to Available, navigate to the IAM console and update the Amazon MWAA execution role to add pass role (iam:PassRole) privileges for emr_serverless_execution_role.
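
The pass role privilege mentioned in the last step is an IAM policy statement on the Amazon MWAA execution role. A minimal sketch of such a statement follows; the helper name is hypothetical, and the condition that limits the role to being passed to EMR Serverless is an optional tightening that this post does not prescribe.

```python
import json


def pass_role_statement(account_id, role_name="emr_serverless_execution_role"):
    """Build a statement that lets the caller pass the job role to EMR Serverless."""
    return {
        "Effect": "Allow",
        "Action": "iam:PassRole",
        "Resource": f"arn:aws:iam::{account_id}:role/{role_name}",
        # Optional: only allow the role to be passed to the EMR Serverless service.
        "Condition": {
            "StringLike": {"iam:PassedToService": "emr-serverless.amazonaws.com"}
        },
    }


# 123456789012 is a placeholder account ID.
policy = {"Version": "2012-10-17", "Statement": [pass_role_statement("123456789012")]}
print(json.dumps(policy, indent=2))
```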

Trigger the Amazon MWAA DAG

To trigger the DAG, complete the following steps:

  1. On the Amazon MWAA console, choose Environments in the navigation pane.
  2. Open your environment and choose Open Airflow UI.
  3. Select blog_dag_mwaa_emrs_ny_taxi, choose the play icon, and choose Trigger DAG.
  4. When the DAG is running, choose the DAG blog_dag_mwaa_emrs_ny_taxi and choose Graph to locate your DAG run workflow.

The DAG will take approximately 4–6 minutes to run all the scripts. You will see all the completed tasks, and the overall status of the DAG will show as success.

To rerun the DAG, remove s3://<<your_s3_bucket here >>/output_data/.
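
Removing the output folder before a rerun can be scripted. The sketch below assumes a client that behaves like boto3.client("s3"); the function name is hypothetical. It lists every object under the output_data/ prefix and deletes the objects page by page.

```python
def delete_prefix(s3_client, bucket, prefix="output_data/"):
    """Delete every object under the prefix and return how many were removed."""
    deleted = 0
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages without a Contents key contain no objects.
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if keys:
            s3_client.delete_objects(Bucket=bucket, Delete={"Objects": keys})
            deleted += len(keys)
    return deleted
```

Double-check the bucket and prefix before running this against real data, because the deletion is not reversible unless versioning is enabled on the bucket.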

Optionally, to understand how Amazon MWAA runs these tasks, choose the task you want to inspect.

Choose Run to view the task run details.

The following screenshot shows an example of the task logs.

If you would like to dive deeper into the execution logs, navigate to “Applications” on the EMR Serverless console. The Apache Spark driver logs indicate the initiation of your job along with details of the executors, stages, and tasks that were created by EMR Serverless. These logs can help you monitor your job progress and troubleshoot failures.

By default, EMR Serverless will store application logs securely in Amazon EMR managed storage for a period of 30 days. However, you can also specify Amazon S3 or Amazon CloudWatch as your log delivery options during job submission.

Validate the final result set with Athena

Let’s validate the data loaded by the process using Athena SQL queries.

  1. On the Athena console, choose Query editor in the navigation pane.
  2. If you’re using Athena for the first time, under Settings, choose Manage and enter the S3 bucket location that you created earlier (<S3_BUCKET_NAME>/athena), then choose Save.
  3. In the query editor, enter the following query to create an external table:
CREATE EXTERNAL TABLE default.ny_taxi_summary(
  pu_service_zone string, 
  pulocationid bigint, 
  do_service_zone string, 
  dolocationid bigint, 
  passenger_count bigint, 
  trip_distance double, 
  fare_amount double, 
  extra double, 
  mta_tax double, 
  tip_amount double, 
  tolls_amount double, 
  improvement_surcharge double, 
  total_amount double, 
  congestion_surcharge double, 
  airport_fee double)
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
  's3://<<YOUR-S3-BUCKET Here>>/output_data/ny_taxi_summary/' -- *** Change bucket name to your bucket***
TBLPROPERTIES (
  'classification'='parquet', 
  'compressionType'='none');


Run the following query on the recently created ny_taxi_summary table to retrieve the first 10 rows to validate the data:

select * from default.ny_taxi_summary limit 10;
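
The same validation can be run outside the console. The following sketch assumes a client that behaves like boto3.client("athena"); the function name and the polling interval are hypothetical choices. It starts a query, waits for it to finish, and returns the result rows.

```python
import time


def run_athena_query(athena_client, sql, output_location, poll_seconds=1.0):
    """Run a query, wait for a terminal state, and return the result rows."""
    query_id = athena_client.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]

    # Poll until Athena reports a terminal state for the query.
    while True:
        execution = athena_client.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        if status["State"] in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if status["State"] != "SUCCEEDED":
        reason = status.get("StateChangeReason", "no reason given")
        raise RuntimeError(f"Query {query_id} ended as {status['State']}: {reason}")

    results = athena_client.get_query_results(QueryExecutionId=query_id)
    return results["ResultSet"]["Rows"]
```

For a SELECT statement, the first returned row holds the column headers, so the ten-row query above would yield eleven rows.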

Clean up

To prevent future charges, complete the following steps:

  1. On the Amazon S3 console, delete the S3 bucket you created to store the Amazon MWAA DAG, scripts, and logs.
  2. On the Athena console, drop the table you created:
    drop table default.ny_taxi_summary;

  3. On the Amazon MWAA console, navigate to the environment that you created and choose Delete.
  4. On the EMR Studio console, delete the application.

To delete the application, navigate to the List applications page. Select the application that you created and choose Actions → Stop to stop the application. After the application is in the STOPPED state, select the same application and choose Actions → Delete.
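
The stop-then-delete sequence can also be scripted. This sketch assumes a client that behaves like boto3.client("emr-serverless"); the function name, polling interval, and timeout are hypothetical choices. It skips the stop call when the application is already in a state from which it can be deleted.

```python
import time

# States from which an application can be deleted without stopping it first.
DELETABLE_STATES = {"CREATED", "STOPPED"}


def stop_and_delete_application(emr_client, application_id,
                                poll_seconds=5.0, timeout_seconds=600.0):
    """Stop the application if needed, wait until it is stopped, then delete it."""

    def current_state():
        response = emr_client.get_application(applicationId=application_id)
        return response["application"]["state"]

    if current_state() not in DELETABLE_STATES:
        emr_client.stop_application(applicationId=application_id)
        deadline = time.monotonic() + timeout_seconds
        while current_state() not in DELETABLE_STATES:
            if time.monotonic() > deadline:
                raise TimeoutError(f"{application_id} did not stop in time")
            time.sleep(poll_seconds)

    emr_client.delete_application(applicationId=application_id)
```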

Conclusion

Data engineering is a critical component of many organizations, and as data volumes continue to grow, it’s essential to find ways to streamline data processing workflows. The combination of Amazon MWAA, EMR Serverless, and Athena provides a powerful solution to build, run, and manage data pipelines efficiently. With this end-to-end data processing pipeline, data engineers can easily process and analyze large amounts of data quickly and cost-effectively without the need to manage complex infrastructure. The integration of these AWS services provides a robust and scalable solution for data processing, helping organizations make informed decisions based on their data insights.

Now that you’ve seen how to submit Spark jobs on EMR Serverless via Amazon MWAA, we encourage you to use Amazon MWAA to create a workflow that will run PySpark jobs via EMR Serverless.

We welcome your feedback and inquiries. Please feel free to reach out to us if you have any questions or comments.


About the authors

Rahul Sonawane is a Principal Analytics Solutions Architect at AWS with AI/ML and Analytics as his area of specialty.

Gaurav Parekh is a Solutions Architect helping AWS customers build large scale modern architecture. He specializes in data analytics and networking. Outside of work, Gaurav enjoys playing cricket, soccer and volleyball.


Audit History

December 2023: This post was reviewed for technical accuracy by Santosh Gantaram, Sr. Technical Account Manager.

New for AWS Amplify – Query MySQL and PostgreSQL database for AWS CDK

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-for-aws-amplify-query-mysql-and-postgresql-database-for-aws-cdk/

Today we are announcing the general availability of support for connecting to and querying your existing MySQL and PostgreSQL databases with the AWS Cloud Development Kit (AWS CDK), a new feature for creating a real-time, secure GraphQL API for your relational database within or outside Amazon Web Services (AWS). You can now generate the entire API for all relational database operations with just your database endpoint and credentials. When your database schema changes, you can run a command to apply the latest table schema changes.

In 2021, we announced AWS Amplify GraphQL Transformer version 2, enabling developers to develop more feature-rich, flexible, and extensible GraphQL-based app backends even with minimal cloud expertise. This new GraphQL Transformer was redesigned from the ground up to generate extensible pipeline resolvers to route a GraphQL API request, apply business logic, such as authorization, and communicate with the underlying data source, such as Amazon DynamoDB.

However, customers wanted to use relational database sources for their GraphQL APIs such as their Amazon RDS or Amazon Aurora databases in addition to Amazon DynamoDB. You can now use @model types of Amplify GraphQL APIs for both relational database and DynamoDB data sources. Relational database information is generated to a separate schema.sql.graphql file. You can continue to use the regular schema.graphql files to create and manage DynamoDB-backed types.

When you provide your MySQL or PostgreSQL database information, whether the database is behind a virtual private cloud (VPC) or publicly accessible on the internet, AWS Amplify automatically generates a modifiable GraphQL API that securely connects to your database tables and exposes create, read, update, or delete (CRUD) queries and mutations. You can also rename your data models to be more idiomatic for the frontend. For example, a database table called “todos” (plural, lowercase) can be exposed as “ToDo” (singular, PascalCase) to the client.

With one line of code, you can add any of the existing Amplify GraphQL authorization rules to your API, making it seamless to build use cases such as owner-based authorization or public read-only patterns. Because the generated API is built on AWS AppSync’s GraphQL capabilities, secure real-time subscriptions are available out of the box. You can subscribe to any CRUD events from any data model with a few lines of code.

Getting started with your MySQL database in AWS CDK
The AWS CDK lets you build reliable, scalable, cost-effective applications in the cloud with the considerable expressive power of a programming language. To get started, install the AWS CDK on your local machine.

$ npm install -g aws-cdk

Run the following command to verify the installation is correct and print the version number of the AWS CDK.

$ cdk --version

Next, create a new directory for your app:

$ mkdir amplify-api-cdk
$ cd amplify-api-cdk

Initialize a CDK app by using the cdk init command.

$ cdk init app --language typescript

Install Amplify’s GraphQL API construct in the new CDK project:

$ npm install @aws-amplify/graphql-api-construct

Open the main stack file in your CDK project (usually located in lib/<your-project-name>-stack.ts). Import the necessary constructs at the top of the file:

import * as path from 'path'; // used by path.join(__dirname, ...) when defining the API
import {
    AmplifyGraphqlApi,
    AmplifyGraphqlDefinition
} from '@aws-amplify/graphql-api-construct';

Generate a GraphQL schema for a new relational database API by executing the following SQL statement on your MySQL database. Make sure to output the results to a .csv file, including column headers, and replace <database-name> with the name of your database, schema, or both.

SELECT
  INFORMATION_SCHEMA.COLUMNS.TABLE_NAME,
  INFORMATION_SCHEMA.COLUMNS.COLUMN_NAME,
  INFORMATION_SCHEMA.COLUMNS.COLUMN_DEFAULT,
  INFORMATION_SCHEMA.COLUMNS.ORDINAL_POSITION,
  INFORMATION_SCHEMA.COLUMNS.DATA_TYPE,
  INFORMATION_SCHEMA.COLUMNS.COLUMN_TYPE,
  INFORMATION_SCHEMA.COLUMNS.IS_NULLABLE,
  INFORMATION_SCHEMA.COLUMNS.CHARACTER_MAXIMUM_LENGTH,
  INFORMATION_SCHEMA.STATISTICS.INDEX_NAME,
  INFORMATION_SCHEMA.STATISTICS.NON_UNIQUE,
  INFORMATION_SCHEMA.STATISTICS.SEQ_IN_INDEX,
  INFORMATION_SCHEMA.STATISTICS.NULLABLE
      FROM INFORMATION_SCHEMA.COLUMNS
      LEFT JOIN INFORMATION_SCHEMA.STATISTICS ON INFORMATION_SCHEMA.COLUMNS.TABLE_NAME=INFORMATION_SCHEMA.STATISTICS.TABLE_NAME AND INFORMATION_SCHEMA.COLUMNS.COLUMN_NAME=INFORMATION_SCHEMA.STATISTICS.COLUMN_NAME
      WHERE INFORMATION_SCHEMA.COLUMNS.TABLE_SCHEMA = '<database-name>';

Run the following command, replacing <path-to-schema.csv> with the path to the .csv file created in the previous step.

$ npx @aws-amplify/cli api generate-schema \
    --sql-schema <path-to-schema.csv> \
    --engine-type mysql \
    --out lib/schema.sql.graphql

You can open the schema.sql.graphql file to see the data model imported from your MySQL database schema.

input AMPLIFY {
     engine: String = "mysql"
     globalAuthRule: AuthRule = {allow: public}
}

type Meals @model {
     id: Int! @primaryKey
     name: String!
}

type Restaurants @model {
     restaurant_id: Int! @primaryKey
     address: String!
     city: String!
     name: String!
     phone_number: String!
     postal_code: String!
     ...
}

If you haven’t already done so, go to the Parameter Store in the AWS Systems Manager console and create a parameter for the connection details of your database, such as hostname/url, database name, port, username, and password. These will be required in the next step for Amplify to successfully connect to your database and perform GraphQL queries or mutations against it.

In the main stack class, add the following code to define a new GraphQL API. Replace the dbConnectionConfig options with the parameter paths created in the previous step.

new AmplifyGraphqlApi(this, "MyAmplifyGraphQLApi", {
  apiName: "MySQLApi",
  definition: AmplifyGraphqlDefinition.fromFilesAndStrategy(
    [path.join(__dirname, "schema.sql.graphql")],
    {
      name: "MyAmplifyGraphQLSchema",
      dbType: "MYSQL",
      dbConnectionConfig: {
        hostnameSsmPath: "/amplify-cdk-app/hostname",
        portSsmPath: "/amplify-cdk-app/port",
        databaseNameSsmPath: "/amplify-cdk-app/database",
        usernameSsmPath: "/amplify-cdk-app/username",
        passwordSsmPath: "/amplify-cdk-app/password",
      },
    }
  ),
  authorizationModes: { apiKeyConfig: { expires: cdk.Duration.days(7) } },
  translationBehavior: { sandboxModeEnabled: true },
});

This configuration assumes that your database is accessible from the internet. Also, the default authorization mode is set to API key for AWS AppSync, and sandbox mode is enabled to allow public access on all models. This is useful for testing your API before adding more fine-grained authorization rules.
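Because the five Parameter Store paths in the example share one prefix, you could derive them from a single value instead of typing each one. The following is a minimal sketch under that assumption. The ssmPathsFor helper and the DbConnectionConfig interface are hypothetical names used for illustration and are not part of the Amplify construct; only the five property names come from the example above.

```typescript
// Shape of the dbConnectionConfig option as used in the stack above.
// This interface is declared locally for illustration only.
interface DbConnectionConfig {
  hostnameSsmPath: string;
  portSsmPath: string;
  databaseNameSsmPath: string;
  usernameSsmPath: string;
  passwordSsmPath: string;
}

// Hypothetical helper: builds the five parameter paths from one prefix.
// It assumes your parameters are named hostname, port, database,
// username, and password under that prefix, as in the example above.
function ssmPathsFor(prefix: string): DbConnectionConfig {
  const base = prefix.replace(/\/+$/, ""); // drop any trailing slashes
  return {
    hostnameSsmPath: `${base}/hostname`,
    portSsmPath: `${base}/port`,
    databaseNameSsmPath: `${base}/database`,
    usernameSsmPath: `${base}/username`,
    passwordSsmPath: `${base}/password`,
  };
}

// Produces the same paths as the hand-written configuration above.
const dbConnectionConfig = ssmPathsFor("/amplify-cdk-app");
```

You could then pass dbConnectionConfig to the strategy object in place of the inline literal, which keeps the prefix in one place if you deploy the same stack against several databases.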

Finally, deploy your GraphQL API to the AWS Cloud.

$ cdk deploy

You can now go to the AWS AppSync console and find your created GraphQL API.

Choose your project and then the Queries menu. You can see the newly created GraphQL operations that correspond to the tables in your MySQL database, such as getMeals to get one item or listRestaurants to list all items.

For example, when you select items with the address, city, name, and phone_number fields, the console builds a new GraphQL query. Choose the Run button to see the query results from your MySQL database.

When you query your MySQL database, you can see the same results.
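The same query can be sent from client code. The following is a minimal sketch of the HTTP request, assuming the API key authorization mode configured earlier. The endpoint and key are placeholders that you would copy from the AWS AppSync console, and buildGraphqlRequest is a hypothetical helper name, not part of any AWS library.

```typescript
// Placeholders: replace with the endpoint and API key shown in the
// AWS AppSync console for your deployed API.
const APPSYNC_ENDPOINT =
  "https://<your-api-id>.appsync-api.<region>.amazonaws.com/graphql";
const APPSYNC_API_KEY = "<your-api-key>";

// Query text selecting the fields mentioned above from listRestaurants.
const LIST_RESTAURANTS = `
  query ListRestaurants {
    listRestaurants {
      items {
        address
        city
        name
        phone_number
      }
    }
  }
`;

// Minimal request shape, declared locally so the sketch has no dependencies.
interface GraphqlRequest {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds a GraphQL POST request authorized with an
// API key, which AWS AppSync reads from the x-api-key header.
function buildGraphqlRequest(
  query: string,
  apiKey: string,
  variables: Record<string, unknown> = {}
): GraphqlRequest {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json", "x-api-key": apiKey },
    body: JSON.stringify({ query, variables }),
  };
}

// Usage, in a runtime that provides a global fetch:
//   const response = await fetch(
//     APPSYNC_ENDPOINT,
//     buildGraphqlRequest(LIST_RESTAURANTS, APPSYNC_API_KEY)
//   );
//   const result = await response.json();
```

An API key is convenient for testing, but it grants the same access to every caller, so this request shape is only appropriate while sandbox mode or public rules are what you intend.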

How to customize your GraphQL schema for your database
To add a custom query or mutation in your SQL, open the generated schema.sql.graphql file and use the @sql(statement: "") directive, passing in parameters using the :<variable> notation.

type Query {
     listRestaurantsInState(state: String): [Restaurants] @sql(statement: "SELECT * FROM Restaurants WHERE state = :state;")
}
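Each :<variable> in the statement has to line up with an argument on the GraphQL field. The following is a small sketch of a check you could run over your own statements before deploying. Both function names are hypothetical, and the pattern is deliberately naive: it does not skip colons that appear inside quoted SQL string literals.

```typescript
// Hypothetical helper: lists the distinct :name parameters in a statement.
// A colon only counts when it is not preceded by another colon or by an
// identifier character, so PostgreSQL-style casts such as value::text
// are not reported as parameters.
function extractSqlParameters(statement: string): string[] {
  const pattern = /(^|[^:A-Za-z0-9_]):([A-Za-z_][A-Za-z0-9_]*)/g;
  const names: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(statement)) !== null) {
    const name = match[2];
    if (names.indexOf(name) === -1) {
      names.push(name); // keep first-seen order, without duplicates
    }
  }
  return names;
}

// Hypothetical helper: returns the parameters that have no matching
// GraphQL field argument.
function missingArguments(statement: string, fieldArgs: string[]): string[] {
  return extractSqlParameters(statement).filter(
    (name) => fieldArgs.indexOf(name) === -1
  );
}

// The statement from the example above uses one parameter, state,
// which matches the field argument of the same name.
const unmatched = missingArguments(
  "SELECT * FROM Restaurants WHERE state = :state;",
  ["state"]
);
```

Here unmatched is an empty array. A non-empty result would point to a parameter that was renamed on one side only.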

For longer, more complex SQL queries, you can reference SQL statements in the customSqlStatements config option. The reference value must match the name of a property mapped to a SQL statement. In the following example, a searchPosts property on customSqlStatements is being referenced:

type Query {
      searchPosts(searchTerm: String): [Post]
      @sql(reference: "searchPosts")
}

Here is how the SQL statement is mapped in the API definition.

new AmplifyGraphqlApi(this, "MyAmplifyGraphQLApi", {
  apiName: "MySQLApi",
  definition: AmplifyGraphqlDefinition.fromFilesAndStrategy(
    [path.join(__dirname, "schema.sql.graphql")],
    {
      name: "MyAmplifyGraphQLSchema",
      dbType: "MYSQL",
      dbConnectionConfig: {
        //	...ssmPaths,
      },
      customSqlStatements: {
        // The property name matches the reference value in schema.sql.graphql.
        searchPosts:
          "SELECT * FROM posts WHERE content LIKE CONCAT('%', :searchTerm, '%');",
      },
    }
  ),
  //...
});

The SQL statement will be executed as if it were defined inline in the schema. The same rules apply in terms of using parameters, ensuring valid SQL syntax, and matching return types. Using a reference file keeps your schema clean and allows the reuse of SQL statements across fields. It is best practice for longer, more complicated SQL queries.
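Since the reference value and the customSqlStatements property name must match exactly, a quick consistency check can catch a mismatch before you deploy. The following is a minimal sketch with hypothetical function names. It scans the schema text with a regular expression rather than a real GraphQL parser, so treat it as an illustration only.

```typescript
// Hypothetical helper: collects every name used in @sql(reference: "...").
function findSqlReferences(schema: string): string[] {
  const pattern = /@sql\(\s*reference:\s*"([^"]+)"\s*\)/g;
  const names: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(schema)) !== null) {
    names.push(match[1]);
  }
  return names;
}

// Hypothetical helper: returns the references that have no matching
// property in the customSqlStatements object.
function unresolvedReferences(
  schema: string,
  customSqlStatements: Record<string, string>
): string[] {
  return findSqlReferences(schema).filter(
    (name) => !Object.prototype.hasOwnProperty.call(customSqlStatements, name)
  );
}

// Schema fragment and statement map taken from the example above.
const schemaText = `
type Query {
  searchPosts(searchTerm: String): [Post]
  @sql(reference: "searchPosts")
}
`;
const statements = {
  searchPosts:
    "SELECT * FROM posts WHERE content LIKE CONCAT('%', :searchTerm, '%');",
};

// Empty when every reference resolves to a statement.
const unresolved = unresolvedReferences(schemaText, statements);
```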

You can also change a field or model name using the @refersTo directive. If you don’t provide the @refersTo directive, AWS Amplify assumes that the model name and field name exactly match the database table and column names.

type Todo @model @refersTo(name: "todos") {
     content: String
     done: Boolean
}

When you want to create relationships between two database tables, use the @hasOne and @hasMany directives to establish a 1:1 or 1:M relationship. Use the @belongsTo directive to create a bidirectional relationship back to the relationship parent. For example, you can define a 1:M relationship between restaurants and meals.

type Meals @model {
     id: Int! @primaryKey
     name: String!
     menus: [Restaurants] @hasMany(references: ["restaurant_id"])
}

type Restaurants @model {
     restaurant_id: Int! @primaryKey
     address: String!
     city: String!
     name: String!
     phone_number: String!
     postal_code: String!
     meals: Meals @belongsTo(references: ["restaurant_id"])
     ...
}

Whenever you change your GraphQL schema or the database schema in your DB instances, rerun the SQL query and the .csv export described earlier in this guide to regenerate your schema.sql.graphql file, and then deploy your changes to the cloud:

$ cdk deploy

To learn more, see Connect API to existing MySQL or PostgreSQL database in the AWS Amplify documentation.

Now available
The relational database support for AWS Amplify now works with any MySQL or PostgreSQL database hosted anywhere, whether within an Amazon VPC or outside of the AWS Cloud.

Give it a try and send feedback to AWS re:Post for AWS Amplify, the GitHub repository of Amplify GraphQL API, or through your usual AWS Support contacts.

Channy

P.S. Special thanks to René Huangtian Brandel, a principal product manager at AWS, for contributing the sample code.

[$] Project Bluefin: A customized Fedora Silverblue desktop image

Post Syndicated from jake original https://lwn.net/Articles/954059/

So-called “immutable” Linux distributions have been in development for some time, but (unless you count Chrome OS) haven’t gained much traction. Project Bluefin is a heavily customized set of Fedora Silverblue images coming from the Universal Blue community; they are designed to deliver a reliable Linux desktop that’s as easy to use as a Chromebook but more customizable. Bluefin’s mission is to change up the desktop experience and attract a new generation of open-source contributors with a “cloud-native” take on developing and delivering the operating system.

ASRock Industrial 4×4 BOX-7840U Review AMD Powered NUC Form Factor

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/asrock-industrial-4x4-box-7840u-review-amd-powered-nuc-form-factor/

In our ASRock Industrial 4×4 BOX-7840U review, we see how this AMD Ryzen 7040 series mini PC compares to others we have tested.

The post ASRock Industrial 4×4 BOX-7840U Review AMD Powered NUC Form Factor appeared first on ServeTheHome.
