Tag Archives: Opinions

The Lack Of Native MFA For Active Directory Is A Big Sin For Microsoft

Post Syndicated from Bozho original https://techblog.bozho.net/the-lack-of-native-mfa-for-active-directory-is-a-big-sin-for-microsoft/

Active Directory is dominant in the enterprise world (as well as the public sector). From my observation, the majority of organizations rely on Active Directory for their user accounts. While that may be changing in recent years with more advanced cloud IAM and directory solutions, the landscape of the last two decades has been dominated by Microsoft’s Active Directory.

As a result of that dominance, many cyber attacks rely on exploiting some aspect of Active Directory – whether it’s weaknesses of Kerberos, “pass the ticket”, golden tickets, etc. Standard attacks like password spraying, credential stuffing and other brute forcing also apply, especially if Exchange web access is enabled. Last, but not least, simply browsing Active Directory once authenticated with a compromised account provides important information for further exploitation (finding other accounts, finding abandoned but not disabled accounts, finding passwords in description fields, etc.).

Basically, having access to an authentication endpoint that interfaces with Active Directory allows attackers to gain a foothold and then move laterally.

What is the most recommended measure for preventing authentication attacks? Multi-factor authentication. And the sad reality is that Microsoft doesn’t offer native MFA for Active Directory.

Yes, there is Windows Hello for Business, but that can’t be used in a web and email context – it is tied to the Windows machine. And yes, there are third-party options. But they incur additional cost and are complex to set up and manage. We all know the power of defaults and built-in features in security – MFA should be readily available and simple in order to see wide adoption.

What Microsoft should have done is introduce standard, TOTP-based MFA and enforce it through native second-factor screens in Windows, Exchange web access, Outlook and others. Yes, that would require Kerberos upgrades, but it is completely feasible. Ideally, it would be enabled with a single click, prompting users to enroll their smartphone apps (Google Authenticator, Microsoft Authenticator, Authy or others) on their next successful login. Of course, there may be users without smartphones, so the option to skip MFA enrollment could be available to certain less-privileged AD groups.
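
To illustrate how little is actually needed for the TOTP part, here is a minimal sketch of RFC 6238 code generation and verification in Java. The class and method names are mine, and a real deployment would additionally handle secret provisioning and QR enrollment, rate limiting, backup codes and secure storage of the shared secret:

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.time.Instant;

public class Totp {

    private static final int TIME_STEP_SECONDS = 30;

    // 6-digit TOTP code (RFC 6238) for the current 30-second window
    public static String generate(byte[] sharedSecret) throws Exception {
        return generateForStep(sharedSecret, Instant.now().getEpochSecond() / TIME_STEP_SECONDS);
    }

    // Accepts the previous, current and next window to tolerate clock drift.
    // A real deployment would also rate-limit attempts and reject reused codes.
    public static boolean verify(byte[] sharedSecret, String submittedCode) throws Exception {
        long current = Instant.now().getEpochSecond() / TIME_STEP_SECONDS;
        for (long step = current - 1; step <= current + 1; step++) {
            if (generateForStep(sharedSecret, step).equals(submittedCode)) {
                return true;
            }
        }
        return false;
    }

    private static String generateForStep(byte[] sharedSecret, long timeStep) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(sharedSecret, "HmacSHA1"));
        byte[] hmac = mac.doFinal(ByteBuffer.allocate(8).putLong(timeStep).array());

        // dynamic truncation, as specified in RFC 4226
        int offset = hmac[hmac.length - 1] & 0x0F;
        int binary = ((hmac[offset] & 0x7F) << 24)
                | ((hmac[offset + 1] & 0xFF) << 16)
                | ((hmac[offset + 2] & 0xFF) << 8)
                | (hmac[offset + 3] & 0xFF);

        return String.format("%06d", binary % 1_000_000);
    }
}
```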

By not doing that, Microsoft exposes all on-premise AD deployments to all sorts of authentication attacks mentioned above. And for me that’s a big sin.

Microsoft would say, of course, that their Azure AD supports many MFA options and is great and modern and secure and everything. And that’s true, if you choose to migrate to Azure and use Office 365. And pay for a subscription instead of just a Windows Server license. It’s no secret that Microsoft’s business model is shifting towards cloud subscription services. And there’s nothing wrong with that. But leaving on-prem users with no good option for proper MFA across services, including email, is irresponsible.

The post The Lack Of Native MFA For Active Directory Is A Big Sin For Microsoft appeared first on Bozho's tech blog.

Open APIs – Public Infrastructure in the Digital Age

Post Syndicated from Bozho original https://techblog.bozho.net/open-apis-public-infrastructure-in-the-digital-age/

When “public infrastructure” is mentioned, typically people think of roads, bridges, rails, dams, power plants, city lights. These are all enablers, publicly funded/owned/managed (not necessarily all of these), which allow the larger public to do business and to cover basic needs. Public infrastructure is sometimes free, but not always (you pay electricity bills and toll fees; and of course someone will rightly point out that nothing is free, because we pay it through taxes, but that’s not the point).

In the digital age, we can think of some additional examples of “public infrastructure”. The most obvious one, which has a physical aspect, is fiber-optic cables. Sometimes they are publicly owned (especially in rural areas), and their goal is to provide internet access, which itself is an enabler for business and day-to-day household activities. More and more countries, municipalities and even smaller communities invest in owning fiber-optic cables in order to make sure there’s equal access to the internet. But cables are still physical infrastructure.

Something entirely digital that is increasingly turning into public infrastructure is open government APIs. They are not yet fully perceived as public infrastructure, and exist as such only in the heads of a handful of policymakers and IT experts, but in essence they are exactly that – government-owned infrastructure that enables businesses and other activities.

But let me elaborate. Open APIs let the larger public access data and/or modify data that is collected and/or centralized and/or monitored by government institutions (central or local). Some examples:

  • Electronic health infrastructure – the Bulgarian government is building a centralized health record, as well as centralized e-prescriptions and e-hospitalization. It is all APIs, with private companies developing the software for hospitals, general practitioners, pharmacies and labs. Other companies may develop apps for citizens to help them improve their health or match them with nutrition and sports advice. All of that is based on open APIs (following the FHIR standard; a minimal request sketch follows this list) and allows for fair competition, while managing access to sensitive data, audit logs and, most importantly, collection in a centralized store.
  • Toll system – we have a centralized road toll system which offers APIs (unfortunately, via an overly complicated model of intermediaries) that support multiple resellers of toll passes (time-based and distance-based). This allows telecoms (through apps), banks (through e-banking), supermarkets, fleet management companies and others to offer better UI and integrated services.
  • Tax systems – businesses will be happy to report their taxes through their ERP automatically, rather than manually exporting and uploading, or manually filling data in complex forms.
  • E-delivery of documents – Bulgaria has a centralized system for electronic delivery of documents to public institutions. That system has an API, which allows third parties to integrate and send documents as part of more complex services, on behalf of citizens and organizations.
  • Car registration – car registers are centralized, but opening up their APIs would allow car (re)sellers to handle all the paperwork on behalf of their customers, online, at the click of a button in their internal system. Car parts vendors could fetch data about registered cars per brand and model in order to make sure there are enough spare parts in stock (based on the typical lifecycle of car parts).
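
To make “open API” a bit more concrete, here is a minimal sketch of what consuming a FHIR-style health API could look like from a third-party application. The base URL, resource ID and token handling are made up for illustration; a real integration would use a proper FHIR client library and whatever authorization scheme the API provider mandates:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FhirFetchExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical FHIR endpoint; FHIR's REST convention is GET [base]/[resourceType]/[id]
        String url = "https://ehealth.example.bg/fhir/Patient/12345";

        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Accept", "application/fhir+json")
                // a real call would carry an access token authorizing this specific access
                .header("Authorization", "Bearer <token>")
                .GET()
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println(response.statusCode());
        System.out.println(response.body()); // a FHIR Patient resource as JSON
    }
}
```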

Core systems and central registers with open APIs are digital public infrastructure that would allow a more seamless, integrated state. There are a lot of details to be taken into account – access management and authentication (who has the right to read or write certain data), fees (if a system is heavily used, the owning institution might charge a fee), change management and upgrades, zero downtime, integrity, format, etc.

But the policy that I have always followed and advocated for is clear – mandatory open APIs for all government systems. Bureaucracy and paperwork may become nearly invisible, hidden behind APIs, if this principle is followed.

The post Open APIs – Public Infrastructure in the Digital Age appeared first on Bozho's tech blog.

On Disinformation and Large Online Platforms

Post Syndicated from Bozho original https://techblog.bozho.net/on-disinformation-and-large-online-platforms/

This week I was invited to be a panelist, together with other digital ministers, on a side-event organized by Ukraine in Davos, during the World Economic Forum. The topic was disinformation, and I’d like to share my thoughts on it. The video recording is here, but below is not a transcript, but an expanded version.

Bulgaria is seemingly more susceptible to disinformation, for various reasons. A majority of the population has positive sentiments about Russia, for historical reasons. And disinformation campaigns have been around both before and after the war started. The typical narratives pushed every day are about the bad, decadent West; the Slavic, traditional, conservative Russian government; the evil and aggressive NATO; the great and powerful, but peaceful Russian army, and so on.

These disinformation campaigns are undermining public discourse and even public policy. COVID vaccination rates in Bulgaria are among the lowest in the world (and therefore the mortality rate is among the highest). Propaganda and conspiracy theories took hold in our society and literally killed our relatives and friends. The war is another example – Bulgaria ranks first when it comes to people thinking that the West (EU/NATO) is at fault for the war in Ukraine.

The Kremlin uses the same propaganda techniques it developed in the Cold War, but applies them on the free internet, much more efficiently. It uses the European value of free speech to undermine those same European values.

Its main channels are social networks, which seem to remain blissfully ignorant of local context such as the one described above.

What we’ve seen, and what has been leaked and discussed for a long time, is that troll factories amplify anonymous websites. They share content, like content, and make it seem noteworthy to the algorithms.

We know how it works. But governments can’t just block a website because they think it spreads false information. A government may easily go beyond good intentions and slide into censorship. In four years I won’t be a minister, and the next government may decide I’m spreading “Western propaganda” and block my profiles, my blog, my interviews in the media.

I said all of that in front of the Bulgarian parliament last week. I also said that local measures are insufficient, and risky.

That’s why we have to act smart. We need to strike down the mechanisms for weaponizing social networks – for spreading disinformation to large portions of the population – not block the information itself. Brute force is dangerous, and it helps the Kremlin’s narrative about the bad, hypocritical West that talks about free speech but has the power to shut you down if a bureaucrat says so.

The solution, in my opinion, is to regulate recommendation engines, on a European level. To make these algorithms find and demote these networks of trolls (they fail at that – Facebook claims they found 3 Russian-linked accounts in January).

How to do it? It’s hard to answer without knowing the data and the details of how these algorithms currently work. Social networks can try to cluster users by IPs, ASes, VPN exit nodes, content similarity, DNS and WHOIS data for websites, photo databases, etc. They can consult national media registers (if they exist), via APIs, to make sure something is a real media outlet and not an auto-generated website with pre-written false content (which is what actually happens).

The regulation should make the focus for social media not moderating everything, but not promoting inauthentic behavior.

Europe and its partners must find a way to regulate algorithms without curbing freedom of expression. And I was in Brussels last week to underline that. We can use the Digital Services Act to do exactly that, and we have to do it wisely.

I’ve been criticized – why am I taking on this task when I could just do cool things like eID, e-services and removing bureaucracy? I’m doing those, of course, without delay.

But we are here as government officials to tackle the systemic risks. The eID I’ll introduce will do no good if we lose the hearts and minds of people to Kremlin propaganda.

The post On Disinformation and Large Online Platforms appeared first on Bozho's tech blog.

Don’t Reinvent Date Formats

Post Syndicated from Bozho original https://techblog.bozho.net/dont-reinvent-date-formats/

Microsoft Exchange has a bug that practically stops email. (The public sector primarily uses Exchange, so many of the institutions I’m responsible for as a minister have their email “stuck”.) The bug is described here, and fortunately, has a solution.

But let me say something simple and obvious: don’t reinvent date formats, please. When in doubt, use ISO 8601 or epoch millis (in UTC), or RFC 2822. Nothing else makes sense.
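
For illustration, here is what the recommended formats look like with nothing but java.time (the class name is mine; DateTimeFormatter.RFC_1123_DATE_TIME produces the RFC 2822-compatible date form used in email headers):

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class SaneDateFormats {
    public static void main(String[] args) {
        Instant now = Instant.now();

        // ISO 8601, e.g. 2022-01-01T12:34:56Z - unambiguous, sortable, timezone-explicit
        String iso8601 = DateTimeFormatter.ISO_INSTANT.format(now);

        // Epoch milliseconds in UTC - compact and trivially parseable
        long epochMillis = now.toEpochMilli();

        // RFC 1123 / RFC 2822 style, e.g. Sat, 1 Jan 2022 12:34:56 GMT
        String rfc = DateTimeFormatter.RFC_1123_DATE_TIME
                .format(ZonedDateTime.ofInstant(now, ZoneOffset.UTC));

        System.out.println(iso8601);
        System.out.println(epochMillis);
        System.out.println(rfc);
    }
}
```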

Certainly, treating an int as a date is an abysmal idea (it doesn’t even save that many resources). 202201010000 is not a date format worth considering.

(As a side note, another piece of advice – add automated tests with future timestamps. Sometimes they catch odd behavior.)

I’ll finish with Jon Skeet’s talk on dates, strings and numbers.

The post Don’t Reinvent Date Formats appeared first on Bozho's tech blog.

I Have Been Appointed As E-Governance Minister of Bulgaria

Post Syndicated from Bozho original https://techblog.bozho.net/i-have-been-appointed-as-e-governance-minister-of-bulgaria/

Last week the Bulgarian National Assembly appointed the new government. I am one of the appointed ministers – a minister for electronic governance.

The portfolio includes digitizing registers and processes in all government institutions, reducing bureaucracy, electronic identity, cybersecurity, digital skills and more.

Thanks to all my readers for following this blog throughout the years. I will be sharing some digital policy details here from now on while I’m minister. That may include some technical articles, but they are unlikely to be developer-oriented.

I hope to make some important changes and put forward key ideas for e-governance and digital policy that can be used as an example outside my country (last time I was involved in public policy, I helped pass an “open source law”).

I’ve written a few articles about IT people looking for challenges – not just technical challenges. And I think that’s a great challenge where I’ll have to put all my knowledge and skills to work for the common good.

The post I Have Been Appointed As E-Governance Minister of Bulgaria appeared first on Bozho's tech blog.

Simple Things That Are Actually Hard: User Authentication

Post Syndicated from Bozho original https://techblog.bozho.net/simple-things-that-are-actually-hard-user-authentication/

You build a system. User authentication is the component that is always there, regardless of the functionality of the system. And by now it should be simple to implement – just “drag in” some ready-to-use authentication module, or configure it with some basic options (e.g. Spring Security), and you’re done.

Well, no. It’s the most obvious thing and yet it’s extremely complicated to get right. It’s not just login form -> check username/password -> set cookie. There are a lot of other things to think about:

  • Cookie security – how do you make sure a cookie doesn’t leak or can’t be forged? Should you even have a cookie, or use some stateless approach like JWT? Should SameSite be lax or strict?
  • Bind cookie to IP and logout user if IP changes?
  • Password requirements – minimum length, special characters? UI to help with selecting a password?
  • Storing passwords in the database – bcrypt, scrypt, PBKDF2, or SHA with multiple iterations? (a minimal PBKDF2 sketch follows this list)
  • Allow storing in the browser? Generally “yes”, but some applications deliberately hash it before sending it, so that it can’t be stored automatically
  • Email vs username – do you need a username at all? Should change of email be allowed?
  • Rate-limiting authentication attempts – how many failed logins should block the account, for how long, should admins get notifications or at least logs for locked accounts? Is the limit per IP, per account, a combination of those?
  • Captcha – do you need a captcha at all, which one, and after how many attempts? Is reCAPTCHA an option?
  • Password reset – password reset token database table or expiring links with HMAC? Rate-limit password reset?
  • SSO – should your service support LDAP/Active Directory authentication (probably yes)? Should it support SAML 2.0 or OpenID Connect, and if yes, which ones? Or all of them? Should it ONLY support SSO, rather than internal authentication?
  • 2FA – TOTP or something else? Implement the whole 2FA flow, including enabling/disabling and the use of backup codes; add an option to not ask for 2FA on a particular device for a period of time?
  • Login by link – should the option to send a one-time login link by email be supported?
  • XSS protection – make sure no XSS vulnerabilities exist especially on the login page (but not only, as XSS can steal cookies)
  • Dedicated authentication log – keep a history of all logins, with time, IP, user agent
  • Force logout – is the ability to log out a logged-in device needed, and how do you implement it? With stateless tokens it’s not trivial.
  • Keeping a mobile device logged in – what should be stored client-side? (certainly not the password)
  • Working behind proxy – if the client IP matters (it does), make sure the X-Forwarded-For header is parsed
  • Capture login timezone for user and store it in the session to adjust times in the UI?
  • TLS mutual authentication – if we need to support hardware token authentication with a private key, we should enable mutual TLS. What should be in the truststore? Does the web server support per-page mutual TLS, or should we use a subdomain?
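
As promised above, here is a minimal sketch of password storage with PBKDF2 using only the JDK. The parameter values and helper names are mine, not a recommendation of exact values – the iteration count should be tuned to your hardware, and a maintained library (or bcrypt/scrypt/Argon2) is usually the better choice:

```java
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.security.spec.KeySpec;
import java.util.Base64;

public class PasswordHashing {

    private static final SecureRandom RANDOM = new SecureRandom();
    private static final int ITERATIONS = 210_000; // tune to your hardware budget
    private static final int KEY_LENGTH_BITS = 256;

    // Returns "iterations:salt:hash" - everything needed to verify later
    public static String hash(char[] password) throws Exception {
        byte[] salt = new byte[16];
        RANDOM.nextBytes(salt);
        byte[] hash = pbkdf2(password, salt, ITERATIONS);
        return ITERATIONS + ":" + Base64.getEncoder().encodeToString(salt)
                + ":" + Base64.getEncoder().encodeToString(hash);
    }

    public static boolean verify(char[] password, String stored) throws Exception {
        String[] parts = stored.split(":");
        int iterations = Integer.parseInt(parts[0]);
        byte[] salt = Base64.getDecoder().decode(parts[1]);
        byte[] expected = Base64.getDecoder().decode(parts[2]);
        byte[] actual = pbkdf2(password, salt, iterations);
        // constant-time comparison to avoid timing leaks
        return MessageDigest.isEqual(expected, actual);
    }

    private static byte[] pbkdf2(char[] password, byte[] salt, int iterations) throws Exception {
        KeySpec spec = new PBEKeySpec(password, salt, iterations, KEY_LENGTH_BITS);
        return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256").generateSecret(spec).getEncoded();
    }
}
```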

And that’s for the most obvious feature that every application has. No wonder it has been implemented incorrectly many, many times. The IT world is complex and nothing is simple. Sending email isn’t simple, authentication isn’t simple, logging isn’t simple. Working with strings and dates isn’t simple, sanitizing input and output isn’t simple.

We have done a poor job in building the frameworks and tools to help us with all those things. We can’t really ignore them, we have to think about them actively and take conscious, informed decisions.

The post Simple Things That Are Actually Hard: User Authentication appeared first on Bozho's tech blog.

Integrity Guarantees of Blockchains In Case of Single Owner Or Colluding Owners

Post Syndicated from Bozho original https://techblog.bozho.net/integrity-guarantees-of-blockchains-in-case-of-single-owner-or-colluding-owners/

The title may sound like a paper title rather than a blog post, because it was originally an idea for a paper, but I’m unlikely to find the time to write a proper one about it, so here it is – a blog post.

Blockchain has been touted as the ultimate integrity guarantee – if you “have blockchain”, nobody can tamper with your data. Of course, reality is more complicated, and even in the most distributed of ledgers, there are known attacks. But most organizations that are experimenting with blockchain, rely on a private network, sometimes having themselves as the sole owner of the infrastructure, and sometimes sharing it with just a few partners.

The point of having the technology in the first place is to guarantee that once collected, data cannot be tampered with. So let’s review how that works in practice.

First, we have to define two terms – “tamper-resistant” (sometimes referred to as tamper-free) and “tamper-evident”. “Tamper-resistant” means nobody can ever tamper with the data, and the state of the data structure is always guaranteed to be without any modifications. “Tamper-evident”, on the other hand, means that a data structure can be validated for integrity violations, and it will be known that there have been modifications (alterations, deletions or back-dating of entries). Therefore, with tamper-evident structures you can prove that the data is intact, but if it’s not intact, you can’t know the original state. It’s still a very important property, as the ability to prove that data has not been tampered with is crucial for compliance and legal aspects.

Blockchain is usually built on top of several main cryptographic primitives: cryptographic hashes, hash chains, Merkle trees, cryptographic timestamps and digital signatures. They all play a role in the integrity guarantees, but the most important ones are the Merkle tree (with all of its variations, like the Patricia Merkle tree) and the hash chain. The original Bitcoin paper describes the blockchain as a hash chain based on the roots of multiple Merkle trees (each of which forms a single block). Some blockchains rely on a single, ever-growing Merkle tree, but let’s not get into particular implementation details.

In all cases, blockchains are considered tamper-resistant because they are significantly distributed, so that a large enough number of members have a copy of the data. If some node modifies that data, e.g. 5 blocks in the past, it has to prove to everyone else that this is the correct Merkle root for that block. You need more than 50% of the network capacity to do that (and it’s more complicated than just having it), but it’s still possible. In a way, tamper resistance = tamper evidence + distributed data.

But many of the practical applications of blockchain rely on private networks, serving one or several entities. They are often based on proof of authority, which means whoever has access to a set of private keys controls what the network agrees on. So let’s review the two cases:

  • Multiple owners – in the case of multiple node owners, several of them can collude to rewrite the chain. The collusion can be based on mutual business interest (e.g. in a supply chain, several members may team up against the producer to report distorted data), or it can be based on a security compromise (e.g. multiple members are hacked by the same group). In that case, the remaining node owners may have a backup of the original data, but finding out whether the rest were malicious or the changes were a legitimate part of the business logic would require a complicated investigation.
  • Single owner – a single owner can have a nice Merkle tree or hash chain, but an admin with access to the underlying data store can regenerate the whole chain and it will look legitimate, while in reality it will be tampered with. Splitting access between multiple admins is one approach (or giving them access to separate nodes, none of whom has access to a majority), but they often drink beer together and collusion is again possible. But more importantly – you can’t prove to a 3rd party that your own employees haven’t colluded under orders from management in order to cover some tracks to present a better picture to a regulator.

In the case of a single owner, you don’t even have a tamper-evident structure – the chain can be fully rewritten and nobody will be able to tell. In the case of multiple owners, it depends on the implementation. There will be a record of the modification at the non-colluding parties, but proving which side “cheated” would be next to impossible. Tamper evidence is only partially achieved, because you can’t prove whose data was modified and whose wasn’t (you only know that one of the copies has tampered data).

The way to achieve a tamper-evident structure in both scenarios is to use anchoring. Checkpoints of the data need to be anchored externally, so that there is a clear record of the state of the chain at different points in time. Before blockchain, the recommended approach was to print the checkpoint in a newspaper (e.g. as an ad): because it has a large enough circulation, nobody can collect all the newspapers and modify the published checkpoint hash. The published hash would be either the root of the Merkle tree or the latest hash in a hash chain. An ever-growing Merkle tree would additionally allow consistency and inclusion proofs to be validated.
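
To make the anchoring idea concrete, here is a minimal hash chain sketch in Java (the class is mine and ignores persistence, batching and Merkle proofs). The point is that latestHash() is the single value you would periodically publish externally – in a newspaper ad or on a public blockchain:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.HexFormat;

public class HashChain {

    // genesis value; every subsequent hash depends on all previous entries
    private String latestHash = "0".repeat(64);

    // Appends an entry; its hash covers the previous hash, so altering or
    // removing any earlier entry changes every hash after it, including the
    // latest one - which is exactly what gets anchored externally.
    public synchronized String append(String entry) throws Exception {
        MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
        sha256.update(latestHash.getBytes(StandardCharsets.UTF_8));
        sha256.update(entry.getBytes(StandardCharsets.UTF_8));
        latestHash = HexFormat.of().formatHex(sha256.digest()); // HexFormat requires Java 17+
        return latestHash;
    }

    // The checkpoint value to publish externally at regular intervals
    public synchronized String latestHash() {
        return latestHash;
    }
}
```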

When we have electronic distribution of data, we can use public blockchains to regularly anchor our internal ones, in order to achieve proper tamper evidence. We at LogSentinel, for example, do exactly that – we allow publishing the latest Merkle root and the latest hash chain value to Ethereum. Then, even if those with access to the underlying datastore manage to modify and regenerate the entire chain/tree, there will be no match with the publicly advertised values.

How to store data on public blockchains is a separate topic. In the case of Ethereum, you can put an arbitrary payload in a transaction, so you can put that hash in low-value transactions between two of your own addresses (or in self-transactions). You can use smart contracts as well, but that’s not necessary. For Bitcoin, you can use OP_RETURN. Other implementations may have different approaches to storing data within transactions.

If we want to achieve tamper resistance, we just need to have several copies of the data, all subject to tamper-evidence guarantees – just as in a public network. What a public network gives us is a layer that we can trust to provide the necessary piece for achieving local tamper evidence. Of course, going down to hardware, it’s easier to have write-only storage (WORM – write once, read many). The problem with it is that it’s expensive and that you can’t reuse it. It’s not very applicable to use cases with short-lived data that requires tamper resistance.

So, in summary, in order to have proper integrity guarantees and the ability to prove that the data in single-owner or multi-owner private blockchains hasn’t been tampered with, we have to publish the latest hash of whatever structure we are using (chain or tree). If not, we are only complicating our lives by integrating a complex piece of technology without getting the real benefit it can bring – proving the integrity of our data.

The post Integrity Guarantees of Blockchains In Case of Single Owner Or Colluding Owners appeared first on Bozho's tech blog.

Hypotheses About What Happened to Facebook

Post Syndicated from Bozho original https://techblog.bozho.net/hypotheses-about-what-happened-to-facebook/

Facebook was down. I’d recommend reading Cloudflare’s summary. Then I recommend reading Facebook’s own account of the incident. But let me expand on that. Facebook published announcements and withdrawals for certain BGP prefixes, which led to removing its DNS servers from “the map of the internet” – they told everyone “the part of our network where our DNS servers are doesn’t exist”. That was the result of a self-inflicted backbone failure due to a bug in the auditing tool that checks whether executed commands do anything harmful.

Facebook owns a lot of IPs. According to RIPEstat they are part of 399 prefixes (147 of them are IPv4). The DNS servers are located in two of those 399. Facebook uses a.ns.facebook.com, b.ns.facebook.com, c.ns.facebook.com and d.ns.facebook.com, which get queries whenever someone wants to know the IPs of Facebook-owned domains. These four nameservers are served by the same Autonomous System from just two prefixes – 129.134.30.0/23 and 185.89.218.0/23. Of course “4 nameservers” is a logical construct, there are probably many actual servers behind that (using anycast).

I wrote a simple “script” to fetch all the withdrawals and announcements for all Facebook-owned prefixes (from the great API of RIPEstat). Facebook didn’t remove itself from the map entirely. As Cloudflare points out, only some prefixes were affected. It may be just these two, or a few others as well, but it seems that just a handful were affected. If we sort the resulting CSV from the above script by withdrawals, we’ll notice that 129.134.30.0/23 and 185.89.218.0/23 are pretty high up (alongside the 185.89 and 129.134 /24s, which are all included in the /23s). Now that perfectly matches Facebook’s account that their nameservers automatically withdraw themselves if they fail to connect to other parts of the infrastructure. Everything may have also been down, but the logic for withdrawal is present only in the networks that have nameservers in them.
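
For reference, fetching this kind of data requires nothing more than an HTTP call. Below is a rough sketch of what such a script boils down to, written in Java; the data call name is from the public RIPEstat data API as I remember it, and AS32934 is Facebook’s autonomous system – check the current API docs before relying on it:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RipeStatExample {
    public static void main(String[] args) throws Exception {
        // AS32934 is Facebook's autonomous system; the data call name is taken
        // from the public RIPEstat data API - verify against the current docs
        String url = "https://stat.ripe.net/data/announced-prefixes/data.json?resource=AS32934";

        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // The JSON payload lists the announced prefixes; per-prefix announcement
        // and withdrawal history comes from a separate data call and can be
        // dumped into a CSV for sorting, as described above
        System.out.println(response.body());
    }
}
```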

So first, let me make a few general observations that are not as obvious and as universal as they may sound, but are worth discussing:

  • Use longer DNS TTLs if possible – if Facebook had a 6-hour TTL on its domains, we may not have noticed that their name servers were down. This is hard to ask of such a complex service that uses DNS for load balancing and geographical distribution, but it’s worth considering. Also, if they killed their backbone and their entire infrastructure was down anyway, the DNS TTL would not have solved the issue.
  • We need improved caching logic for DNS. It can’t be just “present or not”; DNS caches could keep a “last known good state” in case of SERVFAIL and fall back to that. All of those DNS resolvers that had to ask the authoritative nameserver “where can I find facebook.com” knew where to find facebook.com just a minute earlier. Then they got a failure and suddenly they are wondering “oh, where could Facebook be?”. It’s not that simple, of course, but such a cache improvement is worth considering. And again, if their entire infrastructure was down, this would not have helped.
  • Consider having an authoritative nameserver outside your main AS. If something bad happens to your AS routes (regardless of the reason), you may still have DNS working. That may have downsides – generally, it will be hard to manage and sync your DNS infrastructure. But at least having a spare set of nameservers and the option to quickly point glue records there is worth considering. It would not have saved Facebook in this case, as again, they claim the entire infrastructure was inaccessible due to a “broken” backbone.
  • Have a 100% test coverage on critical tools, such as the auditing tool that had a bug. 100% test coverage is rarely achievable in any project, but in such critical tools it’s a must.

The main explanation is the accidental outage. This is what Facebook engineers explain in the blogpost and other accounts, and that’s what seems to have happened. However, there are alternative hypotheses floating around, so let me briefly discuss all of the options.

  • Accidental outage due to misconfiguration – a very likely scenario. These things may happen to everyone, and Facebook is known for its “break things” mentality, so it’s not unlikely that they just didn’t have the right safeguards in place and that someone ran a buggy update. There are many scenarios for why and how that may have happened, and we can’t know from the outside (even after Facebook’s brief description). This remains the primary explanation, following my favorite Hanlon’s razor. A bug in the audit tool is absolutely realistic (btw, I’d love Facebook to publish their internal tools).
  • Cyber attack – it cannot be known from the data we have, but this would be a sophisticated attack that gained access to their BGP administration interface, which I would assume is properly protected. Not impossible, but a 6-hour outage of a social network is not something a sophisticated actor (e.g. a nation state) would invest resources in. We can’t rule it out, as this might be “just a drill” for something bigger to follow. If I were an attacker who wanted to take Facebook down, I’d try to kill their DNS servers, or indeed, “de-route” them. If we didn’t know that Facebook lets its DNS servers cut themselves off from the network in case of failures, the fact that so few prefixes were updated might be an indicator of a targeted attack, but this seems less and less likely.
  • Deliberate self-sabotage – 1.5 billion records were claimed to have been leaked yesterday. At the same time, a Facebook whistleblower is testifying in the US Congress. Both of these stories are potentially damaging to Facebook’s reputation and share price. If they wanted to drown the news and the respective share price plunge in a technical story that few people understand but everyone is talking about (and then have their share price rebound, because technical issues happen to everyone), then that’s the way to do it – just as a malicious actor would, but without all the hassle of gaining access from outside – de-route the prefixes for the DNS servers and you have a “perfect” outage. These coincidences have led people to assume such a plot, but given the observed outage and the explanation Facebook gave for why the DNS prefixes were automatically withdrawn, this sounds unlikely.

Distinguishing between the three options is actually hard. You can mask a deliberate outage as an accident, and a malicious actor can make it look like deliberate self-sabotage. That’s why there are speculations. To me, however, judging by all the data we have in RIPEstat and the various accounts by Cloudflare, Facebook and other experts, it seems that a chain of mistakes (operational and possibly design ones) led to this.

The post Hypotheses About What Happened to Facebook appeared first on Bozho's tech blog.

Digital Transformation and Technological Utopianism

Post Syndicated from Bozho original https://techblog.bozho.net/digital-transformation-and-technological-utopianism/

Today I read a very interesting article about the prominence of Bulgarian hackers (in the black-hat sense) and virus authors in the 90s, linking that to the focus on technical education in the 80s, led by the Bulgarian Communist Party in an effort to revive communism through technology.

Near the end of the article I was pleasantly surprised to read my name, as a political candidate who advocates for digital e-government and transformation of the public sector. The article then ended with something that I’m in deep disagreement with, but that has merit, and is worth discussing (and you can replace “Bulgaria” with probably any country there):

Of course, the belief that all the problems of a corrupt Bulgaria can be solved through the perfect tools is not that different to the Bulgarian Communist Party’s old dream that central planning through electronic brains would create communism. In both cases, the state is to be stripped back to a minimum

My first reaction was to deny ever claiming that the state would be stripped back to a minimum, as it will not be (at the risk of enraging my libertarian readers), or to argue that I’ve never claimed there are “perfect tools” that can solve all problems, nor that digital transformation is the only way to solve those problems. But what I’ve said or written has little to do with the overall perception of techno-utopianism that IT-people-turned-policy-makers usually struggle with.

So I decided to clearly state what e-government and digital transformation of the public sector is about.

First, it’s just catching up to the efficiency of the private sector. Sadly, there’s nothing visionary about wanting to digitize paper processes and provide services online. It’s something that’s been around for two decades in the private sector and the public sector just has to catch up, relying on all the expertise accumulated in those decades. Nothing grandiose or mind-boggling, just not being horribly inefficient.

As the world grows more complex, legislation and regulation grow more complex, and the government gets more and more functions and more and more details to care about. There are more topics to have a policy about (and many on which to take an informed decision NOT to have a policy). All of that, today, can’t rely on pen and paper and a few proverbial smart and well-intentioned people. The government needs technology to catch up and do its job. It has had the luxury of not having competition, and therefore it lagged behind. When there are no market forces to drive the digital transformation, what’s left is technocratic politicians. This efficiency has nothing to do with ideology, left or right. You can have “small government” and still have it inefficient and incapable of making sense of the world.

Second, technology is an enabler. Yes, it can help solve the problems of corruption, nepotism and lack of accountability. But as a tool, not as the solution itself. Take open data, for example (something I was working on five years ago, when Bulgaria jumped to the top of the EU open data index). Just having the data out there is an important effort, but by itself it doesn’t solve any problem. You need journalists, NGOs, citizens and a general understanding in society of what transparency means. Same for accountability – it’s one thing to have every document digitized, every piece of data published and every government official action leaving an audit trail; it’s a completely different story to have society act on those things – to have the institutions to investigate, to have the public pressure to turn that into political accountability.

Technology is also a threat – and that’s beyond the typical cybersecurity concerns. It poses the risk of dangerous institutions becoming too efficient; of excessive government surveillance; of entrenched interests carving their ways into the digital systems to perpetuate their corrupt agenda. I’m by no means ignoring those risks – they are real already. The Nazis, for example, were extremely efficient in finding the Jewish population in the Netherlands because the Dutch were very good at citizen registration. This doesn’t mean that you shouldn’t have an efficient citizen registration system. It means that it’s not good or bad per se.

And that gets us to the question of technological utopianism, of which I’m sometimes accused (though not directly in the quoted article). When you are an IT person, you have a technical hammer and everything may look like a binary nail. That’s why it’s very important to have a grasp of the humanities side as well. Technology alone will not solve anything. And my blockchain skepticism is a hint in that direction – many blockchain enthusiasts claim that blockchain will solve many problems in many areas of life. It won’t. At least not just through clever cryptography and consensus algorithms. I once even wrote a sci-fi story about exactly the aforementioned communist dream of a centralized computer brain that solves all social issues while people are left to do what they want. And argued that no matter how perfect it is, it won’t work in a non-utopian human world. In other words, I’m rather critical of techno-utopianism as well.

The communist party, according to the author, saw technology as a tool by which the communist government would achieve its ideological goal.

My idea is quite different. First, technology is necessary for the public sector to “catch up”, and second, I see technology as an enabler. What for – whether for accountability or surveillance, for fighting corruption or for entrenching it even further – is for us as individuals, as a society, and (in my case) as politicians to formulate and advocate. We have to embed our values, after democratic debate, into the digital tools (e.g. by making them privacy-preserving). But if we want to have good governance, and to be good at policy-making in the 21st century, we need digital tools, fully understanding their pitfalls and without putting them on a pedestal.

The post Digital Transformation and Technological Utopianism appeared first on Bozho's tech blog.

Every Serialization Framework Should Have Its Own Transient Annotation

Post Syndicated from Bozho original https://techblog.bozho.net/every-serialization-framework-should-have-its-own-transient-annotation/

We’ve all used dozens of serialization frameworks – for JSON, XML, binary, and ORMs (which are effectively serialization frameworks for relational databases). And there’s always the moment when you need to exclude some field from an object – make it “transient”.

So far so good, but then comes the point where one object is used by several serialization frameworks within the same project/runtime. That’s not necessarily the case, but let me discuss the two alternatives first:

  • Use the same object for all serializations (JSON/XML for APIs, binary serialization for internal archiving, ORM/database) – preferred if there are only minor differences between the serialized/persisted fields. Using the same object saves a lot of tedious transferring between DTOs.
  • Use different DTOs for different serializations – that becomes a necessity when scenarios become more complex and using the same object becomes a patchwork of customizations and exceptions

Note that both strategies can exist within the same project – there are simple objects and complex objects, and you may need a variety of DTOs only for the latter. But let’s discuss the first option.

If each serialization framework has its own “transient” annotation, it’s easy to tweak the serialization of one or two fields. More importantly, it will have predictable behavior. If not, then you may be forced to have separate DTOs even for classes where one field differs in behavior across the serialization targets.

For example, the other day I had the following surprise – we use Java binary serialization (ObjectOutputStream) for some internal buffering of large collections, and the objects are then indexed. In a completely separate part of the application, objects of the same class get indexed with additional properties that are irrelevant for the binary serialization and are therefore marked with the Java transient modifier. It turns out GSON also respects the transient modifier, so these fields were never indexed.
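
Here is a small sketch of the surprise (class and field names are made up). Gson excludes transient and static fields by default; overriding the excluded modifiers is one way out, but it has to be an explicit, conscious decision:

```java
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import java.io.Serializable;
import java.lang.reflect.Modifier;

public class TransientSurprise {

    static class Document implements Serializable {
        String id = "doc-1";
        // excluded from Java binary serialization, but Gson skips it too by default
        transient String searchableText = "only needed for indexing";
    }

    public static void main(String[] args) {
        Document doc = new Document();

        // default Gson: transient fields silently disappear from the JSON
        System.out.println(new Gson().toJson(doc)); // {"id":"doc-1"}

        // override the default exclusions so that only static fields are skipped
        Gson indexingGson = new GsonBuilder()
                .excludeFieldsWithModifiers(Modifier.STATIC)
                .create();
        System.out.println(indexingGson.toJson(doc)); // now includes searchableText
    }
}
```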

In conclusion, this post has two points. The first is – expect any behavior from serialization frameworks and have tests to verify different serialization scenarios. And the second is for framework designers – don’t reuse transient modifiers/annotations from the language itself or from other frameworks, it’s counterintuitive.

The post Every Serialization Framework Should Have Its Own Transient Annotation appeared first on Bozho's tech blog.

A Developer Running For Parliament

Post Syndicated from Bozho original https://techblog.bozho.net/a-developer-running-for-parliament/

That won’t be a typical publication you’d see on a developer’s blog. But yes, I’m running for parliament (in my country, Bulgaria, an EU member). And judging by the current polls for the party I’m with, I’ll make it.

But why? Well, I’ll refer to four previous posts in this blog to illustrate my decision.

First, I used to be a government advisor 4 years ago, so the “ship of public service” has sailed. What I didn’t realize back then was that in order to drive sustainable change in the digital realm of the public sector, you need to have a political debate about the importance and goals of those changes, not merely “ghost-write” them.

A great strategy, a great law and even a great IT system are useless without the mental uptake of a sufficient number of people. So that’s the reason one has to be at the forefront of political debate in order to make sure digital transformation is done right. And this forefront is parliament. I’m happy to have supported my party as an expert for the past four years and that this expertise is valued. That’s the biggest argument here – you need people like me, with deep technical knowledge and experience in many IT projects, to get things done right on every level. That’s certainly not a one-man task, though.

Second, it’s a challenge. I once wrote “What is challenging for developers”, and the last point is “open-ended problems”. Digitally transforming an entire country is certainly a challenge in that last category. There is no recipe, no manual for it.

Third, lawmaking is quite like programming (except it doesn’t regulate computer behavior, it regulates public life, which is far more complex and important). I already have a decent lawmaking experience and writing better, more precise and more “digital-friendly” laws is something that I like doing and something that I see as important.

Fourth, ethics has been important for me as a developer and it’s much more important for a politician.

For this blog it means I will be writing a bit more high-level stuff than day-to-day tips and advice. I hope I’ll still be able to (and sometimes have to) write some code in order to solve problems, but that won’t be enough material for blogposts. But I’ll surely share thoughts on cybersecurity, quality of public sector projects and system integration.

Software engineering and politics require very different skills. I think I am a good engineer (and I hope to remain so), and I have been a manager and a founder in the last couple of years as well. I’ve slowly, over time, developed my communication skills. National politics, even in a small country, is a tough feat, though. But as engineers we are constantly expanding our knowledge and skills, so I’ll try to transfer that mindset into a new realm.

The post A Developer Running For Parliament appeared first on Bozho's tech blog.

The Syslog Hell

Post Syndicated from Bozho original https://techblog.bozho.net/the-syslog-hell/

Syslog. You’ve probably heard about it, especially if you are into monitoring or security. Syslog is perceived to be the common, unified way for systems to send logs to other systems. Linux supports syslog, and many network and security appliances support syslog as a way to share their logs. On the other side, a syslog server receives all the syslog messages. It sounds great in theory – having a simple, common way to represent log messages and send them across systems.

Reality couldn’t be further from that. Syslog is not one thing – there are multiple “standards”, and each of them is implemented incorrectly more often than not. Many vendors have their own way of representing data, and it’s all a big mess.

First, the RFCs. There are two RFCs – RFC3164 (“old” or “BSD” syslog) and RFC5424 (the new variant that obsoletes 3164). RFC3164 is not a standard, while RFC5424 is (mostly).

Those RFCs concern the contents of a syslog message. Then there’s RFC6587 which is about transmitting a syslog message over TCP. It’s also not a standard, but rather “an observation”. Syslog is usually transmitted over UDP, so fitting it into TCP requires some extra considerations. Now add TLS on top of that as well.

Then there are the content formats. RFC 5424 defines a key-value structure, but RFC 3164 does not – everything after the syslog header is just an unstructured message string. So many custom formats exist. For example, firewall vendors tend to define their own message formats. At least they are often documented (e.g. check WatchGuard and SonicWall), but parsing them requires a lot of custom knowledge about that vendor’s choices. Sometimes the documentation doesn’t fully reflect reality, though.

Instead of vendor-specific formats, there are also de-facto standards like CEF and the less popular LEEF. They define a structure of the message and are actually syslog-independent (you can write CEF/LEEF to a file). But when syslog is used for transmitting CEF/LEEF, the message should respect RFC3164.

And now comes the “fun” part – incorrect implementations. Many vendors don’t really respect those documents. They come up with their own variations of even the simplest things like a syslog header. Date formats are all over the place, hosts are sometimes missing, priority is sometimes missing, non-host identifiers are used in place of hosts, colons are placed frivolously.
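
To make the two “standard” header shapes concrete, here is a deliberately simplified Java sketch of parsing them (names are mine). A real collector needs many vendor-specific variants on top of this, which is exactly the problem:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SyslogHeaderParser {

    // RFC 3164-style header: <PRI>MMM dd HH:mm:ss HOST rest-of-message
    // Note: no year and no timezone - both have to be guessed by the receiver
    private static final Pattern RFC3164 = Pattern.compile(
            "^<(\\d{1,3})>([A-Z][a-z]{2}\\s+\\d{1,2} \\d{2}:\\d{2}:\\d{2}) (\\S+) (.*)$");

    // RFC 5424-style header: <PRI>1 ISO-8601-TIMESTAMP HOST APP PROCID MSGID rest
    private static final Pattern RFC5424 = Pattern.compile(
            "^<(\\d{1,3})>1 (\\S+) (\\S+) (\\S+) (\\S+) (\\S+) (.*)$");

    record Header(int priority, String timestamp, String host, String rest) {}

    public static Optional<Header> parse(String line) {
        Matcher m = RFC5424.matcher(line);
        if (m.matches()) {
            return Optional.of(new Header(Integer.parseInt(m.group(1)), m.group(2), m.group(3), m.group(7)));
        }
        m = RFC3164.matcher(line);
        if (m.matches()) {
            return Optional.of(new Header(Integer.parseInt(m.group(1)), m.group(2), m.group(3), m.group(4)));
        }
        return Optional.empty(); // vendor-specific parsers take over from here
    }
}
```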

Parsing all of that mess is extremely “hacky”, with tons of regexes trying to account for all vendor quirks. I’m working on a SIEM, and our collector is open source – you can check our syslog package. Some vendor-specific parsers are still missing, but we are adding new ones constantly. The date formats in the CEF parser tell a good story.

If it were just two RFCs with one de-facto message format standard for one of them and a few options for TCP/UDP transmission, that would be fine. But what makes things hell is the fact that too many vendors decided not to care about what is in the RFCs. They decided that “hey, putting a year there is just fine” even though the RFC says “no”, that they don’t really need to set a host in the header, and that they don’t really need to implement anything new after their initial legacy stuff was created.

Too many vendors (of various security and non-security software) came up with their own way of essentially representing key-value pairs, too many vendors thought their date format is the right one, too many vendors didn’t take the time to upgrade their logging facility in the past 12 years.

Unfortunately that’s representative of our industry (yes, xkcd). Someone somewhere stitches something together and then decades later we have an incomprehensible patchwork of stringly-typed, randomly formatted stuff flying around whatever socket it finds suitable. And it’s never the right time and the right priority to clean things up, to get up to date, to align with others in the field. We, as an industry (both security and IT in general) are creating a mess out of everything. Yes, the world is complex, and technology is complex as well. Our job is to make it all palpable, abstracted away, simplified and standardized. And we are doing the opposite.

The post The Syslog Hell appeared first on Bozho's tech blog.

Developers Are Obsessed With Their Text Editors

Post Syndicated from Bozho original https://techblog.bozho.net/developers-are-obsessed-with-their-text-editors/

Developers are constantly discussing and even fighting about text editors and IDEs. Which one is better, why is it better, what’s the philosophy behind one or the other, which one makes you more productive, which one has better themes, which one is more customizable.

I myself have fallen victim to this trend, with several articles about why Emacs is not a good idea for Java, why I still use Eclipse (though I’d still prefer some IDEA features), and what’s the difference between an editor and an IDE (for those who’d complain about the imprecise title of this post).

Are text editors and IDEs important? Sure, they are among our main tools, the ones we use every day, and therefore they should be very, very good (more metaphors about violin players and tennis players, please). But most text editors and IDEs are good. They evolve, they copy each other, they attract their audiences. They are also good in different ways, but most of the top ones achieve their goal (otherwise they wouldn’t be so popular). Sure, someone prefers a certain feature to be implemented in a certain way, or demands another feature (e.g. I demand having call hierarchies on all constructors and IDEA doesn’t give me that, duh…). But those things are rarely significant in the grand scheme of things.

The comparative insignificance comes from the structure of our work, or why we are now often called “software engineers” – it’s not about typing speed, or about the perfectly optimized tool for creating code. Our time is dedicated to thinking, designing, reading and naming things. And the quality of our code writing/editing/debugging tool is not at the top of the list of things that drive productivity and quality.

We should absolutely master our tools, though. Creating software requires much more than editing text. Refactoring, advanced search, advanced code navigation, debugging, hot-swap/hot-deploy/reload-on-save, version control integration – all of these things are important for doing our job better.

My point is that text editors or IDEs occupy too much of developers’ time and mind, with too little benefit. Next time you think it’s a good idea to argue about which editor/IDE a colleague SHOULD be using, think twice. It’s not a good investment of your time and energy. And next time you consider standardizing on an editor/IDE for the whole team, don’t. Leave people with their preference, it doesn’t affect team consistency.

The post Developers Are Obsessed With Their Text Editors appeared first on Bozho's tech blog.

Releasing Often Helps With Analyzing Performance Issues

Post Syndicated from Bozho original https://techblog.bozho.net/releasing-often-helps-with-analyzing-performance-issues/

Releasing often is a good thing. It’s cool, and helps us deliver new functionality quickly, but I want to share one positive side-effect – it helps with analyzing production performance issues.

We do releases every 5 to 10 days, and after a recent release the application CPU usage chart jumped to about twice its previous level (the lines are colored differently because we use blue-green deployment):

What are the typical ways to find performance issues with production loads?

  • Connect a profiler directly to production – tricky, as it requires managing network permissions and might introduce unwanted overhead
  • Run performance tests against a staging or local environment and do profiling there – good, except your performance tests might not hit exactly the functionality that causes the problem (this is what happened in our case, as the problem was caused by particular types of API calls that weren’t present in our performance tests). Also, performance tests can be tricky.
  • Do a thread dump (and heap dump) and analyze them locally – a good step, but requires some luck and a lot of experience analyzing dumps, even if equipped with the right tools
  • Check your git history / release notes for what change might have caused it – this is what helped us resolve the issue. And it was possible because there were only 10 days of commits between the releases.

We could go through all of the commits and spot potential performance issues. Most of them turned out not to be a problem, and one seemingly unproblematic piece was discovered to be the culprit after commenting it out for a brief period and deploying a quick release without it, to test the hypothesis. I’ll share a separate post about the particular issue, but we would have had to waste a lot more time if that release had contained 3 months’ worth of commits rather than 10 days’.

Sometimes it’s not an obvious spike in CPU or memory, but a more gradual issue that you introduce at some point and that starts being a problem a few months later. That’s what happened a few months ago, when we noticed a steady growth in CPU usage along with the growth of ingested data. Logical in theory, but the CPU usage grew faster than the data ingestion rate, which isn’t good.

So we had to answer the question “when did it start growing” in order to pinpoint the release that introduced the issue. Because that release had only 5 days of commits, it was much easier to find the culprit.

All of the above techniques are useful and should be employed at the right time. But releasing often gives you a hand with analyzing where a performance issue is coming from.

The post Releasing Often Helps With Analyzing Performance Issues appeared first on Bozho's tech blog.

Let’s Kill Security Questions

Post Syndicated from Bozho original https://techblog.bozho.net/lets-kill-security-questions/

Security questions still exist. They are less dominant now, but we haven’t yet condemned them as an industry hard enough so that they stop being added to authentication flows.

But they are bad. They are like passwords, but more easily guessable, because you have a password hint. And while there are opinions that they might be okay in certain scenarios, they have so many pitfalls that in practice we should just not consider them an option.

What are those pitfalls? Social engineering. Almost any security question’s answer can be guessed by doing research on the target person online. We share more and more about our lives and don’t even realize how that affects us security-wise. Many security questions have a limited set of possible answers that can be enumerated with a brute force attack (e.g. the most common pet names; the most common last names in a given country for a given period of time, in order to guess someone’s mother’s maiden name; the high schools in the area where the person lives, and so on). So when someone wants to take over your account, if all they have to do is open your Facebook profile or try 20-30 options, you have no protection.

But what are they for in the first place? Account recovery. You have forgotten your password and the system asks you some details about yourself to allow you to reset it. We have already largely solved the problem of account recovery – send a password reset link to the email of the user. If the system itself is an email service, or in a couple of other scenarios, you can use a phone number to which a one-time password is sent for recovery purposes (or a secondary email, for email providers).
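
For completeness, the email-based recovery flow itself is simple to do statelessly. Below is a minimal sketch of an expiring, HMAC-signed reset token (the names and token format are mine, and it assumes the user ID contains no “:”); the token goes into the emailed link and nothing needs to be stored server-side:

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.time.Instant;
import java.util.Base64;

public class ResetLinks {

    private static final byte[] SERVER_SECRET = "replace-with-a-real-secret".getBytes(StandardCharsets.UTF_8);

    // Produces a self-contained token: userId, expiry and an HMAC over both.
    // The signature makes it unforgeable without the server secret.
    public static String createToken(String userId, long validSeconds) throws Exception {
        long expiresAt = Instant.now().getEpochSecond() + validSeconds;
        String payload = userId + ":" + expiresAt; // assumes userId has no ':'
        return payload + ":" + sign(payload);
    }

    public static boolean validate(String token) throws Exception {
        String[] parts = token.split(":");
        if (parts.length != 3) return false;
        String payload = parts[0] + ":" + parts[1];
        boolean notExpired = Long.parseLong(parts[1]) >= Instant.now().getEpochSecond();
        // constant-time comparison of the signatures
        boolean signatureOk = MessageDigest.isEqual(
                sign(payload).getBytes(StandardCharsets.UTF_8),
                parts[2].getBytes(StandardCharsets.UTF_8));
        return notExpired && signatureOk;
    }

    private static String sign(String payload) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(SERVER_SECRET, "HmacSHA256"));
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(mac.doFinal(payload.getBytes(StandardCharsets.UTF_8)));
    }
}
```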

So we have the account recovery problem largely solved, why are security questions still around? Inertia, I guess. And the five monkeys experiment. There is no good reason to have a security question if you can have recovery email or phone. And you can safely consider that to be true (ok, maybe there are edge cases).

There are certain types of account recovery measures that resemble security questions and can be implemented as an additional layer, on top of phone or email recovery. For more important services (e.g. your Facebook account or your main email), it may not be safe to consider just owning the phone or just having access to the associated email to be enough. Phones get stolen, emails get “broken into”. That’s why an additional set of question-like checks may serve as extra protection. For example – guessing recent activity. Facebook does that sometimes by asking you about your activity on the site or about your friends. This is not perfect, as it can be monitored by the malicious actor, but it is an option. For your email, you can be asked which are the most recent emails that you’ve sent, and be presented with options to choose from, including some made-up examples. These things are hard to implement because of geographic and language differences, but “guess your recent activity among these choices”, i.e. dynamically defined security questions, may be an acceptable additional step for account recovery.

But fixed security questions – no. Let’s kill those. I’m not the first to argue against security questions, but we need to be reminded that certain bad security practices should be left in the past.

Authentication is changing. We are desperately trying to get rid of the password itself (and still failing to do so), but before we manage that, we should first get rid of the “bad password in disguise”, the security question.

The post Let’s Kill Security Questions appeared first on Bozho's tech blog.

Discovering an OSSEC/Wazuh Encryption Issue

Post Syndicated from Bozho original https://techblog.bozho.net/discovering-an-ossec-wazuh-encryption-issue/

I’m trying to get the Wazuh agent (a fork of OSSEC, one of the most popular open source security tools, used for intrusion detection) to talk to our custom backend (namely, our LogSentinel SIEM Collector) to allow us to reuse the powerful Wazuh/OSSEC functionalities for customers that want to install an agent on each endpoint rather than just one collector that “agentlessly” reaches out to multiple sources.

But even though there’s good documentation on the message format and encryption, I couldn’t successfully decrypt the messages. (I’ll refer to both Wazuh and OSSEC, as the functionality is almost identical in both, with the distinction that Wazuh added AES support in addition to Blowfish.)

That led me to a two-day investigation of the possible reasons. The first side-discovery was the undocumented OpenSSL auto-padding of keys and IVs described in my previous article. Then it led me to actually writing C code (and copying the relevant Wazuh/OSSEC pieces) in order to debug the issue. With Wazuh/OSSEC I was generating one ciphertext, and with Java and the openssl CLI – a different one.

I made sure the key, key size, IV and mode (CBC) are identical. That they are equally padded and that OpenSSL’s EVP API is correctly used. All of that was confirmed and yet there was a mismatch, and therefore I could not decrypt the Wazuh/OSSEC message on the other end.

After discovering the 0-padding, I also discovered a mistake in the documentation, which used a static IV of FEDCBA9876543210 rather than the one found in the code, where the 0 precedes the 9 – FEDCBA0987654321. But that didn’t fix the issue either, it only got me one step closer.

A side-note here on IVs – Wazuh/OSSEC uses a static IV, which is a bad practice. The issue was reported 5 years ago, but it is minor, because some additional per-message randomness remediates the use of a static IV; it’s just not idiomatic to do it that way and may have unexpected side-effects.

So, after debugging the C code, I got to a simple code that could be used to reproduce the issue and asked a question on Stackoverflow. 5 minutes after posting the question I found another, related question that had the answer – using hex strings like that in C doesn’t work. Instead, they should be encoded: char *iv = (char *)"\xFE\xDC\xBA\x09\x87\x65\x43\x21\x00\x00\x00\x00\x00\x00\x00\x00";. So, the value is not the bytes corresponding to the hex string, but the ASCII codes of each character in the hex string. I validated that in the receiving Java end with this code:
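A minimal sketch of that check (not the exact original snippet – the key and plaintext below are placeholders, using AES-256 in CBC mode as discussed above):

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class WazuhIvCheck {
    public static void main(String[] args) throws Exception {
        // The C side treats the "hex" string as plain ASCII, so the effective IV is the
        // 16 ASCII bytes of the string itself, not the 8 decoded bytes FE DC BA 09 87 65 43 21.
        byte[] iv = "FEDCBA0987654321".getBytes(StandardCharsets.US_ASCII);

        // Placeholder key: a 32-character hex string (e.g. an MD5 digest in hex), again taken
        // as ASCII bytes, which yields a 32-byte (AES-256) key.
        byte[] key = "00112233445566778899aabbccddeeff".getBytes(StandardCharsets.US_ASCII);

        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        byte[] ciphertext = cipher.doFinal("test message".getBytes(StandardCharsets.UTF_8));

        // Comparing this output with what the C/OpenSSL side produces for the same inputs is
        // how the ASCII interpretation can be confirmed (padding details aside).
        System.out.println(Base64.getEncoder().encodeToString(ciphertext));
    }
}
```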

This has implications for the documentation, as well as for the whole scheme. Because the Wazuh/OSSEC AES key is: MD5(password) + MD5(MD5(agentName) + MD5(agentID)){0, 15}, the 2nd part is practically discarded – MD5(password) is 32 hex characters (= 32 ASCII codes/bytes), which is already the full length of the AES key. It also means the key is derived from a significantly smaller pool of options: each key byte is the ASCII code of one of 16 hex characters, so it can take only 16 possible values rather than 256.
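To make the implication concrete, here is a small illustrative sketch (the password, agent name and ID are made up, and the truncation simply mirrors the formula above):

```java
import java.security.MessageDigest;
import java.util.Arrays;

public class KeySpaceSketch {

    static String md5Hex(String input) throws Exception {
        byte[] digest = MessageDigest.getInstance("MD5").digest(input.getBytes());
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) hex.append(String.format("%02x", b));
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // Made-up enrollment values, only to illustrate the derivation
        String password = "agent-password", agentName = "web-01", agentId = "001";

        // MD5(password) + MD5(MD5(agentName) + MD5(agentID)){0, 15}
        String derived = md5Hex(password)
                + md5Hex(md5Hex(agentName) + md5Hex(agentId)).substring(0, 15);

        // The hex string is used as ASCII bytes and only the first 32 bytes form the AES-256 key,
        // so the per-agent part after MD5(password) never influences the key.
        byte[] key = Arrays.copyOf(derived.getBytes(), 32);
        System.out.println(new String(key).equals(md5Hex(password))); // prints: true
    }
}
```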

I raised an issue with Wazuh. Although this can be seen as a vulnerability (due to the reduced key space), it’s rather minor from a security point of view, and as the communication is mostly happening within the corporate network, I don’t think it has to be privately reported and fixed immediately.

Yet, I made a recommendation to introduce an additional configuration option that would allow transitioning to an updated protocol without causing backward compatibility issues. In fact, I’d go further and recommend using TLS/DTLS rather than a home-grown, AES-based scheme. Mutual authentication can be achieved through TLS mutual authentication rather than through a shared secret.

It’s satisfying to discover issues in popular software, especially when they are not written in your “native” programming language. And as a rule of thumb – encodings often cause problems, so we should be extra careful with them.

The post Discovering an OSSEC/Wazuh Encryption Issue appeared first on Bozho's tech blog.

Is It Really Two-Factor Authentication?

Post Syndicated from Bozho original https://techblog.bozho.net/is-it-really-two-factor-authentication/

Terminology-wise, there is a clear distinction between two-factor authentication (multi-factor authentication) and two-step verification (authentication), as this article explains. 2FA/MFA is authentication using more than one factor, i.e. “something you know” (a password), “something you have” (a token, a card) and “something you are” (biometrics). Two-step verification is basically using two passwords – one permanent and another that is short-lived and one-time.

At least that’s the theory. In practice it’s more complicated to say which authentication method belongs to which category (“something you X”). Let me illustrate that with a few examples:

  • An OTP hardware token is considered “something you have”. But it uses a shared symmetric secret with the server so that both can generate the same code at the same time (if using TOTP), or the same sequence. This means the secret is effectively “something you know”, because someone may steal it from the server, even though the hardware token is protected. Unless, of course, the server stores the shared secret in an HSM and does the OTP comparison on the HSM itself (some support that). And there’s still a theoretical possibility for the keys to leak prior to being stored on hardware. So is a hardware token “something you have” or “something you know”? For practical purposes it can be considered “something you have” (see the TOTP sketch after this list).
  • Smartphone OTP is often not considered as secure as a hardware token, but it arguably should be, due to the secure storage of modern phones. The secret is shared once during enrollment (usually by scanning a QR code on the screen), so it should be “something you have” as much as a hardware token is.
  • SMS is not considered secure and is often given as an example of 2-step verification, because it’s just another password. While that’s true, it is largely because of a particular SS7 vulnerability (allowing the interception of mobile communication). If mobile communication standards were secure, the SIM card would be tied to the number and only the SIM card holder would be able to receive the message, making it “something you have”. But with the known vulnerabilities, it is “something you know”, and that something is actually the phone number.
  • Fingerprint scanners represent “something you are”. And in most devices they are built in a way that the scanner authenticates to the phone (being cryptographically bound to the CPU) while transmitting the fingerprint data, so you can’t just intercept the bytes transferred and then replay them. That’s the theory; it’s not publicly documented how it’s implemented. But if it were not so, then “something you are” is “something you have” – a sequence of bytes representing your fingerprint scan, and that can leak. This is precisely why biometric identification should only be done locally, on the phone, without any server interaction – you can’t make sure the server is receiving sensor-scanned data or captured and replayed data. That said, biometric factors are tied to the proper implementation of the authenticating smartphone application – if your, say, banking application needs a fingerprint scan to run, a malicious actor should not be able to bypass that by stealing shared credentials (userIDs, secrets) and do API calls to your service. So to the server there’s no “something you are”. It’s always “something that the client-side application has verified that you are, if implemented properly”
  • A digital signature (via a smartcard or yubikey or even a smartphone with secure hardware storage for private keys) is “something you have” – it works by signing one-time challenges sent by the server and verifying that the signature has been created by the private key associated with the previously enrolled public key. Knowing the public key gives you nothing, because of how public-key cryptography works. There’s no shared secret and no intermediary whose data flow can be intercepted. A private key is still “something you know”, but by putting it in hardware it becomes “something you have”, i.e. a true second factor. Of course, until someone finds out that the random generation of primes used for generating the private key has been broken and you can derive the private key from the public key (as happened recently with one vendor).
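To make the “shared symmetric secret” point concrete, here is a minimal, illustrative TOTP sketch (following RFC 6238/4226; the secret value is a made-up placeholder). The server needs exactly the same secret to verify the code, which is what blurs the line between “something you have” and “something you know”:

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.time.Instant;

public class TotpSketch {

    // RFC 6238 TOTP on top of RFC 4226 HOTP: token and server both derive the code from the
    // same shared secret, so a secret leaked from the server is enough to generate valid codes.
    static int totp(byte[] sharedSecret, long unixSeconds) throws Exception {
        long counter = unixSeconds / 30;                      // 30-second time step
        byte[] message = ByteBuffer.allocate(8).putLong(counter).array();
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(sharedSecret, "HmacSHA1"));
        byte[] hash = mac.doFinal(message);
        int offset = hash[hash.length - 1] & 0x0f;            // dynamic truncation
        int binary = ((hash[offset] & 0x7f) << 24)
                | ((hash[offset + 1] & 0xff) << 16)
                | ((hash[offset + 2] & 0xff) << 8)
                | (hash[offset + 3] & 0xff);
        return binary % 1_000_000;                            // 6-digit code
    }

    public static void main(String[] args) throws Exception {
        byte[] secret = "hypothetical-shared-secret".getBytes(); // enrolled once, stored on both sides
        System.out.printf("%06d%n", totp(secret, Instant.now().getEpochSecond()));
    }
}
```

If that shared secret leaks from the server side (e.g. via a database dump), valid codes can be generated without ever touching the hardware token – which is exactly the “something you know” objection above.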

There isn’t an obvious boundary between theoretical and practical. “Something you are” and “something you have” can eventually be turned into “something you know” (or “something someone stores”). Some theoretical attacks can become very practical overnight.

I’d suggest we stick to calling everything “two-factor authentication”, because it’s more important to have mass understanding of the usefulness of the technique than to nitpick on the terminology. 2FA does not solve phishing, unfortunately, but it solves leaked credentials, which is good enough, and everyone should use some form of it. Even SMS is better than nothing (obviously, for high-profile systems, digital signatures are the way to go).

The post Is It Really Two-Factor Authentication? appeared first on Bozho's tech blog.

Making Sense of the Information Security Landscape

Post Syndicated from Bozho original https://techblog.bozho.net/making-sense-of-the-information-security-landscape/

There are hundreds of different information security solutions out there and choosing which one to pick can be hard. Usually decisions are driven by recommendations, vendor familiarity, successful upsells, compliance needs, etc. I’d like to share my understanding of the security landscape by providing one-line descriptions of each of the different categories of products.

Note that these categories are sometimes not strictly defined, and they may overlap. They may have evolved over time, and a certain category can include several products from legacy categories. The explanations will be slightly simplified. For a generalization and summary, skip the list and go to the next paragraph. This post aims to summarize a lot of Gartner and Forrester reports, as well as product data sheets, combined with some real-world observations, and to bring all of that to a technical level, rather than to broad, business-focused capabilities. I’ll split the products into several groups, though they may be overlapping.

Monitoring and auditing

  • SIEM (Security Information and Event Management) – collects logs from all possible sources (applications, OSs, network appliances) and raises alarms if there are anomalies
  • IDS (Intrusion Detection System) – listening to network packets and finding malicious signatures or statistical anomalies. There are multiple ways to listen to the traffic: proxy, port mirroring, network tap, host-based interface listener. Deep packet inspection is sometimes involved, which requires sniffing TLS at the host or terminating it at a proxy in order to be able to inspect encrypted communication (especially for TLS 1.3), effectively doing an MITM “attack” on the organization users.
  • IPS (Intrusion Prevention System) – basically a marketing upgrade of IDS with the added option to “block” traffic rather than just “report” the intrusion.
  • UEBA (User and Entity Behavior Analytics) – a system that listens to system activity (via logs and/or directly monitoring endpoints for user and system activity, including via screen capture) that tries to identify user behavior patterns (as well as system component behavior patterns) and report on any anomalies and changes in the pattern, also classifying users as less or more “risky”. Recently UEBA has been part of next-gen SIEMs
  • SUBA (Security User Behavior Analytics) – same as UEBA, but named after the purpose (security) rather than the entities monitored. Used by Forrester (whereas UEBA is used by Gartner)
  • DAM (Database Activity Monitoring) – tools that monitor and log database queries and configuration changes, looking for suspicious patterns and potentially blocking them based on policies. Implemented via proxy or agents installed at the host
  • DAP (Database Audit and Protection) – based on DAM, but with added features for content classification (similar to DLPs), vulnerability detection and more clever behavior analysis (e.g. through UEBA)
  • FIM (File Integrity Monitoring) – usually a feature of other tools, FIM is constantly monitoring files for potentially suspicious changes (see the sketch after this list)
  • SOC (Security Operations Center) – this is more of an organizational unit that employs multiple tools (usually a SIEM, DLP, CASB) to fully handle the security of an organization.
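As a toy illustration of the simplest of these – file integrity monitoring – one can hash everything under a directory and compare against a previously recorded baseline (the monitored path is a placeholder; real FIM tools watch continuously and also track permissions and ownership):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.util.HashMap;
import java.util.Map;

public class FimSketch {

    // Hash every regular file under the given root and return path -> SHA-256 hex digest
    static Map<String, String> snapshot(Path root) throws IOException {
        Map<String, String> hashes = new HashMap<>();
        try (var files = Files.walk(root)) {
            files.filter(Files::isRegularFile).forEach(p -> {
                try {
                    byte[] digest = MessageDigest.getInstance("SHA-256").digest(Files.readAllBytes(p));
                    StringBuilder hex = new StringBuilder();
                    for (byte b : digest) hex.append(String.format("%02x", b));
                    hashes.put(p.toString(), hex.toString());
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });
        }
        return hashes;
    }

    public static void main(String[] args) throws Exception {
        Path monitored = Paths.get("/etc"); // hypothetical monitored directory
        Map<String, String> baseline = snapshot(monitored);
        // ... later (e.g. from a scheduled job): re-hash and report any drift
        Map<String, String> current = snapshot(monitored);
        current.forEach((file, hash) -> {
            String old = baseline.get(file);
            if (old == null) System.out.println("NEW FILE: " + file);
            else if (!old.equals(hash)) System.out.println("MODIFIED: " + file);
        });
        baseline.keySet().stream().filter(f -> !current.containsKey(f))
                .forEach(f -> System.out.println("DELETED: " + f));
    }
}
```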

Access proxies

  • CASB (Cloud Access Security Broker) – a proxy (usually) that organizations go through when connecting to cloud services that allow them to enforce security policies and detect anomalies, e.g. regarding authentication and authorization, input and retrieval of sensitive data. CASBs may involve additional encryption options for the data being used.
  • CSG (Cloud Security Gateway) – effectively the same as CASB
  • SWG (Secure Web Gateway) – a proxy for accessing the web, includes filtering malicious websites, filtering potentially malicious downloads, limiting uploads
  • SASE (Secure Access Service Edge) – like CASB/CSG, but also providing additional bundled functionalities like a Firewall, SWG, VPN, DNS management, etc.

Firewalls

  • WAF (Web Application Firewall) – a firewall (working as a reverse proxy) that you put in front of web applications to protect them from typical web vulnerabilities that may not be addressed by the application developer – SQL injections, XSS, CSRF, etc.
  • NF (Network Firewall) – the typical firewall that allows you to allow or block traffic based on protocol, port, source/destination
  • NGFW (Next Generation Firewall) – a firewall that combines a network firewall and a (web) application firewall and provides analysis of the traffic, thus detecting potential anomalies/intrusions/data exfiltration

Data protection

  • DLP (Data Leak Prevention / Data Loss Prevention) – that’s a broad category of tools that aim at preventing data loss – mostly accidental, but sometimes malicious as well. Sometimes it involves installing an agent on each machine, in other cases it’s proxy-based. Many other solutions provide DLP functionality, like IPS/IDS, WAFs, CASBs, but DLPs are focused on inspecting user activities (including via UEBA/SUBA), network traffic (including via SWGs), communication (most often email) and publicly facing storage (e.g. FTP, S3) that may lead to leaking data. DLPs include discovering sensitive data in structured (databases) and unstructured (office documents) form. Other DLP features are encryption of data at rest and tokenization of sensitive data.
  • ILDP (Information Leak Detection and Prevention) – same as DLP
  • IPC (Information Protection and Control) – same as DLP
  • EPS (Extrusion Prevention System) – same as DLP, focused on monitoring outbound traffic for exfiltration attempts
  • CMF (Content Monitoring and Filtering) – part of DLP. May overlap with SWG functionalities.
  • CIP (Critical Information Protection) – part of DLP, focused on critical information, e.g. through encryption and tokenization
  • CDP (Continuous Data Protection) – basically incremental/real-time backup management, with retention settings and possibly encryption

Vulnerability testing

  • RASP (Runtime Application Self-protection) – tools (usually in the form of libraries that are included in the application runtime) that monitor in real-time the application usage and can block certain actions (at binary level) or even shut down the application if a cyber attack is detected.
  • IAST (Interactive Application Security Testing) – similar to RASP, the subtle difference being that IAST is usually used in pre-production environments while RASP is used in production
  • SAST (Static Application Security Testing) – tools that scan application source code for vulnerabilities
  • DAST (Dynamic Application Security Testing) – tools that scan web applications for vulnerabilities through their exposed HTTP endpoints
  • VA (Vulnerability assessment) – a process helped by many tools (including those above, and more) for finding, assessing and eliminating vulnerabilities

Identity and access

  • IAM (Identity and Access Management) – products that allow organizations to centralize authentication and enrollment of their users, providing single-sign-on capabilities, centralized monitoring of authentication activity, applying access policies (e.g. working hours), enforcing 2FA, etc.
  • SSO (Single Sign-On) – the ability to use the same credentials for logging into multiple (preferably all) applications in an organization.
  • WAM (Web Access Management) – the “older” version of IAM, lacking flexibility and some features like centralized user enrollment/provisioning
  • PAM (Privileged access management) – managing credentials of privileged users (e.g. system administrators). Instead of having admin credentials stored in local password managers (or worse – sticky notes or files on the desktop), credentials are stored in a centralized, protected vault and “released” for use only after a certain approval process for executing a given admin task, in some cases monitoring and logging the executed activities. The PAM handles regular password changes. It basically acts as a proxy (though not necessarily in the network sense) between a privileged user and a system that requires elevated privileges.

Endpoint protection

  • AV (Anti-Virus) – the good old antivirus software that gets malicious software signatures from a centrally managed blacklist and blocks programs that match those signatures
  • NGAV (Next Generation Anti-Virus) – going beyond signature matching, NGAV looks for suspicious activities (e.g. filesystem, memory, registry access/modification) and uses policies and rules to block such activity even from previously unknown and not yet blacklisted programs. Machine learning is usually said to be employed, but in many cases that’s mostly marketing.
  • EPP (Endpoint Protection Platform) – includes NGAV as well as a management layer that allows centrally provisioning and managing policies, reporting and workflows for remediation
  • EDR (Endpoint Detection and Response) – using an agent to collect endpoint (device) data, centralize it, combine it with network logs and analyze that in order to detect malicious activity. After suspected malicious activity is detected, allow centralized response, including blocking/shutting down/etc. Compared to NGAV, EDR makes use of the data across the organization, while NGAV usually focuses on individual machines, but that’s not universally true
  • ATP (Advanced threat protection) – same as EDR
  • ATD (Advanced threat detection) – same as above, with just monitoring and analytics capabilities

Coordination and automation

  • UTM (Unified Threat Management) – combining multiple monitoring and prevention tools in one suite (antivirus/NGAV/EDR), DLP, Firewalls, VPNs, etc. The benefit being that you purchase one thing rather than finding your way through the jungle described above. At least that’s on paper; in reality you still get different modules, sometimes not even properly integrated with each other.
  • SOAR (Security Orchestration, Automation and Response) – tools for centralizing security alerts and configuring automated actions in response. Alert fatigue is a real thing with many false positives generated by tools like SIEMs/DLPs/EDRs. Reducing those false alarms is often harder than just scripting the way they are handled. SOAR provides that – it ingests alerts and allows you to use pre-built or custom response “cookbooks” that include checking data (e.g. whether an IP is in some blacklist, are there attachments of certain content type in a flagged email, whether an employee is on holiday, etc.), creating tickets and alerting via multiple channels (email/sms/other type of push)
  • TIP (Threat Intelligence Platform) – threat intelligence is often part of other solutions like SIEMs, EDRs and DLPs and involves collecting information (intelligence) about certain resources like IP addresses, domain names, certificates. When these items are discovered in the collected logs, the TIP can enrich the event with what it knows about the given item and even act in order to block a request, if a threat threshold is reached. In short – scanning public and private databases to find information about malicious actors and their assets.

Email

  • SEG (Secure email gateway) – a proxy for all incoming and outgoing email that scans them for malicious attachments, potential phishing and in some cases data exfiltration attempts.
  • MFT (Managed File Transfer) – a tool that allows sharing files securely with someone by replacing attachments. Shared files can be tracked, monitored, audited and scanned for vulnerabilities, and access can be cut once the file has been downloaded by the recipient, reducing the risk of data leaks.

DDoS

  • DDoS mitigation/protection – services that hide your actual IP in an attempt to block malicious DDoS traffic before it reaches your network (at which point it would be too late). They usually rely on large global networks and data centers (called “scrubbing centers”) to send clean traffic to your servers.

Compliance

  • GRC (Governance, Risk and Compliance) – a management tool for handling all the policies, audits, risk assessments, workflows and reports regarding different aspects of compliance, including security compliance
  • IRM (Integrated Risk Management) – allegedly philosophically different, more modern and advanced; in reality – the same as GRC with some additional monitoring features

So let’s summarize the ways that all of these solutions work:

  • Monitoring logs and other events
  • Inspecting incoming traffic and finding malicious activities
  • Inspecting outgoing traffic and applying policies
  • Application vulnerability detection
  • Automating certain aspects of the alerting, investigation and response handling

Monitoring (which is central to most tools) is usually done via proxies, port mirroring, network taps or host-based interface listeners, each having its pros and cons. Enforcement is almost always done via proxies. Bypassing these proxies should not be possible, but for cloud services you can’t really block access if the service is accessed outside your corporate environment (unless the SaaS provider has an IP whitelist feature).

In most cases, even though machine learning/AI is advertised as “the new thing”, tools make decisions based on configured policies (rules). Organizations are drowned in complex policies that they have to keep up to date and synchronize across tools. Policy management, especially given there’s no industry standard for how policies should be defined, is a huge burden. In theory it gives flexibility and should be there; in practice it may lead to a messy and hard-to-manage environment.

Monitoring is universally seen as the way to receive actionable intelligence from systems. This is much messier in reality than in demos and often leads to systems being left unmonitored and alerts being ignored. Alert fatigue, which follows from the complexity of policy management, is a big problem in information security. SOAR is a way to remedy that, but it sounds like a band-aid on a broken process rather than a true solution – false alarms should be reduced rather than closed quasi-automatically. If handling an alert is automatable, then the tool that generates it should be able to know it’s not a real problem.

The complexity of the security landscape is obviously huge – product categories are defined based on multiple criteria – what problem they solve, how they solve it, or to what extent they solve it. Is a SIEM also a DLP if it uses UEBA to block certain traffic (next-gen SIEMs may be able to invoke blocking actions, even if they require another system to carry them out)? Is a DLP a CASB if it encrypts data that’s stored in cloud services? Should you have an EPP and a SIEM, if the EPP gives you a good enough overview of the events being logged in your infrastructure? Is a CASB a WAF for SaaS? Is a SIEM a DAM if it supports native database audit logs? You can’t answer these questions at a category level; you have to look at particular products and how well they implement a certain feature.

Can you have a unified proxy (THE proxy) that monitors everything incoming and outgoing and collects that data, acting as WAF, DLP, SIEM, CASB and SEG? Can you have just one agent that is both an EDR and a DLP? Well, certainly categories like SASE and UTM go in that direction, trying to ease the decision-making process.

I think it’s most important to start from the attack targets, rather than from the means to get there or the means to prevent getting there. Unfortunately, enterprise security is often driven by “I need to have this type of product”. This leads to semi-abandoned and partially configured tools for which organizations pay millions, because there are never enough people to go into the intricate details of yet another security solution, and organizations rely on consultants to set things up.

I don’t have solutions to the problems stated above, but I hope I’ve given a good overview of the landscape. And I think we should focus less on “security products” and more on “security techniques” and on people that can implement them. You don’t need a billion-dollar corporation to sell you a silver bullet (which you can’t fire). You need trained experts. That’s hard – there aren’t enough of them, and the security team is often undervalued in the enterprise. Yes, cybersecurity is very important, but I’m not sure whether it will ever get enough visibility and be prioritized over purely business goals. And maybe it shouldn’t, if risk is properly calculated.

All the products above are ways to buy some feeling of security. If used properly and in the right combination, they can give you more than a feeling. But too often a feeling is just good enough.

The post Making Sense of the Information Security Landscape appeared first on Bozho's tech blog.

Encryption Overview [Webinar]

Post Syndicated from Bozho original https://techblog.bozho.net/encryption-overview-webinar/

“Encryption” has turned into a buzzword, especially after privacy standards and regulation vaguely mention it and vendors rush to provide “encryption”. But what does it mean in practice? I did a webinar (hosted by my company, LogSentinel) to explain the various aspects and pitfalls of encryption.

You can register to watch the webinar here, or view it embedded below:

And here are the slides:

Of course, encryption is a huge topic, worth a whole course rather than just a webinar, but I hope I’m providing good starting points. One interesting technique that we employ in our company is “searchable encryption”, which allows you to have encrypted data and still search in it. There are many more very nice (and sometimes niche) applications of encryption and cryptography in general, as Bruce Schneier mentions in his recent interview. These applications can solve very specific problems with information security and privacy that we face today. We only need to make them mainstream, or at least increase awareness.
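One simple way to get equality search over encrypted data – not necessarily how any particular product implements it – is a “blind index”: store an HMAC of the (normalized) plaintext next to the randomly encrypted value and search on the HMAC column. A sketch, with placeholder keys:

```java
import javax.crypto.Cipher;
import javax.crypto.Mac;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.SecureRandom;
import java.util.Base64;

public class BlindIndexSketch {
    // Two independent keys: one for encryption, one for the search index.
    // All-zero arrays are placeholders; in reality keys would come from a KMS/HSM.
    static final byte[] ENC_KEY = new byte[32];
    static final byte[] INDEX_KEY = new byte[32];

    // Deterministic "blind index": equal plaintexts map to equal tokens, enabling
    // exact-match search without storing or revealing the plaintext itself.
    static String blindIndex(String value) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(INDEX_KEY, "HmacSHA256"));
        return Base64.getEncoder().encodeToString(mac.doFinal(value.trim().toLowerCase().getBytes()));
    }

    // Randomized encryption of the actual value (AES-GCM with a fresh IV per record).
    static String encrypt(String value) throws Exception {
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(ENC_KEY, "AES"), new GCMParameterSpec(128, iv));
        byte[] ct = cipher.doFinal(value.getBytes());
        return Base64.getEncoder().encodeToString(iv) + ":" + Base64.getEncoder().encodeToString(ct);
    }

    public static void main(String[] args) throws Exception {
        // Store both columns; to search, compute blindIndex(query) and match on that column.
        System.out.println(blindIndex("john@example.com"));
        System.out.println(encrypt("john@example.com"));
    }
}
```

The trade-off is that equal values produce equal tokens, so equality and frequency still leak – more elaborate searchable encryption schemes address that at the cost of complexity.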

The post Encryption Overview [Webinar] appeared first on Bozho's tech blog.

Seven Legacy Integration Patterns

Post Syndicated from Bozho original https://techblog.bozho.net/seven-legacy-integration-patterns/

If we have to integrate two (or more) systems nowadays, we know – we either use an API or, more rarely, some message queue.

Unfortunately, many systems in the world do not support API integration. And many more are being created as we speak that don’t have APIs. So when you inevitably have to integrate with them, you are left with imperfect choices. Below are seven patterns for integrating with legacy systems (or not-so-legacy systems that are built in legacy ways).

Initially I wanted to title this “bad integration patterns”. But if you don’t have other options, they are not bad – they are inevitable. What’s bad is the fact that so many systems continue to be built without integration in mind.

I’ve seen all of these, on more than one occasion. And I’ve heard many more stories about them. Unfortunately, they are not the exception (fortunately, they are also not the rule, at least not anymore).

  1. Files on FTP – one application uploads files (XML, CSV, other) to an FTP server (or another shared resource) and the other one reads them via a scheduled job, parses them and optionally produces a response – either on the same FTP, or via email. Sharing files like that is certainly not ideal in terms of integration – you don’t get the real-time status of your request, and other aspects are trickier to get right – versioning, high availability, authentication, security, traceability (audit trail).
  2. Shared database – two applications sharing the same database may sound like a recipe for disaster, but it’s not uncommon to see it in the wild. If you are lucky, one application will be read-only. But breaking changes to the database structure and security concerns are major issues. You can only use this type of integration if you expose your database directly to another application, which you normally don’t want to do.
  3. Full daily dump – instead of sharing an active database, some organizations do a full dump of their data every day or week and provide it to the other party for import. Obvious data privacy issues exist with that, as it’s a bad idea to have full dumps of your data flying around (in some cases on DVDs or portable HDDs), in addition to everything mentioned above – versioning, authentication, etc.
  4. Scraping – when an app has no API, it’s still possible to extract data from it or push data to it – via the user interface. With web applications that’s easier, as they “speak” HTML and HTTP. With desktop apps, screen scraping has emerged as an option. The so-called RPA software (Robotic Process Automation) relies on all types of scraping to integrate legacy systems. It’s very fragile and requires complicated (and sometimes expensive) tooling to get right. Not to mention the security aspect, which requires storing credentials in non-hashed form somewhere in order to let the scraper log in.
  5. Email – when the sending or receiving system doesn’t support other forms of integration, email comes as a last resort. If you can trigger something by connecting to a mailbox, or if an email is produced after some event happens in the sending system, this may be all you need to integrate. Obviously, email is a very bad means of integration – it’s unstructured, it can fail for many reasons, and it’s just not meant for software integration. You can attach structured data if you want to get extra inventive, but if you can get both ends to support the same format, you can probably get them extended to support proper APIs.
  6. Adapters – you can develop a custom module that has access to the underlying database but exposes a proper API. That’s an almost acceptable solution, as you can have a properly written (sort of) microservice that is independent of the original application, and other systems won’t know they are integrating with a legacy piece of software (see the sketch after this list). It’s tricky to get right in some cases, however, as you have to understand the state space of the database well. Read-only is easy; writing is much harder or next to impossible.
  7. Paper – no, I’m not making this up. There are cases where one organization prints some data and then another organization (or department) receives the paper documents (by mail or otherwise) and inputs them into their system. Expensive projects exist out there that aim to remove the paper component and introduce actual integration, as paper-based input is error-prone and slow. One of the few legitimate scenarios for a paper-based step is when you need extra security – the paper trail, combined with the fact that paper is effectively air-gapped, may give you that. But even then it shouldn’t be the only transport layer.
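As a sketch of the adapter approach from point 6: a tiny read-only HTTP endpoint over the legacy database, leaving the legacy application untouched. The JDBC URL, credentials, table and column names are hypothetical, and a JDBC driver is assumed to be on the classpath.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class LegacyOrdersAdapter {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        // Expose GET /orders/{id} on top of the legacy schema, read-only
        server.createContext("/orders/", exchange -> {
            String id = exchange.getRequestURI().getPath().substring("/orders/".length());
            try (Connection conn = DriverManager.getConnection(
                        "jdbc:postgresql://legacy-db/erp", "readonly_user", "secret");
                 PreparedStatement ps = conn.prepareStatement(
                        "SELECT status FROM orders WHERE order_id = ?")) {
                ps.setString(1, id);
                ResultSet rs = ps.executeQuery();
                boolean found = rs.next();
                byte[] body = (found
                        ? "{\"id\":\"" + id + "\",\"status\":\"" + rs.getString("status") + "\"}"
                        : "{\"error\":\"not found\"}").getBytes();
                exchange.getResponseHeaders().set("Content-Type", "application/json");
                exchange.sendResponseHeaders(found ? 200 : 404, body.length);
                exchange.getResponseBody().write(body);
            } catch (SQLException e) {
                exchange.sendResponseHeaders(500, -1); // hide DB details from the caller
            } finally {
                exchange.close();
            }
        });
        server.start();
    }
}
```

Even a thin layer like this gives you a place to add authentication, versioning and rate limiting – things the legacy system itself will never provide.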

If you need to do any of the above, it’s usually because at least one of the system is stuck and can’t be upgraded. It’s either too legacy to touch, or the vendor is gone, or adding an API is “not on their roadmap” and would be too expensive.

If you are building a system, always provide an API. Some other system will have to integrate with it, sooner or later. It’s not sustainable to build closed systems and delay the integration question until it’s needed. Assume it’s always needed.

Fancy ESBs may be able to patch things quickly with one of the approaches above and integrate the “unintegratable”, but heavy reliance on an ESB is an indicator of too many legacy or low-quality systems.

But simply having an API doesn’t cut it either. If you don’t support versioning and backward-compatible APIs, you’ll be in an even more fragile state, as you’ll be breaking existing integrations as you progress.

Enterprise integration is tricky. But, as with many things in software, it’s best handled in the applications that we build. If we build them right, things are much easier. Otherwise, organizations have to revert to the legacy approaches mentioned above and introduce complexity, fragility, security and privacy risks and a general feeling of low-quality that has to be supported by increasingly unhappy people.

The post Seven Legacy Integration Patterns appeared first on Bozho's tech blog.