Tag Archives: Opinions

Is Ransomware Protection Working?

Post Syndicated from Bozho original https://techblog.bozho.net/is-ransomware-protection-working/

Recently an organization I’m familiar with suffered a ransomware attack. A LockBit ransomware infected a workstation and spread to a few servers. The attack was contained thanks to a quick incident response, but one thing stood out: the organization had a top-notch EDR installed – one in Gartner’s Leaders quadrant for endpoint protection platforms. And it didn’t stop the attack. I won’t mention names, because the point here is not to attack a particular brand, but the fact that the attackers created a different build of the LockBit codebase meant that signature-based detection didn’t work, and so the files on the endpoints were encrypted. Not only that – the attackers downloaded port scanners and executed password spraying on the network in order to pivot to the servers, without much trouble from the EDR.

Ransomware is the most widespread type of attack, we have a security industry that has been trying to cope with these attacks for a long time, and in 2024 it’s sufficient to just rebuild the ransomware from source in order to not get caught. That’s a problem. To add insult to injury, the malware executable was called lockbit.exe_.

I think we need to think of better ways to tackle ransomware. I’m not an antivirus/EDR in-depth expert, but I have been wondering: why doesn’t the operating system have built-in anti-ransomware functionality? The malware can stop endpoint agents, but it can’t stop the OS (well, it can stop OS services or modify OS DLLs, I agree). I was thinking of native NTFS listeners to detect ransomware behavior – very fast enumeration and modification of files. Then I read an article titled “Microsoft Can Fix Ransomware Tomorrow” which echoed my thoughts. It’s not as simple as the title implies, and the article goes on to argue that it’s hard, but something has to be done.
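
To make the idea more concrete, here is a rough sketch of the detection logic I have in mind – alerting on an abnormal burst of file modifications. A real implementation would live in a kernel-mode filesystem filter driver, not in user space; this toy version uses Python’s watchdog package just to illustrate the behavior, and the thresholds are invented:

```python
# Toy illustration of the detection idea: alert on an abnormal burst of
# file modifications. Uses the user-space "watchdog" package
# (pip install watchdog); thresholds are invented for illustration.
import time
from collections import deque

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

WINDOW_SECONDS = 5          # sliding window size (made-up value)
MAX_EVENTS_IN_WINDOW = 200  # more modifications than this => alert

class BurstDetector(FileSystemEventHandler):
    def __init__(self):
        self.events = deque()

    def on_modified(self, event):
        if event.is_directory:
            return
        now = time.monotonic()
        self.events.append(now)
        # drop events that fell out of the sliding window
        while self.events and now - self.events[0] > WINDOW_SECONDS:
            self.events.popleft()
        if len(self.events) > MAX_EVENTS_IN_WINDOW:
            print("ALERT: mass file modification - possible ransomware")
            # a real component would throttle or suspend the process here

observer = Observer()
observer.schedule(BurstDetector(), path="/home", recursive=True)
observer.start()
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()
```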

This may seriously impact the EDR market – if the top threat – ransomware – is automatically contained by built-in functionality, why purchase an EDR? Obviously, there are many other reasons to do so, but every budget cut will start from the EDR. Ransomware has been a successful driver for more investment in security. To avoid putting too much functionality in the OS, EDRs can tap into the detection capability of the OS to take further actions.

But ransomware works in a very simple way. Detecting mass-encryption of files is easy. Rate-limiting it is reasonable. You may already be aware that CPUs have AES-specific instructions (AES-NI) that improve the performance of AES encryption – the fact that these instructions are being used may be a further indicator of an ongoing ransomware attack.
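
A complementary, well-known heuristic is that encrypted data is statistically close to random, so a burst of newly written files with near-maximal Shannon entropy is suspicious. A minimal sketch (the threshold is illustrative, and legitimate compressed files will also trip it, so it’s a signal, not a verdict):

```python
# Entropy heuristic: ciphertext is close to random, ~8 bits of entropy
# per byte. The 7.9 threshold is illustrative; compressed media will
# also score high, so this is one signal among several, not a verdict.
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def looks_encrypted(path: str, threshold: float = 7.9) -> bool:
    with open(path, "rb") as f:
        sample = f.read(64 * 1024)  # sampling the head suffices for a heuristic
    return shannon_entropy(sample) > threshold
```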

I think we have to tap into the lower levels of our software stack – file system, OS, CPU – in order to tackle ransomware. It’s clear that omnipotent endpoint agents don’t exist, no matter how much “AI magic” you sprinkle on top of them in the data sheet (I remember once asking another EDR vendor at a conference how exactly their AI works. My impression from the answer was: it’s rules).

As I said, I’m no AV/EDR expert, and I expect comments like “But we are already doing that”. I’m sure APIs like this one are utilized by AVs/EDRs, but they may be too slow or too heavy to rely on. If so, these APIs can be optimized for the ransomware-detection use case. I don’t have a ready answer (otherwise I’d be an EDR vendor), but I’d welcome a serious discussion on that. We can’t be in a situation where purchasing an expensive security tool doesn’t reliably stop the most prominent threat – “off-the-shelf” ransomware.

The post Is Ransomware Protection Working? appeared first on Bozho's tech blog.

The xz Backdoor Should Not Happen Again

Post Syndicated from Bozho original https://techblog.bozho.net/the-xz-backdoor-should-not-happen-again/

A few days ago a significant supply chain attack attempt was accidentally revealed – the xz utility was compromised, likely by a nation state, in order to plant a backdoor in sshd that would give the attackers remote access to affected servers.

The xz library is a building block of many other packages and is basically ubiquitous. A famous XKCD strip describes the situation graphically:

[XKCD #2347: “Dependency”]

This means that if it hadn’t been accidentally discovered due to worsened performance, we would eventually have had a carefully planted backdoor on practically every Linux server out there. This is a major issue, and even though open source security is better than closed source security (if only because it allows backdoors to be discovered by anyone), we need to address such nation-state attempts at planting backdoors.

I propose two complementary measures:

  1. Public funding for open source – the EU and the US need to create a structured, not overly bureaucratic process to fund the maintenance of core open source projects (like xz). Germany has done a good job in setting up its Sovereign Tech Fund, but we need broader instruments that make sure there is no open source abandonware on which many other projects depend. Currently large corporations often fund the development of open source, but xz is an example that the little building blocks may fall through the cracks. Open source funding can also be directed at systematic security analysis of open source projects (like the one in point 2, but not limited to security services).
  2. Analyzing high-risk projects – security services and other public and private organizations need to first pinpoint high-risk projects (ones that, if compromised, cause a huge risk that trickles down to the whole ecosystem), rank projects based on risk, and then analyze not just source code, but also maintenance activities, maintainer recruitment and churn, commit patterns and so on (a sketch of such metadata analysis follows below). In hindsight, the xz backdoor could have been caught by monitoring such metadata and the suspicious activities of the “hacker”. We, of course, need (open source) tools to do these analyses, but also highly skilled people in the security services of larger countries.
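
Here is that sketch – checking whether a relatively new committer has suddenly come to dominate a project, a pattern similar to the xz case. The window sizes and thresholds are invented for illustration:

```python
# Rough sketch: flag projects where a relatively new committer suddenly
# dominates recent activity. Windows and thresholds are invented.
import subprocess
from collections import Counter

def committers(repo: str, since: str) -> Counter:
    out = subprocess.run(
        ["git", "-C", repo, "log", f"--since={since}", "--format=%ae"],
        capture_output=True, text=True, check=True,
    ).stdout
    return Counter(line for line in out.splitlines() if line)

def takeover_risk(repo: str) -> bool:
    recent = committers(repo, "6 months ago")
    all_time = committers(repo, "20 years ago")
    if not recent:
        return False
    top, top_count = recent.most_common(1)[0]
    dominates = top_count / sum(recent.values()) > 0.7
    newcomer = all_time[top] - recent[top] < 5  # few commits before the window
    return dominates and newcomer
```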

Overall, we can and should learn lessons and take measures based on this incident. Because the next one might not cause noticeable performance degradation and might make it into actual production, which would be devastating.

The post The xz Backdoor Should Not Happen Again appeared first on Bozho's tech blog.

Why Facebook’s Lack of Customer Support Is a Problem

Post Syndicated from Bozho original https://techblog.bozho.net/why-facebooks-lack-of-customer-support-is-a-problem/

Facebook is arguably the biggest social network. The network effect makes it hard for people to leave Facebook, and so many businesses, celebrities, institutions and politicians rely on it for reaching their customers/fans/citizens/voters.

Yet, at least in my part of the world, Facebook’s customer support is practically non-existent. Because I’m a member of parliament and a former minister who handled disinformation and relations with Meta, many people turn to me with their Facebook woes. And they are almost never resolved.

A few examples: a deep fake of the Bulgarian prime minister circulated on Facebook for several days after two institutions submitted official take-down notices. Profiles of fellow members of parliament were blocked/hacked; none of their support requests succeeded and their profiles remained blocked for months. A fellow member of parliament with a paid subscription could not change his cover photo during an election campaign for mayor, and Facebook’s support stopped answering. Facebook bulk-deleted our candidate pages after one election campaign (after it had been taking ad money), and its support did not respond adequately (the pages remained deleted). One colleague’s ad account was hacked and a malicious actor used his credit card to promote ads; he was unable to remove the intruder, and since Facebook’s support didn’t manage to do it either, my colleague had to remove the credit card. When I became a minister, my request for a blue checkmark was initially rejected and the official support channel didn’t answer. And in all of those cases support was requested in English, so it’s not about language-specific limitations.

I’m sure anyone using Facebook for business has similar experiences. In a nutshell, support is useless, even if you are a paying customer or advertiser. And clearly there is no market pressure to change that.

The European Union recently introduced the Digital Services Act, which at least pushes forward a long-time proposal of mine for appeals and independent arbitration over decisions that block access. I don’t know if that’s working already, but at least it’s a step.

So why is that a problem? Facebook argues it is not a ‘natural monopoly’, and I’ll agree with that to an extent – it faces competition from different types of social networks. But its scale and the network effect mean it is not just a regular market player – it is (as the Digital Services Act puts it) a very large online platform that has gained broad influence and therefore needs to bear extra responsibility. The ability of an entity with 4 million users in a country of 7 million to arbitrarily ban members of parliament or candidates for mayor, or to choose (through inefficiency) to leave a deep fake of a prime minister up for days, is a systemic risk. It’s a systemic risk to leave businesses reliant on the whims and inefficiencies of nearly non-existent customer support.

If a company can’t get customer support sorted, market forces usually push it out of the market. But because of the network effect (and Facebook’s policy of acquiring some potential competitors), this hasn’t been the case. And if one of the most highly valued companies on earth can’t run a decent support process, regulators should step up and set standards.

The post Why Facebook’s Lack of Customer Support Is a Problem appeared first on Bozho's tech blog.

Another Attack Vector For SMS Interception

Post Syndicated from Bozho original https://techblog.bozho.net/another-attack-vector-for-sms-interception/

SMS codes for 2FA have been discussed for a long time, and everyone knowledgeable in security knows they are not secure. What’s more – you should remove your phone number from sensitive services like Gmail, because if an attacker can fall back to SMS, the account is compromised.

Many authors have discussed how insecure SMS is, including Brian Krebs, citing an example of NetNumber ID abuse, in addition to SIM swap attacks and SS7 vulnerabilities.

Another aspect that I recently thought about is again related to intermediaries. Bulk SMS resellers integrate with various telecoms around the globe and accept outgoing SMS via API calls, which they then forward to a telecom in the relevant country. Global companies that send a lot of SMS look for cheap bulk deals (you are probably aware that Twitter/X recently decided to charge for SMS 2FA, because it was incurring high costs). These services are sometimes called A2P (application-to-person).

This means that intermediaries receive the 2FA code before it reaches the subscriber. A malicious insider, or an attacker who compromises those intermediaries, can thus have access to 2FA codes before they reach the subscriber.
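
For illustration, this is roughly what a typical A2P API call looks like – the endpoint and field names here are hypothetical, but the pattern is general: the one-time code travels in plaintext through the reseller’s systems before it ever reaches a telecom.

```python
# Hypothetical A2P call: endpoint and field names are made up, but the
# pattern is general - the OTP is plaintext inside the reseller's API.
import secrets
import requests

otp = f"{secrets.randbelow(10**6):06d}"

requests.post(
    "https://api.bulk-sms-reseller.example/v1/messages",  # hypothetical
    auth=("account_id", "api_token"),
    json={
        "to": "+359881234567",
        "from": "MyService",
        "body": f"Your verification code is {otp}",  # visible to the intermediary
    },
    timeout=10,
)
```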

I don’t know if this attack vector has been used, but it is a valid attack – if an attacker knows the intermediaries that a given service is using, they can either try to compromise the systems of the intermediaries, or gain access through a compromised insider. In either scenario, the victim’s 2FA code will be accessible to the attacker in real time.

This just reinforces the rule of thumb – don’t rely on SMS for two-factor authentication.

The post Another Attack Vector For SMS Interception appeared first on Bozho's tech blog.

eIDAS 2.0, QWACs And The Security Of The Web

Post Syndicated from Bozho original https://techblog.bozho.net/eidas-2-0-qwacs-and-the-security-of-the-web/

Tension has been high in the past months over a proposed change to the European eIDAS regulation, which defines trust services, digital identity, and the so-called QWACs – qualified website authentication certificates. The proposal aims at making sure that EU-issued certificates are recognized by browsers. There’s a summary by Scott Helme, a discussion with Troy Hunt, and another good post by Eric Rescorla, former Firefox CTO, so I’ll skip the intro.

Objections

Early in the process, Mozilla issued a position paper that raises some issues with the proposal. One of them is that what the EU suggests is basically an Extended Validation (EV) certificate – something that we had in the past (remember the big green address bars?) and which was abandoned some time ago for a reason: multiple studies found that EV certificates do not bring any benefits. The EU says “QWACs (EVs) give the user more trust because they know which legal entity is behind a given website”. And the expert community says “well, in which scenario is that useful, and what about faking it – opening an entity with the same name in a different jurisdiction?”.

Later in the process, an additional limitation was added for browser vendors – that they cannot mandate security requirements beyond those specified by the EU standards body, ETSI. This is, to me, counterintuitive policy-wise, because in general you set minimum requirements in regulations, not maximum ones. Of course, this prevents browser vendors from having arbitrary requirements. Not that they’ve had such requirements per se, but, for example, Mozilla’s CA inclusion page says “Mozilla is under no obligation to explain the reasoning behind any inclusion decision.” For me, this is not an acceptable process for something this important.

Mozilla (and various experts) also note, correctly, that if a CA gets compromised, this affects the entire world – the traffic to any website can be sniffed through man-in-the-middle attacks. And this has happened before. The Electronic Frontier Foundation, a respected digital rights organization, also objected to the approach.

Then Mozilla launched a campaign website against the amendment, which has the wrong tone and contains gross oversimplifications and factually incorrect statements (for example, it’s not true that QTSPs are not independently vetted). Then the European Signature Dialog (basically an association of EU CAs, called QTSPs – qualified trust service providers) responded in a similarly inappropriate way, saying “Mozilla is generally perceived as a Google satellite, paving the way for Google to push through its own commercial interests” (which is false, but let’s not go into that).

The statement that QWACs are better against phishing is arguably not true, even if you consult the paper that the ESD linked. It says: “Our analysis shows that it is generally impossible to differentiate between benign sites and phishing sites based on the content of their certificates alone. However, we present empirical evidence that current phishing websites for popular targets do typically not replicate the issuer and subject information”. So the fact that phishing sites don’t bother using EVs is somehow a reason that EVs (QWACs) help against phishing? I’m disappointed by this ESD piece – they know better. Mozilla also knows better, as its negative campaign website introduces a tone that’s not constructive.

Insufficient assessment

What becomes apparent from the impact assessment study, another study, and the subsequent impact assessment is that there have been efforts to get browser vendors to agree to include EU-issued certificates without the CAs having to go through the browsers’ root program process, which the vendors have refused.

I think this is not a good impact assessment. It does not assess impact. It doesn’t try to find out what will happen once this is passed, nor does it try to compare root programs with current ETSI standards to find the gaps. Neither the initial study nor the impact assessment reviews the security aspects of the change.

For example, due to the current usage patterns of QWACs for internal API-based communication between EU institutions, QWACs have sometimes been issued for private addresses (e.g. 192.168.xx.xx). Once they become automatically approved by the browsers, a security risk arises – what if I have a trusted certificate for your local router’s IP?

Also, it’s not a good process to include additional limitations in the trilogue, which is an informal process between the EU Parliament, Commission and Council. I, as a Bulgarian member of parliament, requested from our government the drafts from the trilogue, and I was not granted access (due to EU rules). This is unacceptable for a legislative process, which should be fully transparent.

I have criticized this process and insufficient impact assessments before – for the copyright directive’s introduction of a requirement for automated content takedown, and for the introduction of mandatory fingerprints in ID cards. There just doesn’t seem to be enough technical justification for regulations that have a very significant technical impact – not just in the EU, but in the world (as browsers have a global trusted CA list, not an EU one).

Technical or political debate?

The debate, as it seems, has conflated two separate issues – the technical one and the political one. What Mozilla (and, I presume, other browser vendors) are implying is that they are responsible for the security of their browsers and should be able to enforce security rules, while the EU is saying: private US entities (for-profit or non-profit) cannot have full control over who gets trusted and who doesn’t. Both are valid arguments. The EU seems to be pursuing a digital sovereignty agenda here, which, strategically, is a good idea.

But the question is whether that’s the best approach, and if not – how to improve it.

Some data and an anecdote to further illustrate the status quo. Certinomis, a French QTSP (CA), was distrusted by Mozilla a while ago. It is, however, on the EU trusted list. With the changes, Mozilla and others would have to re-trust it. The concerns raised by Mozilla seem legitimate, and so the fact that EU conformity assessment bodies do regular audits may not be sufficient for the purposes of web security (though this assumption also needs to be critically assessed).

Currently, there are 146 root/subordinate CA certificates listed for QWAC QTSPs. Of those 146, 115 are included in one or more root trust stores, and between 57 and 80 are missing from one or more. It’s far from an ideal picture, but these numbers should have been in the impact assessment and the problem statement in the first place, so that the legislators could identify the reasons for not including a CA in one or more root programs. Is it a technical shortcoming of the CA, or is it vendor discretion? Certainly, there doesn’t seem to be a blanket ban on EU CAs/QTSPs by browser vendors.

So, essentially, the political result here is that EU CAs will get a fast track into trust stores. I’m sure other jurisdictions will try to pass similar legislation, which will complicate the scene even further. Some of them will not be as democratic as the EU. And if a browser vendor thinks some CA is not trustworthy by their standards, they may come up with very clever workarounds of the regulation.

The first one – ignoring it, as there are no fines. But they can also introduce different padlock colors for different cases – e.g. a QWAC from a browser-approved CA gets a green padlock, and a QWAC that has not passed the browser’s root program gets a yellow padlock. Compared to the current grey one, the yellow color may be perceived as less trustworthy. And then we’ll have to argue whether yellow is a clear enough indication and whether it conveys trust or not so much. Time and time again I have stated that you can’t really regulate exact features and UI.

A technical solution to the political question?

I’ve heard many times that technical solutions to political problems are wrong. And I’ve seen many cases where, if you delve into the right level of detail, there is a solution that is both technically good and serves the political goal. In this case it is so-called Certificate Transparency. In a nutshell, each certificate, before being issued, is placed in a publicly verifiable data structure (a Merkle tree, as used in blockchain implementations), and gets a signed certificate timestamp (SCT) as a response, which is then included as an X.509 attribute. This mitigates the risk of compromised CAs issuing certificates for websites that do not belong to the party requesting the certificate. In no time (up to 24 hours) they will be caught and distrusted, which raises the bar significantly.
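
To make the Merkle tree idea concrete, here’s a toy version: each certificate becomes a leaf, and the log publishes a tree head that changes if any past entry is altered. (Real CT logs follow RFC 6962, which among other things hashes leaves and interior nodes with distinct prefixes; this sketch omits such details.)

```python
# Toy Merkle tree: the root changes if any past entry is altered, which
# is what makes a CT log tamper-evident. RFC 6962 details are omitted.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:                  # duplicate the last node if odd
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

certs = [b"cert: example.com", b"cert: bozho.net", b"cert: europa.eu"]
print(merkle_root(certs).hex())
```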

Unfortunately, ETSI hasn’t included Certificate Transparency in the current QWAC standard. The ESD association mentioned above says in another document that “The Browsers can easily bring any additional rules they want to impose on QTSPs such as Certificate Transparency to ETSI and other international standards bodies to be adopted through an open process of consensus by the internet community”, which I think is the wrong approach – Certificate Transparency is an IETF RFC and a de-facto standard. What would surely help is moving it (there are actually two RFCs) out of “Experimental” status in the IETF, but we can’t require every standard to be mirrored by ETSI.

I don’t know why CT has not been referenced by ETSI so far. It’s true that certificate log servers are based mostly in the US (and one in China), but nothing stops an organization from running a CT log. It “just” takes some infrastructure and bandwidth to support the load, but I think that’s a price EU QTSPs can pay, e.g. by sharing the costs for a couple of CT logs.

As a sidenote, I think it’s worth noting that the EU can also do more towards the adoption of DANE – a standard which gets rid of CAs, as the public key is stored in a DNS record. It relies on DNSSEC, which doesn’t have huge adoption yet, and both are trickier to implement than they sound, but if we want to be independent of browser decisions on which CA to trust, we can remove CAs from the trust equation. I’m fully aware it’s far from simple, and we’ll have to support PKI/CAs for a long, long time, but it’s a valid policy direction – mandate DNSSEC and DANE support.
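
For the curious, here’s what DANE boils down to in the common “DANE-EE / SPKI / SHA-256” case (usage 3, selector 1, matching type 1): the DNS record is just a hash of the server’s public key, published under the service’s name and protected by DNSSEC. A sketch using Python’s cryptography package (the file name is a placeholder):

```python
# DANE sketch for the common "3 1 1" case: hash the server's public key
# (SubjectPublicKeyInfo) and publish it as a TLSA record under DNSSEC.
import hashlib
from cryptography import x509
from cryptography.hazmat.primitives import serialization

with open("server-cert.pem", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read())

spki = cert.public_key().public_bytes(
    serialization.Encoding.DER,
    serialization.PublicFormat.SubjectPublicKeyInfo,
)

# usage 3 (DANE-EE), selector 1 (SPKI), matching type 1 (SHA-256)
print(f"_443._tcp.example.com. IN TLSA 3 1 1 {hashlib.sha256(spki).hexdigest()}")
```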

Conclusion

Certificate Transparency requirements can be added now, to eIDAS Annex IV. We are too late in the legislative process for that to be done smoothly, but I’d appreciate it if the legislative bodies tried. It would be just one additional point – “(k) details about the inclusion of the certificate in a public ledger” (note that the same eIDAS 2.0 regulates ledgers, and a CT log is a ledger, so that can be leveraged).

If not in Annex IV, then I’d strongly suggest including CT in the next version of the relevant ETSI standard. And I think it would be good if the EU Commission or ETSI did a gap analysis of current root programs against the current ETSI standards to see if something important is missing.

Furthermore, the European Commission should initiate a series of objective studies on the effectiveness of extended validation certificates. The studies that I’ve read are not in favour of the EV/QWAC approach, but if we are to argue that EV/QWACs are worth it, we need better justification.

A compromise is possible – one that would make browsers confident that there will be no rogue CAs, while at the same time giving Europe a say over trust on the web.

The post eIDAS 2.0, QWACs And The Security Of The Web appeared first on Bozho's tech blog.

MERDA – A Framework For Countering Disinformation

Post Syndicated from Bozho original https://techblog.bozho.net/merda-a-framework-for-countering-disinformation/

Yesterday, at a conference about disinformation, I jokingly coined the acronym MERDA (Monitor, Educate, React, Disrupt, Adapt) for countering disinformation. Now I’ll put the pretentious label “framework” on it and describe what I mean. While this may not seem a very technical topic, fit for a tech blog, it in fact has a lot of technical aspects, as disinformation today is spread through technical means (social networks, anonymous websites, messengers). The “Disrupt” part in particular is quite technical.

Monitor – in order to tackle disinformation narratives, we need to monitor them. This includes media monitoring tools (including social media) and building reports on rising narratives that may potentially be disinformation campaigns. These tools involve a lot of scraping of online content, and consuming APIs where they exist and are accessible. Notably, Facebook removed much of its API access to content, which makes it harder to monitor for trends. It has to be noted that this doesn’t mean monitoring individuals – it’s about trends, keywords, phrases – sometimes known, sometimes unknown (e.g. the tool can look for very popular tweets, extract the key phrases from them, and then search for those). Governments can list their “named entities” and keep track of narratives/keywords/phrases relating to these named entities (ministers, prime minister, ministries, parties, etc.).
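
A bare-bones illustration of the trend-monitoring idea – counting phrases across scraped posts and surfacing the ones that spike against a baseline. Real systems add language detection, named-entity tracking and source scoring; the numbers below are arbitrary:

```python
# Bare-bones trend monitoring: surface bigrams that spike vs a baseline.
# "baseline" is a Counter built from a previous period.
from collections import Counter

def bigrams(text: str):
    words = text.lower().split()
    return zip(words, words[1:])

def spiking_phrases(todays_posts, baseline: Counter, factor=5.0, min_count=20):
    today = Counter(bg for post in todays_posts for bg in bigrams(post))
    return [
        (" ".join(bg), n)
        for bg, n in today.most_common()
        if n >= min_count and n > factor * (baseline[bg] + 1)
    ]
```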

Educate – media literacy, and social media literacy, is a skill. Knowing that “Your page will be disabled if you don’t click here” is a scam is a skill. Being able to recognize logical fallacies and propaganda techniques is also a skill, and it needs to be taught. Ultimately, the best defense against disinformation is a well-informed and prepared public.

React – public institutions need to know how and when to react to certain narratives. It helps if they know the narratives (through monitoring), but they also need so-called “strategic communications” in order to respond adequately to disinformation about current events – debunking, pre-bunking and giving the official angle (note that I’m not saying the official angle is always right – it sometimes isn’t, which is why it has to be supported by credible evidence).

Disrupt – this is the hard part – how to disrupt disinformation campaigns. How to identify and disable troll farms, which engage in coordinated inauthentic behavior – sharing, liking, commenting, cross-posting in groups – creating an artificial buzz around a topic. Facebook is, I think, quite bad at that – this is why I have proposed local legislation that requires following certain guidelines for identifying troll farms (groups of fake accounts). Then we need a mechanism to take them down which takes into account freedom of speech – i.e. the possibility that someone is not, in fact, a troll, but merely a misled observer. Fortunately, the Digital Services Act provides for out-of-court appeals against moderator decisions.
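
One of the signals for coordinated inauthentic behavior is trivially expressible in code: groups of accounts that repeatedly share the same links within seconds of each other. A toy sketch (window and thresholds are illustrative, and a real system would combine many such signals before any enforcement decision):

```python
# One coordination signal: pairs of accounts that repeatedly share the
# same URL within seconds of each other. Thresholds are illustrative.
from collections import defaultdict
from itertools import combinations

def coordinated_pairs(shares, window=30, min_co_shares=5):
    """shares: iterable of (account_id, url, unix_timestamp) tuples."""
    by_url = defaultdict(list)
    for account, url, ts in shares:
        by_url[url].append((ts, account))
    pair_counts = defaultdict(int)
    for events in by_url.values():
        events.sort()
        for (t1, a1), (t2, a2) in combinations(events, 2):
            if a1 != a2 and t2 - t1 <= window:
                pair_counts[tuple(sorted((a1, a2)))] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_co_shares}
```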

The “disrupt” part is not just about troll farms – it’s about fake websites as well. Tracking linked websites, identifying the flow of narratives through these websites, and trying to find the ultimate owners is a hard and quite technical task. We know that there are thousands of such anonymous websites that repost disinformation narratives in various languages – but taking down a website requires good legal reasons. “I don’t like their articles” is not one.

The “disrupt” part also needs to tackle ad networks – some obscure ad networks are how disinformation websites get financial support. They usually advertise not-so-legal products. Stopping the inflow of money is one way to reduce disinformation.

Adapt – threat actors in the disinformation space (usually nation-states like Russia) are dynamic and change their tactics, techniques and procedures (TTPs). Institutions that are trying to reduce the harm of disinformation also need to be adaptable, and to constantly look for the new ways in which false or misleading information is being pushed through.

Tackling disinformation is walking on thin ice. A wrong step may be seen as curbing free speech. But if we analyze patterns and techniques, rather than content itself, then we are mostly on the safe side – it doesn’t matter what the article says, if it’s shared by 100 fake accounts and the website is supported by ads for illegal drugs that use deep fakes of famous physicians.

And it’s a complicated technical task – I’ve seen companies claiming they identify troll farms, rings of fake news websites, etc. But I haven’t seen any tool that’s good enough. And MERDA … is the situation we are in – active, coordinated exploitation of misleading and incorrect information for political and geopolitical purposes.

The post MERDA – A Framework For Countering Disinformation appeared first on Bozho's tech blog.

Anticorruption Principles For Public Sector Information Systems

Post Syndicated from Bozho original https://techblog.bozho.net/anticorruption-principles-for-public-sector-information-systems/

As a public official, I’ve put a lot of thought into how to make current and upcoming public government information systems resistant to corruption. And I can list several main principles, some of them very technical, which, if followed, would guarantee that the information systems themselves achieve two properties:

  1. they prevent paper-based corruption
  2. they do not generate additional risk for corruption

So here are the principles that each information system should follow:

  • Auditability – the software must allow for proper external audits. This means having the up-to-date source code available, especially for custom-built software. If it’s proprietary, it means “code available” contract clauses. This also means availability of documentation – what components it has, what integrations exist, what network and firewall rules are needed. If you can’t audit a system, it surely generates corruption
  • Traceability – every meaningful action performed by users of the system should be logged. This means a full audit log, not just for the application, but also for the underlying database as well as the servers. If “delete entry” is logged by the application, but DELETE FROM is not logged by the database, we are simply shifting the corruption incentive to more technically skilled people. I’ve seen examples of turned-off DB audit logs, and of systems that (deliberately?) fail to log some important user actions. Corruption is thus built into the system or the configuration of its parts.
  • Tamper-evidence – audit logs and in some cases core data should be tamper-evident. That means that any modification to past data should be detectable upon inspection (including scheduled inspections). One of the strong aspects of blockchain is the Merkle trees and hash chains it uses to guarantee tamper-evidence. A similar cryptographic approach must be applied to public systems, otherwise we are shifting the corruption incentive to those who can alter the audit log (see the hash-chain sketch after this list).
  • Legally sound use of cryptography – Merkle trees are not legally defined, but other cryptographic tools are: trusted timestamps and digital signatures. Any document (or data) that carries legal meaning should be timestamped with a so-called “(qualified) timestamp” according to the EU’s eIDAS regulation. Every document that needs a signature should be signed with an electronic signature (the legal name for the cryptographic term “digital signature”). Private keys should always be stored on HSMs or smartcards to make sure they cannot leak. This prevents corruption, as you can’t really spoof signatures or backdate documents. Backdating in particular is a common theme in corruption schemes, and a trusted cryptographic timestamp prevents it entirely.
  • Identity and access management – traceability is great if you are sure you are “tracing” the right people. If identity and access management isn’t properly handled, impersonation, brute force or leaked credentials can make it easier for malicious internal (or external) actors to do improper things and frame someone else. It’s highly recommended to use 2FA, and possibly hardware tokens. For sysadmins, a privileged access management (PAM) system is a must.
  • Data protection (encryption, backup management) – government data is often sensitive – population registers, healthcare databases, tax and customs databases, etc. They should not leak (captain obvious). Data leak prevention is a whole field, but I’d pinpoint two obvious aspects. The first is live data encryption – if you encrypt data granularly and require decryption on the fly, you can centralize data access and therefore log every access. Otherwise, if the data in the database is in plaintext, there’s always a way to get it out somehow (database activity monitoring (DAM) tools may help, of course). The second aspect is backup management – even if your production data is properly protected, encrypted, DAM’ed, your backups may leak. Therefore backup encryption is also important, and the decryption keys should be kept securely (ideally, wrapped by an HSM). How is data protection related to corruption? Well, these databases are sold on the black market, and “privileged access” to sensitive data may be sold to certain people.
  • Transparency – every piece of data that does not need protection should be public. The more open data and public documents there are, the less likely it is for someone to try to manipulate data. If the published data says something, you can’t go and remove it, hoping nobody will see it.
  • Randomness – some systems rely on randomness for a core feature – assigning cases. This is true for courts and for agencies that do inspections – you should randomly select a judge, and randomly assign someone to do an inspection. If you don’t have proper, audited, secure randomness, this can be abused (and it has been abused many times), e.g. to get the “right” judge in a sensitive case. We are now proposing a proper random case assignment system for the judiciary in my country. It should be made sure that /dev/random is not modified, and a distributed, cryptographically backed random-generation system can be deployed. It sounds like too much complexity just for an RNG, but sometimes it’s very important to rely on non-controlled randomness (even if it’s pseudorandomness).
  • Data validation – data should be subject to maximum validation on entry. Any anomalies should be blocked from even getting into the database, because room for confusion helps corruption. For example, there’s so-called “corruption Cyrillic” – in countries that use the Cyrillic alphabet, malicious users enter identical-looking Latin characters to hide records from searches and reports. Another example – in the healthcare system, reimbursement requests used to be validated post-factum. This creates incentives for corruption, for “under the table” correction of “technical mistakes” and, ultimately, schemes for draining funds. If input data is validated not just as simple form inputs, but with a set of business rules, it’s less likely for deliberately incorrect data to be entered and processed.
  • Automated risk analysis – after data is entered (by civil servants, by external parties, by citizens), in some cases risk analysis should be done. For example, we are now proposing online registration of cars. However, some cars are much more likely to be stolen than others (based on price, ease of unlocking, the skillset of currently operating criminals, etc.), so the registration system should take all known factors into account and require such cars to be presented at the traffic police for further inspection. Similarly for healthcare – anomalous events (e.g. high-price medicines sold in unlikely succession) should be flagged automatically and inspected. That risk analysis should be based on carefully crafted methodologies, put into the system with something like a rules engine (rather than hardcoded, which I’ve also seen).
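
Here is the hash-chain sketch promised above – a minimal tamper-evident audit log, where each entry commits to the previous one, so editing or deleting a past record breaks verification of everything after it. A production system would add qualified timestamps and checkpoints anchored outside the system:

```python
# Minimal hash-chained audit log: each entry commits to the previous
# one, so any edit to past data breaks verification from that point on.
import hashlib
import json

def append_entry(log: list, event: dict) -> None:
    prev = log[-1]["hash"] if log else "0" * 64
    payload = json.dumps(event, sort_keys=True)
    digest = hashlib.sha256((prev + payload).encode()).hexdigest()
    log.append({"event": event, "prev": prev, "hash": digest})

def verify(log: list) -> bool:
    prev = "0" * 64
    for entry in log:
        payload = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256((prev + payload).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True

log = []
append_entry(log, {"user": "clerk1", "action": "DELETE", "record": 42})
assert verify(log)
log[0]["event"]["record"] = 43  # tampering...
assert not verify(log)          # ...is detected
```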

Throughout the years, others and I have managed to put some of these in laws and bylaws in Bulgaria, but there hasn’t been a systematic approach to ensuring that they are all followed, and followed properly. Which is the hard part, of course. Many people know the theory; it’s just not that easy to put into practice in a complex environment. But these principles (and probably others that I’ve missed) need to be the rule, rather than the exception, in public sector information systems if we want to reduce corruption risks.

The post Anticorruption Principles For Public Sector Information Systems appeared first on Bozho's tech blog.

Methodology for Return on Security Investment

Post Syndicated from Bozho original https://techblog.bozho.net/methodology-for-return-on-security-investment/

Measuring return on investment for security (information security/cybersecurity) has always been hard. This is a problem both for cybersecurity vendors and service providers and for CISOs, who find it hard to convince budget stakeholders why they need another pile of money for tool X.

Return on Security Investment (ROSI) has been discussed, including academically, for a while. But we haven’t yet found a sound methodology for it. I’m not proposing one either, but I wanted to mark some points that I think are important for such a methodology. Otherwise, decisions are often taken because “the auditor said we need X” or “the regulation says we need Y”. Those are decent reasons to buy something, but they make security look like a black-hole cost center. It’s certainly no profit center, but the more tangibility we add, the more likely investments are to work.

I think the leading metric is “likelihood of critical incident”. Businesses are (rightly) concerned with this. They don’t care about the number of reconnaissance attempts, false positive ratios, MTTRs and other technical things. This likelihood, if properly calculated, can lead to a sum of money lost due to an incident (lack of availability, data loss, reputational cost, administrative fines, etc.). The problem is we can’t go to company X and say “you are 20% likely to get hit, because that’s the number for SMEs”. A number from a vendor presentation is unlikely to ring true. So I think the following should be factored into the methodology (a toy calculation follows the list):

  • Likelihood of incident per type – ransomware, DDoS, data breaches and insider data manipulation all have different likelihoods.
  • Likelihood of incident per industry – industries vary greatly in terms of hacker incentive. Apart from generic ransomware, other attacks are more likely to be targeted at, for example, the financial industry than the forestry industry. That’s why the EU’s NIS and NIS2 directives prioritize some industries as more critical.
  • Likelihood of incident per organization size or revenue – not all SMEs and not all large enterprises are the same – the number of employees and their qualification may mean increased or decreased risk; company revenue may put it on top of the target list (or at the bottom).
  • Likelihood of incident per team size and skill – if you have one IT guy doing printers and security, you are more likely to get hit by a critical incident than if you have a SOC team. Sounds obvious, but it’s a spectrum, and probably one with diminishing returns, especially for SMEs.
  • Likelihood of incident per available security products – if you have nothing installed, you are more likely to get hit. If you have a simple AV, you can keep the basic attacks out. If you have a firewall, a SIEM/XDR, SOAR, threat intel subscriptions, things are different. Having them, of course, doesn’t mean they are properly deployed, but the types of tools matter in ballpark calculations.
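
And the toy calculation promised above, based on the classic ROSI formula (roughly: the loss you avoid per year, minus what the control costs, divided by what the control costs). Every number is invented for illustration:

```python
# Toy ROSI: (avoided annual loss - cost of the control) / cost.
# All inputs are invented; the point is the shape of the calculation.

def rosi(incident_cost, annual_likelihood, mitigation_ratio, solution_cost):
    ale = incident_cost * annual_likelihood  # annual loss expectancy
    avoided = ale * mitigation_ratio         # loss the control prevents
    return (avoided - solution_cost) / solution_cost

# hypothetical SME: base ransomware likelihood adjusted per the factors above
likelihood = 0.30 * 1.2 * 0.9  # base rate x industry factor x team factor
print(f"ROSI: {rosi(200_000, likelihood, 0.5, 15_000):.0%}")  # => 116%
```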

How do we get that data? I’m sure someone collects it. If nobody does, governments should. Such metrics are important for security decisions and therefore for the overall security of the ecosystem.

The post Methodology for Return on Security Investment appeared first on Bozho's tech blog.

Why I’m Not So Alarmed About AI And Jobs

Post Syndicated from Bozho original https://techblog.bozho.net/why-im-not-so-alarmed-about-ai-and-jobs/

With the advances in large language models (e.g. ChatGPT), referred to as AI, concerns are rising about a sweeping loss of jobs because of the new tools. Some claim jobs will be completely replaced, others claim that jobs will be cut because of a significant increase in efficiency. Labour parties and unions are organizing conferences about the future of jobs, universal basic income, etc.

These concerns are valid and these debates should be held. I’m addressing this post to the more extreme alarmists and not trying to diminish the rapid changes that these technological advances are bringing. We have to think about regulations, ethical AI and safeguards. And the recent advances are pushing us in that direction, which is good.

But in technology we often live by the phrase “when you have a hammer, everything looks like a nail”. The blockchain revolution didn’t happen, and so I think we are a bit more eager than warranted about the recent advances in AI. Let me address three aspects:

First – automation. The claim is that AI will swiftly automate a lot of jobs and many people will fall out of the labour market. The reality is that GPT/LLMs have to be integrated into existing business processes. If regular automation hasn’t already killed those jobs, AI won’t do it so quickly; an organization that doesn’t already use automation for boilerplate tasks won’t automate them with AI overnight. Let me remind you that RPA (robotic process automation) solutions have been advertised as AI. They really do “kill” jobs in the enterprise. They’ve been around for nearly two decades, and we haven’t heard a large alarmed choir about RPA. I’m aware there is a significant difference between LLMs and RPA, but the idea that a piece of technology will swiftly lead to staff reductions across industries is not something I agree with.

Second – efficiency. Especially in software development, where products like Copilot are already production-ready, it seems that the increase in efficiency may lead to staff reductions. But if a piece of software used to take 6 months to build before AI, it will take, say, 3 months with AI. Note that code-writing speed is not the only aspect of software development – other overhead and blockers will continue to exist – requirement clarifications, customer feedback, architecture decisions, operational and scalability issues, etc. – so the increase in efficiency is unlikely to be orders of magnitude. At the same time, there is a shortage of software developers. With the advances of AI, there will be less of a shortage, meaning more software can be built within the same timeframe.

For outsourcing this means that the price per hour or per finished product may increase because of AI (speed is also a factor in pricing). A company will be able to service more customers in a given time. And there’s certainly a lot of demand for digital transformation. For product companies this increase in efficiency will mean faster time-to-market for the product and new features, which will make product companies more competitive. In both cases, AI is unlikely to kill jobs in the near future.

Sure, ChatGPT can write a website. You can create a free website with site builders even today. And this hasn’t killed web developers; it just makes the simplest websites cheaper. By the way, building software once and maintaining it are completely different things. Even if ChatGPT can build a website, maintaining it through prompts is going to be tough.

At the same time, AI will put more intellectual pressure on junior developers, who are typically given the boilerplate work, which is going to be more automatable. But on the other hand, AI will improve the training process for those junior developers. Companies may have to put more effort into training developers, and career paths may have to be adjusted, but it’s unlikely that the demand for software developers will drop.

Third, there is a claim that generative AI will kill jobs in the creative professions. Ever since I wrote an algorithmic music generator, I’ve been saying that it will not. Sure, composers of elevator music will eventually be gone. But poets, for example, won’t. ChatGPT is rather bad at poetry – it can’t actually write proper poetry. It seems to know just the AABB rhyme scheme, and it ignores instructions on meter (“use dactylic tetrameter” doesn’t seem to mean anything to it). With image and video generation, the problem with unrealistic hands and fingers (and similar issues) doesn’t seem to be going away with larger models (even though the latest version of Midjourney neatly works around it). It will certainly require post-editing. Will it make certain industries more efficient? Yes, which will allow them to produce more content in a given time. Will there be enough demand? I can’t say. The market will decide.

LLMs and AI will change things. They will improve efficiency. They will disrupt some industries. And we have to debate this. But we still have time.

The post Why I’m Not So Alarmed About AI And Jobs appeared first on Bozho's tech blog.

Nothing Is Secure [slides]

Post Syndicated from Bozho original https://techblog.bozho.net/nothing-is-secure-slides/

Yesterday I gave a talk at a local BSides conference in Bulgaria titled “Nothing is secure”.

The point is simple: security is very hard. There are many details, many tools, many processes that we need to handle and many problems that we need to solve day and night. And this, combined with the inherent complexity of IT systems, makes things inherently insecure. We have to manage risks, and governments need long-term policies in education (in order to have trained experts), in standardization (in order to let systems “talk” to each other easily and reduce moving parts), and in vendor responsibility for (at least) critical infrastructure. Below are my slides:

The post Nothing Is Secure [slides] appeared first on Bozho's tech blog.

Internally And Externally Facing Honeypots

Post Syndicated from Bozho original https://techblog.bozho.net/internally-and-externally-facing-honeypots/

Honeypots are great security tools – you install a “decoy” which attracts malicious traffic. Honeypots have certain ports open and speak certain protocols, mimicking regular interactions, e.g. SSH, RDP, Telnet, HTTP. Usually, at least in introductory materials, honeypots are assumed to be externally facing (e.g. installed in the DMZ). This means attackers can see them on the open internet and you can collect valuable information.

However, honeypots can run in a different mode – internally facing. In normal circumstances, they’d be completely silent. Only in case of a real intruder (doing lateral movement), or during security audits and pentests, would they collect data (otherwise nobody has any business poking at that IP address).
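
The internal variant is so simple that it barely needs dedicated software (though real deployments typically use tools like OpenCanary). A minimal sketch – a listener that nothing legitimate should ever touch, so every connection is alert-worthy:

```python
# Minimal internal honeypot: a fake service that nothing legitimate
# should ever touch, so every connection is alert-worthy.
import datetime
import socket

HOST, PORT = "0.0.0.0", 2222  # pretend-SSH port on a decoy machine

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((HOST, PORT))
    srv.listen()
    while True:
        conn, (ip, port) = srv.accept()
        with conn:
            # forward to the SIEM in a real setup; investigate immediately
            print(f"{datetime.datetime.now().isoformat()} "
                  f"ALERT: connection from {ip}:{port}")
```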

It makes sense to have both types of honeypots. Here are the positive sides of an externally facing honeypot:

  • Constantly collect threat information (IPs, attempted passwords, attempted protocols) and apply it in other tools (e.g. insert IPs into the SIEM/firewall)
  • Distinguish automated probes from human intrusion attempts
  • Visualize trends in malicious activity

And the benefits of an internally facing honeypot:

  • Get alerted in case of lateral movement. Almost every hit on the internal honeypot needs to be investigated immediately
  • No risk of letting intruders in through 0-days in the honeypot software stack
  • Low resource consumption (the external honeypot has to service potentially many requests; the internal one serves none if everything is fine)

The post Internally And Externally Facing Honeypots appeared first on Bozho's tech blog.

A Security Issue in Android That Remains Unfixed – Pull-down Menu On Lock Screen

Post Syndicated from Bozho original https://techblog.bozho.net/a-security-issue-in-android-that-remains-unfixed-pull-down-menu-on-lock-screen/

Having your phone lying around when your kids are playing with everything they find is a great security test. They immediately discover new features and ways to go beyond the usual flow.

This is how I recently discovered a security issue with Android. Apparently, even if the phone is locked, the pull-down menu with quick settings works. Volume control also works. Not all functionality inside the quick settings menu works fully while the phone is locked, but you can disable mobile data and Wi-Fi, you can turn on your hotspot, and you can switch to Airplane mode.

While this has been pointed out on Google Pixel forums, on Reddit and on Stack Exchange, it has not been fixed in stock Android. Different manufacturers seem to have acknowledged the issue in their custom ROMs, but that’s not a reliable long-term solution.

Let me explain why this is an issue. First, it breaks the assumption that when the phone is locked nothing works. Breaking user assumptions is bad by itself.

Second, it allows criminals to steal your phone and put it in Airplane mode, thus disabling any ability to track it – either through “find my phone” services, or by the police through mobile carriers. They can silence the phone, so that it can’t be found with the “ring my phone” functionality. It’s true that an attacker can just take out the SIM card, but having Wi-Fi on still allows tracking via the Wi-Fi networks the phone passes by.

Third, the hotspot (similar issues apply to Bluetooth). Allowing a connection can be used to attack the device. It’s not trivial, but it’s not impossible either. It can also be used for all sorts of network attacks on other devices connected to the hotspot (e.g. you enable the hotspot, a laptop connects automatically, and you execute an ARP poisoning attack). The hotspot also allows attackers to use a device to commit online crimes and frame the owner – especially if they do not steal the phone, but leave it lying where it originally was, just with the hotspot turned on. Of course, they would need to get the hotspot password, but that can be obtained through social engineering.

The interesting thing is that when you use Google’s Family Link to lock a device that’s given to a child, the pull-down menu doesn’t work. So the basic idea that “once locked, nothing should be accessible” is there – it’s just not implemented in the default use case.

While the scenarios described above are indeed edge cases and may be far-fetched, I think they should be fixed. The more functionality is available on a locked phone, the more attack surface it has (including for the exploitation of 0-days).

The post A Security Issue in Android That Remains Unfixed – Pull-down Menu On Lock Screen appeared first on Bozho's tech blog.

The Lack Of Native MFA For Active Directory Is A Big Sin For Microsoft

Post Syndicated from Bozho original https://techblog.bozho.net/the-lack-of-native-mfa-for-active-directory-is-a-big-sin-for-microsoft/

Active Directory is dominant in the enterprise world (as well as the public sector). From my observation, the majority of organizations rely on Active Directory for their user accounts. While that may be changing in recent years with more advanced cloud IAM and directory solutions, the landscape of the last two decades is one of domination by Microsoft’s Active Directory.

As a result of that dominance, many cyber attacks rely on exploiting some aspect of Active Directory – whether weaknesses of Kerberos (“pass the ticket”, golden ticket, etc.) or standard attacks like password spraying and credential stuffing, especially if Exchange web access is enabled. Last but not least, simply browsing the Active Directory once authenticated with a compromised account provides important information for further exploitation (finding other accounts, finding abandoned but not disabled accounts, finding passwords in description fields, etc.).

Basically, having access to an authentication endpoint which interfaces with Active Directory allows attackers to gain access and then move laterally.

What is the most recommended measure for preventing authentication attacks? Multi-factor authentication. And the sad reality is that Microsoft doesn’t offer native MFA for Active Directory.

Yes, there is Windows Hello for Business, but it can’t be used in web and email contexts – it is tied to the Windows machine. And yes, there are third-party options. But they incur additional cost and are complex to set up and manage. We all know the power of defaults and built-in features in security – MFA should be readily available and simple in order to see wide adoption.

What Microsoft should have done is introduce standard, TOTP-based MFA and enforce it through native second-factor screens in Windows, Exchange web access, Outlook and others. Yes, that would require Kerberos upgrades, but it is completely feasible. Ideally, it would be enabled by a single click, which would prompt users to enroll their smartphone apps (Google Authenticator, Microsoft Authenticator, Authy or others) on their next successful login. Of course, there may be users without smartphones, so the option to not enroll in MFA could be made available to certain less-privileged AD groups.
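
To underline how feasible this is: standard TOTP (RFC 6238) is a roughly 15-line algorithm, sketched below. A real deployment would of course need secure secret provisioning, rate limiting and clock-drift windows, but there is no technical excuse for it not being a built-in option:

```python
# TOTP per RFC 6238 (with RFC 4226 dynamic truncation), compatible with
# Google/Microsoft Authenticator. The secret below is a sample value.
import base64
import hashlib
import hmac
import struct
import time

def totp(secret_b32: str, digits: int = 6, period: int = 30) -> str:
    key = base64.b32decode(secret_b32, casefold=True)
    counter = struct.pack(">Q", int(time.time()) // period)
    mac = hmac.new(key, counter, hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

print(totp("JBSWY3DPEHPK3PXP"))  # same code as the enrolled authenticator app
```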

By not doing that, Microsoft exposes all on-premise AD deployments to all sorts of authentication attacks mentioned above. And for me that’s a big sin.

Microsoft would say, of course, that Azure AD supports many MFA options and is great and modern and secure and everything. And that’s true, if you choose to migrate to Azure and use Office 365 – and pay for a subscription instead of just a Windows Server license. It’s no secret that Microsoft’s business model is shifting towards cloud subscription services, and there’s nothing wrong with that. But leaving on-prem users with no good option for proper MFA across services, including email, is irresponsible.

The post The Lack Of Native MFA For Active Directory Is A Big Sin For Microsoft appeared first on Bozho's tech blog.

Open APIs – Public Infrastructure in the Digital Age

Post Syndicated from Bozho original https://techblog.bozho.net/open-apis-public-infrastructure-in-the-digital-age/

When “public infrastructure” is mentioned, typically people think of roads, bridges, rails, dams, power plants, city lights. These are all enablers, publicly funded/owned/managed (not necessarily all of these), which allow the larger public to do business and to cover basic needs. Public infrastructure is sometimes free, but not always (you pay electricity bills and toll fees; and of course someone will rightly point out that nothing is free, because we pay it through taxes, but that’s not the point).

In the digital age, we can think of additional examples of “public infrastructure”. The most obvious one, which has a physical aspect, is fiber-optic cables. Sometimes they are publicly owned (especially in rural areas), and their goal is to provide internet access, which itself is an enabler for business and day-to-day household activities. More and more countries, municipalities and even smaller communities invest in owning fiber-optic cables in order to make sure there’s equal access to the internet. But cables are still physical infrastructure.

Something entirely digital that is increasingly turning into public infrastructure is open government APIs. They are not yet fully perceived as public infrastructure, and exist as such only in the heads of a handful of policymakers and IT experts, but in essence they are exactly that – government-owned infrastructure that enables businesses and other activities.

But let me elaborate. Open APIs let the larger public access and/or modify data that is collected and/or centralized and/or monitored by government institutions (central or local). Some examples:

  • Electronic health infrastructure – the Bulgarian government is building a centralized health record as well as centralized e-prescriptions and e-hospitalization. It is all APIs, where private companies develop software for hospitals, general practitioners, pharmacies, labs. Other companies may develop apps for citizens to help them improve their health or match them with nutrition and sport advice. All of that is based on open APIs (following the FHIR standard; see the sketch after this list) and allows for fair competition, while managing access to sensitive data, audit logs and, most importantly, collection in a centralized store.
  • Toll system – we have a centralized road toll system, which offers APIs (unfortunately, via an overly complicated model of intermediaries) which supports multiple resellers to sell toll passes (time-based and distance-based). This allows telecoms (through apps), banks (through e-banking), supermarkets, fleet management companies and others to offer better UI and integrated services.
  • Tax systems – businesses will be happy to report their taxes through their ERP automatically, rather than manually exporting and uploading, or manually filling data in complex forms.
  • E-delivery of documents – Bulgaria has a centralized system for electronic delivery of documents to public institutions. That system has an API, which allows third parties to integrate and send documents as part of more complex services, on behalf of citizens and organizations.
  • Car registration – car registers are centralized, but opening up their APIs would allow car (re)sellers to handle all the paperwork on behalf of their customers, online, with the click of a button in their internal system. Car part sellers could fetch data about registered cars per brand and model, in order to make sure there are enough spare parts in stock (based on the typical lifecycle of car parts).
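To make the “public infrastructure” framing concrete, here is a minimal sketch of consuming such an open API – fetching a patient record from a FHIR server with Java’s built-in HTTP client. The base URL and patient id are hypothetical, and a real deployment would additionally require OAuth2 tokens or certificate-based authorization:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FhirClientExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // Hypothetical FHIR base URL; Patient/{id} is the standard FHIR REST path
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://his.example.bg/fhir/Patient/12345"))
                .header("Accept", "application/fhir+json")
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body()); // a FHIR Patient resource as JSON
    }
}
```

The point is that once such an endpoint exists, any vendor can build on top of it – exactly the way any trucking company can use a public road.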

Core systems and central registers with open APIs are digital public infrastructure that would allow a more seamless, integrated state. There are a lot of details to be taken into account – access management and authentication (who has the right to read or write certain data), fees (if a system is heavily used, the owning institution might charge a fee), change management and upgrades, zero downtime, integrity, format, etc.

But the policy that I have always followed and advocated for is clear – mandatory open APIs for all government systems. Bureaucracy and paperwork may become nearly invisible, hidden behind APIs, if this principle is followed.

The post Open APIs – Public Infrastructure in the Digital Age appeared first on Bozho's tech blog.

On Disinformation and Large Online Platforms

Post Syndicated from Bozho original https://techblog.bozho.net/on-disinformation-and-large-online-platforms/

This week I was invited to be a panelist, together with other digital ministers, on a side-event organized by Ukraine in Davos, during the World Economic Forum. The topic was disinformation, and I’d like to share my thoughts on it. The video recording is here, but below is not a transcript, but an expanded version.

Bulgaria is seemingly more susceptible to disinformation, for various reasons. A majority of the population has positive sentiments towards Russia, for historical reasons. Disinformation campaigns were around before the war and have continued after it started. The typical narratives that are pushed every day are about the bad, decadent West; the Slavic, traditional, conservative Russian government; the evil and aggressive NATO; the great and powerful, but peaceful, Russian army; and so on.

These disinformation campaigns are undermining public discourse and even public policy. COVID vaccination rates in Bulgaria are among the lowest in the world (and the mortality rate, accordingly, is among the highest). Propaganda and conspiracy theories took hold in our society and literally killed our relatives and friends. The war is another example – Bulgaria ranks first in the share of people who think the West (EU/NATO) is at fault for the war in Ukraine.

The Kremlin uses the same propaganda techniques it developed during the Cold War, but applied to the free internet they are much more efficient. It uses European values of free speech to undermine those same European values.

Their main channels are social networks, which seem to remain blissfully ignorant of local context like the one described above.

What we’ve seen, and what has been leaked and discussed for a long time, is that troll factories amplify anonymous websites. They share content, like content, and make it seem noteworthy to the algorithms.

We know how it works. But governments can’t just block a website because they think it spreads false information. A government may easily go beyond good intentions and slip into censorship. In 4 years I won’t be a minister, and the next government may decide I’m spreading “western propaganda” and block my profiles, my blogs, my interviews in the media.

I said all of that in front of the Bulgarian parliament last week. I also said that local measures are insufficient, and risky.

That’s why we have to act smart. We need to strike down the mechanisms for weaponizing social networks – for spreading disinformation to large portions of the population – not block the information itself. Brute force is dangerous, and it helps the Kremlin with its narrative about the bad, hypocritical West that talks about free speech but has the power to shut you down if a bureaucrat says so.

The solution, in my opinion, is to regulate recommendation engines on a European level – to make these algorithms find and demote such networks of trolls (they currently fail at that – Facebook claims it found 3 Russian-linked accounts in January).

How to do it? It’s hard to answer without knowing the data and the details of how the engines currently work. Social networks can try to cluster users by IPs, ASes, VPN exit nodes, content similarity, DNS and WHOIS data for websites, photo databases, etc. They can consult national media registers (if they exist), via APIs, to make sure something is an actual media outlet and not an auto-generated website with pre-written false content (which is what actually happens).
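As a naive illustration of the clustering idea (and certainly not how any platform actually does it), here is a sketch that flags groups of accounts sharing both an origin network and near-identical content; all names and data are made up:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InauthenticClusters {
    // Toy account model; real systems use far richer signals
    // (VPN exit nodes, WHOIS data, posting times, image fingerprints, etc.)
    record Account(String id, String asNumber, int contentFingerprint) {}

    public static void main(String[] args) {
        List<Account> accounts = List.of(
                new Account("a1", "AS64496", 42),
                new Account("a2", "AS64496", 42),
                new Account("a3", "AS64511", 17));

        // Group accounts that share an origin network AND near-identical content
        Map<String, List<Account>> clusters = new HashMap<>();
        for (Account acc : accounts) {
            String key = acc.asNumber() + "|" + acc.contentFingerprint();
            clusters.computeIfAbsent(key, k -> new ArrayList<>()).add(acc);
        }
        clusters.values().stream()
                .filter(group -> group.size() > 1) // suspiciously coordinated
                .forEach(group -> System.out.println("possible troll cluster: " + group));
    }
}
```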

The regulation should push social media not to moderate everything, but to stop promoting inauthentic behavior.

Europe and its partners must find a way to regulate algorithms without curbing freedom of expression. I was in Brussels last week to underline exactly that. We can use the Digital Services Act to do it, and we have to do it wisely.

I’ve been criticized – why am I taking on this task when I could just do the cool things, like eID, eServices and removing bureaucracy? I’m doing those, of course, without delay.

But we are here as government officials to tackle the systemic risks. The eID I’ll introduce will do no good if we lose the hearts and minds of people to Kremlin propaganda.

The post On Disinformation and Large Online Platforms appeared first on Bozho's tech blog.

Don’t Reinvent Date Formats

Post Syndicated from Bozho original https://techblog.bozho.net/dont-reinvent-date-formats/

Microsoft Exchange has a bug that practically stops email. (The public sector primarily uses Exchange, so many of the institutions I’m responsible for as a minister have their email “stuck”.) The bug is described here and, fortunately, has a solution.

But let me say something simple and obvious: don’t reinvent date formats, please. When in doubt, use ISO 8601 or epoch millis (in UTC), or RFC 2822. Nothing else makes sense.

Certainly, treating an int as a date is an abysmal idea (it doesn’t even save that many resources). 202201010000 is not a date format worth considering.
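For illustration, this is what the sane options look like with Java’s java.time (Java has no built-in RFC 2822 formatter; RFC_1123_DATE_TIME is its closest sibling):

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class DateFormats {
    public static void main(String[] args) {
        Instant now = Instant.now();

        // ISO 8601 in UTC, e.g. 2022-01-01T00:00:00Z
        String iso = DateTimeFormatter.ISO_INSTANT.format(now);

        // Epoch millis in UTC, e.g. 1641016800000
        long epochMillis = now.toEpochMilli();

        // RFC 1123 (the HTTP-style sibling of RFC 2822),
        // e.g. Sat, 01 Jan 2022 00:00:00 GMT
        String rfc = DateTimeFormatter.RFC_1123_DATE_TIME
                .format(ZonedDateTime.ofInstant(now, ZoneOffset.UTC));

        System.out.println(iso);
        System.out.println(epochMillis);
        System.out.println(rfc);
    }
}
```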

(As a side note, another piece of advice – add automated tests for future timestamps. Sometimes they catch odd behavior.)
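A minimal sketch of such a test, assuming JUnit 5 and a hypothetical parseReceivedDate method standing in for whatever parsing logic the codebase uses:

```java
import java.time.Instant;
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertFalse;

class TimestampSanityTest {
    // Hypothetical stand-in for the real parsing logic under test
    Instant parseReceivedDate(String raw) {
        return Instant.parse(raw);
    }

    @Test
    void receivedDateIsNotInTheFuture() {
        Instant parsed = parseReceivedDate("2022-01-01T00:00:00Z");
        // a "received" timestamp far in the future usually means a format bug
        assertFalse(parsed.isAfter(Instant.now().plusSeconds(300)),
                "timestamp is in the future");
    }
}
```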

I’ll finish with Jon Skeet’s talk on dates, strings and numbers.

The post Don’t Reinvent Date Formats appeared first on Bozho's tech blog.

I Have Been Appointed As E-Governance Minister of Bulgaria

Post Syndicated from Bozho original https://techblog.bozho.net/i-have-been-appointed-as-e-governance-minister-of-bulgaria/

Last week the Bulgarian National Assembly appointed the new government. I am one of the appointed ministers – a minister for electronic governance.

The portfolio includes digitizing registers and processes in all government institutions, reducing bureaucracy, electronic identity, cybersecurity, digital skills and more.

Thanks to all my readers for following this blog throughout the years. I will be sharing some digital policy details here from now on while I’m minister. That may include some technical articles, but they are unlikely to be developer-oriented.

I hope to make some important changes and put forward key ideas for e-governance and digital policy that can be used as an example outside my country (last time I was involved in public policy, I helped pass an “open source law”).

I’ve written a few articles about IT people looking for challenges – not just technical challenges. And I think that’s a great challenge where I’ll have to put all my knowledge and skills to work for the common good.

The post I Have Been Appointed As E-Governance Minister of Bulgaria appeared first on Bozho's tech blog.

Simple Things That Are Actually Hard: User Authentication

Post Syndicated from Bozho original https://techblog.bozho.net/simple-things-that-are-actually-hard-user-authentication/

You build a system. User authentication is the component that is always there, regardless of the functionality of the system. And by now it should be simple to implement it – just “drag” some ready-to-use authentication module, or configure it with some basic options (e.g. Spring Security), and you’re done.

Well, no. It’s the most obvious thing and yet it’s extremely complicated to get right. It’s not just login form -> check username/password -> set cookie. There is a lot more to think about:

  • Cookie security – how do you make sure a cookie doesn’t leak and can’t be forged? Should you even have a cookie, or use a stateless approach like JWT? Should SameSite be lax or strict?
  • Bind cookie to IP and logout user if IP changes?
  • Password requirements – minimum length, special characters? UI to help with selecting a password?
  • Storing passwords in the database – bcrypt, scrypt, PBKDF2, or SHA with multiple iterations? (a bcrypt sketch follows this list)
  • Allow the browser to store the password? Generally “yes”, but some applications deliberately hash it client-side before sending, so that it can’t be stored automatically
  • Email vs username – do you need a username at all? Should change of email be allowed?
  • Rate-limiting authentication attempts – how many failed logins should lock the account, and for how long? Should admins get notifications, or at least logs, for locked accounts? Is the limit per IP, per account, or a combination of the two?
  • Captcha – do you need captcha at all, which one, and after how many attempts? Is Re-Captcha an option?
  • Password reset – password reset token database table or expiring links with HMAC? Rate-limit password reset?
  • SSO – should your service support LDAP/Active Directory authentication (probably yes)? Should it support SAML 2.0 or OpenID Connect, and if yes, which ones – or all of them? Should it ONLY support SSO, rather than internal authentication?
  • 2FA – TOTP or something else? Implement the whole 2FA flow, including enabling/disabling and the use of backup codes; add an option to not ask for 2FA on a particular device for a period of time?
  • Login by link – should the option to send a one-time login link by email be supported?
  • XSS protection – make sure no XSS vulnerabilities exist, especially on the login page (but not only there, as XSS anywhere can steal cookies)
  • Dedicated authentication log – keep a history of all logins, with time, IP, user agent
  • Force logout – is the ability to log out a logged-in device needed, and how would it be implemented? E.g. with stateless tokens it’s not trivial.
  • Keeping a mobile device logged in – what should be stored client-side? (certainly not the password)
  • Working behind proxy – if the client IP matters (it does), make sure the X-Forwarded-For header is parsed
  • Capture login timezone for user and store it in the session to adjust times in the UI?
  • TLS mutual authentication – if hardware token authentication with a private key must be supported, mutual TLS has to be enabled. What should be in the truststore? Does the web server support per-page mutual TLS, or should a subdomain be used?
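To make a couple of these bullets concrete, here is a minimal sketch of password storage with bcrypt and naive per-account rate limiting, using Spring Security’s BCryptPasswordEncoder. The in-memory counters are an illustration only – a real system would persist them and track per-IP limits and lock expiry as well:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import org.springframework.security.crypto.bcrypt.BCryptPasswordEncoder;

public class AuthenticationService {
    private static final int MAX_FAILED_ATTEMPTS = 5;

    private final BCryptPasswordEncoder encoder = new BCryptPasswordEncoder(12); // cost factor
    private final Map<String, AtomicInteger> failedAttempts = new ConcurrentHashMap<>();

    public String hashForStorage(String rawPassword) {
        return encoder.encode(rawPassword); // salt is generated and embedded in the hash
    }

    public boolean login(String username, String rawPassword, String storedHash) {
        AtomicInteger failures = failedAttempts.computeIfAbsent(username, u -> new AtomicInteger());
        if (failures.get() >= MAX_FAILED_ATTEMPTS) {
            // locked; a real system would also notify admins and expire the lock
            return false;
        }
        if (encoder.matches(rawPassword, storedHash)) {
            failures.set(0); // successful login resets the counter
            return true;
        }
        failures.incrementAndGet();
        return false;
    }
}
```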

And that’s for the most obvious feature that every application has. No wonder it has been implemented incorrectly many, many times. The IT world is complex and nothing is simple. Sending email isn’t simple, authentication isn’t simple, logging isn’t simple. Working with strings and dates isn’t simple, sanitizing input and output isn’t simple.

We have done a poor job in building the frameworks and tools to help us with all those things. We can’t really ignore them, we have to think about them actively and take conscious, informed decisions.

The post Simple Things That Are Actually Hard: User Authentication appeared first on Bozho's tech blog.

Integrity Guarantees of Blockchains In Case of Single Owner Or Colluding Owners

Post Syndicated from Bozho original https://techblog.bozho.net/integrity-guarantees-of-blockchains-in-case-of-single-owner-or-colluding-owners/

The title may sound like a paper title rather than a blog post, because it was originally an idea for one, but I’m unlikely to find the time to write a proper paper, so here it is – a blog post.

Blockchain has been touted as the ultimate integrity guarantee – if you “have blockchain”, nobody can tamper with your data. Of course, reality is more complicated, and even in the most distributed of ledgers, there are known attacks. But most organizations that are experimenting with blockchain, rely on a private network, sometimes having themselves as the sole owner of the infrastructure, and sometimes sharing it with just a few partners.

The point of having the technology in the first place is to guarantee that once collected, data cannot be tampered with. So let’s review how that works in practice.

First, we have to define two terms – “tamper-resistant” (sometimes referred to as tamper-free) and “tamper-evident”. “Tamper-resistant” means nobody can ever tamper with the data and the state of the data structure is always guaranteed to be without any modifications. “Tamper-evident”, on the other hand, means that a data structure can be validated for integrity violations, and it will be known that there have been modifications (alterations, deletions or back-dating of entries). Therefore, with tamper-evident structures you can prove that the data is intact, but if it’s not intact, you can’t recover the original state. It’s still a very important property, as the ability to prove that data is not tampered with is crucial for compliance and legal aspects.

Blockchain is usually built on top of several main cryptographic primitives: cryptographic hashes, hash chains, Merkle trees, cryptographic timestamps and digital signatures. They all play a role in the integrity guarantees, but the most important ones are the Merkle tree (with all of its variations, like the Patricia Merkle tree) and the hash chain. The original Bitcoin paper describes a blockchain as a hash chain based on the roots of multiple Merkle trees (each of which forms a block). Some blockchains rely on a single, ever-growing Merkle tree, but let’s not get into particular implementation details.
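As an illustration of the primitive, here is a minimal sketch of computing a Merkle root by pairwise hashing – a toy version; real implementations handle odd nodes, domain separation and inclusion proofs more carefully:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.HexFormat;
import java.util.List;

public class MerkleRoot {
    static byte[] sha256(byte[]... parts) throws Exception {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        for (byte[] part : parts) digest.update(part);
        return digest.digest();
    }

    // Computes the Merkle root of a list of records by hashing pairs upwards
    static byte[] merkleRoot(List<byte[]> leaves) throws Exception {
        List<byte[]> level = new ArrayList<>();
        for (byte[] leaf : leaves) level.add(sha256(leaf));
        while (level.size() > 1) {
            List<byte[]> next = new ArrayList<>();
            for (int i = 0; i < level.size(); i += 2) {
                byte[] left = level.get(i);
                byte[] right = (i + 1 < level.size()) ? level.get(i + 1) : left; // duplicate odd node
                next.add(sha256(left, right));
            }
            level = next;
        }
        return level.get(0);
    }

    public static void main(String[] args) throws Exception {
        List<byte[]> records = List.of(
                "entry-1".getBytes(StandardCharsets.UTF_8),
                "entry-2".getBytes(StandardCharsets.UTF_8),
                "entry-3".getBytes(StandardCharsets.UTF_8));
        System.out.println(HexFormat.of().formatHex(merkleRoot(records)));
    }
}
```

Changing any leaf changes the root, which is why publishing the root externally pins down the whole data set.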

In all cases, blockchains are considered tamper-resistant because they are distributed in a way that ensures a large enough number of members have a copy of the data. If some node modifies data, e.g. 5 blocks in the past, it has to prove to everyone else that this is the correct Merkle root for that block. You need more than 50% of the network capacity to do that (and it’s more complicated than just having it), but it’s still possible. In a way, tamper resistance = tamper evidence + distributed data.

But many of the practical applications of blockchain rely on private networks, serving one or several entities. They are often based on proof of authority, which means whoever has access to a set of private keys controls what the network agrees on. So let’s review the two cases:

  • Multiple owners – in case of multiple node owners, several of them can collude to rewrite the chain. The collusion can be based on mutual business interest (e.g. in a supply chain, several members may team up against the producer to report distorted data), or can be based on security compromise (e.g. multiple members are hacked by the same group). In that case, the remaining node owners can have a backup of the original data, but finding out whether the rest were malicious or the changes were legitimate part of the business logic would require a complicated investigation.
  • Single owner – a single owner can have a nice Merkle tree or hash chain, but an admin with access to the underlying data store can regenerate the whole chain and it will look legitimate, while in reality it will be tampered with. Splitting access between multiple admins is one approach (or giving them access to separate nodes, none of whom has access to a majority), but they often drink beer together and collusion is again possible. But more importantly – you can’t prove to a 3rd party that your own employees haven’t colluded under orders from management in order to cover some tracks to present a better picture to a regulator.

In the case of a single owner, you don’t even have a tamper-evident structure – the chain can be fully rewritten and nobody will be able to tell. In the case of multiple owners, it depends on the implementation. There will be a record of the modification at the non-colluding party, but proving which side “cheated” would be next to impossible. Tamper evidence is only partially achieved, because you can’t prove whose data was modified and whose wasn’t (you only know that one of the copies contains tampered data).

The way to achieve a tamper-evident structure in both scenarios is anchoring. Checkpoints of the data need to be anchored externally, so that there is a clear record of the state of the chain at different points in time. Before blockchain, the recommended approach was to print the hash in a newspaper (e.g. as an ad) – because of the large circulation, nobody can collect all copies and modify the published checkpoint hash. The published hash would be either the root of a Merkle tree or the latest hash in a hash chain. An ever-growing Merkle tree would additionally allow consistency and inclusion proofs to be validated.

When we have electronic distribution of data, we can use public blockchains to regularly anchor our internal ones, in order to achieve proper tamper-evident data. We, at LogSentinel, for example, do exactly that – we allow publishing the latest Merkle root and the latest hash of the chain to Ethereum. Then, even if those with access to the underlying datastore manage to modify and regenerate the entire chain/tree, there will be no match with the publicly advertised values.

How to store data on public blockchains is a separate topic. In the case of Ethereum, you can put any payload within a transaction, so you can put the hash in low-value transactions between two of your own addresses (or self-transactions). You can use smart contracts as well, but that’s not necessary. For Bitcoin, you can use OP_RETURN. Other implementations may have different approaches to storing data within transactions.
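For illustration, a minimal sketch of anchoring a hash in a zero-value Ethereum self-transaction using the web3j library; the node URL, private key, hash and gas values are placeholders, not a production setup:

```java
import java.math.BigInteger;
import org.web3j.crypto.Credentials;
import org.web3j.crypto.RawTransaction;
import org.web3j.crypto.TransactionEncoder;
import org.web3j.protocol.Web3j;
import org.web3j.protocol.core.DefaultBlockParameterName;
import org.web3j.protocol.http.HttpService;
import org.web3j.utils.Numeric;

public class AnchorHash {
    public static void main(String[] args) throws Exception {
        // Placeholders: JSON-RPC node URL and private key are not real
        Web3j web3j = Web3j.build(new HttpService("https://eth-node.example.com"));
        Credentials credentials = Credentials.create("0x<private-key>");

        BigInteger nonce = web3j.ethGetTransactionCount(
                        credentials.getAddress(), DefaultBlockParameterName.LATEST)
                .send().getTransactionCount();

        String latestMerkleRoot = "0x<merkle-root-hex>"; // the hash to anchor

        // Zero-value transaction to our own address, carrying the hash as payload
        RawTransaction tx = RawTransaction.createTransaction(
                nonce,
                BigInteger.valueOf(20_000_000_000L), // gas price in wei (illustrative)
                BigInteger.valueOf(100_000L),        // gas limit (illustrative)
                credentials.getAddress(),            // send to ourselves
                BigInteger.ZERO,                     // no ether transferred
                latestMerkleRoot);                   // the anchored hash as data

        byte[] signed = TransactionEncoder.signMessage(tx, 1L, credentials); // chain id 1 = mainnet
        String txHash = web3j.ethSendRawTransaction(Numeric.toHexString(signed))
                .send().getTransactionHash();
        System.out.println("anchored in tx " + txHash);
    }
}
```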

If we want to achieve tamper resistance, we just need several copies of the data, all subject to the tamper-evidence guarantees – just as in a public network. What a public network gives us is a layer that we can trust to provide the necessary piece for achieving local tamper evidence. Of course, going down to hardware, it’s easier to have write-once storage (WORM – write once, read many). The problem with it is that it’s expensive and can’t be reused, so it’s not very applicable to use cases with short-lived data that requires tamper resistance.

So in summary, in order to have proper integrity guarantees and the ability to prove that the data in a single-owner or multi-owner private blockchain hasn’t been tampered with, we have to publish the latest hash of whatever structure we are using (chain or tree). If we don’t, we are only complicating our lives by integrating a complex piece of technology without getting the real benefit it can bring – proving the integrity of our data.

The post Integrity Guarantees of Blockchains In Case of Single Owner Or Colluding Owners appeared first on Bozho's tech blog.

Hypotheses About What Happened to Facebook

Post Syndicated from Bozho original https://techblog.bozho.net/hypotheses-about-what-happened-to-facebook/

Facebook was down. I’d recommend reading Cloudflare’s summary. Then I recommend reading Facebook’s own account of the incident. But let me expand on that. Facebook published announcements and withdrawals for certain BGP prefixes, which led to removing its DNS servers from “the map of the internet” – it told everyone “the part of our network where our DNS servers are doesn’t exist”. That was the result of a self-inflicted backbone failure, due to a bug in the auditing tool that checks whether executed commands would do harmful things.

Facebook owns a lot of IPs. According to RIPEstat they are part of 399 prefixes (147 of them are IPv4). The DNS servers are located in two of those 399. Facebook uses a.ns.facebook.com, b.ns.facebook.com, c.ns.facebook.com and d.ns.facebook.com, which get queries whenever someone wants to know the IPs of Facebook-owned domains. These four nameservers are served by the same Autonomous System from just two prefixes – 129.134.30.0/23 and 185.89.218.0/23. Of course “4 nameservers” is a logical construct, there are probably many actual servers behind that (using anycast).

I wrote a simple “script” to fetch all the withdrawals and announcements for all Facebook-owned prefixes (using the great RIPEstat API). Facebook didn’t remove itself from the map entirely – as Cloudflare points out, only some prefixes were affected. It can be just these two, or a few others as well, but it seems only a handful were affected. If we sort the resulting CSV from the script by withdrawals, we’ll notice that 129.134.30.0/23 and 185.89.218.0/23 are pretty high up (alongside 185.89.218.0/24 and 129.134.30.0/24, which are included in the respective /23s). That perfectly matches Facebook’s account that their nameservers automatically withdraw themselves if they fail to connect to other parts of the infrastructure. Everything else may have also been down, but the withdrawal logic is present only in the networks that have nameservers in them.
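Such a fetch can be sketched with Java’s built-in HTTP client; the endpoint and query parameters below are my reading of the RIPEstat data API docs, so double-check them against https://stat.ripe.net/docs/data_api before relying on them:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BgpUpdatesFetcher {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // RIPEstat data API, bgp-updates endpoint; parameters are assumptions
        String url = "https://stat.ripe.net/data/bgp-updates/data.json"
                + "?resource=129.134.30.0/23"
                + "&starttime=2021-10-04T00:00:00&endtime=2021-10-05T00:00:00";
        HttpRequest request = HttpRequest.newBuilder().uri(URI.create(url)).build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        // The JSON response contains announcements and withdrawals,
        // which can be counted per prefix and dumped to CSV
        System.out.println(response.body());
    }
}
```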

So first, let me make a few general observations that are not as obvious and as universal as they may sound, but are worth discussing:

  • Use longer DNS TTLs if possible – if Facebook had a 6-hour TTL on its domains, we might not have noticed that their nameservers were down at all. This is hard to ask of such a complex service that uses DNS for load balancing and geographical distribution, but it’s worth considering. Also, if they killed their backbone and their entire infrastructure was down anyway, the DNS TTL would not have solved the issue.
  • We need improved caching logic for DNS – a toy sketch follows this list. It can’t be just “present or not”; DNS caches could keep a “last known good state” and fall back to it in case of SERVFAIL. All of those DNS resolvers that had to ask the authoritative nameserver “where can I find facebook.com” knew where to find facebook.com just a minute earlier. Then they got a failure and suddenly started wondering “oh, where could Facebook be?”. It’s not that simple, of course, but such a cache improvement is worth considering. And again, if their entire infrastructure was down, this would not have helped.
  • Consider having an authoritative nameserver outside your main AS. If something bad happens to your AS routes (regardless of the reason), you may still have DNS working. That may have downsides – generally, it will be hard to manage and sync your DNS infrastructure. But at least having a spare set of nameservers and the option to quickly point glue records there is worth considering. It would not have saved Facebook in this case, as again, they claim the entire infrastructure was inaccessible due to a “broken” backbone.
  • Have 100% test coverage on critical tools, such as the auditing tool that had a bug. 100% test coverage is rarely achievable in any project, but in such critical tools it’s a must.
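Here is a toy sketch of the “last known good state” idea from the second bullet – a resolver cache that serves a stale answer when the upstream lookup fails; a real resolver would also need TTL bookkeeping and bounded staleness:

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// A toy resolver cache that falls back to the last known good answer
// when the upstream lookup fails (e.g. SERVFAIL), instead of returning nothing.
public class StaleFallbackDnsCache {
    public interface Upstream {
        Optional<String> resolve(String hostname); // empty on SERVFAIL/timeout
    }

    private final Upstream upstream;
    private final Map<String, String> lastKnownGood = new ConcurrentHashMap<>();

    public StaleFallbackDnsCache(Upstream upstream) {
        this.upstream = upstream;
    }

    public Optional<String> resolve(String hostname) {
        Optional<String> fresh = upstream.resolve(hostname);
        if (fresh.isPresent()) {
            lastKnownGood.put(hostname, fresh.get()); // remember the good answer
            return fresh;
        }
        // Upstream failed: serve the stale answer if we have one
        return Optional.ofNullable(lastKnownGood.get(hostname));
    }
}
```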

The main explanation is an accidental outage. This is what Facebook engineers explain in the blog post and other accounts, and that’s what seems to have happened. However, alternative hypotheses are floating around, so let me briefly discuss all of the options.

  • Accidental outage due to misconfiguration – a very likely scenario. These things may happen to everyone, and Facebook is known for its “break things” mentality, so it’s not unlikely that they just didn’t have the right safeguards in place and someone ran a buggy update. The scenarios for why and how that may have happened are many, and we can’t know from the outside (even after Facebook’s brief description). This remains the primary explanation, following my favorite Hanlon’s razor. A bug in the audit tool is absolutely realistic (btw, I’d love Facebook to publish their internal tools).
  • Cyber attack – it cannot be ruled out by the data we have, but this would be a sophisticated attack that gained access to their BGP administration interface, which I would assume is properly protected. Not impossible, but a 6-hour outage of a social network is not something a sophisticated actor (e.g. a nation state) would invest resources in. We can’t rule it out, as this might be “just a drill” for something bigger to follow. If I were an attacker who wanted to take Facebook down, I’d try to kill their DNS servers or, indeed, “de-route” them. If we didn’t know that Facebook lets its DNS servers cut themselves from the network in case of failures, the fact that so few prefixes were updated might be an indicator of a targeted attack, but this seems less and less likely.
  • Deliberate self-sabotage – 1.5 billion records were claimed to be leaked yesterday. At the same time, a Facebook whistleblower is testifying in the US Congress. Both pieces of news are potentially damaging to Facebook’s reputation and shares. If they wanted to drown the news and the respective share price plunge in a technical story that few people understand but everyone talks about (and then have the share price rebound, because technical issues happen to everyone), then that’s the way to do it – just as a malicious actor would, but without the hassle of gaining access from outside: de-route the prefixes for the DNS servers and you have a “perfect” outage. These coincidences have led people to assume such a plot, but given the observed outage and Facebook’s explanation of why the DNS prefixes were automatically withdrawn, this sounds unlikely.

Distinguishing between the three options is actually hard. You can mask a deliberate outage as an accident, and a malicious actor can make it look like deliberate self-sabotage. That’s why there are speculations. To me, however, judging by all the data we have in RIPEstat and the various accounts by Cloudflare, Facebook and other experts, it seems that a chain of mistakes (operational and possibly design ones) led to this.

The post Hypotheses About What Happened to Facebook appeared first on Bozho's tech blog.