Sunday, August 30, 2020

Save Your Cloud: DoS On VMs In OpenNebula 4.6.1

This is a post about an old vulnerability that I finally found the time to blog about. It dates back to 2014, but from a technical point of view it is nevertheless interesting: An XML parser that tries to fix structural errors in a document caused a DoS problem.

All previous posts of this series focused on XSS. This time, we present a vulnerability in another Cloud Management Platform: OpenNebula. This Infrastructure-as-a-Service platform started as a research project in 2005. It is used by information technology companies like IBM, Dell and Akamai as well as academic institutions and the European Space Agency (ESA). By relying on standard Linux tools as far as possible, OpenNebula reaches a high level of customizability and flexibility in hypervisors, storage systems, and network infrastructures. OpenNebula is distributed under the Apache 2.0 license.


OpenNebula offers a broad variety of interfaces to control a cloud. This post focuses on Sunstone, OpenNebula's web interface (see Figure 1).

Figure 1: OpenNebula's Sunstone Interface displaying a VM's control interface

Before OpenNebula 4.6.2, Sunstone had no Cross-Site Request Forgery (CSRF) protection. This is a severe problem. Consider an attacker who lures a victim into clicking on a malicious link while the victim is logged in to a private cloud. This enables the attacker to send arbitrary requests to the private cloud through the victim's browser. However, we found other bugs in OpenNebula that allowed us to perform much more sophisticated attacks.
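
For readers unfamiliar with CSRF protection, the following is a minimal Python sketch of the generic synchronizer-token idea: the server stores a random token in the session and rejects state-changing requests that do not echo it back. It is not Sunstone's code and does not describe the fix shipped in 4.6.2; the function names and the dictionary used as a session are assumptions made for illustration.

```python
import hmac
import secrets


def issue_token(session):
    """Store a fresh random token in the (hypothetical) session and return it.

    The page served to the logged-in user would embed this token in its forms.
    """
    token = secrets.token_hex(32)
    session["csrf_token"] = token
    return token


def is_request_allowed(session, submitted_token):
    """Accept a state-changing request only if it carries the session's token."""
    expected = session.get("csrf_token")
    if not expected or not submitted_token:
        return False
    # Constant-time comparison, so the check does not leak how many characters matched.
    return hmac.compare_digest(expected, submitted_token)


if __name__ == "__main__":
    session = {}
    token = issue_token(session)
    print(is_request_allowed(session, token))     # legitimate form submission
    print(is_request_allowed(session, "forged"))  # cross-site request without the token
```

A malicious page can make the victim's browser send the session cookie, but it cannot read the token embedded in the legitimate page, so the forged request fails the check.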

Denial-of-Service on OpenNebula-VM

At its backend, OpenNebula manages VMs with XML documents. A sample for such an XML document looks like this:
<VM>
   <ID>0</ID>
   <NAME>My VM</NAME>
   <PERMISSIONS>...</PERMISSIONS>
   <MEMORY>512</MEMORY>
   <CPU>1</CPU>
   ...
</VM>
OpenNebula 4.6.1 contains a bug in the sanitization of input for these XML documents: Whenever a VM's name contains an opening XML tag (but no corresponding closing one), an XML generator at the backend automatically inserts the corresponding closing tag to ensure well-formedness of the resulting document. However, the generator outputs an XML document that does not comply with the XML schema OpenNebula expects. The listing below shows the structure that is created after renaming the VM to 'My <x> VM':
<VM>
   <ID>0</ID>
   <NAME>My <x> VM</x>
      <PERMISSIONS>...</PERMISSIONS>
      <MEMORY>512</MEMORY>
      <CPU>1</CPU>
      ...
   </NAME>
</VM>
The generator closes the <x> tag, but not the <NAME> tag. At the end of the document, the generator closes all opened tags including <NAME>.

OpenNebula saves the incorrectly generated XML document in a database. The next time the OpenNebula core retrieves information about that particular VM from the database, the XML parser runs into an error because it expects the name to be a plain string, not an XML tree. As a result, Sunstone can no longer be used to control the VM. The Denial-of-Service attack can only be reverted from the command line interface of OpenNebula.
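
To make the failure mode concrete, here is a small Python sketch using the standard library's xml.etree.ElementTree in place of OpenNebula's own parser. The reader function, its checks, and the shortened documents are assumptions for illustration; the sketch only shows why a consumer that expects NAME to be text and MEMORY to be a direct child of VM breaks on the corrupted document, and how escaping the name on input avoids the problem.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Shortened versions of the two documents discussed in the post.
EXPECTED = "<VM><ID>0</ID><NAME>My VM</NAME><MEMORY>512</MEMORY><CPU>1</CPU></VM>"
CORRUPTED = ("<VM><ID>0</ID><NAME>My <x> VM</x>"
             "<MEMORY>512</MEMORY><CPU>1</CPU></NAME></VM>")


def read_vm(xml_text):
    """Hypothetical reader that expects the flat layout of the sample document."""
    root = ET.fromstring(xml_text)
    name = root.find("NAME")            # only looks at direct children of <VM>
    if name is None or len(name) > 0:   # len() counts child elements
        raise ValueError("NAME must be plain text, not an XML tree")
    memory = root.find("MEMORY")
    if memory is None:
        raise ValueError("MEMORY is missing from <VM>")
    return {"name": name.text, "memory": int(memory.text)}


def build_vm(name):
    """Escape the user-supplied name so that '<' and '>' cannot open new tags."""
    return ("<VM><ID>0</ID><NAME>" + escape(name) + "</NAME>"
            "<MEMORY>512</MEMORY><CPU>1</CPU></VM>")


if __name__ == "__main__":
    print(read_vm(EXPECTED))
    try:
        read_vm(CORRUPTED)
    except ValueError as err:
        print("corrupted document rejected:", err)
    print(read_vm(build_vm("My <x> VM")))
```

With escaping applied, the name round-trips as the literal string 'My <x> VM' and the rest of the document keeps its expected position.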

This bug can be triggered by a CSRF attack, which means that it is a valid attack against a private cloud: By luring a victim onto a maliciously crafted website while the victim is logged in to Sunstone, an attacker can make all the victim's VMs uncontrollable via Sunstone. A video of the attack can be seen here:


Background Info:

This is a somewhat harder topic to write about, considering most of my audience are hackers, not Ethereum developers or blockchain architects. You may not know what a smart contract is or where it sits within the blockchain development model, so I am going to cover a little context to help with understanding. I will cover the bare minimum needed as an attacker.

A Standard Application Model:
In a client-server model we generally have the following:
  • Front End - what the user sees (HTML etc.)
  • Server Side - code that handles business logic
  • Back End - your database, for example MySQL

A Decentralized Application Model:

Now, with a decentralized application (DAPP) on the blockchain you have similar front-end and server-side technology; however:
  • Smart contracts are your access into the blockchain.
  • Your smart contract is kind of like an API
  • Essentially DAPPs are Ethereum enabled applications using smart contracts as an API to the blockchain data ledger
  • DAPPs can be banking applications, wallets, video games etc.

A blockchain is a trust-less peer to peer decentralized database or ledger

The back-end is distributed across thousands of nodes, and it is stored in its entirety on each one, meaning every single node has a full "database" of information called a ledger. The second difference is that this ledger is immutable, meaning once data goes in, it cannot be changed. This will come into play later in this discussion about smart contracts.
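
One way to see where the immutability comes from is a toy hash-chained ledger, sketched below in Python. This is an assumption-laden simplification (the block layout, field names and the use of plain SHA-256 are chosen for illustration and are not Ethereum's actual block format): each block stores the hash of the previous one, so changing old data invalidates every later link.

```python
import hashlib

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, data, prev_hash):
    """Hash a block's contents together with the hash of its predecessor."""
    payload = f"{index}|{data}|{prev_hash}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_chain(entries):
    """Append each entry as a block that points at the block before it."""
    chain = []
    prev_hash = GENESIS_HASH
    for index, data in enumerate(entries):
        current = block_hash(index, data, prev_hash)
        chain.append({"index": index, "data": data,
                      "prev_hash": prev_hash, "hash": current})
        prev_hash = current
    return chain


def is_valid(chain):
    """Recompute every hash and check that the links still line up."""
    prev_hash = GENESIS_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["data"], block["prev_hash"]):
            return False
        prev_hash = block["hash"]
    return True


if __name__ == "__main__":
    ledger = build_chain(["alice pays bob 5", "bob pays carol 2"])
    print(is_valid(ledger))                    # untouched ledger
    ledger[0]["data"] = "alice pays bob 500"   # attempt to rewrite history
    print(is_valid(ledger))                    # tampering is detected
```

Because every node holds the full ledger and can run the same check, a rewritten copy is simply rejected by the rest of the network.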

Consensus:

The copies of this decentralized ledger are kept in sync by a consensus mechanism you may be familiar with called "mining", or more accurately proof of work, or alternatively proof of stake.

Proof of stake means staking large sums of coins while helping to reach consensus on the data; those coins are at risk of loss if the staker performs a malicious action.

Much like proof of stake, proof of work (mining) puts something at risk to reach consensus: miners perform hashing calculations that the network validates, but instead of a loss of coins there is a loss of energy, which costs money and goes unrewarded if malicious actions take place.

Each block contains transactions from the transaction pool combined with a nonce that meets the difficulty requirements. Once a block is found and accepted, its transactions are placed on the blockchain, on which more than half of the network must reach consensus.
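
The nonce search can be sketched in a few lines of Python. This is a toy model, not Ethereum's actual mining algorithm: the difficulty rule used here (the SHA-256 hex digest must start with a given number of zeros) and the way the block is serialized are assumptions picked to keep the example short.

```python
import hashlib


def digest_for(transactions, prev_hash, nonce):
    """Hash the block contents together with a candidate nonce."""
    payload = f"{prev_hash}|{transactions}|{nonce}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def mine(transactions, prev_hash, difficulty=3):
    """Try nonces one by one until the digest meets the toy difficulty rule."""
    target_prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = digest_for(transactions, prev_hash, nonce)
        if digest.startswith(target_prefix):
            return nonce, digest
        nonce += 1


def verify(transactions, prev_hash, nonce, difficulty=3):
    """Checking a claimed nonce takes a single hash, unlike finding one."""
    return digest_for(transactions, prev_hash, nonce).startswith("0" * difficulty)


if __name__ == "__main__":
    nonce, digest = mine("alice->bob:5;bob->carol:2", "0" * 64)
    print(nonce, digest)
    print(verify("alice->bob:5;bob->carol:2", "0" * 64, nonce))
```

The asymmetry is the point: finding the nonce costs many hash attempts (the wasted energy mentioned above), while any other node can confirm it with one.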

The point is that no central authority controls the nodes or can shut them down. Instead there is consensus from all nodes using either proof of work or proof of stake. They are spread across the whole world leaving a single centralized jurisdiction as an impossibility.

Things to Note: 

First Note: Immutability

  • So, the thing to note is that our smart contracts are located on the blockchain
  • And the blockchain is immutable
  • This means an Agile development model is not going to work once a contract is deployed.
  • This means that updates to contracts are next to impossible
  • All you can really do is create a kill switch or fail-safe functions to disable the contract and execute some actions if something goes wrong, before it goes permanently dormant.
  • If you don't include a kill switch the contract is open and available and you can't remove it

Second Note:  Code Is Open Source
  • Smart Contracts are generally open source
  • Which means people like ourselves are manually bug hunting smart contracts and running static analysis tools against smart contract code looking for bugs.

When issues are found the only course of action is:
  • Kill the current contract which stays on the blockchain
  • Then deploy a whole new version.
  • If there is no kill switch the contract will be available forever.
Now I know what you're thinking: these things are ripe for exploitation.
And you would be correct, based on the third note.


Third Note: Security in the development process is lacking
  • Many contracts and projects do not even think about an SDLC.
  • They rarely add penetration testing and vulnerability testing in the development stages, if at all
  • At best there is a bug bounty before the release of their main-nets
  • Which usually get hacked to hell and delayed because of it.
  • Things are getting better but they are still behind the curve, as the technology is new and blockchain teams are mostly developers and marketers, not hackers or security testers.


Fourth Note:  Potential Data Exposure via Future Broken Crypto
  • If sensitive data is placed on the blockchain it is there forever
  • Which means that if a cryptographic algorithm is broken anything which is encrypted with that algorithm is now accessible
  • We all know that algorithms are eventually broken!
  • So it's always advisable to keep only a hash of sensitive data on the blockchain for integrity, rather than storing the data itself on the blockchain
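
The advice in the last bullet can be sketched as follows in Python. The function names, the use of SHA-256 and the random salt are assumptions made for this illustration: only the digest would be published on-chain, while the document and the salt stay in off-chain storage.

```python
import hashlib
import secrets


def commit(document):
    """Return (salt, digest); only the digest is meant to go on the blockchain.

    The random salt (an assumption of this sketch) keeps short or guessable
    documents from being recovered by simply hashing candidate values.
    """
    salt = secrets.token_bytes(16)
    digest = hashlib.sha256(salt + document).hexdigest()
    return salt, digest


def verify(document, salt, on_chain_digest):
    """Check an off-chain document against the digest stored on-chain."""
    return hashlib.sha256(salt + document).hexdigest() == on_chain_digest


if __name__ == "__main__":
    record = b"patient 1234: blood type 0+"
    salt, digest = commit(record)
    print(digest)                                   # value that would be stored on-chain
    print(verify(record, salt, digest))             # original document checks out
    print(verify(b"patient 1234: blood type A+", salt, digest))  # altered document does not
```

Integrity can still be proven to anyone holding the document and salt, yet nothing sensitive sits permanently on the ledger waiting for an algorithm to be broken.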


 Exploitation of Re-Entrancy Vulnerabilities:

With a bit of the background out of the way let's get into the first attack in this series.

Re-Entrancy attacks allow an attacker to create a recursive loop within a contract. Rather than a single request coming from a user, the request comes from the attacker's contract, which calls the target function again each time it receives funds and so does not let the target contract's execution complete until the tasks intended by the attacker are complete. Usually this task will be draining the money out of the contract until all of the money for every user is in the attacker's account.

Example Scenario:

Let's say that you are using a bank and you have deposited 100 dollars into your bank account.  Now when you withdraw your money from your bank account the bank account first sends you 100 dollars before updating your account balance.

Well, what if, when you received your 100 dollars, it was sent to malicious code that called the withdraw function again, not letting the initial call deduct your balance?

With this scenario you could then request 100 dollars, then request 100 again and you now have 200 dollars sent to you from the bank. But 50% of that money is not yours. It's from the whole collection of money that the bank is tasked to maintain for its accounts.

Ok, that's pretty cool, but what if that was in a recursive loop that did not break until all accounts at the bank were empty?

That is Re-Entrancy in a nutshell.   So let's look at some code.

Example Target Code:


    function withdraw(uint withdrawAmount) public returns (uint) {
1.      require(withdrawAmount <= balances[msg.sender]);
2.      require(msg.sender.call.value(withdrawAmount)());
3.      balances[msg.sender] -= withdrawAmount;
4.      return balances[msg.sender];
    }

Line 1: Checks that you are only withdrawing an amount you have in your account, or sends back an error.
Line 2: Sends your requested amount to the address that requested the withdrawal.
Line 3: Deducts the amount you withdrew from your total balance.
Line 4: Simply returns your current balance.

Ok, this all seems logical. However, the issue is in the order of Line 2 and Line 3: the funds are sent to you before your balance is deducted. So if you were to call this from a piece of code which accepts anything that is sent to it, but then re-calls the withdraw function, you have a problem, as execution never gets to Line 3, which deducts the amount from your balance. This means that the check on Line 1 will keep passing, so you can keep withdrawing.

Let's take a look at how we would do that:

Example Attacking Code:


    function attack() public payable {
1.      bankAddress.withdraw(amount);
    }

2.  function () public payable {
3.      if (address(bankAddress).balance >= amount) {
4.          bankAddress.withdraw(amount);
        }
    }

Line 1: This function calls the bank's withdraw function with an amount no greater than the total in your account.
Line 2: This second function is something called a fallback function. This function is used to accept payments that come into the contract when no function is specified. You will notice this function does not have a name but is set to payable.
Line 3: This line checks that the target contract's balance is at least the amount being withdrawn.
Line 4: This line calls the withdraw function again to continue the loop; the resulting payment lands back in the fallback function, repeating lines 3 and 4 over and over until the target contract's balance is less than the amount being requested.



Review the diagram above, which shows the code paths between the target and attacking code. During this whole process the withdraw function from the first code example only ever gets to lines 1-2 until the bank is drained of money. It never actually deducts your requested amount until the end, when the full contract balance is lower than your withdraw amount. At this point it's too late and there is no money left in the contract.
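
The control flow between the two contracts can also be traced with a small Python simulation. It is a toy model, not EVM semantics: the class names, the funds counter, the HonestUser depositor and the deduct_first switch are assumptions added for illustration. With deduct_first left off, withdraw follows the same order as the target code (check, send, then deduct); switching it on swaps the last two steps to show why the ordering matters.

```python
class Bank:
    """Toy stand-in for the target contract (not real EVM semantics)."""

    def __init__(self, deduct_first=False):
        self.balances = {}      # per-account ledger, like balances[msg.sender]
        self.funds = 0          # money actually held by the contract
        self.deduct_first = deduct_first

    def deposit(self, account, amount):
        self.balances[account] = self.balances.get(account, 0) + amount
        self.funds += amount

    def withdraw(self, account, amount):
        if amount > self.balances.get(account, 0):   # Line 1: balance check
            raise ValueError("insufficient balance")
        if self.deduct_first:
            self.balances[account] -= amount         # reordered: deduct, then send
            self._send(account, amount)
        else:
            self._send(account, amount)              # Line 2: send first
            # Line 3: deduct afterwards. Python ints go negative here, where an
            # old Solidity uint would wrap around instead.
            self.balances[account] -= amount

    def _send(self, account, amount):
        if amount > self.funds:
            raise ValueError("contract has no funds left")
        self.funds -= amount
        account.receive(self, amount)                # hands control to the receiver


class HonestUser:
    def receive(self, bank, amount):
        pass                                         # just accepts the payment


class Attacker:
    def __init__(self, amount):
        self.amount = amount
        self.received = 0

    def attack(self, bank):
        bank.withdraw(self, self.amount)

    def receive(self, bank, amount):                 # plays the fallback function
        self.received += amount
        if bank.funds >= self.amount:
            try:
                bank.withdraw(self, self.amount)     # re-enter before the deduction
            except ValueError:
                pass                                 # simplification: ignore a failed re-entry


def run(deduct_first):
    bank = Bank(deduct_first)
    bank.deposit(HonestUser(), 900)                  # other customers' money
    attacker = Attacker(100)
    bank.deposit(attacker, 100)                      # attacker's own deposit
    attacker.attack(bank)
    return attacker.received, bank.funds


if __name__ == "__main__":
    print(run(deduct_first=False))   # send-then-deduct order, as in the target code
    print(run(deduct_first=True))    # deduct-then-send order
```

In this model run(False) returns (1000, 0): the attacker deposited 100 and walks away with the other 900 as well. run(True) returns (100, 900), because the second call fails the balance check on Line 1.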


Setting up a Lab Environment and coding your Attack:

Hopefully that all made sense. If you watch the videos associated with this blog you will see it all in action. We will now analyze the code of a simple smart contract banking application. We will interface with this contract via our own manually coded smart contract, which we then turn into an exploit to take advantage of the vulnerability.

Download the target code from the following link:

Then let's open up an online Ethereum development platform at the following link, where we will begin analyzing and exploiting smart contracts in real time in the video below:

Coding your Exploit and Interfacing with a Contract Programmatically:

The rest of this blog will continue in the video below, where we will manually code an interface to a full smart contract and write an exploit to take advantage of a Re-Entrancy vulnerability:

 


Conclusion: 

In this smart contract exploit writing intro we showed a vulnerability that allowed for re-entry into a contract in a recursive loop. We then manually created an exploit to take advantage of the vulnerability. This is just the beginning; as this series progresses you will see other types of vulnerabilities and be able to code and exploit them yourself. On this journey through the decentralized world you will learn how to code and craft exploits in Solidity using various development environments and test nets.



Saturday, August 29, 2020

Meta-Learning With GPT-3: Learning To Add By Reading Or To Write Source Code With Predictive Text Cognitive Services.

At the Ideas Locas team we love building projects and working with Artificial Intelligence technologies. Within this very broad field, NLP (Natural Language Processing) cognitive services are becoming increasingly important. Thanks to them we can start creating user interfaces that humanize human-computer interaction, with all the advantages that entails.

Figure 1: Meta-learning with GPT-3: learning to add by reading
or to write source code with Predictive Text cognitive services.

Within this subset of services that try to reach human parity, today I want to talk about text generation, or the Predictive Text services that were already used in the past to write made-up novels in the style of Harry Potter, Drácula or Don Quijote de la Mancha, as our colleague Fran Ramírez told us; there is also Chema Alonso's article on how AI services have begun to surpass human parity.

Figure 2: Creating novels with Predictive Text tools

The Predictive Text problem is seemingly simple to understand. It consists of building a trained model that must predict the next element of a text from the preceding sequence of characters, using purely probabilistic criteria. So, given an input such as "soy hacke" ("I am a hacke"), the model's output should be an "r". To a human this seems obvious, but building and training these AI models is really complex, because language itself is, even if we do not notice it as speakers.
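
The idea of predicting the next character from observed frequencies can be illustrated with a tiny Python sketch. This is a character n-gram counter, a far simpler technique than the neural network behind GPT-3, and the three-sentence training text is invented for the example.

```python
from collections import Counter, defaultdict

# Invented toy corpus; a real model is trained on vastly more text.
CORPUS = "soy hacker. el hacker aprende. un hacker comparte."


def train(text, order=3):
    """Count which character follows each run of `order` characters."""
    model = defaultdict(Counter)
    for i in range(len(text) - order):
        context = text[i:i + order]
        next_char = text[i + order]
        model[context][next_char] += 1
    return model


def predict_next(model, prefix, order=3):
    """Return the most frequent continuation of the last `order` characters."""
    counts = model.get(prefix[-order:])
    if not counts:
        return None  # context never seen during training
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    model = train(CORPUS)
    print(predict_next(model, "soy hacke"))  # most likely next character
```

Here the last three characters "cke" were always followed by "r" in the training text, so "r" is the prediction; an unseen context yields no prediction at all.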

Figure 3: Predictive text model trained with probabilities

To solve this problem, the latest model to see the light of day is GPT-3 (with apologies to Google's GShard, which has been public for only a few days), developed by OpenAI, a company co-founded by Elon Musk. It is the starting gun for a new generation of gigantic models (to give you an idea, GPT-3 has 175 billion parameters) developed and trained by large technology companies to be served to users through APIs.

Figure 4: GPT-3 on GitHub

These models have all kinds of uses, and you can find some examples in the book Machine Learning aplicado a Ciberseguridad: Técnicas y ejemplos en la detección de amenazas by our colleagues Fran Ramírez, Carmen Torrano, Sergio Hernández and José Torres. They are so powerful that these APIs are private: you must fill in a form detailing the intended uses together with the possible risks, and then wait to be accepted, since they could be used for purposes that might be considered harmful.

Figure 5: The book Machine Learning aplicado a Ciberseguridad

That may seem a bit exaggerated for a model that only predicts the next letter, right? What has been observed with GPT-3, and was already beginning to show with its little brother GPT-2, is that it is capable of meta-learning, that is, it has learned to learn. This is the result of having been trained on practically all of the text available on the web, and this is where the power of this model lies.

Figure 6: Form to request access


Commonly, a model is developed to solve a specific problem. GPT-3, however, has radically changed this mindset, since it has been trained for a general task and it is the users who have been finding different use cases in which the model performs perfectly. Let's look at some examples in different areas.

Text generation

Text generation is the fundamental problem the model was trained for, so its performance on this task is spectacular. It can generate written content in such a way that the reader does not even notice that the text was produced by an AI model. But be very careful with this, because it can be used to generate disinformation and be used in Fake News.


Mathematical operations

The model is able to predict that after the sequence "3 + 3 = " the most likely character is "6". Does this mean it knows how to add? Not really, it is purely a matter of probability, but it can answer mathematical operations this way... could reliable mathematical calculations be done like this?
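
The difference between recalling a likely continuation and actually adding can be shown with a deliberately naive Python sketch. It is a frequency table over whole prompts, which is an assumption made for illustration and not how GPT-3 works internally; the training lines, including one wrong answer, are invented.

```python
from collections import Counter

# Invented "training text"; note that one line contains a wrong sum.
TRAINING_LINES = ["3 + 3 = 6", "3 + 3 = 6", "3 + 3 = 7", "2 + 2 = 4"]


def build_table(lines):
    """Map each prompt (everything up to the last space) to counts of what followed."""
    table = {}
    for line in lines:
        prompt, answer = line.rsplit(" ", 1)
        table.setdefault(prompt + " ", Counter())[answer] += 1
    return table


def complete(table, prompt):
    """Return the most frequent continuation seen for this exact prompt."""
    counts = table.get(prompt)
    if not counts:
        return None  # never seen: this table cannot compute, only recall
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    table = build_table(TRAINING_LINES)
    print(complete(table, "3 + 3 = "))    # most common continuation in the data
    print(complete(table, "41 + 17 = "))  # unseen sum, nothing to recall
```

The table answers "6" only because that continuation was the most frequent one, and it would answer "7" just as readily if the wrong line had been more common, which is why purely probabilistic answers cannot be trusted as calculations.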

Writing code

GPT-3 is able to write code in different programming languages from a natural-language description, even managing to build web front ends or neural networks.

These are just a few examples of what GPT-3 has been able to achieve in the months it has been in production, and you can see many more examples of GPT-3 applications in this GPT-3 examples link. Nobody knows yet what the limits of these models are or what the next ones to come out will bring but, doesn't keeping the APIs private seem less exaggerated now?

Author: Pablo Saucedo (@psaucedo)
