Text-to-SQL Model Vulnerabilities Found in New Study, Enabling Data Theft and DoS Attacks

Using Text-to-SQL models to generate malicious code, a group of academics has developed innovative techniques that might be used by adversaries to gather confidential data and launch denial-of-service (DoS) attacks.

According to Xutan Peng, a researcher at the University of Sheffield, “a wide range of database applications leverage AI approaches that can transform human questions into SQL queries (particularly Text-to-SQL) to better engage with users.”

“We discovered that crackers can trick Text-to-SQL models into creating malicious code by posing a few cleverly crafted questions. Because this code runs automatically on the database, the results could be quite bad (such as data breaches and denial-of-service assaults).”

Text-to-SQL-Model-Vulnerabilities-Found-in-New-Study-Enabling-Data-Theft-and-DoS-Attacks-1 — Text-to-SQL Model Vulnerabilities Found in New Study, Enabling Data Theft and DoS Attacks

The results represent the first empirical instance where natural language processing (NLP) models have been used as an attack vector in the open, and they were validated against two commercial products BAIDU-UNIT and AI2sql.

Black box attacks are similar to SQL injection flaws in that they incorporate a malicious payload in the input question that is copied to the created SQL query and causes unexpected outcomes.

The study found that the specially designed payloads might be weaponized to execute malicious SQL queries, which would then allow an attacker to change the backend databases and launch DoS attacks against the server.

In addition, a second class of assaults looked into the possibilities of tampering with various pre-trained language models (PLMs), which are models that have been developed using a huge dataset without regard to the use cases they are applied to.

The researchers noted that there are numerous ways to introduce backdoors into PLM-based frameworks by tainting the training samples, including word replacements, the creation of unique prompts, and sentence-style modifications.

With a 100% success rate and no performance impact, backdoor attacks on four separate open-source models (BART-BASE, BART-LARGE, T5-BASE, and T5-3B) were conducted using a corpus tainted with malicious samples. As a result, such problems are difficult to identify in practical settings.

The researchers propose applying strong software engineering standards, evaluating off-the-shelf models to avoid supply chain vulnerabilities, and implementing classifiers to check for suspicious strings in inputs.