Skip to content

Feat: Integrating DQE Testing Approaches into SQLancer first in MySQL #1251

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

luliqwerty
Copy link

This PR is to Integrating DQE (Differential Query Execution) Testing Approaches into SQLancer. More information see DQE paper.

Take MySQL as the first step.
The code is divided into three main parts:

  1. DQEBase.class: defines the core logic of DQE, including the addition and deletion of auxiliary columns, and the comparison of query results.

  2. SQLQueryError.java: provide standardized error representation and comparison functionality for SQL query operations in DQE.

  3. MySQLDQEOracle.class: Inherits from DQEBase and implements the specific logic of executing DQE in MySQL, including generating queries, executing queries, comparing results, etc.

@mrigger
Copy link
Contributor

mrigger commented Jun 30, 2025

Thanks, that's great! Some high-level comments first:

  • for the SQLQueryError, this looks quite similar to ExpectedErrors. Do we really need this class or can we combine these classes?
  • for statements like UPDATE, DELETE, and ALTER TABLE, do you think we can use and extend the generator classes rather than implementing the logic in the DQE oracle?

@luliqwerty
Copy link
Author

Thanks, that's great! Some high-level comments first:

  • for the SQLQueryError, this looks quite similar to ExpectedErrors. Do we really need this class or can we combine these classes?
  • for statements like UPDATE, DELETE, and ALTER TABLE, do you think we can use and extend the generator classes rather than implementing the logic in the DQE oracle?

Regarding the SQLQueryError and ExpectedErrors classes, while they do share some surface-level similarities, I think it's important to highlight that their purposes and usage patterns are quite different in practice:

  • SQLQueryError is specifically designed to represent actual errors returned by the database, with structured fields like error level, code, and message. It plays a crucial role in DQE logic to distinguish between different types of errors returned by statements such as UPDATE, SELECT, or DELETE.
  • On the other hand, ExpectedErrors serves as a declarative mechanism for defining acceptable errors during testing, primarily relying on substring or regex-based matching. It does not require the same level of structural detail as SQLQueryError.

Given these differences, merging them into a single class could lead to confusion, increased coupling, and reduced clarity in terms of responsibility separation. I believe keeping them separate helps maintain a clean architecture and supports better extensibility in the future.

That said, if you have any alternative suggestions or design ideas on how we can better organize this part of the code, I’d be more than happy to discuss them further. Thanks again for your thoughtful feedback! @mrigger @JensonSung

@JensonSung
Copy link
Contributor

I think current implementation is good but specialized for DQE method.

@JensonSung
Copy link
Contributor

Thanks, that's great! Some high-level comments first:

  • for the SQLQueryError, this looks quite similar to ExpectedErrors. Do we really need this class or can we combine these classes?
  • for statements like UPDATE, DELETE, and ALTER TABLE, do you think we can use and extend the generator classes rather than implementing the logic in the DQE oracle?

First, revising ExepctedErrors like SQLQueryError to provide a structured information may help ease to identify errors compared to a string-formed information. While DQE does not distinguish which error is expected but compare whether errors are consistent. A clear name could be required.
Second, it seems that every kind of statements like update and delete deserve a proper generator, preparing for new testing methods.
These two improvements can be put on the agenda.

@luliqwerty luliqwerty deleted the branch sqlancer:main July 4, 2025 12:56
@luliqwerty luliqwerty closed this Jul 4, 2025
@luliqwerty luliqwerty deleted the main branch July 4, 2025 12:56
@luliqwerty luliqwerty restored the main branch July 4, 2025 12:57
@luliqwerty luliqwerty reopened this Jul 4, 2025
boolean generateLimit = false;
int limit = 0;
if (operateOnSingleTable) {
if (Randomly.getBooleanWithSmallProbability()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the expression generator already has methods for generating ORDER BYs, so we should try to re-use those. Could we perhaps also re-use or factor out components from the random query generator?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have add some tests for your mentioned method generateOrderBys() but I got some unexpected errors my repo commit. Sometimes test can't pass with many errors, sometimes this method generated statement like EXISTS (SELECT 1 WHERE FALSE) DESC, which never return rows in SQL statement. I don't know if the methed have some logic errors, or maybe my test have some wrongs, can you give me some suggestions? Thank you very much.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe only generating columns in order by expressions is enough for applying DQE, which can identify the correctness of complex expressions through WHERE clauses.

String selectRowId = String.format("SELECT %s FROM %s", rowId, tableName);
SQLancerResultSet resultSet = new SQLQueryAdapter(selectRowId).executeAndGet(state);
HashSet<String> rows = new HashSet<>();
if (resultSet != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have some auxiliary methods for fetching rows (used by other test oracles). Can we perhaps re-use any of these here?

@mrigger
Copy link
Contributor

mrigger commented Jul 6, 2025

Given these differences, merging them into a single class could lead to confusion, increased coupling, and reduced clarity in terms of responsibility separation. I believe keeping them separate helps maintain a clean architecture and supports better extensibility in the future.

Okay, then let's leave them separate for now. I guess there is a conceptual commonality (both are about errors returned by the database system), but if that does not (yet) make sense at an implementation level, we could try to address this later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants