-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Rust: Model more data flow constructs as calls using MaD #20953
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| private import internal.MethodCallExprImpl | ||
| import codeql.rust.elements.CallExprBase | ||
| import codeql.rust.elements.ArgList | ||
| import codeql.rust.elements.Attr |
Check warning
Code scanning / CodeQL
Redundant import Warning
codeql.rust.elements.ArgList
8a00ac7 to
42734b1
Compare
| /** | ||
| * Index assignments like `a[i] = rhs` are treated as `*a.index_mut(i) = rhs`, | ||
| * so they should in principle be handled by `referenceAssignment`. | ||
| * | ||
| * However, this would require support for [generalized reverse flow][1], which | ||
| * is not yet implemented, so instead we simulate reverse flow where it would | ||
| * have applied via the model for `<_ as core::ops::index::IndexMut>::index_mut`. | ||
| * | ||
| * The same is the case for compound assignments like `a[i] += rhs`, which are | ||
| * treated as `(*a.index_mut(i)).add_assign(rhs)`. | ||
| * | ||
| * [1]: https://github.com/github/codeql/pull/18109 | ||
| */ |
Check warning
Code scanning / CodeQL
Predicate QLDoc style Warning
6f6ef8a to
a534e8a
Compare
| // extract the algorithm name from the type of `ce` or its receiver. | ||
| exists(Type t, TypePath tp | | ||
| t = inferType([ce, ce.(MethodCallExpr).getReceiver()], tp) and | ||
| t = inferType([call, call.(MethodCall).getReceiver()], tp) and |
Check warning
Code scanning / CodeQL
Omittable 'exists' variable Warning
in this argument
2d109fa to
1558c99
Compare
1558c99 to
5a5679b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request enhances Rust data flow analysis by modeling index expressions and operations as method calls using Model-as-Data (MaD) flow summaries. This enables more accurate tracking of taint flow through indexing operations, compound assignments, and arithmetic operations.
Key Changes:
- Index expressions (
x[y]) are now modeled as calls toindex()orindex_mut()methods - Compound assignments (
x += y) now properly track data flow from both operands - Binary operations (
x + y) are modeled as method calls with flow summaries
Reviewed changes
Copilot reviewed 32 out of 33 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
rust/ql/test/library-tests/dataflow/collections/main.rs |
New comprehensive test file for array and custom indexer data flow |
rust/ql/test/library-tests/dataflow/collections/inline-flow.ql |
New test query for the collections test suite |
rust/ql/test/library-tests/dataflow/global/main.rs |
Updated test expectation comment from MISSING to correctly expected flow |
Multiple .expected files |
Updated test expectations reflecting improved data flow analysis with new MaD models |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
In this PR we
x[y], as*x.index(y)or*x.index_mut(y)in data flow. This means we can replace the hard-coded read/store steps for indexers with flow summaries for<_ as core::ops::index::Index>::indexand<_ as core::ops::index::IndexMut>::index_mut, respectively. However, due to a limitation in how we support reverse flow, we still need special casing for assignments where the LHS is an index expression.x += y; previously, the value ofxwas not considered overwritten, and the value ofydid not propagate intox.x + yandx += yas calls, and replace the hard-coded taint steps with flow summaries.Using flow summaries also has the benefit that it applies to cases where the methods are invoked directly instead of through index/operation syntax.
DCA is mostly uneventful.