Skip to content

Python: Modernize Unexpected Raise In Special Method query #20120

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 11 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 0 additions & 16 deletions python/ql/src/Functions/IncorrectRaiseInSpecialMethod.py

This file was deleted.

42 changes: 20 additions & 22 deletions python/ql/src/Functions/IncorrectRaiseInSpecialMethod.qhelp
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ When the expression <code>a + b</code> is evaluated the Python virtual machine w
is not implemented it will call <code>type(b).__radd__(b, a)</code>.</p>
<p>
Since the virtual machine calls these special methods for common expressions, users of the class will expect these operations to raise standard exceptions.
For example, users would expect that the expression <code>a.b</code> might raise an <code>AttributeError</code>
For example, users would expect that the expression <code>a.b</code> may raise an <code>AttributeError</code>
if the object <code>a</code> does not have an attribute <code>b</code>.
If a <code>KeyError</code> were raised instead,
then this would be unexpected and may break code that expected an <code>AttributeError</code>, but not a <code>KeyError</code>.
Expand All @@ -20,50 +20,48 @@ Therefore, if a method is unable to perform the expected operation then its resp
</p>

<ul>
<li>Attribute access, <code>a.b</code>: Raise <code>AttributeError</code></li>
<li>Arithmetic operations, <code>a + b</code>: Do not raise an exception, return <code>NotImplemented</code> instead.</li>
<li>Indexing, <code>a[b]</code>: Raise <code>KeyError</code>.</li>
<li>Hashing, <code>hash(a)</code>: Use <code>__hash__ = None</code> to indicate that an object is unhashable.</li>
<li>Equality methods, <code>a != b</code>: Never raise an exception, always return <code>True</code> or <code>False</code>.</li>
<li>Ordering comparison methods, <code>a &lt; b</code>: Raise a <code>TypeError</code> if the objects cannot be ordered.</li>
<li>Most others: Ideally, do not implement the method at all, otherwise raise <code>TypeError</code> to indicate that the operation is unsupported.</li>
<li>Attribute access, <code>a.b</code> (<code>__getattr__</code>): Raise <code>AttributeError</code>.</li>
<li>Arithmetic operations, <code>a + b</code> (<code>__add__</code>): Do not raise an exception, return <code>NotImplemented</code> instead.</li>
<li>Indexing, <code>a[b]</code> (<code>__getitem__</code>): Raise <code>KeyError</code> or <code>IndexError</code>.</li>
<li>Hashing, <code>hash(a)</code> (<code>__hash__</code>): Should not raise an exception. Use <code>__hash__ = None</code> to indicate that an object is unhashable rather than raising an exception.</li>
<li>Equality methods, <code>a == b</code> (<code>__eq__</code>): Never raise an exception, always return <code>True</code> or <code>False</code>.</li>
<li>Ordering comparison methods, <code>a &lt; b</code> (<code>__lt__</code>): Raise a <code>TypeError</code> if the objects cannot be ordered.</li>
<li>Most others: If the operation is never supported, the method often does not need to be implemented at all; otherwise a <code>TypeError</code> should be raised.</li>
</ul>

</overview>
<recommendation>
<p>If the method is meant to be abstract, then declare it so using the <code>@abstractmethod</code> decorator.
<p>If the method is intended to be abstract, and thus always raise an exception, then declare it so using the <code>@abstractmethod</code> decorator.
Otherwise, either remove the method or ensure that the method raises an exception of the correct type.
</p>

</recommendation>
<example>

<p>
This example shows two unhashable classes. The first class is unhashable in a non-standard way which may cause maintenance problems.
The second, corrected, class uses the standard idiom for unhashable classes.
In the following example, the <code>__add__</code> method of <code>A</code> raises a <code>TypeError</code> if <code>other</code> is of the wrong type.
However, it should return <code>NotImplemented</code> instead of rising an exception, to allow other classes to support adding to <code>A</code>.
This is demonstrated in the class <code>B</code>.
</p>
<sample src="IncorrectRaiseInSpecialMethod.py" />
<sample src="examples/IncorrectRaiseInSpecialMethod.py" />
<p>
In this example, the first class is implicitly abstract; the <code>__add__</code> method is unimplemented,
presumably with the expectation that it will be implemented by sub-classes.
The second class makes this explicit with an <code>@abstractmethod</code> decoration on the unimplemented <code>__add__</code> method.
In the following example, the <code>__getitem__</code> method of <code>C</code> raises a <code>ValueError</code>, rather than a <code>KeyError</code> or <code>IndexError</code> as expected.
</p>
<sample src="IncorrectRaiseInSpecialMethod2.py" />
<sample src="examples/IncorrectRaiseInSpecialMethod2.py" />
<p>
In this last example, the first class implements a collection backed by the file store.
However, should an <code>IOError</code> be raised in the <code>__getitem__</code> it will propagate to the caller.
The second class handles any <code>IOError</code> by reraising a <code>KeyError</code> which is the standard exception for
the <code>__getitem__</code> method.
In the following example, the class <code>__hash__</code> method of <code>D</code> raises <code>NotImplementedError</code>.
This causes <code>D</code> to be incorrectly identified as hashable by <code>isinstance(obj, collections.abc.Hashable)</code>; so the correct
way to make a class unhashable is to set <code>__hash__ = None</code>.
</p>

<sample src="IncorrectRaiseInSpecialMethod3.py" />
<sample src="examples/IncorrectRaiseInSpecialMethod3.py" />


</example>
<references>

<li>Python Language Reference: <a href="http://docs.python.org/dev/reference/datamodel.html#special-method-names">Special Method Names</a>.</li>
<li>Python Library Reference: <a href="https://docs.python.org/2/library/exceptions.html">Exceptions</a>.</li>
<li>Python Library Reference: <a href="https://docs.python.org/3/library/exceptions.html">Exceptions</a>.</li>



Expand Down
228 changes: 151 additions & 77 deletions python/ql/src/Functions/IncorrectRaiseInSpecialMethod.ql
Original file line number Diff line number Diff line change
Expand Up @@ -7,114 +7,188 @@
* error-handling
* @problem.severity recommendation
* @sub-severity high
* @precision very-high
* @precision high
* @id py/unexpected-raise-in-special-method
*/

import python
import semmle.python.ApiGraphs
import semmle.python.dataflow.new.internal.DataFlowDispatch

private predicate attribute_method(string name) {
name = "__getattribute__" or name = "__getattr__" or name = "__setattr__"
/** Holds if `name` is the name of a special method for attribute access such as `a.b`, that should raise an `AttributeError`. */
private predicate attributeMethod(string name) {
name = ["__getattribute__", "__getattr__", "__delattr__"] // __setattr__ excluded as it makes sense to raise different kinds of errors based on the `value` parameter
}

private predicate indexing_method(string name) {
name = "__getitem__" or name = "__setitem__" or name = "__delitem__"
/** Holds if `name` is the name of a special method for indexing operations such as `a[b]`, that should raise a `LookupError`. */
private predicate indexingMethod(string name) {
name = ["__getitem__", "__delitem__"] // __setitem__ excluded as it makes sense to raise different kinds of errors based on the `value` parameter
}

private predicate arithmetic_method(string name) {
name in [
"__add__", "__sub__", "__or__", "__xor__", "__rshift__", "__pow__", "__mul__", "__neg__",
"__radd__", "__rsub__", "__rdiv__", "__rfloordiv__", "__div__", "__rdiv__", "__rlshift__",
"__rand__", "__ror__", "__rxor__", "__rrshift__", "__rpow__", "__rmul__", "__truediv__",
"__rtruediv__", "__pos__", "__iadd__", "__isub__", "__idiv__", "__ifloordiv__", "__idiv__",
"__ilshift__", "__iand__", "__ior__", "__ixor__", "__irshift__", "__abs__", "__ipow__",
"__imul__", "__itruediv__", "__floordiv__", "__div__", "__divmod__", "__lshift__", "__and__"
/** Holds if `name` is the name of a special method for arithmetic operations. */
private predicate arithmeticMethod(string name) {
name =
[
"__add__", "__sub__", "__and__", "__or__", "__xor__", "__lshift__", "__rshift__", "__pow__",
"__mul__", "__div__", "__divmod__", "__truediv__", "__floordiv__", "__matmul__", "__radd__",
"__rsub__", "__rand__", "__ror__", "__rxor__", "__rlshift__", "__rrshift__", "__rpow__",
"__rmul__", "__rdiv__", "__rdivmod__", "__rtruediv__", "__rfloordiv__", "__rmatmul__",
"__iadd__", "__isub__", "__iand__", "__ior__", "__ixor__", "__ilshift__", "__irshift__",
"__ipow__", "__imul__", "__idiv__", "__idivmod__", "__itruediv__", "__ifloordiv__",
"__imatmul__", "__pos__", "__neg__", "__abs__", "__invert__",
]
}

private predicate ordering_method(string name) {
name = "__lt__"
or
name = "__le__"
or
name = "__gt__"
or
name = "__ge__"
or
name = "__cmp__" and major_version() = 2
/** Holds if `name is the name of a special method for ordering operations such as `a < b`. */
private predicate orderingMethod(string name) {
name =
[
"__lt__",
"__le__",
"__gt__",
"__ge__",
]
}

private predicate cast_method(string name) {
name = "__nonzero__" and major_version() = 2
or
name = "__int__"
or
name = "__float__"
or
name = "__long__"
or
name = "__trunc__"
or
name = "__complex__"
/** Holds if `name` is the name of a special method for casting an object to a numeric type, such as `int(x)` */
private predicate castMethod(string name) {
name =
[
"__int__",
"__float__",
"__index__",
"__trunc__",
"__complex__"
] // __bool__ excluded as it makes sense to allow it to always raise
}

predicate correct_raise(string name, ClassObject ex) {
ex.getAnImproperSuperType() = theTypeErrorType() and
/** Holds if we allow a special method named `name` to raise `exec` as an exception. */
predicate correctRaise(string name, Expr exec) {
execIsOfType(exec, "TypeError") and
(
name = "__copy__" or
name = "__deepcopy__" or
name = "__call__" or
indexing_method(name) or
attribute_method(name)
indexingMethod(name) or
attributeMethod(name) or
// Allow add methods to raise a TypeError, as they can be used for sequence concatenation as well as arithmetic
name = ["__add__", "__iadd__", "__radd__"]
)
or
preferred_raise(name, ex)
or
preferred_raise(name, ex.getASuperType())
exists(string execName |
preferredRaise(name, execName, _) and
execIsOfType(exec, execName)
)
}

predicate preferred_raise(string name, ClassObject ex) {
attribute_method(name) and ex = theAttributeErrorType()
or
indexing_method(name) and ex = Object::builtin("LookupError")
or
ordering_method(name) and ex = theTypeErrorType()
or
arithmetic_method(name) and ex = Object::builtin("ArithmeticError")
or
name = "__bool__" and ex = theTypeErrorType()
/** Holds if it is preferred for `name` to raise exceptions of type `execName`. `message` is the alert message. */
predicate preferredRaise(string name, string execName, string message) {
attributeMethod(name) and
execName = "AttributeError" and
message = "should raise an AttributeError instead."
or
indexingMethod(name) and
execName = "LookupError" and
message = "should raise a LookupError (KeyError or IndexError) instead."
or
orderingMethod(name) and
execName = "TypeError" and
message = "should raise a TypeError or return NotImplemented instead."
or
arithmeticMethod(name) and
execName = "ArithmeticError" and
message = "should raise an ArithmeticError or return NotImplemented instead."
or
name = "__bool__" and
execName = "TypeError" and
message = "should raise a TypeError instead."
}

predicate no_need_to_raise(string name, string message) {
name = "__hash__" and message = "use __hash__ = None instead"
or
cast_method(name) and message = "there is no need to implement the method at all."
/** Holds if `exec` is an exception object of the type named `execName`. */
predicate execIsOfType(Expr exec, string execName) {
// Might make sense to have execName be an IPA type here. Or part of a more general API modeling builtin/stdlib subclass relations.
exists(string subclass |
execName = "TypeError" and
subclass = "TypeError"
or
execName = "LookupError" and
subclass = ["LookupError", "KeyError", "IndexError"]
or
execName = "ArithmeticError" and
subclass = ["ArithmeticError", "FloatingPointError", "OverflowError", "ZeroDivisionError"]
or
execName = "AttributeError" and
subclass = "AttributeError"
|
exec = API::builtin(subclass).getACall().asExpr()
or
exec = API::builtin(subclass).getASubclass().getACall().asExpr()
)
}

/**
* Holds if `meth` need not be implemented if it always raises. `message` is the alert message, and `allowNotImplemented` is true
* if we still allow the method to always raise `NotImplementedError`.
*/
predicate noNeedToAlwaysRaise(Function meth, string message, boolean allowNotImplemented) {
meth.getName() = "__hash__" and
message = "use __hash__ = None instead." and
allowNotImplemented = false
or
castMethod(meth.getName()) and
message = "this method does not need to be implemented." and
allowNotImplemented = true and
// Allow an always raising cast method if it's overriding other behavior
not exists(Function overridden |
overridden.getName() = meth.getName() and
overridden.getScope() = getADirectSuperclass+(meth.getScope()) and
not alwaysRaises(overridden, _)
)
}

/** Holds if `func` has a decorator likely marking it as an abstract method. */
predicate isAbstract(Function func) { func.getADecorator().(Name).getId().matches("%abstract%") }

/** Holds if `f` always raises the exception `exec`. */
predicate alwaysRaises(Function f, Expr exec) {
directlyRaises(f, exec) and
strictcount(Expr e | directlyRaises(f, e)) = 1 and
not exists(f.getANormalExit())
}

predicate is_abstract(FunctionObject func) {
func.getFunction().getADecorator().(Name).getId().matches("%abstract%")
/** Holds if `f` directly raises `exec` using a `raise` statement. */
predicate directlyRaises(Function f, Expr exec) {
exists(Raise r |
r.getScope() = f and
exec = r.getException() and
exec instanceof Call
)
}

predicate always_raises(FunctionObject f, ClassObject ex) {
ex = f.getARaisedType() and
strictcount(f.getARaisedType()) = 1 and
not exists(f.getFunction().getANormalExit()) and
/* raising StopIteration is equivalent to a return in a generator */
not ex = theStopIterationType()
/** Holds if `exec` is a `NotImplementedError`. */
predicate isNotImplementedError(Expr exec) {
exec = API::builtin("NotImplementedError").getACall().asExpr()
}

from FunctionObject f, ClassObject cls, string message
/** Gets the name of the builtin exception type `exec` constructs, if it can be determined. */
string getExecName(Expr exec) { result = exec.(Call).getFunc().(Name).getId() }

from Function f, Expr exec, string message
where
f.getFunction().isSpecialMethod() and
not is_abstract(f) and
always_raises(f, cls) and
f.isSpecialMethod() and
not isAbstract(f) and
directlyRaises(f, exec) and
(
no_need_to_raise(f.getName(), message) and not cls.getName() = "NotImplementedError"
exists(boolean allowNotImplemented, string subMessage |
alwaysRaises(f, exec) and
noNeedToAlwaysRaise(f, subMessage, allowNotImplemented) and
(allowNotImplemented = true implies not isNotImplementedError(exec)) and // don't alert if it's a NotImplementedError and that's ok
message = "This method always raises $@ - " + subMessage
)
or
not correct_raise(f.getName(), cls) and
not cls.getName() = "NotImplementedError" and
exists(ClassObject preferred | preferred_raise(f.getName(), preferred) |
message = "raise " + preferred.getName() + " instead"
not isNotImplementedError(exec) and
not correctRaise(f.getName(), exec) and
exists(string subMessage | preferredRaise(f.getName(), _, subMessage) |
if alwaysRaises(f, exec)
then message = "This method always raises $@ - " + subMessage
else message = "This method raises $@ - " + subMessage
)
)
select f, "Function always raises $@; " + message, cls, cls.toString()
select f, message, exec, getExecName(exec)
15 changes: 0 additions & 15 deletions python/ql/src/Functions/IncorrectRaiseInSpecialMethod2.py

This file was deleted.

Loading