Cheuk Ting Ho: I Hate Writing Tests, That’s Why I Use Hypothesis | PyData Tel Aviv 2022 - 2023

Details

Title: Cheuk Ting Ho: I Hate Writing Tests, That’s Why I Use Hypothesis | PyData Tel Aviv 2022
Author(s): PyData
Link(s): https://www.youtube.com/watch?v=__DfM6R4nVs

Rough Notes

The difficult parts of writing tests include:

Importing the code.
Organizing the tests.
Thinking of the test cases.

Hypothesis helps with the last 2.

Property-based testing: easiest to understand by contrast with testing by example.

Testing by property	Testing by example
The given is obvious (e.g. "it is a number")	Need to think of which inputs are valid and which are not
Works extra well with typing	Takes extra steps to write examples
Edge cases are found automatically	May overlook edge cases
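
To illustrate the "works extra well with typing" row, here is a minimal sketch that lets Hypothesis build inputs from type annotations via st.from_type. The Point dataclass and the mirror function are hypothetical stand-ins, not code from the talk.

```python
from dataclasses import dataclass

from hypothesis import given
from hypothesis import strategies as st


@dataclass(frozen=True)
class Point:
    # Hypothetical example type; Hypothesis reads these annotations.
    x: int
    y: int


def mirror(p: Point) -> Point:
    """Hypothetical function under test: reflect a point across the y axis."""
    return Point(-p.x, p.y)


# from_type inspects the annotations on Point and generates integers for x and y.
@given(st.from_type(Point))
def test_mirror_twice_is_identity(p):
    assert mirror(mirror(p)) == p
```

No example points had to be written by hand; the only thing stated is the property that mirroring twice gives back the original point.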

Hypothesis does this with Python decorators: the @given decorator wraps your test so that it takes an input argument, and the values for that argument are drawn from a strategy you specify. A strategy generates test data. For example, if we have encode and decode functions for strings in some format:

from hypothesis import given
from hypothesis.strategies import text

# encode and decode are the functions under test, imported from your own code.
@given(text())
def test_decode_inverts_encode(s):
    assert decode(encode(s)) == s
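
The snippet above leaves encode and decode undefined. As a self-contained sketch of the same round-trip property, the version below assumes a simple run-length encoding; both functions are hypothetical placeholders chosen only to make the test runnable.

```python
from hypothesis import given
from hypothesis.strategies import text


def encode(s: str) -> list:
    """Hypothetical run-length encoder: "aab" becomes [("a", 2), ("b", 1)]."""
    runs = []
    for ch in s:
        if runs and runs[-1][0] == ch:
            # Same character as the previous run: extend that run.
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs


def decode(runs: list) -> str:
    """Inverse of encode: expand each (character, count) pair."""
    return "".join(ch * count for ch, count in runs)


# text() generates arbitrary strings, including the empty string.
@given(text())
def test_decode_inverts_encode(s):
    assert decode(encode(s)) == s
```

Running this under pytest, or calling test_decode_inverts_encode() directly, makes Hypothesis try many generated strings rather than a handful of hand-picked ones.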

For scientific computing, there are NumPy strategies, which require a separate installation via pip install hypothesis[numpy]. They are located in hypothesis.extra.numpy, which provides strategies for scalar and array dtypes as well as for arrays themselves.
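
A minimal sketch of the NumPy strategies, assuming NumPy is installed alongside Hypothesis. The property being checked (sorting is idempotent) is an illustrative choice, not one from the talk.

```python
import numpy as np
from hypothesis import given
from hypothesis.extra.numpy import array_shapes, arrays


# arrays() generates ndarrays; array_shapes() generates the shape tuples,
# here limited to small 1-D or 2-D arrays to keep the test fast.
@given(arrays(dtype=np.int64, shape=array_shapes(max_dims=2, max_side=5)))
def test_sort_is_idempotent(a):
    once = np.sort(a)
    assert np.array_equal(np.sort(once), once)
```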

There are also pandas strategies, located in hypothesis.extra.pandas. They are installed via pip install hypothesis[pandas]; since pandas itself depends on NumPy, this also brings in NumPy.
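
A minimal sketch of the pandas strategies, assuming pandas is installed. The column names and the sorting property are hypothetical and only serve to show how a data frame strategy is put together.

```python
from hypothesis import given
from hypothesis import strategies as st
from hypothesis.extra.pandas import column, data_frames, range_indexes


@given(
    data_frames(
        columns=[
            # Hypothetical columns: an integer id and a bounded float score.
            column("user_id", dtype=int),
            column(
                "score",
                dtype=float,
                elements=st.floats(min_value=0, max_value=100),
            ),
        ],
        index=range_indexes(max_size=20),
    )
)
def test_sort_keeps_all_rows(df):
    result = df.sort_values("score")
    # Sorting reorders rows but must not add or drop any.
    assert len(result) == len(df)
    assert sorted(result.index) == sorted(df.index)
```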

If you are lazy, there is a ghostwriter that writes tests for you, using your type hints as a guide. The ghostwriter also picks suitable strategies for you and comes with a CLI, e.g. hypothesis write gzip. It also formats the generated code automatically using Black.

The ghostwriter can generate the following kinds of tests, among others:

Fuzz, i.e. check that valid input only leads to expected exceptions.
Idempotency, i.e. the result does not change when the function is applied to its own output.
Roundtrip, i.e. applying the 2nd function to the result of the 1st one gives back the original input.
Equivalence, i.e. check that the 1st function has the same effect as the 2nd function, helpful when reinventing a function.
Binary operations, i.e. test properties of binary operators such as associativity and commutativity.
Ufunc, for testing NumPy ufuncs (universal functions on arrays).
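
To make a few of these properties concrete, here is a hand-written sketch of idempotency, equivalence, and binary-operation tests. The functions normalize and my_max are hypothetical examples; the ghostwriter would generate similar test skeletons for your own functions.

```python
from hypothesis import given
from hypothesis import strategies as st


def normalize(s: str) -> str:
    """Hypothetical function: collapse runs of whitespace into single spaces."""
    return " ".join(s.split())


def my_max(xs: list) -> int:
    """Hypothetical reimplementation of the built-in max."""
    best = xs[0]
    for x in xs[1:]:
        if x > best:
            best = x
    return best


# Idempotency: applying the function to its own output changes nothing.
@given(st.text())
def test_normalize_is_idempotent(s):
    once = normalize(s)
    assert normalize(once) == once


# Equivalence: the reimplementation agrees with the reference function.
@given(st.lists(st.integers(), min_size=1))
def test_my_max_matches_builtin(xs):
    assert my_max(xs) == max(xs)


# Binary operation: integer addition is commutative and associative.
@given(st.integers(), st.integers(), st.integers())
def test_addition_properties(a, b, c):
    assert a + b == b + a
    assert (a + b) + c == a + (b + c)
```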

Some points to consider:

Tests can run slower, since generating and running many examples per test is expensive.
Tests can be harder to understand.
If there are no type hints, the ghostwriter cannot do much.
Cannot test whether a machine learning model is doing what it is supposed to do.
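
On the speed point, one possible mitigation is to lower the number of generated examples with the settings decorator. This is a sketch; the value 25 is an arbitrary assumption (the default is 100), and fewer examples also means fewer chances to hit an edge case.

```python
from hypothesis import given, settings
from hypothesis import strategies as st


# max_examples caps how many inputs are generated for this test;
# deadline=None disables the per-example time limit.
@settings(max_examples=25, deadline=None)
@given(st.lists(st.integers()))
def test_sorted_output_is_ordered(xs):
    result = sorted(xs)
    assert all(a <= b for a, b in zip(result, result[1:]))
```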