Huanfa Chen - huanfa.chen@ucl.ac.uk
23/03/2026
Image Credit: https://madewithml.com/courses/mlops/testing/
print() statements for testing
unittest, pytest, or nose2
Arrange: set up the test data and environment
Act: execute the code being tested
Assert: verify that the output is as expected
tests directory, with test files named test_*.py
test_*
assert in Python?
assert is a keyword in Python used for debugging purposes
AssertionError
@pytest.mark.parametrize to run the same test with different inputs
@pytest.fixture to set up reusable test data or state
tests/
code/
@pytest.mark.<name> to label tests and create groups
The Great Expectations library lets you define expectations and validate data against them

# tests/data/test_dataset.py
def test_dataset(df):
    """Test dataset quality and integrity."""
    column_list = ["id", "created_on", "title", "description", "tag"]
    df.expect_table_columns_to_match_ordered_list(column_list=column_list)  # schema adherence
    tags = ["computer-vision", "natural-language-processing", "mlops", "other"]
    df.expect_column_values_to_be_in_set(column="tag", value_set=tags)  # expected labels
    df.expect_compound_columns_to_be_unique(column_list=["title", "description"])  # data leaks
    df.expect_column_values_to_not_be_null(column="tag")  # missing values
    df.expect_column_values_to_be_unique(column="id")  # unique values
    df.expect_column_values_to_be_of_type(column="title", type_="str")  # type adherence

    # Expectation suite
    expectation_suite = df.get_expectation_suite(discard_failed_expectations=False)
    results = df.validate(expectation_suite=expectation_suite, only_return_failures=True).to_json_dict()
    assert results["success"]

expect_column_pair_values_a_to_be_greater_than_b
expect_column_mean_to_be_between

© CASA | ucl.ac.uk/bartlett/casa
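
The pytest features listed in these slides (Arrange/Act/Assert, assert, @pytest.mark.parametrize, @pytest.fixture, and @pytest.mark.<name>) can be sketched together in one small test file. This is a minimal illustration, not code from the course: the function normalize_tag, the fixture known_tags, and the marker name slow are all hypothetical names chosen for the example.

```python
import pytest


def normalize_tag(tag: str) -> str:
    """Hypothetical function under test: trim, lower-case, and hyphenate a tag."""
    return tag.strip().lower().replace(" ", "-")


@pytest.fixture
def known_tags():
    """Reusable test data; pytest injects it into any test that names it as an argument."""
    return {"computer-vision", "natural-language-processing", "mlops", "other"}


def test_normalize_tag_basic():
    # Arrange: set up the test data
    raw = "  Computer Vision "
    # Act: execute the code being tested
    result = normalize_tag(raw)
    # Assert: a failing assert raises AssertionError with this message
    assert result == "computer-vision", f"unexpected tag: {result!r}"


@pytest.mark.parametrize(
    "raw, expected",
    [
        ("MLOps", "mlops"),
        ("Natural Language Processing", "natural-language-processing"),
        (" other ", "other"),
    ],
)
def test_normalize_tag_parametrized(raw, expected):
    # The same test body runs once per (raw, expected) pair
    assert normalize_tag(raw) == expected


def test_normalized_tag_is_known(known_tags):
    # known_tags is supplied by the fixture above
    assert normalize_tag("Computer Vision") in known_tags


@pytest.mark.slow  # hypothetical marker name used to group longer-running tests
def test_normalize_many_tags():
    tags = ["MLOps"] * 10_000
    assert all(normalize_tag(t) == "mlops" for t in tags)
```

Assuming the file is saved under the tests directory with a test_*.py name, running pytest collects all four tests. The marked group can be selected with pytest -m slow or skipped with pytest -m "not slow". Custom markers should be registered under the markers option in the pytest configuration file, otherwise pytest emits a warning about an unknown mark.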