array_contains is currently marked as Compatible in Comet, but the null handling behavior should be verified to ensure it matches Spark's three-valued logic.
According to Spark's array_contains behavior:
- true if the value is found in the array
- false if no match found AND no null elements exist
- null if no match found BUT null elements exist (indeterminate result)
- null if search value is null

Examples:
SELECT array_contains(array(1, 2, 3), 2);
-- Spark returns: true
SELECT array_contains(array(1, 2, 3), 5);
-- Spark returns: false
SELECT array_contains(array(1, null, 3), 2);
-- Spark returns: null (no match, but null element exists - indeterminate)
SELECT array_contains(array(1, null, 3), 1);
-- Spark returns: true (found match)
SELECT array_contains(array(1, 2, 3), null);
-- Spark returns: null (search value is null)
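The rules and examples above can be summarized as a small reference model. This is only a sketch for reasoning about expected results: `ArrayContainsModel` and `arrayContains` are hypothetical names, not part of Spark, Comet, or DataFusion, and SQL NULL is modelled as `None`.

```scala
// Hypothetical reference model of the three-valued semantics described above.
// Not Spark, Comet, or DataFusion code; None stands in for SQL NULL.
object ArrayContainsModel {
  def arrayContains(elems: Seq[Option[Int]], value: Option[Int]): Option[Boolean] =
    value match {
      case None => None                               // search value is null
      case Some(v) =>
        if (elems.contains(Some(v))) Some(true)       // match found
        else if (elems.contains(None)) None           // no match, but a null element exists
        else Some(false)                              // no match and no null elements
    }
}
```

Under this model, `arrayContains(Seq(Some(1), None, Some(3)), Some(2))` evaluates to `None`, which corresponds to the `array(1, null, 3)` example where Spark returns null rather than false.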
Comet uses DataFusion's array_has function:
val arrayContainsScalarExpr =
  scalarFunctionExprToProto("array_has", arrayExprProto, keyExprProto)
The cases to verify are:
- array_contains(array(1, null, 3), 2) - should return null, not false
- array_contains(array(1, 2, 3), null) - should return null

If DataFusion's array_has doesn't implement three-valued logic, array_contains should be marked as Incompatible.

The test file includes null tests:
checkSparkAnswerAndOperator(sql(s"SELECT array_contains(a, cast(null as $typeName)) FROM t2"))
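But the test shown above only exercises a null search value. We should also verify the specific three-valued case where the array contains a null element and the search value is not found.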
Note: This issue was generated with AI assistance.