Skip to content

JSON cast not properly interpreted in bigquery #11753

@aandrestrumid

Description

@aandrestrumid

What happened?

When it comes to manipulating JSON columns in SQL, there are differences some compatibility with the bigquery syntax.

Here's an example where I parse a string that contains json, and try to extract one field as an int.

First I get ibis to generate it:

import ibis
import ibis.expr.datatypes as idt


def test_ibis_bigquery_json_cast_generates_parse_json():
    schema = {
        'json_string_col': idt.string,  # This is a STRING, not JSON type
    }
    table = ibis.table(schema, name='test_table')

    result = table.select(
        hello=table.json_string_col.cast(idt.json)['hello'].cast(idt.float64)
    )

    bigquery_sql = ibis.to_sql(result, dialect='bigquery')
    assert str(bigquery_sql) == (
        'SELECT\n'
        "  CAST(CAST(`t0`.`json_string_col` AS JSON)['hello'] AS FLOAT64) AS `hello`\n"
        'FROM `test_table` AS `t0`'
    )

I get

SELECT
  CAST(CAST(`t0`.`json_string_col` AS JSON)['hello'] AS FLOAT64) AS `hello`
FROM `test_table` AS `t0`

If I now try to run something similar in big query I get this error: Invalid cast from STRING to JSON at

Here's an end to end SQL example, with the correct syntax which uses:

  • PARSE_JSON instead of cast
  • JSON_VALUE instead of [], with a query string ($.hello)
  • SAFE_CAST to safely cast the json scalar
WITH sample_data AS (
  SELECT '{"foo": "bar", "hello": 1}' AS json_string_col
)
SELECT
  json_string_col,
  --CAST(CAST(json_string_col AS JSON)['hello'] AS FLOAT64) AS `hello`, -- Invalid cast from STRING to JSON at [7:13]
  SAFE_CAST(JSON_VALUE(PARSE_JSON(json_string_col), '$.hello') AS INT64) as hello,
FROM sample_data;

I'm a bit new to ibis, is it reasonable expectation for it to handle these non-standard syntax from big query?
If not is there a workaround to write my own expressions and dispatch them based on the engine (I'm mainly doing json parse, Get field, cast field)?

What version of ibis are you using?

ibis-framework==11.0.0

What backend(s) are you using, if any?

bigquery

Relevant log output

Code of Conduct

  • I agree to follow this project's Code of Conduct

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugIncorrect behavior inside of ibis

    Type

    No type

    Projects

    Status

    backlog

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions