If you’ve been working with Athena engine v3 and Apache Iceberg tables managed by Amazon Security Lake, you may have hit an error like “ICEBERG_CURSOR_ERROR: Failed to read Parquet file.” It surfaces during query operations that read the data portion of the Parquet files, such as GROUP BY, UNNEST, or column projections. Notably, manifest-only queries still work fine, because they never open the data files themselves.
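To make the symptom concrete, here is a minimal sketch of the two query shapes, submitted through boto3. The table name, partition column, and workgroup are hypothetical placeholders; substitute your own Security Lake table.

```python
# Hypothetical table name -- substitute your own Security Lake table.
TABLE = "amazon_security_lake_table_us_east_1_vpc_flow"

# Manifest-only query: reads Iceberg metadata, not the data files,
# so it typically succeeds even on affected tables.
MANIFEST_ONLY = f'SELECT file_path, record_count FROM "{TABLE}$files" LIMIT 10'

# Data-reading query: forces Athena to open the Parquet files,
# which is where ICEBERG_CURSOR_ERROR surfaces.
DATA_READ = f"""
SELECT accountid, COUNT(*) AS events
FROM {TABLE}
WHERE eventday = '20240101'
GROUP BY accountid
"""


def run_athena(sql: str, workgroup: str = "primary") -> str:
    """Submit a query to Athena and return its execution id (sketch)."""
    import boto3  # requires AWS credentials at call time

    client = boto3.client("athena")
    resp = client.start_query_execution(QueryString=sql, WorkGroup=workgroup)
    return resp["QueryExecutionId"]
```

Running `run_athena(MANIFEST_ONLY)` and `run_athena(DATA_READ)` side by side is a quick way to confirm that only the data-reading path fails.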
The problem affects multiple sources and shows up across your data regardless of account or date partition. The affected files are all Parquet format version 2, managed by Lake Formation, and look healthy when inspected with tools like pyarrow: they open and read quickly, and their schemas are correct. Athena, however, fails on them, particularly over wider date ranges, even though smaller or older reads succeed.
So what’s going on? Here are some key points:
– The Parquet files are not corrupted; they open normally with pyarrow.
– The Iceberg metadata appears correct; its references to the affected files are coherent.
– The compactor regularly touches these files, indicating they are actively being rewritten, but the problem persists.
– Several attempts to fix or work around this, such as running optimize operations or snapshot queries, are either blocked or don’t resolve the problem.
– Modifying table properties to disable vectorization or excluding certain partitions temporarily helps but isn’t reliable long-term.
– There are operational restrictions because these tables are managed by Lake Formation, which prevents running certain DDL operations like ALTER or OPTIMIZE directly.
Given this scenario, here are approaches and questions you might consider:
First, confirm whether AWS has acknowledged any bugs or regressions related to engine v3’s ability to read certain Parquet files. It’s good to check if others are experiencing the same symptoms, which can help identify whether the problem is widespread or specific to your setup.
If DDL operations are not an option because of the managed nature of your tables, a common workaround involves creating an external table that references the same S3 files. While operationally cumbersome, it can bypass the issue because external tables don’t rely on Iceberg’s metadata or compaction processes.
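A rough sketch of that workaround, again with hypothetical bucket, prefix, and column names, is a plain Hive-style external table pointed at the same S3 location, followed by a partition load:

```python
# Hypothetical bucket/prefix and columns -- adjust to your lake layout.
EXTERNAL_TABLE_DDL = """
CREATE EXTERNAL TABLE IF NOT EXISTS sec_lake_bypass (
  accountid string,
  time      bigint
)
PARTITIONED BY (eventday string)
STORED AS PARQUET
LOCATION 's3://aws-security-data-lake-example/ext/1.0/'
"""

# After creating the table, load its partitions so Athena can see them:
REPAIR = "MSCK REPAIR TABLE sec_lake_bypass"
```

One caveat worth stating: reading an Iceberg table's data directory directly bypasses snapshot filtering, so files superseded by deletes or rewrites may reappear in results. That is part of why this approach is operationally cumbersome and best treated as a stopgap.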
Finally, keep an eye on updates from AWS or broader community discussions. Sometimes, a fix or a workaround is released after a known issue is identified. Meanwhile, gathering detailed logs, Athena execution traces, or pyarrow outputs can be helpful if you need to escalate the issue or seek specialized support.
In summary, if you’re running into this error, it’s likely a regression or bug that needs AWS’s attention. Using external tables as a workaround may help in the short term, but staying informed about updates and sharing your experience with AWS support or community forums can help push towards a more permanent solution.