If you’re using an S3 Access Point to control access to your bucket, you might think that you don’t need to set explicit permissions on the bucket for your application or users. After all, the access point policy should handle everything. In practice, especially with AWS Glue, the runtime does not always honor that delegation.
When AWS Glue runs an ETL job, it creates several Spark executors that connect directly to Amazon S3. These executors use the Hadoop S3A client to access storage. The problem is that this client doesn’t always recognize or follow the access point’s delegated policies when Requester Pays is enabled. Instead of trusting the access point, it still checks permissions against the underlying bucket ARN. This mismatch can lead to a 403 Access Denied error even when your access point is set up correctly.
To work around this, there are a few steps you can take. First, even if you have delegation in place, give your Glue job role explicit read permissions on the bucket. Grant only the specific actions s3:GetObject, s3:GetObjectVersion, and s3:ListBucket, and restrict them to the bucket’s ARN (s3:ListBucket applies to the bucket ARN itself, while the two object actions apply to the objects under it). This helps when Glue’s runtime verifies permissions against the bucket directly.
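As a rough illustration of the bucket-level grant described above, here is a minimal Python sketch that builds such an IAM policy document. The function name and the bucket name `example-data-bucket` are hypothetical placeholders, not values from your setup. The split between the bucket ARN and the `/*` object ARN follows how IAM scopes these S3 actions.

```python
import json


def bucket_read_policy(bucket_name: str) -> dict:
    """Build a read-only IAM policy document for one bucket (illustrative helper)."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # s3:ListBucket is a bucket-level action, so it targets the bucket ARN.
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Object-level reads target the objects under the bucket.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:GetObjectVersion"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # "example-data-bucket" is a placeholder; substitute your own bucket name.
    print(json.dumps(bucket_read_policy("example-data-bucket"), indent=2))
```

The printed JSON could then be attached to the Glue job role as an inline or managed policy.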
Next, make sure your Glue job role has the necessary permissions to call s3:GetAccessPoint and s3:GetAccessPointPolicy. These are sometimes overlooked but are essential because they allow Glue to resolve access point URLs properly at startup.
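The access point permissions mentioned above could be expressed as one more policy statement. This is only a sketch: the helper name and the region, account ID, and access point name in the example are made-up placeholders, and the statement assumes you want to scope the two actions to a single access point ARN.

```python
def access_point_lookup_statement(region: str, account_id: str, access_point_name: str) -> dict:
    """Build an IAM statement allowing access point lookups (illustrative helper)."""
    access_point_arn = (
        f"arn:aws:s3:{region}:{account_id}:accesspoint/{access_point_name}"
    )
    return {
        "Sid": "ResolveAccessPoint",
        "Effect": "Allow",
        "Action": ["s3:GetAccessPoint", "s3:GetAccessPointPolicy"],
        "Resource": [access_point_arn],
    }


if __name__ == "__main__":
    # All three arguments are placeholders for illustration only.
    print(access_point_lookup_statement("us-east-1", "111122223333", "example-ap"))
```

The returned dictionary can be appended to the `Statement` list of the policy attached to the Glue job role.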
Another important detail is the structure of the URI you use for the access point. The Hadoop client expects the full ARN or alias in a specific format, such as:
s3://
Using just s3://accesspoint/
Additionally, you should enable the requester pays option globally for your job by setting these configurations:
spark._jsc.hadoopConfiguration().set("fs.s3.useRequesterPaysHeader", "true")
spark._jsc.hadoopConfiguration().set("fs.s3a.requester.pays", "true")
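If you would rather supply these settings as Spark properties than set them in the script, Spark copies any property prefixed with `spark.hadoop.` into the Hadoop configuration. The sketch below only performs that key translation; the helper and constant names are hypothetical, and how you pass the resulting properties to your Glue job depends on your job setup, so treat that part as an assumption to verify.

```python
# The two Hadoop settings from the answer above, kept exactly as given there.
REQUESTER_PAYS_HADOOP_SETTINGS = {
    "fs.s3.useRequesterPaysHeader": "true",
    "fs.s3a.requester.pays": "true",
}


def as_spark_conf(hadoop_settings: dict) -> dict:
    """Prefix Hadoop keys with 'spark.hadoop.' so Spark forwards them to Hadoop."""
    return {f"spark.hadoop.{key}": value for key, value in hadoop_settings.items()}


if __name__ == "__main__":
    for key, value in as_spark_conf(REQUESTER_PAYS_HADOOP_SETTINGS).items():
        print(f"{key}={value}")
```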
In newer Glue versions (like 5.0), which use updated AWS SDKs, handling of requester pays buckets and access points is better synchronized. If you’re running an older version, upgrading can help improve this experience.
In summary, even if access point delegation seems straightforward, some client behaviors require you to grant explicit permissions at the bucket level for Glue to work smoothly. This is more of a client-side handling issue than a misconfigured access point policy. Keep an eye on the official documentation for updates and best practices, as AWS continues improving how these features work together.
You can find detailed info on using S3 access points with Glue, understanding requester pays buckets, and configuring your jobs in the official AWS docs. Just remember, this setup can be nuanced, and granting explicit permissions, even redundantly, helps prevent access issues.




