AWS Glue Python shell job timeout with custom libraries

This is a short post on a timeout error I hit when using a custom library with an AWS Glue Python shell job. I followed the steps listed in the AWS docs to create a custom library and submitted the job with a timeout of 5 minutes. The job timed out without any errors in the logs. The CloudWatch log reported the following messages:

2020-06-13T12:02:28.821+05:30 Installed /glue/lib/installation/redshift_utils-0.1-py3.7.egg
2020-06-13T12:02:28.822+05:30 Processing dependencies for redshift-utils==0.1
2020-06-13T12:12:45.550+05:30 Searching for redshift-module==0.1
2020-06-13T12:12:45.550+05:30 Reading

While searching for the error, I came across this AWS Forum post, which recommended using Python 3.6. I went back to the documentation, and it confirmed that AWS Glue Python shell jobs are compatible with Python 2.7 and 3.6. I had been using a Python 3.7 virtualenv for my testing, which is why the egg in the log above carries the py3.7 suffix, so the egg had to be rebuilt with Python 3.6.
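The mismatch is visible in the egg file name itself, so it can be caught before the job is submitted. Below is a minimal sketch, assuming the supported versions are the two named in this post (2.7 and 3.6). The helper names `egg_python_version` and `is_supported_egg` are hypothetical and are not part of any AWS tooling.

```python
import re

# Assumption: the versions this post says Glue Python shell jobs supported.
SUPPORTED_PYTHONS = {"2.7", "3.6"}


def egg_python_version(egg_name):
    """Return the 'X.Y' Python tag from an egg file name, or None if absent."""
    # Matches e.g. 'redshift_utils-0.1-py3.7.egg', with an optional platform tag.
    match = re.search(r"-py(\d+\.\d+)(?:-.+)?\.egg$", egg_name)
    return match.group(1) if match else None


def is_supported_egg(egg_name, supported=SUPPORTED_PYTHONS):
    """True if the egg was built for one of the supported Python versions."""
    return egg_python_version(egg_name) in supported
```

For the egg from the log, `is_supported_egg("redshift_utils-0.1-py3.7.egg")` returns `False`, which flags the problem without waiting for the job to time out.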

To manage multiple environments easily, I installed Miniconda on my Mac, which makes it possible to create virtual environments with different Python versions. After installation, I created a new Python 3.6 environment with conda and built the egg file:

conda create -n venv36 python=3.6
conda activate venv36
python setup.py bdist_egg
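To avoid building from the wrong environment again, a small guard can check the interpreter before packaging. This is only a sketch: `check_interpreter` and `REQUIRED` are hypothetical names, and the 3.6 target is taken from this post rather than queried from AWS.

```python
import sys

# Assumption: target version taken from this post, not queried from AWS.
REQUIRED = (3, 6)


def check_interpreter(version_info=sys.version_info, required=REQUIRED):
    """Raise if the interpreter's major.minor version differs from `required`."""
    current = tuple(version_info[:2])
    if current != required:
        raise RuntimeError(
            "Build the egg with Python %d.%d, not %d.%d" % (required + current)
        )
    return current
```

Calling `check_interpreter()` at the top of setup.py would make the build fail fast if the conda environment has not been activated.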

Amit Bansal

Experienced professional with 16 years of expertise in database technologies. In-depth knowledge of the design and implementation of Disaster Recovery / HA solutions, database migrations, performance tuning, and creating technical solutions. Skills: Oracle, MySQL, PostgreSQL, Aurora, AWS, Redshift, Hadoop (Cloudera), Elasticsearch, Python
