AWS Glue Python shell job timeout with custom libraries

This is a short post on the timeout errors I ran into when using custom libraries with an AWS Glue Python shell job. I followed the steps listed in the AWS docs to create a custom library and submitted the job with a timeout of 5 minutes. The job timed out without any errors in the logs. The CloudWatch log reported the following messages:


2020-06-13T12:02:28.821+05:30 Installed /glue/lib/installation/redshift_utils-0.1-py3.7.egg
2020-06-13T12:02:28.822+05:30 Processing dependencies for redshift-utils==0.1
2020-06-13T12:12:45.550+05:30 Searching for redshift-module==0.1
2020-06-13T12:12:45.550+05:30 Reading https://pypi.org/simple/redshift-module/

On searching for the error, I came across this AWS Forum post, where it was recommended to use Python 3.6. I went back to the documentation, which confirmed that AWS Glue Python shell jobs are compatible with Python 2.7 and 3.6. I was using a Python 3.7 virtualenv for my testing, which is why the egg in the log above is tagged py3.7, so this had to be fixed.

To manage multiple environments easily, I installed Miniconda on my Mac, which makes it possible to create virtual environments with different Python versions. After installation, I created a new Python 3.6 environment with conda and built the egg file:

conda create -n venv36 python=3.6
conda activate venv36
python setup.py bdist_egg
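
Before uploading the egg, it can help to confirm that it was really built for Python 3.6. The sketch below is only an illustration and not part of the AWS tooling: it assumes the egg follows the usual bdist_egg naming pattern seen in the log (name-version-pyX.Y.egg), and the helper names egg_python_version and egg_matches_target are made up for this post.

```python
import re
from typing import Optional

# Assumption: eggs produced by `python setup.py bdist_egg` are named
# <name>-<version>-py<major>.<minor>[-<platform>].egg, as in the log line
# "redshift_utils-0.1-py3.7.egg".
EGG_PY_TAG = re.compile(r"-py(\d+)\.(\d+)(?:-[^/]*)?\.egg$")


def egg_python_version(egg_name: str) -> Optional[str]:
    """Return the Python version tag in an egg file name, or None if absent."""
    match = EGG_PY_TAG.search(egg_name)
    if match is None:
        return None
    return "{}.{}".format(match.group(1), match.group(2))


def egg_matches_target(egg_name: str, target: str = "3.6") -> bool:
    """True when the egg's version tag equals the target Python version."""
    return egg_python_version(egg_name) == target


if __name__ == "__main__":
    # The egg from the failing run was built under Python 3.7.
    print(egg_matches_target("redshift_utils-0.1-py3.7.egg"))
    # An egg rebuilt inside the conda venv36 environment would carry py3.6.
    print(egg_matches_target("redshift_utils-0.1-py3.6.egg"))
```

This only inspects the file name, so it catches an egg built in the wrong environment but says nothing about whether the package's own dependencies can be resolved.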
