AWS Credentials#
If a workflow needs to access S3, it will be important to establish AWS credentials before first access to the object storage. If you are working in a workflow using a dask cluster, you will need to establish AWS credentials before the cluster starts up.
The following code block will handle*general configuration for establishing AWS credentials for the notebook and any spawned cluster workers.
You will need to make sure your AWS_PROFILE
or AWS_S3_ENDPOINT
environmental variables are established before this is run. If they are not, the defaults will be set to the access the OSN storage pod (with a profile named osn-hytest-scratch
and an endpoint of https://usgs.osn.mghpcc.org
), which will only work if you have this AWS profile configured. Please reach out to asnyder@usgs.gov if you are a USGS staff member and you would like access to the credentials for this scratch space to test out our tutorials.
import os
import logging
import configparser
awsconfig = configparser.ConfigParser()
awsconfig.read(
os.path.expanduser('~/.aws/credentials') # default location... if yours is elsewhere, change this.
)
## NOTE: The default will be for the OSN / RENCI profile and endpoint. Override this
## by setting environment variables before executing this cell/notebook.
_profile_nm = os.environ.get('AWS_PROFILE', 'osn-hytest-scratch')
_endpoint = os.environ.get('AWS_S3_ENDPOINT', 'https://usgs.osn.mghpcc.org')
# Set environment vars based on parsed awsconfig
try:
os.environ['AWS_ACCESS_KEY_ID'] = awsconfig[_profile_nm]['aws_access_key_id']
os.environ['AWS_SECRET_ACCESS_KEY'] = awsconfig[_profile_nm]['aws_secret_access_key']
os.environ['AWS_S3_ENDPOINT'] = _endpoint
os.environ['AWS_PROFILE'] = _profile_nm
os.environ['AWS_DEFAULT_PROFILE'] = _profile_nm
os.environ['AWS_S3_REGION'] = _profile_nm
except KeyError:
logging.error("Problem parsing the AWS credentials file. ")