Skip to content

Athena Client

Bases: object

A class to interact with AWS Athena.

__init__(boto3_session, s3_output=None)

Initialize the Athena instance.

Parameters:

Name Type Description Default
boto3_session Session

A boto3 session with the necessary permissions to interact with Athena.

required
s3_output Optional[str]

The S3 path where the query results will be stored.

None

Returns:

Type Description
None

Examples:

>>> from utils import create_session # See utils module
>>> from athena import Athena
>>> boto3_session = create_session(profile_name='profile_name', role_arn='arn:aws:iam::123456789012:role/role_name')
>>> s3_output = 's3://bucket-name/path/to/query-results/'
>>> athena = Athena(boto3_session=boto3_session, s3_output=s3_output)

create_ctas_table(database, query, **kwargs)

Create a table in Athena using the Create Table As Select (CTAS) approach. Additional arguments can be passed to the CTAS query; the most important arguments are typically:

  • ctas_table Optional[str]: The name of the CTAS table. If None, a name with a random string is used.

  • ctas_database Optional[str]: The name of the database where the CTAS table will be created. If None, database is used.

  • s3_output Optional[str]: The S3 path where the CTAS table will be stored. If None, s3_output attribute of the current instance is used. This may not be desirable if the CTAS table is somewhat permanent and you want to store it in a different location than the query results.

  • storage_format Optional[str]: The storage format for the CTAS query results, such as ORC, PARQUET, AVRO, JSON, or TEXTFILE. PARQUET by default.

  • write_compression Optional[str]: The compression type to use for any storage format that allows compression to be specified.

  • partition_info Optional[List[str]]: A list of columns by which the CTAS table will be partitioned.

See awswrangler.athena.create_ctas_table for more details.

Parameters:

Name Type Description Default
database str

The name of the database to create the table in.

required
query str

The query to create the table without the CREATE TABLE statement.

required
**kwargs Any

Additional arguments passed to awswrangler.athena.create_ctas_table.

{}

Returns:

Type Description
Dict[str, Union[str, _QueryMetadata]]

A dictionary with the the CTAS database and table names. If wait is False, the query ID is included, otherwise a Query metadata object is added instead.

Examples:

>>> query = '''
>>>         SELECT * FROM my_database.my_table
>>>         WHERE date >= DATE '2021-01-01';
>>>         '''
>>> athena.create_ctas_table(
>>>            database='my_database',
>>>            query=query,
>>>            ctas_table='my_ctas_table',
>>>            wait=True,
>>>            storage_format='PARQUET',
>>>            write_compression='snappy',
>>>            partition_info=['date']
>>>        )

create_database(database, **kwargs)

Create a database in Athena.

Parameters:

Name Type Description Default
database str

The name of the database to create.

required
**kwargs Dict[str, Any]

Additional arguments passed to awswrangler.athena.start_query_execution.

{}

Returns:

Type Description
None

Examples:

>>> athena.create_database(database='my_database', wait=True) # See awswrangler.athena.start_query_execution for additional arguments

create_table(database, query, **kwargs)

Create a table in Athena.

Parameters:

Name Type Description Default
database str

The name of the database to create the table in.

required
query str

The query to create the table.

required
**kwargs Dict[str, Any]

Additional arguments passed to awswrangler.athena.start_query_execution.

{}

Returns:

Type Description
None

Examples:

>>> ddl = '''
>>>       CREATE EXTERNAL TABLE IF NOT EXISTS my_database.my_table (
>>>           id CHAR(1),
>>>           date TIMESTAMP
>>>       )
>>>       STORED AS PARQUET
>>>       LOCATION 's3://bucket-name/path/to/data/'
>>>       TBLPROPERTIES ('parquet.compress'='SNAPPY');
>>>       '''
>>> athena.create_table(database='my_database', query=ddl, wait=True)

create_view(database, query, **kwargs)

Create a view in Athena.

Parameters:

Name Type Description Default
query str

The query to create the view.

required
**kwargs Dict[str, Any]

Additional arguments passed to awswrangler.athena.start_query_execution.

{}

Returns:

Type Description
None

Examples:

>>> ddl = '''
>>>       CREATE OR REPLACE VIEW my_database.my_view AS
>>>       SELECT * FROM my_database.my_table;
>>>       '''
>>> athena.create_view(database='my_database', query=ddl, wait=True)

drop_database(database, **kwargs)

Drop a database in Athena.

Parameters:

Name Type Description Default
database str

The name of the database to drop.

required
**kwargs Dict[str, Any]

Additional arguments passed to awswrangler.athena.start_query_execution.

{}

Returns:

Type Description
None

Examples:

>>> athena.drop_database(database='my_database', wait=True)

drop_table(database, table, **kwargs)

Drop a table in Athena in the specified database.

Parameters:

Name Type Description Default
database str

The name of the database to drop the table from.

required
table str

The name of the table to drop.

required
**kwargs Dict[str, Any]

Additional arguments passed to awswrangler.athena.start_query_execution.

{}

Returns:

Type Description
None

Examples:

>>> athena.drop_table(database='my_database', table='my_table', wait=True)

drop_view(database, view, **kwargs)

Drop a view in Athena in the specified database.

Parameters:

Name Type Description Default
database str

The name of the database to drop the view from.

required
view str

The name of the view to drop.

required
**kwargs Dict[str, Any]

Additional arguments passed to awswrangler.athena.start_query_execution.

{}

Returns:

Type Description
None

Examples:

>>> athena.drop_view(database='my_database', view='my_view', wait=True)

query(database, query, ctas_approach=False, **kwargs)

Execute a query in Athena.

Parameters:

Name Type Description Default
database str

The name of the database to start the query in; the query can reference other databases.

required
query str

The query to execute.

required
ctas_approach bool

Use the Create Table As Select (CTAS) approach to execute the query.

False
**kwargs Any

Additional arguments passed to awswrangler.athena.read_sql_query.

{}

Returns:

Type Description
DataFrame

The result of the query.

Examples:

>>> query = '''
>>>         SELECT * FROM my_database.my_table
>>>         WHERE date >= DATE '2021-01-01';
>>>         '''
>>> athena.query(database='my_database', query=query, ctas_approach=False)