FreeKB - Amazon Web Services (AWS) - Getting Started with Athena


AWS Athena can be used to

  • query data in files in an S3 Bucket using SQL
  • query other data sources, such as SQL databases, through federated query connectors

For example, let's say you have a file named foo.txt in one of your S3 Buckets.

 

And let's say foo.txt contains the following lines.

1	hello
2	world
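
If you want to follow along, a minimal sketch for creating this sample file is shown here. The printf call writes a real tab between the two fields, and awk confirms that every line has two tab-separated fields. The upload_sample function is a hypothetical helper (not part of the AWS CLI) that wraps aws s3 cp, and my-bucket-abc123 is assumed to be the bucket that holds foo.txt.

```shell
# Write the two sample rows; \t puts a real tab between id and message.
printf '1\thello\n2\tworld\n' > foo.txt

# Print the number of tab-separated fields on each line (expect 2 on every line).
awk -F'\t' '{ print NF }' foo.txt

# Hypothetical helper: upload foo.txt to the bucket named in $1.
# Not called here because it needs AWS credentials.
upload_sample() {
    aws s3 cp foo.txt "s3://${1}/foo.txt"
}
```

For example, upload_sample my-bucket-abc123 would copy the file to the root of that bucket.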

 

Let's start by creating a different S3 Bucket that will be used to store the Athena query results. The aws s3api create-bucket command can be used to create an S3 Bucket.

aws s3api create-bucket --bucket my-athena-bucket-abc123 --region us-east-1

 

Over in the AWS Athena console, if this is your first time using Athena, select Edit settings.

 

And let's select the S3 Bucket you just created as the location of query results.
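
A roughly comparable setting can probably be applied from the command line with aws athena update-work-group. This is only a sketch. It assumes your queries run in the default workgroup named primary, and set_result_location is a hypothetical helper name, not an AWS command.

```shell
# Hypothetical helper: point a workgroup's query results at an S3 Bucket.
# $1 = workgroup name (assumed to be "primary"), $2 = results bucket name
set_result_location() {
    aws athena update-work-group \
        --work-group "$1" \
        --configuration-updates "ResultConfigurationUpdates={OutputLocation=s3://${2}/}"
}
```

For example, set_result_location primary my-athena-bucket-abc123.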

 

The aws s3api list-objects command can be used to list the objects in the S3 Bucket. At this point, there should be no objects in the S3 Bucket.

~]$ aws s3api list-objects --bucket my-athena-bucket-abc123
{
    "Contents": []
}

 

In the Athena Editor, let's enter the query CREATE DATABASE mydatabase and select Run. The result should be Query successful.
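
The same statement could also be submitted from the command line with aws athena start-query-execution. The sketch below assumes the results bucket created earlier; run_query is a hypothetical helper name, not an AWS command.

```shell
# Hypothetical helper: submit a SQL statement to Athena and print its query execution ID.
# $1 = SQL text, $2 = results bucket name
run_query() {
    aws athena start-query-execution \
        --query-string "$1" \
        --result-configuration "OutputLocation=s3://${2}/" \
        --query 'QueryExecutionId' \
        --output text
}
```

For example, run_query 'CREATE DATABASE mydatabase' my-athena-bucket-abc123. Athena runs queries asynchronously, so the returned ID only confirms that the query was accepted.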

 

There should now be a 0 byte TXT file in your Athena S3 Bucket.
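
One way to check this from the command line is to list each object key with its size. This is a sketch that reuses the aws s3api list-objects command from earlier; list_result_objects is a hypothetical helper name.

```shell
# Hypothetical helper: print the key and size in bytes of each object in a bucket.
# $1 = bucket name
list_result_objects() {
    aws s3api list-objects \
        --bucket "$1" \
        --query 'Contents[].[Key, Size]' \
        --output text
}
```

For example, list_result_objects my-athena-bucket-abc123 should print one line per object, with the size in the second column.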

 

And the database you just created should be listed in the left panel of the Athena console.
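
The database should also be visible from the command line. The sketch below uses aws athena list-databases and assumes the default catalog name AwsDataCatalog; list_databases is a hypothetical helper name.

```shell
# Hypothetical helper: print the database names in an Athena data catalog.
# $1 = catalog name (optional, assumed default is AwsDataCatalog)
list_databases() {
    aws athena list-databases \
        --catalog-name "${1:-AwsDataCatalog}" \
        --query 'DatabaseList[].Name' \
        --output text
}
```

Running list_databases with no arguments should include mydatabase in its output.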

 

Remember that in this trivial example, foo.txt in the other S3 Bucket contains the following lines.

1	hello
2	world

 

Now let's select the database you just created (mydatabase in this example) and create a table.

  • Since the first field in each line of foo.txt is an integer, we create the id INT column
  • Since the second field in each line of foo.txt is a string, we create the message STRING column
  • ROW FORMAT DELIMITED is used to parse each line in the file as delimited text
  • FIELDS TERMINATED BY '\t' is used to specify that fields in each line are delimited with a tab
  • LINES TERMINATED BY '\n' is used to say that each line ends with a newline
  • LOCATION 's3://my-bucket-abc123/' is used to process all of the files at and below the root directory in the S3 Bucket that contains foo.txt (not your Athena S3 Bucket!)

CREATE EXTERNAL TABLE IF NOT EXISTS poc (
    id INT, 
    message STRING)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '\t'
    LINES TERMINATED BY '\n'    
    LOCATION 's3://my-bucket-abc123/';
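
To get a feel for what the delimiters in this table definition do, the splitting can be approximated locally with awk. This is only an illustration of tab and newline splitting, not how Athena itself is implemented, and sample.txt is a hypothetical local copy of foo.txt.

```shell
# Recreate the sample data locally (the same two rows as foo.txt).
printf '1\thello\n2\tworld\n' > sample.txt

# Read one newline-terminated line at a time and split it on tabs:
# field 1 corresponds to the id column and field 2 to the message column.
awk -F'\t' '{ printf "id=%d message=%s\n", $1, $2 }' sample.txt
```

This should print id=1 message=hello followed by id=2 message=world.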

 

Let's run the query to create the table. Once created, the table and the columns in the table should be displayed in the left panel.

 

There should now be another 0 byte TXT file in your Athena S3 Bucket.

 

Now let's query the table, for example with SELECT * FROM poc. The query result should contain each field from each line in foo.txt. Nice!
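
A query against the table could also be run from the command line. The sketch below assumes the database, table and results bucket names used in this article; query_poc and show_results are hypothetical helper names, not AWS commands.

```shell
# Hypothetical helper: start a SELECT against the poc table and print the query execution ID.
# $1 = database name, $2 = results bucket name
query_poc() {
    aws athena start-query-execution \
        --query-string 'SELECT * FROM poc' \
        --query-execution-context "Database=${1}" \
        --result-configuration "OutputLocation=s3://${2}/" \
        --query 'QueryExecutionId' \
        --output text
}

# Hypothetical helper: print the rows of a finished query, one row per line.
# $1 = query execution ID
show_results() {
    aws athena get-query-results \
        --query-execution-id "$1" \
        --query 'ResultSet.Rows[*].Data[*].VarCharValue' \
        --output text
}
```

For example, run id=$(query_poc mydatabase my-athena-bucket-abc123) and then show_results "$id". Since queries run asynchronously, you may need to wait until aws athena get-query-execution reports that the query has succeeded before fetching the results.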

 



