

AWS Athena Redshift how to
In the past, if you wanted to ask questions about data, you had to copy that data into a database. As data sets get bigger and bigger, managing those databases becomes more work and requires more expertise. Of course, it’s possible to administer your own data infrastructure. It just takes a lot of time and effort.

Amazon Athena is an interactive, serverless query service that cuts out this overhead. It lets you ask questions about data stored in S3 using familiar SQL queries, with no infrastructure to manage. You log in, select the data, run a query, and get results. It can analyze petabytes of data directly from S3, and queries return within seconds or minutes.

Common use cases for Athena include analyzing log files from CloudFront and similar services, ad hoc exploration of analytics data, and experimenting with new data sets. As with many AWS services, Amazon created Athena to solve challenges its customers were facing. Amazon saw customers who wanted to analyze data in S3 running large and expensive EMR clusters. A lot of people were building similar plumbing and running similar infrastructure. But what they really wanted to do was data analysis, not database administration.

Amazon built Athena to make it easier to query data in S3, and it has several benefits:

Athena is serverless: Amazon manages all the compute infrastructure, so you don’t have to. What do we mean by that? You don’t have to spin up a cluster, manage capacity, or load data. Instead, you just run queries and the data is read directly from S3.

Athena can be cheaper than similar services: you only pay for the queries you run. If you’re not running a query, you don’t pay anything. This can be a significant saving over a service like EMR, where you pay for compute instances whether or not you’re doing anything.

Athena uses standard SQL for queries, so it’s accessible to a wide audience. Not everyone knows how to use a Hadoop cluster or a data lake, but lots of people have data in S3 and know how to write a SQL query.
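To make the "run a query, get results" workflow concrete, here is a minimal Python sketch of how a query might be submitted programmatically. It is not from the original article: the function name `run_athena_query` and its parameters are hypothetical, and it assumes the `client` argument behaves like the Athena client from the boto3 library (`start_query_execution`, `get_query_execution`, `get_query_results`). It reads only the first page of results and does no retry handling.

```python
import time


def run_athena_query(client, sql, database, output_location, poll_seconds=1.0):
    """Run one SQL query and return the first page of rows as lists of strings.

    Assumption: `client` behaves like boto3.client("athena").
    """
    # Submit the query. Athena writes its result files to the S3 location given.
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # There is no cluster to wait for, but the query itself runs
    # asynchronously, so poll until it reaches a terminal state.
    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query {query_id} ended in state {state}")

    # For SELECT queries, the first returned row holds the column names.
    results = client.get_query_results(QueryExecutionId=query_id)
    return [
        [cell.get("VarCharValue") for cell in row["Data"]]
        for row in results["ResultSet"]["Rows"]
    ]
```

With boto3 installed and AWS credentials configured, this could be called as `run_athena_query(boto3.client("athena"), "SELECT status, COUNT(*) FROM cloudfront_logs GROUP BY status", "default", "s3://my-bucket/athena-results/")`. The table, database, and bucket names here are placeholders, not values from the article.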
