With these 3 basic queries under your belt, there’s no reason for you to have to miss a deadline while waiting for someone else to pull data for you again. They won’t do the whole job for you, but it should be enough to get you numbers for pivoting, graphing, and other applications of your analysis skills. I’ve written them all against an imaginary table of tomato sales data:
Get It All
select * from tomato_sales
In SQL, the simplest queries are often the most powerful. This grabs every row and every column of the table. If the table is under 50,000 rows, you should have no problem opening it in Excel. If it’s bigger, the program may slow down, depending on your RAM and what other applications are running.
Get a sample
select * from tomato_sales sample(1)
If getting every row would be too much for Excel to handle, or would take too long to run, the sample function (or its analog in whatever flavor of SQL you’re using) gets you a random set of rows to experiment with. Just tell it what percentage of rows you want (for example, this would return 1% of all rows), and you’ll have a lightning fast result.
Bonus: If you’re not sure how long a query will take, I’ve found that many of them scale linearly as you increase the sample size, after about 10 seconds. For example, if a 3% sample takes 300 seconds, and a 5% sample takes 500 seconds, you can be pretty confident that the full query would take 10,000 seconds (167 minutes).
Order Your Results
select date, supermarket, revenue from tomato_sales order by supermarket, date
There are two columns that you’re interested in sorting by here, and the order is important. All of the rows from each supermarket will be grouped together and within that group they will be ordered by date. That will make it much easier to spot trends!
That’s it! I hope that reading this quick introduction pays off in spades.