AI for Analytics #1
We are launching a newsletter and a podcast on how AI is transforming the data industry!
Welcome to AI for Analytics, a new Substack dedicated to exploring AI solutions for querying and consuming structured data of the kind typically used in business intelligence.
Each post will share articles related to our theme that we think will be helpful to the analytics community in general. This Substack will also operate in tandem with our new podcast of the same name, and we will cross-post content from the podcast here where we think it is beneficial. We will also have some exciting guests on our podcast and may invite them to share something in this Substack, too.
Our podcast will launch shortly, and we’ll share episodes here as well.
Snowflake announced DeepSeek-R1 in Preview on Cortex AI
DeepSeek’s R1 family of models has had a significant impact on the AI world, and this is being felt in the analytics world, too. Artyom, Brian and I discussed this and felt that Snowflake could be one of the bigger beneficiaries. Snowflake is much more focused on traditional data warehousing than some of its competitors, notably Databricks and GCP, and as such is less able than they are to pivot to training powerful LLMs to deploy on its platform. An open-weights model that it can host itself helps close that gap.
DeepSeek R1 models are competitive with the best models out there, and it’s no surprise to see Snowflake move early to use them to power its AI features, including Cortex. It should prove a real boon to Snowflake to offer a state-of-the-art model hosted purely on its own infrastructure, which it can then provide to its most security-conscious customers, who would not be happy with any data or metadata leaving their Snowflake environment to go to OpenAI, for example.
Gemini now can use Python to analyze Google Sheets data
As you will have seen, Google’s Gemini has been rolling out as a chat interface for AI assistance in many Google Workspace (formerly G Suite) products. With this release, Gemini can now do more complex analysis in Sheets based on the dataset that is already in the sheet. This is similar to existing tools on the market like Julius.ai, which also performs complex analysis on a single dataset. This approach could yield good results because the LLM does not have to retrieve the data or transform it with code first. It relies on an existing dataset in the sheet, interprets what it means based on the headers and other context on the sheet, and then analyses it and does further data-science-type work in Python.
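To make the "existing dataset, then Python" idea concrete, here is a minimal sketch of the kind of analysis code a model might write once the data is already in hand. It is not how Gemini works internally; the column names, the sample data and the helper functions are all made up for illustration, and the sheet is stood in for by a small CSV string.

```python
import csv
import io
import statistics

# Hypothetical sheet contents, exported as CSV. The first row holds the headers.
SHEET_CSV = """region,units_sold,unit_price
North,120,9.5
South,80,11.0
East,150,8.0
West,95,10.0
"""


def load_rows(text):
    """Read the sheet into a list of dicts keyed by header name."""
    return list(csv.DictReader(io.StringIO(text)))


def numeric_columns(rows):
    """Return the headers whose values all parse as numbers."""
    columns = []
    for name in rows[0]:
        try:
            for row in rows:
                float(row[name])
        except ValueError:
            continue  # at least one non-numeric value, so skip this column
        columns.append(name)
    return columns


def summarize(rows):
    """Compute a mean and a maximum for every numeric column."""
    summary = {}
    for name in numeric_columns(rows):
        values = [float(row[name]) for row in rows]
        summary[name] = {"mean": statistics.mean(values), "max": max(values)}
    return summary


rows = load_rows(SHEET_CSV)
summary = summarize(rows)
print(summary)
```

The point of the sketch is that nothing here fetches or joins data: the headers tell the code which columns are worth analysing, and everything else is ordinary Python over rows that already exist.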
workspaceupdates.googleblog.com
Databricks introduces Semantic Layer support in AI/BI product
This is a further step, and an acknowledgement from Databricks, that a semantic layer of some kind is needed for safe answers from an AI analytical system. Calculated metrics in Databricks AI/BI allow a human to define how a metric should be calculated, so that the AI can then pull it in as an object when answering a BI-type question. This is a first step towards compilation: in essence, a small compiler takes the requested metric object and rewrites the generated query so that it contains the underlying aggregate expression in place of the object that denotes it.
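The rewrite step described above can be sketched in a few lines. This is a toy illustration under our own assumptions, not Databricks' implementation: the `METRIC(name)` placeholder syntax, the metric names and the `compile_metrics` function are all hypothetical.

```python
import re

# Hypothetical metric definitions maintained by a human analyst.
METRICS = {
    "total_revenue": "SUM(order_amount)",
    "avg_order_value": "SUM(order_amount) / COUNT(DISTINCT order_id)",
}

# Made-up placeholder syntax: METRIC(<name>) marks a metric object in a query.
METRIC_REF = re.compile(r"METRIC\((\w+)\)")


def compile_metrics(query, metrics):
    """Replace each METRIC(name) reference with its human-defined aggregate."""

    def expand(match):
        name = match.group(1)
        if name not in metrics:
            # Refuse to guess: an undefined metric is an error, not free-form SQL.
            raise KeyError(f"unknown metric: {name}")
        return metrics[name]

    return METRIC_REF.sub(expand, query)


# A query as the AI might generate it, referring to the metric by name only.
generated = (
    "SELECT region, METRIC(total_revenue) AS total_revenue "
    "FROM orders GROUP BY region"
)
compiled = compile_metrics(generated, METRICS)
print(compiled)
```

The design choice worth noticing is that the model never writes the aggregate itself. It only names the metric, and the definition a human approved is substituted in afterwards, which is where the safety comes from.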
Perplexity Bets Big on Super Bowl
Long have Super Bowl commercials been a gauge of hype-cycle progress. Anyone remember Larry David shilling for FTX during the Super Bowl a few years ago? Despite the drama, that actually worked out OK, with creditors being made more than whole, but that’s another story for another time. We’re talking about AI here - specifically Perplexity forgoing the traditional spendy Super Bowl commercial and instead running a sweepstakes in which a lucky user who asked at least five questions during the big game could win a $1 million prize. At the time of writing we still don’t know who the lucky winner is, but hopefully this Super Bowl stunt is not a sign that we’re all about to plummet over the AI hype precipice.