Saturday, August 03, 2024

Where is the productivity in AI? Try this!

For some reason, the mainstream is now asking for proof of productivity from AI. There are some skeptics. Let me show you how it increases my productivity as a developer.

As Easy as 123

Using the Continue code assistant, I was able to build, with very little help, an application that uses Streamlit for UX, MySQL as the DBMS and LangChain for chaining models and logic. The ease with which I can now "talk" to my tables in the DBMS makes MySQL Workbench kind of obsolete. For run-of-the-mill DBMS reporting, we don't need expensive human talent to get it done. This is a boost in productivity for anyone who has to back up an argument with data. This is probably why Snowflake acquired Streamlit. The productivity gain is astronomical as I don't have to keep searching for the "right" syntax. I barely check the API reference, as the assistant's suggested code tells me which method and object I need.
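
Here is a minimal sketch of the "talk to your tables" flow: a question goes in, SQL comes out, and the query runs against the database. It is not the actual application. To stay self-contained it swaps MySQL for Python's built-in sqlite3 and replaces the model call with a hypothetical stub (question_to_sql) that returns canned SQL. The orders table and its rows are made up for illustration.

```python
import sqlite3

# Hypothetical stand-in for the model call. In a real app this is where a
# chain would send the question (plus the table schema) to an LLM.
CANNED_SQL = {
    "how many orders per customer?":
        "SELECT customer, COUNT(*) FROM orders "
        "GROUP BY customer ORDER BY customer",
}


def question_to_sql(question: str) -> str:
    """Return SQL for a natural-language question (stubbed, not a real model)."""
    return CANNED_SQL[question.strip().lower()]


def ask(conn: sqlite3.Connection, question: str) -> list:
    """Translate the question to SQL, refuse anything but SELECT, run it."""
    sql = question_to_sql(question)
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("only SELECT statements are allowed")
    return conn.execute(sql).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory database with a made-up orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer, amount) VALUES (?, ?)",
        [("alice", 10.0), ("bob", 5.5), ("alice", 7.25)],
    )
    return conn


if __name__ == "__main__":
    for row in ask(demo_connection(), "How many orders per customer?"):
        print(row)
```

The SELECT-only check is a deliberate choice: generated SQL should be treated as untrusted input, so a reporting tool like this is safer with read-only access.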

Agents R US

While I used chains and linked them together to get the end result, I could have created distinct semi-autonomous agents which would pick up the information as it updates and report the state in real time. This type of work takes weeks today in an organization and it can now be done in hours.

PaaS This!

You need a source of data, a connector to read/write the data, and an execution environment which allows the use of models from many sources (using their API keys) and frameworks to keep all these components and their state in sync. This is not done in an IaaS setting; it needs a PaaS. No wonder you can't get away from HuggingFace. No wonder GitHub Models was just announced as competition to HF.

Models are 4Ever

APIs come and go, but models are forever. I have used three different models in a single application and am paying no more than 2.5 cents for 10K tokens. I am beginning to wonder if they actually make money providing me this service. Let's look at their investment: a typical large model (like GPT) requires 144 GPUs just to load, and each GPU draws 750 W. Most cannot afford this, so they go for a model that fits into a single system, but that single system needs to be configured for GPU passthrough to the VMs without any bloatware from K8s. We are looking at a hard requirement that a single node offer at least 200 TFLOPS.
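
The numbers above can be put side by side with some back-of-the-envelope arithmetic. The price, GPU count and per-GPU wattage come from the paragraph above. The electricity price is a made-up assumption for illustration only, and the sketch ignores hardware, cooling and staffing costs entirely.

```python
PRICE_CENTS_PER_10K_TOKENS = 2.5   # from the text
GPUS = 144                         # from the text
WATTS_PER_GPU = 750                # from the text
ASSUMED_DOLLARS_PER_KWH = 0.10     # assumption, not from the text


def dollars_per_million_tokens(cents_per_10k: float) -> float:
    """Convert a price in cents per 10K tokens to dollars per 1M tokens."""
    return cents_per_10k * (1_000_000 / 10_000) / 100


def cluster_power_kw(gpus: int, watts_per_gpu: float) -> float:
    """Power drawn by the GPUs alone, in kilowatts."""
    return gpus * watts_per_gpu / 1000


def tokens_to_cover_energy(power_kw: float, hours: float,
                           dollars_per_kwh: float,
                           cents_per_10k: float) -> float:
    """Tokens that must be sold to pay for the GPU electricity alone."""
    energy_cost = power_kw * hours * dollars_per_kwh
    return energy_cost / dollars_per_million_tokens(cents_per_10k) * 1_000_000


if __name__ == "__main__":
    price = dollars_per_million_tokens(PRICE_CENTS_PER_10K_TOKENS)
    power = cluster_power_kw(GPUS, WATTS_PER_GPU)
    tokens = tokens_to_cover_energy(power, 1, ASSUMED_DOLLARS_PER_KWH,
                                    PRICE_CENTS_PER_10K_TOKENS)
    print(f"price: ${price:.2f} per million tokens")
    print(f"GPU power: {power:.0f} kW")
    print(f"tokens per hour to cover electricity: {tokens:,.0f}")
```

That works out to $2.50 per million tokens against 108 kW of GPU power. Under the assumed electricity price, the cluster would need to sell roughly 4.3 million tokens every hour just to pay for the power the GPUs draw.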

Larger models with 400B+ parameters are now called giants. We need giants because they keep the context around for longer and capture deeper relationships between parameters. But these giants shouldn't share context across tenants. I believe they currently do.


AI AlterEgo

 The killer application for AI is to enable expert profiles in enterprise and productivity applications. These are not bots that help you ge...