ai Featured

How I stopped worrying and learned to love robots

AI seems to be absolutely everywhere. As a designer, or a product manager, how do I use it effectively to provide value to my users?

Mariusz Cieśla

May 30, 2024 • 4 min read

With large language models being pushed left and right, the "AI" hype is back on track. Since it's not my first rodeo – in fact, I have been working with some kind of "AI" at LILT for a good couple of years now – I thought I'll take a stab at writing a post on how we think about AI inside LILT design team and how I personally think about using it in my day-to-day work.

The hammer and the nail

As usually in the height of the hype cycle, everyone and their dog are adding "AI" to their application now, mostly in the form of simple chat front-end to something like ChatGPT, Gemini, Claude, or whatever other model they've been paying for. Most of those chats are relatively useless for the users and incredibly prone to prompt injection, which makes companies end up in a rough spot where their AI chatbot promises someone a $1 car and "that's a legally binding offer, no take-backsies".

When you have a hammer, everything looks like a nail, but a hammer in the wrong hands can make some damage, and we're seeing this unfold in the case of large language models.

Head-up display for the user

Any model, built well, is an incredibly great productivity enhancer. For years in our team, we've been talking about our AI-assisted user interface as "head-up display for the linguist". And just like in a fighter jet HUD, we leave the final decision to the human - the role of the model is to provide suggestions, context and catch mistakes on the fly, and learn as the user updates those suggestions, to provide better suggestions in the future.

In fact, in some of our preliminary quality tests, using a large language model to read through a translated document and select parts of the text that look misaligned with the broader context of the brand language yielded incredible increase in final customer-approved quality score from 90%-ish to well over 99%. The role of the model was simply to read through the document, catch places in which the text seems off, and send those parts to the human reviewer to correct.

We did away with the chat-based interface entirely, only using the model as a crucial back-end piece of the process, focused around user needs – which is why experienced Product Designers and Product Managers continue to be incredible for taking the idea, creating and iterating on use cases for models based on user research and customer needs. Let's face it: contrary to what chronically online LLM nerds say, nobody wants to learn to prompt. Building front-ends that get the user to their stated success without forcing them into the mess of prompt engineering is where I can see AI-based products going. The models get better, the interfaces should too.

Garbage in, garbage out

When you jump into ChatGPT, Claude, Gemini, or any other model, you will initially see responses that feel like a whole load of hallucination, and you might think that they're not worth much.

Publicly available models are trained on vast amounts of data (leaving the ethical and quality issue of training them on basically the whole internet, leading to Gemini telling people to eat rocks), and in the end they're just stochastic parrots, which means that if you don't provide them a lot of context in a very specific way, you'll run into your standard "garbage in, garbage out" issue.

The solution we see to this in practice is to provide the users with models that are fine-tuned to their needs. Long term, you will likely see more and more people training local, fine-tuned models to solve their very specific problems. This does not only increase the quality of the output over time, it will also protect your potentially sensitive data from leaking into someone else's answers.

Focus on the user

In the end, as with any tool, our job is to take it, think about it, and provide it to the users in the way that is useful for them. Instead of blindly jumping on the AI hype train, the role of the product designers and product managers is to take all the functionality that LLMs provide, evaluate it, think through all possible ways they can be used to benefit the user, and then test and iterate until it works.

With all the excitement and hype, we all need to take three steps back and look at the bigger picture – how can our customers benefit from this? Not sure? Try asking a model for some suggestions (but make sure to fine-tune it first).

Will it take my job?

In the end, just like a hammer, a model is just a tool, and a lot depends on how capable are you at using it.

Will the models be able to fully replace all humans at doing all jobs? Not likely, at least not for now. Will it be able to replace some people, especially doing Bullshit Jobs? Probably. Will companies – due to the nature of hype and the capitalist system – try to replace people to cut costs and increase shareholder value? Likely.

I predict we will see models shine in places that require generating new information purely based on existing information, summarizing and analyzing large amounts of data, providing insights and spotting patterns, as well as providing suggestions based on historical information. Any human endeavor requiring creativity, connecting dots, forward-thinking, and coming up with entirely new ideas seems safe, at least for now.

TL;DR: so what comes next?

Here's my lukewarm take about where we will see this going:

local, secure, fine-tuned models will become more common, for privacy and quality reasons: since prompting makes the model learn new information, people will be reluctant to share sensitive information with public models, especially in corporate setting
we will see smart product teams do away with prompting and chat-based interfaces in favor of specialized interfaces that use models as a tool to direct and guide the user to their stated goal
we will see improvements in large language models slow down, due to training data and hardware limitations
your job is likely safe, and being skeptical about the model output is likely warranted, unless you're using a very well finetuned model

Image by Susan Cipriano from Pixabay