AWS Summit 2024 in London

A tide of humanity arrived at ExCeL on 24 April 2024 for the AWS Summit. I was reminded of the cattle vs pets analogy as thousands queued for their badges; however, once in, there were many parallel streams and a village of vendors to navigate. Much of the talk was about AI (with Anthropic’s Claude, Amazon Bedrock and SageMaker referenced frequently), so my takeaways focus on how the various speakers sought to leverage and improve AI in their applications, using real-world examples as much as possible.

Graph Databases

Graphs are a useful way of thinking about networks such as:

  • People (in CRMs); 
  • Transactions (in payments, risks and supply chains); and 
  • Knowledge (from molecular relationships that accelerate drug discovery, to repair manuals in manufacturing).

Think about what’s important, what’s unusual (perhaps unexpected, or outliers), and what’s next: what we can or might predict. To link to the AI theme that ran through many of the talks, Lee described using graphs to “ground” LLMs so they can explain relationships, letting you distinguish between Similarity and Relevance.
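
As a minimal sketch of that grounding idea, the snippet below (assuming the networkx library and a hypothetical call_llm client; the graph itself is illustrative) pulls an entity’s relationships out of a small knowledge graph and injects them into the prompt, so any answer can be traced back to explicit edges rather than to whatever happened to be nearby in embedding space.

```python
# Minimal sketch: grounding an LLM prompt with facts from a small
# knowledge graph. networkx provides the graph; call_llm is a
# hypothetical stand-in for whichever model client you use.
import networkx as nx

g = nx.DiGraph()
g.add_edge("Aspirin", "COX-1", relation="inhibits")
g.add_edge("COX-1", "Prostaglandin synthesis", relation="catalyses")
g.add_edge("Aspirin", "Pain relief", relation="indicated_for")

def graph_context(graph: nx.DiGraph, entity: str) -> str:
    """Render the entity's outgoing edges as plain-text facts."""
    facts = [
        f"{entity} --{data['relation']}--> {target}"
        for _, target, data in graph.out_edges(entity, data=True)
    ]
    return "\n".join(facts)

question = "How does aspirin relieve pain?"
prompt = (
    "Answer using ONLY the facts below, and cite the edges you used.\n"
    f"Facts:\n{graph_context(g, 'Aspirin')}\n\nQuestion: {question}"
)
# response = call_llm(prompt)  # hypothetical LLM client
print(prompt)
```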

In addition to current uses:

  • Retrieval Augmented Generation (RAG) – prompting the LLM with relevant data and connections
  • Natural Language Querying – using an LLM to query knowledge graphs in natural language
  • Knowledge Graph Construction – extracting knowledge graphs from unstructured text using an LLM (see the sketch below)

there are emerging uses:

  • LLM Training – using the knowledge graph as a source of structured domain understanding to train or fine-tune a language model
  • Explainability for Responsible AI – helping to explain results to humans (and regulators), with decision accountability
  • Guard Rails for Responsible AI – providing stronger and more explicit guard rails for language models
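
To make the Knowledge Graph Construction bullet concrete, here is a minimal sketch: ask the model for (subject, relation, object) triples as JSON and load them into a graph. The prompt wording is illustrative, and call_llm is a placeholder for whichever model client you use, passed in so the function stays self-contained.

```python
# Sketch of LLM-driven knowledge-graph construction: the model emits
# (subject, relation, object) triples as JSON, which become graph edges.
import json
import networkx as nx

EXTRACTION_PROMPT = """Extract factual triples from the text below.
Respond with a JSON list of objects with keys
"subject", "relation", "object" and nothing else.

Text: {text}"""

def text_to_graph(text: str, call_llm) -> nx.DiGraph:
    """call_llm is your model client: it takes a prompt, returns text."""
    raw = call_llm(EXTRACTION_PROMPT.format(text=text))
    graph = nx.DiGraph()
    for triple in json.loads(raw):
        graph.add_edge(triple["subject"], triple["object"],
                       relation=triple["relation"])
    return graph
```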

Choosing the Right Use Cases for Gen AI

Prasad included a reminder of The Frugal Architect: “Find the dimension you’re going to make money over, then make sure the architecture follows the money”. In the rush to apply GenAI across a process, first make sure you understand where your revenue is coming from, then align your costs with it. This gives you the opportunity to reduce cost as a proportion of revenue through economies of scale. For example, in the following flow, the expert is the cost and the revenue is generated by fulfilment of the process, so we want to use GenAI to cut the expert’s time-per-case and accelerate throughput. You could do this by summarising the case, assisting with the research, and drafting a response.
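
As a sketch of the “cut the expert’s time-per-case” idea, the snippet below uses the boto3 Bedrock Runtime client to summarise a case before it reaches the expert. The region, model ID and prompt wording are assumptions for illustration; the invoke_model call itself is the standard API.

```python
# Sketch: pre-summarising a support case with Amazon Bedrock so the
# expert starts from a digest rather than the raw case history.
import json
import boto3

client = boto3.client("bedrock-runtime", region_name="eu-west-2")  # assumed region

def summarise_case(case_text: str) -> str:
    body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 500,
        "messages": [{
            "role": "user",
            "content": "Summarise this case in five bullet points "
                       f"for a support expert:\n\n{case_text}",
        }],
    }
    response = client.invoke_model(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # assumed model choice
        body=json.dumps(body),
    )
    payload = json.loads(response["body"].read())
    return payload["content"][0]["text"]
```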

Improving on Foundation Models

A number of the talks referenced the following diagram illustrating the trade-offs between Foundation Models (FMs), and hence the process you might go through to narrow the wide field (hundreds) of available models down to perhaps three when evaluating efficacy for your particular use case.

The four approaches (from cheapest to most accurate) are shown below. Simpler talks covered prompt engineering, but there were live examples of RAG and fine-tuning, along with the associated MLOps – a hot topic in itself – to show how it could be done.

Prompt Engineering

A prompt is an instruction to your FM asking it to produce a response. It contains:

  • Instruction (what you want it to do, such as answer a question)
  • Context (such as the type of answer, the background to the topic or specific requirements on the answer)
  • Input data (e.g. “here’s one I made earlier, make me another like it”; or inviting the model to search its own information)
  • Output indicator (e.g. do you want it to answer as if to a five-year-old? In three bullets for a slide? As a blog post?)
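
Putting the four components together, a prompt might be assembled like this. This is a sketch: the labels and wording are illustrative, not a format any model requires.

```python
# Sketch: composing a prompt from the four components listed above.
def build_prompt(instruction, context, input_data, output_indicator):
    return "\n\n".join([
        f"Instruction: {instruction}",
        f"Context: {context}",
        f"Input: {input_data}",
        f"Output format: {output_indicator}",
    ])

prompt = build_prompt(
    instruction="Summarise the release notes for a customer email.",
    context="The customer is non-technical and cares about downtime.",
    input_data="<release notes pasted here>",
    output_indicator="Three short bullet points, friendly tone.",
)
print(prompt)
```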

A good way to start is by experimenting on your chosen model with a given task. Here are some guidelines:

  1. Be clear and concise
  2. Include context
  3. Use directives
  4. Include an output indicator
  5. Start with a question
  6. Provide example responses
  7. Break up complex tasks
  8. Experiment and be creative
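
Guideline 6 is worth seeing in action: a few-shot prompt shows the model the shape of answer you want before posing the real task. The reviews below are made up for illustration.

```python
# Sketch of few-shot prompting: worked examples precede the real input,
# so the model continues the established pattern.
FEW_SHOT_PROMPT = """Classify the sentiment of each review as
Positive, Negative or Mixed.

Review: "Setup was painless and support answered in minutes."
Sentiment: Positive

Review: "Great hardware, but the app crashes daily."
Sentiment: Mixed

Review: "Arrived late and the box was damaged."
Sentiment:"""
# Sent to a model, this should complete with "Negative".
```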

Other talks also referenced building up a Prompt Catalogue (including input prompts and sample outputs, labelled data, and questions and answers) as part of a process to evaluate different FMs for a particular use case.
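
A catalogue entry need not be complicated. Here is a sketch of what one might look like, and how it could drive a side-by-side FM comparison; the schema is an assumption, and call_llm and score_answer are hypothetical helpers passed in by the caller.

```python
# Sketch: a prompt catalogue pairs each prompt with a reference output
# so candidate FMs can be scored against the same tasks.
CATALOGUE = [
    {
        "task": "case-summary",
        "prompt": "Summarise the case below in five bullets...",
        "reference_output": "- Customer reports login failures...",
        "labels": ["support", "summarisation"],
    },
]

def evaluate(models, catalogue, call_llm, score_answer):
    """call_llm(model, prompt) -> text; score_answer(text, ref) -> float."""
    results = {}
    for model in models:
        scores = [
            score_answer(call_llm(model, entry["prompt"]),
                         entry["reference_output"])
            for entry in catalogue
        ]
        results[model] = sum(scores) / len(scores)
    return results
```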

Retrieval Augmented Generation

RAG grounds text generation in a specific corpus of data so that responses are accurate and hallucinations are reduced. Examples include the work journal publishers are doing to extract specialist knowledge (especially in scientific research, but also in legal, engineering and many other domains), so that GenAI tools can iterate with user queries, trawl the literature and then summarise the relevant responses. This is described in the following slide from an AWS talk.
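
A minimal RAG loop can be sketched in a few lines. Here TF-IDF stands in for a proper vector store, the corpus is illustrative, and call_llm is a hypothetical model client.

```python
# Minimal RAG sketch: retrieve the passages most relevant to the query,
# then prompt the model to answer only from those passages.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "Study A: compound X reduced inflammation in mouse models.",
    "Study B: compound X showed no effect on blood pressure.",
    "Study C: compound Y improved sleep quality in trials.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    vec = TfidfVectorizer().fit(corpus + [query])
    scores = cosine_similarity(vec.transform([query]),
                               vec.transform(corpus))[0]
    top = scores.argsort()[::-1][:k]
    return [corpus[i] for i in top]

query = "What does the literature say about compound X and inflammation?"
context = "\n".join(retrieve(query))
prompt = (f"Answer from the excerpts below only; say so if they are "
          f"insufficient.\n\nExcerpts:\n{context}\n\nQuestion: {query}")
# answer = call_llm(prompt)  # hypothetical model client
```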

Fine-tuning

The Booking.com example is easily relatable whilst also showing how “humans in the loop” (HIL) can be used to refine language models. Just looking at the high-level implementation journey below, you can see there are a number of steps towards ensuring auto-generated listings are focussed, clear and up to editorial standards.

If we look at the HIL step, you can see how the experts weighted the importance of various criteria to improve the listings.
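
Here is a sketch of how such expert weightings might be applied in code; the criteria, weights and threshold are illustrative assumptions, not Booking.com’s actual rubric.

```python
# Sketch: experts assign weights to editorial criteria, and generated
# listings are scored against them before publication.
EXPERT_WEIGHTS = {"accuracy": 0.4, "clarity": 0.3, "tone": 0.2, "length": 0.1}

def listing_score(criterion_scores: dict[str, float]) -> float:
    """Weighted average of per-criterion scores, each in [0, 1]."""
    return sum(EXPERT_WEIGHTS[c] * s for c, s in criterion_scores.items())

draft = {"accuracy": 0.9, "clarity": 0.7, "tone": 0.8, "length": 1.0}
score = listing_score(draft)
if score < 0.8:  # threshold agreed with the editorial team (assumed)
    print(f"Score {score:.2f}: route back to a human editor")
```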

One last element was the use of Parameter-Efficient Fine-Tuning (PEFT), an approach that helps you improve the performance of large AI models while economising on resources like time, energy and computational power. In short, PEFT approaches fine-tune only a small number of (extra) model parameters while freezing most parameters of the pretrained LLM. This also mitigates catastrophic forgetting, a behaviour observed during full fine-tuning of LLMs. PEFT approaches have also been shown to outperform full fine-tuning in low-data regimes and to generalise better to out-of-domain scenarios.
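
For a flavour of what PEFT looks like in practice, here is a sketch using LoRA from the Hugging Face peft library; the base model and hyperparameters are illustrative choices, not what Booking.com used.

```python
# Sketch of PEFT with LoRA: the base model's weights stay frozen and
# only small low-rank adapter matrices are trained.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")  # small example model

config = LoraConfig(
    r=8,               # rank of the adapter matrices
    lora_alpha=16,     # scaling factor applied to the adapters
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)

# Typically reports well under 1% of parameters as trainable.
model.print_trainable_parameters()
```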

Conclusion

In any conference of this size, you have to pick your talks to get a flavour of what’s hot. AI is inescapable at the moment, and there are many companies using the tools pragmatically to cut cost and time from core (and revenue-generating) business processes. If you want to skill up, then MLOps seems to be the place to be. There is no shortage of tools, and the barrier to entry is low enough for any reasonably sized company to implement AI experiments and ideas.

By Alistair Park, CCO at Estafet
