# Language Models on the AI Executive Order
Matt Hodges
2023-11-01

This post is adapted from a Jupyter [Notebook found on
GitHub](https://github.com/hodgesmr/llm_ai_eo).

On October 30th, 2023, President Biden signed the [Executive Order on
the Safe, Secure, and Trustworthy Development and Use of Artificial
Intelligence](https://www.whitehouse.gov/briefing-room/presidential-actions/2023/10/30/executive-order-on-the-safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence/).
The order itself is quite sweeping and touches many government
departments and agencies, with a focus on harnessing AI’s potential and
defending against harms and risks.

In this post, we’ll deploy language models to rapidly discover
information from the Order. For the easiest setup, I recommend trying
this out in a Google Colab notebook.

<a target="_blank" href="https://colab.research.google.com/github/hodgesmr/llm_ai_eo/blob/main/llm_ai_eo.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>
<a target="_blank" href="https://github.com/hodgesmr/llm_ai_eo/blob/main/llm_ai_eo.ipynb">
<img src="https://img.shields.io/badge/-Open_in_Github-blue?logo=github&labelColor=gray" alt="Open In Github"/>
</a>

Many of the strategies presented here are extensions from Simon
Willison’s work in his blog post, [Embedding paragraphs from my blog
with E5-large-v2](https://til.simonwillison.net/llms/embed-paragraphs).
Simon also maintains a handy command line utility for working with
various LLM models, aptly named
[LLM](https://llm.datasette.io/en/stable/). While Simon’s writing
largely focuses on the CLI capabilities of the tool (and the usefully
opinionated integrations with SQLite), I prefer working with Pandas
Dataframes. Here I show how to use the LLM library in that fashion.

Embeddings are kind of a magic black box to end users, but the basic
idea is that language models can create vectors of numerical values that
represent not only words or sentences, but also the semantic *meaning*
of those words. Early research on this subject comes from
[word2vec](https://code.google.com/archive/p/word2vec/). To illustrate:
`vector('king') - vector('man') + vector('woman')` is mathematically
close to `vector('queen')`. I find that *fascinating*! We’ll use this
concept to extract and match information against the Executive Order
text.
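
To make the arithmetic concrete, here is a minimal sketch using hand-made three-dimensional vectors. These numbers are invented for illustration and are not real word2vec output; the axes loosely stand for royalty, maleness, and femaleness, and `cosine_similarity` is a small helper defined here.

``` python
import numpy as np

# Hand-made toy vectors, NOT real word2vec output.
# Axes are (royalty, maleness, femaleness), chosen purely for illustration.
vectors = {
    "king": np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man": np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# king - man + woman
target = vectors["king"] - vectors["man"] + vectors["woman"]

# Rank every word by how close it is to the computed vector
ranked = sorted(
    vectors,
    key=lambda word: cosine_similarity(target, vectors[word]),
    reverse=True,
)
print(ranked[0])  # queen
```

Real embedding models work the same way in spirit, just with hundreds or thousands of learned dimensions instead of three hand-picked ones.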

We’ll deploy a technique known as [Retrieval-Augmented
Generation](https://research.ibm.com/blog/retrieval-augmented-generation-RAG).
From a high level, this allows us to inject context into an LLM without
training or tuning it. We use another system to locate language that
likely contains the answer to our query, and then ask the model to pull
it out for us.
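
As a rough sketch of what "injecting context" can look like, the retrieved passages and the question are simply combined into one prompt string. The `build_prompt` helper and its template wording are hypothetical, written for illustration rather than taken from any library.

``` python
def build_prompt(passages, query):
    """Combine retrieved passages and a question into one prompt string."""
    context = "\n\n".join(passages)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Placeholder passages stand in for text retrieved from the Order
prompt = build_prompt(
    ["<retrieved passage 1>", "<retrieved passage 2>"],
    "what does it say about healthcare?",
)
print(prompt)
```

The model never sees the whole document, only the handful of passages judged most relevant, which keeps the prompt within its context window.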

Our high-level strategy:

1.  Calculate embeddings on the Executive Order text
2.  Calculate embeddings on a query
3.  Calculate the cosine similarity between every text embedding and the
    query
4.  Select the top three passages that are semantically similar to the
    query
5.  Pass the passages and the query to the LLM for rapid summarization
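
Steps 3 and 4 can be sketched with NumPy and Pandas. This is a minimal illustration, assuming a Dataframe with `passage` and `embedding` columns; the `top_k_passages` helper is a name made up for this sketch, and the two-dimensional embeddings are toy values.

``` python
import numpy as np
import pandas as pd


def top_k_passages(df, query_vector, k=3):
    """Return the k rows of df whose embeddings are most similar to query_vector."""
    matrix = np.array(df["embedding"].to_list(), dtype=float)
    query = np.asarray(query_vector, dtype=float)
    # Cosine similarity: dot product divided by the product of the norms
    scores = matrix @ query / (np.linalg.norm(matrix, axis=1) * np.linalg.norm(query))
    return df.assign(similarity=scores).nlargest(k, "similarity")


# Tiny made-up 2-dimensional embeddings, for illustration only
toy = pd.DataFrame(
    {
        "passage": ["a", "b", "c", "d"],
        "embedding": [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [-1.0, 0.0]],
    }
)
top = top_k_passages(toy, [1.0, 0.1], k=3)
print(top[["passage", "similarity"]])
```

Because cosine similarity only compares direction, passages of very different lengths can still be ranked against a short query on equal footing.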

### Environment

First install the dependencies, which include the [MLC LLaMA 2
model](https://mlc.ai) for summarization, the
[LLM](https://llm.datasette.io/en/stable/) library, and the
[E5-large-v2](https://huggingface.co/intfloat/e5-large-v2) language
model for text embedding.

Note, these models are constantly changing, and getting them up and
running on your system might take some independent investigation. If
running in Google Colab, check [this tutorial for
MLC](https://colab.research.google.com/github/mlc-ai/notebooks/blob/main/mlc-llm/tutorial_chat_module_getting_started.ipynb).
If running LLaMA with the LLM library on macOS, check the [repository’s
instructions](https://github.com/simonw/llm-mlc).

``` shell
pip install --pre -U -f https://mlc.ai/wheels mlc-chat-nightly-cu118 mlc-ai-nightly-cu118
git lfs install
pip install llm
llm install llm-sentence-transformers
llm sentence-transformers register intfloat/e5-large-v2 -a lv2
llm install llm-mlc
llm mlc setup
llm mlc download-model Llama-2-7b-chat --alias llama2
```

### Load Data

Before getting started, we need the Executive Order text to work
against. This is probably the least interesting part of this post. I
simply opened the Order in Firefox reader view, copied and pasted it
into VSCode, did some manual find/replace to clean up the white space,
and then concatenated paragraphs to get chunks as close to 400 words as
I could. I picked 400 because the embedding model truncates at 512
*tokens* and a token is either a word or a semantically important subset
of a word, so I allowed for some buffer. *This took less than half an
hour.* Rather than share code to do this work, I simply provide the
cleaned text here.
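
I did the chunking by hand, but a scripted version could look roughly like the sketch below. It is an assumption about how one might automate the step, not the process actually used: the `chunk_paragraphs` helper is hypothetical, and it counts whitespace-separated words rather than model tokens.

``` python
def chunk_paragraphs(paragraphs, max_words=400):
    """Greedily merge consecutive paragraphs into chunks of at most max_words words.

    A single paragraph longer than max_words becomes its own oversized chunk.
    """
    chunks, current, count = [], [], 0
    for paragraph in paragraphs:
        n = len(paragraph.split())
        # Start a new chunk when adding this paragraph would exceed the budget
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(paragraph)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


# Tiny example with a 5-word budget so the behavior is easy to see
paragraphs = ["one two three", "four five", "six seven eight nine", "ten"]
chunks = chunk_paragraphs(paragraphs, max_words=5)
print(chunks)  # ['one two three four five', 'six seven eight nine ten']
```

Joining with a space keeps each chunk on a single line, which matches the one-chunk-per-line layout of the cleaned text file.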

Load it into a Pandas Dataframe with a single column:

``` python
import pandas as pd

df = pd.read_csv(
    "https://raw.githubusercontent.com/hodgesmr/llm_ai_eo/main/eo.txt",
    sep="_",  # trick: '_' never appears in the text, so each line becomes one cell
    header=None,
)
df.columns = ["passage"]

df.head()
```

|     | passage                                             |
|-----|-----------------------------------------------------|
| 0   | By the authority vested in me as President by ...   |
| 1   | \(a\) Artificial Intelligence must be safe and s... |
| 2   | \(c\) The responsible development and use of AI ... |
| 3   | \(e\) The interests of Americans who increasingl... |
| 4   | \(g\) It is important to manage the risks from t... |
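
The separator trick only works because `_` happens to be absent from the text. If that ever stops being true, a sketch like the following avoids the dependency by splitting the lines directly. The `raw_text` string is a made-up stand-in for the file contents, and `sample_df` is named differently so it doesn't clobber the real `df`.

``` python
import pandas as pd

# Stand-in for the contents of eo.txt: one chunk per line
raw_text = 'First chunk of text.\nSecond chunk, with a "quote" and an under_score.\n'

# Split on newlines ourselves, so no separator character has to be absent
lines = [line for line in raw_text.splitlines() if line.strip()]
sample_df = pd.DataFrame({"passage": lines})
print(sample_df.shape)  # (2, 1)
```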

### Calculate Embeddings

Now that we have a Dataframe of chunks of the Executive Order, we can
calculate embeddings of each chunk. To do this we’ll use the
[E5-large-v2](https://huggingface.co/intfloat/e5-large-v2) language
model, which was trained to handle text prefixed with either `passage:`
or `query:`. Every chunk is considered a passage. We’ll add this as
another column on our Dataframe.

``` python
import llm

embedding_model = llm.get_embedding_model("lv2")
text_to_embed = df.passage.to_list()

# Our embedding model expects `passage: ` prefixes
text_to_embed = [f'passage: {t}' for t in text_to_embed]

df['embedding'] = list(embedding_model.embed_multi(text_to_embed))

df.head()
```

|  | passage | embedding |
|----|----|----|
| 0 | By the authority vested in me as President by ... | \[0.032344698905944824, -0.04333016648888588, 0... |
| 1 | \(a\) Artificial Intelligence must be safe and s... | \[0.01886950619518757, -0.057347141206264496, 0... |
| 2 | \(c\) The responsible development and use of AI ... | \[0.0486459881067276, -0.0712570995092392, 0.02... |
| 3 | \(e\) The interests of Americans who increasingl... | \[0.03564070537686348, -0.04887280985713005, 0.... |
| 4 | \(g\) It is important to manage the risks from t... | \[0.04095401614904404, -0.042341429740190506, 0... |

For our semantic search, we’ll also need an embedding of our query,
which the model expects to be prefixed with `query:`. Let’s ask what the
Order says regarding AI and healthcare:

``` python
query = "what does it say about healthcare?"

# Our embedding model expects a `query: ` prefix for retrieval
query_to_embed = f"query: {query}"
query_vector = embedding_model.embed(query_to_embed)

print(query_vector)
```

    [0.011035123839974403, -0.06264020502567291, 0.036343760788440704, -0.022550513967871666, -0.004930663388222456, 0.027655886486172676, -0.04244294762611389, -0.026744479313492775, -0.022813718765974045, 0.013104002922773361, 0.027848346158862114, -0.041959188878536224, 0.02923852950334549, 0.03592933714389801, 0.02084210328757763, 0.028341282159090042, -0.02188134379684925, 0.009380431845784187, 0.010694948956370354, -0.046167585998773575, 0.04979575797915459, -0.04051537066698074, -0.04705166816711426, 0.054594166576862335, -0.021378282457590103, -0.006090054754167795, -0.027005767449736595, -0.0056915683671832085, -0.02485739439725876, 0.025049963966012, 0.0013038198230788112, 0.020098360255360603, 0.03132014721632004, -0.10214236378669739, 0.03457639366388321, -0.005869136657565832, -0.041733402758836746, -0.0533079169690609, 0.043018240481615067, 0.02142527513206005, -0.013251637108623981, 0.021434243768453598, -0.01846863329410553, 0.06185981631278992, -0.006901243235915899, -0.007515963166952133, -0.026446444913744926, -0.022192247211933136, 0.008555920794606209, -0.00683502247557044, 0.04217502102255821, -0.05847731605172157, 0.05262995511293411, 0.010208839550614357, -0.026769086718559265, -0.0054522184655070305, 0.020620698109269142, 0.04013380780816078, -0.016234276816248894, -0.013201197609305382, 0.033620886504650116, 0.00933841709047556, 0.057325609028339386, -0.039771128445863724, 0.024898990988731384, -0.04952569678425789, 0.01868751272559166, -0.009222128428518772, 0.028759649023413658, 0.04622277244925499, -0.04479452595114708, -0.007873269729316235, 0.003972833044826984, 0.08234458416700363, 0.047766849398612976, -0.04056665301322937, -0.03526405245065689, 0.028415270149707794, 0.03255781903862953, 0.03818396478891373, -0.04255992919206619, -0.0518893301486969, -0.036653585731983185, 0.020367737859487534, -0.0004731871304102242, 0.030738964676856995, -0.0012389217736199498, 0.0024446682073175907, 0.05554775893688202, 0.013511369936168194, 
0.06238795816898346, 0.0018740486120805144, 0.008448319509625435, 0.03504892811179161, -0.04160211980342865, -0.00638029258698225, -0.050224918872117996, -0.005032839719206095, 0.013626458123326302, 0.008170071057975292, -0.04054887965321541, -0.027991971001029015, -0.034977927803993225, 0.014853181317448616, -0.024243168532848358, -0.02605707384645939, -0.05607284978032112, -0.00930757261812687, -0.03581421077251434, -0.0025592155288904905, -0.02390565536916256, -0.0380810908973217, -0.033349402248859406, -0.011960298754274845, -0.04205334186553955, 0.055994290858507156, 0.006477762944996357, 0.00991995818912983, 0.030702101066708565, 0.024755865335464478, 0.07455798983573914, 0.04026920348405838, -0.032499562948942184, -0.0020086164586246014, -0.035176586359739304, 0.0520380362868309, -0.04051459580659866, -0.04515765979886055, 0.04827379435300827, 0.0681162029504776, 0.01639464683830738, -0.012349777854979038, -0.024919848889112473, 0.0261624027043581, -0.0003906007332261652, 0.08279916644096375, -0.04583611711859703, -0.05605150759220123, 0.03352531045675278, 0.05521564930677414, -0.0490930899977684, -0.027380632236599922, 0.02854541875422001, 0.012674689292907715, -0.002931063063442707, -0.053501006215810776, -0.0288915503770113, -0.031300533562898636, -0.03822864219546318, -0.03784075006842613, -0.017202073708176613, -0.005835125222802162, -0.02155722677707672, -0.00019966973923146725, 0.038443099707365036, -0.055773064494132996, 0.0020749655086547136, -0.023329386487603188, -0.006293139886111021, 0.019844476133584976, 0.044987935572862625, 0.03083648905158043, -0.01008790172636509, 0.02570483647286892, 0.002781765768304467, 0.007234315853565931, 0.03924374282360077, -0.014665482565760612, -0.026686524972319603, 0.040999021381139755, 0.0032443120144307613, 0.0015701769152656198, 0.053893182426691055, 0.045284025371074677, 0.023676631972193718, 0.007213208358734846, 0.05706159397959709, -0.06629643589258194, 0.010098790749907494, -0.03431336581707001, 
0.0032631983049213886, 0.021331751719117165, -0.05340828001499176, -0.022274604067206383, -0.0436270497739315, 0.019490502774715424, 0.04619332030415535, -0.03779701516032219, -0.004007418639957905, -0.03109969012439251, -0.028927365317940712, -0.01648799516260624, -0.04384511336684227, 0.02099577710032463, -0.02223723568022251, -0.03204162418842316, -0.022197725251317024, -0.011281806975603104, -0.015587741509079933, 0.041081756353378296, 0.044835954904556274, 0.005444942973554134, 0.004057107958942652, 0.0030030268244445324, 0.008095227181911469, 0.010580062866210938, 0.002692743204534054, 0.027165278792381287, -0.04451528936624527, 0.018396437168121338, -0.012467225082218647, 0.01907152123749256, 0.024384012445807457, -0.03359571844339371, 0.002541579073294997, 0.03602273389697075, 0.007620786316692829, 0.0045867254957556725, -0.004530118312686682, 0.039663612842559814, -0.07118961960077286, 0.027506954967975616, 0.010756077244877815, -0.0027250293642282486, 0.020495502278208733, 0.025394247844815254, -0.037594713270664215, 0.017433777451515198, -0.008695165626704693, -0.02986626699566841, 0.012270666658878326, 0.016099197790026665, -0.03702491894364357, -0.03618799149990082, -0.012862246483564377, -0.05321879684925079, 0.03497174754738808, -0.012480347417294979, 0.012074748054146767, 0.021339774131774902, 0.025768857449293137, 0.027273213490843773, 0.006413111928850412, -0.01951907016336918, -0.020737817510962486, 0.03631201386451721, 0.005693630315363407, 0.014690431766211987, 0.0400673933327198, 0.040662623941898346, 0.005502395331859589, 0.0173542071133852, -0.04771675169467926, 0.02629489079117775, -0.003811753587797284, -0.007251544389873743, 0.04416917636990547, -0.061770226806402206, -0.016220202669501305, 0.03719717636704445, -0.01617533341050148, -0.03763722628355026, -0.04766486957669258, -0.025192739441990852, -0.019239462912082672, -0.008094199001789093, -0.005068974103778601, -0.004565471783280373, 0.07414168119430542, -0.041430603712797165, 
0.013045714236795902, 0.018361356109380722, 0.03342536836862564, -0.041664931923151016, -0.05250570923089981, -0.02396155521273613, -0.01865033246576786, -0.03515772148966789, 0.016364071518182755, -0.009918905794620514, 0.05359259247779846, -0.0008409120491705835, -0.003592758672311902, 0.030167188495397568, -0.008410172536969185, -0.002642557490617037, -0.026539165526628494, 0.003059686627238989, 0.029047558084130287, 0.01878088153898716, -0.04435497149825096, 0.004310435149818659, 0.05454620346426964, -0.009366977959871292, -0.009419695474207401, 0.031485412269830704, -0.016963956877589226, 0.007122103590518236, 0.009967380203306675, 0.024634771049022675, 0.022842811420559883, -0.0031100360210984945, -0.0643397867679596, -0.02673438936471939, -0.035047076642513275, -0.00703747384250164, 0.049209486693143845, -0.050232626497745514, 0.04699047654867172, 0.026202980428934097, -0.016680346801877022, -0.02222183719277382, -0.041290491819381714, -0.03719739243388176, -0.0399385504424572, 0.016396909952163696, -0.02527480199933052, 0.018787767738103867, 0.0006646675174124539, -0.01867998018860817, 0.03068104200065136, -0.0020506661385297775, -0.044911354780197144, -0.012430372647941113, -0.024320200085639954, 0.017804201692342758, -0.032045990228652954, -0.007124380674213171, 0.01598973572254181, 0.012343218550086021, -0.011404650285840034, -0.01943874917924404, 0.04149242863059044, -0.005655502434819937, -0.016777103766798973, -0.011175236664712429, -0.04135586693882942, -0.03618283197283745, 0.02853410318493843, 0.044518813490867615, 0.06200827658176422, 0.038675494492053986, -0.036812301725149155, -0.02377907745540142, 0.025207694619894028, 0.040055833756923676, 0.04191999137401581, 0.009881209582090378, 0.02629314549267292, 0.024176236242055893, 0.020148472860455513, 0.008412880823016167, -0.004142292309552431, -0.004587574861943722, -0.03939927741885185, -0.01107271108776331, 0.026553431525826454, -0.04704265668988228, 0.0134046645835042, 0.03654501587152481, 
0.017579609528183937, -0.00949288159608841, 0.06082817539572716, -0.006728034000843763, 0.008356080390512943, -0.051711030304431915, 0.03844171389937401, -0.01830201968550682, 0.00854733120650053, 0.034922968596220016, 0.004259245935827494, 0.020917760208249092, 0.010708301328122616, -0.02309831604361534, -0.04656030982732773, -0.04652083292603493, 0.05957057699561119, 0.02332383207976818, 0.027773132547736168, -0.02072453871369362, 0.01621401123702526, 0.01397494412958622, -0.018857672810554504, -0.027436040341854095, -0.002191656269133091, -0.004994520451873541, -0.004126602318137884, -0.017894916236400604, -0.01036886591464281, -0.04088125005364418, -0.001628147903829813, -0.038663312792778015, 0.018833275884389877, -0.030392177402973175, 0.008629074320197105, -0.028720470145344734, 0.02279716171324253, -0.01806335523724556, 0.008062546141445637, 0.010905652306973934, 0.03126005455851555, 0.04747697710990906, 0.04772476851940155, 0.0025600406806916, -0.027565544471144676, -0.02668774127960205, 0.04895208775997162, 0.035160355269908905, -0.003617622423917055, -0.024726560339331627, 0.0027959730941802263, 0.032439906150102615, -0.02167009562253952, 0.018054930493235588, -0.01872582919895649, 0.0019295919919386506, -0.02558404952287674, 0.006836448796093464, -0.0369705893099308, -0.02392488345503807, -0.0006662389496341348, -0.03071492351591587, -0.035662174224853516, -0.027658283710479736, 0.03889033943414688, -0.0037014412228018045, 0.026146110147237778, -0.012518925592303276, -0.020388586446642876, -0.05085292086005211, 0.04047977924346924, -0.04683171212673187, -0.025453921407461166, -0.022075410932302475, 0.04574955627322197, -0.04705757275223732, 0.014057472348213196, -0.0019396235002204776, 0.02185961976647377, -0.0364665687084198, 0.029328808188438416, 0.03975827246904373, -0.025506163015961647, -0.048030223697423935, 0.020831944420933723, -0.010402142070233822, -0.045267313718795776, -0.009884222410619259, -0.03742368519306183, 0.040443453937768936, 
0.03946416452527046, -0.01560758426785469, 0.05435897037386894, -0.01847420074045658, -0.022371601313352585, -0.02834167331457138, -0.024244116619229317, 0.016461217775940895, -0.015590337105095387, -0.00129281438421458, -0.04336583614349365, 0.031417686492204666, 0.031635694205760956, -0.04004405438899994, -0.030732493847608566, -0.006123801227658987, 0.04235366731882095, 0.01829519309103489, -0.04180118814110756, 0.022241240367293358, 0.007255924865603447, -0.011486359871923923, 0.03379999101161957, -0.0004169400199316442, -0.0184877160936594, 0.028425395488739014, 0.028919873759150505, 0.02586987614631653, 0.024930693209171295, 0.014547272585332394, -0.026854408904910088, -0.024916205555200577, 0.021643545478582382, 0.020358487963676453, -0.06973747164011002, 0.03774351254105568, -0.057213641703128815, 0.019329674541950226, 0.058849673718214035, -0.01767851412296295, -0.04142051935195923, 0.029757970944046974, 0.03226760774850845, 0.013497359119355679, 0.04983055964112282, -0.06860023736953735, -0.009090565145015717, 0.01876700483262539, -0.045754898339509964, -0.018559377640485764, 0.03907673433423042, -0.04343864694237709, 0.01983906328678131, 0.05090088024735451, -0.026130903512239456, 0.0198112390935421, -0.01724187843501568, 0.027581052854657173, -0.015150460414588451, 0.044671617448329926, -0.021948102861642838, -0.039155758917331696, -0.011670168489217758, 0.057173795998096466, 0.05130016431212425, 0.015086568892002106, 0.029861880466341972, 0.00045294041046872735, 0.012837855145335197, -0.013540389016270638, -0.0035288180224597454, -0.006696672644466162, 0.005161927081644535, -0.007997330278158188, -0.06018849462270737, 0.0040719653479754925, 0.04676881060004234, 0.05494493618607521, -0.02383231744170189, -0.02409941330552101, -0.02361765317618847, 0.05019611492753029, -0.02561134286224842, -0.011039866134524345, -0.0035623686853796244, -0.007303849793970585, -0.05598662421107292, -0.014519513584673405, -0.016709184274077415, 0.023226818069815636, 
0.0020540591794997454, 0.018423371016979218, 0.07162746787071228, -0.030987678095698357, -0.011564436368644238, -0.05398257449269295, -0.03393613547086716, 0.029724063351750374, 0.014501385390758514, 0.01975683681666851, 0.025587748736143112, -0.0031189853325486183, -0.028188880532979965, -0.028325166553258896, -0.058148093521595, 0.06817413866519928, -0.05185883492231369, 0.010479864664375782, 0.013737188652157784, -0.025603286921977997, 0.004288510885089636, 0.03330473229289055, 0.03081020526587963, 0.00929239485412836, 0.013587307184934616, 0.04477598890662193, -0.010759776458144188, -0.013349821791052818, -0.03450658544898033, 0.04914005845785141, -0.008455680683255196, -0.02566939778625965, 0.06540139764547348, -0.014167072251439095, -0.04206572845578194, 0.040725916624069214, -0.027056824415922165, -0.027743373066186905, 0.01723734475672245, -0.026086552068591118, 0.011066379956901073, -0.028094105422496796, 0.021090257912874222, -0.006315530743449926, -0.0038536370266228914, 0.002642259933054447, -0.027106409892439842, -0.04236611723899841, 0.034201622009277344, 0.02585034817457199, 0.029318731278181076, 0.04393518716096878, -0.021400194615125656, -0.01966845989227295, 0.037550002336502075, -0.01055433414876461, 0.013854894787073135, -0.02539399452507496, -0.029923539608716965, 0.011284036561846733, 0.03166112303733826, -0.05249398201704025, 0.023592980578541756, -0.032214708626270294, 0.048454105854034424, -0.03129222244024277, 0.019876103848218918, 0.027158288285136223, -0.010018077678978443, -0.015372109599411488, -0.03651623800396919, 0.030494078993797302, 0.030631577596068382, 0.013395410031080246, -0.04203599691390991, 0.017429262399673462, -0.0336260125041008, -0.00458034360781312, 0.04320629686117172, 0.04873400554060936, -0.036522023379802704, -0.007033052854239941, 0.04841015860438347, 0.04686618968844414, -0.0226470734924078, -0.0156414732336998, 0.03650740534067154, -0.02229906991124153, 0.04247375950217247, 0.02045833133161068, 
0.0011794024612754583, -0.02260892651975155, -0.034504398703575134, 0.044768236577510834, -0.031263917684555054, -0.045521024614572525, 0.005588601343333721, -0.056517187505960464, 0.03272703289985657, 0.03745061159133911, 0.048094216734170914, 0.04812173917889595, -0.051041409373283386, 0.017124954611063004, -0.028181584551930428, 0.0380525179207325, -0.0037846474442631006, -0.03125273063778877, 0.01894977129995823, -0.040786415338516235, -0.04169308394193649, 0.05256915092468262, -0.03413494676351547, 0.02746882289648056, 0.033805206418037415, 0.007925352081656456, 0.02444133721292019, 0.018344828858971596, -0.009217273443937302, 0.0021543705370277166, 0.03575904294848442, 0.020160242915153503, 0.02286422997713089, -0.020686518400907516, 0.026341790333390236, 0.12740176916122437, -0.00893761869519949, 0.020138844847679138, -0.02238314412534237, 0.018193142488598824, 0.03919895738363266, -0.022538235411047935, 0.025507012382149696, -0.023426273837685585, 0.027672510594129562, 0.04479461535811424, 0.051694680005311966, 0.024090928956866264, 0.05755620077252388, -0.014703532680869102, 0.017124010249972343, -0.01627686806023121, -0.0018241779180243611, -0.02017812617123127, -0.013297722674906254, 0.02810887061059475, -0.03654468432068825, -0.040724869817495346, -0.02429257705807686, 0.02536279335618019, -0.004012022167444229, -0.0012858447153121233, -0.04617433249950409, 0.017472991719841957, 0.05619405210018158, 0.003411441808566451, 0.02130633033812046, -0.03523368015885353, -0.017101695761084557, -0.020463088527321815, -0.025893187150359154, -0.04365244880318642, 0.03732496127486229, 0.003834214061498642, -0.02485889010131359, 0.023965951055288315, 0.04507003724575043, -0.039340391755104065, -0.01395449135452509, 0.043282754719257355, -0.0028774579986929893, 0.05358149856328964, -0.02236671932041645, 0.005119263660162687, 0.04382428526878357, 0.013711007311940193, -0.01928921416401863, -0.04386340454220772, 0.013973870314657688, -0.01751251146197319, 
-0.021039115265011787, 0.018643466755747795, 0.021641723811626434, -0.019717521965503693, -0.04447349160909653, -0.045124512165784836, 0.001166265457868576, 0.027790851891040802, 0.0349087193608284, -0.05426891893148422, -0.020197005942463875, 0.013967194594442844, 0.020221564918756485, 0.02676599659025669, -0.05146752670407295, 0.028382916003465652, -0.03574734181165695, -0.0307468231767416, 0.0271960087120533, -0.060862455517053604, -0.019536489620804787, 0.029809515923261642, 0.005997729953378439, -0.003236045129597187, -0.02296912670135498, -0.012584153562784195, 0.020569052547216415, -0.0398726686835289, 0.012893066741526127, 0.05747450888156891, -0.04004427418112755, -0.0016912904102355242, 0.03588302433490753, -0.040752533823251724, -0.021931834518909454, -0.03274866193532944, 0.02146611548960209, -0.01936306245625019, 0.017079008743166924, -0.007505142129957676, 0.06527362018823624, -0.020604269579052925, 0.01886555179953575, -0.00841802079230547, 0.03616372495889664, 0.020610619336366653, 0.04090982303023338, -0.00286914873868227, 0.01186350453644991, -0.015292976051568985, -0.03638255596160889, 0.028694557026028633, 0.002353183226659894, 0.017770130187273026, 0.01283283717930317, 0.012778712436556816, 0.02851232700049877, 0.008139816112816334, 0.03719107434153557, -0.03728439658880234, 0.012673329561948776, 0.03146594390273094, -0.019588639959692955, 0.006983111146837473, 0.03652822971343994, 0.01976114511489868, 0.043489858508110046, 0.04203694313764572, -0.0072769043035805225, -0.031777746975421906, 0.025811728090047836, -0.02263280190527439, -0.024895695969462395, -0.024923745542764664, -0.021317312493920326, 0.025197185575962067, -0.031068982556462288, -0.03568267449736595, -0.014184455387294292, -0.018119284883141518, 0.04019543156027794, 0.027073360979557037, 0.019060516729950905, -0.032629843801259995, 0.01988561637699604, -0.046449366956949234, -0.026410577818751335, -0.03311290591955185, -0.04030410945415497, 0.0073549640364944935, 
0.03774447366595268, -0.02418743632733822, 0.03917284309864044, 0.037112291902303696, 0.04895860329270363, -0.03377828747034073, -0.021235689520835876, -0.008727184496819973, 0.012683587148785591, 0.025314902886748314, 0.02410356141626835, 0.06801478564739227, -0.02123037353157997, -0.0036285393871366978, 0.043283384293317795, 0.025348279625177383, 0.03189971670508385, -0.04675361514091492, -0.008785491809248924, -0.01910555362701416, -0.03074111044406891, -0.032706547528505325, -0.02195163257420063, -0.025955626741051674, -0.013656404800713062, -0.0017331260023638606, 0.04122364893555641, -0.003353940322995186, 0.003944667987525463, 0.02557617612183094, 0.033677756786346436, -0.0069642821326851845, -0.037354182451963425, -0.003706785151734948, 0.018041592091321945, 0.023099323734641075, 0.005993315484374762, -0.023070724681019783, -0.003592608030885458, 0.01580343022942543, -0.021968074142932892, -0.055538859218358994, 0.022645818069577217, 0.008289964869618416, 0.007825851440429688, 0.029414691030979156, 0.00884107407182455, -0.029503418132662773, -0.00685040233656764, -0.04297958314418793, -0.0067012435756623745, 0.0408974252641201, 0.03618587553501129, 0.008660215884447098, 0.005283386912196875, 0.02173357643187046, 0.016801103949546814, 0.014148939400911331, 0.04589837044477463, -0.07519549131393433, 0.0008492959314025939, -0.03911217674612999, 0.01957186684012413, 0.04065250977873802, -0.017508283257484436, -0.022696377709507942, -0.023634128272533417, 0.023389438167214394, 0.005290127359330654, 0.030684763565659523, 0.027884958311915398, 0.0007517277263104916, -0.01770036853849888, 0.011700128205120564, -0.010737712495028973, 0.04028428718447685, 0.023366102948784828, -0.03046400658786297, -0.023285800591111183, -0.047068167477846146, -0.019877061247825623, 0.005805285647511482, 0.026605457067489624, 0.014891230501234531, -0.04306803271174431, 0.06144380196928978, 0.02467053383588791, 0.009495215490460396, 0.02271324396133423, 0.022022845223546028, 
-0.0344187393784523, 0.034510448575019836, 0.035804156213998795, -0.02319440431892872, -0.012169327586889267, 0.027845632284879684, -0.03938868269324303, 0.014688534662127495, -0.0326634906232357, 0.035405971109867096, 0.09329289197921753, -0.021386025473475456, -0.005334523506462574, -0.013434350490570068, 0.03625260666012764, -0.035231638699769974, 0.01714746095240116, -0.010116915218532085, -0.015791669487953186, 0.03326767310500145, 0.06997721642255783, 0.02518271654844284, -0.025026991963386536, -0.033969491720199585, 0.006438582669943571, -0.010330992750823498, 0.022015077993273735, 0.035159625113010406, 0.03975333645939827, -0.09261862188577652, -0.030692344531416893, -0.007167263887822628, 0.030045578256249428, 0.04790728539228439, 0.037478335201740265, 0.008102349936962128, 0.023854972794651985, -0.02429278939962387, -0.007957016117870808, -0.029902758076786995, -0.03795326128602028, 0.02096726931631565, -0.009724924340844154, 0.05335197597742081, -0.031247764825820923, -0.029010239988565445, 0.03935291990637779, 0.0004320724692661315, -0.015092982910573483, -0.005163619294762611, 0.0006704957922920585, 0.019765393808484077, -0.0023283318150788546, 0.008752129971981049, 0.011935974471271038, 0.041026521474123, -0.02911226823925972, -0.00707174651324749, -0.05678257718682289, 0.018372097983956337, -0.007054861634969711, -0.03395901620388031, 0.029341937974095345, 0.011115530505776405, 0.04074866324663162, -0.015816841274499893, 0.05173950269818306, -0.015220806002616882, -0.04581461101770401, -0.03737398982048035, -0.0463782399892807, -0.04844944179058075, 0.011588484980165958, 0.007512776181101799, 0.012602098286151886, 0.008942355401813984, -0.019375091418623924, -0.00784570537507534, -0.016177771613001823, 0.020358964800834656, 0.030678516253829002, 0.058078765869140625, -0.05181894451379776, -0.007111021783202887, 0.004367280285805464, 0.04949195310473442, 0.024104589596390724, 0.013972863554954529, -0.022043360397219658, 0.02481302246451378, 
-0.02353677898645401, -0.019958026707172394, -0.03086337447166443, -0.0342184379696846, -0.042619749903678894, -0.013002551160752773, 0.031215941533446312, -0.029242197051644325, -0.039825692772865295, -0.04158396273851395, 0.019694391638040543, 0.02995566837489605, 0.028752531856298447, -0.02492348663508892, -0.006047347094863653, 0.04264743626117706, -0.006050477270036936, -0.012319333851337433, -0.02809244394302368, -0.04274909943342209, -0.015643877908587456, 0.0013809194788336754, 0.022640932351350784, -0.03112141042947769, 0.029752563685178757, -0.013332386501133442, 0.022779937833547592, 0.026944121345877647, 0.0026380603667348623, -0.04207104071974754, 0.022562168538570404, -0.01429622434079647, 0.044384222477674484, 0.004642078652977943]

### Semantic Search

If we had stored our embeddings in the LLM library’s preferred
structure, a Collection backed by SQLite, we could simply use [llm
similar](https://llm.datasette.io/en/stable/embeddings/cli.html#llm-similar)
or its [corresponding Python
API](https://llm.datasette.io/en/stable/embeddings/python-api.html#retrieving-similar-items).
As far as I can tell, the API doesn’t yet support embeddings held in
other data structures (like our Dataframe), so we’ll have to calculate
[cosine similarities](https://en.wikipedia.org/wiki/Cosine_similarity)
ourselves. Luckily, we can [borrow the implementation from Simon’s open
source
library](https://github.com/simonw/llm/blob/abcb457b20367ee56e27602e3553bb4bd6a17312/llm/__init__.py#L252):

``` python
def cosine_similarity(a, b):
    dot_product = sum(x * y for x, y in zip(a, b))
    magnitude_a = sum(x * x for x in a) ** 0.5
    magnitude_b = sum(x * x for x in b) ** 0.5
    return dot_product / (magnitude_a * magnitude_b)
```
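As a quick sanity check (not part of the original notebook), it can help to run the function on a few toy vectors where the answer is known in advance. The sketch below repeats the function so it runs on its own; the 2-D vectors are made up for illustration and are far smaller than real embeddings.

``` python
def cosine_similarity(a, b):
    dot_product = sum(x * y for x, y in zip(a, b))
    magnitude_a = sum(x * x for x in a) ** 0.5
    magnitude_b = sum(x * x for x in b) ** 0.5
    return dot_product / (magnitude_a * magnitude_b)

# Toy 2-D vectors, chosen only to illustrate the three extreme cases.
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])   # parallel vectors
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])       # perpendicular vectors
opposite = cosine_similarity([1.0, 2.0], [-1.0, -2.0])       # anti-parallel vectors

# Rounded to hide floating-point noise in the square roots.
print(round(same_direction, 6), round(orthogonal, 6), round(opposite, 6))
```

The three scores should come out at roughly 1, 0, and -1: the score depends on the direction of the vectors, not their length. Two caveats worth knowing: an all-zero vector raises `ZeroDivisionError`, and `zip` silently truncates to the shorter input, so both vectors should come from the same embedding model.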

Now, iterate over every embedding in our Dataframe and calculate the
similarity score against our query embedding vector:

``` python
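# Work on a copy so the original Dataframe stays untouched
comp_df = df.copy()

# Score each passage's embedding against the query embedding
comp_df['similarity'] = comp_df.apply(
    lambda row: cosine_similarity(
        query_vector,
        row.embedding,
    ),
    axis=1,
)

comp_df.head()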
```

|  | passage | embedding | similarity |
|----|----|----|----|
| 0 | By the authority vested in me as President by ... | \[0.032344698905944824, -0.04333016648888588, 0... | 0.781552 |
| 1 | \(a\) Artificial Intelligence must be safe and s... | \[0.01886950619518757, -0.057347141206264496, 0... | 0.778486 |
| 2 | \(c\) The responsible development and use of AI ... | \[0.0486459881067276, -0.0712570995092392, 0.02... | 0.779455 |
| 3 | \(e\) The interests of Americans who increasingl... | \[0.03564070537686348, -0.04887280985713005, 0.... | 0.794971 |
| 4 | \(g\) It is important to manage the risks from t... | \[0.04095401614904404, -0.042341429740190506, 0... | 0.785406 |

Next, select the three passages with the best similarity scores. We’ll
feed these as context to the LLaMA model.

``` python
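# Sort by similarity (highest first) and keep the top three rows
best_3_matches = comp_df.sort_values("similarity", ascending=False).head(3)
# Join the matching passages into one newline-separated context string
context = "\n".join(best_3_matches.passage.values)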
```
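For readers who are not holding their passages in a Dataframe, the same score-then-select step could be sketched in plain Python. This is only an illustration under stated assumptions: `top_k_passages`, the toy passages, and the 3-D vectors are all hypothetical and stand in for the real passages and embeddings.

``` python
import heapq


def cosine_similarity(a, b):
    dot_product = sum(x * y for x, y in zip(a, b))
    magnitude_a = sum(x * x for x in a) ** 0.5
    magnitude_b = sum(x * x for x in b) ** 0.5
    return dot_product / (magnitude_a * magnitude_b)


def top_k_passages(query_vector, rows, k=3):
    """Return the k best (score, passage) pairs, highest score first.

    rows is an iterable of (passage, embedding) pairs. Hypothetical helper,
    not part of the LLM library.
    """
    scored = (
        (cosine_similarity(query_vector, embedding), passage)
        for passage, embedding in rows
    )
    # nlargest avoids sorting the whole collection when k is small.
    return heapq.nlargest(k, scored, key=lambda pair: pair[0])


# Made-up passages with made-up 3-D "embeddings".
toy_rows = [
    ("passage about health", [0.9, 0.1, 0.0]),
    ("passage about housing", [0.1, 0.9, 0.0]),
    ("passage about defense", [0.0, 0.1, 0.9]),
    ("passage about hospitals", [0.8, 0.2, 0.1]),
]
toy_query = [1.0, 0.0, 0.0]

best = top_k_passages(toy_query, toy_rows, k=2)
toy_context = "\n".join(passage for _, passage in best)
print(toy_context)
```

With these toy vectors, the two passages pointing most nearly along the query's direction (health and hospitals) are the ones selected. The choice of `k` is a trade-off: more passages give the model more material but also more unrelated text to wade through.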

### Ask the LLM

Now that we’ve selected the top 3 passages, let’s feed them into LLaMA
2.

``` python
model = llm.get_model("llama2")
```

Even though we’re prepending context to the prompt, it’s helpful to
give the model a system prompt to guide how it responds. This can help
it stay “focused” on the context and respond in the voice that we
expect. The system prompt is open to creativity and experimentation.

``` python
system = "You are an assistant. You answer questions in a single \
paragraph about the policy from President Biden. The provided context \
comes directly from the policy. You MUST use the provided information \
as context. Not all provided information will be helpful, ONLY reference \
information if it is related to my query. You may quote the context \
information if helpful."
```

Now, feed the context and the query into the model.

``` python
print(f"Query: {query}\n")
# Put the retrieved passages ahead of the question in a single prompt
response = model.prompt(
    f'{context}\n{query}',
    system=system,
)

print("Response:\n")
print(response.text())
```

    Query: what does it say about healthcare?

    Response:

    The policy from President Biden related to healthcare is outlined in section 8(b)(i) of the policy, which states that:
    "Within 90 days of the date of this order, the Secretary of HHS shall, in consultation with the Secretary of Defense and the Secretary of Veterans Affairs, establish an HHS AI Task Force that shall, within 365 days of its creation, develop a strategic plan that includes policies and frameworks — possibly including regulatory action, as appropriate — on responsible deployment and use of AI and AI-enabled technologies in the health and human services sector (including research and discovery, drug and device safety, healthcare delivery and financing, and public health), and identify appropriate guidance and resources to promote that deployment, including in the following areas:
    (A) development, maintenance, and use of predictive and generative AI-enabled technologies in healthcare delivery and financing — including quality measurement, performance improvement, program integrity, benefits administration, and patient experience — taking into account considerations such as appropriate human oversight of the application of AI-generated output;
    (D) incorporation of safety, privacy, and security standards into the software-development lifecycle for protection of personally identifiable information, including measures to address AI-enhanced cybersecurity threats in the health and human services sector; (E) development, maintenance, and availability of documentation to help users determine appropriate and safe uses of AI in local settings in the health and human services sector; (F) work to be done with State, local, Tribal, and territorial health and human services agencies to advance positive use cases and best practices for use of AI in local settings; and (G) identification of uses of AI to promote workplace efficiency and satisfaction in the health and human services sector, including reducing administrative burdens."
    This section outlines the responsibilities of the Secretary of HHS related to AI in the healthcare sector. Some of the key points include:
    * Establishing an HHS AI Task Force to develop a strategic plan for responsible AI deployment and use in the health and human services sector.
    * Identifying policies and frameworks for regulatory action, as appropriate, to ensure responsible deployment and use of AI in healthcare.
    * Developing guidance and resources to promote the appropriate and safe use of AI in healthcare

Overall, this looks like it does a good job! The response cites a
specific section and quotes the healthcare provisions at length.

Of course, it’s extremely important to keep a human in the loop when
referencing government documents. The model may still hallucinate, or it
could entirely miss important context. Some of these shortcomings are
baked into the model itself; others stem from implementation choices in
this post.
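One small aid for that human review, offered only as a rough sketch and not something the original notebook does, is to check whether text the model puts in quotation marks actually appears in the context it was given. The helper name `unsupported_quotes`, the `min_length` cutoff, and the toy strings below are all assumptions made for illustration.

``` python
import re


def unsupported_quotes(response_text, context, min_length=20):
    """Return quoted spans from the response that are not found verbatim in the context.

    Hypothetical helper: comparison ignores case and collapses whitespace,
    and only straight double quotes are recognized.
    """
    def normalize(text):
        return " ".join(text.split()).lower()

    normalized_context = normalize(context)
    quotes = re.findall(r'"([^"]+)"', response_text)
    return [
        quote
        for quote in quotes
        if len(quote) >= min_length and normalize(quote) not in normalized_context
    ]


# Toy strings standing in for the retrieved context and the model's answer.
toy_context = "The Secretary of HHS shall establish an HHS AI Task Force."
toy_response = (
    'The order says "the Secretary of HHS shall establish an HHS AI Task Force" '
    'and "all hospitals must adopt AI within 30 days".'
)

flagged = unsupported_quotes(toy_response, toy_context)
print(flagged)
```

Here the second quotation is flagged because it never appears in the toy context. This is a crude filter: it will also flag honest quotes that skip or reorder text, and it says nothing about paraphrased claims, so it narrows what a reviewer needs to read rather than replacing the review.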

If nothing else, this shows a fascinating interface for interacting
with long, wordy documents!

### License

    Copyright (c) 2023, Matt Hodges
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions are met:

    * Redistributions of source code must retain the above copyright notice, this
      list of conditions and the following disclaimer.

    * Redistributions in binary form must reproduce the above copyright notice,
      this list of conditions and the following disclaimer in the documentation
      and/or other materials provided with the distribution.

    * Neither the name of Language Models on the AI Executive Order nor the names of its
      contributors may be used to endorse or promote products derived from
      this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
    AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
    IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
    DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
    FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
    DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
    SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
    CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
    OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
