Using AI assistants
Large language models (LLMs) have become a valuable aid for exploring APIs and writing code. While many commercial solutions, such as ChatGPT or GitHub Copilot, require a subscription, other models such as Meta’s Llama are free to use. Some versions of these models are small enough to run on a regular computer. One of the simplest ways to use such models is Ollama, which can simply be downloaded and installed. It gives access to numerous free models that can be downloaded with simple commands.
Here we test a small version of the Llama 3.1 model. The command below, run from the terminal, downloads the model on first use and then starts an interactive session:
ollama run llama3.1
To use the model directly in Jupyter, one can then install the Jupyter AI extension using:
conda install -c conda-forge jupyter-ai
The magic commands below then give access to the models downloaded via Ollama.
Example
The small example below shows how to load the magic commands using load_ext and then ask the selected model a programming question. The model's answer is then copied into a cell to test it.
%load_ext jupyter_ai_magics
%%ai ollama:llama3.1
Write python code to import a table and summarise data by combining data by the column species.
Here is the Python code that imports the table, summarizes the data by combining it by the column “species”, and prints the output in Markdown format:
import pandas as pd
# Load the table into a Pandas DataFrame
data = {
    'species': ['cat', 'dog', 'cat', 'bird', 'dog', 'bird'],
    'height (in)': [10, 15, 12, 20, 18, 22],
    'weight (lbs)': [50, 75, 60, 30, 65, 35]
}
df = pd.DataFrame(data)
# Group the data by "species" and summarize
summary = df.groupby('species').agg({
    'height (in)': ['mean', 'count'],
    'weight (lbs)': ['mean']
})
# Print the summary in Markdown format
print("# Summary of Data by Species")
print("| Species | Mean Height (in) | Count | Mean Weight (lbs) |")
for species, group in summary.iterrows():
    print(f"| {species} | {group['height (in)']['mean']} | {group['height (in)']['count']} | {group['weight (lbs)']['mean']} |")
Output:
# Summary of Data by Species
| Species | Mean Height (in) | Count | Mean Weight (lbs) |
| cat | 11.0 | 2 | 55.0 |
| bird | 21.0 | 2 | 32.5 |
| dog | 16.5 | 2 | 70.0 |
Note: This code assumes that the table is stored in a Pandas DataFrame df. If you have a CSV or Excel file, you’ll need to use a library like pandas.read_csv() or pandas.read_excel() to load it into the DataFrame.
import pandas as pd
# Load the table into a Pandas DataFrame
data = {
    'species': ['cat', 'dog', 'cat', 'bird', 'dog', 'bird'],
    'height (in)': [10, 15, 12, 20, 18, 22],
    'weight (lbs)': [50, 75, 60, 30, 65, 35]
}
df = pd.DataFrame(data)
# Group the data by "species" and summarize
summary = df.groupby('species').agg({
    'height (in)': ['mean', 'count'],
    'weight (lbs)': ['mean']
})
# Print the summary in Markdown format
print("# Summary of Data by Species")
print("| Species | Mean Height (in) | Count | Mean Weight (lbs) |")
for species, group in summary.iterrows():
    print(f"| {species} | {group['height (in)']['mean']} | {group['height (in)']['count']} | {group['weight (lbs)']['mean']} |")
# Summary of Data by Species
| Species | Mean Height (in) | Count | Mean Weight (lbs) |
| bird | 21.0 | 2.0 | 32.5 |
| cat | 11.0 | 2.0 | 55.0 |
| dog | 16.5 | 2.0 | 70.0 |
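In the cell output above, the count is printed as 2.0 rather than 2. This happens because iterrows() returns each row as a single Series, and a Series holds only one dtype, so the integer count is upcast to float. The sketch below is one possible variation, not part of the model's answer: it assumes the same data, uses named aggregation to get flat column names (mean_height, n and mean_weight are names chosen here for illustration), and iterates with itertuples(), which keeps each column's own type.

```python
import pandas as pd

df = pd.DataFrame({
    "species": ["cat", "dog", "cat", "bird", "dog", "bird"],
    "height (in)": [10, 15, 12, 20, 18, 22],
    "weight (lbs)": [50, 75, 60, 30, 65, 35],
})

# Named aggregation: each keyword becomes a flat output column.
# The output column names are illustrative choices, not from the original answer.
flat = df.groupby("species").agg(
    mean_height=("height (in)", "mean"),
    n=("height (in)", "count"),
    mean_weight=("weight (lbs)", "mean"),
)

# Build the Markdown table row by row; itertuples() does not upcast the count.
lines = [
    "| Species | Mean Height (in) | Count | Mean Weight (lbs) |",
    "|---|---|---|---|",
]
for row in flat.itertuples():
    lines.append(f"| {row.Index} | {row.mean_height} | {row.n} | {row.mean_weight} |")

print("\n".join(lines))
```

The added separator row (|---|---|---|---|) is what lets a Markdown renderer treat the printed lines as a table.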
summary
| species | height (in) mean | height (in) count | weight (lbs) mean |
|---|---|---|---|
| bird | 21.0 | 2 | 32.5 |
| cat | 11.0 | 2 | 55.0 |
| dog | 16.5 | 2 | 70.0 |
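The model's note mentions pandas.read_csv() for data stored in a file rather than typed in by hand. A minimal sketch of that case follows, assuming a CSV with the same three columns. An in-memory io.StringIO buffer stands in for a real file so the example is self-contained; with an actual file, its path would be passed to read_csv() instead. The names csv_text, df_csv and summary_csv are illustrative.

```python
import io

import pandas as pd

# Stand-in for the contents of a CSV file (assumed layout: one row per animal).
csv_text = """species,height (in),weight (lbs)
cat,10,50
dog,15,75
cat,12,60
bird,20,30
dog,18,65
bird,22,35
"""

# read_csv accepts a file path or any file-like object.
df_csv = pd.read_csv(io.StringIO(csv_text))

# Same grouping and aggregation as in the model's answer.
summary_csv = df_csv.groupby("species").agg({
    "height (in)": ["mean", "count"],
    "weight (lbs)": ["mean"],
})

print(summary_csv)
```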