Generative AI with Vertex AI: Prompt Design



1. Install Vertex AI SDK and other required packages

%pip install --upgrade --user --quiet google-cloud-aiplatform
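
As an optional sanity check, you can print the installed SDK version. This is a small sketch and assumes the package exposes a __version__ attribute, which recent google-cloud-aiplatform releases do:

from google.cloud import aiplatform

print(aiplatform.__version__)  # Confirm the SDK imports and report its version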


2. Restart runtime

import IPython

app = IPython.Application.instance()

app.kernel.do_shutdown(True)


3. Authenticate your notebook environment (Colab only)

import sys

if "google.colab" in sys.modules:
    from google.colab import auth

    auth.authenticate_user()

4. Set Google Cloud project information and initialize Vertex AI SDK

To get started using Vertex AI, you must have an existing Google Cloud project and enable the Vertex AI API.
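
If the Vertex AI API is not yet enabled and the gcloud CLI is available in your environment (for example, Colab or Cloud Shell), you can enable it directly from the notebook. This is a hedged sketch and assumes your account has permission to enable services on the project:

# Enable the Vertex AI API for your project (replace the placeholder with your project ID)
!gcloud services enable aiplatform.googleapis.com --project "your-project-id"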

PROJECT_ID = "isidengan project id"  # @param {type:"string"}
LOCATION = "lokasi server google"  # @param {type:"string"}

import vertexai

vertexai.init(project=PROJECT_ID, location=LOCATION)

from vertexai.generative_models import GenerationConfig, GenerativeModel
import time

5. Load model


model = GenerativeModel("gemini-1.5-flash")


def call_gemini(prompt, generation_config=GenerationConfig(temperature=1.0)):
    """Generate a response, retrying with exponential backoff on errors."""
    wait_time = 1
    while True:
        try:
            # Return as soon as a response is generated successfully
            return model.generate_content(prompt, generation_config=generation_config).text
        except Exception:  # Ideally, catch the specific API exception type here
            time.sleep(wait_time)
            wait_time *= 2  # Double the wait time before retrying

def send_message_gemini(model, prompt):
    """Send a chat message, retrying with exponential backoff on errors."""
    wait_time = 1
    while True:
        try:
            # Return as soon as the message is answered successfully
            return model.send_message(prompt).text
        except Exception:  # Ideally, catch the specific API exception type here
            time.sleep(wait_time)
            wait_time *= 2  # Double the wait time before retrying

6. Prompt engineering best practices


Prompt engineering is all about designing your prompts so that the response is what you were actually hoping to see.

The idea of using "unfancy" prompts is to minimize the noise in your prompt to reduce the possibility of the LLM misinterpreting the intent of the prompt. Below are a few guidelines on how to engineer "unfancy" prompts.

In this section, you'll cover the following best practices when engineering prompts:

  • Be concise
  • Be specific, and well-defined
  • Ask one task at a time
  • Improve response quality by including examples
  • Turn generative tasks into classification tasks to improve safety

    -  Be concise

prompt = "Suggest a name for a flower shop that sells bouquets of dried flowers"

print(call_gemini(prompt))


    -  Be specific, and well-defined


prompt = "Generate a list of ways that makes Earth unique compared to other planets"

print(call_gemini(prompt))

    -   Ask one task at a time

prompt = "What's the best method of boiling water?"

print(call_gemini(prompt))


prompt = "Why is the sky blue?"

print(call_gemini(prompt))


7. Watch out for hallucinations


Although LLMs have been trained on a large amount of data, they can generate text containing statements not grounded in truth or reality; these responses are often referred to as "hallucinations", a consequence of the model's limited memorization of facts. Note that simply prompting the LLM to provide a citation isn't a fix to this problem, as there are instances of LLMs providing false or inaccurate citations. Dealing with hallucinations is a fundamental challenge of LLMs and an ongoing research area, so it is important to be aware that LLMs may give you confident, correct-sounding statements that are in fact incorrect.

Note that if you intend to use LLMs for creative use cases, hallucination can actually be quite useful.


generation_config = GenerationConfig(temperature=1.0)

prompt = "What day is it today?"

print(call_gemini(prompt, generation_config))
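
The temperature setting controls how much randomness is used when sampling the response. As a hedged sketch, you can pass a lower temperature through the same helper to make the output more deterministic; note that a lower temperature does not, by itself, prevent hallucinations:

# Lower temperature means less random sampling, but it does not guarantee factual accuracy
low_temp_config = GenerationConfig(temperature=0.2)

print(call_gemini("What day is it today?", low_temp_config))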


8. Using system instructions to guardrail the model against irrelevant responses

How can we attempt to reduce the chances of irrelevant responses and hallucinations?

One way is to provide the LLM with system instructions.

Let's see how system instructions work and how you can use them to reduce hallucinations or deflect irrelevant questions for a travel chatbot.

Suppose we ask a simple question about one of Italy's most famous tourist spots.


model_travel = GenerativeModel(
    model_name="gemini-1.5-flash",
    system_instruction=[
        "Hello! You are an AI chatbot for a travel web site.",
        "Your mission is to provide helpful queries for travelers.",
        "Remember that before you answer a question, you must check to see if it complies with your mission.",
        "If not, you can say, Sorry I can't answer that question.",
    ],
)

chat = model_travel.start_chat()

prompt = "What is the best place for sightseeing in Milan, Italy?"

print(send_message_gemini(chat, prompt))


prompt = "What's for dinner?"

print(send_message_gemini(chat, prompt))


9. Generative tasks lead to higher output variability

The prompt below results in an open-ended response, which is useful for brainstorming, but the response is highly variable.


prompt = "I'm a high school student. Recommend me a programming activity to improve my skills."

print(call_gemini(prompt))


10. Classification tasks reduce output variability

prompt = """Saya siswa SMA. Dari aktivitas ini, mana yang Anda sarankan dan mengapa?

a) learn Python
b) learn JavaScript
c) learn Fortran
"""

print(call_gemini(prompt))


11.  Improve response quality by including examples


Another way to improve response quality is to add examples in your prompt. The LLM learns in-context from the examples on how to respond. Typically, one to five examples (shots) are enough to improve the quality of responses. Including too many examples can cause the model to over-fit the data and reduce the quality of responses.

Similar to classical model training, the quality and distribution of the examples is very important. Pick examples that are representative of the scenarios that you need the model to learn, and keep the distribution of the examples (e.g. number of examples per class in the case of classification) aligned with your actual distribution.


   -  Zero-shot prompt


Below is an example of zero-shot prompting, where you don't provide any examples to the LLM within the prompt itself.


prompt = """Decide whether a Tweet's sentiment is positive, neutral, or negative.

Tweet: I loved the new YouTube video you made!
Sentiment:
"""
print(call_gemini(prompt))


  -  One-shot prompt

Below is an example of one-shot prompting, where you provide one example to the LLM within the prompt to give some guidance on what type of response you want.


prompt = """Decide whether a Tweet's sentiment is positive, neutral, or negative.

Tweet: I loved the new YouTube video you made!
Sentiment: positive

Tweet: That was awful. Super boring 😠
Sentiment:
"""

print(call_gemini(prompt))


-   Few-shot prompt

Below is an example of few-shot prompting, where you provide a few examples to the LLM within the prompt to give some guidance on what type of response you want.

prompt = """Decide whether a Tweet's sentiment is positive, neutral, or negative.

Tweet: I loved the new YouTube video you made!
Sentiment: positive

Tweet: That was awful. Super boring 😠
Sentiment: negative

Tweet: Something surprised me about this video - it was actually original. It was not the same old recycled stuff that I always see. Watch it - you will not regret it.
Sentiment:
"""

print(call_gemini(prompt))
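
To follow the earlier advice about keeping the distribution of examples representative, a hedged variation of the few-shot prompt above includes one example per class (positive, negative, and neutral) before the Tweet to classify; the example Tweets here are invented for illustration:

prompt = """Decide whether a Tweet's sentiment is positive, neutral, or negative.

Tweet: I loved the new YouTube video you made!
Sentiment: positive

Tweet: That was awful. Super boring 😠
Sentiment: negative

Tweet: The video was uploaded yesterday at noon.
Sentiment: neutral

Tweet: Something surprised me about this video - it was actually original. Watch it - you will not regret it.
Sentiment:
"""

print(call_gemini(prompt))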

