AI Showdown: Top LLMs Pick Top Prompt Creator
Which one - Mistral Large, Gemini 1.5 Pro, Command R+, Claude 3 Sonnet or GPT-4o?
Which Large Language Model (LLM) is the best prompt creator?
This research explores that question using five of today’s leading LLMs: Mistral Large, Google’s Gemini 1.5 Pro, Cohere’s Command R+, Anthropic’s Claude 3 Sonnet and OpenAI’s GPT-4o.
Methodology
The five LLMs act as both prompt creators and evaluators. Each model was accessed through its chatbot interface:
Mistral Large - LeChat
Gemini 1.5 Pro - Google AI Studio
Command R+ - Cohere Coral
Claude 3 Sonnet - Claude Chat
GPT-4o - ChatGPT
The chatbots will initially generate prompts. These prompts will then be compiled and evaluated by the five LLMs. Each LLM will select the best prompt using a blind test approach, somewhat like a Coke vs. Pepsi blind taste test.
Stage 1: Creation
Each round starts with a new chat session and a baseline prompt, as shown below, which is used for all 50 rounds.
Your task is to enhance the following prompt: “Enhance clarity, coherence, and cohesiveness.”
User will use the enhanced prompt to instruct LLM chatbot improve the clarity, coherence, and cohesiveness of user-provided generic content. The enhanced prompt must be clear and complete.
This baseline prompt instructs a chatbot to create a new prompt. The generated prompt can be used to enhance user-provided text. Each chatbot generates a response in a single turn. This response will then be used in the next stage: evaluation.
Stage 2: Evaluation
The prompts generated in Stage 1 are compiled without including the names of their creators. Each prompt is labeled as "Exhibit" followed by an order number, such as “Exhibit 1”. The order of prompts is randomized in each round.
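The compilation step can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure actually used in the experiment: the function name build_blind_test, the placeholder prompt texts and the seeded random generator are all assumptions made for the example.

```python
import random

def build_blind_test(prompts_by_model, rng):
    """Hide the creators, shuffle the order, and label each prompt as an Exhibit."""
    entries = list(prompts_by_model.items())
    rng.shuffle(entries)  # a fresh random order for every round
    header = f"Challenge! Identify the best prompt (Exhibit 1-{len(entries)}) for a LLM chatbot."
    lines = [header]
    answer_key = {}  # kept private: maps each Exhibit label back to its creator
    for number, (model, prompt) in enumerate(entries, start=1):
        label = f"Exhibit {number}"
        answer_key[label] = model
        lines.extend([f"{label}:", prompt])
    return "\n".join(lines), answer_key

# Placeholder texts stand in for the prompts generated in Stage 1.
generated = {
    "Mistral Large": "placeholder prompt A",
    "Gemini 1.5 Pro": "placeholder prompt B",
    "Command R+": "placeholder prompt C",
    "Claude 3 Sonnet": "placeholder prompt D",
    "GPT-4o": "placeholder prompt E",
}
blind_text, answer_key = build_blind_test(generated, random.Random(2024))
print(blind_text)
```

Only blind_text would be shown to the evaluators. The answer key stays with the experimenter and is used afterwards to map each vote back to the model that wrote the prompt.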
The compiled prompts are then put through a blind test in a new chat session, where each chatbot selects the best prompt. The following is a sample prompt used to conduct a blind test:
Challenge! Identify the best prompt (Exhibit 1-5) for a LLM chatbot.
Exhibit 1:
Please refine the following content to improve its clarity, coherence, and cohesiveness. Ensure that the information is presented in a logical flow, with well-structured sentences and clear connections between ideas. Focus on eliminating any ambiguity or redundancy, and make sure that the overall message is easy to understand and follows a smooth progression from one point to the next.
Exhibit 2:
Your task is to analyze the provided text and make revisions to enhance its clarity, coherence, and cohesiveness. Focus on the following areas:
Clarity:
- Ensure the language is concise, precise, and easy to understand
- Eliminate ambiguity, vagueness, and unnecessary complexity
- Define or rephrase any confusing terms or concepts
Coherence:
- Logically organize and structure the flow of ideas
- Use effective transitions to create logical connections between sentences and paragraphs
- Eliminate redundancies and repetitions that disrupt the flow
Cohesiveness:
- Maintain a consistent tone, style, and voice throughout the text
- Use consistent terminology and phrasing when referring to the same concepts
- Ensure that pronouns and references are clear and unambiguous
Please provide a revised version of the text that addresses these areas, along with a brief explanation of the changes made to improve clarity, coherence, and cohesiveness. If any part of the original text is unclear or requires additional context, feel free to request clarification.
Exhibit 3:
Please review the following text and improve its clarity, coherence, and cohesiveness. This means:
- Clarity: Ensure the text is easy to understand and free of jargon or overly complex language. Explain any technical terms or concepts in simpler terms.
- Coherence: Make sure the ideas flow logically from one to the next. Add transitions, rearrange sentences, or suggest content additions where necessary to create a smooth and understandable progression of thought.
- Cohesiveness: Ensure the text works as a unified whole. Identify and address any inconsistencies, redundancies, or tangents. Make sure the overall message and purpose of the text are clear.
Please provide your improved version of the text, along with a brief explanation of the changes you made and why.
Exhibit 4:
Enhance the clarity, coherence, and cohesiveness of the text to ensure it is concise, easily understandable, and flows smoothly. Improve the text's overall structure and connectivity, making it a cohesive whole. Ensure each sentence contributes to the central message and that ideas are presented logically and clearly. Remove any redundant or vague statements that may confuse readers, and polish the text to perfection.
Exhibit 5:
Dear Assistant, I would like you to meticulously review and refine the following content for me. Your primary focus should be on enhancing its clarity, coherence, and cohesiveness.
Clarity refers to the ease of understanding the content. Ensure that the ideas are presented in a clear, concise, and logical manner. Use simple and familiar words, and avoid jargon unless necessary.
Coherence is about the logical flow of the content. Each sentence, paragraph, and section should be logically connected to the next and to the overall theme. Transition words and phrases can be useful in achieving this.
Cohesiveness, on the other hand, is about the unity and consistency of the content. It should feel like a single, unified whole, rather than a collection of disparate parts. The tone, style, and voice should be consistent throughout.
Please make sure to preserve the original meaning and intent of the content while making these improvements. If there are any parts that you find particularly challenging or unclear, feel free to highlight them or ask me for clarification.
Thank you for your assistance in this task.
Each LLM selects its preferred prompt, and that prompt receives one vote. If a prompt receives 3 or more votes (out of 5) in a round, it is declared the winner of that round. If two prompts receive 2 votes each, both are considered round winners. The votes are totalled over 50 rounds to determine the overall winner.
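The round rule above can be expressed as a small Python function. It is a sketch under stated assumptions: the name round_winners is hypothetical, votes are assumed to have already been mapped from Exhibit labels back to creators, and the article does not say what happens when no prompt reaches 3 votes and there is no 2-2 tie, so that case returns an empty list here.

```python
from collections import Counter

def round_winners(votes):
    """votes: the creator of the prompt each of the five evaluators picked."""
    tally = Counter(votes)
    top_count = max(tally.values())
    leaders = sorted(model for model, count in tally.items() if count == top_count)
    if top_count >= 3:
        return leaders  # only one prompt can hold 3 or more of 5 votes
    if top_count == 2 and len(leaders) == 2:
        return leaders  # 2-2 draw: both prompts win the round
    return []  # assumption: outcome not specified in the article

print(round_winners(["Claude 3 Sonnet"] * 5))
print(round_winners(["Claude 3 Sonnet", "Claude 3 Sonnet", "GPT-4o", "GPT-4o", "Command R+"]))
```

The first call represents a clean sweep, the second a draw in which both Claude 3 Sonnet and GPT-4o are recorded as round winners.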
The 50 rounds of experimentation were conducted over a span of 10 days, from 3 June to 12 June, 2024.
Top Model
Anthropic’s Claude 3 Sonnet is the ultimate ✨ WINNER ✨, having received the most votes, the most round victories and the overall best prompt.
Claude 3 Sonnet’s prompts received the highest number of votes, 111 in total, followed by Gemini 1.5 Pro with 86 votes and GPT-4o with 25 votes.
Claude 3 Sonnet and Gemini 1.5 Pro together garnered more than three-quarters of the 250 total votes: 44% for Claude 3 Sonnet and 34% for Gemini 1.5 Pro. The other three models collectively received the remaining 21% (53 votes).
Prompts by Claude 3 Sonnet won 25 rounds, including 6 clean-sweep victories (receiving all 5 votes). Prompts by Gemini 1.5 Pro won 19 rounds, including 3 clean-sweep victories.
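The percentages follow directly from the vote counts. The short Python check below is a sketch that recomputes them from the figures reported in this article (5 evaluators over 50 rounds, 111 and 86 votes for the top two models); the variable names are illustrative.

```python
TOTAL_VOTES = 50 * 5  # 50 rounds, one vote from each of the five evaluators

reported = {"Claude 3 Sonnet": 111, "Gemini 1.5 Pro": 86}
others = TOTAL_VOTES - sum(reported.values())  # votes left for the other three models

shares = {name: 100 * count / TOTAL_VOTES for name, count in reported.items()}
shares["Other three models"] = 100 * others / TOTAL_VOTES

for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```

Rounded down to whole numbers, these are the 44%, 34% and 21% quoted above.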
There are three possible scenarios in each round:
Scenario 1: A prompt wins with 5 votes.
Scenario 2: A prompt wins with 3 or 4 votes.
Scenario 3: A draw occurs.
Scenario 1: A clean-sweep victory happens when a prompt receives all 5 votes. This occurred in 9 rounds:
Rounds 3, 6, 7, 8, 9, 10 - Claude 3 Sonnet
Rounds 20, 24, 35 - Gemini 1.5 Pro
Scenario 2: In 37 rounds, one model's prompt received 3 or 4 votes and was declared the winner of the round.
Scenario 3: In 4 rounds, two models each received 2 votes, resulting in a draw. In such cases, both models are declared winners of the round:
Round 22 - Command R+ and Claude 3 Sonnet
Round 30 - Mistral Large and Claude 3 Sonnet
Round 31 - Gemini 1.5 Pro and Claude 3 Sonnet
Round 45 - GPT-4o and Claude 3 Sonnet
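As a sanity check, the three scenarios can be added up in Python. The sketch below uses only the round numbers and totals reported in this article; the helper majority_wins is a hypothetical name, and the figures it derives are simple arithmetic on the reported totals rather than separately published results.

```python
clean_sweeps = {
    "Claude 3 Sonnet": [3, 6, 7, 8, 9, 10],
    "Gemini 1.5 Pro": [20, 24, 35],
}
draws = {
    22: ("Command R+", "Claude 3 Sonnet"),
    30: ("Mistral Large", "Claude 3 Sonnet"),
    31: ("Gemini 1.5 Pro", "Claude 3 Sonnet"),
    45: ("GPT-4o", "Claude 3 Sonnet"),
}
majority_rounds = 37  # Scenario 2: a prompt wins with 3 or 4 votes

sweep_rounds = sum(len(rounds) for rounds in clean_sweeps.values())
total_rounds = sweep_rounds + majority_rounds + len(draws)
total_round_wins = total_rounds + len(draws)  # each draw produces two winners

reported_wins = {"Claude 3 Sonnet": 25, "Gemini 1.5 Pro": 19}

def majority_wins(model):
    """Reported round wins minus clean sweeps and shared draws."""
    shared = sum(model in pair for pair in draws.values())
    return reported_wins[model] - len(clean_sweeps.get(model, [])) - shared

print(total_rounds, total_round_wins)
print(majority_wins("Claude 3 Sonnet"), majority_wins("Gemini 1.5 Pro"))
```

The scenarios account for all 50 rounds, and because each draw has two winners there are 54 round wins in total. If the reported totals hold, Claude 3 Sonnet and Gemini 1.5 Pro each took 15 of the 37 Scenario 2 rounds.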
Prompts Analysis
Let's delve deeper into the structure of all generated prompts. Each model demonstrates its distinctive approach to generating prompts.
Using the same baseline prompt across all 50 rounds allowed us to observe the range of responses each model produces. To quantify this variability, we counted unique responses. A response is deemed unique if it appears only once across the 50 rounds, that is, it is never repeated verbatim in another round.
The analysis revealed very different levels of uniqueness among the models. Claude 3 Sonnet produced 39 unique prompts, a high degree of variability. GPT-4o generated 19 unique prompts. Mistral Large and Cohere Command R+ each produced 6 unique prompts, indicating a lower level of uniqueness. Gemini 1.5 Pro exhibited the least variability, with only 2 unique prompts.
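The uniqueness measure is straightforward to compute. The following Python sketch assumes each model's 50 responses are available as a list of strings and compares them verbatim, as in the definition above; the function name count_unique and the toy data are illustrative only.

```python
from collections import Counter

def count_unique(responses):
    """Count responses that occur exactly once (verbatim) across all rounds."""
    occurrences = Counter(responses)
    return sum(1 for times in occurrences.values() if times == 1)

# Toy data: "prompt A" repeats, so only "prompt B" and "prompt C" are unique.
toy_responses = ["prompt A", "prompt B", "prompt A", "prompt C"]
print(count_unique(toy_responses))  # prints 2
```

Because the comparison is verbatim, two responses that differ by a single character both count as unique. A looser measure, for example one that ignores whitespace or case, would give lower counts.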
Uncovering the Best Prompt
Let's now uncover what the best prompt looks like, using the reasoning capabilities of the LLMs. To do this, we subject the nine prompts that received a perfect score of 5 votes to a further blind test.
The nine perfect-score prompts, six from Claude 3 Sonnet and three from Gemini 1.5 Pro, are listed below:
Claude 3 Sonnet - Round 3
Analyze and refine the provided content to significantly improve its clarity, coherence, and cohesiveness. Focus on the following key areas:
1. Clarity:
- Simplify complex language and eliminate jargon
- Define any necessary technical terms
- Use precise, unambiguous wording
- Structure sentences for easy readability
2. Coherence:
- Ensure each paragraph has a clear main idea
- Transition smoothly between sentences and paragraphs
- Maintain consistent tone, style, and voice
- Align content with the overall purpose or argument
3. Cohesiveness:
- Organize ideas logically (e.g., chronological, cause-effect, problem-solution)
- Use appropriate transitional phrases and words
- Maintain thematic unity throughout
- Reinforce key points with examples or evidence
4. Structure:
- Craft a compelling introduction that sets the context
- Develop ideas progressively in the body
- Conclude with a strong, summarizing statement
- Use headings, bullet points, or numbered lists if helpful
Please rewrite the content, making these enhancements while preserving its original intent and key information. Then, briefly explain your changes.
Claude 3 Sonnet - Round 6
Analyze and improve the clarity, coherence, and cohesiveness of the provided text:
1. Clarity:
- Simplify complex sentences and remove ambiguities
- Define any jargon or technical terms
- Use precise language and avoid vague wording
2. Coherence:
- Ensure logical flow between ideas
- Add transitional phrases to link paragraphs and sections
- Group related concepts together
- Remove irrelevant information
3. Cohesiveness:
- Maintain consistent tone and style throughout
- Use repeated key terms or motifs
- Apply parallel structure in lists and comparisons
- Ensure pronoun references are clear
4. Structure:
- Organize content with a clear introduction, body, and conclusion
- Use headings and subheadings to guide the reader
- Balance paragraph lengths
5. Readability:
- Vary sentence structure and length
- Replace passive voice with active voice where appropriate
- Break up dense text with bullet points or numbered lists
Please apply these improvements to make the text more understandable, logically organized, and unified in its presentation.
Claude 3 Sonnet - Round 7
Improve the clarity, coherence, and cohesiveness of the provided text:
1. Clarity: Ensure each sentence is clear and easy to understand. Replace jargon, ambiguous terms, or overly complex language with simpler, more precise alternatives.
2. Coherence: Organize ideas logically, ensuring smooth transitions between sentences and paragraphs. Group related concepts together and present information in a sequence that makes sense.
3. Cohesiveness: Strengthen connections between ideas using appropriate transitional phrases, consistent terminology, and clear references. Maintain a unified theme or argument throughout.
Please rewrite the following text, focusing on these three aspects to make it more comprehensible and well-structured.
Claude 3 Sonnet - Round 8
Analyze and revise the following text to significantly improve its clarity, coherence, and cohesiveness:
1. Clarity: Ensure each sentence is clear and easy to understand. Replace ambiguous terms, jargon, or overly complex language with simpler, more precise wording. Break down complex ideas into shorter, more digestible sentences.
2. Coherence: Organize ideas logically, ensuring smooth transitions between sentences and paragraphs. Group related thoughts together and present them in a sequence that aids comprehension (e.g., chronological order, cause-effect, general-to-specific). Eliminate any contradictory or irrelevant information.
3. Cohesiveness:
a. Use cohesive devices like pronouns (it, they), synonyms, and repetition to link sentences and ideas.
b. Apply transitional words/phrases (however, therefore, additionally) to show relationships between ideas.
c. Maintain consistent verb tenses, point of view, and tone throughout.
4. Structure: Ensure there's a clear introduction, body, and conclusion. The introduction should set the context, the body should develop main points, and the conclusion should summarize key ideas.
5. Suggestions: Offer any additional recommendations to enhance readability, such as adding headings, bullet points, or examples.
Please provide your revised version along with a brief explanation of the changes made.
Claude 3 Sonnet - Round 9
Analyze the provided content carefully and suggest improvements to enhance its clarity, coherence, and cohesiveness. For clarity, identify any ambiguous, vague, or confusing language and rephrase it concisely and precisely. Ensure that the meaning is conveyed in a straightforward and easily understandable manner. For coherence, evaluate the logical flow and organization of ideas. Suggest restructuring or reordering sections if necessary to create a well-structured and logical progression. Also, check for appropriate transitional phrases and sentences that connect thoughts seamlessly. For cohesiveness, examine the content's unity and consistency. Point out any contradictory or disjointed elements, and suggest ways to reconcile them or remove irrelevant information. Additionally, ensure that the tone, style, and vocabulary are consistent throughout the content. Provide specific examples and detailed explanations for your suggestions to improve the content's overall quality and effectiveness.
Claude 3 Sonnet - Round 10
Please review the provided content and make revisions to enhance its clarity, coherence, and cohesiveness. Specifically:
For clarity:
- Use simple, concise language that is easy to understand
- Explain any complex concepts or technical terms
- Ensure the main points and ideas are clearly stated
For coherence:
- Organize the content in a logical, well-structured flow
- Connect ideas smoothly using effective transitions
- Maintain a consistent tone and voice throughout
For cohesiveness:
- Ensure all parts of the content are unified around a central theme
- Eliminate any irrelevant digressions or extraneous information
- Reinforce main ideas by referring back to them occasionally
Please provide your revised version highlighting the improvements made to clarity, coherence, and cohesiveness. If any part of the original content is unclear or needs more context, feel free to ask clarifying questions. The goal is a polished piece that is easy to read and flows naturally.
Gemini 1.5 Pro - Round 20
Please revise the following text to enhance its clarity, coherence, and cohesiveness. This means:
- Clarity: Ensure the text is easy to understand and free of jargon or overly complex language. Explain any technical terms and use clear, concise language.
- Coherence: Make sure the ideas flow logically from one to the next. Use transitions and connecting phrases to improve the flow of ideas and strengthen the relationship between sentences and paragraphs.
- Cohesiveness: Ensure the text works together as a unified whole. Make sure each part contributes to the overall message and that there are no contradictory points or irrelevant tangents.
Here is the text I would like you to revise:
[Insert text here]
Gemini 1.5 Pro - Round 24
Please revise the following text to improve its clarity, coherence, and cohesiveness. Specifically, focus on:
- Clarity: Ensure the text is easy to understand and free of jargon or overly complex language. Explain any technical terms or concepts.
- Coherence: Ensure the ideas flow logically from one to the next. Use transition words and phrases to connect sentences and paragraphs smoothly.
- Cohesiveness: Ensure the text works as a unified whole, with a clear central idea supported by relevant details and examples. Avoid irrelevant tangents or information.
Here is the text I would like you to revise:
[Insert text here]
Gemini 1.5 Pro - Round 35
Please analyze the following text and revise it to improve its clarity, coherence, and cohesiveness. This means ensuring the text is easy to understand, the ideas flow logically, and the overall piece feels unified.
Specifically, please focus on:
- Clarity: Are the sentences clear and concise? Is the language precise and easy to understand?
- Coherence: Do the sentences and paragraphs flow smoothly? Are there clear transitions between ideas? Does the text follow a logical order?
- Cohesiveness: Do the different parts of the text work together to create a unified whole? Is there a clear central idea that ties everything together?
Here is the text I would like you to revise:
[Insert text here]
These nine prompts were put through a blind test, using the following prompt:
Challenge! Identify the best prompt (Exhibit 1-9) for a LLM chatbot. Think through step by step. Explain your selection.
Exhibit 1: [Prompt Round 35]
Exhibit 2: [Prompt Round 24]
Exhibit 3: [Prompt Round 10]
Exhibit 4: [Prompt Round 8]
Exhibit 5: [Prompt Round 3]
Exhibit 6: [Prompt Round 7]
Exhibit 7: [Prompt Round 20]
Exhibit 8: [Prompt Round 9]
Exhibit 9: [Prompt Round 6]
Each of the five models selected a different prompt, so 5 of the 9 prompts received one vote each. No winner!
So, we put these 5 selected prompts through an additional round of blind testing (prompt detailed below) to determine the final winner.
Challenge! Identify the best prompt (Exhibit 1-5) for a LLM chatbot. Think through step by step. Explain your selection.
Exhibit 1: [Prompt Round 10]
Exhibit 2: [Prompt Round 6]
Exhibit 3: [Prompt Round 8]
Exhibit 4: [Prompt Round 9]
Exhibit 5: [Prompt Round 3]
🏆 We have a WINNER! 🏆
The Claude 3 Sonnet prompt from Round 3 (see below) emerged as the best prompt, with votes from four models: Mistral Large, Gemini 1.5 Pro, Cohere Command R+ and Claude 3 Sonnet. GPT-4o's vote went to the Round 6 prompt.
Analyze and refine the provided content to significantly improve its clarity, coherence, and cohesiveness. Focus on the following key areas:
1. Clarity:
- Simplify complex language and eliminate jargon
- Define any necessary technical terms
- Use precise, unambiguous wording
- Structure sentences for easy readability
2. Coherence:
- Ensure each paragraph has a clear main idea
- Transition smoothly between sentences and paragraphs
- Maintain consistent tone, style, and voice
- Align content with the overall purpose or argument
3. Cohesiveness:
- Organize ideas logically (e.g., chronological, cause-effect, problem-solution)
- Use appropriate transitional phrases and words
- Maintain thematic unity throughout
- Reinforce key points with examples or evidence
4. Structure:
- Craft a compelling introduction that sets the context
- Develop ideas progressively in the body
- Conclude with a strong, summarizing statement
- Use headings, bullet points, or numbered lists if helpful
Please rewrite the content, making these enhancements while preserving its original intent and key information. Then, briefly explain your changes.
Conclusion
Claude 3 Sonnet demonstrates superior performance in prompt generation. One capability that distinguishes it from the other four LLMs is its high variability: it generates a greater number of unique responses, with more varied language and expression.
However, this research suggests that variability alone does not determine success in prompt creation. For instance, Gemini 1.5 Pro, despite having the lowest number of unique responses, won 19 rounds. Conversely, GPT-4o produced 27 unique prompts, surpassing Gemini, yet its prompts were successful in only 5 rounds.
These findings indicate that while variability is not the primary determinant of effective prompt creation, it may serve as a complementary factor. Variability could be the additional edge that allows an LLM to outperform its competitors when other crucial factors are present.
Lastly, key takeaways for crafting good prompts for LLM chatbots (derived from analyzing the top nine prompts):
1. Clear Objective and Structure
State the main goal upfront using action-oriented language (e.g., analyze, revise, improve)
Break down the task into specific, logically ordered sub-tasks
Use formatting (e.g., bullet points, numbering) for clarity
2. Detailed Guidance
Provide specific focus areas and examples
Define key terms to reduce ambiguity
3. Output Specifications and Quality Control
Specify desired output format and structure
Request explanations for changes or decisions
Emphasize preserving original intent while improving
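These takeaways can be turned into a reusable template. The Python sketch below is one possible way to assemble a prompt with an upfront objective, numbered focus areas and explicit output requests. The function build_prompt and its parameters are hypothetical, and the example wording is borrowed from the winning Round 3 prompt.

```python
def build_prompt(objective, focus_areas, output_requests):
    """Assemble a prompt: objective first, numbered focus areas, then output requests."""
    lines = [objective]
    for number, (area, points) in enumerate(focus_areas.items(), start=1):
        lines.append(f"{number}. {area}:")
        lines.extend(f"- {point}" for point in points)
    lines.append(" ".join(output_requests))
    return "\n".join(lines)

prompt = build_prompt(
    objective="Analyze and refine the provided content to improve its clarity and coherence.",
    focus_areas={
        "Clarity": ["Use precise, unambiguous wording", "Define any necessary technical terms"],
        "Coherence": ["Transition smoothly between sentences and paragraphs"],
    },
    output_requests=[
        "Please rewrite the content while preserving its original intent.",
        "Then, briefly explain your changes.",
    ],
)
print(prompt)
```

Keeping the objective, focus areas and output requests as separate arguments makes it easy to vary one element at a time when comparing prompt variants.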