The Mars bar and the A-10 gun projectile, revisited
Let's see if our LLM physics professor improves with GPT-4 and ReAct!
The spring of 2023 is the Spring of AI. In late February I predicted that 90% of white-collar careers would be erased within 3 years, and since then the field has progressed at a pace few anticipated. March was an explosion.
6 weeks ago I asked ChatGPT:
What has more energy (Joules)? The calories in a Mars bar, or the kinetic energy of an A-10's 30mm gun's projectile at the muzzle?
This ordinary question left GPT-3.5 a bit confused about calories vs kilocalories (who wouldn’t be). It also got tripped up by decimals and eventually botched the arithmetic, because it didn’t know that it’s incapable of doing arithmetic.
My conclusion, despite this, was that it can
teach chemistry to humans better than 90% of chemistry teachers, 99% of teachers, 99.999% of humans
So now that we know about the ReAct loop and that GPT-4 is so self-aware that it makes GPT-3.5 look like a potted plant, let us revisit the question and see if it will provide us answers!
The Setup
It’s a ReAct game: GPT-4 was given the tool “jsconsole”, to which it can pass JavaScript functions and get the result back. JavaScript, in this case, is the language for any arithmetic that ChatGPT might want to perform.
I execute this code and any “action” request from ChatGPT as the Intermediary.
I also gave it very primitive access to Wikipedia to look up data, if needed.
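I played the intermediary by hand, but the role is mechanical enough to sketch in code. What follows is only an illustration under assumptions: the function name runAction, the lookup table, and both fallback messages are hypothetical, and the "wikipedia" tool is reduced to the canned replies that appear in the transcript below.

```javascript
// Hypothetical sketch of the intermediary role; in the post this was done by hand.
// The table holds only lookups that appear in the transcript.
const cannedWikipedia = {
  "Mars bar calories": "230 kcal",
  "A-10 30mm gun projectile mass":
    "PGU-14/B Armor Piercing Incendiary projectile mass: 14.0 oz",
  "A-10 30mm gun muzzle velocity": "muzzle velocity: 3,324 ft/s (API)",
};

function runAction(line) {
  // Expected shape: "Action: <tool>: <argument>"
  const match = /^Action:\s*(\w+):\s*(.*)$/.exec(line.trim());
  if (!match) return "[not an action]"; // assumed fallback message
  const [, tool, arg] = match;
  switch (tool) {
    case "jsconsole":
      // Evaluate the expression the way a console would.
      // Never do this with input you do not trust.
      return String(new Function(`return (${arg});`)());
    case "wikipedia":
      return cannedWikipedia[arg] ?? "[no canned reply]"; // assumed fallback
    default:
      return `[unknown tool: ${tool}]`;
  }
}

console.log(runAction("Action: jsconsole: 230 * 4184")); // 962320
```

Doing this by hand has one advantage over the sketch: a human gets to read every expression before it is executed.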
The Loop
My input in bold.
You will participate in a ReAct game, where you'll think, act, and observe in a loop until you provide an answer. Stop writing after you issue an action command, so the user can play the intermediary and provide the results. Then, continue based on the provided information.
Here's a summary of the ReAct concept:
Think - Describe your thoughts about the question.
Action - Request an action (e.g., calculate:, wikipedia:) and stop writing.
User (intermediary) - Provides the result of the action.
Observation - Use the provided result to continue the loop.
Keep looping until you reach an answer, then provide it as your final output:
Answer: your answer.
Remember: You never type "Intermediary:" and always stop after issuing an action.
Your available actions:
wikipedia: request a search on wikipedia for a constant.
jsconsole: it takes a javascript function (e.g. a calculation), then it will execute it and give you back the result
tbudapest: returns the current temperature in Budapest in Kelvins
Think: The user hasn't provided a specific question to answer. I should ask for a question before proceeding with the ReAct game.
Action: Please provide a question for me to answer using the ReAct game.
What has more energy (Joules)? The calories in a Mars bar, or the kinetic energy of an A-10's 30mm gun's projectile at the muzzle?
Think: I need to find the number of calories in a Mars bar and convert it to Joules. Then, I need to calculate the kinetic energy of an A-10's 30mm gun projectile at the muzzle using its mass and velocity.
Action: wikipedia: Mars bar calories
Intermediary: 230 kcal
Think: I now know a Mars bar has 230 kcal. I need to convert kcal to Joules. The conversion factor is 1 kcal = 4184 Joules.
Action: jsconsole: 230 * 4184
I expected a JavaScript function. But, as a true intermediary, I tossed the input into the console, expecting some error, and as it turns out:
it just works. Oh well.
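The reason it just works, as far as I can tell, is that a bare arithmetic expression is already valid JavaScript, and a console echoes the value of whatever expression it is given. A small sketch, using eval as a stand-in for the console (that substitution is my assumption):

```javascript
// A bare expression evaluates to its value directly.
const bare = eval("230 * 4184");

// The function form the prompt asked for gives the same value once it is called.
const wrapped = eval("(() => 230 * 4184)()");

console.log(bare, wrapped, bare === wrapped); // 962320 962320 true
```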
Intermediary: 962320
Think: I have now converted the calories in a Mars bar to Joules, which is 962,320 Joules. Next, I need to find the mass and velocity of an A-10's 30mm gun projectile at the muzzle to calculate its kinetic energy. The kinetic energy formula is KE = 0.5 * m * v^2.
Action: wikipedia: A-10 30mm gun projectile mass and velocity
Intermediary: wikipedia: [too many parameters]
Think: I provided too many parameters in my previous request. I'll split the request into two separate searches to find the mass and velocity of an A-10's 30mm gun projectile.
It’s like one of those early Boston Dynamics videos where they kick out the robot’s leg and it stays upright.
Action: wikipedia: A-10 30mm gun projectile mass
Intermediary: wikipedia: PGU-14/B Armor Piercing Incendiary projectile mass: 14.0 oz
Think: I found the mass of the A-10's 30mm gun projectile, which is 14.0 oz. I need to convert the mass from ounces to kilograms, as it's the standard unit for mass in physics calculations. The conversion factor is 1 oz = 0.0283495 kg.
Action: jsconsole: 14 * 0.0283495
Intermediary: 0.396893
Think: I have now converted the mass of the A-10's 30mm gun projectile to kilograms, which is 0.396893 kg. Next, I need to find the muzzle velocity of the projectile.
Action: wikipedia: A-10 30mm gun muzzle velocity
Intermediary: wikipedia: muzzle velocity: 3,324 ft/s (API)
Think: I found the muzzle velocity of the A-10's 30mm gun projectile, which is 3,324 ft/s. I need to convert the velocity from feet per second to meters per second, as it's the standard unit for velocity in physics calculations. The conversion factor is 1 ft/s = 0.3048 m/s.
Action: jsconsole: 3324 * 0.3048
Intermediary: 1013.1552
Think: I have now converted the muzzle velocity of the A-10's 30mm gun projectile to meters per second, which is 1013.1552 m/s. I can now calculate the kinetic energy of the projectile using the formula KE = 0.5 * m * v^2.
Action: jsconsole: 0.5 * 0.396893 * (1013.1552 ** 2)
Intermediary: 203702.0498034056
Think: I have calculated the kinetic energy of the A-10's 30mm gun projectile at the muzzle, which is approximately 203,702 Joules. Now, I will compare this value to the energy in a Mars bar (962,320 Joules) to determine which has more energy.
Answer: The calories in a Mars bar, with 962,320 Joules, have more energy than the kinetic energy of an A-10's 30mm gun projectile at the muzzle, which has approximately 203,702 Joules.
✅🏆
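For anyone who wants to check the numbers, the whole chain fits in a few lines. This is a sketch that reuses only the inputs and conversion factors from the transcript; the variable names are mine.

```javascript
// Conversion factors exactly as GPT-4 stated them in the loop.
const JOULES_PER_KCAL = 4184;
const KG_PER_OZ = 0.0283495;
const M_PER_FT = 0.3048;

// Inputs as returned by the intermediary.
const marsBarJoules = 230 * JOULES_PER_KCAL; // 230 kcal
const massKg = 14.0 * KG_PER_OZ;             // 14.0 oz projectile
const velocityMs = 3324 * M_PER_FT;          // 3,324 ft/s at the muzzle

// KE = 0.5 * m * v^2
const kineticJoules = 0.5 * massKg * velocityMs ** 2;
const ratio = marsBarJoules / kineticJoules;

console.log(marsBarJoules);             // 962320
console.log(Math.round(kineticJoules)); // 203702
console.log(ratio.toFixed(1));          // 4.7
```

So the Mars bar carries roughly 4.7 times the energy of the projectile at the muzzle.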
Compared to GPT-3.5, GPT-4 with the ReAct loop is near perfect. The only reason I don’t give it a perfect score is one run where it still tripped up on kcal (who wouldn’t), assuming “1 calorie is equal to 4.184 Joules”. That factor is right for the small calorie, but the 230 on the label is in kilocalories, so the result comes out 1,000 times too low. It could have been due to multilingual sources where the dot and the comma swap roles as decimal and thousands separators.
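A factor of 1,000 is enough to flip the comparison. Here is a quick sketch using the transcript's figures; I did not record the numbers from the failed run, so this shows only the arithmetic of the slip, not what GPT-4 actually printed.

```javascript
const kcal = 230;

const correctJoules = kcal * 4184;  // food calories are kilocalories
const slippedJoules = kcal * 4.184; // treats 230 kcal as 230 small calories

const projectileJoules = 203702;    // rounded kinetic energy from the transcript

console.log(correctJoules > projectileJoules); // true: the Mars bar wins
console.log(slippedJoules > projectileJoules); // false: the projectile wins
```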
Final score: 95%.
Take AI seriously. We have all the loops.
Cover: I don’t remember the prompt (Microsoft Designer / DALL-E)