#1 I Started Using AI.

AI is all the rage right now. To start, I subscribed to ChatGPT, Claude Code, and Gemini at around $20 a month. I'm not really in it for the conversation, so I tried to see how far I could get building sites and systems, and my first impression was... wow, it actually works! This Black Rabbit site was also built with AI. What do you think? It looks the part, doesn't it?

Anyway, here is my summary of how AI works.

[Brief Overview: Important]

Today's AI is built on LLMs (Large Language Models). The learning process boils down to repeating one simple task: probabilistically predicting the next word, over a massive amount of text. The design of the Transformer architecture and the training methods (pre-training, fine-tuning, RLHF, etc.) are themselves understood and controlled, but the larger the models grow, the more of a black box they become. So while humans understand the "mechanism," they cannot precisely trace all the detailed causal relationships behind "why it produced that output."
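To make "probabilistically predicting the next word" a bit more concrete, here is a toy sketch. It is nowhere near a real LLM: it only counts which word follows which in a tiny made-up sentence and samples from those counts. The function names and the corpus are my own, purely for illustration.

```python
import random
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def next_word_probabilities(counts, word):
    """Turn the raw follower counts for `word` into probabilities."""
    followers = counts.get(word, Counter())
    total = sum(followers.values())
    return {w: n / total for w, n in followers.items()}


def generate(counts, start, length, seed=0):
    """Sample one word at a time, each based only on the previous word."""
    rng = random.Random(seed)
    output = [start]
    for _ in range(length):
        probs = next_word_probabilities(counts, output[-1])
        if not probs:  # this word was never seen with a follower
            break
        words, weights = zip(*probs.items())
        output.append(rng.choices(words, weights=weights)[0])
    return " ".join(output)


corpus = "the cat sat on the mat the cat ate the fish"
model = train_bigrams(corpus)
print(next_word_probabilities(model, "the"))
# {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
print(generate(model, "the", 5))
```

The point is that nothing in here "knows" what a cat is. It only follows tendencies in the data, and a real model does something similar at a vastly larger scale and with far more context.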

What this means is that AI answers are not built up logically; they are produced from tendencies in massive data (a highly sophisticated form of pretending to know). That is why techniques exist for getting more accurate answers through the prompt (the instructions) you give the AI.

To be honest, for visual or directly perceivable output like images, design, video, or audio, that is fine, but in areas like programming, what happens under the hood gets quite suspect. Yes, it looks like it works on the surface. But since it is built like a facade, my impression is that getting it to write optimal code with a sophisticated architecture would be difficult.

So, what are the specific techniques?

[Specific? Techniques]

- Including intermediate reasoning steps in the prompt improves the LLM's reasoning. For example, simply adding "break the processing down into steps, explain them, and then execute" to the prompt makes the LLM generate its own reasoning process, and in one reported case accuracy rose by about 30 percentage points, from 10% to 40%.

- LLMs seem to have more trouble using information in the middle of the input than at the beginning or the end. To quote: "In a past QA experiment, GPT-3.5-Turbo achieved 75.8% accuracy at the beginning and 63.2% at the end, while dropping to 53.8% in the middle. This is even lower than the 56.1% of closed-book (no document) performance." So it is better for the human to split instructions up as much as possible.

- When given permissions, they tend to deploy to production on their own or manipulate the DB directly to fulfill an instruction, failing to respect even prohibitions that are common sense to any developer. So you need to spell out clear prohibitions. (Even that is not a 100% guarantee.)

- The best way is to physically? limit the AI and use it within a restricted scope. Specifically, do not give the AI database credentials or permissions, so that it cannot manipulate the DB directly in the first place.

- Do not run services on-premise; use external, black-boxed services as much as possible, to limit what the AI can touch and narrow the impact of any trouble.

- Every AI lets you write down preconditions and pass them in first. It is best to write your development rules there in as much detail as possible. Since the AI sometimes does not read them automatically, explicitly telling it to load them so that it recognizes the rules is very important.
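Several of the points above can be combined in how the prompt text itself is put together. The following is only a sketch under my own assumptions: `build_prompt`, the section labels, and the sample rules are hypothetical, and this is plain string assembly rather than any vendor's API. It puts the rules first, asks for steps before execution, repeats a reminder at the end, and sends one small task per prompt.

```python
def build_prompt(rules, task, references=()):
    """Assemble one prompt with the important parts at the start and the end."""
    rule_lines = "\n".join(f"- {rule}" for rule in rules)
    sections = [
        # Rules and prohibitions come first and are stated explicitly.
        "RULES (read these before doing anything):\n" + rule_lines,
    ]
    # Reference material goes in the middle, where it matters least if overlooked.
    for number, text in enumerate(references, start=1):
        sections.append(f"REFERENCE {number}:\n{text}")
    # The task asks for intermediate steps before execution.
    sections.append(
        "TASK:\n" + task + "\n"
        "Break the processing down into steps, explain them, and then execute."
    )
    # A pointer back to the rules is repeated at the very end of the input.
    sections.append("REMINDER: follow every rule listed at the top.")
    return "\n\n".join(sections)


rules = [
    "Never deploy to production.",
    "Never run SQL against the database directly.",
]
# One small task per prompt instead of one long list of instructions.
tasks = ["Add input validation to the signup form.", "Write tests for the validation."]
prompts = [build_prompt(rules, task) for task in tasks]
print(prompts[0])
```

Keep in mind that written prohibitions are only a request. The real protection is still not handing over credentials or permissions in the first place.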

Also, since few people are likely to be on a business plan, here are some precautions.

[Precautions for Prompt Input]

If you subscribe for personal use rather than business use to save money, your prompts may be used for training and may be seen by third parties, depending on the provider and your settings (business plans generally keep your data out of training, so they are safer). So you must never put personal information, confidential information, account names, or passwords into a prompt.
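One way to lower the risk is to scrub text before it ever reaches the prompt. Here is a minimal sketch, assuming only two kinds of sensitive data: email addresses and `key=value` style secrets. The patterns and the `redact` function are my own illustration and will miss plenty of real cases, so treat this as a safety net, not a guarantee.

```python
import re

# Hypothetical patterns; real secrets come in many more shapes than these.
PATTERNS = [
    # Email addresses such as bob@example.com
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[EMAIL]"),
    # Things like "password: hunter2" or "API_KEY=abc123"
    (
        re.compile(r"\b(password|passwd|pwd|token|api[_-]?key)\s*[:=]\s*\S+", re.IGNORECASE),
        r"\1=[REDACTED]",
    ),
]


def redact(text):
    """Replace anything matching a known pattern before the text is sent."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text


raw = "login as bob@example.com with password: hunter2"
print(redact(raw))
# login as [EMAIL] with password=[REDACTED]
```

Names, addresses, and internal hostnames will sail straight through a filter like this, so reading the prompt yourself before sending it is still necessary.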

In practice, when using AI you end up typing the same things again and again. As mentioned above, it may pretend to listen without actually doing so and repeat the same mistakes, so you will paste similar boilerplate text over and over. Hence the next point.

[Reducing Repeated Prompts, and Initial Settings]

Instructions to AI often repeat similar content or partial corrections, and typing the same text on the keyboard every time is wasteful. So it is best to make use of initial definition files, settings, and tools like the ones mentioned above.
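Even without any special tool, a handful of reusable templates removes most of the retyping. Below is a small sketch using Python's standard `string.Template`. The snippet names, the wording, and the field names are all hypothetical examples of mine.

```python
from string import Template

# Hypothetical boilerplate; keep whatever you find yourself typing repeatedly.
SNIPPETS = {
    "fix_bug": Template(
        "Fix the bug described below in $file. Do not touch other files.\n$details"
    ),
    "add_test": Template(
        "Add unit tests for $target in $file. Do not change production code."
    ),
}


def render(name, **fields):
    """Fill in a stored snippet; a missing field raises KeyError instead of
    silently producing a half-filled prompt."""
    return SNIPPETS[name].substitute(**fields)


print(render("add_test", target="parse()", file="parser.py"))
# Add unit tests for parse() in parser.py. Do not change production code.
```

I chose `substitute` over `safe_substitute` on purpose: a forgotten field fails loudly rather than sending the AI a prompt with `$file` still in it.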

Anyway, I am now ready to use AI. Using it to write code feels like having a new programmer with a superb memory who pretends to know everything and never gets tired. So it is best used as an assistant, and I feel the general public's image of it doing whatever you command is a bit off.

Also, AI is a "clever" entity that will do anything to produce a result, even if it is troublesome or wrong, so you really need to be careful.