
Using AI Services & Tools

Anh-Thi Dinh
API & Services
Tools
Vibe Coding
These notes are for using AI services and tools. For working with APIs, please refer to this note. For tools and notes on Vibe Coding, refer to this note.

Other tools

  • CLIProxyAPI: Wraps Gemini CLI, Antigravity, ChatGPT Codex, Claude Code, Qwen Code, and iFlow as an OpenAI/Gemini/Claude/Codex-compatible API service, letting you use the free Gemini 2.5 Pro, GPT-5, Claude, and Qwen models through an API.

Using AI services

This is based on my personal experience with the current versions. The list may change significantly in the future.
☝
TL;DR: Coding with Claude Code. Everything else with Gemini. That's it!
  • Summarize YouTube videos: Notebook LM or ask Grok with a URL.
  • Get updated news (requires up-to-date information): Gemini, then ChatGPT, then Grok.
  • Check references/sources: Perplexity, then Gemini / ChatGPT.
  • Translation: Gemini, ChatGPT, or Grok.
  • Coding assistance: Claude, then Gemini, then Grok, then ChatGPT.
  • Record and summarize live meetings: ChatGPT Pro.
  • All-in-one chatbot models: Monica (affordable option).
  • Work with personal files/sources: Use Project or Spaces features in AI services and upload your resources.
  • Summarize Vietnamese books: ChatGPT, then Notion AI.
  • Voice Mode (for English speaking practice): Gemini, ChatGPT (has memory), Grok (for creative conversations).
  • AI IDE: (Updated: anything with the Claude Code extension) Cursor, then VSCode with GitHub Copilot. Both use Claude models.
  • Image editing/generation: Grok Imagine (⭐), Gemini Banana.
    • Generate a new photo based on a current photo: Gemini Banana (with a good prompt).
    • Replace clothes: Grok Imagine or Gemini Banana.
  • Video generation (photo to video): Grok Imagine.

Local AI

Using Ollama

  • Check the model list on the home page.
  • To download a model (this cannot be done from the desktop app): ollama pull <model_name>
    • Models I use: qwen3:8b (for tasks that need a quick response), qwen3-coder:30b (to use with Claude Code on my Mac M4), gpt-oss:20b (daily chat).
  • To run models via API endpoints, check this official doc.
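Ollama's server listens on http://localhost:11434 by default, and its /api/generate endpoint streams newline-delimited JSON: one object per token carrying a `response` fragment, with a final object where `done` is true. A minimal Python sketch of reassembling such a stream; the sample lines below are illustrative, not real model output:

```python
import json

# Illustrative NDJSON lines in the shape Ollama's /api/generate streams
# (each line is one JSON object; the last one has "done": true).
sample_stream = [
    '{"model":"qwen3:8b","response":"Hello","done":false}',
    '{"model":"qwen3:8b","response":", world","done":false}',
    '{"model":"qwen3:8b","response":"!","done":true}',
]

def assemble(lines):
    """Concatenate the `response` fragments from a streamed reply."""
    text = []
    for line in lines:
        chunk = json.loads(line)
        text.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(text)

print(assemble(sample_stream))  # Hello, world!
```

In a real client you would read these lines from the HTTP response body as they arrive, or pass `"stream": false` in the request to get a single JSON object instead.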

Setting up LM Studio

⚠️
If you want to use Claude Code with local AIs or enable web access (more easily), use Ollama instead.
  • Download LM Studio (it's better on macOS than Ollama).
  • You need to enable the server before using the endpoints (⚠️ make sure to enable CORS).
  • There is no need to load the model before sending requests to the endpoints; it is loaded automatically.
  • Example with curl (you can also use openai/gpt-oss-20b as the model; note that the JSON body cannot contain comments):

    curl http://127.0.0.1:1234/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
        "model": "google/gemma-3-27b",
        "messages": [
          { "role": "system", "content": "Always answer in rhymes." },
          { "role": "user", "content": "Introduce yourself." }
        ],
        "temperature": 0.7,
        "max_tokens": -1,
        "stream": true
      }'
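Because the request above sets "stream": true, the reply arrives as OpenAI-style server-sent events: lines prefixed with `data: `, each carrying a JSON chunk whose `choices[0].delta` holds a text fragment, terminated by `data: [DONE]`. A Python sketch of reassembling the streamed text (the sample events below are illustrative, not a real server reply):

```python
import json

# Illustrative SSE lines in the OpenAI-compatible streaming format
# that LM Studio's /v1/chat/completions returns when "stream": true.
sample_events = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    'data: {"choices":[{"delta":{"content":"Roses are red"}}]}',
    'data: {"choices":[{"delta":{"content":", violets are blue."}}]}',
    'data: [DONE]',
]

def assemble(events):
    """Join the `delta.content` fragments from a chat-completions stream."""
    parts = []
    for event in events:
        if not event.startswith("data: "):
            continue  # skip keep-alives and blank separator lines
        payload = event[len("data: "):]
        if payload == "[DONE]":
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)

print(assemble(sample_events))  # Roses are red, violets are blue.
```

Set "stream": false in the request instead if you just want one complete JSON response with `choices[0].message.content`.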

Using Local Models with IDEs

  • Download LM Studio and download the coder models. Alternatively, you can use Ollama.
  • In your IDE (VSCode or Cursor), install the Continue extension.
  • In LM Studio, navigate to the Developer tab, select your downloaded model → Settings → enable "Serve on Local Network" → enable the server.
  • In your IDE, select the "Continue" tab on the left sidebar → choose "Or, configure your own model" → "Click here to view more providers" (or select the Ollama icon tab if you're using Ollama) → in the provider list, select LM Studio → set Model to "Autodetect" → Connect → a config file will open at ~/.continue/config.yaml; keep the default settings and save.
  • That's it!
  • As another option, you can use Granite.code (from IBM).
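For reference, the file Continue generates looks roughly like the sketch below. This is an assumption based on Continue's config.yaml format; field names can differ between versions, so trust whatever the extension writes for you over this sketch:

```yaml
# Hypothetical sketch of ~/.continue/config.yaml after connecting LM Studio.
name: Local Assistant
version: 0.0.1
models:
  - name: LM Studio
    provider: lmstudio
    model: AUTODETECT  # Continue uses whichever model the local server serves
```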

No need to remember commands using AI

I'm using Claude Code; if you use another coding CLI service, modify the code accordingly. Insert the code below into .bashrc or .zshrc, then run source ~/.zshrc:
claude_execute() {
  emulate -L zsh
  setopt NO_GLOB
  local query="$*"
  local prompt="You are a command line expert. The user wants to run a command but they don't know how. Here is what they asked: ${query}. Return ONLY the exact shell command needed. Do not prepend with an explanation, no markdown, no code blocks - just return the raw command you think will solve their query."
  local cmd
  # use Claude Code to turn the natural-language query into a shell command
  cmd=$(claude --dangerously-skip-permissions --disallowedTools "Bash(*)" --model default -p "$prompt" --output-format text | tr -d '\000-\037' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
  if [[ -z "$cmd" ]]; then
    echo "claude_execute: No command found"
    return 1
  fi
  echo -e "$ \033[0;36m$cmd\033[0m"
  eval "$cmd"
}
alias ask="noglob claude_execute"
# Usage
ask "List all conda env in this computer"