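OpenClaw in the ironclaw Docker environment: update, startup, and CORS troubleshooting guide

A few people have recently reported the same thing: OpenClaw is installed inside the container, but the console keeps showing it as offline, or it fails outright with origin not allowed.
These problems usually come down to three causes: an incomplete binary update, a startup script whose command path no longer resolves, or a CORS configuration that does not allow the current origin.

This post gives you the troubleshooting sequence I have verified myself. Work through it in order and in most cases the gateway comes back online.

Shortest fix path (try this first)

Force-reinstall OpenClaw (do not rely on openclaw update alone).
Run pnpm approve-builds -g to approve the native module builds.
Change the startup command in entrypoint.sh to an absolute path.
Set gateway.controlUi.allowedOrigins in openclaw.json.
Restart the container, then read the live logs to pin down the final error.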

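1. Force-update the binary (fixes "command missing / startup failure")

If things break after running openclaw update, reinstall from scratch:

# 1) Remove leftovers from the previous install
rm -rf /home/agent/.npm-global/lib/node_modules/openclaw

# 2) Reinstall the latest version
pnpm add -g openclaw@latest

# 3) Approve and build dependencies (the key step)
pnpm approve-builds -g

In the approve-builds interactive prompt, press a to select all, press Enter, then type y to confirm.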

You can also confirm the actual path of the executable:

which openclaw
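
Because which only reports the first match on PATH, a slightly stricter check can be useful right after a reinstall. The following is a minimal sketch, assuming bash; resolve_cli is a hypothetical helper name used for illustration and is not part of OpenClaw or pnpm.

```shell
#!/usr/bin/env bash
# resolve_cli NAME
# Print the on-disk path of NAME if it is found on PATH and is executable.
# Hypothetical helper for illustration only.
resolve_cli() {
  local name="$1" path
  # type -P (bash) searches PATH for a file and ignores aliases, functions, builtins
  path="$(type -P "$name")" || return 1
  [ -x "$path" ] || return 1
  printf '%s\n' "$path"
}

# Report clearly when the command cannot be found
resolve_cli openclaw || echo "openclaw is not on PATH for this user" >&2
```

Run it as the same user that the entrypoint uses (agent in this guide), since PATH can differ between root and agent.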

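2. Fix the entrypoint startup script (fixes "container is up but the gateway is offline")

Container platforms usually launch the process through entrypoint.sh. If you switched from npm to pnpm, the default command path can easily stop resolving.

A common form that breaks:

runuser -p -u agent -- "$@"

Launch with an absolute path instead, so the command does not depend on PATH:

runuser -p -u agent -- /home/agent/.local/share/pnpm/openclaw gateway run --bind lan --port 18789

If the path differs in your environment, replace it with the real path printed by which openclaw.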
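
If you would rather not hard-code a single path, one option is to let the script pick the first candidate that actually exists. This is only a sketch under assumptions: pick_first_executable is a hypothetical helper, and the second candidate path (/home/agent/.npm-global/bin/openclaw) is my guess at where an npm global install would put the binary, based on the directory cleaned up in section 1.

```shell
#!/usr/bin/env bash
# pick_first_executable PATH...
# Print the first argument that is an existing, executable regular file.
# Hypothetical helper for illustration only.
pick_first_executable() {
  local candidate
  for candidate in "$@"; do
    if [ -f "$candidate" ] && [ -x "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# First candidate: the pnpm path used in this guide.
# Second candidate: assumed npm-global location (adjust to your image).
OPENCLAW_BIN="$(pick_first_executable \
  /home/agent/.local/share/pnpm/openclaw \
  /home/agent/.npm-global/bin/openclaw)" || OPENCLAW_BIN=""

if [ -n "$OPENCLAW_BIN" ]; then
  echo "Would start: $OPENCLAW_BIN gateway run --bind lan --port 18789"
else
  echo "openclaw binary not found in any candidate path" >&2
fi
```

In a real entrypoint.sh the echo in the success branch would be replaced by the runuser line shown above, with "$OPENCLAW_BIN" in place of the literal path.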

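3. Fix CORS: `origin not allowed`

This error almost always means the console's origin is not on the allowed list.

Config file:

vi /home/agent/.openclaw/openclaw.json

Add controlUi.allowedOrigins under the gateway key:

"gateway": {
  "controlUi": {
    "allowedOrigins": ["*"]
  }
}

"*" is only advisable for personal test environments. In production, use an explicit allowlist of origins, for example:

"allowedOrigins": ["https://your-console-domain.example"]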

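If you prefer not to hand-edit the file, the same change can be scripted with jq, which the entrypoint script in the appendix already uses to read this config. The sketch below is illustrative only: set_allowed_origins is a hypothetical helper, and it assumes openclaw.json is strict JSON (no comments or trailing commas) and that jq 1.6 or later is installed.

```shell
#!/usr/bin/env bash
# set_allowed_origins CONFIG ORIGIN...
# Replace gateway.controlUi.allowedOrigins with the given origins, creating the
# intermediate keys if they are missing and leaving all other keys untouched.
# Hypothetical helper for illustration only.
set_allowed_origins() {
  local cfg="$1"
  shift
  local tmp="${cfg}.tmp"
  # $ARGS.positional is the array of strings passed after --args
  jq '.gateway.controlUi.allowedOrigins = $ARGS.positional' "$cfg" --args "$@" > "$tmp" \
    && mv "$tmp" "$cfg"
}

# Example call (commented out so the sketch has no side effects):
# set_allowed_origins /home/agent/.openclaw/openclaw.json "https://your-console-domain.example"
```

The rewritten file is owned by whoever runs the command, so either run it as agent or repeat the chown from section 5 afterwards. Restart the gateway so the new list is picked up.
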
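4. Quick reference for common failures

Symptom	What to check	Action
Web page shows "offline"	Gateway not started, dependencies not built	Check the logs + run `pnpm approve-builds -g`
`origin not allowed`	CORS origin not allowed	Set `gateway.controlUi.allowedOrigins`
`openclaw: command not found`	PATH or install path changed	`which openclaw` + switch to an absolute path
Exits abnormally after startup	Lock file or permission problem	Remove the lock + fix directory ownership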

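5. Extra: permissions and restart advice

First make sure the script is executable:

chmod 755 /app/entrypoint.sh

Fix ownership of the data directories:

chown -R agent:agent /home/agent/.openclaw /home/agent/openclaw

Do not reach for pkill -9 by habit. Prefer a graceful stop; pkill sends SIGTERM by default:

pkill -f openclaw-gateway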
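
When a process ignores SIGTERM you still need a fallback, but it should be the last resort rather than the first move. Here is a minimal sketch of "SIGTERM first, SIGKILL only after a timeout". stop_gracefully is a hypothetical helper that works on a PID, and the 10-second default is an arbitrary choice of mine, not an OpenClaw setting.

```shell
#!/usr/bin/env bash
# stop_gracefully PID [TIMEOUT_SECONDS]
# Send SIGTERM, wait up to TIMEOUT_SECONDS for the process to exit, and fall
# back to SIGKILL only if it is still alive after that.
# Returns 0 on a clean stop, 1 if SIGKILL was needed.
# Hypothetical helper for illustration only.
stop_gracefully() {
  local pid="$1" timeout="${2:-10}" waited=0
  # If the signal cannot be delivered, the process is already gone (or not ours)
  kill -TERM "$pid" 2>/dev/null || return 0
  # kill -0 delivers no signal; it only checks that the PID still exists
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      kill -KILL "$pid" 2>/dev/null || true
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

To use it against the gateway you would first look up the PID, for example with pgrep -f openclaw-gateway, and pass each PID to stop_gracefully.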

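6. If it is still offline, do a final diagnosis

Restart the container from the console (Restart).
Start the gateway manually in the foreground and watch the live errors:

/home/agent/.local/share/pnpm/openclaw gateway run --bind lan --port 18789

Watch for these key errors: EADDRINUSE (port in use), EACCES (insufficient permissions), origin not allowed (origin not on the allowed list).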
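
If the foreground output is noisy, a small filter can map those three markers to the matching fix from this guide. This is a rough sketch: classify_gateway_error is a hypothetical helper, it matches only the three strings named above, and the hint texts are mine rather than anything OpenClaw prints.

```shell
#!/usr/bin/env bash
# classify_gateway_error
# Read log lines on stdin and print one hint for each line that contains one of
# the three error markers discussed in this guide. Other lines are ignored.
# Hypothetical helper for illustration only.
classify_gateway_error() {
  local line
  while IFS= read -r line; do
    case "$line" in
      *EADDRINUSE*)
        echo "port in use: stop the old gateway process or free port 18789" ;;
      *EACCES*)
        echo "permission denied: fix ownership of the data directories" ;;
      *"origin not allowed"*)
        echo "origin rejected: update gateway.controlUi.allowedOrigins" ;;
    esac
  done
}
```

One way to use it is to save the foreground output to a file and run classify_gateway_error < gateway.log, where gateway.log is whatever file name you chose.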

Appendix: full entrypoint.sh script

#!/bin/bash
set -eu -o pipefail
# dajuan
# Security: Prevent accidental exposure of sensitive environment variables
# Never log, echo, or print the values of these variables:
# - NEARAI_API_KEY
# - OPENCLAW_GATEWAY_TOKEN
# - SSH_PUBKEY
#
# Only log variable names in error messages, never their values.
#
# WARNING: Do not enable debug mode (set -x) as it will expose all variable values
# in the shell output. If debugging is needed, use explicit echo statements
# that only print variable names, not values.

# Ensure volume mount points are writable by agent (Docker often creates volumes as root)
mkdir -p /home/agent/.openclaw /home/agent/openclaw
chown -R agent:agent /home/agent/.openclaw /home/agent/openclaw

# ============================================
# SSH Server Configuration (runs as agent user on port 2222)
# ============================================
setup_ssh() {
  echo "Setting up SSH server..."

  # Configure authorized_keys from SSH_PUBKEY environment variable
  if [ -n "${SSH_PUBKEY:-}" ]; then
    echo "Configuring SSH authorized_keys..."
    mkdir -p /home/agent/.ssh
    echo "${SSH_PUBKEY}" > /home/agent/.ssh/authorized_keys
    if [ -n "${BASTION_SSH_PUBKEY:-}" ]; then
      echo "${BASTION_SSH_PUBKEY}" >> /home/agent/.ssh/authorized_keys
    fi
    # Ensure correct permissions for StrictModes (home directory must not be world-writable)
    chmod 755 /home/agent
    chmod 700 /home/agent/.ssh
    chmod 600 /home/agent/.ssh/authorized_keys
    chown -R agent:agent /home/agent/.ssh
    echo "SSH authorized_keys configured successfully"

    # Create privilege separation directory required by sshd
    mkdir -p /run/sshd
    chmod 0755 /run/sshd

    # Unlock agent account to allow SSH key-based login (account may be locked by default)
    passwd -d agent 2>/dev/null || usermod -U agent 2>/dev/null || true

    # Start SSH daemon on port 2222 (non-privileged); listen on all interfaces for external access
    # sshd forks/daemonizes, so the child process keeps running after entrypoint enters the restart loop
    echo "Starting SSH daemon on port 2222..."
    SSHD_OUTPUT=$(/usr/sbin/sshd -f /dev/null \
      -o Port=2222 \
      -o ListenAddress=0.0.0.0 \
      -o HostKey=/home/agent/ssh/ssh_host_ed25519_key \
      -o AuthorizedKeysFile=/home/agent/.ssh/authorized_keys \
      -o PasswordAuthentication=no \
      -o PermitRootLogin=no \
      -o PidFile=/home/agent/ssh/sshd.pid \
      -o StrictModes=yes \
      -o UsePAM=yes \
      -o AcceptEnv="LANG LC_*" \
      -o PrintMotd=no 2>&1) && SSHD_RC=0 || SSHD_RC=$?
    if [ "$SSHD_RC" -eq 0 ]; then
      echo "SSH daemon started on port 2222"
    else
      echo "Warning: Failed to start SSH daemon (exit code: $SSHD_RC)" >&2
      echo "SSHD output: $SSHD_OUTPUT" >&2
      echo "SSH access will not be available" >&2
    fi
  else
    echo "Warning: SSH_PUBKEY not set - SSH access will not be available" >&2
  fi
  chown -R agent:agent /home/agent/.ssh 2>/dev/null || true
}

setup_ssh

# ============================================
# OpenClaw Configuration
# ============================================

# Validate required environment variables
if [ -z "${NEARAI_API_KEY:-}" ]; then
  echo "Warning: NEARAI_API_KEY environment variable is not provided. Using placeholder 'nearai-api-key'." >&2
  echo "Warning: The service may not function correctly without a valid API key." >&2
  NEARAI_API_KEY=nearai-api-key
  export NEARAI_API_KEY
  # echo "Error: NEARAI_API_KEY environment variable is required" >&2
  # exit 1
fi

# Auto-generate gateway auth token if not configured (export so envsubst sees it)
if [ -z "${OPENCLAW_GATEWAY_TOKEN:-}" ]; then
  OPENCLAW_GATEWAY_TOKEN=$(openssl rand -hex 32)
  export OPENCLAW_GATEWAY_TOKEN
fi

# Default NEAR AI Cloud API base URL (export so envsubst sees it)
if [ -z "${NEARAI_API_URL:-}" ]; then
  NEARAI_API_URL=https://cloud-api.near.ai/v1
  export NEARAI_API_URL
fi

# Create config directory if it doesn't exist
# Note: Directory is already created and owned by agent in Dockerfile, but ensure it exists
mkdir -p /home/agent/.openclaw
chmod 700 /home/agent/.openclaw 2>/dev/null || true

# Generate config from template if it doesn't exist or if forced
# Set OPENCLAW_FORCE_CONFIG_REGEN=1 to force regeneration even if config exists
FORCE_REGEN="${OPENCLAW_FORCE_CONFIG_REGEN:-0}"
if [ ! -f /home/agent/.openclaw/openclaw.json ] || [ "${FORCE_REGEN}" = "1" ]; then
  if [ "${FORCE_REGEN}" = "1" ]; then
    echo "Force regenerating config from template (OPENCLAW_FORCE_CONFIG_REGEN=1)..."
  else
    echo "Generating config from template..."
  fi

  # Template file must exist
  if [ ! -f /app/openclaw.json.template ]; then
    echo "Error: Template file /app/openclaw.json.template not found" >&2
    exit 1
  fi

  # Export variables for envsubst (only the ones we need)
  export NEARAI_API_KEY
  export NEARAI_API_URL
  export OPENCLAW_GATEWAY_TOKEN
  export OPENCLAW_GATEWAY_BIND="${OPENCLAW_GATEWAY_BIND:-lan}"

  # Use envsubst to substitute environment variables in the template
  # OpenClaw supports ${VAR_NAME} syntax natively, so we can use the template directly
  # Write to tmp file then mv to prevent symlink attacks (entrypoint runs as root,
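  # but /home/agent/.openclaw is agent-owned; a symlink there could overwrite system files)
  # NOTE: this listing omits the generation commands. The four commands below are
  # reconstructed from the comments above and from restore_config (an assumption).
  envsubst '${NEARAI_API_KEY} ${NEARAI_API_URL} ${OPENCLAW_GATEWAY_TOKEN} ${OPENCLAW_GATEWAY_BIND}' \
    < /app/openclaw.json.template > /home/agent/.openclaw/openclaw.json.tmp
  chown agent:agent /home/agent/.openclaw/openclaw.json.tmp
  chmod 600 /home/agent/.openclaw/openclaw.json.tmp
  mv /home/agent/.openclaw/openclaw.json.tmp /home/agent/.openclaw/openclaw.json
  echo "Config file created at /home/agent/.openclaw/openclaw.json"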
fi

# Generate streaming config if it doesn't exist (separate from openclaw.json to avoid schema conflicts)
if [ ! -f /home/agent/.openclaw/streaming.json ] || [ "${FORCE_REGEN}" = "1" ]; then
  if [ -f /app/streaming.json ]; then
    cp /app/streaming.json /home/agent/.openclaw/streaming.json.tmp
    chown agent:agent /home/agent/.openclaw/streaming.json.tmp
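    chmod 600 /home/agent/.openclaw/streaming.json.tmp
    # Move the temp file into place (step inferred from the .tmp pattern; absent from this listing)
    mv /home/agent/.openclaw/streaming.json.tmp /home/agent/.openclaw/streaming.json
    echo "Streaming config created at /home/agent/.openclaw/streaming.json"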
  fi
fi

# Create workspace directory if it doesn't exist
# Note: Directory is already created and owned by agent in Dockerfile, but ensure it exists
mkdir -p /home/agent/openclaw
chmod 700 /home/agent/openclaw 2>/dev/null || true

# Copy workspace bootstrap files (SOUL.md, TOOLS.md, etc.) if they don't already exist
# These are injected into the system prompt by OpenClaw automatically
if [ -d /app/workspace ]; then
  for f in /app/workspace/*.md; do
    [ -f "$f" ] || continue
    fname=$(basename "$f")
    if [ ! -f "/home/agent/openclaw/$fname" ]; then
      cp "$f" "/home/agent/openclaw/$fname"
      chown agent:agent "/home/agent/openclaw/$fname"
      echo "Bootstrap file $fname installed to workspace"
    fi
  done

  # Copy pre-installed skills to managed location (shared across all agents)
  if [ -d /app/workspace/skills ]; then
    mkdir -p /home/agent/.openclaw/skills
    for skill_dir in /app/workspace/skills/*/; do
      [ -d "$skill_dir" ] || continue
      skill_name=$(basename "$skill_dir")
      if [ ! -d "/home/agent/.openclaw/skills/$skill_name" ]; then
        cp -r "$skill_dir" "/home/agent/.openclaw/skills/$skill_name"
        echo "Skill '$skill_name' installed to managed skills"
      fi
    done
    chown -R agent:agent /home/agent/.openclaw/skills
  fi
fi

# ============================================
# Auto-approve Device Pairing (for multi-tenant deployment)
# ============================================
# When OPENCLAW_AUTO_APPROVE_DEVICES=1, automatically approve the FIRST device pairing request only.
# Subsequent devices require manual approval for security.
# This is useful for headless/automated deployments where initial setup needs automation.
#
# NOTE: Since openclaw 2026.2.15, the CLI resolves gateway targets using the bind mode
# (lan/loopback). When bind=lan, the CLI connects via the LAN IP, and the gateway treats
# it as a remote client requiring manual pairing — a chicken-and-egg problem.
# Fix: force the CLI to connect via loopback (ws://127.0.0.1:<port>) so the gateway
# recognizes it as a local client and auto-approves the pairing silently.
# See: openclaw #16299, #11448, #16434
AUTO_APPROVE_DEVICES="${OPENCLAW_AUTO_APPROVE_DEVICES:-0}"
AUTO_APPROVE_FLAG="/home/agent/.openclaw/.device_approved"

start_auto_approve_daemon() {
  if [ "$AUTO_APPROVE_DEVICES" = "1" ]; then
    echo "Starting auto-approve daemon for first device pairing..."
    (
      # Wait for gateway to start
      sleep 10

      # Read gateway port and token from config for loopback CLI connection.
      # The CLI must connect via loopback so the gateway sees it as a local client
      # and auto-approves the pairing (isLocalDirectRequest → silent: true).
      GATEWAY_PORT=$(jq -r '.gateway.port // 18789' /home/agent/.openclaw/openclaw.json 2>/dev/null || echo 18789)
      GATEWAY_TOKEN=$(jq -r '.gateway.auth.token // empty' /home/agent/.openclaw/openclaw.json 2>/dev/null || true)

      if [ -z "$GATEWAY_TOKEN" ]; then
        echo "Warning: Could not read gateway token from config. Auto-approve daemon exiting." >&2
        exit 1
      fi

      LOOPBACK_ARGS="--url ws://127.0.0.1:${GATEWAY_PORT} --token ${GATEWAY_TOKEN}"

      while true; do
        # Check if we already approved a device - if so, exit daemon
        if [ -f "$AUTO_APPROVE_FLAG" ]; then
          echo "First device already approved. Auto-approve daemon exiting."
          exit 0
        fi

        # Get pending device requests (connect via loopback for auto-pairing)
        PENDING=$(runuser -p -u agent -- env HOME=/home/agent openclaw devices list --json $LOOPBACK_ARGS 2>/dev/null || echo '{"pending":[]}')

        # Get the first pending request ID only
        FIRST_REQUEST_ID=$(echo "$PENDING" | jq -r '.pending[0]?.requestId // empty' 2>/dev/null)

        if [ -n "$FIRST_REQUEST_ID" ]; then
          echo "Auto-approving first device pairing request: $FIRST_REQUEST_ID"
          if runuser -p -u agent -- env HOME=/home/agent openclaw devices approve "$FIRST_REQUEST_ID" $LOOPBACK_ARGS 2>/dev/null; then
            # Mark that we've approved a device
            touch "$AUTO_APPROVE_FLAG"
            chown agent:agent "$AUTO_APPROVE_FLAG" 2>/dev/null || true
            echo "First device approved. Subsequent devices require manual approval."
            echo "Auto-approve daemon exiting."
            exit 0
          fi
        fi

        # Check every 5 seconds
        sleep 5
      done
    ) &
    echo "Auto-approve daemon started (will approve first device only)"
  fi
}

# Final ownership fix: ensure everything is owned by agent before dropping privileges
# (config generation and bootstrap above may have created files as root)
# Pre-create subdirs the gateway needs — prevents root-owned dirs at runtime
mkdir -p /home/agent/.openclaw/{identity,credentials,cron,agents,canvas}
chown -R agent:agent /home/agent/.openclaw /home/agent/openclaw

start_auto_approve_daemon

# Config integrity check — restore from template if critical keys are clobbered
# (e.g., AI agent used config.patch/exec to modify openclaw.json and stripped defaults)
validate_config() {
  local cfg="/home/agent/.openclaw/openclaw.json"
  if [ ! -f "$cfg" ]; then
    echo "Warning: Config file missing" >&2
    return 1
  fi
  local primary
  primary=$(jq -r '.agents.defaults.model.primary // empty' "$cfg" 2>/dev/null) || true
  if [ -z "$primary" ]; then
    echo "Warning: agents.defaults.model.primary is missing — config may be clobbered" >&2
    return 1
  fi
  return 0
}

restore_config() {
  echo "Restoring config from template..."
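  export OPENCLAW_GATEWAY_BIND="${OPENCLAW_GATEWAY_BIND:-lan}"
  # NOTE: the envsubst and mv steps are reconstructed (an assumption); this listing
  # only shows the chown/chmod on the .tmp file, which would fail if it did not exist.
  envsubst '${NEARAI_API_KEY} ${NEARAI_API_URL} ${OPENCLAW_GATEWAY_TOKEN} ${OPENCLAW_GATEWAY_BIND}' \
    < /app/openclaw.json.template > /home/agent/.openclaw/openclaw.json.tmp
  chown agent:agent /home/agent/.openclaw/openclaw.json.tmp
  chmod 600 /home/agent/.openclaw/openclaw.json.tmp
  mv /home/agent/.openclaw/openclaw.json.tmp /home/agent/.openclaw/openclaw.json
  echo "Config restored from template"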
}

# Execute the command with automatic restart (openclaw is installed globally)
# The loop keeps the container alive and restarts the gateway if it exits
RESTART_DELAY="${OPENCLAW_RESTART_DELAY:-5}"

while true; do
  echo "Starting: $*"

  chown -R agent:agent /home/agent/.openclaw /home/agent/openclaw 2>/dev/null || true
  
  echo "Cleaning up any orphaned gateway processes..."
  pkill -9 -f openclaw-gateway 2>/dev/null || true

  runuser -p -u agent -- /home/agent/.local/share/pnpm/openclaw gateway run --bind lan --port 18789 && EXIT_CODE=$? || EXIT_CODE=$?
  echo "Process exited with code $EXIT_CODE. Restarting in ${RESTART_DELAY}s..."
  sleep "$RESTART_DELAY"
done