diff --git a/.env.example b/.env.example index 1d89f82..9b2fd75 100644 --- a/.env.example +++ b/.env.example @@ -29,6 +29,17 @@ DEFAULT_TENANT=CityGraph-new2 DEFAULT_PROJECT=CityGraph-new2 INGEST_API_KEYS=dev-key-1 +# External services, optional +AMAP_WEB_KEY= +AMAP_JS_KEY= +AMAP_SECURITY_JSCODE= +GAODE_CRAWLER_PATH= +TRAVEL_KG_DATA_ROOT=./data +TRAVEL_AGENCY_SOURCE_ROOT=./data/source/travel_agency +TRAVEL_DELIVERY_ROOT=./data/source/travel_delivery_20260602 +TRAVEL_KG_EXPORT_ROOT=./data/exports +TRAVEL_KG_ENV_PATH=./.env + # Docker host ports API_PORT=8102 POSTGRES_PORT=5433 diff --git a/CHANGELOG.md b/CHANGELOG.md new file mode 100644 index 0000000..b4f19dc --- /dev/null +++ b/CHANGELOG.md @@ -0,0 +1,13 @@ +# 更新日志 + +## 0.1.0 - 2026-06-09 + +首个 GitHub 发布版本。 + +- 整理 `new2` 旅行知识图谱系统源码。 +- 增加 Dockerfile 和 Docker Compose 一键启动环境。 +- 随仓库发布 PostgreSQL 与 FalkorDB 数据快照。 +- 增加 PostgreSQL 初始化恢复脚本和 FalkorDB 快照种子容器。 +- 增加安全版 `.env.example`。 +- 增加中文 README、系统介绍、架构、部署、数据快照、API 和维护文档。 +- 本地验证 Docker 构建、快照恢复、管理后台访问和管理员登录。 diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 0000000..1af3e11 --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,48 @@ +# 贡献指南 + +欢迎基于本项目继续扩展旅行知识图谱能力。提交前请先确保本地可以构建和启动。 + +## 开发环境 + +后端: + +```bash +python3 -m venv .venv +source .venv/bin/activate +pip install -r requirements.txt +cp .env.example .env +python -m uvicorn app.main:app --host 0.0.0.0 --port 8102 --reload +``` + +前端: + +```bash +cd admin-web +npm install +npm run dev +``` + +Docker: + +```bash +docker compose up -d --build +``` + +## 提交规范 + +- 保持改动聚焦,一次提交解决一个明确问题。 +- 不提交 `.env`、密钥、浏览器缓存、日志、数据卷和 `node_modules/`。 +- 如果修改 API,同步更新 `docs/API_REFERENCE.md`。 +- 如果修改部署配置,同步更新 `README.md` 和 `docs/DEPLOYMENT.md`。 +- 如果修改快照,同步更新 `docs/DATA_SNAPSHOTS.md` 中的数量和哈希。 + +## 建议检查 + +```bash +python -m compileall app +cd admin-web && npm run build +cd .. 
+docker compose config +``` + +涉及 Docker、快照或登录逻辑的修改,需要额外执行完整 compose 启动验证。 diff --git a/README.md b/README.md index 2ff1705..7bd5412 100644 --- a/README.md +++ b/README.md @@ -1,16 +1,36 @@ # 旅行知识图谱管理系统 -这是 `new2` 版旅行/城市知识图谱系统,包含 FastAPI 后端、React 管理后台、PostgreSQL 管理库和 FalkorDB 图数据库。仓库已附带数据库快照,用户下载后可以用 Docker 直接恢复图谱数据。 +![Python](https://img.shields.io/badge/Python-3.12-blue) +![FastAPI](https://img.shields.io/badge/FastAPI-0.136-green) +![React](https://img.shields.io/badge/React-18-61dafb) +![Docker](https://img.shields.io/badge/Docker-Compose-2496ed) +![Graph](https://img.shields.io/badge/Graph-FalkorDB-purple) -## 数据快照 +面向贵州、贵阳旅行场景的知识图谱管理系统。项目包含 FastAPI 后端、React 管理后台、PostgreSQL 管理库、FalkorDB 图数据库、图谱 schema、采集/抽取脚本和可恢复的数据快照。用户 clone 仓库后可以直接用 Docker 启动完整系统,并恢复随仓库发布的图谱数据。 -- PostgreSQL:`snapshots/postgres/kg_admin_new2.dump` -- FalkorDB:`snapshots/falkordb/dump.rdb` -- 默认管理库 schema:`kg_admin_new2` -- 默认 FalkorDB 图谱:`guiyang_new2` -- 空间图谱:`guiyang_spatial_v1` +## 项目亮点 -当前快照包含约 `80609` 条空间 POI、`37457` 条候选实体,以及 FalkorDB 中的贵阳与旅行社相关图谱。详细结构见 `docs/reports/new2_current_kg_schema_snapshot.md`。 +- 完整后台:数据源、批次、实体审核、证据质量、图谱广场、发布回滚、权限、任务和通知模块。 +- 双数据库架构:PostgreSQL 保存后台业务数据与审核流程,FalkorDB 保存可查询图谱。 +- 可复现数据:仓库内置 PostgreSQL 与 FalkorDB 快照,下载后可恢复图谱数据。 +- Docker 一键启动:`docker compose up -d --build` 同时启动 API、管理后台、PostgreSQL 和 FalkorDB。 +- 旅行客服场景:内置固定线路、周边资源、酒店报价、车辆、行程推荐和图谱问答相关接口。 +- 可扩展 Agent:保留高德、网页、小红书、抖音、事件抽取、多源对齐、审计等 Agent 代码。 + +## 系统架构 + +```mermaid +flowchart LR + U["运营/标注/客服用户"] --> W["React 管理后台 /admin"] + W --> A["FastAPI 后端 /v1/admin"] + A --> P[("PostgreSQL\nkg_admin_new2")] + A --> F[("FalkorDB\nguiyang_new2\nguiyang_spatial_v1")] + A --> S["Schema 与采集脚本"] + A --> L["OpenAI 兼容 LLM\n可选"] + A --> M["高德地图 API\n可选"] +``` + +更详细的模块说明见 [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md)。 ## 快速启动 @@ -37,12 +57,25 @@ http://localhost:8102/admin 常用服务端口: -- 管理后台/API:`8102` -- PostgreSQL:`5433` -- FalkorDB Redis 协议:`6380` -- FalkorDB Browser:`3002` +| 服务 | 默认地址 | +| --- | --- | +| 管理后台/API | 
`http://localhost:8102` | +| 管理后台页面 | `http://localhost:8102/admin` | +| API 文档 | `http://localhost:8102/docs` | +| PostgreSQL | `localhost:5433` | +| FalkorDB Redis 协议 | `localhost:6380` | +| FalkorDB Browser | `http://localhost:3002` | -## 重新恢复快照 +## 数据快照 + +仓库包含可恢复快照: + +| 数据库 | 快照文件 | 默认库/图 | +| --- | --- | --- | +| PostgreSQL | `snapshots/postgres/kg_admin_new2.dump` | database `kg_admin`, schema `kg_admin_new2` | +| FalkorDB | `snapshots/falkordb/dump.rdb` | graph `guiyang_new2`, `guiyang_spatial_v1` | + +当前快照包含约 `80609` 条空间 POI、`37457` 条候选实体,以及 FalkorDB 中的贵阳与旅行社相关图谱。详细结构见 [docs/reports/new2_current_kg_schema_snapshot.md](docs/reports/new2_current_kg_schema_snapshot.md) 和 [docs/DATA_SNAPSHOTS.md](docs/DATA_SNAPSHOTS.md)。 PostgreSQL 初始化脚本只会在 Docker 数据卷首次创建时执行。如果要丢弃本地修改并从仓库快照重新恢复: @@ -51,7 +84,7 @@ docker compose down -v docker compose up -d --build ``` -## 验证数据 +## 验证命令 检查 API: @@ -72,14 +105,48 @@ docker compose exec postgres psql -U admin -d kg_admin \ docker compose exec falkordb redis-cli -p 6379 GRAPH.LIST ``` +## 项目结构 + +```text +. 
+├── app/ # FastAPI 后端、API 路由、Agent、图谱核心逻辑 +├── admin-web/ # React + Vite 管理后台源码 +├── docker/ # Docker 初始化脚本 +├── docs/ # 项目说明、架构、部署、数据和报告文档 +├── schema搭建/ # 图谱 schema、百科样例数据和 DSL 资料 +├── scripts/ # 采集、构建、发布和快照导出脚本 +├── snapshots/ # 可恢复数据库快照 +├── Dockerfile # 后端镜像与前端静态资源构建 +├── docker-compose.yml # 一键启动完整系统 +├── .env.example # 本地配置模板 +└── requirements.txt # Python 后端依赖 +``` + +不会上传 `.env`、`node_modules/`、运行日志、浏览器缓存、本地 Docker 数据卷和临时文件。仓库保留系统源码、Docker 配置、schema、脚本、必要报告和数据库快照。 + +## 文档导航 + +| 文档 | 内容 | +| --- | --- | +| [docs/PROJECT_OVERVIEW.md](docs/PROJECT_OVERVIEW.md) | 系统定位、业务能力和功能地图 | +| [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) | 容器、后端、前端和数据架构 | +| [docs/DEPLOYMENT.md](docs/DEPLOYMENT.md) | Docker 部署、端口、环境变量和常见问题 | +| [docs/DATA_SNAPSHOTS.md](docs/DATA_SNAPSHOTS.md) | 数据快照、恢复、重导出和校验 | +| [docs/API_REFERENCE.md](docs/API_REFERENCE.md) | API 分组、常用接口和调用示例 | +| [docs/MAINTENANCE.md](docs/MAINTENANCE.md) | 维护流程、发布检查和仓库边界 | +| [CONTRIBUTING.md](CONTRIBUTING.md) | 开发参与和提交规范 | +| [SECURITY.md](SECURITY.md) | 演示账号、密钥和生产安全建议 | +| [CHANGELOG.md](CHANGELOG.md) | 版本记录 | + ## 本地开发 -后端依赖: +后端开发: ```bash python3 -m venv .venv source .venv/bin/activate pip install -r requirements.txt +cp .env.example .env python -m uvicorn app.main:app --host 0.0.0.0 --port 8102 --reload ``` @@ -101,4 +168,8 @@ npm run dev bash scripts/export_snapshots.sh ``` -然后提交 `snapshots/` 中更新后的 dump 文件即可。 +然后提交 `snapshots/` 中更新后的 dump 文件即可。快照文件接近 GitHub 单文件建议上限,更新前请先确认文件大小不超过 GitHub 的 100 MB 单文件限制。 + +## 生产使用提醒 + +默认账号、数据库密码和 `AUTH_SECRET` 仅用于演示。正式部署前请修改 `.env` 或 compose 环境变量,并为 LLM、高德等外部服务单独配置密钥。更多建议见 [SECURITY.md](SECURITY.md)。 diff --git a/SECURITY.md b/SECURITY.md new file mode 100644 index 0000000..48978a7 --- /dev/null +++ b/SECURITY.md @@ -0,0 +1,39 @@ +# 安全说明 + +本仓库默认配置用于本地演示和项目复现,不适合直接暴露到公网。 + +## 演示账号 + +Docker 快照初始化后会提供演示账号: + +```text +admin@example.com / change-me +``` + +生产环境必须修改默认密码,并替换 `AUTH_SECRET`。 + +## 密钥管理 + +不要提交以下内容: + +- `.env` +- 数据库真实密码 +- `AUTH_SECRET` +- LLM API Key +- 高德地图 
Key +- Cookie、浏览器 profile 或登录态文件 + +`.env.example` 只保留可运行的示例值。 + +## 生产部署建议 + +- 使用 HTTPS 和反向代理。 +- 不向公网暴露 PostgreSQL 和 FalkorDB 端口。 +- 修改数据库账号、密码和默认管理员账号。 +- 定期备份 PostgreSQL 与 FalkorDB 数据卷。 +- 为 LLM、高德等外部服务配置最小权限密钥。 +- 开启容器日志和异常监控。 + +## 报告问题 + +如果发现凭据泄露、越权访问或数据安全问题,请优先私下联系仓库维护者处理,再公开 issue。 diff --git a/admin-web/src/panels/plaza/ManualIngestPanel.tsx b/admin-web/src/panels/plaza/ManualIngestPanel.tsx index 1af04bd..ce4f9f9 100644 --- a/admin-web/src/panels/plaza/ManualIngestPanel.tsx +++ b/admin-web/src/panels/plaza/ManualIngestPanel.tsx @@ -1141,8 +1141,8 @@ export default function ManualIngestPanel() { const fillSample = () => { setText(SAMPLE_TEXT); setRootEntity("花溪公园"); - setSourceName("医学文档"); - setSourceUrl("/Users/jier/upload/demo-manual-document.txt"); + setSourceName("旅行示例文档"); + setSourceUrl("demo-source://manual-ingest/huaxi-park.txt"); setBusinessScene("scenic"); }; diff --git a/admin-web/src/panels/plaza/TravelAgencyAssistantPanel.tsx b/admin-web/src/panels/plaza/TravelAgencyAssistantPanel.tsx index d65204a..ac5f6d4 100644 --- a/admin-web/src/panels/plaza/TravelAgencyAssistantPanel.tsx +++ b/admin-web/src/panels/plaza/TravelAgencyAssistantPanel.tsx @@ -833,11 +833,13 @@ function resultPanelTitle(payload: AssistantPayload | null) { if (mode === "nearby_resource") return "景区附近资源"; if (mode === "hotel_resource") return "酒店价格资源"; if (mode === "route_catalog") return "线路清单"; + if (mode === "route_price") return "线路报价"; return "路线推荐"; } function modeFallback(mode?: string) { const labels: Record<string, string> = { + route_price: "线路报价", route_catalog: "线路清单", multi_task_agent: "多任务客服 Agent", route_match_fast: "固定线路快速匹配", diff --git a/app/agents/gaode_connector.py b/app/agents/gaode_connector.py index 94a6065..613c8cb 100644 --- a/app/agents/gaode_connector.py +++ b/app/agents/gaode_connector.py @@ -11,6 +11,7 @@ from __future__ import annotations import importlib.util import os +from pathlib import Path from typing import Any import requests @@ -18,7 +19,8 @@ import 
urllib3 urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning) -_CRAWL_PATH = "/Users/xuexue/PycharmProjects/PythonProject/xuexue-CityGraph/crawl_guiyan.py" +_PROJECT_ROOT = Path(__file__).resolve().parents[2] +_CRAWL_PATH = os.getenv("GAODE_CRAWLER_PATH") or str(_PROJECT_ROOT / "scripts" / "crawl_guiyan.py") _mod: Any = None # 高德官方一级 POI 类型编码(按 type code 网格扫描,不靠热度关键词) diff --git a/app/api/travel_assistant.py b/app/api/travel_assistant.py index 5c4432c..3c6e05e 100644 --- a/app/api/travel_assistant.py +++ b/app/api/travel_assistant.py @@ -616,6 +616,7 @@ async def _llm_intent(question: str, fallback: dict[str, Any], enabled: bool) -> TASK_LABELS = { + "route_price": "线路报价", "route_catalog": "线路清单", "route_match": "线路匹配", "nearby_resource": "景区附近酒店/餐饮", @@ -637,6 +638,19 @@ def _route_task_needed(question: str, intent: dict[str, Any]) -> bool: return bool(intent.get("duration_days") or intent.get("destinations")) and _has_any(question, route_terms) +def _route_price_question(question: str) -> bool: + price_terms = ("多少钱", "价格", "报价", "费用", "线路多少钱", "路线多少钱", "产品多少钱", "怎么收费") + route_terms = ( + "路线", "线路", "产品", "行程", "旅游", "游", "几日游", "一日游", "二日游", "三日游", "四日游", + "五日游", "六日游", "黄小西", "小西", "镇梵", + ) + hotel_terms = ("酒店", "住宿", "房型", "房价", "房费", "间夜") + vehicle_terms = ("车辆", "车型", "用车", "车费", "车价") + if _has_any(question, hotel_terms) or _has_any(question, vehicle_terms): + return False + return _has_any(question, price_terms) and _has_any(question, route_terms) + + def _resource_task_needed(question: str) -> bool: resource_terms = ("酒店", "住宿", "住哪", "客栈", "民宿", "餐饮", "餐厅", "吃饭", "饭店", "美食") near_or_choice_terms = ("附近", "周边", "可选", "选择", "推荐", "哪些", "那些", "有哪些", "有什么", "住", "吃") @@ -668,7 +682,9 @@ def _agent_task_plan(question: str, intent: dict[str, Any]) -> list[dict[str, An "reason": reason, }) - if _route_catalog_question(question): + if _route_price_question(question): + add("route_price", 0.95, "命中线路产品报价问题") + elif 
_route_catalog_question(question): add("route_catalog", 0.96, "命中线路清单问法") elif _route_task_needed(question, intent): add("route_match", 0.92, "命中出行天数/景区/推荐行程约束") @@ -730,6 +746,8 @@ def _rule_fast_intent_method(question: str, intent: dict[str, Any]) -> str: tasks = _agent_task_plan(question, intent) if len(tasks) > 1: return "rule_multi_task_fast_path" + if _route_price_question(question): + return "rule_route_price_fast_path" if _vehicle_only_question(question, intent): return "rule_vehicle_fast_path" if _route_catalog_question(question): @@ -1777,6 +1795,7 @@ def _infer_response_mode(response: dict[str, Any]) -> str: method = _value(trace.get("method"), 120) for token, inferred in ( ("multi_task", "multi_task_agent"), + ("route_price", "route_price"), ("route_catalog", "route_catalog"), ("nearby_resource", "nearby_resource"), ("hotel_resource", "hotel_resource"), @@ -1793,6 +1812,7 @@ def _infer_response_mode(response: dict[str, Any]) -> str: def _response_mode_label(mode: str) -> str: labels = { "multi_task_agent": "多任务客服 Agent", + "route_price": "线路报价", "route_catalog": "线路清单", "route_match_fast": "固定线路快速匹配", "fixed_route_item": "固定线路深度匹配", @@ -1813,6 +1833,7 @@ def _agent_capabilities(response: dict[str, Any]) -> list[str]: capabilities: list[str] = [] mode_caps = { "multi_task_agent": ["多意图拆解", "KG 模板编排", "结构化聚合输出"], + "route_price": ["线路报价", "团期价格", "核价边界"], "route_catalog": ["线路清单", "同线路版本合并", "后续追问入口"], "route_match_fast": ["固定线路召回", "景点覆盖评分", "报价规则提示"], "fixed_route_item": ["固定线路召回", "每日行程", "费用槽位", "资源槽位"], @@ -2408,7 +2429,9 @@ def _attach_hotel_rate_summaries(graph_data: dict[str, Any], rate_index: dict[st def _hotel_resource_question(question: str) -> bool: - return any(token in question for token in ("房型", "淡季", "旺季", "挂牌价", "房价", "房费", "酒店价格", "住宿价格", "多少钱")) + hotel_terms = ("酒店", "住宿", "客栈", "民宿", "房型", "房价", "房费", "间夜") + price_terms = ("淡季", "旺季", "挂牌价", "房价", "房费", "酒店价格", "住宿价格", "多少钱") + return any(token in question for token in price_terms) 
and any(token in question for token in hotel_terms) def _match_hotel_rate_entries(question: str, rate_index: dict[str, dict[str, Any]]) -> list[dict[str, Any]]: @@ -3213,7 +3236,16 @@ def _score_fixed_product(intent: dict[str, Any], entry: dict[str, Any]) -> tuple text = _fixed_product_text(entry) route_text = _fixed_route_core_text(entry) score = 0 + score_cap = 100 reasons: list[str] = [] + raw_question = _value(intent.get("raw_text"), 500) + raw_norm = _norm_text(raw_question) + product_name = _value(product.get("name"), 160) + product_name_norm = _norm_text(product_name) + exact_product_name = bool(product_name_norm and len(product_name_norm) >= 8 and product_name_norm in raw_norm) + if exact_product_name: + score += 42 + reasons.append("命中指定线路名称") desired_days = _as_optional_int(intent.get("duration_days")) desired_nights = _as_optional_int(intent.get("duration_nights")) product_days, product_nights = _entry_duration_values(entry) @@ -3231,15 +3263,20 @@ def _score_fixed_product(intent: dict[str, Any], entry: dict[str, Any]) -> tuple else: reasons.append(f"{desired_days}天固定路线匹配") else: + score_cap = min(score_cap, 72) reasons.append(f"时长不匹配:需求{_duration_label(desired_days, desired_nights)},产品{_duration_label(product_days, product_nights)}") + missing_required = False for dest in intent.get("destinations") or []: aliases = ATTRACTION_ALIASES.get(dest, [dest]) if _contains_any(route_text, aliases): - score += 35 + score += 24 reasons.append(f"覆盖{dest}") else: - score -= 28 + missing_required = True + score -= 40 reasons.append(f"未覆盖{dest}") + if missing_required: + score_cap = min(score_cap, 45) requested_destinations = set(intent.get("destinations") or []) if requested_destinations and not intent.get("inferred_destinations"): extra_destinations = [ @@ -3248,11 +3285,25 @@ def _score_fixed_product(intent: dict[str, Any], entry: dict[str, Any]) -> tuple if canonical not in requested_destinations and _contains_any(route_text, aliases) ] if extra_destinations: - 
score -= min(24, 6 * len(extra_destinations)) + if not exact_product_name: + score_cap = min(score_cap, max(68, 90 - 8 * len(extra_destinations))) + score -= min(36, 12 * len(extra_destinations)) reasons.append(f"含额外景点{'、'.join(extra_destinations[:3])},需确认是否接受") - if product.get("base_price_status") == "ready_for_reference_quote": + has_price_reference = bool( + product.get("base_price_status") == "ready_for_reference_quote" + or _value(product.get("base_price_text"), 80) + or _value(product.get("adult_settlement_text"), 80) + or _value(product.get("child_settlement_text"), 80) + or _value(product.get("free_ticket_settlement_text"), 80) + or _value(product.get("single_room_diff_text"), 80) + or _value(product.get("quote_formula"), 80) + ) + if has_price_reference: score += 8 reasons.append("已有报价表依据") + elif intent.get("price_query"): + score_cap = min(score_cap, 70) + reasons.append("线路报价数据待补") if _is_low_budget(intent): if any(term in text for term in ("经济", "性价比", "普通", "四钻", "4钻")): score += 5 @@ -3272,7 +3323,7 @@ def _score_fixed_product(intent: dict[str, Any], entry: dict[str, Any]) -> tuple score += 2 if not reasons: reasons.append("按固定路线、景点覆盖和资源槽位综合匹配") - return score, reasons[:7] + return min(score, score_cap), reasons[:7] def _format_price_reference(product: dict[str, Any], intent: dict[str, Any]) -> str: @@ -4210,6 +4261,7 @@ def _fixed_route_match_fast_response( intent_method: str, graph_data: dict[str, Any], ) -> dict[str, Any]: + price_query = bool(intent.get("price_query") or _route_price_question(question)) entries = list(graph_data.get("products", {}).values()) strict_matches = _strict_fixed_route_matches(intent, entries) strict_note = _strict_route_gap_note(intent, entries) @@ -4236,11 +4288,22 @@ def _fixed_route_match_fast_response( ) ranked = _dedupe_route_entries(ranked) plans = [_build_fixed_route_plan(entry, idx + 1, intent) for idx, entry in enumerate(ranked[:8])] - followups = [ - "要不要继续查某个景区附近可选酒店/餐饮?", - "要不要继续查这条线路的可选车辆?", - 
"要不要继续查某个景区的门票/观光车/保险等费用?", - ] + if price_query: + for plan in plans: + plan["price_query"] = True + if plan.get("rank") == 1: + plan["label"] = "线路报价" + followups = [ + "请确认出行日期属于哪个价格区间。", + "请确认成人/儿童人数、酒店档位和是否有单房差。", + "请确认是否接受线路中额外包含的景点,或需要继续找更贴合的线路。", + ] + else: + followups = [ + "要不要继续查某个景区附近可选酒店/餐饮?", + "要不要继续查这条线路的可选车辆?", + "要不要继续查某个景区的门票/观光车/保险等费用?", + ] evidence: list[dict[str, Any]] = [] for plan in plans: evidence.append({"type": "固定线路产品", "name": plan["plan_name"], "summary": plan["route_summary"], "source": plan["source"]}) @@ -4258,15 +4321,20 @@ def _fixed_route_match_fast_response( "strict_match_count": len(strict_matches), "strict_match_note": strict_note, "resource_counts": {"TourProduct": len(entries)}, - "method": "fixed_route_item_route_match_fast_v1", + "method": "fixed_route_item_route_price_lookup_v1" if price_query else "fixed_route_item_route_match_fast_v1", "intent_method": intent_method, - "response_mode": "route_match_fast", + "response_mode": "route_price" if price_query else "route_match_fast", + "price_query": price_query, }, } def _copy_fixed_text(plans: list[dict[str, Any]], followups: list[str], strict_note: str = "") -> str: - lines = ["您好,按您当前需求,先从已有固定线路产品里匹配如下:"] + is_price_query = any(plan.get("price_query") for plan in plans) + if is_price_query: + lines = ["您好,按您当前需求,先从已有固定线路产品里匹配并核对报价如下:"] + else: + lines = ["您好,按您当前需求,先从已有固定线路产品里匹配如下:"] if strict_note: lines.append(f"注意:{strict_note} 以下为相近替代方案,不要直接承诺完全满足客户天数/景点。") for plan in plans[:8]: @@ -4276,7 +4344,8 @@ def _copy_fixed_text(plans: list[dict[str, Any]], followups: list[str], strict_n lines.append(f"{plan['rank']}. 
{line_prefix}{plan['plan_name']}({_duration_label(plan.get('duration_days'), plan.get('duration_nights'))})") lines.append(f"匹配点:{'、'.join(plan['match_reasons'][:4])}") lines.append(f"路线:{plan['route_summary']}") - lines.append(f"报价依据:{plan['quote_summary']}") + quote_label = "线路报价" if is_price_query else "报价依据" + lines.append(f"{quote_label}:{plan['quote_summary']}") vehicle = next((item["detail"] for item in plan.get("cost_breakdown", []) if item["category"] == "小包团用车"), "") if vehicle: lines.append(f"用车建议:{vehicle}") @@ -4325,7 +4394,7 @@ def _fixed_route_item_task_response( if kind == "route_catalog": catalog_data = shared.setdefault("catalog_data", _cached_fixed_route_catalog_graph(graph_name)) return _fixed_route_catalog_response(question, graph_name, intent, intent_method, catalog_data, limit=5) - if kind == "route_match": + if kind in {"route_match", "route_price"}: catalog_data = shared.setdefault("catalog_data", _cached_fixed_route_catalog_graph(graph_name)) return _fixed_route_match_fast_response(question, graph_name, intent, intent_method, catalog_data) if kind == "nearby_resource": @@ -4419,6 +4488,9 @@ def _fixed_route_item_response(question: str, graph_name: str, intent: dict[str, if responses: return _merge_fixed_task_responses(question, graph_name, intent, intent_method, executed_tasks, responses) + if _route_price_question(question): + catalog_data = _cached_fixed_route_catalog_graph(graph_name) + return _fixed_route_match_fast_response(question, graph_name, intent, intent_method, catalog_data) if _route_catalog_question(question): catalog_data = _cached_fixed_route_catalog_graph(graph_name) return _fixed_route_catalog_response(question, graph_name, intent, intent_method, catalog_data) @@ -4869,6 +4941,7 @@ async def travel_assistant_query( if intent_method == "llm_intent_parser": intent = _guard_llm_intent(question, rule_intent, intent) intent = _complete_intent_defaults(question, intent) + intent["price_query"] = _route_price_question(question) 
planned_tasks = _agent_task_plan(question, intent) intent_confidence = _rule_agent_confidence(question, intent, planned_tasks) intent["planned_tasks"] = planned_tasks diff --git a/docker-compose.yml b/docker-compose.yml index 903c837..829f137 100644 --- a/docker-compose.yml +++ b/docker-compose.yml @@ -86,6 +86,7 @@ services: AMAP_WEB_KEY: "" AMAP_JS_KEY: "" AMAP_SECURITY_JSCODE: "" + GAODE_CRAWLER_PATH: "" ports: - "${API_PORT:-8102}:8000" depends_on: diff --git a/docs/API_REFERENCE.md b/docs/API_REFERENCE.md new file mode 100644 index 0000000..aaf038c --- /dev/null +++ b/docs/API_REFERENCE.md @@ -0,0 +1,91 @@ +# API 说明 + +FastAPI 会自动生成交互式 API 文档。启动服务后访问: + +```text +http://localhost:8102/docs +``` + +所有后台接口统一挂载在: + +```text +/v1/admin +``` + +## 常用接口 + +| 接口 | 方法 | 说明 | +| --- | --- | --- | +| `/v1/admin/health` | `GET` | 健康检查 | +| `/v1/admin/auth/login` | `POST` | 管理员登录 | +| `/v1/admin/auth/me` | `GET` | 当前登录用户 | +| `/v1/admin/projects` | `GET/POST` | 项目管理 | +| `/v1/admin/ontology-schemas/current` | `GET` | 当前 schema | +| `/v1/admin/source-profiles` | `GET/POST/PATCH` | 数据源管理 | +| `/v1/admin/batches` | `GET` | 批次管理 | +| `/v1/admin/entities` | `GET` | 候选实体列表 | +| `/v1/admin/conflicts` | `GET` | 冲突列表 | +| `/v1/admin/publish-jobs` | `GET/POST` | 发布任务 | +| `/v1/admin/graph/overview` | `GET` | 图谱概览 | +| `/v1/admin/graph/query` | `POST` | 图谱查询 | +| `/v1/admin/plaza/overview` | `GET` | 图谱广场概览 | +| `/v1/admin/manual-ingest/extract` | `POST` | 手动抽取 | +| `/v1/admin/travel/assistant-query` | `POST` | 旅行客服问答 | +| `/v1/admin/super-agent/run` | `POST` | Super Agent 任务 | +| `/v1/admin/roles` | `GET/POST` | 角色管理 | +| `/v1/admin/users` | `GET/POST` | 用户管理 | +| `/v1/admin/areas/tree` | `GET` | 区域树 | +| `/v1/admin/notifications` | `GET` | 通知列表 | + +## 登录示例 + +```bash +curl -s http://localhost:8102/v1/admin/auth/login \ + -H 'Content-Type: application/json' \ + -d '{"username":"admin@example.com","password":"change-me"}' +``` + +返回中包含 `access_token`,后续接口可使用: + +```bash 
+TOKEN="上一步返回的 access_token" +curl http://localhost:8102/v1/admin/auth/me \ + -H "Authorization: Bearer $TOKEN" +``` + +## 图谱查询示例 + +```bash +curl http://localhost:8102/v1/admin/graph/overview +``` + +```bash +curl http://localhost:8102/v1/admin/graph/query \ + -H 'Content-Type: application/json' \ + -d '{"query":"MATCH (n) RETURN n LIMIT 10"}' +``` + +## 旅行客服问答示例 + +```bash +curl http://localhost:8102/v1/admin/travel/assistant-query \ + -H 'Content-Type: application/json' \ + -d '{"question":"黄小西三日游多少钱?"}' +``` + +## 前端调用 + +React 管理后台通过 `admin-web/src/api.ts` 访问同源 API。Docker 部署时前端和 API 同在 `http://localhost:8102`,因此无需额外配置跨域代理。 + +## 外部服务 + +LLM 和高德地图相关能力默认关闭或留空。启用前需要在 `.env` 或 Docker Compose 环境变量中配置: + +- `LLM_API_BASE` +- `LLM_API_KEY` +- `LLM_MODEL` +- `LLM_EXTRACTION_ENABLED=true` +- `AMAP_WEB_KEY` +- `AMAP_JS_KEY` +- `AMAP_SECURITY_JSCODE` +- `GAODE_CRAWLER_PATH` diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md new file mode 100644 index 0000000..2b663ef --- /dev/null +++ b/docs/ARCHITECTURE.md @@ -0,0 +1,81 @@ +# 架构说明 + +系统采用前后端分离加双数据库架构。Docker Compose 会启动 PostgreSQL、FalkorDB 和 FastAPI API 服务;React 管理后台在镜像构建时被打包为静态文件,由 FastAPI 挂载到 `/admin`。 + +## 容器拓扑 + +```mermaid +flowchart TB + subgraph compose["Docker Compose"] + API["api\nFastAPI + 静态后台"] + PG["postgres\nPostgreSQL 16"] + Seed["falkordb-seed\n复制 dump.rdb"] + FK["falkordb\nFalkorDB"] + end + Browser["浏览器"] --> API + API --> PG + API --> FK + Seed --> FK +``` + +## 服务职责 + +| 服务 | 职责 | +| --- | --- | +| `api` | 提供 `/v1/admin/*` API,挂载 `/admin` 前端页面,连接 PostgreSQL 与 FalkorDB | +| `postgres` | 保存后台业务数据、用户权限、项目、候选实体、审核记录和任务 | +| `falkordb-seed` | 首次启动时把仓库内的 `snapshots/falkordb/dump.rdb` 写入数据卷 | +| `falkordb` | 保存图数据库,支持 Cypher/Redis 协议访问和 FalkorDB Browser | + +## 后端模块 + +| 路径 | 说明 | +| --- | --- | +| `app/main.py` | FastAPI 入口、CORS、路由挂载和前端静态资源挂载 | +| `app/config.py` | 环境变量配置 | +| `app/db.py` | PostgreSQL 连接池 | +| `app/api/` | 管理后台 API 路由 | +| `app/agents/` | 采集、抽取、对齐、审计和外部站点 Agent | +| `app/kg_core/` | 
空间图谱与核心图谱辅助逻辑 | +| `app/schemas/` | 抽取 schema | +| `app/security.py`、`app/auth.py` | 登录、令牌和权限相关逻辑 | + +## 前端模块 + +| 路径 | 说明 | +| --- | --- | +| `admin-web/src/App.tsx` | 管理后台主应用与路由 | +| `admin-web/src/api.ts` | API 客户端 | +| `admin-web/src/panels/plaza/` | 图谱广场、用户查询、手动抽取和 Super Agent | +| `admin-web/src/panels/acquisition/` | 数据源、批次和冲突工作台 | +| `admin-web/src/panels/review/` | 证据质量、字段审核、专家签核和资产库 | +| `admin-web/src/panels/modeling/` | Schema、词表和健康检查 | +| `admin-web/src/panels/publish/` | 发布与回滚 | +| `admin-web/src/panels/system/` | 用户、权限、区域、通知、Agent 设置和日志 | + +## 数据层 + +PostgreSQL 和 FalkorDB 承担不同职责: + +- PostgreSQL:结构化管理数据、审核过程数据、用户权限、任务、来源、候选实体和证据。 +- FalkorDB:图谱实体、关系、路线、资源、POI、空间索引和面向查询的图结构。 + +默认配置: + +| 配置项 | 默认值 | +| --- | --- | +| PostgreSQL database | `kg_admin` | +| PostgreSQL schema | `kg_admin_new2` | +| FalkorDB 业务图 | `guiyang_new2` | +| FalkorDB 空间图 | `guiyang_spatial_v1` | + +## 构建过程 + +`Dockerfile` 使用多阶段构建: + +1. Node.js 阶段进入 `admin-web/`,执行 `npm ci` 和 `npm run build`。 +2. Python 阶段安装 `requirements.txt`。 +3. 复制 `app/`、`schema搭建/` 和前端构建产物到镜像。 +4. 容器启动 `uvicorn app.main:app --host 0.0.0.0 --port 8000`。 + +快照文件不会进入 API 镜像,它们由 `docker-compose.yml` 作为只读挂载提供给数据库容器。 diff --git a/docs/DATA_SNAPSHOTS.md b/docs/DATA_SNAPSHOTS.md new file mode 100644 index 0000000..c4cb066 --- /dev/null +++ b/docs/DATA_SNAPSHOTS.md @@ -0,0 +1,103 @@ +# 数据快照说明 + +仓库内置数据库快照,目的是让用户 clone 后可以直接恢复 `new2` 图谱系统,而不需要重新采集、抽取和发布数据。 + +## 快照文件 + +| 文件 | 类型 | 用途 | +| --- | --- | --- | +| `snapshots/postgres/kg_admin_new2.dump` | PostgreSQL custom dump | 恢复后台业务库、账号、项目、候选实体、审核数据等 | +| `snapshots/falkordb/dump.rdb` | FalkorDB RDB | 恢复图数据库中的业务图和空间图 | + +默认数据: + +| 项 | 值 | +| --- | --- | +| PostgreSQL database | `kg_admin` | +| PostgreSQL schema | `kg_admin_new2` | +| FalkorDB 业务图 | `guiyang_new2` | +| FalkorDB 空间图 | `guiyang_spatial_v1` | +| 空间 POI | 约 `80609` 条 | +| 候选实体 | 约 `37457` 条 | + +## 恢复流程 + +Docker Compose 首次启动时会自动恢复: + +1. `postgres` 容器创建数据卷。 +2. 
`docker/postgres-init/01-restore-snapshot.sh` 使用 `pg_restore` 恢复 PostgreSQL 快照。 +3. 脚本把演示管理员密码重置为 `change-me`。 +4. `falkordb-seed` 容器把 `dump.rdb` 复制到 FalkorDB 数据卷。 +5. `falkordb` 容器读取 RDB 并加载图数据。 + +如果数据卷已存在,初始化不会重复执行。需要重置时运行: + +```bash +docker compose down -v +docker compose up -d --build +``` + +## 校验数据 + +PostgreSQL: + +```bash +docker compose exec postgres psql -U admin -d kg_admin \ + -c "SELECT COUNT(*) FROM kg_admin_new2.amap_spatial_pois;" +``` + +候选实体: + +```bash +docker compose exec postgres psql -U admin -d kg_admin \ + -c "SELECT COUNT(*) FROM kg_admin_new2.candidate_entities;" +``` + +FalkorDB 图列表: + +```bash +docker compose exec falkordb redis-cli -p 6379 GRAPH.LIST +``` + +快照文件哈希: + +```bash +shasum -a 256 snapshots/postgres/kg_admin_new2.dump snapshots/falkordb/dump.rdb +``` + +当前发布快照的参考哈希: + +```text +c70a3fe2730cd40a96e729097cef1eb39c66498371b88b2e36e985c923043e75 snapshots/postgres/kg_admin_new2.dump +dde96ac99bff58d18bb00e84939772d8a4efc4893aeeae02329aa893ae51f247 snapshots/falkordb/dump.rdb +``` + +## 重新导出快照 + +如果本机仍保留 `new2` 原始容器,可以运行: + +```bash +bash scripts/export_snapshots.sh +``` + +脚本默认从以下容器导出: + +| 容器 | 用途 | +| --- | --- | +| `zn-kg-new2-postgres` | 导出 `kg_admin_new2` schema | +| `zn-kg-new2-falkordb` | 触发 `BGSAVE` 并复制 `dump.rdb` | + +导出后请验证: + +```bash +ls -lh snapshots/postgres/kg_admin_new2.dump snapshots/falkordb/dump.rdb +shasum -a 256 snapshots/postgres/kg_admin_new2.dump snapshots/falkordb/dump.rdb +``` + +## 仓库边界 + +`data/`、浏览器 profile、日志、缓存和本地 Docker 数据卷不进入 Git。它们是运行过程或采集过程中的临时产物,不适合作为 GitHub 项目内容。系统可复现所需的数据已经收敛到 `snapshots/`、`schema搭建/`、`docs/` 和源码目录中。 + +## GitHub 文件大小提醒 + +GitHub 单文件硬限制为 100 MB。当前两个快照均低于该限制。后续如果快照继续增大,建议改用 Git LFS、Release Asset 或对象存储,并在 README 中保留下载和恢复说明。 diff --git a/docs/DEPLOYMENT.md b/docs/DEPLOYMENT.md new file mode 100644 index 0000000..7099fc0 --- /dev/null +++ b/docs/DEPLOYMENT.md @@ -0,0 +1,169 @@ +# 部署指南 + +本文档说明如何用 Docker 启动、重置、配置和排查旅行知识图谱管理系统。 + +## 环境要求 + +- Docker Desktop 或 Docker Engine +- 
Docker Compose v2 +- 至少 4 GB 可用内存 +- 至少 3 GB 可用磁盘空间 + +## 一键启动 + +```bash +docker compose up -d --build +``` + +启动后访问: + +```text +http://localhost:8102/admin +``` + +默认账号: + +```text +admin@example.com / change-me +``` + +## 首次启动会发生什么 + +1. 构建 API 镜像,并打包 React 管理后台。 +2. 创建 PostgreSQL 数据卷。 +3. PostgreSQL 初始化脚本恢复 `snapshots/postgres/kg_admin_new2.dump`。 +4. `falkordb-seed` 把 `snapshots/falkordb/dump.rdb` 写入 FalkorDB 数据卷。 +5. FastAPI 服务等待 PostgreSQL 和 FalkorDB 健康后启动。 + +## 常用命令 + +查看服务状态: + +```bash +docker compose ps +``` + +查看 API 日志: + +```bash +docker compose logs -f api +``` + +停止服务: + +```bash +docker compose down +``` + +停止并删除数据卷,下一次启动将重新恢复快照: + +```bash +docker compose down -v +docker compose up -d --build +``` + +## 端口配置 + +可以在启动时覆盖端口: + +```bash +API_PORT=18102 \ +POSTGRES_PORT=15433 \ +FALKORDB_PORT=16380 \ +FALKORDB_BROWSER_PORT=13002 \ +docker compose up -d --build +``` + +默认端口: + +| 变量 | 默认值 | 说明 | +| --- | --- | --- | +| `API_PORT` | `8102` | FastAPI 与管理后台 | +| `POSTGRES_PORT` | `5433` | PostgreSQL 映射端口 | +| `FALKORDB_PORT` | `6380` | FalkorDB Redis 协议端口 | +| `FALKORDB_BROWSER_PORT` | `3002` | FalkorDB Browser | + +## 环境变量 + +Docker Compose 已提供可运行默认值。生产部署时建议改成 `.env` 文件或部署平台的环境变量。 + +| 变量 | 说明 | +| --- | --- | +| `DATABASE_URL` | 后端连接 PostgreSQL 的 URL | +| `DB_SCHEMA` | 默认 `kg_admin_new2` | +| `DB_MIGRATIONS_ENABLED` | 快照部署默认 `false` | +| `FALKORDB_HOST` | Docker 内默认 `falkordb` | +| `FALKORDB_GRAPH` | 默认业务图 `guiyang_new2` | +| `AUTH_SECRET` | JWT 签名密钥,生产必须替换 | +| `AUTH_DEFAULT_USERNAME` | 默认管理员用户名 | +| `AUTH_DEFAULT_PASSWORD` | 默认管理员密码 | +| `LLM_API_BASE` | OpenAI 兼容模型服务地址,可选 | +| `LLM_API_KEY` | LLM 密钥,可选 | +| `LLM_EXTRACTION_ENABLED` | 是否启用 LLM 抽取 | +| `AMAP_WEB_KEY`、`AMAP_JS_KEY` | 高德地图密钥,可选 | +| `GAODE_CRAWLER_PATH` | 外部高德采集脚本路径,可选 | +| `TRAVEL_AGENCY_SOURCE_ROOT` | 旅行社原始资料目录,仅运行采集/构图脚本时需要 | +| `TRAVEL_DELIVERY_ROOT` | POI 交付 CSV 目录,仅运行采集/增强脚本时需要 | +| `TRAVEL_KG_EXPORT_ROOT` | 采集/构图脚本导出目录 | + +## 健康检查 + +API: + +```bash +curl 
http://localhost:8102/v1/admin/health +``` + +PostgreSQL: + +```bash +docker compose exec postgres pg_isready -U admin -d kg_admin +``` + +FalkorDB: + +```bash +docker compose exec falkordb redis-cli -p 6379 PING +``` + +## 常见问题 + +### 管理后台 404 + +确认镜像已重新构建: + +```bash +docker compose up -d --build api +``` + +前端静态资源由 FastAPI 挂载在 `/admin`,直接访问根路径不会进入后台。 + +### 数据没有恢复 + +PostgreSQL 初始化脚本只在数据卷首次创建时运行。如果已经创建过数据卷,需要先删除卷: + +```bash +docker compose down -v +docker compose up -d --build +``` + +### 端口被占用 + +使用端口变量覆盖默认端口,例如: + +```bash +API_PORT=18102 docker compose up -d +``` + +### 登录失败 + +初始化脚本会把 `admin@example.com` 的演示密码设置为 `change-me`。如果仍失败,先确认 PostgreSQL 已重新恢复快照,再查看 API 日志。 + +## 生产加固建议 + +- 修改 `AUTH_SECRET`、数据库密码和默认管理员密码。 +- 不要把真实 `.env`、LLM key、高德 key 提交到仓库。 +- 用反向代理提供 HTTPS。 +- 给 PostgreSQL 和 FalkorDB 配置持久化备份。 +- 如果面向公网,限制数据库端口暴露,只暴露 API/前端。 +- 开启日志采集和容器监控。 diff --git a/docs/MAINTENANCE.md b/docs/MAINTENANCE.md new file mode 100644 index 0000000..e48b1db --- /dev/null +++ b/docs/MAINTENANCE.md @@ -0,0 +1,101 @@ +# 维护指南 + +本文档记录项目维护、发布和提交前检查流程,帮助仓库长期保持可下载、可启动、可理解。 + +## 提交前检查 + +```bash +git status --short +python -m compileall app +cd admin-web && npm run build +cd .. 
+docker compose config +``` + +如果修改 Docker 或数据库快照,建议再做一次完整启动验证: + +```bash +API_PORT=18102 \ +POSTGRES_PORT=15433 \ +FALKORDB_PORT=16380 \ +FALKORDB_BROWSER_PORT=13002 \ +docker compose up -d --build + +curl http://localhost:18102/v1/admin/health + +docker compose down -v +``` + +## 更新系统代码 + +- 后端 API 放在 `app/api/`。 +- Agent 和抽取流程放在 `app/agents/`。 +- 图谱核心能力放在 `app/kg_core/`。 +- 前端页面和面板放在 `admin-web/src/`。 +- 配置项统一从 `app/config.py` 和环境变量读取。 + +历史采集/构图脚本的本地资料路径统一在 `scripts/common_paths.py` 中配置,默认指向仓库内 `data/source` 和 `data/exports`。需要接入自己的原始资料时,可通过 `TRAVEL_AGENCY_SOURCE_ROOT`、`TRAVEL_DELIVERY_ROOT`、`TRAVEL_KG_EXPORT_ROOT`、`GAODE_CRAWLER_PATH` 覆盖。 + +新增功能时请同步更新: + +- README 的功能说明或项目结构。 +- `docs/API_REFERENCE.md` 中的接口列表。 +- 需要运行环境变量时更新 `.env.example` 和 `docs/DEPLOYMENT.md`。 + +## 更新图谱 schema + +图谱 schema 和 DSL 资料主要放在 `schema搭建/`。如果 schema 已经发布到数据库或 FalkorDB,建议同时更新: + +- schema 文件 +- 发布脚本 +- `docs/reports/new2_current_kg_schema_snapshot.md` +- 数据快照 + +## 更新数据快照 + +从本地 `new2` 容器导出: + +```bash +bash scripts/export_snapshots.sh +``` + +导出后检查: + +```bash +ls -lh snapshots/postgres/kg_admin_new2.dump snapshots/falkordb/dump.rdb +shasum -a 256 snapshots/postgres/kg_admin_new2.dump snapshots/falkordb/dump.rdb +``` + +快照更新后必须执行 Docker 恢复验证,确保新用户 clone 仓库后可以直接使用。 + +## 仓库不应包含 + +- `.env` 和真实密钥 +- `node_modules/` +- Python 虚拟环境 +- Playwright/浏览器 profile +- Docker 数据卷 +- 临时日志、截图和缓存 +- 超过 GitHub 限制的大文件 + +## 发布建议 + +成熟发布建议包含: + +1. 代码和文档已提交。 +2. Docker 可以从零构建。 +3. PostgreSQL 和 FalkorDB 快照可以恢复。 +4. 默认账号可以登录。 +5. README 中的启动命令与端口正确。 +6. 
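The GitHub repository home page shows the system's positioning, architecture, data, and deployment method. + +## Current verification record + +Docker verification has been completed locally on non-default ports: + +- The API health check returns `{"status":"ok"}`. +- The admin console at `/admin/` returns `200`. +- After the PostgreSQL restore, `kg_admin_new2.amap_spatial_pois` contains `80609` rows. +- After the PostgreSQL restore, `kg_admin_new2.candidate_entities` contains `37457` rows. +- FalkorDB lists graphs including `guiyang_new2` and `guiyang_spatial_v1`. +- Login with `admin@example.com / change-me` succeeds. diff --git a/docs/PROJECT_OVERVIEW.md b/docs/PROJECT_OVERVIEW.md new file mode 100644 index 0000000..969b6b2 --- /dev/null +++ b/docs/PROJECT_OVERVIEW.md @@ -0,0 +1,58 @@ +# System Overview + +The Travel Knowledge Graph Management System is the `new2` version of a knowledge graph platform for the city and travel domain, aimed at scenic areas, travel agencies, cultural-tourism operations, and intelligent customer service. It turns collected materials, encyclopedia text, POI spatial data, route products, and hotel/dining/vehicle resources into graph assets that can be reviewed, published, and queried. + +## Target users + +- Cultural-tourism operators: view graph coverage, data quality, gaps, and publish status. +- Data annotators and reviewers: handle entity fields, evidence sources, conflict merging, and expert sign-off. +- Product and route staff: maintain fixed routes, attraction combinations, pricing notes, and resource constraints. +- Customer-service AI developers: build route Q&A, nearby-resource recommendations, and quote lookups on top of the graph APIs. +- Engineering maintainers: reproduce the system and data with Docker, snapshots, and scripts. + +## Core capabilities + +| Capability | Description | +| --- | --- | +| Data sources and batch management | Manage sources, collection batches, raw records, and quality summaries | +| Entity review | View candidate entities, field decisions, evidence chains, review history, and merges | +| Graph plaza | Summarize graph size, usage, health alerts, and user queries | +| Schema management | Manage ontology schemas, DSL, versions, and publish records | +| Evidence quality | Aggregate POI evidence, resource quality, and field confidence | +| Publish and rollback | Create publish jobs, view diffs, and roll back graph versions | +| City spatial graph | Support nearby search using AMap POIs and spatial grids | +| Travel customer-service agent | Supports route lists, route matching, route quotes, hotel resources, vehicles, and nearby-resource queries | +| Permissions and organization | Built-in roles, capability matrix, users, and regional responsibility management | + +## Typical business workflow + +```mermaid +flowchart LR + A["Collect/import materials"] --> B["Extract candidate entities"] + B --> C["Field evidence and quality checks"] + C --> D["Manual review/conflict merging"] + D --> E["Publish to the FalkorDB graph"] + E --> F["Graph queries/customer-service Q&A/operations analytics"] +``` + +## Contents shipped with the repository + +- Backend source: `app/` +- Frontend source: `admin-web/` +- Docker runtime: `Dockerfile`, `docker-compose.yml`, `docker/` +- Graph data snapshots: `snapshots/postgres/`, `snapshots/falkordb/` +- Graph schemas and sample materials: `schema搭建/` +- Collection, build, publish, and snapshot scripts: `scripts/` +- Project documentation: `README.md`, `docs/` + +## Default demo data + +The snapshots come from the local `new2` system and contain data for Guiyang/Guizhou travel scenarios: + +- PostgreSQL schema: `kg_admin_new2` +- Main FalkorDB graph: `guiyang_new2` +- FalkorDB spatial graph: `guiyang_spatial_v1` +- Spatial POIs: about `80609` +- Candidate entities: about `37457` + +This data is published with the repository and is restored automatically by the Docker init scripts after download. diff --git a/docs/kg-redesign/entity_alignment_publish_policy.md 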
b/docs/kg-redesign/entity_alignment_publish_policy.md index 92f8b20..114b6ee 100644 --- a/docs/kg-redesign/entity_alignment_publish_policy.md +++ b/docs/kg-redesign/entity_alignment_publish_policy.md @@ -198,6 +198,6 @@ NEARBY_ATTRACTION -> 青岩古镇: aligned to amap:B035300ESE Corresponding script: ```text -/Users/xuexue/new2/scripts/align_huaxi_kg_with_existing_graph.py +scripts/align_huaxi_kg_with_existing_graph.py ``` diff --git a/docs/kg-redesign/implementation_plan.md b/docs/kg-redesign/implementation_plan.md index def8878..2094c2e 100644 --- a/docs/kg-redesign/implementation_plan.md +++ b/docs/kg-redesign/implementation_plan.md @@ -4,7 +4,7 @@ Completed: -- Copied from `/Users/xuexue/new` to `/Users/xuexue/new2`. +- Copied from the original `new` directory to the project root. - Excluded `node_modules`, `.env`, `data`, `__pycache__`, and runtime artifacts. - Moved the spatial report materials and benchmark materials into `docs/reports`. diff --git a/docs/kg-redesign/new2_clone_status.md b/docs/kg-redesign/new2_clone_status.md index 8fab91a..ae98db3 100644 --- a/docs/kg-redesign/new2_clone_status.md +++ b/docs/kg-redesign/new2_clone_status.md @@ -2,7 +2,7 @@ ## File replication -`/Users/xuexue/new2` has been fully backfilled from `/Users/xuexue/new`: +The project root has been fully backfilled from the original `new` directory: - `.env` - `admin-web/node_modules` diff --git a/docs/reports/new2_current_kg_schema_snapshot.md b/docs/reports/new2_current_kg_schema_snapshot.md index 068883a..e79afb3 100644 --- a/docs/reports/new2_current_kg_schema_snapshot.md +++ b/docs/reports/new2_current_kg_schema_snapshot.md @@ -1,7 +1,7 @@ # new2 Current Knowledge Graph Schema Snapshot Generated: 2026-05-28 -Project directory: `/Users/xuexue/new2` +Project directory: project root ## 1. 
Current configuration diff --git a/docs/客服图谱快速查询实现说明.md b/docs/客服图谱快速查询实现说明.md index 79e9be9..33dfadc 100644 --- a/docs/客服图谱快速查询实现说明.md +++ b/docs/客服图谱快速查询实现说明.md @@ -28,8 +28,8 @@ Code location: -- `/Users/xuexue/new2/app/api/travel_assistant.py:1944` -- `/Users/xuexue/new2/app/api/travel_assistant.py:3082` +- `app/api/travel_assistant.py:1944` +- `app/api/travel_assistant.py:3082` ### Method 2: route lists use a lightweight graph query @@ -47,8 +47,8 @@ TourProduct -> ProductDay -> RouteStop -> ScenicAttraction/SubAttraction Code location: -- `/Users/xuexue/new2/app/api/travel_assistant.py:1679` -- `/Users/xuexue/new2/app/api/travel_assistant.py:2975` +- `app/api/travel_assistant.py:1679` +- `app/api/travel_assistant.py:2975` ### Method 3: resources near a scenic area are queried directly via NEARBY relationships @@ -63,7 +63,7 @@ ScenicAttraction -> ATTRACTION_NEARBY_RESOURCE -> Restaurant Code location: -- `/Users/xuexue/new2/app/api/travel_assistant.py:2013` +- `app/api/travel_assistant.py:2013` ### Method 4: fee resources support two query paths @@ -86,7 +86,7 @@ ScenicAttraction -> ATTRACTION_HAS_ITEM -> TravelItem Code location: -- `/Users/xuexue/new2/app/api/travel_assistant.py:2196` +- `app/api/travel_assistant.py:2196` ### Method 5: only recommendation questions go through full graph ranking @@ -114,8 +114,8 @@ NEARBY Code location: -- `/Users/xuexue/new2/app/api/travel_assistant.py:1536` -- `/Users/xuexue/new2/app/api/travel_assistant.py:3082` +- `app/api/travel_assistant.py:1536` +- `app/api/travel_assistant.py:3082` ## 3. Results verified so far @@ -146,10 +146,10 @@ NEARBY Code location: -- `/Users/xuexue/new2/admin-web/src/panels/plaza/TravelAgencyAssistantPanel.tsx:131` -- `/Users/xuexue/new2/admin-web/src/panels/plaza/TravelAgencyAssistantPanel.tsx:156` -- `/Users/xuexue/new2/admin-web/src/panels/plaza/TravelAgencyAssistantPanel.tsx:182` -- `/Users/xuexue/new2/admin-web/src/panels/plaza/TravelAgencyAssistantPanel.tsx:218` +- `admin-web/src/panels/plaza/TravelAgencyAssistantPanel.tsx:131` +- `admin-web/src/panels/plaza/TravelAgencyAssistantPanel.tsx:156` +- `admin-web/src/panels/plaza/TravelAgencyAssistantPanel.tsx:182` +- `admin-web/src/panels/plaza/TravelAgencyAssistantPanel.tsx:218` ## 5. 
Whether the current prototype requirements are met diff --git a/scripts/align_huaxi_kg_with_existing_graph.py b/scripts/align_huaxi_kg_with_existing_graph.py index 1547c01..4f965d3 100644 --- a/scripts/align_huaxi_kg_with_existing_graph.py +++ b/scripts/align_huaxi_kg_with_existing_graph.py @@ -13,7 +13,7 @@ import sys from pathlib import Path from typing import Any -ROOT = Path("/Users/xuexue/new2") +ROOT = Path(__file__).resolve().parents[1] if str(ROOT) not in sys.path: sys.path.insert(0, str(ROOT)) diff --git a/scripts/amap_js_api_probe.mjs b/scripts/amap_js_api_probe.mjs index 39cee9f..ed89e4a 100644 --- a/scripts/amap_js_api_probe.mjs +++ b/scripts/amap_js_api_probe.mjs @@ -1,6 +1,7 @@ import http from "node:http"; import { spawn } from "node:child_process"; import { readFileSync } from "node:fs"; +import path from "node:path"; function readEnvKey(path, key) { const txt = readFileSync(path, "utf8"); @@ -62,8 +63,10 @@ async function wait(ms) { } async function main() { - const key = readEnvKey("/Users/xuexue/new2/.env", "AMAP_JS_KEY"); - const security = readEnvKey("/Users/xuexue/new2/.env", "AMAP_SECURITY_JSCODE"); + const root = path.resolve(new URL("..", import.meta.url).pathname); + const envPath = process.env.TRAVEL_KG_ENV_PATH || path.join(root, ".env"); + const key = readEnvKey(envPath, "AMAP_JS_KEY"); + const security = readEnvKey(envPath, "AMAP_SECURITY_JSCODE"); if (!key || !security) throw new Error("missing AMap JS key/security"); const chrome = spawn( diff --git a/scripts/build_travel_agency_project.py b/scripts/build_travel_agency_project.py index 12470e0..51c61b3 100644 --- a/scripts/build_travel_agency_project.py +++ b/scripts/build_travel_agency_project.py @@ -16,10 +16,11 @@ from falkordb import FalkorDB from psycopg.rows import dict_row from psycopg.types.json import Jsonb +from common_paths import PROJECT_ROOT, TRAVEL_AGENCY_SOURCE_ROOT, TRAVEL_KG_EXPORT_ROOT -SOURCE_DIR = Path("/Users/xuexue/Downloads/旅行社业务") -OUT_DIR = Path("/Users/xuexue/Downloads/图谱数据/旅行社项目入库") 
-SCHEMA_DIR = Path("/Users/xuexue/new2/schema搭建/travel_agency_business") +SOURCE_DIR = TRAVEL_AGENCY_SOURCE_ROOT +OUT_DIR = TRAVEL_KG_EXPORT_ROOT / "旅行社项目入库" +SCHEMA_DIR = PROJECT_ROOT / "schema搭建/travel_agency_business" DB_URL = "postgresql://admin:password@localhost:5433/kg_admin" DB_SCHEMA = "kg_admin_new2" TENANT_ID = "travel_agency" @@ -1829,7 +1830,7 @@ def write_outputs(builder: KGBuilder, schema: dict[str, Any], qa: list[dict[str, f"生成时间:{datetime.now().strftime('%Y-%m-%d %H:%M:%S')}", "", "## 数据来源", - "- `/Users/xuexue/Downloads/旅行社业务/2026年新行程打包`:既有线路产品、每日行程、费用包含/不含、自费项、风险提示。", + "- `TRAVEL_AGENCY_SOURCE_ROOT/2026年新行程打包`:既有线路产品、每日行程、费用包含/不含、自费项、风险提示。", "- `滨海国旅2-8人拼小团计划...xlsx`:2-8人拼小团团期、房型、成人/儿童/单房差、景区小交通、证件退费政策。", "- `20-25人独立成团.xlsx`:独立成团产品、季节价、20/25人报价、泰语导游和2+1大巴服务。", "- `住宿资源库(四钻及以上).xlsx`、`餐厅资源库.xlsx`:酒店/餐厅资源、区域、价格、适用场景。", diff --git a/scripts/build_travel_fixed_route_item_graph.py b/scripts/build_travel_fixed_route_item_graph.py index d92a606..626a923 100644 --- a/scripts/build_travel_fixed_route_item_graph.py +++ b/scripts/build_travel_fixed_route_item_graph.py @@ -18,15 +18,16 @@ from falkordb import FalkorDB from psycopg.rows import dict_row from psycopg.types.json import Jsonb +from common_paths import PROJECT_ROOT, TRAVEL_AGENCY_SOURCE_ROOT, TRAVEL_KG_EXPORT_ROOT -SOURCE_ROOT = Path("/Users/xuexue/Downloads/旅行社业务") +SOURCE_ROOT = TRAVEL_AGENCY_SOURCE_ROOT ROUTE_SOURCE_DIR = SOURCE_ROOT / "2026年新行程打包" ROUTE_MD_DIR = SOURCE_ROOT / "2026年新行程打包_md整理" ROUTE_MD_PRODUCTS = ROUTE_MD_DIR / "products" -LEGACY_SCRIPT = Path("/Users/xuexue/new2/scripts/build_travel_graph_existing_product_project.py") +LEGACY_SCRIPT = PROJECT_ROOT / "scripts/build_travel_graph_existing_product_project.py" -SCHEMA_OUT_DIR = Path("/Users/xuexue/new2/schema搭建/travel_fixed_route_item") -OUT_DIR = Path("/Users/xuexue/Downloads/图谱数据/travel_fixed_route_item_旅行社固定线路资源图谱") +SCHEMA_OUT_DIR = PROJECT_ROOT / "schema搭建/travel_fixed_route_item" +OUT_DIR = TRAVEL_KG_EXPORT_ROOT / 
"travel_fixed_route_item_旅行社固定线路资源图谱" DB_URL = "postgresql://admin:password@localhost:5433/kg_admin" DB_SCHEMA = "kg_admin_new2" diff --git a/scripts/build_travel_graph_existing_product_project.py b/scripts/build_travel_graph_existing_product_project.py index 91e1be3..150d0ef 100644 --- a/scripts/build_travel_graph_existing_product_project.py +++ b/scripts/build_travel_graph_existing_product_project.py @@ -17,13 +17,14 @@ from falkordb import FalkorDB from psycopg.rows import dict_row from psycopg.types.json import Jsonb +from common_paths import PROJECT_ROOT, TRAVEL_AGENCY_SOURCE_ROOT, TRAVEL_KG_EXPORT_ROOT -SOURCE_ROOT = Path("/Users/xuexue/Downloads/旅行社业务") +SOURCE_ROOT = TRAVEL_AGENCY_SOURCE_ROOT ROUTE_MD_DIR = SOURCE_ROOT / "2026年新行程打包_md整理" ROUTE_MD_PRODUCTS = ROUTE_MD_DIR / "products" -SCHEMA_SRC = Path("/Users/xuexue/new2/schema搭建/travel_agency_business/travel_agency_existing_product_schema.v1.json") -SCHEMA_OUT_DIR = Path("/Users/xuexue/new2/schema搭建/travel_graph_existing_product") -OUT_DIR = Path("/Users/xuexue/Downloads/图谱数据/travel_graph_旅行社线路制定") +SCHEMA_SRC = PROJECT_ROOT / "schema搭建/travel_agency_business/travel_agency_existing_product_schema.v1.json" +SCHEMA_OUT_DIR = PROJECT_ROOT / "schema搭建/travel_graph_existing_product" +OUT_DIR = TRAVEL_KG_EXPORT_ROOT / "travel_graph_旅行社线路制定" AMAP_CACHE_PATH = OUT_DIR / "amap_poi_enrichment_cache.json" AMAP_DRIVING_CACHE_PATH = OUT_DIR / "amap_driving_distance_cache.json" diff --git a/scripts/cleanup_huaxi_demo_duplicates.py b/scripts/cleanup_huaxi_demo_duplicates.py index 039b73f..56f4dbb 100644 --- a/scripts/cleanup_huaxi_demo_duplicates.py +++ b/scripts/cleanup_huaxi_demo_duplicates.py @@ -5,7 +5,7 @@ from __future__ import annotations import sys from pathlib import Path -ROOT = Path("/Users/xuexue/new2") +ROOT = Path(__file__).resolve().parents[1] if str(ROOT) not in sys.path: sys.path.insert(0, str(ROOT)) diff --git a/scripts/common_paths.py b/scripts/common_paths.py new file mode 100644 index 
0000000..31fc059 --- /dev/null +++ b/scripts/common_paths.py @@ -0,0 +1,17 @@ +from __future__ import annotations + +import os +from pathlib import Path + + +PROJECT_ROOT = Path(__file__).resolve().parents[1] +DATA_ROOT = Path(os.getenv("TRAVEL_KG_DATA_ROOT", PROJECT_ROOT / "data")).expanduser() +TRAVEL_AGENCY_SOURCE_ROOT = Path( + os.getenv("TRAVEL_AGENCY_SOURCE_ROOT", DATA_ROOT / "source" / "travel_agency") +).expanduser() +TRAVEL_DELIVERY_ROOT = Path( + os.getenv("TRAVEL_DELIVERY_ROOT", DATA_ROOT / "source" / "travel_delivery_20260602") +).expanduser() +TRAVEL_KG_EXPORT_ROOT = Path(os.getenv("TRAVEL_KG_EXPORT_ROOT", DATA_ROOT / "exports")).expanduser() +GAODE_CRAWLER_PATH = Path(os.getenv("GAODE_CRAWLER_PATH", PROJECT_ROOT / "scripts" / "crawl_guiyan.py")).expanduser() +ENV_PATH = Path(os.getenv("TRAVEL_KG_ENV_PATH", PROJECT_ROOT / ".env")).expanduser() diff --git a/scripts/enrich_travel_graph_amap_driving_metrics.py b/scripts/enrich_travel_graph_amap_driving_metrics.py index 54f1f54..c454f8a 100644 --- a/scripts/enrich_travel_graph_amap_driving_metrics.py +++ b/scripts/enrich_travel_graph_amap_driving_metrics.py @@ -21,10 +21,12 @@ from typing import Any import requests import urllib3 +from common_paths import GAODE_CRAWLER_PATH, PROJECT_ROOT, TRAVEL_KG_EXPORT_ROOT + urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning) -BUILD_SCRIPT = Path("/Users/xuexue/new2/scripts/build_travel_graph_existing_product_project.py") -OUT_DIR = Path("/Users/xuexue/Downloads/图谱数据/travel_graph_旅行社线路制定") +BUILD_SCRIPT = PROJECT_ROOT / "scripts/build_travel_graph_existing_product_project.py" +OUT_DIR = TRAVEL_KG_EXPORT_ROOT / "travel_graph_旅行社线路制定" NODES_PATH = OUT_DIR / "抽取结果_nodes.json" CACHE_PATH = OUT_DIR / "amap_driving_distance_cache.json" REPORT_CSV = OUT_DIR / "amap_driving_distance_report.csv" @@ -48,7 +50,7 @@ def load_key() -> str: for key in (os.environ.get("AMAP_WEB_KEY"), os.environ.get("AMAP_KEY")): if key: return key - crawl_path = 
Path("/Users/xuexue/PycharmProjects/PythonProject/xuexue-CityGraph/crawl_guiyan.py") + crawl_path = GAODE_CRAWLER_PATH if crawl_path.exists(): spec = importlib.util.spec_from_file_location("crawl_guiyan", crawl_path) mod = importlib.util.module_from_spec(spec) diff --git a/scripts/enrich_travel_graph_amap_pois.py b/scripts/enrich_travel_graph_amap_pois.py index 6736131..cf003a5 100644 --- a/scripts/enrich_travel_graph_amap_pois.py +++ b/scripts/enrich_travel_graph_amap_pois.py @@ -21,10 +21,12 @@ from typing import Any import requests import urllib3 +from common_paths import GAODE_CRAWLER_PATH, PROJECT_ROOT, TRAVEL_KG_EXPORT_ROOT + urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning) -BUILD_SCRIPT = Path("/Users/xuexue/new2/scripts/build_travel_graph_existing_product_project.py") -OUT_DIR = Path("/Users/xuexue/Downloads/图谱数据/travel_graph_旅行社线路制定") +BUILD_SCRIPT = PROJECT_ROOT / "scripts/build_travel_graph_existing_product_project.py" +OUT_DIR = TRAVEL_KG_EXPORT_ROOT / "travel_graph_旅行社线路制定" NODES_PATH = OUT_DIR / "抽取结果_nodes.json" CACHE_PATH = OUT_DIR / "amap_poi_enrichment_cache.json" REPORT_CSV = OUT_DIR / "amap_poi_enrichment_report.csv" @@ -47,7 +49,7 @@ def load_key() -> str: for key in (os.environ.get("AMAP_WEB_KEY"), os.environ.get("AMAP_KEY")): if key: return key - crawl_path = Path("/Users/xuexue/PycharmProjects/PythonProject/xuexue-CityGraph/crawl_guiyan.py") + crawl_path = GAODE_CRAWLER_PATH if crawl_path.exists(): spec = importlib.util.spec_from_file_location("crawl_guiyan", crawl_path) mod = importlib.util.module_from_spec(spec) diff --git a/scripts/enrich_travel_poi_with_amap.py b/scripts/enrich_travel_poi_with_amap.py index dee278b..ff12376 100644 --- a/scripts/enrich_travel_poi_with_amap.py +++ b/scripts/enrich_travel_poi_with_amap.py @@ -18,10 +18,10 @@ import urllib.request from pathlib import Path from typing import Any +from common_paths import ENV_PATH, TRAVEL_DELIVERY_ROOT -BASE_DIR = 
Path("/Users/xuexue/Documents/trae_projects/travel- graph/delivery_20260602") +BASE_DIR = TRAVEL_DELIVERY_ROOT OUT_DIR = BASE_DIR / "amap_enriched" -ENV_PATH = Path("/Users/xuexue/Desktop/zn-kg/.env") CACHE_PATH = OUT_DIR / "_amap_cache.json" SCENIC_TYPES = "110000" diff --git a/scripts/enrich_travel_poi_with_amap_js.mjs b/scripts/enrich_travel_poi_with_amap_js.mjs index 4665a5e..dbc2415 100644 --- a/scripts/enrich_travel_poi_with_amap_js.mjs +++ b/scripts/enrich_travel_poi_with_amap_js.mjs @@ -4,7 +4,8 @@ import path from "node:path"; import http from "node:http"; import { spawn } from "node:child_process"; -const BASE_DIR = "/Users/xuexue/Documents/trae_projects/travel- graph/delivery_20260602"; +const ROOT = path.resolve(new URL("..", import.meta.url).pathname); +const BASE_DIR = process.env.TRAVEL_DELIVERY_ROOT || path.join(ROOT, "data", "source", "travel_delivery_20260602"); const OUT_DIR = path.join(BASE_DIR, "amap_js_enriched"); const CACHE_FILE = path.join(OUT_DIR, "_amap_js_cache.json"); const CHROME = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"; @@ -185,8 +186,9 @@ function jsString(v) { } async function initAmap(cdp) { - const key = readEnvKey("/Users/xuexue/new2/.env", "AMAP_JS_KEY"); - const security = readEnvKey("/Users/xuexue/new2/.env", "AMAP_SECURITY_JSCODE"); + const envPath = process.env.TRAVEL_KG_ENV_PATH || path.join(ROOT, ".env"); + const key = readEnvKey(envPath, "AMAP_JS_KEY"); + const security = readEnvKey(envPath, "AMAP_SECURITY_JSCODE"); if (!key || !security) throw new Error("missing AMap JS key/security"); const expr = ` (async () => { diff --git a/scripts/export_travel_agency_2_0_for_tencent.py b/scripts/export_travel_agency_2_0_for_tencent.py index 4dc9b3d..02fe3a8 100644 --- a/scripts/export_travel_agency_2_0_for_tencent.py +++ b/scripts/export_travel_agency_2_0_for_tencent.py @@ -12,9 +12,11 @@ from typing import Any from falkordb import FalkorDB +from common_paths import TRAVEL_KG_EXPORT_ROOT + GRAPH_NAME = 
"travel_agency_2_0_test" -OUT_ROOT = Path("/Users/xuexue/Downloads/图谱数据/travel_agency_2_0_test_旅行社2.0测试") +OUT_ROOT = TRAVEL_KG_EXPORT_ROOT / "travel_agency_2_0_test_旅行社2.0测试" SCHEMA_SIMPLE = OUT_ROOT / "tencent_adp_schema.simple.json" FILTERED_DIR = OUT_ROOT / "filtered_import_from_travel_fixed_route_item" POI_DIR = OUT_ROOT / "poi_nearby_import_without_amap" diff --git a/scripts/fix_huaxi_event_temporal_fields.py b/scripts/fix_huaxi_event_temporal_fields.py index 98996de..cc0f44d 100644 --- a/scripts/fix_huaxi_event_temporal_fields.py +++ b/scripts/fix_huaxi_event_temporal_fields.py @@ -13,7 +13,7 @@ import sys from pathlib import Path from typing import Any -ROOT = Path("/Users/xuexue/new2") +ROOT = Path(__file__).resolve().parents[1] if str(ROOT) not in sys.path: sys.path.insert(0, str(ROOT)) diff --git a/scripts/import_travel_agency_2_0_customer_service_rules.py b/scripts/import_travel_agency_2_0_customer_service_rules.py index 9c06e4f..fc45b7a 100644 --- a/scripts/import_travel_agency_2_0_customer_service_rules.py +++ b/scripts/import_travel_agency_2_0_customer_service_rules.py @@ -5,9 +5,11 @@ from datetime import datetime from falkordb import FalkorDB +from common_paths import TRAVEL_AGENCY_SOURCE_ROOT + GRAPH_NAME = "travel_agency_2_0_test" -SOURCE_FILE = "/Users/xuexue/Downloads/旅行社业务/线上客资回复话术.docx" +SOURCE_FILE = str(TRAVEL_AGENCY_SOURCE_ROOT / "线上客资回复话术.docx") UPDATED_AT = datetime.now().strftime("%Y-%m-%d %H:%M:%S") diff --git a/scripts/import_travel_poi_nearby_without_amap.py b/scripts/import_travel_poi_nearby_without_amap.py index b65521e..ae4c85b 100644 --- a/scripts/import_travel_poi_nearby_without_amap.py +++ b/scripts/import_travel_poi_nearby_without_amap.py @@ -14,6 +14,8 @@ from falkordb import FalkorDB from psycopg.rows import dict_row from psycopg.types.json import Jsonb +from common_paths import TRAVEL_DELIVERY_ROOT, TRAVEL_KG_EXPORT_ROOT + DB_URL = "postgresql://admin:password@localhost:5433/kg_admin" DB_SCHEMA = "kg_admin_new2" @@ -22,10 
+24,10 @@ PROJECT_ID = "travel_agency_2_0_test" GRAPH_NAME = "travel_agency_2_0_test" TEMPLATE_ID = "travel_agency_2_0_poi_nearby_import_without_amap_v1" -SOURCE_DIR = Path("/Users/xuexue/Documents/trae_projects/travel- graph/delivery_20260602") +SOURCE_DIR = TRAVEL_DELIVERY_ROOT HOTEL_FILE = SOURCE_DIR / "hotel_poi.csv" RESTAURANT_FILE = SOURCE_DIR / "restaurant_poi.csv" -OUT_DIR = Path("/Users/xuexue/Downloads/图谱数据/travel_agency_2_0_test_旅行社2.0测试/poi_nearby_import_without_amap") +OUT_DIR = TRAVEL_KG_EXPORT_ROOT / "travel_agency_2_0_test_旅行社2.0测试/poi_nearby_import_without_amap" RUN_UPDATED_AT = datetime.now().strftime("%Y-%m-%d %H:%M:%S") diff --git a/scripts/kg_schema_v1_preview_from_report.py b/scripts/kg_schema_v1_preview_from_report.py index ec02752..ce3622a 100644 --- a/scripts/kg_schema_v1_preview_from_report.py +++ b/scripts/kg_schema_v1_preview_from_report.py @@ -13,7 +13,7 @@ from copy import deepcopy from pathlib import Path from typing import Any -ROOT = Path("/Users/xuexue/new2") +ROOT = Path(__file__).resolve().parents[1] IN_JSON = ROOT / "docs/reports/huaxi_kg_extraction_comparison.json" SCHEMA_JSON = ROOT / "app/schemas/kg_extraction_v1.schema.json" OUT_JSON = ROOT / "docs/reports/huaxi_kg_schema_v1_ready.json" @@ -217,7 +217,7 @@ def write_review_plan(raw: dict[str, Any], payload: dict[str, Any], validation: "-> final_score < 0.8 或模型冲突:进入人工审核", "```", "", - "对应严格 JSON 输出:`/Users/xuexue/new2/docs/reports/huaxi_kg_schema_v1_ready.json`", + "对应严格 JSON 输出:`docs/reports/huaxi_kg_schema_v1_ready.json`", ] OUT_REVIEW.write_text("\n".join(lines), encoding="utf-8") diff --git a/scripts/migrate_fixed_route_item_to_travel_agency_2_0_core.py b/scripts/migrate_fixed_route_item_to_travel_agency_2_0_core.py index 8c9b798..9dd4be7 100644 --- a/scripts/migrate_fixed_route_item_to_travel_agency_2_0_core.py +++ b/scripts/migrate_fixed_route_item_to_travel_agency_2_0_core.py @@ -14,6 +14,8 @@ from falkordb import FalkorDB from psycopg.rows import dict_row from 
psycopg.types.json import Jsonb +from common_paths import TRAVEL_KG_EXPORT_ROOT + DB_URL = "postgresql://admin:password@localhost:5433/kg_admin" DB_SCHEMA = "kg_admin_new2" @@ -25,7 +27,7 @@ TARGET_PROJECT = "travel_agency_2_0_test" TARGET_GRAPH = "travel_agency_2_0_test" TARGET_TEMPLATE_ID = "travel_agency_2_0_fixed_route_core_import_v1" -OUT_DIR = Path("/Users/xuexue/Downloads/图谱数据/travel_agency_2_0_test_旅行社2.0测试/filtered_import_from_travel_fixed_route_item") +OUT_DIR = TRAVEL_KG_EXPORT_ROOT / "travel_agency_2_0_test_旅行社2.0测试/filtered_import_from_travel_fixed_route_item" RUN_UPDATED_AT = datetime.now().strftime("%Y-%m-%d %H:%M:%S") EXCLUDED_ENTITY_TYPES = { diff --git a/scripts/organize_existing_routes_to_md.py b/scripts/organize_existing_routes_to_md.py index 12d3ffe..3c2e3ea 100644 --- a/scripts/organize_existing_routes_to_md.py +++ b/scripts/organize_existing_routes_to_md.py @@ -9,10 +9,11 @@ from datetime import datetime from pathlib import Path from typing import Any +from common_paths import TRAVEL_AGENCY_SOURCE_ROOT, TRAVEL_KG_EXPORT_ROOT -SOURCE_DIR = Path("/Users/xuexue/Downloads/旅行社业务/2026年新行程打包") -OUT_DIR = Path("/Users/xuexue/Downloads/旅行社业务/2026年新行程打包_md整理") -GRAPH_OUT_DIR = Path("/Users/xuexue/Downloads/图谱数据/旅行社项目入库/已有路线产品Markdown") +SOURCE_DIR = TRAVEL_AGENCY_SOURCE_ROOT / "2026年新行程打包" +OUT_DIR = TRAVEL_AGENCY_SOURCE_ROOT / "2026年新行程打包_md整理" +GRAPH_OUT_DIR = TRAVEL_KG_EXPORT_ROOT / "旅行社项目入库/已有路线产品Markdown" ATTRACTION_ALIASES = { diff --git a/scripts/publish_huaxi_kg_schema_v1_to_falkor.py b/scripts/publish_huaxi_kg_schema_v1_to_falkor.py index 3f9a921..97243f6 100644 --- a/scripts/publish_huaxi_kg_schema_v1_to_falkor.py +++ b/scripts/publish_huaxi_kg_schema_v1_to_falkor.py @@ -9,7 +9,7 @@ import sys from pathlib import Path from typing import Any -ROOT = Path("/Users/xuexue/new2") +ROOT = Path(__file__).resolve().parents[1] if str(ROOT) not in sys.path: sys.path.insert(0, str(ROOT)) diff --git a/scripts/publish_travel_agency_2_2_schema.py 
b/scripts/publish_travel_agency_2_2_schema.py index 2c27c5c..2ab12a0 100644 --- a/scripts/publish_travel_agency_2_2_schema.py +++ b/scripts/publish_travel_agency_2_2_schema.py @@ -10,13 +10,14 @@ from psycopg.rows import dict_row from psycopg.types.json import Jsonb from app.config import settings +from common_paths import PROJECT_ROOT, TRAVEL_KG_EXPORT_ROOT PROJECT_ID = "travel_agency_2_0_test" TENANT_ID = "travel_agency" GRAPH_NAME = "travel_agency_2_0_test" NAMESPACE = "travel_agency_2_0" -SCHEMA_DIR = Path("/Users/xuexue/new2/schema搭建/travel_agency_2_0_test") -DOWNLOAD_DIR = Path("/Users/xuexue/Downloads/图谱数据/travel_agency_2_0_test_旅行社2.0测试") +SCHEMA_DIR = PROJECT_ROOT / "schema搭建/travel_agency_2_0_test" +DOWNLOAD_DIR = TRAVEL_KG_EXPORT_ROOT / "travel_agency_2_0_test_旅行社2.0测试" CURRENT_JSON = SCHEMA_DIR / "travel_agency_2_0_schema.current.json" diff --git a/scripts/publish_travel_agency_2_3_schema.py b/scripts/publish_travel_agency_2_3_schema.py index e302aa0..c2bfb6d 100644 --- a/scripts/publish_travel_agency_2_3_schema.py +++ b/scripts/publish_travel_agency_2_3_schema.py @@ -10,13 +10,14 @@ from psycopg.rows import dict_row from psycopg.types.json import Jsonb from app.config import settings +from common_paths import PROJECT_ROOT, TRAVEL_KG_EXPORT_ROOT PROJECT_ID = "travel_agency_2_0_test" TENANT_ID = "travel_agency" GRAPH_NAME = "travel_agency_2_0_test" NAMESPACE = "travel_agency_2_0" -SCHEMA_DIR = Path("/Users/xuexue/new2/schema搭建/travel_agency_2_0_test") -DOWNLOAD_DIR = Path("/Users/xuexue/Downloads/图谱数据/travel_agency_2_0_test_旅行社2.0测试") +SCHEMA_DIR = PROJECT_ROOT / "schema搭建/travel_agency_2_0_test" +DOWNLOAD_DIR = TRAVEL_KG_EXPORT_ROOT / "travel_agency_2_0_test_旅行社2.0测试" CURRENT_JSON = SCHEMA_DIR / "travel_agency_2_0_schema.current.json" diff --git a/scripts/publish_travel_agency_2_4_schema.py b/scripts/publish_travel_agency_2_4_schema.py index 93246d6..feccdf5 100644 --- a/scripts/publish_travel_agency_2_4_schema.py +++ 
b/scripts/publish_travel_agency_2_4_schema.py @@ -10,14 +10,15 @@ from psycopg.rows import dict_row from psycopg.types.json import Jsonb from app.config import settings +from common_paths import PROJECT_ROOT, TRAVEL_KG_EXPORT_ROOT PROJECT_ID = "travel_agency_2_0_test" TENANT_ID = "travel_agency" GRAPH_NAME = "travel_agency_2_0_test" NAMESPACE = "travel_agency_2_0" -SCHEMA_DIR = Path("/Users/xuexue/new2/schema搭建/travel_agency_2_0_test") -DOWNLOAD_DIR = Path("/Users/xuexue/Downloads/图谱数据/travel_agency_2_0_test_旅行社2.0测试") +SCHEMA_DIR = PROJECT_ROOT / "schema搭建/travel_agency_2_0_test" +DOWNLOAD_DIR = TRAVEL_KG_EXPORT_ROOT / "travel_agency_2_0_test_旅行社2.0测试" CURRENT_JSON = SCHEMA_DIR / "travel_agency_2_0_schema.current.json"