Polish project documentation and runtime config
.env.example (11 lines changed)
@@ -29,6 +29,17 @@ DEFAULT_TENANT=CityGraph-new2
 DEFAULT_PROJECT=CityGraph-new2
 INGEST_API_KEYS=dev-key-1
 
+# External services, optional
+AMAP_WEB_KEY=
+AMAP_JS_KEY=
+AMAP_SECURITY_JSCODE=
+GAODE_CRAWLER_PATH=
+TRAVEL_KG_DATA_ROOT=./data
+TRAVEL_AGENCY_SOURCE_ROOT=./data/source/travel_agency
+TRAVEL_DELIVERY_ROOT=./data/source/travel_delivery_20260602
+TRAVEL_KG_EXPORT_ROOT=./data/exports
+TRAVEL_KG_ENV_PATH=./.env
+
 # Docker host ports
 API_PORT=8102
 POSTGRES_PORT=5433

CHANGELOG.md (new file, 13 lines)
@@ -0,0 +1,13 @@
# Changelog

## 0.1.0 - 2026-06-09

First GitHub release.

- Organized the source code of the `new2` travel knowledge graph system.
- Added a Dockerfile and a Docker Compose environment for one-command startup.
- Shipped PostgreSQL and FalkorDB data snapshots with the repository.
- Added a PostgreSQL init/restore script and a FalkorDB snapshot seed container.
- Added a sanitized `.env.example`.
- Added Chinese-language README, system overview, architecture, deployment, data snapshot, API, and maintenance documentation.
- Verified locally: Docker build, snapshot restore, admin console access, and administrator login.

CONTRIBUTING.md (new file, 48 lines)
@@ -0,0 +1,48 @@
# Contributing Guide

Contributions that extend the travel knowledge graph capabilities of this project are welcome. Before submitting, make sure the project builds and starts locally.

## Development environment

Backend:

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
python -m uvicorn app.main:app --host 0.0.0.0 --port 8102 --reload
```

Frontend:

```bash
cd admin-web
npm install
npm run dev
```

Docker:

```bash
docker compose up -d --build
```

## Commit conventions

- Keep changes focused: one commit solves one clearly defined problem.
- Do not commit `.env`, secrets, browser caches, logs, data volumes, or `node_modules/`.
- If you change the API, update `docs/API_REFERENCE.md` in the same change.
- If you change deployment configuration, update `README.md` and `docs/DEPLOYMENT.md`.
- If you change the snapshots, update the counts and hashes in `docs/DATA_SNAPSHOTS.md`.

## Recommended checks

```bash
python -m compileall app
cd admin-web && npm run build
cd ..
docker compose config
```

Changes that touch Docker, snapshots, or login logic additionally require a full compose startup check.

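README.md (103 lines changed)
@@ -1,16 +1,36 @@
 # Travel Knowledge Graph Management System
 
-This is the `new2` version of the travel/city knowledge graph system. It includes a FastAPI backend, a React admin console, a PostgreSQL admin database, and a FalkorDB graph database. The repository ships with database snapshots, so after downloading it users can restore the graph data directly with Docker.
+
+
+
+
+
 
-## Data snapshots
+A knowledge graph management system for travel scenarios in Guizhou and Guiyang. The project includes a FastAPI backend, a React admin console, a PostgreSQL admin database, a FalkorDB graph database, the graph schema, collection/extraction scripts, and restorable data snapshots. After cloning the repository, users can start the full system with Docker and restore the graph data published with the repository.
 
-- PostgreSQL: `snapshots/postgres/kg_admin_new2.dump`
-- FalkorDB: `snapshots/falkordb/dump.rdb`
-- Default admin database schema: `kg_admin_new2`
-- Default FalkorDB graph: `guiyang_new2`
-- Spatial graph: `guiyang_spatial_v1`
+## Highlights
 
-The current snapshot contains about `80609` spatial POIs, `37457` candidate entities, and the Guiyang and travel-agency graphs in FalkorDB. For the detailed structure, see `docs/reports/new2_current_kg_schema_snapshot.md`.
+- Complete admin console: modules for data sources, batches, entity review, evidence quality, graph plaza, publish/rollback, permissions, tasks, and notifications.
+- Dual-database architecture: PostgreSQL stores admin business data and review workflows; FalkorDB stores the queryable graph.
+- Reproducible data: the repository bundles PostgreSQL and FalkorDB snapshots, so the graph data can be restored after download.
+- One-command Docker startup: `docker compose up -d --build` starts the API, admin console, PostgreSQL, and FalkorDB together.
+- Travel customer-service scenarios: built-in endpoints for fixed routes, nearby resources, hotel quotes, vehicles, itinerary recommendations, and graph Q&A.
+- Extensible agents: agent code is retained for Amap (Gaode), web pages, Xiaohongshu, Douyin, event extraction, multi-source alignment, and auditing.
+
+## System architecture
+
+```mermaid
+flowchart LR
+  U["Operations / annotation / customer-service users"] --> W["React admin console /admin"]
+  W --> A["FastAPI backend /v1/admin"]
+  A --> P[("PostgreSQL\nkg_admin_new2")]
+  A --> F[("FalkorDB\nguiyang_new2\nguiyang_spatial_v1")]
+  A --> S["Schema and collection scripts"]
+  A --> L["OpenAI-compatible LLM\noptional"]
+  A --> M["Amap API\noptional"]
+```
+
+For a more detailed description of the modules, see [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md).
 
 ## Quick start
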
@@ -37,12 +57,25 @@ http://localhost:8102/admin
 
 Common service ports:
 
-- Admin console/API: `8102`
-- PostgreSQL: `5433`
-- FalkorDB Redis protocol: `6380`
-- FalkorDB Browser: `3002`
+| Service | Default address |
+| --- | --- |
+| Admin console/API | `http://localhost:8102` |
+| Admin console page | `http://localhost:8102/admin` |
+| API docs | `http://localhost:8102/docs` |
+| PostgreSQL | `localhost:5433` |
+| FalkorDB Redis protocol | `localhost:6380` |
+| FalkorDB Browser | `http://localhost:3002` |
 
-## Restoring the snapshots again
+## Data snapshots
 
+The repository includes restorable snapshots:
+
+| Database | Snapshot file | Default database/graph |
+| --- | --- | --- |
+| PostgreSQL | `snapshots/postgres/kg_admin_new2.dump` | database `kg_admin`, schema `kg_admin_new2` |
+| FalkorDB | `snapshots/falkordb/dump.rdb` | graph `guiyang_new2`, `guiyang_spatial_v1` |
+
+The current snapshot contains about `80609` spatial POIs, `37457` candidate entities, and the Guiyang and travel-agency graphs in FalkorDB. For the detailed structure, see [docs/reports/new2_current_kg_schema_snapshot.md](docs/reports/new2_current_kg_schema_snapshot.md) and [docs/DATA_SNAPSHOTS.md](docs/DATA_SNAPSHOTS.md).
+
 The PostgreSQL init script runs only when the Docker data volume is first created. To discard local changes and restore from the repository snapshots again:

@@ -51,7 +84,7 @@ docker compose down -v
 docker compose up -d --build
 ```
 
-## Verifying the data
+## Verification commands
 
 Check the API:

@@ -72,14 +105,48 @@ docker compose exec postgres psql -U admin -d kg_admin \
 docker compose exec falkordb redis-cli -p 6379 GRAPH.LIST
 ```
 
+## Project structure
+
+```text
+.
+├── app/                 # FastAPI backend, API routes, agents, and core graph logic
+├── admin-web/           # React + Vite admin console source
+├── docker/              # Docker init scripts
+├── docs/                # Project overview, architecture, deployment, data, and report docs
+├── schema搭建/          # Graph schema, encyclopedia sample data, and DSL material
+├── scripts/             # Collection, build, publish, and snapshot export scripts
+├── snapshots/           # Restorable database snapshots
+├── Dockerfile           # Backend image and frontend static asset build
+├── docker-compose.yml   # One-command startup of the full system
+├── .env.example         # Local configuration template
+└── requirements.txt     # Python backend dependencies
+```
+
+The following are not uploaded: `.env`, `node_modules/`, runtime logs, browser caches, local Docker data volumes, and temporary files. The repository keeps the system source code, Docker configuration, schema, scripts, essential reports, and database snapshots.
+
+## Documentation map
+
+| Document | Contents |
+| --- | --- |
+| [docs/PROJECT_OVERVIEW.md](docs/PROJECT_OVERVIEW.md) | System positioning, business capabilities, and feature map |
+| [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) | Container, backend, frontend, and data architecture |
+| [docs/DEPLOYMENT.md](docs/DEPLOYMENT.md) | Docker deployment, ports, environment variables, and common issues |
+| [docs/DATA_SNAPSHOTS.md](docs/DATA_SNAPSHOTS.md) | Data snapshots, restore, re-export, and verification |
+| [docs/API_REFERENCE.md](docs/API_REFERENCE.md) | API groups, common endpoints, and request examples |
+| [docs/MAINTENANCE.md](docs/MAINTENANCE.md) | Maintenance workflow, release checks, and repository boundaries |
+| [CONTRIBUTING.md](CONTRIBUTING.md) | Development participation and commit conventions |
+| [SECURITY.md](SECURITY.md) | Demo account, secrets, and production security recommendations |
+| [CHANGELOG.md](CHANGELOG.md) | Version history |
+
 ## Local development
 
-Backend dependencies:
+Backend development:
 
 ```bash
 python3 -m venv .venv
 source .venv/bin/activate
 pip install -r requirements.txt
+cp .env.example .env
 python -m uvicorn app.main:app --host 0.0.0.0 --port 8102 --reload
 ```

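@@ -101,4 +168,8 @@ npm run dev
 bash scripts/export_snapshots.sh
 ```
 
-Then commit the updated dump files in `snapshots/`.
+Then commit the updated dump files in `snapshots/`. The snapshot files are close to GitHub's recommended per-file size limit, so before updating, confirm that each file stays under GitHub's 100 MB per-file limit.
+
+## Production notes
+
+The default account, database password, and `AUTH_SECRET` are for demonstration only. Before a real deployment, change them in `.env` or the compose environment variables, and configure separate keys for external services such as the LLM and Amap. See [SECURITY.md](SECURITY.md) for more recommendations.
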
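SECURITY.md (new file, 39 lines)
@@ -0,0 +1,39 @@
# Security Notes

The default configuration in this repository is intended for local demos and project reproduction. It is not suitable for direct exposure to the public internet.

## Demo account

After the Docker snapshot is initialized, a demo account is available:

```text
admin@example.com / change-me
```

In production you must change the default password and replace `AUTH_SECRET`.

## Secret management

Do not commit any of the following:

- `.env`
- Real database passwords
- `AUTH_SECRET`
- LLM API keys
- Amap keys
- Cookies, browser profiles, or login-state files

`.env.example` contains only runnable example values.

## Production deployment recommendations

- Use HTTPS and a reverse proxy.
- Do not expose the PostgreSQL and FalkorDB ports to the public internet.
- Change the database user, password, and default administrator account.
- Back up the PostgreSQL and FalkorDB data volumes regularly.
- Use least-privilege keys for external services such as the LLM and Amap.
- Enable container logging and error monitoring.

## Reporting issues

If you find a credential leak, unauthorized access, or a data security issue, please contact the repository maintainers privately first, before opening a public issue.
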
@@ -1141,8 +1141,8 @@ export default function ManualIngestPanel() {
   const fillSample = () => {
     setText(SAMPLE_TEXT);
     setRootEntity("花溪公园");
-    setSourceName("医学文档");
-    setSourceUrl("/Users/jier/upload/demo-manual-document.txt");
+    setSourceName("旅行示例文档");
+    setSourceUrl("demo-source://manual-ingest/huaxi-park.txt");
     setBusinessScene("scenic");
   };

@@ -833,11 +833,13 @@ function resultPanelTitle(payload: AssistantPayload | null) {
   if (mode === "nearby_resource") return "景区附近资源";
   if (mode === "hotel_resource") return "酒店价格资源";
   if (mode === "route_catalog") return "线路清单";
+  if (mode === "route_price") return "线路报价";
   return "路线推荐";
 }
 
 function modeFallback(mode?: string) {
   const labels: Record<string, string> = {
+    route_price: "线路报价",
     route_catalog: "线路清单",
     multi_task_agent: "多任务客服 Agent",
     route_match_fast: "固定线路快速匹配",

@@ -11,6 +11,7 @@ from __future__ import annotations
 
 import importlib.util
 import os
+from pathlib import Path
 from typing import Any
 
 import requests
@@ -18,7 +19,8 @@ import urllib3
 
 urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
 
-_CRAWL_PATH = "/Users/xuexue/PycharmProjects/PythonProject/xuexue-CityGraph/crawl_guiyan.py"
+_PROJECT_ROOT = Path(__file__).resolve().parents[2]
+_CRAWL_PATH = os.getenv("GAODE_CRAWLER_PATH", str(_PROJECT_ROOT / "scripts" / "crawl_guiyan.py"))
 _mod: Any = None
 
 # Official Amap top-level POI type codes (grid scan by type code, not by popularity keywords)

@@ -616,6 +616,7 @@ async def _llm_intent(question: str, fallback: dict[str, Any], enabled: bool) ->
 
 
 TASK_LABELS = {
+    "route_price": "线路报价",
     "route_catalog": "线路清单",
     "route_match": "线路匹配",
     "nearby_resource": "景区附近酒店/餐饮",
@@ -637,6 +638,19 @@ def _route_task_needed(question: str, intent: dict[str, Any]) -> bool:
     return bool(intent.get("duration_days") or intent.get("destinations")) and _has_any(question, route_terms)
 
 
+def _route_price_question(question: str) -> bool:
+    price_terms = ("多少钱", "价格", "报价", "费用", "线路多少钱", "路线多少钱", "产品多少钱", "怎么收费")
+    route_terms = (
+        "路线", "线路", "产品", "行程", "旅游", "游", "几日游", "一日游", "二日游", "三日游", "四日游",
+        "五日游", "六日游", "黄小西", "小西", "镇梵",
+    )
+    hotel_terms = ("酒店", "住宿", "房型", "房价", "房费", "间夜")
+    vehicle_terms = ("车辆", "车型", "用车", "车费", "车价")
+    if _has_any(question, hotel_terms) or _has_any(question, vehicle_terms):
+        return False
+    return _has_any(question, price_terms) and _has_any(question, route_terms)
+
+
 def _resource_task_needed(question: str) -> bool:
     resource_terms = ("酒店", "住宿", "住哪", "客栈", "民宿", "餐饮", "餐厅", "吃饭", "饭店", "美食")
     near_or_choice_terms = ("附近", "周边", "可选", "选择", "推荐", "哪些", "那些", "有哪些", "有什么", "住", "吃")
@@ -668,7 +682,9 @@ def _agent_task_plan(question: str, intent: dict[str, Any]) -> list[dict[str, An
             "reason": reason,
         })
 
-    if _route_catalog_question(question):
+    if _route_price_question(question):
+        add("route_price", 0.95, "命中线路产品报价问题")
+    elif _route_catalog_question(question):
         add("route_catalog", 0.96, "命中线路清单问法")
     elif _route_task_needed(question, intent):
         add("route_match", 0.92, "命中出行天数/景区/推荐行程约束")
@@ -730,6 +746,8 @@ def _rule_fast_intent_method(question: str, intent: dict[str, Any]) -> str:
     tasks = _agent_task_plan(question, intent)
     if len(tasks) > 1:
         return "rule_multi_task_fast_path"
+    if _route_price_question(question):
+        return "rule_route_price_fast_path"
     if _vehicle_only_question(question, intent):
         return "rule_vehicle_fast_path"
     if _route_catalog_question(question):
@@ -1777,6 +1795,7 @@ def _infer_response_mode(response: dict[str, Any]) -> str:
     method = _value(trace.get("method"), 120)
     for token, inferred in (
         ("multi_task", "multi_task_agent"),
+        ("route_price", "route_price"),
         ("route_catalog", "route_catalog"),
         ("nearby_resource", "nearby_resource"),
         ("hotel_resource", "hotel_resource"),
@@ -1793,6 +1812,7 @@ def _infer_response_mode(response: dict[str, Any]) -> str:
 def _response_mode_label(mode: str) -> str:
     labels = {
         "multi_task_agent": "多任务客服 Agent",
+        "route_price": "线路报价",
         "route_catalog": "线路清单",
         "route_match_fast": "固定线路快速匹配",
         "fixed_route_item": "固定线路深度匹配",
@@ -1813,6 +1833,7 @@ def _agent_capabilities(response: dict[str, Any]) -> list[str]:
     capabilities: list[str] = []
     mode_caps = {
         "multi_task_agent": ["多意图拆解", "KG 模板编排", "结构化聚合输出"],
+        "route_price": ["线路报价", "团期价格", "核价边界"],
         "route_catalog": ["线路清单", "同线路版本合并", "后续追问入口"],
         "route_match_fast": ["固定线路召回", "景点覆盖评分", "报价规则提示"],
         "fixed_route_item": ["固定线路召回", "每日行程", "费用槽位", "资源槽位"],
@@ -2408,7 +2429,9 @@ def _attach_hotel_rate_summaries(graph_data: dict[str, Any], rate_index: dict[st
 
 
 def _hotel_resource_question(question: str) -> bool:
-    return any(token in question for token in ("房型", "淡季", "旺季", "挂牌价", "房价", "房费", "酒店价格", "住宿价格", "多少钱"))
+    hotel_terms = ("酒店", "住宿", "客栈", "民宿", "房型", "房价", "房费", "间夜")
+    price_terms = ("淡季", "旺季", "挂牌价", "房价", "房费", "酒店价格", "住宿价格", "多少钱")
+    return any(token in question for token in price_terms) and any(token in question for token in hotel_terms)
 
 
 def _match_hotel_rate_entries(question: str, rate_index: dict[str, dict[str, Any]]) -> list[dict[str, Any]]:
@@ -3213,7 +3236,16 @@ def _score_fixed_product(intent: dict[str, Any], entry: dict[str, Any]) -> tuple
     text = _fixed_product_text(entry)
     route_text = _fixed_route_core_text(entry)
     score = 0
+    score_cap = 100
     reasons: list[str] = []
+    raw_question = _value(intent.get("raw_text"), 500)
+    raw_norm = _norm_text(raw_question)
+    product_name = _value(product.get("name"), 160)
+    product_name_norm = _norm_text(product_name)
+    exact_product_name = bool(product_name_norm and len(product_name_norm) >= 8 and product_name_norm in raw_norm)
+    if exact_product_name:
+        score += 42
+        reasons.append("命中指定线路名称")
     desired_days = _as_optional_int(intent.get("duration_days"))
     desired_nights = _as_optional_int(intent.get("duration_nights"))
     product_days, product_nights = _entry_duration_values(entry)
@@ -3231,15 +3263,20 @@ def _score_fixed_product(intent: dict[str, Any], entry: dict[str, Any]) -> tuple
             else:
                 reasons.append(f"{desired_days}天固定路线匹配")
         else:
+            score_cap = min(score_cap, 72)
             reasons.append(f"时长不匹配:需求{_duration_label(desired_days, desired_nights)},产品{_duration_label(product_days, product_nights)}")
+    missing_required = False
     for dest in intent.get("destinations") or []:
         aliases = ATTRACTION_ALIASES.get(dest, [dest])
         if _contains_any(route_text, aliases):
-            score += 35
+            score += 24
             reasons.append(f"覆盖{dest}")
         else:
-            score -= 28
+            missing_required = True
+            score -= 40
             reasons.append(f"未覆盖{dest}")
+    if missing_required:
+        score_cap = min(score_cap, 45)
     requested_destinations = set(intent.get("destinations") or [])
     if requested_destinations and not intent.get("inferred_destinations"):
         extra_destinations = [
@@ -3248,11 +3285,25 @@ def _score_fixed_product(intent: dict[str, Any], entry: dict[str, Any]) -> tuple
             if canonical not in requested_destinations and _contains_any(route_text, aliases)
         ]
         if extra_destinations:
-            score -= min(24, 6 * len(extra_destinations))
+            if not exact_product_name:
+                score_cap = min(score_cap, max(68, 90 - 8 * len(extra_destinations)))
+            score -= min(36, 12 * len(extra_destinations))
             reasons.append(f"含额外景点{'、'.join(extra_destinations[:3])},需确认是否接受")
-    if product.get("base_price_status") == "ready_for_reference_quote":
+    has_price_reference = bool(
+        product.get("base_price_status") == "ready_for_reference_quote"
+        or _value(product.get("base_price_text"), 80)
+        or _value(product.get("adult_settlement_text"), 80)
+        or _value(product.get("child_settlement_text"), 80)
+        or _value(product.get("free_ticket_settlement_text"), 80)
+        or _value(product.get("single_room_diff_text"), 80)
+        or _value(product.get("quote_formula"), 80)
+    )
+    if has_price_reference:
         score += 8
         reasons.append("已有报价表依据")
+    elif intent.get("price_query"):
+        score_cap = min(score_cap, 70)
+        reasons.append("线路报价数据待补")
     if _is_low_budget(intent):
         if any(term in text for term in ("经济", "性价比", "普通", "四钻", "4钻")):
             score += 5
@@ -3272,7 +3323,7 @@ def _score_fixed_product(intent: dict[str, Any], entry: dict[str, Any]) -> tuple
         score += 2
     if not reasons:
         reasons.append("按固定路线、景点覆盖和资源槽位综合匹配")
-    return score, reasons[:7]
+    return min(score, score_cap), reasons[:7]
 
 
 def _format_price_reference(product: dict[str, Any], intent: dict[str, Any]) -> str:
@@ -4210,6 +4261,7 @@ def _fixed_route_match_fast_response(
     intent_method: str,
     graph_data: dict[str, Any],
 ) -> dict[str, Any]:
+    price_query = bool(intent.get("price_query") or _route_price_question(question))
     entries = list(graph_data.get("products", {}).values())
     strict_matches = _strict_fixed_route_matches(intent, entries)
     strict_note = _strict_route_gap_note(intent, entries)
@@ -4236,11 +4288,22 @@ def _fixed_route_match_fast_response(
     )
     ranked = _dedupe_route_entries(ranked)
     plans = [_build_fixed_route_plan(entry, idx + 1, intent) for idx, entry in enumerate(ranked[:8])]
-    followups = [
-        "要不要继续查某个景区附近可选酒店/餐饮?",
-        "要不要继续查这条线路的可选车辆?",
-        "要不要继续查某个景区的门票/观光车/保险等费用?",
-    ]
+    if price_query:
+        for plan in plans:
+            plan["price_query"] = True
+            if plan.get("rank") == 1:
+                plan["label"] = "线路报价"
+        followups = [
+            "请确认出行日期属于哪个价格区间。",
+            "请确认成人/儿童人数、酒店档位和是否有单房差。",
+            "请确认是否接受线路中额外包含的景点,或需要继续找更贴合的线路。",
+        ]
+    else:
+        followups = [
+            "要不要继续查某个景区附近可选酒店/餐饮?",
+            "要不要继续查这条线路的可选车辆?",
+            "要不要继续查某个景区的门票/观光车/保险等费用?",
+        ]
     evidence: list[dict[str, Any]] = []
     for plan in plans:
         evidence.append({"type": "固定线路产品", "name": plan["plan_name"], "summary": plan["route_summary"], "source": plan["source"]})
@@ -4258,15 +4321,20 @@ def _fixed_route_match_fast_response(
             "strict_match_count": len(strict_matches),
             "strict_match_note": strict_note,
             "resource_counts": {"TourProduct": len(entries)},
-            "method": "fixed_route_item_route_match_fast_v1",
+            "method": "fixed_route_item_route_price_lookup_v1" if price_query else "fixed_route_item_route_match_fast_v1",
             "intent_method": intent_method,
-            "response_mode": "route_match_fast",
+            "response_mode": "route_price" if price_query else "route_match_fast",
+            "price_query": price_query,
         },
     }
 
 
 def _copy_fixed_text(plans: list[dict[str, Any]], followups: list[str], strict_note: str = "") -> str:
-    lines = ["您好,按您当前需求,先从已有固定线路产品里匹配如下:"]
+    is_price_query = any(plan.get("price_query") for plan in plans)
+    if is_price_query:
+        lines = ["您好,按您当前需求,先从已有固定线路产品里匹配并核对报价如下:"]
+    else:
+        lines = ["您好,按您当前需求,先从已有固定线路产品里匹配如下:"]
     if strict_note:
         lines.append(f"注意:{strict_note} 以下为相近替代方案,不要直接承诺完全满足客户天数/景点。")
     for plan in plans[:8]:
@@ -4276,7 +4344,8 @@ def _copy_fixed_text(plans: list[dict[str, Any]], followups: list[str], strict_n
         lines.append(f"{plan['rank']}. {line_prefix}{plan['plan_name']}({_duration_label(plan.get('duration_days'), plan.get('duration_nights'))})")
         lines.append(f"匹配点:{'、'.join(plan['match_reasons'][:4])}")
         lines.append(f"路线:{plan['route_summary']}")
-        lines.append(f"报价依据:{plan['quote_summary']}")
+        quote_label = "线路报价" if is_price_query else "报价依据"
+        lines.append(f"{quote_label}:{plan['quote_summary']}")
         vehicle = next((item["detail"] for item in plan.get("cost_breakdown", []) if item["category"] == "小包团用车"), "")
         if vehicle:
             lines.append(f"用车建议:{vehicle}")
@@ -4325,7 +4394,7 @@ def _fixed_route_item_task_response(
     if kind == "route_catalog":
         catalog_data = shared.setdefault("catalog_data", _cached_fixed_route_catalog_graph(graph_name))
         return _fixed_route_catalog_response(question, graph_name, intent, intent_method, catalog_data, limit=5)
-    if kind == "route_match":
+    if kind in {"route_match", "route_price"}:
         catalog_data = shared.setdefault("catalog_data", _cached_fixed_route_catalog_graph(graph_name))
         return _fixed_route_match_fast_response(question, graph_name, intent, intent_method, catalog_data)
     if kind == "nearby_resource":
@@ -4419,6 +4488,9 @@ def _fixed_route_item_response(question: str, graph_name: str, intent: dict[str,
        if responses:
            return _merge_fixed_task_responses(question, graph_name, intent, intent_method, executed_tasks, responses)
 
+    if _route_price_question(question):
+        catalog_data = _cached_fixed_route_catalog_graph(graph_name)
+        return _fixed_route_match_fast_response(question, graph_name, intent, intent_method, catalog_data)
     if _route_catalog_question(question):
         catalog_data = _cached_fixed_route_catalog_graph(graph_name)
         return _fixed_route_catalog_response(question, graph_name, intent, intent_method, catalog_data)
@@ -4869,6 +4941,7 @@ async def travel_assistant_query(
     if intent_method == "llm_intent_parser":
         intent = _guard_llm_intent(question, rule_intent, intent)
     intent = _complete_intent_defaults(question, intent)
+    intent["price_query"] = _route_price_question(question)
     planned_tasks = _agent_task_plan(question, intent)
     intent_confidence = _rule_agent_confidence(question, intent, planned_tasks)
     intent["planned_tasks"] = planned_tasks

@@ -86,6 +86,7 @@ services:
       AMAP_WEB_KEY: ""
       AMAP_JS_KEY: ""
       AMAP_SECURITY_JSCODE: ""
+      GAODE_CRAWLER_PATH: ""
     ports:
       - "${API_PORT:-8102}:8000"
     depends_on:

docs/API_REFERENCE.md (new file, 91 lines)
@@ -0,0 +1,91 @@
# API Reference

FastAPI generates interactive API documentation automatically. After starting the service, open:

```text
http://localhost:8102/docs
```

All admin endpoints are mounted under:

```text
/v1/admin
```

## Common endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| `/v1/admin/health` | `GET` | Health check |
| `/v1/admin/auth/login` | `POST` | Administrator login |
| `/v1/admin/auth/me` | `GET` | Currently logged-in user |
| `/v1/admin/projects` | `GET/POST` | Project management |
| `/v1/admin/ontology-schemas/current` | `GET` | Current schema |
| `/v1/admin/source-profiles` | `GET/POST/PATCH` | Data source management |
| `/v1/admin/batches` | `GET` | Batch management |
| `/v1/admin/entities` | `GET` | Candidate entity list |
| `/v1/admin/conflicts` | `GET` | Conflict list |
| `/v1/admin/publish-jobs` | `GET/POST` | Publish jobs |
| `/v1/admin/graph/overview` | `GET` | Graph overview |
| `/v1/admin/graph/query` | `POST` | Graph query |
| `/v1/admin/plaza/overview` | `GET` | Graph plaza overview |
| `/v1/admin/manual-ingest/extract` | `POST` | Manual extraction |
| `/v1/admin/travel/assistant-query` | `POST` | Travel customer-service Q&A |
| `/v1/admin/super-agent/run` | `POST` | Super Agent tasks |
| `/v1/admin/roles` | `GET/POST` | Role management |
| `/v1/admin/users` | `GET/POST` | User management |
| `/v1/admin/areas/tree` | `GET` | Area tree |
| `/v1/admin/notifications` | `GET` | Notification list |

## Login example

```bash
curl -s http://localhost:8102/v1/admin/auth/login \
  -H 'Content-Type: application/json' \
  -d '{"username":"admin@example.com","password":"change-me"}'
```

The response contains `access_token`, which subsequent requests can use:

```bash
TOKEN="access_token returned by the previous step"
curl http://localhost:8102/v1/admin/auth/me \
  -H "Authorization: Bearer $TOKEN"
```
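
The same two calls can be scripted. The following is a minimal sketch using only the Python standard library; the helper names `build_login_request`, `build_me_request`, and `fetch_json` are hypothetical (they are not part of the project), and the sketch assumes the login endpoint accepts the JSON body shown in the curl example and returns `access_token` in a JSON object.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8102"  # default API address used in this document


def build_login_request(username: str, password: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request for /v1/admin/auth/login."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/admin/auth/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_me_request(token: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the authenticated GET request for /v1/admin/auth/me."""
    return urllib.request.Request(
        f"{base_url}/v1/admin/auth/me",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def fetch_json(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the stack running, `fetch_json(build_login_request("admin@example.com", "change-me"))["access_token"]` would yield the token to pass to `build_me_request`. Building the requests separately from sending them keeps the helpers easy to check without a live server.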

## Graph query examples

```bash
curl http://localhost:8102/v1/admin/graph/overview
```

```bash
curl http://localhost:8102/v1/admin/graph/query \
  -H 'Content-Type: application/json' \
  -d '{"query":"MATCH (n) RETURN n LIMIT 10"}'
```

## Travel customer-service Q&A example

```bash
curl http://localhost:8102/v1/admin/travel/assistant-query \
  -H 'Content-Type: application/json' \
  -d '{"question":"黄小西三日游多少钱?"}'
```
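
For context, this commit routes a question like the one in the request to the `route_price` mode using keyword rules. The sketch below restates the `_route_price_question` rule from the commit in self-contained form. The keyword tuples are copied from the diff; the project's `_has_any` helper is not shown there, so the `has_any` body here is an assumption (a plain substring check).

```python
from __future__ import annotations

# Keyword tuples copied from _route_price_question in this commit.
PRICE_TERMS = ("多少钱", "价格", "报价", "费用", "线路多少钱", "路线多少钱", "产品多少钱", "怎么收费")
ROUTE_TERMS = (
    "路线", "线路", "产品", "行程", "旅游", "游", "几日游", "一日游", "二日游", "三日游", "四日游",
    "五日游", "六日游", "黄小西", "小西", "镇梵",
)
HOTEL_TERMS = ("酒店", "住宿", "房型", "房价", "房费", "间夜")
VEHICLE_TERMS = ("车辆", "车型", "用车", "车费", "车价")


def has_any(text: str, terms: tuple[str, ...]) -> bool:
    # Assumed behaviour of the project's _has_any helper: substring match.
    return any(term in text for term in terms)


def is_route_price_question(question: str) -> bool:
    # Hotel and vehicle pricing belong to other modes, so they are excluded first.
    if has_any(question, HOTEL_TERMS) or has_any(question, VEHICLE_TERMS):
        return False
    # A route price question needs both a price keyword and a route keyword.
    return has_any(question, PRICE_TERMS) and has_any(question, ROUTE_TERMS)
```

Under these rules the sample question contains both a price term (`多少钱`) and route terms (`黄小西`, `三日游`) and no hotel or vehicle term, so it is classified as a route price question. A question that mentions `酒店` is excluded even if it asks about price.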

## Frontend calls

The React admin console accesses the same-origin API through `admin-web/src/api.ts`. In a Docker deployment the frontend and the API are both served from `http://localhost:8102`, so no additional cross-origin proxy configuration is needed.

## External services

LLM and Amap features are disabled or left empty by default. Before enabling them, configure the following in `.env` or in the Docker Compose environment variables:

- `LLM_API_BASE`
- `LLM_API_KEY`
- `LLM_MODEL`
- `LLM_EXTRACTION_ENABLED=true`
- `AMAP_WEB_KEY`
- `AMAP_JS_KEY`
- `AMAP_SECURITY_JSCODE`
- `GAODE_CRAWLER_PATH`
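
One detail is worth checking when wiring these variables. The Compose file in this commit sets `GAODE_CRAWLER_PATH` to an empty string, and `.env.example` leaves it blank. Python's `os.getenv(name, default)` returns the default only when the variable is unset, not when it is set to an empty value. The sketch below shows one way to treat blank values as unset. `env_or_default` is a hypothetical helper, not part of the project, and the `scripts/crawl_guiyan.py` default mirrors the one in the commit.

```python
import os
from pathlib import Path


def env_or_default(name: str, default: str) -> str:
    """Return the variable's value, falling back when it is unset or blank."""
    value = os.environ.get(name, "").strip()
    return value or default


# Hypothetical usage mirroring the crawler path lookup in this commit;
# the current working directory stands in for the project root here.
project_root = Path.cwd()
crawler_path = env_or_default(
    "GAODE_CRAWLER_PATH",
    str(project_root / "scripts" / "crawl_guiyan.py"),
)
```

Whether the project needs this depends on how it consumes the variable; the sketch only illustrates the difference between unset and empty.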

docs/ARCHITECTURE.md (new file, 81 lines)
@@ -0,0 +1,81 @@
# Architecture

The system uses a decoupled frontend/backend design with a dual-database architecture. Docker Compose starts PostgreSQL, FalkorDB, and the FastAPI API service. The React admin console is bundled into static files at image build time and mounted by FastAPI at `/admin`.

## Container topology

```mermaid
flowchart TB
  subgraph compose["Docker Compose"]
    API["api\nFastAPI + static admin console"]
    PG["postgres\nPostgreSQL 16"]
    Seed["falkordb-seed\ncopies dump.rdb"]
    FK["falkordb\nFalkorDB"]
  end
  Browser["Browser"] --> API
  API --> PG
  API --> FK
  Seed --> FK
```

## Service responsibilities

| Service | Responsibility |
| --- | --- |
| `api` | Serves the `/v1/admin/*` API, mounts the `/admin` frontend pages, and connects to PostgreSQL and FalkorDB |
| `postgres` | Stores admin business data, user permissions, projects, candidate entities, review records, and tasks |
| `falkordb-seed` | On first startup, writes the repository's `snapshots/falkordb/dump.rdb` into the data volume |
| `falkordb` | Stores the graph database; supports Cypher/Redis protocol access and FalkorDB Browser |

## Backend modules

| Path | Description |
| --- | --- |
| `app/main.py` | FastAPI entry point, CORS, route mounting, and frontend static asset mounting |
| `app/config.py` | Environment variable configuration |
| `app/db.py` | PostgreSQL connection pool |
| `app/api/` | Admin console API routes |
| `app/agents/` | Collection, extraction, alignment, audit, and external-site agents |
| `app/kg_core/` | Spatial graph and core graph helper logic |
| `app/schemas/` | Extraction schema |
| `app/security.py`, `app/auth.py` | Login, token, and permission logic |

## Frontend modules

| Path | Description |
| --- | --- |
| `admin-web/src/App.tsx` | Main admin console application and routing |
| `admin-web/src/api.ts` | API client |
| `admin-web/src/panels/plaza/` | Graph plaza, user queries, manual extraction, and Super Agent |
| `admin-web/src/panels/acquisition/` | Data source, batch, and conflict workbench |
| `admin-web/src/panels/review/` | Evidence quality, field review, expert sign-off, and asset library |
| `admin-web/src/panels/modeling/` | Schema, vocabulary, and health checks |
| `admin-web/src/panels/publish/` | Publish and rollback |
| `admin-web/src/panels/system/` | Users, permissions, areas, notifications, agent settings, and logs |

## Data layer

PostgreSQL and FalkorDB have different responsibilities:

- PostgreSQL: structured admin data, review process data, user permissions, tasks, sources, candidate entities, and evidence.
- FalkorDB: graph entities, relationships, routes, resources, POIs, spatial indexes, and query-oriented graph structures.

Default configuration:

| Setting | Default |
| --- | --- |
| PostgreSQL database | `kg_admin` |
| PostgreSQL schema | `kg_admin_new2` |
| FalkorDB business graph | `guiyang_new2` |
| FalkorDB spatial graph | `guiyang_spatial_v1` |

## Build process

The `Dockerfile` uses a multi-stage build:

1. The Node.js stage enters `admin-web/` and runs `npm ci` and `npm run build`.
2. The Python stage installs `requirements.txt`.
3. `app/`, `schema搭建/`, and the frontend build output are copied into the image.
4. The container starts `uvicorn app.main:app --host 0.0.0.0 --port 8000`.

The snapshot files are not included in the API image. They are provided to the database containers as read-only mounts by `docker-compose.yml`.

docs/DATA_SNAPSHOTS.md (new file, 103 lines)
@@ -0,0 +1,103 @@
# Data Snapshots

The repository bundles database snapshots so that users can restore the `new2` graph system right after cloning, without re-collecting, re-extracting, or re-publishing data.

## Snapshot files

| File | Type | Purpose |
| --- | --- | --- |
| `snapshots/postgres/kg_admin_new2.dump` | PostgreSQL custom dump | Restores the admin business database, accounts, projects, candidate entities, review data, and so on |
| `snapshots/falkordb/dump.rdb` | FalkorDB RDB | Restores the business graph and the spatial graph in the graph database |

Default data:

| Item | Value |
| --- | --- |
| PostgreSQL database | `kg_admin` |
| PostgreSQL schema | `kg_admin_new2` |
| FalkorDB business graph | `guiyang_new2` |
| FalkorDB spatial graph | `guiyang_spatial_v1` |
| Spatial POIs | about `80609` |
| Candidate entities | about `37457` |

## Restore flow

Docker Compose restores the data automatically on first startup:

1. The `postgres` container creates its data volume.
2. `docker/postgres-init/01-restore-snapshot.sh` restores the PostgreSQL snapshot with `pg_restore`.
3. The script resets the demo administrator password to `change-me`.
4. The `falkordb-seed` container copies `dump.rdb` into the FalkorDB data volume.
5. The `falkordb` container reads the RDB file and loads the graph data.

If the data volumes already exist, initialization does not run again. To reset, run:

```bash
docker compose down -v
docker compose up -d --build
```

## Verifying the data

PostgreSQL (spatial POIs):

```bash
docker compose exec postgres psql -U admin -d kg_admin \
  -c "SELECT COUNT(*) FROM kg_admin_new2.amap_spatial_pois;"
```

Candidate entities:

```bash
docker compose exec postgres psql -U admin -d kg_admin \
  -c "SELECT COUNT(*) FROM kg_admin_new2.candidate_entities;"
```

FalkorDB graph list:

```bash
docker compose exec falkordb redis-cli -p 6379 GRAPH.LIST
```

Snapshot file hashes:

```bash
shasum -a 256 snapshots/postgres/kg_admin_new2.dump snapshots/falkordb/dump.rdb
```

Reference hashes for the currently published snapshots:

```text
c70a3fe2730cd40a96e729097cef1eb39c66498371b88b2e36e985c923043e75  snapshots/postgres/kg_admin_new2.dump
dde96ac99bff58d18bb00e84939772d8a4efc4893aeeae02329aa893ae51f247  snapshots/falkordb/dump.rdb
```
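
Where `shasum` is not available, the comparison can be done with a short script. This is a minimal sketch using only the Python standard library. `sha256_of` and `verify` are hypothetical helper names, and the reference digests are copied from the listing in this document.

```python
from __future__ import annotations

import hashlib
from pathlib import Path

# Reference digests copied from this document.
REFERENCE_HASHES = {
    "snapshots/postgres/kg_admin_new2.dump": "c70a3fe2730cd40a96e729097cef1eb39c66498371b88b2e36e985c923043e75",
    "snapshots/falkordb/dump.rdb": "dde96ac99bff58d18bb00e84939772d8a4efc4893aeeae02329aa893ae51f247",
}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large dumps do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(expected: dict[str, str], root: Path = Path(".")) -> dict[str, bool]:
    """Map each relative path to True when its digest matches the expected value."""
    return {rel: sha256_of(root / rel) == want.lower() for rel, want in expected.items()}
```

Run from the repository root, `verify(REFERENCE_HASHES)` returns one boolean per snapshot file. A `False` entry means the local file differs from the published snapshot.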

## Re-exporting the snapshots

If the original `new2` containers are still present on your machine, run:

```bash
bash scripts/export_snapshots.sh
```

By default the script exports from the following containers:

| Container | Purpose |
| --- | --- |
| `zn-kg-new2-postgres` | Exports the `kg_admin_new2` schema |
| `zn-kg-new2-falkordb` | Triggers `BGSAVE` and copies `dump.rdb` |

After exporting, verify the result:

```bash
ls -lh snapshots/postgres/kg_admin_new2.dump snapshots/falkordb/dump.rdb
shasum -a 256 snapshots/postgres/kg_admin_new2.dump snapshots/falkordb/dump.rdb
```

## Repository boundaries

`data/`, browser profiles, logs, caches, and local Docker data volumes are not tracked in Git. They are temporary artifacts of running or collecting data and are not suitable as GitHub project content. The data needed to reproduce the system has been consolidated into `snapshots/`, `schema搭建/`, `docs/`, and the source directories.

## GitHub file size reminder

GitHub enforces a hard limit of 100 MB per file. Both current snapshots are below that limit. If the snapshots keep growing, consider switching to Git LFS, release assets, or object storage, and keep the download and restore instructions in the README.

169
docs/DEPLOYMENT.md
Normal file
169
docs/DEPLOYMENT.md
Normal file
@@ -0,0 +1,169 @@
|
||||
# 部署指南
|
||||
|
||||
本文档说明如何用 Docker 启动、重置、配置和排查旅行知识图谱管理系统。
|
||||
|
||||
## 环境要求
|
||||
|
||||
- Docker Desktop 或 Docker Engine
|
||||
- Docker Compose v2
|
||||
- 至少 4 GB 可用内存
|
||||
- 至少 3 GB 可用磁盘空间
|
||||
|
||||
## One-command startup

```bash
docker compose up -d --build
```

After startup, open:

```text
http://localhost:8102/admin
```

Default account:

```text
admin@example.com / change-me
```
|
||||
|
||||
## What happens on first startup

1. The API image is built and the React admin console is bundled.
2. The PostgreSQL data volume is created.
3. The PostgreSQL init script restores `snapshots/postgres/kg_admin_new2.dump`.
4. `falkordb-seed` writes `snapshots/falkordb/dump.rdb` into the FalkorDB data volume.
5. The FastAPI service starts once PostgreSQL and FalkorDB are healthy.
|
||||
|
||||
## Common commands

Check service status:

```bash
docker compose ps
```

Follow the API logs:

```bash
docker compose logs -f api
```

Stop the services:

```bash
docker compose down
```

Stop the services and delete the data volumes, so that the next startup restores the snapshots again:

```bash
docker compose down -v
docker compose up -d --build
```
|
||||
|
||||
## Port configuration

Ports can be overridden at startup:

```bash
API_PORT=18102 \
POSTGRES_PORT=15433 \
FALKORDB_PORT=16380 \
FALKORDB_BROWSER_PORT=13002 \
docker compose up -d --build
```

Default ports:

| Variable | Default | Description |
| --- | --- | --- |
| `API_PORT` | `8102` | FastAPI and the admin console |
| `POSTGRES_PORT` | `5433` | PostgreSQL mapped port |
| `FALKORDB_PORT` | `6380` | FalkorDB Redis protocol port |
| `FALKORDB_BROWSER_PORT` | `3002` | FalkorDB Browser |
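
Docker Compose performs the variable substitution itself. As a preflight check, the lookup can be mirrored in a few lines of Python. This is a sketch under stated assumptions: `resolve_ports` is a hypothetical helper, and the defaults are copied from the table above.

```python
import os

# Defaults copied from the table above.
DEFAULT_PORTS = {
    "API_PORT": 8102,
    "POSTGRES_PORT": 5433,
    "FALKORDB_PORT": 6380,
    "FALKORDB_BROWSER_PORT": 3002,
}


def resolve_ports(env=None):
    """Return the effective host ports, rejecting invalid or colliding values."""
    env = os.environ if env is None else env
    ports = {}
    for name, default in DEFAULT_PORTS.items():
        raw = str(env.get(name, "")).strip()
        # int() raises ValueError when the override is not numeric.
        port = int(raw) if raw else default
        if not 1 <= port <= 65535:
            raise ValueError(f"{name}={port} is outside the valid TCP port range")
        ports[name] = port
    if len(set(ports.values())) != len(ports):
        raise ValueError(f"host ports collide: {ports}")
    return ports
```

Running `resolve_ports()` before `docker compose up` surfaces a typo or a duplicated port earlier than a failed container start would.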
|
||||
|
||||
## Environment variables

Docker Compose ships with working defaults. For production, move these values into a `.env` file or the deployment platform's environment variables.

| Variable | Description |
| --- | --- |
| `DATABASE_URL` | URL the backend uses to connect to PostgreSQL |
| `DB_SCHEMA` | Defaults to `kg_admin_new2` |
| `DB_MIGRATIONS_ENABLED` | Defaults to `false` for snapshot deployments |
| `FALKORDB_HOST` | Defaults to `falkordb` inside Docker |
| `FALKORDB_GRAPH` | Default business graph, `guiyang_new2` |
| `AUTH_SECRET` | JWT signing secret; must be replaced in production |
| `AUTH_DEFAULT_USERNAME` | Default administrator username |
| `AUTH_DEFAULT_PASSWORD` | Default administrator password |
| `LLM_API_BASE` | OpenAI-compatible model service URL, optional |
| `LLM_API_KEY` | LLM key, optional |
| `LLM_EXTRACTION_ENABLED` | Whether LLM extraction is enabled |
| `AMAP_WEB_KEY`, `AMAP_JS_KEY` | Amap (Gaode) map keys, optional |
| `GAODE_CRAWLER_PATH` | Path to an external Amap crawler script, optional |
| `TRAVEL_AGENCY_SOURCE_ROOT` | Travel agency raw data directory; needed only when running the crawling/graph-building scripts |
| `TRAVEL_DELIVERY_ROOT` | POI delivery CSV directory; needed only when running the crawling/enrichment scripts |
| `TRAVEL_KG_EXPORT_ROOT` | Export directory for the crawling/graph-building scripts |
|
||||
|
||||
## Health checks

API:

```bash
curl http://localhost:8102/v1/admin/health
```

PostgreSQL:

```bash
docker compose exec postgres pg_isready -U admin -d kg_admin
```

FalkorDB:

```bash
docker compose exec falkordb redis-cli -p 6379 PING
```
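
For scripted checks, the API probe can be wrapped in a small polling loop. The sketch below uses only the standard library; `fetch_health` and `wait_until_healthy` are hypothetical names, and it assumes the endpoint answers with a JSON body of `{"status":"ok"}` when healthy, as recorded in the maintenance notes of this commit.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_health(url, timeout=3.0):
    """Return the decoded JSON body, or None if the request fails or the body is not JSON."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        return None


def wait_until_healthy(url, attempts=30, delay=2.0, fetch=fetch_health, sleep=time.sleep):
    """Poll until the body reports status "ok"; give up after `attempts` tries."""
    for attempt in range(attempts):
        body = fetch(url)
        if isinstance(body, dict) and body.get("status") == "ok":
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A typical call would be `wait_until_healthy("http://localhost:8102/v1/admin/health")`. The `fetch` and `sleep` parameters exist so the loop can be exercised without a running stack.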
|
||||
|
||||
## Troubleshooting

### Admin console returns 404

Make sure the image has been rebuilt:

```bash
docker compose up -d --build api
```

The frontend static assets are mounted by FastAPI at `/admin`, so the root path does not lead to the admin console.

### Data was not restored

The PostgreSQL init script runs only when the data volume is first created. If the volume already exists, delete it first:

```bash
docker compose down -v
docker compose up -d --build
```

### Port already in use

Override the default port with the port variables, for example:

```bash
API_PORT=18102 docker compose up -d
```

### Login fails

The init script sets the demo password for `admin@example.com` to `change-me`. If login still fails, first confirm that PostgreSQL has restored the snapshot again, then check the API logs.
|
||||
|
||||
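## Production hardening recommendations

- Change `AUTH_SECRET`, the database password, and the default administrator password.
- Do not commit a real `.env`, LLM keys, or Amap keys to the repository.
- Serve HTTPS through a reverse proxy.
- Configure persistent backups for PostgreSQL and FalkorDB.
- If the service is exposed to the public internet, restrict database port exposure and expose only the API/frontend.
- Enable log collection and container monitoring.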
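
The first recommendation can be partly automated with a preflight check that flags values which still look like demo settings. This is a hedged sketch, not project code: `hardening_findings` is a hypothetical helper, the 32-character minimum is an illustrative assumption, and the `admin:password@` pattern is taken from the development connection string used by the scripts in this commit.

```python
DEMO_PASSWORD = "change-me"  # demo password documented in this guide
DEMO_DB_CREDENTIALS = "admin:password@"  # development credentials seen in the scripts' DB_URL
MIN_SECRET_LENGTH = 32  # assumption: illustrative threshold, not a project rule


def hardening_findings(env):
    """Return human-readable warnings for settings that still look like demo values."""
    findings = []
    if len(env.get("AUTH_SECRET", "")) < MIN_SECRET_LENGTH:
        findings.append(f"AUTH_SECRET is missing or shorter than {MIN_SECRET_LENGTH} characters")
    # Assumption: an unset password falls back to the documented demo password.
    if env.get("AUTH_DEFAULT_PASSWORD", DEMO_PASSWORD) == DEMO_PASSWORD:
        findings.append("AUTH_DEFAULT_PASSWORD is unset or still the demo password")
    if DEMO_DB_CREDENTIALS in env.get("DATABASE_URL", ""):
        findings.append("DATABASE_URL still uses the development credentials")
    return findings
```

Passing `dict(os.environ)` (after `import os`) checks the current shell; an empty list means none of these three patterns were found, which is not the same as a complete security review.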
|
||||
101
docs/MAINTENANCE.md
Normal file
@@ -0,0 +1,101 @@
|
||||
# Maintenance Guide

This document records the maintenance, release, and pre-commit check procedures that keep the repository downloadable, startable, and understandable over time.

## Pre-commit checks

```bash
git status --short
python -m compileall app
cd admin-web && npm run build
cd ..
docker compose config
```

If you change Docker or the database snapshots, also run a full startup check:

```bash
API_PORT=18102 \
POSTGRES_PORT=15433 \
FALKORDB_PORT=16380 \
FALKORDB_BROWSER_PORT=13002 \
docker compose up -d --build

curl http://localhost:18102/v1/admin/health

docker compose down -v
```
|
||||
|
||||
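## Updating system code

- Backend APIs live in `app/api/`.
- Agents and extraction pipelines live in `app/agents/`.
- Core graph capabilities live in `app/kg_core/`.
- Frontend pages and panels live in `admin-web/src/`.
- Configuration is read from `app/config.py` and environment variables.

Local data paths for the legacy crawling and graph-building scripts are configured in `scripts/common_paths.py` and default to `data/source` and `data/exports` inside the repository. To plug in your own raw data, override them with `TRAVEL_AGENCY_SOURCE_ROOT`, `TRAVEL_DELIVERY_ROOT`, `TRAVEL_KG_EXPORT_ROOT`, and `GAODE_CRAWLER_PATH`.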
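
The override pattern is an environment lookup with a repository-relative fallback. `scripts/common_paths.py` in this commit uses `os.getenv(name, default)`, which returns an empty string when a variable is exported but left empty. The sketch below shows a hypothetical variant, `path_from_env`, that treats an empty value as unset; it is an illustration, not the module's actual behaviour.

```python
import os
from pathlib import Path


def path_from_env(name, default, env=None):
    """Resolve a path from the environment, falling back when the value is unset or blank."""
    env = os.environ if env is None else env
    raw = env.get(name, "").strip()
    # A blank override is ignored so it cannot silently resolve to the current directory.
    return Path(raw).expanduser() if raw else Path(default)
```

For example, `path_from_env("TRAVEL_KG_EXPORT_ROOT", "data/exports")` yields `data/exports` unless the variable holds a non-empty path.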
|
||||
|
||||
When adding a feature, also update:

- The feature description or project structure in the README.
- The endpoint list in `docs/API_REFERENCE.md`.
- `.env.example` and `docs/DEPLOYMENT.md`, if the feature needs new runtime environment variables.
|
||||
|
||||
## Updating the graph schema

Graph schema and DSL materials live mainly in `schema搭建/`. If a schema has already been published to the database or FalkorDB, also update:

- Schema files
- Publish scripts
- `docs/reports/new2_current_kg_schema_snapshot.md`
- Data snapshots
|
||||
|
||||
## Updating data snapshots

Export from the local `new2` containers:

```bash
bash scripts/export_snapshots.sh
```

After exporting, check:

```bash
ls -lh snapshots/postgres/kg_admin_new2.dump snapshots/falkordb/dump.rdb
shasum -a 256 snapshots/postgres/kg_admin_new2.dump snapshots/falkordb/dump.rdb
```

After updating the snapshots, you must run a Docker restore check to make sure new users can use the repository right after cloning it.
|
||||
|
||||
## What the repository should not contain

- `.env` and real secrets
- `node_modules/`
- Python virtual environments
- Playwright/browser profiles
- Docker volumes
- Temporary logs, screenshots, and caches
- Large files that exceed GitHub's limits
|
||||
|
||||
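## Release recommendations

A mature release should satisfy the following:

1. Code and documentation are committed.
2. Docker builds from scratch.
3. The PostgreSQL and FalkorDB snapshots restore.
4. The default account can log in.
5. The startup commands and ports in the README are correct.
6. The GitHub repository home page shows the system's positioning, architecture, data, and deployment method.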
|
||||
|
||||
## Current verification record

Docker verification has been completed locally on non-default ports:

- The API health check returns `{"status":"ok"}`.
- The admin console at `/admin/` returns `200`.
- After the PostgreSQL restore, `kg_admin_new2.amap_spatial_pois` has `80609` rows.
- After the PostgreSQL restore, `kg_admin_new2.candidate_entities` has `37457` rows.
- FalkorDB lists graphs including `guiyang_new2` and `guiyang_spatial_v1`.
- Login with `admin@example.com / change-me` succeeds.
|
||||
58
docs/PROJECT_OVERVIEW.md
Normal file
@@ -0,0 +1,58 @@
|
||||
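# System Overview

The travel knowledge graph management system is the `new2` release of a knowledge graph platform for the city and travel domain, aimed at scenic areas, travel agencies, culture-and-tourism operations, and intelligent customer service. It turns collected materials, encyclopedia text, POI spatial data, route products, and resources such as hotels, restaurants, and vehicles into graph assets that can be reviewed, published, and queried.

## Target users

- Culture-and-tourism operators: review graph coverage, data quality, gaps, and publish status.
- Data annotators and reviewers: handle entity fields, evidence sources, conflict merging, and expert sign-off.
- Product and route staff: maintain fixed routes, attraction combinations, pricing notes, and resource constraints.
- Intelligent customer service developers: build route Q&A, nearby resource recommendations, and price lookups on top of the graph APIs.
- Maintenance engineers: reproduce the system and its data with Docker, snapshots, and scripts.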
|
||||
|
||||
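## Core capabilities

| Capability | Description |
| --- | --- |
| Data source and batch management | Manages sources, collection batches, raw records, and quality summaries |
| Entity review | Shows candidate entities, field decisions, evidence chains, review history, and merges |
| Graph plaza | Summarizes graph size, usage, health alerts, and user queries |
| Schema management | Manages ontology schemas, DSL, versions, and publish records |
| Evidence quality | Aggregates POI evidence, resource quality, and field confidence |
| Publish and rollback | Creates publish jobs, shows diffs, and rolls back graph versions |
| City spatial graph | Uses Amap POIs and spatial grids to support nearby search |
| Travel customer service agent | Supports route lists, route matching, route quotes, hotel resources, vehicles, and nearby resource queries |
| Permissions and organization | Built-in roles, capability matrix, users, and regional responsibility management |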
|
||||
|
||||
## Typical business flow

```mermaid
flowchart LR
  A["Collect/import materials"] --> B["Extract candidate entities"]
  B --> C["Field evidence and quality checks"]
  C --> D["Manual review / conflict merging"]
  D --> E["Publish to the FalkorDB graph"]
  E --> F["Graph queries / customer service Q&A / operations analytics"]
```
|
||||
|
||||
## Contents published with the repository

- Backend source: `app/`
- Frontend source: `admin-web/`
- Docker runtime: `Dockerfile`, `docker-compose.yml`, `docker/`
- Graph data snapshots: `snapshots/postgres/`, `snapshots/falkordb/`
- Graph schema and sample materials: `schema搭建/`
- Collection, build, publish, and snapshot scripts: `scripts/`
- Project documentation: `README.md`, `docs/`
|
||||
|
||||
## Default demo data

The snapshots come from the local `new2` system and contain data for Guiyang/Guizhou travel scenarios:

- PostgreSQL schema: `kg_admin_new2`
- Main FalkorDB graph: `guiyang_new2`
- FalkorDB spatial graph: `guiyang_spatial_v1`
- Spatial POIs: about `80609` rows
- Candidate entities: about `37457` rows

This data ships with the repository and is restored automatically by the Docker init scripts after download.
|
||||
@@ -198,6 +198,6 @@ NEARBY_ATTRACTION -> 青岩古镇: 已对齐到 amap:B035300ESE
|
||||
Corresponding script:
|
||||
|
||||
```text
|
||||
/Users/xuexue/new2/scripts/align_huaxi_kg_with_existing_graph.py
|
||||
scripts/align_huaxi_kg_with_existing_graph.py
|
||||
```
|
||||
|
||||
|
||||
@@ -4,7 +4,7 @@
|
||||
|
||||
Completed:

- Copied from `/Users/xuexue/new` to `/Users/xuexue/new2`.
- Copied from `original new directory` to `project root`.
- Excluded `node_modules`, `.env`, `data`, `__pycache__`, and runtime artifacts.
- Added the spatial report materials and benchmark materials to `docs/reports`.
|
||||
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
|
||||
## File replication

`/Users/xuexue/new2` has been fully filled in from `/Users/xuexue/new`:
`project root` has been fully filled in from `original new directory`:

- `.env`
- `admin-web/node_modules`
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
# new2 Current Knowledge Graph Schema Snapshot

Generated: 2026-05-28
Project directory: `/Users/xuexue/new2`
Project directory: `project root`

## 1. Current configuration
|
||||
|
||||
|
||||
@@ -28,8 +28,8 @@
|
||||
|
||||
Code locations:

- `/Users/xuexue/new2/app/api/travel_assistant.py:1944`
- `/Users/xuexue/new2/app/api/travel_assistant.py:3082`
- `app/api/travel_assistant.py:1944`
- `app/api/travel_assistant.py:3082`

### Method 2: The route list uses a lightweight graph query
|
||||
|
||||
@@ -47,8 +47,8 @@ TourProduct -> ProductDay -> RouteStop -> ScenicAttraction/SubAttraction
|
||||
|
||||
Code locations:

- `/Users/xuexue/new2/app/api/travel_assistant.py:1679`
- `/Users/xuexue/new2/app/api/travel_assistant.py:2975`
- `app/api/travel_assistant.py:1679`
- `app/api/travel_assistant.py:2975`

### Method 3: Resources near scenic areas are queried directly through NEARBY relationships
|
||||
|
||||
@@ -63,7 +63,7 @@ ScenicAttraction -> ATTRACTION_NEARBY_RESOURCE -> Restaurant
|
||||
|
||||
Code locations:

- `/Users/xuexue/new2/app/api/travel_assistant.py:2013`
- `app/api/travel_assistant.py:2013`

### Method 4: Fee resources support two query paths
|
||||
|
||||
@@ -86,7 +86,7 @@ ScenicAttraction -> ATTRACTION_HAS_ITEM -> TravelItem
|
||||
|
||||
Code locations:

- `/Users/xuexue/new2/app/api/travel_assistant.py:2196`
- `app/api/travel_assistant.py:2196`

### Method 5: Only recommendation questions go through full graph ranking
|
||||
|
||||
@@ -114,8 +114,8 @@ NEARBY
|
||||
|
||||
Code locations:

- `/Users/xuexue/new2/app/api/travel_assistant.py:1536`
- `/Users/xuexue/new2/app/api/travel_assistant.py:3082`
- `app/api/travel_assistant.py:1536`
- `app/api/travel_assistant.py:3082`

## 3. Results verified so far
|
||||
|
||||
@@ -146,10 +146,10 @@ NEARBY
|
||||
|
||||
Code locations:

- `/Users/xuexue/new2/admin-web/src/panels/plaza/TravelAgencyAssistantPanel.tsx:131`
- `/Users/xuexue/new2/admin-web/src/panels/plaza/TravelAgencyAssistantPanel.tsx:156`
- `/Users/xuexue/new2/admin-web/src/panels/plaza/TravelAgencyAssistantPanel.tsx:182`
- `/Users/xuexue/new2/admin-web/src/panels/plaza/TravelAgencyAssistantPanel.tsx:218`
- `admin-web/src/panels/plaza/TravelAgencyAssistantPanel.tsx:131`
- `admin-web/src/panels/plaza/TravelAgencyAssistantPanel.tsx:156`
- `admin-web/src/panels/plaza/TravelAgencyAssistantPanel.tsx:182`
- `admin-web/src/panels/plaza/TravelAgencyAssistantPanel.tsx:218`

## 5. Whether the current prototype requirements are met
|
||||
|
||||
|
||||
@@ -13,7 +13,7 @@ import sys
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
ROOT = Path("/Users/xuexue/new2")
|
||||
ROOT = Path(__file__).resolve().parents[1]
|
||||
if str(ROOT) not in sys.path:
|
||||
sys.path.insert(0, str(ROOT))
|
||||
|
||||
|
||||
@@ -1,6 +1,7 @@
|
||||
import http from "node:http";
|
||||
import { spawn } from "node:child_process";
|
||||
import { readFileSync } from "node:fs";
|
||||
import path from "node:path";
|
||||
|
||||
function readEnvKey(path, key) {
|
||||
const txt = readFileSync(path, "utf8");
|
||||
@@ -62,8 +63,10 @@ async function wait(ms) {
|
||||
}
|
||||
|
||||
async function main() {
|
||||
const key = readEnvKey("/Users/xuexue/new2/.env", "AMAP_JS_KEY");
|
||||
const security = readEnvKey("/Users/xuexue/new2/.env", "AMAP_SECURITY_JSCODE");
|
||||
const root = path.resolve(new URL("..", import.meta.url).pathname, "..");
|
||||
const envPath = process.env.TRAVEL_KG_ENV_PATH || path.join(root, ".env");
|
||||
const key = readEnvKey(envPath, "AMAP_JS_KEY");
|
||||
const security = readEnvKey(envPath, "AMAP_SECURITY_JSCODE");
|
||||
if (!key || !security) throw new Error("missing AMap JS key/security");
|
||||
|
||||
const chrome = spawn(
|
||||
|
||||
@@ -16,10 +16,11 @@ from falkordb import FalkorDB
|
||||
from psycopg.rows import dict_row
|
||||
from psycopg.types.json import Jsonb
|
||||
|
||||
from common_paths import PROJECT_ROOT, TRAVEL_AGENCY_SOURCE_ROOT, TRAVEL_KG_EXPORT_ROOT
|
||||
|
||||
SOURCE_DIR = Path("/Users/xuexue/Downloads/旅行社业务")
|
||||
OUT_DIR = Path("/Users/xuexue/Downloads/图谱数据/旅行社项目入库")
|
||||
SCHEMA_DIR = Path("/Users/xuexue/new2/schema搭建/travel_agency_business")
|
||||
SOURCE_DIR = TRAVEL_AGENCY_SOURCE_ROOT
|
||||
OUT_DIR = TRAVEL_KG_EXPORT_ROOT / "旅行社项目入库"
|
||||
SCHEMA_DIR = PROJECT_ROOT / "schema搭建/travel_agency_business"
|
||||
DB_URL = "postgresql://admin:password@localhost:5433/kg_admin"
|
||||
DB_SCHEMA = "kg_admin_new2"
|
||||
TENANT_ID = "travel_agency"
|
||||
@@ -1829,7 +1830,7 @@ def write_outputs(builder: KGBuilder, schema: dict[str, Any], qa: list[dict[str,
|
||||
f"生成时间:{datetime.now().strftime('%Y-%m-%d %H:%M:%S')}",
|
||||
"",
|
||||
"## 数据来源",
|
||||
"- `/Users/xuexue/Downloads/旅行社业务/2026年新行程打包`:既有线路产品、每日行程、费用包含/不含、自费项、风险提示。",
|
||||
"- `TRAVEL_AGENCY_SOURCE_ROOT/2026年新行程打包`:既有线路产品、每日行程、费用包含/不含、自费项、风险提示。",
|
||||
"- `滨海国旅2-8人拼小团计划...xlsx`:2-8人拼小团团期、房型、成人/儿童/单房差、景区小交通、证件退费政策。",
|
||||
"- `20-25人独立成团.xlsx`:独立成团产品、季节价、20/25人报价、泰语导游和2+1大巴服务。",
|
||||
"- `住宿资源库(四钻及以上).xlsx`、`餐厅资源库.xlsx`:酒店/餐厅资源、区域、价格、适用场景。",
|
||||
|
||||
@@ -18,15 +18,16 @@ from falkordb import FalkorDB
|
||||
from psycopg.rows import dict_row
|
||||
from psycopg.types.json import Jsonb
|
||||
|
||||
from common_paths import PROJECT_ROOT, TRAVEL_AGENCY_SOURCE_ROOT, TRAVEL_KG_EXPORT_ROOT
|
||||
|
||||
SOURCE_ROOT = Path("/Users/xuexue/Downloads/旅行社业务")
|
||||
SOURCE_ROOT = TRAVEL_AGENCY_SOURCE_ROOT
|
||||
ROUTE_SOURCE_DIR = SOURCE_ROOT / "2026年新行程打包"
|
||||
ROUTE_MD_DIR = SOURCE_ROOT / "2026年新行程打包_md整理"
|
||||
ROUTE_MD_PRODUCTS = ROUTE_MD_DIR / "products"
|
||||
LEGACY_SCRIPT = Path("/Users/xuexue/new2/scripts/build_travel_graph_existing_product_project.py")
|
||||
LEGACY_SCRIPT = PROJECT_ROOT / "scripts/build_travel_graph_existing_product_project.py"
|
||||
|
||||
SCHEMA_OUT_DIR = Path("/Users/xuexue/new2/schema搭建/travel_fixed_route_item")
|
||||
OUT_DIR = Path("/Users/xuexue/Downloads/图谱数据/travel_fixed_route_item_旅行社固定线路资源图谱")
|
||||
SCHEMA_OUT_DIR = PROJECT_ROOT / "schema搭建/travel_fixed_route_item"
|
||||
OUT_DIR = TRAVEL_KG_EXPORT_ROOT / "travel_fixed_route_item_旅行社固定线路资源图谱"
|
||||
|
||||
DB_URL = "postgresql://admin:password@localhost:5433/kg_admin"
|
||||
DB_SCHEMA = "kg_admin_new2"
|
||||
|
||||
@@ -17,13 +17,14 @@ from falkordb import FalkorDB
|
||||
from psycopg.rows import dict_row
|
||||
from psycopg.types.json import Jsonb
|
||||
|
||||
from common_paths import PROJECT_ROOT, TRAVEL_AGENCY_SOURCE_ROOT, TRAVEL_KG_EXPORT_ROOT
|
||||
|
||||
SOURCE_ROOT = Path("/Users/xuexue/Downloads/旅行社业务")
|
||||
SOURCE_ROOT = TRAVEL_AGENCY_SOURCE_ROOT
|
||||
ROUTE_MD_DIR = SOURCE_ROOT / "2026年新行程打包_md整理"
|
||||
ROUTE_MD_PRODUCTS = ROUTE_MD_DIR / "products"
|
||||
SCHEMA_SRC = Path("/Users/xuexue/new2/schema搭建/travel_agency_business/travel_agency_existing_product_schema.v1.json")
|
||||
SCHEMA_OUT_DIR = Path("/Users/xuexue/new2/schema搭建/travel_graph_existing_product")
|
||||
OUT_DIR = Path("/Users/xuexue/Downloads/图谱数据/travel_graph_旅行社线路制定")
|
||||
SCHEMA_SRC = PROJECT_ROOT / "schema搭建/travel_agency_business/travel_agency_existing_product_schema.v1.json"
|
||||
SCHEMA_OUT_DIR = PROJECT_ROOT / "schema搭建/travel_graph_existing_product"
|
||||
OUT_DIR = TRAVEL_KG_EXPORT_ROOT / "travel_graph_旅行社线路制定"
|
||||
AMAP_CACHE_PATH = OUT_DIR / "amap_poi_enrichment_cache.json"
|
||||
AMAP_DRIVING_CACHE_PATH = OUT_DIR / "amap_driving_distance_cache.json"
|
||||
|
||||
|
||||
@@ -5,7 +5,7 @@ from __future__ import annotations
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
ROOT = Path("/Users/xuexue/new2")
|
||||
ROOT = Path(__file__).resolve().parents[1]
|
||||
if str(ROOT) not in sys.path:
|
||||
sys.path.insert(0, str(ROOT))
|
||||
|
||||
|
||||
17
scripts/common_paths.py
Normal file
@@ -0,0 +1,17 @@
|
||||
from __future__ import annotations

import os
from pathlib import Path


PROJECT_ROOT = Path(__file__).resolve().parents[1]
DATA_ROOT = Path(os.getenv("TRAVEL_KG_DATA_ROOT", PROJECT_ROOT / "data")).expanduser()
TRAVEL_AGENCY_SOURCE_ROOT = Path(
    os.getenv("TRAVEL_AGENCY_SOURCE_ROOT", DATA_ROOT / "source" / "travel_agency")
).expanduser()
TRAVEL_DELIVERY_ROOT = Path(
    os.getenv("TRAVEL_DELIVERY_ROOT", DATA_ROOT / "source" / "travel_delivery_20260602")
).expanduser()
TRAVEL_KG_EXPORT_ROOT = Path(os.getenv("TRAVEL_KG_EXPORT_ROOT", DATA_ROOT / "exports")).expanduser()
GAODE_CRAWLER_PATH = Path(os.getenv("GAODE_CRAWLER_PATH", PROJECT_ROOT / "scripts" / "crawl_guiyan.py")).expanduser()
ENV_PATH = Path(os.getenv("TRAVEL_KG_ENV_PATH", PROJECT_ROOT / ".env")).expanduser()
|
||||
@@ -21,10 +21,12 @@ from typing import Any
|
||||
import requests
|
||||
import urllib3
|
||||
|
||||
from common_paths import GAODE_CRAWLER_PATH, PROJECT_ROOT, TRAVEL_KG_EXPORT_ROOT
|
||||
|
||||
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
|
||||
|
||||
BUILD_SCRIPT = Path("/Users/xuexue/new2/scripts/build_travel_graph_existing_product_project.py")
|
||||
OUT_DIR = Path("/Users/xuexue/Downloads/图谱数据/travel_graph_旅行社线路制定")
|
||||
BUILD_SCRIPT = PROJECT_ROOT / "scripts/build_travel_graph_existing_product_project.py"
|
||||
OUT_DIR = TRAVEL_KG_EXPORT_ROOT / "travel_graph_旅行社线路制定"
|
||||
NODES_PATH = OUT_DIR / "抽取结果_nodes.json"
|
||||
CACHE_PATH = OUT_DIR / "amap_driving_distance_cache.json"
|
||||
REPORT_CSV = OUT_DIR / "amap_driving_distance_report.csv"
|
||||
@@ -48,7 +50,7 @@ def load_key() -> str:
|
||||
for key in (os.environ.get("AMAP_WEB_KEY"), os.environ.get("AMAP_KEY")):
|
||||
if key:
|
||||
return key
|
||||
crawl_path = Path("/Users/xuexue/PycharmProjects/PythonProject/xuexue-CityGraph/crawl_guiyan.py")
|
||||
crawl_path = GAODE_CRAWLER_PATH
|
||||
if crawl_path.exists():
|
||||
spec = importlib.util.spec_from_file_location("crawl_guiyan", crawl_path)
|
||||
mod = importlib.util.module_from_spec(spec)
|
||||
|
||||
@@ -21,10 +21,12 @@ from typing import Any
|
||||
import requests
|
||||
import urllib3
|
||||
|
||||
from common_paths import GAODE_CRAWLER_PATH, PROJECT_ROOT, TRAVEL_KG_EXPORT_ROOT
|
||||
|
||||
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
|
||||
|
||||
BUILD_SCRIPT = Path("/Users/xuexue/new2/scripts/build_travel_graph_existing_product_project.py")
|
||||
OUT_DIR = Path("/Users/xuexue/Downloads/图谱数据/travel_graph_旅行社线路制定")
|
||||
BUILD_SCRIPT = PROJECT_ROOT / "scripts/build_travel_graph_existing_product_project.py"
|
||||
OUT_DIR = TRAVEL_KG_EXPORT_ROOT / "travel_graph_旅行社线路制定"
|
||||
NODES_PATH = OUT_DIR / "抽取结果_nodes.json"
|
||||
CACHE_PATH = OUT_DIR / "amap_poi_enrichment_cache.json"
|
||||
REPORT_CSV = OUT_DIR / "amap_poi_enrichment_report.csv"
|
||||
@@ -47,7 +49,7 @@ def load_key() -> str:
|
||||
for key in (os.environ.get("AMAP_WEB_KEY"), os.environ.get("AMAP_KEY")):
|
||||
if key:
|
||||
return key
|
||||
crawl_path = Path("/Users/xuexue/PycharmProjects/PythonProject/xuexue-CityGraph/crawl_guiyan.py")
|
||||
crawl_path = GAODE_CRAWLER_PATH
|
||||
if crawl_path.exists():
|
||||
spec = importlib.util.spec_from_file_location("crawl_guiyan", crawl_path)
|
||||
mod = importlib.util.module_from_spec(spec)
|
||||
|
||||
@@ -18,10 +18,10 @@ import urllib.request
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
from common_paths import ENV_PATH, TRAVEL_DELIVERY_ROOT
|
||||
|
||||
BASE_DIR = Path("/Users/xuexue/Documents/trae_projects/travel- graph/delivery_20260602")
|
||||
BASE_DIR = TRAVEL_DELIVERY_ROOT
|
||||
OUT_DIR = BASE_DIR / "amap_enriched"
|
||||
ENV_PATH = Path("/Users/xuexue/Desktop/zn-kg/.env")
|
||||
CACHE_PATH = OUT_DIR / "_amap_cache.json"
|
||||
|
||||
SCENIC_TYPES = "110000"
|
||||
|
||||
@@ -4,7 +4,8 @@ import path from "node:path";
|
||||
import http from "node:http";
|
||||
import { spawn } from "node:child_process";
|
||||
|
||||
const BASE_DIR = "/Users/xuexue/Documents/trae_projects/travel- graph/delivery_20260602";
|
||||
const ROOT = path.resolve(new URL("..", import.meta.url).pathname, "..");
|
||||
const BASE_DIR = process.env.TRAVEL_DELIVERY_ROOT || path.join(ROOT, "data", "source", "travel_delivery_20260602");
|
||||
const OUT_DIR = path.join(BASE_DIR, "amap_js_enriched");
|
||||
const CACHE_FILE = path.join(OUT_DIR, "_amap_js_cache.json");
|
||||
const CHROME = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome";
|
||||
@@ -185,8 +186,9 @@ function jsString(v) {
|
||||
}
|
||||
|
||||
async function initAmap(cdp) {
|
||||
const key = readEnvKey("/Users/xuexue/new2/.env", "AMAP_JS_KEY");
|
||||
const security = readEnvKey("/Users/xuexue/new2/.env", "AMAP_SECURITY_JSCODE");
|
||||
const envPath = process.env.TRAVEL_KG_ENV_PATH || path.join(ROOT, ".env");
|
||||
const key = readEnvKey(envPath, "AMAP_JS_KEY");
|
||||
const security = readEnvKey(envPath, "AMAP_SECURITY_JSCODE");
|
||||
if (!key || !security) throw new Error("missing AMap JS key/security");
|
||||
const expr = `
|
||||
(async () => {
|
||||
|
||||
@@ -12,9 +12,11 @@ from typing import Any
|
||||
|
||||
from falkordb import FalkorDB
|
||||
|
||||
from common_paths import TRAVEL_KG_EXPORT_ROOT
|
||||
|
||||
|
||||
GRAPH_NAME = "travel_agency_2_0_test"
|
||||
OUT_ROOT = Path("/Users/xuexue/Downloads/图谱数据/travel_agency_2_0_test_旅行社2.0测试")
|
||||
OUT_ROOT = TRAVEL_KG_EXPORT_ROOT / "travel_agency_2_0_test_旅行社2.0测试"
|
||||
SCHEMA_SIMPLE = OUT_ROOT / "tencent_adp_schema.simple.json"
|
||||
FILTERED_DIR = OUT_ROOT / "filtered_import_from_travel_fixed_route_item"
|
||||
POI_DIR = OUT_ROOT / "poi_nearby_import_without_amap"
|
||||
|
||||
@@ -13,7 +13,7 @@ import sys
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
ROOT = Path("/Users/xuexue/new2")
|
||||
ROOT = Path(__file__).resolve().parents[1]
|
||||
if str(ROOT) not in sys.path:
|
||||
sys.path.insert(0, str(ROOT))
|
||||
|
||||
|
||||
@@ -5,9 +5,11 @@ from datetime import datetime
|
||||
|
||||
from falkordb import FalkorDB
|
||||
|
||||
from common_paths import TRAVEL_AGENCY_SOURCE_ROOT
|
||||
|
||||
|
||||
GRAPH_NAME = "travel_agency_2_0_test"
|
||||
SOURCE_FILE = "/Users/xuexue/Downloads/旅行社业务/线上客资回复话术.docx"
|
||||
SOURCE_FILE = str(TRAVEL_AGENCY_SOURCE_ROOT / "线上客资回复话术.docx")
|
||||
UPDATED_AT = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
|
||||
|
||||
|
||||
|
||||
@@ -14,6 +14,8 @@ from falkordb import FalkorDB
|
||||
from psycopg.rows import dict_row
|
||||
from psycopg.types.json import Jsonb
|
||||
|
||||
from common_paths import TRAVEL_DELIVERY_ROOT, TRAVEL_KG_EXPORT_ROOT
|
||||
|
||||
DB_URL = "postgresql://admin:password@localhost:5433/kg_admin"
|
||||
DB_SCHEMA = "kg_admin_new2"
|
||||
|
||||
@@ -22,10 +24,10 @@ PROJECT_ID = "travel_agency_2_0_test"
|
||||
GRAPH_NAME = "travel_agency_2_0_test"
|
||||
TEMPLATE_ID = "travel_agency_2_0_poi_nearby_import_without_amap_v1"
|
||||
|
||||
SOURCE_DIR = Path("/Users/xuexue/Documents/trae_projects/travel- graph/delivery_20260602")
|
||||
SOURCE_DIR = TRAVEL_DELIVERY_ROOT
|
||||
HOTEL_FILE = SOURCE_DIR / "hotel_poi.csv"
|
||||
RESTAURANT_FILE = SOURCE_DIR / "restaurant_poi.csv"
|
||||
OUT_DIR = Path("/Users/xuexue/Downloads/图谱数据/travel_agency_2_0_test_旅行社2.0测试/poi_nearby_import_without_amap")
|
||||
OUT_DIR = TRAVEL_KG_EXPORT_ROOT / "travel_agency_2_0_test_旅行社2.0测试/poi_nearby_import_without_amap"
|
||||
RUN_UPDATED_AT = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
|
||||
|
||||
|
||||
|
||||
@@ -13,7 +13,7 @@ from copy import deepcopy
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
ROOT = Path("/Users/xuexue/new2")
|
||||
ROOT = Path(__file__).resolve().parents[1]
|
||||
IN_JSON = ROOT / "docs/reports/huaxi_kg_extraction_comparison.json"
|
||||
SCHEMA_JSON = ROOT / "app/schemas/kg_extraction_v1.schema.json"
|
||||
OUT_JSON = ROOT / "docs/reports/huaxi_kg_schema_v1_ready.json"
|
||||
@@ -217,7 +217,7 @@ def write_review_plan(raw: dict[str, Any], payload: dict[str, Any], validation:
|
||||
"-> final_score < 0.8 或模型冲突:进入人工审核",
|
||||
"```",
|
||||
"",
|
||||
"对应严格 JSON 输出:`/Users/xuexue/new2/docs/reports/huaxi_kg_schema_v1_ready.json`",
|
||||
"对应严格 JSON 输出:`docs/reports/huaxi_kg_schema_v1_ready.json`",
|
||||
]
|
||||
OUT_REVIEW.write_text("\n".join(lines), encoding="utf-8")
|
||||
|
||||
|
||||
@@ -14,6 +14,8 @@ from falkordb import FalkorDB
|
||||
from psycopg.rows import dict_row
|
||||
from psycopg.types.json import Jsonb
|
||||
|
||||
from common_paths import TRAVEL_KG_EXPORT_ROOT
|
||||
|
||||
DB_URL = "postgresql://admin:password@localhost:5433/kg_admin"
|
||||
DB_SCHEMA = "kg_admin_new2"
|
||||
|
||||
@@ -25,7 +27,7 @@ TARGET_PROJECT = "travel_agency_2_0_test"
|
||||
TARGET_GRAPH = "travel_agency_2_0_test"
|
||||
TARGET_TEMPLATE_ID = "travel_agency_2_0_fixed_route_core_import_v1"
|
||||
|
||||
OUT_DIR = Path("/Users/xuexue/Downloads/图谱数据/travel_agency_2_0_test_旅行社2.0测试/filtered_import_from_travel_fixed_route_item")
|
||||
OUT_DIR = TRAVEL_KG_EXPORT_ROOT / "travel_agency_2_0_test_旅行社2.0测试/filtered_import_from_travel_fixed_route_item"
|
||||
RUN_UPDATED_AT = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
|
||||
|
||||
EXCLUDED_ENTITY_TYPES = {
|
||||
|
||||
@@ -9,10 +9,11 @@ from datetime import datetime
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
from common_paths import TRAVEL_AGENCY_SOURCE_ROOT, TRAVEL_KG_EXPORT_ROOT
|
||||
|
||||
SOURCE_DIR = Path("/Users/xuexue/Downloads/旅行社业务/2026年新行程打包")
|
||||
OUT_DIR = Path("/Users/xuexue/Downloads/旅行社业务/2026年新行程打包_md整理")
|
||||
GRAPH_OUT_DIR = Path("/Users/xuexue/Downloads/图谱数据/旅行社项目入库/已有路线产品Markdown")
|
||||
SOURCE_DIR = TRAVEL_AGENCY_SOURCE_ROOT / "2026年新行程打包"
|
||||
OUT_DIR = TRAVEL_AGENCY_SOURCE_ROOT / "2026年新行程打包_md整理"
|
||||
GRAPH_OUT_DIR = TRAVEL_KG_EXPORT_ROOT / "旅行社项目入库/已有路线产品Markdown"
|
||||
|
||||
|
||||
ATTRACTION_ALIASES = {
|
||||
|
||||
@@ -9,7 +9,7 @@ import sys
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
ROOT = Path("/Users/xuexue/new2")
|
||||
ROOT = Path(__file__).resolve().parents[1]
|
||||
if str(ROOT) not in sys.path:
|
||||
sys.path.insert(0, str(ROOT))
|
||||
|
||||
|
||||
@@ -10,13 +10,14 @@ from psycopg.rows import dict_row
|
||||
from psycopg.types.json import Jsonb
|
||||
|
||||
from app.config import settings
|
||||
from common_paths import PROJECT_ROOT, TRAVEL_KG_EXPORT_ROOT
|
||||
|
||||
PROJECT_ID = "travel_agency_2_0_test"
|
||||
TENANT_ID = "travel_agency"
|
||||
GRAPH_NAME = "travel_agency_2_0_test"
|
||||
NAMESPACE = "travel_agency_2_0"
|
||||
SCHEMA_DIR = Path("/Users/xuexue/new2/schema搭建/travel_agency_2_0_test")
|
||||
DOWNLOAD_DIR = Path("/Users/xuexue/Downloads/图谱数据/travel_agency_2_0_test_旅行社2.0测试")
|
||||
SCHEMA_DIR = PROJECT_ROOT / "schema搭建/travel_agency_2_0_test"
|
||||
DOWNLOAD_DIR = TRAVEL_KG_EXPORT_ROOT / "travel_agency_2_0_test_旅行社2.0测试"
|
||||
CURRENT_JSON = SCHEMA_DIR / "travel_agency_2_0_schema.current.json"
|
||||
|
||||
|
||||
|
||||
@@ -10,13 +10,14 @@ from psycopg.rows import dict_row
|
||||
from psycopg.types.json import Jsonb
|
||||
|
||||
from app.config import settings
|
||||
from common_paths import PROJECT_ROOT, TRAVEL_KG_EXPORT_ROOT
|
||||
|
||||
PROJECT_ID = "travel_agency_2_0_test"
|
||||
TENANT_ID = "travel_agency"
|
||||
GRAPH_NAME = "travel_agency_2_0_test"
|
||||
NAMESPACE = "travel_agency_2_0"
|
||||
SCHEMA_DIR = Path("/Users/xuexue/new2/schema搭建/travel_agency_2_0_test")
|
||||
DOWNLOAD_DIR = Path("/Users/xuexue/Downloads/图谱数据/travel_agency_2_0_test_旅行社2.0测试")
|
||||
SCHEMA_DIR = PROJECT_ROOT / "schema搭建/travel_agency_2_0_test"
|
||||
DOWNLOAD_DIR = TRAVEL_KG_EXPORT_ROOT / "travel_agency_2_0_test_旅行社2.0测试"
|
||||
CURRENT_JSON = SCHEMA_DIR / "travel_agency_2_0_schema.current.json"
|
||||
|
||||
|
||||
|
||||
@@ -10,14 +10,15 @@ from psycopg.rows import dict_row
|
||||
from psycopg.types.json import Jsonb
|
||||
|
||||
from app.config import settings
|
||||
from common_paths import PROJECT_ROOT, TRAVEL_KG_EXPORT_ROOT
|
||||
|
||||
|
||||
PROJECT_ID = "travel_agency_2_0_test"
|
||||
TENANT_ID = "travel_agency"
|
||||
GRAPH_NAME = "travel_agency_2_0_test"
|
||||
NAMESPACE = "travel_agency_2_0"
|
||||
SCHEMA_DIR = Path("/Users/xuexue/new2/schema搭建/travel_agency_2_0_test")
|
||||
DOWNLOAD_DIR = Path("/Users/xuexue/Downloads/图谱数据/travel_agency_2_0_test_旅行社2.0测试")
|
||||
SCHEMA_DIR = PROJECT_ROOT / "schema搭建/travel_agency_2_0_test"
|
||||
DOWNLOAD_DIR = TRAVEL_KG_EXPORT_ROOT / "travel_agency_2_0_test_旅行社2.0测试"
|
||||
CURRENT_JSON = SCHEMA_DIR / "travel_agency_2_0_schema.current.json"
|
||||
|
||||
|
||||
|
||||