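# CodeDetect Developer Guide
This guide gives developers a detailed reference for working on the CodeDetect system. It covers the system architecture, module descriptions, development environment setup, coding standards, testing, and the release process.
## Table of Contents
1. [System Architecture](#system-architecture)
2. [Development Environment Setup](#development-environment-setup)
3. [Module Development Guide](#module-development-guide)
4. [Coding Standards](#coding-standards)
5. [Testing Guide](#testing-guide)
6. [Debugging Guide](#debugging-guide)
7. [Performance Optimization](#performance-optimization)
8. [Release Process](#release-process)
9. [Contributing Guide](#contributing-guide)
---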
## System Architecture
### Overall Architecture
```
┌─────────────────────────────────────────────────────────────┐
│                     Web Interface Layer                     │
├─────────────────┬─────────────────┬─────────────────────────┤
│ REST API        │ WebSocket       │ File Upload             │
└─────────────────┴─────────────────┴─────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│                    Business Logic Layer                     │
├─────────────────┬─────────────────┬─────────────────────────┤
│ Workflow Mgmt   │ Job Scheduling  │ Result Aggregation      │
└─────────────────┴─────────────────┴─────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│                      Core Engine Layer                      │
├─────────────────┬─────────────────┬─────────────────────────┤
│ Code Parsing    │ Spec Generation │ Spec Mutation           │
└─────────────────┴─────────────────┴─────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│                Verification Execution Layer                 │
├─────────────────┬─────────────────┬─────────────────────────┤
│ CBMC Runner     │ Result Parsing  │ Report Generation       │
└─────────────────┴─────────────────┴─────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│                    Infrastructure Layer                     │
├─────────────────┬─────────────────┬─────────────────────────┤
│ Config Mgmt     │ Logging         │ Monitoring/Alerting     │
└─────────────────┴─────────────────┴─────────────────────────┘
```
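### Core Modules
#### 1. Parsing Module (`src/parse/`)
**Purpose**: Parse and analyze C/C++ code
**Main components**:
- `code_parser.py`: Main parser, backed by clang or a custom parser
- `ast_processor.py`: AST processing and metadata extraction
- `complexity_analyzer.py`: Code complexity analysis
- `dependency_resolver.py`: Dependency resolution
**Design principles**:
- Modular design that supports multiple parsers
- Error recovery, so a partial parse failure does not stop the overall pipeline
- Caching to speed up repeated parsing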
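The error-recovery and caching principles could be combined roughly as follows. This is a minimal sketch, not the project's parser: `parse_units`, `toy_parser`, and the choice of a content hash as cache key are hypothetical and exist only for illustration.

```python
import hashlib
from typing import Callable, Dict, List, Tuple


def toy_parser(code: str) -> List[str]:
    """Hypothetical stand-in for a real parser: returns one function name."""
    if "{" not in code:
        raise ValueError("no function body found")
    return [code.split("(")[0].split()[-1]]


def parse_units(
    sources: Dict[str, str],
    parse_one: Callable[[str], List[str]],
    cache: Dict[str, List[str]],
) -> Tuple[Dict[str, List[str]], Dict[str, str]]:
    """Parse every source, reusing cached results and isolating failures."""
    results: Dict[str, List[str]] = {}
    errors: Dict[str, str] = {}
    for name, code in sources.items():
        # Key the cache on content, so unchanged code is never parsed twice
        key = hashlib.sha256(code.encode("utf-8")).hexdigest()
        if key in cache:
            results[name] = cache[key]
            continue
        try:
            parsed = parse_one(code)
        except ValueError as exc:
            # Record the failure and keep going with the remaining sources
            errors[name] = str(exc)
            continue
        cache[key] = parsed
        results[name] = parsed
    return results, errors
```

A caller would inspect the returned error map to report the files that failed while still using the results for the files that parsed.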
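#### 2. Mutation Module (`src/mutate/`)
**Purpose**: Generate mutations of CBMC specifications and evaluate their quality
**Main components**:
- `engine.py`: Main mutation engine class
- `operators.py`: CBMC-specific mutation operators
- `selector.py`: Mutation selection strategy
- `evaluator.py`: Mutation quality evaluation
- `mutation_types.py`: Mutation type definitions
**CBMC mutation operators**:
1. **Predicate mutation**: Changes the operators in conditional expressions
2. **Boundary mutation**: Adjusts boundary values
3. **Arithmetic mutation**: Changes arithmetic operations
4. **Array mutation**: Changes array accesses and bounds
5. **Pointer mutation**: Changes pointer operations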
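As a rough illustration of predicate mutation (operator 1 in the list), the sketch below flips one relational operator at a time in a specification string. The swap table and the `predicate_mutants` helper are assumptions made for this example and are not taken from `operators.py`. A regex-based approach like this also ignores comments, string literals, and C++ template brackets.

```python
import re
from typing import List

# Hypothetical swap table: each relational operator maps to one close variant
_SWAPS = {"<=": "<", ">=": ">", "<": "<=", ">": ">=", "==": "!=", "!=": "=="}

# Relational operators only; the lookarounds skip `->`, `<<`, `>>` and `<<=`
_RELATIONAL = re.compile(r"(?<![<>=!-])(<=|>=|==|!=|<|>)(?![<>=])")


def predicate_mutants(spec: str) -> List[str]:
    """Return one mutated copy of `spec` per relational operator found."""
    mutants = []
    for match in _RELATIONAL.finditer(spec):
        replacement = _SWAPS[match.group(1)]
        mutants.append(spec[: match.start(1)] + replacement + spec[match.end(1):])
    return mutants


for mutant in predicate_mutants("__CPROVER_assume(x > 0 && x <= n);"):
    print(mutant)
```

Changing a single operator per mutant keeps each mutant attributable to one change, which makes a failed or passed verification easier to interpret.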
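#### 3. Verification Module (`src/verify/`)
**Purpose**: Run CBMC verification and analyze the results
**Main components**:
- `cbmc_runner.py`: CBMC runner
- `command_builder.py`: CBMC command construction
- `result_parser.py`: Result parser
- `harness_gen.py`: Test harness generator
- `verification_types.py`: Verification type definitions
**Features**:
- Asynchronous execution
- Timeout control
- Resource limits
- Result caching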
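Asynchronous execution with a timeout, two of the features listed for this module, might look like the sketch below. It relies only on the standard `asyncio` subprocess API. The `run_with_timeout` helper is hypothetical and is not the actual `cbmc_runner.py` implementation.

```python
import asyncio
import sys
from typing import List, Optional, Tuple


async def run_with_timeout(
    command: List[str], timeout: float
) -> Tuple[Optional[int], str]:
    """Run `command`; return (exit code, output), or (None, "") on timeout."""
    proc = await asyncio.create_subprocess_exec(
        *command,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
    )
    try:
        stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        # Kill the child and reap it so no zombie process is left behind
        proc.kill()
        await proc.wait()
        return None, ""
    return proc.returncode, stdout.decode(errors="replace")


if __name__ == "__main__":
    # Demonstrated with the Python interpreter so the example runs without CBMC
    print(asyncio.run(run_with_timeout([sys.executable, "-c", "print('ok')"], 10.0)))
```

For CBMC, the command could be something like `["cbmc", "harness.c"]` (the file name is a placeholder), with the timeout taken from configuration.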
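#### 4. UI Module (`src/ui/`)
**Purpose**: Web interface and API service
**Main components**:
- `web_app.py`: Flask application
- `api.py`: REST API endpoints
- `websocket.py`: Real-time communication over WebSocket
- `static/`: Front-end static assets
**Tech stack**:
- Back end: Flask + SocketIO
- Front end: Vanilla JavaScript + WebSocket
- Real-time communication: WebSocket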
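#### 5. Configuration Module (`src/config/`)
**Purpose**: System configuration management
**Main components**:
- `config_manager.py`: Configuration manager
- `validation.py`: Configuration validation
- `environment.py`: Environment variable handling
**Features**:
- YAML configuration files
- Environment variable overrides
- Configuration validation
- Hot reloading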
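Environment variable overrides could work along the lines of the sketch below. It assumes a `SECTION_KEY` naming convention (for example `CBMC_TIMEOUT` overriding `cbmc.timeout`) and a hypothetical `apply_env_overrides` helper; neither is confirmed to match `environment.py`.

```python
from typing import Any, Dict, Mapping


def apply_env_overrides(
    config: Dict[str, Dict[str, Any]], environ: Mapping[str, str]
) -> Dict[str, Dict[str, Any]]:
    """Return a copy of `config` with SECTION_KEY variables applied on top."""
    merged = {section: dict(values) for section, values in config.items()}
    for section, values in merged.items():
        for key, current in values.items():
            env_name = f"{section}_{key}".upper()
            if env_name not in environ:
                continue
            raw = environ[env_name]
            # Coerce to the type of the existing value; bool is checked
            # before int because bool is a subclass of int
            if isinstance(current, bool):
                values[key] = raw.strip().lower() in ("1", "true", "yes")
            elif isinstance(current, int):
                values[key] = int(raw)
            elif isinstance(current, float):
                values[key] = float(raw)
            else:
                values[key] = raw
    return merged
```

In practice the second argument would be `os.environ`; passing a plain mapping keeps the function easy to test.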
---
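## Development Environment Setup
### Prerequisites
- **Operating system**: Linux (Ubuntu 20.04+ recommended)
- **Python**: 3.8+
- **Memory**: 4 GB minimum, 8 GB or more recommended
- **Storage**: At least 2 GB of free space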
### Installing Dependencies
1. **System dependencies**:
```bash
# Ubuntu/Debian
sudo apt-get update
sudo apt-get install -y python3 python3-pip python3-venv git cmake build-essential
# Install CBMC
sudo apt-get install -y cbmc
```
2. **Python virtual environment**:
```bash
# Create the virtual environment
python3 -m venv venv
source venv/bin/activate
# Upgrade pip
pip install --upgrade pip
```
3. **Project dependencies**:
```bash
# Install runtime dependencies
pip install -r requirements.txt
# Install development dependencies
pip install -r requirements-dev.txt
```
### Development Tool Configuration
1. **VS Code settings** (`.vscode/settings.json`):
```json
{
    "python.linting.enabled": true,
    "python.linting.pylintEnabled": true,
    "python.formatting.provider": "black",
    "editor.formatOnSave": true,
    "python.testing.pytestEnabled": true,
    "python.testing.pytestArgs": ["tests"]
}
```
2. **Git configuration**:
```bash
git config --local user.name "Your Name"
git config --local user.email "your.email@example.com"
git config --local core.autocrlf input
```
3. **Pre-commit hooks**:
```bash
# Install the git hooks defined for this repository
pre-commit install
# Run the pre-commit checks against all files
pre-commit run --all-files
```
### Environment Variables
Create a `.env` file:
```bash
# LLM settings
LLM_PROVIDER=deepseek
LLM_MODEL=deepseek-coder
LLM_API_KEY=your-api-key
LLM_MAX_TOKENS=2000
LLM_TEMPERATURE=0.3
# CBMC settings
CBMC_PATH=/usr/bin/cbmc
CBMC_TIMEOUT=300
CBMC_DEPTH=20
# Web settings
WEB_HOST=0.0.0.0
WEB_PORT=8080
WEB_DEBUG=false
# Logging settings
LOG_LEVEL=INFO
LOG_FILE=logs/app.log
```
---
## Module Development Guide
### Adding a New Parser
1. **Create the parser class** (`src/parse/custom_parser.py`):
```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class CustomParseResult:
    functions: List[Dict[str, Any]]
    variables: List[Dict[str, Any]]
    complexity: Dict[str, Any]


class CustomParser:
    def __init__(self, config: Optional[Dict] = None):
        self.config = config or {}

    def parse_file(self, file_path: str) -> CustomParseResult:
        """Parse a source file. Implement the file parsing logic here."""
        raise NotImplementedError

    def parse_code(self, code: str, filename: str = "temp.c") -> CustomParseResult:
        """Parse a code string. Implement the string parsing logic here."""
        raise NotImplementedError

    def get_supported_extensions(self) -> List[str]:
        """Return the supported file extensions."""
        return [".c", ".cpp", ".h", ".hpp"]
```
2. **Register the parser** (`src/parse/__init__.py`):
```python
from typing import Dict, Optional

from .custom_parser import CustomParser

# DefaultParser is assumed to be defined or imported elsewhere in this package.
PARSER_REGISTRY = {
    "default": DefaultParser,
    "custom": CustomParser,
    # Add more parsers here
}


def get_parser(parser_type: str = "default", config: Optional[Dict] = None):
    """Return a parser of the requested type, falling back to the default."""
    parser_class = PARSER_REGISTRY.get(parser_type, DefaultParser)
    return parser_class(config)
```
3. **Configure the parser** (`config/default.yaml`):
```yaml
parser:
  type: "custom"  # Use the custom parser
  config:
    option1: "value1"
    option2: "value2"
```
### Adding a New Mutation Operator
1. **Define the mutation type** (`src/mutate/mutation_types.py`):
```python
from enum import Enum


class MutationType(Enum):
    PREDICATE = "predicate"
    BOUNDARY = "boundary"
    ARITHMETIC = "arithmetic"
    ARRAY = "array"
    POINTER = "pointer"
    CUSTOM = "custom"  # Newly added custom type
```
2. **Implement the mutation operator** (`src/mutate/operators.py`):
```python
import re
from typing import Dict, List, Optional


class CBMCMutationOperators:
    def __init__(self, config: Optional[Dict] = None):
        self.config = config or {}

    def generate_custom_mutations(
        self,
        specification: str,
        function_metadata: List[Dict]
    ) -> List[str]:
        """Generate custom mutations."""
        mutations = []
        # Custom mutation logic goes here.
        # Example: change the condition of a for loop.
        loop_pattern = r'for\s*\(\s*.*;\s*.*;\s*.*\s*\)'

        def replace_loop_condition(match):
            condition = match.group(0)
            # Turn the strict bound into an inclusive one
            return condition.replace('i < n', 'i <= n')

        mutated_spec = re.sub(loop_pattern, replace_loop_condition, specification)
        if mutated_spec != specification:
            mutations.append(mutated_spec)
        return mutations
```
3. **Integrate with the mutation engine** (`src/mutate/engine.py`):
```python
class MutationEngine:
    def generate_mutations(self, specification: str, function_metadata: List[Dict], **kwargs):
        # ... existing logic, which defines `mutation_types` and `all_mutations`
        # Generate custom mutations
        if MutationType.CUSTOM in mutation_types:
            custom_mutations = self.operators.generate_custom_mutations(
                specification, function_metadata
            )
            for custom_spec in custom_mutations:
                mutation = CBMCMutation(
                    specification=custom_spec,
                    mutation_type=MutationType.CUSTOM,
                    confidence=0.8,  # Estimated confidence
                    description="Description of the custom mutation"
                )
                all_mutations.append(mutation)
```
### Adding a New Verification Type
1. **Define the verification type** (`src/verify/verification_types.py`):
```python
from enum import Enum


class VerificationType(Enum):
    MEMORY_SAFETY = "memory_safety"
    OVERFLOW_DETECTION = "overflow_detection"
    POINTER_VALIDITY = "pointer_validity"
    CONCURRENCY = "concurrency"
    CUSTOM_TYPE = "custom_type"  # Newly added verification type
```
2. **Implement the verification logic** (`src/verify/custom_verifier.py`):
```python
from typing import Dict, List, Optional

# VerificationResult and VerificationType come from verification_types.py.
# _execute_verification and _parse_verification_result are left to implement.


class CustomVerifier:
    def __init__(self, config: Optional[Dict] = None):
        self.config = config or {}

    async def verify_custom_type(
        self,
        specification: str,
        source_file: str,
        function_metadata: Dict
    ) -> VerificationResult:
        """Run the custom verification."""
        # Build the verification command
        command = self._build_verification_command(
            specification, source_file, VerificationType.CUSTOM_TYPE
        )
        # Run the verification
        result = await self._execute_verification(command)
        # Parse the result
        parsed_result = self._parse_verification_result(result)
        return parsed_result

    def _build_verification_command(
        self,
        specification: str,
        source_file: str,
        verification_type: VerificationType
    ) -> List[str]:
        """Build the verification command."""
        base_command = ["cbmc", source_file]
        # Add custom options. The two flags below are placeholders, not real
        # CBMC options; replace them with the flags your check needs.
        if verification_type == VerificationType.CUSTOM_TYPE:
            base_command.extend([
                "--custom-option",
                "--custom-value"
            ])
        return base_command
```
### Adding a New API Endpoint
1. **Define the API route** (`src/ui/api.py`):
```python
from functools import wraps

from flask import jsonify, request

# `app`, `logger`, `validate_token` and `process_custom_operation` are assumed
# to be defined elsewhere in the application.


def require_auth(f):
    @wraps(f)
    def decorated(*args, **kwargs):
        token = request.headers.get('Authorization')
        if not token or not validate_token(token):
            return jsonify({"success": False, "error": "Unauthorized"}), 401
        return f(*args, **kwargs)
    return decorated


@app.route('/api/custom', methods=['POST'])
@require_auth
def handle_custom_operation():
    """Handle the custom operation."""
    try:
        data = request.get_json()
        # Validate the input
        if not data or 'required_field' not in data:
            return jsonify({
                "success": False,
                "error": {
                    "code": "VALIDATION_ERROR",
                    "message": "Missing required field"
                }
            }), 400
        # Run the business logic
        result = process_custom_operation(data)
        return jsonify({
            "success": True,
            "data": result
        })
    except Exception as e:
        logger.error(f"Custom operation failed: {e}")
        return jsonify({
            "success": False,
            "error": {
                "code": "INTERNAL_ERROR",
                "message": str(e)
            }
        }), 500
```
2. **Add a WebSocket event** (`src/ui/websocket.py`):
```python
from flask_socketio import emit

# `socketio`, `validate_event_data` and `process_custom_event` are assumed to
# be defined elsewhere in the application.


@socketio.on('custom_event')
def handle_custom_event(data):
    """Handle the custom WebSocket event."""
    try:
        # Validate the event data
        if not validate_event_data(data):
            emit('error', {'message': 'Invalid event data'})
            return
        # Process the event
        result = process_custom_event(data)
        # Send the result
        emit('custom_result', {
            'success': True,
            'data': result
        })
    except Exception as e:
        emit('error', {
            'success': False,
            'error': str(e)
        })
```
---
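## Coding Standards
### Python Coding Standards
#### 1. Code Style
Follow PEP 8 and format code with Black:
```python
from datetime import datetime, timezone
from typing import Any, Dict


# A well-formed function definition
def process_verification_result(result: VerificationResult) -> Dict[str, Any]:
    """Process a verification result and return formatted data.

    Args:
        result: The verification result object.

    Returns:
        The formatted result dictionary.
    """
    if not result:
        return {"status": "error", "message": "No result provided"}
    return {
        "status": result.status,
        "execution_time": result.execution_time,
        "memory_used": result.memory_used,
        # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated)
        "timestamp": datetime.now(timezone.utc).isoformat()
    }
```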
#### 2. Type Annotations
All functions and public methods should have type annotations:
```python
from typing import List, Dict, Any, Optional, Union
from dataclasses import dataclass


@dataclass
class FunctionMetadata:
    name: str
    return_type: str
    parameters: List[Parameter]  # Parameter is assumed to be defined elsewhere
    complexity_score: float
    line_start: int
    line_end: int


def analyze_function_complexity(
    metadata: FunctionMetadata,
    options: Optional[Dict[str, Any]] = None
) -> float:
    """Analyze the complexity of a function.

    Args:
        metadata: The function metadata.
        options: Analysis options.

    Returns:
        The complexity score (0.0-1.0).
    """
    options = options or {}
    # Complexity analysis logic goes here
    return calculate_complexity(metadata)
```
#### 3. Error Handling
Use specific exception types and meaningful error messages:
```python
from typing import Any, Dict


class ValidationError(Exception):
    """Validation error."""


class ConfigurationError(Exception):
    """Configuration error."""


def validate_configuration(config: Dict[str, Any]) -> None:
    """Validate the configuration parameters.

    Args:
        config: The configuration dictionary.

    Raises:
        ValidationError: If configuration validation fails.
        ConfigurationError: If the configuration is malformed.
    """
    if not isinstance(config, dict):
        raise ConfigurationError("Configuration must be a dictionary")
    required_keys = ['llm', 'cbmc', 'web']
    for key in required_keys:
        if key not in config:
            raise ValidationError(f"Missing required configuration key: {key}")
    # Validate sub-sections
    if not isinstance(config.get('llm'), dict):
        raise ValidationError("LLM configuration must be a dictionary")
```
#### 4. Logging
Use structured logging:
```python
import logging
from typing import Any, Dict

import structlog

# Configure structured logging
structlog.configure(
    processors=[
        structlog.stdlib.filter_by_level,
        structlog.stdlib.add_logger_name,
        structlog.stdlib.add_log_level,
        structlog.stdlib.PositionalArgumentsFormatter(),
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.StackInfoRenderer(),
        structlog.processors.format_exc_info,
        structlog.processors.UnicodeDecoder(),
        structlog.processors.JSONRenderer()
    ],
    context_class=dict,
    logger_factory=structlog.stdlib.LoggerFactory(),
    wrapper_class=structlog.stdlib.BoundLogger,
    cache_logger_on_first_use=True,
)
logger = structlog.get_logger()


def process_verification_request(request_id: str, data: Dict[str, Any]) -> VerificationResult:
    """Process a verification request.

    Args:
        request_id: The request ID.
        data: The request data.

    Returns:
        The verification result.
    """
    logger.info(
        "Processing verification request",
        request_id=request_id,
        function_count=len(data.get('functions', [])),
        verification_types=data.get('verification_types', [])
    )
    try:
        result = perform_verification(data)
        logger.info(
            "Verification completed",
            request_id=request_id,
            status=result.status,
            execution_time=result.execution_time
        )
        return result
    except Exception as e:
        logger.error(
            "Verification failed",
            request_id=request_id,
            error=str(e),
            exc_info=True
        )
        raise
```
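### Documentation Standards
#### 1. Module Documentation
Every module should have a detailed docstring:
```python
"""
Mutation engine module.

Generates mutated versions of CBMC verification specifications to improve
verification coverage.

Main features:
- Generates several types of mutations
- Evaluates mutation quality
- Selects the best combination of mutations
- Supports custom mutation strategies

Usage example:
    engine = MutationEngine()
    mutations = engine.generate_mutations(
        specification,
        function_metadata,
        max_mutations=10
    )
"""


class MutationEngine:
    """CBMC mutation engine.

    Provides mutation generation with support for multiple mutation types
    and quality evaluation.

    Attributes:
        operators (CBMCMutationOperators): The set of mutation operators.
        selector (SpecSelector): The mutation selector.
        evaluator (QualityEvaluator): The quality evaluator.
        config (Dict): The engine configuration.
    """
```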
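#### 2. Function Documentation
Follow the Google or NumPy docstring format:
```python
def generate_mutations(
    self,
    specification: str,
    function_metadata: List[Dict[str, Any]],
    max_mutations: int = 10,
    mutation_types: Optional[List[MutationType]] = None,
    quality_threshold: float = 0.7
) -> List[CBMCMutation]:
    """Generate mutated versions of a verification specification.

    Generates several types of mutations from the original specification and
    the function metadata to improve verification coverage. Quality
    evaluation ensures the generated mutations are actually useful.

    Args:
        specification: The original CBMC verification specification string.
        function_metadata: List of function metadata, including function
            names, parameters, complexity, and so on.
        max_mutations: Maximum number of mutations. Defaults to 10.
        mutation_types: The mutation types to generate. None means all types.
        quality_threshold: Mutation quality threshold (0.0-1.0). Mutations
            scoring below it are discarded.

    Returns:
        The list of mutations, sorted by quality score.

    Raises:
        ValidationError: If the specification format is invalid.
        ConfigurationError: If the configuration is invalid.

    Example:
        >>> engine = MutationEngine()
        >>> metadata = [{"name": "test", "complexity_score": 0.5}]
        >>> mutations = engine.generate_mutations(
        ...     "void test() { }",
        ...     metadata,
        ...     max_mutations=5
        ... )
        >>> len(mutations) <= 5
        True
    """
```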
### Testing Standards
#### 1. Test File Layout
```
tests/
├── unit/                    # Unit tests
│   ├── test_parse.py
│   ├── test_mutate.py
│   └── test_verify.py
├── integration/             # Integration tests
│   ├── test_complete_pipeline.py
│   └── test_freertos_verification.py
├── performance/             # Performance tests
│   └── test_system_performance.py
└── regression/              # Regression tests
    └── test_regression_suite.py
```
#### 2. Writing Test Cases
```python
from typing import Dict, List

import pytest

from src.mutate.engine import MutationEngine
from src.mutate.mutation_types import MutationType


class TestMutationEngine:
    """Tests for the mutation engine."""

    @pytest.fixture
    def mutation_engine(self):
        """Create a mutation engine instance."""
        config = {"max_mutations": 5}
        return MutationEngine(config)

    @pytest.fixture
    def sample_specification(self):
        """Sample specification."""
        return "void test(int x) { __CPROVER_assume(x > 0); }"

    @pytest.fixture
    def sample_metadata(self):
        """Sample function metadata."""
        return [{
            "name": "test",
            "return_type": "void",
            "parameters": [{"name": "x", "type": "int"}],
            "complexity_score": 0.5
        }]

    def test_generate_mutations_basic(
        self,
        mutation_engine: MutationEngine,
        sample_specification: str,
        sample_metadata: List[Dict]
    ):
        """Test basic mutation generation.

        Checks that the engine generates mutations and returns results in
        the expected format.
        """
        mutations = mutation_engine.generate_mutations(
            sample_specification,
            sample_metadata,
            max_mutations=3
        )
        # Check basic properties
        assert isinstance(mutations, list)
        assert len(mutations) <= 3
        # Check the structure of each mutation object
        for mutation in mutations:
            assert hasattr(mutation, 'specification')
            assert hasattr(mutation, 'mutation_type')
            assert hasattr(mutation, 'confidence')
            assert mutation.specification is not None
            assert mutation.mutation_type is not None
            assert 0 <= mutation.confidence <= 1

    @pytest.mark.asyncio
    async def test_generate_mutations_async(
        self,
        mutation_engine: MutationEngine,
        sample_specification: str,
        sample_metadata: List[Dict]
    ):
        """Test asynchronous mutation generation.

        Checks that the engine's async method works.
        """
        mutations = await mutation_engine.generate_mutations_async(
            sample_specification,
            sample_metadata
        )
        assert isinstance(mutations, list)
        assert len(mutations) > 0

    def test_quality_threshold_filtering(
        self,
        mutation_engine: MutationEngine,
        sample_specification: str,
        sample_metadata: List[Dict]
    ):
        """Test quality threshold filtering.

        Checks that low-quality mutations are filtered out.
        """
        mutations = mutation_engine.generate_mutations(
            sample_specification,
            sample_metadata,
            quality_threshold=0.9  # High threshold
        )
        # Every mutation must meet the quality requirement
        for mutation in mutations:
            assert mutation.confidence >= 0.9

    def test_empty_specification_handling(
        self,
        mutation_engine: MutationEngine
    ):
        """Test handling of an empty specification.

        Checks that empty input is handled correctly.
        """
        mutations = mutation_engine.generate_mutations("", [], max_mutations=1)
        # This test expects a list; if the engine raises on empty input,
        # use pytest.raises instead
        assert isinstance(mutations, list)
```
---
## Testing Guide
### Test Framework
pytest is the main test framework:
```bash
# Run all tests
pytest
# Run a specific test file
pytest tests/unit/test_mutate.py
# Run a specific test class
pytest tests/unit/test_mutate.py::TestMutationEngine
# Run a specific test method
pytest tests/unit/test_mutate.py::TestMutationEngine::test_generate_mutations_basic
# Generate a coverage report
pytest --cov=src --cov-report=html --cov-report=term
# Run tests in parallel
pytest -n auto  # Requires the pytest-xdist plugin
```
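### Test Categories
#### 1. Unit Tests
Test a single function or class in isolation:
```python
def test_function_isolation(self):
    """Test a function on its own."""
    # Prepare the test data
    input_data = {"key": "value"}
    # Call the function under test
    result = function_to_test(input_data)
    # Check the result
    assert result == expected_result
```
#### 2. Integration Tests
Test the interaction between modules:
```python
import os
import tempfile


def test_complete_pipeline(self):
    """Test the complete verification pipeline."""
    # Prepare the test file (`test_code` holds the C source under test)
    with tempfile.NamedTemporaryFile(mode='w', suffix='.c', delete=False) as f:
        f.write(test_code)
        temp_file = f.name
    try:
        # Run the full pipeline
        result = run_verification_pipeline(temp_file)
        # Check the pipeline result
        assert result['success'] is True
        assert 'verifications' in result
        assert len(result['verifications']) > 0
    finally:
        os.unlink(temp_file)
```
#### 3. Performance Tests
Test the performance characteristics of the system:
```python
import time


def test_response_time(self):
    """Test the API response time."""
    start_time = time.perf_counter()
    # Run the operation
    result = api_call()
    end_time = time.perf_counter()
    response_time = end_time - start_time
    # Check the performance requirement
    assert response_time < 1.0, f"Response time {response_time:.2f}s is too slow"
```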
### Mocks and Stubs
Use `unittest.mock` to create test doubles:
```python
from unittest.mock import AsyncMock, patch

import pytest


def test_with_mock_llm(self):
    """Test with a mocked LLM service."""
    with patch('src.mutate.engine.LLMService') as mock_llm:
        # Configure the mock
        mock_llm.return_value.generate_spec.return_value = "mocked_spec"
        # Run the code under test
        result = mutation_engine.generate_spec(function_metadata)
        # Check that the mock was called as expected
        mock_llm.return_value.generate_spec.assert_called_once()
        assert result == "mocked_spec"


@pytest.mark.asyncio
async def test_async_with_mock(self):
    """Test with a mocked async method."""
    with patch('src.verify.cbmc_runner.CBMCRunner') as mock_runner:
        # Replace the method itself with an AsyncMock so it can be awaited
        mock_runner.return_value.run_verification = AsyncMock(
            return_value=mock_verification_result
        )
        runner = mock_runner()  # Instance of the patched class
        # Run the async call
        result = await runner.run_verification(
            function_metadata, source_file, specification
        )
        # Check the result
        assert result.status == "success"
```
### Test Data Management
#### 1. Test Fixtures
Use pytest fixtures to manage test data:
```python
import os
import tempfile

import pytest


# These fixtures live inside a test class, hence the `self` parameter.
@pytest.fixture
def sample_c_code(self):
    """Sample C code."""
    return """
int add(int a, int b) {
    return a + b;
}
"""


@pytest.fixture
def sample_function_metadata(self):
    """Sample function metadata."""
    return {
        "name": "add",
        "return_type": "int",
        "parameters": [
            {"name": "a", "type": "int"},
            {"name": "b", "type": "int"}
        ],
        "complexity_score": 0.2
    }


@pytest.fixture
def temporary_c_file(self, sample_c_code):
    """Temporary C file."""
    with tempfile.NamedTemporaryFile(mode='w', suffix='.c', delete=False) as f:
        f.write(sample_c_code)
        temp_file = f.name
    yield temp_file
    # Clean up
    os.unlink(temp_file)
```
#### 2. Parametrized Tests
Use parametrized tests to reduce code duplication:
```python
@pytest.mark.parametrize("input_code,expected_functions", [
    ("int f() { return 0; }", ["f"]),
    ("int f() { return 0; } int g() { return 1; }", ["f", "g"]),
    ("", []),  # Empty code
])
def test_function_extraction(self, input_code, expected_functions):
    """Parametrized test for function extraction."""
    parser = CodeParser()
    result = parser.parse_code(input_code)
    actual_functions = [f.name for f in result.functions]
    assert actual_functions == expected_functions
```
---
## Debugging Guide
### Debugging with Logs
#### 1. Configuring Debug Logging
```python
import functools
import logging

# Enable debug-level logging
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)


def debug_function_call(func):
    """Decorator that logs function calls."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.debug(f"Calling {func.__name__}: args={args}, kwargs={kwargs}")
        result = func(*args, **kwargs)
        logger.debug(f"{func.__name__} returned: {result}")
        return result
    return wrapper


@debug_function_call
def process_data(data):
    """Process the data."""
    return data * 2
```
#### 2. Using the Debugger
```python
import pdb


def debug_complex_function(data):
    """Debug a complex function."""
    result = []
    for item in data:
        # Break once more than five items have been processed
        if len(result) > 5:
            pdb.set_trace()  # Manual breakpoint
        processed_item = item * 2
        result.append(processed_item)
    return result
```
### Performance Debugging
#### 1. Profiling
```python
import cProfile
import pstats


def profile_function():
    """Profile the verification pipeline."""
    profiler = cProfile.Profile()
    profiler.enable()
    # Run the code to profile
    result = run_verification_pipeline("test.c")
    profiler.disable()
    stats = pstats.Stats(profiler).sort_stats('cumulative')
    stats.print_stats(10)  # Show the 10 most expensive functions
    return result
```
#### 2. Memory Debugging
```python
import tracemalloc


def debug_memory_usage():
    """Debug memory usage."""
    # Start tracing memory allocations
    tracemalloc.start()
    # Take a baseline snapshot before running the code
    snapshot1 = tracemalloc.take_snapshot()
    # Run the code
    result = memory_intensive_operation()
    # Take a second snapshot after the code has run
    snapshot2 = tracemalloc.take_snapshot()
    # Compare the snapshots
    top_stats = snapshot2.compare_to(snapshot1, 'lineno')
    print("[ Top 10 memory growth locations ]")
    for stat in top_stats[:10]:
        print(stat)
    tracemalloc.stop()
    return result
```
### Remote Debugging
#### 1. Configuring Remote Debugging
```python
import os

import debugpy


def enable_remote_debug():
    """Enable remote debugging."""
    debugpy.listen(5678)  # Listening port
    print("Waiting for the debugger to attach...")
    debugpy.wait_for_client()  # Block until the debugger attaches
    print("Debugger attached")


# Call this at application startup
if os.getenv('DEBUG_MODE') == 'true':
    enable_remote_debug()
```
#### 2. VS Code Remote Debugging Configuration
```json
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Remote Debug",
            "type": "python",
            "request": "attach",
            "connect": {
                "host": "localhost",
                "port": 5678
            },
            "pathMappings": [
                {
                    "localRoot": "${workspaceFolder}",
                    "remoteRoot": "."
                }
            ]
        }
    ]
}
```
---
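## Performance Optimization
### Code Optimization
#### 1. Algorithm Optimization
```python
# Before: O(n²) complexity
def find_duplicates_slow(items):
    """Find duplicates (slow version).

    Note: a duplicated item is appended once per matching pair, so it can
    appear several times in the result.
    """
    duplicates = []
    for i, item1 in enumerate(items):
        for j, item2 in enumerate(items):
            if i != j and item1 == item2:
                duplicates.append(item1)
    return duplicates


# After: O(n) complexity for hashable items
def find_duplicates_fast(items):
    """Find duplicates (fast version); each duplicate is reported once."""
    seen = set()
    reported = set()  # Set lookup keeps the membership check O(1)
    duplicates = []
    for item in items:
        if item in seen and item not in reported:
            duplicates.append(item)
            reported.add(item)
        seen.add(item)
    return duplicates
```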
#### 2. Caching
```python
from functools import lru_cache
import hashlib


# Use an LRU cache (arguments must be hashable)
@lru_cache(maxsize=1000)
def expensive_computation(input_data):
    """An expensive computation."""
    result = complex_calculation(input_data)
    return result


# Custom cache
class ComputationCache:
    def __init__(self, max_size=1000):
        self.cache = {}
        self.max_size = max_size

    def get(self, key):
        return self.cache.get(key)

    def set(self, key, value):
        if len(self.cache) >= self.max_size:
            # Evict the oldest entry (dicts preserve insertion order)
            oldest_key = next(iter(self.cache))
            del self.cache[oldest_key]
        self.cache[key] = value

    def compute(self, input_data):
        """Compute with caching."""
        cache_key = hashlib.md5(str(input_data).encode()).hexdigest()
        cached_result = self.get(cache_key)
        if cached_result is not None:
            return cached_result
        result = expensive_computation(input_data)
        self.set(cache_key, result)
        return result
```
### Concurrency Optimization
#### 1. Asynchronous I/O
```python
import asyncio

import aiohttp
import requests


# Synchronous version (slow)
def fetch_urls_sync(urls):
    """Fetch URL contents synchronously."""
    results = []
    for url in urls:
        response = requests.get(url)
        results.append(response.json())
    return results


# Asynchronous version (fast)
async def fetch_urls_async(urls):
    """Fetch URL contents asynchronously."""
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_single_url(session, url) for url in urls]
        results = await asyncio.gather(*tasks)
        return results


async def fetch_single_url(session, url):
    """Fetch a single URL."""
    async with session.get(url) as response:
        return await response.json()
```
#### 2. Thread Pools
```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed

logger = logging.getLogger(__name__)


def parallel_processing(items, max_workers=4):
    """Process items in parallel."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Submit all tasks
        futures = [executor.submit(process_item, item) for item in items]
        # Collect results in completion order, not submission order
        results = []
        for future in as_completed(futures):
            try:
                result = future.result()
                results.append(result)
            except Exception as e:
                logger.error(f"Processing failed: {e}")
        return results
```
### Memory Optimization
#### 1. Using Generators
```python
# Memory-intensive version
def process_large_file_list(file_paths):
    """Process a large list of files (memory intensive)."""
    results = []
    for file_path in file_paths:
        data = read_large_file(file_path)
        processed_data = process_data(data)
        results.append(processed_data)
    return results  # Returns all results at once


# Memory-friendly version
def process_large_file_generator(file_paths):
    """Process a large list of files (generator)."""
    for file_path in file_paths:
        data = read_large_file(file_path)
        processed_data = process_data(data)
        yield processed_data  # Yields results one at a time


# Using the generator
for result in process_large_file_generator(large_file_list):
    # Only one result is held in memory at a time
    handle_result(result)
```
#### 2. Object Pool Pattern
```python
import threading


class ObjectPool:
    """Object pool pattern."""

    def __init__(self, creator_func, max_size=10):
        self.creator_func = creator_func
        self.max_size = max_size
        self.pool = []
        self.lock = threading.Lock()

    def get(self):
        """Get an object from the pool, creating one if the pool is empty."""
        with self.lock:
            if self.pool:
                return self.pool.pop()
            return self.creator_func()

    def put(self, obj):
        """Return an object to the pool."""
        with self.lock:
            if len(self.pool) < self.max_size:
                # Reset the object's state
                if hasattr(obj, 'reset'):
                    obj.reset()
                self.pool.append(obj)


# Using the object pool
connection_pool = ObjectPool(create_db_connection, max_size=5)


def database_operation():
    """Run a database operation."""
    conn = connection_pool.get()
    try:
        result = execute_query(conn, "SELECT * FROM table")
        return result
    finally:
        connection_pool.put(conn)
```
---
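## Release Process
### Version Management
#### 1. Semantic Versioning
Follow the semantic versioning scheme: `MAJOR.MINOR.PATCH`
- **MAJOR**: Incompatible API changes
- **MINOR**: New functionality that is backward compatible
- **PATCH**: Bug fixes that are backward compatible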
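The bump rules can be expressed in a few lines. The sketch below uses a hypothetical `bump_version` helper that is not part of the project's tooling, and it assumes plain `MAJOR.MINOR.PATCH` strings with an optional leading `v` and no pre-release suffix.

```python
def bump_version(version: str, part: str) -> str:
    """Return the next version for a "major", "minor", or "patch" bump."""
    major, minor, patch = (int(piece) for piece in version.lstrip("v").split("."))
    if part == "major":
        # Incompatible change: reset MINOR and PATCH
        return f"{major + 1}.0.0"
    if part == "minor":
        # Backward-compatible feature: reset PATCH
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part}")


print(bump_version("1.2.3", "minor"))  # 1.3.0
```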
#### 2. Version Tags
```bash
# Create a version tag
git tag -a v1.2.3 -m "Version 1.2.3: Add new features and fix bugs"
# Push the tag
git push origin v1.2.3
# List tags
git tag -l
```
### Build and Test
#### 1. Automated Build
```bash
#!/bin/bash
# build.sh - build script
set -e  # Exit immediately on error
echo "=== Build started ==="
# 1. Install dependencies
echo "Installing dependencies..."
pip install -r requirements.txt
pip install -r requirements-dev.txt
# 2. Run tests
echo "Running tests..."
# Test the command directly: under `set -e`, a separate `$?` check would never run
if ! pytest --cov=src --cov-report=xml --cov-report=term; then
    echo "Tests failed, aborting build"
    exit 1
fi
# 3. Code quality checks
echo "Running code quality checks..."
flake8 src/ --max-line-length=100
black --check src/
mypy src/
# 4. Build the documentation
echo "Building documentation..."
cd docs/
make html
cd ..
# 5. Create the release packages
echo "Creating release packages..."
python setup.py sdist bdist_wheel
echo "=== Build finished ==="
```
#### 2. Docker Build
```dockerfile
# Dockerfile
FROM python:3.9-slim
# Install system dependencies
RUN apt-get update && apt-get install -y \
    cbmc \
    git \
    cmake \
    build-essential \
    && rm -rf /var/lib/apt/lists/*
# Set the working directory
WORKDIR /app
# Copy the dependency files
COPY requirements.txt .
COPY requirements-dev.txt .
# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt
# Copy the application code
COPY src/ ./src/
COPY config/ ./config/
# Create the log directory
RUN mkdir -p logs
# Set environment variables
ENV PYTHONPATH=/app
ENV LOG_LEVEL=INFO
# Expose the port
EXPOSE 8080
# Start command
CMD ["python", "src/ui/web_app.py"]
```
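### Release Checklist
#### 1. Code Quality
- [ ] All tests pass
- [ ] Code coverage > 80%
- [ ] Static analysis passes
- [ ] Type checking passes
- [ ] Security scan passes
#### 2. Documentation
- [ ] API documentation updated
- [ ] User manual updated
- [ ] Changelog written
- [ ] Release notes written
#### 3. Compatibility
- [ ] Backward compatibility tested
- [ ] Dependency versions compatible
- [ ] Configuration files compatible
- [ ] Database migrations tested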
### Deployment Process
#### 1. Rolling Update
```bash
#!/bin/bash
# deploy.sh - deployment script
set -e
APP_NAME="codedetect"
IMAGE_TAG="v1.2.3"
REGISTRY="your-registry"
echo "=== Deployment started ==="
# 1. Build and push the image
echo "Building Docker image..."
docker build -t ${REGISTRY}/${APP_NAME}:${IMAGE_TAG} .
docker push ${REGISTRY}/${APP_NAME}:${IMAGE_TAG}
# 2. Update the Kubernetes deployment
echo "Updating Kubernetes deployment..."
kubectl set image deployment/${APP_NAME} \
    ${APP_NAME}=${REGISTRY}/${APP_NAME}:${IMAGE_TAG}
# 3. Wait for the rollout to finish
echo "Waiting for rollout to finish..."
kubectl rollout status deployment/${APP_NAME} --timeout=300s
# 4. Health check
echo "Running health check..."
sleep 30
curl -f http://your-domain.com/api/health || exit 1
echo "=== Deployment finished ==="
```
#### 2. Rollback
```bash
#!/bin/bash
# rollback.sh - rollback script
set -e
APP_NAME="codedetect"
PREVIOUS_VERSION="v1.2.2"
echo "=== Rollback started ==="
# 1. Get the current version
CURRENT_VERSION=$(kubectl get deployment ${APP_NAME} -o jsonpath='{.spec.template.spec.containers[0].image}')
echo "Current version: ${CURRENT_VERSION}"
# 2. Roll back to the previous version
echo "Rolling back to version: ${PREVIOUS_VERSION}"
kubectl set image deployment/${APP_NAME} \
    ${APP_NAME}=your-registry/${APP_NAME}:${PREVIOUS_VERSION}
# 3. Wait for the rollback to finish
echo "Waiting for rollback to finish..."
kubectl rollout status deployment/${APP_NAME} --timeout=300s
# 4. Verify the rollback
echo "Verifying rollback..."
kubectl get deployment ${APP_NAME}
echo "=== Rollback finished ==="
```
---
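## Contributing Guide
### Contribution Workflow
#### 1. Fork and Clone
```bash
# Fork the project to your own account, then clone it
git clone https://github.com/your-username/codedetect.git
cd codedetect
```
#### 2. Create a Feature Branch
```bash
# Create a feature branch
git checkout -b feature/new-feature
# Or create a fix branch
git checkout -b fix/bug-fix
```
#### 3. Develop and Test
```bash
# Install development dependencies
pip install -r requirements-dev.txt
# Run the tests
pytest
# Format and lint the code
black src/
flake8 src/
# Type checking
mypy src/
```
#### 4. Commit Changes
```bash
# Stage the changes
git add .
# Commit the changes
git commit -m "feat: add new feature

Describe the content and purpose of the new feature in detail."
# Push to your fork
git push origin feature/new-feature
```
#### 5. Create a Pull Request
1. Create a pull request on GitHub
2. Fill in the PR template
3. Wait for code review
4. Update the code based on feedback
5. Merge into the main branch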
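### Code Review Checklist
#### Review Points
- [ ] Code follows the project coding standards
- [ ] Appropriate test cases are included
- [ ] Documentation is complete and accurate
- [ ] Performance impact has been assessed
- [ ] Security has been considered
- [ ] Error handling is thorough
- [ ] Backward compatibility is preserved
#### PR Template
```markdown
## Description
Briefly describe the purpose and content of this PR.
## Type of Change
- [ ] Bug fix
- [ ] New feature
- [ ] Documentation update
- [ ] Refactoring
- [ ] Performance optimization
- [ ] Test improvement
## Testing
Describe how to test this change.
## Related Issues
Link to the related GitHub Issues.
## Checklist
- [ ] Self-review completed
- [ ] All tests pass
- [ ] Documentation updated
- [ ] Changelog updated
```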
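### Release Cadence
#### Release Cycle
- **Patch releases**: Every 2 weeks, if needed
- **Minor releases**: Every 4-6 weeks
- **Major releases**: Every 3-6 months
#### Release Freeze
- A feature freeze begins 1 week before each release
- Only bug fixes and security updates are accepted during the freeze
- Final testing and documentation updates are completed
---
*Last updated: January 2024*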