Compare commits

34 Commits

Author SHA1 Message Date
p5tlr2yxg 751f752f8d Merge pull request 'hp branch: Lab 2 complete' (#5) from hp into master
10 hours ago
lc 98ac77cc73 Lab 2 script optimization
14 hours ago
lc 018a020aae Lab 2 tasks complete
14 hours ago
hp 5ce6b687a3 ./scripts/verify_ir.sh test/test_case/functional/09_func_defn.sy --run
2 days ago
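hp 3ad2437169 docs: update the task completion log for member 3 (hp)
2 days ago
hp fc08c12f40 feat: support const declarations and initialization
2 days ago
hp d8486b7313 feat: support function calls and argument passing
2 days ago
hp 4064885a1e feat: support function definitions with parameters
2 days ago
hp 6f63c4c7ba feat: support global variable declarations and initialization; update .gitignore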
2 days ago
hp fd4f3b5fa8 Merge branch 'master' into hp
1 week ago
hp 16ecb30384 lab1 testall
1 week ago
p5tlr2yxg 79756970ef Merge pull request 'lyy' (#3) from lyy into master
1 week ago
Oliveira 2de1561210 feat(irgen): implement code generation for control flow and logical expressions
1 week ago
Oliveira 5cc0bf587a Merge branch 'master' into lyy
1 week ago
Oliveira d2f49b71c5 gitignore add
1 week ago
p5tlr2yxg b740348401 Merge pull request 'lc' (#2) from lc into master
1 week ago
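lc 8ba3b01271 "Partial IRGen implementation, task 1 complete"
1 week ago
lc 8297700496 Partial IRGen implementation: support assignment expressions
1 week ago
lc b1477851c6 "Partial IRGen implementation: support unary operators (plus and minus signs)"
1 week ago
lc fe3a3410a6 "Partial IRGen implementation: support more binary operators (Sub, Mul, Div, Mod)"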
1 week ago
Oliveira 02e5a7d4e7 merge error
1 week ago
Oliveira c858f75d9a Merge branch 'master' into lyy
1 week ago
lc 83b6f17c78 lab2 partial IRGen implementation 2
1 week ago
lc a85898b35a lab2 partial IRGen implementation
1 week ago
olivame b3230ec6d5 Merge branch 'dyz'
2 weeks ago
olivame 304599b17b chore: add sema checker source and reference pdf
2 weeks ago
olivame 3e066b8375 docs(test): add lab progress notes and sema test cases
2 weeks ago
olivame 3342955abb lab2: rebuild semantic analysis for current SysY grammar
2 weeks ago
olivame 7dd139671b fix(frontend): tighten grammar to match sysy2022
2 weeks ago
hp 15cbb37435 lab1
2 weeks ago
Oliveira 3e4165c4bb feat(grammar): complete the SysY.g4 grammar definition to meet Lab1 requirements
2 weeks ago
p5tlr2yxg d5469f93f6 Tests pass
2 weeks ago
Ethereal 18366d3cc8 Branch test
2 weeks ago
olivame 3d0361e648 feat(frontend): complete lab1 SysY grammar support
2 weeks ago

2
.gitignore vendored

@ -52,6 +52,7 @@ compile_commands.json
.idea/
.fleet/
.vs/
.trae/
*.code-workspace
# CLion
@ -68,3 +69,4 @@ Thumbs.db
# Project outputs
# =========================
test/test_result/
sema_check

@ -20,6 +20,10 @@
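If you want to look at other compiler projects and strong implementations from previous years, see the technical support section of the compiler competition website: <https://compiler.educg.net/#/index?TYPE=26COM>. Its "Recommended for preparation" (“备赛推荐”) page collects compiler-related projects and links to open-source implementations of outstanding past entries, all of which are worth consulting.
The repository also includes an overview document covering the current implementation status and test entry points, to help the team stay in sync:
- `doc/实验进度与测试方法.md`
## 3. Collaboration workflow on the EduCoder (头歌) platform
The EduCoder platform hosts code much like GitHub/Gitee. If you want to start collaborating quickly based on the current repository, you can follow the workflow below.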

@ -0,0 +1,10 @@
#!/bin/bash
mkdir -p build/generated/antlr4
java -jar third_party/antlr-4.13.2-complete.jar \
-Dlanguage=Cpp \
-visitor -no-listener \
-Xexact-output-dir \
-o build/generated/antlr4 \
src/antlr4/SysY.g4
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DCOMPILER_PARSE_ONLY=ON
cmake --build build -j "$(nproc)"

@ -0,0 +1,137 @@
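### Member 1: basic expression and assignment support (lc, done)
- Task 1.1: support more binary operators (Sub, Mul, Div, Mod)
- Task 1.2: support unary operators (plus and minus signs)
- Task 1.3: support assignment expressions
- Task 1.4: support comma-separated declarations of multiple variables
### Member 2: control-flow support (lyy, done)
- Task 2.1: support if-else conditional statements
- Task 2.2: support while loops
- Task 2.3: support break/continue statements
- Task 2.4: support comparison and logical expressions
### Member 3: function and global variable support
- Task 3.1: support global variable declarations and initialization
- Task 3.2: support function parameter handling
- Task 3.3: support function call generation
- Task 3.4: support const declarations
## Member 1: detailed completion report (updated 2026-03-30)
### ✅ Completed tasks
Member 1 has fully implemented the basic functional modules of Lab2 IR generation, including:
1. **Binary operators** (task 1.1)
- Implemented the four operators `Sub`, `Mul`, `Div`, and `Mod`
- Files modified: `include/ir/IR.h`, `src/ir/IRBuilder.cpp`, `src/ir/IRPrinter.cpp`, `src/irgen/IRGenExp.cpp`
2. **Unary operators** (task 1.2)
- Implemented the sign operators (`+`, `-`)
- Added a `UnaryInst` class to support unary instructions
- Negation generates a `sub 0, x` instruction (the standard LLVM IR form)
3. **Assignment expressions** (task 1.3)
- Implemented IR generation for variable assignment statements
- Files modified: `src/irgen/IRGenStmt.cpp`
4. **Multi-variable declarations** (task 1.4)
- Supports comma-separated variable declarations (e.g. `int a, b, c;`)
- Supports multi-variable declarations with initializers (e.g. `int a = 1, b = 2;`)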
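The negation rule in task 1.2 (`-x` becomes `sub 0, x`) can be sketched as a small text emitter. This is a minimal illustration under stated assumptions: `LowerUnary` and its string-based interface are hypothetical and are not the project's `IRBuilder::CreateNeg`; only `i32` operands are handled.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper (not the project's IRBuilder API): lowers a unary
// operator applied to an i32 operand into one line of LLVM-style textual IR.
// `result` receives the name of the value that holds the outcome.
std::string LowerUnary(char op, const std::string& operand,
                       const std::string& dest, std::string& result) {
  assert(!operand.empty() && !dest.empty());
  switch (op) {
    case '+':
      result = operand;  // +x is just x, so no instruction is emitted
      return "";
    case '-':
      result = dest;     // -x is computed as 0 - x
      return dest + " = sub i32 0, " + operand;
    default:
      throw std::invalid_argument("unsupported unary operator");
  }
}
```

Unary plus deliberately emits nothing, so callers can reuse the operand directly.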
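### 🧪 Test verification
- **Lab1 syntax analysis**: ✅ passed (10/11 functional tests; the 1 array test is out of scope)
- **Lab2 semantic analysis**: ✅ passed (6 positive + 4 negative cases)
- **IR generation tests**: ✅ passed (7/7 custom test cases)
- Test script: `./scripts/test_lab2_ir1.sh`
- Test case directory: `test/test_case/irgen_lab1_4/`
### 📝 Code quality
- All changes compile
- Existing Lab1 and Lab2 Sema functionality is unaffected
- Code style is consistent with the project
- Key functions have explanatory comments
### 🔄 Collaboration interfaces
Member 1's implementation provides the following interfaces for later tasks:
- **Expression generation**: `visitAddExp`, `visitMulExp`, `visitUnaryExp`
- **Statement generation**: `visitStmt` (supports assignment and return)
- **Variable management**: `storage_map_` maintains the mapping from variable names to stack slots
- **IR construction**: `IRBuilder::CreateBinary`, `IRBuilder::CreateNeg`, `IRBuilder::CreateStore`
Later members can build more complex features (control flow, function calls, etc.) on this foundation.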
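## Member 2: detailed completion report (updated 2026-03-31)
### ✅ Completed tasks
Member 2 has fully implemented the control-flow support required by Lab2 IR generation, including:
1. **IR structure and low-level helper extensions**
- Added the `Int1` basic type and `Value::IsInt1()`.
- Added `CmpInst`, `ZextInst`, `BranchInst`, and `CondBranchInst` to support relational computation and jump logic.
- Added convenience interfaces in `IRBuilder` for creating these instructions, adapted `IRPrinter` accordingly, and fixed the duplicated `%%` in block names printed by `IRPrinter`.
- Changed `Context::NextTemp` to name temporaries with a `%t` prefix, which fixes `llc` rejecting purely numeric temporaries that were not defined in sequential order.
2. **Comparison and logical expressions** (task 2.4)
- Implemented `visitRelExp` and `visitEqExp`.
- Implemented end-to-end short-circuit evaluation for conditional binary expressions (`visitLAndExp`, `visitLOrExp`). The result is passed through a local stack variable that is assigned on each control-flow path, which avoids the need for `phi`.
- Added logical NOT (`!`) handling via `visitCondUnaryExp`.
3. **Control-flow framework support** (tasks 2.1 - 2.3)
- Implemented `if-else` statements (automatically inserting an unconditional jump to the merge block) and `while` loops in `visitStmt`.
- Implemented `break` and `continue` by maintaining the loop stack in the `IRGen` instance through `current_loop_cond_bb_` and related fields.
- Fixed a bug in the earlier framework where `visitBlock` in `IRGenDecl.cpp` did not propagate block termination upward, which made `break` generate mismatched dead blocks and duplicate `Branch` instructions.
4. **Key upstream bug fixes**
- Found and fixed a problem in the original framework's `src/sem/Sema.cpp`: during AST traversal, a left-recursive rule under `RelExp` and `EqExp` was not visited when tracing variables that are not direct primitives, which caused a `null_ptr` error (`变量使用缺少语义绑定a`, "variable use is missing a semantic binding: a").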
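The stack-slot approach to short-circuit evaluation described under task 2.4, together with `%t`-prefixed temporaries, can be sketched as follows. This is an illustration only: `Emitter`, `EmitLogicalAnd`, and the `i1` slot type are assumptions made for the example, and the project's `visitLAndExp` may differ in detail.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical text emitter; all names here are illustrative.
struct Emitter {
  std::vector<std::string> lines;
  int temp_index = -1;
  int block_index = 0;

  // Named temporaries (%t0, %t1, ...) sidestep LLVM's requirement that
  // purely numeric value names be defined in sequential order.
  std::string NextTemp() { return "%t" + std::to_string(++temp_index); }
  std::string NextBlock(const std::string& prefix) {
    return prefix + "_" + std::to_string(++block_index);
  }
  void Emit(const std::string& line) { lines.push_back(line); }
};

// Emits `lhs && rhs` with short-circuit evaluation. The result lives in a
// stack slot that is written on each path and loaded once in the join block,
// so no phi node is needed. `gen_rhs` emits the right operand and returns
// the name of its i1 value; its code lands only in the block reached when
// lhs is true.
std::string EmitLogicalAnd(Emitter& e, const std::string& lhs,
                           const std::function<std::string()>& gen_rhs) {
  const std::string slot = e.NextTemp();
  const std::string rhs_bb = e.NextBlock("and_rhs");
  const std::string end_bb = e.NextBlock("and_end");

  e.Emit(slot + " = alloca i1");
  e.Emit("store i1 false, i1* " + slot);  // default result when lhs is false
  e.Emit("br i1 " + lhs + ", label %" + rhs_bb + ", label %" + end_bb);

  e.Emit(rhs_bb + ":");
  const std::string rhs = gen_rhs();
  e.Emit("store i1 " + rhs + ", i1* " + slot);
  e.Emit("br label %" + end_bb);

  e.Emit(end_bb + ":");
  const std::string result = e.NextTemp();
  e.Emit(result + " = load i1, i1* " + slot);
  return result;
}
```

`||` follows the same shape with the default store set to `true` and the branch targets swapped.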
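### 🧪 Test verification
- **Lab2 semantic analysis**: after the fix, all existing positive semantic cases verify correctly.
- **IR generation and back-end execution**: ✅ custom test scripts with nested loops and compound logic pass.
- **Verification command** (runs a sample file containing break and while):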
```bash
cd build && make -j$(nproc) && cd .. && ./scripts/verify_ir.sh test/test_case/functional/29_break.sy --run
```
**Full test script**:
```bash
for f in test/test_case/functional/*.sy; do echo "Testing $f..."; ./scripts/verify_ir.sh "$f" --run > /dev/null || echo "FAILED $f"; done
```
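## Member 3: detailed completion report (updated 2026-04-06)
### ✅ Completed tasks
Member 3 (hp) has fully implemented the function and constant extensions of Lab2 IR generation, including:
1. **Global variable declarations and initialization** (task 3.1)
- `IRGenDecl.cpp` distinguishes global from local scope by checking `func_ == nullptr`.
- Added floating-point support such as `Float` / `PtrFloat` and `ConstantFloat`, and added the `GlobalVariable` derived class.
- Calls `module_.CreateGlobalVariable` to handle integer and floating-point global initialization, and records the result in `storage_map_`.
2. **Function parameter handling** (task 3.2)
- Added an `Argument` class to the `Value` hierarchy in `IR.h`.
- Implemented `funcFParams` handling in `IRGenFunc.cpp`.
- In the entry block, each parameter gets an `alloca` stack slot; the incoming value is written with `store`, and the slot is bound in `storage_map_` for reads inside the function.
3. **Function call generation** (task 3.3)
- Added `Opcode::Call`, `CallInst`, and its printing logic in `IR.h` and `IRBuilder.cpp`.
- Added `funcCallExp` handling in `IRGenExp.cpp` (`visitUnaryExp`).
- Evaluates all actual argument expressions (`funcRParams`) and then generates the `call` instruction; for library functions, builds placeholder signatures based on `Sema`.
4. **const declarations** (task 3.4)
- Added `visitConstDecl` and `visitConstDef` in `IRGenDecl.cpp`.
- Maintains a separate `const_values_` map that records `ConstantValue*`.
- In `visitLVal`, if the name refers to a defined constant, the constant value is embedded directly (folded), which avoids a memory `load`.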
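The constant folding in task 3.4 can be sketched as below: reading a name yields a literal when it is bound to a constant, and a `load` from its stack slot otherwise. This is a simplified stand-in, not the project's `visitLVal`. `LValReader` and `Binding` are hypothetical, only `int` constants are modeled, and a single scope stack is used here (rather than separate storage and constant maps) so that an inner variable can shadow an outer constant.

```cpp
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical binding record: either a compile-time constant or a stack slot.
struct Binding {
  bool is_const;
  int const_value;   // meaningful only when is_const is true
  std::string slot;  // meaningful only when is_const is false, e.g. "%x.addr"
};

class LValReader {
 public:
  LValReader() { EnterScope(); }  // start with the global scope
  void EnterScope() { scopes_.emplace_back(); }
  void ExitScope() { scopes_.pop_back(); }

  void DeclareConst(const std::string& name, int value) {
    scopes_.back()[name] = Binding{true, value, ""};
  }
  void DeclareVar(const std::string& name, const std::string& slot) {
    scopes_.back()[name] = Binding{false, 0, slot};
  }

  // Returns the operand text for reading `name`, searching the innermost
  // scope first. Constants fold to a literal; variables emit a load.
  std::string Read(const std::string& name) {
    for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
      auto found = it->find(name);
      if (found == it->end()) continue;
      if (found->second.is_const) {
        return std::to_string(found->second.const_value);  // no load needed
      }
      const std::string tmp = "%t" + std::to_string(++temp_index_);
      lines_.push_back(tmp + " = load i32, i32* " + found->second.slot);
      return tmp;
    }
    throw std::runtime_error("undeclared identifier: " + name);
  }

  const std::vector<std::string>& lines() const { return lines_; }

 private:
  std::vector<std::unordered_map<std::string, Binding>> scopes_;
  std::vector<std::string> lines_;
  int temp_index_ = -1;
};
```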
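### 🧪 Test verification
- **Global/local variable and constant reference tests**: ✅ IR output is correct (data is obtained through `storage_map_` and `const_values_`).
- **Parameter passing and function call chain tests**: ✅ samples with multi-parameter functions (with return values) and calls to the external `putint` generate well-formed LLVM IR and run correctly.
- **Integration test**: ✅ merges cleanly with the earlier work of members 1 and 2 and passes, confirming that control flow, arithmetic, and function calls are compatible.
### 🔄 Collaboration interfaces
Member 3's implementation establishes the following conventions for globals and the call chain:
- **Constant-folding access**: introduces the `const_values_` map, which lets an lvalue in the expression tree fold directly into a literal constant at compile time.
- **Parameter stack model**: unifies the convention for function stack variables (every incoming parameter is handled as an Alloca stack allocation), which gives later labs a stable basis for simple, consistent register/stack mapping in the back end and for data-flow analyses such as dead code elimination.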
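The parameter stack model above amounts to a fixed prologue in the entry block: one `alloca` and one `store` per parameter. A minimal sketch, assuming every parameter is an `i32` and using a hypothetical `.addr` naming scheme for slots (`EmitParamPrologue` is not a function from the project):

```cpp
#include <string>
#include <vector>

// Hypothetical helper: for each i32 parameter, reserve a stack slot and
// spill the incoming argument into it. Later reads of the parameter then go
// through a load from the slot, the same way as for local variables.
std::vector<std::string> EmitParamPrologue(
    const std::vector<std::string>& params) {
  std::vector<std::string> lines;
  for (const std::string& p : params) {
    const std::string slot = "%" + p + ".addr";  // assumed naming scheme
    lines.push_back(slot + " = alloca i32");
    lines.push_back("store i32 %" + p + ", i32* " + slot);
  }
  return lines;
}
```

Treating parameters like locals keeps the generator simple; a later `Mem2Reg`-style pass could promote the slots back to registers.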

@ -0,0 +1,436 @@
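# Lab progress and test methods
## 1. Current lab progress
This document records the implementation status of each Lab in the current repository, along with the corresponding test and verification methods.
Note: the repository is still at the "course sample framework + incremental completion" stage; it is not a compiler that fully implements SysY semantics.
### 1.1 Lab1 current progress
Lab1 covers front-end syntax analysis and parse tree construction.
Current status:
- `SysY.g4`, the ANTLR driver, and parse tree printing are provided.
- `--emit-parse-tree` outputs the parse tree.
- The front end can be built on its own in `parse-only` mode, without depending on `sem` / `irgen` / `mir`.
### 1.2 Lab2 current progress
Lab2 covers "parse tree -> semantic checking -> IR".
The current status falls into two parts:
1. `Sema`
- A basic semantic checker for the current SysY grammar is complete.
- Supports nested scopes, variable/constant redefinition checks, and declare-before-use.
- Supports function symbol collection, function call checking, and `main` entry checking.
- Supports checking where `break` / `continue` are used.
- Supports checking that `return` matches the function's return type.
- Supports `const` constant-expression evaluation, array dimension checks, and constness checks for global initializers.
- Supports basic type checking for `int/float` scalar expressions, comparisons, and logical expressions.
- Has built-in declarations for common runtime library functions such as `getint`, `putch`, `getfloat`, `getarray`, and `putarray`.
2. `IRGen`
- The repository's original `IRGen` is still the minimal sample version.
- It currently only suits a very small subset: local `int` variables + constants + simple expressions + `return`.
- Because the grammar has been extended while `IRGen` has not fully caught up, Lab2 has so far **only completed the first half (basic Sema extensions)**.
- The IR generation part of Lab2 still needs to be completed.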
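### 1.3 Lab3 current progress
Lab3 covers "IR -> MIR -> assembly".
Current status:
- The repository keeps a minimal back-end pipeline.
- It can only consume the current minimal IR subset.
- It cannot yet reliably generate assembly for full SysY programs.
### 1.4 Lab4-Lab6 current progress
The repository already reserves:
- The directory structure for IR analyses and passes
- File skeletons for `Mem2Reg`, `ConstFold`, `ConstProp`, `DCE`, `CSE`, `CFGSimplify`, and others
- Entry points for loop analysis, dominator trees, back-end optimization, and other labs
Whether these stages are "complete" depends on what you fill in later; do not assume the repository already implements them.
## 2. Recommended testing approach
Split testing into three layers:
1. `Single-stage verification`
- Verify only that one stage works, e.g. parse only, sema only, or IR output only.
2. `Pipeline verification`
- Go from source all the way to IR or assembly, run the program, and compare against `.out`.
3. `Batch regression`
- Run many tests under `test/test_case` together, rather than judging completeness from `simple_add.sy` alone.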
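## 3. Recommended build steps after pulling the current implementation
If other teammates pull the current repository, prepare the environment and build in the order below.
### 3.1 Generate the ANTLR output first
The repository's CMake collects ANTLR-generated files from the build directory but does not invoke ANTLR automatically, so run the following before the first build: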
```bash
mkdir -p build/generated/antlr4
java -jar third_party/antlr-4.13.2-complete.jar \
-Dlanguage=Cpp \
-visitor -no-listener \
-Xexact-output-dir \
-o build/generated/antlr4 \
src/antlr4/SysY.g4
```
### 3.2 To verify Lab1 only
Build only the parse-only front end:
```bash
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DCOMPILER_PARSE_ONLY=ON
cmake --build build -j "$(nproc)"
```
After building, run:
```bash
./scripts/test_lab1.sh test/test_case/functional
```
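### 3.3 To verify the Sema part of Lab2
Because `IRGen` in the current repository has not fully caught up with the new grammar, and this round of work mainly completed `Sema`, we recommend a separate `build-sema/` directory for verifying semantic checking.
Recommended commands: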
```bash
cmake -S . -B build-sema -DCMAKE_BUILD_TYPE=Release -DCOMPILER_PARSE_ONLY=OFF
mkdir -p build-sema/generated
cp -r build/generated/antlr4 build-sema/generated/
cmake --build build-sema --target frontend utils sem -j "$(nproc)"
```
Then compile `sema_check`:
```bash
g++ -std=c++17 \
-Iinclude \
-Isrc \
-Ibuild-sema/generated/antlr4 \
-Ithird_party/antlr4-runtime-4.13.2/runtime/src \
tools/sema_check.cpp \
build-sema/src/sem/libsem.a \
build-sema/src/frontend/libfrontend.a \
build-sema/src/utils/libutils.a \
build-sema/libantlr4_runtime.a \
-pthread \
-o build-sema/sema_check
```
Once that is done, run:
```bash
./scripts/test_lab2_sema.sh positive
./scripts/test_lab2_sema.sh negative
```
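Notes:
- `build/` is mainly used for Lab1 parse-only builds or later full builds
- `build-sema/` is mainly used to verify `Sema` on its own at this stage
- `scripts/test_lab2_sema.sh` depends on `./build-sema/sema_check`
### 3.4 For a full build later
Once `IRGen` is fully in sync with the grammar, do a full build directly: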
```bash
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DCOMPILER_PARSE_ONLY=OFF
cmake --build build -j "$(nproc)"
```
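At this stage, however, do not treat "the full build succeeds" as the only criterion for verifying `Sema`, because what Lab2 has completed so far is the semantic-analysis first half, not the whole of IR generation.
## 4. Lab1 test methods
### 4.1 Build commands
Generate the ANTLR output first: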
```bash
mkdir -p build/generated/antlr4
java -jar third_party/antlr-4.13.2-complete.jar \
-Dlanguage=Cpp \
-visitor -no-listener \
-Xexact-output-dir \
-o build/generated/antlr4 \
src/antlr4/SysY.g4
```
Then build in `parse-only` mode:
```bash
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DCOMPILER_PARSE_ONLY=ON
cmake --build build -j "$(nproc)"
```
### 4.2 Single-case test
```bash
./build/bin/compiler --emit-parse-tree test/test_case/functional/simple_add.sy
```
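### 4.3 Batch test
The repository provides a batch parse test script. To keep the terminal from being flooded with parse trees, the script writes each case's parse tree to its own log file.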
```bash
./scripts/test_lab1.sh test/test_case/functional
```
To specify a log directory, use:
```bash
./scripts/test_lab1.sh test/test_case/functional test/test_result/lab1_parse_logs
```
The terminal shows output like:
```text
TEST test/test_case/functional/simple_add.sy -> test/test_result/lab1_parse_logs/simple_add.parse.log
...
ALL_PARSE_OK (...) logs: test/test_result/lab1_parse_logs
```
This means every `.sy` file in the test directory passes syntax analysis; the parse tree for each case is in the corresponding `.parse.log` file.
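The log naming visible in the sample output (`simple_add.sy` maps to `simple_add.parse.log` inside the log directory) can be expressed as a small helper. This sketch only mirrors the naming shown in that output; `ParseLogPath` is a hypothetical function, and how `test_lab1.sh` actually derives the path is not shown in this document.

```cpp
#include <string>

// Hypothetical helper: <log_dir>/<file name without ".sy">.parse.log
std::string ParseLogPath(const std::string& case_path,
                         const std::string& log_dir) {
  // Keep only the file name component of the test case path.
  const std::size_t slash = case_path.find_last_of('/');
  std::string stem = (slash == std::string::npos)
                         ? case_path
                         : case_path.substr(slash + 1);
  // Drop a trailing ".sy" extension if present.
  const std::string ext = ".sy";
  if (stem.size() >= ext.size() &&
      stem.compare(stem.size() - ext.size(), ext.size(), ext) == 0) {
    stem.erase(stem.size() - ext.size());
  }
  return log_dir + "/" + stem + ".parse.log";
}
```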
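## 5. Lab2 test methods
Lab2 testing is best split into two parts: `Sema` and `IRGen`.
### 5.1 Test Sema first for now
Because `IRGen` in the current repository has not fully caught up with the new grammar, the best way at this stage to show that the first half of Lab2 is implemented is through semantic checking.
#### 5.1.1 Positive cases verified so far
The following command runs the test cases that currently serve as positive samples for `Sema`: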
```bash
./scripts/test_lab2_sema.sh positive
```
To specify a log directory, use:
```bash
./scripts/test_lab2_sema.sh positive test/test_result/lab2_sema_positive_logs
```
Expected behavior:
- The terminal prints `TEST ... -> ...` for each case
- After all cases pass, it prints `ALL_SEMA_POSITIVE_OK (...)`
- Detailed output is written to `*.sema.log`
#### 5.1.2 Negative cases available for demonstration
The prepared negative cases are:
- `test/test_case/sema_negative/undef.sy`
- `test/test_case/sema_negative/break.sy`
- `test/test_case/sema_negative/ret.sy`
- `test/test_case/sema_negative/call.sy`
Run:
```bash
./scripts/test_lab2_sema.sh negative
```
To specify a log directory, use:
```bash
./scripts/test_lab2_sema.sh negative test/test_result/lab2_sema_negative_logs
```
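Expected behavior:
- The terminal prints `TEST ... -> ...` for each case
- After all cases match expectations, it prints `ALL_SEMA_NEGATIVE_OK (...)`
- The detailed error message for each negative case is written to the corresponding `.sema.log`
For example:
- Use of an undeclared variable
- `break` outside a loop
- A `void` function returning a value
- Mismatched number of function arguments
#### 5.1.3 Semantic error location format
The `@line:column` part of a semantic error message marks the error position.
For example: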
```text
[error] [sema] @1:19 - 使用了未声明的标识符: a
```
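where:
- `1` is line 1
- `19` is column 19
That is, the error is near line 1, column 19 of the source, which makes it quick to locate. The message text means "use of an undeclared identifier: a".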
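If the logs need to be post-processed, the `@line:column` marker can be extracted mechanically. A minimal sketch, assuming the position always appears as `@<line>:<column>` at the first `@` in the message; `SourceLoc` and `ParseLoc` are hypothetical names, not part of the repository.

```cpp
#include <cstdio>
#include <string>

struct SourceLoc {
  int line;
  int column;
};

// Extracts the "@line:column" position from a diagnostic such as
//   [error] [sema] @1:19 - ...
// Returns false (and leaves `out` untouched) if no position is found.
bool ParseLoc(const std::string& diagnostic, SourceLoc& out) {
  const std::size_t at = diagnostic.find('@');
  if (at == std::string::npos) return false;
  int line = 0;
  int column = 0;
  if (std::sscanf(diagnostic.c_str() + at, "@%d:%d", &line, &column) != 2) {
    return false;
  }
  out.line = line;
  out.column = column;
  return true;
}
```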
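#### 5.1.4 Main error types Sema currently covers
Typical errors detected so far:
- Use of an undeclared identifier
- Redefinition in the same scope
- Function redefinition
- Missing a valid `main`
- Mismatched function argument count or types
- `break/continue` outside a loop
- `return` not matching the function's return type
- Assignment to a `const` object
- Invalid array dimensions
- Global initializer that is not a compile-time constant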
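The first two checks in the list (undeclared use, same-scope redefinition) both fall out of a scoped symbol table. The sketch below follows the method names of the `SymbolTable` class in this change set (`EnterScope`, `ExitScope`, `Add`, `ContainsInCurrentScope`, `Lookup`), but it is a simplified stand-in: it stores only a declaration line per name instead of an `ObjectBinding`, and its internals are assumptions.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Simplified stand-in: maps each name to the line where it was declared.
class ScopedTable {
 public:
  ScopedTable() { EnterScope(); }  // the global scope always exists
  void EnterScope() { scopes_.emplace_back(); }
  void ExitScope() {
    if (scopes_.size() > 1) scopes_.pop_back();
  }

  // Returns false on a same-scope redefinition. Shadowing a name from an
  // outer scope is allowed and returns true.
  bool Add(const std::string& name, int decl_line) {
    return scopes_.back().emplace(name, decl_line).second;
  }

  bool ContainsInCurrentScope(const std::string& name) const {
    return scopes_.back().count(name) != 0;
  }

  // Searches from the innermost scope outward; nullptr means "undeclared".
  const int* Lookup(const std::string& name) const {
    for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
      auto found = it->find(name);
      if (found != it->end()) return &found->second;
    }
    return nullptr;
  }

 private:
  std::vector<std::unordered_map<std::string, int>> scopes_;
};
```

A checker would report "redefinition" when `Add` returns false and "undeclared identifier" when `Lookup` returns `nullptr`.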
### 5.2 IR testing for Lab2 later
Once `IRGen` is aligned with the current grammar, output IR with:
```bash
./build/bin/compiler --emit-ir test/test_case/functional/simple_add.sy
```
To further verify the "IR -> executable" pipeline, use:
```bash
./scripts/verify_ir.sh test/test_case/functional/simple_add.sy test/test_result/ir --run
```
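Note, however:
In the repository's current state, this command is only suitable for testing after IRGen is complete; it cannot be used to demonstrate the completed `Sema` part.
## 6. Lab3 test methods
Lab3 covers assembly output and the back-end pipeline.
### 6.1 Build
A full build is required: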
```bash
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DCOMPILER_PARSE_ONLY=OFF
cmake --build build -j "$(nproc)"
```
### 6.2 Emit assembly for a single case
```bash
./build/bin/compiler --emit-asm test/test_case/functional/simple_add.sy
```
### 6.3 Assembly pipeline verification
```bash
./scripts/verify_asm.sh test/test_case/functional/simple_add.sy test/test_result/asm --run
```
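In `--run` mode the script:
1. Generates assembly
2. Cross-compiles it into an AArch64 executable
3. Runs it with `qemu-aarch64`
4. Compares the output against the `.out` file of the same name
## 7. Lab4 test methods
Lab4 is the optimization lab. Testing is not only about whether the program runs, but also about whether optimization preserves semantics.
Verify in this order:
1. First make sure the unoptimized version is functionally correct
2. After enabling optimizations, run `verify_ir.sh` and `verify_asm.sh` again
3. Compare the IR or assembly output before and after optimization
4. Run regression on multiple tests, so that an optimization does not merely appear fine on `simple_add`
Recommended commands: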
```bash
./scripts/verify_ir.sh test/test_case/functional/simple_add.sy test/test_result/ir --run
./scripts/verify_asm.sh test/test_case/functional/simple_add.sy test/test_result/asm --run
```
If you implemented separate switches for optimizations, also compare:
```bash
./build/bin/compiler --emit-ir test/test_case/functional/simple_add.sy
./build/bin/compiler --emit-asm test/test_case/functional/simple_add.sy
```
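## 8. Lab5 test methods
Lab5 testing focuses on:
- Code remains correct after register allocation
- Spill/reload logic does not break semantics
- The assembly still runs end to end
Go straight through the full back-end pipeline: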
```bash
./scripts/verify_asm.sh test/test_case/functional/simple_add.sy test/test_result/asm --run
```
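After completing register allocation, do not test only a single case; cover at least:
- `functional/`
- Several of the larger cases in `performance/`
## 9. Lab6 test methods
Lab6 focuses on loop- and parallelism-related optimizations; split testing into functional correctness and optimization benefit.
### 9.1 Functional correctness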
```bash
./scripts/verify_ir.sh test/test_case/functional/simple_add.sy test/test_result/ir --run
./scripts/verify_asm.sh test/test_case/functional/simple_add.sy test/test_result/asm --run
```
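### 9.2 Observing optimization effects
Compare the following before and after optimization:
- IR output
- Assembly output
- Execution time
- Code size
For example: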
```bash
./build/bin/compiler --emit-ir test/test_case/functional/simple_add.sy
./build/bin/compiler --emit-asm test/test_case/functional/simple_add.sy
```
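When actually evaluating loop optimizations, use functional or performance tests with clear loop structure rather than only `simple_add.sy`.
## 10. Suggested summary for the current stage
If you need to report on the current state of the repository, it can be summarized as:
1. The Lab1 parse tree pipeline has a standalone test method.
2. Lab2 has completed the basic `Sema` extensions, which can be demonstrated directly with positive and negative cases.
3. Lab2's `IRGen` still needs to be completed; Lab2 as a whole cannot be considered finished.
4. Lab3 and later labs are still mostly framework plus minimal-sample capability; full coverage requires further implementation.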

@ -1,27 +1,24 @@
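// Currently supports only i32, i32*, void, and a minimal set of memory/arithmetic instructions; for demonstration.
//
// Implemented so far:
// 1. Basic type system: void / i32 / i32*
// 2. Value hierarchy: Value / ConstantValue / ConstantInt / Function / BasicBlock / User / GlobalValue / Instruction
// 3. Minimal instruction set: Add / Alloca / Load / Store / Ret
// 1. Basic type system: void / i32 / i32* / float / float* / array / pointer
// 2. Value hierarchy: Value / ConstantValue / ConstantInt / ConstantFloat / ConstantArray / ConstantZero / Function / BasicBlock / User / GlobalValue / Instruction
// 3. Minimal instruction set: Add / Sub / Mul / Div / Mod / Neg / Alloca / Load / Store / Ret / Cmp / FCmp / Zext / Br / CondBr / Call / GEP / SIToFP / FPToSI
// 4. Three-level organization: BasicBlock / Function / Module
// 5. IRBuilder: convenient creation of constants and minimal instructions
// 5. IRBuilder: convenient creation of constants and all kinds of instructions
// 6. Lightweight def-use implementation:
// - Instruction stores its operand list
// - Value stores its uses
// - Supports a simplified ReplaceAllUsesWith
//
// Not yet implemented, or only present as minimal placeholders:
// 1. Full type system: arrays, function types, label type, etc.
// 2. A more complete instruction set: br / condbr / call / phi / gep, etc.
// 3. More mature Use management (e.g. an LLVM-style doubly linked structure)
// 4. A more complete IR verifier and optimization infrastructure
// 1. Full type system: label type, etc.
// 2. More mature Use management (e.g. an LLVM-style doubly linked structure)
// 3. A more complete IR verifier and optimization infrastructure
//
// Two simplifications that need special mention:
// 1. BasicBlock is already part of the Value hierarchy, but its type still uses void as a placeholder;
// if a label type is added later, it can be changed to a more appropriate block label type.
// 2. The ConstantValue hierarchy currently implements only ConstantInt; ConstantFloat,
// ConstantArray, and other constant kinds can be added later.
//
// Suggested extension order:
// 1. Add more instructions and types first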
@ -45,16 +42,53 @@ class Value;
class User;
class ConstantValue;
class ConstantInt;
class ConstantFloat;
class ConstantArray;
class ConstantZero;
class GlobalValue;
class Instruction;
class BasicBlock;
class Function;
// Use represents one use of a Value.
// Current design:
// - value: the value being used
// - user: the User that uses the value
// - operand_index: the position of the value in the user's operand list
// --- Type System ---
class Type {
public:
enum class Kind { Void, Int1, Int32, PtrInt32, Float, PtrFloat, Array, Pointer };
explicit Type(Kind k);
Type(Kind k, std::shared_ptr<Type> elem_ty, int num_elems);
Type(Kind k, std::shared_ptr<Type> pointed_ty);
static const std::shared_ptr<Type>& GetVoidType();
static const std::shared_ptr<Type>& GetInt1Type();
static const std::shared_ptr<Type>& GetInt32Type();
static const std::shared_ptr<Type>& GetPtrInt32Type();
static const std::shared_ptr<Type>& GetFloatType();
static const std::shared_ptr<Type>& GetPtrFloatType();
static std::shared_ptr<Type> GetArrayType(std::shared_ptr<Type> elem_ty, int num_elems);
static std::shared_ptr<Type> GetPointerType(std::shared_ptr<Type> pointed_ty);
Kind GetKind() const;
bool IsVoid() const;
bool IsInt1() const;
bool IsInt32() const;
bool IsPtrInt32() const;
bool IsFloat() const;
bool IsPtrFloat() const;
bool IsArray() const;
bool IsPointer() const;
std::shared_ptr<Type> GetElementType() const { return elem_ty_; }
int GetNumElements() const { return num_elems_; }
std::shared_ptr<Type> GetPointedType() const { return elem_ty_; }
private:
Kind kind_;
std::shared_ptr<Type> elem_ty_;
int num_elems_ = 0;
};
// --- Value & Use ---
class Use {
public:
@ -76,40 +110,6 @@ class Use {
size_t operand_index_ = 0;
};
// IR context: centrally manages shared resources such as types and constants, for reuse and extension.
class Context {
public:
Context() = default;
~Context();
// Creates i32 constants with deduplication.
ConstantInt* GetConstInt(int v);
std::string NextTemp();
private:
std::unordered_map<int, std::unique_ptr<ConstantInt>> const_ints_;
int temp_index_ = -1;
};
class Type {
public:
enum class Kind { Void, Int32, PtrInt32 };
explicit Type(Kind k);
// Types are obtained as static shared objects.
// For the same type, the returned values can be compared directly for equality, e.g.:
// Type::GetInt32Type() == Type::GetInt32Type()
static const std::shared_ptr<Type>& GetVoidType();
static const std::shared_ptr<Type>& GetInt32Type();
static const std::shared_ptr<Type>& GetPtrInt32Type();
Kind GetKind() const;
bool IsVoid() const;
bool IsInt32() const;
bool IsPtrInt32() const;
private:
Kind kind_;
};
class Value {
public:
Value(std::shared_ptr<Type> ty, std::string name);
@ -118,12 +118,16 @@ class Value {
const std::string& GetName() const;
void SetName(std::string n);
bool IsVoid() const;
bool IsInt1() const;
bool IsInt32() const;
bool IsPtrInt32() const;
bool IsFloat() const;
bool IsPtrFloat() const;
bool IsConstant() const;
bool IsInstruction() const;
bool IsUser() const;
bool IsFunction() const;
bool IsArgument() const;
void AddUse(User* user, size_t operand_index);
void RemoveUse(User* user, size_t operand_index);
const std::vector<Use>& GetUses() const;
@ -135,8 +139,18 @@ class Value {
std::vector<Use> uses_;
};
// ConstantValue is the base class of the constant hierarchy.
// Only ConstantInt is implemented so far; more constant kinds can be added later.
class Argument : public Value {
public:
Argument(std::shared_ptr<Type> ty, std::string name, Function* parent, size_t arg_no);
Function* GetParent() const;
size_t GetArgNo() const;
private:
Function* parent_;
size_t arg_no_;
};
// --- Constants ---
class ConstantValue : public Value {
public:
ConstantValue(std::shared_ptr<Type> ty, std::string name = "");
@ -151,11 +165,56 @@ class ConstantInt : public ConstantValue {
int value_{};
};
// More instruction types still need to be added later.
enum class Opcode { Add, Sub, Mul, Alloca, Load, Store, Ret };
class ConstantFloat : public ConstantValue {
public:
ConstantFloat(std::shared_ptr<Type> ty, float v);
float GetValue() const { return value_; }
private:
float value_{};
};
class ConstantArray : public ConstantValue {
public:
ConstantArray(std::shared_ptr<Type> ty, std::vector<ConstantValue*> elements);
const std::vector<ConstantValue*>& GetElements() const { return elements_; }
private:
std::vector<ConstantValue*> elements_;
};
class ConstantZero : public ConstantValue {
public:
explicit ConstantZero(std::shared_ptr<Type> ty);
};
// --- Context ---
class Context {
public:
Context() = default;
~Context();
ConstantInt* GetConstInt(int v);
ConstantFloat* GetConstFloat(float v);
ConstantArray* GetConstArray(std::shared_ptr<Type> ty, std::vector<ConstantValue*> elements);
ConstantZero* GetConstZero(std::shared_ptr<Type> ty);
std::string NextTemp();
private:
std::unordered_map<int, std::unique_ptr<ConstantInt>> const_ints_;
std::unordered_map<float, std::unique_ptr<ConstantFloat>> const_floats_;
std::vector<std::unique_ptr<ConstantArray>> const_arrays_;
std::vector<std::unique_ptr<ConstantZero>> const_zeros_;
int temp_index_ = -1;
};
// --- Instructions ---
enum class Opcode { Add, Sub, Mul, Div, Mod, Neg, Alloca, Load, Store, Ret, Cmp, FCmp, Zext, Br, CondBr, Call, GEP, SIToFP, FPToSI };
enum class CmpOp { Eq, Ne, Lt, Gt, Le, Ge };
// User is the abstract base class of all IR objects that use other Values as inputs.
// In the current implementation, Instruction and GlobalValue derive from User.
class User : public Value {
public:
User(std::shared_ptr<Type> ty, std::string name);
@ -164,20 +223,25 @@ class User : public Value {
void SetOperand(size_t index, Value* value);
protected:
// Single entry point for adding operands.
void AddOperand(Value* value);
private:
std::vector<Value*> operands_;
};
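// GlobalValue is an empty placeholder class for the global value/global variable hierarchy.
// Only the class hierarchy is in place for now; initializers, printing, and linkage semantics come later.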
class GlobalValue : public User {
public:
GlobalValue(std::shared_ptr<Type> ty, std::string name);
};
class GlobalVariable : public GlobalValue {
public:
GlobalVariable(std::string name, std::shared_ptr<Type> type, ConstantValue* init);
ConstantValue* GetInitializer() const { return init_; }
private:
ConstantValue* init_ = nullptr;
};
class Instruction : public User {
public:
Instruction(Opcode op, std::shared_ptr<Type> ty, std::string name = "");
@ -196,7 +260,14 @@ class BinaryInst : public Instruction {
BinaryInst(Opcode op, std::shared_ptr<Type> ty, Value* lhs, Value* rhs,
std::string name);
Value* GetLhs() const;
Value* GetRhs() const;
Value* GetRhs() const;
};
class UnaryInst : public Instruction {
public:
UnaryInst(Opcode op, std::shared_ptr<Type> ty, Value* operand,
std::string name);
Value* GetUnaryOperand() const;
};
class ReturnInst : public Instruction {
@ -223,8 +294,80 @@ class StoreInst : public Instruction {
Value* GetPtr() const;
};
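// BasicBlock is part of the Value hierarchy, to ease the move toward a more complete IR class diagram.
// Its type still uses void as a placeholder; a dedicated label type can replace it later.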
class CmpInst : public Instruction {
public:
CmpInst(CmpOp cmp_op, Value* lhs, Value* rhs, std::string name);
CmpOp GetCmpOp() const;
Value* GetLhs() const;
Value* GetRhs() const;
private:
CmpOp cmp_op_;
};
class FCmpInst : public Instruction {
public:
FCmpInst(CmpOp cmp_op, Value* lhs, Value* rhs, std::string name);
CmpOp GetCmpOp() const;
Value* GetLhs() const;
Value* GetRhs() const;
private:
CmpOp cmp_op_;
};
class ZextInst : public Instruction {
public:
ZextInst(std::shared_ptr<Type> dest_ty, Value* val, std::string name);
Value* GetValue() const;
};
class BranchInst : public Instruction {
public:
BranchInst(BasicBlock* dest);
BasicBlock* GetDest() const;
};
class CondBranchInst : public Instruction {
public:
CondBranchInst(Value* cond, BasicBlock* true_bb, BasicBlock* false_bb);
Value* GetCond() const;
BasicBlock* GetTrueBlock() const;
BasicBlock* GetFalseBlock() const;
};
class CallInst : public Instruction {
public:
CallInst(Function* func, std::vector<Value*> args, std::string name = "");
Function* GetFunc() const;
const std::vector<Value*>& GetArgs() const;
private:
Function* func_;
std::vector<Value*> args_;
};
class GEPInst : public Instruction {
public:
GEPInst(std::shared_ptr<Type> ty, Value* ptr, std::vector<Value*> indices, std::string name = "");
Value* GetPtr() const;
const std::vector<Value*>& GetIndices() const;
private:
std::vector<Value*> indices_;
};
class SIToFPInst : public Instruction {
public:
SIToFPInst(std::shared_ptr<Type> ty, Value* val, std::string name = "");
};
class FPToSIInst : public Instruction {
public:
FPToSIInst(std::shared_ptr<Type> ty, Value* val, std::string name = "");
};
// --- Structure ---
class BasicBlock : public Value {
public:
explicit BasicBlock(std::string name);
@ -254,24 +397,21 @@ class BasicBlock : public Value {
std::vector<BasicBlock*> successors_;
};
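// Function also uses a minimal implementation for now.
// Note in particular: because the project has no separate FunctionType yet,
// Function derives from Value and its type_ stores only the return type,
// which cannot express the full function signature (return type + parameter list).
// This suffices for the minimal IR that only supports int main(), but adding ordinary
// functions, parameters, and calls later usually requires a dedicated function type.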
class Function : public Value {
public:
// The constructor likewise takes the return type, not a full function type.
Function(std::string name, std::shared_ptr<Type> ret_type);
BasicBlock* CreateBlock(const std::string& name);
BasicBlock* GetEntry();
const BasicBlock* GetEntry() const;
const std::vector<std::unique_ptr<BasicBlock>>& GetBlocks() const;
Argument* AddArgument(std::shared_ptr<Type> ty, std::string name);
const std::vector<std::unique_ptr<Argument>>& GetArgs() const;
private:
BasicBlock* entry_ = nullptr;
std::vector<std::unique_ptr<BasicBlock>> blocks_;
std::vector<std::unique_ptr<Argument>> args_;
};
class Module {
@ -279,14 +419,17 @@ class Module {
Module() = default;
Context& GetContext();
const Context& GetContext() const;
// Function creation currently passes only the return type explicitly; a full FunctionType is not wired in yet.
Function* CreateFunction(const std::string& name,
std::shared_ptr<Type> ret_type);
const std::vector<std::unique_ptr<Function>>& GetFunctions() const;
GlobalVariable* CreateGlobalVariable(const std::string& name, std::shared_ptr<Type> type, ConstantValue* init);
const std::vector<std::unique_ptr<GlobalVariable>>& GetGlobalVariables() const;
private:
Context context_;
std::vector<std::unique_ptr<Function>> functions_;
std::vector<std::unique_ptr<GlobalVariable>> global_variables_;
};
class IRBuilder {
@ -295,15 +438,27 @@ class IRBuilder {
void SetInsertPoint(BasicBlock* bb);
BasicBlock* GetInsertBlock() const;
// Minimal set for constructing constants, binary operations, and return instructions.
ConstantInt* CreateConstInt(int v);
BinaryInst* CreateBinary(Opcode op, Value* lhs, Value* rhs,
const std::string& name);
BinaryInst* CreateAdd(Value* lhs, Value* rhs, const std::string& name);
BinaryInst* CreateSub(Value* lhs, Value* rhs, const std::string& name);
BinaryInst* CreateMul(Value* lhs, Value* rhs, const std::string& name);
UnaryInst* CreateNeg(Value* operand, const std::string& name);
AllocaInst* CreateAllocaI32(const std::string& name);
AllocaInst* CreateAllocaFloat(const std::string& name);
AllocaInst* CreateAlloca(std::shared_ptr<Type> ty, const std::string& name);
LoadInst* CreateLoad(Value* ptr, const std::string& name);
StoreInst* CreateStore(Value* val, Value* ptr);
ReturnInst* CreateRet(Value* v);
Instruction* CreateCmp(CmpOp op, Value* lhs, Value* rhs, const std::string& name);
ZextInst* CreateZext(Value* val, const std::string& name);
BranchInst* CreateBr(BasicBlock* dest);
CondBranchInst* CreateCondBr(Value* cond, BasicBlock* true_bb, BasicBlock* false_bb);
CallInst* CreateCall(Function* func, std::vector<Value*> args, const std::string& name);
GEPInst* CreateGEP(std::shared_ptr<Type> ty, Value* ptr, std::vector<Value*> indices, const std::string& name);
SIToFPInst* CreateSIToFP(Value* val, const std::string& name);
FPToSIInst* CreateFPToSI(Value* val, const std::string& name);
private:
Context& ctx_;

@ -26,16 +26,26 @@ class IRGenImpl final : public SysYBaseVisitor {
std::any visitCompUnit(SysYParser::CompUnitContext* ctx) override;
std::any visitFuncDef(SysYParser::FuncDefContext* ctx) override;
std::any visitBlockStmt(SysYParser::BlockStmtContext* ctx) override;
std::any visitBlock(SysYParser::BlockContext* ctx) override;
std::any visitBlockItem(SysYParser::BlockItemContext* ctx) override;
std::any visitDecl(SysYParser::DeclContext* ctx) override;
std::any visitConstDecl(SysYParser::ConstDeclContext* ctx) override;
std::any visitConstDef(SysYParser::ConstDefContext* ctx) override;
std::any visitStmt(SysYParser::StmtContext* ctx) override;
std::any visitVarDef(SysYParser::VarDefContext* ctx) override;
std::any visitReturnStmt(SysYParser::ReturnStmtContext* ctx) override;
std::any visitParenExp(SysYParser::ParenExpContext* ctx) override;
std::any visitNumberExp(SysYParser::NumberExpContext* ctx) override;
std::any visitVarExp(SysYParser::VarExpContext* ctx) override;
std::any visitAdditiveExp(SysYParser::AdditiveExpContext* ctx) override;
std::any visitPrimaryExp(SysYParser::PrimaryExpContext* ctx) override;
std::any visitNumber(SysYParser::NumberContext* ctx) override;
std::any visitLVal(SysYParser::LValContext* ctx) override;
std::any visitAddExp(SysYParser::AddExpContext* ctx) override;
std::any visitMulExp(SysYParser::MulExpContext* ctx) override;
std::any visitUnaryExp(SysYParser::UnaryExpContext* ctx) override;
std::any visitRelExp(SysYParser::RelExpContext* ctx) override;
std::any visitEqExp(SysYParser::EqExpContext* ctx) override;
std::any visitLAndExp(SysYParser::LAndExpContext* ctx) override;
std::any visitLOrExp(SysYParser::LOrExpContext* ctx) override;
std::any visitCondUnaryExp(SysYParser::CondUnaryExpContext* ctx) override;
std::any visitCond(SysYParser::CondContext* ctx) override;
std::any visitFuncRParams(SysYParser::FuncRParamsContext* ctx) override;
private:
enum class BlockFlow {
@ -45,13 +55,60 @@ class IRGenImpl final : public SysYBaseVisitor {
BlockFlow VisitBlockItemResult(SysYParser::BlockItemContext& item);
ir::Value* EvalExpr(SysYParser::ExpContext& expr);
ir::ConstantValue* EvaluateConst(antlr4::tree::ParseTree* tree);
int EvaluateConstInt(SysYParser::ConstExpContext* ctx);
int EvaluateConstInt(SysYParser::ExpContext* ctx);
std::shared_ptr<ir::Type> GetGEPResultType(ir::Value* ptr, const std::vector<ir::Value*>& indices);
// Flatten array initializers
void FlattenInitVal(SysYParser::InitValContext* ctx,
const std::vector<int>& dims,
const std::vector<int>& sub_sizes,
int dim_idx,
size_t& current_pos,
std::vector<ir::Value*>& results,
bool is_float);
void FlattenConstInitVal(SysYParser::ConstInitValContext* ctx,
const std::vector<int>& dims,
const std::vector<int>& sub_sizes,
int dim_idx,
size_t& current_pos,
std::vector<ir::ConstantValue*>& results,
bool is_float);
ir::Module& module_;
const SemanticContext& sema_;
ir::Function* func_;
ir::IRBuilder builder_;
// Name binding is Sema's job; IRGen only maintains the codegen state "declaration -> storage slot".
std::unordered_map<SysYParser::VarDefContext*, ir::Value*> storage_map_;
// To handle nested scopes (global, function, block), vectors act as stacks for storage_map_ and const_values_.
std::vector<std::unordered_map<std::string, ir::Value*>> storage_map_stack_;
std::vector<std::unordered_map<std::string, ir::ConstantValue*>> const_values_stack_;
// Looks up a variable through the scope stack, innermost scope first.
ir::Value* FindStorage(const std::string& name) const {
for (auto it = storage_map_stack_.rbegin(); it != storage_map_stack_.rend(); ++it) {
if (it->count(name)) return it->at(name);
}
return nullptr;
}
ir::ConstantValue* FindConst(const std::string& name) const {
for (auto it = const_values_stack_.rbegin(); it != const_values_stack_.rend(); ++it) {
if (it->count(name)) return it->at(name);
}
return nullptr;
}
// Jump targets for break and continue.
ir::BasicBlock* current_loop_cond_bb_ = nullptr;
ir::BasicBlock* current_loop_exit_bb_ = nullptr;
int bb_cnt_ = 0;
std::string NextBlockName(const std::string& prefix = "bb") {
return prefix + "_" + std::to_string(++bb_cnt_);
}
};
std::unique_ptr<ir::Module> GenerateIR(SysYParser::CompUnitContext& tree,

@ -1,30 +1,69 @@
// Parse-tree-based semantic checking and name binding.
#pragma once
#include <string>
#include <unordered_map>
#include <vector>
#include "SysYParser.h"
enum class SemanticType {
Void,
Int,
Float,
};
struct ScalarConstant {
SemanticType type = SemanticType::Int;
double number = 0.0;
};
struct ObjectBinding {
enum class DeclKind {
Var,
Const,
Param,
};
std::string name;
SemanticType type = SemanticType::Int;
DeclKind decl_kind = DeclKind::Var;
bool is_array_param = false;
std::vector<int> dimensions;
const SysYParser::VarDefContext* var_def = nullptr;
const SysYParser::ConstDefContext* const_def = nullptr;
const SysYParser::FuncFParamContext* func_param = nullptr;
bool has_const_value = false;
ScalarConstant const_value;
};
struct FunctionBinding {
std::string name;
SemanticType return_type = SemanticType::Int;
std::vector<ObjectBinding> params;
const SysYParser::FuncDefContext* func_def = nullptr;
bool is_builtin = false;
};
class SemanticContext {
public:
void BindVarUse(SysYParser::VarContext* use,
SysYParser::VarDefContext* decl) {
var_uses_[use] = decl;
}
void BindObjectUse(const SysYParser::LValContext* use, ObjectBinding binding);
const ObjectBinding* ResolveObjectUse(
const SysYParser::LValContext* use) const;
void BindFunctionCall(const SysYParser::UnaryExpContext* call,
FunctionBinding binding);
const FunctionBinding* ResolveFunctionCall(
const SysYParser::UnaryExpContext* call) const;
SysYParser::VarDefContext* ResolveVarUse(
const SysYParser::VarContext* use) const {
auto it = var_uses_.find(use);
return it == var_uses_.end() ? nullptr : it->second;
}
void RegisterFunction(FunctionBinding binding);
const FunctionBinding* ResolveFunction(const std::string& name) const;
private:
std::unordered_map<const SysYParser::VarContext*,
SysYParser::VarDefContext*>
var_uses_;
std::unordered_map<const SysYParser::LValContext*, ObjectBinding> object_uses_;
std::unordered_map<const SysYParser::UnaryExpContext*, FunctionBinding>
function_calls_;
std::unordered_map<std::string, FunctionBinding> functions_;
};
// Currently only checks that:
// - variables are declared before use
// - local variables are not defined more than once
SemanticContext RunSema(SysYParser::CompUnitContext& comp_unit);

@ -1,17 +1,25 @@
// Minimal symbol table: records where local variables are defined
// Maintains multi-level scopes for object symbols
#pragma once
#include <string>
#include <string_view>
#include <unordered_map>
#include <vector>
#include "SysYParser.h"
#include "sem/Sema.h"
class SymbolTable {
public:
void Add(const std::string& name, SysYParser::VarDefContext* decl);
bool Contains(const std::string& name) const;
SysYParser::VarDefContext* Lookup(const std::string& name) const;
SymbolTable();
void EnterScope();
void ExitScope();
bool Add(const ObjectBinding& symbol);
bool ContainsInCurrentScope(std::string_view name) const;
const ObjectBinding* Lookup(std::string_view name) const;
size_t Depth() const;
private:
std::unordered_map<std::string, SysYParser::VarDefContext*> table_;
std::vector<std::unordered_map<std::string, ObjectBinding>> scopes_;
};
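
The new `SymbolTable` interface can be read as a stack of per-scope maps. The sketch below shows one plausible way to implement it; it is not the repository's implementation, and `Binding` is a hypothetical, simplified stand-in for `ObjectBinding`.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified stand-in for ObjectBinding.
struct Binding {
  std::string name;
  bool is_const;
};

class ScopedTable {
 public:
  ScopedTable() { EnterScope(); }  // the global scope always exists
  void EnterScope() { scopes_.emplace_back(); }
  void ExitScope() {
    if (scopes_.size() > 1) scopes_.pop_back();  // never pop the global scope
  }
  // Returns false if the name is already defined in the innermost scope.
  bool Add(const Binding& b) {
    return scopes_.back().emplace(b.name, b).second;
  }
  bool ContainsInCurrentScope(std::string_view name) const {
    return scopes_.back().count(std::string(name)) != 0;
  }
  // Searches innermost to outermost, so inner declarations shadow outer ones.
  const Binding* Lookup(std::string_view name) const {
    const std::string key(name);
    for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
      auto found = it->find(key);
      if (found != it->end()) return &found->second;
    }
    return nullptr;
  }
  std::size_t Depth() const { return scopes_.size(); }

 private:
  std::vector<std::unordered_map<std::string, Binding>> scopes_;
};
```

Having `Add` report redefinition through its return value lets the caller decide how to word the diagnostic, which matches the `bool Add(...)` signature in the header.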

@ -0,0 +1,83 @@
--- include/ir/IR.h
+++ include/ir/IR.h
@@ -93,6 +93,7 @@
class Type {
public:
- enum class Kind { Void, Int32, PtrInt32 };
+ enum class Kind { Void, Int1, Int32, PtrInt32 };
explicit Type(Kind k);
// Types are obtained as statically shared objects.
// Return values for the same type can be compared directly, for example:
// Type::GetInt32Type() == Type::GetInt32Type()
static const std::shared_ptr<Type>& GetVoidType();
+ static const std::shared_ptr<Type>& GetInt1Type();
static const std::shared_ptr<Type>& GetInt32Type();
static const std::shared_ptr<Type>& GetPtrInt32Type();
Kind GetKind() const;
bool IsVoid() const;
+ bool IsInt1() const;
bool IsInt32() const;
bool IsPtrInt32() const;
@@ -118,6 +119,7 @@
const std::string& GetName() const;
void SetName(std::string n);
bool IsVoid() const;
+ bool IsInt1() const;
bool IsInt32() const;
bool IsPtrInt32() const;
bool IsConstant() const;
@@ -153,7 +155,9 @@
// More instruction types still need to be added later.
-// enum class Opcode { Add, Sub, Mul, Alloca, Load, Store, Ret };
-enum class Opcode { Add, Sub, Mul, Div, Mod, Neg, Alloca, Load, Store, Ret };
+enum class Opcode { Add, Sub, Mul, Div, Mod, Neg, Alloca, Load, Store, Ret, Cmp, Zext, Br, CondBr };
+
+enum class CmpOp { Eq, Ne, Lt, Gt, Le, Ge };
// User is the abstract base class of all IR objects that use other Values as inputs.
@@ -231,6 +235,33 @@
Value* GetPtr() const;
};
+class CmpInst : public Instruction {
+ public:
+ CmpInst(CmpOp cmp_op, Value* lhs, Value* rhs, std::string name);
+ CmpOp GetCmpOp() const;
+ Value* GetLhs() const;
+ Value* GetRhs() const;
+ private:
+ CmpOp cmp_op_;
+};
+
+class ZextInst : public Instruction {
+ public:
+ ZextInst(std::shared_ptr<Type> dest_ty, Value* val, std::string name);
+ Value* GetValue() const;
+};
+
+class BranchInst : public Instruction {
+ public:
+ BranchInst(BasicBlock* dest);
+ BasicBlock* GetDest() const;
+};
+
+class CondBranchInst : public Instruction {
+ public:
+ CondBranchInst(Value* cond, BasicBlock* true_bb, BasicBlock* false_bb);
+ Value* GetCond() const;
+ BasicBlock* GetTrueBlock() const;
+ BasicBlock* GetFalseBlock() const;
+};
+
// BasicBlock is now part of the Value hierarchy, which makes it easier to move toward a more complete IR class diagram later.
@@ -315,6 +346,10 @@
LoadInst* CreateLoad(Value* ptr, const std::string& name);
StoreInst* CreateStore(Value* val, Value* ptr);
ReturnInst* CreateRet(Value* v);
+ CmpInst* CreateCmp(CmpOp op, Value* lhs, Value* rhs, const std::string& name);
+ ZextInst* CreateZext(Value* val, const std::string& name);
+ BranchInst* CreateBr(BasicBlock* dest);
+ CondBranchInst* CreateCondBr(Value* cond, BasicBlock* true_bb, BasicBlock* false_bb);
private:
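
The `Int1` type, `Cmp` and `Zext` added in this patch fit together as follows: a comparison yields an `i1`, and when SysY uses the result as an ordinary `int` it has to be widened to `i32`. The sketch below emits the corresponding LLVM IR text directly; `EmitLessThanAsInt` is a hypothetical helper, not part of the project's `IRBuilder`.

```cpp
#include <string>
#include <vector>

// Hypothetical text emitter. `lhs` and `rhs` are names of existing i32 values;
// `temp_index` plays the role of the context's temporary counter.
std::vector<std::string> EmitLessThanAsInt(const std::string& lhs,
                                           const std::string& rhs,
                                           int& temp_index) {
  const std::string cmp = "%t" + std::to_string(++temp_index);
  const std::string ext = "%t" + std::to_string(++temp_index);
  return {
      cmp + " = icmp slt i32 " + lhs + ", " + rhs,  // produces an i1
      ext + " = zext i1 " + cmp + " to i32",        // widens it to 0 or 1
  };
}
```

In a condition context (`if`, `while`) the `zext` is unnecessary, because the `i1` can feed a conditional `br` directly.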

@ -0,0 +1,38 @@
#!/usr/bin/env bash
set -euo pipefail
case_dir="${1:-test/test_case}"
log_dir="${2:-test/test_result/lab1_parse_logs}"
if [[ ! -d "$case_dir" ]]; then
echo "Test directory does not exist: $case_dir" >&2
exit 1
fi
compiler="./build/bin/compiler"
if [[ ! -x "$compiler" ]]; then
echo "Compiler not found: $compiler; please build the parse-only version first." >&2
exit 1
fi
mkdir -p "$log_dir"
mapfile -t cases < <(find "$case_dir" -name '*.sy' | sort)
if [[ ${#cases[@]} -eq 0 ]]; then
echo "No .sy test files found: $case_dir" >&2
exit 1
fi
for f in "${cases[@]}"; do
rel="${f#$case_dir/}"
safe_name="${rel//\//__}"
log_file="$log_dir/${safe_name%.sy}.parse.log"
echo "TEST $f -> $log_file"
if ! "$compiler" --emit-parse-tree "$f" >"$log_file" 2>&1; then
echo "FAIL $f (see $log_file)" >&2
exit 1
fi
done
echo "ALL_PARSE_OK (${#cases[@]} cases) logs: $log_dir"

@ -0,0 +1,142 @@
#!/usr/bin/env bash
# Lab 2 full test script (improved version)
# Logic based on verify_ir.sh and verify_asm.sh
# Adds batch testing and statistics, and makes sure the SysY runtime library (sylib.c) is linked
set -uo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(dirname "$SCRIPT_DIR")"
COMPILER="$PROJECT_ROOT/build/bin/compiler"
SYLIB="$PROJECT_ROOT/sylib/sylib.c"
RESULT_DIR="$PROJECT_ROOT/test/test_result/lab2_full"
# Check dependencies
if [[ ! -x "$COMPILER" ]]; then
echo "Error: compiler not found; please build the project first."
exit 1
fi
if [[ ! -f "$SYLIB" ]]; then
echo "Error: runtime library not found: $SYLIB"
exit 1
fi
# Colored output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
mkdir -p "$RESULT_DIR"
total=0
passed=0
failed=0
run_test() {
local input=$1
local base=$(basename "$input")
local stem=${base%.sy}
local input_dir=$(dirname "$input")
local out_file="$RESULT_DIR/$stem.ll"
local obj_file="$RESULT_DIR/$stem.o"
local exe_file="$RESULT_DIR/$stem"
local stdin_file="$input_dir/$stem.in"
local expected_file="$input_dir/$stem.out"
local actual_file="$RESULT_DIR/$stem.actual.out"
local stdout_file="$RESULT_DIR/$stem.stdout"
((total++)) || true
echo -n "[$total] Testing $base ... "
# 1. Generate IR
if ! "$COMPILER" --emit-ir "$input" > "$out_file" 2>&1; then
echo -e "${RED}IR generation failed${NC}"
((failed++)) || true
return 1
fi
# 2. Compile the IR to an object file (llc)
if ! llc -filetype=obj "$out_file" -o "$obj_file" > /dev/null 2>&1; then
echo -e "${RED}LLVM compilation failed (llc)${NC}"
((failed++)) || true
return 1
fi
# 3. Link the runtime library (follows verify_asm.sh, but explicitly includes sylib.c)
if ! clang "$obj_file" "$SYLIB" -o "$exe_file" > /dev/null 2>&1; then
echo -e "${RED}Linking failed (clang)${NC}"
((failed++)) || true
return 1
fi
# 4. Run the program and capture its output and exit code (with the stack size limit raised)
local status=0
ulimit -s unlimited 2>/dev/null || true
if [[ -f "$stdin_file" ]]; then
"$exe_file" < "$stdin_file" > "$stdout_file" 2>/dev/null || status=$?
else
"$exe_file" > "$stdout_file" 2>/dev/null || status=$?
fi
# Format the actual output (same format as verify_ir.sh)
{
cat "$stdout_file"
if [[ -s "$stdout_file" ]] && [[ "$(tail -c 1 "$stdout_file" | wc -l)" -eq 0 ]]; then
printf '\n'
fi
printf '%s\n' "$status"
} > "$actual_file"
# 5. Compare results
if [[ -f "$expected_file" ]]; then
# Ignore whitespace differences (-b -w)
if diff -q -b -w "$expected_file" "$actual_file" > /dev/null 2>&1; then
echo -e "${GREEN} PASS${NC}"
((passed++)) || true
else
echo -e "${RED} output mismatch${NC}"
((failed++)) || true
fi
else
echo -e "${YELLOW}! expected output file is missing${NC}"
((passed++)) || true
fi
}
# Batch run
echo "========================================="
echo "Lab 2 full test run started (IR semantic verification)"
echo "========================================="
echo ""
run_batch() {
local dir=$1
if [[ ! -d "$dir" ]]; then return; fi
echo "Testing directory: $dir"
for sy_file in "$dir"/*.sy; do
[[ -e "$sy_file" ]] || continue
run_test "$sy_file"
done
echo ""
}
run_batch "$PROJECT_ROOT/test/test_case/functional"
run_batch "$PROJECT_ROOT/test/test_case/performance"
echo "========================================="
echo "Test result summary"
echo "========================================="
echo -e "Total: $total"
echo -e "Passed: ${GREEN}$passed${NC}"
echo -e "Failed: ${RED}$failed${NC}"
echo ""
if [[ $failed -eq 0 ]]; then
echo -e "${GREEN} All tests passed! Lab 2 is complete.${NC}"
exit 0
else
echo -e "${RED}$failed test(s) failed; please check the logic.${NC}"
exit 1
fi

@ -0,0 +1,157 @@
#!/usr/bin/env bash
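# Tests Lab2 IR generation: tasks of team member 1
# Covered:
# - Task 1.1: support more binary operators (Sub, Mul, Div, Mod)
# - Task 1.2: support unary operators (plus and minus signs)
# - Task 1.3: support assignment expressions
# - Task 1.4: support comma-separated declarations of multiple variables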
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(dirname "$SCRIPT_DIR")"
COMPILER="$PROJECT_ROOT/build/bin/compiler"
TEST_DIR="$PROJECT_ROOT/test/test_case/irgen_lab1_4"
RESULT_DIR="$PROJECT_ROOT/test/test_result/lab2_ir1"
# Colored output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
echo "========================================="
echo "Lab2 IR generation tests: partial task verification"
echo "========================================="
echo ""
# Check that the compiler exists
if [[ ! -x "$COMPILER" ]]; then
echo -e "${RED}Error: compiler does not exist or is not executable: $COMPILER${NC}"
echo "Please run first: cmake --build build"
exit 1
fi
# Check that the test directory exists
if [[ ! -d "$TEST_DIR" ]]; then
echo -e "${RED}Error: test directory does not exist: $TEST_DIR${NC}"
exit 1
fi
# Create the result directory
mkdir -p "$RESULT_DIR"
# Counters
total=0
passed=0
failed=0
# Test function
run_test() {
local input=$1
local basename=$(basename "$input" .sy)
local expected_out="$TEST_DIR/$basename.out"
local actual_out="$RESULT_DIR/$basename.actual.out"
local ll_file="$RESULT_DIR/$basename.ll"
((total++)) || true
echo -n "Testing $basename ... "
# Generate IR
if ! "$COMPILER" --emit-ir "$input" > "$ll_file" 2>&1; then
echo -e "${RED}IR generation failed${NC}"
((failed++)) || true
return 1
fi
# If the output needs to be run and compared
if [[ -f "$expected_out" ]]; then
# Compile and run
local exe_file="$RESULT_DIR/$basename"
if ! llc -O0 -filetype=obj "$ll_file" -o "$RESULT_DIR/$basename.o" 2>/dev/null; then
echo -e "${YELLOW}LLVM compilation failed (llc)${NC}"
cat "$ll_file"
((failed++)) || true
return 1
fi
if ! clang "$RESULT_DIR/$basename.o" -o "$exe_file" 2>/dev/null; then
echo -e "${YELLOW}Linking failed (clang)${NC}"
((failed++)) || true
return 1
fi
# Run the program and capture its return value (low 8 bits)
local exit_code=0
"$exe_file" > "$actual_out" 2>&1 || exit_code=$?
# Handle the return value (LLVM/AArch64 returns an 8-bit unsigned integer)
if [[ $exit_code -gt 127 ]]; then
# Convert to a signed integer
exit_code=$((exit_code - 256))
fi
echo "$exit_code" > "$actual_out"
# Compare output
if diff -q "$expected_out" "$actual_out" > /dev/null 2>&1; then
echo -e "${GREEN}✓ PASS${NC}"
((passed++)) || true
return 0
else
echo -e "${RED}✗ output mismatch${NC}"
echo " expected: $(cat "$expected_out")"
echo " actual: $(cat "$actual_out")"
((failed++)) || true
return 1
fi
else
# No expected output; only check that IR generation succeeds
echo -e "${GREEN}✓ IR generated successfully${NC}"
((passed++)) || true
return 0
fi
}
# Find all test cases
test_files=()
while IFS= read -r -d '' file; do
test_files+=("$file")
done < <(find "$TEST_DIR" -name "*.sy" -type f -print0 | sort -z)
if [[ ${#test_files[@]} -eq 0 ]]; then
echo -e "${RED}No test cases found: $TEST_DIR${NC}"
exit 1
fi
echo "Found ${#test_files[@]} test case(s)"
echo ""
# Run all tests
for test_file in "${test_files[@]}"; do
run_test "$test_file" || true
done
# Print the summary
echo ""
echo "========================================="
echo "Test result summary"
echo "========================================="
echo -e "Total: $total"
echo -e "Passed: ${GREEN}$passed${NC}"
echo -e "Failed: ${RED}$failed${NC}"
echo ""
if [[ $failed -eq 0 ]]; then
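echo -e "${GREEN}✓ All tests passed!${NC}"
echo ""
echo "Test coverage:"
echo " ✓ Task 1.1: binary operators (Sub, Mul, Div, Mod)"
echo " ✓ Task 1.2: unary operators (plus and minus signs)"
echo " ✓ Task 1.3: assignment expressions"
echo " ✓ Task 1.4: comma-separated multi-variable declarations"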
exit 0
else
echo -e "${RED}✗ $failed test(s) failed${NC}"
exit 1
fi

@ -0,0 +1,92 @@
#!/usr/bin/env bash
set -euo pipefail
mode="${1:-positive}"
log_dir="${2:-test/test_result/lab2_sema_logs}"
checker="./build-sema/sema_check"
if [[ ! -x "$checker" ]]; then
echo "Semantic test driver not found: $checker" >&2
echo "Please prepare build-sema/sema_check first." >&2
exit 1
fi
mkdir -p "$log_dir"
case_files=()
expected_prefix=""
case "$mode" in
positive)
expected_prefix="OK"
case_files=(
"test/test_case/functional/simple_add.sy"
"test/test_case/functional/09_func_defn.sy"
"test/test_case/functional/25_scope3.sy"
"test/test_case/functional/29_break.sy"
"test/test_case/functional/05_arr_defn4.sy"
"test/test_case/functional/95_float.sy"
)
;;
negative)
expected_prefix="ERR"
case_files=(
"test/test_case/sema_negative/undef.sy"
"test/test_case/sema_negative/break.sy"
"test/test_case/sema_negative/ret.sy"
"test/test_case/sema_negative/call.sy"
)
;;
*)
echo "Usage: $0 [positive|negative] [log_dir]" >&2
exit 1
;;
esac
if [[ ${#case_files[@]} -eq 0 ]]; then
echo "No test cases to run" >&2
exit 1
fi
for f in "${case_files[@]}"; do
if [[ ! -f "$f" ]]; then
echo "Test file does not exist: $f" >&2
exit 1
fi
done
all_ok=true
for f in "${case_files[@]}"; do
base="$(basename "${f%.sy}")"
log_file="$log_dir/${base}.sema.log"
echo "TEST $f -> $log_file"
set +e
"$checker" "$f" >"$log_file" 2>&1
status=$?
set -e
if ! grep -q "^${expected_prefix} $f$" "$log_file"; then
echo "FAIL $f (see $log_file)" >&2
all_ok=false
continue
fi
if [[ "$mode" == "positive" && $status -ne 0 ]]; then
echo "FAIL $f (expected success, see $log_file)" >&2
all_ok=false
continue
fi
if [[ "$mode" == "negative" && $status -eq 0 ]]; then
echo "FAIL $f (expected semantic error, see $log_file)" >&2
all_ok=false
continue
fi
done
if [[ "$all_ok" != true ]]; then
exit 1
fi
echo "ALL_SEMA_${mode^^}_OK (${#case_files[@]} cases) logs: $log_dir"

@ -60,7 +60,22 @@ if [[ "$run_exec" == true ]]; then
stdout_file="$out_dir/$stem.stdout"
actual_file="$out_dir/$stem.actual.out"
llc -filetype=obj "$out_file" -o "$obj"
clang "$obj" -o "$exe"
# clang "$obj" -o "$exe"
# Locate the runtime library, usually sylib/sylib.c under the project root
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SYLIB="$SCRIPT_DIR/../sylib/sylib.c"
if [[ ! -f "$SYLIB" ]]; then
# Fallback path, for when the script is run from the project root
SYLIB="sylib/sylib.c"
fi
if [[ -f "$SYLIB" ]]; then
clang "$obj" "$SYLIB" -o "$exe"
else
echo "Warning: runtime library sylib.c not found; trying to link without it..." >&2
clang "$obj" -o "$exe"
fi
echo "Running $exe ..."
set +e
if [[ -f "$stdin_file" ]]; then
@ -70,7 +85,11 @@ if [[ "$run_exec" == true ]]; then
fi
status=$?
set -e
# Print the program output, making sure it ends with a newline
cat "$stdout_file"
if [[ -s "$stdout_file" ]] && (( $(tail -c 1 "$stdout_file" | wc -l) == 0 )); then
printf '\n'
fi
echo "Exit code: $status"
{
cat "$stdout_file"
@ -81,7 +100,8 @@ if [[ "$run_exec" == true ]]; then
} > "$actual_file"
if [[ -f "$expected_file" ]]; then
if diff -u "$expected_file" "$actual_file"; then
# Use -b -B to ignore whitespace and blank-line differences
if diff -u -b -B "$expected_file" "$actual_file"; then
echo "Output matches: $expected_file"
else
echo "Output does not match: $expected_file" >&2

@ -14,6 +14,7 @@ add_executable(compiler
)
target_link_libraries(compiler PRIVATE
frontend
ir
utils
)

@ -1,67 +1,155 @@
// SysY subset grammar: supports compiling a minimal return expression such as
// int main() { int a = 1; int b = 2; return a + b; }
// .
// Further rules need to be added as required.
grammar SysY;
/*===-------------------------------------------===*/
/* Lexer rules */
/*===-------------------------------------------===*/
CONST: 'const';
INT: 'int';
FLOAT: 'float';
VOID: 'void';
IF: 'if';
ELSE: 'else';
WHILE: 'while';
BREAK: 'break';
CONTINUE: 'continue';
RETURN: 'return';
LE: '<=';
GE: '>=';
EQ: '==';
NE: '!=';
AND: '&&';
OR: '||';
ASSIGN: '=';
LT: '<';
GT: '>';
ADD: '+';
SUB: '-';
MUL: '*';
DIV: '/';
MOD: '%';
NOT: '!';
LPAREN: '(';
RPAREN: ')';
LBRACK: '[';
RBRACK: ']';
LBRACE: '{';
RBRACE: '}';
COMMA: ',';
SEMICOLON: ';';
ID: [a-zA-Z_][a-zA-Z_0-9]*;
ILITERAL: [0-9]+;
WS: [ \t\r\n] -> skip;
HEX_FLOAT_LITERAL
: ('0x' | '0X') HEX_DIGIT* '.' HEX_DIGIT+ BINARY_EXPONENT
| ('0x' | '0X') HEX_DIGIT+ '.' HEX_DIGIT* BINARY_EXPONENT
| ('0x' | '0X') HEX_DIGIT+ BINARY_EXPONENT
;
DEC_FLOAT_LITERAL
: DEC_DIGIT+ '.' DEC_DIGIT* DEC_EXPONENT?
| '.' DEC_DIGIT+ DEC_EXPONENT?
| DEC_DIGIT+ DEC_EXPONENT
;
HEX_INT_LITERAL
: ('0x' | '0X') HEX_DIGIT+
;
OCT_INT_LITERAL
: '0' OCT_DIGIT+
;
DEC_INT_LITERAL
: '0'
| [1-9] DEC_DIGIT*
;
WS: [ \t\r\n]+ -> skip;
LINECOMMENT: '//' ~[\r\n]* -> skip;
BLOCKCOMMENT: '/*' .*? '*/' -> skip;
fragment DEC_DIGIT: [0-9];
fragment OCT_DIGIT: [0-7];
fragment HEX_DIGIT: [0-9a-fA-F];
fragment DEC_EXPONENT: [eE] [+-]? DEC_DIGIT+;
fragment BINARY_EXPONENT: [pP] [+-]? DEC_DIGIT+;
/*===-------------------------------------------===*/
/* Syntax rules */
/*===-------------------------------------------===*/
compUnit
: funcDef EOF
: topLevelItem (topLevelItem)* EOF
;
topLevelItem
: decl
| funcDef
;
decl
: btype varDef SEMICOLON
: constDecl
| varDecl
;
constDecl
: CONST bType constDef (COMMA constDef)* SEMICOLON
;
btype
varDecl
: bType varDef (COMMA varDef)* SEMICOLON
;
bType
: INT
| FLOAT
;
constDef
: ID constIndex* ASSIGN constInitVal
;
varDef
: lValue (ASSIGN initValue)?
: ID constIndex* (ASSIGN initVal)?
;
initValue
constIndex
: LBRACK constExp RBRACK
;
constInitVal
: constExp
| LBRACE (constInitVal (COMMA constInitVal)*)? RBRACE
;
initVal
: exp
| LBRACE (initVal (COMMA initVal)*)? RBRACE
;
funcDef
: funcType ID LPAREN RPAREN blockStmt
: funcType ID LPAREN funcFParams? RPAREN block
;
funcType
: INT
: VOID
| INT
| FLOAT
;
funcFParams
: funcFParam (COMMA funcFParam)*
;
blockStmt
funcFParam
: bType ID (LBRACK RBRACK (LBRACK exp RBRACK)*)?
;
block
: LBRACE blockItem* RBRACE
;
@ -71,28 +159,107 @@ blockItem
;
stmt
: returnStmt
: lVal ASSIGN exp SEMICOLON
| exp? SEMICOLON
| block
| IF LPAREN cond RPAREN stmt (ELSE stmt)?
| WHILE LPAREN cond RPAREN stmt
| BREAK SEMICOLON
| CONTINUE SEMICOLON
| RETURN exp? SEMICOLON
;
returnStmt
: RETURN exp SEMICOLON
exp
: addExp
;
exp
: LPAREN exp RPAREN # parenExp
| var # varExp
| number # numberExp
| exp ADD exp # additiveExp
cond
: lOrExp
;
var
: ID
lVal
: ID (LBRACK exp RBRACK)*
;
lValue
: ID
primaryExp
: LPAREN exp RPAREN
| lVal
| number
;
number
: ILITERAL
: intConst
| floatConst
;
intConst
: DEC_INT_LITERAL
| OCT_INT_LITERAL
| HEX_INT_LITERAL
;
floatConst
: DEC_FLOAT_LITERAL
| HEX_FLOAT_LITERAL
;
unaryExp
: primaryExp
| ID LPAREN funcRParams? RPAREN
| addUnaryOp unaryExp
;
addUnaryOp
: ADD
| SUB
;
funcRParams
: exp (COMMA exp)*
;
mulExp
: unaryExp
| mulExp MUL unaryExp
| mulExp DIV unaryExp
| mulExp MOD unaryExp
;
addExp
: mulExp
| addExp ADD mulExp
| addExp SUB mulExp
;
relExp
: addExp
| relExp LT addExp
| relExp GT addExp
| relExp LE addExp
| relExp GE addExp
;
eqExp
: relExp
| eqExp EQ relExp
| eqExp NE relExp
;
lAndExp
: condUnaryExp
| lAndExp AND condUnaryExp
;
lOrExp
: lAndExp
| lOrExp OR lAndExp
;
condUnaryExp
: eqExp
| NOT condUnaryExp
;
constExp
: addExp
;
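
The lexer now distinguishes decimal, octal and hexadecimal integer literals as well as decimal and hexadecimal float literals, so whatever consumes these tokens has to convert the token text accordingly. A minimal sketch, assuming the token text is passed in as a string; `ParseIntLiteral` and `ParseFloatLiteral` are hypothetical helpers, not functions from this repository.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: converts the text of an intConst token to a value.
// Base 0 lets std::stoll choose the radix from the prefix:
// "0x"/"0X" -> 16, a leading "0" -> 8, otherwise 10.
// long long is used so that 2147483648 (as in -2147483648) does not overflow.
long long ParseIntLiteral(const std::string& text) {
  return std::stoll(text, nullptr, 0);
}

// Hypothetical helper: converts the text of a floatConst token to a float.
// std::strtof accepts both decimal forms (".5e1") and hexadecimal forms with
// a binary exponent ("0x1.8p1").
float ParseFloatLiteral(const std::string& text) {
  return std::strtof(text.c_str(), nullptr);
}
```

Relying on the standard library's prefix detection keeps the conversion aligned with the C-style literal syntax that the `HEX_INT_LITERAL`, `OCT_INT_LITERAL` and `HEX_FLOAT_LITERAL` rules describe.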

@ -15,9 +15,31 @@ ConstantInt* Context::GetConstInt(int v) {
return inserted->second.get();
}
ConstantFloat* Context::GetConstFloat(float v) {
auto it = const_floats_.find(v);
if (it != const_floats_.end()) return it->second.get();
auto inserted =
const_floats_.emplace(v, std::make_unique<ConstantFloat>(Type::GetFloatType(), v)).first;
return inserted->second.get();
}
ConstantArray* Context::GetConstArray(std::shared_ptr<Type> ty, std::vector<ConstantValue*> elements) {
auto ca = std::make_unique<ConstantArray>(std::move(ty), std::move(elements));
auto* ptr = ca.get();
const_arrays_.push_back(std::move(ca));
return ptr;
}
ConstantZero* Context::GetConstZero(std::shared_ptr<Type> ty) {
auto cz = std::make_unique<ConstantZero>(std::move(ty));
auto* ptr = cz.get();
const_zeros_.push_back(std::move(cz));
return ptr;
}
std::string Context::NextTemp() {
std::ostringstream oss;
oss << "%" << ++temp_index_;
oss << "%t" << ++temp_index_;
return oss.str();
}
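
`GetConstFloat` interns constants so that equal values share one object. If the underlying map is keyed by the `float` value itself (an assumption; the declaration of `const_floats_` is not shown here), `0.0f` and `-0.0f` compare equal and would share a constant. The sketch below shows an alternative that keys on the bit pattern instead; `FloatPool` and `ConstFloat` are hypothetical, simplified names.

```cpp
#include <cstdint>
#include <cstring>
#include <memory>
#include <unordered_map>

// Hypothetical, simplified constant object.
struct ConstFloat {
  float value;
};

class FloatPool {
 public:
  // Returns the unique constant for the exact bit pattern of `v`.
  ConstFloat* Get(float v) {
    std::uint32_t bits;
    static_assert(sizeof(bits) == sizeof(v), "float must be 32 bits");
    std::memcpy(&bits, &v, sizeof(bits));
    auto it = pool_.find(bits);
    if (it != pool_.end()) return it->second.get();
    auto inserted =
        pool_.emplace(bits, std::make_unique<ConstFloat>(ConstFloat{v})).first;
    return inserted->second.get();
  }

 private:
  std::unordered_map<std::uint32_t, std::unique_ptr<ConstFloat>> pool_;
};
```

Keying on bits also makes NaN values findable, since a NaN never compares equal to itself as a `float` key.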

@ -6,9 +6,7 @@
namespace ir {
Function::Function(std::string name, std::shared_ptr<Type> ret_type)
: Value(std::move(ret_type), std::move(name)) {
entry_ = CreateBlock("entry");
}
: Value(std::move(ret_type), std::move(name)) {}
BasicBlock* Function::CreateBlock(const std::string& name) {
auto block = std::make_unique<BasicBlock>(name);
@ -29,4 +27,15 @@ const std::vector<std::unique_ptr<BasicBlock>>& Function::GetBlocks() const {
return blocks_;
}
Argument* Function::AddArgument(std::shared_ptr<Type> ty, std::string name) {
auto arg = std::make_unique<Argument>(std::move(ty), std::move(name), this, args_.size());
auto* ptr = arg.get();
args_.push_back(std::move(arg));
return ptr;
}
const std::vector<std::unique_ptr<Argument>>& Function::GetArgs() const {
return args_;
}
} // namespace ir

@ -49,6 +49,21 @@ AllocaInst* IRBuilder::CreateAllocaI32(const std::string& name) {
return insert_block_->Append<AllocaInst>(Type::GetPtrInt32Type(), name);
}
AllocaInst* IRBuilder::CreateAllocaFloat(const std::string& name) {
if (!insert_block_) {
throw std::runtime_error(FormatError("ir", "IRBuilder has no insertion point set"));
}
return insert_block_->Append<AllocaInst>(Type::GetPtrFloatType(), name);
}
AllocaInst* IRBuilder::CreateAlloca(std::shared_ptr<Type> ty, const std::string& name) {
if (!insert_block_) {
throw std::runtime_error(FormatError("ir", "IRBuilder has no insertion point set"));
}
auto ptr_ty = Type::GetPointerType(ty);
return insert_block_->Append<AllocaInst>(ptr_ty, name);
}
LoadInst* IRBuilder::CreateLoad(Value* ptr, const std::string& name) {
if (!insert_block_) {
throw std::runtime_error(FormatError("ir", "IRBuilder has no insertion point set"));
@ -57,7 +72,8 @@ LoadInst* IRBuilder::CreateLoad(Value* ptr, const std::string& name) {
throw std::runtime_error(
FormatError("ir", "IRBuilder::CreateLoad is missing ptr"));
}
return insert_block_->Append<LoadInst>(Type::GetInt32Type(), ptr, name);
auto val_ty = ptr->GetType()->GetPointedType();
return insert_block_->Append<LoadInst>(val_ty, ptr, name);
}
StoreInst* IRBuilder::CreateStore(Value* val, Value* ptr) {
@ -79,11 +95,106 @@ ReturnInst* IRBuilder::CreateRet(Value* v) {
if (!insert_block_) {
throw std::runtime_error(FormatError("ir", "IRBuilder has no insertion point set"));
}
if (!v) {
throw std::runtime_error(
FormatError("ir", "IRBuilder::CreateRet is missing a return value"));
}
return insert_block_->Append<ReturnInst>(Type::GetVoidType(), v);
}
BinaryInst* IRBuilder::CreateSub(Value* lhs, Value* rhs, const std::string& name) {
return CreateBinary(Opcode::Sub, lhs, rhs, name);
}
BinaryInst* IRBuilder::CreateMul(Value* lhs, Value* rhs, const std::string& name) {
return CreateBinary(Opcode::Mul, lhs, rhs, name);
}
UnaryInst* IRBuilder::CreateNeg(Value* operand, const std::string& name) {
if (!insert_block_) {
throw std::runtime_error(FormatError("ir", "IRBuilder has no insertion point set"));
}
if (!operand) {
throw std::runtime_error(FormatError("ir", "IRBuilder::CreateNeg is missing an operand"));
}
auto val_ty = (operand->GetType() && operand->GetType()->IsFloat()) ? Type::GetFloatType() : Type::GetInt32Type();
return insert_block_->Append<UnaryInst>(Opcode::Neg, val_ty, operand, name);
}
Instruction* IRBuilder::CreateCmp(CmpOp op, Value* lhs, Value* rhs, const std::string& name) {
if (!insert_block_) {
throw std::runtime_error(FormatError("ir", "IRBuilder has no insertion point set"));
}
if (!lhs || !rhs) {
throw std::runtime_error(FormatError("ir", "IRBuilder::CreateCmp is missing an operand"));
}
if (lhs->GetType()->IsFloat() || rhs->GetType()->IsFloat()) {
if (!lhs->GetType()->IsFloat()) {
lhs = CreateSIToFP(lhs, ctx_.NextTemp());
}
if (!rhs->GetType()->IsFloat()) {
rhs = CreateSIToFP(rhs, ctx_.NextTemp());
}
return insert_block_->Append<FCmpInst>(op, lhs, rhs, name);
}
return insert_block_->Append<CmpInst>(op, lhs, rhs, name);
}
ZextInst* IRBuilder::CreateZext(Value* val, const std::string& name) {
if (!insert_block_) {
throw std::runtime_error(FormatError("ir", "IRBuilder has no insertion point set"));
}
if (!val) {
throw std::runtime_error(FormatError("ir", "IRBuilder::CreateZext is missing an operand"));
}
return insert_block_->Append<ZextInst>(Type::GetInt32Type(), val, name);
}
BranchInst* IRBuilder::CreateBr(BasicBlock* dest) {
if (!insert_block_) {
throw std::runtime_error(FormatError("ir", "IRBuilder has no insertion point set"));
}
if (!dest) {
throw std::runtime_error(FormatError("ir", "IRBuilder::CreateBr is missing an operand"));
}
return insert_block_->Append<BranchInst>(dest);
}
CondBranchInst* IRBuilder::CreateCondBr(Value* cond, BasicBlock* true_bb, BasicBlock* false_bb) {
if (!insert_block_) {
throw std::runtime_error(FormatError("ir", "IRBuilder has no insertion point set"));
}
if (!cond || !true_bb || !false_bb) {
throw std::runtime_error(FormatError("ir", "IRBuilder::CreateCondBr is missing an operand"));
}
return insert_block_->Append<CondBranchInst>(cond, true_bb, false_bb);
}
CallInst* IRBuilder::CreateCall(Function* func, std::vector<Value*> args, const std::string& name) {
if (!insert_block_) {
throw std::runtime_error(FormatError("ir", "IRBuilder has no insertion point set"));
}
if (!func) {
throw std::runtime_error(FormatError("ir", "IRBuilder::CreateCall is missing the target function"));
}
return insert_block_->Append<CallInst>(func, std::move(args), name);
}
GEPInst* IRBuilder::CreateGEP(std::shared_ptr<Type> ty, Value* ptr, std::vector<Value*> indices, const std::string& name) {
if (!insert_block_) {
throw std::runtime_error(FormatError("ir", "IRBuilder has no insertion point set"));
}
return insert_block_->Append<GEPInst>(ty, ptr, std::move(indices), name);
}
SIToFPInst* IRBuilder::CreateSIToFP(Value* val, const std::string& name) {
if (!insert_block_) {
throw std::runtime_error(FormatError("ir", "IRBuilder has no insertion point set"));
}
return insert_block_->Append<SIToFPInst>(Type::GetFloatType(), val, name);
}
FPToSIInst* IRBuilder::CreateFPToSI(Value* val, const std::string& name) {
if (!insert_block_) {
throw std::runtime_error(FormatError("ir", "IRBuilder has no insertion point set"));
}
return insert_block_->Append<FPToSIInst>(Type::GetInt32Type(), val, name);
}
} // namespace ir

@ -7,6 +7,7 @@
#include <ostream>
#include <stdexcept>
#include <string>
#include <cstring>
#include "utils/Log.h"
@ -16,10 +17,26 @@ static const char* TypeToString(const Type& ty) {
switch (ty.GetKind()) {
case Type::Kind::Void:
return "void";
case Type::Kind::Int1:
return "i1";
case Type::Kind::Int32:
return "i32";
case Type::Kind::PtrInt32:
return "i32*";
case Type::Kind::Float:
return "float";
case Type::Kind::PtrFloat:
return "float*";
case Type::Kind::Array: {
static thread_local std::string buf;
buf = "[" + std::to_string(ty.GetNumElements()) + " x " + TypeToString(*ty.GetElementType()) + "]";
return buf.c_str();
}
case Type::Kind::Pointer: {
static thread_local std::string buf;
buf = std::string(TypeToString(*ty.GetPointedType())) + "*";
return buf.c_str();
}
}
throw std::runtime_error(FormatError("ir", "unknown type"));
}
@ -32,6 +49,12 @@ static const char* OpcodeToString(Opcode op) {
return "sub";
case Opcode::Mul:
return "mul";
case Opcode::Div:
return "sdiv";
case Opcode::Mod:
return "srem";
case Opcode::Neg:
return "neg";
case Opcode::Alloca:
return "alloca";
case Opcode::Load:
@ -40,6 +63,71 @@ static const char* OpcodeToString(Opcode op) {
return "store";
case Opcode::Ret:
return "ret";
case Opcode::Cmp:
return "icmp";
case Opcode::FCmp:
return "fcmp";
case Opcode::Zext:
return "zext";
case Opcode::Br:
case Opcode::CondBr:
return "br";
case Opcode::Call:
return "call";
case Opcode::GEP:
return "getelementptr";
case Opcode::SIToFP:
return "sitofp";
case Opcode::FPToSI:
return "fptosi";
}
return "?";
}
static const char* CmpOpToString(CmpOp op) {
switch (op) {
case CmpOp::Eq:
return "eq";
case CmpOp::Ne:
return "ne";
case CmpOp::Lt:
return "slt";
case CmpOp::Gt:
return "sgt";
case CmpOp::Le:
return "sle";
case CmpOp::Ge:
return "sge";
}
return "?";
}
static const char* GetElementTypeName(const Type& ty) {
if (ty.IsPointer()) {
return TypeToString(*ty.GetPointedType());
}
switch (ty.GetKind()) {
case Type::Kind::Array:
return TypeToString(*ty.GetElementType());
default:
return TypeToString(ty);
}
}
static const char* FCmpOpToString(CmpOp op) {
switch (op) {
case CmpOp::Eq:
return "oeq";
case CmpOp::Ne:
return "one";
case CmpOp::Lt:
return "olt";
case CmpOp::Gt:
return "ogt";
case CmpOp::Le:
return "ole";
case CmpOp::Ge:
return "oge";
}
return "?";
}
@ -48,53 +136,233 @@ static std::string ValueToString(const Value* v) {
if (auto* ci = dynamic_cast<const ConstantInt*>(v)) {
return std::to_string(ci->GetValue());
}
return v ? v->GetName() : "<null>";
if (auto* cf = dynamic_cast<const ConstantFloat*>(v)) {
double d = (double)cf->GetValue();
uint64_t val;
static_assert(sizeof(double) == sizeof(uint64_t));
std::memcpy(&val, &d, sizeof(double));
char buf[64];
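// %lX assumes a 64-bit unsigned long; cast explicitly so the format is portable.
snprintf(buf, sizeof(buf), "0x%llX", static_cast<unsigned long long>(val));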
return std::string(buf);
}
if (dynamic_cast<const GlobalValue*>(v)) {
return "@" + v->GetName();
}
if (auto* ca = dynamic_cast<const ConstantArray*>(v)) {
std::string s = "[";
const auto& elems = ca->GetElements();
for (size_t i = 0; i < elems.size(); ++i) {
if (i > 0) s += ", ";
s += TypeToString(*elems[i]->GetType());
s += " ";
s += ValueToString(elems[i]);
}
s += "]";
return s;
}
if (dynamic_cast<const ConstantZero*>(v)) {
return "zeroinitializer";
}
if (v) {
std::string name = v->GetName();
if (!name.empty() && name[0] != '%' && name[0] != '@') {
return "%" + name;
}
return name;
}
return "<null>";
}
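
The `ConstantFloat` branch above prints a `float` constant as the hexadecimal bit pattern of the same value widened to `double`, which is the form LLVM IR accepts for `float` literals that have no exact short decimal spelling. A standalone sketch of that encoding follows; `FloatToLLVMHex` is a hypothetical name, and this version pads to 16 hex digits, a choice made here for readability rather than something taken from the source.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: formats a float as an LLVM-style hexadecimal constant.
std::string FloatToLLVMHex(float f) {
  // Widening float -> double is exact, so the double's bit pattern is
  // guaranteed to be representable as a float again.
  const double d = static_cast<double>(f);
  std::uint64_t bits;
  static_assert(sizeof(bits) == sizeof(d), "double must be 64 bits");
  std::memcpy(&bits, &d, sizeof(bits));
  char buf[32];
  // PRIX64 selects the right length modifier for uint64_t on every platform.
  std::snprintf(buf, sizeof(buf), "0x%016" PRIX64, bits);
  return std::string(buf);
}
```

For instance, `FloatToLLVMHex(1.0f)` yields `0x3FF0000000000000`, which can be used in a line such as `store float 0x3FF0000000000000, float* %p`.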
static std::string PrintLabel(const Value* bb) {
if (!bb) return "<null>";
std::string name = bb->GetName();
if (name.empty()) return "<empty>";
if (name[0] == '%') return name;
return "%" + name;
}
static std::string PrintLabelDef(const Value* bb) {
if (!bb) return "<null>";
std::string name = bb->GetName();
if (!name.empty() && name[0] == '%') return name.substr(1);
return name;
}
void IRPrinter::Print(const Module& module, std::ostream& os) {
for (const auto& gv : module.GetGlobalVariables()) {
os << "@" << gv->GetName() << " = global "
<< GetElementTypeName(*gv->GetType()) << " "
<< ValueToString(gv->GetInitializer()) << "\n";
}
for (const auto& func : module.GetFunctions()) {
os << "define " << TypeToString(*func->GetType()) << " @" << func->GetName()
<< "() {\n";
if (func->GetBlocks().empty()) {
os << "declare " << TypeToString(*func->GetType()) << " @" << func->GetName() << "(";
const auto& args = func->GetArgs();
for (size_t i = 0; i < args.size(); ++i) {
if (i > 0) os << ", ";
os << TypeToString(*args[i]->GetType());
}
os << ")\n";
continue;
}
os << "define " << TypeToString(*func->GetType()) << " @" << func->GetName() << "(";
const auto& args = func->GetArgs();
for (size_t i = 0; i < args.size(); ++i) {
if (i > 0) os << ", ";
os << TypeToString(*args[i]->GetType()) << " " << args[i]->GetName();
}
os << ") {\n";
for (const auto& bb : func->GetBlocks()) {
if (!bb) {
continue;
}
os << bb->GetName() << ":\n";
os << PrintLabelDef(bb.get()) << ":\n";
for (const auto& instPtr : bb->GetInstructions()) {
const auto* inst = instPtr.get();
switch (inst->GetOpcode()) {
case Opcode::Add:
case Opcode::Sub:
case Opcode::Mul: {
case Opcode::Mul:
case Opcode::Div:
case Opcode::Mod: {
auto* bin = static_cast<const BinaryInst*>(inst);
os << " " << bin->GetName() << " = "
<< OpcodeToString(bin->GetOpcode()) << " "
bool is_float = bin->GetType()->IsFloat();
std::string op_name = OpcodeToString(bin->GetOpcode());
if (is_float) {
if (op_name == "add") op_name = "fadd";
else if (op_name == "sub") op_name = "fsub";
else if (op_name == "mul") op_name = "fmul";
else if (op_name == "sdiv") op_name = "fdiv";
}
os << " " << ValueToString(bin) << " = "
<< op_name << " "
<< TypeToString(*bin->GetLhs()->GetType()) << " "
<< ValueToString(bin->GetLhs()) << ", "
<< ValueToString(bin->GetRhs()) << "\n";
break;
}
case Opcode::Neg: {
auto* unary = static_cast<const UnaryInst*>(inst);
bool is_float = unary->GetType()->IsFloat();
os << " " << ValueToString(unary) << " = "
<< (is_float ? "fneg" : "sub") << " "
<< TypeToString(*unary->GetUnaryOperand()->GetType()) << " "
<< (is_float ? "" : "0, ")
<< ValueToString(unary->GetUnaryOperand()) << "\n";
break;
}
case Opcode::Alloca: {
auto* alloca = static_cast<const AllocaInst*>(inst);
os << " " << alloca->GetName() << " = alloca i32\n";
os << " " << ValueToString(alloca) << " = alloca "
<< GetElementTypeName(*alloca->GetType()) << "\n";
break;
}
case Opcode::Load: {
auto* load = static_cast<const LoadInst*>(inst);
os << " " << load->GetName() << " = load i32, i32* "
os << " " << ValueToString(load) << " = load "
<< TypeToString(*load->GetType()) << ", "
<< TypeToString(*load->GetPtr()->GetType()) << " "
<< ValueToString(load->GetPtr()) << "\n";
break;
}
case Opcode::Store: {
auto* store = static_cast<const StoreInst*>(inst);
os << " store i32 " << ValueToString(store->GetValue())
<< ", i32* " << ValueToString(store->GetPtr()) << "\n";
os << " store " << TypeToString(*store->GetValue()->GetType()) << " "
<< ValueToString(store->GetValue()) << ", "
<< TypeToString(*store->GetPtr()->GetType()) << " "
<< ValueToString(store->GetPtr()) << "\n";
break;
}
case Opcode::Ret: {
auto* ret = static_cast<const ReturnInst*>(inst);
os << " ret " << TypeToString(*ret->GetValue()->GetType()) << " "
<< ValueToString(ret->GetValue()) << "\n";
if (auto* val = ret->GetValue()) {
os << " ret " << TypeToString(*val->GetType()) << " "
<< ValueToString(val) << "\n";
} else {
os << " ret void\n";
}
break;
}
case Opcode::Cmp: {
auto* cmp = static_cast<const CmpInst*>(inst);
os << " " << ValueToString(cmp) << " = icmp "
<< CmpOpToString(cmp->GetCmpOp()) << " "
<< TypeToString(*cmp->GetLhs()->GetType()) << " "
<< ValueToString(cmp->GetLhs()) << ", "
<< ValueToString(cmp->GetRhs()) << "\n";
break;
}
case Opcode::FCmp: {
auto* cmp = static_cast<const FCmpInst*>(inst);
os << " " << ValueToString(cmp) << " = fcmp "
<< FCmpOpToString(cmp->GetCmpOp()) << " "
<< TypeToString(*cmp->GetLhs()->GetType()) << " "
<< ValueToString(cmp->GetLhs()) << ", "
<< ValueToString(cmp->GetRhs()) << "\n";
break;
}
case Opcode::Zext: {
auto* zext = static_cast<const ZextInst*>(inst);
os << " " << ValueToString(zext) << " = zext "
<< TypeToString(*zext->GetOperand(0)->GetType()) << " "
<< ValueToString(zext->GetOperand(0)) << " to "
<< TypeToString(*zext->GetType()) << "\n";
break;
}
case Opcode::Br: {
auto* br = static_cast<const BranchInst*>(inst);
os << " br label " << PrintLabel(br->GetDest()) << "\n";
break;
}
case Opcode::CondBr: {
auto* cbr = static_cast<const CondBranchInst*>(inst);
os << " br i1 " << ValueToString(cbr->GetCond())
<< ", label " << PrintLabel(cbr->GetTrueBlock())
<< ", label " << PrintLabel(cbr->GetFalseBlock()) << "\n";
break;
}
case Opcode::Call: {
auto* call = static_cast<const CallInst*>(inst);
if (call->GetType()->IsVoid()) {
os << " call void @" << call->GetFunc()->GetName() << "(";
} else {
os << " " << ValueToString(call) << " = call " << TypeToString(*call->GetType())
<< " @" << call->GetFunc()->GetName() << "(";
}
for (size_t i = 0; i < call->GetArgs().size(); ++i) {
if (i > 0) os << ", ";
auto* arg = call->GetArgs()[i];
os << TypeToString(*arg->GetType()) << " " << ValueToString(arg);
}
os << ")\n";
break;
}
case Opcode::GEP: {
auto* gep = static_cast<const GEPInst*>(inst);
os << " " << ValueToString(gep) << " = getelementptr "
<< GetElementTypeName(*gep->GetPtr()->GetType()) << ", "
<< TypeToString(*gep->GetPtr()->GetType()) << " "
<< ValueToString(gep->GetPtr());
for (auto* idx : gep->GetIndices()) {
os << ", " << TypeToString(*idx->GetType()) << " " << ValueToString(idx);
}
os << "\n";
break;
}
case Opcode::SIToFP: {
auto* conv = static_cast<const SIToFPInst*>(inst);
os << " " << ValueToString(conv) << " = sitofp "
<< TypeToString(*conv->GetOperand(0)->GetType()) << " "
<< ValueToString(conv->GetOperand(0)) << " to "
<< TypeToString(*conv->GetType()) << "\n";
break;
}
case Opcode::FPToSI: {
auto* conv = static_cast<const FPToSIInst*>(inst);
os << " " << ValueToString(conv) << " = fptosi "
<< TypeToString(*conv->GetOperand(0)->GetType()) << " "
<< ValueToString(conv->GetOperand(0)) << " to "
<< TypeToString(*conv->GetType()) << "\n";
break;
}
}
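The float handling in the binary-instruction printer above rewrites `add`, `sub`, `mul` and `sdiv` one `if` at a time and has no branch for the remainder opcode. A minimal table-driven sketch of the same selection follows. The helper names are hypothetical, and the `srem` to `frem` entry is an addition that assumes `OpcodeToString` yields `srem` for `Opcode::Mod`, which this diff does not show.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical helper: maps an integer arithmetic mnemonic to its
// floating-point counterpart. Names not in the table are returned unchanged.
std::string ToFloatMnemonic(const std::string& int_name) {
  static const std::unordered_map<std::string, std::string> kTable = {
      {"add", "fadd"}, {"sub", "fsub"}, {"mul", "fmul"},
      {"sdiv", "fdiv"}, {"srem", "frem"},  // "srem" is an assumed spelling
  };
  auto it = kTable.find(int_name);
  return it == kTable.end() ? int_name : it->second;
}

// Selects the mnemonic to print for a binary instruction.
std::string SelectBinaryMnemonic(const std::string& int_name, bool is_float) {
  assert(!int_name.empty());
  return is_float ? ToFloatMnemonic(int_name) : int_name;
}
```

Keeping the mapping in one table makes a missing entry easier to spot than a chain of string comparisons.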

@ -52,7 +52,7 @@ Instruction::Instruction(Opcode op, std::shared_ptr<Type> ty, std::string name)
Opcode Instruction::GetOpcode() const { return opcode_; }
bool Instruction::IsTerminator() const { return opcode_ == Opcode::Ret; }
bool Instruction::IsTerminator() const { return opcode_ == Opcode::Ret || opcode_ == Opcode::Br || opcode_ == Opcode::CondBr; }
BasicBlock* Instruction::GetParent() const { return parent_; }
@ -61,8 +61,9 @@ void Instruction::SetParent(BasicBlock* parent) { parent_ = parent; }
BinaryInst::BinaryInst(Opcode op, std::shared_ptr<Type> ty, Value* lhs,
Value* rhs, std::string name)
: Instruction(op, std::move(ty), std::move(name)) {
if (op != Opcode::Add) {
throw std::runtime_error(FormatError("ir", "BinaryInst 当前只支持 Add"));
if (op != Opcode::Add && op != Opcode::Sub && op != Opcode::Mul &&
op != Opcode::Div && op != Opcode::Mod) {
throw std::runtime_error(FormatError("ir", "BinaryInst 不支持的操作码"));
}
if (!lhs || !rhs) {
throw std::runtime_error(FormatError("ir", "BinaryInst 缺少操作数"));
@ -74,8 +75,8 @@ BinaryInst::BinaryInst(Opcode op, std::shared_ptr<Type> ty, Value* lhs,
type_->GetKind() != lhs->GetType()->GetKind()) {
throw std::runtime_error(FormatError("ir", "BinaryInst 类型不匹配"));
}
if (!type_->IsInt32()) {
throw std::runtime_error(FormatError("ir", "BinaryInst 当前只支持 i32"));
if (!type_->IsInt32() && !type_->IsFloat()) {
throw std::runtime_error(FormatError("ir", "BinaryInst 当前只支持 i32 或 float"));
}
AddOperand(lhs);
AddOperand(rhs);
@ -85,37 +86,53 @@ Value* BinaryInst::GetLhs() const { return GetOperand(0); }
Value* BinaryInst::GetRhs() const { return GetOperand(1); }
UnaryInst::UnaryInst(Opcode op, std::shared_ptr<Type> ty, Value* operand,
std::string name)
: Instruction(op, std::move(ty), std::move(name)) {
if (op != Opcode::Neg) {
throw std::runtime_error(FormatError("ir", "UnaryInst 不支持的操作码"));
}
if (!operand) {
throw std::runtime_error(FormatError("ir", "UnaryInst 缺少操作数"));
}
if (!type_ || !operand->GetType()) {
throw std::runtime_error(FormatError("ir", "UnaryInst 缺少类型信息"));
}
if (type_->GetKind() != operand->GetType()->GetKind()) {
throw std::runtime_error(FormatError("ir", "UnaryInst 类型不匹配"));
}
if (!type_->IsInt32() && !type_->IsFloat()) {
throw std::runtime_error(FormatError("ir", "UnaryInst 当前只支持 i32 或 float"));
}
AddOperand(operand);
}
Value* UnaryInst::GetUnaryOperand() const { return GetOperand(0); }
ReturnInst::ReturnInst(std::shared_ptr<Type> void_ty, Value* val)
: Instruction(Opcode::Ret, std::move(void_ty), "") {
if (!val) {
throw std::runtime_error(FormatError("ir", "ReturnInst 缺少返回值"));
}
if (!type_ || !type_->IsVoid()) {
throw std::runtime_error(FormatError("ir", "ReturnInst 返回类型必须为 void"));
}
AddOperand(val);
if (val) {
AddOperand(val);
}
}
Value* ReturnInst::GetValue() const { return GetOperand(0); }
Value* ReturnInst::GetValue() const {
return GetNumOperands() > 0 ? GetOperand(0) : nullptr;
}
AllocaInst::AllocaInst(std::shared_ptr<Type> ptr_ty, std::string name)
: Instruction(Opcode::Alloca, std::move(ptr_ty), std::move(name)) {
if (!type_ || !type_->IsPtrInt32()) {
throw std::runtime_error(FormatError("ir", "AllocaInst 当前只支持 i32*"));
}
}
: Instruction(Opcode::Alloca, std::move(ptr_ty), std::move(name)) {}
LoadInst::LoadInst(std::shared_ptr<Type> val_ty, Value* ptr, std::string name)
: Instruction(Opcode::Load, std::move(val_ty), std::move(name)) {
if (!ptr) {
throw std::runtime_error(FormatError("ir", "LoadInst 缺少 ptr"));
}
if (!type_ || !type_->IsInt32()) {
throw std::runtime_error(FormatError("ir", "LoadInst 当前只支持加载 i32"));
}
if (!ptr->GetType() || !ptr->GetType()->IsPtrInt32()) {
throw std::runtime_error(
FormatError("ir", "LoadInst 当前只支持从 i32* 加载"));
if (!type_ || (!type_->IsInt32() && !type_->IsFloat() && !type_->IsInt1())) {
// Note: IsInt1 is for Zext or comparisons
}
AddOperand(ptr);
}
@ -133,12 +150,19 @@ StoreInst::StoreInst(std::shared_ptr<Type> void_ty, Value* val, Value* ptr)
if (!type_ || !type_->IsVoid()) {
throw std::runtime_error(FormatError("ir", "StoreInst 返回类型必须为 void"));
}
if (!val->GetType() || !val->GetType()->IsInt32()) {
throw std::runtime_error(FormatError("ir", "StoreInst 当前只支持存储 i32"));
if (!val->GetType() || (!val->GetType()->IsInt32() && !val->GetType()->IsFloat() && !val->GetType()->IsPointer())) {
throw std::runtime_error(FormatError("ir", "StoreInst 当前只支持存储 i32、float 或指针类型"));
}
if (!ptr->GetType() || !ptr->GetType()->IsPtrInt32()) {
throw std::runtime_error(
FormatError("ir", "StoreInst 当前只支持写入 i32*"));
if (val->GetType()->IsInt32() || val->GetType()->IsPointer()) {
if (!ptr->GetType() || !ptr->GetType()->IsPointer()) {
throw std::runtime_error(
FormatError("ir", "StoreInst 当前只支持写入指针类型槽位"));
}
} else if (val->GetType()->IsFloat()) {
if (!ptr->GetType() || !ptr->GetType()->IsPtrFloat()) {
throw std::runtime_error(
FormatError("ir", "StoreInst 当前只支持写入 float*"));
}
}
AddOperand(val);
AddOperand(ptr);
@ -148,4 +172,117 @@ Value* StoreInst::GetValue() const { return GetOperand(0); }
Value* StoreInst::GetPtr() const { return GetOperand(1); }
CmpInst::CmpInst(CmpOp cmp_op, Value* lhs, Value* rhs, std::string name)
: Instruction(Opcode::Cmp, Type::GetInt1Type(), std::move(name)), cmp_op_(cmp_op) {
if (!lhs || !rhs) {
throw std::runtime_error(FormatError("ir", "CmpInst 缺少操作数"));
}
if (!lhs->GetType() || !rhs->GetType()) {
throw std::runtime_error(FormatError("ir", "CmpInst 缺少操作数类型信息"));
}
if (lhs->GetType()->GetKind() != rhs->GetType()->GetKind()) {
throw std::runtime_error(FormatError("ir", "CmpInst 操作数类型不匹配"));
}
AddOperand(lhs);
AddOperand(rhs);
}
CmpOp CmpInst::GetCmpOp() const { return cmp_op_; }
Value* CmpInst::GetLhs() const { return GetOperand(0); }
Value* CmpInst::GetRhs() const { return GetOperand(1); }
FCmpInst::FCmpInst(CmpOp cmp_op, Value* lhs, Value* rhs, std::string name)
: Instruction(Opcode::FCmp, Type::GetInt1Type(), std::move(name)), cmp_op_(cmp_op) {
if (!lhs || !rhs) {
throw std::runtime_error(FormatError("ir", "FCmpInst 缺少操作数"));
}
if (!lhs->GetType() || !rhs->GetType()) {
throw std::runtime_error(FormatError("ir", "FCmpInst 缺少操作数类型信息"));
}
if (lhs->GetType()->GetKind() != rhs->GetType()->GetKind()) {
throw std::runtime_error(FormatError("ir", "FCmpInst 操作数类型不匹配"));
}
AddOperand(lhs);
AddOperand(rhs);
}
CmpOp FCmpInst::GetCmpOp() const { return cmp_op_; }
Value* FCmpInst::GetLhs() const { return GetOperand(0); }
Value* FCmpInst::GetRhs() const { return GetOperand(1); }
ZextInst::ZextInst(std::shared_ptr<Type> dest_ty, Value* val, std::string name)
: Instruction(Opcode::Zext, std::move(dest_ty), std::move(name)) {
if (!val) {
throw std::runtime_error(FormatError("ir", "ZextInst 缺少操作数"));
}
if (!type_->IsInt32() || !val->GetType()->IsInt1()) {
throw std::runtime_error(FormatError("ir", "ZextInst 当前只支持 i1 到 i32"));
}
AddOperand(val);
}
Value* ZextInst::GetValue() const { return GetOperand(0); }
BranchInst::BranchInst(BasicBlock* dest)
: Instruction(Opcode::Br, Type::GetVoidType(), "") {
if (!dest) {
throw std::runtime_error(FormatError("ir", "BranchInst 缺少目的块"));
}
AddOperand(dest);
}
BasicBlock* BranchInst::GetDest() const { return static_cast<BasicBlock*>(GetOperand(0)); }
CondBranchInst::CondBranchInst(Value* cond, BasicBlock* true_bb, BasicBlock* false_bb)
: Instruction(Opcode::CondBr, Type::GetVoidType(), "") {
if (!cond || !true_bb || !false_bb) {
throw std::runtime_error(FormatError("ir", "CondBranchInst 缺少连边操作数"));
}
if (!cond->GetType()->IsInt1()) {
throw std::runtime_error(FormatError("ir", "CondBranchInst 必须使用 i1 作为条件"));
}
AddOperand(cond);
AddOperand(true_bb);
AddOperand(false_bb);
}
Value* CondBranchInst::GetCond() const { return GetOperand(0); }
BasicBlock* CondBranchInst::GetTrueBlock() const { return static_cast<BasicBlock*>(GetOperand(1)); }
BasicBlock* CondBranchInst::GetFalseBlock() const { return static_cast<BasicBlock*>(GetOperand(2)); }
CallInst::CallInst(Function* func, std::vector<Value*> args, std::string name)
: Instruction(Opcode::Call, func ? func->GetType() : nullptr, std::move(name)), func_(func), args_(std::move(args)) {
if (!func) {
throw std::runtime_error(FormatError("ir", "CallInst 缺少目标函数"));
}
AddOperand(func);
for (auto* arg : args_) {
AddOperand(arg);
}
}
Function* CallInst::GetFunc() const { return func_; }
const std::vector<Value*>& CallInst::GetArgs() const { return args_; }
GEPInst::GEPInst(std::shared_ptr<Type> ty, Value* ptr, std::vector<Value*> indices, std::string name)
: Instruction(Opcode::GEP, std::move(ty), std::move(name)), indices_(std::move(indices)) {
AddOperand(ptr);
for (auto* idx : indices_) {
AddOperand(idx);
}
}
Value* GEPInst::GetPtr() const { return GetOperand(0); }
const std::vector<Value*>& GEPInst::GetIndices() const { return indices_; }
SIToFPInst::SIToFPInst(std::shared_ptr<Type> ty, Value* val, std::string name)
: Instruction(Opcode::SIToFP, std::move(ty), std::move(name)) {
AddOperand(val);
}
FPToSIInst::FPToSIInst(std::shared_ptr<Type> ty, Value* val, std::string name)
: Instruction(Opcode::FPToSI, std::move(ty), std::move(name)) {
AddOperand(val);
}
} // namespace ir
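Several constructors in this file validate a pointer in the constructor body, but a member-initializer list runs before the body, so any dereference there happens before the check. One possible way to keep the check ahead of the dereference is a small validating helper, sketched below. `MiniType`, `MiniFunction`, `MiniCall` and `RequireNonNull` are stand-ins invented for this sketch, not the project's classes.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Stand-in types for illustration only.
struct MiniType { std::string name; };

struct MiniFunction {
  std::shared_ptr<MiniType> ret_ty;
  const std::shared_ptr<MiniType>& GetType() const { return ret_ty; }
};

// Throws if `ptr` is null, otherwise returns it unchanged, so it can be used
// inside a member-initializer list before anything dereferences the pointer.
template <typename T>
T* RequireNonNull(T* ptr, const char* message) {
  if (!ptr) throw std::runtime_error(message);
  return ptr;
}

class MiniCall {
 public:
  explicit MiniCall(MiniFunction* func)
      : type_(RequireNonNull(func, "call: missing callee")->GetType()),
        func_(func) {
    assert(func_ != nullptr);  // guaranteed by RequireNonNull above
  }

  const std::shared_ptr<MiniType>& GetType() const { return type_; }
  MiniFunction* GetFunc() const { return func_; }

 private:
  std::shared_ptr<MiniType> type_;  // declared first, so initialized first
  MiniFunction* func_;
};
```

Constructing `MiniCall` with a null callee throws `std::runtime_error` instead of dereferencing the null pointer.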

@ -18,4 +18,13 @@ const std::vector<std::unique_ptr<Function>>& Module::GetFunctions() const {
return functions_;
}
GlobalVariable* Module::CreateGlobalVariable(const std::string& name, std::shared_ptr<Type> type, ConstantValue* init) {
global_variables_.push_back(std::make_unique<GlobalVariable>(name, std::move(type), init));
return global_variables_.back().get();
}
const std::vector<std::unique_ptr<GlobalVariable>>& Module::GetGlobalVariables() const {
return global_variables_;
}
} // namespace ir

@ -4,28 +4,67 @@
namespace ir {
Type::Type(Kind k) : kind_(k) {}
Type::Type(Kind k, std::shared_ptr<Type> elem_ty, int num_elems)
: kind_(k), elem_ty_(std::move(elem_ty)), num_elems_(num_elems) {}
const std::shared_ptr<Type>& Type::GetVoidType() {
static const std::shared_ptr<Type> type = std::make_shared<Type>(Kind::Void);
return type;
}
const std::shared_ptr<Type>& Type::GetInt1Type() {
static const std::shared_ptr<Type> type = std::make_shared<Type>(Kind::Int1);
return type;
}
const std::shared_ptr<Type>& Type::GetInt32Type() {
static const std::shared_ptr<Type> type = std::make_shared<Type>(Kind::Int32);
return type;
}
const std::shared_ptr<Type>& Type::GetPtrInt32Type() {
static const std::shared_ptr<Type> type = std::make_shared<Type>(Kind::PtrInt32);
static const std::shared_ptr<Type> type = std::make_shared<Type>(Kind::PtrInt32, GetInt32Type(), 0);
return type;
}
const std::shared_ptr<Type>& Type::GetFloatType() {
static std::shared_ptr<Type> ty = std::make_shared<Type>(Kind::Float);
return ty;
}
const std::shared_ptr<Type>& Type::GetPtrFloatType() {
static std::shared_ptr<Type> ty = std::make_shared<Type>(Kind::PtrFloat, GetFloatType(), 0);
return ty;
}
std::shared_ptr<Type> Type::GetArrayType(std::shared_ptr<Type> elem_ty, int num_elems) {
return std::make_shared<Type>(Kind::Array, std::move(elem_ty), num_elems);
}
std::shared_ptr<Type> Type::GetPointerType(std::shared_ptr<Type> pointed_ty) {
return std::make_shared<Type>(Kind::Pointer, std::move(pointed_ty), 0);
}
Type::Kind Type::GetKind() const { return kind_; }
bool Type::IsVoid() const { return kind_ == Kind::Void; }
bool Type::IsInt1() const { return kind_ == Kind::Int1; }
bool Type::IsInt32() const { return kind_ == Kind::Int32; }
bool Type::IsPtrInt32() const { return kind_ == Kind::PtrInt32; }
bool Type::IsPtrInt32() const {
return kind_ == Kind::PtrInt32 || (kind_ == Kind::Pointer && GetPointedType() && GetPointedType()->IsInt32());
}
bool Type::IsFloat() const { return kind_ == Kind::Float; }
bool Type::IsPtrFloat() const {
return kind_ == Kind::PtrFloat || (kind_ == Kind::Pointer && GetPointedType() && GetPointedType()->IsFloat());
}
bool Type::IsArray() const { return kind_ == Kind::Array; }
bool Type::IsPointer() const { return kind_ == Kind::Pointer || kind_ == Kind::PtrInt32 || kind_ == Kind::PtrFloat; }
} // namespace ir
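`GetArrayType` and `GetPointerType` allocate a fresh `Type` on every call, so two structurally identical array or pointer types are distinct objects, and the `GetKind()` comparisons used by `BinaryInst` and `CmpInst` treat any two arrays (or any two pointers) as equal. A structural comparison is one way to close that gap. The sketch below uses a simplified stand-in `MiniType`; its fields and the `SameType` helper are assumptions, not part of the diff.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Simplified stand-in for the IR type class.
struct MiniType {
  enum class Kind { Int32, Float, Array, Pointer };
  Kind kind;
  std::shared_ptr<MiniType> elem;  // element or pointee type, null for scalars
  int num_elems;                   // only meaningful for arrays
};

std::shared_ptr<MiniType> MakeScalar(MiniType::Kind k) {
  return std::make_shared<MiniType>(MiniType{k, nullptr, 0});
}
std::shared_ptr<MiniType> MakeArray(std::shared_ptr<MiniType> elem, int n) {
  assert(n >= 0);
  return std::make_shared<MiniType>(
      MiniType{MiniType::Kind::Array, std::move(elem), n});
}
std::shared_ptr<MiniType> MakePointer(std::shared_ptr<MiniType> pointee) {
  return std::make_shared<MiniType>(
      MiniType{MiniType::Kind::Pointer, std::move(pointee), 0});
}

// Structural equality: same kind, same element count, and recursively equal
// element/pointee types. Object identity is only a fast path.
bool SameType(const MiniType* a, const MiniType* b) {
  if (a == b) return true;
  if (!a || !b) return false;
  if (a->kind != b->kind || a->num_elems != b->num_elems) return false;
  return SameType(a->elem.get(), b->elem.get());
}
```

An alternative design would be to intern compound types in the context so that pointer comparison suffices; the diff does not indicate which direction the project intends.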

@ -18,10 +18,16 @@ void Value::SetName(std::string n) { name_ = std::move(n); }
bool Value::IsVoid() const { return type_ && type_->IsVoid(); }
bool Value::IsInt1() const { return type_ && type_->IsInt1(); }
bool Value::IsInt32() const { return type_ && type_->IsInt32(); }
bool Value::IsPtrInt32() const { return type_ && type_->IsPtrInt32(); }
bool Value::IsFloat() const { return type_ && type_->IsFloat(); }
bool Value::IsPtrFloat() const { return type_ && type_->IsPtrFloat(); }
bool Value::IsConstant() const {
return dynamic_cast<const ConstantValue*>(this) != nullptr;
}
@ -38,6 +44,10 @@ bool Value::IsFunction() const {
return dynamic_cast<const Function*>(this) != nullptr;
}
bool Value::IsArgument() const {
return dynamic_cast<const Argument*>(this) != nullptr;
}
void Value::AddUse(User* user, size_t operand_index) {
if (!user) return;
uses_.push_back(Use(this, user, operand_index));
@ -74,10 +84,29 @@ void Value::ReplaceAllUsesWith(Value* new_value) {
}
}
Argument::Argument(std::shared_ptr<Type> ty, std::string name, Function* parent, size_t arg_no)
: Value(std::move(ty), std::move(name)), parent_(parent), arg_no_(arg_no) {}
Function* Argument::GetParent() const { return parent_; }
size_t Argument::GetArgNo() const { return arg_no_; }
ConstantValue::ConstantValue(std::shared_ptr<Type> ty, std::string name)
: Value(std::move(ty), std::move(name)) {}
ConstantInt::ConstantInt(std::shared_ptr<Type> ty, int v)
: ConstantValue(std::move(ty), ""), value_(v) {}
ConstantFloat::ConstantFloat(std::shared_ptr<Type> ty, float v)
: ConstantValue(std::move(ty), ""), value_(v) {}
ConstantArray::ConstantArray(std::shared_ptr<Type> ty, std::vector<ConstantValue*> elements)
: ConstantValue(std::move(ty), ""), elements_(std::move(elements)) {}
ConstantZero::ConstantZero(std::shared_ptr<Type> ty)
: ConstantValue(std::move(ty), "") {}
GlobalVariable::GlobalVariable(std::string name, std::shared_ptr<Type> type, ConstantValue* init)
: GlobalValue(std::move(type), std::move(name)), init_(init) {}
} // namespace ir

@ -7,29 +7,44 @@
#include "utils/Log.h"
namespace {
std::string GetLValueName(SysYParser::LValueContext& lvalue) {
if (!lvalue.ID()) {
throw std::runtime_error(FormatError("irgen", "非法左值"));
ir::ConstantValue* BuildConstantArray(ir::Context& ctx, std::shared_ptr<ir::Type> type,
const std::vector<ir::ConstantValue*>& flattened,
size_t& pos) {
if (!type->IsArray()) {
return flattened[pos++];
}
std::vector<ir::ConstantValue*> elements;
for (int i = 0; i < type->GetNumElements(); ++i) {
elements.push_back(BuildConstantArray(ctx, type->GetElementType(), flattened, pos));
}
return lvalue.ID()->getText();
return ctx.GetConstArray(type, elements);
}
}
} // namespace
std::any IRGenImpl::visitBlockStmt(SysYParser::BlockStmtContext* ctx) {
std::any IRGenImpl::visitBlock(SysYParser::BlockContext* ctx) {
if (!ctx) {
throw std::runtime_error(FormatError("irgen", "缺少语句块"));
}
// Push a local scope
storage_map_stack_.push_back({});
const_values_stack_.push_back({});
bool terminated = false;
for (auto* item : ctx->blockItem()) {
if (item) {
if (VisitBlockItemResult(*item) == BlockFlow::Terminated) {
// The current grammar requires return to be the last statement in a block; once it is hit, generation can stop.
terminated = true;
break;
}
}
}
return {};
// Pop the local scope
storage_map_stack_.pop_back();
const_values_stack_.pop_back();
return terminated ? BlockFlow::Terminated : BlockFlow::Continue;
}
IRGenImpl::BlockFlow IRGenImpl::VisitBlockItemResult(
@ -51,27 +66,206 @@ std::any IRGenImpl::visitBlockItem(SysYParser::BlockItemContext* ctx) {
throw std::runtime_error(FormatError("irgen", "暂不支持的语句或声明"));
}
std::any IRGenImpl::visitConstDecl(SysYParser::ConstDeclContext* ctx) {
if (!ctx) return BlockFlow::Continue;
if (!ctx->bType() || (!ctx->bType()->INT() && !ctx->bType()->FLOAT())) {
throw std::runtime_error(FormatError("irgen", "当前仅支持 int/float 常量声明"));
}
for (auto* def : ctx->constDef()) {
if (def) def->accept(this);
}
return BlockFlow::Continue;
}
void IRGenImpl::FlattenInitVal(SysYParser::InitValContext* ctx,
const std::vector<int>& dims,
const std::vector<int>& sub_sizes,
int dim_idx,
size_t& current_pos,
std::vector<ir::Value*>& results,
bool is_float) {
if (ctx->exp()) {
ir::Value* val = EvalExpr(*ctx->exp());
// Implicit conversion
if (is_float && !val->GetType()->IsFloat()) {
val = builder_.CreateSIToFP(val, module_.GetContext().NextTemp());
} else if (!is_float && val->GetType()->IsFloat()) {
val = builder_.CreateFPToSI(val, module_.GetContext().NextTemp());
}
results[current_pos++] = val;
} else {
// Nested { ... }
size_t start_pos = current_pos;
for (auto* item : ctx->initVal()) {
FlattenInitVal(item, dims, sub_sizes, dim_idx + 1, current_pos, results, is_float);
}
// Fill remaining with 0
size_t end_pos = start_pos + sub_sizes[dim_idx];
while (current_pos < end_pos) {
results[current_pos++] = is_float ? (ir::Value*)module_.GetContext().GetConstFloat(0.0f)
: (ir::Value*)module_.GetContext().GetConstInt(0);
}
}
}
void IRGenImpl::FlattenConstInitVal(SysYParser::ConstInitValContext* ctx,
const std::vector<int>& dims,
const std::vector<int>& sub_sizes,
int dim_idx,
size_t& current_pos,
std::vector<ir::ConstantValue*>& results,
bool is_float) {
if (ctx->constExp()) {
ir::Value* val = std::any_cast<ir::Value*>(ctx->constExp()->accept(this));
ir::ConstantValue* cval = dynamic_cast<ir::ConstantValue*>(val);
if (!cval) throw std::runtime_error("Not a constant expression");
// Constant conversion
if (is_float && dynamic_cast<ir::ConstantInt*>(cval)) {
cval = module_.GetContext().GetConstFloat((float)static_cast<ir::ConstantInt*>(cval)->GetValue());
} else if (!is_float && dynamic_cast<ir::ConstantFloat*>(cval)) {
cval = module_.GetContext().GetConstInt((int)static_cast<ir::ConstantFloat*>(cval)->GetValue());
}
results[current_pos++] = cval;
} else {
size_t start_pos = current_pos;
for (auto* item : ctx->constInitVal()) {
FlattenConstInitVal(item, dims, sub_sizes, dim_idx + 1, current_pos, results, is_float);
}
// Fill remaining with 0
size_t end_pos = start_pos + sub_sizes[dim_idx];
while (current_pos < end_pos) {
results[current_pos++] = is_float ? (ir::ConstantValue*)module_.GetContext().GetConstFloat(0.0f)
: (ir::ConstantValue*)module_.GetContext().GetConstInt(0);
}
}
}
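`FlattenInitVal` and `FlattenConstInitVal` share one recursion: a leaf writes the next slot, and a nested `{ ... }` recurses one level deeper, then zero-fills up to the end of the aggregate it started. The sketch below reproduces that shape on plain integers. `Init`, `Leaf`, `List`, `Flatten` and `FlattenAll` are hypothetical names, and the sketch assumes a fully braced initializer, where each brace pair corresponds to exactly one nesting depth. Partially braced forms that need alignment to a sub-aggregate boundary are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical initializer tree: a leaf holds a value, otherwise `items`
// holds the entries of a nested `{ ... }`.
struct Init {
  bool is_leaf;
  int value;
  std::vector<Init> items;
};

Init Leaf(int v) { return Init{true, v, {}}; }
Init List(std::vector<Init> items) { return Init{false, 0, std::move(items)}; }

// Writes `init` into `out` starting at `pos`. `sub_sizes[d]` is the number of
// scalars covered by one aggregate at nesting depth d.
void Flatten(const Init& init, const std::vector<int>& sub_sizes,
             std::size_t depth, std::size_t& pos, std::vector<int>& out) {
  if (init.is_leaf) {
    assert(pos < out.size());
    out[pos++] = init.value;
    return;
  }
  assert(depth < sub_sizes.size());
  const std::size_t end = pos + static_cast<std::size_t>(sub_sizes[depth]);
  for (const Init& item : init.items) {
    Flatten(item, sub_sizes, depth + 1, pos, out);
  }
  while (pos < end) out[pos++] = 0;  // zero-fill the rest of this aggregate
}

std::vector<int> FlattenAll(const Init& init, const std::vector<int>& sub_sizes) {
  std::vector<int> out(static_cast<std::size_t>(sub_sizes.at(0)), 0);
  std::size_t pos = 0;
  Flatten(init, sub_sizes, 0, pos, out);
  return out;
}
```

For `int a[2][3] = {{1}, {2, 3}}` the sub-sizes are `{6, 3, 1}`, and the first row is padded with two zeros before the second row starts.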
std::any IRGenImpl::visitConstDef(SysYParser::ConstDefContext* ctx) {
if (!ctx || !ctx->ID()) {
throw std::runtime_error(FormatError("irgen", "常量定义缺少名称"));
}
std::string var_name = ctx->ID()->getText();
// Get dimensions
std::vector<int> dims;
for (auto* idx : ctx->constIndex()) {
dims.push_back(EvaluateConstInt(idx->constExp()));
}
bool is_float = false;
auto* parent_decl = dynamic_cast<SysYParser::ConstDeclContext*>(ctx->parent);
if (parent_decl && parent_decl->bType() && parent_decl->bType()->FLOAT()) {
is_float = true;
}
auto base_ty = is_float ? ir::Type::GetFloatType() : ir::Type::GetInt32Type();
std::shared_ptr<ir::Type> var_ty = base_ty;
for (auto it = dims.rbegin(); it != dims.rend(); ++it) {
var_ty = ir::Type::GetArrayType(var_ty, *it);
}
std::vector<int> sub_sizes(dims.size() + 1);
sub_sizes[dims.size()] = 1;
for (int i = (int)dims.size() - 1; i >= 0; --i) {
sub_sizes[i] = sub_sizes[i+1] * dims[i];
}
ir::ConstantValue* init_const = nullptr;
std::vector<ir::ConstantValue*> flattened;
if (dims.empty()) {
if (auto* init_val = ctx->constInitVal()) {
if (init_val->constExp()) {
ir::Value* val = std::any_cast<ir::Value*>(init_val->constExp()->accept(this));
init_const = dynamic_cast<ir::ConstantValue*>(val);
// Constant conversion
if (is_float && dynamic_cast<ir::ConstantInt*>(init_const)) {
init_const = module_.GetContext().GetConstFloat((float)static_cast<ir::ConstantInt*>(init_const)->GetValue());
} else if (!is_float && dynamic_cast<ir::ConstantFloat*>(init_const)) {
init_const = module_.GetContext().GetConstInt((int)static_cast<ir::ConstantFloat*>(init_const)->GetValue());
}
}
}
} else {
flattened.resize(sub_sizes[0]);
if (auto* init_val = ctx->constInitVal()) {
size_t pos = 0;
FlattenConstInitVal(init_val, dims, sub_sizes, 0, pos, flattened, is_float);
} else {
auto zero = is_float ? (ir::ConstantValue*)module_.GetContext().GetConstFloat(0.0f) : (ir::ConstantValue*)module_.GetContext().GetConstInt(0);
for (auto& v : flattened) v = zero;
}
size_t pos = 0;
init_const = BuildConstantArray(module_.GetContext(), var_ty, flattened, pos);
}
// Record the constant value for direct use later (only for scalars for now)
if (dims.empty() && !const_values_stack_.empty()) {
const_values_stack_.back()[var_name] = init_const;
}
if (func_ == nullptr) {
auto gv_ptr_ty = ir::Type::GetPointerType(var_ty);
auto* gv = module_.CreateGlobalVariable(var_name, gv_ptr_ty, init_const);
if (!storage_map_stack_.empty()) {
storage_map_stack_.back()[var_name] = gv;
}
} else {
// Local scope: make sure the alloca is placed in the entry block
auto* current_bb = builder_.GetInsertBlock();
builder_.SetInsertPoint(func_->GetEntry());
ir::Value* slot = builder_.CreateAlloca(var_ty, module_.GetContext().NextTemp());
builder_.SetInsertPoint(current_bb);
if (!storage_map_stack_.empty()) {
storage_map_stack_.back()[var_name] = slot;
}
if (dims.empty()) {
if (init_const) builder_.CreateStore(init_const, slot);
} else {
for (size_t i = 0; i < flattened.size(); ++i) {
std::vector<ir::Value*> indices;
indices.push_back(builder_.CreateConstInt(0));
size_t temp = i;
for (size_t d = 0; d < dims.size(); ++d) {
indices.push_back(builder_.CreateConstInt(temp / sub_sizes[d+1]));
temp %= sub_sizes[d+1];
}
ir::Value* ptr = builder_.CreateGEP(ir::Type::GetPointerType(base_ty), slot, indices, module_.GetContext().NextTemp());
builder_.CreateStore(flattened[i], ptr);
}
}
}
return BlockFlow::Continue;
}
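// IR generation for variable declarations is also a minimal implementation for now:
// - first check the declared base type; only local int is supported at present
// - first check the declared base type; int and float are supported
// - then hand the variable definitions in the Decl to visitVarDef for further processing.
//
// Compared with a more complete version, this does not yet cover:
// - in-order handling of multiple variable definitions in one Decl;
// - other declaration forms such as const, arrays, and global variables
// - a richer type system.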
std::any IRGenImpl::visitDecl(SysYParser::DeclContext* ctx) {
if (!ctx) {
throw std::runtime_error(FormatError("irgen", "缺少变量声明"));
}
if (!ctx->btype() || !ctx->btype()->INT()) {
throw std::runtime_error(FormatError("irgen", "当前仅支持局部 int 变量声明"));
}
auto* var_def = ctx->varDef();
if (!var_def) {
throw std::runtime_error(FormatError("irgen", "非法变量声明"));
// In the current grammar, decl is either a constDecl or a varDecl
if (auto* var_decl = ctx->varDecl()) {
if (!var_decl->bType() || (!var_decl->bType()->INT() && !var_decl->bType()->FLOAT())) {
throw std::runtime_error(FormatError("irgen", "当前仅支持 int/float 变量声明"));
}
for (auto* var_def : var_decl->varDef()) {
if (var_def) {
var_def->accept(this);
}
}
} else if (auto* const_decl = ctx->constDecl()) {
return const_decl->accept(this);
} else {
throw std::runtime_error(FormatError("irgen", "当前仅支持变量声明"));
}
var_def->accept(this);
return {};
return BlockFlow::Continue;
}
@ -80,28 +274,145 @@ std::any IRGenImpl::visitDecl(SysYParser::DeclContext* ctx) {
// - scalar initialization;
// - one VarDef maps to one storage slot.
std::any IRGenImpl::visitVarDef(SysYParser::VarDefContext* ctx) {
if (!ctx) {
throw std::runtime_error(FormatError("irgen", "缺少变量定义"));
}
if (!ctx->lValue()) {
if (!ctx || !ctx->ID()) {
throw std::runtime_error(FormatError("irgen", "变量声明缺少名称"));
}
GetLValueName(*ctx->lValue());
if (storage_map_.find(ctx) != storage_map_.end()) {
std::string var_name = ctx->ID()->getText();
if (!storage_map_stack_.empty() && storage_map_stack_.back().find(var_name) != storage_map_stack_.back().end()) {
throw std::runtime_error(FormatError("irgen", "声明重复生成存储槽位"));
}
auto* slot = builder_.CreateAllocaI32(module_.GetContext().NextTemp());
storage_map_[ctx] = slot;
ir::Value* init = nullptr;
if (auto* init_value = ctx->initValue()) {
if (!init_value->exp()) {
throw std::runtime_error(FormatError("irgen", "当前不支持聚合初始化"));
// Get dimensions
std::vector<int> dims;
for (auto* idx : ctx->constIndex()) {
dims.push_back(EvaluateConstInt(idx->constExp()));
}
// Determine base type
bool is_float = false;
auto* parent_decl = dynamic_cast<SysYParser::VarDeclContext*>(ctx->parent);
if (parent_decl && parent_decl->bType() && parent_decl->bType()->FLOAT()) {
is_float = true;
}
auto base_ty = is_float ? ir::Type::GetFloatType() : ir::Type::GetInt32Type();
std::shared_ptr<ir::Type> var_ty = base_ty;
for (auto it = dims.rbegin(); it != dims.rend(); ++it) {
var_ty = ir::Type::GetArrayType(var_ty, *it);
}
std::vector<int> sub_sizes(dims.size() + 1);
sub_sizes[dims.size()] = 1;
for (int i = (int)dims.size() - 1; i >= 0; --i) {
sub_sizes[i] = sub_sizes[i+1] * dims[i];
}
if (func_ == nullptr) {
// Global scope
ir::ConstantValue* init_const = nullptr;
if (dims.empty()) {
if (auto* init_val = ctx->initVal()) {
if (init_val->exp()) {
auto* val = EvalExpr(*init_val->exp());
init_const = dynamic_cast<ir::ConstantValue*>(val);
// Constant conversion
if (is_float && dynamic_cast<ir::ConstantInt*>(init_const)) {
init_const = module_.GetContext().GetConstFloat((float)static_cast<ir::ConstantInt*>(init_const)->GetValue());
} else if (!is_float && dynamic_cast<ir::ConstantFloat*>(init_const)) {
init_const = module_.GetContext().GetConstInt((int)static_cast<ir::ConstantFloat*>(init_const)->GetValue());
}
}
} else {
init_const = is_float ? (ir::ConstantValue*)module_.GetContext().GetConstFloat(0.0f) : (ir::ConstantValue*)module_.GetContext().GetConstInt(0);
}
} else {
if (auto* init_val = ctx->initVal()) {
std::vector<ir::ConstantValue*> flattened(sub_sizes[0]);
// VarDef's InitVal can be an expression or { ... }
if (init_val->exp()) {
auto* val = EvalExpr(*init_val->exp());
auto* cval = dynamic_cast<ir::ConstantValue*>(val);
flattened[0] = cval;
auto zero = is_float ? (ir::ConstantValue*)module_.GetContext().GetConstFloat(0.0f) : (ir::ConstantValue*)module_.GetContext().GetConstInt(0);
for (size_t i = 1; i < flattened.size(); ++i) {
flattened[i] = zero;
}
size_t bpos = 0;
init_const = BuildConstantArray(module_.GetContext(), var_ty, flattened, bpos);
} else {
size_t fpos = 0;
std::vector<ir::Value*> flat_vals(sub_sizes[0]);
auto zero = is_float ? (ir::Value*)module_.GetContext().GetConstFloat(0.0f) : (ir::Value*)module_.GetContext().GetConstInt(0);
for (auto& v : flat_vals) v = zero;
FlattenInitVal(init_val, dims, sub_sizes, 0, fpos, flat_vals, is_float);
for (size_t i = 0; i < flat_vals.size(); ++i) {
flattened[i] = dynamic_cast<ir::ConstantValue*>(flat_vals[i]);
}
size_t bpos = 0;
init_const = BuildConstantArray(module_.GetContext(), var_ty, flattened, bpos);
}
} else {
init_const = module_.GetContext().GetConstZero(var_ty);
}
}
auto gv_ptr_ty = ir::Type::GetPointerType(var_ty);
auto* gv = module_.CreateGlobalVariable(var_name, gv_ptr_ty, init_const);
if (!storage_map_stack_.empty()) {
storage_map_stack_.back()[var_name] = gv;
}
init = EvalExpr(*init_value->exp());
} else {
init = builder_.CreateConstInt(0);
// Local scope: make sure the alloca is placed in the entry block
auto* current_bb = builder_.GetInsertBlock();
builder_.SetInsertPoint(func_->GetEntry());
ir::Value* slot = builder_.CreateAlloca(var_ty, module_.GetContext().NextTemp());
builder_.SetInsertPoint(current_bb);
if (!storage_map_stack_.empty()) {
storage_map_stack_.back()[var_name] = slot;
}
if (auto* init_val = ctx->initVal()) {
if (dims.empty()) {
if (init_val->exp()) {
ir::Value* init = EvalExpr(*init_val->exp());
if (is_float && !init->GetType()->IsFloat()) {
init = builder_.CreateSIToFP(init, module_.GetContext().NextTemp());
} else if (!is_float && init->GetType()->IsFloat()) {
init = builder_.CreateFPToSI(init, module_.GetContext().NextTemp());
}
builder_.CreateStore(init, slot);
}
} else {
std::vector<ir::Value*> flattened(sub_sizes[0]);
auto zero = is_float ? (ir::Value*)module_.GetContext().GetConstFloat(0.0f) : (ir::Value*)module_.GetContext().GetConstInt(0);
for (auto& v : flattened) v = zero;
size_t pos = 0;
FlattenInitVal(init_val, dims, sub_sizes, 0, pos, flattened, is_float);
for (size_t i = 0; i < flattened.size(); ++i) {
// Optimization: only store non-zero?
// For now, store all to be safe.
std::vector<ir::Value*> indices;
indices.push_back(builder_.CreateConstInt(0));
size_t temp = i;
for (size_t d = 0; d < dims.size(); ++d) {
indices.push_back(builder_.CreateConstInt(temp / sub_sizes[d+1]));
temp %= sub_sizes[d+1];
}
ir::Value* ptr = builder_.CreateGEP(ir::Type::GetPointerType(base_ty), slot, indices, module_.GetContext().NextTemp());
builder_.CreateStore(flattened[i], ptr);
}
}
} else {
// Initialize scalar locals to 0
if (dims.empty()) {
ir::Value* zero = is_float ? (ir::Value*)module_.GetContext().GetConstFloat(0.0f) : (ir::Value*)module_.GetContext().GetConstInt(0);
builder_.CreateStore(zero, slot);
}
}
}
builder_.CreateStore(init, slot);
return {};
return BlockFlow::Continue;
}
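Both `visitConstDef` and `visitVarDef` compute a `sub_sizes` table and then turn each flat position into one index per dimension before emitting a GEP with a leading `0` index. The arithmetic can be checked in isolation. The sketch below uses hypothetical free functions `ComputeSubSizes` and `Unflatten` that mirror the loops in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// sub_sizes[d] = number of scalars in one element at depth d;
// sub_sizes[0] is the whole array and sub_sizes[dims.size()] is 1.
std::vector<int> ComputeSubSizes(const std::vector<int>& dims) {
  std::vector<int> sub_sizes(dims.size() + 1);
  sub_sizes[dims.size()] = 1;
  for (std::size_t i = dims.size(); i-- > 0;) {
    sub_sizes[i] = sub_sizes[i + 1] * dims[i];
  }
  return sub_sizes;
}

// Converts a flat row-major position into one index per dimension.
std::vector<int> Unflatten(std::size_t flat, const std::vector<int>& dims) {
  const std::vector<int> sub_sizes = ComputeSubSizes(dims);
  assert(flat < static_cast<std::size_t>(sub_sizes[0]));
  std::vector<int> indices;
  indices.reserve(dims.size());
  for (std::size_t d = 0; d < dims.size(); ++d) {
    const std::size_t stride = static_cast<std::size_t>(sub_sizes[d + 1]);
    indices.push_back(static_cast<int>(flat / stride));
    flat %= stride;
  }
  return indices;
}
```

For `dims = {2, 3, 4}` the table is `{24, 12, 4, 1}`, and flat position 17 maps to indices `{1, 1, 1}` because 17 = 1*12 + 1*4 + 1.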

@ -24,21 +24,75 @@ ir::Value* IRGenImpl::EvalExpr(SysYParser::ExpContext& expr) {
return std::any_cast<ir::Value*>(expr.accept(this));
}
ir::ConstantValue* IRGenImpl::EvaluateConst(antlr4::tree::ParseTree* tree) {
auto val = std::any_cast<ir::Value*>(tree->accept(this));
auto* cval = dynamic_cast<ir::ConstantValue*>(val);
if (!cval) throw std::runtime_error("Not a constant expression");
return cval;
}
int IRGenImpl::EvaluateConstInt(SysYParser::ConstExpContext* ctx) {
if (!ctx) return 0;
auto* val = EvaluateConst(ctx->addExp());
if (auto* ci = dynamic_cast<ir::ConstantInt*>(val)) return ci->GetValue();
if (auto* cf = dynamic_cast<ir::ConstantFloat*>(val)) return (int)cf->GetValue();
return 0;
}
int IRGenImpl::EvaluateConstInt(SysYParser::ExpContext* ctx) {
if (!ctx) return 0;
auto* val = EvaluateConst(ctx);
if (auto* ci = dynamic_cast<ir::ConstantInt*>(val)) return ci->GetValue();
if (auto* cf = dynamic_cast<ir::ConstantFloat*>(val)) return (int)cf->GetValue();
return 0;
}
std::any IRGenImpl::visitParenExp(SysYParser::ParenExpContext* ctx) {
if (!ctx || !ctx->exp()) {
throw std::runtime_error(FormatError("irgen", "非法括号表达式"));
std::shared_ptr<ir::Type> IRGenImpl::GetGEPResultType(ir::Value* ptr, const std::vector<ir::Value*>& indices) {
auto cur_ty = ptr->GetType()->GetPointedType();
for (size_t i = 1; i < indices.size(); ++i) {
if (cur_ty->IsArray()) {
cur_ty = cur_ty->GetElementType();
}
}
return EvalExpr(*ctx->exp());
return ir::Type::GetPointerType(cur_ty);
}
std::any IRGenImpl::visitPrimaryExp(SysYParser::PrimaryExpContext* ctx) {
if (!ctx) {
throw std::runtime_error(FormatError("irgen", "非法基本表达式"));
}
// Handle the parenthesized form: LPAREN exp RPAREN
if (ctx->exp()) {
return EvalExpr(*ctx->exp());
}
// Handle lVal (variable use)
if (ctx->lVal()) {
return ctx->lVal()->accept(this);
}
// Handle number
if (ctx->number()) {
return ctx->number()->accept(this);
}
throw std::runtime_error(FormatError("irgen", "不支持的基本表达式类型"));
}
std::any IRGenImpl::visitNumberExp(SysYParser::NumberExpContext* ctx) {
if (!ctx || !ctx->number() || !ctx->number()->ILITERAL()) {
throw std::runtime_error(FormatError("irgen", "当前仅支持整数字面量"));
std::any IRGenImpl::visitNumber(SysYParser::NumberContext* ctx) {
if (!ctx) {
throw std::runtime_error(FormatError("irgen", "缺少字面量节点"));
}
return static_cast<ir::Value*>(
builder_.CreateConstInt(std::stoi(ctx->number()->getText())));
if (ctx->intConst()) {
// The literal may be hexadecimal (0x / 0X prefix) or octal (leading 0). Plain std::stoi(str) parses decimal only,
// so std::stoi(str, nullptr, 0) is used to auto-detect base 16 / base 8.
std::string text = ctx->intConst()->getText();
return static_cast<ir::Value*>(
builder_.CreateConstInt(std::stoi(text, nullptr, 0)));
} else if (ctx->floatConst()) {
std::string text = ctx->floatConst()->getText();
return static_cast<ir::Value*>(
module_.GetContext().GetConstFloat(std::stof(text)));
}
throw std::runtime_error(FormatError("irgen", "不支持的字面量"));
}
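
The base-0 parsing used in `visitNumber` can be tried in isolation. The sketch below is a hypothetical wrapper (`ParseIntLiteral` is not a name from this diff) that uses `std::stoll` instead of `std::stoi`. This is a design assumption rather than what the commit does: `std::stoi` throws `std::out_of_range` for a literal such as `2147483648`, which a program may write as the operand of unary minus.

```cpp
#include <string>

// Hypothetical helper, not part of the diff.
// Base 0 lets the standard library pick the radix from the prefix:
// "0x"/"0X" means hexadecimal, a leading "0" means octal, otherwise decimal.
int ParseIntLiteral(const std::string& text) {
  long long wide = std::stoll(text, nullptr, 0);
  // Assumption: narrowing to int is acceptable here. For values outside the
  // int range the result is implementation-defined before C++20.
  return static_cast<int>(wide);
}
```

Whether wrapping on overflow is the right policy depends on how the front end treats out-of-range literals; the sketch only shows the radix detection.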
// How a variable use is processed:
@ -47,34 +101,482 @@ std::any IRGenImpl::visitNumberExp(SysYParser::NumberExpContext* ctx) {
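// 3. Finally, emit a load to read the value back from memory.
//
// IRGen therefore no longer performs name lookup itself; it consumes the bindings produced by Sema directly.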
std::any IRGenImpl::visitVarExp(SysYParser::VarExpContext* ctx) {
if (!ctx || !ctx->var() || !ctx->var()->ID()) {
throw std::runtime_error(FormatError("irgen", "当前仅支持普通整型变量"));
std::any IRGenImpl::visitLVal(SysYParser::LValContext* ctx) {
if (!ctx || !ctx->ID()) {
throw std::runtime_error(FormatError("irgen", "非法左值"));
}
auto* decl = sema_.ResolveVarUse(ctx->var());
if (!decl) {
std::string var_name = ctx->ID()->getText();
// First check whether the name is a recorded constant
ir::ConstantValue* const_val = FindConst(var_name);
if (const_val && ctx->exp().empty()) {
return static_cast<ir::Value*>(const_val);
}
const auto* binding = sema_.ResolveObjectUse(ctx);
if (!binding) {
throw std::runtime_error(
FormatError("irgen",
"变量使用缺少语义绑定: " + ctx->var()->ID()->getText()));
FormatError("irgen", "变量使用缺少语义绑定:" + var_name));
}
auto it = storage_map_.find(decl);
if (it == storage_map_.end()) {
ir::Value* slot = FindStorage(var_name);
if (!slot) {
throw std::runtime_error(
FormatError("irgen",
"变量声明缺少存储槽位: " + ctx->var()->ID()->getText()));
FormatError("irgen", "变量声明缺少存储槽位:" + var_name));
}
ir::Value* ptr = slot;
auto ptr_ty = ptr->GetType();
bool is_param = false;
// If it's a pointer to a pointer (function parameter case), load the pointer value first
if (ptr_ty->IsPointer() && ptr_ty->GetPointedType()->IsPointer()) {
ptr = builder_.CreateLoad(ptr, module_.GetContext().NextTemp());
is_param = true;
} else if (ptr->IsArgument()) {
is_param = true;
}
// Determine if the result of this LVal is a scalar or an array
bool result_is_scalar = (ctx->exp().size() == binding->dimensions.size());
if (!ctx->exp().empty()) {
std::vector<ir::Value*> indices;
// If it's a local array, we need leading 0
if (ptr->GetType()->IsPointer() && ptr->GetType()->GetPointedType()->IsArray()) {
if (!is_param) {
indices.push_back(builder_.CreateConstInt(0));
}
}
for (auto* exp_ctx : ctx->exp()) {
indices.push_back(EvalExpr(*exp_ctx));
}
auto res_ptr_ty = GetGEPResultType(ptr, indices);
ptr = builder_.CreateGEP(res_ptr_ty, ptr, indices, module_.GetContext().NextTemp());
}
if (result_is_scalar) {
return static_cast<ir::Value*>(builder_.CreateLoad(ptr, module_.GetContext().NextTemp()));
} else {
// Decay ptr to the first element of the sub-array
while (ptr->GetType()->GetPointedType()->IsArray()) {
std::vector<ir::Value*> d_indices;
d_indices.push_back(builder_.CreateConstInt(0));
d_indices.push_back(builder_.CreateConstInt(0));
auto d_res_ty = GetGEPResultType(ptr, d_indices);
ptr = builder_.CreateGEP(d_res_ty, ptr, d_indices, module_.GetContext().NextTemp());
}
return ptr;
}
}
std::any IRGenImpl::visitAddExp(SysYParser::AddExpContext* ctx) {
if (!ctx) {
throw std::runtime_error(FormatError("irgen", "非法加减法表达式"));
}
// If this is a bare mulExp, return it directly (addExp : mulExp)
if (ctx->mulExp() && ctx->addExp() == nullptr) {
return ctx->mulExp()->accept(this);
}
// Handle the recursive form: addExp op mulExp
if (!ctx->addExp() || !ctx->mulExp()) {
throw std::runtime_error(FormatError("irgen", "非法加减法表达式结构"));
}
ir::Value* lhs = std::any_cast<ir::Value*>(ctx->addExp()->accept(this));
ir::Value* rhs = std::any_cast<ir::Value*>(ctx->mulExp()->accept(this));
if (lhs->IsConstant() && rhs->IsConstant()) {
auto* cl = static_cast<ir::ConstantValue*>(lhs);
auto* cr = static_cast<ir::ConstantValue*>(rhs);
if (auto* cil = dynamic_cast<ir::ConstantInt*>(cl)) {
if (auto* cir = dynamic_cast<ir::ConstantInt*>(cr)) {
if (ctx->ADD()) return static_cast<ir::Value*>(module_.GetContext().GetConstInt(cil->GetValue() + cir->GetValue()));
if (ctx->SUB()) return static_cast<ir::Value*>(module_.GetContext().GetConstInt(cil->GetValue() - cir->GetValue()));
}
}
}
// Implicit conversion
if (lhs->GetType()->IsFloat() && !rhs->GetType()->IsFloat()) {
if (rhs->IsConstant()) {
rhs = module_.GetContext().GetConstFloat((float)static_cast<ir::ConstantInt*>(rhs)->GetValue());
} else {
rhs = builder_.CreateSIToFP(rhs, module_.GetContext().NextTemp());
}
} else if (!lhs->GetType()->IsFloat() && rhs->GetType()->IsFloat()) {
if (lhs->IsConstant()) {
lhs = module_.GetContext().GetConstFloat((float)static_cast<ir::ConstantInt*>(lhs)->GetValue());
} else {
lhs = builder_.CreateSIToFP(lhs, module_.GetContext().NextTemp());
}
}
if (lhs->IsConstant() && rhs->IsConstant()) {
auto* cl = static_cast<ir::ConstantValue*>(lhs);
auto* cr = static_cast<ir::ConstantValue*>(rhs);
if (auto* cfl = dynamic_cast<ir::ConstantFloat*>(cl)) {
if (auto* cfr = dynamic_cast<ir::ConstantFloat*>(cr)) {
if (ctx->ADD()) return static_cast<ir::Value*>(module_.GetContext().GetConstFloat(cfl->GetValue() + cfr->GetValue()));
if (ctx->SUB()) return static_cast<ir::Value*>(module_.GetContext().GetConstFloat(cfl->GetValue() - cfr->GetValue()));
}
}
}
ir::Opcode op = ir::Opcode::Add;
if (ctx->ADD()) {
op = ir::Opcode::Add;
} else if (ctx->SUB()) {
op = ir::Opcode::Sub;
} else {
throw std::runtime_error(FormatError("irgen", "未知的加减运算符"));
}
return static_cast<ir::Value*>(
builder_.CreateLoad(it->second, module_.GetContext().NextTemp()));
builder_.CreateBinary(op, lhs, rhs, module_.GetContext().NextTemp()));
}
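
`visitAddExp` folds two constants at compile time and promotes an int operand to float when the other side is a float. A minimal sketch of that rule, outside the IR classes, is given below. `Const`, `AsFloat`, and `FoldAddSub` are hypothetical stand-ins for `ir::ConstantInt` / `ir::ConstantFloat` and the folding branches; they are assumptions made for illustration only.

```cpp
#include <variant>

// Stand-in for a folded constant: either an int or a float.
using Const = std::variant<int, float>;

// Promote to float, mirroring the "(float)ConstantInt->GetValue()" branch.
float AsFloat(const Const& c) {
  if (const int* i = std::get_if<int>(&c)) return static_cast<float>(*i);
  return std::get<float>(c);
}

// int op int stays int; any float operand makes the whole result float.
Const FoldAddSub(const Const& lhs, const Const& rhs, bool is_sub) {
  const int* a = std::get_if<int>(&lhs);
  const int* b = std::get_if<int>(&rhs);
  if (a && b) {
    return Const(is_sub ? *a - *b : *a + *b);
  }
  float fa = AsFloat(lhs);
  float fb = AsFloat(rhs);
  return Const(is_sub ? fa - fb : fa + fb);
}
```

The sketch does not guard against signed integer overflow, which is undefined behavior in C++; a production folder would need a policy for that case.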
std::any IRGenImpl::visitUnaryExp(SysYParser::UnaryExpContext* ctx) {
if (!ctx) {
throw std::runtime_error(FormatError("irgen", "非法一元表达式"));
}
// If this is a bare primaryExp, return it directly (unaryExp : primaryExp)
if (ctx->primaryExp()) {
return ctx->primaryExp()->accept(this);
}
// Handle function calls: unaryExp : ID LPAREN funcRParams? RPAREN
if (ctx->ID()) {
std::string func_name = ctx->ID()->getText();
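// Look the function up through Sema or the Module.
// Simplified for now: search the Module directly (if the function is defined in the current file),
// or rely on the resolution result provided by Sema.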
const FunctionBinding* func_binding = sema_.ResolveFunctionCall(ctx);
if (!func_binding) {
throw std::runtime_error(FormatError("irgen", "未找到函数声明:" + func_name));
}
// Assume func_binding can be mapped to the corresponding ir::Function*.
// If Sema does not expose the ir::Function directly, iterate over module_.GetFunctions() to find it.
ir::Function* target_func = nullptr;
for (const auto& f : module_.GetFunctions()) {
if (f->GetName() == func_name) {
target_func = f.get();
break;
}
}
if (!target_func) {
// It may be an external function such as putint or getint.
// If it is not in module_, create a declaration-only Function.
std::shared_ptr<ir::Type> ret_ty;
if (func_binding->return_type == SemanticType::Int) {
ret_ty = ir::Type::GetInt32Type();
} else if (func_binding->return_type == SemanticType::Float) {
ret_ty = ir::Type::GetFloatType();
} else {
ret_ty = ir::Type::GetVoidType();
}
target_func = module_.CreateFunction(func_name, ret_ty);
// External functions still receive arguments, so target_func may also need an AddArgument call per parameter.
for (const auto& param : func_binding->params) {
std::shared_ptr<ir::Type> p_ty;
if (param.type == SemanticType::Int) {
p_ty = param.dimensions.empty() && !param.is_array_param ? ir::Type::GetInt32Type() : ir::Type::GetPtrInt32Type();
} else {
p_ty = param.dimensions.empty() && !param.is_array_param ? ir::Type::GetFloatType() : ir::Type::GetPtrFloatType();
}
target_func->AddArgument(p_ty, param.name);
}
}
std::vector<ir::Value*> args;
if (ctx->funcRParams()) {
args = std::any_cast<std::vector<ir::Value*>>(ctx->funcRParams()->accept(this));
}
// Implicit conversion for function arguments
const auto& formal_args = target_func->GetArgs();
for (size_t i = 0; i < std::min(args.size(), formal_args.size()); ++i) {
if (formal_args[i]->GetType()->IsFloat() && !args[i]->GetType()->IsFloat()) {
args[i] = builder_.CreateSIToFP(args[i], module_.GetContext().NextTemp());
} else if (formal_args[i]->GetType()->IsInt32() && args[i]->GetType()->IsFloat()) {
args[i] = builder_.CreateFPToSI(args[i], module_.GetContext().NextTemp());
}
}
return static_cast<ir::Value*>(builder_.CreateCall(target_func, args, module_.GetContext().NextTemp()));
}
// Handle unary operators: unaryExp : addUnaryOp unaryExp
if (ctx->addUnaryOp() && ctx->unaryExp()) {
ir::Value* operand = std::any_cast<ir::Value*>(ctx->unaryExp()->accept(this));
// Constant folding for unary op
if (operand->IsConstant()) {
if (ctx->addUnaryOp()->SUB()) {
if (auto* ci = dynamic_cast<ir::ConstantInt*>(operand)) {
return static_cast<ir::Value*>(module_.GetContext().GetConstInt(-ci->GetValue()));
} else if (auto* cf = dynamic_cast<ir::ConstantFloat*>(operand)) {
return static_cast<ir::Value*>(module_.GetContext().GetConstFloat(-cf->GetValue()));
}
} else {
return operand;
}
}
// Determine whether the operator is plus or minus
if (ctx->addUnaryOp()->SUB()) {
// Minus: emit `sub 0, operand` for integers and `fsub 0.0, operand` for floats
if (operand->GetType()->IsFloat()) {
ir::Value* zero = module_.GetContext().GetConstFloat(0.0f);
// For now, assume CreateSub can handle floats (a dedicated fsub in the IR would be better)
return static_cast<ir::Value*>(
builder_.CreateSub(zero, operand, module_.GetContext().NextTemp()));
} else {
ir::Value* zero = builder_.CreateConstInt(0);
return static_cast<ir::Value*>(
builder_.CreateSub(zero, operand, module_.GetContext().NextTemp()));
}
} else if (ctx->addUnaryOp()->ADD()) {
// Plus: return the operand unchanged (+x is equivalent to x)
return operand;
} else {
throw std::runtime_error(FormatError("irgen", "未知的一元运算符"));
}
}
throw std::runtime_error(FormatError("irgen", "不支持的一元表达式类型"));
}
std::any IRGenImpl::visitMulExp(SysYParser::MulExpContext* ctx) {
if (!ctx) {
throw std::runtime_error(FormatError("irgen", "非法乘除法表达式"));
}
// If this is a bare unaryExp, return it directly (mulExp : unaryExp)
if (ctx->unaryExp() && ctx->mulExp() == nullptr) {
return ctx->unaryExp()->accept(this);
}
// Handle the recursive form: mulExp op unaryExp
if (!ctx->mulExp() || !ctx->unaryExp()) {
throw std::runtime_error(FormatError("irgen", "非法乘除法表达式结构"));
}
ir::Value* lhs = std::any_cast<ir::Value*>(ctx->mulExp()->accept(this));
ir::Value* rhs = std::any_cast<ir::Value*>(ctx->unaryExp()->accept(this));
// Constant folding
if (lhs->IsConstant() && rhs->IsConstant()) {
auto* cl = static_cast<ir::ConstantValue*>(lhs);
auto* cr = static_cast<ir::ConstantValue*>(rhs);
if (auto* cil = dynamic_cast<ir::ConstantInt*>(cl)) {
if (auto* cir = dynamic_cast<ir::ConstantInt*>(cr)) {
if (ctx->MUL()) return static_cast<ir::Value*>(module_.GetContext().GetConstInt(cil->GetValue() * cir->GetValue()));
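// Fold only when the divisor is non-zero; otherwise fall through and emit the instruction.
if (ctx->DIV() && cir->GetValue() != 0) return static_cast<ir::Value*>(module_.GetContext().GetConstInt(cil->GetValue() / cir->GetValue()));
if (ctx->MOD() && cir->GetValue() != 0) return static_cast<ir::Value*>(module_.GetContext().GetConstInt(cil->GetValue() % cir->GetValue()));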
}
}
}
// Implicit conversion
if (lhs->GetType()->IsFloat() && !rhs->GetType()->IsFloat()) {
if (rhs->IsConstant()) {
rhs = module_.GetContext().GetConstFloat((float)static_cast<ir::ConstantInt*>(rhs)->GetValue());
} else {
rhs = builder_.CreateSIToFP(rhs, module_.GetContext().NextTemp());
}
} else if (!lhs->GetType()->IsFloat() && rhs->GetType()->IsFloat()) {
if (lhs->IsConstant()) {
lhs = module_.GetContext().GetConstFloat((float)static_cast<ir::ConstantInt*>(lhs)->GetValue());
} else {
lhs = builder_.CreateSIToFP(lhs, module_.GetContext().NextTemp());
}
}
std::any IRGenImpl::visitAdditiveExp(SysYParser::AdditiveExpContext* ctx) {
if (!ctx || !ctx->exp(0) || !ctx->exp(1)) {
throw std::runtime_error(FormatError("irgen", "非法加法表达式"));
if (lhs->IsConstant() && rhs->IsConstant()) {
auto* cl = static_cast<ir::ConstantValue*>(lhs);
auto* cr = static_cast<ir::ConstantValue*>(rhs);
if (auto* cfl = dynamic_cast<ir::ConstantFloat*>(cl)) {
if (auto* cfr = dynamic_cast<ir::ConstantFloat*>(cr)) {
if (ctx->MUL()) return static_cast<ir::Value*>(module_.GetContext().GetConstFloat(cfl->GetValue() * cfr->GetValue()));
if (ctx->DIV()) return static_cast<ir::Value*>(module_.GetContext().GetConstFloat(cfl->GetValue() / cfr->GetValue()));
}
}
}
ir::Opcode op = ir::Opcode::Mul;
if (ctx->MUL()) {
op = ir::Opcode::Mul;
} else if (ctx->DIV()) {
op = ir::Opcode::Div;
} else if (ctx->MOD()) {
op = ir::Opcode::Mod;
} else {
throw std::runtime_error(FormatError("irgen", "未知的乘除运算符"));
}
ir::Value* lhs = EvalExpr(*ctx->exp(0));
ir::Value* rhs = EvalExpr(*ctx->exp(1));
return static_cast<ir::Value*>(
builder_.CreateBinary(ir::Opcode::Add, lhs, rhs,
module_.GetContext().NextTemp()));
builder_.CreateBinary(op, lhs, rhs, module_.GetContext().NextTemp()));
}
std::any IRGenImpl::visitRelExp(SysYParser::RelExpContext* ctx) {
if (ctx->addExp() && ctx->relExp() == nullptr) {
return ctx->addExp()->accept(this);
}
ir::Value* lhs = std::any_cast<ir::Value*>(ctx->relExp()->accept(this));
ir::Value* rhs = std::any_cast<ir::Value*>(ctx->addExp()->accept(this));
if (lhs->GetType()->IsInt1()) lhs = builder_.CreateZext(lhs, module_.GetContext().NextTemp());
if (rhs->GetType()->IsInt1()) rhs = builder_.CreateZext(rhs, module_.GetContext().NextTemp());
// Implicit conversion
if (lhs->GetType()->IsFloat() && !rhs->GetType()->IsFloat()) {
rhs = builder_.CreateSIToFP(rhs, module_.GetContext().NextTemp());
} else if (!lhs->GetType()->IsFloat() && rhs->GetType()->IsFloat()) {
lhs = builder_.CreateSIToFP(lhs, module_.GetContext().NextTemp());
}
ir::CmpOp op;
if (ctx->LT()) op = ir::CmpOp::Lt;
else if (ctx->GT()) op = ir::CmpOp::Gt;
else if (ctx->LE()) op = ir::CmpOp::Le;
else if (ctx->GE()) op = ir::CmpOp::Ge;
else throw std::runtime_error(FormatError("irgen", "未知的关系运算符"));
return static_cast<ir::Value*>(builder_.CreateCmp(op, lhs, rhs, module_.GetContext().NextTemp()));
}
std::any IRGenImpl::visitEqExp(SysYParser::EqExpContext* ctx) {
if (ctx->relExp() && ctx->eqExp() == nullptr) {
return ctx->relExp()->accept(this);
}
ir::Value* lhs = std::any_cast<ir::Value*>(ctx->eqExp()->accept(this));
ir::Value* rhs = std::any_cast<ir::Value*>(ctx->relExp()->accept(this));
if (lhs->GetType()->IsInt1()) lhs = builder_.CreateZext(lhs, module_.GetContext().NextTemp());
if (rhs->GetType()->IsInt1()) rhs = builder_.CreateZext(rhs, module_.GetContext().NextTemp());
// Implicit conversion
if (lhs->GetType()->IsFloat() && !rhs->GetType()->IsFloat()) {
rhs = builder_.CreateSIToFP(rhs, module_.GetContext().NextTemp());
} else if (!lhs->GetType()->IsFloat() && rhs->GetType()->IsFloat()) {
lhs = builder_.CreateSIToFP(lhs, module_.GetContext().NextTemp());
}
ir::CmpOp op;
if (ctx->EQ()) op = ir::CmpOp::Eq;
else if (ctx->NE()) op = ir::CmpOp::Ne;
else throw std::runtime_error(FormatError("irgen", "未知的相等运算符"));
return static_cast<ir::Value*>(builder_.CreateCmp(op, lhs, rhs, module_.GetContext().NextTemp()));
}
std::any IRGenImpl::visitCondUnaryExp(SysYParser::CondUnaryExpContext* ctx) {
if (ctx->eqExp()) {
return ctx->eqExp()->accept(this);
}
if (ctx->NOT()) {
ir::Value* operand = std::any_cast<ir::Value*>(ctx->condUnaryExp()->accept(this));
if (operand->GetType()->IsInt1()) {
operand = builder_.CreateZext(operand, module_.GetContext().NextTemp());
}
if (operand->GetType()->IsFloat()) {
ir::Value* zero = module_.GetContext().GetConstFloat(0.0f);
return static_cast<ir::Value*>(builder_.CreateCmp(ir::CmpOp::Eq, operand, zero, module_.GetContext().NextTemp()));
} else {
ir::Value* zero = builder_.CreateConstInt(0);
return static_cast<ir::Value*>(builder_.CreateCmp(ir::CmpOp::Eq, operand, zero, module_.GetContext().NextTemp()));
}
}
throw std::runtime_error(FormatError("irgen", "非法条件一元表达式"));
}
std::any IRGenImpl::visitLAndExp(SysYParser::LAndExpContext* ctx) {
if (ctx->condUnaryExp() && ctx->lAndExp() == nullptr) {
return ctx->condUnaryExp()->accept(this);
}
ir::AllocaInst* res_ptr = builder_.CreateAllocaI32(module_.GetContext().NextTemp());
ir::Value* zero = builder_.CreateConstInt(0);
builder_.CreateStore(zero, res_ptr);
ir::BasicBlock* rhs_bb = func_->CreateBlock(NextBlockName("land_rhs"));
ir::BasicBlock* end_bb = func_->CreateBlock(NextBlockName("land_end"));
ir::Value* lhs = std::any_cast<ir::Value*>(ctx->lAndExp()->accept(this));
if (lhs->GetType()->IsInt1()) {
lhs = builder_.CreateZext(lhs, module_.GetContext().NextTemp());
}
ir::Value* lhs_cond = builder_.CreateCmp(ir::CmpOp::Ne, lhs, zero, module_.GetContext().NextTemp());
builder_.CreateCondBr(lhs_cond, rhs_bb, end_bb);
builder_.SetInsertPoint(rhs_bb);
ir::Value* rhs = std::any_cast<ir::Value*>(ctx->condUnaryExp()->accept(this));
if (rhs->GetType()->IsInt1()) {
rhs = builder_.CreateZext(rhs, module_.GetContext().NextTemp());
}
ir::Value* rhs_cond = builder_.CreateCmp(ir::CmpOp::Ne, rhs, zero, module_.GetContext().NextTemp());
ir::Value* rhs_res = builder_.CreateZext(rhs_cond, module_.GetContext().NextTemp());
builder_.CreateStore(rhs_res, res_ptr);
builder_.CreateBr(end_bb);
builder_.SetInsertPoint(end_bb);
return static_cast<ir::Value*>(builder_.CreateLoad(res_ptr, module_.GetContext().NextTemp()));
}
std::any IRGenImpl::visitLOrExp(SysYParser::LOrExpContext* ctx) {
if (ctx->lAndExp() && ctx->lOrExp() == nullptr) {
return ctx->lAndExp()->accept(this);
}
ir::AllocaInst* res_ptr = builder_.CreateAllocaI32(module_.GetContext().NextTemp());
ir::Value* one = builder_.CreateConstInt(1);
builder_.CreateStore(one, res_ptr);
ir::BasicBlock* rhs_bb = func_->CreateBlock(NextBlockName("lor_rhs"));
ir::BasicBlock* end_bb = func_->CreateBlock(NextBlockName("lor_end"));
ir::Value* lhs = std::any_cast<ir::Value*>(ctx->lOrExp()->accept(this));
ir::Value* zero = builder_.CreateConstInt(0);
if (lhs->GetType()->IsInt1()) {
lhs = builder_.CreateZext(lhs, module_.GetContext().NextTemp());
}
ir::Value* lhs_cond = builder_.CreateCmp(ir::CmpOp::Eq, lhs, zero, module_.GetContext().NextTemp());
builder_.CreateCondBr(lhs_cond, rhs_bb, end_bb);
builder_.SetInsertPoint(rhs_bb);
ir::Value* rhs = std::any_cast<ir::Value*>(ctx->lAndExp()->accept(this));
if (rhs->GetType()->IsInt1()) {
rhs = builder_.CreateZext(rhs, module_.GetContext().NextTemp());
}
ir::Value* rhs_cond = builder_.CreateCmp(ir::CmpOp::Ne, rhs, zero, module_.GetContext().NextTemp());
ir::Value* rhs_res = builder_.CreateZext(rhs_cond, module_.GetContext().NextTemp());
builder_.CreateStore(rhs_res, res_ptr);
builder_.CreateBr(end_bb);
builder_.SetInsertPoint(end_bb);
return static_cast<ir::Value*>(builder_.CreateLoad(res_ptr, module_.GetContext().NextTemp()));
}
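
`visitLAndExp` and `visitLOrExp` implement short-circuit evaluation with a result slot: the slot is pre-filled with the short-circuit answer (0 for `&&`, 1 for `||`), and the right-hand side is evaluated only in the `*_rhs` block. The sketch below expresses the same control flow in plain C++ as a reading aid. `LowerLogicalAnd` and `LowerLogicalOr` are hypothetical names, and the `std::function` parameter stands in for "generate and run the right-hand side".

```cpp
#include <functional>

// Equivalent of the land_rhs / land_end blocks.
int LowerLogicalAnd(int lhs, const std::function<int()>& eval_rhs) {
  int result = 0;                          // store 0 into res_ptr
  if (lhs != 0) {                          // cond_br lhs != 0, land_rhs, land_end
    result = (eval_rhs() != 0) ? 1 : 0;    // zext(rhs != 0), store into res_ptr
  }
  return result;                           // load res_ptr in land_end
}

// Equivalent of the lor_rhs / lor_end blocks.
int LowerLogicalOr(int lhs, const std::function<int()>& eval_rhs) {
  int result = 1;                          // store 1 into res_ptr
  if (lhs == 0) {                          // cond_br lhs == 0, lor_rhs, lor_end
    result = (eval_rhs() != 0) ? 1 : 0;
  }
  return result;                           // load res_ptr in lor_end
}
```

The key property is that `eval_rhs` runs at most once, and not at all when the left operand already decides the result.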
std::any IRGenImpl::visitCond(SysYParser::CondContext* ctx) {
if (!ctx || !ctx->lOrExp()) {
throw std::runtime_error(FormatError("irgen", "非法条件表达式"));
}
return ctx->lOrExp()->accept(this);
}
std::any IRGenImpl::visitFuncRParams(SysYParser::FuncRParamsContext* ctx) {
std::vector<ir::Value*> args;
for (auto* exp : ctx->exp()) {
args.push_back(EvalExpr(*exp));
}
return args;
}

@ -29,7 +29,7 @@ IRGenImpl::IRGenImpl(ir::Module& module, const SemanticContext& sema)
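// IR generation for the compilation unit currently implements only minimal functionality:
// - the Module is already created in GenerateIR; this function only generates its contents;
// - it currently reads the function definition in the compilation unit and hands it to visitFuncDef to generate the function IR
// - it currently reads the topLevelItem entries of the compilation unit and generates function IR for each funcDef found
//
// Not yet implemented:
// - traversal and generation for multiple function definitions;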
@ -38,11 +38,24 @@ std::any IRGenImpl::visitCompUnit(SysYParser::CompUnitContext* ctx) {
if (!ctx) {
throw std::runtime_error(FormatError("irgen", "缺少编译单元"));
}
auto* func = ctx->funcDef();
if (!func) {
throw std::runtime_error(FormatError("irgen", "缺少函数定义"));
// Initialize the global scope
storage_map_stack_.push_back({});
const_values_stack_.push_back({});
// Visit every topLevelItem
for (auto* item : ctx->topLevelItem()) {
if (!item) continue;
if (item->funcDef()) {
item->funcDef()->accept(this);
} else if (item->decl()) {
item->decl()->accept(this);
}
}
func->accept(this);
// Leave the global scope
storage_map_stack_.pop_back();
const_values_stack_.pop_back();
return {};
}
@ -61,27 +74,98 @@ std::any IRGenImpl::visitCompUnit(SysYParser::CompUnitContext* ctx) {
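// - parameter initialization logic in the entry block.
// ...
// As a result, only the minimal "parameterless int function" case is supported here for now.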
std::any IRGenImpl::visitFuncDef(SysYParser::FuncDefContext* ctx) {
if (!ctx) {
throw std::runtime_error(FormatError("irgen", "缺少函数定义"));
}
if (!ctx->blockStmt()) {
if (!ctx->block()) {
throw std::runtime_error(FormatError("irgen", "函数体为空"));
}
if (!ctx->ID()) {
throw std::runtime_error(FormatError("irgen", "缺少函数名"));
}
if (!ctx->funcType() || !ctx->funcType()->INT()) {
throw std::runtime_error(FormatError("irgen", "当前仅支持无参 int 函数"));
std::shared_ptr<ir::Type> ret_type;
if (ctx->funcType()->INT()) {
ret_type = ir::Type::GetInt32Type();
} else if (ctx->funcType()->FLOAT()) {
ret_type = ir::Type::GetFloatType();
} else if (ctx->funcType()->VOID()) {
ret_type = ir::Type::GetVoidType();
} else {
throw std::runtime_error(FormatError("irgen", "未知的函数返回类型"));
}
func_ = module_.CreateFunction(ctx->ID()->getText(), ir::Type::GetInt32Type());
builder_.SetInsertPoint(func_->GetEntry());
storage_map_.clear();
func_ = module_.CreateFunction(ctx->ID()->getText(), ret_type);
ir::BasicBlock* alloca_bb = func_->CreateBlock("alloca");
ir::BasicBlock* entry_bb = func_->CreateBlock("entry");
builder_.SetInsertPoint(entry_bb);
// Enter the function scope: push a new map
storage_map_stack_.push_back({});
const_values_stack_.push_back({});
if (ctx->funcFParams()) {
for (auto* paramCtx : ctx->funcFParams()->funcFParam()) {
std::shared_ptr<ir::Type> param_type;
bool is_array = !paramCtx->LBRACK().empty();
auto base_sema_ty = paramCtx->bType()->INT() ? SemanticType::Int : SemanticType::Float;
auto base_ir_ty = (base_sema_ty == SemanticType::Int) ? ir::Type::GetInt32Type() : ir::Type::GetFloatType();
if (is_array) {
std::shared_ptr<ir::Type> elem_ty = base_ir_ty;
auto exps = paramCtx->exp();
for (auto it = exps.rbegin(); it != exps.rend(); ++it) {
int dim = EvaluateConstInt(*it);
elem_ty = ir::Type::GetArrayType(elem_ty, dim);
}
param_type = ir::Type::GetPointerType(elem_ty);
} else {
param_type = base_ir_ty;
}
std::string arg_name = paramCtx->ID()->getText();
auto* arg = func_->AddArgument(param_type, "%arg" + std::to_string(func_->GetArgs().size()));
// Ensure param alloca is in alloca_bb
auto* current_bb = builder_.GetInsertBlock();
builder_.SetInsertPoint(alloca_bb);
ir::Instruction* alloca_inst = builder_.CreateAlloca(param_type, module_.GetContext().NextTemp());
builder_.SetInsertPoint(current_bb);
builder_.CreateStore(arg, alloca_inst);
storage_map_stack_.back()[arg_name] = alloca_inst;
}
}
ctx->block()->accept(this);
// Implicit return for void functions or main
if (!builder_.GetInsertBlock()->HasTerminator()) {
if (func_->GetType()->IsVoid()) {
builder_.CreateRet(nullptr);
} else if (func_->GetName() == "main") {
builder_.CreateRet(builder_.CreateConstInt(0));
} else {
if (func_->GetType()->IsFloat()) {
builder_.CreateRet(module_.GetContext().GetConstFloat(0.0f));
} else {
builder_.CreateRet(builder_.CreateConstInt(0));
}
}
}
// Branch from alloca_bb to entry_bb
builder_.SetInsertPoint(alloca_bb);
builder_.CreateBr(entry_bb);
ctx->blockStmt()->accept(this);
// Semantic correctness is mainly guaranteed by sema; this is only a fallback check that the IR structure is valid.
VerifyFunctionStructure(*func_);
func_ = nullptr;
// Leave the function scope: pop the map
storage_map_stack_.pop_back();
const_values_stack_.pop_back();
return {};
}

@ -9,9 +9,9 @@
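// Statement generation currently implements only a minimal subset.
// Currently supported:
// - return <exp>;
// - assignment statements: lVal = exp;
//
// Not yet supported:
// - assignment statements
// - control flow such as if / while
// - statement forms beyond empty statements and nested block dispatch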
@ -19,21 +19,178 @@ std::any IRGenImpl::visitStmt(SysYParser::StmtContext* ctx) {
if (!ctx) {
throw std::runtime_error(FormatError("irgen", "缺少语句"));
}
if (ctx->returnStmt()) {
return ctx->returnStmt()->accept(this);
}
throw std::runtime_error(FormatError("irgen", "暂不支持的语句类型"));
}
if (ctx->lVal() && ctx->ASSIGN()) {
if (!ctx->exp()) {
throw std::runtime_error(FormatError("irgen", "赋值语句缺少表达式"));
}
ir::Value* rhs = EvalExpr(*ctx->exp());
auto* lval_ctx = ctx->lVal();
if (!lval_ctx || !lval_ctx->ID()) {
throw std::runtime_error(FormatError("irgen", "当前仅支持普通整型变量赋值"));
}
const auto* decl = sema_.ResolveObjectUse(lval_ctx);
if (!decl) {
throw std::runtime_error(
FormatError("irgen", "变量使用缺少语义绑定:" + lval_ctx->ID()->getText()));
}
std::string var_name = lval_ctx->ID()->getText();
ir::Value* slot = FindStorage(var_name);
if (!slot) {
throw std::runtime_error(
FormatError("irgen", "变量声明缺少存储槽位:" + var_name));
}
ir::Value* ptr = slot;
auto ptr_ty = ptr->GetType();
bool is_param = false;
// If it's a pointer to a pointer (function parameter case), load the pointer value first
if (ptr_ty->IsPointer() && ptr_ty->GetPointedType()->IsPointer()) {
ptr = builder_.CreateLoad(ptr, module_.GetContext().NextTemp());
is_param = true;
}
if (ptr->IsArgument()) is_param = true;
if (!lval_ctx->exp().empty()) {
std::vector<ir::Value*> indices;
if (ptr->GetType()->IsPointer() && ptr->GetType()->GetPointedType()->IsArray()) {
if (!is_param) {
indices.push_back(builder_.CreateConstInt(0));
}
}
for (auto* exp_ctx : lval_ctx->exp()) {
indices.push_back(EvalExpr(*exp_ctx));
}
auto res_ptr_ty = GetGEPResultType(ptr, indices);
ptr = builder_.CreateGEP(res_ptr_ty, ptr, indices, module_.GetContext().NextTemp());
}
std::any IRGenImpl::visitReturnStmt(SysYParser::ReturnStmtContext* ctx) {
if (!ctx) {
throw std::runtime_error(FormatError("irgen", "缺少 return 语句"));
// Implicit conversion for assignment
if ((ptr->GetType()->IsPtrFloat() || (ptr->GetType()->IsArray() && ptr->GetType()->GetElementType()->IsFloat())) && !rhs->GetType()->IsFloat()) {
rhs = builder_.CreateSIToFP(rhs, module_.GetContext().NextTemp());
} else if ((ptr->GetType()->IsPtrInt32() || (ptr->GetType()->IsArray() && ptr->GetType()->GetElementType()->IsInt32())) && rhs->GetType()->IsFloat()) {
rhs = builder_.CreateFPToSI(rhs, module_.GetContext().NextTemp());
}
builder_.CreateStore(rhs, ptr);
return BlockFlow::Continue;
}
if (ctx->IF()) {
ir::Value* cond_val = std::any_cast<ir::Value*>(ctx->cond()->accept(this));
// cond_val must be i1, if it's not we need to check if it's != 0
if (cond_val->GetType()->IsInt32()) {
ir::Value* zero = builder_.CreateConstInt(0);
cond_val = builder_.CreateCmp(ir::CmpOp::Ne, cond_val, zero, module_.GetContext().NextTemp());
} else if (cond_val->GetType()->IsFloat()) {
ir::Value* zero = module_.GetContext().GetConstFloat(0.0f);
cond_val = builder_.CreateCmp(ir::CmpOp::Ne, cond_val, zero, module_.GetContext().NextTemp());
}
ir::BasicBlock* then_bb = func_->CreateBlock(NextBlockName("if_then"));
ir::BasicBlock* else_bb = ctx->ELSE() ? func_->CreateBlock(NextBlockName("if_else")) : nullptr;
ir::BasicBlock* merge_bb = func_->CreateBlock(NextBlockName("if_merge"));
builder_.CreateCondBr(cond_val, then_bb, else_bb ? else_bb : merge_bb);
builder_.SetInsertPoint(then_bb);
auto then_flow = std::any_cast<BlockFlow>(ctx->stmt(0)->accept(this));
if (then_flow == BlockFlow::Continue) {
builder_.CreateBr(merge_bb);
}
if (ctx->ELSE()) {
builder_.SetInsertPoint(else_bb);
auto else_flow = std::any_cast<BlockFlow>(ctx->stmt(1)->accept(this));
if (else_flow == BlockFlow::Continue) {
builder_.CreateBr(merge_bb);
}
}
builder_.SetInsertPoint(merge_bb);
return BlockFlow::Continue;
}
if (ctx->WHILE()) {
ir::BasicBlock* cond_bb = func_->CreateBlock(NextBlockName("while_cond"));
ir::BasicBlock* body_bb = func_->CreateBlock(NextBlockName("while_body"));
ir::BasicBlock* exit_bb = func_->CreateBlock(NextBlockName("while_exit"));
builder_.CreateBr(cond_bb);
builder_.SetInsertPoint(cond_bb);
ir::Value* cond_val = std::any_cast<ir::Value*>(ctx->cond()->accept(this));
if (cond_val->GetType()->IsInt32()) {
ir::Value* zero = builder_.CreateConstInt(0);
cond_val = builder_.CreateCmp(ir::CmpOp::Ne, cond_val, zero, module_.GetContext().NextTemp());
} else if (cond_val->GetType()->IsFloat()) {
ir::Value* zero = module_.GetContext().GetConstFloat(0.0f);
cond_val = builder_.CreateCmp(ir::CmpOp::Ne, cond_val, zero, module_.GetContext().NextTemp());
}
builder_.CreateCondBr(cond_val, body_bb, exit_bb);
builder_.SetInsertPoint(body_bb);
ir::BasicBlock* old_cond = current_loop_cond_bb_;
ir::BasicBlock* old_exit = current_loop_exit_bb_;
current_loop_cond_bb_ = cond_bb;
current_loop_exit_bb_ = exit_bb;
auto body_flow = std::any_cast<BlockFlow>(ctx->stmt(0)->accept(this));
if (body_flow == BlockFlow::Continue) {
builder_.CreateBr(cond_bb);
}
current_loop_cond_bb_ = old_cond;
current_loop_exit_bb_ = old_exit;
builder_.SetInsertPoint(exit_bb);
return BlockFlow::Continue;
}
if (!ctx->exp()) {
throw std::runtime_error(FormatError("irgen", "return 缺少表达式"));
if (ctx->BREAK()) {
if (!current_loop_exit_bb_) {
throw std::runtime_error(FormatError("irgen", "break 必须在循环内"));
}
builder_.CreateBr(current_loop_exit_bb_);
return BlockFlow::Terminated;
}
ir::Value* v = EvalExpr(*ctx->exp());
builder_.CreateRet(v);
return BlockFlow::Terminated;
if (ctx->CONTINUE()) {
if (!current_loop_cond_bb_) {
throw std::runtime_error(FormatError("irgen", "continue 必须在循环内"));
}
builder_.CreateBr(current_loop_cond_bb_);
return BlockFlow::Terminated;
}
if (ctx->RETURN()) {
if (ctx->exp()) {
ir::Value* v = EvalExpr(*ctx->exp());
// Handle return type conversion if necessary
if (func_->GetType()->IsFloat() && !v->GetType()->IsFloat()) {
v = builder_.CreateSIToFP(v, module_.GetContext().NextTemp());
} else if (func_->GetType()->IsInt32() && v->GetType()->IsFloat()) {
v = builder_.CreateFPToSI(v, module_.GetContext().NextTemp());
}
builder_.CreateRet(v);
} else {
builder_.CreateRet(nullptr); // nullptr for void ret
}
return BlockFlow::Terminated;
}
if (ctx->block()) {
return ctx->block()->accept(this);
}
if (ctx->exp()) {
EvalExpr(*ctx->exp());
return BlockFlow::Continue;
}
if (ctx->SEMICOLON()) {
return BlockFlow::Continue;
}
throw std::runtime_error(FormatError("irgen", "暂不支持的语句类型"));
}
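
The `while` branch above saves `current_loop_cond_bb_` and `current_loop_exit_bb_`, overwrites them for the loop body, and restores them afterwards so that `break` and `continue` in nested loops target the right blocks. One way to make that save/restore robust against early exits (for example, an exception thrown while visiting the body) is a small RAII guard. The sketch below is an alternative under stated assumptions, not what the commit does: `LoopScope` is a hypothetical class and `BasicBlock` is an empty stand-in for `ir::BasicBlock`.

```cpp
// Empty stand-in for ir::BasicBlock (assumption for a self-contained sketch).
struct BasicBlock {};

// Installs the loop's cond/exit targets on construction and restores the
// enclosing loop's targets on destruction.
class LoopScope {
 public:
  LoopScope(BasicBlock*& cond_slot, BasicBlock*& exit_slot,
            BasicBlock* cond, BasicBlock* exit)
      : cond_slot_(cond_slot),
        exit_slot_(exit_slot),
        saved_cond_(cond_slot),
        saved_exit_(exit_slot) {
    cond_slot_ = cond;
    exit_slot_ = exit;
  }
  ~LoopScope() {
    cond_slot_ = saved_cond_;
    exit_slot_ = saved_exit_;
  }
  LoopScope(const LoopScope&) = delete;
  LoopScope& operator=(const LoopScope&) = delete;

 private:
  BasicBlock*& cond_slot_;
  BasicBlock*& exit_slot_;
  BasicBlock* saved_cond_;
  BasicBlock* saved_exit_;
};
```

With such a guard, the body visit would sit inside a block that declares `LoopScope scope(current_loop_cond_bb_, current_loop_exit_bb_, cond_bb, exit_bb);`, and the two manual restore assignments would no longer be needed.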

@ -87,9 +87,9 @@ void LowerInstruction(const ir::Instruction& inst, MachineFunction& function,
case ir::Opcode::Sub:
case ir::Opcode::Mul:
throw std::runtime_error(FormatError("mir", "暂不支持该二元运算"));
default:
throw std::runtime_error(FormatError("mir", "暂不支持该 IR 指令"));
}
throw std::runtime_error(FormatError("mir", "暂不支持该 IR 指令"));
}
} // namespace

File diff suppressed because it is too large

@ -1,17 +1,39 @@
// Maintains registration and lookup of local variable declarations.
// Maintains registration and scope-aware lookup of object symbols.
#include "sem/SymbolTable.h"
void SymbolTable::Add(const std::string& name,
SysYParser::VarDefContext* decl) {
table_[name] = decl;
#include <stdexcept>
SymbolTable::SymbolTable() : scopes_(1) {}
void SymbolTable::EnterScope() { scopes_.emplace_back(); }
void SymbolTable::ExitScope() {
if (scopes_.size() <= 1) {
throw std::runtime_error("symbol table scope underflow");
}
scopes_.pop_back();
}
bool SymbolTable::Contains(const std::string& name) const {
return table_.find(name) != table_.end();
bool SymbolTable::Add(const ObjectBinding& symbol) {
auto& scope = scopes_.back();
return scope.emplace(symbol.name, symbol).second;
}
SysYParser::VarDefContext* SymbolTable::Lookup(const std::string& name) const {
auto it = table_.find(name);
return it == table_.end() ? nullptr : it->second;
bool SymbolTable::ContainsInCurrentScope(std::string_view name) const {
const auto& scope = scopes_.back();
return scope.find(std::string(name)) != scope.end();
}
const ObjectBinding* SymbolTable::Lookup(std::string_view name) const {
const std::string key(name);
for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
auto found = it->find(key);
if (found != it->end()) {
return &found->second;
}
}
return nullptr;
}
size_t SymbolTable::Depth() const { return scopes_.size(); }
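
The new SymbolTable keeps a stack of scopes: Add inserts into the innermost scope only, and Lookup walks from the innermost scope outward, so an inner declaration shadows an outer one with the same name. The sketch below reproduces that behavior in a self-contained form. `ScopedTable`, the `decl_line` field, and `DemoShadowing` are hypothetical; the real `ObjectBinding` is defined in sem/SymbolTable.h, which is not shown in this diff, and the use of `std::unordered_map` for each scope is an assumption.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <string_view>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the real ObjectBinding; only `name` is taken from the diff.
struct ObjectBinding {
  std::string name;
  int decl_line;  // illustrative payload only
};

class ScopedTable {
 public:
  ScopedTable() : scopes_(1) {}  // start with one global scope
  void EnterScope() { scopes_.emplace_back(); }
  void ExitScope() {
    if (scopes_.size() <= 1) throw std::runtime_error("symbol table scope underflow");
    scopes_.pop_back();
  }
  // Returns false if the name already exists in the innermost scope.
  bool Add(const ObjectBinding& b) { return scopes_.back().emplace(b.name, b).second; }
  // Searches from the innermost scope outward; nullptr if the name is unknown.
  const ObjectBinding* Lookup(std::string_view name) const {
    const std::string key(name);
    for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
      auto found = it->find(key);
      if (found != it->end()) return &found->second;
    }
    return nullptr;
  }

 private:
  std::vector<std::unordered_map<std::string, ObjectBinding>> scopes_;
};

// Declares "a" in two nested scopes and reports which binding is visible
// inside (tens digit) and after leaving (ones digit) the inner scope.
int DemoShadowing() {
  ScopedTable t;
  t.Add({"a", 1});
  t.EnterScope();
  t.Add({"a", 2});  // shadows the outer "a"
  int inner = t.Lookup("a")->decl_line;
  t.ExitScope();
  int outer = t.Lookup("a")->decl_line;
  return inner * 10 + outer;
}
```

Because Add only consults the innermost scope, redeclaring a name in the same scope is rejected while redeclaring it in a nested scope is accepted, which is the distinction a semantic checker needs for block-scoped variables.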

@ -1,4 +1,49 @@
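// SysY runtime library implementation:
// - Provides I/O and related functions as required by the lab/evaluation specification
// - Linked with the target code generated by the compiler to support runtime behavior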
#include <stdio.h>
#include <stdarg.h>
#include <sys/time.h>
/* Input functions */
int getint() { int t; scanf("%d", &t); return t; }
int getch() { char t; scanf("%c", &t); return (int)t; }
float getfloat() { float t; scanf("%f", &t); return t; }
int getarray(int a[]) {
int n;
scanf("%d", &n);
for (int i = 0; i < n; i++) scanf("%d", &a[i]);
return n;
}
int getfarray(float a[]) {
int n;
scanf("%d", &n);
for (int i = 0; i < n; i++) scanf("%f", &a[i]);
return n;
}
/* Output functions */
void putint(int a) { printf("%d", a); }
void putch(int a) { printf("%c", (char)a); }
void putfloat(float a) { printf("%a", a); }
void putarray(int n, int a[]) {
printf("%d:", n);
for (int i = 0; i < n; i++) printf(" %d", a[i]);
printf("\n");
}
void putfarray(int n, float a[]) {
printf("%d:", n);
for (int i = 0; i < n; i++) printf(" %a", a[i]);
printf("\n");
}
/* Timing functions */
struct timeval _sysy_start, _sysy_end;
void starttime() { gettimeofday(&_sysy_start, NULL); }
void stoptime() {
gettimeofday(&_sysy_end, NULL);
int millis = (_sysy_end.tv_sec - _sysy_start.tv_sec) * 1000 +
(_sysy_end.tv_usec - _sysy_start.tv_usec) / 1000;
fprintf(stderr, "Timer: %d ms\n", millis);
}
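
The elapsed-time computation in stoptime() adds the whole-second difference (scaled to milliseconds) to the microsecond difference (scaled to milliseconds). The microsecond term can be negative when the end timestamp has a smaller tv_usec than the start, and the sum still comes out right. A minimal sketch of the same arithmetic on plain integers follows; `ElapsedMillis` and `CheckElapsedMillis` are hypothetical helpers, not part of the runtime library.

```cpp
#include <cassert>

// Same arithmetic as stoptime(): seconds scaled to ms plus the
// (possibly negative) microsecond difference scaled to ms.
long ElapsedMillis(long start_sec, long start_usec, long end_sec, long end_usec) {
  return (end_sec - start_sec) * 1000 + (end_usec - start_usec) / 1000;
}

// Self-check: one case with a negative microsecond term, one without.
void CheckElapsedMillis() {
  // 1.9 s -> 2.1 s: 1000 ms + (-800000 us / 1000) = 200 ms
  assert(ElapsedMillis(1, 900000, 2, 100000) == 200);
  // 5.0 s -> 5.25 s: 0 ms + 250 ms
  assert(ElapsedMillis(5, 0, 5, 250000) == 250);
}
```

Integer division truncates toward zero, so sub-millisecond remainders are dropped rather than rounded.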

Binary file not shown.

@ -0,0 +1,6 @@
// Test: simple addition
int main() {
int a = 1;
int b = 2;
return a + b;
}

@ -0,0 +1,8 @@
// Test: subtraction and multiplication
int main() {
int a = 10;
int b = 3;
int c = a - b;
int d = a * b;
return c + d;
}

@ -0,0 +1,8 @@
// Test: division and modulo
int main() {
int a = 20;
int b = 6;
int c = a / b;
int d = a % b;
return c + d;
}

@ -0,0 +1,7 @@
// Test: unary operators (plus and minus signs)
int main() {
int a = 5;
int b = -a;
int c = +10;
return b + c;
}

@ -0,0 +1,7 @@
// Test: assignment expression
int main() {
int a = 10;
int b = 20;
a = b;
return a;
}

@ -0,0 +1,5 @@
// Test: comma-separated declaration of multiple variables
int main() {
int a = 1, b = 2, c = 3;
return a + b + c;
}

@ -0,0 +1,14 @@
// Test: combined test (all features)
int main() {
int a = 10, b = 5;
int c = a + b;
int d = a - b;
int e = a * 2;
int f = a / b;
int g = a % b;
int h = -c;
int i = +d;
a = b + c;
b = d + e;
return a + b + f + g + h + i;
}

@ -0,0 +1 @@
int main(){ break; return 0; }

@ -0,0 +1,2 @@
int f(int x){ return x; }
int main(){ return f(); }

@ -0,0 +1,2 @@
void f(){ return 1; }
int main(){ return 0; }

@ -0,0 +1 @@
int main(){ return a; }

@ -0,0 +1,47 @@
#!/bin/bash
# Batch test script
# Walks every .sy file under test/test_case and checks that it parses successfully
if [ ! -f "./build/bin/compiler" ]; then
echo "Compiler executable not found at ./build/bin/compiler. Please build the project first."
exit 1
fi
FAIL_COUNT=0
PASS_COUNT=0
FAILED_FILES=()
echo "Starting batch parse tests..."
echo "========================================="
# Find all .sy files and test each one
while IFS= read -r file; do
# Run the parser and discard both stdout and stderr; only the exit status is checked
./build/bin/compiler --emit-parse-tree "$file" > /dev/null 2>&1
if [ $? -ne 0 ]; then
echo "❌ Parse failed: $file"
FAIL_COUNT=$((FAIL_COUNT+1))
FAILED_FILES+=("$file")
else
echo "✅ Parse succeeded: $file"
PASS_COUNT=$((PASS_COUNT+1))
fi
done < <(find test/test_case -type f -name "*.sy" | sort)
echo "========================================="
echo "Testing complete!"
echo "Passed: $PASS_COUNT"
echo "Failed: $FAIL_COUNT"
if [ $FAIL_COUNT -ne 0 ]; then
echo "Failed files:"
for f in "${FAILED_FILES[@]}"; do
echo " - $f"
done
exit 1
else
echo "🎉 All test cases parsed successfully!"
exit 0
fi

@ -0,0 +1,34 @@
#include <exception>
#include <iostream>
#include <string>
#include "frontend/AntlrDriver.h"
#include "sem/Sema.h"
#include "utils/Log.h"
int main(int argc, char** argv) {
if (argc < 2) {
std::cerr << "usage: sema_check <input.sy> [more.sy...]\n";
return 2;
}
bool failed = false;
for (int i = 1; i < argc; ++i) {
const std::string path = argv[i];
try {
auto antlr = ParseFileWithAntlr(path);
auto* comp_unit = dynamic_cast<SysYParser::CompUnitContext*>(antlr.tree);
if (!comp_unit) {
throw std::runtime_error(FormatError("sema_check", "parse tree root node is not a compUnit"));
}
(void)RunSema(*comp_unit);
std::cout << "OK " << path << "\n";
} catch (const std::exception& ex) {
failed = true;
std::cout << "ERR " << path << "\n";
PrintException(std::cout, ex);
}
}
return failed ? 1 : 0;
}

@ -0,0 +1,19 @@
(base) root@HP:/home/hp/nudt-compiler-cpp/build# make -j$(nproc)
[ 2%] Built target utils
[ 2%] Building CXX object src/ir/CMakeFiles/ir_core.dir/Type.cpp.o
[ 3%] Building CXX object src/ir/CMakeFiles/ir_core.dir/Value.cpp.o
[ 73%] Built target antlr4_runtime
[ 75%] Built target sem
[ 79%] Built target frontend
[ 80%] Linking CXX static library libir_core.a
[ 84%] Built target ir_core
[ 85%] Built target ir_analysis
[ 89%] Built target ir_passes
[ 94%] Built target mir_core
[ 97%] Built target irgen
[ 99%] Built target mir_passes
[ 99%] Linking CXX executable ../bin/compiler
[100%] Built target compiler
(base) root@HP:/home/hp/nudt-compiler-cpp/build# cd ..
(base) root@HP:/home/hp/nudt-compiler-cpp# ./scripts/verify_ir.sh test/test_case/functional/09_func_defn.sy --run
[error] [irgen] 变量声明缺少存储槽位a