lzkk
be3a5640ee
perf(mir): AArch64 scaled addressing: GEP+Load/Store directly emits ldr/str [base, idx, uxtw #2]
...
Eliminates the sxtw+shl+add chain in array accesses, replacing it with a single scaled-addressing instruction.
crypto -76(-4.2%), shuffle -39(-8.3%), sort -22(-3.4%), matmul -16(-4.3%)
4 days ago
lzkk
d51cbc49f1
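perf(mir): RegAlloc biased-coloring coalescing (copy_edges coalescing)
...
Collects the vregs at both ends of each MovReg as copy_edges; during reverse-order coloring, the source operand's color is preferred.
Fewer pointless MovReg instructions remain; all test cases improve (matmul -3.5%, h-10 -3.5%).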
4 days ago
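The biased-coloring idea in d51cbc49f1 can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `PickBiasedColor`, `AdjMap`, and the plain-int vreg/color types are hypothetical, and the sketch prefers the color of any already-colored copy partner rather than specifically the source operand.

```cpp
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-ins: vregs and colors are plain ints here.
using VReg = int;
using Color = int;
using AdjMap = std::unordered_map<VReg, std::vector<VReg>>;

// Picks a color for `v`. Colors used by already-colored interfering
// neighbors are excluded. Among the remaining colors, the color of a
// copy partner (the other end of a MovReg) is tried first, so the move
// becomes a no-op; otherwise the lowest free color in [0, K) is used.
// Returns -1 when no color is free (the caller would have to spill).
Color PickBiasedColor(VReg v, int K, const AdjMap& copy_edges,
                      const AdjMap& interference,
                      const std::unordered_map<VReg, Color>& color) {
  std::unordered_set<Color> taken;
  auto nb = interference.find(v);
  if (nb != interference.end()) {
    for (VReg n : nb->second) {
      auto c = color.find(n);
      if (c != color.end()) taken.insert(c->second);
    }
  }
  auto cp = copy_edges.find(v);
  if (cp != copy_edges.end()) {
    for (VReg p : cp->second) {
      auto c = color.find(p);
      if (c != color.end() && taken.count(c->second) == 0) return c->second;
    }
  }
  for (Color c = 0; c < K; ++c) {
    if (taken.count(c) == 0) return c;
  }
  return -1;
}
```

When the preferred color is already taken by an interfering neighbor, the sketch falls back to ordinary lowest-free-color selection, so the bias never costs a color.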
lzkk
ee3b42ac40
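feat(opt): switch to the teammate's code baseline, 100% functionally correct
...
Chaitin-Briggs graph-coloring register allocation; K=16, no spill needed.
IRGen starttime/stoptime fix (drops the _sysy_ prefix and the lineno argument).
This commit is the safe starting point for subsequent optimization work.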
4 days ago
lzkk
da1e456133
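feat(mir): implement an LLVM-style greedy register allocator with a unified architecture
...
Core changes:
- MIR.h: extend LiveInterval (VNInfo/UsePosition/Segment) + LiveRegMatrix + RegClass
- GreedyAlloc.cpp: TryAssign/TryAnyFreeReg/TryEvict/TrySplit greedy allocation + RewriteSpills
- InstLiveness.cpp: EnhanceIntervals forward pass + ComputeInstLiveness adaptation
- MIRBasicBlock.cpp: InsertInst/ReplaceVReg API
- main.cpp: switch to RunGreedyRegAlloc
- RegAlloc.cpp/LinearScanAlloc.cpp: isolated with #if 0
Architecture: priority-queue-driven allocation (a fresh allocation each round), TryEvict evicts unconditionally,
spill rewriting via StoreStack+LoadStack, interval splitting to handle high register pressure.
Functional test pass rate: 53/100 (the remaining 47 cases need debugging of the spill-rewrite loop)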
5 days ago
lzkk
c12b6830b8
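fix(regalloc): MAX_SPILL_ROUNDS=1 + conservative-repair threshold 20→200, fixing incorrect spill code
...
Root cause: under block-level liveness, the reload vregs created by multi-round spilling interact with the conservative repair
(full interference among block_defs), producing incorrect register assignments that cause segfaults and output mismatches.
Fix:
- MAX_SPILL_ROUNDS 3→1: prevents multi-round spilling from producing incorrect reload vregs
- Conservative-repair threshold 20→200: avoids over-interference that leads graph coloring to assign registers incorrectly
Fixed cases:
- 04_arr_defn3: segfault → correct (14)
- 05_arr_defn4: wrong output → correct (21)
- 09_BFS: bad_alloc/segfault → correct
- 13_LCA, 54_hidden_var, and several other pre-existing failures are fixed as well
Remaining known issues: 84_long_array2 (compile timeout), 30_many_dimensions (GEP offset)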
6 days ago
lzkk
d238777f17
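fix(regalloc): eliminate exponential spill-code blow-up by unifying MAX_SPILL_ROUNDS at 3
...
Root cause: MAX_SPILL_ROUNDS was 10 for functions with ≤120 vregs, so the number of spills roughly doubled each round
(14→25→48→94→186→370→738→1474→2946→5890);
mm1, with 67 vregs, accumulated 11,785 frame slots, a 138KB frame, and 85K instructions.
Fix:
- Unify MAX_SPILL_ROUNDS at 3 to prevent cascading blow-up
- Add AssignSpillSlots: spilled vregs with non-overlapping live intervals share a frame slot
- RewriteWithAllocation takes an optional liveness parameter to support slot sharing
Result (mm1): 529 lines (-99.4%), frame 1232 bytes (-99.1%)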
6 days ago
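The slot-sharing rule described for AssignSpillSlots in d238777f17 (spilled vregs with non-overlapping live intervals share a frame slot) might look something like the sketch below. The names `SpillInterval` and `AssignSlotsSketch`, the half-open interval convention, and the first-fit strategy are all assumptions made for illustration; the commit does not say how the real function chooses slots.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical live interval of a spilled vreg, half-open: [start, end).
struct SpillInterval {
  int vreg;
  int start;
  int end;
};

// First-fit slot assignment: intervals are visited in order of start
// position, and each one reuses the first slot whose previous occupant
// has already ended. Returns the slot index for each input interval,
// in input order.
std::vector<int> AssignSlotsSketch(const std::vector<SpillInterval>& ivs) {
  std::vector<std::size_t> order(ivs.size());
  for (std::size_t i = 0; i < order.size(); ++i) order[i] = i;
  std::sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
    return ivs[a].start < ivs[b].start;
  });

  std::vector<int> slot_end;             // end of the last interval per slot
  std::vector<int> slot(ivs.size(), -1);
  for (std::size_t i : order) {
    int chosen = -1;
    for (std::size_t s = 0; s < slot_end.size(); ++s) {
      if (slot_end[s] <= ivs[i].start) {  // no overlap with this slot
        chosen = static_cast<int>(s);
        break;
      }
    }
    if (chosen < 0) {                     // every slot is still live
      chosen = static_cast<int>(slot_end.size());
      slot_end.push_back(0);
    }
    slot_end[chosen] = ivs[i].end;
    slot[i] = chosen;
  }
  return slot;
}
```

With intervals [0,10), [5,15), and [10,20), the first and third share slot 0 while the second needs slot 1, so three spilled vregs occupy two frame slots.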
lzkk
5300e2c1ec
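fix(hooks): fix session crashes + tune the development-convention configuration
...
- block-destructive.sh: remove set -e, add the missing git checkout/clean protection, degrade safely on empty stdin
- spec-reminder.sh: trim from ~300 to 150 characters to reduce token usage
- memory-guard.sh: fix the pgrep process-matching pattern
- settings.json: tighten the PreToolUse matcher (matches only 6 classes of dangerous commands), disable chrome MCP
- RegAlloc.cpp: MAX_SPILL_ROUNDS 3→5; conservative full-interference repair for large blocks (>20 defs)
- CLAUDE.md: sync the spill round count, add the shift chain failure mode, update the tool-orchestration notes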
6 days ago
lzkk
fccd935a24
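feat(backend): add AddImm/SubImm opcodes, eliminating redundant MovImm
...
AArch64 add/sub accept a 12-bit immediate, but MIR only had AddRR/SubRR,
so a constant RHS required a MovImm followed by the RR operation. This change:
- MIR.h: add the AddImm and SubImm opcodes
- Lowering.cpp: when lowering Add/Sub, a constant RHS in the range 0-4095 uses AddImm/SubImm directly
- RegAlloc.cpp: AddImm/SubImm reuse the def-use analysis of AddRR/SubRR
- AsmPrinter.cpp: the generic printer handles Imm operands (#value) automatically
Results (versus the CmpImm baseline):
- sl1-3: 261→247 (-14, -5.4%)
- huffman-01-03: 792→790 (-2)
- h-5-01-03: 341→338 (-3)
- Total reduction of 55 lines across all 60 performance cases
- 0 new functional-test failures
Updates: new entry added to 优化记录.md; baseline updated automatically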
7 days ago
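The opcode choice described in fccd935a24 comes down to a range check on the constant RHS. A minimal sketch follows, assuming a local `Opcode` enum that only borrows the four opcode names from the commit message; `FitsUImm12` and `SelectAddSubOpcode` are hypothetical helpers, not functions from Lowering.cpp.

```cpp
#include <cstdint>

// Local stand-in for the MIR opcodes named in the commit message.
enum class Opcode { AddRR, SubRR, AddImm, SubImm };

// True when the value fits the unshifted 12-bit unsigned immediate
// field of AArch64 add/sub, i.e. the 0-4095 range the commit uses.
constexpr bool FitsUImm12(std::int64_t v) { return v >= 0 && v <= 4095; }

// Chooses the immediate form when the RHS is a constant in range,
// otherwise the register-register form (which would still need the
// constant materialized by a MovImm first).
constexpr Opcode SelectAddSubOpcode(bool is_sub, bool rhs_is_const,
                                    std::int64_t rhs_value) {
  return (rhs_is_const && FitsUImm12(rhs_value))
             ? (is_sub ? Opcode::SubImm : Opcode::AddImm)
             : (is_sub ? Opcode::SubRR : Opcode::AddRR);
}
```

For example, `x + 4095` would select AddImm under this sketch, while `x + 4096` falls back to AddRR.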
黄熙哲
6b9cf3a448
fix(backend): add x16/x17 to GP allocatable set to fix segfaults
...
Adding x16 and x17 (IP0/IP1, caller-saved) increases GP registers
from 16 to 18, reducing register pressure for large functions.
Fixes segfaults: 39_fp_params (64 params), 30_many_dimensions (2MB frame).
Also improves performance: crc -8, fft0 -4, huffman -12, sl -1 etc.
1 week ago
黄熙哲
5902060dae
fix(backend): lower coalesce skip threshold to fix segfaults
...
Change coalesce skip condition from vregs >150 to:
move_prefs > 100 || vregs * move_prefs > 600
The original threshold of 150 was too coarse — it missed functions
like conv2d (71 vregs, 15 moves) whose coalescing still produces
incorrect spill code. The new product condition catches functions
whose move graph complexity indicates risky coalescing.
Fixes segfaults: conv2d-1/2/3, 65_color, 68_brainfk, 37_dct.
1 week ago
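The skip condition quoted in 5902060dae can be written as a small predicate. The function name `ShouldSkipCoalescing` is hypothetical; only the two thresholds come from the commit message.

```cpp
#include <cstddef>

// Skip active coalescing when there are many move preferences, or when
// the product of vreg count and move-preference count is large.
constexpr bool ShouldSkipCoalescing(std::size_t vregs, std::size_t move_prefs) {
  return move_prefs > 100 || vregs * move_prefs > 600;
}
```

For the conv2d figures given in the commit (71 vregs, 15 moves) the product is 1065, so coalescing is skipped, whereas the earlier `vregs > 150` rule would have let it through.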
黄熙哲
34cb79449f
fix(backend): skip coalescing for large functions to prevent segfault
...
For functions with >150 vregs, discard move_preferences after
collection to skip active coalescing. Large functions like
conv2d, 65_color, 68_brainfk have complex interference graphs
that cause coalescing to generate incorrect spill code.
Fixes segfaults in: conv2d-1/2/3, 65_color, 68_brainfk, 37_dct.
Known limitations: 30_many_dimensions and 39_fp_params still
segfault (pre-existing original compiler bugs in lowering/RA).
Minor instruction count changes: h-8 +2.5%, matmul +7% etc.
1 week ago
黄熙哲
b7e78ebd56
fix(backend): AsmPrinter large frame + RegAlloc spill limit
...
Apply only proven-safe fixes on clean baseline:
- AsmPrinter: movz/movk for large stack offsets (>12KB)
  30_many_dimensions: 7M -> 1455 lines (99.9% reduction)
- RegAlloc: limit spill rounds to 3 for large functions (>120 vregs)
  39_fp_params: >120s -> <1s compilation
Zero instruction count regression confirmed.
57/60 performance tests at historical best baseline.
1 week ago
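The movz/movk approach mentioned in b7e78ebd56 builds a large constant 16 bits at a time: movz writes one 16-bit chunk and zeroes the rest of the register, and each movk overwrites one further chunk while keeping the others. The sketch below shows that decomposition; `MaterializeOffset`, the textual output format, and the choice of scratch register are assumptions for illustration, not the AsmPrinter's actual interface.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Emits a movz/movk sequence that loads `value` into `reg`.
// All-zero chunks are skipped; the first emitted instruction is movz
// (which clears the other bits) and the rest are movk.
std::vector<std::string> MaterializeOffset(std::uint64_t value,
                                           const std::string& reg) {
  std::vector<std::string> out;
  if (value == 0) {
    out.push_back("movz " + reg + ", #0");
    return out;
  }
  for (int shift = 0; shift < 64; shift += 16) {
    std::uint64_t chunk = (value >> shift) & 0xFFFFu;
    if (chunk == 0) continue;
    std::string line = out.empty() ? "movz " : "movk ";
    line += reg + ", #" + std::to_string(chunk);
    if (shift != 0) line += ", lsl #" + std::to_string(shift);
    out.push_back(line);
  }
  return out;
}
```

A 2MB offset (0x200000) has a single non-zero chunk, so this sketch emits one instruction, `movz x9, #32, lsl #16`, when called with a hypothetical scratch register `x9`.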
黄熙哲
39b7e2ed19
feat(backend): loop-depth weighted spill cost model
...
Adds DFS-based back-edge detection to compute basic block loop
nesting depth. Each vreg inherits the max loop depth of its
defining blocks. Spill cost multiplies interval+ref by 10^depth,
making loop-carried variables much more expensive to spill.
2 weeks ago
黄熙哲
993e81363a
fix(backend): recompute degree unconditionally after MergeInto
...
After a merge, u inherits v's neighbors, so degree[u] must always
be recomputed. Previously, when degree[u] < K before merge, the
stale low degree was kept, which could push a high-degree merged
node into simplify_worklist with wrong metadata.
Also remove redundant if(!remaining.empty()) guard in spill path
and clean up extra brace from removed GiveUpPhase.
2 weeks ago
黄熙哲
570253f1f2
feat(backend): relax Briggs threshold to 2*K and fix move_adj self-loop
...
Using >= 2*K instead of >= K for high-degree neighbor count allows
more node pairs to be safely merged. Fixed a bug in MergeInto where
move_adj[u] could contain u (self-loop) when v's move set included u,
causing iterator invalidation during move_adj cleanup.
2 weeks ago
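For readers unfamiliar with the Briggs test referenced in 570253f1f2, here is a rough sketch. The classic test accepts a merge of u and v when the merged node would have fewer than K neighbors of degree >= K. The commit message does not spell out which comparison was relaxed to 2*K, so this sketch takes the acceptance limit as a parameter; `Graph`, `BriggsOk`, and `limit` are hypothetical names.

```cpp
#include <cstddef>
#include <unordered_map>
#include <unordered_set>

using Node = int;
// Symmetric adjacency sets; every node must appear as a key.
using Graph = std::unordered_map<Node, std::unordered_set<Node>>;

// Counts the neighbors of the would-be merged node (u and v combined)
// whose degree after the merge is at least K, and accepts the merge
// when that count is below `limit`. Passing limit == K gives the
// classic conservative test; a larger limit accepts more merges.
bool BriggsOk(const Graph& g, Node u, Node v, int K, int limit) {
  std::unordered_set<Node> merged;
  for (Node n : g.at(u)) {
    if (n != v) merged.insert(n);
  }
  for (Node n : g.at(v)) {
    if (n != u) merged.insert(n);
  }
  int high = 0;
  for (Node n : merged) {
    const std::unordered_set<Node>& adj = g.at(n);
    std::size_t deg = adj.size();
    // A neighbor of both u and v loses one edge once they are merged.
    if (adj.count(u) != 0 && adj.count(v) != 0) --deg;
    if (deg >= static_cast<std::size_t>(K)) ++high;
  }
  return high < limit;
}
```

Note that only limit == K carries the usual guarantee that the merged node can still be simplified; anything looser trades that guarantee for more merges.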
黄熙哲
3691da34ee
feat(backend): rewrite main loop with held_nodes release and ReactivatePairs
2 weeks ago
黄熙哲
0881889ec1
feat(backend): add ReactivatePairs and stale_pairs for coalescing
2 weeks ago
黄熙哲
07048a123b
feat(backend): separate move-related low-degree nodes into held_nodes
2 weeks ago
黄熙哲
99fe17fc3f
feat(backend): propagate coalesced node colors in AssignColors
...
After active coalescing, merged_set nodes inherit their representative's
color, ensuring move-related vregs share the same physical register.
2 weeks ago
黄熙哲
081580ac0a
feat(backend): integrate active coalescing into ColorGraph main loop
...
Replaces inner simplify while-loop with if-else chain:
Simplify -> MergePhase -> GiveUpPhase -> Spill.
Lambdas moved outside while loop for clarity.
2 weeks ago
黄熙哲
0e4f9f1910
feat(backend): add MergePhase and GiveUpPhase for active coalescing
...
MergePhase uses the Briggs conservative test to safely merge move-related
node pairs before coloring. GiveUpPhase abandons moves for low-degree
nodes when merging is no longer beneficial.
2 weeks ago
黄熙哲
ca6c2a18c9
feat(backend): add coalesce data structures and helpers to ColorGraph
...
Introduces MovePair, move_adj, FindRep, GetRep, HasMovePair as
infrastructure for the upcoming Coalesce and Freeze phases.
Modifies simplify loop to skip already-merged nodes via GetRep.
2 weeks ago
黄熙哲
083616e50d
fix(backend): add redundant MovReg elimination on no-spill early-return path
...
The MovReg cleanup was only running after the final RewriteWithAllocation
at the end of the spill loop, missing the early-return path when
allocation succeeded without spilling. This left behind no-op moves
like 'mov x0, x0' that coalescing created.
2 weeks ago
黄熙哲
6f829c30f9
feat(backend): eliminate redundant MovReg after register allocation
...
Scans all blocks after RewriteWithAllocation and removes MovReg
instructions where source and destination are the same physical
register. This cleans up cases where move coalescing successfully
assigned the same register to both sides.
2 weeks ago
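The cleanup described in 6f829c30f9 amounts to an erase-remove pass over each block. A minimal sketch, assuming a simplified instruction record (`Inst`, `Op`, and `RemoveSelfMoves` are hypothetical and far smaller than the real MIR types):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical, heavily simplified instruction record; dst and src are
// physical register numbers after allocation.
enum class Op { MovReg, Other };
struct Inst {
  Op op;
  int dst;
  int src;
};

// Removes MovReg instructions whose source and destination are the
// same physical register. Returns how many instructions were removed.
std::size_t RemoveSelfMoves(std::vector<Inst>& block) {
  auto is_self_move = [](const Inst& i) {
    return i.op == Op::MovReg && i.dst == i.src;
  };
  auto new_end = std::remove_if(block.begin(), block.end(), is_self_move);
  std::size_t removed = static_cast<std::size_t>(block.end() - new_end);
  block.erase(new_end, block.end());
  return removed;
}
```

Because the pass only looks at post-allocation register numbers, it has to run on every exit path of the allocator, including any early return taken when no spilling was needed.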
黄熙哲
4bdca3f722
feat(backend): move coalescing via color preference and phi cycle breaking
...
Collects move_preferences from MovReg instructions and uses them
during color selection to prefer the same physical register for
move-related virtual registers. Detects and breaks cycles in move
preference chains to ensure correctness.
2 weeks ago
黄熙哲
535a3c0122
feat(backend): exclude MovReg use from interference during graph build
...
When building the interference graph, temporarily remove the use
operand of MovReg instructions from the live set before processing
defs. This prevents the source and destination of a move from
interfering, enabling them to be assigned the same physical register.
2 weeks ago
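The rule in 535a3c0122 can be illustrated for a single definition as follows. This is only a sketch under simplifying assumptions: `AddInterferenceForDef` and the edge-set representation are hypothetical, and instead of temporarily removing the move source from the live set it simply skips that vreg while adding edges, which has the same effect for this one instruction.

```cpp
#include <algorithm>
#include <set>
#include <utility>

// Undirected edge stored as (smaller vreg, larger vreg).
using Edge = std::pair<int, int>;

// Adds interference edges between `def` and every vreg live after the
// instruction. For a MovReg, the copy source is excluded so that the
// source and destination can later receive the same register.
void AddInterferenceForDef(bool is_mov_reg, int def, int mov_src,
                           const std::set<int>& live_after,
                           std::set<Edge>& edges) {
  for (int v : live_after) {
    if (v == def) continue;
    if (is_mov_reg && v == mov_src) continue;  // copy ends do not interfere
    edges.insert(Edge(std::min(def, v), std::max(def, v)));
  }
}
```

If the source and destination do conflict for some other reason, a different definition elsewhere in the function will still add the edge, so skipping it at the copy itself is safe.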
黄熙哲
4fad027da8
feat(backend): interval-length weighted spill cost model for graph coloring
...
Replace degree-only spill selection with weighted cost model:
cost = interval_length * 5 + ref_count * 15 - degree * 25
Lower cost spills first. Rematerializable constants get -100000 bonus.
2 weeks ago
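The cost formula from 4fad027da8 translates directly into code. In the sketch below, `VRegStats`, `SpillCost`, and `PickSpillCandidate` are hypothetical names; the weights and the -100000 bonus are the ones stated in the commit message, and the bonus is assumed to be subtracted from the cost so that rematerializable constants are spilled first.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-vreg statistics gathered during liveness analysis.
struct VRegStats {
  long interval_length;
  long ref_count;
  long degree;
  bool rematerializable;
};

// cost = interval_length * 5 + ref_count * 15 - degree * 25,
// with 100000 subtracted for rematerializable constants.
long SpillCost(const VRegStats& s) {
  long cost = s.interval_length * 5 + s.ref_count * 15 - s.degree * 25;
  if (s.rematerializable) cost -= 100000;
  return cost;
}

// Returns the index of the lowest-cost candidate (spilled first).
// The input must not be empty.
std::size_t PickSpillCandidate(const std::vector<VRegStats>& candidates) {
  std::size_t best = 0;
  for (std::size_t i = 1; i < candidates.size(); ++i) {
    if (SpillCost(candidates[i]) < SpillCost(candidates[best])) best = i;
  }
  return best;
}
```

High degree lowers the cost because spilling a heavily interfering vreg frees the most coloring pressure, while long intervals and many references raise it.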
黄熙哲
c84458daed
feat(backend): compute interval length and ref count during liveness analysis
2 weeks ago
黄熙哲
4812329aa4
refactor(backend): remove redundant live-out pairwise interference edges
2 weeks ago
黄熙哲
6b39d2d397
fix: add missing FP threshold in second ColorGraph call site
...
The loop-exit ColorGraph calls at lines 1102-1103 were missing the
caller_saved_threshold parameter, causing FP to use default 19 instead
of correct 16.
2 weeks ago
黄熙哲
26d89b2fbd
fix: parameterize caller-saved threshold for GP/FP in ColorGraph
...
Address code review feedback:
- Add caller_saved_threshold parameter to ColorGraph (GP: 19, FP: 16)
- Replace std::vector heap allocation with two-pass scan for zero overhead
- Fix semantic error: c<19 was incorrect for FP (s16-s18 are callee-saved)
2 weeks ago
黄熙哲
4d95f33dc2
refactor: make caller-saved color preference explicit in ColorGraph Select phase
...
Split color selection in ColorGraph's Select phase to explicitly
distinguish caller-saved (c<19) and callee-saved (c>=19) registers,
preferring caller-saved colors. Behavior is equivalent to the previous
implementation since GP_ALLOCATABLE already lists caller-saved registers
first, but the new logic is more explicit and provides an extension
point for future callee-save optimization.
2 weeks ago
zhm
e9adbe38c7
Fix undefined behavior: signed overflow, negative left shift, float-to-int overflow
3 weeks ago
安峻邑
55d92cda42
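fix: add the missing header directories
...
- Sync the complete include directory (frontend, ir, irgen, sem, mir, utils)
- Sync the required third_party dependency (antlr4)
- Sync .gitignore and the documentation
- Fix header-not-found errors at compile time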
3 weeks ago
安峻邑
dba0d6adc0
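fix: change header include paths to use the include/ prefix to suit the evaluation system
...
- Change header include paths in all source files from 'subdir/file.h' to 'include/subdir/file.h'
- Accommodate the evaluation system copying the contents of the include directory into /extlibs
- Lets #include "include/ir/IR.h" resolve to /extlibs/include/ir/IR.h under -I/extlibs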
3 weeks ago
安峻邑
293c28fed4
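fix: correct header include paths to suit the evaluation system
...
- Change all header include paths from 'include/subdir/file.h' to 'subdir/file.h'
- Fix relative-path references, e.g. '../../include/ir/IR.h' becomes 'ir/IR.h'
- Accommodate the evaluation system's include path settings of -I. or -Iinclude
- Resolves the 'file not found' compile errors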
3 weeks ago
安峻邑
ac4be4ec7a
fix: fix build issues so the evaluation program can compile the project directly
...
1. Change all include paths to be relative to the project root
- frontend/ -> src/frontend/
- ir/ -> include/ir/
- irgen/ -> include/irgen/
- sem/ -> include/sem/
- mir/ -> include/mir/
- utils/ -> include/utils/
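2. Generate the ANTLR code into the src/frontend/ directory
- Place generated files such as SysYLexer/Parser in the source tree
- Remove third_party/antlr4-runtime to avoid duplicate definitions
3. Add build.sh and a Makefile to support direct compilation
4. Fix the PassManager call in main.cpp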
3 weeks ago
安峻邑
624f9e307f
Implemented basic scalar optimizations and some register optimizations
3 weeks ago