Code Comment (Bilingual)
Objectives
- Add bilingual (Chinese & English) comments to code
- Follow consistent comment formatting rules
- Explain complex logic with reasons
- Maintain clear code documentation
Comment Rules Overview
| Location | Language | Format |
|---|---|---|
| File-level docstring | English only | Standard docstring |
| Function docstring | Chinese + English | Two-line format: Chinese first line, English second line |
| Inline comments | Chinese + English | Chinese line, then English line, above code |
| Code spacing | - | Blank line between code blocks |
1. File-level Docstring (English Only)
```python
"""
Lab 2: Q-Learning Agent for Cliff Walking
Student ID: 041107730
Implements Q-Learning using Bellman equation: Q(s,a) = r + γ * max Q(s',a')
Modified from Hybrid Activity 1 to solve the Cliff Walking problem.
"""
```
2. Function Docstring (Two-Line Bilingual Format)
Use exactly two lines: Chinese on the first line, English on the second:
```python
def train(env, episodes: int = 50, gamma: float = 0.9) -> list:
    """训练Q-Learning智能体
    Train Q-Learning agent"""


def reset() -> tuple:
    """重置环境到初始状态
    Reset environment to initial state"""
```
Rules:
- Use triple quotes (""")
- Chinese description on first line
- English description on second line
- Keep it concise, no blank line between Chinese and English
- No parameter or return value details in docstring
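
The docstring rules from sections 1 and 2 can be checked mechanically. The following is a minimal sketch using the standard `ast` module. The helper names (`has_chinese`, `check_docstrings`) are hypothetical, and it assumes that "Chinese" can be detected as any character in the CJK Unified Ideographs range, which is a simplification.

```python
import ast
import re

# Assumption: "Chinese" means any CJK Unified Ideograph (U+4E00 to U+9FFF).
CJK_PATTERN = re.compile(r"[\u4e00-\u9fff]")


def has_chinese(text: str) -> bool:
    """Return True if the text contains at least one CJK ideograph."""
    return CJK_PATTERN.search(text) is not None


def check_docstrings(source: str) -> list[str]:
    """Return a list of human-readable docstring rule violations."""
    tree = ast.parse(source)
    problems = []

    # File-level docstring: must exist and contain no Chinese.
    module_doc = ast.get_docstring(tree)
    if module_doc is None or has_chinese(module_doc):
        problems.append("module: docstring must be English only")

    # Function docstrings: exactly two lines, Chinese first, English second.
    for node in ast.walk(tree):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        lines = (ast.get_docstring(node) or "").splitlines()
        two_line = len(lines) == 2 and has_chinese(lines[0]) and not has_chinese(lines[1])
        if not two_line:
            problems.append(f"{node.name}: docstring must be Chinese line + English line")

    return problems
```

Passing the text of a source file to `check_docstrings` returns an empty list when both rules hold. The check does not cover the "no parameter or return value details" rule, which would need a judgment about wording.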
3. Inline Comments (Line-by-Line Bilingual)
Chinese comment immediately followed by English comment, placed ABOVE code:
```python
# 初始化Q表,使用随机值
# Initialize Q-table with random values
qtable = [[random.random() for _ in range(env.actions())] for _ in range(env.states())]

# 增加步数计数
# Increment step count
steps += 1
```
Rules:
- Comment goes ABOVE the code, NOT beside it
- Chinese line first, English line immediately after (no blank line between)
- Blank line after each commented code block, before the next comment pair
- For complex logic, add explanation:
```python
# 使用贝尔曼方程更新Q表:Q(s,a) = r + γ * max Q(s',a')
# Update Q-table using Bellman equation: Q(s,a) = r + γ * max Q(s',a')
qtable[state][action] = reward + gamma * max(qtable[next_state])

# 衰减探索率,随着学习进行减少随机探索
# Decay exploration rate, reduce random exploration as learning progresses
epsilon -= decay * epsilon
```
4. Code Spacing
IMPORTANT: Always add blank lines between code blocks:
```python
def main():
    # 打印程序标题
    # Print program header
    print("=" * 50)

    # 创建悬崖行走环境
    # Create Cliff Walking environment
    env = GridEnv(size=12)

    # 设置超参数
    # Set hyperparameters
    EPISODES = 50
    GAMMA = 0.9
```
Rules:
- Blank line after each code block
- No blank line between Chinese and English comments
- Comments always above code, never beside it
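
The "never beside it" rule can be verified with the standard `tokenize` module, which reports each comment together with its column and physical line. This is a minimal sketch; the function name `find_trailing_comments` is hypothetical.

```python
import io
import tokenize


def find_trailing_comments(source: str) -> list[int]:
    """Return line numbers where a comment shares its line with code."""
    flagged = []
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok in tokens:
        if tok.type != tokenize.COMMENT:
            continue
        # tok.line is the physical line; the text before the comment's start
        # column is non-blank only when code precedes the comment.
        if tok.line[: tok.start[1]].strip():
            flagged.append(tok.start[0])
    return flagged
```

An empty result means every comment sits on its own line. Using the tokenizer rather than a plain text search avoids false positives from `#` characters inside string literals.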
5. Complex Logic Comments
For complex logic with multiple lines, keep Chinese and English paired line-by-line:
```python
# 使用贝尔曼方程更新Q表:Q(s,a) = r + γ * max Q(s',a')
# Update Q-table using Bellman equation: Q(s,a) = r + γ * max Q(s',a')
# 这里alpha=1,即完全替换旧值(不使用加权平均)
# Here alpha=1, meaning completely replace old value (no weighted average)
# 完整公式应为:Q(s,a) = Q(s,a) + α * [r + γ * max Q(s',a') - Q(s,a)]
# Full formula should be: Q(s,a) = Q(s,a) + α * [r + γ * max Q(s',a') - Q(s,a)]
qtable[state][action] = reward + gamma * max(qtable[next_state])

# 检查是否掉下悬崖(底行,第1-10列)
# Check if agent fell off cliff (bottom row, columns 1-10)
# 原因:悬崖行走问题的核心机制,大负奖励惩罚掉入悬崖
# Reason: Core mechanism of Cliff Walking problem, large negative reward penalizes falling
if self.y == 3 and 1 <= self.x <= 10:
    reward = -100
```
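
The comment in the example above claims that the simplified update is the full formula with alpha=1. A short numeric check makes that explicit. This is only an illustrative sketch: the function `q_update` and the sample values are assumptions, not part of the lab code.

```python
def q_update(q_old: float, reward: float, gamma: float, max_next: float, alpha: float) -> float:
    """Full Q-Learning update: Q + alpha * (r + gamma * max_next - Q)."""
    td_target = reward + gamma * max_next
    return q_old + alpha * (td_target - q_old)


# Illustrative values only.
q_old, reward, gamma, max_next = 0.25, -1.0, 0.9, 0.5

simplified = reward + gamma * max_next
full_alpha_one = q_update(q_old, reward, gamma, max_next, alpha=1.0)

# With alpha = 1 the old value cancels, so both forms agree.
assert abs(full_alpha_one - simplified) < 1e-12

# With alpha < 1 the result lies between the old value and the target.
blended = q_update(q_old, reward, gamma, max_next, alpha=0.5)
assert min(q_old, simplified) < blended < max(q_old, simplified)
```

Stating the reduction in the comment, as the example does, tells the reader that dropping the old value is deliberate rather than an oversight.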
6. Import Comments
Add bilingual comments above imports:
```python
# 导入抽象基类模块,用于定义环境接口
# Import abstract base class module for defining environment interface
import abc

# 导入操作系统、时间和随机模块
# Import os, time and random modules
import os
import time
import random
```
7. Entry Point Comment
```python
# 程序入口点,运行主函数
# Program entry point, run main function
if __name__ == "__main__":
    main()
```
Comment Checklist
Before finishing:
- File-level docstring is English only
- All function docstrings use two-line format: Chinese first line, English second line
- All inline comments have Chinese line immediately followed by English line
- Comments are placed ABOVE code, not beside it
- No blank line between Chinese and English lines (in both docstrings and comments)
- Blank line between each code block
- Complex logic has explanation and reason
- Every code block has comments
- Import statements have bilingual comments
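
The pairing items in this checklist (Chinese line immediately followed by English line, no blank line between) lend themselves to an automated pass. The sketch below is line-based and makes two assumptions: Chinese is detected via the CJK Unified Ideographs range, and any line whose first non-blank character is `#` is a comment, so a `#` at the start of a line inside a multi-line string or a shebang line would be misread. The name `find_unpaired_comments` is hypothetical.

```python
import re

# Assumption: "Chinese" means any CJK Unified Ideograph (U+4E00 to U+9FFF).
CJK_PATTERN = re.compile(r"[\u4e00-\u9fff]")


def find_unpaired_comments(source: str) -> list[int]:
    """Return 1-based line numbers of comment lines that break Chinese/English pairing."""
    flagged = []
    lines = source.splitlines()
    i = 0
    while i < len(lines):
        if not lines[i].lstrip().startswith("#"):
            i += 1
            continue
        is_chinese = CJK_PATTERN.search(lines[i]) is not None
        has_next = i + 1 < len(lines) and lines[i + 1].lstrip().startswith("#")
        next_is_english = has_next and CJK_PATTERN.search(lines[i + 1]) is None
        if is_chinese and next_is_english:
            # Well-formed pair: skip both lines.
            i += 2
        else:
            flagged.append(i + 1)
            i += 1
    return flagged
```

An empty list means every comment line belongs to a Chinese/English pair. Items such as "Complex logic has explanation and reason" still need a human review.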
Quick Reference
```python
# File docstring (English only)
"""
Lab 2: Q-Learning Agent
Implements Q-Learning algorithm
"""

# Function docstring (two-line bilingual)
def train(env):
    """训练Q-Learning智能体
    Train Q-Learning agent"""

# Inline comment (line-by-line bilingual, above code)
# 初始化Q表,使用随机值
# Initialize Q-table with random values
qtable = [[random.random() for _ in range(env.actions())] for _ in range(env.states())]

# 增加步数计数
# Increment step count
steps += 1
```
Key Rules Summary
- Function docstrings: Two lines, Chinese first line, English second line (NO blank line between)
- Inline comments: Chinese line, English line, then code (NO blank line between Chinese/English)
- Comment placement: Always ABOVE code, never beside it
- Code spacing: Blank line after each code block
- No blank line: Between Chinese and English lines (both in docstrings and comments)
Complete Example
```python
"""
Lab 2: Q-Learning Agent for Cliff Walking
Student ID: 041107730
Implements Q-Learning using Bellman equation
"""

# 导入抽象基类模块,用于定义环境接口
# Import abstract base class module for defining environment interface
import abc

# 导入操作系统、时间和随机模块
# Import os, time and random modules
import os
import time
import random


class Env(abc.ABC):
    """环境抽象基类
    Environment abstract base class"""

    @abc.abstractmethod
    def actions(self) -> int:
        """返回动作空间的大小
        Return the size of action space"""
        raise NotImplementedError()


def train(env, episodes: int = 50, gamma: float = 0.9) -> list:
    """训练Q-Learning智能体
    Train Q-Learning agent"""
    # 初始化Q表,使用随机值
    # Initialize Q-table with random values
    qtable = [[random.random() for _ in range(env.actions())] for _ in range(env.states())]

    # 训练主循环,遍历所有回合
    # Main training loop, iterate through all episodes
    for episode in range(episodes):
        # 重置环境,获取初始状态
        # Reset environment and get initial state
        state = env.reset()

        # 使用贝尔曼方程更新Q表
        # Update Q-table using Bellman equation
        qtable[state][action] = reward + gamma * max(qtable[next_state])

    # 返回训练好的Q表
    # Return the trained Q-table
    return qtable


# 程序入口点,运行主函数
# Program entry point, run main function
if __name__ == "__main__":
    main()
```
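
The example above is abridged: it omits the concrete environment, action selection, and the step call, so it does not run as written. A self-contained variant that follows the same comment rules might look like the sketch below. Everything environment-related is an assumption made for illustration: `LineEnv` is a hypothetical 1-D corridor rather than the Cliff Walking grid, `step()` is assumed to return `(next_state, reward, done)`, and the `epsilon` and `max_steps` parameters are added so that the loop explores and always terminates.

```python
"""
Minimal Q-Learning sketch on a hypothetical 1-D corridor environment.
Illustrates the comment rules of this guide; LineEnv is an assumption, not lab code.
"""

# 导入随机模块,用于探索和初始化Q表
# Import random module for exploration and Q-table initialization
import random


class LineEnv:
    """一维走廊环境(假设的示例环境)
    1-D corridor environment (hypothetical example environment)"""

    def __init__(self, size: int = 5):
        """初始化走廊长度和当前位置
        Initialize corridor length and current position"""
        self.size = size
        self.pos = 0

    def states(self) -> int:
        """返回状态空间的大小
        Return the size of state space"""
        return self.size

    def actions(self) -> int:
        """返回动作空间的大小
        Return the size of action space"""
        return 2

    def reset(self) -> int:
        """重置环境到初始状态
        Reset environment to initial state"""
        self.pos = 0
        return self.pos

    def step(self, action: int) -> tuple:
        """执行动作并返回转移结果
        Execute action and return the transition result"""
        # 动作1向右移动,其他动作向左移动,位置限制在走廊范围内
        # Action 1 moves right, any other action moves left, position is clamped to the corridor
        delta = 1 if action == 1 else -1
        self.pos = min(max(self.pos + delta, 0), self.size - 1)

        # 到达最右端时回合结束且奖励为0,否则每步奖励为-1
        # Episode ends with reward 0 at the right end, otherwise each step gives reward -1
        done = self.pos == self.size - 1
        reward = 0 if done else -1

        return self.pos, reward, done


def train(env, episodes: int = 50, gamma: float = 0.9,
          epsilon: float = 0.2, max_steps: int = 1000) -> list:
    """训练Q-Learning智能体
    Train Q-Learning agent"""
    # 初始化Q表,使用随机值
    # Initialize Q-table with random values
    qtable = [[random.random() for _ in range(env.actions())] for _ in range(env.states())]

    # 训练主循环,遍历所有回合
    # Main training loop, iterate through all episodes
    for _ in range(episodes):
        # 重置环境,获取初始状态
        # Reset environment and get initial state
        state = env.reset()

        # 每回合最多执行max_steps步,防止无限循环
        # Run at most max_steps steps per episode to prevent an endless loop
        for _ in range(max_steps):
            # 以epsilon概率随机探索,否则选择Q值最大的动作
            # Explore randomly with probability epsilon, otherwise pick the action with the highest Q-value
            if random.random() < epsilon:
                action = random.randrange(env.actions())
            else:
                action = max(range(env.actions()), key=lambda a: qtable[state][a])

            # 执行动作,获取下一状态、奖励和结束标志
            # Execute action, get next state, reward and done flag
            next_state, reward, done = env.step(action)

            # 使用贝尔曼方程更新Q表(alpha=1),终止状态没有后续价值
            # Update Q-table using Bellman equation (alpha=1), terminal states have no future value
            future = 0.0 if done else max(qtable[next_state])
            qtable[state][action] = reward + gamma * future
            state = next_state

            # 回合结束时跳出循环
            # Leave the loop when the episode ends
            if done:
                break

    # 返回训练好的Q表
    # Return the trained Q-table
    return qtable
```

Calling `train(LineEnv(size=5), episodes=200)` returns a Q-table with one row per state and one column per action. Treating the terminal state as having zero future value is a choice made in this sketch; the abridged example above does not say how the original code handles it.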