AgentSkillsCN

data-protection

适用于落实数据保护控制措施,以满足合规与隐私需求。涵盖静止与传输中的加密、PII 处理、数据分类、GDPR、CCPA、HIPAA 的合规要求,以及数据保留策略。 适用场景:数据加密、PII 处理、GDPR、CCPA、HIPAA、数据分类、数据保留、数据脱敏、令牌化、隐私设计、数据最小化。 不适用场景:加密算法的选择(应使用密码学相关技能)、访问控制设计(应使用身份验证相关技能)、审计日志记录(应使用日志监控相关技能)。

SKILL.md
--- frontmatter
name: data-protection
description: |
    Use when implementing data protection controls for compliance and privacy. Covers encryption at rest and in transit, PII handling, data classification, GDPR, CCPA, HIPAA requirements, and data retention policies.
    USE FOR: data encryption, PII handling, GDPR, CCPA, HIPAA, data classification, data retention, data masking, tokenization, privacy by design, data minimization
    DO NOT USE FOR: cryptographic algorithm selection (use cryptography), access control design (use authentication), audit logging (use logging-monitoring)
license: MIT
metadata:
  displayName: "Data Protection"
  author: "Tyler-R-Kendrick"
compatibility: claude, copilot, cursor

Data Protection

Overview

Data protection spans both compliance requirements and technical controls. Organizations must protect data not only to meet regulatory obligations but also to maintain customer trust and limit business risk. Privacy by design — the principle of embedding privacy considerations into every stage of system development — is mandatory under GDPR and is increasingly becoming the expectation under other frameworks. Effective data protection requires understanding what data you have, where it lives, how it flows, who can access it, and how long it must be retained.

Data Classification

Data classification assigns sensitivity levels to information assets, enabling organizations to apply proportionate security controls. Every piece of data should be classified, and the classification should drive technical and procedural safeguards.

LevelDescriptionExamplesControls
PublicInformation intended for public disclosure; no adverse impact if exposedMarketing materials, public documentation, open-source code, press releasesNo special controls required; integrity protections recommended
InternalInformation for internal use only; minor impact if disclosedInternal policies, meeting notes, project plans, org chartsAccess restricted to employees; basic access controls; no public sharing
ConfidentialSensitive business information; significant impact if disclosedCustomer lists, financial reports, contracts, source code, employee recordsEncryption at rest and in transit; role-based access control; audit logging; NDA required for third-party access
RestrictedHighly sensitive data; severe regulatory, legal, or business impact if disclosedPII, PHI, payment card data (PCI), trade secrets, cryptographic keysStrong encryption (AES-256); strict need-to-know access; MFA required; full audit trail; data loss prevention (DLP); breach notification obligations

Encryption Requirements

At Rest

Data at rest — stored in databases, file systems, object storage, or backups — must be encrypted to protect against unauthorized access in the event of physical theft, unauthorized backup access, or misconfigured storage permissions.

RequirementDetails
AlgorithmAES-256 (AES-256-GCM preferred for authenticated encryption)
Transparent encryptionUse platform-native encryption (AWS KMS + S3 SSE, Azure Storage Service Encryption, GCP CMEK) for minimal application changes
Key storageKeys must be stored separately from encrypted data; use dedicated key management services (AWS KMS, Azure Key Vault, HashiCorp Vault, GCP Cloud KMS)
Key rotationRotate encryption keys at least annually; automate rotation where possible; retain old keys for decrypting historical data
Database encryptionEnable Transparent Data Encryption (TDE) for relational databases; use field-level encryption for highly sensitive columns
Backup encryptionAll backups must be encrypted with the same rigor as primary data; test restoration from encrypted backups regularly

In Transit

Data in transit — moving between clients and servers, between services, or between data centers — must be protected from eavesdropping and tampering.

RequirementDetails
ProtocolTLS 1.2 as minimum; TLS 1.3 preferred; disable TLS 1.0/1.1 and SSL entirely
HTTPS everywhereAll API endpoints, web interfaces, and webhooks must use HTTPS; redirect HTTP to HTTPS; set HSTS headers
Certificate validationValidate server certificates in all clients; do not disable certificate verification in production; pin certificates for mobile apps and critical service-to-service calls where appropriate
Internal trafficEncrypt service-to-service communication within the network using mTLS or a service mesh (Istio, Linkerd); do not assume internal networks are trusted
Certificate managementUse automated certificate management (Let's Encrypt, AWS ACM) to prevent expiration-related outages; monitor certificate expiry

Privacy Regulations

The following table compares the major privacy regulations that most organizations must consider. This is a summary — consult legal counsel for jurisdiction-specific compliance requirements.

RegulationJurisdictionKey RequirementsPenalties
GDPREuropean Union (and EEA)Lawful basis for processing (consent, contract, legitimate interest); data subject rights (access, rectification, erasure, portability); Data Protection Officer (DPO) required for large-scale processing; breach notification to supervisory authority within 72 hours; Data Protection Impact Assessments (DPIA) for high-risk processing; privacy by design and by defaultUp to 4% of global annual revenue or 20 million EUR, whichever is higher
CCPA/CPRACalifornia, USARight to know what data is collected; right to delete personal information; right to opt-out of sale/sharing of personal information; right to non-discrimination for exercising rights; data minimization and purpose limitation (CPRA); risk assessments for high-risk processing (CPRA)Up to $7,500 per intentional violation; $2,500 per unintentional violation; private right of action for data breaches ($100-$750 per consumer per incident)
HIPAAUnited States (healthcare)Administrative, physical, and technical safeguards for Protected Health Information (PHI); Business Associate Agreements (BAA) required for third-party processors; access logs and audit controls; minimum necessary standard for PHI access; breach notification to individuals and HHSUp to $1.5 million per year per violation category; criminal penalties for willful neglect (up to $250,000 and imprisonment)

PII Handling

Personally Identifiable Information (PII) requires specific technical and procedural controls throughout its lifecycle.

Data Minimization

Collect only the personal data that is strictly necessary for the stated purpose. Review data collection forms, API request schemas, and database schemas to identify and eliminate unnecessary fields. Ask: "Do we need this data to provide the service?" If not, do not collect it.

Purpose Limitation

Use personal data only for the purpose for which it was collected. If a new use case arises, assess whether it is compatible with the original purpose or whether new consent is required. Document the purpose for each data element in a data inventory.

Tokenization and Pseudonymization

  • Tokenization: Replace sensitive data with non-sensitive tokens that map back to the original data via a secure token vault. Tokens have no mathematical relationship to the original data.
  • Pseudonymization: Replace identifying fields with pseudonyms; the data can be re-identified with a separate key. GDPR recognizes pseudonymization as a risk-reduction measure but still treats pseudonymized data as personal data.

Right to Erasure (Right to be Forgotten)

Implement a process to delete or anonymize all personal data associated with a data subject upon verified request. This must cover primary databases, backups, caches, logs, analytics systems, and third-party processors. Key considerations:

  • Define what "erasure" means for each data store (hard delete vs. cryptographic erasure vs. anonymization).
  • Propagate deletion requests to all downstream systems and third-party processors.
  • Maintain a record of the deletion request and its execution (without retaining the deleted data itself).
  • Handle exceptions for legal holds and regulatory retention requirements.

Data Subject Access Requests (DSAR)

Build automated or semi-automated workflows to respond to DSARs within regulatory timelines (30 days under GDPR). The system must be able to locate, extract, and export all personal data associated with a given identity across all data stores.

Data Masking and Tokenization

TechniqueWhen to UseDetails
TokenizationWhen the original data must be retrievable by authorized systems (e.g., payment processing)Non-reversible without access to the token vault; tokens can preserve format (e.g., same length as a credit card number); ideal for PCI DSS scope reduction
Format-Preserving Encryption (FPE)When encrypted data must retain the same format as the original (e.g., SSN, phone number)Encrypts data while preserving length and character set; useful for legacy systems that validate field formats; uses FF1 or FF3-1 algorithms
Dynamic MaskingWhen data must be masked in real-time based on the requesting user's role or contextApplied at query time; the underlying data is not modified; useful for analytics and support access where full data is not needed
Static MaskingWhen creating non-production copies of data (e.g., for testing or development environments)Applied once to create a masked copy; original data structure is preserved; irreversible; ensures test environments contain no real PII

Key decisions:

  • Use tokenization when you need to reverse the process for legitimate business operations.
  • Use static masking for non-production environments — never use real PII in development or testing.
  • Use dynamic masking for production access control scenarios where different users need different levels of data visibility.

Data Retention

Data retention policies define how long data is kept and what happens when the retention period expires. Retaining data longer than necessary increases risk, storage costs, and regulatory exposure.

Define Retention Policies

  • Establish retention periods based on regulatory requirements, business needs, and contractual obligations.
  • Document retention policies in a data retention schedule that covers every data category.
  • Ensure retention periods are specific (e.g., "3 years after account closure") rather than vague (e.g., "as long as needed").

Automated Deletion

  • Implement automated processes to delete or anonymize data when the retention period expires.
  • Use database-level TTL (Time-to-Live) features, scheduled jobs, or lifecycle policies (e.g., S3 lifecycle rules) to enforce retention.
  • Verify automated deletion is working through regular audits.

Legal Hold

  • Implement a legal hold process that can suspend automated deletion for data relevant to litigation, regulatory investigations, or audits.
  • Legal holds must be granular — hold only the specific data relevant to the matter, not all data for a given user or category.
  • Track all active legal holds and release them promptly when they are no longer needed.

Backup Considerations

  • Retention policies must account for backup cycles. Data deleted from primary storage may still exist in backups.
  • Define how long backups are retained and how deletion requests are handled for data in backups (e.g., delete upon next backup rotation vs. cryptographic erasure of the backup encryption key).
  • Regularly test the ability to restore from backups and verify that retained data is still usable.

Best Practices

  • Maintain a comprehensive data inventory — you cannot protect data you do not know about; catalog all personal and sensitive data, where it is stored, how it flows, and who has access.
  • Apply privacy by design from the start — embed data protection into architecture decisions, not as a retrofit; conduct Data Protection Impact Assessments (DPIAs) for new systems that process personal data.
  • Encrypt by default — enable encryption at rest and in transit for all data stores and communication channels; treat encryption as baseline, not an add-on for sensitive data only.
  • Implement least-privilege access to sensitive data — restrict access to personal and confidential data based on role and business need; review access permissions regularly and revoke unnecessary grants.
  • Automate compliance processes — manual compliance processes do not scale; automate DSAR fulfillment, retention enforcement, consent management, and breach detection to reduce human error and response times.
  • Test your data deletion capabilities — regularly verify that erasure requests result in actual deletion across all systems, including backups, caches, logs, and third-party processors.
  • Monitor for data exfiltration — deploy Data Loss Prevention (DLP) tools to detect unauthorized data transfers via email, cloud storage, USB, and API calls; alert on anomalous data access patterns.
  • Train all employees on data handling — security and privacy training should cover data classification, acceptable use, incident reporting, and regulatory obligations; tailor training to roles (developers, support, executives).