Reference

Customizing ground

All ground behaviour is driven by ground.yaml. Copy ground.example.yaml, fill in the required fields, and use ground deploy --dry-run to preview before applying.

ground.yaml: complete reference

Every key ground accepts is shown below. Only the org block is required. All other sections have sensible defaults or can be omitted entirely.

# ── REQUIRED ────────────────────────────────────────────────────────────────
org:
  name: "My Research University"          # human-readable name; appears in SSP titles
  region: us-east-1                       # AWS region for all ground stacks
  management_account_id: "123456789012"   # 12-digit management account ID
  audit_email: audit@example.edu          # email for the Log Archive account
  logging_email: logging@example.edu      # optional: separate Logging account
                                          # (defaults to audit_email if omitted)
  workload_ous:                           # extra OUs inside the workload tiers
    - research                            # → /Research/research/
    - sandbox                             # → /Research/sandbox/
    - hpc-cluster                         # → /SensitiveResearch/hpc-cluster/

# ── NETWORKING ──────────────────────────────────────────────────────────────
network:
  transit_gateway: true                   # deploy Transit Gateway in Network Hub account
                                          # false = no TGW, VPCs connect directly
  cidr_block: "10.0.0.0/8"               # supernet for all VPCs (e.g., 10.0.0.0/8)
  vpc_endpoints:                          # VPC endpoints created in each spoke VPC
    - s3                                  # S3 Gateway endpoint (free)
    - ec2                                 # EC2 API: for compute-heavy SREs
    - sts                                 # STS: required for assume-role calls
    - ssm                                 # Systems Manager: required for Session Manager
    - secretsmanager                      # Secrets Manager: avoid hardcoded creds
    - kms                                 # KMS: for CMK encrypt/decrypt calls
    - logs                                # CloudWatch Logs: for VPC flow log delivery
    - execute-api                         # optional: API Gateway private endpoints
    - sagemaker.api                       # optional: SageMaker in air-gapped VPCs
    - sagemaker.runtime                   # optional: SageMaker inference

# ── IDENTITY ────────────────────────────────────────────────────────────────
identity:
  identity_center: true                   # create IAM Identity Center permission sets
                                          # false = skip (use existing IdP/Federation)
  instance_arn: ""                        # set if Identity Center already exists
                                          # (ground will add permission sets, not re-create)

# ── LOGGING ─────────────────────────────────────────────────────────────────
logging:
  retention_days: 365                     # S3 lifecycle expiry for audit logs
                                          # FedRAMP Moderate minimum: 365; CMMC L2: 90
                                          # attest will warn if below your framework's minimum
  bucket_name: ""                         # audit bucket name; auto-generated if empty
                                          # format: ground-audit-<management-account-id>

# ── SECURITY ────────────────────────────────────────────────────────────────
# ground deploys a logging-protection SCP only. Detection services
# (GuardDuty, Security Hub, Macie) are activated by 'attest apply' based on
# your active compliance frameworks. ground cannot know the correct standard
# without knowing your frameworks; that is attest's job.
#
# Declare non-AWS services here so attest can assess which controls they satisfy.
# ground.yaml is version-controlled, so these declarations are audit evidence.
security:
  external_services:

    # Category options:
    #   edr               - endpoint detection and response
    #   siem              - security information and event management
    #   cspm              - cloud security posture management
    #   cwpp              - cloud workload protection
    #   data-transfer     - research data movement (e.g., Globus)
    #   vuln-scanning     - vulnerability scanning
    #   identity          - identity provider / MFA / PAM
    #   research-platform - research-specific services (HPC schedulers, portals)
    #
    # Feature options (compliance capabilities the service brings):
    #   fedramp-high      - FedRAMP High authorized
    #   fedramp-moderate  - FedRAMP Moderate authorized
    #   baa               - Business Associate Agreement available (HIPAA)
    #   hipaa-compliant   - HIPAA-eligible service
    #   high-assurance    - high-assurance mode available (Globus terminology)
    #   soc2-type2        - SOC 2 Type II report available
    #   iso27001          - ISO 27001 certified
    #   itar-compliant    - ITAR-eligible service

    # Globus: research data transfer platform
    # Required for NIH dbGaP, DOE Superfacility, many HPC data movement workflows.
    # High Assurance mode enforces MFA and encrypted transfer for CUI/PHI datasets.
    - name: globus
      vendor: "Globus / University of Chicago"
      category: data-transfer
      features:
        - baa              # BAA available for HIPAA-covered workflows
        - high-assurance   # HA mode enforces MFA + encryption per endpoint
        - fedramp-moderate
      scope:               # OU names this service applies to; empty = org-wide
        - SensitiveResearch
        - NIHGenomic
        - HIPAAResearch
      notes: "High Assurance enabled; BAA signed 2025-01-15. Required for dbGaP transfers."
      probe: ground-probe-globus   # optional: verify declarations via Globus API
      probe_config:                # passed to the probe binary as JSON on stdin
        client_id: "your-globus-app-client-id"
        endpoint_ids:
          - "abc123-..."

    # CrowdStrike Falcon: EDR/XDR (FedRAMP High authorized)
    - name: crowdstrike-falcon
      vendor: CrowdStrike
      category: edr
      features:
        - fedramp-high
        - soc2-type2
      scope: []            # empty = org-wide

    # Splunk: SIEM (comment out if using CloudWatch / OpenSearch instead)
    # - name: splunk-cloud
    #   vendor: Splunk
    #   category: siem
    #   features: [fedramp-moderate]
    #   scope: []

    # Palo Alto Prisma Cloud: CSPM + CWPP
    # - name: prisma-cloud
    #   vendor: "Palo Alto Networks"
    #   category: cspm
    #   features: [soc2-type2]
    #   scope: []

    # Tenable.io: vulnerability scanning
    # - name: tenable-io
    #   vendor: Tenable
    #   category: vuln-scanning
    #   features: [fedramp-moderate]
    #   scope: []

# ── TAGGING ─────────────────────────────────────────────────────────────────
tagging:
  required_tags:                          # One Deny SCP per tag (OR logic, not AND)
    - project                             # what research project this resource supports
    - environment                         # dev / staging / prod
    - owner                               # team or PI email
    - data-classification                 # public / internal / restricted / cui / phi
    # - cost-center                         # add institutional finance tags as needed
    # - grant-number                        # for grant reporting and cost allocation

Declaring external services

Research institutions deploy services alongside AWS that carry compliance weight: Globus for data transfer, CrowdStrike for endpoint protection, Splunk for SIEM. Ground records these declarations so attest can assess which controls they satisfy. Ground does not deploy, configure, or verify these services.

Why declare them in ground.yaml?

Declarations are version-controlled and auditable. "We declared Globus High Assurance with a signed BAA on 2025-01-15" is audit evidence. attest reads these from ground-meta.json and maps them to controls. For example, a CrowdStrike FedRAMP High declaration contributes evidence toward NIST 800-171 §3.14.x endpoint protection controls.

What about Globus specifically?

Globus High Assurance mode enforces MFA and encrypted transfer per endpoint. It is often required for NIH dbGaP transfers and DOE Superfacility data movement. Declaring it with the baa and high-assurance features lets attest mark HIPAA data-transfer controls and NIH GDS transfer requirements as addressed.

Detection services are different. GuardDuty, Security Hub, and Macie are AWS-native and are activated by attest apply. ground does not configure them because the correct Security Hub standard depends on which compliance frameworks are active, and only attest knows that after attest compile.

Service probes (verification)

Declarations in ground.yaml are self-reported. Probes verify them by querying the service's own API. A probe is a small binary that ground invokes during ground export-metadata. It receives the service config as JSON on stdin and writes a discovery result to stdout.

# ground invokes the probe with the probe_config block as JSON on stdin:
echo '{"client_id":"...","endpoint_ids":["abc123"]}' | ground-probe-globus

# The probe writes a discovery result to stdout:
{
  "service": "globus",
  "probed_at": "2026-04-30T14:22:00Z",
  "features_verified": ["high-assurance", "baa"],
  "features_unverified": ["fedramp-moderate"],
  "endpoints": [
    {
      "id": "abc123-...",
      "display_name": "University HPC Cluster",
      "high_assurance": true,
      "subscription_id": "plus-abc..."
    }
  ]
}

# Exit codes: 0 = verified, 1 = partial (some checks failed), 2 = error
# ground merges probe results into ground-meta.json alongside declarations.

Probe binaries are not part of ground. The probe interface is a specification, and the ground-probe-* binaries are separate projects. This keeps ground small and lets anyone write a probe for any service. A reference ground-probe-globus implementation is planned that queries the Globus Transfer API to verify endpoint HA status and subscription tier.
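
Because the interface is just JSON on stdin, JSON on stdout, and an exit code, a probe can be written in any language. The following is a minimal skeleton, not the planned reference implementation: check_endpoint is a hypothetical stub that stands in for a real API query, and the choice to verify only the high-assurance feature is an assumption made to keep the example short. The output field names follow the sample discovery result shown above.

```python
import json
import sys
from datetime import datetime, timezone


def check_endpoint(endpoint_id: str) -> dict:
    """Hypothetical stub: a real probe would query the service's API here."""
    return {"id": endpoint_id, "high_assurance": False}


def run_probe(config: dict, check=check_endpoint) -> tuple[dict, int]:
    """Build a discovery result and exit code from the probe_config block."""
    endpoints = [check(eid) for eid in config.get("endpoint_ids", [])]
    all_ha = bool(endpoints) and all(e.get("high_assurance") for e in endpoints)
    result = {
        "service": "globus",
        "probed_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "features_verified": ["high-assurance"] if all_ha else [],
        "features_unverified": [] if all_ha else ["high-assurance"],
        "endpoints": endpoints,
    }
    # Exit codes per the probe interface: 0 = verified, 1 = partial
    return result, (0 if all_ha else 1)


def main() -> int:
    """Read config from stdin, write the result to stdout, return the exit code."""
    try:
        result, code = run_probe(json.load(sys.stdin))
    except Exception as exc:  # malformed input or a failed API call
        print(f"probe error: {exc}", file=sys.stderr)
        return 2  # 2 = error
    json.dump(result, sys.stdout, indent=2)
    return code

# A real probe script would end with: sys.exit(main())
```

Keeping run_probe separate from main makes the verification logic testable without touching stdin or the process exit code.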

If no probe is configured for a service, ground uses the declared features list as-is. Probes add verified evidence on top of declarations; they do not replace them. If a probe is unavailable or returns an error, ground falls back to the declaration and notes the verification status in ground-meta.json.
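
The fallback rules above can be sketched as a small merge function. This illustrates the described behaviour only and is not ground's actual code: the ground-meta.json schema is not specified in this reference, so the verification_status field and its three values are hypothetical names.

```python
from typing import Optional


def merge_declaration(declaration: dict, probe_result: Optional[dict]) -> dict:
    """Combine a declared service with an optional probe result.

    probe_result is None when the probe was unavailable or exited with an error.
    All added field names are hypothetical.
    """
    merged = dict(declaration)  # declared features are always kept as-is
    if not declaration.get("probe"):
        merged["verification_status"] = "declared-only"
    elif probe_result is None:
        merged["verification_status"] = "probe-failed"  # fall back to the declaration
    else:
        merged["verification_status"] = "probed"
        merged["features_verified"] = probe_result.get("features_verified", [])
        merged["features_unverified"] = probe_result.get("features_unverified", [])
    return merged
```

In every branch the declared features list survives unchanged, which matches the rule that probes add evidence rather than replace declarations.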

Deploying individual stacks

ground deploy runs all five stacks in sequence. To deploy incrementally, generate IaC output and apply selected stacks, or target individual stacks directly with the AWS CLI.

# Preview all stacks without deploying
ground deploy --dry-run

# Generate Terraform or CDK and apply selectively
ground deploy --output terraform
cd ground-terraform
terraform apply -target=aws_cloudformation_stack.ground_logging \
                -target=aws_cloudformation_stack.ground_security

Stack deployment order

  1. accounts: OU hierarchy, account vending
  2. security: logging-protection SCP
  3. logging: CloudTrail, Config, S3 audit bucket
  4. identity: IAM Identity Center permission sets
  5. network: Transit Gateway, VPCs, endpoints

Minimum for attest scan: deploy logging + security. attest will warn about the missing stacks but can still assess posture from CloudTrail and Config rules.

IaC output modes

Ground generates CloudFormation (default), Terraform HCL, or CDK TypeScript. CloudFormation deploys directly; Terraform and CDK generate artifacts for your existing toolchain.

ground deploy --output cloudformation  # default: deploy now
ground deploy --output terraform        # → ground-terraform/
ground deploy --output cdk             # → ground-cdk/

Custom workload OUs

The five default OUs (Security, Infrastructure, Research, Sensitive Research, DoD/CMMC) are always created. Add institution-specific OUs under workload_ous. Custom OUs inherit the SCPs of their parent tier and are tagged with ground:tier for attest auto-discovery.

org:
  workload_ous:
    - research             # → /Research/ OU, inherits Research SCP tier
    - sandbox              # → /Research/Sandbox/ sub-OU
    - hpc-cluster          # → /SensitiveResearch/HPC/ sub-OU, inherits Sensitive tier
    - dod-contractors      # → /DoD-CMMC/ sub-OU, inherits DoD SCP tier

Custom required tags

Each entry in required_tags generates its own Deny SCP statement, so the checks combine with OR logic, not AND: a resource missing any single required tag is denied at creation. Add institutional tags alongside the attest defaults.

tagging:
  required_tags:
    - project              # what research project / grant this supports
    - environment          # dev / staging / prod
    - owner                # PI email or team name
    - data-classification  # public / internal / restricted / cui / phi
    - cost-center          # → finance reporting (institution-specific)
    - grant-number         # → grant reporting and cost allocation
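
To see why one statement per tag gives OR behaviour, it helps to look at what such statements could look like. The sketch below is an assumption about the general shape, not the policy ground actually emits: it uses the standard IAM pattern of a Deny with a Null condition on aws:RequestTag/<key>, and the action and resource defaults (ec2:RunInstances on instance ARNs) are illustrative only. The function name tag_deny_statements is hypothetical.

```python
def tag_deny_statements(
    required_tags,
    actions=("ec2:RunInstances",),               # illustrative default
    resources=("arn:aws:ec2:*:*:instance/*",),   # illustrative default
):
    """Build one Deny statement per required tag key."""
    statements = []
    for tag in required_tags:
        # Sid values must be alphanumeric, so "data-classification"
        # becomes "RequireTagDataClassification".
        suffix = "".join(ch for ch in tag.title() if ch.isalnum())
        statements.append({
            "Sid": f"RequireTag{suffix}",
            "Effect": "Deny",
            "Action": list(actions),
            "Resource": list(resources),
            # Null = "true" matches requests that do not carry this tag key.
            "Condition": {"Null": {f"aws:RequestTag/{tag}": "true"}},
        })
    return statements
```

A request is denied as soon as any one Deny statement matches, so a resource that carries project, environment, and owner but lacks data-classification is still blocked by the fourth statement.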

VPC endpoint selection

VPC endpoints keep traffic to AWS services inside the AWS network, which is commonly needed to meet FedRAMP and CMMC network-boundary requirements. The table below shows which endpoints are relevant to common research workload patterns.

Endpoint       | When to include                                        | Cost
s3             | Always (free gateway endpoint)                         | Free
sts            | Always (IAM assume-role calls go through STS)          | ~$7/mo per AZ
ssm            | Required for Session Manager (no bastion hosts)        | ~$7/mo per AZ
secretsmanager | Any workload reading secrets (databases, API keys)     | ~$7/mo per AZ
kms            | Any workload using CMKs for encryption                 | ~$7/mo per AZ
ec2            | EC2-heavy SREs; allows EC2 API calls without internet  | ~$7/mo per AZ
logs           | Required for VPC Flow Log delivery                     | ~$7/mo per AZ
sagemaker.api  | SageMaker in air-gapped VPCs (CMMC L2+)                | ~$7/mo per AZ
execute-api    | Private API Gateway endpoints                          | ~$7/mo per AZ

Cost is per AZ, per region, per month. All endpoints use aws:PrincipalOrgID conditions, so only principals in your org can use them.
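
The per-AZ figure adds up quickly across endpoints and VPCs. The sketch below is rough arithmetic based only on the approximate figures in the table; it ignores per-GB data processing charges, and multiplying by the number of spoke VPCs is an assumption that follows from endpoints being created in each spoke VPC. The function name estimate_monthly_cost is hypothetical.

```python
# Approximate figures taken from the table: gateway endpoints are free,
# interface endpoints cost roughly $7 per AZ per month.
GATEWAY_ENDPOINTS = {"s3"}
APPROX_INTERFACE_COST_PER_AZ = 7.0


def estimate_monthly_cost(endpoints, az_count, vpc_count=1):
    """Rough monthly USD estimate for the interface endpoints in a list."""
    interface = [e for e in endpoints if e not in GATEWAY_ENDPOINTS]
    return len(interface) * az_count * vpc_count * APPROX_INTERFACE_COST_PER_AZ


baseline = ["s3", "sts", "ssm", "secretsmanager", "kms", "ec2", "logs"]
print(estimate_monthly_cost(baseline, az_count=2))  # 6 interface endpoints x 2 AZs x $7
```

For this baseline list in one VPC across two AZs the estimate comes to about $84 per month; the same list across five spoke VPCs would be about $420.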

Log retention by framework

Set logging.retention_days to meet the most restrictive framework active in your SRE. attest will warn during attest scan if retention falls below the framework minimum.

Framework        | Minimum retention           | Recommended
CMMC Level 1     | 90 days                     | 365 days
CMMC Level 2     | 90 days                     | 365 days
HIPAA            | 6 years (2190 days)         | 2555 days (7 yr)
FedRAMP Moderate | 365 days                    | 365 days
NIH GDS          | 365 days after DUA closeout | 365 days
FERPA            | 5 years (1825 days)         | 1825 days
ISO 27001        | 90 days                     | 365 days

HIPAA + NIH GDS conflict: HIPAA requires long retention; NIH GDS requires deletion of genomic data at DUA closeout. attest's provenance-aware provisioning resolves this with per-prefix S3 lifecycle rules: HIPAA audit logs stay 7 years; dbGaP-sourced genomic objects expire at DUA closeout. Set retention_days to your highest requirement and let attest manage the per-dataset overrides.
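
Choosing retention_days for several active frameworks is a matter of taking the largest minimum from the table. The helper below is a small sketch of that rule; the framework keys are hypothetical identifiers chosen for this example, not names that ground or attest are known to use, and the NIH GDS entry simplifies "365 days after DUA closeout" to a flat 365.

```python
# Minimum retention in days, copied from the table above.
# The dictionary keys are hypothetical identifiers.
MIN_RETENTION_DAYS = {
    "cmmc-l1": 90,
    "cmmc-l2": 90,
    "hipaa": 2190,             # 6 years
    "fedramp-moderate": 365,
    "nih-gds": 365,            # counted from DUA closeout in practice
    "ferpa": 1825,             # 5 years
    "iso27001": 90,
}


def required_retention_days(active_frameworks):
    """Return the most restrictive (largest) minimum among the active frameworks."""
    unknown = sorted(set(active_frameworks) - set(MIN_RETENTION_DAYS))
    if unknown:
        raise ValueError(f"unknown frameworks: {', '.join(unknown)}")
    return max((MIN_RETENTION_DAYS[f] for f in active_frameworks), default=0)


print(required_retention_days(["cmmc-l2", "fedramp-moderate"]))  # 365
print(required_retention_days(["hipaa", "nih-gds"]))             # 2190
```

The second call shows the HIPAA + NIH GDS case: the bucket-level value follows HIPAA, and the shorter genomic-data expiry is left to the per-prefix rules described above.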