# ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
# ┃  Riveta Labs – Robots & Agent Policy                                 ┃
# ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
# ┃  Updated: 2026-01-22 | SEO + GEO Optimized                           ┃
# ┃                                                                      ┃
# ┃  Policy:                                                             ┃
# ┃    • Search engines & AI assistants: WELCOME                         ┃
# ┃    • Checkouts/account pages: Humans only                            ┃
# ┃    • Aggressive scrapers: Blocked or throttled                       ┃
# ┃                                                                      ┃
# ┃  Terms: https://www.shopify.com/legal/terms                          ┃
# ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

# =============================================================================
# SITEMAPS
# =============================================================================
Sitemap: https://rivetalabs.com/sitemap.xml

# =============================================================================
# GLOBAL DEFAULT
# =============================================================================
User-agent: *
Allow: /
# --- Shopify Security Blocks (private/system/duplicate paths) ---
Disallow: /a/downloads/-/*
Disallow: /admin
Disallow: /account
Disallow: /cart
Disallow: /carts
Disallow: /orders
Disallow: /checkouts/
Disallow: /checkout
Disallow: /72409743497/checkouts
Disallow: /72409743497/orders
Disallow: /search
Disallow: /sf_private_access_tokens
Disallow: /apple-app-site-association
Disallow: /.well-known/shopify/monorail
Disallow: /recommendations/products
Disallow: /*/recommendations/products
# --- Duplicate URL Parameters (prevents index bloat) ---
Disallow: /collections/*sort_by*
Disallow: /*/collections/*sort_by*
Disallow: /collections/*+*
Disallow: /collections/*%2B*
Disallow: /collections/*%2b*
Disallow: /*/collections/*+*
Disallow: /*/collections/*%2B*
Disallow: /*/collections/*%2b*
Disallow: /blogs/*+*
Disallow: /blogs/*%2B*
Disallow: /blogs/*%2b*
Disallow: /*/blogs/*+*
Disallow: /*/blogs/*%2B*
Disallow: /*/blogs/*%2b*
Disallow: /*?*oseid=*
Disallow: /*preview_theme_id*
Disallow: /*preview_script_id*
Disallow: /*/*?*ls=*&ls=*
Disallow: /*/*?*ls%3D*%3Fls%3D*
Disallow: /*/*?*ls%3d*%3fls%3d*
# --- FIX: Block Vendor & Search Leaks ---
Disallow: /collections/vendors*
Disallow: /collections/types*
Disallow: /*?q=*
# --- Shopify Remote Preview Slugs ---
Disallow: /products/*-remote
Disallow: /*/products/*-remote
Disallow: /collections/*/products/*-remote
Disallow: /*/collections/*/products/*-remote

# =============================================================================
# SEARCH ENGINES (Tier 1 Priority)
# =============================================================================
# Google
User-agent: Googlebot
Allow: /

User-agent: Googlebot-Image
Allow: /

User-agent: AdsBot-Google
Allow: /

User-agent: Mediapartners-Google
Allow: /

# Bing (CRITICAL: ChatGPT uses Bing's index)
User-agent: Bingbot
Allow: /

User-agent: msnbot
Allow: /

# DuckDuckGo
User-agent: DuckDuckBot
Allow: /

# Yahoo/Slurp
User-agent: Slurp
Allow: /

# Yandex
User-agent: YandexBot
Allow: /

# Baidu
User-agent: Baiduspider
Allow: /

# Apple (Siri/Spotlight)
User-agent: Applebot
Allow: /

# =============================================================================
# AI CRAWLERS – GEO OPTIMIZATION (Maximum Visibility)
# =============================================================================
# Common Crawl – High volume, used for AI training without consent controls
# (Allowed here for maximum AI training visibility; change to "Disallow: /" to opt out)
User-agent: CCBot
Allow: /

# --- Google AI (Gemini) ---
User-agent: Google-Extended
Allow: /

# --- OpenAI (ChatGPT) ---
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: OAI-SearchBot
Allow: /

# --- Anthropic (Claude) ---
User-agent: ClaudeBot
Allow: /

User-agent: Claude-Web
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: Claude-User
Allow: /

User-agent: anthropic-ai
Allow: /

# --- Perplexity ---
User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /

# --- Cohere ---
User-agent: cohere-ai
Allow: /

# --- You.com ---
User-agent: YouBot
Allow: /

# --- DuckDuckGo AI ---
User-agent: DuckAssistBot
Allow: /

# --- Amazon (Alexa) ---
User-agent: Amazonbot
Allow: /

# --- Microsoft (Copilot uses Bing) ---
# (Covered by Bingbot above)

# =============================================================================
# SOCIAL MEDIA PREVIEWS (Indirect SEO via shares)
# =============================================================================
User-agent: facebookexternalhit
Allow: /

User-agent: LinkedInBot
Allow: /

User-agent: Twitterbot
Allow: /

User-agent: Slackbot
Allow: /

# =============================================================================
# SEO TOOLS – Throttled (not blocked)
# =============================================================================
User-agent: AhrefsBot
Crawl-delay: 10

User-agent: AhrefsSiteAudit
Crawl-delay: 10

User-agent: SemrushBot
Crawl-delay: 10

User-agent: MJ12bot
Crawl-delay: 10

User-agent: DotBot
Crawl-delay: 10

User-agent: BLEXBot
Crawl-delay: 10

User-agent: Pinterest
Crawl-delay: 1

# =============================================================================
# BLOCKED – Aggressive/Unwanted Crawlers
# =============================================================================
# ByteDance (TikTok) – Extremely aggressive, resource drain
User-agent: Bytespider
Disallow: /

# Apache Nutch – Often abused for scraping
User-agent: Nutch
Disallow: /

# Scrapy bots
User-agent: Scrapy
Disallow: /

# Known bad actors
User-agent: DataForSeoBot
Disallow: /

User-agent: Sogou
Disallow: /

User-agent: PetalBot
Disallow: /
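
# =============================================================================
# NOTE – How groups are matched (illustrative only, not active rules)
# =============================================================================
# Under the Robots Exclusion Protocol (RFC 9309), a crawler follows only the
# group whose User-agent line matches it most specifically, and falls back to
# "User-agent: *" only when no named group matches. A named group that contains
# just "Allow: /" or "Crawl-delay" therefore does not pick up the Disallow
# rules of the global default.
#
# The commented-out lines below are a minimal sketch, assuming the checkout and
# account restrictions should also apply to Googlebot. The choice of crawler
# and the subset of paths are assumptions for illustration; adapt them before
# enabling, and repeat the pattern for any other named crawler as needed.
#
# User-agent: Googlebot
# Allow: /
# Disallow: /admin
# Disallow: /account
# Disallow: /cart
# Disallow: /orders
# Disallow: /checkouts/
# Disallow: /checkout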