GitHub Actions Failure Debugging¶

This skill provides a systematic approach to debugging failing GitHub Actions workflows in pull requests.

When to Use This Skill¶

Use this skill when: - GitHub Actions workflows are failing - CI/CD pipeline shows red status - Pull request checks are not passing - Asked to investigate or fix workflow failures

Prerequisites¶

Access to GitHub MCP Server tools
Repository with GitHub Actions workflows
Appropriate permissions to view workflow logs

Instructions¶

List recent workflow runs for the pull request

Use the list_workflow_runs tool to retrieve recent runs and their status:

# Tool will return: workflow names, run IDs, status, conclusion

Identify failed jobs

Review the workflow run results to find: - Which workflows failed - Which specific jobs failed within those workflows - Timestamp and trigger information

Get summarized failure information

Use the summarize_job_log_failures tool to get an AI summary of failed job logs: - This avoids filling context with thousands of log lines - Provides focused failure analysis - Highlights key error messages and patterns

Retrieve detailed logs if needed

If the summary doesn't provide enough information: - Use get_job_logs tool with the specific job ID - Use get_workflow_run_logs tool for complete workflow logs - Search for error messages, stack traces, or failure indicators

Attempt local reproduction

Try to reproduce the failure in your own environment: - Check out the same branch - Run the same commands locally - Verify environment variables and dependencies match

Fix the failing build

Based on the error analysis: - Make necessary code changes - Update workflow configuration if needed - Ensure dependencies are properly specified - Add error handling or missing resources

Verify the fix

Before committing: - Test locally if you reproduced the issue - Check that the fix addresses the root cause - Consider adding tests to prevent regression - Review workflow syntax if configuration was changed

Examples¶

Example 1: Test Failure¶

A workflow fails with test errors.

Steps: 1. Use summarize_job_log_failures → "3 unit tests failing in user authentication module" 2. Examine specific test failures 3. Run tests locally: npm test -- --grep authentication 4. Fix the failing tests 5. Verify all tests pass locally 6. Commit and push changes

Example 2: Dependency Installation Failure¶

A workflow fails during dependency installation.

Steps: 1. Use get_job_logs for the "Install dependencies" step 2. Look for package resolution errors: npm ERR! 404 Not Found - package@version 3. Check package.json for typos or incorrect versions 4. Update to correct package version 5. Test installation locally: npm install 6. Commit fix with clear message

Example 3: Environment Configuration Issue¶

A workflow fails with missing environment variables.

Steps: 1. summarize_job_log_failures → "Environment variable DATABASE_URL not set" 2. Check workflow file for required secrets/variables 3. Verify secrets are configured in repository settings 4. If missing, add the required secret 5. If configured, check workflow syntax for correct reference 6. Re-run the workflow

Best Practices¶

Start with summarized logs to avoid context overflow
Focus on the first failure in a sequence (later failures may be cascading)
Check recent changes to code and workflow files
Look for common patterns: dependency version conflicts, missing files, permission issues
Document the root cause in commit messages
Consider if the failure indicates a broader issue that needs addressing

Common Issues¶

Issue: Logs show "Resource not available" or timeout errors
Solution: Check if external services or dependencies are accessible. May need to update URLs or credentials.

Issue: "File not found" errors in workflow
Solution: Verify file paths are correct relative to repository root. Check if files are committed and pushed.

Issue: Tests pass locally but fail in CI
Solution: Check for environment differences (OS, Node version, environment variables). Update workflow to match local environment or fix environment-specific code.

Issue: Intermittent failures
Solution: Look for race conditions, timing-dependent tests, or flaky external dependencies. Consider retry logic or more robust test design.