Testing Practices Questionnaire#
TODO#
- Review and refine questionnaire based on pilot feedback
- Distribute to 17 teams
- Compile and analyze responses
Team Testing Practices Questionnaire#
Team Name: Enter your team name
Date: Today’s date
Participants: Names of team members participating in this discussion
How to Use This Questionnaire#
Purpose#
This questionnaire is designed to gather comprehensive input about your team’s current testing practices, challenges, and priorities. The goal is to understand the testing landscape across all 17 teams and identify opportunities for shared improvements and knowledge transfer.
Filling Out Responses#
Be Descriptive: Provide specific details about your practices rather than just “yes” or “no” answers. Examples, tools, and processes are especially valuable.
Use “NA” When Appropriate: If a question doesn’t apply to your team, write “NA”. Adding brief context about why helps us understand your team’s situation. For example:
- “NA - We don’t do A/B testing because our service is internal-facing only”
- “NA - No staging environment due to legacy infrastructure constraints”
- “NA - Component testing isn’t relevant for our data pipeline architecture”
Team Reflection: Many teams find value in discussing these questions together. You might discover practices you hadn’t formally recognized or identify gaps that warrant attention.
No Wrong Answers: Every team’s context is different. What works for one team may not work for another. Be honest about your current state rather than describing an idealized version.
Ask for Clarification: If any question is unclear or could be interpreted multiple ways, feel free to note your interpretation or ask for clarification.
Section 1: Testing Types & Coverage#
This section covers the various types of testing your team performs. For each testing type that applies to your team, provide details about your current practices:
1.1 Regression Testing#
Performance and reliability testing that validates system behavior under various conditions.
Q:1.1.1 Availability testing: How do you test that your system stays up and running? Availability testing validates system uptime and reliability.
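For illustration, a minimal availability probe can be as small as a scheduled script that asserts the health endpoint answers; the service URL and /health path below are hypothetical:

```python
import requests

SERVICE_URL = "https://example.internal/health"  # hypothetical health endpoint

def test_service_is_available():
    """Minimal availability probe: the service answers its health check."""
    response = requests.get(SERVICE_URL, timeout=5)
    assert response.status_code == 200
```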
Q:1.1.2 Latency testing: How do you measure and test response times? Latency testing measures how quickly your system responds to requests.
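As one sketch of a latency check, the test below samples a hypothetical endpoint and asserts the 95th percentile stays under an assumed budget; real suites usually derive the budget from an SLO rather than a constant:

```python
import statistics
import time

import requests

ENDPOINT = "https://example.internal/api/orders"  # hypothetical endpoint
P95_BUDGET_SECONDS = 0.5  # assumed latency budget

def test_p95_latency_within_budget():
    """Sample the endpoint and assert the 95th percentile stays under budget."""
    samples = []
    for _ in range(20):
        start = time.perf_counter()
        requests.get(ENDPOINT, timeout=5)
        samples.append(time.perf_counter() - start)
    p95 = statistics.quantiles(samples, n=20)[18]  # 19 cut points; index 18 is the 95th
    assert p95 < P95_BUDGET_SECONDS
```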
Q:1.1.3 Throughput/Volume testing: How do you test your system’s ability to handle scale and volume of requests? Throughput testing validates how many requests your system can process simultaneously.
Q:1.1.4 Saturation testing: How do you test and monitor resource usage limits (Memory/CPU)? What tools do you use for saturation/load testing? Name any frameworks you use, or note if they are built in-house. Saturation testing identifies the point where resource constraints impact performance.
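Locust is one widely used open-source option for this kind of load generation; a minimal sketch (with hypothetical API paths) might look like:

```python
from locust import HttpUser, task, between

class CheckoutUser(HttpUser):
    """Simulated user for load generation; the API paths are hypothetical."""
    wait_time = between(0.1, 0.5)  # aggressive pacing to push the system

    @task(3)
    def browse_catalog(self):
        self.client.get("/api/products")

    @task(1)
    def place_order(self):
        self.client.post("/api/orders", json={"sku": "TEST-123", "qty": 1})
```

Ramping the simulated user count while watching CPU/memory dashboards is a common way to locate the saturation point (e.g. `locust -f locustfile.py --host https://staging.example.internal`).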
1.2 Functional Testing#
Feature and behavior testing that validates what the system does from various perspectives.
Q:1.2.1 System/App Blackbox testing: How do you test features from a user perspective without looking at internal code? What tools do you use for UI and end-to-end testing? Name any frameworks you use, or note if they are built in-house. Blackbox testing validates functionality through external interfaces (API endpoints, UI automation, end-to-end user journeys).
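For teams using browser automation, a blackbox user journey in Playwright might look like the sketch below; the URL, selectors, and credentials are hypothetical:

```python
from playwright.sync_api import sync_playwright

def test_login_journey():
    """Blackbox end-to-end check driven purely through the UI."""
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("https://staging.example.internal/login")  # hypothetical URL
        page.fill("#username", "test-user")                  # hypothetical selectors
        page.fill("#password", "test-password")
        page.click("button[type=submit]")
        page.wait_for_url("**/dashboard")  # the journey ends on the dashboard
        browser.close()
```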
Q:1.2.2 Component testing: How do you test individual components or modules in isolation? What tools do you use for integration and API testing? Name any frameworks you use, or note if they are built in-house. Component testing validates specific modules separately (microservices, database layers, authentication modules).
Q:1.2.3 Integration test dependencies: Are there particular dependencies (like databases, message brokers, external APIs) that are excluded from your automated integration tests? If yes, which ones and why? This captures what external systems are mocked vs tested with real implementations.
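When a dependency is exercised with a real implementation rather than a mock, one common pattern is a throwaway container per test session. A sketch using Testcontainers and SQLAlchemy (assumes a local Docker daemon and a Postgres driver are available):

```python
import pytest
from sqlalchemy import create_engine, text
from testcontainers.postgres import PostgresContainer

@pytest.fixture(scope="session")
def pg_url():
    # Throwaway Postgres so the test hits a real database instead of a mock;
    # requires a local Docker daemon.
    with PostgresContainer("postgres:16") as pg:
        yield pg.get_connection_url()

def test_database_roundtrip(pg_url):
    engine = create_engine(pg_url)
    with engine.connect() as conn:
        assert conn.execute(text("SELECT 1")).scalar() == 1
```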
Q:1.2.4 Unit testing with mocks: What is your approach to testing individual functions/methods with mocked dependencies? Unit testing validates the smallest testable parts of code in isolation using mock objects.
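A minimal example of the mock-based style, using Python’s standard unittest.mock (charge_customer is a hypothetical unit under test):

```python
from unittest.mock import Mock

def charge_customer(payment_gateway, customer_id, amount):
    """Hypothetical unit under test: delegates to a payment gateway."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    return payment_gateway.charge(customer_id, amount)

def test_charge_customer_calls_gateway():
    gateway = Mock()
    gateway.charge.return_value = "receipt-42"
    assert charge_customer(gateway, "cust-1", 10.0) == "receipt-42"
    gateway.charge.assert_called_once_with("cust-1", 10.0)
```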
Q:1.2.5 User Acceptance Testing (UAT): How do you validate that features meet user requirements? UAT ensures the system works for real users and meets business requirements.
1.3 Performance Testing#
Testing that validates user experience and feature effectiveness.
Q:1.3.1 A/B testing: How do you compare different versions of features? What tools do you use for A/B testing? Name any frameworks you use, or note if they are built in-house. A/B testing compares two versions of a feature to determine which performs better.
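Most teams rely on an experimentation platform here, but the core mechanic is deterministic bucketing; a self-contained sketch (a hypothetical helper, not any particular platform’s API):

```python
import hashlib

def assign_variant(user_id: str, experiment: str, split: float = 0.5) -> str:
    """Deterministically bucket a user into 'A' or 'B' (hypothetical helper).

    Hashing user_id together with the experiment name keeps assignment
    stable across sessions and independent across experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF
    return "A" if bucket < split else "B"

assert assign_variant("user-123", "new-checkout") in {"A", "B"}
# Assignment is stable for the same user and experiment:
assert assign_variant("user-123", "new-checkout") == assign_variant("user-123", "new-checkout")
```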
Q:1.4 Which testing types are you strongest in? Which need improvement? Based on your answers to questions Q:1.1.1-Q:1.3.1 above, reflect on your team’s testing strengths and the areas that could use more attention.
Q:1.5 Are there testing gaps you’re aware of but haven’t addressed? Looking at the testing types from questions Q:1.1.1-Q:1.3.1, identify areas where you know testing is needed but hasn’t been implemented yet.
Section 2: Test Triggers & Automation#
This section focuses on how your tests get started - whether by humans, automation, or a combination - and how frequently different types of testing occur.
Q:2.1 How are your tests initiated? Describe your current setup and the different ways your tests get started.
Q:2.1.1 Manual testing: What tests require human intervention to start and execute? Manual testing involves human-driven test execution and validation.
Q:2.1.2 Automatic testing (CI/CD, scheduled): What is your team’s general approach to automated testing? Describe your typical setup without listing every component. Roughly what percentage of your codebase do these automated tests cover? Automatic testing runs without human intervention, triggered by events or schedules.
Q:2.1.3 Half-automated testing: What tests do humans start but then run automatically? Half-automated testing combines human initiation with automated execution.
Q:2.2 What is the frequency of your different test types? Thinking about the testing types from Section 1 (regression, functional, performance), describe how often different types of tests run. We are not looking for exact numbers, just a rough sense of what is executed and how often.
Q:2.2.1 Continuous/On every commit: What tests run with every code change? Continuous testing provides immediate feedback on code changes.
Q:2.2.2 Daily testing: Do you have tests that run daily? What would be the rough coverage of such tests in your application landscape? If yes, could you give one or two examples? If no, could you share if it’s not applicable to your scope and why? Daily testing catches issues that may emerge over time or with different data sets.
Q:2.2.3 Weekly testing: Do you have tests that run weekly? What would be the rough coverage of such tests in your application landscape? If yes, could you give one or two examples? If no, could you share if it’s not applicable to your scope and why? Weekly testing often includes longer-running or comprehensive test suites.
Q:2.2.4 Pre-release testing: What tests are specifically run before deploying? Pre-release testing validates readiness for production deployment.
Q:2.2.5 Ad-hoc/Manual testing: What tests are run on-demand or irregularly? Ad-hoc testing addresses specific concerns or investigates issues as they arise.
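For illustration, one common way to map suites onto the cadences above is marker-based selection in pytest; the marker names are hypothetical and would be registered in pytest.ini:

```python
import pytest

@pytest.mark.smoke  # hypothetical marker: runs on every commit
def test_service_starts():
    assert True

@pytest.mark.nightly  # hypothetical marker: runs on the daily schedule
def test_full_catalog_reindex():
    assert True

@pytest.mark.release  # hypothetical marker: runs before deployment
def test_critical_user_journeys():
    assert True

# CI then selects a tier by marker, for example:
#   pytest -m smoke               (every commit)
#   pytest -m nightly             (daily job)
#   pytest -m "smoke or release"  (pre-release gate)
```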
Section 3: Monitoring & Testing Integration#
This section explores how you connect production monitoring and system behavior back to your testing practices - creating a feedback loop between what happens in production and what you test for.
Q:3.1 How do you link monitoring data back to testing decisions? Explain how production metrics inform what you test.
Q:3.1.1 Technical behavior monitoring: How do system metrics (CPU, memory, response times) influence your testing strategy? Technical monitoring tracks system performance metrics to inform testing decisions. Examples: High CPU alerts leading to load testing, or memory leaks resulting in long-running tests.
Q:3.1.2 Functional behavior monitoring: How does feature usage and user behavior data shape your testing approach? Functional monitoring tracks user interactions and feature usage to guide testing priorities. Examples: Analytics showing user abandonment leading to targeted tests, or error tracking driving resilience testing.
Q:3.2 Production monitoring feedback loop: How do production issues or metrics changes lead to new or modified tests? A feedback loop uses production monitoring data to continuously improve testing strategy. Examples: Response time degradation leading to new performance tests, or incidents generating regression test cases.
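As a sketch of what closing this loop can look like in code, the test below turns a production latency baseline into a regression threshold; fetch_production_p95 is a hypothetical stand-in for a query against your metrics backend:

```python
HEADROOM = 1.2  # assumed tolerance: 20% over the production baseline

def fetch_production_p95(metric: str) -> float:
    # Hypothetical stand-in: in practice this would query your metrics
    # backend (Prometheus, Datadog, ...). Hard-coded to stay self-contained.
    return 0.35  # seconds

def test_latency_has_not_regressed():
    """Fail the build if test-environment latency drifts past the prod baseline."""
    measured_p95 = 0.30  # would come from this run's load-test results
    baseline = fetch_production_p95("checkout_latency_p95")
    assert measured_p95 <= baseline * HEADROOM
```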
Q:3.3 Production behavior validation: How do you ensure your test scenarios match what happens in the real world? Production behavior validation confirms that tests accurately represent real-world usage patterns and conditions.
Q:3.4 Business metrics tracking: How do you connect technical performance to business outcomes? What tools do you use for tracking business metrics and observability? Name any frameworks you use, or note if they are built in-house. Business metrics tracking links system performance to measurable business results.
Section 4: Testing Environments#
Q:4.1 Describe your testing environments: Detail the different environments where you run tests.
Q:4.1.1 Production testing: What testing happens directly in the live production environment? Production testing validates system behavior in the real environment with actual users and data. For additional reading on production testing approaches, see Zalando’s E2E probes for regular automatic testing in production: https://zcp.docs.zalando.net/concepts/reference-store/testing/#e2e-probes-on-web
Q:4.1.2 Staging environment: How is your pre-production testing environment used? Staging environments mirror production for final validation before release.
Q:4.1.3 Development/Local testing: What testing happens during development on local machines? Development testing provides fast feedback during code creation and modification.
Q:4.2 What data strategy do you use in each environment? Explain what kind of data is used for testing in different environments.
Q:4.2.1 Synthetic data: How do you use artificially created data for testing? Synthetic data is artificially generated to simulate real data without privacy concerns.
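Libraries such as Faker are a common starting point for synthetic data; a small sketch:

```python
from faker import Faker

fake = Faker()
Faker.seed(1234)  # seeding keeps generated fixtures reproducible across runs

def make_test_customer() -> dict:
    """Generate a realistic-looking customer record containing no real PII."""
    return {
        "name": fake.name(),
        "email": fake.email(),
        "address": fake.address(),
        "created_at": fake.date_time_this_year().isoformat(),
    }

print(make_test_customer())
```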
Q:4.2.2 Scrubbed production data: How do you use real data with sensitive information removed? Scrubbed production data maintains data relationships while removing personally identifiable information.
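A minimal scrubbing sketch that replaces PII with stable pseudonyms so joins across tables still line up (field names are hypothetical; production pipelines typically add salting or tokenization, since plain hashes of low-entropy fields can be reversed):

```python
import hashlib

PII_FIELDS = {"name", "email", "phone"}  # hypothetical field list

def scrub_record(record: dict) -> dict:
    """Replace PII with stable pseudonyms so joins across tables still work."""
    scrubbed = {}
    for key, value in record.items():
        if key in PII_FIELDS:
            token = hashlib.sha256(str(value).encode()).hexdigest()[:12]
            scrubbed[key] = f"{key}-{token}"
        else:
            scrubbed[key] = value
    return scrubbed

print(scrub_record({"name": "Jane Doe", "email": "jane@example.com", "order_total": 42.5}))
```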
Q:4.2.3 Live production data: How do you use actual production data for testing? Live production data provides the most realistic testing conditions but requires careful privacy and safety controls.
Q:4.3 Staging-production similarity: How similar is your staging environment to production? Environment parity measures how closely staging matches production in infrastructure, data, and configuration.
Section 5: Testing Challenges & Pain Points#
Q:5.1 Functional regressions in production: Do you experience functional regressions that leak past your PR pipeline and only show up in production? If yes, describe typical examples and frequency. Functional regressions are features that break or behave differently than expected, escaping existing test coverage.
Q:5.2 Test data and user isolation: How do you handle creating test users, data, or transactions in production without affecting real business metrics or customer data? Test isolation in production prevents test activities from contaminating real business analytics or databases.
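One common pattern here is an explicit test flag carried on every test entity, which downstream analytics filter out; a self-contained sketch with hypothetical names:

```python
TEST_USER_PREFIX = "zz-test-"  # hypothetical convention for test accounts

def create_order(user_id: str, amount: float) -> dict:
    return {
        "user_id": user_id,
        "amount": amount,
        # Persisted flag lets downstream pipelines exclude test activity.
        "is_test": user_id.startswith(TEST_USER_PREFIX),
    }

def revenue(orders: list) -> float:
    """Business metric that ignores flagged test orders."""
    return sum(o["amount"] for o in orders if not o["is_test"])

orders = [create_order("cust-1", 100.0), create_order("zz-test-probe", 5.0)]
assert revenue(orders) == 100.0
```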
Q:5.3 Pre-production monitoring probes: Do you have effective monitoring probes set up before code goes live? Describe your strategy. Monitoring probes are automated checks that validate system behavior before full deployment.
Q:5.4 Release health validation: Do you have a standardized process to validate that a release is healthy? If yes, describe it. If no, what makes this challenging? Release health validation is a consistent process to confirm deployment success and system stability.
Q:5.5 Performance issue correlation: Can you effectively correlate performance issues with specific technical factors? For example, can you link a slow UI to specific CPU/Memory spikes? Performance correlation connects user-facing problems with specific technical causes for faster troubleshooting.
Thank you for your participation! This information will help us understand testing practices across teams and identify opportunities for shared improvements.
WIP Notes#
1. Kinds of Testing#
Your notes branch into two primary streams, technical/performance and functional/behavioral, plus a business-facing branch (A/B testing and business metrics).
Regression Testing
- Availability: Ensuring the system stays up.
- Latency: Measuring response times.
- Throughput / Volume: Handling the scale of requests.
- Saturation: Monitoring resource limits (Memory / CPU usage).
Functional Testing
- System / App Blackbox Testing: Testing features without internal code visibility (utilizing probes).
- Component Testing: Validating individual modules (linked to “more nodes” and “more test data”).
- Unit / Mocks: Validating logic at the lowest level.
- UAT (User Acceptance Testing): Ensuring it meets user requirements.
Performance / Business Testing
- A/B Testing: Comparing two versions of a feature.
- Business Metrics: Tying technical performance back to business outcomes.
2. Triggers#
This section defines what initiates the testing process.
- Manual: Human-initiated tests.
- Automatic: Fully automated via CI/CD or scheduled jobs.
- Half-Automated: Human-triggered but execution is automated.
- Frequency: Determining how often these triggers occur (recurring vs. event-based).
3. Links: Monitoring to Testing#
This describes the “Control Loop” or feedback cycle between system behavior and test execution.
- Monitoring Input: Tracking Technical Behavior (metrics) and Functional Behavior (features).
- Feedback Loop: Using monitoring data to inform what needs to be tested or to validate if a test passed in the live environment.
4. Environments / Realms#
Where the testing lifecycle actually lives.
- Prod (Production): The live environment where real users are.
- Staging: The “pre-live” mirror for final validation.
- Change Lifecycle: Mapping where the code is in its journey from dev to live.
- Data Strategy: Determining “what kind of data” is used in each realm (synthetic vs. scrubbed prod data).
5. Common Challenges (Impediments)#
These are the core pain points for your problem statement:
- Functional Regression: Issues that “leak” past the Pull Request (PR) pipeline and only show up in prod.
- Test Entities / Isolation: The difficulty of creating test data/users in prod without “polluting” real business analytics or databases.
- Pre-production Probes: Setting up effective monitoring probes before the code is fully “live.”
- Defined Process to Validate a Release: Lack of a standardized workflow to confirm a release is “healthy.”
- Correlation of Factors: The difficulty in associating performance drops with specific technical factors (e.g., linking a slow UI to specific CPU/Memory spikes).