Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding fault injection integ tests #4399

Merged
merged 1 commit into from
Oct 22, 2024
Merged

Conversation

mye956
Copy link
Contributor

@mye956 mye956 commented Oct 17, 2024

Summary

This PR will introduce new integration tests that will tests running the new fault injection endpoints within TMDS in parallel. These tests will only be running for linux and will need sudo permissions to run the utility tools such as iptables and tc

Implementation details

  • getFaultInjectionTaskResponse(): Helper function to build a TaskResponseobject with fault injection configurations. This will be used during the tests
  • getHostNetworkInterfaceName(): helper function that uses the DefaultNetInterfaceName() function to get the default host network interface name. It is used when constructing the TaskResponse object in getFaultInjectionTaskResponse()
  • getNetworkBlackHolePortRequestBody(): Getter function to create a request body for network black hole port requests
  • getNetworkLatencyRequestBody(): Getter function to create a request body for network latency requests
  • getNetworkPacketLossRequestBody(): Getter function to create a request body for network packet loss requests
  • startServer(): This function will create and start up a new HTTP server that will include all of the network fault handlers and will create all of the mocks used during the tests.
  • shutdownServer(): Helper function to stop/shutdown the HTTP server in startServer().
  • getURL(): Helper function to construct the URL of the fault injection endpoints
  • cleanupBlackHolePortFault(): Helper function that will hit the network-black-hole-port/stop endpoints to stop/clean up the black hole port fault on the host
  • cleanupLatencyAndPacketLossFaults(): Helper function that will hit both network-latency/stop and network-packet-loss/stop endpoint to clean up both latency and packet loss faults on the host
  • skipForUnsupportedTc(): Helper function that tests whether or not the tc utility installed on the host can use the -j flag. If it can't then we will skip running the integ tests

Testing

  • New tests cases:
    • Make 2 network black hole port in parallel
    • Make 2 network latency in parallel
    • Make 2 network packet loss in parallel
    • Make 1 network latency and 1 packet loss in parallel
    • Make 1 network black hole port and 1 latency in parallel
    • Make 1 network black hole port and 1 packet loss in parallel

Manually ran the tests:

--- PASS: TestParallelNetworkFaults (72.19s)
    --- PASS: TestParallelNetworkFaults/network_black_hole_port_same_type (12.03s)
    --- PASS: TestParallelNetworkFaults/network_latency_same_type (12.03s)
    --- PASS: TestParallelNetworkFaults/network_packet_loss_same_type (12.04s)
    --- PASS: TestParallelNetworkFaults/network_latency_and_packet_loss_different_type (12.03s)
    --- PASS: TestParallelNetworkFaults/network_black_hole_port_and_latency_different_type (12.03s)
    --- PASS: TestParallelNetworkFaults/network_black_hole_port_and_packet_loss_different_type (12.03s)
PASS
ok      github.com/aws/amazon-ecs-agent/ecs-agent/tmds/handlers/fault/v1/handlers       72.193s

New tests cover the changes: yes

Description for the changelog

Feature: Adding fault injection integration tests

Additional Information

Does this PR include breaking model changes? If so, Have you added transformation functions?

Does this PR include the addition of new environment variables in the README?

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@mye956 mye956 force-pushed the fault-injection-integ branch from 259981f to 8891f06 Compare October 17, 2024 19:00
@mye956 mye956 force-pushed the fault-injection-integ branch from 8891f06 to b777766 Compare October 18, 2024 18:02
@mye956 mye956 force-pushed the fault-injection-integ branch 2 times, most recently from 3a08f40 to b7c4e29 Compare October 18, 2024 18:53
@mye956 mye956 force-pushed the fault-injection-integ branch 2 times, most recently from 99b4d36 to 74c9c67 Compare October 18, 2024 21:40
@mye956 mye956 force-pushed the fault-injection-integ branch 2 times, most recently from ec14c56 to e046084 Compare October 18, 2024 22:53
@mye956 mye956 force-pushed the fault-injection-integ branch 2 times, most recently from 99759b1 to 74252d1 Compare October 21, 2024 18:03
@mye956 mye956 changed the title [WIP DO NOT REVIEW] Adding fault injection integ tests Adding fault injection integ tests Oct 21, 2024
@mye956 mye956 marked this pull request as ready for review October 21, 2024 18:13
@mye956 mye956 requested a review from a team as a code owner October 21, 2024 18:13
serverPort = 3333
)

var (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit and non-blocking: Using getter functions over vars is better for long term maintenance since that prevents tests from modifying the originally defined values. Once we have a test that modifies these variables they cannot reliably be used by other tests as constants.

agent/handlers/task_server_setup_sudo_integ_test.go Outdated Show resolved Hide resolved

go func() {
if err := server.Serve(listener); err != nil && err != http.ErrServerClosed {
t.Logf("ListenAndServe(): %s\n", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we fail the test here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah it should

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a require.NoError

agent/handlers/task_server_setup_sudo_integ_test.go Outdated Show resolved Hide resolved
agent/handlers/task_server_setup_sudo_integ_test.go Outdated Show resolved Hide resolved
agent/handlers/task_server_setup_sudo_integ_test.go Outdated Show resolved Hide resolved
agent/handlers/task_server_setup_sudo_integ_test.go Outdated Show resolved Hide resolved
agent/handlers/task_server_setup_sudo_integ_test.go Outdated Show resolved Hide resolved
@mye956 mye956 force-pushed the fault-injection-integ branch 3 times, most recently from 58b00f2 to 69f7417 Compare October 22, 2024 18:34
@mye956 mye956 force-pushed the fault-injection-integ branch from 69f7417 to 467b456 Compare October 22, 2024 18:46
xxx0624
xxx0624 previously approved these changes Oct 22, 2024
@mye956 mye956 merged commit ec1598e into aws:dev Oct 22, 2024
40 checks passed
@mye956 mye956 mentioned this pull request Oct 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants